A NEW TECHNIQUE FOR DETERMINING REGION OF INTEREST IN SELECTIVE VIDEO PROTECTION APPROACH

Today, the use of video communication in real-time applications is rising rapidly, and many of these applications, such as surveillance, video conferencing, video on demand, and medical and military imaging systems, require secure transmission. The trade-off between data security and real-time performance is the main challenge in this field. This paper presents a protection system that balances the security level against coding efficiency. A smart selection of the motion information of the video, based on Canny edge detection scaling, reduces the amount of encrypted data, and strong, fast stream encryption is performed using a chaos-based key generation model and the RC4 algorithm. The experimental results of the performance metrics confirmed the high perceptual and cryptographic security of the proposed system against attacks and showed that all the requirements of compression efficiency are satisfied.

Keywords: Compression efficiency, Smart selection, Canny edge detection, Chaos, RC4 algorithm.


I. INTRODUCTION
Digital media data (images, sounds, videos, etc.) are rapidly entering people's lives with the continuous growth and advancement of computer networks and wireless mobile communication technologies [1]. As a result, video networking systems, such as video conferencing, video on demand, and on-the-spot video surveillance, have become more diversified [2]. Open network communications are exposed to attacks such as data interception, theft of personal data, unauthorized copying, and piracy, so video data sent without encryption are very vulnerable. Video data protection is therefore of great importance for multimedia use and is currently a challenge in developing video technology [3].
H.264/AVC is a commonly used video encoding format for IP and wireless networks that supports a variety of applications such as video storage, television transmission, real-time video communications, and on-demand video. H.264/AVC has unique coding structure characteristics, huge data volumes, and high real-time demand. It is, therefore, a complex and challenging task to encrypt video data without affecting the coding quality. Efficiency, stability, and compression performance must all be taken into account in a video encryption scheme. Encryption has proven to be a viable method of protecting video data content [4]. The original video data remain unknown even if the encrypted data are intercepted by an attacker, because the attacker does not have the correct decryption key, and this is what protects the video data.
Conventional encryption algorithms can protect the compressed video data strongly, but they are not format compliant and their computational complexity is high [5]. Selective encryption is therefore widely used in this field: only the important or sensitive video data are chosen for encryption, which secures the compressed video on multimedia social networks while still satisfying the compression efficiency requirements. In the proposed work, a smart selection technique based on edge detection is used to select the significant and effective parameters of the H.264 compression standard. Because the compression standard works as a streaming process, the RC4 stream cipher is used with a chaotic system to encrypt the selected H.264 parameters. The security of RC4 is known to depend significantly on its key, and a lot of research has been done to address this problem [6], [7], [8], [9]. Thus, in this work, RC4 is strengthened by using a combination of two nonlinear chaotic maps, the Henon map and the Sine map, in a key generation module. This paper aims to balance two goals: on one hand, reaching a good level of security that protects the compressed video against various attacks, and on the other hand, preserving the compression efficiency by avoiding any increase in the video bitrate, the encoding time, or the computational complexity, and by ensuring that the video is displayed by the decoder without problems, which means maintaining format compliance. The proposed strategy meets these requirements by determining a small but highly significant amount of video data with a smart selection technique, and by designing a fast and strong stream encryption algorithm.
II. RELATED STUDIES
In [4], the impact of the quantization parameter (QP) on the encryption of the sign of T1s and the inter-macroblock non-zero coefficients was analyzed. Based on the results, the syntax elements chosen for encryption were the sign of intra-macroblock non-zero DCT coefficients, the sign of trailing ones (T1s), the intra prediction modes (IPMs), and the sign of the motion vector difference (MVD), in order to keep H.264/AVC video secure in multimedia social networks. This scheme used the RC4 encryption algorithm with a keyspace of $2^{2048}$, which made it very resistant to brute-force attacks. The simulation results indicated that this scheme is efficient in achieving perceptual security, preserving bitrate, and keeping the computational cost low.
[10] presented a chaos-based partial encryption method for H.264 in video conference applications. The scheme used two piecewise linear chaotic maps (PWLCMs) to build pseudo-random bit generators that produced two renewable key streams, one used for the encryption operation and the other used for the decision whether to encrypt or not. Three syntax elements of the H.264 bitstream were selected for encryption: the non-zero quantized coefficients of both I and P frames, the signs of the motion vector differences (MVDs), and the intra prediction modes (IPMs). The results of the performance analysis indicated that this scheme is secure and very efficient in terms of computational complexity and coding efficiency.
[11] designed a real-time video protection system based on the CABAC module of the H.264 entropy coding stage and chaotic systems built from the Logistic map. The encrypted CABAC bin-strings were the intra prediction mode (IPM), the motion vector difference (MVD), and the residue coefficients, and the encryption process XORed the bin-strings with the chaotic sequence. The simulation results indicated that this encryption technique is very suitable for real-time applications due to its low computational complexity.
[12] designed a video protection system suitable for constrained devices in an Internet of multimedia things environment, adopting the EXPer (extended permutation with exclusive OR) cipher to encrypt the H.264 bitstream selectively. The selected parameters include the signs of the motion vector difference (MVD) and the absolute values of delta QP for both CAVLC and CABAC, and the textural syntax elements of CAVLC (signs of T1s, suffix and sign of NZ levels) and CABAC (UEG0 suffixes, signs of NZ-TC levels). The proposed cipher includes confusion and diffusion processes, represented by permutation and XORing with three dynamic keys generated per video sequence. The simulation analysis confirmed that the EXPer system provided significant confidentiality with a small computational cost and a negligible bitrate overhead, which made it very suitable for real-time applications.

[13]. H.264/AVC has achieved a significant improvement in compression performance compared to prior standards, and it provides a network-friendly representation of the video that addresses both conversational and non-conversational applications [14]. Since the H.264 compression standard is based on the YUV color space, the first step is converting the input video from RGB color format into YUV color format [15]. In H.264, the video passes through several levels of division. First, the source video sequence is divided into cycles that consist of a fixed number of pictures, each called a group of pictures (GOP). Each picture is then subdivided into one or more slices, and each slice consists of an integral number of macroblocks (MBs). The MB is the smallest coding unit in a frame and comprises the information belonging to a region of (16 × 16) Y luma samples along with the related (U) and (V) component samples [25].
The cycle starts with an intra-predicted (I) frame followed by many predicted (P) or bi-predicted (B) frames. An I-frame is predicted using intra prediction within the current frame, without using previously encoded pictures, and the produced information includes the IPM and the residual data. There are different prediction modes for each block size: the luma component has two block sizes, (16 × 16) and (4 × 4), and each chroma component has one block size, (8 × 8). The (16 × 16) size has 4 prediction modes, (4 × 4) has 9 prediction modes, and (8 × 8) chroma has 4 prediction modes [16]. P-frames and B-frames are predicted using inter prediction, where each macroblock is predicted from an area of the same size in a reference picture, and motion estimation gives the MV. Thus, the data produced by the prediction stage include the prediction information (IPM, MVD) and the residual error. The residual data then pass through the next compression stages, as shown in Fig. 1, which are transformation, quantization, and entropy coding (CAVLC). The prediction information is encoded using the variable-length Exp-Golomb coding method, where the encoder converts the syntax element into an index code-num.
The codeword is constructed from three parts, [M zeros][1][INFO], where INFO is an M-bit field carrying information, so the length of each codeword is equal to (2M + 1) bits [17].
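The codeword layout above can be made concrete with a minimal Python sketch of unsigned Exp-Golomb coding as used for H.264 syntax elements. The function names are illustrative and not taken from the paper; the signed mapping shown is the standard one used for elements such as the MVD.

```python
def exp_golomb_encode(code_num: int) -> str:
    """Return the [M zeros][1][INFO] codeword for code_num as a bit string."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    value = code_num + 1
    m = value.bit_length() - 1           # M = floor(log2(code_num + 1))
    info = value - (1 << m)              # INFO = code_num + 1 - 2^M
    info_bits = format(info, "b").zfill(m) if m else ""
    return "0" * m + "1" + info_bits     # total length is 2M + 1 bits


def exp_golomb_decode(bits: str) -> int:
    """Invert exp_golomb_encode for a single codeword."""
    m = bits.index("1")                  # number of leading zeros
    info = int(bits[m + 1 : 2 * m + 1], 2) if m else 0
    return (1 << m) + info - 1


def signed_to_code_num(k: int) -> int:
    """Map a signed syntax element (e.g. an MVD component) to code_num."""
    return 2 * k - 1 if k > 0 else -2 * k


if __name__ == "__main__":
    for n in range(5):
        print(n, exp_golomb_encode(n))
```

Because the sign of a signed element ends up in the last INFO bit, flipping that bit changes the sign without changing the codeword length, which is one reason sign encryption of MVDs preserves the bitrate.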

B. Canny Edge Detection
Edges are considered the most important aspect providing valuable information for image analysis [18]. Edges are essentially the outline that distinguishes an object from its context. Edge detection is complicated and is degraded by fluctuating noise levels [18]. The Canny edge detector is an edge detection operator that uses a multi-stage algorithm to identify a wide range of edges [19], and it is efficient. It takes a gray image as input, processes it, and generates a result that displays the intensity discontinuities. The Canny edge detector performs five phases: input smoothing by a Gaussian filter, image gradient detection, non-maximum suppression, double thresholding, and edge tracking by hysteresis. It satisfies three criteria:
1) Canny has good detection (detection criterion). The Canny method marks all the existing edges that conform to the user-specified threshold parameter.
2) Canny has good localization. It keeps the distance between the detected edge and the actual image edge to a minimum.
3) It provides a clear response. It gives a single response for every edge, which makes the subsequent edge detection less confusing.
The choice of Canny edge detection parameters affects both the detection results and the edges obtained.
The parameters are:
• Gaussian filter value (standard deviation).
• Value of threshold.
The steps of the algorithm are:
1) Apply a Gaussian filter to reduce the noise in the image. The resulting image is slightly blurred after applying the Gaussian filter. The filter is used to obtain the real edges of the image; when a Gaussian filter is not used, the noise itself is sometimes detected as an edge.
2) Detect edges with the Sobel operator, using the value "4.75", so that the edges and distortions become visible. The results from both operators (the horizontal response $G_x$ and the vertical response $G_y$) are combined to get the cumulative result by the equation:
$$G = \sqrt{G_x^2 + G_y^2}$$
3) Use the following formula to determine the direction:
$$\theta = \arctan\left(\frac{G_y}{G_x}\right)$$
4) Thin the emerging edge line by using non-maximum suppression. This process produces a slimmer edge line.
5) The last step is to set up two threshold values (a maximum and a minimum threshold) to determine the binary value of the image pixels. A pixel is accepted as an edge when its gradient is greater than the maximum threshold and rejected as image background when its gradient is lower than the minimum threshold. A pixel whose gradient lies between the minimum and maximum thresholds is adopted as an edge only if it is linked to another edge pixel above the maximum threshold.
An example of the result of applying the Canny edge detection method to an image is shown in Fig. 2.
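The five steps above can be sketched in plain Python as follows. This is a minimal illustration, not the paper's implementation: the fixed 3 × 3 Gaussian kernel, the border replication, and the names `convolve3` and `canny` are assumptions made here for brevity.

```python
import math

GAUSS = [[1 / 16, 2 / 16, 1 / 16], [2 / 16, 4 / 16, 2 / 16], [1 / 16, 2 / 16, 1 / 16]]
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def convolve3(img, kernel):
    """Apply a 3x3 filter to a list-of-rows image, replicating the border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += img[yy][xx] * kernel[dy + 1][dx + 1]
            out[y][x] = acc
    return out


def canny(img, low, high):
    """Return a boolean edge map for a grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    smooth = convolve3(img, GAUSS)                      # step 1: Gaussian smoothing
    gx = convolve3(smooth, SOBEL_X)                     # step 2: Sobel gradients
    gy = convolve3(smooth, SOBEL_Y)
    mag = [[math.hypot(gx[y][x], gy[y][x]) for x in range(w)] for y in range(h)]
    thin = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # step 3: gradient direction, folded into [0, 180) degrees
            angle = math.degrees(math.atan2(gy[y][x], gx[y][x])) % 180
            if angle < 22.5 or angle >= 157.5:
                dy, dx = 0, 1
            elif angle < 67.5:
                dy, dx = 1, 1
            elif angle < 112.5:
                dy, dx = 1, 0
            else:
                dy, dx = 1, -1
            # step 4: non-maximum suppression along the gradient direction
            m = mag[y][x]
            if m >= mag[y + dy][x + dx] and m >= mag[y - dy][x - dx]:
                thin[y][x] = m
    # step 5: double threshold, then grow strong edges through weak neighbours
    edges = [[False] * w for _ in range(h)]
    stack = [(y, x) for y in range(h) for x in range(w) if thin[y][x] >= high]
    for y, x in stack:
        edges[y][x] = True
    while stack:
        y, x = stack.pop()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and not edges[ny][nx] and thin[ny][nx] >= low:
                    edges[ny][nx] = True
                    stack.append((ny, nx))
    return edges
```

Counting the `True` entries of the returned map gives one simple measure of how much edge information a frame contains, which is the kind of quantity an edge-based selection could use.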

C. Chaotic Map Methods
The fundamental property of chaotic maps, their high sensitivity to the initial parameters and conditions, has led many researchers to produce new chaos-based encryption algorithms [21], [22], [6], [8], [9], [23], [24], [25]. Chaotic map schemes offer a good combination of randomness, speed, high security, and good encryption efficiency.
• Henon Map: The Henon map is a two-dimensional discrete dynamical system, defined by the equations [7]:
$$x_{n+1} = 1 - a x_n^2 + y_n, \qquad y_{n+1} = b x_n$$
It generates two random-looking and unpredictable sequences (X, Y) based on the initial values $(x_0, y_0)$ and the control parameters (a, b), where a ∈ (1.54, 2) and b ∈ [0, 1]; the values (a = 1.4, b = 0.3) give the most strongly chaotic behavior of the Henon map.
• Sine Map: The Sine map is a simple type of discrete chaotic system. It is obtained from the sine function and its mathematical equation is [24]:
$$z_{n+1} = r \sin(\pi z_n)$$
It is based on the initial value $z_0$ and the control parameter r, whose value lies in the range r ∈ [0, 1]; the value r = 0.94 is the best for the chaotic behavior of the Sine map.
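As a rough illustration of the two maps, the following Python sketch iterates them in their standard textbook forms, using the parameter values quoted above as defaults. The function names and the initial values in the usage lines are illustrative assumptions.

```python
import math


def henon_sequence(x0, y0, n, a=1.4, b=0.3):
    """Iterate x' = 1 - a*x^2 + y, y' = b*x and return the X and Y sequences."""
    xs, ys = [], []
    x, y = x0, y0
    for _ in range(n):
        x, y = 1.0 - a * x * x + y, b * x   # both updates use the old x
        xs.append(x)
        ys.append(y)
    return xs, ys


def sine_sequence(z0, n, r=0.94):
    """Iterate z' = r*sin(pi*z) and return the Z sequence."""
    zs = []
    z = z0
    for _ in range(n):
        z = r * math.sin(math.pi * z)
        zs.append(z)
    return zs


if __name__ == "__main__":
    xs, ys = henon_sequence(0.0, 0.0, 5)
    print(xs, ys)
    print(sine_sequence(0.3, 5))
```

Starting the Sine map from two nearby values of `z0` and printing a few dozen iterations is a quick way to see the sensitivity to initial conditions that motivates using these maps for key generation.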

D. Traditional RC4 Algorithm
RC4 is a symmetric-key stream cipher designed in 1987 by Ron Rivest for RSA Security and characterized by its speed and simplicity. It is a variable key length (from 1 to 256 bytes) stream cipher with byte-oriented operations, based on the use of a random permutation [26].

The proposed system mainly consists of two phases: the selective video encryption phase, which is performed on the sender side, and the selective video decryption phase, which is executed on the receiver side. Each phase has its own stages, as shown in Fig. 3.
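For reference, the traditional RC4 cipher used in both phases can be sketched in a few lines of Python. This is the textbook key-scheduling and keystream generation, not the strengthened variant of this paper; the function names are illustrative.

```python
def rc4_keystream(key: bytes):
    """Yield RC4 keystream bytes for a key of 1 to 256 bytes."""
    if not 1 <= len(key) <= 256:
        raise ValueError("RC4 key must be 1 to 256 bytes long")
    # Key-scheduling algorithm (KSA): build the key-dependent permutation S.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA): one output byte per step.
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        yield s[(s[i] + s[j]) % 256]


def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data by XORing it with the RC4 keystream."""
    return bytes(b ^ k for b, k in zip(data, rc4_keystream(key)))


if __name__ == "__main__":
    print(rc4_crypt(b"Key", b"Plaintext").hex())
```

Since encryption is a plain XOR with the keystream, the same call decrypts on the receiver side, and the output has exactly the length of the input, which matters for keeping the bitstream size unchanged.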

A. Selective Video Encryption Phase
The selective video encryption phase includes three parts: key generation based on a chaotic system, the selection technique, and the encryption procedure for the selected parameters using the RC4 algorithm.
• Key Generation: The sorting order of the X sequence bytes is taken to reorder the Y sequence bytes. The next operation is XORing the rearranged Y with the Z sequence; the produced sequence is then XORed with the sorted X to generate the secret key sequence, which was tested with NIST and shown to be unpredictable and highly random. The length of the sequence generated by the chaotic system is equal to the key length of RC4, which is 256 bytes, and this is also the number of iterations of the two chaotic maps.
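The key generation step can be sketched as below, under several loudly stated assumptions: the source does not say how chaotic samples are converted to bytes, so `to_byte` is a hypothetical quantization, and the reading that Y is reordered by the sorting order of X is an interpretation of the description. The maps use their standard forms with a = 1.4, b = 0.3, and r = 0.94.

```python
import math


def to_byte(v: float) -> int:
    # Assumption: derive a byte from the sample's decimal digits.
    return int(abs(v) * 1e6) % 256


def chaotic_key(x0, y0, z0, n=256, a=1.4, b=0.3, r=0.94) -> bytes:
    """Combine Henon (X, Y) and Sine (Z) byte sequences into an n-byte key."""
    x, y, z = x0, y0, z0
    xs, ys, zs = [], [], []
    for _ in range(n):                       # one iteration per key byte
        x, y = 1.0 - a * x * x + y, b * x    # Henon map
        z = r * math.sin(math.pi * z)        # Sine map
        xs.append(to_byte(x))
        ys.append(to_byte(y))
        zs.append(to_byte(z))
    order = sorted(range(n), key=lambda i: xs[i])   # sorting order of X
    sorted_x = [xs[i] for i in order]
    reordered_y = [ys[i] for i in order]            # Y rearranged by that order
    # (rearranged Y XOR Z) XOR sorted X gives the key sequence
    return bytes(yb ^ zb ^ xb for yb, zb, xb in zip(reordered_y, zs, sorted_x))


if __name__ == "__main__":
    key = chaotic_key(0.0, 0.0, 0.3)
    print(len(key), key[:8].hex())
```

The resulting 256-byte sequence has the maximum RC4 key length, so it can be passed directly as the key of a standard RC4 implementation; the initial values play the role of the shared secret.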
• Selection Technique: The technique for selecting the data to be encrypted should consider the important performance metrics, which are compression efficiency, time cost, and security. To satisfy these requirements, only a small amount of the encoded data must be selected, so that the encoding time is not delayed and the compression ratio is not changed.
At the same time, these data must be significant and sensitive in order to achieve a suitable level of security.

2) Frame-level: The second level of the selection concerns the frames inside each cycle. These frames are classified into three types: I-frame, P-frame, and B-frame. Since the prediction of P and B frames depends on the I-frame as the reference frame, encrypting the I-frame affects them greatly and makes them unintelligible.
Thus the I-frame is chosen to be encrypted in all cycles. The cycle level classifies the cycles into significant and insignificant based on their information scale, where a significant cycle has more information and consequently more motion information. Therefore, the motion information is selected for encryption in all P-frames of a significant cycle, because a P-frame is the reference for the adjacent P and B frames and its encryption therefore affects them as well, while only the I-frame is encrypted in an insignificant cycle because it has less motion information. This level of selection is shown in Fig. 6.
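The frame-level rule can be summarized with a small Python sketch. The cycle-level criterion is not fully stated in this section, so `edge_score` and `threshold` are hypothetical stand-ins for an edge-based measure of how much information a cycle holds; the `Frame` type and function name are also illustrative.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Frame:
    index: int
    kind: str  # "I", "P" or "B"


def select_for_encryption(cycle: List[Frame], edge_score: float,
                          threshold: float) -> List[Tuple[int, str]]:
    """Return (frame index, syntax element) pairs chosen for encryption."""
    # Assumption: a cycle is significant when its edge score reaches the threshold.
    significant = edge_score >= threshold
    selected = []
    for frame in cycle:
        if frame.kind == "I":
            selected.append((frame.index, "IPM"))   # I-frame: always encrypted
        elif frame.kind == "P" and significant:
            selected.append((frame.index, "MVD"))   # P-frames: significant cycles only
    return selected


if __name__ == "__main__":
    gop = [Frame(0, "I"), Frame(1, "B"), Frame(2, "P"), Frame(3, "B"), Frame(4, "P")]
    print(select_for_encryption(gop, edge_score=0.8, threshold=0.5))
    print(select_for_encryption(gop, edge_score=0.2, threshold=0.5))
```

B-frames are never selected directly in this sketch, since they are degraded indirectly through their encrypted reference frames.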

1) Visual Analysis
Visual analysis checks that no information about the original video can be extracted from the encrypted video. As shown in Table I, the encrypted videos are very noisy and unclear, which means that the proposed selective encryption provides visual perception security.

3) Encrypted Area (EA)
The encrypted area is the ratio of encrypted bits to the whole bitstream length, expressed as a percentage [27]. The EA of each video is calculated for the selected syntax parameters (IPM, MVD). When a video contains many objects, there is more movement; therefore, the more details the video contains, the more syntax elements are selected, and as a result the EA is higher. As Table III shows, each of the test video sequences has a different EA because each video represents a different combination of scene characteristics such as fast motion, complicated texture, camera motion, still background, and active foreground.

In Table IV, the experiments done on four video sequences resulted in a negligible difference that does not change the size of the encoded video bitstream. Thus, the proposed encryption system satisfies the compression efficiency requirement.

5) Time Cost
Time cost is the additional processing delay that the encryption process adds to the encoding time [27]. Table V gives the rate of change in the encoding time for the test video sequences. The impact of encryption is a very small, unnoticeable delay, which means that the proposed system satisfies the time efficiency requirement. The percentage of time cost is calculated from TCE, TCD, TE, and TD, which are the times of compression, decompression, encryption, and decryption, respectively.

6) Format Compliant
The syntax parameters selected for encryption are not format or control information; therefore, the general bitstream format is preserved and the encrypted bitstream is decoded without any problem. Thus, the proposed system is compatible with the compression format.

7) Peak signal-to-noise ratio (PSNR)
In this evaluation, the original video is considered as the signal and the encrypted video is considered as noise. The calculation equations of PSNR are [6]:
$$PSNR = 10 \log_{10}\left(\frac{255^2}{MSE}\right)$$
where the MSE equation is
$$MSE = \frac{1}{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} \left[ I(i, j) - I'(i, j) \right]^2$$
Here $M \times N$ is the frame size, $I(i, j)$ is the pixel value of the original video, and $I'(i, j)$ is the pixel value of the encrypted video at the location $(i, j)$.
The PSNR evaluation for the test videos is done for the three color planes (Y, U, V) of the video, and the results are given in Table VI.

Table VII shows the randomness test of the key generated by the chaotic system [25].
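The PSNR metric can be computed per color plane with a short Python sketch. It assumes 8-bit samples (peak value 255) and planes given as lists of rows; the function names are illustrative.

```python
import math


def mse(original, encrypted):
    """Mean squared error between two equally sized planes (lists of rows)."""
    h, w = len(original), len(original[0])
    total = sum((original[i][j] - encrypted[i][j]) ** 2
                for i in range(h) for j in range(w))
    return total / (h * w)


def psnr(original, encrypted, peak=255):
    """PSNR in dB; returns infinity when the two planes are identical."""
    err = mse(original, encrypted)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak * peak / err)


if __name__ == "__main__":
    y_plane = [[16, 32], [64, 128]]
    y_encrypted = [[200, 5], [90, 250]]
    print(round(psnr(y_plane, y_encrypted), 2))
```

For an encryption scheme, a low PSNR between the original and encrypted planes is the desired outcome, since it indicates strong visual degradation.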

10) Entropy Analysis
Information entropy is a measure of the amount of randomness in the information content. It is calculated by the following equation [6]:
$$H(S) = -\sum_{i=0}^{n-1} P(S_i) \log_2 P(S_i)$$
where $S_i$ is the pixel value, $P(S_i)$ is the probability of the symbol $S_i$, and $n$ is the total number of symbols, which is 256. In Table VIII, the entropy results assess the randomness in the content of the video. The entropy values of all the encrypted sequences are close to 8, which indicates high randomness in the encrypted videos and a high degradation of their quality. Looking at the entropy values of the encrypted videos, the videos with the highest values, closest to 8, are the videos with the highest amount of encrypted data (EA); that is, the entropy increases with the EA.
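The entropy measure can be sketched in Python as follows, assuming the frame is given as a flat sequence of 8-bit pixel values; the function name is illustrative.

```python
import math
from collections import Counter


def shannon_entropy(pixels) -> float:
    """Entropy in bits per symbol of a sequence of pixel values."""
    total = len(pixels)
    counts = Counter(pixels)
    # Symbols that never occur contribute nothing to the sum.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    uniform = list(range(256)) * 4      # every value equally likely
    constant = [128] * 1024             # a single value only
    print(shannon_entropy(uniform), shannon_entropy(constant))
```

A perfectly uniform distribution over 256 values reaches the maximum of 8 bits, which is why values close to 8 are read as high randomness.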

11) Comparative Evaluation
The performance of the proposed system is compared with recently proposed prior works, each of which used different encryption techniques and protected different syntax elements of the H.264 standard. The comparison is based on the essential performance metrics (PSNR, Encrypted Area, Time Cost, Encrypted Data, Encryption Algorithm, Key Space); these criteria measure the security level and the effect of the encryption on the compression. As described in Table IX, the proposed approach has a lower PSNR value than the other references, which means it has a higher security level. At the same time, the encrypted area of the proposed approach is smaller than that of the others, although its security level is higher. That means the proposed approach balances the security level and the rate of encrypted area. Also, the time cost of the proposed approach is lower than that of the rest, except for [1], which has a lower time cost because it used only chaos for encryption. Consequently, the security level of [1] is lower, because it focuses on speed alone without taking care of the other requirements of efficient performance, such as security.

cost and bit rate overhead, which makes it very suitable for real-time applications. The results of the security metrics, such as visual analysis, histogram, PSNR, SSIM, keyspace, and NIST, demonstrated the high level of protection provided by this system to the encrypted video. In addition, the comparative evaluation between the proposed method and other recent schemes clarified the improvement achieved by balancing security and compression efficiency.