In cryptography, information entropy is a measure of uncertainty or unpredictability.

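Consider a discrete random variable described by a probability mass function, which assigns a probability to each possible outcome (event). The lower the probability of an event, the more information its occurrence carries; information entropy is the average of this information over all possible events.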

Mathematically, for a uniform distribution (all states equally likely):

$$H_1 = m \cdot \log(N)$$

and in the general case:

$$H_2 = m \cdot \sum_{i=1}^{N} P_i \cdot \log\left(\frac{1}{P_i}\right)$$

where:

• H stands for information entropy (measured in bits when the logarithm is taken to base 2)
• m is the number of independent events
• N is the number of discrete states
• $P_i$ is the probability of the i-th state
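
The two formulas above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `entropy_bits` and `uniform_entropy_bits` are made up for this example, and base-2 logarithms are assumed so that the result is in bits.

```python
import math

def entropy_bits(probabilities, m=1):
    """General case (H_2): entropy in bits of m independent events,
    each drawn from the given discrete distribution."""
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # States with P_i == 0 are skipped: p * log(1/p) tends to 0 as p -> 0.
    per_event = sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)
    return m * per_event

def uniform_entropy_bits(n_states, m=1):
    """Uniform case (H_1): entropy in bits of m independent events,
    each with n_states equally likely states."""
    return m * math.log2(n_states)
```

For a uniform distribution both functions agree, for example `entropy_bits([0.25] * 4)` and `uniform_entropy_bits(4)` both evaluate to 2 bits.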

It is easy to prove mathematically that $H_2$ is less than or equal to $H_1$, with equality when the distribution is uniform. Intuitively, suppose you have a coin with different probabilities for heads and tails, for instance $P_{heads} = 0.75$ and $P_{tails} = 0.25$. You can be more confident that a flip of this coin will come up heads. Therefore the information entropy (uncertainty) is smaller than for a fair coin with probability 0.5 for each side.
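
As a worked example of the coin case, assuming a single flip (m = 1) and base-2 logarithms so that the result is in bits:

$$H_1 = \log_2(2) = 1 \text{ bit}$$

$$H_2 = 0.75 \cdot \log_2\left(\frac{1}{0.75}\right) + 0.25 \cdot \log_2\left(\frac{1}{0.25}\right) \approx 0.311 + 0.5 = 0.811 \text{ bits}$$

The biased coin therefore carries about 0.81 bits of entropy per flip, compared with 1 bit for the fair coin.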

Applications:

• key strength
• anomaly detection (useful for analyzing network traffic and packets)
• image processing
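
To illustrate the key strength application, the uniform formula can be used to estimate the entropy of a key or password. The sketch below assumes that every symbol is chosen independently and uniformly at random from the alphabet, which rarely holds for human-chosen passwords, so the result is an upper bound in practice. The function name `key_entropy_bits` is hypothetical.

```python
import math

def key_entropy_bits(length, alphabet_size):
    """Entropy in bits of a key with `length` symbols, each chosen
    independently and uniformly from `alphabet_size` possibilities
    (the uniform formula with m = length and N = alphabet_size)."""
    if length < 0 or alphabet_size < 1:
        raise ValueError("length must be >= 0 and alphabet_size >= 1")
    return length * math.log2(alphabet_size)

# A random 128-bit key: 128 binary symbols.
aes_key_bits = key_entropy_bits(128, 2)

# A random 8-character password of lowercase letters: 8 symbols, 26 states each.
password_bits = key_entropy_bits(8, 26)
```

Under these assumptions the 128-bit key has 128 bits of entropy, while the 8-character lowercase password has only about 37.6 bits.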