p1619 (disk): Security concerns of LRW and an alternative mode

Matt Ball Wed, 21 Dec 2005 15:28:02 -0800

Sorry for the long e-mail, but there are some important security issues with 
LRW that we need to look at before finalizing the standard.  These issues won't 
necessarily preclude using LRW, but I think it's important to know how NOT to 
use LRW.


Abstract:

This document discusses ways to attack LRW through algebraic weaknesses in the 
Galois Multiplier.  This attack becomes strong if Key2 (K2) looks similar to 
the plaintext (e.g. if K2 is an ASCII password).  The result of this attack is 
to leak 128-bits of plaintext for each detected collision (similar to ECB 
mode).    A covert channel within LRW is discussed.  Lastly, an alternative 
mode is proposed that eliminates these weaknesses.


Alternative mode:

I've attached a picture for a slightly different tweak mode that replaces the 
Galois multiplier with a second AES engine.  Has anyone seen a mode like this?  
In particular, is there any known IP claims to this mode? (I do not make any 
claims myself)

I'm sending this out because it looks like LRW can leak information if the KEY2 
looks similar to the plaintext.  We may want to add an alternative mode if 
people want a higher security level.

Here are some advantages of this alternative mode:
- Using I = 0 is secure (This is also true with LRW if Electronic Code Book 
mode is secure)
- Knowing the Tweak (T) does not help in recovering Key2 (whereas in LRW, 
knowledge of T completely reveals Key2)
- It is possible to implement this using less hardware by reusing the AES 
engine.  The throughput would be half as much, but this may not matter with 
laptop hard disks.
- Detection of collisions becomes significantly harder


Concerning LRW mode:

There is a possible security problem if Key2 is poorly chosen.  In particular, 
if Key2 has low entropy, low hamming weight, or long runs of binary 1s or 0s, 
(or is otherwise systematically non-random) it then becomes possible to detect 
situations where the AES engine encoded the same data (thus revealing 128-bits 
about the plaintext).  The problem is especially bad if Key2 is a relatively 
short ASCII password.  In this case, the Key2 would look very similar to ASCII 
data on the hard disk, making it easier to create collisions with the plaintext.


Recovering Key2 in LRW:

What happens if we take the XOR of two ciphertext blocks?  Generally speaking, 
we should get a pseudo-random number.  However, what happens when the AES 
engine encoded the same data?  By XORing the cipherblocks, we completely remove 
the output of the AES engine (which was the same) and are left only with data 
relating to Key2 (K2) and the IV (a.k.a. I).  Let's take a closer look at this:

(Let '+' be bitwise XOR, and '*' be a 128-bit Galois Field multiply; C? = 
Ciphertext, K2 = Key2, I? = IV, P? = plaintext; substitute '?' with 1 or 2):

>From the LRW specification, we get these equations:

        C1 = (K2 * I1) + AES(K1, K2 * I1 + P1)
        C2 = (K2 * I2) + AES(K1, K2 * I2 + P2)

Now, assume that the output of the AES engine is the same for both these 
equations.  Here's what happens when we add the two lines:

        C1 + C2 = (K2 * I1) + (K2 * I2) + (AES1 + AES2)
        C1 + C2 = (K2 * I1) + (K2 * I2) + 0                             (output 
of AES1 and AES2 matches)
        C1 + C2 = K2 * (I1 + I2)                                        
(distributive law for finite fields)

Before going any further, take a look at the last line:  The sum of two 
ciphertexts involved in a collision equals Key2 times the two Is XORed 
together.  Now, let's assume that we are incrementing the least-significant 
digit of I each time.  In this case, (I1 XOR I2) = 1, assuming I1 is even and 
I2 immediately follows I1.  Note that the XOR operator will cancel-out anything 
within I1 or I2 that matches (e.g. the drive number).

IMPORTANT POINT: This equation holds even if we use the upper bits of I for a 
drive number or other unique identifier.

So what this means is that for every other consecutive pair of ciphertexts, we 
get this equation:

        C1 + C2 = K2 * (1) = K2

It then becomes trivial to recover K2.

Now, let's looks at what happens when K2 has low hamming weight (i.e. low 
number of binary 1s).  In general, taking (I1 + I2) will also have low hamming 
weight (this is because we don't vary the Is very much).    Because of the 
nature of the 128-bit multiplier, the result of K2 * (I1 + I2) will also have 
low hamming weight.  This is especially true if the most-significant bits of K2 
are zero.  In this case, we may not even need to take the modulo of the 
primitive polynomial after expanding the multiplication.  Even if we did, the 
generator polynomial itself has low hamming weight and will not add too much to 
the final result (especially if I1 + I2 only has 1 bit set).

In this case, it becomes relatively easy to detect collisions within the AES 
engine by performing statistical analysis on all the combinations of the 
ciphertext blocks.  If we want to get more sophisticated, it would be possible 
to completely eliminate the effects of the Galois multiply altogether by 
dividing each ciphertext sum by (I1 + I2):

        Compute (C1 + C2) * (I1 + I2)^-1 = K2

Since I1 and I2 are known for each cipher block, this computation becomes 
straightforward.  In dedicated hardware, it could be possible to compute the 
inverse of I1 and I2 in about 128 clock cycles through successive squaring.  
Frequently, (I1 + I2) will result in the same value, so it should be possible 
to precompute all the most common values, making a software-based attack 
feasible.


LRW Collisions:

Now, for this attack to be useful, it has to be somewhat likely that we get a 
collision.  A first guess would be to say that we wouldn't start getting 
collisions until the birthday limit of the cipher, which is around 2^64 
ciphertext blocks (128 bits / 2).  However, this is assuming that the input 
into the AES engine is a totally random number.  In practice, this is not 
generally true.  In fact, the chance of a collision greatly increases when K2 
looks very similar to the plaintext (or rather the XOR of two plaintext 
blocks).  Let's take the example of using an ASCII password for K2.  
Furthermore, let's assume that the harddisk mostly contains ASCII characters.

How do we create a collision?  That is, how do we make the inputs of the AES 
engine equivalent?  Here is the equation for the input into the AES engine:

        AES(K1, K2 * I + P)

For a collision, we need an I and P that match this equation:

        AES(K1, K2 * I1 + P1) = AES(K1, K2 * I2 + P2)
        K2 * I1 + P1 = K2 * I2 + P2                     (if the outputs of AES 
match, the inputs match)
or
        P1 + P2 = K2 * I1 + K2 * I2                     (rearrange)
        P1 + P2 = K2 * (I1 + I2)                        (distributive law)

Interestingly, we get that (I1 + I2) term again.  Keep in mind that this will 
generally have low hamming weight, making it so K2 * (I1 + I2) will also have 
low hamming weight.  In the case where I1 and I2 are consecutive, we get  P1 + 
P2 = K2 (because I1 + I2 = 1).

If P1, P2 and K2 are all ASCII strings, the average entropy will be much lower 
than 128 bits.  Some studies show that text contains about 1.5 bits of entropy 
per byte.  If this is the case, we could expect a total entropy of about 1.5 
bits * 16 bytes = 24 bits.  The birthday attack would then succeed after about 
2^12 = 4096 ciphertext blocks.  

In practice, there's probably more entropy than 1.5 bits.  However, if we say 
there is 4 bits of entropy per character, our birthday limit just went down 
from 2^64 to 2^32.  It would be quite reasonable for a harddisk to contain 64 
GB of data to make this attack work.

Obviously, there is a lot of hand waving here.  We would probably want to do a 
study to see what the real numbers look like.


Here's the bottom line:

If K2 resembles the plaintext, it becomes possible to attack the data well 
below the theoretical birthday limit.

Ideally, the security of a cryptosystem should be completely independent of the 
data patterns.  This is true of both CBC and CTR modes.  ECB falls flat on its 
face in this regard.  It looks like LRW has some of the same characteristics as 
ECB.



Covert Channel with LRW:

If an application is running on the host computer that has knowledge of K2, it 
becomes possible to create a covert channel by writing blocks that 
intentionally create collisions.  This is done by writing blocks with the 
relationship P1 + P2 = K2 * (I1 + I2).  If several blocks are written in this 
manner, it becomes possible to detect this using the methods outlined above.  
However, in this case, it doesn't matter if K2 is strong or weak because it's 
possible to write several consecutive blocks that contain collisions.  The 
attack algorithm would then search for consecutive blocks that have collisions. 
 Using this covert channel, it is possible to send binary information by either 
writing blocks with collisions or without.  It might even be possible to encode 
K1 in this manner, causing a total loss of security.



Notes on alternative mode:

The alternative proposed mode does not have the shortcomings mentioned above.  
Since we have replaced the Galois multiplier with a second AES engine, we are 
now working with strong pseudorandom data for the tweak value.  None of the 
simple algebraic properties hold.  The birthday limit returns to 2^64 blocks 
because the output of the second AES engine is random.  Furthermore, if it was 
somehow possible to recover the Tweak value (T), this would not reveal Key2 
(K2).  Lastly, there are no weak values for I to worry about (assuming AES is 
indistinguishable from an ideal cipher).


Let me know what you all think.

Matt Ball
Embedded Software Engineer
Quantum Corporation
4001 Discovery Drive, Suite 1100
Boulder, CO 80303
(720) 406-5766

<<attachment: LRW_Alternative.jpg>>

p1619 (disk): Security concerns of LRW and an alternative mode

Reply via email to