Hi Matt,
On Dec 21, 2005, at 3:20 PM, Matt Ball wrote:
Sorry for the long e-mail, but there are some important security
issues with LRW that we need to look at before finalizing the
standard. These issues won't necessarily preclude using LRW, but I
think it's important to know how NOT to use LRW.
Abstract:
This document discusses ways to attack LRW through algebraic
weaknesses in the Galois Multiplier. This attack becomes strong if
Key2 (K2) looks similar to the plaintext (e.g. if K2 is an ASCII
password). The result of this attack is to leak 128-bits of
plaintext for each detected collision (similar to ECB mode). A
covert channel within LRW is discussed. Lastly, an alternative
mode is proposed that eliminates these weaknesses.
Alternative mode:
I've attached a picture for a slightly different tweak mode that
replaces the Galois multiplier with a second AES engine. Has
anyone seen a mode like this? In particular, is there any known IP
claims to this mode? (I do not make any claims myself)
the original LRW paper proposed a mode that used two block cipher
invocations per block of plaintext, one to produce the tweak T, and
the other to compute C = E(K, P + T) + T. AFAIK the LRW authors did
not patent their work - someone asked Moses Liskov about patents at
the Crypto conference, and he said that they had not filed for a
patent. He also didn't rule out doing that in the future, but of
course at that point overseas rights were gone (since disclosure
preceeded any potential filing). Of course, anyone using LRW or a
mode like it needs to be aware of the OCB patent.
I'm sending this out because it looks like LRW can leak information
if the KEY2 looks similar to the plaintext. We may want to add an
alternative mode if people want a higher security level.
Here are some advantages of this alternative mode:
- Using I = 0 is secure (This is also true with LRW if Electronic
Code Book mode is secure)
- Knowing the Tweak (T) does not help in recovering Key2 (whereas
in LRW, knowledge of T completely reveals Key2)
- It is possible to implement this using less hardware by reusing
the AES engine. The throughput would be half as much, but this may
not matter with laptop hard disks.
- Detection of collisions becomes significantly harder
Concerning LRW mode:
There is a possible security problem if Key2 is poorly chosen. In
particular, if Key2 has low entropy, low hamming weight, or long
runs of binary 1s or 0s, (or is otherwise systematically non-
random) it then becomes possible to detect situations where the AES
engine encoded the same data (thus revealing 128-bits about the
plaintext). The problem is especially bad if Key2 is a relatively
short ASCII password. In this case, the Key2 would look very
similar to ASCII data on the hard disk, making it easier to create
collisions with the plaintext.
Recovering Key2 in LRW:
What happens if we take the XOR of two ciphertext blocks?
Generally speaking, we should get a pseudo-random number. However,
what happens when the AES engine encoded the same data? By XORing
the cipherblocks, we completely remove the output of the AES engine
(which was the same) and are left only with data relating to Key2
(K2) and the IV (a.k.a. I). Let's take a closer look at this:
(Let '+' be bitwise XOR, and '*' be a 128-bit Galois Field
multiply; C? = Ciphertext, K2 = Key2, I? = IV, P? = plaintext;
substitute '?' with 1 or 2):
From the LRW specification, we get these equations:
C1 = (K2 * I1) + AES(K1, K2 * I1 + P1)
C2 = (K2 * I2) + AES(K1, K2 * I2 + P2)
Now, assume that the output of the AES engine is the same for both
these equations. Here's what happens when we add the two lines:
C1 + C2 = (K2 * I1) + (K2 * I2) + (AES1 + AES2)
C1 + C2 = (K2 * I1) + (K2 * I2) + 0 (output of AES1 and AES2
matches)
C1 + C2 = K2 * (I1 + I2)
(distributive law for finite fields)
Before going any further, take a look at the last line: The sum of
two ciphertexts involved in a collision equals Key2 times the two
Is XORed together. Now, let's assume that we are incrementing the
least-significant digit of I each time. In this case, (I1 XOR I2)
= 1, assuming I1 is even and I2 immediately follows I1. Note that
the XOR operator will cancel-out anything within I1 or I2 that
matches (e.g. the drive number).
IMPORTANT POINT: This equation holds even if we use the upper bits
of I for a drive number or other unique identifier.
So what this means is that for every other consecutive pair of
ciphertexts, we get this equation:
C1 + C2 = K2 * (1) = K2
It then becomes trivial to recover K2.
Now, let's looks at what happens when K2 has low hamming weight
(i.e. low number of binary 1s). In general, taking (I1 + I2) will
also have low hamming weight (this is because we don't vary the Is
very much). Because of the nature of the 128-bit multiplier, the
result of K2 * (I1 + I2) will also have low hamming weight. This
is especially true if the most-significant bits of K2 are zero. In
this case, we may not even need to take the modulo of the primitive
polynomial after expanding the multiplication. Even if we did, the
generator polynomial itself has low hamming weight and will not add
too much to the final result (especially if I1 + I2 only has 1 bit
set).
In this case, it becomes relatively easy to detect collisions
within the AES engine by performing statistical analysis on all the
combinations of the ciphertext blocks. If we want to get more
sophisticated, it would be possible to completely eliminate the
effects of the Galois multiply altogether by dividing each
ciphertext sum by (I1 + I2):
Compute (C1 + C2) * (I1 + I2)^-1 = K2
Since I1 and I2 are known for each cipher block, this computation
becomes straightforward. In dedicated hardware, it could be
possible to compute the inverse of I1 and I2 in about 128 clock
cycles through successive squaring. Frequently, (I1 + I2) will
result in the same value, so it should be possible to precompute
all the most common values, making a software-based attack feasible.
LRW Collisions:
Now, for this attack to be useful, it has to be somewhat likely
that we get a collision. A first guess would be to say that we
wouldn't start getting collisions until the birthday limit of the
cipher, which is around 2^64 ciphertext blocks (128 bits / 2).
However, this is assuming that the input into the AES engine is a
totally random number. In practice, this is not generally true.
In fact, the chance of a collision greatly increases when K2 looks
very similar to the plaintext (or rather the XOR of two plaintext
blocks). Let's take the example of using an ASCII password for
K2. Furthermore, let's assume that the harddisk mostly contains
ASCII characters.
How do we create a collision? That is, how do we make the inputs
of the AES engine equivalent? Here is the equation for the input
into the AES engine:
AES(K1, K2 * I + P)
For a collision, we need an I and P that match this equation:
AES(K1, K2 * I1 + P1) = AES(K1, K2 * I2 + P2)
K2 * I1 + P1 = K2 * I2 + P2 (if the outputs of AES match, the
inputs match)
or
P1 + P2 = K2 * I1 + K2 * I2 (rearrange)
P1 + P2 = K2 * (I1 + I2) (distributive law)
Interestingly, we get that (I1 + I2) term again. Keep in mind that
this will generally have low hamming weight, making it so K2 * (I1
+ I2) will also have low hamming weight. In the case where I1 and
I2 are consecutive, we get P1 + P2 = K2 (because I1 + I2 = 1).
If P1, P2 and K2 are all ASCII strings, the average entropy will be
much lower than 128 bits. Some studies show that text contains
about 1.5 bits of entropy per byte. If this is the case, we could
expect a total entropy of about 1.5 bits * 16 bytes = 24 bits. The
birthday attack would then succeed after about 2^12 = 4096
ciphertext blocks.
In practice, there's probably more entropy than 1.5 bits. However,
if we say there is 4 bits of entropy per character, our birthday
limit just went down from 2^64 to 2^32. It would be quite
reasonable for a harddisk to contain 64 GB of data to make this
attack work.
Obviously, there is a lot of hand waving here. We would probably
want to do a study to see what the real numbers look like.
Here's the bottom line:
If K2 resembles the plaintext, it becomes possible to attack the
data well below the theoretical birthday limit.
Ideally, the security of a cryptosystem should be completely
independent of the data patterns. This is true of both CBC and CTR
modes. ECB falls flat on its face in this regard. It looks like
LRW has some of the same characteristics as ECB.
That's an interesting analysis. However, if the key is chosen
uniformly at random, then there are no issues with LRW, right? So
the concern could be alleviated by requiring that keys be uniformly
random. Any mode of operation is going to have problems if its keys
are low-entropy ASCII strings. If we need to assume that the user
has chosen K2 poorly, then we ought to expect that they have not
chosen K1 well either. In that case, it is not clear that LRW fails
in a way that's more spectacular than any other mode would, AFAICT.
In practice, if there is a desire to use password-based keys, then
I'd suggest that a dedicated password-to-key function be used (PKCS#5
or HMAC, for example). This is a good idea for any system that
admits passwords, though of course it is getting into the subject of
key management (which the WG should probably start formal work
on :-) Alternatively, for LRW, the mode definition could be modified
so that K2 is derived from K1, similar to what GCM, CMAC, and some
other modes do.
David
Covert Channel with LRW:
If an application is running on the host computer that has
knowledge of K2, it becomes possible to create a covert channel by
writing blocks that intentionally create collisions. This is done
by writing blocks with the relationship P1 + P2 = K2 * (I1 + I2).
If several blocks are written in this manner, it becomes possible
to detect this using the methods outlined above. However, in this
case, it doesn't matter if K2 is strong or weak because it's
possible to write several consecutive blocks that contain
collisions. The attack algorithm would then search for consecutive
blocks that have collisions. Using this covert channel, it is
possible to send binary information by either writing blocks with
collisions or without. It might even be possible to encode K1 in
this manner, causing a total loss of security.
Notes on alternative mode:
The alternative proposed mode does not have the shortcomings
mentioned above. Since we have replaced the Galois multiplier with
a second AES engine, we are now working with strong pseudorandom
data for the tweak value. None of the simple algebraic properties
hold. The birthday limit returns to 2^64 blocks because the output
of the second AES engine is random. Furthermore, if it was somehow
possible to recover the Tweak value (T), this would not reveal Key2
(K2). Lastly, there are no weak values for I to worry about
(assuming AES is indistinguishable from an ideal cipher).
Let me know what you all think.
Matt Ball
Embedded Software Engineer
Quantum Corporation
4001 Discovery Drive, Suite 1100
Boulder, CO 80303
(720) 406-5766
<LRW_Alternative.jpg>