Re: echainiv not working as supposed to be?
On Wed, Sep 07, 2016 at 12:08:10AM +0800, Herbert Xu wrote: > > > So, should echainiv use a per-context per-cpu array instead that -- > > for simplicity -- gets initialised with random bytes and will be > > updated as it's now, just not with a single global per-cpu array, but > > a per-context one? > > As I said, the per-cpu IV is never seen by anyone so there is no > need to make it per-tfm. On second thought you're right. The global array was a very lame idea. This patch changes it so that it instead does a multiply with the salt. ---8<-- Subject: crypto: echainiv - Replace chaining with multiplication The current implementation uses a global per-cpu array to store data which are used to derive the next IV. This is insecure as the attacker may change the stored data. This patch removes all traces of chaining and replaces it with multiplication of the salt and the sequence number. Fixes: a10f554fa7e0 ("crypto: echainiv - Add encrypted chain IV...") Cc: sta...@vger.kernel.org Reported-by: Mathias KrauseSigned-off-by: Herbert Xu diff --git a/crypto/echainiv.c b/crypto/echainiv.c index 1b01fe9..e3d889b 100644 --- a/crypto/echainiv.c +++ b/crypto/echainiv.c @@ -1,8 +1,8 @@ /* * echainiv: Encrypted Chain IV Generator * - * This generator generates an IV based on a sequence number by xoring it - * with a salt and then encrypting it with the same key as used to encrypt + * This generator generates an IV based on a sequence number by multiplying + * it with a salt and then encrypting it with the same key as used to encrypt * the plain text. This algorithm requires that the block size be equal * to the IV size. It is mainly useful for CBC. * @@ -24,81 +24,17 @@ #include #include #include -#include #include -#include -#include +#include #include -#define MAX_IV_SIZE 16 - -static DEFINE_PER_CPU(u32 [MAX_IV_SIZE / sizeof(u32)], echainiv_iv); - -/* We don't care if we get preempted and read/write IVs from the next CPU. */ -static void echainiv_read_iv(u8 *dst, unsigned size) -{ - u32 *a = (u32 *)dst; - u32 __percpu *b = echainiv_iv; - - for (; size >= 4; size -= 4) { - *a++ = this_cpu_read(*b); - b++; - } -} - -static void echainiv_write_iv(const u8 *src, unsigned size) -{ - const u32 *a = (const u32 *)src; - u32 __percpu *b = echainiv_iv; - - for (; size >= 4; size -= 4) { - this_cpu_write(*b, *a); - a++; - b++; - } -} - -static void echainiv_encrypt_complete2(struct aead_request *req, int err) -{ - struct aead_request *subreq = aead_request_ctx(req); - struct crypto_aead *geniv; - unsigned int ivsize; - - if (err == -EINPROGRESS) - return; - - if (err) - goto out; - - geniv = crypto_aead_reqtfm(req); - ivsize = crypto_aead_ivsize(geniv); - - echainiv_write_iv(subreq->iv, ivsize); - - if (req->iv != subreq->iv) - memcpy(req->iv, subreq->iv, ivsize); - -out: - if (req->iv != subreq->iv) - kzfree(subreq->iv); -} - -static void echainiv_encrypt_complete(struct crypto_async_request *base, -int err) -{ - struct aead_request *req = base->data; - - echainiv_encrypt_complete2(req, err); - aead_request_complete(req, err); -} - static int echainiv_encrypt(struct aead_request *req) { struct crypto_aead *geniv = crypto_aead_reqtfm(req); struct aead_geniv_ctx *ctx = crypto_aead_ctx(geniv); struct aead_request *subreq = aead_request_ctx(req); - crypto_completion_t compl; - void *data; + __be64 nseqno; + u64 seqno; u8 *info; unsigned int ivsize = crypto_aead_ivsize(geniv); int err; @@ -108,8 +44,6 @@ static int echainiv_encrypt(struct aead_request *req) aead_request_set_tfm(subreq, ctx->child); - compl = echainiv_encrypt_complete; - data = req; info = req->iv; if (req->src != req->dst) { @@ -127,29 +61,30 @@ static int echainiv_encrypt(struct aead_request *req) return err; } - if (unlikely(!IS_ALIGNED((unsigned long)info, -crypto_aead_alignmask(geniv) + 1))) { - info = kmalloc(ivsize, req->base.flags & - CRYPTO_TFM_REQ_MAY_SLEEP ? GFP_KERNEL: - GFP_ATOMIC); - if (!info) - return -ENOMEM; - - memcpy(info, req->iv, ivsize); - } - - aead_request_set_callback(subreq, req->base.flags, compl, data); + aead_request_set_callback(subreq, req->base.flags, + req->base.complete, req->base.data); aead_request_set_crypt(subreq, req->dst, req->dst,
Re: echainiv not working as supposed to be?
On Tue, Sep 06, 2016 at 02:24:59PM +0200, Mathias Krause wrote: > > For each request it XORs the salt into the provided IV (req->iv). For > ESP the provided IV is the sequence number or, at least, parts of it. > The thus modified IV gets written into the *destination* buffer > (req->dst) which might be a different buffer than the source buffer > (not in the ESP case, though, as it passes the same sg for src and > dst). Afterwards, the per-cpu IV gets copied over into req->iv, > effectively overwriting the generated XORed value. Then the inner > crypto request gets initiated which may finish synchronously or > asynchronously. Either way, the per-cpu IV gets updated with the new > value from subreq->iv, which should be equal to req->iv in the normal > case. I think your description is basically correct. However, you seem to be missing the fact that the real IV (i.e., the IV transmitted over the wire) is the first block of the encrypted ciphertext. IOW the per-cpu IV is never sent over the wire. It's only used to encrypt the first block of the ciphertext, which becomes the actual IV. > So, should echainiv use a per-context per-cpu array instead that -- > for simplicity -- gets initialised with random bytes and will be > updated as it's now, just not with a single global per-cpu array, but > a per-context one? As I said, the per-cpu IV is never seen by anyone so there is no need to make it per-tfm. Cheers, -- Email: Herbert XuHome Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
echainiv not working as supposed to be?
Hi Herbert, I'm a little bit confused on how echainiv is supposed to work. I think it's doing a few things wrongly, I just can't decide what's actually wrong as the intended mode of operation is unclear to me. At least, as-is, the code isn't quite operating as advertised in the comment at the top of the file. So, in hope you could enlighten me, here's my understanding of the current _implementation_ of echainiv (assuming aligned req->iv, for simplicity): For each request it XORs the salt into the provided IV (req->iv). For ESP the provided IV is the sequence number or, at least, parts of it. The thus modified IV gets written into the *destination* buffer (req->dst) which might be a different buffer than the source buffer (not in the ESP case, though, as it passes the same sg for src and dst). Afterwards, the per-cpu IV gets copied over into req->iv, effectively overwriting the generated XORed value. Then the inner crypto request gets initiated which may finish synchronously or asynchronously. Either way, the per-cpu IV gets updated with the new value from subreq->iv, which should be equal to req->iv in the normal case. Things that are unclear to me: 1/ Why is the XORed IV written to the destination buffer and not the source buffer? Isn't the sub-request supposed to use the IV from either req->src or req->iv -- and especially *not* from req->dst? 2/ Why does the XORed IV gets overwritten with the per-cpu IV prior passing it on to the sub-request, making all possible IV locations in req->src, req->dst and req->iv *all* be different? Moreover, the behaviour of 2/ actually leads to the fact, that the very first IV on each CPU will be a null IV -- not a salted sequence number. All following IVs will be the result of the (or better "some") sub-request's IV update, stored into the per-cpu variable -- still no salted sequence numbers, though. That behaviour is a bug, IMHO, as this clearly differs from what is described in the comment. However, I'm uncertain on how to fix it, as the intended mode of operation is unclear to me... Should only the first IV of each crypto transform be the salted sequence number and all following the result of the sub-request's IV update, therefore not stored in a single per-cpu variable but some echainiv context specific one? 3/ Why is echainiv using per-cpu IVs in the first place? Shouldn't those be crypto context specific? Currently we're happily mixing IVs from different transforms (with possibly different IV sizes!) with each other via the per-cpu variables. That might be an "issue" if one echainiv user can maliciously mess with the IVs of another user, e.g. via AF_ALG. So, should echainiv use a per-context per-cpu array instead that -- for simplicity -- gets initialised with random bytes and will be updated as it's now, just not with a single global per-cpu array, but a per-context one? That would allow us to still have a lock-less IV generator but one that cannot be accidentally / maliciously be tampered with by other echainiv users. Thanks, Mathias -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html