[openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
Resolved by overlapping buffer checks. Closing. Matt -- Ticket here: http://rt.openssl.org/Ticket/Display.html?id=4362 Please log in as guest with password guest if prompted -- openssl-dev mailing list To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev
Re: [openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
Not defined means we make no guarantees. OpenSSL can depend on what it knows to be true. In the next release we can revisit this.
Re: [openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
I don't think that will work. The SSL code uses in-place buffers extensively, so in == out definitely needs to be defined. The question is only whether out < in is also acceptable.

Either way, for BoringSSL, I've gone ahead and tightened our aliasing constraints to forbid out < in and require equality, so that we don't have to keep chasing down discrepancies in the assembly code in advance of a decision being made here.

(I think there is something to be said for being able to in-place-ish decrypt a structure with a record header and write the output without the header, but perhaps this use case is not worth the cost: I see the numbers went down slightly for chacha-x86.pl. Then again, most other files manage it naturally. It's a decision you all will need to make.)

David

On Wed, Jun 15, 2016 at 11:01 AM Rich Salz via RT wrote:

> I think for now, we just note this in the documentation: behavior for
> overlapping buffers, and even in-place buffers, is not defined.
>
> It's like memcpy() vs memmove().
[openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
I think for now, we just note this in the documentation: behavior for overlapping buffers, and even in-place buffers, is not defined. It's like memcpy() vs memmove().
Re: [openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
Brian Smith wrote:

> It seems that 32-bit ARM has the same limitation as x86 that the input and
> output pointers must match or the input and output buffers must not overlap
> at all. I'm not sure which ARM code path (NEON or non-NEON, or both) has
> this issue.

Just to follow up on this: I think this might actually be a QEMU ARM (32-bit) emulator bug, or a configuration issue on my part. In one version of the QEMU emulator, I have no trouble. But in another version (the Android emulator), I get results like this for BoringSSL's chacha_test (modified to print all the results before failing):

Mismatch at length 64 with in-place offset 1.
Mismatch at length 64 with in-place offset 2.
Mismatch at length 64 with in-place offset 5.
Mismatch at length 64 with in-place offset 6.
Mismatch at length 64 with in-place offset 9.

Notice, in particular, that it only happens when the input length is 64, and only for specific offsets. Like I said, I consistently get these failures on the Android emulator but not in a newer version of QEMU. It doesn't make any difference whether NEON is enabled or disabled; I believe this is because the ARM code only uses NEON if there are at least 3 blocks.

Anyway, I see in the ARM chacha code that there is a special case when the length is 64, so it might be worth double-checking that code. Just FYI.
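For readers who want to reproduce this kind of report, a harness along the lines of the modified chacha_test described above can be sketched as follows. This is a minimal Python model under stated assumptions, not BoringSSL's test: the names find_mismatches, ascending_xor and descending_xor are hypothetical, the "ciphers" are toy single-byte XORs rather than ChaCha, and the buffers are modelled as offsets into one bytearray with out at offset 0 and in at the given in-place offset.

```python
def ascending_xor(buf, in_off, out_off, length):
    """Toy cipher (assumption, not ChaCha): XOR each byte, lowest address first."""
    for i in range(length):
        buf[out_off + i] = buf[in_off + i] ^ 0x5A


def descending_xor(buf, in_off, out_off, length):
    """Same toy cipher, but processing from the highest address down."""
    for i in reversed(range(length)):
        buf[out_off + i] = buf[in_off + i] ^ 0x5A


def find_mismatches(cipher, lengths, max_offset):
    """Return (length, offset) pairs where aliased output differs from the
    output obtained with fully disjoint buffers."""
    failures = []
    for length in lengths:
        data = bytes((i * 31 + 7) & 0xFF for i in range(length))

        # Reference run: input and output regions do not overlap at all.
        ref = bytearray(2 * length)
        ref[length:] = data
        cipher(ref, length, 0, length)
        expected = bytes(ref[:length])

        # Aliased runs: out starts at 0, in starts `offset` bytes later.
        for offset in range(max_offset + 1):
            buf = bytearray(length + offset)
            buf[offset:] = data
            cipher(buf, offset, 0, length)
            if bytes(buf[:length]) != expected:
                failures.append((length, offset))
    return failures


for length, offset in find_mismatches(descending_xor, [16, 64], 2):
    print(f"Mismatch at length {length} with in-place offset {offset}.")
```

Printing every failure instead of stopping at the first one is what makes a pattern such as "only length 64, only some offsets" visible.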
Re: [openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
It seems that 32-bit ARM has the same limitation as x86 that the input and output pointers must match or the input and output buffers must not overlap at all. I'm not sure which ARM code path (NEON or non-NEON, or both) has this issue. Cheers, Brian -- https://briansmith.org/
Re: [openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
The current state is that, as far as I can tell, overlapping requirements are undocumented (or is it somewhere and I missed it?) and, for ChaCha, architecture-specific. I think something certainly needs to be done: either changing chacha-x86.pl and allowing any out <= in overlap, or declaring that you want out == in (or something else) with, at minimum, a documentation change.

I would actually suggest going further and updating EVP_CipherUpdate to enforce the rule and raise an error if the caller doesn't honor it. Otherwise we'll continue to be in the situation where callers may write code that works on some architectures but not others. (BoringSSL's EVP_AEAD API will fail with OUTPUT_ALIASES_INPUT if aliasing requirements aren't honored.)

Actually, I'm not sure how to best translate an out == in rule to streaming EVP_CipherUpdate for block ciphers. Imagine feeding one byte at a time to EVP_CipherUpdate: in will naturally get ahead of out and then synchronize at block boundaries, so the rule can't be as straightforward as "out == in". (Whereas out <= in naturally covers this behavior.)

Given the numbers in https://mta.openssl.org/pipermail/openssl-dev/2016-March/005625.html the cost seems fairly modest and this is only for 32-bit, not 64-bit. Based on that, and that other implementations I've tested handle the case fine, I think this is a reasonable requirement to impose. Of course, I am also biased here because out == in will cause me some nuisance. :-) One can certainly argue that out == in is perhaps easier to handle than out <= in and it is not worth allowing it.

Either way, I'm not an OpenSSL team member and can't make a decision on behalf of you all. This is something you all have to pick from.

David

On Fri, Mar 4, 2016 at 7:24 AM Andy Polyakov via RT wrote:

> >>> If the other EVP ciphers universally allow this then I think we must treat this
> >>> as a bug, because people may be relying on this behaviour. There is also
> >>> sporadic documentation in lower-level APIs (AES source and des.pod) that the
> >>> buffers may overlap.
> >>>
> >>> If it's inconsistent then, at the very least, we must document that it is not
> >>> allowed.
> >>
> >> I'd like to argue that EVP is not place to provide any guarantees about
> >> partially overlapping buffers. Even though all current ciphers process
> >> data in ascending address order, we shouldn't make assumption that there
> >> won't be one that processes data in reverse order.
> >
> > I'm afraid that, since we haven't documented it, the world may already have
> > made that assumption.
>
> Fear is irrational and destructive feeling. Having faith that world is
> better than that is nothing but healthy :-) What I'm saying is that
> let's put a little bit more substance into discourse. Would anybody
> consider it *sane* programming practice to rely on partially overlapping
> buffers in *general* case? I.e. without actually *knowing* (as opposed
> to *assuming*) what's going on? [Control question: does compiler
> guarantee order of references to memory?] As said in last message I
> don't consider it sane and even consider it natural [which means that
> I'd expect majority to not consider it sane too].
>
> Once again, I'm not saying that nothing would be done, I simply want to
> figure out where the line goes. From my personal view point I'd say that
> nothing *has to* be done, but it's just me. You seem to say that we're
> obliged to support partially overlapping buffers. My question then is
> *any* overlap, *any* cost? Shall we settle for simply writing down that
> application developer may not rely on partially overlapping buffers? If
> so, do we fix the modules in question arguing that this quality might be
> desirable in different context [where modules in question can be used]?
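To make the streaming concern concrete, here is a small Python model of the two candidate rules. It is a sketch under stated assumptions, not OpenSSL's or BoringSSL's actual check: the helper names (classify, allowed, stream_one_byte_at_a_time, rejected_calls) are hypothetical, pointers are modelled as plain integers, and the block cipher is assumed to buffer input internally and emit one 16-byte block each time a block fills.

```python
def classify(in_addr, in_len, out_addr, out_len):
    """Describe how the input range relates to the range actually written."""
    if in_len <= 0 or out_len <= 0:
        return "disjoint"          # nothing read or nothing written
    if in_addr + in_len <= out_addr or out_addr + out_len <= in_addr:
        return "disjoint"
    if in_addr == out_addr:
        return "in-place"
    return "out-behind" if out_addr < in_addr else "out-ahead"


def allowed(kind, rule):
    """Apply one of the two rules discussed in the thread."""
    if rule == "out == in":
        return kind in ("disjoint", "in-place")
    if rule == "out <= in":
        return kind in ("disjoint", "in-place", "out-behind")
    raise ValueError(f"unknown rule: {rule}")


def stream_one_byte_at_a_time(total, block=16):
    """Model in-place streaming: each call consumes 1 byte at address i and
    writes a whole block at the current output position once a block fills."""
    written = 0
    for i in range(total):
        out_len = block if (i + 1) % block == 0 else 0
        yield (i, 1, written, out_len)
        written += out_len


def rejected_calls(rule, total=32):
    """Indices of the calls that a strict enforcement of `rule` would reject."""
    return [
        i
        for i, call in enumerate(stream_one_byte_at_a_time(total))
        if not allowed(classify(*call), rule)
    ]


print(rejected_calls("out == in"))
print(rejected_calls("out <= in"))
```

In this model the calls that complete a block read from the last byte of the block while writing from its first byte, so a strict out == in rule rejects them, whereas out <= in accepts every call.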
Re: [openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
>> Fear is irrational and destructive feeling. Having faith that world is
>> better than that is nothing but healthy :-) What I'm saying is that
>> let's put a little bit more substance into discourse. Would anybody
>> consider it *sane* programming practice to rely on partially overlapping
>> buffers in *general* case? I.e. without actually *knowing* (as opposed
>> to *assuming*) what's going on? [Control question: does compiler
>> guarantee order of references to memory?] As said in last message I
>> don't consider it sane and even consider it natural [which means that
>> I'd expect majority to not consider it sane too].
>
> One of the cool features of the OCB code some folks I know to be using
> and relying on is that it supports in-place encryption. You give
> it a buffer, and it is encrypted in place. This is specifically
> promised by the API and is noticeably fast.
>
> No idea whether this is a useful datapoint...

The question is specifically about *partially* overlapping buffers. In other words, it's not a question of whether *fully* overlapping buffers, a.k.a. in-place processing, should be supported (they should) or may be used (they may).
Re: [openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
> On Mar 4, 2016, at 7:24 AM, Andy Polyakov via RT wrote:
>
> Fear is irrational and destructive feeling. Having faith that world is
> better than that is nothing but healthy :-) What I'm saying is that
> let's put a little bit more substance into discourse. Would anybody
> consider it *sane* programming practice to rely on partially overlapping
> buffers in *general* case? I.e. without actually *knowing* (as opposed
> to *assuming*) what's going on? [Control question: does compiler
> guarantee order of references to memory?] As said in last message I
> don't consider it sane and even consider it natural [which means that
> I'd expect majority to not consider it sane too].

One of the cool features of the OCB code some folks I know to be using and relying on is that it supports in-place encryption. You give it a buffer, and it is encrypted in place. This is specifically promised by the API and is noticeably fast.

No idea whether this is a useful datapoint...

-- Viktor.
Re: [openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
>>> If the other EVP ciphers universally allow this then I think we must treat this
>>> as a bug, because people may be relying on this behaviour. There is also
>>> sporadic documentation in lower-level APIs (AES source and des.pod) that the
>>> buffers may overlap.
>>>
>>> If it's inconsistent then, at the very least, we must document that it is not
>>> allowed.
>>
>> I'd like to argue that EVP is not place to provide any guarantees about
>> partially overlapping buffers. Even though all current ciphers process
>> data in ascending address order, we shouldn't make assumption that there
>> won't be one that processes data in reverse order.
>
> I'm afraid that, since we haven't documented it, the world may already have
> made that assumption.

Fear is irrational and destructive feeling. Having faith that world is better than that is nothing but healthy :-) What I'm saying is that let's put a little bit more substance into discourse. Would anybody consider it *sane* programming practice to rely on partially overlapping buffers in *general* case? I.e. without actually *knowing* (as opposed to *assuming*) what's going on? [Control question: does compiler guarantee order of references to memory?] As said in last message I don't consider it sane, and I even consider not providing such a guarantee natural [which means that I'd expect majority to not consider it sane too].

Once again, I'm not saying that nothing would be done, I simply want to figure out where the line goes. From my personal view point I'd say that nothing *has to* be done, but it's just me. You seem to say that we're obliged to support partially overlapping buffers. My question then is *any* overlap, *any* cost? Shall we settle for simply writing down that application developer may not rely on partially overlapping buffers? If so, do we fix the modules in question arguing that this quality might be desirable in different context [where modules in question can be used]?
Re: [openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
On Fri, Mar 4, 2016 at 12:48 PM Andy Polyakov via RT wrote:

> > If the other EVP ciphers universally allow this then I think we must treat this
> > as a bug, because people may be relying on this behaviour. There is also
> > sporadic documentation in lower-level APIs (AES source and des.pod) that the
> > buffers may overlap.
> >
> > If it's inconsistent then, at the very least, we must document that it is not
> > allowed.
>
> I'd like to argue that EVP is not place to provide any guarantees about
> partially overlapping buffers. Even though all current ciphers process
> data in ascending address order, we shouldn't make assumption that there
> won't be one that processes data in reverse order.

I'm afraid that, since we haven't documented it, the world may already have made that assumption.

> I'd even argue that
> not providing such guarantee is natural, i.e. can be naturally
> *implied*. Just like you may not expect a tablet to work after you glued
> wheels to it to make a skateboard, arguing that nowhere does it say that
> it's not a viable idea. It might work, and apparently did for somebody,
> but you may not *expect* it to, neither as tablet or skateboard. And
> tablet manufacturer has no obligation to disclaim it in writing.
>
> I'm not saying that this particular problem can't/won't be addressed,
> though I consider it kind of bad style. Because it kind of sets a
> precedent of creating an undesired illusion. BTW, further measurements
> have shown that unlike others, Core2 suffers 20% performance regression.
> Well, one can argue that nobody cares about Core2, but what if it was
> contemporary processor?
Re: [openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
> If the other EVP ciphers universally allow this then I think we must treat this
> as a bug, because people may be relying on this behaviour. There is also
> sporadic documentation in lower-level APIs (AES source and des.pod) that the
> buffers may overlap.
>
> If it's inconsistent then, at the very least, we must document that it is not
> allowed.

I'd like to argue that EVP is not place to provide any guarantees about partially overlapping buffers. Even though all current ciphers process data in ascending address order, we shouldn't make assumption that there won't be one that processes data in reverse order. I'd even argue that not providing such guarantee is natural, i.e. can be naturally *implied*. Just like you may not expect a tablet to work after you glued wheels to it to make a skateboard, arguing that nowhere does it say that it's not a viable idea. It might work, and apparently did for somebody, but you may not *expect* it to, neither as tablet or skateboard. And tablet manufacturer has no obligation to disclaim it in writing.

I'm not saying that this particular problem can't/won't be addressed, though I consider it kind of bad style. Because it kind of sets a precedent of creating an undesired illusion. BTW, further measurements have shown that unlike others, Core2 suffers 20% performance regression. Well, one can argue that nobody cares about Core2, but what if it was contemporary processor?
Re: [openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
> I'm unclear on what EVP_CIPHER's interface guarantees are, but our EVP_AEAD
> APIs are documented to allow in/out buffers to alias as long as out is <=
> in. This matches what callers might expect from a naive implementation.
>
> Our AES-GCM EVP_AEADs, which share code with OpenSSL, have tended to match
> this pattern too. For ChaCha, of chacha-{x86,x86_64,armv4,armv8}.pl and the
> C implementation, all seem to satisfy this (though it's possible I don't have
> complete coverage) except for chacha-x86.pl. That one works if in == out,
> but not if out is slightly behind.
>
> We were able to reproduce problems when in = out + 1. The SSE3 code
> triggers if the input is at least 256 bytes and the non-SSE3 code if the
> input is at least 64 bytes. The non-SSE3 code is because the words in a
> block are processed in a slightly funny order (0, 4, 8, 9, 12, 14, 1, 2, 3,
> 5, 6, 7, 10, 11, 13, 15). I haven't looked at the SSE3 case carefully, but
> I expect it's something similar.

It's in 16-byte chunks numbered 0,4,8,12, 1,5,9,13, 2,6,...

> Could the blocks perhaps be processed in a more straightforward ordering,
> so that chacha-x86.pl behaves like the other implementations? (It's nice to
> avoid bugs that only trigger in one implementation.) Or is this order
> necessary for something?

It's the order in which the number of references to memory is minimal. But double-check attached.
diff --git a/crypto/chacha/asm/chacha-x86.pl b/crypto/chacha/asm/chacha-x86.pl
index 850c917..986e7f7 100755
--- a/crypto/chacha/asm/chacha-x86.pl
+++ b/crypto/chacha/asm/chacha-x86.pl
@@ -19,13 +19,13 @@
 # P4            18.6/+84%
 # Core2         9.56/+89%       4.83
 # Westmere      9.50/+45%       3.35
-# Sandy Bridge  10.5/+47%       3.20
-# Haswell       8.15/+50%       2.83
-# Silvermont    17.4/+36%       8.35
+# Sandy Bridge  10.7/+47%       3.24
+# Haswell       8.22/+50%       2.89
+# Silvermont    17.8/+36%       8.53
 # Sledgehammer  10.2/+54%
-# Bulldozer     13.4/+50%       4.38(*)
+# Bulldozer     13.5/+50%       4.39(*)
 #
-# (*) Bulldozer actually executes 4xXOP code path that delivers 3.55;
+# (*) Bulldozer actually executes 4xXOP code path that delivers 3.50;

 $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
 push(@INC,"${dir}","${dir}../../perlasm");
@@ -238,18 +238,20 @@ if ($xmm) {
  &xor($a, &DWP(4*0,$b)); # xor with input
  &xor($b_,&DWP(4*4,$b));
- &mov(&DWP(4*0,"esp"),$a);
+ &mov(&DWP(4*0,"esp"),$a); # off-load for later write
  &mov($a,&wparam(0));# load output pointer
  &xor($c, &DWP(4*8,$b));
  &xor($c_,&DWP(4*9,$b));
  &xor($d, &DWP(4*12,$b));
  &xor($d_,&DWP(4*14,$b));
- &mov(&DWP(4*4,$a),$b_); # write output
- &mov(&DWP(4*8,$a),$c);
- &mov(&DWP(4*9,$a),$c_);
- &mov(&DWP(4*12,$a),$d);
- &mov(&DWP(4*14,$a),$d_);
+ &mov(&DWP(4*4,"esp"),$b_);
+ &mov($b_,&DWP(4*0,"esp"));
+ &mov(&DWP(4*8,"esp"),$c);
+ &mov(&DWP(4*9,"esp"),$c_);
+ &mov(&DWP(4*12,"esp"),$d);
+ &mov(&DWP(4*14,"esp"),$d_);
+ &mov(&DWP(4*0,$a),$b_); # write output in order
  &mov($b_,&DWP(4*1,"esp"));
  &mov($c, &DWP(4*2,"esp"));
  &mov($c_,&DWP(4*3,"esp"));
@@ -266,35 +268,45 @@ if ($xmm) {
  &xor($d, &DWP(4*5,$b));
  &xor($d_,&DWP(4*6,$b));
  &mov(&DWP(4*1,$a),$b_);
+ &mov($b_,&DWP(4*4,"esp"));
  &mov(&DWP(4*2,$a),$c);
  &mov(&DWP(4*3,$a),$c_);
+ &mov(&DWP(4*4,$a),$b_);
  &mov(&DWP(4*5,$a),$d);
  &mov(&DWP(4*6,$a),$d_);
- &mov($b_,&DWP(4*7,"esp"));
- &mov($c, &DWP(4*10,"esp"));
+ &mov($c,&DWP(4*7,"esp"));
+ &mov($d,&DWP(4*8,"esp"));
+ &mov($d_,&DWP(4*9,"esp"));
+ &add($c,&DWP(64+4*7,"esp"));
+ &mov($b_, &DWP(4*10,"esp"));
+ &xor($c,&DWP(4*7,$b));
  &mov($c_,&DWP(4*11,"esp"));
+ &mov(&DWP(4*7,$a),$c);
+ &mov(&DWP(4*8,$a),$d);
+ &mov(&DWP(4*9,$a),$d_);
+
+ &add($b_, &DWP(64+4*10,"esp"));
+ &add($c_,&DWP(64+4*11,"esp"));
+ &xor($b_, &DWP(4*10,$b));
+ &xor($c_,&DWP(4*11,$b));
+ &mov(&DWP(4*10,$a),$b_);
+ &mov(&DWP(4*11,$a),$c_);
+
+ &mov($c,&DWP(4*12,"esp"));
+ &mov($c_,&DWP(4*14,"esp"));
  &mov($d, &DWP(4*13,"esp"));
  &mov($d_,&DWP(4*15,"esp"));
- &add($b_,&DWP(64+4*7,"esp"));
- &add($c, &DWP(64+4*10,"esp"));
- &add($c_,&DWP(64+4*11,"esp"));
  &add($d, &DWP(64+4*13,"esp"));
  &add($d_,&DWP(64+4*15,"esp"));
- &xor($b_,&DWP(4*7,$b));
- &xor($c, &DWP(4*10,$b));
- &xor($c_,&DWP(4*11,$b));
  &xor($d, &DWP(4*13,$b));
  &xor($d_,&DWP(4*15,$b));
  &le
[openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
If the other EVP ciphers universally allow this then I think we must treat this as a bug, because people may be relying on this behaviour. There is also sporadic documentation in lower-level APIs (AES source and des.pod) that the buffers may overlap. If it's inconsistent then, at the very least, we must document that it is not allowed.
[openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files
I'm unclear on what EVP_CIPHER's interface guarantees are, but our EVP_AEAD APIs are documented to allow in/out buffers to alias as long as out is <= in. This matches what callers might expect from a naive implementation.

Our AES-GCM EVP_AEADs, which share code with OpenSSL, have tended to match this pattern too. For ChaCha, of chacha-{x86,x86_64,armv4,armv8}.pl and the C implementation, all seem to satisfy this (though it's possible I don't have complete coverage) except for chacha-x86.pl. That one works if in == out, but not if out is slightly behind.

We were able to reproduce problems when in = out + 1. The SSE3 code triggers if the input is at least 256 bytes and the non-SSE3 code if the input is at least 64 bytes. The non-SSE3 failure occurs because the words in a block are processed in a slightly funny order (0, 4, 8, 9, 12, 14, 1, 2, 3, 5, 6, 7, 10, 11, 13, 15). I haven't looked at the SSE3 case carefully, but I expect it's something similar.

Could the blocks perhaps be processed in a more straightforward ordering, so that chacha-x86.pl behaves like the other implementations? (It's nice to avoid bugs that only trigger in one implementation.) Or is this order necessary for something?

David
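The effect of the word order can be illustrated with a small Python model. This is only a sketch under stated assumptions: it is not the chacha-x86.pl code path, the keystream is an arbitrary fixed byte pattern rather than ChaCha output, each 32-bit word is read and written in one step, and the names xor_block and survives_aliasing are hypothetical. The input sits `shift` bytes after the output in the same buffer, so shift = 1 corresponds to in = out + 1.

```python
# Word order quoted in the report for the non-SSE3 path.
FUNNY_ORDER = [0, 4, 8, 9, 12, 14, 1, 2, 3, 5, 6, 7, 10, 11, 13, 15]
IN_ORDER = list(range(16))

# Arbitrary stand-in keystream (assumption; not real ChaCha output).
KEYSTREAM = bytes((i * 7 + 3) & 0xFF for i in range(64))


def xor_block(buf, in_off, out_off, order):
    """XOR one 64-byte block with KEYSTREAM, one 4-byte word at a time,
    reading and then writing each word in the given order."""
    for w in order:
        src = in_off + 4 * w
        dst = out_off + 4 * w
        word = bytes(a ^ b for a, b in zip(buf[src:src + 4],
                                           KEYSTREAM[4 * w:4 * w + 4]))
        buf[dst:dst + 4] = word


def survives_aliasing(order, shift):
    """True if the output is correct when in = out + shift in one buffer."""
    plaintext = bytes(range(64))
    expected = bytes(p ^ k for p, k in zip(plaintext, KEYSTREAM))
    buf = bytearray(64 + shift)
    buf[shift:shift + 64] = plaintext
    xor_block(buf, shift, 0, order)
    return bytes(buf[:64]) == expected


print("in order, in = out + 1:   ", survives_aliasing(IN_ORDER, 1))
print("funny order, in = out:    ", survives_aliasing(FUNNY_ORDER, 0))
print("funny order, in = out + 1:", survives_aliasing(FUNNY_ORDER, 1))
```

In this model the write for word 4 lands on the last input byte of word 3 before word 3 has been read, which is why only the out-of-order, out-slightly-behind case produces wrong output.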