Re: [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations
,"Paul E . McKenney" ,Andrew Morton ,Christopher Li ,Dou Liyang ,Masahiro Yamada ,Daniel Borkmann ,Markus Trippelsdorf ,Peter Foley ,Steven Rostedt ,Tim Chen ,Ard Biesheuvel ,Catalin Marinas ,Matthew Wilcox ,Michal Hocko ,Rob Landley ,Jiri Kosina ,"H . J . Lu" ,Paul Bolle ,Baoquan He ,Daniel Micay ,the arch/x86 maintainers ,linux-crypto@vger.kernel.org,LKML ,xen-de...@lists.xenproject.org,kvm list ,Linux PM list ,linux-arch ,linux-spa...@vger.kernel.org,Kernel Hardening From: h...@zytor.com Message-ID: <0ef6faaa-a99c-4f0d-9e4a-ad25e9395...@zytor.com> On July 19, 2017 4:25:56 PM PDT, Thomas Garnier wrote: >On Wed, Jul 19, 2017 at 4:08 PM, H. Peter Anvin wrote: >> On 07/19/17 15:47, Thomas Garnier wrote: >>> On Wed, Jul 19, 2017 at 3:33 PM, H. Peter Anvin >wrote: >>>> On 07/18/17 15:33, Thomas Garnier wrote: >>>>> The x86 relocation tool generates a list of 32-bit signed >integers. There >>>>> was no need to use 64-bit integers because all addresses where >above the 2G >>>>> top of the memory. >>>>> >>>>> This change add a large-reloc option to generate 64-bit unsigned >integers. >>>>> It can be used when the kernel plan to go below the top 2G and >32-bit >>>>> integers are not enough. >>>> >>>> Why on Earth? This would only be necessary if the *kernel itself* >was >>>> more than 2G, which isn't going to happen for the forseeable >future. >>> >>> Because the relocation integer is an absolute address, not an offset >>> in the binary. Next iteration, I can try using a 32-bit offset for >>> everyone. >> >> It is an absolute address *as the kernel was originally linked*, for >> obvious reasons. > >Sure when the kernel was just above 0x8000, it doesn't >work when it goes down to 0x. That's why using an >offset might make more sense in general. > >> >> -hpa >> What is the motivation for changing the pre linked address at all? -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [RFC 06/22] kvm: Adapt assembly for PIE support
,Chris Metcalf ,"Paul E . McKenney" ,Andrew Morton ,Christopher Li ,Dou Liyang ,Masahiro Yamada ,Daniel Borkmann ,Markus Trippelsdorf ,Peter Foley ,Steven Rostedt ,Tim Chen ,Catalin Marinas ,Matthew Wilcox ,Michal Hocko ,Rob Landley ,Jiri Kosina ,"H . J . Lu" ,Paul Bolle ,Baoquan He ,Daniel Micay ,the arch/x86 maintainers ,"linux-crypto@vger.kernel.org" ,Linux Kernel Mailing List ,xen-de...@lists.xenproject.org,kvm list ,linux-pm ,linux-arch ,Linux-Sparse ,Kernel Hardening From: h...@zytor.com Message-ID: <83ba7600-bc8d-4c91-812c-dd2a0bf44...@zytor.com> On July 19, 2017 3:58:07 PM PDT, Ard Biesheuvel wrote: >On 19 July 2017 at 23:27, H. Peter Anvin wrote: >> On 07/19/17 08:40, Thomas Garnier wrote: >>>> >>>> This doesn't look right. It's accessing a per-cpu variable. The >>>> per-cpu section is an absolute, zero-based section and not subject >to >>>> relocation. >>> >>> PIE does not respect the zero-based section, it tries to have >>> everything relative. Patch 16/22 also adapt per-cpu to work with PIE >>> (while keeping the zero absolute design by default). >>> >> >> This is silly. The right thing is for PIE is to be explicitly >absolute, >> without (%rip). The use of (%rip) memory references for percpu is >just >> an optimization. >> > >Sadly, there is an issue in binutils that may prevent us from doing >this as cleanly as we would want. > >For historical reasons, bfd.ld emits special symbols like >__GLOBAL_OFFSET_TABLE__ as absolute symbols with a section index of >SHN_ABS, even though it is quite obvious that they are relative like >any other symbol that points into the image. Unfortunately, this means >that binutils needs to emit R_X86_64_RELATIVE relocations even for >SHN_ABS symbols, which means we lose the ability to use both absolute >and relocatable symbols in the same PIE image (unless the reloc tool >can filter them out) > >More info here: >https://sourceware.org/bugzilla/show_bug.cgi?id=19818 The reloc tool already has the ability to filter symbols. -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [PATCH 5/7] x86, boot, LLVM: Use regparm=0 for memcpy and memset
On 03/17/17 05:08, Peter Zijlstra wrote: > On Thu, Mar 16, 2017 at 05:15:18PM -0700, Michael Davidson wrote: >> Use the standard regparm=0 calling convention for memcpy and >> memset when building with clang. >> >> This is a work around for a long standing clang bug >> (see https://llvm.org/bugs/show_bug.cgi?id=3997) where >> clang always uses the standard regparm=0 calling convention >> for any implcit calls to memcpy and memset that it generates >> (eg for structure assignments and initialization) even if an >> alternate calling convention such as regparm=3 has been specified. > > Seriously, fix LLVM already. > Yes, this is a real stinker, in no small part because once clang is fixed to DTRT then this is actually broken... -hpa
Re: [PATCH 3/7] x86, LLVM: suppress clang warnings about unaligned accesses
On 04/13/17 16:14, Matthias Kaehlcke wrote: > El Mon, Apr 03, 2017 at 04:01:58PM -0700 Matthias Kaehlcke ha dit: > >> El Fri, Mar 17, 2017 at 04:50:19PM -0700 h...@zytor.com ha dit: >> >>> On March 16, 2017 5:15:16 PM PDT, Michael Davidson wrote: Suppress clang warnings about potential unaliged accesses to members in packed structs. This gets rid of almost 10,000 warnings about accesses to the ring 0 stack pointer in the TSS. Signed-off-by: Michael Davidson --- arch/x86/Makefile | 5 + 1 file changed, 5 insertions(+) diff --git a/arch/x86/Makefile b/arch/x86/Makefile index 894a8d18bf97..7f21703c475d 100644 --- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -128,6 +128,11 @@ endif KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args) endif +ifeq ($(cc-name),clang) +# Suppress clang warnings about potential unaligned accesses. +KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member) +endif + ifdef CONFIG_X86_X32 x32_ld_ok := $(call try-run,\ /bin/echo -e '1: .quad 1b' | \ >>> >>> Why conditional on clang? >> >> My understanding is that this warning is clang specific, it is not >> listed on https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html > > Actually this warning affects other platforms besides x86 > (e.g. arm64), I'll submit a patch that disables the warning on all > platforms. > Drop the ifeq ($(cc-name),clang). You should assume that if you have to add one of those ifeq's then you are doing something fundamentally wrong. -hpa
Re: [PATCH 2/7] Makefile, x86, LLVM: disable unsupported optimization flags
On 03/17/17 14:32, H. Peter Anvin wrote: > > NAK. Fix your compiler, or use a wrapper script or something. It is > absolutely *not* acceptable to disable this since future versions of > clang *should* support that. > > That being said, it might make sense to look for a key pattern like > "(un|not )supported" on stderr the try-run macro. Is there really no > -Wno- or -Werror= option to turn off this craziness? > Well, guess what... I found it myself. -W{no-,error=}ignored-optimization-argument Either variant will make this sane. -hpa
Re: [PATCH 2/7] Makefile, x86, LLVM: disable unsupported optimization flags
On 03/16/17 17:15, Michael Davidson wrote: > Unfortunately, while clang generates a warning about these flags > being unsupported it still exits with a status of 0 so we have > to explicitly disable them instead of just using a cc-option check. > > Signed-off-by: Michael Davidson > --- > Makefile | 2 ++ > arch/x86/Makefile | 2 ++ > 2 files changed, 4 insertions(+) > > diff --git a/Makefile b/Makefile > index b21fd0ca2946..5e97e5fc1eea 100644 > --- a/Makefile > +++ b/Makefile > @@ -629,7 +629,9 @@ ARCH_AFLAGS := > ARCH_CFLAGS := > include arch/$(SRCARCH)/Makefile > > +ifneq ($(cc-name),clang) > KBUILD_CFLAGS+= $(call cc-option,-fno-delete-null-pointer-checks,) > +endif > KBUILD_CFLAGS+= $(call cc-disable-warning,frame-address,) > > ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION > diff --git a/arch/x86/Makefile b/arch/x86/Makefile > index 2d449337a360..894a8d18bf97 100644 > --- a/arch/x86/Makefile > +++ b/arch/x86/Makefile > @@ -87,11 +87,13 @@ else > KBUILD_AFLAGS += -m64 > KBUILD_CFLAGS += -m64 > > +ifneq ($(cc-name),clang) > # Align jump targets to 1 byte, not the default 16 bytes: > KBUILD_CFLAGS += -falign-jumps=1 > > # Pack loops tightly as well: > KBUILD_CFLAGS += -falign-loops=1 > +endif > > # Don't autogenerate traditional x87 instructions > KBUILD_CFLAGS += $(call cc-option,-mno-80387) > NAK. Fix your compiler, or use a wrapper script or something. It is absolutely *not* acceptable to disable this since future versions of clang *should* support that. That being said, it might make sense to look for a key pattern like "(un|not )supported" on stderr the try-run macro. Is there really no -Wno- or -Werror= option to turn off this craziness? -hpa
Re: [PATCH] x86/crypto: fix %progbits -> @progbits
On 01/19/17 13:28, Denys Vlasenko wrote: > %progbits form is used on ARM (where @ is a comment char). > > x86 consistently uses @progbits everywhere else. However, it looks like %progbits works on all architectures (at least include/linux/init.h seems to imply so.) Perhaps a tree-wide replacement the other way would make more sense. Personally I would also like to see these parameters macroized, to keep someone from getting them wrong, just like we have __INIT, __INITRODATA etc already, we should just have plain __TEXT __DATA __BSS... -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Entropy sources
On 08/25/16 16:35, Sandy Harris wrote: > On Thu, Aug 25, 2016 at 5:30 PM, H. Peter Anvin wrote: > >> The network stack is a good source of entropy, *once it is online*. >> However, the most serious case is while the machine is still booting, >> when the network will not have enabled yet. >> >> -hpa > > One possible solution is at: > https://github.com/sandy-harris/maxwell > > A small (< 700 lines) daemon that gets entropy from timer imprecision > and variations in time for arithmetic (cache misses, interrupts, etc.) > and pumps it into /dev/random. Make it the first userspace program > started and all should be covered. Directory above includes a PDF doc > with detailed rationale and some discussion of alternate solutions. > > Of course if you are dealing with a system-on-a-chip or low-end > embedded CPU & the timer is really inadequate, this will not work > well. Conceivably well enough, but we could not know that without > detailed analysis for each chip in question. > A lot of this is exactly the same thing /dev/random already does in kernel space. I have already in the past expressed skepticism toward this approach, because a lot of the analysis done has simply been bogus. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Entropy sources (was: /dev/random - a new approach)
On 08/20/16 22:37, Jeffrey Walton wrote: >> >> The biggest problem there is that the timer interrupt adds *no* entropy >> unless there is a source of asynchronicity in the system. On PCs, >> traditionally the timer has been run from a completely different crystal >> (14.31818 MHz) than the CPU, which is the ideal situation, but if they >> are run off the same crystal and run in lockstep, there is very little >> if anything there. On some systems, the timer may even *be* the only >> source of time, and the entropy truly is zero. > > It seems like a networked computer should have an abundance on entropy > available from the network stack. Every common case I can come up with > includes a networked computer. If a handheld is outside of coverage, > then it probably does not have the randomness demands because it can't > communicate (i.e., TCP sequence numbers, key agreement, etc). > > In fact, there are at least two papers that use bits from the network stack: > The network stack is a good source of entropy, *once it is online*. However, the most serious case is while the machine is still booting, when the network will not have enabled yet. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 0/5] /dev/random - a new approach
On 08/18/16 22:56, Herbert Xu wrote: > On Thu, Aug 18, 2016 at 10:49:47PM -0400, Theodore Ts'o wrote: >> >> That really depends on the system. We can't assume that people are >> using systems with a 100Hz clock interrupt. More often than not >> people are using tickless kernels these days. That's actually the >> problem with changing /dev/urandom to block until things are >> initialized. > > Couldn't we disable tickless until urandom has been seeded? In fact > perhaps we should accelerate the timer interrupt rate until it has > been seeded? > The biggest problem there is that the timer interrupt adds *no* entropy unless there is a source of asynchronicity in the system. On PCs, traditionally the timer has been run from a completely different crystal (14.31818 MHz) than the CPU, which is the ideal situation, but if they are run off the same crystal and run in lockstep, there is very little if anything there. On some systems, the timer may even *be* the only source of time, and the entropy truly is zero. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 0/5] /dev/random - a new approach
On 08/16/16 15:28, H. Peter Anvin wrote: > On 08/15/16 22:45, Stephan Mueller wrote: >> Am Montag, 15. August 2016, 13:42:54 CEST schrieb H. Peter Anvin: >> >> Hi H, >> >>> On 08/11/16 05:24, Stephan Mueller wrote: >>>> * prevent fast noise sources from dominating slow noise sources >>>> >>>> in case of /dev/random >>> >>> Can someone please explain if and why this is actually desirable, and if >>> this assessment has been passed to someone who has actual experience >>> with cryptography at the professional level? >> >> There are two motivations for that: >> >> - the current /dev/random is compliant to NTG.1 from AIS 20/31 which >> requires >> (in brief words) that entropy comes from auditible noise sources. Currently >> in >> my LRNG only RDRAND is a fast noise source which is not auditible (and it is >> designed to cause a VM exit making it even harder to assess it). To make the >> LRNG to comply with NTG.1, RDRAND can provide entropy but must not become >> the >> sole entropy provider which is the case now with that change. >> >> - the current /dev/random implementation follows the same concept with the >> exception of 3.15 and 3.16 where RDRAND was not rate-limited. In later >> versions, this was changed. >> > > I'm not saying it should be *sole*. I am questioning the value in > limiting it, as it seems to me that it could only ever produce a worse > result. > Also, it would be great to actually get a definition for "auditable". A quantum white noise source which exceeds the sampling bandwidth is an ideal RNG; how do you "audit" that? If what you are doing is looking for imperfections, those imperfections can be trivially emulated. If what you mean is an audit on the chip or circuit level, that would require some mechanism to know that all items were built identically without deviation, which may be possible for intelligence agencies or the military who have full control of their supply chain, but for anyone else that is most likely an impossible task. How many people are going to crack the case and look at even a discrete transistor circuit, and how many of *those* are going to be able to discern if that circuit is subject to RF capture, or its output even used? I have been trying to figure out how to reasonably solve this problem for a long time now, and it is not just a problem for RDSEED (RDRAND is a slightly different beast.) The only reason RDSEED exposes the problem particularly harshly is because it is extremely high bandwidth compared to other noise sources and it is architecturally integrated into the CPU, but the same would apply to an external noise generator connected via PCIe, for example. Incidentally, I am hoping -- and this is a personal statement and nothing official from Intel -- that at some future date RDRAND (not RDSEED) will be fast enough that it can completely replace even prandom_u32(), which I really hope can be non-controversial as prandom_u32() isn't cryptographically strong in the first place. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 0/5] /dev/random - a new approach
On 08/15/16 22:45, Stephan Mueller wrote: > Am Montag, 15. August 2016, 13:42:54 CEST schrieb H. Peter Anvin: > > Hi H, > >> On 08/11/16 05:24, Stephan Mueller wrote: >>> * prevent fast noise sources from dominating slow noise sources >>> >>> in case of /dev/random >> >> Can someone please explain if and why this is actually desirable, and if >> this assessment has been passed to someone who has actual experience >> with cryptography at the professional level? > > There are two motivations for that: > > - the current /dev/random is compliant to NTG.1 from AIS 20/31 which requires > (in brief words) that entropy comes from auditible noise sources. Currently > in > my LRNG only RDRAND is a fast noise source which is not auditible (and it is > designed to cause a VM exit making it even harder to assess it). To make the > LRNG to comply with NTG.1, RDRAND can provide entropy but must not become the > sole entropy provider which is the case now with that change. > > - the current /dev/random implementation follows the same concept with the > exception of 3.15 and 3.16 where RDRAND was not rate-limited. In later > versions, this was changed. > I'm not saying it should be *sole*. I am questioning the value in limiting it, as it seems to me that it could only ever produce a worse result. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 0/5] /dev/random - a new approach
On 08/11/16 05:24, Stephan Mueller wrote: > * prevent fast noise sources from dominating slow noise sources > in case of /dev/random Can someone please explain if and why this is actually desirable, and if this assessment has been passed to someone who has actual experience with cryptography at the professional level? -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch] crypto: sha256-mb - cleanup a || vs | typo
On 06/29/16 07:42, Dan Carpenter wrote: > || and | behave basically the same here but || is intended. It causes a > static checker warning to mix up bitwise and logical operations. > > Signed-off-by: Dan Carpenter > > diff --git a/arch/x86/crypto/sha256-mb/sha256_mb.c > b/arch/x86/crypto/sha256-mb/sha256_mb.c > index c9d5dcc..4ec895a 100644 > --- a/arch/x86/crypto/sha256-mb/sha256_mb.c > +++ b/arch/x86/crypto/sha256-mb/sha256_mb.c > @@ -299,7 +299,7 @@ static struct sha256_hash_ctx > *sha256_ctx_mgr_submit(struct sha256_ctx_mgr *mgr, >* Or if the user's buffer contains less than a whole block, >* append as much as possible to the extra block. >*/ > - if ((ctx->partial_block_buffer_length) | (len < SHA256_BLOCK_SIZE)) { > + if ((ctx->partial_block_buffer_length) || (len < SHA256_BLOCK_SIZE)) { > /* Compute how many bytes to copy from user buffer into >* extra block >*/ > As far as I know the | was an intentional optimization, so you may way to look at the generated code. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/7] random: replace non-blocking pool with a Chacha20-based CRNG
On 06/20/16 08:49, Stephan Mueller wrote: > Am Montag, 20. Juni 2016, 11:01:47 schrieb Theodore Ts'o: > > Hi Theodore, > >> >> So simply doing chacha20 encryption in a tight loop in the kernel >> might not be a good proxy for what would actually happen in real life >> when someone calls getrandom(2). (Another good question to ask is >> when someone might be needing to generate millions of 256-bit session >> keys per second, when the D-H setup, even if you were using ECCDH, >> would be largely dominating the time for the connection setup anyway.) > > Is speed everything we should care about? What about: > > - offloading of crypto operation from the CPU > This sounds like a speed operation (and very unlikely to be a win given the usage). > - potentially additional security features a hardware cipher may provide like > cache coloring attack resistance? How about burning that bridge when and if we get to it? It sounds very hypothetical. I guess I could add in some comments here about how a lot of these problems can be eliminated by offloading an entire DRNG into hardware, but I don't think it is productive. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux/bitops.h
On May 6, 2016 1:07:13 PM PDT, Sasha Levin wrote: >On 05/04/2016 08:30 PM, H. Peter Anvin wrote: >> On 05/04/16 15:06, John Denker wrote: >>> On 05/04/2016 02:56 PM, H. Peter Anvin wrote: >>>>> Beware that shifting by an amount >= the number of bits in the >>>>> word remains Undefined Behavior. >>> >>>> This construct has been supported as a rotate since at least gcc2. >>> >>> How then should we understand the story told in commit d7e35dfa? >>> Is the story wrong? >>> >>> At the very least, something inconsistent is going on. There >>> are 8 functions. Why did d7e35dfa change one of them but >>> not the other 7? >> >> Yes. d7e35dfa is baloney IMNSHO. All it does is produce worse code, >and the description even says so. > >No, the description says that it produces worse code for *really >really* ancient >GCC versions. > >> As I said, gcc has treated the former code as idiomatic since gcc 2, >so that support is beyond ancient. > >Because something works in a specific way on one compiler isn't a >reason to >ignore this noncompliance with the standard. > > >Thanks, >Sasha When the compiler in question is our flagship target and our reference compiler, then yes, it matters. -- Sent from my Android device with K-9 Mail. Please excuse brevity and formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux/bitops.h
On May 6, 2016 1:07:13 PM PDT, Sasha Levin wrote: >On 05/04/2016 08:30 PM, H. Peter Anvin wrote: >> On 05/04/16 15:06, John Denker wrote: >>> On 05/04/2016 02:56 PM, H. Peter Anvin wrote: >>>>> Beware that shifting by an amount >= the number of bits in the >>>>> word remains Undefined Behavior. >>> >>>> This construct has been supported as a rotate since at least gcc2. >>> >>> How then should we understand the story told in commit d7e35dfa? >>> Is the story wrong? >>> >>> At the very least, something inconsistent is going on. There >>> are 8 functions. Why did d7e35dfa change one of them but >>> not the other 7? >> >> Yes. d7e35dfa is baloney IMNSHO. All it does is produce worse code, >and the description even says so. > >No, the description says that it produces worse code for *really >really* ancient >GCC versions. > >> As I said, gcc has treated the former code as idiomatic since gcc 2, >so that support is beyond ancient. > >Because something works in a specific way on one compiler isn't a >reason to >ignore this noncompliance with the standard. > > >Thanks, >Sasha 4.6.2 is not "really, really ancient." -- Sent from my Android device with K-9 Mail. Please excuse brevity and formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: better patch for linux/bitops.h
On 05/05/2016 03:18 PM, ty...@mit.edu wrote: > > So this is why I tend to take a much more pragmatic viewpoint on > things. Sure, it makes sense to pay attention to what the C standard > writers are trying to do to us; but if we need to suppress certain > optimizations to write sane kernel code --- I'm ok with that. And > this is why using a trust-but-verify on a specific set of compilers > and ranges of compiler versions is a really good idea > For the record, the "portable" construct has apparently only been supported since gcc 4.6.3. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: better patch for linux/bitops.h
On May 5, 2016 3:18:09 PM PDT, ty...@mit.edu wrote: >On Thu, May 05, 2016 at 05:34:50PM -0400, Sandy Harris wrote: >> >> I completely fail to see why tests or compiler versions should be >> part of the discussion. The C standard says the behaviour in >> certain cases is undefined, so a standard-compliant compiler >> can generate more-or-less any code there. >> > >> As long as any of portability, reliability or security are among our >> goals, any code that can give undefined behaviour should be >> considered problematic. > >Because compilers have been known not necessarily to obey the specs, >and/or interpret the specs in way that not everyone agrees with. It's >also the case that we are *already* disabling certain C optimizations >which are technically allowed by the spec, but which kernel >programmers consider insane (e.g., strict aliasing). > >And of course, memzero_explicit() which crypto people understand is >necessary, is something that technically compilers are allowed to >optimize according to the spec. So trying to write secure kernel code >which will work on arbitrary compilers may well be impossible. > >And which is also why people have said (mostly in jest), "A >sufficiently advanced compiler is indistinguishable from an >adversary." (I assume people will agree that optimizing away a memset >needed to clear secrets from memory would be considered adversarial, >at the very least!) > >So this is why I tend to take a much more pragmatic viewpoint on >things. Sure, it makes sense to pay attention to what the C standard >writers are trying to do to us; but if we need to suppress certain >optimizations to write sane kernel code --- I'm ok with that. And >this is why using a trust-but-verify on a specific set of compilers >and ranges of compiler versions is a really good idea > >- Ted I have filed a gcc bug to have the preexisting rotate idiom officially documented as a GNU C extension. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70967 -- Sent from my Android device with K-9 Mail. Please excuse brevity and formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: better patch for linux/bitops.h
On 05/05/2016 03:18 PM, ty...@mit.edu wrote: > On Thu, May 05, 2016 at 05:34:50PM -0400, Sandy Harris wrote: >> >> I completely fail to see why tests or compiler versions should be >> part of the discussion. The C standard says the behaviour in >> certain cases is undefined, so a standard-compliant compiler >> can generate more-or-less any code there. >> > >> As long as any of portability, reliability or security are among our >> goals, any code that can give undefined behaviour should be >> considered problematic. > > Because compilers have been known not necessarily to obey the specs, > and/or interpret the specs in way that not everyone agrees with. It's > also the case that we are *already* disabling certain C optimizations > which are technically allowed by the spec, but which kernel > programmers consider insane (e.g., strict aliasing). > > And of course, memzero_explicit() which crypto people understand is > necessary, is something that technically compilers are allowed to > optimize according to the spec. So trying to write secure kernel code > which will work on arbitrary compilers may well be impossible. > > And which is also why people have said (mostly in jest), "A > sufficiently advanced compiler is indistinguishable from an > adversary." (I assume people will agree that optimizing away a memset > needed to clear secrets from memory would be considered adversarial, > at the very least!) > > So this is why I tend to take a much more pragmatic viewpoint on > things. Sure, it makes sense to pay attention to what the C standard > writers are trying to do to us; but if we need to suppress certain > optimizations to write sane kernel code --- I'm ok with that. And > this is why using a trust-but-verify on a specific set of compilers > and ranges of compiler versions is a really good idea > In theory, theory and practice should agree, but in practice, practice is what counts. I fully agree we should get rid of UD behavior where doing so is practical, but not at the cost of breaking real-life compilers, expecially not gcc, and to a lesser but still very real extent icc and clang. I would also agree that we should push the gcc developers to add to the manual C-standard-UD behavior which are well-defined under the gnu89/gnu99/gnu11 C dialects. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: better patch for linux/bitops.h
On 05/04/16 21:03, Jeffrey Walton wrote: On Wed, May 4, 2016 at 11:50 PM, Theodore Ts'o wrote: ... But instead of arguing over what works and doesn't, let's just create the the test set and just try it on a wide range of compilers and architectures, hmmm? What are the requirements? Here's a short list: * No undefined behavior - important because the compiler writers use the C standard * Compiles to native "rotate IMMEDIATE" if the rotate amount is a "constant expression" and the machine provides it - translates to a native rotate instruction if available - "rotate IMM" can be 3 times faster than "rotate REG" - do any architectures *not* provide a rotate? * Compiles to native "rotate REGISTER" if the rotate is variable and the machine provides it - do any architectures *not* provide a rotate? * Constant time - important to high-integrity code - Non-security code paths probably don't care Maybe the first thing to do is provide a different rotates for the constant-time requirement when its in effect? The disagreement here is the priority between these points. In my very strong opinion, "no undefined behavior" per the C standard is way less important than the others; what matters is what gcc and the other compilers we care about do. The kernel relies on various versions of C-standard-undefined behavior *all over the place*; for one thing sizeof(void *) == sizeof(size_t) == sizeof(unsigned long)!! but they are well-defined in the subcontext we care about. (And no, not all architectures provide a rotate instruction.) -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: better patch for linux/bitops.h
On May 4, 2016 7:54:12 PM PDT, Jeffrey Walton wrote: >On Wed, May 4, 2016 at 10:41 PM, H. Peter Anvin wrote: >> On May 4, 2016 6:35:44 PM PDT, Jeffrey Walton >wrote: >>>On Wed, May 4, 2016 at 5:52 PM, John Denker wrote: >>>> On 05/04/2016 02:42 PM, I wrote: >>>> >>>>> I find it very odd that the other seven functions were not >>>>> upgraded. I suggest the attached fix-others.diff would make >>>>> things more consistent. >>>> >>>> Here's a replacement patch. >>>> ... >>> >>>+1, commit it. >>> >>>Its good for three additional reasons. First, this change means the >>>kernel is teaching the next generation the correct way to do things. >>>Many developers aspire to be kernel hackers, and they sometimes use >>>the code from bitops.h. I've actually run across the code in >>>production during an audit where the developers cited bitops.h. >>> >>>Second, it preserves a "silent and dark" cockpit during analysis. >That >>>is, when analysis is run, no findings are generated. Auditors and >>>security folks like quiet tools, much like the way pilots like their >>>cockpits (flashing lights and sounding buzzers usually means >something >>>is wrong). >>> >>>Third, the pattern is recognized by the major compilers, so the >kernel >>>should not have any trouble when porting any of the compilers. I >often >>>use multiple compiler to tease out implementation defined behavior in >>>a effort to achieve greater portability. Here are the citations to >>>ensure the kernel is safe with the pattern: >>> >>> * GCC: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57157 >>> * ICC: http://software.intel.com/en-us/forums/topic/580884 >>> >>>However, Clang may cause trouble because they don't want the >>>responsibility of recognizing the pattern: >>> >>> * https://llvm.org/bugs/show_bug.cgi?id=24226#c8 >>> >>>Instead, they provide a defective rotate. The "defect" here is its >>>non-constant time due to the branch, so it may not be suitable for >>>high-integrity or high-assurance code like linux-crypto: >>> >>> * https://llvm.org/bugs/show_bug.cgi?id=24226#c5 >>> >>>Jeff >> >> So you are actually saying outright that we should sacrifice *actual* >portability in favor of *theoretical* portability? What kind of >twilight zone did we just step into?! > >I'm not sure what you mean. It will be well defined on all platforms. >Clang may not recognize the pattern, which means they could run >slower. GCC and ICC will be fine. > >Slower but correct code is what you have to live with until the Clang >dev's fix their compiler. > >Its kind of like what Dr. Jon Bentley said: "If it doesn't have to be >correct, I can make it as fast as you'd like it to be". > >Jeff The current code works on all compilers we care about. The code you propose does not; it doesn't work on anything but very recent versions of our flagship target compiler, and pretty your own admission might even cause security hazards in the kernel if compiled on clang. That qualifies as insane in my book. The church code is de facto portable with the intended outcome, the one you propose is not. -- Sent from my Android device with K-9 Mail. Please excuse brevity and formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: better patch for linux/bitops.h
On May 4, 2016 6:35:44 PM PDT, Jeffrey Walton wrote: >On Wed, May 4, 2016 at 5:52 PM, John Denker wrote: >> On 05/04/2016 02:42 PM, I wrote: >> >>> I find it very odd that the other seven functions were not >>> upgraded. I suggest the attached fix-others.diff would make >>> things more consistent. >> >> Here's a replacement patch. >> ... > >+1, commit it. > >Its good for three additional reasons. First, this change means the >kernel is teaching the next generation the correct way to do things. >Many developers aspire to be kernel hackers, and they sometimes use >the code from bitops.h. I've actually run across the code in >production during an audit where the developers cited bitops.h. > >Second, it preserves a "silent and dark" cockpit during analysis. That >is, when analysis is run, no findings are generated. Auditors and >security folks like quiet tools, much like the way pilots like their >cockpits (flashing lights and sounding buzzers usually means something >is wrong). > >Third, the pattern is recognized by the major compilers, so the kernel >should not have any trouble when porting any of the compilers. I often >use multiple compiler to tease out implementation defined behavior in >a effort to achieve greater portability. Here are the citations to >ensure the kernel is safe with the pattern: > > * GCC: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57157 > * ICC: http://software.intel.com/en-us/forums/topic/580884 > >However, Clang may cause trouble because they don't want the >responsibility of recognizing the pattern: > > * https://llvm.org/bugs/show_bug.cgi?id=24226#c8 > >Instead, they provide a defective rotate. The "defect" here is its >non-constant time due to the branch, so it may not be suitable for >high-integrity or high-assurance code like linux-crypto: > > * https://llvm.org/bugs/show_bug.cgi?id=24226#c5 > >Jeff So you are actually saying outright that we should sacrifice *actual* portability in favor of *theoretical* portability? What kind of twilight zone did we just step into?! -- Sent from my Android device with K-9 Mail. Please excuse brevity and formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux/bitops.h
On May 4, 2016 6:20:32 PM PDT, Jeffrey Walton wrote: >On Wed, May 4, 2016 at 7:06 PM, Andi Kleen wrote: >> On Wed, May 04, 2016 at 03:06:04PM -0700, John Denker wrote: >>> On 05/04/2016 02:56 PM, H. Peter Anvin wrote: >>> >> Beware that shifting by an amount >= the number of bits in the >>> >> word remains Undefined Behavior. >>> >>> > This construct has been supported as a rotate since at least gcc2. >>> >>> How then should we understand the story told in commit d7e35dfa? >>> Is the story wrong? >> >> I don't think Linux runs on a system where it would make a difference >> (like a VAX), and also gcc always converts it before it could. >> Even UBSan should not complain because it runs after the conversion >> to ROTATE. >> >From what I understand, its a limitation in the barrel shifter and the >way the shift bits are handled. > >Linux runs on a great number of devices, so its conceivable (likely?) >a low-cost board would have hardware limitations that not found in >modern desktops and servers or VAX. > >Jeff This is a compiler feature, though. And if icc or clang don't support the idiom they would never have compiled a working kernel. -- Sent from my Android device with K-9 Mail. Please excuse brevity and formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux/bitops.h
On 05/04/16 15:06, John Denker wrote: On 05/04/2016 02:56 PM, H. Peter Anvin wrote: Beware that shifting by an amount >= the number of bits in the word remains Undefined Behavior. This construct has been supported as a rotate since at least gcc2. How then should we understand the story told in commit d7e35dfa? Is the story wrong? At the very least, something inconsistent is going on. There are 8 functions. Why did d7e35dfa change one of them but not the other 7? Yes. d7e35dfa is baloney IMNSHO. All it does is produce worse code, and the description even says so. As I said, gcc has treated the former code as idiomatic since gcc 2, so that support is beyond ancient. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] random: replace non-blocking pool with a Chacha20-based CRNG
On May 4, 2016 2:42:53 PM PDT, John Denker wrote: >On 05/04/2016 12:07 PM, ty...@thunk.org wrote: > >> it doesn't hit the >> UB case which Jeffrey was concerned about. > >That should be good enough for present purposes > >However, in the interests of long-term maintainability, I >would suggest sticking in a comment or assertion saying >that ror32(,shift) is never called with shift=0. This >can be removed if/when bitops.h is upgraded. > >There is a track record of compilers doing Bad Things in >response to UB code, including some very counterintuitive >Bad Things. > >On Wed, May 04, 2016 at 11:29:57AM -0700, H. Peter Anvin wrote: >>> >>> If bitops.h doesn't do the right thing, we need to >>> fix bitops.h. > >Most of the ror and rol functions in linux/bitops.h >should be considered unsafe, as currently implemented. >http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/bitops.h?id=04974df8049fc4240d22759a91e035082ccd18b4#n103 > >I don't see anything in the file that suggests any limits >on the range of the second argument. So at best it is an >undocumented trap for the unwary. This has demonstrably >been a problem in the past. The explanation in the attached >fix-rol32.diff makes amusing reading. > >Of the eight functions > ror64, rol64, ror32, rol32, ror16, rol16, ror8, and rol8, >only one of them can handle shifting by zero, namely rol32. >It was upgraded on Thu Dec 3 22:04:01 2015; see the attached >fix-rol32.diff. > >I find it very odd that the other seven functions were not >upgraded. I suggest the attached fix-others.diff would make >things more consistent. > >Beware that shifting by an amount >= the number of bits in the >word remains Undefined Behavior. This should be either documented >or fixed. It could be fixed easily enough. This construct has been supported as a rotate since at least gcc2. -- Sent from my Android device with K-9 Mail. Please excuse brevity and formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] random: replace non-blocking pool with a Chacha20-based CRNG
On 05/04/2016 12:07 PM, ty...@thunk.org wrote: > > If we are all agreed that what is in bitops.h is considered valid, > then we can start converting people over to using the version defined > in bitops.h, and if there is some compiler issue we need to work > around, at least we only need to put the workaround in a single header > file. > Yes, please. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] random: replace non-blocking pool with a Chacha20-based CRNG
On May 4, 2016 11:22:25 AM PDT, Jeffrey Walton wrote: >On Wed, May 4, 2016 at 1:49 PM, wrote: >> On Wed, May 04, 2016 at 10:40:20AM -0400, Jeffrey Walton wrote: >>> > +static inline u32 rotl32(u32 v, u8 n) >>> > +{ >>> > + return (v << n) | (v >> (sizeof(v) * 8 - n)); >>> > +} >>> >>> That's undefined behavior when n=0. >> >> Sure, but it's never called with n = 0; I've double checked and the >> compiler seems to do the right thing with the above pattern as well. > >> Hmm, it looks like there is a "standard" version rotate left and >right >> defined in include/linux/bitops.h. So I suspect it would make sense >> to use rol32 as defined in bitops.h --- and this is probably >something > >bitops.h could work in this case, but its not an ideal solution. GCC >does not optimize the code below as expected under all use cases >because GCC does not recognize it as a rotate (see >http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57157): > >return (v << n) | (v >> (sizeof(v) * 8 - n)); > >And outside of special cases like Salsa, ChaCha and BLAKE2, the code >provided in bitops.h suffers UB on arbitrary data. So I think care >needs to be taken when selecting functions from bitops.h. > >> that we should do for the rest of crypto/*.c, where people seem to be >> defininig their own version of something like rotl32 (I copied the >> contents of crypto/chacha20_generic.c to lib/chacha20, so this >pattern >> of defining one's own version of rol32 isn't new). > >Yeah, I kind of thought there was some duplication going on. > >But I think bitops.h should be fixed. Many folks don't realize the >lurking UB, and many folks don't realize its not always optimized >well. > >>> I think the portable way to do a rotate that avoids UB is the >>> following. GCC, Clang and ICC recognize the pattern, and emit a >rotate >>> instruction. >>> >>> static const unsigned int MASK=31; >>> return (v<>(-n&MASK)); >>> >>> You should also avoid the following because its not constant time >due >>> to the branch: >>> >>> return n == 0 ? v : (v << n) | (v >> (sizeof(v) * 8 - n)); >>> >> >> Where is this coming from? I don't see this construct in the patch. > >My bad... It was a general observation. I've seen folks try to correct >the UB by turning to something like that. > >Jeff We don't care about UB, we care about gcc, and to a lesser extent LLVM and ICC. If bitops.h doesn't do the right thing, we need to fix bitops.h. -- Sent from my Android device with K-9 Mail. Please excuse brevity and formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] random: replace non-blocking pool with a Chacha20-based CRNG
On May 4, 2016 10:30:41 AM PDT, ty...@mit.edu wrote: >On Tue, May 03, 2016 at 10:50:25AM +0200, Stephan Mueller wrote: >> > +/* >> > + * crng_init = 0 --> Uninitialized >> > + *2 --> Initialized >> > + *3 --> Initialized from input_pool >> > + */ >> > +static int crng_init = 0; >> >> shouldn't that be an atomic_t ? > >The crng_init variable only gets incremented (it progresses from >0->1->2->3) and it's protected by the primary_crng->lock spinlock. >There are a few places where we read it without first taking the lock, >but that's where we depend on it getting incremented and where if we >race with another CPU which has just bumped the crng_init status, it's >not critical. (Or we can take the lock and then recheck the crng_init >if we really need to be sure.) > >> > +static void _initialize_crng(struct crng_state *crng) >> > +{ >> > + int i; >> > + unsigned long rv; >> >> Why do you use unsigned long here? I thought the state[i] is unsigned >int. > >Because it gets filled in via arch_get_random_seed_long(&rv) and >arch_get_random_log(&rv). If that means we get 64 bits and we only >use 32 bits, that's OK > >> Would it make sense to add the ChaCha20 self test vectors from >RFC7539 here to >> test that the ChaCha20 works? > >We're not doing that for any of the other ciphers and hashes. We >didn't do that for SHA1, and the chacha20 code where I took this from >didn't check for this as well. What threat are you most worried >about. We don't care about chacha20 being exactly chacha20, so long >as it's cryptographically strong. In fact I think I removed a >potential host order byteswap in the set key operation specifically >because we don't care and interop. > >If this is required for FIPS mode, we can add that later. I was >primarily trying to keep the patch small to be easier for people to >look at it, so I've deliberately left off bells and whistles that >aren't strictly needed to show that the new approach is sound. > >> > + if (crng_init++ >= 2) >> > + wake_up_interruptible(&crng_init_wait); >> >> Don't we have a race here with the crng_init < 3 check in crng_reseed > >> considering multi-core systems? > >No, because we are holding the primary_crng->lock spinlock. I'll add >a comment explaining the locking protections which is intended for >crng_init where we declare it. > > >> > + if (num < 16 || num > 32) { >> > + WARN_ON(1); >> > + pr_err("crng_reseed: num is %d?!?\n", num); >> > + } >> > + num_words = (num + 3) / 4; >> > + for (i = 0; i < num_words; i++) >> > + primary_crng.state[i+4] ^= tmp[i]; >> > + primary_crng.init_time = jiffies; >> > + if (crng_init < 3) { >> >> Shouldn't that one be if (crng_init < 3 && num >= 16) ? > >I'll just change the above WRN_ON test to be: > > BUG_ON(num < 16 || num > 32); > >This really can't happen, and I had it as a WARN_ON with a printk for >debugging purpose in case I was wrong about how the code works. > >> > + crng_init = 3; >> > + process_random_ready_list(); >> > + wake_up_interruptible(&crng_init_wait); >> > + pr_notice("random: crng_init 3\n"); >> >> Would it make sense to be more descriptive here to allow readers of >dmesg to >> understand the output? > >Yes, what we're currently printing during the initialization: > >random: crng_init 1 >random: crng_init 2 >random: crng_init 3 > >was again mostly for debugging purposes. What I'm thinking about >doing is changing crng_init 2 and 3 messages to be: > >random: crng fast init done >random: crng conservative init done > >> > + } >> > + ret = 1; >> > +out: >> > + spin_unlock_irqrestore(&primary_crng.lock, flags); >> >> memzero_explicit of tmp? > >Good point, I've added a call to memzero_explicit(). > >> > +static void extract_crng(__u8 out[CHACHA20_BLOCK_SIZE]) >> > +{ >> > + unsigned long v, flags; >> > + struct crng_state *crng = &primary_crng; >> > + >> > + if (crng_init > 2 && >> > + time_after(jiffies, crng->init_time + CRNG_RESEED_INTERVAL)) >> > + crng_reseed(&input_pool); >> > + spin_lock_irqsave(&crng->lock, flags); >> > + if (arch_get_random_long(&v)) >> > + crng->state[14] ^= v; >> > + chacha20_block(&crng->state[0], out); >> > + if (crng->state[12] == 0) >> > + crng->state[13]++; > >> What is the purpose to only cover the 2nd 32 bit value of the nonce >with >> arch_get_random? >> >> state[12]++? Or why do you increment the nonce? > >In Chacha20, its output is a funcrtion of the state array, where >state[0..3] is a constant specified by the Chacha20 definition, >state[4..11] is the Key, and state[12..15] is the IV. The >chacha20_block() function increments state[12] each time it is called, >so state[12] is being used as the block counter. The increment of >state[13] is used to make state[13] to be the upper 32-bits of the >block counter. We *should* be reseeding more often than 2**32 calls >to chacha20_block
Re: General flags to turn things off (getrandom, pid lookup, etc)
On 07/25/2014 11:30 AM, Andy Lutomirski wrote: > - 32-bit GDT code segments [huge attack surface] > - 64-bit GDT code segments [probably pointless] I presume you mean s/GDT/LDT/. We already don't allow 64-bit LDT code segments. Also, it is unclear to me how 32-bit LDT segments have a huge attack surface, given that there will realistically always be a 32-bit *GDT* segment present. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH -v5] random: introduce getrandom(2) system call
On 07/24/2014 01:54 PM, Andy Lutomirski wrote: > > Or that someone writes userspace code that gets -EPERM/-EACCESS on > getrandom with GRND_RANDOM and falls back to something worse than > getrandom w/o GRND_RANDOM. > -ENXIO? -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] random: limit the contribution of the hw rng to at most half
On 07/17/2014 03:08 PM, Theodore Ts'o wrote: > > There was another complaint more recently, that wasn't related to > nordrand. And I was rather disturbed that in > add_interrupt_randomness() 97% of the entropy was coming from RDSEED. > I'm willing to have some of the entropy come from RDSEED by default, > but 97% seems to really way too much. > OK, the that formula was Linus' suggestion. > And the emergency refill code in random_read() was extremely > problematic. First of all, we're dropping 512 bytes on the stack, > which is kind of bad, but hopefully all of the code paths which call > random_read() won't have a terribly deep stack. 512 bytes is fine here (1K would possibly not have been.) > Secondly, even after the emergency refill, we block anyway. Not unless we don't get any data back. For a nonzero return we loop back to extract_entropy_user(). > Remember, regardless of whether or not you believe (as a former head > of NSA's technology directorate recently claimed) that the DUAL-EC p > and q were generated using NSA's standard high-quality HW RNG system > which they use to generate all of their crypto variables, or (as the > NIST advisory panel has recently concluded) that it was highly likely > that DUAL-EC was backdoored, it's the optics that matter, because > those of us without top-drawer SI security clearances can't prove > things one way or another. Dual-EC was also heavily criticized by the civilian cryptographic community since at least 2006. > And when 99.95% of the entropy when reading large quantities of keying > material from /dev/random is coming from RDSEED, that just has > terrible, terrible optics I just want to make sure we don't negatively impact the real security of users because of "optics". We already have a lot of problems with people extracting long-living keys from /dev/urandom because /dev/random is too slow. I have no objection to the khwrngd route -- in fact, as you know, I have been heavily encouraging the development of khwrngd with exactly this in mind -- but it is just an issue of timing. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] random: limit the contribution of the hw rng to at most half
On 07/17/2014 03:03 AM, Theodore Ts'o wrote: > For people who don't trust a hardware RNG which can not be audited, > the changes to add support for RDSEED can be troubling since 97% or > more of the entropy will be contributed from the in-CPU hardware RNG. > > We now have a in-kernel khwrngd, so for those people who do want to > implicitly trust the CPU-based system, we could create an arch-rng > hw_random driver, and allow khwrng refill the entropy pool. This > allows system administrator whether or not they trust the CPU (I > assume the NSA will trust RDRAND/RDSEED implicitly :-), and if so, > what level of entropy derating they want to use. > > The reason why this is a really good idea is that if different people > use different levels of entropy derating, it will make it much more > difficult to design a backdoor'ed hwrng that can be generally > exploited in terms of the output of /dev/random when different attack > targets are using differing levels of entropy derating. > > Signed-off-by: Theodore Ts'o I saw exactly one complaint to that nature, but that was from someone who really wanted the "nordrand" option (at which point I observed that it had inadvertently left RDSEED enabled which quickly got rectified.) The implication was that this was a request from a specific customer who presumably have their own "audited" hardware RNG. There may have been other complaints (justified or not) but if so I haven't seen them. I'm wondering if we are overgeneralizing here and if so if it wouldn't be better to defer this until the hwrng supplier for this is ready, which probably won't happen in time for 3.17 just given the current timeline. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: arch_random_refill
On 05/11/2014 08:36 PM, Stephan Mueller wrote: > > But in our current predicament, not everybody trusts a few potentially easily > manipulated gates that have no other purpose than produce white noise which > are developed by the biggest chip vendor in the US. Gates which have other > purposes may not be that easily manipulated. > Incidentally, I disagree with the "easily manipulated" bit. Yes, I have seen the paper which says that you can do it in such a way that it doesn't show up on *visual* examination. However, put an electrical probe on it and it shows up immediately. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: arch_random_refill
On 05/11/2014 08:36 PM, Stephan Mueller wrote: > > Ohh, ok, thanks for fixing that. :-) > > Though what makes me wonder is the following: why are some RNGs forced to use > the hw_random framework whereas some others are not? What is the driver for > that? > > The current state of random.c vs. drivers/char/hw_random and the strong in- > kernel separation between both makes me wonder. Isn't that all kind of > inconsistent? > The main differences are speed of access, trivial interface, and architectural guarantees. You also don't have to deal with enumeration, DMA engines, interrupts, indirect access, or bus drivers, which all are utterly unacceptable on a synchronous path. That being said, it is getting clear that we most likely would be better off with the kernel directly feeding from at least a subset of the hw_random drivers, rather than waiting for user space to come along and launch a daemon... after $DEITY knows how many other processes have already been launched. There are patches being worked on to make that happen, although there are a fair number of potential issues, including the fact that some of the hw_random drivers are believed to be dodgy -- for example, the TPM driver: some TPMs are believed to not contain any entropy element and simply rely on a factory-seeded nonvolatile counter (since the TPM has to have support for nonvolatile counters anyway, this hardware is already present.) -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: arch_random_refill
On 05/11/2014 04:01 PM, Stephan Mueller wrote: > Hi Peter, > > some time back when the RDRAND instruction was debated, a patch was offered > for driver/char/random.c that in essence turned /dev/random into a frontend > for RDRAND in case that instruction was available. The patch kind of > monopolized the noise sources such that if a user space "random number hog" > pulled data from /dev/random endlessly, the (almost) only noise source was > RDRAND. As that patch treated RDRAND to provide entropy, the blocking > behavior > went away for /dev/random. > This is false in a number of ways. First of all... we NEVER pulled either /dev/random or /dev/urandom directly from RDRAND. We used RDRAND directly for kernel internal randomness uses. Users did object to this. > That patch did not sit well with some developers and it got finally changed > such that the output of RDRAND is now just XORed with the output of the > "classic" /dev/random behavior -- /dev/random is still blocking. Mixing in RDRAND into /dev/random and /dev/urandom is actually > With the current development cycle for 3.15, the function arch_random_refill > is added as presented in [1]. It now uses RDSEED instead of RDRAND. Yet, the > way this function is called in random_read seems (as I have no system with an > RDSEED, I cannot test) to show the very same behavior as the aforementioned > RDRAND patch: the blocking behavior of /dev/random will be gone and RDSEED > will monopolize the noise sources in case of a user space hog. There is a huge difference between this and what people objected to earlier: we filter everything through the kernel random number pool system, which would require a herculean mathematical effort to reverse even if the output of RDSEED was 100% predictable. > Note, I do not see an issue with the patch that adds RDSEED as part of > add_interrupt_randomness outlined in [2]. The reason is that this patch does > not monopolizes the noise sources. > > I do not want to imply that Intel (or any other chip manufacturer that will > hook into arch_random_refill) intentionally provides bad entropy (and this > email shall not start a discussion about entropy again), but I would like to > be able to only use noise sources that I can fully audit. As it is with > hardware, I am not able to see what it is doing. I have to point out the irony in this given your previous proposals, however... > Thus, may I ask that arch_random_refill is revised such that it will not > monopolize the noise sources? If somebody wants that, he can easily use rngd. Feel free to build the kernel without CONFIG_ARCH_RANDOM, or use the "nordrand" option to the kernel. These options are there for a reason. Now when you mention it, though, the nordrand option should turn off RDSEED as well as RDRAND. It currently doesn't; that is a bug, plain and simple. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] crypto: sha{256,512}_ssse3 - remove asmlinkage from static functions
On 04/17/2014 09:58 PM, Herbert Xu wrote: >> >> It doesn't make sense, sorry. The right thing to drop here is not >> "asmlinkage", it is "static": this is an external declaration. > > It's a function pointer that's static, not the function that > it's pointing to. > {facepalm} Right, function *pointer*. Duh. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] crypto: sha{256,512}_ssse3 - remove asmlinkage from static functions
On 04/17/2014 08:28 AM, Marek Vasut wrote: > On Wednesday, April 16, 2014 at 06:19:50 PM, Jianyu Zhan wrote: >> Commit 128ea04a9885("lto: Make asmlinkage __visible") restricts >> asmlinkage to externally_visible, this causes compilation warnings: >> >> arch/x86/crypto/sha256_ssse3_glue.c:56:1: >> warning: ‘externally_visible’ attribute have effect only on public >> objects [-Wattributes] >> >> static asmlinkage void (*sha256_transform_asm)(const char *, u32 *, >> u64); ^ >> >> arch/x86/crypto/sha512_ssse3_glue.c:55:1: >> warning: ‘externally_visible’ attribute have effect only on public >> objects [-Wattributes] static asmlinkage void >> (*sha512_transform_asm)(const char *, u64 *, ^ >> >> Drop asmlinkage here to avoid such warnings. >> >> Also see Commit 8783dd3a37a5853689e1("irqchip: Remove asmlinkage from >> static functions") >> >> Signed-off-by: Jianyu Zhan > > Makes sense, please add my humble > > Reviewed-by: Marek Vasut > It doesn't make sense, sorry. The right thing to drop here is not "asmlinkage", it is "static": this is an external declaration. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] x86/crypto: ghash: use C implementation for setkey()
On 03/27/2014 04:46 AM, Ard Biesheuvel wrote: > On 27 March 2014 12:36, Herbert Xu wrote: >> On Thu, Mar 27, 2014 at 12:29:00PM +0100, Ard Biesheuvel wrote: >>> The GHASH setkey() function uses SSE registers but fails to call >>> kernel_fpu_begin()/kernel_fpu_end(). Instead of adding these calls, and >>> then having to deal with the restriction that they cannot be called from >>> interrupt context, move the setkey() implementation to the C domain. >> >> Note that setkey cannot be called from interrupt context since >> allocation/setkey is supposed to be slow-path material. >> >> But your approach is fine by me. > > I agree that it makes little sense to call this from atomic context, > but that still means (I think, but the x86 guys should confirm) that > you are supposed to call kernel_fpu_begin() and kernel_fpu_end(). > Yes. I'm suspecting calling kernel_fpu_begin() for a single GF operation is probably not worth it, so I'm fine with reimplementing it in integer logic. Acked-by: H. Peter Anvin -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
On 03/24/2014 09:10 AM, Mathias Krause wrote: > The recent addition of the AVX2 variant of the SHA1 hash function wrongly > disabled the AVX variant by introducing a flaw in the feature test. Fixed > in patch 1. > > The alignment calculations of the AVX2 assembler implementation are > questionable, too. Especially the page alignment of the stack pointer is > broken in multiple ways. Fixed in patch 2. In patch 3 another issue for > code alignment is fixed. > > Please apply! > > Mathias Krause (3): > crypto: x86/sha1 - re-enable the AVX variant > crypto: x86/sha1 - fix stack alignment of AVX2 variant > crypto: x86/sha1 - reduce size of the AVX2 asm implementation > > arch/x86/crypto/sha1_avx2_x86_64_asm.S |8 ++-- > arch/x86/crypto/sha1_ssse3_glue.c | 26 -- > 2 files changed, 18 insertions(+), 16 deletions(-) > For all these patches: Reviewed-by: H. Peter Anvin Thank you for doing these. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] crypto: x86/sha1 - regression and other fixes
On 03/24/2014 09:10 AM, Mathias Krause wrote: > The recent addition of the AVX2 variant of the SHA1 hash function wrongly > disabled the AVX variant by introducing a flaw in the feature test. Fixed > in patch 1. > > The alignment calculations of the AVX2 assembler implementation are > questionable, too. Especially the page alignment of the stack pointer is > broken in multiple ways. Fixed in patch 2. In patch 3 another issue for > code alignment is fixed. > > Please apply! > Yikes. Yes... the alignment calculation is confused in the extreme. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 1/1] crypto: SHA1 transform x86_64 AVX2
On 03/20/2014 05:09 PM, Herbert Xu wrote: > On Thu, Mar 20, 2014 at 03:23:27PM -0700, H. Peter Anvin wrote: >> Again... Herbert, David... yours or mine? > > Yes I will be taking this through cryptodev. > Thanks! Acked-by: H. Peter Anvin -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 1/1] crypto: SHA1 transform x86_64 AVX2
Again... Herbert, David... yours or mine? -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 1/1] crypto: SHA1 transform x86_64 AVX2
Herbert, do you want to take this or should I? In the former case, please feel free to add my: Acked-by: H. Peter Anvin -hpa On 03/19/2014 02:21 PM, chandramouli narayanan wrote: > From 8948c828aa929a3e2531ca88e3f05fbeb1ff53db Mon Sep 17 00:00:00 2001 > From: Chandramouli Narayanan > Date: Wed, 19 Mar 2014 09:30:12 -0700 > Subject: [PATCH v4 1/1] crypto: SHA1 transform x86_64 AVX2 > > This git patch adds x86_64 AVX2 optimization of SHA1 > transform to crypto support. The patch has been tested with 3.14.0-rc1 > kernel. > > On a Haswell desktop, with turbo disabled and all cpus running > at maximum frequency, tcrypt shows AVX2 performance improvement > from 3% for 256 bytes update to 16% for 1024 bytes update over > AVX implementation. > > This patch adds sha1_avx2_transform(), the glue, build and > configuration changes needed for AVX2 optimization of > SHA1 transform to crypto support. > > sha1-ssse3 is one module which adds the necessary optimization > support (SSSE3/AVX/AVX2) for the low-level SHA1 transform function. With > better > optimization support, transform function is overridden as the case may be. > In the case of AVX2, due to performance reasons across datablock sizes, > the AVX or AVX2 transform function is used at run-time as it suits best. > The Makefile change therefore appends the necessary objects to the linkage. > Due to this, the patch merely appends AVX2 transform to the existing build mix > and Kconfig support and leaves the configuration build support as is. > > Changes noted from the initial version of this patch are based on the > feedback from the community: > a) check for BMI2 in addition to AVX2 support since > __sha1_transform_avx2() uses rorx > b) Since the module build has dependency on 64bit, it is > redundant to check it in the code here. > c) coding style cleanup > d) simplification of the assembly code where macros are repetitively used. > > Signed-off-by: Chandramouli Narayanan > --- > arch/x86/crypto/Makefile | 3 + > arch/x86/crypto/sha1_avx2_x86_64_asm.S | 706 > + > arch/x86/crypto/sha1_ssse3_glue.c | 50 ++- > crypto/Kconfig | 4 +- > 4 files changed, 754 insertions(+), 9 deletions(-) > create mode 100644 arch/x86/crypto/sha1_avx2_x86_64_asm.S > > diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile > index 6ba54d6..61d6e28 100644 > --- a/arch/x86/crypto/Makefile > +++ b/arch/x86/crypto/Makefile > @@ -79,6 +79,9 @@ aesni-intel-y := aesni-intel_asm.o aesni-intel_glue.o fpu.o > aesni-intel-$(CONFIG_64BIT) += aesni-intel_avx-x86_64.o > ghash-clmulni-intel-y := ghash-clmulni-intel_asm.o ghash-clmulni-intel_glue.o > sha1-ssse3-y := sha1_ssse3_asm.o sha1_ssse3_glue.o > +ifeq ($(avx2_supported),yes) > +sha1-ssse3-y += sha1_avx2_x86_64_asm.o > +endif > crc32c-intel-y := crc32c-intel_glue.o > crc32c-intel-$(CONFIG_64BIT) += crc32c-pcl-intel-asm_64.o > crc32-pclmul-y := crc32-pclmul_asm.o crc32-pclmul_glue.o > diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S > b/arch/x86/crypto/sha1_avx2_x86_64_asm.S > new file mode 100644 > index 000..559eb6c > --- /dev/null > +++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S > @@ -0,0 +1,706 @@ > +/* > + * Implement fast SHA-1 with AVX2 instructions. (x86_64) > + * > + * This file is provided under a dual BSD/GPLv2 license. When using or > + * redistributing this file, you may do so under either license. > + * > + * GPL LICENSE SUMMARY > + * > + * Copyright(c) 2014 Intel Corporation. > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of version 2 of the GNU General Public License as > + * published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + * General Public License for more details. > + * > + * Contact Information: > + * Ilya Albrekht > + * Maxim Locktyukhin > + * Ronen Zohar > + * Chandramouli Narayanan > + * > + * BSD LICENSE > + * > + * Copyright(c) 2014 Intel Corporation. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * > + * Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in > + * the docu
Re: [PATCH 2/2] SHA1 transform: x86_64 AVX2 optimization - glue & build-v2
On 03/17/2014 09:53 AM, chandramouli narayanan wrote: > On second thoughts, with sha1-sse3-(CONFIG_AS_AVX2) += > sha1_avx2_x86_64_asm.o, I have build issues and sha1_transform_avx2 > undefined in sha1-sss3.ko. > > I can rid #ifdef CONFIG_AS_AVX2 in patch1. The following works though: > ifeq ($(avx2_supported),yes) > sha1-ssse3-y += sha1_avx2_x86_64_asm.o > endif Yes, the sad thing is that the CONFIG_AS_* things aren't real config symbols, despite the name. They might be in the future when Kconfig can run test probes (something we have needed for a very long time.) The "yes" versus "y", though, is a total faceplant. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][RESEND 3] hwrng: add randomness to system from rng sources
On 03/04/2014 02:39 PM, Matt Mackall wrote: > > [temporarily coming out of retirement to provide a clue] > > The pool mixing function is intentionally _reversible_. This is a > crucial security property. > > That means, if I have an initial secret pool state X, and hostile > attacker controlled data Y, then we can do: > > X' = mix(X, Y) > > and > > X = unmix(X', Y) > > We can see from this that the combination of (X' and Y) still contain > the information that was originally in X. Since it's clearly not in Y.. > it must all remain in X'. > This of course assumes that the attacker doesn't know the state of the pool X. The other thing to note is that reversible doesn't necessarily mean linear (the current mixing function is linear.) AES, for example, is reversible (if and only if you possess the key) but is highly nonlinear. I'm not saying we should use AES to mix the pool -- it is almost guaranteed to be too expensive. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][RESEND 3] hwrng: add randomness to system from rng sources
On 03/03/2014 03:51 PM, Kees Cook wrote: > When bringing a new RNG source online, it seems like it would make sense > to use some of its bytes to make the system entropy pool more random, > as done with all sorts of other devices that contain per-device or > per-boot differences. > > Signed-off-by: Kees Cook I would like to raise again the concept of at least optionally using a kernel thread, rather than a user-space daemon, to feed hwrng output to the kernel pools. The main service rngd provides is FIPS tests, but those FIPS tests were withdrawn as a standard over 10 years ago and are known to be extremely weak, at the very best a minimal sanity check. Furthermore, they are completely useless against the output of any RNG which contains a cryptographic whitener, which is the vast majority of commercial sources -- this is especially so since rngd doesn't expect to have to do any data reduction for output coming from hwrng. Furthermore, if more than one hwrng device is available, rngd will only be able to read one of them because of the way /dev/hwrng is implemented in the kernel. For contrast, the kernel could do data reduction just fine by only crediting the entropy coming out of each hwrng driver with a fractional amount. That does *not* mean that there aren't random number generators which require significant computation better done in user space. For example, an audio noise daemon or a lava lamp camera which requires video processing. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1] crypto: aesni - fix build on x86 (32bit)
On 01/06/2014 03:39 PM, Tim Chen wrote: > > Will renaming the file to aesni_intel_avx-x86_64.S make things clearer > now? > > Tim Yes. Acked-by: H. Peter Anvin Herbert, can you pick it up? -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1] crypto: aesni - fix build on x86 (32bit)
On 01/06/2014 12:26 PM, Borislav Petkov wrote: > On Mon, Jan 06, 2014 at 10:10:55AM -0800, Tim Chen wrote: >> Yes, the code is in the file named aesni_intel_avx.S. So it should be >> clear that the code is meant for x86_64. > > How do you deduce aesni_intel_avx.S is meant for x86_64 only from the > name? > > Shouldn't it be called aesni_intel_avx-x86_64.S, as is the naming > convention in arch/x86/crypto/ > Quite. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1] crypto: aesni - fix build on x86 (32bit)
On 01/06/2014 09:57 AM, Tim Chen wrote: > On Mon, 2014-01-06 at 09:45 -0800, H. Peter Anvin wrote: >> Can the code be adjusted to compile for 32 bit x86 or is that pointless? >> > > Code was optimized for wide registers. So it is only meant for x86_64. > Aren't the "wide registers" the vector registers? Or are you also relying on 64-bit integer registers (in which case we should just rename the file to make it clear it is x86-64 only.) -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1] crypto: aesni - fix build on x86 (32bit)
Can the code be adjusted to compile for 32 bit x86 or is that pointless? Tim Chen wrote: > > >On Mon, 2013-12-30 at 15:52 +0200, Andy Shevchenko wrote: >> It seems commit d764593a "crypto: aesni - AVX and AVX2 version of >AESNI-GCM >> encode and decode" breaks a build on x86_32 since it's designed only >for >> x86_64. This patch makes a compilation unit conditional to >CONFIG_64BIT and >> functions usage to CONFIG_X86_64. > >Thanks for catching and fixing it. > >Tim >> >> Signed-off-by: Andy Shevchenko -- Sent from my mobile phone. Please pardon brevity and lack of formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random
On 11/10/2013 09:21 AM, Stephan Mueller wrote: > > Here you say it: we *assume*. And please bear in mind that we all know for a > fact that the theory behind quantum physics is not complete, because it does > not work together with the theory of relativity. That again is a hint that > there is NO proof that decay is really unpredictable. > Honestly, if you want to be taken seriously, this is not the kind of thing to put in your discussion. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.
On 07/19/2013 04:26 PM, Greg Kroah-Hartman wrote: >> >> RAID has effectively the same issue, and we just "solved" it by >> compiling in all the accelerators into the top-level module. > > Then there's nothing to be done in udev or kmod, right? > I don't know. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.
On 07/19/2013 04:16 PM, Greg Kroah-Hartman wrote: > > udev isn't doing any module loading, 'modprobe' is just being called for > any new module alias that shows up in the system, and all of the drivers > that match it then get loaded. > > How is it a problem if a module is attempted to be loaded that is > already loaded? How is it a problem if a different module is loaded for > a device already bound to a driver? Both of those should be total > "no-ops" for the kernel. > > But, I don't know anything about the cpu code, how is loading a module > causing problems? That sounds like it needs to be fixes, as any root > user can load modules whenever they want, you can't protect the kernel > from doing that. > The issue here seems to be the dynamic binding nature of the crypto subsystem. When something needs crypto, it will request the appropriate crypto module (e.g. crct10dif), which may race with detecting a specific hardware accelerator based on CPUID or device information (e.g. crct10dif_pclmul). RAID has effectively the same issue, and we just "solved" it by compiling in all the accelerators into the top-level module. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.
On 07/18/2013 03:17 PM, Rafael J. Wysocki wrote: >> >> alias x86cpu:vendor:*:family:*:model:*:feature:*0081* crct10dif_pclmul >> >> This should cause udev to load the crct10dif_pclml module when cpu >> support the PCLMULQDQ (feature code 0081). I did my testing during >> development on 3.10 and the module was indeed loaded. >> >> However, I found that the following commit under 3.11-rc1 broke >> the mechanism after some bisection. >> >> commit ac212b6980d8d5eda705864fc5a8ecddc6d6eacc >> Author: Rafael J. Wysocki >> Date: Fri May 3 00:26:22 2013 +0200 >> >> ACPI / processor: Use common hotplug infrastructure >> Split the ACPI processor driver into two parts, one that is >> non-modular, resides in the ACPI core and handles the enumeration >> and hotplug of processors and one that implements the rest of the >> existing processor driver functionality. >> Rafael, can you check and see if this can be fixed so those >> optimized >> crypto modules for Intel cpu that support them can be loaded? > > I think this is an ordering issue between udev startup and the time when > devices are registered. > > I wonder what happens if you put those modules into the initramfs image? > OK, this bothers me on some pretty deep level... a set of changes exclusively in drivers/acpi breaking functionality which had nothing to do with ACPI, specifically CPU-feature-based module loading. Please let me know what the investigation comes up with, or if I need to get more directly involved. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/11] Added macro to check for AVX2 feature.
Just syntactic overhead. We should probably discuss it among ourselves first though. Tim Chen wrote: >On Fri, 2013-03-22 at 17:21 -0700, H. Peter Anvin wrote: >> I really, really hate these macros... Not sure they are worth the >extra noise. >> > >I can do without the macro and I'll remove it. > >Wonder the reason that such macro should be avoided? > >Tim -- Sent from my mobile phone. Please excuse brevity and lack of formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/11] Added macro to check for AVX2 feature.
I really, really hate these macros... Not sure they are worth the extra noise. Tim Chen wrote: >Macro to facilitate checking the availability of the AVX2 feature. > >Signed-off-by: Tim Chen >--- > arch/x86/include/asm/cpufeature.h | 1 + > 1 file changed, 1 insertion(+) > >diff --git a/arch/x86/include/asm/cpufeature.h >b/arch/x86/include/asm/cpufeature.h >index 2d9075e..db98ec7 100644 >--- a/arch/x86/include/asm/cpufeature.h >+++ b/arch/x86/include/asm/cpufeature.h >@@ -313,6 +313,7 @@ extern const char * const x86_power_flags[32]; > #define cpu_has_cx16 boot_cpu_has(X86_FEATURE_CX16) > #define cpu_has_eager_fpu boot_cpu_has(X86_FEATURE_EAGER_FPU) > #define cpu_has_topoext boot_cpu_has(X86_FEATURE_TOPOEXT) >+#define cpu_has_avx2 boot_cpu_has(X86_FEATURE_AVX2) > > #ifdef CONFIG_X86_64 > -- Sent from my mobile phone. Please excuse brevity and lack of formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3 v2] Optimize CRC32C calculation using PCLMULQDQ in crc32c-intel module
On 09/27/2012 03:44 PM, Tim Chen wrote: Version 2 This version of the patch series fixes compilation errors for 32 bit x86 targets. Version 1 This patch series optimized CRC32C calculations with PCLMULQDQ instruction for crc32c-intel module. It speeds up the original implementation by 1.6x for 1K buffer and by 3x for buffer 4k or more. The tcrypt module was enhanced for doing speed test on crc32c calculations. Herbert - you are handling this one, right? -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] crypto: camellia-x86_64 - module init/exit functions should be static
On 03/21/2012 05:46 PM, Herbert Xu wrote: > On Wed, Mar 21, 2012 at 05:40:03PM -0700, Randy Dunlap wrote: >> On 03/15/2012 01:11 PM, Jussi Kivilinna wrote: >> >>> This caused conflict with twofish-x86_64-3way when compiled into kernel, >>> same function names and not static. >> >> Have these patches been merged anywhere? >> I'm still seeing build problems in linux-next 20120321. > > Thanks for the reminder, I'll push these through today. Please push these to Linus ASAP, it is breaking the x86-64 allyesconfig build upstream right now. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC][PATCH 00/16] Crypto keys and module signing [ver #2]
On 11/29/2011 03:42 PM, David Howells wrote: > > I have provided a couple of subtypes: DSA and RSA. Both types have signature > verification facilities available within the kernel, and both can be used for > module signature verification with any encryption algorithm known by the PGP > parser, provided the appropriate algorithm is compiled directly into the > kernel. > Do we really need the complexity of a full OpenPGP parser? Parsers are notorious security problems. Furthermore, using DSA in anything but a hard legacy application is not something you want to encourage, so why support DSA? -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources
On 06/19/2011 08:07 AM, Herbert Xu wrote: > On Sun, Jun 19, 2011 at 09:38:43AM -0400, Neil Horman wrote: >> >> It sounds to me like, if its desireous to bypass the entropy pool, then we >> should bypass the /dev/random path altogether. Why not write a hwrng driver >> that can export access to the rdrand instruction via a misc device. > > I presume the rdrand instruction can be used from user-space > directly. > Yes, it can. Again, RDRAND is not suitable for /dev/random (as opposed to /dev/urandom users.) /dev/urandom is used both by user space (and here the only reason to hook it up to /dev/urandom is compatibility with existing userspace; we are working separately to enabling user space users like OpenSSL to use RDRAND directly) and by kernel users via the internal APIs. /dev/random as far as I can tell is only ever fed to userspace, however, the guarantees that it is at least supposed to give are very, very strict. RDRAND do not fulfill those criteria, but we should be able to use it as part of its implementation. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources
On 06/17/2011 01:28 PM, Matt Mackall wrote: >> >> The one use case that it is cryptographically insufficient for is to >> seed a new PRNG, which probably means it is unsuitable for being fed >> as-is into /dev/random. > > The thing to understand about the input side of /dev/random is that it's > COMPLETELY immune to untrusted data. So there's absolutely no harm in > sending it data of questionable entropy so long as you don't tell it to > account it. And, of course, if it DOES contain entropy, it makes things > better. > > Think of it this way: I have a coin in my pocket. You, the attacker, > tell me to flip it. You can do that any number of times and not improve > your guess about the coin's state over your initial guess. This is what > it means to have a reversible mixing function: no number of iterations > reduces the degrees of freedom of the pool. > What I meant is that it is unsuitable to *bypass the pool* for /dev/random. I think we can -- and almost certainly should -- use RDRAND on the input side; we just have to figure out the accounting. However, RDRAND is high enough quality (and high enough bandwidth) that it should be not just possible but desirable to completely bypass the pool system for /dev/urandom users and pull straight from the RDRAND instruction. I don't actually know what the exact numbers look like, but the stall conditions being looked at are of the order of "every core in the socket trying to execute RDRAND at the same time." -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/5] tsc: wire up entropy generation function
On 06/13/2011 04:10 PM, Matt Mackall wrote: > > Well the issue is that samples are going to be roughly aligned to some > multiple of the bus frequency. If an interrupt occurs on bus cycle X, > this code will be hit at roughly TSC cycle X*M+d. > >> This is correct; at the very least I would multiply the low 32 bits of >> the TSC with a 32-bit prime number before mixing. > > This multiply doesn't actually accomplish anything. Mixing known values > into the pool is harmless because the mixing function is fully > reversable. ie: > > unmix(sample, mix(sample, pool)) = pool > > If you didn't know the contents of pool before, you can't guess it > after. > > The danger comes if you (a) already know pool and (b) can narrow sample > to a tractable set of values. Then you can calculate a set of possible > pool' values and compare the resulting output to confirm the actual > state of pool' (aka state extension attack). Sticking a constant > multiplier on sample doesn't affect this attack at all. > It only matters if you actually truncate it to LSBs. >> However, the big issue with this is that it's recursive... what causes >> this to be invoked... probably an interrupt, which is going to have been >> invoked by a timer, quite possible the TSC deadline timer. Oops. > > ..and the sampling function already mixes in a TSC timestamp with the > provided timestamp. Right. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/5] tsc: wire up entropy generation function
On 06/13/2011 05:39 PM, Kent Borg wrote: > I was assuming that drivers, responding to an interrupt from some > external event, would want to make this call. > Such as a network driver. Those already are doing this. > Two points: > > 1. Why look at the high-order bits? How are they going to have values > that cannot be inferred? If you are looking for entropy, the low-order > bits is where they're going keep it. (Think Willie Sutton.) If looking > at the low-byte is cheaper, then let's be cheap. If, however, if > looking at more bytes *is* as cheap, then there is no harm. But in any > case let's keep the code lean enough that it can be called from an > interrupt service routine. Consider the case where the TSC is running in steps of 64 and you're using the low 6 bits. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/5] tsc: wire up entropy generation function
On 06/13/2011 03:27 PM, Venkatesh Pallipadi wrote: > On Mon, Jun 13, 2011 at 3:06 PM, Jarod Wilson wrote: >> TSC is high enough resolution that we can use its low-order byte to >> stir new data into the random number generator entropy pool. > > From what I vaguely remember from years past, rdtsc, especially last > few bits of it are not very good as random number source. As they are > based on lower bus frequency and a multiplier. May be things have > changed these days. Adding Peter and Suresh for comments. This is correct; at the very least I would multiply the low 32 bits of the TSC with a 32-bit prime number before mixing. However, the big issue with this is that it's recursive... what causes this to be invoked... probably an interrupt, which is going to have been invoked by a timer, quite possible the TSC deadline timer. Oops. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation
On 11/03/2009 01:03 AM, Ingo Molnar wrote: >> >> .macro xmm_num opd xmm >> .ifc \xmm,%xmm0 >> \opd = 0 >> .endif >> .ifc \xmm,%xmm1 >> \opd = 1 >> .endif >> .ifc \xmm,%xmm2 >> \opd = 2 >> .endif >> .ifc \xmm,%xmm3 >> \opd = 3 >> .endif >> .ifc \xmm,%xmm4 >> \opd = 4 >> .endif >> .ifc \xmm,%xmm5 >> \opd = 5 >> .endif >> .ifc \xmm,%xmm6 >> \opd = 6 >> .endif >> .ifc \xmm,%xmm7 >> \opd = 7 >> .endif >> .ifc \xmm,%xmm8 >> \opd = 8 >> .endif >> .ifc \xmm,%xmm9 >> \opd = 9 >> .endif >> .ifc \xmm,%xmm10 >> \opd = 10 >> .endif >> .ifc \xmm,%xmm11 >> \opd = 11 >> .endif >> .ifc \xmm,%xmm12 >> \opd = 12 >> .endif >> .ifc \xmm,%xmm13 >> \opd = 13 >> .endif >> .ifc \xmm,%xmm14 >> \opd = 14 >> .endif >> .ifc \xmm,%xmm15 >> \opd = 15 >> .endif >> .endm >> >> .macro PSHUFB_XMM xmm1 xmm2 >> xmm_num pshufb_opd1 \xmm1 >> xmm_num pshufb_opd2 \xmm2 >> .if (pshufb_opd1 < 8) && (pshufb_opd2 < 8) >> .byte 0x66, 0x0f, 0x38, 0x00, 0xc0 | pshufb_opd1 | (pshufb_opd2<<3) >> .elseif (pshufb_opd1 >= 8) && (pshufb_opd2 < 8) >> .byte 0x66, 0x41, 0x0f, 0x38, 0x00, 0xc0 | (pshufb_opd1-8) | (pshufb_opd2<<3) >> .elseif (pshufb_opd1 < 8) && (pshufb_opd2 >= 8) >> .byte 0x66, 0x44, 0x0f, 0x38, 0x00, 0xc0 | pshufb_opd1 | ((pshufb_opd2-8)<<3) >> .else >> .byte 0x66, 0x45, 0x0f, 0x38, 0x00, 0xc0 | (pshufb_opd1-8) | >> ((pshufb_opd2-8)<<3) >> .endif >> .endm > > Looks far too clever, i like it :-) We have quite a few assembly macros > in arch/x86/include/asm/. The above one could be put into calling.h for > example. > I would really like to see something like that, with only one minor tweak: please use submacros to generate the REX and MODRM bytes, since we are *guaranteed* to want to do the same thing again. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH -v2 4/5] x86: Move kernel_fpu_using to irq_is_fpu_using in asm/i387.h
Herbert Xu wrote: Peter, do you want to apply this patch in your tree or would you prefer for it to go through my tree along with the rest of the series? I'll take it tomorrow... want to double-check that we don't have any real issues w.r.t. corruption there. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 6/7] x86: Move kernel_fpu_using to asm/i387.h
Huang Ying wrote: > > After some thinking, I think something as follow may be more > appropriate: > > /* This may be useful for someone else */ > static inline bool fpu_using(void) > { > return !(read_cr0() & X86_CR0_TS); > } > > static inline bool irq_fpu_using(void) > { > return in_interrupt() && fpu_using(); > } > Yes, looks good. I'll pull in the patch as soon as I get it. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 6/7] x86: Move kernel_fpu_using to asm/i387.h
Herbert Xu wrote: > On Wed, Jun 17, 2009 at 10:06:44AM -0700, H. Peter Anvin wrote: > >> Huang: if I recall correctly, these functions were originally designed >> to deal with the fact that VIA processors generate spurious #TS faults >> due to broken design of the Padlock instructions. The AES and PCLMUL >> instructions actually use SSE registers and so will require different >> structure. > > No irq_ts_save was the one designed for the VIA, the Intel stuff > does save the FPU state. > Ah, sorry, misremembered. Great! Will apply. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 6/7] x86: Move kernel_fpu_using to asm/i387.h
Ingo Molnar wrote: >> >> +static inline int kernel_fpu_using(void) >> +{ >> +if (in_interrupt() && !(read_cr0() & X86_CR0_TS)) >> +return 1; >> +return 0; >> +} >> + > > Looks sane to me. Herbert, do you ack it? > Although I have to say, the structure of: if (boolean test) return 1; return 0; ... truly was hit with the ugly stick. It really should be: static inline bool kernel_fpu_using(void) { return in_interrupt() && !(read_cr0() && C86_CR0_TS); } Huang: if I recall correctly, these functions were originally designed to deal with the fact that VIA processors generate spurious #TS faults due to broken design of the Padlock instructions. The AES and PCLMUL instructions actually use SSE registers and so will require different structure. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html