Re: [Qemu-devel] [PATCH for-2.9 v2] virtio-crypto: zeroize the key material before free

Gonglei (Arei) Fri, 09 Dec 2016 19:00:44 -0800

>
> 
> On 09.12.2016 02:42, Gonglei (Arei) wrote:
> > Hi,
> >
> >>
> >> Subject: Re: [Qemu-devel] [PATCH for-2.9 v2] virtio-crypto: zeroize the key
> >> material before free
> >>
> >> On 08.12.2016 16:23, Eric Blake wrote:
> >>> On 12/07/2016 08:28 PM, Gonglei (Arei) wrote:
> >>>
> >>>>> As far as I'm aware, other projects usually have a special memset
> >>>>> variation for doing this. That is because compilers may choose to
> >>>>> "optimize" memset(p, ...) + free(p) to just the free(p). Having a
> >>>>
> >>>> Actually, I googled this, but I didn't find a definite answer. And
> >>>>
> >>>> The Linux kernel uses kzfree instead of memset + kfree
> >> (mm/slab_common.c).
> >>
> >> Well, I personally don't mind whether we use a custom zfree() or a
> >> custom memset_s() + free(). I only mind that this patch actually always
> >> does what it is intended to do.
> >>
> > Yes, but why linux kernel to think about the compiler optimization for
> > memset for sensitive data?
> 
> I'm afraid I don't quite (syntactically) understand this question. Do
> you mean to ask why the Linux kernel would have to think about this
> optimization? My answer to that would be because the optimization of
> memset() + free() is known, and they probably want to protect against
> the compiler optimizing it even with -ffreestanding and differently
> called functions -- you never know.
> 
Sorry, that's my typo. ;)


My question is why not linux kernel to think about the compiler optimization
on the code layer. Because the realization of kzfree is just memset + kfree, 
it didn't add any memory barrier to prevent memset is optimized, or use
secure_memset, or whatever.

So you answer means they maybe use some compiler options to avoid
the compiler optimizing?

> Related example: gcc detects code that basically does a memset() and
> replaces it with a call to memset(). There were a couple of versions
> where it did that even with -ffreestanding or -fno-builtin, which is bad
> if you want to actually write a memset(). They fixed it by now (so that
> -ffreestanding and -fno-builtin will imply
> -fno-tree-loop-distribute-patterns, which will disable that optimization.
> 
> Now imagine the same case here: Maybe at some point gcc is able to
> detect that kfree() is basically free() and will then optimize the
> memset() before kfree() away. A couple of versions later, somebody will
> notice that -- but at that point, the damage has been done already and
> there are compiled versions out which leak sensitive data somewhere.
> 
> I think this is why the Linux kernel decides to be proactive and use
> kzfree() from the start, and this is also why I'm proposing to not delay
> this issue in qemu until some compiler actually makes it a real issue.
> 
> >>> If we're worried about cleaning things without allowing the compiler a
> >>> chance to optimize, then writing our own qemu_zfree() wrapper may
> indeed
> >>> make sense.  But that won't cover the case in Daniel's earlier patch
> >>> (referenced elsewhere in this thread), as that was zeroizing stack
> >>> memory (before it went out of scope) rather than heap memory (before
> >>> free).  So you'd still need some sort of 'write this memory no matter
> >>> what' primitive that would be directly usable on stack memory and
> >>> indirectly used as part of the qemu_zfree() wrapper.
> >>>
> > If we wrap a secure_memset(), then both stack memory and heap
> > memory can use it.
> >
> > Pls see:
> >
> > http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1381.pdf
> >
> > But the secure_memset() is not as efficient as possible due to the nature of
> the volatile
> > type qualifier preventing the compiler from optimizing the code at all.
> >
> > It will cause huge performance regression at hot path of data plane...
> 
> I would suggest we implement an own secure_memset() or secure_bzero()
> which falls back on what we have:
> - memset_s(), if that is available;
> - explizit_bzero() if we have libbsd;
> - SecureZeroMemory() on Windows
> 
> Or we have to write it ourselves:
> https://sourceware.org/ml/libc-alpha/2014-12/msg00506.html suggests that
> putting a full memory barrier after the memset() should be enough.
> 
I like this approach.

> >>> But I wouldn't worry about it for now, unless someone proves we actually
> >>> have a compiler optimizing away the cleanups.
> >>
> >> It's true that we don't have to worry about it *now*, but the approach
> >> of waiting until the compiler breaks it does not seem right to me.
> >>
> >> If at some point some compiler recognizes g_free() as being the same
> >> function as free() and thus optimizes memset() + g_free(), we simply
> >> won't notice because nobody will keep track of the generated assembly
> >> output all the time.
> >>
> >> There is a reason C11 introduced memset_s(), and it is this.
> >>
> > ...does it make sense if we introduce secure_memset() now ?
> >
> > Waiting for your feedback. Thanks!
> 
> I would propose so; or better something like secure_bzero() or
> secure_memzero(), because we can then fall back on existing
> implementations like explizit_bzero() or SecureZeroMemory().
> 
Both OK.

> We don't have to implement this immediately, but we should do it before
> the 2.9 release.
> 
Agree.

Regards,
-Gonglei

Re: [Qemu-devel] [PATCH for-2.9 v2] virtio-crypto: zeroize the key material before free

Reply via email to