> I believe the theory is that the compiler (remember, this is 
> __builtin_memset) can optimize away portions of the zeroing, or can optimize 
> zeroing for small sizes.

Ahhh! I didn't consider that the compiler would be doing analysis of the larger 
context, and potentially skipping zeroing parts that are set immediately after 
the call.

Thanks!

-Ravi (rpokala@)

-----Original Message-----
From: "Jonathan T. Looney" <j...@freebsd.org>
Date: 2018-06-06, Wednesday at 22:58
To: Ravi Pokala <rpok...@freebsd.org>
Cc: Mateusz Guzik <mjgu...@gmail.com>, Mateusz Guzik <m...@freebsd.org>, 
src-committers <src-committ...@freebsd.org>, <svn-src-all@freebsd.org>, 
<svn-src-h...@freebsd.org>
Subject: Re: svn commit: r334702 - head/sys/sys

> On Wed, Jun 6, 2018 at 10:14 PM, Ravi Pokala <rpok...@freebsd.org> wrote:
>>
>> -----Original Message-----
>> From: <owner-src-committ...@freebsd.org> on behalf of Mateusz Guzik 
>> <mjgu...@gmail.com>
>> Date: 2018-06-06, Wednesday at 09:01
>> To: Ravi Pokala <rpok...@freebsd.org>
>> Cc: Mateusz Guzik <m...@freebsd.org>, src-committers 
>> <src-committ...@freebsd.org>, <svn-src-all@freebsd.org>, 
>> <svn-src-h...@freebsd.org>
>> Subject: Re: svn commit: r334702 - head/sys/sys
>>
>>> On Wed, Jun 6, 2018 at 1:35 PM, Ravi Pokala <rpok...@freebsd.org> wrote:
>>>
>>>>> + * Passing the flag down requires malloc to blindly zero the entire object.
>>>>> + * In practice a lot of the zeroing can be avoided if most of the object
>>>>> + * gets explicitly initialized after the allocation. Letting the compiler
>>>>> + * zero in place gives it the opportunity to take advantage of this state.
>>>>
>>>> This part, I still don't understand. :-(
>>>>
>>>> The call to bzero() is still for the full length passed in, so how does 
>>>> this help?
>>>
>>> bzero is:
>>> #define bzero(buf, len) __builtin_memset((buf), 0, (len))
>> 
>> I'm afraid that doesn't answer my question; you're passing the full length 
>> to __builtin_memset() too.
> 
> I believe the theory is that the compiler (remember, this is 
> __builtin_memset) can optimize away portions of the zeroing, or can optimize 
> zeroing for small sizes.
> 
> For example, imagine you do this:
> 
>     struct foo {
>         uint32_t a;
>         uint32_t b;
>     };
> 
>     struct foo *
>     alloc_foo(void)
>     {
>         struct foo *rv;
> 
>         rv = malloc(sizeof(*rv), M_TMP, M_WAITOK|M_ZERO);
>         rv->a = 1;
>         rv->b = 2;
>         return (rv);
>     }
> 
> In theory, the compiler can be smart enough to know that the entire structure 
> is initialized, so it is not necessary to zero it.
> 
> (I personally have not tested how well this works in practice. However, this 
> change theoretically lets the compiler be smarter and optimize away unneeded 
> work.)
> 
> At minimum, it should let the compiler replace calls to memset() (and the 
> loops there) with optimal instructions to zero the exact amount of memory 
> that needs to be initialized. (Again, I haven't personally tested how smart 
> the compilers we use are about producing optimal code in this situation.)
> 
> Jonathan

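To make the idea in the quoted discussion concrete, here is a minimal user-space sketch. It is not the code from r334702. The names sk_alloc_raw(), sk_alloc(), SK_ZERO and struct bar are hypothetical stand-ins (for the out-of-line allocator, the inlined wrapper, M_ZERO, and a partially initialized object), and the 0xA5 fill exists only so that a missing zeroing step would be visible. The point is only that the memset() moves into the caller, where the compiler can see the stores that follow it. Whether a given compiler actually drops or shrinks the memset() is not something this sketch demonstrates.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's M_ZERO flag. */
#define SK_ZERO 0x0100

/*
 * Hypothetical out-of-line allocator. When asked to zero, it must
 * blindly zero the whole object, because it knows nothing about what
 * the caller will store afterwards. Without SK_ZERO the buffer is
 * filled with 0xA5 to imitate uninitialized memory.
 */
void *
sk_alloc_raw(size_t size, int flags)
{
    void *p = malloc(size);

    if (p != NULL)
        memset(p, (flags & SK_ZERO) != 0 ? 0 : 0xA5, size);
    return (p);
}

/*
 * Inlined wrapper: strip the zeroing flag and zero in the caller
 * instead. After inlining, the compiler sees a memset() with a known
 * size followed by the caller's own stores, so it is free to shrink
 * or remove the zeroing if it can prove the bytes are overwritten.
 */
static inline void *
sk_alloc(size_t size, int flags)
{
    void *p;

    if ((flags & SK_ZERO) == 0)
        return (sk_alloc_raw(size, flags));
    p = sk_alloc_raw(size, flags & ~SK_ZERO);
    if (p != NULL)
        memset(p, 0, size);
    return (p);
}

struct foo {
    uint32_t a;
    uint32_t b;
};

/* Every field is overwritten, so the zeroing is redundant here. */
struct foo *
alloc_foo(void)
{
    struct foo *rv;

    rv = sk_alloc(sizeof(*rv), SK_ZERO);
    if (rv == NULL)
        return (NULL);
    rv->a = 1;
    rv->b = 2;
    return (rv);
}

struct bar {
    uint32_t a;
    uint32_t rest[3];
};

/* Only 'a' is overwritten, so 'rest' still has to be zeroed. */
struct bar *
alloc_bar(void)
{
    struct bar *rv;

    rv = sk_alloc(sizeof(*rv), SK_ZERO);
    if (rv == NULL)
        return (NULL);
    rv->a = 7;
    return (rv);
}
```

One way to check what a particular compiler does with this is to build it with optimization enabled and compare the generated assembly for alloc_foo() and alloc_bar().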

