HI, Richard,

> On Jun 21, 2021, at 2:53 AM, Richard Biener <rguent...@suse.de> wrote:
> 
>> 
>> 
>> This is for the compatibility with CLANG. -:). 
>> (https://reviews.llvm.org/D54604)
> 
> I don't care about functional 1:1 "compatibility" with CLANG.

Okay.  -:)

> 
>> 1. Pattern initialization
>> 
>>  This is the recommended initialization approach. Pattern initialization's
> 
> But elsewhere you said pattern initialization is only for debugging,
> not production …

Yes. Pattern initialization is only for debugging purpose during development 
phase.

> 
>> Use a pattern that fits them all.  I mean memory allocation hardening
>> fills allocated storage with a repeated (byte) pattern and people are
>> happy with that.  It also makes it easy to spot uninitialized storage
>> from a debugger.  So please, do not over-design this, it really doesn't
>> make any sense and the common case you are inevitably chasing here
>> would already be fine with a random repeated pattern.
>> 
>> So, My question is:
>> 
>> If we want to pattern initialize with the single repeated pattern for all 
>> types, with one is better to use:  “0xAAAAAAAA”
>> or “0xFFFFFFFF” , or other pattern that our current glibc used? What’s that 
>> pattern?
> 
> It's set by the user.

Yes, looks like that glibc uses a byte-repeated pattern that is set by the user 
through environment variable.

> 
>> Will  “0xAAAAAAAA” in a floating type auto variable crash the program?
>> Will “0xFFFFFFFF” in a pointer type auto variable crash the program? (Might 
>> crash?)
>> 
>> 
>> (thus also my suggestion to split out
>> padding handling - now we can also split out pattern init handling,
>> maybe somebody else feels like reviewing and approving this, who knows).
>> 
>> I am okay with further splitting pattern initialization part to a separate 
>> patch. Then we will
>> have 4 independent patches in total:
>> 
>> 1. -fauto-var-init=zero and all the handling in other passes to the new 
>> added call to .DEFERRED_INIT.
>> 2. Add -fauto-var-init=pattern
>> 3. Add -fauto-var-init-padding
>> 4. Add -ftrivial-auto-var-init for CLANG compatibility.
>> 
>> Are the above the correct understanding?
> 
> I think we can drop -fauto-var-init=pattern and just go with block
> initializing which will cover padding as well which means we can
> stay with the odd -ftrivial-auto-var-init name used by CLANG and
> add no additional options.

Yes, this is a good idea. 

block initializing will cover all paddings automatically. 

Shall we do block initializing for both “zero initialization” and “pattern 
initialization”?

Currently, for zero initialization, I used the following:

>>> +    case AUTO_INIT_ZERO:
>>> +      init = build_zero_cst (TREE_TYPE (var));
>>> +      expand_assignment (var, init, false);
>>> +      break;

Looks like that the current “expand_assignment” does not initialize paddings 
with zeroes. 
Shall I also use “memset” for “zero initialization”?

> 
>> As said, block-initializing with a repeated pattern is OK and I can see
>> that being useful.  Trying to produce "nicer" values for floats, bools
>> and pointers on 32bit platforms is IMHO not going to fix anything and
>> introduce as many problems as it will "fix".
>> 
>> Yes, I agree, if we can find a good repeated pattern for all types’s 
>> pattern initialization, that will be much easier and simpler to 
>> implement, I am happy to do that.  (Honestly, the part of implementation 
>> that took me most of the time is pattern-initialization.. and I am still 
>> not very comfortable with this part Of the code myself.  -:)
> 
> There's no "safe" pattern besides all-zero for all "undefined" uses
> (note that uses do not necessarily use declared types).  Which is why
> recommending pattern init is somewhat misguided.  There's maybe 
> some useful pattern that more readily produces crashes, those that
> produce a FP sNaN for all of the float types.

So, pattern value as 0xFF might be better than 0xAA since 0xFFFFFFFF will be a 
NaN value for floating type?

> 
>> And if you block-initialize stuff you then automagically cover padding.
>> I call this a win-win, no?
>> 
>> Yes, this will also initialize paddings with patterns (Not zeroes as CLANG 
>> did).
>> Shall we compatible with CLANG on this?
> 
> No, why?

Okay.

>> in my example code (untested) you then still need
>> 
>>  expand_assignment (var, ctor, false);
>> 
>> it would be the easiest way to try pattern init with a pattern that's
>> bigger than a byte (otherwise of course the memset path is optimal).
>> 
>> If the pattern that is used to initialize all types is byte-repeatable, for 
>> example, 0xA or 0xF, then
>> We can use memset to initialize all types, however, the potential problem 
>> is, if later we decide
>> To change to another pattern that might not be byte-repeatable, then the 
>> memset implementation
>> is not proper at that time.
>> 
>> Is it possible that we might change the pattern later?
> 
> The pattern should be documented as an implementation detail unless
> we want to expose it to the user via, say, -fpattern-init=0xdeadbeef.

Not sure whether it’s necessary to expose this to user.

One question that is important to the implementation is:

Shall we use “byte-repeated” or “word-repeated” pattern?
Is “word-repeated” pattern better than “byte-repeated” pattern?

For implementation, “byte-repeated” pattern will make the whole implementation 
much simpler since both “zero initialization” 
and “pattern initialization” can be implemented with “memset” with different 
“value”.  

So, if “word-repeated” pattern will not have too much more benefit, I will 
prefer “byte-repeated” pattern.

Let me know your comments here.

> 
>> 
>> 
>> As said, for example glibc allocator hardening with MALLOC_PERTURB_
>> uses simple byte-init.
>> 
>> What’s the pattern glibc used?
> 
> The value of the MALLOC_PERTURB_ environment truncated to a byte.

Okay.

thanks.

Qing
> 
> Richard.
> 

Reply via email to