On Wed, Jul 14, 2021 at 4:10 PM Qing Zhao <qing.z...@oracle.com> wrote: > > Hi, Richard, > > > On Jul 14, 2021, at 2:14 AM, Richard Biener <richard.guent...@gmail.com> > > wrote: > > > > On Wed, Jul 14, 2021 at 1:17 AM Qing Zhao <qing.z...@oracle.com> wrote: > >> > >> Hi, Kees, > >> > >> I took a look at the kernel testing case you attached in the previous > >> email, and found the testing failed with the following case: > >> > >> #define INIT_STRUCT_static_all = { .one = arg->one, \ > >> .two = arg->two, \ > >> .three = arg->three, \ > >> .four = arg->four, \ > >> } > >> > >> i.e, when the structure type auto variable has been explicitly initialized > >> in the source code. -ftrivial-auto-var-init in the 4th version > >> does not initialize the paddings for such variables. > >> > >> But in the previous version of the patches ( 2 or 3), > >> -ftrivial-auto-var-init initializes the paddings for such variables. > >> > >> I intended to remove this part of the code from the 4th version of the > >> patch since the implementation for initializing such paddings is > >> completely different from > >> the initializing of the whole structure as a whole with memset in this > >> version of the implementation. > >> > >> If we really need this functionality, I will add another separate patch > >> for this additional functionality, but not with this patch. > >> > >> Richard, what’s your comment and suggestions on this? > > > > I think this can be addressed in the gimplifier by adjusting > > gimplify_init_constructor to clear > > the object before the initialization (if it's not done via aggregate > > copying). > > I did this in the previous versions of the patch like the following: > > @@ -5001,6 +5185,17 @@ gimplify_init_constructor (tree *expr_p, gimple_seq > *pre_p, gimple_seq *post_p, > /* If a single access to the target must be ensured and all elements > are zero, then it's optimal to clear whatever their number. */ > cleared = true; > + else if (flag_trivial_auto_var_init > AUTO_INIT_UNINITIALIZED > + && !TREE_STATIC (object) > + && type_has_padding (type)) > + /* If the user requests to initialize automatic variables with > + paddings inside the type, we should initialize the paddings too. > + C guarantees that brace-init with fewer initializers than members > + aggregate will initialize the rest of the aggregate as-if it were > + static initialization. In turn static initialization guarantees > + that pad is initialized to zero bits. > + So, it's better to clear the whole record under such situation. > */ > + cleared = true; > else > cleared = false; > > Then the paddings are also initialized to zeroes with this option. (Even for > -ftrivial-auto-var-init=pattern). > > Is the above change Okay? (With this change, when > -ftrivial-auto-var-init=pattern, the paddings for the > structure variables that have explicit initializer will be ZEROed, not 0xFE)
I guess that would be the simplest way, yes. > > The clearing > > could be done via .DEFERRED_INIT. > > You mean to add additional calls to .DEFERRED_INIT for each individual > padding of the structure in “gimplify_init_constructor"? > Then later during RTL expand, expand these calls the same as other calls? No, I actually meant to in your patch above set defered_padding_init = true; and where 'cleared' is processed do sth like if (defered_padding_init) .. emit .DEFERRED_INIT for the _whole_ variable .. else if (cleared) .. original cleared handling ... that would retain the pattern init but possibly be less efficient in the end. > > > > Note that I think .DEFERRED_INIT can be elided for variables that do > > not have their address > > taken - otherwise we'll also have to worry about aggregate copy > > initialization and SRA > > decomposing the copy, initializing only the used parts. > > Please explain this a little bit more. For sth like struct S { int i; long j; }; void bar (struct S); struct S foo (struct S *p) { struct S q = *p; struct S r = q; bar (r); return r; } we don't get a .DEFERRED_INIT for 'r' (do we?) and SRA decomposes the init to <bb 2> : q = *p_2(D); q$i_9 = p_2(D)->i; q$j_10 = p_2(D)->j; r.i = q$i_9; r.j = q$j_10; bar (r); D.1953 = r; r ={v} {CLOBBER}; return D.1953; which leaves its padding uninitialized. Hmm, and that even happens when you make bar take struct S * and thus pass the address of 'r' to bar. Richard. > Thanks. > > Qing > > > > Richard. > > > >> Thanks. > >> > >> Qing > >> > >>> On Jul 13, 2021, at 4:29 PM, Kees Cook <keesc...@chromium.org> wrote: > >>> > >>> On Mon, Jul 12, 2021 at 08:28:55PM +0000, Qing Zhao wrote: > >>>>> On Jul 12, 2021, at 12:56 PM, Kees Cook <keesc...@chromium.org> wrote: > >>>>> On Wed, Jul 07, 2021 at 05:38:02PM +0000, Qing Zhao wrote: > >>>>>> This is the 4th version of the patch for the new security feature for > >>>>>> GCC. > >>>>> > >>>>> It looks like padding initialization has regressed to where things where > >>>>> in version 1[1] (it was, however, working in version 2[2]). I'm seeing > >>>>> these failures again in the kernel self-test: > >>>>> > >>>>> test_stackinit: small_hole_static_all FAIL (uninit bytes: 3) > >>>>> test_stackinit: big_hole_static_all FAIL (uninit bytes: 61) > >>>>> test_stackinit: trailing_hole_static_all FAIL (uninit bytes: 7) > >>>>> test_stackinit: small_hole_dynamic_all FAIL (uninit bytes: 3) > >>>>> test_stackinit: big_hole_dynamic_all FAIL (uninit bytes: 61) > >>>>> test_stackinit: trailing_hole_dynamic_all FAIL (uninit bytes: 7) > >>>> > >>>> Are the above failures for -ftrivial-auto-var-init=zero or > >>>> -ftrivial-auto-var-init=pattern? Or both? > >>> > >>> Yes, I was only testing =zero (the kernel test handles =pattern as well: > >>> it doesn't explicitly test for 0x00). I've verified with =pattern now, > >>> too. > >>> > >>>> For the current implementation, I believe that all paddings should be > >>>> initialized with this option, > >>>> for -ftrivial-auto-var-init=zero, the padding will be initialized to > >>>> zero as before, however, for > >>>> -ftrivial-auto-var-init=pattern, the padding will be initialized to 0xFE > >>>> byte-repeatable patterns. > >>> > >>> I've double-checked that I'm using the right gcc, with the flag. > >>> > >>>>> > >>>>> In looking at the gcc test cases, I think the wrong thing is > >>>>> being checked: we want to verify the padding itself. For example, > >>>>> in auto-init-17.c, the actual bytes after "four" need to be checked, > >>>>> rather than "four" itself. > >>>> > >>>> ******For the current auto-init-17.c > >>>> > >>>> 1 /* Verify zero initialization for array type with structure element > >>>> with > >>>> 2 padding. */ > >>>> 3 /* { dg-do compile } */ > >>>> 4 /* { dg-options "-ftrivial-auto-var-init=zero" } */ > >>>> 5 > >>>> 6 struct test_trailing_hole { > >>>> 7 int one; > >>>> 8 int two; > >>>> 9 int three; > >>>> 10 char four; > >>>> 11 /* "sizeof(unsigned long) - 1" byte padding hole here. */ > >>>> 12 }; > >>>> 13 > >>>> 14 > >>>> 15 int foo () > >>>> 16 { > >>>> 17 struct test_trailing_hole var[10]; > >>>> 18 return var[2].four; > >>>> 19 } > >>>> 20 > >>>> 21 /* { dg-final { scan-assembler "movl\t\\\$0," } } */ > >>>> 22 /* { dg-final { scan-assembler "movl\t\\\$20," } } */ > >>>> 23 /* { dg-final { scan-assembler "rep stosq" } } */ > >>>> ~ > >>>> ******We have the assembly as: (-ftrivial-auto-var-init=zero) > >>>> > >>>> .file "auto-init-17.c" > >>>> .text > >>>> .globl foo > >>>> .type foo, @function > >>>> foo: > >>>> .LFB0: > >>>> .cfi_startproc > >>>> pushq %rbp > >>>> .cfi_def_cfa_offset 16 > >>>> .cfi_offset 6, -16 > >>>> movq %rsp, %rbp > >>>> .cfi_def_cfa_register 6 > >>>> subq $40, %rsp > >>>> leaq -160(%rbp), %rax > >>>> movq %rax, %rsi > >>>> movl $0, %eax > >>>> movl $20, %edx > >>>> movq %rsi, %rdi > >>>> movq %rdx, %rcx > >>>> rep stosq > >>>> movzbl -116(%rbp), %eax > >>>> movsbl %al, %eax > >>>> leave > >>>> .cfi_def_cfa 7, 8 > >>>> ret > >>>> .cfi_endproc > >>>> .LFE0: > >>>> .size foo, .-foo > >>>> .section .note.GNU-stack,"",@progbits > >>>> > >>>> From the above, we can see, “zero” will be used to initialize 8 * 20 = > >>>> 16 * 10 bytes of memory starting from the beginning of “var”, that > >>>> include all the padding holes inside > >>>> This array of structure. > >>>> > >>>> I didn’t see issue with padding initialization here. > >>> > >>> Hm, agreed -- this test does do the right thing. > >>> > >>>>> But this isn't actually sufficient because they may _accidentally_ > >>>>> be zero already. The kernel tests specifically make sure to fill the > >>>>> about-to-be-used stack with 0xff before calling a function like foo() > >>>>> above. > >>> > >>> I've extracted the kernel test to build for userspace, and it behaves > >>> the same way. See attached "stackinit.c". > >>> > >>> $ gcc-build/auto-var-init.4/installed/bin/gcc -O2 -Wall -o stackinit > >>> stackinit.c > >>> $ ./stackinit 2>&1 | grep failures: > >>> stackinit: failures: 23 > >>> $ gcc-build/auto-var-init.4/installed/bin/gcc -O2 -Wall > >>> -ftrivial-auto-var-init=zero -o stackinit stackinit.c > >>> stackinit.c: In function ‘__leaf_switch_none’: > >>> stackinit.c:326:26: warning: statement will never be executed > >>> [-Wswitch-unreachable] > >>> 326 | uint64_t var; > >>> | ^~~ > >>> $ ./stackinit 2>&1 | grep failures: > >>> stackinit: failures: 6 > >>> > >>> Same failures as seen in the kernel test (and an expected warning > >>> about the initialization that will never happen for a pre-case switch > >>> statement). > >>> > >>>>> > >>>>> (And as an aside, it seems like naming the test cases with some details > >>>>> about what is being tested in the filename would be nice -- it was > >>>>> a little weird having to dig through their numeric names to find the > >>>>> padding tests.) > >>>> > >>>> Yes, I will fix the testing names to more reflect the testing details. > >>> > >>> Great! > >>> > >>> -- > >>> Kees Cook > >>> <stackinit.c> >