Re: Question on -fwrapv and -fwrapv-pointer

2023-09-15 Thread Kees Cook via Gcc-patches
On Fri, Sep 15, 2023 at 05:47:08PM +, Qing Zhao wrote:
> 
> 
> > On Sep 15, 2023, at 1:26 PM, Richard Biener  
> > wrote:
> > 
> > 
> > 
> >> Am 15.09.2023 um 17:37 schrieb Qing Zhao :
> >> 
> >> 
> >> 
>  On Sep 15, 2023, at 11:29 AM, Richard Biener 
>   wrote:
>  
>  
>  
> > Am 15.09.2023 um 17:25 schrieb Qing Zhao :
>  
>  
>  
> > On Sep 15, 2023, at 8:41 AM, Arsen Arsenović  wrote:
> > 
> > 
> > Qing Zhao  writes:
> > 
> >> Even though unsigned integer overflow is well defined, it might be
> >> unintentional, shall we warn user about this?
> > 
> > This would be better addressed by providing operators or functions that
> > do overflow checking in the language, so that they can be explicitly
> > used where overflow is unexpected.
>  
>  Yes, that will be very helpful to prevent unexpected overflow in the 
>  program in general.
>  However, this will mainly benefit new codes.
>  
>  For the existing C codes, especially large applications, we still need 
>  to identify all the places 
>  Where the overflow is unexpected, and fix them. 
>  
>  One good example is linux kernel. 
>  
> > One could easily imagine a scenario
> > where overflow is not expected in some region of code but is in the
> > larger application.
>  
>  Yes, that’s exactly the same situation Linux kernel faces now, the 
>  unexpected Overflow and 
>  expected wrap-around are mixed together inside one module. 
>  It’s hard to detect the unexpected overflow under such situation based 
>  on the current GCC. 
> >>> 
> >>> But that’s hardly GCCs fault nor can GCC fix that in any way.  Only the 
> >>> programmer can distinguish both cases.
> >> 
> >> Right, compiler cannot fix this. 
> >> But can provide some tools to help the user to detect this more 
> >> conveniently. 
> >> 
> >> Right now, GCC provides two set of options for different types:
> >> 
> >> A. Turn the overflow to expected wrap-around (remove UB);
> >> B. Detect overflow;
> >> 
> >>   AB
> >>  remove UB-fsanitize=…
> >> signed   -fwrapvsigned-integer-overflow
> >> pointer   -fwrapv-pointerpointer-overflow (broken in Clang)
> >> 
> >> However, Options in A and B excluded with each other. They cannot mix 
> >> together for a single file.
> >> 
> >> What’s requested from Kernel is:
> >> 
> >> compiler needs to provide a functionality that can mix these two together 
> >> for a file. 
> >> 
> >> i.e, apply A (convert UB to defined behavior WRAP-AROUND) only to part of 
> >> the program.  And then add -fsnaitize=*overflow to detect all other
> >> Unexpected overflows in the program.
> >> 
> >> This is currently missing from GCC, I guess?
> > 
> > How can GCC know which part of the program wants wrapping and which 
> > sanitizing?
> 
> GCC doesn’t know, but the user knows. 
> 
> Then just provide the user a way to mark part of the program to be wrapping 
> around and excluded from sanitizing? 
> 
> Currently, GCC provides 
> 
> __attribute__(optimize ("wrapv"))  
> 
> To mark the specific function to be wrapped around.
> 
> However, this attribute does not work for linux kernel due to the following 
> reason:
> 
> Attribute optimize should be only used for debugging purpose;
> The kernel has banned its usage;
> 
> So, a new attribute was requested from Linux kernel security: 
> 
>  request wrap-around behavior for specific function (PR102317)
> __attribute__((wrapv)) 
> 
> Is this request reasonable?

After working through this discussion, I'd say it's likely more helpful
to have a way to disable the sanitizers for a given function (or
variable). i.e. The goal for the kernel would that untrapped wrap-around
would be the very rare exception. e.g. our refcount_t implementation:
https://elixir.bootlin.com/linux/v6.5/source/include/linux/refcount.h#L200

Then we can continue to build the kernel with -fno-strict-overflow (to
avoid UB), but gain sanitizer coverage for all run-time wraps, except
for the very few places where we depend on it. Getting there will also
take some non-trivial refactoring on our end, but without the sanitizers
we're unlikely to find them all.

-- 
Kees Cook


Re: Question on -fwrapv and -fwrapv-pointer

2023-09-15 Thread Kees Cook via Gcc-patches
On Fri, Sep 15, 2023 at 08:18:28AM -0700, Andrew Pinski wrote:
> On Fri, Sep 15, 2023 at 8:12 AM Qing Zhao  wrote:
> >
> >
> >
> > > On Sep 15, 2023, at 3:43 AM, Xi Ruoyao  wrote:
> > >
> > > On Thu, 2023-09-14 at 21:41 +, Qing Zhao wrote:
> >  CLANG already provided -fsanitize=unsigned-integer-overflow. GCC
> >  might need to do the same.
> > >>>
> > >>> NO. There is no such thing as unsigned integer overflow. That option
> > >>> is badly designed and the GCC community has rejected a few times now
> > >>> having that sanitizer before. It is bad form to have a sanitizer for
> > >>> well defined code.
> > >>
> > >> Even though unsigned integer overflow is well defined, it might be
> > >> unintentional, shall we warn user about this?
> > >
> > > *Everything* could be unintentional and should be warned then.  GCC is a
> > > compiler, not an advanced AI educating the programmers.
> >
> > Well, you are right in some sense. -:)
> >
> > However, overflow is one important source for security flaws, it’s 
> > important  for compilers to detect
> > overflows in the programs in general.
> 
> Except it is NOT an overflow. Rather it is wrapping. That is a big
> point here. unsigned wraps and does NOT overflow. Yes there is a major
> difference.

Right, yes. I will try to pick my language very carefully. :)

The practical problem I am trying to solve in the 30 million lines of
Linux kernel code is that of catching arithmetic wrap-around. The
problem is one of evolving the code -- I can't just drop -fwrapv and
-fwrapv-pointer because it's not possible to fix all the cases at once.
(And we really don't want to reintroduce undefined behavior.)

So, for signed, pointer, and unsigned types, we need:

a) No arithmetic UB -- everything needs to have deterministic behavior.
   The current solution here is "-fno-strict-overflow", which eliminates
   the UB and makes sure everything wraps.

b) A way to run-time warn/trap on overflow/underflow/wrap-around. This
   would work with -fsanitize=[signed-integer|pointer]-overflow except
   due to "a)" we always wrap. And there isn't currently coverage like
   this for unsigned (in GCC).

Our problem is that the kernel is filled with a mix of places where there
is intended wrap-around and unintended wrap-around. We can chip away at
fixing the intended wrap-around that we can find with static analyzers,
etc, but at the end of the day there is a long tail of finding the places
where intended wrap-around is hiding. But when the refactoring is
sufficiently completely, we can move the wrap-around warning to a trap,
and the kernel will not longer have this class of security flaw.

As a real-world example, here is a bug where a u8 wraps around causing
an under-allocation that allowed for a heap overwrite:

https://git.kernel.org/linus/6311071a0562
https://elixir.bootlin.com/linux/v6.5/source/net/wireless/nl80211.c#L5422

If there were more than 255 elements in a linked list, the allocation
would be too small, and the second loop would write past the end of the
allocation. This is a pretty classic allocation underflow and linear
heap write overflow security flaw. (And it would be trivially stopped by
trapping on the u8 wrap around.)

So, I want to be able to catch that at run-time. But we also have code
doing things like "if (ulong + offset < ulong) { ... }":

https://elixir.bootlin.com/linux/v6.5/source/drivers/crypto/axis/artpec6_crypto.c#L1187

This is easy for a static analyzer to find and we can replace it with a
non-wrapping test (e.g. __builtin_add_overflow()), but we'll not find
them all immediately, especially for the signed and pointer cases.

So, I need to retain the "everything wraps" behavior while still being
able to detect when it happens.

-- 
Kees Cook


Re: Question on -fwrapv and -fwrapv-pointer

2023-09-14 Thread Kees Cook via Gcc-patches
On Thu, Sep 14, 2023 at 01:57:41PM -0700, Andrew Pinski wrote:
> Now -fsanitize=pointer-overflow is already there for GCC which was
> added in r8-2238-gc9b39a4955f56fe609ef5478 . LLVM/clang also provides
> it in the same timeframe too .
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80998

Ah, thanks for the link! And checking it just now it seems like Clang's
implementation doesn't work. Fun times.

-Kees

-- 
Kees Cook


Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-25 Thread Kees Cook via Gcc-patches
On Fri, Aug 25, 2023 at 03:24:22PM +, Qing Zhao wrote:
> This is the 3rd version of the patch, per our discussion based on the
> review comments for the 1st and 2nd version, the major changes in this

This tests out great for me; thanks you! I'm able to build the entire
kernel tree with 201 annotations[1] added. Things work as expected. :)

-Kees

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=devel/next-20230825/counted_by

-- 
Kees Cook


Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-17 Thread Kees Cook via Gcc-patches
On Thu, Aug 17, 2023 at 01:44:42PM +, Qing Zhao wrote:
> Thanks for the testing case. 
> Yes, I noticed this issue too, and already fixed it in my private branch. 
> 
> With the latest patch, the compilation has no issue:
> [opc@qinzhao-ol8u3-x86 108896]$ sh t
> /home/opc/Install/latest-d/bin/gcc -O2 -c -o /dev/null bug.c
> [opc@qinzhao-ol8u3-x86 108896]$ 

Great! Thanks. I look forward to v3. For now I'll leave off these 2
annotations in my kernel builds. :)

-Kees

-- 
Kees Cook


Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-17 Thread Kees Cook via Gcc-patches
On Wed, Aug 16, 2023 at 10:31:30PM -0700, Kees Cook wrote:
> On Fri, Aug 04, 2023 at 07:44:28PM +, Qing Zhao wrote:
> > This is the 2nd version of the patch, per our discussion based on the
> > review comments for the 1st version, the major changes in this version
> 
> I've been using Coccinelle to find and annotate[1] structures (193 so
> far...), and I've encountered 2 cases of GCC internal errors. I'm working
> on a minimized test case, but just in case these details are immediately
> helpful, here's what I'm seeing:

Okay, I got it minimized:

$ cat poc.c
struct a {
  unsigned long c;
  char d[] __attribute__((__counted_by__(c)));
} *b;

void f(long);

void e(void) {
  long g = __builtin_dynamic_object_size(b->d, 1);
  f(g);
}
$ gcc -O2 -c -o /dev/null poc.c
poc.c: In function 'e':
poc.c:8:6: error: incorrect sharing of tree nodes
8 | void e(void) {
  |  ^
*b.0_1
_2 = _1->d;
during GIMPLE pass: objsz
poc.c:8:6: internal compiler error: verify_gimple failed
0xfe97fd verify_gimple_in_cfg(function*, bool, bool)
../../../../gcc/gcc/tree-cfg.cc:5646
0xe84894 execute_function_todo
../../../../gcc/gcc/passes.cc:2088
0xe84dee execute_todo
../../../../gcc/gcc/passes.cc:2142

-- 
Kees Cook


Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-16 Thread Kees Cook via Gcc-patches
On Fri, Aug 04, 2023 at 07:44:28PM +, Qing Zhao wrote:
> This is the 2nd version of the patch, per our discussion based on the
> review comments for the 1st version, the major changes in this version

I've been using Coccinelle to find and annotate[1] structures (193 so
far...), and I've encountered 2 cases of GCC internal errors. I'm working
on a minimized test case, but just in case these details are immediately
helpful, here's what I'm seeing:

../drivers/net/wireless/ath/wcn36xx/smd.c: In function 
'wcn36xx_smd_rsp_process':
../drivers/net/wireless/ath/wcn36xx/smd.c:3299:5: error: incorrect sharing of 
tree nodes
 3299 | int wcn36xx_smd_rsp_process(struct rpmsg_device *rpdev,
  | ^~~
MEM[(struct wcn36xx_hal_ind_msg *)_96]
_15 = [(struct wcn36xx_hal_ind_msg *)_96].msg;
during GIMPLE pass: objsz
../drivers/net/wireless/ath/wcn36xx/smd.c:3299:5: internal compiler error: 
verify_gimple failed
0xfe97fd verify_gimple_in_cfg(function*, bool, bool)
../../../../gcc/gcc/tree-cfg.cc:5646
0xe84894 execute_function_todo
../../../../gcc/gcc/passes.cc:2088
0xe84dee execute_todo
../../../../gcc/gcc/passes.cc:2142

The associated struct is:

struct wcn36xx_hal_ind_msg {
struct list_head list;
size_t msg_len;
u8 msg[] __counted_by(msg_len);
};



And:

../drivers/usb/gadget/function/f_fs.c: In function '__ffs_epfile_read_data':
../drivers/usb/gadget/function/f_fs.c:900:16: error: incorrect sharing of tree 
nodes
  900 | static ssize_t __ffs_epfile_read_data(struct ffs_epfile *epfile,
  |^~
MEM[(struct ffs_buffer *)_67]
_5 = [(struct ffs_buffer *)_67].storage;
during GIMPLE pass: objsz
../drivers/usb/gadget/function/f_fs.c:900:16: internal compiler error: 
verify_gimple failed
0xfe97fd verify_gimple_in_cfg(function*, bool, bool)
../../../../gcc/gcc/tree-cfg.cc:5646
0xe84894 execute_function_todo
../../../../gcc/gcc/passes.cc:2088
0xe84dee execute_todo
../../../../gcc/gcc/passes.cc:2142

with:

struct ffs_buffer {
size_t length;
char *data;
char storage[] __counted_by(length);
};


[1] 
https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

-- 
Kees Cook


Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-09 Thread Kees Cook via Gcc-patches
On Mon, Aug 07, 2023 at 04:33:13PM +, Qing Zhao wrote:
> What’s the testing case for the one that failed? 
> If it’s 
> 
> __builtin_dynamic_object_size(p->array, 0/2) without the allocation 
> information in the routine, 
> then with the current algorithm, gcc cannot deduce the size for the whole 
> object.
> 
> If not such case, let me know.

I found some more bugs in my tests (now fixed), but I'm left with a single
failure case, which is think it going to boil down to pointer/pointee
issue we discussed earlier.

Using my existing testing tool:
https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c

I see this error with the "counted_by_seen_by_bdos" case:

Expected __builtin_dynamic_object_size(p, 1) (18446744073709551615) == 
sizeof(*p) + p->count * sizeof(*p->array) (80)

A reduced view of the code is:

struct annotated *p;
int index = MAX_INDEX + unconst;

p = alloc_annotated(index);

EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->count * 
sizeof(*p->array));

It looks like __bdos will not use the __counted_by information from the
pointee if the argument is only the pointer. i.e. this test works:

EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->count * 
sizeof(*p->array));

However, I thought if any part of the pointee was used (e.g. p->count,
p->array), GCC would be happy to start using the pointee details?

And, again, for the initial version at this feature, I'm fine with this
failing test being declared not a valid test. :) But I'll need some
kind of builtin that can correctly interrogate a pointer to find the
full runtime size with the assumption that pointer is valid, but that
can come later.

And as a side note, I am excited to see the very correct warnings for
bad (too late) assignment of the __counted_by member:

p->array[0] = 0;
p->count = 1;

array-bounds.c: In function 'invalid_assignment_order':
array-bounds.c:366:17: warning: '*p.count' is used uninitialized 
[-Wuninitialized]
  366 | p->array[0] = 0;
  | ^~~

Yay! :)

-Kees

-- 
Kees Cook


Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-08 Thread Kees Cook via Gcc-patches
On Tue, Aug 08, 2023 at 04:54:46PM +0200, Martin Uecker wrote:
> 
> 
> I am sure this has been discussed before, but seeing that you
> test for a specific formula, let me point out the following:
> 
> There at least three different size expression which could
> make sense. Consider
> 
> short foo { int a; short b; char t[]; }; 
> 
> Most people seem to use
> 
> sizeof(struct foo) + N * sizeof(foo->t);

I think this contains a typo. The above should be:

sizeof(struct foo) + N * sizeof(*foo->t);

> which for N == 3 yields 11 bytes on x86-64 because the formula
> adds the padding of the original struct. There is an example
> in the  C standard that uses this formula.
> 
> 
> But he minimum size of an object which stores N elements is
> 
> max(sizeof (struct s), offsetof(struct s, t[n]))
> 
> which is 9 bytes. 
> 
> This is what clang uses for statically allocated objects with
> initialization, while GCC uses the rule above (11 bytes). But 
> bdos / bos  then returns the smaller size of 9 which is a bit
> confusing.
> 
> https://godbolt.org/z/K1hvaK1ns

And to add to the mess here, Clang used to get this wrong for statically
initialized flexible array members (returning 8):

https://godbolt.org/z/Tee96dazs

But that was fixed recently when I noticed it a few weeks ago.

> https://github.com/llvm/llvm-project/issues/62929
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956
> 
> 
> Then there is also the size of a similar array where the FAM
> is replaced with an array of static size:
> 
> struct foo { int a; short b; char t[3]; }; 
> 
> This would make the most sense to me, but it has 12 bytes
> because the padding is according to the usual alignment
> rules.

Hm, in my thinking, FAMs are individually padded, so since they don't
"count" to the struct size, I think this is "as expected".

Note that the Linux kernel explicitly chooses to potentially over-estimate
for FAM allocations since there has been inconsistent sizes used depend
on the style of the prior fake FAM usage, the use of unions, etc.

i.e. our macro is, effectively:

#define struct_size(ptr, member, count) \
(sizeof(*ptr) + (count) * sizeof(*ptr->member))

since the other case can lead to truncated sizes:

#define struct_size(ptr, member, count) \
(offsetof(typeof(*ptr), member) + (count) * sizeof(*ptr->member))


struct foo {
int a, b, c;
int count;
union {
struct {
char small;
char char_fam[];
};
struct {
int large;
int int_fam[];
};
};
};

https://godbolt.org/z/b1v74Mzhd

i.e.:
struct_size(x, char_fam, 1) < sizeof(*x)

so accessing "large" fails, etc. Yes, it's all horrible, but we have to
deal with this kind of thing in the kernel. :(

-- 
Kees Cook


Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-07 Thread Kees Cook via Gcc-patches
On Fri, Aug 04, 2023 at 07:44:28PM +, Qing Zhao wrote:
> This is the 2nd version of the patch, per our discussion based on the
> review comments for the 1st version, the major changes in this version
> are:

Thanks for the update!

> 
> 1. change the name "element_count" to "counted_by";
> 2. change the parameter for the attribute from a STRING to an
> Identifier;
> 3. Add logic and testing cases to handle anonymous structure/unions;
> 4. Clarify documentation to permit the situation when the allocation
> size is larger than what's specified by "counted_by", at the same time,
> it's user's error if allocation size is smaller than what's specified by
> "counted_by";
> 5. Add a complete testing case for using counted_by attribute in
> __builtin_dynamic_object_size when there is mismatch between the
> allocation size and the value of "counted_by", the expecting behavior
> for each case and the explanation on why in the comments. 

All the "normal" test cases I have are passing; this is wonderful! :)

I'm still seeing unexpected situations when I've intentionally set
counted_by to be smaller than alloc_size, but I assume it's due to not
yet having the patch you mention below.

> As discussed, I plan to add two more separate patch sets after this initial
> patch set is approved and committed.
> 
> set 1. A new warning option and a new sanitizer option for the user error
>when the allocation size is smaller than the value of "counted_by".
> set 2. An improvement to __builtin_dynamic_object_size  for the following
>case:
> 
> struct A
> {
> size_t foo;
> int array[] __attribute__((counted_by (foo)));
> };
> 
> extern struct fix * alloc_buf ();
> 
> int main ()
> {
> struct fix *p = alloc_buf ();
> __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int);
>   /* with the current algorithm, it’s UNKNOWN */ 
> __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int);
>   /* with the current algorithm, it’s UNKNOWN */
> }

Should the above be bdos instead of bos?

> Bootstrapped and regression tested on both aarch64 and X86, no issue.

I've updated the Linux kernel's macros for the name change and done
build tests with my first pass at "easy" cases for adding counted_by:
https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by=adc5b3cb48a049563dc673f348eab7b6beba8a9b

Everything is working as expected. :)

-Kees

-- 
Kees Cook




Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Kees Cook via Gcc-patches
On Thu, Aug 03, 2023 at 09:31:24PM +, Qing Zhao wrote:
> So, the basic question is:
> 
> Given the following:
> 
> struct fix {
>   int others;
>   int array[10];
> }
> 
> extern struct fix * alloc_buf ();
> 
> int main ()
> {
>   struct fix *p = alloc_buf ();
>   __builtin_object_size(p->array,0) == ?
> }
> 
> Given p->array, can the compiler determine that p points to an object that 
> has TYPE struct fix?
> 
> If the answer is YES, then the current__builtin_object_size algorithm can be 
> improved to determine __builtin_object_size(p->array, 0)  with the TYPE of 
> the struct fix.

I think it is fine to leave __bos(..., 0) as-is. From the Linux kernel's
use of __bos, we are almost exclusively only interesting the mode 1, not
node 0. :)

-- 
Kees Cook


Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Kees Cook via Gcc-patches
On Thu, Aug 03, 2023 at 07:55:54PM +, Qing Zhao wrote:
> 
> 
> > On Aug 3, 2023, at 1:51 PM, Kees Cook  wrote:
> > 
> > On August 3, 2023 10:34:24 AM PDT, Qing Zhao  wrote:
> >> One thing I need to point out first is, currently, even for regular fixed 
> >> size array in the structure,
> >> We have this same issue, for example:
> >> 
> >> #define LENGTH 10
> >> 
> >> struct fix {
> >> size_t foo;
> >> int array[LENGTH];
> >> };
> >> 
> >> …
> >> int main ()
> >> {
> >> struct fix *p;
> >> p = alloc_buf_more ();
> >> 
> >> expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
> >> expect(__builtin_object_size(p->array, 0), -1);
> >> }
> >> 
> >> Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN for 
> >> it.
> >> This is not a special issue for flexible array member.
> > 
> > Is this true with -fstrict-flex-arrays=3 ?
> 
> Yes. 

Okay, right, I understand now -- it doesn't see the allocation, therefore
max size is unknown. Sounds good.

-Kees

-- 
Kees Cook


Re: One question on the source code of tree-object-size.cc

2023-08-03 Thread Kees Cook via Gcc-patches
On August 3, 2023 10:34:24 AM PDT, Qing Zhao  wrote:
>One thing I need to point out first is, currently, even for regular fixed size 
>array in the structure,
>We have this same issue, for example:
>
>#define LENGTH 10
>
>struct fix {
>  size_t foo;
>  int array[LENGTH];
>};
>
>…
>int main ()
>{
>  struct fix *p;
>  p = alloc_buf_more ();
>
>  expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
>  expect(__builtin_object_size(p->array, 0), -1);
>}
>
>Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN for it.
>This is not a special issue for flexible array member.

Is this true with -fstrict-flex-arrays=3 ?

-Kees

>
>Qing
>
>
>On Aug 3, 2023, at 1:19 PM, Siddhesh Poyarekar  wrote:
>> 
>> On 2023-08-03 12:43, Qing Zhao wrote:
  Surely we could emit that for __bdos(q->array, 0) though, couldn't we?
>>> For __bdos(q->array, 0), we only have the access info for the sub-object 
>>> q->array, we can surely decide the size of the sub-object q->array, but we 
>>> still cannot
>>> decide the whole object that is pointed by q (the same reason as above), 
>>> right?
>> 
>> It's tricky, I mean we could assume p to be a valid object due to the 
>> dereference and hence assume that q->foo is also valid and that there's at 
>> least sizeof(*q) + q->foo * sizeof (q->array) bytes available.  The question 
>> then is whether q could be pointing to an element of an array of `struct 
>> annotated`.  Could we ever have a valid array of such structs that have a 
>> flex array at the end?  Wouldn't it always be a single object?
>> 
>> In fact for all pointers to such structs with a flex array at the end, could 
>> we always assume that it is a single object and never part of an array, and 
>> hence return sizeof()?
>> 
>> Thanks,
>> Sid
>


-- 
Kees Cook


Re: One question on the source code of tree-object-size.cc

2023-08-01 Thread Kees Cook via Gcc-patches
On Tue, Aug 01, 2023 at 09:35:30PM +, Qing Zhao wrote:
> 
> 
> > On Jul 31, 2023, at 1:07 PM, Siddhesh Poyarekar  wrote:
> > 
> > On 2023-07-31 13:03, Siddhesh Poyarekar wrote:
> >> On 2023-07-31 12:47, Qing Zhao wrote:
> >>> Hi, Sid and Jakub,
> >>> 
> >>> I have a question in the following source portion of the routine 
> >>> “addr_object_size” of gcc/tree-object-size.cc:
> >>> 
> >>>   743   bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
> >>>   744   if (bytes != error_mark_node)
> >>>   745 {
> >>>   746   bytes = size_for_offset (var_size, bytes);
> >>>   747   if (var != pt_var && pt_var_size && TREE_CODE (pt_var) == 
> >>> MEM_REF)
> >>>   748 {
> >>>   749   tree bytes2 = compute_object_offset (TREE_OPERAND 
> >>> (ptr, 0),
> >>>   750pt_var);
> >>>   751   if (bytes2 != error_mark_node)
> >>>   752 {
> >>>   753   bytes2 = size_for_offset (pt_var_size, bytes2);
> >>>   754   bytes = size_binop (MIN_EXPR, bytes, bytes2);
> >>>   755 }
> >>>   756 }
> >>>   757 }
> >>> 
> >>> At line 754, why we always use “MIN_EXPR” whenever it’s for OST_MINIMUM 
> >>> or not?
> >>> Shall we use
> >>> 
> >>> (object_size_type & OST_MINIMUM
> >>>  ? MIN_EXPR : MAX_EXPR)
> >>> 
> >> That MIN_EXPR is not for OST_MINIMUM.  It is to cater for allocations like 
> >> this:
> >> typedef struct
> >> {
> >>   int a;
> >> } A;
> >> size_t f()
> >> {
> >>   A *p = malloc (1);
> >>   return __builtin_object_size (p, 0);
> > 
> > Correction, that should be __builtin_object_size (p->a, 0).
> 
> Actually, it should be __builtin_object_size(p->a, 1).
> For __builtin_object_size(p->a,0), gcc always uses the allocation size for 
> the whole object.
> 
> GCC’s current behavior is:
> 
> For the size of the whole object, GCC currently always uses the allocation 
> size. 
> And for the size in the sub-object, GCC chose the smaller one among the 
> allocation size and the TYPE_SIZE. 
> 
> Is this correct behavior?
> 
> thanks.
> 
> Qing
> 
> Please see the following small example to show the above behavior:
> 
> =
> 
> #include 
> #include 
> 
> #define LENGTH 10
> #define SIZE_BUMP 5 
> #define noinline __attribute__((__noinline__))
> 
> struct fix {
>   size_t foo;
>   int array[LENGTH]; 
> };
> 
> #define expect(p, _v) do { \
> size_t v = _v; \
> if (p == v) \
> __builtin_printf ("ok:  %s == %zd\n", #p, p); \
> else \
> {  \
>   __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
> } \
> } while (0);
> 
> 
> /* in the following function, malloc allocated more space than size of the 
>struct fix.  Then what's the correct behavior we expect
>the __builtin_object_size should have for the following?
>  */
> 
> static struct fix * noinline alloc_buf_more ()
> {
>   struct fix *p;
>   p = malloc(sizeof (struct fix) + SIZE_BUMP * sizeof (int)); 
> 
>   /*when checking the observed access p->array, we have info on both
> observered allocation and observed access, 
> A. from observed allocation (alloc_size): (LENGTH + SIZE_BUMP) * sizeof 
> (int)
> B. from observed access (TYPE): LENGTH * sizeof (int)
>*/
>
>   /* for MAXIMUM size in the whole object: currently, GCC always used the A.  
> */
>   expect(__builtin_object_size(p->array, 0), (LENGTH + SIZE_BUMP) * 
> sizeof(int));

ok:  __builtin_object_size(p->array, 0) == 60

This is what I'd expect, yes: all memory from "array" to end of
allocation, and that matches here: (LENGTH + SIZE_BUMP) * sizeof(int)

> 
>   /* for MAXIMUM size in the sub-object: currently, GCC chose the smaller
>  one among these two: B.  */
>   expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));

ok:  __builtin_object_size(p->array, 1) == 40

Also as I'd expect: just LENGTH * sizeof(int), the remaining bytes
starting at "array", based on type info, regardless of rest of allocation.

> 
>   return p;
> }
> 
> /* in the following function, malloc allocated less space than size of the 
>struct fix.  Then what's the correct behavior we expect
>the __builtin_object_size should have for the following?
>  */
> 
> static struct fix * noinline alloc_buf_less ()
> {
>   struct fix *p;
>   p = malloc(sizeof (struct fix) - SIZE_BUMP * sizeof (int)); 
> 
>   /*when checking the observed access p->array, we have info on both
> observered allocation and observed access, 
> A. from observed allocation (alloc_size): (LENGTH - SIZE_BUMP) * sizeof 
> (int)
> B. from observed access (TYPE): LENGTH * sizeof (int)
>*/
>
>   /* for MAXIMUM size in the whole object: currently, GCC always used the A.  
> */
>   expect(__builtin_object_size(p->array, 0), (LENGTH - SIZE_BUMP) * 
> sizeof(int));

ok:  __builtin_object_size(p->array, 0) == 20

My brain just melted a 

Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)

2023-08-01 Thread Kees Cook via Gcc-patches
On Mon, Jul 31, 2023 at 08:14:42PM +, Qing Zhao wrote:
> /* In general, Due to type casting, the type for the pointee of a pointer
>does not say anything about the object it points to,
>So, __builtin_object_size can not directly use the type of the pointee
>to decide the size of the object the pointer points to.
> 
>there are only two reliable ways:
>A. observed allocations  (call to the allocation functions in the routine)
>B. observed accesses (read or write access to the location of the 
>  pointer points to)
> 
>that provide information about the type/existence of an object at
>the corresponding address.
> 
>for A, we use the "alloc_size" attribute for the corresponding allocation
>functions to determine the object size;
> 
>For B, we use the SIZE info of the TYPE attached to the corresponding 
> access.
>(We treat counted_by attribute as a complement to the SIZE info of the TYPE
> for FMA) 
> 
>The only other way in C which ensures that a pointer actually points
>to an object of the correct type is 'static':
> 
>void foo(struct P *p[static 1]);   
> 
>See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html
>for more details.  */

This is a great explanation; thank you!

In the future I might want to have a new builtin that will allow
a program to query a pointer when neither A nor B have happened. But
for the first version of the __counted_by infrastructure, the above
limitations seen fine.

For example, maybe __builtin_counted_size(p) (which returns sizeof(*p) +
sizeof(*p->flex_array_member) * p->counted_by_member). Though since
there might be multiple flex array members, maybe this can't work. :)

-Kees

-- 
Kees Cook


Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)

2023-07-17 Thread Kees Cook via Gcc-patches
On Mon, Jul 17, 2023 at 09:17:48PM +, Qing Zhao wrote:
> 
> > On Jul 13, 2023, at 4:31 PM, Kees Cook  wrote:
> > 
> > In the bug, the problem is that "p" isn't known to be allocated, if I'm
> > reading that correctly?
> 
> I think that the major point in PR109557 
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557):
> 
> for the following pointer p.3_1, 
> 
> p.3_1 = p;
> _2 = __builtin_object_size (p.3_1, 0);
> 
> Question: why the size of p.3_1 cannot use the TYPE_SIZE of the pointee of p 
> when the TYPE_SIZE can be determined at compile time?
> 
> Answer:  From just knowing the type of the pointee of p, the compiler cannot 
> determine the size of the object.  

Why is that? "p" points to "struct P", which has a fixed size. There
must be an assumption somewhere that a pointer is allocated, otherwise
__bos would almost never work?

> Therefore the bug has been closed. 
> 
> In your following testing 5:
> 
> > I'm not sure this is a reasonable behavior, but
> > let me get into the specific test, which looks like this:
> > 
> > TEST(counted_by_seen_by_bdos)
> > {
> >struct annotated *p;
> >int index = MAX_INDEX + unconst;
> > 
> >p = alloc_annotated(index);
> > 
> >REPORT_SIZE(p->array);
> > /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array));
> >/* Check array size alone. */
> > /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), SIZE_MAX);
> > /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->foo * 
> > sizeof(*p->array));
> >/* Check check entire object size. */
> > /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX);
> > /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->foo 
> > * sizeof(*p->array));
> > }
> > 
> > Test 5 should pass as well, since, again, p can be examined. Passing p
> > to __bdos implies it is allocated and the __counted_by annotation can be
> > examined.
> 
> Since the call to the routine “alloc_annotated" cannot be inlined, GCC does 
> not see any allocation calls for the pointer p.

Correct.

> At the same time, due to the same reason as PR109986, GCC cannot determine 
> the size of the object by just knowing the TYPE_SIZE
> of the pointee of p.  

So the difference between test 3 and test 5 is that "p" is explicitly
dereferenced to find "array", and therefore an assumption is established
that "p" is allocated?

> So, this is exactly the same issue as PR109557.  It’s an existing behavior 
> per the current __buitlin_object_size algorithm.
> I am still not very sure whether the situation in PR109557 can be improved or 
> not, but either way, it’s a separate issue.

Okay, so the issue is one of object allocation visibility (or
assumptions there in)?

> Please see the new testing case I added for PR109557, you will see that the 
> above case 5 is a similar case as the new testing case in PR109557.

I will ponder this a bit more to see if I can come up with a useful
test case to replace the part from "test 5" above.

> > 
> > If "return p->array[index];" would be caught by the sanitizer, then
> > it follows that __builtin_dynamic_object_size(p, 1) must also know the
> > size. i.e. both must examine "p" and its trailing flexible array and
> > __counted_by annotation.
> > 
> >> 
> >> 2. The common issue for  the failed testing 3, 4, 9, 10 is:
> >> 
> >> for the following annotated structure: 
> >> 
> >> 
> >> struct annotated {
> >>unsigned long flags;
> >>size_t foo;
> >>int array[] __attribute__((counted_by (foo)));
> >> };
> >> 
> >> 
> >> struct annotated *p;
> >> int index = 16;
> >> 
> >> p = malloc(sizeof(*p) + index * sizeof(*p->array));  // allocated real 
> >> size 
> >> 
> >> p->foo = index + 2;  // p->foo was set by a different value than the real 
> >> size of p->array as in test 9 and 10
> > 
> > Right, tests 9 and 10 are checking that the _smallest_ possible value of
> > the array is used. (There are two sources of information: the allocation
> > size and the size calculated by counted_by. The smaller of the two
> > should be used when both are available.)
> 
> The counted_by attribute is used to annotate a Flexible array member on how 
> many elements it will have.
> However, if this information can not accurately reflect the real number of 
> elements for the array allocated, 
> What’s the purpose of such information? 

For example, imagine code that allocates space for 100 elements since
the common case is that the number of elements will grow over time.
Elements are added as it goes. For example:

struct grows {
int alloc_count;
int valid_count;
struct element item[] __counted_by(valid_count);
} *p;

void something(void)
{
p = malloc(sizeof(*p) + sizeof(*p->item) * 100);
p->alloc_count = 100;
p->valid_count = 0;

/* this loop doesn't check that we don't go over 100. */
while (items_to_copy) {
struct element *item_ptr = get_next_item();
/* 

Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)

2023-07-13 Thread Kees Cook via Gcc-patches
On Thu, Jul 06, 2023 at 06:56:21PM +, Qing Zhao wrote:
> Hi, Kees,
> 
> I have updated my V1 patch with the following changes:
> A. changed the name to "counted_by"
> B. changed the argument from a string to an identifier
> C. updated the documentation and testing cases accordingly.

Sounds great!

> 
> And then used this new gcc to test 
> https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c (with 
> the following change)
> [opc@qinzhao-ol8u3-x86 Kees]$ !1091
> diff array-bounds.c array-bounds.c.org
> 32c32
> < # define __counted_by(member)   __attribute__((counted_by (member)))
> ---
> > # define __counted_by(member)   
> > __attribute__((__element_count__(#member)))
> 34c34
> < # define __counted_by(member)   __attribute__((counted_by (member)))
> ---
> > # define __counted_by(member)   /* 
> > __attribute__((__element_count__(#member))) */
> 
> Then I got the following result:
> [opc@qinzhao-ol8u3-x86 Kees]$ ./array-bounds 2>&1 | grep -v ^'#'
> TAP version 13
> 1..12
> ok 1 global.fixed_size_seen_by_bdos
> ok 2 global.fixed_size_enforced_by_sanitizer
> not ok 3 global.unknown_size_unknown_to_bdos
> not ok 4 global.unknown_size_ignored_by_sanitizer
> ok 5 global.alloc_size_seen_by_bdos
> ok 6 global.alloc_size_enforced_by_sanitizer
> not ok 7 global.element_count_seen_by_bdos
> ok 8 global.element_count_enforced_by_sanitizer
> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
> 
> The same as your previous results. Then I took a look at all the failed 
> testing: 3, 4, 7, 9, and 10. And studied the reasons for all of them.
> 
>  in a summary, there are two major issues:
> 1.  The reason for the failed testing 7 is the same issue as I observed in 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557
> Which is not a bug, it’s an expected behavior. 

In the bug, the problem is that "p" isn't known to be allocated, if I'm
reading that correctly? I'm not sure this is a reasonable behavior, but
let me get into the specific test, which looks like this:

TEST(counted_by_seen_by_bdos)
{
struct annotated *p;
int index = MAX_INDEX + unconst;

p = alloc_annotated(index);

REPORT_SIZE(p->array);
/* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array));
/* Check array size alone. */
/* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), SIZE_MAX);
/* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->foo * 
sizeof(*p->array));
/* Check check entire object size. */
/* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX);
/* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->foo * 
sizeof(*p->array));
}

Test 1 trivially passes -- this is just a sanity check.

Test 2 should be SIZE_MAX because __bos only knows compile-time sizes.

Test 3 should pass: counted_by knows the allocation size and can be
queried as part of the "p" object.

Test 4 should be SIZE_MAX because __bos only knows compile-time sizes.

Test 5 should pass as well, since, again, p can be examined. Passing p
to __bdos implies it is allocated and the __counted_by annotation can be
examined.

If "return p->array[index];" would be caught by the sanitizer, then
it follows that __builtin_dynamic_object_size(p, 1) must also know the
size. i.e. both must examine "p" and its trailing flexible array and
__counted_by annotation.

> 
> 2. The common issue for  the failed testing 3, 4, 9, 10 is:
> 
> for the following annotated structure: 
> 
> 
> struct annotated {
> unsigned long flags;
> size_t foo;
> int array[] __attribute__((counted_by (foo)));
> };
> 
> 
> struct annotated *p;
> int index = 16;
> 
> p = malloc(sizeof(*p) + index * sizeof(*p->array));  // allocated real size 
> 
> p->foo = index + 2;  // p->foo was set by a different value than the real 
> size of p->array as in test 9 and 10

Right, tests 9 and 10 are checking that the _smallest_ possible value of
the array is used. (There are two sources of information: the allocation
size and the size calculated by counted_by. The smaller of the two
should be used when both are available.)

> or
> p->foo was not set to any value as in test 3 and 4

For tests 3 and 4, yes, this was my mistake. I have fixed up the tests
now. Bill noticed the same issue. Sorry for the think-o!

> 
> 
> i.e, the value of p->foo is NOT synced with the number of elements allocated 
> for the array p->array.  
> 
> I think that this should be considered as an user error, and the 
> documentation of the attribute should include
> this requirement.  (In the LLVM’s RFC, such requirement was included in the 
> programing model: 
> 

Re: [RFC/RFT,V2 0/3] Add compiler support for Kernel Control Flow Integrity

2023-06-21 Thread Kees Cook via Gcc-patches
On Sat, Mar 25, 2023 at 01:11:14AM -0700, Dan Li wrote:
> This series of patches is mainly used to support the control flow
> integrity protection of the linux kernel [1], which is similar to
> -fsanitize=kcfi in clang 16.0 [2,3].
> 
> Any suggestion please let me know :).

Hi Dan,

It's been a couple months, and I didn't see any other feedback on this
proposal. I was curious what the status of this work is. Are you able to
attend GNU Cauldron[1] this year? I'd love to see this get some traction
in GCC.

Thanks!

-Kees

[1] https://gcc.gnu.org/wiki/cauldron2023

-- 
Kees Cook


Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)

2023-05-26 Thread Kees Cook via Gcc-patches
On Thu, May 25, 2023 at 04:14:47PM +, Qing Zhao wrote:
> GCC will pass the number of elements info from the attached attribute to both 
> __builtin_dynamic_object_size and bounds sanitizer to check the out-of-bounds
> or dynamic object size issues during runtime for flexible array members.
> 
> This new feature will provide nice protection to flexible array members (which
> currently are completely ignored by both __builtin_dynamic_object_size and
> bounds sanitizers).

Testing went pretty well, though I think I found some bdos issues:

- some things that bdos can't know the size of, and correctly returned
  SIZE_MAX in the past, now thinks are 0-sized.
- while bdos correctly knows the size of an element_count-annotated
  flexible array, it doesn't know the size of the containing object
  (i.e. it returns SIZE_MAX).

Also, I think I found a precedence issue:

- if both __alloc_size and 'element_count' are in use, the _smallest_
  of the two is what I would expect to be enforced by the sanitizer
  and reported by __bdos. As is, alloc_size appears to be used when
  it is available, regardless of what 'element_count' shows.

I've updated my test cases to show it more clearly, but here is the
before/after:


GCC 13 (correctly does not implement "element_count"):

$ ./array-bounds 2>&1 | grep -v ^'#'
TAP version 13
1..12
ok 1 global.fixed_size_seen_by_bdos
ok 2 global.fixed_size_enforced_by_sanitizer
ok 3 global.unknown_size_unknown_to_bdos
ok 4 global.unknown_size_ignored_by_sanitizer
ok 5 global.alloc_size_seen_by_bdos
ok 6 global.alloc_size_enforced_by_sanitizer
not ok 7 global.element_count_seen_by_bdos
not ok 8 global.element_count_enforced_by_sanitizer
not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer


ToT GCC + this element_count series:

$ ./array-bounds 2>&1 | grep -v ^'#'
TAP version 13
1..12
ok 1 global.fixed_size_seen_by_bdos
ok 2 global.fixed_size_enforced_by_sanitizer
not ok 3 global.unknown_size_unknown_to_bdos
not ok 4 global.unknown_size_ignored_by_sanitizer
ok 5 global.alloc_size_seen_by_bdos
ok 6 global.alloc_size_enforced_by_sanitizer
not ok 7 global.element_count_seen_by_bdos
ok 8 global.element_count_enforced_by_sanitizer
not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer


Test suite is here:
https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c

-- 
Kees Cook


Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)

2023-05-26 Thread Kees Cook via Gcc-patches
On Thu, May 25, 2023 at 04:14:47PM +, Qing Zhao wrote:
> This patch set introduces a new attribute "element_count" to annotate bounds 
> for C99 flexible array member.

Thank you for this work! I'm really excited to start using it in the
Linux kernel. I'll give this a spin, but I know you've already been
testing this series against the test cases I created earlier, so I don't
expect any problems. :)

One bike-shedding note: with the recent "-fbounds-safety" RFC posted for
LLVM, we may want to consider renaming "element_count" to "counted_by":
https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/

Thanks again!

-Kees

-- 
Kees Cook


Re: [V6][PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-04-12 Thread Kees Cook via Gcc-patches
On Tue, Mar 28, 2023 at 03:49:43PM +, Qing Zhao wrote:
> the C front-end has been approved by Joseph.
> 
> Jacub, could you please eview the middle end part of the changes of this 
> patch?
> 
> The major change is in tree-object-size.cc (addr_object_size).
>  (To use the new TYPE_INCLUDE_FLEXARRAY info). 
> 
> This patch is to fix 
> PR101832(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832),
> and is needed for Linux Kernel security.  It’s better to be put into GCC13.
> 
> Thanks a lot!

Just to confirm, I've done build testing with the Linux kernel, and this
is behaving as I'd expect. This makes my life MUCH easier -- many fewer
false positives for our bounds checking. :)

-Kees

-- 
Kees Cook


Re: [V4][PATCH 0/2] Handle component_ref to a structure/union field including FAM for builtin_object_size

2023-03-08 Thread Kees Cook via Gcc-patches
On Fri, Feb 24, 2023 at 06:35:03PM +, Qing Zhao wrote:
> Hi, Joseph and Richard,
> 
> Could you please review this patch and let me know whether it’s ready
>  for committing into GCC13?
> 
> The fix to Bug PR101832 is an important patch for kernel security
> purpose. it's better to be put into GCC13.

Hi!

This series tests well for me. Getting this landed would be very nice
for the Linux kernel. :)

Are there any additional review notes for it, or is it ready to land?

Thanks!

-Kees

-- 
Kees Cook


Re: [PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-02-09 Thread Kees Cook via Gcc-patches
On Thu, Feb 09, 2023 at 02:40:57PM +, Qing Zhao wrote:
> So, the major question here is:
> 
>  in addition to the C99 standard flexible array member [ ], shall we include 
> [0], [1] or even [4] into this extension, and treat the structure with a 
> trailing [0], [1], or [4] embedded into another structure/union still as 
> flexible-sized?
> 
> I think that we might need to limit this extension ONLY to C99 standard FAM [ 
> ].  All other [0], [1], or [4] should be excluded from this extension. The 
> reasons are:
> 
> 1. The real usages of such GCC extension (embedding structure with FAM into 
> another structure/union), as my understanding, the old glibc’s <_G_config.h> 
> (https://gcc.gnu.org/legacy-ml/gcc-patches/2002-08/msg01149.html), and the 
> bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832, ONLY involved C99 
> standard FAM;
> 
> 2. Embedding a structure with C99 FAM [] into the end of another structure, 
> and still treat it flexible sized might have more usages, and as discussed 
> with Kees, it might be reasonable to promote this into a  C standard later if 
> needed.
> 
> So, based on this consideration, I think I should only document the following 
> as GCC extension:
> 
> struct flex  { int length; char data[ ]; };
> struct out_flex { int m; struct flex flex_data; };
> 
> Issue warnings for the following: (when the structure is not at the end)
> 
> struct out_flex_mid  {  struct flex flex_data; int m};
> 
> 
> However, for the trailing [0], [1], or [4], when such structure embedded into 
> the end of another structure, We should NOT treat the outer structure as 
> flexible sized. 
> Logically, we will NOT issue warnings when such structure is not at the end. 
> 
> Let me know if you have any comment or suggestions.

FWIW this all sounds correct to me.

-- 
Kees Cook


Re: [PATCH 2/2] Documentation Update.

2023-02-02 Thread Kees Cook via Gcc-patches
On Thu, Feb 02, 2023 at 02:31:53PM +, Qing Zhao wrote:
> 
> > On Feb 2, 2023, at 3:33 AM, Richard Biener  wrote:
> > 
> > On Wed, 1 Feb 2023, Siddhesh Poyarekar wrote:
> > 
> >> On 2023-02-01 13:24, Qing Zhao wrote:
> >>> 
> >>> 
>  On Feb 1, 2023, at 11:55 AM, Siddhesh Poyarekar 
>  wrote:
>  
>  On 2023-01-31 09:11, Qing Zhao wrote:
> > Update documentation to clarify a GCC extension on structure with
> > flexible array member being nested in another structure.
> > gcc/ChangeLog:
> > * doc/extend.texi: Document GCC extension on a structure containing
> > a flexible array member to be a member of another structure.
>  
>  Should this resolve pr#77650 since the proposed action there appears to 
>  be
>  to document these semantics?
> >>> 
> >>> My understanding of pr77650 is specifically for documentation on the
> >>> following case:
> >>> 
> >>> The structure with a flexible array member is the middle field of another
> >>> structure.
> >>> 
> >>> Which I added in the documentation as the 2nd situation.
> >>> However, I am still not very comfortable on my current clarification on 
> >>> this
> >>> situation: how should we document on
> >>> the expected gcc behavior to handle such situation?
> >> 
> >> I reckon wording that dissuades programmers from using this might be
> >> appropriate, i.e. don't rely on this and if you already have such nested 
> >> flex
> >> arrays, change code to remove them.
> >> 
> > +In the above, @code{flex_data.data[]} is allowed to be extended 
> > flexibly
> > to
> > +the padding. E.g, up to 4 elements.
> >> 
> >> """
> >> ... Relying on space in struct padding is bad programming practice and any
> >> code relying on this behaviour should be modified to ensure that flexible
> >> array members only end up at the ends of arrays.  The `-pedantic` flag 
> >> should
> >> help identify such uses.
> >> """
> >> 
> >> Although -pedantic will also flag on flex arrays nested in structs even if
> >> they're at the end of the parent struct, so my suggestion on the warning is
> >> not really perfect.
> > 
> > Wow, so I checked and we indeed accept
> > 
> > struct X { int n; int data[]; };
> > struct Y { struct X x; int end; };
> > 
> > and -pedantic says
> > 
> > t.c:2:21: warning: invalid use of structure with flexible array member 
> > [-Wpedantic]
> >2 | struct Y { struct X x; int end; };
> >  |
> 
> Currently, -pedantic report the same message for flex arrays nested in 
> structs at the end of the parent struct AND in the middle of the parent 
> struct. 
> Shall we distinguish them and report different warning messages in order to 
> discourage the latter case? 
> 
> And at the same time, in the documentation, clarify these two situations, and 
> discourage the latter case at the same time as well?
> >   
> > 
> > and clang reports
> > 
> > t.c:2:21: warning: field 'x' with variable sized type 'struct X' not at 
> > the end of a struct or class is a GNU extension 
> > [-Wgnu-variable-sized-type-not-at-end]
> > struct Y { struct X x; int end; };
> >  
> >  ^
> 
> Clang’s warning message is clearer. 
> > 
> > looking at PR77650 what seems missing there is the semantics of this
> > extension as expected/required by the glibc use.  comment#5 seems
> > to suggest that for my example above its expected that
> > Y.x.data[0] aliases Y.end?!
> 
> Should we mentioned this alias relationship in the doc?
> 
> >  There must be a better way to write
> > the glibc code and IMHO it would be best to deprecate this extension.
> 
> Agreed. This is really a bad practice, should be deprecated. 
> We can give warning first in this release, and then deprecate this extension 
> in a latter release. 

Right -- this can lead (at least) to type confusion and other problems
too. We've been trying to remove all of these overlaps in the Linux
kernel. I mention it the "Overlapping composite structure members"
section at https://people.kernel.org/kees/bounded-flexible-arrays-in-c

-- 
Kees Cook


Re: [V2][PATCH 1/1] Add a new warning option -Wstrict-flex-arrays.

2022-12-01 Thread Kees Cook via Gcc-patches
On Thu, Dec 01, 2022 at 10:27:41PM +, Qing Zhao wrote:
> Hi, Sid,
> 
> Thanks a lot for the input.
> 
> After more thinking based on your and Kees’ comments, I have the following 
> thought:
> 
> 1. -fstrict-flex-arrays=level should control both GCC code gen and warnings 
> consistently;
> 2. We need warnings specifically for -fstrict-flex-arrays=level to report any 
> misuse of flexible 
>  array corresponding to the “level” to gradually encourage language 
> standard.
> 
> So, based on the above two, I think what I did in this current patch is 
> correct:
> 
> 1.  We eliminate the control from -Warray-bounds=level on treating flex 
> arrays, 
>  now only "-fstrict-flex-arrasy=level" controls how the warning treating 
> the flex arrays.
> 2.  We add a separate new warning -Wstrict-flex-arrays to report any misuse 
> corresponding to
>  the different level of -fstrict-flex-arrays.
> 
> Although we can certainly merge these new warnings into -Warray-bounds, 
> however, as Sid mentioned,
> -Warray-bounds does issue a lot more warnings than just flexible arrays 
> misuse. I think it’s necessary 
> To provide a seperate warning to only issue flexible array misuse.
> 
> Let me know if you have any more comments on this.

Okay, that seems good. Given that -Warray-bounds is part of -Wall, what
should happen for -Wstrict-flex-arrays=N?

-- 
Kees Cook


Re: [V2][PATCH 1/1] Add a new warning option -Wstrict-flex-arrays.

2022-12-01 Thread Kees Cook via Gcc-patches
On Thu, Dec 01, 2022 at 05:04:02PM +, Qing Zhao wrote:
> 
> 
> > On Dec 1, 2022, at 11:42 AM, Kees Cook  wrote:
> > 
> > On Wed, Nov 30, 2022 at 02:25:56PM +, Qing Zhao wrote:
> >> '-Wstrict-flex-arrays'
> >> Warn about inproper usages of flexible array members according to
> >> the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
> >> the trailing array field of a structure if it's available,
> >> otherwise according to the LEVEL of the option
> >> '-fstrict-flex-arrays=LEVEL'.
> >> 
> >> This option is effective only when LEVEL is bigger than 0.
> >> Otherwise, it will be ignored with a warning.
> >> 
> >> when LEVEL=1, warnings will be issued for a trailing array
> >> reference of a structure that have 2 or more elements if the
> >> trailing array is referenced as a flexible array member.
> >> 
> >> when LEVEL=2, in addition to LEVEL=1, additional warnings will be
> >> issued for a trailing one-element array reference of a structure if
> >> the array is referenced as a flexible array member.
> >> 
> >> when LEVEL=3, in addition to LEVEL=2, additional warnings will be
> >> issued for a trailing zero-length array reference of a structure if
> >> the array is referenced as a flexible array member.
> >> 
> >> At the same time, -Warray-bounds is updated:
> > 
> > Why is there both -Wstrict-flex-arrays and -Warray-bounds? I thought
> > only the latter was going to exist?
> 
> Yes, It’s very easy to merge these two warnings into one: 
> 
>  -Warray-bounds, when combined with -fstrict-flex-arrays,  in addition to 
> report all the out-of-bounds warnings, it also report 
> the misuse of flexible array members according to the LEVEL of 
> -fstrict-flex-arrays
> 
> The major question is, do we need one separate warning option to report the 
> misuse of flexible array member only?
> If so, then we need to add a new one. 

I guess it is up to you, but I think it just makes things needlessly
complex. I think having 1 option for behavior (-ftrict-flex-arrays),
and 1 option for warnings (-Warray-bounds) is sufficient. I think adding
-Wstrict-flex-arrays is confusing.

-- 
Kees Cook


Re: [V2][PATCH 1/1] Add a new warning option -Wstrict-flex-arrays.

2022-12-01 Thread Kees Cook via Gcc-patches
On Wed, Nov 30, 2022 at 02:25:56PM +, Qing Zhao wrote:
> '-Wstrict-flex-arrays'
>  Warn about inproper usages of flexible array members according to
>  the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
>  the trailing array field of a structure if it's available,
>  otherwise according to the LEVEL of the option
>  '-fstrict-flex-arrays=LEVEL'.
> 
>  This option is effective only when LEVEL is bigger than 0.
>  Otherwise, it will be ignored with a warning.
> 
>  when LEVEL=1, warnings will be issued for a trailing array
>  reference of a structure that have 2 or more elements if the
>  trailing array is referenced as a flexible array member.
> 
>  when LEVEL=2, in addition to LEVEL=1, additional warnings will be
>  issued for a trailing one-element array reference of a structure if
>  the array is referenced as a flexible array member.
> 
>  when LEVEL=3, in addition to LEVEL=2, additional warnings will be
>  issued for a trailing zero-length array reference of a structure if
>  the array is referenced as a flexible array member.
> 
> At the same time, -Warray-bounds is updated:

Why is there both -Wstrict-flex-arrays and -Warray-bounds? I thought
only the latter was going to exist?

Are you trying to split code gen (-fstrict-flex-arrays) from warnings?
Is that needed?

-- 
Kees Cook


Re: [PATCH 2/2] Add a new warning option -Wstrict-flex-arrays.

2022-11-22 Thread Kees Cook via Gcc-patches
On Tue, Nov 22, 2022 at 03:02:04PM +, Qing Zhao wrote:
> 
> 
> > On Nov 22, 2022, at 9:10 AM, Qing Zhao via Gcc-patches 
> >  wrote:
> > 
> > 
> > 
> >> On Nov 22, 2022, at 3:16 AM, Richard Biener  wrote:
> >> 
> >> On Mon, 21 Nov 2022, Qing Zhao wrote:
> >> 
> >>> 
> >>> 
>  On Nov 18, 2022, at 11:31 AM, Kees Cook  wrote:
>  
>  On Fri, Nov 18, 2022 at 03:19:07PM +, Qing Zhao wrote:
> > Hi, Richard,
> > 
> > Honestly, it?s very hard for me to decide what?s the best way to handle 
> > the interaction 
> > between -fstrict-flex-array=M and -Warray-bounds=N. 
> > 
> > Ideally,  -fstrict-flex-array=M should completely control the behavior 
> > of -Warray-bounds.
> > If possible, I prefer this solution.
> > 
> > However, -Warray-bounds is included in -Wall, and has been used 
> > extensively for a long time.
> > It?s not safe to change its default behavior. 
>  
>  I prefer that -fstrict-flex-arrays controls -Warray-bounds. That
>  it is in -Wall is _good_ for this reason. :) No one is going to add
>  -fstrict-flex-arrays (at any level) without understanding what it does
>  and wanting those effects on -Warray-bounds.
> >>> 
> >>> 
> >>> The major difficulties to let -fstrict-flex-arrays controlling 
> >>> -Warray-bounds was discussed in the following threads:
> >>> 
> >>> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604133.html
> >>> 
> >>> Please take a look at the discussion and let me know your opinion.
> >> 
> >> My opinion is now, after re-considering and with seeing your new 
> >> patch, that -Warray-bounds=2 should be changed to only add
> >> "the intermediate results of pointer arithmetic that may yield out of 
> >> bounds values" and that what it considers a flex array should now
> >> be controlled by -fstrict-flex-arrays only.
> >> 
> >> That is, I think, the only thing that's not confusing to users even
> >> if that implies a change from previous behavior that we should
> >> document by clarifying the -Warray-bounds documentation as well as
> >> by adding an entry to the Caveats section of gcc-13/changes.html
> >> 
> >> That also means that =2 will get _less_ warnings with GCC 13 when
> >> the user doesn't use -fstrict-flex-arrays as well.
> > 
> > Okay.  So, this is for -Warray-bounds=2.
> > 
> > For -Warray-bounds=1 -fstrict-flex-array=N, if N > 1, should 
> > -fstrict-flex-array=N control -Warray-bounds=1?
> 
> More thinking on this. (I might misunderstand a little bit in the previous 
> email)
> 
> If I understand correctly now, what you proposed was:
> 
> 1. The level of -Warray-bounds will NOT control how a trailing array is 
> considered as a flex array member anymore. 
> 2. Only the level of -fstrict-flex-arrays will control this;
> 3. Keep the current default  behavior of -Warray-bounds on treating trailing 
> arrays as flex array member (treating all [0],[1], and [] as flexible array 
> members). 
> 4. Updating the documentation for -Warray-bounds by clarifying this change, 
> and also as an entry to the Caveats section on such change on -Warray-bounds.
> 
> If the above is correct, Yes, I like this change. Both the user interface and 
> the internal implementation will be simplified and cleaner. 
> 
> Let me know if you see any issue with my above understanding.
> 
> Thanks a lot.

FWIW, this matches what I think makes the most sense too.

-- 
Kees Cook


Re: [PATCH 2/2] Add a new warning option -Wstrict-flex-arrays.

2022-11-18 Thread Kees Cook via Gcc-patches
On Fri, Nov 18, 2022 at 03:19:07PM +, Qing Zhao wrote:
> Hi, Richard,
> 
> Honestly, it’s very hard for me to decide what’s the best way to handle the 
> interaction 
> between -fstrict-flex-array=M and -Warray-bounds=N. 
> 
> Ideally,  -fstrict-flex-array=M should completely control the behavior of 
> -Warray-bounds.
> If possible, I prefer this solution.
> 
> However, -Warray-bounds is included in -Wall, and has been used extensively 
> for a long time.
> It’s not safe to change its default behavior. 

I prefer that -fstrict-flex-arrays controls -Warray-bounds. That
it is in -Wall is _good_ for this reason. :) No one is going to add
-fstrict-flex-arrays (at any level) without understanding what it does
and wanting those effects on -Warray-bounds.

-- 
Kees Cook


Re: [[GCC13][Patch][V3] 1/2] Add a new option -fstrict-flex-array[=n] and new attribute strict_flex_array

2022-08-31 Thread Kees Cook via Gcc-patches
On Wed, Aug 31, 2022 at 08:35:12PM +, Qing Zhao wrote:
> One of the major purposes of the new option -fstrict-flex-array is to 
> encourage standard conforming programming style. 
> 
> So, it might be reasonable to treat -fstrict-flex-array similar as -pedantic 
> (but only for flexible array members)? 
> If so, then issuing warnings when the standard doesn’t support is reasonable 
> and desirable. 

I guess the point is that "-std=c89 -fstrict-flex-arrays=3" leaves "[]"
available for use still? I think this doesn't matter. If someone wants
it to be really strict, they'd just add -Wpedantic.

-- 
Kees Cook


Re: [[GCC13][Patch][V3] 1/2] Add a new option -fstrict-flex-array[=n] and new attribute strict_flex_array

2022-08-31 Thread Kees Cook via Gcc-patches
On Wed, Aug 31, 2022 at 08:16:49PM +, Qing Zhao wrote:
> 
> > On Aug 31, 2022, at 4:09 PM, Joseph Myers  wrote:
> > 
> > On Wed, 31 Aug 2022, Qing Zhao wrote:
> > 
>  When -std=gnu89 + -fstrict-flex-array=3 (ONLY C99 flexible array member 
>  [] is treated as a valid flexible array) present together,
> >>> 
> >>> That seems reasonable enough without a warning.  If people want a warning 
> >>> for flexible array members in older language modes, they can use 
> >>> -pedantic; I don't think we need to warn for any particular 
> >>> -fstrict-flex-array modes there.
> >> 
> >> So, you mean,
> >> 
> >> 1. GCC with -std=gnu89 support all [0], [1], and [] as Flexible array 
> >> member;
> >> 2. Therefore. -std=gnu89 + -fstrict-flex-array=3 does not need a warning;
> >> 
> >> ?
> > 
> > Yes.
> > 
> >> Then, how about:
> >> 
> >> -std=c89:
> >> 
> >> 1. GCC with -std=c89 also support all [0], [1], and [] as Flexible array 
> >> member;
> >> 2, therefore, -std=c89 + -fstrict-flex-array does not need a warning too.
> >> 
> >> ?
> > 
> > Yes.
> > 
> 
> Okay, I am fine with this.
> 
> Richard and Kees,  what’s your opinion on this?

Agreed: I think it's fine not to warn about these "conflicting" flags in
those cases. It looks like the C standard warnings about flexible arrays
are already hidden behind -Wpedantic, so nothing else is needed, IMO.
Using -fstrict-flex-arrays just enforces that warning. ;)

-- 
Kees Cook


Re: [GCC13][Patch][V2][2/2]Add a new option -fstrict-flex-array[=n] and attribute strict_flex_array(n) and use it in PR101836

2022-08-02 Thread Kees Cook via Gcc-patches
On Tue, Jul 19, 2022 at 02:11:19PM +, Qing Zhao wrote:
> From a09f39ded462611286a44d9e8273de8342673ba2 Mon Sep 17 00:00:00 2001
> From: Qing Zhao 
> Date: Mon, 18 Jul 2022 18:12:26 +
> Subject: [PATCH 2/2] Use new flag DECL_NOT_FLEXARRAY in __builtin_object_size
> [PR101836]
> 
> Use new flag DECL_NOT_FLEXARRAY to determine whether the trailing array
> of a structure is flexible array member in __builtin_object_size.

FWIW, with these patches I've done builds of the Linux kernel using
the ...=3 level, and things are looking correct on our end. (There are,
as expected, many places in Linux that need to be fixed, and that work
is on-going, guided by this option's results.)

-Kees

-- 
Kees Cook


Re: [GCC13][Patch][V2][1/2]Add a new option -fstrict-flex-array[=n] and attribute strict_flex_array(n) and use it in PR101836

2022-07-28 Thread Kees Cook via Gcc-patches
On Thu, Jul 28, 2022 at 07:26:57AM +, Richard Biener wrote:
> On Tue, 19 Jul 2022, Qing Zhao wrote:
> > [...]
> > +@cindex @code{strict_flex_array} variable attribute
> > +@item strict_flex_array (@var{level})
> > +The @code{strict_flex_array} attribute should be attached to the trailing
> > +array field of a structure.  It specifies the level of strictness of
> > +treating the trailing array field of a structure as a flexible array
> > +member. @var{level} must be an integer betwen 0 to 3.
> > +
> > +@var{level}=0 is the least strict level, all trailing arrays of structures
> > +are treated as flexible array members. @var{level}=3 is the strictest 
> > level,
> > +only when the trailing array is declared as a flexible array member per C99
> > +standard onwards ([]), it is treated as a flexible array member.
> 
> How is level 3 (thus -fstrict-flex-array) interpreted when you specify 
> -std=c89?  How for -std=gnu89?

To me, it makes sense that either c99 is required (most sane to me)
or it would disable flexible arrays entirely (seems an unlikely combo to
be useful).

> 
> > +
> > +There are two more levels in between 0 and 3, which are provided to support
> > +older codes that use GCC zero-length array extension ([0]) or one-size 
> > array
> > +as flexible array member ([1]):
> > +When @var{level} is 1, the trailing array is treated as a flexible array 
> > member
> > +when it is declared as either "[]", "[0]", or "[1]";
> > +When @var{level} is 2, the trailing array is treated as a flexible array 
> > member
> > +when it is declared as either "[]", or "[0]".
> 
> Given the above does adding level 2 make sense given that [0] is a GNU
> extension?

Level 1 removes the general "all trailing arrays are flex arrays" logic, but
allows the 2 common "historical" fake flex array styles ("[1]" and "[0]").
Level 2 additionally removes the "[1]" style.
Level 3 additionally removes the "[0]" style.

I don't understand how "[0]" being a GNU extension matters here for
level 2 -- it's dropping "[1]". And for level 3, the point is to defang
the GNU extension of "[0]" to no longer mean "flexible array", and
instead only mean "zero sized member" (as if it were something like
"struct { } no_size;").

Note that for the Linux kernel, we only care about level 3, but could
make do with level 2. We need to purge all the "fake" flexible array usage
so we can start building a sane set of behaviors around array bounds
that are reliably introspectable.


As a related bit of feature creep, it would be great to expose something
like __builtin_has_flex_array_p() so FORTIFY could do a better job
filtering __builtin_object_size() information.

Given:

struct inside {
int foo;
int bar;
unsigned long items[];
};

struct outside {
int a;
int b;
struct inside inner;
};

The follow properties are seen within, for example:

void stuff(struct outside *outer, struct inside *inner)
{
...
}

__builtin_object_size(>inner, 1) == 8
__builtin_object_size(inner, 1) == -1

(see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832)

So things like FORTIFY misfire on >inner, as it's _not_ actually
8 bytes -- it has a potential trailing flex array.

If it could be introspected better, FORTIFY could check for the flex
array. For example, instead of using the inconsistent __bos(ptr, 1) for
finding member sizes, it could do something like:

#define __member_size(ptr)  \
(__builtin_has_flex_array_p(ptr) ? -1 : \
 __builtin_object_size(ptr, 1))

-- 
Kees Cook


Re: [GCC13][Patch][V2][1/2]Add a new option -fstrict-flex-array[=n] and attribute strict_flex_array(n) and use it in PR101836

2022-07-27 Thread Kees Cook via Gcc-patches
On Tue, Jul 19, 2022 at 02:09:46PM +, Qing Zhao wrote:
> [...]
> diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
> index c8d96723f4c..10d16532f0d 100644
> --- a/gcc/c-family/c-attribs.cc
> +++ b/gcc/c-family/c-attribs.cc
> @@ -101,6 +101,8 @@ static tree handle_special_var_sec_attribute (tree *, 
> tree, tree, int, bool *);
> static tree handle_aligned_attribute (tree *, tree, tree, int, bool *);
> static tree handle_warn_if_not_aligned_attribute (tree *, tree, tree,
> int, bool *);
> +static tree handle_strict_flex_array_attribute (tree *, tree, tree,
> + int, bool *);

Something mangled these patches -- leading blank characters got dropped?

-- 
Kees Cook


Re: [PATCH v3 1/1] [ARM] Add support for TLS register based stack protector canary access

2021-10-26 Thread Kees Cook via Gcc-patches
On Tue, Oct 26, 2021 at 10:18:36AM +0200, Ard Biesheuvel wrote:
> Add support for accessing the stack canary value via the TLS register,
> so that multiple threads running in the same address space can use
> distinct canary values. This is intended for the Linux kernel running in
> SMP mode, where processes entering the kernel are essentially threads
> running the same program concurrently: using a global variable for the
> canary in that context is problematic because it can never be rotated,
> and so the OS is forced to use the same value as long as it remains up.
> 
> Using the TLS register to index the stack canary helps with this, as it
> allows each CPU to context switch the TLS register along with the rest
> of the process, permitting each process to use its own value for the
> stack canary.
> 
> 2021-10-21 Ard Biesheuvel 
> 
>   * config/arm/arm-opts.h (enum stack_protector_guard): New
>   * config/arm/arm-protos.h (arm_stack_protect_tls_canary_mem):
>   New
>   * config/arm/arm.c (TARGET_STACK_PROTECT_GUARD): Define
>   (arm_option_override_internal): Handle and put in error checks
>   for stack protector guard options.
>   (arm_option_reconfigure_globals): Likewise
>   (arm_stack_protect_tls_canary_mem): New
>   (arm_stack_protect_guard): New
>   * config/arm/arm.md (stack_protect_set): New
>   (stack_protect_set_tls): Likewise
>   (stack_protect_test): Likewise
>   (stack_protect_test_tls): Likewise
>   * config/arm/arm.opt (-mstack-protector-guard): New
>   (-mstack-protector-guard-offset): New.
> 
> Signed-off-by: Ard Biesheuvel 

I can't speak to the specific implementation details here, but this
builds for me, and behaves as expected. I get a working kernel[1],
and have verified[2] that we have per-task canaries for arm32. :) Yay!

Tested-by: Kees Cook 

Who's best to review and commit this? Qing, is something you're able to
review?

Thanks!

-Kees

[1] 
https://lore.kernel.org/linux-arm-kernel/20211021142516.1843042-1-a...@kernel.org/
[2] 
https://lore.kernel.org/linux-hardening/2021103826.330653-3-keesc...@chromium.org/

-- 
Kees Cook


Re: [RFC PATCH 0/1] implement TLS register based stack canary for ARM

2021-10-21 Thread Kees Cook via Gcc-patches
On Thu, Oct 21, 2021 at 06:34:04PM +0200, Ard Biesheuvel wrote:
> On Thu, 21 Oct 2021 at 12:23, Ard Biesheuvel  wrote:
> >
> > Bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102352
> >
> > In the Linux kernel, user processes calling into the kernel are
> > essentially threads running in the same address space, of a program that
> > never terminates. This means that using a global variable for the stack
> > protector canary value is problematic on SMP systems, as we can never
> > change it unless we reboot the system. (Processes that sleep for any
> > reason will do so on a call into the kernel, which means that there will
> > always be live kernel stack frames carrying copies of the canary taken
> > when the function was entered)
> >
> > AArch64 implements -mstack-protector-guard=sysreg for this purpose, as
> > this permits the kernel to use different memory addresses for the stack
> > canary for each CPU, and context switch the chosen system register with
> > the rest of the process, allowing each process to use its own unique
> > value for the stack canary.
> >
> > This patch implements something similar, but for the 32-bit ARM kernel,
> > which will start using the user space TLS register TPIDRURO to index
> > per-process metadata while running in the kernel. This means we can just
> > add an offset to TPIDRURO to obtain the address from which to load the
> > canary value.
> >
> > The patch is a bit rough around the edges, but produces the correct
> > results as far as I can tell.
> 
> This is a lie

LOL.

> 
> > However, I couldn't quite figure out how
> > to modify the patterns so that the offset will be moved into the
> > immediate offset field of the LDR instructions, so currently, the ADD of
> > the offset is always a distinct instruction.
> >
> 
> ... and this is no longer true now that I fixed the correctness
> problem. I will be sending out a v2 shortly, so please disregard this
> one for now.

Heh, I hadn't even had a chance to test it, so I'll hold off. :)

Thanks!

-Kees

-- 
Kees Cook


Re: [patch][gcc12-changes] Add a new item about the support for automatic static variable initialization

2021-09-29 Thread Kees Cook via Gcc-patches
On Wed, Sep 29, 2021 at 02:43:35PM +, Qing Zhao wrote:
> FYI, just committed the change:
> 
> https://gcc.gnu.org/gcc-12/changes.html

Looks great to me; thanks! :)

-Kees

-- 
Kees Cook


Re: [patch][gcc12-changes] Add a new item about the support for automatic static variable initialization

2021-09-28 Thread Kees Cook via Gcc-patches
On Tue, Sep 28, 2021 at 08:31:13PM +, Qing Zhao wrote:
> Hi,
> 
> This is the patch for the gcc12 changes  per your request. 
> 
> Kees provided most of the wording. 
> 
> Please take a look and let’s know whether it’s good for commit?
> 
> thanks.
> 
> Qing
> 
> 
> 
> 
> From: qing zhao 
> Date: Tue, 28 Sep 2021 12:01:42 -0700
> Subject: [PATCH] gcc-12/changes.html: Uninitialized stack variables
>  initialization update
> 
>   * htdocs/gcc-12/changes.html (Eliminating uninitialized variables):
>   Item about the support for automatic static variable initialization.
> ---
>  htdocs/gcc-12/changes.html | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
> index 1f156a9..8e2979c 100644
> --- a/htdocs/gcc-12/changes.html
> +++ b/htdocs/gcc-12/changes.html
> @@ -245,6 +245,25 @@ a work-in-progress.
>  
>  Other significant improvements
>  
> +Eliminating uninitialized variables
> +
> +
> +  GCC can now initialize all stack variables implicitly, including
> +  padding. This is intended to eliminate all classes of uninitialized
> +  stack variable flaws. Lack of explicit initialization will still
> +  warn when -Wuninitialized is active. For best
> +  debugging, use of the new command-line option
> +  -ftrivial-auto-var-init=pattern can be used to fill
> +  variables with a repeated 0xFE pattern, which tends to illuminate
> +  many bugs (e.g. pointers receive invalid addresses, sizes and indices
> +  are very large). For best production results, the new command-line
> +  option -ftrivial-auto-var-init=zero can be used to
> +  fill variables with 0x00, which tends to provide a safer state for
> +  bugs (e.g. pointers are NULL, strings are NULL filled, and sizes

Minor nit: I've always been corrected that "NULL" refers to a pointer, and
"NUL" refers to the "null character", so the latter use of NULL should be
"NUL": ... pointers are NULL, strings are NUL filled, and size ...

I mix this up all the time, so apologies if that got introduced by me!
:)

-Kees

> +  and indices are 0).
> +  
> +
> +
>  Debugging formats
>  
>  
> -- 
> 1.9.1
> 
> 

-- 
Kees Cook


Re: [COMMITTED][patch][version 9]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-09-09 Thread Kees Cook via Gcc-patches
On Thu, Sep 09, 2021 at 10:49:11PM +, Qing Zhao wrote:
> Hi, FYI
> 
> I just committed the following patch to gcc upstream:
> 
> 
> https://gcc.gnu.org/pipermail/gcc-cvs/2021-September/353195.html

Hurray! Thank you so much for working on this, and thanks also to the
reviewers and everyone else poking at it.

I will go update my Linux Plumbers slides to say "supported" instead of
"proposed". :)

-Kees

-- 
Kees Cook


Re: [patch][version 7]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-29 Thread Kees Cook via Gcc-patches
On Thu, Jul 29, 2021 at 08:02:43PM +, Qing Zhao wrote:
> This is the 7th version of the patch for the new security feature for GCC.
> I have tested it with bootstrap on both x86 and aarch64, regression testing 
> on both x86 and aarch64.
> Also compile CPU2017 (running is ongoing), without any issue.
> 
> Please take a look and let me know any issue.

All my kernel tests pass; this looks great! Thank you. :)

-- 
Kees Cook


Re: [patch][version 6] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-28 Thread Kees Cook via Gcc-patches
On Tue, Jul 27, 2021 at 03:26:00AM +, Qing Zhao wrote:
> This is the 6th version of the patch for the new security feature for GCC.
> 
> I have tested it with bootstrap on both x86 and aarch64, regression testing 
> on both x86 and aarch64.
> Also compile CPU2017 (running is ongoing), without any issue. (With the fix 
> to bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101586).
> 
> Please take a look and let me know any issue.

Good news, this passes all my initialization tests in the kernel. Yay! :)

However, I see an unexpected side-effect from some static initializations:

net/core/sock.c: In function 'sock_no_sendpage':
net/core/sock.c:2849:23: warning: 'msg' is used uninitialized [-Wuninitialized]
 2849 | struct msghdr msg = {.msg_flags = flags};
  |   ^~~   

It seems like -Wuninitialized has suddenly stopped noticing explicit
static initializers when there are bit fields in the struct. Here's a
minimized case:

$ cat init.c
struct weird {
int bit : 1;
int val;
};

int func(int val)
{
struct weird obj = { .val = val };
return obj.val;
}

$ gcc -c -o init.o -Wall -O2 -ftrivial-auto-var-init=zero init.c
init.c: In function ‘func’:
init.c:8:22: warning: ‘obj’ is used uninitialized [-Wuninitialized]
8 | struct weird obj = { .val = val };
  |  ^~~
init.c:8:22: note: ‘obj’ declared here
8 | struct weird obj = { .val = val };
  |  ^~~



-- 
Kees Cook


Re: [patch][version 4]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-14 Thread Kees Cook via Gcc-patches
On Wed, Jul 14, 2021 at 07:30:45PM +, Qing Zhao wrote:
> Hi, Kees,
> 
> 
> > On Jul 14, 2021, at 2:11 PM, Kees Cook  wrote:
> > 
> > On Wed, Jul 14, 2021 at 02:09:50PM +, Qing Zhao wrote:
> >> Hi, Richard,
> >> 
> >>> On Jul 14, 2021, at 2:14 AM, Richard Biener  
> >>> wrote:
> >>> 
> >>> On Wed, Jul 14, 2021 at 1:17 AM Qing Zhao  wrote:
>  
>  Hi, Kees,
>  
>  I took a look at the kernel testing case you attached in the previous 
>  email, and found the testing failed with the following case:
>  
>  #define INIT_STRUCT_static_all  = { .one = arg->one,\
>    .two = arg->two,\
>    .three = arg->three,\
>    .four = arg->four,  \
>    }
>  
>  i.e, when the structure type auto variable has been explicitly 
>  initialized in the source code.  -ftrivial-auto-var-init in the 4th 
>  version
>  does not initialize the paddings for such variables.
>  
>  But in the previous version of the patches ( 2 or 3), 
>  -ftrivial-auto-var-init initializes the paddings for such variables.
>  
>  I intended to remove this part of the code from the 4th version of the 
>  patch since the implementation for initializing such paddings is 
>  completely different from
>  the initializing of the whole structure as a whole with memset in this 
>  version of the implementation.
>  
>  If we really need this functionality, I will add another separate patch 
>  for this additional functionality, but not with this patch.
>  
>  Richard, what’s your comment and suggestions on this?
> >>> 
> >>> I think this can be addressed in the gimplifier by adjusting
> >>> gimplify_init_constructor to clear
> >>> the object before the initialization (if it's not done via aggregate
> >>> copying).  
> >> 
> >> I did this in the previous versions of the patch like the following:
> >> 
> >> @@ -5001,6 +5185,17 @@ gimplify_init_constructor (tree *expr_p, gimple_seq 
> >> *pre_p, gimple_seq *post_p,
> >>  /* If a single access to the target must be ensured and all elements
> >> are zero, then it's optimal to clear whatever their number.  */
> >>  cleared = true;
> >> +  else if (flag_trivial_auto_var_init > AUTO_INIT_UNINITIALIZED
> >> +   && !TREE_STATIC (object)
> >> +   && type_has_padding (type))
> >> +/* If the user requests to initialize automatic variables with
> >> +   paddings inside the type, we should initialize the paddings too.
> >> +   C guarantees that brace-init with fewer initializers than members
> >> +   aggregate will initialize the rest of the aggregate as-if it were
> >> +   static initialization.  In turn static initialization guarantees
> >> +   that pad is initialized to zero bits.
> >> +   So, it's better to clear the whole record under such situation.  */
> >> +cleared = true;
> >>else
> >>  cleared = false;
> >> 
> >> Then the paddings are also initialized to zeroes with this option. (Even 
> >> for -ftrivial-auto-var-init=pattern).
> > 
> > Thanks! I've tested with the attached patch to v4 and it passes all my
> > tests again.
> > 
> >> Is the above change Okay? (With this change, when 
> >> -ftrivial-auto-var-init=pattern, the paddings for the
> >> structure variables that have explicit initializer will be ZEROed, not 
> >> 0xFE)
> > 
> > Padding zeroing in the face of pattern-init is correct (and matches what
> > Clang does).
> 
> During the discussion before the 4th version of the patch, we have agreed 
> that pattern-init will use 0xFE byte-repeatable patterns 
> to initialize all the types (this includes the paddings when the structure 
> type variables are not explicitly initialized). And will not match
> Clang’s current behavior. 

Right, that's fine.

> If we initialize the paddings when the structure type variables are 
> explicitly initialized to Zeroes, then there will be inconsistency 
> between values that are used to initialize structure paddings under different 
> situations, This looks not good to me.
> 
> If we have agreed on using 0xFE byte-repeatable patterns for pattern-init, 
> then all the paddings should be initialized with the same 
> pattern. 

Ah! By "situation", you mean how the compiler chooses to initialize the
structure members?

It sounds like for =zero mode, padding will be 0, but for =pattern,
padding may be either 0x00 or 0xFE, depending on which kind of
initialization is internally chosen. Is that right? I'm fine with this
since the =zero case is what I'm primarily focused on being as safe
as possible.

-Kees

-- 
Kees Cook


Re: [patch][version 4]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-14 Thread Kees Cook via Gcc-patches
On Wed, Jul 14, 2021 at 02:09:50PM +, Qing Zhao wrote:
> Hi, Richard,
> 
> > On Jul 14, 2021, at 2:14 AM, Richard Biener  
> > wrote:
> > 
> > On Wed, Jul 14, 2021 at 1:17 AM Qing Zhao  wrote:
> >> 
> >> Hi, Kees,
> >> 
> >> I took a look at the kernel testing case you attached in the previous 
> >> email, and found the testing failed with the following case:
> >> 
> >> #define INIT_STRUCT_static_all  = { .one = arg->one,\
> >>.two = arg->two,\
> >>.three = arg->three,\
> >>.four = arg->four,  \
> >>}
> >> 
> >> i.e, when the structure type auto variable has been explicitly initialized 
> >> in the source code.  -ftrivial-auto-var-init in the 4th version
> >> does not initialize the paddings for such variables.
> >> 
> >> But in the previous version of the patches ( 2 or 3), 
> >> -ftrivial-auto-var-init initializes the paddings for such variables.
> >> 
> >> I intended to remove this part of the code from the 4th version of the 
> >> patch since the implementation for initializing such paddings is 
> >> completely different from
> >> the initializing of the whole structure as a whole with memset in this 
> >> version of the implementation.
> >> 
> >> If we really need this functionality, I will add another separate patch 
> >> for this additional functionality, but not with this patch.
> >> 
> >> Richard, what’s your comment and suggestions on this?
> > 
> > I think this can be addressed in the gimplifier by adjusting
> > gimplify_init_constructor to clear
> > the object before the initialization (if it's not done via aggregate
> > copying).  
> 
> I did this in the previous versions of the patch like the following:
> 
> @@ -5001,6 +5185,17 @@ gimplify_init_constructor (tree *expr_p, gimple_seq 
> *pre_p, gimple_seq *post_p,
> /* If a single access to the target must be ensured and all elements
>are zero, then it's optimal to clear whatever their number.  */
> cleared = true;
> + else if (flag_trivial_auto_var_init > AUTO_INIT_UNINITIALIZED
> +  && !TREE_STATIC (object)
> +  && type_has_padding (type))
> +   /* If the user requests to initialize automatic variables with
> +  paddings inside the type, we should initialize the paddings too.
> +  C guarantees that brace-init with fewer initializers than members
> +  aggregate will initialize the rest of the aggregate as-if it were
> +  static initialization.  In turn static initialization guarantees
> +  that pad is initialized to zero bits.
> +  So, it's better to clear the whole record under such situation.  */
> +   cleared = true;
>   else
> cleared = false;
> 
> Then the paddings are also initialized to zeroes with this option. (Even for 
> -ftrivial-auto-var-init=pattern).

Thanks! I've tested with the attached patch to v4 and it passes all my
tests again.

> Is the above change Okay? (With this change, when 
> -ftrivial-auto-var-init=pattern, the paddings for the
> structure variables that have explicit initializer will be ZEROed, not 0xFE)

Padding zeroing in the face of pattern-init is correct (and matches what
Clang does).

-Kees

-- 
Kees Cook
commit 8c52b69540b064e930e4d9e2e3dc011ca002306d
Author: Kees Cook 
AuthorDate: Wed Jul 14 11:17:27 2021 -0700
Commit: Kees Cook 
CommitDate: Wed Jul 14 12:08:56 2021 -0700

Fix padding

Based on v2 and bddde2d2-8594-4bf6-8142-cd09b0ebb...@oracle.com

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 4db53cda77f8..dd3da86d6663 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -5071,6 +5071,18 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	  /* If a single access to the target must be ensured and all elements
 	 are zero, then it's optimal to clear whatever their number.  */
 	  cleared = true;
+	else if (opt_for_fn (current_function_decl, flag_auto_var_init)
+		   > AUTO_INIT_UNINITIALIZED
+		 && !TREE_STATIC (object)
+		 && type_has_padding (type))
+	  /* If the user requests to initialize automatic variables with
+	 paddings inside the type, we should initialize the paddings too.
+	 C guarantees that brace-init with fewer initializers than members
+	 aggregate will initialize the rest of the aggregate as-if it were
+	 static initialization.  In turn static initialization guarantees
+	 that pad is initialized to zero bits.
+	 So, it's better to clear the whole record under such situation.  */
+	  cleared = true;
 	else
 	  cleared = false;
 
diff --git a/gcc/tree.c b/gcc/tree.c
index 1aa6e557a049..7889c10d639f 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -10818,6 +10818,72 @@ lower_bound_in_type (tree outer, tree inner)
 }
 }
 
+/* Returns true when the given TYPE has padding 

Re: [patch][version 4]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-13 Thread Kees Cook via Gcc-patches
On Tue, Jul 13, 2021 at 11:16:59PM +, Qing Zhao wrote:
> Hi, Kees,
> 
> I took a look at the kernel testing case you attached in the previous email, 
> and found the testing failed with the following case:
> 
> #define INIT_STRUCT_static_all  = { .one = arg->one,\
> .two = arg->two,\
> .three = arg->three,\
> .four = arg->four,  \
> }
> 
> i.e, when the structure type auto variable has been explicitly initialized in 
> the source code.  -ftrivial-auto-var-init in the 4th version
> does not initialize the paddings for such variables.  
> 
> But in the previous version of the patches ( 2 or 3), -ftrivial-auto-var-init 
> initializes the paddings for such variables.
> 
> I intended to remove this part of the code from the 4th version of the patch 
> since the implementation for initializing such paddings is completely 
> different from 
> the initializing of the whole structure as a whole with memset in this 
> version of the implementation. 
> 
> If we really need this functionality, I will add another separate patch for 
> this additional functionality, but not with this patch.

Yes, this is required to get proper coverage for initialization in the
kernel (or, really, any program). Without this, things are still left
uninitialized in the padding of structs.

A separate patch is fine by me; my only desire is to still have it be
part of -ftrivial-auto-var-init when it's all done. :)

Thanks!

-- 
Kees Cook


Re: [patch][version 4]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-13 Thread Kees Cook via Gcc-patches
On Tue, Jul 13, 2021 at 02:29:33PM -0700, Kees Cook wrote:
> I've extracted the kernel test to build for userspace, and it behaves
> the same way. See attached "stackinit.c".

I've adjusted this slightly (the "static" tests weren't testing the
correct thing, but the results remained the same). Here's what I see.

This is the variable on the stack:

struct test_small_hole {
size_t one;  /* 0 8 */
char   two;  /* 8 1 */

/* XXX 3 bytes hole, try to pack */

intthree;/*12 4 */
long unsigned int  four; /*16 8 */

/* size: 24, cachelines: 1, members: 4 */
/* sum members: 21, holes: 1, sum holes: 3 */
/* last cacheline: 24 bytes */
};

The above is 0x18 in size.

00405370 :
  405370:   40 0f b6 c7 movzbl %dil,%eax
  405374:   41 89 f1mov%esi,%r9d
  405377:   48 8d 74 24 b8  lea-0x48(%rsp),%rsi

-0x48(%rsp) is the location of the variable.

  40537c:   44 0f be c7 movsbl %dil,%r8d
  405380:   48 b9 01 01 01 01 01movabs $0x101010101010101,%rcx
  405387:   01 01 01 
  40538a:   49 89 c2mov%rax,%r10
  40538d:   48 c7 44 24 b8 00 00movq   $0x0,-0x48(%rsp)
  405394:   00 00 

8 byte move of 0 to -0x48(%rsp) through -0x41

  405396:   48 f7 e1mul%rcx
  405399:   c6 44 24 c0 00  movb   $0x0,-0x40(%rsp)

1 byte move of 0 to -0x40(%rsp)

  40539e:   4c 0f af d1 imul   %rcx,%r10
  4053a2:   c7 44 24 c4 00 00 00movl   $0x0,-0x3c(%rsp)
  4053a9:   00 

4 byte move of 0 to -0x3c through -0x39 (note that -0x3f, -0x3e,
-0x3d is _not_ written, which maps to the 3 byte struct hole).

  4053aa:   48 c7 44 24 c8 00 00movq   $0x0,-0x38(%rsp)
  4053b1:   00 00 

8 byte move of 0 to -0x38(%rsp) through -0x31.

  4053b3:   48 89 35 d6 9c 00 00mov%rsi,0x9cd6(%rip) # 40f090 


variable address is saved to global.

  4053ba:   4c 01 d2add%r10,%rdx
  4053bd:   48 89 44 24 d8  mov%rax,-0x28(%rsp)
  4053c2:   48 c7 05 b3 9c 00 00movq   $0x18,0x9cb3(%rip) # 40f080 

  4053c9:   18 00 00 00 

variable size is saved to global.

  4053cd:   48 89 54 24 e0  mov%rdx,-0x20(%rsp)
  4053d2:   48 89 44 24 e8  mov%rax,-0x18(%rsp)
  4053d7:   48 89 54 24 f0  mov%rdx,-0x10(%rsp)
  4053dc:   45 84 c9test   %r9b,%r9b
  4053df:   75 1d   jne4053fe 

  4053e1:   48 8b 44 24 c8  mov-0x38(%rsp),%rax
  4053e6:   66 0f 6f 44 24 b8   movdqa -0x48(%rsp),%xmm0
  4053ec:   48 89 05 bd 9c 00 00mov%rax,0x9cbd(%rip) # 40f0b0 

  4053f3:   44 89 c0mov%r8d,%eax
  4053f6:   0f 29 05 a3 9c 00 00movaps %xmm0,0x9ca3(%rip) # 40f0a0 


Here's the unrolled memcpy (8 bytes and 16 bytes) to the global buffer,
taking the "uninitialized" padding with it.

  4053fd:   c3  ret

-- 
Kees Cook
// SPDX-License-Identifier: GPL-2.0-or-later
/*
 * Test cases for compiler-based stack variable zeroing via
 * -ftrivial-auto-var-init={zero,pattern} or CONFIG_GCC_PLUGIN_STRUCTLEAK*.
 *
 * Build example:
 * gcc -O2 -Wall -ftrivial-auto-var-init=zero -o stackinit stackinit.c
 */

/* Userspace headers. */
#include 
#include 
#include 
#include 
#include 

/* Linux kernel-isms */
#define KBUILD_MODNAME		"stackinit"
#define pr_fmt(fmt)		KBUILD_MODNAME ": " fmt
#define pr_err(fmt, ...)	fprintf(stderr, pr_fmt(fmt), ##__VA_ARGS__)
#define pr_warn(fmt, ...)	fprintf(stderr, pr_fmt(fmt), ##__VA_ARGS__)
#define pr_info(fmt, ...)	fprintf(stdout, pr_fmt(fmt), ##__VA_ARGS__)
#define __init			/**/
#define __user			/**/
#define noinline		__attribute__((__noinline__))
#define __aligned(x)		__attribute__((__aligned__(x)))
#ifdef __clang__
# define __compiletime_error(message) /**/
#else
# define __compiletime_error(message) __attribute__((__error__(message)))
#endif
#define __compiletime_assert(condition, msg, prefix, suffix)		\
	do {\
		extern void prefix ## suffix(void) __compiletime_error(msg); \
		if (!(condition))	\
			prefix ## suffix();\
	} while (0)
#define _compiletime_assert(condition, msg, prefix, suffix) \
	__compiletime_assert(condition, msg, prefix, suffix)
#define compiletime_assert(condition, msg) \
	_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
#define BUILD_BUG_ON(condition) \
	BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
typedef uint8_t			u8;
typedef uint16_t		u16;
typedef uint32_t		u32;
typedef uint64_t		u64;


/* Exfiltration buffer. */
#define MAX_VAR_SIZE	128
static u8 

Re: [patch][version 4]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-13 Thread Kees Cook via Gcc-patches
On Mon, Jul 12, 2021 at 08:28:55PM +, Qing Zhao wrote:
> > On Jul 12, 2021, at 12:56 PM, Kees Cook  wrote:
> > On Wed, Jul 07, 2021 at 05:38:02PM +, Qing Zhao wrote:
> >> This is the 4th version of the patch for the new security feature for GCC.
> > 
> > It looks like padding initialization has regressed to where things where
> > in version 1[1] (it was, however, working in version 2[2]). I'm seeing
> > these failures again in the kernel self-test:
> > 
> > test_stackinit: small_hole_static_all FAIL (uninit bytes: 3)
> > test_stackinit: big_hole_static_all FAIL (uninit bytes: 61)
> > test_stackinit: trailing_hole_static_all FAIL (uninit bytes: 7)
> > test_stackinit: small_hole_dynamic_all FAIL (uninit bytes: 3)
> > test_stackinit: big_hole_dynamic_all FAIL (uninit bytes: 61)
> > test_stackinit: trailing_hole_dynamic_all FAIL (uninit bytes: 7)
>  
> Are the above failures for -ftrivial-auto-var-init=zero or 
> -ftrivial-auto-var-init=pattern?  Or both?

Yes, I was only testing =zero (the kernel test handles =pattern as well:
it doesn't explicitly test for 0x00). I've verified with =pattern now,
too.

> For the current implementation, I believe that all paddings should be 
> initialized with this option, 
> for -ftrivial-auto-var-init=zero, the padding will be initialized to zero as 
> before, however, for
> -ftrivial-auto-var-init=pattern, the padding will be initialized to 0xFE 
> byte-repeatable patterns.

I've double-checked that I'm using the right gcc, with the flag.

> > 
> > In looking at the gcc test cases, I think the wrong thing is
> > being checked: we want to verify the padding itself. For example,
> > in auto-init-17.c, the actual bytes after "four" need to be checked,
> > rather than "four" itself.
> 
> **For the current auto-init-17.c
> 
>   1 /* Verify zero initialization for array type with structure element with
>   2padding.  */
>   3 /* { dg-do compile } */
>   4 /* { dg-options "-ftrivial-auto-var-init=zero" } */
>   5 
>   6 struct test_trailing_hole {
>   7 int one;
>   8 int two;
>   9 int three;
>  10 char four;
>  11 /* "sizeof(unsigned long) - 1" byte padding hole here. */
>  12 };
>  13 
>  14 
>  15 int foo ()
>  16 {
>  17   struct test_trailing_hole var[10];
>  18   return var[2].four;
>  19 }
>  20 
>  21 /* { dg-final { scan-assembler "movl\t\\\$0," } } */
>  22 /* { dg-final { scan-assembler "movl\t\\\$20," } } */
>  23 /* { dg-final { scan-assembler "rep stosq" } } */
> ~  
> **We have the assembly as: (-ftrivial-auto-var-init=zero)
> 
> .file   "auto-init-17.c"
> .text
> .globl  foo
> .type   foo, @function
> foo:
> .LFB0:
> .cfi_startproc
> pushq   %rbp
> .cfi_def_cfa_offset 16
> .cfi_offset 6, -16
> movq%rsp, %rbp
> .cfi_def_cfa_register 6
> subq$40, %rsp
> leaq-160(%rbp), %rax
> movq%rax, %rsi
> movl$0, %eax
> movl$20, %edx
> movq%rsi, %rdi
> movq%rdx, %rcx
> rep stosq
> movzbl  -116(%rbp), %eax
> movsbl  %al, %eax
> leave
> .cfi_def_cfa 7, 8
> ret
> .cfi_endproc
> .LFE0:
> .size   foo, .-foo
> .section.note.GNU-stack,"",@progbits
> 
> From the above, we can see,  “zero” will be used to initialize 8 * 20 = 16 * 
> 10 bytes of memory starting from the beginning of “var”, that include all the 
> padding holes inside
> This array of structure. 
> 
> I didn’t see issue with padding initialization here.

Hm, agreed -- this test does do the right thing.

> > But this isn't actually sufficient because they may _accidentally_
> > be zero already. The kernel tests specifically make sure to fill the
> > about-to-be-used stack with 0xff before calling a function like foo()
> > above.

I've extracted the kernel test to build for userspace, and it behaves
the same way. See attached "stackinit.c".

$ gcc-build/auto-var-init.4/installed/bin/gcc -O2 -Wall -o stackinit stackinit.c
$ ./stackinit 2>&1 | grep failures:
stackinit: failures: 23
$ gcc-build/auto-var-init.4/installed/bin/gcc -O2 -Wall 
-ftrivial-auto-var-init=zero -o stackinit stackinit.c
stackinit.c: In function ‘__leaf_switch_none’:
stackinit.c:326:26: warning: statement will never be executed
[-Wswitch-unreachable]
  326 | uint64_t var;
  |  ^~~
$ ./stackinit 2>&1 | grep failures:
stackinit: failures: 6

Same failures as seen in the kernel test (and an expected warning
about the initialization that will never happen for a pre-case switch
statement).

> > 
> > (And as an aside, it seems like naming the test cases with some details
> > about what is being tested in the filename would be nice -- it was
> > a little weird having to dig through their numeric names to find the
> > padding tests.)
> 
> Yes, I will fix the testing names to more reflect the testing details. 

Great!


Re: [patch][version 4]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-12 Thread Kees Cook via Gcc-patches
On Wed, Jul 07, 2021 at 05:38:02PM +, Qing Zhao wrote:
> Hi, 
> 
> This is the 4th version of the patch for the new security feature for GCC.
> 
> I have tested it with bootstrap on both x86 and aarch64, regression testing 
> on both x86 and aarch64.
> Also compile and run CPU2017, without any issue.
> 
> Please take a look and let me know your comments and suggestions.

Thanks for the update!

It looks like padding initialization has regressed to where things where
in version 1[1] (it was, however, working in version 2[2]). I'm seeing
these failures again in the kernel self-test:

test_stackinit: small_hole_static_all FAIL (uninit bytes: 3)
test_stackinit: big_hole_static_all FAIL (uninit bytes: 61)
test_stackinit: trailing_hole_static_all FAIL (uninit bytes: 7)
test_stackinit: small_hole_dynamic_all FAIL (uninit bytes: 3)
test_stackinit: big_hole_dynamic_all FAIL (uninit bytes: 61)
test_stackinit: trailing_hole_dynamic_all FAIL (uninit bytes: 7)

In looking at the gcc test cases, I think the wrong thing is
being checked: we want to verify the padding itself. For example,
in auto-init-17.c, the actual bytes after "four" need to be checked,
rather than "four" itself. For example, something like this:

struct test_trailing_hole {
int one;
int two;
int three;
char four;
/* "sizeof(unsigned long) - 1" byte padding hole here. */
};

#define offsetofend(STRUCT, MEMBER) \
  (__builtin_offsetof(STRUCT, MEMBER) + sizeofSTRUCT *)0)->MEMBER)))

int foo ()
{ 
  struct test_trailing_hole var[10];
  unsigned char *ptr = (unsigned char *)[2];
  int i;

  for (i = 0; i < sizeof(var[2]) - offsetofend(typeof(var[2]), four); i++) {
if (ptr[i] != 0)
  return 1;
  } 
  return 0;
}

But this isn't actually sufficient because they may _accidentally_
be zero already. The kernel tests specifically make sure to fill the
about-to-be-used stack with 0xff before calling a function like foo()
above.

(And as an aside, it seems like naming the test cases with some details
about what is being tested in the filename would be nice -- it was
a little weird having to dig through their numeric names to find the
padding tests.)

Otherwise, this looks like it's coming along; I remain very excited!
Thank you for sticking with it. :)

-Kees

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-February/565840.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2021-April/567754.html

-- 
Kees Cook


Re: [PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-06-22 Thread Kees Cook via Gcc-patches
On Tue, Jun 22, 2021 at 09:25:57AM +0100, Richard Sandiford wrote:
> Kees Cook  writes:
> > On Mon, Jun 21, 2021 at 03:39:45PM +, Qing Zhao wrote:
> >> So, if “pattern value” is “0x”, then it’s a valid 
> >> canonical virtual memory address.  However, for most OS, 
> >> “0x” should be not in user space.
> >> 
> >> My question is, is “0xF” good for pointer? Or 
> >> “0x” better?
> >
> > I think 0xFF repeating is fine for this version. Everything else is a
> > "nice to have" for the pattern-init, IMO. :)
> 
> Sorry to be awkward, but 0xFF seems worse than 0xAA to me.
> 
> For integer types, all values are valid representations, and we're
> relying on the pattern being “obviously” wrong in context.  0x…
> is unlikely to be a correct integer but 0x… would instead be a
> “nice” -1.  It would be difficult to tell in a debugger that a -1
> came from pattern init rather than a deliberate choice.

I can live with 0xAA. On x86_64, this puts it nicely in the middle of
the middle of the non-canonical space:

0x8000 - 0x7fff

The only trouble is with 32-bit, where the value 0x is a
legitimate allocatable userspace address. If we want some kind-of middle
ground, how about 0xFE? That'll be non-canonical on x86_64, and at the
high end of the i386 kernel address space.

> I agree that, all other things being equal, it would be nice to use NaNs
> for floats.  But relying on wrong numerical values for floats doesn't
> seem worse than doing that for integers.
> 
> 0xAA… for float is (if I've got this right) -3.0316488252093987e-13,
> which admittedly doesn't stand out as wrong.  But I'm not sure we
> should sacrifice integer debugging for float debugging here.

In some future version type-specific patterns would be a nice improvement,
but I don't want that to block getting the zero-init portion landed. :)

-- 
Kees Cook


Re: [PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-06-21 Thread Kees Cook via Gcc-patches
On Mon, Jun 21, 2021 at 03:39:45PM +, Qing Zhao wrote:
> So, if “pattern value” is “0x”, then it’s a valid canonical 
> virtual memory address.  However, for most OS, “0x” should be 
> not in user space.
> 
> My question is, is “0xF” good for pointer? Or 
> “0x” better?

I think 0xFF repeating is fine for this version. Everything else is a
"nice to have" for the pattern-init, IMO. :)

-- 
Kees Cook


Re: [PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-06-18 Thread Kees Cook via Gcc-patches
On Wed, Jun 16, 2021 at 07:39:02PM +, Qing Zhao wrote:
> So, the major question now is:
> 
> Is one single repeatable pattern enough for pattern initialization for all 
> different types of auto variables?
> 
> If YES, then the implementation for pattern initialization will be much 
> easier and simpler
>   as you pointed out. And will save me a lot of pain to implement this 
> part.
> If NO, then we have to keep the current complicate implementation since it 
> provides us
>   the flexibility to assign different patterns to different types.
> 
> Honestly, I don’t have a good justification on this question myself.
> 
> The previous references I have so far are the current behavior of CLANG and 
> Microsoft compiler.
> 
> For your reference,
> . CLANG uses different patterns for INTEGER  (0x) and FLOAT 
> (0x) and 32-bit pointer (0x00AA)
> https://reviews.llvm.org/D54604
> . Microsoft uses different patterns for INTEGERS ( 0xE2), FLOAT (1.0)
> https://msrc-blog.microsoft.com/2020/05/13/solving-uninitialized-stack-memory-on-windows/
> 
> My understanding from CLANG’s comment is, the patterns are easier to crash 
> the program for the certain type, therefore easier to
> catch any potential bugs.

Right, this is the justification for the different patterns. I am
fine with a static value for the first version of this functionality,
as long as it's a non-canonical virtual memory address when evaluated
as a pointer (so that the pattern can't be made to aim at a legitimate
fixed allocatable address in memory).

> Don’t know why Microsoft chose the pattern like this.
> 
> So, For GCC, what should we do on the pattern initializations, shall we 
> choose one single repeatable pattern for all the types as you suggested,
> Or chose different patterns for different types as Clang and Microsoft 
> compiler’s behavior?
> 
> Kees, do you have any comment on this?
> 
> How did Linux Kernel use -ftrivial-auto-var-init=pattern feature of CLANG?

It's just used as-is from the compiler, and recommended for "debug
builds".

-- 
Kees Cook


Re: [PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-06-11 Thread Kees Cook via Gcc-patches
On Fri, Jun 11, 2021 at 01:04:09PM +0200, Richard Biener wrote:
> On Tue, 8 Jun 2021, Kees Cook wrote:
> 
> > On Tue, Jun 08, 2021 at 09:41:38AM +0200, Richard Biener wrote:
> > > On Mon, 7 Jun 2021, Qing Zhao wrote:
> > > 
> > > > Hi, 
> > > > 
> > > > > On Jun 7, 2021, at 2:53 AM, Richard Biener  wrote:
> > > > > 
> > > > >> 
> > > > >> To address the above suggestion:
> > > > >> 
> > > > >> My study shows: the call to __builtin_clear_padding is expanded 
> > > > >> during gimplification phase.
> > > > >> And there is no __bultin_clear_padding expanding during rtx 
> > > > >> expanding phase.
> > > > >> However, for -ftrivial-auto-var-init, padding initialization should 
> > > > >> be done both in gimplification phase and rtx expanding phase.
> > > > >> since the __builtin_clear_padding might not be good for rtx 
> > > > >> expanding, reusing __builtin_clear_padding might not work.
> > > > >> 
> > > > >> Let me know if you have any more comments on this.
> > > > > 
> > > > > Yes, I didn't suggest to literally emit calls to 
> > > > > __builtin_clear_padding 
> > > > > but instead to leverage the lowering code, more specifically share the
> > > > > code that figures _what_ is to be initialized (where the padding is)
> > > > > and eventually the actual code generation pieces.  That might need 
> > > > > some
> > > > > refactoring but the code where padding resides should be present only
> > > > > a single time (since it's quite complex).
> > > > 
> > > > Okay, I see your point here.
> > > > 
> > > > > 
> > > > > Which is also why I suggested to split out the padding initialization
> > > > > bits to a separate patch (and option).
> > > > 
> > > > Personally, I am okay with splitting padding initialization from this 
> > > > current patch,
> > > > Kees, what’s your opinion on this? i.e, the current 
> > > > -ftrivial-auto-var-init will NOT initialize padding, we will add 
> > > > another option to 
> > > > Explicitly initialize padding.
> > > 
> > > It would also be possible to have -fauto-var-init, -fauto-var-init-padding
> > > and have -ftrivial-auto-var-init for clang compatibility enabling both.
> > 
> > Sounds good to me!
> > 
> > > Or -fauto-var-init={zero,pattern,padding} and allow
> > > -fauto-var-init=pattern,padding to be specified.  Note there's also
> > > padding between auto variables on the stack - that "trailing"
> > > padding isn't initialized either?  (yes, GCC sorts variables to minimize
> > > that padding)  For example for
> > > 
> > > void foo()
> > > {
> > >   char a[3];
> > >   bar (a);
> > > }
> > > 
> > > there's 12 bytes padding after 'a', shouldn't we initialize that?  If not,
> > > why's other padding important to be initialized?
> > 
> > This isn't a situation that I'm aware of causing real-world problems.
> > The issues have all come from padding within an addressable object. I
> > haven't tested Clang's behavior on this (and I have no kernel tests for
> > this padding), but I do check for trailing padding, like:
> > 
> > struct test_trailing_hole {
> > char *one;
> > char *two;
> > char *three;
> > char four;
> > /* "sizeof(unsigned long) - 1" byte padding hole here. */
> > };
> 
> Any justification why tail padding for
> 
>  struct foo { double x; char x[3]; } a;
> 
> is important but not for
> 
>  char x[3];
> 
> ?  It does look like an odd inconsistency to me.

The problem is with sizeof() and the various compounding results related
to it. Namely, things that do whole-struct copies (which is unfortunately
common in the kernel) will include the padding for "a" since it is within
the object, as represented by sizeof(), but not for x:

#include 

int main(void)
{
struct foo { double y; char x[3]; } a;
char x[3];

printf("a: %zu (a.y: %zu, a.x: %zu)\n", sizeof(a), sizeof(a.y), 
sizeof(a.x));
printf("x: %zu\n", sizeof(x));

return 0;
}

a: 16 (a.y: 8, a.x: 3)
x: 3

And it gets worse with structs-within-structs, etc.

-- 
Kees Cook


Re: [PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-06-11 Thread Kees Cook via Gcc-patches
On Fri, Jun 11, 2021 at 03:49:02PM +, Qing Zhao wrote:
> 
> On Jun 11, 2021, at 6:12 AM, Richard Biener  wrote:
> > [...]
> > 
> > The main point is that you need to avoid building the explicit initializer
> > only to have it consumed by assignment expansion.  If you want to keep
> > all the singing and dancing (as opposed to maybe initializing with a
> > 0x1 byte pattern) then I think for efficiency you still want to
> > block-initialize the variable and then only fixup the special fields.
> 
> Yes, this is a good idea. 
> 
> We can memset the whole structure with repeated pattern “0xAA” first,
> Then mixup BOOLEAN_TYPE and POINTER TYPE for 32-bit platform. 
> That might be more efficient. 
> 
> > 
> > But as said, all this is quite over-designed IMHO and simply
> > zeroing everything would be much simpler and good enough.
> 
> So, the fundenmental questions are:
> 
> 1. do we need the functionality of “Pattern Initialization” for debugging 
> purpose?
> I see that other compilers support both Zero initialization and Pattern 
> initialization. (Clang and Microsoft compiler)
> 
> http://lists.llvm.org/pipermail/cfe-dev/2020-April/065221.html
> https://msrc-blog.microsoft.com/2020/05/13/solving-uninitialized-stack-memory-on-windows/
> Pattern init is used in development build for debugging purpose, zero init is 
> used in production build for security purpose.

Correct -- "pattern" is much better about triggering all kinds of
problems, and suitable for debug builds. "zero" is less disruptive and
generally provides a safer failure mode for production builds.

> So, I assume that GCC might want to provide similar functionality?  But I am 
> open on this. 
> 
> Kees, will Kernel use “Pattern initialization” feature? 

It is currently support, yes. Note that if I had to choose between "nothing" and
"only zero", I will happily take "only zero". However, it seems like
pattern init isn't much more of an addition in this series.

> 2. Since “Pattern initialization” is just used for debugging purpose, the 
> runtime and code size overhead might not be that 
> Important at all, right?

That has been my impression, yes.

Thanks!

-Kees

-- 
Kees Cook


Re: [PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-06-08 Thread Kees Cook via Gcc-patches
On Tue, Jun 08, 2021 at 05:32:32PM +, Qing Zhao wrote:
> Thanks a lot.
> 
> Kees. 
> 
> Do you have the same issue with my emails?

Some of them, yes. This one was fine.

> I see this problem with my email mostly to part of the emails that were sent 
> to gcc-patches alias. 
> Other emails are fine. 
> 
> > On Jun 8, 2021, at 11:56 AM, Kees Cook  wrote:
> > 
> > On Tue, Jun 08, 2021 at 09:37:30AM +0200, Richard Biener wrote:
> >> On Mon, 7 Jun 2021, Qing Zhao wrote:
> >>> On Jun 7, 2021, at 2:48 AM, Richard Biener 
> >>> mailto:rguent...@suse.de>> wrote:
> >>> 
> >>> Meh - can you try using a mailer that does proper quoting?  It's difficult
> >>> to spot your added comments.  Will try anyway (and sorry for the delay)
> >>> 
> >>> Only the email replied to gcc-patch alias had this issue, all the other 
> >>> emails I sent are fine. Not sure why?
> >> 
> >> All your mails have this problem for me, it makes it quite difficult to
> >> follow the conversation.
> > 
> > I think the first step is to make sure the MUA is sending "text only"
> > emails. Then configure the "quoting style" to do the standard "> "-style.
> > 
> > What email client are you using?
> 
> I am using Mac’s Apple Mail client on my computer. 
> 
> I have been using this mail client for a long time, but only had such issues 
> recently. 

I share your pain! Gmail frequently likes making tiny breaking changes
too. :)

> Really not sure what’s going on.
> 
> I will try to figure this out.

Thanks!

-- 
Kees Cook


Re: [PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-06-08 Thread Kees Cook via Gcc-patches
On Tue, Jun 08, 2021 at 09:41:38AM +0200, Richard Biener wrote:
> On Mon, 7 Jun 2021, Qing Zhao wrote:
> 
> > Hi, 
> > 
> > > On Jun 7, 2021, at 2:53 AM, Richard Biener  wrote:
> > > 
> > >> 
> > >> To address the above suggestion:
> > >> 
> > >> My study shows: the call to __builtin_clear_padding is expanded during 
> > >> gimplification phase.
> > >> And there is no __bultin_clear_padding expanding during rtx expanding 
> > >> phase.
> > >> However, for -ftrivial-auto-var-init, padding initialization should be 
> > >> done both in gimplification phase and rtx expanding phase.
> > >> since the __builtin_clear_padding might not be good for rtx expanding, 
> > >> reusing __builtin_clear_padding might not work.
> > >> 
> > >> Let me know if you have any more comments on this.
> > > 
> > > Yes, I didn't suggest to literally emit calls to __builtin_clear_padding 
> > > but instead to leverage the lowering code, more specifically share the
> > > code that figures _what_ is to be initialized (where the padding is)
> > > and eventually the actual code generation pieces.  That might need some
> > > refactoring but the code where padding resides should be present only
> > > a single time (since it's quite complex).
> > 
> > Okay, I see your point here.
> > 
> > > 
> > > Which is also why I suggested to split out the padding initialization
> > > bits to a separate patch (and option).
> > 
> > Personally, I am okay with splitting padding initialization from this 
> > current patch,
> > Kees, what’s your opinion on this? i.e, the current -ftrivial-auto-var-init 
> > will NOT initialize padding, we will add another option to 
> > Explicitly initialize padding.
> 
> It would also be possible to have -fauto-var-init, -fauto-var-init-padding
> and have -ftrivial-auto-var-init for clang compatibility enabling both.

Sounds good to me!

> Or -fauto-var-init={zero,pattern,padding} and allow
> -fauto-var-init=pattern,padding to be specified.  Note there's also
> padding between auto variables on the stack - that "trailing"
> padding isn't initialized either?  (yes, GCC sorts variables to minimize
> that padding)  For example for
> 
> void foo()
> {
>   char a[3];
>   bar (a);
> }
> 
> there's 12 bytes padding after 'a', shouldn't we initialize that?  If not,
> why's other padding important to be initialized?

This isn't a situation that I'm aware of causing real-world problems.
The issues have all come from padding within an addressable object. I
haven't tested Clang's behavior on this (and I have no kernel tests for
this padding), but I do check for trailing padding, like:

struct test_trailing_hole {
char *one;
char *two;
char *three;
char four;
/* "sizeof(unsigned long) - 1" byte padding hole here. */
};

-Kees

-- 
Kees Cook


Re: [PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-06-08 Thread Kees Cook via Gcc-patches
On Tue, Jun 08, 2021 at 09:37:30AM +0200, Richard Biener wrote:
> On Mon, 7 Jun 2021, Qing Zhao wrote:
> > On Jun 7, 2021, at 2:48 AM, Richard Biener 
> > mailto:rguent...@suse.de>> wrote:
> > 
> > Meh - can you try using a mailer that does proper quoting?  It's difficult
> > to spot your added comments.  Will try anyway (and sorry for the delay)
> > 
> > Only the email replied to gcc-patch alias had this issue, all the other 
> > emails I sent are fine. Not sure why?
> 
> All your mails have this problem for me, it makes it quite difficult to
> follow the conversation.

I think the first step is to make sure the MUA is sending "text only"
emails. Then configure the "quoting style" to do the standard "> "-style.

What email client are you using?

-- 
Kees Cook


Re: [PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-06-07 Thread Kees Cook via Gcc-patches
On Mon, Jun 07, 2021 at 04:18:46PM +, Qing Zhao wrote:
> Hi, 
> 
> > On Jun 7, 2021, at 2:53 AM, Richard Biener  wrote:
> > 
> >> 
> >> To address the above suggestion:
> >> 
> >> My study shows: the call to __builtin_clear_padding is expanded during 
> >> gimplification phase.
> >> And there is no __bultin_clear_padding expanding during rtx expanding 
> >> phase.
> >> However, for -ftrivial-auto-var-init, padding initialization should be 
> >> done both in gimplification phase and rtx expanding phase.
> >> since the __builtin_clear_padding might not be good for rtx expanding, 
> >> reusing __builtin_clear_padding might not work.
> >> 
> >> Let me know if you have any more comments on this.
> > 
> > Yes, I didn't suggest to literally emit calls to __builtin_clear_padding 
> > but instead to leverage the lowering code, more specifically share the
> > code that figures _what_ is to be initialized (where the padding is)
> > and eventually the actual code generation pieces.  That might need some
> > refactoring but the code where padding resides should be present only
> > a single time (since it's quite complex).
> 
> Okay, I see your point here.
> 
> > 
> > Which is also why I suggested to split out the padding initialization
> > bits to a separate patch (and option).
> 
> Personally, I am okay with splitting padding initialization from this current 
> patch,
> Kees, what’s your opinion on this? i.e, the current -ftrivial-auto-var-init 
> will NOT initialize padding, we will add another option to 
> Explicitly initialize padding.

If two new options are needed, that's fine. But "-ftrivial-auto-var-init"
needs to do both (it is _not_ getting fully initialized if padding isn't
included). And would be a behavioral mismatch between Clang and GCC. :)

-- 
Kees Cook


Re: [PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-06-07 Thread Kees Cook via Gcc-patches
On Mon, Jun 07, 2021 at 09:48:41AM +0200, Richard Biener wrote:
> On Thu, 27 May 2021, Qing Zhao wrote:
> > @@ -5001,6 +5185,17 @@ gimplify_init_constructor (tree *expr_p, gimple_seq
> > *pre_p, gimple_seq *post_p,
> >  /* If a single access to the target must be ensured and all
> > elements
> > are zero, then it's optimal to clear whatever their number.
> > */
> >  cleared = true;
> > +   else if (flag_trivial_auto_var_init > AUTO_INIT_UNINITIALIZED
> > +&& !TREE_STATIC (object)
> > +&& type_has_padding (type))
> > + /* If the user requests to initialize automatic variables with
> > +paddings inside the type, we should initialize the paddings
> > too.
> > +C guarantees that brace-init with fewer initializers than
> > members
> > +aggregate will initialize the rest of the aggregate as-if it
> > were
> > +static initialization.  In turn static initialization
> > guarantees
> > +that pad is initialized to zero bits.
> > +So, it's better to clear the whole record under such
> > situation.  */
> > + cleared = true;
> > 
> > so here we have padding as well - I think this warrants to be controlled
> > by an extra option?  And we can maybe split this out to a separate
> > patch? (the whole padding stuff)
> > 
> > Clang does the padding initialization with this option, shall we be 
> > consistent with Clang?
> 
> Just for the sake of consistency?  No.  Is there a technical reason
> for this complication?  Say we have
> 
>   struct { short s; int i; } a;
> 
> what's the technical reason to initialize the padding?  I might
> be tempted to use -ftrivial-auto-init but I'd definitely don't
> want to spend cycles/instructions initializing the padding in the
> above struct.

Yes, this is very important. This is one of the more common ways memory
content leaks happen in programs (especially the kernel). e.g.:

struct example {
short s;
int i;
};

struct example instance = { .i = foo };

While "s" gets zeroed, the padding may not, and may contain prior memory
contents. Having this be deterministically zero is important for this
feature. If the structure gets byte-copied to a buffer (e.g. syscall,
etc), the padding will go along for the ride.

-- 
Kees Cook


Re: [RFC][patch for gcc12][version 1] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-06-01 Thread Kees Cook via Gcc-patches
On Tue, Jun 01, 2021 at 04:35:53PM -0400, David Malcolm wrote:
> [...]
> Did this patch get reviewed/approved?

It's still under review, but I think it's close.

> Is the latest version still this one:
>   https://gcc.gnu.org/pipermail/gcc-patches/2021-February/565581.html
> or is there a more recent version that should be reviewed?

Yup, here's the latest (v3):
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570208.html

> (I don't think I'm qualified to approve the patch, I'm just a fan of
> the approach.  FWIW I've been experimenting with extending -fanalyzer
> to detect infoleaks in the kernel, whereas AIUI this patch is about
> mitigating them)

Thanks for your interest! If you patch your GCC with this, it should
Just Work in the kernel (i.e. you can set CONFIG_INIT_STACK_ALL_ZERO=y)

> Hope this is constructive

Yup! Please report back any testing; that'll help show people are
interested in the feature. :)

-- 
Kees Cook


Re: [patch for gcc12 stage1][version 2] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-04-23 Thread Kees Cook via Gcc-patches
On Fri, Apr 23, 2021 at 08:05:29PM +0100, Richard Sandiford wrote:
> Finally getting to this now that the GCC 11 rush is over.  Sorry for
> the slow response.
> 
> I've tried to review most of the code below, but skipped the testsuite
> parts in the interests of time.  I'll probably have more comments in
> future rounds, just wanted to get the ball rolling.
> 
> This is realy Richi's area more than mine though, so please take this
> with a grain of salt.
> 
> Qing Zhao  writes:
> > 2.  initialize all paddings to zero when -ftrivial-auto-var-init is present.
> > In expr.c (store_constructor):
> >
> > Clear the whole structure when
> > -ftrivial-auto-var-init and the structure has paddings.
> >
> > In gimplify.c (gimplify_init_constructor):
> >
> > Clear the whole structure when
> > -ftrivial-auto-var-init and the structure has paddings.
> 
> Just to check: are we sure we want to use zero as the padding fill
> value even for -ftrivial-auto-var-init=pattern?  Or should it be
> 0xAA instead, to match the integer fill pattern?
> 
> I can see the arguments both ways, just thought it was worth asking.

I have no opinion myself, but I can give background.  Originally, Clang
implemented using pattern, but there was discussion around it and the
decision there was to go with zero init, as it seemed to more closely
match the C spec:
https://github.com/llvm/llvm-project/commit/d39fbc7e20d84364e409ce59724ce20625637062

> > +This is C and C++'s default.
> > +
> > +@item
> > +@samp{pattern} Initialize automatic variables with values which will likely
> > +transform logic bugs into crashes down the line, are easily recognized in a
> > +crash dump and without being values that programmers can rely on for useful
> > +program semantics.
> > +The values used for pattern initialization might be changed in the future.
> > +
> > +@item
> > +@samp{zero} Initialize automatic variables with zeroes.
> > +@end itemize
> > +
> > +The default is @samp{uninitialized}.
> > +
> > +You can control this behavior for a specific variable by using the variable
> > +attribute @code{uninitialized} (@pxref{Variable Attributes}).
> > +
> 
> I think it's important to say here that GCC still considers the
> variables to be uninitialised and still considers reading them to
> be undefined behaviour.  The option is simply trying to improve the
> security and predictability of the program in the presence of these
> uninitialised variables.

Excellent point, yes. That'd be good to call out.

> > […]
> > @@ -1831,6 +2000,17 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p)
> >as they may contain a label address.  */
> > walk_tree (, force_labels_r, NULL, NULL);
> > }
> > +  /* When there is no explicit initializer, if the user requested,
> > +We should insert an artifical initializer for this automatic
> > +variable for non vla variables.  */
> 
> I think we should explain why we can skip VLAs here.

FWIW, in testing, VLAs do get initialized, so I guess there's a separate
place where it happens.


Thanks for the review!

-- 
Kees Cook


Re: [patch for gcc12 stage1][version 2] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-04-07 Thread Kees Cook via Gcc-patches
On Wed, Mar 24, 2021 at 04:21:49PM -0500, Qing Zhao wrote:
> This is the 2nd version of the patch for the new security feature for GCC.
> 
> Could you please take a look at it and let me know any comments and issues.

This behaves perfectly as far as I'm able to test in the Linux kernel!
Thank you!

For comparison to v1, here's the stack init test output for version 2 of the 
patch:

test_stackinit: u8_zero ok
test_stackinit: u16_zero ok
test_stackinit: u32_zero ok
test_stackinit: u64_zero ok
test_stackinit: char_array_zero ok
test_stackinit: small_hole_zero ok
test_stackinit: big_hole_zero ok
test_stackinit: trailing_hole_zero ok
test_stackinit: packed_zero ok
test_stackinit: small_hole_dynamic_partial ok
test_stackinit: big_hole_dynamic_partial ok
test_stackinit: trailing_hole_dynamic_partial ok
test_stackinit: packed_dynamic_partial ok
test_stackinit: small_hole_static_partial ok
test_stackinit: big_hole_static_partial ok
test_stackinit: trailing_hole_static_partial ok
test_stackinit: packed_static_partial ok
test_stackinit: small_hole_static_all ok
test_stackinit: big_hole_static_all ok
test_stackinit: trailing_hole_static_all ok
test_stackinit: packed_static_all ok
test_stackinit: small_hole_dynamic_all ok
test_stackinit: big_hole_dynamic_all ok
test_stackinit: trailing_hole_dynamic_all ok
test_stackinit: packed_dynamic_all ok
test_stackinit: small_hole_runtime_partial ok
test_stackinit: big_hole_runtime_partial ok
test_stackinit: trailing_hole_runtime_partial ok
test_stackinit: packed_runtime_partial ok
test_stackinit: small_hole_runtime_all ok
test_stackinit: big_hole_runtime_all ok
test_stackinit: trailing_hole_runtime_all ok
test_stackinit: packed_runtime_all ok
test_stackinit: u8_none ok
test_stackinit: u16_none ok
test_stackinit: u32_none ok
test_stackinit: u64_none ok
test_stackinit: char_array_none ok
test_stackinit: switch_1_none XFAIL (uninit bytes: 8)
test_stackinit: switch_2_none XFAIL (uninit bytes: 8)
test_stackinit: small_hole_none ok
test_stackinit: big_hole_none ok
test_stackinit: trailing_hole_none ok
test_stackinit: packed_none ok
test_stackinit: user ok
test_stackinit: all tests passed!

The switch cases are an "expected fail" still, and that's totally fine
for now[1]. Those were all purged from real kernel code anyway. ;)

So that's a big "Ack" from me. :)

What are next steps for this patch? I know a lot of people are looking
forward to it. (And then long-open bug[2] for auto-init can be closed.)

Thanks again!

-Kees

[1] Clang doesn't handle this either https://bugs.llvm.org/show_bug.cgi?id=44916
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87210

> 
> Thanks.
> 
> Qing
> 
> **compared to Version 1, this version added the following new features to 
> address Kees’s comments:
> 
> 1.  correctly handle VLA inside a structure for pattern initialization.
> In tree.c (build_pattern_cst):
> 
> +   /* if the field is a variable length array, it should be the last
> +  field of the record, and no need to initialize.  */
> +   if (TREE_CODE (TREE_TYPE (field)) == ARRAY_TYPE
> +   && TYPE_SIZE (TREE_TYPE (field)) == NULL_TREE
> +   && ((TYPE_DOMAIN (TREE_TYPE (field)) != NULL_TREE
> +   && TYPE_MAX_VALUE (TYPE_DOMAIN (TREE_TYPE (field)))
> +  == NULL_TREE)
> +  || TYPE_DOMAIN (TREE_TYPE (field)) == NULL_TREE))
> + continue;
> 
> 2.  initialize all paddings to zero when -ftrivial-auto-var-init is present.
> In expr.c (store_constructor):
> 
>   Clear the whole structure when
> -ftrivial-auto-var-init and the structure has paddings.
> 
> In gimplify.c (gimplify_init_constructor):
> 
>   Clear the whole structure when
> -ftrivial-auto-var-init and the structure has paddings.
> 
> As agreed with Kees, treat the issue related to auto variables outside of the 
> cases and inside the switch as a low priority one. 
> 
> 3. Add  the following new testing cases for the above 1 and 2:
> 
> * c-c++-common/auto-init-13.c: New test.
> * c-c++-common/auto-init-14.c: New test. 
> 
>   * gcc.target/aarch64/auto-init-9.c: New test.
>   * gcc.target/aarch64/auto-init-10.c: New test.
> * gcc.target/aarch64/auto-init-11.c: New test.
> * gcc.target/aarch64/auto-init-12.c: New test.
> * gcc.target/aarch64/auto-init-13.c: New test.
> * gcc.target/aarch64/auto-init-14.c: New test.
> * gcc.target/aarch64/auto-init-15.c: New test.
> * gcc.target/aarch64/auto-init-16.c: New test.
> * gcc.target/aarch64/auto-init-17.c: New test.
> * gcc.target/aarch64/auto-init-18.c: New test.
> * gcc.target/aarch64/auto-init-19.c: New test.
> * gcc.target/aarch64/auto-init-20.c: New test.
> 
> * gcc.target/i386/auto-init-9.c: New test.
> * gcc.target/i386/auto-init-10.c: New test.
> * gcc.target/i386/auto-init-11.c: New test.
> * 

Re: [RFC][patch for gcc12][version 1] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-03-12 Thread Kees Cook via Gcc-patches
On Fri, Mar 12, 2021 at 03:35:28PM -0600, Qing Zhao wrote:
> Hi, Kees,
> 
> I am looking at the structure padding initialization issue. And also have 
> some questions:
> 
> 
> > On Feb 24, 2021, at 10:41 PM, Kees Cook  wrote:
> > 
> > It looks like there is still some issues with padding and pre-case
> > switch variables. Here's the test output, FWIW:
> > 
> > 
> > test_stackinit: small_hole_static_all FAIL (uninit bytes: 3)
> > test_stackinit: big_hole_static_all FAIL (uninit bytes: 61)
> > test_stackinit: trailing_hole_static_all FAIL (uninit bytes: 7)
> > test_stackinit: small_hole_dynamic_all FAIL (uninit bytes: 3)
> > test_stackinit: big_hole_dynamic_all FAIL (uninit bytes: 61)
> > test_stackinit: trailing_hole_dynamic_all FAIL (uninit bytes: 7)
> > 
> > test_stackinit: switch_1_none FAIL (uninit bytes: 8)
> > test_stackinit: switch_2_none FAIL (uninit bytes: 8)
> > test_stackinit: failures: 8
> > 
> > 
> > /* Simple structure with padding likely to be covered by compiler. */
> > struct test_small_hole {
> > size_t one;
> > char two;
> > /* 3 byte padding hole here. */
> > int three;
> > unsigned long four;
> > };
> > 
> > /* Try to trigger unhandled padding in a structure. */
> > struct test_aligned {
> > u32 internal1;
> > u64 internal2;
> > } __aligned(64);
> > 
> > struct test_big_hole {
> > u8 one;
> > u8 two;
> > u8 three;
> > /* 61 byte padding hole here. */
> > struct test_aligned four;
> > } __aligned(64);
> > 
> > struct test_trailing_hole {
> > char *one;
> > char *two;
> > char *three;
> > char four;
> > /* "sizeof(unsigned long) - 1" byte padding hole here. */
> > };
> > 
> > They fail when they're statically initialized (either fully or
> > partially),
> 
> So, when the structure is not statically initialized,  the compiler 
> initialization is good?
> 
> For the failing cases, what’s the behavior of the LLVM 
> -ftrivial-auto-var-init?
> 
> From the LLVM patch:  (https://reviews.llvm.org/D54604 
> )
> 
> 
> To keep the patch simple, only some undef is removed for now, see
> replaceUndef. The padding-related infoleaks are therefore not all gone yet.
> This will be addressed in a follow-up, mainly because addressing 
> padding-related
> leaks should be a stand-alone option which is implied by variable
> initialization.
> 

Right, padding init happened in:
https://github.com/llvm/llvm-project/commit/4f7bc0eee7e6099b1abd57dac3c83529944ab23c

And was further clarified that, IIUC, padding _must be zero_ regardless
of pattern-vs-zero in:
https://github.com/llvm/llvm-project/commit/d39fbc7e20d84364e409ce59724ce20625637062

> Yes, in GCC’s implementation, I think that  fixing all padding-related leaks 
> also require a
> separate patch.

That's fine -- but it'll need to be tied to -ftrivial-auto-var-init,
since otherwise the memory isn't actually fully initialized. :)

-Kees

-- 
Kees Cook


Re: [RFC][patch for gcc12][version 1] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-03-11 Thread Kees Cook via Gcc-patches
On Thu, Mar 11, 2021 at 03:47:17PM -0600, Qing Zhao wrote:
> Hi, Kees,
> 
> Sorry for the late reply (I have been busy with other work recently).
> 
> Currently, I am working on the issue of flexible length array as the last 
> field of the structure.
> 
> In order to fix it correctly, I have the following question:
> 
> 
> > On Feb 26, 2021, at 3:42 PM, Kees Cook  wrote:
> > 
> > On Thu, Feb 25, 2021 at 05:56:38PM -0600, Qing Zhao wrote:
> >> Just noticed that you didn’t add -fauto-var-init-approach=D to the command 
> >> line.
> > 
> > Ah-ha! I didn't realize that was needed; thanks. However, now some of the 
> > sources crash in a different way. Here's the reproducer:
> > 
> > $ cat poc.i
> > struct a {
> >  int b;
> >  int array[];
> > };
> > void c() {
> >  struct a d;
> > }
> > 
> 
> For such variable length array as the last field of the structure, static 
> initialization is not allowed, 
> User needs to explicitly allocate memory and initialize the allocated array 
> manually in the source code. 
> 
> So, if the compiler has to initialize this structure when requested by 
> -ftrivial-auto-var-init,  I think that 
> only the fields before the last fields need to be initialized, Is this the 
> correct behavior you expected?

Right, that would be my expectation as well. Putting such a struct on
the stack tends to be nonsensical, but maybe happens if part of a union,
which would get initialized correctly, etc:

union {
struct a {
int b;
int array[];
};
char buf[32];
};

-- 
Kees Cook


Re: [RFC][patch for gcc12][version 1] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-02-26 Thread Kees Cook via Gcc-patches
On Thu, Feb 25, 2021 at 05:56:38PM -0600, Qing Zhao wrote:
> Just noticed that you didn’t add -fauto-var-init-approach=D to the command 
> line.

Ah-ha! I didn't realize that was needed; thanks. However, now some of the 
sources crash in a different way. Here's the reproducer:

$ cat poc.i
struct a {
  int b;
  int array[];
};
void c() {
  struct a d;
}

$ gcc -ftrivial-auto-var-init=pattern -fauto-var-init-approach=D -c /dev/null 
poc.i
during RTL pass: expand
poc.i: In function ‘c’:
poc.i:6:12: internal compiler error: in build_pattern_cst, at tree.c:2652
6 |   struct a d;
  |^
0x75b572 build_pattern_cst(tree_node*)
../../../gcc/gcc/tree.c:2652
0x10db116 build_pattern_cst(tree_node*)
../../../gcc/gcc/tree.c:2612
0xb8a230 expand_DEFERRED_INIT
../../../gcc/gcc/internal-fn.c:2980
0x970e17 expand_call_stmt
../../../gcc/gcc/cfgexpand.c:2749
0x970e17 expand_gimple_stmt_1
../../../gcc/gcc/cfgexpand.c:3844
0x970e17 expand_gimple_stmt
../../../gcc/gcc/cfgexpand.c:4008
0x9766b3 expand_gimple_basic_block
../../../gcc/gcc/cfgexpand.c:6045
0x9780d6 execute
../../../gcc/gcc/cfgexpand.c:6729
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

I assume it's not handling the flex-array happily?

-- 
Kees Cook


Re: [RFC][patch for gcc12][version 1] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-02-25 Thread Kees Cook via Gcc-patches
On Thu, Feb 25, 2021 at 12:15:01PM -0600, Qing Zhao wrote:
> > On Feb 24, 2021, at 10:41 PM, Kees Cook  wrote:
> > [...]
> > test_stackinit: trailing_hole_none ok
> > test_stackinit: packed_none ok
> > test_stackinit: user ok
> > test_stackinit: failures: 8
> 
> Does the above testing include “pattern initialization” in addition to “zero 
> initialization”?

This was from the zero-init case. I've just tested pattern-init now and
it actually crashes GCC. I minimized the test case to this:

$ cat main.i
a() { char b[1]; }
$ gcc -ftrivial-auto-var-init=pattern -c /dev/null main.i
main.i:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
1 | a() { char b[1]; }
  | ^
main.i: In function ‘a’:
main.i:1:12: internal compiler error: in gimplify_init_ctor_eval, at
gimplify.c:4873
1 | a() { char b[1]; }
  |^
0x69740d gimplify_init_ctor_eval
../../../gcc/gcc/gimplify.c:4873
0xb5ac8f gimplify_init_constructor
../../../gcc/gcc/gimplify.c:5320
0xb6b68a gimplify_modify_expr
../../../gcc/gcc/gimplify.c:5952
0xb533ba gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), 
int)
../../../gcc/gcc/gimplify.c:14262
0xb56b26 gimplify_stmt(tree_node**, gimple**)
../../../gcc/gcc/gimplify.c:7056
0xb68e6e gimplify_and_add(tree_node*, gimple**)
../../../gcc/gcc/gimplify.c:489
0xb68e6e gimple_add_init_for_auto_var
../../../gcc/gcc/gimplify.c:1892
0xb68e6e gimplify_decl_expr
../../../gcc/gcc/gimplify.c:2010
0xb53bd6 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), 
int)
../../../gcc/gcc/gimplify.c:14459
0xb56b26 gimplify_stmt(tree_node**, gimple**)
../../../gcc/gcc/gimplify.c:7056
0xb5727d gimplify_bind_expr
../../../gcc/gcc/gimplify.c:1421
0xb536f0 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), 
int)
../../../gcc/gcc/gimplify.c:14463
0xb6ccc9 gimplify_stmt(tree_node**, gimple**)
../../../gcc/gcc/gimplify.c:7056
0xb6ccc9 gimplify_body(tree_node*, bool)
../../../gcc/gcc/gimplify.c:15498
0xb6d0ed gimplify_function_tree(tree_node*)
../../../gcc/gcc/gimplify.c:15652
0x9ae8d7 cgraph_node::analyze()
../../../gcc/gcc/cgraphunit.c:670
0x9b13a7 analyze_functions
../../../gcc/gcc/cgraphunit.c:1233
0x9b1f9d symbol_table::finalize_compilation_unit()
../../../gcc/gcc/cgraphunit.c:2511
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.


> > struct test_..._hole instance = { .two = ..., };
> > 
> > or
> > 
> > struct test_..._hole instance = { .one = ...,
> >   .two = ...,
> >   .three = ...,
> >   .four = ...,
> > 
> > };
> 
> So, when the structure variables are not statically initialized, all the 
> paddings are initialized correctly by the compiler?

For the zero case, yes. (Usually such happen via copies into stack from from 
.rodata
sections, or accidentally from already-initialized copies.)

> In the current implementation, when the auto variable is explicitly 
> initialized, compiler will do nothing.
> Looks like for structure variables we need some special handling to 
> initialize the paddings.
> Need to study this a little bit and see how to fix it.

Deterministic padding init is a big part of this defense since that's
where the bulk of the really hard-to-find problems have existed (i.e. in
padding holes in structures that got copied around).

> > The last case is for switch variables outside of case statements, like
> > "var" here:
> > 
> > switch (path) {
> > unsigned long var;
> > 
> > case ..:
> > ...
> > case ..:
> > ...
> > ...
> > }
> > 
> 
> Will study this case more.

For the kernel, this is a low priority, since I purged the code of all
variable declarations outside of an execution path[1]. For reference,
here's the Clang bug for the same problem[2] (though latest Clang git
appears to no longer exhibit this issue, so they may have fixed it).

-Kees

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep=Distribute+switch+variables+for+initialization
[2] https://bugs.llvm.org/show_bug.cgi?id=44916

-- 
Kees Cook


Re: [RFC][patch for gcc12][version 1] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-02-24 Thread Kees Cook via Gcc-patches
(please keep me in CC, I'm not subscribed...)

On Thu Feb 18, 2021 Qing Zhao said:
> Initialize automatic variables with new first class option 
> -ftrivial-auto-var-init=[uninitialized|pattern|zero]

Yay! I'm really excited to see this. Thank you for working on
it! I've built GCC with this applied, and it works out of the box
for a Linux kernel build, which correctly detects the availability
of -ftrivial-auto-var-init=[pattern|zero] for the respective
CONFIG_INIT_STACK_ALL_PATTERN and CONFIG_INIT_STACK_ALL_ZERO options.

The output from the kernel's CONFIG_TEST_STACKINIT module shows coverage
for most uninitialized cases. Yay! :)

It looks like there is still some issues with padding and pre-case
switch variables. Here's the test output, FWIW:

test_stackinit: u8_zero ok
test_stackinit: u16_zero ok
test_stackinit: u32_zero ok
test_stackinit: u64_zero ok
test_stackinit: char_array_zero ok
test_stackinit: small_hole_zero ok
test_stackinit: big_hole_zero ok
test_stackinit: trailing_hole_zero ok
test_stackinit: packed_zero ok
test_stackinit: small_hole_dynamic_partial ok
test_stackinit: big_hole_dynamic_partial ok
test_stackinit: trailing_hole_dynamic_partial ok
test_stackinit: packed_dynamic_partial ok
test_stackinit: small_hole_static_partial ok
test_stackinit: big_hole_static_partial ok
test_stackinit: trailing_hole_static_partial ok
test_stackinit: packed_static_partial ok
test_stackinit: small_hole_static_all FAIL (uninit bytes: 3)
test_stackinit: big_hole_static_all FAIL (uninit bytes: 61)
test_stackinit: trailing_hole_static_all FAIL (uninit bytes: 7)
test_stackinit: packed_static_all ok
test_stackinit: small_hole_dynamic_all FAIL (uninit bytes: 3)
test_stackinit: big_hole_dynamic_all FAIL (uninit bytes: 61)
test_stackinit: trailing_hole_dynamic_all FAIL (uninit bytes: 7)
test_stackinit: packed_dynamic_all ok
test_stackinit: small_hole_runtime_partial ok
test_stackinit: big_hole_runtime_partial ok
test_stackinit: trailing_hole_runtime_partial ok
test_stackinit: packed_runtime_partial ok
test_stackinit: small_hole_runtime_all ok
test_stackinit: big_hole_runtime_all ok
test_stackinit: trailing_hole_runtime_all ok
test_stackinit: packed_runtime_all ok
test_stackinit: u8_none ok
test_stackinit: u16_none ok
test_stackinit: u32_none ok
test_stackinit: u64_none ok
test_stackinit: char_array_none ok
test_stackinit: switch_1_none FAIL (uninit bytes: 8)
test_stackinit: switch_2_none FAIL (uninit bytes: 8)
test_stackinit: small_hole_none ok
test_stackinit: big_hole_none ok
test_stackinit: trailing_hole_none ok
test_stackinit: packed_none ok
test_stackinit: user ok
test_stackinit: failures: 8

The kernel's test for this is a mess[1] of macros I used to avoid losing
my sanity from cut/pasting, but it makes the tests hard to read. To
break it out, the failing cases are due to padding, as seen with the
"test_small_hole", "test_big_hole", and "test_trailing_hole" structures:

/* Simple structure with padding likely to be covered by compiler. */
struct test_small_hole {
size_t one;
char two;
/* 3 byte padding hole here. */
int three;
unsigned long four;
};

/* Try to trigger unhandled padding in a structure. */
struct test_aligned {
u32 internal1;
u64 internal2;
} __aligned(64);

struct test_big_hole {
u8 one;
u8 two;
u8 three;
/* 61 byte padding hole here. */
struct test_aligned four;
} __aligned(64);

struct test_trailing_hole {
char *one;
char *two;
char *three;
char four;
/* "sizeof(unsigned long) - 1" byte padding hole here. */
};

They fail when they're statically initialized (either fully or
partially), for example:

struct test_..._hole instance = { .two = ..., };

or

struct test_..._hole instance = { .one = ...,
  .two = ...,
  .three = ...,
  .four = ...,
};

The last case is for switch variables outside of case statements, like
"var" here:

switch (path) {
unsigned long var;

case ..:
...
case ..:
...
...
}


I'm really looking forward to having this available. Thanks again!

-Kees

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/test_stackinit.c

-- 
Kees Cook


Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-10 Thread Kees Cook via Gcc-patches
[tried to clean up quoting...]

On Tue, Sep 08, 2020 at 10:00:09AM -0500, Qing Zhao wrote:
> 
> > On Sep 7, 2020, at 8:06 AM, Rodriguez Bahena, Victor 
> >  wrote:
> > 
> >>> On Thu, Sep 03, 2020 at 09:29:54AM -0500, Qing Zhao wrote:
> >>> So, my question is:
> >>>
> >>> From the security point of view, does clearing ALL registers have more 
> >>> benefit than clearing USED registers?  
> >>> From my understanding, clearing registers that are not used in the 
> >>> current routine does NOT provide additional benefit, correct me if I am 
> >>> wrong here.
> >  
> > You are right, it does not provide additional security
> 
> Then, is it necessary to provide 
> 
> -fzero-call-used-regs=all-arg|all-gpr|all   to the user?
> 
> Can we just delete these 3 sub options?

Well... I'd say there is some benefit (remember that ROP gadgets are
built from function trailers, so there is rarely a concern over what the
rest of the function is doing). Generally, they are chained together
based on just the last couple instructions:

 *useful action*
 *ret*

So with ...=used this turns into:

 *useful action*
 *clear some registers*
 *ret*

Which may still be helpful (if, for example, the state being built by
the attacker is using registers _not_ in the cleared list). However:

 *useful action*
 *clear all registers*
 *ret*

Means that suddenly the ROP chain cannot use *any* of the caller-saved
registers to hold state.

So, while ...=used is likely going to block a lot, ...=all will block
even more. I'd prefer to have both available, if for no other reason
than to compare the ROP gadget availability for any given binary (e.g.
if some future attack is found that bypasses ...=used, does it also
bypass ...=all?)

-- 
Kees Cook


Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-03 Thread Kees Cook via Gcc-patches
On Thu, Sep 03, 2020 at 09:29:54AM -0500, Qing Zhao wrote:
> On average, all the options starting with “used_…”  (i.e, only the registers 
> that are used in the routine will be zeroed) have very low runtime overheads, 
> at most 1.72% for integer benchmarks, and 1.17% for FP benchmarks. 
> If all the registers will be zeroed, the runtime overhead is bigger, all_arg 
> is 5.7%, all_gpr is 3.5%, and all is 17.56% for integer benchmarks on 
> average. 
> Looks like the overhead of zeroing vector registers is much bigger. 
> 
> For ROP mitigation, -fzero-call-used-regs=used-gpr-arg should be enough, the 
> runtime overhead with this is very small.

That looks great; thanks for doing those tests!

(And it seems like these benchmarks are kind of a "worst case" scenario
with regard to performance, yes? As in it's mostly tight call loops?)

-- 
Kees Cook


Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-08-10 Thread Kees Cook via Gcc-patches
On Mon, Aug 10, 2020 at 11:39:26PM -0300, Alexandre Oliva wrote:
> Erhm, I don't get why it's important that they be zeroed.  It seems to
> me that restoring their original values, or setting them to random
> values, would be just as good defenses from having them set within the

In the performance analysis I looked at a while ago, doing the
register-self-xor is extremely fast to run (IIRC the cycle counts on x86
were absolutely tiny), and it's smaller for code size which minimized
the overall image footprint.

> [...]
> Code that sets the register to zero in the epilogue would be much harder
> for an attacker to change indeed.

Yes, a fixed value is a significantly better defensive position to take
for ROP. And specifically zero _tends_ to be the safest choice as it's
less "useful" to be used as a size, index, or pointer. And, no, it is
not perfect, but nothing can be if we're dealing with trying to defend
against arbitrary ROP gadget finding (or uninitialized stack contents,
where the same argument for "zero is best" also holds[1]).

-Kees

[1] https://lists.llvm.org/pipermail/cfe-dev/2020-April/065221.html

-- 
Kees Cook


Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-08-06 Thread Kees Cook via Gcc-patches
On Thu, Aug 06, 2020 at 10:37:43AM +0200, Richard Biener wrote:
> OK, so -fzero-call-used-regs is a ROP mitigation technique.  To me
> it sounded more like a mitigation against information leaks which
> then would be highly incomplete w/o spill slot clearing.  Like
> we had that discussion on secure erase of memory that should not
> be DSEd.

I've viewed stack erasure as separate from register clearing. The
"when" of stack erasure tends to define which things are being defended
against. If the stack is being erased on function entry, you're defending
against all the various "uninitialized" variable attacks (which can be
info exposures, flow control redirection, etc). If it's on function exit,
this is more aimed at avoiding stale data and minimizing what's available
during an attack (and it also provides similar "uninit" defenses, just
in a different way). And FWIW, past benchmarks on this appear to indicate
erase-on-entry is more cache-friendly.

-- 
Kees Cook


Re: PING[STAGE 1][PATCH][x86][1/3]: Add -mzero-caller-saved-regs=[skip|used-gpr|all-gpr|used|all]

2020-05-04 Thread Kees Cook via Gcc-patches
On Mon, May 04, 2020 at 12:02:45PM -0700, H.J. Lu wrote:
> On Mon, May 4, 2020 at 11:21 AM Kees Cook  wrote:
> >
> > On Mon, May 04, 2020 at 11:51:49AM -0500, Qing Zhao wrote:
> > > Hi,
> > >
> > > This is a PING for this patch for gcc11 stage 1.
> > >
> > > https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544058.html 
> > > 
> > >
> > > Please take a look on it.
> >
> > Hi!
> >
> > I think the best course of action here would be the rebase these patches
> > and resend them against the current GCC code base as inline patches
> > (not attachments as you sent earlier), following the details here:
> > https://gcc.gnu.org/contribute.html
> >
> > > > https://outflux.net/slides/2019/lpc/gcc-and-clang.pdf 
> > > > 
> > > > Tested on  x86-64 with bootstrapping GCC trunk, regression tests 
> > > > exposed several new regressions, these new regressions are
> > > > fixed by 2 following patches I will send in next two emails.
> >
> > I look forward to seeing these! I'd really like to have the feature
> > available as another defense in depth for the Linux kernel.
> >
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545081.html

Thanks!

Would it be helpful for me to report a "Tested-by: Kees Cook..." or
something similar on that thread? What's the best way to indicate that
kind of review on gcc-patches?

-- 
Kees Cook


Re: PING[STAGE 1][PATCH][x86][1/3]: Add -mzero-caller-saved-regs=[skip|used-gpr|all-gpr|used|all]

2020-05-04 Thread Kees Cook via Gcc-patches
On Mon, May 04, 2020 at 11:51:49AM -0500, Qing Zhao wrote:
> Hi,
> 
> This is a PING for this patch for gcc11 stage 1. 
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544058.html 
> 
> 
> Please take a look on it.

Hi!

I think the best course of action here would be the rebase these patches
and resend them against the current GCC code base as inline patches
(not attachments as you sent earlier), following the details here:
https://gcc.gnu.org/contribute.html

> > https://outflux.net/slides/2019/lpc/gcc-and-clang.pdf 
> > 
> > Tested on  x86-64 with bootstrapping GCC trunk, regression tests exposed 
> > several new regressions, these new regressions are
> > fixed by 2 following patches I will send in next two emails.

I look forward to seeing these! I'd really like to have the feature
available as another defense in depth for the Linux kernel.

-Kees

-- 
Kees Cook


Re: [PATCH] [RFC] rs6000: -mreadonly-in-sdata (PR82411)

2018-02-28 Thread Kees Cook via gcc-patches
On Wed, Feb 28, 2018 at 4:39 PM, Segher Boessenkool
 wrote:
> On Wed, Feb 28, 2018 at 04:15:23PM -0800, Kees Cook wrote:
>> On Wed, Feb 28, 2018 at 3:23 PM, Segher Boessenkool
>>  wrote:
>> > I'm trying to figure out how to get that file to build at all :-)
>
>> $ CROSS_COMPILE=powerpc-linux-gnu- ARCH=powerpc make wii_defconfig
>> $ for i in CONFIG_STRICT_KERNEL_RWX \
>>CONFIG_DEBUG_KERNEL \
>>CONFIG_DEBUG_RODATA_TEST ; do
>> CROSS_COMPILE=powerpc-linux-gnu- ARCH=powerpc ./scripts/config -e $i
>> done
>
> I use ppc6xx_defconfig normally; I had to disable CONFIG_HIBERNATION
> as well.
>
> Without the new flag, rodata_test_data ends up in .sdata in the object
> file and in .data in vmlinux.
>
> Adding
>
> ===
> diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
> index ccd2556..31b9613 100644
> --- a/arch/powerpc/Makefile
> +++ b/arch/powerpc/Makefile
> @@ -143,6 +143,8 @@ CFLAGS-$(CONFIG_PPC64)  += $(call 
> cc-option,-mcmodel=med
>  CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mno-pointers-to-nested-functions)
>  CFLAGS-$(CONFIG_PPC32) := -ffixed-r2 $(MULTIPLEWORD)
>
> +CFLAGS-$(CONFIG_PPC32) += $(call cc-option,-mno-readonly-in-sdata)
> +
>  ifeq ($(CONFIG_PPC_BOOK3S_64),y)
>  CFLAGS-$(CONFIG_GENERIC_CPU) += $(call cc-option,-mtune=power7,-mtune=power4)
>  CFLAGS-$(CONFIG_GENERIC_CPU) += -mcpu=power4
> ===
>
> it ends up in .rodata in both.

Excellent! That's perfect. :)

> It would be nice to have a more comprehensive test, but this will do; I'll
> commit it tomorrow (will resend, with doc and changelog and all that).

Thank you! Do you want to send the Makefile fix upstream too, or
should I route that?

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH] [RFC] rs6000: -mreadonly-in-sdata (PR82411)

2018-02-28 Thread Kees Cook via gcc-patches
On Wed, Feb 28, 2018 at 3:23 PM, Segher Boessenkool
 wrote:
> On Wed, Feb 28, 2018 at 01:46:33PM -0800, Kees Cook wrote:
>> On Tue, Feb 27, 2018 at 2:01 PM, Segher Boessenkool
>>  wrote:
>> > This adds a new option -mreadonly-in-sdata (on by default) that
>> > controls whether readonly data can be put in sdata.  (For EABI this
>> > does nothing, readonly data is put in sdata2 as usual).
>>
>> Cool! Thanks for working on this.
>>
>> > Kees, could you try this out with your use case?  Add the flag
>> > -mno-readonly-in-sdata in your build scripts.  The patch is against
>> > GCC trunk.
>>
>> I'm struggling to create a ppc cross compiler, otherwise I would be
>> happy to test this. :)
>
> For compiling kernels you can use
> 

Hah, yes, I had just asked around privately about cross building tools
and was aimed there only to discover you're the author. ;)

>> If you're able to build a ppc kernel and remove the "static" from
>> mm/rodata_test_data.c's rodata_test_data and see the rodata_test_data
>> variable not in .sdata, that should be sufficient. The case reported
>> was here:
>> https://lkml.org/lkml/2017/9/21/156
>>
>> I'll continue trying to solve my cross build issue...
>
> I'm trying to figure out how to get that file to build at all :-)

This works for me with my distro cross compiler:

$ git diff
diff --git a/mm/rodata_test.c b/mm/rodata_test.c
index d908c8769b48..6bb4deb12e78 100644
--- a/mm/rodata_test.c
+++ b/mm/rodata_test.c
@@ -14,7 +14,7 @@
 #include 
 #include 

-static const int rodata_test_data = 0xC3;
+const int rodata_test_data = 0xC3;

 void rodata_test(void)
 {

$ CROSS_COMPILE=powerpc-linux-gnu- ARCH=powerpc make wii_defconfig
$ for i in CONFIG_STRICT_KERNEL_RWX \
   CONFIG_DEBUG_KERNEL \
   CONFIG_DEBUG_RODATA_TEST ; do
CROSS_COMPILE=powerpc-linux-gnu- ARCH=powerpc ./scripts/config -e $i
done

$ CROSS_COMPILE=powerpc-linux-gnu- ARCH=powerpc make -j...
...
$ objdump -t mm/rodata_test.o | grep rodata_test_data
 g O .sdata 0004 rodata_test_data


-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH] [RFC] rs6000: -mreadonly-in-sdata (PR82411)

2018-02-28 Thread Kees Cook via gcc-patches
On Tue, Feb 27, 2018 at 2:01 PM, Segher Boessenkool
 wrote:
> This adds a new option -mreadonly-in-sdata (on by default) that
> controls whether readonly data can be put in sdata.  (For EABI this
> does nothing, readonly data is put in sdata2 as usual).

Cool! Thanks for working on this.

> Kees, could you try this out with your use case?  Add the flag
> -mno-readonly-in-sdata in your build scripts.  The patch is against
> GCC trunk.

I'm struggling to create a ppc cross compiler, otherwise I would be
happy to test this. :)

If you're able to build a ppc kernel and remove the "static" from
mm/rodata_test_data.c's rodata_test_data and see the rodata_test_data
variable not in .sdata, that should be sufficient. The case reported
was here:
https://lkml.org/lkml/2017/9/21/156

I'll continue trying to solve my cross build issue...

Thanks!

-Kees

-- 
Kees Cook
Pixel Security