[Bug c/77992] Provide feature to initialize padding bytes to avoid information leaks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

--- Comment #15 from Kangjie Lu ---
(In reply to Richard Biener from comment #14)
> Re-opening as an enhancement request for sth like -fexplict-padding adding
> artificial fields to structures padding.
>
> Patches welcome (hint: look into stor-layout.c, start_record_layout /
> finish_record_layout, place_field).

Sounds great! Thanks for the hint.

Since we have been working on preventing uninitialized data leaks for quite a while, we are interested in implementing this GCC feature. Once we finish a beta version, we will submit it to the GCC developers for review and testing.
[Bug c/77992] Failures to initialize padding bytes -- causing many information leaks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

--- Comment #13 from Kangjie Lu ---
(In reply to jos...@codesourcery.com from comment #10)
> If you care about information in bytes that are not part of a field with
> other semantic significance, you should use -Werror=padded to get errors
> on structs with padding and use that information to add explicit dummy
> fields in the source code where there was padding. Once there are
> explicit dummy fields, their values will be preserved by the compiler, so
> you can either zero the whole struct with memset and rely on the zeroing
> of dummy fields not being optimized away, or use a struct initializer and
> rely on it implicitly zeroing those fields. (Of course this may reduce
> efficiency as optimizations such as SRA now need to track values of those
> fields, whereas they do not need to track values of padding.)

This is a candidate solution, but I do not think it scales. Given that the Linux kernel has tens of thousands of modules, the idea of manually initializing padding bytes for all data structures will definitely be declined by the Linux community.

My opinion is still that, because padding is introduced by compilers and is "invisible" to developers, initializing padding should be done on the compiler side.
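For readers following the thread, the approach suggested in comment #10 (explicit dummy fields plus a struct initializer) might look roughly like the sketch below. The names S_explicit, pad and make_s are hypothetical and not taken from the thread or the attached testcase. The size check assumes that the alignment of long equals its size, which holds on common ABIs but is not guaranteed by the standard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of struct S with the tail padding spelled out
 * as a named member, so the compiler no longer inserts hidden bytes. */
struct S_explicit {
    long l;
    char c;
    char pad[sizeof(long) - 1];  /* explicit dummy field */
};

/* Compile-time check that no hidden padding remains (assumes the
 * alignment of long equals sizeof(long)). */
static_assert(sizeof(struct S_explicit) == 2 * sizeof(long),
              "unexpected hidden padding in struct S_explicit");

/* A designated initializer zero-initializes every member that is not
 * named, so pad is zeroed as an ordinary member. */
struct S_explicit make_s(long l, char c)
{
    struct S_explicit s = { .l = l, .c = c };
    return s;
}
```

Compiling such code with -Wpadded (or -Werror=padded, as suggested in comment #10) reports any struct that still contains compiler-inserted padding. The scalability objection raised in the reply above still applies: every affected struct has to be edited by hand.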
[Bug c/77992] Failures to initialize padding bytes -- causing many information leaks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

--- Comment #12 from Kangjie Lu ---
(In reply to Andreas Schwab from comment #11)
> The problem with that strategy is that padding is architecture dependent,
> and care must be taken not to introduce ABI breakage.

Agreed. Otherwise, a developer would have to write matching dummy fields for each platform, which would be a burden for code maintenance.
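To illustrate the architecture dependence mentioned in comment #11, the amount of tail padding can be computed from the layout rather than hard-coded. This is only a sketch: struct S is the one from the original report, and the helper name tail_padding_of_S is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct S {
    long l;
    char c;
};

/* Number of bytes between the end of the last member (c) and the end
 * of the struct. The result depends on the target ABI. */
size_t tail_padding_of_S(void)
{
    return sizeof(struct S) - (offsetof(struct S, c) + sizeof(char));
}
```

On x86_64 Linux this yields the seven padding bytes mentioned in the original report. On a target where long is four bytes wide it would typically yield three, which is why hand-written dummy fields are hard to keep portable.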
[Bug c/77992] Failures to initialize padding bytes -- causing many information leaks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

--- Comment #9 from Kangjie Lu ---
(In reply to Andrew Pinski from comment #8)
> A simple google search (secure memset [glibc]) finds a few things:
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1381.pdf
>
> https://sourceware.org/ml/libc-alpha/2014-12/msg00506.html
>
> https://www.securecoding.cert.org/confluence/display/c/MSC06-C.+Beware+of+compiler+optimizations
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537

Thanks for sharing these interesting links. Sure, compiler optimizations may sometimes aggressively eliminate code they consider dead. As I mentioned in my last reply, this is not a problem in our work because our instrumentation is inserted after all LLVM optimization passes, so the inserted memset will not be removed.

Back to my original problem: many Linux kernel developers also hope GCC can provide a feature (such as a compilation option) that zero-initializes padding bytes. Fixing these information leaks manually would make code maintenance extremely difficult.

Anyway, I just wanted to report this issue :)
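As background for the "secure memset" discussion above, one commonly used workaround is to call memset through a volatile function pointer, so the compiler cannot prove which function is called and therefore cannot drop the store as dead. The sketch below is an illustration only, not the glibc proposal or the instrumentation described in this thread. The names memset_ptr and secure_zero are hypothetical, and the C standard does not formally guarantee that this idiom prevents elimination.

```c
#include <assert.h>
#include <string.h>

/* Volatile function pointer: the pointer value must be re-read at each
 * call, so the compiler cannot assume the callee is the memset builtin. */
static void *(*const volatile memset_ptr)(void *, int, size_t) = memset;

/* Zero n bytes at p in a way that is unlikely to be optimized away. */
void secure_zero(void *p, size_t n)
{
    memset_ptr(p, 0, n);
}
```

Where available, a library function designed for this purpose (for example explicit_bzero on some systems) is usually preferable to a hand-rolled version.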
[Bug c/77992] Failures to initialize padding bytes -- causing many information leaks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

--- Comment #7 from Kangjie Lu ---
(In reply to Andrew Pinski from comment #6)
> >More information can be found in our research paper:
> >http://www.cc.gatech.edu/~klu38/publications/unisan-ccs16.pdf
>
>
> Your research paper is wrong and does not consider C is an inherently
> insecure language to begin with. There are many other things wrong with
> it. Like for an example recommending the use of memset when you want to
> hide the stores from the compiler. There is already a thread on the glibc
> mailing list about this exact thing about adding a secure memset which
> GCC is not going to optimize away.

Thanks for your feedback. We do think C is not a safe language, and that is why we want to secure programs written in C. Could you give me more information about the thread? We use LLVM instead of GCC, and our instrumentation is inserted after the optimization passes.

Thanks!
[Bug driver/77992] Failures to initialize padding bytes -- causing many information leaks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

Kangjie Lu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |---

--- Comment #4 from Kangjie Lu ---
(In reply to Andrew Pinski from comment #3)
> There is no way in C to do that. If you want a secure language you need
> something different.

Could you please explain why there is no way in C to initialize padding? Besides performance (I understand that the unaligned initialization could be expensive), any other reasons?

Thanks!
[Bug driver/77992] Failures to initialize padding bytes -- causing many information leaks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

Kangjie Lu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |---

--- Comment #2 from Kangjie Lu ---
Then I guess this is an unspecified area in C11. Anyway, the failure to initialize the padding bytes will cause information leaks; many leaks have been confirmed. I would suggest that GCC initialize padding bytes even if this is not specified in C11.

Thanks,
Kangjie
[Bug driver/77992] New: Failures to initialize padding bytes -- causing many information leaks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

            Bug ID: 77992
           Summary: Failures to initialize padding bytes -- causing many
                    information leaks
           Product: gcc
           Version: 5.4.0
            Status: UNCONFIRMED
          Severity: critical
          Priority: P3
         Component: driver
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kjlu at gatech dot edu
  Target Milestone: ---

Created attachment 39817
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39817&action=edit
testcase

Hello,

I'd like to report an implementation (or even design) problem in GCC.

C11 §6.7.9/10:
"If an object that has static or thread storage duration is not initialized explicitly, then:
...
if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;"

According to this specification, padding bytes should be initialized when the initializer is static. Take a look at this example (say x86_64):

    struct S {
        long l;
        char c;
    };

    int main(void)
    {
        struct S s = { .l = 0, .c = 0 };
        return 0;
    }

The developer has carefully initialized all fields with constants. Object "s" is supposed to be fully initialized, i.e., the seven padding bytes right after "s.c" are supposed to be initialized. However, these padding bytes are in fact not initialized. In contrast, LLVM does initialize the padding bytes in such a case.

Similarly, when variables are used to initialize the fields of "s", the padding bytes are not initialized either, for example:

    struct S s = { .l = variable1, .c = variable2 };

Such failures to initialize padding bytes result in many information leaks. We have found many information leaks in the Linux kernel. Here is an example:
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-4482

More information can be found in our research paper:
http://www.cc.gatech.edu/~klu38/publications/unisan-ccs16.pdf

The testing program for reproducing the leak is attached.

Testing environment:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.2' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.2)

My suggestion for reliably addressing this problem is that the padding bytes of an object, which are implicitly introduced by compilers, should be zero-initialized upon object allocation.

Please let me know if you need more information or any assistance.

Best Regards,
Kangjie Lu
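As an illustration of the manual fix this report wants to avoid, the usual source-level workaround is to zero the whole object with memset before assigning the members, and then to inspect the padding bytes through an unsigned char pointer. The sketch below is not the attached testcase. The helper names init_S and nonzero_tail_bytes are hypothetical. Note also that C11 §6.2.6.1 allows padding bytes to take unspecified values when a member is stored, so this pattern relies on how common compilers behave rather than on a guarantee of the standard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct S {
    long l;
    char c;
};

/* Zero the whole object, padding included, then assign the members. */
void init_S(struct S *s, long l, char c)
{
    memset(s, 0, sizeof *s);
    s->l = l;
    s->c = c;
}

/* Count the non-zero bytes located after member c, i.e. in the tail
 * padding of the struct. */
size_t nonzero_tail_bytes(const struct S *s)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t n = 0;
    for (size_t i = offsetof(struct S, c) + 1; i < sizeof *s; i++)
        n += (p[i] != 0);
    return n;
}
```

A caller that copies such an object across a trust boundary (for example to user space) would call init_S first. The request in this report is for the compiler to provide the equivalent zeroing automatically.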