https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952
H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW

--- Comment #23 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Andrew Cooper from comment #22)
> One curious thing I have discovered.  While auditing the -mharden-sls=all
> code generation in Xen, I found examples where I got "ret int3 ret int3"
> with no intervening instructions.
>
> It turns out this is not a regression in this change.  It is a pre-existing
> missing optimisation, which is made more obvious when every ret is extended
> with an int3.
>
> It occurs for functions with either no stack frame at all, or functions
> which have an early exit before setting up the stack frame.  Some examples
> which occur at -O1 do not occur at -O2.
>
> One curious example which does still repro at -O2 is this.  We have a hash
> lookup function:
>
> struct context *sidtab_search(struct sidtab *s, u32 sid)
> {
>     int hvalue;
>     struct sidtab_node *cur;
>
>     if ( !s )
>         return NULL;
>
>     hvalue = SIDTAB_HASH(sid);
>     cur = s->htable[hvalue];
>     while ( cur != NULL && sid > cur->sid )
>         cur = cur->next;
>
>     if ( cur == NULL || sid != cur->sid )
>     {
>         /* Remap invalid SIDs to the unlabeled SID. */
>         sid = SECINITSID_UNLABELED;
>         hvalue = SIDTAB_HASH(sid);
>         cur = s->htable[hvalue];
>         while ( cur != NULL && sid > cur->sid )
>             cur = cur->next;
>         if ( !cur || sid != cur->sid )
>             return NULL;
>     }
>
>     return &cur->context;
> }
>
> which compiles (reformatted a little for width - unmodified:
> https://paste.debian.net/hidden/7bf675d6/) to:
>
> <sidtab_search>:
>          48 85 ff             test  %rdi,%rdi
> /------- 74 63                je    <sidtab_search+0x68>
> |        48 8b 17             mov   (%rdi),%rdx
> |        89 f0                mov   %esi,%eax
> |        83 e0 7f             and   $0x7f,%eax
> |        48 8b 04 c2          mov   (%rdx,%rax,8),%rax
> |        48 85 c0             test  %rax,%rax
> |   /--- 75 13                jne   <sidtab_search+0x29>
> |  /|--- eb 17                jmp   <sidtab_search+0x2f>
> |  ||    0f 1f 84 00 00 00 00 nopl  0x0(%rax,%rax,1)
> |  ||    00
> |  ||/-> 48 8b 40 48          mov   0x48(%rax),%rax
> |  |||   48 85 c0             test  %rax,%rax
> |  +||-- 74 06                je    <sidtab_search+0x2f>
> |  |\|-> 39 30                cmp   %esi,(%rax)
> |  | \-- 72 f3                jb    <sidtab_search+0x20>
> | /|---- 74 24                je    <sidtab_search+0x53>
> | |\---> 48 8b 42 28          mov   0x28(%rdx),%rax
> | |      48 85 c0             test  %rax,%rax
> | | /--- 75 11                jne   <sidtab_search+0x49>
> |/|-|--- eb 32                jmp   <sidtab_search+0x6c> // (1)
> ||| |    66 0f 1f 44 00 00    nopw  0x0(%rax,%rax,1)
> ||| |/-> 48 8b 40 48          mov   0x48(%rax),%rax
> ||| ||   48 85 c0             test  %rax,%rax
> |||/||-- 74 17                je    <sidtab_search+0x60> // (2)
> ||||\|-> 83 38 04             cmpl  $0x4,(%rax)
> |||| \-- 76 f2                jbe   <sidtab_search+0x40>
> ||||     83 38 05             cmpl  $0x5,(%rax)
> +|||---- 75 15                jne   <sidtab_search+0x68>
> ||\|---> 48 83 c0 08          add   $0x8,%rax
> || |     c3                   retq
> || |     cc                   int3
> || |     0f 1f 80 00 00 00 00 nopl  0x0(%rax)
> || \---> c3                   retq  // Target of (2)
> ||       cc                   int3
> ||       66 0f 1f 44 00 00    nopw  0x0(%rax,%rax,1)
> \|-----> 31 c0                xor   %eax,%eax
>  |       c3                   retq
>  |       cc                   int3
>  \-----> c3                   retq  // Target of (1)
>          cc                   int3
>          66 90                xchg  %ax,%ax
>
> There are 4 exits in total.  Two have to set up %eax, so they can't usefully
> be merged.
>
> However, the unconditional jmp at (1) is 2 bytes, and could fully contain
> its target ret;int3 without even impacting the surrounding padding.  Whether
> it inlines or merges, this drops 4 bytes.
>
> The conditional jump at (2) could be folded in to any of the other exit
> paths, dropping 16 bytes from the total size.
>
> I have no idea how easy/hard this may be to track down, or whether it is
> worth pursuing urgently, but it probably does want looking at, seeing as SLS
> hardening doubles the hit.

Please open a separate bug to track it.  Should shrink-wrap handle it?
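
For the separate report, a self-contained test case along the following lines might be a useful starting point.  This is only a sketch: the struct layouts, the u32 typedef and the two macros are assumptions reverse-fitted to the offsets and constants visible in the quoted disassembly (sid at 0x0, context at 0x8, next at 0x48, "and $0x7f", "cmpl $0x5"), not Xen's actual definitions, so the generated code may not match the listing byte for byte.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

/* Assumed values, inferred from "and $0x7f" and "cmpl $0x5" in the listing. */
#define SIDTAB_HASH_BUCKETS   128
#define SIDTAB_HASH(sid)      ((sid) & (SIDTAB_HASH_BUCKETS - 1))
#define SECINITSID_UNLABELED  5

/* Hypothetical stand-in for the real type: 0x40 opaque, 8-byte aligned bytes. */
struct context {
    uint64_t opaque[8];
};

struct sidtab_node {
    u32 sid;                      /* offset 0x00 */
    struct context context;       /* offset 0x08 */
    struct sidtab_node *next;     /* offset 0x48 on x86-64 */
};

struct sidtab {
    struct sidtab_node **htable;  /* offset 0x00 */
};

/* Body copied from comment #22. */
struct context *sidtab_search(struct sidtab *s, u32 sid)
{
    int hvalue;
    struct sidtab_node *cur;

    if ( !s )
        return NULL;

    hvalue = SIDTAB_HASH(sid);
    cur = s->htable[hvalue];
    while ( cur != NULL && sid > cur->sid )
        cur = cur->next;

    if ( cur == NULL || sid != cur->sid )
    {
        /* Remap invalid SIDs to the unlabeled SID. */
        sid = SECINITSID_UNLABELED;
        hvalue = SIDTAB_HASH(sid);
        cur = s->htable[hvalue];
        while ( cur != NULL && sid > cur->sid )
            cur = cur->next;
        if ( !cur || sid != cur->sid )
            return NULL;
    }

    return &cur->context;
}
```

Compiling this to an object file at -O2 with -mharden-sls=all and disassembling it with objdump -d should show whether the adjacent "ret int3" exits appear outside the Xen build as well; whether they do with these stand-in types is untested here.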