> Instead of calling memset:
>
> ffffffff8100cd8d:       e8 0e 15 7a 00          callq  ffffffff817ae2a0 <__memset>
>
> and having a JMP inside it depending on the feature supported, let's simply
> have the REP; STOSB directly in the code:
>
> ...
> ffffffff81000442:       4c 89 d7                mov    %r10,%rdi
> ffffffff81000445:       b9 00 10 00 00          mov    $0x1000,%ecx
>
>                         <---- new memset
> ffffffff8100044a:       f3 aa                   rep stos %al,%es:(%rdi)
> ffffffff8100044c:       90                      nop
> ffffffff8100044d:       90                      nop
> ffffffff8100044e:       90                      nop

You can fit the entire "xor eax, eax; rep stosb" inside the call instruction.

> 	/* clobbers used by memset_orig() and memset_rep_good() */
> 	: "rsi", "rdx", "r8", "r9", "memory");

eh... I'd just drop it. These registers screw up everything.

Time to rebase memset0().