------- Comment #4 from svfuerst at gmail dot com 2010-04-30 16:33 -------

Argh, the sar trick doesn't work when the number is negative and even. Sorry about the extra noise.
This leaves the following as the best code:

    mov    %rsi,%rdx
    shr    $0x3f,%rdx
    lea    (%rdi,%rdx,1),%rax
    and    $0x1,%eax
    sub    %rdx,%rax
    sbb    %rdx,%rdx

This is still better than the current version. Of course, changing the and instruction will allow faster versions of x%4, x%8, x%16 etc.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43883
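
To make the data flow of the sequence above easier to check, here is a small C model of it. It is only a sketch under assumptions the comment does not state: it assumes the operand is a 128-bit signed value with its low half in %rdi and its high half in %rsi, and that the result comes back in %rdx:%rax (as in the x86-64 System V convention). The names pair128 and mod2_model are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical container for a 128-bit result: lo models %rax, hi models %rdx. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} pair128;

/* Step-by-step model of the six-instruction sequence.
 * Assumption: lo arrives in %rdi and hi in %rsi. */
static pair128 mod2_model(uint64_t lo, uint64_t hi)
{
    uint64_t rdx = hi >> 63;        /* mov %rsi,%rdx ; shr $0x3f,%rdx  -> 1 if negative */
    uint64_t rax = lo + rdx;        /* lea (%rdi,%rdx,1),%rax          -> add the sign bit */
    rax &= 1;                       /* and $0x1,%eax                   -> keep bit 0, clear the rest */
    uint64_t borrow = rax < rdx;    /* carry flag that the sub will set */
    rax -= rdx;                     /* sub %rdx,%rax                   -> remove the bias again */
    rdx = (uint64_t)0 - borrow;     /* sbb %rdx,%rdx                   -> 0 or all ones (sign extension) */

    pair128 r = { rax, rdx };
    return r;
}
```

With this model, -2 (lo = UINT64_MAX - 1, hi = UINT64_MAX) comes out as 0 and -3 comes out as -1, which matches C's truncating x%2. Note that in this model widening the mask alone does not appear to be enough for x%4 and up: the value added before the and, and subtracted after it, would also have to become 2^k - 1 for negative inputs instead of 1.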