On x86 processors, I think the compiler should recognize bitwise operations that access a single bit and convert them to the BT family of instructions (BT, BTS, BTR, BTC).
For example:

    x |= 1<<y;          compiles to BTS instead of SHL/OR
    x ^= 1<<y;          compiles to BTC instead of SHL/XOR
    x &= ~(1<<y);       compiles to BTR instead of SHL/NOT/AND
    if (x&(1<<y))...    compiles to BT instead of SHL/AND

Especially when y is not known at compile time, this should save instructions and improve performance, since the shift count would otherwise have to be moved into CL and the mask built in a separate register.
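To make the four patterns easy to try out, here is a minimal C sketch that wraps each one in a small function with a runtime bit index. The function names are hypothetical and chosen only for illustration. Whether a particular compiler actually emits BT, BTS, BTR or BTC for them depends on the compiler, its version and the optimization flags, so the comments name the instruction each pattern is a candidate for, not what will necessarily be generated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper names, for illustration only.
   y must be in the range 0..31; a larger shift count is
   undefined behavior in C. */

/* x |= 1<<y;  candidate for BTS (set bit y) */
static inline uint32_t bit_set(uint32_t x, unsigned y)
{
    return x | (UINT32_C(1) << y);
}

/* x ^= 1<<y;  candidate for BTC (complement bit y) */
static inline uint32_t bit_toggle(uint32_t x, unsigned y)
{
    return x ^ (UINT32_C(1) << y);
}

/* x &= ~(1<<y);  candidate for BTR (reset bit y) */
static inline uint32_t bit_clear(uint32_t x, unsigned y)
{
    return x & ~(UINT32_C(1) << y);
}

/* if (x&(1<<y))...  candidate for BT (test bit y) */
static inline bool bit_test(uint32_t x, unsigned y)
{
    return (x & (UINT32_C(1) << y)) != 0;
}
```

Calling these from a function whose y comes from a runtime argument and inspecting the generated assembly (for example with gcc -O2 -S) shows which instruction sequence a given compiler chooses.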