Hi David & Hannes, This conversation is veering off course. I think this doesn't really matter at all. Gcc converts u64 into essentially a pair of u32 on 32-bit platforms, so the alignment requirements for 32-bit is at a maximum 32 bits. On 64-bit platforms the alignment requirements are related at a maximum to the biggest register size, so 64-bit alignment. For this reason, no matter the behavior of __aligned(8), we're okay. Likewise, even without __aligned(8), if gcc aligns structs by their biggest member, then we get 4 byte alignment on 32-bit and 8 byte alignment on 64-bit, which is fine. There's no 32-bit platform that will trap on a 64-bit unaligned access because there's no such thing as a 64-bit access there. In short, we're fine.
(The reason in6_addr aligns itself to 4 bytes on 64-bit platforms is that it's defined as being u32 blah[4]. If we added a u64 blah[2], we'd get 8 byte alignment, but that's not in the header. Feel free to start a new thread about this issue if you feel this ought to be added for whatever reason.) One optimization that's been suggested on this list is that instead of u8 key[16] and requiring the alignment attribute, I should just use u64 key[2]. This seems reasonable to me, and it will also save the endian conversion call. These keys generally aren't transmitted over a network, so I don't think a byte-wise encoding is particularly important. The other suggestion I've seen is that I make the functions take a const void * instead of a const u8 * for the data, in order to save ugly casts. I'll do this too. Meanwhile Linus has condemned our 4dwords/2qwords naming, and I'll need to think of something different. The best I can think of right now is siphash_4_u32/siphash_2_u64, but I don't find it especially pretty. Open to suggestions. Regards, Jason