>Oliver Smith wrote:
 >> On 7/26/2010 2:37 AM, Martin Sustrik wrote:
 >>> In theory, it would make VSM messages smaller (no content pointer).
 >>> However, in practice, you'll get 6 bytes (x86_64) of padding following
 >>> the structure -- because of compiler aligning it to CPU word boundary.
 >>>
 >> #pragma pack(push, 1)
 >> ?
 >
 >Then you'll get unaligned data. CPU will be busy rotating bits back and
 >forth :)
 >

[Ben Kloosterman] Yes, this is correct, but it's worth noting that having
sizeof() = 32 bytes is still better despite this extra cost, as the alignment
cost is only incurred for flags and size.
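
To make the padding trade-off concrete, here is a minimal sketch. The two
structs are hypothetical stand-ins (a pointer followed by two one-byte
fields), not the actual zmq_msg_t; the helper names are made up for
illustration. It assumes a compiler that supports #pragma pack (GCC, Clang
and MSVC do).

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical header with natural alignment: the compiler pads the tail
 * so that `content` stays word-aligned in every element of an array. */
struct natural_hdr {
    void *content;
    unsigned char flags;
    unsigned char size;
};

/* Same fields with pack(1): no padding at all. */
#pragma pack(push, 1)
struct packed_hdr {
    void *content;
    unsigned char flags;
    unsigned char size;
};
#pragma pack(pop)

static_assert(sizeof(struct packed_hdr) == sizeof(void *) + 2,
              "pack(1) should leave no padding");

/* Trailing padding the compiler added to the natural layout. */
static inline size_t tail_padding(void)
{
    return sizeof(struct natural_hdr) - sizeof(struct packed_hdr);
}

/* Byte offset of `content` in element 1 of an array of packed headers. */
static inline size_t second_content_offset(void)
{
    return sizeof(struct packed_hdr) + offsetof(struct packed_hdr, content);
}
```

With 8-byte pointers, tail_padding() comes out as 6 bytes, and in an array of
packed headers the second element's pointer lands at offset 10, which is not
a multiple of the word size. That is the unaligned access Martin is warning
about.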

Better still is the approach in the code I posted:

- align the struct on 16 (default on some archs anyway)
- use bit fields for the size and flags, taken out of a 32-bit value (I was
surprised this was much faster)
- use pack(4)
- use a union to remove the pointer
- put the data at the start so it is 16-byte aligned and can use SSE2 (if
the compiler is smart enough)
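
The points above can be sketched roughly as follows. This is not the code
from the earlier post, only an illustration of the same ideas; the type
names, field names, the 28-byte capacity and the msg_init_small() helper are
all assumptions. It relies on C11 (_Alignas, static_assert) and on
#pragma pack support.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* uint32_t */
#include <string.h>   /* memcpy */

/* Hypothetical capacity: 32 bytes total minus the 4-byte size/flags word. */
#define MAX_VSM_SIZE 28

#pragma pack(push, 4)
typedef struct {
    union {
        unsigned char vsm_data[MAX_VSM_SIZE]; /* inline payload at offset 0 */
        void *content;                        /* large-message pointer reuses the space */
    } u;
    uint32_t vsm_size : 8;    /* size and flags share one 32-bit word */
    uint32_t flags    : 8;
    uint32_t reserved : 16;
} msg_body_t;
#pragma pack(pop)

/* Declared outside the pack region so the 16-byte alignment request applies. */
typedef struct {
    _Alignas(16) msg_body_t body;
} msg_t;

static_assert(sizeof(msg_t) == 32, "message should occupy exactly 32 bytes");
static_assert(offsetof(msg_t, body.u.vsm_data) == 0, "payload must come first");

/* Copy a small payload inline; returns 0 on success, -1 if it does not fit. */
static inline int msg_init_small(msg_t *msg, const void *data, size_t len)
{
    if (len > MAX_VSM_SIZE)
        return -1;
    memcpy(msg->body.u.vsm_data, data, len);
    msg->body.vsm_size = (uint32_t)len;
    msg->body.flags = 0;
    msg->body.reserved = 0;
    return 0;
}
```

The pack(4) is what keeps the total at 32 bytes: on a typical 64-bit ABI the
union would otherwise take the pointer's 8-byte alignment and be padded from
28 to 32 bytes on its own, pushing the struct past 32. Whether the compiler
actually emits SSE2 moves for the 16-byte-aligned payload depends on the
compiler and flags.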

Testing shows a consistent improvement of 1-4%, and it should work well on
different archs.

Regards, 

Ben 

_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
