>Ben,
 >
 >> Using the union appeared to work well (and I'm primarily interested here
 >> as I want to introduce a variable-size by-copy message for my queue,
 >> which a union makes easier)
 >
 >Yes. I was thinking of introducing the union myself. The problem is that
 >changing the zmq_msg_t layout would break some of the language bindings,
 >so it has to be done carefully...

Completely agree. I have made the changes but am not experienced enough to
submit them. I also want to add a new variable-length by-copy message type
for my own InProc queue.

Most of them just use the get-size and get-data-pointer calls, and I found
it pretty easy to change the code.
 

>
 >> The pack(1) was interesting and had a greater cost than I expected. That
 >> being said, it made messages smaller, which provided a benefit to the
 >> smaller ref-based tests.
 >
 >At the level of optimisation going on here there are no magic solutions.
 >The structure of zmq_msg_t has to be crafted by hand to get the best
 >performance.

Yes, I do think sizeof(zmq_msg_t) should be a multiple of 16, or preferably 32.

 >
 >> A better option may be to just use an int for a flags-and-size field:
 >> use the first 8 bits for size and just cast it to a byte from an int; the
 >> flags can work on the higher bits. Alternatively, you can make the size
 >> 24 bits and the flags 8 bits, removing the need for a separate size field
 >> in the ref-based message (unless anyone can see a reason for > 24-bit
 >> messages?).
 >>
 >> You then have small messages up to size 28 that still fit in 32 bytes
 >> (even on 64-bit machines). I may try that now and force 16-byte
 >> alignment...
 >
 >I thought of something like this:
 >
 >struct zmq_msg_t
 >{
 >     union {
 >         void *ptr;
 >         struct {
 >             ...
 >         } vsm;
 >     } content;
 >     unsigned char type;
 >     unsigned char flags;
 >};
 >
 >In theory, it would make VSM messages smaller (no content pointer).
 >However, in practice, you'll get 6 bytes (x86_64) of padding following
 >the structure -- because of compiler aligning it to CPU word boundary.
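
The padding described in the quote above can be checked with a small C++
sketch. The struct name demo_msg and the 4-byte vsm payload are placeholders
of mine (the real vsm contents are elided in the quoted snippet), and the
expected figure assumes a platform where pointers are aligned to their own
size, as on x86_64.

```cpp
#include <cstddef>

// Hypothetical stand-in for the quoted layout; the vsm payload is a placeholder.
struct demo_msg {
    union {
        void *ptr;
        struct {
            unsigned char data[4];  // placeholder, not the real vsm contents
        } vsm;
    } content;
    unsigned char type;
    unsigned char flags;
};

// Bytes the compiler appends after the last member so that sizeof becomes
// a multiple of the struct's alignment.
constexpr std::size_t trailing_padding()
{
    return sizeof(demo_msg) - (offsetof(demo_msg, flags) + sizeof(unsigned char));
}
```

On a typical x86_64 build, trailing_padding() comes out as 6 and
sizeof(demo_msg) as 16, which matches the figure in the quote.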

Yes, that's why a sizeof() that is a multiple of a paragraph (16 bytes) will
help. Here is what I now have, with a maximum VSM size of 12, 28, 44 or 60.
It should work just as well on 64-bit, though it is probably better to make
sure it is packed to 32 bits. It is very similar to yours.

        #pragma pack(4)  // ensures fields are not packed on 8-byte
                         // boundaries
        typedef struct
        {
                void *content;
        } zmq_m_t;

        typedef struct
        {
                unsigned char vsm_data [ZMQ_MAX_VSM_SIZE];
        } zmq_vsm_t;

        typedef union
        {
                zmq_vsm_t  vsm;
                zmq_m_t    m;
                zmq_var    var;  // variable-sized by-copy messages
        } zmq_msgdata_t;

        __declspec(align(16))  // TODO make portable; good for SSE2
        typedef struct
        {
                // on some compilers the data field will be first and on a
                // boundary
                zmq_msgdata_t  msgs;
                unsigned flags : 8;
                unsigned vsm_size : 24;

                inline bool Is_VSM()
                {
                        return (flags & ZMQ_MSG_VSM) != 0;
                }
        } zmq_msg_t;
        #pragma pack()
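
One way the "TODO make portable" might be addressed, assuming a C++11
compiler is available, is alignas(16) in place of __declspec(align(16)), with
static_assert checking the result. The sketch below is not the code above:
names prefixed demo_ are mine, the var member is left out because its type is
not shown, and DEMO_MAX_VSM_SIZE is set to 28, one of the sizes mentioned.
#pragma pack(push, 4) is a compiler extension, though GCC, Clang and MSVC all
accept it.

```cpp
#include <cstddef>

constexpr std::size_t DEMO_MAX_VSM_SIZE = 28;  // assumed; one of 12, 28, 44, 60

// Cap member alignment at 4 so the pointer does not force the union up to a
// multiple of 8 bytes on 64-bit targets.
#pragma pack(push, 4)
union demo_msgdata_t {
    unsigned char vsm_data[DEMO_MAX_VSM_SIZE];  // small message stored inline
    void *content;                              // reference to a larger message
};
#pragma pack(pop)

struct alignas(16) demo_msg_t {
    demo_msgdata_t msgs;
    unsigned flags : 8;      // both bit-fields share one 32-bit unit
    unsigned vsm_size : 24;  // on GCC, Clang and MSVC
};

static_assert(sizeof(demo_msg_t) % 16 == 0, "size should be a multiple of 16");
static_assert(alignof(demo_msg_t) == 16, "alignas(16) should be honoured");
```

Under these assumptions the struct should come out at 32 bytes on both 32-bit
and 64-bit builds. Without the pack(4), a 64-bit compiler would typically
round the union up to 32 bytes and the whole struct would grow to 48.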

Note that since it was free due to 32-bit alignment (and given the slight
drop from using 8-bit pack(1)), I put the message size here and removed the
other size field. This means the maximum message size drops from 4 GB to
16 MB, but personally I think 16 MB is more than enough for a single message.
The bit fields were faster than packed byte fields.
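
To make the 16 MB limit concrete, here is a minimal sketch of the packed
header word. demo_header, DEMO_MSG_VSM and set_size are names of mine and the
flag value is made up; only the 8/24 split comes from the struct above.

```cpp
#include <cstddef>

constexpr unsigned DEMO_MSG_VSM = 0x01;                      // hypothetical flag value
constexpr std::size_t DEMO_MAX_MSG_SIZE = (1u << 24) - 1;    // largest 24-bit value

struct demo_header {
    unsigned flags : 8;
    unsigned size : 24;

    bool is_vsm() const { return (flags & DEMO_MSG_VSM) != 0; }
};

// Store a size only if it fits in 24 bits; a larger value would silently
// wrap modulo 2^24 if assigned to the bit-field directly.
inline bool set_size(demo_header &h, std::size_t n)
{
    if (n > DEMO_MAX_MSG_SIZE)
        return false;
    h.size = static_cast<unsigned>(n);
    return true;
}
```

The largest size that fits is 16,777,215 bytes, so anything that sets the
size would need a check like this once the separate size field is gone.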

Regards, 

Ben 


