Michael Yanowitz wrote:
> Hello:
>
> I am relatively new to Python and this is my first post on
> this mailing list.
>
> I am confused as to why I am getting size differences in the following
> cases:
>
> >>> print struct.calcsize("I")
> 4
> >>> print struct.calcsize("H")
> 2
> >>> print struct.calcsize("HI")
> 8
> >>> print struct.calcsize("IH")
> 6
>
> Why is it 8 bytes in the third case and why would it be only 6 bytes
> in the last case if it is 8 in the previous?

By default the struct module uses native byte order and alignment, which
may insert padding. In your case, the integer is forced to start on a
4-byte boundary, so two pad bytes must be inserted between the short and
the int. When the int is first, no padding is needed: the short starts on
a 2-byte boundary.

To eliminate the padding you should use any of the options that specify
'standard' alignment instead of native:

In [2]: struct.calcsize('I')
Out[2]: 4

In [3]: struct.calcsize('H')
Out[3]: 2

In [4]: struct.calcsize('HI')
Out[4]: 8

In [5]: struct.calcsize('IH')
Out[5]: 6

In [6]: struct.calcsize('!HI')
Out[6]: 6

In [7]: struct.calcsize('>HI')
Out[7]: 6

In [8]: struct.calcsize('<HI')
Out[8]: 6

> I tried specifying big endian and little endian and they both have
> the same results.

Are you sure? They should use standard alignment as in the example above.

> I suspect, there is some kind of padding involved, but it does not
> seem to be done consistently or in a recognizable method.
>
> I will be reading shorts and longs sent from C into Python
> through a socket.
>
> Suppose I know I am getting 34 bytes, and the last 6 bytes are a 2-byte
> word followed by a 4-byte int, how can I be assured that it will be in that
> format?
>
> In a test, I am sending data in this format:
> PACK_FORMAT = "HBBBBHBBBBBBBBBBBBBBBBBBBBHI"
> which is 34 bytes

Are you sure? Not for me:

In [9]: struct.calcsize('HBBBBHBBBBBBBBBBBBBBBBBBBBHI')
Out[9]: 36

Kent
-- 
http://mail.python.org/mailman/listinfo/python-list
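
For the socket case, here is a rough sketch of how the last 6 bytes of a
message could be decoded with standard alignment. It is only an
illustration: unpack_tail is a made-up helper name, the 28 filler bytes
stand in for the rest of the message, and it assumes the C side sends the
short and the int back to back in network byte order, with no
compiler-inserted padding. It uses Python 3 syntax, unlike the print
statements quoted above.

```python
import struct

# Native mode ('@', the default) follows the C compiler's alignment rules,
# so these two sizes are platform-dependent and may include pad bytes.
native_hi = struct.calcsize("@HI")
native_ih = struct.calcsize("@IH")

# Standard mode ('<', '>', '!' or '=') uses fixed sizes and no padding.
TAIL_FORMAT = "!HI"                        # network order: 2-byte short, 4-byte int
TAIL_SIZE = struct.calcsize(TAIL_FORMAT)   # always 6


def unpack_tail(message):
    """Decode the trailing short and int from a received message.

    Hypothetical helper: assumes the sender wrote the two fields
    unpadded and in network (big-endian) byte order.
    """
    if len(message) < TAIL_SIZE:
        raise ValueError("message is shorter than the 6-byte tail")
    return struct.unpack(TAIL_FORMAT, message[-TAIL_SIZE:])


# Simulated 34-byte message: 28 filler bytes followed by the 6-byte tail.
tail = struct.pack(TAIL_FORMAT, 0x0102, 0x03040506)
message = bytes(28) + tail
word, value = unpack_tail(message)
```

If the C program instead writes a whole struct in one call, the compiler's
padding travels over the wire too, and the Python format would have to
account for it (for example with explicit 'x' pad bytes).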