RE: struct size confusion:
Thanks for your and everyone else's feedback. I got it to work now by
prefixing the PACK_FORMAT with "!". I previously thought I could only
use the "!" with the unpack. I still don't fully understand the byte
alignment stuff (I am sure I will get it eventually), but I am content
that it is working now.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Fredrik Lundh
Sent: Wednesday, March 22, 2006 9:28 AM
To: python-list@python.org
Subject: Re: struct size confusion:

Michael Yanowitz wrote:

> I am relatively new to Python and this is my first post on
> this mailing list.
>
> I am confused as to why I am getting size differences in the following
> cases:
>
> >>> print struct.calcsize("I")
> 4
> >>> print struct.calcsize("H")
> 2
> >>> print struct.calcsize("HI")
> 8
> >>> print struct.calcsize("IH")
> 6
>
> Why is it 8 bytes in the third case and why would it be only 6 bytes
> in the last case if it is 8 in the previous?

because modern platforms tend to use an alignment equal to the size of
the item; 2-byte objects are stored at even addresses, 4-byte objects
are stored at addresses that are multiples of four, etc.

in other words, HI is stored as 2 bytes H data plus 2 bytes padding plus
four bytes I data, while IH is four bytes I data, no padding, and 2 bytes
H data.

> I tried specifying big endian and little endian and they both have
> the same results.

are you sure? (see below)

> I suspect, there is some kind of padding involved, but it does not
> seem to be done consistently or in a recognizable method.

the alignment options are described in the library reference:

http://docs.python.org/lib/module-struct.html

default is native byte order, native padding:

>>> struct.calcsize("IH")
6
>>> struct.calcsize("HI")
8

to specify other byte orders, use a prefix character. this also
disables padding. e.g.

>>> struct.calcsize("!IH")
6
>>> struct.calcsize("!HI")
6

--
http://mail.python.org/mailman/listinfo/python-list
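The fix described in this message, using the same "!"-prefixed format on both the packing and the unpacking side, can be sketched roughly as follows. The format string and the helper names `encode_record` and `decode_record` are made up for illustration (the thread does not show the real ones), and the code is written for Python 3 rather than the Python 2 used in the thread.

```python
import struct

# Hypothetical record layout: three unsigned shorts followed by an unsigned int.
# "!" selects network (big-endian) byte order with standard sizes and no padding.
RECORD_FORMAT = "!HHHI"


def encode_record(a, b, c, d):
    """Pack four values using the shared wire format."""
    return struct.pack(RECORD_FORMAT, a, b, c, d)


def decode_record(payload):
    """Unpack a payload that was produced with the same wire format."""
    return struct.unpack(RECORD_FORMAT, payload)


payload = encode_record(1, 2, 3, 4)
print(len(payload))            # 10: 3 * 2 bytes + 4 bytes, no pad bytes
print(decode_record(payload))  # (1, 2, 3, 4)
```

Because both directions share one format constant, the packed size and the expected size cannot drift apart.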
Re: struct size confusion:
Michael Yanowitz wrote:

> I am relatively new to Python and this is my first post on
> this mailing list.
>
> I am confused as to why I am getting size differences in the following
> cases:
>
> >>> print struct.calcsize("I")
> 4
> >>> print struct.calcsize("H")
> 2
> >>> print struct.calcsize("HI")
> 8
> >>> print struct.calcsize("IH")
> 6
>
> Why is it 8 bytes in the third case and why would it be only 6 bytes
> in the last case if it is 8 in the previous?

because modern platforms tend to use an alignment equal to the size of
the item; 2-byte objects are stored at even addresses, 4-byte objects
are stored at addresses that are multiples of four, etc.

in other words, HI is stored as 2 bytes H data plus 2 bytes padding plus
four bytes I data, while IH is four bytes I data, no padding, and 2 bytes
H data.

> I tried specifying big endian and little endian and they both have
> the same results.

are you sure? (see below)

> I suspect, there is some kind of padding involved, but it does not
> seem to be done consistently or in a recognizable method.

the alignment options are described in the library reference:

http://docs.python.org/lib/module-struct.html

default is native byte order, native padding:

>>> struct.calcsize("IH")
6
>>> struct.calcsize("HI")
8

to specify other byte orders, use a prefix character. this also
disables padding. e.g.

>>> struct.calcsize("!IH")
6
>>> struct.calcsize("!HI")
6
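The alignment rule described above (each item starts at an offset that is a multiple of its own size) can be worked through with a few lines of arithmetic. This is only a model of the rule as stated in the reply, not the struct module's actual implementation, and `layout_size` is a hypothetical helper name.

```python
def layout_size(item_sizes):
    """Total size when every item is aligned to a multiple of its own size."""
    offset = 0
    for size in item_sizes:
        offset += (-offset) % size  # pad bytes needed to reach the next multiple
        offset += size              # the item itself
    return offset


print(layout_size([2, 4]))  # "HI": 2 bytes + 2 pad bytes + 4 bytes = 8
print(layout_size([4, 2]))  # "IH": 4 bytes + 2 bytes, no padding = 6
```

Note that no padding is added after the last item, which is why "IH" stays at 6 bytes.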
Re: struct size confusion:
Michael Yanowitz wrote:
> Hello:
>
> I am relatively new to Python and this is my first post on
> this mailing list.
>
> I am confused as to why I am getting size differences in the following
> cases:
>
> >>> print struct.calcsize("I")
> 4
> >>> print struct.calcsize("H")
> 2
> >>> print struct.calcsize("HI")
> 8
> >>> print struct.calcsize("IH")
> 6
>
> Why is it 8 bytes in the third case and why would it be only 6 bytes
> in the last case if it is 8 in the previous?

By default the struct module uses native byte-order and alignment which
may insert padding. In your case, the integer is forced to start on a
4-byte boundary so two pad bytes must be inserted between the short and
the int. When the int is first no padding is needed - the short starts
on a 2-byte boundary.

To eliminate the padding you should use any of the options that specify
'standard' alignment instead of native:

In [2]: struct.calcsize('I')
Out[2]: 4

In [3]: struct.calcsize('H')
Out[3]: 2

In [4]: struct.calcsize('HI')
Out[4]: 8

In [5]: struct.calcsize('IH')
Out[5]: 6

In [6]: struct.calcsize('!HI')
Out[6]: 6

In [7]: struct.calcsize('>HI')
Out[7]: 6

In [8]: struct.calcsize('<HI')
Out[8]: 6

> I tried specifying big endian and little endian and they both have
> the same results.

Are you sure? They should use standard alignment as in the example above.

> I suspect, there is some kind of padding involved, but it does not
> seem to be done consistently or in a recognizable method.
>
> I will be reading shorts and longs sent from C into Python
> through a socket.
>
> Suppose I know I am getting 34 bytes, and the last 6 bytes are a 2-byte
> word followed by a 4-byte int, how can I be assured that it will be in that
> format?
>
> In a test, I am sending data in this format:
> PACK_FORMAT = "HHHI"
> which is 34 bytes

Are you sure? Not for me:

In [9]: struct.calcsize('HHHI')
Out[9]: 36

Kent
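For the case raised in the question, where the last 6 bytes of a 34-byte message hold a 2-byte word followed by a 4-byte int, a minimal Python 3 sketch might look like this. It assumes the sender writes those fields in network byte order with no padding; the name `parse_trailer` and the 28 zero bytes standing in for the rest of the message are invented for the example.

```python
import struct

# Assumed wire layout of the trailer: network byte order, no padding.
TRAILER_FORMAT = "!HI"
TRAILER_SIZE = struct.calcsize(TRAILER_FORMAT)  # 6 with any standard prefix


def parse_trailer(message):
    """Decode the 2-byte word and 4-byte int at the end of the message."""
    return struct.unpack_from(TRAILER_FORMAT, message, len(message) - TRAILER_SIZE)


# 28 placeholder bytes stand in for the leading part of the message.
message = bytes(28) + struct.pack(TRAILER_FORMAT, 513, 67305985)
print(len(message))            # 34
print(parse_trailer(message))  # (513, 67305985)
```

Using `struct.unpack_from` with an explicit offset avoids having to describe the leading 28 bytes in the format string at all.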
Re: struct size confusion:
Michael Yanowitz wrote:

> Why is it 8 bytes in the third case and why would it be only 6 bytes
> in the last case if it is 8 in the previous?

From TFM:

"""
Native size and alignment are determined using the C compiler's sizeof
expression. This is always combined with native byte order.

Standard size and alignment are as follows: no alignment is required for
any type (so you have to use pad bytes); short is 2 bytes; int and long
are 4 bytes; long long (__int64 on Windows) is 8 bytes; float and double
are 32-bit and 64-bit IEEE floating point numbers, respectively.
"""

See this for how to achieve the desired results (on my system at least):

>>> print struct.calcsize("I")
4
>>> print struct.calcsize("H")
2
>>> print struct.calcsize("HI")
8
>>> print struct.calcsize("=HI")
6
>>> print struct.calcsize("=IH")
6

Regards,

Diez
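The difference between the standard prefixes can be checked directly. The sketch below (Python 3) compares the sizes for "=", "<", ">" and "!", confirms that "=" keeps the machine's own byte order, and shows the "x" pad code that the quoted documentation alludes to when it says you have to use pad bytes. The variable names are illustrative only.

```python
import struct
import sys

# Sizes for every standard prefix: none of them inserts padding.
sizes = {prefix: struct.calcsize(prefix + "HI") for prefix in "=<>!"}
print(sizes)  # every prefix gives 6

# "=" keeps the machine's byte order; "!" always means big-endian.
native_order = "<" if sys.byteorder == "little" else ">"
same_as_native = struct.pack("=HI", 1, 2) == struct.pack(native_order + "HI", 1, 2)
print(same_as_native)  # True

# If the peer expects padding, it has to be spelled out with "x" pad bytes.
padded_size = struct.calcsize("=H2xI")
print(padded_size)  # 8: 2 bytes + 2 pad bytes + 4 bytes
```

For data crossing a network, "!" is usually the safer choice than "=", since "=" still depends on the byte order of the machine running the code.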
struct size confusion:
Hello:

I am relatively new to Python and this is my first post on
this mailing list.

I am confused as to why I am getting size differences in the following
cases:

>>> print struct.calcsize("I")
4
>>> print struct.calcsize("H")
2
>>> print struct.calcsize("HI")
8
>>> print struct.calcsize("IH")
6

Why is it 8 bytes in the third case and why would it be only 6 bytes
in the last case if it is 8 in the previous?

I tried specifying big endian and little endian and they both have
the same results.

I suspect, there is some kind of padding involved, but it does not
seem to be done consistently or in a recognizable method.

I will be reading shorts and longs sent from C into Python
through a socket.

Suppose I know I am getting 34 bytes, and the last 6 bytes are a 2-byte
word followed by a 4-byte int, how can I be assured that it will be in that
format?

In a test, I am sending data in this format:
PACK_FORMAT = "HHHI"
which is 34 bytes

However, when I receive the data, I am using the format:
UNPACK_FORMAT = "!HHHHI"
which has the extra H in the second to last position to make them
compatible, but that makes it 36 bytes. I am trying to come up with
some explanation as to where the extra 2 bytes come from.

Thanks in advance:
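One way to be sure of the layout when reading from a socket is to fix a single explicit format for both ends and to read exactly that many bytes before unpacking. The following is a minimal Python 3 sketch under stated assumptions: the format "!HI" is a stand-in (the real message layout is not shown in the thread), `recv_exact` is a hypothetical helper, and a local `socket.socketpair()` plays the role of the C sender, which would need to write the same fields in network byte order (for example via `htons`/`htonl`) without struct padding.

```python
import socket
import struct

# Hypothetical wire format: "!" = network byte order, standard sizes, no padding.
WIRE = struct.Struct("!HI")


def recv_exact(sock, count):
    """Read exactly `count` bytes; a single recv() may return fewer."""
    chunks = []
    remaining = count
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before the full record arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# A local socket pair stands in for the C sender and the Python receiver.
sender, receiver = socket.socketpair()
try:
    sender.sendall(WIRE.pack(7, 100000))
    word, number = WIRE.unpack(recv_exact(receiver, WIRE.size))
finally:
    sender.close()
    receiver.close()

print(word, number)  # 7 100000
```

Because `WIRE.size` is derived from the format itself, the number of bytes read always matches what `unpack` expects.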