ctypes.Structure is *literally* the interface to the C struct, which, as Chris mentions, has fixed offsets for all members. I don't think that should (can?) be altered.
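
To make the fixed-offset point concrete, here is a minimal sketch. The Record class and its two fields are made up for illustration and do not come from any real format; the only claim is that ctypes wants every field size at class-definition time.

```python
import ctypes


class Record(ctypes.Structure):
    # Hypothetical layout, for illustration only: a 4-byte length
    # followed by a payload whose size must be fixed up front.
    _fields_ = [
        ("length", ctypes.c_uint32),
        ("payload", ctypes.c_char * 12),
    ]


# Each field descriptor carries an offset and size fixed by the class.
print(Record.length.offset, Record.length.size)    # 0 4
print(Record.payload.offset, Record.payload.size)  # 4 12
print(ctypes.sizeof(Record))                       # 16
```

There is no way to say "payload is as long as length says", which I believe is the same limitation struct has.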
In file formats (beyond network protocols) the string-size + variable-length-string motif comes up often, and I am frequently re-implementing the two-line read-an-int + read-{}.format-bytes.

On Thu, Jan 19, 2017 at 12:17 PM, Joao S. O. Bueno <jsbu...@python.org.br> wrote:

> I am for upgrading struct to these, if possible.
>
> But besides my +1, I am writing in to remind folks that there is another
> "struct" model in the stdlib:
>
> ctypes.Structure -
>
> For reading a lot of records with the same structure it is much more handy
> than struct, since it gives one a suitable Python object on instantiation.
>
> However, it also can't handle variable-length fields automatically.
>
> But maybe, the improvement could be made on that side, or another package
> altogether that works more like it than current "struct".
>
> On 19 January 2017 at 16:08, Elizabeth Myers <elizab...@interlinked.me>
> wrote:
> > On 19/01/17 06:47, Elizabeth Myers wrote:
> >> On 19/01/17 05:58, Rhodri James wrote:
> >>> On 19/01/17 08:31, Mark Dickinson wrote:
> >>>> On Thu, Jan 19, 2017 at 1:27 AM, Steven D'Aprano <st...@pearwood.info>
> >>>> wrote:
> >>>>> [...] struct already supports
> >>>>> variable-width formats.
> >>>>
> >>>> Unfortunately, that's not really true: the Pascal strings it supports
> >>>> are in some sense variable length, but are stored in a fixed-width
> >>>> field. The internals of the struct module rely on each field starting
> >>>> at a fixed offset, computable directly from the format string. I don't
> >>>> think variable-length fields would be a good fit for the current
> >>>> design of the struct module.
> >>>>
> >>>> For the OP's use-case, I'd suggest a library that sits on top of the
> >>>> struct module, rather than an expansion to the struct module itself.
> >>>
> >>> Unfortunately as the OP explained, this makes the struct module a poor
> >>> fit for protocol decoding, even as a base layer for something.
> >>> It's one of the things I use Python for quite frequently, and I always
> >>> end up rolling my own and discarding struct entirely.
> >>>
> >>
> >> Yes, for variable-length fields the struct module is worse than useless:
> >> it actually reduces clarity a little. Consider:
> >>
> >> >>> test_bytes = b'\x00\x00\x00\x0chello world!'
> >>
> >> With this, you can do:
> >>
> >> >>> length = int.from_bytes(test_bytes[:4], 'big')
> >> >>> string = test_bytes[4:4 + length]
> >>
> >> or you can do:
> >>
> >> >>> length = struct.unpack_from('!I', test_bytes)[0]
> >> >>> string = struct.unpack_from('{}s'.format(length), test_bytes, 4)[0]
> >>
> >> Which looks more readable without consulting the docs? ;)
> >>
> >> Building anything on top of the struct library like this would lead to
> >> worse-looking code for minimal gains in efficiency. To quote Jamie
> >> Zawinski, it is like building a bookshelf out of mashed potatoes as it
> >> stands.
> >>
> >> If we had an extension similar to netstruct:
> >>
> >> >>> length, string = struct.unpack('!I$', test_bytes)
> >>
> >> MUCH improved readability, and also less verbose. :)
> >
> > I also didn't mention that when you are unpacking iteratively (e.g., you
> > have multiple strings), the code becomes a bit more hairy:
> >
> > >>> test_bytes = b'\x00\x05hello\x00\x07goodbye\x00\x04test'
> > >>> offset = 0
> > >>> while offset < len(test_bytes):
> > ...     length = struct.unpack_from('!H', test_bytes, offset)[0]
> > ...     offset += 2
> > ...     string = struct.unpack_from('{}s'.format(length), test_bytes, offset)[0]
> > ...     offset += length
> >
> > It actually gets a lot worse when you have to unpack a set of strings in
> > a context-sensitive manner. You have to be sure to update the offset
> > constantly so you can always unpack strings appropriately. Yuck!
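
For what it's worth, the offset bookkeeping in the loop above can be hidden behind a small generator. This is only a sketch of the "library on top of struct" approach: iter_prefixed and its prefix_format parameter are hypothetical names of mine, and nothing like them exists in the struct module today.

```python
import struct


def iter_prefixed(data, prefix_format='!H'):
    """Yield each length-prefixed byte string found in data.

    Hypothetical helper, not part of the struct module. prefix_format
    is a struct format for a single integer length field.
    """
    prefix = struct.Struct(prefix_format)
    offset = 0
    while offset < len(data):
        # Read the length field, then slice out that many bytes.
        (length,) = prefix.unpack_from(data, offset)
        offset += prefix.size
        end = offset + length
        if end > len(data):
            raise ValueError('truncated string at offset {}'.format(offset))
        yield data[offset:end]
        offset = end


test_bytes = b'\x00\x05hello\x00\x07goodbye\x00\x04test'
print(list(iter_prefixed(test_bytes)))  # [b'hello', b'goodbye', b'test']
```

It does not help with the context-sensitive case, where what to read next depends on what was just read, so it is not an argument against the proposal.
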
> >
> > It's worth mentioning that a few years ago, a coworker and I found
> > ourselves needing variable-length strings in the context of a binary
> > protocol (DHCP), and wound up abandoning the struct module entirely
> > because it was unsuitable. My co-worker said the same thing I did: "it's
> > like building a bookshelf out of mashed potatoes."
> >
> > I do understand it might require a possible major rewrite or major
> > changes to the struct module, but in the long run, I think it's worth it
> > (especially because the struct module is not all that big in scope). As
> > it stands, the struct module simply is not suited for protocols where
> > you have variable-length strings, and in my experience, that is the vast
> > majority of modern binary protocols on the Internet.
> >
> > --
> > Elizabeth
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas@python.org
> > https://mail.python.org/mailman/listinfo/python-ideas
> > Code of Conduct: http://python.org/psf/codeofconduct/