On 5 Aug 2009, at 16:46, Martin P. Hellwig wrote:
Hi List,

On several occasions I have needed (and built) a parser that reads a binary piece of data with a custom structure. For example (bogus one):

BE
+---------+---------+-------------+-------------+------+--------+
| Version | Command | Instruction | Data Length | Data | Filler |
+---------+---------+-------------+-------------+------+--------+
Version: 6 bits
Command: 4 bits
Instruction: 5 bits
Data Length: 5 bits
Data: 0-31 bits
Filler: 0 bits appended to make the packet length divisible by 8

What I usually do is read the packet in binary mode, convert the output to a concatenated 'binary string' (i.e. '0101011000110') and then use slice indices to get the right data portions. Depending on what I need to do with these portions I convert them to whatever is handy (usually an integer).
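
A minimal sketch of that bit-string approach for the bogus layout above, written for Python 3. The function name parse_packet_bits and the dictionary keys are hypothetical, chosen only for illustration; the slice offsets follow from the field widths (6 + 4 + 5 + 5 = 20 header bits).

```python
def parse_packet_bits(raw: bytes) -> dict:
    """Parse the bogus packet by slicing a '0'/'1' string (hypothetical helper)."""
    # Render every byte as 8 binary digits, e.g. 0x0E -> '00001110'.
    bits = ''.join(format(byte, '08b') for byte in raw)

    version = int(bits[0:6], 2)        # 6 bits
    command = int(bits[6:10], 2)       # 4 bits
    instruction = int(bits[10:15], 2)  # 5 bits
    data_length = int(bits[15:20], 2)  # 5 bits, so 0-31

    # Data follows the 20-bit header; anything after it is filler.
    data_bits = bits[20:20 + data_length]
    data = int(data_bits, 2) if data_bits else 0

    return {
        'version': version,
        'command': command,
        'instruction': instruction,
        'data_length': data_length,
        'data': data,
    }
```

For example, the three bytes 0x0E 0x62 0x4B decode to version 3, command 9, instruction 17, data length 4 and data 0b1011.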

This works out fine for me. Most of the time I also put the ASCII art diagram of this 'protocol' as a comment in the code, making it more readable/understandable.

Though there are a couple of things that bother me about my approach:
- This seems such a general problem that I think there must already be a general Pythonic solution.
- Using a string for the binary representation takes at least 8 times more memory for the packet than strictly necessary.
- Seems to need a lot of prep work before doing the actual parsing.

Any suggestion is greatly appreciated.
The gold standard for binary parsing (and serialization) is probably Erlang's bit syntax, but as far as Python goes you might be interested in Hachoir (http://hachoir.org/ but it seems down right now).

It's not going to match your second point, but it can probably help with the rest (caveat: I haven't used hachoir personally).
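
On the memory point specifically, here is a minimal sketch that does not use Hachoir at all: it treats the whole packet as one big-endian integer and pulls fields out with shifts and masks, so no one-character-per-bit string is built. It assumes Python 3 (for int.from_bytes); the name parse_packet_int, the inner take helper and the dictionary keys are hypothetical.

```python
def parse_packet_int(raw: bytes) -> dict:
    """Parse the bogus packet with shifts and masks (hypothetical helper)."""
    total_bits = len(raw) * 8
    if total_bits < 20:
        raise ValueError('packet shorter than the 20-bit header')

    # The layout is big-endian, so the first field sits in the top bits.
    value = int.from_bytes(raw, 'big')

    def take(offset: int, width: int) -> int:
        # Extract `width` bits starting `offset` bits from the top.
        shift = total_bits - offset - width
        return (value >> shift) & ((1 << width) - 1)

    version = take(0, 6)
    command = take(6, 4)
    instruction = take(10, 5)
    data_length = take(15, 5)

    if 20 + data_length > total_bits:
        raise ValueError('data length exceeds packet size')
    data = take(20, data_length)  # a width of 0 yields 0

    return {
        'version': version,
        'command': command,
        'instruction': instruction,
        'data_length': data_length,
        'data': data,
    }
```

The (offset, width) pairs mirror the ASCII diagram, so they could equally be kept in a small table next to it in the source.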
--
http://mail.python.org/mailman/listinfo/python-list