On Thu, Jan 9, 2014 at 3:14 PM, Ethan Furman <et...@stoneleaf.us> wrote:
> Sorry, I was too short with my example.  My use case is binary files, with
> ASCII metadata and binary metadata, as well as ASCII-encoded numeric
> values, binary-coded numeric values, ASCII-encoded boolean values, and
> who-knows-what-(before checking the in-band metadata)-encoded text.  I have
> to process all of it, and before we say "It's just a documentation issue" I
> want to make sure it /is/ just a documentation issue.
>

As I am coming to understand it -- yes, using latin-1 would let you work
with all that. You could decode the binary data using latin-1, which would
give you a unicode object, which would:

1) act like ASCII for ASCII values, for the normal string operations:
   search, replace, etc.

2) have a 1:1 mapping of indexes to bytes in the original.

3) be not-too-bad for memory and other performance (as I understand it,
   py3 now has a cool unicode implementation that does not waste a lot of
   bytes for low code points).

4) preserve the binary data that was not directly touched.

Though you'd still have to encode() to bytes to get chunks that could be
used as binary -- i.e. passed to the struct module, or to a frombytes() or
frombuffer() method of, say, numpy, or PIL or something...

But I'm no expert....

-Chris

> --
> ~Ethan~
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/chris.barker%40noaa.gov
>

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
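For illustration, the latin-1 round trip described in the reply above might look roughly like the sketch below. The record layout, the `WIDTH=` field, and all variable names are made up for this example and are not taken from Ethan's actual files; the only real APIs used are `bytes.decode`, `str.encode`, and the struct module.

```python
import struct

# Hypothetical record (an assumption for this sketch): ASCII metadata,
# a binary-coded little-endian uint32, then arbitrary bytes and an ASCII tag.
raw = b"WIDTH=640;" + struct.pack("<I", 3000000000) + b"\xff\xfe\x00END"

# latin-1 maps every byte 0-255 to the code point with the same number,
# so decoding cannot fail and indexes line up 1:1 with the original bytes.
text = raw.decode("latin-1")
assert len(text) == len(raw)

# Ordinary str operations work on the ASCII parts.
idx = text.find("WIDTH=")
width = int(text[idx + 6 : text.index(";", idx)])

# Edit one ASCII field; the binary bytes around it are not touched.
edited = text.replace("END", "EOF")

# To use a chunk as binary again, encode it back to bytes first.
start = text.index(";") + 1
chunk = edited[start : start + 4].encode("latin-1")
(value,) = struct.unpack("<I", chunk)

# Encoding the whole string restores the original bytes, plus the edit.
roundtrip = edited.encode("latin-1")
```

Note that a str method like replace() has no idea which regions are binary, so in a real file a pattern could also match inside binary data; this sketch sidesteps that by using a payload that does not contain the pattern.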