On Sat, Jan 11, 2014 at 08:13:39PM -0200, Mariano Reingart wrote:

> AFAIK (and just for the record), there could be both Latin1 text and UTF-16
> in a PDF (and other encodings too), depending on the font used:
[...]
> In Python2, txt is just a str, but in Python3 handling everything as latin1
> string obviously doesn't work for TTF in this case.

Nobody is suggesting that you use Latin-1 for *everything*. We're suggesting
that you use it for blobs of binary data that represent arbitrary bytes.
First you have to get your binary data in the first place, using whatever
technique is necessary.

Here's one way to get a blob of binary data:

    import struct
    # encode four C shorts into a fixed-width struct
    struct.pack(">hhhh", 23, 42, 17, 99)

Here's another way:

    # encode a text string into UTF-16
    "My name is Steven".encode("utf-16be")

Both examples return a bytes object containing arbitrary bytes. How do you
combine those arbitrary bytes with a string template while still keeping all
code-points under U+0100? By decoding to Latin-1.

-- 
Steven
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
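
The Latin-1 round trip described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `template` string and the variable names are made up for the example and are not real PDF syntax or any library's API.

```python
import struct

# Two blobs of arbitrary bytes, obtained as in the two examples above.
shorts = struct.pack(">hhhh", 23, 42, 17, 99)
name = "My name is Steven".encode("utf-16be")

# Hypothetical text template, invented for this sketch.
template = "<{shorts}> ({name})"

# Decoding as Latin-1 maps each byte 0x00-0xFF to the code point with the
# same number, so it cannot fail and never produces anything above U+00FF.
text = template.format(
    shorts=shorts.decode("latin-1"),
    name=name.decode("latin-1"),
)

# Encoding the finished string back to Latin-1 recovers the original bytes.
output = text.encode("latin-1")
```

The UTF-16 data is not being treated as Latin-1 *text* here. Latin-1 is only used as a lossless carrier for the bytes while they pass through str formatting, and the final encode turns them back into exactly the bytes you started with.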