On Fri, Jan 10, 2014 at 3:40 PM, Juraj Sukop <[email protected]> wrote:
> What this all means is that the PDF objects are expressed in ASCII,
> "stream" objects like images and fonts may have a binary part and I never
> saw those UTF-16 strings.
>
hmm -- I wonder if they are out there in the wild, though....
> u"stream\n%s\nendstream\nendobj"%binary_data.decode('latin-1')
>>
>
> The argument for dropping "%f" et al. has been that if something is a
> text, then it should be Unicode. Conversely, if it is not text, then it
> should not be Unicode.
>
>
????
What I'm trying to demonstrate / test is that you can use unicode objects
for mixed binary + ascii, if you make sure to encode/decode using latin-1
(any others?). The idea is that the ascii part can be seen/used as text, the
other bytes are preserved, and you can ignore whatever meaning latin-1 gives
them.

Using unicode objects means that you can use the existing string formatting
(%s). If you want to pass in binary blobs, you need to decode them as
latin-1, creating a unicode object, which gets interpolated into your
unicode object. When that unicode object is then encoded back to latin-1,
the original bytes are preserved.
I think this is confusing, as we are calling it latin-1 but not really
using it that way, but it seems it should work.
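
Here is a minimal sketch of that round trip, to make the idea concrete. The
helper name make_stream_object and the test blob are made up for
illustration; only the format string comes from the example quoted above.
It relies on latin-1 mapping each byte 0x00-0xFF to the code point with the
same value, so decoding and then re-encoding loses nothing:

```python
# Sketch only: make_stream_object and blob are hypothetical names,
# not part of any real PDF library.

def make_stream_object(binary_data):
    """Wrap a binary blob in ASCII stream markers using text formatting."""
    # Decode as latin-1: every byte becomes the code point of the same value.
    text = u"stream\n%s\nendstream\nendobj" % binary_data.decode('latin-1')
    # Encode back as latin-1: every code point below 256 becomes its byte again.
    return text.encode('latin-1')

# Every possible byte value, standing in for image or font data.
blob = bytes(bytearray(range(256)))

round_tripped = make_stream_object(blob)
assert round_tripped == b"stream\n" + blob + b"\nendstream\nendobj"

# For contrast, UTF-8 cannot decode arbitrary bytes at all.
try:
    blob.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
assert not utf8_ok
```

The catch is that anything else interpolated into the same string has to
stay below code point 256, or the final encode('latin-1') will raise
UnicodeEncodeError.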
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
[email protected]