Re: unicode(s, enc).encode(enc) == s ?

2008-01-03 Thread mario
On Jan 2, 9:34 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > In any case, it goes well beyond the situation that triggered my
> > original question in the first place, that basically was to provide a
> > reasonable check on whether round-tripping a string is successful --
> > this is in the

Re: unicode(s, enc).encode(enc) == s ?

2008-01-03 Thread mario
Thanks again. I will chunk my responses as your message has too much in it for me to process all at once...
On Jan 2, 9:34 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > Thanks a lot Martin and Marc for the really great explanations! I was
> > wondering if it would be reasonable to imagine a

Re: unicode(s, enc).encode(enc) == s ?

2008-01-02 Thread Martin v. Löwis
> Thanks a lot Martin and Marc for the really great explanations! I was
> wondering if it would be reasonable to imagine a utility that will
> determine whether, for a given encoding, two byte strings would be
> equivalent.
But that is much easier to answer:
    s1.decode(enc) == s2.decode(enc)
A
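The comparison suggested above can be wrapped in a small function. This is only a sketch, written for Python 3 as an assumption (the thread itself uses Python 2, where byte strings are plain `str`). The name `equivalent_bytes` is hypothetical and not from the thread.

```python
def equivalent_bytes(s1: bytes, s2: bytes, enc: str) -> bool:
    """Return True if both byte strings decode to the same text under enc.

    Hypothetical helper for illustration. A UnicodeDecodeError propagates
    if either input is not valid in the given encoding.
    """
    return s1.decode(enc) == s2.decode(enc)


# Two different byte sequences quoted in the thread. ESC ( B selects ASCII
# in ISO-2022-JP, which is already the initial state, so the escape
# sequence changes the bytes but not the decoded text.
print(equivalent_bytes(b"Hallo", b"\x1b(BHallo", "iso-2022-jp"))

# Byte strings that decode to different text are not equivalent.
print(equivalent_bytes(b"abc", b"abd", "utf-8"))
```

Comparing the decoded text sidesteps any need to know which byte-level variations an encoding allows, since the codec does that work.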

Re: unicode(s, enc).encode(enc) == s ?

2008-01-02 Thread mario
Thanks a lot Martin and Marc for the really great explanations! I was wondering if it would be reasonable to imagine a utility that will determine whether, for a given encoding, two byte strings would be equivalent. But I think such a utility will require *extensive* knowledge about many bizarritie

Re: unicode(s, enc).encode(enc) == s ?

2007-12-28 Thread Martin v. Löwis
> Wow, that's not easy to see why would anyone ever want that? Is there
> any logic behind this?
It's the pre-Unicode solution to the "we want to have many characters encoded in a single file" problem. Suppose you have pre-defined character sets A, B, C, and you want text to contain characters f

Re: unicode(s, enc).encode(enc) == s ?

2007-12-28 Thread Marc 'BlackJack' Rintsch
On Fri, 28 Dec 2007 03:00:59 -0800, mario wrote:
> On Dec 27, 7:37 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
>> Certainly. ISO-2022 is famous for having ambiguous encodings. Try
>> these:
>>
>> unicode("Hallo","iso-2022-jp")
>> unicode("\x1b(BHallo","iso-2022-jp")
>> unicode("\x1b(JHallo","

Re: unicode(s, enc).encode(enc) == s ?

2007-12-28 Thread mario
On Dec 27, 7:37 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Certainly. ISO-2022 is famous for having ambiguous encodings. Try
> these:
>
> unicode("Hallo","iso-2022-jp")
> unicode("\x1b(BHallo","iso-2022-jp")
> unicode("\x1b(JHallo","iso-2022-jp")
> unicode("\x1b(BHal\x1b(Jlo","iso-2022-jp")

Re: unicode(s, enc).encode(enc) == s ?

2007-12-27 Thread Martin v. Löwis
> Given no UnicodeErrors, are there any cases for the following not to
> be True?
>
> unicode(s, enc).encode(enc) == s
Certainly. ISO-2022 is famous for having ambiguous encodings. Try these:
    unicode("Hallo","iso-2022-jp")
    unicode("\x1b(BHallo","
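To see the ambiguity concretely, here is a rough Python 3 translation of the examples quoted in this thread (an assumption, since the originals are Python 2 `unicode(...)` calls). The helper name `reencode` and the variable `samples` are made up for illustration.

```python
ENC = "iso-2022-jp"


def reencode(s: bytes, enc: str = ENC) -> bytes:
    """Decode a byte string and encode the result again (hypothetical helper)."""
    return s.decode(enc).encode(enc)


# The four byte strings quoted in the thread.
samples = [
    b"Hallo",
    b"\x1b(BHallo",
    b"\x1b(JHallo",
    b"\x1b(BHal\x1b(Jlo",
]

for s in samples:
    text = s.decode(ENC)
    back = reencode(s)
    # Show the input bytes, the decoded text, the re-encoded bytes, and
    # whether the round trip reproduced the input exactly.
    print(repr(s), "->", repr(text), "->", repr(back), back == s)
```

The encoder has to pick one byte representation for each text, so any input that uses a different but valid representation will not survive the round trip unchanged.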

unicode(s, enc).encode(enc) == s ?

2007-12-27 Thread mario
I have checks in code, to ensure a decode/encode cycle returns the original string. Given no UnicodeErrors, are there any cases for the following not to be True?
    unicode(s, enc).encode(enc) == s
mario
-- http://mail.python.org/mailman/listinfo/python-list
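For readers on Python 3, a check like the one described might look roughly as follows. This is a sketch under assumptions: `roundtrips` is a hypothetical name, and treating a UnicodeError as a failed round trip is a choice made here, not something stated in the post.

```python
def roundtrips(s: bytes, enc: str) -> bool:
    """Return True if decoding and re-encoding s under enc yields s again.

    Python 3 counterpart of `unicode(s, enc).encode(enc) == s`.
    Undecodable input is reported as False (an assumption of this sketch).
    An unknown encoding name still raises LookupError.
    """
    try:
        return s.decode(enc).encode(enc) == s
    except UnicodeError:
        return False


# Valid UTF-8 survives the cycle.
print(roundtrips("héllo".encode("utf-8"), "utf-8"))

# 0xFF can never appear in UTF-8, so decoding fails.
print(roundtrips(b"\xff", "utf-8"))

# Latin-1 maps every byte to exactly one code point and back.
print(roundtrips(bytes(range(256)), "latin-1"))
```

A True result says the bytes are in the form the encoder itself would produce. A False result does not by itself mean the input was invalid.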