Re: UTF-8 to unicode or latin-1 (and yes, I read the FAQ)

2006-10-19 Thread Neil Cerutti
On 2006-10-19, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> In <[EMAIL PROTECTED]>, Neil Cerutti wrote:
>>> Note that 'K\xc3\xb6ni'.decode('utf-8') returns a Unicode
>>> object. With print this is implicitly converted to string. The
>>> char set used depends on your console
>>
>> No, the

Re: UTF-8 to unicode or latin-1 (and yes, I read the FAQ)

2006-10-19 Thread Neil Cerutti
On 2006-10-19, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> In <[EMAIL PROTECTED]>, Neil Cerutti wrote:
>
>>> Note that 'K\xc3\xb6ni'.decode('utf-8') returns a Unicode
>>> object. With print this is implicitly converted to string. The
>>> char set used depends on your console
>>
>> No, th

Re: UTF-8 to unicode or latin-1 (and yes, I read the FAQ)

2006-10-19 Thread Marc 'BlackJack' Rintsch
In <[EMAIL PROTECTED]>, Neil Cerutti wrote:
>> Note that 'K\xc3\xb6ni'.decode('utf-8') returns a Unicode
>> object. With print this is implicitly converted to string. The
>> char set used depends on your console
>
> No, the setting of the console encoding (sys.stdout.encoding) is
> ignored.

Nope

Re: UTF-8 to unicode or latin-1 (and yes, I read the FAQ)

2006-10-19 Thread Neil Cerutti
On 2006-10-19, Michael Ströder <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
>>
>> print 'K\xc3\xb6ni'.decode('utf-8')
>>
>> and this line raised a UnicodeDecode exception.
>
> Works for me.
>
> Note that 'K\xc3\xb6ni'.decode('utf-8') returns a Unicode
> object. With print this is implici

Re: UTF-8 to unicode or latin-1 (and yes, I read the FAQ)

2006-10-19 Thread NoelByron
Michael Ströder wrote:
> [EMAIL PROTECTED] wrote:
> >
> > print 'K\xc3\xb6ni'.decode('utf-8')
> >
> > and this line raised a UnicodeDecode exception.
>
> Works for me.
>
> Note that 'K\xc3\xb6ni'.decode('utf-8') returns a Unicode object. With
> print this is implicitly converted to string. The char

Re: UTF-8 to unicode or latin-1 (and yes, I read the FAQ)

2006-10-19 Thread Michael Ströder
[EMAIL PROTECTED] wrote:
>
> print 'K\xc3\xb6ni'.decode('utf-8')
>
> and this line raised a UnicodeDecode exception.

Works for me.

Note that 'K\xc3\xb6ni'.decode('utf-8') returns a Unicode object. With print this is implicitly converted to string. The char set used depends on your console

Chec
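
Whichever way print behaves on a given setup, one way to sidestep the question is to encode the Unicode object explicitly before writing it out. What follows is only a minimal sketch in Python 3 syntax (the thread itself uses Python 2); `encode_for_console` is a hypothetical helper name, not something from the thread, and the fallback to ASCII with `errors='replace'` is an assumption about what is acceptable for the output.

```python
import sys

def encode_for_console(text, encoding=None):
    """Hypothetical helper: encode text explicitly instead of relying on print."""
    # Use the caller's encoding if given, else what stdout reports, else ASCII.
    enc = encoding or getattr(sys.stdout, 'encoding', None) or 'ascii'
    # 'replace' substitutes '?' for characters the target encoding cannot hold,
    # so this never raises UnicodeEncodeError.
    return text.encode(enc, errors='replace')

text = b'K\xc3\xb6ni'.decode('utf-8')   # 'Köni' as a Unicode string

latin1_out = encode_for_console(text, 'latin-1')
ascii_out = encode_for_console(text, 'ascii')
```

With an explicit target encoding the result no longer depends on the console: here `latin1_out` holds the single Latin-1 byte for 'ö', while `ascii_out` degrades that character to '?'.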

Re: UTF-8 to unicode or latin-1 (and yes, I read the FAQ)

2006-10-19 Thread NoelByron
Duncan Booth wrote:
> [EMAIL PROTECTED] wrote:
>
> > 'K\xc3\xb6ni'.decode('utf-8') # 'K\xc3\xb6ni' should be 'König',
> > contains a german 'umlaut'
> >
> > but failed since python assumes every string to decode to be ASCII?
>
> No, Python would assume the string to be utf-8 encoded in this cas

Re: UTF-8 to unicode or latin-1 (and yes, I read the FAQ)

2006-10-19 Thread NoelByron
> >
> > 'K\xc3\xb6ni'.decode('utf-8') # 'K\xc3\xb6ni' should be 'König',
>
> "Köni", to be precise.

Äh, yes. ;o)

> > contains a german 'umlaut'
> >
> > but failed since python assumes every string to decode to be ASCII?
>
> that should work, and it sure works for me:
>
> >>> s = 'K\xc3\xb6ni

Re: UTF-8 to unicode or latin-1 (and yes, I read the FAQ)

2006-10-19 Thread Duncan Booth
[EMAIL PROTECTED] wrote:
> 'K\xc3\xb6ni'.decode('utf-8') # 'K\xc3\xb6ni' should be 'König',
> contains a german 'umlaut'
>
> but failed since python assumes every string to decode to be ASCII?

No, Python would assume the string to be utf-8 encoded in this case:

>>> 'K\xc3\xb6ni'.decode('utf

Re: UTF-8 to unicode or latin-1 (and yes, I read the FAQ)

2006-10-19 Thread Fredrik Lundh
[EMAIL PROTECTED] wrote:
> I'm struggling with the conversion of a UTF-8 string to latin-1. As far
> as I know the way to go is to decode the UTF-8 string to unicode and
> then encode it back again to latin-1?
>
> So I tried:
>
> 'K\xc3\xb6ni'.decode('utf-8') # 'K\xc3\xb6ni' should be 'Kön

UTF-8 to unicode or latin-1 (and yes, I read the FAQ)

2006-10-19 Thread NoelByron
Hi!

I'm struggling with the conversion of a UTF-8 string to latin-1. As far as I know the way to go is to decode the UTF-8 string to unicode and then encode it back again to latin-1?

So I tried:

'K\xc3\xb6ni'.decode('utf-8') # 'K\xc3\xb6ni' should be 'König', contains a german 'umlaut'

but
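
The decode-then-encode route described in the question can be sketched as follows. This is a minimal illustration in Python 3 syntax, where the input has to be a `bytes` literal; the thread itself uses Python 2, where the same calls are made on a plain `str`. The variable names are made up for the example.

```python
# UTF-8 bytes for 'Köni': 'ö' takes two bytes (0xC3 0xB6) in UTF-8.
utf8_bytes = b'K\xc3\xb6ni'

# Step 1: decode the UTF-8 bytes into a Unicode string.
text = utf8_bytes.decode('utf-8')

# Step 2: encode the Unicode string as Latin-1, where 'ö' is the single byte 0xF6.
latin1_bytes = text.encode('latin-1')

# encode('latin-1') raises UnicodeEncodeError for characters outside Latin-1,
# so real input may need errors='replace' or similar handling.
```

The Latin-1 result is one byte shorter than the UTF-8 input because 'ö' shrinks from two bytes to one.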