Re: string processing question

2009-05-01 Thread norseman
Piet van Oostrum wrote: Kurt Mueller (KM) wrote: KM> But from the command line python interprets the code KM> as 'latin_1' I presume. That is why I have to convert KM> the "ä" with unicode(). KM> Am I right? There are a couple of stages: 1. Your terminal emulator interprets your keystrokes,

Re: string processing question

2009-05-01 Thread Piet van Oostrum
> Kurt Mueller (KM) wrote: >KM> But from the command line python interprets the code >KM> as 'latin_1' I presume. That is why I have to convert >KM> the "ä" with unicode(). >KM> Am I right? There are a couple of stages: 1. Your terminal emulator interprets your keystrokes, encodes them in a

Re: string processing question

2009-05-01 Thread Scott David Daniels
Kurt Mueller wrote: Scott David Daniels wrote: To discover what is happening, try something like: python -c 'for a in "ä", unicode("ä"): print len(a), a' I suspect that in your encoding, "ä" is two bytes long, and in unicode it is converted to a single character. :> python -c 'for a
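Editor's note: the length difference Scott describes can be reproduced with a minimal sketch. It is written for Python 3 rather than the Python 2.5.1 used in the thread, so bytes stands in for the Python 2 str and str stands in for the Python 2 unicode type. The variable names are illustrative only and do not come from the thread.

```python
# Python 3 sketch (assumption: the thread itself runs Python 2.5.1, where
# "ä" typed in a UTF-8 terminal is a 2-byte str and unicode("ä", "utf8")
# is a 1-character unicode object).
text = "\u00e4"             # "ä" as a single Unicode character
raw = text.encode("utf-8")  # the same letter as UTF-8 bytes

# center() pads to a total length of 6, counted in bytes for `raw`
# and in characters for `text`.
byte_centered = raw.center(6, b"-")
text_centered = text.center(6, "-")

print(len(raw), len(text))            # 2 1
print(byte_centered)                  # b'--\xc3\xa4--'
print(ascii(text_centered))           # '--\xe4---'
```

The byte version receives four dashes because "ä" already counts as two units, so it shows up as five columns on a UTF-8 terminal. The character version receives five dashes, which matches the --ä-- and --ä--- output in the original question.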

Re: string processing question

2009-05-01 Thread Kurt Mueller
Sion Arrowsmith wrote: > Kurt Mueller wrote: >> :> python -c 'print unicode("ä", "utf8")' >> ä >> :> python -c 'print unicode("ä", "utf8")' | cat >> Traceback (most recent call last): >> File "<string>", line 1, in <module> >> UnicodeEncodeError: 'ascii' codec can't encode characters in position >> 0-1: ordinal n

Re: string processing question

2009-05-01 Thread Sion Arrowsmith
Kurt Mueller wrote: >:> python -c 'print unicode("ä", "utf8")' >ä > >:> python -c 'print unicode("ä", "utf8")' | cat >Traceback (most recent call last): > File "<string>", line 1, in <module> >UnicodeEncodeError: 'ascii' codec can't encode characters in position >0-1: ordinal not in range(128) $ python -c 'imp
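Editor's note: in Python 2, sys.stdout.encoding is None when output goes to a pipe, so print falls back to the ASCII codec, which cannot represent "ä". The sketch below shows the same failure and the usual workaround of encoding explicitly. It is written for Python 3 as an assumption, and try_encode is a hypothetical helper, not something posted in the thread.

```python
# Python 3 sketch of the encoding step that fails when output is piped.
# try_encode is a hypothetical helper used only for illustration.
def try_encode(s, encoding):
    """Return s encoded with `encoding`, or None if the codec cannot represent it."""
    try:
        return s.encode(encoding)
    except UnicodeEncodeError as exc:
        print("cannot encode with " + encoding + ": " + exc.reason)
        return None

text = "\u00e4"  # "ä"

ascii_result = try_encode(text, "ascii")   # fails, like the piped print
utf8_result = try_encode(text, "utf-8")    # succeeds

print(ascii_result)
print(utf8_result)
```

In the Python 2 setting of the thread, the equivalent workaround would be to encode before printing, for example print u.encode("utf8") for a unicode object u, so that the result no longer depends on where stdout points.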

Re: string processing question

2009-05-01 Thread Kurt Mueller
Scott David Daniels wrote: > To discover what is happening, try something like: > python -c 'for a in "ä", unicode("ä"): print len(a), a' > > I suspect that in your encoding, "ä" is two bytes long, and in > unicode it is converted to a single character. :> python -c 'for a in "ä", unicode

Re: string processing question

2009-05-01 Thread Kurt Mueller
Paul McGuire wrote: > -- > Weird. What happens if you change the second print statement to: > print b.center(6,u"-") Same behavior. I have an even more minimal example: :> python -c 'print unicode("ä", "utf8")' ä :> python -c 'print unicode("ä", "utf8")' | cat Traceback (most recent call las

Re: string processing question

2009-04-30 Thread Scott David Daniels
Kurt Mueller wrote: on a Linux system and python 2.5.1 I have the following behaviour which I do not understand: case 1 python -c 'a="ä"; print a ; print a.center(6,"-") ; b=unicode(a, "utf8"); print b.center(6,"-")' ä --ä-- --ä--- To discover what is happening, try something like: python

Re: string processing question

2009-04-30 Thread norseman
Kurt Mueller wrote: Hi, on a Linux system and python 2.5.1 I have the following behaviour which I do not understand: case 1 python -c 'a="ä"; print a ; print a.center(6,"-") ; b=unicode(a, "utf8"); print b.center(6,"-")' ä --ä-- --ä--- case 2 - a UnicodeEncodeError in this case: p

Re: string processing question

2009-04-30 Thread Paul McGuire
On Apr 30, 11:55 am, Kurt Mueller wrote: > Hi, > > on a Linux system and python 2.5.1 I have the > following behaviour which I do not understand: > > case 1> python -c 'a="ä"; print a ; print a.center(6,"-") ; b=unicode(a, > "utf8"); print b.center(6,"-")' > > ä > --ä-- > --ä--- > > Weird. What

string processing question

2009-04-30 Thread Kurt Mueller
Hi, on a Linux system and python 2.5.1 I have the following behaviour which I do not understand: case 1 > python -c 'a="ä"; print a ; print a.center(6,"-") ; b=unicode(a, "utf8"); > print b.center(6,"-")' ä --ä-- --ä--- > case 2 - a UnicodeEncodeError in this case: > python -c 'a="ä";