Re: unicode question

2015-01-28 Thread Michael Torrie
On 01/28/2015 03:17 PM, Albert-Jan Roskam wrote: >> I do not know how complete the support is, but this is copied from 3.4.2, >> which uses tcl/tk 8.6. t = "الحركات" for c in t: print(c) # Prints rightmost char above first >> ا >> ل >> ح >> ر >> ك >> ا >> ت > > Wow, I never knew this w
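
Archive note: a minimal sketch of the loop quoted above, assuming Python 3. A str is held in logical order, so the loop yields the first letter (the rightmost one when displayed) first. The `unicodedata.name` call is added here for illustration and is not part of the original post.

```python
import unicodedata

t = "الحركات"  # the string from the quoted post

# Iteration follows logical (storage) order; display direction is left to the renderer.
names = [unicodedata.name(c) for c in t]
for c, name in zip(t, names):
    print(f"U+{ord(c):04X} {name}")
```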

Re: unicode question

2015-01-28 Thread Albert-Jan Roskam
On Wed, Jan 28, 2015 8:21 AM CET Terry Reedy wrote: >On 1/27/2015 12:17 AM, Rehab Habeeb wrote: >> Hi there python staff >> does python support arabic language for texts ? and what to do if it >> support it? >> i wrote hello in Arabic using codeskulptor and the powers

Re: unicode question

2015-01-27 Thread Terry Reedy
On 1/27/2015 12:17 AM, Rehab Habeeb wrote: Hi there python staff does python support arabic language for texts ? and what to do if it support it? i wrote hello in Arabic using codeskulptor and the powershell just for testing and the same error appeared (a syntax error in unicode)!! I do not kno

Re: unicode question

2015-01-27 Thread random832
On Tue, Jan 27, 2015, at 12:25, Mark Lawrence wrote: > People might find this http://bugs.python.org/issue1602 and hence this > https://github.com/Drekin/win-unicode-console useful. The latter is > available on pypi. However, Arabic is one of those scripts that runs up against the real limitati

Re: unicode question

2015-01-27 Thread Mark Lawrence
On 27/01/2015 16:13, random...@fastmail.us wrote: On Tue, Jan 27, 2015, at 00:17, Rehab Habeeb wrote: Hi there python staff does python support arabic language for texts ? and what to do if it support it? i wrote hello in Arabic using codeskulptor and the powershell just for testing and the same

Re: unicode question

2015-01-27 Thread random832
On Tue, Jan 27, 2015, at 00:17, Rehab Habeeb wrote: > Hi there python staff > does python support arabic language for texts ? and what to do if it > support it? > i wrote hello in Arabic using codeskulptor and the powershell just for > testing and the same error appeared (a syntax error in unicode)

Re: unicode question

2015-01-26 Thread Chris Angelico
On Tue, Jan 27, 2015 at 4:17 PM, Rehab Habeeb wrote: > Hi there python staff > does python support arabic language for texts ? and what to do if it support > it? > i wrote hello in Arabic using codeskulptor and the powershell just for > testing and the same error appeared (a syntax error in unicod

unicode question

2015-01-26 Thread Rehab Habeeb
Hi there python staff does python support arabic language for texts ? and what to do if it support it? i wrote hello in Arabic using codeskulptor and the powershell just for testing and the same error appeared (a syntax error in unicode)!! -- https://mail.python.org/mailman/listinfo/python-list

Re: Beginner python 3 unicode question [SOLVED]

2013-11-16 Thread Chris Angelico
On Sun, Nov 17, 2013 at 8:44 AM, Laszlo Nagy wrote: > >> >> So is the default utf-8 or not? Should the documentation be updated? Or do >> we have a bug in the interactive shell? >> > It was my fault, sorry. The other program used os.system at some places, and > it accidentally used python2 instead

Re: Beginner python 3 unicode question

2013-11-16 Thread Chris Angelico
On Sun, Nov 17, 2013 at 8:19 AM, Laszlo Nagy wrote: > print("digest",digest,type(digest)) > > This function was called inside a script, and gave me this: > > ('digest', '\xa0\x98\x8b\xff\x04\xf9V;\xbd\x1eIHzh\x10-\xc5!\x14\x1b', <type 'str'>) > This looks very much like you're running under Py

Re: Beginner python 3 unicode question [SOLVED]

2013-11-16 Thread Laszlo Nagy
So is the default utf-8 or not? Should the documentation be updated? Or do we have a bug in the interactive shell? It was my fault, sorry. The other program used os.system at some places, and it accidentally used python2 instead of python 3. :-(

Re: Beginner python 3 unicode question

2013-11-16 Thread Luuk
On 16-11-2013 21:57, Laszlo Nagy wrote: the error is in one of the lines you did not copy here because this works without problems: <> #!/usr/bin/python Most probably, your /usr/bin/python program is python version 2, and not python version 3 Try the same program with /usr/bin/python3.

Re: Beginner python 3 unicode question

2013-11-16 Thread Laszlo Nagy
Why is it behaving differently on the command line? What should I do to fix this? I was experimenting with this a bit more and found some more confusing things. Can somebody please enlighten me? Here is a test function: def password_hash(self,password): public = bytearray([rando

Re: Beginner python 3 unicode question

2013-11-16 Thread Laszlo Nagy
the error is in one of the lines you did not copy here because this works without problems: <> #!/usr/bin/python Most probably, your /usr/bin/python program is python version 2, and not python version 3 Try the same program with /usr/bin/python3. And also try the interactive mode with

Re: Beginner python 3 unicode question

2013-11-16 Thread Luuk
On 16-11-2013 20:12, Laszlo Nagy wrote: Example interactive: $ python3 Python 3.3.1 (default, Sep 25 2013, 19:29:01) [GCC 4.7.3] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import uuid >>> import base64 >>> base64.b32encode(uuid.uuid1().bytes)[:-6].lowe

Beginner python 3 unicode question

2013-11-16 Thread Laszlo Nagy
Example interactive: $ python3 Python 3.3.1 (default, Sep 25 2013, 19:29:01) [GCC 4.7.3] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import uuid >>> import base64 >>> base64.b32encode(uuid.uuid1().bytes)[:-6].lower() b'zsz653co6ii6hgjejqhw42ncgy' >>> But w
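
Archive note: a small sketch of the interactive example above, assuming Python 3. There `base64.b32encode` returns `bytes` (hence the `b'...'` repr), whereas Python 2 would return a plain `str`, which is one quick way to tell which interpreter actually ran the code. The variable names are illustrative.

```python
import base64
import uuid

# 16 UUID bytes encode to 32 base32 characters, the last 6 being "=" padding.
token = base64.b32encode(uuid.uuid1().bytes)[:-6].lower()

# In Python 3 the result is bytes; decode explicitly when text is needed.
text = token.decode("ascii")
print(type(token).__name__, text)
```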

Re: tkinter unicode question

2010-07-27 Thread Ned Deily
In article <20100727204532.r7gmz.27213.r...@cdptpa-web20-z02>, wrote: > Just curious if anyone could shed some light on this? I'm using > tkinter, but I can't seem to get certain unicode characters to > show in the label for Python 3. > > In my test, the label and button will contain the sa

tkinter unicode question

2010-07-27 Thread jyoung79
Just curious if anyone could shed some light on this? I'm using tkinter, but I can't seem to get certain unicode characters to show in the label for Python 3. In my test, the label and button will contain the same 3 characters - a Greek Alpha, a Greek Omega with a circumflex and soft breath

Re: Another (simple) unicode question

2009-10-29 Thread Scott David Daniels
John Machin wrote: On Oct 29, 10:02 pm, Rustom Mody wrote:... I thought of trying to port it to python3 but it barfs on some unicode related stuff (after running 2to3) which I am unable to wrap my head around. Can anyone direct me to what I should read to try to understand this? to which Jon

Re: Another (simple) unicode question

2009-10-29 Thread Carl Banks
On Oct 29, 4:02 am, Rustom Mody wrote: > Construct http://construct.wikispaces.com/ is a kick-ass binary file > structurer (written by a 21 year old!) > I thought of trying to port it to python3 but it barfs on some unicode > related stuff (after running 2to3) which I am unable to wrap my head > aro

Re: Another (simple) unicode question

2009-10-29 Thread John Machin
On Oct 29, 10:02 pm, Rustom Mody wrote: > Construct http://construct.wikispaces.com/ is a kick-ass binary file > structurer (written by a 21 year old!) > I thought of trying to port it to python3 but it barfs on some unicode > related stuff (after running 2to3) which I am unable to wrap my head > ar

Another (simple) unicode question

2009-10-29 Thread Rustom Mody
Construct http://construct.wikispaces.com/ is a kick-ass binary file structurer (written by a 21 year old!) I thought of trying to port it to python3 but it barfs on some unicode related stuff (after running 2to3) which I am unable to wrap my head around. Can anyone direct me to what I should read

Re: a simple unicode question

2009-10-28 Thread Tim Arnold
"Chris Jones" wrote in message news:mailman.2149.1256707687.2807.python-l...@python.org... > On Tue, Oct 27, 2009 at 06:21:11AM EDT, Lie Ryan wrote: >> Chris Jones wrote: > > [..] > >>> Best part of Unicode is that there are multiple encodings, right? ;-) >> >> No, the best part about Unicode is

Re: a simple unicode question

2009-10-28 Thread Gabriel Genellina
On Wed, 28 Oct 2009 02:28:01 -0300, Chris Jones wrote: On Tue, Oct 27, 2009 at 06:21:11AM EDT, Lie Ryan wrote: Chris Jones wrote: Best part of Unicode is that there are multiple encodings, right? ;-) No, the best part about Unicode is there is no encoding! Unicode does not define any enco

Re: a simple unicode question

2009-10-27 Thread Chris Jones
On Tue, Oct 27, 2009 at 06:21:11AM EDT, Lie Ryan wrote: > Chris Jones wrote: [..] >> Best part of Unicode is that there are multiple encodings, right? ;-) > > No, the best part about Unicode is there is no encoding! > Unicode does not define any encoding; RFC 3629: "ISO/IEC 10646 and Unicode

Re: a simple unicode question

2009-10-27 Thread Lie Ryan
Chris Jones wrote: On Wed, Oct 21, 2009 at 12:35:11PM EDT, Nobody wrote: [..] Characters outside the 16-bit range aren't supported on all builds. They won't be supported on most Windows builds, as Windows uses 16-bit Unicode extensively: I knew nothing about UTF-16 & friends before this thre

Re: a simple unicode question

2009-10-22 Thread Gabriel Genellina
On Thu, 22 Oct 2009 17:08:21 -0300, wrote: On 10/22/2009 03:23 AM, Gabriel Genellina wrote: On Wed, 21 Oct 2009 15:14:32 -0300, wrote: On Oct 21, 4:59 am, Bruno Desthuilliers wrote: beSTEfar wrote: (snip) > When parsing strings, use Regular Expressions. And now you have _two_ p

Re: a simple unicode question

2009-10-22 Thread rurpy
On 10/22/2009 03:23 AM, Gabriel Genellina wrote: > On Wed, 21 Oct 2009 15:14:32 -0300, wrote: > >> On Oct 21, 4:59 am, Bruno Desthuilliers > 42.desthuilli...@websiteburo.invalid> wrote: >>> beSTEfar wrote: >>> (snip) >>> > When parsing strings, use Regular Expressions. >>> >>> And now you h

Re: a simple unicode question

2009-10-22 Thread Chris Jones
On Wed, Oct 21, 2009 at 12:35:11PM EDT, Nobody wrote: [..] > Characters outside the 16-bit range aren't supported on all builds. > They won't be supported on most Windows builds, as Windows uses 16-bit > Unicode extensively: I knew nothing about UTF-16 & friends before this thread. Best part of

Re: a simple unicode question

2009-10-22 Thread Gabriel Genellina
On Wed, 21 Oct 2009 15:14:32 -0300, wrote: On Oct 21, 4:59 am, Bruno Desthuilliers wrote: beSTEfar wrote: (snip) > When parsing strings, use Regular Expressions. And now you have _two_ problems For some simple parsing problems, Python's string methods are powerful enough to make REs

Re: a simple unicode question

2009-10-21 Thread Terry Reedy
Nobody wrote: Just curious, why did you choose to set the upper boundary at 0x? Characters outside the 16-bit range aren't supported on all builds. They won't be supported on most Windows builds, as Windows uses 16-bit Unicode extensively: Python 2.5.1 (r251:54863, Apr 18 2007, 08

Re: a simple unicode question

2009-10-21 Thread rurpy
On Oct 21, 4:59 am, Bruno Desthuilliers wrote: > beSTEfar wrote: > (snip) > > When parsing strings, use Regular Expressions. > > And now you have _two_ problems > > For some simple parsing problems, Python's string methods are powerful > enough to make REs overkill. And for any complex enough

Re: a simple unicode question

2009-10-21 Thread Nobody
On Wed, 21 Oct 2009 05:16:56 -0400, Chris Jones wrote: >> > Where are the literals (i.e. u'\N{DEGREE SIGN}') defined? >> >> You can get them from the unicodedata module, e.g.: >> >> import unicodedata >> for i in xrange(0x1): >>n = unicodedata.name(unichr(i),None) >>

Re: a simple unicode question

2009-10-21 Thread Bruno Desthuilliers
beSTEfar wrote: (snip) > When parsing strings, use Regular Expressions. And now you have _two_ problems For some simple parsing problems, Python's string methods are powerful enough to make REs overkill. And for any complex enough parsing (any recursive construct for example - think XML, H

Re: a simple unicode question

2009-10-21 Thread Chris Jones
On Wed, Oct 21, 2009 at 12:20:35AM EDT, Nobody wrote: > On Tue, 20 Oct 2009 17:56:21 +, George Trojan wrote: [..] > > Where are the literals (i.e. u'\N{DEGREE SIGN}') defined? > > You can get them from the unicodedata module, e.g.: > > import unicodedata > for i in xrange(0x100

Re: a simple unicode question

2009-10-21 Thread Scott David Daniels
George Trojan wrote: Scott David Daniels wrote: ... And if you are unsure of the name to use: >>> import unicodedata >>> unicodedata.name(u'\xb0') 'DEGREE SIGN' > Thanks for all suggestions. It took me a while to find out how to > configure my keyboard to be able to type the degree sign. I

Re: a simple unicode question

2009-10-20 Thread Mark Tolonen
"George Trojan" wrote in message news:hbktk6$8b...@news.nems.noaa.gov... Thanks for all suggestions. It took me a while to find out how to configure my keyboard to be able to type the degree sign. I prefer to stick with pure ASCII if possible. Where are the literals (i.e. u'\N{DEGREE SIGN}') d

Re: a simple unicode question

2009-10-20 Thread Martin v. Löwis
> Where are the literals (i.e. u'\N{DEGREE SIGN}') defined? I found > http://www.unicode.org/Public/5.1.0/ucd/UnicodeData.txt > Is that the place to look? Correct - you are supposed to fill in a Unicode character name into the \N escape. The specific list of names depends on the version of the UCD
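
Archive note: a short sketch of the `\N{...}` escape discussed above, assuming Python 3, where every string literal is Unicode and no `u` prefix is needed. It keeps the source file pure ASCII, which is what the original poster asked for. The names come from the Unicode Character Database bundled with the interpreter, exposed through the standard `unicodedata` module.

```python
import unicodedata

degree = "\N{DEGREE SIGN}"                 # ASCII-only spelling of U+00B0
same = unicodedata.lookup("DEGREE SIGN")   # run-time equivalent of the escape

print(unicodedata.name(degree))
print("UCD version in this build:", unicodedata.unidata_version)
```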

Re: a simple unicode question

2009-10-20 Thread Nobody
On Tue, 20 Oct 2009 17:56:21 +, George Trojan wrote: > Thanks for all suggestions. It took me a while to find out how to > configure my keyboard to be able to type the degree sign. I prefer to > stick with pure ASCII if possible. > Where are the literals (i.e. u'\N{DEGREE SIGN}') defined? I

Re: a simple unicode question

2009-10-20 Thread George Trojan
Thanks for all suggestions. It took me a while to find out how to configure my keyboard to be able to type the degree sign. I prefer to stick with pure ASCII if possible. Where are the literals (i.e. u'\N{DEGREE SIGN}') defined? I found http://www.unicode.org/Public/5.1.0/ucd/UnicodeData.txt Is

Re: a simple unicode question

2009-10-20 Thread Scott David Daniels
Mark Tolonen wrote: Is there a better way of getting the degrees? It seems your string is UTF-8. \xc2\xb0 is UTF-8 for DEGREE SIGN. If you type non-ASCII characters in source code, make sure to declare the encoding the file is *actually* saved in: # coding: utf-8 s = '''48° 13' 16.80" N'
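
Archive note: a minimal sketch of the diagnosis quoted above, assuming Python 3 and assuming the data really arrives as UTF-8 bytes. Decoding those bytes as iso-8859-1 turns the two-byte degree sign into two characters (`\xc2\xb0`), which is the symptom in the original question. Decoding as UTF-8 gives a single character that is easy to strip.

```python
raw = "48° 13' 16.80\" N".encode("utf-8")   # bytes as they would arrive

wrong = raw.decode("iso-8859-1")   # mis-decoded: a stray "Â" precedes "°"
right = raw.decode("utf-8")        # one DEGREE SIGN character

first_wrong = wrong.split()[0]
first_right = right.split()[0]
degrees = int(first_right.rstrip("\N{DEGREE SIGN}"))
print(repr(first_wrong), repr(first_right), degrees)
```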

Re: a simple unicode question

2009-10-19 Thread Mark Tolonen
"George Trojan" wrote in message news:hbidd7$i9...@news.nems.noaa.gov... A trivial one, this is the first time I have to deal with Unicode. I am trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is "iso-8859-1". To get the degrees I did >>> encoding='iso-8859-1' >>> q=s

Re: a simple unicode question

2009-10-19 Thread beSTEfar
On 19 Oct, 21:07, George Trojan wrote: > A trivial one, this is the first time I have to deal with Unicode. I am > trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is > "iso-8859-1". To get the degrees I did > >>> encoding='iso-8859-1' > >>> q=s.decode(encoding) > >>> q.spl

Re: a simple unicode question

2009-10-19 Thread Diez B. Roggisch
George Trojan wrote: A trivial one, this is the first time I have to deal with Unicode. I am trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is "iso-8859-1". To get the degrees I did >>> encoding='iso-8859-1' >>> q=s.decode(encoding) >>> q.split() [u'48\xc2\xb0', u"13

a simple unicode question

2009-10-19 Thread George Trojan
A trivial one, this is the first time I have to deal with Unicode. I am trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is "iso-8859-1". To get the degrees I did >>> encoding='iso-8859-1' >>> q=s.decode(encoding) >>> q.split() [u'48\xc2\xb0', u"13'", u'16.80"', u'N'] >>> r=

Re: python 3.1 unicode question

2009-09-16 Thread Duncan Booth
jeffunit wrote: >>That looks like a "surrogate escape" (See PEP 383) >>http://www.python.org/dev/peps/pep-0383/. It indicates the wrong >>encoding was used to decode the filename. > > That seems likely. How do I set the encoding to something correct to > decode the filename? > > Clearly win
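
Archive note: a hedged sketch of the "surrogate escape" mechanism (PEP 383) mentioned above, assuming Python 3. The file name bytes are hypothetical and chosen only to be invalid UTF-8. An undecodable byte is smuggled into the str as a lone surrogate, which round-trips back to the original bytes but cannot be encoded strictly, which is why printing such a name fails.

```python
raw_name = b"caf\xe9.txt"   # hypothetical name: Latin-1 bytes, invalid as UTF-8

name = raw_name.decode("utf-8", "surrogateescape")
print(ascii(name))          # the bad byte appears as a lone surrogate escape

# The original bytes survive a round trip with the same error handler ...
round_trip = name.encode("utf-8", "surrogateescape")

# ... but a strict encode, as done when printing to a UTF-8 stream, fails.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```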

Re: python 3.1 unicode question

2009-09-15 Thread Chris Rebert
On Tue, Sep 15, 2009 at 9:48 PM, jeffunit wrote: > At 09:25 PM 9/15/2009, Mark Tolonen wrote: >> >> "jeffunit" wrote in message >> news:20090915144123964.ljka6...@cdptpa-omta01.mail.rr.com... >>> >>> I wrote a program that diffs files and prints out matching file names. >>> I will be executing th

Re: python 3.1 unicode question

2009-09-15 Thread jeffunit
At 09:25 PM 9/15/2009, Mark Tolonen wrote: "jeffunit" wrote in message news:20090915144123964.ljka6...@cdptpa-omta01.mail.rr.com... I wrote a program that diffs files and prints out matching file names. I will be executing the output with sh, to delete select files. Most of the file names are

Re: python 3.1 unicode question

2009-09-15 Thread Mark Tolonen
"jeffunit" wrote in message news:20090915144123964.ljka6...@cdptpa-omta01.mail.rr.com... I wrote a program that diffs files and prints out matching file names. I will be executing the output with sh, to delete select files. Most of the files names are plain ascii, but about 10% of them have un

python 3.1 unicode question

2009-09-15 Thread jeffunit
I wrote a program that diffs files and prints out matching file names. I will be executing the output with sh, to delete select files. Most of the file names are plain ascii, but about 10% of them have unicode characters in them. When I try to print the string containing the name, I get an excep

Re: (Simple?) Unicode Question

2009-08-30 Thread Nobody
On Sun, 30 Aug 2009 02:36:49 +, Steven D'Aprano wrote: >>> So long as your terminal has a sensible encoding, and you have a good >>> quality font, you should be able to print any string you can create. >> >> UTF-8 isn't a particularly sensible encoding for terminals. > > Did I mention UTF-8?

Re: (Simple?) Unicode Question

2009-08-29 Thread Steven D'Aprano
On Sat, 29 Aug 2009 20:09:12 +0100, Nobody wrote: > On Sat, 29 Aug 2009 08:26:54 +, Steven D'Aprano wrote: > >> Python only needs to know when you convert the text to or from bytes. I >> can do this: >> > s = "hello" > t = "world" > print(' '.join([s, t])) >> hello world >> >> a

Re: (Simple?) Unicode Question

2009-08-29 Thread Nobody
On Sat, 29 Aug 2009 08:26:54 +, Steven D'Aprano wrote: > Python only needs to know when you convert the text to or from bytes. I > can do this: > s = "hello" t = "world" print(' '.join([s, t])) > hello world > > and not need to care anything about encodings. > > So long as y

Re: (Simple?) Unicode Question

2009-08-29 Thread Steven D'Aprano
On Sat, 29 Aug 2009 09:34:43 +0200, Thorsten Kampe wrote: > * Rami Chowdhury (Thu, 27 Aug 2009 09:44:41 -0700) >> > Further, does anything, except a printing device need to know the >> > encoding of a piece of "text"? > > Python needs to know if you are processing the text. Python only needs to
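
Archive note: a minimal sketch of the point made above, assuming Python 3. Pure text operations need no encoding at all. An encoding only enters the picture at the boundary where text becomes bytes (or the reverse). The sample strings are illustrative.

```python
s, t = "hello", "wörld"
joined = " ".join([s, t])            # text-only operation, no encoding involved

# The encoding is chosen only when converting to bytes.
as_utf8 = joined.encode("utf-8")     # "ö" takes two bytes
as_latin1 = joined.encode("latin-1") # "ö" takes one byte

print(len(joined), len(as_utf8), len(as_latin1))
```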

Re: (Simple?) Unicode Question

2009-08-29 Thread Thorsten Kampe
* Rami Chowdhury (Thu, 27 Aug 2009 09:44:41 -0700) > > Further, does anything, except a printing device need to know the > > encoding of a piece of "text"? Python needs to know if you are processing the text. > I may be wrong, but I believe that's part of the idea behind the separation > of strin

Re: (Simple?) Unicode Question

2009-08-27 Thread Albert Hopkins
On Thu, 2009-08-27 at 22:09 +0530, Shashank Singh wrote: > Hi All! > > I have a very simple (and probably stupid) question eluding me. > When exactly is the char-set information needed? > > To make my question clear consider reading a file. > While reading a file, all I get is basically an array

Re: (Simple?) Unicode Question

2009-08-27 Thread Rami Chowdhury
Further, does anything, except a printing device need to know the encoding of a piece of "text"? I may be wrong, but I believe that's part of the idea behind the separation of string and bytes types in Python 3.x. I believe, if you are using Python 3.x, you don't need the character encoding mum

(Simple?) Unicode Question

2009-08-27 Thread Shashank Singh
Hi All! I have a very simple (and probably stupid) question eluding me. When exactly is the char-set information needed? To make my question clear consider reading a file. While reading a file, all I get is basically an array of bytes. Now suppose a file has 10 bytes in it (all is data, no metad

Re: Unicode question

2006-07-28 Thread Martin v. Löwis
Ben Edwards (lists) wrote: > Firstly sys.setdefaultencoding('iso-8859-1') does not work, I have to do > sys.setdefaultencoding = 'iso-8859-1' That "works", but has no effect. You bind the variable sys.setdefaultencoding to some value, but that value is never used for anything (do sys.getdefaultenc

Re: Unicode question

2006-07-28 Thread Steve M
Ben Edwards (lists) wrote: > I am using python 2.4 on Ubuntu dapper, I am working through Dive into > Python. > > There are a couple of inconsistencies. > > Firstly sys.setdefaultencoding('iso-8859-1') does not work, I have to do > sys.setdefaultencoding = 'iso-8859-1' When you run a Python script

Re: Unicode question

2006-07-28 Thread Max Erickson
"Ben Edwards (lists)" <[EMAIL PROTECTED]> wrote: > I am using python 2.4 on Ubuntu dapper, I am working through Dive > into Python. ... > Any insight? > Ben Did you follow all the instructions, or did you try to call sys.setdefaultencoding interactively? See: http://diveintopython.org/xml_pro

Unicode question

2006-07-28 Thread Ben Edwards (lists)
I am using python 2.4 on Ubuntu dapper, I am working through Dive into Python. There are a couple of inconsistencies. Firstly sys.setdefaultencoding('iso-8859-1') does not work, I have to do sys.setdefaultencoding = 'iso-8859-1' secondly the following does not give a 'UnicodeError: ASCII encodin

[OT] Re: a unicode question?

2006-04-11 Thread Peter Otten
John Machin wrote: > ... and yes Peter, info travels faster also from China that it does > from Armenia :-()) Q: Can info travel faster from Armenia than from China? Radio Yerevan: In principle, yes. Just make sure that it doesn't go the other way round the globe or meets some friends on the way.

Re: a unicode question?

2006-04-10 Thread John Machin
E, it gets worse: not only is the title written in Chinese, it is encoded as gb2312 -- here is the repr() of the first few chunks: "\n\n\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) : \xc4\xda\xb2\xbf\xc8\xcb\xd4\xb1\xb3\xd6\xb9\xc9 - \xcb\xd1\xba\xfc\xb9\xc9\xc6\xb1\n\n" and here is wha
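
Archive note: a hedged sketch of the gb2312 observation above, assuming Python 3 and using only the first chunk of the quoted bytes. The parser in the thread appears to have mapped each byte to the code point of the same value (effectively a Latin-1 decode), so re-encoding as Latin-1 recovers the raw bytes, which can then be decoded with the real encoding.

```python
raw = b"\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028)"

# What the thread's "unicode" string looked like: one character per byte.
mangled = raw.decode("latin-1")

# Undo the byte-per-character mapping, then decode with the actual encoding.
title = mangled.encode("latin-1").decode("gb2312")
print(title)
```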

Re: a unicode question?

2006-04-10 Thread Serge Orlov
[EMAIL PROTECTED] wrote: > Mr. John Machin > > This question comes from the following code. I use PyXML to build a DOM > tree. > > from xml.dom.ext.reader import HtmlLib > doc = > HtmlLib.FromHtmlUrl('http://stock.business.sohu.com/q/nbcg.php?code=600028') > title_elem = doc.documentElement.getElem

Re: a unicode question?

2006-04-09 Thread zdwang
Mr. John Machin This question comes from the following code. I use PyXML to build a DOM tree. from xml.dom.ext.reader import HtmlLib doc = HtmlLib.FromHtmlUrl('http://stock.business.sohu.com/q/nbcg.php?code=600028') title_elem = doc.documentElement.getElementsByTagName("TITLE")[0] title_string = t

Re: a unicode question?

2006-04-09 Thread zdwang
Mr. John Machin, Thank you very much!

Re: a unicode question?

2006-04-09 Thread John Machin
What do you mean by "ansi string"? Here is a superficially not-unreasonable answer to your more specific question: # >>> s1 = u'\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) ' # >>> s2 = '\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) ' # >>> s3 = s1.encode('latin1') # >>> s2 == s3 # True But what are you

a unicode question?

2006-04-09 Thread zdwang
Hello, I have a unicode string that I want to convert to an ANSI string, but doing so raises an exception. Could you help me? ## I want to change s1 to s2. s1 = u'\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) ' s2 = '\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) '

Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread Ben Finney
"Ian Sparks" <[EMAIL PROTECTED]> writes: > This is probably stupid and/or misguided but supposing I'm passed a > byte-string value that I want to be unicode, this is what I do. I'm > sure I'm missing something very important. Perhaps you need to read one of the good Python Unicode tutorials, such

Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread Kent Johnson
ianaré wrote: > maybe a bit off topic, but how does one find the console's encoding > from within python? > In [1]: import sys In [3]: sys.stdout.encoding Out[3]: 'cp437' In [4]: sys.stdin.encoding Out[4]: 'cp437' Kent
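
Archive note: a small sketch of the same check, assuming Python 3. The helper name `stream_encodings` is hypothetical. It uses `getattr` with a default because a standard stream can be missing or replaced by an object without an `encoding` attribute (for example under some GUI launchers or test harnesses).

```python
import locale
import sys


def stream_encodings():
    """Return the encodings of the standard streams (None where unavailable)."""
    return {
        name: getattr(getattr(sys, name, None), "encoding", None)
        for name in ("stdin", "stdout", "stderr")
    }


encodings = stream_encodings()
preferred = locale.getpreferredencoding(False)
print(encodings, preferred)
```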

Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread John Machin
The most important thing that you are missing is that you need to know the encoding used for the 8-bit-character string. Let's guess that it's Latin1. Then all you have to do is use the unicode() builtin function, or the string decode method. # >>> s = 'Jos\xe9' # >>> s # 'Jos\xe9' # >>> u = unico

Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread ianaré
maybe a bit off topic, but how does one find the console's encoding from within python?

Re: Unicode question : turn "José" into u"José"

2006-04-05 Thread aurora
First of all, if you run this on the console, find out your console's encoding. In my case it is English Windows XP. It uses 'cp437'. C:\>chcp Active code page: 437 Then >>> s = "José" >>> u = u"Jos\u00e9" # same thing in unicode escape >>> s.decode('cp437') == u # use encoding that
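
Archive note: a minimal sketch of why the console's code page matters, assuming Python 3 with explicit `bytes`. The same character "é" is byte 0x82 in cp437 and byte 0xE9 in Latin-1, so decoding only gives the right text when the encoding matches the one that produced the bytes.

```python
expected = "Jos\u00e9"                   # "José" written with a Unicode escape

typed_on_cp437_console = b"Jos\x82"      # what a cp437 console would produce
from_latin1_source = b"Jos\xe9"          # the same name in a Latin-1 file

good_cp437 = typed_on_cp437_console.decode("cp437")
good_latin1 = from_latin1_source.decode("latin-1")
wrong_guess = typed_on_cp437_console.decode("latin-1")  # decodes, but to a control character

print(good_cp437, good_latin1, ascii(wrong_guess))
```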

Unicode question : turn "José" into u"José"

2006-04-05 Thread Ian Sparks
This is probably stupid and/or misguided but supposing I'm passed a byte-string value that I want to be unicode, this is what I do. I'm sure I'm missing something very important. Short version : >>> s = "José" #Start with non-unicode string >>> unicoded = eval("u'%s'" % "José") Long version :

Re: unicode question

2006-03-01 Thread Walter Dörwald
Edward Loper wrote: > Walter Dörwald wrote: >> Edward Loper wrote: >> >>> [...] >>> Surely there's a better way than converting back and forth 3 times? Is >>> there a reason that the 'backslashreplace' error mode can't be used >>> with codecs.decode? >>> >>> >>> 'abc \xff\xe8 def'.decode('ascii

Re: unicode question

2006-02-27 Thread Edward Loper
Walter Dörwald wrote: > Edward Loper wrote: > >> [...] >> Surely there's a better way than converting back and forth 3 times? Is >> there a reason that the 'backslashreplace' error mode can't be used >> with codecs.decode? >> >> >>> 'abc \xff\xe8 def'.decode('ascii', 'backslashreplace') >> Trac

Re: unicode question

2006-02-27 Thread Walter Dörwald
Edward Loper wrote: > [...] > Surely there's a better way than converting back and forth 3 times? Is > there a reason that the 'backslashreplace' error mode can't be used with > codecs.decode? > > >>> 'abc \xff\xe8 def'.decode('ascii', 'backslashreplace') > Traceback (most recent call last): >

Re: unicode question

2006-02-25 Thread Kent Johnson
Edward Loper wrote: > I would like to convert an 8-bit string (i.e., a str) into unicode, > treating chars \x00-\x7f as ascii, and converting any chars \x80-\xff > into backslashed escape sequences. I.e., I want something like this: > > >>> decode_with_backslashreplace('abc \xff\xe8 def') > u'a

Re: unicode question

2006-02-25 Thread Tim Roberts
Edward Loper <[EMAIL PROTECTED]> wrote: >I would like to convert an 8-bit string (i.e., a str) into unicode, >treating chars \x00-\x7f as ascii, and converting any chars \x80-\xff >into backslashed escape sequences. I.e., I want something like this: > > >>> decode_with_backslashreplace('abc \xff

unicode question

2006-02-24 Thread Edward Loper
I would like to convert an 8-bit string (i.e., a str) into unicode, treating chars \x00-\x7f as ascii, and converting any chars \x80-\xff into backslashed escape sequences. I.e., I want something like this: >>> decode_with_backslashreplace('abc \xff\xe8 def') u'abc \\xff\\xe8 def' The best I c
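
Archive note: a hedged sketch of the requested function for current interpreters. The thread is about Python 2, where `backslashreplace` could only be used for encoding. Since Python 3.5 the same error handler also works when decoding, so (assuming Python 3.5 or later and `bytes` input) the function reduces to one call.

```python
def decode_with_backslashreplace(data: bytes) -> str:
    """Decode as ASCII, turning every non-ASCII byte into a \\xNN escape."""
    return data.decode("ascii", errors="backslashreplace")


result = decode_with_backslashreplace(b"abc \xff\xe8 def")
print(result)
```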

Re: Unicode Question

2006-01-09 Thread David Pratt
Hi Max. Many thanks for helping to realize where I was missing the point and making this clearer. Regards, David Max Erickson wrote: > The encoding argument to unicode() is used to specify the encoding of the > string that you want to translate into unicode. The interpreter stores > unicode as

Re: Unicode Question

2006-01-09 Thread David Pratt
Hi Erik. Thank you for your reply. The advice has helped clarify this for me. Regards, David Erik Max Francis wrote: > David Pratt wrote: > > >>This is not working for me. Can someone explain why. Many thanks. > > > Because '\xbe' isn't UTF-8 for the character you want, '\xc2\xbe' is, as

Re: Unicode Question

2006-01-09 Thread David Pratt
Hi Martin. Many thanks for your reply. What I am really after, the following accomplishes. > > If you are looking for "at the same time", perhaps this is also > interesting: > > py> unicode('\xbe', 'windows-1252').encode('utf-8') > '\xc2\xbe' > Your answer really helped quite a bit to clarify t

Re: Unicode Question

2006-01-09 Thread Max Erickson
The encoding argument to unicode() is used to specify the encoding of the string that you want to translate into unicode. The interpreter stores unicode as unicode, it isn't encoded... >>> unicode('\xbe','cp1252') u'\xbe' >>> unicode('\xbe','cp1252').encode('utf-8') '\xc2\xbe' >>> max
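
Archive note: the same two steps sketched for Python 3, where `unicode(data, encoding)` becomes `data.decode(encoding)` on a `bytes` object. Decoding names the encoding the bytes are in, and encoding names the encoding wanted on output. The variable names are illustrative.

```python
import unicodedata

raw = b"\xbe"                      # the 3/4 symbol as a cp1252 byte

text = raw.decode("cp1252")        # bytes -> text: say what the bytes ARE
utf8 = text.encode("utf-8")        # text -> bytes: say what you WANT

print(unicodedata.name(text), utf8)
```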

Unicode Question

2006-01-09 Thread David Pratt
Hi. I am working through some tutorials on unicode and am hoping that someone can help explain this for me. I am on mac platform using python 2.4.1 at the moment. I am experimenting with unicode with the 3/4 symbol. I want to prepare strings for db storage that come from normal Windows machin

Re: Unicode Question

2006-01-09 Thread Martin v. Löwis
David Pratt wrote: > I want to prepare strings for db storage that come from normal Windows > machine (cp1252) so my understanding is to unicode and encode to utf-8 > and to store properly. That also depends on the database. The database must accept UTF-8-encoded strings, and must not modify them

Re: Unicode Question

2006-01-09 Thread Erik Max Francis
David Pratt wrote: > This is not working for me. Can someone explain why. Many thanks. Because '\xbe' isn't UTF-8 for the character you want, '\xc2\xbe' is, as you just showed yourself in the code snippet. -- Erik Max Francis && [EMAIL PROTECTED] && http://www.alcyone.com/max/ San Jose, CA, US

Re: Once again a unicode question

2005-03-26 Thread Nicolas Evrard
* Serge Orlov [23:45 26/03/05 CET]: Nicolas Evrard wrote: Hello, I'm puzzled by this test I made while trying to transform a page in html to plain text. Because I cannot send unicode to feed, nor str so how can I do this ? Seems like the parser is in the broken state after the first exception. Fe

Re: Once again a unicode question

2005-03-26 Thread Serge Orlov
Nicolas Evrard wrote: > Hello, > > I'm puzzled by this test I made while trying to transform a page in > html to plain text. Because I cannot send unicode to feed, nor str so > how can I do this ? Seems like the parser is in the broken state after the first exception. Feed only binary strings to i

Once again a unicode question

2005-03-26 Thread Nicolas Evrard
Hello, I'm puzzled by this test I made while trying to transform a page in html to plain text. Because I cannot send unicode to feed, nor str so how can I do this ? [EMAIL PROTECTED]:~$ python2.4 .Python 2.4.1c2 (#2, Mar 19 2005, 01:04:19) .[GCC 3.3.5 (Debian 1:3.3.5-12)] on linux2 .Type "help", "

Re: unicode question

2004-11-29 Thread Bengt Richter
On Tue, 23 Nov 2004 20:37:04 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= <[EMAIL PROTECTED]> wrote: >Steve Holden wrote: >> Am I the only person who found it scary that Bengt could apparently >> casually drop on a polynomial the would decode to " Löwis"? Well, don't give me too much credit