On 01/28/2015 03:17 PM, Albert-Jan Roskam wrote:
>> I do not know how complete the support is, but this is copied from 3.4.2,
>> which uses tcl/tk 8.6.
t = "الحركات"
for c in t: print(c) # Prints rightmost char above first
>> ا
>> ل
>> ح
>> ر
>> ك
>> ا
>> ت
>
> Wow, I never knew this w
On Wed, Jan 28, 2015 8:21 AM CET Terry Reedy wrote:
>On 1/27/2015 12:17 AM, Rehab Habeeb wrote:
>> Hi there python staff
>> does python support arabic language for texts ? and what to do if it
>> support it?
>> i wrote hello in Arabic using codeskulptor and the powers
On 1/27/2015 12:17 AM, Rehab Habeeb wrote:
Hi there python staff
does python support arabic language for texts ? and what to do if it
support it?
i wrote hello in Arabic using codeskulptor and the powershell just for
testing and the same error appeared (a syntax error in unicode)!!
I do not kno
On Tue, Jan 27, 2015, at 12:25, Mark Lawrence wrote:
> People might find this http://bugs.python.org/issue1602 and hence this
> https://github.com/Drekin/win-unicode-console useful. The latter is
> available on pypi.
However, Arabic is one of those scripts that runs up against the real
limitati
On 27/01/2015 16:13, random...@fastmail.us wrote:
On Tue, Jan 27, 2015, at 00:17, Rehab Habeeb wrote:
Hi there python staff
does python support arabic language for texts ? and what to do if it
support it?
i wrote hello in Arabic using codeskulptor and the powershell just for
testing and the same
On Tue, Jan 27, 2015, at 00:17, Rehab Habeeb wrote:
> Hi there python staff
> does python support arabic language for texts ? and what to do if it
> support it?
> i wrote hello in Arabic using codeskulptor and the powershell just for
> testing and the same error appeared (a syntax error in unicode)
On Tue, Jan 27, 2015 at 4:17 PM, Rehab Habeeb
wrote:
> Hi there python staff
> does python support arabic language for texts ? and what to do if it support
> it?
> i wrote hello in Arabic using codeskulptor and the powershell just for
> testing and the same error appeared (a syntax error in unicod
Hi there, Python staff,
does Python support the Arabic language for text? And what should I do if it
does support it?
I wrote hello in Arabic using CodeSkulptor and PowerShell just for
testing, and the same error appeared (a syntax error in unicode)!!
--
https://mail.python.org/mailman/listinfo/python-list
On Sun, Nov 17, 2013 at 8:44 AM, Laszlo Nagy wrote:
>
>>
>> So is the default utf-8 or not? Should the documentation be updated? Or do
>> we have a bug in the interactive shell?
>>
> It was my fault, sorry. The other program used os.system at some places, and
> it accidentally used python2 instead
On Sun, Nov 17, 2013 at 8:19 AM, Laszlo Nagy wrote:
> print("digest",digest,type(digest))
>
> This function was called inside a script, and gave me this:
>
> ('digest', '\xa0\x98\x8b\xff\x04\xf9V;\xbd\x1eIHzh\x10-\xc5!\x14\x1b', <type 'str'>)
>
This looks very much like you're running under Py
So is the default utf-8 or not? Should the documentation be updated?
Or do we have a bug in the interactive shell?
It was my fault, sorry. The other program used os.system at some places,
and it accidentally used python2 instead of python 3. :-(
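One way to avoid this kind of mix-up is to start child scripts with the interpreter that is already running, rather than a bare python command. A minimal sketch, assuming Python 3.7 or later; the helper name run_with_same_python is made up for illustration and does not come from the thread:

```python
import subprocess
import sys


def run_with_same_python(code):
    """Run a snippet in a child process using the current interpreter."""
    # sys.executable is the path of the running interpreter, so the child
    # cannot silently fall back to a different Python found on PATH.
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


child_major = run_with_same_python("import sys; print(sys.version_info[0])")
print("parent:", sys.version_info[0], "child:", child_major)
```

With os.system("python ...") the shell decides which interpreter runs; passing sys.executable explicitly takes that decision away from PATH.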
On 16-11-2013 21:57, Laszlo Nagy wrote:
the error is in one of the lines you did not copy here
because this works without problems:
#!/usr/bin/python
Most probably, your /usr/bin/python program is python version 2, and not
python version 3
Try the same program with /usr/bin/python3.
Why it is behaving differently on the command line? What should I do
to fix this?
I was experimenting with this a bit more and found some more confusing
things. Can somebody please enlighten me?
Here is a test function:
def password_hash(self,password):
public = bytearray([rando
the error is in one of the lines you did not copy here
because this works without problems:
#!/usr/bin/python
Most probably, your /usr/bin/python program is python version 2, and not
python version 3
Try the same program with /usr/bin/python3. And also try the interactive
mode with
On 16-11-2013 20:12, Laszlo Nagy wrote:
Example interactive:
$ python3
Python 3.3.1 (default, Sep 25 2013, 19:29:01)
[GCC 4.7.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import uuid
>>> import base64
>>> base64.b32encode(uuid.uuid1().bytes)[:-6].lowe
Example interactive:
$ python3
Python 3.3.1 (default, Sep 25 2013, 19:29:01)
[GCC 4.7.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import uuid
>>> import base64
>>> base64.b32encode(uuid.uuid1().bytes)[:-6].lower()
b'zsz653co6ii6hgjejqhw42ncgy'
>>>
But w
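Under Python 3 the expression in that session returns a bytes object, which is why the result is shown with a b prefix. A small sketch, assuming the goal is a text token; the variable names are illustrative only:

```python
import base64
import uuid

raw = base64.b32encode(uuid.uuid1().bytes)  # bytes, padded with b"======"
token = raw[:-6].lower()                    # still bytes under Python 3
text_token = token.decode("ascii")          # base32 output is pure ASCII

print(type(token).__name__, type(text_token).__name__)
```

The explicit decode("ascii") is the step that turns the bytes into a str; under Python 2 the same expression would already have produced a str.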
In article <20100727204532.r7gmz.27213.r...@cdptpa-web20-z02>,
wrote:
> Just curious if anyone could shed some light on this? I'm using
> tkinter, but I can't seem to get certain unicode characters to
> show in the label for Python 3.
>
> In my test, the label and button will contain the sa
Just curious if anyone could shed some light on this? I'm using
tkinter, but I can't seem to get certain unicode characters to
show in the label for Python 3.
In my test, the label and button will contain the same 3
characters - a Greek Alpha, a Greek Omega with a circumflex and
soft breath
John Machin wrote:
On Oct 29, 10:02 pm, Rustom Mody wrote:...
I thought of trying to port it to python3 but it barfs on some unicode
related stuff (after running 2to3) which I am unable to wrap my head
around.
Can anyone direct me to what I should read to try to understand this?
to which Jon
On Oct 29, 4:02 am, Rustom Mody wrote:
> Construct http://construct.wikispaces.com/ is a kick-ass binary file
> structurer (written by a 21 year old!)
> I thought of trying to port it to python3 but it barfs on some unicode
> related stuff (after running 2to3) which I am unable to wrap my head
> aro
On Oct 29, 10:02 pm, Rustom Mody wrote:
> Construct http://construct.wikispaces.com/ is a kick-ass binary file
> structurer (written by a 21 year old!)
> I thought of trying to port it to python3 but it barfs on some unicode
> related stuff (after running 2to3) which I am unable to wrap my head
> ar
Construct http://construct.wikispaces.com/ is a kick-ass binary file
structurer (written by a 21 year old!)
I thought of trying to port it to python3 but it barfs on some unicode
related stuff (after running 2to3) which I am unable to wrap my head
around.
Can anyone direct me to what I should read
"Chris Jones" wrote in message
news:mailman.2149.1256707687.2807.python-l...@python.org...
> On Tue, Oct 27, 2009 at 06:21:11AM EDT, Lie Ryan wrote:
>> Chris Jones wrote:
>
> [..]
>
>>> Best part of Unicode is that there are multiple encodings, right? ;-)
>>
>> No, the best part about Unicode is
On Wed, 28 Oct 2009 02:28:01 -0300, Chris Jones
wrote:
On Tue, Oct 27, 2009 at 06:21:11AM EDT, Lie Ryan wrote:
Chris Jones wrote:
Best part of Unicode is that there are multiple encodings, right? ;-)
No, the best part about Unicode is there is no encoding!
Unicode does not define any enco
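Whichever way one reads that exchange, the difference between a code point and the bytes used to store it is easy to show. A minimal sketch, assuming Python 3; the two sample characters are my choice, not something from the thread:

```python
degree = "\N{DEGREE SIGN}"          # U+00B0, inside the 16-bit range
clef = "\N{MUSICAL SYMBOL G CLEF}"  # U+1D11E, outside the 16-bit range

for label, ch in (("degree", degree), ("clef", clef)):
    print(label, hex(ord(ch)))
    for codec in ("utf-8", "utf-16-le", "utf-32-le"):
        # Same code point, different byte sequence per encoding.
        print("  ", codec, ch.encode(codec))
```

The clef needs four bytes in UTF-16 (a surrogate pair), which is the case the "16-bit range" remarks elsewhere in these results are about.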
On Tue, Oct 27, 2009 at 06:21:11AM EDT, Lie Ryan wrote:
> Chris Jones wrote:
[..]
>> Best part of Unicode is that there are multiple encodings, right? ;-)
>
> No, the best part about Unicode is there is no encoding!
> Unicode does not define any encoding;
RFC 3629:
"ISO/IEC 10646 and Unicode
Chris Jones wrote:
On Wed, Oct 21, 2009 at 12:35:11PM EDT, Nobody wrote:
[..]
Characters outside the 16-bit range aren't supported on all builds.
They won't be supported on most Windows builds, as Windows uses 16-bit
Unicode extensively:
I knew nothing about UTF-16 & friends before this thre
On Thu, 22 Oct 2009 17:08:21 -0300, wrote:
On 10/22/2009 03:23 AM, Gabriel Genellina wrote:
On Wed, 21 Oct 2009 15:14:32 -0300, wrote:
On Oct 21, 4:59 am, Bruno Desthuilliers wrote:
beSTEfar wrote:
(snip)
> When parsing strings, use Regular Expressions.
And now you have _two_ p
On 10/22/2009 03:23 AM, Gabriel Genellina wrote:
> On Wed, 21 Oct 2009 15:14:32 -0300, wrote:
>
>> On Oct 21, 4:59 am, Bruno Desthuilliers wrote:
>>> beSTEfar wrote:
>>> (snip)
>>> > When parsing strings, use Regular Expressions.
>>>
>>> And now you h
On Wed, Oct 21, 2009 at 12:35:11PM EDT, Nobody wrote:
[..]
> Characters outside the 16-bit range aren't supported on all builds.
> They won't be supported on most Windows builds, as Windows uses 16-bit
> Unicode extensively:
I knew nothing about UTF-16 & friends before this thread.
Best part of
On Wed, 21 Oct 2009 15:14:32 -0300, wrote:
On Oct 21, 4:59 am, Bruno Desthuilliers wrote:
beSTEfar wrote:
(snip)
> When parsing strings, use Regular Expressions.
And now you have _two_ problems
For some simple parsing problems, Python's string methods are powerful
enough to make REs
Nobody wrote:
Just curious, why did you choose to set the upper boundary at 0x?
Characters outside the 16-bit range aren't supported on all builds. They
won't be supported on most Windows builds, as Windows uses 16-bit Unicode
extensively:
Python 2.5.1 (r251:54863, Apr 18 2007, 08
On Oct 21, 4:59 am, Bruno Desthuilliers wrote:
> beSTEfar wrote:
> (snip)
> > When parsing strings, use Regular Expressions.
>
> And now you have _two_ problems
>
> For some simple parsing problems, Python's string methods are powerful
> enough to make REs overkill. And for any complex enough
On Wed, 21 Oct 2009 05:16:56 -0400, Chris Jones wrote:
>> > Where are the literals (i.e. u'\N{DEGREE SIGN}') defined?
>>
>> You can get them from the unicodedata module, e.g.:
>>
>> import unicodedata
>> for i in xrange(0x1):
>>n = unicodedata.name(unichr(i),None)
>>
beSTEfar wrote:
(snip)
> When parsing strings, use Regular Expressions.
And now you have _two_ problems
For some simple parsing problems, Python's string methods are powerful
enough to make REs overkill. And for any complex enough parsing (any
recursive construct for example - think XML, H
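As an illustration of the "string methods are enough" case, here is a sketch that pulls the numbers out of the coordinate string George Trojan posted elsewhere in these results. It assumes Python 3 and a well-formed input with exactly four whitespace-separated fields; the variable names are mine:

```python
s = "48\N{DEGREE SIGN} 13' 16.80\" N"

# split() breaks on whitespace; rstrip() drops the trailing unit marker.
deg_part, min_part, sec_part, hemisphere = s.split()
degrees = int(deg_part.rstrip("\N{DEGREE SIGN}"))
minutes = int(min_part.rstrip("'"))
seconds = float(sec_part.rstrip('"'))

print(degrees, minutes, seconds, hemisphere)
```

Input that does not follow this shape would raise a ValueError here, which is where a regular expression or a real parser starts to earn its keep.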
On Wed, Oct 21, 2009 at 12:20:35AM EDT, Nobody wrote:
> On Tue, 20 Oct 2009 17:56:21 +, George Trojan wrote:
[..]
> > Where are the literals (i.e. u'\N{DEGREE SIGN}') defined?
>
> You can get them from the unicodedata module, e.g.:
>
> import unicodedata
> for i in xrange(0x100
George Trojan wrote:
Scott David Daniels wrote:
...
And if you are unsure of the name to use:
>>> import unicodedata
>>> unicodedata.name(u'\xb0')
'DEGREE SIGN'
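The same lookup works in both directions. A short sketch, assuming Python 3, where the u prefix is no longer needed:

```python
import unicodedata

print(unicodedata.name("\xb0"))           # character -> name
print(unicodedata.lookup("DEGREE SIGN"))  # name -> character
print("\N{DEGREE SIGN}" == "\xb0")        # the \N escape uses the same names
```

This keeps the source file pure ASCII while still naming the character explicitly.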
> Thanks for all suggestions. It took me a while to find out how to
> configure my keyboard to be able to type the degree sign. I
"George Trojan" wrote in message
news:hbktk6$8b...@news.nems.noaa.gov...
Thanks for all suggestions. It took me a while to find out how to
configure my keyboard to be able to type the degree sign. I prefer to
stick with pure ASCII if possible.
Where are the literals (i.e. u'\N{DEGREE SIGN}') d
> Where are the literals (i.e. u'\N{DEGREE SIGN}') defined? I found
> http://www.unicode.org/Public/5.1.0/ucd/UnicodeData.txt
> Is that the place to look?
Correct - you are supposed to fill in a Unicode character name into
the \N escape. The specific list of names depends on the version of
the UCD
On Tue, 20 Oct 2009 17:56:21 +, George Trojan wrote:
> Thanks for all suggestions. It took me a while to find out how to
> configure my keyboard to be able to type the degree sign. I prefer to
> stick with pure ASCII if possible.
> Where are the literals (i.e. u'\N{DEGREE SIGN}') defined? I
Thanks for all suggestions. It took me a while to find out how to
configure my keyboard to be able to type the degree sign. I prefer to
stick with pure ASCII if possible.
Where are the literals (i.e. u'\N{DEGREE SIGN}') defined? I found
http://www.unicode.org/Public/5.1.0/ucd/UnicodeData.txt
Is
Mark Tolonen wrote:
Is there a better way of getting the degrees?
It seems your string is UTF-8. \xc2\xb0 is UTF-8 for DEGREE SIGN. If
you type non-ASCII characters in source code, make sure to declare the
encoding the file is *actually* saved in:
# coding: utf-8
s = '''48° 13' 16.80" N'
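The effect of picking the wrong codec can be reproduced directly. A minimal sketch, assuming Python 3 and using the byte values reported in this thread (\xc2\xb0 after the digits):

```python
raw = b"48\xc2\xb0"  # the bytes behind the degrees field in this thread

as_latin1 = raw.decode("iso-8859-1")  # two characters after "48"
as_utf8 = raw.decode("utf-8")         # one DEGREE SIGN after "48"

print(ascii(as_latin1), ascii(as_utf8))
```

Decoding as iso-8859-1 never fails, because every byte maps to some character, so a wrong guess shows up as extra characters rather than as an error.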
"George Trojan" wrote in message
news:hbidd7$i9...@news.nems.noaa.gov...
A trivial one, this is the first time I have to deal with Unicode. I am
trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is
"iso-8859-1". To get the degrees I did
>>> encoding='iso-8859-1'
>>> q=s
"George Trojan" wrote in message
news:hbidd7$i9...@news.nems.noaa.gov...
A trivial one, this is the first time I have to deal with Unicode. I am
trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is
"iso-8859-1". To get the degrees I did
>>> encoding='iso-8859-1'
>>> q=s
On 19 Oct, 21:07, George Trojan wrote:
> A trivial one, this is the first time I have to deal with Unicode. I am
> trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is
> "iso-8859-1". To get the degrees I did
> >>> encoding='iso-8859-1'
> >>> q=s.decode(encoding)
> >>> q.spl
George Trojan wrote:
A trivial one, this is the first time I have to deal with Unicode. I am
trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is
"iso-8859-1". To get the degrees I did
>>> encoding='iso-8859-1'
>>> q=s.decode(encoding)
>>> q.split()
[u'48\xc2\xb0', u"13
A trivial one, this is the first time I have to deal with Unicode. I am
trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is
"iso-8859-1". To get the degrees I did
>>> encoding='iso-8859-1'
>>> q=s.decode(encoding)
>>> q.split()
[u'48\xc2\xb0', u"13'", u'16.80"', u'N']
>>> r=
jeffunit wrote:
>>That looks like a "surrogate escape" (See PEP 383)
>>http://www.python.org/dev/peps/pep-0383/. It indicates the wrong
>>encoding was used to decode the filename.
>
> That seems likely. How do I set the encoding to something correct to
> decode the filename?
>
> Clearly win
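For readers who have not met PEP 383, this is roughly what a surrogate escape looks like. A sketch, assuming Python 3; the file name is hypothetical and stands in for any name whose bytes are not valid UTF-8:

```python
raw_name = b"caf\xe9.txt"  # hypothetical file name stored as Latin-1 bytes

# Decoding as strict UTF-8 would fail on \xe9; surrogateescape carries the
# offending byte through as the lone surrogate U+DCE9 instead.
name = raw_name.decode("utf-8", "surrogateescape")
print(ascii(name))

# A strict encoder rejects the lone surrogate, which may be the kind of
# failure seen when such a name is printed.
try:
    name.encode("utf-8")
except UnicodeEncodeError as exc:
    print("strict encode failed:", exc.reason)

# Encoding with the same handler restores the original bytes.
restored = name.encode("utf-8", "surrogateescape")
print(restored == raw_name)
```

The round trip is the point of the mechanism: the name can be passed back to the operating system unchanged even though it cannot be displayed as is.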
On Tue, Sep 15, 2009 at 9:48 PM, jeffunit wrote:
> At 09:25 PM 9/15/2009, Mark Tolonen wrote:
>>
>> "jeffunit" wrote in message
>> news:20090915144123964.ljka6...@cdptpa-omta01.mail.rr.com...
>>>
>>> I wrote a program that diffs files and prints out matching file names.
>>> I will be executing th
At 09:25 PM 9/15/2009, Mark Tolonen wrote:
"jeffunit" wrote in message
news:20090915144123964.ljka6...@cdptpa-omta01.mail.rr.com...
I wrote a program that diffs files and prints out matching file names.
I will be executing the output with sh, to delete select files.
Most of the files names are
"jeffunit" wrote in message
news:20090915144123964.ljka6...@cdptpa-omta01.mail.rr.com...
I wrote a program that diffs files and prints out matching file names.
I will be executing the output with sh, to delete select files.
Most of the files names are plain ascii, but about 10% of them have
un
I wrote a program that diffs files and prints out matching file names.
I will be executing the output with sh, to delete select files.
Most of the file names are plain ascii, but about 10% of them have unicode
characters in them. When I try to print the string containing the name, I get
an excep
On Sun, 30 Aug 2009 02:36:49 +, Steven D'Aprano wrote:
>>> So long as your terminal has a sensible encoding, and you have a good
>>> quality font, you should be able to print any string you can create.
>>
>> UTF-8 isn't a particularly sensible encoding for terminals.
>
> Did I mention UTF-8?
On Sat, 29 Aug 2009 20:09:12 +0100, Nobody wrote:
> On Sat, 29 Aug 2009 08:26:54 +, Steven D'Aprano wrote:
>
>> Python only needs to know when you convert the text to or from bytes. I
>> can do this:
>>
> s = "hello"
> t = "world"
> print(' '.join([s, t]))
>> hello world
>>
>> a
On Sat, 29 Aug 2009 08:26:54 +, Steven D'Aprano wrote:
> Python only needs to know when you convert the text to or from bytes. I
> can do this:
>
s = "hello"
t = "world"
print(' '.join([s, t]))
> hello world
>
> and not need to care anything about encodings.
>
> So long as y
On Sat, 29 Aug 2009 09:34:43 +0200, Thorsten Kampe wrote:
> * Rami Chowdhury (Thu, 27 Aug 2009 09:44:41 -0700)
>> > Further, does anything, except a printing device need to know the
>> > encoding of a piece of "text"?
>
> Python needs to know if you are processing the text.
Python only needs to
* Rami Chowdhury (Thu, 27 Aug 2009 09:44:41 -0700)
> > Further, does anything, except a printing device need to know the
> > encoding of a piece of "text"?
Python needs to know if you are processing the text.
> I may be wrong, but I believe that's part of the idea behind the separation
> of strin
On Thu, 2009-08-27 at 22:09 +0530, Shashank Singh wrote:
> Hi All!
>
> I have a very simple (and probably stupid) question eluding me.
> When exactly is the char-set information needed?
>
> To make my question clear consider reading a file.
> While reading a file, all I get is basically an array
Further, does anything, except a printing device need to know the
encoding of a piece of "text"?
I may be wrong, but I believe that's part of the idea behind the separation
of string and bytes types in Python 3.x. I believe, if you are using
Python 3.x, you don't need the character encoding mum
Hi All!
I have a very simple (and probably stupid) question eluding me.
When exactly is the char-set information needed?
To make my question clear consider reading a file.
While reading a file, all I get is basically an array of bytes.
Now suppose a file has 10 bytes in it (all is data, no metad
Ben Edwards (lists) wrote:
> Firstly sys.setdefaultencoding('iso-8859-1') does not work, I have to do
> sys.setdefaultencoding = 'iso-8859-1'
That "works", but has no effect. You bind the variable
sys.setdefaultencoding to some value, but that value is never used for
anything (do sys.getdefaultenc
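The "works, but has no effect" point can be checked in a few lines. A sketch, assuming a current Python 3, where sys has no setdefaultencoding function at all; the behaviour of plain assignment is the same idea as in the 2.4 case discussed here:

```python
import sys

before = sys.getdefaultencoding()

# Plain assignment only creates (or rebinds) a module attribute named
# setdefaultencoding; nothing is called, so nothing changes.
sys.setdefaultencoding = "iso-8859-1"
after = sys.getdefaultencoding()

del sys.setdefaultencoding  # remove the stray attribute again
print(before, after)
```

Both values printed are the same, which is the check suggested in the reply above.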
Ben Edwards (lists) wrote:
> I am using python 2.4 on Ubuntu dapper, I am working through Dive into
> Python.
>
> There are a couple of inconsistencies.
>
> Firstly sys.setdefaultencoding('iso-8859-1') does not work, I have to do
> sys.setdefaultencoding = 'iso-8859-1'
When you run a Python script
"Ben Edwards (lists)" <[EMAIL PROTECTED]> wrote:
> I am using python 2.4 on Ubuntu dapper, I am working through Dive
> into Python.
...
> Any insight?
> Ben
Did you follow all the instructions, or did you try to call
sys.setdefaultencoding interactively?
See:
http://diveintopython.org/xml_pro
I am using python 2.4 on Ubuntu dapper, I am working through Dive into
Python.
There are a couple of inconsistencies.
Firstly sys.setdefaultencoding('iso-8859-1') does not work, I have to do
sys.setdefaultencoding = 'iso-8859-1'
secondly the following does not give a 'UnicodeError: ASCII encodin
John Machin wrote:
> ... and yes Peter, info travels faster also from China than it does
> from Armenia :-())
Q: Can info travel faster from Armenia than from China?
Radio Yerevan: In principle, yes. Just make sure that it doesn't go the
other way round the globe or meets some friends on the way.
E, it gets worse: not only is the title written in Chinese, it
is encoded as gb2312 -- here is the repr() of the first few chunks:
"\n\n\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) :
\xc4\xda\xb2\xbf\xc8\xcb\xd4\xb1\xb3\xd6\xb9\xc9 -
\xcb\xd1\xba\xfc\xb9\xc9\xc6\xb1\n\n"
and here is wha
[EMAIL PROTECTED] wrote:
> Mr. John Machin
>
> This question comes from the following code. I use PyXML to build a DOM
> tree.
>
> from xml.dom.ext.reader import HtmlLib
> doc =
> HtmlLib.FromHtmlUrl('http://stock.business.sohu.com/q/nbcg.php?code=600028')
> title_elem = doc.documentElement.getElem
Mr. John Machin
This question comes from the following code. I use PyXML to build a DOM
tree.
from xml.dom.ext.reader import HtmlLib
doc =
HtmlLib.FromHtmlUrl('http://stock.business.sohu.com/q/nbcg.php?code=600028')
title_elem = doc.documentElement.getElementsByTagName("TITLE")[0]
title_string = t
Mr. John Machin, Thank you very much!
What do you mean by "ansi string"?
Here is a superficially not-unreasonable answer to your more specific
question:
# >>> s1 = u'\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) '
# >>> s2 = '\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) '
# >>> s3 = s1.encode('latin1')
# >>> s2 == s3
# True
But what are you
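In Python 3 terms, the latin1 trick and the decoding that was probably wanted look like this. A sketch only: it assumes the bytes really are gb2312, as another message in this thread reports, and uses just the first eight byte values from s2:

```python
raw = b"\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf"  # byte values taken from s2

# Latin-1 maps bytes 0-255 onto code points 0-255, which is why
# s1.encode('latin1') reproduces s2 byte for byte.
mojibake = raw.decode("latin-1")
print(mojibake.encode("latin-1") == raw)

# Assumption: the page is gb2312. Decoding with that codec pairs the
# bytes up into Chinese characters instead of eight accented letters.
title = raw.decode("gb2312")
print(len(mojibake), len(title))
```

The latin1 round trip recovers the bytes, but the text only becomes meaningful once those bytes are decoded with the codec the page was written in.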
Hello,
I have a unicode string that I want to change to an ANSI string, but
doing so raises an exception.
Could you help me?
## I want to change s1 to s2.
s1 = u'\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) '
s2 = '\xd6\xd0\xb9\xfa\xca\xaf\xbb\xaf(600028) '
"Ian Sparks" <[EMAIL PROTECTED]> writes:
> This is probably stupid and/or misguided but supposing I'm passed a
> byte-string value that I want to be unicode, this is what I do. I'm
> sure I'm missing something very important.
Perhaps you need to read one of the good Python Unicode tutorials,
such
ianaré wrote:
> maybe a bit off topic, but how does one find the console's encoding
> from within python?
>
In [1]: import sys
In [3]: sys.stdout.encoding
Out[3]: 'cp437'
In [4]: sys.stdin.encoding
Out[4]: 'cp437'
Kent
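Outside IPython the same attributes are available from a plain script. A sketch, assuming Python 3; the helper stream_encoding is a made-up name, and the fallback string is my own choice for streams that report nothing:

```python
import io
import sys


def stream_encoding(stream):
    """Return the stream's encoding, or 'unknown' if it reports none."""
    # Some replaced or redirected streams have no usable encoding attribute.
    return getattr(stream, "encoding", None) or "unknown"


print("stdout:", stream_encoding(sys.stdout))
print("stdin: ", stream_encoding(sys.stdin))

# A text wrapper built with an explicit codec reports that codec.
demo = io.TextIOWrapper(io.BytesIO(), encoding="cp437")
print("demo:  ", stream_encoding(demo))
```

The values depend on the terminal and on redirection, so they are worth printing on the machine in question rather than assuming.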
The most important thing that you are missing is that you need to know
the encoding used for the 8-bit-character string. Let's guess that it's
Latin1.
Then all you have to do is use the unicode() builtin function, or the
string decode method.
# >>> s = 'Jos\xe9'
# >>> s
# 'Jos\xe9'
# >>> u = unico
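For comparison, the Python 3 spelling of the same step, where 8-bit strings are bytes and decoding is always explicit. A minimal sketch, keeping the Latin-1 guess made above; cp437 appears only to show what a different guess does:

```python
raw = b"Jos\xe9"  # the 8-bit string from the example above

text = raw.decode("latin-1")  # Python 3 spelling of unicode(s, 'latin1')
print(text == "Jos\u00e9")

# A different guess decodes without error but yields a different character.
other = raw.decode("cp437")
print(ascii(text), ascii(other))
```

Neither call raises, so the only way to know which result is right is to know where the bytes came from.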
maybe a bit off topic, but how does one find the console's encoding
from within python?
First of all, if you run this on the console, find out your console's
encoding. In my case it is English Windows XP. It uses 'cp437'.
C:\>chcp
Active code page: 437
Then
>>> s = "José"
>>> u = u"Jos\u00e9" # same thing in unicode escape
>>> s.decode('cp437') == u # use encoding that
This is probably stupid and/or misguided but supposing I'm passed a byte-string
value that I want to be unicode, this is what I do. I'm sure I'm missing
something very important.
Short version :
>>> s = "José" #Start with non-unicode string
>>> unicoded = eval("u'%s'" % "José")
Long version :
Edward Loper wrote:
> Walter Dörwald wrote:
>> Edward Loper wrote:
>>
>>> [...]
>>> Surely there's a better way than converting back and forth 3 times? Is
>>> there a reason that the 'backslashreplace' error mode can't be used
>>> with codecs.decode?
>>>
>>> >>> 'abc \xff\xe8 def'.decode('ascii
Walter Dörwald wrote:
> Edward Loper wrote:
>
>> [...]
>> Surely there's a better way than converting back and forth 3 times? Is
>> there a reason that the 'backslashreplace' error mode can't be used
>> with codecs.decode?
>>
>> >>> 'abc \xff\xe8 def'.decode('ascii', 'backslashreplace')
>> Trac
Edward Loper wrote:
> [...]
> Surely there's a better way than converting back and forth 3 times? Is
> there a reason that the 'backslashreplace' error mode can't be used with
> codecs.decode?
>
> >>> 'abc \xff\xe8 def'.decode('ascii', 'backslashreplace')
> Traceback (most recent call last):
>
Edward Loper wrote:
> I would like to convert an 8-bit string (i.e., a str) into unicode,
> treating chars \x00-\x7f as ascii, and converting any chars \x80-\xff
> into backslashed escape sequences. I.e., I want something like this:
>
> >>> decode_with_backslashreplace('abc \xff\xe8 def')
> u'a
Edward Loper <[EMAIL PROTECTED]> wrote:
>I would like to convert an 8-bit string (i.e., a str) into unicode,
>treating chars \x00-\x7f as ascii, and converting any chars \x80-\xff
>into backslashed escape sequences. I.e., I want something like this:
>
> >>> decode_with_backslashreplace('abc \xff
I would like to convert an 8-bit string (i.e., a str) into unicode,
treating chars \x00-\x7f as ascii, and converting any chars \x80-\xff
into backslashed escape sequences. I.e., I want something like this:
>>> decode_with_backslashreplace('abc \xff\xe8 def')
u'abc \\xff\\xe8 def'
The best I c
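For what it is worth, later Python versions accept this error handler when decoding. A sketch, assuming Python 3.5 or newer; it does not apply to the Python 2 interpreter used in this thread:

```python
data = b"abc \xff\xe8 def"

# Assumption: Python 3.5 or later, where 'backslashreplace' is accepted
# for decoding as well as encoding.
text = data.decode("ascii", "backslashreplace")
print(text)
```

The result is a str in which each undecodable byte has become a literal backslash escape, which matches the output asked for above.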
Hi Max. Many thanks for helping to realize where I was missing the point
and making this clearer.
Regards,
David
Max Erickson wrote:
> The encoding argument to unicode() is used to specify the encoding of the
> string that you want to translate into unicode. The interpreter stores
> unicode as
Hi Erik. Thank you for your reply. The advice has helped clarify this
for me.
Regards,
David
Erik Max Francis wrote:
> David Pratt wrote:
>
>
>>This is not working for me. Can someone explain why. Many thanks.
>
>
> Because '\xbe' isn't UTF-8 for the character you want, '\xc2\xbe' is, as
Hi Martin. Many thanks for your reply. What I am really after, the
following accomplishes.
>
> If you are looking for "at the same time", perhaps this is also
> interesting:
>
> py> unicode('\xbe', 'windows-1252').encode('utf-8')
> '\xc2\xbe'
>
Your answer really helped quite a bit to clarify t
The encoding argument to unicode() is used to specify the encoding of the
string that you want to translate into unicode. The interpreter stores
unicode as unicode, it isn't encoded...
>>> unicode('\xbe','cp1252')
u'\xbe'
>>> unicode('\xbe','cp1252').encode('utf-8')
'\xc2\xbe'
>>>
max
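The same two steps in Python 3, where the decode and the encode are separate method calls. A minimal sketch, assuming the input byte comes from a cp1252 source as described in this thread:

```python
raw = b"\xbe"  # the 3/4 symbol as a cp1252 byte

text = raw.decode("cp1252")  # bytes -> text; needs the source encoding
utf8 = text.encode("utf-8")  # text -> bytes for storage

print(ascii(text), utf8)
```

Trying raw.decode("utf-8") instead raises UnicodeDecodeError, because a lone \xbe is not a valid UTF-8 sequence.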
Hi. I am working through some tutorials on unicode and am hoping that
someone can help explain this for me. I am on mac platform using python
2.4.1 at the moment. I am experimenting with unicode with the 3/4 symbol.
I want to prepare strings for db storage that come from normal Windows
machin
David Pratt wrote:
> I want to prepare strings for db storage that come from normal Windows
> machine (cp1252) so my understanding is to unicode and encode to utf-8
> and to store properly.
That also depends on the database. The database must accept
UTF-8-encoded strings, and must not modify them
David Pratt wrote:
> This is not working for me. Can someone explain why. Many thanks.
Because '\xbe' isn't UTF-8 for the character you want, '\xc2\xbe' is, as
you just showed yourself in the code snippet.
--
Erik Max Francis && [EMAIL PROTECTED] && http://www.alcyone.com/max/
San Jose, CA, US
* Serge Orlov [23:45 26/03/05 CET]:
Nicolas Evrard wrote:
Hello,
I'm puzzled by this test I made while trying to transform a page in
html to plain text. Because I cannot send unicode to feed, nor str so
how can I do this ?
Seems like the parser is in the broken state after the first exception.
Fe
Nicolas Evrard wrote:
> Hello,
>
> I'm puzzled by this test I made while trying to transform a page in
> html to plain text. Because I cannot send unicode to feed, nor str so
> how can I do this ?
Seems like the parser is in the broken state after the first exception.
Feed only binary strings to i
Hello,
I'm puzzled by this test I made while trying to transform a page in
html to plain text. Because I cannot send unicode to feed, nor str so
how can I do this ?
[EMAIL PROTECTED]:~$ python2.4
.Python 2.4.1c2 (#2, Mar 19 2005, 01:04:19)
.[GCC 3.3.5 (Debian 1:3.3.5-12)] on linux2
.Type "help", "
On Tue, 23 Nov 2004 20:37:04 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=
<[EMAIL PROTECTED]> wrote:
>Steve Holden wrote:
>> Am I the only person who found it scary that Bengt could apparently
>> casually drop on a polynomial that would decode to " Löwis"?
Well, don't give me too much credit