Re: Print encoding problems in console

2011-07-15 Thread Andrew Berg
On 2011.07.15 07:02 PM, Pedro Abranches wrote: > Now, if you're using your python script in some shell script you > might have to store the output in some variable, like this: > > $ var=`python -c 'import sys; print sys.stdout.encoding; print > u

Re: Print encoding problems in console

2011-07-15 Thread Dan Stromberg
I've used the code below successfully to deal with such a problem when outputting filenames. Python2x3 is at http://stromberg.dnsalias.org/svn/python2x3/ , but here it's just being used to convert Python 3.x's byte strings to strings (to eliminate the b'' stuff), while on 2.x it's an identity func

Print encoding problems in console

2011-07-15 Thread Pedro Abranches
Hello everyone. I'm having a problem when outputting UTF-8 strings to a console. Let me show a simple example that explains it: $ python -c 'import sys; print sys.stdout.encoding; print u"\xe9"' UTF-8 é Everything is OK here. Now, if you're using your python script in some shell script you might hav
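A minimal sketch of the situation described above, written in Python 3 syntax rather than the Python 2 of the original post. When output is captured by the shell instead of going to a terminal, the stream may report no encoding (in Python 2, `sys.stdout.encoding` is `None` for a pipe), so the text has to be encoded explicitly. The helper name `encode_for_stream` and the UTF-8 fallback are assumptions made for illustration; they are not from the thread.

```python
def encode_for_stream(text, stream, fallback="utf-8"):
    """Encode text with the stream's encoding, or a fallback if it has none."""
    # A redirected or captured stream may expose encoding=None (or no
    # such attribute at all); in that case use an explicit fallback.
    encoding = getattr(stream, "encoding", None) or fallback
    # errors="replace" substitutes "?" instead of raising when the
    # stream's encoding cannot represent a character.
    return text.encode(encoding, errors="replace")
```

The resulting bytes could then be written to a binary stream such as `sys.stdout.buffer`, so that no implicit ASCII conversion takes place.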

Re: How to force SAX parser to ignore encoding problems

2009-08-06 Thread Stefan Behnel
Łukasz wrote: > I have a problem with my XML parser (created with libraries from > xml.sax package). When the parser finds an invalid character (in a CDATA > section), for example �, it throws a SAXParseException. > > Is there any way to just ignore this kind of problem? Maybe there is a > way to

Re: How to force SAX parser to ignore encoding problems

2009-07-31 Thread Łukasz
On 31 Jul, 09:28, Łukasz wrote: > Hi, > I have a problem with my XML parser (created with libraries from > xml.sax package). When the parser finds an invalid character (in a CDATA > section), for example , After sending this message I noticed that the example invalid characters are not displayed on some pla

How to force SAX parser to ignore encoding problems

2009-07-31 Thread Łukasz
Hi, I have a problem with my XML parser (created with libraries from xml.sax package). When the parser finds an invalid character (in a CDATA section), for example �, it throws a SAXParseException. Is there any way to just ignore this kind of problem? Maybe there is a way to set up parser in less
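One possible workaround, sketched here under assumptions rather than taken from the thread: strip characters that XML 1.0 does not allow before handing the text to `xml.sax`. The names `strip_illegal_xml_chars`, `TextCollector`, and `parse_lenient` are hypothetical, and the pattern covers only control characters plus U+FFFE and U+FFFF, not every illegal code point.

```python
import re
import xml.sax

# Control characters other than tab, LF and CR are not allowed in XML 1.0,
# nor are U+FFFE and U+FFFF. (Partial list, for illustration only.)
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def strip_illegal_xml_chars(text):
    """Remove characters that would make the parser reject the document."""
    return _ILLEGAL_XML_CHARS.sub("", text)


class TextCollector(xml.sax.ContentHandler):
    """Minimal handler that gathers all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_lenient(xml_text):
    """Parse xml_text after dropping illegal characters; return its text."""
    handler = TextCollector()
    cleaned = strip_illegal_xml_chars(xml_text)
    xml.sax.parseString(cleaned.encode("utf-8"), handler)
    return "".join(handler.chunks)
```

This changes the data before parsing instead of making the parser itself more tolerant, so it only fits cases where silently losing those characters is acceptable.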

Re: encoding problems

2007-08-29 Thread Damjan
> > is there a way to sort this string properly (sorted()?) > I mean first 'a' then 'à' then 'e' etc. (sorted puts accented letters at > the end). Or should I have to provide a comparison function to sorted? After setting the locale... locale.strcoll() -- damjan -- http://mail.python.org/mai
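A sketch of the `locale.strcoll()` suggestion, assuming Python 3. The function name `locale_sorted` is hypothetical, and any locale name passed in (for instance `"fr_FR.UTF-8"`) is an assumption about what is installed on the system.

```python
import functools
import locale


def locale_sorted(words, locale_name=""):
    """Sort words using the collation rules of the given locale."""
    previous = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        # "" selects the user's default locale; locale.Error is raised
        # if the requested locale is not available.
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return sorted(words, key=functools.cmp_to_key(locale.strcoll))
    finally:
        # Restore the previous collation setting for the rest of the program.
        locale.setlocale(locale.LC_COLLATE, previous)
```

With a suitable French locale installed, accented letters would be expected to sort next to their base letters instead of at the end; the exact order depends on the platform's collation tables.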

Re: encoding problems

2007-08-29 Thread Diez B. Roggisch
Ricardo Aráoz wrote: > Lawrence D'Oliveiro wrote: >> In message <[EMAIL PROTECTED]>, tool69 wrote: >> >>> p2.content = """Ce poste possède des accents : é à ê è""" >> >> My guess is this is being encoded as a Latin-1 string, but when you try >> to output it it goes through the ASCII encoder, whi

Re: encoding problems

2007-08-29 Thread Ricardo Aráoz
Lawrence D'Oliveiro wrote: > In message <[EMAIL PROTECTED]>, tool69 wrote: > >> p2.content = """Ce poste possède des accents : é à ê è""" > > My guess is this is being encoded as a Latin-1 string, but when you try to > output it it goes through the ASCII encoder, which doesn't understand the > ac

Re: encoding problems

2007-08-29 Thread tool69
Diez B. Roggisch wrote: > tool69 wrote: > >> Hi, >> >> I would like to transform reST contents to HTML, but got problems >> with accented chars. >> >> Here's a rather simplified version using SVN Docutils 0.5: >> >> %- >> >> #!/usr/bin

Re: encoding problems

2007-08-29 Thread tool69
Lawrence D'Oliveiro wrote: > In message <[EMAIL PROTECTED]>, tool69 wrote: > >> p2.content = """Ce poste possède des accents : é à ê è""" > > My guess is this is being encoded as a Latin-1 string, but when you try to > output it it goes through the ASCII encoder, which doesn't understand the >

Re: encoding problems

2007-08-29 Thread Diez B. Roggisch
tool69 wrote: > Hi, > > I would like to transform reST contents to HTML, but got problems > with accented chars. > > Here's a rather simplified version using SVN Docutils 0.5: > > %- > > #!/usr/bin/env python > # -*- coding: utf-8 -*-

Re: encoding problems

2007-08-29 Thread Lawrence D'Oliveiro
In message <[EMAIL PROTECTED]>, tool69 wrote: > p2.content = """Ce poste possède des accents : é à ê è""" My guess is this is being encoded as a Latin-1 string, but when you try to output it it goes through the ASCII encoder, which doesn't understand the accents. Try this: p2.content = u"""Ce po
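A small sketch of the mismatch guessed at above, in Python 3 syntax (where every string literal is already Unicode and the `u` prefix is accepted but optional). The helper `try_encode` is a hypothetical name used only for this illustration.

```python
content = u"Ce poste possède des accents : é à ê è"


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


ascii_bytes = try_encode(content, "ascii")      # accents are not ASCII
latin1_bytes = try_encode(content, "latin-1")   # one byte per character
utf8_bytes = try_encode(content, "utf-8")       # accents take two bytes each
```

The ASCII attempt fails while the other two succeed, which matches the idea that the text itself is fine and the failure comes from the encoder chosen at output time.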

encoding problems

2007-08-29 Thread tool69
Hi, I would like to transform reST contents to HTML, but got problems with accented chars. Here's a rather simplified version using SVN Docutils 0.5: %- #!/usr/bin/env python # -*- coding: utf-8 -*- from docutils.core import publish_p

Re: encoding problems (é and è)

2006-03-25 Thread Martin v. Löwis
Serge Orlov wrote: > The problem is that U+0587 is a ligature in the Western Armenian dialect > (hy locale) and a character in the Eastern Armenian dialect (hy_AM locale). > It is strange that the code point is marked as a compatibility char. It is either > a mistake or a political decision. It used to be a ligature bef

Re: encoding problems (é and è)

2006-03-24 Thread Serge Orlov
Jean-Paul Calderone wrote: > On Fri, 24 Mar 2006 09:33:19 +1100, John Machin <[EMAIL PROTECTED]> wrote: > >On 24/03/2006 8:36 AM, Peter Otten wrote: > >> John Machin wrote: > >> > >>>You can replace ALL of this upshifting and accent removal in one blow by > >>>using the string translate() method wi

Re: encoding problems (é and è)

2006-03-24 Thread Serge Orlov
Martin v. Löwis wrote: > John Machin wrote: > >> and, for things like u'\u0565\u0582' (ARMENIAN SMALL LIGATURE ECH > >> YIWN), it does not even work. > > > > Sorry, I don't understand. > > 0565 is stand-alone ECH > > 0582 is stand-alone YIWN > > 0587 is the ligature. > > What doesn't work? At first

Re: encoding problems (é and è)

2006-03-24 Thread Martin v. Löwis
John Machin wrote: >> and, for things like u'\u0565\u0582' (ARMENIAN SMALL LIGATURE ECH >> YIWN), it does not even work. > > Sorry, I don't understand. > 0565 is stand-alone ECH > 0582 is stand-alone YIWN > 0587 is the ligature. > What doesn't work? At first guess, in the absence of an Armenian
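The code points under discussion can be inspected with `unicodedata`. This is a sketch in Python 3 syntax; the names `LIGATURE`, `SEPARATE`, and `compat_decompose` are made up for the example.

```python
import unicodedata

LIGATURE = "\u0587"        # ARMENIAN SMALL LIGATURE ECH YIWN
SEPARATE = "\u0565\u0582"  # stand-alone ECH followed by stand-alone YIWN


def compat_decompose(text):
    """Apply compatibility decomposition (NFKD) to text."""
    return unicodedata.normalize("NFKD", text)


# The ligature has a compatibility decomposition into the two letters,
# but canonical normalization (NFC) leaves it untouched.
decomposed = compat_decompose(LIGATURE)
unchanged = unicodedata.normalize("NFC", LIGATURE)
```

Because the mapping is a compatibility one, normalizing the two separate letters does not produce the ligature again, so the transformation is one-way.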

Re: encoding problems (é and è)

2006-03-24 Thread Fredrik Lundh
John Machin wrote: > Some of the transformations are a little unfortunate :-( here's a slightly silly way to map a unicode string to its "unaccented" version: ### import unicodedata, sys CHAR_REPLACEMENT = { 0xc6: u"AE", # LATIN CAPITAL LETTER AE 0xd0: u"D", # LATIN CAPITAL LETTER ETH
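The snippet above is cut off, so the following is not a reconstruction of it. It is an independent sketch of the same general idea in Python 3: a small replacement table for letters that have no decomposition, plus `unicodedata` to drop combining marks from everything else. Only the two table entries visible in the snippet are used; the function name `unaccent` is an assumption.

```python
import unicodedata

# Letters with no Unicode decomposition need an explicit replacement.
CHAR_REPLACEMENT = {
    0xC6: "AE",  # LATIN CAPITAL LETTER AE
    0xD0: "D",   # LATIN CAPITAL LETTER ETH
}


def unaccent(text):
    """Return text with accents removed where a plain equivalent exists."""
    out = []
    for ch in text:
        if ord(ch) in CHAR_REPLACEMENT:
            out.append(CHAR_REPLACEMENT[ord(ch)])
            continue
        # NFKD splits "é" into "e" plus a combining acute accent;
        # combining marks are then filtered out.
        decomposed = unicodedata.normalize("NFKD", ch)
        out.append("".join(c for c in decomposed
                           if not unicodedata.combining(c)))
    return "".join(out)
```

Characters that neither decompose nor appear in the table pass through unchanged, so the table would need extending for full Latin-1 coverage.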

Re: encoding problems (é and è)

2006-03-24 Thread John Machin
On 24/03/2006 11:44 PM, Peter Otten wrote: > John Machin wrote: > > >>0x00d0: ord('D'), # Ð >>0x00f0: ord('o'), # ð >>Icelandic capital eth becomes D, OK; but the small letter becomes o!!! > > > I see information flow from Iceland is a bit better than from Armenia :-) No information flow neede

Re: encoding problems (é and è)

2006-03-24 Thread Walter Dörwald
Duncan Booth wrote: > [...] > Unfortunately, just as I finished writing this I discovered that the > latscii module isn't as robust as I thought, it blows up on consecutive > accented characters. > > :( Replace the error handler with this (untested) and it should work with consecutive accent
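The replacement handler itself is cut off above. As a separate illustration, not the code from the post and not the latscii module, here is a sketch of a `codecs` error handler in Python 3 that processes the whole unencodable run at once, which is what makes consecutive accented characters work. The handler name `strip_accents` is an assumption.

```python
import codecs
import unicodedata


def strip_accents_handler(exc):
    """Replace an unencodable run with its ASCII base letters."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # exc.start:exc.end may span several consecutive characters,
    # so handle the full slice rather than a single character.
    bad = exc.object[exc.start:exc.end]
    decomposed = unicodedata.normalize("NFKD", bad)
    replacement = "".join(c for c in decomposed if ord(c) < 128)
    # Resume encoding right after the run that was just handled.
    return replacement, exc.end


codecs.register_error("strip_accents", strip_accents_handler)
```

Usage would look like `text.encode("ascii", "strip_accents")`. Characters without an ASCII base letter are dropped by this sketch rather than mapped.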

Re: encoding problems (é and è)

2006-03-24 Thread Peter Otten
John Machin wrote: > 0x00d0: ord('D'), # Ð > 0x00f0: ord('o'), # ð > Icelandic capital eth becomes D, OK; but the small letter becomes o!!! I see information flow from Iceland is a bit better than from Armenia :-) > Some of the transformations are a little unfortunate :-( The OP, as you pointed

Re: encoding problems (é and è)

2006-03-24 Thread John Machin
On 24/03/2006 8:11 PM, Duncan Booth wrote: > Peter Otten wrote: > > >>>You can replace ALL of this upshifting and accent removal in one blow >>>by using the string translate() method with a suitable table. >> >>Only if you convert to unicode first or if your data maintains 1 byte >>== 1 character

Re: encoding problems (é and è)

2006-03-24 Thread Peter Otten
Duncan Booth wrote: > There's a nice little codec from Skip Montanaro for removing accents from > latin-1 encoded strings. It also has an error handler so you can convert > from unicode to ascii and strip all the accents as you do so: > > http://orca.mojam.com/~skip/python/latscii.py > import

Re: encoding problems (é and è)

2006-03-24 Thread Duncan Booth
Peter Otten wrote: >> You can replace ALL of this upshifting and accent removal in one blow >> by using the string translate() method with a suitable table. > > Only if you convert to unicode first or if your data maintains 1 byte > == 1 character, in particular it is not UTF-8. > There's a ni

Re: encoding problems (é and è)

2006-03-23 Thread John Machin
On 24/03/2006 2:19 PM, Jean-Paul Calderone wrote: > On Fri, 24 Mar 2006 09:33:19 +1100, John Machin <[EMAIL PROTECTED]> > wrote: > >> On 24/03/2006 8:36 AM, Peter Otten wrote: >> >>> John Machin wrote: >>> You can replace ALL of this upshifting and accent removal in one blow by us

Re: encoding problems (é and è)

2006-03-23 Thread Jean-Paul Calderone
On Fri, 24 Mar 2006 09:33:19 +1100, John Machin <[EMAIL PROTECTED]> wrote: >On 24/03/2006 8:36 AM, Peter Otten wrote: >> John Machin wrote: >> >>>You can replace ALL of this upshifting and accent removal in one blow by >>>using the string translate() method with a suitable table. >> >> Only if you

Re: encoding problems (é and è)

2006-03-23 Thread John Machin
On 24/03/2006 8:36 AM, Peter Otten wrote: > John Machin wrote: > >>You can replace ALL of this upshifting and accent removal in one blow by >>using the string translate() method with a suitable table. > > Only if you convert to unicode first or if your data maintains 1 byte == 1 > character, in p

Re: encoding problems (é and è)

2006-03-23 Thread Peter Otten
John Machin wrote: > You can replace ALL of this upshifting and accent removal in one blow by > using the string translate() method with a suitable table. Only if you convert to unicode first or if your data maintains 1 byte == 1 character, in particular it is not UTF-8. Peter
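A sketch of the `translate()` approach on Unicode text, in Python 3 syntax. The table holds only the four mappings that appear in the original question; `ACCENT_TABLE` and `upshift` are hypothetical names.

```python
# Map each accented character to its replacement in a single pass.
ACCENT_TABLE = str.maketrans({
    "\xe9": "E",  # é
    "\xe8": "E",  # è
    "\xc9": "E",  # É
    "\xc7": "C",  # Ç
})


def upshift(text):
    """Replace the listed accented letters, then uppercase the rest."""
    return text.translate(ACCENT_TABLE).upper()


# The caveat about UTF-8: one character is not always one byte.
utf8_length = len("\xe9".encode("utf-8"))
latin1_length = len("\xe9".encode("latin-1"))
```

This is why a byte-level table works for Latin-1 data but not for UTF-8: in UTF-8 the accented letter occupies more than one byte, so a one-byte-to-one-byte table cannot match it.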

Re: encoding problems (é and è)

2006-03-23 Thread John Machin
On 23/03/2006 10:07 PM, bussiere bussiere wrote: > Hi, I'm making a program for formatting strings, > or > I've added: > #!/usr/bin/python > # -*- coding: utf-8 -*- > > at the beginning of my script but > > str = str.replace('Ç', 'C') > str = str.replace('é', 'E') > str = str.repl

Re: encoding problems (é and è)

2006-03-23 Thread Larry Bates
Seems to work fine for me. >>> x="éÇ" >>> x=x.replace('é','E') 'E\xc7' >>> x=x.replace('Ç','C') >>> x 'E\xc7' >>> x=x.replace('Ç','C') >>> x 'EC' You should also be able to use .upper() method to uppercase everything in the string in a single statement: tstr=ligneA.upper() Note: you should neve

Re: encoding problems (é and è)

2006-03-23 Thread Christoph Zwerschke
bussiere bussiere wrote: > Hi, I'm making a program for formatting strings, > I've added: > #!/usr/bin/python > # -*- coding: utf-8 -*- > > at the beginning of my script but > > str = str.replace('Ç', 'C') > ... > doesn't work, it gives me " and , instead of replacing é by E Are you sure your scr

encoding problems (é and è)

2006-03-23 Thread bussiere bussiere
Hi, I'm making a program for formatting strings, or I've added: #!/usr/bin/python # -*- coding: utf-8 -*- at the beginning of my script but str = str.replace('Ç', 'C') str = str.replace('é', 'E') str = str.replace('É', 'E') str = str.replace('è', 'E') str = str.rep

Encoding problems with gettext and wxPython: how to do things in "good style"

2006-03-01 Thread André
I'm trying to change an app so that it uses gettext for translations rather than the idiosyncratic way I am using. I've tried the example on the wxPython wiki http://wiki.wxpython.org/index.cgi/RecipesI18n but found that the accented letters would not display properly. I have found a workaround t

Re: encoding problems with pymssql / win

2006-02-11 Thread morris carre
> (to email use "boris at batiment71 dot ch") oops, that's "boris at batiment71 dot net"

encoding problems with pymssql / win

2006-02-11 Thread morris carre
I have a strange problem: some code that fetches queries from an MSSQL database works fine under IDLE, but the very same code run from a shell window gets its strings garbled, as if the code page had been changed. This occurs only when using pymssql to connect; if I connect through odbc
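The cause is not confirmed in the snippet. As an illustration of how a code page mismatch garbles text, here is a sketch in Python 3 that decodes the same bytes with two different code pages. The choice of cp1252 and cp850 (a common Windows GUI and console pair in Western Europe) is an assumption, as is the helper name `decode_with`.

```python
def decode_with(raw, codepage):
    """Decode raw bytes using the named code page."""
    return raw.decode(codepage, errors="replace")


# The word "été" as bytes in cp1252.
raw = "\xe9t\xe9".encode("cp1252")

as_gui = decode_with(raw, "cp1252")      # same code page: text round-trips
as_console = decode_with(raw, "cp850")   # different code page: garbled text
```

Comparing `as_gui` and `as_console` shows that the bytes themselves are identical; only the interpretation differs, which is consistent with the same query behaving differently in two environments.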