Re: I18n issue with optik

2007-04-02 Thread Thorsten Kampe
* Leo Kislov (1 Apr 2007 14:24:17 -0700)
 On Apr 1, 8:47 am, Thorsten Kampe [EMAIL PROTECTED] wrote:
  I guess the culprit is this snippet from optparse.py:
 
  # used by test suite
  def _get_encoding(self, file):
      encoding = getattr(file, "encoding", None)
      if not encoding:
          encoding = sys.getdefaultencoding()
      return encoding
 
  def print_help(self, file=None):
      """print_help(file : file = stdout)
 
      Print an extended help message, listing all options and any
      help text provided with them, to 'file' (default stdout).
      """
      if file is None:
          file = sys.stdout
      encoding = self._get_encoding(file)
      file.write(self.format_help().encode(encoding, "replace"))
 
  So this means: when the encoding of sys.stdout is US-ASCII, Optparse
  sets the encoding of the help text to ASCII, too.
 
 .encode() method doesn't set an encoding. It encodes unicode text into
 bytes according to specified encoding. That means optparse needs ascii
 or unicode (at least) for help text. In other words you'd better use
 unicode throughout your program.
 
  But that's
  nonsense because the Encoding is declared in the Po (localisation)
  file.
 
 For backward compatibility gettext is working with bytes by default,
 so the PO file encoding is not even involved. You need to use unicode
 gettext.

You mean

gettext.install('test', unicode = True)
and
description = _(u'THIS SOFTWARE COMES WITHOUT WARRANTY, LIABILITY OR 
SUPPORT!') ?

If I modify my code like this, I don't get any traceback anymore, but 
the non-ascii umlauts are still displayed as question marks.


Thorsten
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: I18n issue with optik

2007-04-02 Thread Thorsten Kampe
* Jarek Zgoda (Sun, 01 Apr 2007 22:02:15 +0200)
 Thorsten Kampe wrote:
 
  Under Windows I get
    File "G:\program files\python\lib\encodings\cp1252.py", line 12, in encode
      return codecs.charmap_encode(input,errors,encoding_table)
  I'm not very experienced with internationalization, but if you change::
 
   gettext.install('test')
 
  to::
 
   gettext.install('test', unicode=True)
 
  what happens?
  
  No traceback anymore from optparse but the non-ascii umlauts are 
  displayed as question marks (?).
 
 And this is expected behaviour of encode() with errors set to 'replace'.
 I think this is the solution to your problem. I was a bit surprised I
 never saw this error, but I always use the unicode=True setting to
 gettext.install()...

I can't see the solution here. Is the optparse print_help function 
wrong? Why should there even be errors if I use unicode = True with 
gettext.install?

I have ISO-8859-15 gettext translations and I want optparse to display 
them correctly. What do I have to do?

Thorsten


Re: I18n issue with optik

2007-04-02 Thread Jarek Zgoda
Thorsten Kampe wrote:

 I can't see the solution here. Is the optparse print_help function 
 wrong? Why should there even be errors if I use unicode = True with 
 gettext.install?
 
 I have ISO-8859-15 gettext translations and I want optparse to display 
 them correctly. What do I have to do?

Please, see gettext module documentation on this topic.

The solution is: always install your translation with unicode=True
setting. This assures usage of ugettext() instead of gettext() and works
properly with character sets other than ASCII. Your messages are
internally decoded to unicode objects and passed to output. Then the
displayed output will be limited only by the encoding of your terminal,
but it will not crash your program on any inconsistency, you would see
question marks.

-- 
Jarek Zgoda

We read Knuth so you don't have to.


Re: I18n issue with optik

2007-04-02 Thread Thorsten Kampe
* Steven Bethard (Sun, 01 Apr 2007 10:21:40 -0600)
 Thorsten Kampe wrote:
 I'm not very experienced with internationalization, but if you change::
 
  gettext.install('test')
 
 to::
 
  gettext.install('test', unicode=True)
 
 what happens?

Actually, this is the solution.

But there's one more problem: the solution only works when the 
Terminal encoding is not US-ASCII. Unfortunately (almost) all 
terminals I tried are set to US-ASCII (rxvt under Cygwin, Console[1] 
running bash, Poderosa[2] running bash). Only the Windows Console is 
CP852 and this works.

I got the tip to set a different encoding by
sys.stdout = codecs.EncodedFile(sys.stdout, 'utf-8')

but unfortunately this does not change the encoding of any Terminal. 
So my question is: how can I set a different encoding to sys.stdout 
(or why can I set it without any error but nothing changes?)


Thorsten

[1] http://sourceforge.net/project/screenshots.php?group_id=43764
[2] http://en.poderosa.org/present/about_poderosa.html


Re: I18n issue with optik

2007-04-02 Thread Thorsten Kampe
* Jarek Zgoda (Mon, 02 Apr 2007 17:52:34 +0200)
 Thorsten Kampe wrote:
 
  I can't see the solution here. Is the optparse print_help function 
  wrong? Why should there even be errors if I use unicode = True with 
  gettext.install?
  
  I have ISO-8859-15 gettext translations and I want optparse to display 
  them correctly. What do I have to do?
 
 Please, see gettext module documentation on this topic.
 
 The solution is: always install your translation with unicode=True
 setting. This assures usage of ugettext() instead of gettext() and works
 properly with character sets other than ASCII. Your messages are
 internally decoded to unicode objects and passed to output. Then the
 displayed output will be limited only by the encoding of your terminal,

You are right. My problem is that all the terminals I use are set to 
US-ASCII (rxvt under Cygwin, Console[1] running bash, Poderosa[2] 
running bash), even those that actually support non-ASCII characters.

I got the tip to set a different encoding by
sys.stdout = codecs.EncodedFile(sys.stdout, 'utf-8')

but unfortunately this does not change the encoding.

So my question is: how can I set a different encoding to sys.stdout 
(or why can I set it without any error but nothing changes?)


Thorsten

[1] http://sourceforge.net/project/screenshots.php?group_id=43764
[2] http://en.poderosa.org/present/about_poderosa.html


Re: I18n issue with optik

2007-04-02 Thread paul
Thorsten Kampe wrote:
[snip]
 I got the tip to set a different encoding by
 sys.stdout = codecs.EncodedFile(sys.stdout, 'utf-8')
 
 but unfortunately this does not change the encoding of any Terminal. 
 So my question is: how can I set a different encoding to sys.stdout 
 (or why can I set it without any error but nothing changes?)
AFAIK you can't. If the terminal is limited to ascii it won't be able to
display anything else; it might not even have the right font, so how are
you supposed to fix that? The .encode(encoding, "replace") ensures safe
downgrades though.
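
To make the "safe downgrade" concrete, here is a minimal sketch of what the different error handlers of encode() produce for an ASCII-only target. It is written for Python 3 (an assumption; the thread itself is about Python 2.5), and the helper name downgrade is made up for illustration.

```python
# -*- coding: utf-8 -*-
# Illustrative only: "downgrade" is a hypothetical helper, not part of optparse.
TEXT = u"GEWÄHRLEISTUNG"


def downgrade(text, encoding, errors):
    """Encode text for a stream that only accepts the given encoding."""
    return text.encode(encoding, errors)


# Compare a few standard error handlers against an ASCII-only target.
for handler in ("replace", "ignore", "backslashreplace", "xmlcharrefreplace"):
    print(handler, downgrade(TEXT, "ascii", handler))
```

With "replace" every character the target encoding cannot represent becomes a question mark, which is what optparse's print_help passes and why no traceback appears once the help text is unicode.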

cheers
 Paul



Re: I18n issue with optik

2007-04-02 Thread Thorsten Kampe
* paul (Mon, 02 Apr 2007 17:49:15 +0200)
 Thorsten Kampe wrote:
 [snip]
  I got the tip to set a different encoding by
  sys.stdout = codecs.EncodedFile(sys.stdout, 'utf-8')
  
  but unfortunately this does not change the encoding of any Terminal. 
  So my question is: how can I set a different encoding to sys.stdout 
  (or why can I set it without any error but nothing changes?)
 AFAIK you can't. If the terminal is limited to ascii it won't be able to
 display anything else; it might not even have the right font, so how are
 you supposed to fix that?

Actually rxvt, Poderosa and console have the ability to display non-
ASCII characters. I use the dejavu fonts that support non-ASCII, too.

But the problem is even simpler: I can't even set the standard Windows 
console (cmd) to Windows 1252 in Python. Although directly executing 
chcp 1252 works.

Thorsten


Re: I18n issue with optik

2007-04-02 Thread Damjan

 Actually rxvt, Poderosa and console have the ability to display non-
 ASCII characters. I use the dejavu fonts that support non-ASCII, too.
 
 But the problem is even simpler: I can't even set the standard Windows
 console (cmd) to Windows 1252 in Python. Although directly executing
 chcp 1252 works.

Maybe try http://sourceforge.net/projects/console; it's claimed to be
much better than the sucky CMD (I don't have Windows to try it).


-- 
damjan


Re: I18n issue with optik

2007-04-02 Thread Thorsten Kampe
* Damjan (Mon, 02 Apr 2007 18:29:06 +0200)
  Actually rxvt, Poderosa and console have the ability to display non-
  ASCII characters. I use the dejavu fonts that support non-ASCII, too.
  
  But the problem is even simpler: I can't even set the standard Windows
  console (cmd) to Windows 1252 in Python. Although directly executing
  chcp 1252 works.
 
  Maybe try http://sourceforge.net/projects/console; it's claimed to be
  much better than the sucky CMD (I don't have Windows to try it).

It is definitely. But it just runs bash or cmd.exe so its capabilities 
(encoding) are defined by Windows or Cygwin.

Thorsten


Re: I18n issue with optik

2007-04-02 Thread Thorsten Kampe
* Thorsten Kampe (Mon, 2 Apr 2007 16:05:25 +0100)
 * Steven Bethard (Sun, 01 Apr 2007 10:21:40 -0600)
  Thorsten Kampe wrote:
  I'm not very experienced with internationalization, but if you change::
  
   gettext.install('test')
  
  to::
  
   gettext.install('test', unicode=True)
  
  what happens?
 
 Actually, this is the solution.
 
 But there's one more problem: the solution only works when the 
 Terminal encoding is not US-ASCII. Unfortunately (almost) all 
 terminals I tried are set to US-ASCII (rxvt under Cygwin, Console[1] 
 running bash, Poderosa[2] running bash). Only the Windows Console is 
 CP852 and this works.
 
 I got the tip to set a different encoding by
 sys.stdout = codecs.EncodedFile(sys.stdout, 'utf-8')
 
 but unfortunately this does not change the encoding of any Terminal. 
 So my question is: how can I set a different encoding to sys.stdout 
 (or why can I set it without any error but nothing changes?)

I solved it (finally after two days with the help of a lot of people): 

You have to set this...

sys.stdout  = codecs.EncodedFile(sys.stdout, 'iso-8859-15')
sys.stdout.encoding = 'iso-8859-15'

...both of these and exactly in this order and not vice versa. It 
doesn't have to be 'iso-8859-15', Windows-1252 is fine, too (but UTF-8 
doesn't work). Now we have a new problem: the native Windows consoles 
don't print the right characters. So you wrap this in a query:

if sys.platform in ['cygwin', 'linux2']:
    sys.stdout = codecs.EncodedFile(sys.stdout, 'iso-8859-15')
    sys.stdout.encoding = 'iso-8859-15'

This would be a problem if there's more than one translation (for 
instance one with polish characters that aren't contained in iso-8859-
15). One could work around this with

if sys.platform in ['cygwin', 'linux2']:
    sys.stdout = codecs.EncodedFile(sys.stdout,
                                    locale.getpreferredencoding())
    sys.stdout.encoding = locale.getpreferredencoding()

Funny (more or less): two days' work to print U and A with two dots 
above them.

Thanks to all who have helped to clear my confusion and to bring me 
into the right direction for the solution.
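
For readers on a newer Python, the same idea (wrap the output stream so that text is encoded with the locale's preferred encoding) might look roughly like the sketch below. It swaps in codecs.getwriter instead of the codecs.EncodedFile call used above, uses an in-memory buffer as a stand-in for sys.stdout, and the helper name wrap_stream is invented for this example.

```python
# -*- coding: utf-8 -*-
import codecs
import io
import locale


def wrap_stream(stream, encoding=None, errors="replace"):
    """Return a writer that encodes text before writing it to a byte stream.

    Hypothetical helper: falls back to the locale's preferred encoding
    when no encoding is given.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding()
    return codecs.getwriter(encoding)(stream, errors)


# An in-memory byte buffer stands in for the terminal here.
buf = io.BytesIO()
writer = wrap_stream(buf, "iso-8859-15")
writer.write(u"UNTERSTÜTZUNG")
print(repr(buf.getvalue()))
```

Passing errors="replace" keeps the program from crashing when the terminal encoding cannot represent a character; the character is downgraded to a question mark instead.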


Thorsten


Re: I18n issue with optik

2007-04-01 Thread Thorsten Kampe
* Steven Bethard (Sat, 31 Mar 2007 20:08:45 -0600)
 Thorsten Kampe wrote:
  I've written a script which uses Optik/Optparse to display the 
  options (which works fine). The text for the help message is localised 
  (with german umlauts) and when I execute the script with the localised 
  environment variable set, I get this traceback[1]. The interesting 
  thing is that the localised optparse messages display fine - 
  it's only my localisation that errors.
  
  From my understanding, my script doesn't put out anything, it's 
  optik/optparse that does that. My po file is directly copied from the 
  optik po file (which displays fine) and modified so the po file should 
  be fine, too.
  
  What can I do to troubleshoot whether the culprit is my script, optik 
  or gettext?
  
  Would it make sense to post the script and the mo or po files?
 
 Yes, probably.  Though if you can reduce it to the simplest test case 
 that produces the error, it'll increase your chances of having someone 
 look at it.

The most simple test.py is:

###
#! /usr/bin/env python

import gettext, \
   os,  \
   sys

gettext.textdomain('optparse')
gettext.install('test')

from optparse import OptionParser, \
 OptionGroup

cmdlineparser = OptionParser(description = _('THIS SOFTWARE COMES '
    'WITHOUT WARRANTY, LIABILITY OR SUPPORT!'))

options, args = cmdlineparser.parse_args()
###

When I run LANGUAGE=de ./test.py --help I get the error.

### This is the test.de.po file
# Copyright (C) 2006 Thorsten Kampe
# Thorsten Kampe [EMAIL PROTECTED], 2006

msgid ""
msgstr ""
"Project-Id-Version: Template 1.0\n"
"POT-Creation-Date: Tue Sep  7 22:20:34 2004\n"
"PO-Revision-Date: 2005-07-03 16:47+0200\n"
"Last-Translator: Thorsten Kampe [EMAIL PROTECTED]\n"
"Language-Team: Thorsten Kampe [EMAIL PROTECTED]\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=ISO-8859-15\n"
"Content-Transfer-Encoding: 8-bit\n"
"Generated-By: pygettext.py 1.5\n"

msgid "THIS SOFTWARE COMES WITHOUT WARRANTY, LIABILITY OR SUPPORT!"
msgstr "DIESES PROGRAMM HAT WEDER GEWÄHRLEISTUNG, HAFTUNG NOCH UNTERSTÜTZUNG!"
###

The localisation now produces an error in the localised optik files, 
too.

Under Windows I get
  File "G:\program files\python\lib\encodings\cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)

Is there something I have to do to put the terminal in non-ascii 
output mode?

I tried

###
#! /usr/bin/env python
# -*- coding: ISO-8859-15 -*-

print 'DIESES PROGRAMM HAT WEDER GEWÄHRLEISTUNG, HAFTUNG NOCH UNTERSTÜTZUNG!'
###

...and this worked. That means that my terminal is willing to print, 
right?! 
 
 You could also try posting to the optik list:
  http://lists.sourceforge.net/lists/listinfo/optik-users

I already did this via Gmane (although the list seems pretty dead to 
me). Sourceforge seems to have a bigger problem as [1] and [2] error.

Sorry for the confusion but this Unicode magic is far from being 
rational. I guess most people just don't get it...


Thorsten
[1] http://sourceforge.net/mailarchive/forum.php?forum=optik-users
[2] https://lists.sourceforge.net/lists/listinfo


Re: I18n issue with optik

2007-04-01 Thread Thorsten Kampe
Just an addition : when I insert this statement...

print _('THIS SOFTWARE COMES WITHOUT WARRANTY, LIABILITY OR SUPPORT!')

into this script, the line is printed out. So if my script can output 
the localised text but Optparse can't, it should be an optparse bug, 
right?!

Thorsten


Re: I18n issue with optik

2007-04-01 Thread Thorsten Kampe
I guess the culprit is this snippet from optparse.py:

# used by test suite
def _get_encoding(self, file):
    encoding = getattr(file, "encoding", None)
    if not encoding:
        encoding = sys.getdefaultencoding()
    return encoding

def print_help(self, file=None):
    """print_help(file : file = stdout)

    Print an extended help message, listing all options and any
    help text provided with them, to 'file' (default stdout).
    """
    if file is None:
        file = sys.stdout
    encoding = self._get_encoding(file)
    file.write(self.format_help().encode(encoding, "replace"))

So this means: when the encoding of sys.stdout is US-ASCII, Optparse 
sets the encoding of the help text to ASCII, too. But that's 
nonsense because the encoding is declared in the PO (localisation) 
file.

How can I set the encoding of sys.stdout to another encoding? Of 
course this would be a terrible hack if the encoding of the 
localisation changes or different translators use different 
encodings...

Thorsten


Re: I18n issue with optik

2007-04-01 Thread Steven Bethard
Thorsten Kampe wrote:
 * Steven Bethard (Sat, 31 Mar 2007 20:08:45 -0600)
 Thorsten Kampe wrote:
 I've written a script which uses Optik/Optparse to display the 
 options (which works fine). The text for the help message is localised 
 (with german umlauts) and when I execute the script with the localised 
 environment variable set, I get this traceback[1]. The interesting 
 thing is that the localised optparse messages from displays fine - 
 it's only my localisation that errors.

 From my understanding, my script doesn't put out anything, it's 
 optik/optparse who does that. My po file is directly copied from the 
 optik po file (who displays fine) and modified so the po file should 
 be fine, too.

 What can I do to troubleshoot whether the culprit is my script, optik 
 or gettext?

 Would it make sense to post the script and the mo or po files?
 Yes, probably.  Though if you can reduce it to the simplest test case 
 that produces the error, it'll increase your chances of having someone 
 look at it.
 
 The most simple test.py is:
 
 ###
 #! /usr/bin/env python
 
 import gettext, \
os,  \
sys
 
 gettext.textdomain('optparse')
 gettext.install('test')
 
 from optparse import OptionParser, \
  OptionGroup
 
 cmdlineparser = OptionParser(description = _('THIS SOFTWARE COMES 
 WITHOUT WARRANTY, LIABILITY OR SUPPORT!'))
 
 options, args = cmdlineparser.parse_args()
 ###
 
 When I run LANGUAGE=de ./test.py --help I get the error.
 
 ### This is the test.de.po file
 # Copyright (C) 2006 Thorsten Kampe
 # Thorsten Kampe [EMAIL PROTECTED], 2006
 
 msgid  
 msgstr 
 
 Project-Id-Version: Template 1.0\n
 POT-Creation-Date: Tue Sep  7 22:20:34 2004\n
 PO-Revision-Date: 2005-07-03 16:47+0200\n
 Last-Translator: Thorsten Kampe [EMAIL PROTECTED]\n
 Language-Team: Thorsten Kampe [EMAIL PROTECTED]\n
 MIME-Version: 1.0\n
 Content-Type: text/plain; charset=ISO-8859-15\n
 Content-Transfer-Encoding: 8-bit\n
 Generated-By: pygettext.py 1.5\n
 
 msgid  THIS SOFTWARE COMES WITHOUT WARRANTY, LIABILITY OR SUPPORT!
 msgstr DIESES PROGRAMM HAT WEDER GEWÄHRLEISTUNG, HAFTUNG NOCH 
 UNTERSTÜTZUNG!
 ###
 
 The localisation now produces an error in the localised optik files, 
 too.
 
 Under Windows I get  File G:\program files\python\lib\encodings
 \cp1252.py, line 12, in encode
return codecs.charmap_encode(input,errors,encoding_table)

I'm not very experienced with internationalization, but if you change::

 gettext.install('test')

to::

 gettext.install('test', unicode=True)

what happens?

STeVe


Re: I18n issue with optik

2007-04-01 Thread Steven Bethard
Thorsten Kampe wrote:
 I guess the culprit is this snippet from optparse.py:
 
 # used by test suite
 def _get_encoding(self, file):
     encoding = getattr(file, "encoding", None)
     if not encoding:
         encoding = sys.getdefaultencoding()
     return encoding
 
 def print_help(self, file=None):
     """print_help(file : file = stdout)
 
     Print an extended help message, listing all options and any
     help text provided with them, to 'file' (default stdout).
     """
     if file is None:
         file = sys.stdout
     encoding = self._get_encoding(file)
     file.write(self.format_help().encode(encoding, "replace"))
 
 So this means: when the encoding of sys.stdout is US-ASCII, Optparse 
 sets the encoding of the help text to ASCII, too. But that's 
 nonsense because the encoding is declared in the PO (localisation) 
 file.
 
 How can I set the encoding of sys.stdout to another encoding? Of 
 course this would be a terrible hack if the encoding of the 
 localisation changes or different translators use different 
 encodings...

If print_help() is what's wrong, you should probably hack print_help() 
instead of sys.stdout.  You could try something like::

 def print_help(self, file=None):
     """print_help(file : file = stdout)

     Print an extended help message, listing all options and any
     help text provided with them, to 'file' (default stdout).
     """
     if file is None:
         file = sys.stdout
     file.write(self.format_help())

 optparse.OptionParser.print_help = print_help

 cmdlineparser = optparse.OptionParser(description=...)
 ...

That is, you could monkey-patch print_help() before you create an 
OptionParser.
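
A self-contained version of that monkey-patch, as a sketch only: it is written for Python 3 (an assumption; there the stock print_help already writes the text without re-encoding it, so the patch merely demonstrates the mechanism), and it captures the help text in an io.StringIO buffer instead of printing it to a terminal. The prog name "test.py" is just a placeholder.

```python
# -*- coding: utf-8 -*-
import io
import optparse
import sys


def print_help(self, file=None):
    """Write the formatted help text to 'file' without re-encoding it."""
    if file is None:
        file = sys.stdout
    file.write(self.format_help())


# Replace the method on the class *before* creating any parser.
optparse.OptionParser.print_help = print_help

parser = optparse.OptionParser(
    prog="test.py",
    description=u"DIESES PROGRAMM HAT WEDER GEWÄHRLEISTUNG, "
                u"HAFTUNG NOCH UNTERSTÜTZUNG!",
)

# Capture the help output in memory to inspect it.
out = io.StringIO()
parser.print_help(out)
print(out.getvalue())
```

Because the patch is applied to the class, every OptionParser created afterwards in the same process is affected, which is worth keeping in mind if other libraries use optparse too.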

STeVe


Re: I18n issue with optik

2007-04-01 Thread Thorsten Kampe
* Steven Bethard (Sun, 01 Apr 2007 10:21:40 -0600)
 Thorsten Kampe wrote:
  * Steven Bethard (Sat, 31 Mar 2007 20:08:45 -0600)
  Thorsten Kampe wrote:
  I've written a script which uses Optik/Optparse to display the 
  options (which works fine). The text for the help message is localised 
  (with german umlauts) and when I execute the script with the localised 
  environment variable set, I get this traceback[1]. The interesting 
  thing is that the localised optparse messages from displays fine - 
  it's only my localisation that errors.
 
  From my understanding, my script doesn't put out anything, it's 
  optik/optparse who does that. My po file is directly copied from the 
  optik po file (who displays fine) and modified so the po file should 
  be fine, too.
 
  What can I do to troubleshoot whether the culprit is my script, optik 
  or gettext?
 
  Would it make sense to post the script and the mo or po files?
  Yes, probably.  Though if you can reduce it to the simplest test case 
  that produces the error, it'll increase your chances of having someone 
  look at it.
  
  The most simple test.py is:
  
  ###
  #! /usr/bin/env python
  
  import gettext, \
 os,  \
 sys
  
  gettext.textdomain('optparse')
  gettext.install('test')
  
  from optparse import OptionParser, \
   OptionGroup
  
  cmdlineparser = OptionParser(description = _('THIS SOFTWARE COMES 
  WITHOUT WARRANTY, LIABILITY OR SUPPORT!'))
  
  options, args = cmdlineparser.parse_args()
  ###
  
  When I run LANGUAGE=de ./test.py --help I get the error.
  
  ### This is the test.de.po file
  # Copyright (C) 2006 Thorsten Kampe
  # Thorsten Kampe [EMAIL PROTECTED], 2006
  
  msgid  
  msgstr 
  
  Project-Id-Version: Template 1.0\n
  POT-Creation-Date: Tue Sep  7 22:20:34 2004\n
  PO-Revision-Date: 2005-07-03 16:47+0200\n
  Last-Translator: Thorsten Kampe [EMAIL PROTECTED]\n
  Language-Team: Thorsten Kampe [EMAIL PROTECTED]\n
  MIME-Version: 1.0\n
  Content-Type: text/plain; charset=ISO-8859-15\n
  Content-Transfer-Encoding: 8-bit\n
  Generated-By: pygettext.py 1.5\n
  
  msgid  THIS SOFTWARE COMES WITHOUT WARRANTY, LIABILITY OR SUPPORT!
  msgstr DIESES PROGRAMM HAT WEDER GEWÄHRLEISTUNG, HAFTUNG NOCH 
  UNTERSTÜTZUNG!
  ###
  
  The localisation now produces an error in the localised optik files, 
  too.
  
  Under Windows I get  File G:\program files\python\lib\encodings
  \cp1252.py, line 12, in encode
 return codecs.charmap_encode(input,errors,encoding_table)
 
 I'm not very experienced with internationalization, but if you change::
 
  gettext.install('test')
 
 to::
 
  gettext.install('test', unicode=True)
 
 what happens?

No traceback anymore from optparse but the non-ascii umlauts are 
displayed as question marks (?).

Thorsten


Re: I18n issue with optik

2007-04-01 Thread Thorsten Kampe
* Steven Bethard (Sun, 01 Apr 2007 10:26:54 -0600)
 Thorsten Kampe wrote:
  I guess the culprit is this snippet from optparse.py:
  
  # used by test suite
  def _get_encoding(self, file):
      encoding = getattr(file, "encoding", None)
      if not encoding:
          encoding = sys.getdefaultencoding()
      return encoding
  
  def print_help(self, file=None):
      """print_help(file : file = stdout)
  
      Print an extended help message, listing all options and any
      help text provided with them, to 'file' (default stdout).
      """
      if file is None:
          file = sys.stdout
      encoding = self._get_encoding(file)
      file.write(self.format_help().encode(encoding, "replace"))
  
  So this means: when the encoding of sys.stdout is US-ASCII, Optparse 
  sets the encoding of the help text to ASCII, too. But that's 
  nonsense because the encoding is declared in the PO (localisation) 
  file.
  
  How can I set the encoding of sys.stdout to another encoding? Of 
  course this would be a terrible hack if the encoding of the 
  localisation changes or different translators use different 
  encodings...
 
 If print_help() is what's wrong, you should probably hack print_help() 
 instead of sys.stdout.  You could try something like::
 
  def print_help(self, file=None):
      """print_help(file : file = stdout)
 
      Print an extended help message, listing all options and any
      help text provided with them, to 'file' (default stdout).
      """
      if file is None:
          file = sys.stdout
      file.write(self.format_help())
 
  optparse.OptionParser.print_help = print_help
 
  cmdlineparser = optparse.OptionParser(description=...)
  ...
 
 That is, you could monkey-patch print_help() before you create an 
 OptionParser.

Yes, I could do that but I'd rather know first if my code is wrong or 
the optparse code.

Thorsten


Re: I18n issue with optik

2007-04-01 Thread Thorsten Kampe
* Thorsten Kampe (Sun, 1 Apr 2007 19:45:59 +0100)
 Yes, I could do that but I'd rather know first if my code is wrong or 
 the optparse code.

It might be the bug mentioned in 
http://mail.python.org/pipermail/python-dev/2006-May/065458.html

The patch doesn't work, though. From my
unicode-charset-codepage-codeset-challenged point of view, the encoding
of sys.stdout doesn't matter. The charset is defined in the .po/.mo
file (but of course optparse can't know whether the message has been
translated by gettext's _()).

Thorsten


Re: I18n issue with optik

2007-04-01 Thread Thorsten Kampe
* Thorsten Kampe (Sun, 1 Apr 2007 20:08:39 +0100)
 * Thorsten Kampe (Sun, 1 Apr 2007 19:45:59 +0100)
  Yes, I could do that but I'd rather know first if my code is wrong or 
  the optparse code.
 
 It might be the bug mentioned in 
 http://mail.python.org/pipermail/python-dev/2006-May/065458.html
 
 The patch although doesn't work. From my unicode-charset-codepage-
 codeset-challenged point of view the encoding of sys.stdout doesn't 
 matter. The charset is defined in the .po/.mo file (but of course 
 optparse can't know if the message has been translated by gettext 
 (_).

If I patch line 1648 (the one mentioned in the traceback) of 
optparse.py from

file.write(self.format_help().encode(encoding, "replace"))
to
file.write(self.format_help())

...then everything works and is displayed fine (even without the 
unicode = True parameter to gettext.install).

But the patch might make other things fail, of course...

Thorsten


Re: I18n issue with optik

2007-04-01 Thread Thorsten Kampe
* Thorsten Kampe (Sun, 1 Apr 2007 20:22:51 +0100)
 * Thorsten Kampe (Sun, 1 Apr 2007 20:08:39 +0100)
  * Thorsten Kampe (Sun, 1 Apr 2007 19:45:59 +0100)
   Yes, I could do that but I'd rather know first if my code is wrong or 
   the optparse code.
  
  It might be the bug mentioned in 
  http://mail.python.org/pipermail/python-dev/2006-May/065458.html
  
  The patch although doesn't work. From my unicode-charset-codepage-
  codeset-challenged point of view the encoding of sys.stdout doesn't 
  matter. The charset is defined in the .po/.mo file (but of course 
  optparse can't know if the message has been translated by gettext 
  (_).
 
 If I patch line 1648 (the one mentioned in the traceback) of 
 optparse.py from
 
 file.write(self.format_help().encode(encoding, replace))
 to
 file.write(self.format_help())
 
 ...then everything works and is displayed fine [...]

...but only in Cygwin rxvt, the standard Windows console doesn't show 
the right colors.
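
One possible explanation for the difference between terminals (an assumption, not something confirmed in this thread): with the patched print_help the raw bytes from the ISO-8859-15 catalog reach the terminal unchanged, so they only look right where the terminal interprets bytes the same way. The Python 3 sketch below compares how the same two umlauts are stored in several encodings, including the CP852 code page mentioned earlier for the Windows console.

```python
# -*- coding: utf-8 -*-
TEXT = u"ÄÜ"

# The same characters map to different byte values in different encodings.
for name in ("iso-8859-15", "cp1252", "cp852", "utf-8"):
    print(name, repr(TEXT.encode(name)))

# Bytes produced for an ISO-8859-15 terminal, misread as CP852:
raw = TEXT.encode("iso-8859-15")
misread = raw.decode("cp852", "replace")
print(repr(misread))
```

ISO-8859-15 and Windows-1252 agree on these two characters, which fits the observation elsewhere in the thread that either of them works, while a CP852 console shows something else for the same bytes.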

I give up and revert to ASCII. This whole charset mess is not 
meant to be solved by mere mortals.

Thorsten



Re: I18n issue with optik

2007-04-01 Thread Jarek Zgoda
Thorsten Kampe wrote:

 Under Windows I get  File G:\program files\python\lib\encodings
 \cp1252.py, line 12, in encode
return codecs.charmap_encode(input,errors,encoding_table)
 I'm not very experienced with internationalization, but if you change::

  gettext.install('test')

 to::

  gettext.install('test', unicode=True)

 what happens?
 
 No traceback anymore from optparse but the non-ascii umlauts are 
 displayed as question marks (?).

And this is expected behaviour of encode() with errors set to 'replace'.
I think this is the solution to your problem. I was a bit surprised I
never saw this error, but I always use the unicode=True setting to
gettext.install()...

-- 
Jarek Zgoda
http://jpa.berlios.de/


Re: I18n issue with optik

2007-04-01 Thread Leo Kislov
On Apr 1, 8:47 am, Thorsten Kampe [EMAIL PROTECTED] wrote:
 I guess the culprit is this snippet from optparse.py:

 # used by test suite
 def _get_encoding(self, file):
     encoding = getattr(file, "encoding", None)
     if not encoding:
         encoding = sys.getdefaultencoding()
     return encoding

 def print_help(self, file=None):
     """print_help(file : file = stdout)

     Print an extended help message, listing all options and any
     help text provided with them, to 'file' (default stdout).
     """
     if file is None:
         file = sys.stdout
     encoding = self._get_encoding(file)
     file.write(self.format_help().encode(encoding, "replace"))

 So this means: when the encoding of sys.stdout is US-ASCII, Optparse
 sets the encoding of the help text to ASCII, too.

.encode() method doesn't set an encoding. It encodes unicode text into
bytes according to specified encoding. That means optparse needs ascii
or unicode (at least) for help text. In other words you'd better use
unicode throughout your program.
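
To illustrate why the traceback reports a *decode* error even though the code calls encode(): on Python 2, calling .encode() on a byte string first decodes it with the default codec (normally ASCII) and only then encodes the result. The sketch below mimics that two-step behaviour on Python 3; py2_style_encode is a made-up name used only for this demonstration.

```python
# -*- coding: utf-8 -*-
def py2_style_encode(byte_string, encoding, errors="strict"):
    """Mimic Python 2's str.encode(): decode as ASCII first, then encode.

    Hypothetical helper; the 'errors' argument only applies to the
    encode step, exactly as in the optparse call.
    """
    return byte_string.decode("ascii").encode(encoding, errors)


# Bytes as they would come out of an ISO-8859-15 catalog via plain gettext.
raw = u"GEWÄHRLEISTUNG".encode("iso-8859-15")

try:
    py2_style_encode(raw, "ascii", "replace")
except UnicodeDecodeError as exc:
    # The implicit ASCII decode fails before "replace" can do anything.
    print(exc)
```

Pure ASCII help text survives the implicit decode, which would explain why the problem only shows up once a translation contains umlauts.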

 But that's
 nonsense because the Encoding is declared in the Po (localisation)
 file.

For backward compatibility gettext is working with bytes by default,
so the PO file encoding is not even involved. You need to use unicode
gettext.
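
Roughly speaking, the unicode variant of gettext decodes each catalog entry using the charset declared in the PO header, while the byte variant hands the stored bytes back untouched. The Python 3 sketch below imitates the decoding step by hand; charset_from_header is a hypothetical helper and not part of the gettext module.

```python
# -*- coding: utf-8 -*-
import re


def charset_from_header(header):
    """Extract the charset named in a PO 'Content-Type' header line."""
    match = re.search(r"charset=([\w.-]+)", header)
    return match.group(1) if match else None


# Header line as it appears in the PO file from this thread.
HEADER = "Content-Type: text/plain; charset=ISO-8859-15\n"

# The translated word as raw bytes, the way it is stored in the catalog.
catalog_bytes = b"GEW\xc4HRLEISTUNG"

charset = charset_from_header(HEADER)
message = catalog_bytes.decode(charset)
print(charset, message)
```

Once the message is a unicode object, encoding it for the terminal is a separate, later step, and that is the step optparse performs in print_help.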

 How can I set the encoding of sys.stdout to another encoding?

What are you going to set it to? As I understand you're going to
distribute your program to some users. How are you going to find out
the encoding of the terminal of your users?
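
One way to answer that question at run time is to ask the stream and the locale rather than hard-coding a value. The sketch below (Python 3, with the invented helper name guess_output_encoding) follows the same pattern as optparse's _get_encoding but falls back to the locale's preferred encoding; that fallback choice is an assumption of this example.

```python
import locale
import sys


def guess_output_encoding(stream=None):
    """Return the stream's declared encoding, else the locale's preference."""
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, "encoding", None)
    if not encoding:
        # Streams such as pipes or plain objects may not declare an encoding.
        encoding = locale.getpreferredencoding()
    return encoding


print(guess_output_encoding())
```

The result is only as good as the environment's configuration: a terminal that can display umlauts but reports US-ASCII will still be treated as ASCII-only.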

  -- Leo



I18n issue with optik

2007-03-31 Thread Thorsten Kampe
Hi,

I've written a script which uses Optik/Optparse to display the 
options (which works fine). The text for the help message is localised 
(with German umlauts) and when I execute the script with the localised 
environment variable set, I get this traceback[1]. The interesting 
thing is that the localised optparse messages display fine - 
it's only my localisation that errors.

From my understanding, my script doesn't put out anything, it's 
optik/optparse that does that. My po file is directly copied from the 
optik po file (which displays fine) and modified so the po file should 
be fine, too.

What can I do to troubleshoot whether the culprit is my script, optik 
or gettext?

Would it make sense to post the script and the mo or po files?


Thorsten

[1]
Traceback (most recent call last):
  File "script.py", line 37, in <module>
    options, args = cmdlineparser.parse_args()
  File "/usr/lib/python2.5/optparse.py", line 1378, in parse_args
    stop = self._process_args(largs, rargs, values)
  File "/usr/lib/python2.5/optparse.py", line 1418, in _process_args
    self._process_long_opt(rargs, values)
  File "/usr/lib/python2.5/optparse.py", line 1493, in _process_long_opt
    option.process(opt, value, values, self)
  File "/usr/lib/python2.5/optparse.py", line 782, in process
    self.action, self.dest, opt, value, values, parser)
  File "/usr/lib/python2.5/optparse.py", line 804, in take_action
    parser.print_help()
  File "/usr/lib/python2.5/optparse.py", line 1648, in print_help
    file.write(self.format_help().encode(encoding, "replace"))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 264: ordinal not in range(128)


Re: I18n issue with optik

2007-03-31 Thread Steven Bethard
Thorsten Kampe wrote:
 I've written a script which uses Optik/Optparse to display the 
 options (which works fine). The text for the help message is localised 
 (with german umlauts) and when I execute the script with the localised 
 environment variable set, I get this traceback[1]. The interesting 
 thing is that the localised optparse messages from displays fine - 
 it's only my localisation that errors.
 
 From my understanding, my script doesn't put out anything, it's 
 optik/optparse who does that. My po file is directly copied from the 
 optik po file (who displays fine) and modified so the po file should 
 be fine, too.
 
 What can I do to troubleshoot whether the culprit is my script, optik 
 or gettext?
 
 Would it make sense to post the script and the mo or po files?

Yes, probably.  Though if you can reduce it to the simplest test case 
that produces the error, it'll increase your chances of having someone 
look at it.

You could also try posting to the optik list:
 http://lists.sourceforge.net/lists/listinfo/optik-users

STeVe