Re: Hello gettext
James T. Dennis <[EMAIL PROTECTED]> wrote:

... just to follow-up my own posting --- as gauche as that is:

> You'd think that using things like gettext would be easy. Superficially
> it seems well documented in the Library Reference(*). However, it can
> be surprisingly difficult to get the external details right.
>
>    * http://docs.python.org/lib/node738.html
>
> Here's what I finally came up with as the simplest instructions, suitable
> for an "overview of Python programming" class:
>
> Start with the venerable "Hello, World!" program ... slightly modified
> to make it ever-so-slightly more "functional:"
>
>    #!/usr/bin/env python
>    import sys
>
>    def hello(s="World"):
>        print "Hello,", s
>
>    if __name__ == "__main__":
>        args = sys.argv[1:]
>        if len(args):
>            for each in args:
>                hello(each)
>        else:
>            hello()
>
> ... and add gettext support (and a little os.path handling on the
> assumption that our message object files will not be readily
> installable into the system /usr/share/locale tree):
>
>    #!/usr/bin/env python
>    import sys, os, gettext
>
>    _ = gettext.lgettext
>    mydir = os.path.realpath(os.path.dirname(sys.argv[0]))
>    localedir = os.path.join(mydir, "locale")
>    gettext.bindtextdomain('HelloPython', localedir)
>    gettext.textdomain('HelloPython')
>
>    def hello(s=_("World")):
>        print _("Hello,"), s

Turns out this particular version is a Bad Idea(TM) if you ever
try to import this into another script and use it after changing
your os.environ['LANG'] value.

I mentioned in another message a while back that I have an aversion
to using defaulted arguments other than by setting them as "None"
and I hesitated this time and then thought: "Oh, it's fine in this
case!"
Here's my updated version of this script:

---
#!/usr/bin/env python
import gettext, os, sys

_ = gettext.lgettext
i18ndomain = 'HelloPython'
mydir = os.path.realpath(os.path.dirname(sys.argv[0]))
localedir = os.path.join(mydir, "locale")
gettext.install(i18ndomain, localedir=None, unicode=1)
gettext.bindtextdomain(i18ndomain, localedir)
gettext.textdomain(i18ndomain)

def hello(s=None):
    """Print "Hello, World" (or its equivalent in any supported language):

    Examples:
        >>> os.environ['LANG']=''
        >>> hello()
        Hello, World
        >>> os.environ['LANG']='es_ES'
        >>> hello()
        Hola, Mundo
        >>> os.environ['LANG']='fr_FR'
        >>> hello()
        Bonjour, Monde
    """
    if s is None:
        s = _("World")
    print _("Hello,"), s

def test():
    import doctest
    doctest.testmod()

if __name__ == "__main__":
    args = sys.argv[1:]
    if 'PYDOCTEST' in os.environ and os.environ['PYDOCTEST']:
        test()
    elif len(args):
        for each in args:
            hello(each)
    else:
        hello()
---

... now with doctest support. :)

>    if __name__ == "__main__":
>        args = sys.argv[1:]
>        if len(args):
>            for each in args:
>                hello(each)
>        else:
>            hello()
>
> Note that I've only added five lines, the two modules to my import
> line, and wrapped two strings with the conventional _() function.
> This part is easy, and well-documented.
>
> Running pygettext or GNU xgettext (-L or --language=Python) is
> also easy and gives us a file like:
>
>    # SOME DESCRIPTIVE TITLE.
>    # Copyright (C) YEAR ORGANIZATION
>    # FIRST AUTHOR <[EMAIL PROTECTED]>, YEAR.
>    #
>    msgid ""
>    msgstr ""
>    "Project-Id-Version: PACKAGE VERSION\n"
>    "POT-Creation-Date: 2007-05-14 12:19+PDT\n"
>    "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
>    "Last-Translator: FULL NAME <[EMAIL PROTECTED]>\n"
>    "Language-Team: LANGUAGE <[EMAIL PROTECTED]>\n"
>    "MIME-Version: 1.0\n"
>    "Content-Type: text/plain; charset=CHARSET\n"
>    "Content-Transfer-Encoding: ENCODING\n"
>    "Generated-By: pygettext.py 1.5\n"
>
>    #: HelloWorld.py:10
>    msgid "World"
>    msgstr ""
>
>    #: HelloWorld.py:11
>    msgid "Hello,"
>    msgstr ""
>
> ...
> I suppose I should add the appropriate magic package name,
> version, author and other values to my source. Anyone remember
> where those are documented? Does pygettext extract them from the
> sources and insert them into the .pot?
>
> Anyway, I minimally have to change one line thus:
>
>    "Content-Type: text/plain; charset=utf-8\n"
>
> ... and I suppose there are other ways to do this more properly.
> (Documented where?)
>
> I did find that I could either change that in the .pot file or
> in the individual .po files. However, if I failed to chan
Hello gettext
You'd think that using things like gettext would be easy. Superficially
it seems well documented in the Library Reference(*). However, it can
be surprisingly difficult to get the external details right.

    * http://docs.python.org/lib/node738.html

Here's what I finally came up with as the simplest instructions, suitable
for an "overview of Python programming" class:

Start with the venerable "Hello, World!" program ... slightly modified
to make it ever-so-slightly more "functional:"

    #!/usr/bin/env python
    import sys

    def hello(s="World"):
        print "Hello,", s

    if __name__ == "__main__":
        args = sys.argv[1:]
        if len(args):
            for each in args:
                hello(each)
        else:
            hello()

... and add gettext support (and a little os.path handling on the
assumption that our message object files will not be readily
installable into the system /usr/share/locale tree):

    #!/usr/bin/env python
    import sys, os, gettext

    _ = gettext.lgettext
    mydir = os.path.realpath(os.path.dirname(sys.argv[0]))
    localedir = os.path.join(mydir, "locale")
    gettext.bindtextdomain('HelloPython', localedir)
    gettext.textdomain('HelloPython')

    def hello(s=_("World")):
        print _("Hello,"), s

    if __name__ == "__main__":
        args = sys.argv[1:]
        if len(args):
            for each in args:
                hello(each)
        else:
            hello()

Note that I've only added five lines, the two modules to my import
line, and wrapped two strings with the conventional _() function.
This part is easy, and well-documented.

Running pygettext or GNU xgettext (-L or --language=Python) is
also easy and gives us a file like:

    # SOME DESCRIPTIVE TITLE.
    # Copyright (C) YEAR ORGANIZATION
    # FIRST AUTHOR <[EMAIL PROTECTED]>, YEAR.
    #
    msgid ""
    msgstr ""
    "Project-Id-Version: PACKAGE VERSION\n"
    "POT-Creation-Date: 2007-05-14 12:19+PDT\n"
    "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
    "Last-Translator: FULL NAME <[EMAIL PROTECTED]>\n"
    "Language-Team: LANGUAGE <[EMAIL PROTECTED]>\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=CHARSET\n"
    "Content-Transfer-Encoding: ENCODING\n"
    "Generated-By: pygettext.py 1.5\n"

    #: HelloWorld.py:10
    msgid "World"
    msgstr ""

    #: HelloWorld.py:11
    msgid "Hello,"
    msgstr ""

... I suppose I should add the appropriate magic package name,
version, author and other values to my source. Anyone remember
where those are documented? Does pygettext extract them from the
sources and insert them into the .pot?

Anyway, I minimally have to change one line thus:

    "Content-Type: text/plain; charset=utf-8\n"

... and I suppose there are other ways to do this more properly.
(Documented where?)

I did find that I could either change that in the .pot file or
in the individual .po files. However, if I failed to change it
then my translations would NOT work and would throw an exception.

(Where is the setting to force the _() function to fail gracefully
--- falling back to no-translation and NEVER raising exceptions?
I seem to recall there is one somewhere --- but I just spent all
evening reading the docs and various Google hits to get this far; so
please excuse me if it's a blur right now).

Now we just copy these templates to individual .po files and
make our LC_MESSAGES directories:

    mkdir locale && mv HelloPython.pot locale
    cd locale

    for i in es_ES fr_FR # ...
        do
        cp HelloPython.pot HelloPython_$i.po
        mkdir -p $i/LC_MESSAGES
        done

... and finally we can work on the translations.

We edit each of the _*.po files inserting "Hola" and "Bonjour" and
"Mundo" and "Monde" in the appropriate places. And then process
these into .mo files and move them into place as follows:

    for i in *_*.po; do
        i=${i#*_}
        msgfmt -o ./${i%.po}/LC_MESSAGES/HelloPython.mo HelloPython_$i
        done

...
in other words HelloPython_es_ES.po is written to
./es_ES/LC_MESSAGES/HelloPython.mo, etc.

This last part was the hardest to get right.

To test this we simply run:

    $HELLO_PATH/HelloPython.py
    Hello, World

    export LANG=es_ES
    $HELLO_PATH/HelloPython.py
    Hola, Mundo

    export LANG=fr_FR
    $HELLO_PATH/HelloPython.py
    Bonjour, Monde

    export LANG=zh_ZH
    $HELLO_PATH/HelloPython.py
    Hello, World

... and we find that our Spanish and French translations work. (With
apologies if my translations are technically wrong).

Of course I realize this only barely scratches the surface of I18n and
L10n issues. Also I don't know, offhand, how