#9212: German Umlauts and possible other foreign languages special characters
-------------------------------------------+--------------------------------
          Reporter:  nekron                |          Owner:  nobody
            Status:  new                   |      Milestone:  post-1.0
         Component:  Internationalization  |        Version:  SVN
        Resolution:                        |       Keywords:  Umlauts
             Stage:  Accepted              |      Has_patch:  0
        Needs_docs:  0                     |    Needs_tests:  0
Needs_better_patch:  0                     |
-------------------------------------------+--------------------------------
Changes (by mtredinnick):

  * cc: mtredinnick (added)
  * stage:  Unreviewed => Accepted

Comment:

 Oh, I've played this game before. :-(

 gettext seemed to make a new release every couple of months in the early
 years of this decade and working out which version added new features was
 ''hard''. I'm inclined to be nice to older versions -- in the sense of
 working around their limitations -- because it isn't always easy to
 upgrade and I don't want to put a big learning curve, in the form of
 compiling the toolchain, in the way of people trying to localise their
 applications.

 If somebody can work out how to get the version number out of xgettext so
 that we can do it in the Python code, I'm very happy to put something into
 `makemessages.py` that looks like
 {{{
 #!python
 if version < (0, 15, 0):
     output_text = output_text.decode('iso-8859-1').encode('utf-8')
 }}}

 I have a feeling that using a reg-exp on the first line of output from
 `xgettext --version` is going to be enough here. I seem to recollect (from
 working on GNOME's intltool) that that was enough to work out the version
 pretty reliably. We'll only need the major and minor number.

 We have a requirement that the source text '''must''' be in UTF-8 for
 strings being sent to gettext, and so we don't support things like
 `coding: iso-8859-1`. Gettext and other tools aren't smart enough to
 understand that sort of stuff (they just do a lexical scan of the file,
 they don't understand too much about Python), so I'm comfortable with
 making that a requirement for people writing the source. But that means we
 have to actually support UTF-8. So if older gettext versions are treating
 things as iso-8859-1 by default, I can live with using programmatic
 hammers to force them back to UTF-8. No bytes get lost in the conversion
 process; it's just that the intermediate text doesn't make any sense (as
 noted in the original bug report).
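
 A minimal sketch of the version check suggested above, in case it helps
 whoever picks this up. It is not the eventual `makemessages.py` code: the
 helper names (`parse_xgettext_version`, `get_xgettext_version`,
 `force_utf8`) are made up for illustration, the assumed shape of the first
 `--version` line (something like `xgettext (GNU gettext-tools) 0.17`) is
 from memory, and the 0.15 cut-off is simply the number used in the snippet
 in the comment, not a verified threshold.

```python
import re
import subprocess

# First "digits.digits" pair on the line; only major and minor are needed.
VERSION_RE = re.compile(r"(\d+)\.(\d+)")


def parse_xgettext_version(first_line):
    """Return (major, minor) parsed from a version line, or None."""
    match = VERSION_RE.search(first_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def get_xgettext_version():
    """Run `xgettext --version` and parse its first output line.

    Returns None if xgettext is missing, fails, or prints nothing usable.
    """
    try:
        result = subprocess.run(
            ["xgettext", "--version"], capture_output=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = result.stdout.decode("ascii", "replace").splitlines()
    return parse_xgettext_version(lines[0]) if lines else None


def force_utf8(output_bytes, version):
    """Mirror the conversion from the comment for old xgettext versions.

    The (0, 15) threshold is an assumption taken from the ticket comment.
    Unknown versions (None) are left untouched.
    """
    if version is not None and version < (0, 15):
        return output_bytes.decode("iso-8859-1").encode("utf-8")
    return output_bytes
```

 Comparing tuples keeps the check readable and avoids string comparison
 pitfalls (e.g. "0.9" sorting after "0.15"). Leaving the output alone when
 the version can't be determined is a design choice in this sketch, not
 something decided in the ticket.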

-- 
Ticket URL: <http://code.djangoproject.com/ticket/9212#comment:5>
Django <http://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.