#9212: German Umlauts and possible other foreign languages special characters
-------------------------------------------+--------------------------------
          Reporter:  nekron                |         Owner:  nobody  
            Status:  new                   |     Milestone:  post-1.0
         Component:  Internationalization  |       Version:  SVN     
        Resolution:                        |      Keywords:  Umlauts 
             Stage:  Accepted              |     Has_patch:  0       
        Needs_docs:  0                     |   Needs_tests:  0       
Needs_better_patch:  0                     |  
-------------------------------------------+--------------------------------
Changes (by mtredinnick):

 * cc: mtredinnick (added)
 * stage:  Unreviewed => Accepted

Comment:

 Oh, I've played this game before. :-( gettext seemed to make a new release
 every couple of months in the early years of this decade and working out
 which version added new features was ''hard''.

 I'm inclined to be nice to older versions -- in the sense of working
 around their limitations -- because it isn't always easy to upgrade and I
 don't want to put a big learning curve in the form of compiling the
 toolchain in the way of people trying to localise their applications. If
 somebody can work out how to get the version number out of xgettext so
 that we can do it in the Python code, I'm very happy to put something into
 `makemessages.py` that looks like

 {{{
 #!python
 if version < (0, 15, 0):
     output_text = output_text.decode('iso-8859-1').encode('utf-8')
 }}}

 I have a feeling that using a reg-exp on the first line of output from
 `xgettext --version` is going to be enough here. I seem to recollect (from
 working on GNOME's intltool) that that was enough to work out the version
 pretty reliably. We'll only need the major and minor number.
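
 As a rough sketch of that idea (not what `makemessages.py` actually does):
 the helper names below are hypothetical, and the banner format
 `xgettext (GNU gettext-tools) 0.17` is an assumption about what the first
 line of `xgettext --version` looks like. The regular expression only looks
 for the first dotted number, so it should tolerate small banner changes.

```python
import re
import subprocess

# Hypothetical helpers for illustration; not part of Django.
# Matches the first dotted number such as "0.17" or "0.14.6".
_VERSION_RE = re.compile(r"(\d+)\.(\d+)(?:\.(\d+))?")


def parse_gettext_version(first_line):
    """Return (major, minor, micro) from a version banner line, or None."""
    match = _VERSION_RE.search(first_line)
    if match is None:
        return None
    major, minor, micro = match.groups()
    # A missing third component is treated as 0 so tuples compare cleanly.
    return (int(major), int(minor), int(micro or 0))


def xgettext_version(command="xgettext"):
    """Run `<command> --version` and parse the first line of its output."""
    result = subprocess.run(
        [command, "--version"], capture_output=True, text=True, check=True
    )
    lines = result.stdout.splitlines()
    return parse_gettext_version(lines[0]) if lines else None
```

 The resulting tuple can be compared directly, as in
 `xgettext_version() < (0, 15, 0)`. Callers would still need to handle a
 `None` result and a missing `xgettext` binary.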

 We have a requirement that the source text '''must''' be in UTF-8 for
 strings being sent to gettext and so we don't support things like
 `coding: iso-8859-1`. Gettext and other tools aren't smart enough to
 understand that sort of stuff (they just do a lexical scan of the file,
 they don't understand too much about Python), so I'm comfortable with
 making that a requirement for people writing the source. But that means we
 have to actually support UTF-8. So if older gettext versions are treating
 things as iso-8859-1 by default, I can live with using programmatic
 hammers to force them back to UTF-8. No bytes get lost in the conversion
 process; it's just that the intermediate text doesn't make any sense (as
 noted in the original bug report).
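
 To illustrate why nothing is lost: iso-8859-1 assigns a character to every
 one of the 256 byte values, so reading UTF-8 bytes as iso-8859-1 yields
 nonsense text that can still be turned back into the original bytes. The
 function names below are made up for this sketch, and it only demonstrates
 the round trip; it does not claim to show which direction the conversion in
 `makemessages.py` would need to go.

```python
def misread_as_latin1(utf8_bytes):
    """Decode UTF-8 bytes with the wrong codec, as an older tool might."""
    return utf8_bytes.decode("iso-8859-1")


def undo_misread(garbled_text):
    """Recover the original bytes from the garbled intermediate text."""
    return garbled_text.encode("iso-8859-1")


original = "ü".encode("utf-8")          # two bytes: 0xC3 0xBC
garbled = misread_as_latin1(original)   # two characters: "Ã¼"
recovered = undo_misread(garbled)

# The intermediate text is wrong, but the bytes survive the round trip.
assert recovered == original
assert recovered.decode("utf-8") == "ü"

# The same holds for every possible byte value, not just this example.
every_byte = bytes(range(256))
assert undo_misread(misread_as_latin1(every_byte)) == every_byte
```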

-- 
Ticket URL: <http://code.djangoproject.com/ticket/9212#comment:5>
Django <http://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.