On 04.04.2011, at 23:15, Ned Batchelder wrote:

> Last week I re-encountered the problems with using makemessages on Javascript 
> files, and lost a couple of half-days to trying to figure out why some of my 
> translatable messages weren't being found and deposited into my .po files.  
> After fully understanding the extent of Django's current hack, I decided to 
> take a stab at providing a better solution.
> 
> Background: today, Javascript source files are parsed for messages by running 
> a "pythonize" regex over them, and giving the resulting text to xgettext, 
> claiming it is Perl.  The pythonize regex simply changes any //-style comment 
> on its own line into a #-style comment.  This strange accommodation leaves a 
> great deal of valid Javascript syntax in place to confuse the Perl parser in 
> xgettext.  As a result, seemingly innocuous Javascript will result in lost 
> messages:
> gettext("xyzzy 1");
> var x = y;
> gettext("xyzzy 2");
> var x = z;
> gettext("xyzzy 3");
> In this sample, messages 1 and 3 are found, and message 2 is not, because 
> y;ABC;abc; is valid Perl for a transliteration operator.  Digging into 
> this, every time I thought I finally understood the full complexity of the 
> brokenness, another case would pop up that didn't make sense.  The full 
> horror of Perl syntax 
> (http://perldoc.perl.org/perlop.html#Quote-and-Quote-like-Operators , for 
> example) means that it is very difficult to treat non-Perl code as Perl and 
> expect everything to be OK.  This is polyglot programming at its worst.
> 
> This needs to be fixed.  To that end, I've written a Javascript lexer 
> (https://bitbucket.org/ned/jslex) with the goal of using it to pre-process 
> Javascript into a form more suitable for xgettext.  My understanding of why 
> we claim Javascript is Perl is that Perl has regex literals like Javascript 
> does, and so xgettext stands the best chance of parsing Javascript as Perl.  
> Clearly that's not working well.  My solution would instead remove the regex 
> literals from the Javascript, and then have xgettext treat it as C.
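
For anyone following along, the preprocessing step described in the quoted 
background can be approximated in a few lines of Python. This is only a 
sketch: the regex and the pythonize() helper below are hypothetical stand-ins 
written for illustration, not copied from makemessages. It shows that only 
own-line // comments are rewritten, so a statement like "var x = y;" reaches 
xgettext untouched and can still be read as the start of a Perl 
transliteration.

```python
import re

# Hypothetical stand-in for the "pythonize" regex described above:
# it rewrites a //-comment that sits on its own line into a #-comment.
PYTHONIZE_RE = re.compile(r"^([ \t]*)//", re.MULTILINE)


def pythonize(js_source: str) -> str:
    """Turn own-line // comments into # comments; leave everything else alone."""
    return PYTHONIZE_RE.sub(r"\1#", js_source)


sample = """\
// translators: greeting
gettext("xyzzy 1");
var x = y;
gettext("xyzzy 2");
var x = z;  // trailing comments are not touched
gettext("xyzzy 3");
"""

converted = pythonize(sample)
print(converted)
```

Everything except the first line comes out unchanged, which is exactly why 
the Perl parser still trips over the rest of the file.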

Thanks Ned. I meant to post about this issue here after 1.3, since we also 
talked about it during the PyCon sprint, and especially since we seem to have 
hit a few more problems with the recent gettext 0.18.1.1 (such as a seemingly 
stricter Perl lexer). I ran into those while applying the final translation 
updates right before 1.3 but haven't had time to investigate yet. The bottom 
line is that I think we should rethink the way we look for translatable 
strings instead of working around the limitations of xgettext.

> 1. Is this the best path forward?  Ideally xgettext would support Javascript 
> directly. There's code out there to add Javascript to xgettext, but I 
> don't know what shape that code is in, or if it's reasonable to expect Django 
> installations to use bleeding-edge xgettext.  Is there some better solution 
> that someone is pursuing?

We can't really expect Django users to upgrade to the most recent (or even an 
unreleased) version of gettext. We bumped the minimum required version to 0.15 
in Django 1.2 once all OSes were covered with installers. That led me to talk 
to Armin Ronacher about using Babel instead of GNU gettext, since it has a 
JavaScript lexer and is in use in Sphinx and Trac. [1] In that sense, I 
wholeheartedly encourage you to take a stab at it for 1.4 -- if you think 
that's a good idea.

Having a Python-based library (assuming it works similarly) seems like a better 
fit for Django than relying on a C program.

> 2. Is there some other badness that will bite us if we tell xgettext that the 
> modified Javascript is C?  With a full Javascript lexer, I feel pretty 
> confident that we could solve issues if they do come up, but I'd like to know 
> now what they are.

I feel it is much better to solve this once and for all than to keep misusing 
xgettext.

Jannis

1: http://babel.edgewall.org/browser/trunk/babel/messages/jslexer.py