Bug#461617: wordpress no-code-duplication l10n [was: Bug#461617]
2008/3/2, Lionel Elie Mamane <[EMAIL PROTECTED]>:
> On Mon, Feb 25, 2008 at 02:08:29PM +0200, Nikolay Bachiyski wrote:
> > 2008/2/17, Lionel Elie Mamane <[EMAIL PROTECTED]>:
>
> >> I haven't examined the crop for tinymce .js files yet. Time for bed :)
>
> > Starting from 2.5 (the next version after 2.3) tinymce will be
> > translated via gettext, so don't bother with it.
>
> Hmm... Well, to make the decision to bother with it or not, I'd like
> to evaluate whether 2.5 will be out in time for Debian's next release;
> do you have a rough idea when 2.5 will be out?

Target date is March 10th. If there is any delay, it won't be much.

Nikolay.

-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]
Bug#461617: wordpress no-code-duplication l10n [was: Bug#461617]
2008/2/17, Lionel Elie Mamane <[EMAIL PROTECTED]>:
> On Tue, Jan 22, 2008 at 09:14:40PM +0200, Nikolay Bachiyski wrote:
>
> > 2008/1/22, Lionel Elie Mamane <[EMAIL PROTECTED]>:
>
> >> I looked only at the French translation up to now. Another thing that
> >> bothers me a bit is code differences between the English and French
> >> versions of Wordpress. Some are French-specific code, but don't break
> >> other languages. An example is (in tinymce):
>
> >>   if ($mce_locale == 'fr_FR') {
> >>       $mce_locale = 'fr';
> >>   }
> >>
> >> Could this kind of code be folded back into wordpress itself?
>
> > They should name their tinymce translations fr_fr.js and this patch
> > won't be needed.
>
> I've also looked a bit at the Spanish one, and for some tags/branches,
> the files are called es_ES.js and for some tags/branches, es.js. A
> bit messy :)
>
> I'm encountering similar issues (sometimes the other way round) with
> .po files: for example the "nl" directory contains nl_NL.po
> files. This makes my "fetch all i18n files from SVN" script much more
> complex :-(
>
> I have a beta version of this script running now. It would help
> greatly if there were stronger naming conventions throughout the
> repository. E.g. if the "nl" directory contained only nl.po and nl.js
> files (no nl_NL.po or nl_NL.js) and conversely, if there is a "nl_NL"
> directory, it contains nl_NL.po and nl_NL.js files, no nl.po or
> nl.js.
>
> Because then I can write my script like:
>
>  for lang in $(svn ls YOUR_REPO); do
>    svn export YOUR_REPO/$lang/tags/VERSION/messages/$lang.po
>  done
>
> instead of having to really code recursive directory traversal and
> call "svn ls" on every directory to intelligently find .po and ll.js
> and ll_cc.js files.
>
> Also, always find the .po file for wordpress admin interface in
> tags/VERSION/messages/, not elsewhere (see example of it_IT below)...
>
> Would it be possible to ask the translators to adhere to such naming
> conventions (and move old directories around to match, or create new
> ones)? Thanks in advance.

Yes, I will ask them. However, svn is sometimes daunting for translators.

> I also notice that several languages ship en.js files (in the svn
> repo): sv_SE, ja, da_DK, pt_PT, sr_RS, it_IT, ... Is this normal?

Probably they keep it for reference.

> Currently, my script finds the following .po files tagged for 2.3.3:
>
> ca.po da_DK.po es_ES.po fr_FR.po id_ID.po ja.po ko_KR.po ru_RU.po ru_UA.po
>
> It misses:
>
>  - ar because it is in branches instead of tags
>  - it_IT because it is in /it_IT/tags/2.3.3/wp-includes/languages/it_IT.po
>    instead of in /it_IT/tags/2.3.3/messages/it_IT.po
>  - other languages that have not tagged; some seem not updated anymore
>    (e.g. fi), but some seem active (e.g. de_DE), simply not tagged for
>    2.3.3.
>
> I haven't examined the crop for tinymce .js files yet. Time for bed :)

Starting from 2.5 (the next version after 2.3) tinymce will be
translated via gettext, so don't bother with it.

Nikolay.
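
Until such a naming convention exists, a fetch script has to cope with both
spellings (an "nl" directory holding nl_NL.po, or an "es_ES" directory holding
es.po). A minimal sketch of that matching step, assuming the script already
has the list of file names in a directory; pick_po is a hypothetical helper
name, not part of the script mentioned above:

```shell
#!/bin/sh
# pick_po DIR FILE...
# Hypothetical helper: print the .po file that belongs to directory DIR.
# An exact match (nl_NL/ -> nl_NL.po) wins; otherwise fall back to the short
# code (es_ES/ -> es.po) or to any ll_CC variant (nl/ -> nl_NL.po).
pick_po() {
    lang=${1%/}          # "svn ls" prints directories with a trailing slash
    shift
    short=${lang%%_*}    # language part before the first underscore
    for f in "$@"; do
        case $f in
            "$lang.po") printf '%s\n' "$f"; return 0 ;;
        esac
    done
    for f in "$@"; do
        case $f in
            "$short.po"|"$short"_??.po) printf '%s\n' "$f"; return 0 ;;
        esac
    done
    return 1             # nothing suitable, e.g. only a reference en.po
}
```

For example, `pick_po nl/ nl_NL.po readme.txt` prints nl_NL.po, and a
directory that only ships en.po for reference yields a non-zero exit status.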
Bug#461617: wordpress no-code-duplication l10n
2008/2/16, Lionel Elie Mamane <[EMAIL PROTECTED]>:
> On Sat, Feb 16, 2008 at 08:33:59AM +0100, Lionel Elie Mamane wrote:
> > On Thu, Jan 31, 2008 at 04:18:38PM +0200, Nikolay Bachiyski wrote:
>
> >> However, I like the idea of specially-formatted comments -- something
> >> like:
>
> >> wp_die(/*WP_I18N_NOWPCONF*/"There doesn't seem to be a
> >> wp-config.php file..."/*/WP_I18N_NOWPCONF*/);
>
> >> Thus we can put these strings into the mo file and replace them on the
> >> localized package build stage.
>
> > I'm starting a shell script "proof of concept" implementation after
> > breakfast.
>
> Herewith attached is:
>
>  - a (partial) patch for wordpress 2.3.3
>  - a tarball with my static-l10n code, and example .po files for
>    English and French, with some.
>
> for a first draft of what it would look like. Please let me know what
> you think; in particular, if I finish it and polish it up, is it good
> for commit in Wordpress (I'll send a patch against trunk in that case,
> obviously). Here is how it is used:
>
>  tar xfz i18n-tools-0.0.1.tar.gz -C /some/place
>  /some/place/compile-static-i18n
>  /some/place/translate-static /path/to/wordpress/tree LANGUAGE_CODE
>
> This will translate the static strings between /*WP_I18N_START_FOO*/
> and /*WP_I18N_END_FOO*/ in files where they occur. The list of values
> for FOO and the list of files is static in the translate-static
> script.
>
> It requires sed, a POSIX-compliant /bin/sh and GNU gettext installed.
>
> The suggested way for a programmer to change the English string is to
> change it in en.po and run translate-static. And then, warn all
> translators. Or we can drop en.po and developers can have the fun of
> chasing every occurrence of the string and change it the same way
> everywhere; they'd still have to warn translators. See also below.
>
> Where shall we place the contents of the said tarball? Directly as a
> subdirectory "l10n-tools" of wordpress? In a separate SVN repository
> (maybe where I would commit directly), e.g. in the wordpress-i18n
> repository, under tools?
>
> Bugs and problems:
>
>  - The translator sees the FOO part of the placeholder (e.g. NOWPCONF)
>    as msgid instead of the English version of the string; he has to
>    look up in the code or in en.po what the English version of that
>    string is.

This one should be avoided, or we will have to automatically strip the
and parts from translation.

>    As a consequence, if a programmer changes the English version of
>    the string in the code, there is no automatic way for the
>    translators to be notified (the string does not become "fuzzy" in
>    gettext terms).

This is not a big problem.

>    Fixing that is a bit problematic; to look up the English string
>    instead of the placeholder, I'd have to interpret PHP strings,
>    something I'm not very eager to code up in shell...

We will write it in PHP, because it has a decent tokenizer. Also, we
will need some custom code for extracting strings elsewhere.

>  - The implementation does not support multi-line strings and it is a
>    bit delicate to add that support. Mainly include/wp-db.php contains
>    multi-line strings. Would it be OK to change them to single-line
>    strings, with \n escapes for newlines?

If we write it in PHP it will be less of a problem :-)

Would you mind if we move the discussion to our wp-hackers list [0]? I
want the devs to check it out also. It would be nice if you drop a
mail there with the idea, and after that we can write a ticket, put
the code and polish it.

[0] http://lists.automattic.com/mailman/listinfo/wp-hackers

Happy hacking,
Nikolay.
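
The marker replacement described in this message can be sketched in a few
lines of POSIX shell and sed. This is not the code from the attached tarball;
translate_marker is a hypothetical name, the translation is passed in directly
instead of being looked up with gettext, and the sketch assumes single-line
strings whose translation contains no double quote, dollar sign or newline
(the multi-line limitation mentioned above applies here too):

```shell
#!/bin/sh
# translate_marker KEY TRANSLATION < source.php > localized.php
# Hypothetical helper: replace whatever sits between
# /*WP_I18N_START_KEY*/ and /*WP_I18N_END_KEY*/ on a line with the
# double-quoted TRANSLATION, keeping both marker comments in place.
translate_marker() {
    key=$1
    text=$2
    # Escape backslash, '&' and the '|' delimiter for the sed replacement.
    esc=$(printf '%s\n' "$text" | sed 's/[\\|&]/\\&/g')
    # Greedy match: assumes at most one START/END pair per KEY on a line.
    sed "s|\(/\*WP_I18N_START_${key}\*/\).*\(/\*WP_I18N_END_${key}\*/\)|\1\"${esc}\"\2|"
}
```

Because the markers survive the replacement, the same file can be translated
again later, which is the property the specially-formatted comments were
meant to provide.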
Bug#461617: wordpress no-code-duplication l10n [was: Bug#461617]
2008/1/22, Lionel Elie Mamane <[EMAIL PROTECTED]>:
> On Tue, Jan 22, 2008 at 09:14:40PM +0200, Nikolay Bachiyski wrote:
> > 2008/1/22, Lionel Elie Mamane <[EMAIL PROTECTED]>:
>
> >> Well, we would like to have as much translated as is feasible without
> >> duplicating the whole code.
>
> >> What I would be OK with is a scheme where we have one copy of the code
> >> (with English strings or placeholders like @WP_STRING_ERR_NO_CONFIG@),
> >> and string translations are provided in some kind of flat-text file
> >> and then we automatically produce the localised versions by statically
> >> replacing the strings/placeholders in the code.
>
> > We have been thinking for some time how we can deal with that
> > problem, and solutions like yours have been suggested many times. I
> > don't want to make WordPress depend on a build stage. If we
> > incorporate this scheme we will have to replace the placeholders
> > before using the software.
>
> It would have to depend on a build stage only for developers. For
> users, you can put in the tarball you distribute the built version
> (placeholders replaced). Users wouldn't have to do the "replace
> placeholders" stage.
>
> > Also, after we have once replaced them, we are losing the actual
> > placeholders and upgrades can be a nightmare.
>
> The actual placeholders can be left inside specially-formatted PHP
> comments instead of being purely replaced. But how does losing the
> placeholders impact upgrades? Aren't upgrades "replace the whole
> code"?

It won't impact upgrades, but a build stage would make the life of an
average WordPress developer/hacker highly uncomfortable, a step I don't
want us to take.

However, I like the idea of specially-formatted comments -- something like:

wp_die(/*WP_I18N_NOWPCONF*/"There doesn't seem to be a wp-config.php file..."/*/WP_I18N_NOWPCONF*/);

Thus we can put these strings into the mo file and replace them on the
localized package build stage.

Happy hacking,
Nikolay.
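
The static placeholder scheme quoted in this message (one copy of the code,
translations in a flat-text file) could look roughly like the sketch below.
The "KEY=translation" table format and the name apply_placeholders are
assumptions made for illustration; nothing in the thread fixes either, and
keys are assumed to contain only letters, digits and underscores:

```shell
#!/bin/sh
# apply_placeholders TABLE < source.php > localized.php
# Hypothetical helper: TABLE holds one "KEY=translation" pair per line.
# Every @KEY@ placeholder on stdin is replaced by its translation.
apply_placeholders() {
    table=$1
    # Build one sed substitution per table entry.
    script=$(
        while IFS='=' read -r key text; do
            [ -n "$key" ] || continue
            # Escape backslash, '&' and the '|' delimiter for sed.
            esc=$(printf '%s\n' "$text" | sed 's/[\\|&]/\\&/g')
            printf 's|@%s@|%s|g\n' "$key" "$esc"
        done < "$table"
    )
    sed "$script"
}
```

Unlike the comment-marker variant, the placeholder is gone after this step,
which is the loss of information raised as a concern in the reply above.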
Bug#461617: wordpress no-code-duplication l10n [was: Bug#461617]
2008/1/22, Lionel Elie Mamane <[EMAIL PROTECTED]>:
> reopen 461617
> retitle 461617 wordpress: support for all (upstream-supported) languages
> tags 461617 =l10n
> thanks
>
> >> One step: Internationalise (convert to gettext) all remaining
> >> strings in the code. (Such as "could not connect to database", the
> >> initial blog setup process, ...)
>
> > Unfortunately at that stage gettext can't be loaded, so these error
> > messages, and some other files cannot be translated using gettext.
>
> OK, I'm sorry if I come with somewhat stupid questions, or ideas you
> already considered and rejected, but I'd like to understand. Thanks
> for bearing with me (else skip the next quote of you).
>
> While I completely understand e.g. why the error message for
> non-existent wp-config.php cannot use gettext (language to use not
> known yet), I don't understand why, for example, gettext cannot be
> loaded before making the database connection. After all, the path to
> gettext is not in the database, is it? Maybe there is some other
> reason that things are loaded in this order. Maybe I would understand
> why if I tried to actually make the change :)
>
> wp-admin/setup-config.php: It runs independent from wordpress, OK. But
> why can't it use gettext? If the only problem is that wp-config.php
> doesn't exist yet to know what language to use, it could start with a
> language selection screen, couldn't it? In the context of Debian, we
> have a Debian-specific script that creates the MySQL user and database
> and creates the wp-config.php file, so setup-config.php isn't used at
> all.

We need plugins loaded in order to translate strings. Plugins can
modify the current locale or translate strings in some other way than
gettext. However, in order for plugins to be loaded we need the
database running.

I agree about setup-config.php though. We can see what mo files are
there and if there is only one -- use it, if there are more (which
will be a quite rare case) we can let users select.
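
The fallback described for setup-config.php can be sketched as follows. It is
written in shell purely as an illustration (WordPress itself would do this in
PHP); choose_locale, its return codes and the directory argument are
hypothetical:

```shell
#!/bin/sh
# choose_locale DIR
# Hypothetical helper: print the locale of the only .mo file in DIR.
# Returns 1 when DIR has no .mo file and 2 when it has several, in which
# case the caller would have to show a language selection screen.
choose_locale() {
    dir=$1
    count=0
    found=
    for mo in "$dir"/*.mo; do
        [ -e "$mo" ] || continue      # an unmatched glob stays literal
        count=$((count + 1))
        found=$(basename "$mo" .mo)
    done
    case $count in
        0) return 1 ;;
        1) printf '%s\n' "$found" ;;
        *) return 2 ;;
    esac
}
```

With a single fr_FR.mo in the directory the function prints fr_FR; the
"several files" case is reported through the exit status rather than decided
here.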

> What we should do is have the wp-config.php file include a language
> setting. Added to my TODO list.
>
> > You should decide whether you want to provide full translations
> > including all error messages, readme and translated first post,
> > first comment and user roles.
>
> Well, we would like to have as much translated as is feasible without
> duplicating the whole code. We don't want to duplicate the whole code
> because then it is a bit nightmarish for security updates: instead of
> one copy of the code to correct it, suddenly we have 20+ copies to
> handle. The same holds for non-security updates, actually :)
>
> What I would be OK with is a scheme where we have one copy of the code
> (with English strings or placeholders like @WP_STRING_ERR_NO_CONFIG@),
> and string translations are provided in some kind of flat-text file
> and then we automatically produce the localised versions by statically
> replacing the strings/placeholders in the code. If we use
> placeholders, something along the lines of
>
>  for l in ${wp_supported_languages}; do
>    for f in ${wp_static_translation_files}; do
>      cp $f $f.tmp
>      for str in ${wp_string_list}; do
>        lstr=$(lookup_translation $l $str)
>        sed "s/$str/$lstr/" < $f.tmp > $f.tmp2
>        cp $f.tmp2 $f.tmp
>      done
>      cp $f.tmp wordpress-$l/$f
>    done
>  done
>
> would do. There are probably still some kinks to be ironed out, but
> would that be imaginable for Wordpress 2.5? I can prepare a patch for
> this infrastructure, if you want.

We have been thinking for some time how we can deal with that
problem, and solutions like yours have been suggested many times. I
don't want to make WordPress depend on a build stage. If we
incorporate this scheme we will have to replace the placeholders
before using the software. Also, after we have once replaced them, we
are losing the actual placeholders and upgrades can be a nightmare.

> Open problem: plugin descriptions. I suppose the wordpress PHP code
> could call gettext on the string after retrieving it from the comments
> in wp-content/plugins/PLUGIN.php ? Something like
>
>   {__($plugin_data['Description'])}$author
>
> instead of
>
>   {$plugin_data['Description']}$author
>
>
> I looked only at the French translation up to now. Another thing that
> bothers me a bit is code differences between the English and French
> versions of Wordpress. Some are French-specific code, but don't break
> other languages. An example is (in tinymce):
>
>   if ($mce_locale == 'fr_FR') {
>       $mce_locale = 'fr';
>   }
>
> Could this kind of code be folded back into wordpress itself?

They should name their tinymce translations fr_fr.js and this patch
won't be needed.

> Still in tinymce, there is the
>
>   spellchecker_languages : ...,
>
> line. Could this one be generated dynamically, depending on which
> languages are supported in the current installation?

Yeah, we have to allow dynamic spellchecker languages addition.

>
> > If
Bug#461617:
Hello all,

I am Nikolay Bachiyski and I am dealing with i18n/l10n issues around
WordPress. Here are some comments on the discussion up to now:

> One step: Internationalise (convert to gettext) all remaining
> strings in the code. (Such as "could not connect to database", the
> initial blog setup process, ...)

Unfortunately at that stage gettext can't be loaded, so these error
messages, and some other files cannot be translated using gettext.
There is more information on that topic in the Codex [0].

You should decide whether you want to provide full translations
including all error messages, readme and translated first post, first
comment and user roles.

If the mo files are enough for you, you can take them from the svn
repository [1]. I can ask the translators to tag their work after each
version, so that you can automatically grab the mo files. There are 14
tagged 2.3.2 translations; there are probably some untagged ones in
the repo as well.

On the other hand, if you want to provide full translation, different
packages will be a better idea, because there will be conflicts in
some of the files.

> Another step: Gettextise strings in themes (or at least the default
> theme).

You can find an i18n-ed version of the default theme in our
wordpress-i18n svn repository [2]. Tags exist for all versions after
2.3. Translators are encouraged to put their default theme translation
in wordpress-i18n///messages/kubrick/.mo.

[0] http://codex.wordpress.org/Files_For_Direct_Translation
[1] http://svn.automattic.com/wordpress-i18n/
[2] http://svn.automattic.com/wordpress-i18n/theme/