Re: [PATCH 2/3] i18n: Only extract comments marked by special tag
2014-04-18 2:08 GMT+08:00 Junio C Hamano <gits...@pobox.com>:
> Jiang Xin <worldhello@gmail.com> writes:
>
>> When extracting l10n messages, we use the "--add-comments" option to
>> keep comments right above the l10n messages for reference.  But
>> sometimes irrelevant comments are also extracted.  For example, in the
>> following code block, the comment in line 2 will be extracted as a
>> comment for the l10n message in line 3, which is obviously wrong.
>>
>>         { OPTION_CALLBACK, 0, "ignore-removal", &addremove_explicit,
>>           NULL /* takes no arguments */,
>>           N_("ignore paths removed in the working tree (same as --no-all)"),
>>           PARSE_OPT_NOARG, ignore_removal_cb },
>>
>> Since almost all comments for l10n translators are marked with the
>> same prefix (tag), "TRANSLATORS:", it is safe to extract only comments
>> with this special tag, i.e. it is better to call xgettext as:
>>
>>         xgettext --add-comments=TRANSLATORS: ...
>>
>> Also tweak the multi-line comment in "init-db.c" to make it start with
>> the proper tag, not "* TRANSLATORS:" (which has a star before the tag).
>
> Hmph.
>
> I am not very happy with this change, as it would force us to
> special case "Translators" comment to follow a non-standard
> multi-line comment formatting convention.  Is there a way to tell
> xgettext to accept both of these forms?
>
>         /* TRANSLATORS: this is a short comment to help you */
>         _("foo bar");
>
>         /*
>          * TRANSLATORS: this comment is to help you, but it is
>          * a lot longer to fit on just a single line.
>          */
>         _("bar baz");

We cannot pass multiple `--add-comments=TAG` options to xgettext,
because xgettext holds the tag in a single string, not in a list:

        /* Tag used in comment of prevailing domain.  */
        static char *comment_tag;

So if we do not change our multi-line comments for translators, we
must hack gettext in some way.  There may be three ways to do so:

1. When matching comments against TAG, use strstr instead of strncmp.

        /* When the comment tag is seen, it drags in not only the line
           which it starts, but all remaining comment lines.  */
        if (add_all_remaining_comments
            || (add_all_remaining_comments =
                  (comment_tag != NULL
                   && strncmp (s, comment_tag, strlen (comment_tag)) == 0)))

2. Add an extension to the in-comment xgettext instructions.

   There is an undocumented feature in xgettext: the user can provide
   instructions (prefixed by "xgettext:") in comments, such as:

        /*
         * xgettext: fuzzy possible-c-format no-wrap
         * other comments...
         */

   But it does not help much unless we hack xgettext to extend this
   hidden feature, i.e. add a flag that makes xgettext extract the
   comment block unconditionally.  Like:

        /*
         * xgettext: comments
         * TRANSLATORS: this comment is to help you, but it is
         * a lot longer to fit on just a single line.
         */
        _("bar baz");

3. Hack the parser for comments in "gettext-tools/src/x-c.c" (maybe
   the function phase4_getc()) to support various multi-line comment
   styles, such as:

        /*
         * TRANSLATORS: this comment is to help you, but it is
         * a lot longer to fit on just a single line.
         */

        /*
        ** TRANSLATORS: this comment is to help you, but it is
        ** a lot longer to fit on just a single line.
        */

        /
         * TRANSLATORS: this comment is to help you, but it is *
         * a lot longer to fit on just a single line. *
         /

I have CC'ed this mail to the gettext mailing list.  For the full
thread, see:

 * http://thread.gmane.org/gmane.comp.version-control.git/246390/focus=246431

--
Jiang Xin
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
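Note: the difference between the current prefix match and the substring match of proposal 1 can be illustrated with a minimal sketch. The helper names `tag_matches_prefix` and `tag_matches_anywhere` are hypothetical and are not part of gettext; the sketch only assumes, as the message above describes, that a comment line reaches the comparison with its leading "* " still attached.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: prefix match, mirroring the strncmp() test
 * quoted from xgettext above.  The tag must start the comment line.
 */
static bool tag_matches_prefix(const char *comment_line, const char *tag)
{
	return tag != NULL
	       && strncmp(comment_line, tag, strlen(tag)) == 0;
}

/*
 * Hypothetical helper: substring match, as suggested in proposal 1
 * (strstr instead of strncmp).  The tag may appear anywhere in the line.
 */
static bool tag_matches_anywhere(const char *comment_line, const char *tag)
{
	return tag != NULL && strstr(comment_line, tag) != NULL;
}
```

With the tag "TRANSLATORS:", the prefix match accepts "TRANSLATORS: short comment" but rejects "* TRANSLATORS: long comment", while the substring match accepts both. The substring match would also accept a comment that merely mentions the tag in the middle of a sentence, which is one possible drawback of that approach.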
Re: [PATCH 2/3] i18n: Only extract comments marked by special tag
Jiang Xin <worldhello@gmail.com> writes:

>> I am not very happy with this change, as it would force us to
>> special case "Translators" comment to follow a non-standard
>> multi-line comment formatting convention.  Is there a way to tell
>> xgettext to accept both of these forms?
>>
>>         /* TRANSLATORS: this is a short comment to help you */
>>         _("foo bar");
>>
>>         /*
>>          * TRANSLATORS: this comment is to help you, but it is
>>          * a lot longer to fit on just a single line.
>>          */
>>         _("bar baz");
>
> We cannot pass multiple `--add-comments=TAG` options to xgettext,
> because xgettext holds the tag in a single string, not in a list:
>
>         /* Tag used in comment of prevailing domain.  */
>         static char *comment_tag;
>
> So if we do not change our multi-line comments for translators, we
> must hack gettext in some way.  There may be three ways to do so:
> ...
> I have CC'ed this mail to the gettext mailing list.  For the full
> thread, see:
>
>  * http://thread.gmane.org/gmane.comp.version-control.git/246390/focus=246431

This is one of these times when I find myself very fortunate for
being surrounded by competent contributors with good tastes, which I
may not deserve ;-)

Thanks for being thorough.

Having said that, it is only just a single comment, and it is too
much hassle to even think about what to do in the meantime while we
wait until such a change happens and an updated version of gettext
reaches everybody.

Let's take 2/3 as-is.  Documentation/CodingGuidelines may want to
have a sentence or two to explain this, though.

 Documentation/CodingGuidelines | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index dab5c61..b367a85 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -159,10 +159,19 @@ For C programs:
  - Multi-line comments include their delimiters on separate lines from
    the text.  E.g.
 
         /*
          * A very long
          * multi-line comment.
          */
 
+   Note however that a multi-line comment that explains a translatable
+   string to translators uses a different convention of starting with a
+   magic token "TRANSLATORS: " immediately after the opening delimiter,
+   and without an asterisk at the beginning of each line.  E.g.
+
+        /* TRANSLATORS: here is a comment that explains the string
+           to be translated, that follows immediately after it */
+        _("Here is a translatable string explained by the above.");
+
  - Double negation is often harder to understand than no negation
    at all.
Re: [PATCH 2/3] i18n: Only extract comments marked by special tag
Junio C Hamano <gits...@pobox.com> writes:

> Documentation/CodingGuidelines may want to have a sentence or two
> to explain this, though.

After re-reading what I sent out, I realized that the way I singled
out multi-line comments was misleading.  Here is an updated version.

-- >8 --
Subject: [PATCH] i18n: mention "TRANSLATORS:" marker in Documentation/CodingGuidelines

These comments have to have "TRANSLATORS: " at the very beginning
and have to deviate from the usual multi-line comment formatting
convention.

Signed-off-by: Junio C Hamano <gits...@pobox.com>
---
 Documentation/CodingGuidelines | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index dab5c61..f9b8bff 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -164,6 +164,16 @@ For C programs:
          * multi-line comment.
          */
 
+   Note however that a comment that explains a translatable string to
+   translators uses a convention of starting with a magic token
+   "TRANSLATORS: " immediately after the opening delimiter, even when
+   it spans multiple lines.  We do not add an asterisk at the beginning
+   of each line, either.  E.g.
+
+        /* TRANSLATORS: here is a comment that explains the string
+           to be translated, that follows immediately after it */
+        _("Here is a translatable string explained by the above.");
+
  - Double negation is often harder to understand than no negation
    at all.
 
--
1.9.2-651-g78816bc
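Note: a minimal, compilable sketch of the convention described in this patch. Two things here are assumptions made only so the sketch builds on its own: the `_()` macro is a pass-through stand-in rather than the real gettext wrapper, and `format_init_message` is a hypothetical helper loosely modelled on the message printed by init_db().

```c
#include <stdio.h>

/* Stand-in for the gettext _() wrapper so this builds without libintl. */
#define _(msgid) (msgid)

/*
 * Hypothetical helper: formats a message into buf and returns the
 * snprintf() result.  Only the comment placement matters here.
 */
static int format_init_message(char *buf, size_t size,
			       int reinit, const char *dir)
{
	/* TRANSLATORS: The first '%s' is either "Reinitialized
	   existing" or "Initialized empty", and the second '%s'
	   is the verbatim directory name. */
	return snprintf(buf, size, _("%s Git repository in %s\n"),
			reinit ? _("Reinitialized existing")
			       : _("Initialized empty"),
			dir);
}
```

The tag follows the opening delimiter directly and the continuation lines carry no leading asterisk, so a prefix comparison against "TRANSLATORS:" sees the tag at the very start of the comment text.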
Re: [PATCH 2/3] i18n: Only extract comments marked by special tag
Jiang Xin <worldhello@gmail.com> writes:

> When extracting l10n messages, we use the "--add-comments" option to
> keep comments right above the l10n messages for reference.  But
> sometimes irrelevant comments are also extracted.  For example, in the
> following code block, the comment in line 2 will be extracted as a
> comment for the l10n message in line 3, which is obviously wrong.
>
>         { OPTION_CALLBACK, 0, "ignore-removal", &addremove_explicit,
>           NULL /* takes no arguments */,
>           N_("ignore paths removed in the working tree (same as --no-all)"),
>           PARSE_OPT_NOARG, ignore_removal_cb },
>
> Since almost all comments for l10n translators are marked with the
> same prefix (tag), "TRANSLATORS:", it is safe to extract only comments
> with this special tag, i.e. it is better to call xgettext as:
>
>         xgettext --add-comments=TRANSLATORS: ...
>
> Also tweak the multi-line comment in "init-db.c" to make it start with
> the proper tag, not "* TRANSLATORS:" (which has a star before the tag).

Hmph.

I am not very happy with this change, as it would force us to
special case "Translators" comment to follow a non-standard
multi-line comment formatting convention.  Is there a way to tell
xgettext to accept both of these forms?

        /* TRANSLATORS: this is a short comment to help you */
        _("foo bar");

        /*
         * TRANSLATORS: this comment is to help you, but it is
         * a lot longer to fit on just a single line.
         */
        _("bar baz");

> Signed-off-by: Jiang Xin <worldhello@gmail.com>
> ---
>  Makefile          | 2 +-
>  builtin/init-db.c | 8 +++-----
>  2 files changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 2128ce3..a53f3a8 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2102,7 +2102,7 @@ pdf:
>
>  XGETTEXT_FLAGS = \
>          --force-po \
> -        --add-comments \
> +        --add-comments=TRANSLATORS: \
>          --msgid-bugs-address="Git Mailing List <git@vger.kernel.org>" \
>          --from-code=UTF-8
>  XGETTEXT_FLAGS_C = $(XGETTEXT_FLAGS) --language=C \
> diff --git a/builtin/init-db.c b/builtin/init-db.c
> index c7c76bb..56f85e2 100644
> --- a/builtin/init-db.c
> +++ b/builtin/init-db.c
> @@ -412,11 +412,9 @@ int init_db(const char *template_dir, unsigned int flags)
>          if (!(flags & INIT_DB_QUIET)) {
>                  int len = strlen(git_dir);
>
> -                /*
> -                 * TRANSLATORS: The first '%s' is either "Reinitialized
> -                 * existing" or "Initialized empty", the second " shared" or
> -                 * "", and the last '%s%s' is the verbatim directory name.
> -                 */
> +                /* TRANSLATORS: The first '%s' is either "Reinitialized
> +                   existing" or "Initialized empty", the second " shared" or
> +                   "", and the last '%s%s' is the verbatim directory name. */
>                  printf(_("%s%s Git repository in %s%s\n"),
>                         reinit ? _("Reinitialized existing") : _("Initialized empty"),
>                         shared_repository ? _(" shared") : "",
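Note: the question raised in this message, whether both comment forms could be accepted, amounts to skipping the comment decoration before comparing against the tag. The following is a minimal sketch of that idea only. The names `skip_comment_decoration` and `tag_matches_tolerant` are hypothetical, and nothing here reflects how xgettext actually parses comments.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: skip whitespace and '*' characters that
 * decorate the start of a line inside a block comment.
 */
static const char *skip_comment_decoration(const char *line)
{
	while (*line == ' ' || *line == '\t' || *line == '*')
		line++;
	return line;
}

/*
 * Hypothetical helper: prefix match against the tag after the
 * decoration has been skipped, so that "TRANSLATORS: ..." and
 * " * TRANSLATORS: ..." are treated alike.
 */
static bool tag_matches_tolerant(const char *line, const char *tag)
{
	const char *text = skip_comment_decoration(line);

	return strncmp(text, tag, strlen(tag)) == 0;
}
```

Unlike a plain substring search, this still requires the tag to be the first real text on the line, so a comment that only mentions "TRANSLATORS:" in passing would not match.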