Re: [PATCH 2/3] i18n: Only extract comments marked by special tag

2014-04-18 Thread Jiang Xin
2014-04-18 2:08 GMT+08:00 Junio C Hamano gits...@pobox.com:
 Jiang Xin worldhello@gmail.com writes:

 When extracting l10n messages, we use the --add-comments option to keep
 comments right above the l10n messages for reference.  But sometimes
 irrelevant comments are also extracted.  For example, in the following
 code block, the comment on line 2 will be extracted as a comment for the
 l10n message on line 3, which is obviously wrong.

 { OPTION_CALLBACK, 0, "ignore-removal", &addremove_explicit,
   NULL /* takes no arguments */,
   N_("ignore paths removed in the working tree (same as --no-all)"),
   PARSE_OPT_NOARG, ignore_removal_cb },

 Since almost all comments for l10n translators are marked with the same
 prefix (tag), "TRANSLATORS:", it is safe to extract only comments with
 this special tag.  I.e., it is better to call xgettext as:

 xgettext --add-comments=TRANSLATORS: ...

 Also tweak the multi-line comment in init-db.c to make it start with
 the proper tag, not "* TRANSLATORS:" (which has a star before the tag).

 Hmph.

 I am not very happy with this change, as it would force us to
 special case Translators comment to follow a non-standard
 multi-line comment formatting convention.  Is there a way to tell
 xgettext to accept both of these forms?

 /* TRANSLATORS: this is a short comment to help you */
 _("foo bar");

 /*
  * TRANSLATORS: this comment is to help you, but it is
  * a lot longer to fit on just a single line.
  */
 _("bar baz");


We cannot pass multiple `--add-comments=TAG` options to xgettext,
because xgettext holds the tag in a single string, not in a list:

/* Tag used in comment of prevailing domain.  */
static char *comment_tag;
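
For illustration only, here is a minimal sketch of what matching against
a list of tags could look like.  The helper `matches_any_tag` and its
parameters are hypothetical and not part of xgettext, which compares
against the single `comment_tag` string.  The sketch assumes the comment
line has already had its leading whitespace removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not in gettext): return true when the comment
 * line `s` starts with any of the `ntags` tags in `tags`.
 */
static bool matches_any_tag(const char *s, const char *const *tags,
			    size_t ntags)
{
	size_t i;

	assert(s != NULL);
	for (i = 0; i < ntags; i++) {
		/* Same prefix test as today, repeated for every tag. */
		if (tags[i] != NULL
		    && strncmp(s, tags[i], strlen(tags[i])) == 0)
			return true;
	}
	return false;
}
```

With a list such as { "TRANSLATORS:", "* TRANSLATORS:" }, both comment
styles would pass the prefix test, while an unrelated comment such as
"takes no arguments" would still be rejected.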

So if we do not want to change our multi-line comments for translators, we
must hack gettext in some way.

There may be three ways to hack gettext:

1. When matching comments against TAG, use strstr instead of strncmp.

/* When the comment tag is seen, it drags in not only the line
   which it starts, but all remaining comment lines.  */
if (add_all_remaining_comments
    || (add_all_remaining_comments =
        (comment_tag != NULL
         && strncmp (s, comment_tag, strlen (comment_tag)) == 0)))
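
To make the difference concrete, here is a small sketch that compares
the current prefix test with the strstr variant.  The helper names
`tag_matches_prefix` and `tag_matches_anywhere` are made up for this
mail and are not taken from gettext.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Prefix test, as in the excerpt above: tag must start the line. */
static bool tag_matches_prefix(const char *s, const char *tag)
{
	assert(s != NULL);
	return tag != NULL && strncmp(s, tag, strlen(tag)) == 0;
}

/* Substring test (option 1): tag may appear anywhere in the line. */
static bool tag_matches_anywhere(const char *s, const char *tag)
{
	assert(s != NULL);
	return tag != NULL && strstr(s, tag) != NULL;
}
```

With the tag "TRANSLATORS:", the prefix test rejects a line such as
"* TRANSLATORS: ..." while the substring test accepts it.  The substring
test also accepts any comment that merely mentions the tag in the middle
of a line, which is the cost of this approach.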

2. Add an extension to the in-comment xgettext instructions.

There is an undocumented feature in xgettext: the user can provide
instructions (prefixed by "xgettext:") in comments, such as:

/*
 * xgettext: fuzzy possible-c-format no-wrap
 * other comments...
 */

But it does not help much unless we hack xgettext to extend this
hidden feature, i.e. add an additional flag so that the comment block
is referenced unconditionally. Like:

/*
 * xgettext: comments
 * TRANSLATORS: this comment is to help you, but it is
 * a lot longer to fit on just a single line.
 */
 _("bar baz");
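
A rough sketch of how such a flag might be recognized follows.  The
function `has_xgettext_flag` is hypothetical, the "comments" flag does
not exist in xgettext today, and the code does not mirror the real
parser; it only shows a whole-word lookup after the "xgettext:" marker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return true when `line` contains "xgettext:"
 * followed by a blank-separated word equal to `flag`.
 */
static bool has_xgettext_flag(const char *line, const char *flag)
{
	static const char marker[] = "xgettext:";
	size_t flag_len;
	const char *p;

	assert(line != NULL && flag != NULL);
	flag_len = strlen(flag);

	p = strstr(line, marker);
	if (p == NULL)
		return false;
	p += strlen(marker);

	while (*p != '\0') {
		size_t word_len;

		p += strspn(p, " \t");		/* skip blanks */
		word_len = strcspn(p, " \t");	/* length of next word */
		if (word_len == 0)
			break;
		if (word_len == flag_len && strncmp(p, flag, flag_len) == 0)
			return true;
		p += word_len;
	}
	return false;
}
```

A comment line " * xgettext: comments" would then enable extraction of
the whole block, while " * xgettext: fuzzy" would not.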

3. Hack the comment parser in gettext-tools/src/x-c.c (maybe the
function phase4_getc()) to support various multi-line comment styles,
such as:

/*
 * TRANSLATORS: this comment is to help you, but it is
 * a lot longer to fit on just a single line.
 */

/*
** TRANSLATORS: this comment is to help you, but it is
** a lot longer to fit on just a single line.
*/

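/********************************************************
 * TRANSLATORS: this comment is to help you, but it is  *
 * a lot longer to fit on just a single line.           *
 ********************************************************/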
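
As a sketch of the idea, assuming the parser hands us one comment line
at a time, the leading decoration could be skipped before the tag is
compared.  The helpers `skip_comment_decoration` and
`line_starts_with_tag` are hypothetical names, not functions from
x-c.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Skip leading blanks and '*' decoration on one comment line. */
static const char *skip_comment_decoration(const char *line)
{
	assert(line != NULL);
	while (*line == ' ' || *line == '\t' || *line == '*')
		line++;
	return line;
}

/* Prefix-match the tag after the decoration has been skipped. */
static bool line_starts_with_tag(const char *line, const char *tag)
{
	const char *text = skip_comment_decoration(line);

	return tag != NULL && strncmp(text, tag, strlen(tag)) == 0;
}
```

This accepts " * TRANSLATORS:" and "** TRANSLATORS:" as well as the
plain form, and still rejects comments that do not begin with the tag.
It would also strip asterisks that are part of the comment text itself,
so a real change would need to be more careful.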


I have CC'ed this mail to the gettext mailing list. For the full thread, see:

 * http://thread.gmane.org/gmane.comp.version-control.git/246390/focus=246431

-- 
Jiang Xin
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] i18n: Only extract comments marked by special tag

2014-04-18 Thread Junio C Hamano
Jiang Xin worldhello@gmail.com writes:

 I am not very happy with this change, as it would force us to
 special case Translators comment to follow a non-standard
 multi-line comment formatting convention.  Is there a way to tell
 xgettext to accept both of these forms?

 /* TRANSLATORS: this is a short comment to help you */
 _("foo bar");

 /*
  * TRANSLATORS: this comment is to help you, but it is
  * a lot longer to fit on just a single line.
  */
 _("bar baz");


 We cannot pass multiple `--add-comments=TAG` options to xgettext,
 because xgettext holds the tag in a single string, not in a list:

 /* Tag used in comment of prevailing domain.  */
 static char *comment_tag;

 So if we do not want to change our multi-line comments for translators, we
 must hack gettext in some way.

 There may be three ways to hack gettext:
 ...
 I have CC'ed this mail to the gettext mailing list. For the full thread, see:

  * http://thread.gmane.org/gmane.comp.version-control.git/246390/focus=246431

This is one of these times when I find myself very fortunate for
being surrounded by competent contributors with good tastes, which I
may not deserve ;-)

Thanks for being thorough.

Having said that, it is only just a single comment, and it is too
much hassle to even think about what to do in the meantime while we
wait until such a change happens and an updated version of gettext
reaches everybody.  Let's take 2/3 as-is.

Documentation/CodingGuidelines may want to have a sentence or two to
explain this, though.

 Documentation/CodingGuidelines | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index dab5c61..b367a85 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -159,10 +159,19 @@ For C programs:
  - Multi-line comments include their delimiters on separate lines from
    the text.  E.g.
 
    /*
     * A very long
     * multi-line comment.
     */
 
+   Note however that a multi-line comment that explains a translatable
+   string to translators uses a different convention of starting with a
+   magic token "TRANSLATORS: " immediately after the opening delimiter,
+   and without an asterisk at the beginning of each line.  E.g.
+
+   /* TRANSLATORS: here is a comment that explains the string
+      to be translated, that follows immediately after it */
+   _("Here is a translatable string explained by the above.");
+
  - Double negation is often harder to understand than no negation
    at all.


Re: [PATCH 2/3] i18n: Only extract comments marked by special tag

2014-04-18 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 Documentation/CodingGuidelines may want to have a sentence or two to
 explain this, though.

After re-reading what I sent out, I realized that the way I singled
out multi-line comments was misleading.  Here is an updated version.

-- >8 --
Subject: [PATCH] i18n: mention "TRANSLATORS:" marker in Documentation/CodingGuidelines

These comments have to have "TRANSLATORS: " at the very beginning
and have to deviate from the usual multi-line comment formatting
convention.

Signed-off-by: Junio C Hamano <gits...@pobox.com>
---
 Documentation/CodingGuidelines | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index dab5c61..f9b8bff 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -164,6 +164,16 @@ For C programs:
     * multi-line comment.
     */
 
+   Note however that a comment that explains a translatable string to
+   translators uses a convention of starting with a magic token
+   "TRANSLATORS: " immediately after the opening delimiter, even when
+   it spans multiple lines.  We do not add an asterisk at the beginning
+   of each line, either.  E.g.
+
+   /* TRANSLATORS: here is a comment that explains the string
+      to be translated, that follows immediately after it */
+   _("Here is a translatable string explained by the above.");
+
  - Double negation is often harder to understand than no negation
    at all.
 
-- 
1.9.2-651-g78816bc



Re: [PATCH 2/3] i18n: Only extract comments marked by special tag

2014-04-17 Thread Junio C Hamano
Jiang Xin worldhello@gmail.com writes:

 When extracting l10n messages, we use the --add-comments option to keep
 comments right above the l10n messages for reference.  But sometimes
 irrelevant comments are also extracted.  For example, in the following
 code block, the comment on line 2 will be extracted as a comment for the
 l10n message on line 3, which is obviously wrong.

 { OPTION_CALLBACK, 0, "ignore-removal", &addremove_explicit,
   NULL /* takes no arguments */,
   N_("ignore paths removed in the working tree (same as --no-all)"),
   PARSE_OPT_NOARG, ignore_removal_cb },

 Since almost all comments for l10n translators are marked with the same
 prefix (tag), "TRANSLATORS:", it is safe to extract only comments with
 this special tag.  I.e., it is better to call xgettext as:

 xgettext --add-comments=TRANSLATORS: ...

 Also tweak the multi-line comment in init-db.c to make it start with
 the proper tag, not "* TRANSLATORS:" (which has a star before the tag).

Hmph.

I am not very happy with this change, as it would force us to
special case Translators comment to follow a non-standard
multi-line comment formatting convention.  Is there a way to tell
xgettext to accept both of these forms?

/* TRANSLATORS: this is a short comment to help you */
_("foo bar");

/*
 * TRANSLATORS: this comment is to help you, but it is
 * a lot longer to fit on just a single line.
 */
_("bar baz");



 Signed-off-by: Jiang Xin <worldhello@gmail.com>
 ---
  Makefile          | 2 +-
  builtin/init-db.c | 8 +++-----
  2 files changed, 4 insertions(+), 6 deletions(-)

 diff --git a/Makefile b/Makefile
 index 2128ce3..a53f3a8 100644
 --- a/Makefile
 +++ b/Makefile
 @@ -2102,7 +2102,7 @@ pdf:
  
  XGETTEXT_FLAGS = \
   --force-po \
 - --add-comments \
 + --add-comments=TRANSLATORS: \
   --msgid-bugs-address="Git Mailing List <git@vger.kernel.org>" \
   --from-code=UTF-8
  XGETTEXT_FLAGS_C = $(XGETTEXT_FLAGS) --language=C \
 diff --git a/builtin/init-db.c b/builtin/init-db.c
 index c7c76bb..56f85e2 100644
 --- a/builtin/init-db.c
 +++ b/builtin/init-db.c
 @@ -412,11 +412,9 @@ int init_db(const char *template_dir, unsigned int flags)
   if (!(flags & INIT_DB_QUIET)) {
       int len = strlen(git_dir);
  
 -     /*
 -      * TRANSLATORS: The first '%s' is either "Reinitialized
 -      * existing" or "Initialized empty", the second " shared" or
 -      * "", and the last '%s%s' is the verbatim directory name.
 -      */
 +     /* TRANSLATORS: The first '%s' is either "Reinitialized
 +        existing" or "Initialized empty", the second " shared" or
 +        "", and the last '%s%s' is the verbatim directory name. */
       printf(_("%s%s Git repository in %s%s\n"),
              reinit ? _("Reinitialized existing") : _("Initialized empty"),
              shared_repository ? _(" shared") : "",