Re: [PATCH RFC 0/2] Mixing English and a local language

2012-09-14 Thread Michael J Gruber
Jeff King venit, vidit, dixit 13.09.2012 20:00:
 On Thu, Sep 13, 2012 at 10:30:52AM -0700, Junio C Hamano wrote:
 
  But it should not be per-command, but per-message, and
  should include all output that is not diagnostic and is not
  machine-parseable (e.g., what I mentioned above, request-pull
  output, etc). If it is the project's language, then the team
  members will need to know it anyway, so it should not be too big a
  burden to have a potentially different language there than in the
  diagnostic messages.

 No matter what the project language is, machine parseable part will
 not be localized but fixed to C anyway, so I do not think it comes
 into the picture.
 
 But there are parts that are neither machine-parseable nor diagnostics.
 The diffstat is one, but I mentioned others. Are those going to be
 forever fixed to LANG=C?
 
 That does not bother me, but for a project whose team works entirely in
 Japanese (both individually, and when sharing code), they will still be
 stuck with these English-language snippets, and no way to localize them.
 Even though they may not speak a word of it.
 
 I have no idea if such a team is a strawman or not; that is why I
 separated points 1 and 2. We can wait on point 2 until such a team shows
 up and complains (of course, they would have to come here and complain
 in English, so...).
 
 My take on this is, if there is the project language, it should
 apply to _everything_.  Please do not introduce any per-command,
 per-message, per-anything mess.  Just set LANG/LC_ALL up and be done
 with it.
 
 But isn't that arguing for localizing diffstat? It is not
 machine-parseable, so an all-Japanese team would want to localize it
 along with their diagnostics.
 
 -Peff
 

The basic assumption is that we have people who are proficient in at
least 2 languages. In fact, the initial i18n efforts were targeted at
people who are much more comfortable in their $LANG than with LANG=C.
For this category, being able to localize everything(*) is important.
They will mostly work with $LANG projects. I don't think they're strawmen.

For those proficient in 2 languages it's desirable to switch per project
because it's likely they participate in projects with different $LANG
preferences. Again, that means localizing everything(*). Additionally,
setting core.i18n in global config is probably the better choice
(compared to NO_GETTEXT=y) for those who are frustrated by git's
translation in their usual $LANG.

[git svn should pass that LANG to svn also etc.]

The question is whether we have people who prefer to work with git in
their $LANG even though project interaction requires a different
language. They would probably run log/gitk/commit... in their $LANG but
need format-patch and the like in project-lang.

I do think we have people in this category here on the list, so they
should speak up ;) Could they alias their format-patch to use -c
core.i18n=C or such? Or have command.i18n on top? per-command config
again ;)

Michael
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC 0/2] Mixing English and a local language

2012-09-14 Thread Nguyen Thai Ngoc Duy
On Fri, Sep 14, 2012 at 5:41 PM, Michael J Gruber
g...@drmicha.warpmail.net wrote:
 For those proficient in 2 languages it's desirable to switch per project
 because it's likely they participate in projects with different $LANG
 preferences. Again, that means localizing everything(*). Additionally,
 setting core.i18n in global config is probably the better choice
 (compared to NO_GETTEXT=y) for those who are frustrated by git's
 translation in their usual $LANG.

 [git svn should pass that LANG to svn also etc.]

We should honor LINGUAS variable on installation. Only languages
listed in that variable are installed. Many, if not most, projects do
that already. That's probably better than yet another switch.
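
A minimal sketch of the LINGUAS filtering described above, written in Python for brevity rather than as the Makefile logic an install step would really use. The function name `languages_to_install` and the exact-match rule are assumptions for illustration; treating an unset variable as "install everything" and an empty one as "install nothing" is the convention assumed here.

```python
import os

def languages_to_install(available, linguas=None):
    """Return the subset of `available` language codes selected by LINGUAS.

    `linguas` defaults to the LINGUAS environment variable. Assumption:
    an unset variable means "install everything", an empty one "nothing".
    """
    if linguas is None:
        linguas = os.environ.get("LINGUAS")
    if linguas is None:
        return list(available)
    wanted = set(linguas.split())  # LINGUAS is a whitespace-separated list
    return [lang for lang in available if lang in wanted]

# Only the listed catalogs survive; order follows `available`.
print(languages_to_install(["de", "vi", "zh_CN"], "vi de"))
```

A user who never wants translations would then install with an empty LINGUAS instead of flipping a separate switch.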

 The question is whether we have people who prefer to work with git in
 their $LANG even though project interaction requires a different
 language. They would probably run log/gitk/commit... in their $LANG but
 need format-patch and the like in project-lang.

 I do think we have people in this category here on the list, so they
 should speak up ;) Could they alias their format-patch to use -c
 core.i18n=C or such? Or have command.i18n on top? per-command config
 again ;)

Probably not needed, but probably won't hurt repeating: I do :) And
things should just work, at least most of the time. When I set LANG, I
prefer to have everything in $LANG unless required otherwise (sending
to English speaking teams is one of them). But the exceptions should
be limited.

On Fri, Sep 14, 2012 at 12:52 AM, Junio C Hamano gits...@pobox.com wrote:
 You seem to be saying that diagnostic does not have to be in project
 language, but I do not think it is the right thing to do.  The first
 response to "Frotz does not work" is often "What do you exactly
 mean?  How did you run Frotz?  What error message are you getting
 from it?", and you do not want to get back the diagnostics in
 Klingon.

Whether you like it or not, all localized software has this problem.
Perhaps the only difference with commercial software is that they have
support line that also understands Klingon. I don't see any problems
with asking the reporter to translate error messages back to English,
assuming that they report in English and so do know English. Given a
specific context, Klingon illiterates can even manually revert Klingon
text back to English because we have all the translations. But
it's probably faster to just ask the reporter.
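
To make the "revert the text back to English" idea concrete, here is a rough sketch that inverts a message catalog. The catalog contents are placeholders, not real translations, and the helper names are hypothetical. It only handles exact template matches; messages with substituted values (numbers, paths) would need pattern matching on top.

```python
def build_reverse_catalog(catalog):
    """Map each translated string back to the msgid(s) that produce it."""
    reverse = {}
    for msgid, msgstr in catalog.items():
        # Several msgids may share one translation, so keep a list.
        reverse.setdefault(msgstr, []).append(msgid)
    return reverse

# Placeholder entries standing in for a real .po file (assumption).
TOY_CATALOG = {
    "nothing to commit": "[tlh] nothing to commit",
    "not a git repository": "[tlh] not a git repository",
}

def to_english(message, reverse):
    """Return candidate English originals, or an empty list if unknown."""
    return reverse.get(message, [])

reverse = build_reverse_catalog(TOY_CATALOG)
print(to_english("[tlh] nothing to commit", reverse))
```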
-- 
Duy


Re: [PATCH RFC 0/2] Mixing English and a local language

2012-09-13 Thread Jeff King
On Wed, Sep 12, 2012 at 11:18:06AM -0700, Junio C Hamano wrote:

 I am so far taking the silence in the thread to mean they do not mind
 seeing the diffstat summary untranslated and they do not mind seeing
 it in Klingon, as long as the three numbers are there with (+) and (-)
 markings.

Actually, I have found the Klingon appearing in the diffstat of recent
messages to the list to be mildly annoying. I can decipher it, of
course, but in some cases I do not even have the glyphs in my font to
render the string, and it is quite ugly.

I think in an ideal world each repo could specify a project language,
and diffstat, Signed-off-by, and [PATCH] would all be in that
language. Practically speaking, I'm not sure how much effort that is
worth; it seems like non-English speakers adapt to a few English phrases
(for example, email headers and date formats are all in English; I
imagine many clients localize them behind the scenes, but certainly the
git format-patch, $EDITOR, git send-email workflow does not and
should not).

I think I'd prefer:

  1. Revert diffstat to always be in English/C locale for now. For all
     commands. People too frequently end up showing the output of things
     besides format-patch. It means they will have to read the English
     when they are just running locally, but since format-patch is
     generating it, it is something that they would need to
     understand anyway.

  2. If people on non-English projects find that too cumbersome, then we
     can switch the English/C above for `i18n.projectlang` or
     something. But it should not be per-command, but per-message, and
     should include all output that is not diagnostic and is not
     machine-parseable (e.g., what I mentioned above, request-pull
     output, etc). If it is the project's language, then the team
     members will need to know it anyway, so it should not be too big a
     burden to have a potentially different language there than in the
     diagnostic messages.

But take my opinion with a grain of salt. English is my first language,
so I have zero first-hand experience with these issues. For most open
source projects that operate in English, I think just (1) will be fine.
The real test for needing (2) would not be a project like git, but a
project conducted solely in another language, where some of the
participants do not speak English at all.

-Peff


Re: [PATCH RFC 0/2] Mixing English and a local language

2012-09-13 Thread Jeff King
On Thu, Sep 13, 2012 at 10:52:08AM -0700, Junio C Hamano wrote:

 Junio C Hamano gits...@pobox.com writes:
 
   But it should not be per-command, but per-message, and
   should include all output that is not diagnostic and is not
   machine-parseable (e.g., what I mentioned above, request-pull
   output, etc). If it is the project's language, then the team
   members will need to know it anyway, so it should not be too big a
   burden to have a potentially different language there than in the
   diagnostic messages.
 
  No matter what the project language is, machine parseable part will
  not be localized but fixed to C anyway, so I do not think it comes
  into the picture.
 
  My take on this is, if there is the project language, it should
  apply to _everything_.  Please do not introduce any per-command,
  per-message, per-anything mess.  Just set LANG/LC_ALL up and be done
  with it.
 
  And I think you justified why that is the right thing to do very
  well in the second sentence in the above paragraph I quoted from
  you.
 
 You seem to be saying that diagnostic does not have to be in project
 language, but I do not think it is the right thing to do.  The first
 response to "Frotz does not work" is often "What do you exactly
 mean?  How did you run Frotz?  What error message are you getting
 from it?", and you do not want to get back the diagnostics in
 Klingon.

By that line of reasoning, wouldn't all git developers be required to
set LANG=C? Fine by me as an English speaker, but I get the impression
that other developers are using the localization. I don't think there is
anything wrong with primarily working in your native language, but
making the effort to switch for communicating with teammates (either
when writing them emails, or using LANG=C when showing them output from
your terminal).
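
The "switch to LANG=C for output you are going to share" habit can be sketched as a small wrapper. This is an illustration only: `c_locale_env` and `format_patch_in_c_locale` are hypothetical helper names, and the wrapper assumes `git` is on PATH.

```python
import os
import subprocess

def c_locale_env(base=None):
    """Copy an environment and force the C locale for messages."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = "C"  # LC_ALL overrides every other LC_* setting
    env["LANG"] = "C"
    return env

def format_patch_in_c_locale(args, cwd=None):
    """Run `git format-patch` with untranslated output (assumes git on PATH)."""
    result = subprocess.run(
        ["git", "format-patch", *args],
        env=c_locale_env(), cwd=cwd,
        check=True, capture_output=True, text=True,
    )
    return result.stdout

# The caller's own locale is left untouched; only the child process changes.
print(c_locale_env({"LANG": "vi_VN.UTF-8"})["LANG"])
```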

If the "switch to LANG=C" thing is a relatively rare thing, I don't see
a problem. The issue with the diffstat is that it is too easy to
accidentally send out a localized one.

-Peff


Re: [PATCH RFC 0/2] Mixing English and a local language

2012-09-12 Thread Nguyen Thai Ngoc Duy
Should I interpret the silence as "I don't care, if you want it, go
for it" or "not acceptable, but no reasons given"? I'd like some form
of it in. Reverting the i18n diffstat patch is the last resort that I
really don't want to do.

On Sun, Aug 26, 2012 at 2:26 AM, Nguyễn Thái Ngọc Duy pclo...@gmail.com wrote:
 The l10n effort leads to a situation where a contributor can submit a
 patch with some auto-generated information in his language, which may
 not be the team's language. We need to make sure exchange medium like
 patch is always in a common language that the team understands.

 Now this team language may not necessarily be English. However there
 are technical difficulties involved in switching between two
 languages. The only way I can think of, on top of gettext, is to provide
 git translations in multiple domains. Say diff machinery uses
 "git-diff" domain while the rest is in "git". We can drive gettext to
 use language X for diff machinery, and Y for the rest. For that, we
 replace gettext() with dgettext().

 It's cumbersome. And there has not been any sign that there will be
 a real user for it. So I assume that the team language will always
 be English. It's simpler and should cover 90% of the user base. If
 someday people ask for that, supporting it is simply a matter of
 rewriting C_() and CQ_() macros in the first patch to use dgettext()
 instead.

 Switching between a language and English is easier. We just need an
 if/else to decide whether to call gettext(). Which is what the first
 patch does, just for certain parts of diff machinery. Error messages
 will always be in native language.

 The second patch puts format-patch output in English unconditionally.
 Again I'm partly lazy and not so sure that there will be needs for
 format-patch to produce in native language. If someone needs it, we
 can introduce a new config key that flips the no_l10n flag back to 0.

 More commands may follow format-patch. I think that 'apply' should also
 use English for non-tty output, unless users request it to be in local
 language. IOW local language is treated pretty much like coloring.

 Nguyễn Thái Ngọc Duy (2):
   Allow to print diffstat in English regardless current locale
   format-patch: always print diffstat in English

  builtin/apply.c |  2 +-
  builtin/log.c   |  1 +
  diff.c          | 19 ++++++++++++-------
  diff.h          |  3 ++-
  4 files changed, 16 insertions(+), 9 deletions(-)

 --
 1.7.12.rc1.27.g6d3049b.dirty




-- 
Duy


Re: [PATCH RFC 0/2] Mixing English and a local language

2012-09-12 Thread Junio C Hamano
Nguyen Thai Ngoc Duy pclo...@gmail.com writes:

 Should I interpret the silence as "I don't care, if you want it, go
 for it" or "not acceptable, but no reasons given"?

I do not speak for the others, but the reason I didn't respond is
none of the above. It is somewhere between "Meh" and "Anything that
says 'local language' and 'English' cannot be worth looking at".

I _think_ the patch was inspired by $gmane/204979, where I said:

    Or LC_ALL=C LANG=C git format-patch

    It does not bother me (even though I do not read Vietnamese), but
    this has been brought up a few times, and we may want to revert the
    i18n of the diffstat summary.  It does not seem to add much value to
    the system but annoys people.  After all, the upstream diffstat
    does not localize this string (I just checked diffstat-1.55 with
    Jan 2012 timestamp).

and I have been waiting to see what others think.  I am so far
taking the silence in the thread to mean they do not mind seeing the
diffstat summary untranslated and they do not mind seeing it in
Klingon, as long as the three numbers are there with (+) and (-)
markings.
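
As a rough illustration of why the numbers and the (+)/(-) markings are enough, here is a sketch that reads a diffstat summary line without looking at the words at all. The function name and regular expressions are assumptions for this example; they rely on each count preceding its marker, which a given translation is not guaranteed to preserve.

```python
import re

def parse_diffstat_summary(line):
    """Return (files, insertions, deletions) from a summary line.

    Only digits and the (+)/(-) markers are inspected, so the words in
    between may be in any language. Missing parts count as 0.
    """
    files = re.match(r"\s*(\d+)", line)
    ins = re.search(r"(\d+)[^,\d]*\(\+\)", line)
    dels = re.search(r"(\d+)[^,\d]*\(-\)", line)
    return (
        int(files.group(1)) if files else 0,
        int(ins.group(1)) if ins else 0,
        int(dels.group(1)) if dels else 0,
    )

print(parse_diffstat_summary(" 4 files changed, 16 insertions(+), 9 deletions(-)"))
# prints (4, 16, 9)
```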

It is bad enough having to decide where the boundary between 'local
language' and 'C locale' should be drawn in the mixture.  I am not
enthused by an attempt to make the boundary tweakable, and worse
yet, to do so per command.

IMHO, we should just decide where to draw the line and be done with
it.  The users already know or can be trained to know to choose the
greatest common denominator when interacting with others.


[PATCH RFC 0/2] Mixing English and a local language

2012-08-25 Thread Nguyễn Thái Ngọc Duy
The l10n effort leads to a situation where a contributor can submit a
patch with some auto-generated information in his language, which may
not be the team's language. We need to make sure exchange medium like
patch is always in a common language that the team understands.

Now this team language may not necessarily be English. However there
are technical difficulties involved in switching between two
languages. The only way I can think of, on top of gettext, is to provide
git translations in multiple domains. Say diff machinery uses
"git-diff" domain while the rest is in "git". We can drive gettext to
use language X for diff machinery, and Y for the rest. For that, we
replace gettext() with dgettext().
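
A sketch of the two-domain idea, using Python's standard `gettext` module as a stand-in for the C API (this is not the code in the patches). The domain names come from the paragraph above; the `locale` directory, the language codes, and the helper names are assumptions. With `fallback=True`, a missing catalog simply returns the untranslated string.

```python
import gettext

def load_domain(domain, language, localedir="locale"):
    """Load one text domain in one specific language.

    fallback=True yields a pass-through catalog when no .mo file exists.
    """
    return gettext.translation(
        domain, localedir=localedir, languages=[language], fallback=True
    )

# Language X for the diff machinery, language Y for everything else.
diff_catalog = load_domain("git-diff", "en")
main_catalog = load_domain("git", "vi")

def diff_text(msgid):
    """Translate a string that belongs to the diff machinery."""
    return diff_catalog.gettext(msgid)

def ui_text(msgid):
    """Translate any other user-facing string."""
    return main_catalog.gettext(msgid)
```

Every call site has to pick the right lookup, which is the cumbersome part: each translatable string must be assigned to a domain up front.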

It's cumbersome. And there has not been any sign that there will be
a real user for it. So I assume that the team language will always
be English. It's simpler and should cover 90% of the user base. If
someday people ask for that, supporting it is simply a matter of
rewriting C_() and CQ_() macros in the first patch to use dgettext()
instead.

Switching between a language and English is easier. We just need an
if/else to decide whether to call gettext(). Which is what the first
patch does, just for certain parts of diff machinery. Error messages
will always be in native language.
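
The if/else can be sketched as follows, again in Python rather than the C of the actual patch. `C_`, `CQ_` and `no_l10n` are names from this description, but their argument order here is a guess, and `diffstat_summary` is a hypothetical helper that only approximates the real summary line.

```python
import gettext

def C_(no_l10n, msgid):
    """Return msgid as-is when no_l10n is set, otherwise translate it."""
    if no_l10n:
        return msgid
    return gettext.gettext(msgid)

def CQ_(no_l10n, singular, plural, n):
    """Plural-aware variant: English rule when no_l10n, else ngettext()."""
    if no_l10n:
        return singular if n == 1 else plural
    return gettext.ngettext(singular, plural, n)

def diffstat_summary(files, insertions, deletions, no_l10n=False):
    """Build a summary line such as ' 1 file changed, 2 insertions(+)'."""
    parts = [CQ_(no_l10n, "%d file changed", "%d files changed", files) % files]
    if insertions:
        parts.append(
            CQ_(no_l10n, "%d insertion(+)", "%d insertions(+)", insertions)
            % insertions
        )
    if deletions:
        parts.append(
            CQ_(no_l10n, "%d deletion(-)", "%d deletions(-)", deletions)
            % deletions
        )
    return " " + ", ".join(parts)

print(diffstat_summary(4, 16, 9, no_l10n=True))
# prints " 4 files changed, 16 insertions(+), 9 deletions(-)"
```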

The second patch puts format-patch output in English unconditionally.
Again I'm partly lazy and not so sure that there will be needs for
format-patch to produce in native language. If someone needs it, we
can introduce a new config key that flips the no_l10n flag back to 0.

More commands may follow format-patch. I think that 'apply' should also
use English for non-tty output, unless users request it to be in local
language. IOW local language is treated pretty much like coloring.
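
Treating local language "like coloring" could look roughly like this. The three setting values mirror the usual always/never/auto choice for colour output; the `want_l10n` name and the setting itself are hypothetical, not an existing git option.

```python
import sys

def want_l10n(stream, setting="auto"):
    """Decide whether output written to `stream` should be translated.

    'always' and 'never' are explicit user requests; 'auto' translates
    only when the stream is a terminal, i.e. a human is reading directly.
    """
    if setting == "always":
        return True
    if setting == "never":
        return False
    return stream.isatty()

# Piped or redirected output falls back to English under 'auto'.
use_local_language = want_l10n(sys.stdout)
```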

Nguyễn Thái Ngọc Duy (2):
  Allow to print diffstat in English regardless current locale
  format-patch: always print diffstat in English

 builtin/apply.c |  2 +-
 builtin/log.c   |  1 +
 diff.c          | 19 ++++++++++++-------
 diff.h          |  3 ++-
 4 files changed, 16 insertions(+), 9 deletions(-)

-- 
1.7.12.rc1.27.g6d3049b.dirty
