Re: Proposal for declinations in gettext

2003-06-15 Thread Kenneth Rohde Christiansen
And Hjerterkonge in Danish. ;-)

Kenneth

On Mon, 2003-06-16 at 01:37, Bernd Groh wrote:
> I have to agree here (with complete strings being required, even if it
> appears like a thousand redundant messages in English), and I don't even
> see a solution for the particular problem of sprintf("%s of %s", "king",
> "heart"). This composition already fails in several languages, and a
> separate entry for each card is required, since 'King of Hearts' is
> 'Herzkönig' in German and not anything with 'of'.
-- 
Kenneth Rohde Christiansen <[EMAIL PROTECTED]>

--
Linux-UTF8:   i18n of Linux on all levels
Archive:  http://mail.nl.linux.org/linux-utf8/



Re: Proposal for declinations in gettext

2003-06-15 Thread Bernd Groh
I have to agree here (with complete strings being required, even if it
appears like a thousand redundant messages in English), and I don't even
see a solution for the particular problem of sprintf("%s of %s", "king",
"heart"). This composition already fails in several languages, and a
separate entry for each card is required, since 'King of Hearts' is
'Herzkönig' in German and not anything with 'of'.
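
As a rough illustration of the difference (a minimal Python sketch, not
code from any of the programs discussed): the rank and suit lists, the
CATALOG dict standing in for a German .po file, and the function names
are all assumptions made for the example. Only 'Herzkönig' comes from
this thread; the word-by-word entries are deliberately literal.

```python
from itertools import product

# Hypothetical rank and suit names; a real card game would define its own.
RANKS = ["ace", "2", "3", "4", "5", "6", "7", "8", "9", "10",
         "jack", "queen", "king"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]

# Stand-in for a German catalog. Only "Herzkönig" is taken from the thread;
# the word-by-word entries are deliberately literal, for illustration.
CATALOG = {
    "king of hearts": "Herzkönig",
    "%s of %s": "%s von %s",
    "king": "König",
    "hearts": "Herz",
}


def translate(msgid):
    """Return the catalog entry, or the msgid itself if there is none."""
    return CATALOG.get(msgid, msgid)


def composed_name(rank, suit):
    """Fragile: translate the pieces separately and paste them together."""
    return translate("%s of %s") % (translate(rank), translate(suit))


def complete_msgids():
    """Robust: one complete msgid per card, each translated as a whole."""
    return ["%s of %s" % (rank, suit) for rank, suit in product(RANKS, SUITS)]


if __name__ == "__main__":
    print(len(complete_msgids()))            # 52 separate entries
    print(translate("king of hearts"))       # Herzkönig
    print(composed_name("king", "hearts"))   # König von Herz
```

The composed variant can only ever produce "piece, glue, piece", so no
translation of "%s of %s" can yield the single compound word.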

Cheers,
Bernd


Callum McKenzie wrote:

>>>msgid "king"
>>
>>The problem seems obvious to me: It is plain incorrect that "king" is
>>a separate msgid, if it is meant to be pasted in different
>>contexts. Instead, it should be added into any context where it is
>>meant to be pasted into, forming separate msgids.
>>
>>
>
>Except that in the example given this would result in several thousand
>messages (and this is after creating strings for all the possible card
>names rather than doing sprintf("%s of %s", "king", "heart")). 
>
>Since it is clear that almost any generated string will break some rule in
>some language, thousands of strings may be the only "complete" solution. In
>this particular case I have some creative solutions in mind to avoid the
>problem, but that can't happen in general and I fear there will always be
>some programs with either lots of strings or really bad translations into
>some languages.
>
>Hopefully this will be a very small number.
>
> - Callum
>
>
>___
>gnome-i18n mailing list
>[EMAIL PROTECTED]
>http://lists.gnome.org/mailman/listinfo/gnome-i18n
>  
>


-- 
Dr. Bernd R. Groh Email: [EMAIL PROTECTED]
I18n/L10n Engineering Phone: +61 7 3872 4847
Red Hat Asia-Pacific  Fax  : +61 7 3257 4800

"Everything we know is an illusion, nothing we know is real,
 nothing real we can know, illusion is what we call reality."

Disclaimer: http://apac.redhat.com/disclaimer





Re: [Translation-i18n] Proposal for declinations in gettext

2003-06-15 Thread Danilo Segan
Bruno Haible wrote:
> Danilo Segan wrote:
>> The usual practice among English-speaking programmers is to "compose"
>> strings out of smaller parts.
>
> You need to educate the programmer to use entire sentences. You can
> refer them to the gettext documentation, section "Preparing Translatable
> Strings". http://www.gnu.org/manual/gettext/html_chapter/gettext_3.html#SEC15

Yes, I'm aware of that, and that is the "perfect" solution. Still, it
sometimes seems impractical (as in the card game mentioned, with
thousands of combinations). In cases like that, and where the programmer
is "uneducated" about translation, this kind of feature would help get
correct translations for at least 20 languages (a wild estimate would be
50+), and that cannot be all bad, right?

Also, I agree that this will "encourage" (actually, let them live with)
bad strings, and that's the only negative side of the approach. If we're
to educate, I don't see what's wrong with educating about The Good Ways
while still having a feature that would really help translators now (it
could even be slightly "hidden" so programmers don't even know about it
if they didn't read all the details of the documentation :-)).

> The reason is that in most languages sentences are not composed by
> juxtaposition, as in English:
>   - For Serbian, you have given examples.
>   - In many languages, a verb's form is spelled differently depending
>     on the gender of the subject.

Serbian is one of those, and the proposed mechanism would not solve
those problems, I agree. Yet I never claimed it to be the ultimate
solution for all translator needs, just for some (just as plural-forms
tackles one particular problem and doesn't go any further).

Still, I quite rarely come across strings that would be wrong because of
the gender, and it's my *impression* (so take it with a [big] grain of
salt) that the declination problems are more common. So I just wanted
to hear experiences from other languages, and how the "solution" would
work for them (all of that in terms of current programs and current
translations).

>   - In Latin, the combiner "and" comes as a suffix "-que".
>   - Etc. etc.

>> The translation for "Workspace %d" would look like:
>> msgid "Workspace %d"
>> msgstr<0> "der Workspace %d"
>> msgstr<1> "das Workspace %d"
>> msgstr<2> "dem Workspace %d"
>> msgstr<3> "den Workspace %d"
>>
>> So, the title of "Workspace 5" would be "der Workspace 5", while the
>> menu which allows switching to that workspace would read "Switch to den
>> Workspace 5".
>
> There are more bits of context that influence a translation than just a
> declination. For example, the beginning of a sentence is special. To pursue
> your example, an English programmer would be tempted to write
>
>   "%<0>s is empty."

Just a nitpick: no change would be required to the "original" string, so
it would be "%s is empty".

> which would have the German translation
>
>   "%<0>s ist leer."
>
> and result in the final string
>
>   "der Workspace %d ist leer."
>
> which is wrong because, in German, all sentences must start with a capital letter.

Yes, quite so. As I already mentioned, there are far too many cases for
any simple mechanism like the proposed one to handle. I never claimed
that it solves all problems. In fact, German could handle this using
some "tricks". E.g., if they don't know enough context to tell whether
this would come in the middle or at the start of a sentence, they could
use 8 declinations (4 for mid-sentence, 4 for the start of a sentence),
and they would translate this particular example as e.g.

   "%<4>s ist leer."

(if they've put the uppercased forms second, so the 4th [zero-based
counting] would be the nominative to be used at the start of sentences).
Still, there are more problems: if a string composed this way were used
both at the start of a sentence and in the middle, this mechanism would
not be sufficient.
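
The eight-form trick can be sketched as follows (a minimal Python
sketch under stated assumptions: the %<N>s placeholder is only the
syntax proposed in this thread, not an existing gettext feature, and
expand() and FORMS are hypothetical names for the example):

```python
import re

# Forms 0-3 are the ones from the "Workspace %d" example; forms 4-7 are the
# same strings with an uppercase first letter, for the start of a sentence.
LOWER = ["der Workspace %d", "das Workspace %d",
         "dem Workspace %d", "den Workspace %d"]
FORMS = LOWER + [s[0].upper() + s[1:] for s in LOWER]

# Matches the proposed (hypothetical) placeholder syntax %<N>s.
_PLACEHOLDER = re.compile(r"%<(\d+)>s")


def expand(template, forms):
    """Replace each %<N>s in template with forms[N]."""
    return _PLACEHOLDER.sub(lambda m: forms[int(m.group(1))], template)


if __name__ == "__main__":
    sentence = expand("%<4>s ist leer.", FORMS)
    print(sentence % 5)   # Der Workspace 5 ist leer.
```

Note that the translator of "%s is empty." picks the index, so the
choice between forms 0 and 4 is fixed per message, which is exactly why
a message used in both positions cannot be handled.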

So actually, this mechanism is not really related to declinations but 
rather to different word forms. Declinations are sort of "inspiration".

Still, the problems of implementation seem to be bigger than the problems
of "usefulness", at least to me. And to stress one more time, this is not
a feature to be used by programmers in the future; it's a feature to
solve some problems translators are having.

Cheers,
Danilo


Re: [Translation-i18n] Proposal for declinations in gettext

2003-06-15 Thread Bruno Haible
Danilo Segan wrote:
> The usual practice among English-speaking programmers is to "compose"
> strings out of smaller parts.

You need to educate the programmer to use entire sentences. You can
refer them to the gettext documentation, section "Preparing Translatable
Strings". http://www.gnu.org/manual/gettext/html_chapter/gettext_3.html#SEC15

The reason is that in most languages sentences are not composed by
juxtaposition, as in English:
   - For Serbian, you have given examples.
   - In many languages, a verb's form is spelled differently depending
     on the gender of the subject.
   - In Latin, the combiner "and" comes as a suffix "-que".
   - Etc. etc.

> The translation for "Workspace %d" would look like:
> msgid "Workspace %d"
> msgstr<0> "der Workspace %d"
> msgstr<1> "das Workspace %d"
> msgstr<2> "dem Workspace %d"
> msgstr<3> "den Workspace %d"
>
> So, the title of "Workspace 5" would be "der Workspace 5", while the
> menu which allows switching to that workspace would read "Switch to den
> Workspace 5".

There are more bits of context that influence a translation than just a
declination. For example, the beginning of a sentence is special. To pursue
your example, an English programmer would be tempted to write

  "%<0>s is empty."

which would have the German translation

  "%<0>s ist leer."

and result in the final string

  "der Workspace %d ist leer."

which is wrong because, in German, all sentences must start with a capital letter.
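
To make the failure concrete, here is a minimal Python sketch. It
assumes a hypothetical form_gettext(msgid, form) lookup that models the
proposed msgstr<N> entries with an in-memory dict; nothing like it
exists in gettext itself.

```python
# Hypothetical in-memory model of the proposed msgstr<N> entries,
# using the four forms quoted above.
FORMS = {
    "Workspace %d": [
        "der Workspace %d",
        "das Workspace %d",
        "dem Workspace %d",
        "den Workspace %d",
    ],
}


def form_gettext(msgid, form):
    """Return form number `form` of msgid, or msgid itself if unknown."""
    forms = FORMS.get(msgid)
    if forms is None or not 0 <= form < len(forms):
        return msgid
    return forms[form]


if __name__ == "__main__":
    title = form_gettext("Workspace %d", 0) % 5
    sentence = "%s ist leer." % title
    print(sentence)               # der Workspace 5 ist leer.
    print(sentence[0].isupper())  # False
```

The lookup itself works; the problem is that the form was chosen
without knowing that the result would open a sentence.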

Bruno




Re: [Translation-i18n] Re: Proposal for declinations in gettext

2003-06-15 Thread Bruno Haible
Yann Dirson wrote:
> it is difficult in some cases to
> find unique English strings that can be mapped one to one in
> all languages.

A common technique is to use a context marker in the msgid string,
like this:

my_gettext ("[menu item]Open")
my_gettext ("[combobox item]Open")

which translators can translate like this:

msgid "[menu item]Open"
msgstr "Ouvrir"

msgid "[combobox item]Open"
msgstr "Ouvert"

The my_gettext function calls gettext and, if it returns the
untranslated string, strips the "[...]" prefix.
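
A wrapper along those lines might look like this (a minimal sketch in
Python rather than C, for brevity; the `translate` parameter and the
dict standing in for a French catalog are assumptions for the example,
and only the two msgid/msgstr pairs come from the message above):

```python
import gettext


def my_gettext(msgid, translate=gettext.gettext):
    """Translate msgid; if untranslated, strip a leading "[...]" marker."""
    result = translate(msgid)
    if result == msgid and msgid.startswith("["):
        end = msgid.find("]")
        if end != -1:
            # No translation found: show the English text without the marker.
            return msgid[end + 1:]
    return result


if __name__ == "__main__":
    # Stand-in for a French catalog, using the two entries shown above.
    french = {"[menu item]Open": "Ouvrir", "[combobox item]Open": "Ouvert"}

    def lookup(s):
        return french.get(s, s)

    print(my_gettext("[menu item]Open", lookup))        # Ouvrir
    print(my_gettext("[combobox item]Open", lookup))    # Ouvert
    print(my_gettext("[menu item]Open", lambda s: s))   # Open
```

Stripping only when the result equals the msgid means a translation
that happens to start with "[" is left alone.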

See also the gettext documentation, section "GUI program problems".

The only problem (quite small, IMO) with this approach is that translators
must be made aware where the context marker ends.

Bruno
