Re: [translate-pootle] Glossary stuff (was: Re: Frequncy list)

2009-01-22 Thread Marce Villarino
On Thursday 22 January 2009 18:13, Samuel Murray (Groenkloof) wrote:
> > Another thing is that in a good glossary doesn't appear words. A good
> > glossary has only concepts as entries, and several entries could have
> > the same word (because words could have several meanings).
>
> That is fine, from an academic point of view, but the fact is that a
> glossary function must have the ability to recognise items from the
> source text that are in the glossary.  No program can recognise
> concepts.  Only words can be matched.  Therefore, glossaries must be
> word based.

Both TMs and glossaries are useful to me because:
a) They make me translate faster.
b) They help me use the same target text for the same source text.
(Corollary: they help keep a consistent style.)
c) They help me use standard wording, particularly the glossary.

By language standardization I mean reducing polysemy and synonymy, that is, 
not using the same word/expression to refer to several meanings, and also 
not using several words/expressions to refer to a single meaning.

So, my vote goes to a glossary with "meaning" as the "primary key" concept, 
and with languages and translations subordinated to meaning.

That still makes it possible to look up words, provided that a proper 
configuration is set (source and target languages) and that the glossary 
contains the pair source<-=meaning=->target.
Sure, if the glossary contains several entries with the same source word, 
each for a different meaning (obviously), then several translations can be 
suggested, each for its meaning, provided the glossary contains a 
translation for that meaning, of course.
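
As a rough illustration of the layout described above, here is a minimal Python sketch of a glossary keyed by meaning, with a secondary index so that plain words can still be looked up. The class names, the `lookup` signature and the sample entries are all assumptions made for this example; they are not taken from Pootle or any existing tool.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One glossary entry: a meaning, with the terms of each language under it."""
    concept_id: str
    definition: str
    terms: dict[str, list[str]] = field(default_factory=dict)  # language -> terms


class ConceptGlossary:
    """Hypothetical concept-keyed glossary with a word index on top."""

    def __init__(self) -> None:
        self._concepts: dict[str, Concept] = {}
        # Secondary index: (language, lowercased term) -> concept ids
        self._index: dict[tuple[str, str], list[str]] = defaultdict(list)

    def add(self, concept: Concept) -> None:
        self._concepts[concept.concept_id] = concept
        for lang, terms in concept.terms.items():
            for term in terms:
                self._index[(lang, term.lower())].append(concept.concept_id)

    def lookup(self, word: str, source: str, target: str) -> list[tuple[str, list[str]]]:
        """Return (definition, target terms) for every meaning of `word`."""
        suggestions = []
        for concept_id in self._index.get((source, word.lower()), []):
            concept = self._concepts[concept_id]
            translations = concept.terms.get(target, [])
            if translations:  # skip meanings that have no translation yet
                suggestions.append((concept.definition, translations))
        return suggestions


# Illustrative data only: one English word attached to two meanings.
glossary = ConceptGlossary()
glossary.add(Concept("c1", "computer file", {"en": ["file"], "gl": ["ficheiro"]}))
glossary.add(Concept("c2", "tool for smoothing metal", {"en": ["file"], "gl": ["lima"]}))

for definition, translations in glossary.lookup("file", source="en", target="gl"):
    print(definition, "->", translations)
```

The word index is derived from the concept entries, so the glossary stays meaning-based while a CAT tool can still match plain words against it.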

-- 
Best regards,
MV


--
This SF.net email is sponsored by:
SourceForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
___
Translate-pootle mailing list
Translate-pootle@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/translate-pootle


Re: [translate-pootle] Glossary stuff (was: Re: Frequncy list)

2009-01-22 Thread Leandro Regueiro
>> Another thing is that in a good glossary doesn't appear words. A good
>> glossary has only concepts as entries, and several entries could have
>> the same word (because words could have several meanings).
>
> That is fine, from an academic point of view, but the fact is that a
> glossary function must have the ability to recognise items from the source
> text that are in the glossary.  No program can recognise concepts.  Only
> words can be matched.  Therefore, glossaries must be word based.

Since glossaries are maintained by humans, I think they could be
concept based.


>> Sometimes could be a good idea having several glossaries, because you
>> don't use the same words in Battle for Wesnoth or in Firefox, for
>> example.
>
> Well, I think a super list is not a bad idea.  Any project manager can then
> take the super list and make the changes to it that he thinks is best for
> his particular project, but the super list remains unchanged.

I think the terminology server should maintain several glossaries,
without merging them. The CAT tool should be able to work against one
of them, several of them (at the same time, perhaps merging some of
them in some way), or all of them (merging them all). The terminology
server could also offer downloads in several formats and queries
(searching for a word, as you can do in open-tran against several
TMs).
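
A small Python sketch of that idea, under the assumption that each glossary is a simple mapping from a lowercased source word to a list of translations. The function names and the sample entries are invented for illustration and do not come from the real projects.

```python
from typing import Optional


def query(glossaries: dict[str, dict[str, list[str]]],
          word: str,
          use: Optional[list[str]] = None) -> dict[str, list[str]]:
    """Look `word` up in the selected glossaries, which stay separate.

    Returns glossary name -> translations, so the caller can see where
    each suggestion came from. `use=None` means "all glossaries".
    """
    names = list(glossaries) if use is None else use
    hits = {}
    for name in names:
        translations = glossaries[name].get(word.lower())
        if translations:
            hits[name] = translations
    return hits


def merged(hits: dict[str, list[str]]) -> list[str]:
    """Flatten per-glossary hits into one de-duplicated list, keeping order."""
    result = []
    for translations in hits.values():
        for translation in translations:
            if translation not in result:
                result.append(translation)
    return result


# Illustrative data only.
glossaries = {
    "wesnoth": {"turn": ["quenda"]},
    "firefox": {"turn": ["xirar"], "tab": ["lapela"]},
}

print(query(glossaries, "turn", use=["firefox"]))  # one glossary
print(merged(query(glossaries, "turn")))           # all glossaries, merged view
```

Merging happens only at query time, so every glossary can still be maintained and downloaded on its own.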


> Isn't Martin Benjamin working on such a list via AnLoc?
> http://africanlocalisation.net/en/terminology

Perhaps. Lately there are a lot of tools for translating,
maintaining TMs, glossaries... Too many for me.


>> A good support (or even only support) for glossaries is a great lack
>> of a lot of CAT programs. In Lokalize there is some support for this
>> http://youonlylivetwice.info/lokalize/lokalize-glossary.htm
>
> Well, I think there are four important glossary tasks in CAT tools, namely
> term recognition, term insertion, term adding and term editing. Term
> recognition is an automatic process whereby the tool searches existing
> glossaries for matching terms in the current source text segment.  Term
> insertion is the ability to insert a term's translation into the target
> field in some easy way.  Term adding is the ability to add terms (and their
> translations) to glossaries used by the term recognition function.  Term
> editing is the ability to make changes to existing glossary entries.
>
> Most CAT tools that I know of, offer term recognition.  Even if a tool
> offers only term recognition, it can already benefit greatly from a
> pre-existing super glossary.
>
> For comparison:  A CAT tool that offers only term recognition (not the other
> three) is OmegaT.  A CAT tool that offers both term recognition and term
> insertion, is Pootle.  In both OmegaT and Pootle, it is not possible to add
> terms to the glossary without using a separate program.  OmegaT's glossaries
> are easier to edit (use a text editor) but you must reload the project each
> time.  Pootle's glossaries are more difficult to edit (unless you're running
> a local Pootle), but new terms are recognised immediately (if I remember
> correctly).
>
> From the presentation, it appears that KBab^H^H^H^HLokalize can do term
> recognition, term insertion and term adding (and possibly also term
> editing).

Yes, perhaps it can do term editing, but if we set up a terminology
server, term editing should be treated as a term suggestion that
must be approved by some user of the terminology server (a human).
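
One way to picture this suggest-then-approve flow is the following Python sketch. The `TerminologyServer` class, its methods and the status values are hypothetical names chosen for the example, not an existing API.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    source: str
    target: str
    author: str
    status: str = "pending"  # "pending", "approved" or "rejected"


class TerminologyServer:
    """Hypothetical server: edits arrive as suggestions, a human approves them."""

    def __init__(self) -> None:
        self.glossary: dict[str, str] = {}
        self.queue: list[Suggestion] = []

    def suggest(self, source: str, target: str, author: str) -> Suggestion:
        """Record a proposed term; the glossary itself is not changed yet."""
        suggestion = Suggestion(source, target, author)
        self.queue.append(suggestion)
        return suggestion

    def review(self, suggestion: Suggestion, approve: bool) -> None:
        """A human reviewer accepts or rejects a pending suggestion."""
        if suggestion.status != "pending":
            raise ValueError("suggestion already reviewed")
        suggestion.status = "approved" if approve else "rejected"
        if approve:
            self.glossary[suggestion.source] = suggestion.target
```

With this split, a CAT tool only ever calls `suggest`, and the glossary changes only when a reviewer calls `review` with `approve=True`.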


> A way to judge a CAT tool's term recognition is (a) whether it can do fuzzy
> matching when doing glossary recognition, and (b) whether one can customise
> the matching process using techniques like (i) stemming and (ii) setting
> truncation rules.  If I remember correctly, Pootle can do #a but not #b.
>  OmegaT can do neither.  Wordfast can do #a, #b1 and #b2.

What about "exact matching"? I think "fuzzy matching" is very
important in TMs, but not so important in glossaries.


> A way to judge a CAT tool's term insertion is (a) whether it can be done
> using only the keyboard and (b) whether it can make changes to the target
> text term in the light of the current text (eg (i) if the SL word starts
> with a capital letter, but the glossary item does not, will the CAT tool
> insert the target term with a capital letter, or (ii) if the SL word
> contains an accelerator, can the CAT tool give the inserted translation an
> accelerator also).  Pootle fails on both #a and #b. Wordfast can do #a and
> #b1 but not #b2.
>
> How does Lokalize fare in the light of the above?

I really don't know. I don't use Lokalize yet. Ask Shaforostoff.


> What other CAT tools were you thinking of when you made your comment?

I was thinking of Gtranslator, Poedit...

Bye,
Leandro Regueiro


[translate-pootle] Glossary stuff (was: Re: Frequncy list)

2009-01-22 Thread Samuel Murray (Groenkloof)
Leandro Regueiro wrote:

> Another thing is that in a good glossary doesn't appear words. A good
> glossary has only concepts as entries, and several entries could have
> the same word (because words could have several meanings).

That is fine, from an academic point of view, but the fact is that a 
glossary function must have the ability to recognise items from the 
source text that are in the glossary.  No program can recognise 
concepts.  Only words can be matched.  Therefore, glossaries must be 
word based.

> Sometimes could be a good idea having several glossaries, because you
> don't use the same words in Battle for Wesnoth or in Firefox, for
> example.

Well, I think a super list is not a bad idea.  Any project manager can 
then take the super list and make the changes to it that he thinks is 
best for his particular project, but the super list remains unchanged.

Isn't Martin Benjamin working on such a list via AnLoc?
http://africanlocalisation.net/en/terminology

> A good support (or even only support) for glossaries is a great lack
> of a lot of CAT programs. In Lokalize there is some support for this
> http://youonlylivetwice.info/lokalize/lokalize-glossary.htm

Well, I think there are four important glossary tasks in CAT tools, 
namely term recognition, term insertion, term adding and term editing. 
Term recognition is an automatic process whereby the tool searches 
existing glossaries for matching terms in the current source text 
segment.  Term insertion is the ability to insert a term's translation 
into the target field in some easy way.  Term adding is the ability to 
add terms (and their translations) to glossaries used by the term 
recognition function.  Term editing is the ability to make changes to 
existing glossary entries.

Most CAT tools that I know of, offer term recognition.  Even if a tool 
offers only term recognition, it can already benefit greatly from a 
pre-existing super glossary.

For comparison:  A CAT tool that offers only term recognition (not the 
other three) is OmegaT.  A CAT tool that offers both term recognition 
and term insertion, is Pootle.  In both OmegaT and Pootle, it is not 
possible to add terms to the glossary without using a separate program. 
  OmegaT's glossaries are easier to edit (use a text editor) but you 
must reload the project each time.  Pootle's glossaries are more 
difficult to edit (unless you're running a local Pootle), but new terms 
are recognised immediately (if I remember correctly).

From the presentation, it appears that KBab^H^H^H^HLokalize can do term 
recognition, term insertion and term adding (and possibly also term 
editing).

A way to judge a CAT tool's term recognition is (a) whether it can do 
fuzzy matching when doing glossary recognition, and (b) whether one can 
customise the matching process using techniques like (i) stemming and 
(ii) setting truncation rules.  If I remember correctly, Pootle can do 
#a but not #b.  OmegaT can do neither.  Wordfast can do #a, #b1 and #b2.
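
To make the difference between exact matching, fuzzy matching and a truncation rule concrete, here is a minimal Python sketch. It assumes a glossary of single lowercase words, uses the standard library's `difflib.get_close_matches` for the fuzzy step, and treats "truncation" as comparing the first few letters only. None of this is how Pootle, OmegaT or Wordfast actually implement recognition; the function name and parameters are made up for the example.

```python
import difflib
import re
from typing import Optional


def recognise(segment: str,
              glossary: dict[str, str],
              cutoff: float = 0.8,
              trunc: Optional[int] = None) -> list[tuple[str, str, str]]:
    """Return (word in segment, glossary term, translation) for each match.

    Each word is tried in this order: exact match, truncation match
    (only if `trunc` is set), then fuzzy match at the given cutoff.
    """
    terms = list(glossary)
    found = []
    for word in re.findall(r"\w+", segment):
        key = word.lower()
        match = None
        if key in glossary:
            match = key  # exact match
        elif trunc is not None and len(key) >= trunc:
            # Crude truncation rule: the first `trunc` letters must agree.
            stem = key[:trunc]
            match = next((t for t in terms if t[:trunc] == stem), None)
        if match is None:
            close = difflib.get_close_matches(key, terms, n=1, cutoff=cutoff)
            match = close[0] if close else None
        if match is not None:
            found.append((word, match, glossary[match]))
    return found


# Illustrative glossary only.
glossary = {"translate": "traducir", "glossary": "glosario"}

print(recognise("A glosary entry", glossary))           # typo caught by fuzzy matching
print(recognise("Glossaries help", glossary, trunc=7))  # plural caught by truncation
```

In this sketch the plural "Glossaries" is too far from "glossary" for the default fuzzy cutoff, which is the kind of case where a stemming or truncation rule helps.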

A way to judge a CAT tool's term insertion is (a) whether it can be done 
using only the keyboard and (b) whether it can make changes to the 
target text term in the light of the current text (eg (i) if the SL word 
starts with a capital letter, but the glossary item does not, will the 
CAT tool insert the target term with a capital letter, or (ii) if the SL 
word contains an accelerator, can the CAT tool give the inserted 
translation an accelerator also).  Pootle fails on both #a and #b. 
Wordfast can do #a and #b1 but not #b2.
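
Points #b1 and #b2 can be sketched in a few lines of Python. The `adapt` function below is a hypothetical helper, and it assumes that accelerators are marked with `&` (as in many GUI toolkits); the placement rule for the accelerator is deliberately naive and is not taken from any of the tools mentioned.

```python
def adapt(source_word: str, translation: str, marker: str = "&") -> str:
    """Adapt a glossary translation to the form of the source word.

    (i)  If the source word starts with a capital, capitalise the translation.
    (ii) If the source word carries an accelerator, give the translation one:
         on the same letter if the translation contains it, else on the
         first letter.
    """
    plain = source_word.replace(marker, "")
    result = translation

    # (i) carry over an initial capital letter
    if plain[:1].isupper():
        result = result[:1].upper() + result[1:]

    # (ii) carry over an accelerator
    if marker in source_word:
        pos = source_word.index(marker)
        letter = source_word[pos + 1:pos + 2].lower()
        at = result.lower().find(letter) if letter else -1
        if at < 0:
            at = 0  # fall back to the first letter
        result = result[:at] + marker + result[at:]

    return result


# Illustrative calls only.
print(adapt("File", "ficheiro"))
print(adapt("&Print", "imprimir"))
```

A real tool would also have to avoid accelerator clashes within the same menu, which this sketch ignores.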

How does Lokalize fare in the light of the above?

What other CAT tools were you thinking of when you made your comment?

Samuel



-- 
Samuel Murray
sam...@translate.org.za
Decathlon, for volunteer opensource translations
http://translate.sourceforge.net/wiki/decathlon/
