Re: [translate-pootle] Glossary stuff (was: Re: Frequncy list)
On Thursday 22 January 2009 18:13, Samuel Murray (Groenkloof) wrote:
> > Another thing is that in a good glossary doesn't appear words. A good
> > glossary has only concepts as entries, and several entries could have
> > the same word (because words could have several meanings).
>
> That is fine, from an academic point of view, but the fact is that a
> glossary function must have the ability to recognise items from the
> source text that are in the glossary. No program can recognise
> concepts. Only words can be matched. Therefore, glossaries must be
> word based.

Both TMs and glossaries are useful to me because:

a) They make me translate faster.
b) They help me use the same target text for the same source text.
   (Corollary: they help keep a consistent style.)
c) They help me use standard wording, particularly the glossary.

By language standardization I mean reduction of polysemy/synonymy, that
is: do not use the same word/expression to refer to several meanings,
and do not use several words/expressions to refer to a single meaning.

So, my vote goes to a glossary with "meaning" as the "primary key"
concept, with languages and translations subordinated to meaning. That
still makes it possible to look up words, provided that a proper
configuration is set (source and target languages) and that the
glossary contains the pair source<-=meaning=->target.

Sure, if the glossary contains several entries with , each for a
different meaning (obviously), then several can be suggested, each for
its meaning, if the glossary contains a translation for that meaning,
of course.

--
Best regards,
MV

--
This SF.net email is sponsored by: SourcForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
___
Translate-pootle mailing list
Translate-pootle@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/translate-pootle
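The proposal in the message above (meaning as the "primary key", with
languages and translations hanging off it, and word lookup still
possible once source and target languages are configured) could be
sketched roughly as follows. This is only an illustration under
assumptions: the Concept class, the suggest() function and the sample
entries are hypothetical and are not taken from Pootle or any other
tool named in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Concept:
    """One meaning; the terms that express it are grouped by language code."""
    concept_id: str
    definition: str
    terms: Dict[str, List[str]] = field(default_factory=dict)


def suggest(glossary, word, source_lang, target_lang):
    """Return (concept_id, definition, target terms) for each meaning of `word`."""
    needle = word.casefold()
    suggestions = []
    for concept in glossary:
        source_terms = [t.casefold() for t in concept.terms.get(source_lang, [])]
        targets = concept.terms.get(target_lang, [])
        # Suggest only when the word names this meaning AND a translation exists.
        if needle in source_terms and targets:
            suggestions.append((concept.concept_id, concept.definition, targets))
    return suggestions


# Hypothetical sample data: the same English word appears under two meanings.
glossary = [
    Concept("window-gui", "rectangular area of a graphical interface",
            {"en": ["window"], "gl": ["xanela"]}),
    Concept("window-time", "interval of time during which something can happen",
            {"en": ["window"], "gl": ["intervalo"]}),
    Concept("file", "named collection of data on a storage device",
            {"en": ["file"], "gl": ["ficheiro"]}),
]

for concept_id, definition, targets in suggest(glossary, "Window", "en", "gl"):
    print(concept_id, "->", targets, "(" + definition + ")")
```

The match itself is still done on words, so the objection that programs
can only match words is not contradicted; the concept layer only
decides how the hits are grouped and presented to the translator.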
Re: [translate-pootle] Glossary stuff (was: Re: Frequncy list)
>> Another thing is that in a good glossary doesn't appear words. A good
>> glossary has only concepts as entries, and several entries could have
>> the same word (because words could have several meanings).
>
> That is fine, from an academic point of view, but the fact is that a
> glossary function must have the ability to recognise items from the source
> text that are in the glossary. No program can recognise concepts. Only
> words can be matched. Therefore, glossaries must be word based.

Since glossaries are maintained by humans, I think they could be
concept based.

>> Sometimes could be a good idea having several glossaries, because you
>> don't use the same words in Battle for Wesnoth or in Firefox, for
>> example.
>
> Well, I think a super list is not a bad idea. Any project manager can then
> take the super list and make the changes to it that he thinks is best for
> his particular project, but the super list remains unchanged.

I think the terminology server should maintain several glossaries,
without merging them. The CAT tool should be able to work against one
of them, against several of them (at the same time, perhaps merging
some of them in some way), or against all of them (merging them all).
The terminology server could also make it possible to download the
glossaries in several formats or to run queries (searching for a word,
as you can do in open-tran against several TMs).

> Isn't Martin Benjamin working on such a list via AnLoc?
> http://africanlocalisation.net/en/terminology

Perhaps. Lately there are a lot of tools for translating and for
maintaining TMs, glossaries... Too many for me.

>> A good support (or even only support) for glossaries is a great lack
>> of a lot of CAT programs. In Lokalize there is some support for this
>> http://youonlylivetwice.info/lokalize/lokalize-glossary.htm
>
> Well, I think there are four important glossary tasks in CAT tools, namely
> term recognition, term insertion, term adding and term editing.
> Term recognition is an automatic process whereby the tool searches existing
> glossaries for matching terms in the current source text segment. Term
> insertion is the ability to insert a term's translation into the target
> field in some easy way. Term adding is the ability to add terms (and their
> translations) to glossaries used by the term recognition function. Term
> editing is the ability to make changes to existing glossary entries.
>
> Most CAT tools that I know of, offer term recognition. Even if a tool
> offers only term recognition, it can already benefit greatly from a
> pre-existing super glossary.
>
> For comparison: A CAT tool that offers only term recognition (not the other
> three) is OmegaT. A CAT tool that offers both term recognition and term
> insertion, is Pootle. In both OmegaT and Pootle, it is not possible to add
> terms to the glossary without using a separate program. OmegaT's glossaries
> are easier to edit (use a text editor) but you must reload the project each
> time. Pootle's glossaries are more difficult to edit (unless you're running
> a local Pootle), but new terms are recognised immediately (if I remember
> correctly).
>
> From the presentation, it appears that KBab^H^H^H^HLokalize can do term
> recognition, term insertion and term adding (and possibly also term
> editing).

Yes, perhaps it can do term editing, but if we set up a terminology
server, term editing should be treated as a term suggestion that must
be approved by some user of the terminology server (a human).

> A way to judge a CAT tool's term recognition is (a) whether it can do fuzzy
> matching when doing glossary recognition, and (b) whether one can customise
> the matching process using techniques like (i) stemming and (ii) setting
> truncation rules. If I remember correctly, Pootle can do #a but not #b.
> OmegaT can do neither. Wordfast can do #a, #b1 and #b2.

Where is "exact matching"?
I think that in TMs "fuzzy matching" is very important, but in
glossaries it isn't so important.

> A way to judge a CAT tool's term insertion is (a) whether it can be done
> using only the keyboard and (b) whether it can make changes to the target
> text term in the light of the current text (eg (i) if the SL word starts
> with a capital letter, but the glossary item does not, will the CAT tool
> insert the target term with a capital letter, or (ii) if the SL word
> contains an accelerator, can the CAT tool give the inserted translation an
> accelerator also). Pootle fails on both #a and #b. Wordfast can do #a and
> #b1 but not #b2.
>
> How does Lokalize fare in the light of the above?

I really don't know. I don't use Lokalize yet. Ask Shaforostoff.

> What other CAT tools were you thinking of when you made your comment?

I was thinking of Gtranslator, Poedit...

Bye,
Leandro Regueiro
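Two points from this reply, querying one, several or all glossaries
without merging them on the server, and treating exact matching as the
default with fuzzy matching as an optional extra, can be illustrated
with a short sketch. Everything here is an assumption made for
illustration: lookup(), the glossary names and the sample entries are
hypothetical, and the fuzzy step simply uses difflib from the Python
standard library rather than whatever any real CAT tool does.

```python
import difflib


def lookup(term, glossaries, names=None, fuzzy_cutoff=None):
    """Look `term` up in the selected glossaries (all of them if `names` is None).

    Returns a list of (glossary name, matched source term, translation).
    An exact, case-insensitive match is always tried first; fuzzy matching
    is only a fallback, and only when `fuzzy_cutoff` (0..1) is given.
    """
    selected = glossaries if names is None else {n: glossaries[n] for n in names}
    needle = term.casefold()
    hits = []
    for name, entries in selected.items():
        by_key = {source.casefold(): source for source in entries}
        if needle in by_key:
            matched = [by_key[needle]]
        elif fuzzy_cutoff is not None:
            close = difflib.get_close_matches(
                needle, list(by_key), n=3, cutoff=fuzzy_cutoff)
            matched = [by_key[key] for key in close]
        else:
            matched = []
        # Each hit keeps the name of the glossary it came from.
        hits.extend((name, source, entries[source]) for source in matched)
    return hits


# Hypothetical per-project glossaries, kept separate rather than merged.
glossaries = {
    "wesnoth": {"level": "nivel", "unit": "unidade"},
    "firefox": {"bookmark": "marcador", "tab": "lapela"},
}

print(lookup("Bookmark", glossaries))                     # all glossaries, exact
print(lookup("unit", glossaries, names=["firefox"]))      # one glossary only
print(lookup("bookmarks", glossaries, fuzzy_cutoff=0.8))  # fuzzy fallback
```

Because every hit carries its glossary name, the "merge" happens only
at query time and the per-project lists stay untouched.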
[translate-pootle] Glossary stuff (was: Re: Frequncy list)
Leandro Regueiro wrote:

> Another thing is that in a good glossary doesn't appear words. A good
> glossary has only concepts as entries, and several entries could have
> the same word (because words could have several meanings).

That is fine, from an academic point of view, but the fact is that a
glossary function must have the ability to recognise items from the
source text that are in the glossary. No program can recognise
concepts. Only words can be matched. Therefore, glossaries must be
word based.

> Sometimes could be a good idea having several glossaries, because you
> don't use the same words in Battle for Wesnoth or in Firefox, for
> example.

Well, I think a super list is not a bad idea. Any project manager can
then take the super list and make the changes to it that he thinks is
best for his particular project, but the super list remains unchanged.

Isn't Martin Benjamin working on such a list via AnLoc?
http://africanlocalisation.net/en/terminology

> A good support (or even only support) for glossaries is a great lack
> of a lot of CAT programs. In Lokalize there is some support for this
> http://youonlylivetwice.info/lokalize/lokalize-glossary.htm

Well, I think there are four important glossary tasks in CAT tools,
namely term recognition, term insertion, term adding and term editing.
Term recognition is an automatic process whereby the tool searches
existing glossaries for matching terms in the current source text
segment. Term insertion is the ability to insert a term's translation
into the target field in some easy way. Term adding is the ability to
add terms (and their translations) to glossaries used by the term
recognition function. Term editing is the ability to make changes to
existing glossary entries.

Most CAT tools that I know of, offer term recognition. Even if a tool
offers only term recognition, it can already benefit greatly from a
pre-existing super glossary.
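Term recognition as defined above (searching a glossary for terms that
occur in the current source segment) can be sketched in a few lines.
This is a minimal illustration, not how Pootle, OmegaT or any other
tool mentioned here implements it: recognise_terms() and the sample
glossary are hypothetical, and only exact, case-insensitive whole-word
matching is shown.

```python
import re


def recognise_terms(segment, glossary):
    """Return (start offset, matched text, translation) for each term found."""
    found = []
    for source, target in glossary.items():
        # \b keeps "tab" from matching inside "table"; IGNORECASE catches "Tab".
        # Assumes glossary terms start and end with a letter or digit.
        pattern = re.compile(r"\b" + re.escape(source) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(segment):
            found.append((match.start(), match.group(0), target))
    return sorted(found)


# Hypothetical word-based glossary (source term -> translation).
glossary = {
    "tab": "lapela",
    "bookmark": "marcador",
    "status bar": "barra de estado",
}
segment = "Open the bookmark in a new Tab, not in a table."

for start, text, translation in recognise_terms(segment, glossary):
    print(start, text, "->", translation)
```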
For comparison: A CAT tool that offers only term recognition (not the
other three) is OmegaT. A CAT tool that offers both term recognition
and term insertion, is Pootle. In both OmegaT and Pootle, it is not
possible to add terms to the glossary without using a separate
program. OmegaT's glossaries are easier to edit (use a text editor)
but you must reload the project each time. Pootle's glossaries are
more difficult to edit (unless you're running a local Pootle), but new
terms are recognised immediately (if I remember correctly).

From the presentation, it appears that KBab^H^H^H^HLokalize can do
term recognition, term insertion and term adding (and possibly also
term editing).

A way to judge a CAT tool's term recognition is (a) whether it can do
fuzzy matching when doing glossary recognition, and (b) whether one
can customise the matching process using techniques like (i) stemming
and (ii) setting truncation rules. If I remember correctly, Pootle can
do #a but not #b. OmegaT can do neither. Wordfast can do #a, #b1 and
#b2.

A way to judge a CAT tool's term insertion is (a) whether it can be
done using only the keyboard and (b) whether it can make changes to
the target text term in the light of the current text (eg (i) if the
SL word starts with a capital letter, but the glossary item does not,
will the CAT tool insert the target term with a capital letter, or
(ii) if the SL word contains an accelerator, can the CAT tool give the
inserted translation an accelerator also). Pootle fails on both #a and
#b. Wordfast can do #a and #b1 but not #b2.

How does Lokalize fare in the light of the above? What other CAT tools
were you thinking of when you made your comment?

Samuel

--
Samuel Murray
sam...@translate.org.za
Decathlon, for volunteer opensource translations
http://translate.sourceforge.net/wiki/decathlon/
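The two term-insertion adjustments described in this message, (i)
carrying over an initial capital and (ii) carrying over an
accelerator, might look roughly like the sketch below. It is an
illustration only: adapt_target() is a hypothetical helper, "&" is
assumed as the accelerator marker, and the rule for choosing where the
accelerator goes in the translation is a naive assumption, not the
behaviour of Wordfast, Pootle or Lokalize.

```python
def adapt_target(source_word, target_term, accel="&"):
    """Adapt a glossary translation to how the source word appears in the text."""
    plain_source = source_word.replace(accel, "")
    result = target_term
    # (i) Carry over an initial capital from the source word.
    if plain_source[:1].isupper() and result[:1].islower():
        result = result[0].upper() + result[1:]
    # (ii) If the source carries an accelerator, give the translation one too.
    if accel in source_word and accel not in result:
        # Letter that follows the marker in the source ("" if marker is last).
        wanted = source_word[source_word.index(accel) + 1:][:1].casefold()
        # Reuse that letter if the translation contains it; otherwise fall
        # back to the first letter (an assumed policy, chosen for simplicity).
        position = next(
            (i for i, ch in enumerate(result) if ch.casefold() == wanted), 0)
        result = result[:position] + accel + result[position:]
    return result


print(adapt_target("File", "ficheiro"))
print(adapt_target("&File", "ficheiro"))
print(adapt_target("Save &As", "gardar como"))
```

A real tool would also have to avoid accelerator clashes within the
same menu, which this sketch does not attempt.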