> But maybe the historical developers of the fr-es language pair would prefer
> to keep the a="something" comments. So a more elaborate tool would be
> useful to do this kind of work automatically.
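Such a tool does not need to be very complicated. As a rough sketch only: the function name merge_authors is made up for illustration, and the code assumes one <e> entry per line, which may not hold for the whole metadix file. It groups entries that are identical apart from the a="..." attribute and joins the author names with "/". Entries that differ in anything else, such as r="LR", stay on separate lines.

```python
import re

# Matches the a="..." attribute of an entry, including the space before it.
# Assumption: one <e> entry per line, with at most one a="..." attribute.
AUTHOR_RE = re.compile(r'\sa="([^"]*)"')

# Marker for where the a="..." attribute was, so it can be put back later.
PLACEHOLDER = "\x00"


def merge_authors(lines):
    """Merge entries that differ only in their a="..." attribute.

    Returns a new list of lines in first-seen order. Entries without an
    a="..." attribute are only deduplicated against identical lines.
    """
    groups = {}  # entry with the author removed -> list of author names
    for line in lines:
        entry = line.strip()
        if not entry:
            continue
        match = AUTHOR_RE.search(entry)
        if match:
            key = entry[:match.start()] + PLACEHOLDER + entry[match.end():]
            authors = match.group(1).split("/")
        else:
            key = entry
            authors = []
        seen = groups.setdefault(key, [])
        for author in authors:
            if author not in seen:
                seen.append(author)

    merged = []
    for key, authors in groups.items():
        attribute = ' a="%s"' % "/".join(authors) if authors else ""
        merged.append(key.replace(PLACEHOLDER, attribute))
    return merged


if __name__ == "__main__":
    sample = [
        '<e lm="abîme" a="eleka"><i>abîme</i><par n="livre__n"/></e>',
        '<e lm="abîme" a="webform"><i>abîme</i><par n="livre__n"/></e>',
        '<e lm="abîme" a="webform" r="LR"><i>abîme</i><par n="livre__n"/></e>',
    ]
    for entry in merge_authors(sample):
        print(entry)
```

The comparison is purely textual, so two entries that differ only in spacing would not be merged. Whether the r="LR" line should be kept at all is a separate question, and the sketch leaves it alone.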
It should be possible to merge the different authors, comments and lemmas into a single entry. For example:

<e lm="abîme" a="eleka"><i>abîme</i><par n="livre__n"/></e>
<e lm="abîme" a="webform"><i>abîme</i><par n="livre__n"/></e>

would become:

<e lm="abîme" a="eleka/webform"><i>abîme</i><par n="livre__n"/></e>

> Someone from France with a French first name. However, the operator of
> your VoIP phone number does not seem to be listed:
> https://fr.wikipedia.org/wiki/Liste_des_pr%C3%A9fixes_des_op%C3%A9rateurs_de_t%C3%A9l%C3%A9phonie_par_internet_en_France

It is a business number, which may be why, or else the page is out of date.

> Paderni looks like an Italian surname. Is that the pair you are
> working on?

Yes, it is Italian. At the moment I am working on integrating Apertium into our existing tools, so I sometimes run tests on various language pairs.

On that subject, I tried applying lt-trim to the apertium-es-pt dictionaries and then translated a long text to see what difference it made. I noticed that, from Portuguese to Spanish, "resumir-se-ia" was no longer translated as "se resumiría" but as "resumirse-iba". Do you know whether there is any reason not to apply lt-trim to a language pair such as apertium-es-pt, or whether this might be a bug in lt-trim?

--
Gabriel Paderni
www.phone-m.com
+33 9 84 34 20 20

On 2017-01-21 23:32, Bernard Chardonneau wrote:
> Note: Sent again. My account on this mailing list was disabled after
> sending it!
>
> User-Agent: libremail : logiciel libre multilingue
> Date: Sat, 21 Jan 2017 15:03:11 +0100
> From: "Bernard Chardonneau" <[email protected]>
> To: [email protected]
> Subject: Re: [Apertium-stuff] Duplicate entries in apertium-fr-es.fr.metadix
>
>> User-Agent: Roundcube Webmail/1.2.3
>> Date: Thu, 19 Jan 2017 14:16:46 +0100
>> From: [email protected]
>> To: [email protected]
>> Reply-To: [email protected]
>> Subject: [Apertium-stuff] Duplicate entries in apertium-fr-es.fr.metadix
>>
>> Hello,
>> There are some identical entries in apertium-fr-es.fr.metadix that
>> differ only in the author:
>> (The line numbers are from the latest svn version of the file)
>>
>> (6363) <e lm="Abbas" a="eleka"> <i>Abbas</i><par n="Abraham__np"/></e>
>> (6364) <e lm="Abbas" a="webform"> <i>Abbas</i><par n="Abraham__np"/></e>
>>
>> (6426) <e lm="Abidjan" a="eleka"> <i>Abidjan</i><par n="Andorre__np"/></e>
>> (6427) <e lm="Abidjan" a="webform"> <i>Abidjan</i><par n="Andorre__np"/></e>
>>
>> (6430) <e lm="abîme" a="eleka"> <i>abîme</i><par n="livre__n"/></e>
>> (6431) <e lm="abîme" a="webform"> <i>abîme</i><par n="livre__n"/></e>
>>
>> The next line is also the same, but with the LR restriction, which
>> doesn't seem right for this word?
>> (6432) <e lm="abîme" a="webform" r="LR"> <i>abîme</i><par n="livre__n"/></e>
>
> Done for these entries and some nearby ones. They may be very old duplicate
> entries. The reason is that, for this language pair, it used to be possible
> to enter new words directly from a website (webform). Also, lemmas
> were not in alphabetical order until I ran a personal tool to sort them.
> So it is now a bit easier to see these duplicate entries, of which there
> are a lot in the whole file.
>
> About a year ago, I did the same on the fra-por language pair, for which
> I planned to make big changes to the dictionaries to extend word coverage.
> A simple way I found to do that was to remove the a="something" comments
> and then use the "uniq" GNU/Linux - UNIX command to easily remove
> duplicate lines.
>
> But maybe the historical developers of the fr-es language pair would prefer
> to keep the a="something" comments. So a more elaborate tool would be
> useful to do this kind of work automatically.
>
> Is the apertium-dixtools sort command clever enough to remove duplicate
> entries in this case, and also when, for the same word, there is a line
> allowing both translation directions and a line with a restricted direction?
>
> Anyway, the same kind of corrections may need to be done in the
> languages/apertium-fra metadix file. But as this file should include every
> entry from the various French metadix (and at least one dix) files, I don't
> plan to do that merging in the next few months.
>
>> I've also noticed that in the apostrophes/postblank section the lines
>> 29111 to 29167 and 29168 to 29224 are exactly the same:
>> 29111 -> 29167 / 29168 -> 29224
>> <e r="LR" lm="à cause qu'">
>> <p><l>à<b/>cause<b/>qu'</l><r>à<b/>cause<b/>que</r></p><par n="afin_que__cnjadv"/></e>
>> (...............)
>> n="afin_que__cnjadv"/></e>
>> <e r="LR" lm="à mesure qu'">
>> n="afin_que__cnjadv"/></e>
>> <e r="LR" lm="tel qu'">
>> <p><l>tel<b/>qu'</l><r>tel<b/>que</r></p><par n="afin_que__cnjadv"/></e>
>>
>
> Done (by sorting the entries of this section in alphabetical order). As my
> sorting tool was written for both .dix and .metadix files, it only processes
> the main section.
>
>>
>> --
>> Gabriel Paderni
>> www.phone-m.com
>> +33 9 84 34 20 20
>
> Someone from France with a French first name. However, the operator of
> your VoIP phone number does not seem to be listed:
> https://fr.wikipedia.org/wiki/Liste_des_pr%C3%A9fixes_des_op%C3%A9rateurs_de_t%C3%A9l%C3%A9phonie_par_internet_en_France
>
> Paderni looks like an Italian surname. Is that the pair you are
> working on?
>
>>
>> ------------------------------------------------------------------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
>> _______________________________________________
>> Apertium-stuff mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> --------------------------------
> Bernard Chardonneau (France)
> Phone : [33] 9 72 36 32 90
> GSM phone : [33] 7 69 46 16 31
>
> Multilingual websites for my free softwares :
> http://libremail.free.fr and http://libremail.tuxfamily.org
> http://cyloop.tuxfamily.org (mainly translated with Apertium)
>
> My general website (in french only)
> http://bech.free.fr
