Or just call the NMT system "NMT+apertium" =P

Zanga Chimombo <z.chimo...@gmail.com> wrote:
> I do not mean to be unduly polemic by questioning the methodology in
> choosing what to compare, neither do I want to overlook the shortfalls
> of Apertium/RBMT, however, if Apertium was "good enough" to create
> corpora for use in ENG-CAT NMT via English-Spanish Europarl corpus and
> Spanish-Catalan Apertium, surely a "fairer" comparison would have been
> the English-Spanish pair?
>
> On Sun, Oct 18, 2020 at 9:06 AM Jaume Ortolà i Font
> <jaumeort...@gmail.com> wrote:
>>
>> Message from Hèctor Alòs i Font <hectora...@gmail.com> on Sun,
>> 18 Oct 2020 at 7:50:
>>>
>>> Xavi, I am impressed that you could in Softcatalà get enough
>>> bilingual texts to create an English-Catalan neural
>>> translator. Congratulations on the results! I am curious to know
>>> how big the corpus you collected has been, as well as from which
>>> sources to ensure the quality of the translations.
>>
>>
>> The corpora used can be found here:
>> https://github.com/Softcatala/en-ca-corpus
>>
>> One of the corpora is an automatic translation of the
>> English-Spanish Europarl corpus using Spanish-Catalan Apertium. It
>> has proved good enough to train the neural translator.
>>
>> The neural translator could be improved with better corpora and
>> using more powerful hardware in the training. The vocabulary size is
>> limited because of hardware constraints.
>>
>>>
>>> I'd maybe add that probably it would not be possible to collect
>>> such a corpus for Valencian Catalan, so I guess we face in this
>>> neural translator a typical problem with lesser-used
>>> languages/varieties. If it is ever considered necessary to generate
>>> Valencian, this will have to be done by translating it into
>>> "reference" Catalan and then automatically adapting it. In fact the
>>> same happens for the many flavours we currently have in Apertium
>>> for Catalan, both Valencian and "Catalonian".
>>
>>
>> It is easy to make a Catalan>Valencian adapter (a few lines of code
>> using LanguageTool). Not so easy the other way around because some
>> Valencian verbal forms are ambiguous.
>>
>>>
>>> By the way, is Softcatalà trying to create a neural translator for
>>> the Spanish-Catalan pair?
>>
>>
>> Not yet. Neural translators require a lot of hardware resources, in
>> training and in production. We could not support the current volume
>> of Spanish-Catalan translations with neural translation.
>>
>> Jaume Ortolà
>>
>> _______________________________________________
>> Apertium-stuff mailing list
>> Apertium-stuff@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff