Or just call the NMT system "NMT+apertium" =P 

Zanga Chimombo <z.chimo...@gmail.com>
wrote:

> I do not mean to be unduly polemical by questioning the methodology
> used to choose what to compare, nor do I want to overlook the
> shortcomings of Apertium/RBMT. However, if Apertium was "good enough"
> to create corpora for the ENG-CAT NMT system (via the English-Spanish
> Europarl corpus and Spanish-Catalan Apertium), surely a "fairer"
> comparison would have been the English-Spanish pair?
>
> On Sun, Oct 18, 2020 at 9:06 AM Jaume Ortolà i Font
> <jaumeort...@gmail.com> wrote:
>>
>> Message from Hèctor Alòs i Font <hectora...@gmail.com> on Sun,
>> 18 Oct 2020 at 7:50:
>>>
>>> Xavi, I am impressed that Softcatalà managed to gather enough
>>> bilingual text to create an English-Catalan neural
>>> translator. Congratulations on the results! I am curious to know
>>> how big the corpus you collected is, and which sources you drew
>>> on to ensure the quality of the translations.
>>
>>
>> The corpora used can be found here:
>> https://github.com/Softcatala/en-ca-corpus
>>
>> One of the corpora is an automatic translation of the
>> English-Spanish Europarl corpus using Spanish-Catalan Apertium. It
>> has proved good enough to train the neural translator.
>>
>> The neural translator could be improved with better corpora and
>> more powerful hardware for training. The vocabulary size is
>> limited because of hardware constraints.
>>
>>>
>>> I'd maybe add that it would probably not be possible to collect
>>> such a corpus for Valencian Catalan, so I guess this neural
>>> translator faces a typical problem of lesser-used
>>> languages/varieties. If generating Valencian is ever considered
>>> necessary, it will have to be done by translating into
>>> "reference" Catalan and then automatically adapting the output.
>>> In fact the same happens for the many flavours we currently have
>>> in Apertium for Catalan, both Valencian and "Catalonian".
>>
>>
>> It is easy to make a Catalan>Valencian adapter (a few lines of code
>> using LanguageTool). The other way around is not so easy, because
>> some Valencian verbal forms are ambiguous.
>>
>>>
>>> By the way, is Softcatalà trying to create a neural translator for
>>> the Spanish-Catalan pair?
>>
>>
>> Not yet. Neural translators require a lot of hardware resources, in
>> training and in production. We could not support the current volume
>> of Spanish-Catalan translations with neural translation.
>>
>> Jaume Ortolà
>>
>> _______________________________________________
>> Apertium-stuff mailing list
>> Apertium-stuff@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
