On Thu, 13 Jun 2019 at 20:54, Francis Tyers <fty...@prompsit.com> wrote:
> On 2019-06-13 22:34, Danielle Rossetti Dos Santos wrote:
> > Hello,
> >
> > I'm working with the monolingual transfer rule learning code and have
> > a few questions:
> >
> > 1. I see some language pairs used to have a multi mode (such as in
> > this old version of eng-cat [1]). They also used to have "poly"
> > dictionaries (such as this one [2]). These files seem necessary for
> > the latest monolingual rule learning script I've found [3]. Why do
> > language pairs no longer have a multi mode or poly dictionaries?
>
> They are deprecated.
>
> > 2. Is there a script that can generate a poly dictionary from a
> > bilingual dictionary?
>
> Not really, no; it is deprecated.

Multi modes seem to be an important step in the training of lexical
selection rules using monolingual corpora. According to both of the
following pages, the mode is mandatory:

http://wiki.apertium.org/wiki/Running_the_monolingual_rule_learning
and
http://wiki.apertium.org/wiki/Generating_lexical-selection_rules_from_monolingual_corpora

Is there some way around this? Alternatively, might it be appropriate to
restore (and de-deprecate) poly dictionaries and multi modes so that
lexical selection rules can be learned? Or do you have some other
suggestion for how to proceed here?

--
Jonathan

> > 3. The third step in the monolingual rule learning script I linked
> > above says this should be run:
> >
> > cat europarl.en-es.es.tagged | ~/source/apertium-lex-tools/multitrans ~/source/apertium-en-es/en-es.autobil -m -f -t -n > europarl.en-es.es.multi-trimmed
> >
> > I was trying to do this step with the apertium-en-pt language pair
> > using 10% of the English-Portuguese Europarl corpus. I stopped the
> > program because the output file was getting really big (dozens of
> > gigabytes). Is this expected behavior from ./multitrans with the -m
> > option? If so, how are the English-Spanish Europarl examples run?
>
> Yes, they are run with a very large hard disk. :)
>
> However, it would be helpful to know
>
> 1) what kind of output you are getting
> 2) what the exact setup is that you are using.
>
> F.
>
> in order to work out if there
>
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
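
One way to answer the two questions quoted above (what the output looks like, and how fast it grows) without filling a disk might be to run the step on a small prefix of the corpus first. The following is only a sketch under assumptions: `sample_stage` is a hypothetical helper name, not part of apertium-lex-tools, and the 1000-line prefix in the usage comment is an arbitrary choice. It pipes the first N lines of a file through whatever command is given and prints the input and output sizes in bytes.

```shell
#!/bin/sh
# sample_stage INPUT LINES OUTPUT COMMAND [ARGS...]
# Hypothetical helper: feed the first LINES lines of INPUT to COMMAND,
# write the result to OUTPUT, and report both sizes in bytes.
sample_stage() {
    input=$1
    lines=$2
    output=$3
    shift 3

    # Run the given command on the prefix only.
    head -n "$lines" "$input" | "$@" > "$output"

    # $(( ... )) normalises any whitespace padding that wc may emit.
    in_bytes=$(( $(head -n "$lines" "$input" | wc -c) ))
    out_bytes=$(( $(wc -c < "$output") ))

    echo "input_bytes=$in_bytes output_bytes=$out_bytes"
}

# Possible usage with the command quoted in this thread (paths as given there):
# sample_stage europarl.en-es.es.tagged 1000 sample.multi-trimmed \
#     ~/source/apertium-lex-tools/multitrans \
#     ~/source/apertium-en-es/en-es.autobil -m -f -t -n
```

Comparing the two byte counts for a few prefix sizes should give a rough idea of the growth factor, and the small output file can be pasted into a reply to the list.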