Is it possible to get the data you used to train the recaser? There is no encoding normalization step in the recaser training procedure.
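In the meantime, one way to check whether the seemingly identical phrase-table entries quoted below really are identical is to compare them at the code-point level. The following is only a rough sketch, not part of Moses: the function names `find_normalization_collisions` and `codepoints` are made up for illustration, and it assumes the phrase table is plain UTF-8 text with fields separated by `|||`. It groups entries by their NFC-normalized source and target and reports groups that contain more than one raw spelling. Note that NFC will not merge ţ (t with cedilla, U+0163) with ț (t with comma below, U+021B), so that kind of variation would need a separate check.

```python
import unicodedata
from collections import defaultdict


def find_normalization_collisions(lines):
    """Return {(nfc_source, nfc_target): {raw (source, target) pairs}}
    for entries that differ in raw code points but match after NFC."""
    groups = defaultdict(set)
    for line in lines:
        fields = [f.strip() for f in line.split("|||")]
        if len(fields) < 2:
            continue  # not a phrase-table entry
        raw = (fields[0], fields[1])
        key = tuple(unicodedata.normalize("NFC", f) for f in raw)
        groups[key].add(raw)
    # Keep only keys reached by more than one distinct raw spelling.
    return {k: v for k, v in groups.items() if len(v) > 1}


def codepoints(text):
    """Render a string as a list of code points, e.g. 'U+0074 U+0327'."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)
```

Feeding it the lines of the uncompressed phrase table and printing `codepoints()` for each raw variant should show whether the duplicates differ in composed versus decomposed characters. If every group has a single raw spelling, the duplication probably comes from somewhere other than Unicode normalization.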
Hieu Hoang
http://www.hoang.co.uk/hieu

On 3 August 2016 at 14:14, Vito Mandorino <vito.mandor...@linguacustodia.com> wrote:

> Dear all,
>
> I encountered a problem when training a recaser. When launching the command
>
> ./mosesdecoder/scripts/recaser/train-recaser.perl --first-step 3 --dir model --corpus corpus.en --train-script ./mosesdecoder/scripts/training/train-model.perl
>
> the phrase-table ends up having several seemingly identical translation
> options:
>
> naţională ||| Naţională ||| 1 ||| 0-0 ||| 30 30 ||| 30 |||
> naţională ||| Naţională ||| 1 ||| 0-0 ||| 36 36 ||| 36 |||
> naţională ||| Naţională ||| 1 ||| 0-0 ||| 39 39 ||| 39 |||
> naţională ||| Naţională ||| 1 ||| 0-0 ||| 4 4 ||| 4 |||
>
> and a segmentation fault occurs when compressing to compact table using
> the processPhraseTableMin executable.
>
> Could that be due to a missing encoding normalization step somewhere in
> the procedure?
> Using a previous version of Moses, the same command above yields just the
> line
>
> naţională ||| Naţională ||| 1 1 1 1 ||| 0-0 ||| 109 109 109 ||| |||
>
> Thanks,
>
> Vito Mandorino
> --
> M. Vito MANDORINO -- Chief Scientist
> The Translation Trustee
> 1, Place Charles de Gaulle, 78180 Montigny-le-Bretonneux
> Tel : +33 1 30 44 04 23 Mobile : +33 6 84 65 68 89
> Email : vito.mandor...@linguacustodia.com
> Website : www.linguacustodia.finance <http://www.linguacustodia.com/>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support