Hi Moses-support,

I am currently working on factored models for the English-Turkish language pair, but I am stuck trying to train a model on only the surface factors of the corpus, as explained in the factored tutorial (http://www.statmt.org/moses/?n=Moses.FactoredTutorial).
I used the command below:

    mosesdecoder/scripts/training/train-model.perl --mgiza --external-bin-dir /usr/local/bin/ --corpus factored-corpus/proj-syndicate.1000.clean --root-dir unfactored/ --f de --e en --lm 0:3:/home/ezgi/factored-corpus/surface.lm:0

However, I got unfactored/model/phrase-table.gz instead of unfactored/model/phrase-table.0-0.gz. In addition, my phrase table consists of lines like these:

    selbst|selbst|adv|adv wenn|wenn|kous|kous die|d|art|art.def.e ||| even|even|rb if|if|in the|the|dt ||| 0.5 0.0683685 1 0.138547 2.718 ||| 0-0 1-1 2-2 ||| 2 1 1
    selbst|selbst|adv|adv wenn|wenn|kous|kous sie|sie|pper|pper.nom ||| even|even|rb if|if|in they|they|prp ||| 0.5 0.128719 1 0.102041 2.718 ||| 0-0 1-1 2-2 ||| 2 1 1
    selbst|selbst|adv|adv wenn|wenn|kous|kous ||| even|even|rb if|if|in ||| 0.333333 0.20595 0.666667 0.238095 2.718 ||| 0-0 1-1 ||| 6 3 2
    selbst|selbst|adv|adv wenn|wenn|kous|kous ||| even|even|rb when|when|wrb ||| 1 0.056391 0.333333 0.0396825 2.718 ||| 0-0 1-1 ||| 1 3 1
    selbst|selbst|adv|adv ||| ,|,|, even|even|rb ||| 0.125 0.263158 0.0769231 0.273778 2.718 ||| 0-1 ||| 8 13 1

So the training does not seem to recognize the real surface form (the first factor); instead, it treats the whole factored token as the surface form. I have the same problem with my English-Turkish corpus.

Could you please help me understand what is going wrong? Also, if there is a more detailed tutorial for factored models, comparable to the one for baseline systems (http://www.statmt.org/moses/?n=Moses.Baseline), could you share it?

Thanks in advance,
Ezgi
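P.S. In case it helps anyone reproduce the check, here is a minimal sketch of how the number of factors per token in a phrase-table line can be counted. It is plain Python and not part of Moses; the function name `factor_counts` is made up for illustration, and it assumes that fields are separated by ` ||| ` and factors by `|`, as in the sample lines from my table.

```python
def factor_counts(line):
    """Count the '|'-separated factors of every token in a phrase-table line.

    Returns a pair (source_counts, target_counts), one integer per token.
    Assumes the first two ' ||| '-separated fields are the source and
    target phrases (an assumption based on the sample lines in this post).
    """
    fields = line.strip().split(" ||| ")
    source, target = fields[0], fields[1]

    def count(phrase):
        # A surface-only token has exactly one factor (no '|' inside it).
        return [len(token.split("|")) for token in phrase.split()]

    return count(source), count(target)


# One of the lines from my phrase table.
sample = (
    "selbst|selbst|adv|adv wenn|wenn|kous|kous ||| even|even|rb if|if|in ||| "
    "0.333333 0.20595 0.666667 0.238095 2.718 ||| 0-0 1-1 ||| 6 3 2"
)

src, tgt = factor_counts(sample)
print(src, tgt)
```

For a table trained on the surface factor only, I would expect every count to be 1; for the sample line the counts are larger, which is the behaviour I described.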
