Hi Nadeem,

Please make large files, such as your corpus, available for download rather than emailing them. I personally use Dropbox.

To answer your question: there are many | characters in your corpus. You must either remove the lines that contain them, or use the script
  scripts/tokenizer/escape-special-chars.perl
to escape them.
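
If you choose removal, remember that the corpus is parallel: a line dropped from one side must also be dropped from the other, or the sentence alignment is lost. A minimal sketch of that, assuming one sentence per line and equally long files; `drop_pipe_pairs` and the file names in the usage line are hypothetical, not part of Moses:

```shell
# Hypothetical helper (not a Moses script): keep only the sentence pairs
# in which neither side contains a literal "|".
# Usage: drop_pipe_pairs SRC TGT OUT_SRC OUT_TGT
drop_pipe_pairs() {
  # Create/truncate the outputs so they exist even if no pair survives.
  : > "$3" || return 1
  : > "$4" || return 1
  awk -v tgt="$2" -v out_src="$3" -v out_tgt="$4" '
    {
      # Read the matching line from the target side; stop if it is shorter.
      if ((getline tgt_line < tgt) <= 0) exit 1
      # index() does a plain substring search, so "|" is not a regex here.
      if (index($0, "|") == 0 && index(tgt_line, "|") == 0) {
        print $0 > out_src
        print tgt_line > out_tgt
      }
    }' "$1"
}
```

For example: drop_pipe_pairs hin-eng-train-lw.hn hin-eng-train-lw.en clean.hn clean.en (the .en name is only a guess at the other half of your corpus). The phrase table then has to be rebuilt from the cleaned files, since the exception is thrown while the existing phrase table is being read.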

You can see which lines have the | character using the command line below:


 $ grep -nF "|" hin-eng-train-lw.* | head -2
hin-eng-train-lw.hn:772:? ? ??? ??? ??????? ???? ???? ?? ??? ??????? |
hin-eng-train-lw.hn:773:????? ?? ?????? ?????? ? ? ? ? ??? ???? ?? ?? ? ? ( 5 ?? ? ? ?? 16 ?? ) ?? ??? ?? ? ??? ?? ???? ??? ?? ??? ? ?? ? ? ??? ? ? ???? ????? |
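
To judge whether removing or escaping is the better option, it may help to count the affected lines in each file first. A rough sketch, assuming plain-text corpus files; `count_pipe_lines` is a hypothetical helper name, not a Moses tool:

```shell
# Hypothetical helper: print "<file><TAB><number of lines containing |>"
# for every file given as an argument.
count_pipe_lines() {
  for f in "$@"; do
    # -c counts matching lines; -F treats "|" as a literal string.
    printf '%s\t%s\n' "$f" "$(grep -cF '|' "$f")"
  done
}
```

Called as count_pipe_lines hin-eng-train-lw.* it prints one line per corpus file.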

On 03/03/2014 18:09, moses-support-ow...@mit.edu wrote:

The error is still there:

Exception: moses/Word.cpp:112 in void Moses::Word::CreateFromString(Moses::FactorDirection, const std::vector<unsigned int>&, const StringPiece&, bool) threw StrayFactorException because `fit'. You have configured 1 factors but the word | contains factor delimiter | too many times.

My training corpus is attached to this message.


On Thursday, February 27, 2014 6:15 AM, Philipp Koehn <pko...@inf.ed.ac.uk> wrote:
Hi,

as the error message says, please remove all bar characters "|" from your
training corpus when building the phrase table.

-phi

On Wed, Feb 26, 2014 at 7:58 PM, nadeem khan <nad_sta...@yahoo.com> wrote:
> Hi all;
> I am getting this error while running decoder with alignment flags:
>
> FeatureFunction: UnknownWordPenalty0 start: 9 end: 9
> line=PhraseDictionaryMemory input-factor=0 output-factor=0
> path=/home/legends/work/hin-eng/f5/model/phrase-table.gz num-features=5
> table-limit=20
> FeatureFunction: PhraseDictionaryMemory0 start: 10 end: 14
> Loading SRILM0
> /home/legends/work/hin-eng/f5/lm/urd-eng.lm: line 4317: warning: non-zero
> probability for <unk> in closed-vocabulary LM
> Loading Distortion0
> Loading LexicalReordering0
> Loading table into memory...done.
> Loading WordPenalty0
> Loading UnknownWordPenalty0
> Loading PhraseDictionaryMemory0
> Start loading text SCFG phrase table. Moses  format : [8.000] seconds
> Reading /home/legends/work/hin-eng/f5/model/phrase-table.gz
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> ****************************************************************************************************
> Exception: moses/Word.cpp:112 in void
> Moses::Word::CreateFromString(Moses::FactorDirection, const
> std::vector<unsigned int>&, const StringPiece&, bool) threw
> StrayFactorException because `fit'.
> You have configured 1 factors but the word | contains factor delimiter | too
> many times.
>
>
> Please help me fix it.
> THANKS
> Regards
> Nadeem
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
