Re: [Moses-support] clean-corpus-n.perl Error

2010-11-24 Thread Thomas Meyer
Hi Jinyi, When executing the script the way you did, make sure that the files "raw.chn" and "raw.eng" are located in the same directory as "clean-corpus.pl". Otherwise, give the path to the files: ./clean-corpus-n.perl path/to/raw chn eng output/path/raw.clean 1 100 It might also help to rename
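
The advice above can be checked before running the script. The following is a minimal sketch, assuming the script reads its two inputs as the corpus stem plus each language suffix (the "processing raw.chn & .eng" log line in this thread suggests this). The helper names `expected_inputs` and `missing_inputs` are hypothetical and not part of Moses.

```python
import os


def expected_inputs(corpus_stem, lang1, lang2):
    """Build the two input paths, assuming the <stem>.<lang> naming convention."""
    return [f"{corpus_stem}.{lang1}", f"{corpus_stem}.{lang2}"]


def missing_inputs(corpus_stem, lang1, lang2):
    """Return the expected input files that do not exist on disk."""
    return [
        path
        for path in expected_inputs(corpus_stem, lang1, lang2)
        if not os.path.isfile(path)
    ]


if __name__ == "__main__":
    # Stem and suffixes taken from the command quoted in this thread.
    for path in missing_inputs("path/to/raw", "chn", "eng"):
        print(f"missing input file: {path}")
```

An empty result means both files were found at the given stem; any path it prints is one the script would not be able to open from the current directory.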

[Moses-support] clean-corpus-n.perl Error

2010-11-24 Thread hejinyi
When I executed the command "./clean-corpus-n.perl raw chn eng clean 1 100", it printed the following: clean-corpus.perl: processing raw.chn & .eng to clean, cutoff 1-100 Use of uninitialized value in open at ./clean-corpus-n.perl line 46. Use of uninitialized value in concatenation (.) or string at

Re: [Moses-support] clean-corpus-n.perl Error

2008-03-14 Thread J C Read
Yep. I had that kind of error once. You have to sentence align first. The problem is the lack of documentation and not the programs themselves. Quoting Barry Haddow <[EMAIL PROTECTED]>: > Hi Iain > > You should check that your es and en files both have the same number of > lines. > I think this

Re: [Moses-support] clean-corpus-n.perl Error

2008-03-14 Thread Barry Haddow
Hi Iain You should check that your es and en files both have the same number of lines. I think this error message is telling you that there's a length mismatch, perhaps from the concatenation script? regards Barry On Friday 14 March 2008 14:24:37 Iain Adams wrote: > Dear Mailing List, > > We a
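
The line-count check Barry suggests can be sketched as follows. This is only an illustration; the helper names `count_lines` and `same_line_count` are hypothetical, and the file names in the usage part are placeholders based on the "es and en files" mentioned above.

```python
def count_lines(path):
    """Count lines in a file; a final line without a trailing newline still counts."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def same_line_count(source_path, target_path):
    """Return (match, source_lines, target_lines) for the two sides of a parallel corpus."""
    source_lines = count_lines(source_path)
    target_lines = count_lines(target_path)
    return source_lines == target_lines, source_lines, target_lines


if __name__ == "__main__":
    # Placeholder file names; substitute the actual tokenized files.
    match, n_es, n_en = same_line_count("corpus.tok.es", "corpus.tok.en")
    print(f"es: {n_es} lines, en: {n_en} lines, match: {match}")
```

If the counts differ, the two files are not aligned line by line, which is the mismatch the reply above suspects.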

[Moses-support] clean-corpus-n.perl Error

2008-03-14 Thread Iain Adams
Dear Mailing List, We are trying to train SRILM on Europarl (en-es) but are running into problems. After tokenizing the data, we run clean-corpus-n.perl; however, we get the message /home/aca04iba/en-es/corpus/corpus.tok.en is too long! at /home/aca04iba/en-es/bin/moses-scripts/scripts-20080306-1457/trainin