Hi Nadeem,
Please make large files, such as your corpus file, available for
download rather than emailing them. I personally use Dropbox.
To answer your question: there are many | characters in your corpus. You
must either remove these lines or use the script
scripts/tokenizer/escape-special-chars.perl
to escape them.
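If running the Perl script is inconvenient, the escaping of the bar
character can be sketched in Python as below. This is only an illustration:
the names escape_bars and escape_file are hypothetical, and the sketch
assumes &#124; as the replacement for |. As far as I know the script also
escapes other special characters, so prefer it for real data, and apply the
same escaping to the decoder input as to the training corpus.

```python
def escape_bars(line: str) -> str:
    """Replace every '|' with the entity '&#124;' (assumed replacement)."""
    return line.replace("|", "&#124;")


def escape_file(src_path: str, dst_path: str) -> None:
    """Write a copy of src_path with all bar characters escaped."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(escape_bars(line))


# Example call (output file name is hypothetical):
# escape_file("hin-eng-train-lw.hn", "hin-eng-train-lw.esc.hn")
```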
You can see which lines have the | character using the command line below:
$ grep -n -F "|" hin-eng-train-lw.* | head -2
hin-eng-train-lw.hn:772:? ? ??? ??? ??????? ???? ???? ?? ??? ??????? |
hin-eng-train-lw.hn:773:????? ?? ?????? ?????? ? ? ? ? ??? ???? ?? ?? ?
? ( 5 ?? ? ? ?? 16 ?? ) ?? ??? ?? ? ??? ?? ???? ??? ?? ??? ? ?? ? ? ???
? ? ???? ????? |
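If you choose to remove the lines instead, remember that the corpus is
parallel, so a line has to be dropped from both sides at once to keep the two
files aligned. A minimal sketch, assuming the two sides are named
hin-eng-train-lw.hn and hin-eng-train-lw.en (the .en name is a guess, and
the function and output names are hypothetical):

```python
from typing import Iterable, Iterator, Tuple


def drop_bar_pairs(src: Iterable[str],
                   tgt: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield only the sentence pairs where neither side contains '|'."""
    # zip stops at the shorter input, so both files should have equal length.
    for s, t in zip(src, tgt):
        if "|" not in s and "|" not in t:
            yield s, t


def filter_corpus(src_in: str, tgt_in: str,
                  src_out: str, tgt_out: str) -> int:
    """Filter two parallel files line by line; return the number of pairs kept."""
    kept = 0
    with open(src_in, encoding="utf-8") as fs, \
            open(tgt_in, encoding="utf-8") as ft, \
            open(src_out, "w", encoding="utf-8") as out_s, \
            open(tgt_out, "w", encoding="utf-8") as out_t:
        for s, t in drop_bar_pairs(fs, ft):
            out_s.write(s)
            out_t.write(t)
            kept += 1
    return kept


# Example call (file names are assumptions):
# filter_corpus("hin-eng-train-lw.hn", "hin-eng-train-lw.en",
#               "clean.hn", "clean.en")
```

Dropping pairs loses some training data, so escaping is usually the gentler
option when many lines are affected.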
On 03/03/2014 18:09, moses-support-ow...@mit.edu wrote:
The error is still there:
Exception: moses/Word.cpp:112 in void
Moses::Word::CreateFromString(Moses::FactorDirection, const
std::vector<unsigned int>&, const StringPiece&, bool) threw
StrayFactorException because `fit'.
You have configured 1 factors but the word | contains factor delimiter |
too many times.
My training corpus is attached to this message.
On Thursday, February 27, 2014 6:15 AM, Philipp Koehn
<pko...@inf.ed.ac.uk> wrote:
Hi,
as the error message says, please remove all bar characters "|" from your
training corpus when building the phrase table.
-phi
On Wed, Feb 26, 2014 at 7:58 PM, nadeem khan <nad_sta...@yahoo.com
<mailto:nad_sta...@yahoo.com>> wrote:
> Hi all;
> I am getting this error while running the decoder with alignment flags:
>
> FeatureFunction: UnknownWordPenalty0 start: 9 end: 9
> line=PhraseDictionaryMemory input-factor=0 output-factor=0
> path=/home/legends/work/hin-eng/f5/model/phrase-table.gz num-features=5
> table-limit=20
> FeatureFunction: PhraseDictionaryMemory0 start: 10 end: 14
> Loading SRILM0
> /home/legends/work/hin-eng/f5/lm/urd-eng.lm: line 4317: warning: non-zero
> probability for <unk> in closed-vocabulary LM
> Loading Distortion0
> Loading LexicalReordering0
> Loading table into memory...done.
> Loading WordPenalty0
> Loading UnknownWordPenalty0
> Loading PhraseDictionaryMemory0
> Start loading text SCFG phrase table. Moses format : [8.000] seconds
> Reading /home/legends/work/hin-eng/f5/model/phrase-table.gz
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> ****************************************************************************************************
> Exception: moses/Word.cpp:112 in void
> Moses::Word::CreateFromString(Moses::FactorDirection, const
> std::vector<unsigned int>&, const StringPiece&, bool) threw
> StrayFactorException because `fit'.
> You have configured 1 factors but the word | contains factor delimiter | too
> many times.
>
>
> Please help out in fixing it.
> THANKS
> Regards
> Nadeem
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu <mailto:Moses-support@mit.edu>
> http://mailman.mit.edu/mailman/listinfo/moses-support
>