Hi all, I've trained a model on an English-Vietnamese corpus. After training, I got the lexical file *lex.0-0.n2f*. In this file I found lines such as the following:
...
sweet-bay câu 0.0012255
*sessions* vào 0.0001057
*sessions* họp 0.0042553
*sessions* vì 0.0003054
*sessions* phiên_toà 0.0312500
*sessions* 2 0.0031949
*sessions* hàng 0.0004948
*sessions* vậy_nên 0.1000000
*sessions* buổi 0.0052174
*sessions* ban_sáng 0.2500000
*sessions* học 0.0008078
*sessions* quý 0.0028169
psalmodic hát 0.0024752
...

But when I ran the decoder on the word "*sessions*", I got this result:

BEST TRANSLATION: sessions|UNK|UNK|UNK [1] [total=-110.537] <<0.000, -1.000, -100.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, -23.073, 0.000, 0.000, 0.000, 0.000, 0.000>>
sessions
reset caches
htable class statistics size 800000 keys 2 acc 4 coll 0 used memory 3126Kb
htable class statistics size 400000 keys 1 acc 2 coll 0 used memory 1563Kb
htable class statistics size 400000 keys 3 acc 14 coll 0 used memory 1563Kb
htable class statistics size 400000 keys 0 acc 0 coll 0 used memory 1563Kb
htable class statistics size 400000 keys 0 acc 0 coll 0 used memory 1563Kb
reset mmap
Translation took 0.260 seconds
Finished translating

I do not know why *sessions* was translated as an unknown (UNK) word, even though it has entries in the lexical file. Could you please help me? Can I fix this, and if so, how?

Thanks for your help!

Vi
_______________________________________________ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support