Hi all,
I've trained a model on an English-Vietnamese corpus. After training, I
got the lexical file lex.0-0.n2f. In this file I found lines like the
following:

...
sweet-bay câu 0.0012255
sessions vào 0.0001057
sessions họp 0.0042553
sessions vì 0.0003054
sessions phiên_toà 0.0312500
sessions 2 0.0031949
sessions hàng 0.0004948
sessions vậy_nên 0.1000000
sessions buổi 0.0052174
sessions ban_sáng 0.2500000
sessions học 0.0008078
sessions quý 0.0028169
psalmodic hát 0.0024752
...
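
For anyone who wants to check the same thing on their own file, a minimal Python sketch for collecting every entry for one word might look like the following. It assumes each line holds exactly three whitespace-separated fields (two words and a probability), as in the excerpt above, and that the word of interest is in the first column. The helper names `load_lex_entries` and `best_entries` are hypothetical and not part of Moses.

```python
from collections import defaultdict


def load_lex_entries(lines):
    """Group lexical entries by the word in the first column.

    Assumption: each valid line is "<word> <word> <probability>".
    Anything else (blank lines, "...") is skipped.
    """
    table = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        first, second, prob = parts
        try:
            table[first].append((second, float(prob)))
        except ValueError:
            continue  # third field was not a number
    return table


def best_entries(table, word, n=3):
    """Return up to n (word, probability) pairs, highest probability first."""
    return sorted(table.get(word, []), key=lambda pair: pair[1], reverse=True)[:n]


# A few lines taken from the excerpt above; a real run would pass an open
# file handle, e.g. open("lex.0-0.n2f", encoding="utf-8").
sample = """\
sweet-bay câu 0.0012255
sessions phiên_toà 0.0312500
sessions vậy_nên 0.1000000
sessions ban_sáng 0.2500000
psalmodic hát 0.0024752
""".splitlines()

table = load_lex_entries(sample)
print(best_entries(table, "sessions"))
```

This only confirms that the word occurs in the lexical file. It says nothing about the other model files the decoder loads.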

But when I ran the decoder on the word "sessions", I got this result:

BEST TRANSLATION: sessions|UNK|UNK|UNK [1]  [total=-110.537] <<0.000,
-1.000, -100.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, -23.073, 0.000,
0.000, 0.000, 0.000, 0.000>>
sessions
reset caches
htable class statistics
size 800000 keys 2 acc 4 coll 0 used memory 3126Kb
htable class statistics
size 400000 keys 1 acc 2 coll 0 used memory 1563Kb
htable class statistics
size 400000 keys 3 acc 14 coll 0 used memory 1563Kb
htable class statistics
size 400000 keys 0 acc 0 coll 0 used memory 1563Kb
htable class statistics
size 400000 keys 0 acc 0 coll 0 used memory 1563Kb
reset mmap
Translation took 0.260 seconds
Finished translating

I do not know why "sessions" was translated as an unknown (UNK) word. Could
you please help me? Can I fix this, and if so, how?
Thanks for your help!


Vi
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
