Ah, this would be a problem for you.
I don't know Latin Mongolian so I don't know how to solve it. If you have
any suggestions or code, please let me know.
If you can share the data, that would be great. This would let other people
find out about this language pair.
On 25 July 2013 01:40, Xiang
I think he's reasonably asking that -drop-unknown should drop unknown
words even if they contain digits. Maybe this means another
command-line option.
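
To make the idea concrete, here is a minimal sketch of what a configurable exception set could look like. KeepUnknownWord and keepChars are hypothetical names for illustration only; neither is an existing Moses function or option.

```cpp
#include <string>

// Hypothetical helper, NOT part of Moses: decides whether an unknown word
// is kept (passed through to the output) instead of being dropped.
// keepChars lists the characters that protect a word from being dropped.
// An empty keepChars means every unknown word is dropped.
bool KeepUnknownWord(const std::string &word, const std::string &keepChars) {
  if (keepChars.empty()) {
    return false;  // no exceptions configured: always drop
  }
  // find_first_of returns npos when none of keepChars occurs in word.
  return word.find_first_of(keepChars) != std::string::npos;
}
```

Passing "0123456789" as keepChars would mimic the hardcoded digit exception discussed in this thread, while a language that uses 0 as a letter could pass "123456789" or an empty string.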
Also, anybody else notice that this code has no effect?
if (isDigit == 1)
isDigit = 1;
else
isDigit == 0;
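
For anyone who wants to check this, here is a small sketch of why the branch does nothing, plus one guess at the intended test. QuotedIsDigit and ContainsDigit are made-up names, and the "intended" version is only my assumption about what the author meant.

```cpp
#include <cstddef>
#include <string>

// Reproduces the quoted snippet and returns whatever is left in isDigit.
// find_first_of yields a position (or std::string::npos), not a boolean.
std::size_t QuotedIsDigit(const std::string &s) {
  std::size_t isDigit = s.find_first_of("0123456789");
  if (isDigit == 1)
    isDigit = 1;           // assigns the value it already holds
  else
    (void)(isDigit == 0);  // the original reads `isDigit == 0;`: a comparison
                           // whose result is discarded (cast added only to
                           // silence the compiler warning)
  return isDigit;          // unchanged from the find_first_of result
}

// Assumed intent: true when the word contains any digit at all.
bool ContainsDigit(const std::string &s) {
  return s.find_first_of("0123456789") != std::string::npos;
}
```

So after the if/else, isDigit is still the raw position: 0 for a word starting with a digit, and npos (a large nonzero value) for a word with no digit at all.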
On 07/25/13 08:52, Hieu Hoang
Hi
I think what the OP wants is to be able to redefine the exceptions to
the 'drop unknown' strategy. At the moment they are hardcoded to be
0123456789. This seems quite reasonable, but what would be even better
is a way to plug in your own OOV handler, in case you want to add in
some custom
I think you asked this question before. I checked and was pretty sure it
works.
How exactly are you running Moses? Can you send me your config files and
any other info that you think might be useful to debug this issue?
On 23 July 2013 07:46, Li Xiang lixiang@gmail.com wrote:
At MERT stage,
I found the following code in moses/TranslationOptionCollection.cpp:
isDigit = s.find_first_of("0123456789");
if (isDigit == 1)
isDigit = 1;
else
isDigit == 0;
But nearly the same code segment appears in moses/ChartParser.cpp:
isDigit = s.find_first_of("0123456789");
if (isDigit ==
At the MERT stage, I turn on the -drop-unknown switch for the decoder moses_chart.
But some OOV words still appear in the output translation. I carefully checked
the source training data, but I did not find the OOV words there.
The source language is Latin Mongolian. Its character set additionally
includes 0 % _ -.