Ah, this would be a problem for you.
I don't know Latin Mongolian, so I don't know how to solve it. If you have
any suggestions or code, please let me know.
If you can share the data, that would be great. This would let other people
find out about this language pair.
On 25 July 2013 01:40, Xiang
I think he's reasonably asking that -drop-unknown should drop unknown
words even if they contain digits. Maybe this means another
command-line option.
Also, anybody else notice that this code has no effect?
    if (isDigit == 1)
        isDigit = 1;
    else
        isDigit == 0;
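
For anyone following along: the branch quoted above changes nothing because the `if` arm assigns the value `isDigit` already holds, and the `else` arm is a comparison (`==`) whose result is thrown away. As a rough sketch of what the check seems to be for, assuming the intent is "keep an unknown word if it contains a digit, otherwise drop it", something like the following would do. The names `containsDigit` and `shouldDropUnknown` are made up for illustration and are not taken from the decoder source.

```cpp
#include <string>

// Hypothetical helper (not from the decoder source): true if the token
// contains at least one of the characters 0123456789.
bool containsDigit(const std::string& token) {
    return token.find_first_of("0123456789") != std::string::npos;
}

// Hypothetical decision function: an unknown word is dropped only when
// dropping is enabled and the token contains no digit.
bool shouldDropUnknown(const std::string& token, bool dropUnknown) {
    return dropUnknown && !containsDigit(token);
}
```

With this sketch, `shouldDropUnknown("abc", true)` is true, while `shouldDropUnknown("a1", true)` is false because the digit protects the token.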
On 07/25/13 08:52, Hieu Hoang
Hi
I think what the OP wants is to be able to redefine the exceptions to
the 'drop unknown' strategy. At the moment they are hardcoded to be
0123456789. This seems quite reasonable, but what would be even better
is a way to plug in your own OOV handler, in case you want to add in
some custom
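
To make the "plug in your own OOV handler" idea concrete, here is a minimal sketch of one possible shape for it. Everything in it is an assumption for illustration: the `OovHandler` type and the `makeKeepIfContains` factory are hypothetical and do not exist in the decoder. The handler returns `true` and fills `out` when the unknown token should be kept, and returns `false` when it should be dropped.

```cpp
#include <functional>
#include <string>

// Hypothetical handler type: returns true and writes the output token to
// `out` if the unknown word is kept; returns false if it is dropped.
using OovHandler =
    std::function<bool(const std::string& token, std::string& out)>;

// Hypothetical factory: keeps an unknown token unchanged when it contains
// any character from `keepChars`, and drops it otherwise.
OovHandler makeKeepIfContains(const std::string& keepChars) {
    return [keepChars](const std::string& token, std::string& out) -> bool {
        if (token.find_first_of(keepChars) == std::string::npos) {
            return false;  // no protected character found: drop the token
        }
        out = token;       // pass the token through unchanged
        return true;
    };
}
```

Under these assumptions, `makeKeepIfContains("0123456789")` mimics the hardcoded digit exception, and `makeKeepIfContains("")` drops every unknown word, including those that contain digits.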
Thanks, will do
On Wed, Jul 24, 2013 at 3:46 PM, Hieu Hoang hieu.ho...@ed.ac.uk wrote:
Probably
Try it and let us know
On 23 July 2013 21:44, Andrew Vine av...@machinalis.com wrote:
Hi,
I would like to prune some phrase tables following the method described
here..
Hello,
I have a large phrase-based translation system. Alignment was done with
mgiza, and took a few weeks. I now have a small amount of extremely
relevant new bitext (~50,000 lines) that I would like to use to augment the
model, without having to retrain everything. The new data contains many