It depends on what the OP meant by OOV.  If it's phrase-table OOV then
-drop-unknown will work.  If it's language model OOV then it won't.

However, if the target language model(s) contain the target side of the
phrase table, then language model OOV implies phrase table OOV.
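
One rough way to check that condition is sketched below in Python. It assumes the LM is available as ARPA text and the phrase table uses the usual "source ||| target ||| scores" layout; the function names and the toy data are made up for illustration and are not part of Moses or KenLM.

```python
def lm_vocab_from_arpa(arpa_text):
    """Collect the words listed in the \\1-grams: section of ARPA text."""
    vocab = set()
    in_unigrams = False
    for line in arpa_text.splitlines():
        line = line.strip()
        if line == "\\1-grams:":
            in_unigrams = True
            continue
        if in_unigrams:
            if line.startswith("\\"):  # next section, e.g. \2-grams: or \end\
                break
            fields = line.split()
            if len(fields) >= 2:       # logprob, word, optional backoff
                vocab.add(fields[1])
    return vocab


def target_words(phrase_table_text):
    """Collect every token on the target side of the phrase table."""
    words = set()
    for line in phrase_table_text.splitlines():
        parts = line.split("|||")
        if len(parts) >= 2:
            words.update(parts[1].split())
    return words


def uncovered_target_words(arpa_text, phrase_table_text):
    """Target-side words that the LM does not know about."""
    return sorted(target_words(phrase_table_text) - lm_vocab_from_arpa(arpa_text))


# Toy data, invented for this example.
ARPA = r"""
\data\
ngram 1=4

\1-grams:
-1.0 <unk>
-0.5 the -0.3
-0.9 house -0.2
-0.8 in -0.2

\end\
"""

PHRASE_TABLE = """das haus ||| the house ||| 0.8 0.7 0.9 0.6
in ||| in ||| 0.9 0.9 0.9 0.9
klein ||| small ||| 0.5 0.5 0.5 0.5
"""

if __name__ == "__main__":
    print(uncovered_target_words(ARPA, PHRASE_TABLE))
```

If the list comes back empty, every word the phrase table can produce is known to the LM, so anything the LM sees as unknown must have been copied through from the source.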

Kenneth

On 01/15/2016 01:37 PM, Jie Jiang wrote:
> Hi Ergun:
> 
> The original request in Quang's post was:
> 
> */For instance, with the n-gram: "the <unk> house <unk> in", I would
> like the decoder to assign it the probability of the phrase: "the house
> in" (existing in the LM)./*
> 
> so each time there is a <unk> when calculating the LM score, you need to
> look another word further.
> 
> I believe that it cannot be achieved on current LM tools without
> modifying the source code, which has already been clarified by Kenneth.
> 
> 
> 2016-01-15 13:20 GMT+00:00 Ergun Bicici <ergun.bic...@dfki.de
> <mailto:ergun.bic...@dfki.de>>:
> 
> 
>     Dear Kenneth,
> 
>     In the Moses manual, -drop-unknown switch is mentioned:
> 
>     4.7.2 Handling Unknown Words
>     Unknown words are copied verbatim to the output. They are also
>     scored by the language
>     model, and may be placed out of order. Alternatively, you may want
>     to drop unknown words.
>     To do so add the switch -drop-unknown.
> 
>     Alternatively, you can write a script that replaces all OOV tokens
>     with some OOV-token-identifier such as <unk> before sending for
>     translation.
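
A minimal sketch of such a preprocessing step in Python. It assumes the known vocabulary has already been collected (for example from the source side of the phrase table) and that the input is whitespace-tokenised; the name mark_oov is hypothetical, not a Moses script.

```python
def mark_oov(sentence, vocab, unk="<unk>"):
    """Replace every token that is not in vocab with the unk marker."""
    return " ".join(tok if tok in vocab else unk for tok in sentence.split())


if __name__ == "__main__":
    known = {"the", "house", "in"}  # toy vocabulary, invented for the example
    print(mark_oov("the blue house xyz in", known))
```

Note that this only changes what the decoder receives; how the LM then scores the marker is a separate question.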
> 
> 
>     /Best Regards,/
>     Ergun
> 
>     Ergun Biçici
>     DFKI Projektbüro Berlin
> 
> 
>     On Fri, Jan 15, 2016 at 12:22 AM, Kenneth Heafield
>     <mo...@kheafield.com <mailto:mo...@kheafield.com>> wrote:
> 
>         Hi,
> 
>         I think oov-feature=1 just activates the OOV count feature while
>         leaving LM score unchanged.  So it would still include
>         p(<unk> | in).
> 
>         One might try setting the OOV feature weight to -weight_LM *
>         weird_moses_internal_constant * log p(<unk>) in an attempt to
>         cancel out the log p(<unk>) terms.  However that won't work
>         either because:
> 
>         1) It will still charge backoff penalties, b(the)b(house) in
>         the example.
> 
>         2) The context will be lost each time so it's p(house) not
>         p(house | the).
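
The two points above can be illustrated with a toy bigram backoff model in Python. This is only a sketch under stated assumptions: the log10 probabilities and backoff weights are invented, the model is a bigram rather than whatever order the real LM has, and nothing here reflects Moses or KenLM internals.

```python
UNK = "<unk>"

# word -> (log10 p(word), log10 backoff weight b(word)); numbers are invented.
UNIGRAMS = {
    UNK: (-2.0, 0.0),
    "the": (-1.0, -0.5),
    "house": (-1.5, -0.4),
    "in": (-1.2, -0.3),
}
# (previous word, word) -> log10 p(word | previous word)
BIGRAMS = {("the", "house"): -0.3, ("house", "in"): -0.6}


def score(tokens):
    """Standard backoff scoring; unknown tokens are mapped to <unk>."""
    total = 0.0
    prev = None
    for tok in tokens:
        word = tok if tok in UNIGRAMS else UNK
        if prev is None:
            total += UNIGRAMS[word][0]
        elif (prev, word) in BIGRAMS:
            total += BIGRAMS[(prev, word)]
        else:
            # Back off: charge b(prev), then use the unigram probability.
            total += UNIGRAMS[prev][1] + UNIGRAMS[word][0]
        prev = word
    return total


def score_skipping_unk(tokens):
    """What the original request asks for: score as if <unk> were absent."""
    return score([t for t in tokens if t in UNIGRAMS and t != UNK])


if __name__ == "__main__":
    sent = ["the", UNK, "house", UNK, "in"]
    with_unk = score(sent)
    wanted = score_skipping_unk(sent)
    # Try to cancel the two log p(<unk>) terms via an OOV feature weight.
    cancelled = with_unk - 2 * UNIGRAMS[UNK][0]
    print(round(with_unk, 3), round(cancelled, 3), round(wanted, 3))
```

Even after the log p(<unk>) terms are cancelled, the score still differs from the wanted one by the backoff weights b(the) and b(house) plus the gap between the unigram and bigram probabilities of "house" and "in".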
> 
>         If the <unk>s follow a pattern, such as appearing every other
>         word, one could insert them into the ARPA file, though that
>         would waste memory.
> 
>         I don't think there's any way to accomplish exactly what OP
>         asked for without coding (though it wouldn't be that hard once
>         one understands how the LM infrastructure works).
> 
>         Kenneth
> 
>         On 01/14/2016 11:07 PM, Philipp Koehn wrote:
>         > Hi,
>         >
>         > You may get the behavior you want by adding
>         >   "oov-feature=1"
>         > to your LM specification line in moses.ini
>         > and also add a second weight with value "0" to the corresponding LM
>         > weight setting.
>         >
>         > This will then only use the scores
>         > p(the|<s>)
>         > p(house|<s>,the,<unk>) ---> backoff to p(house)
>         > p(in|<s>,the,<unk>,house,<unk>) ---> backoff to p(in)
>         >
>         > -phi
>         >
>         > On Thu, Jan 14, 2016 at 8:25 AM, LUONG NGOC Quang
>         > <quangngoclu...@gmail.com <mailto:quangngoclu...@gmail.com>
>         <mailto:quangngoclu...@gmail.com
>         <mailto:quangngoclu...@gmail.com>>> wrote:
>         >
>         >     Dear All,
>         >
>         >     I am currently using a SRILM Language Model (LM) in my Moses
>         >     decoder. Does anyone know how I can ask the decoder, at
>         >     decoding time, to skip all out-of-vocabulary words when
>         >     computing the LM score (instead of doing back-off)?
>         >
>         >     For instance, with the n-gram: "the <unk> house <unk> in", I
>         >     would like the decoder to assign it the probability of the
>         >     phrase: "the house in" (existing in the LM).
>         >
>         >     Do I need more options/declarations in the moses.ini file?
>         >
>         >     Any help is very much appreciated,
>         >
>         >     Best,
>         >     Quang
>         >
>         >
>         >
>         >     _______________________________________________
>         >     Moses-support mailing list
>         >     Moses-support@mit.edu <mailto:Moses-support@mit.edu>
>         <mailto:Moses-support@mit.edu <mailto:Moses-support@mit.edu>>
>         >     http://mailman.mit.edu/mailman/listinfo/moses-support
>         >
>         >
>         >
>         >
> 
> 
> 
> 
> 
> 
> 
> -- 
> 
> Best regards!
> 
> Jie Jiang
> 
> 
> 
> 