Dear Jie,

There may be some option from SRILM:

- http://www.speech.sri.com/pipermail/srilm-user/2013q2/001509.html
- http://www.speech.sri.com/projects/srilm/manpages/ngram.1.html:
  *-skipoovs* Instruct the LM to skip over contexts that contain
  out-of-vocabulary words, instead of using a backoff strategy in these cases.

if it is not there maybe for a reason...

Bing appears fast to index this thread:
http://comments.gmane.org/gmane.comp.nlp.moses.user/14570

*Best Regards,*
Ergun

Ergun Biçici
DFKI Projektbüro Berlin

On Fri, Jan 15, 2016 at 2:37 PM, Jie Jiang <mail.jie.ji...@gmail.com> wrote:

> Hi Ergun:
>
> The original request in Quang's post was:
>
> *For instance, with the n-gram: "the <unk> house <unk> in", I would like
> the decoder to assign it the probability of the phrase: "the house in"
> (existing in the LM).*
>
> so each time there is a <unk> when calculating the LM score, you need to
> look another word further.
>
> I believe that it cannot be achieved on current LM tools without modifying
> the source code, which has already been clarified by Kenneth.
>
>
> 2016-01-15 13:20 GMT+00:00 Ergun Bicici <ergun.bic...@dfki.de>:
>
>> Dear Kenneth,
>>
>> In the Moses manual, -drop-unknown switch is mentioned:
>>
>> 4.7.2
>> Handling Unknown Words
>> Unknown words are copied verbatim to the output. They are also scored by
>> the language model, and may be placed out of order. Alternatively, you
>> may want to drop unknown words.
>> To do so add the switch -drop-unknown.
>>
>> Alternatively, you can write a script that replaces all OOV tokens with
>> some OOV-token-identifier such as <unk> before sending for translation.
>>
>>
>> *Best Regards,*
>> Ergun
>>
>> Ergun Biçici
>> DFKI Projektbüro Berlin
>>
>>
>> On Fri, Jan 15, 2016 at 12:22 AM, Kenneth Heafield <mo...@kheafield.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I think oov-feature=1 just activates the OOV count feature while
>>> leaving LM score unchanged. So it would still include p(<unk> | in).
>>>
>>> One might try setting the OOV feature weight to -weight_LM *
>>> weird_moses_internal_constant * log p(<unk>) in an attempt to cancel out
>>> the log p(<unk>) terms. However that won't work either because:
>>>
>>> 1) It will still charge backoff penalties, b(the)b(house) in the example.
>>>
>>> 2) The context will be lost each time so it's p(house) not p(house |
>>> the).
>>>
>>> If the <unk>s follow a pattern, such as appearing every other word, one
>>> could insert them into the ARPA file though that would waste memory.
>>>
>>> I don't think there's any way to accomplish exactly what OP asked for
>>> without coding (though it wouldn't be that hard once one understands how
>>> the LM infrastructure works).
>>>
>>> Kenneth
>>>
>>> On 01/14/2016 11:07 PM, Philipp Koehn wrote:
>>> > Hi,
>>> >
>>> > You may get the behavior you want by adding
>>> > "oov-feature=1"
>>> > to your LM specification line in moses.ini
>>> > and also add a second weight with value "0" to the corresponding LM
>>> > weight setting.
>>> >
>>> > This will then only use the scores
>>> > p(the|<s>)
>>> > p(house|<s>,the,<unk>) ---> backoff to p(house)
>>> > p(in|<s>,the,<unk>,house,<unk>) ---> backoff to p(in)
>>> >
>>> > -phi
>>> >
>>> > On Thu, Jan 14, 2016 at 8:25 AM, LUONG NGOC Quang
>>> > <quangngoclu...@gmail.com> wrote:
>>> >
>>> >     Dear All,
>>> >
>>> >     I am currently using a SRILM Language Model (LM) in my Moses
>>> >     decoder. Does anyone know how can I ask the decoder, at the
>>> >     decoding time, skip all out-of-vocabulary words when computing
>>> >     the LM score (instead of doing back-off)?
>>> >
>>> >     For instance, with the n-gram: "the <unk> house <unk> in", I would
>>> >     like the decoder to assign it the probability of the phrase: "the
>>> >     house in" (existing in the LM).
>>> >
>>> >     Do I need more options/declarations in moses.ini file?
>>> >
>>> >     Any help is very much appreciated,
>>> >
>>> >     Best,
>>> >     Quang
>>> >
>>> >     _______________________________________________
>>> >     Moses-support mailing list
>>> >     Moses-support@mit.edu
>>> >     http://mailman.mit.edu/mailman/listinfo/moses-support
>
> --
>
> Best regards!
>
> Jie Jiang

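To make the difference discussed in this thread concrete, here is a minimal Python sketch on a toy bigram model. It is not Moses, SRILM, or KenLM code: the log10 probabilities, the backoff weights, and the helper names (map_oov, score_backoff, score_skip_oov) are all made up for illustration. It first maps OOV tokens to <unk>, then compares ordinary backoff scoring with the behavior Quang asked for, where <unk> tokens are dropped before scoring so that "house" is still conditioned on "the".

```python
# Toy ARPA-style bigram model: word -> (log10 prob, log10 backoff weight).
# Every number here is invented for illustration only.
UNIGRAMS = {
    "<unk>": (-2.0, 0.0),
    "the": (-1.0, -0.5),
    "house": (-1.5, -0.4),
    "in": (-1.2, -0.3),
}
# (previous word, word) -> log10 prob of word given previous word.
BIGRAMS = {
    ("the", "house"): -0.3,
    ("house", "in"): -0.4,
}


def map_oov(tokens):
    """Replace every token outside the toy vocabulary with <unk>."""
    return [t if t in UNIGRAMS else "<unk>" for t in tokens]


def bigram_logprob(prev, word):
    """Score one word, backing off to the unigram if the bigram is missing."""
    if prev is None:
        return UNIGRAMS[word][0]
    if (prev, word) in BIGRAMS:
        return BIGRAMS[(prev, word)]
    # Backoff: pay b(prev), then use the unigram p(word); context is lost.
    return UNIGRAMS[prev][1] + UNIGRAMS[word][0]


def score_backoff(tokens):
    """Standard scoring: <unk> is scored and also breaks the context."""
    total, prev = 0.0, None
    for word in tokens:
        total += bigram_logprob(prev, word)
        prev = word
    return total


def score_skip_oov(tokens):
    """Requested behavior: drop <unk> first, so context carries across it."""
    return score_backoff([t for t in tokens if t != "<unk>"])


if __name__ == "__main__":
    toks = map_oov("the foo house bar in".split())
    print(toks)
    print("backoff :", score_backoff(toks))
    print("skip OOV:", score_skip_oov(toks))
```

With these toy numbers the backoff score comes out at roughly -8.6 and the skip-OOV score at roughly -1.7. The gap comes from the p(<unk>) terms, the backoff penalties b(the) and b(house), and the lost context (p(house) instead of p(house | the)), which are the effects Kenneth lists. Doing the same inside the decoder would still need changes to the LM code, as noted in the thread.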