This can be done, but it tends not to save much space. It also does not
help with OOVs, which the language model can still score even though
they are not in the parallel set.
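
For what it's worth, the filtering itself is simple: keep only the
n-grams whose words all occur in the target-side vocabulary, and leave
the surviving log-probs and back-off weights untouched, so in-vocabulary
strings score exactly as before and nothing is renormalized. SRILM's
ngram tool can do roughly this at load time (-vocab together with
-limit-vocab). Below is a rough standalone sketch, not tested code; the
file names are placeholders and it assumes the usual tab-separated ARPA
layout:

#!/usr/bin/env python3
# Rough sketch: restrict an ARPA LM to a fixed vocabulary without
# renormalizing.  Surviving n-grams keep their original log-probs and
# back-off weights, so in-vocabulary strings score exactly as before.
import re
import sys

def filter_arpa(arpa_in, vocab, arpa_out):
    sections = {}   # n-gram order -> surviving lines
    order = None
    with open(arpa_in, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            m = re.match(r"\\(\d+)-grams:", line)
            if m:                                  # e.g. "\2-grams:"
                order = int(m.group(1))
                sections[order] = []
                continue
            if not line or line.startswith("\\"):  # \data\, \end\, blanks
                order = None
                continue
            if order is None:                      # "ngram 1=..." header lines
                continue
            # entry line: logprob <TAB> w1 w2 ... wn [<TAB> backoff]
            words = line.split("\t")[1].split(" ")
            if all(w in vocab for w in words):
                sections[order].append(line)

    with open(arpa_out, "w", encoding="utf-8") as out:
        out.write("\\data\\\n")
        for n in sorted(sections):
            out.write("ngram %d=%d\n" % (n, len(sections[n])))
        for n in sorted(sections):
            out.write("\n\\%d-grams:\n" % n)
            for entry in sections[n]:
                out.write(entry + "\n")
        out.write("\n\\end\\\n")

if __name__ == "__main__":
    # usage: filter_arpa.py full.arpa vocab.txt filtered.arpa
    keep = set(open(sys.argv[2], encoding="utf-8").read().split())
    keep.update(["<s>", "</s>", "<unk>"])   # always keep the special tokens
    filter_arpa(sys.argv[1], keep, sys.argv[3])

Note that the back-off structure stays consistent: every lower-order
n-gram a kept entry backs off to consists entirely of in-vocabulary
words, so it is kept too. The filtered model no longer sums to one, but
that does not affect the relative scores the decoder sees.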

If you are worried about saving space, then you should look at either
KenLM or RandLM.
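
(With KenLM, for example, converting the ARPA file to the binary trie
format with quantization cuts memory substantially; the invocation is
roughly

  bin/build_binary -q 8 -b 8 trie model.arpa model.binary

but check build_binary's usage output for the exact flags in your
version.)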

Miles

On 24 November 2011 12:58, Thomas Schoenemann
<thomas_schoenem...@yahoo.de> wrote:
> Dear all,
>  I hope that this is not too stupid a question, and that it hasn't been
> asked recently.
> In the Moses EMS, when running experiments, the phrase table is
> automatically reduced to only those phrases that actually occur in the
> respective dev/test set. Obviously this saves a lot of memory without
> changing the resulting translations.
>
> Now, I was wondering if something similar can be done/is done with the
> language model. That is, can one reduce the ARPA file to only those words
> that occur on the target side of the (filtered) phrase table? The objective
> would of course be to maintain the translation result. Would the LM software
> renormalize internally if some of the original entries are removed? In that
> case the results would differ.
> This may even depend on which toolkit you use to load (rather than train)
> the ARPA file. I am using SRILM in my own translation programs, but I would
> also be interested in other toolkits in case they behave more suitably.
>
> Can anyone point me to anything?
> Many thanks!
>   Thomas Schoenemann (currently University of Pisa)
>




