Hey Marwa,
We have been running into this problem with NPLM ourselves and have not found a
"real" solution. There have been a couple of threads about it on the mailing
list already. Basically, the workaround we use is to lower the learning rate
(from 1 to 0.5; if 0.5 doesn't work, to 0.25, and so on) and to increase the
number of training epochs to compensate for the slower learning. Alternatively,
you may try the experimental gradient clipping code that Ashish implemented.
Here's a quote from his email (a small sketch of what the clipping does follows
the quote):
>
> You should be able to download the version of the nplm where the updates
> (gradient*learning_rate) are clipped between +5 and -5
> http://www.isi.edu/~avaswani/nplm_clipped.tar.gz
> If you want to change the magnitude of the update, please change it inside
> struct Clipper{
>   double operator() (double x) const {
>     return std::min(5., std::max(x,-5.));
>     //return(x);
>   }
> };
>
> in neuralClasses.h
> Right now, the clipping has been implemented only for standard SGD
> training, and not for adagrad or adadelta.
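For reference, below is a minimal, self-contained sketch (not the actual nplm
code) of where such a clipper sits in a plain SGD update: the raw step
(learning_rate * gradient) is clamped to [-5, 5] before it is applied, so one
exploding gradient can't push a weight off to inf or NaN. The sgd_step helper
and its weight/gradient vectors are made up for illustration; in nplm itself
the clipping lives in the Clipper functor in neuralClasses.h quoted above.

#include <algorithm>
#include <cstddef>
#include <vector>

struct Clipper {
  double operator()(double x) const {
    // Clamp the update to [-5, 5]; change the constants to change
    // the maximum allowed step size.
    return std::min(5., std::max(x, -5.));
  }
};

// Hypothetical helper showing where the clipping is applied in SGD.
void sgd_step(std::vector<double>& weights,
              const std::vector<double>& gradients,
              double learning_rate) {
  Clipper clip;
  for (std::size_t i = 0; i < weights.size(); ++i) {
    weights[i] -= clip(learning_rate * gradients[i]);
  }
}

The same idea applies if you lower the learning rate instead: a smaller step
per update, traded off against more epochs of training.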


Cheers,

Nick

On Tue, Apr 21, 2015 at 6:17 AM, Marwa Refaie <basmal...@hotmail.com> wrote:

> Hi all,
>
> When I train BilingualLM with a large corpus, it gives the 10 models.nplm
> files with small numbers and then a lot of lines of nan nan nan nan nan nan.
> It works perfectly with a smaller corpus. Any suggestions, please?
>
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
