Re: [Moses-support] Bilingual neural lm, log-likelihood: -nan

2015-09-21 Thread Nikolay Bogoychev
Hey Jian,

I have encountered this problem with nplm myself and couldn't really find a
solution that works every time.

Basically what happens is that some token occurs very frequently in the
same position, so its weights become huge and eventually turn into NaN,
which then propagates through the rest of the model. This usually happens
with the beginning-of-sentence token, especially if your source and target
context sizes are big. One thing you could do is decrease the source and
target context sizes (doesn't always work). Another thing you could do is
to lower the learning rate (always works, but you might need to set it
quite low, e.g. 0.25).
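As a concrete example, the training command quoted below would change only
in its --learning_rate value (0.25 here is just the suggested starting
point, not a tuned value):

```shell
/home/user/tools/nplm-master-rsennrich/src/trainNeuralNetwork \
  --train_file work_dir/blm/train.numberized --num_epochs 30 \
  --model_prefix work_dir/blm/train.10k.model.nplm \
  --learning_rate 0.25 --minibatch_size 1000 \
  --num_noise_samples 100 --num_hidden 2 \
  --input_embedding_dimension 512 --output_embedding_dimension 192 \
  --num_threads 6 --loss_function log --activation_function tanh \
  --validation_file work_dir/blm/valid.numberized \
  --validation_minibatch_size 10
```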

The proper solution, according to Ashish Vaswani, the creator of nplm, is
to use gradient clipping, which is commented out in his code. You should
contact him, because this is an nplm issue.
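The idea behind gradient clipping (this is just a minimal numpy sketch of
the technique, not nplm's actual implementation) is to rescale a gradient
whenever its norm exceeds a threshold, so a very frequent token can never
blow its weights up to NaN:

```python
import numpy as np

def clip_gradient(grad, max_norm=5.0):
    """Rescale grad so its L2 norm is at most max_norm, keeping its direction."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

# A huge gradient (the kind a very frequent token produces) gets rescaled,
# while a normal-sized gradient passes through untouched.
big = np.array([300.0, 400.0])       # norm 500
clipped = clip_gradient(big)         # rescaled to norm 5
small = clip_gradient(np.array([0.3, 0.4]))  # unchanged
```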

Cheers,

Nick

On Sat, Sep 19, 2015 at 8:58 PM, jian zhang  wrote:

> Hi all,
>
> I got
>
> Epoch 
> Current learning rate: 1
> Training minibatches: Validation log-likelihood: -nan
>perplexity: nan
>
> during bilingual neural lm training.
>
> I use command:
> /home/user/tools/nplm-master-rsennrich/src/trainNeuralNetwork --train_file
> work_dir/blm/train.numberized --num_epochs 30 --model_prefix
> work_dir/blm/train.10k.model.nplm --learning_rate 1 --minibatch_size 1000
> --num_noise_samples 100 --num_hidden 2 --input_embedding_dimension 512
> --output_embedding_dimension 192 --num_threads 6 --loss_function log
> --activation_function tanh --validation_file work_dir/blm/valid.numberized
> --validation_minibatch_size 10
>
> where train.numberized and valid.numberized files are split from the
> file generated by
> script ${moses}/scripts/training/bilingual-lm/extract_training.py.
>
> Training/Validation numbers are:
> Number of training instances: 4128195
> Number of validation instances: 217274
>
>
> Thanks,
>
> Jian
>
>
> Jian Zhang
> Centre for Next Generation Localisation (CNGL)
> 
> Dublin City University 
>
> ___
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>


Re: [Moses-support] Bilingual neural lm, log-likelihood: -nan

2015-09-21 Thread Barry Haddow

Hi Jian

You could also try using dropout. Adding something like

--dropout 0.8 --input_dropout 0.9 --null_index 1

to nplm training can help; look at your vocabulary file to see what the
null index should be set to. This works with the Moses version of nplm.
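To illustrate the null-index lookup (purely hypothetical vocabulary
contents, and an assumed one-token-per-line layout with 0-based positions;
check your own file before setting the flag):

```python
# Hypothetical vocabulary; in the real file there is one token per line,
# and your numberized vocab's order will differ.
vocab = ["<s>", "<null>", "</s>", "<unk>", "the", "cat"]

# --null_index should point at the position of the null token,
# here matching the "--null_index 1" in the example flags above.
null_index = vocab.index("<null>")
print(null_index)
```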


cheers - Barry

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


Re: [Moses-support] Bilingual neural lm, log-likelihood: -nan

2015-09-21 Thread Rico Sennrich
Hi all,

Small correction: --dropout isn't on Github (yet). I never got gains from
it, and thus didn't commit it. I'll have to double-check my implementation.

--input_dropout also didn't give me any gains, but it could make training
more stable (helping against nan), and it is helpful if you want to get
probabilities with incomplete context (say, if you have a 5-gram nplm and
want to score a bigram; this is common in hiero decoding).
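A sketch of what "incomplete context" means here (the token name and
padding scheme are illustrative, not nplm's exact interface): to score a
bigram with a 5-gram model, the missing history positions are filled with
the null token, which is why the model needs to know the null index:

```python
def pad_context(history, order=5, null_token="<null>"):
    """Left-pad a too-short n-gram history with null tokens."""
    needed = (order - 1) - len(history)
    return [null_token] * max(0, needed) + list(history)

# Scoring the bigram "the cat": only one history word is available,
# so the remaining three context slots of the 5-gram become <null>,
# and "cat" is then predicted from that padded context.
context = pad_context(["the"])
print(context)
```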

Best wishes,
Rico

Sent from my Hitchhiker's guide

 Original Message 
From: Barry Haddow <bhad...@staffmail.ed.ac.uk>
Sent: Mon, 21 Sep 2015 08:58:16 +0100
To: Nikolay Bogoychev <nhe...@gmail.com>, jian zhang <zha...@computing.dcu.ie>
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] Bilingual neural lm, log-likelihood: -nan



[Moses-support] Bilingual neural lm, log-likelihood: -nan

2015-09-19 Thread jian zhang
Hi all,

I got

Epoch 
Current learning rate: 1
Training minibatches: Validation log-likelihood: -nan
   perplexity: nan

during bilingual neural lm training.

I use command:
/home/user/tools/nplm-master-rsennrich/src/trainNeuralNetwork --train_file
work_dir/blm/train.numberized --num_epochs 30 --model_prefix
work_dir/blm/train.10k.model.nplm --learning_rate 1 --minibatch_size 1000
--num_noise_samples 100 --num_hidden 2 --input_embedding_dimension 512
--output_embedding_dimension 192 --num_threads 6 --loss_function log
--activation_function tanh --validation_file work_dir/blm/valid.numberized
--validation_minibatch_size 10

where train.numberized and valid.numberized files are split from the
file generated by
script ${moses}/scripts/training/bilingual-lm/extract_training.py.

Training/Validation numbers are:
Number of training instances: 4128195
Number of validation instances: 217274


Thanks,

Jian


Jian Zhang
Centre for Next Generation Localisation (CNGL)

Dublin City University 