stephenrawls commented on issue #15278: fixing var-seq-len rnn backward() 
operator
URL: https://github.com/apache/incubator-mxnet/pull/15278#issuecomment-503788708
 
 
   @roywei @szha 
   
   Okay, I think the PR is good now.
   
   The problem was indeed what I speculated before: the cuDNN backward pass was 
producing the correct gradient, and the reference net's gradient was "close but 
not close enough" to it. The discrepancy came from the reference net, not from 
the operator.
   
   I changed the reference net to be a dead simple LSTM that processes each 
batch element one at a time, so that each time the LSTM can size itself to the 
length of the current input. For the backward pass I set the gradient parameter 
to accumulate, so the gradients sum over the per-element passes and I can 
compare them against the LSTM that uses sequence_length with a batch.
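
   To illustrate the comparison, here is a rough pure-Python sketch. It is not 
the test code from the PR: a scalar tanh RNN stands in for the LSTM, and the 
names `rnn_grads`, `reference_grads`, and `batched_grads` are made up for this 
example. The reference path runs one sequence at a time at its true length and 
adds up the gradients (in MXNet, `grad_req='add'` is one way to get that 
accumulation; whether the test does exactly that is an assumption). The batched 
path runs a padded batch and skips the steps past each sequence's length.

```python
import math


def rnn_grads(seq, w, u):
    """Scalar tanh RNN, h_t = tanh(w*x_t + u*h_{t-1}), loss = final h.

    Returns (loss, dL/dw, dL/du) computed with backprop through time.
    """
    hs = [0.0]
    for x in seq:
        hs.append(math.tanh(w * x + u * hs[-1]))
    dw = du = 0.0
    dh = 1.0  # dL/dh at the last real step
    for t in range(len(seq), 0, -1):
        da = dh * (1.0 - hs[t] ** 2)  # back through tanh
        dw += da * seq[t - 1]
        du += da * hs[t - 1]
        dh = da * u  # gradient flowing to h_{t-1}
    return hs[-1], dw, du


def reference_grads(batch, w, u):
    """Reference: one batch element at a time, gradients accumulated."""
    total_dw = total_du = 0.0
    for seq in batch:
        _, dw, du = rnn_grads(seq, w, u)
        total_dw += dw
        total_du += du
    return total_dw, total_du


def batched_grads(padded, lengths, w, u):
    """Padded batch plus per-sequence lengths; padded steps are masked."""
    n, steps = len(padded), len(padded[0])
    hs = [[0.0] * n]
    for t in range(steps):
        prev = hs[-1]
        hs.append([
            math.tanh(w * padded[i][t] + u * prev[i]) if t < lengths[i]
            else prev[i]  # past the end: carry the state through unchanged
            for i in range(n)
        ])
    dw = du = 0.0
    dh = [1.0] * n
    for t in range(steps, 0, -1):
        for i in range(n):
            if t > lengths[i]:
                continue  # padded step contributes no gradient
            da = dh[i] * (1.0 - hs[t][i] ** 2)
            dw += da * padded[i][t - 1]
            du += da * hs[t - 1][i]
            dh[i] = da * u
    return dw, du


batch = [[0.5, -1.0, 0.25], [1.0], [0.3, 0.7]]
lengths = [len(s) for s in batch]
max_len = max(lengths)
padded = [s + [0.0] * (max_len - len(s)) for s in batch]

ref = reference_grads(batch, 0.8, -0.4)
bat = batched_grads(padded, lengths, 0.8, -0.4)
print(all(math.isclose(r, b, rel_tol=1e-9) for r, b in zip(ref, bat)))
```

   The two paths sum the same terms in a different order, so the comparison 
uses a tolerance rather than exact equality.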
   
   The unit test now covers the backward pass, and it passes.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
