stephenrawls commented on issue #15278: fixing var-seq-len rnn backward() operator
URL: https://github.com/apache/incubator-mxnet/pull/15278#issuecomment-503788708

@roywei @szha Okay, I think the PR is good now.

The problem was indeed what I speculated before: the cuDNN backward pass was producing the correct gradient, the reference net was "close but not close enough" to it, and the discrepancy came from the reference net.

I changed the reference net to a dead-simple LSTM that processes each batch element one at a time, so that each time the LSTM can size itself appropriately to the current input. For the backward pass I set the gradient parameter to accumulate, so that the summed gradients can be compared against those of the LSTM that uses sequence_length on the whole batch.

The unit test now covers the backward pass, and it passes.
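For readers who want to see the shape of the comparison described above, here is a toy sketch in plain Python. It is not MXNet code and not the unit test from the PR: the scalar tanh RNN, the sum-of-hidden-states loss, and the function names `rnn_grads`, `reference_grads`, and `batched_masked_grads` are all assumptions made for illustration. It only shows the idea of accumulating per-sample gradients from trimmed sequences and comparing them with the gradients of a padded batch that honours per-sample sequence lengths.

```python
import math


def rnn_grads(xs, w, u):
    """Scalar RNN h_t = tanh(w*x_t + u*h_{t-1}) with loss = sum of all h_t.

    Returns (loss, dL/dw, dL/du), computed by backpropagation through time.
    """
    hs = [0.0]  # hs[0] is the zero initial state
    for x in xs:
        hs.append(math.tanh(w * x + u * hs[-1]))
    loss = sum(hs[1:])
    dw = du = 0.0
    dh_next = 0.0  # gradient arriving at h_t from step t+1
    for t in range(len(xs), 0, -1):
        dpre = (1.0 + dh_next) * (1.0 - hs[t] ** 2)
        dw += dpre * xs[t - 1]
        du += dpre * hs[t - 1]
        dh_next = dpre * u
    return loss, dw, du


def reference_grads(batch, lengths, w, u):
    """Reference: one sample at a time, trimmed to its length, gradients accumulated."""
    dw = du = 0.0
    for seq, n in zip(batch, lengths):
        _, gw, gu = rnn_grads(seq[:n], w, u)
        dw += gw
        du += gu
    return dw, du


def batched_masked_grads(batch, lengths, w, u):
    """Padded batch processed step by step; padded steps are masked out."""
    B, T = len(batch), len(batch[0])
    hs = [[0.0] * B]  # hs[t][b] is the state of sample b after t steps
    for t in range(T):
        row = []
        for b in range(B):
            if t < lengths[b]:
                row.append(math.tanh(w * batch[b][t] + u * hs[t][b]))
            else:
                row.append(hs[t][b])  # padded step: carry the state unchanged
        hs.append(row)
    dw = du = 0.0
    dh_next = [0.0] * B
    for t in range(T - 1, -1, -1):
        for b in range(B):
            if t >= lengths[b]:
                continue  # padded step contributes no gradient
            dpre = (1.0 + dh_next[b]) * (1.0 - hs[t + 1][b] ** 2)
            dw += dpre * batch[b][t]
            du += dpre * hs[t][b]
            dh_next[b] = dpre * u
    return dw, du


# Hypothetical data: three sequences padded to length 3, valid lengths 3, 2, 1.
batch = [[0.5, -1.0, 0.25], [1.0, 0.3, 9.0], [-0.7, 9.0, 9.0]]
lengths = [3, 2, 1]
w, u = 0.8, -0.4

ref = reference_grads(batch, lengths, w, u)
bat = batched_masked_grads(batch, lengths, w, u)
print("reference:", ref)
print("batched:  ", bat)
```

The padding value 9.0 is deliberately non-zero, so any leak of padded steps into the gradient would show up as a mismatch between the two results.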
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services