zhreshold commented on issue #16708: Training an FPN model using grad_req="add" 
 causes rapid divergence, while manually implemented gradient accumulation 
works fine
URL: 
https://github.com/apache/incubator-mxnet/issues/16708#issuecomment-552634447
 
 
   At first glance it looked related to the implementation in the contrib 
operator. However, when I dug into it, it turned out to be more complicated 
than I thought. I am still investigating.

