sxjscience commented on issue #16708: Training an FPN model using
grad_req="add" causes rapid divergence, while manually implemented gradient
accumulation works fine
URL:
https://github.com/apache/incubator-mxnet/issues/16708#issuecomment-552586348
@samskalicky Actually, @zhreshold is in…
sxjscience commented on issue #16708: Training an FPN model using
grad_req="add" causes rapid divergence, while manually implemented gradient
accumulation works fine
URL:
https://github.com/apache/incubator-mxnet/issues/16708#issuecomment-550101790
I've confirmed that this issue does exist…
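
(For context, a minimal sketch of the `grad_req="add"` accumulation mode named in the issue title, assuming a toy Gluon `Dense` block in place of the reporter's FPN; the optimizer, learning rate, tensor shapes, and the `accumulate_steps` window are all illustrative, not taken from the issue.)

```python
import mxnet as mx
from mxnet import autograd, gluon

# Toy stand-in for the FPN model from the issue (assumption: any
# Gluon block is configured the same way).
net = gluon.nn.Dense(10)
net.initialize()

# Ask MXNet to sum gradients across backward passes instead of
# overwriting them on each call.
net.collect_params().setattr('grad_req', 'add')

trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.01})
loss_fn = gluon.loss.L2Loss()
accumulate_steps = 4  # illustrative accumulation window

for step in range(8):
    data = mx.nd.random.uniform(shape=(8, 16))
    label = mx.nd.random.uniform(shape=(8, 10))
    with autograd.record():
        loss = loss_fn(net(data), label)
    loss.backward()  # grads are summed into .grad because grad_req='add'
    if (step + 1) % accumulate_steps == 0:
        # Normalize by the effective batch size, then clear the buffers
        # by hand: in 'add' mode MXNet never zeroes them for you.
        trainer.step(accumulate_steps * data.shape[0])
        net.collect_params().zero_grad()
```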
sxjscience commented on issue #16708: Training an FPN model using
grad_req="add" causes rapid divergence, while manually implemented gradient
accumulation works fine
URL:
https://github.com/apache/incubator-mxnet/issues/16708#issuecomment-549207390
Thanks for reporting this! It looks tha…
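
(And, for contrast, a sketch of one way to implement the "manual" gradient accumulation the title says works fine: leave `grad_req` at its default `'write'`, sum each step's gradients into your own buffers, and copy the sum back into `.grad` before the update. The model and hyperparameters are again placeholders, not the reporter's code.)

```python
import mxnet as mx
from mxnet import autograd, gluon

net = gluon.nn.Dense(10)  # toy stand-in for the FPN model
net.initialize()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.01})
loss_fn = gluon.loss.L2Loss()

params = [p for p in net.collect_params().values() if p.grad_req != 'null']
grad_buffers = None  # our own accumulators, separate from the params' .grad arrays
accumulate_steps = 4

for step in range(8):
    data = mx.nd.random.uniform(shape=(8, 16))
    label = mx.nd.random.uniform(shape=(8, 10))
    with autograd.record():
        loss = loss_fn(net(data), label)
    loss.backward()  # default grad_req='write': .grad holds only this step's gradient
    if grad_buffers is None:
        grad_buffers = [p.grad().copy() for p in params]
    else:
        for buf, p in zip(grad_buffers, params):
            buf += p.grad()
    if (step + 1) % accumulate_steps == 0:
        # Copy the accumulated sum back into .grad and take one optimizer step.
        for buf, p in zip(grad_buffers, params):
            p.grad()[:] = buf
        trainer.step(accumulate_steps * data.shape[0])
        grad_buffers = None  # start a fresh accumulation window
```

In principle the two loops compute the same update; the report is that in practice the `grad_req="add"` version diverges while the buffered version trains normally.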