[GitHub] [incubator-mxnet] sxjscience commented on issue #16708: Training an FPN model using grad_req="add" causes rapid divergence, while manually implemented gradient accumulation works fine

2019-11-11 Thread GitBox
sxjscience commented on issue #16708: Training an FPN model using grad_req="add" causes rapid divergence, while manually implemented gradient accumulation works fine URL: https://github.com/apache/incubator-mxnet/issues/16708#issuecomment-552586348 @samskalicky Actually, @zhreshold is in…

[GitHub] [incubator-mxnet] sxjscience commented on issue #16708: Training an FPN model using grad_req="add" causes rapid divergence, while manually implemented gradient accumulation works fine

2019-11-05 Thread GitBox
sxjscience commented on issue #16708: Training an FPN model using grad_req="add" causes rapid divergence, while manually implemented gradient accumulation works fine URL: https://github.com/apache/incubator-mxnet/issues/16708#issuecomment-550101790 I've confirmed that this issue does exist…

[GitHub] [incubator-mxnet] sxjscience commented on issue #16708: Training an FPN model using grad_req="add" causes rapid divergence, while manually implemented gradient accumulation works fine

2019-11-03 Thread GitBox
sxjscience commented on issue #16708: Training an FPN model using grad_req="add" causes rapid divergence, while manually implemented gradient accumulation works fine URL: https://github.com/apache/incubator-mxnet/issues/16708#issuecomment-549207390 Thanks for reporting this! It looks that…
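
For context, the issue title contrasts MXNet's built-in gradient accumulation (setting a parameter's grad_req to "add") with summing gradients by hand. Below is a minimal Gluon sketch of both patterns; the tiny Dense layer, synthetic data, micro-batch count, and learning rate are illustrative assumptions, not the reporter's FPN setup.

# Sketch of the two gradient-accumulation strategies contrasted in the
# issue title. The network, data, and hyperparameters below are
# illustrative assumptions, not the reporter's FPN configuration.
import mxnet as mx
from mxnet import autograd, gluon

net = gluon.nn.Dense(1, in_units=4)
net.initialize()
loss_fn = gluon.loss.L2Loss()
trainer = gluon.Trainer(net.collect_params(), 'sgd',
                        {'learning_rate': 0.01})

x = mx.nd.random.normal(shape=(16, 4))
y = mx.nd.random.normal(shape=(16, 1))
accumulate = 4  # micro-batches per optimizer step (assumed value)

# Strategy 1: built-in accumulation with grad_req='add'. Successive
# backward() calls sum into the same gradient buffers, which must be
# zeroed manually after the update. This is the path the issue
# reports as diverging.
for p in net.collect_params().values():
    p.grad_req = 'add'
for i in range(accumulate):
    with autograd.record():
        loss = loss_fn(net(x[i::accumulate]), y[i::accumulate])
    loss.backward()
trainer.step(x.shape[0])
for p in net.collect_params().values():
    p.zero_grad()

# Strategy 2: manual accumulation with the default grad_req='write'.
# Each micro-batch's gradients are copied into separate buffers,
# summed by hand, and written back before the update. This is the
# workaround the issue reports as working.
params = list(net.collect_params().values())
for p in params:
    p.grad_req = 'write'
acc = [mx.nd.zeros_like(p.grad()) for p in params]
for i in range(accumulate):
    with autograd.record():
        loss = loss_fn(net(x[i::accumulate]), y[i::accumulate])
    loss.backward()  # overwrites p.grad() on every call
    for buf, p in zip(acc, params):
        buf += p.grad()
for buf, p in zip(acc, params):
    p.grad()[:] = buf
trainer.step(x.shape[0])

Mathematically the two strategies should produce identical updates, which is why the reported divergence under grad_req="add" points at the in-place accumulation path rather than at the training recipe.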