Hi Naveen,
The problem that you see with loss is due to the fact that the model clips the
gradient, which in the case of AMP is scaled by the loss scale. In order for it
to work you need to apply the same loss scale to the value you are using to
clip the gradients. This is currently possible in
Just realized I did not actually link to the issue I mentioned, it is
https://github.com/apache/incubator-mxnet/issues/17507
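
To illustrate the point about applying the loss scale to the clip value, here
is a minimal sketch in plain Python. It is not MXNet code: clip_global_norm
below is a hypothetical stand-in written for this example, and the loss scale
of 1024 and the gradient values are assumed purely for illustration.

```python
import math

def global_norm(grads):
    # L2 norm taken over all gradient entries of all parameters.
    return math.sqrt(sum(g * g for arr in grads for g in arr))

def clip_global_norm(grads, max_norm):
    # Hypothetical helper: rescale all gradients so that their global
    # norm does not exceed max_norm. Returns new lists.
    norm = global_norm(grads)
    if norm <= max_norm:
        return [list(arr) for arr in grads]
    ratio = max_norm / norm
    return [[g * ratio for g in arr] for arr in grads]

def unscale(grads, loss_scale):
    # Undo the AMP loss scaling before the optimizer update.
    return [[g / loss_scale for g in arr] for arr in grads]

loss_scale = 1024.0                 # assumed value, for illustration only
max_norm = 1.0                      # clip threshold chosen by the user
true_grads = [[3.0, 4.0], [12.0]]   # unscaled gradients, global norm 13

# With loss scaling, the gradients seen at clipping time are scaled.
scaled_grads = [[g * loss_scale for g in arr] for arr in true_grads]

# What clipping would give without any loss scaling.
reference = clip_global_norm(true_grads, max_norm)

# Clipping scaled gradients with the unscaled threshold shrinks them
# far too much once the loss scale is divided out again.
wrong = unscale(clip_global_norm(scaled_grads, max_norm), loss_scale)

# Scaling the threshold by the same loss scale recovers the reference.
right = unscale(clip_global_norm(scaled_grads, max_norm * loss_scale),
                loss_scale)

print(global_norm(reference), global_norm(wrong), global_norm(right))
```

In this sketch the "wrong" gradients end up smaller than intended by a factor
equal to the loss scale, which would be consistent with training making
little progress, while the "right" ones match clipping without AMP.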
On 2020/05/01 18:19:27, Przemysław Trędak wrote:
> Hi Naveen,
>
> The problem that you see with loss is due to the fact that the model clips
> the gradient, which in
Thanks Przemek, appreciate your input. Let me apply the scale changes to
the gradient clips and run the experiment again.
On Fri, May 1, 2020 at 11:20 AM Przemysław Trędak
wrote:
> Just realized I did not actually link to the issue I mentioned, it is
> https://github.com/apache/incubator-mxnet/i