slyforce opened a new issue #10397: grad_req in multi-task example 
URL: https://github.com/apache/incubator-mxnet/issues/10397
 
 
   I'm referring to the Python script in 
example/multi-task/example_multi_task.py 
   
   As far as I understand, the gradients from the two loss functions need to 
be accumulated, since the backward pass propagates gradients from two 
different heads into a single shared symbol (in the example, f3 receives 
gradients from both sm1 and sm2). However, the example uses the default 
grad_req = 'write', which would imply that the gradient written by one 
softmax's backward step is overwritten by the other's. 
   
   Why is 'write' sufficient in this case? In which cases of gradient 
accumulation should 'add' be used?
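   
   For concreteness, here is a minimal sketch of the setup I mean (the names 
f3, sm1 and sm2 follow the example; the shapes, num_hidden and labels are 
invented for illustration):

```python
import mxnet as mx
import numpy as np

# Shared trunk feeding two softmax heads, as in the multi-task example.
data = mx.sym.Variable('data')
f3 = mx.sym.FullyConnected(data=data, num_hidden=10, name='f3')
sm1 = mx.sym.SoftmaxOutput(data=f3, name='sm1')
sm2 = mx.sym.SoftmaxOutput(data=f3, name='sm2')
net = mx.sym.Group([sm1, sm2])

# simple_bind uses the default grad_req='write' for every argument.
exe = net.simple_bind(ctx=mx.cpu(), data=(4, 8))
exe.arg_dict['data'][:] = np.random.normal(size=(4, 8))
exe.arg_dict['sm1_label'][:] = np.array([0, 1, 2, 3])
exe.arg_dict['sm2_label'][:] = np.array([3, 2, 1, 0])

exe.forward(is_train=True)
exe.backward()  # a single backward call over the grouped symbol

# My question: after this single backward(), does grad_dict['f3_weight']
# hold the sum of both heads' gradients, or only the last one written?
print(exe.grad_dict['f3_weight'])
```

   My understanding is that with grad_req = 'add' the gradient buffers would 
also have to be zeroed manually between parameter updates, which the example 
does not do, so I assume 'write' is intentional here.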
   
   
