Re: soft relu gradient, is it correct?

2019-01-06 Thread Matthias Seeger
Hi Pedro, these are just helper functions; you need to check the operator. In this case, the function is the derivative expressed as a function of the *output*, which is cheaper to compute:

  y = log(1 + exp(x))  =>  dy/dx = 1/(1 + exp(-x)) = 1 - exp(-y)

If you check all sorts of other ops, the same is the ...
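
(A minimal NumPy sketch, not part of the original message, verifying the identity above: for y = log(1 + exp(x)), the input-side gradient 1/(1 + exp(-x)) equals 1 - exp(-y), so the gradient can be computed from the stored output alone.)

    import numpy as np

    x = np.linspace(-5.0, 5.0, 11)
    y = np.log1p(np.exp(x))                      # forward: softplus(x) = log(1 + exp(x))
    grad_from_input = 1.0 / (1.0 + np.exp(-x))   # logistic function of the input x
    grad_from_output = 1.0 - np.exp(-y)          # same gradient, written in terms of the output y
    assert np.allclose(grad_from_input, grad_from_output)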

soft relu gradient, is it correct?

2018-11-20 Thread Pedro Larroy
I bumped into the definition of the softrelu gradient: https://github.com/apache/incubator-mxnet/blob/master/src/operator/mshadow_op.h#L170 which is defined as 1 - exp(-x). Since we define the forward of the softrelu as the softplus function, shouldn't the gradient be the logistic function? Is my ...
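
(A small numerical illustration of the apparent mismatch behind the question, not part of the original message; the value x = 2.0 is just an example. If 1 - exp(-x) is read as a function of the *input*, it does not match the logistic function; as the reply above explains, the helper is applied to the stored output instead.)

    import numpy as np

    x = 2.0
    logistic = 1.0 / (1.0 + np.exp(-x))   # expected d/dx of softplus(x): ~0.8808
    candidate = 1.0 - np.exp(-x)          # the mshadow_op.h formula applied to the input: ~0.8647
    print(logistic, candidate)            # the two differ, hence the question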