I don't know, but you can use theano.printing.debugprint([cost, grads...])
to see the gradient graph. Maybe it will help you understand what is
going on.
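
For example, here is a minimal, self-contained toy sketch (the names u, W,
step and cost below are made up for illustration, not taken from your
script) of what debugprint shows for a scan graph and its gradient:

import numpy as np
import theano
import theano.tensor as T

# Toy scan with one sequence and one non-sequence, just to illustrate
# what debugprint prints for the cost and gradient graphs.
u = T.matrix('u')  # sequence, shape (timesteps, 3)
W = theano.shared(np.eye(3, dtype=theano.config.floatX), name='W')  # non-sequence

def step(u_t, x_tm1, W):
    return T.tanh(T.dot(x_tm1, W) + u_t)

x0 = T.zeros((3,), dtype=theano.config.floatX)
x, _ = theano.scan(fn=step, sequences=u, outputs_info=[x0],
                   non_sequences=[W])

cost = x[-1].sum()
g = T.grad(cost, W)

# Print the symbolic graphs of the cost and of its gradient w.r.t. W.
theano.printing.debugprint([cost, g])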

Don't forget m and n are non-sequences. This means the dot will be lifted
out of the loop by Theano. Only the extra addition will be done at each
iteration.
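
In other words, the optimizer effectively computes T.dot(m, n) once outside
the loop, roughly like this hand-hoisted version (a sketch that reuses the
names from your script; it assumes alpha, brec, Nin, f_hidden, x0_, u, Wrec,
m and n are defined as in your code):

mn = T.dot(m, n)  # does not depend on the loop variables, computed once

def rnn(u_t, x_tm1, r_tm1, Wrec, mn):
    # Only the addition Wrec + mn is done at each iteration.
    x_t = (1 - alpha)*x_tm1 + alpha*(T.dot(r_tm1, Wrec + mn)
                                     + brec + u_t[:, Nin:])
    r_t = f_hidden(x_t)
    return x_t, r_t

[x, r], _ = theano.scan(fn=rnn,
                        outputs_info=[x0_, f_hidden(x0_)],
                        sequences=u,
                        non_sequences=[Wrec, mn])

So the only per-iteration work your change adds is the Wrec + mn addition.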

Fred

On Wed, Jun 28, 2017 at 19:12, Mohamed Akrout <mohammed.akr...@gmail.com>
wrote:

> Hi all,
>
> I am running a neuroscience experiment with a recurrent neural network
> model in Theano:
>
>
>
> def rnn(u_t, x_tm1, r_tm1, Wrec):
>     x_t = (1 - alpha)*x_tm1 + alpha*(T.dot(r_tm1, Wrec) + brec + u_t[:, Nin:])
>     r_t = f_hidden(x_t)
>     return x_t, r_t
>
>
> Then I define the scan function to iterate over the time steps:
>
> [x, r], _ = theano.scan(fn=rnn,
>                         outputs_info=[x0_, f_hidden(x0_)],
>                         sequences=u,
>                         non_sequences=[Wrec])
>
> Wrec and brec are learnt by stochastic gradient descent:
> g = T.grad(cost, [Wrec, brec]),
>
> where cost is the cost function T.sum(f_loss(z, target[:,:,:Nout])), with
> z = f_output(T.dot(r, Wout_.T) + bout).
>
> Until now, everything works well.
>
>
>
> Now I want to add two new vectors, let's call them u and v, so that the
> initial rnn function becomes:
>
>
> def rnn(u_t, x_tm1, r_tm1, Wrec, u, v):
>     x_t = (1 - alpha)*x_tm1 + alpha*(T.dot(r_tm1, Wrec + T.dot(u, v))
>                                      + brec + u_t[:, Nin:])
>     r_t = f_hidden(x_t)
>     return x_t, r_t
>
> [x, r], _ = theano.scan(fn=rnn,
>                         outputs_info=[x0_, f_hidden(x0_)],
>                         sequences=u,
>                         non_sequences=[Wrec, m, n])
>
> m and n are the variables corresponding to u and v in the main function.
>
> And suddenly, the gradients T.grad(cost, m) and T.grad(cost, n) are zero.
>
> I have been blocked on this problem for 2 weeks now. I verified that the
> values are not integers by using dtype=theano.config.floatX everywhere in
> the definition of the variables.
>
> As you can see, the link between the cost and m (or n) is: the cost
> function depends on z, z depends on r, and r is one of the outputs of the
> rnn function that uses m and n in its equation.
>
> Do you have any idea why this does not work?
>
> Any idea is welcome. I hope I can resolve this problem soon.
> Thank you!
>
