Yes, I changed the values of m and n by initializing them with different distributions and with random values.
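For concreteness, a minimal sketch of one way to initialize them away from zero; the shared-variable wrappers, the N x 1 and 1 x N shapes, and the uniform range are assumptions for illustration, not details taken from this thread:

import numpy as np
import theano

rng = np.random.RandomState(0)
N = 100  # hypothetical hidden size

# Drawing from [0.1, 1.0] keeps every entry strictly away from zero,
# which is what Fred asks to check further down in the thread.
m = theano.shared(rng.uniform(0.1, 1.0, (N, 1)).astype(theano.config.floatX), name='m')
n = theano.shared(rng.uniform(0.1, 1.0, (1, N)).astype(theano.config.floatX), name='n')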
I changed the "+" to theano.tensor.sum:

x_t = ( (1 - alpha)*x_tm1 + alpha*(T.dot(r_tm1, T.sum(Wrec, T.dot(u, v))) + brec + u_t[:,Nin:]) )

but this does not work either and gives the following error:

TypeError: TensorType does not support iteration. Maybe you are using builtin.sum instead of theano.tensor.sum? (Maybe .max?)

I never thought that passing one T.dot as an argument to another T.dot could be problematic. For now I am still blocked; if I find the solution I will post it here :(

Med
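A note on that error: the second positional argument of T.sum is axis, so T.sum(Wrec, T.dot(u, v)) makes Theano try to iterate over a tensor as if it were a list of axes, which is what the TypeError is complaining about. The "+" Fred points at below is the addition operator itself; on tensors it already builds an elementwise add node, so no call to T.sum is needed. A sketch of the step function with the addition kept (alpha, brec, Nin and f_hidden come from the enclosing script, as in the original code quoted below):

import theano.tensor as T

def rnn(u_t, x_tm1, r_tm1, Wrec, u, v):
    # '+' on two tensors builds an elementwise add; T.sum reduces a
    # single tensor along axes and cannot add two tensors together.
    x_t = ((1 - alpha) * x_tm1
           + alpha * (T.dot(r_tm1, Wrec + T.dot(u, v)) + brec + u_t[:, Nin:]))
    r_t = f_hidden(x_t)
    return x_t, r_t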
On Thursday, June 29, 2017 at 9:24:01 AM UTC-4, nouiz wrote:
>
> You can also add names to your intermediate variables. theano.grad() will
> use them to create names for the grad nodes. This will help you understand
> what is going on. Maybe the debugprint parameter stop_on_name=True could
> also help make that graph more readable.
>
> On Thu, Jun 29, 2017 at 9:22 AM Frédéric Bastien <frederic...@gmail.com> wrote:
>
>> The + of the + T.dot(u, v).
>>
>> The debugprint command I gave you will help separate the forward
>> computation from the grad computation.
>>
>> The grad of a dot is another dot. So what would explain a zero output
>> would be too many zeros, or only zeros, in the inputs. Can you vary the
>> values of m and n? Make sure there are no zeros in them.
>>
>> On Thu, Jun 29, 2017 at 9:05 AM Mohamed Akrout <mohamme...@gmail.com> wrote:
>>
>>> Yes, I printed the gradient function of m, but it is extremely big and I
>>> find it unreadable (file attached). I don't know how this tree will help
>>> me find the problem. There are Alloc and second nodes, but I don't know
>>> how to change and/or control them.
>>>
>>> When you say "Only the extra addition will be done at each iteration",
>>> about which extra addition are you talking?
>>>
>>> Thank you Fred.
>>>
>>> Med
>>>
>>> Regarding your notice: if m and n are non-sequences, Theano will not
>>> update them?
>>>
>>> On Thursday, June 29, 2017 at 8:34:32 AM UTC-4, nouiz wrote:
>>>
>>>> I don't know, but you can use theano.printing.debugprint([cost,
>>>> grads...]) to see the gradient function. Maybe it will help you
>>>> understand what is going on.
>>>>
>>>> Don't forget that m and n are non-sequences. This means the dot will be
>>>> lifted out of the loop by Theano. Only the extra addition will be done
>>>> at each iteration.
>>>>
>>>> Fred
>>>>
>>>> On Wed, Jun 28, 2017 at 7:12 PM, Mohamed Akrout <mohamme...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I am running a neuroscience experiment with a recurrent neural
>>>>> network model in Theano:
>>>>>
>>>>> def rnn(u_t, x_tm1, r_tm1, Wrec):
>>>>>     x_t = ( (1 - alpha)*x_tm1 + alpha*(T.dot(r_tm1, Wrec) + brec + u_t[:,Nin:]) )
>>>>>     r_t = f_hidden(x_t)
>>>>>     return x_t, r_t
>>>>>
>>>>> Then I define the scan call to iterate over the time steps:
>>>>>
>>>>> [x, r], _ = theano.scan(fn=rnn,
>>>>>                         outputs_info=[x0_, f_hidden(x0_)],
>>>>>                         sequences=u,
>>>>>                         non_sequences=[Wrec])
>>>>>
>>>>> Wrec and brec are learnt by stochastic gradient descent:
>>>>> g = T.grad(cost, [Wrec, brec]), where cost is the cost function
>>>>> T.sum(f_loss(z, target[:,:,:Nout])) with z = f_output(T.dot(r, Wout_.T) + bout).
>>>>>
>>>>> Up to this point, everything works well.
>>>>>
>>>>> Now I want to add two new vectors, let's call them u and v, so that
>>>>> the initial rnn function becomes:
>>>>>
>>>>> def rnn(u_t, x_tm1, r_tm1, Wrec, u, v):
>>>>>     x_t = ( (1 - alpha)*x_tm1 + alpha*(T.dot(r_tm1, Wrec + T.dot(u, v)) + brec + u_t[:,Nin:]) )
>>>>>     r_t = f_hidden(x_t)
>>>>>     return x_t, r_t
>>>>>
>>>>> [x, r], _ = theano.scan(fn=rnn,
>>>>>                         outputs_info=[x0_, f_hidden(x0_)],
>>>>>                         sequences=u,
>>>>>                         non_sequences=[Wrec, m, n])
>>>>>
>>>>> m and n are the variables corresponding to u and v in the main
>>>>> function.
>>>>>
>>>>> Suddenly, the gradients T.grad(cost, m) and T.grad(cost, n) are zero.
>>>>>
>>>>> I have been blocked on this problem for 2 weeks now. I verified that
>>>>> the values are not integers by using dtype=theano.config.floatX
>>>>> everywhere in the definition of the variables.
>>>>>
>>>>> As you can see, the link between the cost and m (or n) is: the cost
>>>>> function depends on z, z depends on r, and r is one of the outputs of
>>>>> the rnn function that uses m and n in the equation.
>>>>>
>>>>> Do you have any idea why this does not work?
>>>>>
>>>>> Any idea is welcome; I hope I can unblock this problem soon.
>>>>> Thank you!
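To make the debugging suggestions above concrete, here is a self-contained sketch that names the intermediate variables and prints the gradient graph with theano.printing.debugprint, as Fred suggests. All sizes, the tanh nonlinearity, the placeholder cost, and the N x 1 / 1 x N shapes for m and n (chosen so that T.dot(m, n) has the same shape as Wrec) are my assumptions, not the thread's actual values:

import numpy as np
import theano
import theano.tensor as T

floatX = theano.config.floatX
rng = np.random.RandomState(0)
N, Nin, B = 50, 10, 16     # hypothetical hidden size, input size, batch size
alpha = 0.1                # hypothetical leak rate

u = T.tensor3('u')         # time x batch x (Nin + N), to match u_t[:, Nin:]
x0_ = theano.shared(np.zeros((B, N), dtype=floatX), name='x0')
Wrec = theano.shared((rng.randn(N, N) / np.sqrt(N)).astype(floatX), name='Wrec')
brec = theano.shared(np.zeros(N, dtype=floatX), name='brec')
m = theano.shared(rng.uniform(0.1, 1.0, (N, 1)).astype(floatX), name='m')
n = theano.shared(rng.uniform(0.1, 1.0, (1, N)).astype(floatX), name='n')

f_hidden = T.tanh          # assumed nonlinearity

def rnn(u_t, x_tm1, r_tm1, Wrec, m_, n_):
    # Naming the intermediate lets debugprint(..., stop_on_name=True)
    # collapse the subgraph below this node into a readable label.
    W_eff = Wrec + T.dot(m_, n_)   # (N, N) rank-one perturbation of Wrec
    W_eff.name = 'W_eff'
    x_t = (1 - alpha) * x_tm1 + alpha * (T.dot(r_tm1, W_eff) + brec + u_t[:, Nin:])
    r_t = f_hidden(x_t)
    return x_t, r_t

[x, r], _ = theano.scan(fn=rnn,
                        outputs_info=[x0_, f_hidden(x0_)],
                        sequences=u,
                        non_sequences=[Wrec, m, n])

cost = T.sum(r ** 2)       # placeholder cost; the real one uses z and target
gm, gn = T.grad(cost, [m, n])
theano.printing.debugprint([cost, gm, gn], stop_on_name=True)

Evaluating gm and gn numerically (through a compiled theano.function) should give nonzero values when the chain cost -> r -> W_eff -> m, n is intact; if the full script still prints zeros, comparing its debugprint output against this sketch's may show where the two graphs diverge.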