Yes, I changed the values of m and n by initializing them with different distributions and with random values.
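For concreteness, a minimal sketch of one way to initialize them away from zero; the shared-variable wrappers, the N x 1 and 1 x N shapes, and the uniform range are assumptions for illustration, not details taken from this thread:

import numpy as np
import theano

rng = np.random.RandomState(0)
N = 100  # hypothetical hidden size

# Drawing from [0.1, 1.0] keeps every entry strictly away from zero,
# which is what Fred asks to check further down in the thread.
m = theano.shared(rng.uniform(0.1, 1.0, (N, 1)).astype(theano.config.floatX), name='m')
n = theano.shared(rng.uniform(0.1, 1.0, (1, N)).astype(theano.config.floatX), name='n')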
I changed the "+" to theano.tensor.sum:

x_t = ( (1 - alpha)*x_tm1 + alpha*(T.dot(r_tm1, T.sum(Wrec, T.dot(u, v))) + brec + u_t[:,Nin:]) )

but this does not work either and gives the following error:

TypeError: TensorType does not support iteration. Maybe you are using builtin.sum instead of theano.tensor.sum? (Maybe .max?)

I never thought that passing one T.dot as an argument to another T.dot could be problematic. For now I am still blocked; if I find the solution I will post it here :(

Med
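A note on that error: the second positional argument of T.sum is axis, so T.sum(Wrec, T.dot(u, v)) makes Theano try to iterate over a tensor as if it were a list of axes, which is what the TypeError is complaining about. The "+" Fred points at below is the addition operator itself; on tensors it already builds an elementwise add node, so no call to T.sum is needed. A sketch of the step function with the addition kept (alpha, brec, Nin and f_hidden come from the enclosing script, as in the original code quoted below):

import theano.tensor as T

def rnn(u_t, x_tm1, r_tm1, Wrec, u, v):
    # '+' on two tensors builds an elementwise add; T.sum reduces a
    # single tensor along axes and cannot add two tensors together.
    x_t = ((1 - alpha) * x_tm1
           + alpha * (T.dot(r_tm1, Wrec + T.dot(u, v)) + brec + u_t[:, Nin:]))
    r_t = f_hidden(x_t)
    return x_t, r_t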
On Thursday, June 29, 2017 at 9:24:01 AM UTC-4, nouiz wrote:
>
> You can also add names to your intermediate variables. theano.grad() will
> use them to create names for the grad nodes. This will help you understand
> what is going on. Maybe the debugprint parameter stop_on_name=True could
> also help make that graph more readable.
>
> On Thu, Jun 29, 2017 at 9:22 AM Frédéric Bastien <frederic...@gmail.com> wrote:
>
>> The + of the + T.dot(u, v).
>>
>> The debugprint command I gave you will help separate the forward
>> computation from the grad computation.
>>
>> The grad of a dot is another dot. So what would explain a zero output
>> would be too many zeros, or only zeros, in the inputs. Can you vary the
>> values of m and n? Make sure there are no zeros in them.
>>
>> On Thu, Jun 29, 2017 at 9:05 AM Mohamed Akrout <mohamme...@gmail.com> wrote:
>>
>>> Yes, I printed the gradient function of m, but it is extremely big and I
>>> find it unreadable (file attached). I don't know how this tree will help
>>> me find the problem. There are Alloc and second nodes, but I don't know
>>> how to change and/or control them.
>>>
>>> When you say "Only the extra addition will be done at each iteration",
>>> about which extra addition are you talking?
>>>
>>> Thank you Fred.
>>>
>>> Med
>>>
>>> Regarding your notice: if m and n are non-sequences, Theano will not
>>> update them?
>>>
>>> On Thursday, June 29, 2017 at 8:34:32 AM UTC-4, nouiz wrote:
>>>
>>>> I don't know, but you can use theano.printing.debugprint([cost,
>>>> grads...]) to see the gradient function. Maybe it will help you
>>>> understand what is going on.
>>>>
>>>> Don't forget that m and n are non-sequences. This means the dot will be
>>>> lifted out of the loop by Theano. Only the extra addition will be done
>>>> at each iteration.
>>>>
>>>> Fred
>>>>
>>>> On Wed, Jun 28, 2017 at 7:12 PM, Mohamed Akrout <mohamme...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I am running a neuroscience experiment with a recurrent neural
>>>>> network model in Theano:
>>>>>
>>>>> def rnn(u_t, x_tm1, r_tm1, Wrec):
>>>>>     x_t = ( (1 - alpha)*x_tm1 + alpha*(T.dot(r_tm1, Wrec) + brec + u_t[:,Nin:]) )
>>>>>     r_t = f_hidden(x_t)
>>>>>     return x_t, r_t
>>>>>
>>>>> Then I define the scan call to iterate over the time steps:
>>>>>
>>>>> [x, r], _ = theano.scan(fn=rnn,
>>>>>                         outputs_info=[x0_, f_hidden(x0_)],
>>>>>                         sequences=u,
>>>>>                         non_sequences=[Wrec])
>>>>>
>>>>> Wrec and brec are learnt by stochastic gradient descent:
>>>>> g = T.grad(cost, [Wrec, brec]), where cost is the cost function
>>>>> T.sum(f_loss(z, target[:,:,:Nout])) with z = f_output(T.dot(r, Wout_.T) + bout).
>>>>>
>>>>> Up to this point, everything works well.
>>>>>
>>>>> Now I want to add two new vectors, let's call them u and v, so that
>>>>> the initial rnn function becomes:
>>>>>
>>>>> def rnn(u_t, x_tm1, r_tm1, Wrec, u, v):
>>>>>     x_t = ( (1 - alpha)*x_tm1 + alpha*(T.dot(r_tm1, Wrec + T.dot(u, v)) + brec + u_t[:,Nin:]) )
>>>>>     r_t = f_hidden(x_t)
>>>>>     return x_t, r_t
>>>>>
>>>>> [x, r], _ = theano.scan(fn=rnn,
>>>>>                         outputs_info=[x0_, f_hidden(x0_)],
>>>>>                         sequences=u,
>>>>>                         non_sequences=[Wrec, m, n])
>>>>>
>>>>> m and n are the variables corresponding to u and v in the main
>>>>> function.
>>>>>
>>>>> Suddenly, the gradients T.grad(cost, m) and T.grad(cost, n) are zero.
>>>>>
>>>>> I have been blocked on this problem for 2 weeks now. I verified that
>>>>> the values are not integers by using dtype=theano.config.floatX
>>>>> everywhere in the definition of the variables.
>>>>>
>>>>> As you can see, the link between the cost and m (or n) is: the cost
>>>>> function depends on z, z depends on r, and r is one of the outputs of
>>>>> the rnn function that uses m and n in the equation.
>>>>>
>>>>> Do you have any idea why this does not work?
>>>>>
>>>>> Any idea is welcome; I hope I can unblock this problem soon.
>>>>> Thank you!
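To make the debugging suggestions above concrete, here is a self-contained sketch that names the intermediate variables and prints the gradient graph with theano.printing.debugprint, as Fred suggests. All sizes, the tanh nonlinearity, the placeholder cost, and the N x 1 / 1 x N shapes for m and n (chosen so that T.dot(m, n) has the same shape as Wrec) are my assumptions, not the thread's actual values:

import numpy as np
import theano
import theano.tensor as T

floatX = theano.config.floatX
rng = np.random.RandomState(0)
N, Nin, B = 50, 10, 16     # hypothetical hidden size, input size, batch size
alpha = 0.1                # hypothetical leak rate

u = T.tensor3('u')         # time x batch x (Nin + N), to match u_t[:, Nin:]
x0_ = theano.shared(np.zeros((B, N), dtype=floatX), name='x0')
Wrec = theano.shared((rng.randn(N, N) / np.sqrt(N)).astype(floatX), name='Wrec')
brec = theano.shared(np.zeros(N, dtype=floatX), name='brec')
m = theano.shared(rng.uniform(0.1, 1.0, (N, 1)).astype(floatX), name='m')
n = theano.shared(rng.uniform(0.1, 1.0, (1, N)).astype(floatX), name='n')

f_hidden = T.tanh          # assumed nonlinearity

def rnn(u_t, x_tm1, r_tm1, Wrec, m_, n_):
    # Naming the intermediate lets debugprint(..., stop_on_name=True)
    # collapse the subgraph below this node into a readable label.
    W_eff = Wrec + T.dot(m_, n_)   # (N, N) rank-one perturbation of Wrec
    W_eff.name = 'W_eff'
    x_t = (1 - alpha) * x_tm1 + alpha * (T.dot(r_tm1, W_eff) + brec + u_t[:, Nin:])
    r_t = f_hidden(x_t)
    return x_t, r_t

[x, r], _ = theano.scan(fn=rnn,
                        outputs_info=[x0_, f_hidden(x0_)],
                        sequences=u,
                        non_sequences=[Wrec, m, n])

cost = T.sum(r ** 2)       # placeholder cost; the real one uses z and target
gm, gn = T.grad(cost, [m, n])
theano.printing.debugprint([cost, gm, gn], stop_on_name=True)

Evaluating gm and gn numerically (through a compiled theano.function) should give nonzero values when the chain cost -> r -> W_eff -> m, n is intact; if the full script still prints zeros, comparing its debugprint output against this sketch's may show where the two graphs diverge.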