Thank you very much.

The method you suggested is excellent: "You can bypass that by sampling all
the masks first, and pass them as a sequence."
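
In case it helps anyone reading this thread later, here is a minimal
self-contained sketch of that workaround, assembled from the code in this
thread (a sketch only, assuming float32 initialization as Pascal suggested
and 3 scan steps to match the original example):

from theano.tensor.shared_randomstreams import RandomStreams
from theano import function
import numpy as np
import theano.tensor as t
import theano as th

x = th.shared(np.array([1, 2, 3], dtype='float32'))
w = th.shared(np.array([5, 6, 7], dtype='float32'))

n_steps = 3
rng = RandomStreams()
# Draw one binomial mask per step up front, stacked along a new leading
# axis, so scan receives the masks as an ordinary sequence and the
# gradient never has to re-create random draws inside the loop.
masks = rng.binomial(size=[n_steps] + [x.shape[i] for i in range(x.ndim)])

def step(idx, mask):
    x_drop = mask * x  # apply this step's precomputed dropout mask
    out = t.dot(x_drop, w)
    return out

res, updates = th.scan(step, sequences=[t.arange(n_steps), masks])
w_grad = t.grad(res.sum(), w)  # well defined: no RandomState inside scan
fun = function([], [w_grad], updates=updates)
print(fun())

Keeping the random number generator outside of scan and passing the sampled
masks in as a sequence is what avoids the NullTypeGradError from the
original post.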

--------------
Qiang Cui



On Saturday, March 7, 2015 at 5:01:11 AM UTC+8, Pascal Lamblin wrote:
>
> Hi, 
>
> On Thu, Mar 05, 2015, Bitton Tenessi wrote: 
> > Is there a way to make it work? 
>
> This is a limitation we are aware of. The main problem is that scan is 
> currently not able to recreate the right random sample at each step when 
> computing the gradient. 
>
> You can bypass that by sampling all the masks first, and pass them as a 
> sequence. 
>
> Also note that since you initialize x and w with integers, the gradient 
> wrt w (w_grad) will be 0. If you initialize them with float32, you can do: 
>
> x = th.shared(np.array([1,2,3], dtype='float32')) 
> w = th.shared(np.array([5,6,7], dtype='float32')) 
>
> rng = RandomStreams() 
> masks = rng.binomial(size=[3] + [x.shape[i] for i in range(x.ndim)]) 
>
> def step(idx, mask): 
>     x_drop = mask * x 
>     out = t.dot(x_drop, w) 
>     return out 
>
> res, updates = th.scan(step, sequences=[t.arange(3), masks]) 
> w_grad = t.grad(res.sum(), w) 
> fun = function([], [w_grad], updates=updates) 
> print fun() 
>
> [array([ 1.,  2.,  3.])] 
>
>
>
> > 
> > from theano.tensor.shared_randomstreams import RandomStreams 
> > from theano import function 
> > import numpy as np 
> > import theano.tensor as t 
> > import theano as th 
> > from theano.printing import Print 
> > 
> > x = th.shared(np.array([1,2,3])) 
> > w = th.shared(np.array([5,6,7])) 
> > 
> > def step(idx): 
> >     rng = RandomStreams() 
> >     mask = rng.binomial(size=x.shape) 
> >     x_drop = mask * x 
> >     out = t.dot(x_drop, w) 
> >     return out 
> > res, updates = th.scan(step, sequences=t.arange(3)) 
> > w_grad = t.grad(res.sum(), w) 
> > fun = function([], [w_grad], updates=updates) 
> > print fun() 
> > 
> >     w_grad = t.grad(res.sum(), w) 
> >   File "C:\Python27\lib\site-packages\theano\gradient.py", line 543, in 
> grad 
> >     grad_dict, wrt, cost_name) 
> >   File "C:\Python27\lib\site-packages\theano\gradient.py", line 1273, in 
> > _populate_grad_dict 
> >     rval = [access_grad_cache(elem) for elem in wrt] 
> >   File "C:\Python27\lib\site-packages\theano\gradient.py", line 1243, in 
> > access_grad_cache 
> >     term.type.why_null) 
> > theano.gradient.NullTypeGradError: tensor.grad encountered a NaN. This 
> > variable is Null because the grad method for input 2 (<RandomStateType>) 
> of 
> > the for{cpu,scan_fn} op is mathematically undefined. Depends on a shared 
> > variable 
> > 
> > 
> > 
>
>
> -- 
> Pascal 
>
