Re: [theano-users] broadcast_batched_dot

2018-05-22 Thread Pascal Lamblin

Hi,

You can use batched_tensordot for that, but it assumes that the 
"batched" dimension is the first one, so you'd have to transpose A first 
so that the "5" is first, and then transpose the result back to get C.


So here, you'd do something like:

import numpy as np
import theano.tensor as T

A = T.tensor4('A', dtype='float64')  # shape = [2, 5, 7, 3]
B = T.tensor3('B', dtype='float64')  # shape = [5, 3, 6]

A_T = A.transpose(1, 0, 2, 3)  # shape = [5, 2, 7, 3]
# "axes" matches the "3" between A_T and B; result shape = [5, 2, 7, 6]
C_T = T.batched_tensordot(A_T, B, axes=[3, 1])
C = C_T.transpose(1, 0, 2, 3)  # shape = [2, 5, 7, 6]

It seems to work:
>>> C.eval({A: np.zeros((2, 5, 7, 3)), B: np.ones((5, 3, 6))}).shape
(2, 5, 7, 6)
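
To check the values and not just the shape, you can compare against the
loop-based NumPy reference from the original post; a quick sketch with
random inputs, continuing from the snippet above:

rng = np.random.RandomState(0)
A_val = rng.rand(2, 5, 7, 3)
B_val = rng.rand(5, 3, 6)

# Loop-based reference from the original post
C_ref = np.empty((2, 5, 7, 6))
for i in range(2):
    for j in range(5):
        C_ref[i, j] = np.dot(A_val[i, j], B_val[j])

print(np.allclose(C.eval({A: A_val, B: B_val}), C_ref))  # True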


On 2018-05-22 08:56 AM, luke wrote:

Hi all,


I want to achieve a "broadcast batched dot" operation in theano, such 
that the two arguments A and B with shapes


A.shape = [2,5,7,3]
B.shape = [5,3,6]


produce an output C of shape tensor4 [2,5,7,6], with a NumPy equivalent of:

for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        C[i, j, :, :] = np.dot(A[i, j, :, :], B[j, :, :])


So, basically, the last two dimensions of A and B are multiplied 
together with dot, dimension 1 of A and 0 of B are batched, and 
dimension 0 of A is broadcasted onto B.
I've played around a bit with T.batched_tensordot, but could not achieve 
this.


The only way I could make this work involves a scan over dimension 0 of 
A, and a T.batched_dot over the remaining 3 dimensions. But this is of 
course painfully slow.



Any ideas?


br,
Luke







--
Pascal Lamblin



[theano-users] Re: MILA and the future of Theano

2018-05-22 Thread 'Rabah Nory' via theano-users

>
> Dear MILA 

Thank you so much for the wonderful Theano DL framework.
In my opinion, it is the best DL framework I have worked with in my research.
I just wonder why you killed the lovely Theano.
I am sure that researchers around the world are very sad to see the
end of the charming Theano.
Frankly speaking, I hate TensorFlow and wish to see Theano alive again.
With my best regards



[theano-users] broadcast_batched_dot

2018-05-22 Thread luke
Hi all,


I want to achieve a "broadcast batched dot" operation in theano, such that 
the two arguments A and B with shapes

A.shape = [2,5,7,3]
B.shape = [5,3,6]


produce an output C of shape tensor4 [2,5,7,6], with a NumPy equivalent of:

for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        C[i, j, :, :] = np.dot(A[i, j, :, :], B[j, :, :])


So, basically, the last two dimensions of A and B are multiplied together 
with dot, dimension 1 of A and 0 of B are batched, and dimension 0 of A is 
broadcasted onto B.
I've played around a bit with T.batched_tensordot, but could not achieve 
this.
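
For reference, the loop above can also be written as a single NumPy einsum;
a minimal sketch with illustrative random inputs:

import numpy as np

A = np.random.rand(2, 5, 7, 3)
B = np.random.rand(5, 3, 6)

# j is the shared batched dimension, l is contracted by the dot,
# and i broadcasts A's leading dimension over B.
C = np.einsum('ijkl,jlm->ijkm', A, B)
print(C.shape)  # (2, 5, 7, 6)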

The only way I could make this work involves a scan over dimension 0 of A, 
and a T.batched_dot over the remaining 3 dimensions (a sketch follows 
below). But this is of course painfully slow.
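
That workaround might look something like the following; this is a
reconstruction from the description above, not the poster's actual code:

import theano
import theano.tensor as T

A = T.tensor4('A')  # shape = [2, 5, 7, 3]
B = T.tensor3('B')  # shape = [5, 3, 6]

# Scan over dimension 0 of A; each slice has shape [5, 7, 3], and
# batched_dot batches over the "5" dimension shared with B.
results, _ = theano.scan(
    fn=lambda a_slice, b: T.batched_dot(a_slice, b),
    sequences=[A],
    non_sequences=[B])
# results has shape [2, 5, 7, 6]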


Any ideas?


br,
Luke







[theano-users] Re: Issue with grad, scan, and intermediate values with changing sizes

2018-05-22 Thread norbert
Hi Robin,

I just came across the same issue. It looks very strange to me, because I 
was using this kind of code (a scan in which some intermediate values of 
the scan function have different sizes in different iterations) for a long 
time and everything was OK, but after making small modifications (putting 
the mentioned fragment into an ifelse statement) it suddenly stopped 
working... Have you figured out how to manage this issue, and could you 
suggest a solution?

Norbert

On Sunday, May 29, 2016 at 23:33:38 UTC+2, robi...@stanford.edu wrote:
>
> Hi,
>
> I'm getting an error computing gradients through a scan in which some 
> intermediate values of the scan function have different sizes in different 
> iterations (the inputs and outputs always have the same size).  Here's a 
> minimal example:
>
> import numpy as np
> import theano
> import theano.tensor as T
>
> d = 11
> h = 7
> W1 = theano.shared(name='W1', value=np.random.uniform(-0.1, 0.1, (d,h)))
> W2 = theano.shared(name='W2', value=np.random.uniform(-0.1, 0.1, (h,)))
>
> n = T.lscalar('n')
> vecs = T.matrix('vecs')
> inds = T.lmatrix('inds')
> def recurrence(t, vecs, inds, W1, W2):
>   cur_inds = inds[T.eq(inds[:,0], t).nonzero()]
>   cur_vecs = vecs[cur_inds[:,1]]
>   hidden_layers = T.tanh(cur_vecs.dot(W1))
>   scores = hidden_layers.dot(W2)
>   return T.sum(scores)
> results, _ = theano.scan(
> fn=recurrence, sequences=[T.arange(n)], outputs_info=[None],
> non_sequences=[vecs, inds, W1, W2], strict=True)
> obj = T.sum(results)
> grads = T.grad(obj, [W1, W2])
> f = theano.function(inputs=[n, vecs, inds], outputs=grads)
> vecs_in = np.ones((10, d))
> inds_in = np.array([[0, 0], [1, 1], [1, 2], [2, 3], [3, 4], [3, 5],
> [3, 6], [3, 7], [4, 8], [4, 9]])
> print f(5, vecs_in, inds_in)
>
>
> Running this code results in the following error message (tried on 0.7.0, 
> 0.8.2, and 0.9.0dev1.dev-0044349fdf4244c5b616994bf16ad2ff1ff8ce8a):
>
> Traceback (most recent call last):
>   File "edge_scores.py", line 33, in 
> print f(5, vecs_in, inds_in)
>   File 
> "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", 
> line 912, in __call__
> storage_map=getattr(self.fn, 'storage_map', None))
>   File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 
> 314, in raise_with_op
> reraise(exc_type, exc_value, exc_trace)
>   File 
> "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", 
> line 899, in __call__
> self.fn() if output_subset is None else\
>   File 
> "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", 
> line 951, in rval
> r = p(n, [x[0] for x in i], o)
>   File 
> "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", 
> line 940, in 
> self, node)
>   File "theano/scan_module/scan_perform.pyx", line 547, in 
> theano.scan_module.scan_perform.perform 
> (/home/robinjia/.theano/compiledir_Linux-3.13--generic-x86_64-with-Ubuntu-14.04-trusty-x86_64-2.7.6-64/scan_perform/mod.cpp:6224)
> ValueError: could not broadcast input array from shape (11,4) into shape 
> (11,2)
> Apply node that caused the error: forall_inplace,cpu,grad_of_scan_fn}(n, 
> Alloc.0, Elemwise{eq,no_inplace}.0, Alloc.0, n, n, W1, W2, vecs, inds, 
> InplaceDimShuffle{x,0}.0)
> Toposort index: 47
> Inputs types: [TensorType(int64, scalar), TensorType(float64, col), 
> TensorType(int8, matrix), TensorType(float64, matrix), TensorType(int64, 
> scalar), TensorType(int64, scalar), TensorType(float64, matrix), 
> TensorType(float64, vector), TensorType(float64, matrix), TensorType(int64, 
> matrix), TensorType(float64, row)]
> Inputs shapes: [(), (5, 1), (5, 10), (2, 7), (), (), (11, 7), (7,), (10, 
> 11), (10, 2), (1, 7)]
> Inputs strides: [(), (8, 8), (10, 1), (56, 8), (), (), (56, 8), (8,), (88, 
> 8), (16, 8), (56, 8)]
> Inputs values: [array(5), array([[ 1.],
>[ 1.],
>[ 1.],
>[ 1.],
>[ 1.]]), 'not shown', 'not shown', array(5), array(5), 'not shown', 
> 'not shown', 'not shown', 'not shown', 'not shown']
> Outputs clients: [[Subtensor{int64}(forall_inplace,cpu,grad_of_scan_fn}.0, 
> ScalarFromTensor.0)], 
> [InplaceDimShuffle{1,0,2}(forall_inplace,cpu,grad_of_scan_fn}.1)], 
> [Reshape{2}(forall_inplace,cpu,grad_of_scan_fn}.2, 
> MakeVector{dtype='int64'}.0), 
> Shape_i{1}(forall_inplace,cpu,grad_of_scan_fn}.2)]]
>
> HINT: Re-running with most Theano optimization disabled could give you a 
> back-trace of when this node was created. This can be done with by setting 
> the Theano flag 'optimizer=fast_compile'. If that does not work, Theano 
> optimizations can be disabled with 'optimizer=None'.
> HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and 
> storage map footprint of this apply node.
>
>
> A couple of observations:
> - There's no error if I turn off optimizations (theano.config.optimizer = 
> 'None')
> - There's no error if I have a single layer and no hidden layer (i.e. if 
> scores = cur