import numpy as np
import theano.tensor as tt


class BatchedDiag(tt.Op):
    """
    Fast BatchedDiag allocation
    """
    __props__ = ()

    def make_node(self, diag):
        diag = tt.as_tensor_variable(diag)
        if diag.type.ndim != 2:
            raise TypeError('data argument must be a matrix', diag.type)

        return tt.Apply(self, [diag], [tt.tensor3(dtype=diag.dtype)])

    def perform(self, node, ins, outs, params=None):
        (C,) = ins
        (z,) = outs

        bc = C.shape[0]
        dim = C.shape[-1]
        Cd = np.zeros((bc, dim, dim), C.dtype)
        bidx = np.repeat(np.arange(bc), dim)
        didx = np.tile(np.arange(dim), bc)
        Cd[bidx, didx, didx] = C.flatten()
        z[0] = Cd

    def grad(self, inputs, gout):
        (gz,) = gout
        idx = tt.arange(gz.shape[-1])
        return [gz[..., idx, idx]]

    def infer_shape(self, node, shapes):
        return [(shapes[0][0],) + (shapes[0][1],) * 2]

Here is code for a custom Op that might be faster when taking gradients.
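The index trick used in perform can be sanity-checked on its own with plain NumPy (a standalone sketch, not part of the Op):

```python
import numpy as np

def batched_diag(C):
    """NumPy analogue of BatchedDiag.perform: embed each row of the
    2-D array C as the diagonal of one matrix in a (batch, dim, dim)
    stack of zeros."""
    bc, dim = C.shape
    Cd = np.zeros((bc, dim, dim), C.dtype)
    bidx = np.repeat(np.arange(bc), dim)  # 0,0,0,1,1,1,... (batch index)
    didx = np.tile(np.arange(dim), bc)    # 0,1,2,0,1,2,... (diagonal index)
    Cd[bidx, didx, didx] = C.flatten()
    return Cd

C = np.arange(6.).reshape(2, 3)
D = batched_diag(C)  # D[1] has 3, 4, 5 on its diagonal
```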


On Saturday, May 7, 2016 at 16:00:54 UTC+3, Tambet Matiisen wrote:
>
> OK, solved. I used the Keras wrapper K.zeros(), but this created a NumPy 
> matrix of zeros, which failed when a Theano expression was used as a 
> dimension. After switching to a full Theano implementation the error went 
> away. The final code looks like this:
>
>     # initialize with zeros
>     batch_size = x.shape[0]
>     a = T.zeros((batch_size, num_actuators, num_actuators))
>     # set diagonal elements
>     batch_idx = T.extra_ops.repeat(T.arange(batch_size), num_actuators)
>     diag_idx = T.tile(T.arange(num_actuators), batch_size)
>     b = T.set_subtensor(a[batch_idx, diag_idx, diag_idx], 
> T.flatten(T.exp(x[:, :num_actuators])))
>     # set lower triangle
>     cols = np.concatenate([np.array(range(i), dtype=np.uint) for i in 
> xrange(num_actuators)])
>     rows = np.concatenate([np.array([i]*i, dtype=np.uint) for i in 
> xrange(num_actuators)])
>     cols_idx = T.tile(T.as_tensor_variable(cols), batch_size)
>     rows_idx = T.tile(T.as_tensor_variable(rows), batch_size)
>     batch_idx = T.extra_ops.repeat(T.arange(batch_size), len(cols))
>     c = T.set_subtensor(b[batch_idx, rows_idx, cols_idx], T.flatten(x[:, 
> num_actuators:]))
>
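For reference, the repeat/tile index construction in the quoted code can be reproduced with plain NumPy (batch_size and the values of x below are made up for illustration):

```python
import numpy as np

batch_size, num_actuators = 2, 3
x = np.tile(np.arange(1., 7.), (batch_size, 1))  # each row is 1..6

a = np.zeros((batch_size, num_actuators, num_actuators))

# diagonal: e^x for the first num_actuators entries of each row
batch_idx = np.repeat(np.arange(batch_size), num_actuators)
diag_idx = np.tile(np.arange(num_actuators), batch_size)
a[batch_idx, diag_idx, diag_idx] = np.exp(x[:, :num_actuators]).flatten()

# lower triangle: the remaining entries, filled row by row
cols = np.concatenate([np.arange(i) for i in range(num_actuators)])
rows = np.concatenate([np.full(i, i, dtype=np.intp)
                       for i in range(num_actuators)])
cols_idx = np.tile(cols, batch_size)
rows_idx = np.tile(rows, batch_size)
batch_idx = np.repeat(np.arange(batch_size), len(cols))
a[batch_idx, rows_idx, cols_idx] = x[:, num_actuators:].flatten()
```

Every matrix in the batch gets e^x1..e^x3 on its diagonal and x4..x6 in the strictly lower triangle, with the upper triangle left at zero.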
> Thanks for giving me confidence that it is possible!
>
>   Tambet
>
> On Friday, May 6, 2016 at 17:57:02 UTC+3, nouiz wrote:
>>
>> what error do you get?
>>
>>
>> On Fri, May 6, 2016 at 10:54 AM, Tambet Matiisen <tambet....@gmail.com> 
>> wrote:
>>
>>> I could not figure out how to make broadcasting work here, so I 
>>> implemented option 2.
>>>
>>> num_actuators=4
>>> x = K.variable([range(num_actuators*(num_actuators+1)/2)]*5)
>>>
>>> batch_size = K.shape(x)[0]
>>> a = K.zeros((batch_size.eval(), num_actuators, num_actuators))
>>>
>>> # populate diagonal
>>> batch_idx = T.extra_ops.repeat(T.arange(batch_size), num_actuators)
>>> diag_idx = T.tile(T.arange(num_actuators), batch_size)
>>> b = T.set_subtensor(a[batch_idx, diag_idx, diag_idx], 
>>> T.flatten(K.exp(x[:, :num_actuators])))
>>>
>>> # populate lower triangle
>>> cols = np.concatenate([np.array(range(i), dtype=np.uint) for i in 
>>> xrange(num_actuators)])
>>> rows = np.concatenate([np.array([i]*i, dtype=np.uint) for i in 
>>> xrange(num_actuators)])
>>> cols_idx = T.tile(K.variable(cols, dtype=int), batch_size)
>>> rows_idx = T.tile(K.variable(rows, dtype=int), batch_size)
>>> batch_idx = T.extra_ops.repeat(T.arange(batch_size), len(cols))
>>> c = T.set_subtensor(b[batch_idx, rows_idx, cols_idx], T.flatten(x[:, 
>>> num_actuators:]))
>>>
>>> It works nicely, but only because I eval() batch_size when creating all 
>>> zeros array. In real application I don't know the batch size beforehand and 
>>> using it without eval() gives an error. So the question is: can you 
>>> create a matrix in Theano dynamically, depending on some value in the 
>>> computational graph?
>>>
>>>   Tambet
>>>
>>> On Friday, May 6, 2016 at 16:14:59 UTC+3, nouiz wrote:
>>>>
>>>> Broadcasting could in theory be more efficient, so I would suggest 
>>>> that you try option 1.
>>>>
>>>> Otherwise, both should work.
>>>>
>>>> Fred
>>>>
>>>> On Fri, May 6, 2016 at 9:12 AM, Tambet Matiisen <tambet....@gmail.com> 
>>>> wrote:
>>>>
>>>>> Actually I know the dimensions of the matrix beforehand, so I can do 
>>>>> those calculations in Python+Numpy. Following seems to do the trick:
>>>>>
>>>>> num_actuators = 3
>>>>> x = [1,2,3,4,5,6]
>>>>> a = K.zeros((num_actuators, num_actuators))
>>>>>
>>>>> # set diagonal elements
>>>>> b = T.set_subtensor(a[range(num_actuators), range(num_actuators)], 
>>>>> K.exp(x[:num_actuators]))
>>>>>
>>>>> # set lower triangle
>>>>> cols = np.concatenate([np.array(range(i), dtype=np.uint) for i in 
>>>>> xrange(num_actuators)])
>>>>> rows = np.concatenate([np.array([i]*i, dtype=np.uint) for i in 
>>>>> xrange(num_actuators)])
>>>>> c = T.set_subtensor(b[rows, cols], x[num_actuators:])
>>>>>
>>>>> K.eval(c)
>>>>>
>>>>>
>>>>> array([[  2.71828175,   0.        ,   0.        ],
>>>>>        [  4.        ,   7.38905621,   0.        ],
>>>>>        [  5.        ,   6.        ,  20.08553696]], dtype=float32)
>>>>>
>>>>>
>>>>> (I'm mixing Keras and Theano functions here, but I guess you 
>>>>> understand the idea.)
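For sanity checking, the same single-matrix computation can be replayed in plain NumPy (substituting NumPy calls for the Keras/Theano ones); it reproduces the printed matrix above:

```python
import numpy as np

num_actuators = 3
x = np.array([1., 2., 3., 4., 5., 6.])

a = np.zeros((num_actuators, num_actuators))
# diagonal gets e^x1 .. e^x3
a[range(num_actuators), range(num_actuators)] = np.exp(x[:num_actuators])
# lower triangle gets x4 .. x6, filled row by row
cols = np.concatenate([np.arange(i) for i in range(num_actuators)])
rows = np.concatenate([np.full(i, i, dtype=np.intp)
                       for i in range(num_actuators)])
a[rows, cols] = x[num_actuators:]
# a ≈ [[ 2.718, 0, 0 ], [ 4, 7.389, 0 ], [ 5, 6, 20.086 ]]
```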
>>>>>
>>>>> Now the problem is the following: actually x is not 1D but 2D; the 
>>>>> first dimension is the batch size. So I would like to broadcast this 
>>>>> operation over the first dimension of x. Is there any way to do it?
>>>>>
>>>>> An alternative would be to
>>>>> 1. construct a to be 3D, first dimension batch size,
>>>>> 2. repeat all index ranges batch size times.
>>>>> Sounds quite inefficient, but I guess doable.
>>>>>
>>>>>   Tambet
>>>>>
>>>>> On Friday, May 6, 2016 at 1:25:02 UTC+3, nouiz wrote:
>>>>>
>>>>>>
>>>>>> On May 5, 2016 at 16:18, "Tambet Matiisen" <tambet....@gmail.com> 
>>>>>> wrote:
>>>>>> >
>>>>>> > Thanks Fred for the hint! The following seems to work (K is the 
>>>>>> Keras equivalent of theano.tensor):
>>>>>> >
>>>>>> > a = K.zeros((3,3))
>>>>>> > K.eval(a)
>>>>>> >
>>>>>> > array([[ 0.,  0.,  0.],
>>>>>> >        [ 0.,  0.,  0.],
>>>>>> >        [ 0.,  0.,  0.]], dtype=float32)
>>>>>> >
>>>>>> >
>>>>>> > b = T.set_subtensor(a[[0,1,2],[0,1,2]], [1,2,3])
>>>>>> >
>>>>>> > K.eval(b)
>>>>>> >
>>>>>> >
>>>>>> > array([[ 1.,  0.,  0.],
>>>>>> >        [ 0.,  2.,  0.],
>>>>>> >        [ 0.,  0.,  3.]], dtype=float32)
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > But what if I don't know the matrix dimensions beforehand? Can I 
>>>>>> produce the list of indexes to assign using just Theano arithmetics?
>>>>>>
>>>>>> Yes, there is T.arange(...) that you can use.
>>>>>>
>>>>>> Fred
>>>>>> >
>>>>>> >   Tambet
>>>>>> >
>>>>>> >
>>>>>> > On Thursday, May 5, 2016 at 17:30:08 UTC+3, nouiz wrote:
>>>>>> >>
>>>>>> >> The first idea I have is to initialize a vector of 9 zeros and use 
>>>>>> set_subtensor to set the indices to the values you want, then reshape 
>>>>>> it to a matrix.
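The suggestion above can be sketched with NumPy standing in for the symbolic zeros/set_subtensor/reshape steps (n and the values are arbitrary stand-ins):

```python
import numpy as np

n = 3
vals = np.array([1., 2., 3.])       # values to put on the diagonal
flat = np.zeros(n * n)              # vector of 9 zeros
diag_pos = np.arange(n) * (n + 1)   # flat positions 0, 4, 8 of the diagonal
flat[diag_pos] = vals               # the set_subtensor step
m = flat.reshape(n, n)              # reshape back to a 3x3 matrix
```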
>>>>>> >>
>>>>>> >> Fred
>>>>>> >>
>>>>>> >> On Thu, May 5, 2016 at 5:07 AM, Tambet Matiisen <
>>>>>> tambet....@gmail.com> wrote:
>>>>>> >>>
>>>>>> >>> Hi everyone!
>>>>>> >>>
>>>>>> >>> I'm trying to apply the T.diag() and T.tril() operations over a 
>>>>>> batch of matrices, so that the first dimension is preserved. Theano 
>>>>>> doesn't seem to provide a built-in function for that. Is there any 
>>>>>> other way to achieve the same?
>>>>>> >>>
>>>>>> >>> Basically I need to turn a bunch of numbers x1, ... , x6 into a 
>>>>>> matrix like this:
>>>>>> >>>
>>>>>> >>> | e^x1    0    0 |
>>>>>> >>> |   x2 e^x3    0 |
>>>>>> >>> |   x4   x5 e^x6 |
>>>>>> >>>
>>>>>> >>> i.e. the diagonal must be filled with e^xi and the lower triangle 
>>>>>> with just xi. The order of the x-s is not particularly important, as 
>>>>>> these are learned weights anyway.
>>>>>> >>>
>>>>>> >>> Thanks!
>>>>>> >>> Tambet
>>>>>> >>>
>>>>>> >>> -- 
>>>>>> >>>
>>>>>> >>> --- 
>>>>>> >>> You received this message because you are subscribed to the 
>>>>>> Google Groups "theano-users" group.
>>>>>> >>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>> send an email to theano-users...@googlegroups.com.
>>>>>> >>>
>>>>>> >>> For more options, visit https://groups.google.com/d/optout.
>>>>>> >>
>>>>>> >>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
