from:"nicolas . granger . m"

Re: [theano-users] Numpy error during optimization phase

2018-02-13 Thread nicolas . granger . m

Sorry for the delay. I just re-ran it in a clean conda environment; here 
are my system specs:

OS: archlinux
nvidia: 390.25
cuda: 9.1.85
numpy: 1.14.0
pygpu: 0.7.5
theano: git master

.theanorc:
[global]
device = cuda
floatX = float32
warn_float64 = warn
on_opt_error = raise

[nvcc]
fastmath = True

[gpuarray]
preallocate = 0.85

[cuda]
include_path = /opt/cuda/include
library_path = /opt/cuda/lib64
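
To compare this configuration with the THEANO_FLAGS command quoted below, it can help to see both in the same form. The following is a minimal sketch, assuming the usual Theano convention that options in [global] are written as name=value and options in other sections as section.name=value. The helper name theanorc_to_flags is hypothetical and is not part of Theano.

```python
import configparser

# The .theanorc contents listed above, copied verbatim.
THEANORC = """\
[global]
device = cuda
floatX = float32
warn_float64 = warn
on_opt_error = raise

[nvcc]
fastmath = True

[gpuarray]
preallocate = 0.85

[cuda]
include_path = /opt/cuda/include
library_path = /opt/cuda/lib64
"""


def theanorc_to_flags(text):
    """Hypothetical helper: flatten a .theanorc into a THEANO_FLAGS-style string."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive (e.g. floatX)
    parser.read_string(text)
    flags = []
    for section in parser.sections():
        for key, value in parser.items(section):
            # Assumed convention: [global] options have no prefix.
            name = key if section == "global" else f"{section}.{key}"
            flags.append(f"{name}={value}")
    return ",".join(flags)


flags = theanorc_to_flags(THEANORC)
print(f"THEANO_FLAGS={flags}")
```

Written this way, the options that differ from the quoted test command (warn_float64, on_opt_error, nvcc.fastmath, gpuarray.preallocate and the cuda paths) are easy to spot.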


On Wednesday, February 7, 2018 at 21:32:01 UTC+1, nouiz wrote:
>
> I'm not able to reproduce it.
>
> On which OS? Which Theano version? Can you try a Theano version at least 
> 1.0.1?
>
> You can ignore this "error". In most cases it only means that some 
> optimizations are skipped. But I would still like to fix it.
>
> I ran the tests like this:
>
> THEANO_FLAGS=device=cuda,floatX=float32 nosetests test_ctc.py &> OUT
>
> What are your Theano flags?
>

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to theano-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: [theano-users] Numpy error during optimization phase

2018-02-04 Thread nicolas . granger . m

I'm using numpy 1.14.0 from the popular conda-forge repository. BTW, I was 
wrong about the GPU/CPU distinction: the error is triggered in either case.


[theano-users] Numpy error during optimization phase

2018-01-24 Thread nicolas . granger . m

Hi everyone,

While using an OpFromGraph involving some operations with binary values, 
there is an optimization error:

theano.gof.opt: ERROR: Optimization failure due to: local_add_canonizer
theano.gof.opt: ERROR: node: Elemwise{add,no_inplace}(InplaceDimShuffle{0,1,x}.0, InplaceDimShuffle{x,0,1}.0)
theano.gof.opt: ERROR: TRACEBACK:
theano.gof.opt: ERROR: Traceback (most recent call last):
  File "/home/granger/dev/Theano/theano/gof/opt.py", line 2034, in process_node
    replacements = lopt.transform(node)
  File "/home/granger/dev/Theano/theano/tensor/opt.py", line 4989, in transform
    num, denum = self.simplify(list(orig_num), list(orig_denum), out.type)
  File "/home/granger/dev/Theano/theano/tensor/opt.py", line 4833, in simplify
    out_type=out_type)
  File "/home/granger/dev/Theano/theano/tensor/opt.py", line 4919, in simplify_constants
    out_type=out_type)
  File "/home/granger/dev/Theano/theano/tensor/opt.py", line 6328, in add_calculate
    v = reduce(np.add, num, zero) - reduce(np.add, denum, zero)
TypeError: numpy boolean subtract, the `-` operator, is deprecated, use the bitwise_xor, the `^` operator, or the logical_xor function instead.


This error does not happen when running on CPU backend.
I suspect it might be due to the use of binary values in my code, but the 
log message is not very helpful, is there any way to get some more 
information to track down the error? Note that the fast_compile optimizer 
does not trigger the error, only the fast_run one.
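
For reference, the last line of the traceback can be reproduced with NumPy alone. This is a minimal sketch, assuming the constants reaching that line are boolean; the helper name sum_difference is hypothetical and only mirrors the shape of the expression in the traceback, not Theano's actual code. It also shows that the same expression goes through once the values are cast to an integer dtype.

```python
from functools import reduce

import numpy as np


def sum_difference(num, denum, zero):
    """Hypothetical stand-in: same expression shape as the failing line."""
    return reduce(np.add, num, zero) - reduce(np.add, denum, zero)


# Boolean constants: np.add on booleans stays boolean, and subtracting
# two booleans is rejected by NumPy with a TypeError.
bool_consts = [np.array(True), np.array(False)]
try:
    sum_difference(bool_consts, [], np.array(False))
    bool_failed = False
except TypeError as exc:
    bool_failed = True
    print("boolean inputs:", exc)

# The same values cast to int8: the subtraction is well defined.
int_consts = [c.astype("int8") for c in bool_consts]
int_result = sum_difference(int_consts, [], np.int8(0))
print("int8 inputs:", int_result)
```

If this matches what happens in the graph, casting the binary values to int8 before they enter the addition might be a workaround until the optimizer is fixed, but that is a guess rather than something confirmed in this thread.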

Demo code and the complete output are available here: 
https://gist.github.com/nlgranger/279bda7fff356cfe3f40ad6397d0ba04

Best,
Nicolas


Re: [theano-users] Split Op (OpFromGraph) to save intermediate results for grad

2017-08-09 Thread nicolas . granger . m

"Forward the precomputed output" means that Op1 has already computed the final 
output, so Op2 just has to behave as the identity in the forward pass.

The intermediate value is already an output of Op1 as shown in the example 
code, sorry if that wasn't clear.

Nicolas

On Tuesday, August 8, 2017 at 20:56:12 UTC+2, nouiz wrote:
>
> I don't understand what you mean by "forward the precomputed output"
>
> What I would recommend is to make one op for the forward pass. Make the 
> intermediate values that can be reused for the gradient outputs of that op. 
> Don't use them in the forward pass, but you can reuse them in your grad override.
>
> Frédéric
>


[theano-users] Split Op (OpFromGraph) to save intermediate results for grad

2017-07-31 Thread nicolas . granger . m

I am trying to build an Op with a custom/optimized gradient formula. To 
override the automatic differentiation, I'm trying to use OpFromGraph. 
The gradient formula can reuse intermediate results from the feed-forward 
pass, so I have tried to split the Op in two: Op1 computes the intermediate 
and final results and gives all of them to Op2; Op2 forwards the final result 
and takes care of the gradient computation given all the necessary values.

Note that the gradient of the loss wrt the intermediate results is never 
needed.

Below is what I believe to be a minimal working example of my problem. It 
exhibits a strange conversion error related to the gradient computation 
with the intermediate values. Please note the presence of a variable of 
integer type.

import numpy as np
import theano.tensor as T
import theano


def make_ops():
    x = T.vector()
    m = T.bvector()

    r = m.sum().astype('floatX')  # intermediate value
    z = x * m / r  # final result

    def grad_op1(inputs, output_gradients):
        return [
            output_gradients[0],  # gradient computation delegated to op2
            T.DisconnectedType()()  # variable has integral type
            # T.zeros_like(inputs[1])
        ]

    op1 = theano.OpFromGraph(
        inputs=[x, m],
        outputs=[z, m, r],
        grad_overrides=grad_op1,
        inline=True,
        name="op1")

    z = T.vector()
    r_forwarded = T.scalar()

    def grad_op2(inputs, output_gradients):
        _, m_, r_ = inputs
        dm_ = theano.gradient.DisconnectedType()(name="dm_")
        # I think the error could be around here <<--
        # dr_ = theano.gradient.DisconnectedType()(name="dr_")
        dr_ = T.zeros_like(r_)
        return [m_ / r_, dm_, dr_]

    op2 = theano.OpFromGraph(
        inputs=[z, m, r_forwarded],
        outputs=[z],  # Op 2 forwards the precomputed output
        grad_overrides=grad_op2,
        inline=True,
        name="op2")

    return op1, op2


def main():
    op1, op2 = make_ops()
    x = T.vector(name="x")
    m = T.bvector(name="m")
    z_intermediate, m_forwarded, r = op1(x, m)
    z = op2(z_intermediate, m, r)

    g = theano.grad(T.sum(z), wrt=x)
    print(g.eval({x: np.array([1., .3, .0, .2], dtype=np.float32),
                  m: np.array([1, 0, 1, 1], dtype=np.int8)}))


if __name__ == "__main__":
    main()
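
As a reference point for what the overridden gradient should return, here is a NumPy-only sketch (no Theano involved) that checks the formula m / r for the gradient of sum(z) with respect to x against central finite differences, using the same input values as the example. The function names forward, analytic_grad and numeric_grad are hypothetical helpers introduced for this check.

```python
import numpy as np


def forward(x, m):
    """z = x * m / r with r = sum(m), as in the example."""
    r = m.sum().astype(x.dtype)
    return x * m / r


def analytic_grad(x, m):
    """Gradient of sum(z) w.r.t. x: each x[i] contributes m[i] / r."""
    r = m.sum().astype(x.dtype)
    return m / r


def numeric_grad(x, m, eps=1e-6):
    """Central finite differences of sum(forward(x, m)) w.r.t. x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (forward(x + step, m).sum()
                   - forward(x - step, m).sum()) / (2 * eps)
    return grad


# float64 is used here only to keep the finite differences accurate.
x = np.array([1., .3, .0, .2], dtype=np.float64)
m = np.array([1, 0, 1, 1], dtype=np.int8)

expected = analytic_grad(x, m)
checked = numeric_grad(x, m)
print(expected)
print(np.allclose(expected, checked, atol=1e-6))
```

Once the conversion error is resolved, the Theano graph should presumably produce the same values for these inputs.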

(Note: I had tried to hijack my previous question thread with this problem 
but it went unnoticed, sorry for double posting)

Thank you


[theano-users] Re: Unused input error with chained OpFromGraph ops

2017-07-17 Thread nicolas . granger . m

Hello,

I still haven't managed to track the error down. Below is a shorter example 
that triggers it. It seems Theano tries to create a variable for the 
output gradient of a node through which I do not backpropagate. At some 
point it hits a DisconnectedType instance and raises an error.

import numpy as np
import theano.tensor as T
import theano


def make_ops():
    x_var = T.vector()
    m_var = T.bvector()

    r = m_var.sum().astype('floatX')
    z = x_var * m_var / r

    def grad_op1(inputs, output_gradients):
        pass
        return [
            output_gradients[0],  # computation delegated to op2
            theano.gradient.DisconnectedType()()
        ]

    op1 = theano.OpFromGraph(
        inputs=[x_var, m_var],
        outputs=[z, r],
        grad_overrides=grad_op1,
        inline=True,
        name="op1")

    return op1


op1 = make_ops()
x_var = T.vector()
m_var = T.bvector()
z, r = op1(x_var, m_var)

g = theano.grad(T.sum(z), wrt=x_var)
print(g.eval({x_var: np.array([1., .3, .0, .2], dtype=np.float32),
              m_var: np.array([1, 0, 1, 1], dtype=np.int8)}))

output:
TypeError: Cannot convert Type DisconnectedType (of Variable <DisconnectedType>) into Type TensorType(float32, scalar). You can try to manually convert <DisconnectedType> into a TensorType(float32, scalar).

Process finished with exit code 1




[theano-users] Re: Unused input error with chained OpFromGraph ops

2017-07-13 Thread nicolas . granger . m

Hi,

Thank you for the suggestion. Inlining actually makes more sense for what I 
am trying to do.

However, a casting issue arises when trying to compute the derivative wrt 
the continuous input. If I understood correctly, a DisconnectedType variable 
should be returned as the gradient for integral inputs (or inputs wrt which 
I don't need the derivative), right?

Below is the slightly modified code which illustrates this new issue:

import numpy as np
import theano.tensor as T
import theano


def make_ops():
    x_var = T.vector()
    m_var = T.bvector()

    r = m_var.sum().astype('floatX')
    z = x_var * m_var / r

    def grad_op1(inputs, output_gradients):
        return [
            output_gradients[0],  # computation delegated to op2
            theano.gradient.DisconnectedType()(),
        ]

    op1 = theano.OpFromGraph(
        inputs=[x_var, m_var],
        outputs=[z, r],
        grad_overrides=grad_op1,
        inline=True)

    z_var = T.vector()
    r_var = T.scalar()

    def grad_op2(inputs, output_gradients):
        _, m_, r_ = inputs
        return [
            m_ * r_,
            theano.gradient.DisconnectedType()(),
            theano.gradient.DisconnectedType()()
        ]

    op2 = theano.OpFromGraph(
        inputs=[z_var, m_var, r_var],
        outputs=[z_var],
        grad_overrides=grad_op2,
        inline=True)

    return op1, op2


op1, op2 = make_ops()
x_var = T.vector()
m_var = T.bvector()
z_, r = op1(x_var, m_var)
z = op2(z_, m_var, r)

g = theano.grad(T.sum(z), wrt=x_var)
print(g.eval({x_var: np.array([1., .3, .0, .2], dtype=np.float32),
              m_var: np.array([1, 0, 1, 1], dtype=np.int8)}))





[theano-users] Unused input error with chained OpFromGraph ops

2017-07-11 Thread nicolas . granger . m

Hi,

I am trying to split a computation over two ops in order to avoid spurious 
computations when computing the gradient.
My current attempt uses a first op which returns the desired result for the 
forward part and extra intermediate results. The second op just forwards 
the desired result, but its grad is overridden to compute the gradient based 
on the intermediate results.

In this configuration, Theano complains about unused inputs in the forward 
computation because the intermediate results are not used for the forward 
method of the second op.

Is this an expected behaviour or a bug?



import numpy as np
import theano.tensor as T
import theano


def make_ops():
    x_var = T.vector()
    m_var = T.bvector()

    r = m_var.sum().astype('floatX')
    z = x_var * m_var / r

    def grad_op1(inputs, output_gradients):
        return [
            output_gradients[0],  # computation delegated to op2
            theano.gradient.DisconnectedType()()
        ]

    op1 = theano.OpFromGraph(
        inputs=[x_var, m_var],
        outputs=[z, r],
        grad_overrides=grad_op1)

    z_var = T.vector()
    r_var = T.scalar()

    def grad_op2(inputs, output_gradients):
        _, m_, r_ = inputs
        return [
            m_ * r_,
            theano.gradient.DisconnectedType()(),
            theano.gradient.DisconnectedType()()
        ]

    op2 = theano.OpFromGraph(
        inputs=[z_var, m_var, r_var],
        outputs=[z_var],
        grad_overrides=grad_op2)

    return op1, op2


op1, op2 = make_ops()
x_var = T.vector()
m_var = T.bvector()
z_, r = op1(x_var, m_var)
z = op2(z_, m_var, r)

print(z_.eval({x_var: np.array([1., .3, .0, .2], dtype=np.float32),
               m_var: np.array([1, 0, 1, 1], dtype=np.int8)}))

# raises anyway
f = theano.function([x_var, m_var], [z], on_unused_input='ignore')

print(f(np.array([1., .3, .0, .2], dtype=np.float32),
        np.array([1, 0, 1, 1], dtype=np.int8)))

# g = theano.grad(T.sum(z), wrt=x_var)
# print(g.eval({x_var: np.array([1., .3, .0, .2], dtype=np.float32),
#               m_var: np.array([1, 0, 1, 1], dtype=np.int8)}))



[theano-users] Propagate the gradient only for a subset of samples in the bottom layers

2017-01-27 Thread nicolas . granger . m

Hello,

I am trying out an architecture for video sequences where a small CNN 
extracts features from each frame and then feeds them to an LSTM.

|RNN|->|RNN|->|RNN|->|RNN|->|RNN|->|RNN|
  |      |      |      |      |      |
|CNN|  |CNN|  |CNN|  |CNN|  |CNN|  |CNN|

If possible, I would like to train the CNN and the LSTM jointly on full 
sequences (2000 frames).
The intermediate results that the CNN forward pass must keep for 
backpropagation are too big to be stored for every frame, so I would like to 
know if one can split the training as follows:

   1. feed forward the samples through the CNN but only store 
   backpropagation data for a subset of the frames
   2. propagate through the LSTM as usual
   3. backpropagate down through the LSTM and update parameters as usual
   4. back propagate down through the CNN for the samples that belong to 
   the subset and update the parameters
   
theano.gradient.grad has a known_grads argument, maybe that could help?
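
The four steps above can be sketched with NumPy alone. This is only an illustration under stated assumptions: a single tanh layer stands in for the CNN, a fixed linear readout stands in for the LSTM and loss, and all names (extract, top_loss, top_grad, subset_grad) are hypothetical. The array d_features plays the role that a gradient supplied through known_grads would presumably play in Theano.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, in_dim, feat_dim = 8, 5, 3
frames = rng.normal(size=(n_frames, in_dim))
W = rng.normal(size=(feat_dim, in_dim)) * 0.1  # stand-in for the CNN parameters
v = rng.normal(size=feat_dim)                  # stand-in for the recurrent model


def extract(W, x):
    """Stand-in for the per-frame CNN."""
    return np.tanh(W @ x)


def top_loss(features):
    """Stand-in for the LSTM plus loss: a function of all frame features."""
    return float((features @ v).sum())


def top_grad(features):
    """Gradient of top_loss w.r.t. the features of every frame."""
    return np.tile(v, (features.shape[0], 1))


# Step 1: forward every frame, keeping only the output features.
features = np.stack([extract(W, x) for x in frames])

# Steps 2 and 3: the top model is run and differentiated as usual,
# which yields d(loss)/d(features) for each frame.
d_features = top_grad(features)


def subset_grad(W, subset):
    """Step 4: backpropagate into W only for the chosen frames,
    recomputing the activations that were not stored."""
    grad = np.zeros_like(W)
    for t in subset:
        h = extract(W, frames[t])
        grad += np.outer(d_features[t] * (1.0 - h ** 2), frames[t])
    return grad


subset = [0, 3, 6]
grad_subset = subset_grad(W, subset)
grad_full = subset_grad(W, range(n_frames))
print(grad_subset.shape, grad_full.shape)
```

Note that the gradient restricted to a subset of frames is only part of the full gradient with respect to the CNN parameters, so the resulting update differs from the one obtained with all frames.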


Regards,

Nicolas



