P.S. If it helps: yes, the indices are exactly the ones I have in the code.

Thanks,
Oussama
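For reference, here is a NumPy stand-in for the loop discussed in the quoted thread below. It is only a sketch to check that hoisting cos/sin out of the loop and reading each index pair once leaves the result unchanged. The identity start for A, the value of N, and the upper-triangular `indices` are assumptions made for illustration, not the ones from the attached scripts. In NumPy the hoisting only saves a few calls; in Theano it is what keeps the graph smaller.

```python
import numpy as np


def rotate_naive(theta, indices, B):
    """Recompute cos/sin and re-read the indices inside every iteration."""
    A = np.eye(B.shape[0])  # assumption: A starts as the identity
    for k in range(len(theta)):
        A[indices[k, 0], indices[k, 0]] = np.cos(theta[k])
        A[indices[k, 1], indices[k, 1]] = np.cos(theta[k])
        A[indices[k, 0], indices[k, 1]] = np.sin(theta[k])
        A[indices[k, 1], indices[k, 0]] = -np.sin(theta[k])
        B = A @ B
    return B


def rotate_hoisted(theta, indices, B):
    """Same loop, with cos/sin evaluated once and each index pair read once."""
    coss = np.cos(theta)
    sins = np.sin(theta)
    A = np.eye(B.shape[0])  # assumption: A starts as the identity
    for k in range(len(theta)):
        idx0, idx1 = indices[k]
        A[idx0, idx0] = coss[k]
        A[idx1, idx1] = coss[k]
        A[idx0, idx1] = sins[k]
        A[idx1, idx0] = -sins[k]
        B = A @ B
    return B


N = 4  # hypothetical size
# Hypothetical index pairs (i, j) with i < j; the real ones are in the attachments.
indices = np.array([(i, j) for i in range(N) for j in range(i + 1, N)])
rng = np.random.default_rng(0)
theta = rng.uniform(-np.pi, np.pi, size=len(indices))
B0 = rng.standard_normal((N, N))

same = np.allclose(rotate_naive(theta, indices, B0),
                   rotate_hoisted(theta, indices, B0))
```

Like the quoted code, both functions keep updating the same A across iterations rather than resetting it.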
On 16 February 2017 at 23:00, Oussama Souihli <albitru...@gmail.com> wrote:

> Hi Fred,
>
> Thank you for the quick reply and clever suggestions, you really hit the
> nail on the head!
>
> I tried the suggested vectorized code, but unfortunately still hit the
> max recursion limit exception.
> I also tried the workaround, but I still get the max recursion limit error
> until I set the limit to 20K, at which point I get a segmentation fault:
>
> Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
>
> Can you think of other suggestions that I could try?
> If the for-loop approach is doomed because of the max recursion issue, is
> there perhaps a way I could rewrite my code so that scan becomes
> reasonably faster?
>
> On 16 February 2017 at 22:22, Frédéric Bastien <frederic.bast...@gmail.com>
> wrote:
>
>> Are the indices exactly the ones you have in your code? If so, you can
>> probably take advantage of this to vectorize the code and make the
>> for-loop case work.
>>
>> The sizes are so small that using scan will be super slow. Each iteration
>> of scan has a big overhead, and you mostly work on scalars at each
>> iteration. So it is normal that it is slow with scan.
>>
>> For the for loop, there was a problem with the indices. I fixed it. Here
>> is a partial vectorization of the code that will make it faster to
>> execute and produce a smaller graph:
>>
>>     coss = tt.cos(theta)
>>     sins = tt.sin(theta)
>>
>>     for k in range(0, K):
>>         A = tt.set_subtensor(A[indices[k, 0], indices[k, 0]], coss[k])
>>         A = tt.set_subtensor(A[indices[k, 1], indices[k, 1]], coss[k])
>>         A = tt.set_subtensor(A[indices[k, 0], indices[k, 1]], sins[k])
>>         A = tt.set_subtensor(A[indices[k, 1], indices[k, 0]], -sins[k])
>>         B = tt.dot(A, B)
>>
>> The calls to cos and sin are only made once. This makes the graph smaller.
>>
>> You can also do the indexing call only once.
>> This makes the graph smaller still:
>>
>>     coss = tt.cos(theta)
>>     sins = tt.sin(theta)
>>
>>     for k in range(0, K):
>>         idx0 = indices[k, 0]
>>         idx1 = indices[k, 1]
>>         A = tt.set_subtensor(A[idx0, idx0], coss[k])
>>         A = tt.set_subtensor(A[idx1, idx1], coss[k])
>>         A = tt.set_subtensor(A[idx0, idx1], sins[k])
>>         A = tt.set_subtensor(A[idx1, idx0], -sins[k])
>>         B = tt.dot(A, B)
>>
>> Then, if the indices are exactly what you created and don't change, the
>> graph can become even smaller:
>>
>>     for k in range(0, K):
>>         idx0 = k // (N - 1)
>>         idx1 = indices[k, 1]
>>         A = tt.set_subtensor(A[idx0, idx0], coss[k])
>>         A = tt.set_subtensor(A[idx1, idx1], coss[k])
>>         A = tt.set_subtensor(A[idx0, idx1], sins[k])
>>         A = tt.set_subtensor(A[idx1, idx0], -sins[k])
>>         B = tt.dot(A, B)
>>
>> You can do something similar with j.
>>
>> If you still hit the max recursion limit with that, try the workaround
>> in this issue:
>>
>> https://github.com/Theano/Theano/issues/3607
>>
>> Fred
>>
>> On Thu, Feb 16, 2017 at 7:53 AM Oussama Souihli <albitru...@gmail.com>
>> wrote:
>>
>>> Hi Fred, Adam, Kiuhnm, all
>>>
>>> Thank you for the detailed and quick replies and multiple suggestions.
>>> I tried batched_dot but still experienced both issues (maximum recursion
>>> depth when using a for loop and very slow epochs when using
>>> theano.scan),
>>> so now I'm thinking the problem is possibly also due to the way I
>>> populate the array A (using set_subtensor).
>>>
>>> Sorry for imposing on your kindness, but would you mind taking a look at
>>> the attached, *very short*, minimal examples reproducing what I'm
>>> trying to achieve?
>>> The examples are self-contained and only require Python with the numpy
>>> and theano packages, nothing else.
>>>
>>>    - *Issue_with_for_loop.py*:
>>>      - Reproduces the for-loop maximum recursion issue.
>>>      - If you run it as is, it will throw the exception on B.eval()
>>>
>>>    - *Issue_with_scan.py*:
>>>      - This one shows the alternative way of implementing the logic
>>>        using scan.
>>>      - Unfortunately, in this case it returns a result and gives the
>>>        impression that it works fine, but if I run it in Keras the
>>>        epoch duration becomes prohibitively large (400,000 seconds
>>>        per epoch!!)
>>>
>>> If you can think of a way to achieve the logic implemented in
>>> either script with maximum efficiency Theano-wise, I'd really
>>> appreciate it.
>>>
>>> With gratitude,
>>> Oussama
>>>
>>> --
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "theano-users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to theano-users+unsubscr...@googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
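On the segmentation fault reported above after raising the recursion limit to 20K: in CPython, `sys.setrecursionlimit` only lifts the interpreter's own guard, so deep recursion can then overflow the thread's C stack, which would be consistent with a SIGSEGV. One commonly used approach, not something suggested in the thread, is to run the recursive work in a thread created with a larger stack. The sketch below uses only the standard library; `run_with_big_stack`, the 64 MiB stack size and the `depth` stand-in are hypothetical choices for illustration.

```python
import sys
import threading


def run_with_big_stack(fn, recursion_limit=20000, stack_bytes=64 * 1024 * 1024):
    """Run fn() in a worker thread that gets a larger C stack.

    Hypothetical helper for illustration; it is not part of Theano.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = fn()
        except Exception as exc:  # hand any error back to the caller
            outcome["error"] = exc

    old_limit = sys.getrecursionlimit()
    # stack_size() affects threads created afterwards and returns the old value.
    old_stack = threading.stack_size(stack_bytes)
    sys.setrecursionlimit(recursion_limit)
    try:
        thread = threading.Thread(target=worker)
        thread.start()
        thread.join()
    finally:
        # Restore the previous settings once the worker is done.
        sys.setrecursionlimit(old_limit)
        threading.stack_size(old_stack)

    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


def depth(n):
    # Stand-in for a deeply recursive traversal of a large graph.
    return 0 if n == 0 else 1 + depth(n - 1)


result = run_with_big_stack(lambda: depth(10000))
```

In the setting of this thread, `fn` would be whatever call triggers the recursion (for example the `B.eval()` call mentioned above). Whether a larger stack is enough for that graph is not something the thread confirms.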