Hi Simon, hi Maarten,

thank you for your answers!

@Simon: 

Yes, I figured that calling vector(...) would impose these kinds of 
overhead. This is why I am using apply_map in the code, in the hope of 
saving some of this work. 

Thanks to your suggestion, I tried specifying the base ring in the call to 
apply_map. This already reduced the overhead of prim_v by about 3 seconds, 
but apply_map is still the slowest part of the process.

@Maarten:

Yes, I do really simple things lots and lots of times. So your advice of 
turning to cython (or, as Simon suggested, to tuples instead of vectors) 
sounds very good. So far, I stuck to vectors in this part of the code 
mainly because of convenience of notation (overloaded operators).

The one piece of Sage's functionality I really need in this part of the 
code is arbitrary precision integer arithmetic (including gcd). Do you know 
how this fits to Cython's types?

Cheers,
Felix

On Wednesday, November 20, 2013 7:10:06 PM UTC+1, Maarten Derickx wrote:
>
> It seems like your code is mostly doing easy integer operations in tight 
> for loops (although in your case the for loop is hidden in v.apply_map). If 
> you care about performance in such cases, then you should not use python, 
> but cython. Because everything you do in python has a small overhead, and 
> normally this doesn't matter because the overhead is relatively small to 
> the actual running time of the function. But when doing arithmetic 
> operations with integers, this overhead is often much and much bigger then 
> the actual running time. The typical solution for this is using cython 
> code. For a nice introduction to what cython is and how to use it in sage 
> you can read:
>
> http://sagemath.blogspot.fr/2010/11/cython-sage-and-need-for-speed.html
>
> and of course there is the more technical documentation in the developers 
> guide:
>
> http://www.sagemath.org/doc/developer/coding_in_cython.html
>
>
> Le mercredi 20 novembre 2013 11:02:59 UTC+1, Felix Breuer a écrit :
>>
>> Hi all!
>>
>> I have a large collection (~50,000) of integer vectors in low dimension 
>> (~20). For each of these vectors, I want to divide all entries by their 
>> gcd. In other words if
>>
>> def prim_v(v):
>>     d = abs(gcd(v))
>>     return v.apply_map(lambda vi: vi.divide_knowing_divisible_by(d))
>>
>> then I want to compute map(prim_v, V) for a long list V. When trying to 
>> profile this using %prun, I found it difficult to tell which part of this 
>> computation is taking the most time. So I rewrote this as
>>
>> def foo(v,d):
>>     return v.divide_knowing_divisible_by(d)
>>
>> def bar(v):
>>     return abs(gcd(v))
>>
>> def prim_v(v):
>>     d = bar(v)
>>     return v.apply_map(lambda vi: foo(vi,d))
>>
>> Then, profiling this with %prun yields the following information.
>>
>>  ncalls tottime percall cumtime percall  filename:lineno(function)
>>   47802   0.119   0.000  13.811   0.000  LDsolver.py:58(prim_v)
>>   47802   0.116   0.000   3.593   0.000  LDsolver.py:54(bar)
>>  859898   0.360   0.000   0.524   0.000  LDsolver.py:51(foo)
>>
>> If I am reading this correctly, this means that most of the time (~10 
>> seconds) is not spent doing the actual computation (using foo and bar) but 
>> simply creating vectors and read/writing values to/from vectors (in 
>> apply_map). (Do something like return vector((foo(vi,d) for vi in v)) 
>> instead 
>> of apply_map does not make much a difference.) 
>>
>>
>> Now, my questions are:
>>
>> 1) Why is this so slow? Am I missing something here?
>> 2) Is there anything I can do to improve performance?
>>
>>
>> Thank you very much,
>> Felix
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"sage-support" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sage-support+unsubscr...@googlegroups.com.
To post to this group, send email to sage-support@googlegroups.com.
Visit this group at http://groups.google.com/group/sage-support.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to