Re: Speed up suggestion for task #7807.

Troels Emtekær Linnet Tue, 10 Jun 2014 11:51:17 -0700

Whatever I do, I cannot get this to work.

Can you show an example?


2014-06-10 16:29 GMT+02:00 Edward d'Auvergne <[email protected]>:
> Here is an example of avoiding automatic numpy data structure creation
> and then garbage collection:
>
> """
> from numpy import add, ones, zeros
>
> a = zeros((5, 4))
> a[1] = 1
> a[:,1] = 2
>
> b = ones((5, 4))
>
> add(a, b, a)
> print(a)
> """
>
> The result is:
>
> [[ 1.  3.  1.  1.]
>  [ 2.  3.  2.  2.]
>  [ 1.  3.  1.  1.]
>  [ 1.  3.  1.  1.]
>  [ 1.  3.  1.  1.]]
>
> The out argument for numpy.add() is used here to operate in a similar
> way to the Python "+=" operation, writing the result directly into the
> existing array a.  This avoids the temporary numpy data structure that
> an expression such as "a = a + b" will create, which will save a lot
> of time in the dispersion code.
>
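
The difference between allocating a result and reusing a buffer can be checked directly. The following is only a minimal sketch built on the arrays from the example above; the names c, allocates_new, result and reuses_buffer are made up for illustration and are not from the relax code.

```python
import numpy as np

a = np.zeros((5, 4))
a[1] = 1
a[:, 1] = 2
b = np.ones((5, 4))

# A plain "a + b" expression allocates a brand new array for the result.
c = a + b
allocates_new = not np.shares_memory(c, a)

# With the out argument, the sum is written into the existing buffer of a,
# and numpy.add() returns that same array object.
result = np.add(a, b, out=a)
reuses_buffer = result is a

print(allocates_new, reuses_buffer)
print(a)
```

Passing out as a keyword is equivalent to the positional form add(a, b, a) used in the quoted example.
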
> Regards,
>
> Edward
>
>
> On 10 June 2014 15:56, Edward d'Auvergne <[email protected]> wrote:
>> Hi Troels,
>>
>> Here is one suggestion, of many that I have, for significantly
>> improving the speed of the analytic dispersion models in your
>> 'disp_spin_speed' branch.  The speed ups you have currently achieved
>> for spin clusters are huge and very impressive.  But now that you have
>> the infrastructure in place, you can advance this much more!
>>
>> The suggestion has to do with the R20, R20A, and R20B numpy data
>> structures.  The way they are currently handled is relatively
>> inefficient, in that they are created de novo for each function call.
>> This means that memory allocation and Python garbage collection
>> happens for every single function call - something which should be
>> avoided at almost all costs.
>>
>> A better way to do this would be to have a self.R20_struct,
>> self.R20A_struct, and self.R20B_struct created in __init__(), and then
>> to pack in the values from the parameter vector into these structures.
>> You could create a special structure in __init__() for this.  It would
>> have the dimensions [r20_index][ei][si][mi][oi], where the first
>> dimension corresponds to the different R20 parameters.  And for each
>> r20_index element, you would have ones at the [ei][si][mi][oi]
>> positions where you would like R20 to be, and zeros elsewhere.  The
>> key is that this is created at the target function start up, and not
>> for each function call.
>>
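
A rough sketch of what such a one/zero structure might look like when set up once in __init__(). This is not the relax implementation: the class name TargetSketch, the attribute r20_ones, the dimension sizes, and the assumption of one R20 parameter per (spin, field) pair are all hypothetical and only serve the illustration.

```python
import numpy as np


class TargetSketch:
    """Hypothetical stand-in for the target function class."""

    def __init__(self, num_exp, num_spins, num_fields, num_offsets):
        # Assumption: one R20 parameter per (spin, field) pair.
        self.num_r20 = num_spins * num_fields
        shape = (num_exp, num_spins, num_fields, num_offsets)

        # One/zero structure with dimensions [r20_index][ei][si][mi][oi]:
        # ones where the given R20 parameter belongs, zeros elsewhere.
        self.r20_ones = np.zeros((self.num_r20,) + shape)
        for si in range(num_spins):
            for mi in range(num_fields):
                r20_index = si * num_fields + mi
                self.r20_ones[r20_index, :, si, mi, :] = 1.0

        # Allocated once here, then reused by every function call.
        self.R20_struct = np.zeros(shape)


target = TargetSketch(num_exp=1, num_spins=2, num_fields=3, num_offsets=4)
print(target.r20_ones.shape)
```

All of the allocation cost is paid once at start up, so the per-call work reduces to filling self.R20_struct.
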
>> This would be combined with the very powerful 'out' argument set to
>> self.R20_struct with the numpy.add() and numpy.multiply() functions to
>> prevent all memory allocations and garbage collection.  Masks could be
>> used, but I think that that would be much slower than having special
>> numpy structures with ones where R20 should be and zeros elsewhere.
>> To fill these structures, a single loop over r20_index, multiplying
>> each parameter by its [r20_index][ei][si][mi][oi] one/zero structure
>> and using numpy.add() and numpy.multiply() with out arguments, would
>> be much, much faster than masks or the current R20_axis logic.  It
>> will also simplify the code.
>>
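
The packing loop described above might be sketched as follows. Again this is only an illustration under assumptions: the sizes, the one R20 parameter per (spin, field) layout, the function name pack_r20 and the extra scratch buffer are hypothetical, not taken from the 'disp_spin_speed' branch.

```python
import numpy as np

# Hypothetical sizes, with one R20 parameter per (spin, field) pair.
num_exp, num_spins, num_fields, num_offsets = 1, 2, 3, 4
shape = (num_exp, num_spins, num_fields, num_offsets)
num_r20 = num_spins * num_fields

# Set up once (this part would live in __init__()).
r20_ones = np.zeros((num_r20,) + shape)
for si in range(num_spins):
    for mi in range(num_fields):
        r20_ones[si * num_fields + mi, :, si, mi, :] = 1.0
R20_struct = np.zeros(shape)
scratch = np.zeros(shape)  # Assumed helper buffer for the products.


def pack_r20(params):
    """Fill R20_struct from the parameter vector, reusing the buffers."""
    R20_struct.fill(0.0)
    for r20_index in range(num_r20):
        # scratch = one/zero structure * R20 value, written in place.
        np.multiply(r20_ones[r20_index], params[r20_index], out=scratch)
        # Accumulate into R20_struct without allocating a result array.
        np.add(R20_struct, scratch, out=R20_struct)
    return R20_struct


params = np.array([10.0, 11.0, 12.0, 20.0, 21.0, 22.0])
packed = pack_r20(params)
print(packed[0, :, :, 0])
```

The function can be called for every new parameter vector, and it always returns the same R20_struct object with updated values.
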
>> Regards,
>>
>> Edward
>
> _______________________________________________
> relax (http://www.nmr-relax.com)
>
> This is the relax-devel mailing list
> [email protected]
>
> To unsubscribe from this list, get a password
> reminder, or change your subscription options,
> visit the list information page at
> https://mail.gna.org/listinfo/relax-devel
