Re: Speed up suggestion for task #7807.

Edward d'Auvergne Tue, 10 Jun 2014 11:58:34 -0700

Hi,

I'll have a look tomorrow but, as you've probably seen, some of the
fine details such as indices to be used need to be sorted out when
implementing this.


Regards,

Edward


On 10 June 2014 20:49, Troels Emtekær Linnet <[email protected]> wrote:
> Whatever I do, I cannot get this to work.
>
> Can you show an example?
>
> 2014-06-10 16:29 GMT+02:00 Edward d'Auvergne <[email protected]>:
>> Here is an example of avoiding the automatic creation, and subsequent
>> garbage collection, of numpy data structures:
>>
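>> """
>> from numpy import add, ones, zeros
>>
>> # Row 1 is set to 1, then column 1 is set to 2.
>> a = zeros((5, 4))
>> a[1] = 1
>> a[:,1] = 2
>>
>> b = ones((5, 4))
>>
>> # Write the sum into the existing array a via the out argument.
>> add(a, b, out=a)
>> print(a)
>> """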
>>
>> The result is:
>>
>> [[ 1.  3.  1.  1.]
>>  [ 2.  3.  2.  2.]
>>  [ 1.  3.  1.  1.]
>>  [ 1.  3.  1.  1.]
>>  [ 1.  3.  1.  1.]]
>>
>> The out argument for numpy.add() is used here to operate in a similar
>> way to the Python "+=" operation: the sum is written directly into the
>> existing array.  This avoids the temporary numpy data structure that
>> an expression such as "a = a + b" would create, which will save a lot
>> of time in the dispersion code.
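
A minimal sketch, assuming only standard numpy calls, of how to check that the out argument reuses the existing array while a plain "a + b" allocates a new one.  The names a_before, c, and result are illustrative and do not come from the dispersion code.

```python
from numpy import add, array_equal, ones, shares_memory, zeros

a = zeros((5, 4))
a[1] = 1
a[:, 1] = 2
b = ones((5, 4))

# Keep a second reference to the original array so reuse can be checked.
a_before = a

# Out-of-place: "a + b" allocates a new array for the result.
c = a + b

# In-place: the result is written into the existing array a.
result = add(a, b, out=a)

# The returned array is the very same object that was passed as out.
same_object = result is a_before

# The out-of-place result lives in separate memory.
new_allocation = not shares_memory(c, a)
```

Both same_object and new_allocation are True here, and a ends up holding the same values as c.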
>>
>> Regards,
>>
>> Edward
>>
>>
>> On 10 June 2014 15:56, Edward d'Auvergne <[email protected]> wrote:
>>> Hi Troels,
>>>
>>> Here is one suggestion, of many that I have, for significantly
>>> improving the speed of the analytic dispersion models in your
>>> 'disp_spin_speed' branch.  The speed ups you have currently achieved
>>> for spin clusters are huge and very impressive.  But now that you have
>>> the infrastructure in place, you can advance this much more!
>>>
>>> The suggestion has to do with the R20, R20A, and R20B numpy data
>>> structures.  The way they are currently handled is relatively
>>> inefficient, in that they are created de novo for each function call.
>>> This means that memory allocation and Python garbage collection
>>> happen for every single function call - something which should be
>>> avoided at almost all costs.
>>>
>>> A better way to do this would be to have a self.R20_struct,
>>> self.R20A_struct, and self.R20B_struct created in __init__(), and then
>>> to pack the values from the parameter vector into these structures.
>>> You could create a special structure in __init__() for this.  It would
>>> have the dimensions [r20_index][ei][si][mi][oi], where the first
>>> dimension corresponds to the different R20 parameters.  And for each
>>> r20_index element, you would have ones at the [ei][si][mi][oi]
>>> positions where you would like R20 to be, and zeros elsewhere.  The
>>> key is that this is created at the target function start up, and not
>>> for each function call.
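
A rough sketch of how such a one/zero structure might be built once at start up.  The dimension sizes, the function name build_r20_template(), and the assumption of one R20 parameter per (spin, frequency) pair ordered as si * nm + mi are all hypothetical and not taken from the relax code.

```python
from numpy import zeros

# Hypothetical dimension sizes: experiments, spins, spectrometer
# frequencies, and offsets.
NE, NS, NM, NO = 1, 3, 2, 1


def build_r20_template(ne, ns, nm, no):
    """Create the one/zero structure with dimensions [r20_index][ei][si][mi][oi]."""
    template = zeros((ns * nm, ne, ns, nm, no))
    for si in range(ns):
        for mi in range(nm):
            # Assumed ordering: one R20 parameter per (spin, frequency) pair.
            r20_index = si * nm + mi

            # Ones at every position where this R20 parameter belongs.
            template[r20_index, :, si, mi, :] = 1.0
    return template


# Built once, for example in __init__(), and never per function call.
r20_template = build_r20_template(NE, NS, NM, NO)
```

With this ordering, every [ei][si][mi][oi] position is claimed by exactly one r20_index element, so summing the structure over its first dimension gives an array of ones.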
>>>
>>> This would be combined with the very powerful 'out' argument, set to
>>> self.R20_struct, of the numpy.add() and numpy.multiply() functions to
>>> prevent all memory allocations and garbage collection.  Masks could be
>>> used, but I think that would be much slower than having special
>>> numpy structures with ones where R20 should be and zeros elsewhere.
>>> To fill these structures, a single loop over r20_index, multiplying
>>> each parameter value by the special [r20_index][ei][si][mi][oi]
>>> one/zero structure and using numpy.add() and numpy.multiply() with out
>>> arguments, would be much, much faster than masks or the current
>>> R20_axis logic.  It will also simplify the code.
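
One possible shape for the packing step, as a sketch only.  The class TargetFunction, the method pack_r20(), the scratch array, and the parameter ordering are hypothetical stand-ins, and the real target function would need its own index handling.

```python
from numpy import add, multiply, zeros


class TargetFunction:
    """Hypothetical stand-in for a target function class, not relax's code."""

    def __init__(self, ne=1, ns=2, nm=2, no=1):
        # One/zero template with dimensions [r20_index][ei][si][mi][oi],
        # assuming one R20 parameter per (spin, frequency) pair.
        self.num_r20 = ns * nm
        self.r20_template = zeros((self.num_r20, ne, ns, nm, no))
        for si in range(ns):
            for mi in range(nm):
                self.r20_template[si * nm + mi, :, si, mi, :] = 1.0

        # Structures allocated once and reused for every function call.
        self.R20_struct = zeros((ne, ns, nm, no))
        self.scratch = zeros((ne, ns, nm, no))

    def pack_r20(self, params):
        """Write the R20 values from the parameter vector into self.R20_struct."""
        # Reset in place rather than creating a new array.
        self.R20_struct[:] = 0.0
        for r20_index in range(self.num_r20):
            # scratch = template * value, written into the existing array.
            multiply(self.r20_template[r20_index], params[r20_index],
                     out=self.scratch)
            # R20_struct = R20_struct + scratch, again written in place.
            add(self.R20_struct, self.scratch, out=self.R20_struct)
        return self.R20_struct


tf = TargetFunction()
packed = tf.pack_r20([10.0, 11.0, 20.0, 21.0])
```

Repeated calls to pack_r20() return the same self.R20_struct object each time, so no new result array is created per function call.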
>>>
>>> Regards,
>>>
>>> Edward
>>
>> _______________________________________________
>> relax (http://www.nmr-relax.com)
>>
>> This is the relax-devel mailing list
>> [email protected]
>>
>> To unsubscribe from this list, get a password
>> reminder, or change your subscription options,
>> visit the list information page at
>> https://mail.gna.org/listinfo/relax-devel

