Whatever I do, I cannot get this to work. Could you show an example?
2014-06-10 16:29 GMT+02:00 Edward d'Auvergne <[email protected]>:
> Here is an example of avoiding automatic numpy data structure creation
> and then garbage collection:
>
> """
> from numpy import add, ones, zeros
>
> a = zeros((5, 4))
> a[1] = 1
> a[:,1] = 2
>
> b = ones((5, 4))
>
> add(a, b, a)
> print(a)
> """
>
> The result is:
>
> [[ 1.  3.  1.  1.]
>  [ 2.  3.  2.  2.]
>  [ 1.  3.  1.  1.]
>  [ 1.  3.  1.  1.]
>  [ 1.  3.  1.  1.]]
>
> The out argument for numpy.add() is used here to operate in a similar
> way to the Python "+=" operation.  But it avoids the temporary numpy
> data structures that the Python "+=" operation will create.  This will
> save a lot of time in the dispersion code.
>
> Regards,
>
> Edward
>
>
> On 10 June 2014 15:56, Edward d'Auvergne <[email protected]> wrote:
>> Hi Troels,
>>
>> Here is one suggestion, of many that I have, for significantly
>> improving the speed of the analytic dispersion models in your
>> 'disp_spin_speed' branch.  The speed ups you have currently achieved
>> for spin clusters are huge and very impressive.  But now that you have
>> the infrastructure in place, you can advance this much more!
>>
>> The suggestion has to do with the R20, R20A, and R20B numpy data
>> structures.  The way they are currently handled is relatively
>> inefficient, in that they are created de novo for each function call.
>> This means that memory allocation and Python garbage collection
>> happens for every single function call - something which should be
>> avoided at almost all costs.
>>
>> A better way to do this would be to have a self.R20_struct,
>> self.R20A_struct, and self.R20B_struct created in __init__(), and then
>> to pack in the values from the parameter vector into these structures.
>> You could create a special structure in __init__() for this.  It would
>> have the dimensions [r20_index][ei][si][mi][oi], where the first
>> dimension corresponds to the different R20 parameters.  And for each
>> r20_index element, you would have ones at the [ei][si][mi][oi]
>> positions where you would like R20 to be, and zeros elsewhere.  The
>> key is that this is created at the target function start up, and not
>> for each function call.
>>
>> This would be combined with the very powerful 'out' argument set to
>> self.R20_struct with the numpy.add() and numpy.multiply() functions to
>> prevent all memory allocations and garbage collection.  Masks could be
>> used, but I think that that would be much slower than having special
>> numpy structures with ones where R20 should be and zeros elsewhere.
>> For just creating these structures, looping over a single r20_index
>> loop and multiplying by the special [r20_index][ei][si][mi][oi]
>> one/zero structure and using numpy.add() and numpy.multiply() with out
>> arguments would be much, much faster than masks or the current
>> R20_axis logic.  It will also simplify the code.
>>
>> Regards,
>>
>> Edward
>
> _______________________________________________
> relax (http://www.nmr-relax.com)
>
> This is the relax-devel mailing list
> [email protected]
>
> To unsubscribe from this list, get a password
> reminder, or change your subscription options,
> visit the list information page at
> https://mail.gna.org/listinfo/relax-devel
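
The packing scheme described in the quoted message could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the 'disp_spin_speed' branch: the class name TargetFunction, the method pack_r20(), the attributes r20_ones and scratch, the array sizes, and the rule that each spin si has its own R20 parameter are all hypothetical. Only self.R20_struct, the [r20_index][ei][si][mi][oi] layout, and the use of numpy.add() and numpy.multiply() with the out argument come from the message.

```python
import numpy as np


class TargetFunction:
    """Hypothetical target function holding preallocated R20 structures."""

    def __init__(self, n_exp=1, n_spins=2, n_fields=1, n_offsets=3):
        # Assumed dimensions [ei][si][mi][oi]; the sizes are made up.
        shape = (n_exp, n_spins, n_fields, n_offsets)

        # Assumption for this sketch: one R20 parameter per spin.
        self.n_r20 = n_spins

        # One/zero structure [r20_index][ei][si][mi][oi], built once only.
        self.r20_ones = np.zeros((self.n_r20,) + shape)
        for r20_index in range(self.n_r20):
            self.r20_ones[r20_index, :, r20_index, :, :] = 1.0

        # Output and scratch arrays, allocated once and reused in every call.
        self.R20_struct = np.zeros(shape)
        self.scratch = np.zeros(shape)

    def pack_r20(self, params):
        """Pack the R20 values from the parameter vector into R20_struct."""
        # Reset in place, without creating a new array.
        self.R20_struct.fill(0.0)

        for r20_index in range(self.n_r20):
            # scratch = ones/zeros * parameter value, written into scratch.
            np.multiply(self.r20_ones[r20_index], params[r20_index],
                        out=self.scratch)
            # R20_struct += scratch, written into the existing R20_struct.
            np.add(self.R20_struct, self.scratch, out=self.R20_struct)

        return self.R20_struct


tf = TargetFunction()
first = tf.pack_r20([5.0, 7.0])
print(first[0, 0, 0, 0], first[0, 1, 0, 2])
```

Each call to pack_r20() writes into the same self.R20_struct array, so no new full-size array is created per function call. As a side note, for numpy arrays "a += b" is also performed in place; it is the form "a = a + b" that allocates a new array.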

