Anne Archibald wrote:
> On 26/10/2007, Georg Holzmann <[EMAIL PROTECTED]> wrote:
>
>   
>> if in that example I also change the strides:
>>
>>    int s = tmp->strides[1];
>>    tmp->strides[0] = s;
>>    tmp->strides[1] = s * dim0[0];
>>
>> Then I get in python the fortran-style array in right order.
>>     
>
> This is the usual way. More or less, at least. numpy is designed from
> the start to handle arrays with arbitrary striding; this is how slices
> are implemented, for example. There will be no major performance hit
> from numpy code itself. The actual organization of data in memory will
> of course affect the speed at which your code runs. The flags, as you
> discovered, are just a performance optimization, so that code that
> needs arrays organized as C- or FORTRAN-standard doesn't need to check
> the strides every time.
>
> I don't think numpy's loops - for example in ones((100,100))+eye(100)
> - are smart about doing operations in an order that makes
> cache-coherent use of memory. The important exception is the loops
> that use ATLAS, which I think is mostly the dot() function.
>
>   
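To make the stride swap quoted above concrete, here is a minimal pure-Python sketch of the addressing arithmetic. It only imitates how shape and byte strides map an index to a memory offset and is not numpy's implementation. The helper names (`offset`, `view`) and the 8-byte item size are assumptions chosen for illustration.

```python
# Pure-Python sketch of strided addressing. This imitates the arithmetic
# only; the helper names and the item size are illustrative assumptions.

def offset(index, strides):
    """Byte offset of a multi-dimensional index under the given strides."""
    return sum(i * s for i, s in zip(index, strides))


def view(buffer, shape, strides, itemsize):
    """Read a flat buffer as a 2-D nested list using shape and byte strides."""
    rows, cols = shape
    return [
        [buffer[offset((i, j), strides) // itemsize] for j in range(cols)]
        for i in range(rows)
    ]


itemsize = 8                 # assumed element size in bytes (a double)
shape = (2, 3)
buffer = [0, 1, 2, 3, 4, 5]  # six elements stored back to back

# C order: the last index varies fastest.
c_strides = (shape[1] * itemsize, itemsize)

# The swap from the quoted snippet: reuse the old innermost stride for
# dimension 0, so the same memory is read in Fortran order.
s = c_strides[1]
f_strides = (s, s * shape[0])

c_view = view(buffer, shape, c_strides, itemsize)
f_view = view(buffer, shape, f_strides, itemsize)

print(c_view)
print(f_view)
```

No data is moved here. Only the strides change, which is why the same buffer can be presented in either order.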
There is an optimization wherein the inner loops are done over the 
dimension with the smallest stride. 
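As a rough illustration of that idea (not the actual numpy loop code), the sketch below orders the axes of a 2-D array by stride so that the innermost loop walks the smallest stride. The function names `loop_order` and `visit_offsets` are hypothetical, and the shape and strides are assumed values for a 2x3 array of 8-byte items.

```python
# Sketch only: pick the loop nesting from the strides so that the inner
# loop steps through memory in the smallest increments. The names are
# hypothetical and do not correspond to numpy internals.

def loop_order(strides):
    """Axis indices from outermost to innermost (largest stride first)."""
    return sorted(range(len(strides)),
                  key=lambda ax: abs(strides[ax]),
                  reverse=True)


def visit_offsets(shape, strides):
    """Byte offsets in the order a two-level loop would touch them."""
    outer, inner = loop_order(strides)
    visited = []
    for a in range(shape[outer]):
        for b in range(shape[inner]):
            visited.append(a * strides[outer] + b * strides[inner])
    return visited


shape = (2, 3)
c_strides = (24, 8)   # C order, 8-byte items (assumed)
f_strides = (8, 16)   # Fortran order for the same buffer (assumed)

# With the stride-sorted nesting, both layouts are walked sequentially.
c_walk = visit_offsets(shape, c_strides)
f_walk = visit_offsets(shape, f_strides)

# A fixed nesting (axis 0 outer, axis 1 inner) jumps around in memory
# when applied to the Fortran-ordered layout.
naive_f_walk = [i * f_strides[0] + j * f_strides[1]
                for i in range(shape[0]) for j in range(shape[1])]

print(c_walk)
print(f_walk)
print(naive_f_walk)
```

For these two layouts the stride-sorted walk touches consecutive addresses, while the fixed nesting skips back and forth on the Fortran-ordered array.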

What other cache-coherent optimizations do you recommend?

-Travis

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
