On Monday 10 January 2011 19:29:33 Mark Wiebe wrote:
> > so, the new code is just < 5% slower.  I suppose that removing the
> > NPY_ITER_ALIGNED flag would give us a bit more performance, but
> > that's great as it is now.  How did you do that?  Your new_iter
> > branch in NumPy already deals with unaligned data, right?
> 
> Take a look at lowlevel_strided_loops.c.src.  In this case, the
> buffering setup code calls PyArray_GetDTypeTransferFunction, which
> in turn calls PyArray_GetStridedCopyFn, which on an x86 platform
> returns _aligned_strided_to_contig_size8.  This function has a
> simple loop of copies using an npy_uint64 data type.
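
For anyone following along, here is a rough NumPy-level picture of the
case described above: an aligned, strided source of 8-byte items copied
into a contiguous buffer.  This is only a sketch of the situation, not
of the C internals; the helper name strided_to_contig is made up for
illustration, and which low-level loop NumPy actually selects for such
a copy is not something this snippet can show.

```python
import numpy as np

def strided_to_contig(src):
    """Copy a (possibly strided) view into a fresh contiguous buffer.

    Hypothetical helper for illustration only; not part of NumPy.
    """
    dst = np.empty(src.shape, dtype=src.dtype)  # contiguous destination
    np.copyto(dst, src)                         # element-wise copy
    return dst

base = np.arange(10, dtype=np.uint64)
view = base[::2]               # 8-byte items, 16-byte stride, still aligned
out = strided_to_contig(view)  # contiguous copy of every second item
```
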

I see.  Brilliant!

> > Well, if you can support reduce operations with your patch that
> > would be extremely good news as I'm afraid that the current reduce
> > code is a bit broken in Numexpr (at least, I vaguely remember
> > seeing it working badly in some cases).
> 
> Cool, I'll take a look at some point.  I imagine with the most
> obvious implementation small reductions would perform poorly.

IMO, reductions like sum() or prod() are mainly limited by memory
access, so my advice would be not to over-optimize here, and just
make use of the new iterator.  We can refine performance later on.
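
As a minimal sketch of what I mean, assuming the Python-level np.nditer
interface to the new iterator: a full sum can simply walk the operand in
buffered chunks and accumulate.  The function name chunked_sum and the
default buffersize are arbitrary choices for illustration; this is not
how Numexpr implements reductions.

```python
import numpy as np

def chunked_sum(a, buffersize=8192):
    """Sum all elements of `a` one buffered chunk at a time.

    Illustrative only: the name and the default buffer size are assumptions.
    """
    total = 0.0
    it = np.nditer(a,
                   flags=['external_loop', 'buffered'],
                   op_flags=['readonly'],
                   buffersize=buffersize)
    for chunk in it:
        # Each chunk is a 1-D array of at most `buffersize` elements.
        total += chunk.sum()
    return total

# A non-contiguous operand: every second column of a 10x100 array.
a = np.arange(1000, dtype=np.float64).reshape(10, 100)[:, ::2]
result = chunked_sum(a, buffersize=64)
```

The iterator takes care of the strided layout, so the inner work is just
one pass over memory per chunk, which is where the time goes anyway.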

-- 
Francesc Alted
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
