On Thu, 10 Feb 2011 22:38:52 +0200, eat wrote:
[clip]
> I hope so. Please suggest if there's anything that I can do to further
> advance this. (My C skills are already a bit rusty, but at any higher
> level I'll try my best to contribute).

If someone wants to try to improve the situation, here's a possible plan 
of attack:

  1. Check first if the bottleneck is in the inner reduction loop 
(function DOUBLE_add in loops.c.src:712) or in the outer iteration 
(function PyUFunc_ReductionOp in ufunc_object.c:2781).

  2. If it's in the inner loop, some optimizations are possible, e.g. 
specialized cases for sizeof(item) strides. Think about how to add them 
cleanly.

  3. If it's in the outer iteration, try to think of ways to make it 
faster. This will be a messier problem to solve.
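
For step 1, a rough first check can be done from Python before touching 
the C code. The sketch below is only an illustration, not a substitute for 
profiling DOUBLE_add and PyUFunc_ReductionOp directly: it sums arrays with 
the same total number of elements but different lengths along the reduced 
axis. The helper names (time_reduction, profile_shapes) and the sizes are 
made up for this example. If the cost per element grows sharply as the 
inner length shrinks, that hints at per-reduction overhead in the outer 
iteration; if it stays roughly flat, the inner loop is the more likely 
suspect. Results will depend on the NumPy version and the machine.

```python
import timeit

import numpy as np


def time_reduction(n_outer, n_inner, repeat=5, number=20):
    """Best time per call, in seconds, of summing an
    (n_outer, n_inner) float64 array over its last axis."""
    a = np.ones((n_outer, n_inner), dtype=np.float64)
    timer = timeit.Timer(lambda: np.add.reduce(a, axis=1))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def profile_shapes(total=2**16, inner_sizes=(2, 16, 128, 1024)):
    """Map each inner length to the measured seconds per element,
    keeping the total number of elements fixed (hypothetical sizes)."""
    results = {}
    for k in inner_sizes:
        per_call = time_reduction(total // k, k)
        results[k] = per_call / total
    return results


if __name__ == "__main__":
    for k, per_elem in profile_shapes().items():
        print(f"inner length {k:5d}: {per_elem * 1e9:8.2f} ns per element")
```

This only narrows things down indirectly; a C-level profiler pointed at 
the two functions named in step 1 would give a more direct answer.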

-- 
Pauli Virtanen

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
