Fri, 11 Jun 2010 15:31:45 +0200, Sturla Molden wrote:
[clip]
>> The innermost dimension is handled via the ufunc loop, which is a
>> simple for loop with constant-size step and is given a number of
>> iterations. The array iterator objects are used only for stepping
>> through the outer dimensions. That is, it essentially steps through
>> your dtype** array, without explicitly constructing it.
>
> Yes, exactly my point. And because the iterator does not explicitly
> construct the array, it sucks for parallel programming (e.g. with
> OpenMP):
>
> - The iterator becomes a bottleneck to which access must be serialized
>   with a mutex.
> - We cannot do proper work scheduling (load balancing)

I don't necessarily agree: you can do

    for parallelized outer loop {
        critical section {
            p = get iterator pointer
            ++iterator
        }
        inner loop in region `p`
    }

This does allow load balancing etc., as a free processor can
immediately grab the next available slice. Also, it would be easier to
implement with OpenMP pragmas in the current code base.

Of course, the assumption here is that the outer iterator overhead is
small compared to the duration of the inner loop. This must then be
compared to the memory access overhead involved in the dtype** array.

-- 
Pauli Virtanen
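
For readers who want to try the scheduling pattern from the pseudocode above, here is a rough, runnable sketch in plain Python. It is not NumPy's ufunc machinery and not OpenMP: the function name `parallel_outer_loop`, the `rows`/`inner` parameters, and the use of `threading.Lock` as the critical section are all assumptions made for illustration. The shared `enumerate` object stands in for the outer array iterator, and each worker advances it only while holding the lock, then runs the inner loop outside the lock.

```python
import threading


def parallel_outer_loop(rows, inner, n_workers=4):
    """Apply `inner` to every row; workers pull rows from one shared iterator.

    Hypothetical helper: `rows` plays the role of the outer dimensions,
    `inner` plays the role of the inner (ufunc-style) loop.
    """
    it = enumerate(rows)         # stand-in for the outer array iterator
    lock = threading.Lock()      # stand-in for the critical section
    results = [None] * len(rows)

    def worker():
        while True:
            with lock:
                # Critical section: fetch the current position and
                # advance the shared iterator in one step.
                item = next(it, None)
            if item is None:
                return           # iterator exhausted, nothing left to grab
            i, row = item
            # Inner loop runs outside the lock, so other workers can
            # grab the next slice in the meantime.
            results[i] = inner(row)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Usage: each "row" is one slice handed to the inner loop.
totals = parallel_outer_loop([[1, 2, 3], [4, 5], [6]], sum)
# totals == [6, 9, 6]
```

Because a free worker takes the next slice as soon as it finishes, uneven rows balance themselves, which is the load-balancing point made above. Note that under CPython's global interpreter lock a pure-Python `inner` will not actually execute in parallel, so this only illustrates the structure; the cost argument (lock overhead versus inner-loop duration) is the same as in the OpenMP case.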