On Mon, Apr 12, 2010 at 17:26, Travis Oliphant <oliph...@enthought.com> wrote:
>
> On Apr 11, 2010, at 2:56 PM, Anne Archibald wrote:
>
> 2010/4/10 Stéfan van der Walt <ste...@sun.ac.za>:
>
> On 10 April 2010 19:45, Pauli Virtanen <p...@iki.fi> wrote:
>
> Another addition to ufuncs that should be thought about is specifying the
> Python-side interface to generalized ufuncs.
>
> This is an interesting idea; what do you have in mind?
>
> I can see two different kinds of answer to this question: one is a
> tool like vectorize/frompyfunc that allows construction of generalized
> ufuncs from python functions, and the other is thinking out what
> methods and support functions generalized ufuncs need.
>
> The former would be very handy for prototyping gufunc-based libraries
> before delving into the templated C required to make them actually
> efficient.
>
> The latter is more essential in the long run: it'd be nice to have a
> reduce-like function, but obviously only when the arity and dimensions
> work out right (which I think means (shape1,shape2)->(shape2) ). This
> could be applied along an axis or over a whole array. reduceat and the
> other, more sophisticated, schemes might also be worth supporting. At
> a more elementary level, gufunc objects should have good introspection
> - docstrings, shape specification accessible from python, named formal
> arguments, et cetera. (So should ufuncs, for that matter.)
>
> We should collect all of these proposals into a NEP.
>
> To clarify what I mean by "group-by" behavior: suppose I have an array
> of floats and an array of integers. Each element in the array of
> integers represents a region in the float array of a certain "kind".
> The reduction should take place over like-kind values.
>
> Example:
>
>   add.reduceby(array=[1,2,3,4,5,6,7,8,9], by=[0,1,0,1,2,0,0,2,2])
>
> results in the calculations:
>
>   1 + 3 + 6 + 7
>   2 + 4
>   5 + 8 + 9
>
> and therefore the output (notice the two arrays; perhaps a structured
> array should be returned instead...)
>
>   [0,1,2],
>   [17, 6, 22]
>
> The real value is when you have tabular data and you want to do reductions
> in one field based on values in another field. This happens all the time
> in relational algebra and would be a relatively straightforward thing to
> support in ufuncs.
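For anyone who wants to experiment with the group-by semantics quoted above, here is a minimal sketch using only existing NumPy calls (argsort, unique, and ufunc.reduceat). The name `reduceby` and its signature are hypothetical; no such ufunc method exists, and this is only one possible way to emulate the proposal.

```python
import numpy as np

def reduceby(ufunc, array, by):
    """Hypothetical helper (not a NumPy API): reduce `array` with `ufunc`
    over groups of elements that share the same value in `by`."""
    array = np.asarray(array)
    by = np.asarray(by)
    # A stable sort makes like-kind values contiguous while keeping their order.
    order = np.argsort(by, kind="stable")
    sorted_by = by[order]
    sorted_vals = array[order]
    # In the sorted key array, the first occurrence of each key is the start of its run.
    keys, starts = np.unique(sorted_by, return_index=True)
    # reduceat reduces each slice [starts[i]:starts[i+1]], and the last one to the end.
    return keys, ufunc.reduceat(sorted_vals, starts)

keys, sums = reduceby(np.add, [1, 2, 3, 4, 5, 6, 7, 8, 9],
                      [0, 1, 0, 1, 2, 0, 0, 2, 2])
print(keys, sums)  # [0 1 2] [17  6 22]
```

This returns the two arrays from the example; packing them into a structured array would be a separate step.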
I might suggest a simplification where the by array must be an array of non-negative ints such that they are indices into the output. For example (note that I replace 2 with 3 and have no 2s in the by array):

  add.reduceby(array=[1,2,3,4,5,6,7,8,9], by=[0,1,0,1,3,0,0,3,3]) == [17, 6, 0, 22]

This basically generalizes bincount() to other binary ufuncs.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
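The index-based simplification can be sketched with ufunc.at, which applies a binary ufunc in place at possibly repeated indices. The function name `reduceby_index` is hypothetical, and filling empty bins with the ufunc's identity is my assumption about what the output should hold for indices that never occur (it matches the 0 in the add example).

```python
import numpy as np

def reduceby_index(ufunc, array, by):
    """Hypothetical helper (not a NumPy API): out[i] is the reduction of
    all array[j] with by[j] == i; unused bins keep the ufunc's identity."""
    array = np.asarray(array)
    by = np.asarray(by)
    if by.size and by.min() < 0:
        raise ValueError("`by` must contain non-negative integers")
    if ufunc.identity is None:
        # Assumption: without an identity there is no obvious fill for empty bins.
        raise ValueError("ufunc has no identity to fill empty bins")
    # One output slot per possible index, pre-filled with the identity.
    out = np.full(by.max() + 1, ufunc.identity, dtype=array.dtype)
    # Unbuffered in-place update, so repeated indices accumulate correctly.
    ufunc.at(out, by, array)
    return out

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
by = [0, 1, 0, 1, 3, 0, 0, 3, 3]
print(reduceby_index(np.add, data, by))  # [17  6  0 22]
```

For np.add the result agrees with np.bincount(by, weights=data), apart from bincount returning floats.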