Nathaniel,

On Tue, Jan 3, 2012 at 2:02 PM, Nathaniel Smith <n...@pobox.com> wrote:
> On Tue, Jan 3, 2012 at 9:46 AM, Ognen Duzlevski <og...@enthought.com> wrote:
>> Hello,
>>
>> I am playing with adding an enum dtype to numpy (to get my feet wet in
>> numpy really). I have looked at
>> https://github.com/martinling/numpy_quaternion and I feel comfortable
>> with my understanding of adding a simple type to numpy in technical
>> terms.
>
> Hi Ognen,
>
> I'm in the middle of an intercontinental move, so I can't help much,
> but I'd also love to see a proper enum/categorical type in numpy, so
> here are a few notes:
>
> - I wrote a simple cython implementation of this last year, which
> might be useful -- code attached.
>
> - The barrier I ran into, which you'll surely run into as well, is a
> flaw in the ufunc API in numpy. Currently, ufunc inner loops do not
> have any way to access the dtype of the array they are being called
> on. For most dtypes, this isn't an issue -- the inner loop for adding
> together int32's knows that it is being called on an array of int32's,
> it doesn't need to see the dtype to figure that out. But with enums,
> each array has a different set of possible categories, and these will
> be attached to the dtype object somehow. So if you want to do, say,
> equality comparison between an enum-array and a string-array:
>   np.enumarray(["a", "b", "c"]) == ["a", "c", "b"]
>     -> np.array([True, False, False])
> ...you can't actually make this work in current numpy. The solution is
> that the ufunc API needs to be changed to make dtypes somehow
> available to inner loops. (Probably by passing a pointer to the array
> object, like all the PyArray_ArrFuncs do.)
>
> See this thread:
> http://mail.scipy.org/pipermail/numpy-discussion/2010-August/052401.html
>
> - Both the statistical folk (pandas, statsmodels) and the hdf5 folk
> (pytables, h5py) have reasons to want better enum support. (Maybe
> there are other use cases too -- anyone I'm forgetting?) You should
> talk to both groups to make sure what you come up with
> will work for them.
>
> Cheers,
> -- Nathaniel
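
For concreteness, the per-array category problem described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the attached cython code and not an existing numpy API: the `EnumArray` class and its `categories` and `codes` attributes are hypothetical names. It stores integer codes plus a per-array label table, and the equality comparison only works because it can read that table, which is exactly the state a ufunc inner loop cannot currently reach.

```python
import numpy as np


class EnumArray:
    """Hypothetical stand-in for an enum-typed array (not a numpy API).

    Holds integer codes plus a per-array table of category labels,
    i.e. the metadata a real enum dtype would have to carry.
    """

    def __init__(self, values):
        # np.unique returns the sorted distinct labels; with
        # return_inverse=True it also returns each value's index
        # into that sorted label array.
        categories, codes = np.unique(np.asarray(values), return_inverse=True)
        self.categories = categories
        self.codes = codes.astype(np.intp)

    def __eq__(self, other):
        # Translate the plain strings into this array's codes. This step
        # needs self.categories, the per-array state that a ufunc inner
        # loop has no way to see today.
        lookup = {label: code for code, label in enumerate(self.categories.tolist())}
        labels = np.asarray(other).tolist()
        # Labels that are not a known category get -1, which never matches.
        other_codes = np.array([lookup.get(s, -1) for s in labels], dtype=np.intp)
        return self.codes == other_codes


result = EnumArray(["a", "b", "c"]) == ["a", "c", "b"]
print(result)
```

Two `EnumArray` objects built from different label sets would assign different codes to the same string, so comparing raw codes without the label table would give wrong answers.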

Thanks! The above input is exactly what I was looking for (in addition to my original question). This "corner case" knowledge is good to have ;)

Ognen