On Mon, 2023-05-29 at 10:55 +1000, Juan Nunez-Iglesias wrote:
> Hi folks,
> 
> Apologies if this is documented somewhere, but I haven't been able to
> find it. I've read through NEP-42 [1] and skimmed NEP-41 [2], but I'm
> not sure:
> 
> (a) at what point of implementation we are, and
> (b) if it's pretty much done, *how* to define a custom categorical
> dtype.
> 
> In my use case, I'd need a dtype that is implemented as some int
> scalar where only certain values are allowed, ie the NumPy equivalent
> of:
> 
> class Label(Enum):
>     CAR = 1
>     DOG = 45
>     NULL = 255
> 
> But with the ability to specify that I only need a uint8 in this
> case.
> 
> Is that possible today using Python (no C/Cython) and if so, is there
> some documentation or user example or StackOverflow answer that shows
> how to do this? If not, is it a design goal of the NEPs to allow such
> a thing? (I can be patient 😂)


The NEP is pretty far along and we have some examples of use here:
https://github.com/numpy/numpy-user-dtypes

There are still kinks to be ironed out though, and nobody has tried
"categorical" type functionality yet.

However, without C/Cython it is not possible at this time. What we need
is a Categorical or Enum DType implemented in C, which would then allow
creating the specific `LabelDType` in Python. [1]

On the other hand, writing that single C implementation for a minimal
`IntEnum` DType factory is likely quite reasonably scoped.
(As a prototype implementation, but I expect adapting to a final
version should be smooth.)
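
Until such a DType exists, a pure-Python stopgap is to keep the data in
a plain uint8 array and validate the codes by hand. The sketch below is
not a DType and uses none of the NEP 42 machinery; the `as_labels`
helper and the `ALLOWED` array are hypothetical names made up for
illustration, and only `Label` comes from the question above (as an
`IntEnum` here).

```python
from enum import IntEnum

import numpy as np


class Label(IntEnum):
    CAR = 1
    DOG = 45
    NULL = 255


# Allowed integer codes, stored as uint8 (hypothetical helper array).
ALLOWED = np.array([member.value for member in Label], dtype=np.uint8)


def as_labels(values):
    """Return `values` as a uint8 array; raise if any code is not a Label.

    Hypothetical helper: validation happens once here, not on later writes.
    """
    arr = np.asarray(values)
    bad = ~np.isin(arr, ALLOWED)
    if bad.any():
        raise ValueError(f"invalid label codes: {np.unique(arr[bad]).tolist()}")
    # Safe to cast: every remaining value is one of the uint8 codes.
    return arr.astype(np.uint8)


labels = as_labels([1, 45, 255, 1])
# Map codes back to enum names when needed.
names = [Label(code).name for code in labels.tolist()]
```

The array only checks values at construction, so later in-place
assignments can still store codes that are not labels; enforcing that is
what a real DType would add.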

- Sebastian


[1] Maybe as a DType factory in C to create arbitrary `IntEnum`-likes,
maybe as a parametric DType. I suspect the first is the right way, but
it may be tedious or even very hard right now; that is a kink that needs
ironing out eventually. Python 3.12 has some fixes around metaclass
instantiation in C (with backcompat hacks) which hopefully make this
less of a drain on sanity.


> 
> Thank you!
> 
> Juan.
> 
> [1]: https://numpy.org/neps/nep-0042-new-dtypes.html
> [2]: https://numpy.org/neps/nep-0041-improved-dtype-support.html
> _______________________________________________
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: sebast...@sipsolutions.net


