The issue is probably this line:

https://github.com/apache/arrow/blob/8b1c8118b017a941f0102709d72df7e5a9783aa4/cpp/src/arrow/python/python_to_arrow.cc#L504

which uses *PyList_Check* instead of *PyList_CheckExact*. *PyList_Check*
also returns true for subclasses of list, so they are treated as plain
lists. Changing it to the exact form should make subclasses of list go
through the custom serializer instead.
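
To illustrate the difference, here is a rough pure-Python analogue of the
two checks. This is only a sketch, not Arrow's actual code: MetadataList,
pick_path and the two helper functions are made-up names, and the real
dispatch happens in C++.

```python
class MetadataList(list):
    """Hypothetical list subclass carrying a metadata attribute."""

    def __init__(self, iterable=(), metadata=None):
        super().__init__(iterable)
        self.metadata = metadata


def check(obj):
    # Analogue of PyList_Check: true for list and for any subclass of list.
    return isinstance(obj, list)


def check_exact(obj):
    # Analogue of PyList_CheckExact: true only for the builtin list type.
    return type(obj) is list


def pick_path(obj, is_builtin_list):
    # Hypothetical dispatch: objects recognised as builtin lists take the
    # builtin path, everything else falls through to a custom serializer.
    return "builtin-list" if is_builtin_list(obj) else "custom-serializer"


plain = [1, 2, 3]
tagged = MetadataList([1, 2, 3], metadata={"source": "test"})

print(pick_path(plain, check), pick_path(tagged, check))
print(pick_path(plain, check_exact), pick_path(tagged, check_exact))
```

With the non-exact check the subclass takes the builtin path, which would
explain why the metadata is lost; with the exact check only a real list
does.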

On Sun, Mar 4, 2018 at 1:08 AM Mitar <mmi...@gmail.com> wrote:

> Hi!
>
> I have a subclass of numpy and another of pandas which add a metadata
> attribute to them. Moreover, I have a subclass of typing.List as a
> Python generic with this metadata attribute as well.
>
> Now, it seems if I serialize this to plasma store and back I get
> standard numpy, pandas, or list back, respectively.
>
> My question is: how can I make it so that proper subclasses are
> returned, including the custom metadata attribute?
>
> I tried to use pyarrow_lib._default_serialization_context.register_type
> but it does not seem to work. Moreover, I still worry that even if I
> create a serialization for a custom class, if anyone makes a subclass
> and tries to store it in plasma store they will get back the custom class
> and not a subclass.
>
> This is how I am testing:
>
>
> https://gitlab.com/datadrivendiscovery/metadata/blob/plasma/tests/test_plasma.py#L50
>
> And here is the code for custom numpy class and attempt at registering
> custom serialization:
>
>
> https://gitlab.com/datadrivendiscovery/metadata/blob/plasma/d3m_metadata/container/numpy.py#L135
>
> It looks like custom serialization is not called.
>
>
> Mitar
>
> --
> http://mitar.tnode.com/
> https://twitter.com/mitar_m
>
