Efficient Pandas serialization for mixed object and numeric DataFrames

2018-10-18 Thread Mitar
Hi! It seems that if a DataFrame contains both numeric and object columns, the whole DataFrame is pickled, rather than only the object columns. Is this right? Are there any plans to improve this? Mitar -- http://mitar.tnode.com/ https://twitter.com/mitar_m

Support for numpy matrix

2019-03-30 Thread Mitar
Hi! It seems numpy's matrix is not supported in recent versions of pyarrow: https://issues.apache.org/jira/browse/ARROW-3399 Any ideas why this would be happening? Mitar -- http://mitar.tnode.com/ https://twitter.com/mitar_m

Re: Support for numpy matrix

2019-03-30 Thread Mitar
Hi! I do not know where to start looking into this. I am not sure I have enough knowledge about Arrow to be able to make a PR. Mitar On Sat, Mar 30, 2019 at 3:17 PM Wes McKinney wrote: > > hi Mitar, > > I see you reported the issue on October 2 and no one has volunteered > to fix

Re: Support for numpy matrix

2019-03-30 Thread Mitar
here. An interesting fact is that this worked in older versions of numpy, but stopped in numpy 1.15.2. It works with numpy 1.14.3. So the change appears to be on the numpy side. Mitar On Sat, Mar 30, 2019 at 3:34 PM Philipp Moritz wrote: > > Hey Mitar, > > It might be as simple as adding a

Re: Support for numpy matrix

2019-04-01 Thread Mitar
Hi! I agree. In fact, all of this information is already there. :-) Mitar On Sat, Mar 30, 2019 at 8:40 PM Wes McKinney wrote: > > hi Mitar, > > Let's discuss further on JIRA? It's best to keep all the information > about the issue in one place. > > Thanks &

How to properly serialize subclasses of supported classes

2018-03-04 Thread Mitar
attempt at registering custom serialization: https://gitlab.com/datadrivendiscovery/metadata/blob/plasma/d3m_metadata/container/numpy.py#L135 It looks like custom serialization is not called. Mitar -- http://mitar.tnode.com/ https://twitter.com/mitar_m

Re: How to properly serialize subclasses of supported classes

2018-03-05 Thread Mitar
serialization for ndarray but the existing ndarray serialization would work, casting it into a proper subclass. Mitar On Sun, Mar 4, 2018 at 2:39 PM, Robert Nishihara wrote: > The issue is probably this line > > https://github.com/apache/arrow/blob/8b1c8118b017a941f0102709d72df7e5a9783aa4/cpp/
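
The general problem in this thread (a serializer registered for a subclass is skipped because handling for the base class matches first) can be illustrated with a small pure-Python sketch. It is not pyarrow's implementation and does not reproduce the linked C++ code. `SerializerRegistry`, `TaggedList` and `PlainSub` are hypothetical names, and `list` stands in for ndarray. The lookup walks the method resolution order, so the most specific registered class wins and unregistered subclasses fall back to the base handler.

```python
class SerializerRegistry:
    """Hypothetical registry that picks the most specific registered class."""

    def __init__(self):
        self._handlers = {}

    def register(self, cls, type_id, to_state, from_state):
        self._handlers[cls] = (type_id, to_state, from_state)

    def serialize(self, obj):
        # Walk the MRO from most to least specific, so a handler registered
        # for a subclass is found before the handler of its base class.
        for cls in type(obj).__mro__:
            if cls in self._handlers:
                type_id, to_state, _ = self._handlers[cls]
                return {"type_id": type_id, "state": to_state(obj)}
        raise TypeError(f"no serializer registered for {type(obj).__name__}")

    def deserialize(self, payload):
        for type_id, _, from_state in self._handlers.values():
            if type_id == payload["type_id"]:
                return from_state(payload["state"])
        raise KeyError(payload["type_id"])


class TaggedList(list):
    """Stand-in for an ndarray subclass that carries extra metadata."""

    def __init__(self, items=(), metadata=None):
        super().__init__(items)
        self.metadata = metadata or {}


class PlainSub(list):
    """A subclass with no handler of its own."""


registry = SerializerRegistry()
registry.register(list, "list", list, list)
registry.register(
    TaggedList,
    "tagged_list",
    lambda obj: {"items": list(obj), "metadata": obj.metadata},
    lambda state: TaggedList(state["items"], state["metadata"]),
)

payload = registry.serialize(TaggedList([1, 2], {"unit": "m"}))
restored = registry.deserialize(payload)
```

Here `restored` is a `TaggedList` with its metadata intact, while a `PlainSub` instance is serialized by the `list` handler and comes back as a plain `list`.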

Re: [RESULT] [VOTE] Release Apache Arrow 0.9.0 (RC2)

2018-03-22 Thread Mitar
Hi! The website seems to say that there is already a pyarrow 0.9.0 package, but it does not seem to be there yet: https://arrow.apache.org/install/#python-wheels-on-pypi-unofficial https://pypi.python.org/pypi/pyarrow BTW, why are Python packages unofficial? Mitar On Thu, Mar 22, 2018 at 8

Re: [RESULT] [VOTE] Release Apache Arrow 0.9.0 (RC2)

2018-03-22 Thread Mitar
Hi! Oh, no worries. Thanks for working on this. I just thought that because the website was updated, the release was ready, and that there was some bug. I understand it takes time to do a release properly. Mitar On Thu, Mar 22, 2018 at 11:35 AM, Phillip Cloud wrote: > We are working on gett

[jira] [Created] (ARROW-3399) Cannot serialize numpy matrix object

2018-10-01 Thread Mitar (JIRA)
Mitar created ARROW-3399: Summary: Cannot serialize numpy matrix object Key: ARROW-3399 URL: https://issues.apache.org/jira/browse/ARROW-3399 Project: Apache Arrow Issue Type: Bug Affects

[jira] [Created] (ARROW-1664) Support for xarray.DataArray and xarray.Dataset

2017-10-10 Thread Mitar (JIRA)
Mitar created ARROW-1664: Summary: Support for xarray.DataArray and xarray.Dataset Key: ARROW-1664 URL: https://issues.apache.org/jira/browse/ARROW-1664 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-2250) plasma_store process should cleanup on TERM signal as well

2018-03-03 Thread Mitar (JIRA)
Mitar created ARROW-2250: Summary: plasma_store process should cleanup on TERM signal as well Key: ARROW-2250 URL: https://issues.apache.org/jira/browse/ARROW-2250 Project: Apache Arrow Issue Type
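
The issue summary asks for cleanup on TERM as well as on interrupt. plasma_store itself is not a Python program, so the following is only a generic Python sketch of that pattern, with hypothetical helper names `make_cleanup_handler` and `install_cleanup_handler`: the same handler is installed for both SIGINT and SIGTERM, runs the cleanup once, then exits.

```python
import signal
import sys


def make_cleanup_handler(cleanup):
    """Build a handler that runs cleanup once, then exits with 128 + signum."""
    state = {"done": False}

    def handler(signum, frame):
        if not state["done"]:
            state["done"] = True
            cleanup()
        sys.exit(128 + signum)

    return handler


def install_cleanup_handler(cleanup, signums=(signal.SIGINT, signal.SIGTERM)):
    """Install one shared handler for every signal in signums.

    Must be called from the main thread, as signal.signal requires.
    """
    handler = make_cleanup_handler(cleanup)
    for signum in signums:
        signal.signal(signum, handler)
    return handler
```

A process would call `install_cleanup_handler(remove_socket_file)` early in startup, where `remove_socket_file` is whatever cleanup function the process needs (a hypothetical name here).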

[jira] [Created] (ARROW-2264) Efficiently serialize numpy arrays with dtype of unicode fixed length string

2018-03-05 Thread Mitar (JIRA)
Mitar created ARROW-2264: Summary: Efficiently serialize numpy arrays with dtype of unicode fixed length string Key: ARROW-2264 URL: https://issues.apache.org/jira/browse/ARROW-2264 Project: Apache Arrow

[jira] [Created] (ARROW-2269) Cannot build bdist_wheel for Python

2018-03-06 Thread Mitar (JIRA)
Mitar created ARROW-2269: Summary: Cannot build bdist_wheel for Python Key: ARROW-2269 URL: https://issues.apache.org/jira/browse/ARROW-2269 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-2273) Cannot deserialize pandas SparseDataFrame

2018-03-06 Thread Mitar (JIRA)
Mitar created ARROW-2273: Summary: Cannot deserialize pandas SparseDataFrame Key: ARROW-2273 URL: https://issues.apache.org/jira/browse/ARROW-2273 Project: Apache Arrow Issue Type: Bug