[jira] [Commented] (ARROW-1074) from_pandas doesnt convert ndarray to list
    [ https://issues.apache.org/jira/browse/ARROW-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039278#comment-16039278 ]

Abdul Rahman commented on ARROW-1074:
-------------------------------------

Yes, I can attempt to do that. I don't have experience with Cython, so I will look into that first. Is there a dev channel/forum where I can get some help along the way?

> from_pandas doesnt convert ndarray to list
> ------------------------------------------
>
>                 Key: ARROW-1074
>                 URL: https://issues.apache.org/jira/browse/ARROW-1074
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.4.0
>            Reporter: Abdul Rahman
>            Priority: Minor
>              Labels: pyarrow
>
> [Feel free to change issue type because this is probably by design]
> I have noticed that if one of the columns in the parquet file is of
> type array, the pyarrow table stores it as a list:
> >>> table[3].type
> DataType(list)
> If I do a .to_pandas() on the column, I get something like this:
> >>> table[3].to_pandas()
> 0    None
> 1     [7]
> 2    [46]
> dtype: object
> However, I can't do a pyarrow.Table.from_pandas from a dataframe having the
> above ndarray as a series/column. I get this error:
> Invalid: Python object of type ndarray is not None and is not a string, bool, float, int, date, decimal object
> If to_pandas() can convert a list to ndarray, shouldn't from_pandas also
> convert an ndarray to type list in the table?

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (ARROW-1074) from_pandas doesnt convert ndarray to list
    [ https://issues.apache.org/jira/browse/ARROW-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033718#comment-16033718 ]

Abdul Rahman commented on ARROW-1074:
-------------------------------------

[~wesm_impala_7e40] any comments?
[jira] [Updated] (ARROW-1074) from_pandas doesnt convert ndarray to list
     [ https://issues.apache.org/jira/browse/ARROW-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdul Rahman updated ARROW-1074:
--------------------------------
    Description:
[Feel free to change issue type because this is probably by design]

I have noticed that if one of the columns in the parquet file is of type array, the pyarrow table stores it as a list:

>>> table[3].type
DataType(list)

If I do a .to_pandas() on the column, I get something like this:

>>> table[3].to_pandas()
0    None
1     [7]
2    [46]
dtype: object

However, I can't do a pyarrow.Table.from_pandas from a dataframe having the above ndarray as a series/column. I get this error:

Invalid: Python object of type ndarray is not None and is not a string, bool, float, int, date, decimal object

If to_pandas() can convert a list to ndarray, shouldn't from_pandas also convert an ndarray to type list in the table?

  was:
[Feel free to change issue type because this is probably by design]

I have noticed that if one of the columns in the parquet file is of type array, the pyarrow table stores it as a list:

>>> table[3].type
DataType(list)

If I do a .to_pandas() on the column, I get something like this:

0    None
1     [7]
2    [46]
dtype: object

However, I can't do a pyarrow.Table.from_pandas from a dataframe having the above ndarray as a series/column. I get this error:

Invalid: Python object of type ndarray is not None and is not a string, bool, float, int, date, decimal object

If to_pandas() can convert a list to ndarray, shouldn't from_pandas also convert an ndarray to type list in the table?
[jira] [Created] (ARROW-1074) from_pandas doesnt convert ndarray to list
Abdul Rahman created ARROW-1074:
-----------------------------------

             Summary: from_pandas doesnt convert ndarray to list
                 Key: ARROW-1074
                 URL: https://issues.apache.org/jira/browse/ARROW-1074
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.4.0
            Reporter: Abdul Rahman
            Priority: Minor

[Feel free to change issue type because this is probably by design]

I have noticed that if one of the columns in the parquet file is of type array, the pyarrow table stores it as a list:

>>> table[3].type
DataType(list)

If I do a .to_pandas() on the column, I get something like this:

0    None
1     [7]
2    [46]
dtype: object

However, I can't do a pyarrow.Table.from_pandas from a dataframe having the above ndarray as a series/column. I get this error:

Invalid: Python object of type ndarray is not None and is not a string, bool, float, int, date, decimal object

If to_pandas() can convert a list to ndarray, shouldn't from_pandas also convert an ndarray to type list in the table?
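
A minimal sketch of one possible user-side workaround for the report above, not a fix confirmed anywhere in this thread: turn array-valued cells into plain Python lists before handing the column to pyarrow.Table.from_pandas. The helper name array_cells_to_lists is hypothetical. It relies only on the tolist() method, which numpy.ndarray provides; the standard-library array.array is used here as a stand-in so the sketch runs without NumPy. Whether a given pyarrow version then accepts list cells is an assumption that would need checking.

```python
from array import array


def array_cells_to_lists(cells):
    """Copy a column, turning array-like cells (anything with .tolist()) into lists.

    None and scalar cells without a tolist() method are passed through unchanged.
    """
    converted = []
    for cell in cells:
        if cell is not None and hasattr(cell, "tolist"):
            converted.append(cell.tolist())
        else:
            converted.append(cell)
    return converted


# Stand-in for the object column shown in the report: None, [7], [46].
column = [None, array("i", [7]), array("i", [46])]
print(array_cells_to_lists(column))  # [None, [7], [46]]
```

With pandas, the same function could be applied to a Series, for example by building a new object column from array_cells_to_lists(series.tolist()), before calling from_pandas.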
[jira] [Comment Edited] (ARROW-909) libjemalloc.so.2: cannot open shared object file:
    [ https://issues.apache.org/jira/browse/ARROW-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989684#comment-15989684 ]

Abdul Rahman edited comment on ARROW-909 at 4/29/17 1:48 AM:
-------------------------------------------------------------

[~wesmckinn] Thanks. I noticed arrow/cpp does have jemalloc in the build folder, but doesn't load them into the system libraries.

was (Author: abdulrahman004):
[~wesmckinn] Thanks. I noticed arrow/cpp does have jemalloc in the build folder, but doesnt load them in system libraries.f

> libjemalloc.so.2: cannot open shared object file:
> --------------------------------------------------
>
>                 Key: ARROW-909
>                 URL: https://issues.apache.org/jira/browse/ARROW-909
>             Project: Apache Arrow
>          Issue Type: Bug
>         Environment: linux centos
>            Reporter: Abdul Rahman
>              Labels: pyarrow
>
> >>> import pyarrow
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/default/src/venv/lib/python2.7/site-packages/pyarrow-0.2.1.dev244+g14bec24-py2.7-linux-x86_64.egg/pyarrow/__init__.py", line 28, in <module>
>     import pyarrow._config
> ImportError: libjemalloc.so.2: cannot open shared object file: No such file or directory
> $LD_LIBRARY_PATH has libarrow_jemalloc.a along with other libraries including
> libarrow.so, libparquet.so, libparquet_arrow.so. Pyarrow was built using
> with-jemalloc and parquet-cpp was cmake-d with
> -DPARQUET_ARROW=ON
> Also, I noticed that the arrow/python documentation has been cleaned up, with the
> installation instructions covering the conda approach only. Is this the only
> supported way going forward?
[jira] [Commented] (ARROW-909) libjemalloc.so.2: cannot open shared object file:
    [ https://issues.apache.org/jira/browse/ARROW-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989684#comment-15989684 ]

Abdul Rahman commented on ARROW-909:
------------------------------------

[~wesmckinn] Thanks. I noticed arrow/cpp does have jemalloc in the build folder, but doesn't load them into the system libraries.
[jira] [Created] (ARROW-909) libjemalloc.so.2: cannot open shared object file:
Abdul Rahman created ARROW-909:
----------------------------------

             Summary: libjemalloc.so.2: cannot open shared object file:
                 Key: ARROW-909
                 URL: https://issues.apache.org/jira/browse/ARROW-909
             Project: Apache Arrow
          Issue Type: Bug
         Environment: linux centos
            Reporter: Abdul Rahman

>>> import pyarrow
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/default/src/venv/lib/python2.7/site-packages/pyarrow-0.2.1.dev244+g14bec24-py2.7-linux-x86_64.egg/pyarrow/__init__.py", line 28, in <module>
    import pyarrow._config
ImportError: libjemalloc.so.2: cannot open shared object file: No such file or directory

$LD_LIBRARY_PATH has libarrow_jemalloc.a along with other libraries including libarrow.so, libparquet.so, libparquet_arrow.so. Pyarrow was built using with-jemalloc and parquet-cpp was cmake-d with
-DPARQUET_ARROW=ON

Also, I noticed that the arrow/python documentation has been cleaned up, with the installation instructions covering the conda approach only. Is this the only supported way going forward?
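
A small diagnostic sketch for the situation described above, offered as an illustration rather than as the resolution of this issue. The ImportError names the shared object libjemalloc.so.2, while the report lists only the static archive libarrow_jemalloc.a on $LD_LIBRARY_PATH; a static .a file cannot satisfy a dynamic dependency on a .so file. The hypothetical helper find_in_search_path below simply checks which directories of a path string contain a given file name. It does not reproduce the dynamic loader's full lookup, which also consults other locations such as the ldconfig cache and any rpath embedded in the binary.

```python
import os


def find_in_search_path(name, search_path):
    """Return full paths of files called `name` in the os.pathsep-separated directories."""
    hits = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    # File names taken from the report above.
    for lib in ("libjemalloc.so.2", "libarrow_jemalloc.a"):
        print(lib, "->", find_in_search_path(lib, path) or "not found")
```

If the first name comes back as not found while the second is present, that would be consistent with the error in the traceback.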