[jira] [Commented] (ARROW-1074) from_pandas doesnt convert ndarray to list

2017-06-06 Thread Abdul Rahman (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039278#comment-16039278
 ] 

Abdul Rahman commented on ARROW-1074:
-

Yes, I can attempt to do that. I don't have experience with Cython, so I will 
look into that first. Is there a dev channel/forum where I can get some help 
along the way?

> from_pandas doesnt convert ndarray to list
> --
>
>              Key: ARROW-1074
>              URL: https://issues.apache.org/jira/browse/ARROW-1074
>          Project: Apache Arrow
>       Issue Type: Bug
>       Components: Python
> Affects Versions: 0.4.0
>         Reporter: Abdul Rahman
>         Priority: Minor
>           Labels: pyarrow
>
> [Feel free to change issue type because this is probably by design]
> I have noticed that if one of the columns in the parquet file is of 
> type array, the pyarrow table stores it as list
> >>> table[3].type
> DataType(list)
> If I do a .to_pandas() on the column, I get something like this
> >>> table[3].to_pandas()
> 0    None
> 1     [7]
> 2    [46]
> dtype: object
> However, I can't do a pyarrow.Table.from_pandas from a dataframe having the 
> above ndarray as a series/column. I get this error
> Invalid: Python object of type ndarray is not None and is not a string, bool, 
> float, int, date, decimal object
> If to_pandas() can convert a list to ndarray, shouldn't from_pandas also 
> convert an ndarray to type list in the table?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ARROW-1074) from_pandas doesnt convert ndarray to list

2017-06-01 Thread Abdul Rahman (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033718#comment-16033718
 ] 

Abdul Rahman commented on ARROW-1074:
-

[~wesm_impala_7e40] any comments?






[jira] [Updated] (ARROW-1074) from_pandas doesnt convert ndarray to list

2017-05-27 Thread Abdul Rahman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abdul Rahman updated ARROW-1074:

Description: 
[Feel free to change issue type because this is probably by design]

I have noticed that if one of the columns in the parquet file is of 
type array, the pyarrow table stores it as list
>>> table[3].type
DataType(list)
If I do a .to_pandas() on the column, I get something like this
>>> table[3].to_pandas()
0    None
1     [7]
2    [46]
dtype: object

However, I can't do a pyarrow.Table.from_pandas from a dataframe having the 
above ndarray as a series/column. I get this error
Invalid: Python object of type ndarray is not None and is not a string, bool, 
float, int, date, decimal object

If to_pandas() can convert a list to ndarray, shouldn't from_pandas also 
convert an ndarray to type list in the table?

  was:
[Feel free to change issue type because this is probably by design]

I have noticed that if one of the columns in the parquet file is of 
type array, the pyarrow table stores it as list
>>> table[3].type
DataType(list)
If I do a .to_pandas() on the column, I get something like this
0    None
1     [7]
2    [46]
dtype: object

However, I can't do a pyarrow.Table.from_pandas from a dataframe having the 
above ndarray as a series/column. I get this error
Invalid: Python object of type ndarray is not None and is not a string, bool, 
float, int, date, decimal object

If to_pandas() can convert a list to ndarray, shouldn't from_pandas also 
convert an ndarray to type list in the table?







[jira] [Created] (ARROW-1074) from_pandas doesnt convert ndarray to list

2017-05-27 Thread Abdul Rahman (JIRA)
Abdul Rahman created ARROW-1074:
---

 Summary: from_pandas doesnt convert ndarray to list
 Key: ARROW-1074
 URL: https://issues.apache.org/jira/browse/ARROW-1074
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Affects Versions: 0.4.0
Reporter: Abdul Rahman
Priority: Minor


[Feel free to change issue type because this is probably by design]

I have noticed that if one of the columns in the parquet file is of 
type array, the pyarrow table stores it as list
>>> table[3].type
DataType(list)
If I do a .to_pandas() on the column, I get something like this
0    None
1     [7]
2    [46]
dtype: object

However, I can't do a pyarrow.Table.from_pandas from a dataframe having the 
above ndarray as a series/column. I get this error
Invalid: Python object of type ndarray is not None and is not a string, bool, 
float, int, date, decimal object

If to_pandas() can convert a list to ndarray, shouldn't from_pandas also 
convert an ndarray to type list in the table?
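
A possible workaround in the meantime is to turn each ndarray cell back into a 
plain Python list before building the table. The sketch below is only an 
illustration: the helper name cells_to_lists is hypothetical, and it assumes 
every cell is either None or an object exposing tolist() (as numpy.ndarray 
does). Whether from_pandas in 0.4.0 then infers a list type for such a column 
is an assumption, not something confirmed in this issue.

```python
from array import array


def cells_to_lists(values):
    """Return a new list in which array-like cells become plain Python lists.

    Hypothetical helper: any cell that has a tolist() method (for example a
    numpy.ndarray) is converted; None and all other values pass through as-is.
    """
    converted = []
    for cell in values:
        if cell is not None and hasattr(cell, "tolist"):
            converted.append(cell.tolist())
        else:
            converted.append(cell)
    return converted


# array.array stands in for numpy.ndarray here so the sketch runs with the
# standard library only; both types expose tolist().
column = [None, array("q", [7]), array("q", [46])]
print(cells_to_lists(column))  # prints [None, [7], [46]]
```

With pandas, the same conversion could presumably be applied to the affected 
column (for example df[col] = cells_to_lists(df[col])) before calling 
pyarrow.Table.from_pandas(df).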





[jira] [Comment Edited] (ARROW-909) libjemalloc.so.2: cannot open shared object file:

2017-04-28 Thread Abdul Rahman (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989684#comment-15989684
 ] 

Abdul Rahman edited comment on ARROW-909 at 4/29/17 1:48 AM:
-

[~wesmckinn] Thanks. I noticed arrow/cpp does have jemalloc in the build 
folder, but it doesn't load them into the system libraries.


was (Author: abdulrahman004):
[~wesmckinn] Thanks. I noticed arrow/cpp does have jemalloc in the build 
folder, but doesn't load them in system libraries.f

> libjemalloc.so.2: cannot open shared object file: 
> --
>
>         Key: ARROW-909
>         URL: https://issues.apache.org/jira/browse/ARROW-909
>     Project: Apache Arrow
>  Issue Type: Bug
> Environment: linux centos
>    Reporter: Abdul Rahman
>      Labels: pyarrow
>
> >>> import pyarrow
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/default/src/venv/lib/python2.7/site-packages/pyarrow-0.2.1.dev244+g14bec24-py2.7-linux-x86_64.egg/pyarrow/__init__.py", line 28, in <module>
>     import pyarrow._config
> ImportError: libjemalloc.so.2: cannot open shared object file: No such file 
> or directory
> $LD_LIBRARY_PATH has libarrow_jemalloc.a along with other libraries including 
> libarrow.so, libparquet.so, libparquet_arrow.so. Pyarrow was built using 
> with-jemalloc and parquet-cpp was cmake-d with 
> -DPARQUET_ARROW=ON
> Also, I noticed that the arrow/python documentation has been cleaned up, with 
> the installation instructions covering the conda approach only. Is this the 
> only supported way going forward?





[jira] [Commented] (ARROW-909) libjemalloc.so.2: cannot open shared object file:

2017-04-28 Thread Abdul Rahman (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989684#comment-15989684
 ] 

Abdul Rahman commented on ARROW-909:


[~wesmckinn] Thanks. I noticed arrow/cpp does have jemalloc in the build 
folder, but it doesn't load them into the system libraries.






[jira] [Created] (ARROW-909) libjemalloc.so.2: cannot open shared object file:

2017-04-28 Thread Abdul Rahman (JIRA)
Abdul Rahman created ARROW-909:
--

 Summary: libjemalloc.so.2: cannot open shared object file: 
 Key: ARROW-909
 URL: https://issues.apache.org/jira/browse/ARROW-909
 Project: Apache Arrow
  Issue Type: Bug
 Environment: linux centos
Reporter: Abdul Rahman


>>> import pyarrow
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/default/src/venv/lib/python2.7/site-packages/pyarrow-0.2.1.dev244+g14bec24-py2.7-linux-x86_64.egg/pyarrow/__init__.py", line 28, in <module>
    import pyarrow._config
ImportError: libjemalloc.so.2: cannot open shared object file: No such file or 
directory

$LD_LIBRARY_PATH has libarrow_jemalloc.a along with other libraries including 
libarrow.so, libparquet.so, libparquet_arrow.so. Pyarrow was built using 
with-jemalloc and parquet-cpp was cmake-d with 
-DPARQUET_ARROW=ON
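
The ImportError asks for the shared object libjemalloc.so.2, while the only 
jemalloc file named above is the static archive libarrow_jemalloc.a. A quick 
way to check what is actually present is to scan the directories on 
$LD_LIBRARY_PATH. The snippet below is a diagnostic sketch only: the function 
name find_jemalloc is made up for illustration, it is written for Python 3, 
and it does not look at ld.so.cache or the system default library directories.

```python
import os


def find_jemalloc(search_path):
    """Scan a colon-separated search path for jemalloc libraries.

    Returns a (shared, static) pair of path lists: files whose name contains
    "jemalloc" and ".so" count as shared, files ending in ".a" as static.
    """
    shared, static = [], []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries and directories that do not exist.
        if not directory or not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            if "jemalloc" not in name:
                continue
            path = os.path.join(directory, name)
            if name.endswith(".a"):
                static.append(path)
            elif ".so" in name:
                shared.append(path)
    return shared, static


if __name__ == "__main__":
    shared, static = find_jemalloc(os.environ.get("LD_LIBRARY_PATH", ""))
    print("shared:", shared)
    print("static:", static)
```

If the shared list comes back empty while the static list does not, that would 
be consistent with the loader failing to find libjemalloc.so.2.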

Also, I noticed that the arrow/python documentation has been cleaned up, with 
the installation instructions covering the conda approach only. Is this the 
only supported way going forward?



