[jira] [Resolved] (PARQUET-1013) Fix ZLIB_INCLUDE_DIR

2017-06-01 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved PARQUET-1013. --- Resolution: Fixed Fix Version/s: cpp-1.2.0 Issue resolved by pull request 347 [https:

Re: Store numpy arrays in parquet format

2017-06-01 Thread Wes McKinney
hi Vaishal, You can certainly use NumPy arrays to create Parquet files, but you will have to do a bit of work to adapt the NumPy arrays to Parquet's (and Arrow's) columnar data model. pandas DataFrame contains NumPy arrays internally. import pyarrow as pa import pyarrow.parquet as pq import numpy

[jira] [Commented] (PARQUET-1014) Example for multiple row group writer (cpp)

2017-06-01 Thread Deepak Majeti (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033781#comment-16033781 ] Deepak Majeti commented on PARQUET-1014: Hi [~elderrex], The example at {{exampl

[jira] [Resolved] (PARQUET-1014) Example for multiple row group writer (cpp)

2017-06-01 Thread yugu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yugu resolved PARQUET-1014. --- Resolution: Fixed not sure how but error fixed... > Example for multiple row group writer (cpp) > -

[jira] [Reopened] (PARQUET-1014) Example for multiple row group writer (cpp)

2017-06-01 Thread yugu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yugu reopened PARQUET-1014: --- similar issue re-declaring row group writer crashes > Example for multiple row group writer (cpp) > ---

[jira] [Updated] (PARQUET-1014) Example for multiple row group writer (cpp)

2017-06-01 Thread yugu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yugu updated PARQUET-1014: -- Priority: Major (was: Minor) Description: Been looking through the repo and cannot find an example for

[jira] [Resolved] (PARQUET-1014) Conflicts in declaring Column Writer of same type

2017-06-01 Thread yugu (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yugu resolved PARQUET-1014. --- Resolution: Fixed Fix Version/s: cpp-1.1.0 turns out int64_t cannot be written without def/rep level.

[jira] [Created] (PARQUET-1014) Conflicts in declaring Column Writer of same type

2017-06-01 Thread yugu (JIRA)
yugu created PARQUET-1014: - Summary: Conflicts in declaring Column Writer of same type Key: PARQUET-1014 URL: https://issues.apache.org/jira/browse/PARQUET-1014 Project: Parquet Issue Type: Bug

Store numpy arrays in parquet format

2017-06-01 Thread Shah, Vaishal
This is Vaishal from D. E. Shaw and Co. We are interested in using py-arrow/parquet for one of our projects, which deals with numpy arrays. Parquet provides an API to store pandas dataframes on disk, but I could not find any support for storing numpy arrays. Since numpy is a trivial form to store data