[https://issues.apache.org/jira/browse/PARQUET-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
Wes McKinney resolved PARQUET-1013.
---
Resolution: Fixed
Fix Version/s: cpp-1.2.0
Issue resolved by pull request 347
Hi Vaishal,
You can certainly use NumPy arrays to create Parquet files, but you
will have to do a bit of work to adapt the NumPy arrays to Parquet's
(and Arrow's) columnar data model. A pandas DataFrame contains NumPy
arrays internally.
import pyarrow as pa
import pyarrow.parquet as pq
import numpy
[https://issues.apache.org/jira/browse/PARQUET-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033781#comment-16033781]
Deepak Majeti commented on PARQUET-1014:
Hi [~elderrex],
The example at {{exampl
[https://issues.apache.org/jira/browse/PARQUET-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
yugu resolved PARQUET-1014.
---
Resolution: Fixed
Not sure how, but the error is fixed...
> Example for multiple row group writer (cpp)
> -
[https://issues.apache.org/jira/browse/PARQUET-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
yugu reopened PARQUET-1014:
---
Similar issue: re-declaring the row group writer crashes.
> Example for multiple row group writer (cpp)
> ---
[https://issues.apache.org/jira/browse/PARQUET-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
yugu updated PARQUET-1014:
--
Priority: Major (was: Minor)
Description:
Been looking through the repo and cannot find an example for
[https://issues.apache.org/jira/browse/PARQUET-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
yugu resolved PARQUET-1014.
---
Resolution: Fixed
Fix Version/s: cpp-1.1.0
Turns out int64_t values cannot be written without definition/repetition levels.
yugu created PARQUET-1014:
-
Summary: Conflicts in declaring Column Writer of same type
Key: PARQUET-1014
URL: https://issues.apache.org/jira/browse/PARQUET-1014
Project: Parquet
Issue Type: Bug
This is Vaishal from D. E. Shaw and Co.
We are interested in using py-arrow/parquet for one of our projects, which deals
with numpy arrays.
Parquet provides an API to store pandas dataframes on disk, but I could not find
any support for storing numpy arrays.
Since numpy is a trivial form to store data