Thanks Cheng, Nong.
The data in the matrix is homogeneous (cells are booleans), so I don't
expect to face memory-related issues. Is the limitation on the # of columns
itself, or on memory issues caused by the # of columns? To me it sounds more
like memory issues.
On Mon, Jan 25, 2016 at 10:16 AM, Cheng Lian
[
https://issues.apache.org/jira/browse/PARQUET-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115790#comment-15115790
]
Wes McKinney commented on PARQUET-462:
--
Could you explain this in more detail, especially in the
[
https://issues.apache.org/jira/browse/PARQUET-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115868#comment-15115868
]
Deepak Majeti commented on PARQUET-462:
---
The code duplication will happen during the initialization
[
https://issues.apache.org/jira/browse/PARQUET-461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Deepak Majeti updated PARQUET-461:
--
Description: I would like to add some more extensions to the ColumnReader
API. These
Aside from Nong's comment, I think PARQUET-222, where we discussed a
performance issue of writing wide tables, can be helpful.
Cheng
On 1/23/16 4:53 PM, Nong Li wrote:
I expect this to be difficult. This is roughly 3 orders of magnitude more
than even a typical wide table use case.
Answers
[
https://issues.apache.org/jira/browse/PARQUET-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115751#comment-15115751
]
Aliaksei Sandryhaila commented on PARQUET-433:
--
Yes, your commit looks very close. Thanks
Deepak Majeti created PARQUET-462:
-
Summary: Create a new Level class for definition and repetition
values
Key: PARQUET-462
URL: https://issues.apache.org/jira/browse/PARQUET-462
Project: Parquet
Hi Ryan,
This sounds very reasonable. I am not arguing that we should disregard the
standard Apache approach to promoting contributors to committers. I am just
pointing out that without input from current committers it is hard
for us to contribute productively to the project. As a consequence, it
is
PARQUET-222 is mostly a memory issue caused by the # of columns. On the
write path, each column comes with its own write buffers, and these can
add up to a large amount. In the case investigated in PARQUET-222,
it took more than 10G to write a single row consisting of 26k integer
columns. I.e., this
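
A rough back-of-the-envelope sketch of the figures quoted from PARQUET-222,
assuming write-path memory is dominated by per-column buffers and grows
linearly with the column count. The helper names are hypothetical, and
reading "10G" as 10 GiB is an assumption; none of this is Parquet API.

```python
# Illustrative arithmetic only. Helper names are hypothetical; the inputs
# (more than 10G, 26k integer columns) come from the PARQUET-222 case.
GIB = 1024 ** 3  # assumption: "10G" is read as 10 GiB


def per_column_bytes(total_bytes: int, num_columns: int) -> float:
    """Average write-path memory attributable to each column."""
    return total_bytes / num_columns


def estimated_write_memory(num_columns: int, bytes_per_column: float) -> float:
    """Linear estimate: every column carries its own write buffers."""
    return num_columns * bytes_per_column


# Lower bound on the per-column cost implied by the quoted figures.
observed = per_column_bytes(10 * GIB, 26_000)
print(f"about {observed / 1024:.0f} KiB of buffers per column")
```

Under these assumptions the per-column cost works out to about 400 KiB, so
the total depends on how many columns there are, not on how small each cell
(boolean or integer) is.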
[
https://issues.apache.org/jira/browse/PARQUET-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115820#comment-15115820
]
Deepak Majeti commented on PARQUET-462:
---
The main idea is to prevent code duplication between
[
https://issues.apache.org/jira/browse/PARQUET-238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wes McKinney resolved PARQUET-238.
--
Resolution: Resolved
This is resolved with PARQUET-418 and PARQUET-267. Please let us know if
[
https://issues.apache.org/jira/browse/PARQUET-238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116456#comment-15116456
]
Wes McKinney edited comment on PARQUET-238 at 1/26/16 1:20 AM:
---
This is
Wes McKinney created PARQUET-464:
Summary: Add cmake option and #defines to enable/disable struct
packing
Key: PARQUET-464
URL: https://issues.apache.org/jira/browse/PARQUET-464
Project: Parquet
I am happy to help out with the patch maintenance when there are conflicts.
With PARQUET-437 we'll want to write more unit tests which will help make
sure we aren't breaking each other's code.
On Mon, Jan 25, 2016 at 2:33 PM, Aliaksei Sandryhaila
wrote:
> Hi Ryan,
>
> This
[
https://issues.apache.org/jira/browse/PARQUET-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116337#comment-15116337
]
Wes McKinney commented on PARQUET-449:
--
[~nongli] the GitHub PR is still outstanding
> Update to
Aliaksei, thanks for being understanding here.
I agree with you that it is too difficult. We really want to get the cpp
side bootstrapped as soon as possible. Let's go with what you suggested:
have contributors review one another's patches and then ask a
committer for a final review once