[
https://issues.apache.org/jira/browse/PARQUET-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693161#comment-17693161
]
ASF GitHub Bot commented on PARQUET-2230:
-----------------------------------------
wgtmac commented on PR #1034:
URL: https://github.com/apache/parquet-mr/pull/1034#issuecomment-1443560593
> Are we planning to deprecate the other tools covered by this one? (Or if
they were not released yet we might simply remove them?)
Let me take a look. If those tools and classes are not released yet, it
would be a good time to remove them.
> Next potential topic around parquet-cli if you're interested :) There are
some implementations around the hadoop conf in parquet-cli but I do not fully
understand how they work. If they work as is, we should document them; otherwise
we should make them work somehow. It would be great if the different read/write
flags could be used in the tools. Like setting the zstd compression ratio for
rewrite. Or using a non-default encoding. What do you think?
I am not familiar with it either. I will dig into it to find out how it works
and what can be done later.
It is Friday now. Will get back to it next week. :)
> Add a new rewrite command powered by ParquetRewriter
> ----------------------------------------------------
>
> Key: PARQUET-2230
> URL: https://issues.apache.org/jira/browse/PARQUET-2230
> Project: Parquet
> Issue Type: Sub-task
> Components: parquet-cli
> Reporter: Gang Wu
> Assignee: Gang Wu
> Priority: Major
>
> parquet-cli has several commands for rewriting files but is missing a
> consolidated one that provides the full features of ParquetRewriter.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)