[ https://issues.apache.org/jira/browse/PARQUET-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabor Szadovszky reassigned PARQUET-1826:
-----------------------------------------

    Assignee: Walid Gara

Based on our discussion in the Parquet sync, I'm assigning this to you, [~garawalid]. Feel free to contact me or write your questions to this jira directly.

> Document hadoop configuration options
> -------------------------------------
>
>                 Key: PARQUET-1826
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1826
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>            Reporter: Gabor Szadovszky
>            Assignee: Walid Gara
>            Priority: Major
>
> The currently available hadoop configuration options are not documented
> properly. The only documentation we have is the javadoc comment and the
> implementation of
> [ParquetOutputFormat|https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetOutputFormat.java].
> We shall investigate all the possible options and their usage/default values
> and document them properly in a way that is easily accessible to our users.
> I would suggest creating a `README.md` file in the sub-module
> [parquet-hadoop|https://github.com/apache/parquet-mr/tree/master/parquet-hadoop]
> that would describe the purpose of the module and would have a section that
> lists the possible hadoop configuration options. (Later on we shall extend
> this document with other descriptions about the purpose and usage of our
> library in the hadoop ecosystem. These efforts shall be covered by other
> jiras.)
> By adding the description to the source code, it would be easy to extend it
> with the new features we implement, so it will be up-to-date for every release.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)