[ 
https://issues.apache.org/jira/browse/TAJO-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956035#comment-13956035
 ] 

David Chen commented on TAJO-714:
---------------------------------

Hi Hyunsik,

Thanks for your helpful feedback!

I think it would be fine as long as the compression codec jars are on the 
classpath, whether Hadoop's or Tajo's. Compression is enabled on a per-file 
basis through the {{parquet.compression}} property, which the user would have 
to set explicitly using the {{with}} clause.

Parquet has default values for each of its configuration options, which it uses 
if the configuration options are not set by the user. The code in 
{{ParquetAppender.init()}} simply checks which properties are set in the table 
metadata by the user and otherwise defaults to Parquet's own default values 
before passing them to the {{TajoParquetWriter}} constructor.
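
To illustrate the fallback described above, here is a minimal sketch. It is 
not the code from the patch: the class {{ParquetOptionDefaults}} and its 
{{resolve()}} method are hypothetical names, the table metadata is modelled as 
a plain {{Map}}, and the default values shown are illustrative placeholders 
rather than a statement of what Parquet actually uses.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not Tajo's actual code: fills in a default for every
// Parquet option that the user did not set in the table metadata.
class ParquetOptionDefaults {
  // Illustrative default values; treat them as placeholders.
  static final Map<String, String> DEFAULTS = new HashMap<>();
  static {
    DEFAULTS.put("parquet.compression", "UNCOMPRESSED");
    DEFAULTS.put("parquet.block.size", String.valueOf(128 * 1024 * 1024));
    DEFAULTS.put("parquet.page.size", String.valueOf(1024 * 1024));
    DEFAULTS.put("parquet.enable.dictionary", "true");
  }

  // Returns a new map: defaults first, then user-set options override them.
  static Map<String, String> resolve(Map<String, String> tableOptions) {
    Map<String, String> resolved = new HashMap<>(DEFAULTS);
    resolved.putAll(tableOptions);
    return resolved;
  }

  public static void main(String[] args) {
    // The user only set the compression codec in the with clause.
    Map<String, String> userOptions = new HashMap<>();
    userOptions.put("parquet.compression", "SNAPPY");

    Map<String, String> resolved = resolve(userOptions);
    System.out.println(resolved.get("parquet.compression"));
    System.out.println(resolved.get("parquet.page.size"));
  }
}
```

In this sketch the resolved values would then be parsed and handed to the 
writer's constructor; options the user never mentioned keep their defaults.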

Thanks, I can go ahead and add {{StoreType.PARQUET}} to 
{{CatalogUtil.newOptionsWithDefaults()}}. That should make the code in 
{{ParquetAppender.init()}} a little cleaner. In the future, when we decide to 
make storage types more pluggable, I think it might be a good idea to refactor 
the code for setting default options and move it into code specific to each 
storage type. I will post another revision after I add Parquet's options to 
{{CatalogUtil.newOptionsWithDefaults()}}.

In the meantime, I have updated the patch with one small change: adding the 
missing calls to {{super.init()}} in {{ParquetAppender}} and 
{{ParquetScanner}}.
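
As a rough sketch of why the {{super.init()}} call matters, assume the base 
class tracks its own initialization state. The classes {{BaseAppender}} and 
{{SketchParquetAppender}} below are hypothetical stand-ins, not the actual 
Tajo classes, and what the real base-class {{init()}} does is an assumption 
here.

```java
// Hypothetical classes for illustration only; not Tajo's actual code.
class BaseAppender {
  protected boolean inited = false;

  // Assumed base-class behaviour: record that initialization happened
  // and refuse to initialize twice.
  void init() {
    if (inited) {
      throw new IllegalStateException("already initialized");
    }
    inited = true;
  }

  boolean isInited() {
    return inited;
  }
}

class SketchParquetAppender extends BaseAppender {
  boolean writerReady = false;

  @Override
  void init() {
    // Subclass-specific setup (for example, creating the writer).
    writerReady = true;
    // Without this call the base-class state would never be set.
    super.init();
  }
}
```

If the override skipped {{super.init()}}, {{isInited()}} would keep returning 
false even after the subclass had finished its own setup.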

Thanks,
David

> Enable setting Parquet tuning parameters
> ----------------------------------------
>
>                 Key: TAJO-714
>                 URL: https://issues.apache.org/jira/browse/TAJO-714
>             Project: Tajo
>          Issue Type: Improvement
>            Reporter: David Chen
>            Assignee: David Chen
>         Attachments: TAJO-714.patch, TAJO-714_20140331_19:21:16.patch
>
>
> The first version of Parquet support does not support setting Parquet's 
> tuning configuration parameters, such as compression, row group and page 
> size, dictionary encoding, etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
