[ 
https://issues.apache.org/jira/browse/TAJO-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954744#comment-13954744
 ] 

Hyunsik Choi commented on TAJO-714:
-----------------------------------

Hi David,

I took a look at your patch, and it looks good to me. I've left inline 
comments on your questions.

> The code to set default parameter values is a bit complex because 
> TableMeta.getOption() only support String default values. I can overload this 
> method to support int and boolean default values, but I noticed that this 
> method returns null if p.hasParams() is false. Should this method still 
> return the default value in this case regardless?

First of all, thank you for pointing out the limitation of 
{{TableMeta.getOption()}}. I'll think about how to improve it. Also, 
{{getOption()}} needs to be changed to always return an empty instance even if 
{{hasParams()}} is false. That change seems out of scope for this issue, so 
I'll create separate issues and resolve it there.
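
As a rough sketch of the direction only, typed overloads could fall back to the 
caller's default whether or not any parameters were set. The class below is a 
hypothetical stand-in, not the actual {{TableMeta}} API, and the option key in 
the usage note is made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the option lookup in TableMeta; not Tajo's real class.
class TypedOptionsSketch {
  // Starts empty, so lookups never hit a null container even when no params were given.
  private final Map<String, String> params = new HashMap<>();

  void put(String key, String value) {
    params.put(key, value);
  }

  // Existing behaviour: String value with a String default.
  String getOption(String key, String defaultValue) {
    String v = params.get(key);
    return v != null ? v : defaultValue;
  }

  // Assumed overload: parse the stored String as an int, else use the default.
  int getOption(String key, int defaultValue) {
    String v = params.get(key);
    return v != null ? Integer.parseInt(v.trim()) : defaultValue;
  }

  // Assumed overload: parse the stored String as a boolean, else use the default.
  boolean getOption(String key, boolean defaultValue) {
    String v = params.get(key);
    return v != null ? Boolean.parseBoolean(v.trim()) : defaultValue;
  }
}
```

With this shape, a call such as {{getOption("example.page.size", 1024)}} 
returns 1024 when the key was never set, so callers do not need their own 
null checks.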

I agree that {{ParquetAppender}} should take default values, and the written 
Parquet files should return default values even if no {{with}} clause is given. 
I think line 1323 of {{LogicalPlanner}} would be a helpful reference. It sets 
default option values when a table is created.

{code:java}
    // Set default options to be created.
    Options options = CatalogUtil.newOptionsWithDefault(createTableNode.getStorageType());
    if (expr.hasParams()) {
      options.putAll(expr.getParams());
    }
{code}


> Enabling compression requires the compression codec jars to be in the 
> classpath. Do you think we should add gzip, lzo, and snappy to the 
> dependencies in the pom.xml or should we leave it up to users to install 
> those jars?

Actually, I don't fully understand Parquet's compression mechanism. If 
Parquet's compression uses Hadoop's compression codecs, it will follow the 
user's Hadoop settings, because Tajo's startup script includes Hadoop's 
classpath. Otherwise, we need to add the compression codec jars as 
dependencies.
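
One quick way to see what the classpath already provides is to check whether 
the codec classes can be loaded. This is a minimal sketch: the helper class is 
hypothetical, and the class names in the usage note are the usual Hadoop ones, 
with LZO coming from the separate hadoop-lzo project.

```java
// Hypothetical helper; it only probes the classpath and does not touch Parquet or Tajo.
class CodecClasspathCheck {
  // Returns true if the named class can be located without initializing it.
  static boolean isOnClasspath(String className) {
    try {
      Class.forName(className, false, CodecClasspathCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }
}
```

For example, it could be called with 
{{org.apache.hadoop.io.compress.GzipCodec}}, 
{{org.apache.hadoop.io.compress.SnappyCodec}}, and 
{{com.hadoop.compression.lzo.LzoCodec}}. Note that Snappy and LZO also need 
their native libraries at runtime, which this check does not cover.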

Warm regards,
Hyunsik Choi

> Enable setting Parquet tuning parameters
> ----------------------------------------
>
>                 Key: TAJO-714
>                 URL: https://issues.apache.org/jira/browse/TAJO-714
>             Project: Tajo
>          Issue Type: Improvement
>            Reporter: David Chen
>            Assignee: David Chen
>         Attachments: TAJO-714.patch
>
>
> The first version of Parquet support does not support setting Parquet's 
> tuning configuration parameters, such as compression, row group and page 
> size, dictionary encoding, etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
