GitHub user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22622#discussion_r223058276
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala ---
    @@ -115,6 +116,69 @@ abstract class OrcSuite extends OrcTest with BeforeAndAfterAll {
         }
       }
     
    +  protected def testSelectiveDictionaryEncoding(isSelective: Boolean) {
    +    val tableName = "orcTable"
    +
    +    withTempDir { dir =>
    +      withTable(tableName) {
    +        val sqlStatement = orcImp match {
    +          case "native" =>
    +            s"""
    +               |CREATE TABLE $tableName (zipcode STRING, uniqColumn STRING, value DOUBLE)
    +               |USING ORC
    +               |OPTIONS (
    +               |  path '${dir.toURI}',
    +               |  orc.dictionary.key.threshold '1.0',
    +               |  orc.column.encoding.direct 'uniqColumn'
    --- End diff --
    
    That sounds like a different issue. This PR covers both the `TBLPROPERTIES` 
and `OPTIONS` syntaxes, which were historically designed for this kind of 
configuration. I mean, this is not a data-source-specific PR. Also, the scope 
of this PR is limited to write-side configurations.
    
    In any case, +1 for adding an introductory section with both Parquet and ORC 
examples there. We should give both read-side and write-side configuration 
examples, too. Could you file a JIRA issue for that?


---
