Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22622#discussion_r223058276

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala ---
    @@ -115,6 +116,69 @@ abstract class OrcSuite extends OrcTest with BeforeAndAfterAll {
         }
       }

    +  protected def testSelectiveDictionaryEncoding(isSelective: Boolean) {
    +    val tableName = "orcTable"
    +
    +    withTempDir { dir =>
    +      withTable(tableName) {
    +        val sqlStatement = orcImp match {
    +          case "native" =>
    +            s"""
    +               |CREATE TABLE $tableName (zipcode STRING, uniqColumn STRING, value DOUBLE)
    +               |USING ORC
    +               |OPTIONS (
    +               |  path '${dir.toURI}',
    +               |  orc.dictionary.key.threshold '1.0',
    +               |  orc.column.encoding.direct 'uniqColumn'
    --- End diff --

    That sounds like a different issue. This PR covers both the `TBLPROPERTIES` and `OPTIONS` syntaxes, which were historically designed for this kind of configuration. I mean, this is not a data-source-specific PR. Also, the scope of this PR is write-side configurations only.

    In any case, +1 for adding an introduction section with both Parquet and ORC examples there. We had better give both read-side and write-side configuration examples, too. Could you file a JIRA issue for that?