[GitHub] spark pull request #20851: [SPARK-23727][SQL] Support DATE predicate push down...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20851#discussion_r175283818

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala ---

@@ -50,6 +50,15 @@ private[parquet] object ParquetFilters {
       (n: String, v: Any) => FilterApi.eq(
         binaryColumn(n),
         Option(v).map(b => Binary.fromReusedByteArray(v.asInstanceOf[Array[Byte]])).orNull)
+    case DateType =>
+      (n: String, v: Any) => {
+        FilterApi.eq(
+          intColumn(n),
+          Option(v).map{ date =>

--- End diff --

nit: `.map{` -> `.map {`

---

- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
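Context for the diff above: Parquet stores a DATE column as an INT32 counting days since the Unix epoch, so the `DateType` branch must turn a `java.sql.Date` into that day count before building an `intColumn` filter. Spark does this with `DateTimeUtils.fromJavaDate`; the snippet below is a simplified, timezone-naive sketch of the same arithmetic (the `dateToEpochDays` helper is illustrative, not Spark's actual implementation):

```scala
import java.sql.Date
import java.util.concurrent.TimeUnit

// Parquet's DATE logical type annotates an INT32 holding days since
// 1970-01-01. A timezone-naive sketch of the Date -> Int conversion
// (Spark's real DateTimeUtils.fromJavaDate handles time zones properly):
def dateToEpochDays(date: Date): Int =
  TimeUnit.MILLISECONDS.toDays(date.getTime).toInt

// A pushed-down predicate like `col = DATE'2018-03-01'` then reduces to an
// integer comparison against the column's INT32 representation:
val days = dateToEpochDays(Date.valueOf("2018-03-01"))
println(days)
```

This is why the diff calls `intColumn(n)` rather than a date-specific column builder: at the Parquet level the comparison is purely on ints.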
[GitHub] spark pull request #20851: [SPARK-23727][SQL] Support DATE predicate push down...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20851#discussion_r175276718

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala ---

@@ -148,6 +193,15 @@ private[parquet] object ParquetFilters {
     case BinaryType =>
       (n: String, v: Any) => FilterApi.gtEq(binaryColumn(n),
         Binary.fromReusedByteArray(v.asInstanceOf[Array[Byte]]))
+    case DateType =>

--- End diff --

Add a new SQLConf to make it configurable, so users can turn it off if they hit a bug introduced in this PR.

---
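The reviewer is asking for an escape hatch: a flag that disables the new pushdown without reverting the PR. A hedged sketch of such an entry using Spark's `SQLConf` DSL follows; the config key, doc text, and default are illustrative, not necessarily what the PR eventually adopted, and the snippet assumes it lives inside `object SQLConf` where `buildConf` is in scope:

```scala
// Illustrative SQLConf entry (config fragment, assumes SQLConf.buildConf
// is in scope). Key name and default chosen for the sketch only.
val PARQUET_FILTER_PUSHDOWN_DATE_ENABLED =
  buildConf("spark.sql.parquet.filterPushdown.date")
    .doc("If true, enables Parquet filter push-down optimization for Date. " +
      "Only effective when spark.sql.parquet.filterPushdown is also enabled.")
    .booleanConf
    .createWithDefault(true)
```

`ParquetFilters` would then consult this flag before emitting the `DateType` case, so a user who hits a correctness bug can set the key to `false` and fall back to the old behavior.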
[GitHub] spark pull request #20851: [SPARK-23727][SQL] Support DATE predicate push down...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20851#discussion_r175276713

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala ---

@@ -313,6 +314,36 @@ class ParquetFilterSuite extends QueryTest with ParquetTest with SharedSQLContext
     }
   }

+  test("filter pushdown - date") {
+    implicit class IntToDate(int: Int) {
+      def d: Date = new Date(Date.valueOf("2018-03-01").getTime + 24 * 60 * 60 * 1000 * (int - 1))
+    }
+
+    withParquetDataFrame((1 to 4).map(i => Tuple1(i.d))) { implicit df =>

--- End diff --

These test cases cover only a limited set of cases. We need to check the boundary values.

---
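The test above exercises only four consecutive dates in March 2018. Since Parquet encodes DATE as a signed INT32 day count, the interesting boundaries are dates whose day counts are zero, negative, or very large. The list below is a sketch of candidate boundary inputs, not the cases the PR ultimately added:

```scala
import java.sql.Date

// Illustrative boundary dates for a DATE-pushdown test. Dates before the
// Unix epoch produce negative day counts, which is exactly where a naive
// millis-to-days conversion can go wrong.
val boundaryDates = Seq(
  Date.valueOf("1970-01-01"), // the epoch itself: day 0
  Date.valueOf("1969-12-31"), // day -1, exercises negative INT32 values
  Date.valueOf("1582-10-15"), // first day of the Gregorian calendar
  Date.valueOf("9999-12-31")  // a far-future upper bound
)
boundaryDates.foreach(println)
```

Feeding values like these through `withParquetDataFrame` alongside the existing four-day range would cover the sign boundary the reviewer is worried about.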
[GitHub] spark pull request #20851: [SPARK-23727][SQL] Support DATE predicate push down...
GitHub user yucai opened a pull request: https://github.com/apache/spark/pull/20851

[SPARK-23727][SQL] Support DATE predicate push down in parquet

## What changes were proposed in this pull request?

DATE predicate push down is missing in Parquet; it should be supported.

## How was this patch tested?

Added unit tests and tested locally.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yucai/spark SPARK-23727

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20851.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20851

commit 079af71359bd49dc59c863f1a9a4f6fa28d5a8a0
Author: yucai
Date: 2018-03-18T03:49:09Z

    [SPARK-23727][SQL] Support DATE predicate push down in parquet