This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 8385749  [SPARK-27043][SQL] Add ORC nested schema pruning benchmarks
8385749 is described below

commit 83857496e53520aa0fdf3978fcbcdd6c49c3ab5c
Author: Liang-Chi Hsieh <vii...@gmail.com>
AuthorDate: Tue Mar 5 11:12:57 2019 -0800

    [SPARK-27043][SQL] Add ORC nested schema pruning benchmarks

    ## What changes were proposed in this pull request?

    We have benchmark of nested schema pruning, but only for Parquet. This adds similar benchmark for ORC. This is used with nested schema pruning of ORC.

    ## How was this patch tested?

    Added test.

    Closes #23955 from viirya/orc-nested-schema-pruning-benchmark.

    Authored-by: Liang-Chi Hsieh <vii...@gmail.com>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 .../NestedSchemaPruningBenchmark-results.txt       | 40 ----------------
 .../OrcNestedSchemaPruningBenchmark-results.txt    | 40 ++++++++++++++++
 .../OrcV2NestedSchemaPruningBenchmark-results.txt  | 40 ++++++++++++++++
 ...ParquetNestedSchemaPruningBenchmark-results.txt | 40 ++++++++++++++++
 .../benchmark/NestedSchemaPruningBenchmark.scala   | 54 ++++++++++------------
 .../OrcNestedSchemaPruningBenchmark.scala          | 44 ++++++++++++++++++
 .../OrcV2NestedSchemaPruningBenchmark.scala        | 35 ++++++++++++++
 .../ParquetNestedSchemaPruningBenchmark.scala      | 35 ++++++++++++++
 8 files changed, 258 insertions(+), 70 deletions(-)

diff --git a/sql/core/benchmarks/NestedSchemaPruningBenchmark-results.txt b/sql/core/benchmarks/NestedSchemaPruningBenchmark-results.txt
deleted file mode 100644
index 7585cae..0000000
--- a/sql/core/benchmarks/NestedSchemaPruningBenchmark-results.txt
+++ /dev/null
@@ -1,40 +0,0 @@
-================================================================================================
-Nested Schema Pruning Benchmark
-================================================================================================
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_201-b09 on Mac OS X 10.14.3
-Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
-Selection:                               Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------
-Top-level column                                59 /   68         16.9          59.1       1.0X
-Nested column                                  180 /  186          5.6         179.7       0.3X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_201-b09 on Mac OS X 10.14.3
-Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
-Limiting:                                Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------
-Top-level column                               241 /  246          4.2         240.9       1.0X
-Nested column                                 1828 / 1904          0.5        1827.5       0.1X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_201-b09 on Mac OS X 10.14.3
-Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
-Repartitioning:                          Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------
-Top-level column                               201 /  208          5.0         200.8       1.0X
-Nested column                                 1811 / 1864          0.6        1811.4       0.1X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_201-b09 on Mac OS X 10.14.3
-Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
-Repartitioning by exprs:                 Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------
-Top-level column                               206 /  212          4.9         205.8       1.0X
-Nested column                                 1814 / 1863          0.6        1814.3       0.1X
-
-Java HotSpot(TM) 64-Bit Server VM 1.8.0_201-b09 on Mac OS X 10.14.3
-Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
-Sorting:                                 Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------
-Top-level column                               282 /  302          3.5         281.7       1.0X
-Nested column                                 2093 / 2199          0.5        2093.1       0.1X
-
-
diff --git a/sql/core/benchmarks/OrcNestedSchemaPruningBenchmark-results.txt b/sql/core/benchmarks/OrcNestedSchemaPruningBenchmark-results.txt
new file mode 100644
index 0000000..f738256
--- /dev/null
+++ b/sql/core/benchmarks/OrcNestedSchemaPruningBenchmark-results.txt
@@ -0,0 +1,40 @@
+================================================================================================
+Nested Schema Pruning Benchmark For ORC v1
+================================================================================================
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Selection:                                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                    113            196          89          8.8         113.0       1.0X
+Nested column                                      1316           1639         240          0.8        1315.5       0.1X
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Limiting:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                    260            474         211          3.8         260.4       1.0X
+Nested column                                      2322           3312         701          0.4        2322.3       0.1X
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Repartitioning:                           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                    275            318          55          3.6         274.8       1.0X
+Nested column                                      2482           3263         759          0.4        2482.2       0.1X
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Repartitioning by exprs:                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                    274            288          11          3.7         273.9       1.0X
+Nested column                                      2783           2905          86          0.4        2782.7       0.1X
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Sorting:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                    382            419          23          2.6         382.4       1.0X
+Nested column                                      2974           3517         699          0.3        2974.1       0.1X
+
+
diff --git a/sql/core/benchmarks/OrcV2NestedSchemaPruningBenchmark-results.txt b/sql/core/benchmarks/OrcV2NestedSchemaPruningBenchmark-results.txt
new file mode 100644
index 0000000..ad43ffb
--- /dev/null
+++ b/sql/core/benchmarks/OrcV2NestedSchemaPruningBenchmark-results.txt
@@ -0,0 +1,40 @@
+================================================================================================
+Nested Schema Pruning Benchmark For ORC v2
+================================================================================================
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Selection:                                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                     91            102           9         11.0          91.2       1.0X
+Nested column                                      1459           1548          80          0.7        1458.5       0.1X
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Limiting:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                    101            112          10          9.9         100.7       1.0X
+Nested column                                      1459           1619         109          0.7        1458.9       0.1X
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Repartitioning:                           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                    268            284          12          3.7         268.2       1.0X
+Nested column                                      2781           2865          73          0.4        2780.8       0.1X
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Repartitioning by exprs:                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                    309            318           6          3.2         309.2       1.0X
+Nested column                                      2426           2891         253          0.4        2425.8       0.1X
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Sorting:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                    179            194           8          5.6         179.3       1.0X
+Nested column                                      2084           2277         243          0.5        2083.7       0.1X
+
+
diff --git a/sql/core/benchmarks/ParquetNestedSchemaPruningBenchmark-results.txt b/sql/core/benchmarks/ParquetNestedSchemaPruningBenchmark-results.txt
new file mode 100644
index 0000000..d51ebc6
--- /dev/null
+++ b/sql/core/benchmarks/ParquetNestedSchemaPruningBenchmark-results.txt
@@ -0,0 +1,40 @@
+================================================================================================
+Nested Schema Pruning Benchmark For Parquet
+================================================================================================
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Selection:                                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                     88            114          16         11.4          87.5       1.0X
+Nested column                                       201            223          27          5.0         200.5       0.4X
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Limiting:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                    263            315          36          3.8         263.2       1.0X
+Nested column                                      2111           2622         613          0.5        2111.1       0.1X
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Repartitioning:                           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                    222            250          34          4.5         222.2       1.0X
+Nested column                                      2084           2339         266          0.5        2084.2       0.1X
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Repartitioning by exprs:                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                    238            306          96          4.2         238.1       1.0X
+Nested column                                      2080           2373         218          0.5        2079.5       0.1X
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.14.3
+Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
+Sorting:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+Top-level column                                    328            383          57          3.1         327.6       1.0X
+Nested column                                      2595           3136         638          0.4        2595.1       0.1X
+
+
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/NestedSchemaPruningBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/NestedSchemaPruningBenchmark.scala
index ddfc8ae..e852de1 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/NestedSchemaPruningBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/NestedSchemaPruningBenchmark.scala
@@ -21,23 +21,17 @@ import org.apache.spark.benchmark.Benchmark
 import org.apache.spark.sql.internal.SQLConf
 
 /**
- * Synthetic benchmark for nested schema pruning performance.
- * To run this benchmark:
- * {{{
- *   1. without sbt:
- *      bin/spark-submit --class <this class> --jars <spark core test jar> <sql core test jar>
- *   2. build/sbt "sql/test:runMain <this class>"
- *   3. generate result:
- *      SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain <this class>"
- *      Results will be written to "benchmarks/NestedSchemaPruningBenchmark-results.txt".
- * }}}
+ * The base class for synthetic benchmark for nested schema pruning performance.
  */
-object NestedSchemaPruningBenchmark extends SqlBasedBenchmark {
+abstract class NestedSchemaPruningBenchmark extends SqlBasedBenchmark {
 
   import spark.implicits._
 
-  private val N = 1000000
-  private val numIters = 10
+  val dataSourceName: String
+  val benchmarkName: String
+
+  protected val N = 1000000
+  protected val numIters = 10
 
   // We use `col1 BIGINT, col2 STRUCT<_1: BIGINT, _2: STRING>` as a test schema.
   // col1 and col2._1 is used for comparision. col2._2 mimics the burden for the other columns
@@ -53,13 +47,13 @@ object NestedSchemaPruningBenchmark extends SqlBasedBenchmark {
     }
   }
 
-  private def selectBenchmark(numRows: Int, numIters: Int): Unit = {
+  protected def selectBenchmark(numRows: Int, numIters: Int): Unit = {
     withTempPath { dir =>
       val path = dir.getCanonicalPath
 
       Seq(1, 2).foreach { i =>
-        df.write.parquet(path + s"/$i")
-        spark.read.parquet(path + s"/$i").createOrReplaceTempView(s"t$i")
+        df.write.format(dataSourceName).save(path + s"/$i")
+        spark.read.format(dataSourceName).load(path + s"/$i").createOrReplaceTempView(s"t$i")
       }
 
       val benchmark = new Benchmark(s"Selection", numRows, numIters, output = output)
@@ -71,13 +65,13 @@ object NestedSchemaPruningBenchmark extends SqlBasedBenchmark {
     }
   }
 
-  private def limitBenchmark(numRows: Int, numIters: Int): Unit = {
+  protected def limitBenchmark(numRows: Int, numIters: Int): Unit = {
     withTempPath { dir =>
       val path = dir.getCanonicalPath
 
       Seq(1, 2).foreach { i =>
-        df.write.parquet(path + s"/$i")
-        spark.read.parquet(path + s"/$i").createOrReplaceTempView(s"t$i")
+        df.write.format(dataSourceName).save(path + s"/$i")
+        spark.read.format(dataSourceName).load(path + s"/$i").createOrReplaceTempView(s"t$i")
       }
 
       val benchmark = new Benchmark(s"Limiting", numRows, numIters, output = output)
@@ -91,13 +85,13 @@ object NestedSchemaPruningBenchmark extends SqlBasedBenchmark {
     }
   }
 
-  private def repartitionBenchmark(numRows: Int, numIters: Int): Unit = {
+  protected def repartitionBenchmark(numRows: Int, numIters: Int): Unit = {
     withTempPath { dir =>
       val path = dir.getCanonicalPath
 
       Seq(1, 2).foreach { i =>
-        df.write.parquet(path + s"/$i")
-        spark.read.parquet(path + s"/$i").createOrReplaceTempView(s"t$i")
+        df.write.format(dataSourceName).save(path + s"/$i")
+        spark.read.format(dataSourceName).load(path + s"/$i").createOrReplaceTempView(s"t$i")
       }
 
       val benchmark = new Benchmark(s"Repartitioning", numRows, numIters, output = output)
@@ -111,13 +105,13 @@ object NestedSchemaPruningBenchmark extends SqlBasedBenchmark {
     }
   }
 
-  private def repartitionByExprBenchmark(numRows: Int, numIters: Int): Unit = {
+  protected def repartitionByExprBenchmark(numRows: Int, numIters: Int): Unit = {
     withTempPath { dir =>
       val path = dir.getCanonicalPath
 
       Seq(1, 2).foreach { i =>
-        df.write.parquet(path + s"/$i")
-        spark.read.parquet(path + s"/$i").createOrReplaceTempView(s"t$i")
+        df.write.format(dataSourceName).save(path + s"/$i")
+        spark.read.format(dataSourceName).load(path + s"/$i").createOrReplaceTempView(s"t$i")
       }
 
       val benchmark = new Benchmark(s"Repartitioning by exprs", numRows, numIters, output = output)
@@ -131,13 +125,13 @@ object NestedSchemaPruningBenchmark extends SqlBasedBenchmark {
     }
   }
 
-  private def sortBenchmark(numRows: Int, numIters: Int): Unit = {
+  protected def sortBenchmark(numRows: Int, numIters: Int): Unit = {
     withTempPath { dir =>
       val path = dir.getCanonicalPath
 
       Seq(1, 2).foreach { i =>
-        df.write.parquet(path + s"/$i")
-        spark.read.parquet(path + s"/$i").createOrReplaceTempView(s"t$i")
+        df.write.format(dataSourceName).save(path + s"/$i")
+        spark.read.format(dataSourceName).load(path + s"/$i").createOrReplaceTempView(s"t$i")
       }
 
       val benchmark = new Benchmark(s"Sorting", numRows, numIters, output = output)
@@ -150,8 +144,8 @@ object NestedSchemaPruningBenchmark extends SqlBasedBenchmark {
   }
 
   override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
-    runBenchmark(s"Nested Schema Pruning Benchmark") {
-      withSQLConf (SQLConf.NESTED_SCHEMA_PRUNING_ENABLED.key -> "true") {
+    runBenchmark(benchmarkName) {
+      withSQLConf(SQLConf.NESTED_SCHEMA_PRUNING_ENABLED.key -> "true") {
         selectBenchmark (N, numIters)
         limitBenchmark (N, numIters)
         repartitionBenchmark (N, numIters)
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/OrcNestedSchemaPruningBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/OrcNestedSchemaPruningBenchmark.scala
new file mode 100644
index 0000000..947fc67
--- /dev/null
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/OrcNestedSchemaPruningBenchmark.scala
@@ -0,0 +1,44 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.benchmark
+
+import org.apache.spark.sql.internal.SQLConf
+
+/**
+ * Synthetic benchmark for nested schema pruning performance for ORC V1 datasource.
+ * To run this benchmark:
+ * {{{
+ *   1. without sbt:
+ *      bin/spark-submit --class <this class> --jars <spark core test jar> <sql core test jar>
+ *   2. build/sbt "sql/test:runMain <this class>"
+ *   3. generate result:
+ *      SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain <this class>"
+ *      Results will be written to "benchmarks/OrcNestedSchemaPruningBenchmark-results.txt".
+ * }}}
+ */
+object OrcNestedSchemaPruningBenchmark extends NestedSchemaPruningBenchmark {
+  override val dataSourceName: String = "orc"
+  override val benchmarkName: String = "Nested Schema Pruning Benchmark For ORC v1"
+
+  override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
+    withSQLConf(SQLConf.USE_V1_SOURCE_READER_LIST.key -> "orc",
+      SQLConf.USE_V1_SOURCE_WRITER_LIST.key -> "orc") {
+      super.runBenchmarkSuite(mainArgs)
+    }
+  }
+}
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/OrcV2NestedSchemaPruningBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/OrcV2NestedSchemaPruningBenchmark.scala
new file mode 100644
index 0000000..e735d1c
--- /dev/null
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/OrcV2NestedSchemaPruningBenchmark.scala
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.benchmark
+
+/**
+ * Synthetic benchmark for nested schema pruning performance for ORC V2 datasource.
+ * To run this benchmark:
+ * {{{
+ *   1. without sbt:
+ *      bin/spark-submit --class <this class> --jars <spark core test jar> <sql core test jar>
+ *   2. build/sbt "sql/test:runMain <this class>"
+ *   3. generate result:
+ *      SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain <this class>"
+ *      Results will be written to "benchmarks/OrcV2NestedSchemaPruningBenchmark-results.txt".
+ * }}}
+ */
+object OrcV2NestedSchemaPruningBenchmark extends NestedSchemaPruningBenchmark {
+  override val dataSourceName: String = "orc"
+  override val benchmarkName: String = "Nested Schema Pruning Benchmark For ORC v2"
+}
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/ParquetNestedSchemaPruningBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/ParquetNestedSchemaPruningBenchmark.scala
new file mode 100644
index 0000000..1c9cc2c
--- /dev/null
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/ParquetNestedSchemaPruningBenchmark.scala
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.benchmark
+
+/**
+ * Synthetic benchmark for nested schema pruning performance for Parquet datasource.
+ * To run this benchmark:
+ * {{{
+ *   1. without sbt:
+ *      bin/spark-submit --class <this class> --jars <spark core test jar> <sql core test jar>
+ *   2. build/sbt "sql/test:runMain <this class>"
+ *   3. generate result:
+ *      SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain <this class>"
+ *      Results will be written to "benchmarks/ParquetNestedSchemaPruningBenchmark-results.txt".
+ * }}}
+ */
+object ParquetNestedSchemaPruningBenchmark extends NestedSchemaPruningBenchmark {
+  override val dataSourceName: String = "parquet"
+  override val benchmarkName: String = "Nested Schema Pruning Benchmark For Parquet"
+}
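
The refactoring in this commit turns one Parquet-only benchmark object into an abstract base class with one small object per data source. A minimal, Spark-free sketch of that shape follows, as an illustration only. `FormatBenchmarkSketch`, `describeWrite`, `OrcSketch`, `ParquetSketch`, and the `/tmp/...` paths are hypothetical names invented here; only `dataSourceName`, `benchmarkName`, and `runBenchmarkSuite` mirror identifiers from the diff, and the real methods take different arguments and do real I/O.

```scala
// Hypothetical sketch: mirrors the base-class/override pattern from the diff
// without depending on Spark. Nothing here is the actual Spark implementation.
abstract class FormatBenchmarkSketch {
  // Supplied by each concrete benchmark, like the abstract vals added in the commit.
  val dataSourceName: String
  val benchmarkName: String

  // Stand-in for df.write.format(dataSourceName).save(path): it only builds a
  // description string, so the sketch runs without a Spark session.
  protected def describeWrite(path: String): String =
    s"write format=$dataSourceName path=$path"

  // The abstract vals are read only inside methods, never in the constructor,
  // so the overriding objects are fully initialized before they are used.
  def runBenchmarkSuite(): Seq[String] =
    Seq(1, 2).map(i => s"[$benchmarkName] " + describeWrite(s"/tmp/t$i"))
}

object OrcSketch extends FormatBenchmarkSketch {
  override val dataSourceName: String = "orc"
  override val benchmarkName: String = "Sketch For ORC"
}

object ParquetSketch extends FormatBenchmarkSketch {
  override val dataSourceName: String = "parquet"
  override val benchmarkName: String = "Sketch For Parquet"
}

object SketchDemo {
  def main(args: Array[String]): Unit = {
    OrcSketch.runBenchmarkSuite().foreach(println)
    ParquetSketch.runBenchmarkSuite().foreach(println)
  }
}
```

Switching from `write.parquet(...)` to `write.format(dataSourceName).save(...)` is what lets a single set of benchmark bodies serve every format, so adding a data source only requires a new object with two overrides.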