[ 
https://issues.apache.org/jira/browse/SPARK-37989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17480882#comment-17480882
 ] 

Yuming Wang commented on SPARK-37989:
-------------------------------------

Benchmark 3
{noformat}
import org.apache.spark.benchmark.Benchmark
import org.apache.spark.sql.catalyst.optimizer.LimitPushDown
import org.apache.spark.sql.internal.SQLConf

val numRows = 1024 * 1024 * 1000

spark.sql(s"CREATE TABLE t1 using parquet AS SELECT id % 1000 AS a, id % 10 AS b, id % 100 AS c FROM range(1, ${numRows}L, 1, 5)")
val benchmark = new Benchmark("Push down limit through Aggregate if it is group only", numRows, minNumIters = 1)
// Excluding LimitPushDown disables the push down; an empty exclusion list leaves it enabled.
// withSQLConf is assumed to be in scope (for example from Spark's SQLHelper test trait).
Seq(LimitPushDown.ruleName, "").foreach { excludedRules =>
  benchmark.addCase(s"Push down ${if (excludedRules.length > 0) "disabled" else "enabled"}") { _ =>
    withSQLConf(SQLConf.OPTIMIZER_EXCLUDED_RULES.key -> excludedRules) {
      // The noop sink forces full execution without writing any output.
      spark.sql("SELECT distinct * FROM t1 LIMIT 5000").write.format("noop").mode("Overwrite").save()
    }
  }
}

benchmark.run()
{noformat}


{noformat}
Java HotSpot(TM) 64-Bit Server VM 1.8.0_281-b09 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Push down limit through Aggregate if it is group only:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------------
Push down disabled                                             37318          37318           0         28.1          35.6       1.0X
Push down enabled                                              37940          37940           0         27.6          36.2       1.0X
{noformat}
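The derived columns in the result table can be cross-checked from the row count and the best times. The following is a minimal sketch, assuming Rate(M/s) is rows divided by best time, Per Row(ns) is best time divided by rows, and Relative is the first case's best time divided by the current case's best time. The object and method names are hypothetical and not part of Spark's Benchmark class.

```scala
// Hypothetical helper: recompute the derived columns of the result table
// from the row count and the measured best times. Names are illustrative only.
object DerivedColumns {
  // Rows processed per iteration, matching numRows in the benchmark script.
  val numRows: Long = 1024L * 1024 * 1000

  // Millions of rows per second for a best time given in milliseconds.
  def rateMPerSec(bestMs: Double): Double = numRows / (bestMs / 1000.0) / 1e6

  // Nanoseconds spent per row for a best time given in milliseconds.
  def perRowNs(bestMs: Double): Double = bestMs * 1e6 / numRows

  // Speed relative to the baseline (first) case; values below 1.0 mean slower.
  def relative(baselineMs: Double, bestMs: Double): Double = baselineMs / bestMs

  def main(args: Array[String]): Unit = {
    val disabledMs = 37318.0 // "Push down disabled" best time from the table
    val enabledMs = 37940.0  // "Push down enabled" best time from the table
    println(f"disabled: ${rateMPerSec(disabledMs)}%.1f M/s, ${perRowNs(disabledMs)}%.1f ns/row")
    println(f"enabled:  ${rateMPerSec(enabledMs)}%.1f M/s, ${perRowNs(enabledMs)}%.1f ns/row")
    println(f"relative: ${relative(disabledMs, enabledMs)}%.2fX")
  }
}
```

Under these assumptions the unrounded relative speed of the enabled case is slightly below 1.0, which the table rounds to 1.0X. Because the benchmark ran with minNumIters = 1 (Stdev 0), this run gives no estimate of run-to-run variance.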


> Push down limit through Aggregate if it is group only
> -----------------------------------------------------
>
>                 Key: SPARK-37989
>                 URL: https://issues.apache.org/jira/browse/SPARK-37989
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Yuming Wang
>            Priority: Major
>



