[ 
https://issues.apache.org/jira/browse/SPARK-37646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shixiong Zhu reassigned SPARK-37646:
------------------------------------

    Assignee: Shixiong Zhu

> Avoid touching Scala reflection APIs in the lit function
> --------------------------------------------------------
>
>                 Key: SPARK-37646
>                 URL: https://issues.apache.org/jira/browse/SPARK-37646
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Shixiong Zhu
>            Assignee: Shixiong Zhu
>            Priority: Major
>
> Currently, lit is slow under high concurrency because it goes through 
> Scala reflection code, which takes global locks. For example, running the 
> following test locally on Spark 3.2 shows the difference between calling 
> lit(0L) and constructing new Column(Literal(0L)) directly:
> {code:java}
> scala> :paste
> // Entering paste mode (ctrl-D to finish)
>
> import org.apache.spark.sql.functions._
> import org.apache.spark.sql.Column
> import org.apache.spark.sql.catalyst.expressions.Literal
>
> val parallelism = 50
>
> def testLiteral(): Unit = {
>   val ts = for (_ <- 0 until parallelism) yield {
>     new Thread() {
>       override def run() {
>         for (_ <- 0 until 50) {
>           new Column(Literal(0L))
>         }
>       }
>     }
>   }
>   ts.foreach(_.start())
>   ts.foreach(_.join())
> }
>
> def testLit(): Unit = {
>   val ts = for (_ <- 0 until parallelism) yield {
>     new Thread() {
>       override def run() {
>         for (_ <- 0 until 50) {
>           lit(0L)
>         }
>       }
>     }
>   }
>   ts.foreach(_.start())
>   ts.foreach(_.join())
> }
>
> println("warmup")
> testLiteral()
> testLit()
>
> println("lit: false")
> spark.time {
>   testLiteral()
> }
> println("lit: true")
> spark.time {
>   testLit()
> }
>
> // Exiting paste mode, now interpreting.
>
> warmup
> lit: false
> Time taken: 8 ms
> lit: true
> Time taken: 682 ms
> import org.apache.spark.sql.functions._
> import org.apache.spark.sql.Column
> import org.apache.spark.sql.catalyst.expressions.Literal
> parallelism: Int = 50
> testLiteral: ()Unit
> testLit: ()Unit
> {code}
>  
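
The benchmark in the description suggests a possible caller-side workaround on Spark 3.2: build the Column from Literal directly for simple values, and fall back to lit for anything else. The following is a minimal sketch under that assumption, meant to be pasted into spark-shell. The helper name fastLit is hypothetical and not part of Spark, and this is not necessarily how the issue itself is resolved.

```scala
import org.apache.spark.sql.Column
import org.apache.spark.sql.catalyst.expressions.Literal
import org.apache.spark.sql.functions.lit

// Hypothetical helper (not a Spark API): wraps simple values with
// Literal directly, the same construction the benchmark above measured
// as the fast path, and defers to lit for everything else.
def fastLit(value: Any): Column = value match {
  // An existing Column is returned unchanged, as lit does.
  case c: Column => c
  // Simple primitives and strings: Literal.apply infers the data type
  // from the runtime value.
  case _: Boolean | _: Byte | _: Short | _: Int | _: Long |
       _: Float | _: Double | _: String =>
    new Column(Literal(value))
  // Anything else takes the regular lit path.
  case other => lit(other)
}
```

In the benchmark above, replacing lit(0L) with fastLit(0L) would exercise the same construction as testLiteral. The timings in the description are the only measurements available here, so any speedup from this helper is an expectation rather than a measured result.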



