[ 
https://issues.apache.org/jira/browse/SPARK-48568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17947535#comment-17947535
 ] 

qiwei huang commented on SPARK-48568:
-------------------------------------

I tried the fix from the pull request but could not reproduce the improvement. 
Across all test cases the gain was under 10%, and some cases were even 
slower.

> Improve Performance of CodeFormatter with Java StringBuilder
> ------------------------------------------------------------
>
>                 Key: SPARK-48568
>                 URL: https://issues.apache.org/jira/browse/SPARK-48568
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Ahir Reddy
>            Priority: Trivial
>              Labels: pull-request-available
>
> - Update CodeFormatter to use Java's `StringBuilder` directly instead of 
> Scala's.
> - The Scala `StringBuilder` is a very thin API that wraps Java's.
> - Every call site that needs to change simply inlines what Scala was 
> already delegating to Java. For example, `clear()` simply calls 
> `setLength(0)` under the hood.
> - The reason for this change is that it substantially improves the 
> performance of the CodeFormatter.
> - From some basic profiling, in a ~100s suite, code formatting took ~2.7 
> seconds of CPU time. Post change it takes about 800ms.
> - My hypothesis is that Java's `StringBuilder` is much more JIT-friendly. 
> Scala's `StringBuilder` adds many layers because it implements a significant 
> portion of the Scala mutable collection API. The JVM also likely has 
> special JIT handling for Java's `StringBuilder`.
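
For readers who have not seen the kind of call-site change the description 
refers to, here is a minimal sketch. It is not the actual CodeFormatter code: 
the object name `BuilderSketch` and both method names are hypothetical, and 
the line-joining logic is only a stand-in. It shows the same operation written 
against Scala's wrapper and against `java.lang.StringBuilder` directly, with 
`clear()` replaced by `setLength(0)`.

```scala
// Hypothetical illustration only; not taken from Spark's CodeFormatter.
import java.lang.{StringBuilder => JStringBuilder}
import scala.collection.mutable

object BuilderSketch {
  // Before: Scala's StringBuilder, a wrapper around java.lang.StringBuilder.
  def joinWithScala(lines: Seq[String]): String = {
    val sb = new mutable.StringBuilder
    sb.append("stale contents")
    sb.clear() // delegates to setLength(0) on the wrapped Java builder
    lines.foreach { l => sb.append(l).append('\n') }
    sb.toString
  }

  // After: the Java class used directly, with the delegated call inlined.
  def joinWithJava(lines: Seq[String]): String = {
    val sb = new JStringBuilder
    sb.append("stale contents")
    sb.setLength(0) // what clear() was doing under the hood
    lines.foreach { l => sb.append(l).append('\n') }
    sb.toString
  }
}
```

Both methods return the same string for the same input, so a change of this 
shape should not alter behavior. Whether it is measurably faster depends on 
the workload and JVM, as the comment above suggests.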



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
