[ https://issues.apache.org/jira/browse/SPARK-48568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17947535#comment-17947535 ]
qiwei huang commented on SPARK-48568:
-------------------------------------

I tried the fix from the pull request but couldn't reproduce the improvement. In all of my test cases the improvement was less than 10%, and some cases performed even worse.

> Improve Performance of CodeFormatter with Java StringBuilder
> ------------------------------------------------------------
>
>                 Key: SPARK-48568
>                 URL: https://issues.apache.org/jira/browse/SPARK-48568
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Ahir Reddy
>            Priority: Trivial
>              Labels: pull-request-available
>
> - Update CodeFormatter to use Java's `StringBuilder` directly instead of
>   Scala's.
> - The Scala `StringBuilder` is a very thin API that wraps Java's.
> - All call sites that need to change just trivially copy what Scala was
>   delegating to Java. For example, `clear()` simply calls `setLength(0)`
>   under the hood.
> - The reason for this change is that it substantially improves the
>   performance of the CodeFormatter.
> - From some basic profiling, in a ~100s suite, code formatting took ~2.7
>   seconds of CPU time. Post change it takes about 800ms.
> - My hypothesis is that Java's StringBuilder is much more JIT friendly.
>   Scala's StringBuilder has many layers, as it implements a significant
>   portion of the Scala mutable collection API. It's also likely that the
>   JVM has special JIT handling for Java's StringBuilder.
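
For readers following the thread, here is a minimal sketch of the kind of call-site swap the issue describes. It is not taken from CodeFormatter or from the pull request: the object `StringBuilderSwap` and the helpers `indentScala` and `indentJava` are hypothetical names made up for illustration. It only shows that `clear()` on Scala's wrapper and `setLength(0)` on Java's builder leave the buffer in the same state, and it makes no claim about performance.

```scala
import java.lang.{StringBuilder => JStringBuilder}
import scala.collection.mutable.{StringBuilder => SStringBuilder}

// Hypothetical illustration only; none of these names come from Spark.
object StringBuilderSwap {

  // Before: Scala's mutable.StringBuilder, which wraps a Java StringBuilder.
  def indentScala(line: String, level: Int): String = {
    val sb = new SStringBuilder
    sb.append("stale")                    // leftover content from earlier use
    sb.clear()                            // Scala API for emptying the buffer
    sb.append("  " * level).append(line)  // two spaces per indent level
    sb.result()
  }

  // After: java.lang.StringBuilder used directly.
  def indentJava(line: String, level: Int): String = {
    val sb = new JStringBuilder
    sb.append("stale")
    sb.setLength(0)                       // the call that clear() is described as delegating to
    sb.append("  " * level).append(line)
    sb.toString
  }

  def main(args: Array[String]): Unit = {
    // Both variants should produce identical strings.
    println(indentScala("int x = 1;", 2) == indentJava("int x = 1;", 2))
  }
}
```

Because the two helpers differ only in which builder type they touch, a pair like this could also serve as the starting point for a benchmark when trying to reproduce the reported numbers, though a trustworthy measurement would need a proper harness with JIT warm-up rather than a single call.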