Ahir Reddy created SPARK-48568:
----------------------------------
Summary: Improve Performance of CodeFormatter with Java StringBuilder
Key: SPARK-48568
URL: https://issues.apache.org/jira/browse/SPARK-48568
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.0.0
Reporter: Ahir Reddy
- Update CodeFormatter to use Java's `StringBuilder` directly instead of Scala's.
- The Scala `StringBuilder` is a very thin API that wraps Java's.
- Each call site that needs to change simply makes the call that Scala was
already delegating to Java. For example, Scala's `clear()` simply calls
`setLength(0)` under the hood.
- The reason for this change is that it substantially improves the performance
of the CodeFormatter.
- From some basic profiling, in a ~100s suite, code formatting took ~2.7
seconds of CPU time. After the change it takes about 800ms.
- My hypothesis is that Java's `StringBuilder` is much more JIT-friendly. Scala's
`StringBuilder` has many layers, as it implements a significant portion of the
Scala mutable collection API. It's also likely that the JVM has special JIT
handling for Java's `StringBuilder`.
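
To illustrate the kind of call-site change described above, here is a minimal sketch in Java. It is not Spark's CodeFormatter: the class `BufferReuseSketch` and its `format` method are hypothetical names chosen for illustration, and the indentation rule is deliberately simplified. It only shows a single `java.lang.StringBuilder` being reused as a per-line buffer and reset with `setLength(0)`, the call that the description says Scala's `clear()` delegates to.

```java
// Hypothetical sketch, not Spark's CodeFormatter; all names are illustrative.
final class BufferReuseSketch {

    // Re-indents brace-delimited code, reusing one java.lang.StringBuilder
    // as a scratch buffer for each output line.
    static String format(String code) {
        StringBuilder out = new StringBuilder();
        StringBuilder line = new StringBuilder();
        int indent = 0;
        for (String raw : code.split("\n")) {
            String trimmed = raw.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines in this simplified sketch
            }
            // A line that starts with a closing brace is dedented first.
            if (trimmed.startsWith("}")) {
                indent = Math.max(0, indent - 1);
            }
            // Reset the scratch buffer directly; per the description,
            // this is what Scala's clear() calls under the hood.
            line.setLength(0);
            for (int i = 0; i < indent; i++) {
                line.append("  ");
            }
            line.append(trimmed);
            out.append(line).append('\n');
            // A line that ends with an opening brace indents what follows.
            if (trimmed.endsWith("{")) {
                indent++;
            }
        }
        return out.toString();
    }
}
```

For example, `BufferReuseSketch.format("a {\nb;\n}")` returns `"a {\n  b;\n}\n"`. The sketch says nothing about performance on its own; the profiling numbers above come from the reporter's measurements on the real formatter.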