Github user rednaxelafx commented on the issue:

    https://github.com/apache/spark/pull/22847

Just in case people wonder, the following is the hack patch that I used for stress testing code splitting before this PR:

```diff
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
@@ -647,11 +647,13 @@ class CodegenContext(val useStreamlining: Boolean) {
    * Returns a term name that is unique within this instance of a `CodegenContext`.
    */
   def freshName(name: String): String = synchronized {
-    val fullName = if (freshNamePrefix == "") {
+    // hack: intentionally add a very long prefix (length=300 characters) to
+    // trigger code splitting more frequently
+    val fullName = ("averylongprefix" * 20) + (if (freshNamePrefix == "") {
       name
     } else {
       s"${freshNamePrefix}_$name"
-    }
+    })
     if (freshNameIds.contains(fullName)) {
       val id = freshNameIds(fullName)
       freshNameIds(fullName) = id + 1
```

Of course, now with this PR, we can simply set the split threshold to a very low value (e.g. `1`) to force split.
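For readers who want to see the effect of the hack in isolation, here is a minimal standalone sketch. It is not the real `CodegenContext`: the class `FreshNameSketch`, its `extraPrefix` parameter, and the `else` branch (which the diff above cuts off) are assumptions made for illustration. It only shows that the patched `freshName` makes every generated identifier 300 characters longer, which presumably inflates the generated source enough to trigger code splitting more often.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the name-generation part of CodegenContext.
// `extraPrefix` models the hack: pass "averylongprefix" * 20 to reproduce it.
class FreshNameSketch(extraPrefix: String = "") {
  private val freshNameIds = mutable.HashMap.empty[String, Int]
  var freshNamePrefix: String = ""

  def freshName(name: String): String = synchronized {
    val fullName = extraPrefix + (if (freshNamePrefix == "") {
      name
    } else {
      s"${freshNamePrefix}_$name"
    })
    if (freshNameIds.contains(fullName)) {
      val id = freshNameIds(fullName)
      freshNameIds(fullName) = id + 1
      // Assumed continuation (not shown in the diff): append the counter.
      s"$fullName$id"
    } else {
      // Assumed continuation (not shown in the diff): first use of this name.
      freshNameIds += fullName -> 1
      fullName
    }
  }
}

object FreshNameDemo {
  def main(args: Array[String]): Unit = {
    val plain = new FreshNameSketch()
    val hacked = new FreshNameSketch("averylongprefix" * 20)
    println(plain.freshName("value").length)   // 5
    println(hacked.freshName("value").length)  // 305 = 300 + 5
  }
}
```

With the configurable threshold from this PR, the same stress test should no longer need any patched build, since lowering the threshold forces splitting directly.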