[GitHub] spark pull request #22135: [SPARK-25093][SQL] Avoid recompiling regexp for c...
GitHub user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22135

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22135: [SPARK-25093][SQL] Avoid recompiling regexp for c...
GitHub user igreenfield commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22135#discussion_r211091975

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala ---
    @@ -91,10 +94,7 @@ object CodeFormatter {
       }

       def stripExtraNewLinesAndComments(input: String): String = {
    -    val commentReg =
    -      ("""([ |\t]*?\/\*[\s|\S]*?\*\/[ |\t]*?)|""" +    // strip /*comment*/
    -       """([ |\t]*?\/\/[\s\S]*?\n)""").r               // strip //comment
    -    val codeWithoutComment = commentReg.replaceAllIn(input, "")
    +    val codeWithoutComment = commentRegexp.replaceAllIn(input, "")
         codeWithoutComment.replaceAll("""\n\s*\n""", "\n") // strip ExtraNewLines
    --- End diff --

    This line also compiles a regex on every call and could be replaced as well!
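The review comment refers to `String.replaceAll`, which the JDK documents as equivalent to `Pattern.compile(regex).matcher(str).replaceAll(repl)`, so the pattern is recompiled on each call. A minimal sketch of how that remaining call might be hoisted, assuming a hypothetical object `NewLineStripper` and field `extraNewLines` (neither name appears in the pull request):

```scala
import java.util.regex.Pattern

// Hypothetical stand-in used for illustration; not the actual Spark code.
object NewLineStripper {
  // Compiled once, when the object is first initialized.
  private val extraNewLines: Pattern = Pattern.compile("""\n\s*\n""")

  // Collapses runs of blank lines into a single newline,
  // reusing the precompiled pattern instead of String.replaceAll.
  def stripExtraNewLines(code: String): String =
    extraNewLines.matcher(code).replaceAll("\n")
}
```

Whether this is worth doing depends on how often the method is called; the pull request description suggests it runs once per code block when expressions are split.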
[GitHub] spark pull request #22135: [SPARK-25093][SQL] Avoid recompiling regexp for c...
GitHub user mgaido91 opened a pull request:

    https://github.com/apache/spark/pull/22135

    [SPARK-25093][SQL] Avoid recompiling regexp for comments multiple times

    ## What changes were proposed in this pull request?

    The PR moves the compilation of the regexp for code formatting outside the method which is called for each code block when splitting expressions, in order to avoid recompiling the regexp every time.

    Credit should be given to Izek Greenfield.

    ## How was this patch tested?

    existing UTs

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mgaido91/spark SPARK-25093

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22135.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22135

----
commit e39c855c8ea309c9ceccb775f0e2f7a1c2df3554
Author: Marco Gaido
Date:   2018-08-17T15:04:58Z

    [SPARK-25093][SQL] Avoid recompiling regexp for comments multiple times
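The change described above can be sketched as follows. This is only an illustration under assumptions: the object name `CommentStripper` and method name `stripComments` are hypothetical, while the regular expression and the field name `commentRegexp` are copied from the diff quoted in this thread.

```scala
import scala.util.matching.Regex

// Hypothetical stand-in used for illustration; not the actual CodeFormatter.
object CommentStripper {
  // Compiled once at object initialization instead of on every method call.
  private val commentRegexp: Regex =
    ("""([ |\t]*?\/\*[\s|\S]*?\*\/[ |\t]*?)|""" + // strip /*comment*/
      """([ |\t]*?\/\/[\s\S]*?\n)""").r           // strip //comment

  // Removes block and line comments using the shared, precompiled regex.
  def stripComments(input: String): String =
    commentRegexp.replaceAllIn(input, "")
}
```

Because a compiled `Regex` wraps an immutable `java.util.regex.Pattern`, sharing one instance across calls should be safe; only the per-call matcher state is created anew.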