[GitHub] spark pull request #22135: [SPARK-25093][SQL] Avoid recompiling regexp for c...

2018-08-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22135


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22135: [SPARK-25093][SQL] Avoid recompiling regexp for c...

2018-08-18 Thread igreenfield
Github user igreenfield commented on a diff in the pull request:

https://github.com/apache/spark/pull/22135#discussion_r211091975
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala ---
@@ -91,10 +94,7 @@ object CodeFormatter {
   }
 
   def stripExtraNewLinesAndComments(input: String): String = {
-    val commentReg =
-      ("""([ |\t]*?\/\*[\s|\S]*?\*\/[ |\t]*?)|""" +    // strip /*comment*/
-       """([ |\t]*?\/\/[\s\S]*?\n)""").r               // strip //comment
-    val codeWithoutComment = commentReg.replaceAllIn(input, "")
+    val codeWithoutComment = commentRegexp.replaceAllIn(input, "")
     codeWithoutComment.replaceAll("""\n\s*\n""", "\n") // strip ExtraNewLines
--- End diff --

This line also compiles a regex on every call and could be replaced as well!
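
For illustration: `String.replaceAll` is documented as equivalent to `Pattern.compile(regex).matcher(str).replaceAll(repl)`, so the pattern is compiled on each call. A minimal sketch of hoisting it, where `NewLineSketch`, `extraNewLinesRegexp` and `stripExtraNewLines` are hypothetical names not taken from the patch:

```scala
import java.util.regex.Pattern

object NewLineSketch {
  // Compiled once, instead of on every String.replaceAll call.
  private val extraNewLinesRegexp: Pattern = Pattern.compile("""\n\s*\n""")

  // Collapses runs of blank (whitespace-only) lines into a single newline.
  def stripExtraNewLines(code: String): String =
    extraNewLinesRegexp.matcher(code).replaceAll("\n")
}
```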


---




[GitHub] spark pull request #22135: [SPARK-25093][SQL] Avoid recompiling regexp for c...

2018-08-17 Thread mgaido91
GitHub user mgaido91 opened a pull request:

https://github.com/apache/spark/pull/22135

[SPARK-25093][SQL] Avoid recompiling regexp for comments multiple times

## What changes were proposed in this pull request?

The PR moves the compilation of the regexp for code formatting outside the 
method which is called for each code block when splitting expressions, in order 
to avoid recompiling the regexp every time.
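
As a rough sketch of the idea (not the actual patch; `FormatterSketch` and `stripComments` are hypothetical names, and the pattern is a simplified stand-in for the one in `CodeFormatter`), the regexp can live in a `val` on the enclosing object, so it is compiled once when the object is initialized rather than on every call:

```scala
import scala.util.matching.Regex

object FormatterSketch {
  // Compiled once, when the object is first initialized.
  // Simplified, illustrative pattern: /* block */ comments or // line comments.
  private val commentRegexp: Regex =
    ("""(/\*[\s\S]*?\*/)|""" +
     """(//[^\n]*\n)""").r

  // Called once per code block; reuses the precompiled regexp.
  def stripComments(input: String): String =
    commentRegexp.replaceAllIn(input, "")
}
```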

Credit should be given to Izek Greenfield.

## How was this patch tested?

existing UTs


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mgaido91/spark SPARK-25093

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22135.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22135


commit e39c855c8ea309c9ceccb775f0e2f7a1c2df3554
Author: Marco Gaido 
Date:   2018-08-17T15:04:58Z

[SPARK-25093][SQL] Avoid recompiling regexp for comments multiple times




---
