Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19811#discussion_r157415970
  
    --- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ---
    @@ -112,15 +112,14 @@ case class Like(left: Expression, right: Expression) 
extends StringRegexExpressi
       override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode = {
         val patternClass = classOf[Pattern].getName
         val escapeFunc = StringUtils.getClass.getName.stripSuffix("$") + 
".escapeLikeRegex"
    -    val pattern = ctx.freshName("pattern")
     
         if (right.foldable) {
           val rVal = right.eval()
           if (rVal != null) {
             val regexStr =
               
StringEscapeUtils.escapeJava(escape(rVal.asInstanceOf[UTF8String].toString()))
    -        ctx.addMutableState(patternClass, pattern,
    -          s"""$pattern = ${patternClass}.compile("$regexStr");""")
    +        val pattern = ctx.addMutableState(patternClass, "patternLike",
    +          v => s"""$v = ${patternClass}.compile("$regexStr");""", 
forceInline = true)
    --- End diff --
    
    my only concern is about point 2. I think it is a dangerous thing to do. 
What if we generate a lot of frequently used variable? I think it is safer at 
the moment to consider only 1 and 3 in the decision whether to inline or not. 
In the future, with a different codegen method, we might then define a 
threshold over which we generate an array for the given class, otherwise we use 
plain variables, which IMHO would be the best option but at the moment it is 
not feasible...


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to