Github user rednaxelafx commented on the issue:

    https://github.com/apache/spark/pull/20419
  
    @kiszk SGTM and LGTM. Let's ship it!
    
    One more question on the side: with the `forceComment = true`, are we fully 
sure that won't affect the equality of `CodeAndComment`?
    The whole point of this PR is to have a way to embed the ID into the 
non-executable code of the generated code so that it won't affect the codegen 
cache hit, right?
    
    Could you please post an example of either the generated code or the 
metrics, e.g. a `SortMergeJoin` on two identical `spark.range(3)`s, and confirm 
that even when the two `range()`s are codegen's into different 
`codegenStageId`s, with the `spark.sql.codegen.useIdInClassName` turned off, 
the two stages would still hit the codegen cache? Basically to verify the 
example I gave in 
https://github.com/apache/spark/pull/20224#issuecomment-357091842 still hits 
the codegen cache when `spark.sql.codegen.useIdInClassName=false`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to