GitHub user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20831#discussion_r175992410
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala ---
    @@ -169,7 +174,10 @@ case class InMemoryTableScanExec(
       override def outputOrdering: Seq[SortOrder] =
         relation.child.outputOrdering.map(updateAttribute(_).asInstanceOf[SortOrder])
     
    -  private def statsFor(a: Attribute) = relation.partitionStatistics.forAttribute(a)
    +  // When we make canonicalized plan, we can't find a normalized attribute in this map.
    +  // We return a `ColumnStatisticsSchema` for normalized attribute in this case.
    --- End diff --
    
    I don't get it. Regardless of how `copy` is implemented in Scala, ideally we
can just mark `buildFilter` and `partitionFilters` as lazy and, in
`doCanonicalize`, create a new `InMemoryTableScanExec`. That won't materialize
`partitionFilters` in either the current `InMemoryTableScanExec` or the new
one.
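
To illustrate the point about laziness, here is a minimal sketch. `ScanSketch`, its fields, and the `filterBuilds` counter are hypothetical stand-ins, not Spark's actual `InMemoryTableScanExec`. It only shows that a `lazy val` in a case class is not evaluated when a new instance is constructed, so a canonicalizing method can build a fresh instance without triggering the lookup that would fail for normalized attributes.

```scala
// Hypothetical stand-in for the real operator; all names are illustrative only.
object LazyCanonicalizeSketch {
  // Counts how many times the filter construction actually runs.
  var filterBuilds = 0

  final case class ScanSketch(attributes: Seq[String], stats: Map[String, Int]) {
    // Lazy: evaluated on first access, not at construction time.
    // Would throw NoSuchElementException for an attribute missing from `stats`.
    lazy val partitionFilters: Seq[Int] = {
      filterBuilds += 1
      attributes.map(stats)
    }

    // Builds a new instance with normalized attribute names.
    // It never reads `partitionFilters` on this instance or on the new one.
    def canonicalized: ScanSketch =
      ScanSketch(attributes.indices.map(i => s"none#$i"), stats)
  }

  def main(args: Array[String]): Unit = {
    val scan = ScanSketch(Seq("a", "b"), Map("a" -> 1, "b" -> 2))
    val canon = scan.canonicalized
    // Nothing has been materialized by constructing either instance.
    assert(filterBuilds == 0)
    // Accessing the lazy val on the original materializes it exactly once.
    assert(scan.partitionFilters == Seq(1, 2))
    assert(filterBuilds == 1)
    // `canon.partitionFilters` is deliberately never accessed: its normalized
    // names are not keys of `stats`, so the lookup would throw.
    assert(canon.attributes.length == 2)
  }
}
```

In this sketch the failure for normalized names only surfaces if something reads `partitionFilters` on the canonicalized instance, which mirrors the argument that the canonicalized plan should never need it.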
    
    One problem I can think of is serializing a canonicalized
`InMemoryTableScanExec`, but that should never happen.
    


