Github user CodingCat commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19864#discussion_r156445522

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala ---
    @@ -60,7 +62,8 @@ case class InMemoryRelation(
         @transient child: SparkPlan,
         tableName: Option[String])(
         @transient var _cachedColumnBuffers: RDD[CachedBatch] = null,
    -    val batchStats: LongAccumulator = child.sqlContext.sparkContext.longAccumulator)
    +    val batchStats: LongAccumulator = child.sqlContext.sparkContext.longAccumulator,
    +    statsOfPlanToCache: Option[Statistics] = None)
    --- End diff --

My two cents: I haven't looked at the code that makes this parameter influence the logic of equals and hashCode, but we may not want equals/hashCode to depend on it. In Spark SQL we usually compare plans by their string representation alone, not by the string representation plus stats info. For example, we try to reuse a cached plan based on the execution plan's string representation, not on that plus stats info.
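For reference, here is a minimal sketch of the Scala behaviour this concern hinges on. The compiler-generated equals and hashCode of a case class cover only the first parameter list, so a field declared in a second parameter list (as statsOfPlanToCache is in the diff above) does not take part in them unless something overrides that. `Relation`, `planString`, and `stats` are hypothetical stand-ins, not Spark's actual classes, and the sketch assumes no custom equals/hashCode is defined.

```scala
// Hypothetical stand-in for a plan node: `planString` plays the role of the
// compared plan, `stats` the role of an extra field in a second parameter list.
final case class Relation(planString: String)(val stats: Option[Long] = None)

object CurriedCaseClassEquality {
  def main(args: Array[String]): Unit = {
    val withStats = Relation("Scan t")(Some(10L))
    val withoutStats = Relation("Scan t")(None)

    // Generated equals/hashCode only look at the first parameter list.
    assert(withStats == withoutStats)
    assert(withStats.hashCode == withoutStats.hashCode)

    // The second-list field still differs between the two instances.
    assert(withStats.stats != withoutStats.stats)

    // A difference in the first parameter list does break equality.
    assert(withStats != Relation("Scan u")(Some(10L)))
  }
}
```

If the real class follows this pattern, adding the stats parameter to the second list would leave plan comparison unchanged; whether that holds for InMemoryRelation depends on code not shown in this diff.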