Github user viirya commented on the issue:

https://github.com/apache/spark/pull/19864

Are these initial statistics important? Once the columnar RDD is materialized, we will get accurate statistics anyway, won't we?

On Dec 3, 2017 1:43 AM, "Nan Zhu" <notificati...@github.com> wrote:

> *@CodingCat* commented on this pull request.
> ------------------------------
>
> In sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala
> <https://github.com/apache/spark/pull/19864#discussion_r154501939>:
>
> > -        planToCache,
> -        InMemoryRelation(
> -          sparkSession.sessionState.conf.useCompression,
> -          sparkSession.sessionState.conf.columnBatchSize,
> -          storageLevel,
> -          sparkSession.sessionState.executePlan(planToCache).executedPlan,
> -          tableName)))
> +      val inMemoryRelation = InMemoryRelation(
> +        sparkSession.sessionState.conf.useCompression,
> +        sparkSession.sessionState.conf.columnBatchSize,
> +        storageLevel,
> +        sparkSession.sessionState.executePlan(planToCache).executedPlan,
> +        tableName)
> +      if (planToCache.conf.cboEnabled && planToCache.stats.rowCount.isDefined) {
> +        inMemoryRelation.setStatsFromCachedPlan(planToCache)
> +      }
>
> I have to make InMemoryRelation stateful to avoid breaking APIs...
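To make the trade-off in this thread concrete, here is a minimal, Spark-free sketch of a relation that reports plan-derived statistics until its cache is materialized and measured statistics afterwards. It is not Spark's implementation: `Stats`, `CachedRelationSketch`, `recordBatch`, and the default size are hypothetical names introduced for illustration, and only `setStatsFromCachedPlan` echoes the method name in the quoted diff.

```scala
// Hypothetical model of the idea under discussion; none of these types are Spark's API.
final case class Stats(sizeInBytes: BigInt, rowCount: Option[BigInt] = None)

final class CachedRelationSketch(defaultSizeInBytes: BigInt) {
  // Estimate copied from the plan being cached, if one was provided.
  private var statsFromPlan: Option[Stats] = None
  // Bytes actually observed once cached batches have been built.
  private var materializedBytes: Long = 0L

  // Mirrors the name in the quoted diff; here it only stores the estimate.
  def setStatsFromCachedPlan(planStats: Stats): Unit =
    statsFromPlan = Some(planStats)

  // Assumed hook: called once per cached batch as the cache is materialized.
  def recordBatch(bytes: Long): Unit =
    materializedBytes += bytes

  // Prefer measured size; otherwise fall back to the plan estimate, then to the default.
  def stats: Stats =
    if (materializedBytes > 0L) Stats(BigInt(materializedBytes))
    else statsFromPlan.getOrElse(Stats(defaultSizeInBytes))
}

object CachedRelationSketchDemo {
  def main(args: Array[String]): Unit = {
    val relation = new CachedRelationSketch(defaultSizeInBytes = BigInt(Long.MaxValue))
    println(relation.stats) // before any estimate: the pessimistic default
    relation.setStatsFromCachedPlan(Stats(BigInt(1024), Some(BigInt(10))))
    println(relation.stats) // before materialization: the plan's estimate
    relation.recordBatch(4096L)
    println(relation.stats) // after materialization: the measured size
  }
}
```

In this sketch the initial statistics only matter in the window between caching a plan and materializing it, which is the window the question above is asking about.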