Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22905#discussion_r232486141

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala ---
    @@ -306,7 +306,15 @@ case class FileSourceScanExec(
           withOptPartitionCount
         }

    -    withSelectedBucketsCount
    +    val withOptColumnCount = relation.fileFormat match {
    +      case columnar: ColumnarFileFormat =>
    +        val sqlConf = relation.sparkSession.sessionState.conf
    +        val columnCount = columnar.columnCountForSchema(sqlConf, requiredSchema)
    +        withSelectedBucketsCount + ("ColumnCount" -> columnCount.toString)
    --- End diff --

    > Who wants that? If someone wants to put metadata somewhere in the physical plan, let them open a PR and make a case for it

    No, I don't think we should add it just because it was requested once. Those cases look like the same situation to me: I will have no argument if this one is added and other people later request that others be added too. We should make it clear why this one in particular deserves to be added; we're not going to put every piece of information into the metadata on request. If the purpose of adding it is to check whether pushdown is actually working, logging sounds more appropriate for that purpose.
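For readers following along, the pattern in the diff can be sketched with a self-contained toy example. This is hypothetical illustration, not Spark code: `FileFormat`, `ColumnarFormat`, and `metadataFor` are stand-ins for the real `FileFormat`/`ColumnarFileFormat` types and the metadata-map update the PR makes inside `FileSourceScanExec`.

```scala
// Toy sketch (hypothetical, not from Spark): a columnar format exposes a
// column count, which is attached to the scan node's string metadata map;
// non-columnar formats leave the map unchanged, mirroring the diff's match.
object ColumnCountExample {
  sealed trait FileFormat
  final case class ColumnarFormat(columns: Seq[String]) extends FileFormat
  case object RowFormat extends FileFormat

  def metadataFor(format: FileFormat, base: Map[String, String]): Map[String, String] =
    format match {
      case ColumnarFormat(cols) => base + ("ColumnCount" -> cols.size.toString)
      case RowFormat            => base
    }

  def main(args: Array[String]): Unit = {
    val md = metadataFor(ColumnarFormat(Seq("a", "b", "c")), Map("Format" -> "Parquet"))
    println(md("ColumnCount")) // prints 3
  }
}
```

The reviewer's point is that this map feeds the plan's `explain` output, so every key added here becomes user-visible surface area; a log line would verify pushdown without that cost.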
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org