[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...

HyukjinKwon Wed, 31 Oct 2018 21:00:01 -0700

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22905#discussion_r229933544
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala ---
    @@ -306,7 +306,15 @@ case class FileSourceScanExec(
           withOptPartitionCount
         }
     
    -    withSelectedBucketsCount
    +    val withOptColumnCount = relation.fileFormat match {
    +      case columnar: ColumnarFileFormat =>
    +        val sqlConf = relation.sparkSession.sessionState.conf
    +        val columnCount = columnar.columnCountForSchema(sqlConf, requiredSchema)
    +        withSelectedBucketsCount + ("ColumnCount" -> columnCount.toString)
    --- End diff --
    
    Is this something we really should include in the metadata? If the purpose
    of this is to check whether column pruning works, logging should be good
    enough. Adding a trait for it sounds like overkill at this stage. Let's not
    add an abstraction based only on a rough guess that it can be generalised.
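
For illustration only, here is a minimal sketch of the logging alternative suggested above. It is not code from the pull request or from the reviewer. `FieldType`, `Leaf`, `Struct`, `Field`, and `ColumnCountLogging` are hypothetical stand-ins for Spark's schema types so that the sketch runs without Spark, and it assumes that "column count" means the number of leaf fields in the required schema.

```scala
import java.util.logging.Logger

// Hypothetical stand-ins for Spark's schema types (not Spark API).
sealed trait FieldType
case object Leaf extends FieldType
final case class Struct(fields: Seq[Field]) extends FieldType
final case class Field(name: String, dataType: FieldType)

object ColumnCountLogging {
  private val log = Logger.getLogger("ColumnCountLogging")

  // Count leaf fields recursively; nested structs contribute their leaves.
  def leafColumnCount(fields: Seq[Field]): Int =
    fields.map { f =>
      f.dataType match {
        case Leaf             => 1
        case Struct(children) => leafColumnCount(children)
      }
    }.sum

  // Log the count at a debug-like level instead of adding it to scan metadata.
  def logColumnCount(requiredSchema: Seq[Field]): Int = {
    val count = leafColumnCount(requiredSchema)
    log.fine(s"Reading $count leaf column(s) for the required schema")
    count
  }
}
```

With a required schema of `id` plus a struct `name` holding `first` and `last`, `ColumnCountLogging.logColumnCount` would log and return 3. The point of the sketch is that the check lives in a log line and needs no new trait on the file format.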



---

