GitHub user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20521#discussion_r166413015
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala ---
    @@ -493,9 +510,23 @@ case class DataSource(
             dataSource.createRelation(
               sparkSession.sqlContext, mode, caseInsensitiveOptions, Dataset.ofRows(sparkSession, data))
           case format: FileFormat =>
    -        sparkSession.sessionState.executePlan(planForWritingFileFormat(format, mode, data)).toRdd
    +        val cmd = planForWritingFileFormat(format, mode, data)
    +        val resolvedPartCols = cmd.partitionColumns.map { col =>
    +          // The partition columns created in `planForWritingFileFormat` should always be
    +          // `UnresolvedAttribute` with a single name part.
    +          assert(col.isInstanceOf[UnresolvedAttribute])
    +          val unresolved = col.asInstanceOf[UnresolvedAttribute]
    +          assert(unresolved.nameParts.length == 1)
    +          val name = unresolved.nameParts.head
    +          outputColumns.find(a => equality(a.name, name)).getOrElse {
    +            throw new AnalysisException(
    +              s"Unable to resolve $name given [${data.output.map(_.name).mkString(", ")}]")
    +          }
    +        }
    +        val resolved = cmd.copy(partitionColumns = resolvedPartCols, outputColumns = outputColumns)
    --- End diff --
    
    > Why does the physical plan not match the command that is produced
    
    It matches! The only problem is that they are two different JVM objects. The UI keeps the physical plan object it was given and displays it. An alternative solution would be to swap the new physical plan into the UI, but that's hard to do with the current UI framework.
    
    If we run `sparkSession.sessionState.executePlan(planForWritingFileFormat(format, mode, data)).toRdd`, we execute the new physical plan, so no metrics are reported to the passed-in physical plan, and none are shown in the UI.
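
To make the object-identity point concrete, here is a minimal sketch in plain Scala with no Spark dependency. `PlanNode`, `PlanUi`, and `MetricsIdentityDemo` are hypothetical stand-ins invented for illustration, not Spark classes; the sketch only assumes that metrics are stored on the plan instance and that the UI reads them from the one instance it holds.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a physical plan node. Equality is structural
// (case class), but each instance carries its own mutable metrics map.
final case class PlanNode(description: String) {
  val metrics: mutable.Map[String, Long] = mutable.Map.empty

  // Pretend to run the plan and record a metric on *this* instance only.
  def execute(rows: Long): Unit =
    metrics("numOutputRows") = metrics.getOrElse("numOutputRows", 0L) + rows
}

// Hypothetical stand-in for the UI: it holds a reference to one plan object
// and can only display what was recorded on that object.
final class PlanUi(tracked: PlanNode) {
  def shownRows: Long = tracked.metrics.getOrElse("numOutputRows", 0L)
}

object MetricsIdentityDemo {
  // Returns (structurally equal?, same object?, rows shown after running the
  // re-planned node, rows shown after running the passed-in node).
  def run(): (Boolean, Boolean, Long, Long) = {
    val passedIn = PlanNode("WriteFiles")
    val ui = new PlanUi(passedIn)

    // Planning the same command again yields an equal but distinct object.
    val replanned = PlanNode("WriteFiles")
    replanned.execute(100L)
    val afterReplanned = ui.shownRows // metrics went to the other object

    passedIn.execute(100L)
    val afterPassedIn = ui.shownRows // metrics now land on the tracked object

    (passedIn == replanned, passedIn eq replanned, afterReplanned, afterPassedIn)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

In this sketch the two nodes compare equal with `==` but are not the same reference (`eq` is false), and the UI stand-in shows zero rows until the tracked instance itself is executed. That is the same situation described above: the plans match, but metrics follow the object that actually ran.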

