GitHub user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22514#discussion_r228780430
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
    @@ -45,6 +46,11 @@ case class CreateHiveTableAsSelectCommand(
     
      override def run(sparkSession: SparkSession, child: SparkPlan): Seq[Row] = {
    --- End diff ---
    
    Some more thoughts:
    
    `CreateHiveTableAsSelectCommand` just runs another command, so we don't get any metrics for this plan node. That's fine when we use the Hive writer, as we indeed can't get any metrics there (the writing is done by Hive). However, if we can convert to Spark's native writer, we do have metrics. I think a better fix is to replace Hive CTAS with data source CTAS during optimization, roughly as sketched below.
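    
    For illustration only, a minimal sketch of what such a rule could look like. The rule name and the `isConvertible` check are made up here; a real version would also translate the Hive storage format into data source options and respect the `spark.sql.hive.convertMetastoreParquet` / `convertMetastoreOrc` confs:
    
    ```scala
    import org.apache.spark.sql.catalyst.catalog.CatalogTable
    import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
    import org.apache.spark.sql.catalyst.rules.Rule
    import org.apache.spark.sql.execution.command.DDLUtils
    import org.apache.spark.sql.execution.datasources.CreateTable
    
    // Hypothetical rule: rewrite a Hive-serde CTAS into a data source CTAS so
    // planning picks Spark's native writer (which reports SQL metrics) instead
    // of CreateHiveTableAsSelectCommand.
    object RewriteHiveCtasAsDataSourceCtas extends Rule[LogicalPlan] {
    
      // Placeholder check: a real rule would verify the serde is one the
      // native writers support and consult the convertMetastore* confs.
      private def isConvertible(table: CatalogTable): Boolean = {
        val serde = table.storage.serde.getOrElse("").toLowerCase
        serde.contains("parquet") || serde.contains("orc")
      }
    
      override def apply(plan: LogicalPlan): LogicalPlan = plan transform {
        // A Hive CTAS arrives as a CreateTable node with a Hive provider and
        // an AS SELECT query attached.
        case CreateTable(tableDesc, mode, Some(query))
            if DDLUtils.isHiveTable(tableDesc) && isConvertible(tableDesc) =>
          // Swap the provider for the matching data source; simplified here,
          // since the storage properties would also need converting.
          val provider =
            if (tableDesc.storage.serde.exists(_.toLowerCase.contains("parquet"))) "parquet"
            else "orc"
          CreateTable(tableDesc.copy(provider = Some(provider)), mode, Some(query))
      }
    }
    ```
    
    Rewriting at the logical plan level means the planner then produces the data source write path, so the write metrics (rows/bytes written) show up on the write node without touching the Hive code path.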

