Ryan Blue created SPARK-20213:
---------------------------------

             Summary: DataFrameWriter operations do not show up in SQL tab
                 Key: SPARK-20213
                 URL: https://issues.apache.org/jira/browse/SPARK-20213
             Project: Spark
          Issue Type: Bug
          Components: SQL, Web UI
    Affects Versions: 2.1.0, 2.0.2
            Reporter: Ryan Blue


In 1.6.1, {{DataFrame}} writes started through {{DataFrameWriter}} actions 
such as {{insertInto}} showed up in the SQL tab. In 2.0.0 and later, they no 
longer do. The problem is that 2.0.0 and later no longer wrap execution with 
{{SQLExecution.withNewExecutionId}}, which emits 
{{SparkListenerSQLExecutionStart}}.

Here are the relevant parts of the stack traces:
{code:title=Spark 1.6.1}
org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
org.apache.spark.sql.execution.QueryExecution$$anonfun$toRdd$1.apply(QueryExecution.scala:56)
org.apache.spark.sql.execution.QueryExecution$$anonfun$toRdd$1.apply(QueryExecution.scala:56)
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:53)
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:56)
 => holding Monitor(org.apache.spark.sql.hive.HiveContext$QueryExecution@424773807})
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:196)
{code}

{code:title=Spark 2.0.0}
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
 => holding Monitor(org.apache.spark.sql.execution.QueryExecution@490977924})
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:301)
{code}

I think this was introduced by 
[54d23599|https://github.com/apache/spark/commit/54d23599]. The fix should be 
to wrap the execution in {{withNewExecutionId}} at 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala#L610
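
To make the mechanism concrete, here is a minimal Spark-free sketch of the pattern. It is a toy model, not Spark's code: {{ExecutionIdSketch}}, the {{events}} buffer, the event case classes, and the {{insertIntoUnwrapped}} / {{insertIntoWrapped}} helpers are all hypothetical names made up for illustration, and the {{withNewExecutionId}} here only imitates the behaviour described above (post a start event, run the body, post an end event). It shows why a write that calls {{toRdd}} directly leaves nothing for the SQL tab to display, while the same write wrapped in the helper does.

```scala
import scala.collection.mutable.ArrayBuffer

// Toy model only: none of these are Spark's real classes. The names mirror
// the ones in this report to show why the SQL tab loses the write.
object ExecutionIdSketch {
  sealed trait Event
  final case class ExecutionStart(id: Long, description: String) extends Event
  final case class ExecutionEnd(id: Long) extends Event

  // Hypothetical stand-in for the listener bus that the SQL tab reads from.
  val events = ArrayBuffer.empty[Event]
  private var nextId = 0L

  // Hypothetical stand-in for SQLExecution.withNewExecutionId: post a start
  // event, run the body, and post the end event even if the body throws.
  def withNewExecutionId[T](description: String)(body: => T): T = {
    val id = nextId
    nextId += 1
    events += ExecutionStart(id, description)
    try body finally events += ExecutionEnd(id)
  }

  // Stand-in for QueryExecution.toRdd: does the work, posts no events.
  def toRdd(): String = "rows written"

  // 2.0.0-style path: the write runs, but nothing is posted for the UI.
  def insertIntoUnwrapped(): String = toRdd()

  // Proposed-fix-style path: same work, wrapped so the events are posted.
  def insertIntoWrapped(): String = withNewExecutionId("insertInto")(toRdd())

  def main(args: Array[String]): Unit = {
    insertIntoUnwrapped()
    println(s"after unwrapped write: ${events.size} events")
    insertIntoWrapped()
    println(s"after wrapped write: ${events.size} events")
  }
}
```

In this model both paths return the same result; only the wrapped one leaves start and end events behind, which is the difference visible between the 1.6.1 and 2.0.0 stack traces above.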
