GitHub user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22353#discussion_r217165648
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlanInfo.scala ---
    @@ -59,6 +57,12 @@ private[execution] object SparkPlanInfo {
           new SQLMetricInfo(metric.name.getOrElse(key), metric.id, metric.metricType)
         }
     
    -    new SparkPlanInfo(plan.nodeName, plan.simpleString, children.map(fromSparkPlan), metrics)
    +    // dump the file scan metadata (e.g file path) to event log
    --- End diff --
    
    As a next step in the review, have you had a chance to test this in a real environment, at least with TPC-DS at 1 TB?
    In the worst case, this seems likely to increase the event log traffic dramatically. Could you provide a before-and-after comparison for this PR, @LantaoJin?

