[ 
https://issues.apache.org/jira/browse/SPARK-26327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16714908#comment-16714908
 ] 

ASF GitHub Bot commented on SPARK-26327:
----------------------------------------

xuanyuanking opened a new pull request #23277: [SPARK-26327][SQL] Metrics in 
FileSourceScanExec not update correctly
URL: https://github.com/apache/spark/pull/23277
 
 
   ## What changes were proposed in this pull request?
   
   As described in 
[SPARK-26327](https://issues.apache.org/jira/browse/SPARK-26327), 
this bug is caused by `postDriverMetricUpdates` being called in the wrong place. 
This PR fixes it by splitting the initialization of `selectedPartitions` from the 
metrics-updating logic. The updating logic is moved into the initialization of 
`inputRDD`, so it takes effect in both the code-generation node and the normal node.
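   
   The ordering problem and the fix can be illustrated with a small, Spark-free 
sketch. Everything below is a hypothetical stand-in rather than the actual patch: 
`Listener` loosely plays the role of `SQLAppStatusListener`, `ScanBefore` and 
`ScanAfter` model `FileSourceScanExec` before and after the change, and `replay` 
mimics the order in which the plan string is built and the scan is run. Only the 
"numFiles" metric is modelled.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the status listener: driver metric updates are
// kept only for executions that are already registered as live.
class Listener {
  private val liveExecutions = mutable.Set.empty[Long]
  val accepted = mutable.Map.empty[String, Long]

  def start(id: Long): Unit = { liveExecutions += id }

  def postDriverMetricUpdates(id: Long, metrics: Map[String, Long]): Unit = {
    // Updates for unknown executions are silently dropped.
    if (liveExecutions.contains(id)) { accepted ++= metrics }
  }
}

trait Scan {
  def metadata: String
  def inputRDD: Seq[String]
}

// Before: listing the files and posting the metric share one lazy val, so the
// metric is posted by whoever touches selectedPartitions first.
class ScanBefore(listener: Listener, id: Long, files: Seq[String]) extends Scan {
  lazy val selectedPartitions: Seq[String] = {
    listener.postDriverMetricUpdates(id, Map("numFiles" -> files.size.toLong))
    files
  }
  lazy val metadata: String = s"files=${selectedPartitions.size}"
  lazy val inputRDD: Seq[String] = selectedPartitions
}

// After: selectedPartitions only lists files; the metric is posted when
// inputRDD is initialized, i.e. when the scan actually runs.
class ScanAfter(listener: Listener, id: Long, files: Seq[String]) extends Scan {
  lazy val selectedPartitions: Seq[String] = files
  lazy val metadata: String = s"files=${selectedPartitions.size}"
  lazy val inputRDD: Seq[String] = {
    listener.postDriverMetricUpdates(
      id, Map("numFiles" -> selectedPartitions.size.toLong))
    selectedPartitions
  }
}

object MetricsSketch {
  // Replays the order described in the issue: metadata is read while the plan
  // string is built, the execution becomes live, and only then the scan runs.
  def replay(mkScan: (Listener, Long) => Scan): Map[String, Long] = {
    val listener = new Listener
    val executionId = 1L
    val scan = mkScan(listener, executionId)
    val planString = scan.metadata   // forces selectedPartitions too early
    listener.start(executionId)      // execution is registered as live
    val rdd = scan.inputRDD          // the scan itself executes
    listener.accepted.toMap
  }
}
```

   In this sketch, `MetricsSketch.replay` returns an empty map for `ScanBefore`, 
because the only update is posted before the execution is live, and a map 
containing "numFiles" for `ScanAfter`.
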
   ## How was this patch tested?
   
   New test case in `SQLMetricsSuite`.
   Manual test:
   
   |         | Before | After |
   |---------|:--------:|:-------:|
   | CodeGen |![image](https://user-images.githubusercontent.com/4833765/49741753-13c7e800-fcd2-11e8-97a8-8057b657aa3c.png)|![image](https://user-images.githubusercontent.com/4833765/49741774-1f1b1380-fcd2-11e8-98d9-78b950f4e43a.png)|
   | Normal  |![image](https://user-images.githubusercontent.com/4833765/49741836-378b2e00-fcd2-11e8-80c3-ab462a6a3184.png)|![image](https://user-images.githubusercontent.com/4833765/49741860-4a056780-fcd2-11e8-9ef1-863de217f183.png)|
   



> Metrics in FileSourceScanExec not update correctly
> --------------------------------------------------
>
>                 Key: SPARK-26327
>                 URL: https://issues.apache.org/jira/browse/SPARK-26327
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Yuanjian Li
>            Priority: Major
>
> In the current approach in `FileSourceScanExec`, the "numFiles" and 
> "metadataTime" (fileListingTime) metrics are updated when the lazy val 
> `selectedPartitions` is initialized. But `selectedPartitions` is first 
> initialized by `metadata`, which is called by 
> `queryExecution.toString` in `SQLExecution.withNewExecutionId`. So when 
> `SQLMetrics.postDriverMetricUpdates` is called, there is no corresponding 
> entry in liveExecutions in SQLAppStatusListener, and the metrics update has no effect.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

