[ 
https://issues.apache.org/jira/browse/SPARK-33266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noritaka Sekiyama updated SPARK-33266:
--------------------------------------
    Description: 
Sometimes we need to identify performance bottlenecks, for example, how long it 
took to read from a data store or how long it took to write to another data store.

It would be great if we could have total duration, read duration, and write 
duration as task-level metrics.

Currently, neither `InputMetrics` nor `OutputMetrics` appears to have 
duration-related metrics.

[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/InputMetrics.scala#L42-L58]

[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/OutputMetrics.scala#L41-L56]

 

On the other hand, other metrics classes such as `ShuffleWriteMetrics` do track 
write time. We might need similar metrics for input/output.

[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/ShuffleWriteMetrics.scala]
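
To make the request concrete, here is a minimal, language-agnostic sketch (written in Python, not Spark code) of how a read duration could be accumulated next to the record count, loosely modeled on the write time that `ShuffleWriteMetrics` tracks. The names `TimedInputMetrics`, `read_time_ns`, and `timed_read` are hypothetical and do not exist in Spark; the nanosecond unit and the injectable `clock` parameter are assumptions made for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


@dataclass
class TimedInputMetrics:
    """Hypothetical counterpart of InputMetrics with a duration field."""
    records_read: int = 0
    read_time_ns: int = 0  # hypothetical field, not part of Spark


def timed_read(
    source: Iterable[T],
    metrics: TimedInputMetrics,
    clock: Callable[[], int] = time.perf_counter_ns,
) -> Iterator[T]:
    """Yield records from `source`, charging only the time spent inside
    the underlying iterator (not the consumer's processing time) to
    `metrics.read_time_ns`."""
    it = iter(source)
    while True:
        start = clock()
        try:
            record = next(it)
        except StopIteration:
            # The final call that detects end-of-input also counts as read time.
            metrics.read_time_ns += clock() - start
            return
        metrics.read_time_ns += clock() - start
        metrics.records_read += 1
        yield record


if __name__ == "__main__":
    m = TimedInputMetrics()
    total = sum(timed_read(range(1000), m))
    print(total, m.records_read, m.read_time_ns)
```

Wrapping the record iterator this way separates time spent waiting on the data store from time spent in downstream computation, which is the distinction the issue asks for. A write-side counter could follow the same pattern around each write call.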

 

  was:
Sometimes we need to identify performance bottlenecks, for example, how long it 
took to read from data store, how long it took to write into another data store.

It would be great if we can have total duration, read duration, and write 
duration as task level metrics.

Currently it seems that both `InputMetrics` and `OutputMetrics` do not have 
duration related metrics.

[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/InputMetrics.scala#L42-L58]

[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/OutputMetrics.scala#L41-L56
]

On the other hand, other metrics such as `ShuffleWriteMetrics` has write time. 
We might need similar metrics for input/output.

[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/ShuffleWriteMetrics.scala]

 


> Add total duration, read duration, and write duration as task level metrics
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-33266
>                 URL: https://issues.apache.org/jira/browse/SPARK-33266
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.1
>            Reporter: Noritaka Sekiyama
>            Priority: Major
>
> Sometimes we need to identify performance bottlenecks, for example, how long 
> it took to read from a data store or how long it took to write to another data 
> store.
> It would be great if we could have total duration, read duration, and write 
> duration as task-level metrics.
> Currently, neither `InputMetrics` nor `OutputMetrics` appears to have 
> duration-related metrics.
> [https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/InputMetrics.scala#L42-L58]
> [https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/OutputMetrics.scala#L41-L56]
>  
> On the other hand, other metrics classes such as `ShuffleWriteMetrics` do 
> track write time. We might need similar metrics for input/output.
> [https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/ShuffleWriteMetrics.scala]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

