[ 
https://issues.apache.org/jira/browse/FLINK-30450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17649716#comment-17649716
 ] 

Steve Loughran commented on FLINK-30450:
----------------------------------------

If you can grab the IOStatistics from an IOStatisticsSource implementation, 
the s3a, gcs and abfs FS instances collect detailed stats that you can snapshot 
and then marshal as JSON or serialized Java objects. This also works for other 
classes (input streams, output streams, list iterators, ...). That stuff is 
there to play with.

Otherwise, hadoop-3.3.2+ optionally annotates the HTTP Referer header on all S3 
requests so you can apportion blame from the S3 server logs, including looking 
for 503 throttle events. That avoids code changes up the stack and gives a view 
of the entire cluster.
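
A rough sketch of the log-side approach, in case it helps: count 503 responses 
in S3 server access log records and group them by one query parameter of the 
referrer. Everything here is illustrative. The class name ThrottleBlame, the 
field positions (status as the 10th field, referrer as the 16th) and the query 
parameter names ("ji", "op") are assumptions about the log and referrer layout, 
and the sample records are made up; check them against your own logs first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThrottleBlame {

    // A token is [bracketed], "quoted", or a run of non-space characters.
    private static final Pattern TOKEN =
            Pattern.compile("\\[([^\\]]*)\\]|\"([^\"]*)\"|(\\S+)");

    // Assumed zero-based field positions in one access log record.
    static final int STATUS_FIELD = 9;
    static final int REFERRER_FIELD = 15;

    // Split one log record into fields, dropping the brackets and quotes.
    static List<String> tokenize(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            if (m.group(1) != null) {
                out.add(m.group(1));
            } else if (m.group(2) != null) {
                out.add(m.group(2));
            } else {
                out.add(m.group(3));
            }
        }
        return out;
    }

    // Return the value of a query parameter in the referrer, or "unknown".
    static String queryParam(String referrer, String name) {
        int q = referrer.indexOf('?');
        if (q < 0) {
            return "unknown";
        }
        for (String pair : referrer.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals(name)) {
                return pair.substring(eq + 1);
            }
        }
        return "unknown";
    }

    // Count records with status 503, keyed by the chosen referrer parameter.
    static Map<String, Integer> count503By(List<String> lines, String param) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            List<String> fields = tokenize(line);
            if (fields.size() <= REFERRER_FIELD) {
                continue; // short or malformed record
            }
            if (!"503".equals(fields.get(STATUS_FIELD))) {
                continue;
            }
            counts.merge(queryParam(fields.get(REFERRER_FIELD), param), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up records, for illustration only.
        String prefix = "owner bucket [06/Feb/2023:00:00:38 +0000] 10.0.0.1 user REQ1 "
                + "REST.PUT.OBJECT chk/1 \"PUT /chk/1 HTTP/1.1\" ";
        List<String> lines = List.of(
                prefix + "503 SlowDown 0 - 5 - \"https://audit.example.org/x/?op=op_create&ji=job-a\" \"agent\" -",
                prefix + "200 - 0 - 5 - \"https://audit.example.org/x/?op=op_create&ji=job-b\" \"agent\" -");
        System.out.println(count503By(lines, "ji"));
    }
}
```

Grouping by a different parameter is just a matter of passing another name to 
count503By; records without a parsable referrer end up under "unknown".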

> FileSystem supports exporting client-side metrics
> -------------------------------------------------
>
>                 Key: FLINK-30450
>                 URL: https://issues.apache.org/jira/browse/FLINK-30450
>             Project: Flink
>          Issue Type: New Feature
>          Components: FileSystems
>            Reporter: Hangxiang Yu
>            Priority: Major
>
> Client-side metrics, or job-level metrics for the filesystem, could help us 
> monitor the filesystem more precisely.
> Some metrics (such as request rate, throughput, latency, retry count, etc.) 
> are useful for monitoring network or client problems in checkpointing or 
> other access cases for a job.
> Some filesystems, such as s3, s3-presto and gs, already support enabling some 
> metrics; these could be exported by the filesystem.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
