[ https://issues.apache.org/jira/browse/FLINK-30450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17649716#comment-17649716 ]
Steve Loughran commented on FLINK-30450:
----------------------------------------

If you can grab the IOStatistics from an IOStatisticsSource implementation, the s3a, gcs and abfs FS instances collect detailed stats that you can snapshot and then marshal as JSON or serialized Java objects. This also works for other classes (input streams, output streams, list iterators...). That stuff is there to play with.

Otherwise, hadoop-3.3.2+ optionally annotates the HTTP Referer header on all S3 requests, so you can apportion blame from the S3 server logs, including looking for 503 throttle events. This avoids code changes up the stack and gives a view of the entire cluster.

> FileSystem supports exporting client-side metrics
> -------------------------------------------------
>
> Key: FLINK-30450
> URL: https://issues.apache.org/jira/browse/FLINK-30450
> Project: Flink
> Issue Type: New Feature
> Components: FileSystems
> Reporter: Hangxiang Yu
> Priority: Major
>
> Client-side metrics, or job-level metrics for the filesystem, could help us
> monitor the filesystem more precisely.
> Some metrics (like request rate, throughput, latency, retry count, etc.) are
> useful for monitoring network or client problems during checkpointing or
> other access cases for a job.
> Some filesystems like s3, s3-presto and gs already support enabling some
> metrics; these could be exported in the filesystem.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)