[ https://issues.apache.org/jira/browse/HADOOP-18190?focusedWorklogId=793290&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793290 ]
ASF GitHub Bot logged work on HADOOP-18190:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Jul/22 14:57
            Start Date: 20/Jul/22 14:57
    Worklog Time Spent: 10m
      Work Description: steveloughran commented on PR #4458:
URL: https://github.com/apache/hadoop/pull/4458#issuecomment-1190394031

   Look at `org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding#trackFunctionDuration` for an example of failure handling: the failed() call switches to the same stats key with the suffix .failures and updates those counters and durations.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 793290)
    Time Spent: 2h 20m  (was: 2h 10m)

> s3a prefetching streams to collect iostats on prefetching operations
> --------------------------------------------------------------------
>
>                 Key: HADOOP-18190
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18190
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Assignee: Ahmar Suhail
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> There is a lot more happening in reads, so there is a lot more data to collect and publish in IO stats, both to view in a summary at the end of a process and to get from the stream while it is active.
>
> Some useful ones would seem to be:
>
> counters
> * is in memory: using 0 or 1 here lets aggregated reports count the total number of memory-cached files
> * prefetching operations executed
> * errors during prefetching
>
> gauges
> * number of blocks in cache
> * total size of blocks
> * active prefetches
> * active memory used
>
> duration tracking (count/min/max/average)
> * time to fetch a block
> * time queued before the actual fetch begins
> * time a reader is blocked waiting for a block fetch to complete
>
> and some info on cache use itself:
> * number of blocks discarded unread
> * number of prefetched blocks later used
> * number of backward seeks to a prefetched block
> * number of forward seeks to a prefetched block
>
> The key ones I care about are:
> # memory consumption
> # can we determine if the cache is working (reads with cache hits) and when it is not (misses, wasted prefetches)
> # time blocked on executors
>
> The stats need to be accessible on a stream even when closed, and aggregated into the FS. Once we get per-thread stats contexts we can publish there too, and collect in worker threads for reporting in task commits.
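As a rough illustration of the failure-handling pattern described in the comment above: a DurationTracker obtained from an IOStatisticsStore records against one statistic key, and calling failed() before close() makes it update the parallel "<key>.failures" counters and durations instead. The statistic key names and the fetchBlock() helper below are hypothetical placeholders, not the actual S3A stream statistic names; this is a minimal sketch of the store/tracker pattern, not the exact code path the comment points at.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.hadoop.fs.statistics.DurationTracker;
import org.apache.hadoop.fs.statistics.IOStatisticsLogging;
import org.apache.hadoop.fs.statistics.impl.IOStatisticsStore;

import static org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.iostatisticsStore;

public class PrefetchStatsSketch {

  // Hypothetical statistic names; real keys would be defined centrally (e.g. in StreamStatisticNames).
  static final String PREFETCH_OPS = "stream_read_prefetch_operations";
  static final String ACTIVE_PREFETCHES = "stream_read_active_prefetches";

  public static void main(String[] args) {
    // Build a store with a counter, a gauge, and duration tracking for the prefetch key.
    // Duration tracking maintains the parallel "<key>.failures" statistics as well.
    IOStatisticsStore stats = iostatisticsStore()
        .withCounters(PREFETCH_OPS)
        .withGauges(ACTIVE_PREFETCHES)
        .withDurationTracking(PREFETCH_OPS)
        .build();

    stats.incrementGauge(ACTIVE_PREFETCHES, 1);
    DurationTracker tracker = stats.trackDuration(PREFETCH_OPS);
    try {
      fetchBlock();                // stand-in for prefetching one block
    } catch (IOException e) {
      tracker.failed();            // switch the update to the ".failures" variant of the key
      throw new UncheckedIOException(e);
    } finally {
      tracker.close();             // record the elapsed time against the chosen key
      stats.incrementGauge(ACTIVE_PREFETCHES, -1);
    }

    // Dump the collected statistics, e.g. for an end-of-process summary.
    System.out.println(IOStatisticsLogging.ioStatisticsToPrettyString(stats));
  }

  // Placeholder for the real block fetch.
  private static void fetchBlock() throws IOException {
  }
}
```

Aggregating such a store into the filesystem's statistics, and snapshotting it when the stream closes, would keep the values accessible after close and available for the summary reporting the description asks for.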