Steve Loughran created HADOOP-17553:
---------------------------------------

             Summary: FileSystem.close() to optionally log IOStats; save to local dir
                 Key: HADOOP-17553
                 URL: https://issues.apache.org/jira/browse/HADOOP-17553
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs, fs/azure, fs/s3
    Affects Versions: 3.3.1
            Reporter: Steve Loughran


We could save the IOStats to a local temp dir as JSON (the snapshot is designed 
to be serializable, even has a test), with a unique name 
(iostats-stevel-s3a-bucket1-timestamp-random#.json ... etc). 
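A minimal sketch of how such a name could be generated, assuming the pattern is iostats-<user>-<scheme>-<bucket>-<timestamp>-<random>.json. The function name, the millisecond timestamp and the 8-character hex suffix are all hypothetical choices made for illustration; none of them are part of Hadoop.

```python
import secrets
import time


def iostats_filename(user, scheme, bucket, now=None):
    """Build a unique name for a saved IOStats snapshot (hypothetical helper).

    `now` is seconds since the epoch; it defaults to the current time and is
    only a parameter so that the result can be made predictable in tests.
    """
    # Millisecond timestamp: an assumption, the issue only says "timestamp".
    timestamp = int((time.time() if now is None else now) * 1000)
    # Random suffix so two filesystems closed in the same millisecond differ.
    suffix = secrets.token_hex(4)
    return f"iostats-{user}-{scheme}-{bucket}-{timestamp}-{suffix}.json"
```

Bucket names may themselves contain hyphens, so a name built this way is easy to match with a glob but not safe to split back into fields; anything that must be recovered later probably belongs inside the JSON rather than in the filename.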

We can collect these (Rajesh can, anyway), and then
* look for load on a specific bucket
* look at what happened at a specific time

The best bit: the IOStatisticsSnapshot aggregates counters, min/max/mean, so 
you could merge iostats-*-s3a-bucket1-*.json to get the IOStats of all 
principals working with a given bucket.

This will be local and therefore low cost: low enough that we could turn it on 
in production. All that's needed is collection of the stats from the local hosts 
(or they write to a shared mounted volume).

We will need some "hadoop iostats merge" command to take multiple files and 
merge them all together, then print to screen or save to a new file. This should 
be straightforward, as all the load and merge code is already present.
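The merge semantics can be sketched outside Hadoop as follows. This is an illustration only, not the proposed CLI: it assumes each saved file is a JSON object with "counters", "minimums", "maximums" and "meanstatistics" maps, and that each mean is stored as a "samples" count plus a "sum". Those field names are assumptions about the serialized layout, and the function names are hypothetical. Gauges and any sentinel values for "unset" statistics are ignored.

```python
import glob
import json


def merge_snapshots(snapshots):
    """Merge snapshot dicts: add counters, keep the lowest minimum and the
    highest maximum, and combine means by adding samples and sums."""
    merged = {"counters": {}, "minimums": {}, "maximums": {},
              "meanstatistics": {}}
    for snap in snapshots:
        for key, value in snap.get("counters", {}).items():
            merged["counters"][key] = merged["counters"].get(key, 0) + value
        for key, value in snap.get("minimums", {}).items():
            current = merged["minimums"].get(key)
            merged["minimums"][key] = (
                value if current is None else min(current, value))
        for key, value in snap.get("maximums", {}).items():
            current = merged["maximums"].get(key)
            merged["maximums"][key] = (
                value if current is None else max(current, value))
        for key, mean in snap.get("meanstatistics", {}).items():
            # Keeping samples and sum separate is what makes means mergeable:
            # the combined mean is total sum / total samples.
            total = merged["meanstatistics"].setdefault(
                key, {"samples": 0, "sum": 0})
            total["samples"] += mean["samples"]
            total["sum"] += mean["sum"]
    return merged


def merge_files(pattern):
    """Load every JSON file matching the glob pattern and merge them."""
    snapshots = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as f:
            snapshots.append(json.load(f))
    return merge_snapshots(snapshots)
```

With files named as above, merge_files("iostats-*-s3a-bucket1-*.json") would give the combined statistics of every principal that worked with bucket1 on the hosts whose files were collected.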


Needs
* logging in FS.close
* new iostats CLI + docs, tests
* extend IOStatisticsSnapshot with list of <string, string> options for use in 
annotating saved logs (hostname, principal, jobID, ...). Don't know how to 
merge these.

If we are going to add a new context map to the IOStatisticsSnapshot then we 
MUST update it before 3.3.1 ships so as to avoid breaking the serialization 
format on the next release, especially the Java one.



