I'm interested in the ability to track metrics in Hadoop by user (such as 
CPU time and storage used, per machine and across the cluster). I've taken 
a look at the Fair and Capacity Schedulers, and they seem oriented towards 
ensuring fair use between users' jobs rather than providing a feature 
that also reports what resources the users actually used on the cluster. 
Likewise, other tools like Ganglia appear to be concerned with reporting 
metrics by machine (and not by job). I've also taken a look through the 
common/metrics tickets in JIRA, and there does not seem to be any open 
work that addresses this requirement. 

Have I missed something? Has anyone been able to do this? Is there a way 
to capture metrics by job (which could be correlated back to a user)? If 
not, is there any current or planned work in the project that addresses 
this requirement? 

Kind regards
Steve Watt
