Auditing and accounting with Hadoop

2009-01-07 Thread Brian Bockelman
Hey, one of our charges is to do auditing and accounting with our file systems (we use the simplifying assumption that the users are non-malicious). Auditing can be done by going through the namenode logs and using the UGI information to trace opens/reads/writes back to the users.
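
As a rough illustration of that kind of log pass, here is a minimal sketch that tallies operations per user. It assumes each audit line carries whitespace-separated key=value fields, including ugi=user,group,... and cmd=operation; the exact layout depends on the Hadoop version and log4j configuration, so check it against your own logs first. The class name AuditTally and the sample lines are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AuditTally {
    // Assumed layout: key=value fields separated by whitespace (e.g. tabs).
    private static final Pattern UGI = Pattern.compile("\\bugi=(\\S+)");
    private static final Pattern CMD = Pattern.compile("\\bcmd=(\\S+)");

    /** Counts operations per "user:cmd" key; lines missing either field are skipped. */
    public static Map<String, Integer> tally(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher u = UGI.matcher(line);
            Matcher c = CMD.matcher(line);
            if (!u.find() || !c.find()) {
                continue; // not an audit line in the assumed format
            }
            // Assumption: the UGI value is "user,group1,group2,..."; keep only the user.
            String user = u.group(1).split(",")[0];
            counts.merge(user + ":" + c.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, not taken from a real cluster.
        List<String> sample = Arrays.asList(
            "2009-01-07 12:00:00,000 INFO audit: ugi=alice,users\tip=/10.0.0.1\tcmd=open\tsrc=/data/a",
            "2009-01-07 12:00:01,000 INFO audit: ugi=alice,users\tip=/10.0.0.1\tcmd=open\tsrc=/data/b",
            "2009-01-07 12:00:02,000 INFO audit: ugi=bob,staff\tip=/10.0.0.2\tcmd=create\tsrc=/data/c");
        System.out.println(tally(sample));
    }
}
```

Keying on user plus command keeps the sketch small; a real accounting pass would probably also want the src path and a timestamp bucket, and would read lines from the log files rather than a list.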

Re: Auditing and accounting with Hadoop

2009-01-07 Thread Doug Cutting
The notion of a client/task ID, independent of IP or username, seems useful for log analysis. DFS's client ID is probably your best bet for now, but we might improve its implementation and make the notion more generic. It is currently implemented as: String taskId = conf.get("mapred.ta