Hi All,

I have a query on maintaining Hadoop MapReduce logs. By default the logs appear on the respective TaskTracker nodes, which you can easily drill down to from the JobTracker web UI whenever a failure occurs (which is what I have been doing so far). Now I need to go a level further and manage the logs corresponding to individual jobs.

In my logs I'm dumping some key business parameters that could be used for business-level debugging/analysis later on if required. For this purpose I need a central log file per job, not many files (one per TaskTracker), because as the cluster grows the number of log files corresponding to a job grows as well. A single point of reference makes analysis much handier for the business folks.

I think managing and archiving the logs of each job execution would be a generic requirement of any enterprise application, so there are surely best practices and standards identified and maintained by most of the core Hadoop enterprise users. Could you please help me out by sharing some of the better options for managing Hadoop MapReduce logs? It would greatly help me choose the practice that best suits my environment and application needs.
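To make the requirement a bit more concrete, here is a rough sketch of the kind of workaround I have been considering in the meantime (the class names, the "audit" named output, and the sample record are just placeholders, not my actual code). The idea is to route the business-level records through MultipleOutputs into HDFS under the job's output directory, instead of letting them land in the TaskTracker userlogs:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class BusinessLogJob {

    public static class AuditMapper
            extends Mapper<LongWritable, Text, Text, Text> {

        private MultipleOutputs<Text, Text> mos;

        @Override
        protected void setup(Context context) {
            mos = new MultipleOutputs<Text, Text>(context);
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // ... normal business processing and context.write(...) here ...

            // Send the business-level record to the "audit" named output.
            // The extra base path "auditlogs/part" keeps these files in their
            // own subdirectory under the job's output directory.
            mos.write("audit", NullWritable.get(),
                    new Text("order-id=123, status=VALIDATED"),
                    "auditlogs/part");
        }

        @Override
        protected void cleanup(Context context)
                throws IOException, InterruptedException {
            mos.close();
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "business-log-job");
        job.setJarByClass(BusinessLogJob.class);
        job.setMapperClass(AuditMapper.class);
        job.setNumReduceTasks(0);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Register the "audit" named output; all business log records for
        // this job end up under a single HDFS location per job.
        MultipleOutputs.addNamedOutput(job, "audit",
                TextOutputFormat.class, NullWritable.class, Text.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

After the job finishes, something like "hadoop fs -getmerge <output dir>/auditlogs job123-business.log" would collapse the per-task audit files into one local file per job for the business folks. But this feels like I'm reinventing the wheel, hence my question about what others do in practice.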
Thank You
Regards
Bejoy K S