Hi Guys,

We have a fairly decent-sized Hadoop cluster of about 200 nodes, and I was wondering what the state of the art is for aggregating and visualizing Hadoop ecosystem logs, particularly:
1. TaskTracker logs
2. DataNode logs
3. HBase RegionServer logs

One way is to run something like Flume on each node to aggregate the logs and then use something like Kibana (http://www.elasticsearch.org/overview/kibana/) to visualize them and make them searchable. However, I don't want to write another ETL for the Hadoop/HBase logs themselves.

We currently log in to each machine individually and run 'tail -F logs' when there is a Hadoop problem on a particular node. We want a better, centralized way to look at the Hadoop logs when there is an issue, without having to log in to 100 different machines, and I was wondering what the state of the art is in this regard.

Suggestions/pointers are very welcome!

Sagar