Hi Guys,

We have a fairly decent-sized Hadoop cluster of about 200 nodes, and I was
wondering what the state of the art is if I want to aggregate and visualize
Hadoop ecosystem logs, particularly:

   1. TaskTracker logs
   2. DataNode logs
   3. HBase RegionServer logs

One way is to run something like Flume on each node to aggregate the logs
and then use something like Kibana -
http://www.elasticsearch.org/overview/kibana/ - to visualize the logs and
make them searchable.
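
For the record, the per-node Flume piece I have in mind would be roughly the
config below (just an untested sketch; the log path, ElasticSearch host, and
index name are placeholders, and the ElasticSearch sink needs Flume 1.4+):

    # Per-node Flume agent: tail one Hadoop log and ship it to ElasticSearch.
    agent.sources  = taillog
    agent.channels = mem
    agent.sinks    = es

    agent.sources.taillog.type     = exec
    agent.sources.taillog.command  = tail -F /var/log/hadoop/hadoop-hdfs-datanode.log
    agent.sources.taillog.channels = mem

    agent.channels.mem.type     = memory
    agent.channels.mem.capacity = 10000

    agent.sinks.es.type      = org.apache.flume.sink.elasticsearch.ElasticSearchSink
    agent.sinks.es.hostNames = es-host:9300
    agent.sinks.es.indexName = hadoop-logs
    agent.sinks.es.channel   = mem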

However, I don't want to write another ETL pipeline just for the Hadoop/HBase
logs themselves. We currently log in to each machine individually and
'tail -F' the logs when there is a Hadoop problem on a particular node.
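
As a stop-gap I have been thinking about scripting that, roughly along these
lines (untested sketch in Python; the hostnames and log path are placeholders,
and it assumes passwordless ssh to the nodes):

    #!/usr/bin/env python
    # Stop-gap: fan in 'tail -F' output from many nodes over ssh and prefix
    # each line with the host it came from, so one terminal shows them all.
    import subprocess
    import sys
    import threading

    NODES = ["node001", "node002", "node003"]   # placeholder hostnames
    LOG_GLOB = "/var/log/hadoop/*.log"          # placeholder log path

    def stream(node):
        """ssh to one node, tail its logs, and copy lines to stdout."""
        cmd = ["ssh", node, "tail -F %s" % LOG_GLOB]
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        for line in iter(proc.stdout.readline, b""):
            sys.stdout.write("%s: %s" % (node, line.decode("utf-8", "replace")))
            sys.stdout.flush()

    threads = [threading.Thread(target=stream, args=(n,)) for n in NODES]
    for t in threads:
        t.daemon = True
        t.start()
    for t in threads:
        t.join()

That works for a handful of nodes, but it clearly doesn't scale to 200, which
is why I'm asking.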

We want a better way to look at the Hadoop logs themselves in a centralized
way when there is an issue, without having to log in to 100 different
machines, and I was wondering what the state of the art is in this regard.

Suggestions/Pointers are very welcome!!

Sagar
