There are plenty of log aggregation tools, both open source and commercial off the shelf. Here are some: http://devopsangle.com/2012/04/19/8-splunk-alternatives/
My personal recommendation is LogStash.

On Thu, Oct 10, 2013 at 10:38 PM, Raymond Tay <raymondtay1...@gmail.com> wrote:

> You can try Chukwa, which is one of the incubating projects under Apache.
> I tried it before and liked it for aggregating logs.
>
> On 11 Oct, 2013, at 1:36 PM, Sagar Mehta <sagarme...@gmail.com> wrote:
>
> Hi Guys,
>
> We have a fairly decent-sized Hadoop cluster of about 200 nodes and were
> wondering what the state of the art is for aggregating and visualizing
> Hadoop ecosystem logs, particularly
>
> 1. Tasktracker logs
> 2. Datanode logs
> 3. HBase RegionServer logs
>
> One way is to use something like Flume on each node to aggregate the
> logs and then use something like Kibana
> (http://www.elasticsearch.org/overview/kibana/) to visualize the logs and
> make them searchable.
>
> However, I don't want to write another ETL for the Hadoop/HBase logs
> themselves. We currently log in to each machine individually to 'tail -F
> logs' when there is a Hadoop problem on a particular node.
>
> We want a better, centralized way to look at the Hadoop logs themselves
> when there is an issue, without having to log in to 100 different
> machines, and were wondering what the state of the art is in this
> regard.
>
> Suggestions/Pointers are very welcome!!
>
> Sagar
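
On the "another ETL" worry: whichever shipper ends up on the nodes (LogStash, Flume, Chukwa), the per-line parsing step is small. Below is a minimal Python sketch, not a definitive implementation. It assumes the daemons write a log4j layout of the form `%d{ISO8601} %p %c: %m%n`; check the log4j.properties on your cluster, since the pattern there may differ. The names `LINE_RE` and `parse_line`, the host name, and the sample message are made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout (verify against your log4j.properties):
#   %d{ISO8601} %p %c: %m%n
# which yields lines such as:
#   2013-10-10 22:38:01,123 INFO org.apache.hadoop.mapred.TaskTracker: <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) +"
    r"(?P<logger>[^\s:]+): "
    r"(?P<message>.*)$"
)


def parse_line(line, host):
    """Turn one log line into a dict of searchable fields.

    Returns None when the line does not match the assumed layout,
    for example a stack-trace continuation line.
    """
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    event = m.groupdict()
    # %f accepts the three millisecond digits and scales them to microseconds.
    event["ts"] = datetime.strptime(event["ts"], "%Y-%m-%d %H:%M:%S,%f")
    # Tag each event with its origin so one search covers all nodes.
    event["host"] = host
    return event


if __name__ == "__main__":
    # Hypothetical sample line and host name.
    sample = ("2013-10-10 22:38:01,123 INFO "
              "org.apache.hadoop.mapred.TaskTracker: sample message\n")
    print(parse_line(sample, "node17"))
```

Tagging each event with the host it came from is what lets a single search over the central store replace logging in to each machine. Lines that return None are usually continuations of the previous event, such as stack traces, and would need to be appended to it rather than dropped.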