Re: What do you do with task logs?

2008-11-18 Thread Edward Capriolo
We just set up a log4j server. This takes the logs off the cluster,
plus you get all the benefits of log4j.

http://timarcher.com/?q=node/10
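
For reference, a minimal sketch of what the client side of such a setup
might look like, using log4j's SocketAppender (the hostname, port, and
appender name below are placeholders, and the exact properties file to
edit depends on your Hadoop version):

    # Additions to the cluster's log4j.properties (sketch)
    log4j.rootLogger=INFO, SOCKET
    log4j.appender.SOCKET=org.apache.log4j.net.SocketAppender
    log4j.appender.SOCKET.RemoteHost=loghost.example.com
    log4j.appender.SOCKET.Port=4560
    log4j.appender.SOCKET.ReconnectionDelay=10000

On the receiving host, log4j's bundled SimpleSocketServer can listen on
that port and log everything according to its own config (jar path and
file names here are placeholders):

    java -cp log4j.jar org.apache.log4j.net.SimpleSocketServer 4560 server-log4j.properties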


Re: What do you do with task logs?

2008-11-18 Thread Alex Loddengaard
You could take a look at Chukwa, which essentially collects and drops your
logs to HDFS:
http://wiki.apache.org/hadoop/Chukwa

The last time I tried to play with Chukwa, it wasn't in a state to be played
with yet.  If that's still the case, then you can use Scribe to collect all
of your logs in a single place, and then create a quick Python script to
persist these logs to HDFS.  Learn more about Scribe here:


http://www.cloudera.com/blog/2008/11/02/configuring-and-using-scribe-for-hadoop-log-collection/
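
If it helps, a rough sketch of what that quick persist-to-HDFS script
might look like (the Scribe spool directory and HDFS target below are
just placeholders, and it assumes the hadoop command is on the PATH):

    #!/usr/bin/env python
    # Sketch: push Scribe-collected Hadoop logs into HDFS.
    # SPOOL_DIR and HDFS_DIR are assumptions, not real defaults.
    import os
    import subprocess

    SPOOL_DIR = "/var/log/scribe/hadoop"   # where Scribe drops finished log files
    HDFS_DIR = "/logs/hadoop"              # target directory in HDFS

    for name in sorted(os.listdir(SPOOL_DIR)):
        local_path = os.path.join(SPOOL_DIR, name)
        if not os.path.isfile(local_path):
            continue
        # Copy the file into HDFS; -copyFromLocal fails if the target already exists.
        rc = subprocess.call(["hadoop", "dfs", "-copyFromLocal",
                              local_path, HDFS_DIR + "/" + name])
        if rc == 0:
            # Only delete the local copy after a successful upload.
            os.remove(local_path)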


Alex

On Tue, Nov 18, 2008 at 2:37 PM, Nathan Marz [EMAIL PROTECTED] wrote:

 We find that after about 400 to 500 jobs run in succession on our Hadoop
 cluster, the disk space on each machine is quickly used up by logs for all
 the tasks. What do people do to manage these logs? Does Hadoop have anything
 built in for managing them? Or do we have to delete/move the logs with a
 home-cooked method?

 Thanks,
 Nathan Marz
 Rapleaf