Re: How best to collect userlogs (in a streaming world)

Mayuran Yogarajah Mon, 28 Sep 2009 14:36:06 -0700

Dan Milstein wrote:

Hadoop-folk,


How have people gone about collecting debug/error log information from
streaming jobs, in Hadoop?

I'm clear that, if I write to stderr (and it's not a counter/status
line), then it goes onto the node's local disk, in:

  /var/log/hadoop/userlogs/<task attempt>/stderr

However, I'd really like to collect those in some central location,
for processing.  Possibly via splunk (which we use right now),
possibly some other means.

  - Do people write a custom log4j appender?  (does log4j even control
writes to that stderr file?  I can't tell -- it somewhat looks like no)

  - Or, maybe write cron jobs that run on the slaves and periodically
push logs somewhere?

  - Are people outside of Facebook using scribe?

Any ideas / experiences appreciated.

Thanks,
-Dan Milstein

We use remote syslog for this. All warning/error messages get forwarded
to a central log server. This server writes these messages to a named
pipe. A separate script reads from the named pipe and emails the errors
to the admin.
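
For anyone wanting to try the same setup, here is a rough sketch of what the reader side could look like. This is not our actual script. The pipe path, the addresses, the WARN/ERROR keyword match and the batch size are all made-up placeholders, and it assumes the syslog daemon on the central server is already configured to write to the pipe and that an SMTP server listens on localhost.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical values for illustration; none of these come from the thread.
PIPE_PATH = "/var/run/hadoop-errors.fifo"
ADMIN = "admin@example.com"
SENDER = "hadoop-logs@example.com"
KEYWORDS = ("WARN", "ERROR")


def is_alert(line):
    """Return True if the line contains one of the alert keywords."""
    return any(word in line for word in KEYWORDS)


def read_alerts(stream, batch_size=50):
    """Yield lists of alert lines taken from any iterable of text lines."""
    batch = []
    for line in stream:
        line = line.rstrip("\n")
        if is_alert(line):
            batch.append(line)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    # Flush whatever is left once the writer closes the pipe.
    if batch:
        yield batch


def build_email(lines, sender=SENDER, recipient=ADMIN):
    """Bundle one batch of alert lines into a single mail message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Hadoop task errors (%d lines)" % len(lines)
    msg.set_content("\n".join(lines))
    return msg


def main():
    # Opening a FIFO for reading blocks until some writer opens it.
    with open(PIPE_PATH) as pipe:
        for batch in read_alerts(pipe):
            with smtplib.SMTP("localhost") as smtp:
                smtp.send_message(build_email(batch))
```

Calling main() starts the loop. Note that a batch is only mailed once it is full or the writer closes the pipe, so a real script would probably also flush on a timer.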

I'd like to try out Scribe at some point, it looks neat.

M
