Dan Milstein wrote:
Hadoop-folk,
How have people gone about collecting debug/error log information from
streaming jobs in Hadoop?
I'm clear that, if I write to stderr (and it's not a counter/status
line), then it goes onto the node's local disk, in:
/var/log/hadoop/userlogs/<task attempt>/stderr
However, I'd really like to collect those in some central location,
for processing. Possibly via Splunk (which we use right now),
possibly some other means.
- Do people write a custom log4j appender? (does log4j even control
writes to that stderr file? I can't tell -- it somewhat looks like no)
- Or, maybe write cron jobs that run on the slaves and periodically
push logs somewhere?
- Are people outside of Facebook using Scribe?
Any ideas / experiences appreciated.
Thanks,
-Dan Milstein
We use remote syslog for this. All warning/error messages get forwarded
to a central log server. This server writes these messages to a named
pipe. A separate script reads from the named pipe and emails the errors
to the admin.
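
For anyone wanting to try the same thing, the reader side of that setup could look roughly like the sketch below. This is not our actual script, just a minimal illustration in Python: the pipe path, the addresses, the SMTP host and the batch size are all made-up placeholders, and it assumes the syslog daemon on the log server is already configured to write the forwarded messages into the pipe.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical values; none of these come from the thread.
PIPE_PATH = "/var/run/hadoop-errors.fifo"
SMTP_HOST = "localhost"
SENDER = "logserver@example.com"
ADMIN = "admin@example.com"
BATCH_SIZE = 50


def build_email(lines, sender=SENDER, recipient=ADMIN):
    """Pack a batch of log lines into a single email message."""
    msg = EmailMessage()
    msg["Subject"] = f"[hadoop] {len(lines)} warning/error message(s)"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(lines) + "\n")
    return msg


def batches(stream, size=BATCH_SIZE):
    """Yield lists of up to `size` non-empty lines read from `stream`."""
    batch = []
    for line in stream:
        line = line.rstrip("\n")
        if not line:
            continue
        batch.append(line)
        if len(batch) >= size:
            yield batch
            batch = []
    # Flush whatever is left once the writer closes the pipe.
    if batch:
        yield batch


def main():
    # Opening a FIFO for reading blocks until a writer opens the other end.
    with open(PIPE_PATH) as pipe:
        for batch in batches(pipe):
            # One short-lived SMTP connection per batch keeps the sketch simple.
            with smtplib.SMTP(SMTP_HOST) as smtp:
                smtp.send_message(build_email(batch))
```

Calling `main()` starts the loop. Note that a batch is only sent once it is full or the writer closes the pipe, so on a quiet cluster a real script would probably also want to flush on a timer.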
I'd like to try out Scribe at some point, it looks neat.
M