Hadoop-folk,
How have people gone about collecting debug/error log information from
streaming jobs in Hadoop?
I understand that if I write to stderr (and it's not a counter/status
line), it ends up on the node's local disk, in:
/var/log/hadoop/userlogs/<task attempt>/stderr
However, I'd really like to collect those in some central location,
for processing. Possibly via splunk (which we use right now),
possibly some other means.
- Do people write a custom log4j appender? (Does log4j even control
writes to that stderr file? I can't tell -- it somewhat looks like it doesn't.)
- Or, maybe write cron jobs that run on the slaves and periodically
push logs somewhere?
- Are people outside of Facebook using scribe?
Any ideas / experiences appreciated.
Thanks,
-Dan Milstein