You might also look at Chukwa -- this was precisely the original problem Chukwa was designed to solve, and we're pretty much there. Chukwa is a particularly natural fit if you want your logs stored in HDFS.
On Mon, Sep 28, 2009 at 2:18 PM, Dan Milstein <[email protected]> wrote:

> Hadoop-folk,
>
> How have people gone about collecting debug/error log information from
> streaming jobs, in Hadoop?
>
> I'm clear that, if I write to stderr (and it's not a counter/status line),
> then it goes onto the node's local disk, in:
>
> /var/log/hadoop/userlogs/<task attempt>/stderr
>
> However, I'd really like to collect those in some central location for
> processing. Possibly via Splunk (which we use right now), possibly some
> other means.
>
> - Do people write a custom log4j appender? (Does log4j even control writes
> to that stderr file? I can't tell -- it somewhat looks like no.)
>
> - Or maybe write cron jobs that run on the slaves and periodically push
> logs somewhere?
>
> - Are people outside of Facebook using Scribe?
>
> Any ideas / experiences appreciated.
>
> Thanks,
> -Dan Milstein

--
Ari Rabkin [email protected]
UC Berkeley Computer Science Department
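
For what it's worth, here is a minimal sketch of the cron-push option Dan mentions. The userlogs path is the one from his message; the HDFS destination, the per-host/per-attempt layout, and the script itself are all hypothetical, just to show the shape of the approach:

#!/usr/bin/env python
# Hypothetical cron-driven pusher: copies each task attempt's stderr from a
# slave's local userlogs directory into a central HDFS directory. The HDFS
# destination below is an assumption, not any fixed Hadoop convention.
import os
import socket
import subprocess

USERLOGS = "/var/log/hadoop/userlogs"   # local task-attempt logs (from Dan's message)
HDFS_DEST = "/logs/userlogs"            # hypothetical central HDFS location

def push_stderr_logs():
    if not os.path.isdir(USERLOGS):
        return
    host = socket.gethostname()
    for attempt in os.listdir(USERLOGS):
        stderr_path = os.path.join(USERLOGS, attempt, "stderr")
        if not os.path.isfile(stderr_path) or os.path.getsize(stderr_path) == 0:
            continue
        dest_dir = "%s/%s/%s" % (HDFS_DEST, host, attempt)
        # Standard "hadoop fs" shell commands; failures on already-pushed
        # attempts are simply ignored, so repeated cron runs are harmless.
        subprocess.call(["hadoop", "fs", "-mkdir", dest_dir])
        subprocess.call(["hadoop", "fs", "-put", stderr_path, dest_dir + "/stderr"])

if __name__ == "__main__":
    push_stderr_logs()

Writing into per-host, per-attempt directories keeps slaves from clobbering each other; anything beyond that (deduplication, compaction, cleaning up logs that were already shipped) is the sort of bookkeeping Chukwa's agents are meant to handle for you.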
