The typical best practice for a Hadoop setup like that is to use Flume (Log4j2 ships a Flume appender). You could also have a central log aggregation server that your nodes log to (e.g., using the GELF layout for Graylog, or just a TCP server accepting JSON/XML log messages), or you could log via the Kafka appender or similar for distributed logging.
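
For example, a minimal log4j2.xml sketch for the Kafka route could look something like this (the broker list and topic name are just placeholders, not anything specific to your setup):

    <Configuration status="warn">
      <Appenders>
        <!-- Ships each log event to a Kafka topic; broker addresses are placeholders -->
        <Kafka name="Kafka" topic="app-logs">
          <JsonLayout compact="true"/>
          <Property name="bootstrap.servers">kafka1:9092,kafka2:9092</Property>
        </Kafka>
      </Appenders>
      <Loggers>
        <Root level="info">
          <AppenderRef ref="Kafka"/>
        </Root>
      </Loggers>
    </Configuration>

Whatever consumes that topic (a Kafka consumer, Flume, Graylog, etc.) can then write the aggregated events wherever you like, including HDFS, so the individual edge nodes don't have to share a disk.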
On 27 September 2017 at 18:44, Anhad Singh Bhasin <anhadbha...@gmail.com> wrote:
> Hello,
>
> We have TCPSocketServer running on Edge node of a cluster and all other
> data nodes send log events to the TCPSocketServer running on edge node. And
> we are using standard routing to redirect log events to individual log
> files.
>
> We are planning to make our system highly available by adding multiple Edge
> nodes. This means each Edge node would have its own TCPSocketServer and at
> a particular time one Edge node would be up and running.
>
> Since each Edge node would have its own set of log files, Is there a best
> practice for high available systems from Log4j2 to keep all the log files
> in one place.
>
> Can we push the log events into log files in HDFS through the log4j2
> Routing appender?
> Or Do we push all the log events into log files in a shared disk among all
> the edge nodes?
>
> Any suggestions, comments would be deeply appreciated.
>
> Thanks
> Anhad Singh Bhasin

-- 
Matt Sicker <boa...@gmail.com>