Why not use an Avro channel, Gonzalo?

On 26 November 2015 at 20:12, Gonzalo Herreros <[email protected]> wrote:
> You cannot have multiple processes writing concurrently to the same HDFS
> file. What you can do is have a topology where many agents forward to an
> agent that writes to HDFS, but you need a channel that allows the single
> HDFS writer to lag behind without slowing the sources. A Kafka channel
> might be a good choice.
>
> Regards,
> Gonzalo
>
> On 26 November 2015 at 11:57, yogendra reddy <[email protected]>
> wrote:
>
>> Hi All,
>>
>> Here's my current Flume setup for a Hadoop cluster to collect service
>> logs:
>>
>> - Run a Flume agent on each of the nodes
>> - Configure the Flume sink to write to HDFS; the files end up this way:
>>
>> ..flume/events/node0logfile
>> ..flume/events/node1logfile
>>
>> ..flume/events/nodeNlogfile
>>
>> But I want to be able to write all the logs from multiple agents to a
>> single file in HDFS. How can I achieve this, and what would the topology
>> look like?
>> Can this be done via a collector? If yes, where can I run the collector,
>> and how will this scale for a 1000+ node cluster?
>>
>> Thanks,
>> Yogendra
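The fan-in topology Gonzalo describes could be sketched with two Flume agent configurations like the ones below. This is a minimal illustration, not from the thread: the agent names (`node`, `collector`), hostnames, port, log path, Kafka brokers, and topic are all hypothetical placeholders, and the exact Kafka channel property names vary between Flume releases (the `kafka.bootstrap.servers` form shown here is the newer style).

```properties
# Per-node agent: tails a service log and forwards events over Avro RPC
# to a single collector agent. Hostname, port, and paths are hypothetical.
node.sources = logsrc
node.channels = memch
node.sinks = avrosink

node.sources.logsrc.type = exec
node.sources.logsrc.command = tail -F /var/log/service.log
node.sources.logsrc.channels = memch

node.channels.memch.type = memory
node.channels.memch.capacity = 10000

node.sinks.avrosink.type = avro
node.sinks.avrosink.hostname = collector.example.com
node.sinks.avrosink.port = 4545
node.sinks.avrosink.channel = memch

# Collector agent: receives Avro events from all node agents, buffers them
# in a Kafka channel so the single HDFS writer can lag behind the sources
# without applying backpressure, and writes one stream of files to HDFS.
collector.sources = avrosrc
collector.channels = kafkach
collector.sinks = hdfssink

collector.sources.avrosrc.type = avro
collector.sources.avrosrc.bind = 0.0.0.0
collector.sources.avrosrc.port = 4545
collector.sources.avrosrc.channels = kafkach

collector.channels.kafkach.type = org.apache.flume.channel.kafka.KafkaChannel
collector.channels.kafkach.kafka.bootstrap.servers = kafka1:9092,kafka2:9092
collector.channels.kafkach.kafka.topic = flume-channel

collector.sinks.hdfssink.type = hdfs
collector.sinks.hdfssink.hdfs.path = /flume/events
collector.sinks.hdfssink.hdfs.fileType = DataStream
collector.sinks.hdfssink.channel = kafkach
```

Note that even with this layout the collector rolls files over time/size, so "a single file" really means a single sequence of files from one writer; for 1000+ nodes you would likely run several collectors behind a load-balancing sink group rather than one.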
