Hi Zaenal,

There is no "avro channel"; by default, Flume writes Avro to any of the channels. The point is that a memory channel, or even a file channel, will fill up very quickly because a single sink cannot keep up with the many sources.
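For illustration only, here is a minimal sketch of that kind of topology as two Flume property files. The agent names, hostnames, ports, topic, and paths (collector.example.com, kafka1:9092, flume-channel, /flume/events) are placeholders, and the Kafka channel property names follow Flume 1.6, so adjust them for your release.

# per-node agent: tails a local service log and forwards events over Avro RPC
agent1.sources = r1
agent1.channels = c1
agent1.sinks = k1

agent1.sources.r1.type = exec
agent1.sources.r1.command = tail -F /var/log/service.log
agent1.sources.r1.channels = c1

agent1.channels.c1.type = memory
agent1.channels.c1.capacity = 10000

agent1.sinks.k1.type = avro
agent1.sinks.k1.hostname = collector.example.com
agent1.sinks.k1.port = 4545
agent1.sinks.k1.channel = c1

# collector agent: receives Avro from all nodes, buffers in a Kafka channel,
# and a single HDFS sink drains the channel at its own pace
collector.sources = r1
collector.channels = c1
collector.sinks = k1

collector.sources.r1.type = avro
collector.sources.r1.bind = 0.0.0.0
collector.sources.r1.port = 4545
collector.sources.r1.channels = c1

collector.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
collector.channels.c1.brokerList = kafka1:9092,kafka2:9092
collector.channels.c1.topic = flume-channel
collector.channels.c1.zookeeperConnect = zk1:2181

collector.sinks.k1.type = hdfs
collector.sinks.k1.hdfs.path = /flume/events
collector.sinks.k1.hdfs.fileType = DataStream
collector.sinks.k1.hdfs.rollInterval = 300
collector.sinks.k1.hdfs.rollSize = 0
collector.sinks.k1.hdfs.rollCount = 0
collector.sinks.k1.channel = c1

Note that even with this layout the HDFS sink still rolls files by interval or size, so "all logs in one file" in practice means one file per roll period rather than a single ever-growing file.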
Regards,
Gonzalo

On 27 November 2015 at 03:43, zaenal rifai <[email protected]> wrote:

> why not use an avro channel, gonzalo?
>
> On 26 November 2015 at 20:12, Gonzalo Herreros <[email protected]>
> wrote:
>
>> You cannot have multiple processes writing concurrently to the same hdfs
>> file.
>> What you can do is have a topology where many agents forward to an agent
>> that writes to hdfs, but you need a channel that allows the single hdfs
>> writer to lag behind without slowing the sources.
>> A kafka channel might be a good choice.
>>
>> Regards,
>> Gonzalo
>>
>> On 26 November 2015 at 11:57, yogendra reddy <[email protected]>
>> wrote:
>>
>>> Hi All,
>>>
>>> Here's my current flume setup for a hadoop cluster to collect service
>>> logs:
>>>
>>> - Run a flume agent on each of the nodes
>>> - Configure the flume sink to write to hdfs, so the files end up this way:
>>>
>>> ..flume/events/node0logfile
>>> ..flume/events/node1logfile
>>>
>>> ..flume/events/nodeNlogfile
>>>
>>> But I want to be able to write all the logs from multiple agents to a
>>> single file in hdfs. How can I achieve this, and what would the topology
>>> look like?
>>> Can this be done via a collector? If yes, where can I run the collector,
>>> and how will this scale for a 1000+ node cluster?
>>>
>>> Thanks,
>>> Yogendra
