Hi Zaenal,

There is no "avro channel"; by default, Flume writes Avro to any of the channels. The point is that a memory channel, or even a file channel, will fill up very quickly because a single sink cannot keep up with the many sources.
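For illustration only, here is a minimal sketch of that kind of topology as two Flume property files. The agent names, hostnames, ports, topic, and paths (collector.example.com, kafka1:9092, flume-channel, /flume/events) are placeholders, and the Kafka channel property names follow Flume 1.6, so adjust them for your release.

# per-node agent: tails a local service log and forwards events over Avro RPC
agent1.sources = r1
agent1.channels = c1
agent1.sinks = k1

agent1.sources.r1.type = exec
agent1.sources.r1.command = tail -F /var/log/service.log
agent1.sources.r1.channels = c1

agent1.channels.c1.type = memory
agent1.channels.c1.capacity = 10000

agent1.sinks.k1.type = avro
agent1.sinks.k1.hostname = collector.example.com
agent1.sinks.k1.port = 4545
agent1.sinks.k1.channel = c1

# collector agent: receives Avro from all nodes, buffers in a Kafka channel,
# and a single HDFS sink drains the channel at its own pace
collector.sources = r1
collector.channels = c1
collector.sinks = k1

collector.sources.r1.type = avro
collector.sources.r1.bind = 0.0.0.0
collector.sources.r1.port = 4545
collector.sources.r1.channels = c1

collector.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
collector.channels.c1.brokerList = kafka1:9092,kafka2:9092
collector.channels.c1.topic = flume-channel
collector.channels.c1.zookeeperConnect = zk1:2181

collector.sinks.k1.type = hdfs
collector.sinks.k1.hdfs.path = /flume/events
collector.sinks.k1.hdfs.fileType = DataStream
collector.sinks.k1.hdfs.rollInterval = 300
collector.sinks.k1.hdfs.rollSize = 0
collector.sinks.k1.hdfs.rollCount = 0
collector.sinks.k1.channel = c1

Note that even with this layout the HDFS sink still rolls files by interval or size, so "all logs in one file" in practice means one file per roll period rather than a single ever-growing file.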
Regards,
Gonzalo

On 27 November 2015 at 03:43, zaenal rifai <[email protected]> wrote:

> why not use an avro channel, gonzalo?
>
> On 26 November 2015 at 20:12, Gonzalo Herreros <[email protected]>
> wrote:
>
>> You cannot have multiple processes writing concurrently to the same hdfs
>> file.
>> What you can do is have a topology where many agents forward to an agent
>> that writes to hdfs, but you need a channel that allows the single hdfs
>> writer to lag behind without slowing the sources.
>> A kafka channel might be a good choice.
>>
>> Regards,
>> Gonzalo
>>
>> On 26 November 2015 at 11:57, yogendra reddy <[email protected]>
>> wrote:
>>
>>> Hi All,
>>>
>>> Here's my current flume setup for a hadoop cluster to collect service
>>> logs:
>>>
>>> - Run a flume agent on each of the nodes
>>> - Configure the flume sink to write to hdfs, so the files end up this way:
>>>
>>> ..flume/events/node0logfile
>>> ..flume/events/node1logfile
>>>
>>> ..flume/events/nodeNlogfile
>>>
>>> But I want to be able to write all the logs from multiple agents to a
>>> single file in hdfs. How can I achieve this, and what would the topology
>>> look like?
>>> Can this be done via a collector? If yes, where can I run the collector,
>>> and how will this scale for a 1000+ node cluster?
>>>
>>> Thanks,
>>> Yogendra
