Sorry, I meant an Avro sink.
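
To illustrate: each node would run a leaf agent whose sink is an Avro sink
pointing at the collector tier. Just a rough, untested sketch; the source
command, host name, and port below are placeholders, not a recommendation:

# per-node leaf agent (all names, hosts, and ports are made up for illustration)
agent1.sources = taillog
agent1.channels = ch1
agent1.sinks = avroOut

# tail the service log (placeholder command; use whatever source fits your logs)
agent1.sources.taillog.type = exec
agent1.sources.taillog.command = tail -F /var/log/service.log
agent1.sources.taillog.channels = ch1

# small local buffer; the real buffering happens at the collector
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 10000

# forward events over Avro RPC to the collector
agent1.sinks.avroOut.type = avro
agent1.sinks.avroOut.hostname = collector.example.com
agent1.sinks.avroOut.port = 4545
agent1.sinks.avroOut.channel = ch1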
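
And the collector those leaf agents point at would pair an Avro source with
a Kafka channel (as described below in the thread), so the single HDFS writer
can lag behind without back-pressuring a thousand sources. Again only a
sketch, written against the 1.6-era Kafka channel properties, with placeholder
broker and ZooKeeper addresses:

# collector agent: Avro source -> Kafka channel -> single HDFS sink
collector.sources = avroIn
collector.channels = kafkaCh
collector.sinks = hdfsOut

# accept Avro RPC from all the leaf agents
collector.sources.avroIn.type = avro
collector.sources.avroIn.bind = 0.0.0.0
collector.sources.avroIn.port = 4545
collector.sources.avroIn.channels = kafkaCh

# the Kafka topic buffers events durably while the HDFS sink catches up
collector.channels.kafkaCh.type = org.apache.flume.channel.kafka.KafkaChannel
collector.channels.kafkaCh.brokerList = kafka1.example.com:9092
collector.channels.kafkaCh.topic = flume-channel
collector.channels.kafkaCh.zookeeperConnect = zk1.example.com:2181

# one writer, so events from every node land in the same file series
collector.sinks.hdfsOut.type = hdfs
collector.sinks.hdfsOut.hdfs.path = /flume/events
collector.sinks.hdfsOut.hdfs.filePrefix = all-nodes
collector.sinks.hdfsOut.hdfs.fileType = DataStream
collector.sinks.hdfsOut.channel = kafkaCh

Keep in mind the HDFS sink still rolls files (hdfs.rollInterval, hdfs.rollSize,
hdfs.rollCount), so you get one series of files from a single writer rather
than literally one file forever.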



On 27 November 2015 at 14:52, Gonzalo Herreros <[email protected]> wrote:

> Hi Zaenal,
>
> There is no "Avro channel"; Flume writes Avro by default to any of the
> channels.
> The point is that a memory channel or even a file channel will fill up very
> quickly, because a single sink cannot keep up with so many sources.
>
> Regards,
> Gonzalo
>
> On 27 November 2015 at 03:43, zaenal rifai <[email protected]> wrote:
>
>> Why not use an Avro channel, Gonzalo?
>>
>> On 26 November 2015 at 20:12, Gonzalo Herreros <[email protected]>
>> wrote:
>>
>>> You cannot have multiple processes writing concurrently to the same HDFS
>>> file.
>>> What you can do is have a topology where many agents forward to a single
>>> agent that writes to HDFS, but you need a channel that allows that one
>>> HDFS writer to lag behind without slowing the sources.
>>> A Kafka channel might be a good choice.
>>>
>>> Regards,
>>> Gonzalo
>>>
>>> On 26 November 2015 at 11:57, yogendra reddy <[email protected]>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> Here's my current Flume setup for collecting service logs on a Hadoop
>>>> cluster:
>>>>
>>>> - Run a Flume agent on each of the nodes
>>>> - Configure the Flume sink to write to HDFS; the files end up like this:
>>>>
>>>> ..flume/events/node0logfile
>>>> ..flume/events/node1logfile
>>>>
>>>> ..flume/events/nodeNlogfile
>>>>
>>>> But I want to be able to write all the logs from multiple agents to a
>>>> single file in HDFS. How can I achieve this, and what would the topology
>>>> look like?
>>>> Can this be done via a collector? If yes, where should I run the
>>>> collector, and how will this scale for a 1000+ node cluster?
>>>>
>>>> Thanks,
>>>> Yogendra
>>>>
>>>
>>>
>>
>
