Have a look at the Flume Client SDK. One simple approach would be to use the Flume client implementations to send events to Flume sources; this would significantly reduce the number of sources you need to manage.
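For reference, a minimal sketch of sending an event with the Flume SDK's Java `RpcClient` (the hostname, port, and payload are placeholders; this assumes an agent with an Avro source is listening, and requires the Flume SDK on the classpath):

```java
import java.nio.charset.StandardCharsets;

import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.api.RpcClient;
import org.apache.flume.api.RpcClientFactory;
import org.apache.flume.event.EventBuilder;

public class FlumeClientSketch {
    public static void main(String[] args) {
        // Connect to an agent's Avro source (hostname/port are placeholders)
        RpcClient client = RpcClientFactory.getDefaultInstance("agent-host.example.com", 41414);
        try {
            Event event = EventBuilder.withBody("sample payload", StandardCharsets.UTF_8);
            // Send one event; appendBatch() is available for higher throughput
            client.append(event);
        } catch (EventDeliveryException e) {
            // On delivery failure, the client should be rebuilt before retrying
            e.printStackTrace();
        } finally {
            client.close();
        }
    }
}
```

With this pattern the data generators push to a small number of Avro sources, rather than the agent polling thousands of endpoints.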
HTH!

On Thu, Sep 4, 2014 at 9:40 PM, JuanFra Rodriguez Cardoso <[email protected]> wrote:

> Thanks Andrew for your quick response.
>
> My sources (server PUD) can't put events into an aggregation point. For
> this reason I'm following a PollingSource schema, where my agent needs to be
> configured with thousands of sources. Any clues for use cases where data is
> ingested via a polling process?
>
> Regards!
> ---
> JuanFra Rodriguez Cardoso
>
> 2014-09-04 17:41 GMT+02:00 Andrew Ehrlich <[email protected]>:
>
>> One way to avoid managing so many sources would be to have an aggregation
>> point between the data generators and the Flume sources. For example, maybe you
>> could have the data generators put events into a message queue (or queues), then
>> have Flume consume from there?
>>
>> Andrew
>>
>> ---- On Thu, 04 Sep 2014 08:29:04 -0700, JuanFra Rodriguez Cardoso
>> <[email protected]> wrote ----
>>
>> Hi all:
>>
>> Considering an environment with thousands of sources, what are the best
>> practices for managing the agent configuration (flume.conf)? Is it
>> recommended to create a multi-layer topology where each agent takes control
>> of a subset of sources?
>>
>> In that case, a configuration management server (such as Puppet) would be responsible
>> for editing flume.conf with 'agent.sources' parameters from source1 to
>> source3000 (assuming we have 3000 source machines).
>>
>> Are my thoughts aligned with these large-scale data ingest scenarios?
>>
>> Thanks a lot!
>> ---
>> JuanFra

--
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
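As a rough illustration of the multi-layer topology discussed above, here is a minimal sketch of a collector-tier flume.conf (agent names, ports, and the HDFS sink are assumptions for the example, not from the thread):

```properties
# Collector agent: one Avro source fans in events from many leaf agents,
# so this tier needs a single source definition instead of thousands.
collector.sources = avroIn
collector.channels = memCh
collector.sinks = hdfsOut

collector.sources.avroIn.type = avro
collector.sources.avroIn.bind = 0.0.0.0
collector.sources.avroIn.port = 41414
collector.sources.avroIn.channels = memCh

collector.channels.memCh.type = memory
collector.channels.memCh.capacity = 10000

collector.sinks.hdfsOut.type = hdfs
collector.sinks.hdfsOut.hdfs.path = hdfs://namenode/flume/events
collector.sinks.hdfsOut.channel = memCh

# Each leaf agent forwards upstream with an Avro sink, e.g.:
# leaf.sinks.toCollector.type = avro
# leaf.sinks.toCollector.hostname = collector-host
# leaf.sinks.toCollector.port = 41414
```

A configuration management tool (Puppet, as mentioned above) would then only need to template the short leaf configuration per machine, rather than maintain one flume.conf listing thousands of sources.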
