I have a similar need.  I need to ingest 40+ Gbps into a Hadoop grid.
Kafka is acting as my landing zone/front door for the grid.

I tried many variations of using tcpdump, Flume, and other concoctions.  I
ended up building a custom pcap ingest process in C.  The app uses PF_RING
ZC to load balance packets across multiple threads.  I then push the packet
data into Kafka using librdkafka.  Both the pull from PF_RING and the push
to Kafka batch many packets at a time (trading latency for throughput).
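
A minimal sketch of the batching idea described above, not the actual
ingest code.  Packets are copied into a fixed-size batch and handed to a
sink only when the batch fills (or on an explicit flush), so the per-call
overhead is paid once per batch instead of once per packet.  All names
here (batch_t, batch_add, counting_sink, BATCH_CAPACITY, MAX_PKT_LEN) are
hypothetical; in a real application the sink would be where the
librdkafka produce calls go, and the packets would come from PF_RING
rather than a test buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BATCH_CAPACITY 64    /* hypothetical: packets per batch */
#define MAX_PKT_LEN    2048  /* hypothetical: largest packet stored */

typedef struct {
    uint32_t len;
    uint8_t  data[MAX_PKT_LEN];
} packet_t;

/* Called with a full (or explicitly flushed) batch of packets. */
typedef void (*flush_fn)(const packet_t *pkts, size_t count, void *ctx);

typedef struct {
    packet_t pkts[BATCH_CAPACITY];
    size_t   count;
    flush_fn flush;
    void    *ctx;
} batch_t;

static void batch_init(batch_t *b, flush_fn flush, void *ctx)
{
    assert(flush != NULL);
    b->count = 0;
    b->flush = flush;
    b->ctx   = ctx;
}

/* Hand any pending packets to the sink and empty the batch. */
static void batch_flush(batch_t *b)
{
    if (b->count > 0) {
        b->flush(b->pkts, b->count, b->ctx);
        b->count = 0;
    }
}

/* Copy one packet into the batch; flush automatically when it fills.
 * Returns 0 on success, -1 if the packet is too large to store. */
static int batch_add(batch_t *b, const uint8_t *data, uint32_t len)
{
    if (len > MAX_PKT_LEN)
        return -1;
    packet_t *p = &b->pkts[b->count++];
    p->len = len;
    memcpy(p->data, data, len);
    if (b->count == BATCH_CAPACITY)
        batch_flush(b);
    return 0;
}

/* Example sink that only counts what it receives.  A real sink would
 * push each packet in the batch to Kafka here. */
typedef struct {
    size_t flushes;
    size_t packets;
} sink_stats_t;

static void counting_sink(const packet_t *pkts, size_t count, void *ctx)
{
    sink_stats_t *s = ctx;
    (void)pkts;
    s->flushes++;
    s->packets += count;
}
```

A larger BATCH_CAPACITY means fewer sink calls but a longer wait before
the first packet in a batch is delivered, which is the latency for
throughput trade mentioned above.  Calling batch_flush() on a timer or at
shutdown keeps a partially filled batch from sitting indefinitely.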

With the minimal tuning that I have done, it can handle roughly 10-12
Gbps.  I only need to achieve 10 Gbps on a single host and then I am going
to scale horizontally to manage the aggregate pcap that I need to capture.

Right now, the bottleneck is the master thread in PF_RING that dispatches
packets off to each worker thread.  That thread pegs a single CPU core (a
rather beefy core, I might add).  It does not seem capable of handling
additional worker threads to scale beyond 10-12 Gbps.

I wish I had access to the source to review and confirm, but that is how it
appears with the information that I have.


On Thu, Sep 3, 2015 at 11:46 AM, Manny Veloso <[email protected]>
wrote:

> Also, when you say 1k flows per second, is that 1k devices reporting their
> flows every second? We’d need two to three orders of magnitude more
> performance.
> --
> Manny Veloso
> Sr. Solutions Engineer
> Smartrg.com
>
> From: <[email protected]> on behalf of Luca Deri <
> [email protected]>
> Reply-To: "[email protected]" <[email protected]
> >
> Date: Tuesday, September 1, 2015 at 10:52 PM
> To: "[email protected]" <[email protected]>
> Subject: Re: [Ntop-misc] nprobe and kafka?
>
> Manny
> we have added Kafka support to one of our development prototypes, so moving
> it to the official nprobe should not be too difficult. The performance is
> similar to the ZMQ or Elasticsearch implementation, so, considering the JSON
> conversion, it is at least 1k flows/sec
>
> Luca
>
> On 01 Sep 2015, at 23:20, Manny Veloso <[email protected]> wrote:
>
> Hi!
>
> I’m looking to use nprobe as a bridge into kafka. In the splunk app nprobe
> just sends data into splunk. Is that basically the same configuration as a
> kafka install?
>
> Also, what kind of throughput can I expect out of nprobe?
> --
> Manny Veloso
> Sr. Solutions Engineer
> Smartrg.com
> _______________________________________________
> Ntop-misc mailing list
> [email protected]
> http://listgateway.unipi.it/mailman/listinfo/ntop-misc
>



-- 
Nick Allen <[email protected]>