I have a similar need. I need to ingest 40+ Gbps into a Hadoop grid. Kafka is acting as my landing zone/front door for the grid.
I tried many variations of tcpdump, Flume, and other concoctions. I ended up building a custom pcap ingest process in C. The app uses PF_RING ZC to load balance packets across multiple threads. I then push the packet data into Kafka using librdkafka. Both the pull from PF_RING and the push to Kafka batch many packets at a time (trading latency for throughput).

With the minimal tuning that I have done, it can handle roughly 10-12 Gbps. I only need to achieve 10 Gbps on a single host, and then I am going to scale horizontally to handle the aggregate pcap that I need to capture.

Right now, the bottleneck is the master thread in PF_RING that dispatches packets to each worker thread. That thread pegs a single CPU core (a rather beefy core, I might add). It does not seem capable of feeding additional worker threads to scale beyond 10-12 Gbps. I wish I had access to the source to review and confirm, but that is how it appears with the information that I have.

On Thu, Sep 3, 2015 at 11:46 AM, Manny Veloso <[email protected]> wrote:

> Also, when you say 1k flows per second is that 1k devices reporting their
> flows every second? We’d need two to three orders of magnitude more
> performance.
> --
> Manny Veloso
> Sr. Solutions Engineer
> Smartrg.com
>
> From: <[email protected]> on behalf of Luca Deri <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Tuesday, September 1, 2015 at 10:52 PM
> To: "[email protected]" <[email protected]>
> Subject: Re: [Ntop-misc] nprobe and kafka?
>
> Manny
> we have added kafka support on one of our development prototypes so moving
> to the official nprobe should not be too difficult. The performance is
> similar to the ZMQ or elasticsearch implementation, so considering the JSON
> conversion it is at least 1k flows/sec
>
> Luca
>
> On 01 Sep 2015, at 23:20, Manny Veloso <[email protected]> wrote:
>
> Hi!
>
> I’m looking to use nprobe as a bridge into kafka. In the splunk app nprobe
> just sends data into splunk. Is that basically the same configuration as a
> kafka install?
>
> Also, what kind of throughput can I expect out of nprobe?
> --
> Manny Veloso
> Sr. Solutions Engineer
> Smartrg.com
> _______________________________________________
> Ntop-misc mailing list
> [email protected]
> http://listgateway.unipi.it/mailman/listinfo/ntop-misc

--
Nick Allen <[email protected]>
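P.S. For anyone trying to picture the batching mentioned above (many packets pulled and pushed at a time, trading latency for throughput), here is a minimal sketch in C. It is not the actual ingest code: the names batch_t, batch_add, batch_flush and count_sink, and the sizes BATCH_MAX_PKTS and PKT_MAX_LEN, are hypothetical, and the flush callback is only a stand-in for wherever the librdkafka produce calls would go. PF_RING and Kafka are deliberately left out so the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BATCH_MAX_PKTS 64    /* hypothetical batch size, not a tuned value */
#define PKT_MAX_LEN    2048  /* hypothetical cap on bytes kept per packet */

/* One captured packet, copied out of the capture buffer. */
typedef struct {
    uint8_t data[PKT_MAX_LEN];
    size_t  len;
} pkt_t;

/* Called with a whole batch at once. A real worker thread would hand the
 * packets to its Kafka producer here; this sketch leaves that to the caller. */
typedef void (*flush_fn)(const pkt_t *pkts, size_t count, void *ctx);

typedef struct {
    pkt_t    pkts[BATCH_MAX_PKTS];
    size_t   count;   /* packets currently buffered */
    flush_fn flush;
    void    *ctx;     /* opaque pointer passed through to flush */
} batch_t;

void batch_init(batch_t *b, flush_fn flush, void *ctx) {
    b->count = 0;
    b->flush = flush;
    b->ctx = ctx;
}

/* Ship whatever is buffered, even a partial batch (e.g. on a timer or at
 * shutdown), so packets do not sit in the buffer indefinitely. */
void batch_flush(batch_t *b) {
    if (b->count == 0) {
        return;
    }
    b->flush(b->pkts, b->count, b->ctx);
    b->count = 0;
}

/* Buffer one packet and flush automatically once the batch is full.
 * Returns 0 on success, -1 if the packet exceeds PKT_MAX_LEN. */
int batch_add(batch_t *b, const uint8_t *data, size_t len) {
    if (len > PKT_MAX_LEN) {
        return -1;
    }
    assert(b->count < BATCH_MAX_PKTS);
    memcpy(b->pkts[b->count].data, data, len);
    b->pkts[b->count].len = len;
    b->count++;
    if (b->count == BATCH_MAX_PKTS) {
        batch_flush(b);
    }
    return 0;
}

/* Example sink that only counts what it receives. */
typedef struct {
    size_t batches;
    size_t packets;
} counters_t;

void count_sink(const pkt_t *pkts, size_t count, void *ctx) {
    counters_t *c = (counters_t *)ctx;
    (void)pkts;
    c->batches++;
    c->packets += count;
}
```

In a layout like the one described, each worker thread would presumably own one such batch and never share it, so no locking is needed on the hot path. A larger BATCH_MAX_PKTS amortizes per-call overhead at the cost of packets waiting longer before they reach Kafka, which is the latency-for-throughput trade mentioned above.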
