Hi, we have a fairly simple requirement for capturing packets, and I'm trying to figure out the most efficient way to do this. Our setup is as follows:
- Intel 82598, Xeon 6-core 2.2GHz (12 with hyperthreading), and 32GB of RAM
- Traffic coming from up to 12 known IPs, with a single thread per IP to parse and analyze the packets
- Aggregate of up to 2Gbps of traffic

Previously we were using libpcap with PF_RING in transparent mode 1, and it seemed to be dropping packets. Alfredo suggested we switch to DNA clustering, so we're in the process of doing that. However, I have a couple of questions:

1) The DNA load script appears to suggest turning RSS off completely in the Intel driver so that the cluster master thread can do the equivalent distribution in software. Since RSS is done in hardware, won't this have a performance impact?

2) The master would be a single thread whose only job is to pull a packet from the one RX queue on the card, run it through the custom hash function, and forward it using zero-copy to one of the twelve threads. Will one core be enough for a single thread to hash every incoming packet?

3) In this paper: http://luca.ntop.org/imc2010.pdf it is recommended to have the "capture thread" and "polling thread" on a single core so that they can share the same cache. In my case the cluster master would be the "capture thread" (of which there is only one?), and the "polling thread" would be one of the other 12 that capture/analyze data from the unique IPs. Since we only have a 6-core processor (12 with HT), what is the recommended way of distributing the 12 poller threads to minimize cache coherency issues? Since the DNA master occupies part of one core, the pollers would have to share the remaining ones somehow, so I wasn't sure if the answer was to have multiple DNA cluster masters.

Any suggestions would be appreciated. Thanks.
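
P.S. To make question 2 concrete, here is a rough sketch of the per-packet work I have in mind for the master's hash function. It is plain C and does not use any PF_RING/DNA API: the function names (ipv4_src, pick_consumer) and the known_ips table are my own placeholders, it assumes untagged Ethernet II frames carrying IPv4 (no VLAN handling), and the actual zero-copy forwarding is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN     14      /* untagged Ethernet II header */
#define IPV4_MIN_HDR    20      /* IPv4 header without options */
#define ETHERTYPE_IPV4  0x0800

/* Extract the IPv4 source address (host byte order) from a raw frame.
 * Returns 0 on success, -1 if the frame is too short or not IPv4. */
int ipv4_src(const uint8_t *pkt, size_t len, uint32_t *src)
{
    if (len < ETH_HDR_LEN + IPV4_MIN_HDR)
        return -1;

    uint16_t ethertype = (uint16_t)((pkt[12] << 8) | pkt[13]);
    if (ethertype != ETHERTYPE_IPV4)
        return -1;

    const uint8_t *ip = pkt + ETH_HDR_LEN;
    if ((ip[0] >> 4) != 4)          /* IP version field */
        return -1;

    /* Source address sits at bytes 12..15 of the IPv4 header. */
    *src = ((uint32_t)ip[12] << 24) | ((uint32_t)ip[13] << 16) |
           ((uint32_t)ip[14] << 8)  |  (uint32_t)ip[15];
    return 0;
}

/* Map a frame to the index of the consumer thread that owns its source IP.
 * known_ips is a hypothetical table of n addresses in host byte order.
 * Returns -1 for frames that do not come from one of the known IPs. */
int pick_consumer(const uint8_t *pkt, size_t len,
                  const uint32_t *known_ips, int n)
{
    assert(known_ips != NULL);

    uint32_t src;
    if (ipv4_src(pkt, len, &src) != 0)
        return -1;

    /* With at most 12 entries a linear scan stays within a cache line or two. */
    for (int i = 0; i < n; i++) {
        if (known_ips[i] == src)
            return i;
    }
    return -1;
}
```

The idea is that the master would call pick_consumer() on each frame and use the returned index to choose which of the twelve threads receives it, so the per-packet cost is a few header reads plus up to 12 comparisons.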
_______________________________________________ Ntop-misc mailing list [email protected] http://listgateway.unipi.it/mailman/listinfo/ntop-misc
