Hi Alfredo,

thanks for your explanation! Regarding 2., you clarified the usage — I was told that using zbalance_ipc is mandatory with ZC. So I will now try using ZC with RSS.
Regarding 1., I think we confused each other. From what I understand, without RSS enabled I can capture from different virtual queues aka rings (ethX@<queue id>) and pf_ring will take care of distributing packets between them. With RSS enabled, pf_ring will just use the RSS queues and not apply any software distribution, so each virtual queue represents an RSS queue. Right? In that case, my question is: what does pf_ring do if I try to capture from more virtual queues than there are RSS queues available?

Regards,
Jan

________________________________
From: [email protected] [[email protected]] on behalf of Alfredo Cardigliano [[email protected]]
Sent: Tuesday, August 25, 2015 15:32
To: [email protected]
Subject: Re: [Ntop-misc] Using PF_RING ZC with Bro

Hi Jan

let’s recap:

1. with the standard driver, if you enable multiqueue (RSS != 1) you can capture from all the queues (ethX) or from specific queues (ethX@<queue id>)
2. with ZC drivers, when working in ZC mode (prepending “zc:” to the interface name), since the kernel is bypassed and the library attaches directly to a device/queue, using zc:ethX you are capturing from the first queue only (zc:ethX = zc:ethX@0).

The answer to your question is: in standard mode, pf_ring merges all the queues automatically, while in ZC mode you need to explicitly specify the queues. The maximum number of RSS queues is 16.

When using zbalance_ipc it is recommended to disable RSS, otherwise you have to explicitly specify all the queues (-i zc:ethX@0,zc:ethX@1,..) and RSS is useless in this case since distribution happens in software.

Alfredo

On 25 Aug 2015, at 11:07, Jan Grashofer <[email protected]> wrote:

Hi Alfredo,

Before using ZC, I had configured Bro to use pf_ring for interface eth3.
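The queue-naming rules Alfredo recaps above can be sketched with pfcount, the sample capture tool shipped with PF_RING (eth3 is simply the interface name used elsewhere in this thread):

```shell
# Standard driver: pf_ring merges all RSS queues behind ethX
pfcount -i eth3          # captures from every queue, merged automatically
pfcount -i eth3@2        # captures from RSS queue 2 only

# ZC mode: the library attaches to exactly one device/queue,
# so zc:eth3 is shorthand for zc:eth3@0
pfcount -i zc:eth3@0     # first RSS queue
pfcount -i zc:eth3@1     # second RSS queue (must be opened explicitly)
```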
According to your howtos, pf_ring gets loaded with:

modprobe pf_ring enable_tx_capture=0 min_num_slots=32768

By having a look at the load_driver.sh file I also enabled RSS:

modprobe ixgbe-zc RSS=1,16 numa_cpu_affinity=0,0

(eth3 is the second port.) I thought pf_ring would now automatically utilize RSS, as it’s enabled by the driver. Am I wrong? Does pf_ring merge all the queues and do software distribution? That would be annoying! So my question was: in case pf_ring automatically utilizes the RSS queues, what would happen if I configure Bro to use more pf_ring queues than there are RSS queues available?

Regards,
Jan

________________________________
From: [email protected] [[email protected]] on behalf of Alfredo Cardigliano [[email protected]]
Sent: Tuesday, August 25, 2015 09:36
To: [email protected]
Subject: Re: [Ntop-misc] Using PF_RING ZC with Bro

It just does not make sense to use multiple RSS queues (hw distribution) together with software distribution; if you enable RSS, you have to explicitly capture from all the queues, listing them in zbalance_ipc. I am not sure I got the question.

Alfredo

On 24 Aug 2015, at 13:35, Jan Grashofer <[email protected]> wrote:

Hi Alfredo,

thanks for your advice. Can you tell me what will happen if I use more workers together with RSS?

Best regards,
Jan

________________________________
From: [email protected] [[email protected]] on behalf of Alfredo Cardigliano [[email protected]]
Sent: Friday, August 21, 2015 15:19
To: [email protected]
Subject: Re: [Ntop-misc] Using PF_RING ZC with Bro

On 21 Aug 2015, at 12:33, Jan Grashofer <[email protected]> wrote:

Hi Luca,

thank you for your fast reply! I thought using ZC might improve the transfer of the data to the processes.
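The driver setup described above, as a sketch (module parameters are copied from the message; the /proc path is where PF_RING reports per-device info):

```shell
# Load the PF_RING kernel module (no TX capture, 32768-slot rings)
modprobe pf_ring enable_tx_capture=0 min_num_slots=32768

# Load the ZC-aware ixgbe driver: RSS=1,16 requests 1 queue on the
# first port and 16 RSS queues on the second port (eth3 here);
# numa_cpu_affinity binds both ports' memory to NUMA node 0
modprobe ixgbe-zc RSS=1,16 numa_cpu_affinity=0,0

# Verify the queue count the driver actually created
cat /proc/net/pf_ring/dev/eth3/info
```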
Of course I wondered why I need to run a balancer process, but I thought there might be more magic behind the scenes and that the balancer process is just a frontend.

Bro is the real bottleneck and has very poor performance, thus the packet capture acceleration provided by ZC does not affect performance.

Unfortunately the Intel cards are limited to 16 RSS queues. Do you think there is any possibility to further improve the PF_RING/ixgbe-driver setup?

If you need more than 16 queues, you should disable RSS and use zbalance_ipc with as many egress queues as you need.

Alfredo

Best regards,
Jan

________________________________
From: [email protected] [[email protected]] on behalf of Luca Deri [[email protected]]
Sent: Friday, August 21, 2015 12:03
To: [email protected]
Subject: Re: [Ntop-misc] Using PF_RING ZC with Bro

Hi Jan,
we have been notified that the latest Bro version fixes some memory leaks that might cause the crash you have experienced.

On 21 Aug 2015, at 11:36, Jan Grashofer <[email protected]> wrote:

Hi all,

I was trying to use PF_RING ZC with Bro to improve performance. Unfortunately I was not able to get any benefit from using ZC.

My test system uses:
2 x Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz (16 cores + 16 HT in sum)
64 GB RAM
82599ES 10-Gigabit SFI/SFP+ NIC (ixgbe-zc driver) seeing 6 GBit/s on average
Linux (2.6.32-504.23.4.el6.x86_64)
PFRING_ZC v.6.1.1.150527 (enable_tx_capture=0 min_num_slots=32768)

My reference setup is the following:
16 workers (pinned to CPUs), 1 proxy
16 RSS queues with interrupts pinned, NUMA affinity set to first socket (next to the NIC)
Capture-loss average: 0.000262957682292
Capture-loss maximum: 0.022336

These are of course good stats.
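The reference setup above corresponds roughly to a Bro node.cfg like the following sketch (hostnames and the CPU list are assumptions; lb_method/lb_procs/pin_cpus are Bro's built-in load-balancing options):

```ini
[manager]
type=manager
host=localhost

[proxy-1]
type=proxy
host=localhost

[worker-1]
type=worker
host=localhost
interface=eth3
# Spawn 16 pf_ring-balanced capture processes, one per RSS queue
lb_method=pf_ring
lb_procs=16
pin_cpus=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
```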
My first try with ZC:
16 workers (pinned to CPUs), 1 proxy
16 RSS queues, NUMA affinity set to first socket
hugepages enabled
zbalance_ipc -n 16 -m 1 -g 1
Capture-loss average: 0.289625722005
Capture-loss maximum: 9.155165

Obviously this was no improvement, but I read that using RSS together with ZC is counterproductive, so I tried another setup.

My second try with ZC:
14 workers (pinned to CPUs, first of each socket free), 1 proxy
32 RX/TX queues (default setting), NUMA affinity set to first socket
hugepages enabled
zbalance_ipc -n 16 -m 1 -g 0
Capture-loss average: 1.78585592428
Capture-loss maximum: 35.112188

Finally this got even worse (one factor may of course be the reduced number of workers, but still no ZC improvement). In addition to the bad stats, Bro segfaults from time to time when I am using ZC. As capture-loss correlates with packet drops, I think I may have missed some essential thing in my setup. Do you have any suggestions?

If you can use RSS you do not need the balancer, which uses memory bandwidth and does not add benefits. In fact, with so many apps running you put quite some pressure on the memory subsystem, which can cause the drawback.

From our experience Bro is a great tool, but it uses quite some resources. If you need speed you should probably consider other options such as Suricata.
Regards
Luca

Thanks,
Jan

_______________________________________________
Ntop-misc mailing list
[email protected]
http://listgateway.unipi.it/mailman/listinfo/ntop-misc
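Alfredo's recommendation from earlier in the thread — disable RSS and distribute in software with zbalance_ipc — might look like the following sketch (the cluster id 99 and the hugepage count are arbitrary assumptions, not values from the thread):

```shell
# Reload the ZC driver with RSS disabled on the capture port
modprobe -r ixgbe-zc
modprobe ixgbe-zc RSS=1,1

# Reserve hugepages for the ZC buffers (1024 x 2 MB pages)
echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

# Balance one ZC interface to 16 egress queues in software:
# -c 99 = cluster id, -m 1 = IP-hash distribution,
# -g 0  = pin the balancer thread to core 0
zbalance_ipc -i zc:eth3 -c 99 -n 16 -m 1 -g 0

# Each consumer then attaches to one egress queue of the cluster, e.g.:
#   pfcount -i zc:99@0
```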
