Hi Alfredo,

Before using ZC, I had configured Bro to use pf_ring on interface eth3.
According to your howtos, pf_ring is loaded with:
modprobe pf_ring enable_tx_capture=0 min_num_slots=32768

Having a look at the load_driver.sh file, I also enabled RSS:
modprobe ixgbe-zc RSS=1,16 numa_cpu_affinity=0,0 (eth3 is the second interface)

I thought pf_ring would now automatically utilize RSS, as it is enabled by the 
driver. Am I wrong? Does pf_ring merge all the queues and do software 
distribution instead? That would be annoying!
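(As a side note, one way to check how many RX queues the driver actually created is via ethtool and the pf_ring proc interface; output will of course depend on the driver and ethtool versions:)

```shell
# Show the number of hardware RX/TX queues (channels) the driver created;
# with RSS=1,16 as above, eth3 should report 16 combined channels.
ethtool -l eth3

# PF_RING also exposes per-device information under /proc:
cat /proc/net/pf_ring/dev/eth3/info
```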

So my question is: in case pf_ring automatically utilizes the RSS queues, what 
happens if I configure Bro to use more pf_ring queues than there are RSS queues 
available?

Regards,
Jan

________________________________
From: [email protected] 
[[email protected]] on behalf of Alfredo Cardigliano 
[[email protected]]
Sent: Tuesday, August 25, 2015 09:36
To: [email protected]
Subject: Re: [Ntop-misc] Using PF_RING ZC with Bro

It just does not make sense to use multiple RSS queues (hw distribution) 
together with software distribution. If you enable RSS, you have to explicitly 
capture from all the queues by listing them in zbalance_ipc. I am not sure I 
got the question.
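(A sketch of what that would look like, assuming eth3 with 4 RSS queues for brevity; in ZC, queue N of a device is addressed as zc:ethX@N, and the cluster ID 99 here is an arbitrary example:)

```shell
# Capture from all RSS queues of eth3 explicitly and rebalance them in
# software to 4 egress queues (-n 4), IP-hash distribution (-m 1),
# balancer pinned to core 1 (-g 1).
zbalance_ipc -i zc:eth3@0,zc:eth3@1,zc:eth3@2,zc:eth3@3 -c 99 -n 4 -m 1 -g 1
```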

Alfredo

On 24 Aug 2015, at 13:35, Jan Grashofer <[email protected]> wrote:

Hi Alfredo,

thanks for your advice. Can you tell me what will happen if I use more workers 
together with RSS?

Best regards,
Jan

________________________________
From: [email protected] 
[[email protected]] on behalf of Alfredo Cardigliano 
[[email protected]]
Sent: Friday, August 21, 2015 15:19
To: [email protected]<mailto:[email protected]>
Subject: Re: [Ntop-misc] Using PF_RING ZC with Bro


On 21 Aug 2015, at 12:33, Jan Grashofer <[email protected]> wrote:

Hi Luca,

thank you for your fast reply! I thought using ZC might improve the transfer of 
the data to the processes. Of course I wondered why I need to run a 
balancer process, but I thought there might be more magic behind the scenes and 
that the balancer process is just a frontend.

Bro is the real bottleneck and has very poor performance; thus the packet 
capture acceleration provided by ZC does not affect overall performance.

Unfortunately the Intel cards are limited to 16 RSS queues. Do you think there 
is any possibility to further improve the PF_RING/ixgbe-driver setup?

If you need more than 16 queues, you should disable RSS and use zbalance_ipc 
with as many egress queues as you need.
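(A sketch of that setup, assuming eth3 is the second port, 32 workers are wanted, and an arbitrary example cluster ID of 99:)

```shell
# Load the ZC driver with a single RSS queue on eth3
# (no hardware distribution; first port keeps 1 queue as well)
modprobe ixgbe-zc RSS=1,1

# Let zbalance_ipc do all the distribution in software:
# capture from the single queue of eth3 and fan out to 32 egress
# queues, using IP-hash distribution (-m 1), balancer on core 0 (-g 0)
zbalance_ipc -i zc:eth3 -c 99 -n 32 -m 1 -g 0
```

The consumers would then attach to the egress queues as zc:99@0 through zc:99@31 (e.g. in Bro's node.cfg).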

Alfredo


Best regards,
Jan

________________________________
From: [email protected] 
[[email protected]] on behalf of Luca Deri [[email protected]]
Sent: Friday, August 21, 2015 12:03
To: [email protected]<mailto:[email protected]>
Subject: Re: [Ntop-misc] Using PF_RING ZC with Bro

Hi Jan,
we have been notified that the latest Bro version fixes some memory leaks that 
might have caused the crash you experienced.


On 21 Aug 2015, at 11:36, Jan Grashofer <[email protected]> wrote:

Hi all,

I was trying to use PF_RING ZC with Bro to improve performance. Unfortunately, 
I was not able to get any benefit from using ZC.

My test system uses:
2 x Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz (16 cores + 16 HT threads in total)
64 GB RAM
82599ES 10-Gigabit SFI/SFP+ NIC (ixgbe-zc driver) seeing 6 GBit/s on average
Linux (2.6.32-504.23.4.el6.x86_64)
PFRING_ZC v.6.1.1.150527 (enable_tx_capture=0 min_num_slots=32768)

My reference setup is the following:
16 workers (pinned to cpus), 1 proxy
16 RSS queues with interrupts pinned, NUMA affinity set to first socket (next 
to the NIC)
Capture-loss average: 0.000262957682292
Capture-loss maximum: 0.022336
These are of course good stats.

My first try with ZC:
16 workers (pinned to cpus), 1 proxy
16 RSS queues, NUMA affinity set to first socket
hugepages enabled
zbalance_ipc -n 16 -m 1 -g 1
Capture-loss average: 0.289625722005
Capture-loss maximum: 9.155165
Obviously this was no improvement, but I read that using RSS together with ZC 
is counterproductive, so I tried another setup.

My second try with ZC:
14 workers (pinned to cpus, first of each socket free), 1 proxy
32 RX/TX queues (default setting), NUMA affinity set to first socket
hugepages enabled
zbalance_ipc -n 16 -m 1 -g 0
Capture-loss average: 1.78585592428
Capture-loss maximum: 35.112188
This turned out even worse (one factor may of course be the reduced number of 
workers, but there was still no ZC improvement).

In addition to the bad stats, Bro segfaults from time to time when I am using 
ZC. As capture loss correlates with packet drops, I think I may have missed 
something essential in my setup. Do you have any suggestions?


If you can use RSS, you do not need the balancer, which uses memory bandwidth 
and does not add benefits. In fact, with so many apps running you put quite 
some pressure on the memory subsystem, which can cause the drawback.

From our experience Bro is a great tool, but it uses quite some resources. If 
you need speed, you should probably consider other options such as Suricata.

Regards,
Luca



Thanks,
Jan
_______________________________________________
Ntop-misc mailing list
[email protected]
http://listgateway.unipi.it/mailman/listinfo/ntop-misc
