Hi,

Our Suricata instance, running on PF_RING with libzero, has recently been dropping packets at ~2Gb/s load, even though the CPU cores are generally not maxed out. So I've been looking again at the more recent PF_RING options :-)
The setup is a Dell R620 with 64GB RAM (OK, I should add more), two CPUs with 8 cores each (hyperthreading turned off), and an ixgbe Intel 10Gb dual-port card, of which I'm using just one port. I'm on PF_RING 6.0.2 at the moment. I must admit I'm a bit confused!

I load the DNA ixgbe driver with

  insmod ixgbe.ko RSS=1,1 mtu=1522 adapters_to_enable=xx:xx:xx:xx:xx:xx

(the port I'm using), then run

  pfdnacluster_master -i dna0 -c 1 -n 15,1 -r 15 -d

Suricata then runs (in "workers" runmode) using dnacl:1@0 ... dnacl:1@14, and we run ARGUS (using libpcap) on dnacl:1@15.

So, questions:

1) How does CPU affinity work in libzero (or ZC)? There are no IRQs to fix... Does it bind dnacl:1@0 to core 0, dnacl:1@1 to core 1, etc.? What should the RX thread (pfdnacluster_master -r) be bound to?

2) After reading http://www.ntop.org/pf_ring/not-all-servers-are-alike-with-pf_ring-zcdna-part-3/ I'm wondering whether I would be better off running just 8 queues (or 7, plus 1 for ARGUS) and somehow forcing them onto the NUMA node the ixgbe card is attached to. (If yes, how do I bind libzero to cores 0,2,4,6,8,10,12,14, or whatever numactl says is on the same node as the NIC?)

3) Hugepages work, in the sense that I can allocate 1024 2048KB pages as suggested in README.hugepages and then run pfdnacluster_master with the "-u /mnt/huge" option, after which pfcount, tcpdump etc. work. However, Suricata always crashes out. Similarly, if I start pfdnacluster_master without huge pages, then Suricata, then stop and restart pfdnacluster_master with huge pages while Suricata is still running, the restart fails (but restarting without huge pages is fine). If I start the ZC version of ixgbe (which of course needs huge pages) and use

  zbalance_ipc -i zc:eth4 -c 1 -n 15,1 -m 1

(with Suricata talking to zc:1@0 ... zc:1@14), then Suricata also fails in a similar way (errors like "[ERRCODE: SC_ERR_PF_RING_OPEN(34)] - Failed to open zc:1@0: pfring_open error. Check if zc:1@0 exists"), though pfcount and tcpdump are fine.
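For reference, the sort of NUMA pinning I have in mind for question 2 (a rough sketch only; the interface name eth4 and node 0 are assumptions, not what I've verified on this box) would be:

```shell
# Which NUMA node is the NIC attached to?
# (-1 means the kernel doesn't know, or it's a single-node box)
cat /sys/class/net/eth4/device/numa_node

# Which cores belong to that node? (assuming it printed 0)
numactl --hardware | grep 'node 0 cpus'

# Start the cluster master with CPU and memory bound to that node
numactl --cpunodebind=0 --membind=0 \
    pfdnacluster_master -i dna0 -c 1 -n 7,1 -r 7 -d
```

That still leaves open how the per-queue consumer threads (Suricata's workers) get pinned to individual cores on that node, which is really what I'm asking in question 1.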
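For scale, my back-of-envelope hugepage sizing (with assumed slot counts and buffer sizes, not measured ones) looks like this:

```shell
# Rough sizing sketch: assume one 2048-byte buffer per slot,
# 8192 slots per queue, 16 queues (15 for Suricata + 1 for ARGUS),
# doubled as a generous fudge factor for cluster/TX overhead.
SLOT_BYTES=2048
SLOTS_PER_QUEUE=8192
QUEUES=16
BYTES=$((SLOT_BYTES * SLOTS_PER_QUEUE * QUEUES * 2))

# 2 MiB hugepages needed:
echo $((BYTES / (2 * 1024 * 1024)))

# 1 GiB hugepages needed (rounded up):
echo $(( (BYTES + 1024*1024*1024 - 1) / (1024*1024*1024) ))
```

On those assumed numbers that comes to 512 MiB total, i.e. 256 of the 2 MiB pages (well within the 1024 I'm allocating) or a single 1 GiB page, which is partly why I'm asking in question 3 whether 1GB pages are worth it.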
Is it worth going for 1GB pages (which are available), and if so, how many would I need?

4) Is it worth increasing the number of slots in each queue (pfdnacluster_master -q) or num_rx_slots (when loading ixgbe)?

(We've replaced our border switches with ones our Network Manager is confident won't crash if somehow PF_RING *sends* packets to the mirrored port - that crashed one of the old switches - so I'm allowed to reload PF_RING + NIC drivers without going through Change Management and "at-risk" periods now :-) )

Best Wishes,
Chris
--
--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
Christopher Wakelin, [email protected]
IT Services Centre, The University of Reading, Tel: +44 (0)118 378 2908
Whiteknights, Reading, RG6 6AF, UK              Fax: +44 (0)118 975 3094
_______________________________________________
Ntop-misc mailing list
[email protected]
http://listgateway.unipi.it/mailman/listinfo/ntop-misc
