Hi Alfredo,

using hugepages improved the performance considerably (e.g. from about 85% to
about 90% of packets captured). Sending bigger (and therefore fewer) packets
also helped. What a surprise.
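For reference, the hugepages setup boils down to something like the following
(the page count and mount point are only examples; PF_RING/README.hugepages has
the details):

echo 1024 > /proc/sys/vm/nr_hugepages   # reserve huge pages (page size depends on kernel/arch)
mkdir -p /mnt/huge
mount -t hugetlbfs none /mnt/huge       # expose the reserved pages via hugetlbfs
# then add -u to the pfdnacluster_master command line as suggested
# (check its help output for whether -u takes an argument in your version)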
The machine has 16 GB DDR3-1333 memory. The fact that it's a 32-bit kernel
(don't ask me why) might be an issue...

I assume that with current hardware libzero can't handle much more than
15 Mpps at best. Is that correct?

One more question: what, for example, is the difference between "RSS=1,1,1,1"
and "RSS=1,1"?

Best Regards
Martin

On Thu, Mar 14, 2013 at 08:42:52PM +0100, Alfredo Cardigliano wrote:
> Hi Martin and Craig,
> let me clarify a few points:
> - min_num_slots and transparent_mode are kernel-level settings that apply to
>   standard rings, not to DNA/Libzero (the kernel is bypassed in that case).
> - RSS is useful for balancing traffic with DNA, but it has some limitations;
>   for this reason we developed the Libzero DNA Cluster. The latter can use a
>   custom distribution function, replacing RSS with a flexible user-defined
>   function: this means you should load the driver with RSS disabled
>   (RSS=1,1,1,1).
>
> This said, the load_dna_driver.sh script we provide with the drivers should
> be fine for Martin: RSS=1,1,1,1 will disable RSS (single queue), and you can
> add num_rx_slots=32768, as suggested by Craig, to set the number of NIC
> slots to the maximum.
> As Martin said, the master is the real bottleneck, as it is a centralisation
> point with a computationally intensive task: it has to read each packet from
> the NIC, parse it, hash it, and deliver it to the slaves, all in a few clock
> cycles. This is also the reason for "the more slaves, the lower the RX value
> gets": more slaves means more data structures in memory and thus more stress
> on the cache. In order to do this at 10G wire rate you need a good machine,
> with a good CPU and a good memory hierarchy (your CPU actually looks fast
> enough; I can't comment on your memory).
> Probably using hugepages can help you a bit: have a look at
> PF_RING/README.hugepages and the -u parameter of pfdnacluster_master.
>
> Best Regards
> Alfredo
>
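(A minimal sketch of what that amounts to, using the relative paths from
load_dna_driver.sh; the interface name is just an example:)

insmod ../../../../kernel/pf_ring.ko
insmod ./ixgbe.ko RSS=1,1,1,1 num_rx_slots=32768   # single queue per port (RSS disabled), max NIC slots
ifconfig dna0 up
bash ../scripts/set_irq_affinity.sh dna0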
> On Mar 14, 2013, at 7:58 PM, Craig Merchant <[email protected]> wrote:
>
> > From my understanding and experience, you don't use RSS with DNA/Libzero.
> > The RSS queues are limited to 16 queues.
> >
> > The major value of using DNA/Libzero is that it lets you use more queues
> > than RSS does.
> >
> > Try the settings below... It's been a while since I set this up, but I
> > remember having some issues that required me to force pf_ring to load
> > before the ixgbe driver.
> >
> > options ixgbe MQ=0,0 num_rx_slots=32768
> > options pf_ring min_num_slots=65536 transparent_mode=1
> > install ixgbe /sbin/modprobe pf_ring $CMDLINE_OPTS; /sbin/modprobe --ignore-install ixgbe $CMDLINE_OPTS
> >
> > -----Original Message-----
> > From: [email protected]
> > [mailto:[email protected]] On Behalf Of Martin Kummer
> > Sent: Thursday, March 14, 2013 11:18 AM
> > To: [email protected]
> > Subject: Re: [Ntop-misc] libzero performance
> >
> > Wow, thx for the quick answer.
> >
> > I just use the provided script
> > (drivers/DNA/ixgbe-3.10.16-DNA/src/load_dna_driver.sh) to load the
> > drivers. In short it does:
> > insmod ../../../../kernel/pf_ring.ko
> > insmod ./ixgbe.ko RSS=1,1,1,1
> > ifconfig dna1 up
> > bash ../scripts/set_irq_affinity.sh ${IF[index]}
> >
> > The sysadmin has forbidden me to install these drivers permanently, so
> > there's nothing in /etc/modprobe.d/*.conf.
> >
> > Martin
> >
> >
> > On Thu, Mar 14, 2013 at 05:44:48PM +0000, Craig Merchant wrote:
> >> Martin,
> >>
> >> I'm running pfdnacluster_master on an interface that averages between
> >> 3-10 Gbps. The traffic is copied to 28 queues (0-27). The 28th queue
> >> contains a copy of all of the traffic. I don't have any issues with
> >> packets being dropped.
> >>
> >> How are you initializing the ixgbe and pf_ring drivers in your
> >> /etc/modprobe.d/*.conf file? Mine looks something like:
> >>
> >> options igb RSS=8,8
> >> options ixgbe MQ=0,0 num_rx_slots=32768
> >> options pf_ring min_num_slots=65536 transparent_mode=1
> >> install ixgbe /sbin/modprobe pf_ring $CMDLINE_OPTS; /sbin/modprobe --ignore-install ixgbe $CMDLINE_OPTS
> >>
> >> How are you bringing up the interface? I'm using DNA/Libzero for Snort,
> >> so I bring up the interface in the Snort init script with something like:
> >>
> >> function adapter_settings() {
> >>     ifconfig dna0 up promisc
> >>     ethtool -K dna0 tso off &>/dev/null
> >>     ethtool -K dna0 gro off &>/dev/null
> >>     ethtool -K dna0 lro off &>/dev/null
> >>     ethtool -K dna0 gso off &>/dev/null
> >>     ethtool -G dna0 tx 32768 &>/dev/null
> >>     ethtool -G dna0 rx 32768 &>/dev/null
> >> }
> >>
> >> Thanks.
> >>
> >> Craig
> >>
> >> -----Original Message-----
> >> From: [email protected]
> >> [mailto:[email protected]] On Behalf Of Martin Kummer
> >> Sent: Thursday, March 14, 2013 10:24 AM
> >> To: [email protected]
> >> Subject: [Ntop-misc] libzero performance
> >>
> >> Hi everyone.
> >>
> >> For my bachelor thesis I'm modifying Vermont
> >> (https://github.com/constcast/vermont/wiki) to use PF_RING/libzero
> >> instead of pcap.
> >>
> >> To test libzero I used one of the examples on the website: on one host
> >> I used pfdnacluster_master and pfcount. From another host I sent 100M
> >> packets at 10 Gbps (about 12 Mpps).
> >>
> >> There are two issues:
> >> • When I use just one slave (pfdnacluster_master [...] -n 1) the
> >>   performance is very good: I get about 99% of the packets. When I split
> >>   the data between two slaves, that number drops to about 90%. The last
> >>   line of "pfdnacluster_master -n 2" output:
> >>   Absolute Stats: RX 90'276'085 pkts [2'314'438.07 pkt/sec]
> >>   Processed 90'276'085 pkts [2'314'438.07 pkt/sec]
> >>   The more slaves, the lower the RX value gets.
> >>
> >> • Even with just one slave, pfdnacluster_master uses almost 100% of a
> >>   CPU core. While not yet a problem for me, this is likely to be the
> >>   next bottleneck.
> >>
> >> Is there a way to increase the performance of a dnacluster when there
> >> are several slaves?
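(For reference, the master/slave invocation from the ntop examples is roughly
the following; the cluster id, interface name, and slave count are
placeholders, see the pfdnacluster_master help output for the exact options:)

pfdnacluster_master -i dna0 -c 10 -n 2   # master: read dna0, create cluster 10 with 2 slave queues
pfcount -i dnacluster:10@0               # slave 0 attaches to queue 0 of cluster 10
pfcount -i dnacluster:10@1               # slave 1 attaches to queue 1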
> >>
> >> The software used:
> >> - the current SVN checkout of ntop
> >> - the DNA ixgbe driver by ntop
> >>
> >> The hardware used:
> >> - Core i7-3930K (6C, HT, 3.2 GHz)
> >> - Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+
> >>   Network Connection (rev 01)
> >>   Subsystem: Intel Corporation Ethernet Server Adapter X520-2
> >>
> >> best regards,
> >> Martin

_______________________________________________
Ntop-misc mailing list
[email protected]
http://listgateway.unipi.it/mailman/listinfo/ntop-misc
