On 03 Sep 2014, at 20:00, Y M <[email protected]> wrote:

> Thanks Alfredo,
> 
> OK, if I am getting this right: if I want to establish queues = # of cores, 
> then I would go with RSS=0 (for one physical interface) or RSS=0,0 (for 
> two physical interfaces), and for 4 queues only, RSS=4 (for one physical 
> interface), correct?

I am not sure you got it. For example, if you have 2 interfaces and you want to 
enable 4 RSS queues on each interface, you should use RSS=4,4
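For instance, loading the driver for two interfaces with 4 queues each might look like this (a sketch, assuming the igb driver shipped with PF_RING; the module path is illustrative):

```shell
# Unload first if the driver is already loaded (ignore errors if it is not)
rmmod igb 2>/dev/null

# RSS is a comma-separated list, one value per interface:
# here both interfaces get 4 RSS queues each.
insmod ./igb.ko RSS=4,4
```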

> Yes I am using the standard mode.
> 
> What I meant by "things do not go well" is that after making the necessary 
> changes - insmod'ing the driver with RSS and setting smp_affinity - I found 
> that performance degraded (mostly an error on my side :)). I am just 
> trying to understand the process of reverting back to the original state 
> before the changes took place.

Yes, just unload and reload the driver with the right configuration.
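In shell terms, reverting might look like the following (a sketch; the module parameters are the ones already used in this thread, and igb must be removed before pf_ring since it depends on it):

```shell
# Remove the igb driver first, then pf_ring itself
rmmod igb
rmmod pf_ring

# Reload with the original configuration
insmod ./pf_ring.ko transparent_mode=2 min_num_slots=65536 enable_tx_capture=0
insmod ./igb.ko
```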

Alfredo

> 
> Thanks again.
> 
> YM 
> 
> From: [email protected]
> Date: Wed, 3 Sep 2014 19:28:19 +0200
> To: [email protected]
> Subject: Re: [Ntop-misc] Advice on setting RSS/smp_affinity
> 
> Hi
> please read below
> 
> On 03 Sep 2014, at 14:33, Y M <[email protected]> wrote:
> 
> I am trying to properly set up/understand RSS and smp_affinity using the 
> scripts (load_driver and set_irq_affinity) provided with the PF_RING tarball, 
> without CPU binding. I will put the questions upfront and the details will 
> follow. I would appreciate any help, though the questions may not be tightly 
> related to PF_RING. Other setup/configuration steps are already done and are 
> working as expected.
> 
> 1. How do the values assigned to RSS when insmod'ing the driver actually map 
> to the queues? Such as RSS=0,0 or RSS=1,1,1,1 or RSS=4,4.
> 
> Each number in the comma-separated list is per interface, where:
> 0 - number of RSS queues = number of CPU cores
> 1 - single queue
> 4 - 4 RSS queues
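To make the mapping concrete, the three forms mentioned above would be passed to the driver like this (a sketch; the number of values must match the number of interfaces the module drives):

```shell
insmod ./igb.ko RSS=0,0      # queues = CPU cores, on both of 2 interfaces
insmod ./igb.ko RSS=1,1,1,1  # a single queue on each of 4 interfaces
insmod ./igb.ko RSS=4,4      # 4 RSS queues on both of 2 interfaces
```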
> 
> 2. The application at the other end that is consuming the traffic has 4 
> processes running. Does this mean that only 4 queues/4 CPUs should be mapped, 
> or can I have the max. number of queues IRQ'ed - in this case 8 - to the 6 
> cores and let the scheduler handle the rest?
> 
> If you are using DNA/ZC you should use 4 queues and open one queue per 
> application,
> with standard drivers it depends, if you are using in-kernel clustering you 
> can use as many queues as you want.
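With standard drivers and in-kernel clustering, multiple consumer processes join the same cluster id and PF_RING balances packets across them (a sketch using the pfcount demo application bundled with PF_RING; cluster id 10 is arbitrary):

```shell
# Start 4 consumers on the same interface, all in cluster 10;
# PF_RING hashes incoming packets across the cluster members.
for i in 1 2 3 4; do
  ./pfcount -i eth2 -c 10 &
done
```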
> 
> 3. What is the easiest way of backing out if things do not go well; would 
> rmmod'ing and reloading the driver suffice? I am asking because I am dealing 
> with a production system already and don't have a spare machine to test on.
> 
> What do you mean with “things do not go well”?
> 
> Alfredo
> 
> Loading igb and pf_ring:
> 
> insmod ./pf_ring.ko transparent_mode=2 min_num_slots=65536 enable_tx_capture=0
> insmod ./igb.ko
> 
> Server info:
> 
> # cat /proc/cpuinfo | grep "model name\|cpu cores" | head -2
> model name      : Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz
> cpu cores           : 6
> 
> # uname -r
> 3.2.0-67-generic
> 
> PF_RING info:
> 
> # cat /proc/net/pf_ring/info
> PF_RING Version           : 6.0.2 ($Revision: 8089$)
> Total rings                   : 4
> 
> Standard (non DNA) Options
> Ring slots                      : 65536
> Slot version                   : 16
> Capture TX                    : No [RX only]
> IP Defragment               : No
> Socket Mode                  : Standard
> Transparent mode          : No [mode 2]
> Total plugins                  : 0
> Cluster Fragment Queue  : 0
> Cluster Fragment Discard : 0
> 
> Driver info:
> 
> # ethtool -i eth2
> driver: igb
> version: 5.2.5
> firmware-version: 1.67, 0x80000ba6, 15.0.28
> bus-info: 0000:08:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> 
> Looking at dmesg for the driver suggests that there are 8 RX queues and 8 TX 
> queues:
> 
> # grep "igb" /var/log/dmesg
> [    3.167209] igb 0000:08:00.0: Intel(R) Gigabit Ethernet Network Connection
> [    3.167214] igb 0000:08:00.0: eth2: (PCIe:5.0Gb/s:Width x4) <MAC-ADDRESS>
> [    3.167511] igb 0000:08:00.0: eth2: PBA No: G13158-000
> [    3.167513] igb 0000:08:00.0: Using MSI-X interrupts. 8 rx queue(s), 8 tx 
> queue(s)
> 
> However, listing the associated queues does not show that they are actually 
> being assigned/used:
> 
> # ls -l /sys/class/net/eth2/queues/
> total 0
> drwxr-xr-x 2 root root 0 Aug 28 08:56 rx-0
> drwxr-xr-x 2 root root 0 Aug 28 08:56 tx-0
> 
> And:
> 
> # cat /proc/interrupts | grep "eth2"
>             CPU0  CPU1  CPU2  CPU3  CPU4  CPU5  CPU6  CPU7  CPU8  CPU9  CPU10  CPU11
>  81:           1     0     0     0     0     0     0     0     0     0      0      0  IR-PCI-MSI-edge  eth2
>  82:  1857321564     0     0     0     0     0     0     0     0     0      0      0  IR-PCI-MSI-edge  eth2-TxRx-0
> 
> If I am reading this correctly, then there is only one queue and it is being 
> handled by CPU0 only and the other cores are just dormant. What is also 
> confusing me is:
> 
> # cat /proc/irq/82/smp_affinity
> 0000,00000fff
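The mask is a hexadecimal bitmap of CPUs, one bit per core: bit n set means CPU n may service the IRQ, so 0xfff covers CPU0..CPU11. A quick way to check the arithmetic in any POSIX shell (the IRQ number 82 and device path are the ones from this thread):

```shell
# Mask covering CPU0..CPU11: twelve low bits set -> fff
printf '%x\n' $(( (1 << 12) - 1 ))

# Mask selecting only CPU3: bit 3 set -> 8
printf '%x\n' $(( 1 << 3 ))

# To pin IRQ 82 to CPU3 only (needs root):
#   echo 8 > /proc/irq/82/smp_affinity
```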
> 
> Again, if I am reading this correctly, this means that IRQ 82 may (should) 
> be handled by any of CPU0..CPU11; however, this is not evident in the 
> interrupt table above. Looking at the indirection table also gives nothing:
> 
> # ethtool -x eth2
> RX flow hash indirection table for eth2 with 1 RX ring(s):
>     0:      0     0     0     0     0     0     0     0
>     8:      0     0     0     0     0     0     0     0
>    16:      0     0     0     0     0     0     0     0
>    24:      0     0     0     0     0     0     0     0
>    32:      0     0     0     0     0     0     0     0
>    40:      0     0     0     0     0     0     0     0
>    48:      0     0     0     0     0     0     0     0
>    56:      0     0     0     0     0     0     0     0
>    64:      0     0     0     0     0     0     0     0
>    72:      0     0     0     0     0     0     0     0
>    80:      0     0     0     0     0     0     0     0
>    88:      0     0     0     0     0     0     0     0
>    96:      0     0     0     0     0     0     0     0
>   104:      0     0     0     0     0     0     0     0
>   112:      0     0     0     0     0     0     0     0
>   120:      0     0     0     0     0     0     0     0
> 
> And finally:
> 
> # ethtool -S eth2
> NIC statistics:
>      rx_packets: 2984011999
>      tx_packets: 3
>      rx_bytes: 2465341976853
>      tx_bytes: 222
>      rx_broadcast: 0
>      tx_broadcast: 0
>      rx_multicast: 0
>      tx_multicast: 3
>      multicast: 0
>      collisions: 0
>      rx_crc_errors: 0
>      rx_no_buffer_count: 0
>      rx_missed_errors: 0
>      tx_aborted_errors: 0
>      tx_carrier_errors: 0
>      tx_window_errors: 0
>      tx_abort_late_coll: 0
>      tx_deferred_ok: 0
>      tx_single_coll_ok: 0
>      tx_multi_coll_ok: 0
>      tx_timeout_count: 0
>      rx_long_length_errors: 0
>      rx_short_length_errors: 0
>      rx_align_errors: 0
>      tx_tcp_seg_good: 0
>      tx_tcp_seg_failed: 0
>      rx_flow_control_xon: 0
>      rx_flow_control_xoff: 0
>      tx_flow_control_xon: 0
>      tx_flow_control_xoff: 0
>      rx_long_byte_count: 2465341976853
>      tx_dma_out_of_sync: 0
>      lro_aggregated: 0
>      lro_flushed: 0
>      tx_smbus: 0
>      rx_smbus: 0
>      dropped_smbus: 0
>      os2bmc_rx_by_bmc: 0
>      os2bmc_tx_by_bmc: 0
>      os2bmc_tx_by_host: 0
>      os2bmc_rx_by_host: 0
>      rx_errors: 0
>      tx_errors: 0
>      tx_dropped: 0
>      rx_length_errors: 0
>      rx_over_errors: 0
>      rx_frame_errors: 0
>      rx_fifo_errors: 0
>      tx_fifo_errors: 0
>      tx_heartbeat_errors: 0
>      tx_queue_0_packets: 3
>      tx_queue_0_bytes: 210
>      tx_queue_0_restart: 0
>      rx_queue_0_packets: 2984012000
>      rx_queue_0_bytes: 2453405930211
>      rx_queue_0_drops: 0
>      rx_queue_0_csum_err: 0
>      rx_queue_0_alloc_failed: 0
> 
> Again, I appreciate any help. Thanks. (Sorry if this is a duplicate).
> YM
> _______________________________________________
> Ntop-misc mailing list
> [email protected]
> http://listgateway.unipi.it/mailman/listinfo/ntop-misc
> 

