On 09/19/2016 10:28 AM, Jeremias Blendin wrote:
Hello Mauricio,

thank you for the pointer on the binary-to-hex conversion; I knew it
looked strange, but I could not see why :)
In any case, it was just an example, the actual configuration for
testing is correct.


2016-09-18 18:25 GMT+02:00 Mauricio Vasquez <mauricio.vasq...@polito.it>:
Hello Jeremias,


On 09/18/2016 05:46 PM, Jeremias Blendin wrote:
Hi,

I set pmd-cpu-mask on a server whose processors run in cluster-on-die
(COD) mode. This means that the actual CPU topology is presented to the
OS as shown below.

The problem I have is that although OVS is allowed to use all available
CPUs:

$ ovs-vsctl --no-wait get Open_vSwitch . other_config:pmd-cpu-mask
"0x1787586c4fa8a01c71bc"
(=python hex(int('111111111111111111111111')))
The CPU mask is a hexadecimal bitmap of the cores, so in your case, if you
want OVS to have access to all cores, it should be 0xFFFFF... ->
python: hex(int('11111...', 2))
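For example (a sketch, assuming all 24 logical CPUs of this machine are meant
to be in the mask; in practice the CPUs reserved for the OS would be left out):

$ python -c "print(hex(int('1' * 24, 2)))"
0xffffff
$ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xffffff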


OVS only uses the CPUs located on the first NUMA node (the one where the
OS is running):
$ ovs-appctl dpif-netdev/pmd-stats-show | grep ^pmd
pmd thread numa_id 0 core_id 2:
pmd thread numa_id 0 core_id 12:
pmd thread numa_id 0 core_id 13:
pmd thread numa_id 0 core_id 14:

I restarted OVS multiple times and tried to pin queues to specific cores:

ovs-vsctl set interface dpdk0 other_config:pmd-rxq-affinity="0:2,1:12"
ovs-vsctl set interface dpdk1 other_config:pmd-rxq-affinity="0:13,1:14"
ovs-vsctl set interface vif3 other_config:pmd-rxq-affinity="0:3,1:3"
ovs-vsctl set interface vif4 other_config:pmd-rxq-affinity="0:4,1:4"

but with the same result: cores on other NUMA nodes are not used:

/usr/local/var/log/openvswitch/ovs-vswitchd.log
2016-09-18T15:25:04.327Z|00080|dpif_netdev|INFO|Created 4 pmd threads
on numa node 0
2016-09-18T15:25:04.327Z|00081|dpif_netdev|WARN|There is no PMD thread
on core 3. Queue 0 on port 'vif3' will not be polled.
2016-09-18T15:25:04.327Z|00082|dpif_netdev|WARN|There is no PMD thread
on core 4. Queue 0 on port 'vif4' will not be polled.
2016-09-18T15:25:04.327Z|00083|dpif_netdev|WARN|There's no available
pmd thread on numa node 0
2016-09-18T15:25:04.327Z|00084|dpif_netdev|WARN|There's no available
pmd thread on numa node 0
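As a cross-check, OVS 2.6 can also show which pmd each rx queue was actually
assigned to (assuming this build includes the command):

$ ovs-appctl dpif-netdev/pmd-rxq-show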

The log output seems to indicate that, for some reason, only NUMA node 0 is
used. Can anyone confirm this?
OVS only creates pmd threads on NUMA nodes where there are ports. In the case
of physical ports, the NUMA node is determined by where the ports are attached
to the server; in the case of dpdkvhostuser ports, it is determined by where
the memory of the virtio device is allocated.
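As a quick check for the physical ports, the NUMA node of a NIC can usually be
read from sysfs, given its PCI address (the address below is only a placeholder):

$ cat /sys/bus/pci/devices/0000:03:00.0/numa_node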
That is an interesting point. I create the dpdkvhostuser ports with
Open vSwitch:
ovs-vsctl add-port br0 vif0 -- set Interface vif0 type=dpdkvhostuser
How can I define which memory it should use?
I don't know how it can be defined, but I found in the documentation [1]
that CONFIG_RTE_LIBRTE_VHOST_NUMA=y should be set so that the NUMA node of
vhostuser ports is detected automatically.
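A rough sketch of how that option is typically enabled with the make-based
DPDK build (the exact config file and target may differ between DPDK releases,
and libnuma development headers are required); OVS then needs to be rebuilt
against the resulting DPDK:

$ cd $DPDK_DIR
$ sed -i 's/CONFIG_RTE_LIBRTE_VHOST_NUMA=n/CONFIG_RTE_LIBRTE_VHOST_NUMA=y/' \
      config/common_linuxapp    # or config/common_base, depending on the release
$ make install T=x86_64-native-linuxapp-gcc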


Probably, in your case, the physical ports and the memory of the virtio
devices are on socket 0.
As COD is active, I have two NUMA nodes per socket. So yes, the VM and
OVS are located on socket 0, but on different NUMA nodes.
OVS has memory on all nodes (4G); the VM has memory only on NUMA node 1.
However, this NUMA node (1) is never used by OVS, although the VM is
located there. I guess I could fix this issue by deactivating COD, but
that has other drawbacks. Is there any way to directly tell OVS to run
pmds on a specific NUMA node?
No, PMDs on a NUMA node are only created if there are interfaces on that NUMA node.
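One way to influence where pmds are created for a vhostuser port is therefore
to place the VM's hugepage memory on the desired host NUMA node on the QEMU
side, e.g. with a bound memory backend. A minimal sketch (names, size, and
node number are placeholders):

$ qemu-system-x86_64 ... \
      -object memory-backend-file,id=mem0,size=4G,mem-path=/dev/hugepages,share=on,host-nodes=0,policy=bind \
      -numa node,memdev=mem0 \
      ...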


I understand that running pmds on a different socket might be an issue,
but it seems weird to me that pmds cannot run on a different NUMA node
on the same socket.

Thanks!

Jeremias

Regards,

Mauricio Vasquez
Best regards,

Jeremias


ovs-vsctl show
      Bridge "br0"
          Controller "tcp:<ctrl>:6633"
              is_connected: true
          Port "vif0"
              Interface "vif0"
                  type: dpdkvhostuser
                  options: {n_rxq="2"}
          Port "dpdk1"
              Interface "dpdk1"
                  type: dpdk
                  options: {n_rxq="2"}
          Port "vif3"
              Interface "vif3"
                  type: dpdkvhostuser
                  options: {n_rxq="2"}
          Port "dpdk0"
              Interface "dpdk0"
                  type: dpdk
                  options: {n_rxq="2"}
          Port "vif1"
              Interface "vif1"
                  type: dpdkvhostuser
                  options: {n_rxq="2"}
          Port "br0"
              Interface "br0"
                  type: internal
          Port "vif4"
              Interface "vif4"
                  type: dpdkvhostuser
                  options: {n_rxq="2"}
      ovs_version: "2.6.90"


OVS (last commit):
commit 75e2077e0c43224bcca92746b28b01a4936fc101
Author: Thadeu Lima de Souza Cascardo <casca...@redhat.com>
Date:   Fri Sep 16 15:52:48 2016 -0300


CPU topology:
lstopo -p

Machine (252GB total)
    Package P#0
      NUMANode P#0 (63GB) + L3 (15MB)
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#0
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#2 + PU P#1
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#4 + PU P#2
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#12
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#3 + PU P#13
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#5 + PU P#14
      NUMANode P#1 (63GB) + L3 (15MB)
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#8 + PU P#3
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#10 + PU P#4
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#12 + PU P#5
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#9 + PU P#15
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#11 + PU P#16
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#13 + PU P#17
    Package P#1
      NUMANode P#2 (63GB) + L3 (15MB)
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#6
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#2 + PU P#7
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#4 + PU P#8
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#18
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#3 + PU P#19
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#5 + PU P#20
      NUMANode P#3 (63GB) + L3 (15MB)
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#8 + PU P#9
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#10 + PU P#10
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#12 + PU P#11
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#9 + PU P#21
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#11 + PU P#22
        L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#13 + PU P#23

Other info:

$ uname -a
Linux nfvi1 4.4.0-34-lowlatency #53-Ubuntu SMP PREEMPT Wed Jul 27
19:23:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.4.0-34-lowlatency root=UUID=<whatever> ro
default_hugepagesz=1GB hugepagesz=1G hugepages=100 isolcpus=2-23
nohz_full=2-23 rcu_nocbs=2-23 apparmor=0

[1] https://github.com/openvswitch/ovs/blob/master/INSTALL.DPDK-ADVANCED.md#36-numacluster-on-die
