Hi Ales,

On Wed, May 21, 2025 at 6:07 AM Ales Musil <[email protected]> wrote:
>
>
>
> On Tue, May 20, 2025 at 8:06 PM Tiago Pires via discuss 
> <[email protected]> wrote:
>>
>> Hi All,
>
>
> Hi Tiago,
>
>>
>> In a cluster running OVN 24.03.5, we are observing that on a few chassis
>> that work as dedicated OVN Interconnection Gateways, the ovn-controller
>> process is running at almost 100% CPU usage:
>>
>> 2025-05-20T16:58:39.546Z|689641|poll_loop|INFO|wakeup due to [POLLIN]
>> on fd 32 (FIFO pipe:[1813314324]) at controller/pinctrl.c:4173 (95%
>> CPU usage)
>> 2025-05-20T16:58:45.488Z|689642|poll_loop|INFO|Dropped 48 log messages
>> in last 6 seconds (most recently, 1 seconds ago) due to excessive rate
>> 2025-05-20T16:58:45.488Z|689643|poll_loop|INFO|wakeup due to [POLLIN]
>> on fd 32 (FIFO pipe:[1813314324]) at controller/pinctrl.c:4173 (92%
>> CPU usage)
>> 2025-05-20T16:58:51.553Z|689644|poll_loop|INFO|Dropped 47 log messages
>> in last 6 seconds (most recently, 0 seconds ago) due to excessive rate
>> 2025-05-20T16:58:51.553Z|689645|poll_loop|INFO|wakeup due to [POLLIN]
>> on fd 32 (FIFO pipe:[1813314324]) at controller/pinctrl.c:4173 (98%
>> CPU usage)
>> 2025-05-20T16:58:57.514Z|689646|poll_loop|INFO|Dropped 50 log messages
>> in last 6 seconds (most recently, 1 seconds ago) due to excessive rate
>> 2025-05-20T16:58:57.514Z|689647|poll_loop|INFO|wakeup due to [POLLIN]
>> on fd 32 (FIFO pipe:[1813314324]) at controller/pinctrl.c:4173 (95%
>> CPU usage)
>> 2025-05-20T16:59:03.558Z|689648|poll_loop|INFO|Dropped 49 log messages
>> in last 6 seconds (most recently, 0 seconds ago) due to excessive rate
>>
>> Checking what ovn-controller is doing in debug mode, we can see a lot
>> of ARP packets like the ones below:
>>
>> 2025-05-20T17:10:21.149Z|00004|pinctrl(ovn_pinctrl0)|DBG|pinctrl
>> received  packet-in | opcode=ARP| OF_Table_ID=0|
>> OF_Cookie_ID=0x1367fe68| in-port=48| src-mac=fa:16:3e:1b:2b:77,
>> dst-mac=00:00:00:00:00:00| src-ip=10.XX.6.X31, dst-ip=172.XX.X.2XX
>> 2025-05-20T17:10:21.149Z|00005|pinctrl(ovn_pinctrl0)|DBG|pinctrl
>> received  packet-in | opcode=ARP| OF_Table_ID=0|
>> OF_Cookie_ID=0x1367fe68| in-port=48| src-mac=fa:16:3e:1b:2b:77,
>> dst-mac=00:00:00:00:00:00| src-ip=10.XX1.6.XX1, dst-ip=172.XX.X.XX4
>> 2025-05-20T17:10:21.271Z|00006|pinctrl(ovn_pinctrl0)|DBG|pinctrl
>> received  packet-in | opcode=ARP| OF_Table_ID=0|
>> OF_Cookie_ID=0x1367fe68| in-port=13| src-mac=fa:16:3e:1b:2b:77,
>> dst-mac=00:00:00:00:00:00| src-ip=10.XX1.X.X23, dst-ip=172.X6.X.2X3
>> 2025-05-20T17:10:21.271Z|00007|pinctrl(ovn_pinctrl0)|DBG|pinctrl
>> received  packet-in | opcode=ARP| OF_Table_ID=0|
>> OF_Cookie_ID=0x1367fe68| in-port=13| src-mac=fa:16:3e:1b:2b:77,
>> dst-mac=00:00:00:00:00:00| src-ip=10.XX1.X.X23, dst-ip=172.X6.X.X41
>> 2025-05-20T17:10:21.271Z|00008|pinctrl(ovn_pinctrl0)|DBG|pinctrl
>> received  packet-in | opcode=ARP| OF_Table_ID=0|
>> OF_Cookie_ID=0x60199dbd| in-port=338| src-mac=fa:16:3e:a7:a2:37,
>> dst-mac=00:00:00:00:00:00| src-ip=172.XX.X2.X30, dst-ip=172.XX.X.X09
>> 2025-05-20T17:10:21.271Z|00009|pinctrl(ovn_pinctrl0)|DBG|pinctrl
>> received  packet-in | opcode=ARP| OF_Table_ID=0|
>> OF_Cookie_ID=0x1367fe68| in-port=131| src-mac=fa:16:3e:1b:2b:77,
>> dst-mac=00:00:00:00:00:00| src-ip=10.XXX.X.X4, dst-ip=172.XX.X.X19
>> 2025-05-20T17:10:21.272Z|00010|pinctrl(ovn_pinctrl0)|DBG|pinctrl
>> received  packet-in | opcode=ARP| OF_Table_ID=0|
>> OF_Cookie_ID=0x1367fe68| in-port=13| src-mac=fa:16:3e:1b:2b:77,
>> dst-mac=00:00:00:00:00:00| src-ip=10.XX1.X.X23, dst-ip=172.XX.X.X98
>> 2025-05-20T17:10:21.277Z|00011|pinctrl(ovn_pinctrl0)|DBG|pinctrl
>> received  packet-in | opcode=ARP| OF_Table_ID=0|
>> OF_Cookie_ID=0x1367fe68| in-port=48| src-mac=fa:16:3e:1b:2b:77,
>> dst-mac=00:00:00:00:00:00| src-ip=10.XX1.X.1X1, dst-ip=172.XX.X.X05
>> 2025-05-20T17:10:21.388Z|00012|pinctrl(ovn_pinctrl0)|DBG|pinctrl
>> received  packet-in | opcode=ARP| OF_Table_ID=0|
>> OF_Cookie_ID=0x1367fe68| in-port=13| src-mac=fa:16:3e:1b:2b:77,
>> dst-mac=00:00:00:00:00:00| src-ip=10.XX.X.X23, dst-ip=172.XX.X.2X2
>
>
> I can see that almost all of those packets have an identical src MAC, and
> there are a lot of duplicate src IPs AFAICT. I suspect that this might be
> related to a problem that we saw with multicast split flooding
> ovn-controller with GARPs [0].
>
> Could you please help us to identify which flow the OF_Cookie_ID=0x1367fe68
> corresponds to?

These are the flows I found for cookie 0x1367fe68:

 cookie=0x1367fe68, duration=1924339.607s, table=40, n_packets=44064625, n_bytes=3260782250, idle_age=0, priority=100,reg15=0x869,metadata=0xff3c3e actions=set_field:0xa173->reg11,set_field:0xa1b0->reg12,resubmit(,41)
 cookie=0x1367fe68, duration=1924339.821s, table=41, n_packets=0, n_bytes=0, idle_age=65535, priority=100,reg10=0/0x1,reg14=0x869,reg15=0x869,metadata=0xff3c3e actions=drop
 cookie=0x1367fe68, duration=1924341.853s, table=64, n_packets=0, n_bytes=0, idle_age=65535, priority=100,reg10=0x1/0x1,reg15=0x869,metadata=0xff3c3e actions=push:NXM_OF_IN_PORT[],set_field:ANY->in_port,resubmit(,65),pop:NXM_OF_IN_PORT[]
 cookie=0x1367fe68, duration=1924341.709s, table=65, n_packets=44064661, n_bytes=3260784914, idle_age=0, priority=100,reg15=0x869,metadata=0xff3c3e actions=clone(ct_clear,set_field:0->reg11,set_field:0->reg12,set_field:0/0xffff->reg13,set_field:0xa34d->reg11,set_field:0x95d0->reg12,set_field:0x13ba->metadata,set_field:0x5->reg14,set_field:0->reg10,set_field:0->reg15,set_field:0->reg0,set_field:0->reg1,set_field:0->reg2,set_field:0->reg3,set_field:0->reg4,set_field:0->reg5,set_field:0->reg6,set_field:0->reg7,set_field:0->reg8,set_field:0->reg9,resubmit(,8))


I traced the MAC address and it belongs to an OVN router port. So the
traffic is egressing through this router port, and since the source is
in a remote subnet, the source MAC address in the packet header is
rewritten to that of the local router port.
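
To map these flows back to Southbound records, the hex match values can
be converted to decimal tunnel keys. Below is a minimal Python sketch,
assuming the usual OVN register conventions (metadata = logical datapath
tunnel key, reg14 = logical input port, reg15 = logical output port).
The helper name decode_match is only illustrative, it is not part of any
OVN or OVS tool.

```python
import re

# Assumed meaning of the match fields (usual OVN register conventions).
FIELD_MEANING = {
    "metadata": "logical datapath tunnel_key",
    "reg14": "logical input port tunnel_key",
    "reg15": "logical output port tunnel_key",
}


def decode_match(flow_text):
    """Return {field: decimal value} for metadata/reg14/reg15 in the match."""
    # Only inspect the match part, i.e. everything before "actions=".
    match_part = flow_text.split("actions=", 1)[0]
    pairs = re.findall(r"\b(metadata|reg14|reg15)=(0x[0-9a-fA-F]+)", match_part)
    return {field: int(value, 16) for field, value in pairs}


# The table=40 flow from this thread, as a single line.
flow = (
    "cookie=0x1367fe68, duration=1924339.607s, table=40, "
    "n_packets=44064625, n_bytes=3260782250, idle_age=0, "
    "priority=100,reg15=0x869,metadata=0xff3c3e "
    "actions=set_field:0xa173->reg11,set_field:0xa1b0->reg12,resubmit(,41)"
)

for field, value in decode_match(flow).items():
    print(f"{field}={value} ({FIELD_MEANING[field]})")
```

The decimal values can then be compared with the tunnel_key column of
the Southbound Datapath_Binding and Port_Binding tables, for example
with something like "ovn-sbctl find Datapath_Binding tunnel_key=16727102".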

>
>>
>> In my understanding, there are a lot of ARPs coming from different
>> OVN virtual networks, and they are making ovn-controller use more CPU
>> time. Shouldn't ovn-controller be able to handle these ARP packets
>> without using a lot of CPU time?
>
>
> I mean ovn-controller knows what to do with them, but the snippet has 9
> packets within about 240 ms, so you can overload the pinctrl thread by
> sheer volume alone.
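
To put numbers on the volume, the pinctrl debug log can be summarized
per cookie and source MAC. This is only a rough Python sketch: it
assumes one unwrapped DBG record per line in the format shown earlier in
this thread, and the names RECORD, SAMPLE and summarize are made up for
illustration.

```python
import re
from collections import Counter
from datetime import datetime

# One pinctrl DBG record per line (unwrapped), format as seen in this thread.
RECORD = re.compile(
    r"^(?P<ts>[^|]+)\|\d+\|pinctrl\([^)]*\)\|DBG\|pinctrl received\s+packet-in"
    r".*?OF_Cookie_ID=(?P<cookie>0x[0-9a-f]+)"
    r".*?src-mac=(?P<mac>[0-9a-f:]{17})"
)


def summarize(lines):
    """Count packet-ins per (cookie, src-mac); return (counts, packets/s)."""
    counts = Counter()
    stamps = []
    for line in lines:
        m = RECORD.search(line)
        if not m:
            continue  # not a pinctrl packet-in record
        counts[(m["cookie"], m["mac"])] += 1
        stamps.append(datetime.strptime(m["ts"], "%Y-%m-%dT%H:%M:%S.%fZ"))
    # Rate over the time covered by the matched records.
    span = (max(stamps) - min(stamps)).total_seconds() if len(stamps) > 1 else 0.0
    rate = len(stamps) / span if span > 0 else float("nan")
    return counts, rate


# Three of the records quoted in this thread, joined back into single lines.
SAMPLE = [
    "2025-05-20T17:10:21.149Z|00004|pinctrl(ovn_pinctrl0)|DBG|pinctrl "
    "received  packet-in | opcode=ARP| OF_Table_ID=0| "
    "OF_Cookie_ID=0x1367fe68| in-port=48| src-mac=fa:16:3e:1b:2b:77, "
    "dst-mac=00:00:00:00:00:00| src-ip=10.XX.6.X31, dst-ip=172.XX.X.2XX",
    "2025-05-20T17:10:21.271Z|00008|pinctrl(ovn_pinctrl0)|DBG|pinctrl "
    "received  packet-in | opcode=ARP| OF_Table_ID=0| "
    "OF_Cookie_ID=0x60199dbd| in-port=338| src-mac=fa:16:3e:a7:a2:37, "
    "dst-mac=00:00:00:00:00:00| src-ip=172.XX.X2.X30, dst-ip=172.XX.X.X09",
    "2025-05-20T17:10:21.388Z|00012|pinctrl(ovn_pinctrl0)|DBG|pinctrl "
    "received  packet-in | opcode=ARP| OF_Table_ID=0| "
    "OF_Cookie_ID=0x1367fe68| in-port=13| src-mac=fa:16:3e:1b:2b:77, "
    "dst-mac=00:00:00:00:00:00| src-ip=10.XX.X.X23, dst-ip=172.XX.X.2X2",
]

counts, rate = summarize(SAMPLE)
for (cookie, mac), n in counts.most_common():
    print(f"{cookie} {mac}: {n} packet-ins")
print(f"about {rate:.1f} packet-ins per second in this sample")
```

Run against a longer capture, the per-cookie counts should show which
flows are responsible for most of the packet-ins.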

>
>>
>> Regards,
>>
>> Tiago Pires
>>
>> --
>>
>>
>>
>>
>> _‘This message is intended only for the addresses listed in the
>> initial header. If you are not listed among the addresses in the
>> header, we ask that you completely disregard the content of this
>> message; copying, forwarding and/or carrying out the actions mentioned
>> in it are immediately void and prohibited’._
>>
>>
>> * **‘Although Magazine Luiza takes
>> all reasonable precautions to ensure that no virus is
>> present in this e-mail, the company cannot accept responsibility for
>> any loss or damage caused by this e-mail or its attachments’.*
>>
>>
>>
>> _______________________________________________
>> discuss mailing list
>> [email protected]
>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>
>
> [0] 
> https://mail.openvswitch.org/pipermail/ovs-discuss/2025-February/053455.html
>
> Regards,
> Ales

Regards,

Tiago Pires
