On 6/6/24 22:41, Ilya Maximets wrote:
> On 6/6/24 20:59, Sri kor via discuss wrote:
>> Hi Team,
>>
>>     Currently we are facing "ERR|group-table: out of table ids". We are
>> running OVN 23.09 version and OVS 3.2.2.  From the retis trace, the packet
>> appears to be dropped shortly after the upcall is generated. The exact
>> reason for the drop isn't specified, but it indicates that the packet is not
>> forwarded further within the OVS kernel datapath at this point. 
>>
>> As per https://issues.redhat.com/browse/FDP-70 , this issue was fixed in OVN 
>> 23.09.
>>
>>
>> Jun 05 21:21:05 vaeq-cu1a-r207-prod-hv-03.vaeq-cu.infra.cx systemd[1]: 
>> Started OVN controller daemon.
>> Jun 05 21:21:05 vaeq-cu1a-r207-prod-hv-03.vaeq-cu.infra.cx 
>> ovn-controller[1253156]: ovs|00023|extend_table|ERR|table group-table: out 
>> of table ids.
>>
>>
>>
>> [root@cloud-user]#
>>
>>
>> #retis sort /tmp/playground-test1.json
>>
>>
>> 1306445529784444 (102) [swapper/102] 0 [tp] openvswitch:ovs_dp_upcall 
>> #4a4348db8a07cff2a94af7da68c00 (skb ff2a93b2b82afd00) n 0
>>   if 4 (enp148s0f0np0) rxif 4 91.107.186.166.55805 > 204.52.24.59.22 ttl 235 
>> tos 0x0 id 54321 off 0 len 40 proto TCP (6) flags [S] seq 3425966139 win 
>> 65535
>>   upcall (miss) port 2774067634 cpu 102
>>   + 1306445529798391 (102) [swapper/102] 0 [tp] skb:kfree_skb 
>> #4a4348db8a07cff2a94af7da68c00 (skb ff2a93b2b82afd00) n 1 drop (reason 
>> NOT_SPECIFIED)
>>     if 4 (enp148s0f0np0) rxif 4
>>   + 1306445529802935 (102) [swapper/102] 0 [kr] queue_userspace_packet 
>> #4a4348db8a07cff2a94af7da68c00 (skb ff2a93b2b82afd00) n 2
>>     if 4 (enp148s0f0np0) rxif 4 91.107.186.166.55805 > 204.52.24.59.22 ttl 
>> 235 tos 0x0 id 54321 off 0 len 40 proto TCP (6) flags [S] seq 3425966139 win 
>> 65535
>>     upcall_enqueue (miss) (102/1306445529784444) q 1636019689 ret 0
>>   + 1306445529807829 (102) [swapper/102] 0 [kr] ovs_dp_upcall 
>> #4a4348db8a07cff2a94af7da68c00 (skb ff2a93b2b82afd00) n 3
>>     if 4 (enp148s0f0np0) rxif 4 91.107.186.166.55805 > 204.52.24.59.22 ttl 
>> 235 tos 0x0 id 54321 off 0 len 40 proto TCP (6) flags [S] seq 3425966139 win 
>> 65535
>>     upcall_ret (102/1306445529784444) ret 0
>>
>>
>> [root@cloud-user]# ovs-vsctl  --version
>> ovs-vsctl (Open vSwitch) *3.2.2*
>> DB Schema 8.4.0
>>
>> [root@vcloud-user]# ovn-controller --version
>> ovn-controller *23.09.1*
> 
> Are you building this package yourself?  If so, on which exact commit is it 
> based?
> If not, what distribution are you using and what is the exact rpm/deb package 
> version?
> 
> My suspicion is that it is not exactly v23.09.1, but code from a few commits 
> earlier than that.  In this case, it may not include the fix.
> 

I agree, the versions listed above look a bit off.  If I run OVN
v23.09.1 in a sandbox I get:

$ ovn-controller --version
ovn-controller 23.09.1
Open vSwitch Library 3.3.90   <<< this differs from 3.2.2 listed above
OpenFlow versions 0x6:0x6
SB DB Schema 20.29.
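
For comparing several hosts, this check could be scripted.  What follows is
only a minimal sketch, assuming the output format shown above; the helper
name parse_version_output is made up for illustration and is not part of
OVN or OVS.

```python
import re

def parse_version_output(text):
    """Extract (ovn_version, ovs_library_version) from the text printed by
    `ovn-controller --version`.  An element is None if its line is absent.
    Assumes the line layout shown in this thread."""
    ovn = re.search(r"^ovn-controller\s+(\S+)", text, re.MULTILINE)
    ovs = re.search(r"^Open vSwitch Library\s+(\S+)", text, re.MULTILINE)
    return (ovn.group(1) if ovn else None,
            ovs.group(1) if ovs else None)

# Sandbox output quoted above, used here as sample input.
SAMPLE = """\
ovn-controller 23.09.1
Open vSwitch Library 3.3.90
OpenFlow versions 0x6:0x6
SB DB Schema 20.29.
"""

if __name__ == "__main__":
    print(parse_version_output(SAMPLE))
```

Feeding it the full `ovn-controller --version` output from the affected
host would show which OVS library that binary was built against.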

Checking when we bumped the OVS submodule from 3.2.2 to the tip (at that
moment) of 3.3, it was:
1fa7628db415 ("ovs: Bump submodule to include E721 fixes.")

The log between that version and the actual v23.09.1 release is:
$ git log --oneline 1fa7628db415..v23.09.1
0afd4e59e9 (HEAD, tag: v23.09.1) Set release date for 23.09.1.
<snip>
e9e716ad53 controller: Don't artificially limit group and meter IDs to 16bit.
<snip>
627955eb79 ci: Pin Python, Fedora and Ubuntu runner versions.

What we need is actually:

commit e9e716ad531e34766d2f02783ac08955096bf636
Author: Dumitru Ceara <dce...@redhat.com>
Date:   Tue Oct 31 18:00:44 2023 +0100

    controller: Don't artificially limit group and meter IDs to 16bit.

There were a few follow-up fixes for it, though:
acc63727d14f ("controller: fix group_table and meter_table allocation")
c0c9e5074704 ("features.c: Always wait on the rconn.")
40b670e6ee94 ("ovn-controller: Fix busy loop when ofctrl is disconnected.")

So I guess the recommendation would be to use the most recent v23.09 release,
that is: v23.09.4
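
If you build from source, a rough way to check a tree against the commits
above is to scan its `git log --oneline` output.  This is just a sketch: the
function name missing_fixes is hypothetical, and because backports can carry
different hashes on a stable branch, it also accepts a match on the subject
line.

```python
import re

# Fix and follow-ups named in this thread (hashes and subjects as quoted).
REQUIRED = {
    "e9e716ad531e": "controller: Don't artificially limit group and meter IDs to 16bit.",
    "acc63727d14f": "controller: fix group_table and meter_table allocation",
    "c0c9e5074704": "features.c: Always wait on the rconn.",
    "40b670e6ee94": "ovn-controller: Fix busy loop when ofctrl is disconnected.",
}

def missing_fixes(oneline_log):
    """Return the REQUIRED hashes found neither by hash nor by subject
    in the given `git log --oneline` text."""
    lines = [l.strip() for l in oneline_log.splitlines() if l.strip()]
    # Keep only leading tokens that look like abbreviated commit hashes.
    hashes = [l.split()[0] for l in lines
              if re.match(r"[0-9a-f]{7,40}(\s|$)", l)]
    missing = []
    for full, subject in REQUIRED.items():
        # Abbreviation length varies, so compare prefixes in both directions.
        by_hash = any(full.startswith(h) or h.startswith(full) for h in hashes)
        by_subject = any(subject in l for l in lines)
        if not (by_hash or by_subject):
            missing.append(full)
    return missing
```

Run it on the complete log of the commit the package was built from, for
example the output of `git log --oneline <your-commit>`.  The excerpt quoted
earlier has lines snipped, so a result computed from it alone says nothing
about the actual release.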

Regards,
Dumitru

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss