Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-06-29 Thread Marcelo Leitner
Hi,

On Tue, Jun 14, 2022 at 05:26:25PM +0300, Oz Shlomo wrote:
> Hi Ilya,
>
> On 6/14/2022 4:03 PM, Ilya Maximets wrote:
> > On 6/14/22 10:27, Oz Shlomo via dev wrote:
> > >
> > >
> > > On 6/8/2022 3:16 AM, Frode Nordahl wrote:
> > > > On Tue, Jun 7, 2022 at 12:16 AM Ilya Maximets  
> > > > wrote:
> > > > >
> > > > > On 5/31/22 23:48, Ilya Maximets wrote:
> > > > > > On 5/31/22 21:15, Frode Nordahl wrote:
> > > > > > > On Mon, May 30, 2022 at 5:25 PM Frode Nordahl
> > > > >
> > > > > 
> > > > >
> > > > > > > I've pushed the first part of the fix here:
> > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2022-May%2F394450.html&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=pq0CABjao2UWojg6yZut7RL%2FZEeuRou0qUVZKNYP3rQ%3D&reserved=0
> > > > > >
> > > > > > Thanks!  I saw that and I tend to think that it is correct.
> > > > > > I'll try to test it and apply in the next couple of days.
> > > > > >
> > > > > > One question about the test above: which entity actually adds
> > > > > > the ct_state to the packet or at which moment that happens?
> > > > > > I see it, but I'm not sure I fully understand that.  Looks
> > > > > > like I'm missing smething obvious.
> > > > >
> > > > > I found what is going on there - kernel simply tracks everything
> > > > > if not told to do so, so ICMP packets create the ct entry and
> > > > > subsequent packets re-use it, so icmp replies have +trk set while
> > > > > entering OVS.
> > > >
> > > > Great, my hunch was that something along these lines was happening as
> > > > well, I have to admit the test case was found by locating something
> > > > closest to the real life use case and it proved to work as a good test
> > > > for this condition.
> > > >
> > > > > 
> > > > >
> > > > > Let's have some summary of the issues discovered here so far,
> > > > > including a few new issues:
> > > > >
> > > > > 1. ct states set externally are not tracked correctly by OVS.
> > > > >      Fix: 
> > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2022-May%2F394450.html&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=pq0CABjao2UWojg6yZut7RL%2FZEeuRou0qUVZKNYP3rQ%3D&reserved=0
> > > > >      Status:  LGTM, will apply soon.
> > > > >      This fixes the problem originally reported by Liam, IIUC.  Right?
> > > >
> > > > Yes.
> > > >
> > > > > 2. Kernel ct() actions are trying to re-use the cached connection
> > > > >      after the tuple changes.
> > > > >      This ends up to be the OVN hairpin issue as reported here:
> > > > >    
> > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugs.launchpad.net%2Fubuntu%2F%2Bsource%2Fovn%2F%2Bbug%2F1967856&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=25B7VbtRFguupC7VoNjZK%2FWlasu%2BMSTUzJkszvEpDaQ%3D&reserved=0
> > > > >
> > > > >      Proposed Fix:
> > > > >    
> > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fnetdev%2F20220606221140.488984-1-i.maximets%40ovn.org%2FT%2F%23u&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=areRLYsAEbare7yo%2FxmIF9k2tMw2v8ZQkwHcR%2FEvV%2Bo%3D&reserved=0
> > > > >
> > > > >      Status: Needs review.
> > > >
> > > > I can confirm that the proposed fix resolves the OVN hairpin issue. It
> > > > also looks simple enough to be backportable all the way to where we
> > > > would need it (kernel 5.4.0). I'll have a look at giving this wider
> > > > exposure in an internal CI environment as a canary for any unintended
> > > > consequences if that would be helpful.
> > > >
> > >
> > > Sorry for jumping in late on this, as this patch was already accepted to 
> > > the kernel.
> > > However, I have a concern that this patch would break the tc datapath 
> > > when ovs hw offload is enabled.
> > >
> > > IIUC then this patch adds an implicit ovs_ct_clear call for 5-tuple 
> > > modify actions. However, this implicit action will not apply to flows 
> > > that use the tc datapath.
> > >
> > > Forde, can you verify that indeed this fix breaks the OVN hairpin use 
> > > case when hw offload is enabled.
> > Hi, Oz.  I don't think that this kernel fix 

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-06-16 Thread Dumitru Ceara
On 6/14/22 22:24, Ilya Maximets wrote:
> On 6/14/22 16:26, Oz Shlomo wrote:
>> Hi Ilya,
>>
>> On 6/14/2022 4:03 PM, Ilya Maximets wrote:
>>> On 6/14/22 10:27, Oz Shlomo via dev wrote:


 On 6/8/2022 3:16 AM, Frode Nordahl wrote:
> On Tue, Jun 7, 2022 at 12:16 AM Ilya Maximets  wrote:
>>
>> On 5/31/22 23:48, Ilya Maximets wrote:
>>> On 5/31/22 21:15, Frode Nordahl wrote:
 On Mon, May 30, 2022 at 5:25 PM Frode Nordahl
>>
>> 
>>
 I've pushed the first part of the fix here:
 https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2022-May%2F394450.html&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=pq0CABjao2UWojg6yZut7RL%2FZEeuRou0qUVZKNYP3rQ%3D&reserved=0
>>>
>>> Thanks!  I saw that and I tend to think that it is correct.
>>> I'll try to test it and apply in the next couple of days.
>>>
>>> One question about the test above: which entity actually adds
>>> the ct_state to the packet or at which moment that happens?
>>> I see it, but I'm not sure I fully understand that.  Looks
>>> like I'm missing smething obvious.
>>
>> I found what is going on there - kernel simply tracks everything
>> if not told to do so, so ICMP packets create the ct entry and
>> subsequent packets re-use it, so icmp replies have +trk set while
>> entering OVS.
>
> Great, my hunch was that something along these lines was happening as
> well, I have to admit the test case was found by locating something
> closest to the real life use case and it proved to work as a good test
> for this condition.
>
>> 
>>
>> Let's have some summary of the issues discovered here so far,
>> including a few new issues:
>>
>> 1. ct states set externally are not tracked correctly by OVS.
>>  Fix: 
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2022-May%2F394450.html&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=pq0CABjao2UWojg6yZut7RL%2FZEeuRou0qUVZKNYP3rQ%3D&reserved=0
>>  Status:  LGTM, will apply soon.
>>  This fixes the problem originally reported by Liam, IIUC.  Right?
>
> Yes.
>
>> 2. Kernel ct() actions are trying to re-use the cached connection
>>  after the tuple changes.
>>  This ends up to be the OVN hairpin issue as reported here:
>>    
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugs.launchpad.net%2Fubuntu%2F%2Bsource%2Fovn%2F%2Bbug%2F1967856&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=25B7VbtRFguupC7VoNjZK%2FWlasu%2BMSTUzJkszvEpDaQ%3D&reserved=0
>>
>>  Proposed Fix:
>>    
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fnetdev%2F20220606221140.488984-1-i.maximets%40ovn.org%2FT%2F%23u&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=areRLYsAEbare7yo%2FxmIF9k2tMw2v8ZQkwHcR%2FEvV%2Bo%3D&reserved=0
>>
>>  Status: Needs review.
>
> I can confirm that the proposed fix resolves the OVN hairpin issue. It
> also looks simple enough to be backportable all the way to where we
> would need it (kernel 5.4.0). I'll have a look at giving this wider
> exposure in an internal CI environment as a canary for any unintended
> consequences if that would be helpful.
>

 Sorry for jumping in late on this, as this patch was already accepted to 
 the kernel.
 However, I have a concern that this patch would break the tc datapath when 
 ovs hw offload is enabled.

 IIUC then this patch adds an implicit ovs_ct_clear call for 5-tuple modify 
 actions. However, this implicit action will not apply to flows that use 
 the tc datapath.

 Forde, can you verify that indeed this fix breaks the OVN hairpin use case 
 when hw offload is enabled.
>>> Hi, Oz.  I don't think that this kernel fix breaks the TC datapath
>>> as the packets processed by the openvswitch kernel module will not
>>> go back to TC for further processing, IIUC.  Also, it's not a full
>>> ct_clear, because 

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-06-14 Thread Ilya Maximets
On 6/14/22 16:26, Oz Shlomo wrote:
> Hi Ilya,
> 
> On 6/14/2022 4:03 PM, Ilya Maximets wrote:
>> On 6/14/22 10:27, Oz Shlomo via dev wrote:
>>>
>>>
>>> On 6/8/2022 3:16 AM, Frode Nordahl wrote:
 On Tue, Jun 7, 2022 at 12:16 AM Ilya Maximets  wrote:
>
> On 5/31/22 23:48, Ilya Maximets wrote:
>> On 5/31/22 21:15, Frode Nordahl wrote:
>>> On Mon, May 30, 2022 at 5:25 PM Frode Nordahl
>
> 
>
>>> I've pushed the first part of the fix here:
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2022-May%2F394450.html&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=pq0CABjao2UWojg6yZut7RL%2FZEeuRou0qUVZKNYP3rQ%3D&reserved=0
>>
>> Thanks!  I saw that and I tend to think that it is correct.
>> I'll try to test it and apply in the next couple of days.
>>
>> One question about the test above: which entity actually adds
>> the ct_state to the packet or at which moment that happens?
>> I see it, but I'm not sure I fully understand that.  Looks
>> like I'm missing smething obvious.
>
> I found what is going on there - kernel simply tracks everything
> if not told to do so, so ICMP packets create the ct entry and
> subsequent packets re-use it, so icmp replies have +trk set while
> entering OVS.

 Great, my hunch was that something along these lines was happening as
 well, I have to admit the test case was found by locating something
 closest to the real life use case and it proved to work as a good test
 for this condition.

> 
>
> Let's have some summary of the issues discovered here so far,
> including a few new issues:
>
> 1. ct states set externally are not tracked correctly by OVS.
>  Fix: 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2022-May%2F394450.html&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=pq0CABjao2UWojg6yZut7RL%2FZEeuRou0qUVZKNYP3rQ%3D&reserved=0
>  Status:  LGTM, will apply soon.
>  This fixes the problem originally reported by Liam, IIUC.  Right?

 Yes.

> 2. Kernel ct() actions are trying to re-use the cached connection
>  after the tuple changes.
>  This ends up to be the OVN hairpin issue as reported here:
>    
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugs.launchpad.net%2Fubuntu%2F%2Bsource%2Fovn%2F%2Bbug%2F1967856&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=25B7VbtRFguupC7VoNjZK%2FWlasu%2BMSTUzJkszvEpDaQ%3D&reserved=0
>
>  Proposed Fix:
>    
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fnetdev%2F20220606221140.488984-1-i.maximets%40ovn.org%2FT%2F%23u&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=areRLYsAEbare7yo%2FxmIF9k2tMw2v8ZQkwHcR%2FEvV%2Bo%3D&reserved=0
>
>  Status: Needs review.

 I can confirm that the proposed fix resolves the OVN hairpin issue. It
 also looks simple enough to be backportable all the way to where we
 would need it (kernel 5.4.0). I'll have a look at giving this wider
 exposure in an internal CI environment as a canary for any unintended
 consequences if that would be helpful.

>>>
>>> Sorry for jumping in late on this, as this patch was already accepted to 
>>> the kernel.
>>> However, I have a concern that this patch would break the tc datapath when 
>>> ovs hw offload is enabled.
>>>
>>> IIUC then this patch adds an implicit ovs_ct_clear call for 5-tuple modify 
>>> actions. However, this implicit action will not apply to flows that use the 
>>> tc datapath.
>>>
>>> Forde, can you verify that indeed this fix breaks the OVN hairpin use case 
>>> when hw offload is enabled.
>> Hi, Oz.  I don't think that this kernel fix breaks the TC datapath
>> as the packets processed by the openvswitch kernel module will not
>> go back to TC for further processing, IIUC.  Also, it's not a full
>> ct_clear, because we're not clearing the flow key.
> 
> A flow datapath is either in tc or in ovs.
> If hardware offload is enabled then ovs 

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-06-14 Thread Oz Shlomo via dev

Hi Ilya,

On 6/14/2022 4:03 PM, Ilya Maximets wrote:

On 6/14/22 10:27, Oz Shlomo via dev wrote:



On 6/8/2022 3:16 AM, Frode Nordahl wrote:

On Tue, Jun 7, 2022 at 12:16 AM Ilya Maximets  wrote:


On 5/31/22 23:48, Ilya Maximets wrote:

On 5/31/22 21:15, Frode Nordahl wrote:

On Mon, May 30, 2022 at 5:25 PM Frode Nordahl





I've pushed the first part of the fix here:
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2022-May%2F394450.html&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=pq0CABjao2UWojg6yZut7RL%2FZEeuRou0qUVZKNYP3rQ%3D&reserved=0


Thanks!  I saw that and I tend to think that it is correct.
I'll try to test it and apply in the next couple of days.

One question about the test above: which entity actually adds
the ct_state to the packet or at which moment that happens?
I see it, but I'm not sure I fully understand that.  Looks
like I'm missing smething obvious.


I found what is going on there - kernel simply tracks everything
if not told to do so, so ICMP packets create the ct entry and
subsequent packets re-use it, so icmp replies have +trk set while
entering OVS.


Great, my hunch was that something along these lines was happening as
well, I have to admit the test case was found by locating something
closest to the real life use case and it proved to work as a good test
for this condition.




Let's have some summary of the issues discovered here so far,
including a few new issues:

1. ct states set externally are not tracked correctly by OVS.
     Fix: 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2022-May%2F394450.html&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=pq0CABjao2UWojg6yZut7RL%2FZEeuRou0qUVZKNYP3rQ%3D&reserved=0
     Status:  LGTM, will apply soon.
     This fixes the problem originally reported by Liam, IIUC.  Right?


Yes.


2. Kernel ct() actions are trying to re-use the cached connection
     after the tuple changes.
     This ends up to be the OVN hairpin issue as reported here:
   
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugs.launchpad.net%2Fubuntu%2F%2Bsource%2Fovn%2F%2Bbug%2F1967856&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=25B7VbtRFguupC7VoNjZK%2FWlasu%2BMSTUzJkszvEpDaQ%3D&reserved=0

     Proposed Fix:
   
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fnetdev%2F20220606221140.488984-1-i.maximets%40ovn.org%2FT%2F%23u&data=05%7C01%7Cozsh%40nvidia.com%7C7ab5490f9c334e9d877f08da4e0652fc%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637908086346878848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=areRLYsAEbare7yo%2FxmIF9k2tMw2v8ZQkwHcR%2FEvV%2Bo%3D&reserved=0

     Status: Needs review.


I can confirm that the proposed fix resolves the OVN hairpin issue. It
also looks simple enough to be backportable all the way to where we
would need it (kernel 5.4.0). I'll have a look at giving this wider
exposure in an internal CI environment as a canary for any unintended
consequences if that would be helpful.



Sorry for jumping in late on this, as this patch was already accepted to the 
kernel.
However, I have a concern that this patch would break the tc datapath when ovs 
hw offload is enabled.

IIUC then this patch adds an implicit ovs_ct_clear call for 5-tuple modify 
actions. However, this implicit action will not apply to flows that use the tc 
datapath.

Forde, can you verify that indeed this fix breaks the OVN hairpin use case when 
hw offload is enabled.

Hi, Oz.  I don't think that this kernel fix breaks the TC datapath
as the packets processed by the openvswitch kernel module will not
go back to TC for further processing, IIUC.  Also, it's not a full
ct_clear, because we're not clearing the flow key.


A flow datapath is either in tc or in ovs.
If hardware offload is enabled then ovs will create a tc flower entry.
Therefore, packets for that flow will be processed by tc and not 
openvswitch.
Note that hardware offload may be enabled even if there is no supporting 
hardware. TC software datapath is designed to be functionally equivalent 
to ovs.


tc is processed before openvswitch in the kernel pipeline. Therefore, if 
a packet is matched by tc then it will not continue to openvswitch. 
Therefore my concern is that 

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-06-14 Thread Ilya Maximets
On 6/14/22 10:27, Oz Shlomo via dev wrote:
> 
> 
> On 6/8/2022 3:16 AM, Frode Nordahl wrote:
>> On Tue, Jun 7, 2022 at 12:16 AM Ilya Maximets  wrote:
>>>
>>> On 5/31/22 23:48, Ilya Maximets wrote:
 On 5/31/22 21:15, Frode Nordahl wrote:
> On Mon, May 30, 2022 at 5:25 PM Frode Nordahl
>>>
>>> 
>>>
> I've pushed the first part of the fix here:
> https://mail.openvswitch.org/pipermail/ovs-dev/2022-May/394450.html

 Thanks!  I saw that and I tend to think that it is correct.
 I'll try to test it and apply in the next couple of days.

 One question about the test above: which entity actually adds
 the ct_state to the packet or at which moment that happens?
 I see it, but I'm not sure I fully understand that.  Looks
 like I'm missing smething obvious.
>>>
>>> I found what is going on there - kernel simply tracks everything
>>> if not told to do so, so ICMP packets create the ct entry and
>>> subsequent packets re-use it, so icmp replies have +trk set while
>>> entering OVS.
>>
>> Great, my hunch was that something along these lines was happening as
>> well, I have to admit the test case was found by locating something
>> closest to the real life use case and it proved to work as a good test
>> for this condition.
>>
>>> 
>>>
>>> Let's have some summary of the issues discovered here so far,
>>> including a few new issues:
>>>
>>> 1. ct states set externally are not tracked correctly by OVS.
>>>     Fix: https://mail.openvswitch.org/pipermail/ovs-dev/2022-May/394450.html
>>>     Status:  LGTM, will apply soon.
>>>     This fixes the problem originally reported by Liam, IIUC.  Right?
>>
>> Yes.
>>
>>> 2. Kernel ct() actions are trying to re-use the cached connection
>>>     after the tuple changes.
>>>     This ends up to be the OVN hairpin issue as reported here:
>>>   https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
>>>
>>>     Proposed Fix:
>>>   
>>> https://lore.kernel.org/netdev/20220606221140.488984-1-i.maxim...@ovn.org/T/#u
>>>
>>>     Status: Needs review.
>>
>> I can confirm that the proposed fix resolves the OVN hairpin issue. It
>> also looks simple enough to be backportable all the way to where we
>> would need it (kernel 5.4.0). I'll have a look at giving this wider
>> exposure in an internal CI environment as a canary for any unintended
>> consequences if that would be helpful.
>>
> 
> Sorry for jumping in late on this, as this patch was already accepted to the 
> kernel.
> However, I have a concern that this patch would break the tc datapath when 
> ovs hw offload is enabled.
> 
> IIUC then this patch adds an implicit ovs_ct_clear call for 5-tuple modify 
> actions. However, this implicit action will not apply to flows that use the 
> tc datapath.
> 
> Forde, can you verify that indeed this fix breaks the OVN hairpin use case 
> when hw offload is enabled.
Hi, Oz.  I don't think that this kernel fix breaks the TC datapath
as the packets processed by the openvswitch kernel module will not
go back to TC for further processing, IIUC.  Also, it's not a full
ct_clear, because we're not clearing the flow key. 

But I agree that the original bug exists in TC as well, since TC
just copied the ct() recircuation optimization from the openvswitch.
So, if there are subsequent ct actions with pedit in between,
TC will have the same problem with misclassification as OVS had
before the kernel fix 2061ecfdf235.

So, the similar fix should be implemented for TC as well.  However,
I'm not sure how to actually do that, because ct and pedit are
not really connected in the kernel.  The issue might be fixed as a
side effect from fixes for the issue #5 in the list here, I guess,
but it's not really a correct fix.  The reason why it should be
fixed in the kernel is because user doesn't really know that TC
or openvswitch module cached that connection, the user didn't ask
it to be cached and re-used, they only wanted to populate the
current flow key with the ct_{state,mark,label} or commit some
changes.  TC/openvswitch kernel module decided to cache the nfct,
so it should handle possible mismatch if the packet got changed.

Does that make sense?

> 
>>> 3. ct_clear is not a reversible action, because it required
>>>     to pass the packet through conntrack again in order to restore
>>>     the conntrack state.
>>>
>>>     Fix: To move the CT_CLEAR to a group of irreversible actions
>>>  in the reversible_actions() in ofproto/ofproto-dpif-xlate.c.
>>>
>>>     Status: An easy fix that does nothing useful without fixes for
>>>     later issues, because we clear the ct_state before
>>>     the patch_port_output() processing and optimizing out
>>>     the CT_CLEAR action.
>>>
>>>
>>> 4. (new!) reversible_actions() optimization is logically incorrect.
>>>     The reason is that reversible_actions() doesn't look further
>>>     than a list of actions of the first OF rule in the second bridge.
>>>     If the list of action c

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-06-14 Thread Oz Shlomo via dev




On 6/8/2022 3:16 AM, Frode Nordahl wrote:

On Tue, Jun 7, 2022 at 12:16 AM Ilya Maximets  wrote:


On 5/31/22 23:48, Ilya Maximets wrote:

On 5/31/22 21:15, Frode Nordahl wrote:

On Mon, May 30, 2022 at 5:25 PM Frode Nordahl





I've pushed the first part of the fix here:
https://mail.openvswitch.org/pipermail/ovs-dev/2022-May/394450.html


Thanks!  I saw that and I tend to think that it is correct.
I'll try to test it and apply in the next couple of days.

One question about the test above: which entity actually adds
the ct_state to the packet or at which moment that happens?
I see it, but I'm not sure I fully understand that.  Looks
like I'm missing smething obvious.


I found what is going on there - kernel simply tracks everything
if not told to do so, so ICMP packets create the ct entry and
subsequent packets re-use it, so icmp replies have +trk set while
entering OVS.


Great, my hunch was that something along these lines was happening as
well, I have to admit the test case was found by locating something
closest to the real life use case and it proved to work as a good test
for this condition.




Let's have some summary of the issues discovered here so far,
including a few new issues:

1. ct states set externally are not tracked correctly by OVS.
Fix: https://mail.openvswitch.org/pipermail/ovs-dev/2022-May/394450.html
Status:  LGTM, will apply soon.
This fixes the problem originally reported by Liam, IIUC.  Right?


Yes.


2. Kernel ct() actions are trying to re-use the cached connection
after the tuple changes.
This ends up to be the OVN hairpin issue as reported here:
  https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856

Proposed Fix:
  
https://lore.kernel.org/netdev/20220606221140.488984-1-i.maxim...@ovn.org/T/#u

Status: Needs review.


I can confirm that the proposed fix resolves the OVN hairpin issue. It
also looks simple enough to be backportable all the way to where we
would need it (kernel 5.4.0). I'll have a look at giving this wider
exposure in an internal CI environment as a canary for any unintended
consequences if that would be helpful.



Sorry for jumping in late on this, as this patch was already accepted to 
the kernel.
However, I have a concern that this patch would break the tc datapath 
when ovs hw offload is enabled.


IIUC then this patch adds an implicit ovs_ct_clear call for 5-tuple 
modify actions. However, this implicit action will not apply to flows 
that use the tc datapath.


Forde, can you verify that indeed this fix breaks the OVN hairpin use 
case when hw offload is enabled.



3. ct_clear is not a reversible action, because it required
to pass the packet through conntrack again in order to restore
the conntrack state.

Fix: To move the CT_CLEAR to a group of irreversible actions
 in the reversible_actions() in ofproto/ofproto-dpif-xlate.c.

Status: An easy fix that does nothing useful without fixes for
later issues, because we clear the ct_state before
the patch_port_output() processing and optimizing out
the CT_CLEAR action.


4. (new!) reversible_actions() optimization is logically incorrect.
The reason is that reversible_actions() doesn't look further
than a list of actions of the first OF rule in the second bridge.
If the list of action contains resubmit, there can still be
irreversible actions and packet will be irreversibly modified by
the patch port processing without cloning.

Simple reproducer:

diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index dbb3b6dda..be30ad5bf 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -8736,7 +8736,8 @@ OVS_VSWITCHD_START(
  AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br1 'meter=1 pktps stats 
bands=type=drop rate=2'])
  AT_CHECK([ovs-ofctl del-flows br0])
  AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 in_port=local,ip,actions=2,1])
-AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 in_port=1,ip,actions=meter:1,3])
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 
"in_port=1,ip,actions=resubmit(,7)"])
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 
table=7,in_port=1,ip,actions=meter:1,3])

  AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 
'in_port(100),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x0800),ipv4(src=10.1.1.22,dst=10.0.0.3,proto=6,tos=0,ttl=64,frag=no),tcp(src=53295,dst=8080)'],
 [0], [stdout])
  AT_CHECK([tail -1 stdout], [0],
---

Status: unclear.
One idea is to reverse the logic and check datapath actions for
being reversible when going up in recursion instead of trying to
predict all the future actions while going down.

Ideas are welcome.
Hammerhead approach: Mark 'resubmit' as irreversible action.  But that
will effectively mean that we're always cloning on patch port output,
which is not great, see the issue #8.


5. Packets after the ct() in a non-forked pipeline must be untracked.
Statu

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-06-07 Thread Frode Nordahl
On Tue, Jun 7, 2022 at 12:16 AM Ilya Maximets  wrote:
>
> On 5/31/22 23:48, Ilya Maximets wrote:
> > On 5/31/22 21:15, Frode Nordahl wrote:
> >> On Mon, May 30, 2022 at 5:25 PM Frode Nordahl
>
> 
>
> >> I've pushed the first part of the fix here:
> >> https://mail.openvswitch.org/pipermail/ovs-dev/2022-May/394450.html
> >
> > Thanks!  I saw that and I tend to think that it is correct.
> > I'll try to test it and apply in the next couple of days.
> >
> > One question about the test above: which entity actually adds
> > the ct_state to the packet or at which moment that happens?
> > I see it, but I'm not sure I fully understand that.  Looks
> > like I'm missing smething obvious.
>
> I found what is going on there - kernel simply tracks everything
> if not told to do so, so ICMP packets create the ct entry and
> subsequent packets re-use it, so icmp replies have +trk set while
> entering OVS.

Great, my hunch was that something along these lines was happening as
well, I have to admit the test case was found by locating something
closest to the real life use case and it proved to work as a good test
for this condition.

> 
>
> Let's have some summary of the issues discovered here so far,
> including a few new issues:
>
> 1. ct states set externally are not tracked correctly by OVS.
>Fix: https://mail.openvswitch.org/pipermail/ovs-dev/2022-May/394450.html
>Status:  LGTM, will apply soon.
>This fixes the problem originally reported by Liam, IIUC.  Right?

Yes.

> 2. Kernel ct() actions are trying to re-use the cached connection
>after the tuple changes.
>This ends up to be the OVN hairpin issue as reported here:
>  https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
>
>Proposed Fix:
>  
> https://lore.kernel.org/netdev/20220606221140.488984-1-i.maxim...@ovn.org/T/#u
>
>Status: Needs review.

I can confirm that the proposed fix resolves the OVN hairpin issue. It
also looks simple enough to be backportable all the way to where we
would need it (kernel 5.4.0). I'll have a look at giving this wider
exposure in an internal CI environment as a canary for any unintended
consequences if that would be helpful.

> 3. ct_clear is not a reversible action, because it required
>to pass the packet through conntrack again in order to restore
>the conntrack state.
>
>Fix: To move the CT_CLEAR to a group of irreversible actions
> in the reversible_actions() in ofproto/ofproto-dpif-xlate.c.
>
>Status: An easy fix that does nothing useful without fixes for
>later issues, because we clear the ct_state before
>the patch_port_output() processing and optimizing out
>the CT_CLEAR action.
>
>
> 4. (new!) reversible_actions() optimization is logically incorrect.
>The reason is that reversible_actions() doesn't look further
>than a list of actions of the first OF rule in the second bridge.
>If the list of action contains resubmit, there can still be
>irreversible actions and packet will be irreversibly modified by
>the patch port processing without cloning.
>
>Simple reproducer:
>
> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> index dbb3b6dda..be30ad5bf 100644
> --- a/tests/ofproto-dpif.at
> +++ b/tests/ofproto-dpif.at
> @@ -8736,7 +8736,8 @@ OVS_VSWITCHD_START(
>  AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br1 'meter=1 pktps stats 
> bands=type=drop rate=2'])
>  AT_CHECK([ovs-ofctl del-flows br0])
>  AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 in_port=local,ip,actions=2,1])
> -AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 
> in_port=1,ip,actions=meter:1,3])
> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 
> "in_port=1,ip,actions=resubmit(,7)"])
> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 
> table=7,in_port=1,ip,actions=meter:1,3])
>
>  AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 
> 'in_port(100),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x0800),ipv4(src=10.1.1.22,dst=10.0.0.3,proto=6,tos=0,ttl=64,frag=no),tcp(src=53295,dst=8080)'],
>  [0], [stdout])
>  AT_CHECK([tail -1 stdout], [0],
> ---
>
>Status: unclear.
>One idea is to reverse the logic and check datapath actions for
>being reversible when going up in recursion instead of trying to
>predict all the future actions while going down.
>
>Ideas are welcome.
>Hammerhead approach: Mark 'resubmit' as irreversible action.  But that
>will effectively mean that we're always cloning on patch port output,
>which is not great, see the issue #8.
>
>
> 5. Packets after the ct() in a non-forked pipeline must be untracked.
>Status: unclear.
>
>Fix for the issue #2 will cover issues for already tracked packets,
>but if the packet is supposed to be untracked, but it is tracked,
>then we must emit the ct_clear action from the userspace.
>
>The most viable approach, I guess, is the previously proposed
>'pending_ct_clear' fix.  Alternative is to always emit ct_clear
> 

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-06-06 Thread Ilya Maximets
On 5/31/22 23:48, Ilya Maximets wrote:
> On 5/31/22 21:15, Frode Nordahl wrote:
>> On Mon, May 30, 2022 at 5:25 PM Frode Nordahl



>> I've pushed the first part of the fix here:
>> https://mail.openvswitch.org/pipermail/ovs-dev/2022-May/394450.html
> 
> Thanks!  I saw that and I tend to think that it is correct.
> I'll try to test it and apply in the next couple of days.
> 
> One question about the test above: which entity actually adds
> the ct_state to the packet or at which moment that happens?
> I see it, but I'm not sure I fully understand that.  Looks
> like I'm missing smething obvious.

I found what is going on there - kernel simply tracks everything
if not told to do so, so ICMP packets create the ct entry and
subsequent packets re-use it, so icmp replies have +trk set while
entering OVS.



Let's have some summary of the issues discovered here so far,
including a few new issues:

1. ct states set externally are not tracked correctly by OVS.
   Fix: https://mail.openvswitch.org/pipermail/ovs-dev/2022-May/394450.html
   Status:  LGTM, will apply soon.
   This fixes the problem originally reported by Liam, IIUC.  Right?


2. Kernel ct() actions are trying to re-use the cached connection
   after the tuple changes.
   This ends up to be the OVN hairpin issue as reported here:
 https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856

   Proposed Fix:
 
https://lore.kernel.org/netdev/20220606221140.488984-1-i.maxim...@ovn.org/T/#u

   Status: Needs review.


3. ct_clear is not a reversible action, because it required
   to pass the packet through conntrack again in order to restore
   the conntrack state.

   Fix: To move the CT_CLEAR to a group of irreversible actions
in the reversible_actions() in ofproto/ofproto-dpif-xlate.c.

   Status: An easy fix that does nothing useful without fixes for
   later issues, because we clear the ct_state before
   the patch_port_output() processing and optimizing out
   the CT_CLEAR action.


4. (new!) reversible_actions() optimization is logically incorrect.
   The reason is that reversible_actions() doesn't look further
   than a list of actions of the first OF rule in the second bridge.
   If the list of action contains resubmit, there can still be
   irreversible actions and packet will be irreversibly modified by
   the patch port processing without cloning.

   Simple reproducer:

diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index dbb3b6dda..be30ad5bf 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -8736,7 +8736,8 @@ OVS_VSWITCHD_START(
 AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br1 'meter=1 pktps stats 
bands=type=drop rate=2'])
 AT_CHECK([ovs-ofctl del-flows br0])
 AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 in_port=local,ip,actions=2,1])
-AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 in_port=1,ip,actions=meter:1,3])
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 
"in_port=1,ip,actions=resubmit(,7)"])
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 
table=7,in_port=1,ip,actions=meter:1,3])
 
 AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 
'in_port(100),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x0800),ipv4(src=10.1.1.22,dst=10.0.0.3,proto=6,tos=0,ttl=64,frag=no),tcp(src=53295,dst=8080)'],
 [0], [stdout])
 AT_CHECK([tail -1 stdout], [0],
---

   Status: unclear.
   One idea is to reverse the logic and check datapath actions for
   being reversible when going up in recursion instead of trying to
   predict all the future actions while going down.

   Ideas are welcome.
   Hammerhead approach: Mark 'resubmit' as irreversible action.  But that
   will effectively mean that we're always cloning on patch port output,
   which is not great, see the issue #8.


5. Packets after the ct() in a non-forked pipeline must be untracked.
   Status: unclear.

   Fix for the issue #2 will cover issues for already tracked packets,
   but if the packet is supposed to be untracked, but it is tracked,
   then we must emit the ct_clear action from the userspace.

   The most viable approach, I guess, is the previously proposed
   'pending_ct_clear' fix.  Alternative is to always emit ct_clear
   right after the ct() action and not loose our minds trying to
   track the pending action in all the weird cases.


6. Conntrack state must not be preserved while going through the
   patch port.

   State: unclear.

   The right solution would be to emit the ct_clear datapath action
   after cloning for the patch port.  Few problems here:
   - current code in ofproto/ofproto-dpif-xlate.c will not actually
 allow us to do that, we need to fix the issue #4 first to be able to
 inject new actions post factum.
   - ct_clear is non-reversible (issue #3) meaning if we're adding
 ct-clear, we have to clone even if the patch port processing
 doesn't have any ct() actions or recirculations.
 Excessive ct_clear's and clones can be avoided by fixing the
 issue #4 and 

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-05-31 Thread Ilya Maximets
On 5/31/22 21:15, Frode Nordahl wrote:
> On Mon, May 30, 2022 at 5:25 PM Frode Nordahl
>  wrote:
>>
>> On Fri, May 27, 2022 at 10:04 PM Ilya Maximets  wrote:
>>>
>>> On 5/26/22 14:53, Frode Nordahl wrote:


 tor. 26. mai 2022, 14:45 skrev Ilya Maximets >>> >:

 On 5/26/22 13:00, Frode Nordahl wrote:
 > On Wed, May 25, 2022 at 9:55 AM Frode Nordahl
 > mailto:frode.nord...@canonical.com>> 
 wrote:
 >>
 >> On Tue, May 24, 2022 at 1:32 PM Ilya Maximets >>> > wrote:
 >>>
 >>> On 5/24/22 12:54, Frode Nordahl wrote:
  On Mon, May 23, 2022 at 3:49 PM Ilya Maximets >>> > wrote:
 >
 > On 5/21/22 12:49, Frode Nordahl wrote:
 >> On Thu, May 19, 2022 at 3:39 PM Frode Nordahl
 >> >>> > wrote:
 >>>
 >>> On Sat, May 14, 2022 at 2:10 AM Ilya Maximets 
 mailto:i.maxim...@ovn.org>> wrote:
 
  On 5/13/22 10:36, Frode Nordahl wrote:
 > On Fri, Mar 11, 2022 at 2:04 PM Liam Young 
 mailto:liam.yo...@canonical.com>> wrote:
 >>
 >> Hi,
 >>
 >>  Commit 355fef6f2 seems to break connectivity in my 
 setup
 >
 > After OVN started colocating sNAT and dNAT operations in the 
 same CT
 > zone [0] the above mentioned OVS commit appears to also 
 break OVN [1].
 > I have been discussing with Numan and he has found a 
 correlation with
 > the behavior of the open vswitch kernel data path conntrack
 > `skb_nfct_cached` function, i.e. if that is changed to 
 always return
 > 'false' the erratic behavior disappears.
 >
 > So I guess the question then becomes, is this a OVS 
 userspace or OVS
 > kernel problem?
 >
 > We have a reproducer in [3].
 >
 > 0: 
 https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a
  
 
 > 1: 
 https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856 
 
 > 2: 
 https://elixir.bootlin.com/linux/latest/source/net/openvswitch/conntrack.c#L683
  
 
 > 3: 
 https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6 
 
 >
 
  Hrm.  I think, there is a logical bug in implementations of 
 ct()
  datapath action in both datapaths.
 
  The problem appears when the OpenFlow pipeline for the packet 
 contains
  several ct() actions.  NXAST_CT action (i.e. every ct() 
 action) must
  always put the packet into an untracked state.  Tracked state 
 will be
  set only in the 'recirc_table' where the packet is cloned by 
 the ct()
  action for further processing.
 
  If an OF pipeline looks like this:
 
    actions=ct(),something,something,ct(),something
 
  each action must be entered with the 'untracked' conntrack 
 state in
  the packet metadata.
 
  However, ct() action in the datapath, unlike OpenFlow, 
 doesn't work
  this way.  It modifies the conntrack state for the packet it 
 is processing.
  During the flow translation OVS inserts recirculation right 
 after the
  datapath ct() action.  This way conntrack state affects only 
 processing
  after recirculation.  Sort of.
 
  The issue is that not all OpenFlow ct() actions have 
 recirc_table
  specified.  These actions supposed to change the state of the 
 conntrack
  subsystem, but they still must leave the packet itself in the 
 untracked
  state with no strings attached.  But since datapath ct() 
 actions doesn't
  work that way and since recirculation is not inserted (no 
 recirc_table
  specified), packet after conntrack execution carries the 
 state along to
  all other actions.
  This doesn't impact normal actions, but seems to break 
 subsequent ct()
 

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-05-31 Thread Frode Nordahl
On Mon, May 30, 2022 at 5:25 PM Frode Nordahl
 wrote:
>
> On Fri, May 27, 2022 at 10:04 PM Ilya Maximets  wrote:
> >
> > On 5/26/22 14:53, Frode Nordahl wrote:
> > >
> > >
> > > tor. 26. mai 2022, 14:45 skrev Ilya Maximets  > > >:
> > >
> > > On 5/26/22 13:00, Frode Nordahl wrote:
> > > > On Wed, May 25, 2022 at 9:55 AM Frode Nordahl
> > > > mailto:frode.nord...@canonical.com>> 
> > > wrote:
> > > >>
> > > >> On Tue, May 24, 2022 at 1:32 PM Ilya Maximets  > > > wrote:
> > > >>>
> > > >>> On 5/24/22 12:54, Frode Nordahl wrote:
> > >  On Mon, May 23, 2022 at 3:49 PM Ilya Maximets 
> > > mailto:i.maxim...@ovn.org>> wrote:
> > > >
> > > > On 5/21/22 12:49, Frode Nordahl wrote:
> > > >> On Thu, May 19, 2022 at 3:39 PM Frode Nordahl
> > > >>  > > > wrote:
> > > >>>
> > > >>> On Sat, May 14, 2022 at 2:10 AM Ilya Maximets 
> > > mailto:i.maxim...@ovn.org>> wrote:
> > > 
> > >  On 5/13/22 10:36, Frode Nordahl wrote:
> > > > On Fri, Mar 11, 2022 at 2:04 PM Liam Young 
> > > mailto:liam.yo...@canonical.com>> wrote:
> > > >>
> > > >> Hi,
> > > >>
> > > >>  Commit 355fef6f2 seems to break connectivity in my 
> > > setup
> > > >
> > > > After OVN started colocating sNAT and dNAT operations in 
> > > the same CT
> > > > zone [0] the above mentioned OVS commit appears to also 
> > > break OVN [1].
> > > > I have been discussing with Numan and he has found a 
> > > correlation with
> > > > the behavior of the open vswitch kernel data path conntrack
> > > > `skb_nfct_cached` function, i.e. if that is changed to 
> > > always return
> > > > 'false' the erratic behavior disappears.
> > > >
> > > > So I guess the question then becomes, is this a OVS 
> > > userspace or OVS
> > > > kernel problem?
> > > >
> > > > We have a reproducer in [3].
> > > >
> > > > 0: 
> > > https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a
> > >  
> > > 
> > > > 1: 
> > > https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856 
> > > 
> > > > 2: 
> > > https://elixir.bootlin.com/linux/latest/source/net/openvswitch/conntrack.c#L683
> > >  
> > > 
> > > > 3: 
> > > https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6 
> > > 
> > > >
> > > 
> > >  Hrm.  I think, there is a logical bug in implementations of 
> > > ct()
> > >  datapath action in both datapaths.
> > > 
> > >  The problem appears when the OpenFlow pipeline for the 
> > > packet contains
> > >  several ct() actions.  NXAST_CT action (i.e. every ct() 
> > > action) must
> > >  always put the packet into an untracked state.  Tracked 
> > > state will be
> > >  set only in the 'recirc_table' where the packet is cloned by 
> > > the ct()
> > >  action for further processing.
> > > 
> > >  If an OF pipeline looks like this:
> > > 
> > >    actions=ct(),something,something,ct(),something
> > > 
> > >  each action must be entered with the 'untracked' conntrack 
> > > state in
> > >  the packet metadata.
> > > 
> > >  However, ct() action in the datapath, unlike OpenFlow, 
> > > doesn't work
> > >  this way.  It modifies the conntrack state for the packet it 
> > > is processing.
> > >  During the flow translation OVS inserts recirculation right 
> > > after the
> > >  datapath ct() action.  This way conntrack state affects only 
> > > processing
> > >  after recirculation.  Sort of.
> > > 
> > >  The issue is that not all OpenFlow ct() actions have 
> > > recirc_table
> > >  specified.  These actions supposed to change the state of 
> > > the conntrack
> > >  subsystem, but they still must leave the packet itself in 
> > > the untracked
> > >  state with no strings attached.  But since datapath ct() 
> > > actions doesn't
> > >  work that way and since recirculation is not inserted (no 
> > > recirc_table
> > >  specified), packet after conntrack execution carries the 
> > > state along to
> > >  all other actions.
> > >  This doesn't 

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-05-30 Thread Frode Nordahl
On Fri, May 27, 2022 at 10:04 PM Ilya Maximets  wrote:
>
> On 5/26/22 14:53, Frode Nordahl wrote:
> >
> >
> > tor. 26. mai 2022, 14:45 skrev Ilya Maximets  > >:
> >
> > On 5/26/22 13:00, Frode Nordahl wrote:
> > > On Wed, May 25, 2022 at 9:55 AM Frode Nordahl
> > > mailto:frode.nord...@canonical.com>> 
> > wrote:
> > >>
> > >> On Tue, May 24, 2022 at 1:32 PM Ilya Maximets  > > wrote:
> > >>>
> > >>> On 5/24/22 12:54, Frode Nordahl wrote:
> >  On Mon, May 23, 2022 at 3:49 PM Ilya Maximets  > > wrote:
> > >
> > > On 5/21/22 12:49, Frode Nordahl wrote:
> > >> On Thu, May 19, 2022 at 3:39 PM Frode Nordahl
> > >>  > > wrote:
> > >>>
> > >>> On Sat, May 14, 2022 at 2:10 AM Ilya Maximets 
> > mailto:i.maxim...@ovn.org>> wrote:
> > 
> >  On 5/13/22 10:36, Frode Nordahl wrote:
> > > On Fri, Mar 11, 2022 at 2:04 PM Liam Young 
> > mailto:liam.yo...@canonical.com>> wrote:
> > >>
> > >> Hi,
> > >>
> > >>  Commit 355fef6f2 seems to break connectivity in my 
> > setup
> > >
> > > After OVN started colocating sNAT and dNAT operations in the 
> > same CT
> > > zone [0] the above mentioned OVS commit appears to also break 
> > OVN [1].
> > > I have been discussing with Numan and he has found a 
> > correlation with
> > > the behavior of the open vswitch kernel data path conntrack
> > > `skb_nfct_cached` function, i.e. if that is changed to always 
> > return
> > > 'false' the erratic behavior disappears.
> > >
> > > So I guess the question then becomes, is this a OVS userspace 
> > or OVS
> > > kernel problem?
> > >
> > > We have a reproducer in [3].
> > >
> > > 0: 
> > https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a
> >  
> > 
> > > 1: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856 
> > 
> > > 2: 
> > https://elixir.bootlin.com/linux/latest/source/net/openvswitch/conntrack.c#L683
> >  
> > 
> > > 3: 
> > https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6 
> > 
> > >
> > 
> >  Hrm.  I think, there is a logical bug in implementations of 
> > ct()
> >  datapath action in both datapaths.
> > 
> >  The problem appears when the OpenFlow pipeline for the packet 
> > contains
> >  several ct() actions.  NXAST_CT action (i.e. every ct() 
> > action) must
> >  always put the packet into an untracked state.  Tracked state 
> > will be
> >  set only in the 'recirc_table' where the packet is cloned by 
> > the ct()
> >  action for further processing.
> > 
> >  If an OF pipeline looks like this:
> > 
> >    actions=ct(),something,something,ct(),something
> > 
> >  each action must be entered with the 'untracked' conntrack 
> > state in
> >  the packet metadata.
> > 
> >  However, ct() action in the datapath, unlike OpenFlow, doesn't 
> > work
> >  this way.  It modifies the conntrack state for the packet it 
> > is processing.
> >  During the flow translation OVS inserts recirculation right 
> > after the
> >  datapath ct() action.  This way conntrack state affects only 
> > processing
> >  after recirculation.  Sort of.
> > 
> >  The issue is that not all OpenFlow ct() actions have 
> > recirc_table
> >  specified.  These actions supposed to change the state of the 
> > conntrack
> >  subsystem, but they still must leave the packet itself in the 
> > untracked
> >  state with no strings attached.  But since datapath ct() 
> > actions doesn't
> >  work that way and since recirculation is not inserted (no 
> > recirc_table
> >  specified), packet after conntrack execution carries the state 
> > along to
> >  all other actions.
> >  This doesn't impact normal actions, but seems to break 
> > subsequent ct()
> >  actions on the same pipeline.
> > 
> >  In general, it looks to me that ct() action in the datapath is
> >  not supposed to cache any connection data inside the packet's 
> > met

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-05-27 Thread Ilya Maximets
On 5/26/22 14:53, Frode Nordahl wrote:
> 
> 
> tor. 26. mai 2022, 14:45 skrev Ilya Maximets  >:
> 
> On 5/26/22 13:00, Frode Nordahl wrote:
> > On Wed, May 25, 2022 at 9:55 AM Frode Nordahl
> > mailto:frode.nord...@canonical.com>> 
> wrote:
> >>
> >> On Tue, May 24, 2022 at 1:32 PM Ilya Maximets  > wrote:
> >>>
> >>> On 5/24/22 12:54, Frode Nordahl wrote:
>  On Mon, May 23, 2022 at 3:49 PM Ilya Maximets  > wrote:
> >
> > On 5/21/22 12:49, Frode Nordahl wrote:
> >> On Thu, May 19, 2022 at 3:39 PM Frode Nordahl
> >> mailto:frode.nord...@canonical.com>> 
> wrote:
> >>>
> >>> On Sat, May 14, 2022 at 2:10 AM Ilya Maximets  > wrote:
> 
>  On 5/13/22 10:36, Frode Nordahl wrote:
> > On Fri, Mar 11, 2022 at 2:04 PM Liam Young 
> mailto:liam.yo...@canonical.com>> wrote:
> >>
> >> Hi,
> >>
> >>  Commit 355fef6f2 seems to break connectivity in my 
> setup
> >
> > After OVN started colocating sNAT and dNAT operations in the 
> same CT
> > zone [0] the above mentioned OVS commit appears to also break 
> OVN [1].
> > I have been discussing with Numan and he has found a 
> correlation with
> > the behavior of the open vswitch kernel data path conntrack
> > `skb_nfct_cached` function, i.e. if that is changed to always 
> return
> > 'false' the erratic behavior disappears.
> >
> > So I guess the question then becomes, is this a OVS userspace 
> or OVS
> > kernel problem?
> >
> > We have a reproducer in [3].
> >
> > 0: 
> https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a
>  
> 
> > 1: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856 
> 
> > 2: 
> https://elixir.bootlin.com/linux/latest/source/net/openvswitch/conntrack.c#L683
>  
> 
> > 3: 
> https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6 
> 
> >
> 
>  Hrm.  I think, there is a logical bug in implementations of ct()
>  datapath action in both datapaths.
> 
>  The problem appears when the OpenFlow pipeline for the packet 
> contains
>  several ct() actions.  NXAST_CT action (i.e. every ct() action) 
> must
>  always put the packet into an untracked state.  Tracked state 
> will be
>  set only in the 'recirc_table' where the packet is cloned by the 
> ct()
>  action for further processing.
> 
>  If an OF pipeline looks like this:
> 
>    actions=ct(),something,something,ct(),something
> 
>  each action must be entered with the 'untracked' conntrack state 
> in
>  the packet metadata.
> 
>  However, ct() action in the datapath, unlike OpenFlow, doesn't 
> work
>  this way.  It modifies the conntrack state for the packet it is 
> processing.
>  During the flow translation OVS inserts recirculation right 
> after the
>  datapath ct() action.  This way conntrack state affects only 
> processing
>  after recirculation.  Sort of.
> 
>  The issue is that not all OpenFlow ct() actions have recirc_table
>  specified.  These actions supposed to change the state of the 
> conntrack
>  subsystem, but they still must leave the packet itself in the 
> untracked
>  state with no strings attached.  But since datapath ct() actions 
> doesn't
>  work that way and since recirculation is not inserted (no 
> recirc_table
>  specified), packet after conntrack execution carries the state 
> along to
>  all other actions.
>  This doesn't impact normal actions, but seems to break 
> subsequent ct()
>  actions on the same pipeline.
> 
>  In general, it looks to me that ct() action in the datapath is
>  not supposed to cache any connection data inside the packet's 
> metadata.
>  This seems to be the root cause of the problem.  Fields that OF 
> knows
>  about should not trigger any issues if carried along with a 
> packet,
>  but datapath-specific metadata can not be cleared, because 
> OFproto
>  simply doesn't kn

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-05-26 Thread Frode Nordahl
tor. 26. mai 2022, 14:45 skrev Ilya Maximets :

> On 5/26/22 13:00, Frode Nordahl wrote:
> > On Wed, May 25, 2022 at 9:55 AM Frode Nordahl
> >  wrote:
> >>
> >> On Tue, May 24, 2022 at 1:32 PM Ilya Maximets 
> wrote:
> >>>
> >>> On 5/24/22 12:54, Frode Nordahl wrote:
>  On Mon, May 23, 2022 at 3:49 PM Ilya Maximets 
> wrote:
> >
> > On 5/21/22 12:49, Frode Nordahl wrote:
> >> On Thu, May 19, 2022 at 3:39 PM Frode Nordahl
> >>  wrote:
> >>>
> >>> On Sat, May 14, 2022 at 2:10 AM Ilya Maximets 
> wrote:
> 
>  On 5/13/22 10:36, Frode Nordahl wrote:
> > On Fri, Mar 11, 2022 at 2:04 PM Liam Young <
> liam.yo...@canonical.com> wrote:
> >>
> >> Hi,
> >>
> >>  Commit 355fef6f2 seems to break connectivity in my
> setup
> >
> > After OVN started colocating sNAT and dNAT operations in the
> same CT
> > zone [0] the above mentioned OVS commit appears to also break
> OVN [1].
> > I have been discussing with Numan and he has found a correlation
> with
> > the behavior of the open vswitch kernel data path conntrack
> > `skb_nfct_cached` function, i.e. if that is changed to always
> return
> > 'false' the erratic behavior disappears.
> >
> > So I guess the question then becomes, is this a OVS userspace or
> OVS
> > kernel problem?
> >
> > We have a reproducer in [3].
> >
> > 0:
> https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a
> > 1: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
> > 2:
> https://elixir.bootlin.com/linux/latest/source/net/openvswitch/conntrack.c#L683
> > 3:
> https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6
> >
> 
>  Hrm.  I think, there is a logical bug in implementations of ct()
>  datapath action in both datapaths.
> 
>  The problem appears when the OpenFlow pipeline for the packet
> contains
>  several ct() actions.  NXAST_CT action (i.e. every ct() action)
> must
>  always put the packet into an untracked state.  Tracked state
> will be
>  set only in the 'recirc_table' where the packet is cloned by the
> ct()
>  action for further processing.
> 
>  If an OF pipeline looks like this:
> 
>    actions=ct(),something,something,ct(),something
> 
>  each action must be entered with the 'untracked' conntrack state
> in
>  the packet metadata.
> 
>  However, ct() action in the datapath, unlike OpenFlow, doesn't
> work
>  this way.  It modifies the conntrack state for the packet it is
> processing.
>  During the flow translation OVS inserts recirculation right after
> the
>  datapath ct() action.  This way conntrack state affects only
> processing
>  after recirculation.  Sort of.
> 
>  The issue is that not all OpenFlow ct() actions have recirc_table
>  specified.  These actions supposed to change the state of the
> conntrack
>  subsystem, but they still must leave the packet itself in the
> untracked
>  state with no strings attached.  But since datapath ct() actions
> doesn't
>  work that way and since recirculation is not inserted (no
> recirc_table
>  specified), packet after conntrack execution carries the state
> along to
>  all other actions.
>  This doesn't impact normal actions, but seems to break subsequent
> ct()
>  actions on the same pipeline.
> 
>  In general, it looks to me that ct() action in the datapath is
>  not supposed to cache any connection data inside the packet's
> metadata.
>  This seems to be the root cause of the problem.  Fields that OF
> knows
>  about should not trigger any issues if carried along with a
> packet,
>  but datapath-specific metadata can not be cleared, because OFproto
>  simply doesn't know about it.
> 
>  But, I guess, not caching the connection and other internal
> structures
>  might significantly impact datapath performance.  Will it?
> 
>  Looking at the reproducer in [3], it has, for example, the flow
> with the
>  following actions:
> 
>    actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
>    set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
>    set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
>    ct(zone=8),recirc(0x4)
> 
>  So, if the first ct() will change some datapath-specific packet
> metadata,
>  the second ct() may try to use that information.
> 
>  It looks like during the flow translation we must add ct_clear
> action
>  explicitly after every ct() action, unless it was the last action
>

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-05-26 Thread Ilya Maximets
On 5/26/22 13:00, Frode Nordahl wrote:
> On Wed, May 25, 2022 at 9:55 AM Frode Nordahl
>  wrote:
>>
>> On Tue, May 24, 2022 at 1:32 PM Ilya Maximets  wrote:
>>>
>>> On 5/24/22 12:54, Frode Nordahl wrote:
 On Mon, May 23, 2022 at 3:49 PM Ilya Maximets  wrote:
>
> On 5/21/22 12:49, Frode Nordahl wrote:
>> On Thu, May 19, 2022 at 3:39 PM Frode Nordahl
>>  wrote:
>>>
>>> On Sat, May 14, 2022 at 2:10 AM Ilya Maximets  
>>> wrote:

 On 5/13/22 10:36, Frode Nordahl wrote:
> On Fri, Mar 11, 2022 at 2:04 PM Liam Young  
> wrote:
>>
>> Hi,
>>
>>  Commit 355fef6f2 seems to break connectivity in my 
>> setup
>
> After OVN started colocating sNAT and dNAT operations in the same CT
> zone [0] the above mentioned OVS commit appears to also break OVN [1].
> I have been discussing with Numan and he has found a correlation with
> the behavior of the open vswitch kernel data path conntrack
> `skb_nfct_cached` function, i.e. if that is changed to always return
> 'false' the erratic behavior disappears.
>
> So I guess the question then becomes, is this a OVS userspace or OVS
> kernel problem?
>
> We have a reproducer in [3].
>
> 0: 
> https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a
> 1: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
> 2: 
> https://elixir.bootlin.com/linux/latest/source/net/openvswitch/conntrack.c#L683
> 3: 
> https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6
>

 Hrm.  I think, there is a logical bug in implementations of ct()
 datapath action in both datapaths.

 The problem appears when the OpenFlow pipeline for the packet contains
 several ct() actions.  NXAST_CT action (i.e. every ct() action) must
 always put the packet into an untracked state.  Tracked state will be
 set only in the 'recirc_table' where the packet is cloned by the ct()
 action for further processing.

 If an OF pipeline looks like this:

   actions=ct(),something,something,ct(),something

 each action must be entered with the 'untracked' conntrack state in
 the packet metadata.

 However, ct() action in the datapath, unlike OpenFlow, doesn't work
 this way.  It modifies the conntrack state for the packet it is 
 processing.
 During the flow translation OVS inserts recirculation right after the
 datapath ct() action.  This way conntrack state affects only processing
 after recirculation.  Sort of.

 The issue is that not all OpenFlow ct() actions have recirc_table
 specified.  These actions supposed to change the state of the conntrack
 subsystem, but they still must leave the packet itself in the untracked
 state with no strings attached.  But since datapath ct() actions 
 doesn't
 work that way and since recirculation is not inserted (no recirc_table
 specified), packet after conntrack execution carries the state along to
 all other actions.
 This doesn't impact normal actions, but seems to break subsequent ct()
 actions on the same pipeline.

 In general, it looks to me that ct() action in the datapath is
 not supposed to cache any connection data inside the packet's metadata.
 This seems to be the root cause of the problem.  Fields that OF knows
 about should not trigger any issues if carried along with a packet,
 but datapath-specific metadata can not be cleared, because OFproto
 simply doesn't know about it.

 But, I guess, not caching the connection and other internal structures
 might significantly impact datapath performance.  Will it?

 Looking at the reproducer in [3], it has, for example, the flow with 
 the
 following actions:

   actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
   set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
   set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
   ct(zone=8),recirc(0x4)

 So, if the first ct() will change some datapath-specific packet 
 metadata,
 the second ct() may try to use that information.

 It looks like during the flow translation we must add ct_clear action
 explicitly after every ct() action, unless it was the last action in
 the list.  This will end up with adding a lot of ct_clear actions
 everywhere.

 Another option is the patch below that tries to track if ct_clear
 is required and adds that action before

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-05-26 Thread Frode Nordahl
On Wed, May 25, 2022 at 9:55 AM Frode Nordahl
 wrote:
>
> On Tue, May 24, 2022 at 1:32 PM Ilya Maximets  wrote:
> >
> > On 5/24/22 12:54, Frode Nordahl wrote:
> > > On Mon, May 23, 2022 at 3:49 PM Ilya Maximets  wrote:
> > >>
> > >> On 5/21/22 12:49, Frode Nordahl wrote:
> > >>> On Thu, May 19, 2022 at 3:39 PM Frode Nordahl
> > >>>  wrote:
> > 
> >  On Sat, May 14, 2022 at 2:10 AM Ilya Maximets  
> >  wrote:
> > >
> > > On 5/13/22 10:36, Frode Nordahl wrote:
> > >> On Fri, Mar 11, 2022 at 2:04 PM Liam Young 
> > >>  wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>>  Commit 355fef6f2 seems to break connectivity in my 
> > >>> setup
> > >>
> > >> After OVN started colocating sNAT and dNAT operations in the same CT
> > >> zone [0] the above mentioned OVS commit appears to also break OVN 
> > >> [1].
> > >> I have been discussing with Numan and he has found a correlation with
> > >> the behavior of the open vswitch kernel data path conntrack
> > >> `skb_nfct_cached` function, i.e. if that is changed to always return
> > >> 'false' the erratic behavior disappears.
> > >>
> > >> So I guess the question then becomes, is this a OVS userspace or OVS
> > >> kernel problem?
> > >>
> > >> We have a reproducer in [3].
> > >>
> > >> 0: 
> > >> https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a
> > >> 1: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
> > >> 2: 
> > >> https://elixir.bootlin.com/linux/latest/source/net/openvswitch/conntrack.c#L683
> > >> 3: 
> > >> https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6
> > >>
> > >
> > > Hrm.  I think, there is a logical bug in implementations of ct()
> > > datapath action in both datapaths.
> > >
> > > The problem appears when the OpenFlow pipeline for the packet contains
> > > several ct() actions.  NXAST_CT action (i.e. every ct() action) must
> > > always put the packet into an untracked state.  Tracked state will be
> > > set only in the 'recirc_table' where the packet is cloned by the ct()
> > > action for further processing.
> > >
> > > If an OF pipeline looks like this:
> > >
> > >   actions=ct(),something,something,ct(),something
> > >
> > > each action must be entered with the 'untracked' conntrack state in
> > > the packet metadata.
> > >
> > > However, ct() action in the datapath, unlike OpenFlow, doesn't work
> > > this way.  It modifies the conntrack state for the packet it is 
> > > processing.
> > > During the flow translation OVS inserts recirculation right after the
> > > datapath ct() action.  This way conntrack state affects only 
> > > processing
> > > after recirculation.  Sort of.
> > >
> > > The issue is that not all OpenFlow ct() actions have recirc_table
> > > specified.  These actions supposed to change the state of the 
> > > conntrack
> > > subsystem, but they still must leave the packet itself in the 
> > > untracked
> > > state with no strings attached.  But since datapath ct() actions 
> > > doesn't
> > > work that way and since recirculation is not inserted (no recirc_table
> > > specified), packet after conntrack execution carries the state along 
> > > to
> > > all other actions.
> > > This doesn't impact normal actions, but seems to break subsequent ct()
> > > actions on the same pipeline.
> > >
> > > In general, it looks to me that ct() action in the datapath is
> > > not supposed to cache any connection data inside the packet's 
> > > metadata.
> > > This seems to be the root cause of the problem.  Fields that OF knows
> > > about should not trigger any issues if carried along with a packet,
> > > but datapath-specific metadata can not be cleared, because OFproto
> > > simply doesn't know about it.
> > >
> > > But, I guess, not caching the connection and other internal structures
> > > might significantly impact datapath performance.  Will it?
> > >
> > > Looking at the reproducer in [3], it has, for example, the flow with 
> > > the
> > > following actions:
> > >
> > >   actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
> > >   set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
> > >   set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
> > >   ct(zone=8),recirc(0x4)
> > >
> > > So, if the first ct() will change some datapath-specific packet 
> > > metadata,
> > > the second ct() may try to use that information.
> > >
> > > It looks like during the flow translation we must add ct_clear action
> > > explicitly after every ct() action, unless it was the last action in
> > > the list.  This will end up with adding a lot of ct_clear actions
> > > everywhere.
> > >

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-05-25 Thread Frode Nordahl
On Tue, May 24, 2022 at 1:32 PM Ilya Maximets  wrote:
>
> On 5/24/22 12:54, Frode Nordahl wrote:
> > On Mon, May 23, 2022 at 3:49 PM Ilya Maximets  wrote:
> >>
> >> On 5/21/22 12:49, Frode Nordahl wrote:
> >>> On Thu, May 19, 2022 at 3:39 PM Frode Nordahl
> >>>  wrote:
> 
>  On Sat, May 14, 2022 at 2:10 AM Ilya Maximets  wrote:
> >
> > On 5/13/22 10:36, Frode Nordahl wrote:
> >> On Fri, Mar 11, 2022 at 2:04 PM Liam Young  
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>>  Commit 355fef6f2 seems to break connectivity in my 
> >>> setup
> >>
> >> After OVN started colocating sNAT and dNAT operations in the same CT
> >> zone [0] the above mentioned OVS commit appears to also break OVN [1].
> >> I have been discussing with Numan and he has found a correlation with
> >> the behavior of the open vswitch kernel data path conntrack
> >> `skb_nfct_cached` function, i.e. if that is changed to always return
> >> 'false' the erratic behavior disappears.
> >>
> >> So I guess the question then becomes, is this a OVS userspace or OVS
> >> kernel problem?
> >>
> >> We have a reproducer in [3].
> >>
> >> 0: 
> >> https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a
> >> 1: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
> >> 2: 
> >> https://elixir.bootlin.com/linux/latest/source/net/openvswitch/conntrack.c#L683
> >> 3: 
> >> https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6
> >>
> >
> > Hrm.  I think, there is a logical bug in implementations of ct()
> > datapath action in both datapaths.
> >
> > The problem appears when the OpenFlow pipeline for the packet contains
> > several ct() actions.  NXAST_CT action (i.e. every ct() action) must
> > always put the packet into an untracked state.  Tracked state will be
> > set only in the 'recirc_table' where the packet is cloned by the ct()
> > action for further processing.
> >
> > If an OF pipeline looks like this:
> >
> >   actions=ct(),something,something,ct(),something
> >
> > each action must be entered with the 'untracked' conntrack state in
> > the packet metadata.
> >
> > However, ct() action in the datapath, unlike OpenFlow, doesn't work
> > this way.  It modifies the conntrack state for the packet it is 
> > processing.
> > During the flow translation OVS inserts recirculation right after the
> > datapath ct() action.  This way conntrack state affects only processing
> > after recirculation.  Sort of.
> >
> > The issue is that not all OpenFlow ct() actions have recirc_table
> > specified.  These actions supposed to change the state of the conntrack
> > subsystem, but they still must leave the packet itself in the untracked
> > state with no strings attached.  But since datapath ct() actions doesn't
> > work that way and since recirculation is not inserted (no recirc_table
> > specified), packet after conntrack execution carries the state along to
> > all other actions.
> > This doesn't impact normal actions, but seems to break subsequent ct()
> > actions on the same pipeline.
> >
> > In general, it looks to me that ct() action in the datapath is
> > not supposed to cache any connection data inside the packet's metadata.
> > This seems to be the root cause of the problem.  Fields that OF knows
> > about should not trigger any issues if carried along with a packet,
> > but datapath-specific metadata can not be cleared, because OFproto
> > simply doesn't know about it.
> >
> > But, I guess, not caching the connection and other internal structures
> > might significantly impact datapath performance.  Will it?
> >
> > Looking at the reproducer in [3], it has, for example, the flow with the
> > following actions:
> >
> >   actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
> >   set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
> >   set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
> >   ct(zone=8),recirc(0x4)
> >
> > So, if the first ct() will change some datapath-specific packet 
> > metadata,
> > the second ct() may try to use that information.
> >
> > It looks like during the flow translation we must add ct_clear action
> > explicitly after every ct() action, unless it was the last action in
> > the list.  This will end up with adding a lot of ct_clear actions
> > everywhere.
> >
> > Another option is the patch below that tries to track if ct_clear
> > is required and adds that action before the next ct() action in
> > the datapath.
> >
> > BTW, the test [3] fails with both kernel and userspace datapaths.
> >
> > The change below should fix the problem, I think.
> > It looks like we also have to put

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-05-24 Thread Ilya Maximets
On 5/24/22 12:54, Frode Nordahl wrote:
> On Mon, May 23, 2022 at 3:49 PM Ilya Maximets  wrote:
>>
>> On 5/21/22 12:49, Frode Nordahl wrote:
>>> On Thu, May 19, 2022 at 3:39 PM Frode Nordahl
>>>  wrote:

 On Sat, May 14, 2022 at 2:10 AM Ilya Maximets  wrote:
>
> On 5/13/22 10:36, Frode Nordahl wrote:
>> On Fri, Mar 11, 2022 at 2:04 PM Liam Young  
>> wrote:
>>>
>>> Hi,
>>>
>>>  Commit 355fef6f2 seems to break connectivity in my setup
>>
>> After OVN started colocating sNAT and dNAT operations in the same CT
>> zone [0] the above mentioned OVS commit appears to also break OVN [1].
>> I have been discussing with Numan and he has found a correlation with
>> the behavior of the open vswitch kernel data path conntrack
>> `skb_nfct_cached` function, i.e. if that is changed to always return
>> 'false' the erratic behavior disappears.
>>
>> So I guess the question then becomes, is this a OVS userspace or OVS
>> kernel problem?
>>
>> We have a reproducer in [3].
>>
>> 0: 
>> https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a
>> 1: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
>> 2: 
>> https://elixir.bootlin.com/linux/latest/source/net/openvswitch/conntrack.c#L683
>> 3: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6
>>
>
> Hrm.  I think, there is a logical bug in implementations of ct()
> datapath action in both datapaths.
>
> The problem appears when the OpenFlow pipeline for the packet contains
> several ct() actions.  NXAST_CT action (i.e. every ct() action) must
> always put the packet into an untracked state.  Tracked state will be
> set only in the 'recirc_table' where the packet is cloned by the ct()
> action for further processing.
>
> If an OF pipeline looks like this:
>
>   actions=ct(),something,something,ct(),something
>
> each action must be entered with the 'untracked' conntrack state in
> the packet metadata.
>
> However, ct() action in the datapath, unlike OpenFlow, doesn't work
> this way.  It modifies the conntrack state for the packet it is 
> processing.
> During the flow translation OVS inserts recirculation right after the
> datapath ct() action.  This way conntrack state affects only processing
> after recirculation.  Sort of.
>
> The issue is that not all OpenFlow ct() actions have recirc_table
> specified.  These actions supposed to change the state of the conntrack
> subsystem, but they still must leave the packet itself in the untracked
> state with no strings attached.  But since datapath ct() actions doesn't
> work that way and since recirculation is not inserted (no recirc_table
> specified), packet after conntrack execution carries the state along to
> all other actions.
> This doesn't impact normal actions, but seems to break subsequent ct()
> actions on the same pipeline.
>
> In general, it looks to me that ct() action in the datapath is
> not supposed to cache any connection data inside the packet's metadata.
> This seems to be the root cause of the problem.  Fields that OF knows
> about should not trigger any issues if carried along with a packet,
> but datapath-specific metadata can not be cleared, because OFproto
> simply doesn't know about it.
>
> But, I guess, not caching the connection and other internal structures
> might significantly impact datapath performance.  Will it?
>
> Looking at the reproducer in [3], it has, for example, the flow with the
> following actions:
>
>   actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
>   set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
>   set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
>   ct(zone=8),recirc(0x4)
>
> So, if the first ct() will change some datapath-specific packet metadata,
> the second ct() may try to use that information.
>
> It looks like during the flow translation we must add ct_clear action
> explicitly after every ct() action, unless it was the last action in
> the list.  This will end up with adding a lot of ct_clear actions
> everywhere.
>
> Another option is the patch below that tries to track if ct_clear
> is required and adds that action before the next ct() action in
> the datapath.
>
> BTW, the test [3] fails with both kernel and userspace datapaths.
>
> The change below should fix the problem, I think.
> It looks like we also have to put ct_clear action before translating
> output to the patch port if we're in 'conntracked' state.
>
> And I do not know how to fix the problem for kernels without ct_clear
> support.  I don't think we can clear metadata that is internal to
> the kernel.  Ideas are welcome.
>>

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-05-24 Thread Frode Nordahl
On Mon, May 23, 2022 at 3:49 PM Ilya Maximets  wrote:
>
> On 5/21/22 12:49, Frode Nordahl wrote:
> > On Thu, May 19, 2022 at 3:39 PM Frode Nordahl
> >  wrote:
> >>
> >> On Sat, May 14, 2022 at 2:10 AM Ilya Maximets  wrote:
> >>>
> >>> On 5/13/22 10:36, Frode Nordahl wrote:
>  On Fri, Mar 11, 2022 at 2:04 PM Liam Young  
>  wrote:
> >
> > Hi,
> >
> >  Commit 355fef6f2 seems to break connectivity in my setup
> 
>  After OVN started colocating sNAT and dNAT operations in the same CT
>  zone [0] the above mentioned OVS commit appears to also break OVN [1].
>  I have been discussing with Numan and he has found a correlation with
>  the behavior of the open vswitch kernel data path conntrack
>  `skb_nfct_cached` function, i.e. if that is changed to always return
>  'false' the erratic behavior disappears.
> 
>  So I guess the question then becomes, is this a OVS userspace or OVS
>  kernel problem?
> 
>  We have a reproducer in [3].
> 
>  0: 
>  https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a
>  1: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
>  2: 
>  https://elixir.bootlin.com/linux/latest/source/net/openvswitch/conntrack.c#L683
>  3: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6
> 
> >>>
> >>> Hrm.  I think, there is a logical bug in implementations of ct()
> >>> datapath action in both datapaths.
> >>>
> >>> The problem appears when the OpenFlow pipeline for the packet contains
> >>> several ct() actions.  NXAST_CT action (i.e. every ct() action) must
> >>> always put the packet into an untracked state.  Tracked state will be
> >>> set only in the 'recirc_table' where the packet is cloned by the ct()
> >>> action for further processing.
> >>>
> >>> If an OF pipeline looks like this:
> >>>
> >>>   actions=ct(),something,something,ct(),something
> >>>
> >>> each action must be entered with the 'untracked' conntrack state in
> >>> the packet metadata.
> >>>
> >>> However, ct() action in the datapath, unlike OpenFlow, doesn't work
> >>> this way.  It modifies the conntrack state for the packet it is 
> >>> processing.
> >>> During the flow translation OVS inserts recirculation right after the
> >>> datapath ct() action.  This way conntrack state affects only processing
> >>> after recirculation.  Sort of.
> >>>
> >>> The issue is that not all OpenFlow ct() actions have recirc_table
> >>> specified.  These actions supposed to change the state of the conntrack
> >>> subsystem, but they still must leave the packet itself in the untracked
> >>> state with no strings attached.  But since datapath ct() actions doesn't
> >>> work that way and since recirculation is not inserted (no recirc_table
> >>> specified), packet after conntrack execution carries the state along to
> >>> all other actions.
> >>> This doesn't impact normal actions, but seems to break subsequent ct()
> >>> actions on the same pipeline.
> >>>
> >>> In general, it looks to me that ct() action in the datapath is
> >>> not supposed to cache any connection data inside the packet's metadata.
> >>> This seems to be the root cause of the problem.  Fields that OF knows
> >>> about should not trigger any issues if carried along with a packet,
> >>> but datapath-specific metadata can not be cleared, because OFproto
> >>> simply doesn't know about it.
> >>>
> >>> But, I guess, not caching the connection and other internal structures
> >>> might significantly impact datapath performance.  Will it?
> >>>
> >>> Looking at the reproducer in [3], it has, for example, the flow with the
> >>> following actions:
> >>>
> >>>   actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
> >>>   set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
> >>>   set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
> >>>   ct(zone=8),recirc(0x4)
> >>>
> >>> So, if the first ct() will change some datapath-specific packet metadata,
> >>> the second ct() may try to use that information.
> >>>
> >>> It looks like during the flow translation we must add ct_clear action
> >>> explicitly after every ct() action, unless it was the last action in
> >>> the list.  This will end up with adding a lot of ct_clear actions
> >>> everywhere.
> >>>
> >>> Another option is the patch below that tries to track if ct_clear
> >>> is required and adds that action before the next ct() action in
> >>> the datapath.
> >>>
> >>> BTW, the test [3] fails with both kernel and userspace datapaths.
> >>>
> >>> The change below should fix the problem, I think.
> >>> It looks like we also have to put ct_clear action before translating
> >>> output to the patch port if we're in 'conntracked' state.
> >>>
> >>> And I do not know how to fix the problem for kernels without ct_clear
> >>> support.  I don't think we can clear metadata that is internal to
> >>> the kernel.  Ideas are welcome.
> >>>
> >>> CC: Aaron, Paolo.
> >>>
> >>> Any

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-05-23 Thread Ilya Maximets
On 5/21/22 12:49, Frode Nordahl wrote:
> On Thu, May 19, 2022 at 3:39 PM Frode Nordahl
>  wrote:
>>
>> On Sat, May 14, 2022 at 2:10 AM Ilya Maximets  wrote:
>>>
>>> On 5/13/22 10:36, Frode Nordahl wrote:
 On Fri, Mar 11, 2022 at 2:04 PM Liam Young  
 wrote:
>
> Hi,
>
>  Commit 355fef6f2 seems to break connectivity in my setup

 After OVN started colocating sNAT and dNAT operations in the same CT
 zone [0] the above mentioned OVS commit appears to also break OVN [1].
 I have been discussing with Numan and he has found a correlation with
 the behavior of the open vswitch kernel data path conntrack
 `skb_nfct_cached` function, i.e. if that is changed to always return
 'false' the erratic behavior disappears.

 So I guess the question then becomes, is this a OVS userspace or OVS
 kernel problem?

 We have a reproducer in [3].

 0: 
 https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a
 1: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
 2: 
 https://elixir.bootlin.com/linux/latest/source/net/openvswitch/conntrack.c#L683
 3: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6

>>>
>>> Hrm.  I think, there is a logical bug in implementations of ct()
>>> datapath action in both datapaths.
>>>
>>> The problem appears when the OpenFlow pipeline for the packet contains
>>> several ct() actions.  NXAST_CT action (i.e. every ct() action) must
>>> always put the packet into an untracked state.  Tracked state will be
>>> set only in the 'recirc_table' where the packet is cloned by the ct()
>>> action for further processing.
>>>
>>> If an OF pipeline looks like this:
>>>
>>>   actions=ct(),something,something,ct(),something
>>>
>>> each action must be entered with the 'untracked' conntrack state in
>>> the packet metadata.
>>>
>>> However, ct() action in the datapath, unlike OpenFlow, doesn't work
>>> this way.  It modifies the conntrack state for the packet it is processing.
>>> During the flow translation OVS inserts recirculation right after the
>>> datapath ct() action.  This way conntrack state affects only processing
>>> after recirculation.  Sort of.
>>>
>>> The issue is that not all OpenFlow ct() actions have recirc_table
>>> specified.  These actions supposed to change the state of the conntrack
>>> subsystem, but they still must leave the packet itself in the untracked
>>> state with no strings attached.  But since datapath ct() actions doesn't
>>> work that way and since recirculation is not inserted (no recirc_table
>>> specified), packet after conntrack execution carries the state along to
>>> all other actions.
>>> This doesn't impact normal actions, but seems to break subsequent ct()
>>> actions on the same pipeline.
>>>
>>> In general, it looks to me that ct() action in the datapath is
>>> not supposed to cache any connection data inside the packet's metadata.
>>> This seems to be the root cause of the problem.  Fields that OF knows
>>> about should not trigger any issues if carried along with a packet,
>>> but datapath-specific metadata can not be cleared, because OFproto
>>> simply doesn't know about it.
>>>
>>> But, I guess, not caching the connection and other internal structures
>>> might significantly impact datapath performance.  Will it?
>>>
>>> Looking at the reproducer in [3], it has, for example, the flow with the
>>> following actions:
>>>
>>>   actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
>>>   set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
>>>   set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
>>>   ct(zone=8),recirc(0x4)
>>>
>>> So, if the first ct() will change some datapath-specific packet metadata,
>>> the second ct() may try to use that information.
>>>
>>> It looks like during the flow translation we must add ct_clear action
>>> explicitly after every ct() action, unless it was the last action in
>>> the list.  This will end up with adding a lot of ct_clear actions
>>> everywhere.
>>>
>>> Another option is the patch below that tries to track if ct_clear
>>> is required and adds that action before the next ct() action in
>>> the datapath.
>>>
>>> BTW, the test [3] fails with both kernel and userspace datapaths.
>>>
>>> The change below should fix the problem, I think.
>>> It looks like we also have to put ct_clear action before translating
>>> output to the patch port if we're in 'conntracked' state.
>>>
>>> And I do not know how to fix the problem for kernels without ct_clear
>>> support.  I don't think we can clear metadata that is internal to
>>> the kernel.  Ideas are welcome.
>>>
>>> CC: Aaron, Paolo.
>>>
>>> Any thoughts?
>>
>> Thank you so much for the detailed explanation of what's going on and
>> for providing a proposed fix.
>>
>> I took it for a spin and it does indeed appear to fix the OVN hairpin
>> issue, but it does unfortunately not appear to fix the issue Liam
>> reported for 

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-05-21 Thread Frode Nordahl
On Thu, May 19, 2022 at 3:39 PM Frode Nordahl
 wrote:
>
> On Sat, May 14, 2022 at 2:10 AM Ilya Maximets  wrote:
> >
> > On 5/13/22 10:36, Frode Nordahl wrote:
> > > On Fri, Mar 11, 2022 at 2:04 PM Liam Young  
> > > wrote:
> > >>
> > >> Hi,
> > >>
> > >>  Commit 355fef6f2 seems to break connectivity in my setup
> > >
> > > After OVN started colocating sNAT and dNAT operations in the same CT
> > > zone [0] the above mentioned OVS commit appears to also break OVN [1].
> > > I have been discussing with Numan and he has found a correlation with
> > > the behavior of the open vswitch kernel data path conntrack
> > > `skb_nfct_cached` function, i.e. if that is changed to always return
> > > 'false' the erratic behavior disappears.
> > >
> > > So I guess the question then becomes, is this a OVS userspace or OVS
> > > kernel problem?
> > >
> > > We have a reproducer in [3].
> > >
> > > 0: 
> > > https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a
> > > 1: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
> > > 2: 
> > > https://elixir.bootlin.com/linux/latest/source/net/openvswitch/conntrack.c#L683
> > > 3: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6
> > >
> >
> > Hrm.  I think, there is a logical bug in implementations of ct()
> > datapath action in both datapaths.
> >
> > The problem appears when the OpenFlow pipeline for the packet contains
> > several ct() actions.  NXAST_CT action (i.e. every ct() action) must
> > always put the packet into an untracked state.  Tracked state will be
> > set only in the 'recirc_table' where the packet is cloned by the ct()
> > action for further processing.
> >
> > If an OF pipeline looks like this:
> >
> >   actions=ct(),something,something,ct(),something
> >
> > each action must be entered with the 'untracked' conntrack state in
> > the packet metadata.
> >
> > However, ct() action in the datapath, unlike OpenFlow, doesn't work
> > this way.  It modifies the conntrack state for the packet it is processing.
> > During the flow translation OVS inserts recirculation right after the
> > datapath ct() action.  This way conntrack state affects only processing
> > after recirculation.  Sort of.
> >
> > The issue is that not all OpenFlow ct() actions have recirc_table
> > specified.  These actions supposed to change the state of the conntrack
> > subsystem, but they still must leave the packet itself in the untracked
> > state with no strings attached.  But since datapath ct() actions doesn't
> > work that way and since recirculation is not inserted (no recirc_table
> > specified), packet after conntrack execution carries the state along to
> > all other actions.
> > This doesn't impact normal actions, but seems to break subsequent ct()
> > actions on the same pipeline.
> >
> > In general, it looks to me that ct() action in the datapath is
> > not supposed to cache any connection data inside the packet's metadata.
> > This seems to be the root cause of the problem.  Fields that OF knows
> > about should not trigger any issues if carried along with a packet,
> > but datapath-specific metadata can not be cleared, because OFproto
> > simply doesn't know about it.
> >
> > But, I guess, not caching the connection and other internal structures
> > might significantly impact datapath performance.  Will it?
> >
> > Looking at the reproducer in [3], it has, for example, the flow with the
> > following actions:
> >
> >   actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
> >   set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
> >   set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
> >   ct(zone=8),recirc(0x4)
> >
> > So, if the first ct() will change some datapath-specific packet metadata,
> > the second ct() may try to use that information.
> >
> > It looks like during the flow translation we must add ct_clear action
> > explicitly after every ct() action, unless it was the last action in
> > the list.  This will end up with adding a lot of ct_clear actions
> > everywhere.
> >
> > Another option is the patch below that tries to track if ct_clear
> > is required and adds that action before the next ct() action in
> > the datapath.
> >
> > BTW, the test [3] fails with both kernel and userspace datapaths.
> >
> > The change below should fix the problem, I think.
> > It looks like we also have to put ct_clear action before translating
> > output to the patch port if we're in 'conntracked' state.
> >
> > And I do not know how to fix the problem for kernels without ct_clear
> > support.  I don't think we can clear metadata that is internal to
> > the kernel.  Ideas are welcome.
> >
> > CC: Aaron, Paolo.
> >
> > Any thoughts?
>
> Thank you so much for the detailed explanation of what's going on and
> for providing a proposed fix.
>
> I took it for a spin and it does indeed appear to fix the OVN hairpin
> issue, but it does unfortunately not appear to fix the issue Liam
> reported for the ML2/OVS use case [4].

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-05-19 Thread Frode Nordahl
On Sat, May 14, 2022 at 2:10 AM Ilya Maximets  wrote:
>
> On 5/13/22 10:36, Frode Nordahl wrote:
> > On Fri, Mar 11, 2022 at 2:04 PM Liam Young  wrote:
> >>
> >> Hi,
> >>
> >>  Commit 355fef6f2 seems to break connectivity in my setup
> >
> > After OVN started colocating sNAT and dNAT operations in the same CT
> > zone [0] the above mentioned OVS commit appears to also break OVN [1].
> > I have been discussing with Numan and he has found a correlation with
> > the behavior of the open vswitch kernel data path conntrack
> > `skb_nfct_cached` function, i.e. if that is changed to always return
> > 'false' the erratic behavior disappears.
> >
> > So I guess the question then becomes, is this a OVS userspace or OVS
> > kernel problem?
> >
> > We have a reproducer in [3].
> >
> > 0: 
> > https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a
> > 1: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
> > 2: 
> > https://elixir.bootlin.com/linux/latest/source/net/openvswitch/conntrack.c#L683
> > 3: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6
> >
>
> Hrm.  I think, there is a logical bug in implementations of ct()
> datapath action in both datapaths.
>
> The problem appears when the OpenFlow pipeline for the packet contains
> several ct() actions.  NXAST_CT action (i.e. every ct() action) must
> always put the packet into an untracked state.  Tracked state will be
> set only in the 'recirc_table' where the packet is cloned by the ct()
> action for further processing.
>
> If an OF pipeline looks like this:
>
>   actions=ct(),something,something,ct(),something
>
> each action must be entered with the 'untracked' conntrack state in
> the packet metadata.
>
> However, ct() action in the datapath, unlike OpenFlow, doesn't work
> this way.  It modifies the conntrack state for the packet it is processing.
> During the flow translation OVS inserts recirculation right after the
> datapath ct() action.  This way conntrack state affects only processing
> after recirculation.  Sort of.
>
> The issue is that not all OpenFlow ct() actions have recirc_table
> specified.  These actions supposed to change the state of the conntrack
> subsystem, but they still must leave the packet itself in the untracked
> state with no strings attached.  But since datapath ct() actions doesn't
> work that way and since recirculation is not inserted (no recirc_table
> specified), packet after conntrack execution carries the state along to
> all other actions.
> This doesn't impact normal actions, but seems to break subsequent ct()
> actions on the same pipeline.
>
> In general, it looks to me that ct() action in the datapath is
> not supposed to cache any connection data inside the packet's metadata.
> This seems to be the root cause of the problem.  Fields that OF knows
> about should not trigger any issues if carried along with a packet,
> but datapath-specific metadata can not be cleared, because OFproto
> simply doesn't know about it.
>
> But, I guess, not caching the connection and other internal structures
> might significantly impact datapath performance.  Will it?
>
> Looking at the reproducer in [3], it has, for example, the flow with the
> following actions:
>
>   actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
>   set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
>   set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
>   ct(zone=8),recirc(0x4)
>
> So, if the first ct() will change some datapath-specific packet metadata,
> the second ct() may try to use that information.
>
> It looks like during the flow translation we must add ct_clear action
> explicitly after every ct() action, unless it was the last action in
> the list.  This will end up with adding a lot of ct_clear actions
> everywhere.
>
> Another option is the patch below that tries to track if ct_clear
> is required and adds that action before the next ct() action in
> the datapath.
>
> BTW, the test [3] fails with both kernel and userspace datapaths.
>
> The change below should fix the problem, I think.
> It looks like we also have to put ct_clear action before translating
> output to the patch port if we're in 'conntracked' state.
>
> And I do not know how to fix the problem for kernels without ct_clear
> support.  I don't think we can clear metadata that is internal to
> the kernel.  Ideas are welcome.
>
> CC: Aaron, Paolo.
>
> Any thoughts?

Thank you so much for the detailed explanation of what's going on and
for providing a proposed fix.

I took it for a spin and it does indeed appear to fix the OVN hairpin
issue, but it does unfortunately not appear to fix the issue Liam
reported for the ML2/OVS use case [4].

When trying that use case with your patch I see this in the Open vSwitch log:
2022-05-19T08:17:02.668Z|1|ofproto_dpif_xlate(handler1)|WARN|over
max translation depth 64 on bridge br-int while processing
ct_state=new|trk,ct_ipv6_src=fc00:24:159c:555b:f816:3eff:fe8b:3ad0,ct_ipv6_

Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup

2022-05-13 Thread Ilya Maximets
On 5/13/22 10:36, Frode Nordahl wrote:
> On Fri, Mar 11, 2022 at 2:04 PM Liam Young  wrote:
>>
>> Hi,
>>
>>  Commit 355fef6f2 seems to break connectivity in my setup
> 
> After OVN started colocating sNAT and dNAT operations in the same CT
> zone [0] the above mentioned OVS commit appears to also break OVN [1].
> I have been discussing with Numan and he has found a correlation with
> the behavior of the open vswitch kernel data path conntrack
> `skb_nfct_cached` function, i.e. if that is changed to always return
> 'false' the erratic behavior disappears.
> 
> So I guess the question then becomes, is this a OVS userspace or OVS
> kernel problem?
> 
> We have a reproducer in [3].
> 
> 0: 
> https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a
> 1: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
> 2: 
> https://elixir.bootlin.com/linux/latest/source/net/openvswitch/conntrack.c#L683
> 3: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6
> 

Hrm.  I think, there is a logical bug in implementations of ct()
datapath action in both datapaths.

The problem appears when the OpenFlow pipeline for the packet contains
several ct() actions.  NXAST_CT action (i.e. every ct() action) must
always put the packet into an untracked state.  Tracked state will be
set only in the 'recirc_table' where the packet is cloned by the ct()
action for further processing.

If an OF pipeline looks like this:

  actions=ct(),something,something,ct(),something

each action must be entered with the 'untracked' conntrack state in
the packet metadata.

However, ct() action in the datapath, unlike OpenFlow, doesn't work
this way.  It modifies the conntrack state for the packet it is processing.
During the flow translation OVS inserts recirculation right after the
datapath ct() action.  This way conntrack state affects only processing
after recirculation.  Sort of.

The issue is that not all OpenFlow ct() actions have recirc_table
specified.  These actions supposed to change the state of the conntrack
subsystem, but they still must leave the packet itself in the untracked
state with no strings attached.  But since datapath ct() actions doesn't
work that way and since recirculation is not inserted (no recirc_table
specified), packet after conntrack execution carries the state along to
all other actions.
This doesn't impact normal actions, but seems to break subsequent ct()
actions on the same pipeline.

In general, it looks to me that ct() action in the datapath is
not supposed to cache any connection data inside the packet's metadata.
This seems to be the root cause of the problem.  Fields that OF knows
about should not trigger any issues if carried along with a packet,
but datapath-specific metadata can not be cleared, because OFproto
simply doesn't know about it.

But, I guess, not caching the connection and other internal structures
might significantly impact datapath performance.  Will it?

Looking at the reproducer in [3], it has, for example, the flow with the
following actions:

  actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
  set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
  set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
  ct(zone=8),recirc(0x4)

So, if the first ct() will change some datapath-specific packet metadata,
the second ct() may try to use that information.

It looks like during the flow translation we must add ct_clear action
explicitly after every ct() action, unless it was the last action in
the list.  This will end up with adding a lot of ct_clear actions
everywhere.

Another option is the patch below that tries to track if ct_clear
is required and adds that action before the next ct() action in
the datapath.

BTW, the test [3] fails with both kernel and userspace datapaths.

The change below should fix the problem, I think.
It looks like we also have to put ct_clear action before translating
output to the patch port if we're in 'conntracked' state.

And I do not know how to fix the problem for kernels without ct_clear
support.  I don't think we can clear metadata that is internal to
the kernel.  Ideas are welcome.

CC: Aaron, Paolo.

Any thoughts?

Best regards, Ilya Maximets.

---
diff --git a/include/openvswitch/ofp-packet.h b/include/openvswitch/ofp-packet.h
index 77128d829..26b07e07f 100644
--- a/include/openvswitch/ofp-packet.h
+++ b/include/openvswitch/ofp-packet.h
@@ -133,6 +133,9 @@ struct ofputil_packet_in_private {
 /* NXCPT_CONNTRACKED. */
 bool conntracked;
 
+/* NXCPT_PENDING_CT_CLEAR. */
+bool pending_ct_clear;
+
 /* NXCPT_ACTIONS. */
 struct ofpact *actions;
 size_t actions_len;
diff --git a/lib/ofp-packet.c b/lib/ofp-packet.c
index 9485ddfc9..1df092471 100644
--- a/lib/ofp-packet.c
+++ b/lib/ofp-packet.c
@@ -425,6 +425,7 @@ enum nx_continuation_prop_type {
 NXCPT_ACTIONS,
 NXCPT_ACTION_SET,
 NXCPT_ODP_PORT,
+NXCPT_PENDING_CT_CLEAR,
 };
 
 /* Only NXT_PACKET_IN2 (not NXT