Re: [ovs-dev] ovn ping from VM to external gateway IP failed.

2017-01-02 Thread Dong Jun



On 2017/1/3 12:59, Numan Siddique wrote:

On Tue, Jan 3, 2017 at 2:06 AM, Mickey Spiegel wrote:

On Mon, Jan 2, 2017 at 3:46 AM, Numan Siddique wrote:

On Mon, Jan 2, 2017 at 2:07 AM, Mickey Spiegel wrote:

On Sun, Jan 1, 2017 at 10:31 AM, Numan Siddique wrote:

On Sun, Jan 1, 2017 at 6:39 AM, Mickey Spiegel wrote:

On Sat, Dec 31, 2016 at 1:19 AM, Mickey Spiegel wrote:

On Fri, Dec 30, 2016 at 11:37 AM, Mickey Spiegel wrote:

On Fri, Dec 30, 2016 at 7:46 AM, Numan Siddique wrote:

On Fri, Dec 30, 2016 at 5:36 PM, Dong Jun wrote:




​
Hi Dong Jun, I am also facing the same issue on my setup.

These are the findings of my investigation so far.

It looks like this issue is seen after the commit
https://github.com/openvswitch/ovs/commit/f1a8bd06d58f2c5312622fbaeacbc6ce7576e347
which removes the usage of patch ports and uses the clone action instead.

I reverted to the commit just before it and SNAT/DNAT is working as
expected.

In my case, the gateway router is hosted on node 1 and I am trying to
reach a VM (192.168.0.5) hosted on node 2 using the external IP
(10.2.7.105) associated with it. I could see that node 1 is sending
the packet to node 2 through the geneve tunnel, but it is dropped by
the node 2 flows.

Below is the tcpdump of the packet

**
19:39:44.709907 IP 182.16.0.16.60069 > 182.16.0.15.geneve: Geneve, Flags [none], vni 0x1: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP echo request, id 13240, seq 1, length 64
***

Below is the tcpdump of the packet with the ovn-controller (without the
above commit) in the working case

**
19:41:56.783570 IP 182.16.0.12.29778 > 182.16.0.15.geneve: Geneve, Flags [C], vni 0x1, options [8 bytes]: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP echo request, id 13308, seq 1, length 64
19:41:56.784270 IP 182.16.0.15.14539 > 182.16.0.12.geneve: Geneve, Flags [C], vni 0xf, options [8 bytes]: IP 192.168.0.5 > nusiddiq.blr.redhat.com: ICMP echo reply, id 13308, seq 1, length 64
  

Re: [ovs-dev] ovn ping from VM to external gateway IP failed.

2017-01-02 Thread Mickey Spiegel
On Mon, Jan 2, 2017 at 8:59 PM, Numan Siddique  wrote:

>
>
> On Tue, Jan 3, 2017 at 2:06 AM, Mickey Spiegel 
> wrote:
>
>>
>> On Mon, Jan 2, 2017 at 3:46 AM, Numan Siddique 
>> wrote:
>>
>>>
>>>
>>> On Mon, Jan 2, 2017 at 2:07 AM, Mickey Spiegel 
>>> wrote:
>>>

 On Sun, Jan 1, 2017 at 10:31 AM, Numan Siddique 
 wrote:

>
>
> On Sun, Jan 1, 2017 at 6:39 AM, Mickey Spiegel 
> wrote:
>
>>
>> On Sat, Dec 31, 2016 at 1:19 AM, Mickey Spiegel <
>> mickeys@gmail.com> wrote:
>>
>>>
>>> On Fri, Dec 30, 2016 at 11:37 AM, Mickey Spiegel <
>>> mickeys@gmail.com> wrote:
>>>

 On Fri, Dec 30, 2016 at 7:46 AM, Numan Siddique <
 nusid...@redhat.com> wrote:

> On Fri, Dec 30, 2016 at 5:36 PM, Dong Jun 
> wrote:
>

 


> ​
> Hi Dong Jun, I am also facing the same issue on my setup.
> ​
> These are the findings of my investigation so far
>
> Looks like this issue is seen after the commit
> https://github.com/openvswitch/ovs/commit/f1a8bd06d58f2c5312622fbaeacbc6ce7576e347
> ​
> which removes the usage of patch ports and uses the clone action
> instead.
> ​
>
> I reverted to the commit just before it and SNAT/DNAT is working as
> expected.
>
> In my case, the gateway router is hosted on node 1 and the I am
> trying to
> reach a VM (192.168.0.5) hosted on node 2 using the external ip
> (10.2.7.105) associated ​with it. I could see that the node 1 is
> sending
> the packet to node 2 through the geneve tunnel, but it is dropped
> by node 2
> flows.
>
> Below is the tcpdump of the packet
>
> **
> 19:39:44.709907 IP 182.16.0.16.60069 > 182.16.0.15.geneve: Geneve,
> Flags
> [none], vni 0x1: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP
> echo
> request, id 13240, seq 1, length 64
> ***
>
> Below is the tcpdump of the packet with the ovn-controller
> (without the
> above commit) in the working case
>
> **
> 19:41:56.783570 IP 182.16.0.12.29778 > 182.16.0.15.geneve: Geneve,
> Flags
> [C], vni 0x1, options [8 bytes]: IP nusiddiq.blr.redhat.com >
> 192.168.0.5:
> ICMP echo request, id 13308, seq 1, length 64
> 19:41:56.784270 IP 182.16.0.15.14539 > 182.16.0.12.geneve: Geneve,
> Flags
> [C], vni 0xf, options [8 bytes]: IP 192.168.0.5 >
> nusiddiq.blr.redhat.com:
> ICMP echo reply, id 13308, seq 1, length 64
> **
>
> The options data has - 00030005
>
> From the packet, I could see that the packet from node 1 is
> missing the
> geneve option fields which has inport and outport keys.
>

 I am facing the same issue running my distributed NAT patch set.
 Between UNSNAT recirc and output to tunnel, a megaflow is installed
 that
 is missing the geneve option fields.

 I verified that the table=32 openflow rule has the geneve option
 fields.
 ofproto/trace shows geneve in the "Datapath actions" at the end, so
 no
 problem with whatever ofproto/trace is using.

>>>
>>> Throwing some logs in, I see that flow->metadata.present.map is 0
>>> rather
>>> than 1 coming into tun_metadata_to_geneve_nlattr() in
>>> lib/tun-metadata.c,
>>> when the problem occurs. That is why the geneve option fields are
>>> missing.
>>>
>>> I have not yet figured out why flow->metadata.present.map is 0. It
>>> should
>>> be modified when tun_metadata_write() is called due to actions
>>> setting
>>> tunnel metadata values. I have not checked that yet.
>>>
>>
>> I just posted a fix. I did not try it with the gateway router or with
>> OpenStack,
>> but with this bug fix all distributed NAT manual test cases are now
>> passing.
>>
>>
> ​Thanks for the fix. I just tested it. Its working when I am trying to
> reach the ​VM using its floating ip. But not when trying to ping
> www.google.com from the VM (SNAT use case)
>

 With distributed NAT, most of my debugging and tests were using SNAT.
 The bug fix that I posted fixed the problem that was causing ICMP echo
 replies to be dropped. The openflow path for distributed SNAT is similar to
 that for SNAT on gateway routers, but there are still some 

Re: [ovs-dev] ovn ping from VM to external gateway IP failed.

2017-01-02 Thread Numan Siddique
On Tue, Jan 3, 2017 at 2:06 AM, Mickey Spiegel 
wrote:

>
> On Mon, Jan 2, 2017 at 3:46 AM, Numan Siddique 
> wrote:
>
>>
>>
>> On Mon, Jan 2, 2017 at 2:07 AM, Mickey Spiegel 
>> wrote:
>>
>>>
>>> On Sun, Jan 1, 2017 at 10:31 AM, Numan Siddique 
>>> wrote:
>>>


 On Sun, Jan 1, 2017 at 6:39 AM, Mickey Spiegel 
 wrote:

>
> On Sat, Dec 31, 2016 at 1:19 AM, Mickey Spiegel  > wrote:
>
>>
>> On Fri, Dec 30, 2016 at 11:37 AM, Mickey Spiegel <
>> mickeys@gmail.com> wrote:
>>
>>>
>>> On Fri, Dec 30, 2016 at 7:46 AM, Numan Siddique >> > wrote:
>>>
 On Fri, Dec 30, 2016 at 5:36 PM, Dong Jun 
 wrote:

>>>
>>> 
>>>
>>>
 ​
 Hi Dong Jun, I am also facing the same issue on my setup.
 ​
 These are the findings of my investigation so far

 Looks like this issue is seen after the commit
 https://github.com/openvswitch/ovs/commit/f1a8bd06d58f2c5312622fbaeacbc6ce7576e347
 ​
 which removes the usage of patch ports and uses the clone action
 instead.
 ​

 I reverted to the commit just before it and SNAT/DNAT is working as
 expected.

 In my case, the gateway router is hosted on node 1 and the I am
 trying to
 reach a VM (192.168.0.5) hosted on node 2 using the external ip
 (10.2.7.105) associated ​with it. I could see that the node 1 is
 sending
 the packet to node 2 through the geneve tunnel, but it is dropped
 by node 2
 flows.

 Below is the tcpdump of the packet

 **
 19:39:44.709907 IP 182.16.0.16.60069 > 182.16.0.15.geneve: Geneve,
 Flags
 [none], vni 0x1: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP
 echo
 request, id 13240, seq 1, length 64
 ***

 Below is the tcpdump of the packet with the ovn-controller (without
 the
 above commit) in the working case

 **
 19:41:56.783570 IP 182.16.0.12.29778 > 182.16.0.15.geneve: Geneve,
 Flags
 [C], vni 0x1, options [8 bytes]: IP nusiddiq.blr.redhat.com >
 192.168.0.5:
 ICMP echo request, id 13308, seq 1, length 64
 19:41:56.784270 IP 182.16.0.15.14539 > 182.16.0.12.geneve: Geneve,
 Flags
 [C], vni 0xf, options [8 bytes]: IP 192.168.0.5 >
 nusiddiq.blr.redhat.com:
 ICMP echo reply, id 13308, seq 1, length 64
 **

 The options data has - 00030005

 From the packet, I could see that the packet from node 1 is missing
 the
 geneve option fields which has inport and outport keys.

>>>
>>> I am facing the same issue running my distributed NAT patch set.
>>> Between UNSNAT recirc and output to tunnel, a megaflow is installed
>>> that
>>> is missing the geneve option fields.
>>>
>>> I verified that the table=32 openflow rule has the geneve option
>>> fields.
>>> ofproto/trace shows geneve in the "Datapath actions" at the end, so
>>> no
>>> problem with whatever ofproto/trace is using.
>>>
>>
>> Throwing some logs in, I see that flow->metadata.present.map is 0
>> rather
>> than 1 coming into tun_metadata_to_geneve_nlattr() in
>> lib/tun-metadata.c,
>> when the problem occurs. That is why the geneve option fields are
>> missing.
>>
>> I have not yet figured out why flow->metadata.present.map is 0. It
>> should
>> be modified when tun_metadata_write() is called due to actions setting
>> tunnel metadata values. I have not checked that yet.
>>
>
> I just posted a fix. I did not try it with the gateway router or with
> OpenStack,
> but with this bug fix all distributed NAT manual test cases are now
> passing.
>
>
 ​Thanks for the fix. I just tested it. Its working when I am trying to
 reach the ​VM using its floating ip. But not when trying to ping
 www.google.com from the VM (SNAT use case)

>>>
>>> With distributed NAT, most of my debugging and tests were using SNAT.
>>> The bug fix that I posted fixed the problem that was causing ICMP echo
>>> replies to be dropped. The openflow path for distributed SNAT is similar to
>>> that for SNAT on gateway routers, but there are still some differences,
>>> notably one router instead of two routers and no "join" switch. Also I did
>>> not try it with DNS.
>>>
>>> Are you able to debug further, to see whether a missing geneve options
>>> field is still the culprit?
>>> It is 

Re: [ovs-dev] ovn ping from VM to external gateway IP failed.

2017-01-02 Thread Mickey Spiegel
On Mon, Jan 2, 2017 at 3:46 AM, Numan Siddique  wrote:

>
>
> On Mon, Jan 2, 2017 at 2:07 AM, Mickey Spiegel 
> wrote:
>
>>
>> On Sun, Jan 1, 2017 at 10:31 AM, Numan Siddique 
>> wrote:
>>
>>>
>>>
>>> On Sun, Jan 1, 2017 at 6:39 AM, Mickey Spiegel 
>>> wrote:
>>>

 On Sat, Dec 31, 2016 at 1:19 AM, Mickey Spiegel 
 wrote:

>
> On Fri, Dec 30, 2016 at 11:37 AM, Mickey Spiegel <
> mickeys@gmail.com> wrote:
>
>>
>> On Fri, Dec 30, 2016 at 7:46 AM, Numan Siddique 
>> wrote:
>>
>>> On Fri, Dec 30, 2016 at 5:36 PM, Dong Jun  wrote:
>>>
>>
>> 
>>
>>
>>> ​
>>> Hi Dong Jun, I am also facing the same issue on my setup.
>>> ​
>>> These are the findings of my investigation so far
>>>
>>> Looks like this issue is seen after the commit
>>> https://github.com/openvswitch/ovs/commit/f1a8bd06d58f2c5312622fbaeacbc6ce7576e347
>>> ​
>>> which removes the usage of patch ports and uses the clone action
>>> instead.
>>> ​
>>>
>>> I reverted to the commit just before it and SNAT/DNAT is working as
>>> expected.
>>>
>>> In my case, the gateway router is hosted on node 1 and the I am
>>> trying to
>>> reach a VM (192.168.0.5) hosted on node 2 using the external ip
>>> (10.2.7.105) associated ​with it. I could see that the node 1 is
>>> sending
>>> the packet to node 2 through the geneve tunnel, but it is dropped by
>>> node 2
>>> flows.
>>>
>>> Below is the tcpdump of the packet
>>>
>>> **
>>> 19:39:44.709907 IP 182.16.0.16.60069 > 182.16.0.15.geneve: Geneve,
>>> Flags
>>> [none], vni 0x1: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP echo
>>> request, id 13240, seq 1, length 64
>>> ***
>>>
>>> Below is the tcpdump of the packet with the ovn-controller (without
>>> the
>>> above commit) in the working case
>>>
>>> **
>>> 19:41:56.783570 IP 182.16.0.12.29778 > 182.16.0.15.geneve: Geneve,
>>> Flags
>>> [C], vni 0x1, options [8 bytes]: IP nusiddiq.blr.redhat.com >
>>> 192.168.0.5:
>>> ICMP echo request, id 13308, seq 1, length 64
>>> 19:41:56.784270 IP 182.16.0.15.14539 > 182.16.0.12.geneve: Geneve,
>>> Flags
>>> [C], vni 0xf, options [8 bytes]: IP 192.168.0.5 >
>>> nusiddiq.blr.redhat.com:
>>> ICMP echo reply, id 13308, seq 1, length 64
>>> **
>>>
>>> The options data has - 00030005
>>>
>>> From the packet, I could see that the packet from node 1 is missing
>>> the
>>> geneve option fields which has inport and outport keys.
>>>
>>
>> I am facing the same issue running my distributed NAT patch set.
>> Between UNSNAT recirc and output to tunnel, a megaflow is installed
>> that
>> is missing the geneve option fields.
>>
>> I verified that the table=32 openflow rule has the geneve option
>> fields.
>> ofproto/trace shows geneve in the "Datapath actions" at the end, so no
>> problem with whatever ofproto/trace is using.
>>
>
> Throwing some logs in, I see that flow->metadata.present.map is 0
> rather
> than 1 coming into tun_metadata_to_geneve_nlattr() in
> lib/tun-metadata.c,
> when the problem occurs. That is why the geneve option fields are
> missing.
>
> I have not yet figured out why flow->metadata.present.map is 0. It
> should
> be modified when tun_metadata_write() is called due to actions setting
> tunnel metadata values. I have not checked that yet.
>

 I just posted a fix. I did not try it with the gateway router or with
 OpenStack,
 but with this bug fix all distributed NAT manual test cases are now
 passing.


>>> ​Thanks for the fix. I just tested it. Its working when I am trying to
>>> reach the ​VM using its floating ip. But not when trying to ping
>>> www.google.com from the VM (SNAT use case)
>>>
>>
>> With distributed NAT, most of my debugging and tests were using SNAT. The
>> bug fix that I posted fixed the problem that was causing ICMP echo replies
>> to be dropped. The openflow path for distributed SNAT is similar to that
>> for SNAT on gateway routers, but there are still some differences, notably
>> one router instead of two routers and no "join" switch. Also I did not try
>> it with DNS.
>>
>> Are you able to debug further, to see whether a missing geneve options
>> field is still the culprit?
>> It is possible that removal of patch ports within br-int uncovered other
>> issues.
>>
>
>
> ​With some testing I could see that in the node where the gateway is
> hosted
>  - The ​reply packet reaches the gateway router pipeline -> to the otls
> switch 

Re: [ovs-dev] ovn ping from VM to external gateway IP failed.

2017-01-02 Thread Numan Siddique
On Mon, Jan 2, 2017 at 2:07 AM, Mickey Spiegel 
wrote:

>
> On Sun, Jan 1, 2017 at 10:31 AM, Numan Siddique 
> wrote:
>
>>
>>
>> On Sun, Jan 1, 2017 at 6:39 AM, Mickey Spiegel 
>> wrote:
>>
>>>
>>> On Sat, Dec 31, 2016 at 1:19 AM, Mickey Spiegel 
>>> wrote:
>>>

 On Fri, Dec 30, 2016 at 11:37 AM, Mickey Spiegel  wrote:

>
> On Fri, Dec 30, 2016 at 7:46 AM, Numan Siddique 
> wrote:
>
>> On Fri, Dec 30, 2016 at 5:36 PM, Dong Jun  wrote:
>>
>
> 
>
>
>> ​
>> Hi Dong Jun, I am also facing the same issue on my setup.
>> ​
>> These are the findings of my investigation so far
>>
>> Looks like this issue is seen after the commit
>> https://github.com/openvswitch/ovs/commit/f1a8bd06d58f2c5312622fbaeacbc6ce7576e347
>> ​
>> which removes the usage of patch ports and uses the clone action
>> instead.
>> ​
>>
>> I reverted to the commit just before it and SNAT/DNAT is working as
>> expected.
>>
>> In my case, the gateway router is hosted on node 1 and the I am
>> trying to
>> reach a VM (192.168.0.5) hosted on node 2 using the external ip
>> (10.2.7.105) associated ​with it. I could see that the node 1 is
>> sending
>> the packet to node 2 through the geneve tunnel, but it is dropped by
>> node 2
>> flows.
>>
>> Below is the tcpdump of the packet
>>
>> **
>> 19:39:44.709907 IP 182.16.0.16.60069 > 182.16.0.15.geneve: Geneve,
>> Flags
>> [none], vni 0x1: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP echo
>> request, id 13240, seq 1, length 64
>> ***
>>
>> Below is the tcpdump of the packet with the ovn-controller (without
>> the
>> above commit) in the working case
>>
>> **
>> 19:41:56.783570 IP 182.16.0.12.29778 > 182.16.0.15.geneve: Geneve,
>> Flags
>> [C], vni 0x1, options [8 bytes]: IP nusiddiq.blr.redhat.com >
>> 192.168.0.5:
>> ICMP echo request, id 13308, seq 1, length 64
>> 19:41:56.784270 IP 182.16.0.15.14539 > 182.16.0.12.geneve: Geneve,
>> Flags
>> [C], vni 0xf, options [8 bytes]: IP 192.168.0.5 >
>> nusiddiq.blr.redhat.com:
>> ICMP echo reply, id 13308, seq 1, length 64
>> **
>>
>> The options data has - 00030005
>>
>> From the packet, I could see that the packet from node 1 is missing
>> the
>> geneve option fields which has inport and outport keys.
>>
>
> I am facing the same issue running my distributed NAT patch set.
> Between UNSNAT recirc and output to tunnel, a megaflow is installed
> that
> is missing the geneve option fields.
>
> I verified that the table=32 openflow rule has the geneve option
> fields.
> ofproto/trace shows geneve in the "Datapath actions" at the end, so no
> problem with whatever ofproto/trace is using.
>

 Throwing some logs in, I see that flow->metadata.present.map is 0 rather
 than 1 coming into tun_metadata_to_geneve_nlattr() in
 lib/tun-metadata.c,
 when the problem occurs. That is why the geneve option fields are
 missing.

 I have not yet figured out why flow->metadata.present.map is 0. It
 should
 be modified when tun_metadata_write() is called due to actions setting
 tunnel metadata values. I have not checked that yet.

>>>
>>> I just posted a fix. I did not try it with the gateway router or with
>>> OpenStack,
>>> but with this bug fix all distributed NAT manual test cases are now
>>> passing.
>>>
>>>
>> ​Thanks for the fix. I just tested it. Its working when I am trying to
>> reach the ​VM using its floating ip. But not when trying to ping
>> www.google.com from the VM (SNAT use case)
>>
>
> With distributed NAT, most of my debugging and tests were using SNAT. The
> bug fix that I posted fixed the problem that was causing ICMP echo replies
> to be dropped. The openflow path for distributed SNAT is similar to that
> for SNAT on gateway routers, but there are still some differences, notably
> one router instead of two routers and no "join" switch. Also I did not try
> it with DNS.
>
> Are you able to debug further, to see whether a missing geneve options
> field is still the culprit?
> It is possible that removal of patch ports within br-int uncovered other
> issues.
>


With some testing I could see that, on the node where the gateway is hosted,
the reply packet reaches the gateway router pipeline -> to the otls
switch pipeline (via clone) -> to the router pipeline -> to the peer port
of the switch.
The packet gets dropped at table 22:

 table=22, n_packets=275, n_bytes=26686, priority=65535,ct_state=+inv+trk,metadata=0x1 actions=drop

Not sure why 

Re: [ovs-dev] ovn ping from VM to external gateway IP failed.

2017-01-01 Thread Mickey Spiegel
On Sun, Jan 1, 2017 at 10:31 AM, Numan Siddique  wrote:

>
>
> On Sun, Jan 1, 2017 at 6:39 AM, Mickey Spiegel 
> wrote:
>
>>
>> On Sat, Dec 31, 2016 at 1:19 AM, Mickey Spiegel 
>> wrote:
>>
>>>
>>> On Fri, Dec 30, 2016 at 11:37 AM, Mickey Spiegel 
>>> wrote:
>>>

 On Fri, Dec 30, 2016 at 7:46 AM, Numan Siddique 
 wrote:

> On Fri, Dec 30, 2016 at 5:36 PM, Dong Jun  wrote:
>

 


> ​
> Hi Dong Jun, I am also facing the same issue on my setup.
> ​
> These are the findings of my investigation so far
>
> Looks like this issue is seen after the commit
> https://github.com/openvswitch/ovs/commit/f1a8bd06d58f2c5312622fbaeacbc6ce7576e347
> ​
> which removes the usage of patch ports and uses the clone action
> instead.
> ​
>
> I reverted to the commit just before it and SNAT/DNAT is working as
> expected.
>
> In my case, the gateway router is hosted on node 1 and the I am trying
> to
> reach a VM (192.168.0.5) hosted on node 2 using the external ip
> (10.2.7.105) associated ​with it. I could see that the node 1 is
> sending
> the packet to node 2 through the geneve tunnel, but it is dropped by
> node 2
> flows.
>
> Below is the tcpdump of the packet
>
> **
> 19:39:44.709907 IP 182.16.0.16.60069 > 182.16.0.15.geneve: Geneve,
> Flags
> [none], vni 0x1: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP echo
> request, id 13240, seq 1, length 64
> ***
>
> Below is the tcpdump of the packet with the ovn-controller (without the
> above commit) in the working case
>
> **
> 19:41:56.783570 IP 182.16.0.12.29778 > 182.16.0.15.geneve: Geneve,
> Flags
> [C], vni 0x1, options [8 bytes]: IP nusiddiq.blr.redhat.com >
> 192.168.0.5:
> ICMP echo request, id 13308, seq 1, length 64
> 19:41:56.784270 IP 182.16.0.15.14539 > 182.16.0.12.geneve: Geneve,
> Flags
> [C], vni 0xf, options [8 bytes]: IP 192.168.0.5 >
> nusiddiq.blr.redhat.com:
> ICMP echo reply, id 13308, seq 1, length 64
> **
>
> The options data has - 00030005
>
> From the packet, I could see that the packet from node 1 is missing the
> geneve option fields which has inport and outport keys.
>

 I am facing the same issue running my distributed NAT patch set.
 Between UNSNAT recirc and output to tunnel, a megaflow is installed that
 is missing the geneve option fields.

 I verified that the table=32 openflow rule has the geneve option fields.
 ofproto/trace shows geneve in the "Datapath actions" at the end, so no
 problem with whatever ofproto/trace is using.

>>>
>>> Throwing some logs in, I see that flow->metadata.present.map is 0 rather
>>> than 1 coming into tun_metadata_to_geneve_nlattr() in
>>> lib/tun-metadata.c,
>>> when the problem occurs. That is why the geneve option fields are
>>> missing.
>>>
>>> I have not yet figured out why flow->metadata.present.map is 0. It should
>>> be modified when tun_metadata_write() is called due to actions setting
>>> tunnel metadata values. I have not checked that yet.
>>>
>>
>> I just posted a fix. I did not try it with the gateway router or with
>> OpenStack,
>> but with this bug fix all distributed NAT manual test cases are now
>> passing.
>>
>>
> ​Thanks for the fix. I just tested it. Its working when I am trying to
> reach the ​VM using its floating ip. But not when trying to ping
> www.google.com from the VM (SNAT use case)
>

With distributed NAT, most of my debugging and tests were using SNAT. The
bug fix that I posted fixed the problem that was causing ICMP echo replies
to be dropped. The openflow path for distributed SNAT is similar to that
for SNAT on gateway routers, but there are still some differences, notably
one router instead of two routers and no "join" switch. Also I did not try
it with DNS.

Are you able to debug further, to see whether a missing geneve options
field is still the culprit?
It is possible that removal of patch ports within br-int uncovered other
issues.

I primarily used ovs-dpctl dump-flows to see installed megaflows, ovs-appctl
ofproto/trace (with recirc_id), and ovs-ofctl dump-flows for initial
debugging. In particular I could see that the installed megaflows were
lacking the geneve options field in the actions.
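
As a rough illustration of that last check, here is a minimal Python sketch that scans text in the style of ovs-dpctl dump-flows output and reports flows whose tunnel set action carries no geneve(...) options. The two sample flow lines are made up for illustration and only approximate the real output format, and the function names are hypothetical helpers, not part of OVS.

```python
def tunnel_actions(line):
    """Yield the text inside each set(tunnel(...)) action found on one flow line."""
    marker = "set(tunnel("
    pos = line.find(marker)
    while pos != -1:
        begin = pos + len(marker)
        depth = 1  # we start just inside "tunnel("
        i = begin
        # Walk forward until the parenthesis that closes "tunnel(" is found.
        while i < len(line) and depth > 0:
            if line[i] == "(":
                depth += 1
            elif line[i] == ")":
                depth -= 1
            i += 1
        # If the line is unbalanced, fall back to the rest of the line.
        yield line[begin:i - 1] if depth == 0 else line[begin:]
        pos = line.find(marker, i)


def flows_missing_geneve_options(dump_text):
    """Return flow lines that set tunnel metadata without any geneve(...) option."""
    missing = []
    for line in dump_text.splitlines():
        for body in tunnel_actions(line):
            if "geneve(" not in body:
                missing.append(line.strip())
                break
    return missing


# Illustrative input only: hand-written lines, not captured from a real system.
SAMPLE_DUMP = """\
recirc_id(0x5),in_port(3),eth_type(0x0800), packets:4, bytes:392, used:0.5s, actions:set(tunnel(tun_id=0x1,dst=182.16.0.15,ttl=64,flags(df|csum|key))),2
recirc_id(0),in_port(3),eth_type(0x0800), packets:9, bytes:882, used:0.1s, actions:set(tunnel(tun_id=0x1,dst=182.16.0.15,ttl=64,geneve({class=0x102,type=0x80,len=4,0x30005}),flags(df|csum|key))),2
"""

if __name__ == "__main__":
    for flow in flows_missing_geneve_options(SAMPLE_DUMP):
        print("no geneve options:", flow)
```

In practice the input would be the saved output of the dump command rather than a hard-coded string; only the flows that pass through a recirculation, as in the failing case, would be expected to show up.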

Mickey


> Numan
>
>
>> Mickey
>>
>>
>>> Mickey
>>>
>>>
 Mickey



>
> Thanks
> Numan
>
>
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> ___
> dev 

Re: [ovs-dev] ovn ping from VM to external gateway IP failed.

2017-01-01 Thread Numan Siddique
On Sun, Jan 1, 2017 at 6:39 AM, Mickey Spiegel 
wrote:

>
> On Sat, Dec 31, 2016 at 1:19 AM, Mickey Spiegel 
> wrote:
>
>>
>> On Fri, Dec 30, 2016 at 11:37 AM, Mickey Spiegel 
>> wrote:
>>
>>>
>>> On Fri, Dec 30, 2016 at 7:46 AM, Numan Siddique 
>>> wrote:
>>>
 On Fri, Dec 30, 2016 at 5:36 PM, Dong Jun  wrote:

>>>
>>> 
>>>
>>>
 ​
 Hi Dong Jun, I am also facing the same issue on my setup.
 ​
 These are the findings of my investigation so far

 Looks like this issue is seen after the commit
 https://github.com/openvswitch/ovs/commit/f1a8bd06d58f2c5312622fbaeacbc6ce7576e347
 ​
 which removes the usage of patch ports and uses the clone action
 instead.
 ​

 I reverted to the commit just before it and SNAT/DNAT is working as
 expected.

 In my case, the gateway router is hosted on node 1 and the I am trying
 to
 reach a VM (192.168.0.5) hosted on node 2 using the external ip
 (10.2.7.105) associated ​with it. I could see that the node 1 is sending
 the packet to node 2 through the geneve tunnel, but it is dropped by
 node 2
 flows.

 Below is the tcpdump of the packet

 **
 19:39:44.709907 IP 182.16.0.16.60069 > 182.16.0.15.geneve: Geneve, Flags
 [none], vni 0x1: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP echo
 request, id 13240, seq 1, length 64
 ***

 Below is the tcpdump of the packet with the ovn-controller (without the
 above commit) in the working case

 **
 19:41:56.783570 IP 182.16.0.12.29778 > 182.16.0.15.geneve: Geneve, Flags
 [C], vni 0x1, options [8 bytes]: IP nusiddiq.blr.redhat.com >
 192.168.0.5:
 ICMP echo request, id 13308, seq 1, length 64
 19:41:56.784270 IP 182.16.0.15.14539 > 182.16.0.12.geneve: Geneve, Flags
 [C], vni 0xf, options [8 bytes]: IP 192.168.0.5 >
 nusiddiq.blr.redhat.com:
 ICMP echo reply, id 13308, seq 1, length 64
 **

 The options data has - 00030005

 From the packet, I could see that the packet from node 1 is missing the
 geneve option fields which has inport and outport keys.

>>>
>>> I am facing the same issue running my distributed NAT patch set.
>>> Between UNSNAT recirc and output to tunnel, a megaflow is installed that
>>> is missing the geneve option fields.
>>>
>>> I verified that the table=32 openflow rule has the geneve option fields.
>>> ofproto/trace shows geneve in the "Datapath actions" at the end, so no
>>> problem with whatever ofproto/trace is using.
>>>
>>
>> Throwing some logs in, I see that flow->metadata.present.map is 0 rather
>> than 1 coming into tun_metadata_to_geneve_nlattr() in lib/tun-metadata.c,
>> when the problem occurs. That is why the geneve option fields are missing.
>>
>> I have not yet figured out why flow->metadata.present.map is 0. It should
>> be modified when tun_metadata_write() is called due to actions setting
>> tunnel metadata values. I have not checked that yet.
>>
>
> I just posted a fix. I did not try it with the gateway router or with
> OpenStack,
> but with this bug fix all distributed NAT manual test cases are now
> passing.
>
>
Thanks for the fix. I just tested it. It's working when I am trying to
reach the VM using its floating IP, but not when trying to ping
www.google.com from the VM (SNAT use case).

Numan


> Mickey
>
>
>> Mickey
>>
>>
>>> Mickey
>>>
>>>
>>>

 Thanks
 Numan



>>>
>>>
>>
>


Re: [ovs-dev] ovn ping from VM to external gateway IP failed.

2016-12-31 Thread Mickey Spiegel
On Fri, Dec 30, 2016 at 11:37 AM, Mickey Spiegel 
wrote:

>
> On Fri, Dec 30, 2016 at 7:46 AM, Numan Siddique 
> wrote:
>
>> On Fri, Dec 30, 2016 at 5:36 PM, Dong Jun  wrote:
>>
>
> 
>
>
>> ​
>> Hi Dong Jun, I am also facing the same issue on my setup.
>> ​
>> These are the findings of my investigation so far
>>
>> Looks like this issue is seen after the commit
>> https://github.com/openvswitch/ovs/commit/f1a8bd06d58f2c5312622fbaeacbc6ce7576e347
>> ​
>> which removes the usage of patch ports and uses the clone action instead.
>> ​
>>
>> I reverted to the commit just before it and SNAT/DNAT is working as
>> expected.
>>
>> In my case, the gateway router is hosted on node 1 and the I am trying to
>> reach a VM (192.168.0.5) hosted on node 2 using the external ip
>> (10.2.7.105) associated ​with it. I could see that the node 1 is sending
>> the packet to node 2 through the geneve tunnel, but it is dropped by node
>> 2
>> flows.
>>
>> Below is the tcpdump of the packet
>>
>> **
>> 19:39:44.709907 IP 182.16.0.16.60069 > 182.16.0.15.geneve: Geneve, Flags
>> [none], vni 0x1: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP echo
>> request, id 13240, seq 1, length 64
>> ***
>>
>> Below is the tcpdump of the packet with the ovn-controller (without the
>> above commit) in the working case
>>
>> **
>> 19:41:56.783570 IP 182.16.0.12.29778 > 182.16.0.15.geneve: Geneve, Flags
>> [C], vni 0x1, options [8 bytes]: IP nusiddiq.blr.redhat.com > 192.168.0.5
>> :
>> ICMP echo request, id 13308, seq 1, length 64
>> 19:41:56.784270 IP 182.16.0.15.14539 > 182.16.0.12.geneve: Geneve, Flags
>> [C], vni 0xf, options [8 bytes]: IP 192.168.0.5 > nusiddiq.blr.redhat.com
>> :
>> ICMP echo reply, id 13308, seq 1, length 64
>> **
>>
>> The options data has - 00030005
>>
>> From the packet, I could see that the packet from node 1 is missing the
>> geneve option fields which has inport and outport keys.
>>
>
> I am facing the same issue running my distributed NAT patch set.
> Between UNSNAT recirc and output to tunnel, a megaflow is installed that
> is missing the geneve option fields.
>
> I verified that the table=32 openflow rule has the geneve option fields.
> ofproto/trace shows geneve in the "Datapath actions" at the end, so no
> problem with whatever ofproto/trace is using.
>

Throwing some logs in, I see that flow->metadata.present.map is 0 rather
than 1 coming into tun_metadata_to_geneve_nlattr() in lib/tun-metadata.c,
when the problem occurs. That is why the geneve option fields are missing.

I have not yet figured out why flow->metadata.present.map is 0. It should
be modified when tun_metadata_write() is called due to actions setting
tunnel metadata values. I have not checked that yet.
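
To make the symptom concrete, here is a toy Python model of a presence bitmap gating which option values get serialized. It is a simplified sketch under my own assumptions, not the code in lib/tun-metadata.c: the class, its fields, and both functions are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTunnelMetadata:
    """Toy stand-in for per-flow tunnel metadata: values plus a presence bitmap."""
    present_map: int = 0
    values: dict = field(default_factory=dict)

    def write(self, index, value):
        """Store a value and mark its slot as present (the expected behaviour)."""
        self.values[index] = value
        self.present_map |= 1 << index


def serialize_options(md):
    """Emit (index, value) pairs only for slots whose presence bit is set."""
    return [(i, md.values[i]) for i in sorted(md.values)
            if md.present_map & (1 << i)]


# Expected case: the write sets bit 0, so the map is 1 and the option is emitted.
ok = ToyTunnelMetadata()
ok.write(0, 0x00030005)

# Failing case modelled here: the value exists but the map stayed 0,
# so nothing is emitted and the tunnel header goes out without options.
broken = ToyTunnelMetadata()
broken.values[0] = 0x00030005

if __name__ == "__main__":
    print(ok.present_map, serialize_options(ok))
    print(broken.present_map, serialize_options(broken))
```

The point of the model is only that a serializer driven by the bitmap silently drops options whenever the bitmap and the stored values disagree, which matches the map being 0 rather than 1 in the logs.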

Mickey


> Mickey
>
>
>
>>
>> Thanks
>> Numan
>>
>>
>>
>
>


Re: [ovs-dev] ovn ping from VM to external gateway IP failed.

2016-12-30 Thread Mickey Spiegel
On Fri, Dec 30, 2016 at 7:46 AM, Numan Siddique  wrote:

> On Fri, Dec 30, 2016 at 5:36 PM, Dong Jun  wrote:
>




> ​
> Hi Dong Jun, I am also facing the same issue on my setup.
> ​
> These are the findings of my investigation so far
>
> Looks like this issue is seen after the commit
> https://github.com/openvswitch/ovs/commit/f1a8bd06d58f2c5312622fbaeacbc6ce7576e347
> ​
> which removes the usage of patch ports and uses the clone action instead.
> ​
>
> I reverted to the commit just before it and SNAT/DNAT is working as
> expected.
>
> In my case, the gateway router is hosted on node 1 and the I am trying to
> reach a VM (192.168.0.5) hosted on node 2 using the external ip
> (10.2.7.105) associated ​with it. I could see that the node 1 is sending
> the packet to node 2 through the geneve tunnel, but it is dropped by node 2
> flows.
>
> Below is the tcpdump of the packet
>
> **
> 19:39:44.709907 IP 182.16.0.16.60069 > 182.16.0.15.geneve: Geneve, Flags
> [none], vni 0x1: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP echo
> request, id 13240, seq 1, length 64
> ***
>
> Below is the tcpdump of the packet with the ovn-controller (without the
> above commit) in the working case
>
> **
> 19:41:56.783570 IP 182.16.0.12.29778 > 182.16.0.15.geneve: Geneve, Flags
> [C], vni 0x1, options [8 bytes]: IP nusiddiq.blr.redhat.com > 192.168.0.5:
> ICMP echo request, id 13308, seq 1, length 64
> 19:41:56.784270 IP 182.16.0.15.14539 > 182.16.0.12.geneve: Geneve, Flags
> [C], vni 0xf, options [8 bytes]: IP 192.168.0.5 > nusiddiq.blr.redhat.com:
> ICMP echo reply, id 13308, seq 1, length 64
> **
>
> The options data is 00030005.
>
> From the capture, I could see that the packet from node 1 is missing the
> Geneve option fields, which carry the inport and outport keys.
>
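
For reference, the option value quoted above can be unpacked as follows. This is only a sketch, assuming the layout that ovn-architecture(7) describes for OVN's Geneve TLV (class 0x0102, type 0x80): one reserved bit, a 15-bit logical ingress port key, then a 16-bit logical egress port key. The function name is made up for illustration.

```python
def decode_ovn_geneve_option(value_hex):
    """Split OVN's 32-bit Geneve option value into (inport, outport) keys.

    Assumed layout, MSB to LSB: 1 reserved bit, 15-bit logical ingress
    port key, 16-bit logical egress port key.
    """
    value = int(value_hex, 16)
    inport = (value >> 16) & 0x7FFF   # drop the reserved top bit
    outport = value & 0xFFFF
    return inport, outport


# Option data seen in the working capture.
print(decode_ovn_geneve_option("00030005"))
```

Under that assumption, 00030005 would mean logical inport key 3 and outport key 5. The "options [8 bytes]" in the working capture would then be a 4-byte option header plus this 4-byte value; the failing capture carries no options at all.
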

I am facing the same issue running my distributed NAT patch set.
Between the UNSNAT recirculation and the output to the tunnel, a megaflow
is installed that is missing the Geneve option fields.

I verified that the table=32 OpenFlow rule has the Geneve option fields.
ofproto/trace shows geneve in the "Datapath actions" at the end, so there
is no problem in whatever code path ofproto/trace is using.

Mickey



>
> Thanks
> Numan
>
>


Re: [ovs-dev] ovn ping from VM to external gateway IP failed.

2016-12-30 Thread Numan Siddique
On Fri, Dec 30, 2016 at 5:36 PM, Dong Jun wrote:

> Started devstack on one node (master code).
>
> (10.0.0.7)vm --- (10.0.0.1)dr(169.254.128.2) ---
> (169.254.128.1)ogr(172.24.4.10)  --- (172.24.4.1)br-ex
> (fip 172.24.4.7)
>
> $ ip addr show eth0
> 2: eth0:  mtu 1442 qdisc pfifo_fast qlen 1000
> inet 10.0.0.7/26 brd 10.0.0.63 scope global eth0
>
> Ping from 10.0.0.7 to 172.24.4.1 FAILED. Help is greatly appreciated.
> $ ping 172.24.4.1
> PING 172.24.4.1 (172.24.4.1): 56 data bytes
> ^C
> --- 172.24.4.1 ping statistics ---
> 5 packets transmitted, 0 packets received, 100% packet loss
>
> The other cases are OK.
> Ping from the VM to 172.24.4.11 and to another VM's FIP is OK:
> $ ping 172.24.4.10
> PING 172.24.4.10 (172.24.4.10): 56 data bytes
> 64 bytes from 172.24.4.10: seq=0 ttl=253 time=0.822 ms
> $ ping 172.24.4.8
> PING 172.24.4.8 (172.24.4.8): 56 data bytes
> 64 bytes from 172.24.4.8: seq=0 ttl=61 time=1.163 ms
> Ping from 172.24.4.1 to 172.24.4.7 is OK:
> root@c3:/opt/stack# ping 172.24.4.7
> PING 172.24.4.7 (172.24.4.7) 56(84) bytes of data.
> 64 bytes from 172.24.4.7: icmp_seq=1 ttl=62 time=0.903 ms
>
>
> Here is the correct conntrack state for the ping from 172.24.4.1 to 172.24.4.7:
> root@c3:/opt/stack# conntrack -LN | grep icmp
> icmp 1 29 src=172.24.4.1 dst=10.0.0.7 type=8 code=0 id=11779 src=10.0.0.7 dst=172.24.4.1 type=0 code=0 id=11779 mark=0 zone=9 use=1
> conntrack v1.4.3 (conntrack-tools): 205 flow entries have been shown.
> icmp 1 29 src=172.24.4.1 dst=172.24.4.7 type=8 code=0 id=11779 src=10.0.0.7 dst=172.24.4.1 type=0 code=0 id=11779 mark=0 zone=4 use=1
> icmp 1 29 src=172.24.4.1 dst=172.24.4.7 type=8 code=0 id=11779 src=172.24.4.7 dst=172.24.4.1 type=0 code=0 id=11779 mark=0 use=1
>
> Here is some info for the failing case, the ping from 10.0.0.7 to 172.24.4.1:
> root@c3:/opt/stack# conntrack -LN | grep icmp
> conntrack v1.4.3 (conntrack-tools): 220 flow entries have been shown.
> icmp 1 29 src=10.0.0.7 dst=172.24.4.1 type=8 code=0 id=32513 src=172.24.4.1 dst=172.24.4.7 type=0 code=0 id=32513 mark=0 zone=3 use=1
> icmp 1 29 src=10.0.0.7 dst=172.24.4.1 type=8 code=0 id=32513 [UNREPLIED] src=172.24.4.1 dst=10.0.0.7 type=0 code=0 id=32513 mark=0 zone=9 use=1
> icmp 1 29 src=172.24.4.7 dst=172.24.4.1 type=8 code=0 id=32513 src=172.24.4.1 dst=172.24.4.7 type=0 code=0 id=32513 mark=0 use=1
>
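
The failing-case conntrack entries above can be compared mechanically. The snippet below is a rough sketch that copies the three entries from this message and reports the zone, the original and reply tuples, and whether the entry is marked [UNREPLIED]. The helper name `summarize` is made up for illustration, and an entry without `zone=` is assumed to be in the default zone 0.

```python
import re

# The three entries from the failing ping (10.0.0.7 -> 172.24.4.1), as posted.
CONNTRACK = [
    "icmp 1 29 src=10.0.0.7 dst=172.24.4.1 type=8 code=0 id=32513 "
    "src=172.24.4.1 dst=172.24.4.7 type=0 code=0 id=32513 mark=0 zone=3 use=1",
    "icmp 1 29 src=10.0.0.7 dst=172.24.4.1 type=8 code=0 id=32513 "
    "[UNREPLIED] src=172.24.4.1 dst=10.0.0.7 type=0 code=0 id=32513 mark=0 "
    "zone=9 use=1",
    "icmp 1 29 src=172.24.4.7 dst=172.24.4.1 type=8 code=0 id=32513 "
    "src=172.24.4.1 dst=172.24.4.7 type=0 code=0 id=32513 mark=0 use=1",
]


def summarize(entry):
    """Extract zone, original tuple, reply tuple and reply status."""
    zone = re.search(r"\bzone=(\d+)", entry)
    srcs = re.findall(r"\bsrc=(\S+)", entry)
    dsts = re.findall(r"\bdst=(\S+)", entry)
    return {
        "zone": int(zone.group(1)) if zone else 0,  # assumed default zone
        "orig": (srcs[0], dsts[0]),
        "reply": (srcs[1], dsts[1]),
        "unreplied": "[UNREPLIED]" in entry,
    }


summaries = [summarize(e) for e in CONNTRACK]
for s in summaries:
    print(s)
```

Only the zone 9 entry is flagged as unreplied. The zone 3 entry and the entry without a zone are not flagged, which would be consistent with the echo reply reaching the host and the SNAT zone but not making it back through zone 9.
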
> root@c3:/opt/stack# ovs-appctl -t /usr/local/var/run/openvswitch/ovn-controller.30677.ctl ct-zone-list
> ee2f5eb8-60cd-4efa-94b5-0329ebe5fb25 8
> f499ea31-da2c-4673-8313-efdf22f86308_dnat 6
> f499ea31-da2c-4673-8313-efdf22f86308_snat 7
> provnet-ca213de8-a0e1-4899-8fcf-4a894c876b80 5
> 417b4dfe-b64a-45fb-952b-9ddea624ae13 9
> 70ef5a38-7fde-477a-a437-0349d56adcf0_snat 3
> 94428e19-4bd0-4eb8-b77a-bcab69539a31_dnat 2
> 94428e19-4bd0-4eb8-b77a-bcab69539a31_snat 1
> 70ef5a38-7fde-477a-a437-0349d56adcf0_dnat 4
>
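
To relate the `zone=` numbers in the conntrack entries to this list, the ct-zone-list output can be inverted into a zone-to-name map. This is just a sketch over the output pasted above; `parse_ct_zones` is a made-up helper, and reading the `_snat`/`_dnat` names as per-router NAT zones is my assumption, not something stated in this message.

```python
# ct-zone-list output as posted: "<name> <zone id>" per line.
CT_ZONE_LIST = """\
ee2f5eb8-60cd-4efa-94b5-0329ebe5fb25 8
f499ea31-da2c-4673-8313-efdf22f86308_dnat 6
f499ea31-da2c-4673-8313-efdf22f86308_snat 7
provnet-ca213de8-a0e1-4899-8fcf-4a894c876b80 5
417b4dfe-b64a-45fb-952b-9ddea624ae13 9
70ef5a38-7fde-477a-a437-0349d56adcf0_snat 3
94428e19-4bd0-4eb8-b77a-bcab69539a31_dnat 2
94428e19-4bd0-4eb8-b77a-bcab69539a31_snat 1
70ef5a38-7fde-477a-a437-0349d56adcf0_dnat 4
"""


def parse_ct_zones(text):
    """Return {zone_id: name} from ct-zone-list style output."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            zones[int(parts[1])] = parts[0]
    return zones


zones = parse_ct_zones(CT_ZONE_LIST)
# Zones that show up in the conntrack entries of this message.
for zone_id in (3, 4, 9):
    print(zone_id, zones[zone_id])
```

With this map, zone 3 and zone 4 in the conntrack entries are the `_snat` and `_dnat` zones of 70ef5a38-7fde-477a-a437-0349d56adcf0, and zone 9 is 417b4dfe-b64a-45fb-952b-9ddea624ae13.
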
> root@c3:/opt/stack# ovs-dpctl dump-flows | grep 172
> recirc_id(0x84),dp_hash(0),skb_priority(0),in_port(4),skb_mark(0),ct_state(+new-est-rel-rpl-inv+trk-snat-dnat),ct_zone(0x4),ct_mark(0),ct_label(0),eth(src=fa:16:3e:1f:ab:18,dst=fa:16:3e:cf:28:38),eth_type(0x0800),ipv4(src=10.0.0.7,dst=172.24.4.1,proto=1,tos=0,ttl=63,frag=no),icmp(type=8,code=0), packets:141, bytes:13818, used:0.296s, actions:set(eth(src=fa:16:3e:56:55:b0,dst=9e:eb:2d:f1:8e:42)),set(ipv4(src=10.0.0.7,dst=172.24.4.1,ttl=62)),ct(commit,zone=3,nat(src=172.24.4.7)),recirc(0x85)
> recirc_id(0),dp_hash(0),skb_priority(0),in_port(2),skb_mark(0),ct_state(-new+est-rel+rpl-inv+trk-snat-dnat),ct_zone(0),ct_mark(0),ct_label(0),eth(src=9e:eb:2d:f1:8e:42,dst=fa:16:3e:56:55:b0),eth_type(0x0800),ipv4(src=172.24.4.1,dst=172.24.4.7,proto=1,tos=0,ttl=64,frag=no),icmp(type=0,code=0), packets:141, bytes:13818, used:0.296s, actions:ct(zone=3,nat),ct(commit,zone=4,nat(dst=10.0.0.7)),recirc(0x7e)
> recirc_id(0x82),dp_hash(0),skb_priority(0),in_port(4),skb_mark(0),ct_state(+new-est-rel-rpl-inv+trk-snat-dnat),ct_zone(0x9),ct_mark(0),ct_label(0),eth(src=fa:16:3e:ba:a1:3b,dst=fa:16:3e:b0:15:8d),eth_type(0x0800),ipv4(src=10.0.0.7,dst=172.24.4.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0), packets:141, bytes:13818, used:0.296s, actions:ct(commit,zone=9,label=0/0x1),ct(commit,zone=9,label=0/0x1),set(eth(src=fa:16:3e:1f:ab:18,dst=fa:16:3e:cf:28:38)),set(ipv4(src=10.0.0.7,dst=172.24.4.0/255.255.255.252,ttl=63)),ct(zone=4,nat),recirc(0x84)
> recirc_id(0x85),dp_hash(0),skb_priority(0),in_port(4),skb_mark(0),ct_state(-new+est-rel-rpl-inv+trk+snat-dnat),ct_zone(0x3),ct_mark(0),ct_label(0),eth(src=fa:16:3e:56:55:b0,dst=9e:eb:2d:f1:8e:42),eth_type(0x0800),ipv4(src=172.24.4.7,dst=172.24.4.1,proto=1,tos=0,ttl=62,frag=no),icmp(type=8,code=0), packets:139, bytes:13622, used:0.296s, actions:2
> recirc_id(0x7e),dp_hash(0),skb_priority(0),in_port(2),skb_
>