On Mon, Jan 2, 2017 at 3:46 AM, Numan Siddique <nusid...@redhat.com> wrote:
> On Mon, Jan 2, 2017 at 2:07 AM, Mickey Spiegel <mickeys....@gmail.com> wrote:
>
>> On Sun, Jan 1, 2017 at 10:31 AM, Numan Siddique <nusid...@redhat.com> wrote:
>>
>>> On Sun, Jan 1, 2017 at 6:39 AM, Mickey Spiegel <mickeys....@gmail.com> wrote:
>>>
>>>> On Sat, Dec 31, 2016 at 1:19 AM, Mickey Spiegel <mickeys....@gmail.com> wrote:
>>>>
>>>>> On Fri, Dec 30, 2016 at 11:37 AM, Mickey Spiegel <mickeys....@gmail.com> wrote:
>>>>>
>>>>>> On Fri, Dec 30, 2016 at 7:46 AM, Numan Siddique <nusid...@redhat.com> wrote:
>>>>>>
>>>>>>> On Fri, Dec 30, 2016 at 5:36 PM, Dong Jun <do...@dtdream.com> wrote:
>>>>>>
>>>>>> <snip>
>>>>>>
>>>>>>> Hi Dong Jun, I am also facing the same issue on my setup.
>>>>>>>
>>>>>>> These are the findings of my investigation so far
>>>>>>>
>>>>>>> Looks like this issue is seen after the commit
>>>>>>> https://github.com/openvswitch/ovs/commit/f1a8bd06d58f2c5312622fbaeacbc6ce7576e347
>>>>>>> which removes the usage of patch ports and uses the clone action instead.
>>>>>>>
>>>>>>> I reverted to the commit just before it and SNAT/DNAT is working as
>>>>>>> expected.
>>>>>>>
>>>>>>> In my case, the gateway router is hosted on node 1 and I am trying to
>>>>>>> reach a VM (192.168.0.5) hosted on node 2 using the external ip
>>>>>>> (10.2.7.105) associated with it. I could see that node 1 is sending
>>>>>>> the packet to node 2 through the geneve tunnel, but it is dropped by
>>>>>>> node 2 flows.
>>>>>>>
>>>>>>> Below is the tcpdump of the packet
>>>>>>>
>>>>>>> **************************
>>>>>>> 19:39:44.709907 IP 182.16.0.16.60069 > 182.16.0.15.geneve: Geneve, Flags [none], vni 0x1: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP echo request, id 13240, seq 1, length 64
>>>>>>> ***************************
>>>>>>>
>>>>>>> Below is the tcpdump of the packet with the ovn-controller (without the
>>>>>>> above commit) in the working case
>>>>>>>
>>>>>>> **************************
>>>>>>> 19:41:56.783570 IP 182.16.0.12.29778 > 182.16.0.15.geneve: Geneve, Flags [C], vni 0x1, options [8 bytes]: IP nusiddiq.blr.redhat.com > 192.168.0.5: ICMP echo request, id 13308, seq 1, length 64
>>>>>>> 19:41:56.784270 IP 182.16.0.15.14539 > 182.16.0.12.geneve: Geneve, Flags [C], vni 0xf, options [8 bytes]: IP 192.168.0.5 > nusiddiq.blr.redhat.com: ICMP echo reply, id 13308, seq 1, length 64
>>>>>>> **************************
>>>>>>>
>>>>>>> The options data has - 00030005
>>>>>>>
>>>>>>> From the packet, I could see that the packet from node 1 is missing the
>>>>>>> geneve option fields which has inport and outport keys.
>>>>>>
>>>>>> I am facing the same issue running my distributed NAT patch set.
>>>>>> Between UNSNAT recirc and output to tunnel, a megaflow is installed that
>>>>>> is missing the geneve option fields.
>>>>>>
>>>>>> I verified that the table=32 openflow rule has the geneve option fields.
>>>>>> ofproto/trace shows geneve in the "Datapath actions" at the end, so no
>>>>>> problem with whatever ofproto/trace is using.
>>>>>
>>>>> Throwing some logs in, I see that flow->metadata.present.map is 0 rather
>>>>> than 1 coming into tun_metadata_to_geneve_nlattr() in lib/tun-metadata.c,
>>>>> when the problem occurs. That is why the geneve option fields are missing.
>>>>>
>>>>> I have not yet figured out why flow->metadata.present.map is 0. It should
>>>>> be modified when tun_metadata_write() is called due to actions setting
>>>>> tunnel metadata values. I have not checked that yet.
>>>>
>>>> I just posted a fix. I did not try it with the gateway router or with
>>>> OpenStack, but with this bug fix all distributed NAT manual test cases
>>>> are now passing.
>>>
>>> Thanks for the fix. I just tested it. It's working when I am trying to
>>> reach the VM using its floating ip, but not when trying to ping
>>> www.google.com from the VM (SNAT use case).
>>
>> With distributed NAT, most of my debugging and tests were using SNAT. The
>> bug fix that I posted fixed the problem that was causing ICMP echo replies
>> to be dropped. The openflow path for distributed SNAT is similar to that
>> for SNAT on gateway routers, but there are still some differences, notably
>> one router instead of two routers and no "join" switch. Also I did not try
>> it with DNS.
>>
>> Are you able to debug further, to see whether a missing geneve options
>> field is still the culprit?
>> It is possible that removal of patch ports within br-int uncovered other
>> issues.
>
> With some testing I could see that in the node where the gateway is hosted
> - The reply packet reaches the gateway router pipeline -> to the otls
> switch pipeline (via clone) -> to the router pipeline -> to the peer port
> of the switch.
> The packet gets dropped at table 22
>
> table=22, n_packets=275, n_bytes=26686, priority=65535,ct_state=+inv+trk,metadata=0x1 actions=drop
>
> Not sure why it is happening. I will try to debug further.

I added stateful ACLs, but I am unable to reproduce this. I tried
switch -> router -> switch, with localnet at the end, and various
distributed NAT flavors including DNAT and SNAT. Nothing hits the invalid
ct_state flow, and the pings always succeed.

As I suggested on IRC, I think that conntrack state should be cleared when
crossing an OVN patch port. Specifically, in ovn/controller/physical.c,
inside the clone, it should clear ct_state (MFF_CT_STATE, be16), ct_mark
(MFF_CT_MARK, be32), and ct_label (MFF_CT_LABEL, be128).

Mickey

> Numan
>
>> I primarily used ovs-dpctl dump-flows to see installed megaflows, ovs-appctl
>> ofproto/trace (with recirc_id), and ovs-ofctl dump-flows for initial
>> debugging. In particular I could see that the installed megaflows were
>> lacking the geneve options field in the actions.
>>
>> Mickey
>>
>>> Numan
>>>
>>>> Mickey
>>>>
>>>>> Mickey
>>>>>
>>>>>> Mickey
>>>>>>
>>>>>>> Thanks
>>>>>>> Numan

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev