Re: [ovs-discuss] VMs can't ping each between two physical hosts within same subnet

2017-12-24 Thread Hui Xiang
Finally found the root cause: for some reason the dpdk device has the same
MAC address as the device that has the tunnel IP.

 1(dpdk0): addr:90:e2:ba:d8:c9:88
 config: 0
 state:  LIVE
 current:10GB-FD
 speed: 1 Mbps now, 0 Mbps max

 LOCAL(br-prv): addr:90:e2:ba:d8:c9:88
 config: 0
 state:  LIVE
 current:10MB-FD COPPER
 speed: 10 Mbps now, 0 Mbps max
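
In case anyone else hits this: pinning a distinct MAC on the bridge that
holds the tunnel IP should avoid the clash. A minimal sketch (the MAC below
is just an example locally administered address, not something verified on
this setup):

  # give br-prv its own MAC so it no longer collides with dpdk0
  ovs-vsctl set bridge br-prv other-config:hwaddr=0e:00:00:d8:c9:88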


On Sun, Dec 24, 2017 at 9:55 AM, Hui Xiang <xiangh...@gmail.com> wrote:

> My other question is does ovs/ovn support pre-population of tunnel's info
> in the fdb/arp table.
>
> On Sat, Dec 23, 2017 at 11:20 AM, Hui Xiang <xiangh...@gmail.com> wrote:
>
>> Hi folks,
>>
>>
>>   I have a problem with VMs spawned on two physical hosts within the same
>> subnet.
>> they are connected with Geneve tunnel based on dpdk.
>>
>>   With tracing and debugging, I found that the remote tunnel ARP request
>> is dropped by the host running the VM that initiated the ping.
>>
>>   This problem can be fixed by restarting ovs-vswitchd.
>>
>>
>>   two geneve tunnels:  168.254.100.13  168.254.100.14
>>   two vms: 192.168.10.3  192.168.10.2
>>
>>   I have compared the logs with datapath flow:
>>
>>   [Works]
>>   2017-12-22T06:23:54.767Z|00263|dpif_netdev(pmd12)|DBG|ovs-netdev: miss
>> upcall:
>> skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(
>> 0),ct_label(0),recirc_id(0),dp_hash(0),in_port(2),packet_
>> type(ns=0,id=0),eth(src=3e:65:7c:f5:3e:4a,dst=e2:a6:0b:28:
>> b9:43),eth_type(0x0806),arp(sip=168.254.100.13,tip=168.
>> 254.100.14,op=1,sha=3e:65:7c:f5:3e:4a,tha=00:00:00:00:00:00)
>> arp,vlan_tci=0x,dl_src=3e:65:7c:f5:3e:4a,dl_dst=e2:a6:0b
>> :28:b9:43,arp_spa=168.254.100.13,arp_tpa=168.254.100.14,arp_
>> op=1,arp_sha=3e:65:7c:f5:3e:4a,arp_tha=00:00:00:00:00:00
>> 2017-12-22T06:23:54.767Z|00264|dpif(pmd12)|DBG|netdev@ovs-netdev:
>> get_stats success
>> 2017-12-22T06:23:54.767Z|00265|dpif_netdev(pmd12)|DBG|flow_add:
>> ufid:4ef92ea7-67b0-4bc3-8a0c-a4e379f2fe83 recirc_id(0),in_port(2),packet
>> _type(ns=0,id=0),eth(src=3e:65:7c:f5:3e:4a,dst=e2:a6:0b:
>> 28:b9:43),eth_type(0x0806),arp(op=1/0xff), actions:1
>>
>>  [Bad]
>>  5990 2017-12-22T03:28:28.386Z|00290|dpif_netdev(pmd84)|DBG|flow_add:
>> ufid:7d555719-c5dc-4e1b-bfff-1851f88aabe7 recirc_id(0),in_port(3),packet
>> _type(ns=0,id=0),eth(src=90:e2:ba:dd:fa:60,dst=
>> ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=168.254.100.13,tip=168.254.100.14,op=1/0xff),
>> actions:drop
>>
>> My question is: which OpenFlow rule was translated into the datapath drop
>> action in the bad case, and is there any way to check which function/file
>> translates it, or is responsible for the Geneve tunnel ARP processing? So
>> far the only thing I can do is add logs to the code and rerun.  Thank you
>> very much for your help.
>>
>> Hui.
>>
>>
>>
>


Re: [ovs-discuss] VMs can't ping each between two physical hosts within same subnet

2017-12-23 Thread Hui Xiang
My other question is: does OVS/OVN support pre-populating the tunnel's
information in the FDB/ARP tables?
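
To make the question concrete: on the userspace datapath I would hope for
something along the lines of the commands below (the command names are from
memory, so treat them as an assumption and check ovs-appctl list-commands on
the running ovs-vswitchd; the addresses are just placeholders taken from this
thread):

  # pre-seed the tunnel neighbour cache on the bridge that owns the tunnel IP
  # (tnl/arp/set is the older spelling of the same command on some releases)
  ovs-appctl tnl/neigh/set br-prv 168.254.100.14 90:e2:ba:dd:fa:60
  ovs-appctl tnl/neigh/show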

On Sat, Dec 23, 2017 at 11:20 AM, Hui Xiang <xiangh...@gmail.com> wrote:

> Hi folks,
>
>
>   I have a problem with VMs spawned on two physical hosts within the same
> subnet.
> they are connected with Geneve tunnel based on dpdk.
>
>   With tracing and debugging, I found that the remote tunnel ARP request is
> dropped by the host running the VM that initiated the ping.
>
>   This problem can be fixed by restarting ovs-vswitchd.
>
>
>   two geneve tunnels:  168.254.100.13  168.254.100.14
>   two vms: 192.168.10.3  192.168.10.2
>
>   I have compared the logs with datapath flow:
>
>   [Works]
>   2017-12-22T06:23:54.767Z|00263|dpif_netdev(pmd12)|DBG|ovs-netdev: miss
> upcall:
> skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_
> mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(2),
> packet_type(ns=0,id=0),eth(src=3e:65:7c:f5:3e:4a,dst=e2:
> a6:0b:28:b9:43),eth_type(0x0806),arp(sip=168.254.100.
> 13,tip=168.254.100.14,op=1,sha=3e:65:7c:f5:3e:4a,tha=00:00:00:00:00:00)
> arp,vlan_tci=0x,dl_src=3e:65:7c:f5:3e:4a,dl_dst=e2:a6:
> 0b:28:b9:43,arp_spa=168.254.100.13,arp_tpa=168.254.100.14,
> arp_op=1,arp_sha=3e:65:7c:f5:3e:4a,arp_tha=00:00:00:00:00:00
> 2017-12-22T06:23:54.767Z|00264|dpif(pmd12)|DBG|netdev@ovs-netdev:
> get_stats success
> 2017-12-22T06:23:54.767Z|00265|dpif_netdev(pmd12)|DBG|flow_add:
> ufid:4ef92ea7-67b0-4bc3-8a0c-a4e379f2fe83 recirc_id(0),in_port(2),
> packet_type(ns=0,id=0),eth(src=3e:65:7c:f5:3e:4a,dst=e2:
> a6:0b:28:b9:43),eth_type(0x0806),arp(op=1/0xff), actions:1
>
>  [Bad]
>  5990 2017-12-22T03:28:28.386Z|00290|dpif_netdev(pmd84)|DBG|flow_add:
> ufid:7d555719-c5dc-4e1b-bfff-1851f88aabe7 recirc_id(0),in_port(3),
> packet_type(ns=0,id=0),eth(src=90:e2:ba:dd:fa:60,dst=
> ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=168.254.100.
> 13,tip=168.254.100.14,op=1/0xff), actions:drop
>
> My question is: which OpenFlow rule was translated into the datapath drop
> action in the bad case, and is there any way to check which function/file
> translates it, or is responsible for the Geneve tunnel ARP processing? So
> far the only thing I can do is add logs to the code and rerun.  Thank you
> very much for your help.
>
> Hui.
>
>
>
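
Partially answering my own question above, for the archive: the
OpenFlow-to-datapath translation lives in ofproto/ofproto-dpif-xlate.c, and
ovs-appctl ofproto/trace can replay a packet through it and print which rule
(table, priority, cookie) produced each datapath action. A sketch built from
the "[Bad]" flow_add line (bridge and in_port are guesses for this setup;
in_port can also be given as the OpenFlow port number from ovs-ofctl show):

  ovs-appctl ofproto/trace br-prv \
      "in_port=dpdk0,arp,dl_src=90:e2:ba:dd:fa:60,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=168.254.100.13,arp_tpa=168.254.100.14,arp_op=1,arp_sha=90:e2:ba:dd:fa:60"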


Re: [ovs-discuss] What is the best way to set a Geneve tunnel based on dpdk device

2017-12-21 Thread Hui Xiang
Thank you very much Sara!

On Thu, Dec 21, 2017 at 5:37 PM, Sara Gittlin <sara.gitt...@gmail.com>
wrote:

> take a look at vxlan setup - it is similar
> http://docs.openvswitch.org/en/latest/howto/userspace-tunneling/
> --Sara
>
> On Thu, Dec 21, 2017 at 10:51 AM, Hui Xiang <xiangh...@gmail.com> wrote:
> > Hi folks,
> >
> >   I have added a dpdk device to OVS bridge 'br-dpdk' on the compute host,
> > and expect the Geneve tunnel to send packets through this bridge. My
> > question is: what is the best way to make it work without affecting dpdk
> > performance?
> >
> >   Is it enough to just assign an IP to interface br-dpdk? In that case
> > there is a 'normal' flow in br-dpdk by default, and if packets need to be
> > flooded, dpdk performance may be degraded.  Thanks for any advice.
> >
> >   Bridge br-dpdk
> > Port "dpdk0"
> > Interface "dpdk0"
> > type: dpdk
> > options: {dpdk-devargs=":0b:00.0", n_rxq="2"}
> >
> >
> > Hui.
> >
> >
> >
> > ___
> > discuss mailing list
> > disc...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> >
>


[ovs-discuss] What is the best way to set a Geneve tunnel based on dpdk device

2017-12-21 Thread Hui Xiang
Hi folks,

  I have added a dpdk device to OVS bridge 'br-dpdk' on the compute host,
and expect the Geneve tunnel to send packets through this bridge. My
question is: what is the best way to make it work without affecting dpdk
performance?

  Is it enough to just assign an IP to interface br-dpdk? In that case there
is a 'normal' flow in br-dpdk by default, and if packets need to be flooded,
dpdk performance may be degraded.  Thanks for any advice.

  Bridge br-dpdk
Port "dpdk0"
Interface "dpdk0"
type: dpdk
options: {dpdk-devargs=":0b:00.0", n_rxq="2"}
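
For reference, the approach in the userspace-tunneling howto (linked in the
reply above) boils down to roughly the following; treat it as a sketch, with
the tunnel endpoint addresses below being examples only:

  # give the bridge that owns the dpdk port the tunnel endpoint IP
  ip addr add 168.254.100.13/24 dev br-dpdk
  ip link set br-dpdk up

  # then terminate the Geneve tunnel on br-int as usual
  ovs-vsctl add-port br-int geneve0 -- \
      set interface geneve0 type=geneve options:remote_ip=168.254.100.14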


Hui.


Re: [ovs-discuss] ovndb_servers can't be promoted

2017-12-04 Thread Hui Xiang
Thanks Numan, and sorry for taking up so much of your time on this. I have
fixed the problem, though the reason is not entirely clear: after the cloud
completed deployment with puppet (not TripleO), no matter what I used, either
'pcs' or 'crm', the resource just would not start as expected. But if I use a
native puppet provider like crm(cib), it just works...

Now the ovndb-server can start and a master is selected; however, one node is
master and the others are stopped. First I will do more testing and debugging
to rule out something wrong in my particular environment, and I will let you
know if I find something related to the ovndb-server OCF script itself. In the
meantime I am very happy to try out your patches, since you said you have seen
the problem on promote as well.

Thanks.
Hui.


Re: [ovs-discuss] ovndb_servers can't be promoted

2017-11-30 Thread Hui Xiang
Thanks Numan. In my environment it's worse: the resource is not even getting
started, and the monitor is only called once rather than repeatedly, whether
for master/slave or for none. Do you know what problem could cause pacemaker
to make this decision? Other resources are fine.

On Fri, Dec 1, 2017 at 2:08 AM, Numan Siddique <nusid...@redhat.com> wrote:

> Hi HuiXiang,
> Even I am seeing the issue where no node is promoted as master. I will
> test more, fix, and submit patch set v3.
>
> Thanks
> Numan
>
>
> On Thu, Nov 30, 2017 at 4:10 PM, Numan Siddique <nusid...@redhat.com>
> wrote:
>
>>
>>
>> On Thu, Nov 30, 2017 at 1:15 PM, Hui Xiang <xiangh...@gmail.com> wrote:
>>
>>> Hi Numan,
>>>
>>> Thanks for helping, I am following your pcs example, but still with no
>>> lucky,
>>>
>>> 1. Before running any configuration, I stopped all of the ovsdb-server
>>> for OVN, and ovn-northd. Deleted ovnnb_active.conf/ovnsb_active.conf.
>>>
>>> 2. Since I have already had an vip in the cluster, so I chose to use it,
>>> it's status is OK.
>>> [root@node-1 ~]# pcs resource show
>>>  vip__management_old (ocf::es:ns_IPaddr2): Started node-1.domain.tld
>>>
>>> 3. Use pcs to create ovndb-servers and constraint
>>> [root@node-1 ~]# pcs resource create tst-ovndb ocf:ovn:ovndb-servers
>>> manage_northd=yes master_ip=192.168.0.2 nb_master_port=6641
>>> sb_master_port=6642 master
>>>  ([root@node-1 ~]# pcs resource meta tst-ovndb-master notify=true
>>>   Error: unable to find a resource/clone/master/group:
>>> tst-ovndb-master) ## returned error, so I changed into below command.
>>>
>>
>> Hi HuiXiang,
>> This command is very important. Without which, pacemaker do not notify
>> the status change and ovsdb-servers would not be promoted or demoted.
>> Hence  you don't see the notify action getting called in ovn ocf script.
>>
>> Can you try with the other command which I shared in my previous email.
>> These commands work fine for me.
>>
>> Let me  know how it goes.
>>
>> Thanks
>> Numan
>>
>>
>> [root@node-1 ~]# pcs resource master tst-ovndb-master tst-ovndb
>>> notify=true
>>> [root@node-1 ~]# pcs constraint colocation add master tst-ovndb-master
>>> with vip__management_old
>>>
>>> 4. pcs status
>>> [root@node-1 ~]# pcs status
>>>  vip__management_old (ocf::es:ns_IPaddr2): Started node-1.domain.tld
>>>  Master/Slave Set: tst-ovndb-master [tst-ovndb]
>>>  Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>>>
>>> 5. pcs resource show XXX
>>> [root@node-1 ~]# pcs resource show  vip__management_old
>>>  Resource: vip__management_old (class=ocf provider=es type=ns_IPaddr2)
>>>   Attributes: nic=br-mgmt base_veth=br-mgmt-hapr ns_veth=hapr-m
>>> ip=192.168.0.2 iflabel=ka cidr_netmask=24 ns=haproxy gateway=none
>>> gateway_metric=0 iptables_start_rules=false iptables_stop_rules=false
>>> iptables_comment=default-comment
>>>   Meta Attrs: migration-threshold=3 failure-timeout=60
>>> resource-stickiness=1
>>>   Operations: monitor interval=3 timeout=30
>>> (vip__management_old-monitor-3)
>>>   start interval=0 timeout=30 (vip__management_old-start-0)
>>>   stop interval=0 timeout=30 (vip__management_old-stop-0)
>>> [root@node-1 ~]# pcs resource show tst-ovndb-master
>>>  Master: tst-ovndb-master
>>>   Meta Attrs: notify=true
>>>   Resource: tst-ovndb (class=ocf provider=ovn type=ovndb-servers)
>>>Attributes: manage_northd=yes master_ip=192.168.0.2
>>> nb_master_port=6641 sb_master_port=6642
>>>Operations: start interval=0s timeout=30s
>>> (tst-ovndb-start-timeout-30s)
>>>stop interval=0s timeout=20s (tst-ovndb-stop-timeout-20s)
>>>promote interval=0s timeout=50s
>>> (tst-ovndb-promote-timeout-50s)
>>>demote interval=0s timeout=50s
>>> (tst-ovndb-demote-timeout-50s)
>>>monitor interval=30s timeout=20s
>>> (tst-ovndb-monitor-interval-30s)
>>>monitor interval=10s role=Master timeout=20s
>>> (tst-ovndb-monitor-interval-10s-role-Master)
>>>monitor interval=30s role=Slave timeout=20s
>>> (tst-ovndb-monitor-interval-30s-role-Slave)
>>>
>>>
>>> 6. I have put log in every ovndb-servers op, seems only the monitor op
>>> is being called, no promoted by the

Re: [ovs-discuss] ovndb_servers can't be promoted

2017-11-29 Thread Hui Xiang
Hi Numan,

Thanks for helping. I am following your pcs example, but still with no
luck.

1. Before running any configuration, I stopped all of the ovsdb-server for
OVN, and ovn-northd. Deleted ovnnb_active.conf/ovnsb_active.conf.

2. Since I already had a VIP in the cluster, I chose to use it; its status
is OK.
[root@node-1 ~]# pcs resource show
 vip__management_old (ocf::es:ns_IPaddr2): Started node-1.domain.tld

3. Use pcs to create ovndb-servers and constraint
[root@node-1 ~]# pcs resource create tst-ovndb ocf:ovn:ovndb-servers
manage_northd=yes master_ip=192.168.0.2 nb_master_port=6641
sb_master_port=6642 master
 ([root@node-1 ~]# pcs resource meta tst-ovndb-master notify=true
  Error: unable to find a resource/clone/master/group:
tst-ovndb-master) ## returned an error, so I changed to the command below.
[root@node-1 ~]# pcs resource master tst-ovndb-master tst-ovndb notify=true
[root@node-1 ~]# pcs constraint colocation add master tst-ovndb-master with
vip__management_old

4. pcs status
[root@node-1 ~]# pcs status
 vip__management_old (ocf::es:ns_IPaddr2): Started node-1.domain.tld
 Master/Slave Set: tst-ovndb-master [tst-ovndb]
 Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]

5. pcs resource show XXX
[root@node-1 ~]# pcs resource show  vip__management_old
 Resource: vip__management_old (class=ocf provider=es type=ns_IPaddr2)
  Attributes: nic=br-mgmt base_veth=br-mgmt-hapr ns_veth=hapr-m
ip=192.168.0.2 iflabel=ka cidr_netmask=24 ns=haproxy gateway=none
gateway_metric=0 iptables_start_rules=false iptables_stop_rules=false
iptables_comment=default-comment
  Meta Attrs: migration-threshold=3 failure-timeout=60
resource-stickiness=1
  Operations: monitor interval=3 timeout=30 (vip__management_old-monitor-3)
  start interval=0 timeout=30 (vip__management_old-start-0)
  stop interval=0 timeout=30 (vip__management_old-stop-0)
[root@node-1 ~]# pcs resource show tst-ovndb-master
 Master: tst-ovndb-master
  Meta Attrs: notify=true
  Resource: tst-ovndb (class=ocf provider=ovn type=ovndb-servers)
   Attributes: manage_northd=yes master_ip=192.168.0.2 nb_master_port=6641
sb_master_port=6642
   Operations: start interval=0s timeout=30s (tst-ovndb-start-timeout-30s)
   stop interval=0s timeout=20s (tst-ovndb-stop-timeout-20s)
   promote interval=0s timeout=50s
(tst-ovndb-promote-timeout-50s)
   demote interval=0s timeout=50s (tst-ovndb-demote-timeout-50s)
   monitor interval=30s timeout=20s
(tst-ovndb-monitor-interval-30s)
   monitor interval=10s role=Master timeout=20s
(tst-ovndb-monitor-interval-10s-role-Master)
   monitor interval=30s role=Slave timeout=20s
(tst-ovndb-monitor-interval-30s-role-Slave)


6. I have put logging in every ovndb-servers op; it seems only the monitor op
is being called, and nothing is promoted by the pacemaker DC:
<30>Nov 30 15:22:19 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO:
ovsdb_server_monitor
<30>Nov 30 15:22:19 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO:
ovsdb_server_check_status
<30>Nov 30 15:22:19 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO: return
OCFOCF_NOT_RUNNINGG
<30>Nov 30 15:22:20 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO:
ovsdb_server_master_update: 7}
<30>Nov 30 15:22:20 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO:
ovsdb_server_master_update end}
<30>Nov 30 15:22:20 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO: monitor
is going to return 7
<30>Nov 30 15:22:20 node-1 ovndb-servers(undef)[2980970]: INFO: metadata
exit OCF_SUCCESS}
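
For the record, the full offline-CIB sequence from Numan's earlier mail,
re-typed here with the master_ip from my setup (the final cib-push line is my
assumption, since that part of the quoted mail is truncated further down):

  pcs cluster cib tmp-cib.xml
  cp tmp-cib.xml tmp-cib.xml.deltasrc
  pcs -f tmp-cib.xml resource create tst-ovndb ocf:ovn:ovndb-servers \
      manage_northd=yes master_ip=192.168.0.2 nb_master_port=6641 \
      sb_master_port=6642 master
  pcs -f tmp-cib.xml resource meta tst-ovndb-master notify=true
  pcs cluster cib-push tmp-cib.xml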


Please take a look,  thank you very much.
Hui.




On Wed, Nov 29, 2017 at 11:03 PM, Numan Siddique <nusid...@redhat.com>
wrote:

>
>
> On Wed, Nov 29, 2017 at 4:16 PM, Hui Xiang <xiangh...@gmail.com> wrote:
>
>> FYI, If I have configured a good ovndb-server cluster with one active two
>> slaves, then start pacemaker ovn-servers resource agents, they are all
>> becoming slaves...
>>
>
> You don't need to start ovndb-servers. When you create pacemaker resources
> it would automatically start them and promote on of them.
>
> One thing which is very important is to create an IPaddr2 resource before
> and add a colocation constraint so that pacemaker would promote the
> ovsdb-server in the node
> where IPaddr2 resource is running. This IPaddr2 resource ip should be your
> master ip.
>
> Can you please do "pcs resource show " and share the
> output ?
>
> Below is how I normally use for my testing.
>
> 
> pcs cluster cib tmp-cib.xml
> cp tmp-cib.xml tmp-cib.xml.deltasrc
>
> pcs -f tmp-cib.xml resource create tst-ovndb ocf:ovn:ovndb-servers
>  manage_northd=yes master_ip=192.168.24.10 nb_master_port=6641
> sb_master_port=6642 master
> pcs -f tmp-cib.xml resource meta tst-ovndb-master notify=true
> pcs 

Re: [ovs-discuss] ovndb_servers can't be promoted

2017-11-29 Thread Hui Xiang
FYI, If I have configured a good ovndb-server cluster with one active two
slaves, then start pacemaker ovn-servers resource agents, they are all
becoming slaves...

On Tue, Nov 28, 2017 at 10:48 PM, Numan Siddique <nusid...@redhat.com>
wrote:

>
>
> On Tue, Nov 28, 2017 at 2:29 PM, Hui Xiang <xiangh...@gmail.com> wrote:
>
>> Hi Numan,
>>
>>
>> Finally figure it out what's wrong when running ovndb-servers ocf in my
>> environment.
>>
>> 1. There is no default ovnnb and ovnsb running in my environment, I
>> thought it should be started by pacemaker as the usual way other typical
>> resource agent do it.
>> when I create the ovndb_servers resource, nothing happened, no operation
>> is executed except monitor, which is really hard to debug for a while.
>> In the ovsdb_server_monitor() function, first it will check the status,
>> here, it will be return NOT_RUNNING, then in the ovsdb_server_master_update()
>> function, "CRM_MASTER -D" is being executed, which appears stopped every
>> following action, I am not very clear what work it did.
>>
>> So, do the ovn_nb and ovn_sb needs to be running previouly before
>> pacemaker ovndb_servers resource create? Is there any such documentation
>> referred?
>>
>> 2. Without your patch every nodes executing ovsdb_server_monitor and
>> return OCF_SUCCESS
>> However, the first node of the three nodes cluster is executed
>> ovsdb_server_stop action, the reason showed below:
>> <27>Nov 28 15:35:11 node-1 pengine[1897010]:error: clone_color:
>> ovndb_servers:0 is running on node-1.domain.tld which isn't allowed
>> Did I miss anything? I don't understand why it isn't allowed.
>>
>> 3. Regard your patch[1]
>> It first reports "/usr/lib/ocf/resource.d/ovn/ovndb-servers: line 26:
>> ocf_attribute_target: command not found ]" in my environment(pacemaker
>> 1.1.12)
>>
>
> Thanks. I will come back to you on your other points. The function
> "ocf_attribute_target" action must be added in 1.1.16-12.
>
> I think it makes sense to either remove "ocf_attribute_target" or find a
> way so that even older versions work.
>
> I will spin a v2.
> Thanks
> Numan
>
>
>
> The log showed same as item2, but I have seen very shortly different state
>> from "pcs status" as below shown:
>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>  Slaves: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>> There is no promote action being executed.
>>
>>
>> Thanks for looking and help.
>>
>> [1] - https://patchwork.ozlabs.org/patch/839022/
>>
>>
>>
>>
>>
>> On Fri, Nov 24, 2017 at 10:54 PM, Numan Siddique <nusid...@redhat.com>
>> wrote:
>>
>>> Hi Hui Xiang,
>>>
>>> Can you please try with this patch [1]  and see if it works for you ?
>>> Please let me know how it goes. But I am not sure, if the patch would fix
>>> the issue.
>>>
>>> To brief, the OVN OCF script doesn't add monitor action for "Master"
>>> role. So pacemaker Resource agent would not check for the status of ovn db
>>> servers periodically. In case ovn db servers are killed, pacemaker wont
>>> know about it.
>>>
>>>
>>>
>>>
>>> You can also take a look at this [1] to know how it is used in openstack
>>> with tripleo installation.
>>>
>>> [1] - https://patchwork.ozlabs.org/patch/839022/
>>> [2] - https://github.com/openstack/puppet-tripleo/blob/master/ma
>>> nifests/profile/pacemaker/ovn_northd.pp
>>>
>>>
>>> Thanks
>>> Numan
>>>
>>> On Fri, Nov 24, 2017 at 3:00 PM, Hui Xiang <xiangh...@gmail.com> wrote:
>>>
>>>> Hi folks,
>>>>
>>>>   I am following what suggested on doc[1] to configure the
>>>> ovndb_servers HA, however, it's so unluck with upgrading pacemaker packages
>>>> from 1.12 to 1.16, do almost every kind of changes, there still not a
>>>> ovndb_servers master promoted, is there any special recipe for it to run?
>>>> so frustrated on it, sigh.
>>>>
>>>> It always showed:
>>>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>>>  Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>>>>
>>>> Even if I tried below steps:
>>>> 1. pcs resource debug-stop ovndb_server on every nodes.  ovn-ctl
>>>> status_ovnxb: running

Re: [ovs-discuss] ovndb_servers can't be promoted

2017-11-28 Thread Hui Xiang
Hi Numan,


I finally figured out what's wrong when running the ovndb-servers OCF agent
in my environment.

1. There is no ovnnb or ovnsb server running by default in my environment; I
thought they would be started by pacemaker, the way other typical resource
agents do it.
When I create the ovndb_servers resource, nothing happens: no operation is
executed except monitor, which made it really hard to debug for a while.
In the ovsdb_server_monitor() function, it first checks the status, which here
returns NOT_RUNNING; then in the ovsdb_server_master_update() function,
"CRM_MASTER -D" is executed, which appears to stop every following action. I
am not very clear on what work it does.

So, do ovn_nb and ovn_sb need to be running before the pacemaker
ovndb_servers resource is created? Is there any documentation that covers
this?

2. Without your patch, every node executes ovsdb_server_monitor and returns
OCF_SUCCESS.
However, the first node of the three-node cluster executes the
ovsdb_server_stop action, for the reason shown below:
<27>Nov 28 15:35:11 node-1 pengine[1897010]:error: clone_color:
ovndb_servers:0 is running on node-1.domain.tld which isn't allowed
Did I miss anything? I don't understand why it isn't allowed.

3. Regarding your patch [1]:
It first reports "/usr/lib/ocf/resource.d/ovn/ovndb-servers: line 26:
ocf_attribute_target: command not found ]" in my environment (pacemaker
1.1.12).
The log showed the same as item 2, but for a short while I saw a different
state from "pcs status", as shown below:
 Master/Slave Set: ovndb_servers-master [ovndb_servers]
 Slaves: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
No promote action is being executed.
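
As a stopgap on pacemaker 1.1.12, I imagine a shim like the one below at the
top of the OCF script could paper over the missing helper. This is only a
sketch of an idea, not tested, and crm_node -n ignores the container/bundle
remapping the real ocf_attribute_target helper performs:

  # hypothetical fallback, only if the shipped ocf-shellfuncs lacks it
  if ! type ocf_attribute_target > /dev/null 2>&1; then
      ocf_attribute_target() {
          # fall back to the plain local node name
          crm_node -n
      }
  fi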


Thanks for looking and help.

[1] - https://patchwork.ozlabs.org/patch/839022/





On Fri, Nov 24, 2017 at 10:54 PM, Numan Siddique <nusid...@redhat.com>
wrote:

> Hi Hui Xiang,
>
> Can you please try with this patch [1]  and see if it works for you ?
> Please let me know how it goes. But I am not sure, if the patch would fix
> the issue.
>
> To brief, the OVN OCF script doesn't add monitor action for "Master" role.
> So pacemaker Resource agent would not check for the status of ovn db
> servers periodically. In case ovn db servers are killed, pacemaker wont
> know about it.
>
>
>
>
> You can also take a look at this [1] to know how it is used in openstack
> with tripleo installation.
>
> [1] - https://patchwork.ozlabs.org/patch/839022/
> [2] - https://github.com/openstack/puppet-tripleo/blob/
> master/manifests/profile/pacemaker/ovn_northd.pp
>
>
> Thanks
> Numan
>
> On Fri, Nov 24, 2017 at 3:00 PM, Hui Xiang <xiangh...@gmail.com> wrote:
>
>> Hi folks,
>>
>>   I am following what suggested on doc[1] to configure the ovndb_servers
>> HA, however, it's so unluck with upgrading pacemaker packages from 1.12 to
>> 1.16, do almost every kind of changes, there still not a ovndb_servers
>> master promoted, is there any special recipe for it to run? so frustrated
>> on it, sigh.
>>
>> It always showed:
>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>  Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>>
>> Even if I tried below steps:
>> 1. pcs resource debug-stop ovndb_server on every nodes.  ovn-ctl
>> status_ovnxb: running/backup
>> 2. pcs resource debug-start ovndb_server on every nodes.  ovn-ctl
>> status_ovnxb: running/backup
>> 3. pcs resource debug-promote ovndb_server on one nodes.   ovn-ctl
>> status_ovnxb: running/active
>>
>> With above status, the pcs status still showed as:
>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>  Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>>
>>
>> [1]. https://github.com/openvswitch/ovs/blob/master/Document
>> ation/topics/integration.rst
>>
>> Appreciated any hint.
>>
>>
>>
>> ___
>> discuss mailing list
>> disc...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>>
>>
>


Re: [ovs-discuss] ovndb_servers can't be promoted

2017-11-27 Thread Hui Xiang
Thanks Numan for the useful info and good patches.

Yes, the patch seems to help a bit, but my environment also appears to have
other misconfigurations. I am working on those first and will then test your
patches again; I will keep you posted on any news.

On Fri, Nov 24, 2017 at 10:54 PM, Numan Siddique <nusid...@redhat.com>
wrote:

> Hi Hui Xiang,
>
> Can you please try with this patch [1]  and see if it works for you ?
> Please let me know how it goes. But I am not sure, if the patch would fix
> the issue.
>
> To brief, the OVN OCF script doesn't add monitor action for "Master" role.
> So pacemaker Resource agent would not check for the status of ovn db
> servers periodically. In case ovn db servers are killed, pacemaker wont
> know about it.
>
>
>
>
> You can also take a look at this [1] to know how it is used in openstack
> with tripleo installation.
>
> [1] - https://patchwork.ozlabs.org/patch/839022/
> [2] - https://github.com/openstack/puppet-tripleo/blob/
> master/manifests/profile/pacemaker/ovn_northd.pp
>
>
> Thanks
> Numan
>
> On Fri, Nov 24, 2017 at 3:00 PM, Hui Xiang <xiangh...@gmail.com> wrote:
>
>> Hi folks,
>>
>>   I am following what suggested on doc[1] to configure the ovndb_servers
>> HA, however, it's so unluck with upgrading pacemaker packages from 1.12 to
>> 1.16, do almost every kind of changes, there still not a ovndb_servers
>> master promoted, is there any special recipe for it to run? so frustrated
>> on it, sigh.
>>
>> It always showed:
>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>  Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>>
>> Even if I tried below steps:
>> 1. pcs resource debug-stop ovndb_server on every nodes.  ovn-ctl
>> status_ovnxb: running/backup
>> 2. pcs resource debug-start ovndb_server on every nodes.  ovn-ctl
>> status_ovnxb: running/backup
>> 3. pcs resource debug-promote ovndb_server on one nodes.   ovn-ctl
>> status_ovnxb: running/active
>>
>> With above status, the pcs status still showed as:
>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>  Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>>
>>
>> [1]. https://github.com/openvswitch/ovs/blob/master/Document
>> ation/topics/integration.rst
>>
>> Appreciated any hint.
>>
>>
>>
>> ___
>> discuss mailing list
>> disc...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>>
>>
>


[ovs-discuss] ovndb_servers can't be promoted

2017-11-24 Thread Hui Xiang
Hi folks,

  I am following what is suggested in doc [1] to configure ovndb_servers HA.
However, I've had no luck: even after upgrading the pacemaker packages from
1.12 to 1.16 and trying almost every kind of change, there is still no
ovndb_servers master promoted. Is there any special recipe to make it run? So
frustrated with it, sigh.

It always showed:
 Master/Slave Set: ovndb_servers-master [ovndb_servers]
 Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]

I even tried the steps below:
1. pcs resource debug-stop ovndb_server on every node.  ovn-ctl
status_ovnxb: running/backup
2. pcs resource debug-start ovndb_server on every node.  ovn-ctl
status_ovnxb: running/backup
3. pcs resource debug-promote ovndb_server on one node.   ovn-ctl
status_ovnxb: running/active

With the above state, pcs status still shows:
 Master/Slave Set: ovndb_servers-master [ovndb_servers]
 Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]


[1].
https://github.com/openvswitch/ovs/blob/master/Documentation/topics/integration.rst

Appreciated any hint.


Re: [ovs-discuss] OVN Logical Router Port ARP response

2017-11-14 Thread Hui Xiang
Hi Tiago,

  Thanks much. I finally figured out that I need to add
dl_dst=01:00:00:00:00:00 as well, so that it matches eth.mcast; otherwise it
is treated as an unknown MAC.
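
For anyone searching later, the working invocation ends up looking roughly
like the sketch below (same datapath, port and addresses as in the trace
quoted further down; the broadcast eth.dst is what a real ARP request carries
and is another way to satisfy eth.mcast):

  ovn-trace --detail 9ac587cd-becc-4584-99c9-24282f8707e2 \
      'inport == "provnet-668bf1b8-cfec-44d1-b659-30a42c83e29f" &&
       eth.src == 96:95:d8:7d:b9:4c && eth.dst == ff:ff:ff:ff:ff:ff &&
       arp.op == 1 && arp.sha == 96:95:d8:7d:b9:4c && arp.tpa == 172.16.0.130'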

Hui.

On Tue, Nov 14, 2017 at 6:53 PM, Tiago Lam <tiago...@gmail.com> wrote:

> Hi Hui,
>
> I don't think the query you are providing to ovn-trace is correct, hence
> why you see differences between your "real case" and the "tracing case".
>
> If you look at the first (commented) line that ovn-trace prints, you can
> see both eth.dst and eth.src being set to "00:00:00:00:00:00". Since
> these fields are used for setting the correct eth.dst in the ARP reply
> (eth.src and eth.dst are swapped from the actual ARP request), it won't
> know where the deliver the packet, as "00:00:00:00:00:00" is not valid,
> and hence the output="_MC_unknown".
>
> Regards,
>
> Tiago
>
> On 11/14/2017 03:26 AM, Hui Xiang wrote:
> > Hi folks,
> >
> > I am a bit confused of the OVN logical router port arp flow process if
> > arping it from external side.
> >
> > When I am tracing the logical router port arp, it always seems there
> > would be no response from the flow procedure, however, I do can get the
> > arp response and reach it.
> > My trace inport is the localnet port on the provider bridge, and it
> > always go to next and then hit ls_in_l2_lkup table and match MC_unknown
> > then dropped.
> >
> > I wonder who send this arp response? or my inport is wrong in the trace
> > if arp from outside? Thanks very much for any helpful info!
> >
> >
> > [root@node-1 ~]# ovn-trace --detail
> > 9ac587cd-becc-4584-99c9-24282f8707e2  'inport ==
> > "provnet-668bf1b8-cfec-44d1-b659-30a42c83e29f" && arp.sha ==
> > 96:95:d8:7d:b9:4c && arp.tpa == 172.16.0.130  && arp.op == 1'
> > #
> > arp,reg14=0x1,vlan_tci=0x,dl_src=00:00:00:00:00:00,dl_
> dst=00:00:00:00:00:00,arp_spa=0.0.0.0,arp_tpa=172.16.0.130,
> arp_op=1,arp_sha=96:95:d8:7d:b9:4c,arp_tha=00:00:00:00:00:00
> >
> > ingress(dp="public", inport="provnet-668bf1")
> > -
> >  0. ls_in_port_sec_l2 (ovn-northd.c:3556): inport == "provnet-668bf1",
> > priority 50, uuid 25b99b9a
> > next;
> > 10. ls_in_arp_rsp (ovn-northd.c:3588): inport == "provnet-668bf1",
> > priority 100, uuid b3607f68
> > next;
> > 15. ls_in_l2_lkup (ovn-northd.c:3975): 1, priority 0, uuid 83de4750
> > outport = "_MC_unknown";
> > output;
> >
> > multicast(dp="public", mcgroup="_MC_unknown")
> > -
> >
> > egress(dp="public", inport="provnet-668bf1", outport="provnet-668bf1")
> > --
> > /* omitting output because inport == outport && !flags.loopback */
> >
> >
> >
> >
> > Logical Router Port:
> > _uuid   : 6d67d962-38a5-4a29-86b5-067dc26f78d4
> > enabled : []
> > external_ids: {}
> > gateway_chassis : []
> > mac : "fa:16:3e:2e:ea:e9"
> > name: "lrp-640d0475-ff83-47b7-8a4d-9ea0e770fb24"
> > networks: ["172.16.0.130/16 <http://172.16.0.130/16>"]
> > options :
> > {redirect-chassis="88596f9f-e326-4e15-ae91-8cc014e7be86"}
> > peer: []
> >
> >
> > The logical flow:
> >   table=10(ls_in_arp_rsp  ), priority=100  , match=(arp.tpa ==
> > 172.16.0.130 && arp.op == 1 && inport ==
> > "640d0475-ff83-47b7-8a4d-9ea0e770fb24"), action=(next;)
> >   table=10(ls_in_arp_rsp  ), priority=100  , match=(inport ==
> > "provnet-668bf1b8-cfec-44d1-b659-30a42c83e29f"), action=(next;)
> >   table=10(ls_in_arp_rsp  ), priority=50   , match=(arp.tpa ==
> > 172.16.0.130 && arp.op == 1), action=(eth.dst = eth.src; eth.src =
> > fa:16:3e:2e:ea:e9; arp.op = 2; /* ARP reply */ arp.tha = arp.sha;
> > arp.sha = fa:16:3e:2e:ea:e9; arp.tpa = arp.spa; arp.spa = 172.16.0.130;
> > outport = inport; flags.loopback = 1; output;)
> >   
> >   table=15(ls_in_l2_lkup  ), priority=100  , match=(eth.mcast),
> > action=(outport = "_MC_flood"; output;)
> >   table=15(ls_in_l2_lkup  ), priority=50   , match=(eth.dst ==
> > fa:16:3e:2e:ea:e9 &&
> > is_chassis_resident("cr-lrp-640d0475-ff83-47b7-8a4d-9ea0e770fb24")),
> > action=(outport = "640d0475-ff83-4
> > 7b7-8a4d-9ea0e770fb24"; output;)
> >   table=15(ls_in_l2_lkup  ), priority=0, match=(1),
> > action=(outport = "_MC_unknown"; output;)
> >
> >
> >
> >
> >
> > ___
> > discuss mailing list
> > disc...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> >
> ___
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>


[ovs-discuss] OVN Logical Router Port ARP response

2017-11-13 Thread Hui Xiang
Hi folks,

I am a bit confused about the OVN logical router port ARP flow processing
when arping it from the external side.

When I trace the logical router port ARP, it always seems there would be no
response from the flow pipeline; however, I actually do get the ARP response
and can reach it.
My trace inport is the localnet port on the provider bridge, and the packet
always goes to next, then hits the ls_in_l2_lkup table, matches _MC_unknown,
and is dropped.

I wonder who sends this ARP response, or whether my inport is wrong in the
trace when the ARP comes from outside? Thanks very much for any helpful info!


[root@node-1 ~]# ovn-trace --detail 9ac587cd-becc-4584-99c9-24282f8707e2
'inport == "provnet-668bf1b8-cfec-44d1-b659-30a42c83e29f" && arp.sha ==
96:95:d8:7d:b9:4c && arp.tpa == 172.16.0.130  && arp.op == 1'
#
arp,reg14=0x1,vlan_tci=0x,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,arp_spa=0.0.0.0,arp_tpa=172.16.0.130,arp_op=1,arp_sha=96:95:d8:7d:b9:4c,arp_tha=00:00:00:00:00:00

ingress(dp="public", inport="provnet-668bf1")
-
 0. ls_in_port_sec_l2 (ovn-northd.c:3556): inport == "provnet-668bf1",
priority 50, uuid 25b99b9a
next;
10. ls_in_arp_rsp (ovn-northd.c:3588): inport == "provnet-668bf1", priority
100, uuid b3607f68
next;
15. ls_in_l2_lkup (ovn-northd.c:3975): 1, priority 0, uuid 83de4750
outport = "_MC_unknown";
output;

multicast(dp="public", mcgroup="_MC_unknown")
-

egress(dp="public", inport="provnet-668bf1", outport="provnet-668bf1")
--
/* omitting output because inport == outport && !flags.loopback */




Logical Router Port:
_uuid   : 6d67d962-38a5-4a29-86b5-067dc26f78d4
enabled : []
external_ids: {}
gateway_chassis : []
mac : "fa:16:3e:2e:ea:e9"
name: "lrp-640d0475-ff83-47b7-8a4d-9ea0e770fb24"
networks: ["172.16.0.130/16"]
options :
{redirect-chassis="88596f9f-e326-4e15-ae91-8cc014e7be86"}
peer: []


The logical flow:
  table=10(ls_in_arp_rsp  ), priority=100  , match=(arp.tpa ==
172.16.0.130 && arp.op == 1 && inport ==
"640d0475-ff83-47b7-8a4d-9ea0e770fb24"), action=(next;)
  table=10(ls_in_arp_rsp  ), priority=100  , match=(inport ==
"provnet-668bf1b8-cfec-44d1-b659-30a42c83e29f"), action=(next;)
  table=10(ls_in_arp_rsp  ), priority=50   , match=(arp.tpa ==
172.16.0.130 && arp.op == 1), action=(eth.dst = eth.src; eth.src =
fa:16:3e:2e:ea:e9; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha =
fa:16:3e:2e:ea:e9; arp.tpa = arp.spa; arp.spa = 172.16.0.130; outport =
inport; flags.loopback = 1; output;)
  
  table=15(ls_in_l2_lkup  ), priority=100  , match=(eth.mcast),
action=(outport = "_MC_flood"; output;)
  table=15(ls_in_l2_lkup  ), priority=50   , match=(eth.dst ==
fa:16:3e:2e:ea:e9 &&
is_chassis_resident("cr-lrp-640d0475-ff83-47b7-8a4d-9ea0e770fb24")),
action=(outport = "640d0475-ff83-4
7b7-8a4d-9ea0e770fb24"; output;)
  table=15(ls_in_l2_lkup  ), priority=0, match=(1), action=(outport
= "_MC_unknown"; output;)


Re: [ovs-discuss] Debugging ct dnat openflow action

2017-11-13 Thread Hui Xiang
Thanks Guru.

Yes, I have replaced the upstream kernel module with the one from the OVS
2.8.1 repo. What I mean is that [1], which adds "openvswitch: NAT support" to
the Linux datapath, also seems to include the changes below, and those are not
part of the OVS kernel module, so I am wondering whether just replacing the
kernel module from the OVS repo is enough.

 include/uapi/linux/netfilter/nf_conntrack_common.h |  12 +-
 include/uapi/linux/openvswitch.h   |  49 ++
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c   |  30 +-
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c   |  30 +-


[1] https://www.mail-archive.com/netdev@vger.kernel.org/msg101556.html
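
As a side note, this is how I am checking which module is actually loaded and
what ovs-vswitchd detected (generic commands; the log path may differ per
distro):

  # the OVS-tree module reports its own version, the in-kernel one the kernel's
  modinfo openvswitch | grep -E '^(filename|version)'

  # what the datapath feature probe decided at ovs-vswitchd startup
  grep -i 'ct_state_nat' /var/log/openvswitch/ovs-vswitchd.log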


Because NAT doesn't work in my environment, I am trying to debug why; please
see the previous email. Thanks much for your help.

Hui.


On Tue, Nov 14, 2017 at 4:53 AM, Guru Shetty <g...@ovn.org> wrote:

>
>
> On 12 November 2017 at 22:43, Hui Xiang <xiangh...@gmail.com> wrote:
>
>> Does ovs linux dapath NAT work with linux kernel 4.4.70 version?
>>
>
> If you use the kernel module that comes with OVS repo, it will work. If
> you use the kernel module that comes by default with linux kernel, it
> won't. You can look at ovs-vswitchd.log when ovs-vswitchd starts to see a
> message of the form:
>
> 2017-11-13T20:53:01.635Z|00018|ofproto_dpif|INFO|system@ovs-system:
> Datapath does not support ct_state_nat
>
>
>
>>
>> I have seen below comments in the NEWS saying [1]
>> "
>> - Linux:
>> * OVS Linux datapath now implements Conntrack NAT action with all
>> supported Linux kernels.
>> "
>> However, the NAT support for ovs linux datath showed in [2] and
>> [3](below) means they are merged since kernel 4.6
>> "
>> Feature   Linux upstream   Linux OVS tree   Userspace   Hyper-V
>> NAT       4.6              YES              Yes         NO
>> "
>>
>> My understanding is that the NAT is only working with a minimal version
>> of kernel 4.6? Thanks much for any help.
>>
>> [1] https://github.com/openvswitch/ovs/blob/master/NEWS
>> [2] https://www.mail-archive.com/netdev@vger.kernel.org/msg101556.html
>> [3] http://docs.openvswitch.org/en/latest/faq/releases/
>>
>>
>> Hui.
>>
>>
>> On Fri, Nov 10, 2017 at 6:41 PM, Hui Xiang <xiangh...@gmail.com> wrote:
>>
>>> Hi Folks,
>>>
>>>
>>> I am now debugging OVN NAT with openstack, networking-ovn. now I am
>>> blocked at the dnat action step, if anyone can give a help or hint would be
>>> really appreciated.
>>>
>>> VM instance has fixedip 20.0.0.2 and floatingip 172.16.0.131
>>>
>>> Below are the lflow-trace, openflow-trace and related openflow table.
>>>
>>> From lflow-trace, the ip4.dst=172.16.0.131 is expected turn to 20.0.0.2
>>> by ct_dnat, and then when go to next table, the nw_dst will be
>>> 20.0.0.0/24, but actually from the openflow-trace after
>>> ct_dnat(20.0.0.2), the nw_dst is still 172.16.0.0/24 in the next
>>> routing table, does there's something wrong or I miss anything in the ct
>>> dnat? it is using the ovs 2.8.1 kernel conntrack, where should I looked?
>>> Thanks much.
>>>
>>>
>>> # lflow trace
>>> ct_snat /* assuming no un-snat entry, so no change */
>>> -
>>>  4. lr_in_dnat (ovn-northd.c:5007): ip && ip4.dst == 172.16.0.131 &&
>>> inport == "lrp-640d04" && is_chassis_resident("cr-lrp-640d04"),
>>> priority 100, uuid 5d67b33f
>>> ct_dnat(20.0.0.2);
>>>
>>> ct_dnat(ip4.dst=20.0.0.2)
>>> -
>>>  5. lr_in_ip_routing (ovn-northd.c:4140): ip4.dst == 20.0.0.0/24,
>>> priority 49, uuid e869d362
>>> ip.ttl--;
>>> reg0 = ip4.dst;
>>> reg1 = 20.0.0.1;
>>> eth.src = fa:16:3e:b5:99:71;
>>> outport = "lrp-82f211";
>>> flags.loopback = 1;
>>> next;
>>>
>>> # corresponding openflow trace
>>> 12. ip,reg14=0x1,metadata=0x3,nw_dst=172.16.0.131, priority 100, cookie
>>> 0x5d67b33f
>>> ct(commit,table=13,zone=NXM_NX_REG11[0..15],nat(dst=20.0.0.2))
>>> nat(dst=20.0.0.2)
>>>  -> A clone of the packet is forked to recirculate. The forked
>>> pipeline will be resumed at table 13.
>>>
>>> Final flow: unchanged
>>> Megaflow: recirc_id=0x19,eth,ip,in_port=0,nw_dst=172.16.0.131,nw_frag=no
>>> Datapath actions: ct(commit,zone=7,nat(dst=20.0.0.2)),recirc(0x1a)
>>>
>>> ===

Re: [ovs-discuss] Debugging ct dnat openflow action

2017-11-12 Thread Hui Xiang
Does OVS Linux datapath NAT work with Linux kernel version 4.4.70?

I have seen the following comment in the NEWS file [1]:
"
- Linux:
* OVS Linux datapath now implements Conntrack NAT action with all
supported Linux kernels.
"
However, the NAT support matrix for the OVS Linux datapath shown in [2] and
[3] (below) indicates it was only merged upstream in kernel 4.6
"
Feature   Linux upstream   Linux OVS tree   Userspace   Hyper-V
NAT       4.6              YES              Yes         NO
"

Is my understanding correct that NAT only works with a minimum kernel version
of 4.6? Thanks much for any help.

[1] https://github.com/openvswitch/ovs/blob/master/NEWS
[2] https://www.mail-archive.com/netdev@vger.kernel.org/msg101556.html
[3] http://docs.openvswitch.org/en/latest/faq/releases/


Hui.


On Fri, Nov 10, 2017 at 6:41 PM, Hui Xiang <xiangh...@gmail.com> wrote:

> Hi Folks,
>
>
> I am now debugging OVN NAT with openstack, networking-ovn. now I am
> blocked at the dnat action step, if anyone can give a help or hint would be
> really appreciated.
>
> VM instance has fixedip 20.0.0.2 and floatingip 172.16.0.131
>
> Below are the lflow-trace, openflow-trace and related openflow table.
>
> From lflow-trace, the ip4.dst=172.16.0.131 is expected turn to 20.0.0.2 by
> ct_dnat, and then when go to next table, the nw_dst will be 20.0.0.0/24,
> but actually from the openflow-trace after ct_dnat(20.0.0.2), the nw_dst is
> still 172.16.0.0/24 in the next routing table, does there's something
> wrong or I miss anything in the ct dnat? it is using the ovs 2.8.1 kernel
> conntrack, where should I looked? Thanks much.
>
>
> # lflow trace
> ct_snat /* assuming no un-snat entry, so no change */
> -
>  4. lr_in_dnat (ovn-northd.c:5007): ip && ip4.dst == 172.16.0.131 &&
> inport == "lrp-640d04" && is_chassis_resident("cr-lrp-640d04"), priority
> 100, uuid 5d67b33f
> ct_dnat(20.0.0.2);
>
> ct_dnat(ip4.dst=20.0.0.2)
> -
>  5. lr_in_ip_routing (ovn-northd.c:4140): ip4.dst == 20.0.0.0/24,
> priority 49, uuid e869d362
> ip.ttl--;
> reg0 = ip4.dst;
> reg1 = 20.0.0.1;
> eth.src = fa:16:3e:b5:99:71;
> outport = "lrp-82f211";
> flags.loopback = 1;
> next;
>
> # corresponding openflow trace
> 12. ip,reg14=0x1,metadata=0x3,nw_dst=172.16.0.131, priority 100, cookie
> 0x5d67b33f
> ct(commit,table=13,zone=NXM_NX_REG11[0..15],nat(dst=20.0.0.2))
> nat(dst=20.0.0.2)
>  -> A clone of the packet is forked to recirculate. The forked
> pipeline will be resumed at table 13.
>
> Final flow: unchanged
> Megaflow: recirc_id=0x19,eth,ip,in_port=0,nw_dst=172.16.0.131,nw_frag=no
> Datapath actions: ct(commit,zone=7,nat(dst=20.0.0.2)),recirc(0x1a)
>
> 
> ===
> recirc(0x1a) - resume conntrack with default ct_state=trk|new (use
> --ct-next to customize)
> 
> ===
>
> Flow: recirc_id=0x1a,ct_state=new|trk,eth,icmp,reg11=0x7,reg12=
> 0x3,reg14=0x1,metadata=0x3,vlan_tci=0x,dl_src=00:00:
> 00:00:00:00,dl_dst=fa:16:3e:2e:ea:e9,nw_src=172.16.0.2,nw_
> dst=172.16.0.131,nw_tos=0,nw_ecn=0,nw_ttl=32,icmp_type=0,icmp_code=0
>
> bridge("br-ex")
> ---
> thaw
> Resuming from table 13
> 13. ip,metadata=0x3,nw_dst=172.16.0.0/16, priority 33, cookie 0x9e4db527
> dec_ttl()
> move:NXM_OF_IP_DST[]->NXM_NX_XXREG0[96..127]
>  -> NXM_NX_XXREG0[96..127] is now 0xac100083
> load:0xac100082->NXM_NX_XXREG0[64..95]
> set_field:fa:16:3e:2e:ea:e9->eth_src
> set_field:0x1->reg15
> load:0x1->NXM_NX_REG10[0]
> resubmit(,14)
>
>
> # openflow table
>  cookie=0x5d67b33f, duration=4600.548s, table=12, n_packets=3,
> n_bytes=294, priority=100,ip,reg14=0x1,metadata=0x3,nw_dst=172.16.0.131
> actions=ct(commit,table=13,zone=NXM_NX_REG11[0..15],nat(dst=20.0.0.2))
>  cookie=0xe869d362, duration=4600.551s, table=13, n_packets=3,
> n_bytes=294, priority=49,ip,metadata=0x3,nw_dst=20.0.0.0/24
> actions=dec_ttl(),move:NXM_OF_IP_DST[]->NXM_NX_XXREG0[96..127],load:
> 0x1401->NXM_NX_XXREG0[64..95],set_field:fa:16:3e:b5:99:
> 71->eth_src,set_field:0x3->reg15,load:0x1->NXM_NX_REG10[0],resubmit(,14)
>  cookie=0x9e4db527, duration=4600.547s, table=13, n_packets=0, n_bytes=0,
> priority=33,ip,metadata=0x3,nw_dst=172.16.0.0/16
> actions=dec_ttl(),move:NXM_OF_IP_DST[]->NXM_NX_XXREG0[96..127],load:
> 0xac100082->NXM_NX_XXREG0[64..95],set_field:fa:16:3e:2e:ea:
> e9->eth_src,set_field:0x1->reg15,load:0x1->NXM_NX_REG10[0],resubmit(,14)
>
>
> Hui.
>


[ovs-discuss] Debugging ct dnat openflow action

2017-11-10 Thread Hui Xiang
Hi Folks,


I am now debugging OVN NAT with OpenStack networking-ovn, and I am blocked at
the dnat action step; any help or hint would be really appreciated.

The VM instance has fixed IP 20.0.0.2 and floating IP 172.16.0.131.

Below are the lflow-trace, openflow-trace and related openflow table.

From the lflow trace, ip4.dst=172.16.0.131 is expected to be turned into
20.0.0.2 by ct_dnat, so in the next table nw_dst should be in 20.0.0.0/24.
But in the OpenFlow trace, after ct_dnat(20.0.0.2) the nw_dst is still in
172.16.0.0/24 in the next routing table. Is there something wrong, or am I
missing something about ct dnat? This is using the OVS 2.8.1 kernel
conntrack; where should I look? Thanks much.


# lflow trace
ct_snat /* assuming no un-snat entry, so no change */
-
 4. lr_in_dnat (ovn-northd.c:5007): ip && ip4.dst == 172.16.0.131 && inport
== "lrp-640d04" && is_chassis_resident("cr-lrp-640d04"), priority 100, uuid
5d67b33f
ct_dnat(20.0.0.2);

ct_dnat(ip4.dst=20.0.0.2)
-
 5. lr_in_ip_routing (ovn-northd.c:4140): ip4.dst == 20.0.0.0/24, priority
49, uuid e869d362
ip.ttl--;
reg0 = ip4.dst;
reg1 = 20.0.0.1;
eth.src = fa:16:3e:b5:99:71;
outport = "lrp-82f211";
flags.loopback = 1;
next;

# corresponding openflow trace
12. ip,reg14=0x1,metadata=0x3,nw_dst=172.16.0.131, priority 100, cookie
0x5d67b33f
ct(commit,table=13,zone=NXM_NX_REG11[0..15],nat(dst=20.0.0.2))
nat(dst=20.0.0.2)
 -> A clone of the packet is forked to recirculate. The forked pipeline
will be resumed at table 13.

Final flow: unchanged
Megaflow: recirc_id=0x19,eth,ip,in_port=0,nw_dst=172.16.0.131,nw_frag=no
Datapath actions: ct(commit,zone=7,nat(dst=20.0.0.2)),recirc(0x1a)

===
recirc(0x1a) - resume conntrack with default ct_state=trk|new (use
--ct-next to customize)
===

Flow:
recirc_id=0x1a,ct_state=new|trk,eth,icmp,reg11=0x7,reg12=0x3,reg14=0x1,metadata=0x3,vlan_tci=0x,dl_src=00:00:00:00:00:00,dl_dst=fa:16:3e:2e:ea:e9,nw_src=172.16.0.2,nw_dst=172.16.0.131,nw_tos=0,nw_ecn=0,nw_ttl=32,icmp_type=0,icmp_code=0

bridge("br-ex")
---
thaw
Resuming from table 13
13. ip,metadata=0x3,nw_dst=172.16.0.0/16, priority 33, cookie 0x9e4db527
dec_ttl()
move:NXM_OF_IP_DST[]->NXM_NX_XXREG0[96..127]
 -> NXM_NX_XXREG0[96..127] is now 0xac100083
load:0xac100082->NXM_NX_XXREG0[64..95]
set_field:fa:16:3e:2e:ea:e9->eth_src
set_field:0x1->reg15
load:0x1->NXM_NX_REG10[0]
resubmit(,14)


# openflow table
 cookie=0x5d67b33f, duration=4600.548s, table=12, n_packets=3, n_bytes=294,
priority=100,ip,reg14=0x1,metadata=0x3,nw_dst=172.16.0.131
actions=ct(commit,table=13,zone=NXM_NX_REG11[0..15],nat(dst=20.0.0.2))
 cookie=0xe869d362, duration=4600.551s, table=13, n_packets=3, n_bytes=294,
priority=49,ip,metadata=0x3,nw_dst=20.0.0.0/24
actions=dec_ttl(),move:NXM_OF_IP_DST[]->NXM_NX_XXREG0[96..127],load:
0x1401->NXM_NX_XXREG0[64..95],set_field:fa:16:3e:b5:99:71->eth_src,set_field:0x3->reg15,load:0x1->NXM_NX_REG10[0],resubmit(,14)
 cookie=0x9e4db527, duration=4600.547s, table=13, n_packets=0, n_bytes=0,
priority=33,ip,metadata=0x3,nw_dst=172.16.0.0/16
actions=dec_ttl(),move:NXM_OF_IP_DST[]->NXM_NX_XXREG0[96..127],load:
0xac100082->NXM_NX_XXREG0[64..95],set_field:fa:16:3e:2e:ea:e9->eth_src,set_field:0x1->reg15,load:0x1->NXM_NX_REG10[0],resubmit(,14)
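
In case it helps narrow this down, I also plan to look directly at the
datapath conntrack and flow tables (generic commands; the grep patterns are
just the addresses from above):

  # was a NAT conntrack entry actually created for this flow?
  ovs-appctl dpctl/dump-conntrack | grep 20.0.0.2

  # what do the installed datapath flows look like after recirculation?
  ovs-appctl dpctl/dump-flows | grep 172.16.0.131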


Hui.


Re: [ovs-discuss] OVN patch port for localnet can't be created

2017-11-10 Thread Hui Xiang
Manually adding nat-addresses=router to the router-type logical switch port
makes it go further; the issues above are not the problem now. Thanks all.
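
For reference, "manually adding" here means something along these lines (the
port name is the router-type port from the ovn-nbctl show output quoted
below):

  ovn-nbctl lsp-set-options 640d0475-ff83-47b7-8a4d-9ea0e770fb24 nat-addresses=router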

On Thu, Nov 9, 2017 at 5:54 PM, Hui Xiang <xiangh...@gmail.com> wrote:

> The ovn-trace target to vm floatingip is finally identified as MC_unknown
> because there are only three lflows in the lswitch for public net, does it
> correct for normal case, where should I look next?
>
> lflow:
>   table=15(ls_in_l2_lkup  ), priority=100  , match=(eth.mcast),
> action=(outport = "_MC_flood"; output;)
>   table=15(ls_in_l2_lkup  ), priority=50   , match=(eth.dst ==
> fa:16:3e:2e:ea:e9 && 
> is_chassis_resident("cr-lrp-640d0475-ff83-47b7-8a4d-9ea0e770fb24")),
> action=(outport = "640d0475-ff83-47b7-8a4d-9ea0e770fb24"; output;)
>   table=15(ls_in_l2_lkup  ), priority=0, match=(1),
> action=(outport = "_MC_unknown"; output;)
>
>
> vm floatingip detail:
> | b20f8e5d-8e91-4f12-84b1-409728706844 |   |
> fa:16:3e:23:33:fa | {"subnet_id": "3659502b-a41b-4c3f-964c-457e6782ae79",
> "ip_address": "172.16.0.131"} |
>
>
>
> [root@node-1 ~]# ovn-trace --detail 9ac587cd-becc-4584-99c9-24282f8707e2
> 'inport == "provnet-668bf1b8-cfec-44d1-b659-30a42c83e29f" && eth.src ==
> 96:95:d8:7d:b9:4c && ip4.src == 172.16.0.2 && eth.dst == fa:16:3e:23:33:fa
> && ip4.dst == 172.16.0.131 && ip.ttl == 32'
> # ip,reg14=0x1,vlan_tci=0x,dl_src=96:95:d8:7d:b9:4c,dl_
> dst=fa:16:3e:23:33:fa,nw_src=172.16.0.2,nw_dst=172.16.0.
> 131,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=32
>
> ingress(dp="public", inport="provnet-668bf1")
> -
>  0. ls_in_port_sec_l2 (ovn-northd.c:3556): inport == "provnet-668bf1",
> priority 50, uuid 25b99b9a
> next;
> 10. ls_in_arp_rsp (ovn-northd.c:3588): inport == "provnet-668bf1",
> priority 100, uuid b3607f68
> next;
> 15. ls_in_l2_lkup (ovn-northd.c:3975): 1, priority 0, uuid 83de4750
> outport = "_MC_unknown";
> output;
>
> multicast(dp="public", mcgroup="_MC_unknown")
> -
>
> egress(dp="public", inport="provnet-668bf1", outport="provnet-668bf1")
> --
> /* omitting output because inport == outport && !flags.loopback */
>
>
> On Thu, Nov 9, 2017 at 5:26 PM, Hui Xiang <xiangh...@gmail.com> wrote:
>
>> Hi Numan,
>>
>>   Yes, it is the openstack setup with networking-ovn.
>>   "ovn-nbctl list logical_router_port | grep redirect-chassis"  has
>> null, and I had followed your command to set the redirect-chassis, but
>> before that the patch ports for localnet are
>> created successfully because I am recreating the environment again.
>>
>>Last time the patch ports are deleted by not clear reason when I am
>> doing ofproto/trace. I am trying to do it again because the vm
>> dnat_and_snat doesn't work in my openstack
>> and ovn environment.
>>
>>   ovn: 2.8.1,  networking-ovn: master(with most latest patches)
>>
>>   VM: | dcb35bb8-12d5-444b-96ba-22b60c70f950 | A| ACTIVE | -
>>   | Running | private=20.0.0.2, 172.16.0.131 |
>>
>>
>>
>>   [root@node-1 ~]# ovn-nbctl list logical_router_port
>>   _uuid   : e0c7e73e-42a7-469c-9b01-86dd9cc5563e
>>   enabled : []
>>   external_ids: {}
>>   gateway_chassis : []
>>   mac : "fa:16:3e:b5:99:71"
>>   name: "lrp-82f21142-a15a-4cbc-9c59-c4cbf88e2348"
>>   networks: ["20.0.0.1/24"]
>>   options : {redirect-chassis="f49d19ac-d1
>> 40-44d4-9f1c-3601a23d67d1"}
>>   peer: []
>>
>>   _uuid   : 6d67d962-38a5-4a29-86b5-067dc26f78d4
>>   enabled : []
>>   external_ids: {}
>>   gateway_chassis : [22ab4c4e-3bbc-4b41-8f65-09ddc060efb0]
>>   mac : "fa:16:3e:2e:ea:e9"
>>   name: "lrp-640d0475-ff83-47b7-8a4d-9ea0e770fb24"
>>   networks: ["172.16.0.130/16"]
>>   options : {}
>>   peer: []
>>
>>   [root@node-1 ~]# ovn-nbctl show
>>   switch 9ac587cd-becc-4584-99c9-24282f8707e2
>> (neutron-668bf1b8-cfec-44d1-b659-30a42c83e29f) (aka public)
>>   port provnet-668bf1b8-cfec-44d1-b659-30a42

Re: [ovs-discuss] OVN patch port for localnet can't be created

2017-11-09 Thread Hui Xiang
The ovn-trace targeting the VM floating IP is finally identified as
_MC_unknown, because there are only three lflows in the logical switch for the
public net. Is this correct for the normal case? Where should I look next?

lflow:
  table=15(ls_in_l2_lkup  ), priority=100  , match=(eth.mcast),
action=(outport = "_MC_flood"; output;)
  table=15(ls_in_l2_lkup  ), priority=50   , match=(eth.dst ==
fa:16:3e:2e:ea:e9 &&
is_chassis_resident("cr-lrp-640d0475-ff83-47b7-8a4d-9ea0e770fb24")),
action=(outport = "640d0475-ff83-47b7-8a4d-9ea0e770fb24"; output;)
  table=15(ls_in_l2_lkup  ), priority=0, match=(1), action=(outport
= "_MC_unknown"; output;)


vm floatingip detail:
| b20f8e5d-8e91-4f12-84b1-409728706844 |   |
fa:16:3e:23:33:fa | {"subnet_id": "3659502b-a41b-4c3f-964c-457e6782ae79",
"ip_address": "172.16.0.131"} |



[root@node-1 ~]# ovn-trace --detail 9ac587cd-becc-4584-99c9-24282f8707e2
'inport == "provnet-668bf1b8-cfec-44d1-b659-30a42c83e29f" && eth.src ==
96:95:d8:7d:b9:4c && ip4.src == 172.16.0.2 && eth.dst == fa:16:3e:23:33:fa
&& ip4.dst == 172.16.0.131 && ip.ttl == 32'
#
ip,reg14=0x1,vlan_tci=0x,dl_src=96:95:d8:7d:b9:4c,dl_dst=fa:16:3e:23:33:fa,nw_src=172.16.0.2,nw_dst=172.16.0.131,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=32

ingress(dp="public", inport="provnet-668bf1")
-
 0. ls_in_port_sec_l2 (ovn-northd.c:3556): inport == "provnet-668bf1",
priority 50, uuid 25b99b9a
next;
10. ls_in_arp_rsp (ovn-northd.c:3588): inport == "provnet-668bf1", priority
100, uuid b3607f68
next;
15. ls_in_l2_lkup (ovn-northd.c:3975): 1, priority 0, uuid 83de4750
outport = "_MC_unknown";
output;

multicast(dp="public", mcgroup="_MC_unknown")
-

egress(dp="public", inport="provnet-668bf1", outport="provnet-668bf1")
--
/* omitting output because inport == outport && !flags.loopback */


On Thu, Nov 9, 2017 at 5:26 PM, Hui Xiang <xiangh...@gmail.com> wrote:

> Hi Numan,
>
>   Yes, it is the openstack setup with networking-ovn.
>   "ovn-nbctl list logical_router_port | grep redirect-chassis"  has null,
> and I had followed your command to set the redirect-chassis, but before
> that the patch ports for localnet are
> created successfully because I am recreating the environment again.
>
>Last time the patch ports are deleted by not clear reason when I am
> doing ofproto/trace. I am trying to do it again because the vm
> dnat_and_snat doesn't work in my openstack
> and ovn environment.
>
>   ovn: 2.8.1,  networking-ovn: master(with most latest patches)
>
>   VM: | dcb35bb8-12d5-444b-96ba-22b60c70f950 | A| ACTIVE | -
> | Running | private=20.0.0.2, 172.16.0.131 |
>
>
>
>   [root@node-1 ~]# ovn-nbctl list logical_router_port
>   _uuid   : e0c7e73e-42a7-469c-9b01-86dd9cc5563e
>   enabled : []
>   external_ids: {}
>   gateway_chassis : []
>   mac : "fa:16:3e:b5:99:71"
>   name: "lrp-82f21142-a15a-4cbc-9c59-c4cbf88e2348"
>   networks: ["20.0.0.1/24"]
>   options : {redirect-chassis="f49d19ac-
> d140-44d4-9f1c-3601a23d67d1"}
>   peer: []
>
>   _uuid   : 6d67d962-38a5-4a29-86b5-067dc26f78d4
>   enabled : []
>   external_ids: {}
>   gateway_chassis : [22ab4c4e-3bbc-4b41-8f65-09ddc060efb0]
>   mac : "fa:16:3e:2e:ea:e9"
>   name: "lrp-640d0475-ff83-47b7-8a4d-9ea0e770fb24"
>   networks: ["172.16.0.130/16"]
>   options : {}
>   peer: []
>
>   [root@node-1 ~]# ovn-nbctl show
>   switch 9ac587cd-becc-4584-99c9-24282f8707e2 
> (neutron-668bf1b8-cfec-44d1-b659-30a42c83e29f)
> (aka public)
>   port provnet-668bf1b8-cfec-44d1-b659-30a42c83e29f
>   type: localnet
>   addresses: ["unknown"]
>   port 9572b4f3-0e02-4803-acab-1bf1d0b716cb (aka internal_gw)
>   addresses: [""]
>   port 640d0475-ff83-47b7-8a4d-9ea0e770fb24
>   type: router
>   router-port: lrp-640d0475-ff83-47b7-8a4d-9ea0e770fb24
>   switch 1a4ccee6-86ff-4ec4-9e42-3fffa1709d59 
> (neutron-599de5c1-200d-4962-a3ae-ca3e0434cc0a)
> (aka private)
>   port 6fd717e1-572e-4bad-8e74-cdda95d1ce49 (aka nic_1510210943.93)
>   addresses: ["fa:16:3e:b9:c2:dd 20.0.0.2"]
>   port 82f21142-a15a-4c

Re: [ovs-discuss] OVN patch port for localnet can't be created

2017-11-09 Thread Hui Xiang
Thanks Ben.

After adding sbrec_port_binding_add_clause_type(, OVSDB_F_EQ,
"localnet"); in the update_sb_monitors() function, the patch ports for
localnet can be created, but it was very confusing while debugging the
program: initially this hypervisor is set as the gateway chassis, so that
VMs spawned on other hypervisors can reach dnat_and_snat through this node
even though no VM is spawned on it. In the beginning the patch ports were
created successfully; then I booted two VMs and assigned floating IPs, but
they were not reachable, so I started running ofproto/trace. During that
time the patch ports were suddenly deleted and could not be created again.
GDB showed that no localnet-type binding could be found in the OVN SB IDL;
I don't quite understand the OVN SB IDL update/monitor logic so far.
Anyway, my environment is probably messed up; I have re-created it and
will try again.
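(As a quick cross-check while debugging, this is roughly how I compare what
the SB database actually contains with what the local IDL sees -- a sketch
using only standard commands:)

  ovn-sbctl find Port_Binding type=localnet   # localnet bindings in the SB DB itself
  ovn-sbctl list Port_Binding                 # full dump, to compare with the idl contents seen in gdb
  ovs-vsctl list-ports br-int | grep patch    # whether ovn-controller created any patch ports on br-int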


On Wed, Nov 8, 2017 at 11:11 AM, Ben Pfaff <b...@ovn.org> wrote:

> Is update_sb_monitors() in ovn-controller.c not doing the right thing to
> ensure that localnet ports are present?  That is the first place to
> look, I think.
>
> On Tue, Nov 07, 2017 at 04:36:12PM +0800, Hui Xiang wrote:
> > Hi folks,
> >
> >   When I am running ovn in one of my node having the gateway port
> connected
> > external network via localnet, the patch port can't be created between
> > br-ex(set by ovn-bridge-mappings) with br-int, after gdb, it seems the
> > result get from SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl)
> > doesn't include 'localnet' binding type, however it does exist from
> > ovn-sbctl list port_binding, either I am missing any configuration to
> make
> > it work or this is a bug.
> >
> >   Please have a look and thank much.
> >
> > external_ids: {hostname="node-1.domain.tld",
> > ovn-bridge-mappings="physnet1:br-ex", ovn-encap-ip="168.254.101.10",
> > ovn-encap-type=geneve, ovn-remote="tcp:192.168.0.2:6642",
> > rundir="/var/run/openvswitch",
> > system-id="88596f9f-e326-4e15-ae91-8cc014e7be86"}
> > iface_types : [geneve, gre, internal, lisp, patch, stt, system,
> > tap, vxlan]
> >
> > (gdb) n
> > 181 if (!strcmp(binding->type, "localnet")) {
> > 4: binding->type = 0x55e7189608b0 "patch"
> > (gdb) display binding->logical_port
> > 5: binding->logical_port = 0x55e718960650
> > "b3edbc9a-3248-43e5-b84e-01689a9c83e2"
> > (gdb) n
> > 183 } else if (!strcmp(binding->type, "l2gateway")) {
> > 5: binding->logical_port = 0x55e718960650
> > "b3edbc9a-3248-43e5-b84e-01689a9c83e2"
> > 4: binding->type = 0x55e7189608b0 "patch"
> > (gdb)
> > 193 continue;
> > 5: binding->logical_port = 0x55e718960650
> > "b3edbc9a-3248-43e5-b84e-01689a9c83e2"
> > 4: binding->type = 0x55e7189608b0 "patch"
> > (gdb)
> > 179 SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
> > 5: binding->logical_port = 0x55e718960650
> > "b3edbc9a-3248-43e5-b84e-01689a9c83e2"
> > 4: binding->type = 0x55e7189608b0 "patch"
> > (gdb)
> > 181 if (!strcmp(binding->type, "localnet")) {
> > 5: binding->logical_port = 0x55e7189622d0
> > "lrp-3a938edc-8809-4b79-b1a6-8145066e4fe3"
> > 4: binding->type = 0x55e718962380 "patch"
> > (gdb)
> > 183 } else if (!strcmp(binding->type, "l2gateway")) {
> > 5: binding->logical_port = 0x55e7189622d0
> > "lrp-3a938edc-8809-4b79-b1a6-8145066e4fe3"
> > 4: binding->type = 0x55e718962380 "patch"
> > (gdb)
> > 193 continue;
> > 5: binding->logical_port = 0x55e7189622d0
> > "lrp-3a938edc-8809-4b79-b1a6-8145066e4fe3"
> > 4: binding->type = 0x55e718962380 "patch"
> > (gdb)
> > 179 SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
> > 5: binding->logical_port = 0x55e7189622d0
> > "lrp-3a938edc-8809-4b79-b1a6-8145066e4fe3"
> > 4: binding->type = 0x55e718962380 "patch"
> > (gdb)
> > 181 if (!strcmp(binding->type, "localnet")) {
> > 5: binding->logical_port = 0x55e718962820
> > "lrp-b3edbc9a-3248-43e5-b84e-01689a9c83e2"
> > 4: binding->type = 0x55e7189628d0 "patch"
> > (gdb)
> > 183 } else if (!strcmp(binding->type, "l2gateway")) {
> > 5: binding->logical_port = 0x55e718962820
> > &quo

Re: [ovs-discuss] OVN patch port for localnet can't be created

2017-11-07 Thread Hui Xiang
It seems it doesn't even get the "" type, only patch and chassisredirect.

gateway_chassis : []
logical_port: "d94cb413-f53a-4943-9590-c75e60e63568"
mac : [""]
nat_addresses   : []
options : {}
parent_port : []
tag : []
tunnel_key  : 3
type: ""

On Tue, Nov 7, 2017 at 4:36 PM, Hui Xiang <xiangh...@gmail.com> wrote:

> Hi folks,
>
>   When I am running ovn in one of my node having the gateway port
> connected external network via localnet, the patch port can't be created
> between br-ex(set by ovn-bridge-mappings) with br-int, after gdb, it seems
> the result get from SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl)
> doesn't include 'localnet' binding type, however it does exist from
> ovn-sbctl list port_binding, either I am missing any configuration to make
> it work or this is a bug.
>
>   Please have a look and thank much.
>
> external_ids: {hostname="node-1.domain.tld",
> ovn-bridge-mappings="physnet1:br-ex", ovn-encap-ip="168.254.101.10",
> ovn-encap-type=geneve, ovn-remote="tcp:192.168.0.2:6642",
> rundir="/var/run/openvswitch", system-id="88596f9f-e326-4e15-
> ae91-8cc014e7be86"}
> iface_types : [geneve, gre, internal, lisp, patch, stt, system,
> tap, vxlan]
>
> (gdb) n
> 181 if (!strcmp(binding->type, "localnet")) {
> 4: binding->type = 0x55e7189608b0 "patch"
> (gdb) display binding->logical_port
> 5: binding->logical_port = 0x55e718960650 "b3edbc9a-3248-43e5-b84e-
> 01689a9c83e2"
> (gdb) n
> 183 } else if (!strcmp(binding->type, "l2gateway")) {
> 5: binding->logical_port = 0x55e718960650 "b3edbc9a-3248-43e5-b84e-
> 01689a9c83e2"
> 4: binding->type = 0x55e7189608b0 "patch"
> (gdb)
> 193 continue;
> 5: binding->logical_port = 0x55e718960650 "b3edbc9a-3248-43e5-b84e-
> 01689a9c83e2"
> 4: binding->type = 0x55e7189608b0 "patch"
> (gdb)
> 179 SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
> 5: binding->logical_port = 0x55e718960650 "b3edbc9a-3248-43e5-b84e-
> 01689a9c83e2"
> 4: binding->type = 0x55e7189608b0 "patch"
> (gdb)
> 181 if (!strcmp(binding->type, "localnet")) {
> 5: binding->logical_port = 0x55e7189622d0 "lrp-3a938edc-8809-4b79-b1a6-
> 8145066e4fe3"
> 4: binding->type = 0x55e718962380 "patch"
> (gdb)
> 183 } else if (!strcmp(binding->type, "l2gateway")) {
> 5: binding->logical_port = 0x55e7189622d0 "lrp-3a938edc-8809-4b79-b1a6-
> 8145066e4fe3"
> 4: binding->type = 0x55e718962380 "patch"
> (gdb)
> 193 continue;
> 5: binding->logical_port = 0x55e7189622d0 "lrp-3a938edc-8809-4b79-b1a6-
> 8145066e4fe3"
> 4: binding->type = 0x55e718962380 "patch"
> (gdb)
> 179 SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
> 5: binding->logical_port = 0x55e7189622d0 "lrp-3a938edc-8809-4b79-b1a6-
> 8145066e4fe3"
> 4: binding->type = 0x55e718962380 "patch"
> (gdb)
> 181 if (!strcmp(binding->type, "localnet")) {
> 5: binding->logical_port = 0x55e718962820 "lrp-b3edbc9a-3248-43e5-b84e-
> 01689a9c83e2"
> 4: binding->type = 0x55e7189628d0 "patch"
> (gdb)
> 183 } else if (!strcmp(binding->type, "l2gateway")) {
> 5: binding->logical_port = 0x55e718962820 "lrp-b3edbc9a-3248-43e5-b84e-
> 01689a9c83e2"
> 4: binding->type = 0x55e7189628d0 "patch"
> (gdb)
> 193 continue;
> 5: binding->logical_port = 0x55e718962820 "lrp-b3edbc9a-3248-43e5-b84e-
> 01689a9c83e2"
> 4: binding->type = 0x55e7189628d0 "patch"
> (gdb)
> 179 SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
> 5: binding->logical_port = 0x55e718962820 "lrp-b3edbc9a-3248-43e5-b84e-
> 01689a9c83e2"
> (gdb) n
> 181 if (!strcmp(binding->type, "localnet")) {
> 4: binding->type = 0x55e7189608b0 "patch"
> (gdb) display binding->logical_port
> 5: binding->logical_port = 0x55e718960650 "b3edbc9a-3248-43e5-b84e-
> 01689a9c83e2"
> (gdb) n 183 } else if (!strcmp(binding->type, "l2gateway")) {
> 5: binding->logical_port = 0x55e718960650 "b3edbc9a-3248-43e5-b84e-
> 01689a9c83e2"
> 4: binding->type = 0x55e7189608b0 "patch"
> (gdb)
> 193 

[ovs-discuss] OVN patch port for localnet can't be created

2017-11-07 Thread Hui Xiang
Hi folks,

  When I am running OVN on one of my nodes that has the gateway port
connected to the external network via localnet, the patch port can't be
created between br-ex (set by ovn-bridge-mappings) and br-int. After
debugging with gdb, it seems the result from SBREC_PORT_BINDING_FOR_EACH
(binding, ctx->ovnsb_idl) doesn't include the 'localnet' binding type,
although it does exist in ovn-sbctl list port_binding. Either I am missing
some configuration to make it work, or this is a bug.

  Please have a look; thanks much.

external_ids: {hostname="node-1.domain.tld",
ovn-bridge-mappings="physnet1:br-ex", ovn-encap-ip="168.254.101.10",
ovn-encap-type=geneve, ovn-remote="tcp:192.168.0.2:6642",
rundir="/var/run/openvswitch",
system-id="88596f9f-e326-4e15-ae91-8cc014e7be86"}
iface_types : [geneve, gre, internal, lisp, patch, stt, system,
tap, vxlan]
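(For completeness, the bridge mapping above was configured in the usual way;
a minimal sketch using the standard external_ids keys, matching the values
shown above:)

  ovs-vsctl set Open_vSwitch . external_ids:ovn-bridge-mappings=physnet1:br-ex
  ovs-vsctl set Open_vSwitch . external_ids:ovn-encap-type=geneve
  ovs-vsctl set Open_vSwitch . external_ids:ovn-encap-ip=168.254.101.10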

(gdb) n
181 if (!strcmp(binding->type, "localnet")) {
4: binding->type = 0x55e7189608b0 "patch"
(gdb) display binding->logical_port
5: binding->logical_port = 0x55e718960650
"b3edbc9a-3248-43e5-b84e-01689a9c83e2"
(gdb) n
183 } else if (!strcmp(binding->type, "l2gateway")) {
5: binding->logical_port = 0x55e718960650
"b3edbc9a-3248-43e5-b84e-01689a9c83e2"
4: binding->type = 0x55e7189608b0 "patch"
(gdb)
193 continue;
5: binding->logical_port = 0x55e718960650
"b3edbc9a-3248-43e5-b84e-01689a9c83e2"
4: binding->type = 0x55e7189608b0 "patch"
(gdb)
179 SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
5: binding->logical_port = 0x55e718960650
"b3edbc9a-3248-43e5-b84e-01689a9c83e2"
4: binding->type = 0x55e7189608b0 "patch"
(gdb)
181 if (!strcmp(binding->type, "localnet")) {
5: binding->logical_port = 0x55e7189622d0
"lrp-3a938edc-8809-4b79-b1a6-8145066e4fe3"
4: binding->type = 0x55e718962380 "patch"
(gdb)
183 } else if (!strcmp(binding->type, "l2gateway")) {
5: binding->logical_port = 0x55e7189622d0
"lrp-3a938edc-8809-4b79-b1a6-8145066e4fe3"
4: binding->type = 0x55e718962380 "patch"
(gdb)
193 continue;
5: binding->logical_port = 0x55e7189622d0
"lrp-3a938edc-8809-4b79-b1a6-8145066e4fe3"
4: binding->type = 0x55e718962380 "patch"
(gdb)
179 SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
5: binding->logical_port = 0x55e7189622d0
"lrp-3a938edc-8809-4b79-b1a6-8145066e4fe3"
4: binding->type = 0x55e718962380 "patch"
(gdb)
181 if (!strcmp(binding->type, "localnet")) {
5: binding->logical_port = 0x55e718962820
"lrp-b3edbc9a-3248-43e5-b84e-01689a9c83e2"
4: binding->type = 0x55e7189628d0 "patch"
(gdb)
183 } else if (!strcmp(binding->type, "l2gateway")) {
5: binding->logical_port = 0x55e718962820
"lrp-b3edbc9a-3248-43e5-b84e-01689a9c83e2"
4: binding->type = 0x55e7189628d0 "patch"
(gdb)
193 continue;
5: binding->logical_port = 0x55e718962820
"lrp-b3edbc9a-3248-43e5-b84e-01689a9c83e2"
4: binding->type = 0x55e7189628d0 "patch"
(gdb)
179 SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
5: binding->logical_port = 0x55e718962820
"lrp-b3edbc9a-3248-43e5-b84e-01689a9c83e2"
(gdb) n
181 if (!strcmp(binding->type, "localnet")) {
4: binding->type = 0x55e7189608b0 "patch"
(gdb) display binding->logical_port
5: binding->logical_port = 0x55e718960650
"b3edbc9a-3248-43e5-b84e-01689a9c83e2"
(gdb) n 183 } else if (!strcmp(binding->type, "l2gateway")) {
5: binding->logical_port = 0x55e718960650
"b3edbc9a-3248-43e5-b84e-01689a9c83e2"
4: binding->type = 0x55e7189608b0 "patch"
(gdb)
193 continue;
5: binding->logical_port = 0x55e718960650
"b3edbc9a-3248-43e5-b84e-01689a9c83e2"
4: binding->type = 0x55e7189608b0 "patch"
(gdb)
179 SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
5: binding->logical_port = 0x55e718960650
"b3edbc9a-3248-43e5-b84e-01689a9c83e2"
4: binding->type = 0x55e7189608b0 "patch"
(gdb)
181 if (!strcmp(binding->type, "localnet")) {
5: binding->logical_port = 0x55e7189622d0
"lrp-3a938edc-8809-4b79-b1a6-8145066e4fe3"
4: binding->type = 0x55e718962380 "patch"
(gdb)
183 } else if (!strcmp(binding->type, "l2gateway")) { 5:
binding->logical_port = 0x55e7189622d0
"lrp-3a938edc-8809-4b79-b1a6-8145066e4fe3"
4: binding->type = 0x55e718962380 "patch"
(gdb)
193 continue;
5: binding->logical_port = 0x55e7189622d0
"lrp-3a938edc-8809-4b79-b1a6-8145066e4fe3"
4: binding->type = 0x55e718962380 "patch"
(gdb)
179 SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
5: binding->logical_port = 0x55e7189622d0
"lrp-3a938edc-8809-4b79-b1a6-8145066e4fe3"
4: binding->type = 0x55e718962380 "patch"
(gdb)
181 if (!strcmp(binding->type, "localnet")) {
5: binding->logical_port = 0x55e718962820
"lrp-b3edbc9a-3248-43e5-b84e-01689a9c83e2"
4: binding->type = 0x55e7189628d0 "patch"
(gdb)
183 } else if (!strcmp(binding->type, "l2gateway")) {
5: binding->logical_port = 0x55e718962820
"lrp-b3edbc9a-3248-43e5-b84e-01689a9c83e2"
4: binding->type = 

Re: [ovs-discuss] About the 'clone' action

2017-10-25 Thread Hui Xiang
Thanks Ben for your info.

I think I am finally catching up a bit with the concept of the 'clone'
action after reviewing the patches below in order. So, for the OVN case,
the new 'clone' action aims to drop most uses of OVS patch ports, and after
the new ct_clear action and some follow-up fixes were added, I assume it
works with NAT; I am using 2.8.1 for testing. Please point out any patches
I am missing.

One other question: can I set the bridge mapping to physnet1:br-int? In
other words, br-int would have the VM VIFs, the tunnel ports (which point
to another NIC), and the external NIC as the local gateway router port; is
that feasible without configuring extra flows?


[ovs-dev] [RFC PATCH] ofp-actions: Add clone action.
https://mail.openvswitch.org/pipermail/ovs-dev/2016-November/325542.html

ovn-controller: Drop most uses of OVS patch ports.
https://github.com/openvswitch/ovs/commit/f1a8bd06d58f2c5312622fbaeacbc6ce7576e347

[ovs-dev] [PATCH 2/3] dpif-netdev: Add clone action
https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327633.html

[ovs-dev] Issues with the use of the clone action for resubmission to the
pipeline
https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/326981.html

[ovs-dev] ovn ping from VM to external gateway IP failed.
https://mail.openvswitch.org/pipermail/ovs-dev/2016-December/326936.html

[ovs-dev] [PATCH v2 0/4] Fix some "clone"-related issues
https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327251.html

[ovs-dev] [PATCH v2 3/4] New action "ct_clear".
https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327268.html

[ovs-dev] [PATCH v2 4/4] ovn-controller: Clear conntrack state inside clone
action.
https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327255.html
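(To check my understanding of the semantics, here is a minimal, hypothetical
flow that uses clone together with ct_clear at the OpenFlow layer; a sketch
assuming 2.8 ovs-ofctl syntax and made-up port numbers, not taken from the
OVN-generated pipeline:)

  ovs-ofctl add-flow br-int "table=0,priority=100,ip,actions=clone(ct_clear,output:2),output:3"

The actions inside clone() work on a private copy of the flow state, so the
ct_clear there does not affect the final output:3.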

On Tue, Oct 24, 2017 at 12:21 AM, Ben Pfaff <b...@ovn.org> wrote:

> On Mon, Oct 23, 2017 at 04:28:53PM +0800, Hui Xiang wrote:
> > I am trying to understand clone action deeply and checking the patch
> > related with 'clone'[1], where said "The clone action provides an action
> > envelope to enclose an action list." and "The clone action is very
> similar
> > with the OpenFlow clone action recently added", but sorry I am not able
> to
> > find any "clone" action definition in the OpenFlow spec[2], or any
> envelope
> > 's kind of phrases, it just referred some clone(copy) of piple, am I
> > looking in a wrong place?
>
> Open vSwitch has two kinds of actions.  First, there are OpenFlow
> actions, which include both standard actions and extension actions.
> "clone" is an OpenFlow extension action implemented by Open vSwitch.
> Second, there are datapath actions, which are an implementation detail
> of Open vSwitch.  Open vSwitch has a datapath action also called
> "clone", which does something similar to the OpenFlow extension "clone"
> action but at the datapath level.
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


[ovs-discuss] About the 'clone' action

2017-10-23 Thread Hui Xiang
Hi folks,


I am trying to understand the clone action more deeply and have been
reading the patch related to 'clone' [1], which says "The clone action
provides an action envelope to enclose an action list." and "The clone
action is very similar with the OpenFlow clone action recently added".
However, I am not able to find any "clone" action definition in the
OpenFlow spec [2], or any 'envelope'-like wording; it only refers to
cloning (copying) of the pipeline. Am I looking in the wrong place?

Thx much.

[1] https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327633.html
[2]
https://www.opennetworking.org/wp-content/uploads/2014/10/openflow-switch-v1.5.1.pdf
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Does it work if netdev datapath connect system datapath through patch ports

2017-10-13 Thread Hui Xiang
Thanks Ben, that is a very helpful explanation.

On Fri, Oct 13, 2017 at 11:42 PM, Ben Pfaff <b...@ovn.org> wrote:

> On Fri, Oct 13, 2017 at 06:04:16PM +0800, Hui Xiang wrote:
> > Is the patch port output logistic within function
> apply_nested_clone_actions
> > ? Sorry, I have not totally understood the whole picture on how a packet
> > through
> > the datapath flow on the case that goes from netdev datapath patch port_B
> > to system datapath port_C. Is it such a procedure:
> > When a packet arrives at patch port_B of netdev datapath br1, the xlate
> ctx
> > switch to system datapath br2, and try to find a
> > rule, the rule probably missed, but I didn't find where this rule should
> be
> > preloaded, and why does it not work in this case.
> >
> > Coud you help to guide?
>
> Patch port output happens in patch_port_output() in ofproto-dpif-xlate.c.
>
> Patch ports are implemented entirely within a datapath.  Output to one
> happens by simply doing a nested translation of the actions that would
> happen at the other end of the patch port.  This can't really happen if
> the packet would pass to another datapath entirely.  One could implement
> some feature for hopping from one datapath to another.  We have not done
> this because the value of the feature seems limited; most installations
> of OVS use only a single datapath.
>
> > In additions, does it work with veth connected netdev datapath with
> system
> > datapath? It seems OVN will bridge the br-int with
> > the external network having another bridge with patch port, in this case,
> > if br-int is netdev, but external bridge is system,
> > then it doesn't work as what I understand so far.
>
> veths should be able to hop from one datapath to another, at least for
> datapaths that support veths.
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Does it work if netdev datapath connect system datapath through patch ports

2017-10-13 Thread Hui Xiang
Thanks Ben.

Is the patch port output logic within the function
apply_nested_clone_actions()? Sorry, I have not fully understood the whole
picture of how a packet goes through the datapath flow in the case where it
travels from netdev-datapath patch port_B to system-datapath port_C. Is the
procedure as follows: when a packet arrives at patch port_B of netdev
datapath br1, the xlate context switches to system datapath br2 and tries
to find a rule; the rule is probably missed, but I didn't find where this
rule should be preloaded, or why it does not work in this case.

Could you help guide me?

In addition, does it work with a veth connecting a netdev datapath to a
system datapath? It seems OVN bridges br-int to the external network via
another bridge with a patch port; in this case, if br-int is netdev but the
external bridge is system, then it doesn't work, as far as I understand so
far.
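(A minimal sketch of the veth variant I am asking about, assuming br-int is
the netdev-datapath bridge and br-ex is the system-datapath bridge:)

  ip link add veth-int type veth peer name veth-ex
  ip link set veth-int up
  ip link set veth-ex up
  ovs-vsctl add-port br-int veth-int   # bridge with datapath_type=netdev
  ovs-vsctl add-port br-ex veth-ex     # bridge with datapath_type=system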

Thanks much.

On Thu, Oct 12, 2017 at 11:48 PM, Ben Pfaff <b...@ovn.org> wrote:

> On Thu, Oct 12, 2017 at 05:22:14PM +0800, Hui Xiang wrote:
> >Seems it works with the same datapath type during the discussion in
> [1]
> > by finally done a flow of leaving two kernel netdev port communicate
> > together.
> >
> >But in my case I have a setup with two bridges has different datapath
> > type, netdev and system, need to be connected, port_A(in netdev datapath)
> > is in running in the userspace, port_D(in system datapath), will the flow
> > be port_A  port_D directly and packets go through userspace to
> > kernelspace and so on in the reverse direction,
> >  does patch port can make it work?
>
> No, that doesn't work.  (I should document that.)
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


[ovs-discuss] Does it work if netdev datapath connect system datapath through patch ports

2017-10-12 Thread Hui Xiang
Hi folks,

   It seems this works when both ends have the same datapath type, per the
discussion in [1], which ended with a flow letting two kernel netdev ports
communicate with each other.

   But in my case I have a setup with two bridges of different datapath
types, netdev and system, that need to be connected: port_A (in the netdev
datapath) runs in userspace and port_D is in the system datapath. Will the
flow be port_A <-> port_D directly, with packets going from userspace to
kernel space and back again in the reverse direction? Can a patch port make
it work?

The topology is
port_A --- br1(netdev) --- port_B (patch) <-> port_C (patch) --- br2(system) --- port_D
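(Expressed as configuration, this is the setup in question; a sketch only,
and as noted elsewhere in this thread, patch ports do not actually work
across different datapaths:)

  ovs-vsctl add-br br1 -- set bridge br1 datapath_type=netdev
  ovs-vsctl add-br br2 -- set bridge br2 datapath_type=system
  ovs-vsctl add-port br1 port_B -- set interface port_B type=patch options:peer=port_C
  ovs-vsctl add-port br2 port_C -- set interface port_C type=patch options:peer=port_B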



[1]
https://mail.openvswitch.org/pipermail/ovs-dev/2015-September/303675.html

Thanks much for your help.
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVS-DPDK IP fragmentation require

2017-07-28 Thread Hui Xiang
On Fri, Jul 28, 2017 at 12:52 PM, Darrell Ball <db...@vmware.com> wrote:

>
>
>
>
> *From: *Hui Xiang <xiangh...@gmail.com>
> *Date: *Thursday, July 27, 2017 at 8:10 PM
>
> *To: *Darrell Ball <db...@vmware.com>
> *Cc: *"ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *Re: [ovs-discuss] OVS-DPDK IP fragmentation require
>
>
>
>
>
>
>
> On Fri, Jul 28, 2017 at 10:54 AM, Darrell Ball <db...@vmware.com> wrote:
>
>
>
>
>
> *From: *Hui Xiang <xiangh...@gmail.com>
> *Date: *Thursday, July 27, 2017 at 6:59 PM
> *To: *Darrell Ball <db...@vmware.com>
> *Cc: *"ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *Re: [ovs-discuss] OVS-DPDK IP fragmentation require
>
>
>
>
>
>
>
> On Fri, Jul 28, 2017 at 1:12 AM, Darrell Ball <db...@vmware.com> wrote:
>
>
>
>
>
> *From: *Hui Xiang <xiangh...@gmail.com>
> *Date: *Thursday, July 27, 2017 at 3:18 AM
> *To: *Darrell Ball <db...@vmware.com>
> *Cc: *"ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *Re: [ovs-discuss] OVS-DPDK IP fragmentation require
>
>
>
>
>
> Blow is the diagram (using OVS-DPDK):
>
>
>
> 1. For packets coming to vm1 from internet where could have MTU 1500,
> there could be including some fragmented packets,
>
> how does the ALC/Security groups handle these fragmented packets? do
> nothing and pass it next which may pass the packets
>
> should be dropped or any special handling?
>
>
>
> Lets assume the fragments get thru. the physical switch and/or firewall.
>
>
>
> Are you using DPDK in GW and using OVS kernel datapath in br-int where you
> apply ACL/Security groups policy ?
>
> All are using DPDK, the ACL/Security groups policy said is OVS-DPDK
> conntrack implementation.
>
> With the case that we should have dropped some packets by creating special
> security group rules, but now due to they are fragmented and get thru by
> default, this is not what is expected.
>
>
>
> I would check your configuration.
>
> The dpdk connection tracker marks fragments as ‘invalid’ today and your
> rules should drop ‘invalid’.
>
> OK, thanks. here are the two scenarios we are discussing:
>
> 1.  For packets out from vms, use Jumbo Frame supported physical
> switches/routers within OpenStack cloud and enable it in all OVS-DPDK or do
> not allow application to send large frames.
>
>
>
> Try to use jumbo frames for performance reasons.
>
>
>
> On going out, if you get fragmentation done in HW at the physical
> switches, then
>
> 1)  If it could go back into one of your dpdk networks, then
> encourage using smaller packets
>
> 2)  If it goes somewhere else, then it does not matter, keep bigger
> packets
>
> Are you sure the physical switches do not support jumbo frames?
>
> Maybe it is just a config. change fix there.
>
>
>
A few physical switches in my lab probably only support a max MTU of 2000.

>
>
> 2. For packets coming from internet to OVS-DPDK, fragmented packets could
> be arrived, they are all dropped due to marks as 'invalid'.
>
>  With above analysis,  if these fragments are marked as 'invalid' and
> being dropped, the best way I can think about is to not use security group
> in OVS-DPDK if there could be fragments generated.
>
>
>
> If you already trust what gets to GW because you have a HW firewall, yes
>
> This assumes internally generated is always safe.
>
>
>
> Otherwise, you want to keep security groups and ‘encourage’ no fragments
> coming in, if you can
>
> ‘Encourage’ can be done by dropping and triggering checking why the
> fragments got generated in the first place
>
> Fragments may also indicate an exploit attempt, in which case, dropping is
> good.
>
Thanks Darrell, yep these are the solutions so far.

>
>
>
>
> Please correct me if I misunderstand anything.
>
>
>
> 2. For packets egress from vm1, if all internal physical switch support
> Jumbo Frame, that's fine, but if there are some physical swithes
>
> just support 1500/2000 MTU, then fragmented packets generated again.
> The ACL/Security groups face problem as item 1 as well.
>
>
>
>
>
> For packets that reach the physical switches on the way out, then the
> decision how to handle them is at the physical switch/router
>
> The packets may be fragmented at this point; depending on the switch;
> there will be HW firewall policies to contend with, so depends.
>
>
>
> Here, again what I mean is the packets are fragmen

Re: [ovs-discuss] OVS-DPDK IP fragmentation require

2017-07-27 Thread Hui Xiang
On Fri, Jul 28, 2017 at 10:54 AM, Darrell Ball <db...@vmware.com> wrote:

>
>
>
>
> *From: *Hui Xiang <xiangh...@gmail.com>
> *Date: *Thursday, July 27, 2017 at 6:59 PM
> *To: *Darrell Ball <db...@vmware.com>
> *Cc: *"ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *Re: [ovs-discuss] OVS-DPDK IP fragmentation require
>
>
>
>
>
>
>
> On Fri, Jul 28, 2017 at 1:12 AM, Darrell Ball <db...@vmware.com> wrote:
>
>
>
>
>
> *From: *Hui Xiang <xiangh...@gmail.com>
> *Date: *Thursday, July 27, 2017 at 3:18 AM
> *To: *Darrell Ball <db...@vmware.com>
> *Cc: *"ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *Re: [ovs-discuss] OVS-DPDK IP fragmentation require
>
>
>
>
>
> Blow is the diagram (using OVS-DPDK):
>
>
>
> 1. For packets coming to vm1 from internet where could have MTU 1500,
> there could be including some fragmented packets,
>
> how does the ALC/Security groups handle these fragmented packets? do
> nothing and pass it next which may pass the packets
>
> should be dropped or any special handling?
>
>
>
> Lets assume the fragments get thru. the physical switch and/or firewall.
>
>
>
> Are you using DPDK in GW and using OVS kernel datapath in br-int where you
> apply ACL/Security groups policy ?
>
> All are using DPDK, the ACL/Security groups policy said is OVS-DPDK
> conntrack implementation.
>
> With the case that we should have dropped some packets by creating special
> security group rules, but now due to they are fragmented and get thru by
> default, this is not what is expected.
>
>
>
> I would check your configuration.
>
> The dpdk connection tracker marks fragments as ‘invalid’ today and your
> rules should drop ‘invalid’.
>
OK, thanks. Here are the two scenarios we are discussing:
1. For packets going out from VMs: use jumbo-frame-capable physical
switches/routers within the OpenStack cloud and enable jumbo frames in all
OVS-DPDK instances, or do not allow applications to send large frames.
2. For packets coming from the internet to OVS-DPDK: fragmented packets may
arrive, and they are all dropped because they are marked as 'invalid'.
 With the above analysis, if these fragments are marked as 'invalid' and
dropped, the best approach I can think of is to not use security groups in
OVS-DPDK wherever fragments could be generated.
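(For scenario 2, a minimal sketch of the kind of rule Darrell refers to,
i.e. drop anything the connection tracker marks invalid; the table numbers
are hypothetical and not the actual OVN-generated pipeline:)

  ovs-ofctl add-flow br-int "table=0,priority=50,ip,actions=ct(table=1)"
  ovs-ofctl add-flow br-int "table=1,priority=100,ct_state=+trk+inv,ip,actions=drop"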

Please correct me if I misunderstand anything.

>
>
> 2. For packets egress from vm1, if all internal physical switch support
> Jumbo Frame, that's fine, but if there are some physical swithes
>
> just support 1500/2000 MTU, then fragmented packets generated again.
> The ACL/Security groups face problem as item 1 as well.
>
>
>
>
>
> For packets that reach the physical switches on the way out, then the
> decision how to handle them is at the physical switch/router
>
> The packets may be fragmented at this point; depending on the switch;
> there will be HW firewall policies to contend with, so depends.
>
>
>
> Here, again what I mean is the packets are fragmented by the physical
> switch/router, and they are switching/routing to a next node where has the
> OVS-DPDK set with security group, and OVS-DPDK may let them thru with
> ignoring the security group rules.
>
>
>
> Sorry, you lost me a bit here; in point ‘2’ above you said packets are
> going from vm1 to internet and are fine until they hit the physical switch
>
> Where you are assuming they are fragmented because the mtu is lower.
>
> If this is not going to the internet but rather another set of nodes
> running dpdk, then this is another variation of ‘1’ and hence we don’t
>
> need to discuss it.
>
>
>
>
>
> [image: line image 1]
>
>
>
> On Thu, Jul 27, 2017 at 2:49 PM, Darrell Ball <db...@vmware.com> wrote:
>
>
>
>
>
> *From: *Hui Xiang <xiangh...@gmail.com>
> *Date: *Wednesday, July 26, 2017 at 9:43 PM
> *To: *Darrell Ball <db...@vmware.com>
> *Cc: *"ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *Re: [ovs-discuss] OVS-DPDK IP fragmentation require
>
>
>
> Thanks Darrell, comment inline.
>
>
>
> On Thu, Jul 27, 2017 at 12:08 PM, Darrell Ball <db...@vmware.com> wrote:
>
>
>
>
>
> *From: *<ovs-discuss-boun...@openvswitch.org> on behalf of Hui Xiang <
> xiangh...@gmail.com>
> *Date: *Wednesday, July 26, 2017 at 7:47 PM
> *To: *"ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *[ovs-discuss] OVS-DPDK IP fragmentation require
>
>
>
> Hi guys,
>
>
>
>   Seems OVS-DPDK still missing IP fragmenta

Re: [ovs-discuss] OVS-DPDK IP fragmentation require

2017-07-27 Thread Hui Xiang
On Fri, Jul 28, 2017 at 1:12 AM, Darrell Ball <db...@vmware.com> wrote:

>
>
>
>
> *From: *Hui Xiang <xiangh...@gmail.com>
> *Date: *Thursday, July 27, 2017 at 3:18 AM
> *To: *Darrell Ball <db...@vmware.com>
> *Cc: *"ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *Re: [ovs-discuss] OVS-DPDK IP fragmentation require
>
>
>
>
>
> Blow is the diagram (using OVS-DPDK):
>
>
>
> 1. For packets coming to vm1 from internet where could have MTU 1500,
> there could be including some fragmented packets,
>
> how does the ALC/Security groups handle these fragmented packets? do
> nothing and pass it next which may pass the packets
>
> should be dropped or any special handling?
>
>
>
> Lets assume the fragments get thru. the physical switch and/or firewall.
>
>
>
> Are you using DPDK in GW and using OVS kernel datapath in br-int where you
> apply ACL/Security groups policy ?
>
All are using DPDK; the ACL/security-group policy mentioned is the OVS-DPDK
conntrack implementation.
The case is that we should have dropped some packets by creating specific
security group rules, but because they are fragmented they get through by
default, which is not what is expected.

>
>
> 2. For packets egress from vm1, if all internal physical switch support
> Jumbo Frame, that's fine, but if there are some physical swithes
>
> just support 1500/2000 MTU, then fragmented packets generated again.
> The ACL/Security groups face problem as item 1 as well.
>
>
>
>
>
> For packets that reach the physical switches on the way out, then the
> decision how to handle them is at the physical switch/router
>
> The packets may be fragmented at this point; depending on the switch;
> there will be HW firewall policies to contend with, so depends.
>
>
>
Here, again, what I mean is that the packets are fragmented by the physical
switch/router and then switched/routed to a next node that has OVS-DPDK set
up with security groups, and OVS-DPDK may let them through, ignoring the
security group rules.

>
>
>
>
> [image: nline image 1]
>
>
>
> On Thu, Jul 27, 2017 at 2:49 PM, Darrell Ball <db...@vmware.com> wrote:
>
>
>
>
>
> *From: *Hui Xiang <xiangh...@gmail.com>
> *Date: *Wednesday, July 26, 2017 at 9:43 PM
> *To: *Darrell Ball <db...@vmware.com>
> *Cc: *"ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *Re: [ovs-discuss] OVS-DPDK IP fragmentation require
>
>
>
> Thanks Darrell, comment inline.
>
>
>
> On Thu, Jul 27, 2017 at 12:08 PM, Darrell Ball <db...@vmware.com> wrote:
>
>
>
>
>
> *From: *<ovs-discuss-boun...@openvswitch.org> on behalf of Hui Xiang <
> xiangh...@gmail.com>
> *Date: *Wednesday, July 26, 2017 at 7:47 PM
> *To: *"ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *[ovs-discuss] OVS-DPDK IP fragmentation require
>
>
>
> Hi guys,
>
>
>
>   Seems OVS-DPDK still missing IP fragmentation support, is there any
> schedule to have it?
>
> OVS 2.9
>
> I'm  transferring to use OVN, but for those nodes which have external
> network connection, they may face this problem,
>
> except to configure Jumbo frames, is there any other workaround?
>
>
>
> I am not clear on the situation however.
>
> You mention about configuring jumbo frames which means you can avoid the
> fragments by doing this ?
>
> No, I can't guarantee that, only can do it inside OpenStack, it is
> limited.
>
> If this is true, then this is the best way to proceed since performance
> will be better.
>
> What is wrong with jumbo frames ?
>
> It's good but it's limited can't be guaranteed, so I am asking is there
> any other way without IP fragmentation so far.
>
>
>
> It sounds like you want to avoid IP fragmentation; so far so good.
>
> I am not sure I understand the whole picture though.
>
> Maybe you can describe what you see ?; maybe a simple diagram would help ?
>
>
>
>
>
> BR.
>
> Hui.
>
>
>
>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVS-DPDK IP fragmentation require

2017-07-27 Thread Hui Xiang
Below is the diagram (using OVS-DPDK):

1. For packets coming to vm1 from the internet, where the MTU could be 1500,
some fragmented packets may be included. How do the ACL/security groups
handle these fragmented packets? Do they do nothing and pass them along,
which may let through packets that should be dropped, or is there special
handling?
2. For packets egressing from vm1, if all the internal physical switches
support jumbo frames, that's fine; but if some physical switches only
support an MTU of 1500/2000, then fragmented packets are generated again,
and the ACL/security groups face the same problem as in item 1.

[image: Inline image 1]

On Thu, Jul 27, 2017 at 2:49 PM, Darrell Ball <db...@vmware.com> wrote:

>
>
>
>
> *From: *Hui Xiang <xiangh...@gmail.com>
> *Date: *Wednesday, July 26, 2017 at 9:43 PM
> *To: *Darrell Ball <db...@vmware.com>
> *Cc: *"ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *Re: [ovs-discuss] OVS-DPDK IP fragmentation require
>
>
>
> Thanks Darrell, comment inline.
>
>
>
> On Thu, Jul 27, 2017 at 12:08 PM, Darrell Ball <db...@vmware.com> wrote:
>
>
>
>
>
> *From: *<ovs-discuss-boun...@openvswitch.org> on behalf of Hui Xiang <
> xiangh...@gmail.com>
> *Date: *Wednesday, July 26, 2017 at 7:47 PM
> *To: *"ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *[ovs-discuss] OVS-DPDK IP fragmentation require
>
>
>
> Hi guys,
>
>
>
>   Seems OVS-DPDK still missing IP fragmentation support, is there any
> schedule to have it?
>
> OVS 2.9
>
> I'm  transferring to use OVN, but for those nodes which have external
> network connection, they may face this problem,
>
> except to configure Jumbo frames, is there any other workaround?
>
>
>
> I am not clear on the situation however.
>
> You mention about configuring jumbo frames which means you can avoid the
> fragments by doing this ?
>
> No, I can't guarantee that, only can do it inside OpenStack, it is
> limited.
>
> If this is true, then this is the best way to proceed since performance
> will be better.
>
> What is wrong with jumbo frames ?
>
> It's good but it's limited can't be guaranteed, so I am asking is there
> any other way without IP fragmentation so far.
>
>
>
> It sounds like you want to avoid IP fragmentation; so far so good.
>
> I am not sure I understand the whole picture though.
>
> Maybe you can describe what you see ?; maybe a simple diagram would help ?
>
>
>
>
>
> BR.
>
> Hui.
>
>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVS-DPDK IP fragmentation require

2017-07-26 Thread Hui Xiang
Thanks Darrell, comment inline.

On Thu, Jul 27, 2017 at 12:08 PM, Darrell Ball <db...@vmware.com> wrote:

>
>
>
>
> *From: *<ovs-discuss-boun...@openvswitch.org> on behalf of Hui Xiang <
> xiangh...@gmail.com>
> *Date: *Wednesday, July 26, 2017 at 7:47 PM
> *To: *"ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *[ovs-discuss] OVS-DPDK IP fragmentation require
>
>
>
> Hi guys,
>
>
>
>   Seems OVS-DPDK still missing IP fragmentation support, is there any
> schedule to have it?
>
> OVS 2.9
>
> I'm  transferring to use OVN, but for those nodes which have external
> network connection, they may face this problem,
>
> except to configure Jumbo frames, is there any other workaround?
>
>
>
> I am not clear on the situation however.
>
> You mention about configuring jumbo frames which means you can avoid the
> fragments by doing this ?
>
No, I can't guarantee that; I can only control it inside OpenStack, so it is limited.

> If this is true, then this is the best way to proceed since performance
> will be better.
>
> What is wrong with jumbo frames ?
>
It's good, but it's limited and can't be guaranteed, so I am asking whether
there is any other way to cope without IP fragmentation support so far.
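(For the jumbo-frame side, what I can control inside OpenStack is
essentially the per-port MTU; a sketch, assuming OVS >= 2.6 and a DPDK port
named dpdk0:)

  ovs-vsctl set Interface dpdk0 mtu_request=9000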

>
>
>
>
> BR.
>
> Hui.
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


[ovs-discuss] OVS-DPDK IP fragmentation require

2017-07-26 Thread Hui Xiang
Hi guys,

  It seems OVS-DPDK is still missing IP fragmentation support; is there any
schedule for adding it? I'm transitioning to OVN, but the nodes that have
an external network connection may face this problem. Apart from
configuring jumbo frames, is there any other workaround?

BR.
Hui.
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-07-03 Thread Hui Xiang
Your help was greatly appreciated, thanks Bodireddy.

On Mon, Jul 3, 2017 at 10:57 PM, Bodireddy, Bhanuprakash <
bhanuprakash.bodire...@intel.com> wrote:

> >>What is your use case(s) ?
> >>My usecase might be setup a VBRAS VNF with OVS-DPDK as an NFV normal
> >>case, and it requires a good performance, however, OVS-DPDK seems still
> >not
> >>reach its needs compared with  hardware offloading, we are evaluating VPP
> >as
> >>well,
> >As you mentioned VPP here, It's worth looking at the benchmarks that were
> >carried comparing
> >OvS and VPP for L3-VPN use case by Intel, Ericsson and was presented in
> OvS
> >Fall conference.
> >The slides can be found @
> >http://openvswitch.org/support/ovscon2016/8/1400-gray.pdf.
> >In above pdf page 12, why does classifier showed a constant throughput
> with
> >increasing concurrent L4 flows? shouldn't the performance get degradation
> >with more subtable look up as you mentioned.
>
> You raised a good point. The reason being the 'sorted subtable ranking'
> implementation in 2.7 release.
> With this we will have the subtable vector sorted by frequency of hits and
> this reduces the number of subtable lookups.
> That is the reason why I asked for the ' avg. subtable lookups per hit:'
> number.
>
> I recommend watching the video of the presentation here
> https://www.youtube.com/watch?v=cxRcfn2x4eE , as the
> bottlenecks you are referring in this thread are more or less similar to
> ones discussed at the conference.
>
> - Bhanuprakash.
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-07-03 Thread Hui Xiang
Thanks much again, Bodireddy! Comments inline.

On Mon, Jul 3, 2017 at 5:00 PM, Bodireddy, Bhanuprakash <
bhanuprakash.bodire...@intel.com> wrote:

> It’s a long weekend in US and will try answering some of your questions in
> Darrell's absence.
>
> >Why do think having more than 64k per PMD would be optimal ?
> >I originally thought that the bottleneck in classifier because it is
> saturated full
> >so that look up has to be going to flow table, so I think why not just
> increase
> >the dpcls flows per PMD, but seems I am wrong based on your explanation.
>
> For few use cases much of the bottleneck moves to Classifier when EMC is
> saturated. You may have
> to add more  PMD threads (again this depends on the availability of cores
> in your case.)
> As your initial investigation proved classifier is bottleneck, just
> curious about few things.
>  -  In the 'dpif-netdev/pmd-stats-show' output, what does the ' avg.
> subtable lookups per hit:'  looks like?
>  -  In steady state do 'dpcls_lookup()' top the list of functions with
> 'perf top'.
>
> Those are great advices, I'll check more.

> >What is your use case(s) ?
> >My usecase might be setup a VBRAS VNF with OVS-DPDK as an NFV normal
> >case, and it requires a good performance, however, OVS-DPDK seems still
> not
> >reach its needs compared with  hardware offloading, we are evaluating VPP
> as
> >well,
> As you mentioned VPP here, It's worth looking at the benchmarks that were
> carried comparing
> OvS and VPP for L3-VPN use case by Intel, Ericsson and was presented in
> OvS Fall conference.
> The slides can be found @ http://openvswitch.org/
> support/ovscon2016/8/1400-gray.pdf.
>
In the above PDF, page 12: why does the classifier show constant throughput
with an increasing number of concurrent L4 flows? Shouldn't performance
degrade with more subtable lookups, as you mentioned?

>
> basically I am looking to find out what's the bottleneck so far in OVS-
> >DPDK (seems in flow look up), and if there are some solutions being
> discussed
> >or working in progress.
>
> I personally did some investigation in this area. One of the bottlenecks
> in classifier is due to sub-table lookup.
> Murmur hash is used in OvS and it is  recommended enabling intrinsics with
> -march=native/CFLAGS="-msse4.2"  if not done.
> If you have more subtables, the lookups may be taking significant cycles.
> I presume you are using OvS 2.7. Some optimizations
> were done to  improve classifier  performance(subtable ranking, hash
> optimizations).
> If emc_lookup()/emc_insert() show up in top 5 functions taking significant
> cycles, worth disabling EMC as below.
>   'ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-
> prob=0'
>
Thanks much for your advice.

>
> >Are you wanting for this number to be larger by default ?
> >I am not sure, I need to understand whether it is good or bad to set it
> larger.
> >Are you wanting for this number to be configurable ?
> >Probably good.
> >
> >BTW, after reading part of DPDK document, it strengthens to decrease to
> copy
> >between cache and memory and get cache hit as much as possible to get
> >fewer cpu cycles to fetch data, but now I am totally lost on how does OVS-
> >DPDK emc and classifier map to the LLC.
>
> I didn't get your question here. PMD is like any other thread and has EMC
> and a classifier per ingress port.
> The EMC,  classifier subtables and other data structures will make it to
> LLC when accessed.
>
ACK.

>
> As already mentioned using RDT Cache Allocation Technology(CAT), one can
> assign cache ways to high priority threads
> https://software.intel.com/en-us/articles/introduction-to-
> cache-allocation-technology
>
> - Bhanuprakash.
>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-30 Thread Hui Xiang
Thanks Darrell, comment inline.

On Sat, Jul 1, 2017 at 1:02 AM, Darrell Ball <db...@vmware.com> wrote:

>
>
>
>
> *From: *Hui Xiang <xiangh...@gmail.com>
> *Date: *Thursday, June 29, 2017 at 6:57 PM
> *To: *Darrell Ball <db...@vmware.com>
> *Cc: *"Bodireddy, Bhanuprakash" <bhanuprakash.bodire...@intel.com>, "
> ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> *Subject: *Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
>
>
>
> I am interested about how to define 'reasonable' here, how it is got and
> what what is the 'many case'? is there any document/link to refer this
> information, please shed me some light.
>
>
>
> It is based on real usage scenarios for the number of megaflows needed.
>
> The usage may be less in most cases.
>
> In cases where larger, it may imply that more threads are better and
> dividing among queues.
>
Yes, more threads are better, but the overall number of cores is limited:
the more threads pinned to cores for OVS-DPDK, the fewer are available for VMs.

>
>
> Why do think having more than 64k per PMD would be optimal ?
>
I originally thought the bottleneck was in the classifier because it is
saturated, so lookups have to go to the flow table; so I thought, why not
just increase the dpcls flow limit per PMD. But it seems I was wrong, based
on your explanation.

> What is your use case(s) ?
>
My use case is to set up a vBRAS VNF with OVS-DPDK, a normal NFV case, and
it requires good performance; however, OVS-DPDK still does not seem to meet
those needs compared with hardware offloading. We are evaluating VPP as
well. Basically, I am trying to find out what the current bottleneck in
OVS-DPDK is (it seems to be in flow lookup), and whether there are
solutions being discussed or work in progress.

> Are you wanting for this number to be larger by default ?
>
I am not sure, I need to understand whether it is good or bad to set it
larger.

> Are you wanting for this number to be configurable ?
>
Probably good.

>
>
BTW, after reading part of the DPDK documentation, it stresses reducing
copies between cache and memory and getting cache hits as often as
possible, so that fewer CPU cycles are spent fetching data; but now I am
totally lost on how the OVS-DPDK EMC and classifier map onto the LLC.
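(The numbers I am looking at come from the PMD statistics; a sketch of how
I collect them, assuming the standard appctl commands:)

  ovs-appctl dpif-netdev/pmd-stats-clear
  # ... run traffic for a while ...
  ovs-appctl dpif-netdev/pmd-stats-show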

>
>
> On Thu, Jun 29, 2017 at 10:47 PM, Darrell Ball <db...@vmware.com> wrote:
>
> Q: “how it is calculated in such an exact number? “
>
> A: It is a reasonable number to accommodate many cases.
>
> Q: “If there are more ports added for polling, for avoid competing can I
> increase the 64k size into a
> bigger one?”
>
> A: If a larger number is needed, it may imply that adding another PMD and
> dividing the forwarding
> work would be best.  Maybe even a smaller number of flows may be best
> served with more PMDs.
>
>
>
>
>
>
> On 6/29/17, 7:23 AM, "ovs-discuss-boun...@openvswitch.org on behalf of
> Bodireddy, Bhanuprakash" <ovs-discuss-boun...@openvswitch.org on behalf
> of bhanuprakash.bodire...@intel.com> wrote:
>
> >
>
> >I guess the answer is now the general LLC is 2.5M per core so that
> there is 64k
>
> >flows per thread.
>
>
>
> AFAIK, the no. of flows here may not have to do anything with LLC.
> Also there is EMC cache(8k entries) of ~4MB per PMD thread.
>
>
>
>
>
> Yes the performance will be nice with simple test cases (P2P with 1
> PMD thread) as most of this fits in to LLC. But in real scenarios  OvS-DPDK
> can be memory bound.
>
>
>
> BTW, on my DUT the LLC is 35MB and has 28 cores and so the assumption
> of 2.5M/core isn't right.
>
>
>
> - Bhanuprakash.
>
>
>
> >
>
> >On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang <xiangh...@gmail.com>
> wrote:
>
> >Thanks Darrell,
>
> >
>
> >More questions:
>
> >Why not allocating 64k for each dpcls? does the 64k just fit in L3
> cache or
>
> >anywhere? how it is calculated in such an exact number?  If there are
> more
>
> >ports added for polling, for avoid competing can I increase the 64k
> size into a
>
> >bigger one? Thanks.
>
> >
>
> >Hui.
>
> >
>
> >
>
>
> ___
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>
>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-30 Thread Hui Xiang
Thank you very much, Bodireddy, appreciated your reply.



On Fri, Jun 30, 2017 at 5:19 PM, Bodireddy, Bhanuprakash <
bhanuprakash.bodire...@intel.com> wrote:

> >
> >Thanks Bodireddy.
> >
> >Sorry I am a bit confused about the EMC occupied size per PMD, here[1]
> has a
> >different story.
>
> Initially EMC had 1024 entries and the patch [1] increased it to 8k.  By
> doing so in simple test
> scenarios, most of the flows will hit EMC and we can achieve wire speed
> for smaller packets.
> With this patch the EMC cache size is now @ ~4MB.
>
> It should be noted that there can be multiple PMD threads and whole lot of
> other threads running
> on the compute. Cache is limited and all the threads contend for LLC
> resources. This leads to cache thrashing
> and impacts performance.
>
> To alleviate this problem, Intel's (Resource Director Technology) RDT is
> used to partition the LLC and
> assign cache ways to different threads based on priority.
>
> >
> >Do you mean in real scenarios OVS-DPDK can be memory bound on EMC?
> OvS is flexible and results are use case dependent.  With some use cases ,
> EMC quickly gets saturated and packets will be sent to classifier.
> Some of the bottlenecks I referred are in classifier. .
>
> >I thought EMC should be totally fit in LLC.
> As pointed earlier there may be lot of threads on the compute and this
> assumption may not be right always.
>
> >
> >If the megaflows just part in LLC, then the cost of copy between memory
> and
> >LLC should be large, isn't it not like what defined as 'fast path' in
> userspace
> >compared with kernel datapath? And if most of megaflows are in memory,
> >the reason of every PMD  has one dpcls instance is to follow the rule PMD
> >thread should has local data as most as it can, but not every PMD put it
> in its
> >local cache, if that is true, I can't see why 64k is the limit num,
> unless this is an
> >experience best value calculated from vtune/perf resutls.
> >
> >You are probably enabled hyper-thread with 35MB and got 28 cores.
>
> I have E5-2695 v3, dual socket with 14 cores per socket. I will have 56
> cores with HT enabled.
>
> - Bhanuprakash.
>
> >
> >[1] https://mail.openvswitch.org/pipermail/ovs-dev/2015-May/298999.html
> >
> >
> >
> >On Thu, Jun 29, 2017 at 10:23 PM, Bodireddy, Bhanuprakash
> ><bhanuprakash.bodire...@intel.com> wrote:
> >>
> >>I guess the answer is now the general LLC is 2.5M per core so that there
> is
> >64k
> >>flows per thread.
> >
> >AFAIK, the no. of flows here may not have to do anything with LLC.  Also
> there
> >is EMC cache(8k entries) of ~4MB per PMD thread.
> >Yes the performance will be nice with simple test cases (P2P with 1 PMD
> >thread) as most of this fits in to LLC. But in real scenarios  OvS-DPDK
> can be
> >memory bound.
> >
> >BTW, on my DUT the LLC is 35MB and has 28 cores and so the assumption of
> >2.5M/core isn't right.
> >
> >- Bhanuprakash.
> >
> >>
> >>On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang <xiangh...@gmail.com> wrote:
> >>Thanks Darrell,
> >>
> >>More questions:
> >>Why not allocating 64k for each dpcls? does the 64k just fit in L3 cache
> or
> >>anywhere? how it is calculated in such an exact number?  If there are
> more
> >>ports added for polling, for avoid competing can I increase the 64k size
> into a
> >>bigger one? Thanks.
> >>
> >>Hui.
> >>
> >>
>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-29 Thread Hui Xiang
I am interested in how 'reasonable' is defined here, how it was arrived at,
and what the 'many cases' are. Is there any document/link covering this
information? Please shed some light on it for me.

On Thu, Jun 29, 2017 at 10:47 PM, Darrell Ball <db...@vmware.com> wrote:

> Q: “how it is calculated in such an exact number? “
>
> A: It is a reasonable number to accommodate many cases.
>
> Q: “If there are more ports added for polling, for avoid competing can I
> increase the 64k size into a
> bigger one?”
>
> A: If a larger number is needed, it may imply that adding another PMD and
> dividing the forwarding
> work would be best.  Maybe even a smaller number of flows may be best
> served with more PMDs.
>
>
>
>
>
> On 6/29/17, 7:23 AM, "ovs-discuss-boun...@openvswitch.org on behalf of
> Bodireddy, Bhanuprakash" <ovs-discuss-boun...@openvswitch.org on behalf
> of bhanuprakash.bodire...@intel.com> wrote:
>
> >
>
> >I guess the answer is now the general LLC is 2.5M per core so that
> there is 64k
>
> >flows per thread.
>
>
>
> AFAIK, the no. of flows here may not have to do anything with LLC.
> Also there is EMC cache(8k entries) of ~4MB per PMD thread.
>
>
>
>
>
> Yes the performance will be nice with simple test cases (P2P with 1
> PMD thread) as most of this fits in to LLC. But in real scenarios  OvS-DPDK
> can be memory bound.
>
>
>
> BTW, on my DUT the LLC is 35MB and has 28 cores and so the assumption
> of 2.5M/core isn't right.
>
>
>
> - Bhanuprakash.
>
>
>
> >
>
> >On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang <xiangh...@gmail.com>
> wrote:
>
> >Thanks Darrell,
>
> >
>
> >More questions:
>
> >Why not allocating 64k for each dpcls? does the 64k just fit in L3
> cache or
>
> >anywhere? how it is calculated in such an exact number?  If there are
> more
>
> >ports added for polling, for avoid competing can I increase the 64k
> size into a
>
> >bigger one? Thanks.
>
> >
>
> >Hui.
>
> >
>
> >
>
>
>
> ___
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>
>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-29 Thread Hui Xiang
Thanks Bodireddy.

Sorry, I am a bit confused about the EMC size per PMD; [1] tells a
different story.

Do you mean that in real scenarios OVS-DPDK can be memory bound on the EMC?
I thought the EMC should fit entirely in the LLC.
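(Rough arithmetic with the numbers quoted in this thread, just to make my
confusion concrete: ~4 MB for 8192 EMC entries works out to about
4 MB / 8192 = ~512 bytes per entry, so a single PMD's EMC alone would
already exceed a 2.5 MB-per-core share of the LLC.)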

If only part of the megaflows fit in the LLC, then the cost of copying
between memory and the LLC should be large; isn't that unlike what is
described as the 'fast path' in userspace compared with the kernel
datapath? And if most of the megaflows are in memory, the reason every PMD
has its own dpcls instance is to follow the rule that a PMD thread should
keep data as local as possible, yet not every PMD keeps it in its local
cache. If that is true, I can't see why 64k is the limit, unless it is an
empirical best value derived from vtune/perf results.

You probably have hyper-threading enabled to get 28 cores with 35MB.

[1] https://mail.openvswitch.org/pipermail/ovs-dev/2015-May/298999.html



On Thu, Jun 29, 2017 at 10:23 PM, Bodireddy, Bhanuprakash <
bhanuprakash.bodire...@intel.com> wrote:

> >
> >I guess the answer is that the typical LLC is 2.5MB per core, so there
> >are 64k flows per thread.
>
> AFAIK, the number of flows here may not have anything to do with the LLC.
> There is also the EMC cache (8k entries) of ~4MB per PMD thread.
> Yes, the performance will be nice with simple test cases (P2P with 1 PMD
> thread), as most of this fits into the LLC. But in real scenarios OvS-DPDK
> can be memory bound.
>
> BTW, on my DUT the LLC is 35MB and there are 28 cores, so the assumption
> of 2.5MB/core isn't right.
>
> - Bhanuprakash.
>
> >
> >On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang <xiangh...@gmail.com> wrote:
> >Thanks Darrell,
> >
> >More questions:
> >Why not allocate 64k for each dpcls? Does the 64k just fit in the L3
> >cache, or somewhere else? How is such an exact number calculated?  If
> >more ports are added for polling, can I increase the 64k limit to a
> >bigger one to avoid competition? Thanks.
> >
> >Hui.
> >
> >
>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-29 Thread Hui Xiang
I guess the answer is that the typical LLC is 2.5MB per core, so there are
64k flows per thread.

On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang <xiangh...@gmail.com> wrote:

> Thanks Darrell,
>
> More questions:
> Why not allocate 64k for each dpcls? Does the 64k just fit in the L3
> cache, or somewhere else? How is such an exact number calculated?  If
> more ports are added for polling, can I increase the 64k limit to a
> bigger one to avoid competition? Thanks.
>
> Hui.
>
>
>
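On whether 64k megaflow entries "just fit" in the L3 cache: a quick estimate
suggests they do not, at least under the 2.5MB-per-core assumption discussed
in this thread. A sketch in which every number is an assumption for
illustration, not a value read from the OVS source:

# Whether a 64k-entry dpcls/megaflow table fits in a per-core LLC slice
# depends on the per-entry size.  Both numbers below are assumptions:
# the slice size is the 2.5MB/core rule of thumb from this thread, and
# the entry sizes are guesses.

MAX_FLOWS = 64 * 1024
LLC_SLICE_BYTES = int(2.5 * 1024 * 1024)

for assumed_entry_bytes in (64, 128, 256):
    table_bytes = MAX_FLOWS * assumed_entry_bytes
    verdict = "fits" if table_bytes <= LLC_SLICE_BYTES else "does not fit"
    print("entry = %3d B -> table %.1f MB, %s in a 2.5 MB slice"
          % (assumed_entry_bytes, table_bytes / float(1024 * 1024), verdict))

# Even at 64 B/entry the table is 4 MB, so a full 64k-entry table does not
# fit in a 2.5 MB slice, which is consistent with the point above that
# OvS-DPDK can become memory bound in real scenarios.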
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-22 Thread Hui Xiang
Could anyone help to answer this?

On Tue, Jun 20, 2017 at 6:22 PM, Hui Xiang <xiangh...@gmail.com> wrote:

> Hello guys,
>
> I have seen that there is one dpcls instance per port per PMD, and it
> seems the flow table maximum is 64k entries per PMD rather than per dpcls.
> My question is: if there are several dpcls instances per PMD, would they
> compete for the 64k per PMD?
>
>
> BR.
> Hui.
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVS-DPDK polling cycles and processing cycles for pmd

2017-06-14 Thread Hui Xiang
Thanks Darrell.

On Tue, Jun 13, 2017 at 1:14 AM, Darrell Ball <db...@vmware.com> wrote:

> From: Hui Xiang <xiangh...@gmail.com>
> Date: Sunday, June 11, 2017 at 11:35 PM
> To: Darrell Ball <db...@vmware.com>
> Cc: "ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> Subject: Re: [ovs-discuss] OVS-DPDK polling cycles and processing
> cycles for pmd
>
> I see, it is exactly what you said, thanks.
>
> So to analyze or improve performance, the "avg processing cycles per
> packet" value should be used to evaluate OVS-DPDK, since it gives the
> processing cycles per packet for a given PMD, right?
>
> For a given pipeline of matching and actions:
>
> That is one parameter that can be used.
>
> Total throughput for a given packet size, as measured externally, is another.
>
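The two parameters are easy to relate: for a single PMD core, the average
processing cycles per packet put a ceiling on the packet rate that core can
sustain. A small sketch, where the 2.5 GHz core clock is only an assumed
example and the 973.01 cycles/packet is the figure from the pmd-stats-show
output further down this thread:

# Relate "avg processing cycles per packet" to a per-core packet rate.
# The core frequency is an assumed example (2.5 GHz); the cycle count is
# the 973.01 cycles/packet reported by pmd-stats-show in the 2017-06-10
# message below.

CORE_HZ = 2.5e9            # assumed PMD core clock, example only
CYCLES_PER_PACKET = 973.01

pps = CORE_HZ / CYCLES_PER_PACKET
print("~%.2f Mpps per PMD core at this per-packet cost" % (pps / 1e6))
# ~2.57 Mpps; externally measured throughput for a given packet size can
# then be compared against this kind of ceiling.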
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVS-DPDK polling cycles and processing cycles for pmd

2017-06-12 Thread Hui Xiang
I see, it is exactly what you said, thanks.

So to analyze or improve performance, the "avg processing cycles per
packet" value should be used to evaluate OVS-DPDK, since it gives the
processing cycles per packet for a given PMD, right?
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


[ovs-discuss] OVS-DPDK polling cycles and processing cycles for pmd

2017-06-10 Thread Hui Xiang
Hi guys,

  I got the results below in my environment, but I am confused by the cycle
percentages. It looks like polling plus processing cycles per PMD add up to
100%, but as I recall some improvement documents showed processing cycles at
100%, which would mean 0% for polling. What should the ratio of polling to
processing cycles be? Is less polling and more processing better for
performance?

pmd thread numa_id 1 core_id 27:
emc hits:141332792126
megaflow hits:263
avg. subtable lookups per hit:1.00
miss:59
lost:0
polling cycles:19832152639760 (12.60%)
processing cycles:137517517407771 (87.40%)
avg cycles per packet: 1113.33 (157349670047531/141332792711)
avg processing cycles per packet: 973.01 (137517517407771/141332792711)
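For reference, the percentages are each cycle counter divided by the sum of
polling and processing cycles, and the per-packet averages divide by
(roughly) emc hits + megaflow hits + misses. A small sketch that recomputes
them from the text above, assuming the counter labels are exactly as printed
here; newer OVS releases word some of them differently:

import re

# Recompute the percentages and per-packet averages from the counters
# printed above.

STATS = """
emc hits:141332792126
megaflow hits:263
miss:59
lost:0
polling cycles:19832152639760 (12.60%)
processing cycles:137517517407771 (87.40%)
"""

def counter(label):
    # Grab the first integer after "label:".
    return int(re.search(label + r":(\d+)", STATS).group(1))

packets = counter("emc hits") + counter("megaflow hits") + counter("miss")
polling = counter("polling cycles")
processing = counter("processing cycles")
total = polling + processing

print("polling     %.2f%%" % (100.0 * polling / total))
print("processing  %.2f%%" % (100.0 * processing / total))
print("avg cycles per packet            %.2f" % (total / float(packets)))
print("avg processing cycles per packet %.2f" % (processing / float(packets)))

# The recomputed values (12.60% / 87.40%, ~1113 and ~973 cycles/packet)
# match the output above; the small difference in the packet denominator
# can come from the counters being sampled at slightly different instants.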


Thanks a lot.
BR.
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss