Bertrand,

[1] is the link to the code in netvirt where the mac is set for the hwaddr
in the bridge table. When that is set ovsdb will use that mac for the dpid.

Following back through the code you will see that intBridgeGenMac is a flag
passed in on startup to decide if netvirt should use a fixed mac for the
bridge. That flag should already be set so the fixed mac should be used.
Your logs are showing this happening - hwadr is set so the dpid changes to
that mac.

But the use case is that br-int already exists. In that case br-int already
has a dpid. netvirt will not recreate the bridge since ti already exists
and then pushes flows. Then the connection drops and for some reason the
bridge is not in mdsal so netvirt creates a new bridge - with the fixed
mac. I think that the issue is somewhere related or a timing issue. Might
even be possible to copy the dpid/mac to the hwaddr field to ensure the
first one is always used.

One thing you can do is disable that intBridgeGenMac. Then see if the dpid
stays. All depends on what ovs decides to do. other than, need to follow
the path mentioned above and figure out the timing issue.

Sam

[1]
https://github.com/opendaylight/netvirt/blob/master/openstack/net-virt/src/main/java/org/opendaylight/netvirt/openstack/netvirt/impl/SouthboundImpl.java#L215

On Fri, Oct 28, 2016 at 8:46 PM, Bertrand Low <[email protected]> wrote:

> Hi Sam,
>
>
>
> Thanks for responding.
>
>
>
> I’d like to provide further clarification.
>
>
>
> Regarding this error:
>
> 2016-10-26T19:16:25.430Z|00001|dpif(revalidator4)|WARN|system@ovs-system:
> failed to flow_get (Invalid argument) 
> ufid:b1d72bef-1acb-4755-850a-2aeb4fb75ad2
> <empty>, packets:0, bytes:0, used:never
>
> 2016-10-26T19:16:25.430Z|00002|ofproto_dpif_upcall(revalidator4)|WARN|Failed
> to acquire udpif_key corresponding to unexpected flow (Invalid argument):
> ufid:b1d72bef-1acb-4755-850a-2aeb4fb75ad2
>
> This appears to be due to the revalidator finding flows that the current
> ovs instance did not put there. If I start with a clean ovsdb, this error
> does not appear.
>
>
>
> I have enabled debugging on ovsdb-server and ovs-vswitchd and found the
> following in ovsdb-server.log:
>
>
>
> 2016-10-28T23:21:02.018Z|00117|jsonrpc|DBG|tcp:192.168.254.35:6640:
> received request, method="transact", params=["Open_vSwitch",{"
> where":[["name","==","br-int"]],"table":"Bridge","op":"
> update","row":{"fail_mode":["set",["secure"]],"external_
> ids":["map",[["opendaylight-iid","/network-topology:
> network-topology/network-topology:topology[network-
> topology:topology-id='ovsdb:1']/network-topology:node[
> network-topology:node-id='ovsdb://uuid/b429858f-b476-
> 44be-94ac-e9ea61265a11/bridge/br-int']"]]],"other_config":["
> map",[["hwaddr","90:2e:72:41:fc:da"],["disable-in-band","
> true"]]]}},{"mutations":[["external_ids","delete",["set",
> ["opendaylight-iid"]]],["external_ids","insert",["map",
> [["opendaylight-iid","/network-topology:network-topology/network-topology:
> topology[network-topology:topology-id='ovsdb:1']/
> network-topology:node[network-topology:node-id='ovsdb://
> uuid/b429858f-b476-44be-94ac-e9ea61265a11/bridge/br-int']"]
> ]]]],"where":[["name","==","br-int"]],"table":"Bridge","op":"mutate"}],
> id="e842a544-edd0-4234-b88c-847c89ce8605"
>
> 2016-10-28T23:21:02.019Z|00118|jsonrpc|DBG|tcp:192.168.254.35:6640: send
> notification, method="update", params=["f8af9856-b198-4c86-
> 83cd-541511fa62bc",{"Bridge":{"15e50dab-60d7-4544-8051-
> fdf2141ef433":{"new":{"name":"br-int","flood_vlans":["set",[
> ]],"auto_attach":["set",[]],"ports":["uuid","d20ae748-927e-
> 4c06-bf6f-0616e1127464"],"stp_enable":false,"fail_mode":"
> secure","rstp_enable":false,"_uuid":["uuid","15e50dab-60d7-
> 4544-8051-fdf2141ef433"],"rstp_status":["map",[]],"flow_
> tables":["map",[]],"netflow":["set",[]],"datapath_type":"","
> controller":["set",[["uuid","0e2fe15c-fda6-446d-9950-
> 9e3a82062120"],["uuid","7d404ba9-d2aa-4306-bdcb-e6bc088654fb"],["uuid","
> fc85f0a8-d97f-49a1-86e6-13a76bdf840a"]]],"other_
> config":["map",[["disable-in-band","true"],["hwaddr","90:
> 2e:72:41:fc:da"]]],"status":["map",[]],"ipfix":["set",[]],"
> external_ids":["map",[["opendaylight-iid","/network-
> topology:network-topology/network-topology:topology[
> network-topology:topology-id='ovsdb:1']/network-topology:
> node[network-topology:node-id='ovsdb://uuid/b429858f-b476-
> 44be-94ac-e9ea61265a11/bridge/br-int']"]]],"datapath_id":"0000a8ceb1b8d124
> ","mirrors":["set",[]],"mcast_snooping_enable":false,"datapath_
> version":"<unknown>","sflow":["set",[]],"protocols":"
> OpenFlow13"},"old":{"other_config":["map",[["disable-in-
> band","true"],["hwaddr","a8:ce:b1:b8:d1:24"]]]}}}}]
>
> 2016-10-28T23:21:02.019Z|00119|jsonrpc|DBG|tcp:192.168.254.35:6640: send
> reply, result=[{"count":1},{"count":1}], id="e842a544-edd0-4234-b88c-
> 847c89ce8605"
>
> 2016-10-28T23:21:02.019Z|00120|poll_loop|DBG|wakeup due to 0-ms timeout
> at ovsdb/jsonrpc-server.c:510
>
> 2016-10-28T23:21:02.019Z|00121|jsonrpc|DBG|unix: send notification,
> method="update", params=[null,{"Bridge":{"15e50dab-60d7-4544-8051-
> fdf2141ef433":{"new":{"name":"br-int","flood_vlans":["set",[
> ]],"auto_attach":["set",[]],"ports":["uuid","d20ae748-927e-
> 4c06-bf6f-0616e1127464"],"stp_enable":false,"fail_mode":"
> secure","rstp_enable":false,"rstp_status":["map",[]],"flow_
> tables":["map",[]],"netflow":["set",[]],"datapath_type":"","
> controller":["set",[["uuid","0e2fe15c-fda6-446d-9950-
> 9e3a82062120"],["uuid","7d404ba9-d2aa-4306-bdcb-e6bc088654fb"],["uuid","
> fc85f0a8-d97f-49a1-86e6-13a76bdf840a"]]],"status":["
> map",[]],"ipfix":["set",[]],"other_config":["map",[["
> disable-in-band","true"],["hwaddr","90:2e:72:41:fc:da"]]],"datapath_id":"
> 0000a8ceb1b8d124","mirrors":["set",[]],"mcast_snooping_enable":false,"
> datapath_version":"<unknown>","sflow":["set",[]],"protocols"
> :"OpenFlow13"},"old":{"other_config":["map",[["disable-in-
> band","true"],["hwaddr","a8:ce:b1:b8:d1:24"]]]}}}}]
>
> 2016-10-28T23:21:02.037Z|00122|poll_loop|DBG|wakeup due to [POLLIN] on fd
> 17 (/var/run/openvswitch/db.sock<->) at lib/stream-fd.c:155
>
> 2016-10-28T23:21:02.037Z|00123|jsonrpc|DBG|unix: received request,
> method="transact", params=["Open_vSwitch",{"lock"
> :"ovs_vswitchd","op":"assert"},{"where":[["_uuid","==",["
> uuid","15e50dab-60d7-4544-8051-fdf2141ef433"]]],"table":
> "Bridge","op":"update","row":{"datapath_id":"0000902e7241fcda"}}], id=5
>
> 2016-10-28T23:21:02.038Z|00124|jsonrpc|DBG|unix: send notification,
> method="update", params=[null,{"Bridge":{"15e50dab-60d7-4544-8051-
> fdf2141ef433":{"new":{"name":"br-int","flood_vlans":["set",[
> ]],"auto_attach":["set",[]],"ports":["uuid","d20ae748-927e-
> 4c06-bf6f-0616e1127464"],"stp_enable":false,"fail_mode":"
> secure","rstp_enable":false,"rstp_status":["map",[]],"flow_
> tables":["map",[]],"netflow":["set",[]],"datapath_type":"","
> controller":["set",[["uuid","0e2fe15c-fda6-446d-9950-
> 9e3a82062120"],["uuid","7d404ba9-d2aa-4306-bdcb-e6bc088654fb"],["uuid","
> fc85f0a8-d97f-49a1-86e6-13a76bdf840a"]]],"status":["
> map",[]],"ipfix":["set",[]],"other_config":["map",[["
> disable-in-band","true"],["hwaddr","90:2e:72:41:fc:da"]]],"datapath_id":"
> 0000902e7241fcda","mirrors":["set",[]],"mcast_snooping_enable":false,"
> datapath_version":"<unknown>","sflow":["set",[]],"protocols"
> :"OpenFlow13"},"old":{"datapath_id":"0000a8ceb1b8d124"}}}}]
>
> 2016-10-28T23:21:02.039Z|00125|jsonrpc|DBG|unix: send reply,
> result=[{},{"count":1}], id=5
>
> 2016-10-28T23:21:02.039Z|00126|jsonrpc|DBG|tcp:192.168.254.35:6640: send
> notification, method="update", params=["f8af9856-b198-4c86-
> 83cd-541511fa62bc",{"Bridge":{"15e50dab-60d7-4544-8051-
> fdf2141ef433":{"new":{"name":"br-int","flood_vlans":["set",[
> ]],"auto_attach":["set",[]],"ports":["uuid","d20ae748-927e-
> 4c06-bf6f-0616e1127464"],"stp_enable":false,"fail_mode":"
> secure","rstp_enable":false,"_uuid":["uuid","15e50dab-60d7-
> 4544-8051-fdf2141ef433"],"rstp_status":["map",[]],"flow_
> tables":["map",[]],"netflow":["set",[]],"datapath_type":"","
> controller":["set",[["uuid","0e2fe15c-fda6-446d-9950-
> 9e3a82062120"],["uuid","7d404ba9-d2aa-4306-bdcb-e6bc088654fb"],["uuid","
> fc85f0a8-d97f-49a1-86e6-13a76bdf840a"]]],"other_
> config":["map",[["disable-in-band","true"],["hwaddr","90:
> 2e:72:41:fc:da"]]],"status":["map",[]],"ipfix":["set",[]],"
> external_ids":["map",[["opendaylight-iid","/network-
> topology:network-topology/network-topology:topology[
> network-topology:topology-id='ovsdb:1']/network-topology:
> node[network-topology:node-id='ovsdb://uuid/b429858f-b476-
> 44be-94ac-e9ea61265a11/bridge/br-int']"]]],"datapath_id":"0000902e7241fcda
> ","mirrors":["set",[]],"mcast_snooping_enable":false,"datapath_
> version":"<unknown>","sflow":["set",[]],"protocols":"
> OpenFlow13"},"old":{"datapath_id":"0000a8ceb1b8d124"}}}}]
>
> 2016-10-28T23:21:02.040Z|00127|poll_loop|DBG|wakeup due to 0-ms timeout
> at unix
>
>
>
> It looks like ODL 192.168.254.35 pushed some request for br-int and after
> processing that request, the br-int dpid was changed. Apparently, when the
> bridge reconfigures along with a different MAC address but does not have a
> datapath ID configured, the dpid will then be changed. Perhaps we have to
> set the dpid or MAC address for the br-int local port across restarts as
> you have suggested?
>
>
>
> I’ve also attached the full ovsdb-server and ovs-vswitchd logs to this
> email. In this particular instance, pipeline flows were partially pushed
> down.
>
>
>
> Thanks,
>
> Bertrand
>
>
>
> *From:* Sam Hague [mailto:[email protected]]
> *Sent:* October 27, 2016 5:47 AM
> *To:* Bertrand Low <[email protected]>
> *Cc:* odl netvirt dev <[email protected]>;
> [email protected]
> *Subject:* Re: [netvirt-dev] br-int pipeline flows are not pushed when
> openvswitch is restarted
>
>
>
>
>
>
>
> On Thu, Oct 27, 2016 at 3:32 AM, Bertrand Low <[email protected]>
> wrote:
>
> Hi there,
>
>
>
> Has anyone noticed a “connection flapping” behaviour for the br-int bridge
> when openvswitch is restarted? I’ve provided some details below.
>
> Also, in general, is it expected for NetVirt to be able to use an existing
> br-int bridge or does the br-int creation have to be initiated by NetVirt
> (i.e. can an operator run “add-br br-int” before “set-manager” on the
> switch)?
>
>
>
> This scenario is related to bug 6944 – the pipeline flows for an existing
> br-int bridge are not consistently installed on the switch when openvswitch
> is restarted.
>
>
>
> Scenario:
>
> -          *ODL 3-node cluster running distribution-karaf-0.5.0-Boron
> with the odl-ovsdb-openstack feature installed*
>
> -          *A standalone OVS node running OVS version 2.4.0*
>
>
>
> -          Step 1) set-manager of OVS node to all 3-nodes in the cluster
>
> -          Step 2) confirm that br-int is created and all pipeline flows
> are pushed to the OVS node
>
> -          Step 3) on the OVS node, run systemctl stop openvswitch
>
> -          Step 4) on the OVS node, run systemctl start openvswitch
>
>
>
> -          In some runs of the above sequence of steps, the pipeline
> flows for the br-int bridge are pushed to the OVS node; in other instances,
> no flows are pushed.
>
>
>
> Debugging this, it looks like the following things are occurring at step
> 4) (excerpts from karaf.log):
>
>
>
> 1)  2016-10-26 15:16:25,220 | INFO  | entLoopGroup-9-2 |
> DeviceManagerImpl                | 297 - org.opendaylight.openflowplugin.impl
> - 0.3.0.Boron | ConnectionEvent: Device connected to controller, Device:/
> 192.168.254.36:37625, NodeId:Uri [_value=openflow:207293242576193]
>
> 2)  2016-10-26 15:16:25,574 | INFO  | ntDispatcherImpl |
> OF13Provider                     | 315 - 
> org.opendaylight.netvirt.openstack.net-virt-providers
> - 1.3.0.Boron | initializeOFFlowRules: bridgeName: br-int
>
> 3)  2016-10-26 15:16:26,601 | INFO  | entLoopGroup-9-2 |
> SystemNotificationsListenerImpl  | 297 - org.opendaylight.openflowplugin.impl
> - 0.3.0.Boron | ConnectionEvent: Connection closed by device, Device:/
> 192.168.254.36:37625, NodeId:openflow:207293242576193
>
> 4)      2016-10-26 15:16:27,247 | INFO  | entLoopGroup-9-1 |
> DeviceManagerImpl                | 297 - org.opendaylight.openflowplugin.impl
> - 0.3.0.Boron | ConnectionEvent: Device connected to controller, Device:/
> 192.168.254.36:37631, NodeId:Uri [_value=openflow:202927631275083]
>
>
>
> At 1), br-int connects to ODL with ID openflow:X (in this instance,
> 207293242576193)
>
> At 2), NetVirt writes the pipeline flows into the config datastore. From
> my testing, this is always successful and the flows are always installed in
> the config datastore.
>
> At 3), br-int (openflow:207293242576193) disconnects from ODL – not sure
> why this occurs. Could this be a bug on Openvswitch?
>
> At 4), br-int reconnects to ODL, this time with a different ID openflow:Y
> (in this case, 202927631275083)
>
>
>
> In the success case where the pipeline flows are pushed down, it would
> appear that SalFlowServiceImpl was able to push the flows down to the
> switch in-between 2) and 3). i.e. ODL was able to push the pipeline flows
> for openflow:X before the switch disconnected. In the failure case, the
> above sequence of events is exactly the same, but no flows were pushed down.
>
> In some success instances, SalFlowServiceImpl would log that the “Flow
> add…finished without error”, and in some success cases SalFlowServiceImpl
> would log that “Flow add failed for flow…errors=Device disconnected” even
> though the flow was actually installed on the switch.
>
>
>
> The ovs-vswitchd.log has the following logs corresponding to the above
> connect-disconnect-connect sequence:
>
>
>
> 2016-10-26T19:16:24.910Z|00004|reconnect|INFO|unix:/var/run/openvswitch/db.sock:
> connecting...
>
> 2016-10-26T19:16:24.910Z|00005|reconnect|INFO|unix:/var/run/openvswitch/db.sock:
> connected
>
> 2016-10-26T19:16:24.912Z|00006|ofproto_dpif|INFO|system@ovs-system:
> Datapath supports recirculation
>
> 2016-10-26T19:16:24.912Z|00007|ofproto_dpif|INFO|system@ovs-system: MPLS
> label stack length probed as 0
>
> 2016-10-26T19:16:24.912Z|00008|ofproto_dpif|INFO|system@ovs-system:
> datapath does not support masked set action feature.
>
> 2016-10-26T19:16:24.912Z|00009|ofproto_dpif|INFO|system@ovs-system:
> Datapath does not support unique flow ids
>
> 2016-10-26T19:16:24.916Z|00010|bridge|INFO|bridge br-int: added interface
> br-int on port 65534
>
> 2016-10-26T19:16:24.916Z|00011|bridge|INFO|bridge br-int: using datapath
> ID 0000bc8838168d41
>
> 2016-10-26T19:16:24.916Z|00012|connmgr|INFO|br-int: added service
> controller "punix:/var/run/openvswitch/br-int.mgmt"
>
> 2016-10-26T19:16:24.916Z|00013|connmgr|INFO|br-int: added primary
> controller "tcp:192.168.254.35:6653"
>
> 2016-10-26T19:16:24.917Z|00014|rconn|INFO|br-int<->tcp:192.168.254.35:6653:
> connecting...
>
> 2016-10-26T19:16:24.917Z|00015|connmgr|INFO|br-int: added primary
> controller "tcp:192.168.254.33:6653"
>
> 2016-10-26T19:16:24.917Z|00016|rconn|INFO|br-int<->tcp:192.168.254.33:6653:
> connecting...
>
> 2016-10-26T19:16:24.917Z|00017|connmgr|INFO|br-int: added primary
> controller "tcp:192.168.254.34:6653"
>
> 2016-10-26T19:16:24.917Z|00018|rconn|INFO|br-int<->tcp:192.168.254.34:6653:
> connecting...
>
> 2016-10-26T19:16:24.924Z|00019|rconn|INFO|br-int<->tcp:192.168.254.33:6653:
> connected
>
> 2016-10-26T19:16:24.924Z|00020|rconn|INFO|br-int<->tcp:192.168.254.34:6653:
> connected
>
> 2016-10-26T19:16:24.925Z|00021|bridge|INFO|ovs-vswitchd (Open vSwitch)
> 2.4.0
>
> 2016-10-26T19:16:24.927Z|00022|rconn|INFO|br-int<->tcp:192.168.254.35:6653:
> connected
>
> 2016-10-26T19:16:24.929Z|00001|ofproto_dpif_upcall(handler5)|INFO|received
> packet on unassociated datapath port 0
>
> 2016-10-26T19:16:25.430Z|00001|dpif(revalidator4)|WARN|system@ovs-system:
> failed to flow_get (Invalid argument) 
> ufid:b1d72bef-1acb-4755-850a-2aeb4fb75ad2
> <empty>, packets:0, bytes:0, used:never
>
> 2016-10-26T19:16:25.430Z|00002|ofproto_dpif_upcall(revalidator4)|WARN|Failed
> to acquire udpif_key corresponding to unexpected flow (Invalid argument):
> ufid:b1d72bef-1acb-4755-850a-2aeb4fb75ad2
>
> 2016-10-26T19:16:26.312Z|00023|bridge|INFO|bridge br-int: using datapath
> ID 0000b88fc560944b
>
> maybe something here is the issue. Looks like odl pushed flows when the
> switch reconnected, of which some had issues. But here, did the dpid stay
> the same or did it change? If it changed, maybe ovs is disconnecting and
> reconnecting to push the dpid change to the controller. I thought the dpid
> would stay the same as long as the network devices stayed the same. Maybe
> we should add the ovsdb config to keep the dpid the same across restarts?
>
> Also maybe enable for ovsdb and vswitchd logging.
>
> 2016-10-26T19:16:26.313Z|00024|rconn|INFO|br-int<->tcp:192.168.254.35:6653:
> disconnecting
>
> 2016-10-26T19:16:26.313Z|00025|rconn|INFO|br-int<->tcp:192.168.254.33:6653:
> disconnecting
>
> 2016-10-26T19:16:26.313Z|00026|rconn|INFO|br-int<->tcp:192.168.254.34:6653:
> disconnecting
>
> 2016-10-26T19:16:26.952Z|00028|rconn|INFO|br-int<->tcp:192.168.254.35:6653:
> connecting...
>
> 2016-10-26T19:16:26.952Z|00029|rconn|INFO|br-int<->tcp:192.168.254.33:6653:
> connecting...
>
> 2016-10-26T19:16:26.952Z|00030|rconn|INFO|br-int<->tcp:192.168.254.34:6653:
> connecting...
>
> 2016-10-26T19:16:26.958Z|00031|rconn|INFO|br-int<->tcp:192.168.254.35:6653:
> connected
>
> 2016-10-26T19:16:26.958Z|00032|rconn|INFO|br-int<->tcp:192.168.254.33:6653:
> connected
>
> 2016-10-26T19:16:26.958Z|00033|rconn|INFO|br-int<->tcp:192.168.254.34:6653:
> connected
>
> 2016-10-26T19:16:34.926Z|00034|memory|INFO|35400 kB peak resident set
> size after 10.0 seconds
>
>
>
> If the flows were successfully pushed, you would also see the line
>
>
>
> |connmgr|INFO|br-int<->tcp:192.168.254.33:6653: 12 flow_mods in the last
> 0 s (12 adds)
>
>
>
> Before the last “connecting….connected” attempt.
>
>
>
> Any thoughts on the observed connection flapping (bug on OVS?) and on
> supporting the openvswitch restart scenario?
>
>
>
> Thanks,
>
> Bertrand
>
>
>
> ::DISCLAIMER::
> ------------------------------------------------------------
> ------------------------------------------------------------
> ----------------------------
>
> The contents of this e-mail and any attachment(s) are confidential and
> intended for the named recipient(s) only.
> E-mail transmission is not guaranteed to be secure or error-free as
> information could be intercepted, corrupted,
> lost, destroyed, arrive late or incomplete, or may contain viruses in
> transmission. The e mail and its contents
> (with or without referred errors) shall therefore not attach any liability
> on the originator or HCL or its affiliates.
> Views or opinions, if any, presented in this email are solely those of the
> author and may not necessarily reflect the
> views or opinions of HCL or its affiliates. Any form of reproduction,
> dissemination, copying, disclosure, modification,
> distribution and / or publication of this message without the prior
> written consent of authorized representative of
> HCL is strictly prohibited. If you have received this email in error
> please delete it and notify the sender immediately.
> Before opening any email and/or attachments, please check them for viruses
> and other defects.
>
> ------------------------------------------------------------
> ------------------------------------------------------------
> ----------------------------
>
>
> _______________________________________________
> netvirt-dev mailing list
> [email protected]
> https://lists.opendaylight.org/mailman/listinfo/netvirt-dev
>
>
>
_______________________________________________
openflowplugin-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev

Reply via email to