Hi there,

Has anyone noticed a "connection flapping" behaviour for the br-int bridge when 
openvswitch is restarted? I've provided some details below.
Also, in general, is it expected for NetVirt to be able to use an existing 
br-int bridge or does the br-int creation have to be initiated by NetVirt (i.e. 
can an operator run "add-br br-int" before "set-manager" on the switch)?

This scenario is related to bug 6944 - the pipeline flows for an existing 
br-int bridge are not consistently installed on the switch when openvswitch is 
restarted.

Scenario:

- ODL 3-node cluster running distribution-karaf-0.5.0-Boron with the 
  odl-ovsdb-openstack feature installed
- A standalone OVS node running OVS version 2.4.0

- Step 1) on the OVS node, run set-manager pointing at all 3 nodes in the 
  cluster
- Step 2) confirm that br-int is created and all pipeline flows are pushed to 
  the OVS node
- Step 3) on the OVS node, run systemctl stop openvswitch
- Step 4) on the OVS node, run systemctl start openvswitch

In some runs of the above sequence, the pipeline flows for the br-int bridge 
are pushed to the OVS node; in other runs, no flows are pushed.
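
To put a number on how often the flows go missing, steps 3) and 4) can be 
scripted. The following is only a rough Python sketch, not something from the 
setup described here: the helper names (count_flows, dump_flows, 
restart_and_count) and the 15-second settle time are assumptions, and it 
assumes ovs-ofctl is on the PATH and the script runs as root on the OVS node.

```python
import subprocess
import time

BRIDGE = "br-int"


def count_flows(dump_flows_output):
    """Count flow entries in `ovs-ofctl dump-flows` output.

    Every flow entry carries an "actions=" field; the reply header does not.
    """
    return sum(1 for line in dump_flows_output.splitlines()
               if "actions=" in line)


def dump_flows(bridge=BRIDGE):
    """Return the raw flow dump for the bridge, using OpenFlow 1.3."""
    result = subprocess.run(
        ["ovs-ofctl", "-O", "OpenFlow13", "dump-flows", bridge],
        check=True, capture_output=True, text=True)
    return result.stdout


def restart_and_count(wait_seconds=15.0, bridge=BRIDGE):
    """Run steps 3) and 4), wait for the controller, then count flows."""
    subprocess.run(["systemctl", "stop", "openvswitch"], check=True)
    subprocess.run(["systemctl", "start", "openvswitch"], check=True)
    time.sleep(wait_seconds)  # assumed settle time, tune as needed
    return count_flows(dump_flows(bridge))
```

Calling restart_and_count() in a loop and recording how often it returns 0 
would give a failure rate for the restart scenario.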

Debugging this, it looks like the following things are occurring at step 4) 
(excerpts from karaf.log):


1) 2016-10-26 15:16:25,220 | INFO  | entLoopGroup-9-2 | DeviceManagerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.Boron | ConnectionEvent: Device connected to controller, Device:/192.168.254.36:37625, NodeId:Uri [_value=openflow:207293242576193]

2) 2016-10-26 15:16:25,574 | INFO  | ntDispatcherImpl | OF13Provider | 315 - org.opendaylight.netvirt.openstack.net-virt-providers - 1.3.0.Boron | initializeOFFlowRules: bridgeName: br-int

3) 2016-10-26 15:16:26,601 | INFO  | entLoopGroup-9-2 | SystemNotificationsListenerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.Boron | ConnectionEvent: Connection closed by device, Device:/192.168.254.36:37625, NodeId:openflow:207293242576193

4) 2016-10-26 15:16:27,247 | INFO  | entLoopGroup-9-1 | DeviceManagerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.Boron | ConnectionEvent: Device connected to controller, Device:/192.168.254.36:37631, NodeId:Uri [_value=openflow:202927631275083]

At 1), br-int connects to ODL with ID openflow:X (in this instance, 
207293242576193).
At 2), NetVirt writes the pipeline flows into the config datastore. In my 
testing, this always succeeds and the flows are always present in the config 
datastore.
At 3), br-int (openflow:207293242576193) disconnects from ODL. I am not sure 
why this occurs. Could this be a bug in Open vSwitch?
At 4), br-int reconnects to ODL, this time with a different ID, openflow:Y (in 
this case, 202927631275083).
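
The connect/disconnect/connect sequence can also be pulled out of karaf.log 
mechanically. This is a small Python sketch under assumptions: the function 
names are made up for illustration, and the regex only covers the two 
ConnectionEvent messages quoted above, so other message variants would be 
missed.

```python
import re

# Matches only the two ConnectionEvent messages quoted in the excerpts.
EVENT_RE = re.compile(
    r"ConnectionEvent: "
    r"(Device connected to controller|Connection closed by device)"
    r".*?openflow:(\d+)"
)


def connection_events(log_text):
    """Return a list of (kind, node_number) tuples in log order."""
    # Collapse whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    events = []
    for message, number in EVENT_RE.findall(flat):
        kind = "connected" if message.startswith("Device") else "closed"
        events.append((kind, int(number)))
    return events


def node_id_changed(events):
    """True if the switch connected under more than one node ID."""
    connected_ids = {number for kind, number in events if kind == "connected"}
    return len(connected_ids) > 1
```

Run against the four excerpts above, connection_events() reports a connect 
and a close for 207293242576193 followed by a connect for 202927631275083, 
and node_id_changed() is true.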

In the success case, where the pipeline flows are pushed down, it appears that 
SalFlowServiceImpl was able to push the flows to the switch between 2) and 3), 
i.e. ODL pushed the pipeline flows for openflow:X before the switch 
disconnected. In the failure case, the sequence of events is exactly the same, 
but no flows are pushed down.
In some success cases, SalFlowServiceImpl logs "Flow add...finished without 
error"; in others, it logs "Flow add failed for flow...errors=Device 
disconnected" even though the flow was actually installed on the switch.

The ovs-vswitchd.log has the following logs corresponding to the above 
connect-disconnect-connect sequence:

2016-10-26T19:16:24.910Z|00004|reconnect|INFO|unix:/var/run/openvswitch/db.sock:
 connecting...
2016-10-26T19:16:24.910Z|00005|reconnect|INFO|unix:/var/run/openvswitch/db.sock:
 connected
2016-10-26T19:16:24.912Z|00006|ofproto_dpif|INFO|system@ovs-system: Datapath 
supports recirculation
2016-10-26T19:16:24.912Z|00007|ofproto_dpif|INFO|system@ovs-system: MPLS label 
stack length probed as 0
2016-10-26T19:16:24.912Z|00008|ofproto_dpif|INFO|system@ovs-system: datapath 
does not support masked set action feature.
2016-10-26T19:16:24.912Z|00009|ofproto_dpif|INFO|system@ovs-system: Datapath 
does not support unique flow ids
2016-10-26T19:16:24.916Z|00010|bridge|INFO|bridge br-int: added interface 
br-int on port 65534
2016-10-26T19:16:24.916Z|00011|bridge|INFO|bridge br-int: using datapath ID 
0000bc8838168d41
2016-10-26T19:16:24.916Z|00012|connmgr|INFO|br-int: added service controller 
"punix:/var/run/openvswitch/br-int.mgmt"
2016-10-26T19:16:24.916Z|00013|connmgr|INFO|br-int: added primary controller 
"tcp:192.168.254.35:6653"
2016-10-26T19:16:24.917Z|00014|rconn|INFO|br-int<->tcp:192.168.254.35:6653: 
connecting...
2016-10-26T19:16:24.917Z|00015|connmgr|INFO|br-int: added primary controller 
"tcp:192.168.254.33:6653"
2016-10-26T19:16:24.917Z|00016|rconn|INFO|br-int<->tcp:192.168.254.33:6653: 
connecting...
2016-10-26T19:16:24.917Z|00017|connmgr|INFO|br-int: added primary controller 
"tcp:192.168.254.34:6653"
2016-10-26T19:16:24.917Z|00018|rconn|INFO|br-int<->tcp:192.168.254.34:6653: 
connecting...
2016-10-26T19:16:24.924Z|00019|rconn|INFO|br-int<->tcp:192.168.254.33:6653: 
connected
2016-10-26T19:16:24.924Z|00020|rconn|INFO|br-int<->tcp:192.168.254.34:6653: 
connected
2016-10-26T19:16:24.925Z|00021|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.4.0
2016-10-26T19:16:24.927Z|00022|rconn|INFO|br-int<->tcp:192.168.254.35:6653: 
connected
2016-10-26T19:16:24.929Z|00001|ofproto_dpif_upcall(handler5)|INFO|received 
packet on unassociated datapath port 0
2016-10-26T19:16:25.430Z|00001|dpif(revalidator4)|WARN|system@ovs-system: 
failed to flow_get (Invalid argument) ufid:b1d72bef-1acb-4755-850a-2aeb4fb75ad2 
<empty>, packets:0, bytes:0, used:never
2016-10-26T19:16:25.430Z|00002|ofproto_dpif_upcall(revalidator4)|WARN|Failed to 
acquire udpif_key corresponding to unexpected flow (Invalid argument): 
ufid:b1d72bef-1acb-4755-850a-2aeb4fb75ad2
2016-10-26T19:16:26.312Z|00023|bridge|INFO|bridge br-int: using datapath ID 
0000b88fc560944b
2016-10-26T19:16:26.313Z|00024|rconn|INFO|br-int<->tcp:192.168.254.35:6653: 
disconnecting
2016-10-26T19:16:26.313Z|00025|rconn|INFO|br-int<->tcp:192.168.254.33:6653: 
disconnecting
2016-10-26T19:16:26.313Z|00026|rconn|INFO|br-int<->tcp:192.168.254.34:6653: 
disconnecting
2016-10-26T19:16:26.952Z|00028|rconn|INFO|br-int<->tcp:192.168.254.35:6653: 
connecting...
2016-10-26T19:16:26.952Z|00029|rconn|INFO|br-int<->tcp:192.168.254.33:6653: 
connecting...
2016-10-26T19:16:26.952Z|00030|rconn|INFO|br-int<->tcp:192.168.254.34:6653: 
connecting...
2016-10-26T19:16:26.958Z|00031|rconn|INFO|br-int<->tcp:192.168.254.35:6653: 
connected
2016-10-26T19:16:26.958Z|00032|rconn|INFO|br-int<->tcp:192.168.254.33:6653: 
connected
2016-10-26T19:16:26.958Z|00033|rconn|INFO|br-int<->tcp:192.168.254.34:6653: 
connected
2016-10-26T19:16:34.926Z|00034|memory|INFO|35400 kB peak resident set size 
after 10.0 seconds
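
One thing that can be checked from the two logs: the node IDs in karaf.log 
are the decimal form of the datapath IDs that ovs-vswitchd reports 
(0000bc8838168d41 is 207293242576193 and 0000b88fc560944b is 
202927631275083), and the "disconnecting" lines at 19:16:26.313Z come right 
after the second "using datapath ID" line. Whether the datapath ID change is 
what triggers the disconnect is only a guess on my part. A short Python 
sketch of the check, with made-up function names:

```python
import re

DPID_RE = re.compile(r"bridge (\S+): using datapath ID ([0-9a-fA-F]{16})")


def dpid_to_node_id(dpid_hex):
    """Convert a 16-hex-digit datapath ID to the openflow:<decimal> form."""
    return "openflow:%d" % int(dpid_hex, 16)


def datapath_ids(log_text, bridge="br-int"):
    """Return the datapath IDs logged for the bridge, in log order."""
    # Collapse whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    return [dpid for name, dpid in DPID_RE.findall(flat) if name == bridge]
```

For the log above, datapath_ids() returns the two IDs in order, and 
dpid_to_node_id() maps them to openflow:207293242576193 and 
openflow:202927631275083 respectively.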

When the flows are pushed successfully, the following line also appears, 
before the last "connecting... connected" sequence:

|connmgr|INFO|br-int<->tcp:192.168.254.33:6653: 12 flow_mods in the last 0 s 
(12 adds)
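
Checking for that line can be automated when comparing good and bad runs. 
Again just a Python sketch under assumptions: flow_mods_by_controller is a 
name I made up, and the regex is based only on the single connmgr line 
quoted above.

```python
import re

FLOW_MODS_RE = re.compile(
    r"\|connmgr\|INFO\|(\S+?)<->(\S+): (\d+) flow_mods"
)


def flow_mods_by_controller(log_text, bridge="br-int"):
    """Sum the flow_mods counts that connmgr logged, per controller target."""
    # Collapse whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    totals = {}
    for name, target, count in FLOW_MODS_RE.findall(flat):
        if name == bridge:
            totals[target] = totals.get(target, 0) + int(count)
    return totals
```

An empty result for a given restart means ovs-vswitchd logged no flow_mods 
from any controller, which corresponds to the failure case.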

Any thoughts on the observed connection flapping (is it a bug in OVS?) and on 
supporting the openvswitch restart scenario?

Thanks,
Bertrand


_______________________________________________
openflowplugin-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
