On RHEL 7.x kernels we observe a panic induced by a paging error
when the timer kicks off a job that subsequently accesses memory
that belonged to the openvswitch kernel module but was since
unloaded - thus the paging error.

The panic can be induced on any RHEL 7.x kernel with the following test:

while `true`
do
    make check-kmod TESTSUITEFLAGS="-k \!gre"
done

On the systems I've been testing on it generally takes anywhere from a
minute to 15 minutes or so to repro but never longer than that.  Similar
results have been seen by other testers.

This patch does not fix the underlying bug, which does need to be
investigated and fixed, but it does prevent it from occurring. We
would like to prevent customer systems from panicking while we do
futher investigation to find the root cause.

Signed-off-by: Greg Rose <gvrose8...@gmail.com>
---
 datapath/datapath.c         | 10 ++++++++++
 tests/system-kmod-macros.at |  1 +
 utilities/ovs-lib.in        |  1 +
 3 files changed, 12 insertions(+)

diff --git a/datapath/datapath.c b/datapath/datapath.c
index 3ea240a..43f0d74 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -2478,6 +2478,16 @@ error:
 
 static void dp_cleanup(void)
 {
+#if RHEL_RELEASE_CODE < RHEL_RELEASE_VERSION(8,0)
+       /* On RHEL 7.x kernels we hit a kernel paging error without
+        * this barrier and subsequent hefty delay.  A process will
+        * attempt to access openvwitch memory after it has been
+        * unloaded.  Further debugging is needed on that but for
+        * now let's not let customer machines panic.
+        */
+       rcu_barrier();
+       msleep(3000);
+#endif
        dp_unregister_genl(ARRAY_SIZE(dp_genl_families));
        ovs_netdev_exit();
        unregister_netdevice_notifier(&ovs_dp_device_notifier);
diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
index f23a406..2b9b691 100644
--- a/tests/system-kmod-macros.at
+++ b/tests/system-kmod-macros.at
@@ -23,6 +23,7 @@ m4_define([OVS_TRAFFIC_VSWITCHD_START],
                on_exit 'modprobe -q -r mod'
               ])
    on_exit 'ovs-dpctl del-dp ovs-system'
+   on_exit 'ovs-appctl dpctl/flush-conntrack'
    _OVS_VSWITCHD_START([])
    dnl Add bridges, ports, etc.
    AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], [| 
uuidfilt])], [0], [$2])
diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
index 4dc3151..4c3ad0f 100644
--- a/utilities/ovs-lib.in
+++ b/utilities/ovs-lib.in
@@ -616,6 +616,7 @@ force_reload_kmod () {
     for dp in `ovs-dpctl dump-dps`; do
         action "Removing datapath: $dp" ovs-dpctl del-dp "$dp"
     done
+    action "ovs-appctl dpctl/flush-conntrack"
 
     for vport in `awk '/^vport_/ { print $1 }' /proc/modules`; do
         action "Removing $vport module" rmmod $vport
-- 
1.8.3.1

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to