Nir Soffer has uploaded a new change for review.

Change subject: hsm: wait until lvm bootstrap is done before connecting to pool
......................................................................

hsm: wait until lvm bootstrap is done before connecting to pool

When hsm is created, a bootstrap thread is started, initializing the lvm
cache and other parts of the system. This leads to races between
different threads accessing the lvm module when connecting to a storage
pool. The races may lead to two threads trying to rebuild the cache at
the same time, possibly corrupting the cache.

This patch adds a synchronization point after lvm bootstrap is done. If
engine attempts to connect to a storage pool while the lvm cache is not
ready yet, the request waits until the cache is ready, or fails after
lvm_bootstrap_timeout seconds (120 by default).

This patch delays connecting a host to the storage pool while lvm
bootstrap is still running. The delay is unavoidable if we want correct
code.

Change-Id: I22e851dca4c2063d19446f34897a7b208b9cace4
Signed-off-by: Nir Soffer <nsof...@redhat.com>
---
M lib/vdsm/config.py.in
M vdsm/storage/hsm.py
2 files changed, 19 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/88/21388/1

diff --git a/lib/vdsm/config.py.in b/lib/vdsm/config.py.in
index 9668772..8e075fd 100644
--- a/lib/vdsm/config.py.in
+++ b/lib/vdsm/config.py.in
@@ -267,6 +267,10 @@
 
         ('use_volume_leases', 'false',
             'Whether to use the volume leases or not.'),
+
+        ('lvm_bootstrap_timeout', '120',
+            'Time in seconds to wait until lvm bootstrap is done when'
+            ' connecting to storage pool.')
     ]),
 
     # Section: [addresses]
diff --git a/vdsm/storage/hsm.py b/vdsm/storage/hsm.py
index 5e53f3a..725279f 100644
--- a/vdsm/storage/hsm.py
+++ b/vdsm/storage/hsm.py
@@ -373,8 +373,16 @@
         except Exception:
             self.log.warn("Failed to clean Storage Repository.", exc_info=True)
 
+        self._bootstrapDone = threading.Event()
+
         def storageRefresh():
-            lvm._lvminfo.bootstrap()
+            # This may take more than a minute when there are many lvs. Until
+            # it finishes, we should not use the lvm module.
+            try:
+                lvm._lvminfo.bootstrap()
+            finally:
+                self._bootstrapDone.set()
+
             sdCache.refreshStorage()
 
             fileUtils.createdir(self.tasksDir)
@@ -1008,6 +1016,12 @@
                 "spUUID=%s, msdUUID=%s, masterVersion=%s, hostID=%s, "
                 "scsiKey=%s" % (spUUID, msdUUID, masterVersion,
                                 hostID, scsiKey)))
+
+        timeout = config.getint('vars', 'lvm_bootstrap_timeout')
+        self._bootstrapDone.wait(timeout)
+        if not self._bootstrapDone.is_set():
+            raise se.StoragePoolConnectionError("Bootstrap is not done yet")
+
         with rmanager.acquireResource(STORAGE, HSM_DOM_MON_LOCK,
                                       rm.LockType.exclusive):
             return self._connectStoragePool(spUUID, hostID, scsiKey, msdUUID,

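The synchronization point in the hunks above can be tried outside vdsm with
a minimal standalone sketch. The names Service, BootstrapTimeout and connect
are hypothetical and only stand in for HSM, se.StoragePoolConnectionError
and connectStoragePool; nothing below is taken from vdsm itself.

```python
import threading
import time


class BootstrapTimeout(Exception):
    """Hypothetical stand-in for se.StoragePoolConnectionError."""


class Service(object):
    """Hypothetical stand-in for HSM: bootstraps in a background thread."""

    def __init__(self, bootstrap, timeout):
        self._timeout = timeout
        self._bootstrap_done = threading.Event()

        def run():
            # Set the event even if bootstrap raises, so callers are not
            # left waiting for the full timeout on a failed bootstrap.
            try:
                bootstrap()
            finally:
                self._bootstrap_done.set()

        thread = threading.Thread(target=run)
        thread.daemon = True
        thread.start()

    def connect(self):
        # Block until bootstrap has finished or the timeout expires, then
        # check the flag explicitly, as the patch does.
        self._bootstrap_done.wait(self._timeout)
        if not self._bootstrap_done.is_set():
            raise BootstrapTimeout("Bootstrap is not done yet")
        return "connected"


if __name__ == "__main__":
    # Bootstrap finishes well inside the timeout: connect() just waits.
    print(Service(lambda: time.sleep(0.1), timeout=2).connect())
```

Checking is_set() after wait() rather than relying on the return value of
wait() keeps the sketch independent of the Python version: wait() returns
the flag only from Python 2.7 on.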

-- 
To view, visit http://gerrit.ovirt.org/21388
To unsubscribe, visit http://gerrit.ovirt.org/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I22e851dca4c2063d19446f34897a7b208b9cace4
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Nir Soffer <nsof...@redhat.com>
_______________________________________________
vdsm-patches mailing list
vdsm-patches@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches
