Nir Soffer has uploaded a new change for review.

Change subject: sp: Fix stopping domain monitors
......................................................................

sp: Fix stopping domain monitors

Commit 2671777c69 fixed stopping of domain monitors by stopping monitors from
StoragePool.__del__. This was a dangerous fix because a stray reference to the
pool may delay invocation of __del__ until the referring object is deleted,
delaying stopping of domain monitors for unlimited time. In case of a memory
leak, the pool may never be deleted and domain monitors would never stop.

Fortunately, this fix did seem to work for couple of years, until commit
7b1cc6a20f made the domain monitor pool independent. Instead of one domain
monitor object per pool, there is now a single shared domain monitor object
used by all pool objects. Having a shared domain monitor, the pool __del__
method became deadly.

Shortly after commit 7b1cc6a20f was merged, a new random error appeared in CI
jobs, where there is no storage space left, while storage has plenty of space.
The root cause of the error was anonymous thread running at unexpected times
and stopping silently all domain monitors.  Adding some logging revealed that
this thread was started from StoragePool.__del__ method. This thread was
running 14 minutes after the original pool was disconnected, stopping domain
monitors used by the current pool object. It seems that a task was holding a
reference to the old pool, and when the task was finished, the old pool was
finally destroyed.

We seem to stop monitoring when we should, and stopping monitoring in
the __del__ method is redundant.  This patch removes the deadly method.

Change-Id: Iad80984ab44dd4a4e36b92330eb9320cf68f1580
Relates-To: https://bugzilla.redhat.com/705058
Relates-To: http://gerrit.ovirt.org/19762
Bug-Url: https://bugzilla.redhat.com/1032925
Signed-off-by: Nir Soffer <[email protected]>
---
M vdsm/storage/sp.py
1 file changed, 0 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/58/22058/1

diff --git a/vdsm/storage/sp.py b/vdsm/storage/sp.py
index 31e0bcd..91b27d5 100644
--- a/vdsm/storage/sp.py
+++ b/vdsm/storage/sp.py
@@ -140,10 +140,6 @@
     def getSpmStatus(self):
         return self.spmRole, self.getSpmLver(), self.getSpmId()
 
-    def __del__(self):
-        if len(self.domainMonitor.poolMonitoredDomains) > 0:
-            threading.Thread(target=self.stopMonitoringDomains).start()
-
     @unsecured
     def forceFreeSpm(self):
         # DO NOT USE, STUPID, HERE ONLY FOR BC


-- 
To view, visit http://gerrit.ovirt.org/22058
To unsubscribe, visit http://gerrit.ovirt.org/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Iad80984ab44dd4a4e36b92330eb9320cf68f1580
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Nir Soffer <[email protected]>
_______________________________________________
vdsm-patches mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches

Reply via email to