Hello Nir Soffer, Francesco Romani,

I'd like you to do a code review.  Please visit

    https://gerrit.ovirt.org/51840

to review the following change.

Change subject: virt: Correct epoll unregistration usage in vmchannels
......................................................................

virt: Correct epoll unregistration usage in vmchannels

Previously we have been unregistering fds from epoll but not in the correct
way. Which has led to log messages which seemed to have no good explanation
as of why they occur. That has been changed to 'fix' it by letting only epoll
unregister those file descriptors themselves when they are closed.
This however has been wrong - the correct procedure to use epoll is to
unregister the file descriptor first and then close the socket.

In the vmchannel case there are two scenarios where this applies:

- On an error reported by epoll
- When we want to unregister the channel from vmchannels

This patch fixes those two scenarios.

Change-Id: I9f7eab8318f41f653e0a24552c81bcd5b09d8690
Backport-To: 3.5
Backport-To: 3.6
Bug-Url: https://bugzilla.redhat.com/1226911
Signed-off-by: Vinzenz Feenstra <vfeen...@redhat.com>
Reviewed-on: https://gerrit.ovirt.org/51521
Reviewed-by: Francesco Romani <from...@redhat.com>
Reviewed-by: Michal Skrivanek <michal.skriva...@redhat.com>
Continuous-Integration: Nir Soffer <nsof...@redhat.com>
Reviewed-by: Nir Soffer <nsof...@redhat.com>
Continuous-Integration: Jenkins CI
---
M vdsm/virt/vmchannels.py
1 file changed, 31 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/40/51840/1

diff --git a/vdsm/virt/vmchannels.py b/vdsm/virt/vmchannels.py
index 45f18bd..5d9ac8e 100644
--- a/vdsm/virt/vmchannels.py
+++ b/vdsm/virt/vmchannels.py
@@ -18,6 +18,7 @@
 # Refer to the README and COPYING files for full details of the license
 #
 
+import errno
 import threading
 import time
 import select
@@ -47,6 +48,16 @@
         self._del_channels = []
         self._timeout = None
 
+    def _unregister_fd(self, fileno):
+        try:
+            self._epoll.unregister(fileno)
+        except IOError as e:
+            if e.errno != errno.ENOENT:
+                raise
+            # This case shouldn't happen anymore - But let's track it anyway
+            self.log.debug("Failed to unregister FD from epoll (ENOENT): %d",
+                           fileno)
+
     def _handle_event(self, fileno, event):
         """ Handle an epoll event occurred on a specific file descriptor. """
         reconnect = False
@@ -55,8 +66,8 @@
             if fileno in self._channels:
                 reconnect = True
             else:
-                self.log.debug("Received %.08X. On fd removed by epoll.",
-                               event)
+                self.log.warning('Received an error on an untracked fd(%d)',
+                                 fileno)
         elif (event & select.EPOLLIN):
             obj = self._channels.get(fileno, None)
             if obj:
@@ -77,6 +88,8 @@
             self._prepare_reconnect(fileno)
 
     def _prepare_reconnect(self, fileno):
+            # fileno will be closed by create_cb
+            self._unregister_fd(fileno)
             obj = self._channels.pop(fileno)
             obj['timeout_seen'] = False
             try:
@@ -85,7 +98,8 @@
                 self.log.exception("An error occurred in the create callback "
                                    "fileno: %d.", fileno)
             else:
-                self._unconnected[fileno] = obj
+                with self._update_lock:
+                    self._unconnected[fileno] = obj
 
     def _handle_timeouts(self):
         """
@@ -171,7 +185,8 @@
             self._update_channels()
             if (self._timeout is not None) and (self._timeout > 0):
                 self._handle_timeouts()
-            self._handle_unconnected()
+            with self._update_lock:
+                self._handle_unconnected()
 
     def run(self):
         """ The listener thread's function. """
@@ -213,4 +228,16 @@
         """ Unregister an exist file descriptor from the listener. """
         self.log.debug("Delete fileno %d from listener.", fileno)
         with self._update_lock:
+            # Threadsafe, fileno will be closed by caller
+            # NOTE: unregister_fd has to be called here, otherwise it'd be
+            # removed from epoll after the socket has been closed, which is
+            # incorrect
+            # NOTE: unregister_fd must be only called if fileno is not in
+            # _add_channels and not in _unconnected because it might be
+            # about to reconnect after an error or has just been added.
+            # In those cases fileno is not being tracked by epoll and
+            # would result in an ENOENT error
+            if (fileno not in self._add_channels and
+                    fileno not in self._unconnected):
+                self._unregister_fd(fileno)
             self._del_channels.append(fileno)


-- 
To view, visit https://gerrit.ovirt.org/51840
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9f7eab8318f41f653e0a24552c81bcd5b09d8690
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: ovirt-3.6
Gerrit-Owner: Vinzenz Feenstra <vfeen...@redhat.com>
Gerrit-Reviewer: Francesco Romani <from...@redhat.com>
Gerrit-Reviewer: Nir Soffer <nsof...@redhat.com>
_______________________________________________
vdsm-patches mailing list
vdsm-patches@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches

Reply via email to