Hello Piotr Kliczewski, Adam Litke,

I'd like you to do a code review.  Please visit

    https://gerrit.ovirt.org/57710

to review the following change.

Change subject: blockSD: Fix busy loop when zeroing image volumes
......................................................................

blockSD: Fix busy loop when zeroing image volumes

We used to wait on the wrong file descriptor (stdout, writing to image),
using the wrong mask (EPOLLHUP when using poll).

The result - vdsm consume 100% cpu when zeroing image:

Tue May 17 01:33:57 2016    vdsmd.prof

         310515403 function calls (310560288 primitive calls) in 377.096 seconds

   Ordered by: internal time
   List reduced from 3173 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 37946430   74.674    0.000  257.590    0.000 commands.py:302(AsyncProc.wait)
        1   65.338   65.338  357.741  357.741 blockSD.py:229(zeroImgVolumes)
 37947222   54.932    0.000   54.932    0.000 posix:0(waitpid)
 37946448   38.780    0.000  159.845    0.000 
commands.py:284(AsyncProc.returncode)
 37946441   38.363    0.000   93.210    0.000 
subprocess.py:1344(CPopen._internal_poll)
 38034684   35.044    0.000   35.045    0.000 __builtin__:0(poll)
 37946441   27.856    0.000  121.065    0.000 subprocess.py:803(CPopen.poll)
 75928952   23.096    0.000   23.096    0.000 time:0(time)

Fixed by watching stderr, where the stats are written at the end, using POLLIN.

This fix is mainly for 3.6, for 4.0 we want to drop this buggy code and
replace it with simpler thread based code.

Change-Id: I7f5ec5faefb6840f1bd5348b93cc784d0aff8690
Bug-Url: https://bugzilla.redhat.com/1337314
Signed-off-by: Nir Soffer <nsof...@redhat.com>
Reviewed-on: https://gerrit.ovirt.org/57541
Reviewed-by: Piotr Kliczewski <piotr.kliczew...@gmail.com>
Reviewed-by: Adam Litke <ali...@redhat.com>
Continuous-Integration: Jenkins CI
---
M vdsm/storage/blockSD.py
1 file changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/10/57710/1

diff --git a/vdsm/storage/blockSD.py b/vdsm/storage/blockSD.py
index 8d953e0..b9a2010 100644
--- a/vdsm/storage/blockSD.py
+++ b/vdsm/storage/blockSD.py
@@ -240,9 +240,9 @@
     poller = select.poll()
     for volUUID in volUUIDs:
         proc = _zeroVolume(sdUUID, volUUID)
-        fd = proc.stdout.fileno()
+        fd = proc.stderr.fileno()
         zerofds[fd] = ProcVol(proc, volUUID)
-        poller.register(fd, select.EPOLLHUP)
+        poller.register(fd, select.POLLIN)
 
     # Wait until all the asyncs procs return
     # Yes, this is a potentially infinite loop. Kill the vdsm task.


-- 
To view, visit https://gerrit.ovirt.org/57710
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7f5ec5faefb6840f1bd5348b93cc784d0aff8690
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: ovirt-3.6
Gerrit-Owner: Nir Soffer <nsof...@redhat.com>
Gerrit-Reviewer: Adam Litke <ali...@redhat.com>
Gerrit-Reviewer: Piotr Kliczewski <piotr.kliczew...@gmail.com>
_______________________________________________
vdsm-patches mailing list
vdsm-patches@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches

Reply via email to