Nir Soffer has uploaded a new change for review.

Change subject: blockSD: Fix busy loop when zeroing image volumes
......................................................................

blockSD: Fix busy loop when zeroing image volumes

We used to wait on the wrong file descriptor (stdout, writing to image),
using the wrong mask (EPOLLHUP when using poll).

The result - vdsm consume 100% cpu when zeroing image:

Tue May 17 01:33:57 2016    vdsmd.prof

         310515403 function calls (310560288 primitive calls) in 377.096 seconds

   Ordered by: internal time
   List reduced from 3173 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 37946430   74.674    0.000  257.590    0.000 commands.py:302(AsyncProc.wait)
        1   65.338   65.338  357.741  357.741 blockSD.py:229(zeroImgVolumes)
 37947222   54.932    0.000   54.932    0.000 posix:0(waitpid)
 37946448   38.780    0.000  159.845    0.000 
commands.py:284(AsyncProc.returncode)
 37946441   38.363    0.000   93.210    0.000 
subprocess.py:1344(CPopen._internal_poll)
 38034684   35.044    0.000   35.045    0.000 __builtin__:0(poll)
 37946441   27.856    0.000  121.065    0.000 subprocess.py:803(CPopen.poll)
 75928952   23.096    0.000   23.096    0.000 time:0(time)

Fixed by watching stderr, where the stats are written at the end, using POLLIN.

This fix is mainly for 3.6, for 4.0 we want to drop this buggy code and
replace it with simpler thread based code.

Change-Id: I7f5ec5faefb6840f1bd5348b93cc784d0aff8690
Signed-off-by: Nir Soffer <nsof...@redhat.com>
---
M vdsm/storage/blockSD.py
1 file changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/41/57541/1

diff --git a/vdsm/storage/blockSD.py b/vdsm/storage/blockSD.py
index d03bc0e..a9c82f5 100644
--- a/vdsm/storage/blockSD.py
+++ b/vdsm/storage/blockSD.py
@@ -246,9 +246,9 @@
     poller = select.poll()
     for volUUID in volUUIDs:
         proc = _zeroVolume(sdUUID, volUUID)
-        fd = proc.stdout.fileno()
+        fd = proc.stderr.fileno()
         zerofds[fd] = ProcVol(proc, volUUID)
-        poller.register(fd, select.EPOLLHUP)
+        poller.register(fd, select.POLLIN)
 
     # Wait until all the asyncs procs return
     # Yes, this is a potentially infinite loop. Kill the vdsm task.


-- 
To view, visit https://gerrit.ovirt.org/57541
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7f5ec5faefb6840f1bd5348b93cc784d0aff8690
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Nir Soffer <nsof...@redhat.com>
_______________________________________________
vdsm-patches mailing list
vdsm-patches@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches

Reply via email to