On 05/25/2012 04:52 PM, Dan Kenigsberg wrote:
On Fri, May 25, 2012 at 02:59:31AM +0800, ShaoHe Feng wrote:
On 05/23/2012 03:52 AM, Dan Kenigsberg wrote:
On Tue, May 22, 2012 at 05:02:01PM +0800, ShaoHe Feng wrote:
both mountTests and parted_utils_tests failed.
Failed where? On your own host? Is it reproducible? We had a similar,
but transient, problem in
http://jenkins.ovirt.org/job/vdsm_unit_tests/143/console
Yes, it is reproducible.
But if I stop the sandbox service and restart, the problem
does not occur.
If I start the sandbox service, the problem comes up again.
Pardon, but I am not familiar with the sandbox service. Could you
describe a complete reproducer?
# chkconfig sandbox off
Then restart Fedora.
# ./run_tests.sh mountTests
No matter how many times the test runs, the problem does not come up.
# chkconfig sandbox on
Then restart Fedora.
# ./run_tests.sh mountTests
# ls /dev/loop*
There are 7 loop devices.
After I run the test 8 times, the problem comes up:
losetup: could not find any free loop device
However, after I create one more loop block device with mknod, I can run the
test successfully one more time.
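The mknod workaround described above can be sketched as follows. This is only an illustration, not something taken from the vdsm tree: the helper name `next_loop_node` is hypothetical, and the only fact it relies on is that Linux loop devices are block devices with major number 7 and a minor number equal to the N in /dev/loopN.

```python
import re

LOOP_MAJOR = 7  # block major number used by Linux loop devices


def next_loop_node(existing):
    """Return (path, major, minor) for the first unused /dev/loopN name.

    `existing` is a list of paths, e.g. the result of glob('/dev/loop*').
    """
    used = set()
    for name in existing:
        m = re.match(r"^/dev/loop(\d+)$", name)
        if m:  # ignores entries such as /dev/loop-control
            used.add(int(m.group(1)))
    n = 0
    while n in used:
        n += 1
    return "/dev/loop%d" % n, LOOP_MAJOR, n
```

As root, the node itself could then be created with `os.mknod(path, 0o660 | stat.S_IFBLK, os.makedev(major, minor))`. This only buys one more test run; it does not address why the existing devices stay attached.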
We, or some other suite using the server, may be leaking a loop device.
Eyal Edri, do you know what has made this go away in run #144?
The test executes the 'mount' and 'umount' commands. After the umount
command, the loop device is not freed.
Here is the log:
-------------------->> begin captured logging<< --------------------
Storage.Misc.excCmd: DEBUG: 'dd if=/dev/zero of=/tmp/tmpH2KSCr bs=100M count=1' (cwd None)
Storage.Misc.excCmd: DEBUG: SUCCESS:<err> = '1+0 records in\n1+0 records out\n104857600 bytes (105 MB) copied, 0.266024 s, 394 MB/s\n';<rc> = 0
Storage.Misc.excCmd: DEBUG: 'losetup -f --show /tmp/tmpH2KSCr' (cwd None)
Storage.Misc.excCmd: DEBUG: FAILED:<err> = 'losetup: could not find any free loop device\n';<rc> = 255
--------------------->> end captured logging<< ---------------------
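The log shows the attach step (`losetup -f --show`) failing because earlier runs left devices attached. One way to rule out the test itself as the source of a leak is to pair every attach with a detach that runs even when the test body fails. The sketch below assumes nothing about how mountTests is actually written; `run` and `loop_device` are hypothetical names, and only the `losetup -f --show` and `losetup -d` invocations come from the thread.

```python
import subprocess
from contextlib import contextmanager


def run(cmd):
    """Run a command and return its stripped stdout; raises on non-zero exit."""
    return subprocess.check_output(cmd).decode().strip()


@contextmanager
def loop_device(backing_file, runner=run):
    """Attach backing_file to the first free loop device, detach on exit."""
    # 'losetup -f --show' prints the device it picked, e.g. /dev/loop0.
    device = runner(["losetup", "-f", "--show", backing_file])
    try:
        yield device
    finally:
        # Detach even if the body (mount, checks, umount) raised.
        runner(["losetup", "-d", device])
```

The `runner` parameter exists only so the helper can be exercised without root. If `losetup -d` itself fails because the device is busy, the exception propagates and the leak becomes visible in the test output instead of passing silently.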
Does your Linux host have a trace of the generated loop device?
What does
losetup -a
say?
# losetup -a
/dev/loop0: [fd03]:918121 (/tmp/tmpeihztM)
/dev/loop1: [fd03]:918122 (/tmp/tmp43EVnb)
/dev/loop2: [fd03]:918123 (/tmp/tmpCoknYi)
/dev/loop3: [fd03]:918124 (/tmp/tmp_PFqBx)
/dev/loop4: [fd03]:918125 (/tmp/tmplVEPQs)
/dev/loop5: [fd03]:918126 (/tmp/tmpQrHVKH)
/dev/loop6: [fd03]:918127 (/tmp/tmpZkZJ7V)
/dev/loop7: [fd03]:918128 (/tmp/tmpmUSR26)
# losetup -d /dev/loop0
loop: can't delete device /dev/loop0: Device or resource busy
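To check quickly whether the attached devices still point at files that exist, the `losetup -a` output can be parsed. This is a minimal sketch that assumes the exact line format shown above (`/dev/loopN: [dev]:inode (path)`); other util-linux versions print additional fields, which this regex does not handle. The function names are hypothetical.

```python
import os
import re

# Matches lines such as: /dev/loop0: [fd03]:918121 (/tmp/tmpeihztM)
_LINE = re.compile(r"^(/dev/loop\d+): \[[0-9a-f]*\]:(\d+) \((.*)\)$")


def parse_losetup(output):
    """Return a list of (device, inode, backing_path) tuples."""
    entries = []
    for line in output.splitlines():
        m = _LINE.match(line.strip())
        if m:
            entries.append((m.group(1), int(m.group(2)), m.group(3)))
    return entries


def orphaned(entries, exists=os.path.exists):
    """Devices whose backing file is gone, i.e. likely leftovers."""
    return [dev for dev, _inode, path in entries if not exists(path)]
```

If every /tmp/tmp* backing file has already been deleted while its device is still attached, that would point at the detach step rather than the file cleanup.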
# lsof -L | grep loop
loop0 11198 root cwd DIR 253,3 4096 2 /
Which process is 11198?
It is a kernel thread.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 11198 0.0 0.0 0 0 ? S< 02:32 0:00 [loop0]
loop0 11198 root rtd DIR 253,3 4096 2 /
loop0 11198 root txt unknown /proc/11198/exe
loop1 11287 root cwd DIR 253,3 4096 2 /
loop1 11287 root rtd DIR 253,3 4096 2 /
loop1 11287 root txt unknown /proc/11287/exe
loop2 11309 root cwd DIR 253,3 4096 2 /
loop2 11309 root rtd DIR 253,3 4096 2 /
loop2 11309 root txt unknown /proc/11309/exe
loop3 11327 root cwd DIR 253,3 4096 2 /
loop3 11327 root rtd DIR 253,3 4096 2 /
loop3 11327 root txt unknown /proc/11327/exe
loop4 11350 root cwd DIR 253,3 4096 2 /
loop4 11350 root rtd DIR 253,3 4096 2 /
loop4 11350 root txt unknown /proc/11350/exe
loop5 11372 root cwd DIR 253,3 4096 2 /
loop5 11372 root rtd DIR 253,3 4096 2 /
loop5 11372 root txt unknown /proc/11372/exe
loop6 11391 root cwd DIR 253,3 4096 2 /
loop6 11391 root rtd DIR 253,3 4096 2 /
loop6 11391 root txt unknown /proc/11391/exe
loop7 11408 root cwd DIR 253,3 4096 2 /
loop7 11408 root rtd DIR 253,3 4096 2 /
loop7 11408 root txt unknown /proc/11408/exe
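The lsof entries above belong to the per-device [loopN] kernel threads, so they show that the devices are still set up but not which userspace process, if any, is holding them. A rough way to tell kernel threads from ordinary processes is that a kernel thread has an empty /proc/<pid>/cmdline. The helper below is a sketch built on that heuristic; `is_kernel_thread` and its `proc_root` parameter are hypothetical names added for illustration.

```python
import os


def is_kernel_thread(pid, proc_root="/proc"):
    """Heuristic: kernel threads have an empty /proc/<pid>/cmdline.

    Caveat: zombie processes also report an empty cmdline, so treat a
    True result as a hint, not a proof.
    """
    with open(os.path.join(proc_root, str(pid), "cmdline"), "rb") as f:
        return f.read() == b""
```

Filtering the lsof PIDs through such a check leaves only userspace holders, which are the ones worth investigating further.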
Should I use strace to watch the syscalls and see what happens to
/dev/loop?
# strace -f -F -o ./strace.log ./run_tests.sh mountTests
Or is there another way to get more info?
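If the strace route is taken, the resulting log will be large. A small filter such as the one below could narrow it down to lines that name a loop device path. It is a sketch only: `loop_related` is a hypothetical helper, and it will miss calls that operate on an already-open file descriptor (for example ioctl calls), since those lines do not contain the path.

```python
def loop_related(lines):
    """Yield strace log lines that mention a /dev/loop path."""
    for line in lines:
        if "/dev/loop" in line:
            yield line.rstrip("\n")
```

It can be applied to an open file object, for instance by iterating `loop_related(open("strace.log"))` and printing each result.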
_______________________________________________
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel