On 17/06/2019 17:51, Kevin Wolf wrote:
> On 17.06.2019 at 15:20, Roman Kagan wrote:
>> On Mon, Jun 17, 2019 at 02:53:55PM +0200, Kevin Wolf wrote:
>>> On 17.06.2019 at 14:18, Roman Kagan wrote:
>>>> On Mon, Jun 17, 2019 at 01:15:04PM +0200, Kevin Wolf wrote:
>>>>> On 11.06.2019 at 20:02, Andrey Shinkevich wrote:
>>>>>> The Valgrind tool fails to manage its termination when QEMU raises the
>>>>>> signal SIGKILL. Let's exclude such test cases from running under the
>>>>>> Valgrind because there is no sense to check memory issues that way.
>>>>>>
>>>>>> Signed-off-by: Andrey Shinkevich <andrey.shinkev...@virtuozzo.com>
>>>>>
>>>>> I don't fully understand the reasoning here. Most interesting memory
>>>>> access errors happen before a process terminates. (I'm not talking about
>>>>> leaks here, but use-after-free, buffer overflows, uninitialised memory
>>>>> etc.)
>>>>
>>>> Nothing of the above, and nothing in general, happens in the usermode
>>>> process upon SIGKILL delivery.
>>>
>>> My point is, the interesting part is what the program does before
>>> SIGKILL happens. There is value in reporting memory errors as long as we
>>> can, even if the final check doesn't happen because of SIGKILL.
>>
>> Agreed in general, but here the testcases that include 'sigraise 9' only
>> do simple operations before that which are covered elsewhere too. So
>> the extra effort on making valgrind work with these testcases arguably
>> isn't worth the extra value to be gained.
>
> Ok, fair enough.
>
>>>>> However, I do see that running these test cases with -valgrind ends in a
>>>>> hang because the valgrind process keeps hanging around as a zombie
>>>>> process and the test case doesn't reap it. I'm not exactly sure why that
>>>>> is, but it looks more like a problem with the parent process (i.e. the
>>>>> bash script).
>>>>
>>>> It rather looks like valgrind getting confused about what to do with
>>>> raise(SIGKILL) in the multithreaded case.
>>>
>>> Well, valgrind can't do anything with SIGKILL, obviously, because it's
>>> killed immediately.
>>
>> Right, but it can do whatever it wants with raise(SIGKILL). I haven't
>> looked at valgrind sources, but
>>
>> # strace -ff valgrind qemu-io -c 'sigraise 9'
>>
>> shows SIGKILL neither sent nor received by any thread; it just shows the
>> main thread exit and the second thread getting stuck waiting on a futex.
>
> Oh, I didn't see this! So there isn't even a real SIGKILL signal.
>
>>> But maybe the kernel does get confused for some
>>> reason. I get the main thread as a zombie, but a second is still
>>> running. Sending SIGKILL to the second thread, too, makes the test case
>>> complete successfully.
>>>
>>> So I guess the main question is why the second thread isn't
>>> automatically killed when the main thread receives SIGKILL.
>>
>> I don't see any thread receive SIGKILL. So I tend to think this is
>> valgrind's bug/feature.
>>
>> Anyway the problem is outside of QEMU, so I think we need to weigh the
>> costs of investigating it and implementing a workaround with the
>> potential benefit.
>
> I'd suggest to file a bug against valgrind at least. And indeed just
> disable valgrind here like this patch does.
>
> Kevin
>
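
For reference, a minimal sketch of the pattern discussed above: a process
with a second thread whose main thread calls raise(SIGKILL), and a parent
that reaps it. This assumes that 'sigraise 9' in qemu-io boils down to
raise(SIGKILL) from the main thread while another thread exists; I have not
checked that against the qemu-io code. The names idle_thread,
child_raise_sigkill and run_and_reap are made up for this illustration.

```c
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Second thread: sleeps forever, standing in for an idle worker thread. */
static void *idle_thread(void *arg)
{
    (void)arg;
    for (;;) {
        pause();
    }
    return NULL;
}

/* Child body: start a second thread, then raise SIGKILL from the main thread. */
static void child_raise_sigkill(void)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, idle_thread, NULL) != 0) {
        _exit(2);
    }
    raise(SIGKILL);
    _exit(3); /* only reached if SIGKILL was not delivered */
}

/*
 * Fork the child and reap it. Returns the number of the signal that
 * terminated the child, or -1 if it exited normally or an error occurred.
 */
int run_and_reap(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        child_raise_sigkill();
    }
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Run natively, run_and_reap() should return SIGKILL: the kernel kills every
thread of the child and waitpid() returns. The hang described in this thread
is the case where the same pattern runs under valgrind and the second thread
stays behind, so the parent never gets to reap the process.
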
I have reported the issue to the KDE Bugtracking System at bugs.kde.org, as
instructed on www.valgrind.org/support/bug_reports.html. Bug 409141,
"Valgrind hangs when SIGKILLed", has been created. The thread can be seen at
https://bugs.kde.org/show_bug.cgi?id=409141

Andrey