Re: [Valgrind-users] Getting SIGKILL to work in MariaDB

Philippe Waroquiers Tue, 13 Aug 2019 21:45:27 -0700

On Tue, 2019-08-13 at 16:54 +0300, Michael Widenius wrote:

> I can try that at once. Thanks.
> Testing...
> hm.. It would be good to describe in README_DEVELOPERS how to generate the
> autoconfigure scripts.  Now it says to run 'make dist', which will
> not work out of the box.
README has a section "Building and installing it".
It looks like README_DEVELOPERS should start with the same title,
and reference the README section.


> Same problem with README.solaris. Please add that one should run first
> 'autogen.sh' and then configure.
README.solaris has a pointer to the README file in the "Compilation" section,
but again that can be made more precise.

I will improve both files to add the relevant reference to README.



> Currently the kill signal is never sent to the process and the process
> continues to run.
Yes, that is bug 409141 (fixed in the git version).

> 
> > You can avoid this interception by instead doing something like
> >    system("kill -9 me");
> > as there is no way valgrind will be able to intercept this.
> > (of course, me has to be the pid of the calling process).
> 
> In this case the process (here mysqld) needs to know who its
> 'father' process is.
> This is a bit hard to provide to mysqld, as we only know the pid after
> valgrind is started with mysqld as an option.  Another issue is that
> mysqld doesn't know if it's run under valgrind or not and it should
> only kill its parent if it's valgrind.
I am not sure I understand, so I will explain what I understood:
You used to have in a process some lines of code sending SIGKILL
to itself, so as to commit suicide ("hard exit").
Due to bug 409141, when running under valgrind, this hangs.
You worked around the problem by sending SIGFPE instead, but you would
like to go back to SIGKILL.
With the fix for 409141, sending SIGKILL really kills
the process again.

The remaining problem is that SIGKILL is still causing a leak search
to be done.

To really have a "hard exit", you can replace the lines of code sending SIGKILL
with the following 3 lines:
      char cmd[1000];
      snprintf(cmd, sizeof cmd, "kill -9 %d", (int) getpid());
      system(cmd);
With this, you really have a "hard exit".
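
A slightly more defensive version of the three lines above could look like
the sketch below. The helper names build_kill_command and hard_exit are made
up for illustration and are not part of valgrind or MariaDB. The sketch
assumes a POSIX system with a "kill" command on the PATH, and it has not been
tested under valgrind.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: formats "kill -9 <pid>" into buf.
   Returns the command length, or -1 if buf is too small. */
int build_kill_command(char *buf, size_t size, pid_t pid)
{
    int n = snprintf(buf, size, "kill -9 %ld", (long) pid);
    if (n < 0 || (size_t) n >= size)
        return -1;
    return n;
}

/* Hypothetical helper: asks an external "kill" process to send SIGKILL
   to the calling process, so the signal is not sent by this process. */
void hard_exit(void)
{
    char cmd[64];
    if (build_kill_command(cmd, sizeof cmd, getpid()) > 0) {
        int rc = system(cmd);
        (void) rc;  /* nothing useful to do if the command failed */
    }
    /* Only reached if the command could not be built or run. */
    _exit(1);
}
```

Calling hard_exit() where the code used to send SIGKILL to itself is the
intended use. The _exit(1) fallback is an assumption about what should happen
if the shell command fails.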

> (Assuming I understood correctly what you meant with  calling process;
> I think you mean the 'valgrind' process in this case)
Note that when a process "runs under valgrind", there is still only one
single process, which contains both the valgrind code and the code of
the program being run "under valgrind".
So I am not sure I understand why anything special
needs to be done about the parent process when running under valgrind.

> 
> > > Please open a bug at https://bugs.kde.org/enter_bug.cgi?product=valgrind
> > > and attach a minimal test case if at all possible.
> 
> Will try to create a test case, but it's not that easy as for simple
> programs valgrind seems to pass KILL forwards.  Maybe this is only a
> problem with threaded programs.
> 
> https://www.mail-archive.com/valgrind-users@lists.sourceforge.net/msg06862.html
> seems to highlight the same problem.
As far as I can see, this message and your problem of hanging after sending
SIGKILL are both bug 409141.
And you have double-checked that the git version also fixes the hang.
So, IMO, there is no need for another reproducer.


> I have now tested with 10.3.16.GIT from today and did run it with
> --trace-signals=yes.
> In this case the SIGKILL is sent to the process and the process is
> killed so it looks
> like the bug is fixed. Great!
> However one problem still remains. After the kill, we still get a
> report of leaks which
> is not that relevant when a process is killed hard.
> Hope you will figure out a way to stop this report!
Well, the valgrind signal sending interception code has been explicitly
designed to do various "end of life actions" (such as running the leak search)
when a process terminates or dies, including the case where a process
sends a fatal signal to itself.

It is of course possible to implement something to really have a "hard exit"
and/or disable the leak search.  What the best feature/way to do that is, is not
(yet) clear to me: a more general approach such as loading a supp file
or allowing valgrind parameters to be changed is more attractive than e.g.
a very specialised request such as VALGRIND_HARD_EXIT();

In the meantime, if my understanding is correct,
   system ("kill -9 ..."); 
as explained above gives a hard exit without allowing valgrind to
intercept the signal and do a leak search or any other closing actions.

If that does not work/is not usable, then I have missed something in
what you are trying to do.

Further comments welcome ...

Philippe




_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
