[ 
https://issues.apache.org/jira/browse/DAEMON-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17717230#comment-17717230
 ] 

Klaus Malorny commented on DAEMON-459:
--------------------------------------

Hi Mark,

you cannot trigger the problem by sending signals to the process from outside. 
The problem is that the code tries to send a signal to itself via kill, but 
fails to do so because the variable that is supposed to hold the process's own 
ID does not hold it. The first time it works because the variable contains 0, 
which sends the signal to all processes of the process group (which includes 
the child process). On the second attempt, it contains the process ID of the 
previous child, which no longer exists. I guess that the kill function returns 
an error, but the return value is not checked.
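
To illustrate the second-attempt failure in isolation, here is a minimal sketch 
that is independent of jsvc. The function name {{signal_stale_pid}} is made up 
for this example. It forks a child, reaps it, and then probes the old PID with 
the null signal (0), so nothing is actually delivered should the PID have been 
reused in the meantime.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo, not jsvc code.
 * Forks a child that exits at once, reaps it, then calls kill() on the
 * now stale PID. Returns the errno set by kill(), 0 if kill() succeeded
 * (only possible if the PID was reused), or -1 if fork/waitpid failed.
 */
static int signal_stale_pid(void)
{
    pid_t stale = fork();

    if (stale < 0)
        return -1;
    if (stale == 0)
        _exit(0);                 /* child: terminate immediately */

    if (waitpid(stale, NULL, 0) < 0)
        return -1;                /* could not reap the child */

    /* Signal 0 only performs the existence and permission checks. */
    if (kill(stale, 0) == 0)
        return 0;

    return errno;                 /* ESRCH when no such process exists */
}
```

On a typical system the function returns ESRCH, which is the error that goes 
unnoticed if the return value of kill is ignored.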

You can alter the {{main_reload}} function at line 1408 (et seqq.) to print 
the {{controlled}} variable and, if you like, the result of the kill call, to 
see what's happening. You need to trigger the call of this function from 
within the application, i.e. via the {{DaemonController.reload()}} method in 
Java. I don't know whether Tomcat can be somehow prompted to do so.
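
A rough sketch of what such instrumentation could look like is given below. 
The helper {{checked_kill}} is hypothetical and not part of jsvc, and it writes 
to stderr only to stay self-contained; inside jsvc you would presumably use 
its own logging functions instead.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of jsvc.
 * Prints the target PID next to the caller's real PID, sends the signal,
 * and reports a failure instead of silently ignoring it.
 * Returns 0 on success, -1 on failure (errno is preserved).
 */
static int checked_kill(pid_t target, int sig)
{
    fprintf(stderr, "reload: target pid=%ld, own pid=%ld\n",
            (long)target, (long)getpid());

    if (kill(target, sig) != 0) {
        int err = errno;          /* save before stdio can change it */
        fprintf(stderr, "reload: kill failed: %s\n", strerror(err));
        errno = err;
        return -1;
    }
    return 0;
}
```

Calling it from {{main_reload}} with {{controlled}} as the target, in place of 
the plain kill call, should show 0 on the first reload and the PID of the 
previous child on the second one, together with the error from kill.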

> Restart only works once (regression)
> ------------------------------------
>
>                 Key: DAEMON-459
>                 URL: https://issues.apache.org/jira/browse/DAEMON-459
>             Project: Commons Daemon
>          Issue Type: Bug
>          Components: Jsvc
>    Affects Versions: 1.3.3
>            Reporter: Klaus Malorny
>            Priority: Major
>
> For certain functions, especially code updates, we rely on the ability to 
> restart the child process. This seems to work only once. On the subsequent 
> attempt, the child process hangs.
> I tracked down the problem and found out that the problem is within the 
> {{jsvc-unix.c}} file. The {{main_reload}} function is called to send the 
> signal to itself, but this does not happen. In the first restart, the 
> {{controlled}} variable holds the value of 0. This works by chance, as the 
> signal is sent to the parent, which sends it back to the child. In the second 
> attempt, the variable holds the PID of the previous child, thus the signal is 
> sent to a no longer existing process.
> The {{controlled}} variable is used both by the parent and the child process. 
> In earlier versions of the file, the child process determines its own PID by 
> using the {{getpid}} system function. This call has been – likely 
> accidentally – removed in version 1.3.3 or earlier. Thus, the variable 
> contains the parent's value before the fork which has created the child.
> The solution is simple: in the function {{child}}, add
> {{    controlled = getpid ();}}
> between the {{sigaction}} calls and the {{log_debug ("Waiting for a signal to 
> be delivered")}} call (line 913 in my copy of the file), i.e.
> {{    ...}}
> {{    memset(&act, '\0', sizeof(act));}}
> {{    act.sa_handler = handler;}}
> {{    sigemptyset(&act.sa_mask);}}
> {{    act.sa_flags = SA_RESTART | SA_NOCLDSTOP;}}
> {{    sigaction(SIGHUP, &act, NULL);}}
> {{    sigaction(SIGUSR1, &act, NULL);}}
> {{    sigaction(SIGUSR2, &act, NULL);}}
> {{    sigaction(SIGTERM, &act, NULL);}}
> {{    sigaction(SIGINT, &act, NULL);}}
> {{    controlled = getpid ();    /* added line */}}
> {{    log_debug("Waiting for a signal to be delivered");}}
> {{    create_tmp_file(args);}}
> {{    while (!stopping) {}}
> {{    ...}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
