Re: [Patch] Reliable error logs for trunk

2009-01-18 Thread Rainer Jung

Hi Rüdiger,

first thanks for reviewing.

On 17.01.2009 18:02, Ruediger Pluem wrote:


On 01/16/2009 12:21 AM, Rainer Jung wrote:


This difference goes back to pre 1.3 httpd.

The patch at

http://people.apache.org/~rjung/patches/reliable_error_log.patch


Just some quick comments below

Index: server/log.c
===
--- server/log.c(revision 734457)
+++ server/log.c(working copy)

@@ -319,20 +333,28 @@
  int rc;

  if (*s-error_fname == '|') {
-apr_file_t *dummy = NULL;
+piped_log *pl;
+fname = ap_server_root_relative(p, s-error_fname + 1);

+if (!fname) {
+ap_log_error(APLOG_MARK, APLOG_STARTUP, APR_EBADPATH, NULL,
+ %s: Invalid error log path '%s'.,
+ ap_server_argv0, s-error_fname);
+return NULL;
+}
+
  /* Spawn a new child logger.  If this is the main server_rec,
   * the new child must use a dummy stderr since the current
   * stderr might be a pipe to the old logger.  Otherwise, the
   * child inherits the parents stderr. */
-rc = log_child(p, s-error_fname + 1,dummy, is_main);
-if (rc != APR_SUCCESS) {
-ap_log_error(APLOG_MARK, APLOG_STARTUP, rc, NULL,
- Couldn't start ErrorLog process);
+pl = piped_log_open(p, fname, is_main);
+if (pl == NULL) {
+ap_log_error(APLOG_MARK, APLOG_STARTUP, 0, NULL,
+ %s: Couldn't start ErrorLog process '%s'.,
+ ap_server_argv0, fname);
  return DONE;

Why not keeping the old way if

#ifndef AP_HAVE_RELIABLE_PIPED_LOGS?


I still need to investigate, what actually happens in the non-reliable 
case. But yes, for lowest risk, we could keep calling log_child() in 
that case, until we understand it better. On the long term I have a 
preference to keep access and error logging in both cases consistent.



  }
-
-s-error_log = dummy;
+s-error_log = ap_piped_log_write_fd(pl);
  }

  #ifdef HAVE_SYSLOG

@@ -953,13 +992,26 @@
  ap_log_error(APLOG_MARK, APLOG_STARTUP, 0, NULL,
   piped log program '%s' failed unexpectedly,
   pl-program);
-if ((stats = piped_log_spawn(pl)) != APR_SUCCESS) {
+
+/* If special_stderr is non-zero, we need to get the write
+ * end of the pipe back from stderr before spawning. */
+if ((plw-special_stderr  (stats = 
apr_file_dup(ap_piped_log_write_fd(pl),
+  stderr_log, 
pl-p))
+!= APR_SUCCESS) ||
+   ((stats = piped_log_spawn(plw)) != APR_SUCCESS) ||
+/* If special_stderr is non-zero, we need to close the write
+ * end of the pipe after spawning, because we write to it via
+ * stderr. */
+   (plw-special_stderr  (stats = 
apr_file_close(ap_piped_log_write_fd(pl)))
+!= APR_SUCCESS)) {

I know that this is always a difficult and confusing part, so maybe it just 
confused me again,
but why do we need to do the above?
ap_piped_log_write_fd(pl) should have the correct fd already because in 
ap_open_logs we just
duplicate the write side of the pipe to stderr and in all other cases (not main 
server error log)
ap_piped_log_write_fd is just fine anyways.


It *is* confusing.

Step 1:
---

open_error_log() calls
   piped_log_open() calls
  apr_file_pipe_create_ex()

We get 2 FDs, say FD9 (read) and FD10 (write).

Step 2:
---

   piped_log_open() calls
  piped_log_spawn() calls
 apr_procattr_child_in_set()


We have 2 additional FDs, say FD11 and FD12, FD11 pointing to the same 
end of the pipe as FD9 and FD12 as FD10.


Step 3:
---

  piped_log_spawn() calls
 apr_proc_create()

This closes FD11.

Step 4:
---

  piped_log_spawn() calls
 apr_file_close(procnew-in)

This closes FD12.

Now we have a separate process and the parent only has 2 additional FDs, 
which can be used to communicate to the external process and to recreate it.


The next steps do only happen for the main error logger:

Step 5:
---

open_error_log() calls
   apr_file_dup2(stderr_log, s_main-error_log, stderr_p)

s_main-error_log contains FD10, so the call lets FD2 (stderr_log) point 
to the same pipe as FD10. FD9 and FD10 remain unchanged.


Step 6:
---

open_error_log() calls
   apr_file_close(s_main-error_log)

FD10 gets closed, we only have FD2 and FD9 remaining, 
ap_piped_log_write_fd(pl) is gone now!


So if we want to keep this behaviour, we need to recreate the write side 
of the pipe from stderr in maintenance before spawning again (in the 
main error logger case), and close it after spawning.


I tried using FD2 directly as the write side without duping it, but that 
doesn't 

Re: [Patch] Reliable error logs for trunk

2009-01-18 Thread Ruediger Pluem


On 01/18/2009 03:52 PM, Rainer Jung wrote:
 Hi Rüdiger,
 
 first thanks for reviewing.
 
 On 17.01.2009 18:02, Ruediger Pluem wrote:

 On 01/16/2009 12:21 AM, Rainer Jung wrote:

 This difference goes back to pre 1.3 httpd.

 The patch at

 http://people.apache.org/~rjung/patches/reliable_error_log.patch

 Just some quick comments below

 Index: server/log.c
 ===
 --- server/log.c(revision 734457)
 +++ server/log.c(working copy)

 @@ -953,13 +992,26 @@
   ap_log_error(APLOG_MARK, APLOG_STARTUP, 0, NULL,
piped log program '%s' failed unexpectedly,
pl-program);
 -if ((stats = piped_log_spawn(pl)) != APR_SUCCESS) {
 +
 +/* If special_stderr is non-zero, we need to get the write
 + * end of the pipe back from stderr before spawning. */
 +if ((plw-special_stderr  (stats =
 apr_file_dup(ap_piped_log_write_fd(pl),
 + 
 stderr_log, pl-p))
 +!= APR_SUCCESS) ||
 +   ((stats = piped_log_spawn(plw)) != APR_SUCCESS) ||
 +/* If special_stderr is non-zero, we need to close the write
 + * end of the pipe after spawning, because we write to it
 via
 + * stderr. */
 +   (plw-special_stderr  (stats =
 apr_file_close(ap_piped_log_write_fd(pl)))
 +!= APR_SUCCESS)) {

 I know that this is always a difficult and confusing part, so maybe it
 just confused me again,
 but why do we need to do the above?
 ap_piped_log_write_fd(pl) should have the correct fd already because
 in ap_open_logs we just
 duplicate the write side of the pipe to stderr and in all other cases
 (not main server error log)
 ap_piped_log_write_fd is just fine anyways.
 
 It *is* confusing.
 
 Step 1:
 ---
 
 open_error_log() calls
piped_log_open() calls
   apr_file_pipe_create_ex()
 
 We get 2 FDs, say FD9 (read) and FD10 (write).
 
 Step 2:
 ---
 
piped_log_open() calls
   piped_log_spawn() calls
  apr_procattr_child_in_set()
 
 
 We have 2 additional FDs, say FD11 and FD12, FD11 pointing to the same
 end of the pipe as FD9 and FD12 as FD10.
 
 Step 3:
 ---
 
   piped_log_spawn() calls
  apr_proc_create()
 
 This closes FD11.
 
 Step 4:
 ---
 
   piped_log_spawn() calls
  apr_file_close(procnew-in)
 
 This closes FD12.
 
 Now we have a separate process and the parent only has 2 additional FDs,
 which can be used to communicate to the external process and to recreate
 it.
 
 The next steps do only happen for the main error logger:
 
 Step 5:
 ---
 
 open_error_log() calls
apr_file_dup2(stderr_log, s_main-error_log, stderr_p)
 
 s_main-error_log contains FD10, so the call lets FD2 (stderr_log) point
 to the same pipe as FD10. FD9 and FD10 remain unchanged.
 
 Step 6:
 ---
 
 open_error_log() calls
apr_file_close(s_main-error_log)

The call to apr_file_close happens in ap_open_logs not in open_error_log
and ap_open_logs is IMHO not called by piped_log_maintenance.

 
 FD10 gets closed, we only have FD2 and FD9 remaining,
 ap_piped_log_write_fd(pl) is gone now!
 
 So if we want to keep this behaviour, we need to recreate the write side
 of the pipe from stderr in maintenance before spawning again (in the
 main error logger case), and close it after spawning.
 
 I tried using FD2 directly as the write side without duping it, but that
 doesn't work.

Hm. F10 should still be open and untouched.

Regards

Rüdiger



Re: [Patch] Reliable error logs for trunk

2009-01-18 Thread Rainer Jung

On 18.01.2009 16:37, Ruediger Pluem wrote:


On 01/18/2009 03:52 PM, Rainer Jung wrote:

Hi Rüdiger,

first thanks for reviewing.

On 17.01.2009 18:02, Ruediger Pluem wrote:

On 01/16/2009 12:21 AM, Rainer Jung wrote:


This difference goes back to pre 1.3 httpd.

The patch at

http://people.apache.org/~rjung/patches/reliable_error_log.patch

Just some quick comments below

Index: server/log.c
===
--- server/log.c(revision 734457)
+++ server/log.c(working copy)



@@ -953,13 +992,26 @@
   ap_log_error(APLOG_MARK, APLOG_STARTUP, 0, NULL,
piped log program '%s' failed unexpectedly,
pl-program);
-if ((stats = piped_log_spawn(pl)) != APR_SUCCESS) {
+
+/* If special_stderr is non-zero, we need to get the write
+ * end of the pipe back from stderr before spawning. */
+if ((plw-special_stderr   (stats =
apr_file_dup(ap_piped_log_write_fd(pl),
+
stderr_log, pl-p))
+!= APR_SUCCESS) ||
+   ((stats = piped_log_spawn(plw)) != APR_SUCCESS) ||
+/* If special_stderr is non-zero, we need to close the write
+ * end of the pipe after spawning, because we write to it
via
+ * stderr. */
+   (plw-special_stderr   (stats =
apr_file_close(ap_piped_log_write_fd(pl)))
+!= APR_SUCCESS)) {

I know that this is always a difficult and confusing part, so maybe it
just confused me again,
but why do we need to do the above?
ap_piped_log_write_fd(pl) should have the correct fd already because
in ap_open_logs we just
duplicate the write side of the pipe to stderr and in all other cases
(not main server error log)
ap_piped_log_write_fd is just fine anyways.

It *is* confusing.

Step 1:
---

open_error_log() calls
piped_log_open() calls
   apr_file_pipe_create_ex()

We get 2 FDs, say FD9 (read) and FD10 (write).

Step 2:
---

piped_log_open() calls
   piped_log_spawn() calls
  apr_procattr_child_in_set()


We have 2 additional FDs, say FD11 and FD12, FD11 pointing to the same
end of the pipe as FD9 and FD12 as FD10.

Step 3:
---

   piped_log_spawn() calls
  apr_proc_create()

This closes FD11.

Step 4:
---

   piped_log_spawn() calls
  apr_file_close(procnew-in)

This closes FD12.

Now we have a separate process and the parent only has 2 additional FDs,
which can be used to communicate to the external process and to recreate
it.

The next steps do only happen for the main error logger:

Step 5:
---

open_error_log() calls
apr_file_dup2(stderr_log, s_main-error_log, stderr_p)

s_main-error_log contains FD10, so the call lets FD2 (stderr_log) point
to the same pipe as FD10. FD9 and FD10 remain unchanged.

Step 6:
---

open_error_log() calls
apr_file_close(s_main-error_log)


The call to apr_file_close happens in ap_open_logs not in open_error_log
and ap_open_logs is IMHO not called by piped_log_maintenance.


Yes, sorry, it is ap_open_logs.

But nevertheless:

During startup or after restart we go through ap_open_logs() and thus 
after startup or restart, the write side of the original pl is closed 
and the pipe is written to via the duped FD in stderr.


More precisely ap_open_logs() calls open_error_log(), which returns 
ap_piped_log_write_fd(pl) in s-error_log, and after return 
ap_open_logs() calls apr_file_close(s_main-error_log) is s==s_main.


When killing the logger, piped_log_maintenance() gets called. It can now 
not simply use pl, because the write side has been closed before in

ap_open_logs().

I tried to keep the startup behaviour during restart, so I dup stderr to 
the write side in the maintenance function, spawn and close the write 
side again.



FD10 gets closed, we only have FD2 and FD9 remaining,
ap_piped_log_write_fd(pl) is gone now!

So if we want to keep this behaviour, we need to recreate the write side
of the pipe from stderr in maintenance before spawning again (in the
main error logger case), and close it after spawning.

I tried using FD2 directly as the write side without duping it, but that
doesn't work.


Hm. F10 should still be open and untouched.


Just to make sure we are talking about the same. This gets only closed 
for the main error logger. For all other error loggers and the access 
loggers, there's no such complication.


Regards,

Rainer


Re: [Patch] Reliable error logs for trunk

2009-01-18 Thread Rainer Jung

Oups, need to clarify.

On 18.01.2009 16:48, Rainer Jung wrote:

During startup or after restart we go through ap_open_logs() and thus


restart - restart of httpd


after startup or restart, the write side of the original pl is closed
and the pipe is written to via the duped FD in stderr.

More precisely ap_open_logs() calls open_error_log(), which returns
ap_piped_log_write_fd(pl) in s-error_log, and after return
ap_open_logs() calls apr_file_close(s_main-error_log) is s==s_main.


is - if


When killing the logger, piped_log_maintenance() gets called. It can now
not simply use pl, because the write side has been closed before in
ap_open_logs().

I tried to keep the startup behaviour during restart, so I dup stderr to


restart - restart of a killed logger


the write side in the maintenance function, spawn and close the write
side again.


Re: [Patch] Reliable error logs for trunk

2009-01-18 Thread Ruediger Pluem


On 01/18/2009 04:48 PM, Rainer Jung wrote:
 On 18.01.2009 16:37, Ruediger Pluem wrote:



 Step 6:
 ---

 open_error_log() calls
 apr_file_close(s_main-error_log)

 The call to apr_file_close happens in ap_open_logs not in open_error_log
 and ap_open_logs is IMHO not called by piped_log_maintenance.
 
 Yes, sorry, it is ap_open_logs.
 
 But nevertheless:
 
 During startup or after restart we go through ap_open_logs() and thus
 after startup or restart, the write side of the original pl is closed
 and the pipe is written to via the duped FD in stderr.

Ahh. Now I got it. So your approach seems fine. Really very confusing :-).

 
 More precisely ap_open_logs() calls open_error_log(), which returns
 ap_piped_log_write_fd(pl) in s-error_log, and after return
 ap_open_logs() calls apr_file_close(s_main-error_log) is s==s_main.
 
 When killing the logger, piped_log_maintenance() gets called. It can now
 not simply use pl, because the write side has been closed before in
 ap_open_logs().
 
 I tried to keep the startup behaviour during restart, so I dup stderr to
 the write side in the maintenance function, spawn and close the write
 side again.
 
 FD10 gets closed, we only have FD2 and FD9 remaining,
 ap_piped_log_write_fd(pl) is gone now!

 So if we want to keep this behaviour, we need to recreate the write side
 of the pipe from stderr in maintenance before spawning again (in the
 main error logger case), and close it after spawning.

 I tried using FD2 directly as the write side without duping it, but that
 doesn't work.

 Hm. F10 should still be open and untouched.
 
 Just to make sure we are talking about the same. This gets only closed
 for the main error logger. For all other error loggers and the access
 loggers, there's no such complication.

Yes, this clearly was only valid for the main error log.

Regards

Rüdiger



Re: [Patch] Reliable error logs for trunk

2009-01-18 Thread Rainer Jung

  pl = apr_palloc(p, sizeof (*pl));
  pl-p = p;
+pl-program = (program == NULL) ? NULL : apr_pstrdup(p, program);
+pl-pid = NULL;

Why is this needed now? Or was this just missed previously and is not really
related to the reliable pipe usage of the error log?


This was borked. Those fields do not exist in the struct when 
AP_HAVE_RELIABLE_PIPED_LOGS is not defined. I removed it.


Re: [Patch] Reliable error logs for trunk

2009-01-17 Thread Ruediger Pluem


On 01/16/2009 12:21 AM, Rainer Jung wrote:

 
 This difference goes back to pre 1.3 httpd.
 
 The patch at
 
 http://people.apache.org/~rjung/patches/reliable_error_log.patch

Just some quick comments below

Index: server/log.c
===
--- server/log.c(revision 734457)
+++ server/log.c(working copy)

@@ -319,20 +333,28 @@
 int rc;

 if (*s-error_fname == '|') {
-apr_file_t *dummy = NULL;
+piped_log *pl;
+fname = ap_server_root_relative(p, s-error_fname + 1);

+if (!fname) {
+ap_log_error(APLOG_MARK, APLOG_STARTUP, APR_EBADPATH, NULL,
+ %s: Invalid error log path '%s'.,
+ ap_server_argv0, s-error_fname);
+return NULL;
+}
+
 /* Spawn a new child logger.  If this is the main server_rec,
  * the new child must use a dummy stderr since the current
  * stderr might be a pipe to the old logger.  Otherwise, the
  * child inherits the parents stderr. */
-rc = log_child(p, s-error_fname + 1, dummy, is_main);
-if (rc != APR_SUCCESS) {
-ap_log_error(APLOG_MARK, APLOG_STARTUP, rc, NULL,
- Couldn't start ErrorLog process);
+pl = piped_log_open(p, fname, is_main);
+if (pl == NULL) {
+ap_log_error(APLOG_MARK, APLOG_STARTUP, 0, NULL,
+ %s: Couldn't start ErrorLog process '%s'.,
+ ap_server_argv0, fname);
 return DONE;

Why not keeping the old way if

#ifndef AP_HAVE_RELIABLE_PIPED_LOGS?

 }
-
-s-error_log = dummy;
+s-error_log = ap_piped_log_write_fd(pl);
 }

 #ifdef HAVE_SYSLOG

@@ -953,13 +992,26 @@
 ap_log_error(APLOG_MARK, APLOG_STARTUP, 0, NULL,
  piped log program '%s' failed unexpectedly,
  pl-program);
-if ((stats = piped_log_spawn(pl)) != APR_SUCCESS) {
+
+/* If special_stderr is non-zero, we need to get the write
+ * end of the pipe back from stderr before spawning. */
+if ((plw-special_stderr  (stats = 
apr_file_dup(ap_piped_log_write_fd(pl),
+  stderr_log, 
pl-p))
+!= APR_SUCCESS) ||
+   ((stats = piped_log_spawn(plw)) != APR_SUCCESS) ||
+/* If special_stderr is non-zero, we need to close the write
+ * end of the pipe after spawning, because we write to it via
+ * stderr. */
+   (plw-special_stderr  (stats = 
apr_file_close(ap_piped_log_write_fd(pl)))
+!= APR_SUCCESS)) {

I know that this is always a difficult and confusing part, so maybe it just 
confused me again,
but why do we need to do the above?
ap_piped_log_write_fd(pl) should have the correct fd already because in 
ap_open_logs we just
duplicate the write side of the pipe to stderr and in all other cases (not main 
server error log)
ap_piped_log_write_fd is just fine anyways.


+
+char buf[120];
+
 /* what can we do?  This could be the error log we're having
  * problems opening up... */
-char buf[120];
 ap_log_error(APLOG_MARK, APLOG_STARTUP, 0, NULL,
  piped_log_maintenance: unable to respawn '%s': 
%s,
- pl-program, apr_strerror(stats, buf, 
sizeof(buf)));
+ pl-program, apr_strerror(stats, buf, sizeof(buf)));
 }
 }
 break;

@@ -1037,21 +1094,25 @@
 return APR_SUCCESS;
 }

-AP_DECLARE(piped_log *) ap_open_piped_log(apr_pool_t *p, const char *program)
+static piped_log* piped_log_open(apr_pool_t *p, const char *program,
+ int special_stderr)
 {
 piped_log *pl;
 apr_file_t *dummy = NULL;
 int rc;

-rc = log_child(p, program, dummy, 0);
+rc = log_child(p, program, dummy, special_stderr);
 if (rc != APR_SUCCESS) {
 ap_log_error(APLOG_MARK, APLOG_STARTUP, rc, NULL,
- Couldn't start piped log process);
+ Couldn't start piped log process for %s,
+ program == NULL ? 'NULL' : ' program ');
 return NULL;
 }

 pl = apr_palloc(p, sizeof (*pl));
 pl-p = p;
+pl-program = (program == NULL) ? NULL : apr_pstrdup(p, program);
+pl-pid = NULL;

Why is this needed now? Or was this just missed previously and is not really
related to the reliable pipe usage of the error log?

 ap_piped_log_read_fd(pl) = NULL;
 ap_piped_log_write_fd(pl) = dummy;
 apr_pool_cleanup_register(p, pl, piped_log_cleanup, piped_log_cleanup);

 
 should fix this. I tested it with access and error logs, with the main
 server and virtual hosts and with restarts and logger killing.
 
 The problem is the 

[Patch] Reliable error logs for trunk

2009-01-15 Thread Rainer Jung
httpd has a feature called reliable piped logs. When a log process like 
rotatelogs dies, httpd automatically restarts it. This can be used to 
close open log files from outside httpd by simply killing rotatelogs.


This procedure does not work for the error logs, because those are 
started with a different mechanism, even if they are configured with the 
same rotatelogs syntax.


This difference goes back to pre 1.3 httpd.

The patch at

http://people.apache.org/~rjung/patches/reliable_error_log.patch

should fix this. I tested it with access and error logs, with the main 
server and virtual hosts and with restarts and logger killing.


The problem is the delicate handling of file descriptors in the main 
error log case.


Should I apply this to trunk?

My tests are only on Solaris. I will also test on Linux, but it would be 
nice if someone could test on Windows too.


Some comments on the patch:

1) The piped_log_maintenance() function used as a callback when the 
logger dies needs one additional data, namely whether the died logger 
was the main error log. We could add this to the piped_log structure, 
but because that is public, I'm not sure whether we want to expose it 
like this. So I preferred to wrap piped_log into piped_log_wrapper locally.


2) I did not yet test the case where AP_HAVE_RELIABLE_PIPED_LOGS is not 
defined, which seems to be a very rare case.


3) To make the behaviour between CustomLog and ErrorLog more consistent 
I added ap_server_root_relative() for the logger path before spawning 
the logger. This is not necessary to achieve the reliable restarts.


Regards,

Rainer