Re: [Patch] Reliable error logs for trunk

Rainer Jung Sun, 18 Jan 2009 06:53:33 -0800

Hi Rüdiger,

first thanks for reviewing.


On 17.01.2009 18:02, Ruediger Pluem wrote:


On 01/16/2009 12:21 AM, Rainer Jung wrote:

This difference goes back to pre 1.3 httpd.

The patch at

http://people.apache.org/~rjung/patches/reliable_error_log.patch


Just some quick comments below

Index: server/log.c
===================================================================
--- server/log.c        (revision 734457)
+++ server/log.c        (working copy)

@@ -319,20 +333,28 @@
      int rc;

      if (*s->error_fname == '|') {
-        apr_file_t *dummy = NULL;
+        piped_log *pl;
+        fname = ap_server_root_relative(p, s->error_fname + 1);

+        if (!fname) {
+            ap_log_error(APLOG_MARK, APLOG_STARTUP, APR_EBADPATH, NULL,
+                         "%s: Invalid error log path '%s'.",
+                         ap_server_argv0, s->error_fname);
+            return NULL;
+        }
+
          /* Spawn a new child logger.  If this is the main server_rec,
           * the new child must use a dummy stderr since the current
           * stderr might be a pipe to the old logger.  Otherwise, the
           * child inherits the parents stderr. */
-        rc = log_child(p, s->error_fname + 1,&dummy, is_main);
-        if (rc != APR_SUCCESS) {
-            ap_log_error(APLOG_MARK, APLOG_STARTUP, rc, NULL,
-                         "Couldn't start ErrorLog process");
+        pl = piped_log_open(p, fname, is_main);
+        if (pl == NULL) {
+            ap_log_error(APLOG_MARK, APLOG_STARTUP, 0, NULL,
+                         "%s: Couldn't start ErrorLog process '%s'.",
+                         ap_server_argv0, fname);
              return DONE;

Why not keeping the old way if

#ifndef AP_HAVE_RELIABLE_PIPED_LOGS?

I still need to investigate, what actually happens in the non-reliablecase. But yes, for lowest risk, we could keep calling log_child() inthat case, until we understand it better. On the long term I have apreference to keep access and error logging in both cases consistent.

          }
-
-        s->error_log = dummy;
+        s->error_log = ap_piped_log_write_fd(pl);
      }

  #ifdef HAVE_SYSLOG

@@ -953,13 +992,26 @@
              ap_log_error(APLOG_MARK, APLOG_STARTUP, 0, NULL,
                           "piped log program '%s' failed unexpectedly",
                           pl->program);
-            if ((stats = piped_log_spawn(pl)) != APR_SUCCESS) {
+
+            /* If special_stderr is non-zero, we need to get the write
+             * end of the pipe back from stderr before spawning. */
+            if ((plw->special_stderr&&  (stats = 
apr_file_dup(&ap_piped_log_write_fd(pl),
+                                                              stderr_log, 
pl->p))
+                != APR_SUCCESS) ||
+               ((stats = piped_log_spawn(plw)) != APR_SUCCESS) ||
+            /* If special_stderr is non-zero, we need to close the write
+             * end of the pipe after spawning, because we write to it via
+             * stderr. */
+               (plw->special_stderr&&  (stats = 
apr_file_close(ap_piped_log_write_fd(pl)))
+                != APR_SUCCESS)) {

I know that this is always a difficult and confusing part, so maybe it just 
confused me again,
but why do we need to do the above?
ap_piped_log_write_fd(pl) should have the correct fd already because in 
ap_open_logs we just
duplicate the write side of the pipe to stderr and in all other cases (not main 
server error log)
ap_piped_log_write_fd is just fine anyways.


It *is* confusing.

Step 1:
-------

open_error_log() calls
   piped_log_open() calls
      apr_file_pipe_create_ex()

We get 2 FDs, say FD9 (read) and FD10 (write).

Step 2:
-------

   piped_log_open() calls
      piped_log_spawn() calls
         apr_procattr_child_in_set()

We have 2 additional FDs, say FD11 and FD12, FD11 pointing to the sameend of the pipe as FD9 and FD12 as FD10.


Step 3:
-------

      piped_log_spawn() calls
         apr_proc_create()

This closes FD11.

Step 4:
-------

      piped_log_spawn() calls
         apr_file_close(procnew->in)

This closes FD12.

Now we have a separate process and the parent only has 2 additional FDs,which can be used to communicate to the external process and to recreate it.


The next steps do only happen for the main error logger:

Step 5:
-------

open_error_log() calls
   apr_file_dup2(stderr_log, s_main->error_log, stderr_p)

s_main->error_log contains FD10, so the call lets FD2 (stderr_log) pointto the same pipe as FD10. FD9 and FD10 remain unchanged.


Step 6:
-------

open_error_log() calls
   apr_file_close(s_main->error_log)

FD10 gets closed, we only have FD2 and FD9 remaining,ap_piped_log_write_fd(pl) is gone now!

So if we want to keep this behaviour, we need to recreate the write sideof the pipe from stderr in maintenance before spawning again (in themain error logger case), and close it after spawning.

I tried using FD2 directly as the write side without duping it, but thatdoesn't work.

+
+                char buf[120];
+
                  /* what can we do?  This could be the error log we're having
                   * problems opening up... */
-                char buf[120];
                  ap_log_error(APLOG_MARK, APLOG_STARTUP, 0, NULL,
                               "piped_log_maintenance: unable to respawn '%s': 
%s",
-                             pl->program, apr_strerror(stats, buf, 
sizeof(buf)));
+                         pl->program, apr_strerror(stats, buf, sizeof(buf)));
              }
          }
          break;

@@ -1037,21 +1094,25 @@
      return APR_SUCCESS;
  }

-AP_DECLARE(piped_log *) ap_open_piped_log(apr_pool_t *p, const char *program)
+static piped_log* piped_log_open(apr_pool_t *p, const char *program,
+                                 int special_stderr)
  {
      piped_log *pl;
      apr_file_t *dummy = NULL;
      int rc;

-    rc = log_child(p, program,&dummy, 0);
+    rc = log_child(p, program,&dummy, special_stderr);
      if (rc != APR_SUCCESS) {
          ap_log_error(APLOG_MARK, APLOG_STARTUP, rc, NULL,
-                     "Couldn't start piped log process");
+                     "Couldn't start piped log process for %s",
+                     program == NULL ? "'NULL'" : "'" program "'");
          return NULL;
      }

      pl = apr_palloc(p, sizeof (*pl));
      pl->p = p;
+    pl->program = (program == NULL) ? NULL : apr_pstrdup(p, program);
+    pl->pid = NULL;

Why is this needed now? Or was this just missed previously and is not really
related to the reliable pipe usage of the error log?

Exactly, I thought it would be better to explicitely set the structurefields to defined values and not simply keeping them undefined, just incase we start using the structure in other places and assume the fieldsare filled.

      ap_piped_log_read_fd(pl) = NULL;
      ap_piped_log_write_fd(pl) = dummy;
      apr_pool_cleanup_register(p, pl, piped_log_cleanup, piped_log_cleanup);

should fix this. I tested it with access and error logs, with the main
server and virtual hosts and with restarts and logger killing.

The problem is the delicate handling of file descriptors in the main
error log case.

Should I apply this to trunk?


I guess soon because this make discussion easier.

My tests are only on Solaris. I will also test on Linux, but it would be
nice if someone could test on Windows too.

Some comments on the patch:

1) The piped_log_maintenance() function used as a callback when the
logger dies needs one additional data, namely whether the died logger
was the main error log. We could add this to the piped_log structure,


This is the part I don't get :-). Why do we need to know this?

As explained above, in the main error logger case the write side of thepipe gets closed in open_error_logs() and is used instead via its dupedcopy in FD2.


Regards,

Rainer

Re: [Patch] Reliable error logs for trunk

Reply via email to