On Thu, Aug 11, 2016 at 02:23:39PM +0200, Petr Cech wrote:
> On 08/09/2016 02:49 PM, Jakub Hrozek wrote:
> > On Tue, Aug 09, 2016 at 12:57:58PM +0200, Petr Cech wrote:
> > > On 08/09/2016 11:26 AM, Jakub Hrozek wrote:
> > > > On Mon, Aug 08, 2016 at 09:46:55AM +0200, Petr Cech wrote:
> > > > > > On 08/04/2016 05:01 PM, Petr Cech wrote:
> > > > > > > > On 08/04/2016 04:35 PM, Petr Cech wrote:
> > > > > > > > > > Hi list,
> > > > > > > > > > 
> > > > > > > > > > there is the first version of patch for [1]. I need
> > > > > > > > > > to investigate if we have the same issue in other
> > > > > > > > > > *_childs.
> > > > > > > > > > 
> > > > > > > > > > I tested it with sleep() added into ldap_child code.
> > > > > > > > > > 
> > > > > > > > > > [1] https://fedorahosted.org/sssd/ticket/3106
> > > > > > > > > > 
> > > > > > > > 
> > > > > > > > I know, it is not perfect. For example 2 seconds between 
> > > > > > > > SIGTERM and
> > > > > > > > SIGKILL should be constant. I will send the second version with 
> > > > > > > > other
> > > > > > > > childs, but not today.
> > > > > > > > 
> > > > > > > > Regards
> > > > > > 
> > > > > > Hello all,
> > > > > > 
> > > > > > there is fixed patch set. There is only one minor change -- 
> > > > > > constant for
> > > > > > time between signals.
> > > > > > 
> > > > > > Is there better name for global_ccname_file_dummy? I did not want 
> > > > > > to break
> > > > > > the naming... but it looks really long.
> > > > I think this name is fine, but I would prefer to not use the variable in
> > > > the code, except for setting it after the temporary file is created and
> > > > then set the variable back to NULL after the temporary file is renamed
> > > > to the real one.
> > > 
> > > Addressed.
> > > 
> > > Thanks Jakub, it looks better. :-)
> > 
> > OK, now I actually tested the patch and it regresses the way we handle
> > responses, because the handler exits the process with success error
> > code, the provider then tries to parse the child output, which is not
> > there..
> > 
> > I think the easiest way would be to define a return code of ldap_child
> > that would be used in exit in the signal handler and the back end code
> > would print a nice debug message telling the admin that the child timed
> > out.
> 
> Thanks Jakub,
> improved patch set is attached.
> 
> Regards
> 
> -- 
> Petr^4 Čech

> From f84d4d2fcc7c1f2cb404a02aeea7abd12c14dd61 Mon Sep 17 00:00:00 2001
> From: Petr Cech <pc...@redhat.com>
> Date: Fri, 22 Jul 2016 14:28:54 +0200
> Subject: [PATCH] LDAP: Fixing of removing netgroup from cache

This patch is from a different set.

> From 1f40587cbb256bab98e8e35670ca76426b41e6b1 Mon Sep 17 00:00:00 2001
> From: Petr Cech <pc...@redhat.com>
> Date: Thu, 4 Aug 2016 16:27:40 +0200
> Subject: [PATCH 2/3] LDAP: Adding SIGTERM signal before SIGKILL
> 
> We add better termination of ldap_child. If ldap_child reaches
> the timeout for termination parent sents SIGTERM signal. Child
> has 2 seconds for removing temporary file and exit.
> If it is not sufficient there is SIGKILL send to the child.
> 
> Resolves:
> https://fedorahosted.org/sssd/ticket/3106

[...]

> +static void get_tgt_timeout_handler(struct tevent_context *ev,
> +                                      struct tevent_timer *te,
> +                                      struct timeval tv, void *pvt)
> +{
> +    struct tevent_req *req = talloc_get_type(pvt, struct tevent_req);
> +    struct sdap_get_tgt_state *state = tevent_req_data(req,
> +                                            struct sdap_get_tgt_state);
> +    int ret;
> +
> +    DEBUG(SSSDBG_TRACE_ALL,
> +          "timeout for sending SIGTERM to tgt child [%d] reached.\n",
> +          state->child->pid);
> +
> +    ret = kill(state->child->pid, SIGTERM);
> +    if (ret == -1) {

Please use:
    ret = errno
    and then use ret instead of errno in the DEBUG statement to avoid
    overwriting errno if we ever change this branch in the future.

> +        DEBUG(SSSDBG_CRIT_FAILURE,
> +              "Sending SIGTERM failed [%d][%s].\n", errno, strerror(errno));
> +    }

> From 4ca2a6b90dd7d7ee792b0e144a4e6d432f4693a4 Mon Sep 17 00:00:00 2001
> From: Petr Cech <pc...@redhat.com>
> Date: Thu, 11 Aug 2016 14:18:15 +0200
> Subject: [PATCH 3/3] LDAP: Adding SIGCHLD callback
> 
> This patch adds SIGCHLD callback for ldap_child. So if timeout is
> reached and ldap_child is terminated by handler we have debug message
> about it.
> 
> Resolves:
> https://fedorahosted.org/sssd/ticket/3106

This patch does not apply for me, so I'll just comment on the diff.

> ---
>  src/providers/ldap/ldap_child.c         |  2 +-
>  src/providers/ldap/sdap_child_helpers.c | 10 +++++++++-
>  src/util/child_common.h                 |  1 +
>  3 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/src/providers/ldap/ldap_child.c b/src/providers/ldap/ldap_child.c
> index 
> 2edfc30ec7b4a3ff03d86ae41f086a6128451a9b..b6913fdaa4c8f87f33cc8c10f9a029d63626f279
>  100644
> --- a/src/providers/ldap/ldap_child.c
> +++ b/src/providers/ldap/ldap_child.c
> @@ -54,7 +54,7 @@ static void sig_term_handler(int sig)
>          }
>      }
>  
> -    _exit(0);
> +    _exit(SIGNAL_EXIT_CODE);

I would prefer CHILD_SIGTERM_EXIT_CODE or CHILD_TIMEOUT_EXIT_CODE
(prefix and change the name)


>  }
>  
>  static krb5_context krb5_error_ctx;
> diff --git a/src/providers/ldap/sdap_child_helpers.c 
> b/src/providers/ldap/sdap_child_helpers.c
> index 
> 9f9c9b57a7ca5e1217d56ab560cc941432084df8..98b0fe777fd18fad6ffdaefc48ea156a45f4e803
>  100644
> --- a/src/providers/ldap/sdap_child_helpers.c
> +++ b/src/providers/ldap/sdap_child_helpers.c
> @@ -69,6 +69,14 @@ static void sdap_close_fd(int *fd)
>      *fd = -1;
>  }
>  
> +void child_callback(int child_status, struct tevent_signal *sige, void *pvt)
> +{
> +    if (WEXITSTATUS(child_status) == SIGNAL_EXIT_CODE) {
> +        DEBUG(SSSDBG_CRIT_FAILURE,
> +              "LDAP child was terminated due to timeout\n");
> +    }
> +}
> +
>  static errno_t sdap_fork_child(struct tevent_context *ev,
>                                 struct sdap_child *child)
>  {
> @@ -110,7 +118,7 @@ static errno_t sdap_fork_child(struct tevent_context *ev,
>          sss_fd_nonblocking(child->io->read_from_child_fd);
>          sss_fd_nonblocking(child->io->write_to_child_fd);
>  
> -        ret = child_handler_setup(ev, pid, NULL, NULL, NULL);
> +        ret = child_handler_setup(ev, pid, child_callback, NULL, NULL);
>          if (ret != EOK) {
>              goto fail;
>          }
> diff --git a/src/util/child_common.h b/src/util/child_common.h
> index 
> eb7623987238d5d9e9032b9bbbbef49188f5b1d2..e88661349a1568ffc981e0a3134f7ccea61b394e
>  100644
> --- a/src/util/child_common.h
> +++ b/src/util/child_common.h
> @@ -36,6 +36,7 @@
>  #define CHILD_MSG_CHUNK     256
>  
>  #define SIGTERM_TO_SIGKILL_TIME 2
> +#define SIGNAL_EXIT_CODE EINTR

I'm not sure if we want to use errno constants here, I would say just
defining our own constant is better, at least to avoid overflowing the
maximum exit value of 255. I don't think we even have to worry about
clashing with constants from sysexits.h, this is our internal API anyway.
_______________________________________________
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org

Reply via email to