Re: [PATCH v6 00/22] Agent Check Enhancements and External Check

Simon Horman Thu, 31 Oct 2013 01:03:20 -0700

Hi Willy,

sorry for not responding earlier.


I will likely have more questions regarding your review
but at this point one in particular comes to mind:
are you still interested in removing lb-agent-chk?
If so I think it is a reasonable idea as it should avoid
complexity both in this series and in the future.

On Fri, Sep 06, 2013 at 03:05:35PM +0200, Willy Tarreau wrote:
> Hi Simon,
> 
> I'm finally done with the review. That was quite long (spread over two
> weeks), and obviously less than what it might have taken you to do this
> work. Next time if you can send smaller chunks it might help a lot !
> 
> Due to the difficulty to assign enough continuous time on it, I used
> "git notes" to add notes to the commits, which allowed me to go back
> and forth in the review. So I have not commented inline the code but
> I have so few comments about the code by itself that it should not be
> an issue at all, I hope you don't mind. Otherwise ask me for details
> and I'll happily go into depth.
> 
> I found one concern at the end about the external check that's probably
> worth a separate discussion thread. I'm suspecting that we should probably
> get the 20 first patches merged ASAP and discuss the last two after that.
> 
> I'm appending the whole review here. This is a git log --reverse which
> shows my comments below the Notes: tags.
> 
> Thanks very much for this work, I hope we can sort out all remaining
> points to get it merged into 1.5-dev20.
> 
> Best regards,
> Willy
> 
> ---
> 
> commit 08eff21897f3f5d62f1360f309c5a71da6146afa
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sat Feb 23 08:34:13 2013 +0900
> 
>     DOC: Clarify documentation of option lb-agent-chk
>     
>     Avoid referring to check-port as this is not a configuration parameter.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
> 
> Notes:
>     OK
> 
> commit bff65d578c2e4212dfcbfa14ba549581ca42bc1d
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sun Feb 24 07:45:56 2013 +0900
> 
>     CLEANUP: Make parameters of srv_downtime and srv_getinter const
>     
>     The parameters of srv_downtime and srv_getinter are not modified
>     and thus may be const.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
> 
> Notes:
>     OK
> 
> commit 9386a28b61cb7fb011ef2e710616b972f7bfe4e3
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sat Feb 23 10:16:43 2013 +0900
> 
>     MEDIUM: Split up struct server's check element
>     
>     This is in preparation for associating a agent check
>     with a server which runs as well as the server's existing check.
>     
>     The split has been made by:
>     * Moving elements of struct server's check element that will
>       be shared by both checks into a new check_common element
>       of struct server.
>     * Moving the remaining elements to a new struct check and
>       making struct server's check element a struct check.
>     * Adding a server element to struct check, a back-pointer
>       to the server element it is a member of.
>       - At this time the server could be obtained using
>         container_of, however, this will not be so easy
>         once a second struct check element is added to struct server
>         to accommodate an agent health check.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
>     
>     ---
>     
>     v6
>     * Correct use of srv->check_common.xprt ssl_sock_prepare_srv_ctx().
>       Previously srv->check.xprt was used which causes a build failure
>       when compiled with make USE_OPENSSL=1 ...
>       Thanks to Scott McKeown for discovering this.
>     
>     v2 - v5
>     * No change
> 
> Notes:
>     OK.
> 
> commit ed27bc9796c5d0c06e1dc3dad4774510b3f1277a
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sun Feb 24 09:09:03 2013 +0900
> 
>     MEDIUM: Move {,fast,down}inter to struct check
>     
>     Move {,fast,down}inter elements from struct server to struct check.
>     This allows those elements of a check to be independent of the check's 
> server.
>     
>     This is in preparation for associating a agent check
>     with a server which runs as well as the server's existing check.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
> 
> Notes:
>     OK though could probably be merged with previous one
> 
> commit 368f15d3ad805edbbc9070e708ce5b99624a8607
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sun Feb 24 07:25:29 2013 +0900
> 
>     MEDIUM: Move result element to struct check
>     
>     Move result element from struct server to struct check
>     This allows check results to be independent of the check's server.
>     
>     This is in preparation for associating a agent check
>     with a server which runs as well as the server's existing check.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
> 
> Notes:
>     OK.
> 
> commit f0d5a3c54696b4abd284b363adfb616cad0837ff
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sat Feb 23 15:35:38 2013 +0900
> 
>     MEDIUM: Paramatise functions over the check of a server
>     
>     Paramatise the following functions over the check of a server
>     
>     * set_server_down
>     * set_server_up
>     * srv_getinter
>     * server_status_printf
>     * set_server_check_status
>     * set_server_disabled
>     
>     Generally the server parameter of these functions has been removed.
>     Where it is still needed it is obtained using check->server.
>     
>     This is in preparation for associating a agent check
>     with a server which runs as well as the server's existing check.
>     By paramatising these functions they may act on each of the checks
>     without further significant modification.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
>     
>     ---
>     
>     This is a rather large patch and I would be happy to split it up
>     on request.
>     
>     v4
>     * Use check->type in process_check()
>     
>       Use check->type instead of s->proxy->options2 & PR_O2_CHK_ANY
>       as the former is specific to the check being processed whereas
>       the latter relates only to the primary check of a server.
>     
>     * Add type to struct check
>     
>       This is used to indicate the type of a check independent of
>       its server's proxy's check type.
>     
>     v2 - v3
>     * No Change
> 
> Notes:
>     - still need to understand the SSP_O_HCHK -> check change
>     - set_server_disabled() : takes a check arg, converts it to server then
>       uses &s->check, seems suspicious or at least confusing (why not simply 
> check ?)
>     
>     - also :
>                     for(srv = s->tracknext; srv; srv = srv->tracknext)
>     -                       set_server_disabled(srv);
>     +                       set_server_disabled(&s->check);
>     
>            isn't it &srv->check instead ?
>     
>     - it's nice to have a check being the owner of a connection instead of the
>       server, looks much cleaner than what we used to have.
>     - it should probably be made it more obvious what case is covered by 
> !check->type
>       (if relevant at this point)
> 
> commit 8c7409c8a6ddb4f3629fd877935355e0902ebfa5
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sat Feb 23 15:14:19 2013 +0900
> 
>     MEDIUM: cfgparse: Factor out check initialisation
>     
>     This is in preparation for struct server having two elements
>     of type struct check.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
>     
>     ---
>     
>     v5
>     * Remove server argument from init_check. It is not used.
>     * Set type in init_check
>     
>       This allows a zero type to be used to indicate that
>       a task for a secondary agent check should not be stated.
> 
> Notes:
>     OK
> 
> commit 73f8958f7c982aa0d376ab5b6d4baeb0892a61c2
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sun Feb 24 17:23:38 2013 +0900
> 
>     MEDIUM: Add state to struct check
>     
>     Add state to struct check. This is currently used to store one bit,
>     CHK_RUNNING, which is set if a check is running and clear otherwise.
>     This bit was previously SRV_CHK_RUNNING of the state element of struct
>     server.
>     
>     This is in preparation for associating a agent check
>     with a server which runs as well as the server's existing check.
>     
>     Signed-off-by: Simon Horman <horms+rene...@verge.net.au>
> 
> Notes:
>     Please use CHK_STATE_RUNNING instead of CHK_RUNNING so that it's
>     easier to know it is related to the state attribute of the check.
>     
>     Also, we have many HCHK_* and now CHK_* as well, so I wonder how/when
>     this will become confusing.
> 
> commit 8fe10dc6d602286db0270a0d1d27a3fa2907792e
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sun Feb 24 21:50:46 2013 +0900
> 
>     MEDIUM: Add name element to struct check
>     
>     This is in preparation for associating a agent check
>     with a server which runs as well as the server's existing check.
>     
>     Signed-off-by: Simon Horman <horms+rene...@verge.net.au>
>     
>     ---
>     
>     v5
>     * Rebase
> 
> Notes:
>     Is it worth adding a name field just to report it in the stats ? Don't
>     we have everything required to display the proper name when dumping
>     the stats ? (maybe not).
> 
> commit 59d4d5ae53c2225a6001be9d526de6623dba0abd
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sun Feb 24 17:23:38 2013 +0900
> 
>     MEDIUM: Move health element to struct check
>     
>     This is in preparation for associating a agent check
>     with a server which runs as well as the server's existing check.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
> 
> Notes:
>     OK
> 
> commit 4f7700d5da3c22444f16716d105e013ef69ad0b0
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sun Feb 24 20:50:33 2013 +0900
> 
>     MEDIUM: Add helper for task creation for checks
>     
>     This helper is in preparation for adding a second struct check element
>     to struct server.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
>     
>     ---
>     
>     v5
>     * Rebase
> 
> Notes:
>     Cosmetic, but the cfgparse change just changes the previous patch, so it 
> could
>     have been merged there.
>     
>     Please change the "polarity" of the start_check_task() function or make it
>     very clear that it returns an error code, or change the check below :
>     
>     -          if (start_check_task(&s->check, mininter, nbcheck, srvpos++))
>     +          if (start_check_task(&s->check, mininter, nbcheck, srvpos++) < 
> 0)
>                        return -1
>     
>     The problem is I'm reading it as "if we could start a check task, return 
> -1".
>     And yes, I know that all the kernel's networking code you're dealing with
>     every day works this way so this can be confusing.
>     
>     Last cosmetic point, please keep srvpos++ out of the condition (I thought
>     we lost it). I tend to discourage people from doing variable assignments
>     in conditions because these are the places that are commented out the 
> first
>     during debugging sessions, causing the accidental loss of the assignment.
>     
>     Otherwise OK.
> 
> commit cf63f97ef6fa2ef584caab595c06738262f3ac80
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sat Feb 23 15:10:54 2013 +0900
> 
>     MEDIUM: checks: Add supplementary agent checks
>     
>     Allow an agent check to be run in conjunction with one other server
>     health check.
>     
>     If the backend for a server check is not lb-agent-chk then an agent
>     check may also be run using the agent-check parameter to a server,
>     which sets the TCP port to be used for the agent check.
>     
>     e.g.
>     server    web1_1 127.0.0.1:80 check agent-port 10000
>     
>     The agent-inter parameter may also be used to specify the interval
>     and timeout for agent checks.
>     
>     If either the health or agent check determines that a server is down
>     then it is marked as being down, otherwise it is marked as being up.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
>     
>     ---
>     
>     v5
>     * Rebase for removal of server argument from init_check
>     * Rebase for setting of type in init_check
>     
>     v4
>     *  Increment global.maxsock for agent-port.
>     
>        If agent-port is configured then an extra socket is required.
>     
>     * Do not send requests to secondary agent checks
>     
>       The request configuration of a proxy relates to the primary health
>       check and should not be sent to the secondary health check if it
>       is in operation.
>     
>     * Correct usage of PR_O2_LB_AGENT_CHK
>     
>       The correct way to check for PR_O2_LB_AGENT_CHK is
>       not to use x & PR_O2_LB_AGENT_CHK, but rather to use
>       (x & PR_O2_CHK_ANY ) == PR_O2_LB_AGENT_CHK.
>     
>     v2 - v3
>     * No change
> 
> Notes:
>     The commit message confused me but fortunately the doc fixed me :-)
>     
>     I'm having an issue with using agent-port to automatically enable the
>     check. We've done this mistake with the stats in the past. Specifying
>     any stats setting enables the stats. The end result is that nobody
>     sets the stats URL or passwords in a defaults section.
>     
>     Here we'll get the same issue. On large deployments, it's very likely
>     that users will want to have "default-server agent-port 10000" in their
>     defaults section, but then they will have no way to disable it for some
>     servers.
>     
>     Thus I'd rather stick to the same logic we have for current health checks
>     which consists in having the port being an independant and harmless 
> setting,
>     and have an "agent-check" keyword on the server lines to enable it where
>     relevant.
>     
>     Please do not forget to mention the compatibility between agent-port and
>     default-server in the doc.
>     
>     Also, just a stupid question, we've introduced option lb-agent-chk very
>     recently and we're still in development. Don't you think it would be
>     useful to remove it now before a release if agent-check can cover all
>     its needs ? It would avoid having to deal with complex setups in the
>     future. CCing the lb.org guys here who most likely use it already (and
>     might probably plan a config migration).
> 
> commit 8abb1b04fcb258a6004d37fe2dd2436cb121007a
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sun Mar 3 16:46:40 2013 +0900
> 
>     MEDIUM: Add helper for agent check events.
>     
>     Break agent check handling out of event_srv_chk_r().
>     
>     This is in preparation for supporting agent check results
>     returned in an HTTP header.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
> 
> Notes:
>     OK
> 
> commit 77950728605ced6093a829b08ffc5b7d5ecc0b6d
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sun Mar 3 16:22:45 2013 +0900
> 
>     MEDIUM: Parser to allow matching of HTTP header
>     
>     Replace the current header parser, which simply skips the headers,
>     with a version that allows matching of a key in a header.
>     
>     This is in preparation for supporting agent check results
>     returned in an HTTP header.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
> 
> Notes:
>     I don't understand the parser :-( Some comments on the inputs/outputs of 
> the
>     function are probably needed since they're clearly not obvious.
>     
>     Also I fear that it does not support partial responses anymore, but I may
>     be wrong (ie: wait for more data to come when the response is not 
> complete).
>     
>     Anyway, this change must be tagged as MAJOR since it will definitely cause
>     regressions in obscure places : I'm seeing the parser is stricter than it
>     was, and people who implement script-based health checks tend to emit 
> awful
>     HTTP-like responses which certainly don't qualify as valid but which were
>     accepted. So whatever we can do to optimise compatibility with previous
>     checks should be considered (especially since we're speaking about checks
>     and not production traffic).
> 
> commit 0b829e506657e0d18cc8ffd64808f8cb0e379168
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sun Mar 3 17:50:39 2013 +0900
> 
>     MEDIUM: Add http-check agent-hdr option
>     
>     Allow agent checks to obtain information in
>     an HTTP header of the response to an http check.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
>     
>     ---
>     
>     v4
>     * Do not duplicate code that is in process_result()
>     
>     v2 - v3
>     * No change
> 
> Notes:
>     OK in principle, but what does it do when agent-port is set ?
>     Is there any conflict or do we face a race between the two competing
>     checks ?
>     
>     In the doc :
>     
>      http-check agent-hdr
>        Use the value of the given http header as the result of an agent check
>        May be used in sections :   defaults | frontend | listen | backend
>                                      yes   |    no    |   yes  |   yes
>        Arguments :
>     
>          <header>  The header string to use to send the server name
>     
>     Please mention "<header>" on the syntax line itself :
>     
>      http-check agent-hdr <header>
> 
> commit 410090bf97c2553da33f799c6dc313fb91b13ead
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Wed Jun 5 18:06:07 2013 +0900
> 
>     MEDIUM: Add DRAIN state and report it on the stats page
>     
>     Add a DRAIN sub-state for a server which
>     will be shown on the stats page instead of UP if
>     an agent check is in use and the agent has most recently returned "drain".
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
> 
> Notes:
>     In stats_dump_sv_stats() we use 'state' to display a new state, but the
>     CSS is not updated, so we'll end up with any formating or colorset. I
>     think it would be good to use :
>       - the same color as "UP 2/3" (yellow) for "DRAIN 2/3"
>       - a light brown such as #CC9900 when the DRAIN is in effect
>     
>     Also please update the comments at the top of the function to reflect
>     the new input range.
> 
> commit 5a53bc5b4aebbfb499b96a5d3dadde514d885aaf
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Wed Jun 26 23:38:56 2013 +0900
> 
>     MEDIUM: Log agent fail, stopped or down as info
>     
>     In the case where an agent check returns fail, stopped or down,
>     log this as info when logging the server status along with any
>     trailing message returned by the agent after fail, stopped or down.
>     
>     Previously only the trailing message was logged as info and
>     if omitted no info was logged.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
> 
> Notes:
>     OK
> 
> commit 5ab56ee1510abfc79ec1f46657a49abb47da6824
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Wed Jun 5 18:05:05 2013 +0900
> 
>     MEDIUM: Do not mark a server as down if the agent is unavailable
>     
>     In the case where agent-port is used and the agent
>     check is a secondary check to not mark a server as down
>     if the agent becomes unavailable.
>     
>     In this configuration the agent should only cause a server to be marked
>     as down if the agent returns "fail", "stopped" or "down".
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
>     
>     ---
>     
>     v4
>     * Never update health of agent on failure to connect
>     
>       Previously the health was recharged on such cases.
>       But this causes the health to be recharged even
>       if the most recent result from the agent was "fail", "stopped", or 
> "down".
>     
>       The result was unnecessary logging of "fail", "stopped", or "down"
>       results from the agent if they were interleaved with timeouts.
>       Or in other words, if the agent responded only occasionally
>       but consistently with "fail", "stopped", or "down".
>     
>     v3
>     * No change
>     
>     v2
>     * Only log agent server status if the change in status
>       will affect the server's state
> 
> Notes:
>     While I absolutely agree with the goal described in the commit message,
>     I fail to connect it to what I'm seeing in the code. I don't know if it
>     is a result of the rework of the patches but for me all it does it apply
>     a second DRAIN state whose purpose I don't understand. I'm seeing that
>     we first set WILLDRAIN, then update the status, then add DRAIN on top of
>     that, that remains unclear to me. I'm not sure what situation the 4
>     combinations of these flags correspond to.
> 
> commit 3980cce19837da833cf0c5ced030060257b9df0c
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Thu Jun 27 21:15:04 2013 +0900
> 
>     MEDIUM: Set rise and fall of agent checks to 1
>     
>     This is achieved by moving rise and fall from struct server to struct 
> check.
>     
>     After this move the behaviour of the primary check, server->check is
>     unchanged. However, the secondary agent check, server->agent now has
>     independent rise and fall values each of which are set to 1.
>     
>     The result is that receiving "fail", "stopped" or "down" just once from 
> the
>     agent will mark the server as down. And receiving a weight just once will
>     allow the server to be marked up if its primary check is in good health.
>     
>     This opens up the scope to allow the rise and fall values of the agent
>     check to be configurable, however this has not been implemented at this
>     stage.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
>     
>     ---
>     
>     v7
>     When checking health against rise in set_server_up the health of a check
>     should be checked against the rise of the same check. Otherwise the 
> condition
>     for the s->agent may be false when it should be true as s->agent's rise 
> and
>     thus healthy health value is 1 whereas the default rise for check is 
> likely to
>     be 3, as check may be s->check and the the default value for 
> s->check.rise is 3.
>     
>     v5
>     * Rebase
> 
> Notes:
>     One remaining "server" below :
>     
>     -       int health;                             /* 0 to server->rise-1 = 
> bad;
>     -                                                * rise to 
> server->rise+server->fall-1 = good */
>     +       int health;                             /* 0 to rise-1 = bad;
>     +                                                * rise to 
> rise+server->fall-1 = good */
>     +       int rise, fall;                         /* time in iterations */
>     
>     Otherwise OK.
> 
> commit fa2d91c61ce78d24143a74508d1dc8c85fca53f1
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Sat Jun 22 14:36:39 2013 +0900
> 
>     MEDIUM: Add set agent pause|unpause unix socket command
>     
>     The syntax of this new command is:
>     
>     set agent <backend>/<server> pause|unpause
>     
>     This command changes the behaviour of agent checks as follows:
>     
>     In the case where an agent check is being run as a secondary check,
>     due to the agent-port parameter of a server directive, new checks
>     are only initialised when the agent is in the unpaused state. Thus,
>     setting the agent to pause will prevent any new agent checks from
>     begin initiated until the agent is set to unpaused.
>     
>     When set pause is in effect the processing of an agent check run as the
>     primary check, due to either option lb-agent-check or http-check
>     agent-hdr being set, or processing of a secondary agent check that was
>     initiated while the agent was set as unpaused is as follows: All results
>     that would alter the weight, specifically "drain" or a weight returned by
>     the agent, are ignored. The processing of agent check is otherwise
>     unchanged.
>     
>     The motivation for this option is to allow weight setting is to allow the
>     weight changing effects of the agent checks to be paused to allow the
>     weight of a server to be configured using set weight without being
>     overridden by the agent.
>     
>     The default state is unpaused.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
> 
> Notes:
>     I think that instead of setting the agent status to pause/unpaused, we'd
>     rather reuse the same principle as we currently have with servers :
>     
>          enable agent <backend>/<server>
>          disable agent <backend>/<server>
>     
>     OK for the rest of course.
> 
> commit e4cbb6e2d82ad7d037316ab0dc4932e16a3b4e14
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Tue Mar 13 14:02:40 2012 +0900
> 
>     MEDIUM: Break out check establishment into establish_chk()
>     
>     This is in preparation for adding a new type of check that
>     uses a process rather than a socket.
>     
>     Signed-off-by: Simon Horman <ho...@verge.net.au>
>     
>     ---
>     
>     v4
>     * Use check->type in establish_conn_chk()
>     
>       Use check->type instead of s->proxy->options2 & PR_O2_CHK_ANY
>       as the former is specific to the check being processed whereas
>       the latter relates only to the primary check of a server.
>     
>     v2 - v3
>     * No change
> 
> Notes:
>     I'd rather call it connect_chk() since we already use the verb
>     "connect" at other places for the same purpose.
> 
> commit 23bc14cb0389d6ecb525c76766181ef01938a6cf
> Author: Simon Horman <ho...@verge.net.au>
> Date:   Tue Mar 13 14:03:22 2012 +0900
> 
>     MEDIUM: Add external check
>     
>     Add an external check which makes use of an external process to
>     check the status of a server.
>     
>     ---
>     
>     v6
>     * Correct implementation and documentation of arguments to external-check
>       command so that they are consistent with both each other and 
> ldirectord's
>       external check. The motivation being to allow the same scripts to be 
> used
>       with both haproxy and ldirectord.
>     
>     v5
>     * Rebase
>     
>     v4
>     * Remove stray use of s->check in process_chk()
>       The check parameter should be used throughout process_chk()
>     * Layer 7 timeouts of agent checks should be ignored
>     * Ensure that argc is never used uninitialised in prepare_external_check()
>     
>     v3
>     * Rebase: basically a rewrite of large sections of the code
>     * Merge with the following patches
>       + "external-check: Actually execute command"
>       + "Allow selection of of external-check in configuration file"
>     
>     v2
>     * If the external command exits normally (WIFEXITED()) is true)
>       then set the check's code to the exit status (WEXITSTATUS())
>       of the process.
>     * Treat a timeout is a failure case rather than the test having passed
>     * Remove duplicate getnameinfo() call in start_checks()
>     * Remove duplicate assignment of sockaddr argument to getnameinfo(9
>       which caused the check port and check addr configuration of
>       a server to be ignored.
> 
> Notes:
>     Problems :
>       - don't use "option xxx <arg>" despite the old poor choice introduced
>         with "option httpchk". Options are meant to be booleans.
>       - we need some way to ensure that we can globally disable this feature
>         or better, that it must be explicitly enabled, as it introduces a
>         major security risk (APIs, ...)
>       - probably that we need to have a global prefix path setting for all
>         commands.
>       - anonymous union does not build with gcc-2.95 (still supported)
>       - please don't use stdbool (again), it is not portable and has already
>         broke twice in the past.
>       - getnameinfo is not portable either, better use str2sa() or equivalent
>         (and remove include netdb)
>       - I'm quite worried about the changes with has_conn in process_chk,
>         they're touching a very sensible place, so I think it would be
>         much better to have a distinct process_chk function for external
>         checks (we can even rename process_chk to process_chk_conn)
> 
> --------
>

Re: [PATCH v6 00/22] Agent Check Enhancements and External Check

Reply via email to