[ilb-dev] writeup for health check open issues

[email protected] Wed, 04 Mar 2009 14:57:36 -0800

Michael Schuster wrote:
> On 03/04/09 14:22, Zhenghui.Xie at Sun.COM wrote:
>> Hi, All
>>
>> Several days ago, I sent out an email 
>> (http://mail.opensolaris.org/pipermail/ilb-dev/2009-February/000116.html) 
>> to resolve the remaining open issues in health check design. After 
>> reading the feedback from Peter and discussing with iteam people, 
>> here is a writeup that proposes some solutions.
>>
>> 1. health check probes:
>> ilb will provide three built-in health check probes: ping, tcp and udp.
>> 1a) For ping, use /usr/sbin/ping (ping(1M)) to send an ICMP echo request
>> to the host(s).
>> 1b) For tcp, ilb delivers /usr/lib/ilb/tcp_query_probe, which checks
>> whether opening a socket to the destination/port succeeds.
>> 1c) For udp, ilb by default just pings the destination (unless somebody
>> can suggest a good approach for sending a udp probe). If the user wants
>> more specific tests, the user needs to provide their own test script.
>>
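For illustration, the kind of check described in 1b) can be sketched in a few lines of Python. This is not the actual /usr/lib/ilb/tcp_query_probe; the function name tcp_probe and the 2-second timeout are assumptions made for the example.

```python
import socket

def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper; the name and default timeout are assumptions.
    """
    try:
        # create_connection resolves the address and performs the connect;
        # the socket is closed as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A caller would use it as tcp_probe("1.1.1.1", 80) and treat a False result as a failed probe.
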
>> 2. show health check results:
>> ilbadm list-servergroup will show the health of each server. Sample
>> output of "ilbadm list-servergroup" will look like:
>> ------------------- sample --------------------------------------
>> #ilbadm list-servergroup
>> SG-NAME   RULE-NAME   SERVER-IP   PORT   SERVER-STATE   FLAGS
>> sg-123    rule-abc    1.1.1.1     80     alive          xxxx
>> sg-123    rule-abc    1.1.1.2     80     dead           xxxx
>> sg-123    rule-abc    1.1.1.3     80     disabled       xxxx
>> sg-456    rule-def    2.1.1.1     21     alive          xxxx
>> sg-456    rule-def    2.1.1.2     21     alive          xxxx
>> ---------------- end of sample ---------------------------------
>>
>> list-servergroup provides "-v" to output more detailed server health 
>> information, sample output:
>> -----------------sample------------------------------------------
>> SG-NAME: sg-123
>> RULE-NAME: rule-abc
>> SERVER-IP   PORT   STATE      METHOD   LAST-RESULT   LAST-TEST-TIME   NEXT-TEST-TIME
>> 1.1.1.1     80     alive      TCP      3/5*          11:34:30         11:39:30
>> 1.1.1.2     80     dead       TCP      0/5           11:34:31         11:39:31
>> 1.1.1.3     80     disabled   TCP      0/0**         0:0:0            0:0:0
>> ----------------- end of sample
>> * this means 3 out of 5 probes succeeded. The server will only be 
>> declared as dead when all probes fail.
>> ** no hc on disabled server.
>>
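The rule in the footnotes (a server is dead only when all probes fail, and disabled servers get no hc) can be written down as a small Python sketch. The function name state_from_result is hypothetical, and treating an enabled server with zero probes sent as alive is an assumption; the writeup does not cover that case.

```python
def state_from_result(last_result, enabled=True):
    """Map a LAST-RESULT string such as "3/5" to a server state.

    Hypothetical helper for illustration only.
    """
    if not enabled:
        # No health check is run on a disabled server.
        return "disabled"
    # Drop footnote markers such as "*" before parsing "succeeded/sent".
    succeeded, sent = (int(n) for n in last_result.rstrip("*").split("/"))
    if sent > 0 and succeeded == 0:
        # Dead only when every probe failed.
        return "dead"
    # Assumption: anything else, including 0 probes sent, counts as alive.
    return "alive"
```

With the sample above, "3/5*" maps to alive and "0/5" maps to dead.
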
>> 3.   does one rule need more than one health check?
>> For phase 1, we just deliver one health check per rule. It will work
>> as follows:
>> if the user didn't specify any hc when creating the rule, no health
>> check will be performed.
>> if the user specifies "ping" as the hc method, ping will be performed.
>> The server will be declared dead if the ping fails.
>> if the user specifies "tcp" or "udp", or provides their own test script,
>> a default ping is performed first. If the ping fails, nothing further is
>> tried and the server is declared dead. If the ping succeeds, the tcp
>> probe or user test script is tried next, and the server is declared dead
>> if that test fails.
>>
>> "create-healthcheck" provides a '-n' option to turn off the default 
>> ping probe, allowing the user to say "thank you, but don't perform the
>> default ping for me".
>>
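The decision flow in section 3, including the effect of '-n', can be summarized in a short Python sketch. The function check_server, its parameters, and the use of callables for the probes are assumptions made for illustration; they are not part of ilb.

```python
def check_server(method, ping, probe=None, skip_default_ping=False):
    """Return "alive", "dead", or None when no health check is configured.

    method: None, "ping", or any other method name (e.g. "tcp", a script).
    ping, probe: callables returning True on success (hypothetical interface).
    skip_default_ping: models the '-n' option of create-healthcheck.
    """
    if method is None:
        # No hc specified on the rule: nothing is performed.
        return None
    if method == "ping":
        return "alive" if ping() else "dead"
    # For other methods, a default ping comes first unless it was turned off.
    if not skip_default_ping and not ping():
        return "dead"  # ping failed, no further try
    return "alive" if probe() else "dead"
```

For a tcp rule whose server does not answer ping, check_server returns dead without calling the tcp probe; with skip_default_ping=True only the tcp probe decides.
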
>> Any suggestions, objections, rotten tomatoes/fresh eggs? Please send 
>> your comments, if any, by end of this week.
>
> a comment on the suggested display: in the code I have ready for 
> putback, I added server IDs to the output.


sure.

> In the interest of screen real-estate, maybe we could do away with 
> "state" and encode that in the flags as well, e.g. "*" for alive, "+" 
> for dead, "-" for disabled, etc.
>
Sounds good. We probably need one line at the bottom to explain 
which symbol means what.
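
As a rough illustration of the symbol encoding and the legend line, here is a minimal Python sketch. The symbols are the ones suggested above; the names STATE_FLAGS and legend, and the exact legend layout, are assumptions.

```python
# Symbols as suggested in the thread; the final set may differ.
STATE_FLAGS = {"alive": "*", "dead": "+", "disabled": "-"}

def legend():
    """Build the one-line explanation printed below the table."""
    return "  ".join(f"{sym} = {state}" for state, sym in STATE_FLAGS.items())
```

Printing legend() after the last table row gives the reader the key without spending a full column on the state.
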

-Jan
