Hello - I'm on vacation right now. I'll be back on May 19th and will take a look at all these issues then.
On Fri, May 8, 2009 at 12:27 PM, Yoshihiko SATO <[email protected]> wrote:
> Hi, Serge.
>
> Sorry for not replying to you sooner.
>
> I see three problems with the STONITH operation in a Xen environment,
> and I would like to hear your opinion.
>
> [PROBLEM 1: fence operation timeout]
> Consider the case where two or more xm commands are executed in parallel,
> for example when STONITH fails locally and stonithd then tries to execute
> the STONITH operation remotely.
>
> First of all: when two or more xm commands are executed at the same time,
> the later one waits until the earlier one is done. This seems to be a
> property of the xm command itself. (Even "xm list", executed while a
> dump-core is running, has to wait until the dump is completed.)
>
> Here is the time from the start of the first "xm dump-core" to the
> finish of the last one, all dumping the same Domain-U with 1 GB of
> memory. The number on the left is the number of "xm dump-core" commands
> running in parallel:
>
> 1 -> 6.557s
> 2 -> 11.955s
> 3 -> 17.573s
> 4 -> 23.249s
>
> So when one Domain-U is STONITH'ed from two or more other Domain-Us,
> fencing is very slow to finish, and that is the usual case when several
> Domain-Us exist in a cluster. In addition, when the load on the server
> is high, the dump takes even longer, and the STONITH operation may time
> out.
>
> I know the timeout can be set with the "stonith-timeout" parameter, but
> I think the value is too difficult for users to decide, because there
> are too many factors to consider: the size of the Domain-U's memory, the
> CPU power of each Domain-0, the load when STONITH is executed, the
> number of Domain-Us that might execute STONITH against one Domain-U, the
> number of Domain-Us that might execute STONITH at the same time against
> two or more Domain-Us on one Domain-0, and so on.
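[Editor's note: the timings above scale roughly linearly with the number of parallel dumps, about 6-7 seconds per 1 GB dump. A minimal sketch of how an administrator might estimate a worst-case stonith-timeout from such measurements follows; the function name and the example numbers are illustrative only, not from any particular host.]

```shell
#!/bin/sh
# Rough stonith-timeout estimate: parallel "xm dump-core" calls serialize,
# so the worst case is (number of concurrent dumps) x (single dump time),
# plus a safety margin for load spikes. All inputs are illustrative.

estimate_stonith_timeout() {
    n_parallel=$1     # worst-case number of concurrent dump-core calls
    per_dump_secs=$2  # measured time for one dump (~6-7 s per 1 GB above)
    margin_secs=$3    # safety margin for high load
    echo $(( n_parallel * per_dump_secs + margin_secs ))
}

# Example: 4 DomUs may fence the same target at once, 7 s per dump,
# 30 s margin.
estimate_stonith_timeout 4 7 30   # -> 58
```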
>
> Getting a Domain-U's dump is of course an important function for
> failure analysis, so it is necessary; but if xen0 waits until
> "xm dump-core" is completed, failover takes a long time to complete,
> which is not good for users.
>
> So I am considering a design in which xen0 does not wait until the dump
> is over. How about executing "xm dump-core" and "xm create" in the
> background? For example:
>
> [ex.1]
> $SSH_COMMAND $dom0 "(xm dump-core -C ${kill_node}; xm create ${kill_node})" &
>
> Then check whether STONITH succeeded or failed with SSH_COMMAND's return
> code: when SSH_COMMAND succeeds, xen0 considers "STONITH is completed";
> when it fails, xen0 considers "STONITH is failed".
>
> With this modification, xen0 does not wait for the dump to complete or
> for the Domain-U to restart, so the STONITH operation finishes earlier.
> On the other hand, it then cannot check whether the Domain-U is dead or
> alive with ping or "xm list"...
>
> [PROBLEM 2: interruption during fencing]
> The internal process of "xm dump-core" is:
>
> pause Domain-U -> get dump -> unpause Domain-U
>
> The fencing process in xen0 consists of "xm dump-core" followed by
> "xm destroy", doesn't it? Since xm commands wait for each other instead
> of running in parallel, another xm command executed while the fencing
> process is still in "xm dump-core" (before its "xm destroy" has run)
> breaks into the fencing process. If that other xm command takes a long
> time, as dump-core does, the Domain-U runs again (because it is
> unpaused) between "xm dump-core" and "xm destroy", and then some
> resources on that node become active again!
>
> So I think using "xm dump-core -C" is a better way.
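[Editor's note: a sketch of the [ex.1] idea wrapped in a function. One detail worth noting: with the `&` on the local side as in [ex.1], `$?` does not reflect SSH's exit code; one way to get both background execution and a checkable exit status is to background the commands on the Dom0 side instead. SSH_COMMAND, the host and domain names, and the function name are placeholders, not the plugin's actual code.]

```shell
#!/bin/sh
# Sketch of the proposed fencing flow: start "xm dump-core -C" plus
# "xm create" in the background ON DOM0, so ssh returns as soon as the
# commands have been launched, and treat ssh's exit status as the STONITH
# result instead of waiting for the dump to finish.
# SSH_COMMAND, dom0 and kill_node are placeholders.

fence_domu() {
    dom0=$1
    kill_node=$2
    # Redirect output so ssh does not block on the remote background job.
    $SSH_COMMAND "$dom0" \
        "(xm dump-core -C ${kill_node}; xm create ${kill_node}) >/dev/null 2>&1 &"
    if [ $? -eq 0 ]; then
        echo "STONITH is completed"   # ssh reached Dom0 and started the dump
        return 0
    else
        echo "STONITH is failed"      # ssh itself failed (e.g. Dom0 down)
        return 1
    fi
}
```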
> With the -C option, the internal process of "xm dump-core" is:
>
> pause Domain-U -> get dump -> destroy Domain-U
>
> So I think this is an effective fix for this problem. Of course, when
> the "run_dump" parameter is not set, just use "xm destroy" to stop the
> Domain-U.
>
> [PROBLEM 3: getting dumps redundantly]
> When two or more Domain-Us try to STONITH one Domain-U at the same
> time, for example when remote fencing is executed, Domain-0 gets two or
> more dump-core files. Each of them may be over 1 GB, depending on the
> Domain-U's memory size, so they consume Domain-0's disk space terribly
> and unnecessarily.
>
> So how about the following? Check whether a dump-core is already in
> progress for the target Domain-U. If it is, xen0 considers "STONITH is
> completed"; in other words, the later STONITH operation exits normally
> without doing anything. If not, it proceeds to the fencing process.
> To check this, I intend to use the ps command, something like:
>
> ps ax | grep "xm dump-core -C domain-U"
>
> When an administrator runs "xm dump-core (-C)" manually, all resources
> on that Domain-U are paused, and in the worst case the dump takes
> longer than deadtime and the Domain-U is STONITH'ed. In addition, with
> the -C option the Domain-U is destroyed after dumping. So executing
> "xm dump-core (-C)" manually is not a normal operation in cluster
> management, I think, and it should therefore be no problem to use the
> process's existence as the criterion.
>
> To summarize, how about adding these functions to xen0?
> (1) Execute "xm dump-core" (or "xm destroy") and "xm create" in one
>     SSH call, as in [ex.1].
> (2) Execute (1) as a background process, so we do not wait for the
>     dump to complete.
> (3) Check the return code of SSH to judge whether STONITH succeeded.
> (4) Use "xm dump-core -C" instead of "xm dump-core" + "xm destroy".
> (5) Check for an existing "xm dump-core -C" process, to avoid dumping
>     redundantly.
>
> Your comments and suggestions are really appreciated.
>
>> Hi, Serge.
>>
>> Sorry for not replying to you sooner.
>> I have tested your patch. It works as expected. Thanks.
>>
>> Incidentally, when the xen config file has spaces around "="
>> (for example, "name = domain-U"), it always passes through the check
>> processing. Adding TRIM processing would make it better.
>>
>> By the way, I am thinking about some other problems (parallel
>> execution, timeouts, etc.), so please wait a little longer.
>>
>> Regards,
>> Yoshihiko SATO.
>>
>>> Did it work for you? Shall we ask Dejan to commit this patch?
>>>
>>> On Wed, Apr 15, 2009 at 12:13 AM, Yoshihiko SATO
>>> <[email protected]> wrote:
>>>> Hello Serge,
>>>>
>>>> Thank you so much for your quick action! I'll test the patch.
>>>>
>>>> Regards,
>>>> Yoshihiko SATO.
>>>>
>>>>> Attached is a patch that checks that the DomU disappears from
>>>>> "xm list" on Dom0 after running destroy.
>>>>>
>>>>> On Mon, Apr 13, 2009 at 10:03 PM, Serge Dubrouski <[email protected]>
>>>>> wrote:
>>>>>> Hello -
>>>>>>
>>>>>> This makes sense and I'll think about how to implement it. Thanks
>>>>>> for the suggestion.
>>>>>>
>>>>>> 2009/4/13 Yoshihiko SATO <[email protected]>:
>>>>>>> Hi Serge,
>>>>>>>
>>>>>>> I am considering the case where two or more STONITH plugins are
>>>>>>> set in cib.xml, for example xen0 (the STONITH plugin for DomU)
>>>>>>> and ibmrsa-telnet (the one for Dom0). The purpose of such a setup
>>>>>>> is to STONITH Dom0 when xen0 fails to STONITH the DomU. I found
>>>>>>> the following problem with xen0's fence (off|reset) action.
>>>>>>>
>>>>>>> xen0 does not check the return code of "xm destroy". Instead, it
>>>>>>> checks whether the target DomU is dead or alive with ping in
>>>>>>> CheckIfDead(), right?
>>>>>>> However, ping receives no reply packets at all, not only when
>>>>>>> the DomU has been STONITH'ed normally, but also when a kernel
>>>>>>> panic or a kernel hang occurs on Dom0. So when the failure is on
>>>>>>> Dom0, xen0 mistakenly judges that the fence action succeeded, and
>>>>>>> a STONITH plugin that can STONITH Dom0 (like ibmrsa-telnet) is
>>>>>>> never executed.
>>>>>>>
>>>>>>> So I think xen0 should confirm whether "xm destroy" via ssh
>>>>>>> succeeded, and check with ping whether the target is dead only
>>>>>>> when that command succeeded. If "xm destroy" fails, xen0 should
>>>>>>> return "fence action is failed". What do you think? I would like
>>>>>>> to hear any opinions.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Yoshihiko SATO
>>>>>>> _______________________________________________________
>>>>>>> Linux-HA-Dev: [email protected]
>>>>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>>>>>>> Home Page: http://linux-ha.org/
>>>>>>
>>>>>> --
>>>>>> Serge Dubrouski.

--
Serge Dubrouski.
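[Editor's note: the fix proposed in the deepest-quoted message above can be sketched as follows. run_destroy and check_if_dead are hypothetical hooks standing in for the plugin's real ssh call and its CheckIfDead() routine; this is not the actual plugin code.]

```shell
#!/bin/sh
# Sketch of the proposed fix: fall back to the ping-based liveness check
# only when "xm destroy" itself reported success. If the ssh/xm call fails
# (e.g. Dom0 has paniced), report the fence as failed so a Dom0 fencing
# device such as ibmrsa-telnet can take over.

run_destroy() {
    # placeholder for: $SSH_COMMAND "$1" "xm destroy $2"
    return 0
}

check_if_dead() {
    # placeholder for the ping-based check done in CheckIfDead()
    return 0
}

fence_reset() {
    dom0=$1
    domu=$2
    if ! run_destroy "$dom0" "$domu"; then
        echo "fence action is failed"     # xm destroy itself failed
        return 1
    fi
    if check_if_dead "$domu"; then
        echo "fence action succeeded"     # destroy OK and DomU is gone
        return 0
    fi
    echo "fence action is failed"         # destroy OK but DomU still alive
    return 1
}
```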
