Hi Yoshihiko-san,

On Thu, Dec 24, 2009 at 10:23:41AM +0900, Yoshihiko SATO wrote:
> Hi,
>
>> The idea is nice, but what we actually want is a "crm node
>> clean-down-confirmation XXX" command, that clears the CIB accordingly.
>
> Will this option be added in the future?
> I think I should solve it another way until this option is added
> to crm.
> My approach is to add a wait option to meatclient.
> I understand that meatware and meatclient will become unnecessary once the
> option is added to crm...
>
> I attach a patch that incorporates Dejan's advice.

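For anyone reading along: as far as I can tell, the -w behaviour in the
patch quoted below comes down to retrying a non-blocking open of the FIFO
until a reader shows up. Here is a minimal standalone sketch of that idea,
not the code as applied. wait_for_fifo_reader(), max_tries and delay_s are
names made up for this illustration; meatclient.c has no such helper. It
relies on open(2) failing with ENOENT while the FIFO does not exist yet and
with ENXIO while nobody has it open for reading.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only (not part of meatclient.c).
 *
 * Repeatedly tries to open `path` for writing without blocking.
 * ENOENT (no FIFO yet) and ENXIO (FIFO exists, no reader yet) are
 * treated as "not ready": print a dot, sleep delay_s seconds, retry.
 * Any other error is returned to the caller immediately.
 *
 * max_tries <= 0 means retry without limit.
 * Returns the open descriptor, or -1 with errno set to the last error.
 */
int
wait_for_fifo_reader(const char *path, int max_tries, unsigned int delay_s)
{
	int waited = 0;		/* becomes 1 once a dot has been printed */
	int tries = 0;
	int saved_errno;
	int fd;

	assert(path != NULL);

	for (;;) {
		fd = open(path, O_WRONLY | O_NONBLOCK);
		if (fd >= 0) {
			if (waited) {
				printf("\n");
			}
			return fd;
		}
		saved_errno = errno;	/* printf() below may clobber errno */
		if (saved_errno != ENOENT && saved_errno != ENXIO) {
			break;		/* hard error, do not retry */
		}
		if (max_tries > 0 && ++tries >= max_tries) {
			break;		/* gave up waiting */
		}
		printf(".");
		fflush(stdout);
		waited = 1;
		sleep(delay_s);
	}

	if (waited) {
		printf("\n");
	}
	errno = saved_errno;
	return -1;
}
```

Calling it as wait_for_fifo_reader(meatpipe, 0, 1) would poll once a second
with no limit, which is roughly what the loop in the patch does. The retry
cap is an addition of the sketch, so that a caller can give up instead of
waiting forever.
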
Many thanks for the patch. Just applied it, slightly modified.

Dejan

>
> Regards,
> Yoshihiko SATO
>
> (2009/11/19 20:09), Lars Marowsky-Bree wrote:
>> On 2009-11-16T18:58:12, Dejan Muhamedagic<[email protected]> wrote:
>>
>>> To have this handled by CRM, crmd would have to cancel the
>>> currently running stonith action, i.e. send the appropriate
>>> message to stonithd.
>>
>> Sure. This also happens if the node eventually reboots and rejoins
>> cleanly, too, anyway.
>>
>>> If handled by stonithd, it would have to cancel the running stonith
>>> action and send the OK status to crmd. It basically boils down to the
>>> same, but the former needs extra support in crmd. Don't know how much
>>> effort that would take and if it would make the code more complex.
>>
>> This might work too, yes - basically faking a stonithd "success".
>> However, how do you send this if there's no currently pending STONITH,
>> because stonith has just timed out and been handed back to the
>> transitioner?
>>
>> Sending the fake ack to crmd might be the right way.
>>
>>> BTW, this would obviate the need for this patch, but not for the
>>> meatware plugin, since the CRM would complain if there are no
>>> stonith resources in the configuration with fencing enabled.
>>
>> What I really dislike about the meatware plugin is that it needs to be
>> run on a specific node. It'd be much better if it could run on any
>> node.
>>
>> And there should always be some stonith resource defined. If the manual
>> override existed, they could simply define external/ssh even (no need
>> for meatware), which would at least allow error recovery for, say, stop
>> failures.
>>
>>
>> Regards,
>> Lars
>>
>
>
> diff -r 2668d74b4060 lib/stonith/meatclient.c
> --- a/lib/stonith/meatclient.c	Tue Dec 22 19:12:54 2009 +0100
> +++ b/lib/stonith/meatclient.c	Thu Dec 24 09:38:08 2009 +0900
> @@ -37,14 +37,14 @@
>  #include <stonith/stonith.h>
>  #include <glib.h>
>
> -#define OPTIONS "c:"
> +#define OPTIONS "c:w"
>
>  void usage(const char * cmd);
>
>  void
>  usage(const char * cmd)
>  {
> -	fprintf(stderr, "usage: %s [-c node]\n", cmd);
> +	fprintf(stderr, "usage: %s -c node [-w]\n", cmd);
>  	exit(S_INVAL);
>  }
>
> @@ -60,7 +60,7 @@
>  	char * opthost = NULL;
>  	int clearhost = 0;
>
> -	int c, argcount;
> +	int c, argcount, waitmode;
>  	int errors = 0;
>
>  	if ((cmdname = strrchr(argv[0], '/')) == NULL) {
> @@ -74,12 +74,14 @@
>  		case 'c': opthost = optarg;
>  			++clearhost;
>  			break;
> +		case 'w': ++waitmode;
> +			break;
>  		default: ++errors;
>  			break;
>  		}
>  	}
>  	argcount = argc - optind;
> -	if (!(argcount == 0)) {
> +	if (!(argcount == 0) || !opthost) {
>  		errors++;
>  	}
>
> @@ -99,29 +101,51 @@
>
>  	snprintf(meatpipe, 256, "%s.%s", meatpipe_pr, opthost);
>
> -	fd = open(meatpipe, O_WRONLY | O_NONBLOCK);
> +	if (waitmode) {
> +		gboolean waited=FALSE;
> +		while (1) {
> +			fd = open(meatpipe, O_WRONLY | O_NONBLOCK);
> +			if (fd < 0) {
> +				if (errno != ENOENT && errno != ENXIO) {
> +					if (waited) printf("\n");
> +					snprintf(line, sizeof(line)
> +					, "Meatware_IPC failed: %s", meatpipe);
> +					perror(line);
> +					exit(S_BADHOST);
> +				}
> +				printf("."); fflush(stdout); waited=TRUE;
> +				sleep(1);
> +				continue;
> +			}
> +			if (waited) printf("\n");
> +			break;
> +		}
>
> -	if (fd < 0) {
> -		snprintf(line, sizeof(line)
> -		, "Meatware_IPC failed: %s", meatpipe);
> -		perror(line);
> -		exit(S_BADHOST);
> -	}
> +	} else {
> +		fd = open(meatpipe, O_WRONLY | O_NONBLOCK);
>
> -	printf("\nWARNING!\n\n"
> -		"If node \"%s\" has not been manually power-cycled or "
> -		"disconnected from all shared resources and networks, "
> -		"data on shared disks may become corrupted and "
> -		"migrated services might not work as expected.\n"
> -		"Please verify that the name or address above "
> -		"corresponds to the node you just rebooted.\n\n"
> -		"PROCEED? [yN] ", opthost);
> +		if (fd < 0) {
> +			snprintf(line, sizeof(line)
> +			, "Meatware_IPC failed: %s", meatpipe);
> +			perror(line);
> +			exit(S_BADHOST);
> +		}
>
> -	rc = scanf("%s", resp);
> +		printf("\nWARNING!\n\n"
> +			"If node \"%s\" has not been manually power-cycled or "
> +			"disconnected from all shared resources and networks, "
> +			"data on shared disks may become corrupted and "
> +			"migrated services might not work as expected.\n"
> +			"Please verify that the name or address above "
> +			"corresponds to the node you just rebooted.\n\n"
> +			"PROCEED? [yN] ", opthost);
>
> -	if (rc == 0 || rc == EOF || tolower(resp[0] != 'y')) {
> -		printf("Meatware_client: operation canceled.\n");
> -		exit(S_INVAL);
> +		rc = scanf("%s", resp);
> +
> +		if (rc == 0 || rc == EOF || tolower(resp[0] != 'y')) {
> +			printf("Meatware_client: operation canceled.\n");
> +			exit(S_INVAL);
> +		}
>  	}
>
>  	sprintf(line, "meatware reply %s", opthost);
>
> _______________________________________________________
> Linux-HA-Dev: [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
