Sorry, it probably got rebased before I pushed it.
http://hg.clusterlabs.org/pacemaker/1.1/rev/dd8e37df3e96 should be the right link.
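
For readers following the quoted thread below, here is a rough, standalone sketch of the decision being discussed: whether a lost (timed-out) action gets recorded in the CIB as a failure. It is not the actual crmd code. The enum, the function names, and the `treat_stop_specially` flag are hypothetical, and it assumes the elided `"stop"` branch in the quoted snippet simply set `send_update = FALSE`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the real action types; illustrative only. */
enum sketch_action_type {
    SKETCH_ACTION_RSC,     /* a resource operation (start, stop, monitor, ...) */
    SKETCH_ACTION_PSEUDO,  /* anything that is not a resource operation */
};

/* NULL-safe string comparison, similar in spirit to safe_str_eq(). */
static bool sketch_str_eq(const char *a, const char *b)
{
    return a != NULL && b != NULL && strcmp(a, b) == 0;
}

/*
 * Decide whether a lost action should be written to the CIB as a timed-out
 * operation.
 *
 * treat_stop_specially == true  models the old behaviour as described in the
 *                               thread (ASSUMPTION: lost stops were skipped).
 * treat_stop_specially == false models the proposed behaviour, where a lost
 *                               stop is recorded like any other lost
 *                               resource operation.
 */
bool sketch_should_record_lost_action(enum sketch_action_type type,
                                      const char *task,
                                      bool treat_stop_specially)
{
    if (type != SKETCH_ACTION_RSC) {
        return false;               /* only resource operations are recorded */
    }
    if (sketch_str_eq(task, "cancel")) {
        return false;               /* cancels never need a CIB update */
    }
    if (treat_stop_specially && sketch_str_eq(task, "stop")) {
        return false;               /* old behaviour: ignore lost stops */
    }
    return true;                    /* record the operation as timed out */
}
```

With `treat_stop_specially` set to false, a lost `"stop"` on a resource returns true, so it would be recorded as a failed operation and normal stop-failure recovery could follow. With the flag set to true, the same call returns false, which corresponds to the situation in the original report.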

On Wed, Sep 29, 2010 at 2:51 AM, <renayama19661...@ybb.ne.jp> wrote:
> Hi Andrew,
>
>> Pushed as:
>> http://hg.clusterlabs.org/pacemaker/1.1/rev/8433015faf18
>>
>> Not sure about applying to 1.0 though, it's a dramatic change in behavior.
>
> The change of this link is not found.
> Where did you update it?
>
> Best Regards,
> Hideo Yamauchi.
>
> --- Andrew Beekhof <and...@beekhof.net> wrote:
>
>> Pushed as:
>> http://hg.clusterlabs.org/pacemaker/1.1/rev/8433015faf18
>>
>> Not sure about applying to 1.0 though, it's a dramatic change in behavior.
>>
>> On Wed, Sep 22, 2010 at 11:18 AM, <renayama19661...@ybb.ne.jp> wrote:
>> > Hi Andrew,
>> >
>> > Thank you for comment.
>> >
>> >> A long time ago in a galaxy far away, some messaging layers used to
>> >> lose quite a few actions, including stops.
>> >> About the same time, we decided that fencing because a stop action was
>> >> lost wasn't a good idea.
>> >>
>> >> The rationale was that if the operation eventually completed, it would
>> >> end up in the CIB anyway.
>> >> And even if it didn't, the PE would continue to try the operation
>> >> again until the whole node fell over at which point it would get shot
>> >> anyway.
>> >
>> > Sorry...
>> > I did not know the fact that there was such an argument in old days.
>> >
>> >> Now, having said that, things have improved since then and perhaps,
>> >> in the interest of speeding up recovery in these situations, it is time
>> >> to stop treating stop operations differently.
>> >> Would you agree?
>> >
>> > That means, you change it in the case of "Action Lost" of the stop this
>> > time to carry out stonith?
>> > If my recognition is right, I agree too.
>> >
>> > if(timer->action->type != action_type_rsc) {
>> >     send_update = FALSE;
>> > } else if(safe_str_eq(task, "cancel")) {
>> >     /* we dont need to update the CIB with these */
>> >     send_update = FALSE;
>> > }
>> > ---> delete "else if(safe_str_eq(task, "stop")){..}" ?
>> >
>> > if(send_update) {
>> >     /* cib_action_update(timer->action, LRM_OP_PENDING, EXECRA_STATUS_UNKNOWN); */
>> >     cib_action_update(timer->action, LRM_OP_TIMEOUT, EXECRA_UNKNOWN_ERROR);
>> > }
>> >
>> > Best Regards,
>> > Hideo Yamauchi.
>> >
>> > --- Andrew Beekhof <and...@beekhof.net> wrote:
>> >
>> >> On Tue, Sep 21, 2010 at 8:59 AM, <renayama19661...@ybb.ne.jp> wrote:
>> >> > Hi,
>> >> >
>> >> > Node was in state that the load was very high, and we confirmed monitor
>> >> > movement of Pacemaker.
>> >> > Action Lost occurred in stop movement after the error of the monitor
>> >> > occurred.
>> >> >
>> >> > Sep  8 20:02:22 cgl54 crmd: [3507]: ERROR: print_elem: Aborting transition, action lost: [Action 9]: In-flight (id: prmApPostgreSQLDB1_stop_0, loc: cgl49, priority: 0)
>> >> > Sep  8 20:02:22 cgl54 crmd: [3507]: info: abort_transition_graph: action_timer_callback:486 - Triggered transition abort (complete=0) : Action lost
>> >> >
>> >> > For the load of the node, We think that the stop movement did not go
>> >> > well.
>> >> > But cannot nodes execute stonith.
>> >>
>> >> A long time ago in a galaxy far away, some messaging layers used to
>> >> lose quite a few actions, including stops.
>> >> About the same time, we decided that fencing because a stop action was
>> >> lost wasn't a good idea.
>> >>
>> >> The rationale was that if the operation eventually completed, it would
>> >> end up in the CIB anyway.
>> >> And even if it didn't, the PE would continue to try the operation
>> >> again until the whole node fell over at which point it would get shot
>> >> anyway.
>> >>
>> >> Now, having said that, things have improved since then and perhaps,
>> >> in the interest of speeding up recovery in these situations, it is time
>> >> to stop treating stop operations differently.
>> >> Would you agree?

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker