[devel] [PATCH 0 of 3] Review Request for clm: harden processing of node & agent down events during failover [#1120]

2014-09-24 Thread mathi . naickan
Summary: clm: harden processing node down events and agents during failover [#1120] Review request for Trac Ticket(s): #1120 Peer Reviewer(s): <> Pull request to: <> Affected branch(es): 4.3 and above Development branch: <> Impacted area Impact y/n -

[devel] [PATCH 1 of 3] clm: avoid stale node down processing and unexpected track callback [#1120]

2014-09-24 Thread mathi . naickan
osaf/services/saf/clmsv/clms/clms_cb.h | 6 + osaf/services/saf/clmsv/clms/clms_evt.c | 35 - 2 files changed, 40 insertions(+), 1 deletions(-) There is a possibility that the checkpointing message for a NODE_DOWN reaches the STANDBY first, i.e. before the

[devel] [PATCH 2 of 3] clm: during failover, process agent down before node downs [#1120]

2014-09-24 Thread mathi . naickan
osaf/services/saf/clmsv/clms/clms_evt.c | 16 +++- 1 files changed, 15 insertions(+), 1 deletions(-) It is quite possible that the agent downs are for the agents that were running on the same node that went down. So, process agent downs first, before processing node downs. diff --
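The ordering described in this patch (agent downs handled before node downs) can be illustrated with a small, self-contained sketch. This is not the code from the patch: `down_kind`, `down_evt`, and `order_agent_downs_first` are hypothetical names, and the pending events are assumed to sit in a plain array rather than in whatever structure clms_evt.c actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds for illustration; not the real CLMS types. */
enum down_kind { AGENT_DOWN = 0, NODE_DOWN = 1 };

struct down_evt {
	enum down_kind kind;
	unsigned node_id;	/* node the agent ran on, or the node that went down */
};

/*
 * Stable partition: move every AGENT_DOWN ahead of every NODE_DOWN while
 * keeping the relative order inside each group. Returns the number of
 * agent-down events, i.e. the index of the first node-down event.
 */
size_t order_agent_downs_first(struct down_evt *evts, size_t n)
{
	size_t agents = 0;

	assert(evts != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (evts[i].kind != AGENT_DOWN)
			continue;

		struct down_evt tmp = evts[i];

		/* Entries in [agents, i) are all node downs; shift them right by one. */
		for (size_t j = i; j > agents; j--)
			evts[j] = evts[j - 1];

		evts[agents++] = tmp;
	}
	return agents;
}
```

A caller would then walk the array front to back, so that agents which ran on a failed node are cleaned up before the node-down event for that node is handled.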

[devel] [PATCH 3 of 3] clm: do not send track for the node that left the cluster because of reboot [#1120]

2014-09-24 Thread mathi . naickan
osaf/services/saf/clmsv/clms/clms_imm.c | 25 + 1 files changed, 21 insertions(+), 4 deletions(-) It is possible that a payload that goes down during controller failover can reboot and come back fast. As a part of failover processing, it is possible that the agent

Re: [devel] [PATCH 3 of 3] clm: do not send track for the node that left the cluster because of reboot [#1120]

2014-09-24 Thread Mathivanan Naickan Palanivelu
Typo. Read the line below: > for these nodes reach the new ACTIVE as > for these nodes do not reach the new ACTIVE Mathi. - mathi.naic...@oracle.com wrote: > osaf/services/saf/clmsv/clms/clms_imm.c | 25 > + > 1 files changed, 21 insertions(+), 4 deletions(-) > > >

[devel] [PATCH 1 of 1] amfd: instantiate mw sus when node is joining [#1133]

2014-09-24 Thread nagendra . k
osaf/services/saf/amf/amfd/su.cc | 16 +++- 1 files changed, 11 insertions(+), 5 deletions(-) When mw su is in locked-in state and opensaf is started, amfnd hangs. When mw su is unlocked-in, amfnd still doesn't instantiate the mw su. Amfd doesn't send instantiate message to amfnd b

[devel] [PATCH 0 of 1] Review Request for amfd: instantiate mw sus when node is joining [#1133]

2014-09-24 Thread nagendra . k
Summary: amfd: instantiate mw sus when node is joining [#1133] Review request for Trac Ticket(s): #1133 Peer Reviewer(s): Hans F, Hans N, Praveen Pull request to: Affected branch(es): 4.5, default Development branch: default Impacted area Impact y/n

[devel] [PATCH 0 of 1] Review Request for IMM: Failure to send completed to PBE defaulted to ccb-recovery [#1127]

2014-09-24 Thread Anders Bjornerstedt
Summary: IMM: Failure to send completed to PBE defaulted to ccb-recovery [#1127] Review request for Trac Ticket(s): 1127 Peer Reviewer(s): Neel Pull request to: Affected branch(es): 4.3: 4.4; 4.5; default(4.6) Development branch: Impacted area Impact y/n --

[devel] [PATCH 1 of 1] IMM: Failure to send completed to PBE defaulted to ccb-recovery [#1127]

2014-09-24 Thread Anders Bjornerstedt
osaf/services/saf/immsv/immnd/immnd_evt.c | 63 -- 1 files changed, 42 insertions(+), 21 deletions(-) The fix is mainly in immnd_evt_proc_ccb_apply, but also some cleanup in immnd_evt_proc_ccb_compl_rsp where part of the same problem was addressed by the earlier tick

Re: [devel] [PATCH 1 of 1] amfd: instantiate mw sus when node is joining [#1133]

2014-09-24 Thread praveen malviya
Ack code review only. Thanks, Praveen On 24-Sep-14 3:36 PM, nagendr...@oracle.com wrote: > osaf/services/saf/amf/amfd/su.cc | 16 +++- > 1 files changed, 11 insertions(+), 5 deletions(-) > > > When mw su is in locked-in state and opensaf is started, amfnd hangs. > When mw su is un

Re: [devel] [PATCH 3 of 3] clm: do not send track for the node that left the cluster because of reboot [#1120]

2014-09-24 Thread ramesh betham
I think the following condition should not be added inside clms_send_track(). if ((node_id != node->node_id) && (node->admin_op == 0)){ Better not to change common area code, because the above condition may not work when admin operations are in progress. The decision to not send track should b

Re: [devel] [PATCH 1 of 3] clm: avoid stale node down processing and unexpected track callback [#1120]

2014-09-24 Thread Hans Nordebäck
Ack, (tested the "pre-patch")/Regards HansN On 09/24/14 21:26, mathi.naic...@oracle.com wrote: > osaf/services/saf/clmsv/clms/clms_cb.h | 6 + > osaf/services/saf/clmsv/clms/clms_evt.c | 35 > - > 2 files changed, 40 insertions(+), 1 deletions(-) > > >

Re: [devel] [PATCH 2 of 3] clm: during failover, process agent down before node downs [#1120]

2014-09-24 Thread Hans Nordebäck
Ack, (tested the "pre-patch")/Regards HansN On 09/24/14 21:26, mathi.naic...@oracle.com wrote: > osaf/services/saf/clmsv/clms/clms_evt.c | 16 +++- > 1 files changed, 15 insertions(+), 1 deletions(-) > > > It is quite possible that the agent downs are for the agents that were > ru

Re: [devel] [PATCH 3 of 3] clm: do not send track for the node that left the cluster because of reboot [#1120]

2014-09-24 Thread Hans Nordebäck
Ack, (tested the "pre-patch")/Regards HansN On 09/24/14 21:26, mathi.naic...@oracle.com wrote: > osaf/services/saf/clmsv/clms/clms_imm.c | 25 + > 1 files changed, 21 insertions(+), 4 deletions(-) > > > It is possible that when a payload that goes down during controller

Re: [devel] [PATCH 1 of 1] IMM: Failure to send completed to PBE defaulted to ccb-recovery [#1127]

2014-09-24 Thread Neelakanta Reddy
Hi AndersBj, Reviewed and tested the patch. Ack when pushed with below inline comments: /Neel On Wednesday 24 September 2014 04:47 PM, Anders Bjornerstedt wrote: > osaf/services/saf/immsv/immnd/immnd_evt.c | 63 > -- > 1 files changed, 42 insertions(+), 21 deletio

Re: [devel] [PATCH 1 of 1] IMM: Failure to send completed to PBE defaulted to ccb-recovery [#1127]

2014-09-24 Thread Anders Björnerstedt
Thanks Neel for spotting this. Will fix before pushing. /AndersBj -Original Message- From: Neelakanta Reddy [mailto:reddy.neelaka...@oracle.com] Sent: den 24 september 2014 14:38 To: Anders Björnerstedt Cc: opensaf-devel@lists.sourceforge.net Subject: Re: [PATCH 1 of 1] IMM: Failure to

[devel] Fwd: Re: Fwd: Re: [opensaf:tickets] Re: #1114 NTF: Unadapted LongDns consumer crashes due to read/subsribe long dn notification

2014-09-24 Thread praveen malviya
Forwarding to devel list. Original Message Subject: Re: Fwd: Re: [opensaf:tickets] Re: #1114 NTF: Unadapted LongDns consumer crashes due to read/subsribe long dn notification Date: Thu, 25 Sep 2014 11:34:12 +0530 From: praveen malviya Organization: Oracle Corporation To: minhc

Re: [devel] [PATCH 3 of 3] clm: do not send track for the node that left the cluster because of reboot [#1120]

2014-09-24 Thread Mathivanan Naickan Palanivelu
Okay, I am not relying on the admin_op variable; instead I introduced an extra argument to differentiate whether the reason is node_reboot or others. See new check below (updated patch 3 of 3 attached) +void clms_send_track(CLMS_CB * cb, CLMS_CLUSTER_NODE * node, SaClmChangeStepT step, bool no
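The idea of letting the caller state the reason through an extra boolean argument, rather than inferring it from admin_op inside the common function, might look roughly like the sketch below. This is only one reading of the discussion, not the updated patch: `struct client`, `send_track_sketch`, and the `node_reboot` parameter name are assumptions made for illustration, and the real clms_send_track() signature is truncated above and is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tracking client; not the real CLMS client record. */
struct client {
	unsigned node_id;	/* node the client (agent) resides on */
	int callbacks;		/* number of track callbacks delivered so far */
};

/*
 * Deliver a track callback to every client, except that when the caller says
 * the change was caused by a node reboot, clients located on the changed
 * node itself are skipped. Returns the number of callbacks delivered.
 */
size_t send_track_sketch(struct client *clients, size_t n,
			 unsigned changed_node_id, bool node_reboot)
{
	size_t sent = 0;

	assert(clients != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* The caller decides the reason; no admin-operation state is consulted. */
		if (node_reboot && clients[i].node_id == changed_node_id)
			continue;

		clients[i].callbacks++;
		sent++;
	}
	return sent;
}
```

Passing the reason explicitly keeps the common send path free of admin-operation state, which is the concern raised earlier in the thread.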