Hi Nagu,

 

Since logic like the following is used everywhere, I did not touch it; changing it is risky.

++(node->snd_msg_id);

avd_d2n_msg_snd()

Also, I am not sure that undoing the increment will leave the counter at 0 (0 is what this issue needs).

That is why I chose to reset it to 0 in exactly that scenario.
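
To illustrate the difference between the two options, here is a minimal sketch. It assumes a simplified stand-in for the node state; the struct and helper names are hypothetical and are not OpenSAF code. Undoing only steps the counter back by one, while resetting gives 0 no matter how many messages were sent before the reboot order.

```cpp
#include <cstdint>

// Hypothetical stand-in for the per-node state kept by amfd; only the two
// counters named in the patch are modelled here.
struct NodeCounters {
  std::uint32_t snd_msg_id = 0;
  std::uint32_t rcv_msg_id = 0;
};

// Mirrors the quoted pattern: increment the counter, then send.
// The send itself is omitted; the function returns the ID the message
// would carry.
inline std::uint32_t send_with_next_id(NodeCounters& node) {
  ++(node.snd_msg_id);
  return node.snd_msg_id;
}

// Option 1 (hypothetical helper): undo only the increment made for the
// reboot order. The counter ends at the previous value, which is 0 only
// if no earlier message had been sent.
inline void undo_last_send(NodeCounters& node) {
  if (node.snd_msg_id > 0) --(node.snd_msg_id);
}

// Option 2: reset both counters, as the patch does in the
// "not an AMF cluster member" branch.
inline void reset_msg_ids(NodeCounters& node) {
  node.rcv_msg_id = 0;
  node.snd_msg_id = 0;
}
```

With two earlier messages plus the reboot order, the undo leaves snd_msg_id at 2, whereas the reset leaves it at 0.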

 

By the way, patch v2 is coming, thanks to Minh’s comment.

Please continue to give your comments on the v2 mail thread.

Thank you.

 

Best Regards,

Thuan

 

From: nagen...@hasolutions.in <nagen...@hasolutions.in> 
Sent: Friday, July 6, 2018 4:36 PM
To: Tran Thuan <thuan.t...@dektech.com.au>; hans.nordeb...@ericsson.com; 
gary....@dektech.com.au; 'Minh Hon Chau' <minh.c...@dektech.com.au>
Cc: opensaf-devel@lists.sourceforge.net
Subject: RE: [PATCH 1/1] amf: amfd should reset msg_id counter to avoid message 
ID mismatch [#2891]

 

Resending as it was not delivered.

 

Thanks,

Nagendra, 91-9866424860

www.hasolutions.in <http://www.hasolutions.in> 

https://www.linkedin.com/company/hasolutions/

High Availability Solutions Pvt. Ltd.

- High Availability Solutions Provider

--------- Original Message --------- 

Subject: RE: [PATCH 1/1] amf: amfd should reset msg_id counter to avoid message 
ID mismatch [#2891]
From: nagen...@hasolutions.in
Date: 7/6/18 2:54 pm
To: "Tran Thuan" <thuan.t...@dektech.com.au>, hans.nordeb...@ericsson.com, 
gary....@dektech.com.au, "'Minh Hon Chau'" <minh.c...@dektech.com.au>
Cc: opensaf-devel@lists.sourceforge.net

Hi Thuan,

It looks like a benign fix, but I would prefer that you undo the counter 
when sending the reboot order itself, since the counter was incremented in the same function.

 

Thanks,

Nagendra, 91-9866424860

--------- Original Message --------- 

Subject: RE: [PATCH 1/1] amf: amfd should reset msg_id counter to avoid message 
ID mismatch [#2891]
From: "Tran Thuan" <thuan.t...@dektech.com.au>
Date: 7/6/18 1:05 pm
To: hans.nordeb...@ericsson.com, gary....@dektech.com.au, 
nagen...@hasolutions.in, "'Minh Hon Chau'" <minh.c...@dektech.com.au>
Cc: opensaf-devel@lists.sourceforge.net

+ Minh and Nagu

Best Regards,
Thuan (UFO – Unique FBI Opensaf)
CoreMW Maintenance, DEK VietNam

-----Original Message-----
From: thuan.tran <thuan.t...@dektech.com.au>
Sent: Friday, July 6, 2018 11:17 AM
To: hans.nordeb...@ericsson.com; gary....@dektech.com.au
Cc: opensaf-devel@lists.sourceforge.net; thuan.tran <thuan.t...@dektech.com.au>
Subject: [PATCH 1/1] amf: amfd should reset msg_id counter to avoid message ID 
mismatch [#2891]

There is a case where AMFD sends a reboot order due to “out of sync window”,
then receives a CLM track callback while the node is not yet a member and deletes the node.
Later, AMFND MDS down does not reset the msg_id counters because it cannot find the node.
When the node comes back up after the reboot, AMFD continues with the current msg_id counter when sending to AMFND,
which causes a message ID mismatch in AMFND, and AMFND then orders a reboot of its own node.
---
src/amf/amfd/clm.cc | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/src/amf/amfd/clm.cc b/src/amf/amfd/clm.cc
index e113a65f9..25b54afbe 100644
--- a/src/amf/amfd/clm.cc
+++ b/src/amf/amfd/clm.cc
@@ -319,6 +319,10 @@ static void clm_track_cb(
LOG_IN("%s: CLM node '%s' is not an AMF cluster member; MDS down received",
__FUNCTION__, node_name.c_str());
avd_node_delete_nodeid(node);
+ /* Reset msg_id because AVND MDS down may come later
+ and cannot find node to reset these, cause message ID mismatch. */
+ node->rcv_msg_id = 0;
+ node->snd_msg_id = 0;
goto done;
}
TRACE(" Node Left: rootCauseEntity %s for node %u",
--
2.18.0
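
The scenario in the commit message can be sketched as a small model. This is only an illustration under stated assumptions: the two structs and the helpers are hypothetical, and the acceptance rule (a message is accepted only when its ID is exactly one above the last accepted ID) is assumed here rather than taken from the AMFND source.

```cpp
#include <cstdint>

// Hypothetical model of the sender side (AMFD) for a single node.
struct DirectorSide {
  std::uint32_t snd_msg_id = 0;
};

// Hypothetical model of the receiver side (AMFND). A freshly rebooted
// node starts again from 0.
struct NodeSide {
  std::uint32_t last_rcv_msg_id = 0;
};

// Increment-then-send: returns the ID carried by the next message.
inline std::uint32_t next_msg_id(DirectorSide& d) {
  return ++d.snd_msg_id;
}

// Assumed acceptance rule: the incoming ID must be exactly one above the
// last accepted ID; anything else counts as a message ID mismatch.
inline bool accept(NodeSide& n, std::uint32_t msg_id) {
  if (msg_id != n.last_rcv_msg_id + 1) return false;
  n.last_rcv_msg_id = msg_id;
  return true;
}
```

In this model, if the sender has already used IDs 1 and 2 and the node then reboots, the next message carries ID 3 while the node expects 1, so it is rejected. Setting snd_msg_id back to 0 on the sender side, as the patch does, makes the next message carry ID 1 again.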

_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
