What version are you running?

Alex

On 03/20/2017 09:19 AM, David Hoyt wrote:
> Correction, I believe the default time-out is 60 seconds, not 10.
>
> Regards,
> /David/
>
> *From:* David Hoyt
> *Sent:* Monday, March 20, 2017 9:19 AM
> *To:* Alex Jones <[email protected]>; Neelakanta Reddy <[email protected]>; [email protected]
> *Subject:* RE: [users] si-swap opensaf SUs results in error but the action still completes
>
> Alex, isn't the default time-out 10 seconds?
> If so, then why did immnd time out ~7 seconds later?
>
> Mar 14 11:31:41 sb117vm0 osafamfd[21236]: NO safSi=SC-2N,safApp=OpenSAF Swap initiated
> …
> Mar 14 11:31:48 sb117vm0 osafimmnd[21104]: WA Timeout on syncronous admin operation 1
>
> Regards,
> /David/
>
> -----Original Message-----
> From: Alex Jones
> Sent: Saturday, March 18, 2017 9:41 AM
> To: David Hoyt <[email protected]>; Neelakanta Reddy <[email protected]>; [email protected]
> Subject: RE: [users] si-swap opensaf SUs results in error but the action still completes
>
> David,
>
> You can pass "-t <timeout in seconds>" to "amf-adm" to set the timeout to whatever you want.
>
> Alex
>
> ________________________________________
> From: David Hoyt [[email protected]]
> Sent: Friday, March 17, 2017 9:35 AM
> To: Neelakanta Reddy; [email protected]
> Subject: Re: [users] si-swap opensaf SUs results in error but the action still completes
>
> Hi Neel,
>
> The purpose of the test is to see if our system can continue to run “normally” when in a geographical configuration.
> That is, both SCs are NOT co-located, but reside thousands of km apart.
> This is simulated in the lab by adding a delay between the two servers which host the SCs.
>
> What we’re seeing is that when the delay is increased to a certain value, the si-swap command between the two OpenSAF SUs results in an error.
>
> [root@sb117vm0 ~]# date ; amf-adm si-swap safSi=SC-2N,safApp=OpenSAF;
> Tue Mar 14 11:31:41 EDT 2017
> error - saImmOmAdminOperationInvoke_2 FAILED: SA_AIS_ERR_TIMEOUT (5)
>
> However, the logs show that the action actually completes about 2 seconds after the timeout.
>
> Mar 14 11:31:48 sb117vm0 osafimmnd[21104]: WA Timeout on syncronous admin operation 1
> Mar 14 11:31:50 sb117vm0 osafimmnd[21104]: NO Implementer disconnected 67 <0, 2020f> (@safAmfService2020f)
> Mar 14 11:31:50 sb117vm0 osafimmnd[21104]: NO Implementer connected: 72 (safAmfService) <0, 2020f>
> Mar 14 11:31:50 sb117vm0 osafamfd[21236]: NO Switching Quiesced --> StandBy
> Mar 14 11:31:50 sb117vm0 osafrded[21057]: NO RDE role set to STANDBY
> Mar 14 11:31:50 sb117vm0 osafamfd[21236]: NO Controller switch over done
>
> I’m trying to determine if there’s some way to delay the immnd time-out so that the si-swap command returns success.
>
> Regards,
> David
>
> From: Neelakanta Reddy [mailto:[email protected]]
> Sent: Friday, March 17, 2017 7:10 AM
> To: David Hoyt <[email protected]>; [email protected]
> Subject: Re: [users] si-swap opensaf SUs results in error but the action still completes
>
> Hi,
>
> comments inline.
>
> On 2017/03/16 07:33 PM, David Hoyt wrote:
> >> Some additional info.
> >>
> >> I found out that the users were testing in a lab that had a delay between the two SC nodes. The delay was added for geographical redundancy testing.
> >> Once the time was reduced, the timeout error for the opensaf swap went away.
> >>
> >> In looking through the osafimmnd log file, I see the following:
> >> Mar 14 11:31:48.320965 osafimmnd [21104:ImmModel.cc:12042] T5 Forcing Adm Req continuation to expire 609885356033
> >> ...
> >> Mar 14 11:31:48.601903 osafimmnd [21104:ImmModel.cc:12437] T5 Timeout on AdministrativeOp continuation 609885356033 tmout:1
> >> Mar 14 11:31:48.601952 osafimmnd [21104:ImmModel.cc:11311] T5 REQ ADM CONTINUATION 5069295 FOUND FOR 609885356033
> >> Mar 14 11:31:48.601987 osafimmnd [21104:immnd_proc.c:1086] WA Timeout on syncronous admin operation 1
> >>
> >> The code around line 12042 of file ImmModel.cc is as follows:
> >>
> >> 12040 for(ci2=sAdmReqContinuationMap.begin(); ci2!=sAdmReqContinuationMap.end(); ++ci2) {
> >> 12041     if((ci2->second.mTimeout) && (ci2->second.mImplId == implHandle)) {
> >> 12042         TRACE_5("Forcing Adm Req continuation to expire %llu", ci2->first);
> >> 12043         ci2->second.mTimeout = 1; /* one second is minimum timeout. */
> >> 12044     }
> >> 12045 }
> >>
> >> Right after the log at line 12042 is generated, the timeout value is updated to 1 second (line 12043).
>
> The node targeted by the admin operation went down from OpenSAF's perspective.
> The timeout is then set to the minimum of 1 second.
>
> >> Can I increase this to 2 seconds?
>
> OpenSAF has already noted the other node as down, so what additional benefit would increasing this to 2 seconds achieve?
>
> >> If so, would it cause any badness?
>
> Please explain what end result you are targeting.
>
> Regards,
> Neel.
>
> >>
> >> Regards,
> >> David
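
To make the effect of the quoted loop easier to see, here is a minimal standalone sketch of it. This is not the OpenSAF code: the struct, the `ContinuationMap` alias and the `forceExpire` name are hypothetical, and only the condition and the assignment of 1 are taken from the lines David quoted. It suggests that once this path runs, whatever timeout the caller asked for is overwritten with the one-second minimum for every armed continuation waiting on that implementer.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for the continuation record. The field names mirror
// the quoted snippet; everything else about the real type is unknown here.
struct AdmReqContinuation {
    unsigned int mTimeout;  // seconds remaining; 0 means no timeout is armed
    unsigned int mImplId;   // implementer the request is waiting on
};

// Hypothetical alias: continuation id -> continuation record.
using ContinuationMap = std::map<std::uint64_t, AdmReqContinuation>;

// Shorten every armed continuation that waits on implHandle to the
// one-second minimum. Returns the number of entries that were changed.
int forceExpire(ContinuationMap& continuations, unsigned int implHandle) {
    int forced = 0;
    for (auto& entry : continuations) {
        AdmReqContinuation& c = entry.second;
        if (c.mTimeout && c.mImplId == implHandle) {
            c.mTimeout = 1;  // one second is the minimum timeout
            ++forced;
        }
    }
    return forced;
}
```

In this model, a continuation armed with 60 seconds for the affected implementer ends up with 1 second after `forceExpire`, while entries for other implementers, or entries with no timeout armed, are left alone.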
_______________________________________________
Opensaf-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-users
