The fix in ImmModel.cc is OK, but there are still problems with the fix of ModifyCallback. Some error cases that jump to done will now omit removing the immutils data.
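To illustrate the concern, here is a minimal sketch of the single-exit cleanup pattern, where every path that jumps to done still removes the per-ccb record. All names here (CcbStore, modifyCallback, failAt, kPrtaBase) are hypothetical stand-ins and not the OpenSAF or immutils API. Only the ccbId > 0x100000000 guard is taken from the patch.

```cpp
#include <cassert>   // used by the usage checks in the note below
#include <cstdint>
#include <map>

// Hypothetical stand-ins for illustration; none of these names are OpenSAF's.
enum class Rc { Ok, TryAgain, Failed };

// Toy replacement for the immutils ccb-record bookkeeping.
struct CcbStore {
    std::map<std::uint64_t, int> records;  // ccbId -> number of ops seen
    void add(std::uint64_t id) { ++records[id]; }
    void remove(std::uint64_t id) { records.erase(id); }
    bool has(std::uint64_t id) const { return records.count(id) != 0; }
};

// The patch guards PRTA cleanup with (ccbId > 0x100000000LL); mirrored here.
constexpr std::uint64_t kPrtaBase = 0x100000000ULL;

// Simulated modify callback. failAt selects which step fails (0 = none).
Rc modifyCallback(CcbStore& store, std::uint64_t ccbId, int failAt) {
    Rc rc = Rc::Ok;
    store.add(ccbId);  // record the operation, as the callback would

    if (failAt == 1) { rc = Rc::TryAgain; goto done; }  // e.g. time-out waiting on prepare
    if (failAt == 2) { rc = Rc::Failed;   goto done; }  // e.g. commit failed

done:
    // Single exit point: cleanup runs on success and on every error path.
    if (ccbId > kPrtaBase) {
        store.remove(ccbId);
    }
    return rc;
}
```

With this shape, a call such as modifyCallback(store, kPrtaBase + 1, 1) returns Rc::TryAgain and leaves no record behind, so a later prepare for the same ccb cannot find stale data. A record for a ccbId at or below kPrtaBase is left in place by this sketch.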
So I have to nack this patch. Unless the fix in ModifyCallback is really simple, at this point I would suggest the refactoring approach: make ModifyCallback follow the same pattern as CreateCallback. This should work. The only difference is in the handling of parentDN/RDN. Doing a bigger change always has risks, but if we really ensure the same pattern as create, then it MUST work.

/AndersBj

reddy.neelaka...@oracle.com wrote:
>  osaf/services/saf/immsv/immnd/ImmModel.cc        |   4 ++-
>  osaf/services/saf/immsv/immpbed/immpbe_daemon.cc |  31 ++++++-----------------
>  2 files changed, 12 insertions(+), 23 deletions(-)
>
>
> problem:
>
> The problem is observed when a PRTA update is done. For PRTA updates,
> PBEA (primary) receives ObjectModifyCallback from the local IMMND and PBEB
> (slave) receives ModifyCallback from the local IMMND in the rt_object_update
> function, in parallel.
>
> In this case PBEB received the modify callback and timed out after 5
> seconds:
>
> Nov 13 11:38:14 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare from primary on PRTA update ccb:100000187
> Nov 13 11:38:14 SLES-64BIT-SLOT2 osafimmpbed: NO Slave PBE time-out in waiting on prepare for PRTA update ccb:100000187 dn:safNode=PL-3,safCluster=myClmCluster
>
>
> After that, PBEA tries to send prepare towards PBEB:
>
> Nov 13 11:38:25 SLES-64BIT-SLOT1 osafimmpbed: IN Slave PBE replied with OK on attempt to start prepare of ccb:100000187/4294967687
> Nov 13 11:38:25 SLES-64BIT-SLOT1 osafimmpbed: IN Starting distributed PBE commit for PRTA update Ccb:100000188/4294967688
> Nov 13 11:38:25 SLES-64BIT-SLOT1 osafimmnd[3145]: ER PBE PRTAttrs Update continuation missing! invoc:391
>
>
> Analyses:
> 1. After PBEB times out in ModifyCallback, the slave PBE's ccbutil CCB data
> is not deleted (ccbutil_deleteCcbData is not called). Because of this, PBEB
> replied OK to the prepare from PBEA for the same PRTA update.
> From this time on, PBEB continuously receives TRY_AGAIN, as
> can be noted in the syslog below:
>
> Nov 13 11:38:25 SLES-64BIT-SLOT2 osafimmpbed: NO Prepare ccb:100000188/4294967688 received at Pbe slave when Prior Ccb 4294967687 still processing
>
> Nov 13 12:21:47 SLES-64BIT-SLOT2 osafimmpbed: NO Prepare ccb:10000046b/4294968427 received at Pbe slave when Prior Ccb 4294967687 still processing
>
> 2. As explained in the point above, PBEB sends OK for the prepare from PBEA
> and PBEA commits the PRTA update, but the same update is not present in IMM
> RAM and PBEB. This is because the non SA_AIS_OK reply from PBEB is
> considered and the continuation is removed:
>
> Nov 13 11:38:25 SLES-64BIT-SLOT2 osafimmnd[2491]: ER PBE PRTAttrs Update continuation missing! invoc:391
>
> solution:
>
> 1. ObjectModify Callback has been changed (like CreateCallback) to call
> ccbutil_deleteCcbData when a PRTA update times out.
>
> For 2PBE, only PRTA create / PRTA update operations get parallel
> create/modify callbacks for both PBEA and PBEB.
>
> 2. In pbePrtAttrUpdateContinuation, if the return code is SA_AIS_OK then
> check for the continuation (similar to pbePrtAttrCreateContinuation).
>
> diff --git a/osaf/services/saf/immsv/immnd/ImmModel.cc b/osaf/services/saf/immsv/immnd/ImmModel.cc
> --- a/osaf/services/saf/immsv/immnd/ImmModel.cc
> +++ b/osaf/services/saf/immsv/immnd/ImmModel.cc
> @@ -14634,8 +14634,10 @@ void ImmModel::pbePrtAttrUpdateContinuat
>      }
>
>      if(i2 == sPbeRtMutations.end()) {
> -        LOG_ER("PBE PRTAttrs Update continuation missing! invoc:%u",
> +        if(error == SA_AIS_OK) {
> +            LOG_ER("PBE PRTAttrs Update continuation missing! invoc:%u",
>                  invocation);
> +        }
>          return;
>      }
>
> diff --git a/osaf/services/saf/immsv/immpbed/immpbe_daemon.cc b/osaf/services/saf/immsv/immpbed/immpbe_daemon.cc
> --- a/osaf/services/saf/immsv/immpbed/immpbe_daemon.cc
> +++ b/osaf/services/saf/immsv/immpbed/immpbe_daemon.cc
> @@ -298,7 +298,9 @@ static SaAisErrorT pbe2_ok_to_prepare_cc
>          TRACE("First try at prepare for ccb: %llu at slave PBE", ccbId);
>          s2PbeBCcbUtilCcbData = ccbutil_findCcbData(ccbId);
>          if(s2PbeBCcbUtilCcbData == NULL) {
> -            TRACE("First ccb-op for ccb:%llu not yet received at slave PBE", ccbId);
> +            TRACE("First ccb-op not yet received or time-out in waiting on prepare, for ccb:%llu at slave PBE", ccbId);
> +            return SA_AIS_ERR_TRY_AGAIN;
> +            goto done;
>          } else if(s2PbeBCcbUtilCcbData->ccbId != ccbId) {
>              /* This should never happen, but since we dont use any locking and since the applier
>                 thread may mutate ccb-utils concurrently with this lookup for a specific ccb-record,
> @@ -1270,28 +1272,17 @@ static SaAisErrorT saImmOiCcbObjectModif
>      }
>      TRACE("Commit PBE transaction %llx for rt attr update OK", ccbId);
>
> -    if(ccbUtilCcbData && (ccbId > 0x100000000LL)) {
> -        /* Remove any PRTA update from immutils for 1PBE or 2PBE.
> -           For 2PBE removing immutildata *before* resetting syncronisation
> -           variables minimzes risk of derailing multithreaded use in immutils.
> -        */
> -        ccbutil_deleteCcbData(ccbUtilCcbData);
> -        ccbUtilCcbData = NULL;
> -    }
> -
> -    /* Reset 2pbe-ccb-syncronization variables at slave. */
> -    if(sPbe2B) {
> -        s2PbeBCcbToCompleteAtB=0;
> -        s2PbeBCcbOpCountToExpectAtB=0;
> -        s2PbeBCcbOpCountNowAtB=0;
> -        s2PbeBCcbUtilCcbData = NULL;
> -    }
>
>      goto done;
>
>   abort_prta_trans:
>      pbeAbortTrans(sDbHandle);
>
> + done:
> +    if((rc != SA_AIS_OK) && sPbe2 && (ccbId > 0x100000000LL)) {
> +        LOG_NO("2PBE Error (%u) in PRTA update (ccbId:%llx)", rc, ccbId);
> +    }
> +
>      if(ccbUtilCcbData && (ccbId > 0x100000000LL)) {
>          /* Remove any PRTA update from immutils for 1PBE or 2PBE.
>             For 2PBE removing immutildata *before* resetting syncronisation
> @@ -1302,17 +1293,13 @@ static SaAisErrorT saImmOiCcbObjectModif
>      }
>
>      /* Reset 2pbe-ccb-syncronization variables at slave. */
> -    if(sPbe2B) {
> +    if(sPbe2B && (s2PbeBCcbToCompleteAtB == ccbId)) {
>          s2PbeBCcbToCompleteAtB=0;
>          s2PbeBCcbOpCountToExpectAtB=0;
>          s2PbeBCcbOpCountNowAtB=0;
>          s2PbeBCcbUtilCcbData = NULL;
>      }
>
> - done:
> -    if((rc != SA_AIS_OK) && sPbe2 && (ccbId > 0x100000000LL)) {
> -        LOG_NO("2PBE Error (%u) in PRTA update (ccbId:%llx)", rc, ccbId);
> -    }
>
>      TRACE_LEAVE();
>      return rc;
>
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel