Hi Mahesh,

I am not able to reproduce the scenario (tested with 200K objects).
I did not observe IMMND restarts.

Thanks,
Neel.

On Friday 03 July 2015 11:06 AM, A V Mahesh wrote:
> Hi Neel,
>
> On 7/2/2015 2:54 PM, Anders Björnerstedt wrote:
>> Ack from me.
>> Not tested.
>
> I was testing with 200K objects and observed some issues. Please verify
> before pushing.
>
> 1) bring up SC-1 active with 200K objects
> 2) bring up PL-3
> 3) bring up PL-4
> 4) try to bring up SC-2 as standby
> 5) you will observe osafimmnd restarts on the payload(s), and they will
> never re-join
>
>
> ==================================================================================================================================
>  
>
> Jul  3 10:49:27 PL-4 osafamfnd[3651]: NO 
> 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' Presence State UNINSTANTIATED 
> => INSTANTIATING
> Jul  3 10:49:27 PL-4 osafamfwd[3661]: Started
> Jul  3 10:49:27 PL-4 osafckptnd[3671]: Started
> Jul  3 10:49:27 PL-4 osaflcknd[3681]: Started
> Jul  3 10:49:27 PL-4 osafmsgnd[3699]: Started
> Jul  3 10:49:27 PL-4 osafimmnd[3624]: NO Implementer connected: 12 
> (MsgQueueService132111) <49, 2040f>
> Jul  3 10:49:27 PL-4 osafsmfnd[3710]: Started
> Jul  3 10:49:27 PL-4 osafamfnd[3651]: NO 
> 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' Presence State INSTANTIATING 
> => INSTANTIATED
> Jul  3 10:49:27 PL-4 osafamfnd[3651]: NO Assigning 
> 'safSi=NoRed8,safApp=OpenSAF' ACTIVE to 
> 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF'
> Jul  3 10:49:27 PL-4 osafamfnd[3651]: NO Assigned 
> 'safSi=NoRed8,safApp=OpenSAF' ACTIVE to 
> 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF'
> Jul  3 10:49:27 PL-4 opensafd: OpenSAF(4.5.0 - ) services successfully 
> started
> done
> PL-4:~ # Jul  3 10:49:42 PL-4 kernel: [  568.167588] tipc: Established 
> link <1.1.4:eth2-1.1.2:eth2> on network plane B
> Jul  3 10:49:42 PL-4 kernel: [  568.168970] tipc: Established link 
> <1.1.4:eth0-1.1.2:eth3> on network plane A
> Jul  3 10:49:43 PL-4 osafimmnd[3624]: NO NODE STATE-> 
> IMM_NODE_R_AVAILABLE
> Jul  3 10:50:09 PL-4 osafamfnd[3651]: NO 
> 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' component restart probation 
> timer started (timeout: 60000000000 ns)
> Jul  3 10:50:09 PL-4 osafamfnd[3651]: NO Restarting a component of 
> 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' (comp restart count: 1)
> Jul  3 10:50:09 PL-4 osafamfnd[3651]: NO 
> 'safComp=IMMND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' faulted due to 
> 'avaDown' : Recovery is 'componentRestart'
> Jul  3 10:50:09 PL-4 osafimmnd[3751]: Started
> Jul  3 10:50:09 PL-4 osafimmnd[3751]: NO Persistent Back-End 
> capability configured, Pbe file:imm.db (suffix may get added)
> Jul  3 10:50:09 PL-4 osafimmnd[3751]: NO Fevs count adjusted to 203641 
> preLoadPid: 0
> Jul  3 10:50:09 PL-4 osafimmnd[3751]: NO SERVER STATE: 
> IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
> Jul  3 10:50:09 PL-4 osafimmnd[3751]: NO SERVER STATE: 
> IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
> Jul  3 10:50:09 PL-4 osafimmnd[3751]: NO SERVER STATE: 
> IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
> Jul  3 10:50:09 PL-4 osafimmnd[3751]: NO NODE STATE-> IMM_NODE_ISOLATED
> Jul  3 10:50:13 PL-4 osafamfnd[3651]: NO Restarting a component of 
> 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' (comp restart count: 2)
> Jul  3 10:50:13 PL-4 osafamfnd[3651]: NO 
> 'safComp=IMMND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' faulted due to 
> 'avaDown' : Recovery is 'componentRestart'
> Jul  3 10:50:13 PL-4 osafimmnd[3773]: Started
> Jul  3 10:50:13 PL-4 osafimmnd[3773]: NO Persistent Back-End 
> capability configured, Pbe file:imm.db (suffix may get added)
> Jul  3 10:50:13 PL-4 osafimmnd[3773]: NO Fevs count adjusted to 203732 
> preLoadPid: 0
> Jul  3 10:50:13 PL-4 osafimmnd[3773]: NO SERVER STATE: 
> IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
> Jul  3 10:50:14 PL-4 osafimmnd[3773]: NO SERVER STATE: 
> IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
> Jul  3 10:50:14 PL-4 osafimmnd[3773]: NO SERVER STATE: 
> IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
> Jul  3 10:50:14 PL-4 osafimmnd[3773]: NO NODE STATE-> IMM_NODE_ISOLATED
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfCompBaseType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfSUBaseType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfSGBaseType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfAppBaseType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfSvcBaseType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfCSBaseType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfCompGlobalAttributes
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfCompType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfCSType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfCtCsType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfHealthcheckType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfSvcType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfSvcTypeCSTypes
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfSUType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfSutCompType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfSGType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfAppType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfCluster
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfNode
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfNodeGroup
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfNodeSwBundle
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfApplication
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfSG
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfSI
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfCSI
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfCSIAttribute
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfSU
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfComp
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfHealthcheck
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfCompCsType
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfSIDependency
> Jul  3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded 
> classimplementer set. Impl-id:13 Class:SaAmfSIRankedSU
> Jul  3 10:50:28 PL-4 osafimmnd[3773]: NO NODE STATE-> 
> IMM_NODE_W_AVAILABLE
> Jul  3 10:50:29 PL-4 osafimmnd[3773]: NO SERVER STATE: 
> IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
> Jul  3 10:50:30 PL-4 osafimmnd[3773]: NO Implementer connected: 14 
> (MsgQueueService131599) <0, 2020f>
> ==================================================================================================================================
>  
>
>
> -AVM
>
> On 7/2/2015 2:54 PM, Anders Björnerstedt wrote:
>> Ack from me.
>> Not tested.
>> Good work!
>>
>> One thought that struck me is that the message types:
>>
>>     IMMND_EVT_A2ND_IMM_FEVS_2
>>     IMMD_EVT_ND2D_FEVS_REQ_2
>>     IMMND_EVT_D2ND_GLOB_FEVS_REQ_2
>>
>> should (in some later cleanup) be renamed to reflect that they are 
>> only used for imm-sync.
>> e.g.   IMMND_EVT_A2ND_IMM_SYNC_FEVS
>> Not for this ticket though.
>>
>> /AndersBj
>>
>>
>> -----Original Message-----
>> From: reddy.neelaka...@oracle.com [mailto:reddy.neelaka...@oracle.com]
>> Sent: den 1 juli 2015 16:16
>> To: Anders Björnerstedt; Zoran Milinkovic; mahesh.va...@oracle.com
>> Cc: opensaf-devel@lists.sourceforge.net
>> Subject: [PATCH 1 of 1] imm:checkpoint only FEVS header for sync 
>> messages [#952] v2
>>
>>   osaf/services/saf/immsv/immd/immd_evt.c   |  15 ++++++++++++---
>>   osaf/services/saf/immsv/immnd/immnd_evt.c |   9 ++++++++-
>>   2 files changed, 20 insertions(+), 4 deletions(-)
>>
>>
>> During sync, when check-pointing IMMND_EVT_D2ND_GLOB_FEVS_REQ_2 to the
>> standby IMMD, the fevs message buffer is set to NULL and the message
>> size is set to 0, so that the MBCSV check-pointing happens only for the
>> header.
>>
>> diff --git a/osaf/services/saf/immsv/immd/immd_evt.c b/osaf/services/saf/immsv/immd/immd_evt.c
>> --- a/osaf/services/saf/immsv/immd/immd_evt.c
>> +++ b/osaf/services/saf/immsv/immd/immd_evt.c
>> @@ -251,7 +251,7 @@ uint32_t immd_evt_proc_fevs_req(IMMD_CB
>>       /* Populate & Send the FEVS Event to IMMND */
>>       memset(&send_evt, 0, sizeof(IMMSV_EVT));
>>       send_evt.type = IMMSV_EVT_TYPE_IMMND;
>> -    send_evt.info.immnd.type = (evt->type == IMMD_EVT_ND2D_FEVS_REQ_2)?
>> +    send_evt.info.immnd.type = ((evt->type == IMMD_EVT_ND2D_FEVS_REQ_2)||(evt->type == 0))?
>>           IMMND_EVT_D2ND_GLOB_FEVS_REQ_2: IMMND_EVT_D2ND_GLOB_FEVS_REQ;
>>
>>       if ((evt->type == 0) && (fevs_req->sender_count > 0)) {
>> @@ -266,8 +266,8 @@ uint32_t immd_evt_proc_fevs_req(IMMD_CB
>>       send_evt.info.immnd.info.fevsReq.msg.size = fevs_req->msg.size;
>>       /*Borrow the buffer from the input message instead of copying */
>>       send_evt.info.immnd.info.fevsReq.msg.buf = fevs_req->msg.buf;
>> -    send_evt.info.immnd.info.fevsReq.isObjSync = (evt->type == IMMD_EVT_ND2D_FEVS_REQ_2)?
>> -        (fevs_req->isObjSync):0x0;
>> +    send_evt.info.immnd.info.fevsReq.isObjSync = ((evt->type == IMMD_EVT_ND2D_FEVS_REQ_2) ||
>> +            (evt->type == 0 ))? (fevs_req->isObjSync):0x0;
>>
>>       TRACE_5("immd_evt_proc_fevs_req send_count:%llu size:%u",
>>           send_evt.info.immnd.info.fevsReq.sender_count, send_evt.info.immnd.info.fevsReq.msg.size);
>> @@ -280,6 +280,15 @@ uint32_t immd_evt_proc_fevs_req(IMMD_CB
>>           mbcp_msg.type = IMMD_A2S_MSG_FEVS;
>>           mbcp_msg.info.fevsReq = send_evt.info.immnd.info.fevsReq;
>>
>> +        /* FEVS_REQ_2 messages are object sync messages. since this is mbcsv checkpointing
>> +           to standby, at the time of sync checkpointing complete fevs event is not required.
>> +           Checkpointing the header is sufficient to have the standby SC in sync with the fevs count.*/
>> +
>> +        if(evt->type == IMMD_EVT_ND2D_FEVS_REQ_2){
>> +            mbcp_msg.info.fevsReq.msg.size = 0;
>> +            mbcp_msg.info.fevsReq.msg.buf = NULL;
>> +            mbcp_msg.info.fevsReq.isObjSync = 0x0;
>> +        }
>>           /*Checkpoint the message to standby director.
>>              Syncronous call=>wait for ack */
>>           proc_rc = immd_mbcsv_sync_update(cb, &mbcp_msg);
>> diff --git a/osaf/services/saf/immsv/immnd/immnd_evt.c b/osaf/services/saf/immsv/immnd/immnd_evt.c
>> --- a/osaf/services/saf/immsv/immnd/immnd_evt.c
>> +++ b/osaf/services/saf/immsv/immnd/immnd_evt.c
>> @@ -8702,7 +8702,7 @@ static uint32_t immnd_evt_proc_fevs_rcv(
>>       SaBoolT originatedAtThisNd = (m_IMMSV_UNPACK_HANDLE_LOW(clnt_hdl) == cb->node_id);
>>
>>       if (originatedAtThisNd) {
>> -        osafassert(!reply_dest || (reply_dest == cb->immnd_mdest_id));
>> +        osafassert(!reply_dest || (reply_dest == cb->immnd_mdest_id) || isObjSync );
>>           if (cb->fevs_replies_pending) {
>>               --(cb->fevs_replies_pending);    /*flow control towards IMMD */
>>           }
>> @@ -8731,6 +8731,12 @@ static uint32_t immnd_evt_proc_fevs_rcv(
>>           }
>>       }
>>
>> +    if ((evt->type == IMMND_EVT_D2ND_GLOB_FEVS_REQ_2) && (msg->size == 0) && (msg->buf == NULL)){
>> +        // This is  sync message Re-broadcasted by IMMD standby because of failover
>> +        TRACE("Re-broadcasted FEVS at the time of sync");
>> +        goto done;
>> +    }
>> +
>>       /*NORMAL CASE: Received the expected in-order message. */
>>
>>       SaAisErrorT err = SA_AIS_OK;
>> @@ -8749,6 +8755,7 @@ static uint32_t immnd_evt_proc_fevs_rcv(
>>           }
>>       }
>>
>> + done:
>>       cb->highestProcessed++;
>>       dequeue_outgoing(cb);
>>       TRACE_LEAVE();
>
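
For readers following the thread, the header-only checkpointing idea in the quoted patch can be sketched in isolation. This is only an illustration under stated assumptions: `fevs_msg_t`, `make_checkpoint_copy` and `receive_fevs` are hypothetical names invented for this sketch, not the real IMMSV types or functions, and the sketch ignores MDS, MBCSV and flow control entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a FEVS request.
   This is NOT the real IMMSV structure. */
typedef struct {
    uint64_t sender_count;  /* FEVS sequence number (the "header" part) */
    uint32_t size;          /* payload size in bytes */
    const char *buf;        /* payload, borrowed from the caller */
    bool is_obj_sync;       /* true for object-sync traffic */
} fevs_msg_t;

/* Active director side: build the copy that would be checkpointed to the
   standby. For sync requests the payload is dropped, so only the header
   (here: the sequence number) reaches the standby. */
fevs_msg_t make_checkpoint_copy(const fevs_msg_t *src, bool is_sync_req)
{
    assert(src != NULL);
    fevs_msg_t cp = *src;
    if (is_sync_req) {
        cp.size = 0;
        cp.buf = NULL;
        cp.is_obj_sync = false;
    }
    return cp;
}

/* Node director side: a sync-type message with an empty payload is assumed
   to be a re-broadcast from the former standby after failover. Its payload
   is not applied, but the processed counter still advances so the node
   stays in step with the FEVS count. Returns true if the payload should
   be applied. */
bool receive_fevs(const fevs_msg_t *msg, bool is_sync_type,
                  uint64_t *highest_processed)
{
    assert(msg != NULL && highest_processed != NULL);
    bool apply = !(is_sync_type && msg->size == 0 && msg->buf == NULL);
    (*highest_processed)++;
    return apply;
}
```

In this sketch the counter is incremented on both paths, which mirrors the `goto done` in the patch landing just before `cb->highestProcessed++`.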


_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
