Hi Neel,

The main problem is that OI1 will be stuck in the callback for 10 sec.

On the OM side, when the OM gets FAILED_OPERATION (due to OI2 crashing), it 
will retry because the error is a resource abort.
OI1 is still stuck (for 10 sec), so the OM will get FAILED_OPERATION again 
because IMMND does not get a response from OI1 within 6 sec.
So the OM gets two FAILED_OPERATION (resource abort) results in a row.

Example:
00:00  OI2 crashes, OI1 starts to get stuck, OM gets resource abort.
00:01  OM retries (assume that OM sleeps 1 sec before retrying). OI2 may also 
be back after crashing by this time.
00:07  OM gets resource abort again (because OI1 doesn't respond for 6 sec).
00:10  OI1 is back to normal.

The OM may succeed on the 3rd attempt.
As you can see, the OM cannot make any progress until OI1 is no longer 
stuck, which takes 10 sec.
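
To make the arithmetic in the example explicit, here is a minimal sketch that 
replays the timeline. It is not OpenSAF code: the function and struct names 
are hypothetical, and it assumes OI2 is already back when the OM retries, 
that the OM sleeps a fixed time between attempts, and that a retry succeeds 
as soon as OI1 is free again.

```cpp
#include <algorithm>

// Hypothetical model of the retry timeline above; not part of OpenSAF.
struct RetryOutcome {
    int attempts;     // total OM attempts, counting the one aborted at t = 0
    int successTime;  // seconds after the OI2 crash at which an attempt succeeds
};

// oi1StuckSec:       how long OI1 stays stuck in its callback (10 in the example)
// oiReplyTimeoutSec: how long IMMND waits for the OI reply (6 in the example),
//                    assumed to be greater than 0
// omRetrySleepSec:   how long the OM sleeps before retrying (1 in the example)
RetryOutcome simulateOmRetries(int oi1StuckSec, int oiReplyTimeoutSec,
                               int omRetrySleepSec) {
    int now = 0;       // attempt 1 is aborted at t = 0, when OI2 crashes
    int attempts = 1;
    for (;;) {
        now += omRetrySleepSec;  // OM sleeps, then retries
        ++attempts;
        // Time OI1 still needs before it can answer this attempt.
        int waitForOi1 = std::max(0, oi1StuckSec - now);
        if (waitForOi1 < oiReplyTimeoutSec) {
            // OI1 answers before IMMND gives up: the attempt succeeds.
            return {attempts, now + waitForOi1};
        }
        // IMMND gives up waiting: the OM gets another resource abort.
        now += oiReplyTimeoutSec;
    }
}
```

Calling simulateOmRetries(10, 6, 1) walks through the same steps as the 
example: the second attempt starts at 00:01 and is aborted at 00:07, and the 
third attempt starts at 00:08 and can only complete once OI1 is free at 00:10.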

So I don't think it is good to let OI1 stay stuck for 10 sec like that.
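
For reference, the alternative I suggested earlier in this thread (reply to 
both the parent and the augmenting client when the CCB is aborted) could look 
roughly like the sketch below. It is only an illustration: CcbInfo is a 
simplified stand-in using the field names from the patch, and ReplyTarget and 
abortReplyTargets are hypothetical names, not the real ImmModel.cc types or 
functions.

```cpp
#include <cstdint>
#include <vector>

// Simplified, hypothetical stand-in for the CCB bookkeeping; only the
// fields mentioned in this thread are modelled.
struct CcbInfo {
    uint32_t mOriginatingNode = 0;
    uint32_t mOriginatingConn = 0;
    const CcbInfo* mAugCcbParent = nullptr;  // set only for an augmented CCB
};

// Hypothetical pair identifying one client that should get the abort reply.
struct ReplyTarget {
    uint32_t nodeId;
    uint32_t conn;
};

// Returns every client that should be told about the abort. For an
// augmented CCB this is the parent (OM) first, then the augmenting
// client (OI1), so that OI1 does not have to wait for its own timeout.
std::vector<ReplyTarget> abortReplyTargets(const CcbInfo& ccb) {
    std::vector<ReplyTarget> targets;
    if (ccb.mAugCcbParent != nullptr) {
        targets.push_back({ccb.mAugCcbParent->mOriginatingNode,
                           ccb.mAugCcbParent->mOriginatingConn});
    }
    targets.push_back({ccb.mOriginatingNode, ccb.mOriginatingConn});
    return targets;
}
```

For a normal CCB this returns a single target, so the existing behaviour 
would stay the same; only the augmented case gets a second reply.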

BR,

Hung Nguyen - DEK Technologies


--------------------------------------------------------------------------------
From: Neelakanta Reddy reddy.neelaka...@oracle.com
Sent: Tuesday, December 29, 2015 7:30PM
To: Hung Nguyen, Zoran Milinkovic
     hung.d.ngu...@dektech.com.au, zoran.milinko...@ericsson.com
Cc: Opensaf-devel
     opensaf-devel@lists.sourceforge.net
Subject: Re: [PATCH 1 of 1] imm : Aborted reply for augumentated Ccb should be 
sent to OM [#1503]


Hi Hung,

Comments inline.

On Monday 28 December 2015 02:15 PM, Hung Nguyen wrote:
> Hi Neel,
>
> Consider the case where the augmented ccb operation (the operation added 
> to the ccb by OI1) needs validation from another OI (OI2).
> If OI2 crashes in the ccb callback, the OM will get FAILED_OPERATION as 
> expected, but OI1 will get ERR_TIMEOUT from the OmCcbObject operation API.
>
Yes, even though OI1 gets a timeout, the OM gets FAILED_OPERATION and 
the CCB is eventually aborted.
After the OI1 timeout, the reply sent from OI1 is dropped with the message 
" ccb id  missing or terminated".

> So I think if mAugCcbParent is not NULL, we should reply to both 
> 'ccb->mOriginatingConn' and 'ccb->mAugCcbParent->mOriginatingConn'.
>
The OM timeout is longer than the OI timeout; because of this, the parent 
(OM) should receive the response.
If the reply is not sent to OI1, the OI1 ccb operation will time out, but 
this may not affect the CCBs in any way, because the non-critical CCB is 
aborted due to the implementer disconnection.

This is the only case in which additional changes would be required to send 
the reply to both the OM and OI1.
This case may be ignored, as the CCB is eventually aborted and the reply 
is dropped once the CCB is aborted.

/Neel.
>
> BR,
>
> Hung Nguyen - DEK Technologies
>
>
> --------------------------------------------------------------------------------
>  
>
> From: Neelakanta Reddy reddy.neelaka...@oracle.com
> Sent: Thursday, December 24, 2015 9:34PM
> To: Zoran Milinkovic, Hung Nguyen
>     zoran.milinko...@ericsson.com, hung.d.ngu...@dektech.com.au
> Cc: Opensaf-devel
>     opensaf-devel@lists.sourceforge.net
> Subject: [PATCH 1 of 1] imm : Aborted reply for augumentated Ccb 
> should be sent to OM [#1503]
>
>
>  osaf/services/saf/immsv/immnd/ImmModel.cc |  13 +++++++++++--
>  1 files changed, 11 insertions(+), 2 deletions(-)
>
>
> When the augumented Ccb is aborted then reply should be sent to the 
> parent(OM) and not to the augmented client
>
> diff --git a/osaf/services/saf/immsv/immnd/ImmModel.cc 
> b/osaf/services/saf/immsv/immnd/ImmModel.cc
> --- a/osaf/services/saf/immsv/immnd/ImmModel.cc
> +++ b/osaf/services/saf/immsv/immnd/ImmModel.cc
> @@ -5926,8 +5926,17 @@ ImmModel::ccbAbort(SaUint32T ccbId, Conn
>          }
>      }
>
> -    *nodeId = ccb->mOriginatingNode;
> -    *client = ccb->mOriginatingConn;
> +    if(ccb->mAugCcbParent){
> +    /* When the augumented Ccb is aborted then reply should be
> +           sent to the parent(OM) and not to the augmented client
> +        */
> +
> +        *nodeId = ccb->mAugCcbParent->mOriginatingNode;
> +        *client = ccb->mAugCcbParent->mOriginatingConn;
> +    } else {
> +        *nodeId = ccb->mOriginatingNode;
> +        *client = ccb->mOriginatingConn;
> +    }
>
>      ccb->mState = IMM_CCB_ABORTED;
>      if(ccb->mVeto == SA_AIS_OK) {
>
>



------------------------------------------------------------------------------
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
