Hi Mathi

The fix is ok but I have found something related that maybe should be handled 
in the same fix.

According to Anders B (maintainer of IMM) the TRY AGAIN timeout in these 
situation should be 60 sec because of the time I may take for IMM to 
reinitialize in some situations (see also #308). This is fulfilled if  
lgs_imm_impl_set() function is used but I have found places in the log service 
where immutil still is used:

amf_active_state_handler()
Sets nTries to 250 and errorsAreFatal = 0 which means that it is possible to 
get stuck here in 100 sec (or even more since there are several immutil 
functions used in this function with these timeout settings).

lgs_imm_activate()
As above. Only used during log initialization

lgs_become_imm_applier()
This function is used in several places. This function is also using immutil 
functions as above.

amf_standby_state_handler()
Using function lgs_become_imm_applier()

log_initialize()
Using function lgs_become_imm_applier()

In these situations failed implementer handling does not abort (in immutil) and 
TRY AGAIN timeout is more than 60 sec, so maybe we can leave it for now.
I think this should be fixed so that all OI and Applier handling is done the 
same way (by using lgs_imm_impl_set() function). The only thing that will be
different is that the log service may run for up to 60 sec without an OI or 
Applier. Also when becoming an applier (standby lgs) settings from the IMM 
class "OpenSafLogConfig" should be reread just to be sure that the settings are 
correct if there are changes done before we have become an applier.

Maybe you can just push your changes as is. The rest could be handled via a 
separate ticket?
What is your opinion?

> -----Original Message-----
> From: mathi.naic...@oracle.com [mailto:mathi.naic...@oracle.com]
> Sent: den 1 oktober 2013 11:15
> To: Lennart Lund
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: [PATCH 1 of 1] logsv: improved error handling in rda callback [#567]
> 
>  osaf/services/saf/logsv/lgs/lgs_evt.c |  31 +++++--------------------------
>  1 files changed, 5 insertions(+), 26 deletions(-)
> 
> 
> The patch introduces the following changes:
> Exit when MDS or MBCSv role change fails
> Avoid the 'goto done' as it was skipping the agent down processing
> Become implementer in a separate thread
> 
> diff --git a/osaf/services/saf/logsv/lgs/lgs_evt.c
> b/osaf/services/saf/logsv/lgs/lgs_evt.c
> --- a/osaf/services/saf/logsv/lgs/lgs_evt.c
> +++ b/osaf/services/saf/logsv/lgs/lgs_evt.c
> @@ -478,12 +478,12 @@ static uint32_t proc_rda_cb_msg(lgsv_lgs
> 
>               if ((rc = lgs_mds_change_role(lgs_cb)) !=
> NCSCC_RC_SUCCESS) {
>                       LOG_ER("lgs_mds_change_role FAILED %u", rc);
> -                     goto done;
> +                     exit(EXIT_FAILURE);
>               }
> 
>               if ((rc = lgs_mbcsv_change_HA_state(lgs_cb)) !=
> NCSCC_RC_SUCCESS) {
>                       LOG_ER("lgs_mbcsv_change_HA_state FAILED %u",
> rc);
> -                     goto done;
> +                     exit(EXIT_FAILURE);
>               }
> 
>               /* fail over, become implementer
> @@ -492,27 +492,9 @@ static uint32_t proc_rda_cb_msg(lgsv_lgs
>               TRACE("Give up applier role and become implementer");
>               lgs_giveup_imm_applier(lgs_cb);
> 
> -             immutilWrapperProfile.nTries = 250; /* LOG will be blocked
> until IMM responds */
> -             immutilWrapperProfile.errorsAreFatal = 0;
> -             if ((rc = immutil_saImmOiImplementerSet(lgs_cb-
> >immOiHandle, "safLogService"))
> -                             != SA_AIS_OK) {
> -
>       LOG_ER("immutil_saImmOiImplementerSet(safLogService) FAILED
> %u", rc);
> -                     goto done;
> -             }
> -             if ((rc = immutil_saImmOiClassImplementerSet(lgs_cb-
> >immOiHandle, 
> -                             != SA_AIS_OK) {
> -
>       LOG_ER("immutil_saImmOiImplementerSet(SaLogStreamConfig)
> FAILED %u", rc);
> -                     goto done;
> -             }
> -             /* Do this only if the class exists */
> -             if (true == *(bool*)
> lgs_imm_logconf_get(LGS_IMM_LOG_OPENSAFLOGCONFIG_CLASS_EXIST,
> NULL)) {
> -                     if ((rc =
> immutil_saImmOiClassImplementerSet(lgs_cb->immOiHandle,
> "OpenSafLogConfig"))
> -                                     != SA_AIS_OK) {
> -
>       LOG_ER("immutil_saImmOiImplementerSet(OpenSafLogConfig)
> FAILED %u", rc);
> -                             goto done;
> -                     }
> -             }
> -
> +             /* Declare implementership from a separate thread */
> +             lgs_imm_impl_set(lgs_cb);
> +
>               /* Agent down list has to be processed first */
>               lgs_process_lga_down_list();
> 
> @@ -526,9 +508,6 @@ static uint32_t proc_rda_cb_msg(lgsv_lgs
>               }
>       }
> 
> -done:
> -     immutilWrapperProfile.nTries = 20; /* Reset retry time to more
> normal value. */
> -     immutilWrapperProfile.errorsAreFatal = 1;
>       TRACE_LEAVE();
>       return rc;
>  }

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to