Re: [devel] [PATCH 1 of 1] ntf: handle error code TRY_AGAIN of saClmInitialize() [#2191]
Ack, code review only. Thanks, Praveen On 18-Nov-16 8:58 AM, Vu Minh Nguyen wrote: > Thanks Praveen. I will remove ` SA_AIS_ERR_UNAVAILABLE` error code checking. > > Regards, Vu > >> -Original Message- >> From: praveen malviya [mailto:praveen.malv...@oracle.com] >> Sent: Thursday, November 17, 2016 4:07 PM >> To: Vu Minh Nguyen; >> minh.c...@dektech.com.au >> Cc: opensaf-devel@lists.sourceforge.net >> Subject: Re: [PATCH 1 of 1] ntf: handle error code TRY_AGAIN of >> saClmInitialize() [#2191] >> >> Hi Vu, >> >> Please find query inline [Praveen]. >> >> Thanks, >> Praveen >> >> On 17-Nov-16 11:58 AM, Vu Minh Nguyen wrote: >>> osaf/services/saf/ntfsv/ntfs/ntfs_clm.c | 14 +- >>> 1 files changed, 13 insertions(+), 1 deletions(-) >>> >>> >>> NTF did not deal with TRY_AGAIN error code of `saClmInitialize()`, >>> NTF would exit, and cause node reboot if getting TRY_AGAIN. >>> >>> The patch adds a while loop to do retry when getting TRY_AGAIN. >>> >>> diff --git a/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c >> b/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c >>> --- a/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c >>> +++ b/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c >>> @@ -101,13 +101,25 @@ void *ntfs_clm_init_thread(void *cb) >>> { >>> ntfs_cb_t *_ntfs_cb = (ntfs_cb_t *) cb; >>> SaAisErrorT rc = SA_AIS_OK; >>> + uint32_t msecs_waited = 0; >>> + const uint32_t max_waiting_time_10s = 10 * 1000; /* 10 secs */ >>> + >>> TRACE_ENTER(); >>> + >>> rc = saClmInitialize_4(&_ntfs_cb->clm_hdl, _callbacks, >> ); >>> + while (((rc == SA_AIS_ERR_TRY_AGAIN) || (rc == >> SA_AIS_ERR_TIMEOUT) || >>> + (rc == SA_AIS_ERR_UNAVAILABLE)) && >> [Praveen] Based on discussion on ticket #1828 related to CLM >> initialization, I remember this API does not return ERR_UNAVAILABLE even >> on a CLM locked node also. This provision is to enable application on >> CLM locked node to perform tracking of CLM status of nodes. 
>> >> But I think even CLM Agent can return this code when CLMS server has not >> got node related information from CLM agent and in that case CLMS will >> return ERR_UNAVAILABLE. If it is not so, then I think we are not >> required to handle rc == SA_AIS_ERR_UNAVAILABLE. >> >> Thanks, >> Praveen >>> + (msecs_waited < max_waiting_time_10s)) { >>> + usleep(100*1000); >>> + msecs_waited += 100; >>> + rc = saClmInitialize_4(&_ntfs_cb->clm_hdl, _callbacks, >> ); >>> + } >>> if (rc != SA_AIS_OK) { >>> LOG_ER("saClmInitialize failed with error: %d", rc); >>> TRACE_LEAVE(); >>> -exit(EXIT_FAILURE); >>> + exit(EXIT_FAILURE); >>> } >>> + >>> rc = saClmSelectionObjectGet(_ntfs_cb->clm_hdl, _cb- >>> clmSelectionObject); >>> if (rc != SA_AIS_OK) { >>> LOG_ER("saClmSelectionObjectGet failed with error: %d", >> rc); >>> > -- ___ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel
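For reference, a minimal sketch of the bounded retry loop as it stands after this review, with `SA_AIS_ERR_UNAVAILABLE` no longer treated as retryable. This is not the pushed patch: `Rc`, `init_with_retry` and the injected `sleep_ms` callback are hypothetical names used only for illustration, and the enum merely stands in for the SAF AIS return codes named in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical stand-ins for the SAF AIS return codes discussed above.
enum class Rc { Ok, TryAgain, Timeout, Unavailable, Other };

// Call `init` and, while it reports a transient error, wait `step_ms` and
// call it again, for at most `max_wait_ms` of accumulated waiting.
// `sleep_ms` is injected (assumption) so the loop can be exercised without
// real delays; the patch itself calls usleep(100*1000).
Rc init_with_retry(const std::function<Rc()>& init,
                   const std::function<void(uint32_t)>& sleep_ms,
                   uint32_t step_ms = 100,
                   uint32_t max_wait_ms = 10 * 1000) {
  assert(step_ms > 0);  // a zero step would never reach the time limit
  uint32_t waited_ms = 0;
  Rc rc = init();
  // Only TRY_AGAIN and TIMEOUT are retried; UNAVAILABLE and everything
  // else is returned to the caller immediately.
  while ((rc == Rc::TryAgain || rc == Rc::Timeout) && waited_ms < max_wait_ms) {
    sleep_ms(step_ms);
    waited_ms += step_ms;
    rc = init();
  }
  return rc;
}
```

With the defaults, a service that keeps answering TRY_AGAIN is called once up front plus once per 100 ms step, and the caller still sees the last error code after roughly 10 seconds, so the existing "log and exit" path stays unchanged.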
Re: [devel] [PATCH 1 of 1] ntf: handle error code TRY_AGAIN of saClmInitialize() [#2191]
Thanks Praveen. I will remove ` SA_AIS_ERR_UNAVAILABLE` error code checking. Regards, Vu > -Original Message- > From: praveen malviya [mailto:praveen.malv...@oracle.com] > Sent: Thursday, November 17, 2016 4:07 PM > To: Vu Minh Nguyen; > minh.c...@dektech.com.au > Cc: opensaf-devel@lists.sourceforge.net > Subject: Re: [PATCH 1 of 1] ntf: handle error code TRY_AGAIN of > saClmInitialize() [#2191] > > Hi Vu, > > Please find query inline [Praveen]. > > Thanks, > Praveen > > On 17-Nov-16 11:58 AM, Vu Minh Nguyen wrote: > > osaf/services/saf/ntfsv/ntfs/ntfs_clm.c | 14 +- > > 1 files changed, 13 insertions(+), 1 deletions(-) > > > > > > NTF did not deal with TRY_AGAIN error code of `saClmInitialize()`, > > NTF would exit, and cause node reboot if getting TRY_AGAIN. > > > > The patch adds a while loop to do retry when getting TRY_AGAIN. > > > > diff --git a/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c > b/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c > > --- a/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c > > +++ b/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c > > @@ -101,13 +101,25 @@ void *ntfs_clm_init_thread(void *cb) > > { > > ntfs_cb_t *_ntfs_cb = (ntfs_cb_t *) cb; > > SaAisErrorT rc = SA_AIS_OK; > > + uint32_t msecs_waited = 0; > > + const uint32_t max_waiting_time_10s = 10 * 1000; /* 10 secs */ > > + > > TRACE_ENTER(); > > + > > rc = saClmInitialize_4(&_ntfs_cb->clm_hdl, _callbacks, > ); > > + while (((rc == SA_AIS_ERR_TRY_AGAIN) || (rc == > SA_AIS_ERR_TIMEOUT) || > > + (rc == SA_AIS_ERR_UNAVAILABLE)) && > [Praveen] Based on discussion on ticket #1828 related to CLM > initialization, I remember this API does not return ERR_UNAVAILABLE even > on a CLM locked node also. This provision is to enable application on > CLM locked node to perform tracking of CLM status of nodes. > > But I think even CLM Agent can return this code when CLMS server has not > got node related information from CLM agent and in that case CLMS will > return ERR_UNAVAILABLE. 
If it is not so, then I think we are not > required to handle rc == SA_AIS_ERR_UNAVAILABLE. > > Thanks, > Praveen > > + (msecs_waited < max_waiting_time_10s)) { > > + usleep(100*1000); > > + msecs_waited += 100; > > + rc = saClmInitialize_4(&_ntfs_cb->clm_hdl, _callbacks, > ); > > + } > > if (rc != SA_AIS_OK) { > > LOG_ER("saClmInitialize failed with error: %d", rc); > > TRACE_LEAVE(); > > -exit(EXIT_FAILURE); > > + exit(EXIT_FAILURE); > > } > > + > > rc = saClmSelectionObjectGet(_ntfs_cb->clm_hdl, _cb- > >clmSelectionObject); > > if (rc != SA_AIS_OK) { > > LOG_ER("saClmSelectionObjectGet failed with error: %d", > rc); > > -- ___ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel
Re: [devel] [PATCH 1 of 1] cpsv: on cpnd down fist remove child safReplica object then parent safCkpt object [#2189]
Dear Mahesh,

Reviewed and tested with both the collocated and the non-collocated case. I saw the problem fixed and could not find any further occurrence. So ACK from me, tested.

Sincerely,
Hoang

-----Original Message-----
From: mahesh.va...@oracle.com [mailto:mahesh.va...@oracle.com]
Sent: Wednesday, November 16, 2016 3:58 PM
To: hoang.m...@dektech.com.au
Cc: opensaf-devel@lists.sourceforge.net
Subject: [PATCH 1 of 1] cpsv: on cpnd down fist remove child safReplica object then parent safCkpt object [#2189]

 osaf/services/saf/cpsv/cpd/cpd_imm.c  |  20
 osaf/services/saf/cpsv/cpd/cpd_proc.c |   9 -
 2 files changed, 20 insertions(+), 9 deletions(-)


Bug:
While cpd is processing cpnd down for a COLLOCATED checkpoint that exists only on the cpnd that went down (no other node in the cluster has opened this checkpoint), cpd removes that checkpoint and its replica completely. In this case the current logic has a bug: it first removes the ckpt node and then the replica. This deletes the parent object safCkpt=...,* first and the child object safReplica=...,safCkpt=...,* next. As we know, IMM removes the child when the parent is removed, so the out-of-sequence removal of the safReplica object returns ERR_NOT_EXIST.

Fix:
When cpd removes that checkpoint and replica completely, follow the sequence of child object safReplica=...,safCkpt=...,* first, then parent object safCkpt=...,* next. This is a focused fix; we may need to review the complete code for similar occurrences, and any that are found will be addressed in a new ticket.

diff --git a/osaf/services/saf/cpsv/cpd/cpd_imm.c b/osaf/services/saf/cpsv/cpd/cpd_imm.c
--- a/osaf/services/saf/cpsv/cpd/cpd_imm.c
+++ b/osaf/services/saf/cpsv/cpd/cpd_imm.c
@@ -400,7 +400,9 @@ SaAisErrorT delete_runtime_replica_objec
 	osaf_extended_name_lend(replica_dn, _name);
 	rc = immutil_saImmOiRtObjectDelete(immOiHandle, _name);
 	if (rc != SA_AIS_OK) {
-		LOG_ER("Deleting run time object %s Failed - rc = %d",replica_dn, rc);
+		LOG_ER("Deleting run time object %s Failed-1 - rc = %d",replica_dn, rc);
+	} else {
+		TRACE("Deleting run time object %s Success-1 - rc = %d",replica_dn, rc);
 	}
 	free(replica_dn);
@@ -522,9 +524,11 @@ SaAisErrorT delete_runtime_ckpt_object(C
 	osaf_extended_name_lend(ckpt_node->ckpt_name, _name);
 	rc = immutil_saImmOiRtObjectDelete(immOiHandle, _name);
-	if (rc != SA_AIS_OK)
+	if (rc != SA_AIS_OK) {
 		LOG_ER("Deleting run time object %s failed - rc = %d", ckpt_node->ckpt_name, rc);
-
+	} else {
+		TRACE("Deleting run time object %s success - rc = %d", ckpt_node->ckpt_name, rc);
+	}
 	return rc;
 }
@@ -917,11 +921,11 @@ SaAisErrorT cpd_clean_checkpoint_objects
 		/* Delete the runtime object and its children. */
 		rc = immutil_saImmOiRtObjectDelete(cb->immOiHandle, _name);
 		if (rc == SA_AIS_OK) {
-			TRACE("Object \"%s\" deleted", (char *) osaf_extended_name_borrow(_name));
-		} else {
-			LOG_ER("%s saImmOiRtObjectDelete for \"%s\" FAILED %d",
-				__FUNCTION__, (char *) osaf_extended_name_borrow(_name), rc);
-		}
+			TRACE("saImmOiRtObjectDelete \"%s\" deleted Successfully", (char *) osaf_extended_name_borrow(_name));
+		} else {
+			LOG_ER("%s saImmOiRtObjectDelete for \"%s\" FAILED %d",
+				__FUNCTION__, (char *) osaf_extended_name_borrow(_name), rc);
+		}
 	}
 	if (rc != SA_AIS_ERR_NOT_EXIST) {
diff --git a/osaf/services/saf/cpsv/cpd/cpd_proc.c b/osaf/services/saf/cpsv/cpd/cpd_proc.c
--- a/osaf/services/saf/cpsv/cpd/cpd_proc.c
+++ b/osaf/services/saf/cpsv/cpd/cpd_proc.c
@@ -809,6 +809,11 @@ uint32_t cpd_process_cpnd_down(CPD_CB *c
 		send_evt.info.cpnd.info.ckpt_del.mds_dest = *cpnd_dest;
 		if (ckpt_node->dest_cnt == 0) {
 			TRACE_1("cpd ckpt del success for ckpt_id:%llx",ckpt_node->ckpt_id);
+			/* Delete reploc fist*/
+			cpd_ckpt_reploc_get(>ckpt_reploc_tree, _info, _info);
+			if (rep_info) {
+				cpd_ckpt_reploc_node_delete(cb, rep_info, ckpt_node->is_unlink_set);
+			}
 			cpd_ckpt_map_node_get(>ckpt_map_tree, ckpt_node->ckpt_name, _info);
 			/* Remove the ckpt_node */
@@ -875,7 +880,7 @@ uint32_t cpd_process_cpnd_down(CPD_CB *c
 		/* Send it to CPD(s), by sending ckpt_id = 0 */
 		/* This is to delete the node from reploc_tree */
 		cpd_ckpt_reploc_get(>ckpt_reploc_tree, _info, _info);
-		if
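The ordering problem described in the Bug/Fix text can be reproduced with a toy model. The sketch below is only an illustration under stated assumptions: `ToyStore` is a hypothetical in-memory stand-in, not the IMM API, and the DNs are made up. It assumes, as the patch description says, that deleting a parent object also removes its children.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class Rc { Ok, NotExist };

// Hypothetical object store keyed by DN. Removing an object also removes
// every object whose DN ends with ",<dn>", i.e. its children.
class ToyStore {
 public:
  void create(const std::string& dn) {
    assert(!dn.empty());
    objects_.insert(dn);
  }

  Rc remove(const std::string& dn) {
    if (objects_.erase(dn) == 0) return Rc::NotExist;
    const std::string suffix = "," + dn;
    for (auto it = objects_.begin(); it != objects_.end();) {
      const std::string& key = *it;
      const bool is_child =
          key.size() > suffix.size() &&
          key.compare(key.size() - suffix.size(), suffix.size(), suffix) == 0;
      if (is_child) {
        it = objects_.erase(it);  // cascade: the child goes with its parent
      } else {
        ++it;
      }
    }
    return Rc::Ok;
  }

 private:
  std::set<std::string> objects_;
};
```

Removing `safCkpt=c1` first and `safReplica=r1,safCkpt=c1` second yields `Rc::NotExist` on the second call, which mirrors the ERR_NOT_EXIST seen in the ticket; removing the replica first lets both calls succeed.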
Re: [devel] [PATCH 1 of 1] AMFD: Do not let non-active AMFD receives CLM track callback [#2141]
Hi Praveen,

It has been seen once, when swapping the 2N OpenSAF SI. It was just an error indicating that AMFD failed to stop the track callback; after that the si-swap still succeeded. The CLM service could be busy during switch-over, but AMF (as well as other services) is supposed to handle that return code properly, as AMFD already does in track start. Further testing exposed a problem caused by the CLM track callback being received on the standby AMFD.

AMFD cannot afford to be kept busy just trying to stop the CLM track.

First option: AMFD can start another thread which only tries to stop the track callback. But that is likely to generate more problems if this standby AMFD is promoted back to active while the stop-track thread is still running. At that time the main thread will be starting the track while the other thread is trying to stop it.

Second option, as in the patch: AMFD retries stopping the track in the callback, which is where the problem shows up. When the standby AMFD switches back to active, we stop and start the track again, although there seems to be little point in doing this. I hope this has less impact on existing behavior. Do you see any problems?

Regarding a standby controller with locked/disabled CLM, I think it is mentioned (in #1828 or somewhere else?) that this matter requires more thought, as to whether we want an HA role assigned to AMFD on a non-member node. It seems to be beyond this ticket.

Thanks,
Minh

On 17/11/16 22:03, praveen malviya wrote:
> Hi Minh,
>
> We have not seen this issue earlier. Why CLMA is returning timeout
> where it is busy?
> If we are allowing Standby controller to continue tracking the Node,
> then I think (though not fully sure), it will increase one more client
> for validation step and this client does not exist locally. Also we
> need to evaluate the situation where Standby controller is CLM locked
> or disabled.
>
> Thanks,
> Praveen
>
>
>
>
> On 17-Nov-16 3:44 AM, minh chau wrote:
>> Hi all,
>>
>> Has anyone had chance to review this patch?
>> >> Thanks, >> Minh >> >> On 14/11/16 15:27, Minh Hon Chau wrote: >>> osaf/services/saf/amf/amfd/clm.cc | 37 >>> +++- >>> osaf/services/saf/amf/amfd/include/cb.h | 1 + >>> osaf/services/saf/amf/amfd/role.cc | 16 +++--- >>> 3 files changed, 35 insertions(+), 19 deletions(-) >>> >>> >>> In controller failover/switchover, sometimes active AMFD fails to stop >>> CLM track callback. Therefore, when this AMFD become standby, AMFD can >>> continue receiving CLM track callback and trigger the operations which >>> should only be executed in active AMFD. >>> >>> diff --git a/osaf/services/saf/amf/amfd/clm.cc >>> b/osaf/services/saf/amf/amfd/clm.cc >>> --- a/osaf/services/saf/amf/amfd/clm.cc >>> +++ b/osaf/services/saf/amf/amfd/clm.cc >>> @@ -219,7 +219,13 @@ static void clm_track_cb(const SaClmClus >>> LOG_ER("ClmTrackCallback received in error"); >>> goto done; >>> } >>> - >>> +if (avd_cb->avail_state_avd != SA_AMF_HA_ACTIVE) { >>> +if (avd_cb->is_clm_track_started == true) { >>> +LOG_NO("Retry to stop clm track with AMFD state(%d)", >>> avd_cb->avail_state_avd); >>> +avd_clm_track_stop(); >>> +} >>> +goto done; >>> +} >>> /* >>> ** The CLM cluster can be larger than the AMF cluster thus it is >>> not an >>> ** error if the corresponding AMF node cannot be found. 
>>> @@ -394,6 +400,7 @@ SaAisErrorT avd_clm_init(AVD_CL_CB* cb) >>> cb->clmHandle = 0; >>> cb->clm_sel_obj = 0; >>> +cb->is_clm_track_started = false; >>> TRACE_ENTER(); >>> /* >>>* TODO: This CLM initialization thread can be re-factored >>> @@ -453,6 +460,8 @@ SaAisErrorT avd_clm_track_start(void) >>> } else { >>> LOG_ER("Failed to start cluster tracking %u", error); >>> } >>> +} else { >>> +avd_cb->is_clm_track_started = true; >>> } >>> TRACE_LEAVE(); >>> return error; >>> @@ -460,17 +469,23 @@ SaAisErrorT avd_clm_track_start(void) >>> SaAisErrorT avd_clm_track_stop(void) >>> { >>> -SaAisErrorT error = SA_AIS_OK; >>> +SaAisErrorT error = SA_AIS_OK; >>> +TRACE_ENTER(); >>> +error = saClmClusterTrackStop(avd_cb->clmHandle); >>> +if (error != SA_AIS_OK) { >>> +if (error == SA_AIS_ERR_TRY_AGAIN || error == >>> SA_AIS_ERR_TIMEOUT || >>> +error == SA_AIS_ERR_UNAVAILABLE) { >>> +LOG_WA("Failed to stop cluster tracking %u", error); >>> +} else { >>> +LOG_ER("Failed to stop cluster tracking %u", error); >>> +} >>> +} else { >>> +TRACE("Sucessfully stops cluster tracking"); >>> +avd_cb->is_clm_track_started = false; >>> +} >>> -TRACE_ENTER(); >>> -error = saClmClusterTrackStop(avd_cb->clmHandle); >>> -if (SA_AIS_OK != error) >>> -LOG_ER("Failed to stop cluster tracking %u", error); >>> -else >>> -TRACE("Sucessfully stops
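The "second option" discussed in this mail can be sketched in isolation. The following is a simplified model rather than the AMFD code: `TrackCtl`, `HaState` and the injected `stop_track` callable are hypothetical, with `stop_track` standing in for the call to `saClmClusterTrackStop()` and `track_started` mirroring the `is_clm_track_started` flag added by the patch.

```cpp
#include <cassert>
#include <functional>

enum class HaState { Active, Standby, Quiesced };
enum class Rc { Ok, TryAgain };

struct TrackCtl {
  HaState ha_state = HaState::Active;
  bool track_started = false;      // mirrors is_clm_track_started
  std::function<Rc()> stop_track;  // stand-in for the real stop call

  // Try to stop tracking; the flag is cleared only on success, so a
  // failed attempt stays visible to later callbacks.
  Rc stop() {
    assert(static_cast<bool>(stop_track));
    const Rc rc = stop_track();
    if (rc == Rc::Ok) track_started = false;
    return rc;
  }

  // Returns true if the callback payload should be processed.
  // A non-active instance ignores the payload and, if an earlier stop
  // failed, retries the stop from within the callback.
  bool on_track_callback() {
    if (ha_state != HaState::Active) {
      if (track_started) stop();
      return false;
    }
    return true;
  }
};
```

The point of the flag is that the retry costs nothing once the stop has succeeded: later stray callbacks on the non-active side are dropped without another call into CLM.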
Re: [devel] [PATCH 1 of 1] AMFD: Do not let non-active AMFD receives CLM track callback [#2141]
Hi Minh, We have not seen this issue earlier. Why CLMA is returning timeout where it is busy? If we are allowing Standby controller to continue tracking the Node, then I think (though not fully sure), it will increase one more client for validation step and this client does not exist locally. Also we need to evaluate the situation where Standby controller is CLM locked or disabled. Thanks, Praveen On 17-Nov-16 3:44 AM, minh chau wrote: > Hi all, > > Has anyone had chance to review this patch? > > Thanks, > Minh > > On 14/11/16 15:27, Minh Hon Chau wrote: >> osaf/services/saf/amf/amfd/clm.cc | 37 >> +++- >> osaf/services/saf/amf/amfd/include/cb.h | 1 + >> osaf/services/saf/amf/amfd/role.cc | 16 +++--- >> 3 files changed, 35 insertions(+), 19 deletions(-) >> >> >> In controller failover/switchover, sometimes active AMFD fails to stop >> CLM track callback. Therefore, when this AMFD become standby, AMFD can >> continue receiving CLM track callback and trigger the operations which >> should only be executed in active AMFD. >> >> diff --git a/osaf/services/saf/amf/amfd/clm.cc >> b/osaf/services/saf/amf/amfd/clm.cc >> --- a/osaf/services/saf/amf/amfd/clm.cc >> +++ b/osaf/services/saf/amf/amfd/clm.cc >> @@ -219,7 +219,13 @@ static void clm_track_cb(const SaClmClus >> LOG_ER("ClmTrackCallback received in error"); >> goto done; >> } >> - >> +if (avd_cb->avail_state_avd != SA_AMF_HA_ACTIVE) { >> +if (avd_cb->is_clm_track_started == true) { >> +LOG_NO("Retry to stop clm track with AMFD state(%d)", >> avd_cb->avail_state_avd); >> +avd_clm_track_stop(); >> +} >> +goto done; >> +} >> /* >> ** The CLM cluster can be larger than the AMF cluster thus it is >> not an >> ** error if the corresponding AMF node cannot be found. 
>> @@ -394,6 +400,7 @@ SaAisErrorT avd_clm_init(AVD_CL_CB* cb) >> cb->clmHandle = 0; >> cb->clm_sel_obj = 0; >> +cb->is_clm_track_started = false; >> TRACE_ENTER(); >> /* >>* TODO: This CLM initialization thread can be re-factored >> @@ -453,6 +460,8 @@ SaAisErrorT avd_clm_track_start(void) >> } else { >> LOG_ER("Failed to start cluster tracking %u", error); >> } >> +} else { >> +avd_cb->is_clm_track_started = true; >> } >> TRACE_LEAVE(); >> return error; >> @@ -460,17 +469,23 @@ SaAisErrorT avd_clm_track_start(void) >> SaAisErrorT avd_clm_track_stop(void) >> { >> -SaAisErrorT error = SA_AIS_OK; >> +SaAisErrorT error = SA_AIS_OK; >> +TRACE_ENTER(); >> +error = saClmClusterTrackStop(avd_cb->clmHandle); >> +if (error != SA_AIS_OK) { >> +if (error == SA_AIS_ERR_TRY_AGAIN || error == >> SA_AIS_ERR_TIMEOUT || >> +error == SA_AIS_ERR_UNAVAILABLE) { >> +LOG_WA("Failed to stop cluster tracking %u", error); >> +} else { >> +LOG_ER("Failed to stop cluster tracking %u", error); >> +} >> +} else { >> +TRACE("Sucessfully stops cluster tracking"); >> +avd_cb->is_clm_track_started = false; >> +} >> -TRACE_ENTER(); >> -error = saClmClusterTrackStop(avd_cb->clmHandle); >> -if (SA_AIS_OK != error) >> -LOG_ER("Failed to stop cluster tracking %u", error); >> -else >> -TRACE("Sucessfully stops cluster tracking"); >> - >> -TRACE_LEAVE(); >> -return error; >> +TRACE_LEAVE(); >> +return error; >> } >> void clm_node_terminate(AVD_AVND *node) >> diff --git a/osaf/services/saf/amf/amfd/include/cb.h >> b/osaf/services/saf/amf/amfd/include/cb.h >> --- a/osaf/services/saf/amf/amfd/include/cb.h >> +++ b/osaf/services/saf/amf/amfd/include/cb.h >> @@ -210,6 +210,7 @@ typedef struct cl_cb_tag { >> /* Clm stuff */ >> std::atomic clmHandle; >> std::atomic clm_sel_obj; >> +bool is_clm_track_started; >> bool fully_initialized; >> bool swap_switch; /* true - In middle of role switch. 
*/ >> diff --git a/osaf/services/saf/amf/amfd/role.cc >> b/osaf/services/saf/amf/amfd/role.cc >> --- a/osaf/services/saf/amf/amfd/role.cc >> +++ b/osaf/services/saf/amf/amfd/role.cc >> @@ -1050,9 +1050,7 @@ uint32_t amfd_switch_actv_qsd(AVD_CL_CB >> /* Mark AVD as Quiesced. */ >> cb->avail_state_avd = SA_AMF_HA_QUIESCED; >> >> -if (avd_clm_track_stop() != SA_AIS_OK) { >> -LOG_ER("ClmTrack stop failed"); >> -} >> +avd_clm_track_stop(); >> /* Go ahead and set mds role as already the NCS SU has been >> switched */ >> if (NCSCC_RC_SUCCESS != (rc = avd_mds_set_vdest_role(cb, >> SA_AMF_HA_QUIESCED))) { >> @@ -1260,11 +1258,13 @@ uint32_t amfd_switch_stdby_actv(AVD_CL_C >> if (NCSCC_RC_SUCCESS != avd_rde_set_role(SA_AMF_HA_ACTIVE)) { >> LOG_ER("rde role change failed from stdy -> Active"); >> } >> - >> -
Re: [devel] [PATCH 1 of 1] amfd: fix si-swap timeout [#2190]
Hi Praveen,

In the trace, for example, in SC-1, we can see the code was performed like:

Nov 14 11:02:27.743481 osafamfd [486:sg.cc:1693] >> set_fsm_state
...
Nov 14 11:02:27.744077 osafamfd [486:sg.cc:1716] TR Admin operation finishes on SI:'safSi=SC-2N,safApp=OpenSAF'
Nov 14 11:02:27.744090 osafamfd [486:imm.cc:1971] >> avd_saImmOiAdminOperationResult: inv:17179869185, res:1
Nov 14 11:02:27.744106 osafamfd [486:imm.cc:1976] << avd_saImmOiAdminOperationResult
Nov 14 11:02:27.744119 osafamfd [486:sg.cc:1724] << set_fsm_state

The traces map to the code:

> for (const auto& si : list_of_si) {
>     if (si->invocation != 0) {
>         TRACE("Admin operation finishes on SI:'%s'",si->name.c_str());
>         avd_saImmOiAdminOperationResult(avd_cb->immOiHandle,
>             si->invocation, SA_AIS_OK);
>         si->invocation = 0;
>     }
> }

So, the invocation here is reset.

Best regards,
Long Nguyen.

On 11/17/2016 4:15 PM, praveen malviya wrote:
> Hi Long,
>
> As per ticket, issue occurred during si-swap.
> Please update the ticket with brief analysis so that seeing the ticket
> one can have idea where the problem occurred.
> I guess si-swap initiated before AMFD could update to IMM.
> If that is the case, we should reject the admin operation from
> si_admin_op_cb() if invocationId is still set in Si.
> > Thanks, > Praveen > > On 16-Nov-16 4:19 PM, Long HB Nguyen wrote: >> osaf/services/saf/amf/amfd/imm.cc | 22 ++ >> osaf/services/saf/amf/amfd/include/imm.h | 2 ++ >> osaf/services/saf/amf/amfd/role.cc | 5 + >> 3 files changed, 29 insertions(+), 0 deletions(-) >> >> >> diff --git a/osaf/services/saf/amf/amfd/imm.cc >> b/osaf/services/saf/amf/amfd/imm.cc >> --- a/osaf/services/saf/amf/amfd/imm.cc >> +++ b/osaf/services/saf/amf/amfd/imm.cc >> @@ -423,6 +423,28 @@ AvdJobDequeueResultT Fifo::execute(const >> return ret; >> } >> >> +AvdJobDequeueResultT Fifo::executeAdminResp(const AVD_CL_CB *cb) >> +{ >> +Job *ajob; >> +AvdJobDequeueResultT ret = JOB_EXECUTED; >> + >> +TRACE_ENTER(); >> + >> +while ((ajob = peek()) != nullptr) { >> +if (dynamic_cast(ajob) != nullptr) { >> +ret = ajob->exec(cb); >> +} else { >> +ajob = dequeue(); >> +delete ajob; >> +ret = JOB_EXECUTED; >> +} >> +} >> + >> +TRACE_LEAVE2("%d", ret); >> + >> +return ret; >> +} >> + >> // >> void Fifo::empty() >> { >> diff --git a/osaf/services/saf/amf/amfd/include/imm.h >> b/osaf/services/saf/amf/amfd/include/imm.h >> --- a/osaf/services/saf/amf/amfd/include/imm.h >> +++ b/osaf/services/saf/amf/amfd/include/imm.h >> @@ -146,6 +146,8 @@ public: >> >> static AvdJobDequeueResultT execute(const AVD_CL_CB *cb); >> >> +static AvdJobDequeueResultT executeAdminResp(const AVD_CL_CB >> *cb); >> + >> static void empty(); >> >> static uint32_t size(); >> diff --git a/osaf/services/saf/amf/amfd/role.cc >> b/osaf/services/saf/amf/amfd/role.cc >> --- a/osaf/services/saf/amf/amfd/role.cc >> +++ b/osaf/services/saf/amf/amfd/role.cc >> @@ -766,6 +766,11 @@ void avd_mds_qsd_role_evh(AVD_CL_CB *cb, >> } >> >> try_again: >> +/* Execute admin op jobs before calling saImmOiImplementerClear >> to avoid >> + * SA_AIS_ERR_TIMEOUT >> + */ >> +Fifo::executeAdminResp(cb); >> + >> /* Take mutex here to sync with imm reinit thread.*/ >> osaf_mutex_lock_ordie(_reinit_mutex); >> /* Give up IMM OI implementer role */ >> > -- ___ 
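The idea behind `Fifo::executeAdminResp()` in the quoted patch, answering pending admin-operation jobs and discarding the rest before the implementer role is given up, can be sketched as below. This is an illustration under assumptions, not the AMFD implementation: `AdminRespJob` and `OtherJob` are hypothetical types (the job type named in the patch's `dynamic_cast` did not survive in this archive), and "executing" a job is reduced to recording its invocation id. Unlike the patch, which leaves dequeuing to the job's own `exec()`, this sketch pops every job itself so the loop always terminates.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <vector>

struct Job {
  virtual ~Job() = default;
};

// Hypothetical job carrying a pending admin-operation result.
struct AdminRespJob : Job {
  explicit AdminRespJob(int inv) : invocation(inv) {}
  int invocation;
};

// Hypothetical job of any other kind (for example an attribute update).
struct OtherJob : Job {};

// Walk the queue front to back: answer admin-response jobs, drop the rest.
// Returns the invocation ids that were answered, in queue order.
std::vector<int> drain_admin_responses(std::deque<std::unique_ptr<Job>>& fifo) {
  std::vector<int> answered;
  while (!fifo.empty()) {
    assert(fifo.front() != nullptr);
    if (auto* admin = dynamic_cast<AdminRespJob*>(fifo.front().get())) {
      answered.push_back(admin->invocation);  // stands in for exec()
    }
    fifo.pop_front();  // executed or discarded, the job leaves the queue
  }
  return answered;
}
```

Whether dropping the non-admin jobs is acceptable depends on what the queue holds at role change; the sketch only shows the filtering mechanics.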
Re: [devel] [PATCH 1 of 1] amfd: fix si-swap timeout [#2190]
Hi Long,

As per the ticket, the issue occurred during si-swap. Please update the ticket with a brief analysis, so that anyone reading the ticket can see where the problem occurred.

I guess the si-swap was initiated before AMFD could update IMM. If that is the case, we should reject the admin operation from si_admin_op_cb() if invocationId is still set in the SI.

Thanks,
Praveen

On 16-Nov-16 4:19 PM, Long HB Nguyen wrote:
>  osaf/services/saf/amf/amfd/imm.cc        |  22 ++
>  osaf/services/saf/amf/amfd/include/imm.h |   2 ++
>  osaf/services/saf/amf/amfd/role.cc       |   5 +
>  3 files changed, 29 insertions(+), 0 deletions(-)
>
>
> diff --git a/osaf/services/saf/amf/amfd/imm.cc b/osaf/services/saf/amf/amfd/imm.cc
> --- a/osaf/services/saf/amf/amfd/imm.cc
> +++ b/osaf/services/saf/amf/amfd/imm.cc
> @@ -423,6 +423,28 @@ AvdJobDequeueResultT Fifo::execute(const
>  	return ret;
>  }
>
> +AvdJobDequeueResultT Fifo::executeAdminResp(const AVD_CL_CB *cb)
> +{
> +	Job *ajob;
> +	AvdJobDequeueResultT ret = JOB_EXECUTED;
> +
> +	TRACE_ENTER();
> +
> +	while ((ajob = peek()) != nullptr) {
> +		if (dynamic_cast(ajob) != nullptr) {
> +			ret = ajob->exec(cb);
> +		} else {
> +			ajob = dequeue();
> +			delete ajob;
> +			ret = JOB_EXECUTED;
> +		}
> +	}
> +
> +	TRACE_LEAVE2("%d", ret);
> +
> +	return ret;
> +}
> +
>  //
>  void Fifo::empty()
>  {
> diff --git a/osaf/services/saf/amf/amfd/include/imm.h b/osaf/services/saf/amf/amfd/include/imm.h
> --- a/osaf/services/saf/amf/amfd/include/imm.h
> +++ b/osaf/services/saf/amf/amfd/include/imm.h
> @@ -146,6 +146,8 @@ public:
>
>  	static AvdJobDequeueResultT execute(const AVD_CL_CB *cb);
>
> +	static AvdJobDequeueResultT executeAdminResp(const AVD_CL_CB *cb);
> +
>  	static void empty();
>
>  	static uint32_t size();
> diff --git a/osaf/services/saf/amf/amfd/role.cc b/osaf/services/saf/amf/amfd/role.cc
> --- a/osaf/services/saf/amf/amfd/role.cc
> +++ b/osaf/services/saf/amf/amfd/role.cc
> @@ -766,6 +766,11 @@ void avd_mds_qsd_role_evh(AVD_CL_CB *cb,
>  	}
>
>  try_again:
> +	/* Execute admin op jobs before calling saImmOiImplementerClear to avoid
> +	 * SA_AIS_ERR_TIMEOUT
> +	 */
> +	Fifo::executeAdminResp(cb);
> +
>  	/* Take mutex here to sync with imm reinit thread.*/
>  	osaf_mutex_lock_ordie(_reinit_mutex);
>  	/* Give up IMM OI implementer role */
>
[devel] [PATCH 0 of 1] Review Request for amfnd: add cleanup command for proxy and proxied [#2186]
Summary: amfnd: add cleanup command for proxy and proxied [#2186]
Review request for Trac Ticket(s): #2186
Peer Reviewer(s): Amf Dev
Pull request to: <>
Affected branch(es): Default
Development branch: Default

--------------------------------
Impacted area       Impact y/n
--------------------------------
 Docs                    n
 Build system            n
 RPM/packaging           n
 Configuration files     n
 Startup scripts         n
 SAF services            y
 OpenSAF services        n
 Core libraries          n
 Samples                 n
 Tests                   n
 Other                   n


Comments (indicate scope for each "y" above):
---------------------------------------------
 <>

changeset 337c6dde5dfca14c8af4299a2a2598958bd1ea37
Author: Nagendra Kumar
Date:   Thu, 17 Nov 2016 13:41:09 +0530

	amfnd: add cleanup command for proxy and proxied [#2186]

	When a proxy and its proxied components are in the same SU, and the
	proxy fails and SU restart is escalated, the proxied components are
	not cleaned up. Added cleanup commands for proxied components whose
	proxy is not in a healthy state.
	Also, during opensafd stop, opensafd is not able to terminate the
	proxy. Added a fix for this as well.


Complete diffstat:
------------------
 osaf/services/saf/amf/amfnd/clc.cc  |  38 ++
 osaf/services/saf/amf/amfnd/susm.cc |  24 +++-
 2 files changed, 53 insertions(+), 9 deletions(-)


Testing Commands:
-----------------
For testing, please use the proxy.c and proxy.xml attached to the ticket.
proxy.xml has an SU with 1 proxy component and two proxied components.
saAmfSgtDefCompRestartMax is configured as 1.
1. immcfg -f proxy.xml
2. amf-adm unlock-in safSu=1,safSg=2N,safApp=Proxy; amf-adm unlock safSu=1,safSg=2N,safApp=Proxy
3. pkill -9 proxy and again pkill -9 proxy to restart SU.
4. amf-adm unlock-in safSu=1,safSg=2N,safApp=Proxy; amf-adm unlock safSu=1,safSg=2N,safApp=Proxy
5. amf-adm lock safSu=1,safSg=2N,safApp=Proxy
6. amf-adm unlock safSu=1,safSg=2N,safApp=Proxy
7. amf-adm restart safSu=1,safSg=2N,safApp=Proxy


Testing, Expected Results:
--------------------------
All steps #4 to #7 worked fine.


Conditions of Submission:
-------------------------
Ack from Amf Dev


Arch      Built     Started    Linux distro
-------------------------------------------
mips        n          n
mips64      n          n
x86         n          n
x86_64      y          y
powerpc     n          n
powerpc64   n          n


Reviewer Checklist:
-------------------
[Submitters: make sure that your review doesn't trigger any checkmarks!]


Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
    that need proper data filled in.

___ You have failed to nominate the proper persons for review and push.

___ Your patches do not have proper short+long header

___ You have grammar/spelling in your header that is unacceptable.

___ You have exceeded a sensible line length in your headers/comments/text.

___ You have failed to put in a proper Trac Ticket # into your commits.

___ You have incorrectly put/left internal data in your comments/files
    (i.e. internal bug tracking tool IDs, product names etc)

___ You have not given any evidence of testing beyond basic build tests.
    Demonstrate some level of runtime or other sanity testing.

___ You have ^M present in some of your files. These have to be removed.

___ You have needlessly changed whitespace or added whitespace crimes
    like trailing spaces, or spaces before tabs.

___ You have mixed real technical changes with whitespace and other
    cosmetic code cleanup changes. These have to be separate commits.

___ You need to refactor your submission into logical chunks; there is
    too much content into a single commit.

___ You have extraneous garbage in your review (merge commits etc)

___ You have giant attachments which should never have been sent;
    Instead you should place your content in a public tree to be pulled.

___ You have too many commits attached to an e-mail; resend as threaded
    commits, or place in a public tree for a pull.

___ You have resent this content multiple times without a clear indication
    of what has changed between each re-send.

___ You have failed to adequately and individually address all of the
    comments and change requests that were proposed in the initial review.

___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)

___ Your computer has a badly configured date and time; confusing the
    threaded patch review.

___ Your changes affect IPC mechanism, and you don't present any results
    for in-service upgradability test.

___ Your changes affect user manual and documentation, your patch series
    do not contain the patch that updates the Doxygen manual.
[devel] [PATCH 1 of 1] amfnd: add cleanup command for proxy and proxied [#2186]
 osaf/services/saf/amf/amfnd/clc.cc  |  38 +---
 osaf/services/saf/amf/amfnd/susm.cc |  24 ++-
 2 files changed, 53 insertions(+), 9 deletions(-)


When a proxy and its proxied components are in the same SU, and the proxy fails and SU restart is escalated, the proxied components are not cleaned up. Added cleanup commands for proxied components whose proxy is not in a healthy state.
Also, during opensafd stop, opensafd is not able to terminate the proxy. Added a fix for this as well.

diff --git a/osaf/services/saf/amf/amfnd/clc.cc b/osaf/services/saf/amf/amfnd/clc.cc
--- a/osaf/services/saf/amf/amfnd/clc.cc
+++ b/osaf/services/saf/amf/amfnd/clc.cc
@@ -2020,26 +2020,48 @@ uint32_t avnd_comp_clc_inst_clean_hdler(
 	if (m_AVND_COMP_TYPE_IS_PROXIED(comp)) {
 		avnd_comp_cbq_del(cb, comp, true);
-		/* call the cleanup callback */
-		rc = avnd_comp_cbk_send(cb, comp, AVSV_AMF_PXIED_COMP_CLEAN, 0, 0);
+		/* Check whether
+		   1. it is a case of SU restart escalation.
+		   2. it is a case of proxy-proxied residing in the same SU.
+		   3. the proxy of this component is in good state to accept the callback.*/
+
+		/* Check if the proxy and proxied components are in the same SUs. */
+		if ((m_AVND_SU_IS_RESTART(comp->su) && m_AVND_SU_IS_FAILED(comp->su)) && /* #1. */
+				((comp->pxy_comp) && (comp->su == comp->pxy_comp->su)) && /* #2 */
+				(m_AVND_COMP_IS_FAILED(comp->pxy_comp))) { /* #3 */
+			/* If proxy is not in good shape, we can't issue callback, so issue cleanup
+			   (assume it is local component). */
+			rc = avnd_comp_clc_cmd_execute(cb, comp, AVND_COMP_CLC_CMD_TYPE_CLEANUP);
+		} else {
+			/* call the cleanup callback */
+			rc = avnd_comp_cbk_send(cb, comp, AVSV_AMF_PXIED_COMP_CLEAN, 0, 0);
+		}
 	} else if (m_AVND_COMP_TYPE_IS_PROXY(comp) && comp->pxied_list.n_nodes) {
 		/* if there are still outstanding proxied components we can't terminate right now */
 		/* Check if the proxy and proxied components are in the same SUs. */
 		TRACE("Proxy has proxied components: %u", comp->pxied_list.n_nodes);
 		AVND_COMP_PXIED_REC *rec;
 		rec = (AVND_COMP_PXIED_REC *)m_NCS_DBLIST_FIND_FIRST(>pxied_list);
-while (rec) {
-if (comp->su == rec->pxied_comp->su)
+		while (rec) {
+			if (comp->su == rec->pxied_comp->su)
 				break;
-rec = (AVND_COMP_PXIED_REC *)m_NCS_DBLIST_FIND_NEXT(>comp_dll_node);
-}
+			rec = (AVND_COMP_PXIED_REC *)m_NCS_DBLIST_FIND_NEXT(>comp_dll_node);
+		}
 		if (rec == nullptr) {
 			TRACE("Proxy and proxied are not in the same SU.");
 			/* This means that proxy and proxied components are not in the same SU. That means that we can cleanup the component. */
 			rc = avnd_comp_clc_cmd_execute(cb, comp, AVND_COMP_CLC_CMD_TYPE_CLEANUP);
-		} else
-			return rc;
+		} else {
+			TRACE("Proxy and proxied are in the same SU.");
+			if (((m_AVND_SU_IS_RESTART(comp->su) && m_AVND_SU_IS_FAILED(comp->su)) &&
+						(m_AVND_COMP_IS_FAILED(comp))) ||
+					(cb->term_state == AVND_TERM_STATE_OPENSAF_SHUTDOWN_STARTED)) {
+				/* If proxy is not in good shape, so issue cleanup. */
+				rc = avnd_comp_clc_cmd_execute(cb, comp, AVND_COMP_CLC_CMD_TYPE_CLEANUP);
+			} else
+				goto done;
+		}
 	} else {
 		if (m_AVND_SU_IS_RESTART(comp->su) && m_AVND_COMP_IS_RESTART_DIS(comp) &&
 				(comp->csi_list.n_nodes > 0) &&
diff --git a/osaf/services/saf/amf/amfnd/susm.cc b/osaf/services/saf/amf/amfnd/susm.cc
--- a/osaf/services/saf/amf/amfnd/susm.cc
+++ b/osaf/services/saf/amf/amfnd/susm.cc
@@ -2601,8 +2601,30 @@ uint32_t avnd_su_pres_inst_comprestart_h
 				if (NCSCC_RC_SUCCESS != rc)
 					goto done;
 				break;
+			} else {
+				/*
+				   For a NPI comp in SU, component FSM is always triggered
+				   at the time of assignments. If this component is
+				   non-restartable then start
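The three-part condition the patch adds for a proxied component (SU restart escalation, proxy in the same SU, proxy failed) reads more easily as a small pure function. The sketch below is illustrative only: `Su`, `Comp`, `CleanupAction` and `proxied_cleanup_action` are hypothetical simplifications of the amfnd structures and macros, not their real definitions.

```cpp
#include <cassert>

enum class CleanupAction { CallbackViaProxy, RunCleanupCommand };

// Hypothetical, heavily reduced views of an SU and a component.
struct Su {
  bool restarting = false;
  bool failed = false;
};

struct Comp {
  const Su* su = nullptr;
  const Comp* proxy = nullptr;  // null when the component has no proxy
  bool failed = false;
};

// Decide how a proxied component is cleaned up. The cleanup callback goes
// through the proxy unless all three conditions from the patch hold, in
// which case the cleanup command is run directly.
CleanupAction proxied_cleanup_action(const Comp& proxied) {
  assert(proxied.su != nullptr);
  const bool su_restart_escalated =
      proxied.su->restarting && proxied.su->failed;                      // #1
  const bool same_su =
      proxied.proxy != nullptr && proxied.proxy->su == proxied.su;       // #2
  const bool proxy_failed = same_su && proxied.proxy->failed;            // #3
  return (su_restart_escalated && same_su && proxy_failed)
             ? CleanupAction::RunCleanupCommand
             : CleanupAction::CallbackViaProxy;
}
```

Keeping the decision separate from the side effects makes it straightforward to check each of the three conditions on its own, which is harder to do inside the clean handler.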
Re: [devel] [PATCH 1 of 1] ntf: handle error code TRY_AGAIN of saClmInitialize() [#2191]
Hi Vu,

Please find my query inline, marked [Praveen].

Thanks,
Praveen

On 17-Nov-16 11:58 AM, Vu Minh Nguyen wrote:
>  osaf/services/saf/ntfsv/ntfs/ntfs_clm.c |  14 +-
>  1 files changed, 13 insertions(+), 1 deletions(-)
>
>
> NTF did not deal with TRY_AGAIN error code of `saClmInitialize()`,
> NTF would exit, and cause node reboot if getting TRY_AGAIN.
>
> The patch adds a while loop to do retry when getting TRY_AGAIN.
>
> diff --git a/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c b/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c
> --- a/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c
> +++ b/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c
> @@ -101,13 +101,25 @@ void *ntfs_clm_init_thread(void *cb)
>  {
>          ntfs_cb_t *_ntfs_cb = (ntfs_cb_t *) cb;
>          SaAisErrorT rc = SA_AIS_OK;
> +        uint32_t msecs_waited = 0;
> +        const uint32_t max_waiting_time_10s = 10 * 1000; /* 10 secs */
> +
>          TRACE_ENTER();
> +
>          rc = saClmInitialize_4(&_ntfs_cb->clm_hdl, _callbacks, );
> +        while (((rc == SA_AIS_ERR_TRY_AGAIN) || (rc == SA_AIS_ERR_TIMEOUT) ||
> +                (rc == SA_AIS_ERR_UNAVAILABLE)) &&

[Praveen] As I recall from the discussion on ticket #1828 about CLM initialization, this API does not return ERR_UNAVAILABLE even on a CLM-locked node. That provision lets an application on a CLM-locked node track the CLM status of nodes.

However, I think the CLM agent can still return this code when the CLMS server has not yet received node-related information from the CLM agent; in that case CLMS will return ERR_UNAVAILABLE. If that is not so, then I think we do not need to handle rc == SA_AIS_ERR_UNAVAILABLE.

Thanks,
Praveen

> +               (msecs_waited < max_waiting_time_10s)) {
> +                usleep(100*1000);
> +                msecs_waited += 100;
> +                rc = saClmInitialize_4(&_ntfs_cb->clm_hdl, _callbacks, );
> +        }
>          if (rc != SA_AIS_OK) {
>                  LOG_ER("saClmInitialize failed with error: %d", rc);
>                  TRACE_LEAVE();
> -exit(EXIT_FAILURE);
> +                exit(EXIT_FAILURE);
>          }
> +
>          rc = saClmSelectionObjectGet(_ntfs_cb->clm_hdl, _cb->clmSelectionObject);
>          if (rc != SA_AIS_OK) {
>                  LOG_ER("saClmSelectionObjectGet failed with error: %d", rc);
>
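
The bounded retry that the patch adds around `saClmInitialize_4()` can be sketched in isolation as below. This is only an illustration, not the NTF code: `Status`, `IsTransient` and `RetryInit` are hypothetical names, the enum does not use the real `SaAisErrorT` values, and the set of transient codes follows the review outcome (TRY_AGAIN and TIMEOUT, without UNAVAILABLE).

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical stand-ins for the SaAisErrorT codes discussed in the thread.
// These are NOT the numeric values defined by the SA Forum headers.
enum class Status { kOk, kTryAgain, kTimeout, kOther };

// Transient results that are worth waiting out before giving up.
inline bool IsTransient(Status s) {
  return s == Status::kTryAgain || s == Status::kTimeout;
}

// Calls `init` until it returns a non-transient status or until
// `max_wait_ms` of sleeping has accumulated, pausing `step_ms` between
// attempts. Returns the last status seen.
inline Status RetryInit(const std::function<Status()>& init,
                        uint32_t max_wait_ms, uint32_t step_ms) {
  assert(step_ms > 0);  // a zero step would never reach the time limit
  uint32_t waited_ms = 0;
  Status rc = init();
  while (IsTransient(rc) && waited_ms < max_wait_ms) {
    std::this_thread::sleep_for(std::chrono::milliseconds(step_ms));
    waited_ms += step_ms;
    rc = init();
  }
  return rc;
}
```

With `max_wait_ms = 10000` and `step_ms = 100` this mirrors the 10 second limit and 100 ms pause in the patch; the caller still has to treat any result other than `kOk` as a failure, as the existing `LOG_ER` and `exit` path does.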
Re: [devel] [PATCH 2 of 2] base: Use the Mutex and Lock classes [#2173]
Ack.

Thanks,
Ramesh.

On 11/8/2016 8:50 PM, Anders Widell wrote:
>  osaf/libs/core/cplusplus/base/process.cc     |  31 ---
>  osaf/libs/core/cplusplus/base/process.h      |  19 +
>  osaf/libs/core/cplusplus/base/unix_socket.cc |  17 +-
>  osaf/libs/core/cplusplus/base/unix_socket.h  |   3 +-
>  4 files changed, 32 insertions(+), 38 deletions(-)
>
>
> Use the new Mutex and Lock classes instead of std::mutex, std::unique_lock and
> pthread_mutex_t.
>
> diff --git a/osaf/libs/core/cplusplus/base/process.cc b/osaf/libs/core/cplusplus/base/process.cc
> --- a/osaf/libs/core/cplusplus/base/process.cc
> +++ b/osaf/libs/core/cplusplus/base/process.cc
> @@ -50,10 +50,12 @@ Process::~Process() {
>  }
>
>  void Process::Kill(int sig_no, const Duration& wait_time) {
> -  std::unique_lock lock(mutex_);
> -  pid_t pgid = process_group_id_;
> -  process_group_id_ = 0;
> -  lock.unlock();
> +  pid_t pgid;
> +  {
> +    Lock lock(mutex_);
> +    pgid = process_group_id_;
> +    process_group_id_ = 0;
> +  }
>    if (pgid > 1) KillProc(-pgid, sig_no, wait_time);
>  }
>
> @@ -119,20 +121,23 @@ void Process::Execute(int argc, char *ar
>        LOG_WA("setpgid(%u, 0) failed: %s",
>               (unsigned) child_pid, strerror(errno));
>      }
> -    std::unique_lock lock(mutex_);
> -    process_group_id_ = child_pid;
> -    StartTimer(timeout);
> -    lock.unlock();
> +    {
> +      Lock lock(mutex_);
> +      process_group_id_ = child_pid;
> +      StartTimer(timeout);
> +    }
>      int status;
>      pid_t wait_pid;
>      do {
>        wait_pid = waitpid(child_pid, , 0);
>      } while (wait_pid == (pid_t) -1 && errno == EINTR);
> -    lock.lock();
> -    StopTimer();
> -    bool killed = process_group_id_ == 0;
> -    process_group_id_ = 0;
> -    lock.unlock();
> +    bool killed;
> +    {
> +      Lock lock(mutex_);
> +      StopTimer();
> +      killed = process_group_id_ == 0;
> +      process_group_id_ = 0;
> +    }
>      bool successful = false;
>      if (!killed && wait_pid != (pid_t) -1) {
>        if (!WIFEXITED(status)) {
> diff --git a/osaf/libs/core/cplusplus/base/process.h b/osaf/libs/core/cplusplus/base/process.h
> --- a/osaf/libs/core/cplusplus/base/process.h
> +++ b/osaf/libs/core/cplusplus/base/process.h
> @@ -15,20 +15,19 @@
>   *
>   */
>
> -#ifndef OPENSAF_OSAF_LIBS_CORE_CPLUSPLUS_BASE_PROCESS_H_
> -#define OPENSAF_OSAF_LIBS_CORE_CPLUSPLUS_BASE_PROCESS_H_
> +#ifndef OSAF_LIBS_CORE_CPLUSPLUS_BASE_PROCESS_H_
> +#define OSAF_LIBS_CORE_CPLUSPLUS_BASE_PROCESS_H_
>
> +#include
>  #include
>  #include
>  #include
> -#include
> -#include
> -#include "base/macros.h"
> +#include "osaf/libs/core/cplusplus/base/macros.h"
> +#include "osaf/libs/core/cplusplus/base/mutex.h"
>
>  namespace base {
>
>  class Process {
> - DELETE_COPY_AND_MOVE_OPERATORS(Process);
>   public:
>    using Duration = std::chrono::steady_clock::duration;
>    Process();
> @@ -96,18 +95,20 @@ class Process {
>     * processes in the process group have terminated.
>     */
>    static void KillProc(pid_t pid, int sig_no, const Duration& wait_time);
> +
>   private:
>    using Clock = std::chrono::steady_clock;
>    using TimePoint = std::chrono::steady_clock::time_point;
>    void StartTimer(const Duration& timeout);
>    void StopTimer();
>    static void TimerExpirationEvent(sigval notification_data);
> -  std::mutex mutex_;
> +  Mutex mutex_;
>    timer_t timer_id_;
>    bool is_timer_created_;
>    pid_t process_group_id_;
> +  DELETE_COPY_AND_MOVE_OPERATORS(Process);
>  };
>
> -} // namespace base
> +}  // namespace base
>
> -#endif /* OPENSAF_OSAF_LIBS_CORE_CPLUSPLUS_BASE_PROCESS_H_ */
> +#endif  // OSAF_LIBS_CORE_CPLUSPLUS_BASE_PROCESS_H_
> diff --git a/osaf/libs/core/cplusplus/base/unix_socket.cc b/osaf/libs/core/cplusplus/base/unix_socket.cc
> --- a/osaf/libs/core/cplusplus/base/unix_socket.cc
> +++ b/osaf/libs/core/cplusplus/base/unix_socket.cc
> @@ -32,19 +32,10 @@ UnixSocket::UnixSocket(const std::string
>    } else {
>      addr_.sun_path[0] = '\0';
>    }
> -  pthread_mutexattr_t attr;
> -  int result = pthread_mutexattr_init();
> -  if (result != 0) osaf_abort(result);
> -  result = pthread_mutexattr_setprotocol(, PTHREAD_PRIO_INHERIT);
> -  if (result != 0) osaf_abort(result);
> -  result = pthread_mutex_init(_, );
> -  if (result != 0) osaf_abort(result);
> -  result = pthread_mutexattr_destroy();
> -  if (result != 0) osaf_abort(result);
>  }
>
>  int UnixSocket::Open() {
> -  osaf_mutex_lock_ordie(_);
> +  Lock lock(mutex_);
>    int sock = fd_;
>    if (sock < 0) {
>      if (addr_.sun_path[0] != '\0') {
> @@ -60,18 +51,15 @@ int UnixSocket::Open() {
>        errno = ENAMETOOLONG;
>      }
>    }
> -  osaf_mutex_unlock_ordie(_);
>    return sock;
>  }
>
>  UnixSocket::~UnixSocket() {
>    Close();
> -  int result = pthread_mutex_destroy(_);
> -  if (result !=
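
The scoped-lock pattern that this patch switches to (take the lock in a nested block, copy what is needed, let the destructor release it) can be sketched as below. The interface of the new `base::Mutex` and `base::Lock` classes is not shown in this thread, so the sketch uses `std::mutex` and `std::lock_guard` as stand-ins; `GroupTracker`, `Set` and `Take` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>

// Illustrative stand-in for the process-group bookkeeping in Process.
// std::mutex / std::lock_guard are assumptions used in place of the
// base::Mutex / base::Lock classes, whose definitions are not in this mail.
class GroupTracker {
 public:
  void Set(int pgid) {
    assert(pgid >= 0);  // 0 means "no group", as in the patch
    std::lock_guard<std::mutex> lock(mutex_);
    pgid_ = pgid;
  }

  // Reads and clears the stored id inside a nested scope, so the lock is
  // already released when the caller acts on the returned value.
  int Take() {
    int pgid;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      pgid = pgid_;
      pgid_ = 0;
    }  // lock released here by the guard's destructor
    return pgid;
  }

 private:
  std::mutex mutex_;
  int pgid_ = 0;
};
```

Compared with explicit `lock.unlock()` calls, the nested block makes the critical section visible in the code layout and releases the lock on every exit path, including early returns.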
Re: [devel] [PATCH 1 of 1] mds: Improve log message [#2193]
Hi HansN,

ACK, not tested.

-AVM

On 11/17/2016 12:59 PM, Hans Nordeback wrote:
>  osaf/libs/core/mds/mds_dt_tipc.c |  7 +--
>  1 files changed, 5 insertions(+), 2 deletions(-)
>
>
> diff --git a/osaf/libs/core/mds/mds_dt_tipc.c b/osaf/libs/core/mds/mds_dt_tipc.c
> --- a/osaf/libs/core/mds/mds_dt_tipc.c
> +++ b/osaf/libs/core/mds/mds_dt_tipc.c
> @@ -2497,7 +2497,7 @@ static uint32_t mdtm_sendto(uint8_t *buf
>  {
>          /* Can be made as macro even */
>          struct sockaddr_tipc server_addr;
> -        int send_len = 0;
> +        ssize_t send_len = 0;
>  #ifdef MDS_CHECKSUM_ENABLE_FLAG
>          uint16_t checksum = 0;
>  #endif
> @@ -2524,8 +2524,11 @@ static uint32_t mdtm_sendto(uint8_t *buf
>          if (send_len == buff_len) {
>                  m_MDS_LOG_INFO("MDTM: Successfully sent message");
>                  return NCSCC_RC_SUCCESS;
> +        } else if (send_len == -1) {
> +                m_MDS_LOG_ERR("MDTM: Failed to send message err :%s", strerror(errno));
> +                return NCSCC_RC_FAILURE;
>          } else {
> -                m_MDS_LOG_ERR("MDTM: Failed to send message err :%s", strerror(errno));
> +                m_MDS_LOG_ERR("MDTM: Failed to send message send_len :%zd", send_len);
>                  return NCSCC_RC_FAILURE;
>          }
>  }
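
The distinction the patch draws is that `errno` is only meaningful when the send call returns -1; a short send has no error code, so the length is logged instead. A minimal sketch of that three-way classification follows. `DescribeSend` is a hypothetical helper and the message strings are made up for illustration; they are not the MDS log texts.

```cpp
#include <sys/types.h>

#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Builds a description for a sendto()-style return value. `err` is the
// errno value captured right after the call; it is consulted only when the
// call reported failure (-1), because it is not set for a short send.
inline std::string DescribeSend(ssize_t send_len, size_t buff_len, int err) {
  assert(send_len >= -1);  // sendto() returns -1 or a byte count
  if (send_len >= 0 && static_cast<size_t>(send_len) == buff_len) {
    return "sent";
  }
  if (send_len == -1) {
    return std::string("failed: ") + std::strerror(err);
  }
  return "short send: " + std::to_string(send_len) + " of " +
         std::to_string(buff_len);
}
```

Keeping the length in `ssize_t`, as the patch does, matters here: it matches the return type of `sendto()` and lets the `%zd` format print the short-send length without a cast.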