Re: [devel] [PATCH 1 of 1] ntf: handle error code TRY_AGAIN of saClmInitialize() [#2191]

2016-11-17 Thread praveen malviya
Ack, code review only.

Thanks,
Praveen

On 18-Nov-16 8:58 AM, Vu Minh Nguyen wrote:
> Thanks Praveen. I will remove ` SA_AIS_ERR_UNAVAILABLE` error code checking.
>
> Regards, Vu
>
>> -Original Message-
>> From: praveen malviya [mailto:praveen.malv...@oracle.com]
>> Sent: Thursday, November 17, 2016 4:07 PM
>> To: Vu Minh Nguyen ;
>> minh.c...@dektech.com.au
>> Cc: opensaf-devel@lists.sourceforge.net
>> Subject: Re: [PATCH 1 of 1] ntf: handle error code TRY_AGAIN of
>> saClmInitialize() [#2191]
>>
>> Hi Vu,
>>
>> Please find query inline [Praveen].
>>
>> Thanks,
>> Praveen
>>
>> On 17-Nov-16 11:58 AM, Vu Minh Nguyen wrote:
>>>  osaf/services/saf/ntfsv/ntfs/ntfs_clm.c |  14 +-
>>>  1 files changed, 13 insertions(+), 1 deletions(-)
>>>
>>>
>>> NTF did not deal with TRY_AGAIN error code of `saClmInitialize()`,
>>> NTF would exit, and cause node reboot if getting TRY_AGAIN.
>>>
>>> The patch adds a while loop to do retry when getting TRY_AGAIN.
>>>
>>> diff --git a/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c
>> b/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c
>>> --- a/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c
>>> +++ b/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c
>>> @@ -101,13 +101,25 @@ void *ntfs_clm_init_thread(void *cb)
>>>  {
>>> ntfs_cb_t *_ntfs_cb = (ntfs_cb_t *) cb;
>>> SaAisErrorT rc = SA_AIS_OK;
>>> +   uint32_t msecs_waited = 0;
>>> +   const uint32_t max_waiting_time_10s = 10 * 1000; /* 10 secs */
>>> +
>>> TRACE_ENTER();
>>> +
>>> rc = saClmInitialize_4(&_ntfs_cb->clm_hdl, _callbacks,
>> );
>>> +   while (((rc == SA_AIS_ERR_TRY_AGAIN) || (rc ==
>> SA_AIS_ERR_TIMEOUT) ||
>>> +   (rc == SA_AIS_ERR_UNAVAILABLE)) &&
>> [Praveen] Based on discussion on ticket #1828 related to CLM
>> initialization, I remember this API does not return ERR_UNAVAILABLE even
>> on a CLM locked node also. This provision is to enable application on
>> CLM locked node to perform tracking of CLM status of nodes.
>>
>> But I think even CLM Agent can return this code when CLMS server has not
>> got node related information from CLM agent and in that case CLMS will
>> return ERR_UNAVAILABLE. If it is not so, then I think we are not
>> required to handle rc == SA_AIS_ERR_UNAVAILABLE.
>>
>> Thanks,
>> Praveen
>>> +  (msecs_waited < max_waiting_time_10s)) {
>>> +   usleep(100*1000);
>>> +   msecs_waited += 100;
>>> +   rc = saClmInitialize_4(&_ntfs_cb->clm_hdl, _callbacks,
>> );
>>> +   }
>>> if (rc != SA_AIS_OK) {
>>> LOG_ER("saClmInitialize failed with error: %d", rc);
>>> TRACE_LEAVE();
>>> -exit(EXIT_FAILURE);
>>> +   exit(EXIT_FAILURE);
>>> }
>>> +
>>> rc = saClmSelectionObjectGet(_ntfs_cb->clm_hdl, _cb-
>>> clmSelectionObject);
>>> if (rc != SA_AIS_OK) {
>>> LOG_ER("saClmSelectionObjectGet failed with error: %d",
>> rc);
>>>
>

--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


Re: [devel] [PATCH 1 of 1] ntf: handle error code TRY_AGAIN of saClmInitialize() [#2191]

2016-11-17 Thread Vu Minh Nguyen
Thanks Praveen. I will remove the `SA_AIS_ERR_UNAVAILABLE` error code check.

Regards, Vu
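
The bounded retry discussed in this thread can be sketched in isolation as follows. This is a minimal sketch, not the NTF code: `clm_rc_t`, `fake_clm_initialize()` and `init_with_retry()` are hypothetical stand-ins (the real call is `saClmInitialize_4()`), and, following the review outcome above, only TRY_AGAIN and TIMEOUT are treated as retryable.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-ins for the SaAisErrorT codes named in the thread. */
typedef enum { RC_OK, RC_TRY_AGAIN, RC_TIMEOUT, RC_FAILED } clm_rc_t;

/* Hypothetical initializer type; the real call is saClmInitialize_4(). */
typedef clm_rc_t (*init_fn_t)(void *ctx);

/* Stub that reports TRY_AGAIN until the counter behind ctx reaches zero. */
clm_rc_t fake_clm_initialize(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return RC_TRY_AGAIN;
	}
	return RC_OK;
}

/* Retry only transient errors, sleeping step_ms between attempts,
 * and give up once max_wait_ms has been spent waiting. */
clm_rc_t init_with_retry(init_fn_t init, void *ctx,
			 uint32_t step_ms, uint32_t max_wait_ms)
{
	uint32_t waited_ms = 0;
	clm_rc_t rc = init(ctx);

	while ((rc == RC_TRY_AGAIN || rc == RC_TIMEOUT) &&
	       waited_ms < max_wait_ms) {
		usleep(step_ms * 1000);
		waited_ms += step_ms;
		rc = init(ctx);
	}
	return rc;
}
```

With `step_ms = 100` and `max_wait_ms = 10 * 1000` this matches the timing used in the patch; the caller still has to decide what to do when the last return code is not OK.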

> -Original Message-
> From: praveen malviya [mailto:praveen.malv...@oracle.com]
> Sent: Thursday, November 17, 2016 4:07 PM
> To: Vu Minh Nguyen ;
> minh.c...@dektech.com.au
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [PATCH 1 of 1] ntf: handle error code TRY_AGAIN of
> saClmInitialize() [#2191]
> 
> Hi Vu,
> 
> Please find query inline [Praveen].
> 
> Thanks,
> Praveen
> 
> On 17-Nov-16 11:58 AM, Vu Minh Nguyen wrote:
> >  osaf/services/saf/ntfsv/ntfs/ntfs_clm.c |  14 +-
> >  1 files changed, 13 insertions(+), 1 deletions(-)
> >
> >
> > NTF did not deal with TRY_AGAIN error code of `saClmInitialize()`,
> > NTF would exit, and cause node reboot if getting TRY_AGAIN.
> >
> > The patch adds a while loop to do retry when getting TRY_AGAIN.
> >
> > diff --git a/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c
> b/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c
> > --- a/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c
> > +++ b/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c
> > @@ -101,13 +101,25 @@ void *ntfs_clm_init_thread(void *cb)
> >  {
> > ntfs_cb_t *_ntfs_cb = (ntfs_cb_t *) cb;
> > SaAisErrorT rc = SA_AIS_OK;
> > +   uint32_t msecs_waited = 0;
> > +   const uint32_t max_waiting_time_10s = 10 * 1000; /* 10 secs */
> > +
> > TRACE_ENTER();
> > +
> > rc = saClmInitialize_4(&_ntfs_cb->clm_hdl, _callbacks,
> );
> > +   while (((rc == SA_AIS_ERR_TRY_AGAIN) || (rc ==
> SA_AIS_ERR_TIMEOUT) ||
> > +   (rc == SA_AIS_ERR_UNAVAILABLE)) &&
> [Praveen] Based on discussion on ticket #1828 related to CLM
> initialization, I remember this API does not return ERR_UNAVAILABLE even
> on a CLM locked node also. This provision is to enable application on
> CLM locked node to perform tracking of CLM status of nodes.
> 
> But I think even CLM Agent can return this code when CLMS server has not
> got node related information from CLM agent and in that case CLMS will
> return ERR_UNAVAILABLE. If it is not so, then I think we are not
> required to handle rc == SA_AIS_ERR_UNAVAILABLE.
> 
> Thanks,
> Praveen
> > +  (msecs_waited < max_waiting_time_10s)) {
> > +   usleep(100*1000);
> > +   msecs_waited += 100;
> > +   rc = saClmInitialize_4(&_ntfs_cb->clm_hdl, _callbacks,
> );
> > +   }
> > if (rc != SA_AIS_OK) {
> > LOG_ER("saClmInitialize failed with error: %d", rc);
> > TRACE_LEAVE();
> > -exit(EXIT_FAILURE);
> > +   exit(EXIT_FAILURE);
> > }
> > +
> > rc = saClmSelectionObjectGet(_ntfs_cb->clm_hdl, _cb-
> >clmSelectionObject);
> > if (rc != SA_AIS_OK) {
> > LOG_ER("saClmSelectionObjectGet failed with error: %d",
> rc);
> >




Re: [devel] [PATCH 1 of 1] cpsv: on cpnd down fist remove child safReplica object then parent safCkpt object [#2189]

2016-11-17 Thread Vo Minh Hoang
Dear Mahesh,

Reviewed and tested with the collocated and non-collocated cases. The problem
is fixed, and I could not find any further occurrence.

So ACK from me, tested.

Sincerely,
Hoang

-----Original Message-----
From: mahesh.va...@oracle.com [mailto:mahesh.va...@oracle.com] 
Sent: Wednesday, November 16, 2016 3:58 PM
To: hoang.m...@dektech.com.au
Cc: opensaf-devel@lists.sourceforge.net
Subject: [PATCH 1 of 1] cpsv: on cpnd down fist remove child safReplica
object then parent safCkpt object [#2189]

 osaf/services/saf/cpsv/cpd/cpd_imm.c  |  20 
 osaf/services/saf/cpsv/cpd/cpd_proc.c |   9 -
 2 files changed, 20 insertions(+), 9 deletions(-)


Bug:
While cpd processes a cpnd down event for a COLLOCATED ckpt that exists only
on the cpnd that went down (no other node in the cluster has opened this
checkpoint), cpd removes that checkpoint and its replica completely.

In this case the current logic has a bug: it first removes the ckpt node and
then the replica, which deletes the parent object safCkpt=...,* first and
the child object safReplica=...,safCkpt=...,* next.

As we know, IMM removes the child when the parent is removed, so the
out-of-sequence removal of the safReplica object fails and ERR_NOT_EXIST is
returned.

Fix:
When cpd removes that checkpoint and replica completely, follow the
sequence of child object safReplica=...,safCkpt=...,* first, then parent
object safCkpt=...,* next.

This is a focused fix; we may need to review the complete code for such
occurrences, and any that are found will be addressed in a new ticket.
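
The ordering problem can be reproduced with a toy model. The sketch below is an assumption-laden illustration, not IMM or cpd code: `store_t`, `delete_ckpt()` and `delete_replica()` are hypothetical names, and the only behavior modeled is that deleting a parent also removes its child.

```c
#include <stdbool.h>

/* Hypothetical result codes standing in for SA_AIS_OK and
 * SA_AIS_ERR_NOT_EXIST. */
typedef enum { DEL_OK, DEL_NOT_EXIST } del_rc_t;

/* Toy model of one checkpoint object with a single replica child. */
typedef struct {
	bool ckpt_exists;    /* parent: safCkpt=... */
	bool replica_exists; /* child: safReplica=...,safCkpt=... */
} store_t;

/* Deleting the parent cascades to the child. */
del_rc_t delete_ckpt(store_t *s)
{
	if (!s->ckpt_exists)
		return DEL_NOT_EXIST;
	s->ckpt_exists = false;
	s->replica_exists = false; /* child goes away with the parent */
	return DEL_OK;
}

/* Deleting the child fails if an earlier parent delete already removed it. */
del_rc_t delete_replica(store_t *s)
{
	if (!s->replica_exists)
		return DEL_NOT_EXIST;
	s->replica_exists = false;
	return DEL_OK;
}
```

Calling `delete_ckpt()` and then `delete_replica()` yields `DEL_NOT_EXIST` on the second call, while the child-first order succeeds twice.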

diff --git a/osaf/services/saf/cpsv/cpd/cpd_imm.c
b/osaf/services/saf/cpsv/cpd/cpd_imm.c
--- a/osaf/services/saf/cpsv/cpd/cpd_imm.c
+++ b/osaf/services/saf/cpsv/cpd/cpd_imm.c
@@ -400,7 +400,9 @@ SaAisErrorT delete_runtime_replica_objec
osaf_extended_name_lend(replica_dn, _name);
rc = immutil_saImmOiRtObjectDelete(immOiHandle, _name); 
if (rc != SA_AIS_OK) {
-   LOG_ER("Deleting run time object %s Failed - rc =
%d",replica_dn, rc);
+   LOG_ER("Deleting run time object %s Failed-1 - rc =
%d",replica_dn, rc);
+   } else {
+   TRACE("Deleting run time object %s Success-1 - rc =
%d",replica_dn, 
+rc);
}
 
free(replica_dn);
@@ -522,9 +524,11 @@ SaAisErrorT delete_runtime_ckpt_object(C
osaf_extended_name_lend(ckpt_node->ckpt_name, _name);
 
rc =  immutil_saImmOiRtObjectDelete(immOiHandle, _name);
-   if (rc != SA_AIS_OK)
+   if (rc != SA_AIS_OK) {
LOG_ER("Deleting run time object %s failed - rc = %d",
ckpt_node->ckpt_name, rc);
-
+   } else {
+   TRACE("Deleting run time object %s success - rc = %d",
ckpt_node->ckpt_name, rc);
+   }
return rc;
 }
 
@@ -917,11 +921,11 @@ SaAisErrorT cpd_clean_checkpoint_objects
/* Delete the runtime object and its children. */
rc = immutil_saImmOiRtObjectDelete(cb->immOiHandle,
_name);
if (rc == SA_AIS_OK) {
-   TRACE("Object \"%s\" deleted", (char *)
osaf_extended_name_borrow(_name));
-   } else {
-   LOG_ER("%s saImmOiRtObjectDelete for \"%s\" FAILED
%d",
-   __FUNCTION__, (char *)
osaf_extended_name_borrow(_name), rc);
-   }
+  TRACE("saImmOiRtObjectDelete \"%s\" deleted
Successfully", (char *) osaf_extended_name_borrow(_name));
+  } else {
+  LOG_ER("%s saImmOiRtObjectDelete for \"%s\" FAILED
%d",
+  __FUNCTION__, (char *)
osaf_extended_name_borrow(_name), rc);
+  }
}
 
if (rc != SA_AIS_ERR_NOT_EXIST) { diff --git
a/osaf/services/saf/cpsv/cpd/cpd_proc.c
b/osaf/services/saf/cpsv/cpd/cpd_proc.c
--- a/osaf/services/saf/cpsv/cpd/cpd_proc.c
+++ b/osaf/services/saf/cpsv/cpd/cpd_proc.c
@@ -809,6 +809,11 @@ uint32_t cpd_process_cpnd_down(CPD_CB *c
send_evt.info.cpnd.info.ckpt_del.mds_dest =
*cpnd_dest;
if (ckpt_node->dest_cnt == 0) {
TRACE_1("cpd ckpt del success for
ckpt_id:%llx",ckpt_node->ckpt_id);
+   /* Delete reploc fist*/
+   cpd_ckpt_reploc_get(>ckpt_reploc_tree,
_info, _info);
+   if (rep_info) {
+   cpd_ckpt_reploc_node_delete(cb,
rep_info, ckpt_node->is_unlink_set);
+   }
cpd_ckpt_map_node_get(>ckpt_map_tree,
ckpt_node->ckpt_name, _info);
 
/* Remove the ckpt_node */
@@ -875,7 +880,7 @@ uint32_t cpd_process_cpnd_down(CPD_CB *c
/* Send it to CPD(s), by sending ckpt_id = 0 */
/* This is to delete the node from reploc_tree */
cpd_ckpt_reploc_get(>ckpt_reploc_tree, _info,
_info);
-   if 

Re: [devel] [PATCH 1 of 1] AMFD: Do not let non-active AMFD receives CLM track callback [#2141]

2016-11-17 Thread minh chau
Hi Praveen,

It has been seen once while swapping the 2N OpenSAF SI. It was just an error 
indicating that AMFD failed to stop the track callback; after that the 
si-swap still succeeded. The CLM service could be busy during switch-over, 
but AMF (as well as other services) is supposed to handle that return code 
properly, as AMFD already does in track start. Further testing exposed a 
problem caused by the CLM track callback being received on the standby AMFD.

AMFD cannot stay busy just trying to stop the CLM track. As a first option, 
AMFD could start another thread that only tries to stop the track callback, 
but that is likely to create more problems if this standby AMFD is 
promoted back to active while the stopping thread is still running. At 
that point the main thread would be starting the track while the other 
thread is trying to stop it.

The second option, as in the patch, is that AMFD retries stopping the track 
in the callback, where the problem shows up. When the standby AMFD switches 
back to active, we stop and start the track again, although there seems to 
be no point in doing this. I hope it has less impact on existing behavior; 
do you see any problems?
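
The second option can be sketched as a small state model. This is only an illustration under stated assumptions: `amfd_model_t`, `track_stop()` and `on_track_callback()` are hypothetical names, `track_started` mirrors the `is_clm_track_started` flag from the patch, and `stop_failures_left` is a test knob that simulates CLM returning a transient error.

```c
#include <stdbool.h>

typedef enum { HA_ACTIVE, HA_STANDBY, HA_QUIESCED } ha_state_t; /* hypothetical */

typedef struct {
	ha_state_t ha_state;
	bool track_started;     /* mirrors is_clm_track_started in the patch */
	int stop_failures_left; /* test knob: how many stop attempts fail */
	int handled;            /* callbacks processed while active */
} amfd_model_t;

/* Returns true when tracking was stopped; the flag is cleared only then. */
bool track_stop(amfd_model_t *m)
{
	if (m->stop_failures_left > 0) {
		m->stop_failures_left--;
		return false; /* e.g. a transient error from CLM */
	}
	m->track_started = false;
	return true;
}

/* A non-active instance never processes the event; it only retries the stop. */
void on_track_callback(amfd_model_t *m)
{
	if (m->ha_state != HA_ACTIVE) {
		if (m->track_started)
			track_stop(m);
		return;
	}
	m->handled++;
}
```

The point of the design is that no extra thread is needed: the stale callback itself becomes the retry trigger, so the stop attempt and any later track start run on the same thread.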

Regarding a standby controller with locked/disabled CLM, I think it was 
mentioned (in #1828 or somewhere else?) that this matter requires more 
thought as to whether we want an HA role assigned to AMFD on a non-member 
node. It seems to be beyond this ticket.

Thanks,
Minh

On 17/11/16 22:03, praveen malviya wrote:
> Hi Minh,
>
> We have not seen this issue earlier. Why CLMA is returning timeout 
> where it is busy?
> If we are allowing Standby controller to continue tracking the Node, 
> then I think (though not fully sure), it will increase one more client 
> for validation step and this client does not exist locally. Also we 
> need to evaluate the situation where Standby controller is CLM locked 
> or disabled.
>
> Thanks,
> Praveen
>
>
>
>
> On 17-Nov-16 3:44 AM, minh chau wrote:
>> Hi all,
>>
>> Has anyone had chance to review this patch?
>>
>> Thanks,
>> Minh
>>
>> On 14/11/16 15:27, Minh Hon Chau wrote:
>>> osaf/services/saf/amf/amfd/clm.cc   |  37
>>> +++-
>>>   osaf/services/saf/amf/amfd/include/cb.h |   1 +
>>>   osaf/services/saf/amf/amfd/role.cc  |  16 +++---
>>>   3 files changed, 35 insertions(+), 19 deletions(-)
>>>
>>>
>>> In controller failover/switchover, sometimes active AMFD fails to stop
>>> CLM track callback. Therefore, when this AMFD become standby, AMFD can
>>> continue receiving CLM track callback and trigger the operations which
>>> should only be executed in active AMFD.
>>>
>>> diff --git a/osaf/services/saf/amf/amfd/clm.cc
>>> b/osaf/services/saf/amf/amfd/clm.cc
>>> --- a/osaf/services/saf/amf/amfd/clm.cc
>>> +++ b/osaf/services/saf/amf/amfd/clm.cc
>>> @@ -219,7 +219,13 @@ static void clm_track_cb(const SaClmClus
>>>   LOG_ER("ClmTrackCallback received in error");
>>>   goto done;
>>>   }
>>> -
>>> +if (avd_cb->avail_state_avd != SA_AMF_HA_ACTIVE) {
>>> +if (avd_cb->is_clm_track_started == true) {
>>> +LOG_NO("Retry to stop clm track with AMFD state(%d)",
>>> avd_cb->avail_state_avd);
>>> +avd_clm_track_stop();
>>> +}
>>> +goto done;
>>> +}
>>>   /*
>>>   ** The CLM cluster can be larger than the AMF cluster thus it is
>>> not an
>>>   ** error if the corresponding AMF node cannot be found.
>>> @@ -394,6 +400,7 @@ SaAisErrorT avd_clm_init(AVD_CL_CB* cb)
>>> cb->clmHandle = 0;
>>>   cb->clm_sel_obj = 0;
>>> +cb->is_clm_track_started = false;
>>>   TRACE_ENTER();
>>>   /*
>>>* TODO: This CLM initialization thread can be re-factored
>>> @@ -453,6 +460,8 @@ SaAisErrorT avd_clm_track_start(void)
>>>   } else {
>>>   LOG_ER("Failed to start cluster tracking %u", error);
>>>   }
>>> +} else {
>>> +avd_cb->is_clm_track_started = true;
>>>   }
>>>   TRACE_LEAVE();
>>>   return error;
>>> @@ -460,17 +469,23 @@ SaAisErrorT avd_clm_track_start(void)
>>> SaAisErrorT avd_clm_track_stop(void)
>>>   {
>>> -SaAisErrorT error = SA_AIS_OK;
>>> +SaAisErrorT error = SA_AIS_OK;
>>> +TRACE_ENTER();
>>> +error = saClmClusterTrackStop(avd_cb->clmHandle);
>>> +if (error != SA_AIS_OK) {
>>> +if (error == SA_AIS_ERR_TRY_AGAIN || error ==
>>> SA_AIS_ERR_TIMEOUT ||
>>> +error == SA_AIS_ERR_UNAVAILABLE) {
>>> +LOG_WA("Failed to stop cluster tracking %u", error);
>>> +} else {
>>> +LOG_ER("Failed to stop cluster tracking %u", error);
>>> +}
>>> +} else {
>>> +TRACE("Sucessfully stops cluster tracking");
>>> +avd_cb->is_clm_track_started = false;
>>> +}
>>>   -TRACE_ENTER();
>>> -error = saClmClusterTrackStop(avd_cb->clmHandle);
>>> -if (SA_AIS_OK != error)
>>> -LOG_ER("Failed to stop cluster tracking %u", error);
>>> -else
>>> -TRACE("Sucessfully stops 

Re: [devel] [PATCH 1 of 1] AMFD: Do not let non-active AMFD receives CLM track callback [#2141]

2016-11-17 Thread praveen malviya
Hi Minh,

We have not seen this issue earlier. Why does CLMA return a timeout when 
it is busy?
If we allow the standby controller to continue tracking the node, 
then I think (though I am not fully sure) it will add one more client 
for the validation step, and this client does not exist locally. We also 
need to evaluate the situation where the standby controller is CLM locked 
or disabled.

Thanks,
Praveen




On 17-Nov-16 3:44 AM, minh chau wrote:
> Hi all,
>
> Has anyone had chance to review this patch?
>
> Thanks,
> Minh
>
> On 14/11/16 15:27, Minh Hon Chau wrote:
>>   osaf/services/saf/amf/amfd/clm.cc   |  37
>> +++-
>>   osaf/services/saf/amf/amfd/include/cb.h |   1 +
>>   osaf/services/saf/amf/amfd/role.cc  |  16 +++---
>>   3 files changed, 35 insertions(+), 19 deletions(-)
>>
>>
>> In controller failover/switchover, sometimes active AMFD fails to stop
>> CLM track callback. Therefore, when this AMFD become standby, AMFD can
>> continue receiving CLM track callback and trigger the operations which
>> should only be executed in active AMFD.
>>
>> diff --git a/osaf/services/saf/amf/amfd/clm.cc
>> b/osaf/services/saf/amf/amfd/clm.cc
>> --- a/osaf/services/saf/amf/amfd/clm.cc
>> +++ b/osaf/services/saf/amf/amfd/clm.cc
>> @@ -219,7 +219,13 @@ static void clm_track_cb(const SaClmClus
>>   LOG_ER("ClmTrackCallback received in error");
>>   goto done;
>>   }
>> -
>> +if (avd_cb->avail_state_avd != SA_AMF_HA_ACTIVE) {
>> +if (avd_cb->is_clm_track_started == true) {
>> +LOG_NO("Retry to stop clm track with AMFD state(%d)",
>> avd_cb->avail_state_avd);
>> +avd_clm_track_stop();
>> +}
>> +goto done;
>> +}
>>   /*
>>   ** The CLM cluster can be larger than the AMF cluster thus it is
>> not an
>>   ** error if the corresponding AMF node cannot be found.
>> @@ -394,6 +400,7 @@ SaAisErrorT avd_clm_init(AVD_CL_CB* cb)
>> cb->clmHandle = 0;
>>   cb->clm_sel_obj = 0;
>> +cb->is_clm_track_started = false;
>>   TRACE_ENTER();
>>   /*
>>* TODO: This CLM initialization thread can be re-factored
>> @@ -453,6 +460,8 @@ SaAisErrorT avd_clm_track_start(void)
>>   } else {
>>   LOG_ER("Failed to start cluster tracking %u", error);
>>   }
>> +} else {
>> +avd_cb->is_clm_track_started = true;
>>   }
>>   TRACE_LEAVE();
>>   return error;
>> @@ -460,17 +469,23 @@ SaAisErrorT avd_clm_track_start(void)
>> SaAisErrorT avd_clm_track_stop(void)
>>   {
>> -SaAisErrorT error = SA_AIS_OK;
>> +SaAisErrorT error = SA_AIS_OK;
>> +TRACE_ENTER();
>> +error = saClmClusterTrackStop(avd_cb->clmHandle);
>> +if (error != SA_AIS_OK) {
>> +if (error == SA_AIS_ERR_TRY_AGAIN || error ==
>> SA_AIS_ERR_TIMEOUT ||
>> +error == SA_AIS_ERR_UNAVAILABLE) {
>> +LOG_WA("Failed to stop cluster tracking %u", error);
>> +} else {
>> +LOG_ER("Failed to stop cluster tracking %u", error);
>> +}
>> +} else {
>> +TRACE("Sucessfully stops cluster tracking");
>> +avd_cb->is_clm_track_started = false;
>> +}
>>   -TRACE_ENTER();
>> -error = saClmClusterTrackStop(avd_cb->clmHandle);
>> -if (SA_AIS_OK != error)
>> -LOG_ER("Failed to stop cluster tracking %u", error);
>> -else
>> -TRACE("Sucessfully stops cluster tracking");
>> -
>> -TRACE_LEAVE();
>> -return error;
>> +TRACE_LEAVE();
>> +return error;
>>   }
>> void clm_node_terminate(AVD_AVND *node)
>> diff --git a/osaf/services/saf/amf/amfd/include/cb.h
>> b/osaf/services/saf/amf/amfd/include/cb.h
>> --- a/osaf/services/saf/amf/amfd/include/cb.h
>> +++ b/osaf/services/saf/amf/amfd/include/cb.h
>> @@ -210,6 +210,7 @@ typedef struct cl_cb_tag {
>>   /* Clm stuff */
>>   std::atomic clmHandle;
>>   std::atomic clm_sel_obj;
>> +bool is_clm_track_started;
>> bool fully_initialized;
>>   bool swap_switch; /* true - In middle of role switch. */
>> diff --git a/osaf/services/saf/amf/amfd/role.cc
>> b/osaf/services/saf/amf/amfd/role.cc
>> --- a/osaf/services/saf/amf/amfd/role.cc
>> +++ b/osaf/services/saf/amf/amfd/role.cc
>> @@ -1050,9 +1050,7 @@ uint32_t amfd_switch_actv_qsd(AVD_CL_CB
>>   /*  Mark AVD as Quiesced. */
>>   cb->avail_state_avd = SA_AMF_HA_QUIESCED;
>>
>> -if (avd_clm_track_stop() != SA_AIS_OK) {
>> -LOG_ER("ClmTrack stop failed");
>> -}
>> +avd_clm_track_stop();
>> /* Go ahead and set mds role as already the NCS SU has been
>> switched */
>>   if (NCSCC_RC_SUCCESS != (rc = avd_mds_set_vdest_role(cb,
>> SA_AMF_HA_QUIESCED))) {
>> @@ -1260,11 +1258,13 @@ uint32_t amfd_switch_stdby_actv(AVD_CL_C
>>   if (NCSCC_RC_SUCCESS != avd_rde_set_role(SA_AMF_HA_ACTIVE)) {
>>   LOG_ER("rde role change failed from stdy -> Active");
>>   }
>> -
>> - 

Re: [devel] [PATCH 1 of 1] amfd: fix si-swap timeout [#2190]

2016-11-17 Thread Long Nguyen
Hi Praveen,

In the trace, for example on SC-1, we can see that the code executed as follows:

Nov 14 11:02:27.743481 osafamfd [486:sg.cc:1693] >> set_fsm_state
...
Nov 14 11:02:27.744077 osafamfd [486:sg.cc:1716] TR Admin operation 
finishes on SI:'safSi=SC-2N,safApp=OpenSAF'
Nov 14 11:02:27.744090 osafamfd [486:imm.cc:1971] >> 
avd_saImmOiAdminOperationResult: inv:17179869185, res:1
Nov 14 11:02:27.744106 osafamfd [486:imm.cc:1976] << 
avd_saImmOiAdminOperationResult
Nov 14 11:02:27.744119 osafamfd [486:sg.cc:1724] << set_fsm_state

The traces map to the code:
> for (const auto& si : list_of_si) {
> if (si->invocation != 0) {
> TRACE("Admin operation finishes on 
> SI:'%s'",si->name.c_str());
>  avd_saImmOiAdminOperationResult(avd_cb->immOiHandle,
> si->invocation, SA_AIS_OK);
> *si->invocation = 0;*
> }
> }

So, the invocation here is reset.
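
The reply-then-reset pattern in the quoted loop can be sketched on its own. This is a minimal sketch with hypothetical names (`si_model_t`, `finish_admin_ops()`, `count_reply()`); `invocation_t` stands in for the real invocation type, and the responder is a callback so the example does not depend on IMM.

```c
#include <stddef.h>

typedef unsigned long long invocation_t; /* stand-in for the invocation id */

typedef struct {
	const char *name;
	invocation_t invocation; /* 0 means no admin operation is pending */
} si_model_t;

/* Hypothetical responder; in AMFD this role is played by
 * avd_saImmOiAdminOperationResult(). */
typedef void (*reply_fn_t)(invocation_t inv, void *ctx);

/* Answer each pending admin operation once, then clear it.
 * Returns the number of replies sent. */
size_t finish_admin_ops(si_model_t *sis, size_t n, reply_fn_t reply, void *ctx)
{
	size_t sent = 0;

	for (size_t i = 0; i < n; i++) {
		if (sis[i].invocation != 0) {
			reply(sis[i].invocation, ctx);
			sis[i].invocation = 0; /* prevents a second reply */
			sent++;
		}
	}
	return sent;
}

/* Example responder that only counts how often it was called. */
void count_reply(invocation_t inv, void *ctx)
{
	(void)inv;
	(*(int *)ctx)++;
}
```

Because the id is cleared right after the reply, running the loop a second time sends nothing.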

Best regards,
Long Nguyen.

On 11/17/2016 4:15 PM, praveen malviya wrote:
> Hi Long,
>
> As per ticket, issue occurred during si-swap.
> Please update the ticket with brief analysis so that seeing the ticket 
> one can have idea where the problem occurred.
> I guess si-swap initiated before AMFD could update to IMM.
> If that is the case, we should reject the admin operation from 
> si_admin_op_cb() if invocationId is still set in Si.
>
> Thanks,
> Praveen
>
> On 16-Nov-16 4:19 PM, Long HB Nguyen wrote:
>>  osaf/services/saf/amf/amfd/imm.cc |  22 ++
>>  osaf/services/saf/amf/amfd/include/imm.h |   2 ++
>>  osaf/services/saf/amf/amfd/role.cc   |   5 +
>>  3 files changed, 29 insertions(+), 0 deletions(-)
>>
>>
>> diff --git a/osaf/services/saf/amf/amfd/imm.cc 
>> b/osaf/services/saf/amf/amfd/imm.cc
>> --- a/osaf/services/saf/amf/amfd/imm.cc
>> +++ b/osaf/services/saf/amf/amfd/imm.cc
>> @@ -423,6 +423,28 @@ AvdJobDequeueResultT Fifo::execute(const
>>  return ret;
>>  }
>>
>> +AvdJobDequeueResultT Fifo::executeAdminResp(const AVD_CL_CB *cb)
>> +{
>> +Job *ajob;
>> +AvdJobDequeueResultT ret = JOB_EXECUTED;
>> +
>> +TRACE_ENTER();
>> +
>> +while ((ajob = peek()) != nullptr) {
>> +if (dynamic_cast(ajob) != nullptr) {
>> +ret = ajob->exec(cb);
>> +} else {
>> +ajob = dequeue();
>> +delete ajob;
>> +ret = JOB_EXECUTED;
>> +}
>> +}
>> +
>> +TRACE_LEAVE2("%d", ret);
>> +
>> +return ret;
>> +}
>> +
>>  //
>>  void Fifo::empty()
>>  {
>> diff --git a/osaf/services/saf/amf/amfd/include/imm.h 
>> b/osaf/services/saf/amf/amfd/include/imm.h
>> --- a/osaf/services/saf/amf/amfd/include/imm.h
>> +++ b/osaf/services/saf/amf/amfd/include/imm.h
>> @@ -146,6 +146,8 @@ public:
>>
>>  static AvdJobDequeueResultT execute(const AVD_CL_CB *cb);
>>
>> +static AvdJobDequeueResultT executeAdminResp(const AVD_CL_CB 
>> *cb);
>> +
>>  static void empty();
>>
>>  static uint32_t size();
>> diff --git a/osaf/services/saf/amf/amfd/role.cc 
>> b/osaf/services/saf/amf/amfd/role.cc
>> --- a/osaf/services/saf/amf/amfd/role.cc
>> +++ b/osaf/services/saf/amf/amfd/role.cc
>> @@ -766,6 +766,11 @@ void avd_mds_qsd_role_evh(AVD_CL_CB *cb,
>>  }
>>
>>  try_again:
>> +/* Execute admin op jobs before calling saImmOiImplementerClear 
>> to avoid
>> + * SA_AIS_ERR_TIMEOUT
>> + */
>> +Fifo::executeAdminResp(cb);
>> +
>>  /* Take mutex here to sync with imm reinit thread.*/
>>  osaf_mutex_lock_ordie(_reinit_mutex);
>>  /* Give up IMM OI implementer role */
>>
>



Re: [devel] [PATCH 1 of 1] amfd: fix si-swap timeout [#2190]

2016-11-17 Thread praveen malviya
Hi Long,

As per the ticket, the issue occurred during si-swap.
Please update the ticket with a brief analysis so that anyone reading the 
ticket can get an idea of where the problem occurred.
I guess the si-swap was initiated before AMFD could update IMM.
If that is the case, we should reject the admin operation from 
si_admin_op_cb() if invocationId is still set in the SI.
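
The suggested guard could look roughly like the sketch below. It is an assumption, not the AMFD implementation: `pending_si_t`, `admin_rc_t` and `accept_admin_op()` are hypothetical, and which AIS error code a real rejection should return is left open.

```c
typedef unsigned long long inv_id_t; /* stand-in for the invocation id */

/* Hypothetical outcome of an admin operation request. */
typedef enum { ADMIN_ACCEPTED, ADMIN_REJECTED_PENDING } admin_rc_t;

typedef struct {
	inv_id_t invocation; /* 0 means no admin operation is pending */
} pending_si_t;

/* Accept a new admin operation only if the previous one has been answered. */
admin_rc_t accept_admin_op(pending_si_t *si, inv_id_t new_inv)
{
	if (si->invocation != 0)
		return ADMIN_REJECTED_PENDING; /* earlier operation still open */
	si->invocation = new_inv;
	return ADMIN_ACCEPTED;
}
```

A second request is refused until the stored id has been reset to 0 by the code that answers the first one.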

Thanks,
Praveen

On 16-Nov-16 4:19 PM, Long HB Nguyen wrote:
>  osaf/services/saf/amf/amfd/imm.cc|  22 ++
>  osaf/services/saf/amf/amfd/include/imm.h |   2 ++
>  osaf/services/saf/amf/amfd/role.cc   |   5 +
>  3 files changed, 29 insertions(+), 0 deletions(-)
>
>
> diff --git a/osaf/services/saf/amf/amfd/imm.cc 
> b/osaf/services/saf/amf/amfd/imm.cc
> --- a/osaf/services/saf/amf/amfd/imm.cc
> +++ b/osaf/services/saf/amf/amfd/imm.cc
> @@ -423,6 +423,28 @@ AvdJobDequeueResultT Fifo::execute(const
>   return ret;
>  }
>
> +AvdJobDequeueResultT Fifo::executeAdminResp(const AVD_CL_CB *cb)
> +{
> + Job *ajob;
> + AvdJobDequeueResultT ret = JOB_EXECUTED;
> +
> + TRACE_ENTER();
> +
> + while ((ajob = peek()) != nullptr) {
> + if (dynamic_cast(ajob) != nullptr) {
> + ret = ajob->exec(cb);
> + } else {
> + ajob = dequeue();
> + delete ajob;
> + ret = JOB_EXECUTED;
> + }
> + }
> +
> + TRACE_LEAVE2("%d", ret);
> +
> + return ret;
> +}
> +
>  //
>  void Fifo::empty()
>  {
> diff --git a/osaf/services/saf/amf/amfd/include/imm.h 
> b/osaf/services/saf/amf/amfd/include/imm.h
> --- a/osaf/services/saf/amf/amfd/include/imm.h
> +++ b/osaf/services/saf/amf/amfd/include/imm.h
> @@ -146,6 +146,8 @@ public:
>
>  static AvdJobDequeueResultT execute(const AVD_CL_CB *cb);
>
> +static AvdJobDequeueResultT executeAdminResp(const AVD_CL_CB *cb);
> +
>  static void empty();
>
>   static uint32_t size();
> diff --git a/osaf/services/saf/amf/amfd/role.cc 
> b/osaf/services/saf/amf/amfd/role.cc
> --- a/osaf/services/saf/amf/amfd/role.cc
> +++ b/osaf/services/saf/amf/amfd/role.cc
> @@ -766,6 +766,11 @@ void avd_mds_qsd_role_evh(AVD_CL_CB *cb,
>   }
>
>  try_again:
> + /* Execute admin op jobs before calling saImmOiImplementerClear to avoid
> +  * SA_AIS_ERR_TIMEOUT
> +  */
> + Fifo::executeAdminResp(cb);
> +
>   /* Take mutex here to sync with imm reinit thread.*/
>   osaf_mutex_lock_ordie(_reinit_mutex);
>   /* Give up IMM OI implementer role */
>



[devel] [PATCH 0 of 1] Review Request for amfnd: add cleanup command for proxy and proxied [#2186]

2016-11-17 Thread nagendra . k
Summary: amfnd: add cleanup command for proxy and proxied [#2186]
Review request for Trac Ticket(s): #2186
Peer Reviewer(s): Amf Dev
Pull request to: <>
Affected branch(es): Default
Development branch: Default


Impacted area           Impact y/n
-----------------------------------
 Docs                    n
 Build system            n
 RPM/packaging           n
 Configuration files     n
 Startup scripts         n
 SAF services            y
 OpenSAF services        n
 Core libraries          n
 Samples                 n
 Tests                   n
 Other                   n


Comments (indicate scope for each "y" above):
-
 <>

changeset 337c6dde5dfca14c8af4299a2a2598958bd1ea37
Author: Nagendra Kumar
Date:   Thu, 17 Nov 2016 13:41:09 +0530

amfnd: add cleanup command for proxy and proxied [#2186]

When proxy and proxied are in the same SU and the proxy fails and SU
restart is escalated, the proxied components are not cleaned up. Added
cleanup commands for proxied components whose proxy is not in a healthy
state. Also, during opensafd stop, opensafd is not able to terminate the
proxy. Added a fix for this as well.


Complete diffstat:
--
 osaf/services/saf/amf/amfnd/clc.cc  |  38 
++
 osaf/services/saf/amf/amfnd/susm.cc |  24 +++-
 2 files changed, 53 insertions(+), 9 deletions(-)


Testing Commands:
-
Please use the proxy.c and proxy.xml attached to the ticket for testing.
proxy.xml defines an SU with one proxy component and two proxied components.
saAmfSgtDefCompRestartMax is configured as 1.

1. immcfg -f proxy.xml
2. amf-adm unlock-in safSu=1,safSg=2N,safApp=Proxy; amf-adm unlock safSu=1,safSg=2N,safApp=Proxy
3. pkill -9 proxy and again pkill -9 proxy to restart SU.
4. amf-adm unlock-in safSu=1,safSg=2N,safApp=Proxy; amf-adm unlock safSu=1,safSg=2N,safApp=Proxy
5. amf-adm lock  safSu=1,safSg=2N,safApp=Proxy 
6. amf-adm unlock  safSu=1,safSg=2N,safApp=Proxy 
7. amf-adm restart  safSu=1,safSg=2N,safApp=Proxy 

Testing, Expected Results:
--
All steps #4 to #7 worked fine.

Conditions of Submission:
-
Ack from Amf Dev

Arch        Built  Started  Linux distro
-------------------------------------------
mips        n      n
mips64      n      n
x86         n      n
x86_64      y      y
powerpc     n      n
powerpc64   n      n


Reviewer Checklist:
---
[Submitters: make sure that your review doesn't trigger any checkmarks!]


Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
that need proper data filled in.

___ You have failed to nominate the proper persons for review and push.

___ Your patches do not have proper short+long header

___ You have grammar/spelling in your header that is unacceptable.

___ You have exceeded a sensible line length in your headers/comments/text.

___ You have failed to put in a proper Trac Ticket # into your commits.

___ You have incorrectly put/left internal data in your comments/files
(i.e. internal bug tracking tool IDs, product names etc)

___ You have not given any evidence of testing beyond basic build tests.
Demonstrate some level of runtime or other sanity testing.

___ You have ^M present in some of your files. These have to be removed.

___ You have needlessly changed whitespace or added whitespace crimes
like trailing spaces, or spaces before tabs.

___ You have mixed real technical changes with whitespace and other
cosmetic code cleanup changes. These have to be separate commits.

___ You need to refactor your submission into logical chunks; there is
too much content into a single commit.

___ You have extraneous garbage in your review (merge commits etc)

___ You have giant attachments which should never have been sent;
Instead you should place your content in a public tree to be pulled.

___ You have too many commits attached to an e-mail; resend as threaded
commits, or place in a public tree for a pull.

___ You have resent this content multiple times without a clear indication
of what has changed between each re-send.

___ You have failed to adequately and individually address all of the
comments and change requests that were proposed in the initial review.

___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)

___ Your computer has a badly configured date and time, confusing
the threaded patch review.

___ Your changes affect IPC mechanism, and you don't present any results
for in-service upgradability test.

___ Your changes affect user manual and documentation, but your patch series
does not contain the patch that updates the Doxygen manual.



[devel] [PATCH 1 of 1] amfnd: add cleanup command for proxy and proxied [#2186]

2016-11-17 Thread nagendra . k
 osaf/services/saf/amf/amfnd/clc.cc  |  38 +---
 osaf/services/saf/amf/amfnd/susm.cc |  24 ++-
 2 files changed, 53 insertions(+), 9 deletions(-)


When proxy and proxied are in the same SU and the proxy
fails and SU restart is escalated, the proxied components are not
cleaned up.
Added cleanup commands for proxied components whose proxy is not
in a healthy state.
Also, during opensafd stop, opensafd is not able to terminate the proxy.
Added a fix for this as well.
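
The decision for a proxied component described above can be written as a small predicate. The sketch is a simplified model with hypothetical names (`proxied_ctx_t`, `choose_proxied_cleanup()`); the booleans stand in for the SU and component state checks made in the patch.

```c
#include <stdbool.h>

/* How a proxied component gets cleaned up. */
typedef enum { CLEAN_VIA_CALLBACK, CLEAN_VIA_COMMAND } clean_method_t;

typedef struct {
	bool su_restart;   /* SU restart is in progress */
	bool su_failed;    /* SU is marked failed */
	bool has_proxy;    /* the proxied component currently has a proxy */
	bool same_su;      /* proxy and proxied live in the same SU */
	bool proxy_failed; /* proxy cannot accept callbacks */
} proxied_ctx_t;

/* Use the cleanup command only when all three conditions hold:
 * 1. SU restart escalation, 2. proxy and proxied in the same SU,
 * 3. the proxy is in a failed state. Otherwise send the callback. */
clean_method_t choose_proxied_cleanup(const proxied_ctx_t *c)
{
	bool escalated = c->su_restart && c->su_failed;
	bool colocated = c->has_proxy && c->same_su;

	if (escalated && colocated && c->proxy_failed)
		return CLEAN_VIA_COMMAND;
	return CLEAN_VIA_CALLBACK;
}
```

Keeping the callback as the default means a healthy proxy is still asked to clean up its proxied components in the usual way.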

diff --git a/osaf/services/saf/amf/amfnd/clc.cc 
b/osaf/services/saf/amf/amfnd/clc.cc
--- a/osaf/services/saf/amf/amfnd/clc.cc
+++ b/osaf/services/saf/amf/amfnd/clc.cc
@@ -2020,26 +2020,48 @@ uint32_t avnd_comp_clc_inst_clean_hdler(
 
if (m_AVND_COMP_TYPE_IS_PROXIED(comp)) {
avnd_comp_cbq_del(cb, comp, true);
-   /* call the cleanup callback */
-   rc = avnd_comp_cbk_send(cb, comp, AVSV_AMF_PXIED_COMP_CLEAN, 0, 0);
+   /* Check whether
+  1. it is a case of SU restart escalation.
+  2. it is a case of proxy-proxied residing in the same SU.
+  3. the proxy of this component is in good state to accept the callback.*/
+
+   /* Check if the proxy and proxied components are in the same SUs. */
+   if ((m_AVND_SU_IS_RESTART(comp->su) && m_AVND_SU_IS_FAILED(comp->su)) && /* #1. */
+   ((comp->pxy_comp) && (comp->su == comp->pxy_comp->su)) && /* #2  */
+   (m_AVND_COMP_IS_FAILED(comp->pxy_comp))) { /* #3 */
+   /* If proxy is not in good shape, we can't issue callback, so issue cleanup
+  (assume it is local component). */
+   rc = avnd_comp_clc_cmd_execute(cb, comp, AVND_COMP_CLC_CMD_TYPE_CLEANUP);
+   } else {
+   /* call the cleanup callback */
+   rc = avnd_comp_cbk_send(cb, comp, AVSV_AMF_PXIED_COMP_CLEAN, 0, 0);
+   }
} else if (m_AVND_COMP_TYPE_IS_PROXY(comp) && comp->pxied_list.n_nodes) {
/* if there are still outstanding proxied components we can't terminate right now */
/* Check if the proxy and proxied components are in the same SUs. */
TRACE("Proxy has proxied components: %u", comp->pxied_list.n_nodes);
AVND_COMP_PXIED_REC *rec;
rec = (AVND_COMP_PXIED_REC *)m_NCS_DBLIST_FIND_FIRST(>pxied_list);
-while (rec) {
-if (comp->su == rec->pxied_comp->su)
+   while (rec) {
+   if (comp->su == rec->pxied_comp->su)
break;
-rec = (AVND_COMP_PXIED_REC *)m_NCS_DBLIST_FIND_NEXT(>comp_dll_node);
-}
+   rec = (AVND_COMP_PXIED_REC *)m_NCS_DBLIST_FIND_NEXT(>comp_dll_node);
+   }
if (rec == nullptr) {
TRACE("Proxy and proxied are not in the same SU.");
/* This means that proxy and proxied components are not in the same SU.
   That means that we can cleanup the component. */
rc = avnd_comp_clc_cmd_execute(cb, comp, AVND_COMP_CLC_CMD_TYPE_CLEANUP);
-   } else
-   return rc;
+   } else {
+   TRACE("Proxy and proxied are in the same SU.");
+   if (((m_AVND_SU_IS_RESTART(comp->su) && m_AVND_SU_IS_FAILED(comp->su)) &&
+   (m_AVND_COMP_IS_FAILED(comp))) ||
+   (cb->term_state == AVND_TERM_STATE_OPENSAF_SHUTDOWN_STARTED)) {
+   /* If proxy is not in good shape, so issue cleanup. */
+   rc = avnd_comp_clc_cmd_execute(cb, comp, AVND_COMP_CLC_CMD_TYPE_CLEANUP);
+   } else
+   goto done;
+   }
} else {
if (m_AVND_SU_IS_RESTART(comp->su) && m_AVND_COMP_IS_RESTART_DIS(comp) &&
(comp->csi_list.n_nodes > 0) &&
diff --git a/osaf/services/saf/amf/amfnd/susm.cc b/osaf/services/saf/amf/amfnd/susm.cc
--- a/osaf/services/saf/amf/amfnd/susm.cc
+++ b/osaf/services/saf/amf/amfnd/susm.cc
@@ -2601,8 +2601,30 @@ uint32_t avnd_su_pres_inst_comprestart_h
if (NCSCC_RC_SUCCESS != rc)
goto done;
break;
+   } else {
+   /*
+  For a NPI comp in SU, component FSM is always triggered
+  at the time of assignments. If this component is
+  non-restartable then start 

Re: [devel] [PATCH 1 of 1] ntf: handle error code TRY_AGAIN of saClmInitialize() [#2191]

2016-11-17 Thread praveen malviya
Hi Vu,

Please find query inline [Praveen].

Thanks,
Praveen

On 17-Nov-16 11:58 AM, Vu Minh Nguyen wrote:
>  osaf/services/saf/ntfsv/ntfs/ntfs_clm.c |  14 +-
>  1 files changed, 13 insertions(+), 1 deletions(-)
>
>
> NTF did not handle the TRY_AGAIN error code of `saClmInitialize()`:
> on TRY_AGAIN, NTF would exit and cause a node reboot.
>
> The patch adds a while loop that retries on TRY_AGAIN.
>
> diff --git a/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c b/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c
> --- a/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c
> +++ b/osaf/services/saf/ntfsv/ntfs/ntfs_clm.c
> @@ -101,13 +101,25 @@ void *ntfs_clm_init_thread(void *cb)
>  {
>   ntfs_cb_t *_ntfs_cb = (ntfs_cb_t *) cb;
>   SaAisErrorT rc = SA_AIS_OK;
> + uint32_t msecs_waited = 0;
> + const uint32_t max_waiting_time_10s = 10 * 1000; /* 10 secs */
> +
>   TRACE_ENTER();
> +
>   rc = saClmInitialize_4(&_ntfs_cb->clm_hdl, _callbacks, );
> + while (((rc == SA_AIS_ERR_TRY_AGAIN) || (rc == SA_AIS_ERR_TIMEOUT) ||
> + (rc == SA_AIS_ERR_UNAVAILABLE)) &&
[Praveen] Based on the discussion on ticket #1828 about CLM 
initialization, I recall that this API does not return ERR_UNAVAILABLE 
even on a CLM-locked node. This provision lets an application on a 
CLM-locked node track the CLM status of nodes.

But I think the CLM agent can still return this code when the CLMS server 
has not yet got the node-related information from the CLM agent; in that 
case CLMS will return ERR_UNAVAILABLE. If that is not so, then I think we 
do not need to handle rc == SA_AIS_ERR_UNAVAILABLE.

Thanks,
Praveen
> +(msecs_waited < max_waiting_time_10s)) {
> + usleep(100*1000);
> + msecs_waited += 100;
> + rc = saClmInitialize_4(&_ntfs_cb->clm_hdl, _callbacks, );
> + }
>   if (rc != SA_AIS_OK) {
>   LOG_ER("saClmInitialize failed with error: %d", rc);
>   TRACE_LEAVE();
> -exit(EXIT_FAILURE);
> + exit(EXIT_FAILURE);
>   }
> +
>   rc = saClmSelectionObjectGet(_ntfs_cb->clm_hdl, _cb->clmSelectionObject);
>   if (rc != SA_AIS_OK) {
>   LOG_ER("saClmSelectionObjectGet failed with error: %d", rc);
>
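
For reference, the bounded retry discussed in this thread can be sketched in isolation as below. This is a minimal sketch under stated assumptions: `Status` is a hypothetical stand-in for SaAisErrorT, `InitWithRetry` is an invented helper, and the callable passed in takes the place of the real saClmInitialize_4() call. Following the review comment above, ERR_UNAVAILABLE is not treated as a retryable code here.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-in for SaAisErrorT; only the values needed here.
enum class Status { kOk, kTryAgain, kTimeout, kOtherError };

// Calls `init` until it returns something other than a transient error,
// or until about `max_wait` has been spent sleeping between attempts.
// The final status is returned so the caller can log it and decide
// whether to exit.
Status InitWithRetry(const std::function<Status()>& init,
                     std::chrono::milliseconds max_wait,
                     std::chrono::milliseconds interval) {
  assert(interval.count() > 0);  // a zero interval would never reach max_wait
  std::chrono::milliseconds waited{0};
  Status rc = init();
  while ((rc == Status::kTryAgain || rc == Status::kTimeout) &&
         waited < max_wait) {
    std::this_thread::sleep_for(interval);
    waited += interval;
    rc = init();
  }
  return rc;
}
```

With max_wait set to 10 seconds and interval to 100 milliseconds, this mirrors the timing used in the patch: at most about 100 retries before the error is treated as fatal.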

--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


Re: [devel] [PATCH 2 of 2] base: Use the Mutex and Lock classes [#2173]

2016-11-17 Thread ramesh betham
Ack.

Thanks,
Ramesh.

On 11/8/2016 8:50 PM, Anders Widell wrote:
>   osaf/libs/core/cplusplus/base/process.cc |  31 
> ---
>   osaf/libs/core/cplusplus/base/process.h  |  19 +
>   osaf/libs/core/cplusplus/base/unix_socket.cc |  17 +-
>   osaf/libs/core/cplusplus/base/unix_socket.h  |   3 +-
>   4 files changed, 32 insertions(+), 38 deletions(-)
>
>
> Use the new Mutex and Lock classes instead of std::mutex, std::unique_lock and
> pthread_mutex_t.
>
> diff --git a/osaf/libs/core/cplusplus/base/process.cc b/osaf/libs/core/cplusplus/base/process.cc
> --- a/osaf/libs/core/cplusplus/base/process.cc
> +++ b/osaf/libs/core/cplusplus/base/process.cc
> @@ -50,10 +50,12 @@ Process::~Process() {
>   }
>   
>   void Process::Kill(int sig_no, const Duration& wait_time) {
> -  std::unique_lock lock(mutex_);
> -  pid_t pgid = process_group_id_;
> -  process_group_id_ = 0;
> -  lock.unlock();
> +  pid_t pgid;
> +  {
> +Lock lock(mutex_);
> +pgid = process_group_id_;
> +process_group_id_ = 0;
> +  }
> if (pgid > 1) KillProc(-pgid, sig_no, wait_time);
>   }
>   
> @@ -119,20 +121,23 @@ void Process::Execute(int argc, char *ar
> LOG_WA("setpgid(%u, 0) failed: %s",
>(unsigned) child_pid, strerror(errno));
>   }
> -std::unique_lock lock(mutex_);
> -process_group_id_ = child_pid;
> -StartTimer(timeout);
> -lock.unlock();
> +{
> +  Lock lock(mutex_);
> +  process_group_id_ = child_pid;
> +  StartTimer(timeout);
> +}
>   int status;
>   pid_t wait_pid;
>   do {
> wait_pid = waitpid(child_pid, , 0);
>   } while (wait_pid == (pid_t) -1 && errno == EINTR);
> -lock.lock();
> -StopTimer();
> -bool killed = process_group_id_ == 0;
> -process_group_id_ = 0;
> -lock.unlock();
> +bool killed;
> +{
> +  Lock lock(mutex_);
> +  StopTimer();
> +  killed = process_group_id_ == 0;
> +  process_group_id_ = 0;
> +}
>   bool successful = false;
>   if (!killed && wait_pid != (pid_t) -1) {
> if (!WIFEXITED(status)) {
> diff --git a/osaf/libs/core/cplusplus/base/process.h b/osaf/libs/core/cplusplus/base/process.h
> --- a/osaf/libs/core/cplusplus/base/process.h
> +++ b/osaf/libs/core/cplusplus/base/process.h
> @@ -15,20 +15,19 @@
>*
>*/
>   
> -#ifndef OPENSAF_OSAF_LIBS_CORE_CPLUSPLUS_BASE_PROCESS_H_
> -#define OPENSAF_OSAF_LIBS_CORE_CPLUSPLUS_BASE_PROCESS_H_
> +#ifndef OSAF_LIBS_CORE_CPLUSPLUS_BASE_PROCESS_H_
> +#define OSAF_LIBS_CORE_CPLUSPLUS_BASE_PROCESS_H_
>   
> +#include 
>   #include 
>   #include 
>   #include 
> -#include 
> -#include 
> -#include "base/macros.h"
> +#include "osaf/libs/core/cplusplus/base/macros.h"
> +#include "osaf/libs/core/cplusplus/base/mutex.h"
>   
>   namespace base {
>   
>   class Process {
> -  DELETE_COPY_AND_MOVE_OPERATORS(Process);
>public:
> using Duration = std::chrono::steady_clock::duration;
> Process();
> @@ -96,18 +95,20 @@ class Process {
>  * processes in the process group have terminated.
>  */
> static void KillProc(pid_t pid, int sig_no, const Duration& wait_time);
> +
>private:
> using Clock = std::chrono::steady_clock;
> using TimePoint = std::chrono::steady_clock::time_point;
> void StartTimer(const Duration& timeout);
> void StopTimer();
> static void TimerExpirationEvent(sigval notification_data);
> -  std::mutex mutex_;
> +  Mutex mutex_;
> timer_t timer_id_;
> bool is_timer_created_;
> pid_t process_group_id_;
> +  DELETE_COPY_AND_MOVE_OPERATORS(Process);
>   };
>   
> -} // namespace base
> +}  // namespace base
>   
> -#endif  /* OPENSAF_OSAF_LIBS_CORE_CPLUSPLUS_BASE_PROCESS_H_ */
> +#endif  // OSAF_LIBS_CORE_CPLUSPLUS_BASE_PROCESS_H_
> diff --git a/osaf/libs/core/cplusplus/base/unix_socket.cc b/osaf/libs/core/cplusplus/base/unix_socket.cc
> --- a/osaf/libs/core/cplusplus/base/unix_socket.cc
> +++ b/osaf/libs/core/cplusplus/base/unix_socket.cc
> @@ -32,19 +32,10 @@ UnixSocket::UnixSocket(const std::string
> } else {
>   addr_.sun_path[0] = '\0';
> }
> -  pthread_mutexattr_t attr;
> -  int result = pthread_mutexattr_init();
> -  if (result != 0) osaf_abort(result);
> -  result = pthread_mutexattr_setprotocol(, PTHREAD_PRIO_INHERIT);
> -  if (result != 0) osaf_abort(result);
> -  result = pthread_mutex_init(_, );
> -  if (result != 0) osaf_abort(result);
> -  result = pthread_mutexattr_destroy();
> -  if (result != 0) osaf_abort(result);
>   }
>   
>   int UnixSocket::Open() {
> -  osaf_mutex_lock_ordie(_);
> +  Lock lock(mutex_);
> int sock = fd_;
> if (sock < 0) {
>   if (addr_.sun_path[0] != '\0') {
> @@ -60,18 +51,15 @@ int UnixSocket::Open() {
> errno = ENAMETOOLONG;
>   }
> }
> -  osaf_mutex_unlock_ordie(_);
> return sock;
>   }
>   
>   UnixSocket::~UnixSocket() {
> Close();
> -  int result = pthread_mutex_destroy(_);
> -  if (result != 
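
The pattern this patch moves to, a guard object confined to a block instead of explicit lock()/unlock() calls, can be sketched as follows. The definitions of base::Mutex and base::Lock are not part of this message, so the sketch assumes Lock behaves like a scoped guard and uses std::mutex and std::lock_guard as stand-ins; `ProcessGroup`, `GroupId`, `Set` and `Take` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical miniature of the Process class. std::mutex and
// std::lock_guard stand in for base::Mutex and base::Lock.
using GroupId = int;

class ProcessGroup {
 public:
  void Set(GroupId pgid) {
    assert(pgid > 0);  // a usable process group id is positive
    std::lock_guard<std::mutex> lock(mutex_);
    process_group_id_ = pgid;
  }

  // Read and clear the stored id as one step under the mutex.
  GroupId Take() {
    GroupId pgid;
    {
      std::lock_guard<std::mutex> lock(mutex_);  // released at the brace
      pgid = process_group_id_;
      process_group_id_ = 0;
    }
    // Slow work, such as signalling the group, would go here,
    // outside the critical section.
    return pgid;
  }

 private:
  std::mutex mutex_;
  GroupId process_group_id_ = 0;
};
```

The inner braces play the role of the removed lock.unlock() call: the mutex is released on every path out of the block, including early returns and exceptions.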

Re: [devel] [PATCH 1 of 1] mds: Improve log message [#2193]

2016-11-17 Thread A V Mahesh
Hi HansN,

ACK, not tested.

-AVM


On 11/17/2016 12:59 PM, Hans Nordeback wrote:
>   osaf/libs/core/mds/mds_dt_tipc.c |  7 +--
>   1 files changed, 5 insertions(+), 2 deletions(-)
>
>
> diff --git a/osaf/libs/core/mds/mds_dt_tipc.c b/osaf/libs/core/mds/mds_dt_tipc.c
> --- a/osaf/libs/core/mds/mds_dt_tipc.c
> +++ b/osaf/libs/core/mds/mds_dt_tipc.c
> @@ -2497,7 +2497,7 @@ static uint32_t mdtm_sendto(uint8_t *buf
>   {
>   /* Can be made as macro even */
>   struct sockaddr_tipc server_addr;
> - int send_len = 0;
> + ssize_t send_len = 0;
>   #ifdef MDS_CHECKSUM_ENABLE_FLAG
>   uint16_t checksum = 0;
>   #endif
> @@ -2524,8 +2524,11 @@ static uint32_t mdtm_sendto(uint8_t *buf
>   if (send_len == buff_len) {
>   m_MDS_LOG_INFO("MDTM: Successfully sent message");
>   return NCSCC_RC_SUCCESS;
> + } else if (send_len == -1) {
> + m_MDS_LOG_ERR("MDTM: Failed to send message err :%s", strerror(errno));
> + return NCSCC_RC_FAILURE;
>   } else {
> - m_MDS_LOG_ERR("MDTM: Failed to send message err :%s", strerror(errno));
> + m_MDS_LOG_ERR("MDTM: Failed to send message send_len :%zd", send_len);
>   return NCSCC_RC_FAILURE;
>   }
>   }
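
The distinction the patch draws between a failed send and a short send can be sketched on its own. errno is only meaningful when the call returned -1; for a short send the useful information is the byte count. `DescribeSendResult` is a hypothetical helper written for this illustration and is not part of MDS.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <string>
#include <sys/types.h>  // ssize_t

// Hypothetical helper: build the log text for a sendto()-style result.
// `saved_errno` is the errno value captured right after the call.
std::string DescribeSendResult(ssize_t sent, size_t expected,
                               int saved_errno) {
  assert(sent >= -1);  // sendto() returns -1 or a byte count
  if (sent >= 0 && static_cast<size_t>(sent) == expected) {
    return "sent";
  }
  if (sent == -1) {
    // The call failed, so errno describes the reason.
    return std::string("failed: ") + std::strerror(saved_errno);
  }
  // Fewer bytes than requested went out; errno was not set by this call.
  return "short send: " + std::to_string(static_cast<long long>(sent)) +
         " of " + std::to_string(static_cast<unsigned long long>(expected)) +
         " bytes";
}
```

Storing the result in a signed type wide enough for the return value (ssize_t in the patch) is what makes the -1 comparison and the %zd format reliable.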

