Re: [devel] [PATCH 0 of 2] Review Request for IMM: add error strings for CCB operations for ERR_NOT_EXIST error [#459]

2014-04-03 Thread Anders Björnerstedt
Ack from me.
Good work!

I have not tested the patch, only reviewed the code. 
Most important is to test that this patch does not break anything (regression 
testing).
Hopefully Neelakanta can do some of that.

The actual functionality provided by the patch is a better/clearer error 
indication
towards human users. So testing that only consists of verifgying that you do 
get the
appropriate error string for the cases where an OI is missing. The problem was 
the 
ambiguity of getting ERR_NOT_EXIST on a ccb delete or modify. For ccb create 
there
Would not be the same ambiguity since the typical object related error is 
instead
there ERR_EXIST. 

While reviewing the code I spotted what seems to be a potential memory leak in 
old existing code.
This is in immnd_evt_proc_ccb_compl_rsp() where immModel_ccbGrabErrStrings() is 
done generically
but the corresponding release (using immsv_evt_free_attrNames) is done only if 
the client to
respond to is local. The memory leak would happen if one or more Ois reject a 
ccb in validation
and add error strings. I guess this does not happen very often. 

I will write a defect ticket for that.

At some point i we should rename immsv_evt_free_attrNames to something better...

/AndersBJ



-Original Message-
From: Zoran Milinkovic [mailto:zoran.milinko...@ericsson.com] 
Sent: den 2 april 2014 15:37
To: reddy.neelaka...@oracle.com
Cc: opensaf-devel@lists.sourceforge.net
Subject: [devel] [PATCH 0 of 2] Review Request for IMM: add error strings for 
CCB operations for ERR_NOT_EXIST error [#459]

Summary: IMM: add error strings for CCB operations for ERR_NOT_EXIST error 
[#459] Review request for Trac Ticket(s): 459 Peer Reviewer(s): Anders, 
Neelakanta Pull request to: Zoran Affected branch(es): default(4.5) Development 
branch: default(4.5)


Impacted area   Impact y/n

 Docsn
 Build systemn
 RPM/packaging   n
 Configuration files n
 Startup scripts n
 SAF servicesy
 OpenSAF servicesn
 Core libraries  n
 Samples n
 Tests   n
 Other   n


Comments (indicate scope for each "y" above):
-

changeset a866790f3345c3f99c57d7cb9aa9579a1d5c34ad
Author: Zoran Milinkovic 
Date:   Wed, 02 Apr 2014 15:22:02 +0200

IMM: add error strings for CCB operations for ERR_NOT_EXIST error [#459]

When an implementer is detached from an object/class, CCB operations 
return
ERR_NOT_EXIST error code. The patch should give more information 
regarding
the error code in the error string.

changeset b02f4b505de84c6fb338880292835ba45074ce6a
Author: Zoran Milinkovic 
Date:   Wed, 02 Apr 2014 15:24:46 +0200

IMMTOOLS: write error strings for ERR_NOT_EXIST error code in immcfg 
[#459]


Complete diffstat:
--
 osaf/services/saf/immsv/immnd/ImmModel.cc |  60 

 osaf/services/saf/immsv/immnd/ImmModel.hh |   5 +
 osaf/services/saf/immsv/immnd/immnd_evt.c |  27 ---
 osaf/tools/safimm/immcfg/imm_cfg.c|  69 
-
 4 files changed, 125 insertions(+), 36 deletions(-)


Testing Commands:
-
immcfg -a attr=123 id=1
immcfg -d id=1
immcfg -c TestClass id=2


Testing, Expected Results:
--
Create an object of a TestClass ("immcfg -c TestClass id=1").
Attach an implementer to the class using immapplier ("immapplier -a TestImpl 
TestClass").
Exit from immapplier (Ctrl+C).
1. Modify id=1 ("immcfg -a attr=123 id=1"), and the error string 
(ERR_NOT_EXIST) should be written to the output 2. Delete id=1 ("immcfg -d 
id=1"), and the error string (ERR_NOT_EXIST) should be written to the output 3. 
Create a new object of class TestClass ("immcfg -c TestClass id=2"), and the 
error string (ERR_NOT_EXIST) should be written to the output.


Conditions of Submission:
-
Ack from Anders and Neelakanta


Arch  Built StartedLinux distro
---
mipsn  n
mips64  n  n
x86 n  n
x86_64  n  n
powerpc n  n
powerpc64   n  n


Reviewer Checklist:
---
[Submitters: make sure that your review doesn't trigger any checkmarks!]


Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
that need proper data filled in.

___ You have failed to nominate the proper persons for review and push.

___ Your patches do not have proper short+long header

___ You have grammar/spelling in your header that is unacceptable.

___ You have exceeded a sensible line length in your headers/comments/text.

___ You have failed to pu

[devel] [PATCH 1 of 1] IMM: fix memory leak in IMM [#834]

2014-04-03 Thread Zoran Milinkovic
 osaf/services/saf/immsv/immnd/immnd_evt.c |  2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)


diff --git a/osaf/services/saf/immsv/immnd/immnd_evt.c 
b/osaf/services/saf/immsv/immnd/immnd_evt.c
--- a/osaf/services/saf/immsv/immnd/immnd_evt.c
+++ b/osaf/services/saf/immsv/immnd/immnd_evt.c
@@ -3826,9 +3826,9 @@ static void immnd_evt_proc_ccb_compl_rsp
if (rc != NCSCC_RC_SUCCESS) {
LOG_WA("Failed to send response to agent/client 
over MDS rc:%u", rc);
}
-   
immsv_evt_free_attrNames(send_evt.info.imma.info.errRsp.errStrings);
}
finalize_ccb:
+   immsv_evt_free_attrNames(errStrings);
TRACE_2("CCB COMPLETED: TERMINATING CCB:%u", 
evt->info.ccbUpcallRsp.ccbId);
err = immModel_ccbFinalize(cb, evt->info.ccbUpcallRsp.ccbId);
if (err != SA_AIS_OK) {

--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


[devel] [PATCH 0 of 1] Review Request for IMM: fix memory leak in IMM [#834]

2014-04-03 Thread Zoran Milinkovic
Summary: IMM: fix memory leak in IMM [#834]
Review request for Trac Ticket(s): 834
Peer Reviewer(s): Neelakanta
Pull request to: Zoran
Affected branch(es): opensaf-4.3.x, opensaf-4.4.x, default(4.5)
Development branch: default(4.5)


Impacted area   Impact y/n

 Docsn
 Build systemn
 RPM/packaging   n
 Configuration files n
 Startup scripts n
 SAF servicesy
 OpenSAF servicesn
 Core libraries  n
 Samples n
 Tests   n
 Other   n


Comments (indicate scope for each "y" above):
-

changeset cd5fa346756eb5e5d48a9e6345bff5c33136e683
Author: Zoran Milinkovic 
Date:   Thu, 03 Apr 2014 11:13:35 +0200

IMM: fix memory leak in IMM [#834]


Complete diffstat:
--
 osaf/services/saf/immsv/immnd/immnd_evt.c |  2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)


Testing Commands:
-



Testing, Expected Results:
--
Check that valgrind does not report any memoty leak.


Conditions of Submission:
-
Ack from Neelakanta


Arch  Built StartedLinux distro
---
mipsn  n
mips64  n  n
x86 n  n
x86_64  n  n
powerpc n  n
powerpc64   n  n


Reviewer Checklist:
---
[Submitters: make sure that your review doesn't trigger any checkmarks!]


Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
that need proper data filled in.

___ You have failed to nominate the proper persons for review and push.

___ Your patches do not have proper short+long header

___ You have grammar/spelling in your header that is unacceptable.

___ You have exceeded a sensible line length in your headers/comments/text.

___ You have failed to put in a proper Trac Ticket # into your commits.

___ You have incorrectly put/left internal data in your comments/files
(i.e. internal bug tracking tool IDs, product names etc)

___ You have not given any evidence of testing beyond basic build tests.
Demonstrate some level of runtime or other sanity testing.

___ You have ^M present in some of your files. These have to be removed.

___ You have needlessly changed whitespace or added whitespace crimes
like trailing spaces, or spaces before tabs.

___ You have mixed real technical changes with whitespace and other
cosmetic code cleanup changes. These have to be separate commits.

___ You need to refactor your submission into logical chunks; there is
too much content into a single commit.

___ You have extraneous garbage in your review (merge commits etc)

___ You have giant attachments which should never have been sent;
Instead you should place your content in a public tree to be pulled.

___ You have too many commits attached to an e-mail; resend as threaded
commits, or place in a public tree for a pull.

___ You have resent this content multiple times without a clear indication
of what has changed between each re-send.

___ You have failed to adequately and individually address all of the
comments and change requests that were proposed in the initial review.

___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)

___ Your computer have a badly configured date and time; confusing the
the threaded patch review.

___ Your changes affect IPC mechanism, and you don't present any results
for in-service upgradability test.

___ Your changes affect user manual and documentation, your patch series
do not contain the patch that updates the Doxygen manual.


--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


Re: [devel] [PATCH 1 of 1] IMM: fix memory leak in IMM [#834]

2014-04-03 Thread Anders Björnerstedt
Ack from me.

/AndersBj 

-Original Message-
From: Zoran Milinkovic [mailto:zoran.milinko...@ericsson.com] 
Sent: den 3 april 2014 11:34
To: reddy.neelaka...@oracle.com
Cc: opensaf-devel@lists.sourceforge.net
Subject: [devel] [PATCH 1 of 1] IMM: fix memory leak in IMM [#834]

 osaf/services/saf/immsv/immnd/immnd_evt.c |  2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)


diff --git a/osaf/services/saf/immsv/immnd/immnd_evt.c 
b/osaf/services/saf/immsv/immnd/immnd_evt.c
--- a/osaf/services/saf/immsv/immnd/immnd_evt.c
+++ b/osaf/services/saf/immsv/immnd/immnd_evt.c
@@ -3826,9 +3826,9 @@ static void immnd_evt_proc_ccb_compl_rsp
if (rc != NCSCC_RC_SUCCESS) {
LOG_WA("Failed to send response to agent/client 
over MDS rc:%u", rc);
}
-   
immsv_evt_free_attrNames(send_evt.info.imma.info.errRsp.errStrings);
}
finalize_ccb:
+   immsv_evt_free_attrNames(errStrings);
TRACE_2("CCB COMPLETED: TERMINATING CCB:%u", 
evt->info.ccbUpcallRsp.ccbId);
err = immModel_ccbFinalize(cb, evt->info.ccbUpcallRsp.ccbId);
if (err != SA_AIS_OK) {

--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


Re: [devel] [PATCH 1 of 2] IMM: add error strings for CCB operations for ERR_NOT_EXIST error [#459]

2014-04-03 Thread Hans Feldt
I think this patch is a bit controversial. We need to define a "protocol" for 
the text carried in the error strings to 
allow for future changes. For example we should consider source identification, 
maybe we should have key/value pairs in 
the string?

Remember that the string might end up in front of an operator...

Thanks,
Hans

On 04/02/2014 03:36 PM, Zoran Milinkovic wrote:
>   osaf/services/saf/immsv/immnd/ImmModel.cc |  60 
> +++
>   osaf/services/saf/immsv/immnd/ImmModel.hh |   5 ++
>   osaf/services/saf/immsv/immnd/immnd_evt.c |  27 -
>   3 files changed, 89 insertions(+), 3 deletions(-)
>
>
> When an implementer is detached from an object/class, CCB operations return 
> ERR_NOT_EXIST error code.
> The patch should give more information regarding the error code in the error 
> string.
>
> diff --git a/osaf/services/saf/immsv/immnd/ImmModel.cc 
> b/osaf/services/saf/immsv/immnd/ImmModel.cc
> --- a/osaf/services/saf/immsv/immnd/ImmModel.cc
> +++ b/osaf/services/saf/immsv/immnd/ImmModel.cc
> @@ -6950,6 +6950,10 @@ SaAisErrorT ImmModel::ccbObjectCreate(Im
>   TRACE_7("ERR_NOT_EXIST: object '%s' does not have an "
>   "implementer and flag SA_IMM_CCB_REGISTERED_OI is 
> set",
>   objectName.c_str());
> +setCcbErrorString(ccb,
> +"ERR_NOT_EXIST: object '%s' does not have an "
> +"implementer and flag SA_IMM_CCB_REGISTERED_OI 
> is set",
> +objectName.c_str());
>   err = SA_AIS_ERR_NOT_EXIST;
>   }
>   } else { /* SA_IMM_CCB_REGISTERED_OI NOT set */
> @@ -7790,6 +7794,10 @@ ImmModel::ccbObjectModify(const ImmsvOmC
>   TRACE_7("ERR_NOT_EXIST: object '%s' does not have an "
>   "implementer and flag SA_IMM_CCB_REGISTERED_OI is set",
>   objectName.c_str());
> +setCcbErrorString(ccb,
> +"ERR_NOT_EXIST: object '%s' does not have an "
> +"implementer and flag SA_IMM_CCB_REGISTERED_OI is 
> set",
> +objectName.c_str());
>   err = SA_AIS_ERR_NOT_EXIST;
>   }
>   } else { /* SA_IMM_CCB_REGISTERED_OI NOT set */
> @@ -8201,6 +8209,10 @@ ImmModel::deleteObject(ObjectMap::iterat
>   TRACE_7("ERR_NOT_EXIST: object '%s' does not have an 
> implementer "
>   "and flag SA_IMM_CCB_REGISTERED_OI is set",
>   oi->first.c_str());
> +setCcbErrorString(ccb,
> +"ERR_NOT_EXIST: object '%s' does not have an 
> implementer "
> +"and flag SA_IMM_CCB_REGISTERED_OI is set",
> +oi->first.c_str());
>   return SA_AIS_ERR_NOT_EXIST;
>   }
>   } else {  /* SA_IMM_CCB_REGISTERED_OI NOT set */
> @@ -8370,6 +8382,54 @@ ImmModel::deleteObject(ObjectMap::iterat
>   return SA_AIS_OK;
>   }
>
> +void
> +ImmModel::setCcbErrorString(CcbInfo *ccb, const char *errorString, ...)
> +{
> +int errLen = strlen(errorString) + 1;
> +char *fmtError = (char *)malloc(errLen);
> +int len;
> +va_list vl;
> +
> +va_start(vl, errorString);
> +len = vsnprintf(fmtError, errLen, errorString, vl);
> +va_end(vl);
> +
> +osafassert(len >= 0);
> +len++;   /* Reserve one byte for null-terminated sign '\0' */
> +if(len > errLen) {
> +fmtError = (char *)realloc(fmtError, len);
> +va_start(vl, errorString);
> +osafassert(vsnprintf(fmtError, len, errorString, vl) >= 0);
> +va_end(vl);
> +}
> +
> +unsigned int ix=0;
> +ImmsvAttrNameList* errStr = ccb->mErrorStrings;
> +ImmsvAttrNameList** errStrTail = &(ccb->mErrorStrings);
> +while(errStr) {
> +if(!strncmp(fmtError, errStr->name.buf, len)) {
> +TRACE_5("Discarding duplicate error string '%s' for ccb id %u",
> +fmtError, ccb->mId);
> +free(fmtError);
> +return;
> +}
> +++ix;
> +errStrTail = &(errStr->next);
> +errStr = errStr->next;
> +}
> +
> +if(ix >= IMMSV_MAX_ATTRIBUTES) {
> +TRACE_5("Discarding error string '%s' for ccb id %u (too many)",
> +fmtError, ccb->mId);
> +free(fmtError);
> +} else {
> +(*errStrTail) = (ImmsvAttrNameList *) 
> malloc(sizeof(ImmsvAttrNameList));
> +(*errStrTail)->next = NULL;
> +(*errStrTail)->name.size = len;
> +(*errStrTail)->name.buf = fmtError;
> +}
> +}
> +
>   bool
>   ImmModel::hasLocalClassAppliers(ClassInfo* classInfo)
>   {
> diff --git a/osaf/services/saf/immsv/immnd/ImmModel.hh 
> b/osaf/services/saf/immsv/immnd/ImmModel.hh
> --- a/osaf/services/saf/immsv/immnd/ImmModel.hh
> +++ b/osaf/services/saf/immsv/immnd/ImmM

Re: [devel] [PATCH 1 of 3] clm: add null check for unconfigured node join request [#816]

2014-04-03 Thread Hans Feldt
Ack
/Hans

On 03/27/2014 02:13 AM, mathi.naic...@oracle.com wrote:
>   osaf/services/saf/clmsv/clms/clms_evt.c |  48 
> ++--
>   1 files changed, 27 insertions(+), 21 deletions(-)
>
>
> When a node tries to join with a node name that is not configured
> or just gets simply mis-configured, then CLM crashes causing due
> to lack of null check during the processing of node join request.
> When this issue happens, the following logging is done by CLM:
> Mar 26 18:44:52 SC-1 local0.err osafclmd[418]: ER CLM NodeName: 'SC-3' 
> doesn't match entry in imm.xml.
> Specify a correct node name in/etc/opensaf/node_name.
> This patch 1 of 1 adds a null check such that CLM server doesn't crashes.
>
> diff --git a/osaf/services/saf/clmsv/clms/clms_evt.c 
> b/osaf/services/saf/clmsv/clms/clms_evt.c
> --- a/osaf/services/saf/clmsv/clms/clms_evt.c
> +++ b/osaf/services/saf/clmsv/clms/clms_evt.c
> @@ -286,34 +286,40 @@ uint32_t proc_node_up_msg(CLMS_CB * cb,
>  nodeup_info->node_name.value);
>   }
>
> - /* Retrieve IP information */
> - if ((ip = (IPLIST *)ncs_patricia_tree_get(&clms_cb->iplist, (uint8_t 
> *)&nodeid)) == NULL) {
> - clm_msg.info.api_resp_info.rc = SA_AIS_ERR_NOT_EXIST;
> - LOG_ER("IP information not found for: %s with node_id: %u", 
> nodeup_info->node_name.value, nodeid);
> - } else {
> - if (ip->addr.length) { /* If length = 0, it is AF_TIPC. So 
> process only IPv4 or 6 addresses */
> - /* We might want to validate IP */
> - if (ip_matched(ip->addr.family, ip->addr.value, 
> node->node_addr.family,node->node_addr.value)) {
> - clm_msg.info.api_resp_info.rc = SA_AIS_OK;
> - } else {
> - clm_msg.info.api_resp_info.rc = 
> SA_AIS_ERR_NOT_EXIST;
> - LOG_ER("IP address on %s is not matching the 
> ipaddress in" PKGSYSCONFDIR "/imm.xml",
> + if (node != NULL) {
> + /* Retrieve IP information */
> + if ((ip = (IPLIST *)ncs_patricia_tree_get(&clms_cb->iplist, 
> (uint8_t *)&nodeid)) == NULL) {
> + clm_msg.info.api_resp_info.rc = SA_AIS_ERR_NOT_EXIST;
> + LOG_ER("IP information not found for: %s with node_id: 
> %u",
> + 
> nodeup_info->node_name.value, nodeid);
> + } else {
> + if (ip->addr.length) { /* If length = 0, it is AF_TIPC. 
> So process only IPv4 or 6 addresses */
> + /* We might want to validate IP */
> + if (ip_matched(ip->addr.family, ip->addr.value,
> + 
> node->node_addr.family,node->node_addr.value)) {
> + clm_msg.info.api_resp_info.rc = 
> SA_AIS_OK;
> + } else {
> + clm_msg.info.api_resp_info.rc = 
> SA_AIS_ERR_NOT_EXIST;
> + LOG_ER("IP address on %s is not 
> matching the ipaddress in" PKGSYSCONFDIR "/imm.xml",
>   
> nodeup_info->node_name.value);
> + }
>   }
>   }
> - }
>
> - if ((cb->node_id != nodeup_info->node_id) && (node->nodeup == SA_TRUE)){
> - clm_msg.info.api_resp_info.rc = SA_AIS_ERR_EXIST;
> - LOG_ER("Duplicate node join request for CLM node: '%s'. Specify 
> a unique node name in" PKGSYSCONFDIR "/node_name",
> -nodeup_info->node_name.value);
> - } else
> - clm_msg.info.api_resp_info.rc = SA_AIS_OK;
> - 
> + if ((cb->node_id != nodeup_info->node_id) && (node->nodeup == 
> SA_TRUE)){
> + clm_msg.info.api_resp_info.rc = SA_AIS_ERR_EXIST;
> + LOG_ER("Duplicate node join request for CLM node: '%s'. 
> "
> + "Specify a unique node name in" 
> PKGSYSCONFDIR "/node_name",
> + nodeup_info->node_name.value);
> + } else
> + clm_msg.info.api_resp_info.rc = SA_AIS_OK;
> +
> + } /* Node exists in DB */
> +
>   if (clm_msg.info.api_resp_info.rc != SA_AIS_OK) {
>   clm_msg.evt_type = CLMSV_CLMS_TO_CLMA_API_RESP_MSG;
>   clm_msg.info.api_resp_info.type = CLMSV_CLUSTER_JOIN_RESP;
> - clm_msg.info.api_resp_info.param.node_name = node_name;
> + clm_msg.info.api_resp_info.param.node_name = 
> nodeup_info->node_name;
>   rc = clms_mds_msg_send(cb, &clm_msg, &evt->fr_dest, 
> &evt->mds_ctxt, MDS_SEND_PRIORITY_HIGH,
>  NCSMDS_SVC_ID_CLMNA);
>   if (rc != NCSCC_RC_SUCCESS)
>
>


Re: [devel] [PATCH 1 of 2] IMM: add error strings for CCB operations for ERR_NOT_EXIST error [#459]

2014-04-03 Thread Anders Björnerstedt
This patch only affects the ccb-operation callbacks for ccb- 
create/delete/modify.
It does not append anything to a user error string since the callback never
reached the user. It adds an error string in the imm server where there 
previously could
be no string,  now clarifying what the cause of the ERR_NOT_EXIST was. 

The only issue I have with the fix (that I realized after acking) is that for 
the create
callback it is really the ERR_EXIST case that mainly needs clarification. The 
only reason
for this patch is to increase comprehension of the cause by some human user. 
This string 
is not as such intended to be caught and processed by a program. It is 
backwards compatible
with what exists. 


I think the point you are addressing is the need to support program processing 
of the 
error string to discriminate (process differently) based on the different 
causes of
the same error code. There is a need for that also, particularly for handling 
the
result of an aborted CCB. 

Here we are talking some "taging" either as prefix or postfix. 
We can and will add that also, at least to the cases where it is important.
The cardinal example being SA_AIS_ERR_FAILED_OPERATION for a CCB (abort of a 
CCB).

We may even add this also to ccb-create/delete/modify if there is need to 
programmetricaly 
process diferrently, e.g. The cases of an object not esisting from the 
implementer being detached.

In summary, this patch solves a human factors problem and does not introiduce 
any new problem
Or make it more difficult to solve the other problem, i.e. I dont see it as 
controversial.

/AndersBj


-Original Message-
From: Hans Feldt [mailto:hans.fe...@ericsson.com] 
Sent: den 3 april 2014 12:33
To: Zoran Milinkovic; reddy.neelaka...@oracle.com
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [devel] [PATCH 1 of 2] IMM: add error strings for CCB operations 
for ERR_NOT_EXIST error [#459]

I think this patch is a bit controversial. We need to define a "protocol" for 
the text carried in the error strings to allow for future changes. For example 
we should consider source identification, maybe we should have key/value pairs 
in the string?

Remember that the string might end up in front of an operator...

Thanks,
Hans

On 04/02/2014 03:36 PM, Zoran Milinkovic wrote:
>   osaf/services/saf/immsv/immnd/ImmModel.cc |  60 
> +++
>   osaf/services/saf/immsv/immnd/ImmModel.hh |   5 ++
>   osaf/services/saf/immsv/immnd/immnd_evt.c |  27 -
>   3 files changed, 89 insertions(+), 3 deletions(-)
>
>
> When an implementer is detached from an object/class, CCB operations return 
> ERR_NOT_EXIST error code.
> The patch should give more information regarding the error code in the error 
> string.
>
> diff --git a/osaf/services/saf/immsv/immnd/ImmModel.cc 
> b/osaf/services/saf/immsv/immnd/ImmModel.cc
> --- a/osaf/services/saf/immsv/immnd/ImmModel.cc
> +++ b/osaf/services/saf/immsv/immnd/ImmModel.cc
> @@ -6950,6 +6950,10 @@ SaAisErrorT ImmModel::ccbObjectCreate(Im
>   TRACE_7("ERR_NOT_EXIST: object '%s' does not have an "
>   "implementer and flag SA_IMM_CCB_REGISTERED_OI is 
> set",
>   objectName.c_str());
> +setCcbErrorString(ccb,
> +"ERR_NOT_EXIST: object '%s' does not have an "
> +"implementer and flag SA_IMM_CCB_REGISTERED_OI 
> is set",
> +objectName.c_str());
>   err = SA_AIS_ERR_NOT_EXIST;
>   }
>   } else { /* SA_IMM_CCB_REGISTERED_OI NOT set */ @@ 
> -7790,6 +7794,10 @@ ImmModel::ccbObjectModify(const ImmsvOmC
>   TRACE_7("ERR_NOT_EXIST: object '%s' does not have an "
>   "implementer and flag SA_IMM_CCB_REGISTERED_OI is set",
>   objectName.c_str());
> +setCcbErrorString(ccb,
> +"ERR_NOT_EXIST: object '%s' does not have an "
> +"implementer and flag SA_IMM_CCB_REGISTERED_OI is 
> set",
> +objectName.c_str());
>   err = SA_AIS_ERR_NOT_EXIST;
>   }
>   } else { /* SA_IMM_CCB_REGISTERED_OI NOT set */ @@ -8201,6 
> +8209,10 @@ ImmModel::deleteObject(ObjectMap::iterat
>   TRACE_7("ERR_NOT_EXIST: object '%s' does not have an 
> implementer "
>   "and flag SA_IMM_CCB_REGISTERED_OI is set",
>   oi->first.c_str());
> +setCcbErrorString(ccb,
> +"ERR_NOT_EXIST: object '%s' does not have an 
> implementer "
> +"and flag SA_IMM_CCB_REGISTERED_OI is set",
> +oi->first.c_str());
>   return SA_AIS_ERR_NOT_EXIST;
>   }
>   } else {  /* SA_IMM_CCB_REGISTERED_OI NOT set */ @@ -8370,6 
> +8382,54 @@ ImmModel::deleteObject(Obje

Re: [devel] [PATCH 1 of 2] IMM: add error strings for CCB operations for ERR_NOT_EXIST error [#459]

2014-04-03 Thread Anders Bjornerstedt
Anders Björnerstedt wrote:
> This patch only affects the ccb-operation callbacks for ccb- 
> create/delete/modify.
> It does not append anything to a user error string since the callback never
> reached the user. It adds an error string in the imm server where there 
> previously could
> be no string,  now clarifying what the cause of the ERR_NOT_EXIST was. 
>
> The only issue I have with the fix (that I realized after acking) is that for 
> the create
> callback it is really the ERR_EXIST case that mainly needs clarification. 

Actually the ERR_EXIST case is unambiguous for saImmOmCcbObjectCreate,
while ERR_NOT_EXIST is ambiguous also for this operation (three cases).
We could provide an error string for all three cases, all caught in the 
imm server,
with error messages intended for "human consumption".

/Anders
> The only reason
> for this patch is to increase comprehension of the cause by some human user. 
> This string 
> is not as such intended to be caught and processed by a program. It is 
> backwards compatible
> with what exists. 
>
>
> I think the point you are addressing is the need to support program 
> processing of the 
> error string to discriminate (process differently) based on the different 
> causes of
> the same error code. There is a need for that also, particularly for handling 
> the
> result of an aborted CCB. 
>
> Here we are talking some "taging" either as prefix or postfix. 
> We can and will add that also, at least to the cases where it is important.
> The cardinal example being SA_AIS_ERR_FAILED_OPERATION for a CCB (abort of a 
> CCB).
>
> We may even add this also to ccb-create/delete/modify if there is need to 
> programmetricaly 
> process diferrently, e.g. The cases of an object not esisting from the 
> implementer being detached.
>
> In summary, this patch solves a human factors problem and does not introiduce 
> any new problem
> Or make it more difficult to solve the other problem, i.e. I dont see it as 
> controversial.
>
> /AndersBj
>
>
> -Original Message-
> From: Hans Feldt [mailto:hans.fe...@ericsson.com] 
> Sent: den 3 april 2014 12:33
> To: Zoran Milinkovic; reddy.neelaka...@oracle.com
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [devel] [PATCH 1 of 2] IMM: add error strings for CCB operations 
> for ERR_NOT_EXIST error [#459]
>
> I think this patch is a bit controversial. We need to define a "protocol" for 
> the text carried in the error strings to allow for future changes. For 
> example we should consider source identification, maybe we should have 
> key/value pairs in the string?
>
> Remember that the string might end up in front of an operator...
>
> Thanks,
> Hans
>
> On 04/02/2014 03:36 PM, Zoran Milinkovic wrote:
>   
>>   osaf/services/saf/immsv/immnd/ImmModel.cc |  60 
>> +++
>>   osaf/services/saf/immsv/immnd/ImmModel.hh |   5 ++
>>   osaf/services/saf/immsv/immnd/immnd_evt.c |  27 -
>>   3 files changed, 89 insertions(+), 3 deletions(-)
>>
>>
>> When an implementer is detached from an object/class, CCB operations return 
>> ERR_NOT_EXIST error code.
>> The patch should give more information regarding the error code in the error 
>> string.
>>
>> diff --git a/osaf/services/saf/immsv/immnd/ImmModel.cc 
>> b/osaf/services/saf/immsv/immnd/ImmModel.cc
>> --- a/osaf/services/saf/immsv/immnd/ImmModel.cc
>> +++ b/osaf/services/saf/immsv/immnd/ImmModel.cc
>> @@ -6950,6 +6950,10 @@ SaAisErrorT ImmModel::ccbObjectCreate(Im
>>   TRACE_7("ERR_NOT_EXIST: object '%s' does not have an "
>>   "implementer and flag SA_IMM_CCB_REGISTERED_OI is 
>> set",
>>   objectName.c_str());
>> +setCcbErrorString(ccb,
>> +"ERR_NOT_EXIST: object '%s' does not have an "
>> +"implementer and flag SA_IMM_CCB_REGISTERED_OI 
>> is set",
>> +objectName.c_str());
>>   err = SA_AIS_ERR_NOT_EXIST;
>>   }
>>   } else { /* SA_IMM_CCB_REGISTERED_OI NOT set */ @@ 
>> -7790,6 +7794,10 @@ ImmModel::ccbObjectModify(const ImmsvOmC
>>   TRACE_7("ERR_NOT_EXIST: object '%s' does not have an "
>>   "implementer and flag SA_IMM_CCB_REGISTERED_OI is set",
>>   objectName.c_str());
>> +setCcbErrorString(ccb,
>> +"ERR_NOT_EXIST: object '%s' does not have an "
>> +"implementer and flag SA_IMM_CCB_REGISTERED_OI is 
>> set",
>> +objectName.c_str());
>>   err = SA_AIS_ERR_NOT_EXIST;
>>   }
>>   } else { /* SA_IMM_CCB_REGISTERED_OI NOT set */ @@ -8201,6 
>> +8209,10 @@ ImmModel::deleteObject(ObjectMap::iterat
>>   TRACE_7("ERR_NOT_EXIST: object '%s' does not have an 
>> implementer "
>>   "and flag SA_IMM_CCB_REGISTERED_OI i

Re: [devel] [PATCH 2 of 5] amfnd: store amfd mds ver [#574]

2014-04-03 Thread SuryaNarayana Garlapati
Comments:
1. First of all, the mds sub part version should not be stored/used. You 
should depend on the message format version as SI rank is being embedded 
in the new message(which is being sent from AVD).
  If you have a requirement for usage of message version, you can get it 
at runtime when you receive a SUSI message etc. Decode the received 
message and use the message format version.
2.  As per patch, Storage of mds version will happen only once for a 
life time. You need to process the NEW-ACTIVE event of MDS as well
which will make sure to store the mds version even after the 
failover/switchover.

Regards
Surya

On Friday 28 March 2014 06:44 PM, Hans Feldt wrote:
>   osaf/services/saf/avsv/avd/avd_mds.c   |  2 ++
>   osaf/services/saf/avsv/avnd/avnd_di.c  |  1 +
>   osaf/services/saf/avsv/avnd/avnd_mds.c |  3 +++
>   osaf/services/saf/avsv/avnd/include/avnd_cb.h  |  1 +
>   osaf/services/saf/avsv/avnd/include/avnd_evt.h |  1 +
>   5 files changed, 8 insertions(+), 0 deletions(-)
>
>
> diff --git a/osaf/services/saf/avsv/avd/avd_mds.c 
> b/osaf/services/saf/avsv/avd/avd_mds.c
> --- a/osaf/services/saf/avsv/avd/avd_mds.c
> +++ b/osaf/services/saf/avsv/avd/avd_mds.c
> @@ -414,6 +414,8 @@ static uint32_t avd_mds_svc_evt(MDS_CALL
>   break;
>   
>   case NCSMDS_SVC_ID_AVND:
> + // TODO remove, just for test
> + LOG_NO("%s: AVND UP version: %u", __FUNCTION__, 
> evt_info->i_rem_svc_pvt_ver);
>   if (evt_info->i_node_id == cb->node_id_avd) {
>   AVD_EVT *evt = calloc(1, sizeof(AVD_EVT));
>   osafassert(evt);
> diff --git a/osaf/services/saf/avsv/avnd/avnd_di.c 
> b/osaf/services/saf/avsv/avnd/avnd_di.c
> --- a/osaf/services/saf/avsv/avnd/avnd_di.c
> +++ b/osaf/services/saf/avsv/avnd/avnd_di.c
> @@ -421,6 +421,7 @@ uint32_t avnd_evt_mds_avd_up_evh(AVND_CB
>   
>   /* store the AVD MDS address */
>   cb->avd_dest = evt->info.mds.mds_dest;
> + cb->avd_mds_ver = evt->info.mds.rem_svc_pvt_ver;
>   
>   avnd_send_node_up_msg();
>   }
> diff --git a/osaf/services/saf/avsv/avnd/avnd_mds.c 
> b/osaf/services/saf/avsv/avnd/avnd_mds.c
> --- a/osaf/services/saf/avsv/avnd/avnd_mds.c
> +++ b/osaf/services/saf/avsv/avnd/avnd_mds.c
> @@ -594,8 +594,11 @@ uint32_t avnd_mds_svc_evt(AVND_CB *cb, M
>   case NCSMDS_UP:
>   switch (evt_info->i_svc_id) {
>   case NCSMDS_SVC_ID_AVD:
> + // TODO remove, just for test
> + LOG_NO("%s: AVD UP version: %u", __FUNCTION__, 
> evt_info->i_rem_svc_pvt_ver);
>   /* create the mds event */
>   evt = avnd_evt_create(cb, AVND_EVT_MDS_AVD_UP, 0, 
> &evt_info->i_dest, 0, 0, 0);
> + evt->info.mds.rem_svc_pvt_ver = 
> evt_info->i_rem_svc_pvt_ver;
>   break;
>   
>   case NCSMDS_SVC_ID_AVA:
> diff --git a/osaf/services/saf/avsv/avnd/include/avnd_cb.h 
> b/osaf/services/saf/avsv/avnd/include/avnd_cb.h
> --- a/osaf/services/saf/avsv/avnd/include/avnd_cb.h
> +++ b/osaf/services/saf/avsv/avnd/include/avnd_cb.h
> @@ -50,6 +50,7 @@ typedef struct avnd_cb_tag {
>   MDS_DEST avnd_dest; /* AvND mds addr */
>   MDS_DEST avd_dest;  /* AvD mds addr */
>   bool is_avd_down;   /* Temp: Indicates if AvD went down */
> + MDS_SVC_PVT_SUB_PART_VER avd_mds_ver; /* Director MDS version */
>   
>   /* cb related params */
>   NCS_LOCK lock;  /* cb lock */
> diff --git a/osaf/services/saf/avsv/avnd/include/avnd_evt.h 
> b/osaf/services/saf/avsv/avnd/include/avnd_evt.h
> --- a/osaf/services/saf/avsv/avnd/include/avnd_evt.h
> +++ b/osaf/services/saf/avsv/avnd/include/avnd_evt.h
> @@ -128,6 +128,7 @@ typedef struct avnd_tmr_evt {
>   typedef struct avnd_mds_evt {
>   MDS_DEST mds_dest;  /* mds address */
>   NODE_ID node_id;
> + MDS_SVC_PVT_SUB_PART_VER rem_svc_pvt_ver;
>   } AVND_MDS_EVT;
>   
>   /* HA STATE change event definition */
>
> --
> ___
> Opensaf-devel mailing list
> Opensaf-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel


--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


Re: [devel] [PATCH 2 of 3] clm: change log message for unconfigured nodes, connectivity problems [#816]

2014-04-03 Thread Hans Feldt

See inline
Thanks,
Hans

On 03/27/2014 02:13 AM, mathi.naic...@oracle.com wrote:
>   osaf/services/saf/clmsv/clms/clms_evt.c  |  12 ++--
>   osaf/services/saf/clmsv/nodeagent/main.c |   7 +++
>   2 files changed, 13 insertions(+), 6 deletions(-)
>
>
> When a node join request from an unconfigured/misconfigured node is 
> attempted, then
> CLM prints the following log message:
> Mar 26 18:44:52 SC-1 local0.err osafclmd[418]: ER CLM NodeName: 'SC-3' 
> doesn't match entry in imm.xml.
> Specify a correct node name in/etc/opensaf/node_name.
> This log message is currently categorized as ERROR. However, there are 
> suggestions to improve
> the current log message such that the message more closely depicts
> the case of unconfigured or misconfigured nodes.
> The log string is now changed as below:
> Mar 26 20:23:32 SC-1 local0.notice osafclmd[420]: NO CLM NodeName: 'PL-8' is 
> not a configured cluster node.
> Mar 26 20:23:32 SC-1 local0.notice osafclmd[420]: NO /etc/opensaf/node_name 
> should contain the rdn value of a configured CLM node object name
>
> In the case of problems with connectivity problems, the following modified 
> log strings are added:
> Send timed out. Check status(network, process) of ACTIVE controller
> and
> Send failed: %u. Check network connectivity
>
> diff --git a/osaf/services/saf/clmsv/clms/clms_evt.c 
> b/osaf/services/saf/clmsv/clms/clms_evt.c
> --- a/osaf/services/saf/clmsv/clms/clms_evt.c
> +++ b/osaf/services/saf/clmsv/clms/clms_evt.c
> @@ -282,8 +282,16 @@ uint32_t proc_node_up_msg(CLMS_CB * cb,
>   node = clms_node_get_by_name(&node_name);
>   if (node == NULL) {
>   clm_msg.info.api_resp_info.rc = SA_AIS_ERR_NOT_EXIST;
> - LOG_ER("CLM NodeName: '%s' doesn't match entry in imm.xml. 
> Specify a correct node name in" PKGSYSCONFDIR "/node_name",
> -nodeup_info->node_name.value);
> + /* The /etc/opensaf/node_name is an user exposed configuration 
> file.
> +  * The node_name file contains the RDN value of the CLM node 
> name.
> +  * (a) When opensaf cluster configuration is pre-provisioned 
> using the OpenSAF IMM tools:
> +  * the /etc/opensaf/node_name should contain one of the 
> values specified
> +  * in nodes.cfg while generating the imm.xml.
> +  * (b) When opensaf cluster nodes are dynamically added at 
> runtime:
> +  * the /etc/opensaf/node_name should contain the rdn value.
> +  */
> + LOG_NO("CLM NodeName: '%s' is not a configured cluster node.", 
> nodeup_info->node_name.value);

In this log I would like to see the "join the cluster" and "unconfigured" that 
I tried to read out from the CLM spec 
3.1.2. For example:

"Node '%s' [attempts|requests] to join the cluster but is unconfigured"

> + LOG_NO(PKGSYSCONFDIR "/node_name should contain the rdn value 
> of configured CLM node object name");

This log is not relevant/correct and should be removed. CLM cannot know if the 
name presented is wrong or just about to 
be configured.

>   }
>
>   if (node != NULL) {
> diff --git a/osaf/services/saf/clmsv/nodeagent/main.c 
> b/osaf/services/saf/clmsv/nodeagent/main.c
> --- a/osaf/services/saf/clmsv/nodeagent/main.c
> +++ b/osaf/services/saf/clmsv/nodeagent/main.c
> @@ -442,11 +442,10 @@ SaAisErrorT clmna_process_dummyup_msg(vo
>   break;
>   case NCSCC_RC_REQ_TIMOUT:
>   error = SA_AIS_ERR_TIMEOUT;
> - break;
> - LOG_ER("clmna_mds_msg_sync_send Timed Out");
> + LOG_ER("Send timed out. Check status(network, process) 
> of ACTIVE controller");
>   goto done;
>   default:
> - LOG_ER("clmna_mds_msg_sync_send FAILED: %u", rc);
> + LOG_ER("Send failed: %u. Check network connectivity", 
> rc);
>   goto done;
>   }
>
> @@ -456,7 +455,7 @@ SaAisErrorT clmna_process_dummyup_msg(vo
>   error = SA_AIS_ERR_NO_RESOURCES;
>
>   if (error == SA_AIS_ERR_NOT_EXIST) {
> - LOG_ER("%s is Not a member of cluster",
> + LOG_ER("%s is not a configured node",
>   
> o_msg->info.api_resp_info.param.node_name.value);
>   free(o_msg);
>   goto done;
>
>

--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


Re: [devel] [PATCH 3 of 3] clm: clmna to be respawned by nid when node join request fails [#816]

2014-04-03 Thread Hans Feldt
On the controller I get an extra "send failed" log message that is ugly:

Apr  3 13:40:01 SC-1 local0.notice osafclmd[417]: NO CLM NodeName: 'PL-6' is 
not a configured cluster node.
Apr  3 13:40:01 SC-1 local0.notice osafclmd[417]: NO /etc/opensaf/node_name 
should contain the rdn value of configured 
CLM node object name
Apr  3 13:40:02 SC-1 local0.notice osafclmd[417]: NO proc_initialize_msg: send 
failed. dest:2060f19cd6007
Apr  3 13:40:17 SC-1 local0.notice osafclmd[417]: NO CLM NodeName: 'PL-6' is 
not a configured cluster node.
Apr  3 13:40:17 SC-1 local0.notice osafclmd[417]: NO /etc/opensaf/node_name 
should contain the rdn value of configured 
CLM node object name
Apr  3 13:40:17 SC-1 local0.notice osafclmd[417]: NO proc_initialize_msg: send 
failed. dest:2060f19cda006
Apr  3 13:40:32 SC-1 local0.notice osafclmd[417]: NO CLM NodeName: 'PL-6' is 
not a configured cluster node.
Apr  3 13:40:32 SC-1 local0.notice osafclmd[417]: NO /etc/opensaf/node_name 
should contain the rdn value of configured 
CLM node object name
Apr  3 13:40:33 SC-1 local0.notice osafclmd[417]: NO proc_initialize_msg: send 
failed. dest:2060f19cda009
Apr  3 13:40:34 SC-1 local0.notice osafimmnd[388]: NO Global discard node 
received for nodeId:2060f pid:359
Apr  3 13:40:39 SC-1 user.warn kernel: tipc: Resetting link 
<1.1.1:eth0-1.1.6:eth0>, peer not responding

I would like to have a try again loop inside clmna instead. Then we could have 
the nid to supervise and every now and 
then do a node reboot. There is not much point in trying three times and give 
up. I think clmna should retry forever and 
let nodeinit handle the bigger loop. The try interval needs to be configured 
but should have good default such as every 
30 sec.

Thanks,
Hans

On 03/27/2014 02:13 AM, mathi.naic...@oracle.com wrote:
>   osaf/services/saf/clmsv/nodeagent/main.c |  8 
>   1 files changed, 8 insertions(+), 0 deletions(-)
>
>
> When a node join request for an unconfigured/misconfigured node or
> when a node join request with a duplicate node_name is attempted, then
> clmna should report those errors to NID such that NID attempts to
> respawan clmna.
> With the introduction of this change, the following happens(can be seen in 
> the syslog) in
> the case of a unconfigured/misconfigured node join request:
> At the ACTIVE controller syslog:
> Mar 26 20:23:32 SC-1 local0.notice osafclmd[420]: NO CLM NodeName: 'PL-8' is 
> not a configured cluster node.
> Mar 26 20:23:32 SC-1 local0.notice osafclmd[420]: NO /etc/opensaf/node_name 
> should contain the rdn value of a configured CLM node object name
>
> At the unconfigured/misconfigured node, the syslog will be like as below:
> Mar 26 19:03:54 PL-3 local0.notice osafclmna[871]: Started
> Mar 26 19:03:54 PL-3 local0.err osafclmna[871]: ER PL-8 is not a configured 
> node
> Mar 26 19:03:54 PL-3 local0.err opensafd[837]: ER Failed   DESC:CLMNA
> Mar 26 19:03:54 PL-3 local0.err opensafd[837]: ER Going for recovery
> Mar 26 19:03:54 PL-3 local0.err opensafd[837]: ER Trying To RESPAWN 
> /usr/local/lib/opensaf/clc-cli/osaf-clmna attempt #1
> Mar 26 19:03:54 PL-3 local0.err opensafd[837]: ER Sending SIGKILL to CLMNA, 
> pid=868
> Mar 26 19:03:54 PL-3 local0.err osafclmna[871]: ER Exiting
> Mar 26 19:03:54 PL-3 local0.notice osafclmna[871]: exiting for shutdown
> Mar 26 19:04:09 PL-3 local0.notice osafclmna[895]: Started
> Mar 26 19:04:09 PL-3 local0.err osafclmna[895]: ER PL-8 is not a configured 
> node
> Mar 26 19:04:09 PL-3 local0.err osafclmna[895]: ER Exiting
> Mar 26 19:04:09 PL-3 local0.err opensafd[837]: ER Could Not RESPAWN CLMNA
> Mar 26 19:04:09 PL-3 local0.err opensafd[837]: ER Failed   DESC:CLMNA
> Mar 26 19:04:09 PL-3 local0.err opensafd[837]: ER Trying To RESPAWN 
> /usr/local/lib/opensaf/clc-cli/osaf-clmna attempt #2
> Mar 26 19:04:09 PL-3 local0.err opensafd[837]: ER Sending SIGKILL to CLMNA, 
> pid=892
> Mar 26 19:04:09 PL-3 local0.notice osafclmna[895]: exiting for shutdown
> Mar 26 19:04:24 PL-3 local0.notice osafclmna[919]: Started
> Mar 26 19:04:25 PL-3 local0.err osafclmna[919]: ER PL-8 is not a configured 
> node
> Mar 26 19:04:25 PL-3 local0.err osafclmna[919]: ER Exiting
> Mar 26 19:04:25 PL-3 local0.err opensafd[837]: ER Could Not RESPAWN CLMNA
> Mar 26 19:04:25 PL-3 local0.err opensafd[837]: ER Failed   DESC:CLMNA
> Mar 26 19:04:25 PL-3 local0.err opensafd[837]: ER FAILED TO RESPAWN
> Mar 26 19:04:27 PL-3 local0.notice osafclmna[919]: exiting for shutdown
> Mar 26 19:04:28 PL-3 local0.notice osafimmnd[864]: exiting for shutdown
>
>
> For cases when a duplicate node join request comes, the following syslog
> message will be seen at the ACTIVE controller:
> Mar 26 19:07:43 SC-1 local0.err osafclmd[418]: ER Duplicate node join request 
> for CLM node: 'SC-2'. Specify a unique node name in/etc/opensaf/node_name
> Mar 26 19:07:59 SC-1 local0.err osafclmd[418]: ER Duplicate node join request 
> for CLM node: 'SC-2'. Specify a unique node name in/etc/opensaf/node_name
>
> And the foll

Re: [devel] [PATCH 2 of 5] amfnd: store amfd mds ver [#574]

2014-04-03 Thread Hans Feldt
Thanks, that's a good comment. Need to think what the best way forward is. The 
change will most likely affect the rest of patch series.

Skickat från min Sony Xperia™-smartphone


 SuryaNarayana Garlapati skrev 

Comments:
1. First of all, the mds sub part version should not be stored/used. You
should depend on the message format version as SI rank is being embedded
in the new message(which is being sent from AVD).
  If you have a requirement for usage of message version, you can get it
at runtime when you receive a SUSI message etc. Decode the received
message and use the message format version.
2.  As per patch, Storage of mds version will happen only once for a
life time. You need to process the NEW-ACTIVE event of MDS as well
which will make sure to store the mds version even after the
failover/switchover.

Regards
Surya

On Friday 28 March 2014 06:44 PM, Hans Feldt wrote:
>   osaf/services/saf/avsv/avd/avd_mds.c   |  2 ++
>   osaf/services/saf/avsv/avnd/avnd_di.c  |  1 +
>   osaf/services/saf/avsv/avnd/avnd_mds.c |  3 +++
>   osaf/services/saf/avsv/avnd/include/avnd_cb.h  |  1 +
>   osaf/services/saf/avsv/avnd/include/avnd_evt.h |  1 +
>   5 files changed, 8 insertions(+), 0 deletions(-)
>
>
> diff --git a/osaf/services/saf/avsv/avd/avd_mds.c 
> b/osaf/services/saf/avsv/avd/avd_mds.c
> --- a/osaf/services/saf/avsv/avd/avd_mds.c
> +++ b/osaf/services/saf/avsv/avd/avd_mds.c
> @@ -414,6 +414,8 @@ static uint32_t avd_mds_svc_evt(MDS_CALL
>break;
>
>case NCSMDS_SVC_ID_AVND:
> + // TODO remove, just for test
> + LOG_NO("%s: AVND UP version: %u", __FUNCTION__, 
> evt_info->i_rem_svc_pvt_ver);
>if (evt_info->i_node_id == cb->node_id_avd) {
>AVD_EVT *evt = calloc(1, sizeof(AVD_EVT));
>osafassert(evt);
> diff --git a/osaf/services/saf/avsv/avnd/avnd_di.c 
> b/osaf/services/saf/avsv/avnd/avnd_di.c
> --- a/osaf/services/saf/avsv/avnd/avnd_di.c
> +++ b/osaf/services/saf/avsv/avnd/avnd_di.c
> @@ -421,6 +421,7 @@ uint32_t avnd_evt_mds_avd_up_evh(AVND_CB
>
>/* store the AVD MDS address */
>cb->avd_dest = evt->info.mds.mds_dest;
> + cb->avd_mds_ver = evt->info.mds.rem_svc_pvt_ver;
>
>avnd_send_node_up_msg();
>}
> diff --git a/osaf/services/saf/avsv/avnd/avnd_mds.c 
> b/osaf/services/saf/avsv/avnd/avnd_mds.c
> --- a/osaf/services/saf/avsv/avnd/avnd_mds.c
> +++ b/osaf/services/saf/avsv/avnd/avnd_mds.c
> @@ -594,8 +594,11 @@ uint32_t avnd_mds_svc_evt(AVND_CB *cb, M
>case NCSMDS_UP:
>switch (evt_info->i_svc_id) {
>case NCSMDS_SVC_ID_AVD:
> + // TODO remove, just for test
> + LOG_NO("%s: AVD UP version: %u", __FUNCTION__, 
> evt_info->i_rem_svc_pvt_ver);
>/* create the mds event */
>evt = avnd_evt_create(cb, AVND_EVT_MDS_AVD_UP, 0, 
> &evt_info->i_dest, 0, 0, 0);
> + evt->info.mds.rem_svc_pvt_ver = 
> evt_info->i_rem_svc_pvt_ver;
>break;
>
>case NCSMDS_SVC_ID_AVA:
> diff --git a/osaf/services/saf/avsv/avnd/include/avnd_cb.h 
> b/osaf/services/saf/avsv/avnd/include/avnd_cb.h
> --- a/osaf/services/saf/avsv/avnd/include/avnd_cb.h
> +++ b/osaf/services/saf/avsv/avnd/include/avnd_cb.h
> @@ -50,6 +50,7 @@ typedef struct avnd_cb_tag {
>MDS_DEST avnd_dest; /* AvND mds addr */
>MDS_DEST avd_dest;  /* AvD mds addr */
>bool is_avd_down;   /* Temp: Indicates if AvD went down */
> + MDS_SVC_PVT_SUB_PART_VER avd_mds_ver; /* Director MDS version */
>
>/* cb related params */
>NCS_LOCK lock;  /* cb lock */
> diff --git a/osaf/services/saf/avsv/avnd/include/avnd_evt.h 
> b/osaf/services/saf/avsv/avnd/include/avnd_evt.h
> --- a/osaf/services/saf/avsv/avnd/include/avnd_evt.h
> +++ b/osaf/services/saf/avsv/avnd/include/avnd_evt.h
> @@ -128,6 +128,7 @@ typedef struct avnd_tmr_evt {
>   typedef struct avnd_mds_evt {
>MDS_DEST mds_dest;  /* mds address */
>NODE_ID node_id;
> + MDS_SVC_PVT_SUB_PART_VER rem_svc_pvt_ver;
>   } AVND_MDS_EVT;
>
>   /* HA STATE change event definition */
>
> --
> ___
> Opensaf-devel mailing list
> Opensaf-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel

--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


Re: [devel] [PATCH 3 of 3] clm: clmna to be respawned by nid when node join request fails [#816]

2014-04-03 Thread Mathivanan Naickan Palanivelu
There is no functional need for a retry when it is not a communication problem 
there.
The default number of re-spawns has been 3 all along. Yes, we can increase it 
to a bigger value.
It is up to the system integrator to decide what to do when OpenSAF startup 
fails.

Thanks,
Mathi.

- hans.fe...@ericsson.com wrote:

> On the controller I get an extra "send failed" log message that is
> ugly:
> 
> Apr  3 13:40:01 SC-1 local0.notice osafclmd[417]: NO CLM NodeName:
> 'PL-6' is not a configured cluster node.
> Apr  3 13:40:01 SC-1 local0.notice osafclmd[417]: NO
> /etc/opensaf/node_name should contain the rdn value of configured 
> CLM node object name
> Apr  3 13:40:02 SC-1 local0.notice osafclmd[417]: NO
> proc_initialize_msg: send failed. dest:2060f19cd6007
> Apr  3 13:40:17 SC-1 local0.notice osafclmd[417]: NO CLM NodeName:
> 'PL-6' is not a configured cluster node.
> Apr  3 13:40:17 SC-1 local0.notice osafclmd[417]: NO
> /etc/opensaf/node_name should contain the rdn value of configured 
> CLM node object name
> Apr  3 13:40:17 SC-1 local0.notice osafclmd[417]: NO
> proc_initialize_msg: send failed. dest:2060f19cda006
> Apr  3 13:40:32 SC-1 local0.notice osafclmd[417]: NO CLM NodeName:
> 'PL-6' is not a configured cluster node.
> Apr  3 13:40:32 SC-1 local0.notice osafclmd[417]: NO
> /etc/opensaf/node_name should contain the rdn value of configured 
> CLM node object name
> Apr  3 13:40:33 SC-1 local0.notice osafclmd[417]: NO
> proc_initialize_msg: send failed. dest:2060f19cda009
> Apr  3 13:40:34 SC-1 local0.notice osafimmnd[388]: NO Global discard
> node received for nodeId:2060f pid:359
> Apr  3 13:40:39 SC-1 user.warn kernel: tipc: Resetting link
> <1.1.1:eth0-1.1.6:eth0>, peer not responding
> 
> I would like to have a try again loop inside clmna instead. Then we
> could have the nid to supervise and every now and 
> then do a node reboot. There is not much point in trying three times
> and give up. I think clmna should retry forever and 
> let nodeinit handle the bigger loop. The try interval needs to be
> configured but should have good default such as every 
> 30 sec.
> 
> Thanks,
> Hans
> 
> On 03/27/2014 02:13 AM, mathi.naic...@oracle.com wrote:
> >   osaf/services/saf/clmsv/nodeagent/main.c |  8 
> >   1 files changed, 8 insertions(+), 0 deletions(-)
> >
> >
> > When a node join request for an unconfigured/misconfigured node or
> > when a node join request with a duplicate node_name is attempted,
> then
> > clmna should report those errors to NID such that NID attempts to
> > respawan clmna.
> > With the introduction of this change, the following happens(can be
> seen in the syslog) in
> > the case of a unconfigured/misconfigured node join request:
> > At the ACTIVE controller syslog:
> > Mar 26 20:23:32 SC-1 local0.notice osafclmd[420]: NO CLM NodeName:
> 'PL-8' is not a configured cluster node.
> > Mar 26 20:23:32 SC-1 local0.notice osafclmd[420]: NO
> /etc/opensaf/node_name should contain the rdn value of a configured
> CLM node object name
> >
> > At the unconfigured/misconfigured node, the syslog will be like as
> below:
> > Mar 26 19:03:54 PL-3 local0.notice osafclmna[871]: Started
> > Mar 26 19:03:54 PL-3 local0.err osafclmna[871]: ER PL-8 is not a
> configured node
> > Mar 26 19:03:54 PL-3 local0.err opensafd[837]: ER Failed  
> DESC:CLMNA
> > Mar 26 19:03:54 PL-3 local0.err opensafd[837]: ER Going for
> recovery
> > Mar 26 19:03:54 PL-3 local0.err opensafd[837]: ER Trying To RESPAWN
> /usr/local/lib/opensaf/clc-cli/osaf-clmna attempt #1
> > Mar 26 19:03:54 PL-3 local0.err opensafd[837]: ER Sending SIGKILL to
> CLMNA, pid=868
> > Mar 26 19:03:54 PL-3 local0.err osafclmna[871]: ER Exiting
> > Mar 26 19:03:54 PL-3 local0.notice osafclmna[871]: exiting for
> shutdown
> > Mar 26 19:04:09 PL-3 local0.notice osafclmna[895]: Started
> > Mar 26 19:04:09 PL-3 local0.err osafclmna[895]: ER PL-8 is not a
> configured node
> > Mar 26 19:04:09 PL-3 local0.err osafclmna[895]: ER Exiting
> > Mar 26 19:04:09 PL-3 local0.err opensafd[837]: ER Could Not RESPAWN
> CLMNA
> > Mar 26 19:04:09 PL-3 local0.err opensafd[837]: ER Failed  
> DESC:CLMNA
> > Mar 26 19:04:09 PL-3 local0.err opensafd[837]: ER Trying To RESPAWN
> /usr/local/lib/opensaf/clc-cli/osaf-clmna attempt #2
> > Mar 26 19:04:09 PL-3 local0.err opensafd[837]: ER Sending SIGKILL to
> CLMNA, pid=892
> > Mar 26 19:04:09 PL-3 local0.notice osafclmna[895]: exiting for
> shutdown
> > Mar 26 19:04:24 PL-3 local0.notice osafclmna[919]: Started
> > Mar 26 19:04:25 PL-3 local0.err osafclmna[919]: ER PL-8 is not a
> configured node
> > Mar 26 19:04:25 PL-3 local0.err osafclmna[919]: ER Exiting
> > Mar 26 19:04:25 PL-3 local0.err opensafd[837]: ER Could Not RESPAWN
> CLMNA
> > Mar 26 19:04:25 PL-3 local0.err opensafd[837]: ER Failed  
> DESC:CLMNA
> > Mar 26 19:04:25 PL-3 local0.err opensafd[837]: ER FAILED TO RESPAWN
> > Mar 26 19:04:27 PL-3 local0.notice osafclmna[919]: exiting for
> shutdown
> > Mar 26 19:04:28 PL-3 local0.notice osafimmnd[8

Re: [devel] [PATCH 2 of 3] clm: change log message for unconfigured nodes, connectivity problems [#816]

2014-04-03 Thread Mathivanan Naickan Palanivelu
Ok, Will do that.
- Mathi.

- hans.fe...@ericsson.com wrote:

> See inline
> Thanks,
> Hans
> 
> On 03/27/2014 02:13 AM, mathi.naic...@oracle.com wrote:
> >   osaf/services/saf/clmsv/clms/clms_evt.c  |  12 ++--
> >   osaf/services/saf/clmsv/nodeagent/main.c |   7 +++
> >   2 files changed, 13 insertions(+), 6 deletions(-)
> >
> >
> > When a node join request from an unconfigured/misconfigured node is
> attempted, then
> > CLM prints the following log message:
> > Mar 26 18:44:52 SC-1 local0.err osafclmd[418]: ER CLM NodeName:
> 'SC-3' doesn't match entry in imm.xml.
> > Specify a correct node name in/etc/opensaf/node_name.
> > This log message is currently categorized as ERROR. However, there
> are suggestions to improve
> > the current log message such that the message more closely depicts
> > the case of unconfigured or misconfigured nodes.
> > The log string is now changed as below:
> > Mar 26 20:23:32 SC-1 local0.notice osafclmd[420]: NO CLM NodeName:
> 'PL-8' is not a configured cluster node.
> > Mar 26 20:23:32 SC-1 local0.notice osafclmd[420]: NO
> /etc/opensaf/node_name should contain the rdn value of a configured
> CLM node object name
> >
> > In the case of problems with connectivity problems, the following
> modified log strings are added:
> > Send timed out. Check status(network, process) of ACTIVE controller
> > and
> > Send failed: %u. Check network connectivity
> >
> > diff --git a/osaf/services/saf/clmsv/clms/clms_evt.c
> b/osaf/services/saf/clmsv/clms/clms_evt.c
> > --- a/osaf/services/saf/clmsv/clms/clms_evt.c
> > +++ b/osaf/services/saf/clmsv/clms/clms_evt.c
> > @@ -282,8 +282,16 @@ uint32_t proc_node_up_msg(CLMS_CB * cb,
> > node = clms_node_get_by_name(&node_name);
> > if (node == NULL) {
> > clm_msg.info.api_resp_info.rc = SA_AIS_ERR_NOT_EXIST;
> > -   LOG_ER("CLM NodeName: '%s' doesn't match entry in imm.xml.
> Specify a correct node name in" PKGSYSCONFDIR "/node_name",
> > -  nodeup_info->node_name.value);
> > +   /* The /etc/opensaf/node_name is an user exposed configuration
> file.
> > +* The node_name file contains the RDN value of the CLM node
> name.
> > +* (a) When opensaf cluster configuration is pre-provisioned
> using the OpenSAF IMM tools:
> > +* the /etc/opensaf/node_name should contain one of the
> values specified
> > +* in nodes.cfg while generating the imm.xml.
> > +* (b) When opensaf cluster nodes are dynamically added at
> runtime:
> > +* the /etc/opensaf/node_name should contain the rdn value.
> > +*/
> > +   LOG_NO("CLM NodeName: '%s' is not a configured cluster node.",
> nodeup_info->node_name.value);
> 
> In this log I would like to see the "join the cluster" and
> "unconfigured" that I tried to read out from the CLM spec 
> 3.1.2. For example:
> 
> "Node '%s' [attempts|requests] to join the cluster but is
> unconfigured"
> 
> > +   LOG_NO(PKGSYSCONFDIR "/node_name should contain the rdn value of
> configured CLM node object name");
> 
> This log is not relevant/correct and should be removed. CLM cannot
> know if the name presented is wrong or just about to 
> be configured.
> 
> > }
> >
> > if (node != NULL) {
> > diff --git a/osaf/services/saf/clmsv/nodeagent/main.c
> b/osaf/services/saf/clmsv/nodeagent/main.c
> > --- a/osaf/services/saf/clmsv/nodeagent/main.c
> > +++ b/osaf/services/saf/clmsv/nodeagent/main.c
> > @@ -442,11 +442,10 @@ SaAisErrorT clmna_process_dummyup_msg(vo
> > break;
> > case NCSCC_RC_REQ_TIMOUT:
> > error = SA_AIS_ERR_TIMEOUT;
> > -   break;
> > -   LOG_ER("clmna_mds_msg_sync_send Timed Out");
> > +   LOG_ER("Send timed out. Check status(network, process) 
> > of ACTIVE
> controller");
> > goto done;
> > default:
> > -   LOG_ER("clmna_mds_msg_sync_send FAILED: %u", rc);
> > +   LOG_ER("Send failed: %u. Check network connectivity", 
> > rc);
> > goto done;
> > }
> >
> > @@ -456,7 +455,7 @@ SaAisErrorT clmna_process_dummyup_msg(vo
> > error = SA_AIS_ERR_NO_RESOURCES;
> >
> > if (error == SA_AIS_ERR_NOT_EXIST) {
> > -   LOG_ER("%s is Not a member of cluster",
> > +   LOG_ER("%s is not a configured node",
> > 
> > o_msg->info.api_resp_info.param.node_name.value);
> > free(o_msg);
> > goto done;
> >
> >

--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


[devel] [PATCH 1 of 1] IMM:sending delete callback for PBE enabled RTobj delete [#835]

2014-04-03 Thread reddy . neelakanta
 osaf/services/saf/immsv/immnd/ImmModel.cc |  2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)


when the RT Object is deleted in a PBE enabled cluster which is subscribed to 
special applier,
the delete callback is not sent. with this patch the delete callback will be 
sent

diff --git a/osaf/services/saf/immsv/immnd/ImmModel.cc 
b/osaf/services/saf/immsv/immnd/ImmModel.cc
--- a/osaf/services/saf/immsv/immnd/ImmModel.cc
+++ b/osaf/services/saf/immsv/immnd/ImmModel.cc
@@ -13336,7 +13336,7 @@ SaInt32T ImmModel::pbePrtObjDeletesConti
 osafassert(oi != sObjectMap.end());
 
 if(oMut->mAfterImage->mObjFlags & IMM_RTNFY_FLAG) {
-if(*spApplConnPtr) {
+if(spApplConnPtr) {
 if(oMut->mAfterImage->mObjFlags & IMM_DN_INTERNAL_REP) {
 std::string tmpName(i2->first);
 nameToExternal(tmpName);

--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


[devel] [PATCH 0 of 1] Review Request for IMM:sending delete callback for PBE enabled RTobj delete [#835]

2014-04-03 Thread reddy . neelakanta
Summary: IMM:sending delete callback for PBE enabled RTobj delete [#835]
Review request for Trac Ticket(s): 835
Peer Reviewer(s): AndersBj
Affected branch(es): 4.3.x, 4.4.x, default
Development branch: default


Impacted area   Impact y/n

 Docsn
 Build systemn
 RPM/packaging   n
 Configuration files n
 Startup scripts n
 SAF servicesn
 OpenSAF servicesy
 Core libraries  n
 Samples n
 Tests   n
 Other   n


Comments (indicate scope for each "y" above):
-
 <>

changeset d44b42b53a9570abeea18454f0bf8fdbaa5ee047
Author: Neelakanta Reddy
Date:   Thu, 03 Apr 2014 19:55:01 +0530

IMM:sending delete callback for PBE enabled RTobj delete [#835] when 
the RT
Object is deleted in a PBE enabled cluster which is subscribed to 
special
applier, the delete callback is not sent. with this patch the delete
callback will be sent

Testing Commands:
-
As explained in the desciption of the ticket.

Testing, Expected Results:
--
Delete callback for persistent runtime object must be sent for special applier

Conditions of Submission:
-
Ack from AndersBj

Arch  Built StartedLinux distro
---
mipsn  n
mips64  n  n
x86 n  n
x86_64  y  y
powerpc n  n
powerpc64   n  n


Reviewer Checklist:
---
[Submitters: make sure that your review doesn't trigger any checkmarks!]


Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
that need proper data filled in.

___ You have failed to nominate the proper persons for review and push.

___ Your patches do not have proper short+long header

___ You have grammar/spelling in your header that is unacceptable.

___ You have exceeded a sensible line length in your headers/comments/text.

___ You have failed to put in a proper Trac Ticket # into your commits.

___ You have incorrectly put/left internal data in your comments/files
(i.e. internal bug tracking tool IDs, product names etc)

___ You have not given any evidence of testing beyond basic build tests.
Demonstrate some level of runtime or other sanity testing.

___ You have ^M present in some of your files. These have to be removed.

___ You have needlessly changed whitespace or added whitespace crimes
like trailing spaces, or spaces before tabs.

___ You have mixed real technical changes with whitespace and other
cosmetic code cleanup changes. These have to be separate commits.

___ You need to refactor your submission into logical chunks; there is
too much content into a single commit.

___ You have extraneous garbage in your review (merge commits etc)

___ You have giant attachments which should never have been sent;
Instead you should place your content in a public tree to be pulled.

___ You have too many commits attached to an e-mail; resend as threaded
commits, or place in a public tree for a pull.

___ You have resent this content multiple times without a clear indication
of what has changed between each re-send.

___ You have failed to adequately and individually address all of the
comments and change requests that were proposed in the initial review.

___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)

___ Your computer have a badly configured date and time; confusing the
the threaded patch review.

___ Your changes affect IPC mechanism, and you don't present any results
for in-service upgradability test.

___ Your changes affect user manual and documentation, your patch series
do not contain the patch that updates the Doxygen manual.


--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


Re: [devel] [PATCH 1 of 1] IMM:Removed Implementer name for config objects while loading [#543]

2014-04-03 Thread Neelakanta Reddy
Hi AndersBj/zoran,

comments inline.

/Neel.
On Thursday 27 March 2014 06:47 PM, Zoran Milinkovic wrote:
> Hi Neelakanta,
>
> Generally, ack from me, with a comment to Anders's comment.
>
> I think you can use "state->attrDefinitions" to determine if a class is 
> persistent runtime or not.
> "state->attrDefinitions" is used just before you add a class name to 
> runtimeClass set.
we can determine the class as persistent runtime, and insert only the 
persistent runtime class to the set.
persistent runtime can be determined, when the the value of the rdn 
flags contains by setting a global variable
(and clearing the global value after each class creation):

SA_IMM_ATTR_RUNTIME &&(SA_IMM_ATTR_CACHED || SA_IMM_ATTR_PERSISTENT)

will provide the new patch with the comments.

> Best regards,
> Zoran
>
> -Original Message-
> From: Anders Bjornerstedt [mailto:anders.bjornerst...@ericsson.com]
> Sent: den 27 mars 2014 13:03
> To: reddy.neelaka...@oracle.com
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [devel] [PATCH 1 of 1] IMM:Removed Implementer name for config 
> objects while loading [#543]
>
> Ack with two minor comments, only one of which requires a small change.
> I tested for both config objecs and persistent RTOs and the patch works.
> I.e. it removes the implementer-name for config objects and does not remove 
> it for PRTOs.
>
> See minor comments below
>
> reddy.neelaka...@oracle.com wrote:
>>   osaf/services/saf/immsv/immloadd/imm_loader.cc   |  32 +---
>>   osaf/services/saf/immsv/immloadd/imm_loader.hh   |   1 +
>>   osaf/services/saf/immsv/immloadd/imm_pbe_load.cc |   5 ++
>>   3 files changed, 32 insertions(+), 6 deletions(-)
>>
>>
>> When the loading is done from imm.xml or imm.db which has implementer name 
>> for config objects.
>> The implementer name is Nullified while loading from imm.xml or imm.db with 
>> this patch for config objects.
>>
>> diff --git a/osaf/services/saf/immsv/immloadd/imm_loader.cc
>> b/osaf/services/saf/immsv/immloadd/imm_loader.cc
>> --- a/osaf/services/saf/immsv/immloadd/imm_loader.cc
>> +++ b/osaf/services/saf/immsv/immloadd/imm_loader.cc
>> @@ -121,6 +121,7 @@ bool isXsdLoaded;
>>   std::string xsddir;
>>   std::string xsd;
>>   typedef std::set AttrFlagSet;
>> +std::set runtimeClass;
>>   AttrFlagSet attrFlagSet;
>>   
>>   /* Helper functions */
>> @@ -1056,6 +1057,9 @@ static void endElementHandler(void* user
>>   LOG_ER("Failed to create class %s - 
>> exiting",state->className);
>>   exit(1);
>>   }
>> +if(state->classCategory == SA_IMM_CLASS_RUNTIME) {
>> +runtimeClass.insert(state->className);
>>
> This creates the set of names for all runtime classes. Ideally one would have 
> liked to have only the subset of the names for all persistent runtime object 
> classes.
> But perhaps this is tricky to compute in this context.
> Only persistent RTOs get dumped and reloaded.
> And only persistent RTOs have a corresponding base table in the PBE sqlite 
> representation.
> Only if this is simple to fix would I fix it.
same as explained above, for zoran comment.
>> +}
>>   state->attrFlags = 0;
>>   
>>   state->attrValueTypeSet= 0;
>> @@ -1089,12 +1093,28 @@ static void endElementHandler(void* user
>>   else
>>   {
>>   //addObjectAttributeDefinition(state);
>> -addObjectAttributeDefinition(state->objectClass,
>> -state->attrName,
>> -&(state->attrValueBuffers),
>> -getClassAttrValueType(&(state->classAttrTypeMap),
>> -state->objectClass, state->attrName),
>> -&(state->attrValuesList));
>> +std::set::iterator clit =
>> +runtimeClass.find(state->objectClass);
>>
> The above line is redundant and I in fact get a compiler warning for it 
> (build failed for me on error as warnings).
> So just remove the above line.
> You are probably using an older GCC that missed this, I am using GCC 4.6.2.
yes, the above line is redundant. will remove.
>
> Thanks
> /AndersBj
>> +if((strcmp (state->attrName,"SaImmAttrImplementerName")==0)){
>> +std::set::iterator clit = 
>> runtimeClass.find(state->objectClass);
>> +if(clit != runtimeClass.end()){
>> +addObjectAttributeDefinition(state->objectClass,
>> +state->attrName,
>> +&(state->attrValueBuffers),
>> +
>> getClassAttrValueType(&(state->classAttrTypeMap),
>> +state->objectClass, state->attrName),
>> +&(state->attrValuesList));
>> +}
>> +}
>> +else
>> +{
>> + addObjectAttributeDefinition(state->objectClass,
>> +state->attrName,
>> +&(state->attrValueBuffers),
>> +  

Re: [devel] [PATCH 1 of 1] IMM:sending delete callback for PBE enabled RTobj delete [#835]

2014-04-03 Thread Neelakanta Reddy
Hi AndersBj,

I forgot, we have to search first if the special applier is there or 
not,  the following the new patch( no need to change the "if condition" 
as in published patch).


diff --git a/osaf/services/saf/immsv/immnd/ImmModel.cc 
b/osaf/services/saf/immsv/immnd/ImmModel.cc
--- a/osaf/services/saf/immsv/immnd/ImmModel.cc
+++ b/osaf/services/saf/immsv/immnd/ImmModel.cc
@@ -13305,6 +13305,13 @@ SaInt32T ImmModel::pbePrtObjDeletesConti

  TRACE("1PBE or 2PBE with error or 2PBE where both PBEs have 
replied.");
  nrofDeletes=0;
+
+/* Lookup special applier */
+ImplementerInfo* spImpl = getSpecialApplier();
+if(spImpl && spImpl->mConn) {
+(*spApplConnPtr) = spImpl->mConn;
+}
+

  for(i2 = sPbeRtMutations.begin();i2!=sPbeRtMutations.end(); ) {
  if(i2->second->mContinuationId != invocation) {

Thanks,
Neel.


On Thursday 03 April 2014 07:59 PM, reddy.neelaka...@oracle.com wrote:
>   osaf/services/saf/immsv/immnd/ImmModel.cc |  2 +-
>   1 files changed, 1 insertions(+), 1 deletions(-)
>
>
> when the RT Object is deleted in a PBE enabled cluster which is subscribed to 
> special applier,
> the delete callback is not sent. with this patch the delete callback will be 
> sent
>
> diff --git a/osaf/services/saf/immsv/immnd/ImmModel.cc 
> b/osaf/services/saf/immsv/immnd/ImmModel.cc
> --- a/osaf/services/saf/immsv/immnd/ImmModel.cc
> +++ b/osaf/services/saf/immsv/immnd/ImmModel.cc
> @@ -13336,7 +13336,7 @@ SaInt32T ImmModel::pbePrtObjDeletesConti
>   osafassert(oi != sObjectMap.end());
>   
>   if(oMut->mAfterImage->mObjFlags & IMM_RTNFY_FLAG) {
> -if(*spApplConnPtr) {
> +if(spApplConnPtr) {
>   if(oMut->mAfterImage->mObjFlags & IMM_DN_INTERNAL_REP) {
>   std::string tmpName(i2->first);
>   nameToExternal(tmpName);
>
> --
> ___
> Opensaf-devel mailing list
> Opensaf-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel


--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


Re: [devel] [PATCH 1 of 1] IMM:sending delete callback for PBE enabled RTobj delete [#835]

2014-04-03 Thread Anders Björnerstedt
Hi Neel,

The proposed patch will work logically, but it is not as efficient as it could 
be.
By changing 

if(*spApplConnPtr) {

To:

if(spApplConnPtr) {

Is the same as:

if(true) {

That is the condition will always be true because the pointer itself is of 
course always 
provided. Note the:  osafassert(spApplConnPtr);

So you may in this variant just remove this if statement.

The body of the if statement builds up a vecor of object names to be used for 
sending
special applier callbacks for deleted PRTOs. There can theoretically be more 
than one
Because delete operations are cascading. There could be a subtree of PRTOs 
being deletd.

The vector is then build up here regardles of whether any special applier 
exists.
Copying the string names of the objects.
But if no special applier exists at this node (typically the case at all 
payloads),
Then the vector is just discarded later. So this causes wasted execution/memory.
In reality is probably not much. But wy waste if it is not necessary.

So I will Nack your patch and will send an alternate fix in a moment that goes 
back
to the original pattenrn of generating this vecor only if there is a special 
applier
at the node.

Note I do apreciate your effort in trying to understand this convoluted code.
It can not be easy.
For exampe on a 2PBE system, this function is entered twice,
once for the reply from each PBE.

Thanks
/AndersBj


-Original Message-
From: reddy.neelaka...@oracle.com [mailto:reddy.neelaka...@oracle.com] 
Sent: den 3 april 2014 16:29
To: Anders Björnerstedt
Cc: opensaf-devel@lists.sourceforge.net
Subject: [PATCH 1 of 1] IMM:sending delete callback for PBE enabled RTobj 
delete [#835]

 osaf/services/saf/immsv/immnd/ImmModel.cc |  2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)


when the RT Object is deleted in a PBE enabled cluster which is subscribed to 
special applier, the delete callback is not sent. with this patch the delete 
callback will be sent

diff --git a/osaf/services/saf/immsv/immnd/ImmModel.cc 
b/osaf/services/saf/immsv/immnd/ImmModel.cc
--- a/osaf/services/saf/immsv/immnd/ImmModel.cc
+++ b/osaf/services/saf/immsv/immnd/ImmModel.cc
@@ -13336,7 +13336,7 @@ SaInt32T ImmModel::pbePrtObjDeletesConti
 osafassert(oi != sObjectMap.end());
 
 if(oMut->mAfterImage->mObjFlags & IMM_RTNFY_FLAG) {
-if(*spApplConnPtr) {
+if(spApplConnPtr) {
 if(oMut->mAfterImage->mObjFlags & IMM_DN_INTERNAL_REP) {
 std::string tmpName(i2->first);
 nameToExternal(tmpName);

--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


[devel] [PATCH 1 of 1] IMM: Special applier callbacks for PRTO deletes are re-established [#835]

2014-04-03 Thread Anders Bjornerstedt
 osaf/services/saf/immsv/immnd/ImmModel.cc |  15 +--
 1 files changed, 9 insertions(+), 6 deletions(-)


Changes to the server code for immsv done in relation to adding support
for 2PBE in OpenSAF 4.4, broke the logic that generates callbacks to
special appliers for PRTO deletes. The problem exists with 0PBE, 1PBE or 2PBE.

This changeset fixes the problem so the special applier callbacks
for PRTO deletes are now sent.

diff --git a/osaf/services/saf/immsv/immnd/ImmModel.cc 
b/osaf/services/saf/immsv/immnd/ImmModel.cc
--- a/osaf/services/saf/immsv/immnd/ImmModel.cc
+++ b/osaf/services/saf/immsv/immnd/ImmModel.cc
@@ -13305,6 +13305,12 @@ SaInt32T ImmModel::pbePrtObjDeletesConti
 
 TRACE("1PBE or 2PBE with error or 2PBE where both PBEs have replied.");
 nrofDeletes=0;
+
+/* Lookup special applier */
+ImplementerInfo* spImpl = getSpecialApplier();
+if(spImpl && spImpl->mConn) {
+(*spApplConnPtr) = spImpl->mConn;
+}
 
 for(i2 = sPbeRtMutations.begin();i2!=sPbeRtMutations.end(); ) {
 if(i2->second->mContinuationId != invocation) {
@@ -13400,12 +13406,9 @@ SaInt32T ImmModel::pbePrtObjDeletesConti
 sPbeRtReqContinuationMap.erase(ci);
 } 
 
-/* Lookup special applier */
-if(!objNameVector.empty()) {
-ImplementerInfo* spImpl = getSpecialApplier();
-if(spImpl && spImpl->mConn) {
-(*spApplConnPtr) = spImpl->mConn;
-}
+/* If no notifications then reset spApplConnPtr to zero */
+if((*spApplConnPtr) && objNameVector.empty()) {
+(*spApplConnPtr) = 0;
 }
 
 TRACE_LEAVE();

--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


[devel] [PATCH 0 of 1] Review Request for IMM: Special applier callbacks for PRTO deletes are re-established [#835]

2014-04-03 Thread Anders Bjornerstedt
Summary: IMM: Special applier callbacks for PRTO deletes are re-established 
[#835]
Review request for Trac Ticket(s): 835
Peer Reviewer(s): Neel
Pull request to: 
Affected branch(es): 4.4; defaultr(4.5) 
Development branch: 


Impacted area   Impact y/n

 Docsn
 Build systemn
 RPM/packaging   n
 Configuration files n
 Startup scripts n
 SAF servicesy
 OpenSAF servicesn
 Core libraries  n
 Samples n
 Tests   n
 Other   n


Comments (indicate scope for each "y" above):
-

changeset cdc66478d9a72b3bf0a8a04904e0878c0bae9990
Author: Anders Bjornerstedt 
Date:   Fri, 04 Apr 2014 06:50:38 +0200

IMM: Special applier callbacks for PRTO deletes are re-established 
[#835]

 Changes to the server code for immsv done in relation to adding 
support for
2PBE in OpenSAF 4.4, broke the logic that generates callbacks to special
appliers for PRTO deletes. The problem exists with 0PBE, 1PBE or 2PBE.

This changeset fixes the problem so the special applier callbacks for 
PRTO
deletes are now sent.


Complete diffstat:
--
 osaf/services/saf/immsv/immnd/ImmModel.cc |  15 +--
 1 files changed, 9 insertions(+), 6 deletions(-)


Testing Commands:
-
As explained in the desciption of the ticket.

Testing, Expected Results:
--
Delete callback for persistent runtime object are sent for special appliers.


Conditions of Submission:
-
Ack from Neel.

Arch  Built StartedLinux distro
---
mipsn  n
mips64  n  n
x86 n  n
x86_64  n  n
powerpc n  n
powerpc64   n  n


Reviewer Checklist:
---
[Submitters: make sure that your review doesn't trigger any checkmarks!]


Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
that need proper data filled in.

___ You have failed to nominate the proper persons for review and push.

___ Your patches do not have proper short+long header

___ You have grammar/spelling in your header that is unacceptable.

___ You have exceeded a sensible line length in your headers/comments/text.

___ You have failed to put in a proper Trac Ticket # into your commits.

___ You have incorrectly put/left internal data in your comments/files
(i.e. internal bug tracking tool IDs, product names etc)

___ You have not given any evidence of testing beyond basic build tests.
Demonstrate some level of runtime or other sanity testing.

___ You have ^M present in some of your files. These have to be removed.

___ You have needlessly changed whitespace or added whitespace crimes
like trailing spaces, or spaces before tabs.

___ You have mixed real technical changes with whitespace and other
cosmetic code cleanup changes. These have to be separate commits.

___ You need to refactor your submission into logical chunks; there is
too much content into a single commit.

___ You have extraneous garbage in your review (merge commits etc)

___ You have giant attachments which should never have been sent;
Instead you should place your content in a public tree to be pulled.

___ You have too many commits attached to an e-mail; resend as threaded
commits, or place in a public tree for a pull.

___ You have resent this content multiple times without a clear indication
of what has changed between each re-send.

___ You have failed to adequately and individually address all of the
comments and change requests that were proposed in the initial review.

___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)

___ Your computer have a badly configured date and time; confusing the
the threaded patch review.

___ Your changes affect IPC mechanism, and you don't present any results
for in-service upgradability test.

___ Your changes affect user manual and documentation, your patch series
do not contain the patch that updates the Doxygen manual.


--
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


Re: [devel] [PATCH 2 of 5] amfnd: store amfd mds ver [#574]

2014-04-03 Thread Hans Feldt
Hi,
I interpreted your answer such that there are two alternatives:
1) rely on mds sub part version but then I need to handle NEW-ACTIVE or
2) use message format version instead

Seems like you had a preference for 2?

Any alternative seems to require quite a few more changes in amfnd.
Alt1 would need add new event event handling in both the mds callback
and main thread. Alt2 would require to preserve the message format
version all the way to the event handler.

But again the strange thing is that these are not the same formats and
there is not much explaining what they are. mds sub part version is 8
bits, used at MDS install and service discovery (SVC_EVENT callback).
But message format version is 16 bit. Normally (at least in amf)
message format version seems to get filled with sub part version so 8
bits of it is unused.

Thanks,
Hans

On 3 April 2014 13:13, SuryaNarayana Garlapati
 wrote:
> Comments:
> 1. First of all, the mds sub part version should not be stored/used. You
> should depend on the message format version as SI rank is being embedded
> in the new message(which is being sent from AVD).
>   If you have a requirement for usage of message version, you can get it
> at runtime when you receive a SUSI message etc. Decode the received
> message and use the message format version.
> 2.  As per patch, Storage of mds version will happen only once for a
> life time. You need to process the NEW-ACTIVE event of MDS as well
> which will make sure to store the mds version even after the
> failover/switchover.
>
> Regards
> Surya
>
> On Friday 28 March 2014 06:44 PM, Hans Feldt wrote:
>>   osaf/services/saf/avsv/avd/avd_mds.c   |  2 ++
>>   osaf/services/saf/avsv/avnd/avnd_di.c  |  1 +
>>   osaf/services/saf/avsv/avnd/avnd_mds.c |  3 +++
>>   osaf/services/saf/avsv/avnd/include/avnd_cb.h  |  1 +
>>   osaf/services/saf/avsv/avnd/include/avnd_evt.h |  1 +
>>   5 files changed, 8 insertions(+), 0 deletions(-)
>>
>>
>> diff --git a/osaf/services/saf/avsv/avd/avd_mds.c 
>> b/osaf/services/saf/avsv/avd/avd_mds.c
>> --- a/osaf/services/saf/avsv/avd/avd_mds.c
>> +++ b/osaf/services/saf/avsv/avd/avd_mds.c
>> @@ -414,6 +414,8 @@ static uint32_t avd_mds_svc_evt(MDS_CALL
>>   break;
>>
>>   case NCSMDS_SVC_ID_AVND:
>> + // TODO remove, just for test
>> + LOG_NO("%s: AVND UP version: %u", __FUNCTION__, 
>> evt_info->i_rem_svc_pvt_ver);
>>   if (evt_info->i_node_id == cb->node_id_avd) {
>>   AVD_EVT *evt = calloc(1, sizeof(AVD_EVT));
>>   osafassert(evt);
>> diff --git a/osaf/services/saf/avsv/avnd/avnd_di.c 
>> b/osaf/services/saf/avsv/avnd/avnd_di.c
>> --- a/osaf/services/saf/avsv/avnd/avnd_di.c
>> +++ b/osaf/services/saf/avsv/avnd/avnd_di.c
>> @@ -421,6 +421,7 @@ uint32_t avnd_evt_mds_avd_up_evh(AVND_CB
>>
>>   /* store the AVD MDS address */
>>   cb->avd_dest = evt->info.mds.mds_dest;
>> + cb->avd_mds_ver = evt->info.mds.rem_svc_pvt_ver;
>>
>>   avnd_send_node_up_msg();
>>   }
>> diff --git a/osaf/services/saf/avsv/avnd/avnd_mds.c 
>> b/osaf/services/saf/avsv/avnd/avnd_mds.c
>> --- a/osaf/services/saf/avsv/avnd/avnd_mds.c
>> +++ b/osaf/services/saf/avsv/avnd/avnd_mds.c
>> @@ -594,8 +594,11 @@ uint32_t avnd_mds_svc_evt(AVND_CB *cb, M
>>   case NCSMDS_UP:
>>   switch (evt_info->i_svc_id) {
>>   case NCSMDS_SVC_ID_AVD:
>> + // TODO remove, just for test
>> + LOG_NO("%s: AVD UP version: %u", __FUNCTION__, 
>> evt_info->i_rem_svc_pvt_ver);
>>   /* create the mds event */
>>   evt = avnd_evt_create(cb, AVND_EVT_MDS_AVD_UP, 0, 
>> &evt_info->i_dest, 0, 0, 0);
>> + evt->info.mds.rem_svc_pvt_ver = 
>> evt_info->i_rem_svc_pvt_ver;
>>   break;
>>
>>   case NCSMDS_SVC_ID_AVA:
>> diff --git a/osaf/services/saf/avsv/avnd/include/avnd_cb.h 
>> b/osaf/services/saf/avsv/avnd/include/avnd_cb.h
>> --- a/osaf/services/saf/avsv/avnd/include/avnd_cb.h
>> +++ b/osaf/services/saf/avsv/avnd/include/avnd_cb.h
>> @@ -50,6 +50,7 @@ typedef struct avnd_cb_tag {
>>   MDS_DEST avnd_dest; /* AvND mds addr */
>>   MDS_DEST avd_dest;  /* AvD mds addr */
>>   bool is_avd_down;   /* Temp: Indicates if AvD went down */
>> + MDS_SVC_PVT_SUB_PART_VER avd_mds_ver; /* Director MDS version */
>>
>>   /* cb related params */
>>   NCS_LOCK lock;  /* cb lock */
>> diff --git a/osaf/services/saf/avsv/avnd/include/avnd_evt.h 
>> b/osaf/services/saf/avsv/avnd/include/avnd_evt.h
>> --- a/osaf/services/saf/avsv/avnd/include/avnd_evt.h
>> +++ b/osaf/services/saf/avsv/avnd/include/avnd_evt.h
>> @@ -128,6 +128,7 @@ typedef struct avnd_tmr_evt {
>>   typedef struct avnd_mds_evt {
>>   MDS_DEST mds_dest;  /* mds address */
>>   NODE_ID no

Re: [devel] [PATCH 3 of 3] clm: clmna to be respawned by nid when node join request fails [#816]

2014-04-03 Thread Hans Feldt
Fine let's see that as a possible future change. Remember removing nid
has been discussed, then we will have no retry at all without my
proposed change.

About the log "NO proc_initialize_msg: send failed", do you think it
comes from the fact that the payload start is not correct? clmna
seemed to return to nid even not a member and then amfnd tried to use
CLM? Should it be OK to use the CLM API on a non member node?

Thanks,
Hans

On 3 April 2014 14:12, Mathivanan Naickan Palanivelu
 wrote:
> There is no functional need for a retry when it is not a communication 
> problem there.
> The default number of re-spawns has been 3 all along. Yes, we can increase it 
> to a bigger value.
> It is up to the system integrator to decide what to do when OpenSAF startup 
> fails.
>
> Thanks,
> Mathi.
>
> - hans.fe...@ericsson.com wrote:
>
>> On the controller I get an extra "send failed" log message that is
>> ugly:
>>
>> Apr  3 13:40:01 SC-1 local0.notice osafclmd[417]: NO CLM NodeName:
>> 'PL-6' is not a configured cluster node.
>> Apr  3 13:40:01 SC-1 local0.notice osafclmd[417]: NO
>> /etc/opensaf/node_name should contain the rdn value of configured
>> CLM node object name
>> Apr  3 13:40:02 SC-1 local0.notice osafclmd[417]: NO
>> proc_initialize_msg: send failed. dest:2060f19cd6007
>> Apr  3 13:40:17 SC-1 local0.notice osafclmd[417]: NO CLM NodeName:
>> 'PL-6' is not a configured cluster node.
>> Apr  3 13:40:17 SC-1 local0.notice osafclmd[417]: NO
>> /etc/opensaf/node_name should contain the rdn value of configured
>> CLM node object name
>> Apr  3 13:40:17 SC-1 local0.notice osafclmd[417]: NO
>> proc_initialize_msg: send failed. dest:2060f19cda006
>> Apr  3 13:40:32 SC-1 local0.notice osafclmd[417]: NO CLM NodeName:
>> 'PL-6' is not a configured cluster node.
>> Apr  3 13:40:32 SC-1 local0.notice osafclmd[417]: NO
>> /etc/opensaf/node_name should contain the rdn value of configured
>> CLM node object name
>> Apr  3 13:40:33 SC-1 local0.notice osafclmd[417]: NO
>> proc_initialize_msg: send failed. dest:2060f19cda009
>> Apr  3 13:40:34 SC-1 local0.notice osafimmnd[388]: NO Global discard
>> node received for nodeId:2060f pid:359
>> Apr  3 13:40:39 SC-1 user.warn kernel: tipc: Resetting link
>> <1.1.1:eth0-1.1.6:eth0>, peer not responding
>>
>> I would like to have a try again loop inside clmna instead. Then we
>> could have the nid to supervise and every now and
>> then do a node reboot. There is not much point in trying three times
>> and give up. I think clmna should retry forever and
>> let nodeinit handle the bigger loop. The try interval needs to be
>> configured but should have good default such as every
>> 30 sec.
>>
>> Thanks,
>> Hans
>>
>> On 03/27/2014 02:13 AM, mathi.naic...@oracle.com wrote:
>> >   osaf/services/saf/clmsv/nodeagent/main.c |  8 
>> >   1 files changed, 8 insertions(+), 0 deletions(-)
>> >
>> >
>> > When a node join request for an unconfigured/misconfigured node or
>> > when a node join request with a duplicate node_name is attempted,
>> then
>> > clmna should report those errors to NID such that NID attempts to
>> > respawan clmna.
>> > With the introduction of this change, the following happens(can be
>> seen in the syslog) in
>> > the case of a unconfigured/misconfigured node join request:
>> > At the ACTIVE controller syslog:
>> > Mar 26 20:23:32 SC-1 local0.notice osafclmd[420]: NO CLM NodeName:
>> 'PL-8' is not a configured cluster node.
>> > Mar 26 20:23:32 SC-1 local0.notice osafclmd[420]: NO
>> /etc/opensaf/node_name should contain the rdn value of a configured
>> CLM node object name
>> >
>> > At the unconfigured/misconfigured node, the syslog will be like as
>> below:
>> > Mar 26 19:03:54 PL-3 local0.notice osafclmna[871]: Started
>> > Mar 26 19:03:54 PL-3 local0.err osafclmna[871]: ER PL-8 is not a
>> configured node
>> > Mar 26 19:03:54 PL-3 local0.err opensafd[837]: ER Failed
>> DESC:CLMNA
>> > Mar 26 19:03:54 PL-3 local0.err opensafd[837]: ER Going for
>> recovery
>> > Mar 26 19:03:54 PL-3 local0.err opensafd[837]: ER Trying To RESPAWN
>> /usr/local/lib/opensaf/clc-cli/osaf-clmna attempt #1
>> > Mar 26 19:03:54 PL-3 local0.err opensafd[837]: ER Sending SIGKILL to
>> CLMNA, pid=868
>> > Mar 26 19:03:54 PL-3 local0.err osafclmna[871]: ER Exiting
>> > Mar 26 19:03:54 PL-3 local0.notice osafclmna[871]: exiting for
>> shutdown
>> > Mar 26 19:04:09 PL-3 local0.notice osafclmna[895]: Started
>> > Mar 26 19:04:09 PL-3 local0.err osafclmna[895]: ER PL-8 is not a
>> configured node
>> > Mar 26 19:04:09 PL-3 local0.err osafclmna[895]: ER Exiting
>> > Mar 26 19:04:09 PL-3 local0.err opensafd[837]: ER Could Not RESPAWN
>> CLMNA
>> > Mar 26 19:04:09 PL-3 local0.err opensafd[837]: ER Failed
>> DESC:CLMNA
>> > Mar 26 19:04:09 PL-3 local0.err opensafd[837]: ER Trying To RESPAWN
>> /usr/local/lib/opensaf/clc-cli/osaf-clmna attempt #2
>> > Mar 26 19:04:09 PL-3 local0.err opensafd[837]: ER Sending SIGKILL to
>> CLMNA, pid=892
>> > Mar 26 19:04:09 PL-3 local0.notice osafclmna[895]: 

Re: [devel] [PATCH 2 of 5] amfnd: store amfd mds ver [#574]

2014-04-03 Thread SuryaNarayana Garlapati
Hi Hans,
Comments/answers inline.

Regards
Surya
On Friday 04 April 2014 11:16 AM, Hans Feldt wrote:
> Hi,
> I interpreted your answer such that there are two alternatives:
> 1) rely on mds sub part version but then I need to handle NEW-ACTIVE or
> 2) use message format version instead
>
> Seems like you had a preference for 2?
[Surya] Yes, message format should be used and that is the correct way 
to handle this situation.
>
> Any alternative seems to require quite a few more changes in amfnd.
> Alt1 would need add new event event handling in both the mds callback
> and main thread. Alt2 would require to preserve the message format
> version all the way to the event handler.
[Surya] Yes, it is preferred to preserve the message format 
version(variable i_msg_fmt_ver) all the way to event handler.
>
> But again the strange thing is that these are not the same formats and
> there is not much explaining what they are.
[Surya] I think i had already explained in my earlier emails, but here 
goes the explanation.

  MDS Service subpart version is the version of a service with which it 
has installed with MDS.
Message format version is the version with which a message was encoded. 
Such that on the receiving end,
application uses this message format version and decodes the message 
accordingly.
>   mds sub part version is 8
> bits, used at MDS install and service discovery (SVC_EVENT callback).
[Surya] Yes you are correct.
> But message format version is 16 bit. Normally (at least in amf)
> message format version seems to get filled with sub part version so 8
> bits of it is unused.
[Surya] Actually there were some limitations with the number of bits 
that were available in the TIPC namesequence bind.
So it was limited it to 8 bits for the SVC install.
>
> Thanks,
> Hans
>
> On 3 April 2014 13:13, SuryaNarayana Garlapati
>  wrote:
>> Comments:
>> 1. First of all, the mds sub part version should not be stored/used. You
>> should depend on the message format version as SI rank is being embedded
>> in the new message(which is being sent from AVD).
>>If you have a requirement for usage of message version, you can get it
>> at runtime when you receive a SUSI message etc. Decode the received
>> message and use the message format version.
>> 2.  As per patch, Storage of mds version will happen only once for a
>> life time. You need to process the NEW-ACTIVE event of MDS as well
>> which will make sure to store the mds version even after the
>> failover/switchover.
>>
>> Regards
>> Surya
>>
>> On Friday 28 March 2014 06:44 PM, Hans Feldt wrote:
>>>osaf/services/saf/avsv/avd/avd_mds.c   |  2 ++
>>>osaf/services/saf/avsv/avnd/avnd_di.c  |  1 +
>>>osaf/services/saf/avsv/avnd/avnd_mds.c |  3 +++
>>>osaf/services/saf/avsv/avnd/include/avnd_cb.h  |  1 +
>>>osaf/services/saf/avsv/avnd/include/avnd_evt.h |  1 +
>>>5 files changed, 8 insertions(+), 0 deletions(-)
>>>
>>>
>>> diff --git a/osaf/services/saf/avsv/avd/avd_mds.c 
>>> b/osaf/services/saf/avsv/avd/avd_mds.c
>>> --- a/osaf/services/saf/avsv/avd/avd_mds.c
>>> +++ b/osaf/services/saf/avsv/avd/avd_mds.c
>>> @@ -414,6 +414,8 @@ static uint32_t avd_mds_svc_evt(MDS_CALL
>>>break;
>>>
>>>case NCSMDS_SVC_ID_AVND:
>>> + // TODO remove, just for test
>>> + LOG_NO("%s: AVND UP version: %u", __FUNCTION__, 
>>> evt_info->i_rem_svc_pvt_ver);
>>>if (evt_info->i_node_id == cb->node_id_avd) {
>>>AVD_EVT *evt = calloc(1, sizeof(AVD_EVT));
>>>osafassert(evt);
>>> diff --git a/osaf/services/saf/avsv/avnd/avnd_di.c 
>>> b/osaf/services/saf/avsv/avnd/avnd_di.c
>>> --- a/osaf/services/saf/avsv/avnd/avnd_di.c
>>> +++ b/osaf/services/saf/avsv/avnd/avnd_di.c
>>> @@ -421,6 +421,7 @@ uint32_t avnd_evt_mds_avd_up_evh(AVND_CB
>>>
>>>/* store the AVD MDS address */
>>>cb->avd_dest = evt->info.mds.mds_dest;
>>> + cb->avd_mds_ver = evt->info.mds.rem_svc_pvt_ver;
>>>
>>>avnd_send_node_up_msg();
>>>}
>>> diff --git a/osaf/services/saf/avsv/avnd/avnd_mds.c 
>>> b/osaf/services/saf/avsv/avnd/avnd_mds.c
>>> --- a/osaf/services/saf/avsv/avnd/avnd_mds.c
>>> +++ b/osaf/services/saf/avsv/avnd/avnd_mds.c
>>> @@ -594,8 +594,11 @@ uint32_t avnd_mds_svc_evt(AVND_CB *cb, M
>>>case NCSMDS_UP:
>>>switch (evt_info->i_svc_id) {
>>>case NCSMDS_SVC_ID_AVD:
>>> + // TODO remove, just for test
>>> + LOG_NO("%s: AVD UP version: %u", __FUNCTION__, 
>>> evt_info->i_rem_svc_pvt_ver);
>>>/* create the mds event */
>>>evt = avnd_evt_create(cb, AVND_EVT_MDS_AVD_UP, 0, 
>>> &evt_info->i_dest, 0, 0, 0);
>>> + evt->info.mds.rem_svc_pvt_ver = 
>>> evt_info->i_rem_svc_pvt_ver;
>>>break

Re: [devel] [PATCH 2 of 5] amfnd: store amfd mds ver [#574]

2014-04-03 Thread Hans Feldt
See inline
Thanks,
Hans

> -Original Message-
> From: SuryaNarayana Garlapati [mailto:suryanarayana.garlap...@oracle.com]
> Sent: den 4 april 2014 08:29
> To: Hans Feldt
> Cc: Hans Feldt; Hans Nordebäck; praveen malviya; nagendr...@oracle.com; 
> opensaf-devel@lists.sourceforge.net
> Subject: Re: [devel] [PATCH 2 of 5] amfnd: store amfd mds ver [#574]
> 
> Hi Hans,
> Comments/answers inline.
> 
> Regards
> Surya
> On Friday 04 April 2014 11:16 AM, Hans Feldt wrote:
> > Hi,
> > I interpreted your answer such that there are two alternatives:
> > 1) rely on mds sub part version but then I need to handle NEW-ACTIVE or
> > 2) use message format version instead
> >
> > Seems like you had a preference for 2?
> [Surya] Yes, message format should be used and that is the correct way
> to handle this situation.
> >
> > Any alternative seems to require quite a few more changes in amfnd.
> > Alt1 would need add new event event handling in both the mds callback
> > and main thread. Alt2 would require to preserve the message format
> > version all the way to the event handler.
> [Surya] Yes, it is preferred to preserve the message format
> version(variable i_msg_fmt_ver) all the way to event handler.
> >
> > But again the strange thing is that these are not the same formats and
> > there is not much explaining what they are.
> [Surya] I think i had already explained in my earlier emails, but here
> goes the explanation.
> 
>   MDS Service subpart version is the version of a service with which it
> has installed with MDS.
> Message format version is the version with which a message was encoded.
> Such that on the receiving end,
> application uses this message format version and decodes the message
> accordingly.
> >   mds sub part version is 8
> > bits, used at MDS install and service discovery (SVC_EVENT callback).
> [Surya] Yes you are correct.
> > But message format version is 16 bit. Normally (at least in amf)
> > message format version seems to get filled with sub part version so 8
> > bits of it is unused.
> [Surya] Actually there were some limitations with the number of bits
> that were available in the TIPC namesequence bind.
> So it was limited it to 8 bits for the SVC install.
[Hans] But when sending a message is 8 or 16 bits sent across?
We had discussed before if a major/minor scheme could be applied on top of 
this. If 16 bits are sent/received I guess we have the possibility to use the 
unused 8 bits for a minor version?

> >
> > Thanks,
> > Hans
> >
> > On 3 April 2014 13:13, SuryaNarayana Garlapati
> >  wrote:
> >> Comments:
> >> 1. First of all, the mds sub part version should not be stored/used. You
> >> should depend on the message format version as SI rank is being embedded
> >> in the new message(which is being sent from AVD).
> >>If you have a requirement for usage of message version, you can get it
> >> at runtime when you receive a SUSI message etc. Decode the received
> >> message and use the message format version.
> >> 2.  As per patch, Storage of mds version will happen only once for a
> >> life time. You need to process the NEW-ACTIVE event of MDS as well
> >> which will make sure to store the mds version even after the
> >> failover/switchover.
> >>
> >> Regards
> >> Surya
> >>
> >> On Friday 28 March 2014 06:44 PM, Hans Feldt wrote:
> >>>osaf/services/saf/avsv/avd/avd_mds.c   |  2 ++
> >>>osaf/services/saf/avsv/avnd/avnd_di.c  |  1 +
> >>>osaf/services/saf/avsv/avnd/avnd_mds.c |  3 +++
> >>>osaf/services/saf/avsv/avnd/include/avnd_cb.h  |  1 +
> >>>osaf/services/saf/avsv/avnd/include/avnd_evt.h |  1 +
> >>>5 files changed, 8 insertions(+), 0 deletions(-)
> >>>
> >>>
> >>> diff --git a/osaf/services/saf/avsv/avd/avd_mds.c 
> >>> b/osaf/services/saf/avsv/avd/avd_mds.c
> >>> --- a/osaf/services/saf/avsv/avd/avd_mds.c
> >>> +++ b/osaf/services/saf/avsv/avd/avd_mds.c
> >>> @@ -414,6 +414,8 @@ static uint32_t avd_mds_svc_evt(MDS_CALL
> >>>break;
> >>>
> >>>case NCSMDS_SVC_ID_AVND:
> >>> + // TODO remove, just for test
> >>> + LOG_NO("%s: AVND UP version: %u", __FUNCTION__, 
> >>> evt_info->i_rem_svc_pvt_ver);
> >>>if (evt_info->i_node_id == cb->node_id_avd) {
> >>>AVD_EVT *evt = calloc(1, sizeof(AVD_EVT));
> >>>osafassert(evt);
> >>> diff --git a/osaf/services/saf/avsv/avnd/avnd_di.c 
> >>> b/osaf/services/saf/avsv/avnd/avnd_di.c
> >>> --- a/osaf/services/saf/avsv/avnd/avnd_di.c
> >>> +++ b/osaf/services/saf/avsv/avnd/avnd_di.c
> >>> @@ -421,6 +421,7 @@ uint32_t avnd_evt_mds_avd_up_evh(AVND_CB
> >>>
> >>>/* store the AVD MDS address */
> >>>cb->avd_dest = evt->info.mds.mds_dest;
> >>> + cb->avd_mds_ver = evt->info.mds.rem_svc_pvt_ver;
> >>>
> >>>avnd_send_node_up_msg();
> >>>}
> >>> diff --git a/osaf/servic