[tickets] [opensaf:tickets] #2426 mds: MDS send failure

2017-12-20 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

develop:

commit c4a934ba77290efcdcc76700cd024fb3871a3ebf
Author: Zoran Milinkovic 
Date:   Wed Dec 20 15:10:04 2017 +0100

imm: change log level for failing to send accept message [#2426]

The log level is changed from error to warning because IMMD takes no 
action on this failure.




---

** [tickets:#2426] mds: MDS send failure**

**Status:** fixed
**Milestone:** 5.18.01
**Created:** Thu Apr 13, 2017 11:22 AM UTC by Hung Nguyen
**Last Updated:** Wed Dec 20, 2017 02:15 PM UTC
**Owner:** Zoran Milinkovic
**Attachments:**

- 
[logs.tgz](https://sourceforge.net/p/opensaf/tickets/2426/attachment/logs.tgz) 
(1.8 MB; application/x-compressed)


IMMD@SC-2 received a message from IMMND@SC-1 but failed to send a message back 
to IMMND@SC-1.
Both IMMD and IMMND use MDS_SENDTYPE_SND.
RDE also got that failure.
~~~
18:33:18 SC-1 osafrded[183]: WA Failed to send RDE_MSG_PEER_INFO_RESP(4) to 
2020f9d120640
18:33:18 SC-1 osafrded[183]: message repeated 2 times: [ WA Failed to send 
RDE_MSG_PEER_INFO_RESP(4) to 2020f9d120640]

18:33:18 SC-2 osafrded[183]: WA Failed to send RDE_MSG_PEER_INFO_RESP(4) to 
2010fc4b8a390
18:33:18 SC-2 osafimmd[202]: WA IMMD - MDS Send Failed
18:33:18 SC-2 osafimmd[202]: ER Failed to send accept message to IMMND 2010f
~~~

The logs are attached.


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #2426 mds: MDS send failure

2017-12-20 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **assigned_to**: Hans Nordebäck --> Zoran Milinkovic
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/36163959/



---

** [tickets:#2426] mds: MDS send failure**

**Status:** review
**Milestone:** 5.18.01
**Created:** Thu Apr 13, 2017 11:22 AM UTC by Hung Nguyen
**Last Updated:** Fri Nov 03, 2017 09:50 PM UTC
**Owner:** Zoran Milinkovic
**Attachments:**

- 
[logs.tgz](https://sourceforge.net/p/opensaf/tickets/2426/attachment/logs.tgz) 
(1.8 MB; application/x-compressed)


IMMD@SC-2 received a message from IMMND@SC-1 but failed to send a message back 
to IMMND@SC-1.
Both IMMD and IMMND use MDS_SENDTYPE_SND.
RDE also got that failure.
~~~
18:33:18 SC-1 osafrded[183]: WA Failed to send RDE_MSG_PEER_INFO_RESP(4) to 
2020f9d120640
18:33:18 SC-1 osafrded[183]: message repeated 2 times: [ WA Failed to send 
RDE_MSG_PEER_INFO_RESP(4) to 2020f9d120640]

18:33:18 SC-2 osafrded[183]: WA Failed to send RDE_MSG_PEER_INFO_RESP(4) to 
2010fc4b8a390
18:33:18 SC-2 osafimmd[202]: WA IMMD - MDS Send Failed
18:33:18 SC-2 osafimmd[202]: ER Failed to send accept message to IMMND 2010f
~~~

The logs are attached.




[tickets] [opensaf:tickets] #2745 imm: lower the log message level for failing to send the accept message

2017-12-20 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: unassigned --> duplicate
- **Component**: unknown --> imm
- **Comment**:

Duplicate of #2426



---

** [tickets:#2745] imm: lower the log message level for failing to send the 
accept message**

**Status:** duplicate
**Milestone:** 5.18.01
**Created:** Wed Dec 20, 2017 01:40 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Dec 20, 2017 01:40 PM UTC
**Owner:** nobody


When IMMD fails to send the accept message, the failure is logged at error 
level.
The level will be lowered to warning because IMMD takes no action on this 
error.
~~~
osafimmd[202]: ER Failed to send accept message to IMMND 2010f
~~~




[tickets] [opensaf:tickets] #2745 imm: lower the log message level for failing to send the accept message

2017-12-20 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2745] imm: lower the log message level for failing to send the 
accept message**

**Status:** unassigned
**Milestone:** 5.18.01
**Created:** Wed Dec 20, 2017 01:40 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Dec 20, 2017 01:40 PM UTC
**Owner:** nobody


When IMMD fails to send the accept message, the failure is logged at error 
level.
The level will be lowered to warning because IMMD takes no action on this 
error.
~~~
osafimmd[202]: ER Failed to send accept message to IMMND 2010f
~~~




[tickets] [opensaf:tickets] #2450 IMM: SearchNext returning BAD_HANDLE when ERR_NOT_EXISTS is expected

2017-12-18 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: unassigned --> invalid
- **Comment**:

In omSearchNext, ERR_NOT_EXIST means that the search has reached the end and 
that there are no more results to fetch. So ERR_NOT_EXIST cannot be returned in 
this case.

If the search result is altered after the omSearchInitialize call, the search 
result is corrupted and no longer matches the result produced by 
omSearchInitialize.
According to the SAF spec and the error codes allowed for omSearchNext, 
ERR_BAD_HANDLE is the most suitable error code in this case.
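
The distinction made in this comment can be sketched as a client-side loop. This is only an illustration, assuming a hypothetical `search_next` callable that stands in for omSearchNext and returns a (return code, object name) pair; it is not the IMM or pyosaf API. The numeric values follow SaAisErrorT in saAis.h.

```python
# Return codes, with the numeric values of SaAisErrorT in saAis.h.
SA_AIS_OK = 1
SA_AIS_ERR_BAD_HANDLE = 9
SA_AIS_ERR_NOT_EXIST = 12


class SearchHandleInvalid(Exception):
    """Raised when the search handle can no longer be used."""


def drain_search(search_next):
    """Collect object names until the search reports its normal end.

    `search_next` is a hypothetical stand-in for omSearchNext and
    returns a (return_code, object_name) tuple.
    """
    names = []
    while True:
        rc, name = search_next()
        if rc == SA_AIS_OK:
            names.append(name)
        elif rc == SA_AIS_ERR_NOT_EXIST:
            # Normal termination: there are no more results to fetch.
            return names
        elif rc == SA_AIS_ERR_BAD_HANDLE:
            # The search result is no longer valid; the caller has to
            # initialize a new search.
            raise SearchHandleInvalid("search must be re-initialized")
        else:
            raise RuntimeError("unexpected return code %d" % rc)


def fake_search(replies):
    """Build a stand-in search_next that replays a canned list of replies."""
    iterator = iter(replies)
    return lambda: next(iterator)
```

A caller that sees `SearchHandleInvalid` would start over with a new search rather than treat the objects collected so far as a complete result.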





---

** [tickets:#2450] IMM: SearchNext returning BAD_HANDLE when ERR_NOT_EXISTS is 
expected**

**Status:** invalid
**Milestone:** 5.18.01
**Created:** Wed May 03, 2017 10:38 AM UTC by Chani Srivastava
**Last Updated:** Fri Nov 03, 2017 09:50 PM UTC
**Owner:** nobody


OS : Suse 64bit
Changeset : 5.2 GA
Setup : 4 nodes

**Summary:** IMM: SearchNext returning BAD_HANDLE when ERR_NOT_EXISTS is 
expected while searching for runtime object

**Steps to Reproduce**
1. Create a runtime object
2. Call SearchInitialize()
3. Delete the object created in Step 1
4. Call SearchNext()
5. Call SearchNext() again 

**Observed Behavior:**
Step 4 returns SA_AIS_ERR_TIMEOUT
Step 5 returns SA_AIS_ERR_BAD_HANDLE (SA_AIS_ERR_NOT_EXIST is expected)




[tickets] [opensaf:tickets] #2729 osaf: add /sbin/shutdown to sudoers file in 00-README.conf

2017-12-07 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

develop:

commit 37983760835c40056c0a2d404e47f17f2a50b102
Author: Zoran Milinkovic 
Date:   Thu Dec 7 16:02:46 2017 +0100

osaf: add /sbin/shutdown to sudoers file in 00-README.conf [#2729]

/sbin/shutdown is added to /etc/sudoers for the configuration steps in 
00-README.conf




---

** [tickets:#2729] osaf: add /sbin/shutdown to sudoers file in 00-README.conf**

**Status:** fixed
**Milestone:** 5.18.01
**Created:** Tue Dec 05, 2017 01:33 PM UTC by Zoran Milinkovic
**Last Updated:** Tue Dec 05, 2017 02:06 PM UTC
**Owner:** Zoran Milinkovic


Update 00-README.conf to add /sbin/shutdown to the sudoers file.
The shutdown entry in the sudoers file is needed for cluster/node reboot.




[tickets] [opensaf:tickets] #2729 osaf: add /sbin/shutdown to sudoers file in 00-README.conf

2017-12-05 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/36147092/



---

** [tickets:#2729] osaf: add /sbin/shutdown to sudoers file in 00-README.conf**

**Status:** review
**Milestone:** 5.18.01
**Created:** Tue Dec 05, 2017 01:33 PM UTC by Zoran Milinkovic
**Last Updated:** Tue Dec 05, 2017 01:33 PM UTC
**Owner:** Zoran Milinkovic


Update 00-README.conf to add /sbin/shutdown to the sudoers file.
The shutdown entry in the sudoers file is needed for cluster/node reboot.




[tickets] [opensaf:tickets] #2729 osaf: add /sbin/shutdown to sudoers file in 00-README.conf

2017-12-05 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2729] osaf: add /sbin/shutdown to sudoers file in 00-README.conf**

**Status:** accepted
**Milestone:** 5.18.01
**Created:** Tue Dec 05, 2017 01:33 PM UTC by Zoran Milinkovic
**Last Updated:** Tue Dec 05, 2017 01:33 PM UTC
**Owner:** Zoran Milinkovic


Update 00-README.conf to add /sbin/shutdown to the sudoers file.
The shutdown entry in the sudoers file is needed for cluster/node reboot.




[tickets] [opensaf:tickets] #2701 imm: timeout in imma_sync_with_immnd is wrong

2017-11-22 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2701] imm: timeout in imma_sync_with_immnd is wrong**

**Status:** assigned
**Milestone:** 5.18.01
**Created:** Wed Nov 22, 2017 10:17 AM UTC by Zoran Milinkovic
**Last Updated:** Wed Nov 22, 2017 10:17 AM UTC
**Owner:** Vu Minh Nguyen


The change for ticket #452 increased the timeout from 3 seconds to 
30 seconds by mistake.
The timeout needs to be reverted to 3 seconds.




[tickets] [opensaf:tickets] #2665 imm: update IMM documents with new changes

2017-11-09 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

IMMSv PR document:

changeset:   223:63f10041c752
tag:         tip
user:        Zoran Milinkovic 
date:        Thu Nov 09 16:03:04 2017 +0100
files:       OpenSAF_IMMSv_PR.odt
description:
imm: update IMMSv PR document [#2665]

Update IMMSv PR document with new enhancement for removal of disconnected 
appliers



---

** [tickets:#2665] imm: update IMM documents with new changes**

**Status:** fixed
**Milestone:** 5.17.11
**Created:** Wed Nov 01, 2017 01:29 PM UTC by Zoran Milinkovic
**Last Updated:** Thu Nov 09, 2017 02:40 PM UTC
**Owner:** Zoran Milinkovic


Update README and IMMSv PR document




[tickets] [opensaf:tickets] #2665 imm: update IMM documents with new changes

2017-11-09 Thread Zoran Milinkovic via Opensaf-tickets
- **Comment**:

README update:

develop:

commit 27b0a983d08de94e390aa811b8971e0a82059b80
Author: Zoran Milinkovic 
Date:   Thu Nov 9 15:36:22 2017 +0100

imm: update README with IMM changes in OpenSAF 5.17.11 and rename IMM 
schema [#2665]

Add two sections.
The section that describes the new flag for enabling features in 5.17.11,
and the section that describes the removal of disconnected appliers

Rename IMM schema from OpensafImm_Upgrade_5.17.10.xml to 
OpensafImm_Upgrade_5.17.11.xml

-

release:

commit 57f6b4c09859055b8b891573d4c42da258887a51
Author: Zoran Milinkovic 
Date:   Thu Nov 9 15:36:22 2017 +0100

imm: update README with IMM changes in OpenSAF 5.17.11 and rename IMM 
schema [#2665]

Add two sections.
The section that describes the new flag for enabling features in 5.17.11,
and the section that describes the removal of disconnected appliers

Rename IMM schema from OpensafImm_Upgrade_5.17.10.xml to 
OpensafImm_Upgrade_5.17.11.xml




---

** [tickets:#2665] imm: update IMM documents with new changes**

**Status:** review
**Milestone:** 5.17.11
**Created:** Wed Nov 01, 2017 01:29 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Nov 06, 2017 05:21 PM UTC
**Owner:** Zoran Milinkovic


Update README and IMMSv PR document




[tickets] [opensaf:tickets] #2666 clm: update CLM document with new changes

2017-11-07 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: unassigned --> fixed
- **assigned_to**: Zoran Milinkovic
- **Milestone**: 5.18.01 --> 5.17.11
- **Comment**:

changeset:   221:4862ac046671
user:        Zoran Milinkovic 
date:        Tue Nov 07 11:04:19 2017 +0100
files:       OpenSAF_CLMSv_PR.odt
description:
clm: update CLMSv PR document with new admin operations [#2666]

CLMSv PR is updated with new admin operations.

-

changeset:   222:c4e0d3ac2aaa
tag:         tip
user:        Zoran Milinkovic 
date:        Tue Nov 07 11:13:16 2017 +0100
files:       OpenSAF_CLMSv_PR.odt
description:
clm: update the year of the document [#2666]

The first version of the document had a typo in the year.
It is fixed and the year is now 2017.



---

** [tickets:#2666] clm: update CLM document with new changes**

**Status:** fixed
**Milestone:** 5.17.11
**Created:** Wed Nov 01, 2017 01:30 PM UTC by Zoran Milinkovic
**Last Updated:** Fri Nov 03, 2017 09:50 PM UTC
**Owner:** Zoran Milinkovic


Update CLMSv PR document with new changes




[tickets] [opensaf:tickets] #2665 imm: update IMM documents with new changes

2017-11-06 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: unassigned --> review
- **assigned_to**: Zoran Milinkovic
- **Milestone**: 5.18.01 --> 5.17.11
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/36105476/



---

** [tickets:#2665] imm: update IMM documents with new changes**

**Status:** review
**Milestone:** 5.17.11
**Created:** Wed Nov 01, 2017 01:29 PM UTC by Zoran Milinkovic
**Last Updated:** Fri Nov 03, 2017 09:50 PM UTC
**Owner:** Zoran Milinkovic


Update README and IMMSv PR document




[tickets] [opensaf:tickets] #2599 imm: remove cascading delete for runtime objects

2017-11-03 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> duplicate
- **Comment**:

The ticket is solved in ticket #2667



---

** [tickets:#2599] imm: remove cascading delete for runtime objects**

**Status:** duplicate
**Milestone:** future
**Created:** Wed Sep 27, 2017 12:01 PM UTC by Rafael Odzakow
**Last Updated:** Fri Oct 27, 2017 10:22 AM UTC
**Owner:** Zoran Milinkovic


Investigation needed. There is a lot of log spam on large installations when 
deleting these objects. It could be caused by SMF deleting individual elements 
instead of a parent root.

Posted by Zoran from IMM:

> The problem comes with cascade delete of SMF objects which contain many 
> thousand objects.
> When cascade delete is invoked, it shouldn't be deleted more than 1 
> objects at once.
> 
> All this is explained in IMM PR document, chapter 3.4, point 6.
> https://sourceforge.net/p/opensaf/documentation/ci/default/tree/OpenSAF_IMMSv_PR.odt
> 
> Adding Rafael Odzakow




[tickets] [opensaf:tickets] #2667 imm: improve cascade delete

2017-11-01 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/36099484/



---

** [tickets:#2667] imm: improve cascade delete**

**Status:** review
**Milestone:** 5.18.01
**Created:** Wed Nov 01, 2017 01:43 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Nov 01, 2017 01:43 PM UTC
**Owner:** Zoran Milinkovic


When an object that has children is deleted, a delete message is sent to PBE 
for each deleted object.
Because a cascade delete sends many messages from IMMND to PBE at once, there 
is a limitation that a cascade delete should not be done on an object that 
contains more than 1 object.
More than 1 object may cause buffer overload (e.g. TIPC_ERR_OVERLOAD), and 
messages might be lost.

The improvement should send only one message to PBE, containing only the 
root object. The rest of the cascade delete will be done on the PBE side.

The limitation of 1 objects in the cascade delete will still apply if IMM 
has to send more than 1 notification messages. This could be avoided by 
removing the notify flag from classes before the huge cascade delete.
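
The difference between the two schemes can be sketched as follows. This is a toy model, not IMM code: the object tree is a plain dict from a DN to the DNs of its children, and the message tuples and function names are hypothetical.

```python
def subtree(children, root):
    """Return root and all of its descendants, each parent before its children."""
    order = []
    stack = [root]
    while stack:
        dn = stack.pop()
        order.append(dn)
        stack.extend(children.get(dn, []))
    return order


def messages_per_object(children, root):
    """Scheme described as current: one delete message per deleted object."""
    return [("delete", dn) for dn in subtree(children, root)]


def messages_root_only(children, root):
    """Proposed scheme: a single message that names only the root object."""
    return [("cascade_delete", root)]


def expand_on_receiver(children, message):
    """Receiver-side expansion of a root-only message.

    Returns the objects to delete, children before their parents.
    """
    kind, root = message
    if kind != "cascade_delete":
        raise ValueError("unexpected message kind: %s" % kind)
    return list(reversed(subtree(children, root)))
```

In this model the number of messages sent drops from the size of the subtree to one, while the receiver still ends up deleting the same set of objects.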




[tickets] [opensaf:tickets] #2667 imm: improve cascade delete

2017-11-01 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2667] imm: improve cascade delete**

**Status:** accepted
**Milestone:** 5.18.01
**Created:** Wed Nov 01, 2017 01:43 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Nov 01, 2017 01:43 PM UTC
**Owner:** Zoran Milinkovic


When an object that has children is deleted, a delete message is sent to PBE 
for each deleted object.
Because a cascade delete sends many messages from IMMND to PBE at once, there 
is a limitation that a cascade delete should not be done on an object that 
contains more than 1 object.
More than 1 object may cause buffer overload (e.g. TIPC_ERR_OVERLOAD), and 
messages might be lost.

The improvement should send only one message to PBE, containing only the 
root object. The rest of the cascade delete will be done on the PBE side.

The limitation of 1 objects in the cascade delete will still apply if IMM 
has to send more than 1 notification messages. This could be avoided by 
removing the notify flag from classes before the huge cascade delete.




[tickets] [opensaf:tickets] #2666 clm: update CLM document with new changes

2017-11-01 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2666] clm: update CLM document with new changes**

**Status:** unassigned
**Milestone:** 5.17.11
**Created:** Wed Nov 01, 2017 01:30 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Nov 01, 2017 01:30 PM UTC
**Owner:** nobody


Update CLMSv PR document with new changes




[tickets] [opensaf:tickets] #2665 imm: update IMM documents with new changes

2017-11-01 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2665] imm: update IMM documents with new changes**

**Status:** unassigned
**Milestone:** 5.17.11
**Created:** Wed Nov 01, 2017 01:29 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Nov 01, 2017 01:29 PM UTC
**Owner:** nobody


Update README and IMMSv PR document




[tickets] [opensaf:tickets] #2649 clm: add new admin operations to CLM

2017-10-31 Thread Zoran Milinkovic via Opensaf-tickets
develop:

commit 87095e7ca1ba384c5ff8695462cf45a096a42f85
Author: Zoran Milinkovic 
Date:   Tue Oct 31 14:14:53 2017 +0100

clm: fix RPM build [#2649]

-

release:

commit 870097797e4c21aeb7b5f951d2ee72dd46934797
Author: Zoran Milinkovic 
Date:   Tue Oct 31 14:14:53 2017 +0100

clm: fix RPM build [#2649]


---

** [tickets:#2649] clm: add new admin operations to CLM**

**Status:** fixed
**Milestone:** 5.17.11
**Created:** Fri Oct 20, 2017 03:24 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Oct 30, 2017 08:31 PM UTC
**Owner:** Zoran Milinkovic


Because executing scripts on remote nodes is hard to handle when a user name 
and password are required, CLM can be used to execute scripts remotely.

The admin operation "immadm -o 4 ..." will be extended so that when a CLM 
node DN is the targeted object, the CLM node will be rebooted.
For example, if SC-2 needs to be rebooted:
immadm -o 4 safNode=SC-2,safCluster=myClmCluster

Another admin operation, "immadm -o 5 ...", will be more generic: it will 
execute scripts stored in the "scripts/clm" directory. The script names 
will be prefixed with osafclm_.
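
As a small illustration of how the reboot command above is put together, the sketch below builds the node DN and the argument list for operation 4. The function names are hypothetical, and the command is only printed, not executed, since running it would reboot a node.

```python
def clm_node_dn(node, cluster):
    """Build a CLM node DN such as safNode=SC-2,safCluster=myClmCluster."""
    return "safNode=%s,safCluster=%s" % (node, cluster)


def reboot_command(node, cluster):
    """Return the argument list for the reboot admin operation (operation id 4)."""
    return ["immadm", "-o", "4", clm_node_dn(node, cluster)]


if __name__ == "__main__":
    # Print the command instead of running it.
    print(" ".join(reboot_command("SC-2", "myClmCluster")))
```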




[tickets] [opensaf:tickets] #2649 clm: add new admin operations to CLM

2017-10-30 Thread Zoran Milinkovic via Opensaf-tickets
- **Comment**:

commit 6ef61617a15646c122b98afaeda692bf15aedd68
Author: Zoran Milinkovic 
Date:   Mon Oct 30 16:37:39 2017 +0100

clm: fix the build on 32-bit linux caused by #2649 [#2649]

The patch fixes the problem with building OpenSAF on 32-bit linux




---

** [tickets:#2649] clm: add new admin operations to CLM**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Fri Oct 20, 2017 03:24 PM UTC by Zoran Milinkovic
**Last Updated:** Fri Oct 27, 2017 03:29 PM UTC
**Owner:** Zoran Milinkovic


Because executing scripts on remote nodes is hard to handle when a user name 
and password are required, CLM can be used to execute scripts remotely.

The admin operation "immadm -o 4 ..." will be extended so that when a CLM 
node DN is the targeted object, the CLM node will be rebooted.
For example, if SC-2 needs to be rebooted:
immadm -o 4 safNode=SC-2,safCluster=myClmCluster

Another admin operation, "immadm -o 5 ...", will be more generic: it will 
execute scripts stored in the "scripts/clm" directory. The script names 
will be prefixed with osafclm_.




[tickets] [opensaf:tickets] #2602 pyosaf: improvement of high level python interfaces

2017-10-27 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

Pushed patches on behalf of Nguyen Luu

-

commit c10f21f5fd65b99d71dbe28b6345e65f4b8e518f
Author: Zoran Milinkovic 
Date:   Fri Oct 27 18:00:01 2017 +0200

pyosaf: High level python interfaces for IMM [#2602]

commit 426b7ae52778f24fc9cb1e9bdd0fca932b7cff70
Author: Zoran Milinkovic 
Date:   Fri Oct 27 18:00:01 2017 +0200

pyosaf: High level python interfaces for NTF [#2602]

Improve the implementation of NTF pyosaf utils

commit 5dde9242d5294f34af79c439308e0858bd17642c
Author: Zoran Milinkovic 
Date:   Fri Oct 27 18:00:01 2017 +0200

pyosaf: High level python interfaces for LOG [#2602]

Improved implementation of LOG pyosaf utils

commit b4612aff258fb3aa5cb876e3442e7c83dc2691a4
Author: Zoran Milinkovic 
Date:   Fri Oct 27 18:00:01 2017 +0200

pyosaf: High level python interfaces for CLM [#2602]

- Add more error handling
- Refactor clm code
- Keep raising exceptions for existing python methods
- Add __version__ attribute to utils




---

** [tickets:#2602] pyosaf: improvement of high level python interfaces**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Fri Sep 29, 2017 03:56 AM UTC by Long H Buu Nguyen
**Last Updated:** Mon Oct 23, 2017 12:14 PM UTC
**Owner:** Long H Buu Nguyen


There are high level python interfaces for some OpenSAF services (clm, imm, 
log, ntf) in pyosaf/utils. They already provide high level 
functionality. However, they are still simple and need to be improved in some 
aspects:
1) Error handling: the decorate() function currently handles 
SA_AIS_ERR_TRY_AGAIN only. More error codes should be handled as well.
2) Versioning: add versioning so users can easily tell which version is 
best to use.
3) Support Python 3: Python 2.7 is mandatory, but 3.x must be considered. 
Compatibility should be kept.
4) Documentation: there should be at least an instruction on how to use the 
high level python interfaces.
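
The error handling item in the list above could be approached along these lines. This is a minimal sketch, not the pyosaf implementation: the numeric codes are stand-ins, and `SafError`, `RETRYABLE`, `max_tries` and `delay` are hypothetical names chosen for illustration.

```python
import time
from functools import wraps

# Stand-in return codes for this sketch only; real code would use the
# eSaAisErrorT values provided by pyosaf.
SA_AIS_OK = 1
SA_AIS_ERR_TRY_AGAIN = 6
SA_AIS_ERR_BAD_HANDLE = 9

# Assumption: which codes are worth retrying is a policy decision.
RETRYABLE = {SA_AIS_ERR_TRY_AGAIN}


class SafError(Exception):
    """Hypothetical exception carrying the failing return code."""

    def __init__(self, name, code):
        super(SafError, self).__init__(
            "%s failed with error code %d" % (name, code))
        self.code = code


def decorate(function, max_tries=5, delay=0.0):
    """Wrap an AIS-style call: retry retryable codes, raise on the rest."""

    @wraps(function)
    def wrapper(*args):
        error = function(*args)
        tries = 1
        # Retry only while the code is retryable and attempts remain.
        while error in RETRYABLE and tries < max_tries:
            time.sleep(delay)
            error = function(*args)
            tries += 1
        if error != SA_AIS_OK:
            raise SafError(function.__name__, error)
        return error

    return wrapper
```

Adding another code to `RETRYABLE` changes the retry policy without touching the wrapped functions, which is one way to cover more error codes while still raising exceptions to existing callers.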




[tickets] [opensaf:tickets] #2649 clm: add new admin operations to CLM

2017-10-27 Thread Zoran Milinkovic via Opensaf-tickets
commit d8de354eb803664c7ad9389071f7be624b237aa9
Author: Zoran Milinkovic 
Date:   Fri Oct 27 17:25:59 2017 +0200

clm: fix for test.sh problem caused by patch for #2649 [#2649]


---

** [tickets:#2649] clm: add new admin operations to CLM**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Fri Oct 20, 2017 03:24 PM UTC by Zoran Milinkovic
**Last Updated:** Fri Oct 27, 2017 02:44 PM UTC
**Owner:** Zoran Milinkovic


Because executing scripts on remote nodes is hard to handle when a user name 
and password are required, CLM can be used to execute scripts remotely.

The admin operation "immadm -o 4 ..." will be extended so that when the 
targeted object is a CLM node DN, that CLM node will be rebooted.
For example, if SC-2 needs to be rebooted:
immadm -o 4 safNode=SC-2,safCluster=myClmCluster

Another admin operation, "immadm -o 5 ...", will be a more generic admin 
operation that executes scripts stored in the "scripts/clm" directory. The 
script names will be prefixed with "osafclm_".




[tickets] [opensaf:tickets] #2649 clm: add new admin operations to CLM

2017-10-27 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

commit 864c0da8d303f420cdb86e3390f239d20ab9801c
Author: Zoran Milinkovic 
Date:   Fri Oct 27 16:41:21 2017 +0200

clm: add new admin operations [#2649]

Add two new admin operations to CLM.
1. Node reboot will use the same operation id as cluster reboot, but the DN 
will contain the CLM node.

   Example for rebooting SC-2:
 immadm -o 4 safNode=SC-2,safCluster=myClmCluster

2. Execute an action on remote node(s) with operation id 5.
   The action is a script stored in the /usr/lib/opensaf/clm-scripts directory 
whose name starts with the prefix "osafclm_".
   The action name needs to be provided in the saClmAction parameter, which is 
of type SaStringT.
   If the targeted DN is a CLM node, the action will be done on that node.
   If the targeted DN is the CLM cluster DN, the action will be done on all 
nodes in the cluster.

   Example for executing the clm-scripts/osafclm_stop script on SC-2:
 immadm -o 5 -p saClmAction:SA_STRING_T:stop safNode=SC-2,safCluster=myClmCluster

   Example for executing the clm-scripts/osafclm_stop script on all nodes:
 immadm -o 5 -p saClmAction:SA_STRING_T:stop safCluster=myClmCluster
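
The command lines above follow a fixed pattern, which a small helper can make explicit. This is only a sketch: `clm_admin_op_argv` and `script_path` are hypothetical helper names, and nothing here is part of OpenSAF. The operation ids, the saClmAction parameter, the script directory and the "osafclm_" prefix are taken from the commit message.

```python
def clm_admin_op_argv(target_dn, action=None):
    """Build the immadm argument list for the two CLM operations.

    Without an action, operation id 4 (reboot) is used. With an action
    name, operation id 5 is used and the name is passed as saClmAction.
    """
    if action is None:
        return ["immadm", "-o", "4", target_dn]
    return ["immadm", "-o", "5",
            "-p", "saClmAction:SA_STRING_T:%s" % action,
            target_dn]


def script_path(action, directory="/usr/lib/opensaf/clm-scripts"):
    """Return the script that an action name maps to on the target node."""
    return "%s/osafclm_%s" % (directory, action)
```

The returned list only describes the command. Running it (for example with `subprocess`) would require a node where immadm is installed.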




---

** [tickets:#2649] clm: add new admin operations to CLM**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Fri Oct 20, 2017 03:24 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Oct 23, 2017 02:47 PM UTC
**Owner:** Zoran Milinkovic


Because executing scripts on remote nodes is hard to handle when a user name 
and password are required, CLM can be used to execute scripts remotely.

The admin operation "immadm -o 4 ..." will be extended so that when the 
targeted object is a CLM node DN, that CLM node will be rebooted.
For example, if SC-2 needs to be rebooted:
immadm -o 4 safNode=SC-2,safCluster=myClmCluster

Another admin operation, "immadm -o 5 ...", will be a more generic admin 
operation that executes scripts stored in the "scripts/clm" directory. The 
script names will be prefixed with "osafclm_".




[tickets] [opensaf:tickets] #2599 imm: remove cascading delete for runtime objects

2017-10-27 Thread Zoran Milinkovic via Opensaf-tickets
- **summary**: smf: remove cascading delete for runtime objects --> imm: remove 
cascading delete for runtime objects
- **status**: unassigned --> accepted
- **assigned_to**: Rafael Odzakow --> Zoran Milinkovic
- **Type**: discussion --> enhancement
- **Component**: smf --> imm
- **Part**: d --> -
- **Priority**: minor --> major
- **Milestone**: 5.17.10 --> future
- **Comment**:

Due to the short time available for testing, the new solution for handling a 
large number of cascade deletes in IMM will be postponed to the next release.



---

** [tickets:#2599] imm: remove cascading delete for runtime objects**

**Status:** accepted
**Milestone:** future
**Created:** Wed Sep 27, 2017 12:01 PM UTC by Rafael Odzakow
**Last Updated:** Wed Sep 27, 2017 12:03 PM UTC
**Owner:** Zoran Milinkovic


Investigation needed. There is a lot of log spam on large installations when 
deleting these objects. It could be caused by SMF deleting individual elements 
instead of a parent root.

Posted by Zoran from IMM:

> The problem comes with cascade delete of SMF objects which contain many 
> thousand objects.
> When cascade delete is invoked, it shouldn't be deleted more than 1 
> objects at once.
> 
> All this is explained in IMM PR document, chapter 3.4, point 6.
> https://sourceforge.net/p/opensaf/documentation/ci/default/tree/OpenSAF_IMMSv_PR.odt
> 
> Adding Rafael Odzakow




[tickets] [opensaf:tickets] #2654 clm: clm test asserts due to timeout in poll

2017-10-25 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/36090263/



---

** [tickets:#2654] clm: clm test asserts due to timeout in poll**

**Status:** review
**Milestone:** 5.17.10
**Created:** Wed Oct 25, 2017 03:04 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Oct 25, 2017 03:04 PM UTC
**Owner:** Zoran Milinkovic


In CLM tests, the immadm command is used in many places to execute CLM admin 
operations for locking, unlocking and shutting down nodes.
In an overloaded system this can cause problems, and the execution of immadm 
can take a long time.
Since we have a sanity check for executing immadm in another thread, this 
situation may end in a poll timeout on an overloaded system.

~~~
Thread 1 (Thread 0x7f3b61c0c740 (LWP 280)):
#0 0x7f3b61007428 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:54
resultvar = 0
pid = 280
selftid = 280
#1 0x7f3b6100902a in __GI_abort () at abort.c:89
save_stage = 2
act = {__sigaction_handler = {sa_handler = 0x4, sa_sigaction = 0x4}, sa_mask = 
{__val = {0, 0, 140732357794048, 47244640256, 139893019865088, 94726958067288, 
865, 94726958071200, 0, 0, 139893007538572, 139893008635480, 139893008649136, 
0, 139893008635480, 94726958067288}}, sa_flags = 1640067072, sa_restorer = 
0x562756afae58}
sigs = {__val = {32, 0 }}
#2 0x7f3b60fffbd7 in __assert_fail_base (fmt=, 
assertion=assertion@entry=0x562756afae58 "ret == 1", 
file=file@entry=0x562756afb398 "src/clm/apitest/tet_saClmClusterTrack.c", 
line=line@entry=865, function=function@entry=0x562756afbda0 
<__PRETTY_FUNCTION__.7254> "saClmClusterTrack_27") at assert.c:92
str = 0x562758c0d360 ""
total = 4096
#3 0x7f3b60fffc82 in __GI___assert_fail 
(assertion=assertion@entry=0x562756afae58 "ret == 1", 
file=file@entry=0x562756afb398 "src/clm/apitest/tet_saClmClusterTrack.c", 
line=line@entry=865, function=function@entry=0x562756afbda0 
<__PRETTY_FUNCTION__.7254> "saClmClusterTrack_27") at assert.c:101
No locals.
#4 0x562756af64c7 in saClmClusterTrack_27 () at 
src/clm/apitest/tet_saClmClusterTrack.c:865
fds = {{fd = 10, events = 1, revents = 0}}
thread8 = 139892944803584
__PRETTY_FUNCTION__ = "saClmClusterTrack_27"
#5 0x562756afa309 in run_test_case (suite=, tcase=) at src/osaf/apitest/utest.c:178
No locals.
#6 0x562756afa824 in test_run (suite=, tcase=) at src/osaf/apitest/utest.c:202
i = 7
j = 27
#7 0x7f3b60ff2830 in __libc_start_main (main=0x562756af2910 , argc=1, 
argv=0x7ffece31db98, init=, fini=, 
rtld_fini=, stack_end=0x7ffece31db88) at ../csu/libc-start.c:291
result = 
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, -8017775631807234294, 
94726958036048, 140732357794704, 0, 0, -4393240863189713142, 
-4430610558499635446}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 
0x7ffece31dba8, 0x7f3b61c1c168}, data = {prev = 0x0, cleanup = 0x0, canceltype 
= -835593304}}}
not_first_call = 
#8 0x562756af3479 in _start ()
~~~
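
The backtrace above is consistent with poll() timing out: frame #4 shows revents = 0 while the test asserts ret == 1. The behaviour can be reproduced in isolation with a pipe. This is a POSIX-only sketch and not the CLM test code; `wait_readable` is a hypothetical helper and the 50 ms timeout is arbitrary.

```python
import os
import select


def wait_readable(fd, timeout_ms):
    """Return True if fd becomes readable within timeout_ms, else False."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    # poll() returns an empty list when the timeout expires first.
    return len(poller.poll(timeout_ms)) == 1


read_fd, write_fd = os.pipe()
# Nothing has been written yet, so the wait times out.
timed_out = not wait_readable(read_fd, 50)
os.write(write_fd, b"x")
# After the write, the descriptor is reported as readable.
ready = wait_readable(read_fd, 50)
os.close(read_fd)
os.close(write_fd)
```

A test that asserts on the ready case fails whenever the writer is slower than the timeout, which is what can happen when immadm runs in another thread on a loaded machine.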




[tickets] [opensaf:tickets] #2654 clm: clm test asserts due to timeout in poll

2017-10-25 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2654] clm: clm test asserts due to timeout in poll**

**Status:** accepted
**Milestone:** 5.17.10
**Created:** Wed Oct 25, 2017 03:04 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Oct 25, 2017 03:04 PM UTC
**Owner:** Zoran Milinkovic


In CLM tests, the immadm command is used in many places to execute CLM admin 
operations for locking, unlocking and shutting down nodes.
In an overloaded system this can cause problems, and the execution of immadm 
can take a long time.
Since we have a sanity check for executing immadm in another thread, this 
situation may end in a poll timeout on an overloaded system.

~~~
Thread 1 (Thread 0x7f3b61c0c740 (LWP 280)):
#0 0x7f3b61007428 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:54
resultvar = 0
pid = 280
selftid = 280
#1 0x7f3b6100902a in __GI_abort () at abort.c:89
save_stage = 2
act = {__sigaction_handler = {sa_handler = 0x4, sa_sigaction = 0x4}, sa_mask = 
{__val = {0, 0, 140732357794048, 47244640256, 139893019865088, 94726958067288, 
865, 94726958071200, 0, 0, 139893007538572, 139893008635480, 139893008649136, 
0, 139893008635480, 94726958067288}}, sa_flags = 1640067072, sa_restorer = 
0x562756afae58}
sigs = {__val = {32, 0 }}
#2 0x7f3b60fffbd7 in __assert_fail_base (fmt=, 
assertion=assertion@entry=0x562756afae58 "ret == 1", 
file=file@entry=0x562756afb398 "src/clm/apitest/tet_saClmClusterTrack.c", 
line=line@entry=865, function=function@entry=0x562756afbda0 
<__PRETTY_FUNCTION__.7254> "saClmClusterTrack_27") at assert.c:92
str = 0x562758c0d360 ""
total = 4096
#3 0x7f3b60fffc82 in __GI___assert_fail 
(assertion=assertion@entry=0x562756afae58 "ret == 1", 
file=file@entry=0x562756afb398 "src/clm/apitest/tet_saClmClusterTrack.c", 
line=line@entry=865, function=function@entry=0x562756afbda0 
<__PRETTY_FUNCTION__.7254> "saClmClusterTrack_27") at assert.c:101
No locals.
#4 0x562756af64c7 in saClmClusterTrack_27 () at 
src/clm/apitest/tet_saClmClusterTrack.c:865
fds = {{fd = 10, events = 1, revents = 0}}
thread8 = 139892944803584
__PRETTY_FUNCTION__ = "saClmClusterTrack_27"
#5 0x562756afa309 in run_test_case (suite=, tcase=) at src/osaf/apitest/utest.c:178
No locals.
#6 0x562756afa824 in test_run (suite=, tcase=) at src/osaf/apitest/utest.c:202
i = 7
j = 27
#7 0x7f3b60ff2830 in __libc_start_main (main=0x562756af2910 , argc=1, 
argv=0x7ffece31db98, init=, fini=, 
rtld_fini=, stack_end=0x7ffece31db88) at ../csu/libc-start.c:291
result = 
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, -8017775631807234294, 
94726958036048, 140732357794704, 0, 0, -4393240863189713142, 
-4430610558499635446}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 
0x7ffece31dba8, 0x7f3b61c1c168}, data = {prev = 0x0, cleanup = 0x0, canceltype 
= -835593304}}}
not_first_call = 
#8 0x562756af3479 in _start ()
~~~




[tickets] [opensaf:tickets] #2649 clm: add new admin operations to CLM

2017-10-23 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/36087212/



---

** [tickets:#2649] clm: add new admin operations to CLM**

**Status:** review
**Milestone:** 5.17.10
**Created:** Fri Oct 20, 2017 03:24 PM UTC by Zoran Milinkovic
**Last Updated:** Fri Oct 20, 2017 03:24 PM UTC
**Owner:** Zoran Milinkovic


Because executing scripts on remote nodes is hard to handle when a user name 
and password are required, CLM can be used to execute scripts remotely.

The admin operation "immadm -o 4 ..." will be extended so that when the 
targeted object is a CLM node DN, that CLM node will be rebooted.
For example, if SC-2 needs to be rebooted:
immadm -o 4 safNode=SC-2,safCluster=myClmCluster

Another admin operation, "immadm -o 5 ...", will be a more generic admin 
operation that executes scripts stored in the "scripts/clm" directory. The 
script names will be prefixed with "osafclm_".




[tickets] [opensaf:tickets] #2649 clm: add new admin operations to CLM

2017-10-20 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2649] clm: add new admin operations to CLM**

**Status:** accepted
**Milestone:** 5.17.10
**Created:** Fri Oct 20, 2017 03:24 PM UTC by Zoran Milinkovic
**Last Updated:** Fri Oct 20, 2017 03:24 PM UTC
**Owner:** Zoran Milinkovic


Because executing scripts on remote nodes is hard to handle when a user name 
and password are required, CLM can be used to execute scripts remotely.

The admin operation "immadm -o 4 ..." will be extended so that when the 
targeted object is a CLM node DN, that CLM node will be rebooted.
For example, if SC-2 needs to be rebooted:
immadm -o 4 safNode=SC-2,safCluster=myClmCluster

Another admin operation, "immadm -o 5 ...", will be a more generic admin 
operation that executes scripts stored in the "scripts/clm" directory. The 
script names will be prefixed with "osafclm_".




[tickets] [opensaf:tickets] Re: #2523 pyosaf: IMM OM module initialized with ERR_BAD_HANDLE

2017-10-11 Thread Zoran Milinkovic via Opensaf-tickets
Hi Hieu,

Just talked with a girl who was involved in raising this ticket, and she said 
that it's ok.
So, you have ack from me on this.

Thanks,
Zoran

-----Original Message-----
From: Hieu Nguyen [mailto:dhie...@users.sf.net] 
Sent: den 3 oktober 2017 10:30
To: [opensaf:tickets] <2...@tickets.opensaf.p.re.sf.net>
Subject: [opensaf:tickets] #2523 pyosaf: IMM OM module initialized with 
ERR_BAD_HANDLE

Hi Zoran !

All functions in pyosaf utils are decorated by the decorate() function in 
pyosaf/utils/__init__.py. You can see this in the __init__.py of 
clm/immoi/immom/log/ntf, as below:
saImmOmInitialize = decorate(saImmOm.saImmOmInitialize)
saImmOmSelectionObjectGet = decorate(saImmOm.saImmOmSelectionObjectGet)
saImmOmDispatch   = decorate(saImmOm.saImmOmDispatch)
saImmOmFinalize   = decorate(saImmOm.saImmOmFinalize)


All returned error codes are handled by the decorate() function (a Python 
decorator). We already handle ERR_BAD_HANDLE in this function by stopping the 
app and raising an exception:
if error != eSaAisErrorT.SA_AIS_OK:
  raise_saf_exception(function, error)

Do you have any ideas about this?



---

** [tickets:#2523] pyosaf: IMM OM module initialized with ERR_BAD_HANDLE**

**Status:** accepted
**Milestone:** 5.17.10
**Created:** Thu Jul 06, 2017 01:40 PM UTC by Zoran Milinkovic
**Last Updated:** Tue Oct 03, 2017 06:24 AM UTC
**Owner:** Hieu Nguyen


pyosaf does not handle initialization of the IMM OM module well.
Regardless of the error code returned from saImmOmInitialize, 
saImmOmAccessorInitialize() is called in the _initialize() function.


---

Sent from sourceforge.net because you indicated interest in 
<https://sourceforge.net/p/opensaf/tickets/2523/>



To unsubscribe from further messages, please visit 
<https://sourceforge.net/auth/subscriptions/>


---

** [tickets:#2523] pyosaf: IMM OM module initialized with ERR_BAD_HANDLE**

**Status:** accepted
**Milestone:** 5.17.10
**Created:** Thu Jul 06, 2017 01:40 PM UTC by Zoran Milinkovic
**Last Updated:** Tue Oct 03, 2017 08:30 AM UTC
**Owner:** Hieu Nguyen


pyosaf does not handle initialization of the IMM OM module well.
Regardless of the error code returned from saImmOmInitialize, 
saImmOmAccessorInitialize() is called in the _initialize() function.
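
The guard this ticket asks for can be shown with stand-in callables. This is a minimal sketch and not the pyosaf code: `initialize`, `om_initialize` and `accessor_initialize` are hypothetical names, and the numeric code is a stand-in.

```python
SA_AIS_OK = 1  # stand-in value for this sketch only


def initialize(om_initialize, accessor_initialize):
    """Call accessor_initialize only when om_initialize succeeded.

    Both arguments are hypothetical callables that return an AIS-style
    error code.
    """
    error = om_initialize()
    if error != SA_AIS_OK:
        # Stop here: there is no valid handle to pass on.
        return error
    return accessor_initialize()
```

With a decorator that raises on any code other than SA_AIS_OK, as described in the quoted mail, the explicit check becomes unnecessary because the exception prevents the second call.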




[tickets] [opensaf:tickets] #2451 clm: Make the cluster reset admin op safe

2017-09-22 Thread Zoran Milinkovic via Opensaf-tickets
I attached one idea (prototype) for the safe cluster restart.
The attached file contains small changes in IMM and CLM.

The idea is that when a cluster restart is invoked by the CLM admin operation, 
CLM first disables sync in IMM (change in IMM), and then continues with 
rebooting the nodes.

If a rebooted node comes up too fast, before the last IMM veteran node goes 
down, IMM sync will not be possible, and the node will hang in the NID 
phase waiting for the sync.
When the last IMM veteran node goes down, IMMD will start electing a new 
coordinator. Since there is no veteran node left in the cluster, the new IMM 
coordinator will start loading data from PBE or the XML file.

The side effect of the attached file is that some nodes which joined before the 
last veteran goes down can be rebooted again, mostly due to the QUIESCED role 
in RDE, or if they are payloads running without SC absence allowed.
There is nothing wrong with rebooting those nodes again. They are still in 
the OpenSAF starting phase, and no application is up and running. So 
rebooting those nodes is safe.

The attached file is only a proposal and needs to be split into two tickets, 
one for IMM (disable sync feature) and this ticket for CLM.

For the IMM part, I would like to make the disable sync function a one-way 
function: when the sync is disabled, it cannot be enabled again until the 
cluster restart is done.
In the attached file, the disable sync feature can be switched on and off.
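
The one-way behaviour proposed for the IMM part can be pictured as a latch. This is only an illustration of the idea: `SyncGate` is a hypothetical class and is not taken from the attached diff.

```python
class SyncGate(object):
    """Hypothetical one-way switch: once disabled, sync stays disabled."""

    def __init__(self):
        self._enabled = True

    @property
    def enabled(self):
        return self._enabled

    def disable(self):
        """Disable sync before the nodes are rebooted."""
        self._enabled = False

    def enable(self):
        """Refuse to re-enable sync once it has been disabled."""
        if not self._enabled:
            raise RuntimeError(
                "sync stays disabled until the cluster restart is done")
```

In this picture only a fresh start creates a new gate with sync enabled, which corresponds to the restarted cluster loading from PBE or the XML file.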



Attachments:

- 
[clmrestart.diff](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/d666d71b/3ab4/attachment/clmrestart.diff)
 (6.4 kB; application/octet-stream)


---

** [tickets:#2451] clm: Make the cluster reset admin op safe**

**Status:** review
**Milestone:** 5.17.10
**Created:** Wed May 03, 2017 10:51 AM UTC by Anders Widell
**Last Updated:** Fri Sep 15, 2017 06:01 AM UTC
**Owner:** Hans Nordebäck


The cluster reset admin operation that was implemented in ticket [#2053] is not 
safe: if a node reboots very fast it can come up again and join the old cluster 
before other nodes have rebooted. See mail discussion:

https://sourceforge.net/p/opensaf/mailman/message/35398725/

This can be solved by implementing a two-phase cluster reset or by introducing 
a cluster generation number which is increased at each cluster reset (maybe 
both ordered and spontaneous cluster resets). A node will not be allowed to join 
the cluster with a different cluster generation without first rebooting.
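
The generation number variant comes down to a simple comparison at join time. This is a sketch under assumptions: `cluster_reset` and `may_join` are hypothetical names, and how a node learns the current generation is left open by the ticket.

```python
def cluster_reset(cluster_generation):
    """Return the generation number in effect after a cluster reset."""
    return cluster_generation + 1


def may_join(cluster_generation, node_generation):
    """Allow a join only when the node carries the current generation."""
    return node_generation == cluster_generation
```

A node that was still running during the reset keeps the old number and is rejected until it has rebooted and picked up the new one.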




[tickets] [opensaf:tickets] #2579 imm: remove disconnected appliers

2017-09-15 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

commit 959c04b78346991bcffa75d1e4774d3610413e6e
Author: Zoran Milinkovic 
Date:   Mon Sep 11 17:46:23 2017 +0200

imm: remove disconnected appliers [#2579]

When an applier is disconnected, it will be removed from the system after 
the time period set in the minApplierTimeout attribute in the IMM object.
If the time is set to 0, it will work as it works today, and the applier will 
never be removed from the system.

The time set in the minApplierTimeout attribute guarantees that the applier 
will be "alive" for at least the time set in the attribute.
When this period expires, the applier will be removed with the next 
"clean the basement" call.

The time set in the minApplierTimeout attribute is in seconds.

To enable this feature, protocol51710 must be enabled and the 
minApplierTimeout attribute value must be greater than 0.
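
The rule described in this commit message can be sketched as a pure function. This is not the IMM implementation: `appliers_to_remove` and the shape of its input are assumptions, and treating an elapsed time equal to the timeout as expired is a choice made for this example.

```python
def appliers_to_remove(disconnected_at, now, min_applier_timeout):
    """Return the appliers a cleanup pass would remove.

    disconnected_at maps an applier name to the time, in seconds, at which
    it was disconnected. A timeout of 0 disables removal.
    """
    if min_applier_timeout <= 0:
        return []
    return sorted(name for name, since in disconnected_at.items()
                  if now - since >= min_applier_timeout)
```

Because removal only happens when the cleanup runs, an applier stays in the system for at least the configured time and possibly longer, which matches the "at least" wording above.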



---

** [tickets:#2579] imm: remove disconnected appliers**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Mon Sep 11, 2017 02:16 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 11, 2017 03:56 PM UTC
**Owner:** Zoran Milinkovic


Once an implementer is created, it resides in the system until the cluster 
restart. There is no way to remove implementers from the system.
An applier is a special case of an implementer, and the same applies to 
appliers.

When OpenSAF is running in the cloud, nodes are added to and removed from the 
system, and new nodes may come with new names.
This causes a problem with appliers, which in most cases contain a node name in 
the applier name.
If nodes are added and removed frequently, it can happen 
that all 3000 implementers are allocated, and initialization of new implementers 
or appliers is not possible.

Since there are a limited number of implementers, there is no need to remove 
implementers from the system.
IMM should remove appliers from the system if they are not used for a certain 
time.

If SC absence allowed is enabled, this time should not be less than the SC 
absence allowed time. Otherwise all appliers will be discarded when the system 
comes back from the headless state.





[tickets] [opensaf:tickets] #2579 imm: remove disconnected appliers

2017-09-11 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/36033957/



---

** [tickets:#2579] imm: remove disconnected appliers**

**Status:** review
**Milestone:** 5.17.10
**Created:** Mon Sep 11, 2017 02:16 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 11, 2017 02:16 PM UTC
**Owner:** Zoran Milinkovic


Once an implementer is created, it resides in the system until the cluster 
restart. There is no way to remove implementers from the system.
An applier is a special case of an implementer, and the same applies to 
appliers.

When OpenSAF is running in the cloud, nodes are added to and removed from the 
system, and new nodes may come with new names.
This causes a problem with appliers, which in most cases contain a node name in 
the applier name.
If nodes are added and removed frequently, it can happen 
that all 3000 implementers are allocated, and initialization of new implementers 
or appliers is not possible.

Since there are a limited number of implementers, there is no need to remove 
implementers from the system.
IMM should remove appliers from the system if they are not used for a certain 
time.

If SC absence allowed is enabled, this time should not be less than the SC 
absence allowed time. Otherwise all appliers will be discarded when the system 
comes back from the headless state.





[tickets] [opensaf:tickets] #2579 imm: remove disconnected appliers

2017-09-11 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2579] imm: remove disconnected appliers**

**Status:** accepted
**Milestone:** 5.17.10
**Created:** Mon Sep 11, 2017 02:16 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 11, 2017 02:16 PM UTC
**Owner:** Zoran Milinkovic


Once an implementer is created, it resides in the system until the cluster 
restart. There is no way to remove implementers from the system.
An applier is a special case of an implementer, and the same applies to 
appliers.

When OpenSAF is running in the cloud, nodes are added to and removed from the 
system, and new nodes may come with new names.
This causes a problem with appliers, which in most cases contain a node name in 
the applier name.
If nodes are added and removed frequently, it can happen 
that all 3000 implementers are allocated, and initialization of new implementers 
or appliers is not possible.

Since there are a limited number of implementers, there is no need to remove 
implementers from the system.
IMM should remove appliers from the system if they are not used for a certain 
time.

If SC absence allowed is enabled, this time should not be less than the SC 
absence allowed time. Otherwise all appliers will be discarded when the system 
comes back from the headless state.





[tickets] [opensaf:tickets] Re: #2532 mds: TCP SVC_UP event is not received after subscribing

2017-08-29 Thread Zoran Milinkovic via Opensaf-tickets
The tests have been started.
We'll come back when the problem is reproduced.


---

** [tickets:#2532] mds: TCP SVC_UP event is not received after subscribing**

**Status:** unassigned
**Milestone:** 5.17.10
**Created:** Fri Jul 21, 2017 05:59 AM UTC by Hung Nguyen
**Last Updated:** Tue Aug 29, 2017 08:23 AM UTC
**Owner:** nobody
**Attachments:**

- 
[logs_n_traces.tgz](https://sourceforge.net/p/opensaf/tickets/2532/attachment/logs_n_traces.tgz)
 (1.5 MB; application/x-compressed)


MDS was successfully installed on IMMA, and IMMA subscribed to IMMND successfully.
IMMND received the SVC_UP event for IMMA, but IMMA did not receive the SVC_UP 
event for IMMND.

~~~
<142>1 2017-07-20T13:00:36.072773+02:00 PL-4 immomtest 278 mds.log [meta 
sequenceId="14043"] MCM:API: svc_id = IMMA_OM(26) on VDEST id = 65535, 
SVC_PVT_VER = 0 Install Successfull
> ...
<142>1 2017-07-20T13:00:36.073091+02:00 PL-4 immomtest 278 mds.log [meta 
sequenceId="14074"] MCM:API: svc_subscribe :svc_id = IMMA_OM(26) on VDEST id = 
65535 Subscription to svc_id = IMMND(25) Successful
> ...
<142>1 2017-07-20T13:00:36.073904+02:00 PL-4 osafimmnd 177 mds.log [meta 
sequenceId="96185"] MCM:API: svc_up : svc_id = IMMND(25) on DEST id = 65535 got 
UP for svc_id = IMMA_OM(26) on Adest = , 
rem_svc_pvt_ver=0, rem_svc_archword=10
~~~


IMMA waited for the SVC_UP event for 30 sec but didn't receive anything.
~~~
Jul 20 13:00:36.071465 imma [278:278:src/imm/agent/imma_init.cc:0263] >> 
imma_startup 
Jul 20 13:00:36.071474 imma [278:278:src/imm/agent/imma_init.cc:0273] TR use 
count 0
Jul 20 13:00:36.071484 imma [278:278:src/base/ncs_main_pub.c:0220] TR 
NCS:PROCESS_ID=278
Jul 20 13:00:36.071494 imma [278:278:src/base/sysf_def.c:0089] TR INITIALIZING 
LEAP ENVIRONMENT
Jul 20 13:00:36.071584 imma [278:278:src/base/sysf_def.c:0124] TR DONE 
INITIALIZING LEAP ENVIRONMENT
Jul 20 13:00:36.071832 imma [278:278:src/base/ncs_main_pub.c:0757] TR 
NCS:NODE_ID=0x0002040F
Jul 20 13:00:36.072329 imma [278:278:src/mbc/mbcsv_dl_api.c:0059] >> 
mbcsv_lib_req 
Jul 20 13:00:36.072350 imma [278:278:src/mbc/mbcsv_dl_api.c:0096] >> 
mbcsv_lib_init 
Jul 20 13:00:36.072378 imma [278:278:src/mbc/mbcsv_mbx.c:0174] >> 
mbcsv_initialize_mbx_list 
Jul 20 13:00:36.072389 imma [278:278:src/mbc/mbcsv_mbx.c:0189] << 
mbcsv_initialize_mbx_list 
Jul 20 13:00:36.072399 imma [278:278:src/mbc/mbcsv_pwe_anc.c:0158] >> 
mbcsv_initialize_peer_list 
Jul 20 13:00:36.072409 imma [278:278:src/mbc/mbcsv_pwe_anc.c:0173] << 
mbcsv_initialize_peer_list 
Jul 20 13:00:36.072419 imma [278:278:src/mbc/mbcsv_dl_api.c:0075] << 
mbcsv_lib_req 
Jul 20 13:00:36.072440 imma [278:278:src/base/ncs_main_pub.c:0389] TR 
MBCSV:MBCA:ON
Jul 20 13:00:36.073104 imma [278:278:src/imm/agent/imma_init.cc:0063] >> 
imma_sync_with_immnd 
Jul 20 13:00:36.073114 imma [278:278:src/imm/agent/imma_init.cc:0071] TR 
Blocking first client
Jul 20 13:01:06.102156 imma [278:278:src/imm/agent/imma_init.cc:0081] TR 
Blocking wait released
Jul 20 13:01:06.102375 imma [278:278:src/imm/agent/imma_init.cc:0091] << 
imma_sync_with_immnd 
Jul 20 13:01:06.102413 imma [278:278:src/imm/agent/imma_init.cc:0179] TR Client 
agent successfully initialized
Jul 20 13:01:06.102427 imma [278:278:src/imm/agent/imma_init.cc:0296] << 
imma_startup: use count 1
~~~


Traces and logs are attached.
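For readers unfamiliar with this kind of blocking wait, here is a minimal sketch of a timed wait for a service-up notification. It is not the IMMA or MDS code: `SvcUpWaiter`, `NotifyUp` and `WaitForUp` are hypothetical names. It only shows the behaviour seen in the trace, where the wait is released by the timeout because the notification never arrives.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for an agent waiting on a service-up notification.
class SvcUpWaiter {
 public:
  // Would be called from the event callback when the SVC_UP event arrives.
  void NotifyUp() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      up_ = true;
    }
    cv_.notify_all();
  }

  // Blocks until the event was seen or the timeout expires.
  // Returns true only if the event was seen.
  bool WaitForUp(std::chrono::milliseconds timeout) {
    assert(timeout.count() >= 0);
    std::unique_lock<std::mutex> lock(mu_);
    return cv_.wait_for(lock, timeout, [this] { return up_; });
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool up_ = false;
};
```

In this sketch a `false` return from `WaitForUp` corresponds to the wait ending by timeout rather than by the event, so the caller can tell the two cases apart.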




---



[tickets] [opensaf:tickets] Re: #2532 mds: TCP SVC_UP event is not received after subscribing

2017-08-29 Thread Zoran Milinkovic via Opensaf-tickets
Hi Mahesh,

I just want to add that this issue is very hard to reproduce.

We see this issue at least once a day in our test environment.
I have never managed to reproduce the problem in my environment.

BR,
Zoran


---

** [tickets:#2532] mds: TCP SVC_UP event is not received after subscribing**

**Status:** unassigned
**Milestone:** 5.17.10
**Created:** Fri Jul 21, 2017 05:59 AM UTC by Hung Nguyen
**Last Updated:** Tue Aug 29, 2017 03:30 AM UTC
**Owner:** nobody
**Attachments:**

- 
[logs_n_traces.tgz](https://sourceforge.net/p/opensaf/tickets/2532/attachment/logs_n_traces.tgz)
 (1.5 MB; application/x-compressed)






---



[tickets] [opensaf:tickets] #2559 imm: change log level to warning on ERR_TRY_AGAIN in PBE

2017-08-18 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

develop:

commit 43085df2640649ff596c88336e6df485ebd8a571
Author: Zoran Milinkovic 
Date:   Fri Aug 18 09:38:45 2017 +0200

imm: change log level from error to warning in PBE [#2559]

In #2491, the log message is always logged at error level.
If a cluster goes headless, this case is very likely to happen
with ERR_TRY_AGAIN, but that is expected behavior when the cluster
goes headless.

-

release:

commit 43085df2640649ff596c88336e6df485ebd8a571
Author: Zoran Milinkovic 
Date:   Fri Aug 18 09:38:45 2017 +0200

imm: change log level from error to warning in PBE [#2559]

In #2491, the log message is always logged at error level.
If a cluster goes headless, this case is very likely to happen
with ERR_TRY_AGAIN, but that is expected behavior when the cluster
goes headless.



---

** [tickets:#2559] imm: change log level to warning on ERR_TRY_AGAIN in PBE**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Thu Aug 17, 2017 12:45 PM UTC by Zoran Milinkovic
**Last Updated:** Thu Aug 17, 2017 01:05 PM UTC
**Owner:** Zoran Milinkovic


In ticket #2491, the added block 
~~~
> +if (errorCode != SA_AIS_ERR_NOT_EXIST
> +&& errorCode != SA_AIS_ERR_INVALID_PARAM) {
> +LOG_ER("Failed to get class description for class '%s' from imm "
> +   "with error=%d, exiting",
> +   classNameString.c_str(), errorCode);
> +exit(1);
> +}
~~~
is too strict and logs an error message even for ERR_TRY_AGAIN.

ERR_TRY_AGAIN is very common when the cluster goes headless. If the 
switchover is very quick and completes before IMMD goes down, PBE can start 
regenerating the database on the newly elected IMM coordinator node and 
receive ERR_TRY_AGAIN.

For ERR_TRY_AGAIN in this case, the log level should be reduced from error to 
warning.
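The requested change in severity can be sketched as a small mapping from error code to log level. This is an assumption-laden illustration, not the committed patch: `ImmError`, `LogLevel` and `LevelForClassDescriptionFailure` are hypothetical names, and the enum values are placeholders rather than the numeric SaAisErrorT values from the SAF headers.

```cpp
#include <cassert>

// Stand-ins for the error codes named in the ticket. The values are
// placeholders and do not match the SAF header definitions.
enum class ImmError { kOk, kTryAgain, kInvalidParam, kNotExist, kOther };

enum class LogLevel { kNone, kWarning, kError };

// Picks the log level for a failed class description lookup.
LogLevel LevelForClassDescriptionFailure(ImmError error) {
  switch (error) {
    case ImmError::kOk:
    case ImmError::kNotExist:
    case ImmError::kInvalidParam:
      // These codes are excluded by the condition in the quoted block.
      return LogLevel::kNone;
    case ImmError::kTryAgain:
      // Expected while the cluster is headless, so only a warning.
      return LogLevel::kWarning;
    case ImmError::kOther:
      return LogLevel::kError;
  }
  assert(false && "unknown ImmError");
  return LogLevel::kError;
}
```

Whether the process still exits after logging a warning is not stated in the ticket, so the sketch deliberately covers only the choice of log level.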


---



[tickets] [opensaf:tickets] #2559 imm: change log level to warning on ERR_TRY_AGAIN in PBE

2017-08-17 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/36000592/



---

** [tickets:#2559] imm: change log level to warning on ERR_TRY_AGAIN in PBE**

**Status:** review
**Milestone:** 5.17.10
**Created:** Thu Aug 17, 2017 12:45 PM UTC by Zoran Milinkovic
**Last Updated:** Thu Aug 17, 2017 12:45 PM UTC
**Owner:** Zoran Milinkovic




---



[tickets] [opensaf:tickets] #2559 imm: change log level to warning on ERR_TRY_AGAIN in PBE

2017-08-17 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2559] imm: change log level to warning on ERR_TRY_AGAIN in PBE**

**Status:** accepted
**Milestone:** 5.17.10
**Created:** Thu Aug 17, 2017 12:45 PM UTC by Zoran Milinkovic
**Last Updated:** Thu Aug 17, 2017 12:45 PM UTC
**Owner:** Zoran Milinkovic




---



[tickets] [opensaf:tickets] #2549 imm: immnd may assert after coming from headless state or losing both controllers

2017-08-17 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

develop:

commit e78203f0153360806c318e59b47845d438b3382e
Author: Zoran Milinkovic 
Date:   Thu Aug 17 11:41:56 2017 +0200

imm: fix immnd coredump due to initialized CLM handle [#2549]

Initially the CLM handle is set to 0. Once the CLM handle is initialized,
it can be initialized again only when saClmDispatch returns 
SA_AIS_ERR_BAD_HANDLE.
This prevents coredumps that occur when an MDS UP message for the AMF and
CLM services arrives while the CLM handle is already initialized.

-

release:

commit 6daa3b31255dbd2bc28b401d6c3f3766d6f6e2cf
Author: Zoran Milinkovic 
Date:   Thu Aug 17 11:41:56 2017 +0200

imm: fix immnd coredump due to initialized CLM handle [#2549]

Initially the CLM handle is set to 0. Once the CLM handle is initialized,
it can be initialized again only when saClmDispatch returns 
SA_AIS_ERR_BAD_HANDLE.
This prevents coredumps that occur when an MDS UP message for the AMF and
CLM services arrives while the CLM handle is already initialized.



---

** [tickets:#2549] imm: immnd may assert after coming from headless state or 
losing both controllers**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Wed Aug 09, 2017 10:54 AM UTC by Zoran Milinkovic
**Last Updated:** Wed Aug 09, 2017 11:46 AM UTC
**Owner:** Zoran Milinkovic


~~~
Thread 1 (Thread 0x7fda7f771740 (LWP 11485)):
#0  0x7fda7df738d7 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x7fda7df74caa in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x7fda7eee404e in __osafassert_fail (__file=, 
__line=, __func=, __assertion=) at 
../../opensaf/src/base/sysf_def.c:286
No locals.
#3  0x564aa570110d in main (argc=, argv=) at 
../../opensaf/src/imm/immnd/immnd_main.c:426
now = {tv_sec = 2116, tv_nsec = 356673778}
passed_time = 
mbx_fd = 
error = 
timeout = 1000
eventCount = 14
maxEvt = 100
start_time = {tv_sec = 2116, tv_nsec = 355976144}
fds = {{fd = 20, events = 1, revents = 0}, {fd = 18, events = 1, 
revents = 0}, {fd = 14, events = 1, revents = 1}, {fd = 16, events = 1, revents 
= 1}, {fd = 22, events = 1, revents = 0}}
term_fd = 20
nfds = 5
__FUNCTION__ = "main"
~~~

When a system comes back from the headless state, or loses both controllers 
with SC roaming enabled, the MDS UP message for AMF or CLM triggers an 
FD_CLM_INIT event while immnd_cb->clm_hdl is not 0.
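One way to read the rule from the commit message for this ticket (the handle starts at 0 and may be initialized again only after a dispatch call reports a bad handle) is the guard below. It is a sketch under assumptions, not the immnd code: `ClmState`, `HandleClmInitEvent` and `HandleDispatchResult` are hypothetical, and the "handle" is a fake counter instead of a real CLM handle.

```cpp
#include <cassert>
#include <cstdint>

enum class DispatchResult { kOk, kBadHandle };

struct ClmState {
  std::uint64_t handle = 0;  // 0 means "not initialized"
  int init_calls = 0;        // counts how often initialization really ran
};

// Handles a (hypothetical) CLM init event, e.g. after an MDS UP message.
// Initialization runs only while the stored handle is 0.
bool HandleClmInitEvent(ClmState& state) {
  if (state.handle != 0) return false;  // already initialized, ignore event
  ++state.init_calls;
  state.handle = static_cast<std::uint64_t>(state.init_calls);  // fake handle
  assert(state.handle != 0);
  return true;
}

// Feeds back the result of a dispatch call. Only a bad handle clears the
// stored handle, so that the next init event may initialize again.
void HandleDispatchResult(ClmState& state, DispatchResult result) {
  if (result == DispatchResult::kBadHandle) state.handle = 0;
}
```

In this sketch a second init event that arrives while the handle is still valid is simply ignored instead of reaching an assertion.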



---



[tickets] [opensaf:tickets] #2491 imm: PBE regenerates imm.db if immnd exits during the PBE state verification

2017-08-16 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

develop:

commit deb8bea9703ef121c9c7324bcf6c94628fa0d7d1
Author: Zoran Milinkovic 
Date:   Wed Aug 16 10:53:58 2017 +0200

imm: regenerate PBE in verifyClassPBE only if database is corrupted [#2491]

In verifyClassPBE(), the patch distinguishes between an IMM issue and 
database corruption.
For an IMM issue, PBE will not be regenerated, while for database corruption, 
PBE will be regenerated.

-

release:

commit e5052b27fd240d03344eb8a3249e1caf0d89c658
Author: Zoran Milinkovic 
Date:   Wed Aug 16 10:53:58 2017 +0200

imm: regenerate PBE in verifyClassPBE only if database is corrupted [#2491]

In verifyClassPBE(), the patch distinguishes between an IMM issue and 
database corruption.
For an IMM issue, PBE will not be regenerated, while for database corruption, 
PBE will be regenerated.



---

** [tickets:#2491] imm: PBE regenerates imm.db if immnd exits during the PBE 
state verification**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Fri Jun 09, 2017 09:18 AM UTC by Zoran Milinkovic
**Last Updated:** Mon Aug 14, 2017 12:41 PM UTC
**Owner:** Zoran Milinkovic


If IMMND exits during the verification of the PBE state, PBE regenerates a new 
database from the XML file.

PBE must distinguish between IMM issues (IMMND exits, network problems, etc.) 
and database corruption.
If it is an IMM issue, PBE should restart.
If it is a database corruption issue, PBE should be regenerated.


2017-06-05T17:00:24.31 cm1 local0.notice osafimmnd[984]: NO This IMMND is now 
the NEW Coord
2017-06-05T17:00:24.31 cm1 user.info osafimmpbed: IN arg[0] == 
'/usr/lib64/opensaf/osafimmpbed'
2017-06-05T17:00:24.31 cm1 user.info osafimmpbed: IN arg[1] == '--recover'
2017-06-05T17:00:24.31 cm1 user.info osafimmpbed: IN arg[2] == '--pbe'
2017-06-05T17:00:24.31 cm1 user.info osafimmpbed: IN arg[3] == 
'/storage/clear/coremw/etc/imm.db'
2017-06-05T17:00:24.68 cm1 local0.err osafimmnd[984]: ER No IMMD service => 
cluster restart, exiting

2017-06-05T17:00:34.73 cm1 user.warning osafimmpbed: WA Verify class 
CmwMgntLockClass failed!
2017-06-05T17:00:34.73 cm1 user.notice osafimmpbed: NO Renamed 
/storage/clear/coremw/etc/imm.db to 
/storage/clear/coremw/etc/imm.db.failed_immdump because it has been detected to 
be corrupt.
2017-06-05T17:00:34.73 cm1 user.notice osafimmpbed: NO Removed obsolete journal 
file: /storage/clear/coremw/etc/imm.db-journal 
2017-06-05T17:00:34.73 cm1 user.warning osafimmpbed: WA verifyPbeState failed!
2017-06-05T17:00:34.73 cm1 user.warning osafimmpbed: WA Pbe: Failed to 
re-attach to db file /storage/clear/coremw/etc/imm.db - regenerating db file
2017-06-05T17:00:34.73 cm1 user.info osafimmpbed: IN Generating DB file from 
current IMM state. DB file: /storage/clear/coremw/etc/imm.db
2017-06-05T17:00:34.74 cm1 user.notice osafimmpbed: NO Successfully opened 
empty local sqlite pbe file /tmp/ImmPbeTmpSubDir/imm.db.d3HPkE
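The distinction asked for in this ticket can be written down as a small decision function. The names `VerifyResult`, `PbeAction` and `ActionFor` are hypothetical and only illustrate the intended outcome; how verifyClassPBE() actually detects each case is not shown here.

```cpp
#include <cassert>

// Hypothetical outcome of verifying the PBE state.
enum class VerifyResult { kOk, kImmUnavailable, kDatabaseCorrupt };

// What PBE should do next, following the ticket description.
enum class PbeAction { kContinue, kRestart, kRegenerate };

PbeAction ActionFor(VerifyResult result) {
  switch (result) {
    case VerifyResult::kOk:
      return PbeAction::kContinue;
    case VerifyResult::kImmUnavailable:
      // IMMND exited or the network failed: the database itself is fine.
      return PbeAction::kRestart;
    case VerifyResult::kDatabaseCorrupt:
      return PbeAction::kRegenerate;
  }
  assert(false && "unknown VerifyResult");
  return PbeAction::kRestart;
}
```

In the log excerpt above, the failed class verification happened after IMMND had exited, which under this sketch would map to a restart rather than to renaming imm.db and regenerating it.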



---



[tickets] [opensaf:tickets] #2544 imm: saClmDispatch is returning ERR_BAD_HANDLE until immnd initializes CLM handle

2017-08-16 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

develop:

commit 0b0d224ff3da7e59c7bb215664b1b39144a789f7
Author: Zoran Milinkovic 
Date:   Wed Aug 16 09:38:52 2017 +0200

imm: include CLM in poll before CLM handle is initialized [#2544]

The CLM selection object is initially set to -1. The CLM entry included
in poll is ignored until the CLM selection object is created
and set in fds[FD_CLM].

-

release:

commit 9f2329c6dd529e9196a2a2868ef251423e658330
Author: Zoran Milinkovic 
Date:   Wed Aug 16 09:38:52 2017 +0200

imm: include CLM in poll before CLM handle is initialized [#2544]

The CLM selection object is initially set to -1. The CLM entry included
in poll is ignored until the CLM selection object is created
and set in fds[FD_CLM].



---

** [tickets:#2544] imm: saClmDispatch is returning ERR_BAD_HANDLE until immnd 
initializes CLM handle**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Mon Aug 07, 2017 02:08 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Aug 07, 2017 03:06 PM UTC
**Owner:** Zoran Milinkovic


In the poll loop, immnd handles the FD_CLM event even if the FD_CLM entry was 
not processed by the poll call.

With random values in fds[FD_CLM], immnd may process an FD_CLM event that poll 
never reported. saClmDispatch then returns SA_AIS_ERR_BAD_HANDLE until the CLM 
handle is initialized and the CLM selection object is set in 
immnd_cb->clmSelectionObject.
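The fix relies on standard poll() behaviour: an entry whose fd is negative is ignored and its revents field is returned as 0. A minimal sketch of that behaviour follows. The index names and the function `ClmReadyAfterPoll` are hypothetical and only mirror the idea of always passing the CLM slot to poll; this is not the immnd main loop.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Hypothetical slot indices for this sketch only.
enum { kSlotOther = 0, kSlotClm = 1, kSlotCount = 2 };

// Polls both slots once and reports whether the CLM slot has input.
// Pass clm_fd = -1 while no CLM selection object exists yet.
bool ClmReadyAfterPoll(int other_fd, int clm_fd, int timeout_ms) {
  struct pollfd fds[kSlotCount];
  fds[kSlotOther].fd = other_fd;
  fds[kSlotOther].events = POLLIN;
  fds[kSlotOther].revents = 0;
  fds[kSlotClm].fd = clm_fd;  // a negative fd makes poll skip this entry
  fds[kSlotClm].events = POLLIN;
  fds[kSlotClm].revents = 0;

  int rc = poll(fds, kSlotCount, timeout_ms);
  if (rc <= 0) return false;  // timeout or error: nothing to dispatch

  // Every entry was handed to poll, so revents is always well defined here.
  return (fds[kSlotClm].revents & POLLIN) != 0;
}
```

Because the CLM slot is always part of the poll set, its revents field never holds stale or random data, which is the situation described in the ticket.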


---



[tickets] [opensaf:tickets] #2491 imm: PBE regenerates imm.db if immnd exits during the PBE state verification

2017-08-14 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/35995813/



---

** [tickets:#2491] imm: PBE regenerates imm.db if immnd exits during the PBE 
state verification**

**Status:** review
**Milestone:** 5.17.10
**Created:** Fri Jun 09, 2017 09:18 AM UTC by Zoran Milinkovic
**Last Updated:** Mon Aug 14, 2017 12:07 PM UTC
**Owner:** Zoran Milinkovic





---



[tickets] [opensaf:tickets] #2491 imm: PBE regenerates imm.db if immnd exits during the PBE state verification

2017-08-14 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: duplicate --> accepted
- **Comment**:

This is not a duplicate ticket.
The problem described in this ticket is similar to the problem solved in #2527.



---

** [tickets:#2491] imm: PBE regenerates imm.db if immnd exits during the PBE 
state verification**

**Status:** accepted
**Milestone:** 5.17.10
**Created:** Fri Jun 09, 2017 09:18 AM UTC by Zoran Milinkovic
**Last Updated:** Mon Aug 14, 2017 11:42 AM UTC
**Owner:** Zoran Milinkovic





---



[tickets] [opensaf:tickets] #2491 imm: PBE regenerates imm.db if immnd exits during the PBE state verification

2017-08-14 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> duplicate
- **Comment**:

This ticket is a duplicate of #2527.
The problem has been solved in #2527



---

** [tickets:#2491] imm: PBE regenerates imm.db if immnd exits during the PBE 
state verification**

**Status:** duplicate
**Milestone:** 5.17.10
**Created:** Fri Jun 09, 2017 09:18 AM UTC by Zoran Milinkovic
**Last Updated:** Fri Jul 28, 2017 08:23 AM UTC
**Owner:** Zoran Milinkovic





---



[tickets] [opensaf:tickets] #2248 base: leap test hangs forever

2017-08-10 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> duplicate
- **Blocker**:  --> False
- **Milestone**: 5.17.08 --> future
- **Comment**:

The solution will be provided with ticket #2440



---

** [tickets:#2248] base: leap test hangs forever**

**Status:** duplicate
**Milestone:** future
**Created:** Tue Jan 03, 2017 02:49 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Apr 10, 2017 01:40 PM UTC
**Owner:** Zoran Milinkovic


When the leap test is run manually, it hangs forever in 
SysfTmrTest.TestIntervalTimer.

Timers created before the timer engine thread has started have a much longer 
timeout.



---



[tickets] [opensaf:tickets] #2549 imm: immnd may assert after coming from headless state or losing both controllers

2017-08-09 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/35988641/



---

** [tickets:#2549] imm: immnd may assert after coming from headless state or 
losing both controllers**

**Status:** review
**Milestone:** 5.17.10
**Created:** Wed Aug 09, 2017 10:54 AM UTC by Zoran Milinkovic
**Last Updated:** Wed Aug 09, 2017 10:56 AM UTC
**Owner:** Zoran Milinkovic


~~~
Thread 1 (Thread 0x7fda7f771740 (LWP 11485)):
---Type <return> to continue, or q <return> to quit---
#0  0x7fda7df738d7 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x7fda7df74caa in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x7fda7eee404e in __osafassert_fail (__file=, 
__line=, __func=, __assertion=) at 
../../opensaf/src/base/sysf_def.c:286
No locals.
#3  0x564aa570110d in main (argc=, argv=) at 
../../opensaf/src/imm/immnd/immnd_main.c:426
now = {tv_sec = 2116, tv_nsec = 356673778}
passed_time = 
mbx_fd = 
error = 
timeout = 1000
eventCount = 14
maxEvt = 100
start_time = {tv_sec = 2116, tv_nsec = 355976144}
fds = {{fd = 20, events = 1, revents = 0}, {fd = 18, events = 1, 
revents = 0}, {fd = 14, events = 1, revents = 1}, {fd = 16, events = 1, revents 
= 1}, {fd = 22, events = 1, revents = 0}}
term_fd = 20
nfds = 5
__FUNCTION__ = "main"
~~~

When a system recovers from the headless state, or loses both controllers with 
SC roaming enabled, the MDS up message for AMF or CLM triggers an FD_CLM_INIT 
event while immnd_cb->clm_hdl is already non-zero.



---



[tickets] [opensaf:tickets] #2549 imm: immnd may assert after coming from headless state or losing both controllers

2017-08-09 Thread Zoran Milinkovic via Opensaf-tickets
- Description has changed:

Diff:



--- old
+++ new
@@ -1,3 +1,4 @@
+~~~
 Thread 1 (Thread 0x7fda7f771740 (LWP 11485)):
 ---Type  to continue, or q  to quit---
 #0  0x7fda7df738d7 in raise () from /lib64/libc.so.6
@@ -19,5 +20,6 @@
 term_fd = 20
 nfds = 5
 __FUNCTION__ = "main"
-
+~~~
+
 When a system comes from headless state or losing both controllers with 
enabled SC roaming, MDS up message for AMF or CLM up triggers FD_CLM_INIT event 
when immnd_cb->clm_hdl is not 0.






---

** [tickets:#2549] imm: immnd may assert after coming from headless state or 
losing both controllers**

**Status:** accepted
**Milestone:** 5.17.10
**Created:** Wed Aug 09, 2017 10:54 AM UTC by Zoran Milinkovic
**Last Updated:** Wed Aug 09, 2017 10:54 AM UTC
**Owner:** Zoran Milinkovic


~~~
Thread 1 (Thread 0x7fda7f771740 (LWP 11485)):
---Type <return> to continue, or q <return> to quit---
#0  0x7fda7df738d7 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x7fda7df74caa in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x7fda7eee404e in __osafassert_fail (__file=, 
__line=, __func=, __assertion=) at 
../../opensaf/src/base/sysf_def.c:286
No locals.
#3  0x564aa570110d in main (argc=, argv=) at 
../../opensaf/src/imm/immnd/immnd_main.c:426
now = {tv_sec = 2116, tv_nsec = 356673778}
passed_time = 
mbx_fd = 
error = 
timeout = 1000
eventCount = 14
maxEvt = 100
start_time = {tv_sec = 2116, tv_nsec = 355976144}
fds = {{fd = 20, events = 1, revents = 0}, {fd = 18, events = 1, 
revents = 0}, {fd = 14, events = 1, revents = 1}, {fd = 16, events = 1, revents 
= 1}, {fd = 22, events = 1, revents = 0}}
term_fd = 20
nfds = 5
__FUNCTION__ = "main"
~~~

When a system recovers from the headless state, or loses both controllers with 
SC roaming enabled, the MDS up message for AMF or CLM triggers an FD_CLM_INIT 
event while immnd_cb->clm_hdl is already non-zero.



---



[tickets] [opensaf:tickets] #2549 imm: immnd may assert after coming from headless state or losing both controllers

2017-08-09 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2549] imm: immnd may assert after coming from headless state or 
losing both controllers**

**Status:** accepted
**Milestone:** 5.17.10
**Created:** Wed Aug 09, 2017 10:54 AM UTC by Zoran Milinkovic
**Last Updated:** Wed Aug 09, 2017 10:54 AM UTC
**Owner:** Zoran Milinkovic


Thread 1 (Thread 0x7fda7f771740 (LWP 11485)):
---Type <return> to continue, or q <return> to quit---
#0  0x7fda7df738d7 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x7fda7df74caa in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x7fda7eee404e in __osafassert_fail (__file=, 
__line=, __func=, __assertion=) at 
../../opensaf/src/base/sysf_def.c:286
No locals.
#3  0x564aa570110d in main (argc=, argv=) at 
../../opensaf/src/imm/immnd/immnd_main.c:426
now = {tv_sec = 2116, tv_nsec = 356673778}
passed_time = 
mbx_fd = 
error = 
timeout = 1000
eventCount = 14
maxEvt = 100
start_time = {tv_sec = 2116, tv_nsec = 355976144}
fds = {{fd = 20, events = 1, revents = 0}, {fd = 18, events = 1, 
revents = 0}, {fd = 14, events = 1, revents = 1}, {fd = 16, events = 1, revents 
= 1}, {fd = 22, events = 1, revents = 0}}
term_fd = 20
nfds = 5
__FUNCTION__ = "main"

When a system recovers from the headless state, or loses both controllers with 
SC roaming enabled, the MDS up message for AMF or CLM triggers an FD_CLM_INIT 
event while immnd_cb->clm_hdl is already non-zero.



---



[tickets] [opensaf:tickets] #2544 imm: saClmDispatch is returning ERR_BAD_HANDLE until immnd initializes CLM handle

2017-08-07 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/35985521/



---

** [tickets:#2544] imm: saClmDispatch is returning ERR_BAD_HANDLE until immnd 
initializes CLM handle**

**Status:** review
**Milestone:** 5.17.10
**Created:** Mon Aug 07, 2017 02:08 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Aug 07, 2017 02:08 PM UTC
**Owner:** Zoran Milinkovic


In the poll loop, immnd handles the FD_CLM event even if the FD_CLM entry was 
not processed by the poll call.

With random values in fds[FD_CLM], immnd may process an FD_CLM event that poll 
never reported. saClmDispatch returns SA_AIS_ERR_BAD_HANDLE until the CLM 
handle is initialized and the CLM selection object is set in 
immnd_cb->clmSelectionObject.
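
The problem described above can be illustrated with a minimal sketch. It is not the immnd code: the slot names, the helper functions and the idea of parking unused slots with a negative descriptor are assumptions made for illustration. It relies only on the documented poll() behaviour that an entry with a negative fd is ignored and gets revents set to 0.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>

/* Hypothetical slot layout; the real immnd enum may differ. */
enum { FD_TERM = 0, FD_MBX, FD_CLM, FD_MAX };

/* Clear every slot and park it: poll() skips entries whose fd is
 * negative and reports revents == 0 for them. */
static void fds_reset(struct pollfd fds[FD_MAX])
{
    memset(fds, 0, sizeof(struct pollfd) * FD_MAX);
    for (int i = 0; i < FD_MAX; i++)
        fds[i].fd = -1;
}

/* Arm the CLM slot once a valid selection object is available. */
static void fds_arm_clm(struct pollfd fds[FD_MAX], int clm_sel_obj)
{
    assert(clm_sel_obj >= 0);
    fds[FD_CLM].fd = clm_sel_obj;
    fds[FD_CLM].events = POLLIN;
    fds[FD_CLM].revents = 0;
}

/* Dispatch only if the slot is armed and poll() reported input. */
static int clm_event_pending(const struct pollfd fds[FD_MAX])
{
    return fds[FD_CLM].fd >= 0 && (fds[FD_CLM].revents & POLLIN) != 0;
}
```

With this shape, the dispatch call is reached only for a slot that was handed to poll() with a valid descriptor, so stale stack contents cannot look like a pending CLM event.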


---



[tickets] [opensaf:tickets] #2544 imm: saClmDispatch is returning ERR_BAD_HANDLE until immnd initializes CLM handle

2017-08-07 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2544] imm: saClmDispatch is returning ERR_BAD_HANDLE until immnd 
initializes CLM handle**

**Status:** accepted
**Milestone:** 5.17.10
**Created:** Mon Aug 07, 2017 02:08 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Aug 07, 2017 02:08 PM UTC
**Owner:** Zoran Milinkovic


In the poll loop, immnd handles the FD_CLM event even if the FD_CLM entry was 
not processed by the poll call.

With random values in fds[FD_CLM], immnd may process an FD_CLM event that poll 
never reported. saClmDispatch returns SA_AIS_ERR_BAD_HANDLE until the CLM 
handle is initialized and the CLM selection object is set in 
immnd_cb->clmSelectionObject.


---



[tickets] [opensaf:tickets] #2528 clm: CLM does not handle ERR_BAD_HANDLE from saImmOmSearchInitialize

2017-07-14 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/35945126/



---

** [tickets:#2528] clm: CLM does not handle ERR_BAD_HANDLE from 
saImmOmSearchInitialize**

**Status:** review
**Milestone:** 5.17.08
**Created:** Fri Jul 14, 2017 12:18 PM UTC by Zoran Milinkovic
**Last Updated:** Fri Jul 14, 2017 12:18 PM UTC
**Owner:** Zoran Milinkovic


CLM fails with ERR_BAD_HANDLE in the saImmOmSearchInitialize call.
CLM should reinitialize the OM handle and repeat the search at least once.

Jul 11 21:00:40 SC-1 osafrded[5886]: NO Got peer info response from node 
0x2020f with role ACTIVE
Jul 11 21:00:40 SC-1 osafclmd[5976]: WA OpenSAF imm lib: Message loss detected 
for dest 564115135000812 service id:25
Jul 11 21:00:40 SC-1 osafimmnd[5931]: WA IMMND - Client Node Get Failed for 
client handle: 1357209796879
Jul 11 21:00:40 SC-1 osafclmd[5976]: WA OpenSAF imm lib: Message loss detected 
for dest 564115135000812 service id:25
Jul 11 21:00:40 SC-1 osafclmd[5976]: WA marking handle as exposed
Jul 11 21:00:40 SC-1 osafclmd[5976]: ER No Object of SaClmNode Class was found
Jul 11 21:00:40 SC-1 osafclmd[5976]: ER clms_node_create_config failed rc:9
Jul 11 21:00:40 SC-1 osafclmd[5976]: ER clms_imm_activate FAILED
Jul 11 21:00:40 SC-1 osafclmd[5976]: ER initialize_for_assignment FAILED 9
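
The retry requested in this ticket could look roughly like the sketch below. It does not use the IMM API: the callback types, `search_with_reinit` and the two `demo_` stubs are hypothetical, and only the numeric values 1 (OK) and 9 (BAD_HANDLE, matching `rc:9` in the log) follow the SA Forum error codes.

```c
#include <assert.h>

/* Stand-ins for SA_AIS_OK (1) and SA_AIS_ERR_BAD_HANDLE (9). */
typedef enum { IMM_OK = 1, IMM_ERR_BAD_HANDLE = 9 } imm_rc;

typedef imm_rc (*search_fn)(unsigned long long handle);
typedef imm_rc (*reinit_fn)(unsigned long long *handle);

/* Run the search; on BAD_HANDLE obtain a fresh handle and repeat,
 * at most max_reinit times. Any other result is returned as-is. */
static imm_rc search_with_reinit(unsigned long long *handle,
                                 search_fn search, reinit_fn reinit,
                                 int max_reinit)
{
    assert(handle && search && reinit);
    imm_rc rc = search(*handle);
    while (rc == IMM_ERR_BAD_HANDLE && max_reinit-- > 0) {
        imm_rc init_rc = reinit(handle);
        if (init_rc != IMM_OK)
            return init_rc;
        rc = search(*handle);
    }
    return rc;
}

/* Demo stubs (hypothetical): handle 1 is stale, handle 2 is valid. */
static imm_rc demo_search(unsigned long long handle)
{
    return handle == 1 ? IMM_ERR_BAD_HANDLE : IMM_OK;
}

static imm_rc demo_reinit(unsigned long long *handle)
{
    *handle = 2;
    return IMM_OK;
}
```

Bounding the number of reinitializations keeps a permanently failing IMM service from turning the retry into an endless loop.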


---



[tickets] [opensaf:tickets] #2528 clm: CLM does not handle ERR_BAD_HANDLE from saImmOmSearchInitialize

2017-07-14 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2528] clm: CLM does not handle ERR_BAD_HANDLE from 
saImmOmSearchInitialize**

**Status:** accepted
**Milestone:** 5.17.08
**Created:** Fri Jul 14, 2017 12:18 PM UTC by Zoran Milinkovic
**Last Updated:** Fri Jul 14, 2017 12:18 PM UTC
**Owner:** Zoran Milinkovic


CLM fails with ERR_BAD_HANDLE in the saImmOmSearchInitialize call.
CLM should reinitialize the OM handle and repeat the search at least once.

Jul 11 21:00:40 SC-1 osafrded[5886]: NO Got peer info response from node 
0x2020f with role ACTIVE
Jul 11 21:00:40 SC-1 osafclmd[5976]: WA OpenSAF imm lib: Message loss detected 
for dest 564115135000812 service id:25
Jul 11 21:00:40 SC-1 osafimmnd[5931]: WA IMMND - Client Node Get Failed for 
client handle: 1357209796879
Jul 11 21:00:40 SC-1 osafclmd[5976]: WA OpenSAF imm lib: Message loss detected 
for dest 564115135000812 service id:25
Jul 11 21:00:40 SC-1 osafclmd[5976]: WA marking handle as exposed
Jul 11 21:00:40 SC-1 osafclmd[5976]: ER No Object of SaClmNode Class was found
Jul 11 21:00:40 SC-1 osafclmd[5976]: ER clms_node_create_config failed rc:9
Jul 11 21:00:40 SC-1 osafclmd[5976]: ER clms_imm_activate FAILED
Jul 11 21:00:40 SC-1 osafclmd[5976]: ER initialize_for_assignment FAILED 9


---



[tickets] [opensaf:tickets] #2527 imm: PBE is not regenerated on data inconsistency with sql constraint error

2017-07-12 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/35941109/



---

** [tickets:#2527] imm: PBE is not regenerated on data inconsistency with sql 
constraint error**

**Status:** review
**Milestone:** 5.17.08
**Created:** Wed Jul 12, 2017 11:48 AM UTC by Zoran Milinkovic
**Last Updated:** Wed Jul 12, 2017 11:48 AM UTC
**Owner:** Zoran Milinkovic


When PBE detects that the database is corrupted with an SQL constraint error 
code, PBE should regenerate the database instead of reattaching to the existing 
database.

PBE inconsistency was detected; PBE exited and reattached:
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[3617]: ER SQL 
statement('INSERT INTO objects (obj_id, class_id, dn, last_ccb) VALUES (?, ?, 
?, ?)') failed with error code: 19
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[3617]: ER objectToPBE failed 
in sqlite_prepare_ccb. Handle is closed - exiting
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO Implementer locally 
disconnected. Marking it as doomed 120 <1359, 2d80f> (OpenSafImmPBE)
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO Implementer 
disconnected 120 <1359, 2d80f> (OpenSafImmPBE)
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmnd[1645]: WA Persistent back-end 
process has apparently died.
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO STARTING PBE process.
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO 
pbe-db-file-path:/cluster/storage/clear/coremw/etc/imm.db VETERAN:1 B:0
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[5629]: IN arg[0] == 
'/usr/lib64/opensaf/osafimmpbed'
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[5629]: IN arg[1] == 
'--recover'
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[5629]: IN arg[2] == '--pbe'
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[5629]: IN arg[3] == 
'/cluster/storage/clear/coremw/etc/imm.db'
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[5629]: NO Successfully 
opened pre-existing sqlite pbe file /cluster/storage/clear/coremw/etc/imm.db
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[5629]: IN 
saImmRepositoryInit: SA_IMM_KEEP_REPOSITORY - attaching to repository

 PBE inconsistency detected again
Jul 07 10:55:51 fi15-rc-bgf19-20170621 osafimmpbed[5629]: ER SQL 
statement('INSERT INTO objects (obj_id, class_id, dn, last_ccb) VALUES (?, ?, 
?, ?)') failed with error code: 19
Jul 07 10:55:51 fi15-rc-bgf19-20170621 osafimmpbed[5629]: ER objectToPBE failed 
in sqlite_prepare_ccb. Handle is closed - exiting
Jul 07 10:55:51 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO Implementer locally 
disconnected. Marking it as doomed 122 <5604, 2d80f> (OpenSafImmPBE)
Jul 07 10:55:51 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO Implementer 
disconnected 122 <5604, 2d80f> (OpenSafImmPBE)
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmnd[1645]: WA Persistent back-end 
process has apparently died.
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO STARTING PBE process.
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO 
pbe-db-file-path:/cluster/storage/clear/coremw/etc/imm.db VETERAN:1 B:0
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmpbed[6392]: IN arg[0] == 
'/usr/lib64/opensaf/osafimmpbed'
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmpbed[6392]: IN arg[1] == 
'--recover'
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmpbed[6392]: IN arg[2] == '--pbe'
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmpbed[6392]: IN arg[3] == 
'/cluster/storage/clear/coremw/etc/imm.db'
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmpbed[6392]: NO Successfully 
opened pre-existing sqlite pbe file /cluster/storage/clear/coremw/etc/imm.db
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmpbed[6392]: IN 
saImmRepositoryInit: SA_IMM_KEEP_REPOSITORY - attaching to repository

The same situation was repeated 5 times until the PBE was regenerated.
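
The requested behaviour amounts to choosing the restart mode from the last SQL error. The sketch below is an assumption about how such a decision could be expressed, not the PBE implementation: the enum and function names are hypothetical, and the only fact taken from the log is error code 19, which is SQLite's SQLITE_CONSTRAINT result code.

```c
#include <assert.h>

/* SQLite result code seen in the log above (SQLITE_CONSTRAINT). */
#define SQL_CONSTRAINT_ERROR 19

typedef enum { PBE_REATTACH, PBE_REGENERATE } pbe_restart_mode;

/* Pick how the restarted PBE should treat the existing database file:
 * a constraint violation means the file and IMM disagree, so reusing
 * the file would only reproduce the same failure. */
static pbe_restart_mode pbe_restart_mode_for(int last_sql_error)
{
    assert(last_sql_error >= 0); /* SQLite result codes are non-negative */
    return last_sql_error == SQL_CONSTRAINT_ERROR ? PBE_REGENERATE
                                                  : PBE_REATTACH;
}
```

How the error code would be passed from the exiting PBE to the restarted one is left open here.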


---



[tickets] [opensaf:tickets] #2527 imm: PBE is not regenerated on data inconsistency with sql constraint error

2017-07-12 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2527] imm: PBE is not regenerated on data inconsistency with sql 
constraint error**

**Status:** accepted
**Milestone:** 5.17.08
**Created:** Wed Jul 12, 2017 11:48 AM UTC by Zoran Milinkovic
**Last Updated:** Wed Jul 12, 2017 11:48 AM UTC
**Owner:** Zoran Milinkovic


When PBE detects that the database is corrupted with an SQL constraint error 
code, PBE should regenerate the database instead of reattaching to the existing 
database.

PBE inconsistency was detected; PBE exited and reattached:
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[3617]: ER SQL 
statement('INSERT INTO objects (obj_id, class_id, dn, last_ccb) VALUES (?, ?, 
?, ?)') failed with error code: 19
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[3617]: ER objectToPBE failed 
in sqlite_prepare_ccb. Handle is closed - exiting
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO Implementer locally 
disconnected. Marking it as doomed 120 <1359, 2d80f> (OpenSafImmPBE)
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO Implementer 
disconnected 120 <1359, 2d80f> (OpenSafImmPBE)
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmnd[1645]: WA Persistent back-end 
process has apparently died.
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO STARTING PBE process.
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO 
pbe-db-file-path:/cluster/storage/clear/coremw/etc/imm.db VETERAN:1 B:0
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[5629]: IN arg[0] == 
'/usr/lib64/opensaf/osafimmpbed'
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[5629]: IN arg[1] == 
'--recover'
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[5629]: IN arg[2] == '--pbe'
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[5629]: IN arg[3] == 
'/cluster/storage/clear/coremw/etc/imm.db'
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[5629]: NO Successfully 
opened pre-existing sqlite pbe file /cluster/storage/clear/coremw/etc/imm.db
Jul 07 10:55:49 fi15-rc-bgf19-20170621 osafimmpbed[5629]: IN 
saImmRepositoryInit: SA_IMM_KEEP_REPOSITORY - attaching to repository

 PBE inconsistency detected again
Jul 07 10:55:51 fi15-rc-bgf19-20170621 osafimmpbed[5629]: ER SQL 
statement('INSERT INTO objects (obj_id, class_id, dn, last_ccb) VALUES (?, ?, 
?, ?)') failed with error code: 19
Jul 07 10:55:51 fi15-rc-bgf19-20170621 osafimmpbed[5629]: ER objectToPBE failed 
in sqlite_prepare_ccb. Handle is closed - exiting
Jul 07 10:55:51 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO Implementer locally 
disconnected. Marking it as doomed 122 <5604, 2d80f> (OpenSafImmPBE)
Jul 07 10:55:51 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO Implementer 
disconnected 122 <5604, 2d80f> (OpenSafImmPBE)
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmnd[1645]: WA Persistent back-end 
process has apparently died.
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO STARTING PBE process.
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmnd[1645]: NO 
pbe-db-file-path:/cluster/storage/clear/coremw/etc/imm.db VETERAN:1 B:0
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmpbed[6392]: IN arg[0] == 
'/usr/lib64/opensaf/osafimmpbed'
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmpbed[6392]: IN arg[1] == 
'--recover'
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmpbed[6392]: IN arg[2] == '--pbe'
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmpbed[6392]: IN arg[3] == 
'/cluster/storage/clear/coremw/etc/imm.db'
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmpbed[6392]: NO Successfully 
opened pre-existing sqlite pbe file /cluster/storage/clear/coremw/etc/imm.db
Jul 07 10:55:52 fi15-rc-bgf19-20170621 osafimmpbed[6392]: IN 
saImmRepositoryInit: SA_IMM_KEEP_REPOSITORY - attaching to repository

The same situation was repeated 5 times until the PBE was regenerated.


---



[tickets] [opensaf:tickets] #2520 clm: make CLM tests more independent from other CLM tests

2017-07-11 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

develop(5.17.10): 

commit 3be8e9adb4670607a93907a886b0cf301570d65a
Author: Zoran Milinkovic 
Date:   Tue Jul 11 11:19:31 2017 +0200

clm: make CLM tests independent of other CLM tests [#2520]

The patch removes dependencies between CLM tests. CLM tests can be run more 
times now.
Duplicated CLM tests are removed from clmtest.

-

release(5.17.8):

commit 89ba594771730a88b6584af19e6ce629bbd8fdbb
Author: Zoran Milinkovic 
Date:   Tue Jul 11 11:19:31 2017 +0200

clm: make CLM tests independent of other CLM tests [#2520]

The patch removes dependencies between CLM tests. CLM tests can be run more 
times now.
Duplicated CLM tests are removed from clmtest.



---

** [tickets:#2520] clm: make CLM tests more independent from other CLM tests**

**Status:** fixed
**Milestone:** 5.17.08
**Created:** Tue Jul 04, 2017 01:50 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Jul 05, 2017 02:56 PM UTC
**Owner:** Zoran Milinkovic


Today's CLM tests depend on other tests and rely on them to revert the CLM 
state to the starting state.
If CLM tests are run manually, they must be run in a specific order. Also, some 
tests cannot be run twice.

For example test 7 21:
~~~
$ clmtest 7 21

Suite 7: Test case for saClmClusterTrack. ** For all tests to pass, Run a 
payload with node_name PL-3 **

waiting on poll
Inside TrackCallback4
invocation : 0
Step : 4
error = 1
numberOfMembers = 2
No of items = 1

Value of i = 0
Cluster Change = 3
Node Name length = 36, value = safNode=PL-3,safCluster=myClmCluster
Node Member = 0
Node  view number  = 5
Node  eename length = 0,value  = 
Node  boottimestamp  = 1499175648545988548
Node  nodeAddress family  = 1,node address length = 0, node address value = 
Node  nodeid  = 131855

   21  PASSED   saClmClusterTrack_4 with SA_TRACK_CHANGES_ONLY track flags - 
admin lock

=

   Test Result:
  Total:  1
  Passed: 1
  Failed: 0
~~~

And if we repeat the test, it fails:

~~~
$ clmtest 7 21

Suite 7: Test case for saClmClusterTrack. ** For all tests to pass, Run a 
payload with node_name PL-3 **

error - saImmOmAdminOperationInvoke_2 admin-op RETURNED: SA_AIS_ERR_NO_OP (28)
clmtest: src/clm/apitest/tet_saClmClusterTrack.c:604: saClmClusterTrack_21: 
Assertion `ret == 1' failed.
waiting on pollAborted
~~~

Or if we call another test after executing test 7 21:

~~~
$ clmtest 7 23

Suite 7: Test case for saClmClusterTrack. ** For all tests to pass, Run a 
payload with node_name PL-3 **

error - saImmOmAdminOperationInvoke_2 admin-op RETURNED: SA_AIS_ERR_NO_OP (28)
clmtest: src/clm/apitest/tet_saClmClusterTrack.c:672: saClmClusterTrack_23: 
Assertion `ret == 1' failed.
Aborted
~~~
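
The failures above follow a common pattern: a test changes shared state and leaves it changed. The sketch below models that pattern with a toy node state instead of the real CLM admin operations. All names are hypothetical; only the value 28 for NO_OP is taken from the output above.

```c
#include <assert.h>

/* Stand-ins for SA_AIS_OK (1) and SA_AIS_ERR_NO_OP (28). */
enum { AIS_OK = 1, AIS_ERR_NO_OP = 28 };

typedef enum { NODE_UNLOCKED, NODE_LOCKED } admin_state;
static admin_state node_state = NODE_UNLOCKED;

/* Toy admin operations: repeating an operation yields NO_OP. */
static int admin_lock(void)
{
    if (node_state == NODE_LOCKED)
        return AIS_ERR_NO_OP;
    node_state = NODE_LOCKED;
    return AIS_OK;
}

static int admin_unlock(void)
{
    if (node_state == NODE_UNLOCKED)
        return AIS_ERR_NO_OP;
    node_state = NODE_UNLOCKED;
    return AIS_OK;
}

/* Dependent test: passes once, then fails because the node stays locked. */
static int test_lock_dependent(void)
{
    return admin_lock() == AIS_OK;
}

/* Independent test: restores the state it found before returning. */
static int test_lock_independent(void)
{
    admin_state saved = node_state;
    int passed = admin_lock() == AIS_OK;
    if (node_state != saved) {
        int rc = admin_unlock();
        assert(rc == AIS_OK);
        (void)rc;
    }
    return passed;
}
```

Calling `test_lock_independent()` repeatedly keeps passing, while the second call to `test_lock_dependent()` fails in the same way as the repeated `clmtest 7 21` run.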



---



[tickets] [opensaf:tickets] #2524 pyosaf: decorate function does not handle version struct in initialize functions

2017-07-06 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2524] pyosaf: decorate function does not handle version struct in 
initialize functions**

**Status:** unassigned
**Milestone:** 5.17.08
**Created:** Thu Jul 06, 2017 01:44 PM UTC by Zoran Milinkovic
**Last Updated:** Thu Jul 06, 2017 01:44 PM UTC
**Owner:** nobody


The decorate() function does not handle the version struct in initialize calls.
If an initialize function returns ERR_TRY_AGAIN, the version struct is populated 
with the latest version. The next initialize call then initializes the handle 
with the latest library version instead of the required version.
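
The underlying issue is an in/out parameter that must be restored before each retry. pyosaf itself is Python; the sketch below shows the same idea in C with hypothetical types and a stub initialize function, so none of the names are pyosaf or AIS API. The values 1 (OK) and 6 (TRY_AGAIN) follow the SA Forum error codes.

```c
#include <assert.h>

/* Stand-ins for SA_AIS_OK (1) and SA_AIS_ERR_TRY_AGAIN (6). */
typedef enum { RC_OK = 1, RC_TRY_AGAIN = 6 } init_rc;

/* Hypothetical version struct, used as an in/out parameter. */
typedef struct { char release; int major; int minor; } version_t;

typedef init_rc (*init_fn)(unsigned long long *handle, version_t *version);

/* Retry on TRY_AGAIN, restoring the requested version before each
 * attempt so a value written back by a failed call is never reused. */
static init_rc initialize_with_retry(init_fn init, unsigned long long *handle,
                                     const version_t *wanted, version_t *used,
                                     int max_tries)
{
    assert(init && handle && wanted && used);
    init_rc rc = RC_TRY_AGAIN;
    for (int i = 0; i < max_tries && rc == RC_TRY_AGAIN; i++) {
        *used = *wanted;
        rc = init(handle, used);
    }
    return rc;
}

/* Demo stub: the first call fails and overwrites the version with the
 * library's latest one; the second call succeeds with what it is given. */
static int demo_calls;
static init_rc demo_init(unsigned long long *handle, version_t *version)
{
    if (demo_calls++ == 0) {
        version->release = 'A';
        version->major = 2;
        version->minor = 18;
        return RC_TRY_AGAIN;
    }
    *handle = 42;
    return RC_OK;
}
```

A real retry loop would normally also wait between attempts; that is omitted here to keep the sketch focused on the version handling.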


---



[tickets] [opensaf:tickets] #2523 pyosaf: IMM OM module initialized with ERR_BAD_HANDLE

2017-07-06 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2523] pyosaf: IMM OM module initialized with ERR_BAD_HANDLE**

**Status:** unassigned
**Milestone:** 5.17.08
**Created:** Thu Jul 06, 2017 01:40 PM UTC by Zoran Milinkovic
**Last Updated:** Thu Jul 06, 2017 01:40 PM UTC
**Owner:** nobody


pyosaf does not handle initialization of the IMM OM module well.
Regardless of the error code returned from saImmOmInitialize, the _initialize() 
function calls saImmOmAccessorInitialize().


---



[tickets] [opensaf:tickets] #2520 clm: make CLM tests more independent from other CLM tests

2017-07-05 Thread Zoran Milinkovic via Opensaf-tickets
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/35929376/



---

** [tickets:#2520] clm: make CLM tests more independent from other CLM tests**

**Status:** review
**Milestone:** 5.17.08
**Created:** Tue Jul 04, 2017 01:50 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Jul 05, 2017 02:53 PM UTC
**Owner:** Zoran Milinkovic


Today's CLM tests depend on other tests and rely on them to revert the CLM 
state to the starting state.
If CLM tests are run manually, they must be run in a specific order. Also, some 
tests cannot be run twice.

For example test 7 21:
~~~
$ clmtest 7 21

Suite 7: Test case for saClmClusterTrack. ** For all tests to pass, Run a 
payload with node_name PL-3 **

waiting on poll
Inside TrackCallback4
invocation : 0
Step : 4
error = 1
numberOfMembers = 2
No of items = 1

Value of i = 0
Cluster Change = 3
Node Name length = 36, value = safNode=PL-3,safCluster=myClmCluster
Node Member = 0
Node  view number  = 5
Node  eename length = 0,value  = 
Node  boottimestamp  = 1499175648545988548
Node  nodeAddress family  = 1,node address length = 0, node address value = 
Node  nodeid  = 131855

   21  PASSED   saClmClusterTrack_4 with SA_TRACK_CHANGES_ONLY track flags - admin lock

=

   Test Result:
  Total:  1
  Passed: 1
  Failed: 0
~~~

And if we repeat the test, it fails:

~~~
$ clmtest 7 21

Suite 7: Test case for saClmClusterTrack. ** For all tests to pass, Run a payload with node_name PL-3 **

error - saImmOmAdminOperationInvoke_2 admin-op RETURNED: SA_AIS_ERR_NO_OP (28)
clmtest: src/clm/apitest/tet_saClmClusterTrack.c:604: saClmClusterTrack_21: Assertion `ret == 1' failed.
waiting on pollAborted
~~~

Or if we call another test after executing test 7 21:

~~~
$ clmtest 7 23

Suite 7: Test case for saClmClusterTrack. ** For all tests to pass, Run a payload with node_name PL-3 **

error - saImmOmAdminOperationInvoke_2 admin-op RETURNED: SA_AIS_ERR_NO_OP (28)
clmtest: src/clm/apitest/tet_saClmClusterTrack.c:672: saClmClusterTrack_23: Assertion `ret == 1' failed.
Aborted
~~~
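
One way to remove this ordering dependency is for each test to restore the state it changes. The sketch below is a toy model in Python, not the clmtest code (which is written in C): `FakeCluster`, `admin_lock` and `admin_unlock` are hypothetical stand-ins. It assumes the repeated test fails because the node is still locked from the first run, which the NO_OP result suggests but the ticket does not state.

```python
class FakeCluster:
    """Hypothetical stand-in for the CLM admin state of the cluster nodes."""

    def __init__(self):
        self.locked = set()

    def admin_lock(self, node):
        # Locking an already locked node changes nothing, so report NO_OP
        # (assumed to mirror the SA_AIS_ERR_NO_OP seen when repeating the test).
        if node in self.locked:
            return "SA_AIS_ERR_NO_OP"
        self.locked.add(node)
        return "SA_AIS_OK"

    def admin_unlock(self, node):
        if node not in self.locked:
            return "SA_AIS_ERR_NO_OP"
        self.locked.remove(node)
        return "SA_AIS_OK"


def dependent_test(cluster, node="PL-3"):
    """Leaves the node locked, so it passes only on the first run."""
    return cluster.admin_lock(node) == "SA_AIS_OK"


def independent_test(cluster, node="PL-3"):
    """Restores the starting state, so it can be repeated in any order."""
    try:
        return cluster.admin_lock(node) == "SA_AIS_OK"
    finally:
        cluster.admin_unlock(node)
```

With this shape, a test no longer needs a later test to unlock the node for it.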



---



[tickets] [opensaf:tickets] #2520 clm: make CLM tests more independent from other CLM tests

2017-07-05 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review



---

** [tickets:#2520] clm: make CLM tests more independent from other CLM tests**

**Status:** review
**Milestone:** 5.17.08
**Created:** Tue Jul 04, 2017 01:50 PM UTC by Zoran Milinkovic
**Last Updated:** Tue Jul 04, 2017 01:50 PM UTC
**Owner:** Zoran Milinkovic





---



[tickets] [opensaf:tickets] #2520 clm: make CLM tests more independent from other CLM tests

2017-07-04 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2520] clm: make CLM tests more independent from other CLM tests**

**Status:** accepted
**Milestone:** 5.17.08
**Created:** Tue Jul 04, 2017 01:50 PM UTC by Zoran Milinkovic
**Last Updated:** Tue Jul 04, 2017 01:50 PM UTC
**Owner:** Zoran Milinkovic





---



[tickets] [opensaf:tickets] #2504 imm: dispatch functions don't send imm finalize message when return ERR_BAD_HANDLE

2017-07-04 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> fixed
- **Comment**:

develop(5.17.10):

commit bb2134e532d22d72ffea151cf9ea4bd67cecfc63
Author: Zoran Milinkovic 
Date:   Tue Jul 4 15:01:15 2017 +0200

imm: send imm finalize message to immnd when dispatch returns 
ERR_BAD_HANDLE [#2504]

Send IMM_FINALIZE message to immnd when dispatch functions return 
ERR_BAD_HANDLE.
IMM_FINALIZE will release all allocated resources of a handle on the IMM 
service side.

-

release(5.17.8):

commit 3e2eeb4ae793beb9f46f49925437537bb9606603
Author: Zoran Milinkovic 
Date:   Tue Jul 4 15:01:15 2017 +0200

imm: send imm finalize message to immnd when dispatch returns 
ERR_BAD_HANDLE [#2504]

Send IMM_FINALIZE message to immnd when dispatch functions return 
ERR_BAD_HANDLE.
IMM_FINALIZE will release all allocated resources of a handle on the IMM 
service side.



---

** [tickets:#2504] imm: dispatch functions don't send imm finalize message when 
return ERR_BAD_HANDLE**

**Status:** fixed
**Milestone:** 5.17.08
**Created:** Tue Jun 20, 2017 08:48 AM UTC by Zoran Milinkovic
**Last Updated:** Sat Jul 01, 2017 04:17 PM UTC
**Owner:** Zoran Milinkovic


When a dispatch function returns ERR_BAD_HANDLE, the IMM finalize message is 
not sent to the IMM service.
This works when there is only one IMM handle (IMMA closes the connection with 
the IMM service). If more handles are in use, resources in the IMM service may 
never be released.
If oiFinalize is called after ERR_BAD_HANDLE is received, resources are 
released on the client side, but not on the IMM service side.

For example, if an implementer is attached to an OI handle and oiDispatch 
returns ERR_BAD_HANDLE, the implementer stays attached but is not accessible, 
even if oiFinalize is called on the OI handle. A new implementer with the same 
implementer name then cannot be set.
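
The leak described above can be illustrated with a toy model. This is a Python sketch under stated assumptions, not IMMA or IMMND code: `ToyImmService`, `set_implementer`, `finalize` and `on_bad_handle` are hypothetical names, and the service is reduced to a map from implementer name to handle.

```python
class ToyImmService:
    """Hypothetical service-side bookkeeping: implementer name -> handle."""

    def __init__(self):
        self.implementers = {}

    def set_implementer(self, handle, name):
        if name in self.implementers:
            return False  # name is still held, possibly by a dead handle
        self.implementers[name] = handle
        return True

    def finalize(self, handle):
        # Release everything the service holds for this handle.
        self.implementers = {
            name: owner
            for name, owner in self.implementers.items()
            if owner != handle
        }


def on_bad_handle(service, handle, send_finalize):
    """Client-side reaction to ERR_BAD_HANDLE returned by dispatch."""
    if send_finalize:
        # Assumed behaviour with the finalize message: service side cleans up.
        service.finalize(handle)
    # Client-side resources are dropped in both cases (not modelled here).
```

Without the finalize message the name stays occupied on the service side, so a second `set_implementer` call with the same name fails even though the client has already given up the handle.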


---



[tickets] [opensaf:tickets] #2513 rde: allow to early change peer role only when active or standby nodes exist

2017-06-28 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

release(5.17.06):

commit aba7bc5460136cca39b6b067d934973b3099fe0f
Author: Zoran Milinkovic 
Date:   Wed Jun 28 13:25:29 2017 +0200

rde: allow early role change when active or standby nodes are introduced 
[#2513]

When active or standby nodes are introduced with a request message, there is 
no need to wait any longer before requesting the active role.
When a standby node is introduced, we are sure that there is an active 
node somewhere in the cluster. So, changing the peer state is safe.

-

develop(5.17.08):

commit f089f030a322a43c79f3f259f07a4c42bb4d0da1
Author: Zoran Milinkovic 
Date:   Wed Jun 28 13:25:29 2017 +0200

rde: allow early role change when active or standby nodes are introduced 
[#2513]

When active or standby nodes are introduced with a request message, there is 
no need to wait any longer before requesting the active role.
When a standby node is introduced, we are sure that there is an active 
node somewhere in the cluster. So, changing the peer state is safe.
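
The condition in this commit message can be written as a small predicate. This is an illustrative Python sketch, not the RDE code: the `Role` enum and `may_change_peer_state_early` are hypothetical, and only the three roles needed for the example are modelled.

```python
from enum import Enum


class Role(Enum):
    # Hypothetical subset of roles, enough for the example.
    UNDEFINED = "undefined"
    ACTIVE = "active"
    STANDBY = "standby"


def may_change_peer_state_early(peer_role):
    """Allow the early change only when the peer proves an active node exists.

    An ACTIVE peer is the active node itself; a STANDBY peer implies that
    an active node exists somewhere in the cluster.
    """
    return peer_role in (Role.ACTIVE, Role.STANDBY)
```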



---

** [tickets:#2513] rde: allow to early change peer role only when active or 
standby nodes exist**

**Status:** fixed
**Milestone:** 5.17.06
**Created:** Wed Jun 28, 2017 09:14 AM UTC by Zoran Milinkovic
**Last Updated:** Wed Jun 28, 2017 01:01 PM UTC
**Owner:** Zoran Milinkovic


The patch pushed for ticket #2423 triggers some side effects, such as multiple 
active nodes and lost MDS logs.
The first proposed patch, which allows a node role to be changed only when it 
is known that an active or standby node exists, seems to work well.


---



[tickets] [opensaf:tickets] #2513 rde: allow to early change peer role only when active or standby nodes exist

2017-06-28 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/35916824/



---

** [tickets:#2513] rde: allow to early change peer role only when active or 
standby nodes exist**

**Status:** review
**Milestone:** 5.17.06
**Created:** Wed Jun 28, 2017 09:14 AM UTC by Zoran Milinkovic
**Last Updated:** Wed Jun 28, 2017 09:14 AM UTC
**Owner:** Zoran Milinkovic




---



[tickets] [opensaf:tickets] #2513 rde: allow to early change peer role only when active or standby nodes exist

2017-06-28 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2513] rde: allow to early change peer role only when active or 
standby nodes exist**

**Status:** accepted
**Milestone:** 5.17.06
**Created:** Wed Jun 28, 2017 09:14 AM UTC by Zoran Milinkovic
**Last Updated:** Wed Jun 28, 2017 09:14 AM UTC
**Owner:** Zoran Milinkovic




---



[tickets] [opensaf:tickets] #2423 rde: RDE sets the active role even if there is a node with the active role in a cluster

2017-06-21 Thread Zoran Milinkovic via Opensaf-tickets
5.17.06:

commit 1ff0760db847faa0d4981fae57ce4d59a69e016c
Author: Zoran Milinkovic 
Date:   Thu May 18 13:20:27 2017 +0200

rde: save peer role on peer info request message [#2423]

When a peer info message is received, the peer info is saved.
This prevents setting the active role before the response info is received.


---

** [tickets:#2423] rde: RDE sets the active role even if there is a node with 
the active role in a cluster**

**Status:** fixed
**Milestone:** 5.17.06
**Created:** Tue Apr 11, 2017 11:14 AM UTC by Zoran Milinkovic
**Last Updated:** Thu May 18, 2017 11:31 AM UTC
**Owner:** Zoran Milinkovic


When an active node is detected late, the new node may acquire the active role 
because of the time gap between the request and response messages. RDE does 
not remember the role of nodes that sent a request, which can lead to a second 
node being elected with the active role.

2016-12-22 17:55:08 SC-2 osafrded[421]: NO Got peer info request from node 0x2050f with role ACTIVE
2016-12-22 17:55:08 SC-2 osafrded[421]: NO Running '/usr/local/lib/opensaf/opensaf_sc_active' with 0 argument(s)
2016-12-22 17:55:08 SC-2 opensaf_sc_active: 5c20d9c8-c867-11e6-a222-5254001c9220 expected on SC-1
2016-12-22 17:55:08 SC-2 osafrded[421]: NO Switched to ACTIVE from Undefined
2016-12-22 17:55:08 SC-2 osafrded[421]: NO Got peer info response from node 0x2050f with role ACTIVE
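
The sequence in the log can be replayed with a toy model. This is a Python sketch, not the RDE implementation: `ToyRde` and its methods are hypothetical, roles are plain strings, and the election is reduced to a single "is an active peer known?" flag.

```python
class ToyRde:
    """Hypothetical model of one node deciding whether to become active."""

    def __init__(self, remember_requests):
        # remember_requests=False models the behaviour reported in the ticket.
        self.remember_requests = remember_requests
        self.known_active_peer = False
        self.role = "UNDEFINED"

    def on_peer_info_request(self, peer_role):
        # Saving the role carried by the request closes the time gap
        # between the request and the later response.
        if self.remember_requests and peer_role == "ACTIVE":
            self.known_active_peer = True

    def on_peer_info_response(self, peer_role):
        if peer_role == "ACTIVE":
            self.known_active_peer = True

    def try_become_active(self):
        if self.role == "UNDEFINED" and not self.known_active_peer:
            self.role = "ACTIVE"
        return self.role
```

When the request from the active peer arrives before the election and the role is not remembered, the node switches to ACTIVE, and the response with role ACTIVE arrives too late, as in the log.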



---



[tickets] [opensaf:tickets] #2465 imm: change log level for "ER Problem in sending to IMMD over MDS"

2017-06-20 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

Added the same log level fix for "IMMND - AdminOwner Initialize Failed" when 
the patch was pushed.

-----

Author: Zoran Milinkovic 
Date:   Tue Jun 20 13:10:40 2017 +0200

imm: change log level from error to warning when ERR_TRY_AGAIN is returned 
[#2465]

Messages "Problem in sending to IMMD over MDS" and "IMMND - AdminOwner 
Initialize Failed" changed log level from error to warning.
In all cases ERR_TRY_AGAIN is returned, and error log level is not correct.



---

** [tickets:#2465] imm: change log level for "ER Problem in sending to IMMD 
over MDS"**

**Status:** fixed
**Milestone:** 5.17.06
**Created:** Fri May 19, 2017 12:22 PM UTC by Zoran Milinkovic
**Last Updated:** Fri May 19, 2017 12:49 PM UTC
**Owner:** Zoran Milinkovic


Change the log level from error to warning for the message "Problem in sending 
to IMMD over MDS".
In all cases when this message is logged, IMMND returns ERR_TRY_AGAIN, so the 
warning log level is more appropriate than error.
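
The reasoning here, that a retryable result should not be logged as an error, can be sketched as follows. This is an illustrative Python sketch, not IMMND code: `level_for_result` is a hypothetical helper, results are plain strings, and mapping every other result to the error level is an assumption made for the example.

```python
import logging


def level_for_result(result):
    """Pick a log level for a failed send based on what the caller gets back."""
    if result == "SA_AIS_ERR_TRY_AGAIN":
        # The caller is expected to retry, so this is not a hard error.
        return logging.WARNING
    # Assumption for the example: anything else is treated as an error.
    return logging.ERROR
```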


---



[tickets] [opensaf:tickets] #2504 imm: dispatch functions don't send imm finalize message when return ERR_BAD_HANDLE

2017-06-20 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2504] imm: dispatch functions don't send imm finalize message when 
return ERR_BAD_HANDLE**

**Status:** accepted
**Milestone:** 5.17.06
**Created:** Tue Jun 20, 2017 08:48 AM UTC by Zoran Milinkovic
**Last Updated:** Tue Jun 20, 2017 08:48 AM UTC
**Owner:** Zoran Milinkovic




---



[tickets] [opensaf:tickets] #2495 imm: saImmOmCcbApply times out due to miscalculation for old critical CCBs

2017-06-15 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

commit b806280ba7990a39382051ae961029ec7deec637
Author: Zoran Milinkovic 
Date:   Wed Jun 14 11:27:40 2017 +0200

imm: fix counting timeouts for old critical CCBs [#2495]

The patch fixes the counting of timeouts for old critical CCBs.



---

** [tickets:#2495] imm: saImmOmCcbApply times out due to miscalculation for old 
critical CCBs**

**Status:** fixed
**Milestone:** 5.17.06
**Created:** Tue Jun 13, 2017 02:57 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Jun 14, 2017 11:07 AM UTC
**Owner:** Zoran Milinkovic


When IMM fetches old critical CCBs, it does not add timed-out CCBs to the 
vector because the expiry of CCBs is miscalculated.
Instead of adding expired CCBs to the vector, IMM keeps computing a negative 
remaining timeout.

2017-05-25 17:58:27 SC-1 osafimmnd[205]: WA Timeout (6) on transaction in critical state! ccb:2
2017-05-25 17:58:27 SC-1 osafimmnd[205]: NO Ccb 2 is old, but also large (1) will wait secs:-0.382578
2017-05-25 17:58:28 SC-1 osafimmnd[205]: WA Timeout (6) on transaction in critical state! ccb:2
2017-05-25 17:58:28 SC-1 osafimmnd[205]: NO Ccb 2 is old, but also large (1) will wait secs:-1.387785
2017-05-25 17:58:29 SC-1 osafimmnd[205]: WA Timeout (6) on transaction in critical state! ccb:2
2017-05-25 17:58:29 SC-1 osafimmnd[205]: NO Ccb 2 is old, but also large (1) will wait secs:-2.392967
2017-05-25 17:58:30 SC-1 osafimmnd[205]: WA Timeout (6) on transaction in critical state! ccb:2
2017-05-25 17:58:30 SC-1 osafimmnd[205]: NO Ccb 2 is old, but also large (1) will wait secs:-3.398186
2017-05-25 17:58:31 SC-1 osafimmnd[205]: WA Timeout (6) on transaction in critical state! ccb:2
2017-05-25 17:58:31 SC-1 osafimmnd[205]: NO Ccb 2 is old, but also large (1) will wait secs:-4.403361
2017-05-25 17:58:32 SC-1 osafimmnd[205]: WA Timeout (6) on transaction in critical state! ccb:2
2017-05-25 17:58:32 SC-1 osafimmnd[205]: NO Ccb 2 is old, but also large (1) will wait secs:-5.408588
.

The bug was introduced in OpenSAF 5.1 with ticket #1704
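
The intended behaviour, where a CCB whose remaining time is zero or negative goes into the vector instead of being waited on, can be sketched like this. It is a simplified Python sketch, not the IMMND C++ code: `split_old_critical_ccbs` and the `(ccb_id, deadline)` pairs are hypothetical, and the extra handling of large CCBs seen in the log is not modelled.

```python
def split_old_critical_ccbs(ccbs, now):
    """Split CCBs into expired ones and ones that may still wait.

    ccbs: iterable of (ccb_id, deadline) pairs, times in seconds.
    Returns (expired_ids, waiting) where waiting maps id -> seconds left.
    """
    expired = []
    waiting = {}
    for ccb_id, deadline in ccbs:
        remaining = deadline - now
        if remaining <= 0:
            # Expired: report it instead of "waiting" a negative time.
            expired.append(ccb_id)
        else:
            waiting[ccb_id] = remaining
    return expired, waiting
```

In terms of the log above, ccb 2 would land in the expired list as soon as its remaining time dropped to zero, rather than being reported with an ever more negative wait.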


---



[tickets] [opensaf:tickets] #2481 imm: node crashes due to missing discard node info during the sync

2017-06-14 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

commit b5864c91a82fb27f02b09eaa2ffc3db456f1e67a
Author: Zoran Milinkovic 
Date:   Wed Jun 14 16:25:50 2017 +0200

imm: remove vector clearing for dead implementers, nodes and admin owners 
in objectSync [#2481]

After removing the clearing of dead implementer, node and admin owner 
vectors, re-executing on vectors will be done after the node is fully synced.



---

** [tickets:#2481] imm: node crashes due to missing discard node info during 
the sync**

**Status:** fixed
**Milestone:** 5.17.06
**Created:** Fri Jun 02, 2017 02:39 PM UTC by Zoran Milinkovic
**Last Updated:** Fri Jun 02, 2017 03:24 PM UTC
**Owner:** Zoran Milinkovic


When a node receives the discard node message during the sync, the discard is 
not re-executed after the node is synced.

Jun  1 09:09:46 PL-16 osafimmnd[20197]: NO NODE STATE-> IMM_NODE_W_AVAILABLE
Jun  1 09:09:46 PL-16 osafimmnd[20197]: NO SERVER STATE: IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
Jun  1 09:09:48 PL-16 osafimmnd[20197]: NO Global discard node received for nodeId:2090f pid:20533
Jun  1 09:09:48 PL-16 osafimmnd[20197]: NO NODE STATE-> IMM_NODE_FULLY_AVAILABLE 2715
Jun  1 09:09:48 PL-16 osafimmnd[20197]: NO RepositoryInitModeT is SA_IMM_INIT_FROM_FILE
Jun  1 09:09:48 PL-16 osafimmnd[20197]: WA IMM Access Control mode is DISABLED!
Jun  1 09:09:48 PL-16 osafimmnd[20197]: NO Epoch set to 58 in ImmModel

This results in many messages like:
Jun  1 09:09:55 PL-16 osafimmnd[20197]: NO Sync-verify: Veteran node has different Implementer-id 41 for implementer: @safPmService1333912, should be 0 according to finalizeSync. Assunimg implSet bypased finSync
Jun  1 09:10:07 PL-16 osafimmnd[20197]: NO Sync-verify: Veteran node has different Implementer-id 41 for implementer: @safPmService1333912, should be 0 according to finalizeSync. Assunimg implSet bypased finSync

In the end, when the IMM data inconsistency is detected, IMM aborts:
Jun 1 09:18:19 PL-16 osafimmnd[20197]: ER Sync-verify: Established node has different Implementer-id: 41 for name: @safPmService1333912, sync says 578.
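
The idea of remembering a discard that arrives during the sync and re-executing it once the node is synced can be sketched as below. This is a toy Python model, not ImmModel code: `ToySyncClient` and its methods are hypothetical, and the assumption that implementer id 41 belongs to the discarded node 2090f is made only for the example.

```python
class ToySyncClient:
    """Hypothetical model of a node that is being synced."""

    def __init__(self):
        self.synced = False
        self.pending_discards = []
        self.implementers = {}  # name -> (implementer_id, node_id)

    def on_discard_node(self, node_id):
        if not self.synced:
            # Keep the discard for later instead of losing it.
            self.pending_discards.append(node_id)
            return
        self._discard(node_id)

    def on_sync_finished(self, implementers):
        self.implementers = dict(implementers)
        self.synced = True
        # Re-execute every discard that arrived during the sync.
        for node_id in self.pending_discards:
            self._discard(node_id)
        self.pending_discards.clear()

    def _discard(self, node_id):
        for name, (_impl_id, owner) in self.implementers.items():
            if owner == node_id:
                # Implementer id 0 marks the implementer as no longer attached.
                self.implementers[name] = (0, None)
```

If the pending list were cleared before `on_sync_finished` replayed it, the synced node would keep the stale implementer id while the veteran nodes report 0, which is the mismatch shown in the Sync-verify messages.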



---



[tickets] [opensaf:tickets] #2495 imm: saImmOmCcbApply times out due to miscalculation for old critical CCBs

2017-06-14 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/35893768/



---

** [tickets:#2495] imm: saImmOmCcbApply times out due to miscalculation for old 
critical CCBs**

**Status:** review
**Milestone:** 5.17.06
**Created:** Tue Jun 13, 2017 02:57 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Jun 14, 2017 03:17 AM UTC
**Owner:** Zoran Milinkovic




---



[tickets] [opensaf:tickets] #2495 imm: saImmOmCcbApply times out due to miscalculation for old critical CCBs

2017-06-13 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2495] imm: saImmOmCcbApply times out due to miscalculation for old 
critical CCBs**

**Status:** accepted
**Milestone:** 5.17.06
**Created:** Tue Jun 13, 2017 02:57 PM UTC by Zoran Milinkovic
**Last Updated:** Tue Jun 13, 2017 02:57 PM UTC
**Owner:** Zoran Milinkovic




---



[tickets] [opensaf:tickets] #2491 imm: PBE regenerates imm.db if immnd exits during the PBE state verification

2017-06-09 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2491] imm: PBE regenerates imm.db if immnd exits during the PBE 
state verification**

**Status:** accepted
**Milestone:** 5.17.06
**Created:** Fri Jun 09, 2017 09:18 AM UTC by Zoran Milinkovic
**Last Updated:** Fri Jun 09, 2017 09:18 AM UTC
**Owner:** Zoran Milinkovic


If IMMND exits during the verification of the PBE state, PBE regenerates a new 
database from the XML file.

PBE must distinguish between IMM issues (IMMND exits, network problems, etc.) 
and database corruption.
If it is an IMM issue, PBE should restart.
If it is a database corruption issue, the database should be regenerated.


2017-06-05T17:00:24.31 cm1 local0.notice osafimmnd[984]: NO This IMMND is now 
the NEW Coord
2017-06-05T17:00:24.31 cm1 user.info osafimmpbed: IN arg[0] == 
'/usr/lib64/opensaf/osafimmpbed'
2017-06-05T17:00:24.31 cm1 user.info osafimmpbed: IN arg[1] == '--recover'
2017-06-05T17:00:24.31 cm1 user.info osafimmpbed: IN arg[2] == '--pbe'
2017-06-05T17:00:24.31 cm1 user.info osafimmpbed: IN arg[3] == 
'/storage/clear/coremw/etc/imm.db'
2017-06-05T17:00:24.68 cm1 local0.err osafimmnd[984]: ER No IMMD service => 
cluster restart, exiting

2017-06-05T17:00:34.73 cm1 user.warning osafimmpbed: WA Verify class 
CmwMgntLockClass failed!
2017-06-05T17:00:34.73 cm1 user.notice osafimmpbed: NO Renamed 
/storage/clear/coremw/etc/imm.db to 
/storage/clear/coremw/etc/imm.db.failed_immdump because it has been detected to 
be corrupt.
2017-06-05T17:00:34.73 cm1 user.notice osafimmpbed: NO Removed obsolete journal 
file: /storage/clear/coremw/etc/imm.db-journal 
2017-06-05T17:00:34.73 cm1 user.warning osafimmpbed: WA verifyPbeState failed!
2017-06-05T17:00:34.73 cm1 user.warning osafimmpbed: WA Pbe: Failed to 
re-attach to db file /storage/clear/coremw/etc/imm.db - regenerating db file
2017-06-05T17:00:34.73 cm1 user.info osafimmpbed: IN Generating DB file from 
current IMM state. DB file: /storage/clear/coremw/etc/imm.db
2017-06-05T17:00:34.74 cm1 user.notice osafimmpbed: NO Successfully opened 
empty local sqlite pbe file /tmp/ImmPbeTmpSubDir/imm.db.d3HPkE





[tickets] [opensaf:tickets] #2481 imm: node crashes due to missing discard node info during the sync

2017-06-02 Thread Zoran Milinkovic
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/35873836/



---

** [tickets:#2481] imm: node crashes due to missing discard node info during 
the sync**

**Status:** review
**Milestone:** 5.17.06
**Created:** Fri Jun 02, 2017 02:39 PM UTC by Zoran Milinkovic
**Last Updated:** Fri Jun 02, 2017 02:39 PM UTC
**Owner:** Zoran Milinkovic


When a node receives a discard node message while it is being synced, the 
discard node operation is not re-executed after the node is synced.

Jun  1 09:09:46 PL-16 osafimmnd[20197]: NO NODE STATE-> IMM_NODE_W_AVAILABLE
Jun  1 09:09:46 PL-16 osafimmnd[20197]: NO SERVER STATE: 
IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
Jun  1 09:09:48 PL-16 osafimmnd[20197]: NO Global discard node received for 
nodeId:2090f pid:20533
Jun  1 09:09:48 PL-16 osafimmnd[20197]: NO NODE STATE-> 
IMM_NODE_FULLY_AVAILABLE 2715
Jun  1 09:09:48 PL-16 osafimmnd[20197]: NO RepositoryInitModeT is 
SA_IMM_INIT_FROM_FILE
Jun  1 09:09:48 PL-16 osafimmnd[20197]: WA IMM Access Control mode is DISABLED!
Jun  1 09:09:48 PL-16 osafimmnd[20197]: NO Epoch set to 58 in ImmModel

It results in many messages like:
Jun  1 09:09:55 PL-16 osafimmnd[20197]: NO Sync-verify: Veteran node has 
different Implementer-id 41 for implementer: @safPmService1333912, should be 0 
according to finalizeSync. Assunimg implSet bypased finSync
Jun  1 09:10:07 PL-16 osafimmnd[20197]: NO Sync-verify: Veteran node has 
different Implementer-id 41 for implementer: @safPmService1333912, should be 0 
according to finalizeSync. Assunimg implSet bypased finSync

And at the end, when the IMM data inconsistency is detected, IMM aborts:
Jun 1 09:18:19 PL-16 osafimmnd[20197]: ER Sync-verify: Established node has 
different Implementer-id: 41 for name: @safPmService1333912, sync says 578.
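
The missing step could look something like the sketch below: discard node messages that arrive during a sync are remembered and replayed once the synced state has been installed. This is an assumption about one possible fix, not the actual patch; `SyncClient` and all of its members are hypothetical names, and implementers are reduced to a name-to-node-id map.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified model of a node that is a sync client.
class SyncClient {
 public:
  using Implementers = std::map<std::string, uint32_t>;  // name -> owning node id

  void BeginSync() { syncing_ = true; }

  // While syncing, the discard is remembered instead of being applied to
  // state that the sync is about to overwrite.
  void OnDiscardNode(uint32_t node_id) {
    if (syncing_) {
      pending_discards_.push_back(node_id);
    } else {
      Discard(node_id);
    }
  }

  // Install the synced state, then re-execute every discard seen meanwhile.
  void FinishSync(const Implementers& synced) {
    implementers_ = synced;
    syncing_ = false;
    for (uint32_t node_id : pending_discards_) Discard(node_id);
    pending_discards_.clear();
  }

  bool HasImplementer(const std::string& name) const {
    return implementers_.count(name) != 0;
  }

 private:
  // Drop every implementer that belongs to the discarded node.
  void Discard(uint32_t node_id) {
    for (auto it = implementers_.begin(); it != implementers_.end();) {
      if (it->second == node_id) {
        it = implementers_.erase(it);
      } else {
        ++it;
      }
    }
  }

  Implementers implementers_;
  std::vector<uint32_t> pending_discards_;
  bool syncing_ = false;
};
```

Without the replay in `FinishSync`, the synced node would keep an implementer that the veteran nodes have already dropped, which matches the Sync-verify mismatch in the log.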





[tickets] [opensaf:tickets] #2481 imm: node crashes due to missing discard node info during the sync

2017-06-02 Thread Zoran Milinkovic



---

** [tickets:#2481] imm: node crashes due to missing discard node info during 
the sync**

**Status:** accepted
**Milestone:** 5.17.06
**Created:** Fri Jun 02, 2017 02:39 PM UTC by Zoran Milinkovic
**Last Updated:** Fri Jun 02, 2017 02:39 PM UTC
**Owner:** Zoran Milinkovic


When a node receives a discard node message while it is being synced, the 
discard node operation is not re-executed after the node is synced.

Jun  1 09:09:46 PL-16 osafimmnd[20197]: NO NODE STATE-> IMM_NODE_W_AVAILABLE
Jun  1 09:09:46 PL-16 osafimmnd[20197]: NO SERVER STATE: 
IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
Jun  1 09:09:48 PL-16 osafimmnd[20197]: NO Global discard node received for 
nodeId:2090f pid:20533
Jun  1 09:09:48 PL-16 osafimmnd[20197]: NO NODE STATE-> 
IMM_NODE_FULLY_AVAILABLE 2715
Jun  1 09:09:48 PL-16 osafimmnd[20197]: NO RepositoryInitModeT is 
SA_IMM_INIT_FROM_FILE
Jun  1 09:09:48 PL-16 osafimmnd[20197]: WA IMM Access Control mode is DISABLED!
Jun  1 09:09:48 PL-16 osafimmnd[20197]: NO Epoch set to 58 in ImmModel

It results in many messages like:
Jun  1 09:09:55 PL-16 osafimmnd[20197]: NO Sync-verify: Veteran node has 
different Implementer-id 41 for implementer: @safPmService1333912, should be 0 
according to finalizeSync. Assunimg implSet bypased finSync
Jun  1 09:10:07 PL-16 osafimmnd[20197]: NO Sync-verify: Veteran node has 
different Implementer-id 41 for implementer: @safPmService1333912, should be 0 
according to finalizeSync. Assunimg implSet bypased finSync

And at the end, when the IMM data inconsistency is detected, IMM aborts:
Jun 1 09:18:19 PL-16 osafimmnd[20197]: ER Sync-verify: Established node has 
different Implementer-id: 41 for name: @safPmService1333912, sync says 578.





[tickets] [opensaf:tickets] #2465 imm: change log level for "ER Problem in sending to IMMD over MDS"

2017-05-19 Thread Zoran Milinkovic
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/35848336/



---

** [tickets:#2465] imm: change log level for "ER Problem in sending to IMMD 
over MDS"**

**Status:** review
**Milestone:** 5.17.06
**Created:** Fri May 19, 2017 12:22 PM UTC by Zoran Milinkovic
**Last Updated:** Fri May 19, 2017 12:22 PM UTC
**Owner:** Zoran Milinkovic


Change the log level from error to warning for the message "Problem in sending to IMMD 
over MDS".
In all cases where this message is logged, IMMND returns ERR_TRY_AGAIN, so 
warning is a more appropriate log level than error.
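
The reasoning can be put as a small rule: a failure that the caller is told to retry is logged as a warning, and only a failure with no retry path is an error. The sketch below is illustrative only; `SendOutcome`, `LogLevel` and `LevelFor` are hypothetical names and not part of IMMND.

```cpp
// Hypothetical outcome of a failed send towards IMMD.
enum class SendOutcome { kCallerRetries, kNoRecovery };

enum class LogLevel { kWarning, kError };

// The caller gets ERR_TRY_AGAIN and can retry, so the failure is not final.
LogLevel LevelFor(SendOutcome outcome) {
  return outcome == SendOutcome::kCallerRetries ? LogLevel::kWarning
                                                : LogLevel::kError;
}
```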




[tickets] [opensaf:tickets] #2465 imm: change log level for "ER Problem in sending to IMMD over MDS"

2017-05-19 Thread Zoran Milinkovic



---

** [tickets:#2465] imm: change log level for "ER Problem in sending to IMMD 
over MDS"**

**Status:** accepted
**Milestone:** 5.17.06
**Created:** Fri May 19, 2017 12:22 PM UTC by Zoran Milinkovic
**Last Updated:** Fri May 19, 2017 12:22 PM UTC
**Owner:** Zoran Milinkovic


Change the log level from error to warning for the message "Problem in sending to IMMD 
over MDS".
In all cases where this message is logged, IMMND returns ERR_TRY_AGAIN, so 
warning is a more appropriate log level than error.




[tickets] [opensaf:tickets] #2423 rde: RDE sets the active role even if there is a node with the active role in a cluster

2017-05-18 Thread Zoran Milinkovic
- **status**: review --> fixed
- **Blocker**:  --> False
- **Comment**:

develop:

commit 1867ef71083edfad88dc3a9970549e6d35085bd2
Author: Zoran Milinkovic 
Date:   Thu May 18 13:20:27 2017 +0200

rde: save peer role on peer info request message [#2423]

When peer info message is received, the peer info will be saved.
This will prevent setting active role before response info is received.

-

default:

changeset:   8801:c6a7b0237794
tag: tip
user:    Zoran Milinkovic 
date:Thu May 18 13:29:00 2017 +0200
summary: rde: save peer role on peer info request message [#2423]



---

** [tickets:#2423] rde: RDE sets the active role even if there is a node with 
the active role in a cluster**

**Status:** fixed
**Milestone:** 5.17.06
**Created:** Tue Apr 11, 2017 11:14 AM UTC by Zoran Milinkovic
**Last Updated:** Thu Apr 13, 2017 06:59 AM UTC
**Owner:** Zoran Milinkovic


When an active node is detected late, the new node may acquire the 
active role because of the time gap between the request and response messages. 
RDE does not remember the role of nodes that sent a request, which can lead to 
a second node being elected with the active role.

2016-12-22 17:55:08 SC-2 osafrded[421]: NO Got peer info request from node 
0x2050f with role ACTIVE
2016-12-22 17:55:08 SC-2 osafrded[421]: NO Running 
'/usr/local/lib/opensaf/opensaf_sc_active' with 0 argument(s)
2016-12-22 17:55:08 SC-2 opensaf_sc_active: 
5c20d9c8-c867-11e6-a222-5254001c9220 expected on SC-1
2016-12-22 17:55:08 SC-2 osafrded[421]: NO Switched to ACTIVE from Undefined
2016-12-22 17:55:08 SC-2 osafrded[421]: NO Got peer info response from node 
0x2050f with role ACTIVE
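
A minimal sketch of the idea behind the fix, as described in the commit message (save the peer role when a peer info request arrives). The class and method names are hypothetical and the real RDE state machine has more states than this.

```cpp
enum class Role { kUndefined, kActive, kStandby };

// Hypothetical, reduced view of the RDE role election state.
class RoleElection {
 public:
  // Saving the role carried by the request is the step the fix adds.
  void OnPeerInfoRequest(Role peer_role) { peer_role_ = peer_role; }

  void OnPeerInfoResponse(Role peer_role) { peer_role_ = peer_role; }

  // The local node may claim the active role only if no peer has
  // reported being active so far.
  bool MayTakeActive() const { return peer_role_ != Role::kActive; }

 private:
  Role peer_role_ = Role::kUndefined;
};
```

In the log above, SC-2 got a request from 0x2050f with role ACTIVE before it switched to ACTIVE; with the role saved at that point, `MayTakeActive()` would already return false.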





[tickets] [opensaf:tickets] #2418 imm: Info of dead IMMND remains in standby IMMD

2017-04-13 Thread Zoran Milinkovic
- **status**: accepted --> review



---

** [tickets:#2418] imm: Info of dead IMMND remains in standby IMMD**

**Status:** review
**Milestone:** 5.0.2
**Created:** Mon Apr 10, 2017 10:23 AM UTC by Hung Nguyen
**Last Updated:** Mon Apr 10, 2017 10:24 AM UTC
**Owner:** Hung Nguyen
**Attachments:**

- [log.tgz](https://sourceforge.net/p/opensaf/tickets/2418/attachment/log.tgz) 
(149.4 kB; application/x-compressed)


When the Standby IMMD comes up at the same time as an IMMND is exiting, the info of that 
IMMND might not be removed from the **immnd_tree** of the Standby IMMD.

The details of the problem are explained in the sequence diagram below
[sequence 
diagram](http://sequencediagram.org/index.html?initialData=A4QwTgLglgxloDsIAICCBhAKgWgJIFl8ARAKFElnhCWQGVMAhPQ0kkAIwHsAPZTgNwCmYOo2bFkAYjCCAJgC5kRAPIB1AHLJBQmgDMwnALbIC+dUT4JkCTrMHIAGiRJdeA4aKamii3AigoxLQAOgh+Acj4DOjI1LLIAM6CgdHIBgA29hCcdBBx7ACezvReLMjYAHxoWOIW8uic6bIJBQgwaYIAjgCuggkQziQYON7lVSW1ig1NLW0dCcCcCEmhEAAW9qbmyOlQ-chQbenddgnI65uE20vWtvZOzhw8fEIiw7VSMgpKapragnoDMYthYbjY7I5nK4Xh53t4ADQTbyKTAbExXCx7DqGdzxfRGaojMrsbooGSGECHM6HTy1IA)

SC-5 was Active, SC-2 was Standby, IMMND on SC-1 was exiting

~~~
18:35:03 SC-1 osafimmnd[441]: exiting for shutdown

18:35:03 SC-2 osafrded[413]: NO RDE role set to STANDBY
18:35:03 SC-2 osafimmd[430]: NO MDS event from svc_id 25 (change:3, 
dest:568511936070075)
18:35:03 SC-2 osafimmd[430]: NO MDS event from svc_id 25 (change:3, 
dest:567412424442298)
18:35:03 SC-2 osafimmd[430]: NO MDS event from svc_id 25 (change:3, 
dest:566312912814523)
18:35:03 SC-2 osafimmd[430]: NO MDS event from svc_id 25 (change:3, 
dest:565213401186744)

18:35:03 SC-5 osafimmd[433]: NO MDS event from svc_id 25 (change:4, 
dest:564113889558969)
~~~

Down event for IMMND@SC-1 was received on SC-5 but not on SC-2.


**The symptoms:**

1. If the IMMND that went down is the coordinator, then when that Standby IMMD 
becomes Active it fails to elect a new coordinator, because there is already a 
coordinator in the **immnd_tree**.
~~~
18:35:11 SC-2 osafimmd[430]: WA IMMND coordinator at 2050f apparently crashed 
=> electing new coord
~~~
No more logs about newly elected coordinator were printed out.


2. When IMMND@SC-1 is up again, it will fail to introduce itself to IMMD because the 
IMMD already has IMMND@SC-1 in **immnd_tree** with a wrong epoch.

~~~
18:35:29 SC-1 osafimmnd[441]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> 
IMM_SERVER_CLUSTER_WAITING
18:35:29 SC-1 osafimmnd[441]: NO This IMMND is now the NEW Coord
18:35:29 SC-1 osafimmnd[441]: ER 3 > 0, exiting
~~~
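
The first symptom can be illustrated with a toy model of the tree. This is not the IMMD code: `ImmndInfo`, `ElectCoordinator` and `OnImmndDown` are hypothetical names, and electing the lowest node id is an assumption made only to keep the example small.

```cpp
#include <cstdint>
#include <map>

// Hypothetical, reduced entry of the immnd_tree.
struct ImmndInfo {
  bool is_coordinator;
};
using ImmndTree = std::map<uint32_t, ImmndInfo>;  // node id -> info

// Returns the node id of the newly elected coordinator, or 0 when no
// election takes place because the tree already lists a coordinator.
uint32_t ElectCoordinator(ImmndTree& tree) {
  for (const auto& entry : tree) {
    if (entry.second.is_coordinator) return 0;  // Possibly a stale entry.
  }
  if (tree.empty()) return 0;
  tree.begin()->second.is_coordinator = true;  // Assumed rule: lowest id wins.
  return tree.begin()->first;
}

// What a delivered down event would do: forget the dead IMMND.
void OnImmndDown(ImmndTree& tree, uint32_t node_id) { tree.erase(node_id); }
```

If the down event never reaches the Standby IMMD, the dead coordinator stays in the tree and the election returns nothing, which fits the log where no new coordinator is reported.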






[tickets] [opensaf:tickets] #2423 rde: RDE sets the active role even if there is a node with the active role in a cluster

2017-04-12 Thread Zoran Milinkovic
- **status**: accepted --> review



---

** [tickets:#2423] rde: RDE sets the active role even if there is a node with 
the active role in a cluster**

**Status:** review
**Milestone:** 5.17.06
**Created:** Tue Apr 11, 2017 11:14 AM UTC by Zoran Milinkovic
**Last Updated:** Tue Apr 11, 2017 11:14 AM UTC
**Owner:** Zoran Milinkovic


When an active node is detected late, the new node may acquire the 
active role because of the time gap between the request and response messages. 
RDE does not remember the role of nodes that sent a request, which can lead to 
a second node being elected with the active role.

2016-12-22 17:55:08 SC-2 osafrded[421]: NO Got peer info request from node 
0x2050f with role ACTIVE
2016-12-22 17:55:08 SC-2 osafrded[421]: NO Running 
'/usr/local/lib/opensaf/opensaf_sc_active' with 0 argument(s)
2016-12-22 17:55:08 SC-2 opensaf_sc_active: 
5c20d9c8-c867-11e6-a222-5254001c9220 expected on SC-1
2016-12-22 17:55:08 SC-2 osafrded[421]: NO Switched to ACTIVE from Undefined
2016-12-22 17:55:08 SC-2 osafrded[421]: NO Got peer info response from node 
0x2050f with role ACTIVE





[tickets] [opensaf:tickets] #2423 rde: RDE sets the active role even if there is a node with the active role in a cluster

2017-04-11 Thread Zoran Milinkovic



---

** [tickets:#2423] rde: RDE sets the active role even if there is a node with 
the active role in a cluster**

**Status:** accepted
**Milestone:** 5.17.06
**Created:** Tue Apr 11, 2017 11:14 AM UTC by Zoran Milinkovic
**Last Updated:** Tue Apr 11, 2017 11:14 AM UTC
**Owner:** Zoran Milinkovic


When an active node is detected late, the new node may acquire the 
active role because of the time gap between the request and response messages. 
RDE does not remember the role of nodes that sent a request, which can lead to 
a second node being elected with the active role.

2016-12-22 17:55:08 SC-2 osafrded[421]: NO Got peer info request from node 
0x2050f with role ACTIVE
2016-12-22 17:55:08 SC-2 osafrded[421]: NO Running 
'/usr/local/lib/opensaf/opensaf_sc_active' with 0 argument(s)
2016-12-22 17:55:08 SC-2 opensaf_sc_active: 
5c20d9c8-c867-11e6-a222-5254001c9220 expected on SC-1
2016-12-22 17:55:08 SC-2 osafrded[421]: NO Switched to ACTIVE from Undefined
2016-12-22 17:55:08 SC-2 osafrded[421]: NO Got peer info response from node 
0x2050f with role ACTIVE





[tickets] [opensaf:tickets] #2100 Standby should not be rebooted, for SC absence configuration mismatch

2017-04-10 Thread Zoran Milinkovic
IMMSV_SC_ABSENCE_ALLOWED cannot be moved to imm.xml or PBE.
IMMSV_SC_ABSENCE_ALLOWED, as well as IMMSV_SC_ABSENCE_VETERAN_MAX_WAIT, are 
required for IMMD which starts before IMMND.

NID should decide whether a node should be restarted after a failed IMMD 
start.


---

** [tickets:#2100]  Standby should not be rebooted, for  SC absence 
configuration mismatch**

**Status:** unassigned
**Milestone:** future
**Created:** Fri Oct 07, 2016 07:11 AM UTC by Srikanth R
**Last Updated:** Thu Mar 30, 2017 04:51 AM UTC
**Owner:** nobody


Changeset : 8190 5.1.GA

-> Initially brought up opensaf on SC-1 with "SC ABSENCE" feature enabled in 
immd.conf.

-> On SC-2, "SC ABSENCE" feature is not enabled in immd.conf and opensafd is 
started on SC-2, for which node rebooted.

Oct  7 17:58:27 SLES-SLOT2 osafimmd[3615]: ER SC absence allowed in not the 
same as on active IMMD. Active: 900, Standby: 0. Exiting.
Oct  7 17:58:27 SLES-SLOT2 osafamfnd[3676]: NO 
'safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Oct  7 17:58:27 SLES-SLOT2 osafamfnd[3676]: ER 
safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Oct  7 17:58:27 SLES-SLOT2 osafamfnd[3676]: Rebooting OpenSAF NodeId = 131599 
EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 
131599, SupervisionTime = 60

Here the user had misconfigured the two controllers, which caused the standby 
to reboot. Opensafd is enabled in the runlevel as part of installation, so the 
standby will reboot continuously until opensafd is stopped on SC-1.

Suggested behavior:

Opensafd should not start on the standby, instead of rebooting immediately.

Also, cluster-level attributes like IMMSV_SC_ABSENCE_ALLOWED can be moved to 
imm.xml. Node-level attributes, such as trace enabling, can be retained in the 
configuration files.
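
The suggested behavior amounts to a check like the one below, where a mismatch leads to a refusal to start rather than a node failfast. This is only a sketch of the suggestion; `StartDecision` and `CheckScAbsence` are hypothetical names, and which component would make the decision is not settled in this ticket.

```cpp
#include <cstdint>

enum class StartDecision { kStart, kRefuseToStart };

// Compares the IMMSV_SC_ABSENCE_ALLOWED value of the active IMMD with the
// locally configured value on the standby.
StartDecision CheckScAbsence(uint32_t active_value, uint32_t standby_value) {
  return active_value == standby_value ? StartDecision::kStart
                                       : StartDecision::kRefuseToStart;
}
```

With the values from the log (Active: 900, Standby: 0) the check would refuse to start instead of entering a reboot loop.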




[tickets] [opensaf:tickets] #2393 Immd got crashed on Active as immnd restarted on Active with cluster having single controller and payload

2017-03-24 Thread Zoran Milinkovic
- **status**: unassigned --> invalid
- **Comment**:

This is expected behavior when SC absence is not allowed.
Even if there were more payloads, cluster reboot would be initiated due to 
absence of IMMNDs on controllers.

SC absence is set to 0 (is not allowed). This info can be seen in message:
Mar 23 11:06:12 SO-SLOT-1 osafimmd[2138]: ER Failed to find candidate for new 
IMMND coordinator (ScAbsenceAllowed:0 RulingEpoch:2
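
A rough sketch of why no candidate is found in this setup. It assumes that, with SC absence not allowed, only an IMMND on a controller can become coordinator, and that a payload IMMND may qualify when SC absence is allowed; the second part is an assumption that is not stated in this ticket. `Immnd` and `FindCoordinatorCandidate` are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, reduced description of one IMMND known to IMMD.
struct Immnd {
  uint32_t node_id;
  bool on_controller;
  bool alive;
};

// Returns the node id of a coordinator candidate, or 0 if there is none,
// in which case the caller has to restart the cluster.
uint32_t FindCoordinatorCandidate(const std::vector<Immnd>& immnds,
                                  uint32_t sc_absence_allowed) {
  for (const Immnd& immnd : immnds) {
    if (!immnd.alive) continue;
    // Payloads qualify only when SC absence is allowed (assumption).
    if (immnd.on_controller || sc_absence_allowed != 0) return immnd.node_id;
  }
  return 0;
}
```

With the only controller IMMND killed and ScAbsenceAllowed:0, the payload IMMND is not eligible, so the function finds nothing, in line with the logged error.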



---

** [tickets:#2393] Immd got crashed on Active as immnd restarted on Active with 
cluster having single controller and payload**

**Status:** invalid
**Milestone:** 5.2.RC2
**Created:** Thu Mar 23, 2017 05:58 AM UTC by Ritu Raj
**Last Updated:** Thu Mar 23, 2017 05:55 PM UTC
**Owner:** nobody
**Attachments:**

- 
[PL-3.tar.bz2](https://sourceforge.net/p/opensaf/tickets/2393/attachment/PL-3.tar.bz2)
 (558.9 kB; application/x-bzip)
- 
[SC-1.tar.bz2](https://sourceforge.net/p/opensaf/tickets/2393/attachment/SC-1.tar.bz2)
 (2.5 MB; application/x-bzip)


###Environment details
OS : Suse 64bit
Changeset : 8701 ( 5.2.RC1)
2 nodes setup(1 controller and 1 payload)

###Summary
Immd got crashed on Active as immnd restarted on Active with cluster having 
single controller and payload

###Steps followed & Observed behaviour
1. Bring up a cluster with 1 controller and 1 payload 
2. Kill immnd on the active controller 
3. Observe that immd crashes on the active controller (SC-1), which also 
causes the payload to reboot

** Issue observed when there is only one controller **

**Syslog**
SC-1:::

Mar 23 11:06:12 SO-SLOT-1 osafamfnd[2213]: NO 
'safSu=SC-1,safSg=NoRed,safApp=OpenSAF' component restart probation timer 
started (timeout: 600 ns)
Mar 23 11:06:12 SO-SLOT-1 osafamfnd[2213]: NO Restarting a component of 
'safSu=SC-1,safSg=NoRed,safApp=OpenSAF' (comp restart count: 1)
Mar 23 11:06:12 SO-SLOT-1 osafamfnd[2213]: NO 
'safComp=IMMND,safSu=SC-1,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' 
: Recovery is 'componentRestart'
Mar 23 11:06:12 SO-SLOT-1 osafsmfd[2235]: WA DispatchOiCallback: 
saImmOiDispatch() Fail 'SA_AIS_ERR_BAD_HANDLE (9)'
Mar 23 11:06:12 SO-SLOT-1 osafntfimcnd[2181]: NO saImmOiDispatch() Fail 
SA_AIS_ERR_BAD_HANDLE (9)
Mar 23 11:06:12 SO-SLOT-1 osafimmd[2138]: WA IMMND coordinator at 2010f 
apparently crashed => electing new coord
Mar 23 11:06:12 SO-SLOT-1 osafimmd[2138]: ER Failed to find candidate for new 
IMMND coordinator (ScAbsenceAllowed:0 RulingEpoch:2
Mar 23 11:06:12 SO-SLOT-1 osafimmd[2138]: ER Active IMMD has to restart the 
IMMSv. All IMMNDs will restart
Mar 23 11:06:12 SO-SLOT-1 osafimmd[2138]: ER IMM RELOAD with NO persistent back 
end => ensure cluster restart by IMMD exit at both SCs, exiting
Mar 23 11:06:12 SO-SLOT-1 osafamfnd[2213]: NO 
'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Mar 23 11:06:12 SO-SLOT-1 osafamfnd[2213]: ER 
safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Mar 23 11:06:12 SO-SLOT-1 osafamfnd[2213]: Rebooting OpenSAF NodeId = 131343 EE 
Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 
131343, SupervisionTime = 60
Mar 23 11:06:12 SO-SLOT-1 opensaf_reboot: Rebooting local node; timeout=60

PL-3:::
Mar 23 11:06:21 SO-SLOT-3 osafimmnd[2280]: ER IMMND forced to restart on order 
from IMMD, exiting
Mar 23 11:06:21 SO-SLOT-3 osafamfnd[2290]: NO 
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' component restart probation timer 
started (timeout: 600 ns)
Mar 23 11:06:21 SO-SLOT-3 osafamfnd[2290]: NO Restarting a component of 
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' (comp restart count: 1)
Mar 23 11:06:21 SO-SLOT-3 osafamfnd[2290]: NO 
'safComp=IMMND,safSu=PL-3,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' 
: Recovery is 'componentRestart'
Mar 23 11:06:21 SO-SLOT-3 osafimmnd[2755]: mkfifo already exists: 
/var/lib/opensaf/osafimmnd.fifo File exists
Mar 23 11:06:21 SO-SLOT-3 osafimmnd[2755]: Started
Mar 23 11:06:26 SO-SLOT-3 osafamfnd[2290]: WA AMF director unexpectedly crashed
Mar 23 11:06:26 SO-SLOT-3 osafamfnd[2290]: Rebooting OpenSAF NodeId = 131855 EE 
Name = , Reason: local AVD down(Adest) or both AVD down(Vdest) received, 
OwnNodeId = 131855, SupervisionTime = 60

Traces:
From the traces, the active IMMD reports 'Failed to find candidate for new IMMND 
coordinator' and 'Active IMMD has to restart the IMMSv'
~~~
Mar 23 11:06:12.535325 osafimmd [2138:src/imm/immd/immd_evt.c:2638] T5 Received 
IMMND service event
Mar 23 11:06:12.535349 osafimmd [2138:src/imm/immd/immd_evt.c:2741] T5 PROCESS 
MDS EVT: NCSMDS_DOWN, my PID:2138
Mar 23 11:06:12.535451 osafimmd [2138:src/imm/immd/immd_evt.c:2748] T5 
NCSMDS_DOWN => local IMMND down
Mar 23 11:06:12.535463 osafimmd [2138:src/imm/immd/immd_evt.c:2763] T5 IMMND 
DOWN PROCESS detected by IMMD
Mar 23 11:06:12.535475 osafimmd [2138:src/imm/immd/immd_proc.c:0618] >> 
immd_process_immnd_down
Mar 23 11:06:12.535483 osafimmd [2138:src/imm/immd/immd_proc.c:0621] T5 
immd_process_immnd_down
~~~

[tickets] [opensaf:tickets] Re: #2382 imm: reducing log level for ccb-committed messages

2017-03-16 Thread Zoran Milinkovic
Hi,

I fully agree with Anders.

I suggest to close this ticket.
Once ticket #2306 is implemented and logging to /var/log/opensaf/xx log is 
supported, we can create a new ticket and redirect these logs to the new log 
file.

Thanks,
Zoran

-Original Message-
From: Anders Bjornerstedt [mailto:ander...@users.sf.net] 
Sent: den 16 mars 2017 11:30
To: [opensaf:tickets] <2...@tickets.opensaf.p.re.sf.net>
Subject: [opensaf:tickets] #2382 imm: reducing log level for ccb-committed 
messages

First, this ticket should not be a defect.
The log level of the ccb commit messages is intentional, the motive being to 
have a record of if and when a CCB was committed. 

Second, having a record of configuration changes at the OpenSAF level is 
normally necessary for analyzing a reported problem involving OpenSAF. Many 
problems are triggered by a configuration change. Having a persistent record of 
such configuration changes is crucial for understanding or debugging unexpected 
events or problems in a system.
Such troubleshooting does not just cover troubleshooting of OpenSAF, but also 
troubleshooting of application level behavior when the configuration of such an 
application is changed.

Log level NOtice is the lowest log level that is pushed to the syslog by 
default in OpenSAF.
This ticket in fact goes further than just lowering the log level to INfo 
(which is normally not logged but can be toggled on), it argues for lowering it 
to trace!
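
The log level argument can be summed up in a few lines. This is a simplified sketch, not the OpenSAF logging code: `Level` and `ReachesSyslog` are hypothetical names, and treating trace output as never going to syslog is an assumption made for the illustration.

```cpp
// Levels in decreasing severity, named after the ER/WA/NO/IN prefixes
// seen in the syslog lines of this list.
enum class Level { kError, kWarning, kNotice, kInfo, kTrace };

// NOtice and above reach syslog by default, INfo only when toggled on,
// and trace is assumed to go to separate trace files.
bool ReachesSyslog(Level level, bool info_enabled) {
  if (level == Level::kTrace) return false;
  if (level == Level::kInfo) return info_enabled;
  return true;
}
```

Moving the "Ccb COMMITTED" message from NOtice to trace would therefore remove it from syslog in a default setup.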

So you could end up in a scenario where there is a serious incident on a 
system, but no way to see from the OpenSAF logs if there was any configuration 
change involved in triggering the problem. You would need to reproduce the 
problem to get trace or INfo log level enabled.
The problem with trace is that the volumes are so large that it sometimes 
impacts the behavior of the system, sometimes making it difficult to reproduce 
the problem.

CCB traffic is very low during normal operation. Only during SMF campaigns, or 
manual reconfigurations of the system would there be CCB traffic of any 
significance.
So log messages of committed CCBs can hardly be a big issue in terms of 
volume, in general.

In summary:
I argue that this ticket is not motivated and it is by definition not a defect 
since the current behavior is intentional and well motivated.

The motive behind this ticket should be analyzed better and explained better in 
the ticket.
Or the ticket may just be closed.

A slightly better alternative is to introduce a new configuration parameter to 
specify if CCB commits are to be logged. The default of that configuration 
parameter must of course be OFF (current behavior the default).






---

** [tickets:#2382] imm: reducing log level for ccb-committed messages**

**Status:** review
**Milestone:** 5.0.2
**Created:** Thu Mar 16, 2017 09:26 AM UTC by Neelakanta Reddy
**Last Updated:** Thu Mar 16, 2017 09:47 AM UTC
**Owner:** Neelakanta Reddy


    if (i != sOwnerVector.end()) {
        LOG_NO("Ccb %u COMMITTED (%s)", ccb->mId,
               (*i)->mAdminOwnerName.c_str());
    } else {
        LOG_NO("Ccb %u COMMITTED (%s)", ccb->mId, "");
    }

Reduce the LOG_NO to TRACE



---

** [tickets:#2382] imm: reducing log level for ccb-committed messages**

**Status:** review
**Milestone:** 5.0.2
**Created:** Thu Mar 16, 2017 09:26 AM UTC by Neelakanta Reddy
**Last Updated:** Thu Mar 16, 2017 10:30 AM UTC
**Owner:** Neelakanta Reddy


    if (i != sOwnerVector.end()) {
        LOG_NO("Ccb %u COMMITTED (%s)", ccb->mId,
               (*i)->mAdminOwnerName.c_str());
    } else {
        LOG_NO("Ccb %u COMMITTED (%s)", ccb->mId, "");
    }

Reduce the LOG_NO to TRACE




[tickets] [opensaf:tickets] Re: #2284 IMM: Improper return code without any error string while deleting large number of objects

2017-03-13 Thread Zoran Milinkovic
Hi Srikanth,

I had a discussion with Anders Bjornerstedt a week before, and we agreed to 
create a new ticket.
I see that he has already explained the reason why I created a new ticket and 
closed the older one.

Thanks,
Zoran

-Original Message-
From: Srikanth R [mailto:rwp...@users.sf.net] 
Sent: den 10 mars 2017 07:18
To: [opensaf:tickets] <2...@tickets.opensaf.p.re.sf.net>
Subject: [opensaf:tickets] #2284 IMM: Improper return code without any error 
string while deleting large number of objects

To my understanding, this ticket was raised to correct the invalid return code 
(ERR_LIBRARY). As per the ticket description, the expected behavior is: 
"Proper return code with error string should be returned".

Why is a new ticket necessary?


---

** [tickets:#2284] IMM: Improper return code without any error string while 
deleting large number of objects**

**Status:** invalid
**Milestone:** 5.2.RC1
**Created:** Wed Feb 01, 2017 07:13 AM UTC by Chani Srivastava
**Last Updated:** Thu Mar 09, 2017 01:15 PM UTC
**Owner:** nobody


Steps to reproduce:

1. Bring up OpenSAF on a cluster
2. Create around 10k objects
3. Try deleting these objects in one immcfg operation

Output:
Error Returned - error - saImmOmAdminOwnerSet FAILED: SA_AIS_ERR_LIBRARY (2)

No error string stating the cause of failure is returned.

Syslog - immcfg: ER TOO MANY Object Names line:733

Expected behavior - Proper return code with error string should be returned 


---

** [tickets:#2284] IMM: Improper return code without any error string while 
deleting large number of objects**

**Status:** invalid
**Milestone:** 5.2.RC1
**Created:** Wed Feb 01, 2017 07:13 AM UTC by Chani Srivastava
**Last Updated:** Fri Mar 10, 2017 04:34 PM UTC
**Owner:** nobody


Steps to reproduce:

1. Bring up OpenSAF on a cluster
2. Create around 10k objects
3. Try deleting these objects in one immcfg operation

Output:
Error Returned - error - saImmOmAdminOwnerSet FAILED: SA_AIS_ERR_LIBRARY (2)

No error string stating the cause of failure is returned.

Syslog - immcfg: ER TOO MANY Object Names line:733

Expected behavior - Proper return code with error string should be returned 




[tickets] [opensaf:tickets] #2359 imm: incorrect error code when more than 10000 objects are set in saImmOmAdminOwnerSet

2017-03-09 Thread Zoran Milinkovic
- **Component**: unknown --> imm



---

** [tickets:#2359] imm: incorrect error code when more than 10000 objects are 
set in saImmOmAdminOwnerSet**

**Status:** unassigned
**Milestone:** 5.0.2
**Created:** Thu Mar 09, 2017 01:34 PM UTC by Zoran Milinkovic
**Last Updated:** Thu Mar 09, 2017 01:34 PM UTC
**Owner:** nobody


When more than 10000 objects are set in saImmOmAdminOwnerSet/Release/Clear, 
SA_AIS_ERR_LIBRARY is returned.
The correct error code is SA_AIS_ERR_NO_RESOURCES.

IMM limits each admin owner set/release/clear call to 10000 objects.
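
The intended behavior can be sketched as follows. This is only an
illustration: `kMaxObjectNames` and `checkObjectNameCount` are hypothetical
names, the limit of 10000 is taken from the ticket title, and treating exactly
10000 objects as allowed is an assumption. The error code values follow the
SA Forum AIS `SaAisErrorT` enumeration:

```cpp
#include <cstddef>

// Subset of the SA Forum AIS error codes (values as defined in saAis.h).
enum SaAisErrorT {
  SA_AIS_OK = 1,
  SA_AIS_ERR_LIBRARY = 2,
  SA_AIS_ERR_NO_RESOURCES = 18
};

// Hypothetical name for the per-call object limit described in the ticket.
constexpr std::size_t kMaxObjectNames = 10000;

// Hypothetical helper: validate the number of object names passed to an
// admin owner set/release/clear call. Exceeding the limit is reported as
// SA_AIS_ERR_NO_RESOURCES rather than SA_AIS_ERR_LIBRARY.
SaAisErrorT checkObjectNameCount(std::size_t count) {
  if (count > kMaxObjectNames) {
    return SA_AIS_ERR_NO_RESOURCES;
  }
  return SA_AIS_OK;
}
```

Returning a resource error lets the caller tell an oversized request apart
from an unexpected failure inside the library.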





[tickets] [opensaf:tickets] #2359 imm: incorrect error code when more than 10000 objects are set in saImmOmAdminOwnerSet

2017-03-09 Thread Zoran Milinkovic



---

** [tickets:#2359] imm: incorrect error code when more than 10000 objects are 
set in saImmOmAdminOwnerSet**

**Status:** unassigned
**Milestone:** 5.0.2
**Created:** Thu Mar 09, 2017 01:34 PM UTC by Zoran Milinkovic
**Last Updated:** Thu Mar 09, 2017 01:34 PM UTC
**Owner:** nobody


When more than 10000 objects are set in saImmOmAdminOwnerSet/Release/Clear, 
SA_AIS_ERR_LIBRARY is returned.
The correct error code is SA_AIS_ERR_NO_RESOURCES.

IMM limits each admin owner set/release/clear call to 10000 objects.





[tickets] [opensaf:tickets] #2284 IMM: Improper return code without any error string while deleting large number of objects

2017-03-09 Thread Zoran Milinkovic
- **status**: unassigned --> invalid
- **Comment**:

This is a limitation in IMM: adminOwnerSet may contain up to 10000 objects.
The ticket will be closed as invalid, and a new ticket will be created to 
return the correct error code (SA_AIS_ERR_NO_RESOURCES) instead of 
SA_AIS_ERR_LIBRARY.



---

** [tickets:#2284] IMM: Improper return code without any error string while 
deleting large number of objects**

**Status:** invalid
**Milestone:** 5.2.RC1
**Created:** Wed Feb 01, 2017 07:13 AM UTC by Chani Srivastava
**Last Updated:** Fri Mar 03, 2017 04:48 PM UTC
**Owner:** nobody


Steps to reproduce:

1. Bring up OpenSAF on a cluster
2. Create around 10k objects
3. Try deleting these objects in one immcfg operation

Output:
Error Returned - error - saImmOmAdminOwnerSet FAILED: SA_AIS_ERR_LIBRARY (2)

No error string stating the cause of failure is returned.

Syslog - immcfg: ER TOO MANY Object Names line:733

Expected behavior - Proper return code with error string should be returned 




[tickets] [opensaf:tickets] #2353 imm: incorrect log level after coming from headless state

2017-03-09 Thread Zoran Milinkovic
- **status**: review --> fixed
- **Comment**:

default(5.2):

changeset:   8678:f2e439c5b6da
tag:         tip
user:        Zoran Milinkovic
date:        Tue Mar 07 17:02:07 2017 +0100
summary:     imm: fix log level for saClmDispatch [#2353]




---

** [tickets:#2353] imm: incorrect log level after coming from headless state**

**Status:** fixed
**Milestone:** 5.2.RC1
**Created:** Tue Mar 07, 2017 03:55 PM UTC by Zoran Milinkovic
**Last Updated:** Tue Mar 07, 2017 04:04 PM UTC
**Owner:** Zoran Milinkovic


Mar  7 10:19:00 PL-3 osafimmnd[7110]: ER saClmDispatch failed: 9
Mar  7 10:19:00 PL-3 osafimmnd[7110]: NO Re-initializing with CLMS

The log level of the saClmDispatch failure should be changed from error to 
warning: after saClmDispatch returns SA_AIS_ERR_BAD_HANDLE (9), IMMND 
reinitializes its CLM handle.
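
A small sketch of the decision described above, assuming a hypothetical
helper `classifyDispatchResult` and hypothetical `LogLevel` and
`DispatchOutcome` types. How other error codes are handled in IMMND is not
stated in the ticket, so the error-level fallback here is an assumption:

```cpp
// Subset of the SA Forum AIS error codes (values as defined in saAis.h).
enum SaAisErrorT { SA_AIS_OK = 1, SA_AIS_ERR_BAD_HANDLE = 9 };

// Hypothetical log severities for this illustration.
enum class LogLevel { kNone, kWarning, kError };

// Hypothetical result type: which level to log at, and whether the CLM
// handle should be reinitialized.
struct DispatchOutcome {
  LogLevel level;
  bool reinitialize;
};

// A bad handle is recoverable (the handle is reinitialized), so it is
// reported as a warning rather than an error.
DispatchOutcome classifyDispatchResult(SaAisErrorT rc) {
  if (rc == SA_AIS_OK) {
    return {LogLevel::kNone, false};
  }
  if (rc == SA_AIS_ERR_BAD_HANDLE) {
    return {LogLevel::kWarning, true};
  }
  // Assumption: any other failure keeps the error level.
  return {LogLevel::kError, false};
}
```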




[tickets] [opensaf:tickets] #2353 imm: incorrect log level after coming from headless state

2017-03-07 Thread Zoran Milinkovic
- **status**: accepted --> review



---

** [tickets:#2353] imm: incorrect log level after coming from headless state**

**Status:** review
**Milestone:** 5.2.RC1
**Created:** Tue Mar 07, 2017 03:55 PM UTC by Zoran Milinkovic
**Last Updated:** Tue Mar 07, 2017 03:55 PM UTC
**Owner:** Zoran Milinkovic


Mar  7 10:19:00 PL-3 osafimmnd[7110]: ER saClmDispatch failed: 9
Mar  7 10:19:00 PL-3 osafimmnd[7110]: NO Re-initializing with CLMS

The log level of the saClmDispatch failure should be changed from error to 
warning: after saClmDispatch returns SA_AIS_ERR_BAD_HANDLE (9), IMMND 
reinitializes its CLM handle.




[tickets] [opensaf:tickets] #2353 imm: incorrect log level after coming from headless state

2017-03-07 Thread Zoran Milinkovic



---

** [tickets:#2353] imm: incorrect log level after coming from headless state**

**Status:** accepted
**Milestone:** 5.2.RC1
**Created:** Tue Mar 07, 2017 03:55 PM UTC by Zoran Milinkovic
**Last Updated:** Tue Mar 07, 2017 03:55 PM UTC
**Owner:** Zoran Milinkovic


Mar  7 10:19:00 PL-3 osafimmnd[7110]: ER saClmDispatch failed: 9
Mar  7 10:19:00 PL-3 osafimmnd[7110]: NO Re-initializing with CLMS

The log level of the saClmDispatch failure should be changed from error to 
warning: after saClmDispatch returns SA_AIS_ERR_BAD_HANDLE (9), IMMND 
reinitializes its CLM handle.




[tickets] [opensaf:tickets] #2232 imm: immModel_protocol51Allowed is not used in the code

2017-02-28 Thread Zoran Milinkovic
- **status**: accepted --> invalid
- **Comment**:

immModel_protocol51Allowed is not called anywhere because it is the interface 
for C files, and protocol 51 is currently checked only in C++ files.

immModel_protocol51Allowed can remain in the code for possible future use.
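
For readers unfamiliar with the pattern, a C-callable interface to a C++
method can look roughly like the sketch below. The `Model` class and the
`model_protocol51Allowed` wrapper are hypothetical and do not reproduce the
actual IMM signatures:

```cpp
// Hypothetical C++ class holding the protocol state.
class Model {
 public:
  explicit Model(bool allowed) : protocol51Allowed_(allowed) {}

  // C++ callers can use the member function directly.
  bool protocol51Allowed() const { return protocol51Allowed_; }

 private:
  bool protocol51Allowed_;
};

// C files cannot call member functions, so the wrapper takes an opaque
// pointer and is exported with C linkage.
extern "C" int model_protocol51Allowed(const void* model) {
  return static_cast<const Model*>(model)->protocol51Allowed() ? 1 : 0;
}
```

As long as only C++ files need the check, the wrapper has no callers, which
matches the situation described in the comment.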



---

** [tickets:#2232] imm: immModel_protocol51Allowed is not used in the code**

**Status:** invalid
**Milestone:** 5.1.1
**Created:** Mon Dec 19, 2016 03:23 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Dec 19, 2016 03:23 PM UTC
**Owner:** Zoran Milinkovic


Function immModel_protocol51Allowed is not used in the code.
The function should be removed from the code.




[tickets] [opensaf:tickets] #2320 clm: standby clmd crashes due to missing node information

2017-02-22 Thread Zoran Milinkovic
- Description has changed:

Diff:



--- old
+++ new
@@ -1,4 +1,25 @@
 The standby CLMD service crashed due to missing PL-3 information.
+
+syslog from SC-2:
+~~~
+Feb 13 00:43:31 SC-2-2 osafamfd[5082]: NO Cold sync complete!
+Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: Ruling epoch noted as:5
+Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO IMMND coord at 2010f
+Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: SaImmRepositoryInitModeT changed and noted as 'SA_IMM_KEEP_REPOSITORY'
+Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO NODE STATE-> IMM_NODE_R_AVAILABLE
+Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO NODE STATE-> IMM_NODE_FULLY_AVAILABLE 19082
+Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO Epoch set to 5 in ImmModel
+Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at node 2020f old epoch: 4  new epoch:5
+Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at node 2010f old epoch: 4  new epoch:5
+Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO IMMND coord at 2010f
+Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at node 2030f old epoch: 0  new epoch:5
+Feb 13 00:43:31 SC-2-2 osafclmd[5066]: ER Node is NULL,problem with the database.
+Feb 13 00:43:31 SC-2-2 osafclmd[5066]: ../../opensaf/src/clm/clmd/clms_mbcsv.c:468: ckpt_proc_node_rec: Assertion '0' failed.
+Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: NO 'safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'nodeFailfast'
+Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: ER safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
+Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131599, SupervisionTime = 60
+Feb 13 00:43:32 SC-2-2 opensaf_reboot: Rebooting local node; timeout=60
+~~~
 
 Coredump:
 ~~~






---

** [tickets:#2320] clm: standby clmd crashes due to missing node information**

**Status:** unassigned
**Milestone:** 5.2.FC
**Created:** Wed Feb 22, 2017 12:55 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Feb 22, 2017 12:55 PM UTC
**Owner:** nobody


The standby CLMD service crashed due to missing PL-3 information.

syslog from SC-2:
~~~
Feb 13 00:43:31 SC-2-2 osafamfd[5082]: NO Cold sync complete!
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: Ruling epoch noted as:5
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO IMMND coord at 2010f
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: SaImmRepositoryInitModeT changed and noted as 'SA_IMM_KEEP_REPOSITORY'
Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO NODE STATE-> IMM_NODE_R_AVAILABLE
Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO NODE STATE-> IMM_NODE_FULLY_AVAILABLE 19082
Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO Epoch set to 5 in ImmModel
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at node 2020f old epoch: 4  new epoch:5
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at node 2010f old epoch: 4  new epoch:5
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO IMMND coord at 2010f
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at node 2030f old epoch: 0  new epoch:5
Feb 13 00:43:31 SC-2-2 osafclmd[5066]: ER Node is NULL,problem with the database.
Feb 13 00:43:31 SC-2-2 osafclmd[5066]: ../../opensaf/src/clm/clmd/clms_mbcsv.c:468: ckpt_proc_node_rec: Assertion '0' failed.
Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: NO 'safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'nodeFailfast'
Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: ER safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131599, SupervisionTime = 60
Feb 13 00:43:32 SC-2-2 opensaf_reboot: Rebooting local node; timeout=60
~~~

Coredump:
~~~
[New LWP 5066]
[New LWP 5069]
[New LWP 5068]
[New LWP 5070]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/lib64/opensaf/osafclmd'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fbc7be880c7 in raise () from /lib64/libc.so.6
### BT ###
#0  0x7fbc7be880c7 in raise () from /lib64/libc.so.6
#1  0x7fbc7be89478 in abort () from /lib64/libc.so.6
#2  0x7fbc7c85202e in __osafassert_fail (__file=__file@entry=0x7fbc7e1b7d50 
"../../opensaf/src/clm/clmd/clms_mbcsv.c", __line=__line@entry=468, 
__func=__func@entry=0x7fbc7e1b8820 <__FUNCTION__.12739> "ckpt_proc_node_rec", 
__assertion=__assertion@entry=0x7fbc7e1b78ea "0") at 
../../opensaf/src/base/sysf_def.c:281
#3  0x7fbc7e1aa016 in ckpt_proc_node_rec (c

[tickets] [opensaf:tickets] #2320 clm: standby clmd crashes due to missing node information

2017-02-22 Thread Zoran Milinkovic



---

** [tickets:#2320] clm: standby clmd crashes due to missing node information**

**Status:** unassigned
**Milestone:** 5.2.FC
**Created:** Wed Feb 22, 2017 12:55 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Feb 22, 2017 12:55 PM UTC
**Owner:** nobody


The standby CLMD service crashed due to missing PL-3 information.

Coredump:
~~~
[New LWP 5066]
[New LWP 5069]
[New LWP 5068]
[New LWP 5070]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/lib64/opensaf/osafclmd'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fbc7be880c7 in raise () from /lib64/libc.so.6
### BT ###
#0  0x7fbc7be880c7 in raise () from /lib64/libc.so.6
#1  0x7fbc7be89478 in abort () from /lib64/libc.so.6
#2  0x7fbc7c85202e in __osafassert_fail (__file=__file@entry=0x7fbc7e1b7d50 
"../../opensaf/src/clm/clmd/clms_mbcsv.c", __line=__line@entry=468, 
__func=__func@entry=0x7fbc7e1b8820 <__FUNCTION__.12739> "ckpt_proc_node_rec", 
__assertion=__assertion@entry=0x7fbc7e1b78ea "0") at 
../../opensaf/src/base/sysf_def.c:281
#3  0x7fbc7e1aa016 in ckpt_proc_node_rec (cb=, 
data=0x7fbc7f218a50) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:468
#4  0x7fbc7e1ae044 in ckpt_decode_async_update (cbk_arg=, 
cb=0x7fbc7e3be100 <_clms_cb>) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:2310
#5  ckpt_decode_cbk_handler (cbk_arg=0x7fff27b6b1a0) at 
../../opensaf/src/clm/clmd/clms_mbcsv.c:1997
#6  mbcsv_callback (arg=0x7fff27b6b1a0) at 
../../opensaf/src/clm/clmd/clms_mbcsv.c:719
#7  0x7fbc7c856f76 in ncs_mbscv_rcv_decode (peer=peer@entry=0x7fbc7f217a60, 
evt=evt@entry=0x7fbc740036e0) at ../../opensaf/src/mbc/mbcsv_act.c:393
#8  0x7fbc7c857146 in ncs_mbcsv_rcv_async_update (peer=0x7fbc7f217a60, 
evt=0x7fbc740036e0) at ../../opensaf/src/mbc/mbcsv_act.c:440
#9  0x7fbc7c85dd30 in mbcsv_process_events (rcvd_evt=0x7fbc740036e0, 
mbcsv_hdl=mbcsv_hdl@entry=4293918753) at 
../../opensaf/src/mbc/mbcsv_pr_evts.c:168
#10 0x7fbc7c85de9b in mbcsv_hdl_dispatch_all (mbcsv_hdl=4293918753, 
mbx=mbx@entry=4283432961) at ../../opensaf/src/mbc/mbcsv_pr_evts.c:272
#11 0x7fbc7c8586c2 in mbcsv_process_dispatch_request (arg=0x7fff27b6b310) 
at ../../opensaf/src/mbc/mbcsv_api.c:423
#12 0x7fbc7e1aa7be in clms_mbcsv_dispatch (mbcsv_hdl=) at 
../../opensaf/src/clm/clmd/clms_mbcsv.c:687
#13 0x7fbc7e19e4e4 in main (argc=, argv=) at 
../../opensaf/src/clm/clmd/clms_main.c:535
### BT FULL ###
#0  0x7fbc7be880c7 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x7fbc7be89478 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x7fbc7c85202e in __osafassert_fail (__file=__file@entry=0x7fbc7e1b7d50 
"../../opensaf/src/clm/clmd/clms_mbcsv.c", __line=__line@entry=468, 
__func=__func@entry=0x7fbc7e1b8820 <__FUNCTION__.12739> "ckpt_proc_node_rec", 
__assertion=__assertion@entry=0x7fbc7e1b78ea "0") at 
../../opensaf/src/base/sysf_def.c:281
No locals.
#3  0x7fbc7e1aa016 in ckpt_proc_node_rec (cb=, 
data=0x7fbc7f218a50) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:468
param = 0x7fbc7f218a60
node = 0x0
ip = 0x0
__FUNCTION__ = "ckpt_proc_node_rec"
#4  0x7fbc7e1ae044 in ckpt_decode_async_update (cbk_arg=, 
cb=0x7fbc7e3be100 <_clms_cb>) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:2310
ckpt_cluster_rec = 
rc = 1
num_bytes = 
hdr = 0x7fbc7f218a50
ckpt_finalize_rec = 
ckpt_node_rec = 
ckpt_node_config_rec = 
ckpt_node_del_rec = 
ckpt_node_down_rec = 
ckpt_msg = 0x7fbc7f218a50
ckpt_client_rec = 
ckpt_csync_node_rec = 
ckpt_agent_down = 
#5  ckpt_decode_cbk_handler (cbk_arg=0x7fff27b6b1a0) at 
../../opensaf/src/clm/clmd/clms_mbcsv.c:1997
rc = 1
msg_fmt_version = 1
#6  mbcsv_callback (arg=0x7fff27b6b1a0) at 
../../opensaf/src/clm/clmd/clms_mbcsv.c:719
rc = 1
__FUNCTION__ = "mbcsv_callback"
#7  0x7fbc7c856f76 in ncs_mbscv_rcv_decode (peer=peer@entry=0x7fbc7f217a60, 
evt=evt@entry=0x7fbc740036e0) at ../../opensaf/src/mbc/mbcsv_act.c:393
parg = {
  i_op = NCS_MBCSV_CBOP_DEC,
  i_client_hdl = 0,
  i_ckpt_hdl = 4292870177,
  info = {
encode = {
  io_msg_type = NCS_MBCSV_MSG_ASYNC_UPDATE,
  io_action = NCS_MBCSV_ACT_ADD,
  io_reo_type = 6,
  io_reo_hdl = 0,
  io_uba = {
start = 0x0,
ub = 0x0,
bufp = 0x70 ,
res = 112,
ttl = 0,
max = 2132894108
  },
  io_req_context = 9209973925752930305,
  i_peer_version = 21264
},
decode = {
  i_msg_type = NC

[tickets] [opensaf:tickets] #2304 imm: osafimmpbed creates coredump due to double free memory

2017-02-22 Thread Zoran Milinkovic
- **status**: review --> fixed
- **Comment**:

default(5.2):

changeset:   8608:b50e7fd1fa07
tag:         tip
parent:      8605:0c6da910d0d4
user:        Zoran Milinkovic
date:        Mon Feb 13 13:36:49 2017 +0100
summary:     imm: fix PBE coredump for double freeing memory [#2304]



---

** [tickets:#2304] imm: osafimmpbed creates coredump due to double free memory**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Mon Feb 13, 2017 11:57 AM UTC by Zoran Milinkovic
**Last Updated:** Mon Feb 13, 2017 12:48 PM UTC
**Owner:** Zoran Milinkovic


When IMM runs with code coverage enabled, osafimmpbed often dumps core.
The problem comes from exit being called from two threads, the main thread and 
the MDS thread. Both threads try to run the destructor of a static variable in 
the IMM PBE library.

I think this is a timing issue, which is why we have not seen the error 
earlier. With the code coverage flag, the problem occurs approximately once a 
day.
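
One common way to avoid running exit handlers and static destructors twice is
to let only the first thread that wants to terminate proceed. The sketch below
illustrates that idea only. It is not the change made in the changeset for
this ticket, which is not shown here, and `gExitStarted` and `tryBeginExit`
are hypothetical names:

```cpp
#include <atomic>

// Set once by the first thread that starts process termination.
std::atomic<bool> gExitStarted{false};

// Returns true for exactly one caller, the first one. Every later caller,
// from any thread, gets false and must not call exit() itself.
bool tryBeginExit() {
  return !gExitStarted.exchange(true);
}
```

A thread that wants to terminate the process would call `exit()` only when
`tryBeginExit()` returns true. A thread that loses the race would instead stop
its own work without starting a second exit sequence.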

GDB coredump backtrace:
~~~
[New LWP 1888]
[New LWP 1884]
[New LWP 1887]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/local/lib/opensaf/osafimmpbed --pbe 
/srv/shared/imm//imm.db'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fc923fbcc37 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56

Thread 3 (Thread 0x7fc9258e8b00 (LWP 1887)):
#0  0x7fc924072fdd in poll () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1  0x7fc924ad909b in osaf_poll_no_timeout (io_fds=0x7fc9258e8290, 
i_nfds=1) at src/base/osaf_poll.c:32
result = 32713
#2  0x7fc924ad9248 in osaf_ppoll (io_fds=0x7fc9258e8290, i_nfds=1, 
i_timeout_ts=0x0, i_sigmask=0x0) at src/base/osaf_poll.c:79
millisecond_round_up = {tv_sec = 0, tv_nsec = 99}
max_possible_timeout = {tv_sec = 2147483, tv_nsec = 64700}
start_time = {tv_sec = 17179869186, tv_nsec = 140501895252736}
time_left_ts = {tv_sec = 1, tv_nsec = 1}
result = 615339859
#3  0x7fc924ae95cf in ncs_tmr_wait () at src/base/sysf_tmr.c:409
rc = 1
inds_rmvd = 1
next_delay = 0
tv = {tv_sec = 16777215, tv_usec = 0}
ts_current = {tv_sec = 216961, tv_nsec = 620030550}
ts = {tv_sec = 16777215, tv_nsec = 0}
set = {fd = 8, events = 1, revents = 0}
#4  0x7fc924353184 in start_thread (arg=0x7fc9258e8b00) at 
pthread_create.c:312
__res = 
pd = 0x7fc9258e8b00
now = 
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140501895252736, 
-8535808571625374835, 1, 1, 140501895253440, 140501895252736, 
8509828887122344845, 8509832138929142669}, mask_was_saved = 0}}, priv = {pad = 
{0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = 
pagesize_m1 = 
sp = 
freesize = 
__PRETTY_FUNCTION__ = "start_thread"
#5  0x7fc92408037d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
No locals.

Thread 2 (Thread 0x7fc9258eb780 (LWP 1884)):
#0  0x7fc92435a64a in do_fcntl (arg=0x7ffd194a7070, cmd=7, fd=22) at 
../sysdeps/unix/sysv/linux/fcntl.c:39
resultvar = 18446744073709551104
#1  __libc_fcntl (fd=22, cmd=) at 
../sysdeps/unix/sysv/linux/fcntl.c:92
ap = {{gp_offset = 16, fp_offset = 32713, overflow_arg_area = 
0x7ffd194a7070, reg_save_area = 0x7ffd194a7030}}
arg = 0x7ffd194a7070
oldtype = 0
#2  0x7fc925270985 in __gcov_open () from 
/usr/local/lib/opensaf/libosaf_common.so.0
No symbol table info available.
#3  0x7fc9252714ee in gcov_exit () from 
/usr/local/lib/opensaf/libosaf_common.so.0
No symbol table info available.
#4  0x7fc923fc21a9 in __run_exit_handlers (status=1, listp=0x7fc9243446c8 
<__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
atfct = 
onfct = 
cxafct = 
f = 
#5  0x7fc923fc21f5 in __GI_exit (status=) at exit.c:104
No locals.
#6  0x55828aa7c60c in pbeDaemon (immHandle=4230542917903, 
dbHandle=0x55828bb010e8, ownerHandle=1483565869334821379, 
classIdMap=0x7ffd194abc10, objCount=335, pbe2=false, pbe2B=false) at 
src/imm/immpbed/immpbe_daemon.cc:2343
error = SA_AIS_OK
ci = {first = , second = }
__FUNCTION__ = "pbeDaemon"
#7  0x55828aa6b408 in main (argc=3, argv=0x7ffd194abdd8) at 
src/imm/immpbed/immpbe.cc:354
localTmpFilename = ""
pbeRecoverFile = true
dbHandle = 0x55828bb010e8
classIdMap = std::map with 62 elements = {["OpenSafLogConfig"] = 
0x55828bb83290, ["OpenSafLogCurrentConfig"] = 0x55828bb792a0, 
["OpenSafSmfCampRestartIndicator"] = 0x55828bb7cdb0, 
["OpenSafSmfCampRestartInfo"] = 0x55828bb83b80, ["OpenSafSmfConfig"] = 
0x55828bb79090, ["OpenSafSmfExecControl"] = 0x55828bb7d0d0, ["OpenSafSmfMisc"] 
= 0x558

[tickets] [opensaf:tickets] #2261 imm: PBE is crashing when OpenSAF is shutting down

2017-02-22 Thread Zoran Milinkovic
- **status**: assigned --> duplicate
- **Comment**:

Duplicate of #2304



---

** [tickets:#2261] imm: PBE is crashing when OpenSAF is shutting down**

**Status:** duplicate
**Milestone:** 5.0.2
**Created:** Fri Jan 13, 2017 09:26 AM UTC by Zoran Milinkovic
**Last Updated:** Fri Jan 13, 2017 09:26 AM UTC
**Owner:** Zoran Milinkovic


PBE sometimes crashes while OpenSAF is shutting down.

~~~
[New LWP 1888]
[New LWP 1884]
[New LWP 1887]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/local/lib/opensaf/osafimmpbed --pbe 
/srv/shared/imm//imm.db'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fc923fbcc37 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56

Thread 3 (Thread 0x7fc9258e8b00 (LWP 1887)):
#0  0x7fc924072fdd in poll () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1  0x7fc924ad909b in osaf_poll_no_timeout (io_fds=0x7fc9258e8290, 
i_nfds=1) at src/base/osaf_poll.c:32
#2  0x7fc924ad9248 in osaf_ppoll (io_fds=0x7fc9258e8290, i_nfds=1, 
i_timeout_ts=0x0, i_sigmask=0x0) at src/base/osaf_poll.c:79
#3  0x7fc924ae95cf in ncs_tmr_wait () at src/base/sysf_tmr.c:409
#4  0x7fc924353184 in start_thread (arg=0x7fc9258e8b00) at 
pthread_create.c:312
#5  0x7fc92408037d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
No locals.

Thread 2 (Thread 0x7fc9258eb780 (LWP 1884)):
#0  0x7fc92435a64a in do_fcntl (arg=0x7ffd194a7070, cmd=7, fd=22) at 
../sysdeps/unix/sysv/linux/fcntl.c:39
#1  __libc_fcntl (fd=22, cmd=) at 
../sysdeps/unix/sysv/linux/fcntl.c:92
#2  0x7fc925270985 in __gcov_open () from 
/usr/local/lib/opensaf/libosaf_common.so.0
No symbol table info available.
#3  0x7fc9252714ee in gcov_exit () from 
/usr/local/lib/opensaf/libosaf_common.so.0
No symbol table info available.
#4  0x7fc923fc21a9 in __run_exit_handlers (status=1, listp=0x7fc9243446c8 
<__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#5  0x7fc923fc21f5 in __GI_exit (status=) at exit.c:104
No locals.
#6  0x55828aa7c60c in pbeDaemon (immHandle=4230542917903, 
dbHandle=0x55828bb010e8, ownerHandle=1483565869334821379, 
classIdMap=0x7ffd194abc10, objCount=335, pbe2=false, pbe2B=false) at 
src/imm/immpbed/immpbe_daemon.cc:2343
#7  0x55828aa6b408 in main (argc=3, argv=0x7ffd194abdd8) at 
src/imm/immpbed/immpbe.cc:354

Thread 1 (Thread 0x7fc9258c8b00 (LWP 1888)):
#0  0x7fc923fbcc37 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7fc923fc0028 in __GI_abort () at abort.c:89
#2  0x7fc923ff92a4 in __libc_message (do_abort=do_abort@entry=1, 
fmt=fmt@entry=0x7fc9241076b0 "*** Error in `%s': %s: 0x%s ***\n") at 
../sysdeps/posix/libc_fatal.c:175
#3  0x7fc92400555e in malloc_printerr (ptr=, 
str=0x7fc924107878 "double free or corruption (fasttop)", action=1) at 
malloc.c:4996
#4  _int_free (av=, p=, have_lock=0) at 
malloc.c:3840
#5  0x7fc92483936f in std::basic_string<char, std::char_traits<char>, 
std::allocator<char> >::~basic_string() () from 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#6  0x7fc923fc253a in __cxa_finalize (d=0x7fc9256c6c80) at cxa_finalize.c:56
#7  0x7fc925493833 in __do_global_dtors_aux () from 
/usr/local/lib/opensaf/libimmpbe_dump.so.0
No symbol table info available.
#8  0x7fc9258c7dc0 in ?? ()
No symbol table info available.
#9  0x7fc9256e870a in _dl_fini () at dl-fini.c:252
Backtrace stopped: frame did not save the PC
56  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
~~~




[tickets] [opensaf:tickets] #2304 imm: osafimmpbed creates coredump due to double free memory

2017-02-13 Thread Zoran Milinkovic
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/35663267/



---

** [tickets:#2304] imm: osafimmpbed creates coredump due to double free memory**

**Status:** review
**Milestone:** 5.2.FC
**Created:** Mon Feb 13, 2017 11:57 AM UTC by Zoran Milinkovic
**Last Updated:** Mon Feb 13, 2017 11:57 AM UTC
**Owner:** Zoran Milinkovic


When IMM runs with code coverage enabled, osafimmpbed often dumps core.
The problem comes from exit being called twice, once from the main thread and 
once from the MDS thread. Both threads try to run the destructor of a static 
variable in the IMM PBE library.

I think this is a timing issue and we have not seen this error earlier. With 
the code coverage flag, the problem occurs approximately once a day.
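
The race described above can be avoided by letting only one thread enter exit(). The following is a minimal sketch of that idea, not the fix that was applied for this ticket; `claim_exit` and `exit_once` are hypothetical names, and parking the losing thread with pause() is an assumption about what is acceptable for the daemon.

```cpp
#include <atomic>
#include <cassert>
#include <cstdlib>
#include <unistd.h>

namespace {
// Set by the first thread that decides to terminate the process.
std::atomic_flag exit_claimed = ATOMIC_FLAG_INIT;
}  // namespace

// Returns true only for the first caller; every later caller gets false.
bool claim_exit() {
  return !exit_claimed.test_and_set(std::memory_order_acq_rel);
}

// Hypothetical replacement for direct exit() calls made from several threads.
[[noreturn]] void exit_once(int status) {
  if (claim_exit()) {
    // Exit handlers and static destructors run here, in one thread only.
    std::exit(status);
  }
  // Another thread is already inside exit(): park this thread until the
  // process terminates instead of running the destructors a second time.
  for (;;) {
    pause();
  }
}
```

With this sketch, both the main thread and the MDS thread would call `exit_once(status)` instead of `exit(status)`, so the static destructor in the library runs at most once.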

GDB coredump backtrace:
~~~
[New LWP 1888]
[New LWP 1884]
[New LWP 1887]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/local/lib/opensaf/osafimmpbed --pbe 
/srv/shared/imm//imm.db'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fc923fbcc37 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56

Thread 3 (Thread 0x7fc9258e8b00 (LWP 1887)):
#0  0x7fc924072fdd in poll () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1  0x7fc924ad909b in osaf_poll_no_timeout (io_fds=0x7fc9258e8290, 
i_nfds=1) at src/base/osaf_poll.c:32
result = 32713
#2  0x7fc924ad9248 in osaf_ppoll (io_fds=0x7fc9258e8290, i_nfds=1, 
i_timeout_ts=0x0, i_sigmask=0x0) at src/base/osaf_poll.c:79
millisecond_round_up = {tv_sec = 0, tv_nsec = 99}
max_possible_timeout = {tv_sec = 2147483, tv_nsec = 64700}
start_time = {tv_sec = 17179869186, tv_nsec = 140501895252736}
time_left_ts = {tv_sec = 1, tv_nsec = 1}
result = 615339859
#3  0x7fc924ae95cf in ncs_tmr_wait () at src/base/sysf_tmr.c:409
rc = 1
inds_rmvd = 1
next_delay = 0
tv = {tv_sec = 16777215, tv_usec = 0}
ts_current = {tv_sec = 216961, tv_nsec = 620030550}
ts = {tv_sec = 16777215, tv_nsec = 0}
set = {fd = 8, events = 1, revents = 0}
#4  0x7fc924353184 in start_thread (arg=0x7fc9258e8b00) at 
pthread_create.c:312
__res = <optimized out>
pd = 0x7fc9258e8b00
now = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140501895252736, 
-8535808571625374835, 1, 1, 140501895253440, 140501895252736, 
8509828887122344845, 8509832138929142669}, mask_was_saved = 0}}, priv = {pad = 
{0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
pagesize_m1 = <optimized out>
sp = <optimized out>
freesize = <optimized out>
__PRETTY_FUNCTION__ = "start_thread"
#5  0x7fc92408037d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
No locals.

Thread 2 (Thread 0x7fc9258eb780 (LWP 1884)):
#0  0x7fc92435a64a in do_fcntl (arg=0x7ffd194a7070, cmd=7, fd=22) at 
../sysdeps/unix/sysv/linux/fcntl.c:39
resultvar = 18446744073709551104
#1  __libc_fcntl (fd=22, cmd=<optimized out>) at 
../sysdeps/unix/sysv/linux/fcntl.c:92
ap = {{gp_offset = 16, fp_offset = 32713, overflow_arg_area = 
0x7ffd194a7070, reg_save_area = 0x7ffd194a7030}}
arg = 0x7ffd194a7070
oldtype = 0
#2  0x7fc925270985 in __gcov_open () from 
/usr/local/lib/opensaf/libosaf_common.so.0
No symbol table info available.
#3  0x7fc9252714ee in gcov_exit () from 
/usr/local/lib/opensaf/libosaf_common.so.0
No symbol table info available.
#4  0x7fc923fc21a9 in __run_exit_handlers (status=1, listp=0x7fc9243446c8 
<__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
atfct = <optimized out>
onfct = <optimized out>
cxafct = <optimized out>
f = <optimized out>
#5  0x7fc923fc21f5 in __GI_exit (status=<optimized out>) at exit.c:104
No locals.
#6  0x55828aa7c60c in pbeDaemon (immHandle=4230542917903, 
dbHandle=0x55828bb010e8, ownerHandle=1483565869334821379, 
classIdMap=0x7ffd194abc10, objCount=335, pbe2=false, pbe2B=false) at 
src/imm/immpbed/immpbe_daemon.cc:2343
error = SA_AIS_OK
ci = {first = , second = }
__FUNCTION__ = "pbeDaemon"
#7  0x55828aa6b408 in main (argc=3, argv=0x7ffd194abdd8) at 
src/imm/immpbed/immpbe.cc:354
localTmpFilename = ""
pbeRecoverFile = true
dbHandle = 0x55828bb010e8
classIdMap = std::map with 62 elements = {["OpenSafLogConfig"] = 
0x55828bb83290, ["OpenSafLogCurrentConfig"] = 0x55828bb792a0, 
["OpenSafSmfCampRestartIndicator"] = 0x55828bb7cdb0, 
["OpenSafSmfCampRestartInfo"] = 0x55828bb83b80, ["OpenSafSmfConfig"] = 
0x55828bb79090, ["OpenSafSmfExecControl"] = 0x55828bb7d0d0, ["OpenSafSmfMisc"] 
= 0x55828bb7bbc0, ["OpenSafSmfPbeIndicator"] = 0x55828bb79420, 
["OpenSafSmfRollbackData"] = 0x55828bb7a650, ["OpenSafSmfRollbackElement"] = 
0x55828b

[tickets] [opensaf:tickets] #2304 imm: osafimmpbed creates coredump due to double free memory

2017-02-13 Thread Zoran Milinkovic



---

** [tickets:#2304] imm: osafimmpbed creates coredump due to double free memory**

**Status:** accepted
**Milestone:** 5.2.FC
**Created:** Mon Feb 13, 2017 11:57 AM UTC by Zoran Milinkovic
**Last Updated:** Mon Feb 13, 2017 11:57 AM UTC
**Owner:** Zoran Milinkovic


When IMM runs with code coverage enabled, osafimmpbed often dumps core.
The problem comes from exit being called twice, once from the main thread and 
once from the MDS thread. Both threads try to run the destructor of a static 
variable in the IMM PBE library.

I think this is a timing issue and we have not seen this error earlier. With 
the code coverage flag, the problem occurs approximately once a day.

GDB coredump backtrace:
~~~
[New LWP 1888]
[New LWP 1884]
[New LWP 1887]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/local/lib/opensaf/osafimmpbed --pbe 
/srv/shared/imm//imm.db'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fc923fbcc37 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56

Thread 3 (Thread 0x7fc9258e8b00 (LWP 1887)):
#0  0x7fc924072fdd in poll () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1  0x7fc924ad909b in osaf_poll_no_timeout (io_fds=0x7fc9258e8290, 
i_nfds=1) at src/base/osaf_poll.c:32
result = 32713
#2  0x7fc924ad9248 in osaf_ppoll (io_fds=0x7fc9258e8290, i_nfds=1, 
i_timeout_ts=0x0, i_sigmask=0x0) at src/base/osaf_poll.c:79
millisecond_round_up = {tv_sec = 0, tv_nsec = 99}
max_possible_timeout = {tv_sec = 2147483, tv_nsec = 64700}
start_time = {tv_sec = 17179869186, tv_nsec = 140501895252736}
time_left_ts = {tv_sec = 1, tv_nsec = 1}
result = 615339859
#3  0x7fc924ae95cf in ncs_tmr_wait () at src/base/sysf_tmr.c:409
rc = 1
inds_rmvd = 1
next_delay = 0
tv = {tv_sec = 16777215, tv_usec = 0}
ts_current = {tv_sec = 216961, tv_nsec = 620030550}
ts = {tv_sec = 16777215, tv_nsec = 0}
set = {fd = 8, events = 1, revents = 0}
#4  0x7fc924353184 in start_thread (arg=0x7fc9258e8b00) at 
pthread_create.c:312
__res = <optimized out>
pd = 0x7fc9258e8b00
now = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140501895252736, 
-8535808571625374835, 1, 1, 140501895253440, 140501895252736, 
8509828887122344845, 8509832138929142669}, mask_was_saved = 0}}, priv = {pad = 
{0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
pagesize_m1 = <optimized out>
sp = <optimized out>
freesize = <optimized out>
__PRETTY_FUNCTION__ = "start_thread"
#5  0x7fc92408037d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
No locals.

Thread 2 (Thread 0x7fc9258eb780 (LWP 1884)):
#0  0x7fc92435a64a in do_fcntl (arg=0x7ffd194a7070, cmd=7, fd=22) at 
../sysdeps/unix/sysv/linux/fcntl.c:39
resultvar = 18446744073709551104
#1  __libc_fcntl (fd=22, cmd=<optimized out>) at 
../sysdeps/unix/sysv/linux/fcntl.c:92
ap = {{gp_offset = 16, fp_offset = 32713, overflow_arg_area = 
0x7ffd194a7070, reg_save_area = 0x7ffd194a7030}}
arg = 0x7ffd194a7070
oldtype = 0
#2  0x7fc925270985 in __gcov_open () from 
/usr/local/lib/opensaf/libosaf_common.so.0
No symbol table info available.
#3  0x7fc9252714ee in gcov_exit () from 
/usr/local/lib/opensaf/libosaf_common.so.0
No symbol table info available.
#4  0x7fc923fc21a9 in __run_exit_handlers (status=1, listp=0x7fc9243446c8 
<__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
atfct = <optimized out>
onfct = <optimized out>
cxafct = <optimized out>
f = <optimized out>
#5  0x7fc923fc21f5 in __GI_exit (status=<optimized out>) at exit.c:104
No locals.
#6  0x55828aa7c60c in pbeDaemon (immHandle=4230542917903, 
dbHandle=0x55828bb010e8, ownerHandle=1483565869334821379, 
classIdMap=0x7ffd194abc10, objCount=335, pbe2=false, pbe2B=false) at 
src/imm/immpbed/immpbe_daemon.cc:2343
error = SA_AIS_OK
ci = {first = , second = }
__FUNCTION__ = "pbeDaemon"
#7  0x55828aa6b408 in main (argc=3, argv=0x7ffd194abdd8) at 
src/imm/immpbed/immpbe.cc:354
localTmpFilename = ""
pbeRecoverFile = true
dbHandle = 0x55828bb010e8
classIdMap = std::map with 62 elements = {["OpenSafLogConfig"] = 
0x55828bb83290, ["OpenSafLogCurrentConfig"] = 0x55828bb792a0, 
["OpenSafSmfCampRestartIndicator"] = 0x55828bb7cdb0, 
["OpenSafSmfCampRestartInfo"] = 0x55828bb83b80, ["OpenSafSmfConfig"] = 
0x55828bb79090, ["OpenSafSmfExecControl"] = 0x55828bb7d0d0, ["OpenSafSmfMisc"] 
= 0x55828bb7bbc0, ["OpenSafSmfPbeIndicator"] = 0x55828bb79420, 
["OpenSafSmfRollbackData"] = 0x55828bb7a650, ["OpenSafSmfRollbackElement"] = 
0x55828bb7afd0, ["OpenSafSmfSingleStepInfo"] = 0x55828bb79b50, 
["OpensafConfig"] = 0x

[tickets] [opensaf:tickets] #2299 msg: high CPU usage of osafmsgnd on payloads after headless state

2017-02-10 Thread Zoran Milinkovic
strace on the osafmsgnd process shows that osafmsgnd is in a poll loop on a 
closed file descriptor (17):
poll([{fd=21, events=POLLIN}, {fd=15, events=POLLIN}, {fd=17, events=POLLIN}, 
{fd=12, events=POLLIN}, {fd=19, events=POLLIN}], 5, 4294967295) = 1 ([{fd=17, 
revents=POLLNVAL}])

My guess is that it is the CLM file descriptor.
MSG also reported to syslog:
Feb 10 14:44:19 PL-3 osafmsgnd[3749]: ER saClmDispatch Failed with error 9
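
The strace output shows why the CPU usage is high: poll() reports POLLNVAL for a closed descriptor immediately, even with an infinite timeout, so the loop never blocks. Below is a minimal sketch of one way to break such a loop by dropping invalid descriptors from the poll set. It is an illustration only, not the osafmsgnd code; `drop_invalid_fds` and `closed_fd_reports_pollnval` are hypothetical names.

```cpp
#include <poll.h>
#include <unistd.h>

#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Removes every entry that poll() flagged with POLLNVAL and returns how many
// entries were dropped.
std::size_t drop_invalid_fds(std::vector<pollfd>& fds) {
  const auto first_bad = std::remove_if(
      fds.begin(), fds.end(),
      [](const pollfd& p) { return (p.revents & POLLNVAL) != 0; });
  const std::size_t dropped = static_cast<std::size_t>(fds.end() - first_bad);
  fds.erase(first_bad, fds.end());
  return dropped;
}

// Reproduces the failure mode with two pipes: a closed descriptor makes
// poll() return at once with POLLNVAL, and pruning removes it from the set.
bool closed_fd_reports_pollnval() {
  int a[2];
  int b[2];
  if (pipe(a) != 0 || pipe(b) != 0) return false;
  close(a[0]);  // a[0] is now a stale descriptor number
  std::vector<pollfd> fds = {{a[0], POLLIN, 0}, {b[0], POLLIN, 0}};
  const int ready = poll(fds.data(), fds.size(), 0);
  const bool flagged = ready == 1 && (fds[0].revents & POLLNVAL) != 0;
  const bool pruned =
      drop_invalid_fds(fds) == 1 && fds.size() == 1 && fds[0].fd == b[0];
  close(a[1]);
  close(b[0]);
  close(b[1]);
  return flagged && pruned;
}
```

In a real daemon, the entry would more likely be replaced by a freshly initialized handle's descriptor than dropped for good; the sketch only shows how to detect the stale descriptor and stop the loop from spinning on it.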


---

** [tickets:#2299] msg: high CPU usage of osafmsgnd on payloads after headless 
state**

**Status:** unassigned
**Milestone:** 5.2.FC
**Created:** Fri Feb 10, 2017 01:29 PM UTC by Zoran Milinkovic
**Last Updated:** Fri Feb 10, 2017 01:29 PM UTC
**Owner:** nobody


When a cluster recovers from the headless state, osafmsgnd on the payload runs 
with high CPU usage.
Running 'top':
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 3537 root      20   0  199828  20884  20392 R 98,0  1,0   0:53.96 osafmsgnd
 
 To reproduce the problem, it is enough to run one controller and one payload:
 1. Start SC-1
 2. Start PL-3
 3. When the sync is done, stop SC-1
 4. Start SC-1
 5. Run 'top' on PL-3
 
 The MSG problem was detected on the default branch, but it may also exist in 
earlier versions.






[tickets] [opensaf:tickets] #2297 mds: improve MDS logging

2017-02-10 Thread Zoran Milinkovic
- **status**: review --> fixed
- **Comment**:

changeset:   8570:ba3b43b87c76
tag:         tip
user:        Zoran Milinkovic
date:        Thu Feb 09 14:32:58 2017 +0100
summary:     mds: improve MDS logging [#2297]



---

** [tickets:#2297] mds: improve MDS logging**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Thu Feb 09, 2017 01:15 PM UTC by Zoran Milinkovic
**Last Updated:** Thu Feb 09, 2017 01:43 PM UTC
**Owner:** Zoran Milinkovic


Example:
Nov 17 18:11:44.259636 osafamfd[500] ERR |MDS_SND_RCV: Timeout or Error occured
Nov 17 18:11:44.259779 osafamfd[500] ERR |MDS_SND_RCV: Timeout occured on 
sndrsp message
Nov 17 18:11:44.259817 osafamfd[500] ERR |MDS_SND_RCV: Adest=<0x0002010f,439>

In the first log, it is not obvious whether the MDS send failed because of a 
timeout or because of an error.
If the MDS send fails because of an error, the second message will still say 
that it is because of the timeout, which is not correct.

The first message should distinguish between the timeout and error cases. In 
the error case, errno can be printed to ease debugging.

The second message should be changed from "Timeout occured." to "Timeout or 
error occured."
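
The proposal for the first message can be sketched as follows. This is only an illustration of separating the two cases and printing errno, not the code from the changeset above; `SendOutcome`, `describe_send_failure` and the exact message wording are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical outcome of a send-and-wait-for-response call.
enum class SendOutcome { kOk, kTimeout, kError };

// Builds a log line that names the actual failure cause. For kError, err is
// the errno value saved right after the failing call.
std::string describe_send_failure(SendOutcome outcome, int err) {
  switch (outcome) {
    case SendOutcome::kTimeout:
      return "MDS_SND_RCV: Timeout occurred on sndrsp message";
    case SendOutcome::kError:
      return "MDS_SND_RCV: Error occurred on sndrsp message, errno=" +
             std::to_string(err) + " (" + std::strerror(err) + ")";
    case SendOutcome::kOk:
      break;
  }
  return "MDS_SND_RCV: Success";
}
```

A caller would save errno immediately after the failing call and pass it in, so that later library calls cannot overwrite the value before it is logged.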



