date:20151102

[tickets] [opensaf:tickets] #1581 pyosaf: Make log level configurable in the SafLogger utility class

2015-11-02 Thread Johan Mårtensson




---

** [tickets:#1581] pyosaf: Make log level configurable in the SafLogger utility 
class**

**Status:** assigned
**Milestone:** 5.0.FC
**Created:** Mon Nov 02, 2015 03:08 PM UTC by Johan Mårtensson
**Last Updated:** Mon Nov 02, 2015 03:08 PM UTC
**Owner:** Johan Mårtensson


In the SafLogger::log method the log level is hard-coded to notice. This should 
be fixed so that it's configurable.
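
A minimal sketch of what a configurable level could look like. The class below is a hypothetical stand-in, not the actual pyosaf SafLogger; the `Severity` names, the `emit` callable and the `default_severity` parameter are all assumptions made for illustration.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity levels, used only by this sketch."""
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFO = 6


class ConfigurableLogger:
    """Stand-in logger: the severity is a parameter instead of a constant."""

    def __init__(self, emit, default_severity=Severity.NOTICE):
        self._emit = emit                      # callable(severity, message)
        self.default_severity = default_severity

    def log(self, message, severity=None):
        # Fall back to the configured default when no level is given.
        level = self.default_severity if severity is None else severity
        self._emit(level, message)


records = []
logger = ConfigurableLogger(lambda sev, msg: records.append((sev, msg)))
logger.log("started")                       # uses the default (NOTICE)
logger.log("disk full", Severity.ERROR)     # per-call override
```

Keeping NOTICE as the default would leave existing callers unaffected, while new callers can pick a level per call or per logger instance.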


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.--
___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets

[tickets] [opensaf:tickets] #1527 log: terminated due to use SaImmOiHandleT concurrently from 02 threads

2015-11-02 Thread Vu Minh Nguyen

- **status**: unassigned --> assigned
- **assigned_to**: Vu Minh Nguyen



---

** [tickets:#1527] log: terminated due to use SaImmOiHandleT concurrently from 
02 threads**

**Status:** assigned
**Milestone:** 5.0.FC
**Created:** Wed Oct 07, 2015 10:59 AM UTC by Vu Minh Nguyen
**Last Updated:** Sun Nov 01, 2015 09:36 PM UTC
**Owner:** Vu Minh Nguyen


When the standby takes the active role, the "new" active logsv starts a thread, 
`imm_impl_restore_thread`, to set the OI implementer for the LOG service. 
Meanwhile the main thread keeps running, ready to receive incoming requests. 

So two threads, `imm_impl_restore_thread` and the main thread, use one OiHandle 
concurrently. This violates the IMM rule stated in the IMM PR doc: 
`"the developer must avoid using the same handle concurrently from several 
threads."`

In the trace log below, there are two problems caused by using the OiHandle 
from two different threads:
1) `ERR_BAD_OPERATION` is returned because a request is sent to IMM while no 
implementer has been set.

> Sep 17 18:22:04 SC-2 osaflogd[15047]: NO ACTIVE request 
Sep 17 18:22:04 SC-2 osaflogd[15047]: ER ERR_BAD_OPERATION: The SaImmOiHandleT 
is not associated with any implementer name
...
> Sep 17 18:22:04 SC-2 osafimmnd[15026]: NO Implementer connected: 211 
> (safLogService) <7, 2020f> 

2) `ERR_LIBRARY` is returned because of a double LOCK on the IMM side, and logsv is terminated.

> Sep 17 20:07:59 SC-2 osafimmnd[14962]: NO Implementer connected: 401 
> (safLogService) <7, 2020f>
...
Sep 17 20:07:59 SC-2 osaflogd[14975]: saImmOiClassImplementerSet FAILED, rc = 2
….
Sep 17 20:08:09 SC-2 osafamfnd[15047]: NO 
'safComp=LOG,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Sep 17 20:08:09 SC-2 osafamfnd[15047]: ER 
safComp=LOG,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
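
One generic way to respect the "one thread at a time" rule is to route every use of the handle through a single lock. The sketch below only illustrates that idea; it is not the fix applied in logsv, and `GuardedHandle` and the placeholder handle value are hypothetical.

```python
import threading


class GuardedHandle:
    """Wraps a handle so that only one thread uses it at a time."""

    def __init__(self, handle):
        self._handle = handle
        self._lock = threading.Lock()
        self.in_use = 0          # number of threads currently inside call()
        self.max_in_use = 0      # highest concurrency ever observed

    def call(self, operation):
        # Every use of the handle goes through the same lock.
        with self._lock:
            self.in_use += 1
            self.max_in_use = max(self.max_in_use, self.in_use)
            try:
                return operation(self._handle)
            finally:
                self.in_use -= 1


guarded = GuardedHandle(handle="oi-handle")   # placeholder, not a real SaImmOiHandleT
results = []


def worker(name):
    for _ in range(100):
        results.append(guarded.call(lambda h: (name, h)))


threads = [threading.Thread(target=worker, args=(n,))
           for n in ("restore", "main")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

An alternative with the same effect would be to let only the main thread touch the handle and have the restore thread hand its work over through a queue.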



---


[tickets] [opensaf:tickets] #1582 smfd: IMMA_SYNCR_TIMEOUT extended to 5 minutes

2015-11-02 Thread Ingvar Bergström

- **status**: assigned --> review



---

** [tickets:#1582] smfd: IMMA_SYNCR_TIMEOUT extended to 5 minutes**

**Status:** review
**Milestone:** 4.6.2
**Created:** Tue Nov 03, 2015 06:39 AM UTC by Ingvar Bergström
**Last Updated:** Tue Nov 03, 2015 06:39 AM UTC
**Owner:** Ingvar Bergström


Heavily overloaded systems cause smfd to receive TIMEOUT from IMM.
The IMMA_SYNCR_TIMEOUT timeout is extended from one to five minutes.
Handling of TIMEOUT from IMM is corrected in smf OI.
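
As a rough illustration of the arithmetic, the sketch below computes the value for a given number of minutes. It assumes that IMMA_SYNCR_TIMEOUT is read from the process environment and counted in units of 10 ms; both points are assumptions here and should be checked against the IMM documentation. The helper names are hypothetical.

```python
import os

# Assumption: the value is expressed in units of 10 ms.
UNITS_PER_SECOND = 100


def syncr_timeout_value(minutes):
    """Return the environment-variable string for a timeout in minutes."""
    return str(minutes * 60 * UNITS_PER_SECOND)


def child_environment(minutes, base=None):
    """Copy an environment and set IMMA_SYNCR_TIMEOUT in the copy."""
    env = dict(os.environ if base is None else base)
    env["IMMA_SYNCR_TIMEOUT"] = syncr_timeout_value(minutes)
    return env


one_minute = syncr_timeout_value(1)
env = child_environment(5, base={})
```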


---


[tickets] [opensaf:tickets] #1582 smfd: IMMA_SYNCR_TIMEOUT extended to 5 minutes

2015-11-02 Thread Ingvar Bergström




---

** [tickets:#1582] smfd: IMMA_SYNCR_TIMEOUT extended to 5 minutes**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Tue Nov 03, 2015 06:39 AM UTC by Ingvar Bergström
**Last Updated:** Tue Nov 03, 2015 06:39 AM UTC
**Owner:** Ingvar Bergström


Heavily overloaded systems cause smfd to receive TIMEOUT from IMM.
The IMMA_SYNCR_TIMEOUT timeout is extended from one to five minutes.
Handling of TIMEOUT from IMM is corrected in smf OI.


---


[tickets] [opensaf:tickets] #1566 Cluster reset happened during switchover due to AMF director heart beat timeout.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1566] Cluster reset happened during switchover due to AMF director 
heart beat timeout.**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Sat Oct 24, 2015 06:25 AM UTC by Ritu Raj
**Last Updated:** Sat Oct 24, 2015 06:29 AM UTC
**Owner:** nobody


Changeset: 6901
70 nodes configured with PBE 
Application: Nway configured on all the nodes

Issues Observed:
> Cluster reset happened during switchover due to AMF director heart beat 
> timeout.

Steps Performed:
* AMF (Nway) application brought up on the nodes.
* Some operations are performed on Nway application hosted on PL-65 to PL-68.
* Stopped opensaf on the nodes PL-65 to PL-68.
* Two switchovers were performed on the cluster. The first switchover succeeded 
without any issue. During the second switchover, the old standby controller 
(SC-2) rebooted while it was being promoted to the ACTIVE state.

Oct 22 15:45:10 SLES-64BIT-SLOT2 osafamfnd[2580]: NO Assigning 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Oct 22 15:45:10 SLES-64BIT-SLOT2 osafimmd[2505]: WA IMMD not re-electing coord 
for switch-over (si-swap) coord at (2020f)


Oct 22 15:45:10 SLES-64BIT-SLOT2 osafimmnd[2516]: NO Implementer (applier) 
connected: 130 (@OpenSafImmReplicatorA) <10675, 2020f>
Oct 22 15:45:10 SLES-64BIT-SLOT2 osafamfnd[2580]: NO Assigned 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Oct 22 15:45:10 SLES-64BIT-SLOT2 osafamfnd[2580]: NO 
'safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Oct 22 15:45:10 SLES-64BIT-SLOT2 osafamfnd[2580]: ER 
safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Oct 22 15:45:10 SLES-64BIT-SLOT2 osafamfnd[2580]: Rebooting OpenSAF NodeId = 
131599 EE Name = , Reason: Component faulted: recovery is node failfast, 
OwnNodeId = 131599, SupervisionTime = 60
Oct 22 15:45:10 SLES-64BIT-SLOT2 opensaf_reboot: Rebooting local node; 
timeout=60

* After SC-2 went for reboot, SC-1 tried to become active, during which the AMF 
director heartbeat timed out and a cluster reset happened.

Oct 22 15:54:53 SLES-64BIT-SLOT1 osafamfd[2557]: NO 
'safRankedSu=safSu=dummy_NWay_1Norm_4\,safSg=SG_dummy_n\,safApp=N_6,safSi=dummy_NWay_1Norm_6,safApp=N_6'
Oct 22 15:54:53 SLES-64BIT-SLOT1 osafamfnd[2567]: ER AMF director heart beat 
timeout, generating core for amfd
Oct 22 15:54:54 SLES-64BIT-SLOT1 osafamfnd[2567]: Rebooting OpenSAF NodeId = 
131343 EE Name = , Reason: AMF director heart beat timeout, OwnNodeId = 131343, 
SupervisionTime = 60
Oct 22 15:54:54 SLES-64BIT-SLOT1 opensaf_reboot: Rebooting local node; 
timeout=60
Oct 22 15:54:55 SLES-64BIT-SLOT1 osafimmnd[2503]: WA MDS Send Failed
Oct 22 15:54:55 SLES-64BIT-SLOT1 osafimmnd[2503]: WA Error code 2 returned for 
message type 16 - ignoring
Oct 22 15:54:55 SLES-64BIT-SLOT1 osafimmnd[2503]: NO Implementer locally 
disconnected. Marking it as doomed 136 <4871, 2010f> (@safAmfService2010f)

* Traces are not available.
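
When reading the logs above, note that osafamfnd prints the NodeId in decimal (131599, 131343) while osafimmnd prints the node address in hex (2020f, 2010f). A small helper, hypothetical and only for log reading, shows that they are the same values:

```python
def node_id_to_hex(node_id):
    """Format a decimal NodeId the way the IMM log lines print it."""
    return format(node_id, "x")


slot1 = node_id_to_hex(131343)   # NodeId logged on SLES-64BIT-SLOT1
slot2 = node_id_to_hex(131599)   # NodeId logged on SLES-64BIT-SLOT2
```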


---


[tickets] [opensaf:tickets] #1563 AMF : SU should not be instantiated if any one of hosted NG is is in lock-in state

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1563] AMF : SU should not be instantiated if any one of hosted NG 
is is in lock-in state**

**Status:** review
**Milestone:** 4.6.2
**Created:** Fri Oct 23, 2015 02:13 PM UTC by Srikanth R
**Last Updated:** Fri Oct 30, 2015 01:18 PM UTC
**Owner:** Nagendra Kumar


Changeset : 6901
Application : 2N

Issue : SU should not be instantiated if any one of the hosted NGs is in 
lock-in state

Initially both the NGs are brought to locked-in state.
SYSTEST-PLD-1:/opt/goahead/tetware/opensaffire/suites/avsv/infra # amf-state ng
safAmfNodeGroup=SCs,safAmfCluster=myAmfCluster
saAmfNGAdminState=UNLOCKED(1)
safAmfNodeGroup=PLs,safAmfCluster=myAmfCluster
saAmfNGAdminState=LOCKED-INSTANTIATION(3)
safAmfNodeGroup=AllNodes,safAmfCluster=myAmfCluster
saAmfNGAdminState=LOCKED-INSTANTIATION(3)

If an unlock-in operation is performed on one of the locked-in NGs, the SUs 
hosted on the PLs should not be instantiated, because the other NG is still in 
lock-in state. The nodes should not be moved to LOCKED state.
SYSTEST-PLD-1:/opt/goahead/tetware/opensaffire/suites/avsv/infra # amf-adm 
unlock-in safAmfNodeGroup=AllNodes,safAmfCluster=myAmfCluster
   safAmfNodeGroup=AllNodes,safAmfCluster=myAmfCluster
  LOCKEDINSTANTIATION --> LOCKED
   safAmfNode=PL-3,safAmfCluster=myAmfCluster
  LOCKEDINSTANTIATION --> LOCKED
   safAmfNode=PL-4,safAmfCluster=myAmfCluster
  LOCKEDINSTANTIATION --> LOCKED
   safAmfNode=PL-5,safAmfCluster=myAmfCluster
  LOCKEDINSTANTIATION --> LOCKED
   safAmfNode=PL-6,safAmfCluster=myAmfCluster
  LOCKEDINSTANTIATION --> LOCKED
   safAmfNode=SC-1,safAmfCluster=myAmfCluster
  LOCKEDINSTANTIATION --> LOCKED
   safAmfNode=SC-2,safAmfCluster=myAmfCluster
  LOCKEDINSTANTIATION --> LOCKED
   safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN
  INSTANTIATING --> INSTANTIATED
   safSu=TestApp_SU3,safSg=TestApp_SG1,safApp=TestApp_TwoN
  INSTANTIATING --> INSTANTIATED
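
The expected rule can be sketched as a check over all node groups that contain a node. This is only a toy model of the behaviour requested in this ticket, not AMF code; the function name and the group membership used below are assumptions for illustration.

```python
LOCKED_INSTANTIATION = "LOCKED-INSTANTIATION"
LOCKED = "LOCKED"
UNLOCKED = "UNLOCKED"


def may_instantiate(node, group_members, group_states):
    """True only if no node group hosting `node` is locked for instantiation."""
    return all(
        group_states[group] != LOCKED_INSTANTIATION
        for group, members in group_members.items()
        if node in members
    )


# Membership is an assumption made for this example.
group_members = {
    "SCs": {"SC-1", "SC-2"},
    "PLs": {"PL-3", "PL-4", "PL-5", "PL-6"},
    "AllNodes": {"SC-1", "SC-2", "PL-3", "PL-4", "PL-5", "PL-6"},
}
# States after unlock-in on AllNodes, while PLs stays locked-in.
group_states = {"SCs": UNLOCKED, "PLs": LOCKED_INSTANTIATION, "AllNodes": LOCKED}

pl3_allowed = may_instantiate("PL-3", group_members, group_states)
sc1_allowed = may_instantiate("SC-1", group_members, group_states)
```

With these states PL-3 is still covered by a locked-in group, so SUs hosted on it would stay uninstantiated, while SC-1 is not.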



---


[tickets] [opensaf:tickets] #1567 AMF : Locked-in node should be moved to ENABLED state, during CLM node unlock

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1567] AMF : Locked-in node should be moved to ENABLED state, 
during CLM node unlock**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Sat Oct 24, 2015 11:01 AM UTC by Srikanth R
**Last Updated:** Fri Oct 30, 2015 11:58 AM UTC
**Owner:** Nagendra Kumar


Changeset : 6901
Application : hosted 2n, no red application on PL-3

Steps : 

1) Perform a CLM lock operation on PL-3. AMF DN PL-3 moves to DISABLED state.
2) Perform a lock operation on the NG containing PL-3.
3) Perform a lock-inst operation on the same NG. AMF node PL-3 should now be 
DISABLED & LOCKED-IN.
4) Perform a CLM unlock operation on PL-3. AMF DN PL-3 should move back to 
ENABLED state, but instead it stays in DISABLED state. Further unlock-in 
operations on the NG do not instantiate the SUs.

AMF should update the locked-in node to ENABLED state during CLM node unlock.
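
The expected behaviour amounts to tracking the operational state independently of the admin state. The toy model below replays the four steps under that assumption; the `AmfNode` class and its method names are hypothetical and do not mirror the amfd implementation.

```python
from dataclasses import dataclass


@dataclass
class AmfNode:
    """Toy model: operational and admin state are tracked independently."""
    name: str
    oper_state: str = "ENABLED"
    admin_state: str = "UNLOCKED"

    def clm_lock(self):
        self.oper_state = "DISABLED"

    def clm_unlock(self):
        # Expected behaviour from the ticket: re-enable regardless of admin state.
        self.oper_state = "ENABLED"

    def set_admin_state(self, state):
        self.admin_state = state


node = AmfNode("PL-3")
node.clm_lock()                                 # step 1
node.set_admin_state("LOCKED")                  # step 2
node.set_admin_state("LOCKED-INSTANTIATION")    # step 3
node.clm_unlock()                               # step 4
```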



---


[tickets] [opensaf:tickets] #1532 AMF : SI should be reverted to unlocked state, after shutdown operation of SI is rejected

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1532] AMF : SI should be reverted to unlocked state, after 
shutdown operation of SI is rejected**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Thu Oct 08, 2015 11:20 AM UTC by Srikanth R
**Last Updated:** Mon Oct 19, 2015 09:15 AM UTC
**Owner:** Nagendra Kumar


Changeset : 6901
Application  : 2n ( two SUs and 4 SIs with SI1 as sponsor for the remaining SIs)

Steps :

 * Initially all the SIs are in assigned state.
 * Invoked shutdown operation on one of the dependent SIs, i.e. SI2.
 * For the quiescing callback, the component responded with FAILED_OP.

Oct  8 16:27:20 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' QUIESCING to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Performing failover of 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' (SU failover count: 2)
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO 
'safComp=COMP2,safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' faulted 
due to 'csiSetcallbackTimeout' : Recovery is 'componentFailover'

 * After recovery of SU1, SI2 assignments are also done, which is not expected.

Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' Presence State 
TERMINATING => INSTANTIATED
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI1,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI3,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'

 * Below is the SI state after the shutdown operation
 safSi=TestApp_SI2,safApp=TestApp_TwoN
saAmfSIAdminState=LOCKED(2)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)

* A further unlock operation on the SI returned TIMEOUT.
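
The behaviour asked for in the title can be sketched as "remember the previous admin state and restore it when the operation is rejected". This is a toy model only; the dictionary layout, the `quiesce` callable and the state names are assumptions, not AMF internals.

```python
def shutdown_si(si, quiesce):
    """Try to shut an SI down; revert the admin state if quiescing fails.

    `quiesce` is a callable returning True on success; it stands in for the
    component's response to the quiescing callback.
    """
    previous = si["admin_state"]
    si["admin_state"] = "SHUTTING_DOWN"
    if quiesce(si):
        si["admin_state"] = "LOCKED"
        return True
    # Operation rejected: restore the state the SI had before the request.
    si["admin_state"] = previous
    return False


si2 = {"name": "TestApp_SI2", "admin_state": "UNLOCKED"}
accepted = shutdown_si(si2, quiesce=lambda si: False)   # models FAILED_OP
```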


---


[tickets] [opensaf:tickets] #1446 log: trouble when the number of existing app streams reachs limitation

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1446] log: trouble when the number of existing app streams reachs 
limitation**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Thu Aug 13, 2015 10:10 AM UTC by Vu Minh Nguyen
**Last Updated:** Thu Aug 13, 2015 10:10 AM UTC
**Owner:** nobody


When creating a configurable app stream (e.g. using `immcfg -c`), suppose all 
inputs are valid.
In this case, logsv returns `SA_AIS_OK` to IMM for its callbacks. This means 
IMM is allowed to create its database/resource for this object (*1*).

In the apply callback, IMM asks logsv to apply the change and does not require 
an acknowledgement.
If the number of app streams has reached the limit defined by 
`logMaxApplicationStreams`, logsv fails to add this stream to stream_array.

As a result, logsv deletes all allocated resources that it manages itself. The 
resources created in step (*1*) still exist, which causes the behaviour below 
(see my comments on each step):

1. The object is created successfully from IMM's point of view, but logsv 
actually fails in ccbApplyCallback.
> immcfg -c SaLogStreamConfig safLgStrCfg=test6 -a saLogStreamPathName=. -a 
> saLogStreamFileName=test6

2. immlist fails because logsv returns a not-OK `no such obj` result to IMM.
> immlist safLgStrCfg=test6
error - saImmOmAccessorGet_2 FAILED: SA_AIS_ERR_NO_RESOURCES (18)

3. Creating the object again fails because the resource already exists.
> immcfg -c SaLogStreamConfig safLgStrCfg=test6 -a saLogStreamPathName=. -a 
> saLogStreamFileName=test6 
error - saImmOmCcbObjectCreate_2 FAILED with SA_AIS_ERR_EXIST (14)
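
One possible direction is to check the limit in the validation phase, where a CCB can still be rejected, so that the apply phase cannot fail. The toy model below illustrates that split; it is not the logsv code, `StreamRegistry` is hypothetical, and returning `SA_AIS_ERR_NO_RESOURCES` here is an assumption about which error code would be appropriate.

```python
SA_AIS_OK = 1
SA_AIS_ERR_NO_RESOURCES = 18


class StreamRegistry:
    """Toy model of a service that owns a bounded table of streams."""

    def __init__(self, max_streams):
        self.max_streams = max_streams
        self.streams = []

    def ccb_completed(self, new_streams):
        # Validation step: the change can still be rejected here.
        if len(self.streams) + len(new_streams) > self.max_streams:
            return SA_AIS_ERR_NO_RESOURCES
        return SA_AIS_OK

    def ccb_apply(self, new_streams):
        # Apply step: no return value, so it must not be able to fail.
        self.streams.extend(new_streams)


registry = StreamRegistry(max_streams=2)
registry.ccb_apply(["test4", "test5"])
rc = registry.ccb_completed(["test6"])     # table is already full
```

Rejecting the change during validation would keep IMM and the service consistent, because IMM never creates the object that the service cannot hold.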


---


[tickets] [opensaf:tickets] #1464 Cluster reset triggered, after middleware si-swap ( one of controller in disabled )

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1464] Cluster reset triggered, after middleware si-swap ( one of 
controller in disabled )**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Fri Aug 28, 2015 10:00 AM UTC by Srikanth R
**Last Updated:** Fri Aug 28, 2015 10:00 AM UTC
**Owner:** nobody
**Attachments:**

- 
[clusterReset.tgz](https://sourceforge.net/p/opensaf/tickets/1464/attachment/clusterReset.tgz)
 (5.4 MB; application/x-compressed)


*Setup*
4.7M0 with changeset 6770
4 nodes configured with no PBE configured and 2N application hosted.
SC-1 is active controller and SC-2 is standby controller and both the 
controllers are hosting application SUs configured with 2N redundancy model.

*Issues*

 The cluster went for reset after the si-swap operation on middleware. The 
active controller was in disabled state before the si-swap operation was invoked.
 
 
 *Steps Performed*
 
 -> Because of a faulty application, SC-1 moved to disabled state. The 
NodeAutorepair feature is disabled for SC-1.
 
 Aug 28 15:03:17 SYSTEST-CNTLR-1 osafamfnd[4650]: NO 
'safComp=COMP3SU1TWONAPP,safSu=SU1,safSg=SGONE,safApp=TWONAPP' faulted due to 
'csiSetcallbackTimeout' : Recovery is 'nodeFailover'
Aug 28 15:03:17 SYSTEST-CNTLR-1 osafamfd[4640]: NO NodeAutorepair disabled for 
'safAmfNode=SC-1,safAmfCluster=myAmfCluster', no reboot ordered

-> Invoked si-swap operation on middleware SI.

-> Standby controller (SC-2) got rebooted, as implementer set failed with 
ERR_EXIST.


Aug 28 15:03:32 SYSTEST-CNTLR-2 osafamfnd[4761]: NO Assigning 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Aug 28 15:03:32 SYSTEST-CNTLR-2 osafntfimcnd[4726]: NO exiting on signal 15
Aug 28 15:03:32 SYSTEST-CNTLR-2 osafimmd[4686]: WA IMMD not re-electing coord 
for switch-over (si-swap) coord at (2010f)
Aug 28 15:03:32 SYSTEST-CNTLR-2 osafmsgd[4882]: ER mqd_imm_declare_implementer 
failed: err = 14
Aug 28 15:03:32 SYSTEST-CNTLR-2 osaflogd[4707]: ER saImmOiClassImplementerSet 
(safLogService) failed: 14
Aug 28 15:03:32 SYSTEST-CNTLR-2 osafckptd[4780]: ER cpd immOiImplmenterSet 
failed with err = 14
Aug 28 15:03:32 SYSTEST-CNTLR-2 osafamfnd[4761]: NO 
'safComp=CPD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'

 
 -> SC-1 also got rebooted, after SC-2 reboot.
  
Aug 28 15:03:58 SYSTEST-CNTLR-1 osafamfd[4640]: NO Node 'SC-2' left the cluster
Aug 28 15:03:58 SYSTEST-CNTLR-1 osafamfd[4640]: WA State change notification 
lost for 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
Aug 28 15:03:58 SYSTEST-CNTLR-1 osafamfd[4640]: ER Failed to start cluster 
tracking 6
Aug 28 15:03:58 SYSTEST-CNTLR-1 osafamfd[4640]: NO NodeAutorepair disabled for 
'safAmfNode=SC-1,safAmfCluster=myAmfCluster', no reboot ordered
Aug 28 15:03:58 SYSTEST-CNTLR-1 opensaf_reboot: Rebooting remote node in the 
absence of PLM is outside the scope of OpenSAF
Aug 28 15:04:03 SYSTEST-CNTLR-1 osafclmd[4621]: ER saNtfNotificationSend() 
returned: SA_AIS_ERR_TRY_AGAIN (6)
Aug 28 15:04:08 SYSTEST-CNTLR-1 osaflogd[4596]: WA saImmOiRtObjectDelete 
returned 5 for safLgStr=TWONLOGSTREAM
Aug 28 15:04:08 SYSTEST-CNTLR-1 osafimmnd[4583]: WA ERR_BAD_HANDLE: Handle use 
is blocked by pending reply on syncronous call
Aug 28 15:04:08 SYSTEST-CNTLR-1 osafimmnd[4583]: NO Implementer locally 
disconnected. Marking it as doomed 4 <17, 2010f> (safAmfService)
Aug 28 15:04:08 SYSTEST-CNTLR-1 osafamfd[4640]: NO Re-initializing with IMM
Aug 28 15:04:08 SYSTEST-CNTLR-1 osafimmnd[4583]: WA IMMND - Client Node Get 
Failed for cli_hdl 73014575375
Aug 28 15:04:08 SYSTEST-CNTLR-1 osafimmnd[4583]: WA Timeout on syncronous admin 
operation 1
Aug 28 15:04:13 SYSTEST-CNTLR-1 osafimmnd[4583]: WA ERR_BAD_HANDLE: Handle use 
is blocked by pending reply on syncronous call
Aug 28 15:04:13 SYSTEST-CNTLR-1 osafimmnd[4583]: NO Implementer locally 
disconnected. Marking it as doomed 3 <12, 2010f> (safClmService)
Aug 28 15:04:13 SYSTEST-CNTLR-1 osafimmnd[4583]: WA IMMND - Client Node Get 
Failed for cli_hdl 51539738895
Aug 28 15:04:22 SYSTEST-CNTLR-1 osafclmd[4621]: ER saImmOiImplementerSet failed 
rc:6, exiting
Aug 28 15:04:22 SYSTEST-CNTLR-1 osafamfnd[4650]: NO 
'safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'

-> As both the controllers went for reboot, the payloads went for reboot.


---


[tickets] [opensaf:tickets] #1463 log: output a redundant quotation mark in log fileWhe

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1463] log: output a redundant quotation mark in log fileWhe**

**Status:** accepted
**Milestone:** 4.6.2
**Created:** Fri Aug 28, 2015 09:57 AM UTC by Vu Minh Nguyen
**Last Updated:** Wed Sep 23, 2015 09:27 AM UTC
**Owner:** Vu Minh Nguyen


When sending a log record that is longer than the 
`saLogStreamFixedLogRecordSize` value, there will be a redundant double 
quotation mark in the log file. 

This only happens when the token `@Cb` is used without double quotation marks 
around it (`@Cb`, not `"@Cb"`).

Here is an example of log file:

>$ cat saLogAlarm_20150828_073826.log
11 0x13fe8cf840b38008 0x13fe8cf83f451b28 0x4003 T saflogger.3881@SC-1   
 saflogger.3881@SC-1
11"



---


[tickets] [opensaf:tickets] #1353 smf: two hours is spent on step undoing state

2015-11-02 Thread Anders Widell

- **Milestone**: 4.6.1 --> 4.6.2



---

** [tickets:#1353] smf: two hours is spent on step undoing state **

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Tue Apr 28, 2015 01:33 PM UTC by Neelakanta Reddy
**Last Updated:** Wed Jul 15, 2015 01:03 PM UTC
**Owner:** nobody
**Attachments:**

- 
[messages_step_undo](https://sourceforge.net/p/opensaf/tickets/1353/attachment/messages_step_undo)
 (111.1 kB; application/octet-stream)


Test description:
1. A rolling middleware upgrade (4.5->4.6) campaign is run.
2. On one of the upgraded nodes (PL-4), the new RPMs (4.6) are kept empty, so 
the node comes up without an OpenSAF installation.
3. The step rollback takes approximately two hours before the campaign is 
declared EXECUTION_FAILED.
4. The syslog of SC-1 is attached.

Apr 24 18:36:55 SLES1 osafamfd[2289]: NO Node 'PL-4' left the cluster
Apr 24 18:36:55 SLES1 osafimmnd[2237]: NO Implementer connected: 47 
(MsgQueueService132111) <2280, 2010f>
Apr 24 18:36:55 SLES1 osafimmnd[2237]: NO Implementer locally disconnected. 
Marking it as doomed 47 <2280, 2010f> (MsgQueueService132111)
Apr 24 18:36:55 SLES1 osafimmnd[2237]: NO Implementer disconnected 47 <2280, 
2010f> (MsgQueueService132111)
Apr 24 18:36:58 SLES1 kernel: [  172.812065] TIPC: Resetting link 
<1.1.1:eth0-1.1.4:eth0>, peer not responding
Apr 24 18:36:58 SLES1 kernel: [  172.812071] TIPC: Lost link 
<1.1.1:eth0-1.1.4:eth0> on network plane A
Apr 24 18:36:58 SLES1 kernel: [  172.812075] TIPC: Lost contact with <1.1.4>
Apr 24 18:37:15 SLES1 osafsmfd[2318]: NO Failed to get node dest for clm node 
safNode=PL-4,safCluster=myClmCluster
Apr 24 18:37:36 SLES1 osafsmfd[2318]: NO Failed to get node dest for clm node 
safNode=PL-4,safCluster=myClmCluster

---
--
--

Apr 24 20:36:00 SLES1 osafsmfd[2318]: NO Failed to get node dest for clm node 
safNode=PL-4,safCluster=myClmCluster
Apr 24 20:36:22 SLES1 osafsmfd[2318]: NO Failed to get node dest for clm node 
safNode=PL-4,safCluster=myClmCluster
Apr 24 20:36:44 SLES1 osafsmfd[2318]: NO Failed to get node dest for clm node 
safNode=PL-4,safCluster=myClmCluster
Apr 24 20:37:06 SLES1 osafsmfd[2318]: NO Failed to get node dest for clm node 
safNode=PL-4,safCluster=myClmCluster
Apr 24 20:37:28 SLES1 osafsmfd[2318]: NO Failed to get node dest for clm node 
safNode=PL-4,safCluster=myClmCluster
Apr 24 20:37:28 SLES1 osafsmfd[2318]: NO no node destination found whitin time 
limit for node safAmfNode=PL-4,safAmfCluster=myAmfCluster
Apr 24 20:37:28 SLES1 osafsmfd[2318]: NO no node destination found for node 
safAmfNode=PL-4,safAmfCluster=myAmfCluster
Apr 24 20:37:28 SLES1 osafsmfd[2318]: ER Failed to online install old bundles
Apr 24 20:37:28 SLES1 osafsmfd[2318]: ER Step undoing failed
Apr 24 20:37:28 SLES1 osafsmfd[2318]: NO Step safSmfStep=0004 in procedure 
safSmfProc=OpenSAF-upgrade failed, step result 5
Apr 24 20:37:28 SLES1 osafsmfd[2318]: NO CAMP: Procedure 
safSmfProc=OpenSAF-upgrade returned FAILED
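
The "two hours" can be checked directly from the first and last "Failed to get node dest" lines in the excerpt above. The year is taken from the ticket creation date, since syslog timestamps carry none; the helper name is only for this example.

```python
from datetime import datetime

FMT = "%Y %b %d %H:%M:%S"


def parse(stamp, year=2015):
    """Parse a syslog-style timestamp, adding the year it does not carry."""
    return datetime.strptime(f"{year} {stamp}", FMT)


first = parse("Apr 24 18:37:15")   # first retry logged by osafsmfd
last = parse("Apr 24 20:37:28")    # last retry, followed by "Step undoing failed"
elapsed = last - first
```

The retries in the excerpt are roughly 22 seconds apart, so the whole period is spent repeating the same lookup until the time limit expires.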





---


[tickets] [opensaf:tickets] #1362 AMF: saAmfSGNumCurrAssignedSUs is not updated for operations performed on SG.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.6.1 --> 4.6.2



---

** [tickets:#1362] AMF: saAmfSGNumCurrAssignedSUs is not updated for operations 
performed on SG.**

**Status:** accepted
**Milestone:** 4.6.2
**Created:** Thu Apr 30, 2015 09:22 AM UTC by Srikanth R
**Last Updated:** Tue May 05, 2015 05:37 AM UTC
**Owner:** Praveen


Changeset : 6490

ISSUE : saAmfSGNumCurrAssignedSUs is not updated for operations performed on SG.

For an SG in lock-in / locked state, saAmfSGNumCurrAssignedSUs is not changed 
to the value 0. The attribute is updated for operations performed on an SU.


SOLO:/opt/goahead/tetware/opensaffire/suites/avsv/framework # immlist 
safSg=SG,safApp=test2nApp
Name                                   Type         Value(s)

safSg                                  SA_STRING_T  safSg=SG
saAmfSGType                            SA_NAME_T    safVersion=4.0.0,safSgType=test2nSgType (39)
saAmfSGSuRestartProb                   SA_TIME_T
saAmfSGSuRestartMax                    SA_UINT32_T
saAmfSGSuHostNodeGroup                 SA_NAME_T    safAmfNodeGroup=AllNodes,safAmfCluster=myAmfCluster (51)
saAmfSGNumPrefStandbySUs               SA_UINT32_T  1 (0x1)
saAmfSGNumPrefInserviceSUs             SA_UINT32_T  3 (0x3)
saAmfSGNumPrefAssignedSUs              SA_UINT32_T  3 (0x3)
saAmfSGNumPrefActiveSUs                SA_UINT32_T  1 (0x1)
saAmfSGNumCurrNonInstantiatedSpareSUs  SA_UINT32_T  0 (0x0)
saAmfSGNumCurrInstantiatedSpareSUs     SA_UINT32_T  0 (0x0)
saAmfSGNumCurrAssignedSUs              SA_UINT32_T  2 (0x2)
saAmfSGMaxStandbySIsperSU              SA_UINT32_T  1 (0x1)
saAmfSGMaxActiveSIsperSU               SA_UINT32_T  1 (0x1)
saAmfSGCompRestartProb                 SA_TIME_T
saAmfSGCompRestartMax                  SA_UINT32_T
saAmfSGAutoRepair                      SA_UINT32_T  1 (0x1)
saAmfSGAutoAdjustProb                  SA_TIME_T
saAmfSGAutoAdjust                      SA_UINT32_T  0 (0x0)
saAmfSGAdminState                      SA_UINT32_T  3 (0x3)
SaImmAttrImplementerName               SA_STRING_T  safAmfService
SaImmAttrClassName                     SA_STRING_T  SaAmfSG
SaImmAttrAdminOwnerName                SA_STRING_T


---


[tickets] [opensaf:tickets] #1291 IMM: IMMD healthcheck callback timeout when standby controller rebooted in middle of IMMND sync

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> never



---

** [tickets:#1291] IMM: IMMD healthcheck callback timeout when standby 
controller rebooted in middle of IMMND sync**

**Status:** invalid
**Milestone:** never
**Created:** Mon Mar 30, 2015 07:21 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 21, 2015 06:35 AM UTC
**Owner:** nobody
**Attachments:**

- 
[immlogs.tar.bz2](https://sourceforge.net/p/opensaf/tickets/1291/attachment/immlogs.tar.bz2)
 (6.8 MB; application/x-bzip)


The issue is observed with 4.6 FC changeset 6377. The system is up and running 
with a single PBE and 50k objects. This issue is seen after 
http://sourceforge.net/p/opensaf/tickets/1290 is observed. An IMM application 
is running on the standby controller, and the immcfg command is run from a 
payload to set the CompRestartMax value to 1000. IMMND is killed twice on the 
standby controller, leading to #1290.

As a result, the standby controller left the cluster in the middle of sync, 
IMMD reported a healthcheck callback timeout, and the active controller also 
went for reboot. Following is the syslog of SC-1:

Mar 26 14:58:17 SLES-64BIT-SLOT1 osafimmloadd: NO Sync starting
Mar 26 14:58:28 SLES-64BIT-SLOT1 osaffmd[9529]: NO Node Down event for node id 
2020f:
Mar 26 14:58:28 SLES-64BIT-SLOT1 osaffmd[9529]: NO Current role: ACTIVE
Mar 26 14:58:28 SLES-64BIT-SLOT1 osaffmd[9529]: Rebooting OpenSAF NodeId = 
131599 EE Name = , Reason: Received Node Down for peer controller, OwnNodeId = 
131343, SupervisionTime = 60
Mar 26 14:58:28 SLES-64BIT-SLOT1 kernel: [15200.412080] TIPC: Resetting link 
<1.1.1:eth0-1.1.2:eth0>, peer not responding
Mar 26 14:58:28 SLES-64BIT-SLOT1 kernel: [15200.412089] TIPC: Lost link 
<1.1.1:eth0-1.1.2:eth0> on network plane A
Mar 26 14:58:28 SLES-64BIT-SLOT1 kernel: [15200.413191] TIPC: Lost contact with 
<1.1.2>
Mar 26 14:58:28 SLES-64BIT-SLOT1 osafclmd[9609]: NO Node 131599 went down. Not 
sending track callback for agents on that node
Mar 26 14:58:28 SLES-64BIT-SLOT1 osafclmd[9609]: NO Node 131599 went down. Not 
sending track callback for agents on that node
Mar 26 14:58:28 SLES-64BIT-SLOT1 osafclmd[9609]: NO Node 131599 went down. Not 
sending track callback for agents on that node
Mar 26 14:58:28 SLES-64BIT-SLOT1 osafclmd[9609]: NO Node 131599 went down. Not 
sending track callback for agents on that node
Mar 26 14:58:28 SLES-64BIT-SLOT1 osafclmd[9609]: NO Node 131599 went down. Not 
sending track callback for agents on that node
Mar 26 14:58:28 SLES-64BIT-SLOT1 osafclmd[9609]: NO Node 131599 went down. Not 
sending track callback for agents on that node
Mar 26 14:58:30 SLES-64BIT-SLOT1 osafamfd[9628]: NO Node 'SC-2' left the cluster
Mar 26 14:58:30 SLES-64BIT-SLOT1 opensaf_reboot: Rebooting remote node in the 
absence of PLM is outside the scope of OpenSAF
Mar 26 14:58:54 SLES-64BIT-SLOT1 kernel: [15226.674333] TIPC: Established link 
<1.1.1:eth0-1.1.2:eth0> on network plane A
Mar 26 15:00:02 SLES-64BIT-SLOT1 syslog-ng[3261]: Log statistics; 
dropped='pipe(/dev/xconsole)=0', dropped='pipe(/dev/tty10)=0', 
processed='center(queued)=2197', processed='center(received)=1172', 
processed='destination(messages)=1172', processed='destination(mailinfo)=0', 
processed='destination(mailwarn)=0', 
processed='destination(localmessages)=955', processed='destination(newserr)=0', 
processed='destination(mailerr)=0', processed='destination(netmgm)=0', 
processed='destination(warn)=44', processed='destination(console)=13', 
processed='destination(null)=0', processed='destination(mail)=0', 
processed='destination(xconsole)=13', processed='destination(firewall)=0', 
processed='destination(acpid)=0', processed='destination(newscrit)=0', 
processed='destination(newsnotice)=0', processed='source(src)=1172'
Mar 26 15:00:07 SLES-64BIT-SLOT1 osafimmloadd: ER Too many TRY_AGAIN on 
saImmOmSearchNext - aborting
Mar 26 15:00:08 SLES-64BIT-SLOT1 osafimmnd[9549]: ER SYNC APPARENTLY FAILED 
status:1
Mar 26 15:00:08 SLES-64BIT-SLOT1 osafimmnd[9549]: NO -SERVER STATE: 
IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY
Mar 26 15:00:08 SLES-64BIT-SLOT1 osafimmnd[9549]: NO NODE STATE-> 
IMM_NODE_FULLY_AVAILABLE (2484)
Mar 26 15:00:08 SLES-64BIT-SLOT1 osafimmnd[9549]: NO Epoch set to 12 in ImmModel
Mar 26 15:00:08 SLES-64BIT-SLOT1 osafimmnd[9549]: NO Coord broadcasting 
ABORT_SYNC, epoch:12
Mar 26 15:00:08 SLES-64BIT-SLOT1 osafimmpbed: NO Update epoch 12 committing 
with ccbId:10054/4294967380
Mar 26 15:01:34 SLES-64BIT-SLOT1 osafamfnd[9638]: NO SU failover probation 
timer started (timeout: 12000 ns)
Mar 26 15:01:34 SLES-64BIT-SLOT1 osafamfnd[9638]: NO Performing failover of 
'safSu=SC-1,safSg=2N,safApp=OpenSAF' (SU failover count: 1)
Mar 26 15:01:34 SLES-64BIT-SLOT1 osafamfnd[9638]: NO 
'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' recovery action escalated 
from 'componentFailover' to 'suFailover'
Mar 26 15:01:34 SLES-64BIT-SLOT1 osafamfnd[9638]: NO 
'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 
'healthCheckcallbackTimeout' : Recovery is 'suFailover'
Mar

[tickets] [opensaf:tickets] #1285 MDS TCP: zero bytes recvd results in application exit

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1285] MDS TCP: zero bytes recvd results in application exit**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Thu Mar 26, 2015 09:49 AM UTC by Girish
**Last Updated:** Tue Aug 11, 2015 06:26 AM UTC
**Owner:** A V Mahesh (AVM)


Sometimes an application using OpenSAF exits with the message below:

 Feb 20 15:24:59 fedvm1 RIB[28549]: MDTM:socket_recv() = 0, conn lost with dh 
server, exiting library err :Success
Feb 20 15:24:59 fedvm1 osafamfnd[28263]: NO 
'safSu=SU1,safSg=app-simplex,safApp=appos' component restart probation timer 
started (timeout: 40 ns)
Feb 20 15:24:59 fedvm1 osafamfnd[28263]: NO Restarting a component of 
'safSu=SU1,safSg=app-simplex,safApp=appos' (comp restart count: 1)
Feb 20 15:24:59 fedvm1 osafamfnd[28263]: NO 
'safComp=App,safSu=SU1,safSg=app-simplex,safApp=appos' faulted due to 'avaDown' 
: Recovery is 'componentRestart'

Exits at location 
osaf/libs/core/mds/mds_dt_trans.c::mdtm_process_poll_recv_data_tcp

    recd_bytes = recv(tcp_cb->DBSRsock, tcp_cb->buffer, local_len_buf, 0);
    if (recd_bytes < 0) {
        return;
    } else if (0 == recd_bytes) {
        syslog(LOG_ERR, "MDTM:socket_recv() = %d, conn lost with dh server, exiting library err :%d len:%d",
               recd_bytes, errno, local_len_buf);
        close(tcp_cb->DBSRsock);
        exit(0);
    } else if (local_len_buf > recd_bytes) {


local_len_buf turns out to be 0.
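
On a stream socket, recv() can also return 0 when zero bytes were requested, so a 0 return only indicates a lost connection if the requested length was non-zero. The following is a minimal sketch of that distinction, not the MDS fix itself; the enum and the function `classify_recv` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical classification of a recv() result; not part of MDS. */
enum recv_outcome {
    RECV_ERROR,         /* recv() returned a negative value */
    RECV_NOTHING_ASKED, /* zero bytes were requested, so no data is expected */
    RECV_PEER_CLOSED,   /* bytes were requested but none arrived */
    RECV_PARTIAL,       /* fewer bytes than requested arrived */
    RECV_COMPLETE       /* the full requested length arrived */
};

/* Interpret the return value of recv() given the length that was requested. */
enum recv_outcome classify_recv(ssize_t recd_bytes, size_t requested_len)
{
    if (recd_bytes < 0)
        return RECV_ERROR;
    if (requested_len == 0)
        return RECV_NOTHING_ASKED; /* checked before treating 0 as a close */
    if (recd_bytes == 0)
        return RECV_PEER_CLOSED;
    if ((size_t)recd_bytes < requested_len)
        return RECV_PARTIAL;
    return RECV_COMPLETE;
}
```

With a check of this kind, a caller would only close the socket and give up on `RECV_PEER_CLOSED`, and would treat `RECV_NOTHING_ASKED` as a length-handling problem to investigate.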


---


[tickets] [opensaf:tickets] #1275 AMF: SG is in unstable state ( standby csi removal timeout during sponsor si lock )

2015-11-02 Thread Anders Widell

- **Milestone**: 4.6.1 --> 4.6.2



---

** [tickets:#1275] AMF: SG is in unstable state ( standby csi removal timeout 
during sponsor si lock )**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Thu Mar 19, 2015 01:48 PM UTC by Srikanth R
**Last Updated:** Wed Jul 15, 2015 01:08 PM UTC
**Owner:** nobody


*Setup*
Version : 4.6 FC
model : 2n
configuration : 1App,1SG,2SUs with 4comps each, 4SIs with 1 CSI each
SI-SI dependencies configured with SI1 as sponsor of SI2, SI3 and SI4.
SU1 is mapped to pl-3 and SU2 to pl-4
saAmfSGAutoRepair=1(True)
SuFailover=0(False)
component recovery policy - 3 (comp failover)

*Initial state*
All AMF entities of the application are in the unlocked state. The SIs are 
fully assigned.

*Issue* SG is in unstable state ( standby csi removal timeout during sponsor si 
lock )

*Steps Performed* 

 -> Before performing the lock operation on the sponsor SI, ensured that 
component 1 in SU2 (the standby SU) does not respond to the CSI removal callback. 

 -> After the lock operation on the sponsor SI, the SG went to an unstable state.



Below are the logs on PL-4 ( where standby SU is hosted ) :


Mar 19 19:05:11 SYSTEST-PLD-2 osafamfnd[24560]: NO Removed 
'safSi=SI1,safApp=test2nApp' from 'safSu=SU2,safSg=SG,safApp=test2nApp'
Mar 19 19:05:21 SYSTEST-PLD-2 osafamfnd[24560]: NO Removed 
'safSi=SI2,safApp=test2nApp' from 'safSu=SU2,safSg=SG,safApp=test2nApp'
Mar 19 19:05:21 SYSTEST-PLD-2 osafamfnd[24560]: CR SU-SI record addition 
failed, SU= safSu=SU2,safSg=SG,safApp=test2nApp : SI=safSi=SI3,safApp=test2nApp
Mar 19 19:05:21 SYSTEST-PLD-2 osafamfnd[24560]: CR SU-SI record addition 
failed, SU= safSu=SU2,safSg=SG,safApp=test2nApp : SI=safSi=SI4,safApp=test2nApp


Below is the final state of SIs after the lock operation.


safSi=SI1,safApp=test2nApp
saAmfSIAdminState=LOCKED(2)
saAmfSIAssignmentState=UNASSIGNED(1)
safSi=SI2,safApp=test2nApp
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=UNASSIGNED(1)
safSi=SI3,safApp=test2nApp
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=PARTIALLY_ASSIGNED(3)
safSi=SI4,safApp=test2nApp
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=PARTIALLY_ASSIGNED(3)





---


[tickets] [opensaf:tickets] #538 AMF: fail-over assignments despite comps in TERM-FAILED state

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#538] AMF: fail-over assignments despite comps in TERM-FAILED 
state**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Fri Aug 09, 2013 06:43 AM UTC by Hans Feldt
**Last Updated:** Wed Jul 15, 2015 01:53 PM UTC
**Owner:** nobody


AMF currently performs the fail-over recovery action even though a component is 
in the termination-failed presence state. This can lead to severe 
inconsistencies for the application. The specification also clearly states how 
this should work in 4.8:

"If the component and any of its contained components (for a container 
component) were assigned the active HA state for some component service
instances when the CLEANUP command was executed, and semantics of the
redundancy model of its enclosing service group guarantee that at a point in
time only one component can be in the active HA state for a given component
service instance, the failure to terminate that component prevents the
Availability Management Framework from assigning to another component the
active HA state for these component service instances (and by the same token
prevents the assignment of the active HA state to other service units for the
service instances that contain the involved CSIs). In this case, the service
instances will stay unassigned until an administrative action is performed to
terminate the failed component."
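
The rule quoted above can be read as a guard that runs before any active assignment is moved. The sketch below is only an illustration of that reading, not AMF code; the enum, the struct and the function `may_reassign_active` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical presence states; names are illustrative only. */
enum presence_state {
    PRES_INSTANTIATED,
    PRES_UNINSTANTIATED,
    PRES_TERMINATION_FAILED
};

/* Hypothetical view of a component after the CLEANUP command has run. */
struct failed_comp {
    enum presence_state presence;
    bool had_active_csi; /* was active for some CSI when CLEANUP was executed */
};

/*
 * single_active_model: the redundancy model guarantees at most one active
 * component per CSI (assumed true for 2N in this sketch).
 * Returns true if the active HA state may be assigned to another component.
 */
bool may_reassign_active(const struct failed_comp *c, bool single_active_model)
{
    assert(c != NULL);
    if (c->presence == PRES_TERMINATION_FAILED &&
        c->had_active_csi && single_active_model)
        return false; /* SIs stay unassigned until an admin action */
    return true;
}
```

Under this reading, the fail-over in the log of this ticket would be blocked, because SU1 is in TERMINATION_FAILED and held the active assignment in a 2N service group.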

This can be reproduced by running the AMF 2N sa-aware sample app with the 
cleanup script modified to do "exit 1", which gives this effect when the active 
component is killed:

Aug  9 08:40:01 Vostro osafamfnd[11307]: NO 
'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 
'avaDown' : Recovery is 'componentRestart'
Aug  9 08:40:01 Vostro osafamfnd[11307]: NO Cleanup of 
'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' failed
Aug  9 08:40:01 Vostro osafamfnd[11307]: NO Reason:'Exec of script success, but 
script exits with non-zero status'
Aug  9 08:40:01 Vostro osafamfnd[11307]: NO Exit code: 1
Aug  9 08:40:01 Vostro osafamfnd[11307]: NO Component Failover trigerred for 
'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1': Failed component: 
'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
Aug  9 08:40:01 Vostro osafamfnd[11307]: NO 
'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED => 
TERMINATION_FAILED
Aug  9 08:40:01 Vostro osafamfnd[11307]: NO Assigning 
'safSi=AmfDemo,safApp=AmfDemo1' QUIESCED to 
'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
Aug  9 08:40:01 Vostro osafamfnd[11307]: NO Assigned 
'safSi=AmfDemo,safApp=AmfDemo1' QUIESCED to 
'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
Aug  9 08:40:01 Vostro osafamfnd[11307]: NO Assigning 
'safSi=AmfDemo,safApp=AmfDemo1' ACTIVE to 
'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1'
Aug  9 08:40:01 Vostro amf_demo[11620]: CSI Set - HAState Active for all 
assigned CSIs
Aug  9 08:40:01 Vostro osafamfnd[11307]: NO Assigned 
'safSi=AmfDemo,safApp=AmfDemo1' ACTIVE to 
'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1'
Aug  9 08:40:01 Vostro osafamfnd[11307]: NO Removing 
'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
Aug  9 08:40:01 Vostro osafamfnd[11307]: NO Removed 
'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'



---


[tickets] [opensaf:tickets] #531 osaf: Some files have MS-DOS line endings

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#531] osaf: Some files have MS-DOS line endings**

**Status:** accepted
**Milestone:** 4.6.2
**Created:** Mon Aug 05, 2013 11:03 AM UTC by Anders Widell
**Last Updated:** Wed Jul 15, 2015 01:55 PM UTC
**Owner:** Anders Widell


I did a quick check, and the following files appear to have MS-DOS line endings:

osaf/services/saf/smfsv/schema/SAI-AIS-SMF-ETF-A.01.02_OpenSAF.xsd
osaf/services/saf/smfsv/schema/SAI-AIS-SMF-UCS-A.01.02_OpenSAF.xsd
samples/amf/non_sa_aware/net-snmp.xml
samples/amf/sa_aware/AppConfig-2N.xml
samples/amf/sa_aware/AppConfig-nwayactive.xml
samples/amf/wrapper/net-snmp.xml
samples/smfsv/campaigns/campaign_rolling_comp_agent.xml
samples/smfsv/campaigns/campaign_rolling_comp.xml
samples/smfsv/campaigns/campaign_rolling_nodes_os_installremove.xml
samples/smfsv/campaigns/campaign_rolling_nodes.xml
samples/smfsv/campaigns/campaign_rolling_su.xml
tests/avsv/suites/AppConfig.xml
tests/common/inc/tet_startup.h
tests/cpsv/inc/tet_cpsv_conf.h
tests/cpsv/inc/tet_cpsv.h
tests/cpsv/src/tet_cpa.c
tests/cpsv/src/tet_cpsv_util.c
tests/cpsv/suites/reg_cpsv.cfg
tests/edsv/inc/tet_eda.h
tests/edsv/src/tet_edsv_func.c
tests/edsv/src/tet_edsv_util.c
tests/edsv/suites/reg_edsv.cfg
tests/glsv/inc/tet_gla_conf.h
tests/glsv/inc/tet_glsv.h
tests/glsv/src/tet_gla.c
tests/glsv/src/tet_gla_conf.c
tests/glsv/src/tet_gld.c
tests/glsv/src/tet_glsv_util.c
tests/glsv/suites/reg_glsv.cfg
tests/mbcsv/inc/mbcsv_purpose.h
tests/mbcsv/src/mbcsv_cb_purpose.c
tests/mbcsv/src/mbcsv_ckpt_purpose.c
tests/mbcsv/src/mbcsv_inv.c
tests/mbcsv/src/mbcsv_purpose.c
tests/mbcsv/src/mbcsv_tmr_purpose.c
tests/mbcsv/src/tet_mbcsv_util.c
tests/mbcsv/suites/reg_mbcsv.cfg
tests/mds/inc/tet_mdstipc.h
tests/mds/suites/reg_mds.cfg
tests/mqsv/inc/tet_mqa_conf.h
tests/mqsv/inc/tet_mqsv.h
tests/mqsv/src/tet_mqa.c
tests/mqsv/src/tet_mqa_conf.c
tests/mqsv/src/tet_mqd.c
tests/mqsv/src/tet_mqnd.c
tests/mqsv/src/tet_mqsv_util.c
tests/mqsv/suites/reg_mqsv.cfg
tests/OpenSAF_TET_Changs.txt
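
A check of this kind can be automated by scanning file contents for a CR LF pair. The following is a minimal sketch, assuming the file has already been read into memory; `has_crlf` is a hypothetical helper and is not necessarily how the list above was produced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the buffer contains at least one CR LF pair. */
bool has_crlf(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    /* i + 1 < len keeps the look-ahead at buf[i + 1] inside the buffer */
    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n')
            return true;
    }
    return false;
}
```

A lone CR without a following LF is deliberately not reported, since only MS-DOS style line endings are of interest here.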



---


[tickets] [opensaf:tickets] #638 node cannot join AMF cluster after restart

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#638] node cannot join AMF cluster after restart**

**Status:** accepted
**Milestone:** 4.6.2
**Created:** Fri Nov 22, 2013 02:54 PM UTC by Hans Feldt
**Last Updated:** Wed Jul 15, 2015 01:47 PM UTC
**Owner:** A V Mahesh (AVM)


OpenSAF 4.2.2 changeset 3796, 79 extra patches
System: RHEL based, 2 node cluster, MDS/TIPC

After a node reboot, the standby controller cannot join the cluster again. 
This can be seen in the syslog on the active controller:


Nov 17 17:15:20 notice atrcxb3166 osafamfd[6038]: Cold sync complete!
Nov 19 17:40:07 notice atrcxb3166 osafamfd[6712]: Node 'SC-2' joined the cluster
Nov 19 17:42:08 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 19 17:42:28 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f

Nov 21 16:24:21 notice atrcxb3166 osafamfd[6712]: Node 'SC-2' left the cluster
Nov 21 16:29:04 notice atrcxb3166 osafamfd[6712]: Node 'SC-2' joined the cluster
Nov 21 16:29:24 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:29:44 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:30:04 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:30:24 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:30:54 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:31:14 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:31:34 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:31:54 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:32:14 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:32:34 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:32:54 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:33:14 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:33:34 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:33:54 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:34:14 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:34:34 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:34:54 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:35:14 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:35:34 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:35:54 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:36:14 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:36:34 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:36:54 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 17:41:58 err atrcxb3166 osafamfd[6712]: avd_d2n_msg_dequeue: ncsmds_api 
failed 2
Nov 21 17:42:08 notice atrcxb3166 osafamfd[6712]: Node 'SC-2' left the cluster
Nov 21 17:42:18 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:42:38 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:42:58 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:43:18 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:43:39 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:43:59 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:44:19 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:44:39 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:44:59 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:45:19 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:45:39 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:45:59 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:46:19 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:46:39 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
msg id 210, from 2020f should be 1
Nov 21 17:46:59 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
msg id 211, from 2020f should be 1

Nov 21 18:00:40 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
msg id 252, from 2020f should be 1
Nov 21 18:01:00 notice atrcxb3166 osafamfd[6712]: Node 'SC-2' left the cluster
Nov 22 11:44:37 notice atrcxb3166 osafamfd[6712]: Re-initializing with IMM
Nov 22 11:44:39 notice atrcxb3166 osafamfd[6712]:

[tickets] [opensaf:tickets] #178 escalation policy is not happening till the restart count exceeds, instead of reaching saAmfSGCompRestartMax for NPI components

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#178] escalation policy is not happening till the restart count 
exceeds, instead of reaching saAmfSGCompRestartMax for NPI components**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Tue May 14, 2013 06:24 AM UTC by Nagendra Kumar
**Last Updated:** Wed Aug 12, 2015 09:11 AM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2144

Error escalation does not happen until the restart count exceeds 
saAmfSGCompRestartMax for components brought up as NPI.


But according to the spec, the first level of escalation should happen when the 
restart count reaches saAmfSGCompRestartMax.


The spec states this in section 3.11.2.2, page 203:


If this count reaches the saAmfSGCompRestartMax value before the end of the
"component restart" probation period, the Availability Management Framework
performs the first level of recovery escalation for that service unit: the
Availability Management Framework restarts the entire service unit
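
The difference between "exceeds" and "reaches" comes down to a single comparison operator. The sketch below shows the "reaches" behaviour the spec text describes; `record_restart_and_check` is a hypothetical helper, not the AMF node director code, and the probation timer is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count one more component restart and report whether the first level of
 * escalation is due. Using >= escalates when the count reaches the maximum;
 * using > would allow one extra restart before escalating.
 */
bool record_restart_and_check(uint32_t *restart_count, uint32_t comp_restart_max)
{
    assert(restart_count != NULL);
    (*restart_count)++;
    return *restart_count >= comp_restart_max;
}
```

With a maximum of 3, the third restart inside the probation period would trigger the service unit restart, rather than the fourth.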





---


[tickets] [opensaf:tickets] #1308 ccb object create fails with invalid param with old SaNameT Apis

2015-11-02 Thread Anders Widell

- **Milestone**: 4.6.0 --> never

---

** [tickets:#1308] ccb object create fails with invalid param with old SaNameT
Apis**

**Status:** duplicate
**Milestone:** never
**Created:** Wed Apr 08, 2015 07:29 AM UTC by Sirisha Alla
**Last Updated:** Thu Apr 09, 2015 01:45 AM UTC
**Owner:** nobody
**Attachments:**

-
[extralength.tar](https://sourceforge.net/p/opensaf/tickets/1308/attachment/extralength.tar)
(532.5 kB; application/x-tar)

This issue is seen on changeset 6377 with the patch for #1267 (969 backport
changes). The setup has a single PBE enabled with 50k objects.

The IMM application tree is created in the following manner: obj1 is the
parent of obj2 and obj3, obj2 is the parent of obj4, and obj4 is the parent of
obj5. All the APIs used are the old ones that take SaNameT.

The creates for obj1, obj2 and obj3 are added to the CCB successfully. When
the creates for obj4 and obj5 are added to the CCB, the CCB create fails with
INVALID_PARAM.

syslog on SC-1:

Apr 8 12:38:21 SLES-64BIT-SLOT1 osafimmpbed: IN Create of class
noDanglingPreconfigurationClass committing with ccbId:10007
Apr 8 12:38:21 SLES-64BIT-SLOT1 osafimmnd[7221]: NO Create of class
noDanglingPreconfigurationClass is PERSISTENT.
Apr 8 12:38:21 SLES-64BIT-SLOT1 osafimmnd[7221]: NO ERR_INVALID_PARAM: Not a
proper parent name:configRdnObj2,configRdnObj1^? size:28
Apr 8 12:38:21 SLES-64BIT-SLOT1 osafimmnd[7221]: NO ERR_INVALID_PARAM: Not a
proper parent name:configRdnObj4,configRdnObj2,configRdnObj1Â size:42
Apr 8 12:38:21 SLES-64BIT-SLOT1 osafimmnd[7221]: NO Ccb 2 COMMITTED
(noDanglingPreconfigurationClass)

The lengths passed in the SaNameT are 27 and 41 for obj4 and obj5 respectively,
but internally they are treated as 28 and 42.

Syslog and IMMND traces are attached. This is an old test that worked fine
before the changes for #643.
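
The off-by-one above is the kind of error that appears when a length-prefixed name is handled as if it were NUL-terminated. The following is a minimal sketch of a conversion that honours the stated length; `struct lp_name`, `LP_NAME_CAPACITY` and `lp_name_to_cstr` are hypothetical stand-ins for illustration, not the IMM implementation or the real SaNameT definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LP_NAME_CAPACITY 256 /* hypothetical capacity for this sketch */

/* Hypothetical stand-in for a length-prefixed name such as SaNameT. */
struct lp_name {
    uint16_t length;                 /* number of valid bytes in value */
    uint8_t value[LP_NAME_CAPACITY]; /* not necessarily NUL-terminated */
};

/*
 * Copy exactly n->length bytes into out and terminate the result.
 * Returns the number of bytes copied, or (size_t)-1 if the name is
 * malformed or out is too small for the name plus the terminator.
 */
size_t lp_name_to_cstr(const struct lp_name *n, char *out, size_t out_size)
{
    assert(n != NULL && out != NULL);
    if (n->length > LP_NAME_CAPACITY || (size_t)n->length + 1 > out_size)
        return (size_t)-1;
    memcpy(out, n->value, n->length);
    out[n->length] = '\0'; /* bytes past length are never read */
    return n->length;
}
```

Whatever byte happens to follow the valid bytes in the buffer is ignored, so a 27-byte parent name stays 27 bytes long after conversion.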

---


[tickets] [opensaf:tickets] #1350 configuration error should not lead to reboot of the node

2015-11-02 Thread Anders Widell

- **Milestone**: 4.4.2 --> never



---

** [tickets:#1350] configuration error should not lead to reboot of the node**

**Status:** invalid
**Milestone:** never
**Created:** Tue Apr 28, 2015 05:46 AM UTC by Sirisha Alla
**Last Updated:** Tue Apr 28, 2015 07:09 AM UTC
**Owner:** nobody


2PBE was not configured on one of the nodes. As a result, the standby 
controller reboots continuously. Rebooting the node does not recover it from 
such a configuration error. 

Apr 28 16:39:52 SLES-SLOT2 osafimmd[967]: NO SBY: New Epoch for IMMND process 
at node 2020f old epoch: 4  new epoch:5
Apr 28 16:39:52 SLES-SLOT2 osafimmd[967]: ER Active IMMD has 2PBE enabled, yet 
this standby is not enabled for 2PBE - exiting
Apr 28 16:39:52 SLES-SLOT2 osafimmnd[831]: NO Epoch set to 5 in ImmModel
Apr 28 16:39:52 SLES-SLOT2 osafamfnd[901]: NO 
'safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Apr 28 16:39:52 SLES-SLOT2 osafamfnd[901]: ER 
safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Apr 28 16:39:52 SLES-SLOT2 osafamfnd[901]: Rebooting OpenSAF NodeId = 131599 EE 
Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 
131599, SupervisionTime = 60
Apr 28 16:39:52 SLES-SLOT2 opensaf_reboot: Rebooting local node; timeout=60



---


[tickets] [opensaf:tickets] #1109 standby failed to come up during failover

2015-11-02 Thread Anders Widell

- **Milestone**: 4.3.3 --> never



---

** [tickets:#1109] standby failed to come up during failover**

**Status:** duplicate
**Milestone:** never
**Created:** Thu Sep 18, 2014 07:33 AM UTC by Sirisha Alla
**Last Updated:** Thu Sep 18, 2014 11:24 AM UTC
**Owner:** nobody
**Attachments:**

- 
[logs.tar.bz2](https://sourceforge.net/p/opensaf/tickets/1109/attachment/logs.tar.bz2)
 (221.7 kB; application/x-bzip)


The issue is seen on SLES X86 VMs running with a single PBE and OpenSAF 
changeset 5697 plus the #946 and #1067 patches.

During failover, the standby failed to come up. Syslog of SC-1:

Sep 18 12:28:36 SLES-64BIT-SLOT1 osafclmd[2436]: Started
Sep 18 12:28:37 SLES-64BIT-SLOT1 osafimmnd[2399]: NO PBE-OI established on 
other SC. Dumping incrementally to file imm.db
Sep 18 12:28:39 SLES-64BIT-SLOT1 kernel: [   26.576106] eth0: no IPv6 routers 
present
Sep 18 12:28:46 SLES-64BIT-SLOT1 osafclmd[2436]: ER saNtfInitialize Failed (5)
Sep 18 12:28:46 SLES-64BIT-SLOT1 osafclmd[2436]: ER clms_ntf_init FAILED
Sep 18 12:28:46 SLES-64BIT-SLOT1 opensafd[2338]: ER Failed   DESC:CLMD
Sep 18 12:28:46 SLES-64BIT-SLOT1 opensafd[2338]: ER Going for recovery
Sep 18 12:28:46 SLES-64BIT-SLOT1 opensafd[2338]: ER Trying To RESPAWN 
/usr/lib64/opensaf/clc-cli/osaf-clmd attempt #1
Sep 18 12:28:46 SLES-64BIT-SLOT1 opensafd[2338]: ER Sending SIGKILL to CLMD, 
pid=2428
Sep 18 12:28:46 SLES-64BIT-SLOT1 osafclmd[2436]: ER clms_init failed
Sep 18 12:28:46 SLES-64BIT-SLOT1 osafclmd[2436]: ER Failed, exiting...
Sep 18 12:29:01 SLES-64BIT-SLOT1 osafclmd[2457]: Started
Sep 18 12:29:11 SLES-64BIT-SLOT1 osafclmd[2457]: ER saNtfInitialize Failed (5)
Sep 18 12:29:11 SLES-64BIT-SLOT1 osafclmd[2457]: ER clms_ntf_init FAILED
Sep 18 12:29:11 SLES-64BIT-SLOT1 opensafd[2338]: ER Could Not RESPAWN CLMD
Sep 18 12:29:11 SLES-64BIT-SLOT1 opensafd[2338]: ER Failed   DESC:CLMD
Sep 18 12:29:11 SLES-64BIT-SLOT1 opensafd[2338]: ER Trying To RESPAWN 
/usr/lib64/opensaf/clc-cli/osaf-clmd attempt #2
Sep 18 12:29:11 SLES-64BIT-SLOT1 opensafd[2338]: ER Sending SIGKILL to CLMD, 
pid=2452
Sep 18 12:29:11 SLES-64BIT-SLOT1 osafclmd[2457]: ER clms_init failed
Sep 18 12:29:11 SLES-64BIT-SLOT1 osafclmd[2457]: ER Failed, exiting...
Sep 18 12:29:26 SLES-64BIT-SLOT1 osafclmd[2482]: Started
Sep 18 12:29:36 SLES-64BIT-SLOT1 osafclmd[2482]: ER saNtfInitialize Failed (5)
Sep 18 12:29:36 SLES-64BIT-SLOT1 osafclmd[2482]: ER clms_ntf_init FAILED
Sep 18 12:29:36 SLES-64BIT-SLOT1 opensafd[2338]: ER Could Not RESPAWN CLMD
Sep 18 12:29:36 SLES-64BIT-SLOT1 opensafd[2338]: ER Failed   DESC:CLMD
Sep 18 12:29:36 SLES-64BIT-SLOT1 opensafd[2338]: ER FAILED TO RESPAWN
Sep 18 12:29:36 SLES-64BIT-SLOT1 osafclmd[2482]: ER clms_init failed
Sep 18 12:29:36 SLES-64BIT-SLOT1 osafclmd[2482]: ER Failed, exiting...
Sep 18 12:29:37 SLES-64BIT-SLOT1 osaffmd[2379]: exiting for shutdown
Sep 18 12:29:37 SLES-64BIT-SLOT1 osafimmd[2389]: exiting for shutdown
Sep 18 12:29:37 SLES-64BIT-SLOT1 osafimmnd[2399]: exiting for shutdown
Sep 18 12:29:37 SLES-64BIT-SLOT1 osafntfimcnd[2429]: ER saImmOiDispatch() Fail 
SA_AIS_ERR_BAD_HANDLE (9)

Syslog, MDS log and CLMD traces are attached. NTFD traces are not available; 
I will try to get them if the issue turns out to be reproducible.


---


[tickets] [opensaf:tickets] #1122 attribute authorizedGroup of access control feature is modifiable by any user

2015-11-02 Thread Anders Widell

- **Milestone**: 4.3.3 --> never



---

** [tickets:#1122] attribute authorizedGroup of access control feature is 
modifiable by any user**

**Status:** duplicate
**Milestone:** never
**Created:** Mon Sep 22, 2014 12:11 PM UTC by surender khetavath
**Last Updated:** Mon Sep 22, 2014 02:29 PM UTC
**Owner:** nobody


changeset : 5679

According to README.ACCESS_CONTROL:
"""authorizedGroup" is an optional attribute of type string holding the name of
an existing linux group. Members of this group will have access to IMM.

Only the root user can change these attributes.
"""

But any user other than root is also able to modify this attribute.


Trace shown below:

immcfg -a authorizedGroup="GROUP" opensafImm=opensafImm,safApp=safImmService
tet@SC-1:/etc/opensaf> immlist opensafImm=opensafImm,safApp=safImmService
Name                       Type         Value(s)

authorizedGroup            SA_STRING_T  GROUP
accessControlMode          SA_UINT32_T  0 (0x0)
SaImmAttrImplementerName   SA_STRING_T  OpenSafImmPBE
SaImmAttrClassName         SA_STRING_T  OpensafImm
SaImmAttrAdminOwnerName    SA_STRING_T  




---


[tickets] [opensaf:tickets] #1568 CLMD segfaulted for pending lock op during middleware si-swap

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1568] CLMD segfaulted for pending lock op during middleware 
si-swap**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Sat Oct 24, 2015 11:17 AM UTC by Srikanth R
**Last Updated:** Sat Oct 24, 2015 11:17 AM UTC
**Owner:** nobody
**Attachments:**

- [SC-1.tgz](https://sourceforge.net/p/opensaf/tickets/1568/attachment/SC-1.tgz) (27.3 kB; application/x-compressed-tar)


Changeset : 6901

Steps :

1) Invoked a lock operation on one of the payloads, PL-5.
2) The CLM Agent on PL-3 did not respond to the lock operation.
3) With this operation still pending, invoked a controller switchover.
4) CLMD on the active controller segfaulted during quiesced processing.

Oct 24 15:53:13 SYSTEST-CNTLR-1 osafamfd[5863]: NO Pending Response sent for 
CLM track callback::OK '1'
Oct 24 15:53:15 SYSTEST-CNTLR-1 osafamfd[5863]: NO safSi=SC-2N,safApp=OpenSAF 
Swap initiated
Oct 24 15:53:15 SYSTEST-CNTLR-1 osafamfnd[5873]: NO Assigning 
'safSi=SC-2N,safApp=OpenSAF' QUIESCED to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
Oct 24 15:53:15 SYSTEST-CNTLR-1 osafimmnd[5809]: NO Implementer locally 
disconnected. Marking it as doomed 173 <457, 2010f> (safSmfService)
Oct 24 15:53:15 SYSTEST-CNTLR-1 osafamfnd[5873]: NO 
'safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'


signal: 11 pid: 0 uid: 0
/usr/lib64/libopensaf_core.so.0(+0x1a27b)[0x7ff095ffb27b]
/lib64/libpthread.so.0(+0xf7c0)[0x7ff0951207c0]
/lib64/libc.so.6(cfree+0x39)[0x7ff094a0a2c9]
/lib64/librt.so.1(timer_delete+0x42)[0x7ff094d08b52]
/usr/lib64/opensaf/osafclmd[0x405298]
/usr/lib64/libSaAmf.so.0(+0x9213)[0x7ff095dd0213]
/usr/lib64/libSaAmf.so.0(+0xa307)[0x7ff095dd1307]
/usr/lib64/libSaAmf.so.0(saAmfDispatch+0x1d4)[0x7ff095dcaf94]
/usr/lib64/opensaf/osafclmd[0x4047df]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x7ff0949aec36]
/usr/lib64/opensaf/osafclmd[0x404ea5]

CLMD trace is attached


---


[tickets] [opensaf:tickets] #1573 pyosaf: Add missing IMM flags

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1573] pyosaf: Add missing IMM flags**

**Status:** review
**Milestone:** 4.6.2
**Created:** Wed Oct 28, 2015 08:12 AM UTC by Hung Nguyen
**Last Updated:** Wed Oct 28, 2015 08:43 AM UTC
**Owner:** Hung Nguyen


Missing flags

#define SA_IMM_ATTR_NO_DUPLICATES     0x0100 /* OpenSaf 4.3 */
#define SA_IMM_ATTR_NOTIFY            0x0200 /* OpenSaf 4.3 */
#define SA_IMM_ATTR_NO_DANGLING       0x0400 /* OpenSaf 4.4 */
#define SA_IMM_ATTR_DN                0x0800 /* OpenSaf 4.6 */
#define SA_IMM_ATTR_DEFAULT_REMOVED   0x1000 /* OpenSaf 4.7 */

#define SA_IMM_SEARCH_GET_CONFIG_ATTR        0x0001 /* OpenSaf 4.3 */
#define SA_IMM_SEARCH_NO_DANGLING_DEPENDENTS 0x0001 /* OpenSaf 4.4 */





---


[tickets] [opensaf:tickets] #1510 CKPT: cpnd crashes during checkpoint open timeout with large sections

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1510] CKPT: cpnd crashes during checkpoint open timeout with large 
sections**

**Status:** review
**Milestone:** 4.6.2
**Created:** Thu Oct 01, 2015 04:14 PM UTC by Alex Jones
**Last Updated:** Thu Oct 01, 2015 07:54 PM UTC
**Owner:** Alex Jones


When opening a collocated checkpoint replica where the active has a large 
number of sections (~200k), the sync from the active can time out with error 
code SA_AIS_ERR_TRY_AGAIN. In this case the code frees the memory for the node, 
but does not delete the node from the db. When the checkpoint access is tried 
again, the db still references the freed node memory, and ckptnd crashes.
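
The pattern described above (release the node but leave it linked in the db) can be illustrated with a toy model. This is a sketch only, not cpnd code: `CkptNode`, `open_buggy`, `open_fixed` and `lookup_is_safe` are hypothetical names, a plain dict stands in for the patricia tree, and a `freed` flag stands in for memory that has been returned to the allocator.

```python
# Toy model of the failure described in this ticket; all names are hypothetical.
class CkptNode:
    def __init__(self, ckpt_id):
        self.ckpt_id = ckpt_id
        self.freed = False  # stands in for memory handed back to the allocator


def open_buggy(db, ckpt_id, sync_ok):
    """Error path releases the node but leaves it linked in the db."""
    node = CkptNode(ckpt_id)
    db[ckpt_id] = node
    if not sync_ok:
        node.freed = True   # memory released ...
        return "TRY_AGAIN"  # ... but db[ckpt_id] still points at it
    return "OK"


def open_fixed(db, ckpt_id, sync_ok):
    """Error path unlinks the node from the db before releasing it."""
    node = CkptNode(ckpt_id)
    db[ckpt_id] = node
    if not sync_ok:
        db.pop(ckpt_id, None)
        node.freed = True
        return "TRY_AGAIN"
    return "OK"


def lookup_is_safe(db, ckpt_id):
    """A retry is safe if the lookup finds nothing or finds a live node."""
    node = db.get(ckpt_id)
    return node is None or not node.freed
```

In the buggy variant a retry after `TRY_AGAIN` finds a stale entry, which corresponds to the "Invalid read ... inside a block ... free'd" reports from Valgrind.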

Valgrind analysis shows the following:

==53610== Thread 1:
==53610== Invalid read of size 4
==53610==    at 0x4E4D7C4: ncs_patricia_tree_get (patricia.c:93)
==53610==    by 0x40400D: cpnd_ckpt_node_get (cpnd_db.c:42)
==53610==    by 0x40D1A2: cpnd_process_evt (cpnd_evt.c:1957)
==53610==    by 0x40E9D6: cpnd_main_process (cpnd_init.c:568)
==53610==    by 0x403882: main (cpnd_main.c:72)
==53610==  Address 0x687de60 is 0 bytes inside a block of size 1,072 free'd
==53610==    at 0x4C29D4E: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==53610==    by 0x40A827: cpnd_evt_proc_ckpt_open (cpnd_evt.c:983)
==53610==    by 0x40D426: cpnd_process_evt (cpnd_evt.c:202)
==53610==    by 0x40E9D6: cpnd_main_process (cpnd_init.c:568)
==53610==    by 0x403882: main (cpnd_main.c:72)
==53610== 
==53610== Invalid read of size 8
==53610==    at 0x4E4D7C0: ncs_patricia_tree_get (patricia.c:90)
==53610==    by 0x40400D: cpnd_ckpt_node_get (cpnd_db.c:42)
==53610==    by 0x40D1A2: cpnd_process_evt (cpnd_evt.c:1957)
==53610==    by 0x40E9D6: cpnd_main_process (cpnd_init.c:568)
==53610==    by 0x403882: main (cpnd_main.c:72)
==53610==  Address 0x687de70 is 16 bytes inside a block of size 1,072 free'd
==53610==    at 0x4C29D4E: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==53610==    by 0x40A827: cpnd_evt_proc_ckpt_open (cpnd_evt.c:983)
==53610==    by 0x40D426: cpnd_process_evt (cpnd_evt.c:202)
==53610==    by 0x40E9D6: cpnd_main_process (cpnd_init.c:568)
==53610==    by 0x403882: main (cpnd_main.c:72)
==53610== 
==53610== Invalid read of size 8
==53610==    at 0x4E4D7FB: ncs_patricia_tree_get (patricia.c:435)
==53610==    by 0x40400D: cpnd_ckpt_node_get (cpnd_db.c:42)
==53610==    by 0x40D1A2: cpnd_process_evt (cpnd_evt.c:1957)
==53610==    by 0x40E9D6: cpnd_main_process (cpnd_init.c:568)
==53610==    by 0x403882: main (cpnd_main.c:72)
==53610==  Address 0x687de78 is 24 bytes inside a block of size 1,072 free'd
==53610==    at 0x4C29D4E: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==53610==    by 0x40A827: cpnd_evt_proc_ckpt_open (cpnd_evt.c:983)
==53610==    by 0x40D426: cpnd_process_evt (cpnd_evt.c:202)
==53610==    by 0x40E9D6: cpnd_main_process (cpnd_init.c:568)
==53610==    by 0x403882: main (cpnd_main.c:72)
==53610== 
==53610== Invalid read of size 1
==53610==    at 0x4C2D0B9: bcmp (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==53610==    by 0x4E4D803: ncs_patricia_tree_get (patricia.c:435)
==53610==    by 0x40400D: cpnd_ckpt_node_get (cpnd_db.c:42)
==53610==    by 0x40D1A2: cpnd_process_evt (cpnd_evt.c:1957)
==53610==    by 0x40E9D6: cpnd_main_process (cpnd_init.c:568)
==53610==    by 0x403882: main (cpnd_main.c:72)
==53610==  Address 0x687de80 is 32 bytes inside a block of size 1,072 free'd
==53610==    at 0x4C29D4E: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==53610==    by 0x40A827: cpnd_evt_proc_ckpt_open (cpnd_evt.c:983)
==53610==    by 0x40D426: cpnd_process_evt (cpnd_evt.c:202)
==53610==    by 0x40E9D6: cpnd_main_process (cpnd_init.c:568)
==53610==    by 0x403882: main (cpnd_main.c:72)
==53610== 
==53610== Invalid read of size 1
==53610==    at 0x4C2D0D0: bcmp (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==53610==    by 0x4E4D803: ncs_patricia_tree_get (patricia.c:435)
==53610==    by 0x40400D: cpnd_ckpt_node_get (cpnd_db.c:42)
==53610==    by 0x40D1A2: cpnd_process_evt (cpnd_evt.c:1957)
==53610==    by 0x40E9D6: cpnd_main_process (cpnd_init.c:568)
==53610==    by 0x403882: main (cpnd_main.c:72)
==53610==  Address 0x687de81 is 33 bytes inside a block of size 1,072 free'd
==53610==    at 0x4C29D4E: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==53610==    by 0x40A827: cpnd_evt_proc_ckpt_open (cpnd_evt.c:983)
==53610==    by 0x40D426: cpnd_process_evt (cpnd_evt.c:202)
==53610==    by 0x40E9D6: cpnd_main_process (cpnd_init.c:568)
==53610==    by 0x403882: main (cpnd_main.c:72)
==53610== 
==53610== Invalid read of size 4
==53610==    at 0x4E4D7C4: ncs_patricia_tree_get (patricia.c:93)
==53610==    by 0x40400D: cpnd_ckpt_node_get (cpnd_db.c:42)
==53610==    by 0x405872: cpnd_evt_proc_nd2nd_ckpt_sect_create (cpnd_evt.c:2602)
==53610==    by 0x40D2B8: cpnd_process_evt (cpnd_evt.c:335)
==53610==    by 0x40E9D6:

[tickets] [opensaf:tickets] #1503 IMM: Augumented CCb client went down the OM client should get err

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1503] IMM: Augumented CCb client went down the OM client should 
get err**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Fri Sep 25, 2015 09:18 AM UTC by Neelakanta Reddy
**Last Updated:** Mon Oct 05, 2015 10:15 AM UTC
**Owner:** Neelakanta Reddy


The OM is on node1 and the OI is on node2.
The OM creates an object.
The OI augments the CCB by creating an object, and then the OI client goes down.
The CCB gets aborted in the IMM database, but the OM create API does not get a 
return value; after SYNC_TIMEOUT the OM API receives TIME_OUT.



---


[tickets] [opensaf:tickets] #1515 AMF : SU struck in terminating for failure during csi assignment in si-swap (Nway)

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1515] AMF : SU struck in terminating for failure during csi 
assignment in si-swap (Nway)**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Mon Oct 05, 2015 12:45 PM UTC by Srikanth R
**Last Updated:** Tue Oct 06, 2015 06:38 AM UTC
**Owner:** nobody


Changeset : 6901 
AMF application : 3 SUs with 5 SIs (SU1 and SU3 hosted on PL-3, SU2 hosted on 
PL-4), Nway redundancy model.

Issue : SU stuck in terminating state after a failure during CSI active 
assignment in si-swap (Nway).

Steps :

 -> Initially brought up the application by unlocking the SG. Below are the 
assignments.

 TestApp_SI1 TestApp_SI2 TestApp_SI3 TestApp_SI4 TestApp_SI5

TestApp_SU1 ACTIVE ACTIVE ACTIVE STANDBY
TestApp_SU2 STANDBY STANDBY STANDBY ACTIVE
TestApp_SU3 ACTIVE STANDBY


 -> Before performing the si-swap operation on SU1, ensured that the component 
with the SI1 standby assignment would reject the active callback.
 
 
 -> Invoked the si-swap operation. As the component responded with 
ERR_FAILED_OP in the active callback, a recovery action was triggered for the SU.
 
 
 Oct  5 15:15:02 PAYLOAD-2 osafamfnd[2659]: NO Assigning 
'safSi=TestApp_SI1,safApp=TestApp_Nway' ACTIVE to 
'safSu=TestApp_SU2,safSg=TestApp_SG1,safApp=TestApp_Nwa
 Oct  5 15:15:02 PAYLOAD-2 osafamfnd[2659]: NO 
'safComp=COMP1,safSu=TestApp_SU2,safSg=TestApp_SG1,safApp=TestApp_Nway' faulted 
due to 'csiSetcallbackFailed' : Recovery is 'componentFailover'
Oct  5 15:15:02 PAYLOAD-2 osafamfnd[2659]: NO 
'safSu=TestApp_SU2,safSg=TestApp_SG1,safApp=TestApp_Nway' Presence State 
INSTANTIATED => TERMINATING


But the SU got stuck in terminating state. Below are the final assignments.

 TestApp_SI1 TestApp_SI2 TestApp_SI3 TestApp_SI4 TestApp_SI5

TestApp_SU1 QUIESCED ACTIVE ACTIVE STANDBY STANDBY
TestApp_SU2 ACTIVE STANDBY STANDBY QUIESCED
TestApp_SU3 STANDBY ACTIVE ACTIVE


  
  






---


[tickets] [opensaf:tickets] #1512 AMF : SU struck in Quiesced state after Lock operation of SU in Nway

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1512] AMF : SU struck in Quiesced state after Lock operation of SU 
in Nway**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Mon Oct 05, 2015 08:56 AM UTC by Srikanth R
**Last Updated:** Mon Oct 05, 2015 09:26 AM UTC
**Owner:** nobody
**Attachments:**

- [QuiescedNway.sh](https://sourceforge.net/p/opensaf/tickets/1512/attachment/QuiescedNway.sh) (11.3 kB; application/x-shellscript)


Changeset : 6901
AMF application : 3 SUs hosted on PL-3 and PL-4, 4 SIs (Redundancy model : Nway)

Issue : SU stuck in Quiesced state after a lock operation issued on one of the 
SUs.


Steps :

 -> Initially brought up the AMF application configured in the Nway redundancy 
model with 3 SUs and 4 SIs. Below are the configuration attributes for the SG.
 
saAmfSGNumPrefStandbySUs               SA_UINT32_T  1 (0x1)
saAmfSGNumPrefInserviceSUs             SA_UINT32_T  4 (0x4)
saAmfSGNumPrefAssignedSUs              SA_UINT32_T  4 (0x4)
saAmfSGNumPrefActiveSUs                SA_UINT32_T  3 (0x3)
saAmfSGNumCurrNonInstantiatedSpareSUs  SA_UINT32_T  0 (0x0)
saAmfSGNumCurrInstantiatedSpareSUs     SA_UINT32_T  0 (0x0)
saAmfSGNumCurrAssignedSUs              SA_UINT32_T  3 (0x3)
saAmfSGMaxStandbySIsperSU              SA_UINT32_T  1 (0x1)
saAmfSGMaxActiveSIsperSU               SA_UINT32_T  3 (0x3)

 -> Brought up the application by unlocking the SG. Below are the assignments.

 TestApp_SI1 TestApp_SI2 TestApp_SI3 TestApp_SI4

TestApp_SU1 ACTIVE ACTIVE ACTIVE STANDBY
TestApp_SU2 STANDBY ACTIVE
TestApp_SU3 STANDBY



-> Now performed a lock operation on SU1. SU1 got stuck in quiesced state after 
the operation.

 TestApp_SI1 TestApp_SI2 TestApp_SI3 TestApp_SI4

TestApp_SU1 QUIESCED QUIESCED STANDBY
TestApp_SU2 ACTIVE STANDBY ACTIVE ACTIVE
TestApp_SU3 STANDBY ACTIVE

-> When opensafd on payload PL-3 was stopped, amfd on the active controller 
crashed.


Oct  5 13:04:13 CONTROLLER-1 osafamfd[8492]: su.cc:1885: dec_curr_stdby_si: 
Assertion 'saAmfSUNumCurrStandbySIs > 0' failed.
Oct  5 13:04:13 CONTROLLER-1 osafamfnd[8502]: ER AMF director unexpectedly 
crashed


 The script to bring up the application is attached.



---


[tickets] [opensaf:tickets] #1329 ntf Mismatch when ntfread notificationClassId and ntfsend and notificationClassId

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1329] ntf Mismatch when ntfread notificationClassId and ntfsend 
and notificationClassId**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Wed Apr 22, 2015 06:29 AM UTC by Per Rodenvall
**Last Updated:** Wed Jul 15, 2015 01:04 PM UTC
**Owner:** nobody


ntf Mismatch when ntfread notificationClassId and ntfsend and 
notificationClassId

When reading notificationClassId with ntfread, the printout is dot-separated. 
If that notificationClassId is then used in ntfsend, the dots have to be 
replaced with commas.

The ntfread command should follow the syntax specified in “ntfread –help”, i.e. 
with commas between vendorid, majorid and minorid.

OPTIONS
  -c or --notificationClassId=VE,MA,MI  vendorid, majorid, minorid
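
Until the two tools agree, the conversion has to be done by hand. A minimal sketch of that workaround follows; the helper name `class_id_for_ntfsend` is hypothetical, and it assumes the three fields are printed as plain decimal numbers.

```python
def class_id_for_ntfsend(ntfread_class_id):
    """Convert 'VE.MA.MI' as printed by ntfread into 'VE,MA,MI' for ntfsend.

    Assumes the three fields are plain decimal numbers.
    """
    parts = ntfread_class_id.strip().split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(
            "expected vendorid.majorid.minorid, got %r" % ntfread_class_id
        )
    return ",".join(parts)
```

For example, a class id printed as `18568.8.1` (a made-up value) would be passed to ntfsend as `-c 18568,8,1`.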



---


[tickets] [opensaf:tickets] #1348 IMM: Document that OI_CALLBACK_TIMEOUT is not applicable to admin-operations.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> never



---

** [tickets:#1348] IMM: Document that OI_CALLBACK_TIMEOUT is not applicable to 
admin-operations.**

**Status:** duplicate
**Milestone:** never
**Created:** Mon Apr 27, 2015 01:01 PM UTC by Sirisha Alla
**Last Updated:** Fri Jun 05, 2015 10:59 AM UTC
**Owner:** Anders Bjornerstedt
**Attachments:**

- [logs.tar.bz2](https://sourceforge.net/p/opensaf/tickets/1348/attachment/logs.tar.bz2) (96.4 kB; application/x-bzip)


When AdminOperation_2() API is invoked with 30 seconds as timeout and 
OI_CALLBACK_TMOUT is configured as 8 seconds, the API returned TIMEOUT only 
after 30 seconds. IMMA_OI_CALLBACK_TMOUT is applicable for all the OI callbacks 
including OI Admin Operation Callback. 

Following is the IMMA Trace:

Apr 27 18:24:17.295219 imma [3392:imma_oi_api.c:0164] T2 OI client version 
A.2.15
Apr 27 18:24:17.295226 imma [3392:imma_oi_api.c:0196] T2 IMMA library OI 
timeout set to:8
Apr 27 18:24:17.295347 imma [3392:imma_oi_api.c:0290] T1 Trying to add OI 
client id:51 node:2030f handle:330002030f
Apr 27 18:24:17.295358 imma [3392:imma_oi_api.c:0383] << initialize_common



Apr 27 18:24:17.303478 imma [3392:imma_om_api.c:3661] >> admin_op_invoke_common
Apr 27 18:24:17.303492 imma [3392:imma_om_api.c:3801] TR immInvocations:0
Apr 27 18:24:17.303499 imma [3392:imma_om_api.c:3815] TR 
PARAM:testOiTmout_verifyAdminOpCallback_101
Apr 27 18:24:17.305642 imma [3392:imma_proc.c:1346] TR ** Event type:6
Apr 27 18:24:17.305674 imma [3392:imma_proc.c:1239] >> imma_proc_free_pointers
Apr 27 18:24:17.305687 imma [3392:imma_proc.c:1332] << imma_proc_free_pointers
Apr 27 18:24:17.305754 imma [3392:imma_db.c:0187] >> imma_oi_ccb_record_find
Apr 27 18:24:17.305762 imma [3392:imma_db.c:0198] << imma_oi_ccb_record_find
Apr 27 18:24:17.305766 imma [3392:imma_proc.c:1914] >> 
imma_process_callback_info
Apr 27 18:24:47.334148 imma [3392:imma_om_api.c:3864] TR Fevs send RETURNED:5
Apr 27 18:24:47.334241 imma [3392:imma_om_api.c:4009] << admin_op_invoke_common
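
The delay can be read directly off the trace timestamps. The sketch below computes the time between entering `admin_op_invoke_common` and the line `Fevs send RETURNED:5`; the helper names are hypothetical, and it assumes both stamps fall on the same day.

```python
from datetime import datetime


def trace_time(stamp):
    """Parse the time-of-day part of a stamp like 'Apr 27 18:24:17.303478'."""
    return datetime.strptime(stamp.split()[-1], "%H:%M:%S.%f")


def elapsed_seconds(start, end):
    # Assumes both stamps are from the same day.
    return (trace_time(end) - trace_time(start)).total_seconds()


invoke = "Apr 27 18:24:17.303478"    # >> admin_op_invoke_common
returned = "Apr 27 18:24:47.334148"  # Fevs send RETURNED:5

elapsed = elapsed_seconds(invoke, returned)
```

The result is roughly 30.03 seconds, which matches the 30 second timeout given to the admin operation rather than the 8 second OI timeout shown at the top of the trace. The return value 5 corresponds to SA_AIS_ERR_TIMEOUT in the SaAisErrorT enumeration as I understand it.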

IMMA and IMMND traces are attached.



---


[tickets] [opensaf:tickets] #1335 AMF : health check is started even if safHealthCheckKey attribute is not set as osafHealthCheck

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1335] AMF : health check is started even if  safHealthCheckKey 
attribute is not set as osafHealthCheck**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Wed Apr 22, 2015 05:31 PM UTC by Srikanth R
**Last Updated:** Mon Aug 10, 2015 11:42 AM UTC
**Owner:** nobody


Changeset : 6377

1) For an NPI component, if AMF needs to perform a health check, the following 
two commands need to be run as part of the configuration bring-up.

  immcfg -c SaAmfCompType $SaAmfCompType_npi -a saAmfCtCompCategory=8 -a 
saAmfCtDefClcCliTimeout=100 -a saAmfCtDefCallbackTimeout=100 -a 
saAmfCtRelPathInstantiateCmd="amf_comp_script instantiate_npi" -a 
saAmfCtRelPathCleanupCmd="amf_comp_script cleanup" -a 
saAmfCtDefRecoveryOnError=2 -a saAmfCtDefDisableRestart=0 -a 
saAmfCtSwBundle=safSmfBundle=$SaSmfSwBundle -a 
osafAmfCtRelPathHcCmd="health_check_script" -a osafAmfCtDefHcCmdArgv="state" -a 
saAmfCtRelPathTerminateCmd="amf_comp_script cleanup"

  immcfg -c SaAmfHealthcheckType 
safHealthcheckKey=osafHealthCheck,$SaAmfCompType_npi -a 
saAmfHctDefPeriod=100 -a saAmfHctDefMaxDuration=60


2) If the user does not run the second command before instantiating the 
component, the health check is not started as of now, which is fine.

3) But if the user deletes the health check key with the following command 
once the application configuration is done (SU in lock-in state), the health 
check is still started when the SU is unlocked.


immcfg -d 
safHealthcheckKey=osafHealthCheck,safVersion=4.0.0,safCompType=TWONCOMPBASETYPE_NPI


*Deviation*

 Health check should not be started, as the key is deleted before performing 
the unlock-in and unlock operations of SU
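
The expected behaviour can be stated as a simple rule: whether the health check starts should depend on the configuration as it exists when the SU is unlocked. The following is a sketch of that rule only, not AMF code; `should_start_healthcheck` and the set-of-tuples representation of the configuration are hypothetical.

```python
def should_start_healthcheck(configured_keys, comp_type, key="osafHealthCheck"):
    """Return True only if the health check key exists for the component type.

    configured_keys is a set of (healthcheck_key, comp_type) tuples that
    stands in for the SaAmfHealthcheckType objects present at unlock time.
    """
    return (key, comp_type) in configured_keys


# Configuration after the two immcfg -c commands from step 1.
config = {("osafHealthCheck", "TWONCOMPBASETYPE_NPI")}
started_with_key = should_start_healthcheck(config, "TWONCOMPBASETYPE_NPI")

# Configuration after the immcfg -d command from step 3.
config.discard(("osafHealthCheck", "TWONCOMPBASETYPE_NPI"))
started_after_delete = should_start_healthcheck(config, "TWONCOMPBASETYPE_NPI")
```

Under this rule the second evaluation is false, which is the behaviour the ticket asks for.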


---


[tickets] [opensaf:tickets] #722 payloads did not go for reboot when both the controllers rebooted

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#722] payloads did not go for reboot when both the controllers 
rebooted**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Thu Jan 16, 2014 07:36 AM UTC by Sirisha Alla
**Last Updated:** Tue Aug 11, 2015 06:32 AM UTC
**Owner:** A V Mahesh (AVM)
**Attachments:**

- [payloadnoreboot.tar.bz2](https://sourceforge.net/p/opensaf/tickets/722/attachment/payloadnoreboot.tar.bz2) (765.1 kB; application/x-bzip)


The issue is seen on changeset 4733 + patches of CLM corresponding to 
changesets of #220. Continuous failovers were happening while some API 
invocations of an IMM application were ongoing. The IMMD asserted on the new 
active, which is reported in ticket #721.

When both controllers got rebooted, the payloads did not get rebooted. Instead 
the opensaf services stayed up and running. CLM shows that neither payload is 
part of the cluster. When the payloads were restarted manually, they joined the 
cluster.

PL-3 syslog:

Jan 15 18:23:09 SLES-64BIT-SLOT3 osafimmnd[3550]: NO implementer for class 
'testMA_verifyObjApplNoResponseModCallback_101' is released => class extent is 
UNSAFE
Jan 15 18:23:59 SLES-64BIT-SLOT3 logger: Invoking failover from 
invoke_failover.sh
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[3550]: WA DISCARD DUPLICATE FEVS 
message:92993
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[3550]: WA Error code 2 returned for 
message type 57 - ignoring
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[3550]: WA DISCARD DUPLICATE FEVS 
message:92994
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[3550]: WA Error code 2 returned for 
message type 57 - ignoring
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[3550]: WA Director Service in 
NOACTIVE state - fevs replies pending:1 fevs highest processed:92994
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[3550]: NO No IMMD service => cluster 
restart
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafamfnd[3572]: NO 
'safComp=IMMND,safSu=PL-3,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' 
: Recovery is 'componentRestart'
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[6827]: Started
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[6827]: NO Persistent Back-End 
capability configured, Pbe file:imm.db (suffix may get added)
Jan 15 18:24:07 SLES-64BIT-SLOT3 kernel: [ 6343.176901] TIPC: Resetting link 
<1.1.3:eth0-1.1.2:eth0>, peer not responding
Jan 15 18:24:07 SLES-64BIT-SLOT3 kernel: [ 6343.176911] TIPC: Lost link 
<1.1.3:eth0-1.1.2:eth0> on network plane A
Jan 15 18:24:07 SLES-64BIT-SLOT3 kernel: [ 6343.176918] TIPC: Lost contact with 
<1.1.2>
Jan 15 18:24:07 SLES-64BIT-SLOT3 kernel: [ 6343.256091] TIPC: Resetting link 
<1.1.3:eth0-1.1.1:eth0>, peer not responding
Jan 15 18:24:07 SLES-64BIT-SLOT3 kernel: [ 6343.256100] TIPC: Lost link 
<1.1.3:eth0-1.1.1:eth0> on network plane A
Jan 15 18:24:07 SLES-64BIT-SLOT3 kernel: [ 6343.256106] TIPC: Lost contact with 
<1.1.1>
Jan 15 18:24:25 SLES-64BIT-SLOT3 kernel: [ 6361.425537] TIPC: Established link 
<1.1.3:eth0-1.1.2:eth0> on network plane A
Jan 15 18:24:27 SLES-64BIT-SLOT3 osafimmnd[6827]: NO SERVER STATE: 
IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Jan 15 18:24:27 SLES-64BIT-SLOT3 osafimmnd[6827]: NO SERVER STATE: 
IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Jan 15 18:24:27 SLES-64BIT-SLOT3 osafimmnd[6827]: NO SERVER STATE: 
IMM_SERVER_LOADING_PENDING --> IMM_SERVER_LOADING_CLIENT
Jan 15 18:24:29 SLES-64BIT-SLOT3 osafimmnd[6827]: NO ERR_BAD_HANDLE: Admin 
owner 1 does not exist
Jan 15 18:24:36 SLES-64BIT-SLOT3 kernel: [ 6372.473240] TIPC: Established link 
<1.1.3:eth0-1.1.1:eth0> on network plane A
Jan 15 18:24:39 SLES-64BIT-SLOT3 osafimmnd[6827]: NO ERR_BAD_HANDLE: Admin 
owner 2 does not exist
Jan 15 18:24:39 SLES-64BIT-SLOT3 osafimmnd[6827]: NO NODE STATE-> 
IMM_NODE_LOADING
Jan 15 18:24:45 SLES-64BIT-SLOT3 osafimmnd[6827]: WA Number of objects in IMM 
is:5000
Jan 15 18:24:46 SLES-64BIT-SLOT3 osafimmnd[6827]: WA Number of objects in IMM 
is:6000
Jan 15 18:24:47 SLES-64BIT-SLOT3 osafimmnd[6827]: WA Number of objects in IMM 
is:7000
Jan 15 18:24:48 SLES-64BIT-SLOT3 osafimmnd[6827]: WA Number of objects in IMM 
is:8000
Jan 15 18:24:49 SLES-64BIT-SLOT3 osafimmnd[6827]: WA Number of objects in IMM 
is:9000

After both the controllers came up following is the status:

SLES-64BIT-SLOT1:~ # immlist safNode=PL-3,safCluster=myClmCluster
Name                           Type         Value(s)

safNode                        SA_STRING_T  safNode=PL-3
saClmNodeLockCallbackTimeout   SA_TIME_T    500 (0xba43b7400, Thu Jan  1 05:30:50 1970)
saClmNodeIsMember              SA_UINT32_T
saClmNodeInitialViewNumber     SA_UINT64_T
saClmNodeID                    SA_UINT32_T
saClmNodeEE                    SA_NAME_T
saClmNodeDisableReboot
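
As a side note on reading the listing: SA_TIME_T values are expressed in nanoseconds, so the hexadecimal value shown for saClmNodeLockCallbackTimeout can be decoded as below. The helper name `sa_time_to_seconds` is hypothetical.

```python
NS_PER_SEC = 10**9  # SA_TIME_T values are in nanoseconds


def sa_time_to_seconds(sa_time_ns):
    """Convert an SA_TIME_T value (nanoseconds) to seconds."""
    return sa_time_ns / NS_PER_SEC


raw = 0xba43b7400  # hex value shown for saClmNodeLockCallbackTimeout
timeout_seconds = sa_time_to_seconds(raw)
```

The hex value decodes to 50 seconds. That is consistent with the "05:30:50 1970" rendering if the host prints the epoch in a UTC+5:30 timezone, which is an assumption on my part.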

[tickets] [opensaf:tickets] #865 LOG: standby controller went for reboot after s/w followed by immnd kill

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#865] LOG: standby controller went for reboot after s/w followed by 
immnd kill**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Mon Apr 21, 2014 10:51 AM UTC by surender khetavath
**Last Updated:** Mon Aug 03, 2015 11:30 AM UTC
**Owner:** nobody
**Attachments:**

- [logs.tgz](https://sourceforge.net/p/opensaf/tickets/865/attachment/logs.tgz) 
(14.1 MB; application/x-compressed-tar)


Changeset : 5143

case: 
1) SC-1 is active and SC-2 is standby
2) invoke switchover from SC-1 as 'amf-adm si-swap safSi=SC-2N,safApp=OpenSAF'
3) kill immnd on SC-2 as soon as SC-1 receives the quiesced callback

The si-swap operation will time out.
console output:
amf-adm si-swap safSi=SC-2N,safApp=OpenSAF
error - command timed out (alarm)

Wait for some time, say 1-2 minutes; SC-2 will reboot with the messages in 
syslog shown below.

Apr 21 16:10:01 SC-2 osafamfnd[15380]: NO 
'safComp=LOG,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 
'csiSetcallbackTimeout' : Recovery is 'nodeFailfast'
Apr 21 16:10:01 SC-2 osafamfnd[15380]: ER 
safComp=LOG,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due 
to:csiSetcallbackTimeout Recovery is:nodeFailfast
Apr 21 16:10:01 SC-2 osafamfnd[15380]: Rebooting OpenSAF NodeId = 131599 EE 
Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 
131599, SupervisionTime = 60
Apr 21 16:10:01 SC-2 opensaf_reboot: Rebooting local node; timeout=60
Apr 21 16:10:04 SC-2 kernel: [13082.449771] md: stopping all md devices.
Apr 21 16:10:04 SC-2 kernel: [13083.455393] sd 0:0:0:0: [sda] Synchronizing 
SCSI cache

Also, there is a log entry in the SC-1 syslog saying:
Apr 21 16:10:33 SC-1 osafamfd[15353]: ER Alarm lost for 
safSi=NoRed1,safApp=OpenSAF

Logs of the controllers are attached.


---


[tickets] [opensaf:tickets] #786 Boot time stamp changes when a clm node is unconfigured and reconfigured

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#786] Boot time stamp changes when a clm node is unconfigured and 
reconfigured**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Fri Feb 14, 2014 06:26 AM UTC by manu
**Last Updated:** Wed Jul 15, 2015 01:32 PM UTC
**Owner:** Mathi Naickan


As per the CLM spec, the boot timestamp is the time at which this node last 
booted. It is supposed to change only when the board comes up after a reboot, 
but unconfiguring and reconfiguring the CLM object also causes this parameter 
to change.


---


[tickets] [opensaf:tickets] #789 CLM: CallBacks are getting delivered for operations that have been performed before registering for track

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#789] CLM: CallBacks are getting delivered for operations that have 
been performed before registering for track **

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Fri Feb 14, 2014 08:59 AM UTC by manu
**Last Updated:** Wed Jul 15, 2015 01:32 PM UTC
**Owner:** nobody
**Attachments:**

- [clm_traces.tar](https://sourceforge.net/p/opensaf/tickets/789/attachment/clm_traces.tar) (1.7 MB; application/x-tar)


Sometimes, even though registering for track is done after performing an 
operation, a callback is delivered for that operation, which is a violation of 
the spec.

Following is one such case where this behaviour is seen.

1. Perform a failover

2. Register for Track callback

3. Change the configuration of any node

4. Dispatch the callback

After step 4 I am supposed to receive a callback only for the admin operation 
done in step 3, but the callback for the operation done in step 1 is also 
delivered.
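
The expected ordering can be modelled with a sequence counter: only events recorded after the registration point should be dispatched. This is a toy model of the expectation, not CLM code; the class `TrackModel` and its methods are hypothetical.

```python
class TrackModel:
    """Toy model: deliver only events that happen after track registration."""

    def __init__(self):
        self._seq = 0
        self._events = []           # list of (sequence number, description)
        self._registered_at = None  # sequence number at registration time

    def record(self, description):
        self._seq += 1
        self._events.append((self._seq, description))

    def register(self):
        # Remember the point in the event stream at which tracking started.
        self._registered_at = self._seq

    def dispatch(self):
        if self._registered_at is None:
            return []
        return [d for s, d in self._events if s > self._registered_at]


model = TrackModel()
model.record("failover")               # step 1
model.register()                       # step 2
model.record("node config change")     # step 3
delivered = model.dispatch()           # step 4
```

In this model `delivered` contains only the step 3 change; the behaviour reported here corresponds to the step 1 failover showing up as well.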




---


[tickets] [opensaf:tickets] #466 Length of the objectnames is more by one for configuration object notifications

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#466] Length of the objectnames is more by one for configuration 
object notifications**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Thu Jun 20, 2013 09:08 AM UTC by Sirisha Alla
**Last Updated:** Wed Jul 15, 2015 02:06 PM UTC
**Owner:** nobody


When ntfimcnd sends notifications for configuration object 
creation/modification/deletion, the lengths of the notifying object and the 
notification object are reported incorrectly. The IMM callback gives the length 
of the notification object correctly.

Notification object length in the imm callback:
objectName->length: 37
objectName->value: 'attrName_testSA_registerSA_Node_37_69'

Object create/modify/delete notifications indicate that the length of the 
notification object is 38 (one more than the 37 reported by IMM) and that the 
length of the notifying object is 15 for "safApp=OpenSaf" (14 characters).

This issue is reproducible.
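
The numbers in this report can be checked with a short sketch. That the extra byte is a counted NUL terminator is an assumption, not something the ticket states; the helper names are hypothetical.

```python
def name_length(value: str) -> int:
    """Length of an object name in bytes, without any trailing NUL."""
    return len(value.encode("utf-8"))


def length_with_terminator(value: str) -> int:
    """Length if a trailing NUL byte were counted (assumed cause of the bug)."""
    return name_length(value) + 1


imm_name = "attrName_testSA_registerSA_Node_37_69"
notifying = "safApp=OpenSaf"

print(name_length(imm_name), length_with_terminator(imm_name))
print(name_length(notifying), length_with_terminator(notifying))
```

Both reported lengths are exactly one more than the plain string length, which fits an off-by-one of this kind.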



[tickets] [opensaf:tickets] #416 saAmfResponse() returns invalid parameter when value of 'error' is unrecognized

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#416] saAmfResponse() returns invalid parameter when value of 
'error' is unrecognized**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Fri May 31, 2013 06:04 AM UTC by Nagendra Kumar
**Last Updated:** Thu Aug 06, 2015 10:11 AM UTC
**Owner:** nobody


Migrated from http://devel.opensaf.org/ticket/2788

This is a spec deviation; see e.g. 7.9.3:


"Any other error code set in the error parameter in the response will be 
treated by the Availability Management Framework as if the caller had set the 
error parameter to SA_AIS_ERR_FAILED_OPERATION."
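
The quoted rule amounts to a normalisation step rather than a rejection. A minimal sketch follows, assuming the numeric values shown (as I recall them from saAis.h; verify against the header) and a hypothetical `recognized` set, since the ticket does not list which codes count as recognized.

```python
SA_AIS_OK = 1                      # value assumed from saAis.h
SA_AIS_ERR_FAILED_OPERATION = 21   # value assumed from saAis.h


def normalize_response_error(
    error,
    recognized=frozenset({SA_AIS_OK, SA_AIS_ERR_FAILED_OPERATION}),
):
    """Map any unrecognized error code to SA_AIS_ERR_FAILED_OPERATION."""
    return error if error in recognized else SA_AIS_ERR_FAILED_OPERATION


print(normalize_response_error(SA_AIS_OK))
print(normalize_response_error(9999))
```

Under this reading, an unknown value passed as `error` would be accepted and handled as a failed operation instead of causing an invalid-parameter return.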






[tickets] [opensaf:tickets] #412 amf: After runtime delete of sponsor CSI, SU stuck in Quiesced state after admin lock of SUs.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#412] amf: After runtime delete of sponsor CSI, SU stuck in 
Quiesced state after admin lock of SUs.**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Fri May 31, 2013 05:57 AM UTC by Praveen
**Last Updated:** Thu Aug 06, 2015 10:15 AM UTC
**Owner:** nobody


Migrated to http://devel.opensaf.org/ticket/2861.

changeset : 3796, 4.2.2
model : 2n

Configuration:-
2 SUs: SU1 on PL-3 and SU2 on SC-1.
2 SIs.
3 CSIs (CSI1, CSI2, and CSI3) per SI.
CSI-CSI dependency configured as
CSI1 dependent on CSI2 and CSI2 dependent on CSI3 for both the SIs.

Problem description:-
After a successful runtime delete of the sponsor CSI3 of SI1, admin lock on 
SU1 and SU2 returns a timeout and SU1 is stuck in the QUIESCED state. Finally 
the SG becomes unstable. While doing the admin lock on SU2, /var/log/messages 
keeps printing the below messages:-
 

Oct 11 17:25:51 SLES-SLOT-1 osafamfd[3567]: SG state is not stable
 Oct 11 17:25:52 SLES-SLOT-1 osafamfd[3567]: SG state is not stable
 Oct 11 17:28:21 SLES-SLOT-1 osafamfd[3567]: Admin operation is already going
 Oct 11 17:28:22 SLES-SLOT-1 osafamfd[3567]: Admin operation is already going
 

States after lock of SU1 and SU2:-

safSu=csidep_2n_1,safSg=SG_csidep_2n,safApp=2nApp
saAmfSUAdminState=LOCKED(2)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=OUT-OF-SERVICE(1)

safSu=csidep_2n_2,safSg=SG_csidep_2n,safApp=2nApp
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)

safSISU=safSu=csidep_2n_1\,safSg=SG_csidep_2n\,safApp=2nApp,safSi=csidep_2n,safApp=2nApp
saAmfSISUHAState=QUIESCED(3)

safSISU=safSu=csidep_2n_1\,safSg=SG_csidep_2n\,safApp=2nApp,safSi=csidep_2n_1,safApp=2nApp
saAmfSISUHAState=QUIESCED(3)

safSISU=safSu=csidep_2n_2\,safSg=SG_csidep_2n\,safApp=2nApp,safSi=csidep_2n,safApp=2nApp
saAmfSISUHAState=ACTIVE(1)

safSISU=safSu=csidep_2n_2\,safSg=SG_csidep_2n\,safApp=2nApp,safSi=csidep_2n_1,safApp=2nApp
saAmfSISUHAState=ACTIVE(1)

Changed 7 months ago by nagendra 
Looks like a duplicate of http://devel.opensaf.org/ticket/2842




[tickets] [opensaf:tickets] #449 LOG: OI Completed Callback function has undefined return values

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#449] LOG: OI Completed Callback function has undefined return 
values**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Mon Jun 10, 2013 01:39 PM UTC by elunlen
**Last Updated:** Wed Jul 15, 2015 02:21 PM UTC
**Owner:** elunlen


The OI Completed Callback shall return SA_AIS_OK or SA_AIS_ERR_BAD_OPERATION 
as a result of the parameter check. However, the LOG service may return other 
SA_AIS return codes.
See file lgs_imm.c
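
One way to picture the requested behaviour is a wrapper that confines a callback to the permitted return codes. This is a sketch only, not the code in lgs_imm.c: the decorator, the example callback, and its parameters are hypothetical, and the numeric values are as I recall them from saAis.h.

```python
import functools

SA_AIS_OK = 1                   # value assumed from saAis.h
SA_AIS_ERR_BAD_OPERATION = 20   # value assumed from saAis.h


def restrict_return_codes(allowed=(SA_AIS_OK, SA_AIS_ERR_BAD_OPERATION),
                          fallback=SA_AIS_ERR_BAD_OPERATION):
    """Decorator: replace any return code outside `allowed` with `fallback`."""
    def decorator(callback):
        @functools.wraps(callback)
        def wrapper(*args, **kwargs):
            rc = callback(*args, **kwargs)
            return rc if rc in allowed else fallback
        return wrapper
    return decorator


@restrict_return_codes()
def completed_callback(ccb_valid, internal_rc=SA_AIS_OK):
    # Hypothetical stand-in for the parameter check of a completed callback.
    if not ccb_valid:
        return SA_AIS_ERR_BAD_OPERATION
    return internal_rc   # may be some other internal SA_AIS code


print(completed_callback(True))
print(completed_callback(True, internal_rc=7))
```

The allowed set is a parameter so that it can be widened if the specification permits more codes than the two this ticket names.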



[tickets] [opensaf:tickets] #1344 CLM : clmd should not send the callbacks for tracking on non-member node

2015-11-02 Thread Anders Widell

- **Milestone**: 4.4.2 --> 4.6.2



---

** [tickets:#1344] CLM : clmd should not send the callbacks for tracking on 
non-member node**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Sun Apr 26, 2015 06:00 PM UTC by Srikanth R
**Last Updated:** Sun Apr 26, 2015 06:00 PM UTC
**Owner:** nobody


Changeset : 6377

As per the spec 3.5.1 page #44,
"""
If saClmClusterTrack_4() is invoked on non-member nodes, the following 
applies:
• if SA_TRACK_CURRENT is specified, only information about the local node
is returned in the structure pointed to by notificationBuffer or in
the subsequent callback;
• if SA_TRACK_CHANGES or SA_TRACK_CHANGES_ONLY is specified, callbacks 
will only be invoked when the node joins the cluster membership.
"""

As of now, the CLM service delivers the callbacks to agents on non-member 
nodes and waits for the agent to respond before completing the operation. This 
should be changed according to the spec.
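
The rule quoted from the spec can be written as a small predicate. This is a sketch of the expected decision, not clmd code; the function and parameter names are hypothetical.

```python
def should_invoke_track_callback(tracks_changes, was_member, event_is_local_join):
    """Decide whether a change callback goes to an agent on a given node.

    tracks_changes      -- agent asked for SA_TRACK_CHANGES or SA_TRACK_CHANGES_ONLY
    was_member          -- the agent's node was a cluster member before the event
    event_is_local_join -- the event is the agent's own node joining the cluster
    """
    if not tracks_changes:
        return False
    if was_member:
        return True
    # Non-member node: callbacks only when the node joins the membership.
    return event_is_local_join


print(should_invoke_track_callback(True, False, False))
print(should_invoke_track_callback(True, False, True))
```

With a predicate like this, the service would also have no reason to wait for a response from an agent that was never sent a callback.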





[tickets] [opensaf:tickets] #1343 CLM : clmd asserted when controller switchover is invoked with CLM shutdown operation of node

2015-11-02 Thread Anders Widell

- **Milestone**: 4.4.2 --> 4.6.2



---

** [tickets:#1343] CLM : clmd asserted when controller switchover is invoked 
with CLM shutdown operation of node**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Sun Apr 26, 2015 05:03 PM UTC by Srikanth R
**Last Updated:** Mon Apr 27, 2015 04:08 AM UTC
**Owner:** nobody


Changeset : 6377

Steps performed :


 -> Issued admin shutdown operation on a member node PL-5 and ensured the CLM 
agent did not respond in the start callback 

   426 16:51:12 04/26/2015 NO safApp=safClmService 
"safNode=PL-5,safCluster=myClmCluster Admin State Changed, new 
state=SHUTTING_DOWN"

 -> Invoked controller switchover by issuing admin si-swap operation.

   427 16:51:20 04/26/2015 NO safApp=safAmfService "Admin op "SI_SWAP" 
initiated for 'safSi=SC-2N,safApp=OpenSAF', invocation: 502511173633"
   428 16:51:20 04/26/2015 NO safApp=safAmfService 
"safSi=SC-2N,safApp=OpenSAF Swap initiated"

 -> clmd asserted on the quiesced controller.

Apr 26 16:51:20 CONTROLLER-1 osafamfd[2119]: NO safSi=SC-2N,safApp=OpenSAF Swap 
initiated
Apr 26 16:51:20 CONTROLLER-1 osafamfnd[2129]: NO Assigning 
'safSi=SC-2N,safApp=OpenSAF' QUIESCED to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
Apr 26 16:51:20 CONTROLLER-1 osafimmnd[2063]: NO Implementer locally 
disconnected. Marking it as doomed 80 <604, 2010f> (safSmfService)
Apr 26 16:51:20 CONTROLLER-1 osafimmnd[2063]: NO Implementer disconnected 75 
<332, 2010f> (safMsgGrpService)
Apr 26 16:51:20 CONTROLLER-1 osafimmnd[2063]: NO Implementer disconnected 80 
<604, 2010f> (safSmfService)
Apr 26 16:51:20 CONTROLLER-1 osafimmnd[2063]: NO Implementer disconnected 72 
<3, 2010f> (safLogService)
Apr 26 16:51:20 CONTROLLER-1 osafimmnd[2063]: NO Implementer disconnected 78 
<334, 2010f> (safEvtService)
Apr 26 16:51:20 CONTROLLER-1 osafamfnd[2129]: NO 
'safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Apr 26 16:51:20 CONTROLLER-1 osafamfnd[2129]: ER 
safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Apr 26 16:51:20 CONTROLLER-1 osafamfnd[2129]: Rebooting OpenSAF NodeId = 131343 
EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 
131343, SupervisionTime = 60



 -> Below is the backtrace :

 (gdb) thread apply all bt
 
 Thread 4 (Thread 0x7f4caf033700 (LWP 2100)):
 #0  0x7f4cadcd7415 in __lll_unlock_wake () from /lib64/libpthread.so.0
 #1  0x7f4cadcd3ac4 in _L_unlock_553 () from /lib64/libpthread.so.0
 #2  0x7f4cadcd39f7 in __pthread_mutex_unlock_usercnt () from 
/lib64/libpthread.so.0
 #3  0x7f4caec06870 in ncsmds_adm_api () from 
/usr/lib64/libopensaf_core.so.0
 #4  0x7f4caec1f813 in vda_chg_role_vdest () from 
/usr/lib64/libopensaf_core.so.0
 #5  0x7f4caec1ed79 in ncsvda_api () from /usr/lib64/libopensaf_core.so.0
 #6  0x0041e6ec in clms_mds_change_role ()
 #7  0x00404617 in amf_quiesced_state_handler ()
 #8  0x00404778 in clms_amf_csi_set_callback ()
 #9  0x7f4cae9a5ba0 in ava_hdl_cbk_rec_prc () from /usr/lib64/libSaAmf.so.0
 #10 0x7f4cae9a530d in ava_hdl_cbk_dispatch_all () from 
/usr/lib64/libSaAmf.so.0
 #11 0x7f4cae9a4e34 in ava_hdl_cbk_dispatch () from /usr/lib64/libSaAmf.so.0
 #12 0x7f4cae99df14 in saAmfDispatch () at ava_api.c:261
 #13 0x00411032 in main ()
 
 Thread 3 (Thread 0x7f4caf010b00 (LWP 2104)):
 #0  0x7f4cad6164f6 in poll () from /lib64/libc.so.6
 #1  0x7f4caebd0df1 in osaf_ppoll () from /usr/lib64/libopensaf_core.so.0
 #2  0x7f4caebd0d27 in osaf_poll () from /usr/lib64/libopensaf_core.so.0
 #3  0x7f4caebd0ef0 in osaf_poll_one_fd () from 
/usr/lib64/libopensaf_core.so.0
 #4  0x7f4cadee7a04 in rda_read_msg () from /usr/lib64/opensaf/librda.so.0
 #5  0x7f4cadee71e7 in rda_callback_task () from 
/usr/lib64/opensaf/librda.so.0
 #6  0x7f4cadcd07b6 in start_thread () from /lib64/libpthread.so.0
 #7  0x7f4cad61f9cd in clone () from /lib64/libc.so.6
 #8  0x in ?? ()
 
 Thread 2 (Thread 0x7f4caf062b00 (LWP 2102)):
 #0  0x7f4cad6164f6 in poll () from /lib64/libc.so.6
 #1  0x7f4caebd0df1 in osaf_ppoll () from /usr/lib64/libopensaf_core.so.0
 #2  0x7f4caebda7b5 in ncs_tmr_wait () from /usr/lib64/libopensaf_core.so.0
 #3  0x7f4cadcd07b6 in start_thread () from /lib64/libpthread.so.0
 #4  0x7f4cad61f9cd in clone () from /lib64/libc.so.6
 #5  0x in ?? ()
 
 Thread 1 (Thread 0x7f4caf030b00 (LWP 2103)):
 #0  0x7f4cad57ab55 in raise () from /lib64/libc.so.6
 #1  0x7f4cad57c131 in abort () from /lib64/libc.so.6
 #2  0x7f4cad5b7c2f in __libc_message () from /lib64/libc.so.6
 #3  0x7f4cad5bd358 in malloc_printerr () from /lib64/libc.so.6
 #4  0x7f4cad5c099d in _int_malloc () from /lib64/libc.so.6
 #5  0x7f4cad5c23e7 in malloc () from /lib64/libc.so.6
 #6  0x7f4caec0362a in mds_subtn_res_tbl_remove_active () from

[tickets] [opensaf:tickets] #1349 LOG : lgs_own_log_files is not called when logDataGroupname is reset to ""

2015-11-02 Thread Anders Widell

- **Milestone**: 4.4.2 --> 4.6.2



---

** [tickets:#1349] LOG : lgs_own_log_files is not called when logDataGroupname 
is reset to ""**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Tue Apr 28, 2015 04:46 AM UTC by Srikanth R
**Last Updated:** Tue Apr 28, 2015 04:46 AM UTC
**Owner:** nobody


Changeset : 6490

If the group name is set to tet using the following command, all the existing 
log files are owned by the new group. 

immcfg -a logDataGroupname=tet logConfig=1,safApp=safLogService

Apr 28 10:01:14.035761 osaflogd [5213:lgs_imm.c:1988] >> 
config_ccb_apply_modify: CCB ID 9, 'logConfig=1,safApp=safLogService'
Apr 28 10:01:14.035766 osaflogd [5213:lgs_imm.c:1997] TR attribute 
logDataGroupname
Apr 28 10:01:14.035770 osaflogd [5213:lgs_imm.c:1948] >> 
logDataGroupname_fileown
Apr 28 10:01:14.035784 osaflogd [5213:lgs_imm.c:3123] NO LOG service data group 
is changed to tet
Apr 28 10:01:14.035791 osaflogd [5213:lgs_util.c:0606] >> lgs_own_log_files: 
stream safLgStrCfg=appstream1,safApp=safLogService
.
Apr 28 10:01:14.036787 osaflogd [5213:lgs_filehdl.c:0757] T3 
/var/log/opensaf/saflog/./saLogSystem_20150428_092511.log
Apr 28 10:01:14.036804 osaflogd [5213:lgs_filehdl.c:0771] << 
own_log_files_by_group_hdl


If the group name is reset to the default (empty string), the 
lgs_own_log_files function is not called, so the ownership of the existing log 
files is not changed.


immcfg -a logDataGroupname="" logConfig=1,safApp=safLogService 

Apr 28 10:01:40.624851 osaflogd [5213:lgs_imm.c:1948] >> 
logDataGroupname_fileown
Apr 28 10:01:40.624875 osaflogd [5213:lgs_imm.c:3123] NO LOG service data group 
is changed to 
Apr 28 10:01:40.624884 osaflogd [5213:lgs_imm.c:1971] << 
logDataGroupname_fileown
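
The behaviour this ticket asks for can be sketched as follows: re-own the files on every group change, including a reset to the empty string. The function name, the `own_log_files` callable, and the default group "opensaf" are hypothetical and do not come from lgs_imm.c.

```python
def apply_group_name(new_name, default_group, own_log_files):
    """Apply a logDataGroupname change; an empty name selects the default group."""
    effective = new_name or default_group
    # Re-own existing files on every change, not only for non-empty names.
    own_log_files(effective)
    return effective


calls = []
apply_group_name("tet", "opensaf", calls.append)
apply_group_name("", "opensaf", calls.append)
print(calls)
```

Here the second call, which corresponds to `logDataGroupname=""`, still reaches the ownership routine.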




[tickets] [opensaf:tickets] #1342 CLM : Deviations from spec in populating track callback parameters.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.4.2 --> 4.6.2



---

** [tickets:#1342] CLM : Deviations from spec in populating track callback 
parameters.**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Sun Apr 26, 2015 04:16 PM UTC by Srikanth R
**Last Updated:** Sun Apr 26, 2015 04:16 PM UTC
**Owner:** nobody


Changeset : 6377

Following are the various issues observed while populating track callback by 
CLM.

1) rootCauseEntity is not set to NULL ( but to random value), when 
saClmClusterTrack_4 is called with trackFlags  set to SA_TRACK_CURRENT.


2) In the callback for start step for lock operation, timeSupervision parameter 
is not filled up with the configured attribute saClmNodeLockCallbackTimeout of 
the node undergoing lock / shutdown operation.

Breakpoint 1, pycbk_SaClmClusterTrackCallbackT_4 (notificationBuff=0x853b08, 
numberOfMembers=4, invocation=137439085583, rootCauseEntity=0x8534b0, 
correlationIds=0x84c130,
step=SA_CLM_CHANGE_START, timeSupervision=-6612564084514619392, 
error1=SA_AIS_OK) at saClm_wrap.c:2901 


  For a shutdown operation, the timeSupervision parameter should be set to 
zero, as the admin operation is not time-bound.

3) clusterChange in the notificationBuffer is not filled up with 
SA_CLM_NODE_UNLOCK, if unlock operation is performed on the node when shutdown 
operation is in progress.

 Initial callback when shutdown operation is issued :

Breakpoint 1, pycbk_SaClmClusterTrackCallbackT_4 (notificationBuff=0x853b08, 
numberOfMembers=4, invocation=163208889359, rootCauseEntity=0x84d9d0, 
correlationIds=0x7f5df0,
step=SA_CLM_CHANGE_START, timeSupervision=-6612564084514619392, 
error1=SA_AIS_OK) at saClm_wrap.c:2901
2901   printf("root cse entity in c-clbk %s",rootCauseEntity->value);
(gdb) p (*notificationBuff)->notification[0]
$49 = {clusterNode = {nodeId = 132111, nodeAddress = {family = SA_CLM_AF_INET, 
length = 0,
  value = 
"S\367\377\177\000\000\000\000\000\000\000\000\000\246\000\000\000\000\000\000\000<\000\000\000\001\000\000\000\003\000\002\004\017\000\000\000\001\000\000\000$safNode=PL-4,safCluste"},
 nodeName = {length = 36,
  value = 
"safNode=PL-4,safCluster=myClmCluster\000\000\000\000\000\000\000\000\000$safNode=PL-4,safCluster=myClmCluster",
 '\000' , "f\360\240\000\000\000\000\000\000\000 
\000\000\000\004", '\000' , "\001", '\000' }, executionEnvironment = {length = 0, value = '\000' },
member = SA_TRUE, bootTimestamp = 14300225290, initialViewNumber = 
64}, clusterChange = SA_CLM_NODE_SHUTDOWN}


 Second callback, where clusterChange is improperly filled :

Breakpoint 1, pycbk_SaClmClusterTrackCallbackT_4 (notificationBuff=0x859108, 
numberOfMembers=4, invocation=0, rootCauseEntity=0x859610, 
correlationIds=0x84c130,
step=SA_CLM_CHANGE_COMPLETED, timeSupervision=0, error1=SA_AIS_OK) at 
saClm_wrap.c:2901
2901   printf("root cse entity in c-clbk %s",rootCauseEntity->value);
(gdb) p (*notificationBuff)->notification[0]
$50 = {clusterNode = {nodeId = 132111, nodeAddress = {family = SA_CLM_AF_INET, 
length = 0, value = '\000' }, nodeName = {length = 36,
  value = "safNode=PL-4,safCluster=myClmCluster", '\000' }, executionEnvironment = {length = 0, value = '\000' }, member = SA_TRUE,
bootTimestamp = 14300225290, initialViewNumber = 65}, clusterChange 
= SA_CLM_NODE_JOINED}



  In this case, notification is sent about the node joining the cluster, which 
is improper. The node never left the cluster and there is no notification for 
that, which is fine.

===  Apr 26 12:18:46 - State Change  ===
eventType = SA_NTF_OBJECT_STATE_CHANGE
notificationObject = "safNode=PL-4,safCluster=myClmCluster"
notifyingObject = "safApp=safClmService"
notificationClassId = SA_NTF_VENDOR_ID_SAF.SA_SVC_CLM.101 (0x65)
additionalText = "CLM node safNode=PL-4,safCluster=myClmCluster Joined"
sourceIndicator = SA_NTF_OBJECT_OPERATION
State ID = SA_CLM_CLUSTER_CHANGE_STATUS
New State: SA_CLM_NODE_JOINED



4) While the lock operation is in progress, hold the response in the start step 
callback and stop OpenSAF or reboot the node (on which the operation is in 
progress). In this case notificationBuff is filled up with number of items set 
to zero.


Breakpoint 1, pycbk_SaClmClusterTrackCallbackT_4 (notificationBuff=0x85e3e8, 
numberOfMembers=4, invocation=0, rootCauseEntity=0x85e8f0, 
correlationIds=0x7f5df0,
step=SA_CLM_CHANGE_COMPLETED, timeSupervision=1, error1=SA_AIS_OK) at 
saClm_wrap.c:2901
(gdb) p (*notificationBuff)
$64 = {viewNumber = 74, numberOfItems = 1, notification = 0x85e670}
(gdb) p (*notificationBuff)->notification[0]
$65 = {clusterNode = {nodeId = 132111, nodeAddress = {family = SA_CLM_AF_INET, 
length = 0, value = '\000' }, nodeName = {length = 36,
  value = "safNode=PL-4,safCluster=myClmCluster", '\000' }, executionEnvironment = {length = 0, value = '\000' }, member = SA_FALSE,
bootTimestamp = 14300315740, initialViewNumber = 73}, clusterChange 
= SA_CLM_NODE_LEFT}


Callback when node is rebooted in the

[tickets] [opensaf:tickets] #1345 CLM: Tracking for changes should not be started in case of improper track flags

2015-11-02 Thread Anders Widell

- **Milestone**: 4.4.2 --> 4.6.2



---

** [tickets:#1345] CLM: Tracking for changes should not be started in case of 
improper track flags**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Sun Apr 26, 2015 06:06 PM UTC by Srikanth R
**Last Updated:** Sun Apr 26, 2015 06:09 PM UTC
**Owner:** nobody


Changeset : 6377

   Tracking should not be started if the track flags include neither 
SA_TRACK_CHANGES nor SA_TRACK_CHANGES_ONLY. As of now, callbacks are not sent 
to the agent, but the CLM service still waits for the agent's response to the 
callback if an admin operation is issued.

For the track flags combination TRACK_CURRENT | TRACK_START | TRACK_VALIDATE, 
the agent does not get a callback, but CLM waits for the response during an 
admin operation.
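
The flag check described above can be sketched as a pair of predicates. The numeric flag values are as I recall them from saAis.h and should be verified there; the function names are hypothetical.

```python
# Track flag values assumed from saAis.h.
SA_TRACK_CURRENT = 0x01
SA_TRACK_CHANGES = 0x02
SA_TRACK_CHANGES_ONLY = 0x04
SA_TRACK_START_STEP = 0x10
SA_TRACK_VALIDATE_STEP = 0x20


def tracks_changes(flags):
    """True if the flags request tracking of membership changes."""
    return bool(flags & (SA_TRACK_CHANGES | SA_TRACK_CHANGES_ONLY))


def awaits_agent_response(flags):
    """The service should wait on an agent only if that agent tracks changes
    and subscribed to the start or validate step."""
    steps = SA_TRACK_START_STEP | SA_TRACK_VALIDATE_STEP
    return tracks_changes(flags) and bool(flags & steps)


ticket_flags = SA_TRACK_CURRENT | SA_TRACK_START_STEP | SA_TRACK_VALIDATE_STEP
print(awaits_agent_response(ticket_flags))
print(awaits_agent_response(SA_TRACK_CHANGES | SA_TRACK_START_STEP))
```

For the combination from the ticket, neither change flag is set, so no tracking should start and no response should be awaited.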



[tickets] [opensaf:tickets] #1323 Java : API does not return values as expected when version parameter is passed incorrectly

2015-11-02 Thread Anders Widell

- **Milestone**: 4.4.2 --> 4.6.2



---

** [tickets:#1323] Java : API does not return values as expected when version 
parameter is passed incorrectly**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Tue Apr 21, 2015 11:11 AM UTC by Sirisha Alla
**Last Updated:** Tue Apr 21, 2015 11:11 AM UTC
**Owner:** nobody


This is a clone of devel ticket 2272

When a wrong version is passed to the initializeHandle() API, the major version 
returned does not match the supported version.

Example: When C.1.1 is passed as the input version, the version returned is 
B.1.1, whereas B.4.1 is expected.

When minorVersion is lower than the supported minor version, ERR_VERSION is 
returned. The specification says that the minor version needs to be ignored 
and SA_AIS_OK needs to be returned.
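
The expectation described in this ticket can be sketched as below. This is a simplification that assumes a single supported version (B.4.1, taken from the example above); the `Version` class and `negotiate` function are hypothetical and are not the Java binding's API. The numeric codes are as I recall them from saAis.h.

```python
from dataclasses import dataclass

SA_AIS_OK = 1            # value assumed from saAis.h
SA_AIS_ERR_VERSION = 3   # value assumed from saAis.h


@dataclass
class Version:
    release_code: str
    major: int
    minor: int


def negotiate(requested, supported):
    """Return (rc, version reported back to the caller).

    The minor version of the request is deliberately ignored.
    """
    reported = Version(supported.release_code, supported.major, supported.minor)
    compatible = (requested.release_code == supported.release_code
                  and requested.major <= supported.major)
    return (SA_AIS_OK if compatible else SA_AIS_ERR_VERSION), reported


supported = Version("B", 4, 1)
print(negotiate(Version("C", 1, 1), supported))
print(negotiate(Version("B", 4, 0), supported))
```

In this sketch a request for C.1.1 fails but reports B.4.1 back, and a request that differs only in the minor version succeeds.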




[tickets] [opensaf:tickets] #1278 IMM: admin owner clear/release on an object is allowed when admin operation is in progress for the object

2015-11-02 Thread Anders Widell

- **Milestone**: 4.4.2 --> never



---

** [tickets:#1278] IMM: admin owner clear/release on an object is allowed when 
admin operation is in progress for the object**

**Status:** wontfix
**Milestone:** never
**Created:** Tue Mar 24, 2015 05:44 AM UTC by Sirisha Alla
**Last Updated:** Fri Mar 27, 2015 11:17 AM UTC
**Owner:** Anders Bjornerstedt


This issue is seen on the 46FC Tag changeset; it may also be relevant to all 
older versions of OpenSAF (not verified).

Spec says on Page 67:

The operation fails if an administrative operation is currently in progress on 
one of the targeted objects. An administrative operation is considered to be in 
progress on an object if the SaImmOiAdminOperationCallbackT_2 Object 
Implementer's callback has been invoked for that operation and the Object 
Implementer is still registered but has not yet called 
saImmOiAdminOperationResult() to provide the operation results.

To simulate the above case, invoked AdminOperationAsync on an object in the 
test application. After AdminOperationCallback is invoked, without responding 
with AdminOperationResult from the object OI, invoked adminOwnerRelease from OM 
and the API succeeded.

According to the spec, ERR_BUSY should be returned for the AdminOwnerRelease 
operation. The same applies to the AdminOwnerClear() API.
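
The rule from the spec can be modelled in a few lines. This is a toy model, not the ImmModel.cc logic: the class and method names are hypothetical, the object name is a placeholder, and the numeric codes are as I recall them from saAis.h.

```python
SA_AIS_OK = 1         # value assumed from saAis.h
SA_AIS_ERR_BUSY = 10  # value assumed from saAis.h


class AdminOwnerModel:
    """Tracks admin ownership and in-progress admin operations per object."""

    def __init__(self):
        self.owner = {}           # object name -> admin owner name
        self.in_progress = set()  # objects with an unanswered admin operation

    def admin_owner_set(self, obj, owner):
        self.owner[obj] = owner

    def admin_operation_callback_invoked(self, obj):
        # The OI callback has been invoked and no result has been given yet.
        self.in_progress.add(obj)

    def admin_operation_result(self, obj):
        self.in_progress.discard(obj)

    def admin_owner_release(self, objs):
        # Fail if an admin operation is in progress on any targeted object.
        if any(obj in self.in_progress for obj in objs):
            return SA_AIS_ERR_BUSY
        for obj in objs:
            self.owner.pop(obj, None)
        return SA_AIS_OK


model = AdminOwnerModel()
model.admin_owner_set("obj1", "exowner")
model.admin_operation_callback_invoked("obj1")
busy_rc = model.admin_owner_release(["obj1"])
model.admin_operation_result("obj1")
ok_rc = model.admin_owner_release(["obj1"])
```

In the model the first release is refused while the operation is pending and the second succeeds once the result has been delivered; the ticket reports that the real service accepts the first one too.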

IMMND trace on that node:

Mar 24 11:02:13.611054 osafimmnd [4131:ImmModel.cc:10998] >> 
adminOperationInvoke 
Mar 24 11:02:13.611072 osafimmnd [4131:ImmModel.cc:11005] T5 Admin op on 
objectName:xattrName_testAdminOwnerRelease_Failures_1012
Mar 24 11:02:13.61 osafimmnd [4131:ImmModel.cc:4] T5 IMPLEMENTER FOR 
ADMIN OPERATION INVOKE 19 conn:55 node:2030f 
name:implementer_testAdminOwnerRelease_Failures_101
Mar 24 11:02:13.611139 osafimmnd [4131:ImmModel.cc:11122] T5 Updating req 
invocation inv:34359738367 conn:54 timeout:0
Mar 24 11:02:13.611163 osafimmnd [4131:ImmModel.cc:11129] TR Located pre 
request continuation 34359738367 adjusting timeout to 0
Mar 24 11:02:13.611182 osafimmnd [4131:ImmModel.cc:11157] T5 Storing impl 
invocation 55 for inv: 34359738367
Mar 24 11:02:13.611215 osafimmnd [4131:ImmModel.cc:11226] << 
adminOperationInvoke 
Mar 24 11:02:13.611252 osafimmnd [4131:immnd_evt.c:4984] T2 IMMND sending Agent 
upcall
Mar 24 11:02:13.613901 osafimmnd [4131:immnd_evt.c:4990] T2 IMMND UPCALL TO 
AGENT SEND SUCCEEDED
Mar 24 11:02:13.614270 osafimmnd [4131:immnd_evt.c:5128] T2 Delayed reply, wait 
for reply from implementer
Mar 24 11:02:13.614547 osafimmnd [4131:immnd_evt.c:5132] << 
immnd_evt_proc_admop 
Mar 24 11:02:13.614873 osafimmnd [4131:immnd_evt.c:8658] >> dequeue_outgoing 
Mar 24 11:02:13.615112 osafimmnd [4131:immnd_evt.c:8664] TR Pending replies:0 
space:16 out list?:(nil)
Mar 24 11:02:13.615396 osafimmnd [4131:immnd_evt.c:8693] << dequeue_outgoing 
Mar 24 11:02:13.615829 osafimmnd [4131:immnd_evt.c:8777] << 
immnd_evt_proc_fevs_rcv 
Mar 24 11:02:14.496009 osafimmnd [4131:ImmModel.cc:12450] T5 Did not timeout 
now - start < 0(1)
Mar 24 11:02:14.609660 osafimmnd [4131:immsv_evt.c:5500] T8 Received: 
IMMND_EVT_A2ND_IMM_FEVS (14) from 2030f
Mar 24 11:02:14.609724 osafimmnd [4131:immnd_evt.c:2837] T2 sender_count: 1 
size: 268 
Mar 24 11:02:14.609761 osafimmnd [4131:immnd_evt.c:3118] >> 
immnd_fevs_local_checks 
Mar 24 11:02:14.609808 osafimmnd [4131:immnd_evt.c:3575] << 
immnd_fevs_local_checks 
Mar 24 11:02:14.609838 osafimmnd [4131:immnd_evt.c:3036] T2 SENDING FEVS TO IMMD
Mar 24 11:02:14.609863 osafimmnd [4131:immsv_evt.c:5481] T8 Sending:  
IMMD_EVT_ND2D_FEVS_REQ to 0
Mar 24 11:02:14.616600 osafimmnd [4131:immnd_evt.c:8716] >> 
immnd_evt_proc_fevs_rcv 
Mar 24 11:02:14.616745 osafimmnd [4131:immnd_evt.c:8732] T2 FEVS from myself, 
still pending:0
Mar 24 11:02:14.616815 osafimmnd [4131:immsv_evt.c:5500] T8 Received: 
IMMND_EVT_A2ND_ADMO_RELEASE (10) from 0
Mar 24 11:02:14.616860 osafimmnd [4131:ImmModel.cc:4549] >> adminOwnerChange 
Mar 24 11:02:14.616893 osafimmnd [4131:ImmModel.cc:4576] T5 Release admin owner 
'exowner'
Mar 24 11:02:14.634875 osafimmnd [4131:ImmModel.cc:4681] TR Cutoff in 
admo-change-loop by childCount
Mar 24 11:02:14.635431 osafimmnd [4131:ImmModel.cc:4589] T5 Release Admin Owner 
for object 
xattrName_testAdminOwnerRelease_Failures_1012
Mar 24 11:02:14.641743 osafimmnd [4131:ImmModel.cc:4681] TR Cutoff in 
admo-change-loop by childCount
Mar 24 11:02:14.642150 osafimmnd [4131:ImmModel.cc:4694] << adminOwnerChange 




[tickets] [opensaf:tickets] #1184 daemonize does not support changing primary group

2015-11-02 Thread Anders Widell

- **Milestone**: 4.4.2 --> never



---

** [tickets:#1184] daemonize does not support changing primary group**

**Status:** invalid
**Milestone:** never
**Created:** Tue Oct 21, 2014 02:29 PM UTC by Hans Feldt
**Last Updated:** Tue Oct 21, 2014 04:02 PM UTC
**Owner:** nobody


The environment variable OPENSAF_GROUP exported in nid.conf is not respected in 
daemon.c.
For consistency with specifying the user name, specifying the primary group 
name should also be possible on the command line.



[tickets] [opensaf:tickets] #1076 2PBE: pbed aborts at pbeClosePrepareTrans

2015-11-02 Thread Anders Widell

- **Milestone**: 4.4.2 --> never



---

** [tickets:#1076] 2PBE: pbed aborts at pbeClosePrepareTrans**

**Status:** duplicate
**Milestone:** never
**Created:** Mon Sep 15, 2014 06:52 AM UTC by Sirisha Alla
**Last Updated:** Thu Feb 19, 2015 11:42 AM UTC
**Owner:** Anders Bjornerstedt
**Attachments:**

- 
[SLOT2.tar.bz2](https://sourceforge.net/p/opensaf/tickets/1076/attachment/SLOT2.tar.bz2)
 (11.8 MB; application/x-bzip)


The issue is seen on SLES X86 with 2PBE and 50k objects. Opensaf is running on 
changeset 5697 + #946 patches

Syslog on SC-2:

Sep 12 19:15:00 SLES-64BIT-SLOT2 osafamfnd[2409]: NO Assigned 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Sep 12 19:15:00 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
618 <0, 2010f> (@OpenSafImmReplicatorA)
Sep 12 19:15:00 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:00 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer (applier) 
connected: 640 (@OpenSafImmReplicatorA) <0, 2010f>
Sep 12 19:15:01 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:01 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:02 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:02 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:03 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:03 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:04 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:04 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:05 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:05 SLES-64BIT-SLOT2 osafimmpbed: NO Slave PBE time-out in waiting 
on porepare for PRTA update ccb:100f0 
dn:safNode=PL-3,safCluster=myClmCluster
Sep 12 19:15:06 SLES-64BIT-SLOT2 osafimmnd[2332]: WA Timeout on Persistent 
runtime Object Mutation, waiting on PBE
Sep 12 19:15:06 SLES-64BIT-SLOT2 osafimmnd[2332]: WA Got error on non local rt 
object update err: 6
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
610 <0, 2010f> (safAmfService)
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer (applier) 
connected: 641 (@safAmfService2010f) <0, 2010f>
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafamfd[2396]: NO Switching StandBy --> 
Active State
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
623 <14, 2020f> (@safAmfService2020f)
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer connected: 642 
(safAmfService) <14, 2020f>
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafrded[2303]: NO RDE role set to ACTIVE
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafclmd[2377]: NO ACTIVE request
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafamfd[2396]: NO Controller switch over done
Sep 12 19:15:12 SLES-64BIT-SLOT2 osafimmnd[2332]: WA Timeout on Persistent 
runtime Object Mutation, waiting on PBE
Sep 12 19:15:12 SLES-64BIT-SLOT2 osafimmnd[2332]: WA >>s_info->to_svc == 0<< 
reply context destroyed before this reply could be made
Sep 12 19:15:12 SLES-64BIT-SLOT2 osafimmnd[2332]: ER Failed to send response to 
agent/client over MDS rc:2
Sep 12 19:15:14 SLES-64BIT-SLOT2 osafimmpbed: NO 2PBE Error (21) in PRTA update 
(ccbId:100f0)
Sep 12 19:15:14 SLES-64BIT-SLOT2 osafimmnd[2332]: WA update of PERSISTENT 
runtime attributes in object 'safNode=PL-3,safCluster=myClmCluster' REVERTED. 
PBE rc:21
Sep 12 19:15:15 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Create of class 
testMA_verifyPrimNoResponseDelCallback_101 is PERSISTENT.
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmpbed: IN Create of class 
testMA_verifyPrimNoResponseDelCallback_101 committing with ccbId:100ee
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmpbed: ER pbePrepareTrans was called 
when sqliteTransLock(0)!=1
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer locally 
disconnected. Marking it as doomed 625 <315, 2020f> (@OpenSafImmPBE)
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer locally 
disconnected. Marking it as doomed 626 <316, 2020f> (OsafImmPbeRt_B)
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
625 <315, 2020f> (@OpenSafImmPBE)
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
626 <316, 2020f> (OsafImmPbeRt_B)
Sep 12 19:15:17 SLES-64BIT-SLOT2 osafimmnd[2332]: WA SLAVE PBE process has 
apparently died at non coord

Program terminated with signal 6, Aborted.
  #0  0x7fd4af31fb55 in

[tickets] [opensaf:tickets] #999 NTF: memory leak due to missing removal of std:tr1:shared_ptr in last container

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.0 --> never



---

** [tickets:#999] NTF: memory leak due to missing removal of std:tr1:shared_ptr 
in last container**

**Status:** wontfix
**Milestone:** never
**Created:** Wed Aug 20, 2014 02:09 PM UTC by Minh Hon Chau
**Last Updated:** Fri Aug 22, 2014 12:37 AM UTC
**Owner:** Minh Hon Chau


In the method NtfAdmin::deleteConfirmedNotification(...), the NtfNotification 
object should be destroyed once NtfAdmin::notificationMap erases the 
NtfSmartPtr.
However, another container (NtfLogger::coll_) still owns a copy of this 
shared_ptr, so the destructor of NtfNotification is never invoked. 
This leaks memory, because NtfLogger::coll_ never removes its element.
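
The shared-ownership effect described above can be illustrated with a small 
Python analogue. This is only a sketch, not the NTF code: it assumes CPython's 
reference counting behaves like std::tr1::shared_ptr for this purpose, and the 
names `Notification`, `notification_map` and `logger_coll` are hypothetical 
stand-ins for NtfNotification, NtfAdmin::notificationMap and NtfLogger::coll_.

```python
import weakref


class Notification:
    """Stand-in for NtfNotification; only its lifetime matters here."""

    def __init__(self, notification_id):
        self.notification_id = notification_id


# Two containers that both hold a strong reference to the same object
# (hypothetical stand-ins for notificationMap and coll_).
notification_map = {}
logger_coll = []

n = Notification(1)
probe = weakref.ref(n)   # lets us observe whether the object is still alive
notification_map[1] = n
logger_coll.append(n)
del n                    # drop the local reference

# Erasing from the first container alone does not release the object ...
del notification_map[1]
alive_after_map_erase = probe() is not None

# ... it is released only once the second container lets go as well.
logger_coll.clear()
alive_after_logger_clear = probe() is not None
```

The object survives the first erase and disappears only after the second 
container is cleared, which is the behaviour the ticket attributes to the 
element left behind in NtfLogger::coll_.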



---


[tickets] [opensaf:tickets] #1000 IMMND asserts at immsv_evt_enc_inline_text after SetErrorString operation

2015-11-02 Thread Anders Widell

- **Milestone**: 4.3.3 --> never



---

** [tickets:#1000] IMMND asserts at immsv_evt_enc_inline_text after 
SetErrorString operation **

**Status:** duplicate
**Milestone:** never
**Created:** Thu Aug 21, 2014 05:22 AM UTC by Sirisha Alla
**Last Updated:** Mon Aug 25, 2014 04:45 AM UTC
**Owner:** nobody
**Attachments:**

- 
[logs.tar](https://sourceforge.net/p/opensaf/tickets/1000/attachment/logs.tar) 
(18.9 MB; application/x-tar)


This issue is seen on a SLES 64-bit 4-node testbed running 4.5 changeset 5608 
plus patches for #938, #994 and #997.

The test calls OiCcbSetErrorString twice inside createCallback() and checks 
that the second invocation of SetErrorString() returns BAD_OPERATION. IMMND 
crashed when CreateCallback() returned with BAD_OPERATION.

syslog on the payload where the test is in progress:

Aug 21 10:12:49 SLES-64BIT-SLOT4 osafimmnd[3019]: NO implementer for class 
'testCcbExt_verifySetErrStrSingleStrPerCbk_133' is 
implementertestCcbExt_verifySetErrStrSingleStrPerCbk_133 => class extent is 
safe.
Aug 21 10:12:49 SLES-64BIT-SLOT4 osafimmnd[3019]: NO 
ImmModel::ccbObjCreateContinuation: implementer returned error, Ccb aborted 
with error: 20
Aug 21 10:12:49 SLES-64BIT-SLOT4 osafimmnd[3019]: WA immsv_evt_enc_inline_text: 
Length missmatch from source line:1098 (1 342010752 '')
Aug 21 10:12:49 SLES-64BIT-SLOT4 osafimmnd[3019]: immsv_evt.c:1098: 
immsv_evt_enc_attrName: Assertion 'immsv_evt_enc_inline_text(__LINE__, o_ub, 
os)' failed.
Aug 21 10:12:49 SLES-64BIT-SLOT4 osafamfnd[3038]: NO 
'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' component restart probation timer 
started (timeout: 600 ns)
Aug 21 10:12:49 SLES-64BIT-SLOT4 osafamfnd[3038]: NO Restarting a component of 
'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' (comp restart count: 1)
Aug 21 10:12:49 SLES-64BIT-SLOT4 osafamfnd[3038]: NO 
'safComp=IMMND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' 
: Recovery is 'componentRestart'


Following is the back trace of the core:

Program terminated with signal 6, Aborted.
 #0  0x7f5decac5b55 in raise () from /lib64/libc.so.6
(gdb) bt
 #0  0x7f5decac5b55 in raise () from /lib64/libc.so.6
 #1  0x7f5decac7131 in abort () from /lib64/libc.so.6
 #2  0x7f5deddffc0e in __osafassert_fail () from 
/usr/lib64/libopensaf_core.so.0
 #3  0x0047ddef in immsv_evt_enc_attrName.part.5 () at immsv_evt.c:1098
 #4  0x0047e49a in immsv_evt_enc_sublevels () at immsv_evt.c:1506
 #5  0x00418ccf in immnd_mds_callback ()
 #6  0x7f5dede25f9f in mds_mcm_send_msg_enc () from 
/usr/lib64/libopensaf_core.so.0
 #7  0x7f5dede2670d in mcm_pvt_red_snd_process_common () from 
/usr/lib64/libopensaf_core.so.0
 #8  0x7f5dede2b299 in mds_send () from /usr/lib64/libopensaf_core.so.0
 #9  0x7f5dede23d78 in ncsmds_api () from /usr/lib64/libopensaf_core.so.0
 #10 0x00419727 in immnd_mds_send_rsp ()
 #11 0x0040a329 in immnd_evt_proc_ccb_obj_create_rsp.isra.43 () at 
immnd_evt.c:3538
 #12 0x00415f00 in immnd_evt_proc_fevs_dispatch () at immnd_evt.c:7782
 #13 0x0041820d in immnd_process_evt () at immnd_evt.c:8506
 #14 0x0040b6ab in main () at immnd_main.c:336
(gdb) thread apply all bt

Thread 4 (Thread 0x7f5dec25a700 (LWP 3023)):
 #0  0x7f5decb614f6 in poll () from /lib64/libc.so.6
 #1  0x7f5deddfc5f0 in osaf_poll_no_timeout () from 
/usr/lib64/libopensaf_core.so.0
 #2  0x7f5deddfc875 in osaf_poll () from /usr/lib64/libopensaf_core.so.0
 #3  0x7f5deddfea02 in auth_server_main () from 
/usr/lib64/libopensaf_core.so.0
 #4  0x7f5ded5a77b6 in start_thread () from /lib64/libpthread.so.0
 #5  0x7f5decb6a9cd in clone () from /lib64/libc.so.6
 #6  0x in ?? ()

Thread 3 (Thread 0x7f5dee244b00 (LWP 3022)):
 #0  0x7f5decb614f6 in poll () from /lib64/libc.so.6
 #1  0x7f5dede34b35 in mdtm_process_recv_events () from 
/usr/lib64/libopensaf_core.so.0
 #2  0x7f5ded5a77b6 in start_thread () from /lib64/libpthread.so.0
 #3  0x7f5decb6a9cd in clone () from /lib64/libc.so.6
 #4  0x in ?? ()

Thread 2 (Thread 0x7f5dee275b00 (LWP 3021)):
 #0  0x7f5decb614f6 in poll () from /lib64/libc.so.6
 #1  0x7f5deddfc5f0 in osaf_poll_no_timeout () from 
/usr/lib64/libopensaf_core.so.0
 #2  0x7f5deddfc7f5 in osaf_ppoll () from /usr/lib64/libopensaf_core.so.0
 #3  0x7f5dede033df in ncs_tmr_wait () from /usr/lib64/libopensaf_core.so.0
 #4  0x7f5ded5a77b6 in start_thread () from /lib64/libpthread.so.0
 #5  0x7f5decb6a9cd in clone () from /lib64/libc.so.6
 #6  0x in ?? ()

Thread 1 (Thread 0x7f5dee247720 (LWP 3019)):
 #0  0x7f5decac5b55 in raise () from /lib64/libc.so.6
 #1  0x7f5decac7131 in abort () from /lib64/libc.so.6
 #2  0x7f5deddffc0e in __osafassert_fail () from 
/usr/lib64/libopensaf_core.so.0
 #3  0x0047ddef in immsv_evt_enc_attrName.part.5 () at immsv_evt.c:1098
 #4  0x0047e49a in immsv_evt_enc_sublevels ()

[tickets] [opensaf:tickets] #980 IMM: immcfg coredump when creating long dn object

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.0 --> never



---

** [tickets:#980] IMM: immcfg coredump when creating long dn object**

**Status:** invalid
**Milestone:** never
**Created:** Fri Aug 08, 2014 07:38 PM UTC by Minh Hon Chau
**Last Updated:** Mon Sep 08, 2014 10:51 AM UTC
**Owner:** Zoran Milinkovic


immcfg dumps core in the following test:

root@uvb:~/ grep EXTENDED /etc/opensaf/immnd.conf
export SA_ENABLE_EXTENDED_NAMES=1
root@uvb:~/ immcfg -m -a longDnsAllowed=1 
opensafImm=opensafImm,safApp=safImmService
root@uvb:~/ immcfg -f longdn_class.xml
root@uvb:~/ immcfg -c OsafNtfCmTestCFG stringRdnCfg=abcd
root@uvb:~/ immcfg -a testNameCfg=123 stringRdnCfg=abcd
root@uvb:~/ immlist stringRdnCfg=abcd
Name   Type Value(s)

testNameCfgSA_NAME_T123 (3)
stringRdnCfg   SA_STRING_T  
stringRdnCfg=abcd
SaImmAttrImplementerName   SA_STRING_T  
SaImmAttrClassName SA_STRING_T  OsafNtfCmTestCFG
SaImmAttrAdminOwnerNameSA_STRING_T  

root@uvb:~/ immcfg -a 
testNameCfg=
 stringRdnCfg=abcd
Aborted (core dumped)

---
The longdn_class.xml as below:
"


SA_CONFIG

stringRdnCfg
SA_STRING_T
SA_CONFIG
SA_INITIALIZED
SA_NOTIFY



testNameCfg
SA_NAME_T
SA_CONFIG
SA_MULTI_VALUE
SA_NOTIFY
SA_WRITABLE



"

The backtrace is as follows:

Core was generated by `immcfg -a 
testNameCfg=1'.
Program terminated with signal SIGABRT, Aborted.
\#0  0x415d5f79 in raise () from /lib64/libc.so.6
(gdb) bt
\#0  0x415d5f79 in raise () from /lib64/libc.so.6
\#1  0x415d9388 in abort () from /lib64/libc.so.6
\#2  0x409c5cbe in __osafassert_fail (__file=__file@entry=0x409f9cd1 
"osaf_extended_name.c",
__line=__line@entry=130, __func=__func@entry=0x409f9d60 <__FUNCTION__.3257> 
"osaf_extended_name_length",
__assertion=__assertion@entry=0x409f9d10 "osaf_extended_names_enabled && 
length >= SA_MAX_UNEXTENDED_NAME_LENGTH")
at sysf_def.c:281
\#3  0x409c3936 in osaf_extended_name_length (name=name@entry=0x62dc60) 
at osaf_extended_name.c:129
\#4  0x40c322fd in imma_copyAttrValue (p=0x62d870, 
attrValueType=SA_IMM_ATTR_SANAMET, attrValue=0x62dc60)
at imma_init.c:421
\#5  0x40c24e77 in saImmOmCcbObjectModify_2 
(ccbHandle=ccbHandle@entry=1406852104797087000,
objectName=objectName@entry=0x61b240, attrMods=attrMods@entry=0x62dc40) at 
imma_om_api.c:2349
\#6  0x00412ddc in immutil_saImmOmCcbObjectModify_2 
(immCcbHandle=1406852104797087000, objectName=0x61b240,
attrMods=attrMods@entry=0x62dc40) at immutil.c:1540
\#7  0x0040d2ee in object_modify 
(objectNames=objectNames@entry=0x61b220, optargs=optargs@entry=0x61b010,
optargs_len=optargs_len@entry=1) at imm_cfg.c:589
\#8  0x0040e894 in imm_operation (argc=4, argv=) at 
imm_cfg.c:1439
\#9  0x415c0ec5 in __libc_start_main () from /lib64/libc.so.6
\#10 0x0040378e in _start ()
(gdb) f 2
\#2  0x409c5cbe in __osafassert_fail (__file=__file@entry=0x409f9cd1 
"osaf_extended_name.c",
__line=__line@entry=130, __func=__func@entry=0x409f9d60 <__FUNCTION__.3257> 
"osaf_extended_name_length",
__assertion=__assertion@entry=0x409f9d10 "osaf_extended_names_enabled && 
length >= SA_MAX_UNEXTENDED_NAME_LENGTH")
at sysf_def.c:281
281 abort();
(gdb) p osaf_extended_names_enabled
$1 = false
(gdb)
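
The gdb session above shows `osaf_extended_names_enabled` as false inside the 
immcfg process, although immnd.conf exports SA_ENABLE_EXTENDED_NAMES=1. The 
ticket does not state the root cause. The sketch below only illustrates one 
way a client could refuse a long DN instead of aborting, assuming that the 
flag is derived from the calling process's own environment and that 
SA_MAX_UNEXTENDED_NAME_LENGTH is 256. The function names are hypothetical.

```python
import os

# Assumption: value of SA_MAX_UNEXTENDED_NAME_LENGTH.
SA_MAX_UNEXTENDED_NAME_LENGTH = 256


def extended_names_enabled(env=None):
    """Return True if SA_ENABLE_EXTENDED_NAMES=1 is set in the environment."""
    env = os.environ if env is None else env
    return env.get("SA_ENABLE_EXTENDED_NAMES") == "1"


def check_dn_length(dn, env=None):
    """Return an error string for a DN the client cannot handle, else None."""
    too_long = len(dn) >= SA_MAX_UNEXTENDED_NAME_LENGTH
    if too_long and not extended_names_enabled(env):
        return "long DN requires SA_ENABLE_EXTENDED_NAMES=1 in the client environment"
    return None
```

Called with an empty environment and a 300-character value, the check returns 
an error message. With the variable set to 1 it returns None.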



---


[tickets] [opensaf:tickets] #972 OI rejection through callback results in immnd crash

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.FC --> never



---

** [tickets:#972] OI rejection through callback results in immnd crash**

**Status:** duplicate
**Milestone:** never
**Created:** Wed Jul 30, 2014 11:49 AM UTC by surender khetavath
**Last Updated:** Thu Aug 07, 2014 07:51 AM UTC
**Owner:** nobody
**Attachments:**

- 
[sc1_logs.tgz](https://sourceforge.net/p/opensaf/tickets/972/attachment/sc1_logs.tgz)
 (461.2 kB; application/x-compressed-tar)


gcc 4.9
setup : 1 controller
changeset : 5491 and patch from #643

Running any CCB operation and letting the OI reject it by replying with 
ERR_BAD_OP inside the callback results in an immnd crash.

(gdb) bt
#0  0x000a0042db53 in ?? ()
#1  0x000c in ?? ()
#2  0x006d19c0 in ?? ()
#3  0x7fff030f89a0 in ?? ()
#4  0x0041817c in immnd_evt_proc_ccb_finalize ()
Backtrace stopped: frame did not save the PC
(gdb) bt full
#0  0x000a0042db53 in ?? ()
No symbol table info available.
#1  0x000c in ?? ()
No symbol table info available.
#2  0x006d19c0 in ?? ()
No symbol table info available.
#3  0x7fff030f89a0 in ?? ()
No symbol table info available.
#4  0x0041817c in immnd_evt_proc_ccb_finalize ()
No symbol table info available.
Backtrace stopped: frame did not save the PC
(gdb) 


logs attached.


---


[tickets] [opensaf:tickets] #1576 AMF : SU struck in terminating ( health check timeout - proxy proxied )

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1576] AMF : SU struck in terminating ( health check timeout - 
proxy proxied )**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Thu Oct 29, 2015 05:49 AM UTC by Srikanth R
**Last Updated:** Thu Oct 29, 2015 05:49 AM UTC
**Owner:** nobody
**Attachments:**

- 
[1570.tgz](https://sourceforge.net/p/opensaf/tickets/1576/attachment/1570.tgz) 
(1.5 MB; application/x-compressed-tar)


Changeset : 6901
Application : SU1 mapped to SC-2 & SU2 mapped to SC-1.
  Each SU consists of 3 pre-instantiable components (one of the components is 
  LOCAL & PROXIED and the other two components are SA_AWARE).

Steps :

 * Brought up two controllers in the cluster.
 * Performed an unlock-in operation on SU1.
 * Health check is started by both SA-AWARE components.
 * One of the SA-AWARE components faulted in health check and, as part of 
repair, the SU is stuck in terminating state.


Oct 29 10:30:35 SYSTEST-CNTLR-2 osafamfnd[3617]: NO 
'safSu=2nAdminRepair_SU_1,safSg=2nAdminRepair_SG,safApp=2nAdminRepair' Presence 
State INSTANTIATING => INSTANTIATED
Oct 29 10:31:21 SYSTEST-CNTLR-2 osafamfnd[3617]: NO saAmfSUFailover is true for 
'safSu=2nAdminRepair_SU_1,safSg=2nAdminRepair_SG,safApp=2nAdminRepair'
Oct 29 10:31:21 SYSTEST-CNTLR-2 osafamfnd[3617]: NO SU failover probation timer 
started (timeout: 12000 ns)
Oct 29 10:31:21 SYSTEST-CNTLR-2 osafamfnd[3617]: NO Performing failover of 
'safSu=2nAdminRepair_SU_1,safSg=2nAdminRepair_SG,safApp=2nAdminRepair' (SU 
failover count: 1)
Oct 29 10:31:21 SYSTEST-CNTLR-2 osafamfnd[3617]: NO 
'safComp=2nAdminRepair,safSu=2nAdminRepair_SU_1,safSg=2nAdminRepair_SG,safApp=2nAdminRepair'
 recovery action escalated from 'noRecommendation' to 'suFailover'
Oct 29 10:31:21 SYSTEST-CNTLR-2 osafamfnd[3617]: NO 
'safComp=2nAdminRepair,safSu=2nAdminRepair_SU_1,safSg=2nAdminRepair_SG,safApp=2nAdminRepair'
 faulted due to 'healthCheckcallbackTimeout' : Recovery is 'suFailover'
Oct 29 10:31:21 SYSTEST-CNTLR-2 osafamfnd[3617]: NO Terminating components of 
'safSu=2nAdminRepair_SU_1,safSg=2nAdminRepair_SG,safApp=2nAdminRepair'(abruptly 
& unordered)
Oct 29 10:31:21 SYSTEST-CNTLR-2 osafamfnd[3617]: NO 
'safSu=2nAdminRepair_SU_1,safSg=2nAdminRepair_SG,safApp=2nAdminRepair' Presence 
State INSTANTIATED => TERMINATING
Oct 29 10:31:21 SYSTEST-CNTLR-2 osafamfnd[3617]: NO 
'safSu=2nAdminRepair_SU_1,safSg=2nAdminRepair_SG,safApp=2nAdminRepair' Presence 
State TERMINATING => TERMINATING
Oct 29 10:31:21 SYSTEST-CNTLR-2 osafamfnd[3617]: NO 
'safSu=2nAdminRepair_SU_1,safSg=2nAdminRepair_SG,safApp=2nAdminRepair' Presence 
State TERMINATING => TERMINATING

 * Amfd crashes during opensafd stop on SC-2:

Oct 29 11:27:46 SYSTEST-CNTLR-2 opensafd: Stopping OpenSAF Services
Oct 29 11:27:46 SYSTEST-CNTLR-2 osafamfnd[3617]: NO Shutdown initiated
Oct 29 11:27:46 SYSTEST-CNTLR-2 osafamfnd[3617]: NO Terminating all AMF 
components
...
Oct 29 11:27:46 SYSTEST-CNTLR-2 osafamfd[3607]: NO Re-initializing with IMM
...
Oct 29 11:28:46 SYSTEST-CNTLR-2 osafamfd[3607]: exiting for shutdown
Oct 29 11:28:46 SYSTEST-CNTLR-2 osafamfnd[3617]: ER AMF director unexpectedly 
crashed
Oct 29 11:28:46 SYSTEST-CNTLR-2 osafamfnd[3617]: Rebooting OpenSAF NodeId = 
131599 EE Name = , Reason: local AVD down(Adest) or both AVD down(Vdest) 
received, OwnNodeId = 131599, SupervisionTime = 60



---


[tickets] [opensaf:tickets] #1542 AMF : Quiesced callbacks should be generated, during recovery (su failover flag disabled)

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1542] AMF : Quiesced callbacks should be generated, during 
recovery (su failover flag disabled)**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Tue Oct 13, 2015 10:33 AM UTC by Srikanth R
**Last Updated:** Tue Oct 20, 2015 11:17 AM UTC
**Owner:** nobody


Changeset : 6901
Application : 2n 
 4 SIs configured with SI1 as sponsor for SI2,SI3,SI4
 Component recovery policy - 3
 sufailoverflag -0

Steps :

 * Initially all the SIs are in unassigned state. SU1 is hosted active and SU2 
is hosted standby.
 * Performed lock on SI4.
 * Then performed unlock on SI4, for which a component in SU1 rejected the 
active callback.
 * As part of recovery, all the assignments to SU1 should be removed and the 
active assignments given to the standby SU, i.e. SU2.
 * In the current implementation, quiesced callbacks are not generated during 
removal of assignments.
 * According to the spec, page no. 195:

If the service unit is configured to fail over as a single entity 
(saAmfSUFailover set to SA_TRUE), all other components of the service unit are 
abruptly terminated, and all service instances assigned to that service unit 
are failed over; otherwise, only the erroneous component is abruptly 
terminated, and all component service instances that were assigned to it are 
failed over. Other components are not terminated, but all service instances 
that contained one of the failed over component service instances have their 
remaining component service instances switched over
 
 
 * Below is the syslog on the node where SU1 is hosted.
 
 
 Oct 13 03:15:24 SYSTEST-PLD-1 osafamfnd[2725]: NO Assigning 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' QUIESCED to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:24 SYSTEST-PLD-1 osafamfnd[2725]: NO Assigned 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' QUIESCED to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:24 SYSTEST-PLD-1 osafamfnd[2725]: NO Removed 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' from 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO Assigning 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' ACTIVE to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO SU failover probation timer 
started (timeout: 12000 ns)
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO Performing failover of 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' (SU failover count: 1)
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO 
'safComp=COMP2,safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' faulted 
due to 'csiSetcallbackFailed' : Recovery is 'componentFailover'
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' Presence State 
INSTANTIATED => TERMINATING
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO Assigned 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' ACTIVE to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO Removing 'all (4) SIs' from 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO Removing 
'safSi=TestApp_SI1,safApp=TestApp_TwoN' from 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO Removing 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' from 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO Removing 
'safSi=TestApp_SI3,safApp=TestApp_TwoN' from 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO Removing 
'safSi=TestApp_SI4,safApp=TestApp_TwoN' from 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO Removed 
'safSi=TestApp_SI1,safApp=TestApp_TwoN' from 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO Removed 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' from 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO Removed 
'safSi=TestApp_SI3,safApp=TestApp_TwoN' from 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO Removed 
'safSi=TestApp_SI4,safApp=TestApp_TwoN' from 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO Removed 'all SIs' from 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 13 03:15:27 SYSTEST-PLD-1 osafamfnd[2725]: NO 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' Presence State 
TERMINATING => INSTANTIATED
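
The failover rule quoted from the spec above can be sketched as a small 
decision function. This is an illustrative model only, not AMF code: the 
function name `plan_recovery`, the dictionary layout and the component and CSI 
names are hypothetical. With saAmfSUFailover disabled, as in this ticket, the 
rule yields CSIs that are switched over rather than failed over, and those are 
the ones for which the ticket expects quiesced callbacks.

```python
def plan_recovery(su_failover, assignments, faulty_comp):
    """Apply the quoted rule to one service unit.

    assignments maps component name -> list of (si, csi) pairs assigned to it.
    """
    all_csis = [pair for pairs in assignments.values() for pair in pairs]
    if su_failover:
        # SU fails over as a single entity: every component is terminated
        # and everything assigned to the SU is failed over.
        return {
            "terminated": sorted(assignments),
            "failed_over": sorted(all_csis),
            "switched_over": [],
        }
    # Otherwise only the erroneous component is terminated and its CSIs fail over.
    failed = list(assignments.get(faulty_comp, []))
    affected_sis = {si for si, _ in failed}
    # Remaining CSIs of the affected SIs are switched over, not failed over.
    switched = [p for p in all_csis if p[0] in affected_sis and p not in failed]
    return {
        "terminated": [faulty_comp],
        "failed_over": sorted(failed),
        "switched_over": sorted(switched),
    }


# Hypothetical layout: COMP2 faults while serving CSIs of SI1 and SI2.
example = {
    "COMP1": [("SI1", "CSI1a")],
    "COMP2": [("SI1", "CSI1b"), ("SI2", "CSI2b")],
    "COMP3": [("SI3", "CSI3")],
}
plan = plan_recovery(False, example, "COMP2")
```

In this example only COMP2 is terminated, its two CSIs are failed over, and 
the remaining CSI of SI1 on COMP1 is switched over. SI3 is untouched.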
  


---


[tickets] [opensaf:tickets] #1548 logd on standby crashed, for nonexistent logsv data group

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1548] logd on standby crashed, for nonexistent logsv data group**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Thu Oct 15, 2015 12:29 PM UTC by Srikanth R
**Last Updated:** Thu Oct 15, 2015 02:53 PM UTC
**Owner:** Mathi Naickan


Changeset : 6901

Steps : 

  Logsv crashes on the standby controller if the group does not exist on the 
standby controller.
  For the following command, logd crashed on the standby controller with the 
syslog below.
  immcfg -a logDataGroupname=testGroup logConfig=1,safApp=safLogService
 

Oct 15 17:09:49 CONTROLLER-1 osaflogd[2227]: ER osaf_user_is_member_of_group: 
group 'testGroup' does not exist
Oct 15 17:09:49 CONTROLLER-1 osaflogd[2227]: WA 
lgs_cfg_verify_log_data_groupname: osaf_user_is_member_of_group() Fail
Oct 15 17:09:49 CONTROLLER-1 osaflogd[2227]: WA lgs_cfg_update: Verify fail for 
lgs configuration
Oct 15 17:09:49 CONTROLLER-1 osaflogd[2227]: ER ckpt_proc_lgs_cfg_v5 
lgs_cfg_update Fail
Oct 15 17:09:49 CONTROLLER-1 osaflogd[2227]: lgs_mbcsv_v5.c:127: 
ckpt_proc_lgs_cfg_v5: Assertion '0' failed.
Oct 15 17:09:49 CONTROLLER-1 osafamfnd[2281]: NO 
'safComp=LOG,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'

  Logd also crashes if the user is not part of the newly created group on the 
standby. Logsv should reject the CCB operation if the standby is not properly 
updated.
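
A minimal sketch of the "reject instead of assert" behaviour requested above, 
written in Python for illustration. It is not the logd implementation: the 
function names are hypothetical, the return codes 1 (SA_AIS_OK) and 20 
(SA_AIS_ERR_BAD_OPERATION) are assumed from the SA Forum AIS definitions, and 
how the active would learn the standby's verdict is not covered.

```python
import grp

# Assumed AIS return codes.
SA_AIS_OK = 1
SA_AIS_ERR_BAD_OPERATION = 20


def group_exists(name):
    """Return True if the group is known to the local group database."""
    try:
        grp.getgrnam(name)
    except KeyError:
        return False
    return True


def validate_log_data_groupname(name, lookup=group_exists):
    """Hypothetical validation step: return an error code, never abort."""
    if not lookup(name):
        return SA_AIS_ERR_BAD_OPERATION
    return SA_AIS_OK
```

The `lookup` parameter exists only so the check can be exercised without 
touching the real group database.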


---


[tickets] [opensaf:tickets] #1541 AMF : Both the 2N SUs are assigned Standby SI Assignment run time objects.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1541] AMF : Both the 2N SUs are assigned Standby SI Assignment run 
time objects.**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Tue Oct 13, 2015 07:17 AM UTC by Srikanth R
**Last Updated:** Mon Oct 19, 2015 11:49 AM UTC
**Owner:** nobody
**Attachments:**

- [1541.sh](https://sourceforge.net/p/opensaf/tickets/1541/attachment/1541.sh) 
(16.7 kB; application/x-shellscript)


Changeset : 6901
Configuration :  2N 
   2 SUs and 4 SIs without si-si deps.
   Component recovery = 3 ( suFailoverflag disabled )

Steps :

 *  Initially all the SIs are in assigned state.
 *  Performed a shutdown operation on the SU hosting the active assignment.
 *  In the quiescing callback, ensure that the component does not respond.
 *  As part of recovery, the other SU got active callbacks, but the SI 
assignment objects for active are not created. Only the standby SI assignments 
are present in IMM.
 *  After unlocking the locked SU, both SUs show standby assignments in the 
siass runtime objects.
 
safSISU=safSu=TestApp_SU1\,safSg=TestApp_SG1\,safApp=TestApp_TwoN,safSi=TestApp_SI2,safApp=TestApp_TwoN
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=TestApp_SU1\,safSg=TestApp_SG1\,safApp=TestApp_TwoN,safSi=TestApp_SI4,safApp=TestApp_TwoN
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=TestApp_SU1\,safSg=TestApp_SG1\,safApp=TestApp_TwoN,safSi=TestApp_SI3,safApp=TestApp_TwoN
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=TestApp_SU2\,safSg=TestApp_SG1\,safApp=TestApp_TwoN,safSi=TestApp_SI1,safApp=TestApp_TwoN
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=TestApp_SU1\,safSg=TestApp_SG1\,safApp=TestApp_TwoN,safSi=TestApp_SI1,safApp=TestApp_TwoN
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=TestApp_SU2\,safSg=TestApp_SG1\,safApp=TestApp_TwoN,safSi=TestApp_SI2,safApp=TestApp_TwoN
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=TestApp_SU2\,safSg=TestApp_SG1\,safApp=TestApp_TwoN,safSi=TestApp_SI3,safApp=TestApp_TwoN
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=TestApp_SU2\,safSg=TestApp_SG1\,safApp=TestApp_TwoN,safSi=TestApp_SI4,safApp=TestApp_TwoN
saAmfSISUHAState=STANDBY(2)

   
Below is the error logged in the active controller syslog.
   
Oct 13 00:19:51 CONTROLLER-2 osafamfd[11712]: EM sg_2n_fsm.cc:2359: 
safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN (55)

The configuration to create the application is attached. The same issue is 
observed for a similar scenario during the Node shutdown operation.
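
The siass listing above can be checked mechanically for SIs that have no 
ACTIVE assignment. The following is a sketch under the assumption that the 
listing always alternates a `safSISU=` line with a `saAmfSISUHAState=` line, 
as in this ticket. The function names are hypothetical.

```python
import re
from collections import defaultdict


def ha_states_by_si(text):
    """Map each SI DN to the (SU DN, HA state) pairs found in the listing."""
    result = defaultdict(list)
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("safSISU="):
            # Split at the first unescaped comma: the escaped part is the SU DN.
            su_part, si_dn = re.split(r"(?<!\\),", line[len("safSISU="):], maxsplit=1)
            current = (su_part.replace("\\,", ","), si_dn)
        elif line.startswith("saAmfSISUHAState=") and current:
            state = line.split("=", 1)[1].split("(")[0]
            su_dn, si_dn = current
            result[si_dn].append((su_dn, state))
            current = None
    return dict(result)


def sis_without_active(text):
    """Return the SI DNs for which no SU holds the ACTIVE HA state."""
    return sorted(
        si for si, pairs in ha_states_by_si(text).items()
        if all(state != "ACTIVE" for _, state in pairs)
    )
```

Fed the listing from this ticket, `sis_without_active` would report all four 
SIs, since every assignment shown is STANDBY.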


---


[tickets] [opensaf:tickets] #1485 amf:Nway, Unstable SG during SI lock when standby faulted with comp failover recovery.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1485] amf:Nway, Unstable SG during SI lock when standby faulted 
with comp failover recovery.**

**Status:** unassigned
**Milestone:** 4.6.2
**Labels:** NWAY COMP_FAILOVER 
**Created:** Wed Sep 16, 2015 09:24 AM UTC by Praveen
**Last Updated:** Wed Sep 16, 2015 09:24 AM UTC
**Owner:** nobody
**Attachments:**

- 
[AppConfig-N-Way.xml](https://sourceforge.net/p/opensaf/tickets/1485/attachment/AppConfig-N-Way.xml)
 (16.1 kB; text/xml)
- 
[osafamfd](https://sourceforge.net/p/opensaf/tickets/1485/attachment/osafamfd) 
(280.5 kB; application/octet-stream)


Attached are the configuration and AMF traces to reproduce the problem.
Steps to reproduce:
1) Lock the SI which has assignments on all the SUs.
2) While the active component is processing the quiesced callback, kill the 
standby comp for this SI on another SU with component failover recovery.
3) AMF reverts the SI back to unlocked state.
4) SG becomes unstable.
5) For the faulted SU, removal of assignments is not performed and it is stuck 
in Terminating state.

Assignments before si lock:

safSISU=safSu=SU3\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU3\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU2\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU1\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SU1\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
saAmfSISUHAState=ACTIVE(1)

Assignment status and SU state after SI lock and fault:
safSISU=safSu=SU3\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU3\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU2\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU1\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SU1\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
saAmfSISUHAState=QUIESCED(3)

safSu=SU3,safSg=AmfDemo,safApp=AmfDemo1
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=DISABLED(2)
saAmfSUPresenceState=TERMINATING(4)
saAmfSUReadinessState=OUT-OF-SERVICE(1)



---


[tickets] [opensaf:tickets] #1488 LCK: if master GLND reboots deadlock can occur for currently held locks

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1488] LCK: if master GLND reboots deadlock can occur for currently 
held locks**

**Status:** review
**Milestone:** 4.6.2
**Created:** Wed Sep 16, 2015 02:54 PM UTC by Alex Jones
**Last Updated:** Thu Sep 17, 2015 04:41 PM UTC
**Owner:** Alex Jones


If the master GLND is rebooted while one or more exclusive locks are held, a 
new master is elected and the other GLNDs send it the lock information they 
currently hold. However, they do not send all the information the new master 
needs to lock/unlock the currently held locks.

When this happens the lock(s) can never be unlocked or granted.


---


[tickets] [opensaf:tickets] #1317 ckpt : stale replicas observed in a 70 node cluster

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1317] ckpt : stale replicas observed in a 70 node cluster**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Wed Apr 15, 2015 10:16 AM UTC by Sirisha Alla
**Last Updated:** Tue Aug 11, 2015 06:12 AM UTC
**Owner:** A V Mahesh (AVM)
**Attachments:**

- 
[logs.tar.bz2](https://sourceforge.net/p/opensaf/tickets/1317/attachment/logs.tar.bz2)
 (6.5 MB; application/x-bzip)


This issue is observed on cs6377 (46FC Tag). The cluster consists of 70 nodes, 
and 2 checkpoint applications run on each node. The application running on the 
active controller creates the checkpoint, while the applications running on the 
other nodes open the same checkpoint and use it. After the sections have been 
created, written and read, all the applications finalize the handles used. The 
retention duration of the checkpoint is set to a minimal value of 1000 
nanoseconds.

Contents of /dev/shm on the active controller after the applications exited:

SLES-64BIT-SLOT1:~ # date;ls -lrt /dev/shm/
Wed Apr 15 14:25:09 IST 2015
total 1772
-rw-r--r-- 1 opensaf opensaf 1076040 Apr 15 13:38 opensaf_NCS_MQND_QUEUE_CKPT_INFO
-rw-r--r-- 1 opensaf opensaf  328000 Apr 15 13:38 opensaf_NCS_GLND_RES_CKPT_INFO
-rw-r--r-- 1 opensaf opensaf  16 Apr 15 13:38 opensaf_NCS_GLND_LCK_CKPT_INFO
-rw-r--r-- 1 opensaf opensaf   88000 Apr 15 13:38 opensaf_NCS_GLND_EVT_CKPT_INFO
-rw-r--r-- 1 opensaf opensaf  704008 Apr 15 13:38 opensaf_CPND_CHECKPOINT_INFO_131343
-rw-r--r-- 1 opensaf opensaf   79848 Apr 15 13:55 opensaf_safCkpt=active_replica_ckpt_name_1_sysgrou_131343_4
-rw-r--r-- 1 opensaf opensaf   79848 Apr 15 13:56 opensaf_safCkpt=active_replica_ckpt_name_1_sysgrou_131343_9
-rw-r--r-- 1 opensaf opensaf   79848 Apr 15 13:57 opensaf_safCkpt=active_replica_ckpt_name_1_sysgrou_131343_16
SLES-64BIT-SLOT1:~ # date;immfind|grep -i ckpt
Wed Apr 15 14:25:11 IST 2015
safApp=safCkptService
SLES-64BIT-SLOT1:~ # 

When an application then tries to create a checkpoint with the same name, the 
checkpoint service does not create a new replica in shared memory.

cpd and cpnd traces are attached.


---


[tickets] [opensaf:tickets] #1306 AMF: notifications during various admin operations

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1306] AMF: notifications during various admin operations**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Tue Apr 07, 2015 06:32 PM UTC by Srikanth R
**Last Updated:** Tue May 26, 2015 05:57 AM UTC
**Owner:** Praveen


Changeset : 6377 ( 4.6 FC)
Application / Model : Observed in 2n and NoRed models

Below are the different issues observed in the notifications generated by AMF 
during admin operations:

1) If the node group / node is hosting the entire application and a lock 
operation is issued on the node group / node, the alarm and the notifications 
are generated in the wrong order.

First, two state change notifications should be generated, about the SI moving 
to the partially assigned state and to the quiesced state. Only after that 
should the alarm about the SI being unassigned be generated.

Instead, the notifications and the alarm arrive in this order:

===  Apr  3 22:28:12 - State Change  ===
eventType = SA_NTF_OBJECT_STATE_CHANGE
notificationObject = "safSi=TWONSI1,safApp=TWONAPP"
notifyingObject = "safApp=safAmfService"
notificationClassId = SA_NTF_VENDOR_ID_SAF.SA_SVC_AMF.111 (0x6f)
additionalText = "The Assignment state of SI safSi=TWONSI1,safApp=TWONAPP 
changed"
sourceIndicator = SA_NTF_OBJECT_OPERATION
State ID = SA_AMF_ASSIGNMENT_STATE
Old State: SA_AMF_ASSIGNMENT_FULLY_ASSIGNED
New State: SA_AMF_ASSIGNMENT_PARTIALLY_ASSIGNED


===  Apr  3 22:28:12 - Alarm  ===
eventType = SA_NTF_ALARM_PROCESSING
notificationObject = "safSi=TWONSI1,safApp=TWONAPP"
notifyingObject = "safApp=safAmfService"
notificationClassId = SA_NTF_VENDOR_ID_SAF.SA_SVC_AMF.5 (0x5)
additionalText = "SI designated by safSi=TWONSI1,safApp=TWONAPP has no current 
active assignments to any SU"
probableCause = SA_NTF_SOFTWARE_ERROR
perceivedSeverity = SA_NTF_SEVERITY_MAJOR

===  Apr  3 22:28:12 - State Change  ===
eventType = SA_NTF_OBJECT_STATE_CHANGE
notificationObject = "safSu=SU1,safSg=SGONE,safApp=TWONAPP"
notifyingObject = "safApp=safAmfService"
notificationClassId = SA_NTF_VENDOR_ID_SAF.SA_SVC_AMF.110 (0x6e)
additionalText = "The HA state of SI safSi=TWONSI1,safApp=TWONAPP assigned to 
SU safSu=SU1,safSg=SGONE,safApp=TWONAPP changed"
- additionalInfo: 0 -
 infoId = 2
 infoType = 10
 infoValue = "safSi=TWONSI1,safApp=TWONAPP"
sourceIndicator = SA_NTF_OBJECT_OPERATION
State ID = SA_AMF_HA_STATE
Old State: 
New State: SA_AMF_HA_QUIESCED


In the case of an SI lock operation, the two state change notifications are 
generated first and the alarm afterwards, which is the correct order.
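
The ordering expected in issue 1 can be expressed as a small check. This is only an illustrative sketch: the function name `alarm_follows_state_changes` and the flattened list of event types are assumptions made for the example and are not part of AMF or NTF. The event type strings are the ones printed in the dump above.

```python
def alarm_follows_state_changes(event_types, expected_state_changes=2):
    """Return True if the first alarm appears only after the expected
    number of state change notifications has been seen."""
    seen_state_changes = 0
    for event_type in event_types:
        if event_type == "SA_NTF_OBJECT_STATE_CHANGE":
            seen_state_changes += 1
        elif event_type == "SA_NTF_ALARM_PROCESSING":
            return seen_state_changes >= expected_state_changes
    # No alarm in the sequence: nothing is out of order.
    return True


# Order observed for the node / node group lock in this ticket.
observed = [
    "SA_NTF_OBJECT_STATE_CHANGE",
    "SA_NTF_ALARM_PROCESSING",
    "SA_NTF_OBJECT_STATE_CHANGE",
]
# Order the ticket expects: both state changes first, then the alarm.
expected = [
    "SA_NTF_OBJECT_STATE_CHANGE",
    "SA_NTF_OBJECT_STATE_CHANGE",
    "SA_NTF_ALARM_PROCESSING",
]

print(alarm_follows_state_changes(observed))  # False
print(alarm_follows_state_changes(expected))  # True
```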


2) For the lock and shutdown operations, the old state is not filled in when 
the state change notification is issued for an HA state change.


The old state (Active) is not filled in for the shutdown operation:
===  Apr  7 15:21:03 - State Change  ===
eventType = SA_NTF_OBJECT_STATE_CHANGE
notificationObject = 
"safSu=Srikanth_nored_3,safSg=SG_Srikanth_nored,safApp=nored"
notifyingObject = "safApp=safAmfService"
notificationClassId = SA_NTF_VENDOR_ID_SAF.SA_SVC_AMF.110 (0x6e)
additionalText = "The HA state of SI safSi=Srikanth_nored_3,safApp=nored 
assigned to SU safSu=Srikanth_nored_3,safSg=SG_Srikanth_nored,safApp=nored 
changed"
- additionalInfo: 0 -
 infoId = 2
 infoType = 10
 infoValue = "safSi=Srikanth_nored_3,safApp=nored"
sourceIndicator = SA_NTF_OBJECT_OPERATION
State ID = SA_AMF_HA_STATE
Old State:
New State: SA_AMF_HA_QUIESCING

Here the old state should be quiescing:
===  Apr  7 15:21:03 - State Change  ===
eventType = SA_NTF_OBJECT_STATE_CHANGE
notificationObject = 
"safSu=Srikanth_nored_3,safSg=SG_Srikanth_nored,safApp=nored"
notifyingObject = "safApp=safAmfService"
notificationClassId = SA_NTF_VENDOR_ID_SAF.SA_SVC_AMF.110 (0x6e)
additionalText = "The HA state of SI safSi=Srikanth_nored_3,safApp=nored 
assigned to SU safSu=Srikanth_nored_3,safSg=SG_Srikanth_nored,safApp=nored 
changed"
- additionalInfo: 0 -
 infoId = 2
 infoType = 10
 infoValue = "safSi=Srikanth_nored_3,safApp=nored"
sourceIndicator = SA_NTF_OBJECT_OPERATION
State ID = SA_AMF_HA_STATE
Old State:
New State: SA_AMF_HA_QUIESCED
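
When going through a longer dump, notifications with an empty old state can be picked out mechanically. The following is a minimal sketch that assumes the dump uses the `Old State:` / `New State:` line layout shown above; the helper name `find_missing_old_state` is made up for this example.

```python
def find_missing_old_state(dump):
    """Return the 'New State' values of notifications whose 'Old State' is empty."""
    missing = []
    old_state = None
    for line in dump.splitlines():
        line = line.strip()
        if line.startswith("Old State:"):
            old_state = line[len("Old State:"):].strip()
        elif line.startswith("New State:") and old_state is not None:
            new_state = line[len("New State:"):].strip()
            if not old_state:
                missing.append(new_state)
            old_state = None
    return missing


sample = """\
State ID = SA_AMF_HA_STATE
Old State:
New State: SA_AMF_HA_QUIESCING
State ID = SA_AMF_ASSIGNMENT_STATE
Old State: SA_AMF_ASSIGNMENT_FULLY_ASSIGNED
New State: SA_AMF_ASSIGNMENT_PARTIALLY_ASSIGNED
"""

print(find_missing_old_state(sample))  # ['SA_AMF_HA_QUIESCING']
```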


3) An invalid extra notification is generated when an SG / SI is locked while 
the no-redundancy SU is in assigned, in-service and enabled state.

===  Apr  7 14:53:46 - State Change  ===
eventType = SA_NTF_OBJECT_STATE_CHANGE
notificationObject = "safSg=SG_Srikanth_nored,safApp=nored"
notifyingObject = "safApp=safAmfService"
notificationClassId = SA_NTF_VENDOR_ID_SAF.SA_SVC_AMF.103 (0x67)
additionalText = "Admin state of safSg=SG_Srikanth_nored,safApp=nored changed"
sourceIndicator = SA_NTF_MANAGEMENT_OPERATION
State ID = SA_AMF_ADMIN_STATE
Old State: SA_AMF_ADMIN_LOCKED
New State: SA_AMF_ADMIN_LOCKED


===  Apr  7 15:11:58 - State Change  ===
eventType = SA_NTF_OBJECT_STATE_CHANGE
notificationObject = "safSi=Srikanth_nored_3,safApp=nored"
notifyingObject = "safApp=safAmfService"
notificationClassId = SA_NTF_VENDOR_ID_SAF.SA_SVC_AMF.104 (0x68)
additionalText = "Admin state of safSi=Srikanth_nored_3,safApp=nored changed"
sourceIndicator = SA_NTF_MANAGEMENT_OPERATION
State ID =

[tickets] [opensaf:tickets] #887 saLckFinalize api returns ERR_LIBRARY after Glnd restart.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#887] saLckFinalize api returns ERR_LIBRARY after Glnd restart.**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Tue May 06, 2014 10:39 AM UTC by Hrishikesh
**Last Updated:** Mon Sep 28, 2015 05:45 AM UTC
**Owner:** nobody
**Attachments:**

- [logs.tgz](https://sourceforge.net/p/opensaf/tickets/887/attachment/logs.tgz) 
(2.4 MB; application/x-compressed-tar)


This ticket is a replica of OpenSAF devel ticket #3062, updated with the latest 
logs.
ChangeSet: 5142.

Glsv API calls return ERR_LIB after a glsv node director restart.

Setup: 32-bit Glsv application on a 64-bit SLES11 machine.

OpenSAF is running on the 64-bit machine and the Glsv application uses 32-bit 
libraries.

From the logs it is observed that GLND decoding failed for a while after the 
node director restart.

Lcknd trace:
===
May  6 14:31:17.435315 osaflcknd [14542:glnd_main.c:0039] TR GLSV:GLND:ON
May  6 14:31:17.435324 osaflcknd [14542:glnd_api.c:0123] >> glnd_lib_req
May  6 14:31:17.435330 osaflcknd [14542:glnd_api.c:0059] >> glnd_se_lib_create: 
pool id 0
May  6 14:31:17.435335 osaflcknd [14542:glnd_cb.c:0052] >> glnd_cb_create: 
pool_id 0
May  6 14:31:17.435351 osaflcknd [14542:glnd_mds.c:0117] >> glnd_mds_register
May  6 14:31:17.435368 osaflcknd [14542:glnd_mds.c:0095] << glnd_mds_get_handle
May  6 14:31:17.435721 osaflcknd [14542:glnd_mds.c:0228] >> glnd_mds_callback
May  6 14:31:17.435746 osaflcknd [14542:glnd_mds.c:0435] >> glnd_mds_dec
May  6 14:31:17.435780 osaflcknd [14542:glnd_mds.c:0271] << glnd_mds_callback
May  6 14:31:17.435845 osaflcknd [14542:glnd_mds.c:0435] >> glnd_mds_dec
May  6 14:31:17.435857 osaflcknd [14542:glnd_mds.c:1382] T2 GLND DEC FAILED
May  6 14:31:17.435870 osaflcknd [14542:glnd_mds.c:0538] << glnd_mds_dec
May  6 14:31:17.435879 osaflcknd [14542:glnd_mds.c:0267] T2 GLND mds callback 
process failed
May  6 14:31:17.435890 osaflcknd [14542:glnd_mds.c:0271] << glnd_mds_callback
May  6 14:31:17.435950 osaflcknd [14542:glnd_mds.c:0435] >> glnd_mds_dec
May  6 14:31:17.435956 osaflcknd [14542:glnd_mds.c:1382] T2 GLND DEC FAILED
May  6 14:31:17.436138 osaflcknd [14542:glnd_mds.c:0538] << glnd_mds_dec
May  6 14:31:17.436143 osaflcknd [14542:glnd_mds.c:0267] T2 GLND mds callback 
process failed
May  6 14:31:17.436151 osaflcknd [14542:glnd_mds.c:0271] << glnd_mds_callback
May  6 14:31:17.436607 osaflcknd [14542:glnd_cb.c:0117] T1 GLND mds register 
success
May  6 14:31:17.436612 osaflcknd [14542:glnd_amf.c:0152] >> glnd_amf_init
May  6 14:31:17.436624 osaflcknd [14542:ava_api.c:0057] >> saAmfInitialize
May  6 14:31:17.436635 osaflcknd [14542:ava_init.c:0311] >> ncs_ava_startup
May  6 14:31:17.436645 osaflcknd [14542:ava_init.c:0078] >> ava_lib_req
May  6 14:31:17.436651 osaflcknd [14542:ava_init.c:0123] >> ava_create
May  6 14:31:17.436661 osaflcknd [14542:ava_init.c:0138] TR Component name = 
safComp=GLND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF
May  6 14:31:17.436667 osaflcknd [14542:ava_init.c:0156] TR Created handle for 
the control block
May  6 14:31:17.436674 osaflcknd [14542:ava_init.c:0160] TR Initialized the AVA 
control block lock
May  6 14:31:17.436680 osaflcknd [14542:ava_init.c:0164] TR EDU Initialization 
success
May  6 14:31:17.436742 osaflcknd [14542:ava_hdl.c:0060] >> ava_hdl_init
May  6 14:31:17.436750 osaflcknd [14542:ava_hdl.c:0074] << ava_hdl_init
May  6 14:31:17.436756 osaflcknd [14542:ava_init.c:0182] TR AVA Handles DB 
created successfully

Lock app agent trace snippet:
=
May  6 15:36:04.981389 gla [17336:gla_mds.c:0742] << gla_mds_msg_sync_send
May  6 15:36:04.981413 gla [17336:gla_api.c:0482] T2 GLA api lock finalize 
failed
May  6 15:36:04.981430 gla [17336:gla_api.c:0491] << saLckFinalize: 'FAILURE' 
return value '2'
May  6 15:36:04.982914 gla [17336:gla_api.c:0407] >> saLckFinalize: Called with 
Handle 637f80
===

* During the execution of the lock application (involving all the APIs), a 
segmentation fault was observed; a snippet is shown below. The attachment has 
the full backtrace for debugging.

Core was generated by `/usr/lib64/opensaf/osaflcknd --tracemask=0x'.
Program terminated with signal 11, Segmentation fault.
#0  0x00405c05 in glnd_client_node_resource_add (client_info=0x0, res_info=0x639890) at glnd_client.c:227
227 glnd_client.c: No such file or directory.
in glnd_client.c
(gdb) bt
#0  0x00405c05 in glnd_client_node_resource_add (client_info=0x0, res_info=0x639890) at glnd_client.c:227
#1  0x0040755b in glnd_process_gla_resource_open (glnd_cb=0x626f80, evt=0x63a380) at glnd_evt.c:634
#2  0x00406da3 in glnd_process_evt (cb=0x626f80, evt=0x63a380) at glnd_evt.c:358
#3  0x004034b7 in glnd_process_mbx (cb=0x626f80, mbx=0x626fb8) at glnd_api.c:162
#4  0x00403726 in glnd_main_process (mbx=0x626fb8) at glnd_api.c:242
#5  0x0040d240 in main (argc=2, argv=0x7fffd4cc90d8) at glnd_main.c:73




---


[tickets] [opensaf:tickets] #872 osafdtmd asserts after connect with non member node

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#872] osafdtmd asserts after connect with non member node**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Wed Apr 23, 2014 06:45 AM UTC by Hans Feldt
**Last Updated:** Wed Jul 15, 2015 01:28 PM UTC
**Owner:** nobody


100% reproducible.

By mistake I had OpenSAF started on my native system (named xubuntu-13 below). 
Then I launched a virtual cluster, which keeps crashing. SC-1 in the virtual 
cluster stays up, but all other nodes keep crashing with the following 
assert:

Apr 23 08:35:27 SC-2 osafdtmd[352]: NO Established contact with 'xubuntu-13'
Apr 23 08:35:27 SC-2 osafdtmd[352]: dtm_node.c:108: dtm_process_node_info: 
Assertion '0' failed.

Apr 23 08:35:38 PL-3 osafdtmd[350]: NO Established contact with 'xubuntu-13'
Apr 23 08:35:38 PL-3 osafdtmd[350]: NO Established contact with 'SC-2'
Apr 23 08:35:38 PL-3 osafdtmd[350]: dtm_node.c:108: dtm_process_node_info: 
Assertion '0' failed.




---


[tickets] [opensaf:tickets] #916 lock-in of payload middleware su times-out and remains in ENABLED state

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#916] lock-in of payload middleware su times-out and remains in 
ENABLED state**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Tue May 20, 2014 07:08 AM UTC by surender khetavath
**Last Updated:** Wed Jul 15, 2015 01:25 PM UTC
**Owner:** nobody
**Attachments:**

- [logs.tgz](https://sourceforge.net/p/opensaf/tickets/916/attachment/logs.tgz) 
(122.2 kB; application/x-compressed-tar)


changeset : 5270

1) bring up 4 node cluster
2) lock and then lock-in the payload middleware su i.e 
"safSu=PL-4,safSg=NoRed,safApp=OpenSAF"

console output
amf-adm lock-in safSu=PL-4,safSg=NoRed,safApp=OpenSAF
error - command timed out (alarm)

syslog on pl-4
May 20 12:19:59 PL-4 osafamfnd[5473]: NO 
'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' Presence State INSTANTIATED => 
TERMINATING
May 20 12:19:59 PL-4 osaflcknd[5509]: NO Received AMF component terminate 
callback, exiting
May 20 12:19:59 PL-4 osafckptnd[5518]: NO Received AMF component terminate 
callback, exiting
May 20 12:19:59 PL-4 osafimmnd[5455]: NO Received AMF component terminate 
callback, exiting
May 20 12:19:59 PL-4 osafamfnd[5473]: NO 
'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' Presence State TERMINATING => 
UNINSTANTIATED
May 20 12:19:59 PL-4 osafsmfnd[5483]: NO Received AMF component terminate 
callback, exiting
May 20 12:19:59 PL-4 osafmsgnd[5492]: ER Amf Terminate Callback called
May 20 12:19:59 PL-4 osafmsgnd[5492]: NO Received AMF component terminate 
callback, exiting


su state:
safSu=PL-4,safSg=NoRed,safApp=OpenSAF
saAmfSUAdminState=LOCKED-INSTANTIATION(3)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=UNINSTANTIATED(1)
saAmfSUReadinessState=OUT-OF-SERVICE(1)



---


[tickets] [opensaf:tickets] #889 unknown: oi poll timeout differs during switchover and failover scenarios

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#889] unknown: oi poll timeout differs during switchover and 
failover scenarios**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Wed May 07, 2014 03:10 PM UTC by surender khetavath
**Last Updated:** Wed Jul 15, 2015 01:26 PM UTC
**Owner:** nobody
**Attachments:**

- 
[failure_logs.tgz](https://sourceforge.net/p/opensaf/tickets/889/attachment/failure_logs.tgz)
 (747.3 kB; application/x-compressed-tar)
- 
[success_logs.tgz](https://sourceforge.net/p/opensaf/tickets/889/attachment/success_logs.tgz)
 (739.4 kB; application/x-compressed-tar)


changeset : 5143.

test:
1) in a thread do oiInit()
2) oiImplSet() & OiObjectImplSet on an object
3) oiselectionObjectGet()
4) poll() on the fd

In the main thread
1) om init, ownerset,
2) invoke controller failover/switchover
3) AdminOp(ONE_SECOND) on the object

With failover, if the poll timeout value is 40 seconds, the OI does not receive 
the AdminOp callback and poll times out.
If the poll timeout value is increased to, say, 80 seconds, the OI gets the 
AdminOp callback.
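
The effect of the poll timeout can be reproduced in miniature. The sketch below does not use IMM at all: a pipe stands in for the OI selection object, a helper thread stands in for the delayed AdminOp callback, and the durations are scaled down to fractions of a second. The names `wait_for_callback` and `run_case` are hypothetical, and `select.poll` is assumed to be available (it is on Linux).

```python
import os
import select
import threading
import time


def wait_for_callback(fd, timeout_ms):
    """Poll fd for readability; True if it became readable before the timeout."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return bool(poller.poll(timeout_ms))


def run_case(delivery_delay_s, timeout_ms):
    """Simulate a callback that becomes readable after delivery_delay_s seconds."""
    read_fd, write_fd = os.pipe()

    def deliver():
        time.sleep(delivery_delay_s)
        os.write(write_fd, b"x")

    worker = threading.Thread(target=deliver)
    worker.start()
    try:
        return wait_for_callback(read_fd, timeout_ms)
    finally:
        # Let the writer finish before the descriptors are closed.
        worker.join()
        os.close(read_fd)
        os.close(write_fd)


# The callback arrives after 0.2 s: a 50 ms poll misses it, a 1 s poll sees it.
print(run_case(0.2, 50))    # False
print(run_case(0.2, 1000))  # True
```

The sketch only shows that the outcome depends on whether the timeout exceeds the delivery delay. It says nothing about why delivery takes longer during failover, which is the open question of this ticket.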

How do the two cases differ?
1) Is the IMM operation held until the failover is completed?
2) Is the IMM operation held until the failed node re-joins?
3) Taking more than 40 seconds to receive the callback is not acceptable for HA.

The same test using switchover succeeds, i.e. it receives the callback with a 
poll timeout of less than 20 seconds.


Two versions of the logs are attached.


---


[tickets] [opensaf:tickets] #511 SU stucks in Terminatting state.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> never



---

** [tickets:#511] SU stucks in Terminatting state.**

**Status:** not-reproducible
**Milestone:** never
**Created:** Thu Jul 18, 2013 07:32 AM UTC by manu
**Last Updated:** Tue Oct 06, 2015 11:06 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[messages.tar.bz2](https://sourceforge.net/p/opensaf/tickets/511/attachment/messages.tar.bz2)
 (3.1 kB; application/x-bzip)
- 
[osafamfd.tar.bz2](https://sourceforge.net/p/opensaf/tickets/511/attachment/osafamfd.tar.bz2)
 (292.6 kB; application/x-bzip)
- 
[osafamfnd.tar.bz2](https://sourceforge.net/p/opensaf/tickets/511/attachment/osafamfnd.tar.bz2)
 (209.7 kB; application/x-bzip)


Change Set:4325 

Configuration: 2N redundancy model.
2 SUs, on SC-1 and PL-3, 1 SI

Steps to Reproduce:-
1.Bring up the application with CSI-CSI dependencies.
  Configure the CSI-CSI dependencies with multiple sponsors and multiple dependents.

2.UNLOCK-IN / UNLOCK the SUs one by one.
  Both SUs are UNLOCKED/ENABLED/INSTANTIATED/IN-SERVICE.
  CSI assignments happen correctly.

3.Perform LOCK operation on SI.
  SI is LOCKED/UNASSIGNED

4.Perform Controller Switchover  "amf-adm si-swap safSi=SC-2N,safApp=OpenSAF"
  Switchover happens correctly.
  Both SUs are UNLOCKED/ENABLED/INSTANTIATED/IN-SERVICE.

5.Perform UNLOCK operation on SI.
  SI becomes UNLOCKED/PARTIALLY_ASSIGNED.

6.SU1 gets stuck in TERMINATING state.

safSu=d_2n_1,safSg=SG_d_2n,safApp=2nApp
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=DISABLED(2)
saAmfSUPresenceState=TERMINATING(4)
saAmfSUReadinessState=OUT-OF-SERVICE(1)
safSu=d_2n_2,safSg=SG_d_2n,safApp=2nApp
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)




---


[tickets] [opensaf:tickets] #484 Improper implementer prefix: MsgQueueService131343

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#484] Improper implementer prefix: MsgQueueService131343**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Fri Jul 05, 2013 08:04 AM UTC by Anders Bjornerstedt
**Last Updated:** Wed Jul 15, 2015 02:02 PM UTC
**Owner:** nobody


The MQSv creates implementers with names like:

MsgQueueService131343
MsgQueueService131599
MsgQueueService131855

etc.

Implementer-names, AdminOwner-names, Class-names and root object names in
the imm service are all GLOBAL name spaces. 

This is an open-source project. It is therefore important that components that
register themselves in global name spaces use proper prefixing. This serves both
to avoid name collisions and to allow identification of the component during
troubleshooting (done by people other than the maintainer of that service).

As far as I know, the message queue service is the only service currently 
violating this. 

An implementer name should have a prefix that eliminates the
risk of naming conflicts and that makes it clear where it belongs.

If the imm-implementer is part of a SAF standard service,
then it should have the prefix "saf", like:

safAmfService
safSmfService
safCheckPointService
safLckService
safEvtService
safLogService
safMsgGrpService

If the implementer is part of an OpenSAF service that is not a
standard SAF service then it should have a prefix like "OpenSAF"
or osaf, like:

OpenSAFDtsvService


Because this was not caught and fixed early there is now an upgrade problem.
But that should not be too hard to solve.

At the place where the OI sets up its implementer-name and tries to set itself
as class-implementer for the relevant classes, it should:

1) Allocate two OI handles and set the implementer name to the old bad name in
one and to the new good name in the other. If it fails to set either implementer
name with ERR_EXIST then it should behave the way it currently behaves when the
implementer-name is occupied. 

2) For each class it is to be class implementer for, do the following:
   Try to set class-implementer to the new good name (using the good handle).
   If this fails with ERR_EXIST then (i) try to set class-implementer to the
   old bad name (using the bad handle). This should succeed. (ii) Clear the
   implementer-name using saImmOiClassImplementerRelease(). This should
   succeed and is one of the rare occurrences where this api function is needed.
   (iii) Set implementer to the new good name (using the good handle).

Repeat (2) for all classes used by the service. 
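
The control flow of steps (1) and (2) can be sketched as follows. This is not IMM code: `FakeImm` is an in-memory stand-in invented for the example, `ErrExist` stands in for the ERR_EXIST return code, the class names are placeholders, and `safMsgQueueService` is only a hypothetical new name. A real fix would use the IMM OI API from the service itself.

```python
class ErrExist(Exception):
    """Stand-in for the ERR_EXIST return code."""


class FakeImm:
    """In-memory stand-in for IMM bookkeeping of implementer names."""

    def __init__(self):
        self.implementer_names = set()  # implementer names currently occupied
        self.class_implementer = {}     # class name -> implementer name

    def implementer_set(self, name):
        if name in self.implementer_names:
            raise ErrExist(name)
        self.implementer_names.add(name)
        return name  # in this sketch the "handle" is just the name

    def class_implementer_set(self, handle, class_name):
        current = self.class_implementer.get(class_name)
        if current is not None and current != handle:
            raise ErrExist(class_name)
        self.class_implementer[class_name] = handle

    def class_implementer_release(self, handle, class_name):
        if self.class_implementer.get(class_name) == handle:
            del self.class_implementer[class_name]


def migrate(imm, old_name, new_name, class_names):
    # Step 1: one handle per name. ErrExist propagates to the caller, which
    # treats it like an occupied implementer name today.
    old_handle = imm.implementer_set(old_name)
    new_handle = imm.implementer_set(new_name)
    # Step 2: repeated for every class used by the service.
    for class_name in class_names:
        try:
            imm.class_implementer_set(new_handle, class_name)
        except ErrExist:
            imm.class_implementer_set(old_handle, class_name)      # (i)
            imm.class_implementer_release(old_handle, class_name)  # (ii)
            imm.class_implementer_set(new_handle, class_name)      # (iii)


imm = FakeImm()
# "ClassA" still carries the old name from before the upgrade.
imm.class_implementer["ClassA"] = "MsgQueueService131343"
migrate(imm, "MsgQueueService131343", "safMsgQueueService", ["ClassA", "ClassB"])
print(imm.class_implementer)
```

After the call both classes are registered under the new name, whether or not they carried the old name before.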

I am raising the priority to major (from previously being minor) because:
 - This is easy to fix.
 - The current setup looks bad and sets a bad example.
 - The current setup can cause confusion or uncertainty during troubleshooting
   or when just trying to understand the system.


---


[tickets] [opensaf:tickets] #473 AMF: SU rank ordering not followed at adm op SG unlock inst

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#473] AMF: SU rank ordering not followed at adm op SG unlock inst**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Mon Jun 24, 2013 12:32 PM UTC by Hans Feldt
**Last Updated:** Thu Aug 06, 2015 09:53 AM UTC
**Owner:** nobody


From spec 3.6.1.1:

"Ordered list of service units for a service group: for each service group, an
ordered list of service units defines the rank of the service unit within the 
service group. This rank is configured by setting the saAmfSURank attribute of 
the saAmfSU object class (see Section 8.10). The rank is represented by a 
positive integer. The lower the integer value, the higher the rank. The size of 
the list is equal to the number of service units configured for the service 
group. This ordered list is used to specify the order in which service units 
are selected to be instantiated."

Instead all the SUs are instantiated in one go. See 
sg_app_sg_admin_unlock_inst() in avd_sg.cc
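
The ordering rule quoted from the spec amounts to an ascending sort on saAmfSURank. A minimal sketch, with made-up SU names and ranks and a hypothetical helper name, and not a proposal for how avd_sg.cc should be changed:

```python
def instantiation_order(service_units):
    """Order (name, saAmfSURank) pairs so that the highest rank,
    i.e. the lowest integer, comes first."""
    return [name for name, rank in sorted(service_units, key=lambda su: su[1])]


# Hypothetical service units of one service group, in configuration order.
sus = [("SU3", 3), ("SU1", 1), ("SU2", 2)]
print(instantiation_order(sus))  # ['SU1', 'SU2', 'SU3']
```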




---


[tickets] [opensaf:tickets] #525 when instantiating IMMND, start_daemon is not staring the IMMND for the first time.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#525] when instantiating IMMND,  start_daemon is not staring the 
IMMND for the first time.**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Fri Jul 26, 2013 02:49 PM UTC by Neelakanta Reddy
**Last Updated:** Wed Jul 15, 2015 01:57 PM UTC
**Owner:** Neelakanta Reddy
**Attachments:**

- [logs.tgz](https://sourceforge.net/p/opensaf/tickets/525/attachment/logs.tgz) 
(1.2 MB; application/x-compressed-tar)
- 
[osafimmnd_SC1.bz2](https://sourceforge.net/p/opensaf/tickets/525/attachment/osafimmnd_SC1.bz2)
 (4.4 MB; application/x-bzip)


Description:

2 controllers and 1 payload, with the #501 fix

1. Start the two controllers SC-1 and SC-2, which are loaded from the PBE with 
45K objects

2. Start the payload

3. At the same time, issue the admin restart command at the active controller 
(SC-1)

amf-adm restart safComp=IMMND,safSu=SC-1,safSg=NoRed,safApp=OpenSAF

4. Logs were added to the imm clc-cli script, in instantiate(), before and 
after start_daemon

Jul 26 17:30:43 Slot-3 osafamfnd[7167]: NO Admin restart requested for 
'safComp=IMMND,safSu=SC-1,safSg=NoRed,safApp=OpenSAF'
Jul 26 17:30:43 Slot-3 osafimmnd[7063]: NO Received AMF component terminate 
callback, exiting
Jul 26 17:30:43 Slot-3 osafimmpbed: NO PBE received SIG_TERM, closing db handle
Jul 26 17:30:43 Slot-3 osafimmpbed: IN IMM PBE process EXITING...
Jul 26 17:30:43 Slot-3 osafimmnd: start /etc/opensaf/osafdir.conf, exiting.
Jul 26 17:30:43 Slot-3 osafimmd[7048]: WA IMMND coordinator at 2010f apparently 
crashed => electing new coord
Jul 26 17:30:43 Slot-3 osafimmd[7048]: NO New coord elected, resides at 2020f
Jul 26 17:30:43 Slot-3 osafimmnd: end /etc/opensaf/osafdir.conf, exiting.
Jul 26 17:30:51 Slot-3 dhclient: DHCPREQUEST on eth0 to 10.176.108.18 port 67
Jul 26 17:30:53 Slot-3 osafamfd[7152]: NO Re-initializing with IMM
Jul 26 17:30:53 Slot-3 osafamfnd[7167]: NO Instantiation of 
'safComp=IMMND,safSu=SC-1,safSg=NoRed,safApp=OpenSAF' failed
Jul 26 17:30:53 Slot-3 osafamfnd[7167]: NO Reason: component registration timer 
expired
Jul 26 17:30:53 Slot-3 osafimmnd: start /etc/opensaf/osafdir.conf, exiting.
Jul 26 17:30:53 Slot-3 osafimmnd: end /etc/opensaf/osafdir.conf, exiting.
Jul 26 17:30:53 Slot-3 osafimmnd[7592]: Started
Jul 26 17:30:53 Slot-3 osafimmnd[7592]: NO Persistent Back-End capability 
configured, Pbe file:imm.db
Jul 26 17:30:53 Slot-3 osafamfd[7152]: NO saImmOiAdminOperationResult for 
30064771073 failed 9
Jul 26 17:30:53 Slot-3 osafimmd[7048]: NO New IMMND process is on ACTIVE 
Controller at 2010f
Jul 26 17:30:53 Slot-3 osafimmnd[7592]: NO SERVER STATE: IMM_SERVER_ANONYMOUS 
--> IMM_SERVER_CLUSTER_WAITING
Jul 26 17:30:53 Slot-3 osafimmnd[7592]: NO SERVER STATE: 
IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Jul 26 17:30:53 Slot-3 osafimmnd[7592]: NO SERVER STATE: 
IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Jul 26 17:30:53 Slot-3 osafimmd[7048]: WA IMMND on controller (not currently 
coord) requests sync
Jul 26 17:30:53 Slot-3 osafimmnd[7592]: NO NODE STATE-> IMM_NODE_ISOLATED
Jul 26 17:30:53 Slot-3 osafimmd[7048]: NO Node 2010f request sync sync-pid:7592 
epoch:0
Jul 26 17:30:54 Slot-3 osafimmnd[7592]: NO NODE STATE-> IMM_NODE_W_AVAILABLE
Jul 26 17:30:54 Slot-3 osafimmd[7048]: NO Successfully announced sync. New 
ruling epoch:13
Jul 26 17:30:54 Slot-3 osafimmnd[7592]: NO SERVER STATE: 
IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
Jul 26 17:31:04 Slot-3 dhclient: DHCPREQUEST on eth0 to 10.176.108.18 port 67
Jul 26 17:31:12 Slot-3 osafimmd[7048]: NO ACT: New Epoch for IMMND process at 
node 2020f old epoch: 12  new epoch:13
Jul 26 17:31:13 Slot-3 osafimmd[7048]: NO ACT: New Epoch for IMMND process at 
node 2030f old epoch: 0  new epoch:13
Jul 26 17:31:13 Slot-3 osafimmnd[7592]: NO NODE STATE-> 
IMM_NODE_FULLY_AVAILABLE 2171

5. Analysis of the above logs shows that amfnd is instantiating IMMND, but 
start_daemon is for some reason unable to start the immnd process.

Jul 26 17:30:43 Slot-3 osafimmnd: start /etc/opensaf/osafdir.conf, exiting.
Jul 26 17:30:43 Slot-3 osafimmd[7048]: WA IMMND coordinator at 2010f apparently 
crashed => electing new coord
Jul 26 17:30:43 Slot-3 osafimmd[7048]: NO New coord elected, resides at 2020f
Jul 26 17:30:43 Slot-3 osafimmnd: end /etc/opensaf/osafdir.conf, exiting.

6. After 10 seconds, when the component registration timer expired, amfnd 
tried to instantiate again and immnd got started.

Jul 26 17:30:53 Slot-3 osafimmnd: start /etc/opensaf/osafdir.conf, exiting.
Jul 26 17:30:53 Slot-3 osafimmnd: end /etc/opensaf/osafdir.conf, exiting.
Jul 26 17:30:53 Slot-3 osafimmnd[7592]: Started
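
The 10 second gap can be read directly from the syslog timestamps of the first instantiation attempt and the eventual start. A small sketch of that arithmetic; the helper name `syslog_time` is made up here, and the year 2013 is taken from the ticket creation date because syslog lines carry no year:

```python
from datetime import datetime


def syslog_time(line, year=2013):
    """Parse the leading 'Mon DD HH:MM:SS' timestamp of a syslog line."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


first_attempt = "Jul 26 17:30:43 Slot-3 osafimmnd: start /etc/opensaf/osafdir.conf, exiting."
started = "Jul 26 17:30:53 Slot-3 osafimmnd[7592]: Started"

gap = (syslog_time(started) - syslog_time(first_attempt)).total_seconds()
print(gap)  # 10.0
```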

7. The IMMND and IMMD traces contain no logging from the time when amfnd tries 
to instantiate for the first time.

8. When analyzed from the IMM perspective:

PL-3 sent the request for sync to SC-1
SC-1 IMMD sent SYN_REQ to both PL-3 and co-ordinator (to start sync)
SC-1 immnd marked SyncRequested as true, but in same second the IMMND at SC-1

[tickets] [opensaf:tickets] #467 checkpoint with COLLOCATED flag forcing to register for arrival callback

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#467] checkpoint with COLLOCATED flag forcing to register for 
arrival callback**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Mon Jun 24, 2013 06:36 AM UTC by A V Mahesh (AVM)
**Last Updated:** Tue Aug 11, 2015 06:16 AM UTC
**Owner:** A V Mahesh (AVM)


I am using opensaf 4.0.0
http://devel.opensaf.org/ticket/1866


I am running a simple AMF demo for counting, which uses a checkpoint.


My checkpoint creation flags are: SA_CKPT_CHECKPOINT_COLLOCATED | 
SA_CKPT_WR_ALL_REPLICAS


I tested it on a 2-node cluster (both target hardware and UML nodes).


The problem is that unless I register for the arrival callback, my standby 
component faults; AMF reports a healthcheck timeout.


I also tested with SA_CKPT_CHECKPOINT_COLLOCATED | SA_CKPT_WR_ACTIVE_REPLICA 
and I am facing the same issue.


If I remove the collocated flag, it works fine. 
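
The flag combinations mentioned above can be written out as a small sketch. This is an illustration only, not OpenSAF code: the class and helper names are hypothetical, and the numeric values are assumed to mirror the definitions in saCkpt.h, so they should be checked against the installed header.

```python
from enum import IntFlag


class CkptCreationFlags(IntFlag):
    # Assumed to mirror the SA_CKPT_* creation flags in saCkpt.h.
    WR_ALL_REPLICAS = 0x1
    WR_ACTIVE_REPLICA = 0x2
    WR_ACTIVE_REPLICA_WEAK = 0x4
    CHECKPOINT_COLLOCATED = 0x8


def is_collocated(flags):
    """Return True when the collocated bit is set in the creation flags."""
    return bool(flags & CkptCreationFlags.CHECKPOINT_COLLOCATED)


# The two combinations reported as failing without an arrival callback.
failing_combinations = [
    CkptCreationFlags.CHECKPOINT_COLLOCATED | CkptCreationFlags.WR_ALL_REPLICAS,
    CkptCreationFlags.CHECKPOINT_COLLOCATED | CkptCreationFlags.WR_ACTIVE_REPLICA,
]

# The same update options without the collocated flag, reported as working.
working_combinations = [
    CkptCreationFlags.WR_ALL_REPLICAS,
    CkptCreationFlags.WR_ACTIVE_REPLICA,
]
```

Both failing combinations share the collocated bit, which matches the observation that removing that flag makes the problem go away.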





---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.--
___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets

[tickets] [opensaf:tickets] #373 saAmfSGMaxActiveSIsperSU is not followed in the case of csiSetCallbackFailed scenarion in NWAY

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#373] saAmfSGMaxActiveSIsperSU is not followed in the case of 
csiSetCallbackFailed scenarion in NWAY**

**Status:** accepted
**Milestone:** 4.6.2
**Created:** Fri May 31, 2013 03:55 AM UTC by Nagendra Kumar
**Last Updated:** Mon Apr 20, 2015 06:30 AM UTC
**Owner:** Praveen
**Attachments:**

- [AppConfig-NWAY_1Spare.xml_maxSI](https://sourceforge.net/p/opensaf/tickets/373/attachment/AppConfig-NWAY_1Spare.xml_maxSI) (19.0 kB; text/xml)
- [373.tgz](https://sourceforge.net/p/opensaf/tickets/373/attachment/373.tgz) (407.8 kB; application/x-compressed)


Migrated from http://devel.opensaf.org/ticket/2362

Redundancy Model : NWAY
Change set : 3049.
Virtual machine


1) With the xml attached, brought up the configuration.


2) Performed lock, lock-in, unlock-in and unlock of the SG.


3) SU2 hosted on SC-2 got one active assignment and one standby CSI assignment. 


SU3 hosted on PL-3 got one active assignment and one standby CSI assignment.


4) Performed admin shutdown on SU3


5) The component in SU3 hosted on PL-3 faulted in the quiescing callback and 
PL-3 went for reboot.


Nov 24 16:18:01 SLES11-CONN-PC osafamfnd[5776]: 
'safComp=AmfDemo?9,safSu=SU3,safSg=AmfDemo?,safApp=AmfDemo?' faulted due to 
'csiSetcallbackFailed(12)' : Recovery is 'nodeFailover(5)'
Nov 24 16:18:01 SLES11-CONN-PC osafamfnd[5776]: 
'safSu=SU3,safSg=AmfDemo?,safApp=AmfDemo?' Presence State INSTANTIATED => 
TERMINATING


6) SU2 already had one active assignment. As part of the reassignment of the 
SIs hosted on SU3, SU2 got one more active assignment. As 
saAmfSGMaxActiveSIsperSU is only 1, this assignment should not happen.


SLES11-SLOT-1:/home/xml # /etc/init.d/opensafd status
safSISU=safSu=SU2\,safSg=AmfDemo?\,safApp=AmfDemo?,safSi=AmfDemo?2,safApp=AmfDemo?


saAmfSISUHAState=ACTIVE(1)


safSISU=safSu=SU2\,safSg=AmfDemo?\,safApp=AmfDemo?,safSi=AmfDemo?,safApp=AmfDemo?


saAmfSISUHAState=ACTIVE(1)
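
The violation can be checked mechanically from the status output quoted above. The following is a minimal sketch, assuming the output keeps the two-line `safSISU=...` / `saAmfSISUHAState=...` layout shown in this ticket; the function names are hypothetical and the SU is keyed by its RDN only, not the full DN.

```python
import re
from collections import Counter

# Status lines copied from this ticket (blank lines removed).
SAMPLE = r"""
safSISU=safSu=SU2\,safSg=AmfDemo?\,safApp=AmfDemo?,safSi=AmfDemo?2,safApp=AmfDemo?
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SU2\,safSg=AmfDemo?\,safApp=AmfDemo?,safSi=AmfDemo?,safApp=AmfDemo?
saAmfSISUHAState=ACTIVE(1)
"""


def active_count_per_su(status_text):
    """Count ACTIVE SI assignments per SU RDN in the status text."""
    counts = Counter()
    current_su = None
    for line in status_text.splitlines():
        line = line.strip()
        if line.startswith("safSISU="):
            # The SU DN is embedded with escaped commas; keep only its RDN.
            match = re.match(r"safSISU=(safSu=[^\\,]+)", line)
            current_su = match.group(1) if match else None
        elif line.startswith("saAmfSISUHAState=ACTIVE") and current_su:
            counts[current_su] += 1
            current_su = None
    return counts


def sus_over_limit(status_text, max_active_sis_per_su):
    """Return the SUs whose ACTIVE count exceeds the configured maximum."""
    counts = active_count_per_su(status_text)
    return {su: n for su, n in counts.items() if n > max_active_sis_per_su}
```

With the limit of 1 configured for this SG, `sus_over_limit(SAMPLE, 1)` flags SU2 with two active assignments.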



Changed 18 months ago by srikanth
- attachment AppConfig-NWAY_1Spare.xml_maxSI added

Changed 18 months ago by srikanth
immdump for SG after PL-3 reboot:


SLES11-SLOT-1:/home/xml # immlist safSg=AmfDemo?,safApp=AmfDemo?
Name Type Value(s)

safSg SA_STRING_T safSg=AmfDemo? 
saAmfSGType SA_NAME_T safVersion=4.0.0,safSgType=AmfDemo? (34) 
saAmfSGSuRestartProb SA_TIME_T 
saAmfSGSuRestartMax SA_UINT32_T 
saAmfSGSuHostNodeGroup SA_NAME_T safAmfNodeGroup=SCs,safAmfCluster=myAmfCluster (46) 
saAmfSGNumPrefStandbySUs SA_UINT32_T 2 (0x2)
saAmfSGNumPrefInserviceSUs SA_UINT32_T 3 (0x3)
saAmfSGNumPrefAssignedSUs SA_UINT32_T 3 (0x3)
saAmfSGNumPrefActiveSUs SA_UINT32_T 3 (0x3)
saAmfSGNumCurrNonInstantiatedSpareSUs SA_UINT32_T 0 (0x0)
saAmfSGNumCurrInstantiatedSpareSUs SA_UINT32_T 0 (0x0)
saAmfSGNumCurrAssignedSUs SA_UINT32_T 2 (0x2)
saAmfSGMaxStandbySIsperSU SA_UINT32_T 2 (0x2)
saAmfSGMaxActiveSIsperSU SA_UINT32_T 1 (0x1)
saAmfSGCompRestartProb SA_TIME_T 
saAmfSGCompRestartMax SA_UINT32_T 
saAmfSGAutoRepair SA_UINT32_T 0 (0x0)
saAmfSGAutoAdjustProb SA_TIME_T 
saAmfSGAutoAdjust SA_UINT32_T 0 (0x0)
saAmfSGAdminState SA_UINT32_T 1 (0x1)
SaImmAttrImplementerName? SA_STRING_T safAmfService 
SaImmAttrClassName? SA_STRING_T SaAmfSG 
SaImmAttrAdminOwnerName? SA_STRING_T 


Changed 18 months ago by ravisekhar
- status changed from new to accepted

Changed 13 months ago by hafe
I see nothing happening with this ticket although it has been in accepted state 
for months. If the status is not updated shortly, I will change the milestone to 
"future" at the end of this week.

Changed 13 months ago by ravisekhar
- milestone changed from 4.2.1 to future_releases



---


[tickets] [opensaf:tickets] #392 payload node stuck in locked state.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> never



---

** [tickets:#392] payload node stuck in locked state.**

**Status:** invalid
**Milestone:** never
**Created:** Fri May 31, 2013 05:10 AM UTC by Nagendra Kumar
**Last Updated:** Mon Oct 05, 2015 11:53 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- [logs.tar](https://sourceforge.net/p/opensaf/tickets/392/attachment/logs.tar) (34.6 kB; application/x-gzip-compressed)
- [AppConfig-npm_392.xml](https://sourceforge.net/p/opensaf/tickets/392/attachment/AppConfig-npm_392.xml) (23.1 kB; text/xml)


Migrated from http://devel.opensaf.org/ticket/2578

Model : NPM ( 3+2)
changeset : 3406
Configuration : 1App,1Sg,6sis,8sus,8comps,6csis
SUs 1-6 are mapped to PL-4 and SU7-8 are mapped to PL-3


Scenario:





Bring up the model, unlock-in and unlock the SUs.
The initial assignments are as follows:

        Active  Standby
SI1 --> SU1     SU4
SI2 --> SU1     SU4
SI3 --> SU2     SU4
SI4 --> SU2     SU4
SI5 --> SU3     SU5
SI6 --> SU3     SU5


Now lock all the SIs except SI6. Then lock all the SUs except SU3.
Unlock SI1. Now the assignments are as follows:

        Active  Standby
SI1 --> SU6
SI6 --> SU6


Now lock PL-4 with "amf-adm lock safAmfNode=PL-4,safAmfCluster=myAmfCluster".


Unlocking PL-4 afterwards does not unlock it.


SI states:





safSi=SC-2N,safApp=OpenSAF


saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)


safSi=NoRed?1,safApp=OpenSAF


saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)


safSi=NoRed?2,safApp=OpenSAF


saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)


safSi=NoRed?3,safApp=OpenSAF


saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)


safSi=NoRed?4,safApp=OpenSAF


saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)


safSi=dummy_NplusM_1Norm_1,safApp=NpMApp


saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=UNASSIGNED(1)


safSi=dummy_NplusM_1Norm_2,safApp=NpMApp


saAmfSIAdminState=LOCKED(2)
saAmfSIAssignmentState=UNASSIGNED(1)


safSi=dummy_NplusM_1Norm_3,safApp=NpMApp


saAmfSIAdminState=LOCKED(2)
saAmfSIAssignmentState=UNASSIGNED(1)


safSi=dummy_NplusM_1Norm_4,safApp=NpMApp


saAmfSIAdminState=LOCKED(2)
saAmfSIAssignmentState=UNASSIGNED(1)


safSi=dummy_NplusM_1Norm_5,safApp=NpMApp


saAmfSIAdminState=LOCKED(2)
saAmfSIAssignmentState=UNASSIGNED(1)


safSi=dummy_NplusM_1Norm_6,safApp=NpMApp


saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=UNASSIGNED(1)


SU states : 





safSu=PL-3,safSg=NoRed?,safApp=OpenSAF


saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)


safSu=PL-4,safSg=NoRed?,safApp=OpenSAF


saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)


safSu=SC-1,safSg=2N,safApp=OpenSAF


saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)


safSu=SC-1,safSg=NoRed?,safApp=OpenSAF


saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)


safSu=SC-2,safSg=2N,safApp=OpenSAF


saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)


safSu=SC-2,safSg=NoRed?,safApp=OpenSAF


saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)


safSu=dummy_NplusM_1Norm_1,safSg=SG_dummy_npm,safApp=NpMApp


saAmfSUAdminState=LOCKED(2)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=OUT-OF-SERVICE(1)


safSu=dummy_NplusM_1Norm_2,safSg=SG_dummy_npm,safApp=NpMApp


saAmfSUAdminState=LOCKED(2)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=OUT-OF-SERVICE(1)


safSu=dummy_NplusM_1Norm_3,safSg=SG_dummy_npm,safApp=NpMApp


saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=OUT-OF-SERVICE(1)


safSu=dummy_NplusM_1Norm_4,safSg=SG_dummy_npm,safApp=NpMApp


saAmfSUAdminState=LOCKED(2)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=OUT-OF-SERVICE(1)


safSu=dummy_NplusM_1Norm_5,safSg=SG_dummy_npm,safApp=NpMApp


saAmfSUAdminState=LOCKED(2)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=OUT-OF-SERVICE(1)


safSu=dummy_NplusM_1Norm_6,safSg=SG_dummy_npm,safApp=NpMApp


saAmfSUAdminState=LOCKED(2)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=OUT-OF-SERVICE(1)


safSu=dummy_NplusM_1Norm_7,safSg=SG_dummy_npm,safApp=NpMApp


saAmfSUAdminState=LOCKED(2)

[tickets] [opensaf:tickets] #399 amf: SU admin state not updated after doing controller switchover and admin lock of SU.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#399] amf: SU admin state not updated after doing controller 
switchover and admin lock of SU.**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Fri May 31, 2013 05:21 AM UTC by Praveen
**Last Updated:** Wed Aug 12, 2015 09:11 AM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2879.

changeset : 3796, 4.2.2
 model : NplusM
 

Initial Configuration:-
 SI equal distribution
 saAmfSGNumPrefInserviceSUs=5 -a saAmfSGMaxActiveSIsperSU=2 -a 
saAmfSGMaxStandbySIsperSU=3 -a saAmfSGNumPrefActiveSUs=3 -a 
saAmfSGNumPrefStandbySUs=2
 saAmfSGAutoAdjust=1
 

6 SIs in locked state.
 saAmfSIPrefActiveAssignments=1 -a saAmfSIPrefStandbyAssignments=1
 

5 SUs with the same SURank, set to 5. Each SU's admin state was 
locked-instantiation.
 SU1, SU4, SU5 spawned on SC-1
 SU2 on SC-2
 SU3 on PL-4
 

Steps:-
 1. Brought up the NplusM model with above configuration.
 2. Performed the unlock-instantiation operation on each SU (SU1 to SU5).
 3. Performed the unlock operation on each SU (SU1 to SU5).
 4. Performed unlock of each SI (SI1 to SI6).
 

Here observed that SUSI assignments were equally distributed.
 

5. Now on SC-1, trigger a controller switchover from the command line,
 and immediately on SC-2, trigger the admin lock on SU1.
 

Here observed that controller switchover successfully completed
 but the admin lock on SU1 failed with SA_AIS_ERR_TIMEOUT.
 

Tried again to lock SU1, but this time it failed with SA_AIS_ERR_NO_OP, and it 
kept failing with the same error after retries. "amf-state su" was showing the 
admin state of SU1 as UNLOCKED, so the admin state of SU1 was not getting 
changed. 
Observed that all the SUSI assignments from SU1 got removed, but 
/var/log/messages was printing the messages below:
 

Oct 23 13:01:53 SLOT2 osafimmnd[7176]: Timeout on syncronous admin operation 1
 Oct 23 13:03:47 SLOT2 osafamfd[7225]: Admin operation (2) has no effect on 
current state (2)
 Oct 23 13:06:15 SLOT2 osafamfd[7225]: Admin operation (2) has no effect on 
current state (2)
 

safSu=d_NplusM_1Norm_1,safSg=SG_d_npm,safApp=NpMApp
 


saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)
 

safSu=d_NplusM_1Norm_2,safSg=SG_d_npm,safApp=NpMApp
 


saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)
 

safSu=d_NplusM_1Norm_3,safSg=SG_d_npm,safApp=NpMApp
 


saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)
 

safSu=d_NplusM_1Norm_4,safSg=SG_d_npm,safApp=NpMApp
 


saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)
 

safSu=d_NplusM_1Norm_5,safSg=SG_d_npm,safApp=NpMApp
 


saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)
 

safSISU=safSu=SC-1\,safSg=NoRed?\,safApp=OpenSAF,safSi=NoRed?2,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
 


saAmfSISUHAState=STANDBY(2)
 

safSISU=safSu=SC-2\,safSg=NoRed?\,safApp=OpenSAF,safSi=NoRed?1,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=PL-3\,safSg=NoRed?\,safApp=OpenSAF,safSi=NoRed?4,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=PL-4\,safSg=NoRed?\,safApp=OpenSAF,safSi=NoRed?3,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_4\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_6,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_2\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_3,safApp=NpMApp
 


saAmfSISUHAState=STANDBY(2)
 

safSISU=safSu=d_NplusM_1Norm_2\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_1,safApp=NpMApp
 


saAmfSISUHAState=STANDBY(2)
 

safSISU=safSu=d_NplusM_1Norm_2\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_2,safApp=NpMApp
 


saAmfSISUHAState=STANDBY(2)
 

safSISU=safSu=d_NplusM_1Norm_5\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_1,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_3\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_2,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_5\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_4,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_4\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_3,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_3\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_5,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)


Changed 7 months ago by shareef 




Same issue also observed with

[tickets] [opensaf:tickets] #314 AMF looses alarms and notifications during switch-over

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#314] AMF looses alarms and notifications during switch-over**

**Status:** accepted
**Milestone:** 4.6.2
**Created:** Fri May 24, 2013 08:34 AM UTC by Nagendra Kumar
**Last Updated:** Mon Apr 20, 2015 06:42 AM UTC
**Owner:** Praveen
**Attachments:**

- [osafamfd](https://sourceforge.net/p/opensaf/tickets/314/attachment/osafamfd) 
(5.7 MB; application/octet-stream)
- [messages](https://sourceforge.net/p/opensaf/tickets/314/attachment/messages) 
(41.9 kB; application/octet-stream)


Migrated from http://devel.opensaf.org/ticket/3051

Background: http://devel.opensaf.org/ticket/3028


If another node (payload) leaves the cluster in the middle of switch-over, amfd 
logs this:


Mar 8 10:18:21 SC-1 osafamfd[304]: ER sendStateChangeNotificationAvd: 
saNtfNotificationSend Failed (6)
Mar 8 10:18:21 SC-1 osafamfd[304]: ER sendAlarmNotificationAvd: 
saNtfNotificationSend Failed (6)


These logs mean that amfd failed to send an alarm and a notification because 
NTF returned TRYAGAIN (in NOACTIVE state).


AMF needs to store the alarms/notifications produced in the NOACTIVE state and 
send them at the end of the switch-over. Alternatively, it could use a separate 
thread that can block forever (?) on TRYAGAIN.
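
The store-and-resend idea can be sketched as follows. This is a toy model under stated assumptions, not amfd code: `DeferredNotifier` and `FakeNtf` are hypothetical names, the send function stands in for saNtfNotificationSend, and only the TRY_AGAIN return code (6, as seen in the log above) is handled; other errors are dropped here.

```python
from collections import deque

SA_AIS_OK = 1
SA_AIS_ERR_TRY_AGAIN = 6


class DeferredNotifier:
    """Queue notifications that hit TRY_AGAIN and resend them later, in order."""

    def __init__(self, send):
        self._send = send          # callable(notification) -> SA_AIS_* code
        self._pending = deque()

    def notify(self, notification):
        # Keep ordering: never overtake notifications that are still queued.
        if self._pending:
            self._pending.append(notification)
            return
        if self._send(notification) == SA_AIS_ERR_TRY_AGAIN:
            self._pending.append(notification)

    def flush(self):
        """Resend queued notifications; stop at the first TRY_AGAIN."""
        while self._pending:
            if self._send(self._pending[0]) == SA_AIS_ERR_TRY_AGAIN:
                break
            self._pending.popleft()
        return len(self._pending)


class FakeNtf:
    """Stand-in for NTF that returns TRY_AGAIN while there is no active server."""

    def __init__(self):
        self.active = False
        self.sent = []

    def send(self, notification):
        if not self.active:
            return SA_AIS_ERR_TRY_AGAIN
        self.sent.append(notification)
        return SA_AIS_OK


ntf = FakeNtf()
notifier = DeferredNotifier(ntf.send)
notifier.notify("alarm")          # produced during NOACTIVE, gets queued
notifier.notify("state-change")   # queued behind the alarm
ntf.active = True                 # switch-over finished
remaining = notifier.flush()
```

Calling `flush()` at the end of the switch-over corresponds to the first option above; the separate-thread option would instead loop on the send call until it stops returning TRY_AGAIN.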


The problem exists in all OpenSAF releases.





---


[tickets] [opensaf:tickets] #1331 java clm : dispatchBlocking() APIs does not return SA_AIS_OK after finalizing the handle

2015-11-02 Thread Anders Widell

- **Milestone**: 4.4.2 --> 4.6.2



---

** [tickets:#1331] java clm : dispatchBlocking() APIs does not return SA_AIS_OK 
after finalizing the handle**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Wed Apr 22, 2015 09:06 AM UTC by Sirisha Alla
**Last Updated:** Wed Apr 22, 2015 09:06 AM UTC
**Owner:** nobody


This ticket is a clone of devel ticket 1671.

When the dispatchBlocking() and dispatchBlocking(tmout) APIs are invoked in a 
thread and the handle is finalized, the dispatchBlocking APIs should return 
SA_AIS_OK. Instead, an exception is raised.


---


[tickets] [opensaf:tickets] #1325 Unnecessary state change notification regarding osafntfimcnd during failover

2015-11-02 Thread Anders Widell

- **Milestone**: 4.4.2 --> 4.6.2



---

** [tickets:#1325] Unnecessary state change notification regarding osafntfimcnd 
during failover**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Tue Apr 21, 2015 01:09 PM UTC by Srikanth R
**Last Updated:** Tue Apr 21, 2015 01:09 PM UTC
**Owner:** nobody


Changeset : 6377

An unnecessary and invalid notification is generated during the failover.

===  Apr  9 21:14:03 - State Change  ===
eventType = SA_NTF_OBJECT_STATE_CHANGE
notificationObject = "osafntfimcnd"
notifyingObject = "safApp=OpenSaf"
notificationClassId = 32993.8.0 (0x0)
sourceIndicator = SA_NTF_OBJECT_OPERATION



---


[tickets] [opensaf:tickets] #1320 configure_tipc script error due to $CORE_ID

2015-11-02 Thread Anders Widell

- **Milestone**: 4.4.2 --> 4.6.2



---

** [tickets:#1320] configure_tipc script error due to $CORE_ID**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Mon Apr 20, 2015 09:34 PM UTC by Adrian Szwej
**Last Updated:** Mon Apr 20, 2015 09:34 PM UTC
**Owner:** nobody


During start of opensaf, the /var/log/opensaf/nid.log gives an error message:
**/usr/local/lib/opensaf/configure_tipc: line 198: [: 1234: unary operator 
expected**

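The script **/usr/local/lib/opensaf/configure_tipc** uses the $CORE_ID 
variable, which does not seem to be set anywhere. If it is empty, the unquoted 
comparison below collapses to `[ 1234 != ]`, which would explain the error.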

    configured_net_id=`tipc-config -netid | cut -d: -f2`
    opensaf_net_id=$CORE_ID
    if [ $configured_net_id != $opensaf_net_id ]; then
        logger -t opensaf -s "TIPC network ID not configured to OpenSAF requirements, exiting..."
        exit 1
    fi



---


[tickets] [opensaf:tickets] #945 AMF: allow creation of unlocked SUs when node is locked-inst

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.FC --> never



---

** [tickets:#945] AMF: allow creation of unlocked SUs when node is locked-inst**

**Status:** invalid
**Milestone:** never
**Created:** Mon Jun 23, 2014 02:05 PM UTC by Hans Feldt
**Last Updated:** Mon Jun 30, 2014 07:43 AM UTC
**Owner:** nobody


A small change but important for the "cluster scale out" use case.
A pre-requisite is that the service unit is mapped to a node (not node group)


---


[tickets] [opensaf:tickets] #1556 AMF : SU struck in instantiating state during adm su restart op ( component reg failure )

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1556] AMF : SU struck in instantiating state during adm su restart 
op ( component reg failure )**

**Status:** accepted
**Milestone:** 4.6.2
**Created:** Thu Oct 22, 2015 06:44 AM UTC by Srikanth R
**Last Updated:** Thu Oct 22, 2015 09:42 AM UTC
**Owner:** Praveen
**Attachments:**

- [TwoN.sh](https://sourceforge.net/p/opensaf/tickets/1556/attachment/TwoN.sh) 
(9.7 kB; application/x-shellscript)


Changeset :  6901
Application : 2N , two SUs

Steps:
* Both SUs have full assignments.
* Issued a restart operation on the SU hosting the standby assignment. The first 
component in the SU did not register with AMF: the CLC CLI script exited 
with success, but the component never called saAmfComponentRegister.

Oct 17 13:40:37 SYSTEST-PLD-1 osafamfnd[28402]: NO Admin Restart request for 
'safSu=TestApp_SU2,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 17 13:40:37 SYSTEST-PLD-1 osafamfnd[28402]: NO 
'safSu=TestApp_SU2,safSg=TestApp_SG1,safApp=TestApp_TwoN' Presence State 
INSTANTIATED => RESTARTING
Oct 17 13:40:37 SYSTEST-PLD-1 osafamfnd[28402]: NO 
'safSu=TestApp_SU2,safSg=TestApp_SG1,safApp=TestApp_TwoN' Presence State 
RESTARTING => INSTANTIATING
Oct 17 13:40:47 SYSTEST-PLD-1 osafamfnd[28402]: NO Instantiation of 
'safComp=COMP1,safSu=TestApp_SU2,safSg=TestApp_SG1,safApp=TestApp_TwoN' failed
Oct 17 13:40:47 SYSTEST-PLD-1 osafamfnd[28402]: NO Reason: component 
registration timer expired


Below is the state of the SU after the admin operation timed out.

safSu=TestApp_SU2,safSg=TestApp_SG1,safApp=TestApp_TwoN
saAmfSUAdminState=LOCKED(2)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATING(2)
saAmfSUReadinessState=OUT-OF-SERVICE(1)



---


[tickets] [opensaf:tickets] #1557 Comp fails in INSTANTIATION_FAILED because comp crashes after compRegistration timeout

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1557] Comp fails in INSTANTIATION_FAILED because comp crashes 
after compRegistration timeout**

**Status:** review
**Milestone:** 4.6.2
**Labels:** INSTANTIATION_FAILED component registration 
**Created:** Fri Oct 23, 2015 02:17 AM UTC by Minh Hon Chau
**Last Updated:** Wed Oct 28, 2015 02:46 AM UTC
**Owner:** Minh Hon Chau
**Attachments:**

- [app3_twon2su1si.xml](https://sourceforge.net/p/opensaf/tickets/1557/attachment/app3_twon2su1si.xml) (10.5 kB; text/xml)
- [amf_demo_script](https://sourceforge.net/p/opensaf/tickets/1557/attachment/amf_demo_script) (1.9 kB; application/octet-stream)
- [log.tgz](https://sourceforge.net/p/opensaf/tickets/1557/attachment/log.tgz) (698.3 kB; application/x-compressed-tar)
- [amf_demo.diff](https://sourceforge.net/p/opensaf/tickets/1557/attachment/amf_demo.diff) (2.0 kB; text/x-patch)


Steps to reproduce:
. Apply amf_demo.diff and build amf_demo, using the attached amf_demo_script as 
the CLC script
. Run commands:
   . immcfg -f app3_twon2su1si.xml
   . echo 1 > /root/hu23992
   . amf-adm unlock-in safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon

Logs:
Oct 23 12:47:19 PL-4 osafamfnd[421]: NO 
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' Presence State UNINSTANTIATED 
=> INSTANTIATING
Oct 23 12:47:19 PL-4 amf_demo_script: CLC-START: 
safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon
Oct 23 12:47:22 PL-4 amf_demo[585]: 
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' started
Oct 23 12:47:26 PL-4 osafamfnd[421]: NO Instantiation of 
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' failed
Oct 23 12:47:26 PL-4 osafamfnd[421]: NO Reason: component registration timer 
expired
Oct 23 12:47:26 PL-4 amf_demo_script: CLC-STOP: 
safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon
Oct 23 12:47:27 PL-4 amf_demo[585]: Registered with AMF and HC started
Oct 23 12:47:27 PL-4 amf_demo[585]: Health check 1
Oct 23 12:47:29 PL-4 amf_demo[585]: exiting (caught term signal)
Oct 23 12:47:29 PL-4 osafamfnd[421]: NO 
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' component restart probation 
timer started (timeout: 100 ns)
Oct 23 12:47:29 PL-4 osafamfnd[421]: NO Restarting a component of 
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' (comp restart count: 1)
Oct 23 12:47:29 PL-4 osafamfnd[421]: NO 
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' faulted due to 
'avaDown' : Recovery is 'componentRestart'
Oct 23 12:47:29 PL-4 amf_demo_script: CLC-STOP: 
safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon
Oct 23 12:47:29 PL-4 amf_demo_script: CLC-START: 
safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon
Oct 23 12:47:32 PL-4 amf_demo[628]: 
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' started
Oct 23 12:47:32 PL-4 amf_demo[628]: exiting (caught term signal)
Oct 23 12:47:32 PL-4 osafamfnd[421]: WA 
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' Presence State 
INSTANTIATING => INSTANTIATION_FAILED
Oct 23 12:47:32 PL-4 osafamfnd[421]: NO 
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' Presence State INSTANTIATING 
=> INSTANTIATION_FAILED

Trace is also attached.

Initial analysis:
. After the comp times out in the component_registration phase, amfnd enters 
instantiating_fail, so the cleanup CLC is called.
. Then the comp crashes and amfnd receives ava_mds_down. amfnd enters 
instantiating_fail for the component again, so another cleanup CLC is called.
. Eventually, when the two cleanup CLCs return, amfnd enters 
cleanup_success twice while the component is in the instantiating state.
. At the second cleanup_success, the retry_counter has reached retry_max, so 
the component falls into INSTANTIATION_FAILED.

As a first thought, amfnd should not enter instantiating_fail when the comp 
crashes, since it is already handling instantiating_fail.
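
The sequence in the analysis can be reproduced with a toy model. This is a sketch only, not amfnd's state machine: the class, method names, the `guard_duplicate_failure` switch and the retry_max value of 2 are all assumptions made for illustration, and the only behaviour taken from the analysis is that each cleanup_success counts against the retry limit.

```python
class ComponentModel:
    """Toy model of a component in the instantiating state."""

    def __init__(self, retry_max, guard_duplicate_failure):
        self.retry_max = retry_max
        self.guard = guard_duplicate_failure
        self.retry_count = 0
        self.cleanups_pending = 0
        self.state = "INSTANTIATING"

    def on_instantiating_fail(self):
        # With the guard on, a failure event is ignored while an earlier
        # failure is still being handled (a cleanup CLC is outstanding).
        if self.guard and self.cleanups_pending > 0:
            return
        self.cleanups_pending += 1

    def on_cleanup_success(self):
        self.cleanups_pending -= 1
        self.retry_count += 1
        if self.retry_count >= self.retry_max:
            self.state = "INSTANTIATION_FAILED"


def run_scenario(guard_duplicate_failure, retry_max=2):
    """Registration timeout followed by a crash before cleanup returns."""
    comp = ComponentModel(retry_max, guard_duplicate_failure)
    comp.on_instantiating_fail()   # component registration timer expired
    comp.on_instantiating_fail()   # ava_mds_down after the component crashed
    for _ in range(comp.cleanups_pending):
        comp.on_cleanup_success()
    return comp.state
```

Without the guard the single failed attempt is counted twice and the model ends in INSTANTIATION_FAILED; with the guard it stays in INSTANTIATING and one retry remains.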



---


[tickets] [opensaf:tickets] #1562 AMF : (NPM ) Standby assignments are done with out any active assignment

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1562] AMF : (NPM ) Standby assignments are done with out any 
active assignment**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Fri Oct 23, 2015 01:59 PM UTC by Srikanth R
**Last Updated:** Fri Oct 23, 2015 01:59 PM UTC
**Owner:** nobody
**Attachments:**

- [1562.tgz](https://sourceforge.net/p/opensaf/tickets/1562/attachment/1562.tgz) (178.3 kB; application/x-compressed-tar)


Changeset : 6901
Setup : NPM application with 4 SUs hosted on PL-3 & PL-4 and 4SIs 
 SU1 & SU3 hosted on PL-3 , SU2 & SU4 hosted on PL-4
 
Steps :

After a series of operations on the NPM application, the assignments are in 
the following state:


| SU          | TestApp_SI1 | TestApp_SI2 | TestApp_SI3 | TestApp_SI4 |
|-------------|-------------|-------------|-------------|-------------|
| TestApp_SU1 | ACTIVE      | ACTIVE      |             |             |
| TestApp_SU2 |             |             | ACTIVE      | ACTIVE      |
| TestApp_SU3 | STANDBY     | STANDBY     | STANDBY     |             |
| TestApp_SU4 |             |             |             | STANDBY     |



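After opensafd is stopped on PL-3, the assignments are as follows:


| SU          | TestApp_SI1 | TestApp_SI2 | TestApp_SI3 | TestApp_SI4 |
|-------------|-------------|-------------|-------------|-------------|
| TestApp_SU1 |             |             |             |             |
| TestApp_SU2 |             |             | ACTIVE      | ACTIVE      |
| TestApp_SU3 |             |             |             |             |
| TestApp_SU4 | STANDBY     | STANDBY     |             | STANDBY     |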


Corresponding log in syslog on PL-4 :
Oct 23 19:00:29 PAYLOAD-2 osafimmnd[8101]: NO Implementer disconnected 40 <0, 
2010f> (MsgQueueService131855)
Oct 23 19:00:29 PAYLOAD-2 osafamfnd[8120]: NO Assigning 
'safSi=TestApp_SI1,safApp=TestApp_Npm' STANDBY to 
'safSu=TestApp_SU4,safSg=TestApp_SG1,safApp=TestApp_Npm'
Oct 23 19:00:29 PAYLOAD-2 osafamfnd[8120]: NO Assigning 
'safSi=TestApp_SI2,safApp=TestApp_Npm' STANDBY to 
'safSu=TestApp_SU4,safSg=TestApp_SG1,safApp=TestApp_Npm'
Oct 23 19:00:29 PAYLOAD-2 osafamfnd[8120]: NO Assigned 
'safSi=TestApp_SI2,safApp=TestApp_Npm' STANDBY to 
'safSu=TestApp_SU4,safSg=TestApp_SG1,safApp=TestApp_Npm'
Oct 23 19:00:29 PAYLOAD-2 osafamfnd[8120]: NO Assigned 
'safSi=TestApp_SI1,safApp=TestApp_Npm' STANDBY to 
'safSu=TestApp_SU4,safSg=TestApp_SG1,safApp=TestApp_Npm'
Oct 23 19:00:32 PAYLOAD-2 kernel: [ 7785.128227] TIPC: Resetting link 
<1.1.4:eth3-1.1.3:eth3>, peer not responding
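
The faulty end state can be expressed as a simple invariant check. The sketch below assumes the assignments after PL-3 went down are as reported in this ticket (SU2 active for SI3 and SI4, SU4 standby for SI1, SI2 and SI4); the function name and data layout are hypothetical.

```python
def sis_with_standby_but_no_active(assignments):
    """Return SIs that have a STANDBY assignment but no ACTIVE one.

    assignments maps an SU name to a dict of SI name -> HA state.
    """
    states_by_si = {}
    for si_states in assignments.values():
        for si, state in si_states.items():
            states_by_si.setdefault(si, set()).add(state)
    return sorted(
        si for si, states in states_by_si.items()
        if "STANDBY" in states and "ACTIVE" not in states
    )


# Assignments after opensafd was stopped on PL-3, as reported above.
after_pl3_stop = {
    "TestApp_SU2": {"TestApp_SI3": "ACTIVE", "TestApp_SI4": "ACTIVE"},
    "TestApp_SU4": {
        "TestApp_SI1": "STANDBY",
        "TestApp_SI2": "STANDBY",
        "TestApp_SI4": "STANDBY",
    },
}

orphaned = sis_with_standby_but_no_active(after_pl3_stop)
```

For this data the check reports TestApp_SI1 and TestApp_SI2, the two SIs whose STANDBY assignment to SU4 appears in the syslog above.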

Attached are amfd.state and the amfd traces from the active controller, the 
amfnd trace from the payload hosting SU2 & SU4, and the NPM configuration.


---


[tickets] [opensaf:tickets] #1560 AMF : NG admin state should be validated during creation

2015-11-02 Thread Anders Widell

- **Milestone**: 4.6.1 --> 4.6.2



---

** [tickets:#1560] AMF : NG admin state should be validated during creation**

**Status:** review
**Milestone:** 4.6.2
**Created:** Fri Oct 23, 2015 07:20 AM UTC by Srikanth R
**Last Updated:** Thu Oct 29, 2015 05:22 AM UTC
**Owner:** Praveen


Changeset : 6901

When a node group is created, the admin state value should be validated. 
Currently, an invalid admin state for the node group is accepted:


immcfg -c SaAmfNodeGroup safAmfNodeGroup=TestNG,safAmfCluster=myAmfCluster -a 
saAmfNGNodeList=safAmfNode=SC-1,safAmfCluster=myAmfCluster -a 
saAmfNGAdminState=5

CONTROLLER-1:~ # amf-state ng
safAmfNodeGroup=TestNG,safAmfCluster=myAmfCluster
saAmfNGAdminState=UNKNOWN(5)
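
The missing check could be sketched as follows. This is only an illustration in Python, not the AMF director code: `is_valid_ng_admin_state` is a hypothetical helper, and the numeric values are the SaAmfAdminStateT values from the SA Forum AMF specification. Which of these states AMF should actually accept at node group creation is not stated in this ticket, so the sketch only rejects values outside the enum.

```python
# SaAmfAdminStateT values as defined by the SA Forum AMF specification.
SA_AMF_ADMIN_UNLOCKED = 1
SA_AMF_ADMIN_LOCKED = 2
SA_AMF_ADMIN_LOCKED_INSTANTIATION = 3
SA_AMF_ADMIN_SHUTTING_DOWN = 4

VALID_ADMIN_STATES = frozenset({
    SA_AMF_ADMIN_UNLOCKED,
    SA_AMF_ADMIN_LOCKED,
    SA_AMF_ADMIN_LOCKED_INSTANTIATION,
    SA_AMF_ADMIN_SHUTTING_DOWN,
})


def is_valid_ng_admin_state(value):
    """Hypothetical helper: True if value is a defined admin state."""
    return value in VALID_ADMIN_STATES


# The value used in the immcfg command above would be rejected.
print(is_valid_ng_admin_state(5))
```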


---


[tickets] [opensaf:tickets] #1529 Node rebooted as saImmOiInitialize_2 failed during middleware active assignment

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1529] Node rebooted as saImmOiInitialize_2 failed during 
middleware active assignment**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Thu Oct 08, 2015 07:53 AM UTC by Chani Srivastava
**Last Updated:** Fri Oct 09, 2015 10:54 AM UTC
**Owner:** nobody
**Attachments:**

- 
[SC1_syslog.txt](https://sourceforge.net/p/opensaf/tickets/1529/attachment/SC1_syslog.txt)
 (436.4 kB; text/plain)
- 
[SC2_syslog.txt](https://sourceforge.net/p/opensaf/tickets/1529/attachment/SC2_syslog.txt)
 (425.6 kB; text/plain)
- 
[1529.tgz](https://sourceforge.net/p/opensaf/tickets/1529/attachment/1529.tgz) 
(586.3 kB; application/x-compressed-tar)


Setup:
Changeset-6901
Invoked continuous failovers on a 4-node Cluster with 2 controllers and 2 
payloads. All nodes have 64bit architecture.
2PBE enabled with 25K objects

Issue Observed:
Cluster reset occurred on invoking continuous failovers

Attachments:
Attaching syslogs for SC-1 and SC-2
Traces for immnd and immd can be shared separately if required

Steps:
* Initially SC-1 is active and SC-2 standby
* A test script invoked failover via killing osafclmd on SC1
* SC-2 became active

Oct  7 18:23:32 OSAF-SC1 root: killing osafclmd from invoke_failover.sh
Oct  7 19:25:20 OSAF-SC2 osafamfd[2191]: NO FAILOVER StandBy --> Active

* On the new active controller, saImmOiInitialize_2 failed 

Oct  7 19:25:22 OSAF-SC2 osafntfimcnd[2735]: ER ntfimcn_imm_init 
saImmOiInitialize_2 failed SA_AIS_ERR_TIMEOUT (5)
Oct  7 19:25:22 OSAF-SC2 osafntfimcnd[2735]: ER ntfimcn_imm_init() Fail
Oct  7 19:25:22 OSAF-SC2 osafimmnd[2131]: NO Implementer connected: 333 
(safLckService) <299, 2020f>
Oct  7 19:25:22 OSAF-SC2 osafimmnd[2131]: NO Implementer connected: 334 
(safEvtService) <298, 2020f>
Oct  7 19:25:23 OSAF-SC2 osafntfimcnd[2738]: ER ntfimcn_imm_init 
saImmOiInitialize_2 failed SA_AIS_ERR_TIMEOUT (5)
Oct  7 19:25:23 OSAF-SC2 osafntfimcnd[2738]: ER ntfimcn_imm_init() Fail
Oct  7 19:25:23 OSAF-SC2 osafimmnd[2131]: WA MDS Send Failed
Oct  7 19:25:23 OSAF-SC2 osafimmnd[2131]: WA Error code 2 returned for message 
type 4 - ignoring

* Other services also fail to initialize with IMM on the new active 
controller, i.e. SC-2

* And finally SMF had csi set timeout
* SC-2 went for reboot and hence the entire cluster reset, as SC-2 is the only 
active controller at the time

Oct  7 19:25:51 OSAF-SC2 osafamfnd[2205]: NO 
'safComp=SMF,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 
'csiSetcallbackTimeout' : Recovery is 'nodeFailfast'
Oct  7 19:25:51 OSAF-SC2 osafamfnd[2205]: ER 
safComp=SMF,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due 
to:csiSetcallbackTimeout Recovery is:nodeFailfast
Oct  7 19:25:51 OSAF-SC2 osafamfnd[2205]: Rebooting OpenSAF NodeId = 131599 EE 
Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 
131599, SupervisionTime = 60
Oct  7 19:25:51 OSAF-SC2 opensaf_reboot: Rebooting local node; timeout=60




---


[tickets] [opensaf:tickets] #1465 Don't send alarm "SI has no current active assignments" if node is locked

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1465] Don't send alarm "SI has no current active assignments" if 
node is locked**

**Status:** review
**Milestone:** 4.6.2
**Created:** Fri Aug 28, 2015 03:01 PM UTC by hano
**Last Updated:** Tue Sep 15, 2015 01:24 PM UTC
**Owner:** hano


In a cloud environment, scale-in is done with node shutdown, lock and opensafd 
stop.
For a M/W No Redundancy SI, an 'SI Unassigned' alarm is raised when opensafd 
is stopped, because M/W SI assignments are not affected by the node 
lock/shutdown. This alarm should be avoided. A patch has been sent out in 
which the alarm is not sent if the node is shut down or locked and the 
redundancy model is No Redundancy. The alarm is also unwanted for No 
Redundancy application SIs at node shutdown, so version 3 of the patch has 
been sent out.
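
The condition described above can be sketched as a small predicate. This is a Python illustration under stated assumptions, not the patch itself: the function name and the string constants for admin state and redundancy model are hypothetical.

```python
def should_send_si_unassigned_alarm(node_admin_state, redundancy_model):
    """Hypothetical predicate for the 'SI Unassigned' alarm.

    The alarm is suppressed when the hosting node is locked or shutting
    down and the SI belongs to a No Redundancy service group.
    """
    node_going_down = node_admin_state in ("LOCKED", "SHUTTING_DOWN")
    if node_going_down and redundancy_model == "NO_REDUNDANCY":
        return False
    return True


# Scale-in case from the description: locked node, No Redundancy SI.
print(should_send_si_unassigned_alarm("LOCKED", "NO_REDUNDANCY"))
```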


---


[tickets] [opensaf:tickets] #1400 systemd problems installing on debian jessie

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1400] systemd problems installing on debian jessie**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Thu Jul 02, 2015 08:23 PM UTC by Charles Stuart Johnson
**Last Updated:** Wed Jul 15, 2015 12:52 PM UTC
**Owner:** nobody


After installing all the requested packages through successive installs of 
OpenSAF on Debian 8.0.0 and 8.1.0, I was blocked by a bug when running this 
command:

cd /data/projects/opensaf/opensaf-staging &&
./bootstrap.sh &&
./configure --disable-tipc --disable-ais-plm --enable-java &&
make &&
sudo make install

Here's what I got:

sh: 6: qmake-qt4: not found
autoreconf: Entering directory `.'
autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal -I m4
autoreconf: configure.ac: tracing
autoreconf: configure.ac: adding subdirectory contrib/plmc to autoreconf
autoreconf: Entering directory `contrib/plmc'
autoreconf: running: libtoolize --copy
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `.'.
libtoolize: copying file `./ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
autoreconf: running: /usr/bin/autoconf
autoreconf: running: /usr/bin/autoheader
autoreconf: running: automake --add-missing --copy --no-force
configure.ac:39: installing './compile'
configure.ac:20: installing './config.guess'
configure.ac:20: installing './config.sub'
configure.ac:25: installing './install-sh'
configure.ac:25: installing './missing'
lib/utils/Makefile.am: installing './depcomp'
autoreconf: Leaving directory `contrib/plmc'
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `.'.
libtoolize: copying file `./ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
configure.ac:27: installing './compile'
configure.ac:20: installing './config.guess'
configure.ac:20: installing './config.sub'
configure.ac:25: installing './install-sh'
configure.ac:25: installing './missing'
java/ais_api_impl_native/Makefile.am: installing './depcomp'
python/pyosaf/Makefile.am:21: installing './py-compile'
autoreconf: Leaving directory `.'
abort: no repository found in '/data/projects/opensaf/opensaf-staging' (.hg not 
found)!
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking target system type... x86_64-unknown-linux-gnu
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether make supports nested variables... (cached) yes
checking for style of include used by make... GNU
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking dependency style of gcc... gcc3
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking minix/config.h usability... no
checking minix/config.h presence... no
checking for minix/config.h... no
checking whether it is safe to define __EXTENSIONS__... yes
checking whether to build with rpath enabled... yes
checking how to print strings... printf
checking for a sed that does not truncate output... /bin/sed
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1572864
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking how to

[tickets] [opensaf:tickets] #1421 log: not check special characters from saLogStreamFileName value

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1421] log: not check special characters from saLogStreamFileName 
value**

**Status:** accepted
**Milestone:** 4.6.2
**Created:** Thu Jul 16, 2015 09:35 AM UTC by Vu Minh Nguyen
**Last Updated:** Mon Aug 03, 2015 11:25 AM UTC
**Owner:** Vu Minh Nguyen


Logsv does not validate whether the saLogStreamFileName value contains special 
characters.
See the saLogStreamFileName attribute line in the output below.
# immlist safLgStrCfg=str9,safApp=safLogService
Name                             Type         Value(s)

safLgStrCfg                      SA_STRING_T  safLgStrCfg=str9
saLogStreamSeverityFilter        SA_UINT32_T  30 (0x1e)
saLogStreamPathName              SA_STRING_T  .
saLogStreamNumOpeners            SA_UINT32_T  1 (0x1)
saLogStreamMaxLogFileSize        SA_UINT64_T  5000000 (0x4c4b40)
saLogStreamMaxFilesRotated       SA_UINT32_T  4 (0x4)
saLogStreamLogFullHaltThreshold  SA_UINT32_T  75 (0x4b)
saLogStreamLogFullAction         SA_UINT32_T  3 (0x3)
saLogStreamLogFileFormat         SA_STRING_T
saLogStreamFixedLogRecordSize    SA_UINT32_T  150 (0x96)
saLogStreamFileName              SA_STRING_T  \/ a bc  . txt
saLogStreamCreationTimestamp     SA_TIME_T    1437031872934978000 (0x13f15cdfed3421d0, Thu Jul 16 08:31:12 2015)
logStreamDiscardedCounter        SA_UINT64_T  0 (0x0)
SaImmAttrImplementerName         SA_STRING_T  safLogService
SaImmAttrClassName               SA_STRING_T  SaLogStreamConfig
SaImmAttrAdminOwnerName          SA_STRING_T


As a result, logsv fails to create the cfg/log files.

In the trace log, we get the following error messages:

Jul 16  8:31:13.068090 osaflogd [417:lgs_util.c:0106] TR 
lgs_create_config_file_h - Config file path "/repl_opensaf/saflog/./\/ a bc  
. txt.cfg"
Jul 16  8:31:13.068997 osaflogd [417:lgs_filehdl.c:0170] >> 
create_config_file_hdl
Jul 16  8:31:13.069422 osaflogd [417:lgs_filehdl.c:0172] TR 
create_config_file_hdl - file_path "/repl_opensaf/saflog/./\/ a bc  . txt.cfg"
Jul 16  8:31:13.074774 osaflogd [417:lgs_filehdl.c:0182] NO Could not open 
'/repl_opensaf/saflog/./\/ a bc  . txt.cfg' - No such file or directory
Jul 16  8:31:13.075243 osaflogd [417:lgs_filehdl.c:0232] << 
create_config_file_hdl: rc = -1
Jul 16  8:31:13.075975 osaflogd [417:lgs_util.c:0166] << 
lgs_create_config_file_h: rc = -1
Jul 16  8:31:13.079080 osaflogd [417:lgs_stream.c:0347] TR 
log_initiate_stream_files - lgs_create_config_file_h() FAIL
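
A validation step of the kind this ticket asks for might look like the sketch below. It is a Python illustration only: the ticket does not say which characters count as special, so the allowed set here (letters, digits, '_', '-' and '.') is an assumption, and `is_valid_log_file_name` is a hypothetical name.

```python
import re

# Assumed policy: only letters, digits, '_', '-' and '.' are allowed.
# The ticket does not define the exact set of special characters.
_ALLOWED_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def is_valid_log_file_name(name):
    """Hypothetical check for a saLogStreamFileName value."""
    if name in (".", ".."):
        # These would refer to directories, not to a log file.
        return False
    return _ALLOWED_NAME.fullmatch(name) is not None


# The value from the immlist output above is rejected under this policy.
print(is_valid_log_file_name("\\/ a bc  . txt"))
```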



---


[tickets] [opensaf:tickets] #1410 pyosaf: Invalid exception used in ImmObject (object.py)

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#1410] pyosaf: Invalid exception used in ImmObject (object.py)**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Fri Jul 10, 2015 10:11 AM UTC by Johan Mårtensson
**Last Updated:** Wed Jul 15, 2015 12:46 PM UTC
**Owner:** nobody


ImmObject uses an invalid way to raise exceptions:


>>> a = ImmObject('NonExistingClass')
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/local/lib/python2.7/dist-packages/pyosaf/utils/immom/object.py", 
line 63, in __init__
raise
TypeError: exceptions must be old-style classes or derived from BaseException, 
not NoneType
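
For comparison, a bare `raise` is only valid while an exception is being handled; outside of that it fails as shown above. A sketch of raising a named exception instead is given below. `ImmClassNotFoundError`, `ImmObjectSketch` and `KNOWN_CLASSES` are hypothetical stand-ins for illustration and are not part of pyosaf.

```python
class ImmClassNotFoundError(Exception):
    """Hypothetical exception for an unknown IMM class name."""


# Placeholder for a real class lookup against IMM.
KNOWN_CLASSES = {"SaLogStreamConfig"}


class ImmObjectSketch(object):
    """Hypothetical stand-in for ImmObject, not the pyosaf class."""

    def __init__(self, class_name):
        if class_name not in KNOWN_CLASSES:
            # Raise an explicit exception instead of a bare 'raise'.
            raise ImmClassNotFoundError(
                "IMM class %r does not exist" % class_name)
        self.class_name = class_name


try:
    ImmObjectSketch("NonExistingClass")
except ImmClassNotFoundError as exc:
    print("caught: %s" % exc)
```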



---


[tickets] [opensaf:tickets] #928 base: Selection object fails due to re-cycled file descriptor

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#928] base: Selection object fails due to re-cycled file 
descriptor**

**Status:** accepted
**Milestone:** 4.6.2
**Created:** Wed May 28, 2014 07:48 AM UTC by Anders Widell
**Last Updated:** Wed Jul 15, 2015 01:25 PM UTC
**Owner:** nobody


A case has been seen where syslog gets filled with thousands of messages like 
the one below:

May 3 15:37:48 SC-1 osaflogd[7643]: ncs_sel_obj_rmv_ind: recv failed - 
Socket operation on non-socket

Probably the wrong file descriptor is being used here when this happens. When 
looking at the code, there are some obvious improvements that can be made:

* Whenever the file descriptors raise_obj and/or rmv_obj are closed, the file 
descriptors in the data structure should be overwritten with -1 to indicate 
that the file descriptor is no longer valid. Relying on subsequent system calls 
to fail with EBADF is not a good idea, since the file descriptor may be 
re-cycled. This might be what has happened in the syslog entry above.
* The function ncs_sel_obj_rmv_ind() should check if either file descriptor is 
less than zero, and if so, return immediately without trying to operate on the 
file descriptors. It may log to syslog in this case, but in order to avoid 
spamming the log it should make sure to log only once. This can be achieved by 
e.g. logging if the file descriptor is -1, and then change it to -2 so that the 
next call will not log to syslog.
* If, after implementing the changes suggested above, recv() still fails for 
any reason other than EAGAIN, EWOULDBLOCK or EINTR, we should call osaf_abort() 
to generate a core dump. Errors like "socket operation on non-socket" are an 
indication of a bug.
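
The three suggestions above could look roughly like the sketch below. It is written in Python for illustration only, since the real code is C in the OpenSAF base library. `SelObj`, `FD_CLOSED`, `FD_CLOSED_LOGGED` and the `log` list are hypothetical names; a real implementation would write to syslog and call osaf_abort() where this sketch lets an unexpected OSError propagate.

```python
import socket

FD_CLOSED = -1          # descriptor closed, not yet reported
FD_CLOSED_LOGGED = -2   # descriptor closed, already reported once


class SelObj(object):
    """Hypothetical selection object built on a socket pair."""

    def __init__(self):
        self._raise_sock, self._rmv_sock = socket.socketpair()
        self._rmv_sock.setblocking(False)
        self.raise_obj = self._raise_sock.fileno()
        self.rmv_obj = self._rmv_sock.fileno()
        self.log = []  # stands in for syslog

    def destroy(self):
        self._raise_sock.close()
        self._rmv_sock.close()
        # Overwrite the stored descriptors so that a re-cycled fd number
        # can never be used by mistake.
        self.raise_obj = FD_CLOSED
        self.rmv_obj = FD_CLOSED

    def raise_ind(self):
        if self.raise_obj < 0:
            return False
        self._raise_sock.send(b"A")
        return True

    def rmv_ind(self):
        if self.raise_obj < 0 or self.rmv_obj < 0:
            if FD_CLOSED in (self.raise_obj, self.rmv_obj):
                # Log only once, then mark the descriptors as reported.
                self.log.append("rmv_ind called on destroyed object")
                self.raise_obj = FD_CLOSED_LOGGED
                self.rmv_obj = FD_CLOSED_LOGGED
            return -1
        count = 0
        while True:
            try:
                data = self._rmv_sock.recv(1)
            except (BlockingIOError, InterruptedError):
                break  # EAGAIN/EWOULDBLOCK/EINTR: nothing more to read
            # Any other OSError propagates; the ticket suggests aborting.
            if not data:
                break
            count += 1
        return count
```

Calling `rmv_ind()` repeatedly after `destroy()` returns -1 each time but adds only one entry to `log`.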



---


[tickets] [opensaf:tickets] #682 LOG: New Active reboots when coordinator IMMND is killed in the middle of switchover

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#682] LOG: New Active reboots when coordinator IMMND is killed in 
the middle of switchover**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Fri Dec 20, 2013 05:27 AM UTC by Sirisha Alla
**Last Updated:** Mon Aug 03, 2015 11:31 AM UTC
**Owner:** nobody
**Attachments:**

- 
[logs.tar.bz2](https://sourceforge.net/p/opensaf/tickets/682/attachment/logs.tar.bz2)
 (4.3 MB; application/x-bzip)
- 
[tic682.tgz](https://sourceforge.net/p/opensaf/tickets/682/attachment/tic682.tgz)
 (208.3 kB; application/x-compressed-tar)


The issue is observed on changeset 4733 + #220 patches corresponding to cs 4741 
and cs 4742. The test setup is a 4-node cluster of SLES 64-bit VMs, with single 
PBE enabled and loaded with 25k objects.

SC-2 (SLES-64BIT-SLOT2) is active and the IMMND coordinator is hosted on 
SC-1 (SLES-64BIT-SLOT1). A controller switchover is initiated and immnd is 
killed on SC-1. SC-1 went for reboot because of the csi set callback timeout 
of logd.

/var/log/messages of SC-1 and SC-2 corresponding to the above mentioned steps :

SC-2:

Dec 19 17:21:36 SLES-64BIT-SLOT2 osafamfd[3609]: NO safSi=SC-2N,safApp=OpenSAF 
Swap initiated
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafamfnd[3619]: NO Assigning 
'safSi=SC-2N,safApp=OpenSAF' QUIESCED to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafimmnd[3554]: NO Implementer disconnected 
18 <320, 2020f> (safMsgGrpService)
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafimmnd[3554]: NO implementer for class 
'SaSmfCampaign' is released => class extent is UNSAFE
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafimmnd[3554]: NO Implementer disconnected 
22 <319, 2020f> (safEvtService)
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafimmnd[3554]: NO Implementer disconnected 
23 <3, 2020f> (safLogService)
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafimmnd[3554]: NO implementer for class 
'OpenSafSmfConfig' is released => class extent is UNSAFE
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafimmnd[3554]: NO implementer for class 
'SaSmfSwBundle' is released => class extent is UNSAFE
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafimmnd[3554]: NO Implementer disconnected 
24 <298, 2020f> (safSmfService)
Dec 19 17:21:37 SLES-64BIT-SLOT2 osafimmnd[3554]: NO Implementer disconnected 
20 <303, 2020f> (safLckService)

SC-1:

Dec 19 17:21:38 SLES-64BIT-SLOT1 osafimmnd[3498]: NO Implementer disconnected 
18 <0, 2020f> (safMsgGrpService)
Dec 19 17:21:38 SLES-64BIT-SLOT1 osafimmnd[3498]: NO implementer for class 
'SaSmfCampaign' is released => class extent is UNSAFE
Dec 19 17:21:38 SLES-64BIT-SLOT1 osafimmnd[3498]: NO Implementer disconnected 
22 <0, 2020f> (safEvtService)
Dec 19 17:21:38 SLES-64BIT-SLOT1 osafimmnd[3498]: NO Implementer disconnected 
23 <0, 2020f> (safLogService)
Dec 19 17:21:38 SLES-64BIT-SLOT1 osafimmnd[3498]: NO implementer for class 
'OpenSafSmfConfig' is released => class extent is UNSAFE
Dec 19 17:21:38 SLES-64BIT-SLOT1 osafimmnd[3498]: NO implementer for class 
'SaSmfSwBundle' is released => class extent is UNSAFE
Dec 19 17:21:38 SLES-64BIT-SLOT1 osafimmnd[3498]: NO Implementer disconnected 
24 <0, 2020f> (safSmfService)
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafimmnd[3498]: NO Implementer disconnected 
20 <0, 2020f> (safLckService)
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafimmnd[3498]: NO Implementer disconnected 
19 <0, 2020f> (safCheckPointService)
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafimmnd[3498]: NO Implementer disconnected 
21 <0, 2020f> (safClmService)
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafimmpbed: WA PBE lost contact with parent 
IMMND - Exiting
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafamfnd[3578]: NO 
'safComp=IMMND,safSu=SC-1,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' 
: Recovery is 'componentRestart'
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafntfimcnd[3829]: ER saImmOiDispatch() Fail 
SA_AIS_ERR_BAD_HANDLE (9)
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafamfd[3565]: NO Re-initializing with IMM
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafimmd[3488]: NO IMMND coord at 2020f
..

Dec 19 17:21:49 SLES-64BIT-SLOT1 osafimmnd[3953]: NO Implementer connected: 40 
(OpenSafImmPBE) <0, 2020f>
Dec 19 17:21:49 SLES-64BIT-SLOT1 osafamfd[3565]: NO Finished re-initializing 
with IMM
Dec 19 17:21:50 SLES-64BIT-SLOT1 osafimmnd[3953]: NO PBE-OI established on 
other SC. Dumping incrementally to file imm.db
Dec 19 17:23:40 SLES-64BIT-SLOT1 osafamfnd[3578]: NO 
'safComp=LOG,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 
'csiSetcallbackTimeout' : Recovery is 'nodeFailfast'
Dec 19 17:23:40 SLES-64BIT-SLOT1 osafamfnd[3578]: ER 
safComp=LOG,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due 
to:csiSetcallbackTimeout Recovery is:nodeFailfast
Dec 19 17:23:40 SLES-64BIT-SLOT1 osafamfnd[3578]: Rebooting OpenSAF NodeId = 
131343 EE Name = , Reason: Component faulted: recovery is node failfast, 
OwnNodeId = 131343, SupervisionTime = 60
Dec 19 17:23:40 SLES-64BIT-SLOT1 opensaf_reboot: Rebooting local node; 
timeout=60

When LOGD trace is examined there is no information at that point of time for 
the

[tickets] [opensaf:tickets] #665 java: Missing calls to ReleaseIntArrayElements

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#665] java: Missing calls to ReleaseIntArrayElements**

**Status:** accepted
**Milestone:** 4.6.2
**Created:** Tue Dec 17, 2013 12:22 PM UTC by Anders Widell
**Last Updated:** Wed Jul 15, 2015 01:45 PM UTC
**Owner:** Anders Widell


In the file j_ais_socketUtil.c there are calls to GetIntArrayElements(), but no 
corresponding calls to ReleaseIntArrayElements(). Because of this, the garbage 
collector may not be able to reclaim the memory.


---


[tickets] [opensaf:tickets] #648 NTF IMCN: Reinitialize IMM API if OiImplementer set timeout

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#648] NTF IMCN: Reinitialize IMM API if OiImplementer set timeout**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Wed Dec 04, 2013 02:26 PM UTC by elunlen
**Last Updated:** Tue Sep 15, 2015 07:01 AM UTC
**Owner:** nobody


In imcn init:
If ERR_EXIST is returned, re-initialize the IMM API before changing the name.
If the OiImplementerSet API times out, also re-initialize.


---


[tickets] [opensaf:tickets] #689 rollback of campaign fails due to object not found

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#689] rollback of campaign fails due to object not found**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Fri Dec 20, 2013 01:16 PM UTC by surender khetavath
**Last Updated:** Wed Jul 15, 2015 01:43 PM UTC
**Owner:** nobody
**Attachments:**

- 
[sc1_logs.tgz](https://sourceforge.net/p/opensaf/tickets/689/attachment/sc1_logs.tgz)
 (22.3 MB; application/x-compressed-tar)


changeset : 4733
model : 2n
configuration : 1SG,5SUs,5SIs,SU1 to SU5 has 3comps each. 3CSIs in each SI

si-si deps configured as SI1<-SI2<-SI3<-SI4
SaAmfCSIAttribute is set for all the CSIs.
All SIs are initially in locked state and SUs are in lock-in.
SU1 mapped to SC-1
SU2 mapped to SC-2
SU3-mapped to PL-3
SU4-SU5 mapped to PL-4

Test:
A campaign is modelled to add one more SG with 2 SUs, each having one 
component, and 2 SIs with 1 CSI in each SI.

Rollback fails with the error below, ERR_NOT_EXIST, although the object 
actually exists.

osafsmfd log shows
Dec 20 18:37:19.382514 osafsmfd [20589:SmfUpgradeAction.cc:0584] ER 
SmfImmCcbAction::rollback failed to rollback CCB 
smfRollbackElement=ccb_0002,smfRollbackElement=ProcInit,safSmfProc=AddNewSG,safSmfCampaign=Campaign_4,safApp=safSmfService,
 rc=SA_AIS_ERR_NOT_EXIST (12)


immlist of object :
immlist 
smfRollbackElement=ccb_0002,smfRollbackElement=ProcInit,safSmfProc=AddNewSG,safSmfCampaign=Campaign_4,safApp=safSmfService
Name                      Type         Value(s)

smfRollbackElement        SA_STRING_T  smfRollbackElement=ccb_0002
SaImmAttrImplementerName  SA_STRING_T  safSmfProc=AddNewSG
SaImmAttrClassName        SA_STRING_T  OpenSafSmfRollbackElement
SaImmAttrAdminOwnerName   SA_STRING_T

/var/log/messages show
Dec 20 18:37:19 SC-1 osafsmfd[20589]: NO PROC: Rollback of procedure init 
actions
Dec 20 18:37:19 SC-1 osafsmfd[20589]: NO Execution of IMM operation failed, 
rc=SA_AIS_ERR_NOT_EXIST (12)
Dec 20 18:37:19 SC-1 osafsmfd[20589]: ER Rollback ccb operations failed for 
smfRollbackElement=ccb_0002,smfRollbackElement=ProcInit,safSmfProc=AddNewSG,safSmfCampaign=Campaign_4,safApp=safSmfService,
 rc=SA_AIS_ERR_NOT_EXIST (12)
Dec 20 18:37:19 SC-1 osafsmfd[20589]: ER SmfImmCcbAction::rollback failed to 
rollback CCB 
smfRollbackElement=ccb_0002,smfRollbackElement=ProcInit,safSmfProc=AddNewSG,safSmfCampaign=Campaign_4,safApp=safSmfService,
 rc=SA_AIS_ERR_NOT_EXIST (12)
Dec 20 18:37:19 SC-1 osafsmfd[20589]: NO SmfProcStateExecuting::rollbackInit: 
rollback of init action 2 failed, rc=SA_AIS_ERR_NOT_EXIST (12)
Dec 20 18:37:19 SC-1 osafsmfd[20589]: NO CAMP: Procedure safSmfProc=AddNewSG 
returned ROLLBACKFAILED



---


[tickets] [opensaf:tickets] #326 amf: proxied SU's presence state hangs at INSTANTIATING state.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#326] amf: proxied SU's presence state hangs at INSTANTIATING 
state.**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Fri May 24, 2013 09:34 AM UTC by Praveen
**Last Updated:** Thu Aug 06, 2015 10:26 AM UTC
**Owner:** nobody


Migrated from http://devel.opensaf.org/ticket/2213.

setup: 1 controller
 Model observed: TwoN
 

Configuration of proxy : 1 App, 1SG, 1SU, 1 proxy comps
 Configuration of proxied : 1App, 1SG, 1SU, 1 proxied component with 
saAmfCtCompCategory=12 


The proxy code is modelled to respond to AMF with ERR_FAILED_OP inside the 
SaAmfProxiedComponentInstantiateCallback() API
 

By default, the SU's of proxy and proxied are in locked-instantiation state. 


Scenario:
 



Bring up the proxy and proxied configuration. 
Do unlock-in and unlock of the proxy. The proxy should be up and running, and 
the proxied registration should be successful. 


Now do unlock-in of the proxied SU. The console output is:
 amf-adm unlock-in safSu=SU_pxd,safSg=SG_pxd,safApp=pxd_App
 error - saImmOmAdminOperationInvoke_2 FAILED: SA_AIS_ERR_TIMEOUT (5)
 

Retrying again gives the below output. 
SLES11-SLOT-2:/home/surender/amf # amf-adm unlock-in 
safSu=SU_pxd,safSg=SG_pxd,safApp=pxd_App
 error - saImmOmAdminOperationInvoke_2 admin-op RETURNED: SA_AIS_ERR_TRY_AGAIN 
(6)
 SLES11-SLOT-2:/home/surender/amf # amf-adm unlock-in 
safSu=SU_pxd,safSg=SG_pxd,safApp=pxd_App
 error - saImmOmAdminOperationInvoke_2 admin-op RETURNED: SA_AIS_ERR_TRY_AGAIN 
(6)
 

/var/log/messages output for above op's:
 Oct 11 15:13:15 SLES11-SLOT-2 osafamfnd[3852]: 
saAmfCtDefQuiescingCompleteTimeout for 
'safVersion=4.0.0,safCompType=Comp_nored' initialized with 
saAmfCtDefCallbackTimeout
 Oct 11 15:13:15 SLES11-SLOT-2 osafamfnd[3852]: 
'safSu=SU_mycomp,safSg=SG_mycomp,safApp=mycompApp' Presence State 
UNINSTANTIATED => INSTANTIATING
 Oct 11 15:13:16 SLES11-SLOT-2 osafamfnd[3852]: 
'safSu=SU_mycomp,safSg=SG_mycomp,safApp=mycompApp' Presence State INSTANTIATING 
=> INSTANTIATED
 Oct 11 15:13:16 SLES11-SLOT-2 osafamfnd[3852]: 
saAmfCtDefQuiescingCompleteTimeout for 
'safVersion=4.0.0,safCompType=Comp_pxd_basetype' initialized with 
saAmfCtDefCallbackTimeout
 Oct 11 15:13:41 SLES11-SLOT-2 osafamfnd[3852]: 
'safSu=SU_pxd,safSg=SG_pxd,safApp=pxd_App' Presence State UNINSTANTIATED => 
INSTANTIATING
 Oct 11 15:15:55 SLES11-SLOT-2 osafamfd[3711]: Admin operation is already going
 Oct 11 15:15:58 SLES11-SLOT-2 osafamfd[3711]: Admin operation is already going
 

SU states of proxy and proxied:
 safSu=SU_mycomp,safSg=SG_mycomp,safApp=mycompApp
 saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)
 

safSu=SU_pxd,safSg=SG_pxd,safApp=pxd_App
 saAmfSUAdminState=LOCKED(2)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATING(2)
 saAmfSUReadinessState=OUT-OF-SERVICE(1)
 

Comp state of proxy and proxied:
 safComp=mycomp,safSu=SU_mycomp,safSg=SG_mycomp,safApp=mycompApp
 saAmfCompOperState=ENABLED(1)
 saAmfCompPresenceState=INSTANTIATED(3)
 saAmfCompReadinessState=IN-SERVICE(2)
 

safComp=Comp_pxd,safSu=SU_pxd,safSg=SG_pxd,safApp=pxd_App
saAmfCompOperState=DISABLED(2)
 saAmfCompPresenceState=INSTANTIATING(2)
 saAmfCompReadinessState=OUT-OF-SERVICE(1)
 

Here the proxied comp is in DISABLED state, but its SU is in ENABLED state. 
Also the proxied comp waits in Instantiating state indefinitely. 



---


[tickets] [opensaf:tickets] #246 cpsv: Section create fails with random return values when multiple processes try to create sections in the same checkpoint 70 node setup.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#246] cpsv: Section create fails with random return values when 
mulitple processes try to create sections in the same checkpoint  70 node 
setup. **

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Thu May 16, 2013 06:37 AM UTC by A V Mahesh (AVM)
**Last Updated:** Mon Aug 10, 2015 07:25 AM UTC
**Owner:** A V Mahesh (AVM)


From http://devel.opensaf.org/ticket/2386

Changeset: 3065
Setup: 70 node SLES11 VM setup


Two applications run on each node of a 70-node setup. 


A collocated checkpoint is created. After one process sets the active replica, 
the remaining processes invoke section create with GENERATED_SECTION_ID as the 
section id. The section create then fails with ERR_EXIST, ERR_TIMEOUT or 
ERR_TRY_AGAIN.
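
Results collected from the individual processes can be summarised per return code. The sketch below uses the numeric values of the SAF AIS return codes; the split into retriable and unexpected codes, the helper name `summarise` and the sample input are assumptions made for illustration only.

```python
from collections import Counter
from enum import IntEnum


class SaAisErrorT(IntEnum):
    """Subset of SAF AIS return codes relevant to this report."""
    SA_AIS_OK = 1
    SA_AIS_ERR_TIMEOUT = 5
    SA_AIS_ERR_TRY_AGAIN = 6
    SA_AIS_ERR_EXIST = 14


# Assumption of this sketch: the caller may simply retry these codes.
RETRIABLE = {SaAisErrorT.SA_AIS_ERR_TIMEOUT, SaAisErrorT.SA_AIS_ERR_TRY_AGAIN}


def summarise(results):
    """Count codes and split the failures into retriable and unexpected."""
    counts = Counter(SaAisErrorT(code) for code in results)
    failures = [c for c in counts if c is not SaAisErrorT.SA_AIS_OK]
    return {
        "counts": {c.name: n for c, n in counts.items()},
        "retriable": sorted(c.name for c in failures if c in RETRIABLE),
        "unexpected": sorted(c.name for c in failures if c not in RETRIABLE),
    }


# Hypothetical sample: codes returned to six of the processes.
sample = [1, 1, 14, 5, 6, 6]
report = summarise(sample)
print(report["unexpected"])
```

With a generated section id, ERR_EXIST is the code that most clearly points at a service-side problem, which is why the sketch lists it separately.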


/var/log/messages for the two controllers will be shared.





---


[tickets] [opensaf:tickets] #239 cpsv : section create returns ERR_EXIST after few try agains on 70 node cluster

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#239] cpsv : section create returns ERR_EXIST after few try agains 
on 70 node cluster**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Thu May 16, 2013 06:19 AM UTC by A V Mahesh (AVM)
**Last Updated:** Tue Aug 11, 2015 06:19 AM UTC
**Owner:** A V Mahesh (AVM)


From http://devel.opensaf.org/ticket/3042

This is seen on a 70-node SLES VM setup. One checkpoint application runs on 
each node.


1) The checkpoint application on the active controller creates an asynchronous 
collocated checkpoint. The applications on the other nodes open the same 
checkpoint.
2) The replica is set active on the active controller and a section is created.
3) The section create API returns TRY_AGAIN a few times and then returns 
ERR_EXIST.


When the application gets TRY_AGAIN, the section should not have been created 
in the checkpoint. This is not always reproducible. 


snippet from test journal:


520|0 15 00130961 1 21| FAILED : Section 11 created in active colloc ckpt
520|0 15 00130961 1 22| Return Value : SA_AIS_ERR_TRY_AGAIN
520|0 15 00130961 1 23|
520|0 15 00130961 1 24| Try again count : 8 
520|0 15 00130961 1 25|
520|0 15 00130961 1 26| FAILED : Section 11 created in active colloc ckpt 
520|0 15 00130961 1 27| Return Value : SA_AIS_ERR_EXIST
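
The sequence in the journal can be reproduced against a stand-in to show how a caller might detect it. This is a minimal sketch under stated assumptions: `FakeCheckpoint` is a hypothetical simulation of the faulty behaviour (the section is stored although TRY_AGAIN is returned) and `create_with_retry` is a hypothetical caller-side loop; neither is OpenSAF code.

```python
SA_AIS_OK = 1
SA_AIS_ERR_TRY_AGAIN = 6
SA_AIS_ERR_EXIST = 14


class FakeCheckpoint:
    """Stand-in for the service; reproduces the sequence in the journal."""

    def __init__(self, try_agains):
        self.try_agains = try_agains
        self.sections = set()

    def section_create(self, section_id):
        if section_id in self.sections and self.try_agains == 0:
            return SA_AIS_ERR_EXIST
        if self.try_agains > 0:
            # Faulty behaviour under test: the section is stored even
            # though the caller is told to try again.
            self.sections.add(section_id)
            self.try_agains -= 1
            return SA_AIS_ERR_TRY_AGAIN
        self.sections.add(section_id)
        return SA_AIS_OK


def create_with_retry(ckpt, section_id, max_attempts=20):
    """Retry on TRY_AGAIN; report whether ERR_EXIST followed a TRY_AGAIN."""
    retries = 0
    for _ in range(max_attempts):
        rc = ckpt.section_create(section_id)
        if rc == SA_AIS_ERR_TRY_AGAIN:
            retries += 1
            continue
        leaked = rc == SA_AIS_ERR_EXIST and retries > 0
        return rc, retries, leaked
    return SA_AIS_ERR_TRY_AGAIN, retries, False


# Section 11 with eight TRY_AGAIN replies, as in the journal snippet.
rc, retries, leaked = create_with_retry(FakeCheckpoint(try_agains=8), 11)
print(rc, retries, leaked)
```

A real caller would normally sleep between attempts; the delay is left out here to keep the sketch short.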


Attaching CPD and CPND traces of both the controllers





---


[tickets] [opensaf:tickets] #272 checkpoint overwrite returns timeout when controllers are running with different compatible versions

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#272] checkpoint overwrite returns timeout when controllers are 
running with different compatible versions**

**Status:** assigned
**Milestone:** 4.6.2
**Created:** Fri May 17, 2013 11:40 AM UTC by Sirisha Alla
**Last Updated:** Tue Aug 11, 2015 06:17 AM UTC
**Owner:** A V Mahesh (AVM)
**Attachments:**

- 
[logs.tar.gz](https://sourceforge.net/p/opensaf/tickets/272/attachment/logs.tar.gz)
 (175.5 kB; application/x-gzip)


The issue is seen on an OEL6.4 TCP setup. The changeset being used is 4241 with 
patches 2794 and 3117.

The active controller (SC-1) is running version 4.3 while the standby 
controller (SC-2) is running cs3533 (4.2.x).

A non-collocated checkpoint replica is created on the active controller.
A section is created in the checkpoint.
The write and read APIs are successful, but the overwrite API returns timeout 
for 5 seconds, after which the application times out and exits.

No ckptnd or agent crashes are observed. When the same application is run on 
SC-2, it runs without any error.

Attaching the journal and the traces of ckptnd and ckptd on both controllers.


---


[tickets] [opensaf:tickets] #159 AVSv to handle NTF Send TRY_AGAIN scenarios

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> 4.6.2



---

** [tickets:#159] AVSv to handle NTF Send TRY_AGAIN scenarios**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Tue May 14, 2013 04:08 AM UTC by Nagendra Kumar
**Last Updated:** Fri Aug 07, 2015 10:20 AM UTC
**Owner:** nobody


Migrated from http://devel.opensaf.org/ticket/967

AVSv to handle NTF Send TRY_AGAIN scenarios.


While analysing ticket #954 (unstable test setup), a lot of notification send 
failures were observed when NTF had returned TRY_AGAIN.


TRY_AGAIN handling should be in place for AVSv notifications.
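
One way to picture the requested handling is a bounded retry around the send call. This is a minimal sketch only: `send_with_retry` and the `send` callable are hypothetical names, the attempt limit and delays are arbitrary, and a real daemon would more likely reschedule the send from its event loop than block in a sleep.

```python
import time

SA_AIS_OK = 1
SA_AIS_ERR_TRY_AGAIN = 6


def send_with_retry(send, max_attempts=5, delay=0.01, sleep=time.sleep):
    """Call send() until it stops returning TRY_AGAIN or attempts run out."""
    rc = SA_AIS_ERR_TRY_AGAIN
    for attempt in range(max_attempts):
        rc = send()
        if rc != SA_AIS_ERR_TRY_AGAIN:
            break
        if attempt < max_attempts - 1:
            # Back off a little longer after every TRY_AGAIN.
            sleep(delay * (2 ** attempt))
    return rc


# Hypothetical sender that is busy twice before accepting the notification.
replies = iter([SA_AIS_ERR_TRY_AGAIN, SA_AIS_ERR_TRY_AGAIN, SA_AIS_OK])
result = send_with_retry(lambda: next(replies))
print(result)
```

If every attempt returns TRY_AGAIN, the last return code is handed back so the caller can log the failure instead of dropping the notification silently.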


Changed 3 years ago by mathi
- component changed from unknown to AvSv

Changed 3 years ago by murthy
- milestone changed from PL 3.0.2 to 4.0.0-RC1

Changed 3 years ago by hafe
- priority changed from major to minor

I haven't seen any need for TRY-AGAIN handling of the NTF interface in AMF. 
Since the NTF API is now only used from amfd, it is in total control of when it 
can use NTF.


If NTF would be used from amfnd it would require TRY-AGAIN, but that is not the 
case now.


Lowering this prio.


Changed 2 years ago by jfournier
- milestone changed from 4.0.RC1 to 4.0.1



---


[tickets] [opensaf:tickets] #68 failover didnot succeed and cluster got reset due to MDS problems.

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.2 --> never



---

** [tickets:#68] failover didnot succeed and cluster got reset due to MDS 
problems.**

**Status:** not-reproducible
**Milestone:** never
**Created:** Sat May 11, 2013 05:22 PM UTC by surender khetavath
**Last Updated:** Tue Sep 08, 2015 04:57 AM UTC
**Owner:** A V Mahesh (AVM)
**Attachments:**

- [logs.tgz](https://sourceforge.net/p/opensaf/tickets/68/attachment/logs.tgz) 
(16.2 MB; application/x-compressed-tar)
- 
[AppConfig-2N-68.xml](https://sourceforge.net/p/opensaf/tickets/68/attachment/AppConfig-2N-68.xml)
 (23.1 kB; text/xml)


Changeset: 4241 with patches 2794 and 3117
Model: TwoN
Configuration: 1 App, 1 SG, 4 SUs with 3 comps each and 5 SIs with 3 CSIs each
Transport: TCP/ipv6-linklocal
PBE enabled. 

Scenario:
sc1 was active and sc2 standby.
The active SU on sc1 was shut down and a component was made to reject the 
quiescing assignment. The component was restarted 10 times, as 
compRestartMax=10, and the fault then escalated to suFailover and then to 
nodefailover. 

sc-2 did not become active and eventually rebooted, causing a cluster reset. 

syslog on sc-1:
--
May 11 21:24:49 sc-1 osafimmnd[4683]: WA Error code 2 returned for message type 
21 - ignoring
May 11 21:24:49 sc-1 osafamfnd[4790]: NO Received reboot order, ordering reboot 
now!
May 11 21:24:49 sc-1 osafamfnd[4790]: Rebooting OpenSAF NodeId = 131343 EE Name 
= , Reason: Received reboot order
May 11 21:24:49 sc-1 opensaf_reboot: Rebooting local node
May 11 21:24:49 sc-1 osafimmnd[4683]: WA MESSAGE:5319 OUT OF ORDER my highest 
processed:5317, exiting
May 11 21:24:49 sc-1 osafimmpbed: WA PBE lost contact with parent IMMND - 
Exiting
May 11 21:24:49 sc-1 osafntfimcnd[4734]: ER saImmOiDispatch() Fail 
SA_AIS_ERR_BAD_HANDLE (9)
May 11 21:24:49 sc-1 osafimmd[4668]: WA IMMND coordinator at 2010f apparently 
crashed => electing new coord
May 11 21:24:49 sc-1 osafimmd[4668]: ER Failed to find candidate for new IMMND 
coordinator
May 11 21:24:49 sc-1 osafimmd[4668]: ER Active IMMD has to restart the IMMSv. 
All IMMNDs will restart
May 11 21:24:49 sc-1 osafimmd[4668]: ER IMM RELOAD  => ensure cluster restart 
by IMMD exit at both SCs, exiting


syslog on sc-2:

May 11 21:24:49 sc-2 osafimmd[3894]: WA IMMD not re-electing coord for 
switch-over (si-swap) coord at (2010f)
May 11 21:24:49 sc-2 osafntfimcnd[3969]: NO exiting on signal 15
May 11 21:24:49 sc-2 osafsmfd[4052]: ER amf_active_state_handler oi activate 
FAILED
May 11 21:24:49 sc-2 osafamfnd[4023]: NO 
'safComp=SMF,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 
'csiSetcallbackFailed' : Recovery is 'nodeFailfast'
May 11 21:24:49 sc-2 osafamfnd[4023]: ER 
safComp=SMF,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due 
to:csiSetcallbackFailed Recovery is:nodeFailfast
May 11 21:24:49 sc-2 osafamfnd[4023]: Rebooting OpenSAF NodeId = 131599 EE Name 
= , Reason: Component faulted: recovery is node failfast
May 11 21:24:49 sc-2 osafmsgd[4216]: ER mqd_imm_declare_implementer failed: err 
= 14
May 11 21:24:49 sc-2 osafckptd[4202]: ER cpd immOiImplmenterSet failed with err 
= 14
May 11 21:24:49 sc-2 opensaf_reboot: Rebooting local node
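
The logs above print node ids both in decimal (`NodeId = 131343`) and in hex (`coord at (2010f)`), so converting between the two helps when correlating lines. The conversion itself is plain arithmetic; the split into chassis, slot and subslot fields in `decode_node_id` is an assumption of this sketch inferred from these values, not something the report states.

```python
def node_id_to_hex(node_id):
    """Render a decimal node id the way the IMM log lines print it."""
    return format(node_id, "x")


def decode_node_id(node_id):
    """Split a node id into fields; the field layout is an assumption."""
    return {
        "chassis": node_id >> 16,
        "slot": (node_id >> 8) & 0xFF,
        "subslot": node_id & 0xFF,
    }


for decimal_id in (131343, 131599):
    print(decimal_id, node_id_to_hex(decimal_id), decode_node_id(decimal_id))
```

Under that assumption, 131343 (2010f) has slot 1 and 131599 (2020f) has slot 2, which matches the sc-1 and sc-2 hosts that logged them.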



---


[tickets] [opensaf:tickets] #1157 MDS: IMMD coredumps in MDS BCAST send (TCP with MCAST_ADDR)

2015-11-02 Thread Anders Widell

- **Milestone**: 4.5.0 --> never



---

** [tickets:#1157] MDS: IMMD coredumps in MDS BCAST send (TCP with MCAST_ADDR)**

**Status:** duplicate
**Milestone:** never
**Created:** Tue Oct 07, 2014 12:57 AM UTC by Adrian Szwej
**Last Updated:** Fri Oct 10, 2014 06:00 PM UTC
**Owner:** nobody
**Attachments:**

- 
[immd.core](https://sourceforge.net/p/opensaf/tickets/1157/attachment/immd.core)
 (25.2 kB; application/octet-stream)


Changeset: **4.6.M0 - 6009:b2ddaa23aae4**
When starting ~50 Linux containers, IMMD dumps core, resulting in a cluster 
reset.
Communication is over TCP.
The dtmd.conf configuration is:

DTM_SOCK_SND_RCV_BUF_SIZE=65536
DTM_CLUSTER_ID=1
DTM_NODE_IP=172.17.1.42
DTM_MCAST_ADDR=224.0.0.6
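
A quick sanity check of these values can be scripted with the standard library. This is a sketch that assumes the file consists of simple `KEY=VALUE` lines with optional `#` comments; `parse_conf` is a hypothetical helper, not an OpenSAF tool.

```python
import ipaddress


def parse_conf(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


conf = parse_conf("""DTM_SOCK_SND_RCV_BUF_SIZE=65536
DTM_CLUSTER_ID=1
DTM_NODE_IP=172.17.1.42
DTM_MCAST_ADDR=224.0.0.6
""")

mcast = ipaddress.ip_address(conf["DTM_MCAST_ADDR"])
node_ip = ipaddress.ip_address(conf["DTM_NODE_IP"])
print(mcast.is_multicast, node_ip.is_private, int(conf["DTM_SOCK_SND_RCV_BUF_SIZE"]))
```

The check confirms that DTM_MCAST_ADDR is a multicast address, which is the setting the ticket title refers to.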

BatchSize is reduced to 4096:

opensafImm=opensafImm,safApp=safImmService
Name                     Type         Value(s)
opensafImmSyncBatchSize  SA_UINT32_T  4096 (0x1000)

When node PL-51 joins the cluster, the following messages are seen in the 
syslog:

Oct  6 00:35:57 SC-1 osafdtmd[1028]: NO Established contact with 'PL-51'
Oct  6 00:35:57 SC-1 osafimmd[1063]: NO Extended intro from node 2330f
Oct  6 00:35:57 SC-1 osafimmd[1063]: NO Node 2330f request sync sync-pid:79 
epoch:0 
Oct  6 00:35:58 SC-1 osafimmnd[1072]: NO Announce sync, epoch:292
Oct  6 00:35:58 SC-1 osafimmnd[1072]: NO SERVER STATE: IMM_SERVER_READY --> 
IMM_SERVER_SYNC_SERVER
Oct  6 00:35:58 SC-1 osafimmnd[1072]: NO NODE STATE-> IMM_NODE_R_AVAILABLE
Oct  6 00:35:58 SC-1 osafimmd[1063]: NO Successfully announced sync. New 
ruling epoch:292
Oct  6 00:35:58 SC-1 osafimmloadd: NO Sync starting
Oct  6 00:36:00 SC-1 osafimmd[1063]:  MDTM unsent message is more!=200
Oct  6 00:36:00 SC-1 osafimmnd[1072]: WA Director Service in NOACTIVE state 
- fevs replies pending:9 fevs highest processed:20037
Oct  6 00:36:00 SC-1 osafamfnd[1143]: NO 
'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Oct  6 00:36:00 SC-1 osafamfnd[1143]: ER 
safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Oct  6 00:36:00 SC-1 osafamfnd[1143]: Rebooting OpenSAF NodeId = 131343 EE 
Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 
131343, SupervisionTime = 60
Oct  6 00:36:00 SC-1 opensaf_reboot: Rebooting local node; timeout=60
Oct  6 00:36:00 SC-1 osafimmnd[1072]: NO No IMMD service => cluster 
restart, exiting

There is a coredump generated:
core_1412555760.osafimmd.1063





---

