[tickets] [opensaf:tickets] #1115 AMF: osafamfnd fails: Assertion 'comp->hc_list.n_nodes == 0' failed

2014-09-18 Thread Hans Feldt
--- ** [tickets:#1115] AMF: osafamfnd fails: Assertion 'comp->hc_list.n_nodes == 0' failed** **Status:** accepted **Milestone:** 4.3.3 **Created:** Fri Sep 19, 2014 06:56 AM UTC by Hans Feldt **Last Updated:** Fri Sep 19, 2014 06:56 AM UTC **Owner:** Hans Feldt If a non SA aware component ha

[tickets] [opensaf:tickets] #1060 AMF: reset of cluster startup timer does not happen (#76)

2014-09-18 Thread Nagendra Kumar
- Description has changed: Diff: --- old +++ new @@ -1,4 +1,3 @@ - When all SUs are ENABLED and INSTANTIATED it was intended in #76 that assignments should be done and not wait for the cluster start timeout. This does not work. It is not a problem since once the timer expires assignments

Re: [tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize

2014-09-18 Thread ramesh betham
Hans, We have finished verifying the patch series updated in ticket 1050 and the results are good. The mail below has my review comment; please consider this an official Ack on these patches. Thanks and Regards, Ramesh. On 9/18/2014 11:03 AM, ramesh betham wrote: Hi Hans, Thanks for provid

[tickets] [opensaf:tickets] #1114 NTF: Unadapted LongDns consumer crashes due to read/subsribe long dn notification

2014-09-18 Thread Minh Hon Chau
- **summary**: NTF: Unadapted LongDns consumer crashes due to long dn notification --> NTF: Unadapted LongDns consumer crashes due to read/subsribe long dn notification --- ** [tickets:#1114] NTF: Unadapted LongDns consumer crashes due to read/subsribe long dn notification** **Status:** ass

[tickets] [opensaf:tickets] #1114 NTF: Unadapted LongDns consumer crashes due to long dn notification

2014-09-18 Thread Minh Hon Chau
- **status**: unassigned --> assigned --- ** [tickets:#1114] NTF: Unadapted LongDns consumer crashes due to long dn notification** **Status:** assigned **Milestone:** 4.5.0 **Created:** Fri Sep 19, 2014 05:15 AM UTC by Minh Hon Chau **Last Updated:** Fri Sep 19, 2014 05:15 AM UTC **Owner:** M

[tickets] [opensaf:tickets] #1114 NTF: Unadapted LongDns consumer crashes due to long dn notification

2014-09-18 Thread Minh Hon Chau
--- ** [tickets:#1114] NTF: Unadapted LongDns consumer crashes due to long dn notification** **Status:** unassigned **Milestone:** 4.5.0 **Created:** Fri Sep 19, 2014 05:15 AM UTC by Minh Hon Chau **Last Updated:** Fri Sep 19, 2014 05:15 AM UTC **Owner:** Minh Hon Chau In a long dn upgraded

[tickets] [opensaf:tickets] Re: #1111 AMF: Reject SC swichover (si-swap) when active ccb modifying amf-data exists

2014-09-18 Thread Anders Bjornerstedt
Anders Björnerstedt wrote: > In my last reply below I refer to ticket #1008 but meant ticket #1108. > > Ticket #1108: AMF: Implement use of immsv admin-op for aborting non-critical > CCBs > Ticket #: AMF: Reject SC swichover (si-swap) when active ccb modifying > amf-data exists > > So I me

Re: [tickets] [opensaf:tickets] #1111 AMF: Reject SC swichover (si-swap) when active ccb modifying amf-data exists

2014-09-18 Thread Anders Bjornerstedt
Anders Björnerstedt wrote: > In my last reply below I refer to ticket #1008 but meant ticket #1108. > > Ticket #1108: AMF: Implement use of immsv admin-op for aborting non-critical > CCBs > Ticket #: AMF: Reject SC swichover (si-swap) when active ccb modifying > amf-data exists > > So I me

[tickets] [opensaf:tickets] #1111 AMF: Reject SC swichover (si-swap) when active ccb modifying amf-data exists

2014-09-18 Thread Anders Bjornerstedt
In my last reply below I refer to ticket #1008 but meant ticket #1108. Ticket #1108: AMF: Implement use of immsv admin-op for aborting non-critical CCBs Ticket #: AMF: Reject SC swichover (si-swap) when active ccb modifying amf-data exists So I meant to say: Ticket #1008 could possibly b

Re: [tickets] [opensaf:tickets] #1111 AMF: Reject SC swichover (si-swap) when active ccb modifying amf-data exists

2014-09-18 Thread Anders Björnerstedt
In my last reply below I refer to ticket #1008 but meant ticket #1108. Ticket #1108: AMF: Implement use of immsv admin-op for aborting non-critical CCBs Ticket #: AMF: Reject SC swichover (si-swap) when active ccb modifying amf-data exists So I meant to say: Ticket #1008 could possibly b

[tickets] [opensaf:tickets] #1070 Access control don't check the primary group

2014-09-18 Thread Anders Bjornerstedt
- **Milestone**: 4.6.FC --> 4.5.0 --- ** [tickets:#1070] Access control don't check the primary group** **Status:** assigned **Milestone:** 4.5.0 **Created:** Fri Sep 12, 2014 07:59 PM UTC by Adrian Szwej **Last Updated:** Fri Sep 12, 2014 08:04 PM UTC **Owner:** Hans Feldt Access control doe

[tickets] [opensaf:tickets] #1072 Sync stop after few payload nodes joining the cluster (TCP)

2014-09-18 Thread Adrian Szwej
Without the patch there is no coredump, but it times out in three minutes and then immd exits. I provide traces. Attachment: ticket-1072-vanilla-opensaf.tar (1.3 MB; application/x-tar) --- ** [tickets:#1072] Sync stop after few payload nodes joining the cluster (TCP)** **Status:** unassigned **Miles

Re: [tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize

2014-09-18 Thread Hans Feldt
No I used the (to me) standard mechanism available for local connected sockets that I was aware of. It is used in other similar situations. /Hans > -Original Message- > From: Mathivanan Naickan Palanivelu [mailto:mathi.naic...@oracle.com] > Sent: den 18 september 2014 16:55 > To: ramesh.b

[tickets] [opensaf:tickets] #1112 2pbe: immnd crashed on all nodes and led to cluster reset

2014-09-18 Thread Anders Bjornerstedt
- **Priority**: critical --> major - **Comment**: In summary. This system appears to be so corrupt that it is hard to debug. Particularly with only one syslog. Reducing severity to major until I get better information. The ticket in its current state is not complete. --- ** [tickets:#1112]

[tickets] [opensaf:tickets] #1112 2pbe: immnd crashed on all nodes and led to cluster reset

2014-09-18 Thread Anders Bjornerstedt
I have looked at the only syslog provided, and what I see first is a long sequence of successful failovers back and forth. The normal sequence for failover is: 1) FMD reports node down event for peer. Sep 18 13:49:35 SC-2 osaffmd[2791]: NO Node Down event for node id 2010f: 2) TIPC link loss

Re: [tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize

2014-09-18 Thread Mathivanan Naickan Palanivelu
Did you not consider using a/the security-key exchanged between the client and server, as the 'key' to lookup/store from MDS? Mathi. - ramesh.bet...@oracle.com wrote: > Hi Hans, > > Thanks for providing the traces. These traces gave more clarity about > the race condition happening betwee

[tickets] [opensaf:tickets] #1073 --disable-ais-plm does not work on fedora

2014-09-18 Thread Alex Jones
- **status**: unassigned --> accepted - **assigned_to**: Alex Jones --- ** [tickets:#1073] --disable-ais-plm does not work on fedora** **Status:** accepted **Milestone:** 4.6.FC **Created:** Fri Sep 12, 2014 09:29 PM UTC by Adrian Szwej **Last Updated:** Fri Sep 12, 2014 09:30 PM UTC **Owner:*

[tickets] [opensaf:tickets] #1103 imm: uninitialized error code in CCB object delete response

2014-09-18 Thread Neelakanta Reddy
- **status**: review --> fixed - **Comment**: [staging:404779] [staging:ddde42] [staging:e2e952] [staging:f9a0d1] changeset: 5842:f9a0d1fda045 tag: tip parent: 5838:f3d517c14db8 user:Neelakanta Reddy date:Thu Sep 18 19:42:53 2014 +0530 summary: imm:corrected u

[tickets] [opensaf:tickets] #1112 2pbe: immnd crashed on all nodes and led to cluster reset

2014-09-18 Thread Anders Bjornerstedt
The reported crash is on SC1 but logs are only provided for SC2. Please provide logs for SC1 also, covering the same time period. --- ** [tickets:#1112] 2pbe: immnd crashed on all nodes and led to cluster reset** **Status:** assigned **Milestone:** 4.3.3 **Created:** Thu Sep 18, 2014 11:07 AM

[tickets] [opensaf:tickets] #1103 imm: uninitialized error code in CCB object delete response

2014-09-18 Thread Neelakanta Reddy
https://sourceforge.net/p/opensaf/mailman/message/32844840/ https://sourceforge.net/p/opensaf/mailman/message/32844841/ --- ** [tickets:#1103] imm: uninitialized error code in CCB object delete response** **Status:** review **Milestone:** 4.3.3 **Created:** Wed Sep 17, 2014 09:25 AM UTC by Zora

[tickets] [opensaf:tickets] #1103 imm: uninitialized error code in CCB object delete response

2014-09-18 Thread Neelakanta Reddy
- **status**: accepted --> review --- ** [tickets:#1103] imm: uninitialized error code in CCB object delete response** **Status:** review **Milestone:** 4.3.3 **Created:** Wed Sep 17, 2014 09:25 AM UTC by Zoran Milinkovic **Last Updated:** Thu Sep 18, 2014 11:47 AM UTC **Owner:** Neelakanta Re

[tickets] [opensaf:tickets] #1051 ais_name_borrow/lend ruins trace of services

2014-09-18 Thread Anders Widell
- **status**: review --> fixed - **Comment**: changeset: 5837:a033c8902c4e branch: opensaf-4.5.x parent: 5835:6c3c09882f97 user:Anders Widell date:Thu Sep 18 13:17:01 2014 +0200 summary: osaf: Remove trace from saAisNameBorrow() and saAisNameLend() [#1051] change

[tickets] [opensaf:tickets] #1080 2PBE: pbed crashed at immpbe_dump.cc:2273

2014-09-18 Thread Anders Bjornerstedt
- **status**: unassigned --> assigned - **assigned_to**: Anders Bjornerstedt - **Milestone**: 4.3.3 --> 4.4.1 --- ** [tickets:#1080] 2PBE: pbed crashed at immpbe_dump.cc:2273** **Status:** assigned **Milestone:** 4.4.1 **Created:** Mon Sep 15, 2014 11:32 AM UTC by Sirisha Alla **Last Updated:*

[tickets] [opensaf:tickets] #1091 2PBE: class create timesout before default SYNCR_TIMEOUT

2014-09-18 Thread Anders Bjornerstedt
- **assigned_to**: Anders Bjornerstedt --- ** [tickets:#1091] 2PBE: class create timesout before default SYNCR_TIMEOUT** **Status:** assigned **Milestone:** 4.4.1 **Created:** Mon Sep 15, 2014 03:45 PM UTC by Sirisha Alla **Last Updated:** Thu Sep 18, 2014 11:51 AM UTC **Owner:** Anders Bjorne

[tickets] [opensaf:tickets] #1091 2PBE: class create timesout before default SYNCR_TIMEOUT

2014-09-18 Thread Anders Bjornerstedt
- **status**: unassigned --> assigned --- ** [tickets:#1091] 2PBE: class create timesout before default SYNCR_TIMEOUT** **Status:** assigned **Milestone:** 4.4.1 **Created:** Mon Sep 15, 2014 03:45 PM UTC by Sirisha Alla **Last Updated:** Tue Sep 16, 2014 05:28 AM UTC **Owner:** nobody The is

[tickets] [opensaf:tickets] #1112 2pbe: immnd crashed on all nodes and led to cluster reset

2014-09-18 Thread Anders Bjornerstedt
- **status**: unassigned --> assigned - **assigned_to**: Anders Bjornerstedt --- ** [tickets:#1112] 2pbe: immnd crashed on all nodes and led to cluster reset** **Status:** assigned **Milestone:** 4.3.3 **Created:** Thu Sep 18, 2014 11:07 AM UTC by surender khetavath **Last Updated:** Thu Sep 1

[tickets] [opensaf:tickets] #1103 imm: uninitialized error code in CCB object delete response

2014-09-18 Thread Neelakanta Reddy
- **status**: unassigned --> accepted - **assigned_to**: Neelakanta Reddy --- ** [tickets:#1103] imm: uninitialized error code in CCB object delete response** **Status:** accepted **Milestone:** 4.3.3 **Created:** Wed Sep 17, 2014 09:25 AM UTC by Zoran Milinkovic **Last Updated:** Wed Sep 17,

[tickets] [opensaf:tickets] Re: #1111 AMF: Reject SC swichover (si-swap) when active ccb modifying amf-data exists

2014-09-18 Thread Anders Bjornerstedt
Ticket #1008 is an alternative to this ticket. Ticket #1008 could possibly be closed if a fix of this ticket prevents all cases of a ccb existing during si-swap that contains ccb-operations on AMF objects. /AndersBj From: Nagendra Kumar [mailto:nagendr...@users.

Re: [tickets] [opensaf:tickets] #1111 AMF: Reject SC swichover (si-swap) when active ccb modifying amf-data exists

2014-09-18 Thread Anders Björnerstedt
Ticket #1008 is an alternative to this ticket. Ticket #1008 could possibly be closed if a fix of this ticket prevents all cases of a ccb existing during si-swap that contains ccb-operations on AMF objects. /AndersBj From: Nagendra Kumar [mailto:nagendr...@users.

[tickets] [opensaf:tickets] Re: #1108 AMF: Implement use of immsv admin-op for aborting non-critical CCBs

2014-09-18 Thread Anders Bjornerstedt
I don't understand what you mean by "proprietary implementation". What is the problem? If IMM enhancement #1007 is implemented, then this enhancement could be implemented using it. This ticket #1008 depends on #1007. What's the problem? There is an alternative though: ticket #. /AndersBj _

[tickets] [opensaf:tickets] #1113 AMF: add support for MW standbyErrorRecovery

2014-09-18 Thread Hans Feldt
- Description has changed: Diff: --- old +++ new @@ -2,6 +2,8 @@ Normally there is no point in rebooting the standby controller node if a component of the hosted MW 2N SU fails. This is the default behavior today. It should be enough with component restart and have normal error escalat

[tickets] [opensaf:tickets] #1113 AMF: add support for MW standbyErrorRecovery

2014-09-18 Thread Hans Feldt
--- ** [tickets:#1113] AMF: add support for MW standbyErrorRecovery** **Status:** unassigned **Milestone:** 4.6.FC **Created:** Thu Sep 18, 2014 11:35 AM UTC by Hans Feldt **Last Updated:** Thu Sep 18, 2014 11:35 AM UTC **Owner:** nobody To improve system HA a separate standby entity error re

[tickets] [opensaf:tickets] Re: #707 Quiesced controller failed to become Active when the standby controller rebooted in middle of switchover

2014-09-18 Thread Anders Bjornerstedt
Hi Nags, Do you agree with the point I added to this ticket?: The likely cause is that an RT update is attempted by AMFD using the oi-handle after it has released implementer and before it has restored that implementer. An saImmOiRtObjectUpdate with an oi-handle that has no imple

[tickets] [opensaf:tickets] #1110 NTF healthcheck callback timedout leading to node reboot

2014-09-18 Thread Mathi Naickan
- **assigned_to**: Praveen --- ** [tickets:#1110] NTF healthcheck callback timedout leading to node reboot** **Status:** unassigned **Milestone:** 4.3.3 **Created:** Thu Sep 18, 2014 07:41 AM UTC by Sirisha Alla **Last Updated:** Thu Sep 18, 2014 07:41 AM UTC **Owner:** Praveen This issue is

[tickets] [opensaf:tickets] #1109 standby failed to come up during failover

2014-09-18 Thread Mathi Naickan
- **status**: unassigned --> duplicate - **Comment**: This is because of 1110. Traces for 1110 are necessary if the issue is reproducible. --- ** [tickets:#1109] standby failed to come up during failover** **Status:** duplicate **Milestone:** 4.3.3 **Created:** Thu Sep 18, 2014 07:33 AM UTC b

[tickets] [opensaf:tickets] #1112 2pbe: immnd crashed on all nodes and led to cluster reset

2014-09-18 Thread surender khetavath
- **summary**: immnd crashed on all nodes and led to cluster reset --> 2pbe: immnd crashed on all nodes and led to cluster reset --- ** [tickets:#1112] 2pbe: immnd crashed on all nodes and led to cluster reset** **Status:** unassigned **Milestone:** 4.3.3 **Created:** Thu Sep 18, 2014 11:07 A

[tickets] [opensaf:tickets] #1112 immnd crashed on all nodes and led to cluster reset

2014-09-18 Thread surender khetavath
--- ** [tickets:#1112] immnd crashed on all nodes and led to cluster reset** **Status:** unassigned **Milestone:** 4.3.3 **Created:** Thu Sep 18, 2014 11:07 AM UTC by surender khetavath **Last Updated:** Thu Sep 18, 2014 11:07 AM UTC **Owner:** nobody changeset : 5697 As part of failovers th

[tickets] [opensaf:tickets] #1111 AMF: Reject SC swichover (si-swap) when active ccb modifying amf-data exists

2014-09-18 Thread Nagendra Kumar
Check my comment for #1108. --- ** [tickets:#] AMF: Reject SC swichover (si-swap) when active ccb modifying amf-data exists** **Status:** unassigned **Milestone:** 4.6.FC **Created:** Thu Sep 18, 2014 08:25 AM UTC by Anders Bjornerstedt **Last Updated:** Thu Sep 18, 2014 08:26 AM UTC **Own

[tickets] [opensaf:tickets] #1108 AMF: Implement use of immsv admin-op for aborting non-critical CCBs

2014-09-18 Thread Nagendra Kumar
Well, this looks like a proprietary implementation; some other alternative needs to be evaluated. --- ** [tickets:#1108] AMF: Implement use of immsv admin-op for aborting non-critical CCBs** **Status:** unassigned **Milestone:** 4.6.FC **Created:** Thu Sep 18, 2014 06:39 AM UTC by Anders Bjornerstedt

[tickets] [opensaf:tickets] #707 Quiesced controller failed to become Active when the standby controller rebooted in middle of switchover

2014-09-18 Thread Nagendra Kumar
Hi Anders, This ticket needs synchronization between the Amfd thread and the thread spawned for IMM APIs to handle bad_handle. I am not sure whether to use a mutex, as it would make the Amfd thread wait anyway. Since most of the flows hit IMM interactions, it is bound to delay Amfd HA. So, what is

[tickets] [opensaf:tickets] #1061 imm: memory leak in dumping resources in PBE

2014-09-18 Thread Neelakanta Reddy
- **status**: review --> fixed - **Comment**: [staging:6c3c09] [staging:481a50] changeset: 5836:481a5002d33a tag: tip parent: 5834:605f4ee23194 user:Neelakanta Reddy date:Thu Sep 18 15:40:54 2014 +0530 summary: imm:freeing the allocated memory in dumping resour

[tickets] [opensaf:tickets] #1069 NTF: Incorrect or no validation of ntfsend option attributes

2014-09-18 Thread Praveen
Version 2 published for review "tools/safntf : validate ntfsend options V2 [#1069]" --- ** [tickets:#1069] NTF: Incorrect or no validation of ntfsend option attributes** **Status:** review **Milestone:** 4.3.3 **Created:** Fri Sep 12, 2014 02:26 PM UTC by elunlen **Last Updated:** Wed Sep 17

[tickets] [opensaf:tickets] #1111 AMF: Reject SC swichover (si-swap) when active ccb modifying amf-data exists

2014-09-18 Thread Anders Bjornerstedt
- **summary**: AMF: Reject SC swichover (si-swap) when active ccb modifyinc amf-data exists --> AMF: Reject SC swichover (si-swap) when active ccb modifying amf-data exists --- ** [tickets:#] AMF: Reject SC swichover (si-swap) when active ccb modifying amf-data exists** **Status:** unas

[tickets] [opensaf:tickets] #1111 AMF: Reject SC swichover (si-swap) when active ccb modifyinc amf-data exists

2014-09-18 Thread Anders Bjornerstedt
--- ** [tickets:#] AMF: Reject SC swichover (si-swap) when active ccb modifyinc amf-data exists** **Status:** unassigned **Milestone:** 4.6.FC **Created:** Thu Sep 18, 2014 08:25 AM UTC by Anders Bjornerstedt **Last Updated:** Thu Sep 18, 2014 08:25 AM UTC **Owner:** nobody This is relat

[tickets] [opensaf:tickets] #1105 AMFD: New standby crashes if blocked on becoming applier

2014-09-18 Thread Anders Bjornerstedt
- **Comment**: For the failover case, the new active AMFD really must wait eternally on implementer-set, preferably in combination with actions directed at resolving the issue, such as the proposed admin-op on imm (enhancement #1107). The "alternative" of a cluster restart is not an alternative.

[tickets] [opensaf:tickets] #1105 AMFD: New standby crashes if blocked on becoming applier

2014-09-18 Thread Anders Bjornerstedt
For the switchover case there is an alternative to "eternal wait" on setting OI/applier. This is for the active AMFD to *reject* a switchover if there is currently an active CCB modifying AMF data. The AMFD must know if this is the case since it is the OI for that data. --- ** [tickets:#1105

[tickets] [opensaf:tickets] #1062 imm: immnd may crash in resourceDisplay

2014-09-18 Thread Neelakanta Reddy
- **status**: review --> fixed - **Comment**: [staging:224cb7] [staging:605f4e] changeset: 5834:605f4ee23194 tag: tip parent: 5832:318a5e60431f user:Neelakanta Reddy date:Thu Sep 18 13:21:21 2014 +0530 summary: imm: Return INVALID_PARAM if the Operation name is

[tickets] [opensaf:tickets] Re: #1105 AMFD: New standby crashes if blocked on becoming applier

2014-09-18 Thread Anders Bjornerstedt
HA is a statistical property. It can only be truly evaluated by recording the availability history of a system. But one can predict if an operation will impact HA by analyzing the degree of increased vulnerability that the operation causes. Basically it is (at least) the MTBF of a single SC tha

[tickets] [opensaf:tickets] #1110 NTF healthcheck callback timedout leading to node reboot

2014-09-18 Thread Sirisha Alla
--- ** [tickets:#1110] NTF healthcheck callback timedout leading to node reboot** **Status:** unassigned **Milestone:** 4.3.3 **Created:** Thu Sep 18, 2014 07:41 AM UTC by Sirisha Alla **Last Updated:** Thu Sep 18, 2014 07:41 AM UTC **Owner:** nobody This issue is in continuation to ticket #1

[tickets] [opensaf:tickets] #1078 imm: Unnesessary TRY_AGAIN when object apllier does not match the Re-using implementerset info

2014-09-18 Thread Neelakanta Reddy
- **status**: review --> fixed - **Comment**: changeset: 5829:b598dff96e9a branch: opensaf-4.3.x parent: 5817:0ebd64fc24f4 user:Neelakanta Reddy date:Thu Sep 18 12:52:02 2014 +0530 summary: imm: Return TRY_AGAIN only when object apllier matches the Re-using implem

[tickets] [opensaf:tickets] #1109 standby failed to come up during failover

2014-09-18 Thread Sirisha Alla
--- ** [tickets:#1109] standby failed to come up during failover** **Status:** unassigned **Milestone:** 4.3.3 **Created:** Thu Sep 18, 2014 07:33 AM UTC by Sirisha Alla **Last Updated:** Thu Sep 18, 2014 07:33 AM UTC **Owner:** nobody The issue is seen on SLES X86 VMs running with single pbe

[tickets] [opensaf:tickets] Re: #1105 AMFD: New standby crashes if blocked on becoming applier

2014-09-18 Thread Anders Bjornerstedt
Rebooting an SC _always_ harms SA. This is by definition so since the cluster becomes one-safe (single point of failure in the remaining SC). I am of course not saying that AMF as an entity shall wait forever in providing service. All I am saying is that the AMF should keep trying to attach as