- **summary**: lck: saLckResourceOpen with flag SA_LCK_RESOURCE_CREATE 
returning SA_AIS_ERR_TIMEOUT after 5 failovers. --> lck: saLckResourceOpen  
returns SA_AIS_ERR_TIMEOUT / SA_AIS_ERR_LIBRARY after failovers / switchovers.
- **Comment**:

After a couple of switchovers / failovers, saLckResourceOpen may fail 
intermittently with one of the following return values:

-> SA_AIS_ERR_TIMEOUT
-> SA_AIS_ERR_LIBRARY
-> arbitrary return values that are out of bounds



---

** [tickets:#1801] lck: saLckResourceOpen  returns SA_AIS_ERR_TIMEOUT / 
SA_AIS_ERR_LIBRARY after failovers / switchovers.**

**Status:** unassigned
**Milestone:** 5.0.2
**Created:** Mon May 02, 2016 09:52 AM UTC by Madhurika Koppula
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** nobody
**Attachments:**

- [glsv.tgz](https://sourceforge.net/p/opensaf/tickets/1801/attachment/glsv.tgz) (3.0 MB; application/octet-stream)


Setup:
Changeset- 7436
OS: Oracle Linux Server release 6.4 (x86_64)
4 nodes configured with single PBE

Some failover tests were being run.
The safLock=resource1_101 object is not getting deleted. As a result, 
saLckResourceOpen with flag SA_LCK_RESOURCE_CREATE continuously returns 
SA_AIS_ERR_TIMEOUT.

The same API call is retried 15 times, with a 10-second sleep between attempts.
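
The retry policy described above can be sketched roughly as follows. This is only an illustration, not the actual test code: `open_with_retry` and the `open_resource` callable are hypothetical names, the status values are plain strings standing in for the real return codes, and it assumes that "15 times" means 15 attempts in total.

```python
import time

# Hypothetical stand-ins for the return codes named in this ticket.
SA_AIS_OK = "SA_AIS_OK"
SA_AIS_ERR_TIMEOUT = "SA_AIS_ERR_TIMEOUT"


def open_with_retry(open_resource, max_attempts=15, delay_secs=10.0,
                    sleep=time.sleep):
    """Call open_resource() until it stops timing out or attempts run out.

    open_resource is a hypothetical zero-argument callable that wraps the
    real saLckResourceOpen call and returns a status string.
    Returns a (last_status, attempts_made) tuple.
    """
    status = SA_AIS_ERR_TIMEOUT
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        status = open_resource()
        if status != SA_AIS_ERR_TIMEOUT:
            break  # success or a different error: stop retrying
        if attempts < max_attempts:
            sleep(delay_secs)  # wait before the next attempt
    return status, attempts
```

With a callable that always times out, this returns `SA_AIS_ERR_TIMEOUT` after 15 attempts, which corresponds to the "Timeout count exceeded: 15" line in the run below.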

Snippet from the run:

100|7| SUCCESS         : saLckInitialize with valid parameters
100|7| Return Value    : SA_AIS_OK
100|7| LckHandle       : 6599312
100|7|
100|7|
100|7| SUCCESS         : saLckInitialize with valid parameters
100|7| Return Value    : SA_AIS_OK
100|7| LckHandle       : 6599392
100|7|
100|7|
100|7| Resource Name   : safLock=resource1_101
100|7| open flags      : SA_LCK_RESOURCE_CREATE
100|7| FAILED          : saLckResourceOpen with valid parameters
100|7| Return Value    : SA_AIS_ERR_TIMEOUT

100|7| Resource Name   : safLock=resource1_101
100|7| open flags      : SA_LCK_RESOURCE_CREATE
100|7|
100|7| Resource Name   : safLock=resource1_101
100|7| open flags      : SA_LCK_RESOURCE_CREATE

100|7| Resource Name   : safLock=resource1_101
100|7| open flags      : SA_LCK_RESOURCE_CREATE
100|7|
100|7| Resource Name   : safLock=resource1_101
100|7| open flags      : SA_LCK_RESOURCE_CREATE
100|7|
100|7| Resource Name   : safLock=resource1_101
100|7| open flags      : SA_LCK_RESOURCE_CREATE

100|7|
100|7| Resource Name   : safLock=resource1_101
100|7| open flags      : SA_LCK_RESOURCE_CREATE Timeout count exceeded: 15

Syslog from the active controller at that moment:

May  2 14:22:56 OEL_M-SLOT-2 root: killing osafimmd from run_failover.sh
May  2 14:22:56 OEL_M-SLOT-2 osafamfnd[1755]: NO 
'safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
May  2 14:22:56 OEL_M-SLOT-2 osafamfnd[1755]: ER 
safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
May  2 14:22:56 OEL_M-SLOT-2 osafamfnd[1755]: Rebooting OpenSAF NodeId = 131599 
EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 
131599, SupervisionTime = 60
May  2 14:22:56 OEL_M-SLOT-2 opensaf_reboot: Rebooting local node; timeout=60

Syslog from the standby controller, which becomes active after the failover:

May  2 14:23:00 OEL_M-SLOT-1 opensaf_reboot: Rebooting remote node in the 
absence of PLM is outside the scope of OpenSAF
May  2 14:23:00 OEL_M-SLOT-1 osaffmd[1677]: NO Controller Failover: Setting 
role to ACTIVE
May  2 14:23:00 OEL_M-SLOT-1 osafrded[1667]: NO RDE role set to ACTIVE
May  2 14:23:00 OEL_M-SLOT-1 osafrded[1667]: NO Running 
'/usr/lib64/opensaf/opensaf_sc_active' with 0 argument(s)
May  2 14:23:00 OEL_M-SLOT-1 osafimmd[1688]: NO ACTIVE request
May  2 14:23:00 OEL_M-SLOT-1 osaflogd[1711]: NO ACTIVE request
May  2 14:23:00 OEL_M-SLOT-1 osafntfd[1722]: NO ACTIVE request
May  2 14:23:00 OEL_M-SLOT-1 osafclmd[1733]: NO ACTIVE request
May  2 14:23:00 OEL_M-SLOT-1 osafamfd[1744]: NO FAILOVER StandBy --> Active

/var/log/messages and osaflckd traces from both controllers are attached.




