[tickets] [opensaf:tickets] #1801 lck: saLckResourceOpen with flag SA_LCK_RESOURCE_CREATE returning SA_AIS_ERR_TIMEOUT after 5 failovers.
---

** [tickets:#1801] lck: saLckResourceOpen with flag SA_LCK_RESOURCE_CREATE returning SA_AIS_ERR_TIMEOUT after 5 failovers.**

**Status:** unassigned
**Milestone:** 4.6.2
**Created:** Mon May 02, 2016 09:52 AM UTC by Madhurika Koppula
**Last Updated:** Mon May 02, 2016 09:52 AM UTC
**Owner:** nobody
**Attachments:**

- [glsv.tgz](https://sourceforge.net/p/opensaf/tickets/1801/attachment/glsv.tgz) (3.0 MB; application/octet-stream)

Setup:
Changeset: 7436
OS: Oracle Linux Server release 6.4 (x86_64)
4 nodes configured with single PBE

Some failover tests are being run.

The safLock=resource1_101 object is not getting deleted. As a result, saLckResourceOpen with flag SA_LCK_RESOURCE_CREATE keeps returning SA_AIS_ERR_TIMEOUT. The same API call is retried 15 times, with a 10-second sleep between attempts.

Snippet from the run:

    100|7| SUCCESS : saLckInitialize with valid parameters
    100|7| Return Value: SA_AIS_OK
    100|7| LckHandle : 6599312
    100|7|
    100|7|
    100|7| SUCCESS : saLckInitialize with valid parameters
    100|7| Return Value: SA_AIS_OK
    100|7| LckHandle : 6599392
    100|7|
    100|7|
    100|7| Resource Name : safLock=resource1_101
    100|7| open flags : SA_LCK_RESOURCE_CREATE
    100|7| FAILED : saLckResourceOpen with valid parameters
    100|7| Return Value: SA_AIS_ERR_TIMEOUT
    100|7| Resource Name : safLock=resource1_101
    100|7| open flags : SA_LCK_RESOURCE_CREATE
    100|7|
    100|7| Resource Name : safLock=resource1_101
    100|7| open flags : SA_LCK_RESOURCE_CREATE
    100|7| Resource Name : safLock=resource1_101
    100|7| open flags : SA_LCK_RESOURCE_CREATE
    100|7|
    100|7| Resource Name : safLock=resource1_101
    100|7| open flags : SA_LCK_RESOURCE_CREATE
    100|7|
    100|7| Resource Name : safLock=resource1_101
    100|7| open flags : SA_LCK_RESOURCE_CREATE
    100|7|
    100|7| Resource Name : safLock=resource1_101
    100|7| open flags : SA_LCK_RESOURCE_CREATE
    Timeout count exceeded: 15

Syslog of the active controller at this instant:

    May 2 14:22:56 OEL_M-SLOT-2 root: killing osafimmd from run_failover.sh
    May 2 14:22:56 OEL_M-SLOT-2 osafamfnd[1755]: NO 'safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'nodeFailfast'
    May 2 14:22:56 OEL_M-SLOT-2 osafamfnd[1755]: ER safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
    May 2 14:22:56 OEL_M-SLOT-2 osafamfnd[1755]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131599, SupervisionTime = 60
    May 2 14:22:56 OEL_M-SLOT-2 opensaf_reboot: Rebooting local node; timeout=60

Syslog of the standby controller, which becomes active after the failover:

    May 2 14:23:00 OEL_M-SLOT-1 opensaf_reboot: Rebooting remote node in the absence of PLM is outside the scope of OpenSAF
    May 2 14:23:00 OEL_M-SLOT-1 osaffmd[1677]: NO Controller Failover: Setting role to ACTIVE
    May 2 14:23:00 OEL_M-SLOT-1 osafrded[1667]: NO RDE role set to ACTIVE
    May 2 14:23:00 OEL_M-SLOT-1 osafrded[1667]: NO Running '/usr/lib64/opensaf/opensaf_sc_active' with 0 argument(s)
    May 2 14:23:00 OEL_M-SLOT-1 osafimmd[1688]: NO ACTIVE request
    May 2 14:23:00 OEL_M-SLOT-1 osaflogd[1711]: NO ACTIVE request
    May 2 14:23:00 OEL_M-SLOT-1 osafntfd[1722]: NO ACTIVE request
    May 2 14:23:00 OEL_M-SLOT-1 osafclmd[1733]: NO ACTIVE request
    May 2 14:23:00 OEL_M-SLOT-1 osafamfd[1744]: NO FAILOVER StandBy --> Active

/var/log/messages and osaflckd traces of both controllers are attached.

---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.

Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets
[tickets] [opensaf:tickets] #1801 lck: saLckResourceOpen with flag SA_LCK_RESOURCE_CREATE returning SA_AIS_ERR_TIMEOUT after 5 failovers.
- **Milestone**: 4.6.2 --> 4.7.2
[tickets] [opensaf:tickets] #1801 lck: saLckResourceOpen with flag SA_LCK_RESOURCE_CREATE returning SA_AIS_ERR_TIMEOUT after 5 failovers.
Similarly, saLckResourceOpen returns SA_AIS_ERR_LIBRARY after switchovers / failovers. This issue is randomly observed.

**Last Updated:** Wed May 04, 2016 06:53 PM UTC
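For anyone trying to reproduce this, the retry behaviour described in the original report (15 attempts, 10 seconds apart) could be sketched roughly as below. This is not the actual test-suite code. The names `open_with_retry`, `retry_policy`, `fake_open` and `fake_resource` are hypothetical, the error codes are redefined locally (with the values used by `SaAisErrorT` in saAis.h) so the sketch builds without the OpenSAF headers, and retrying on SA_AIS_ERR_TRY_AGAIN in addition to SA_AIS_ERR_TIMEOUT is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Local stand-ins so the sketch compiles without OpenSAF headers.
 * Values mirror SaAisErrorT in saAis.h; drop this enum when building
 * against the real headers. */
typedef enum {
    SA_AIS_OK = 1,
    SA_AIS_ERR_LIBRARY = 2,
    SA_AIS_ERR_TIMEOUT = 5,
    SA_AIS_ERR_TRY_AGAIN = 6
} SaAisErrorT;

/* Hypothetical callback: one attempt at opening the lock resource. */
typedef SaAisErrorT (*open_fn)(void *ctx);

/* Hypothetical policy: the report used 15 attempts and a 10 s sleep. */
typedef struct {
    unsigned max_attempts;
    unsigned sleep_seconds;
} retry_policy;

/* Assumption: only these two codes are worth retrying. */
static int is_transient(SaAisErrorT rc)
{
    return rc == SA_AIS_ERR_TIMEOUT || rc == SA_AIS_ERR_TRY_AGAIN;
}

/* Calls try_open until it returns a non-transient code or the attempt
 * budget is used up. Returns the last code seen and, optionally, the
 * number of attempts made. */
static SaAisErrorT open_with_retry(open_fn try_open, void *ctx,
                                   const retry_policy *policy,
                                   unsigned *attempts_used)
{
    SaAisErrorT rc = SA_AIS_ERR_TIMEOUT;
    unsigned used = 0;

    assert(try_open != NULL && policy != NULL && policy->max_attempts > 0);

    while (used < policy->max_attempts) {
        rc = try_open(ctx);
        used++;
        if (!is_transient(rc))
            break;
        /* Sleep between attempts, but not after the final one. */
        if (used < policy->max_attempts && policy->sleep_seconds > 0)
            sleep(policy->sleep_seconds);
    }

    if (attempts_used != NULL)
        *attempts_used = used;
    return rc;
}

/* Hypothetical stub standing in for the real open call: times out for
 * the first fail_first calls, then succeeds. */
typedef struct {
    unsigned calls;
    unsigned fail_first;
} fake_resource;

static SaAisErrorT fake_open(void *ctx)
{
    fake_resource *r = (fake_resource *)ctx;
    r->calls++;
    return r->calls <= r->fail_first ? SA_AIS_ERR_TIMEOUT : SA_AIS_OK;
}
```

In a real client the callback would wrap the actual saLckResourceOpen call with SA_LCK_RESOURCE_CREATE, and the policy would be `{15, 10}` to match the report. In the situation reported here, every attempt times out, so the helper would return SA_AIS_ERR_TIMEOUT after the 15th attempt.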