[tickets] [opensaf:tickets] #2656 imm: valgrind reports invalid read in imm agent

2017-10-25 Thread Vu Minh Nguyen via Opensaf-tickets



---

** [tickets:#2656] imm: valgrind reports invalid read in imm agent**

**Status:** assigned
**Milestone:** 5.17.10
**Created:** Thu Oct 26, 2017 04:15 AM UTC by Vu Minh Nguyen
**Last Updated:** Thu Oct 26, 2017 04:15 AM UTC
**Owner:** Vu Minh Nguyen


Here is the valgrind report:

> ==740== Thread 4:
> ==740== Invalid read of size 1
> ==740==    at 0x4C2E7A0: __strncpy_sse2_unaligned (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==740==    by 0x5055CF2: strncpy (string3.h:120)
> ==740==    by 0x5055CF2: imma_proc_ccbaug_setup(imma_client_node*, imma_callback_info*) (imma_proc.cc:2058)
> ==740==    by 0x505C507: imma_hdl_callbk_dispatch_one(imma_cb*, unsigned long long) (imma_proc.cc:1745)
> ==740==    by 0x5050D33: saImmOiDispatch (imma_oi_api.cc:638)
> ==740==    by 0x120AD1: oi_thread (test_saImmOiSaStringT.c:287)
> ==740==    by 0x5725183: start_thread (pthread_create.c:312)
> ==740==    by 0x5A3537C: clone (clone.S:111)


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #2652 clm: return wrong error code

2017-10-25 Thread Vu Minh Nguyen via Opensaf-tickets
- **status**: review --> fixed
- **assigned_to**: Vu Minh Nguyen -->  nobody 
- **Comment**:

commit ba5302f6e65108d9023b16b3bd8e4986fe178ea8 (HEAD, origin/develop, 
ticket-2652, develop)
Author: Vu Minh Nguyen 
Date:   Thu Oct 26 09:38:35 2017 +0700

clm: fix return wrong error code [#2652]

saClmClusterNodeGet_4() and saClmClusterNodeGetAsync() return
SA_AIS_ERR_UNAVAILABLE(31) when querying non-member node information
from a member node.

According to AIS, they should return SA_AIS_ERR_NOT_EXIST.
SA_AIS_ERR_UNAVAILABLE should be returned when the invoking process is not
executing on a member node.
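
The rule described in the commit message can be sketched as a small decision function. This is only an illustration of the intended return codes, not the OpenSAF code: the helper name `node_get_result` and its two boolean parameters are hypothetical, and the enum is a local stand-in that reuses the numeric values from `saAis.h`.

```cpp
// Local stand-in for the AIS return codes; numeric values as in saAis.h.
enum AisRc {
  SA_AIS_OK = 1,
  SA_AIS_ERR_NOT_EXIST = 12,
  SA_AIS_ERR_UNAVAILABLE = 31,
};

// Hypothetical helper, not part of OpenSAF.
// caller_is_member: the invoking process is executing on a member node.
// target_is_member: the queried node is currently a cluster member.
AisRc node_get_result(bool caller_is_member, bool target_is_member) {
  // UNAVAILABLE describes the caller's own node, so it is checked first.
  if (!caller_is_member) return SA_AIS_ERR_UNAVAILABLE;
  // A member node asking about a non-member node gets NOT_EXIST.
  if (!target_is_member) return SA_AIS_ERR_NOT_EXIST;
  return SA_AIS_OK;
}
```

With this ordering, the `clmprint -n 0x2030f` case from the ticket (member node SC-1 querying locked node PL-3) would yield 12 rather than 31.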



---

** [tickets:#2652] clm: return wrong error code**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Tue Oct 24, 2017 05:09 AM UTC by Vu Minh Nguyen
**Last Updated:** Tue Oct 24, 2017 07:58 AM UTC
**Owner:** nobody


saClmClusterNodeGet_4() returns `SA_AIS_ERR_UNAVAILABLE`(31) when querying 
non-member node information from a member node.

According to AIS, chapter 3.5.5, it should return `SA_AIS_ERR_NOT_EXIST` 
instead.

> root@SC-1:~# clm-state
> safNode=PL-3,safCluster=myClmCluster
>     saClmNodeAdminState=LOCKED(2)
>     saClmNodeIsMember=NON_MEMBER(0)
>     saClmNodeID=131855(0x2030f)
> safNode=PL-4,safCluster=myClmCluster
>     saClmNodeAdminState=UNLOCKED(1)
>     saClmNodeIsMember=MEMBER(1)
>     saClmNodeID=132111(0x2040f)
> safNode=PL-5,safCluster=myClmCluster
>     saClmNodeAdminState=UNLOCKED(1)
>     saClmNodeIsMember=MEMBER(1)
>     saClmNodeID=132367(0x2050f)
> safNode=SC-1,safCluster=myClmCluster
>     saClmNodeAdminState=UNLOCKED(1)
>     saClmNodeIsMember=MEMBER(1)
>     saClmNodeID=131343(0x2010f)
> safNode=SC-2,safCluster=myClmCluster
>     saClmNodeAdminState=UNLOCKED(1)
>     saClmNodeIsMember=MEMBER(1)
>     saClmNodeID=131599(0x2020f)
> root@SC-1:~# clmprint -n 0x2030f
> node_id:131855(2030f)
> error - clmprint:: saClmClusterNodeGet_4 failed, rc = 31




[tickets] [opensaf:tickets] #2650 amfnd: invalid read in mon.cc

2017-10-25 Thread Gary Lee via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

commit 20a16ee0e07ec589d79b1204f511384bd6a9c9d7
Author: Gary Lee 
Date:   Thu Oct 26 13:34:42 2017 +1100

amfnd: store pid before sending event [#2650]

The event may be processed and pm_rec
deleted by the main thread, before it is
read here.
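
The idea in the commit message, copying the pid out of the record before the event is sent, can be sketched as follows. This is a single-threaded model under stated assumptions: `PmRec`, `send_pid_exit_evt` and the `send_event` callback are hypothetical stand-ins, and the callback plays the role of the main thread that may free the record as soon as the event is posted.

```cpp
#include <functional>
#include <memory>

// Hypothetical stand-in for the monitored-process record.
struct PmRec {
  long pid;
};

// Posts a "pid exit" event and returns the pid to report afterwards.
// send_event models handing the record over to the main thread, which may
// delete it before this function continues.
long send_pid_exit_evt(
    std::unique_ptr<PmRec>& rec,
    const std::function<void(std::unique_ptr<PmRec>&)>& send_event) {
  const long pid = rec->pid;  // copy while the record is still valid
  send_event(rec);            // after this call, rec may already be gone
  return pid;                 // uses the saved copy, never rec->pid
}
```

Reading `rec->pid` after `send_event` instead would be the kind of access valgrind flags as an invalid read inside a freed block.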



---

** [tickets:#2650] amfnd: invalid read in mon.cc**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Mon Oct 23, 2017 02:59 AM UTC by Gary Lee
**Last Updated:** Mon Oct 23, 2017 03:19 AM UTC
**Owner:** Gary Lee


==478== Invalid read of size 8
==478==    at 0x1446B0: avnd_send_pid_exit_evt (mon.cc:274)
==478==    by 0x1446B0: avnd_mon_pids (mon.cc:325)
==478==    by 0x1446B0: avnd_mon_process(void*) (mon.cc:355)
==478==    by 0x5EBF6D9: start_thread (pthread_create.c:456)
==478==    by 0x61DED7E: clone (clone.S:105)
==478==  Address 0x8c04558 is 24 bytes inside a block of size 72 free'd
==478==    at 0x4C2F25B: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==478==    by 0x133EF8: avnd_pm_rec_free(ncs_db_link_list_node*) (cpm.cc:84)
==478==    by 0x56BDD4A: ncs_db_link_list_del (ncsdlib.c:146)
==478==    by 0x134025: avnd_comp_pm_rec_del(avnd_cb_tag*, avnd_comp_tag*, avnd_pm_rec*) (cpm.cc:138)
==478==    by 0x144B69: avnd_evt_pid_exit_evh(avnd_cb_tag*, avnd_evt_tag*) (mon.cc:403)
==478==    by 0x141C41: avnd_evt_process (main.cc:658)
==478==    by 0x141C41: avnd_main_process() (main.cc:610)
==478==    by 0x115D81: main (main.cc:203)
==478==  Block was alloc'd at
==478==    at 0x4C2E19F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==478==    by 0x134332: avnd_comp_new_rsrc_mon(avnd_cb_tag*, avnd_comp_tag*, avsv_amf_pm_start_param_tag*, SaAisErrorT*) (cpm.cc:329)
==478==    by 0x134470: avnd_comp_pm_start_process(avnd_cb_tag*, avnd_comp_tag*, avsv_amf_pm_start_param_tag*, SaAisErrorT*) (cpm.cc:269)
==478==    by 0x134B43: avnd_evt_ava_pm_start_evh(avnd_cb_tag*, avnd_evt_tag*) (cpm.cc:419)
==478==    by 0x141C41: avnd_evt_process (main.cc:658)
==478==    by 0x141C41: avnd_main_process() (main.cc:610)
==478==    by 0x115D81: main (main.cc:203)






[tickets] [opensaf:tickets] #2651 clm: clmprint does not work as expected

2017-10-25 Thread Vu Minh Nguyen via Opensaf-tickets
- **status**: review --> fixed
- **assigned_to**: Vu Minh Nguyen -->  nobody 
- **Priority**: major --> minor
- **Comment**:

commit e070300a38f0f564c8c8493f112c68c442c6528c (HEAD, origin/develop, 
ticket-2651, develop)
Author: Vu Minh Nguyen 
Date:   Wed Oct 25 10:57:17 2017 +0700

clm: fix errors in clmprint tool [#2651]

Fix the problems:
1) clmprint returns 0 for the error case.
2) clmprint does not handle invalid inputs.
3) clmprint does not deal with non-member node.



---

** [tickets:#2651] clm: clmprint does not work as expected**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Mon Oct 23, 2017 10:40 AM UTC by Vu Minh Nguyen
**Last Updated:** Tue Oct 24, 2017 07:53 AM UTC
**Owner:** nobody


1. clmprint returns 0 for the error case
> root@SC-1:~# clmprint -n 0x3060f
> node_id:198159(3060f)
> error - clmprint:: saClmClusterNodeGet_4 failed, rc = 12
> root@SC-1:~# echo $?

2. clmprint does not handle invalid inputs
> clmprint -b -m -a -n
> node_id:4294967295()
> node_id:4294967295()
> ...

3. clmprint is not able to print non-member node information
> root@SC-1:~# clm-adm -o lock safNode=PL-5,safCluster=myClmCluster
> root@SC-1:~# clm-state safNode=PL-5,safCluster=myClmCluster
> safNode=PL-5,safCluster=myClmCluster
> saClmNodeAdminState=LOCKED(2)
> saClmNodeIsMember=NON_MEMBER(0)
> saClmNodeID=132367(0x2050f)
> root@SC-1:~# clmprint -n 0x2050f
> node_id:132367(2050f)
> error - clmprint:: saClmClusterNodeGet_4 failed, rc = 31




[tickets] [opensaf:tickets] #2654 clm: clm test asserts due to timeout in poll

2017-10-25 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/36090263/



---

** [tickets:#2654] clm: clm test asserts due to timeout in poll**

**Status:** review
**Milestone:** 5.17.10
**Created:** Wed Oct 25, 2017 03:04 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Oct 25, 2017 03:04 PM UTC
**Owner:** Zoran Milinkovic


In the CLM tests, the immadm command is used in many places to execute CLM admin 
operations for locking, unlocking and shutting down nodes.
On an overloaded system this can cause problems, because executing immadm can 
take a long time.
Since we have a sanity check for executing immadm in another thread, poll may 
time out on an overloaded system.

~~~
Thread 1 (Thread 0x7f3b61c0c740 (LWP 280)):
#0 0x7f3b61007428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
resultvar = 0
pid = 280
selftid = 280
#1 0x7f3b6100902a in __GI_abort () at abort.c:89
save_stage = 2
act = {__sigaction_handler = {sa_handler = 0x4, sa_sigaction = 0x4}, sa_mask = {__val = {0, 0, 140732357794048, 47244640256, 139893019865088, 94726958067288, 865, 94726958071200, 0, 0, 139893007538572, 139893008635480, 139893008649136, 0, 139893008635480, 94726958067288}}, sa_flags = 1640067072, sa_restorer = 0x562756afae58}
sigs = {__val = {32, 0 }}
#2 0x7f3b60fffbd7 in __assert_fail_base (fmt=, assertion=assertion@entry=0x562756afae58 "ret == 1", file=file@entry=0x562756afb398 "src/clm/apitest/tet_saClmClusterTrack.c", line=line@entry=865, function=function@entry=0x562756afbda0 <__PRETTY_FUNCTION__.7254> "saClmClusterTrack_27") at assert.c:92
str = 0x562758c0d360 ""
total = 4096
#3 0x7f3b60fffc82 in __GI___assert_fail (assertion=assertion@entry=0x562756afae58 "ret == 1", file=file@entry=0x562756afb398 "src/clm/apitest/tet_saClmClusterTrack.c", line=line@entry=865, function=function@entry=0x562756afbda0 <__PRETTY_FUNCTION__.7254> "saClmClusterTrack_27") at assert.c:101
No locals.
#4 0x562756af64c7 in saClmClusterTrack_27 () at src/clm/apitest/tet_saClmClusterTrack.c:865
fds = {{fd = 10, events = 1, revents = 0}}
thread8 = 139892944803584
__PRETTY_FUNCTION__ = "saClmClusterTrack_27"
#5 0x562756afa309 in run_test_case (suite=, tcase=) at src/osaf/apitest/utest.c:178
No locals.
#6 0x562756afa824 in test_run (suite=, tcase=) at src/osaf/apitest/utest.c:202
i = 7
j = 27
#7 0x7f3b60ff2830 in __libc_start_main (main=0x562756af2910 , argc=1, argv=0x7ffece31db98, init=, fini=, rtld_fini=, stack_end=0x7ffece31db88) at ../csu/libc-start.c:291
result =
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, -8017775631807234294, 94726958036048, 140732357794704, 0, 0, -4393240863189713142, -4430610558499635446}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x7ffece31dba8, 0x7f3b61c1c168}, data = {prev = 0x0, cleanup = 0x0, canceltype = -835593304}}}
not_first_call =
#8 0x562756af3479 in _start ()
~~~
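
In frame #4 the locals show `events = 1` (`POLLIN` on Linux) and `revents = 0` at the failed `ret == 1` assertion, which is what a `poll()` timeout looks like. The sketch below is separate from the patch under review and only illustrates the three possible outcomes of `poll()`; `wait_readable` and `WaitResult` are hypothetical names.

```cpp
#include <poll.h>
#include <unistd.h>

// Outcome of waiting on a single descriptor.
enum class WaitResult { kReady, kTimeout, kError };

// Waits up to timeout_ms for fd to become readable and reports which of the
// three poll() outcomes occurred, instead of asserting on one of them.
WaitResult wait_readable(int fd, int timeout_ms) {
  struct pollfd fds[1];
  fds[0].fd = fd;
  fds[0].events = POLLIN;
  fds[0].revents = 0;
  const int ret = poll(fds, 1, timeout_ms);
  if (ret == 0) return WaitResult::kTimeout;  // nothing arrived in time
  if (ret < 0) return WaitResult::kError;     // errno holds the reason
  return WaitResult::kReady;                  // fds[0].revents is set
}
```

On a loaded system the `kTimeout` branch is the one that fires, so a test has to either allow a longer timeout or treat that outcome explicitly.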




[tickets] [opensaf:tickets] #2655 msg: APIs do not return UNAVAILABLE after node has left and rejoined

2017-10-25 Thread Alex Jones via Opensaf-tickets



---

** [tickets:#2655] msg: APIs do not return UNAVAILABLE after node has left and 
rejoined**

**Status:** assigned
**Milestone:** 5.17.10
**Created:** Wed Oct 25, 2017 03:15 PM UTC by Alex Jones
**Last Updated:** Wed Oct 25, 2017 03:15 PM UTC
**Owner:** Alex Jones


According to Section 3.2.1 of the MSG B.03.01 spec, the API calls must still 
return UNAVAILABLE for handles which were obtained before the node left the 
cluster, even after the node rejoins.

The implementation currently only returns UNAVAILABLE during the time the node 
is not a member.
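
One way to get the behaviour the spec asks for is to stamp each handle with a membership epoch. The sketch below is a hypothetical model, not the MSG agent code: `MembershipTracker` and its methods are invented for illustration, and it assumes that handles obtained after the rejoin are usable again.

```cpp
#include <cstdint>

// Local stand-in for the two AIS return codes used here (values as in saAis.h).
enum AisRc { kAisOk = 1, kAisErrUnavailable = 31 };

// Hypothetical agent-side bookkeeping for the local node's membership.
class MembershipTracker {
 public:
  void node_left() { member_ = false; }
  // Every (re)join starts a new epoch, which invalidates older handles.
  void node_joined() {
    member_ = true;
    ++epoch_;
  }

  // A new handle remembers the epoch in which it was obtained.
  uint64_t new_handle_epoch() const { return epoch_; }

  // Handles from an earlier epoch stay unusable even after a rejoin.
  AisRc check(uint64_t handle_epoch) const {
    if (!member_ || handle_epoch != epoch_) return kAisErrUnavailable;
    return kAisOk;
  }

 private:
  bool member_ = true;
  uint64_t epoch_ = 0;
};
```

Checking only `member_`, without the epoch comparison, gives the current behaviour described above: the old handle starts working again as soon as the node rejoins.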




[tickets] [opensaf:tickets] #2653 nid: Derive Node ID from TIPC address when not managing TIPC

2017-10-25 Thread Anders Widell via Opensaf-tickets
- **status**: accepted --> review



---

** [tickets:#2653] nid: Derive Node ID from TIPC address when not managing 
TIPC**

**Status:** review
**Milestone:** 5.17.10
**Created:** Wed Oct 25, 2017 11:42 AM UTC by Anders Widell
**Last Updated:** Wed Oct 25, 2017 11:42 AM UTC
**Owner:** Anders Widell


Related to [#2598]. If OpenSAF is not configured to manage TIPC, we don't need 
to require the presence of /etc/opensaf/slot_id or /var/lib/opensaf/node_id. 
Instead, we can create the file /var/lib/opensaf/node_id ourselves based on the 
TIPC address of the node we are running on.
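
As a rough illustration of the idea, the sketch below parses a legacy TIPC address of the form `Z.C.N` (8-bit zone, 12-bit cluster, 12-bit node) and derives a node ID from it. The ticket does not say how the address maps to a node ID, so `hypothetical_node_id` is an assumption made purely for illustration: it treats the TIPC node number as the slot and reuses the `0x0002SS0f` pattern visible in the `clm-state` output elsewhere in this digest (for example `0x2010f` for SC-1).

```cpp
#include <cstdint>
#include <cstdio>

// Legacy TIPC address <Z.C.N>: zone, cluster and node.
struct TipcAddr {
  uint32_t zone;
  uint32_t cluster;
  uint32_t node;
};

// Parses "Z.C.N"; rejects trailing characters and out-of-range fields.
bool parse_tipc_addr(const char* text, TipcAddr* out) {
  unsigned z = 0, c = 0, n = 0;
  char extra = 0;
  if (std::sscanf(text, "%u.%u.%u%c", &z, &c, &n, &extra) != 3) return false;
  if (z > 0xffu || c > 0xfffu || n > 0xfffu) return false;
  out->zone = z;
  out->cluster = c;
  out->node = n;
  return true;
}

// Packs the fields the way TIPC does: 8 bits zone, 12 cluster, 12 node.
uint32_t tipc_addr_to_u32(const TipcAddr& a) {
  return (a.zone << 24) | (a.cluster << 12) | a.node;
}

// ASSUMPTION, not taken from the ticket: node field used as slot number,
// fixed 0x0002 prefix and 0x0f suffix. Returns 0 if the slot does not fit.
uint32_t hypothetical_node_id(const TipcAddr& a) {
  if (a.node == 0 || a.node > 0xffu) return 0;
  return 0x00020000u | (a.node << 8) | 0x0fu;
}
```

Under that assumed mapping, an address of `1.1.3` would give `0x2030f`; the real derivation may well differ.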





[tickets] [opensaf:tickets] Re: #2648 smf: smfd crashes after cluster reboot when campaign is in ExecutionCompleted

2017-10-25 Thread Rafael Odzakow via Opensaf-tickets
That would work. As long as it is possible to roll back the campaign, it 
is fine.


On 10/20/2017 03:18 PM, Alex Jones wrote:
>
> I understand the intention. It makes sense.
>
> One of the other solutions I had considered is to put a check at the 
> beginning of SmfCampaign::initExecution(). If the campaign state is 
> EXECUTION_COMPLETED, then just return. What is the point of 
> reexecuting a campaign that already completed?
>
> Are you OK with that?
>
> 
>
> *[tickets:#2648]  
> smf: smfd crashes after cluster reboot when campaign is in 
> ExecutionCompleted*
>
> *Status:* review
> *Milestone:* 5.17.10
> *Created:* Thu Oct 19, 2017 06:45 PM UTC by Alex Jones
> *Last Updated:* Fri Oct 20, 2017 10:04 AM UTC
> *Owner:* Alex Jones
>
> smfd crashes in updateImmAttr because it returns NO_RESOURCES. Here is 
> how to reproduce:
>
>  1. enable PBE, and make sure the "disable" flag is set in
> OpenSafSmfConfig
>  2. execute an upgrade campaign, and let it go to "execution
> completed", but don't commit it
>  3. reboot the entire cluster
>  4. only allow 1 system controller to come up
>  5. smfd will attempt to re-execute the campaign
>  6. any writes to IMM (like setting an error because the campaign file
> can't be found) will fail with NO_RESOURCES and smfd will assert
> and crash
>
> The assert and crash occur because PBE has not been turned off by 
> smfd before the campaign is initialized.
>




---

** [tickets:#2648] smf: smfd crashes after cluster reboot when campaign is in 
ExecutionCompleted**

**Status:** review
**Milestone:** 5.17.10
**Created:** Thu Oct 19, 2017 06:45 PM UTC by Alex Jones
**Last Updated:** Fri Oct 20, 2017 01:18 PM UTC
**Owner:** Alex Jones


smfd crashes in updateImmAttr because it returns NO_RESOURCES. Here is how to 
reproduce:

1. enable PBE, and make sure the "disable" flag is set in OpenSafSmfConfig
2. execute an upgrade campaign, and let it go to "execution completed", but 
don't commit it
3. reboot the entire cluster
4. only allow 1 system controller to come up
5. smfd will attempt to re-execute the campaign
6. any writes to IMM (like setting an error because the campaign file can't be 
found) will fail with NO_RESOURCES and smfd will assert and crash

The assert and crash occur because PBE has not been turned off by smfd before 
the campaign is initialized.
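
One option raised in the thread is an early return at the start of `SmfCampaign::initExecution()` when the campaign is already in EXECUTION_COMPLETED. The sketch below models only that guard. The `Campaign` class, its state enum and the boolean return value are hypothetical, and the real method's signature is not shown in this thread.

```cpp
// Reduced, hypothetical model of a campaign; only the state name
// "execution completed" comes from the thread.
enum class CampState { kInitial, kExecuting, kExecutionCompleted };

class Campaign {
 public:
  explicit Campaign(CampState state) : state_(state) {}

  // Returns true if execution was initialized, false if it was skipped.
  bool initExecution() {
    // A completed campaign is not re-executed, so nothing is written to IMM.
    if (state_ == CampState::kExecutionCompleted) return false;
    state_ = CampState::kExecuting;
    return true;
  }

  CampState state() const { return state_; }

 private:
  CampState state_;
};
```

The guard leaves the campaign in its completed state, which matters for the condition in the reply that rolling back the campaign must still be possible.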




[tickets] [opensaf:tickets] #2627 amfnd: handle immnd failure during upgrade

2017-10-25 Thread Gary Lee via Opensaf-tickets
- **status**: accepted --> review



---

** [tickets:#2627] amfnd: handle immnd failure during upgrade**

**Status:** review
**Milestone:** 5.17.10
**Created:** Mon Oct 16, 2017 05:13 AM UTC by Gary Lee
**Last Updated:** Mon Oct 23, 2017 04:39 AM UTC
**Owner:** nobody
**Attachments:**

- [osafamfnd.6981.PL-2-12.core.txt](https://sourceforge.net/p/opensaf/tickets/2627/attachment/osafamfnd.6981.PL-2-12.core.txt) (32.3 kB; text/plain)


Normally, amfnd is able to handle immnd restarting.

However, if immnd fails after OpenSAF is upgraded but before the node is 
rebooted, then amfnd will 'deadlock' trying to reload immnd's configuration but 
immnd is not available!



