- **Type**: defect --> enhancement
- **Milestone**: 4.6.2 --> 5.0.FC
- **Comment**:

Changed to "enhancement". This cannot be considered a bug, since the cluster 
reboots if the SCs are gone.



---

**[tickets:#1647] clm: Incorrect error code handling in processing node_up due to null ip**

**Status:** assigned
**Milestone:** 5.0.FC
**Created:** Thu Dec 17, 2015 12:12 AM UTC by Minh Hon Chau
**Last Updated:** Thu Dec 17, 2015 12:12 AM UTC
**Owner:** Minh Hon Chau


In function proc_node_up_msg(), where clmd handles node_up from node_agent, 
clmd dumps core if the node_up arrives without an ip attached. This somehow 
happens in the test case that simulates the SCs being gone by disabling tipc, 
for the resilience feature.


'#0  proc_node_up_msg (cb=<optimized out>, evt=0x44006550) at clms_evt.c:369
'#1  0x00000000004051c5 in process_api_evt (evt=0x44006550) at clms_evt.c:1333
'#2  0x0000000000408910 in clms_process_mbx (mbx=<optimized out>) at 
clms_evt.c:1373
'#3  0x00000000004042ee in main (argc=<optimized out>, argv=<optimized out>) at 
clms_main.c:499
(gdb) bt full
'#0  proc_node_up_msg (cb=<optimized out>, evt=0x44006550) at clms_evt.c:369
        nodeup_info = 0x440065a0
        node = 0x65f820
        nodeid = 131855
        rc = 1
        node_name = {length = 36, value = 
"safNode=PL-3,safCluster=myClmCluster", '\000' <repeats 219 times>}
        clm_msg = {next = 0x1, evt_type = CLMSV_CLMS_TO_CLMA_API_RESP_MSG, info 
= {api_info = {type = CLMSV_CLUSTER_JOIN_REQ, param = {
                init = {version = {releaseCode = 36 '$', majorVersion = 0 
'\000', minorVersion = 115 's'}}, finalize = {
                  client_id = 1634926628}, track_start = {client_id = 
1634926628, flags = 102 'f', sync_resp = 78 'N'}, track_stop = {
                  client_id = 1634926628}, node_get = {client_id = 1634926628, 
node_id = 1685016166}, node_get_async = {
                  client_id = 1634926628, inv = 8299064482983853413, node_id = 
1816356449}, clm_resp = {client_id = 1634926628, 
                  resp = 1685016166, inv = 8299064482983853413}, nodeup_info = 
{node_id = 1634926628, node_name = {length = 20070, 
                    value = "ode=PL-3,safCluster=myClmCluster", '\000' <repeats 
223 times>}}}}, cbk_info = {client_id = 7, 
              type = CLMSV_NODE_ASYNC_GET_CBK, param = {track = {buf_info = 
{viewNumber = 7237089327836233764, 
                    numberOfItems = 1280327013, notification = 
0x657473756c436661}, mem_num = 2037202290, inv = 125780070987116, 
                  root_cause_ent = 0x0, cor_ids = 0x0, step = 0, time_super = 
0, err = 0}, node_get = {err = 1634926628, 
                  inv = 8299064482983853413, info = {nodeId = 1816356449, 
nodeAddress = {family = (SA_CLM_AF_INET | unknown: 1702130548), 
                      length = 15730, value = "myClmCluster", '\000' <repeats 
51 times>}, nodeName = {length = 0, 
                      value = '\000' <repeats 255 times>}, executionEnvironment 
= {length = 0, 
                      value = '\000' <repeats 65 times>, 
"\271n\243\264\017\020\000\000\000\000\000\000\000\000\000Pe\000D\000\000\000\000\r\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\200k\252\277\177",
 '\000' <repeats 19 times>, 
"+\301#@\000\000\000\000\001\000\000\000\000\000\000\000(\000\000\000\060\000\000\000`j\252\277\177\000\000\000\240i\252\277\177\000\000\000\314\271d\000\000\000\000\000V\366$@\000\000\000\000\200j\252\277\177\000\000\000\001\000\000\000\000\000\000\000"...},
 member = (unknown: 6529616), 
                    bootTimestamp = 1078576288, initialViewNumber = 
1076151337}}}}, api_resp_info = {type = CLMSV_CLUSTER_JOIN_RESP, 
              rc = SA_AIS_OK, param = {client_id = 1634926628, node_get = 
{nodeId = 1634926628, nodeAddress = {
                    family = (SA_CLM_AF_INET6 | unknown: 1685016164), length = 
15717, 
                    value = "PL-3,safCluster=myClmCluster", '\000' <repeats 35 
times>}, nodeName = {length = 0, 
                    value = '\000' <repeats 255 times>}, executionEnvironment = 
{length = 0, 
                    value = '\000' <repeats 81 times>, 
"\271n\243\264\017\020\000\000\000\000\000\000\000\000\000Pe\000D\000\000\000\000\r\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\200k\252\277\177",
 '\000' <repeats 19 times>, 
"+\301#@\000\000\000\000\001\000\000\000\000\000\000\000(\000\000\000\060\000\000\000`j\252\277\177\000\000\000\240i\252\277\177\000\000\000\314\271d\000\000\000\000\000V\366$@\000\000\000\000"...},
 member = (unknown: 2741942528), bootTimestamp = 1078576288, initialViewNumber 
= 6529616}, 
                inv = 7237089327836233764, track = {notify_info = 
0x646f4e6661730024, num = 15717}, node_name = {length = 36, 
                  value = "safNode=PL-3,safCluster=myClmCluster", '\000' 
<repeats 219 times>}}}, is_member_info = {
              is_member = (SA_TRUE | unknown: 6), is_configured = SA_TRUE, 
client_id = 1634926628}}}
        check_member = SA_FALSE
        ip = 0x0
        __FUNCTION__ = "proc_node_up_msg"


There is presumably an mds/tipc problem in the first place, but this ticket is 
about correcting the error handling in clmd to avoid the coredump. When clmd 
finds a null ip, it sets the error code to SA_AIS_ERR_NOT_EXIST, but the error 
code is later overwritten back to SA_AIS_OK.
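
The overwrite pattern described above can be sketched roughly as follows. This is not the actual clms_evt.c code: the types, function names, and error constants (`rc_t`, `node_up_t`, `handle_node_up_buggy`, `handle_node_up_fixed`) are hypothetical stand-ins, and the `used_ip` flag only marks the point where the real handler would go on to use the ip pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for SA_AIS_OK / SA_AIS_ERR_NOT_EXIST. */
typedef enum { RC_OK = 1, RC_ERR_NOT_EXIST = 12 } rc_t;

/* Hypothetical, simplified view of a node_up message. */
typedef struct { const char *addr; } ip_info_t;
typedef struct { unsigned node_id; const ip_info_t *ip; } node_up_t;

/* Buggy pattern: the error recorded for the null ip is later lost. */
rc_t handle_node_up_buggy(const node_up_t *info, int *used_ip)
{
    rc_t rc = RC_OK;

    assert(info != NULL && used_ip != NULL);
    *used_ip = 0;

    if (info->ip == NULL)
        rc = RC_ERR_NOT_EXIST;  /* error set, but processing continues */

    /* ... unrelated checks ... */
    rc = RC_OK;                 /* unconditional assignment hides the error */

    if (rc == RC_OK)
        *used_ip = 1;           /* real code would dereference ip here */
    return rc;
}

/* Possible fix: leave the handler as soon as the null ip is detected. */
rc_t handle_node_up_fixed(const node_up_t *info, int *used_ip)
{
    rc_t rc = RC_OK;

    assert(info != NULL && used_ip != NULL);
    *used_ip = 0;

    if (info->ip == NULL) {
        rc = RC_ERR_NOT_EXIST;
        goto done;              /* skip every step that needs the ip */
    }

    *used_ip = 1;
done:
    return rc;
}
```

With a message whose `ip` is NULL, the buggy variant returns the OK code and reaches the step that would use the ip, while the fixed variant returns the error and never touches it. An early exit is only one option; guarding the later assignment so it cannot replace an existing error would work as well.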

