- **Type**: defect --> enhancement
- **Milestone**: 4.6.2 --> 5.0.FC
- **Comment**:
Changed to "enhancement". We do not treat this problem as a bug, since the
cluster reboots anyway once the SCs are gone.
---
** [tickets:#1647] clm: Incorrect error code handling in processing node_up due
to null ip**
**Status:** assigned
**Milestone:** 5.0.FC
**Created:** Thu Dec 17, 2015 12:12 AM UTC by Minh Hon Chau
**Last Updated:** Thu Dec 17, 2015 12:12 AM UTC
**Owner:** Minh Hon Chau
In proc_node_up_msg(), where clmd handles node_up from node_agent, clmd dumps
core if the node_up arrives without an ip attached. This happens, for reasons
not yet understood, in the test case that simulates the SCs being gone by
disabling tipc for the resilience feature.
'#0 proc_node_up_msg (cb=<optimized out>, evt=0x44006550) at clms_evt.c:369
'#1 0x00000000004051c5 in process_api_evt (evt=0x44006550) at clms_evt.c:1333
'#2 0x0000000000408910 in clms_process_mbx (mbx=<optimized out>) at
clms_evt.c:1373
'#3 0x00000000004042ee in main (argc=<optimized out>, argv=<optimized out>) at
clms_main.c:499
(gdb) bt full
'#0 proc_node_up_msg (cb=<optimized out>, evt=0x44006550) at clms_evt.c:369
nodeup_info = 0x440065a0
node = 0x65f820
nodeid = 131855
rc = 1
node_name = {length = 36, value =
"safNode=PL-3,safCluster=myClmCluster", '\000' <repeats 219 times>}
clm_msg = {next = 0x1, evt_type = CLMSV_CLMS_TO_CLMA_API_RESP_MSG, info
= {api_info = {type = CLMSV_CLUSTER_JOIN_REQ, param = {
init = {version = {releaseCode = 36 '$', majorVersion = 0
'\000', minorVersion = 115 's'}}, finalize = {
client_id = 1634926628}, track_start = {client_id =
1634926628, flags = 102 'f', sync_resp = 78 'N'}, track_stop = {
client_id = 1634926628}, node_get = {client_id = 1634926628,
node_id = 1685016166}, node_get_async = {
client_id = 1634926628, inv = 8299064482983853413, node_id =
1816356449}, clm_resp = {client_id = 1634926628,
resp = 1685016166, inv = 8299064482983853413}, nodeup_info =
{node_id = 1634926628, node_name = {length = 20070,
value = "ode=PL-3,safCluster=myClmCluster", '\000' <repeats
223 times>}}}}, cbk_info = {client_id = 7,
type = CLMSV_NODE_ASYNC_GET_CBK, param = {track = {buf_info =
{viewNumber = 7237089327836233764,
numberOfItems = 1280327013, notification =
0x657473756c436661}, mem_num = 2037202290, inv = 125780070987116,
root_cause_ent = 0x0, cor_ids = 0x0, step = 0, time_super =
0, err = 0}, node_get = {err = 1634926628,
inv = 8299064482983853413, info = {nodeId = 1816356449,
nodeAddress = {family = (SA_CLM_AF_INET | unknown: 1702130548),
length = 15730, value = "myClmCluster", '\000' <repeats
51 times>}, nodeName = {length = 0,
value = '\000' <repeats 255 times>}, executionEnvironment
= {length = 0,
value = '\000' <repeats 65 times>,
"\271n\243\264\017\020\000\000\000\000\000\000\000\000\000Pe\000D\000\000\000\000\r\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\200k\252\277\177",
'\000' <repeats 19 times>,
"+\301#@\000\000\000\000\001\000\000\000\000\000\000\000(\000\000\000\060\000\000\000`j\252\277\177\000\000\000\240i\252\277\177\000\000\000\314\271d\000\000\000\000\000V\366$@\000\000\000\000\200j\252\277\177\000\000\000\001\000\000\000\000\000\000\000"...},
member = (unknown: 6529616),
bootTimestamp = 1078576288, initialViewNumber =
1076151337}}}}, api_resp_info = {type = CLMSV_CLUSTER_JOIN_RESP,
rc = SA_AIS_OK, param = {client_id = 1634926628, node_get =
{nodeId = 1634926628, nodeAddress = {
family = (SA_CLM_AF_INET6 | unknown: 1685016164), length =
15717,
value = "PL-3,safCluster=myClmCluster", '\000' <repeats 35
times>}, nodeName = {length = 0,
value = '\000' <repeats 255 times>}, executionEnvironment =
{length = 0,
value = '\000' <repeats 81 times>,
"\271n\243\264\017\020\000\000\000\000\000\000\000\000\000Pe\000D\000\000\000\000\r\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\200k\252\277\177",
'\000' <repeats 19 times>,
"+\301#@\000\000\000\000\001\000\000\000\000\000\000\000(\000\000\000\060\000\000\000`j\252\277\177\000\000\000\240i\252\277\177\000\000\000\314\271d\000\000\000\000\000V\366$@\000\000\000\000"...},
member = (unknown: 2741942528), bootTimestamp = 1078576288, initialViewNumber
= 6529616},
inv = 7237089327836233764, track = {notify_info =
0x646f4e6661730024, num = 15717}, node_name = {length = 36,
value = "safNode=PL-3,safCluster=myClmCluster", '\000'
<repeats 219 times>}}}, is_member_info = {
is_member = (SA_TRUE | unknown: 6), is_configured = SA_TRUE,
client_id = 1634926628}}}
check_member = SA_FALSE
ip = 0x0
__FUNCTION__ = "proc_node_up_msg"
There is presumably an mds/tipc problem in the first place, but this ticket
only corrects the error handling in clmd to avoid the coredump. When clmd finds
a null ip, it sets the error code to SA_AIS_ERR_NOT_EXIST, but the error code is
later overwritten back to SA_AIS_OK.
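
The overwrite pattern described above can be illustrated with a minimal,
self-contained sketch. This is not the clmd source: node_up_buggy(),
node_up_fixed() and the local enum are hypothetical stand-ins, and only the
error code names are taken from the ticket. It shows how an error recorded for
a null ip is lost when a later assignment resets the return code, and one
possible way to keep it by leaving the function early.

```c
#include <stddef.h>

/* Local stand-in for the AIS error type; the real definition lives in the
 * SAF headers. Only the two codes named in the ticket are modelled here. */
typedef enum {
    SA_AIS_OK = 1,
    SA_AIS_ERR_NOT_EXIST = 12
} SaAisErrorT;

/* Hypothetical buggy shape: the null-ip error is recorded, but processing
 * continues and a later assignment silently resets rc to SA_AIS_OK. */
SaAisErrorT node_up_buggy(const char *ip)
{
    SaAisErrorT rc = SA_AIS_OK;

    if (ip == NULL) {
        rc = SA_AIS_ERR_NOT_EXIST;  /* error detected ... */
    }

    /* ... later step that unconditionally reports success */
    rc = SA_AIS_OK;                 /* ... and lost here */

    return rc;
}

/* Hypothetical fixed shape: leave the function as soon as the error is
 * known, so no later code can overwrite rc or touch the null pointer. */
SaAisErrorT node_up_fixed(const char *ip)
{
    SaAisErrorT rc = SA_AIS_OK;

    if (ip == NULL) {
        rc = SA_AIS_ERR_NOT_EXIST;
        goto done;
    }

    /* later steps that use ip would go here */

done:
    return rc;
}
```

With these stand-ins, node_up_buggy(NULL) reports SA_AIS_OK even though the ip
is missing, while node_up_fixed(NULL) returns SA_AIS_ERR_NOT_EXIST and never
reaches the code that would use the pointer.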