searching for changes
changeset:   7003:fced4b3c5341
parent:      7000:9d30fa46a7a5
user:        A V Mahesh <mahesh.va...@oracle.com>
date:        Tue Oct 13 14:35:57 2015 +0530
summary:     cpsv: review comments on changeset 6995 [#242]
 
changeset:   7004:99e62804cd2c
branch:      opensaf-4.7.x
parent:      6999:2767588ed092
user:        A V Mahesh <mahesh.va...@oracle.com>
date:        Tue Oct 13 14:36:45 2015 +0530
summary:     cpsv: review comments on changeset 6996 [#242]
 
changeset:   7005:a9be7ddf73f3
branch:      opensaf-4.6.x
parent:      7002:fa5da6b01f61
user:        A V Mahesh <mahesh.va...@oracle.com>
date:        Tue Oct 13 14:37:00 2015 +0530
summary:     cpsv: review comments on changeset 6997 [#242]
 
changeset:   7006:4e4f9241483d
branch:      opensaf-4.5.x
tag:         tip
parent:      7001:32075f5c5570
user:        A V Mahesh <mahesh.va...@oracle.com>
date:        Tue Oct 13 14:37:19 2015 +0530
summary:     cpsv: review comments on changeset 6998 [#242]


---

**[tickets:#242] cpsv: ckptnd crashed while running multi-threaded application during section iteration get next**

**Status:** fixed
**Milestone:** 4.5.2
**Created:** Thu May 16, 2013 06:31 AM UTC by A V Mahesh (AVM)
**Last Updated:** Tue Oct 13, 2015 06:24 AM UTC
**Owner:** A V Mahesh (AVM)
**Attachments:**

- [checkpoint_app1.c](https://sourceforge.net/p/opensaf/tickets/242/attachment/checkpoint_app1.c) (12.9 kB; application/octet-stream)
- [ticket242_app.c](https://sourceforge.net/p/opensaf/tickets/242/attachment/ticket242_app.c) (10.0 kB; text/plain)


from http://devel.opensaf.org/ticket/2864


The issue is seen on 64-bit SLES VMs.


There are two threads in the application, a writer thread and a reader thread.


The writer thread does the following:
1) Creates the checkpoint
2) In a loop, opens the same checkpoint in write mode, creates a section, writes 
into the section, and closes the checkpoint


The reader thread does the following:


1) In a loop, opens the checkpoint created by the writer thread, initializes a 
section iteration, reads the section returned in the section descriptor of 
iterationNext(), and closes the checkpoint
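
The shape of the two loops above can be sketched in plain C with pthreads. This is only a rough model: `fake_ckpt`, `worker_arg`, and `run_reader_writer` are hypothetical names, the checkpoint is an in-memory array rather than the real checkpoint service, and none of the OpenSAF calls used by the attached applications appear here. What it does show is that the lock is released between two "next" steps, so the set of sections can grow while an iteration is in progress.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define MAX_SECTIONS 64
#define SECTION_ID_LEN 32

/* Hypothetical in-memory stand-in for one checkpoint; not the OpenSAF API. */
struct fake_ckpt {
    pthread_mutex_t lock;
    int num_sections;
    char ids[MAX_SECTIONS][SECTION_ID_LEN];
};

struct worker_arg {
    struct fake_ckpt *ckpt;
    int iterations;
    long sections_seen; /* filled in by the reader only */
};

/* Each pass stands for: open (write mode), create section, write, close. */
static void *writer_thread(void *p)
{
    struct worker_arg *arg = p;
    struct fake_ckpt *c = arg->ckpt;
    for (int i = 0; i < arg->iterations; i++) {
        pthread_mutex_lock(&c->lock);
        if (c->num_sections < MAX_SECTIONS) {
            snprintf(c->ids[c->num_sections], SECTION_ID_LEN, "section_%d", i);
            c->num_sections++;
        }
        pthread_mutex_unlock(&c->lock);
    }
    return NULL;
}

/* Each pass stands for: open, iteration initialize, next/read until done, close. */
static void *reader_thread(void *p)
{
    struct worker_arg *arg = p;
    struct fake_ckpt *c = arg->ckpt;
    for (int i = 0; i < arg->iterations; i++) {
        int idx = 0;
        for (;;) {
            char id[SECTION_ID_LEN];
            int found = 0;
            /* The lock is dropped between steps, so sections may be added
             * by the writer while this iteration is still running. */
            pthread_mutex_lock(&c->lock);
            if (idx < c->num_sections) {
                memcpy(id, c->ids[idx], SECTION_ID_LEN);
                found = 1;
            }
            pthread_mutex_unlock(&c->lock);
            if (!found)
                break; /* analogous to "no more sections" */
            arg->sections_seen++;
            idx++;
        }
    }
    return NULL;
}

/* Runs both threads to completion. Returns the number of section
 * descriptors the reader saw, or -1 if a thread could not be started. */
long run_reader_writer(int iterations, int *sections_created)
{
    assert(sections_created != NULL);
    struct fake_ckpt ckpt;
    memset(&ckpt, 0, sizeof(ckpt));
    pthread_mutex_init(&ckpt.lock, NULL);
    struct worker_arg w = { &ckpt, iterations, 0 };
    struct worker_arg r = { &ckpt, iterations, 0 };
    pthread_t wt, rt;
    if (pthread_create(&wt, NULL, writer_thread, &w) != 0) {
        pthread_mutex_destroy(&ckpt.lock);
        return -1;
    }
    if (pthread_create(&rt, NULL, reader_thread, &r) != 0) {
        pthread_join(wt, NULL);
        pthread_mutex_destroy(&ckpt.lock);
        return -1;
    }
    pthread_join(wt, NULL);
    pthread_join(rt, NULL);
    *sections_created = ckpt.num_sections;
    pthread_mutex_destroy(&ckpt.lock);
    return r.sections_seen;
}
```

Calling `run_reader_writer(50, &created)` runs both loops 50 times. The number of sections the reader sees varies from run to run because it depends on how the two threads interleave.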


Backtrace observed:


(gdb) bt
#0 0x0000000000417606 in cpnd_proc_fill_sec_desc (pTmpSecPtr=0x0, sec_des=0x7fffa9c28530) at cpnd_proc.c:1637
#1 0x0000000000417b42 in cpnd_proc_getnext_section (cp_node=0x64a810, get_next=0x654bb0, sec_des=0x7fffa9c28530, n_secs_trav=0x7fffa9c2852c) at cpnd_proc.c:1756
#2 0x000000000040f680 in cpnd_evt_proc_ckpt_iter_getnext (cb=0x637f30, evt=0x654ba0, sinfo=0x6551f8) at cpnd_evt.c:4122
#3 0x00000000004059df in cpnd_process_evt (evt=0x654b90) at cpnd_evt.c:241
#4 0x0000000000411619 in cpnd_main_process (cb=0x637f30) at cpnd_init.c:544
#5 0x00000000004118e3 in main (argc=1, argv=0x7fffa9c28e68) at cpnd_main.c:72
(gdb) fr 2
#2 0x000000000040f680 in cpnd_evt_proc_ckpt_iter_getnext (cb=0x637f30, evt=0x654ba0, sinfo=0x6551f8) at cpnd_evt.c:4122
4122 cpnd_evt.c: No such file or directory.
in cpnd_evt.c


(gdb) p *evt
$1 = {dont_free_me = false, error = 0, type = CPND_EVT_A2ND_CKPT_ITER_GETNEXT, 
info = {initReq = {version = {releaseCode = 51 '3', 


majorVersion = 0 '\0', minorVersion = 0 '\0'}}, finReq = {client_hdl = 51}, 
openReq = {client_hdl = 51, lcl_ckpt_hdl = 11, 


ckpt_name = {length = 61664, value = 
"d\000\000\000\000\000�\202a\000\000\000\000\000\005\000\000\000\t", '\0' 
<repeats 236 times>}, 
ckpt_attrib = {creationFlags = 0, checkpointSize = 0, retentionDuration = 0, 
maxSections = 0, maxSectionSize = 0, 


maxSectionIdSize = 0}, ckpt_flags = 0, invocation = 0, timeout = 0}, closeReq = 
{client_hdl = 51, ckpt_id = 11, 


ckpt_flags = 6615264}, ulinkReq = {ckpt_name = {length = 51, 


value = 
"\000\000\000\000\000\000\v\000\000\000\000\000\000\000��d\000\000\000\000\000�\202a\000\000\000\000\000\005\000\000\000\t",
 '\0' <repeats 220 times>}}, rdsetReq = {ckpt_id = 51, reten_time = 11}, 
arsetReq = {ckpt_id = 51}, statReq = {ckpt_id = 51}, 


refCntsetReq = {no_of_nodes = 51, ref_cnt_array = {{ckpt_id = 11, ckpt_ref_cnt 
= 6615264}, {ckpt_id = 6390432, ckpt_ref_cnt = 5}, {


ckpt_id = 0, ckpt_ref_cnt = 0} <repeats 98 times>}}, sec_creatReq = {ckpt_id = 
51, lcl_ckpt_id = 11, agent_mdest = 6615264, 


sec_attri = {sectionId = 0x6182a0, expirationTime = 38654705669}, init_data = 
0x0, init_size = 0}, sec_delReq = {ckpt_id = 51, 
sec_id = {idLen = 11, id = 0x64f0e0 "section_4_1"}, lcl_ckpt_id = 6390432, 
agent_mdest = 38654705669}, sec_expset = {ckpt_id = 51, 
sec_id = {idLen = 11, id = 0x64f0e0 "section_4_1"}, exp_time = 6390432}, 
iter_getnext = {ckpt_id = 51, section_id = {idLen = 11, 


id = 0x64f0e0 "section_4_1"}, iter_id = 6390432, filter = SA_CKPT_SECTIONS_ANY, 
n_secs_trav = 9, exp_tmr = 0}, arr_ntfy = {


client_hdl = 51}, ckpt_write = {type = 51, ckpt_id = 11, lcl_ckpt_id = 6615264, 
agent_mdest = 6390432, num_of_elmts = 5, 
all_repl_evt_flag = 9, data = 0x0, seqno = 0, last_seq = 0 '\0', ckpt_sync = 
{ckpt_id = 0, lcl_ckpt_hdl = 0, client_hdl = 0, 


invocation = 0, cpa_sinfo = {to_svc = 0, dest = 0, stype = MDS_SENDTYPE_SND, 
ctxt = {length = 0 '\0', 


data = '\0' <repeats 11 times>}}, is_ckpt_open = false}}, ckpt_read = {type = 
51, ckpt_id = 11, lcl_ckpt_id = 6615264, 


agent_mdest = 6390432, num_of_elmts = 5, all_repl_evt_flag = 9, data = 0x0, 
seqno = 0, last_seq = 0 '\0', ckpt_sync = {ckpt_id = 0, 


lcl_ckpt_hdl = 0, client_hdl = 0, invocation = 0, cpa_sinfo = {to_svc = 0, dest 
= 0, stype = MDS_SENDTYPE_SND, ctxt = {


length = 0 '\0', data = '\0' <repeats 11 times>}}, is_ckpt_open = false}}, 
ckpt_sync = {ckpt_id = 51, lcl_ckpt_hdl = 11, 


client_hdl = 6615264, invocation = 6390432, cpa_sinfo = {to_svc = 5, dest = 0, 
stype = MDS_SENDTYPE_SND, ctxt = {length = 0 '\0', 


data = '\0' <repeats 11 times>}}, is_ckpt_open = false}, ckpt_read_ack = 
{ckpt_id = 51, mds_dest = 11}, ckpt_info = {error = 51, 


ckpt_id = 11, is_active_exists = 224, active_dest = 6390432, dest_cnt = 5, 
dest_list = 0x0, attributes = {creationFlags = 0, 


checkpointSize = 0, retentionDuration = 0, maxSections = 0, maxSectionSize = 0, 
maxSectionIdSize = 0}, ckpt_rep_create = false}, 


ckpt_mem_size = {ckpt_id = 51, ckpt_used_size = 11, error = 0}, ckpt_sections = 
{ckpt_id = 51, ckpt_num_sections = 11, error = 0}, 
ckpt_add = {ckpt_id = 51, mds_dest = 11, active_dest = 6615264, attributes = 
{creationFlags = 6390432, checkpointSize = 38654705669, 


retentionDuration = 0, maxSections = 0, maxSectionSize = 0, maxSectionIdSize = 
0}, ckpt_flags = 0, is_cpnd_restart = false, 


dest_cnt = 0, dest_list = 0x0}, ckpt_del = {ckpt_id = 51, mds_dest = 11}, 
ckpt_create = {ckpt_name = {length = 51, 


value = 
"\000\000\000\000\000\000\v\000\000\000\000\000\000\000��d\000\000\000\000\000�\202a\000\000\000\000\000\005\000\000\000\t",
 '\0' <repeats 220 times>}, ckpt_info = {error = 0, ckpt_id = 0, 
is_active_exists = false, active_dest = 0, dest_cnt = 0, dest_list = 0x0, 
attributes = {creationFlags = 0, checkpointSize = 0, retentionDuration = 0, 
maxSections = 0, maxSectionSize = 0, 


maxSectionIdSize = 0}, ckpt_rep_create = false}}, ckpt_destroy = {ckpt_id = 
51}, ckpt_ulink = {ckpt_id = 51}, rdset = {


ckpt_id = 51, reten_time = 11, type = 6615264}, active_set = {ckpt_id = 51, 
mds_dest = 11}, cl_ack = {error = 51}, ulink_ack = {
error = 51}, rdset_ack = {error = 51}, crset_ack = {error = 51}, arep_ack = 
{error = 51}, destroy_ack = {error = 51}, 


cpnd_restart = {ckpt_id = 51}, cpnd_restart_done = {ckpt_id = 51, mds_dest = 
11, active_dest = 6615264, attributes = {


creationFlags = 6390432, checkpointSize = 38654705669, retentionDuration = 0, 
maxSections = 0, maxSectionSize = 0, 


maxSectionIdSize = 0}, ckpt_flags = 0, is_cpnd_restart = false, dest_cnt = 0, 
dest_list = 0x0}, stat_get = {ckpt_id = 51}, 


status = {error = 51, ckpt_id = 11, status = {checkpointCreationAttributes = 
{creationFlags = 6615264, checkpointSize = 6390432, 


retentionDuration = 38654705669, maxSections = 0, maxSectionSize = 0, 
maxSectionIdSize = 0}, numberOfSections = 0, 


memoryUsed = 0}}, active_sec_creat = {ckpt_id = 51, lcl_ckpt_id = 11, 
agent_mdest = 6615264, sec_attri = {sectionId = 0x6182a0, 
expirationTime = 38654705669}, init_data = 0x0, init_size = 0}, sec_creat_rsp = 
{error = 51}, active_sec_creat_rsp = {


ckpt_id = 51, sec_id = {idLen = 11, id = 0x64f0e0 "section_4_1"}, error = 
6390432, lcl_ckpt_id = 38654705669, agent_mdest = 0}, 


sec_delete_req = {ckpt_id = 51, sec_id = {idLen = 11, id = 0x64f0e0 
"section_4_1"}, error = 6390432, lcl_ckpt_id = 38654705669, 


agent_mdest = 0}, sec_delete_rsp = {error = 51}, sec_iter_req = {ckpt_id = 51}, 
sec_exp_set = {ckpt_id = 51, sec_id = {idLen = 11, 


id = 0x64f0e0 "section_4_1"}, exp_time = 6390432}, sec_exp_rsp = {error = 51}, 
sync_req = {ckpt_id = 51, lcl_ckpt_hdl = 11, 


client_hdl = 6615264, invocation = 6390432, cpa_sinfo = {to_svc = 5, dest = 0, 
stype = MDS_SENDTYPE_SND, ctxt = {length = 0 '\0', 


data = '\0' <repeats 11 times>}}, is_ckpt_open = false}, ckpt_nd2nd_sync = 
{type = 51, ckpt_id = 11, lcl_ckpt_id = 6615264, 


agent_mdest = 6390432, num_of_elmts = 5, all_repl_evt_flag = 9, data = 0x0, 
seqno = 0, last_seq = 0 '\0', ckpt_sync = {ckpt_id = 0, 


lcl_ckpt_hdl = 0, client_hdl = 0, invocation = 0, cpa_sinfo = {to_svc = 0, dest 
= 0, stype = MDS_SENDTYPE_SND, ctxt = {


length = 0 '\0', data = '\0' <repeats 11 times>}}, is_ckpt_open = false}}, 
active_sync_rsp = {error = 51}, ckpt_nd2nd_data = {


type = 51, ckpt_id = 11, lcl_ckpt_id = 6615264, agent_mdest = 6390432, 
num_of_elmts = 5, all_repl_evt_flag = 9, data = 0x0, 
seqno = 0, last_seq = 0 '\0', ckpt_sync = {ckpt_id = 0, lcl_ckpt_hdl = 0, 
client_hdl = 0, invocation = 0, cpa_sinfo = {to_svc = 0, 


dest = 0, stype = MDS_SENDTYPE_SND, ctxt = {length = 0 '\0', data = '\0' 
<repeats 11 times>}}, is_ckpt_open = false}}, 


ckpt_nd2nd_data_rsp = {type = 51, num_of_elmts = 0, size = 11, error = 0, 
ckpt_id = 6615264, error_index = 6390432, 


from_svc = 38654705669, info = {write_err_index = 0x0, read_mapping = 0x0, 
read_data = 0x0, ovwrite_error = {error = 0}}}, 


getnext_req = {ckpt_id = 51, section_id = {idLen = 11, id = 0x64f0e0 
"section_4_1"}, iter_id = 6390432, filter = SA_CKPT_SECTIONS_ANY, 


n_secs_trav = 9, exp_tmr = 0}, ckpt_nd2nd_getnext_rsp = {ckpt_id = 51, iter_id 
= 11, error = 6615264, sect_desc = {sectionId = {


idLen = 33440, id = 0x900000005 <Address 0x900000005 out of bounds>}, 
expirationTime = 0, sectionSize = 0, sectionState = 0, 


lastUpdate = 0}, n_secs_trav = 0}, mds_info = {change = 51, dest = 11, svc_id = 
6615264, node_id = 0, role = 6390432}, tmr_info = {


type = 51, ckpt_id = 11, lcl_sec_id = 6615264, agent_dest = 6390432, write_type 
= 5, sinfo = {to_svc = 0, dest = 0, 


stype = MDS_SENDTYPE_SND, ctxt = {length = 0 '\0', data = '\0' <repeats 11 
times>}}, invocation = 0, lcl_ckpt_hdl = 0, 


cpnd_tmr = 0x0}, ckptListUpdate = {client_hdl = 51, ckpt_name = {length = 11, 


value = 
"\000\000\000\000\000\000��d\000\000\000\000\000�\202a\000\000\000\000\000\005\000\000\000\t",
 '\0' <repeats 228 times>}}}}


(gdb) p *sinfo
$2 = {to_svc = 18, dest = 566314965155865, stype = MDS_SENDTYPE_SNDRSP, ctxt = 
{length = 12 '\f', 


data = "\000\000\001\n\000\002\003\017zT@\031"}}


The issue is reproducible with the attached application
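
Frame #0 of the backtrace shows `cpnd_proc_fill_sec_desc` being entered with `pTmpSecPtr=0x0`, so the section pointer is NULL by the time it is copied into the descriptor. The sketch below illustrates one defensive pattern for that situation: check the pointer before filling the descriptor and return a "no sections" status instead. It is not the fix applied in the changesets above. The types, the status codes, and the assumption that get-next skips `n_secs_trav` already-returned sections in a linked list are all hypothetical simplifications, since the real CPND structures are not shown in this ticket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the CPND section structures. */
struct section_info {
    char id[32];
    size_t size;
    struct section_info *next;
};

struct section_desc {
    char id[32];
    size_t size;
};

/* Hypothetical status codes for this sketch only. */
enum { ITER_OK = 0, ITER_NO_SECTIONS = 1 };

/* Copies one section into a descriptor. The caller must pass a non-NULL
 * section; dereferencing NULL here is the kind of crash seen in frame #0. */
static void fill_sec_desc(const struct section_info *sec,
                          struct section_desc *out)
{
    memcpy(out->id, sec->id, sizeof(out->id));
    out->size = sec->size;
}

/* Skips the n_secs_trav sections already returned to the client, then
 * fills the descriptor for the next one. If the list is shorter than
 * expected (for example because it changed between calls), report
 * "no sections" instead of passing NULL on. */
int getnext_section(const struct section_info *head, unsigned n_secs_trav,
                    struct section_desc *out)
{
    assert(out != NULL);
    const struct section_info *cur = head;
    for (unsigned i = 0; i < n_secs_trav && cur != NULL; i++)
        cur = cur->next;
    if (cur == NULL)
        return ITER_NO_SECTIONS; /* guard: never call fill_sec_desc(NULL, ...) */
    fill_sec_desc(cur, out);
    return ITER_OK;
}
```

With a two-element list, `getnext_section(head, 0, &d)` fills `d` from the first section, while `getnext_section(head, 9, &d)` returns `ITER_NO_SECTIONS` without touching a NULL pointer.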




