OK, I think I fixed the small-io problem and the mkdir problem.
That only leaves the mounting problem. I've never attempted to build the kernel interface or mount the file system (being the old goat that I am) so that might take a bit.

I'll commit the changes I made and you can run them in the next nightly to see if anything new pops up.

Walt

Robert Latham wrote:
On Tue, Aug 29, 2006 at 04:55:06PM -0400, Walter B. Ligon III wrote:

So, I would appreciate some help running some tests on the branch, while I start documenting, and let me know when you think I should start merging it back with the trunk. Or I'm open to whatever other suggestions ...


OK, walt, we're getting close.  I committed a couple small fixes to
get pvfs2-client-core building.  Here's what's not working so well
right now:

- mounting pvfs2 fails with a timeout

- many MPI-IO workloads pass, but the noncontig test triggered a
  segfault in small_io_cleanup, where it cleans up various fields in
  the sm_p structure.  In particular, 'sm_p->msgarray = NULL' caused a
  core dump, and when I look at that core file in gdb,
  sm_p->msgarray_count is really high (135950228).  Looks like maybe
  the sm_p wasn't properly allocated? I dunno, I'm just the messenger.

- pvfs2-cp dies with a segfault when using a very small blocksize (-b
  128). Here's where gdb says the fault lies:

---------------
#0 0x0806d3d8 in small_io_completion_fn (user_args=0x80f0da8, resp_p=0xbfffb42c, index=0) at sys-small-io.sm:242
242                 fdata.server_nr = sm_p->u.io.datafile_index_array[index];
(gdb) p sm_p->u.io
$8 = {io_type = 135162104, file_req = 0x2, file_req_offset = 0, buffer = 0x0, mem_req = 0x0, io_resp_p = 0x50, flowproto_type = 17, encoding = 135206232, datafile_index_array = 0x0, datafile_count = 0, msgpair_completion_count = 81, flow_completion_count = 0, write_ack_completion_count = 0, contexts = 0x80f13d4, context_count = 135205832, total_cancellations_remaining = 0, retry_count = 135206064, stored_error_code = 3396, total_size = 9, dfile_size_array = 0x0, small_io = 0}
---------------

- test-zero-fill fails with a segfault in the same function as pvfs2-cp:

---------------
#0 0x08065149 in small_io_completion_fn (user_args=0x80e9940, resp_p=0xbfffb86c, index=0) at sys-small-io.sm:317
317         sm_p->u.io.dfile_size_array[index] = resp_p->u.small_io.bstream_size;
---------------

- pvfs2-mkdir (a test contributed by acxiom) fails with a segfault:

---------------
#0  0x080b134e in PINT_smcb_op (smcb=0x0)
    at /sandbox/robl/pvfs2-nightly/pvfs2-WALT3/src/common/misc/state-machine-fns.c:348
348         return smcb->op;
---------------
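
For what it's worth, two of the traces above end in a plain null dereference: datafile_index_array = 0x0 in the pvfs2-cp dump, and smcb=0x0 in pvfs2-mkdir. Below is a minimal sketch of the zero-initialize-then-check pattern that would turn that kind of crash into an error return. It is not the PVFS2 code: struct io_state, io_state_new() and io_state_server_nr() are hypothetical names made up for illustration, and the real fix may well be elsewhere (for example, in how the state machine frame gets set up).

```c
#include <stdlib.h>

/* Hypothetical stand-in for the I/O fields of the state machine struct;
 * the field names mirror the gdb dump, the struct itself is invented. */
struct io_state {
    int *datafile_index_array;
    int datafile_count;
    int msgarray_count;
};

/* calloc() zero-fills the allocation, so the counts start at 0 and the
 * pointers start as all-zero bits (NULL on the usual platforms) instead
 * of holding whatever was left on the heap. */
struct io_state *io_state_new(void)
{
    return calloc(1, sizeof(struct io_state));
}

/* Look up the server number for a datafile index.
 * Returns 0 and stores the result in *out on success, or -1 if the state,
 * the array, or the output pointer is missing or the index is out of range. */
int io_state_server_nr(const struct io_state *s, int index, int *out)
{
    if (s == NULL || out == NULL || s->datafile_index_array == NULL)
        return -1;
    if (index < 0 || index >= s->datafile_count)
        return -1;
    *out = s->datafile_index_array[index];
    return 0;
}
```

With a check like this, a completion function called on a half-initialized structure would report an error rather than dump core, which at least makes the bad state show up in the logs.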


So I think if you can take care of the small-io cases, that would be a
good start, as it would knock out 3 of the 5 failures.  Once WALT3
passes our nightlies, we can think about merging into HEAD.

==rob


--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University
_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
