Hi Quincy,

I'll be pulling pieces out of the large C++ project into a small test C program to see whether the segfault can be reproduced in a manageable example; if that works, I'll send it to you.
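Roughly what I have in mind for that reproducer (just a sketch -- the packet-table name, chunk size, and data here are placeholders, not the real project values):

#include <stdio.h>
#include "hdf5.h"
#include "hdf5_hl.h"

int main(void)
{
    hid_t file_id, ptable;
    int   values[3] = {1, 2, 3};

    file_id = H5Fcreate("job_test.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    /* one HL packet table standing in for the many HL operations the real job does */
    ptable = H5PTcreate_fl(file_id, "/packets", H5T_NATIVE_INT, (hsize_t)128, -1);
    H5PTappend(ptable, 3, values);
    H5PTclose(ptable);

    /* same open-object check the project does before closing */
    printf("open objects (ALL): %ld\n",
           (long)H5Fget_obj_count(file_id, H5F_OBJ_ALL));

    H5Fclose(file_id);   /* this is where the hang/segfault shows up in the big code */
    return 0;
}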

MemoryScape and gdb (from NetBeans) don't show any memory issues in our library code or in HDF5. MemoryScape can't expand through the H5SL_REMOVE macro, so in another working copy I'm rewriting a copy of it as a function so the debugger can step into it.

On 12/07/2010 05:20 PM, Quincey Koziol wrote:
Hi Roger,

On Dec 7, 2010, at 2:06 PM, Roger Martin wrote:

Further:

Debugging with MemoryScape:
Reveals a segfault in H5SL.c (1.8.5) at line 1068:

            H5SL_REMOVE(SCALAR, slist, x, const haddr_t, key, -)    /* H5SL_TYPE_HADDR case */

The stack trace is:
H5SL_remove                     1068
H5C_flush_single_entry      7993
H5C_flush_cache                1395
H5AC_flush                          941
H5F_flush                           1673
H5F_dest                              996
H5F_try_close                     1900
H5F_close                           1750
H5I_dec_ref                         1490
H5F_close                           1951

I'll be adding printouts to see which variable/pointer is causing the segfault (see
the sketch below the frame dump).  The MemoryScape frame shows:
..............
Stack Frame
Function "H5SL_remove":
  slist:                       0x0b790fc0 (Allocated) ->  (H5SL_t)
  key:                         0x0b9853f8 (Allocated Interior) ->  
0x000000000001affc (110588)
Block "$b8":
  _last:                       0x0b772270 (Allocated) ->  (H5SL_node_t)
  _llast:                      0x0001affc ->  (H5SL_node_t)
  _next:                       0x0b9855c0 (Allocated) ->  (H5SL_node_t)
  _drop:                       0x0b772270 (Allocated) ->  (H5SL_node_t)
  _ldrop:                      0x0b772270 (Allocated) ->  (H5SL_node_t)
  _count:                      0x00000000 (0)
  _i:<Bad address: 0x00000000>
Local variables:
  x:<Bad address: 0x00000000>
  hashval:<Bad address: 0x00000000>
  ret_value:<Bad address: 0x00000000>
  FUNC:                        "H5SL_remove"
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
................

Some of the variables have bad addresses, such as x, which was set by
"x = slist->header;" (the head node of the skip list).
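The sort of printout I'm planning to add inside H5SL_remove(), just before the
H5SL_REMOVE macro call at line 1068 (a plain fprintf fragment; the header member
name is taken from the quoted "x = slist->header;" line):

    fprintf(stderr, "H5SL_remove: slist=%p header=%p key=%p\n",
            (void *)slist,
            slist ? (void *)slist->header : NULL,
            (void *)key);   /* cast away const for %p */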

These appear to be internal API functions, and I'm wondering how I could be
upsetting them from high-level API calls and the file interface.  What could be
left in the H5C cache when, for the file the code is trying to close,
H5Fget_obj_count(fileID, H5F_OBJ_ALL) = 1
and
H5Fget_obj_count(fileID, H5F_OBJ_DATASET | H5F_OBJ_GROUP | H5F_OBJ_DATATYPE | H5F_OBJ_ATTR) = 0?
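Concretely, the check just before the close looks like this (fileID is the hid_t
returned by H5Fcreate; I just print both counts):

    ssize_t n_all  = H5Fget_obj_count(fileID, H5F_OBJ_ALL);
    ssize_t n_objs = H5Fget_obj_count(fileID,
                         H5F_OBJ_DATASET | H5F_OBJ_GROUP |
                         H5F_OBJ_DATATYPE | H5F_OBJ_ATTR);
    printf("open handles: all=%ld, dset/grp/type/attr=%ld\n",
           (long)n_all, (long)n_objs);   /* reports 1 and 0 for this file */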
        Yes, you are correct, that shouldn't happen. :-/  Do you have a simple 
C program you can send to show this failure?

        Quincey

On 12/03/2010 11:33 AM, Roger Martin wrote:
Hi,

Using HDF5 1.8.5 and 1.8.6-pre2; OpenMPI 1.4.3 on Linux (RHEL4 and RHEL5).


In a case where the HDF5 operations aren't using MPI, but each MPI job/process
builds its own exclusive .h5 file:

The create:
currentFileID = H5Fcreate(filePath.c_str(), H5F_ACC_TRUNC, H5P_DEFAULT, 
H5P_DEFAULT);

and many file operations using the HL (high-level) APIs, including packet tables,
tables, and datasets, all complete successfully.

Then, near the end of each individual process,
H5Fclose(currentFileID);
is called but doesn't return.  A check for open objects says only one file
object is open and no other objects (group, dataset, etc.).  No other software or
process is acting on this .h5 file; it is named exclusively for the one job it is
associated with.

This isn't an attempt at parallel HDF5 under MPI.  In another scenario, parallel
HDF5 is working the collective way just fine.  This issue is for people who don't
have, or don't want, a parallel file system, so I made a coarse-grained MPI driver
to run independent jobs for them.  Each job has its own .h5 file opened with
H5Fcreate(filePath.c_str(), H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
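Roughly how each process ends up with its own file (the rank-based name here is
just illustrative; the real filePath comes from the job configuration):

#include <stdio.h>
#include <mpi.h>
#include "hdf5.h"

static hid_t open_job_file(void)
{
    int  rank;
    char filePath[64];

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    snprintf(filePath, sizeof(filePath), "job_%04d.h5", rank);

    /* serial HDF5: default (non-MPI-IO) file access property list */
    return H5Fcreate(filePath, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
}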

Where should I look?

I'll try to make a small example test case for show and tell.






_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

