On Mon, 2005-09-26 at 14:59 +0000, Ferris McCormick wrote:
> On Fri, 2005-09-16 at 11:35 -0500, Brian Barrett wrote:
> > On Sep 16, 2005, at 8:44 AM, Ferris McCormick wrote:
> >
> > > ==========================================
> > > fmccor@polylepis util [235]% ./opal_timer
> > > --> frequency: 900000000
> > > --> cycle count
> > > Slept approximately 903151189 cycles, or 1003501 us
> > > --> usecs
> > > Slept approximately 18446744073289684648 us
> > > ==========================================
> >
> > That last value means that I'm munging the upper 32 bits of the tick
> > register (it's 64 bits long). So we're not quite there yet, but
> > getting closer. I should be able to get to that today.
> >
> > The other problem is very odd. Since you're compiling in 32bit mode,
> > I'd expect us to see it on our PowerPC machines, but I haven't run into
> > that one yet. I'll try to compile without debugging and see what I can
> > see.
> >
> >
> > Brian
>
>
> Here's a little more information on the SegFault when trying
> OBJ_DESTRUCT(&verbose); in opal/util/output.c:
> First of all, verbose is of type opal_output_stream_t, and this is not
> an opal_object_t, so OBJ_DESTRUCT is calling opal_obj_run_destructors
> with an object of the wrong type (although ompi might be forcing storage
> allocation so that this call should work; I haven't worked it out).
>
> Second, on my system at least, when OBJ_DESTRUCT(&verbose) gets called,
> verbose looks like this (I have a debug fprintf to try to look at a bit
> of the verbose structure. The corresponding fprintf I put after
> OBJ_CONSTRUCT(&verbose, opal_output_stream_t); is fine.)
> ====================================
> Program received signal SIGSEGV, Segmentation fault.
> 0x7014f7d4 in opal_output_close (output_id=1883966264) at output.c:287
> 287 fprintf(stderr,"Destroying verbose, depth=%d
> \n",(/*(opal_object_t*)&*/verbose.super).obj_class->cls_depth);
> Current language: auto; currently c
> (gdb) print verbose
> $1 = {super = {obj_class = 0x0, obj_reference_count = 1},
> lds_is_debugging = false,
> lds_verbose_level = 0, lds_want_syslog = false, lds_syslog_priority =
> 0,
> lds_syslog_ident = 0x0, lds_prefix = 0x0, lds_want_stdout = false,
> lds_want_stderr = true,
> lds_want_file = false, lds_want_file_append = false, lds_file_suffix =
> 0x0}
> =====================================
> so that verbose.super.obj_class has been set to null, and no matter how
> it is supposed to work, the opal_obj_run_destructors loop:
> cls = object->obj_class;
> for(i=0; i < cls->cls_depth;i++) { ...
> is going to be working on garbage, because nothing in verbose has a
> useful obj_class element.
>
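To make the failure mode concrete, here is a minimal sketch of the pattern in the quoted loop. It is not the OPAL source: every sketch_ name is a hypothetical stand-in, the structures are cut down to the fields visible in the gdb output, and the NULL check is my own addition for illustration rather than something the real opal_obj_run_destructors is known to do. It assumes only that the stream type embeds the base object as its first member, which is what makes the pointer cast valid in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the class descriptor; only the depth is modeled. */
typedef struct sketch_class {
    int cls_depth;
} sketch_class_t;

/* Hypothetical stand-in for the base object (compare opal_object_t). */
typedef struct sketch_object {
    sketch_class_t *obj_class;
    int obj_reference_count;
} sketch_object_t;

/* Hypothetical stand-in for the stream type. Because `super` is the first
 * member, a pointer to the stream may be converted to a pointer to the
 * base object. */
typedef struct sketch_stream {
    sketch_object_t super;
    int lds_verbose_level;
} sketch_stream_t;

/* Walks the class hierarchy the way the quoted loop does, but refuses to
 * dereference a NULL obj_class. Returns the number of levels walked, or -1
 * if obj_class is NULL (the state shown in the gdb output). */
static int sketch_run_destructors(sketch_object_t *object)
{
    sketch_class_t *cls;
    int i;
    int walked = 0;

    assert(object != NULL);
    cls = object->obj_class;
    if (NULL == cls) {
        fprintf(stderr, "obj_class is NULL; not running destructors\n");
        return -1;
    }
    for (i = 0; i < cls->cls_depth; i++) {
        walked++;   /* a real implementation would call a destructor here */
    }
    return walked;
}
```

Without the guard, the loop condition reads cls->cls_depth through a null pointer, which matches the SIGSEGV above. The guard only turns the crash into a diagnostic; it does not explain why obj_class ends up NULL.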
I've looked at the structures, and I see that opal_output_stream_t is
set up so that (opal_object_t*)(&verbose) should resolve correctly, and
thus my first concern is gone.

Now, for the second: If built with --enable-debug, then when the program
finally reaches OBJ_DESTRUCT(&verbose), the obj_class pointer is
correct. Without --enable-debug, it is NULL. I'll keep looking at it,
but so far, I don't see what is going wrong.

Regards,
--
Ferris McCormick (P44646, MI) <fmc...@gentoo.org>
Developer, Gentoo Linux (Sparc, Devrel)