On Mon, Mar 06, 2006 at 06:49:35PM -0500, Milliken, Clark wrote:
> This is from a different machine/build than the one I previously gave
> pstack and pmap output from...this environment is a bit more complex as
> there are a few more shared libs involved, but it's the same symptom and
> it's what I have available to me at the moment.
> 
> I assume that you only want the dump of the address in the free call...

Yup;  so somehow we are deleting (which calls free()) a totally invalid
pointer; I've run your mdb output through c++filt:

> > ::umem_status
> Status:         ready and active
> Concurrency:    4
> Logs:           transaction=128k
> Message buffer:
> free(f1fd2860): invalid or corrupted buffer
> stack trace:
> libumem.so.1'free+0x54
> libCrun.so.1'void operator delete(void*)+0x4
> libCstd_isa.so.1'std::string &std::string::operator=(const std::string &)+0xa8
> libCstd.so.1'std::string *__rwstd::locale_vector<std::string>::resize(unsigned,std::string )+0xbc
> libCstd.so.1'__rwstd::locale_imp::locale_imp #Nvariant 1(unsigned,unsigned)+0xb0
> libCstd.so.1'void std::locale::init()+0x40
> libCstd.so.1'std::istream::basic_istream(std::ios_base::EmptyCtor)+0x70
> libCstd.so.1'?? (0xef875020)
> libCstd.so.1'?? (0xef876108)
> libCstd.so.1'_init+0x1e0
> ld.so.1'?? (0xff3c0254)
> ld.so.1'?? (0xff3c56d4)
> ld.so.1'?? (0xff3c5818)
> ld.so.1'dlopen+0xac
> libhpi.so'?? (0xfed52344)
> libjvm.so'JVM_LoadLibrary+0xe8
> libjava.so'Java_java_lang_ClassLoader_00024NativeLibrary_load+0xe8
> ?? (0xfac0bbc8)
> ?? (0xfac05c64)
> ?? (0xfac05a44)
> ?? (0xfac05c64)
> ?? (0xfac05c64)
> ?? (0xfac05c64)
> ?? (0xfac00118)
> libjvm.so'?? (0xfe8d4b48)
> libjvm.so'?? (0xfe8ecaf8)
> libjvm.so'?? (0xfe988d3c)
> java'main+0x1560
> java'_start+0x108
> 
> 
> > f1fd2860::whatis
> f1fd2860 is unknown

This means that umem has no idea where this pointer comes from.

> > f1fd2860-8,20::dump
>             0 1 2 3  4 5 6 7 \/ 9 a b  c d e f  01234567v9abcdef
> f1fd2850:  f092aa40 f092c138 f0929ad8 00000001  ...@...8........
> f1fd2860:  00000000 00000000 00000000 00000000  ................
> f1fd2870:  00000000 00000000 ffffffff 00000000  ................

This doesn't look like a umem buffer at all;  the two words just before
the buffer are f0929ad8 00000001.  In a valid umem buffer the first is
typically a small number and the second a large number, and the two XOR
together to 0x3a10c000.  These don't.
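
A quick way to see the mismatch, as a rough Python sketch.  It takes the
XOR rule and the constant above at face value; the helper name is made up
for illustration and nothing here is copied from the libumem source, which
may encode the tag differently.

```python
# Hypothetical helper: the XOR rule and constant come from the description
# above, not from the libumem source.
UMEM_TAG_XOR = 0x3A10C000


def looks_like_umem_tag(first_word: int, second_word: int) -> bool:
    """True if the two 32-bit words before a buffer XOR to the expected constant."""
    return ((first_word ^ second_word) & 0xFFFFFFFF) == UMEM_TAG_XOR


# The two words at f1fd2858 and f1fd285c in the dump above.
first, second = 0xF0929AD8, 0x00000001
print(hex(first ^ second))                  # 0xf0929ad9
print(looks_like_umem_tag(first, second))   # False
```

The first word also looks far more like an address than a small size,
which points the same way.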

Looking up the address in the pmap of the core file:
> pmap of this core:
> 
> core 'core' of 9412:    java -cp . foo
> 00010000      40K r-x--  /usr/j2se/bin/java
> 00028000       8K rwx--  /usr/j2se/bin/java
> 0002A000    2200K rwx--    [ heap ]
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this is where umem-managed storage will
be.

> EF780000    1376K r-x--  /usr/lib/libCstd.so.1
> EF8E6000      32K rwx--  /usr/lib/libCstd.so.1
> EF8EE000       8K rwx--  /usr/lib/libCstd.so.1
> EF900000    3464K r-x--
> /usr/local/galaxy-2.3X/lib/libvgalaxy-unicode.so.7
> EFC70000     152K rwx--
> /usr/local/galaxy-2.3X/lib/libvgalaxy-unicode.so.7
> EFC96000      80K rwx--
> /usr/local/galaxy-2.3X/lib/libvgalaxy-unicode.so.7
> EFD00000    1976K r-x--
> /home/matrix/xerces/xerces-c-src2_1_0/lib/libxerces-c.so.21.0
> EFEFC000     648K rwx--
> /home/matrix/xerces/xerces-c-src2_1_0/lib/libxerces-c.so.21.0
> EFF9E000       8K rwx--
> /home/matrix/xerces/xerces-c-src2_1_0/lib/libxerces-c.so.21.0
> F0000000   32040K r-x--
> /home/matrix/ds-10.7.0.0/MXD-10700/bin/solaris4/libeMatrix.so
> F1F58000     960K rwx--
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this is where the buffer is.
> F2048000      24K rwx--
> F2100000     560K r-x--  /usr/openwin/lib/libX11.so.4
> F219C000      24K rwx--  /usr/openwin/lib/libX11.so.4
...


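So it looks like someone is deleting a random pointer:  f1fd2860 falls
inside the rwx mapping at F1F58000, nowhere near the [ heap ] segment
where umem-managed storage lives.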
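
For anyone who wants to repeat the lookup, here is a small Python sketch
of the range check.  The mapping table is a hand-copied subset of the pmap
output above (base address, size in KB, name); the function name is made
up for illustration, and the unnamed segments are left unnamed because the
listing does not say what backs them.

```python
# Subset of the pmap listing above: (base address, size in KB, name).
MAPPINGS = [
    (0x0002A000, 2200, "[ heap ]"),
    (0xEF780000, 1376, "/usr/lib/libCstd.so.1"),
    (0xF0000000, 32040,
     "/home/matrix/ds-10.7.0.0/MXD-10700/bin/solaris4/libeMatrix.so"),
    (0xF1F58000, 960, ""),   # no name shown in the listing
    (0xF2048000, 24, ""),    # no name shown in the listing
]


def find_mapping(addr, mappings=MAPPINGS):
    """Return the (base, size_kb, name) entry containing addr, or None."""
    for base, size_kb, name in mappings:
        if base <= addr < base + size_kb * 1024:
            return base, size_kb, name
    return None


hit = find_mapping(0xF1FD2860)
print(hex(hit[0]), hit[1])   # 0xf1f58000 960
```

The 960K segment ends exactly where the next one starts (F2048000), so
the address sits well inside it, and well outside the heap.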

Could you send the output of 'pargs -e core' as well?

Cheers,
- jonathan

> Thanks,
> 
> Clark
> 
> 
> -----Original Message-----
> From: Jonathan Adams [mailto:jonathan.adams at sun.com] 
> Sent: Monday, March 06, 2006 6:14 PM
> To: Milliken, Clark
> Cc: David McDaniel (damcdani); Rod.Evans at sun.com;
> tools-linking at opensolaris.org
> Subject: Re: [tools-linking] shared lib using libCstd under java
> 
> On Mon, Mar 06, 2006 at 06:06:36PM -0500, Milliken, Clark wrote:
> > Optimized builds crash consistently with or without libumem.  Debug
> > builds only crash with libumem.  It's the same stack for both...
> 
> Alright;  what is the output of:
> 
> % mdb core
> ...
> > ::umem_status
> 
> 
> Also, look at the ::umem_status output;  take the hex number between the
> parentheses, and do:
> 
> > number::whatis
> > number-8,20::dump
> 
> With the output from all of that, I can tell you why umem is barfing.
> 
> Cheers,
> - jonathan
> 
> > -----Original Message-----
> > From: Jonathan Adams [mailto:jonathan.adams at sun.com] 
> > Sent: Monday, March 06, 2006 6:05 PM
> > To: David McDaniel (damcdani)
> > Cc: Rod.Evans at sun.com; Milliken, Clark; tools-linking at opensolaris.org
> > Subject: Re: [tools-linking] shared lib using libCstd under java
> > 
> > On Mon, Mar 06, 2006 at 02:55:13PM -0800, David McDaniel (damcdani)
> > wrote:
> > > IIRC, the original traceback showed the abort due to umem determining
> > > a deallocation was a duplicate or invalid buffer. Preloading libCstd
> > > would result in umem being usurped by libc, would it not? And thus the
> > > presumed double free isn't caught. Maybe the umem noabort flag might be
> > > worth trying, without the libCstd preload.
> > 
> > The umem noabort flag should only be used as a last resort; there is
> > still an unresolved problem here.
> > 
> > Does the program crash without libumem?
> > 
> > Cheers,
> > - jonathan
> > 
> > > > -----Original Message-----
> > > > From: tools-linking-bounces at opensolaris.org 
> > > > [mailto:tools-linking-bounces at opensolaris.org] On Behalf Of Rod Evans
> > > > Sent: Monday, March 06, 2006 4:33 PM
> > > > To: Milliken, Clark
> > > > Cc: tools-linking at opensolaris.org
> > > > Subject: Re: [tools-linking] shared lib using libCstd under java
> > > > 
> > > > Milliken, Clark wrote:
> > > > > Removing the -Bsymbolic option has no effect.
> > > > > 
> > > > > I created a test prog that just does a:
> > > > > 
> > > > >  dlopen(...,RTLD_NOW | RTLD_GLOBAL) and that works fine.
> > > > > 
> > > > > Another tid-bit...
> > > > > 
> > > > > If I LD_PRELOAD=libCstd.so.1, my crash goes away.
> > > > 
> > > > You don't happen to have two different versions of libCstd
> > > > being loaded in the process do you?   What does a pmap of the
> > > > core file reveal?
> > > > 
> > > > If not, then one of the C++/iostreams experts on this alias 
> > > > will hopefully provide more information.
> > > > 
> > > > --
> > > > Rod
> > > > _______________________________________________
> > > > tools-linking mailing list
> > > > tools-linking at opensolaris.org
> > > > 
> > > _______________________________________________
> > > tools-linking mailing list
> > > tools-linking at opensolaris.org
> > 
> > -- 
> > Jonathan Adams, Solaris Kernel Development
> 
> -- 
> Jonathan Adams, Solaris Kernel Development

-- 
Jonathan Adams, Solaris Kernel Development
