Another tidbit:
If I load libCstd first in my Java program, the crash goes away, even when
running with libumem as the LD_PRELOAD.
echo $LD_PRELOAD
libumem.so.1
cat foo.java
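public class foo
{
    public static void main(String[] args) {
        try
        {
            // Loading libCstd before libeMatrix avoids the crash.
            System.loadLibrary("Cstd");
            System.loadLibrary("eMatrix");
        }
        catch (UnsatisfiedLinkError e)  // loadLibrary fails with an Error, not an Exception
        {
            System.out.println("exception: " + e.toString());
        }
    }
}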
java -cp . foo
no crash...
comment out the line with Cstd
java -cp . foo
Abort (core dumped)
-----Original Message-----
From: Jonathan Adams [mailto:jonathan.adams at sun.com]
Sent: Monday, March 06, 2006 7:12 PM
To: Milliken, Clark
Cc: David McDaniel (damcdani); Rod.Evans at sun.com;
tools-linking at opensolaris.org
Subject: Re: [tools-linking] shared lib using libCstd under java
On Mon, Mar 06, 2006 at 06:49:35PM -0500, Milliken, Clark wrote:
> This is from a different machine/build than the one I previously gave
> pstack and pmap output from...this environment is a bit more complex as
> there are a few more shared libs involved, but it's the same symptom and
> it's what I have available to me at the moment.
>
> I assume that you only want the dump of the address in the free call...
Yup; so somehow we are deleting (which calls free()) a totally invalid
pointer; I've run your mdb output through c++filt:
> > ::umem_status
> Status: ready and active
> Concurrency: 4
> Logs: transaction=128k
> Message buffer:
> free(f1fd2860): invalid or corrupted buffer
> stack trace:
> libumem.so.1'free+0x54
> libCrun.so.1'void operator delete(void*)+0x4
> libCstd_isa.so.1'std::string &std::string::operator=(const std::string &)+0xa8
> libCstd.so.1'std::string *__rwstd::locale_vector<std::string>::resize(unsigned,std::string )+0xbc
> libCstd.so.1'__rwstd::locale_imp::locale_imp #Nvariant 1(unsigned,unsigned)+0xb0
> libCstd.so.1'void std::locale::init()+0x40
> libCstd.so.1'std::istream::basic_istream(std::ios_base::EmptyCtor)+0x70
> libCstd.so.1'?? (0xef875020)
> libCstd.so.1'?? (0xef876108)
> libCstd.so.1'_init+0x1e0
> ld.so.1'?? (0xff3c0254)
> ld.so.1'?? (0xff3c56d4)
> ld.so.1'?? (0xff3c5818)
> ld.so.1'dlopen+0xac
> libhpi.so'?? (0xfed52344)
> libjvm.so'JVM_LoadLibrary+0xe8
> libjava.so'Java_java_lang_ClassLoader_00024NativeLibrary_load+0xe8
> ?? (0xfac0bbc8)
> ?? (0xfac05c64)
> ?? (0xfac05a44)
> ?? (0xfac05c64)
> ?? (0xfac05c64)
> ?? (0xfac05c64)
> ?? (0xfac00118)
> libjvm.so'?? (0xfe8d4b48)
> libjvm.so'?? (0xfe8ecaf8)
> libjvm.so'?? (0xfe988d3c)
> java'main+0x1560
> java'_start+0x108
>
>
> > f1fd2860::whatis
> f1fd2860 is unknown
This means that umem has no idea where this pointer comes from.
> > f1fd2860-8,20::dump
> 0 1 2 3 4 5 6 7 \/ 9 a b c d e f 01234567v9abcdef
> f1fd2850: f092aa40 f092c138 f0929ad8 00000001 ...@...8........
> f1fd2860: 00000000 00000000 00000000 00000000 ................
> f1fd2870: 00000000 00000000 ffffffff 00000000 ................
This doesn't look like a umem buffer at all; the two words just before the
buffer are f0929ad8 00000001. In a umem buffer the first is typically a small
number and the second a large one, and the two XOR together to 0x3a10C000.
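To make that check concrete, here is a rough sketch in Java (the language of the test program) that applies the relation exactly as I described it: take the two 32-bit words in front of the buffer and see whether they XOR to 0x3a10C000. The class and method names are made up for illustration, and the constant and the XOR relation come from this message, not from reading the libumem sources.

```java
// Sketch only: the XOR relation and the 0x3a10C000 constant are taken from
// this message, not from the libumem sources. Names are hypothetical.
final class UmemTagCheck {
    static final int MALLOC_MAGIC = 0x3a10C000;

    // True when the two 32-bit words preceding a buffer XOR to the magic value.
    static boolean looksLikeUmemTag(int first, int second) {
        return (first ^ second) == MALLOC_MAGIC;
    }

    public static void main(String[] args) {
        // The words at f1fd2858 and f1fd285c in the dump.
        int first = 0xf0929ad8;
        int second = 0x00000001;
        System.out.printf("%08x ^ %08x = %08x, umem tag: %b%n",
                first, second, first ^ second, looksLikeUmemTag(first, second));
    }
}
```

For the words in the dump the XOR comes out as f0929ad9, nowhere near the magic value, which is why free() rejects the buffer.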
looking up the address in the core file:
> pmap of this core:
>
> core 'core' of 9412: java -cp . foo
> 00010000 40K r-x-- /usr/j2se/bin/java
> 00028000 8K rwx-- /usr/j2se/bin/java
> 0002A000 2200K rwx-- [ heap ]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this is where umem-managed storage will be.
> EF780000 1376K r-x-- /usr/lib/libCstd.so.1
> EF8E6000 32K rwx-- /usr/lib/libCstd.so.1
> EF8EE000 8K rwx-- /usr/lib/libCstd.so.1
> EF900000 3464K r-x-- /usr/local/galaxy-2.3X/lib/libvgalaxy-unicode.so.7
> EFC70000 152K rwx-- /usr/local/galaxy-2.3X/lib/libvgalaxy-unicode.so.7
> EFC96000 80K rwx-- /usr/local/galaxy-2.3X/lib/libvgalaxy-unicode.so.7
> EFD00000 1976K r-x-- /home/matrix/xerces/xerces-c-src2_1_0/lib/libxerces-c.so.21.0
> EFEFC000 648K rwx-- /home/matrix/xerces/xerces-c-src2_1_0/lib/libxerces-c.so.21.0
> EFF9E000 8K rwx-- /home/matrix/xerces/xerces-c-src2_1_0/lib/libxerces-c.so.21.0
> F0000000 32040K r-x-- /home/matrix/ds-10.7.0.0/MXD-10700/bin/solaris4/libeMatrix.so
> F1F58000 960K rwx--
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this is where the buffer is.
> F2048000 24K rwx--
> F2100000 560K r-x-- /usr/openwin/lib/libX11.so.4
> F219C000 24K rwx-- /usr/openwin/lib/libX11.so.4
...
So it looks like someone is deleting a random pointer.
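The address lookup is just a range check against the pmap rows: a mapping covers [start, start + size). A rough Java sketch of it follows, with a handful of rows copied from the pmap output; the class name, method name, and labels are invented for illustration, and only those few rows are included.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: a few rows copied from the pmap output; names are hypothetical.
final class SegmentLookup {
    static final class Segment {
        final long start;
        final long sizeKb;
        final String label;

        Segment(long start, long sizeKb, String label) {
            this.start = start;
            this.sizeKb = sizeKb;
            this.label = label;
        }

        // A mapping covers [start, start + size).
        boolean contains(long addr) {
            return addr >= start && addr < start + sizeKb * 1024L;
        }
    }

    static final List<Segment> SEGMENTS = Arrays.asList(
            new Segment(0x0002A000L, 2200, "[ heap ]"),
            new Segment(0xEF780000L, 1376, "/usr/lib/libCstd.so.1 (r-x)"),
            new Segment(0xF0000000L, 32040, "libeMatrix.so (r-x)"),
            new Segment(0xF1F58000L, 960, "unnamed rwx mapping at F1F58000"),
            new Segment(0xF2048000L, 24, "unnamed rwx mapping at F2048000"));

    // Returns the label of the first mapping containing addr, or "unmapped".
    static String find(long addr) {
        for (Segment s : SEGMENTS) {
            if (s.contains(addr)) {
                return s.label;
            }
        }
        return "unmapped";
    }

    public static void main(String[] args) {
        // The pointer passed to free() in the ::umem_status output.
        System.out.println(find(0xf1fd2860L));
    }
}
```

The heap mapping ends at 0x250000 (0x2A000 plus 2200K), so f1fd2860 is far outside it; it falls in the 960K rwx mapping that starts at F1F58000 and ends at F2048000.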
Could you send the output of 'pargs -e core' as well?
Cheers,
- jonathan
> Thanks,
>
> Clark
>
>
> -----Original Message-----
> From: Jonathan Adams [mailto:jonathan.adams at sun.com]
> Sent: Monday, March 06, 2006 6:14 PM
> To: Milliken, Clark
> Cc: David McDaniel (damcdani); Rod.Evans at sun.com;
> tools-linking at opensolaris.org
> Subject: Re: [tools-linking] shared lib using libCstd under java
>
> On Mon, Mar 06, 2006 at 06:06:36PM -0500, Milliken, Clark wrote:
> > Optimized builds crash consistently with or without libumem. Debug
> > builds only crash with libumem. It's the same stack for both...
>
> Alright; what is the output of:
>
> % mdb core
> ...
> > ::umem_status
>
>
> Also, look at the ::umem_status output; take the hex number between the
> parentheses, and do:
>
> > number::whatis
> > number-8,20::dump
>
> With the output from all of that, I can tell you why umem is barfing.
>
> Cheers,
> - jonathan
>
> > -----Original Message-----
> > From: Jonathan Adams [mailto:jonathan.adams at sun.com]
> > Sent: Monday, March 06, 2006 6:05 PM
> > To: David McDaniel (damcdani)
> > Cc: Rod.Evans at sun.com; Milliken, Clark;
> > tools-linking at opensolaris.org
> > Subject: Re: [tools-linking] shared lib using libCstd under java
> >
> > On Mon, Mar 06, 2006 at 02:55:13PM -0800, David McDaniel (damcdani)
> > wrote:
> > > IIRC, the original traceback showed the abort due to umem determining
> > > a deallocation was a duplicate or invalid buffer. Preloading libCstd
> > > would result in umem being usurped by libc, would it not? And thus the
> > > presumed double free isn't caught. Maybe the umem noabort flag might be
> > > worth trying, without the libCstd preload.
> >
> > The umem noabort flag should only be used as a last resort; there is
> > still an unresolved problem here.
> >
> > Does the program crash without libumem?
> >
> > Cheers,
> > - jonathan
> >
> > > > -----Original Message-----
> > > > From: tools-linking-bounces at opensolaris.org
> > > > [mailto:tools-linking-bounces at opensolaris.org] On Behalf Of Rod Evans
> > > > Sent: Monday, March 06, 2006 4:33 PM
> > > > To: Milliken, Clark
> > > > Cc: tools-linking at opensolaris.org
> > > > Subject: Re: [tools-linking] shared lib using libCstd under java
> > > >
> > > > Milliken, Clark wrote:
> > > > > Removing the -Bsymbolic option has no effect.
> > > > >
> > > > > I created a test prog that just does a:
> > > > >
> > > > > dlopen(...,RTLD_NOW | RTLD_GLOBAL) and that works fine.
> > > > >
> > > > > Another tid-bit...
> > > > >
> > > > > If I LD_PRELOAD=libCstd.so.1, my crash goes away.
> > > >
> > > > You don't happen to have two different versions of libCstd
> > > > being loaded in the process do you? What does a pmap of the
> > > > core file reveal?
> > > >
> > > > If not, then one of the C++/iostreams experts on this alias
> > > > will hopefully provide more information.
> > > >
> > > > --
> > > > Rod
> > > > _______________________________________________
> > > > tools-linking mailing list
> > > > tools-linking at opensolaris.org
> > > >
> >
> > --
> > Jonathan Adams, Solaris Kernel Development
>
> --
> Jonathan Adams, Solaris Kernel Development
--
Jonathan Adams, Solaris Kernel Development