Addition: building the openssl application with -lmalloc results in the same
coredump.

Maybe dlopen("libdevinfo.so.1") and -lmalloc do not work together (at least
on UltraSparcII).
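
The load / resolve / unload sequence discussed in this thread can be sketched in portable C as follows. This is only an illustration under stated assumptions, not the code from crypto/sparcv9cap.c: the helper name probe_symbol is hypothetical, and it merely looks up a symbol instead of calling any libdevinfo function. On Solaris (and older glibc) it would need to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of OpenSSL): load a shared object at run
 * time, look up one symbol, and release the object again.
 *
 * Returns  1 if the library loaded and the symbol resolved,
 *          0 if the library loaded but the symbol is missing,
 *         -1 if the library could not be loaded at all.
 *
 * Passing NULL as libname asks dlopen() for a handle to the running
 * program itself together with the libraries it was started with.
 */
int probe_symbol(const char *libname, const char *symname)
{
    void *handle = dlopen(libname, RTLD_LAZY);
    int found;

    if (handle == NULL)
        return -1;

    found = (dlsym(handle, symname) != NULL);

    /* dlclose() drops one reference; the object is only unloaded once
     * no other reference (e.g. a link-time dependency) remains. */
    dlclose(handle);

    return found ? 1 : 0;
}
```

On a Solaris system, a call such as probe_symbol("libdevinfo.so.1", "di_init") would exercise the same library that is named in the thread. Whether that alone reproduces the crash is not known; the sketch does not control which malloc()/free() implementation the process ends up using.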



Kees



> -----Original Message-----
> From: Kees Dekker
> Sent: Tuesday, 24 August, 2010 14:25
> To: '[email protected]'
> Cc: [email protected]
> Subject: RE: [openssl.org #2321] bug report: core dump on
> OPENSSL_cpuid_setup() on Solaris 10 with a Sun Enterprise 450 system
>

> > -----Original Message-----
> > From: Andy Polyakov via RT [mailto:[email protected]]
> > Sent: Monday, 23 August, 2010 17:23
> > To: Kees Dekker
> > Cc: [email protected]
> > Subject: Re: [openssl.org #2321] bug report: core dump on
> > OPENSSL_cpuid_setup() on Solaris 10 with a Sun Enterprise 450 system
> >
> > Hi,

> >

> > >>> The 32-bit of openSSL 1.0.0a (solaris-sparcv9-cc configuration)
> > >>> coredumps upon initialization. The stack trace is (of our product
> > >>> binary):
> > >> Does reference to your product binary mean that apps/openssl does not
> > >> crash? In other words does 'make test' pass? If so, then the question
> > >> is how come? Try to truss your application and compare it to truss
> > >> output for 'apps/openssl version'... Try to single-step your
> > >> application in debugger...

> > >

> > > The OpenSSL and cURL libs are built on a different system (e.g. Sun
> > > Fire V440 or Sun Fire T200). The (old) system where the crash occurs
> > > is a test system and is not equipped with a compiler (similar to the
> > > customer situation). So (re)building on this test system, or running
> > > make test, is not possible.

> >

> > But you can still copy apps/openssl binary to this old system and invoke
> > it, say with 'version' command-line argument... If it doesn't crash,
> > then the question would be what is special about *your* application and
> > what can be done about it.

> 

> I tried it, and it worked (to my surprise):
> # OPENSSL_CONF=ssl/openssl.cnf ./openssl
> OpenSSL> version
> OpenSSL 1.0.0a 1 Jun 2010
> OpenSSL> quit

> >

> > >>> #0  0xff360c90 in free_unlocked () from /usr/lib/libmalloc.so.1
> > >>> #1  0xff360b78 in free () from /usr/lib/libmalloc.so.1
> > >>> #2  0x007107a4 in OPENSSL_cpuid_setup ()
> > >>> #3  0x00791784 in ?? ()
> > >>> #4  0x00791784 in ?? ()
> > >> Note that OPENSSL_cpuid_setup does not call free() (see
> > >> crypto/sparcv9cap.c). It does call dlopen("libdevinfo.so.1",RTLD_LAZY)
> > >> and dlclose(h) though... As well as some functions from libdevinfo... I
> > >> mean chances are that root of the problem lies outside
> > >> OPENSSL_cpuid_setup... It's easy to verify by setting
> > >> OPENSSL_sparcv9cap environment variable (value of 3 is appropriate for
> > >> USII) prior starting your application.

> > >

> > > I saw that free() is not called (I checked the source code as well).
> > > Using OPENSSL_sparcv9cap=3 works well: skipping the dlopen by setting
> > > the OPENSSL_sparcv9cap environment variable solved the problem... So
> > > I'm not 100% sure that the problem is outside OPENSSL_cpuid_setup().
> > > But I also can't explain why this problem did not exist on our
> > > other/newer (V440/T200) systems. These two are not UltraSparc-II, but
> > > UltraSparc-IIIi/UltraSparc-T1 respectively.
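
The workaround described here, where a value taken from the environment replaces hardware probing (and therefore the dlopen call), can be sketched in C as below. This is a minimal illustration, not OpenSSL's implementation: the function names cap_parse and cap_from_env are hypothetical, and the flag values are assumptions chosen to agree with the 1 -> 3 -> 7 progression of OPENSSL_sparcv9cap_P in the gdb session quoted in this thread.

```c
#include <stdlib.h>

/* Assumed flag values (illustrative only): they agree with the
 * 1 -> 3 -> 7 sequence printed by gdb in this thread. */
#define CAP_TICK_PRIVILEGED (1UL << 0)
#define CAP_PREFER_FPU      (1UL << 1)
#define CAP_VIS1            (1UL << 2)

/* Convert the textual override into a capability mask.
 * Base 0 accepts decimal, octal (0...) and hexadecimal (0x...) input. */
unsigned long cap_parse(const char *text)
{
    return strtoul(text, NULL, 0);
}

/*
 * Hypothetical stand-in for the first step of a capability probe:
 * if the named environment variable is set, store its value in *cap
 * and return 1 so the caller can skip any further probing.
 * Returns 0 and leaves *cap untouched when the variable is not set.
 */
int cap_from_env(const char *varname, unsigned long *cap)
{
    const char *e = getenv(varname);

    if (e == NULL)
        return 0;

    *cap = cap_parse(e);
    return 1;
}
```

With these assumed values, an override of 3 corresponds to CAP_TICK_PRIVILEGED | CAP_PREFER_FPU with CAP_VIS1 clear. A caller would check cap_from_env("OPENSSL_sparcv9cap", &cap) first and fall through to run-time detection only when it returns 0.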

> >

> > I meant "outside" as "in code beyond my control", such as function in
> > vendor-supplied libraries, e.g. libdevinfo.so or libdl.so. I'd guess,
> > and truss log suggests that, crash occurs in dlclose (note that it
> > closed /devices/pseudo/devi...@0:devinfo's file descriptor, meaning that
> > di_walk_node succeeded). Try to comment out dlclose and see if it
> > helps...

> 

> Here is some more detailed stack info (gdb, with OpenSSL built using
> debug-solaris-sparcv9-cc instead of solaris-sparcv9-cc).
> The result is not a real debug build without optimization, but the
> optimization level is far lower (-O instead of -xO5).

> 

> *******gdb output***********
> 99                              OPENSSL_sparcv9cap_P |= SPARCV9_PREFER_FPU;
> (gdb) p OPENSSL_sparcv9cap_P
> $1 = 1
> (gdb) n
> 102             if (sysinfo(SI_ISALIST,si,sizeof(si))>0)
> (gdb) p OPENSSL_sparcv9cap_P
> $2 = 3
> (gdb) n
> 104                     if (strstr(si,"+vis"))
> (gdb) n
> 106                     if (strstr(si,"+vis2"))
> (gdb) n
> 105                             OPENSSL_sparcv9cap_P |= SPARCV9_VIS1;
> (gdb) n
> 106                     if (strstr(si,"+vis2"))
> (gdb) p OPENSSL_sparcv9cap_P
> $3 = 7
> (gdb) n
> 109                             OPENSSL_sparcv9cap_P &= ~SPARCV9_TICK_PRIVILEGED;
> (gdb) n
> 114             if ((h = dlopen("libdevinfo.so.1",RTLD_LAZY))) do
> (gdb) p OPENSSL_sparcv9cap_P
> $4 = 7
> (gdb) n
> 122                     if (!DLLINK(h,di_init))         break;
> (gdb) p OPENSSL_sparcv9cap_P
> $5 = 7
> (gdb) n
> 114             if ((h = dlopen("libdevinfo.so.1",RTLD_LAZY))) do
> (gdb) n
> 122                     if (!DLLINK(h,di_init))         break;
> (gdb) n
> 123                     if (!DLLINK(h,di_fini))         break;
> (gdb) n
> 124                     if (!DLLINK(h,di_walk_node))    break;
> (gdb) n
> 125                     if (!DLLINK(h,di_node_name))    break;
> (gdb) n
> 127                     if ((root_node = (*di_init)("/",DINFOSUBTREE))!=DI_NODE_NIL)
> (gdb) n
> 130                                             di_node_name,walk_nodename);
> (gdb) n
> 131                             (*di_fini)(root_node);
> (gdb) n
>
> Program received signal SIGSEGV, Segmentation fault.
> 0xff380c90 in free_unlocked () from /usr/lib/libmalloc.so.1
>
> Program received signal SIGSEGV, Segmentation fault.
> 0xff380c90 in free_unlocked () from /usr/lib/libmalloc.so.1
> (gdb) bt
> #0  0xff380c90 in free_unlocked () from /usr/lib/libmalloc.so.1
> #1  0xff380b78 in free () from /usr/lib/libmalloc.so.1
> #2  0x00707804 in OPENSSL_cpuid_setup () at /vobs/obj.dbg.SOL10/thirdparty/OpenSSL/32bit/openssl-1.0.0a/crypto/sparcv9cap.c:131
> #3  0x007845cc in ?? ()
> warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)
>
> #4  0x007845cc in ?? ()
> warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)
>
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> *******end of gdb output***********

> 

> The call to di_fini() triggers free(), which causes a SIGSEGV.
>
> I can't really prove it, but one of the differences between the openssl
> application and ours is that ours is linked with -lmalloc. Maybe
> dlopen(libdevinfo.so) conflicts with -lmalloc, since -lc also
> contains free/malloc (but not mallinfo(), which we use).

> 

> # ldd openssl
>         libsocket.so.1 =>        /lib/libsocket.so.1
>         libnsl.so.1 =>   /lib/libnsl.so.1
>         libdl.so.1 =>    /lib/libdl.so.1
>         libc.so.1 =>     /lib/libc.so.1
>         libmp.so.2 =>    /lib/libmp.so.2
>         libmd.so.1 =>    /lib/libmd.so.1
>         libscf.so.1 =>   /lib/libscf.so.1
>         libdoor.so.1 =>  /lib/libdoor.so.1
>         libuutil.so.1 =>         /lib/libuutil.so.1
>         libgen.so.1 =>   /lib/libgen.so.1
>         libm.so.2 =>     /lib/libm.so.2
>         /platform/SUNW,Ultra-4/lib/libc_psr.so.1
>         /platform/SUNW,Ultra-4/lib/libmd_psr.so.1

> 

> # ldd kshell6.2.new
>         libmalloc.so.1 =>        /usr/lib/libmalloc.so.1
>         libm.so.2 =>     /lib/libm.so.2
>         libsocket.so.1 =>        /lib/libsocket.so.1
>         libnsl.so.1 =>   /lib/libnsl.so.1
>         libdl.so.1 =>    /lib/libdl.so.1
>         libw.so.1 =>     /lib/libw.so.1
>         librt.so.1 =>    /lib/librt.so.1
>         libthread.so.1 =>        /lib/libthread.so.1
>         libpam.so.1 =>   /lib/libpam.so.1
>         libCstd.so.1 =>  /usr/lib/libCstd.so.1
>         libCrun.so.1 =>  /usr/lib/libCrun.so.1
>         libc.so.1 =>     /lib/libc.so.1
>         libmp.so.2 =>    /lib/libmp.so.2
>         libmd.so.1 =>    /lib/libmd.so.1
>         libscf.so.1 =>   /lib/libscf.so.1
>         libaio.so.1 =>   /lib/libaio.so.1
>         libcmd.so.1 =>   /lib/libcmd.so.1
>         libdoor.so.1 =>  /lib/libdoor.so.1
>         libuutil.so.1 =>         /lib/libuutil.so.1
>         libgen.so.1 =>   /lib/libgen.so.1
>         /usr/lib/cpu/sparcv8plus/libCstd_isa.so.1
>         /platform/SUNW,Ultra-4/lib/libc_psr.so.1
>         /platform/SUNW,Ultra-4/lib/libmd_psr.so.1

> 

> # ldd /usr/lib/libdevinfo.so
>         libnvpair.so.1 =>        /lib/libnvpair.so.1
>         libsec.so.1 =>   /lib/libsec.so.1
>         libc.so.1 =>     /lib/libc.so.1
>         libgen.so.1 =>   /lib/libgen.so.1
>         libnsl.so.1 =>   /lib/libnsl.so.1
>         libavl.so.1 =>   /lib/libavl.so.1
>         libmp.so.2 =>    /lib/libmp.so.2
>         libmd.so.1 =>    /lib/libmd.so.1
>         libscf.so.1 =>   /lib/libscf.so.1
>         libdoor.so.1 =>  /lib/libdoor.so.1
>         libuutil.so.1 =>         /lib/libuutil.so.1
>         libm.so.2 =>     /lib/libm.so.2
>         /platform/SUNW,Ultra-4/lib/libc_psr.so.1
>         /platform/SUNW,Ultra-4/lib/libmd_psr.so.1

> 

> 

> >

> > Another possible workaround is to explicitly link your application with
> > -ldevinfo. In this case dlopen/dlclose would only increment/decrement
> > reference counter, but not actually do anything upon dlclose. A.

> >

> 

> Explicitly linking with -ldevinfo did not help: same core dump at the same
> place. I am still thinking about a mismatch between malloc() (from libc.so)
> and free() (from libmalloc.so). I don't know how to solve this. It may point
> to a Solaris issue on UltraSparcII (the problem did not exist on my
> newer/less old UltraSparc systems).

> 

> KD



______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [email protected]
