It's quite odd that this only happens for Java programs; based on the
stack trace you've shown, it should happen for *all* programs.

Can you print the value of the lds struct at the point where the error occurs
(line 638 of output.c, in do_open())?
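If printing the struct from inside dbx is awkward, a temporary debug helper along these lines might do the job instead. This is only a rough sketch under stated assumptions: `demo_lds_t`, `demo_dump_lds()` and `DEMO_MAX_PREFIX` are hypothetical names, not Open MPI code, and the only member taken from your trace is `lds_prefix` (the verbosity field is there purely for illustration). It checks for NULL and bounds the read of the prefix, so the dump itself should not run off the end of a bad string.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the stream descriptor seen in do_open().
 * Only lds_prefix comes from the dbx output; the other field is an
 * assumption made for illustration. */
typedef struct {
    int lds_verbose_level;   /* assumed field */
    const char *lds_prefix;  /* member passed to strdup() at output.c:638 */
} demo_lds_t;

/* Upper bound on how many prefix bytes the dump will read. */
#define DEMO_MAX_PREFIX 64

/* Format a one-line dump of the struct into buf.
 * Never dereferences a NULL struct or a NULL prefix, and reads at most
 * DEMO_MAX_PREFIX bytes of the prefix. Returns snprintf()'s result. */
int demo_dump_lds(const demo_lds_t *lds, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    if (lds == NULL) {
        return snprintf(buf, len, "lds=(null)");
    }
    if (lds->lds_prefix == NULL) {
        return snprintf(buf, len, "verbose=%d prefix=(null)",
                        lds->lds_verbose_level);
    }
    return snprintf(buf, len, "verbose=%d prefix=\"%.*s\"",
                    lds->lds_verbose_level,
                    DEMO_MAX_PREFIX, lds->lds_prefix);
}
```

Calling something like this just before the failing strdup() and writing the buffer to stderr would show whether the prefix pointer is NULL, empty, or pointing at unexpected data.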


On Jul 25, 2014, at 2:29 AM, Siegmar Gross 
<siegmar.gr...@informatik.hs-fulda.de> wrote:

> Hi,
> 
> I have installed openmpi-1.8.2rc2 with Sun C 5.12 on Solaris
> 10 SPARC and x86_64, and I receive a segmentation fault when I
> run a small Java program.
> 
> rs0 java 105 mpiexec -np 1 java InitFinalizeMain
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0xffffffff7ea3c830, pid=18363, tid=2
> #
> # JRE version: Java(TM) SE Runtime Environment (8.0-b132) (build 1.8.0-b132)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b70 mixed mode solaris-sparc compressed oops)
> # Problematic frame:
> # C  [libc.so.1+0x3c830]  strlen+0x50
> ...
> 
> 
> I get the following output if I run the program in "dbx".
> 
> ...
> RTC: Running program...
> Write to unallocated (wua) on thread 1:
> Attempting to write 1 byte at address 0xffffffff79f04000
> t@1 (l@1) stopped in _readdir at 0xffffffff56574da0
> 0xffffffff56574da0: _readdir+0x0064:    call     _PROCEDURE_LINKAGE_TABLE_+0x2380 [PLT] ! 0xffffffff56742a80
> Current function is find_dyn_components
>  397                       if (0 != lt_dlforeachfile(dir, save_filename, NULL)) {
> (dbx) 
> 
> 
> I get the following output if I run the program on Solaris 10
> x86_64.
> 
> ...
> RTC: Running program...
> Reading disasm.so
> Read from uninitialized (rui) on thread 1:
> Attempting to read 1 byte at address 0x437387
>    which is 15 bytes into a heap block of size 16 bytes at 0x437378
> This block was allocated from:
>        [1] vasprintf() at 0xfffffd7fdc9b335a 
>        [2] asprintf() at 0xfffffd7fdc9b3452 
>        [3] opal_output_init() at line 184 in "output.c"
>        [4] do_open() at line 548 in "output.c"
>        [5] opal_output_open() at line 219 in "output.c"
>        [6] opal_malloc_init() at line 68 in "malloc.c"
>        [7] opal_init_util() at line 258 in "opal_init.c"
>        [8] opal_init() at line 363 in "opal_init.c"
> 
> t@1 (l@1) stopped in do_open at line 638 in file "output.c"
>  638           info[i].ldi_prefix = strdup(lds->lds_prefix);
> (dbx) 
> 
> 
> Hopefully the above output helps to fix the errors. Can I provide
> anything else? Thank you very much in advance for any help.
> 
> 
> Kind regards
> 
> Siegmar
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: http://www.open-mpi.org/community/lists/users/2014/07/24870.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
