Corresponding issue: https://github.com/JuliaLang/julia/issues/12840

On Monday, August 31, 2015 at 9:07:47 AM UTC-5, Mahesh Waidande wrote:
>
> Hi All,
>
>
> I am working on building/porting Julia on the ppc64le architecture. I am 
> using Ubuntu 14.10 on ppc64le hardware. While compiling the Julia code 
> (master branch) I was getting a segmentation fault, which I was able to 
> work around by turning on the ‘MEMDEBUG’ flag in the ‘src/options.h’ 
> file.
>
>
> I decided to work more on this issue and try to find the root cause of 
> the segmentation fault, so I started studying the memory management of 
> Julia. I have a couple of questions regarding Julia's memory management 
> that I want to discuss here.
>
> 1. The ‘REGION_PG_COUNT’ macro is defined using the value 4096. I want to 
> know what the significance of 4096 is.
>    If 4096 indicates the page size, then this code is valid and works 
> fine on the amd64/x86_64 architecture, where the page size is 4k, but it 
> may behave abnormally on PPC64, where the page size is 64k. Basically, I 
> want to discuss the impact of a large page size on the Julia code and what 
> other things I need to take into consideration while porting Julia to 
> PPC64le.
>
> 2. For the past few days I have been working on understanding the memory 
> management scheme of Julia. I find it a difficult and time-consuming 
> process, though I have had some success. I want to know whether there is 
> any official or unofficial document that would help me understand it.
>
> Any suggestions/pointers on the above-mentioned points are much 
> appreciated.
>
> -Mahesh  
>
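
On question 1 above: a minimal sketch in C, not taken from the Julia source, of how a port could query the OS page size at run time rather than assuming 4096, and how many allocator pages then share one OS page. `ALLOC_PAGE_SZ` and the three helper names are hypothetical and chosen only for illustration; `sysconf(_SC_PAGESIZE)` is the standard POSIX call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical allocator page size; NOT taken from the Julia source. */
#define ALLOC_PAGE_SZ ((size_t)4096)

/* Ask the kernel for the real page size instead of assuming 4096. */
static size_t os_page_size(void)
{
    long sz = sysconf(_SC_PAGESIZE);
    assert(sz > 0);
    return (size_t)sz;
}

/* Round addr down to the start of the OS page that contains it.
   os_page must be a power of two. */
static uintptr_t os_page_floor(uintptr_t addr, size_t os_page)
{
    assert(os_page != 0 && (os_page & (os_page - 1)) == 0);
    return addr & ~((uintptr_t)os_page - 1);
}

/* Number of allocator pages that share one OS page (at least 1). */
static size_t alloc_pages_per_os_page(size_t os_page)
{
    return os_page > ALLOC_PAGE_SZ ? os_page / ALLOC_PAGE_SZ : 1;
}
```

Under these assumptions, a 4k system gives one allocator page per OS page, while a 64k system gives 16, so any code that releases memory per allocator page has to account for its 15 neighbours.
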
> On Tue, Aug 18, 2015 at 8:11 PM, Jameson Nash <vtj...@gmail.com> wrote:
>
>> It is a considerable performance impact to run with MEMDEBUG, but 
>> otherwise has no side-effects. It is not necessary to run with this flag in 
>> production (and probably not helpful either, since you wouldn't have a 
>> debugger attached).
>>
>>
>> On Tue, Aug 18, 2015 at 9:53 AM Mahesh Waidande <mahesh.wa...@gmail.com> 
>> wrote:
>>
>>> Hi Jameson,
>>>
>>> Thanks for explaining memory allocation on PPC and providing pointers 
>>> on resolving the segmentation fault. The pointers are really helpful and 
>>> I am working on them. I am able to compile the Julia master branch after 
>>> turning on the ‘MEMDEBUG’ flag in the options.h file; compilation went 
>>> smoothly and I am able to see the Julia prompt. I will continue to work 
>>> on finding the root cause of the segmentation fault, which occurs during 
>>> Julia initialization.
>>>
>>>
>>> I think turning on the ‘MEMDEBUG’ flag will reduce the performance of 
>>> Julia a bit, since with MEMDEBUG no memory pools are used and every 
>>> allocation is treated as big.
>>>
>>>
>>> Apart from the performance issue, I have a few questions that I would 
>>> like to discuss:
>>> 1. Apart from the performance hit, is any other functionality impacted 
>>> by turning on the ‘MEMDEBUG’ flag? In other words, what are the side 
>>> effects of turning it on?
>>> 2. Should I use this setting (MEMDEBUG turned on) in a production 
>>> environment or in release mode?
>>>
>>>
>>>
>>> -Mahesh 
>>>
>>> On Fri, Aug 14, 2015 at 10:03 PM, Jameson Nash <vtj...@gmail.com> wrote:
>>>
>>>> It's a JIT copy of a julia function named "new". The last time this 
>>>> error popped up, it was due to an error in the free_page function logic 
>>>> that computes whether it is safe to free the current page (since PPC 
>>>> uses large pages). One place to check, then, is to ensure the invalid 
>>>> pointer hadn't accidentally been deleted by an madvise(DONTNEED) for an 
>>>> unrelated page free operation.
>>>>
>>>> Beyond that, I would suggest trying with `MEMDEBUG` turned on in 
>>>> options.h (which will also disable the `free_page` function).
>>>>
>>>> Also, when you have gdb running, there are many more useful things to 
>>>> print than just the backtrace. For starters, I would suggest looking at 
>>>> `disassemble` and `info registers`. Also, go `up` on the stack trace and 
>>>> look at `jl_(f->linfo)`, 
>>>> `jl_(jl_uncompress_ast(f->linfo, f->linfo->ast))`, and 
>>>> `jl_(args[0])` / `jl_(args[1])`.
>>>>
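
The "is it safe to free the current page" check described above can be illustrated with a minimal C sketch. This is not the actual `free_page` implementation: `region_t`, `mark_free_and_check`, `ALLOC_PAGE_SZ` and `MAX_PAGES` are hypothetical names, and the sketch assumes the region starts on an OS page boundary. The idea is that an OS page may be handed back (for example with `madvise(..., MADV_DONTNEED)`) only once every allocator page that shares it is free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sizes; NOT taken from the Julia source. */
#define ALLOC_PAGE_SZ ((size_t)4096)
#define MAX_PAGES     256

/* One flag per allocator page in a region that is assumed to start
   on an OS page boundary. */
typedef struct {
    bool is_free[MAX_PAGES];
} region_t;

/* Mark allocator page idx as free. Return true only when every
   allocator page sharing the same OS page is now free, i.e. when
   releasing that OS page would not discard live data. */
static bool mark_free_and_check(region_t *r, size_t idx, size_t os_page)
{
    size_t per_os = os_page > ALLOC_PAGE_SZ ? os_page / ALLOC_PAGE_SZ : 1;
    size_t first = idx - idx % per_os;  /* first allocator page in the OS page */

    assert(idx < MAX_PAGES && first + per_os <= MAX_PAGES);
    r->is_free[idx] = true;
    for (size_t i = first; i < first + per_os; i++) {
        if (!r->is_free[i])
            return false;  /* a neighbour is still in use: do not release */
    }
    return true;
}
```

With 4k allocator pages on a 64k OS page, the function returns true only after all 16 neighbouring pages have been marked free. Releasing the OS page earlier would zero pages that are still live, which is one way an otherwise valid pointer could end up pointing at discarded memory.
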
>>>>
>>>> On Fri, Aug 14, 2015 at 9:07 AM Mahesh Waidande 
>>>> <mahesh.wa...@gmail.com> wrote:
>>>>
>>>>> Hi All, 
>>>>>
>>>>> I am working on building/porting Julia on the ppc64le architecture. I 
>>>>> am using Ubuntu 14.10 on ppc64le hardware. While compiling the Julia 
>>>>> code (master branch) I am getting a segmentation fault. I tried to 
>>>>> debug it with tools like gdb/vgdb, valgrind, electric-fence, etc., but 
>>>>> I am not able to find the root cause. I need some 
>>>>> help/pointers/suggestions on how to resolve it.
>>>>>
>>>>> Here are some details that will help you diagnose the problem:
>>>>>
>>>>> 1. Machine details : 
>>>>> $ uname -a
>>>>> Linux pts00433-vm1 3.16.0-30-generic #40-Ubuntu SMP Mon Jan 12 
>>>>> 22:07:11 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
>>>>> $
>>>>>
>>>>> 2. Snapshot of ‘make debug’ log 
>>>>> make[1]: Leaving directory '/home/test/Mahesh/julia/julia/base'
>>>>> make[1]: Entering directory '/home/test/Mahesh/julia/julia'
>>>>>  cd base && /home/test/Mahesh/julia/julia/usr/bin/julia-debug -C 
>>>>> native --output-ji 
>>>>> /home/test/Mahesh/julia/julia/usr/lib/julia/inference0.ji -f coreimg.jl
>>>>>
>>>>>   Electric Fence 2.2 Copyright (C) 1987-1999 Bruce Perens <br...@perens.com>
>>>>> Segmentation fault
>>>>> Makefile:175: recipe for target 
>>>>> '/home/test/Mahesh/julia/julia/usr/lib/julia/inference0.ji' failed
>>>>> make[1]: *** 
>>>>> [/home/test/Mahesh/julia/julia/usr/lib/julia/inference0.ji] Error 139
>>>>> make[1]: Leaving directory '/home/test/Mahesh/julia/julia'
>>>>> Makefile:64: recipe for target 'julia-inference' failed
>>>>> make: *** [julia-inference] Error 2
>>>>>
>>>>> 3. gdb stack trace
>>>>> test@pts00433-vm1:~/Mahesh/julia/julia/base$ gdb --args 
>>>>> /home/test/Mahesh/julia/julia/usr/bin/julia-debug -C native --output-ji 
>>>>> /home/test/Mahesh/julia/julia/usr/lib/julia/inference0.ji -f coreimg.jl
>>>>> GNU gdb (Ubuntu 7.8-1ubuntu4) 7.8.0.20141001-cvs
>>>>> Copyright (C) 2014 Free Software Foundation, Inc.
>>>>> License GPLv3+: GNU GPL version 3 or later <
>>>>> http://gnu.org/licenses/gpl.html>
>>>>> This is free software: you are free to change and redistribute it.
>>>>> There is NO WARRANTY, to the extent permitted by law.  Type "show 
>>>>> copying"
>>>>> and "show warranty" for details.
>>>>> This GDB was configured as "powerpc64le-linux-gnu".
>>>>> Type "show configuration" for configuration details.
>>>>> For bug reporting instructions, please see:
>>>>> <http://www.gnu.org/software/gdb/bugs/>.
>>>>> Find the GDB manual and other documentation resources online at:
>>>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>>> For help, type "help".
>>>>> Type "apropos word" to search for commands related to "word"...
>>>>> Reading symbols from 
>>>>> /home/test/Mahesh/julia/julia/usr/bin/julia-debug...done.
>>>>> (gdb) b repl.c:532
>>>>> Breakpoint 1 at 0x10003a34: file repl.c, line 532.
>>>>> (gdb) r
>>>>> Starting program: /home/test/Mahesh/julia/julia/usr/bin/julia-debug -C 
>>>>> native --output-ji 
>>>>> /home/test/Mahesh/julia/julia/usr/lib/julia/inference0.ji -f coreimg.jl
>>>>> [Thread debugging using libthread_db enabled]
>>>>> Using host libthread_db library 
>>>>> "/lib/powerpc64le-linux-gnu/libthread_db.so.1".
>>>>>
>>>>>   Electric Fence 2.2 Copyright (C) 1987-1999 Bruce Perens <br...@perens.com>
>>>>>
>>>>> Breakpoint 1, main (argc=7, argv=0x3ffffffff478) at repl.c:533
>>>>> 533     {
>>>>> (gdb) c
>>>>> Continuing.
>>>>>
>>>>> Program received signal SIGSEGV, Segmentation fault.
>>>>> 0x00003fffb6970078 in julia.new_0 ()
>>>>> (gdb) where
>>>>> #0  0x00003fffb6970078 in julia.new_0 ()
>>>>> #1  0x00003fffb6b3b820 in jl_apply (f=0x3ffd9ac1de10, 
>>>>> args=0x3fffffffde28, nargs=2) at julia.h:1263
>>>>> #2  0x00003fffb6b4137c in jl_trampoline (F=0x3ffd9ac1de10, 
>>>>> args=0x3fffffffde28, nargs=2) at builtins.c:979
>>>>> #3  0x00003fffb6b2b084 in jl_apply (f=0x3ffd9ac1de10, 
>>>>> args=0x3fffffffde28, nargs=2) at julia.h:1263
>>>>> #4  0x00003fffb6b328d0 in jl_apply_generic (F=0x3ffd9ac1dd90, 
>>>>> args=0x3fffffffde28, nargs=2) at gf.c:1675
>>>>> #5  0x00003fffb6c2d9a0 in jl_apply (f=0x3ffd9ac1dd90, 
>>>>> args=0x3fffffffde28, nargs=2) at julia.h:1263
>>>>> #6  0x00003fffb6c2e014 in do_call (f=0x3ffd9ac1dd90, 
>>>>> args=0x3ffd9ac215a8, nargs=2, eval0=0x0, locals=0x0, nl=0, ngensym=0)
>>>>>     at interpreter.c:65
>>>>> #7  0x00003fffb6c2eec4 in eval (e=0x3ffd9ac1ddd0, locals=0x0, nl=0, 
>>>>> ngensym=0) at interpreter.c:212
>>>>> #8  0x00003fffb6c2dc20 in jl_interpret_toplevel_expr 
>>>>> (e=0x3ffd9ac1ddd0) at interpreter.c:27
>>>>> #9  0x00003fffb6c55eac in jl_toplevel_eval_flex (e=0x3ffd9ac1ddb0, 
>>>>> fast=1) at toplevel.c:524
>>>>> #10 0x00003fffb6c56260 in jl_parse_eval_all (fname=0x3fffb7950158 
>>>>> "boot.jl", len=8) at toplevel.c:574
>>>>> #11 0x00003fffb6c56510 in jl_load (fname=0x3fffb7950158 "boot.jl", 
>>>>> len=8) at toplevel.c:614
>>>>> #12 0x00003fffb6c3cf58 in _julia_init (rel=JL_IMAGE_JULIA_HOME) at 
>>>>> init.c:1107
>>>>> #13 0x00003fffb6c3f38c in julia_init (rel=JL_IMAGE_JULIA_HOME) at 
>>>>> task.c:252
>>>>> #14 0x0000000010003af8 in main (argc=1, argv=0x3ffffffff4a8) at 
>>>>> repl.c:601
>>>>> (gdb) q
>>>>> A debugging session is active.
>>>>>
>>>>>         Inferior 1 [process 26906] will be killed.
>>>>>
>>>>> Quit anyway? (y or n) y
>>>>> test@pts00433-vm1:~/Mahesh/julia/julia/base$
>>>>>
>>>>> 4. The segmentation fault occurs during Julia initialization. During 
>>>>> initialization Julia compiles some .jl files, and the segmentation 
>>>>> fault occurs while compiling ‘int.jl’.
>>>>>
>>>>> I extracted the above information by inspecting the Julia code and 
>>>>> attaching valgrind to the julia-debug binary.
>>>>> $ valgrind -v --vgdb=yes --vgdb-error=0 --leak-check=full 
>>>>> --show-leak-kinds=all --log-file=valgrind-test.log 
>>>>> /home/test/Mahesh/julia/julia/usr/bin/julia-debug -C native --output-ji 
>>>>> /home/test/Mahesh/julia/julia/usr/lib/julia/inference0.ji -f coreimg.jl
>>>>>
>>>>>   Electric Fence 2.2 Copyright (C) 1987-1999 Bruce Perens <br...@perens.com>
>>>>> essentials.jl
>>>>> reflection.jl
>>>>> options.jl
>>>>> promotion.jl
>>>>> tuple.jl
>>>>> range.jl
>>>>> expr.jl
>>>>> error.jl
>>>>> bool.jl
>>>>> number.jl
>>>>> int.jl
>>>>>
>>>>> signal (11): Segmentation fault
>>>>> $
>>>>>
>>>>>
>>>>> I have a couple of questions/doubts:
>>>>>
>>>>> a.) When I search for ‘julia.new_0 ()’ [gdb - frame 0] in the Julia 
>>>>> source and its dependencies, I am not able to find it. My guess is that 
>>>>> Julia creates this function at run time for initialization. I would 
>>>>> like to hear your comments/suggestions on how to debug this, where I 
>>>>> need to look, or a quick word on the initialization process.
>>>>>
>>>>> b.) When I try to step into the jl_apply() [gdb - frame 1] function 
>>>>> (with the step/s command of gdb), I get the segmentation fault and 
>>>>> cannot step into the function. So my question is: how do I validate the 
>>>>> address ‘0x3ffd9ac1de10’, or the contents present at ‘0x3ffd9ac1de10’? 
>>>>> I assume ‘0x3ffd9ac1de10’ is a virtual address, because I see the same 
>>>>> address in the stack trace every time.
>>>>>
>>>>>
>>>>> c.) Is this segmentation fault a cascading effect of something going 
>>>>> wrong at the pre-initialization stage? If I want to put a check on it, 
>>>>> where should I put it? Is there any specific code snippet?
>>>>>
>>>>> d.) I observe strange behavior while compiling the Julia source. For 
>>>>> debugging purposes I inserted some printf statements in main() [gdb - 
>>>>> frame 14]. This did some trick: I did not get any segmentation fault, 
>>>>> the code compiled smoothly and I got the Julia prompt. But when I 
>>>>> remove the printf statements altogether, I observe the segmentation 
>>>>> fault again. Even a single printf statement does the trick. Any 
>>>>> comments on this?
>>>>>
>>>>>
>>>>> I have attached all the logs that I have, i.e. the make debug log, gdb 
>>>>> stack trace and valgrind log, to this mail. I am continuing to 
>>>>> investigate this; any suggestions/pointers are much appreciated.
>>>>>
>>>>> -Mahesh 
>>>>>
>>>>
>>>
>
