On Tue, Sep 13, 2016 at 4:11 PM, Yichao Yu <yyc1...@gmail.com> wrote:
> I'm able to reproduce it in rr and found the issue.
>
> TL;DR the issue is at
> https://github.com/JuliaGraphics/Cairo.jl/blame/master/src/Cairo.jl#L625,
> where it passes the ownership of a cairo pointer to julia, causing a
> double free.

Which comes from your commit a year ago ;-p
https://github.com/JuliaGraphics/Cairo.jl/commit/23681dc1270882c964059d23863a88d4276f6fd8

> Here's the rough process of my debugging; I'm not really sure how to
> summarize it though....
>
> 1. It aborts in cairo's `cairo_path_destroy`, so I first compiled a cairo
>    with debug symbols to make my life easier (the function is pretty
>    short, so reading the disasm would have worked too).
> 2. It is free'ing `path->data`, so I added a watchpoint on it with
>    `watch -l path->data` and used reverse-continue to find the point of
>    assignment.
> 3. The assignment happens in cairo from a valid malloc, so `path->data`
>    isn't corrupted.
> 4. Now it takes some guessing to figure out exactly what's wrong. I'm not
>    sure how glibc stores its malloc metadata (it would help to know that),
>    so I tried the naive thing: I watched the intptr_t before the malloc
>    result (that's how julia stores the gc metadata) and ran forward. None
>    of the assignments to this location look suspicious (they are all in
>    glibc and the first hit isn't free'ing this value).
> 5. So now I tried the brute-force way. The pointer (`path->data`) I see is
>    `0x3746950`, so I simply set a conditional breakpoint to see when it's
>    free'd, with `br free if $rdi == 0x3746950`. I use rdi to get the first
>    argument since the glibc I installed doesn't have that detailed debug
>    info.
> 6. After a long run (a conditional breakpoint is really slow, which is why
>    I didn't use it first) it hits a breakpoint in the julia GC when
>    free'ing an array. The array has the same data pointer as the one in
>    question, and that's before the pointer is free'd by cairo, so
>    something is wrong with the creation of the array. Now simply watch
>    `a->data` and go back again.
>    I was lucky this time; if this hadn't worked, the next thing to try
>    would be reducing the code or running the GC more often, so that I
>    could afford to look at the code more carefully instead of just
>    catching events in the debugger.
> 7. As expected, it hits `jl_ptr_to_array`, and going up a frame it seems
>    that the caller is supplying a cairo pointer and transferring the
>    ownership, which is wrong.
>
> On Tue, Sep 13, 2016 at 3:36 PM, Yichao Yu <yyc1...@gmail.com> wrote:
>
>> On Tue, Sep 13, 2016 at 3:31 PM, Andreas Lobinger <lobing...@gmail.com>
>> wrote:
>>
>>> Hello colleague,
>>>
>>> On Tuesday, September 13, 2016 at 7:25:38 PM UTC+2, Yichao Yu wrote:
>>>>
>>>> On Tue, Sep 13, 2016 at 12:49 PM, Andreas Lobinger <lobi...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello colleagues,
>>>>>
>>>>> i'm trying to find out, why this
>>>>> ...
>>>>> fails miserably. I guess, but cannot track it down right now: There
>>>>> is something wrong in memory management of Cairo.jl that only shows
>>>>> up for objects that could have been freed long ago, and julia and
>>>>> libcairo have different concepts of invalidation.
>>>>>
>>>>> Any blog/recipe/issue that deals with GC debugging?
>>>>
>>>> It's not too different from debugging a memory issue in any other
>>>> program. It usually helps (a lot) to reproduce under rr[1]
>>>
>>> Many thanks for pointing to this. I was aware it exists but wasn't
>>> aware of their progress.
>>>
>>>> Other than that, it strongly depends on the kind of error, and I've
>>>> seen it happen due to almost all parts of the runtime, so it's really
>>>> hard to summarize.
>>>
>>> What do you mean with "happens due to almost all parts of the runtime" ?
>>
>> The general procedure is basically to catch the failure and try to
>> figure out why it got into this state. This means that you generally
>> need to trace back where a certain value is generated, which also
>> usually means that you need to trace back through a few layers of code,
>> and they might be scattered all over the place.
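
To make the ownership problem from step 7 concrete, here is a minimal C sketch. It is not the actual Cairo.jl, cairo or julia runtime code: `lib_path`, `rt_array`, `array_wrap`, `array_release` and the `own` flag are all hypothetical stand-ins for a library-owned path, a runtime array and the ownership transfer. The only point is that the buffer is freed exactly once as long as the wrapper borrows it instead of owning it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a library-owned object (think of a cairo path). */
typedef struct {
    double *data;
    size_t  len;
} lib_path;

/* Hypothetical stand-in for a runtime array that may or may not own its buffer. */
typedef struct {
    double *data;
    size_t  len;
    int     owns_data;   /* nonzero: array_release() frees the buffer */
} rt_array;

static int data_free_count = 0;   /* how many times a data buffer was freed */

void free_data(double *p) {
    data_free_count++;
    free(p);
}

lib_path *lib_path_create(size_t len) {
    lib_path *p = malloc(sizeof *p);
    if (!p) return NULL;
    p->data = calloc(len, sizeof *p->data);
    if (!p->data) { free(p); return NULL; }
    p->len = len;
    return p;
}

/* The library always frees its own buffer when the path is destroyed. */
void lib_path_destroy(lib_path *p) {
    free_data(p->data);
    free(p);
}

/* Wrap an existing buffer; `own` decides who is responsible for freeing it. */
rt_array array_wrap(double *data, size_t len, int own) {
    rt_array a = { data, len, own };
    return a;
}

/* Stand-in for the GC finalizing the array. */
void array_release(rt_array *a) {
    if (a->owns_data) free_data(a->data);
    a->data = NULL;
}

/* Borrowing wrapper: returns the number of times the buffer was freed. */
int run_borrowed(void) {
    data_free_count = 0;
    lib_path *p = lib_path_create(4);
    assert(p != NULL);
    rt_array a = array_wrap(p->data, p->len, 0);  /* own = 0: borrow only */
    a.data[0] = 1.0;          /* fine, the library has not freed it yet */
    array_release(&a);        /* does not free: the array does not own it */
    lib_path_destroy(p);      /* the single, legitimate free */
    return data_free_count;
}
```

Calling `run_borrowed()` frees the buffer once. Passing `own = 1` to `array_wrap` in the same sequence would make both `array_release` and `lib_path_destroy` free the same pointer, which is the double free seen in the debugger. If the array has to outlive the library object, copying the data into a buffer the runtime owns is the other safe option.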