Hi Arne,

Alright. Any idea what the crash might be related to?

I’ve pushed the marty/call_frames branch now. As mentioned, something breaks 
when precompiled bytecode is decoded, so many testsuite tests will segfault 
(since they are precompiled).

Compiling with --with-mc-stack-frames and running the very nice
Debug.generate_perf_map() (previously implemented by TobiJ) should give perf
what it needs to produce proper stack traces. I’ve used
https://github.com/jrfonseca/gprof2dot and
http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html for visualisation.
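
For anyone wanting to sanity-check a generated map: perf's convention for
JIT symbol maps is a text file /tmp/perf-<pid>.map with one
"START SIZE name" line per function, where START and SIZE are hexadecimal.
Below is a minimal Python sketch of the address-to-name lookup such a file
allows, assuming Debug.generate_perf_map() follows that convention. The
sample entries and the helper names parse_perf_map and resolve are made up
for illustration and are not part of Pike or perf.

```python
import bisect


def parse_perf_map(text):
    """Parse perf map text into a sorted list of (start, size, name)."""
    entries = []
    for line in text.splitlines():
        # The symbol name is everything after the second field,
        # so it may itself contain spaces.
        parts = line.strip().split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, size, name = parts
        entries.append((int(start, 16), int(size, 16), name))
    entries.sort()
    return entries


def resolve(entries, addr):
    """Return the name of the symbol covering addr, or None."""
    starts = [entry[0] for entry in entries]
    # Index of the last entry whose start address is <= addr.
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0:
        start, size, name = entries[i]
        if addr < start + size:
            return name
    return None


# Hypothetical map contents, for illustration only.
sample = """\
7f0000001000 40 Foo.bar
7f0000001040 100 Foo.baz with spaces
"""
entries = parse_perf_map(sample)
```

Calling resolve(entries, addr) with an instruction address taken from a
sample gives the covering function name, or None when the address falls
outside every mapped range.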

/Marty

> On 21 Feb 2017, at 20:31 , Arne Goedeke <[email protected]> wrote:
> 
> Hi Marty,
> 
> thanks!
> 
> Yes, low_mega_apply still needs to be refactored. It is slightly more
> "complicated" because of APPLY_STACK, where the return value will
> overwrite the function on the stack. I want to fix the last crash in the
> testsuite before refactoring that. If you are interested in working on
> those, just let me know so we don't both do it ;)
> 
> Adding more perf support would be great, do you have your code in a
> branch somewhere? I would be interested to have a look at it.
> 
> Arne
> 
> On 02/20/17 23:47, Martin Karlgren wrote:
>> Hi Arne,
>> 
>> That’s awesome!
>> 
>> I’d love to help (with the limited spare time I have). I guess
>> low_mega_apply should be refactored to make use of the new API too?
>> 
>> Speaking of faster calls, I’ve incidentally been poking around a bit with 
>> machine code function calling conventions lately. For profiling purposes 
>> (i.e. Linux perf) I’ve added minimal call frame information to Pike 
>> functions in the amd64 machine code generator. I’ve gotten to the point 
>> where I can start Roxen and get proper stack traces from perf, but the 
>> testsuite still fails – it seems related to decoding of dumped bytecode, and 
>> I haven’t been able to sort out why.
>> Anyways, the good thing is that readymade visualisation tools built on perf 
>> output can be used to profile Pike code, and the interaction between Pike 
>> code and C functions is more apparent.
>> Examples from a very simple Roxen site being hit by apachebench:
>> http://marty.se/dotgraph.png (nodes with a “perf-17628.map” header
>> represent Pike functions)
>> http://marty.se/flamegraph.svg (time on horizontal axis, stack depth on
>> vertical axis).
>> 
>> Hopefully this can be used to weed out where we should start looking for 
>> optimisation candidates eventually.
>> 
>> /Marty
>> 
> 
