Hi Arne,

Great news! I can verify that Roxen seems to run just fine on it, too.

One thing I noticed is that it doesn’t seem to compile --with-debug: the 
PIKE_NEEDS_TRACE macro (defined in interpret.c) can’t be resolved in 
interpret_functions.h. I’m not sure where it should reside instead.
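
A minimal sketch of one option, assuming the macro only needs to be visible to both files: move it into a header that interpret.c and interpret_functions.h both see. Everything below is a stand-in. The macro body, `demo_interpreter`, `opcode_should_trace` and `set_trace_level` are hypothetical and are not Pike's actual definitions.

```c
#include <assert.h>

/* --- hypothetical shared header, visible to both files --- */
#ifndef DEMO_TRACE_H
#define DEMO_TRACE_H

/* Stand-in for the interpreter state; NOT Pike's real struct. */
struct demo_interpreter { int trace_level; };
static struct demo_interpreter demo_interp = { 0 };

/* Assumed macro body: tracing is needed whenever the level is non-zero. */
#define PIKE_NEEDS_TRACE() (demo_interp.trace_level > 0)

#endif /* DEMO_TRACE_H */

/* --- stand-in for code living in interpret_functions.h --- */
static int opcode_should_trace(void)
{
    /* Resolves because the macro is defined before this point. */
    return PIKE_NEEDS_TRACE() ? 1 : 0;
}

/* --- stand-in for code living in interpret.c --- */
static void set_trace_level(int level)
{
    assert(level >= 0);   /* negative levels make no sense in this sketch */
    demo_interp.trace_level = level;
}
```

The point is only the ordering: once the definition sits in a shared header, it no longer matters where in interpret.c the opcode header gets pulled in.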

I’d definitely vote for a merge to 8.1 – there’s no feature freeze in place 
yet, is there?
After the merge, I’d like to rebase/merge the call_frames branch on top of this, 
unless someone disagrees. That should enable further profiling/optimization 
iterations.

Best regards,
/Marty

> On 8 Mar 2017, at 10:33 , Arne Goedeke <[email protected]> wrote:
> 
> I think I managed to fix the last issue. I was somehow confusing things
> and removed the locals from the stack before unlinking the stack frame.
> This of course broke trampolines. I also ended up rebasing the branch to
> get rid of the reverts I did at some point.
> 
> The current state passes the testsuite (the same tests as 8.1 at least).
> Performance-wise it is roughly where 8.1 is, except for map/automap
> being significantly faster. There are some slowdowns currently, which
> are due to me removing some fast paths from the F_CALL_OTHER opcode. I
> will look into that.
> 
> I re-added most of the tracing code; however, some of it is unfinished
> and DTrace is probably broken. I have also not looked at PROFILING yet;
> that is probably not right either.
> 
> Sidenote: Profiling unfortunately does not work properly when fork()ing
> because timers change. It might even crash when running with debug mode
> because of that. But that is probably just a bug we need to fix.
> 
> What's currently left on my list before proposing to merge it into 8.1/8.3:
> 
> * Make sure the map/automap optimizations do not break in pathological
>  cases (e.g. objects being destructed or similar).
> * Maybe think about the API again (e.g. callsite_execute and
>  callsite_return could be merged; same with
>  callsite_init/callsite_set_args).
> 
> Otherwise I played around with adding frame caching to apply_array,
> which looks promising performance-wise. However, it takes some attention
> to make sure the stack traces are always correct. This would be a good
> test-case for caching frames in general.
> 
> Anyway, feedback welcome, as usual,
> 
> Arne
