Re: [lldb-dev] Breakpoint + callback performance ... Can it be faster?

Jim Ingham via lldb-dev Tue, 16 Aug 2016 11:08:54 -0700

> On Aug 16, 2016, at 10:42 AM, Benjamin Dicken <bddic...@datawareventures.com> 
> wrote:
> 
> Thanks for the quick reply.
> 
> > Are you sure the actual handling of the breakpoint & callback in lldb is 
> > what is taking most of the time?
> 
> I'm not positive. I did collect some callgrind profiles to take a look at 
> where most of the time is being spent, but I'm not very familiar with lldb 
> internals so the results were hard to interpret. I did notice that there was 
> a lot of packet/network business when using lldb to profile a program (which 
> I assumed was communication between my program and lldb-server). I was not 
> sure how this affected the performance, so perhaps this is the real 
> bottleneck.


I would be pretty surprised if it was not.  We had some bugs in breakpoint 
handling, mostly related to having a very large number of breakpoints.  But 
other than that, dispatching the breakpoint StopInfo is a pretty simple, 
straightforward bit of work.
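
If you want to double-check the per-hit figure before digging into profiles, here is a minimal sketch of the arithmetic. It assumes two `perf stat -e instructions` runs of the same program that differ only in how many times the breakpoint is hit. The name `InstructionsPerHit` and its parameters are made up for illustration; they are not part of lldb or perf.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an lldb or perf API): estimates the instruction
// cost of one breakpoint hit from two runs that differ only in hit count.
// Differencing cancels the fixed cost of launching and running the program.
double InstructionsPerHit(std::uint64_t instructions_few,
                          std::uint64_t hits_few,
                          std::uint64_t instructions_many,
                          std::uint64_t hits_many) {
  // The second run must hit the breakpoint more often and cost at least as much.
  assert(hits_many > hits_few);
  assert(instructions_many >= instructions_few);
  return static_cast<double>(instructions_many - instructions_few) /
         static_cast<double>(hits_many - hits_few);
}
```

With placeholder counts (not measurements), `InstructionsPerHit(50000000, 0, 1050000000, 1000)` evaluates to 1000000.0.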

> 
> > Greg just switched to using a unix-domain socket for this communication for 
> > platforms that support it.  This speeds up the packet traffic side of 
> > things.
> 
> In what version of lldb was this introduced? I'm running 3.7.1. I'm also on 
> ubuntu 14.04, is that a supported platform?

It is only in TOT (top-of-tree) lldb; he added it just last week.  It is 
currently only turned on for OS X.

> 
> > One of the original motivations of having lldb-server be based on lldb 
> > classes - as opposed to the MacOS X version of debugserver which is an 
> > independent construct - was that you could re-use the server code to create 
> > an in-process Process plugin, eliminating a lot of this traffic & context 
> > switching when you needed maximum speed.
> 
> That sounds very interesting. Is there an example of this implementation you 
> could point me to?
> 

FreeBSD & Windows still have native Process plugins.  But they aren't used for 
the lldb-server implementation so far as I can tell (I've mostly worked on the 
OS X side.)  I think this was more of a design intent that hasn't actually been 
used anywhere yet.  But the Linux/Android folks will know better.

Jim


> 
> 
> On Tue, Aug 16, 2016 at 10:20 AM, Jim Ingham <jing...@apple.com> wrote:
> Are you sure the actual handling of the breakpoint & callback in lldb is what 
> is taking most of the time?  The last time we looked at this, the majority of 
> the work was in communicating with debugserver to get the stop notification 
> and restart.  Note, besides all the packet code, this involves context 
> switches from process->lldbserver->lldb and back, which is also pretty 
> expensive.
> 
> Greg just switched to using a unix-domain socket for this communication for 
> platforms that support it.  This speeds up the packet traffic side of things.
> 
> One of the original motivations of having lldb-server be based on lldb 
> classes - as opposed to the MacOS X version of debugserver which is an 
> independent construct - was that you could re-use the server code to create 
> an in-process Process plugin, eliminating a lot of this traffic & context 
> switching when you needed maximum speed.  The original Mac OS X lldb port 
> actually had a process plugin wholly in-process with lldb as well as the 
> debugserver based one, but there wasn't enough motivation to justify 
> maintaining the two different implementations of the same code.  I don't know 
> whether the Linux port takes advantage of this possibility, but that would be 
> something to look into.
> 
> Once we actually figure out about the stop, figuring out the breakpoint and 
> getting to its callback is pretty simple...  I doubt making "lighter weight 
> breakpoints" in particular will recover the performance you need, though if 
> your sampling turns up some inefficient algorithms have crept in, it would be 
> great to fix that.
> 
> Another option we've toyed with on and off is something like the gdb 
> "tracepoints", where you can upload instructions to the lldb-server instance 
> to perform "experiments" when a breakpoint is hit.  The work to perform the 
> experiment and the results would all be kept in the lldb-server instance till 
> a real breakpoint is hit, at which point lldb can download all the results 
> and present them to the user.  This would eliminate some of the 
> context-switches and packet traffic while you were running in the hot parts 
> of your code.  This is a decent chunk of work, however.
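
To make the idea a bit more concrete, here is a toy sketch of the bookkeeping. It assumes results are buffered on the server side and handed over only at a real stop. `ExperimentBuffer`, `Sample`, `Record` and `Flush` are hypothetical names for illustration; this is not an existing lldb-server or gdb interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; not an lldb-server or gdb API.
struct Sample {
  std::uint64_t pc;    // address where the experiment ran
  std::int64_t value;  // value the experiment collected
};

class ExperimentBuffer {
 public:
  // Called on every tracepoint-style hit: stores the result locally,
  // with no packet sent to the debugger.
  void Record(std::uint64_t pc, std::int64_t value) {
    pending_.push_back(Sample{pc, value});
  }

  // Called when a real breakpoint stops the process: hands over every
  // buffered result in a single transfer and empties the buffer.
  std::vector<Sample> Flush() {
    ++transfers_;
    std::vector<Sample> out;
    out.swap(pending_);
    return out;
  }

  std::size_t pending() const { return pending_.size(); }
  std::size_t transfers() const { return transfers_; }

 private:
  std::vector<Sample> pending_;
  std::size_t transfers_ = 0;
};
```

In this model, a thousand `Record` calls followed by one `Flush` cost a single transfer instead of a thousand stop/restart round trips, which is where the savings would come from.
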
> 
> Jim
> 
> 
> > On Aug 16, 2016, at 9:57 AM, Benjamin Dicken via lldb-dev 
> > <lldb-dev@lists.llvm.org> wrote:
> >
> > I recently started using lldb to write a basic instrumentation tool for 
> > tracking the values of variables at various code-points in a program. I've 
> > been working with lldb for less than two weeks, so I am pretty new. Though, 
> > I have used and written llvm passes in the past, so I'm familiar with the 
> > clang/llvm/lldb ecosystem.
> >
> > I have a very early prototype of the tool up and running, using the C++ 
> > API. The user can specify either an executable to run or an already-running 
> > PID to attach to. The user also supplies a file+line_number at which a 
> > breakpoint (with a callback) is placed. For testing/prototyping purposes, 
> > the breakpoint callback just increments a counter and then immediately 
> > returns false. Eventually, more interesting things will happen in this 
> > callback.
> >
> > I've noticed that just the action of hitting a breakpoint and invoking the 
> > callback is very expensive. I did some instruction-count collection by 
> > running this lldb tool on a simple test program, and placing the 
> > breakpoint+callback at different points in the program, causing it to get 
> > triggered different numbers of times. I used `perf stat -e instructions 
> > ...` to gather instruction exec counts for each run. After doing a little 
> > math, it appears that I'm incurring 1.0 - 1.1 million instruction execs per 
> > breakpoint.
> >
> > This amount of slowdown is prohibitively expensive for my needs, because I 
> > want to place callbacks in hot portions of the "inferior" program.
> >
> > Is there a way to make this faster? Is it possible to create 
> > "lighter-weight" breakpoints? I really like the lldb API (though the 
> > documentation is lacking in some places), but if this performance hit can't 
> > be mitigated, it may be unusable for me.
> >
> > For reference, this is the callback function:
> >
> > ```
> > static int cb_count = 0;
> > bool SimpleCallback (
> >     void *baton,
> >     lldb::SBProcess &process,
> >     lldb::SBThread &thread,
> >     lldb::SBBreakpointLocation &location) {
> >   //TODO: Eventually do more interesting things...
> >   cb_count++;
> >   return false;
> > }
> > ```
> >
> > And here is how I set it up to be called back:
> >
> > ```
> > // Create the breakpoint, then attach the callback with no baton.
> > lldb::SBBreakpoint bp1 = 
> > debugger_data->target.BreakpointCreateByLocation(file_name, line_no);
> > if (!bp1.IsValid()) std::cerr << "invalid breakpoint\n";
> > bp1.SetCallback(SimpleCallback, nullptr);
> > ```
> >
> > -Benjamin
> > _______________________________________________
> > lldb-dev mailing list
> > lldb-dev@lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> 
> 
> 
> 
> -- 
> Ben
