Speaking for Android Studio, I think that we *could* use a python-based implementation (hard to say exactly without knowing the details of the implementation), but I believe a different implementation could be *easier* to integrate. Plus, if the solution integrates more closely with lldb, we could surface some of the data in the command-line client as well.
pl

On 1 February 2016 at 10:30, Ravitheja Addepally <ravithejaw...@gmail.com> wrote:
> And what about the ease of integration into an IDE? I don't really know whether the Python-based approach would be usable in this context.
>
> On Mon, Feb 1, 2016 at 11:17 AM, Pavel Labath <lab...@google.com> wrote:
>>
>> It feels to me that the Python-based approach could run into a dead end fairly quickly: a) you can only access the data when the target is stopped; b) the self-tracing means that the evaluation of these expressions would introduce noise in the data; c) overhead of all the extra packets(?).
>>
>> So, I would be in favor of an lldb-server based approach. I'm not telling you that you shouldn't do that, but I don't think that's an approach I would take...
>>
>> pl
>>
>> On 1 February 2016 at 08:58, Ravitheja Addepally <ravithejaw...@gmail.com> wrote:
>> > OK, that is one option, but one of the aims of this activity is to make the data available to IDEs such as Android Studio, Xcode, or any other that may want to display this information in its environment. Keeping that in mind, would the complete Python-based approach be useful? Or would providing LLDB APIs to extract raw perf data from the target be useful?
>> >
>> > On Thu, Jan 21, 2016 at 10:00 PM, Greg Clayton <gclay...@apple.com> wrote:
>> >>
>> >> One thing to think about is that you can actually just run an expression in the program that is being debugged without needing to change anything in the GDB remote server. So this can all be done via Python commands and would require no changes to anything. For example, you can run an expression to enable the buffer. LLDB supports multi-line expressions that can define their own local variables and local types.
>> >> So the expression could be something like:
>> >>
>> >> int perf_fd = (int)perf_event_open(...);
>> >> struct PerfData
>> >> {
>> >>     void *data;
>> >>     size_t size;
>> >> };
>> >> PerfData result = read_perf_data(perf_fd);
>> >> result
>> >>
>> >> The result is then a structure that you can access from your Python command (it will be an SBValue), and then you can read memory in order to get the perf data.
>> >>
>> >> You can also split things up into multiple calls, where you run perf_event_open() on its own and return the file descriptor:
>> >>
>> >> (int)perf_event_open(...)
>> >>
>> >> This expression will return the file descriptor.
>> >>
>> >> Then you could allocate memory via the SBProcess:
>> >>
>> >> (void *)malloc(1024);
>> >>
>> >> The result of this expression will be the buffer that you use...
>> >>
>> >> Then you can read 1024 bytes at a time into this newly created buffer.
>> >>
>> >> So a solution that is completely done in Python would be very attractive.
>> >>
>> >> Greg
>> >>
>> >> > On Jan 21, 2016, at 7:04 AM, Ravitheja Addepally <ravithejaw...@gmail.com> wrote:
>> >> >
>> >> > Hello,
>> >> > Regarding the questions in this thread, please find the answers ->
>> >> >
>> >> > How are you going to present this information to the user? (I know debugserver can report some performance data... Have you looked into how that works? Do you plan to reuse some parts of that infrastructure?) and How will you get the information from the server to the client?
>> >> >
>> >> > Currently I plan to show a list of instructions that have been executed so far. I saw the implementation suggested by Pavel; the already present infrastructure is a little bit lacking in terms of the needs of the project, but I plan to follow a similar approach, i.e. to extract the raw trace data by querying the server (which can use perf_event_open to get the raw trace data from the kernel) and transport it through GDB packets (qXfer packets, https://sourceware.org/gdb/onlinedocs/gdb/Branch-Trace-Format.html#Branch-Trace-Format). On the client side, the raw trace data could be passed on to a Python-based command that could decode the data. This also eliminates the dependency on libipt, since LLDB would not decode the data itself.
>> >> >
>> >> > There is also the question of this third party library. Do we take a hard dependency on libipt (probably a non-starter), or only use it if it's available (much better)?
>> >> >
>> >> > With the above-mentioned approach LLDB would not need the library. Whoever wants to use the Python command would have to install it separately, but LLDB won't need it.
>> >> >
>> >> > With the performance counters, the interface would still be perf_event_open, so if there was a perf_wrapper in lldb-server then it could be reused to configure and use the software performance counters as well; you would just need to pass different attributes in the perf_event_open system call. Plus, I think the perf_wrapper could be reused to get CoreSight information as well (see https://lwn.net/Articles/664236/).
>> >> >
>> >> > On Wed, Oct 21, 2015 at 8:57 PM, Greg Clayton <gclay...@apple.com> wrote:
>> >> > One main benefit of doing this externally is that it allows this to be done remotely over any debugger connection. If you can run expressions to enable/disable/set up the memory buffer and access the buffer contents, then you don't need to add code into the debugger to actually do this.
>> >> >
>> >> > Greg
>> >> >
>> >> > > On Oct 21, 2015, at 11:54 AM, Greg Clayton <gclay...@apple.com> wrote:
>> >> > >
>> >> > > IMHO the best way to provide this information is to implement reverse debugging packets in a GDB server (lldb-server). You enable this feature via some packet to lldb-server, and that enables the gathering of data that keeps the last N instructions run by all threads in some buffer that gets overwritten. The lldb-server enables it and gives a buffer to the perf_event_interface(). Then clients can ask the lldb-server to step back in any thread. Only when the data is requested do we actually use the data to implement the reverse stepping.
>> >> > >
>> >> > > Another way to do this would be to use a Python-based command that can be added to any target that supports this. The plug-in could install a set of LLDB commands. To see how to create new lldb command line commands in Python, see the section named "CREATE A NEW LLDB COMMAND USING A PYTHON FUNCTION" on the http://lldb.llvm.org/python-reference.html web page.
>> >> > >
>> >> > > Then you can have some commands like:
>> >> > >
>> >> > > intel-pt-start
>> >> > > intel-pt-dump
>> >> > > intel-pt-stop
>> >> > >
>> >> > > Each command could have options and arguments as desired. The "intel-pt-start" command could make an expression call to enable the feature in the target by running an expression that makes some perf_event_interface calls that would allocate some memory and hand it to the Intel PT stuff. The "intel-pt-dump" command could just give a raw dump of all the history for one or more threads (again, add options and arguments as needed to this command). The Python code could bridge to C and use the Intel libraries that know how to process the data.
>> >> > >
>> >> > > If this all goes well we can think about building it into LLDB as a built-in command.
>> >> > >
>> >> > >> On Oct 21, 2015, at 9:50 AM, Zachary Turner via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>> >> > >>
>> >> > >> There are two different kinds of performance counters: OS performance counters and CPU performance counters. It sounds like you're talking about the latter, but it's worth considering whether this could be designed in a way to support both (i.e.
>> >> > >> even if you don't do both yourself, at least make the machinery reusable and applicable to both for when someone else wants to come through and add OS perf counters).
>> >> > >>
>> >> > >> There is also the question of this third party library. Do we take a hard dependency on libipt (probably a non-starter), or only use it if it's available (much better)?
>> >> > >>
>> >> > >> As Pavel said, how are you planning to present the information to the user? Through some sort of top level command like "perfcount instructions_retired"?
>> >> > >>
>> >> > >> On Wed, Oct 21, 2015 at 8:16 AM Pavel Labath via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>> >> > >> [ Moving this discussion back to the list. I pressed the wrong button when replying.]
>> >> > >>
>> >> > >> Thanks for the explanation Ravi. It sounds like a very useful feature indeed. I've found a reference to the debugserver profile data in GDBRemoteCommunicationClient.cpp:1276, so maybe that will help with your investigation. Maybe also someone more knowledgeable can explain what those A packets are used for (?).
>> >> > >>
>> >> > >> On 21 October 2015 at 15:48, Ravitheja Addepally <ravithejaw...@gmail.com> wrote:
>> >> > >>> Hi,
>> >> > >>> Thanks for your reply. Some of the future processors to be released by Intel have hardware support for recording the instructions that were executed by the processor, and this recording process is also quite fast and does not add too much computational load.
>> >> > >>> Now this hardware is made accessible via the perf_event_interface, where one could map a region of memory for this purpose by passing it as an argument to this perf_event_interface. The recorded instructions are then written to the memory region assigned. Now this is basically the raw information, which can be obtained from the hardware. It can be interpreted and presented to the user in the following ways ->
>> >> > >>>
>> >> > >>> 1) Instruction history - where the user gets basically a list of all instructions that were executed
>> >> > >>> 2) Function Call History - It is also possible to get a list of all the functions called in the inferior
>> >> > >>> 3) Reverse Debugging with limited information - In GDB this is only the functions executed.
>> >> > >>>
>> >> > >>> This raw information also needs to be decoded (even before you can disassemble it); there is already a library released by Intel called libipt which can do that. At the moment we plan to work with Instruction History. I will look into the debugserver infrastructure and get back to you. I guess for the server-client communication we would rely on packets only. In case of concerns about too much data being transferred, we can limit the number of entries we report, because the amount of data recorded is anyway too big to present all at once, so we would have to resort to something like a viewport.
>> >> > >>>
>> >> > >>> Since a lot of instructions can be recorded this way, the function call history can be quite useful for debugging, especially since it is a lot faster to collect function traces this way.
>> >> > >>>
>> >> > >>> -ravi
>> >> > >>>
>> >> > >>> On Wed, Oct 21, 2015 at 3:14 PM, Pavel Labath <lab...@google.com> wrote:
>> >> > >>>>
>> >> > >>>> Hi,
>> >> > >>>>
>> >> > >>>> I am not really familiar with the perf_event interface (and I suspect others aren't either), so it might help if you explain what kind of information you plan to collect from there.
>> >> > >>>>
>> >> > >>>> As for the PtraceWrapper question, I think that really depends on bigger design decisions. My two main questions for a feature like this would be:
>> >> > >>>> - How are you going to present this information to the user? (I know debugserver can report some performance data... Have you looked into how that works? Do you plan to reuse some parts of that infrastructure?)
>> >> > >>>> - How will you get the information from the server to the client?
>> >> > >>>>
>> >> > >>>> pl
>> >> > >>>>
>> >> > >>>> On 21 October 2015 at 13:41, Ravitheja Addepally via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>> >> > >>>>> Hello,
>> >> > >>>>> I want to implement support for reading performance measurement information using the perf_event_open system call. The motive is to add support for the Intel PT hardware feature, which is available through the perf_event interface.
>> >> > >>>>> I was thinking of implementing a new wrapper like PtraceWrapper in the NativeProcessLinux files. Is this the correct place to start? If not, could someone suggest another place to begin with?
>> >> > >>>>>
>> >> > >>>>> BR,
>> >> > >>>>> A Ravi Theja
>> >> > >>>>>
>> >> > >>>>> _______________________________________________
>> >> > >>>>> lldb-dev mailing list
>> >> > >>>>> lldb-dev@lists.llvm.org
>> >> > >>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
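
For reference, the Python-command approach Greg describes in the quoted thread (register a command, then read the trace buffer from the inferior 1024 bytes at a time) could look roughly like the sketch below. This is only an illustration under stated assumptions, not code from the thread: `read_trace`, `intel_pt_dump`, `CHUNK_SIZE` and the `<address> <length>` argument format are hypothetical names and choices, and the buffer address is assumed to come from an earlier expression evaluation that is not shown. The `lldb` module is only importable inside LLDB's script interpreter, so it is imported lazily.

```python
CHUNK_SIZE = 1024  # the thread suggests reading 1024 bytes at a time


def read_trace(read_memory, address, length, chunk_size=CHUNK_SIZE):
    """Read `length` bytes starting at `address`, one chunk at a time.

    `read_memory(addr, size)` is any callable returning bytes; inside LLDB
    it would wrap SBProcess.ReadMemory (see `intel_pt_dump` below).
    """
    data = bytearray()
    offset = 0
    while offset < length:
        size = min(chunk_size, length - offset)
        chunk = read_memory(address + offset, size)
        if not chunk:
            break  # failed or empty read: stop instead of looping forever
        data += chunk
        offset += len(chunk)
    return bytes(data)


def intel_pt_dump(debugger, command, result, internal_dict):
    """Hypothetical 'intel-pt-dump <address> <length>' command body."""
    import lldb  # only available inside LLDB's script interpreter

    # Assumed argument format: buffer address and length, e.g. "0x1000 4096".
    address, length = (int(arg, 0) for arg in command.split())
    process = debugger.GetSelectedTarget().GetProcess()

    def read_memory(addr, size):
        error = lldb.SBError()
        chunk = process.ReadMemory(addr, size, error)
        return chunk if error.Success() else b""

    raw = read_trace(read_memory, address, length)
    # Decoding (for example with libipt) would happen here; this sketch
    # only reports how much raw data was fetched.
    result.AppendMessage("read %d bytes of raw trace data" % len(raw))


def __lldb_init_module(debugger, internal_dict):
    # Called by 'command script import <this file>'; registers the command.
    debugger.HandleCommand(
        "command script add -f %s.intel_pt_dump intel-pt-dump" % __name__
    )
```

Keeping the chunking loop separate from the LLDB calls means it can be exercised without a debugger. Note that this does not address the drawbacks raised at the top of the thread: the target still has to be stopped to read the buffer, and each chunk costs extra packets.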