It feels to me that the Python-based approach could run into a dead end fairly quickly: a) you can only access the data when the target is stopped; b) the self-tracing means that the evaluation of these expressions would introduce noise in the data; c) the overhead of all the extra packets (?).
So, I would be in favor of an lldb-server based approach. I'm not telling you that you shouldn't do that, but I don't think that's an approach I would take...

pl

On 1 February 2016 at 08:58, Ravitheja Addepally <ravithejaw...@gmail.com> wrote:
> Ok, that is one option, but one of the aims of this activity is to make the data available for use by IDEs like Android Studio or Xcode or any other that may want to display this information in its environment. So, keeping that in consideration, would the complete Python-based approach be useful? Or would providing LLDB APIs to extract raw perf data from the target be useful?
>
> On Thu, Jan 21, 2016 at 10:00 PM, Greg Clayton <gclay...@apple.com> wrote:
>>
>> One thing to think about is that you can actually just run an expression in the program that is being debugged without needing to change anything in the GDB remote server. So this can all be done via Python commands and would require no changes to anything. You can run an expression to enable the buffer, and since LLDB supports multi-line expressions that can define their own local variables and local types, the expression could be something like:
>>
>> int perf_fd = (int)perf_event_open(...);
>> struct PerfData
>> {
>>     void *data;
>>     size_t size;
>> };
>> PerfData result = read_perf_data(perf_fd);
>> result
>>
>> The result is then a structure that you can access from your Python command (it will be an SBValue), and then you can read memory in order to get the perf data.
>>
>> You can also split things up into multiple calls, where you run perf_event_open() on its own and return the file descriptor:
>>
>> (int)perf_event_open(...)
>>
>> This expression will return the file descriptor.
>>
>> Then you could allocate memory via the SBProcess:
>>
>> (void *)malloc(1024);
>>
>> The result of this expression will be the buffer that you use...
>>
>> Then you can read 1024 bytes at a time into this newly created buffer.
>>
>> So a solution that is completely done in Python would be very attractive.
>>
>> Greg
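In SB API terms, the flow described above boils down to a handful of calls. A minimal sketch, assuming a stopped process and borrowing the 1024-byte malloc from the example above; the helper name is made up here, and the perf_event_open()/read_perf_data() step is deliberately left elided exactly as it is in the quoted example:

import lldb

def read_perf_buffer(frame, process):
    # Run an expression in the debugged program; the result comes back
    # as an SBValue that the Python command can inspect.
    buf = frame.EvaluateExpression("(void *)malloc(1024)")
    if buf.GetError().Fail():
        return None
    addr = buf.GetValueAsUnsigned()

    # An expression calling perf_event_open()/read_perf_data() would go
    # here to fill the buffer (elided, as in the example above).

    # Read the raw bytes back out of the target so Python can decode them.
    error = lldb.SBError()
    data = process.ReadMemory(addr, 1024, error)
    return data if error.Success() else None

From an interactive session this could be tried with "script read_perf_buffer(lldb.frame, lldb.process)" once the target is stopped.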
>> > On Jan 21, 2016, at 7:04 AM, Ravitheja Addepally <ravithejaw...@gmail.com> wrote:
>> >
>> > Hello,
>> > Regarding the questions in this thread, please find the answers below.
>> >
>> > How are you going to present this information to the user? (I know debugserver can report some performance data... Have you looked into how that works? Do you plan to reuse some parts of that infrastructure?) and How will you get the information from the server to the client?
>> >
>> > Currently I plan to show a list of the instructions that have been executed so far. I saw the implementation suggested by Pavel; the already present infrastructure is a little bit lacking in terms of the needs of the project, but I plan to follow a similar approach, i.e. to extract the raw trace data by querying the server (which can use perf_event_open to get the raw trace data from the kernel) and transport it through gdb packets (qXfer packets, https://sourceware.org/gdb/onlinedocs/gdb/Branch-Trace-Format.html#Branch-Trace-Format). At the client side the raw trace data could be passed on to a Python-based command that could decode the data. This also eliminates the dependency on libipt, since LLDB would not decode the data itself.
>> >
>> > There is also the question of this third party library. Do we take a hard dependency on libipt (probably a non-starter), or only use it if it's available (much better)?
>> >
>> > With the above mentioned way, LLDB would not need the library; whoever wants to use the Python command would have to install it separately, but LLDB won't need it.
>> >
>> > With the performance counters, the interface would still be perf_event_open, so if there was a perf_wrapper in LLDB server then it could be reused to configure and use the software performance counters as well; you would just need to pass different attributes in the perf_event_open system call. Plus, I think the perf_wrapper could be reused to get CoreSight information as well (see https://lwn.net/Articles/664236/ ).
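For a concrete sense of what "passing different attributes in the perf_event_open system call" means, below is a minimal, self-contained sketch that opens an ordinary hardware instruction counter through that same interface from plain Python (ctypes, x86-64 Linux). The constants and the truncated perf_event_attr layout are assumptions taken from linux/perf_event.h, not anything provided by LLDB; an Intel PT session would use the same call with the PT PMU's dynamic type (from /sys/bus/event_source/devices/intel_pt/type) and an mmap'd AUX buffer instead of a plain read():

import ctypes
import os

# Constants from linux/perf_event.h (x86-64 values; assumptions, not LLDB API).
PERF_TYPE_HARDWARE = 0
PERF_COUNT_HW_INSTRUCTIONS = 1
SYS_perf_event_open = 298          # x86-64 syscall number

class perf_event_attr(ctypes.Structure):
    # Truncated to the first 64 bytes (PERF_ATTR_SIZE_VER0); the kernel
    # accepts shorter layouts as long as 'size' says how much was provided.
    _fields_ = [
        ("type", ctypes.c_uint32),
        ("size", ctypes.c_uint32),
        ("config", ctypes.c_uint64),
        ("sample_period", ctypes.c_uint64),
        ("sample_type", ctypes.c_uint64),
        ("read_format", ctypes.c_uint64),
        ("flags", ctypes.c_uint64),    # packed bitfield: bit 0 = disabled,
                                       # bit 5 = exclude_kernel, bit 6 = exclude_hv
        ("wakeup_events", ctypes.c_uint32),
        ("bp_type", ctypes.c_uint32),
        ("bp_addr", ctypes.c_uint64),
    ]

libc = ctypes.CDLL(None, use_errno=True)

attr = perf_event_attr()
attr.size = ctypes.sizeof(attr)
attr.type = PERF_TYPE_HARDWARE            # Intel PT would use the PT PMU's dynamic type instead
attr.config = PERF_COUNT_HW_INSTRUCTIONS  # "different attributes", same system call
attr.flags = (1 << 5) | (1 << 6)          # exclude_kernel | exclude_hv: count user space only

# pid=0 (this process), cpu=-1 (any CPU), group_fd=-1, flags=0.
fd = libc.syscall(SYS_perf_event_open, ctypes.byref(attr), 0, -1, -1, 0)
if fd < 0:
    raise OSError(ctypes.get_errno(), "perf_event_open failed")

sum(range(1000000))                       # do some work worth counting
count = int.from_bytes(os.read(fd, 8), "little")
print("instructions retired:", count)
os.close(fd)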
>> > On Wed, Oct 21, 2015 at 8:57 PM, Greg Clayton <gclay...@apple.com> wrote:
>> > One main benefit to doing this externally is that it allows this to be done remotely over any debugger connection. If you can run expressions to enable/disable/set up the memory buffer and access the buffer contents, then you don't need to add code into the debugger to actually do this.
>> >
>> > Greg
>> >
>> > > On Oct 21, 2015, at 11:54 AM, Greg Clayton <gclay...@apple.com> wrote:
>> > >
>> > > IMHO the best way to provide this information is to implement reverse debugging packets in a GDB server (lldb-server). You would enable this feature via some packet to lldb-server, and that enables the gathering of data that keeps the last N instructions run by all threads in some buffer that gets overwritten. The lldb-server enables it and gives a buffer to the perf_event_interface(). Then clients can ask the lldb-server to step back in any thread. Only when the data is requested do we actually use the data to implement the reverse stepping.
>> > >
>> > > Another way to do this would be to use a Python-based command that can be added to any target that supports this. The plug-in could install a set of LLDB commands. To see how to create new lldb command line commands in Python, see the section named "CREATE A NEW LLDB COMMAND USING A PYTHON FUNCTION" on the http://lldb.llvm.org/python-reference.html web page.
>> > >
>> > > Then you can have some commands like:
>> > >
>> > > intel-pt-start
>> > > intel-pt-dump
>> > > intel-pt-stop
>> > >
>> > > Each command could have options and arguments as desired. The "intel-pt-start" command could make an expression call to enable the feature in the target by running an expression that makes the perf_event_interface calls that would allocate some memory and hand it to the Intel PT stuff. The "intel-pt-dump" command could just give a raw dump of all the history for one or more threads (again, add options and arguments as needed to this command). The Python code could bridge to C and use the Intel libraries that know how to process the data.
>> > >
>> > > If this all goes well we can think about building it into LLDB as a built-in command.
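The command plumbing referred to above (the "CREATE A NEW LLDB COMMAND USING A PYTHON FUNCTION" section) comes down to a module along these lines; a minimal sketch in which the module name intel_pt and the command bodies are made up for illustration, while the registration mechanism is the documented one:

import lldb

def intel_pt_start(debugger, command, result, internal_dict):
    # Would run the expression that sets up the perf_event / Intel PT buffer.
    result.PutCString("intel-pt-start: not implemented, see thread")

def intel_pt_dump(debugger, command, result, internal_dict):
    # Would fetch the raw buffer and hand it to a decoder (e.g. libipt).
    result.PutCString("intel-pt-dump: not implemented, see thread")

def __lldb_init_module(debugger, internal_dict):
    # Called automatically when the module is imported with
    # 'command script import intel_pt.py'; registers the new commands.
    debugger.HandleCommand("command script add -f intel_pt.intel_pt_start intel-pt-start")
    debugger.HandleCommand("command script add -f intel_pt.intel_pt_dump intel-pt-dump")

Loading it with "command script import intel_pt.py" makes intel-pt-start and intel-pt-dump available as regular (lldb) commands; the real bodies would run the expressions or packet requests discussed in this thread.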
>> > >> On Oct 21, 2015, at 9:50 AM, Zachary Turner via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>> > >>
>> > >> There are two different kinds of performance counters: OS performance counters and CPU performance counters. It sounds like you're talking about the latter, but it's worth considering whether this could be designed in a way to support both (i.e. even if you don't do both yourself, at least make the machinery reusable and applicable to both for when someone else wants to come through and add OS perf counters).
>> > >>
>> > >> There is also the question of this third party library. Do we take a hard dependency on libipt (probably a non-starter), or only use it if it's available (much better)?
>> > >>
>> > >> As Pavel said, how are you planning to present the information to the user? Through some sort of top level command like "perfcount instructions_retired"?
>> > >>
>> > >> On Wed, Oct 21, 2015 at 8:16 AM Pavel Labath via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>> > >> [ Moving this discussion back to the list. I pressed the wrong button when replying. ]
>> > >>
>> > >> Thanks for the explanation, Ravi. It sounds like a very useful feature indeed. I've found a reference to the debugserver profile data in GDBRemoteCommunicationClient.cpp:1276, so maybe that will help with your investigation. Maybe also someone more knowledgeable can explain what those A packets are used for (?).
>> > >>
>> > >> On 21 October 2015 at 15:48, Ravitheja Addepally <ravithejaw...@gmail.com> wrote:
>> > >>> Hi,
>> > >>> Thanks for your reply. Some of the future processors to be released by Intel have hardware support for recording the instructions that were executed by the processor, and this recording process is quite fast and does not add too much computational load. This hardware is made accessible via the perf_event_interface, where one could map a region of memory for this purpose by passing it as an argument to the perf_event_interface. The recorded instructions are then written to the assigned memory region. This is basically the raw information that can be obtained from the hardware. It can be interpreted and presented to the user in the following ways ->
>> > >>>
>> > >>> 1) Instruction history - where the user gets basically a list of all instructions that were executed
>> > >>> 2) Function call history - it is also possible to get a list of all the functions called in the inferior
>> > >>> 3) Reverse debugging with limited information - in GDB this is only the functions executed.
>> > >>>
>> > >>> This raw information also needs to be decoded (even before you can disassemble it); there is already a library released by Intel called libipt which can do that. At the moment we plan to work with instruction history. I will look into the debugserver infrastructure and get back to you. I guess for the server-client communication we would rely on packets only. In case of concerns about too much data being transferred, we can limit the number of entries we report, because the amount of data recorded is anyway too big to present all at once, so we would have to resort to something like a viewport.
>> > >>>
>> > >>> Since a lot of instructions can be recorded this way, the function call history can be quite useful for debugging, especially since it is a lot faster to collect function traces this way.
>> > >>>
>> > >>> -ravi
>> > >>>
>> > >>> On Wed, Oct 21, 2015 at 3:14 PM, Pavel Labath <lab...@google.com> wrote:
>> > >>>>
>> > >>>> Hi,
>> > >>>>
>> > >>>> I am not really familiar with the perf_event interface (and I suspect others aren't either), so it might help if you explain what kind of information you plan to collect from there.
>> > >>>>
>> > >>>> As for the PtraceWrapper question, I think that really depends on bigger design decisions. My two main questions for a feature like this would be:
>> > >>>> - How are you going to present this information to the user? (I know debugserver can report some performance data... Have you looked into how that works? Do you plan to reuse some parts of that infrastructure?)
>> > >>>> - How will you get the information from the server to the client?
>> > >>>>
>> > >>>> pl
>> > >>>>
>> > >>>> On 21 October 2015 at 13:41, Ravitheja Addepally via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>> > >>>>> Hello,
>> > >>>>> I want to implement support for reading performance measurement information using the perf_event_open system calls. The motive is to add support for the Intel PT hardware feature, which is available through the perf_event interface. I was thinking of implementing a new wrapper, like PtraceWrapper, in the NativeProcessLinux files. My query is: is this a correct place to start or not? In case it is not, could someone suggest another place to begin with?
>> > >>>>>
>> > >>>>> BR,
>> > >>>>> A Ravi Theja

_______________________________________________
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev