[Lldb-commits] [PATCH] D91679: [trace][intel-pt] Implement trace start and trace stop

walter erquinigo via Phabricator via lldb-commits Tue, 26 Jan 2021 18:33:39 -0800

wallace added inline comments.

================
Comment at: lldb/docs/lldb-gdb-remote.txt:278
+//     "tids": [<thread ids in decimal>],
+//     "variant": "specificThreads" | "currentAndFutureThreads" | "process",
+//     "params": {
----------------
clayborg wrote:
> Seems like we don't need the "variant" here. If we have "tids" then we are 
> enabling tracing for only that list of threads, and if it is missing, then we 
> enable it for all threads.
> 
> See my questions in the non-inlined comment section about the "process" 
> variant.
After thinking a lot about it, I think that providing whole-process tracing 
instead of per-thread tracing is going to be impractical for lldb, which means 
that only two variants are necessary and your suggestion is correct. 

I'll elaborate more: whole-process tracing means using a single buffer per 
logical core, having each of these buffers contain the instructions of all the 
threads that run on its core. As Intel PT doesn't trace thread switches, it's 
necessary to add a secondary context switch trace to be able to produce 
per-thread traces, which is a requirement for analyzing the instruction of a 
thread. This splitting is done by correlating the timestamps of the context 
switches with the timestamps of the packets in the Intel PT trace. There are 
two problems here:

- Tracing context switches requires the kernel to have certain capabilities 
that are not ubiquitous on all systems, and some of these are more accurate 
than others. One option is to use PERF_RECORD_SWITCH, but it's not available 
everywhere.
- Even if you were able to trace context switches, now you depend on the 
granularity of the timestamps in the intel pt packets. Old Intel hardware 
provides a timestamp for every few KBs of packets, and newer hardware allows 
you to have a timestamp for each cpu cycle. In any case, you can end up with 
invalid splits, as you need to interpolate the timestamps for the packets whose 
timestamp is not available, and you could assign a packet to the wrong thread.

I was chatting about that with some infra folks at fb, and they think that they 
will only be able to support this kind of tracing by implementing a special 
collector that has access to a special feature in the kernel they are working 
on, and would only work on the most recent Intel hardware with high timestamp 
accuracy.

That being said, it seems that it's not worth working on that for LLDB. Each 
company that wants that level of tracing should work on their own way to do 
accurate thread splitting, and it would be misleading to implement something in 
LLDB that can produce wrong values. For LLDB we can use per-thread tracing, 
which will ensure 100% accuracy.

================
Comment at: lldb/docs/lldb-gdb-remote.txt:310
+//  {
+//    "bufferSizeInKB": <trace buffer size in KB in decimal>
+//    "perfConfig": <custom low-level perf event intel-pt config in decimal>
----------------
clayborg wrote:
> Is this the trace buffer size that will be used for each thread? For all 
> threads? Is this size divided up between the threads? What happens when new 
> threads get created? Can we run out of memory in the kernel if a process 
> creates too many threads if this buffer is for each thread? Need 
> clarification on what this number means.
This is KB per thread. See my comment below about adding a limit on the total 
buffer memory used.

================
Comment at: lldb/docs/lldb-gdb-remote.txt:403
+//    "type": <tracing technology name, e.g. intel-pt, arm-coresight>
+//    "label": <identifier for the data>
+//    "tid": <tid in decimal if the data is related to a thread, 0 otherwise>
----------------
clayborg wrote:
> Are these custom for each plug-in? Any details on what intel-pt can specify 
> here? You supplied intel PT specific info for other settings above in the 
> docs.
Yes, it's custom for each plug-in. I added entry for intel pt in line 411.

================
Comment at: lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp:1680
+            thread_id,
+            m_intel_pt_trace_new_threads_params->params.buffer_size_in_kb,
+            m_intel_pt_trace_new_threads_params->params.perf_config)) {
----------------
clayborg wrote:
> Having a fixed size of memory for each thread can be dangerous. What if the 
> user specifies a really large value, and then creates thousands of threads? 
> What will happen to the system? Will if fail gracefully? Can you deadlock 
> your computer?
AFAIK, the buffer is stored in a region in memory that is fixed and shouldn't 
be swapped, as the CPU needs to write directly there without interrupting the 
traced process. So I imagine that if you have too many threads, then you might 
end up consuming all your RAM and killing your computer.

In any case, your point is valid. Given that we don't know beforehand how many 
things will end up being created, what about creating a setting for the maximum 
total amount of memory to use? E.g. 500MB. The user could override it if 
needed. At least with this, the first threads, which are commonly the 
long-lived threads, will be traced, and future short-lived threads won't be 
traced, but they don't matter that much. 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91679/new/

https://reviews.llvm.org/D91679

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

[Lldb-commits] [PATCH] D91679: [trace][intel-pt] Implement trace start and trace stop

Reply via email to