wallace added inline comments.
================
Comment at: lldb/docs/lldb-gdb-remote.txt:278
+// "tids": [<thread ids in decimal>],
+// "variant": "specificThreads" | "currentAndFutureThreads" | "process",
+// "params": {
----------------
clayborg wrote:
> Seems like we don't need the "variant" here. If we have "tids" then we are
> enabling tracing for only that list of threads, and if it is missing, then we
> enable it for all threads.
>
> See my questions in the non-inlined comment section about the "process"
> variant.
After thinking a lot about it, I think that providing whole-process tracing
instead of per-thread tracing is going to be impractical for lldb, which means
that only two variants are necessary and your suggestion is correct.
I'll elaborate more: whole-process tracing means using a single buffer per
logical core, having each of these buffers contain the instructions of all the
threads that run on its core. As Intel PT doesn't trace thread switches, it's
necessary to add a secondary context switch trace to be able to produce
per-thread traces, which is a requirement for analyzing the instruction of a
thread. This splitting is done by correlating the timestamps of the context
switches with the timestamps of the packets in the Intel PT trace. There are
two problems here:
- Tracing context switches requires the kernel to have certain capabilities
that are not ubiquitous on all systems, and some of these are more accurate
than others. One option is to use PERF_RECORD_SWITCH, but it's not available
everywhere.
- Even if you were able to trace context switches, now you depend on the
granularity of the timestamps in the intel pt packets. Old Intel hardware
provides a timestamp for every few KBs of packets, and newer hardware allows
you to have a timestamp for each cpu cycle. In any case, you can end up with
invalid splits, as you need to interpolate the timestamps for the packets whose
timestamp is not available, and you could assign a packet to the wrong thread.
I was chatting about that with some infra folks at fb, and they think that they
will only be able to support this kind of tracing by implementing a special
collector that has access to a special feature in the kernel they are working
on, and would only work on the most recent Intel hardware with high timestamp
accuracy.
That being said, it seems that it's not worth working on that for LLDB. Each
company that wants that level of tracing should work on their own way to do
accurate thread splitting, and it would be misleading to implement something in
LLDB that can produce wrong values. For LLDB we can use per-thread tracing,
which will ensure 100% accuracy.
================
Comment at: lldb/docs/lldb-gdb-remote.txt:310
+// {
+// "bufferSizeInKB": <trace buffer size in KB in decimal>
+// "perfConfig": <custom low-level perf event intel-pt config in decimal>
----------------
clayborg wrote:
> Is this the trace buffer size that will be used for each thread? For all
> threads? Is this size divided up between the threads? What happens when new
> threads get created? Can we run out of memory in the kernel if a process
> creates too many threads if this buffer is for each thread? Need
> clarification on what this number means.
This is KB per thread. See my comment below about adding a limit on the total
buffer memory used.
================
Comment at: lldb/docs/lldb-gdb-remote.txt:403
+// "type": <tracing technology name, e.g. intel-pt, arm-coresight>
+// "label": <identifier for the data>
+// "tid": <tid in decimal if the data is related to a thread, 0 otherwise>
----------------
clayborg wrote:
> Are these custom for each plug-in? Any details on what intel-pt can specify
> here? You supplied intel PT specific info for other settings above in the
> docs.
Yes, it's custom for each plug-in. I added entry for intel pt in line 411.
================
Comment at: lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp:1680
+ thread_id,
+ m_intel_pt_trace_new_threads_params->params.buffer_size_in_kb,
+ m_intel_pt_trace_new_threads_params->params.perf_config)) {
----------------
clayborg wrote:
> Having a fixed size of memory for each thread can be dangerous. What if the
> user specifies a really large value, and then creates thousands of threads?
> What will happen to the system? Will if fail gracefully? Can you deadlock
> your computer?
AFAIK, the buffer is stored in a region in memory that is fixed and shouldn't
be swapped, as the CPU needs to write directly there without interrupting the
traced process. So I imagine that if you have too many threads, then you might
end up consuming all your RAM and killing your computer.
In any case, your point is valid. Given that we don't know beforehand how many
things will end up being created, what about creating a setting for the maximum
total amount of memory to use? E.g. 500MB. The user could override it if
needed. At least with this, the first threads, which are commonly the
long-lived threads, will be traced, and future short-lived threads won't be
traced, but they don't matter that much.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D91679/new/
https://reviews.llvm.org/D91679
_______________________________________________
lldb-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits