[Lldb-commits] [PATCH] D103588: [trace] Create a top-level instruction class

Vedant Kumar via Phabricator via lldb-commits Wed, 09 Jun 2021 15:34:55 -0700

vsk added a comment.

In D103588#2806894 <https://reviews.llvm.org/D103588#2806894>, @wallace wrote:


> I've been thinking about what you said and I'm having second thoughts on my 
> implementation. I'll share more context:
>
> - I want to work in the short term on reverse debugging and reconstruction of 
> stack traces, for that i'll need to know the instruction type of each 
> instruction in the trace, which will be used as part of some heuristics to 
> identify calls and returns between functions.
> - A future application that we plan to work on is adding profiling 
> information to the instructions
> - Right now the intel-pt plugin is storing the decoded instructions in a 
> vector, which works for small traces but wont' for gigantic traces. I imagine 
> that I'll end up removing that vector and make the TraverseInstruction API 
> decode instructions in place keeping one instruction in memory at a time 
> within the intel pt plugin for a given traversal. For that I'll need 
> accessors that can provide information of the current Instruction. As there 
> could be two or more concurrent traversals happening at the same time (I'm 
> trying to be generic here), it might make sense to create an abstract class 
> TraceInstruction that can be extended by each plug-in and implement its 
> getters.
>
> I'm thinking about something like this
>
>   class TraceInstruction {
>     virtual lldb::addr_t GetLoadAddress() = 0;
>     virtual TraceInstructionType() GetInstructionType() = 0;
>     virtual uint64_t GetTimestamp() = 0;
>     ... anything that can appear in the future 
>   };
>
> and have no members, leaving to each plug-in the decision of which of those 
> methods to implement and how.
>
> What do you think of this? I think this incorporates your feedback.

Thanks for the context.

I'm not convinced that retaining only 1 decoded instruction in memory at a time 
(under a TraverseInstructions callback) will be sufficient for the use cases 
you've outlined. Let's say the user sets a few breakpoints, then repeatedly 
does "rev-continue" followed by "bt". If lldb only holds a max of 1 decoded 
instruction in memory, it would need to repeatedly decode parts of the trace to 
evaluate those user commands, which would be slow as you pointed out earlier. 
OTOH there's not enough memory to retain all of the decoded instructions.

It seems like we need a way for generic trace analyses to request _batches_ of 
decoded trace data at a time, lazily demanding more from the trace plugin only 
if the analysis requires it to proceed, while freeing any stale decoded data 
it's not using.

Moreover, there doesn't seem to be a hard requirement that each trace plugin 
specialize a TraceInstruction class. For example, to support that 
breakpoint/rev-continue/bt workflow, you really only need a list of the places 
where _control flow changed_. So the generic layer could request that data from 
a trace plugin ('plugin->decodeControlFlowChanges(startPos, stopPos)'), then 
query more specific APIs that answer questions about how many control flow 
changes there were, and what the inst type + load address was at each control 
flow change point. Each plugin implementation can pick the most efficient way 
to implement those narrower APIs, instead of trying to cram all potentially 
useful information into a single TraceInst class.

The reason why I believe this to be such a crucial part of the design is 
because of the scaling problem I mentioned earlier. A single 1.5GHz core 
running at IPC=2 yields 3*10^9 instructions/second, so we hit "gigantic" all 
too quickly!



================
Comment at: lldb/source/Plugins/Trace/intel-pt/IntelPTDecoder.cpp:117
+  case ptic_far_call:
+    return lldb::eTraceInstructionFarCall;
+  case ptic_far_return:
----------------
Are all of these instruction types both necessary and generic? E.g. does having 
FarCall in addition to Call assist with backtrace reconstruction, and if so, 
what does FarCall mean precisely?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103588/new/

https://reviews.llvm.org/D103588

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

[Lldb-commits] [PATCH] D103588: [trace] Create a top-level instruction class

Reply via email to