Re: [lldb-dev] Setting breakpoint on file and function name doesn't work as expected.

2019-11-07 Thread Jim Ingham via lldb-dev
Note first, "break set -f foo.c -l 10" and "break set -f foo.c -n bar" are two 
very different breakpoints.  In the first case, we are looking for a specific 
file & line combination in the line table or inlined functions info.  In the 
latter case we are searching for a function names bar, and then restricting the 
results to the matching functions whose definition is in foo.c.  It seemed neat 
to not make "break set" be a three-level command, like:

(lldb) break set file-and-line
(lldb) break set function-name
etc...

and use options for everything, but it does make it a little harder to document 
the cases where one option means something slightly different when used in 
combination with one or the other option.

Anyway, so setting a breakpoint by file name and function name was meant to 
support the case where you have:

$ cat foo.c
static int baz() {}
...
$ cat bar.c
static int baz() {}
...

and you want to limit the breakpoint to only the version of foo that was in 
foo.c.  You would do:

(lldb) b s -n baz -f foo.c
Breakpoint 1: where = a.out`baz + 4 at foo.c:5:10, address = 0x00010ee4

In that case searching in the CU makes sense, and is efficient.

Adding the name to a function name breakpoint is really only useful if you have 
multiple different instances of the same name that you need to disambiguate.  
In your example, you gain nothing by providing the header file name in which 
the symbol is defined.

I don't think I designed the behavior for when the colliding functions are 
actually defined in separate header files - I don't remember considering the 
possibility.  So what's happening is just an accidental outcome. It would 
certainly be possible, though a little trickier to do the filtering based on 
defining header.   You will have to look for the symbol and then chase down its 
declaration records to make sure they are in the right .h file, and also look 
for any inlined instances.  You probably also want to take some care not to 
regress the performance of the case where the function is defined in a CU, 
since that's the more common usage.

Jim

> On Nov 4, 2019, at 7:16 AM, Konrad Kleine via lldb-dev 
>  wrote:
> 
> I read the LLDB troubleshooting page [1] and found interesting quotes:
> 
> > When setting breakpoints in implementation source files (.c, cpp, cxx, .m, 
> > .mm, etc), LLDB by
> > default will only search for compile units whose filename matches. 
> > [...]
> >   % echo "settings set target.inline-breakpoint-strategy always" >> 
> > ~/.lldbinit
> > This tells LLDB to always look in all compile units and search for 
> > breakpoint locations
> > by file and line even if the implementation file doesn’t match. Setting 
> > breakpoints in header
> > files always searches all compile units because inline functions are 
> > commonly defined in
> > header files and often cause multiple breakpoints to have source line 
> > information that matches
> > many header file paths.
> 
> In my email before I did this
> 
> $ lldb -x -b -o "breakpoint set --file foo.h --name foo" ./a.out
> 
> I now added the breakpoint strategy and ran the above command without the -x 
> in order to pick up
> the LLDB init code. Still no luck.
> 
> [1]: https://lldb.llvm.org/use/troubleshooting.html#troubleshooting
> 
> Am Mo., 4. Nov. 2019 um 13:56 Uhr schrieb Konrad Kleine :
> Hello,
> 
> I noticed this behavior for LLDB under Linux when setting a breakpoint on a 
> file and a function name:
> 
> When doing "breakpoint set --file  --name ", the 
>  is that of the compile unit (CU) and not necessarily where the 
> function is defined. This is not what an end-user expects.
> 
> Take this simple example program:
> 
> $ cat foo.h 
> int foo(){ return 42; }
> 
> $ cat main.c
> #include "foo.h"
> int main(){return foo();}
> 
> $ clang -g main.c
> 
> As you can see, the function foo is defined in foo.h so it seems natural to 
> set a breakpoint on foo.h, doesn't it?
> 
> $ lldb -x -b -o "breakpoint set --file foo.h --name foo" ./a.out 
> (lldb) target create "./a.out"
> Current executable set to './a.out' (x86_64).
> (lldb) breakpoint set --file foo.h --name foo
> Breakpoint 1: no locations (pending).
> WARNING:  Unable to resolve breakpoint to any actual locations.
> 
> Apparently, LLDB cannot find the symbol like this. Let's try the only other 
> file that we have in the project:
> 
> $ lldb -x -b -o "breakpoint set --file main.c --name foo" ./a.out
> (lldb) target create "./a.out"
> Current executable set to './a.out' (x86_64).
> (lldb) breakpoint set --file main.c --name foo
> Breakpoint 1: where = a.out`foo + 4 at foo.h:1:12, address = 
> 0x00401114
> 
> Isn't that remarkable? LLDB uses main.c as the file to search in but then 
> finds it in foo.h.
> 
> Let's recall what the parameters --file and --name mean:
> 
>-n  ( --name  )
> Set the breakpoint by function name.  Can be repeated multiple 
> times to make one breakpoint for multiple names
> 
>-f  ( --file  )
>

Re: [lldb-dev] Slow expression evaluation (ASTImporter is too eager?)

2019-11-07 Thread Gábor Márton via lldb-dev
Hi Jaroslav,

Thanks for working on this. Still, there are things that are unclear for me.
I see that `LayoutRecordType` is called superfluously during the second
evaluation of `c2.x`, ok.

But, could you explain why LLDB is calling that multiple times? Maybe it
thinks a type is not completed while it is? As far as I know we keep track
which RecordDecl needs completeion in LLDB. At least, we do not store that
info in clang::ASTImporter. Minimal import gives a CXXRecordDecl whose
`DefinitionData` is set, even though the members are not imported the
definition data is there! So, there is no way for clang::ASTImporter to
know which RecordDecl had been imported in a minimal way.

I suspect there are multiple invocations of ASTImporter::ImportDefinition
with C2, C1, C0 within ClangASTSource (could you please check that?). And
`ImportDefinition` will blindly import the full definitions recursively
even if the minimal import is set (see ASTNodeImporter::IDK_Everything).
The patch would change this behavior which I'd like to avoid if possible. I
think first we should discover why there are multiple invocations of
ASTImporter::ImportDefinition from within LLDB.

Gabor


On Wed, Nov 6, 2019 at 11:21 PM Raphael “Teemperor” Isemann via lldb-dev <
lldb-dev@lists.llvm.org> wrote:

> Can you post that patch on Phabricator with an '[ASTImporter]’ as a name
> prefix? This way the ASTImporter folks will see your patch (The ASTImporter
> is more of a Clang thing, so they might not read lldb-dev). Also it makes
> it easier to see/test your patch :)
>
> (And +Gabor just in case)
>
> > On Nov 6, 2019, at 10:25 PM, Jaroslav Sevcik via lldb-dev <
> lldb-dev@lists.llvm.org> wrote:
> >
> > Hello,
> >
> > I noticed that the AST importer is very eager to import classes/structs
> that were already completed, even if they are not needed. This is best
> illustrated with an example.
> >
> > struct C0 { int x = 0; };
> > struct C1 { int x = 1; C0* c0 = 0; };
> > struct C2 { int x = 2; C1* c1 = 0; };
> >
> > int main() {
> >   C0 c0;
> >   C1 c1;
> >   C2 c2;
> >
> >   return 0;  // break here
> > }
> >
> > When we evaluate “c2.x” in LLDB, AST importer completes and imports only
> class C2. This is working as intended. Similarly, evaluating “c1.x” imports
> just C1 and “c0.x” imports C0. However, if we evaluate “c2.x” after
> evaluating “c1.x” and “c0.x”, the importer suddenly imports both C1 and C0
> (in addition to C2). See a log from a lldb session at the end of this email
> for illustration.
> >
> > I believe the culprit here is the following code at the end of the
> ASTNodeImporter::VisitRecordDecl method:
> >
> >   if (D->isCompleteDefinition())
> > if (Error Err = ImportDefinition(D, D2, IDK_Default))
> >   return std::move(Err);
> >
> > This will import a definition of class from LLDB if LLDB already happens
> to have a complete definition from before. For large programs, this can
> lead to importing very large chunks of ASTs even if they are not needed. I
> have tried to remove the code above from clang and test performance on
> several expressions in an Unreal engine sample - preliminary results show
> this could cut down evaluation time by roughly 50%.
> >
> > My prototype patch is below; note that couple of lldb tests are failing
> with a wrong error message, this is a work in progress. What would the
> experts here think? Is this a plausible direction?
> >
> > Cheers, Jaro
> >
> > diff --git a/clang/lib/AST/ASTImporter.cpp
> b/clang/lib/AST/ASTImporter.cpp
> > index 54acca7dc62..ebbce5c66c7 100644
> > --- a/clang/lib/AST/ASTImporter.cpp
> > +++ b/clang/lib/AST/ASTImporter.cpp
> > @@ -2799,7 +2799,7 @@ ExpectedDecl
> ASTNodeImporter::VisitRecordDecl(RecordDecl *D) {
> >if (D->isAnonymousStructOrUnion())
> >  D2->setAnonymousStructOrUnion(true);
> >
> > -  if (D->isCompleteDefinition())
> > +  if (!Importer.isMinimalImport() && D->isCompleteDefinition())
> >  if (Error Err = ImportDefinition(D, D2, IDK_Default))
> >return std::move(Err);
> >
> > @@ -3438,6 +3438,9 @@ ExpectedDecl
> ASTNodeImporter::VisitFieldDecl(FieldDecl *D) {
> >if (ToInitializer)
> >  ToField->setInClassInitializer(ToInitializer);
> >ToField->setImplicit(D->isImplicit());
> > +  if (CXXRecordDecl *FieldType = D->getType()->getAsCXXRecordDecl())
> > +   if (Error Err = ImportDefinitionIfNeeded(FieldType))
> > + return std::move(Err);
> >LexicalDC->addDeclInternal(ToField);
> >return ToField;
> >  }
> > @@ -5307,7 +5310,7 @@ ExpectedDecl
> ASTNodeImporter::VisitClassTemplateSpecializationDecl(
> >
> >D2->setTemplateSpecializationKind(D->getTemplateSpecializationKind());
> >
> > -  if (D->isCompleteDefinition())
> > +  if (!Importer.isMinimalImport() && D->isCompleteDefinition())
> >  if (Error Err = ImportDefinition(D, D2))
> >return std::move(Err);
> >
> >
> >
> > —— lldb session illustrating the unnecessary imports —-
> > This shows that evaluation of “c2.x” after evaluation “c1.x” 

Re: [lldb-dev] Identifying instructions that definitely access memory

2019-11-07 Thread Tatyana Krasnukha via lldb-dev
Hi Vangelis,

Not sure this will help you, but you can try to compare 
llvm::MachineInstr::getOpcode() with TargetOpcode::G_LOAD and 
TargetOpcode::G_STORE if you can obtain a MachineInstr instance.
It also may have sense to ask llvm-dev for a proper solution.

From: lldb-dev  On Behalf Of Vangelis 
Tsiatsianas via lldb-dev
Sent: Tuesday, November 5, 2019 3:43 PM
To: via lldb-dev 
Cc: Vangelis Tsiatsianas 
Subject: Re: [lldb-dev] Identifying instructions that definitely access memory

Hello,

I decided to try once more with a follow-up email, since my previous one got no 
responses (I hope it’s not considered rude to send more than one message in a 
row for a particular question).

To sum up and clarify my previous question, what I need is a way to track 
memory stores and save both the old and the new value of the memory location 
being modified.

My thinking so far:

  1.  Recognize the instructions that definitely access memory before they 
execute, based on their opcode.
  2.  Tell whether each operand is a register or a memory location.
  3.  If it’s a memory location, check whether it is a load or store 
destination.
  4.  In case it is a store destination, fetch and save current value from 
memory.
  5.  Execute instruction.
  6.  Fetch and save new value from memory.

However, I was not able to find a cross-architecture API that covers all of the 
conditions above and more specifically Instruction::DoesStore() and 
Operand::IsStoreDestination().

Last but not least, I should notice that the target is executed in single-step 
mode, so I do have control right before and after the execution of every 
instruction.

Thanks, again, in advance! 


― Vangelis



On 21 Oct 2019, at 08:54, Vangelis Tsiatsianas 
mailto:vangeli...@icloud.com>> wrote:

Hello,

I am looking for a way to identify loads, stores and any other kind of 
instruction that definitely perform memory access and extract the address 
operand(s), however I was not able to find a cross-architecture API. The 
closest I stumbled upon are "MCInstrDesc::mayLoad()" and 
"MCInstrDesc::mayStore()", but I understand that their results are just a hint, 
so I would then need to examine the instruction name or opcode in order to find 
out whether it’s actually a load or store and which operand(s) is (are) memory 
address(es) and also do so for each architecture separately, which I would 
really like to avoid.

Is there a way to identify such instructions either by examining them through 
the disassembler (e.g. "DoesLoad()" | "DoesStore()") before they execute or 
right after they perform any kind of memory access?

Thank you very much, in advance! 


― Vangelis



___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev