clayborg added a comment.

General things to know about SymbolFiles:

- We use SymbolContext objects to refer to a specific symbol context: module, 
compile unit, function, deepest block, line entry and symbol.
- A load address should produce 1 symbol context. Never more.
- A file address can produce many symbol contexts. Think about looking up 
address 0x1000. On MacOSX all shared libraries have their file virtual 
addresses such that they start at 0x0, so many functions shortly after zero, so 
address 0x1000 can produce many matches since file addresses aren't unique
- file + line can match many symbol contexts. Think about what the user would 
expect if they typed:  (lldb) b vector:123
- SymbolFile objects produce IDs for compile units, functions, variables, 
types, and many other things by creating these objects with a "lldb::user_id_t" 
that makes sense for the SymbolFile itself such that if they are given this ID 
back and asked to parse further information, they should quickly be able to use 
that number to get back to the debug info and parse more. For DWARF we use the 
DIE offset. The only thing that isn't correctly abstracted right now is the 
file IDs, they are currently expected to be indexes into the support files. 
This works well for DWARF, but probably not for other formats. We can change 
this if needed as I mentioned before.


================
Comment at: source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp:98-100
@@ +97,5 @@
+{
+    auto global = m_session->getGlobalScope();
+    auto compilands = global->findAllChildren<llvm::PDBSymbolCompiland>();
+    return compilands->getChildCount();
+}
----------------
zturner wrote:
> clayborg wrote:
> > Do you want to cache the compile unit count, or is this really cheap to 
> > call over and over?
> Most of these methods are really cheap.  In fact, the entire process of 
> caching stuff in `SymbolContexts` and whatnot is probably unnecessary for us 
> in theory because while it may be cheaper performance wise to cache it the 
> way we're doing, it's not *that* expensive to just query the file, and when 
> you consider the memory usage of caching it a second time, it's not clear 
> it's a good tradeoff.
> 
> That said, there's not really a way around this with the current 
> architecture, so I only cache where necessary (can always add more later if 
> performance becomes an issue)
I was just talking about caching the number of compile units. You will be asked 
to create each individual lldb_private::CompileUnit by index with:

```
lldb::CompUnitSP
SymbolFilePDB::ParseCompileUnitAtIndex(uint32_t index) 
```

And you will be asked only once for a compile unit by index since the symbol 
vendor will cache those for you.

================
Comment at: source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp:257
@@ +256,3 @@
+        auto compilands =
+            m_session->findCompilandsForSourceFile(file_spec.GetPath(), 
llvm::PDB_NameSearchFlags::NS_CaseInsensitive);
+
----------------
> Just to make sure I understand what "support files" are.
> 
> If I have this file:
> 
> // foo.cpp
> #include "foo.h"
> #include <vector>
> int main(int argc, char **argv) {
>     return 0;
> }
> and I run clang.exe foo.cpp -o foo.exe, and someone calls 
> ResolveSymbolContext("foo.cpp", ...), then in this scenario:
> 
> The compilation unit is foo.cpp
> The support files are foo.h, vector, and any other headers indirectly 
> included by those headers.

The support files for DWARF will be "foo.cpp" as the first file (this is just 
the way DWARF does it) and foo.h vector and any other header files that were 
used will also be in there. So your support files will need to include your 
compile unit source file because you will create some line entries that point 
to your main source file.

> If check_inlines is false, the SymbolContextList should only have one entry 
> for foo.cpp, and all the line table should contain all entries from foo.cpp 
> that match the line number.

SymbolContextList would have N entries inside of it where the 
SymbolContextList.comp_unit is always a compile unit whose name matches the 
"file_spec". The LineEntry in each symbol context will also have a file index 
that matches the SupportFiles index for the "foo.cpp" source file.

> If check_inlines is true, the SymbolContextList should still only have one 
> entry for foo.cpp, but the line table should contain all entries from foo.cpp 
> that match *and* all entries from all support files (as defined in #2) that 
> match.

No there will be one SymbolContext for each line entry that matches. All of the 
SymbolContext objects will have LineEntry whose file index will produce a file 
that is "foo.cpp" when the file is extracted by index from the compile units 
support files. 

So a quick example in main.cpp:

```
     1  #include <vector>
     2  
     3  int main()
     4  {
     5      std::vector<int> ints;
     6      ints.push_back(123);
     7      ints.push_back(234);
     8  }
     9  
    10  int foo()
    11  {
    12      std::vector<long> longs;
    13      longs.push_back(123);
    14      longs.push_back(234);
    15  }
```

If vector<T>::push_back is on vector:234, and the user asks to set a breakpoint:

```
(lldb) b vector:234
```

Lets says our support files look like:

```
support_files[0] = main.cpp
support_files[1] = /usr/include/.../vector
```

You would have 4 symbol contexts in your SymbolContextList:

```
SymbolContext[0]:
   module = a.out
   compile_unit = main.cpp
   function = main
   block = std::vector<int>::push_back() with call_file = 0 and call_line = 6 
(main.cpp:6)
   line_entry = file_idx = 1, line = 234, address = 0x1000

SymbolContext[1]:
   module = a.out
   compile_unit = main.cpp
   function = main
   block = std::vector<int>::push_back() with call_file = 0 and call_line = 7 
(main.cpp:7)
   line_entry = file_idx = 1, line = 234, address = 0x1010

SymbolContext[2]:
   module = a.out
   compile_unit = main.cpp
   function = foo
   block = std::vector<long>::push_back() with call_file = 0 and call_line = 13 
(main.cpp:13)
   line_entry = file_idx = 1, line = 234, address = 0x2000

SymbolContext[3]:
   module = a.out
   compile_unit = main.cpp
   function = foo
   block = std::vector<long>::push_back() with call_file = 0 and call_line = 14 
(main.cpp:14)
   line_entry = file_idx = 1, line = 234, address = 0x2010
```

Note the compile unit always is the compile unit that the code exists in. The 
block is a lldb_private::Block that has inlined function info that tells us 
about std::vector inlined instantiation and also tells us the inline call 
stack. In LLDB inlined functions are just lexical blocks that have inline 
function info (inlined function name + call file + call line). There could be 
many inlined blocks above your current block as well..

> Also, the code as written should already handle the case where only basename 
> is specified (the SDK handles this case transparently)
> 
> (BTW, it would be nice if there were a way to write unit tests for all the 
> things you're describing. All the different ways of calling it with different 
> combinations of arguments, and verifying the output is as expected. It's a 
> lot to wrap your head around trying to turn words into code, but if you have 
> a test that just fails, then it's easy to look at it and see what it expects).

We should be covering this using API tests that make an example like the one 
above and setting breakpoints in the inline header file. We have many tests for 
this. It is hard to get compilers to generate things that you can put in unit 
tests since they often change. 


http://reviews.llvm.org/D17363



_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to