jansvoboda11 wrote:

> > I changed SourceManager::isInTheSameTranslationUnit() to check for that 
> > case explicitly instead of relying on the fact/bug that built-ins buffers 
> > happened to be assigned an untranslated include SourceLocation.
> 
> What is the translated location you get here? It's hard for me to tell if the 
> comment in `isBeforeInTranslationUnit` is describing the desired behaviour or 
> just a consequence of how it happened to work at the time.

When a module is built, its `SourceLocation` offsets go from 0 to N. When it's 
imported, they are remapped into the range from 2<sup>32</sup>-N to 
2<sup>32</sup>.

So the fact that we didn't remap the offset of the built-ins buffer meant it 
was not considered part of the imported module. This made 
`isInTheSameTranslationUnit(<built-ins>, module.h)` return `[false, false]` and 
the special case in `isBeforeInTranslationUnit()` would make sure to return 
`true` in order to enforce that the `<built-ins>` buffer always compares `<` to 
other offsets in the module.

It took me a while to figure out why that is desired. My understanding is that 
whether one offset is before another can't be just simple comparison of the 
numeric values. (More details on this are in `isInTheSameTranslationUnit()`.)

Here's a little example of how this ends up working with modules:

| entry               | ID [t] | offset    | `isBefore(?, t(250))` | 
`isBefore(t(250), ?)` |
|---------------------|--------|-----------|-----------------------|-----------------------|
| dummy               | 0 [-6] | <0,1)     | yes                   | no         
           |
| modulemap           | 1 [-5] | <1,10)    | yes                   | no         
           |
| `<module-includes>` | 2 [-4] | <10,100)  | yes                   | no         
           |
| `<built-in>`        | 3 [-3] | <100,200) | no                    | yes        
           |
| header              | 4 [-2] | <200,300) | yes -> no             | no -> yes  
           |

The `<module-includes>` buffer is parent of the header, but the `<built-in>` 
buffer ends up being created in between them. This causes the `isBefore` 
relationships to not be sorted/partitioned, as one might expect.

This is the reason the binary search in 
`ASTReader::findPreprocessedEntitiesInRange()` broke.

So the special case in `isBeforeInTranslationUnit()` that moves `<built-in>` 
before all other entries makes sense, but now needs to happen earlier, before 
calling `isInTheSameTranslationUnit()`. That's what I've done in the latest 
commit, and it actually fixes "Index/annotate-module.m". (I guess we should 
also consider merging both of those functions, because 
`isInTheSameTranslationUnit()` doesn't really stand on its own now.) 
Unfortunately, "SemaCXX/warn-unsafe-buffer-usage-fixits-parm-span.cpp" started 
failing, so I'll look into that next.

@benlangmuir, let me know if this explanation makes sense.

Also adding @sam-mccall as a reviewer, since he's worked in this area recently.

https://github.com/llvm/llvm-project/pull/66962
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to