jansvoboda11 added inline comments.

================
Comment at: clang/lib/Basic/FileManager.cpp:625-627
+        Entries.push_back(
+            FileEntryRef(*reinterpret_cast<const FileEntryRef::MapEntry *>(
+                Value->V.get<const void *>())));
----------------
rmaz wrote:
> jansvoboda11 wrote:
> > Why is this necessary? IIUC, `FileEntryRef` has this logic in 
> > `getBaseMapEntry()`. I would expect `FileEntryRef(Entry)` to be correct 
> > regardless of what's in `Value->V`.
> I borrowed this from: 
> https://github.com/llvm/llvm-project/blob/main/clang/lib/Basic/FileManager.cpp#L399-L408
> 
> I assumed this was necessary from @benlangmuir 's comment on D142780:
> 
> > Also, it will never give you a vfs mapped path since it's skipping those 
> > (V.dyn_cast<FileEntry *>()).
> 
> Are you saying we should just do something like:
> ```
> for (const auto &Entry : SeenFileEntries) {
>     if (llvm::ErrorOr<FileEntryRef::MapValue> Value = Entry.getValue())
>         Entries.push_back(FileEntryRef(Entry));
> }
> ```
> ?
Oh, okay, so the goal is to skip VFS-mapped file entries and only serialize the 
on-disk ones into PCMs. That makes sense given we do actual `FileEntry` 
comparison in `HeaderFileInfoTrait::EqualKey()` (instead of plain comparison of 
filenames that I assumed). I think that for VFS-mapped entries, we always have 
a separate element in the `SeenFileEntries` map that represents the on-disk 
file entry. So we should be skipping the VFS-mapped entries entirely instead of 
considering their remapping target entries again.

If my understanding is correct, we should do something like this:

```
    if (llvm::ErrorOr<FileEntryRef::MapValue> Value = Entry.getValue())
      // Ignore VFS-mapped entries.
      if (Value->V.dyn_cast<FileEntry *>())
        Entries.push_back(FileEntryRef(Entry));
```


================
Comment at: clang/lib/Serialization/ASTWriter.cpp:176
 
+  for (auto &File : FileEntries) {
     const HeaderFileInfo *HFI =
----------------
rmaz wrote:
> jansvoboda11 wrote:
> > I'm afraid that iterating over file entries in "non-deterministic" order 
> > can cause non-determinism down the line. For example, calling 
> > `HeaderSearch::findAllModulesForHeader()` in the loop can trigger 
> > deserialization of `HeaderFileInfo` from loaded PCMs. That code fills the 
> > `Module::Headers` vectors, which we serialize without sorting first.
> Do you think we should move the sorting into the `getSeenFileEntries()` 
> method?
I think so, yes.


================
Comment at: clang/lib/Serialization/ASTWriter.cpp:1939
     // Massage the file path into an appropriate form.
     StringRef Filename = File->getName();
     SmallString<128> FilenameTmp(Filename);
----------------
rmaz wrote:
> jansvoboda11 wrote:
> > This will insert duplicate entries (with the same key) into the on-disk 
> > hash table. I don't know if that's problematic, just thought I'd call it 
> > out.
> Not sure I follow, why would we have `FileEntry`s with duplicate names?
Sorry, I had assumed we only do lookups in the on-disk hash table based on 
filenames (instead of full `FileEntry` comparison) and that for that reason, we 
should serialize all `FileEntryRef` into the PCM, even the VFS-mapped ones. But 
I now see that you're trying to only serialize the on-disk `FileEntryRef`s, 
which will not cause duplication. Ignore my comment here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143414/new/

https://reviews.llvm.org/D143414

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to