sammccall added a comment.

TL;DR: I think there are three viable paths that we should consider:
A) minimal change: put the FSProvider in ParseInputs
B) this patch, but with more documentation and safety
C) fight to clean up VFS multithreading semantics

Assuming we don't want to block on C, I'm not sure whether A or B is 
conceptually better. But A is a smaller change and brings side benefits, so 
wondering if you see advantages in B.

---

You're right about the race condition of course and I'm sure this patch is 
correct.
But this rubs me the wrong way a bit - the API change doesn't seem clearly 
motivated either from a high level (we conceptually *want* a single FS to build 
a single version of code) or from a low level (what are we actually doing in 
parallel that isn't threadsafe?).

Looking at the low level perspective first. VFS has three classes threadsafety 
issues:

- it contains the "working directory" as mutable state, which isn't 
multi-threading friendly. We do set working directory concurrently with other 
work, but always to the same thing. We could factor our way around this by 
setting the working directory prior to any concurrency (to CDB.Directory) and 
ensuring we never set it again.
- the RealFilesystem in real-working-directory mode is a singleton so its use 
is thread-hostile. This isn't really relevant to us because we use 
emulated-working-directory mode and have distinct instances. But it probably 
informed VFS's lack of useful threading semantics.
- read operations aren't `const` or documented as threadsafe even though the 
canonical `RealFilesystem` is. In-tree implementations seem to be threadsafe, 
some out-of-tree ones are not :-( This definitely bites us here.

I think it would be fundamentally possible to use a single VFS here. We need to 
modify the VFS interface to make concurrent reads safe (and const, while here) 
and fix any implementations. This isn't trivial. I think it's worthwhile, but 
experience tells me different projects want different things from the VFS 
interface...

The high-level view: IMO FSProvider is basically a hack around VFS's 
thread-safety issues, rather than a fundamental part of the application - it's 
a filesystem that doesn't have an attached working directory and whose read 
operations (e.g. `getFileSystem()->status()`) are threadsafe. We could rename 
FSProvider -> ThreadsafeFS and getFileSystem() -> newCursor(WD) or something, 
and maybe we should. 
From this perspective, this patch is changing two things that have little to do 
with each other: it's replacing FS -> ThreadsafeFS and it's moving the FS 
ownership from one revision of the file to the TUScheduler itself.

I'm not really sure whether the latter is a good or a bad idea. If we want to 
do it, the use of ParseInputs as the interface to TUScheduler is a bit 
confusing - we should at least document that FS isn't used (as we do for 
CompileCommand) and probably assert that it's null, too.

But maybe simpler would be to "just" change the VFS member in ParseInputs to a 
`const FSProvider*`. This keeps the design approximately what it is now, but 
simplifies the ParseInputs concept: it's now truly readonly data, and functions 
that consume it won't have effects that escape.



================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:629
     std::vector<std::string> CC1Args;
     std::unique_ptr<CompilerInvocation> Invocation = buildCompilerInvocation(
         Inputs, CompilerInvocationDiagConsumer, &CC1Args);
----------------
Yikes, ISTM we're getting the FS from Inputs here, and from this->FSProvider 
below. We should use one or the other (i.e. FSProvider only, following this 
patch) unless we're trying to do something really subtle...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80900/new/

https://reviews.llvm.org/D80900



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to