jhuber6 added inline comments.

================
Comment at: clang/lib/Driver/ToolChains/Flang.cpp:128
+  if (IsHostOffloadingAction) {
+    for (size_t i = 1; i < Inputs.size(); ++i) {
+      if (Inputs[i].getType() != types::TY_Nothing)
----------------
agozillon wrote:
> jhuber6 wrote:
> > agozillon wrote:
> > > awarzynski wrote:
> > > > agozillon wrote:
> > > > > awarzynski wrote:
> > > > > > What's the magic "1"? And given that the input count matters here - 
> > > > > > is there a test with multiple inputs?
> > > > > It aims to mimic the behavior of Clang: 
> > > > > https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Clang.cpp#L4561
> > > > >  where the main input is skipped (the input currently being compiled 
> > > > > or embedded into etc.), when adding to //-fembed-offload-object//. 
> > > > > 
> > > > > It does look different to Clang's as Clang has more cases and the 
> > > > > logic is spread across the constructJob invocation, but the first if 
> > > > > case is what the if statement inside of the loop and setting the loop 
> > > > > index variable to 1 do. The HostOffloadingInputs array is what is 
> > > > > being generated here, except we're skipping and directly applying it 
> > > > > as arguments.
> > > > > 
> > > > > I tried to condense it a little in this case! Perhaps it loses 
> > > > > readability though, I had hoped the comment might have kept it clear
> > > > Thanks for the link - that code in Clang doesn't really clarify what 
> > > > makes `Inputs[0]` special 🤔 . 
> > > > 
> > > > Let me rephrase my question - what's so special about the first input? 
> > > > (referred to in Clang as "main input") Is that something specific to 
> > > > OpenMP? For example, in this case:
> > > > ```
> > > > flang-new  -fopenmp  file.f90
> > > > ```
> > > > I assume that `inputs[0]` is "file.f90", so nothing will happen?
> > > > 
> > > > > I tried to condense it a little in this case! Perhaps it loses 
> > > > > readability though, I had hoped the comment might have kept it clear
> > > > 
> > > > Nah, I think that your implementation is fine. It's my ignorance with 
> > > > respect to OpenMP that's the problem here ;-)
> > > It's not specific to OpenMP I believe, as far as I am aware Clang's 
> > > supported offload models (SYCL and CUDA as well as OpenMP) use it! In 
> > > Flang's case we only really care about OpenMP as I believe it's the only 
> > > offload programming model supported.
> > > 
> > > In the case of the command: 
> > > 
> > > ```
> > > flang-new -fopenmp file.f90
> > > ``` 
> > > The code should never be executed as no part of the command will enable 
> > > an offloading action (for device or host)! But yes inputs[0] would indeed 
> > > refer to file.f90.
> > > 
> > > However, this code becomes relevant when you utilise an option that 
> > > enables the clangDriver to perform some form of offloading action. For 
> > > example a command like: 
> > > 
> > > ```
> > > flang-new -fopenmp --offload-arch=gfx90a file.f90 
> > > ```
> > > Will trigger two phase compilation, one for the host device (your 
> > > resident CPU in this command) and one for the device (gfx90a in this 
> > > command), the regular host pass will invoke like your provided command 
> > > and the device pass will then invoke with -fopenmp-is-device in addition 
> > > alongside the device triple. This generates two bitcode files from the 
> > > one file, one containing the host code from the file, the other the 
> > > device code (extracted from OpenMP target regions or declare target 
> > > etc.). 
> > > 
> > > However, now we have two files, with both parts of our program, we need 
> > > to conjoin them together, the clangDriver generates an embeddable 
> > > fat-binary/binary using the clang-offload-packager and then invokes 
> > > flang-new again, and this is where the above code becomes relevant, as 
> > > this binary (or multiple binaries, if we target multiple devices in the 
> > > same program) is what is passed to -fembed-offload-object! And inputs[0] 
> > > in this case would refer to the output from the original host pass, which 
> > > is what we want to embed the device binary into, so we wish to skip this 
> > > original host output and only pass the subsequent inputs (which should be 
> > > device binaries when the clangDriver initiates a host offloading action) 
> > > we want to embed as -fembed-offload-object arguments. 
> > > 
> > > The offloading driver is quite complex and my knowledge of it is not 
> > > perfect as I am not our resident expert on it unfortunately (so if anyone 
> > > sees anything incorrect, please do chime in and correct me)! 
> > > 
> > > But hopefully this answers your question and gives you an idea of when 
> > > and how this -fembed-offload-object comes into play, essentially a way to 
> > > get the device binaries we wish to insert into the host binary, so it can 
> > > load the binaries at runtime and execute them. Currently upstream Flang 
> > > doesn't utilise this option of course, but we intend to use this as part 
> > > of our OpenMP offloading efforts for AMD devices (whilst leaving the door 
> > > open for other vendors devices as well). We are trying to re-use/mimic as 
> > > much of the existing machinery that the clangDriver implements as we can! 
> > >  
> > The compiler requires at least one input file to run, otherwise it exits 
> > early. Therefore we're guaranteed to have at least one input file in the 
> > list. Some functions need an input file, usually to write some temp name 
> > to, and `Inputs[0]` is the easiest way to get an input file.
> Thank you very much @jhuber6! I should have added you as a 
> subscriber/reviewer as well in hindsight, sorry about that. 
Sorry, this is two separate things. I was thinking about the driver's input 
list which behaves like I mentioned above.

Here is the input list to an actual compilation job. In this case we expect to 
only get one. E.g. `clang foo.c bar.c` gets turned into `clang -cc1 foo.c` and 
`clang -cc1 bar.c`. The extra files added here are intended to be used as 
additional input handled separately. So, for example consider the following 
OpenMP offloading compilation.

```
> clang input.c -fopenmp --offload-arch=gfx1030 -ccc-print-bindings
# "x86_64-unknown-linux-gnu" - "clang", inputs: ["input.c"], output: 
"/tmp/input-44c8b5.bc"
# "amdgcn-amd-amdhsa" - "clang", inputs: ["input.c", "/tmp/input-44c8b5.bc"], 
output: "/tmp/input-gfx1030-4471d9.bc"
# "x86_64-unknown-linux-gnu" - "Offload::Packager", inputs: 
["/tmp/input-gfx1030-4471d9.bc"], output: "/tmp/input-83327d.out"
# "x86_64-unknown-linux-gnu" - "clang", inputs: ["/tmp/input-44c8b5.bc", 
"/tmp/input-83327d.out"], output: "/tmp/input-cdd693.o"
# "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: 
["/tmp/input-cdd693.o"], output: "a.out"
```
The first input file is the actual file meant to be compiled. The other files 
are handled separately. For the `amdgcn-amd-amdhsa` triple the extra input is 
the host bitcode we use to match symbols between the host and the device. For 
the `x64` triple the extra input is a binary blob to be embedded into the host.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145815/new/

https://reviews.llvm.org/D145815

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to