tra added a comment.

LGTM in principle. This will keep around the GPU code we do need.

That said, it seems to be a rather blunt hammer. I think we'll end up linking 
almost everything in an archive into the final executable, as most of the GPU 
objects will likely have a host-visible symbol (e.g. most of them would have 
a kernel).
Device-side linking would also be unaware of which objects were actually linked 
into the host executable and would thus link in more objects than necessary. We 
could achieve about the same result by linking with `--whole-archive`.

The root of the problem here is that, in isolation, GPU-side linking does not 
know what the host will really need and thus has to link in everything, 
except, maybe, object files that contain only `__device__` functions.
Ideally, linking should be a two-phase process: link the CPU side, extract 
references to the GPU symbols (host-side compilation would have to be augmented 
to place them in a well-known location), and pass them to the GPU-side linker, 
which would then have all the info necessary to pull in the relevant GPU-side 
objects without the compiler having to force nearly all of them to be linked in.
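
To make the difference concrete, here is a toy model of archive-member 
selection. It is only a sketch of the idea, not how clang or any real linker 
is implemented; `Member`, `select_members`, and the object and symbol names 
are all hypothetical. It compares seeding the GPU-side link with every kernel 
against seeding it only with the kernels the host link actually referenced.

```python
# Toy model of archive-member selection. All names here are hypothetical
# and chosen for illustration only.
from collections import namedtuple

# A GPU object file: the symbols it defines and the symbols it references.
Member = namedtuple("Member", ["name", "defines", "needs"])


def select_members(archive, roots):
    """Return the names of archive members needed to resolve `roots`.

    Mimics classic archive semantics: a member is pulled in only if it
    defines a currently undefined symbol; its own references then join
    the set of undefined symbols.
    """
    undefined = set(roots)
    defined = set()
    selected = []
    changed = True
    while changed:
        changed = False
        for member in archive:
            if member.name in selected:
                continue
            if undefined & set(member.defines):
                selected.append(member.name)
                defined |= set(member.defines)
                undefined |= set(member.needs)
                undefined -= defined
                changed = True
    return selected


archive = [
    Member("a.o", {"kernel_a"}, {"helper"}),
    Member("b.o", {"kernel_b"}, set()),
    Member("c.o", {"helper"}, set()),  # __device__ functions only
    Member("d.o", {"kernel_d"}, {"helper"}),
]

# Blunt approach: every host-visible kernel is treated as required.
all_kernels = {"kernel_a", "kernel_b", "kernel_d"}
blunt = select_members(archive, all_kernels)

# Two-phase approach: only kernels the host link referenced are roots.
host_refs = {"kernel_a"}  # assumed output of the CPU-side link
two_phase = select_members(archive, host_refs)

print(blunt)      # ['a.o', 'b.o', 'c.o', 'd.o']
print(two_phase)  # ['a.o', 'c.o']
```

In this model the blunt approach selects every member, which matches what 
`--whole-archive` would give, while the two-phase approach selects only the 
object with the referenced kernel and the helper it depends on.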

I realize that this would be a nontrivial change to the compilation pipeline. 
As a short-to-medium term solution, this patch may do, though I'd probably 
prefer just linking with `--whole-archive` as it would, in theory, be simpler.


https://reviews.llvm.org/D123441

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits