tra added a subscriber: jdoerfert.
tra added a comment.

In D101630#2730273 <https://reviews.llvm.org/D101630#2730273>, @yaxunl wrote:

> How about an option -fhip-bundle-device-output. If it is on, device output is 
> bundled no matter how many GPU arch there are. By default it is on.

+1 to the option, but I can't say I'm particularly happy about the default. I'd 
still prefer the default to be no bundling, plus an error in cases where we'd 
nominally produce multiple outputs.
We could use a third opinion here.

@jdoerfert: Do you have any thoughts on what would be a sensible default when 
a user uses `-S -o foo.s` for compilations that may produce multiple results? I 
think OpenMP may have to deal with similar issues.

On one hand, it would be convenient for ccache to just work with CUDA/HIP 
compilation out of the box. The compiler would always produce one output file, 
regardless of what it does under the hood, and ccache may not care what's in it.

On the other hand, this behavior breaks user expectations, i.e. `clang -S` is 
supposed to produce assembly, not an opaque binary bundle blob.
Using `-S` with multiple sub-compilations is also likely an error on the 
user's end and should be explicitly diagnosed, which is how it currently works.
Requiring `-fno-hip-bundle-device-output` to restore the expected behavior puts 
the burden on the wrong side, IMO. I believe it is ccache that should be 
using `-fhip-bundle-device-output` to deal with CUDA/HIP compilations.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101630/new/

https://reviews.llvm.org/D101630

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
