yaxunl added a comment.

In D101630#2733761 <https://reviews.llvm.org/D101630#2733761>, @tra wrote:

> In D101630#2730273 <https://reviews.llvm.org/D101630#2730273>, @yaxunl wrote:
>
>> How about an option -fhip-bundle-device-output. If it is on, device output 
>> is bundled no matter how many GPU arch there are. By default it is on.
>
> +1 to the option, but I can't say I'm particularly happy about the default. 
> I'd still prefer the default to be a no-bundling + an error in cases when 
> we'd nominally produce multiple outputs.
> We could use a third opinion here.
>
> @jdoerfert : Do you have any thoughts on what would be a sensible default 
> when a user uses `-S -o foo.s` for compilations that may produce multiple 
> results? I think OpenMP may have to deal with similar issues.
>
> On one hand it would be convenient for ccache to just work with the CUDA/HIP 
> compilation out of the box. Compiler always produces one output file, 
> regardless of what it does under the hood and ccache may not care what's in 
> it.
>
> On the other, this behavior breaks user expectations. I.e. `clang -S` is 
> supposed to produce the assembly, not an opaque binary bundle blob.
> Using `-S` with multiple sub-compilations is also likely an error on the 
> user's end and should be explicitly diagnosed, which is how it currently 
> works.
> Using `-fno-hip-bundle-device-output` to restore the expected behavior puts 
> the burden on the wrong side, IMO.  I believe, it should be ccache which 
> should be using `-fhip-bundle-device-output` to deal with the CUDA/HIP 
> compilations.

I chose to emit the bundled output by default because the convention is for a 
compiler to produce one output. Compilation is like a pipeline: if we break it 
into stages, users expect to use the output of one stage as the input to the 
next stage, which is possible only if each stage has one output. This already 
holds for host compilations and combined device/host compilations, so it would 
be surprising if it did not hold for device compilation.

Also, when users do not want the output to be bundled, it is usually for 
debugging or other special purposes, and they then need to know the naming 
convention of the multiple outputs. I think it is justifiable to require an 
explicit option for unbundled output.
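
To make the proposed semantics concrete, here is a minimal Python sketch of 
how many files a device-only compilation would write under the option 
discussed above. It is an illustration, not the actual driver code: the 
function name `device_outputs` and the `<stem>-<arch><ext>` naming for 
unbundled files are assumptions, since the real naming convention is not 
spelled out in this thread.

```python
from pathlib import PurePosixPath


def device_outputs(output, gpu_archs, bundle=True):
    """Return the file names a device-only compilation would write.

    bundle=True models -fhip-bundle-device-output (the proposed default):
    a single output file, no matter how many GPU archs are compiled.
    bundle=False models -fno-hip-bundle-device-output: one file per arch.
    """
    if not gpu_archs:
        raise ValueError("at least one GPU arch is required")
    if bundle:
        # One bundled file, so the next pipeline stage has a single input.
        return [output]
    # Hypothetical naming scheme, for illustration only: <stem>-<arch><ext>.
    path = PurePosixPath(output)
    return [
        str(path.with_name(f"{path.stem}-{arch}{path.suffix}"))
        for arch in gpu_archs
    ]
```

With bundling on, `device_outputs("foo.s", ["gfx906", "gfx908"])` yields the 
single name `foo.s`. With `bundle=False`, the caller has to know the 
per-arch names, which is the extra burden described above.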


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101630/new/

https://reviews.llvm.org/D101630

