yaxunl added a comment.

In D101630#2777702 <https://reviews.llvm.org/D101630#2777702>, @tra wrote:

> In D101630#2777346 <https://reviews.llvm.org/D101630#2777346>, @yaxunl wrote:
>
>> In D101630#2748513 <https://reviews.llvm.org/D101630#2748513>, @tra wrote:
>>
>>> How about this:
>>> If the user explicitly specified `--cuda-host-only` or
>>> `--cuda-device-only`, then by default only allow producing the natural
>>> output format, unless a bundled output is requested by an option. This
>>> should keep existing users working.
>>> If the compilation is done without explicitly requested sub-compilation(s),
>>> then bundle the output by default. This should keep the GPU-unaware tools
>>> like ccache happy as they would always get the single output they expect.
>>>
>>> WDYT?
>>
>> `--cuda-host-only` always has one output, therefore there is no point in
>> bundling its output. We only need to decide the proper behavior of
>> `--cuda-device-only`.
>
> It still fits my proposal of requiring a single sub-compilation and not
> bundling the output.
> The point was that such behavior is consistent regardless of whether we're
> compiling CUDA or HIP for the host or for device.
>
>> How about keeping the original default behavior of not bundling if users do
>> not specify an output file,
>> whereas bundling the output if users specify an output file.
>
> I think it will make things worse. Compiler output should not change
> depending on whether `-o` is used.
>
>> Since specifying an output file indicates users requesting one output.
>> `-f[no-]hip-bundle-device-output` overrides the default behavior.
>
> I disagree. When the user specifies the output, the intent is to specify the
> **location** of the outputs, not its contents or format.
>
> Telling the compiler to produce a different output format should not depend on
> specifying (or not) the output location.
>
> I think our options are:
>
> - Always bundle `--cuda-device-only` outputs by default. This is consistent for
>   HIP compilation, but deviates from CUDA, which can't do bundling. Also,
>   single-target subcompilation deviates from both CUDA and regular C++
>   compilation, which is what most users would be familiar with and which would
>   probably be the most sensible default for a single sub-compilation. It can be
>   overridden with an option, but it goes against the principle that it's the
>   specialized use case that should need extra options. The most common use case
>   should not need them.
>
> - Only bundle multiple sub-compilations' output by default. This would
>   preserve the sensible single sub-compilation behavior. The downside is that
>   it makes the output format depend on whether the compiler ends up doing one or
>   many sub-compilations. E.g. `--offload-arch=A -S` would produce ASM and
>   `--offload-arch=A --offload-arch=B -S` would produce a bundle. If the user
>   can't control some of the compiler options, such an approach would make the
>   output format unpredictable. E.g. passing `--offload-arch=foo` to the compiler
>   on godbolt would all of a sudden produce bundled output instead of assembly
>   text or a sensible error message that you're trying to produce multiple
>   outputs.
>
> - Keep the current behavior (insist on single sub-compilation) as the
>   default, allow overriding it for HIP with the flag. IMO that's the most
>   consistent option and I still think it's the one most suitable to keep as the
>   default.
>
> I can see the benefit of always bundling for HIP, but I also believe that
> keeping things simple, consistent and predictable is important. Considering
> that we're tinkering in a relatively obscure niche of the compiler, it
> probably does not matter all that much, but it should not stop us from trying
> to figure out the best approach in a principled way.
>
> I think we could benefit from a second opinion on which approach would make
> more sense for clang.
> Summoning @jdoerfert and @echristo.

How does nvcc --genco behave when there are multiple GPU archs? Does it output a fat binary containing multiple ISAs?
Also, does it support device-only compilation for intermediate outputs?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101630/new/

https://reviews.llvm.org/D101630

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits