jhuber6 added a comment. A
================ Comment at: clang/lib/CodeGen/CodeGenModule.cpp:6836 + + // If the CUID is not specified we try to generate a unique postfix. + if (getLangOpts().CUID.empty()) { ---------------- jhuber6 wrote: > tra wrote: > > jhuber6 wrote: > > > tra wrote: > > > > > However, [CUID] is not always availible. > > > > > > > > The question is -- when and why is it not available? I'm getting the > > > > feeling that we're fixing the consequence here, not the root cause. > > > > > > > > Is there a reason we can't make sure that the driver always generates a > > > > cuid for offload subcompilations and error out if it's needed but is > > > > not provided? > > > > That would make this fallback unnecessary and would be a more robust > > > > approach in general. > > > > > > > So, I'm more in favor of this approach because it doesn't require extra > > > intervention from the compiler driver, this makes it less convoluted to > > > do split compilation since we don't have an extra arguments. The way I > > > would prefer it, is that we do this implicitly by default without > > > requiring extra thought from the driver, but if it's not good enough we > > > can support the manual `CUID` approach to let the user override it. I > > > think this is a cleaner implementation, and is mostly coming from my > > > support for CUDA in the new driver which currently doesn't implement the > > > CUID as we do with the old driver. Generally I'd prefer things to behave > > > independent of the driver, so we can consider host and device compilation > > > more separately. > > > So, I'm more in favor of this approach because it doesn't require extra > > > intervention from the compiler driver > > > > We need the driver intervention for any cc1 compilations anyways, so this > > does not buy us anything. While you can run a sub-compilation manually > > with handcrafted cc1 flags, that's not a practical use case. The driver is > > the ultimate source of cc1 flags. > > > > > this makes it less convoluted to do split compilation since we don't have > > > an extra arguments. > > > > For CUDA/HIP sub-compilation should be done with clang > > --cuda-host-only/--cuda-device-only. Whether the driver supplies yet > > another cc1 option, --cuid=... makes no difference to the user launching > > such sub-compilation. > > > > > The way I would prefer it, is that we do this implicitly by default > > > without requiring extra thought from the driver, but if it's not good > > > enough we can support the manual CUID approach to let the user override > > > it. > > > > I agree that we can come up with something that will almost always work. > > Possibly even good enough for all practical purposes. However, if a better > > solution would take comparable effort, it would make sense to do things > > right and avoid adding technical debt. > > > > On the other hand, requiring the driver to supply identical cuid to all > > sub-compilations appears to be a better approach to me: > > * Driver is the best place to do it, functionally. Driver has access to all > > user-provided inputs and is in position to guarantee that all > > subcompilations get the same cuid. > > * Calculating CUID in the driver keeps relevant logic in one place. Doing > > it in the driver *and* in the codegen > > * Figuring out what inputs are relevant for calculation of CUID in cc1 > > invocation is error prone. E.g. we have to guess which cc1 options are > > relevant or not and is the driver would pass a macro to one subcompilation > > but not to another, we would end up generating mismatching CUID and would > > not have any way to notice that. Even when that's not the case, we would > > need to guess which flags, supplied by the driver, are relevant. At CC1 > > level that may be somewhat complicated as top-level options may expand to > > quite a few more cc1 options. E.g. we'll need to take into account > > `-std=...`, `--cuda-path=`, `-include ...`, `-I` (and other include > > paths)... All of that does not belong to the codegen. > > > > The driver is already doing CUID computation, so I do not see any downsides > > to just letting it do its job, and I do believe it will be a better, and > > likely less complicated, solution. > > > > > ... mostly coming from my support for CUDA in the new driver which > > > currently doesn't implement the CUID as we do with the old driver > > > > Right. That appears to be the key missing piece. > > > > What are the obstacles for having CUID calculation done in the new driver. > > It should have all the info it needs. What am I missing? > > > > For CUDA/HIP sub-compilation should be done with clang > > --cuda-host-only/--cuda-device-only. Whether the driver supplies yet > > another cc1 option, --cuid=... makes no difference to the user launching > > such sub-compilation. > The problem I have with this is that we use the command line to generate the > value, so they aren't going to be the same without the user manually > specifying it. I guess we could filter out only "relevant" command line > flags, maybe that's an option. I just think it's not intuitive for a name > mangling scheme to depend on something external, but there's definitely > advantages to doing it that way. > > I can see your point for the Driver handling this stuff. Now that I'm > thinking about it I don't think looking at the macros or the other arguments > is a sound solution in the first place. Even without that it would work for > almost all the same cases just using the file's unique ID. Without that, this > solution is guaranteed not to conflict with any other file on the same file > system at the time of compilation. This, as we discussed, potentially fails > for non-static source trees and compiling the same file twice and linking it. > The current CUID implementation fails on the former, this method fails on > both. > > If the CUID didn't exist, the way I would have implemented it would simply be > with the File-ID, and have the CUID be a simple marshalling option that lets > the user override it to something unique if needed. I personally think that's > simpler for 99.99% of cases and has an easy-out in the last 0.01%. Given that > it already exists there's some desire to keep it since the work has already > been done I understand. > > > What are the obstacles for having CUID calculation done in the new driver. > > It should have all the info it needs. What am I missing? > It's less of a difficulty in implementing and more hoping we could make the > name mangling more simple and work by default without the driver. > Also, we may need this support for a single case in OpenMP, and I'd prefer > not need to generate the CUID for OpenMP offloading when it's unused the vast > majority of the time. Generally I'd prefer if compiling for the host / device > was conceptually the same to the user without requiring external values. If > we're sold on the CUID method I can go forward with that, but from my > perspective what it's buying us is the ability to compile the following > ``` > static __device__ int a; > > #ifdef MACRO > do_something_with(a); > #else > do_something_else(); > #endif > ``` > ``` > clang foo.cu -DMACRO -c -o 1.o > clang foo.cu 1.o > ``` > > This is just a tough problem overall, I don't think there's a single perfect > solution. Whatever we choose we'll be trading reproducibility for correctness > or whatever. You have more seniority in this space so it's your call what you > think I should go forward with. Also, it's incredibly convoluted, but I can think of a way to break even the current CUID for this. ``` static __device__ int a; __device__ int __attribute__((weak)) *a_ref = &a; ``` ``` $ clang a.cu -c -fgpu-rdc $ mv a.o b.o $ clang a.cu -c -fgpu-rdc $ nvlink a.o b.o -arch=sm_35 -o out.cubin nvlink error : Multiple definition of '_ZL1a__static__d041026c8e4167e6' in '1.o', first defined in 'a.o' nvlink fatal : merge_elf failed ``` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D125904/new/ https://reviews.llvm.org/D125904 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits