tra added a comment.

I'm concerned that if we make it a top-level option, it's likely to be 
cargo-culted and (mis)used as a sledgehammer in cases where it's not needed. I 
think the option should remain hidden.

While thresholds do need to be tweaked,  in my experience it happens very 
rarely. 
When it does, most of the time it's sufficient to apply 
`__attribute__((always_inline))` to a few functions where it matters.
If AMDGPU bumps into the limit too often, perhaps it's the default threshold 
value that needs to be changed.

If we do add an option to control inlining threshold, then we should also 
consider doing the same for other thresholds. 
For instance, loop unrolling thresholds in my experience need bumping up about 
as often as the inlining ones. 
Similarly, most of the time the issue can be dealt with at the source code 
level with `#pragma unroll`.

Perhaps we should generalize this patch to deal with wider range of threshold. 
E.g. we could have something like `--gpu-threshold threshold-kind=x` which 
would expand to appropriate cc1 options for GPU sub-compilations.
It would also be nice to handle it in a way that can be used by both CUDA and 
HIP w/o having to copy/paste code.

Also, this patch would not be necessary if we had the generalized way to 
specify options for particular offload targets. Alas, we don't have it yet.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99233/new/

https://reviews.llvm.org/D99233

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to