https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104841
Bug ID: 104841
Summary: [nvptx] Multi-version ptx
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
[ From the category wild ideas... ]
The current default is sm_35, and it will soon revert to sm_30.
As a result, all libraries will be built for sm_30.
We could add multilibs for higher sm_xx, but I wonder if we could not exploit
the fact that a ptx object is text.
That is, rather than emit a fixed .target, emit a range, say:
...
.target sm_30, sm_35
...
and at driver load time, determine the sm_xx to use based on the
capabilities of the board you're preparing to load to, and then, say,
preprocess the ptx:
...
#if TARGET_SM >= 35
// some ptx using sm_35
#else
// some more basic ptx using sm_30
#endif
...
and so support different sm_xx within a single object, without the need to add
multilibs.
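To make the load-time step a bit more concrete, here is a minimal sketch in
Python. It is only an illustration: the multi-valued .target line and the
TARGET_SM conditionals are the hypothetical syntax proposed above, the names
pick_sm and preprocess_ptx are made up, nesting is not handled, and a real
implementation would presumably live in the driver or plugin rather than in
a script.

```python
import re

# Hypothetical syntax from the proposal: ".target sm_30, sm_35"
TARGET_RE = re.compile(r"\.target\s+(sm_\d+(?:\s*,\s*sm_\d+)*)$")
# Hypothetical conditional: "#if TARGET_SM >= 35"
IF_RE = re.compile(r"#if\s+TARGET_SM\s*>=\s*(\d+)$")

SAMPLE = """.target sm_30, sm_35
#if TARGET_SM >= 35
// some ptx using sm_35
#else
// some more basic ptx using sm_30
#endif
"""


def pick_sm(levels, device_sm):
    """Return the highest listed sm level that does not exceed device_sm."""
    usable = [sm for sm in levels if sm <= device_sm]
    if not usable:
        raise ValueError("device is older than every listed target")
    return max(usable)


def preprocess_ptx(text, device_sm):
    """Resolve the .target range and non-nested #if/#else/#endif blocks."""
    out = []
    target_sm = None
    active = True  # True while lines belong to the selected branch
    for line in text.splitlines():
        s = line.strip()
        m = TARGET_RE.match(s)
        if m:
            levels = [int(t.strip()[3:]) for t in m.group(1).split(",")]
            target_sm = pick_sm(levels, device_sm)
            out.append(f".target sm_{target_sm}")
            continue
        m = IF_RE.match(s)
        if m:
            if target_sm is None:
                raise ValueError("#if TARGET_SM seen before .target")
            active = target_sm >= int(m.group(1))
        elif s == "#else":
            active = not active
        elif s == "#endif":
            active = True
        elif active:
            out.append(line)
    return "\n".join(out) + "\n"
```

Calling preprocess_ptx(SAMPLE, 70) keeps the sm_35 branch and rewrites the
directive to ".target sm_35"; with a device value of 30 or 32 it falls back
to the sm_30 branch.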
The preprocessing will add extra time, but is only necessary if the object uses
a target range. [ Alternatively, you can have the compiler emit complete
objects with different versions and at driver load time pick from those without
preprocessing overhead. ]
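The alternative in the brackets could look roughly like this, again as a
Python sketch under assumptions: the object is modelled as a plain dict from
sm level to complete ptx text, and select_variant is a made-up name. How the
variants would actually be stored in an object is left open here.

```python
def select_variant(variants, device_sm):
    """Pick the ptx text built for the highest sm level the device supports.

    variants: dict mapping an sm level (e.g. 30, 35) to complete ptx text.
    """
    usable = [sm for sm in variants if sm <= device_sm]
    if not usable:
        raise LookupError("no variant is old enough for this device")
    return variants[max(usable)]


# Illustrative object with two complete variants.
VARIANTS = {
    30: ".target sm_30\n// some more basic ptx using sm_30\n",
    35: ".target sm_35\n// some ptx using sm_35\n",
}
```

Selection here is a lookup rather than a pass over the text, at the cost of
carrying every variant in full.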
Likewise for mptx, but code generation might actually diverge more in the
future due to PR104768, to the point where the approach is not feasible, at
least not for arbitrary ptx isa version ranges.