Hi Thomas,
Apologies if this complicates things, but one potential benefit of
--with-cuda-driver (i.e. linking the compiler against proprietary
libraries) is that it would allow support for -march=native on nvptx:
the gcc driver could figure out which sm_xx is available on the GPU(s)
of the current machine and pass that to cc1.  As with the many
microarchitectures on other platforms (e.g. x86_64), figuring this out
is not a trivial task for many end-users.
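
For what it's worth, once the compute capability is known, the driver
would still have to turn it into a value that -march accepts, much as
the -march-map flag quoted below does.  A minimal sketch of that step,
assuming the supported values are exactly the ones named in the gcc-12
notes (sm_30, sm_35, sm_53, sm_70, sm_75, sm_80) and that "highest
supported value not exceeding the device" is the right policy;
nvptx_native_sm is a hypothetical name, not an existing GCC function:

```c
#include <assert.h>
#include <stddef.h>

/* -march values named in the gcc-12 notes; assumed complete for this
   sketch.  Must stay sorted in ascending order.  */
static const int supported_sm[] = { 30, 35, 53, 70, 75, 80 };

/* Hypothetical helper: return the largest supported sm_xx that does not
   exceed compute capability MAJOR.MINOR, or -1 if the device is older
   than sm_30.  */
int
nvptx_native_sm (int major, int minor)
{
  assert (major >= 0 && minor >= 0 && minor <= 9);

  int cc = major * 10 + minor;
  int best = -1;
  for (size_t i = 0; i < sizeof supported_sm / sizeof supported_sm[0]; i++)
    if (supported_sm[i] <= cc)
      best = supported_sm[i];
  return best;
}
```

With this, a compute capability 5.0 device would get sm_35, which
matches the -march-map=sm_50 example in the notes.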

Of course, ideally I'd love to be able to figure out the PTX hardware
specifications and driver versions without using a third-party library,
but I've no idea how this could be done (portably across the platforms
that support libcuda).  Perhaps dlopen at runtime?  Or calling out to
(executing) nvptx-tools?
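
To make the dlopen idea a bit more concrete, here is a rough sketch of
what I have in mind.  It assumes "libcuda.so.1" is the right library
name (as in the libgomp plugin described below), that CUresult and
CUdevice can both be treated as int, and that the attribute numbers 75
and 76 are taken correctly from NVIDIA's <cuda.h>.
probe_compute_capability is a hypothetical name, and older glibc
versions would need -ldl when linking:

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_{MAJOR,MINOR} from <cuda.h>,
   restated here so that no CUDA headers are needed at build time.  */
enum { CC_MAJOR_ATTR = 75, CC_MINOR_ATTR = 76 };

/* Assumed prototypes: CUresult and CUdevice treated as plain int.  */
typedef int (*cu_init_fn) (unsigned int);
typedef int (*cu_device_get_fn) (int *, int);
typedef int (*cu_device_get_attribute_fn) (int *, int, int);

/* Hypothetical helper: store the compute capability of GPU number
   ORDINAL in *MAJOR and *MINOR.  Returns 0 on success, or -1 if the
   driver library, its entry points or the device are unavailable.  */
int
probe_compute_capability (int ordinal, int *major, int *minor)
{
  assert (major != NULL && minor != NULL);

  void *lib = dlopen ("libcuda.so.1", RTLD_LAZY);
  if (lib == NULL)
    return -1;			/* No CUDA driver installed.  */

  cu_init_fn init = (cu_init_fn) dlsym (lib, "cuInit");
  cu_device_get_fn get = (cu_device_get_fn) dlsym (lib, "cuDeviceGet");
  cu_device_get_attribute_fn attr
    = (cu_device_get_attribute_fn) dlsym (lib, "cuDeviceGetAttribute");

  int dev = 0;
  int rc = -1;
  /* Every CUDA Driver call returns 0 (CUDA_SUCCESS) on success.  */
  if (init != NULL && get != NULL && attr != NULL
      && init (0) == 0
      && get (&dev, ordinal) == 0
      && attr (major, CC_MAJOR_ATTR, dev) == 0
      && attr (minor, CC_MINOR_ATTR, dev) == 0)
    rc = 0;

  dlclose (lib);
  return rc;
}
```

On a machine without the driver this simply returns -1, so the gcc
driver could fall back to the default -march value in that case.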

Cheers,
Roger
--

> -----Original Message-----
> From: Thomas Schwinge <tho...@codesourcery.com>
> Sent: 05 April 2022 16:14
> To: Tom de Vries <tdevr...@suse.de>; Jakub Jelinek <ja...@redhat.com>
> Cc: gcc-patches@gcc.gnu.org; Tobias Burnus <tob...@codesourcery.com>;
> Roger Sayle <ro...@nextmovesoftware.com>
> Subject: Proposal to remove '--with-cuda-driver' (was: [wwwdocs][patch]
> gcc-12: Nvptx updates)
> 
> Hi!
> 
> Still catching up with GCC/nvptx back end changes...  %-)
> 
> 
> In the following I'm not discussing the patch to document
> "gcc-12: Nvptx updates", but rather one aspect of the
> "gcc-12: Nvptx updates" themselves.  ;-)
> 
> On 2022-03-30T14:27:41+0200, Tom de Vries <tdevr...@suse.de> wrote:
> > +  <li>The <code>-march</code> flag has been added.  The <code>-misa</code>
> > +    flag is now considered an alias of the <code>-march</code>
> > +    flag.</li>
> > +  <li>Support for PTX ISA target architectures <code>sm_53</code>,
> > +    <code>sm_70</code>, <code>sm_75</code> and <code>sm_80</code> has been
> > +    added.  These can be specified using the <code>-march</code>
> > +    flag.</li>
> > +  <li>The default PTX ISA target architecture has been set back
> > +    to <code>sm_30</code>, to fix support for <code>sm_30</code>
> > +    boards.</li>
> > +  <li>The <code>-march-map</code> flag has been added.  The
> > +    <code>-march-map</code> value will be mapped to an valid
> > +    <code>-march</code> flag value.  For instance,
> > +    <code>-march-map=sm_50</code> maps to <code>-march=sm_35</code>.
> > +    This can be used to specify that generated code is to be executed on a
> > +    board with at least some specific compute capability, without having to
> > +    know the valid values for the <code>-march</code> flag.</li>
> 
> Regarding the following:
> 
> >    <li>The <code>-mptx</code> flag has been added to specify the PTX ISA version
> >        for the generated code; permitted values are <code>3.1</code>
> > -      (default, matches previous GCC versions) and <code>6.3</code>.
> > +      (matches previous GCC versions), <code>6.0</code>, <code>6.3</code>,
> > +      and <code>7.0</code>. If not specified, the used version is the minimal
> > +      version required for <code>-march</code> but at least <code>6.0</code>.
> >    </li>
> 
> For "the PTX ISA version [used is] at least '6.0'", per
> <https://docs.nvidia.com/cuda/parallel-thread-execution/#release-notes>,
> this means we now require "CUDA 9.0, driver r384" (or more recent).
> Per <https://developer.nvidia.com/cuda-toolkit-archive>:
> "CUDA Toolkit 9.0 (Sept 2017)", so ~4.5 years old.
> Per <https://download.nvidia.com/XFree86/Linux-x86_64/>, I'm guessing a
> similar timeframe for the imprecise "r384" Driver version stated in that
> table.  That should all be fine (re not mandating use of all-too-recent
> versions).
> 
> Now, consider doing a GCC/nvptx offloading build with '--with-cuda-driver'
> pointing to CUDA 9.0 (or more recent).  This means that the libgomp nvptx
> plugin may now use CUDA Driver features of the CUDA 9.0 distribution ("driver
> r384", etc.) -- because that's what it is being 'configure'd and linked
> against.  (I say "may now use", because we're currently not making a lot of
> effort to use "modern" CUDA Driver features -- but we could, and probably
> should.  That's a separate discussion, of course.)  It then follows that the
> libgomp nvptx plugin has a hard dependency on CUDA Driver features of the
> CUDA 9.0 distribution ("driver r384", etc.).  (That's dependency as in ABI:
> via '*.so' symbol versions as well as internal CUDA interface configuration;
> see <cuda.h> doing different '#define's for different '__CUDA_API_VERSION'
> etc.)
> 
> Now assume one such dependency on "modern" CUDA Driver were not
> implemented by:
> 
> > +  <li>An <code>mptx-3.1</code> multilib was added.  This allows using older
> > +      drivers which do not support PTX ISA version 6.0.</li>
> 
> ... this "old" CUDA Driver.  Then you do have the '-mptx-3.1' multilib to use
> with "old" CUDA Driver -- but you cannot actually use the libgomp nvptx plugin,
> because that's been built against "modern" CUDA Driver.
> 
> Same problem, generally, for 'nvptx-run' of the nvptx-tools, which has similar
> CUDA Driver dependencies.
> 
> Now, that may currently be a latent problem only, because we're not actually
> making use of "modern" CUDA Driver features.  But, I'd like to resolve this
> "impedance mismatch", before we actually run into such problems.
> 
> Already long ago Jakub put in changes to use '--without-cuda-driver' to "Allow
> building GCC with PTX offloading even without CUDA being installed (gcc and
> nvptx-tools patches)": "Especially for distributions it is undesirable to need
> to have proprietary CUDA libraries and headers installed when building GCC.",
> and I understand GNU/Linux distributions all use that.  That configuration
> uses the GCC-provided 'libgomp/plugin/cuda/cuda.h',
> 'libgomp/plugin/cuda-lib.def' to manually define the CUDA Driver ABI to use,
> and then 'dlopen("libcuda.so.1")'.  (Similar to what the libgomp GCN (and
> before: HSA) plugin is doing, for example.)  Quite likely that our group (at
> work) are the only ones to actually use '--with-cuda-driver'?
> 
> My proposal now is: we remove '--with-cuda-driver' (make its use a no-op, per
> standard GNU Autoconf behavior), and offer '--without-cuda-driver' only.  This
> shouldn't cause any user-visible change in behavior, so safe without a prior
> deprecation phase.
> 
> Before I prepare the patches (GCC, nvptx-tools): any comments or objections?
> 
> 
> Grüße
>  Thomas
> 
> 
> >    <li>The new <code>__PTX_SM__</code> predefined macro allows code to check the
> > -      compute model being targeted by the compiler.</li>
> > +      PTX ISA target architecture being targeted by the
> > +      compiler.</li>
> > +  <li>The new <code>__PTX_ISA_VERSION_MAJOR__</code>
> > +      and <code>__PTX_ISA_VERSION_MINOR__</code> predefined macros allows code
> > +      to check the PTX ISA version being targeted by the
> > +      compiler.</li>
> >  </ul>
> -----------------
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201,
> 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer:
> Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht
> München, HRB 106955
