On Fri, Jan 08, 2016 at 09:39:11AM -0600, James Norris wrote:
> On 01/08/2016 09:30 AM, Jakub Jelinek wrote:
> >On Fri, Jan 08, 2016 at 09:24:14AM -0600, James Norris wrote:
> >>This patch removes the constraint whereby the PTX
> >>emitted was only for sm_30 GPU's. With this removal,
> >>the PTX emitted will be targeted for the current
> >>context, i.e., attached GPU.
> >>
> >>Bootstrapped/regtested on x86_64-linux, ok for trunk?
> >How does that work?
> >I see
> >static void
> >nvptx_file_start (void)
> >{
> >  fputs ("// BEGIN PREAMBLE\n", asm_out_file);
> >  fputs ("\t.version\t3.1\n", asm_out_file);
> >  fputs ("\t.target\tsm_30\n", asm_out_file);
> >  fprintf (asm_out_file, "\t.address_size %d\n", GET_MODE_BITSIZE (Pmode));
> >  fputs ("// END PREAMBLE\n", asm_out_file);
> >}
> >which I'd guess means the PTX code can be executed in sm_30 and newer only
> >anyway.
> >
> >	Jakub
>
> Yes. Per the PTX ISA manual for the .target directive: "In general,
> generations of SM architectures follow an onion layer model, where
> each generation adds new features and retains all features of
> previous generations."

And does CU_JIT_TARGET / CU_TARGET_COMPUTE_30 request JITting only for
sm_30 and nothing else, or for sm_30 or later, or something else?
This really should be reviewed by somebody more familiar with CUDA than
I am.

	Jakub
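
For reference, the "onion layer" rule quoted above can be sketched as a
small check in C. This is only an illustration of the quoted sentence,
not code from the patch or from the CUDA libraries: the helper names
ptx_target_level and ptx_runs_on are hypothetical, and the sketch
assumes a plain "sm_NN" target string and a device described by its
compute capability major/minor numbers. It says nothing about what
CU_JIT_TARGET itself requests.

```c
#include <stdio.h>

/* Hypothetical helper: parse a plain PTX target string such as "sm_30"
   into its numeric level (30).  Returns -1 for anything that is not of
   the form "sm_NN", including strings with trailing characters.  */
static int
ptx_target_level (const char *target)
{
  int level;
  char trailing;

  /* Exactly one conversion must succeed; a second one means there was
     extra text after the number.  */
  if (sscanf (target, "sm_%d%c", &level, &trailing) != 1)
    return -1;
  if (level < 0)
    return -1;
  return level;
}

/* Hypothetical helper: under the onion layer model, PTX emitted for a
   given target is assumed to run on a device whose compute capability
   MAJOR.MINOR is at least that target.  Returns 1 if so, 0 otherwise.  */
static int
ptx_runs_on (const char *target, int major, int minor)
{
  int level = ptx_target_level (target);

  if (level < 0)
    return 0;
  return major * 10 + minor >= level;
}
```

With these assumptions, ptx_runs_on ("sm_30", 5, 2) is nonzero while
ptx_runs_on ("sm_35", 3, 0) is zero, which matches the guess that
".target sm_30" PTX is usable on sm_30 and newer only.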