
Re: [PATCH] D61458: [hip] Relax CUDA call restriction within `decltype` context.

2019-05-02 Thread Justin Lebar via cfe-commits

> In any case, it seems like your examples argue for disallowing a
> return-type mismatch between host and device overloads, not disallowing
> observing the type?

Oh no, we have to allow return-type mismatches between host and device
overloads; that is common in the CUDA code I've seen.  You can safely
observe this difference *so long as you're inside of a function*.  This is
because function parsing is caller-sensitive: when parsing a __host__
__device__ function, we look at the caller to understand what context
we're in.

What I think you can't do is observe the return-type mismatch between host
and device overloads *from outside of a function*, e.g. from within a
trailing return type.
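
To make the inside/outside distinction concrete, here is a minimal sketch in plain C++ rather than CUDA. It is only a model under stated assumptions: the `Mode` tag and the `Bar` template are hypothetical stand-ins for the two compilations and for the `__host__`/`__device__` overloads of `bar()`, and nothing here reflects how clang actually resolves the call.

```cpp
#include <type_traits>

// Hypothetical tag standing in for "which side are we compiling for".
enum class Mode { Host, Device };

// Hypothetical stand-in for the overload pair
//   __host__ float bar();  __device__ int bar();
template <Mode M> struct Bar;
template <> struct Bar<Mode::Host>   { static float call() { return 1.5f; } };
template <> struct Bar<Mode::Device> { static int   call() { return 1; } };

// Observing the mismatch inside the body: the signature is void() either way.
template <Mode M> void observe_inside() {
  auto x = Bar<M>::call();  // float or int, but only visible within the body
  (void)x;
}

// Observing the mismatch in a trailing return type: the signature itself changes.
template <Mode M> auto observe_outside() -> decltype(Bar<M>::call()) {
  return Bar<M>::call();
}

constexpr bool inside_signature_matches =
    std::is_same<decltype(&observe_inside<Mode::Host>),
                 decltype(&observe_inside<Mode::Device>)>::value;

constexpr bool outside_signature_matches =
    std::is_same<decltype(&observe_outside<Mode::Host>),
                 decltype(&observe_outside<Mode::Device>)>::value;

static_assert(inside_signature_matches,
              "body-local observation should not leak into the signature");
static_assert(!outside_signature_matches,
              "trailing-return-type observation should change the signature");
```

In this model the mismatch stays contained when it is observed in the body, and leaks into the function's type when it is observed in the trailing return type.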

But perhaps rsmith or another expert can take my attempt at a
contract above and trap me in a Faustian contradiction.

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Re: [PATCH] D61458: [hip] Relax CUDA call restriction within `decltype` context.

2019-05-02 Thread Finkel, Hal J. via cfe-commits

Thanks, Justin. It seems like we have the standard set of options: We can
disallow the mismatch. We can allow it with a warning. We can allow it without
a warning. We can say that if the mismatch contributes to the type of a kernel
function, that's ill-formed (NDR).

In any case, it seems like your examples argue for disallowing a return-type 
mismatch between host and device overloads, not disallowing observing the type? 
Or maybe disallowing observing the type only when there's a mismatch?

 -Hal

Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


Re: [PATCH] D61458: [hip] Relax CUDA call restriction within `decltype` context.

2019-05-02 Thread Justin Lebar via cfe-commits

> So, actually, I wonder if that's not the right answer. We generally allow
> different overloads to have different return types. What if, for example,
> the return type on the host is __float128 and on the device it's
> `MyLongFloatTy`?

The problem is that conceptually compiling for host/device does not create
a new set of overloads.

When we compile for (say) host, we build a full AST for all functions,
including device functions, and that AST must pass sema checks.  This is
significant, for example, because when compiling for device we need to know
which kernel templates were instantiated on the host side, so that we know
which kernels to emit.

Here's a contrived example.

```
__host__ int8 bar();
__device__ int16 bar();
__host__ __device__ auto foo() -> decltype(bar()) {}

template <typename T> __global__ void kernel();

void launch_kernel() {
  kernel<decltype(foo())><<<...>>>();
}
```

This template instantiation had better be the same when compiling for host
and device.
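
As a rough illustration of why the instantiation has to agree, the following plain C++ sketch models the two compilations with a hypothetical `Mode` tag. `BarResult`, `FooResult`, `Kernel`, and `LaunchedKernel` are names made up for this sketch, and `std::int8_t`/`std::int16_t` stand in for the example's `int8`/`int16`. It is not CUDA and says nothing about clang's internals.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical tag standing in for "which side are we compiling for".
enum class Mode { Host, Device };

// Stand-in for what decltype(bar()) would yield in each compilation
// if the return-type mismatch were observable.
template <Mode M> struct BarResult;
template <> struct BarResult<Mode::Host>   { using type = std::int8_t; };
template <> struct BarResult<Mode::Device> { using type = std::int16_t; };

// Models `auto foo() -> decltype(bar())`.
template <Mode M> using FooResult = typename BarResult<M>::type;

// Models a kernel template parameterized on a type.
template <typename T> struct Kernel {};

// The specialization that launch_kernel() would name in each compilation.
template <Mode M> using LaunchedKernel = Kernel<FooResult<M>>;

constexpr bool same_kernel_in_both_compilations =
    std::is_same<LaunchedKernel<Mode::Host>,
                 LaunchedKernel<Mode::Device>>::value;

// In this model the host launches Kernel<int8_t> while the device
// compilation would emit Kernel<int16_t>.
static_assert(!same_kernel_in_both_compilations,
              "expected the two compilations to name different kernels");
```

Under these assumptions the host side and the device side disagree about which kernel exists, which is exactly the situation the example is warning about.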

That's contrived, but consider this much simpler case:

```
void host_fn() {
  static_assert(sizeof(decltype(foo())) == sizeof(int8));
}
```

If we let foo return int16 in device mode, this static_assert will fail
when compiling in *device* mode even though host_fn is never called on the
device.  https://gcc.godbolt.org/z/gYq901
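
The same point can be sketched in plain C++, again as a model rather than real CUDA. The `Mode` tag, `FooResult`, and `host_fn_assert_holds` are hypothetical names introduced here, with `std::int8_t` and `std::int16_t` standing in for `int8` and `int16`. The function below evaluates the condition that `host_fn`'s `static_assert` would check in each compilation.

```cpp
#include <cstdint>

// Hypothetical tag standing in for "which side are we compiling for".
enum class Mode { Host, Device };

// Stand-in for decltype(foo()) in each compilation, assuming foo were
// allowed to return int16 in device mode.
template <Mode M> struct FooResult;
template <> struct FooResult<Mode::Host>   { using type = std::int8_t; };
template <> struct FooResult<Mode::Device> { using type = std::int16_t; };

// The condition inside host_fn's static_assert, evaluated per compilation.
// Nothing ever has to call host_fn for this condition to be checked.
template <Mode M>
constexpr bool host_fn_assert_holds() {
  return sizeof(typename FooResult<M>::type) == sizeof(std::int8_t);
}

static_assert(host_fn_assert_holds<Mode::Host>(),
              "the assertion should hold in the host compilation");
static_assert(!host_fn_assert_holds<Mode::Device>(),
              "the assertion should fail in the device compilation");
```

The check is a property of the translation unit, not of any call, so it is evaluated in the device compilation even though the function is host-only.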

Why are we doing sema checks on the host code when compiling for device?
See the contrived example above: we need quite a bit of info about the host
code to infer those templates.

On Thu, May 2, 2019 at 7:05 PM Hal Finkel via Phabricator <
revi...@reviews.llvm.org> wrote:

> hfinkel added a comment.
>
> In D61458#1488970 <https://reviews.llvm.org/D61458#1488970>, @jlebar
> wrote:
>
> > Here's one for you:
> >
> >   __host__ float bar();
> >   __device__ int bar();
> >   __host__ __device__ auto foo() -> decltype(bar()) {}
> >
> >
> > What is the return type of `foo`?  :)
> >
> > I don't believe the right answer is, "float when compiling for host, int
> > when compiling for device."
>
>
> So, actually, I wonder if that's not the right answer. We generally allow
> different overloads to have different return types. What if, for example,
> the return type on the host is __float128 and on the device it's
> `MyLongFloatTy`?
>
> > I'd be happy if we said this was an error, so long as it's well-defined
> > what exactly we're disallowing.  But I bet @rsmith can come up with
> > substantially more evil testcases than this.
>
>
>
>
> Repository:
>   rG LLVM Github Monorepo
>
> CHANGES SINCE LAST ACTION
>   https://reviews.llvm.org/D61458/new/
>
> https://reviews.llvm.org/D61458
>
>
>
>