Re: [Clang] Convergent Attribute

2016-05-17 Thread Ettore Speziale via cfe-commits
Hello,

>>> CUDA? In any case, I don't see how the restriction helps users, and the 
>>> attribute at the IR level has a well-defined meaning regardless. If a user 
>>> were to have a use case, they'd simply find the restriction arbitrary and 
>>> frustrating.
>> 
>> Yes, CUDA was already considered as well. I just think that compilers should 
>> help to reduce amount of erroneous or meaningless use cases. That's one of 
>> the reasons to have language options for the attributes. But I don't feel 
>> strongly about this particular case anyways. So let's make it language 
>> independent then. ;)
> 
> Is the patch OK now or you guys want to apply some other modifications?
> 
> 

Sorry guys, but on a second tough this patch might be completely useless. I 
guess I’d should be marking all function calls convergent at CodeGen time, and 
then rely on LLMV’s FunctionAttrs to remove the unnecessary ones.

Thank you for the good review and sorry for wasting your times,
Ettore Speziale

--
Ettore Speziale — Compiler Engineer
speziale.ett...@gmail.com
espezi...@apple.com
--

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-11 Thread Ettore Speziale via cfe-commits
Hello,

>> CUDA? In any case, I don't see how the restriction helps users, and the 
>> attribute at the IR level has a well-defined meaning regardless. If a user 
>> were to have a use case, they'd simply find the restriction arbitrary and 
>> frustrating.
> 
> Yes, CUDA was already considered as well. I just think that compilers should 
> help to reduce amount of erroneous or meaningless use cases. That's one of 
> the reasons to have language options for the attributes. But I don't feel 
> strongly about this particular case anyways. So let's make it language 
> independent then. ;)

Is the patch OK now or you guys want to apply some other modifications?



convergent.diff
Description: Binary data


Thanks,
Ettore Speziale

--
Ettore Speziale — Compiler Engineer
speziale.ett...@gmail.com
espezi...@apple.com
--

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


RE: [Clang] Convergent Attribute

2016-05-10 Thread Anastasia Stulova via cfe-commits
> CUDA? In any case, I don't see how the restriction helps users, and the 
> attribute at the IR level has a well-defined meaning regardless. If a user 
> were to have a use case, they'd simply find the restriction arbitrary and 
> frustrating.

Yes, CUDA was already considered as well. I just think that compilers should 
help to reduce amount of erroneous or meaningless use cases. That's one of the 
reasons to have language options for the attributes. But I don't feel strongly 
about this particular case anyways. So let's make it language independent then. 
;)

Anastasia 
 
-Original Message-
From: Hal Finkel [mailto:hfin...@anl.gov] 
Sent: 10 May 2016 00:33
To: Anastasia Stulova
Cc: nd; Clang Commits; Matt Arsenault; Ettore Speziale; Aaron Ballman
Subject: Re: [Clang] Convergent Attribute

- Original Message -
> From: "Anastasia Stulova via cfe-commits" 
> To: "Matt Arsenault" , "Ettore Speziale" 
> , "Aaron Ballman"
> 
> Cc: "nd" , "Clang Commits" 
> Sent: Monday, May 9, 2016 12:39:19 PM
> Subject: RE: [Clang] Convergent Attribute
> 
> Since it's not a part of any official spec we could of course make it 
> accepted with anything.
> 
> Just out of curiosity what other programming models supported by Clang 
> do you think this attribute would be useful for?
> 
> Anastasia
> 
> -Original Message-
> From: Matt Arsenault [mailto:matthew.arsena...@amd.com]
> Sent: 07 May 2016 00:37
> To: Anastasia Stulova; Ettore Speziale; Aaron Ballman
> Cc: nd; Clang Commits
> Subject: Re: [Clang] Convergent Attribute
> 
> On 05/06/2016 12:11 PM, Anastasia Stulova via cfe-commits wrote:
> > I was just wondering whether it would make sense to restrict the 
> > usage of the attribute to OpenCL language i.e. to add  "let LangOpts 
> > = [OpenCL];" in the attribute definition.
> This seems to be a pointless arbitrary restriction to me
> 
> -Matt
> 
> ___
> cfe-commits mailing list
> cfe-commits@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
> 

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-09 Thread Hal Finkel via cfe-commits
- Original Message -

> From: "Richard Smith via cfe-commits" 
> To: "Matt Arsenault" 
> Cc: "Clang Commits" 
> Sent: Monday, May 9, 2016 4:45:04 PM
> Subject: Re: [Clang] Convergent Attribute

> On Mon, May 9, 2016 at 2:43 PM, Richard Smith < rich...@metafoo.co.uk
> > wrote:

> > On Sun, May 8, 2016 at 12:43 PM, Matt Arsenault via cfe-commits <
> > cfe-commits@lists.llvm.org > wrote:
> 

> > > > On May 6, 2016, at 18:12, Richard Smith via cfe-commits <
> > > > cfe-commits@lists.llvm.org > wrote:
> > > 
> > 
> 

> > > > On Fri, May 6, 2016 at 4:20 PM, Matt Arsenault via cfe-commits
> > > > <
> > > > cfe-commits@lists.llvm.org > wrote:
> > > 
> > 
> 

> > > > > On 05/06/2016 02:42 PM, David Majnemer via cfe-commits wrote:
> > > > 
> > > 
> > 
> 

> > > > > > This example looks wrong to me. It doesn't seem meaningful
> > > > > > for
> > > > > > a
> > > > > > function to be both readonly and convergent, because
> > > > > > convergent
> > > > > > means the call has some side-effect visible to other
> > > > > > threads
> > > > > > and
> > > > > > readonly means the call has no side-effects visible outside
> > > > > > the
> > > > > > function.
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > This s not correct. It is valid for convergent operations to
> > > > > be
> > > > > readonly/readnone. Barriers are a common case which do have
> > > > > side
> > > > > effects, but there are also classes of GPU instructions which
> > > > > do
> > > > > not
> > > > > access memory and still need the convergent semantics.
> > > > 
> > > 
> > 
> 

> > > > Can you give an example? It's not clear to me how a function
> > > > could
> > > > be
> > > > both convergent and satisfy the readnone requirement that it
> > > > not
> > > > "access[...] any mutable state (e.g. memory, control registers,
> > > > etc)
> > > > visible to caller functions". Synchronizing with other threads
> > > > seems
> > > > like it would cause such a state change in an abstract sense.
> > > > Is
> > > > the
> > > > critical distinction here that the state mutation is visible to
> > > > the
> > > > code that spawned the gang of threads, but not to other threads
> > > > within the gang? (This seems like a bug in the definition of
> > > > readonly if so, because it means that a readonly call whose
> > > > result
> > > > is unused cannot be deleted.)
> > > 
> > 
> 

> > > > I care about this because Clang maps __attribute__((pure)) to
> > > > LLVM
> > > > readonly, and -- irrespective of the LLVM semantics -- a call
> > > > to
> > > > a
> > > > function marked pure is permitted to be deleted if the return
> > > > value
> > > > is unused, or to have multiple calls CSE'd. As a result, inside
> > > > Clang, we use that attribute to determine whether an expression
> > > > has
> > > > side effects, and Clang's reasoning about these things may also
> > > > lead
> > > > to miscompiles if a call to a function marked
> > > > __attribute__((pure,
> > > > convergent)) actually can have a side effect.
> > > > ___
> > > 
> > 
> 
> > > > cfe-commits mailing list
> > > 
> > 
> 
> > > > cfe-commits@lists.llvm.org
> > > 
> > 
> 
> > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
> > > 
> > 
> 

> > > These are communication operations between lanes that do not
> > > require
> > > synchronization within the wavefront. These are mostly cross lane
> > > communication instructions. An example would be the
> > > amdgcn.mov.dpp
> > > instruction, which reads a register from a neighboring lane, or
> > > the
> > > CUDA warp vote functions.
> > 
> 
> > Those both appear to technically fail to satisfy the requirements
> > of
> > an __attribute__(

Re: [Clang] Convergent Attribute

2016-05-09 Thread Hal Finkel via cfe-commits
- Original Message -
> From: "Anastasia Stulova via cfe-commits" 
> To: "Matt Arsenault" , "Ettore Speziale" 
> , "Aaron Ballman"
> 
> Cc: "nd" , "Clang Commits" 
> Sent: Monday, May 9, 2016 12:39:19 PM
> Subject: RE: [Clang] Convergent Attribute
> 
> Since it's not a part of any official spec we could of course make it
> accepted with anything.
> 
> Just out of curiosity what other programming models supported by
> Clang do you think this attribute would be useful for?

CUDA? In any case, I don't see how the restriction helps users, and the 
attribute at the IR level has a well-defined meaning regardless. If a user were 
to have a use case, they'd simply find the restriction arbitrary and 
frustrating.

 -Hal

> 
> Anastasia
> 
> -Original Message-
> From: Matt Arsenault [mailto:matthew.arsena...@amd.com]
> Sent: 07 May 2016 00:37
> To: Anastasia Stulova; Ettore Speziale; Aaron Ballman
> Cc: nd; Clang Commits
> Subject: Re: [Clang] Convergent Attribute
> 
> On 05/06/2016 12:11 PM, Anastasia Stulova via cfe-commits wrote:
> > I was just wondering whether it would make sense to restrict the
> > usage of the attribute to OpenCL language i.e. to add  "let
> > LangOpts = [OpenCL];" in the attribute definition.
> This seems to be a pointless arbitrary restriction to me
> 
> -Matt
> 
> ___
> cfe-commits mailing list
> cfe-commits@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-09 Thread Richard Smith via cfe-commits
On Mon, May 9, 2016 at 2:43 PM, Richard Smith  wrote:

> On Sun, May 8, 2016 at 12:43 PM, Matt Arsenault via cfe-commits <
> cfe-commits@lists.llvm.org> wrote:
>
>> On May 6, 2016, at 18:12, Richard Smith via cfe-commits <
>> cfe-commits@lists.llvm.org> wrote:
>>
>> On Fri, May 6, 2016 at 4:20 PM, Matt Arsenault via cfe-commits <
>> cfe-commits@lists.llvm.org> wrote:
>>
>>> On 05/06/2016 02:42 PM, David Majnemer via cfe-commits wrote:
>>>
 This example looks wrong to me. It doesn't seem meaningful for a
 function to be both readonly and convergent, because convergent means the
 call has some side-effect visible to other threads and readonly means the
 call has no side-effects visible outside the function.

>>> This s not correct. It is valid for convergent operations to be
>>> readonly/readnone. Barriers are a common case which do have side effects,
>>> but there are also classes of GPU instructions which do not access memory
>>> and still need the convergent semantics.
>>>
>>
>> Can you give an example? It's not clear to me how a function could be
>> both convergent and satisfy the readnone requirement that it not
>> "access[...] any mutable state (e.g. memory, control registers, etc)
>> visible to caller functions". Synchronizing with other threads seems like
>> it would cause such a state change in an abstract sense. Is the critical
>> distinction here that the state mutation is visible to the code that
>> spawned the gang of threads, but not to other threads within the gang?
>> (This seems like a bug in the definition of readonly if so, because it
>> means that a readonly call whose result is unused cannot be deleted.)
>>
>> I care about this because Clang maps __attribute__((pure)) to LLVM
>> readonly, and -- irrespective of the LLVM semantics -- a call to a function
>> marked pure is permitted to be deleted if the return value is unused, or to
>> have multiple calls CSE'd. As a result, inside Clang, we use that attribute
>> to determine whether an expression has side effects, and Clang's reasoning
>> about these things may also lead to miscompiles if a call to a function
>> marked __attribute__((pure, convergent)) actually can have a side effect.
>> ___
>> cfe-commits mailing list
>> cfe-commits@lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>>
>>
>> These are communication operations between lanes that do not require
>> synchronization within the wavefront. These are mostly cross lane
>> communication instructions. An example would be the amdgcn.mov.dpp
>> instruction, which reads a register from a neighboring lane, or the CUDA
>> warp vote functions.
>>
>
> Those both appear to technically fail to satisfy the requirements of an
> __attribute__((pure)) function. If I understand correctly, the DPP function
> effectively stores a value into some state that is shared with another lane
> (from Clang and LLVM's perspectives, state that is visible to a function
> evaluation outside the current one), and then reads a value from another
> such shared storage location. The CUDA warp vote functions effectively
> store a value into some state that is shared with all other threads in the
> warp and then read some summary information about the values stored by all
> the threads. In both cases, the function mutates state that is visible to
> other functions running on other threads, and so is not
> __attribute__((pure)) / readonly, as far as I can see.
>

(And just to be clear, the fact that no actual storage is used for this is
irrelevant to the notional semantics of the operation. Note that the
definition of the pure attribute also covers "control registers, etc".)


> It seems to me that this change weakens the definition of these attributes
> when combined with the convergent attribute to mean that the function *is*
> still allowed to store to mutable state that's shared with other lanes /
> other threads in the same warp, but only via convergent combined store/load
> primitives. That makes some sense, given that the behavior of the
> *execution* model does not (necessarily) treat each notional lane as a
> separate thread, and from that perspective the instruction can be viewed as
> operating on a vector and communicating only with itself, but it doesn't
> match the current definitions of the semantics of these attributes (which
> are specified in terms of the *source* model, in which each notional lane
> is a separate invocation of the function). So I'd like at least for some
> documentation to be added for our "pure" and "const" attributes, saying
> something like "if this is combined with the "convergent" attribute, the
> function may still communicate with other lanes through convergent
> operations, even though such communication notionally involves modification
> of mutable state visible to the other lanes". I'd suggest a similar change
> also be made to LLVM's LangRef.
>
>
> I've checked through how clang is u

Re: [Clang] Convergent Attribute

2016-05-09 Thread Richard Smith via cfe-commits
On Sun, May 8, 2016 at 12:43 PM, Matt Arsenault via cfe-commits <
cfe-commits@lists.llvm.org> wrote:

> On May 6, 2016, at 18:12, Richard Smith via cfe-commits <
> cfe-commits@lists.llvm.org> wrote:
>
> On Fri, May 6, 2016 at 4:20 PM, Matt Arsenault via cfe-commits <
> cfe-commits@lists.llvm.org> wrote:
>
>> On 05/06/2016 02:42 PM, David Majnemer via cfe-commits wrote:
>>
>>> This example looks wrong to me. It doesn't seem meaningful for a
>>> function to be both readonly and convergent, because convergent means the
>>> call has some side-effect visible to other threads and readonly means the
>>> call has no side-effects visible outside the function.
>>>
>> This s not correct. It is valid for convergent operations to be
>> readonly/readnone. Barriers are a common case which do have side effects,
>> but there are also classes of GPU instructions which do not access memory
>> and still need the convergent semantics.
>>
>
> Can you give an example? It's not clear to me how a function could be both
> convergent and satisfy the readnone requirement that it not "access[...]
> any mutable state (e.g. memory, control registers, etc) visible to caller
> functions". Synchronizing with other threads seems like it would cause such
> a state change in an abstract sense. Is the critical distinction here that
> the state mutation is visible to the code that spawned the gang of threads,
> but not to other threads within the gang? (This seems like a bug in the
> definition of readonly if so, because it means that a readonly call whose
> result is unused cannot be deleted.)
>
> I care about this because Clang maps __attribute__((pure)) to LLVM
> readonly, and -- irrespective of the LLVM semantics -- a call to a function
> marked pure is permitted to be deleted if the return value is unused, or to
> have multiple calls CSE'd. As a result, inside Clang, we use that attribute
> to determine whether an expression has side effects, and Clang's reasoning
> about these things may also lead to miscompiles if a call to a function
> marked __attribute__((pure, convergent)) actually can have a side effect.
> ___
> cfe-commits mailing list
> cfe-commits@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>
>
> These are communication operations between lanes that do not require
> synchronization within the wavefront. These are mostly cross lane
> communication instructions. An example would be the amdgcn.mov.dpp
> instruction, which reads a register from a neighboring lane, or the CUDA
> warp vote functions.
>

Those both appear to technically fail to satisfy the requirements of an
__attribute__((pure)) function. If I understand correctly, the DPP function
effectively stores a value into some state that is shared with another lane
(from Clang and LLVM's perspectives, state that is visible to a function
evaluation outside the current one), and then reads a value from another
such shared storage location. The CUDA warp vote functions effectively
store a value into some state that is shared with all other threads in the
warp and then read some summary information about the values stored by all
the threads. In both cases, the function mutates state that is visible to
other functions running on other threads, and so is not
__attribute__((pure)) / readonly, as far as I can see.

It seems to me that this change weakens the definition of these attributes
when combined with the convergent attribute to mean that the function *is*
still allowed to store to mutable state that's shared with other lanes /
other threads in the same warp, but only via convergent combined store/load
primitives. That makes some sense, given that the behavior of the
*execution* model does not (necessarily) treat each notional lane as a
separate thread, and from that perspective the instruction can be viewed as
operating on a vector and communicating only with itself, but it doesn't
match the current definitions of the semantics of these attributes (which
are specified in terms of the *source* model, in which each notional lane
is a separate invocation of the function). So I'd like at least for some
documentation to be added for our "pure" and "const" attributes, saying
something like "if this is combined with the "convergent" attribute, the
function may still communicate with other lanes through convergent
operations, even though such communication notionally involves modification
of mutable state visible to the other lanes". I'd suggest a similar change
also be made to LLVM's LangRef.


I've checked through how clang is using the "pure" attribute, and it seems
like it should mostly do the right thing in this case. There are a few
places where (using your amdgcn.mov.dpp example) we would cause a dpp
instruction to be emitted where the source code called the relevant
operation from within an operand that we do not notionally evaluate (for
instance, the operand of a __assume or __builtin_object_size). Cont

Re: [Clang] Convergent Attribute

2016-05-09 Thread Matt Arsenault via cfe-commits

> On May 9, 2016, at 10:39, Anastasia Stulova via cfe-commits 
>  wrote:
> 
> Since it's not a part of any official spec we could of course make it 
> accepted with anything.
> 
> Just out of curiosity what other programming models supported by Clang do you 
> think this attribute would be useful for?
> 
> Anastasia

I’m not sure of any real uses for it

-Matt___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-09 Thread Ettore Speziale via cfe-commits
Hello,

> On 05/06/2016 12:11 PM, Anastasia Stulova via cfe-commits wrote:
>> I was just wondering whether it would make sense to restrict the usage of 
>> the attribute to OpenCL language i.e. to add  "let LangOpts = [OpenCL];" in 
>> the attribute definition.
> This seems to be a pointless arbitrary restriction to me

Updated the patch based on the review comments:

* allow the [[clang::convergent]] attribute spelling
* used a better example in the documentation

Thanks

--
Ettore Speziale — Compiler Engineer
speziale.ett...@gmail.com
espezi...@apple.com
--



convergent.diff
Description: Binary data
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


RE: [Clang] Convergent Attribute

2016-05-09 Thread Anastasia Stulova via cfe-commits
Since it's not a part of any official spec we could of course make it accepted 
with anything.

Just out of curiosity what other programming models supported by Clang do you 
think this attribute would be useful for?

Anastasia

-Original Message-
From: Matt Arsenault [mailto:matthew.arsena...@amd.com] 
Sent: 07 May 2016 00:37
To: Anastasia Stulova; Ettore Speziale; Aaron Ballman
Cc: nd; Clang Commits
Subject: Re: [Clang] Convergent Attribute

On 05/06/2016 12:11 PM, Anastasia Stulova via cfe-commits wrote:
> I was just wondering whether it would make sense to restrict the usage of the 
> attribute to OpenCL language i.e. to add  "let LangOpts = [OpenCL];" in the 
> attribute definition.
This seems to be a pointless arbitrary restriction to me

-Matt

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-08 Thread Matt Arsenault via cfe-commits

> On May 6, 2016, at 18:12, Richard Smith via cfe-commits 
>  wrote:
> 
> On Fri, May 6, 2016 at 4:20 PM, Matt Arsenault via cfe-commits 
> mailto:cfe-commits@lists.llvm.org>> wrote:
> On 05/06/2016 02:42 PM, David Majnemer via cfe-commits wrote:
> This example looks wrong to me. It doesn't seem meaningful for a function to 
> be both readonly and convergent, because convergent means the call has some 
> side-effect visible to other threads and readonly means the call has no 
> side-effects visible outside the function.
> This s not correct. It is valid for convergent operations to be 
> readonly/readnone. Barriers are a common case which do have side effects, but 
> there are also classes of GPU instructions which do not access memory and 
> still need the convergent semantics.
> 
> Can you give an example? It's not clear to me how a function could be both 
> convergent and satisfy the readnone requirement that it not "access[...] any 
> mutable state (e.g. memory, control registers, etc) visible to caller 
> functions". Synchronizing with other threads seems like it would cause such a 
> state change in an abstract sense. Is the critical distinction here that the 
> state mutation is visible to the code that spawned the gang of threads, but 
> not to other threads within the gang? (This seems like a bug in the 
> definition of readonly if so, because it means that a readonly call whose 
> result is unused cannot be deleted.)
> 
> I care about this because Clang maps __attribute__((pure)) to LLVM readonly, 
> and -- irrespective of the LLVM semantics -- a call to a function marked pure 
> is permitted to be deleted if the return value is unused, or to have multiple 
> calls CSE'd. As a result, inside Clang, we use that attribute to determine 
> whether an expression has side effects, and Clang's reasoning about these 
> things may also lead to miscompiles if a call to a function marked 
> __attribute__((pure, convergent)) actually can have a side effect.
> ___
> cfe-commits mailing list
> cfe-commits@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

These are communication operations between lanes that do not require 
synchronization within the wavefront. These are mostly cross lane communication 
instructions. An example would be the amdgcn.mov.dpp instruction, which reads a 
register from a neighboring lane, or the CUDA warp vote functions. There is no 
synchronization required, and there is no other way for the same item to access 
that information private to the other workitem. There’s no observable global 
state from the perspective of a single lane. The individual registers changed 
aren’t visible to the spawning host program (perhaps with the exception of some 
debug hardware inspecting all of the individual registers). Deleting these 
would be perfectly acceptable if the result is unused.

-Matt___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-06 Thread Richard Smith via cfe-commits
On Fri, May 6, 2016 at 4:20 PM, Matt Arsenault via cfe-commits <
cfe-commits@lists.llvm.org> wrote:

> On 05/06/2016 02:42 PM, David Majnemer via cfe-commits wrote:
>
>> This example looks wrong to me. It doesn't seem meaningful for a function
>> to be both readonly and convergent, because convergent means the call has
>> some side-effect visible to other threads and readonly means the call has
>> no side-effects visible outside the function.
>>
> This s not correct. It is valid for convergent operations to be
> readonly/readnone. Barriers are a common case which do have side effects,
> but there are also classes of GPU instructions which do not access memory
> and still need the convergent semantics.
>

Can you give an example? It's not clear to me how a function could be both
convergent and satisfy the readnone requirement that it not "access[...]
any mutable state (e.g. memory, control registers, etc) visible to caller
functions". Synchronizing with other threads seems like it would cause such
a state change in an abstract sense. Is the critical distinction here that
the state mutation is visible to the code that spawned the gang of threads,
but not to other threads within the gang? (This seems like a bug in the
definition of readonly if so, because it means that a readonly call whose
result is unused cannot be deleted.)

I care about this because Clang maps __attribute__((pure)) to LLVM
readonly, and -- irrespective of the LLVM semantics -- a call to a function
marked pure is permitted to be deleted if the return value is unused, or to
have multiple calls CSE'd. As a result, inside Clang, we use that attribute
to determine whether an expression has side effects, and Clang's reasoning
about these things may also lead to miscompiles if a call to a function
marked __attribute__((pure, convergent)) actually can have a side effect.
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-06 Thread Matt Arsenault via cfe-commits

On 05/06/2016 12:11 PM, Anastasia Stulova via cfe-commits wrote:

I was just wondering whether it would make sense to restrict the usage of the attribute 
to OpenCL language i.e. to add  "let LangOpts = [OpenCL];" in the attribute 
definition.

This seems to be a pointless arbitrary restriction to me

-Matt
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-06 Thread Matt Arsenault via cfe-commits

On 05/06/2016 02:42 PM, David Majnemer via cfe-commits wrote:
This example looks wrong to me. It doesn't seem meaningful for a 
function to be both readonly and convergent, because convergent means 
the call has some side-effect visible to other threads and readonly 
means the call has no side-effects visible outside the function.
This s not correct. It is valid for convergent operations to be 
readonly/readnone. Barriers are a common case which do have side 
effects, but there are also classes of GPU instructions which do not 
access memory and still need the convergent semantics.


-Matt
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-06 Thread Matt Arsenault via cfe-commits

On 05/06/2016 02:53 PM, Richard Smith via cfe-commits wrote:
It looks like we added the noduplicate attribute to clang to support 
OpenCL's barrier function. Did we get the semantics for it wrong for 
its intended use case?
Yes. Noduplicate is essentially deprecated in favor of convergent. 
noduplicate is too strict, duplicating is OK in the case of unrolling a 
loop with a barrier for example.

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-06 Thread Richard Smith via cfe-commits
On Fri, May 6, 2016 at 2:42 PM, David Majnemer 
wrote:

> On Fri, May 6, 2016 at 2:36 PM, Richard Smith via cfe-commits <
> cfe-commits@lists.llvm.org> wrote:
>
>> On Fri, May 6, 2016 at 1:56 PM, Ettore Speziale via cfe-commits <
>> cfe-commits@lists.llvm.org> wrote:
>>
>>> Hello,
>>>
>>> > In the case of foo, there could be a problem.
>>> > If you do not mark it convergent, the LLVM sink pass push the call to
>>> foo to the then branch of the ternary operator, hence the program has been
>>> incorrectly optimized.
>>> >
>>> > Really? It looks like the problem is that you lied to the compiler by
>>> marking the function as 'pure'. The barrier is a side-effect that cannot be
>>> removed or duplicated, so it's not correct to mark this function as pure.
>>>
>>> I was trying to write a very small example to trick LLVM and trigger the
>>> optimization. It is based on Transforms/Sink/convergent.ll:
>>>
>>> define i32 @foo(i1 %arg) {
>>> entry:
>>>   %c = call i32 @bar() readonly convergent
>>>   br i1 %arg, label %then, label %end
>>>
>>> then:
>>>   ret i32 %c
>>>
>>> end:
>>>   ret i32 0
>>> }
>>>
>>> declare i32 @bar() readonly convergent
>>>
>>
>> This example looks wrong to me. It doesn't seem meaningful for a function
>> to be both readonly and convergent, because convergent means the call has
>> some side-effect visible to other threads and readonly means the call has
>> no side-effects visible outside the function.
>>
>> Here is another example:
>>>
>>> void foo0(void);
>>> void foo1(void);
>>>
>>> __attribute__((convergent)) void baz() {
>>>   barrier(CLK_GLOBAL_MEM_FENCE);
>>> }
>>>
>>> void bar(int x, global int *y) {
>>>   if (x < 5)
>>> foo0();
>>>   else
>>> foo1();
>>>
>>>   baz();
>>>
>>>   if (x < 5)
>>> foo0();
>>>   else
>>> foo1();
>>> }
>>>
>>
>> This one looks a lot more interesting. It looks like 'convergent' is a
>> way of informing LLVM that the call cannot be duplicated, yes? That being
>> the case, how is this attribute different from the existing
>> [[clang::noduplicate]] / __attribute__((noduplicate)) attribute?
>>
>
> I think it has more to do with LLVM's definition of convergent: that you
> really do not want control dependencies changing for a callsite.
>

Hmm, so we can't transform:

  %a = complex_pure_operation1
  %b = complex_pure_operation2
  %c = select i1 %x, i32 %a, i32 %b
  call void @foo(i32 %c) convergent

... into ...

  br i1 %x, label %aa, label %bb

aa:
  %a = complex_pure_operation1
  br label %cont

bb:
  %b = complex_pure_operation2
  br label %cont

cont:
  %c = phi i32 [ %a, %aa ],  [ %b, %bb ]
  call void @foo(i32 %c) convergent

?

It looks like we added the noduplicate attribute to clang to support
OpenCL's barrier function. Did we get the semantics for it wrong for its
intended use case?


> http://llvm.org/docs/LangRef.html#function-attributes
>
>
>>
>> Based on Transforms/JumpThreading/basic.ll:
>>>
>>> define void @h_con(i32 %p) {
>>>   %x = icmp ult i32 %p, 5
>>>   br i1 %x, label %l1, label %l2
>>>
>>> l1:
>>>   call void @j()
>>>   br label %l3
>>>
>>> l2:
>>>   call void @k()
>>>   br label %l3
>>>
>>> l3:
>>> ; CHECK: call void @g() [[CON:#[0-9]+]]
>>> ; CHECK-NOT: call void @g() [[CON]]
>>>   call void @g() convergent
>>>   %y = icmp ult i32 %p, 5
>>>   br i1 %y, label %l4, label %l5
>>>
>>> l4:
>>>   call void @j()
>>>   ret void
>>>
>>> l5:
>>>   call void @k()
>>>   ret void
>>> ; CHECK: }
>>> }
>>>
>>> If you do not mark baz convergent, you get this:
>>>
>>> clang -x cl -emit-llvm -S -o - test.c -O0 | opt -mem2reg -jump-threading
>>> -S
>>>
>>> define void @bar(i32 %x) #0 {
>>> entry:
>>>   %cmp = icmp slt i32 %x, 5
>>>   br i1 %cmp, label %if.then2, label %if.else3
>>>
>>> if.then2: ; preds = %entry
>>>   call void @foo0()
>>>   call void @baz()
>>>   call void @foo0()
>>>   br label %if.end4
>>>
>>> if.else3: ; preds = %entry
>>>   call void @foo1()
>>>   call void @baz()
>>>   call void @foo1()
>>>   br label %if.end4
>>>
>>> if.end4:  ; preds = %if.else3,
>>> %if.then2
>>>   ret void
>>> }
>>>
>>> Which is illegal, as the value of x might not be the same for all
>>> work-items.
>>>
>>> I’ll update the patch such as:
>>>
>>> * it uses the example about jump-threading
>>> * it marks the attribute available in OpenCL/Cuda
>>> * it provides the [[clang::convergent]] attribute
>>>
>>> Thanks,
>>> Ettore Speziale
>>>
>>> --
>>> Ettore Speziale — Compiler Engineer
>>> speziale.ett...@gmail.com
>>> espezi...@apple.com
>>> --
>>>
>>> ___
>>> cfe-commits mailing list
>>> cfe-commits@lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>>>
>>
>>
>> ___
>> cfe-commits mailing list
>> cfe-commits@lists.llvm.org
>> h

Re: [Clang] Convergent Attribute

2016-05-06 Thread David Majnemer via cfe-commits
On Fri, May 6, 2016 at 2:36 PM, Richard Smith via cfe-commits <
cfe-commits@lists.llvm.org> wrote:

> On Fri, May 6, 2016 at 1:56 PM, Ettore Speziale via cfe-commits <
> cfe-commits@lists.llvm.org> wrote:
>
>> Hello,
>>
>> > In the case of foo, there could be a problem.
>> > If you do not mark it convergent, the LLVM sink pass push the call to
>> foo to the then branch of the ternary operator, hence the program has been
>> incorrectly optimized.
>> >
>> > Really? It looks like the problem is that you lied to the compiler by
>> marking the function as 'pure'. The barrier is a side-effect that cannot be
>> removed or duplicated, so it's not correct to mark this function as pure.
>>
>> I was trying to write a very small example to trick LLVM and trigger the
>> optimization. It is based on Transforms/Sink/convergent.ll:
>>
>> define i32 @foo(i1 %arg) {
>> entry:
>>   %c = call i32 @bar() readonly convergent
>>   br i1 %arg, label %then, label %end
>>
>> then:
>>   ret i32 %c
>>
>> end:
>>   ret i32 0
>> }
>>
>> declare i32 @bar() readonly convergent
>>
>
> This example looks wrong to me. It doesn't seem meaningful for a function
> to be both readonly and convergent, because convergent means the call has
> some side-effect visible to other threads and readonly means the call has
> no side-effects visible outside the function.
>
> Here is another example:
>>
>> void foo0(void);
>> void foo1(void);
>>
>> __attribute__((convergent)) void baz() {
>>   barrier(CLK_GLOBAL_MEM_FENCE);
>> }
>>
>> void bar(int x, global int *y) {
>>   if (x < 5)
>> foo0();
>>   else
>> foo1();
>>
>>   baz();
>>
>>   if (x < 5)
>> foo0();
>>   else
>> foo1();
>> }
>>
>
> This one looks a lot more interesting. It looks like 'convergent' is a way
> of informing LLVM that the call cannot be duplicated, yes? That being the
> case, how is this attribute different from the existing
> [[clang::noduplicate]] / __attribute__((noduplicate)) attribute?
>

I think it has more to do with LLVM's definition of convergent: that you
really do not want control dependencies changing for a callsite.

http://llvm.org/docs/LangRef.html#function-attributes


>
> Based on Transforms/JumpThreading/basic.ll:
>>
>> define void @h_con(i32 %p) {
>>   %x = icmp ult i32 %p, 5
>>   br i1 %x, label %l1, label %l2
>>
>> l1:
>>   call void @j()
>>   br label %l3
>>
>> l2:
>>   call void @k()
>>   br label %l3
>>
>> l3:
>> ; CHECK: call void @g() [[CON:#[0-9]+]]
>> ; CHECK-NOT: call void @g() [[CON]]
>>   call void @g() convergent
>>   %y = icmp ult i32 %p, 5
>>   br i1 %y, label %l4, label %l5
>>
>> l4:
>>   call void @j()
>>   ret void
>>
>> l5:
>>   call void @k()
>>   ret void
>> ; CHECK: }
>> }
>>
>> If you do not mark baz convergent, you get this:
>>
>> clang -x cl -emit-llvm -S -o - test.c -O0 | opt -mem2reg -jump-threading
>> -S
>>
>> define void @bar(i32 %x) #0 {
>> entry:
>>   %cmp = icmp slt i32 %x, 5
>>   br i1 %cmp, label %if.then2, label %if.else3
>>
>> if.then2: ; preds = %entry
>>   call void @foo0()
>>   call void @baz()
>>   call void @foo0()
>>   br label %if.end4
>>
>> if.else3: ; preds = %entry
>>   call void @foo1()
>>   call void @baz()
>>   call void @foo1()
>>   br label %if.end4
>>
>> if.end4:  ; preds = %if.else3,
>> %if.then2
>>   ret void
>> }
>>
>> Which is illegal, as the value of x might not be the same for all
>> work-items.
>>
>> I’ll update the patch such as:
>>
>> * it uses the example about jump-threading
>> * it marks the attribute available in OpenCL/Cuda
>> * it provides the [[clang::convergent]] attribute
>>
>> Thanks,
>> Ettore Speziale
>>
>> --
>> Ettore Speziale — Compiler Engineer
>> speziale.ett...@gmail.com
>> espezi...@apple.com
>> --
>>
>> ___
>> cfe-commits mailing list
>> cfe-commits@lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>>
>
>
> ___
> cfe-commits mailing list
> cfe-commits@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>
>
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-06 Thread Richard Smith via cfe-commits
On Fri, May 6, 2016 at 1:56 PM, Ettore Speziale via cfe-commits <
cfe-commits@lists.llvm.org> wrote:

> Hello,
>
> > In the case of foo, there could be a problem.
> > If you do not mark it convergent, the LLVM sink pass push the call to
> foo to the then branch of the ternary operator, hence the program has been
> incorrectly optimized.
> >
> > Really? It looks like the problem is that you lied to the compiler by
> marking the function as 'pure'. The barrier is a side-effect that cannot be
> removed or duplicated, so it's not correct to mark this function as pure.
>
> I was trying to write a very small example to trick LLVM and trigger the
> optimization. It is based on Transforms/Sink/convergent.ll:
>
> define i32 @foo(i1 %arg) {
> entry:
>   %c = call i32 @bar() readonly convergent
>   br i1 %arg, label %then, label %end
>
> then:
>   ret i32 %c
>
> end:
>   ret i32 0
> }
>
> declare i32 @bar() readonly convergent
>

This example looks wrong to me. It doesn't seem meaningful for a function
to be both readonly and convergent, because convergent means the call has
some side-effect visible to other threads and readonly means the call has
no side-effects visible outside the function.

Here is another example:
>
> void foo0(void);
> void foo1(void);
>
> __attribute__((convergent)) void baz() {
>   barrier(CLK_GLOBAL_MEM_FENCE);
> }
>
> void bar(int x, global int *y) {
>   if (x < 5)
> foo0();
>   else
> foo1();
>
>   baz();
>
>   if (x < 5)
> foo0();
>   else
> foo1();
> }
>

This one looks a lot more interesting. It looks like 'convergent' is a way
of informing LLVM that the call cannot be duplicated, yes? That being the
case, how is this attribute different from the existing
[[clang::noduplicate]] / __attribute__((noduplicate)) attribute?

Based on Transforms/JumpThreading/basic.ll:
>
> define void @h_con(i32 %p) {
>   %x = icmp ult i32 %p, 5
>   br i1 %x, label %l1, label %l2
>
> l1:
>   call void @j()
>   br label %l3
>
> l2:
>   call void @k()
>   br label %l3
>
> l3:
> ; CHECK: call void @g() [[CON:#[0-9]+]]
> ; CHECK-NOT: call void @g() [[CON]]
>   call void @g() convergent
>   %y = icmp ult i32 %p, 5
>   br i1 %y, label %l4, label %l5
>
> l4:
>   call void @j()
>   ret void
>
> l5:
>   call void @k()
>   ret void
> ; CHECK: }
> }
>
> If you do not mark baz convergent, you get this:
>
> clang -x cl -emit-llvm -S -o - test.c -O0 | opt -mem2reg -jump-threading -S
>
> define void @bar(i32 %x) #0 {
> entry:
>   %cmp = icmp slt i32 %x, 5
>   br i1 %cmp, label %if.then2, label %if.else3
>
> if.then2: ; preds = %entry
>   call void @foo0()
>   call void @baz()
>   call void @foo0()
>   br label %if.end4
>
> if.else3: ; preds = %entry
>   call void @foo1()
>   call void @baz()
>   call void @foo1()
>   br label %if.end4
>
> if.end4:  ; preds = %if.else3,
> %if.then2
>   ret void
> }
>
> Which is illegal, as the value of x might not be the same for all
> work-items.
>
> I’ll update the patch such as:
>
> * it uses the example about jump-threading
> * it marks the attribute available in OpenCL/Cuda
> * it provides the [[clang::convergent]] attribute
>
> Thanks,
> Ettore Speziale
>
> --
> Ettore Speziale — Compiler Engineer
> speziale.ett...@gmail.com
> espezi...@apple.com
> --
>
> ___
> cfe-commits mailing list
> cfe-commits@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-06 Thread Ettore Speziale via cfe-commits
Hello,

> In the case of foo, there could be a problem.
> If you do not mark it convergent, the LLVM sink pass push the call to foo to 
> the then branch of the ternary operator, hence the program has been 
> incorrectly optimized.
> 
> Really? It looks like the problem is that you lied to the compiler by marking 
> the function as 'pure'. The barrier is a side-effect that cannot be removed 
> or duplicated, so it's not correct to mark this function as pure.

I was trying to write a very small example to trick LLVM and trigger the 
optimization. It is based on Transforms/Sink/convergent.ll:

define i32 @foo(i1 %arg) {
entry:
  %c = call i32 @bar() readonly convergent
  br i1 %arg, label %then, label %end

then:
  ret i32 %c

end:
  ret i32 0
}

declare i32 @bar() readonly convergent

Here is another example:

void foo0(void);
void foo1(void);

__attribute__((convergent)) void baz() {
  barrier(CLK_GLOBAL_MEM_FENCE);
}

void bar(int x, global int *y) {
  if (x < 5)
foo0();
  else
foo1();

  baz();

  if (x < 5)
foo0();
  else
foo1();
}

Based on Transforms/JumpThreading/basic.ll:

define void @h_con(i32 %p) {
  %x = icmp ult i32 %p, 5
  br i1 %x, label %l1, label %l2

l1:
  call void @j()
  br label %l3

l2:
  call void @k()
  br label %l3

l3:
; CHECK: call void @g() [[CON:#[0-9]+]]
; CHECK-NOT: call void @g() [[CON]]
  call void @g() convergent
  %y = icmp ult i32 %p, 5
  br i1 %y, label %l4, label %l5

l4:
  call void @j()
  ret void

l5:
  call void @k()
  ret void
; CHECK: }
}

If you do not mark baz convergent, you get this:

clang -x cl -emit-llvm -S -o - test.c -O0 | opt -mem2reg -jump-threading -S

define void @bar(i32 %x) #0 {
entry:
  %cmp = icmp slt i32 %x, 5
  br i1 %cmp, label %if.then2, label %if.else3

if.then2: ; preds = %entry
  call void @foo0()
  call void @baz()
  call void @foo0()
  br label %if.end4

if.else3: ; preds = %entry
  call void @foo1()
  call void @baz()
  call void @foo1()
  br label %if.end4

if.end4:  ; preds = %if.else3, %if.then2
  ret void
}

Which is illegal, as the value of x might not be the same for all work-items.

I’ll update the patch such as:

* it uses the example about jump-threading
* it marks the attribute available in OpenCL/Cuda
* it provides the [[clang::convergent]] attribute

Thanks,
Ettore Speziale

--
Ettore Speziale — Compiler Engineer
speziale.ett...@gmail.com
espezi...@apple.com
--

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-06 Thread Richard Smith via cfe-commits
On Wed, May 4, 2016 at 5:47 PM, Ettore Speziale via cfe-commits <
cfe-commits@lists.llvm.org> wrote:

> Hello,
>
> > I would appreciate a bit more background on this attribute's
> > semantics. How would a user know when to add this attribute to their
> > function definition? Are there other attributes that cannot be used in
> > conjunction with this one? Should this apply to member functions? What
> > about Objective-C methods?
>
> The convergent attribute is meant to be used with languages supporting the
> SIMT execution model, like OpenCL.
>
> I put the following example in the documentation:
>
>   __attribute__((convergent)) __attribute__((pure)) int foo(void) {
> int x;
> ...
> barrier(CLK_GLOBAL_MEM_FENCE);
> ...
> return x;
>   }
>
>   kernel void bar(global int *y) {
> int z = foo();
> *y = get_global_id() == 0 ? z : 0;
>   }
>
> The call to barrier must be either executed by all work-items in a
> work-group, or by none of them.
> This is a requirement of OpenCL, and is left to the programmer to ensure
> that happens.
>
> In the case of foo, there could be a problem.
> If you do not mark it convergent, the LLVM sink pass push the call to foo
> to the then branch of the ternary operator, hence the program has been
> incorrectly optimized.
>

Really? It looks like the problem is that you lied to the compiler by
marking the function as 'pure'. The barrier is a side-effect that cannot be
removed or duplicated, so it's not correct to mark this function as pure.


> The LLVM convergent attribute has been introduced in order to solve this
> problem for intrinsic functions.
> The goal of this patch is to expose that attribute at the CLANG level, so
> it can be used on all functions.
>
> The user is supposed to add such attribute when the function requires
> convergent execution, like in the example above.
>
> I’m not aware of any attribute that would conflict with convergent.
>
> The convergent attribute can be applied as well to member functions.
>
> The convergent attribute cannot be applied to Objective-C methods right
> now — it will be ignored:
>
> test.c:14:27: warning: 'convergent' attribute only applies to functions
> [-Wignored-attributes]
> - (void) x __attribute__((convergent));
>
> Since convergent is meant for languages supporting the SIMT execution
> model, and to the best of my knowledge I’m not aware of any language based
> on Objective-C supporting that, I would guess there is no benefit in
> supporting convergent on ObjectiveC methods.
>
> >> diff --git a/include/clang/Basic/Attr.td b/include/clang/Basic/Attr.td
> >> index df41aeb..eafafc6 100644
> >> --- a/include/clang/Basic/Attr.td
> >> +++ b/include/clang/Basic/Attr.td
> >> @@ -580,6 +580,12 @@ def Constructor : InheritableAttr {
> >>   let Documentation = [Undocumented];
> >> }
> >>
> >> +def Convergent : InheritableAttr {
> >> +  let Spellings = [GNU<"convergent">];
> >
> > Is there a reason to not support this under CXX11<"clang",
> > "convergent"> as well?
>
> I’ve just used the most basic spelling, which fit the OpenCL case.
> I can add support for the CXX11 spelling if you find it valuable.
>
> >> +  let Subjects = SubjectList<[Function]>;
> >> +  let Documentation = [Undocumented];
> >
> > Please, no new undocumented attributes.
>
> Fixed, here is updated patch:
>
>
>
>
> Thanks!
>
> --
> Ettore Speziale — Compiler Engineer
> speziale.ett...@gmail.com
> espezi...@apple.com
> --
>
>
> ___
> cfe-commits mailing list
> cfe-commits@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>
>
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-06 Thread David Majnemer via cfe-commits
I think it could be useful for CUDA too.

On Friday, May 6, 2016, Anastasia Stulova via cfe-commits <
cfe-commits@lists.llvm.org> wrote:

> Hi Ettore,
>
> LGTM generally!
>
> I was just wondering whether it would make sense to restrict the usage of
> the attribute to OpenCL language i.e. to add  "let LangOpts = [OpenCL];" in
> the attribute definition.
>
> Thanks!
> Anastasia
>
> -Original Message-
> From: Ettore Speziale [mailto:speziale.ett...@gmail.com ]
> Sent: 05 May 2016 01:48
> To: Aaron Ballman
> Cc: Ettore Speziale; Anastasia Stulova; Clang Commits
> Subject: Re: [Clang] Convergent Attribute
>
> Hello,
>
> > I would appreciate a bit more background on this attribute's
> > semantics. How would a user know when to add this attribute to their
> > function definition? Are there other attributes that cannot be used in
> > conjunction with this one? Should this apply to member functions? What
> > about Objective-C methods?
>
> The convergent attribute is meant to be used with languages supporting the
> SIMT execution model, like OpenCL.
>
> I put the following example in the documentation:
>
>   __attribute__((convergent)) __attribute__((pure)) int foo(void) {
> int x;
> ...
> barrier(CLK_GLOBAL_MEM_FENCE);
> ...
> return x;
>   }
>
>   kernel void bar(global int *y) {
> int z = foo();
> *y = get_global_id() == 0 ? z : 0;
>   }
>
> The call to barrier must be either executed by all work-items in a
> work-group, or by none of them.
> This is a requirement of OpenCL, and is left to the programmer to ensure
> that happens.
>
> In the case of foo, there could be a problem.
> If you do not mark it convergent, the LLVM sink pass push the call to foo
> to the then branch of the ternary operator, hence the program has been
> incorrectly optimized.
>
> The LLVM convergent attribute has been introduced in order to solve this
> problem for intrinsic functions.
> The goal of this patch is to expose that attribute at the CLANG level, so
> it can be used on all functions.
>
> The user is supposed to add such attribute when the function requires
> convergent execution, like in the example above.
>
> I’m not aware of any attribute that would conflict with convergent.
>
> The convergent attribute can be applied as well to member functions.
>
> The convergent attribute cannot be applied to Objective-C methods right
> now — it will be ignored:
>
> test.c:14:27: warning: 'convergent' attribute only applies to functions
> [-Wignored-attributes]
> - (void) x __attribute__((convergent));
>
> Since convergent is meant for languages supporting the SIMT execution
> model, and to the best of my knowledge I’m not aware of any language based
> on Objective-C supporting that, I would guess there is no benefit in
> supporting convergent on ObjectiveC methods.
>
> >> diff --git a/include/clang/Basic/Attr.td
> >> b/include/clang/Basic/Attr.td index df41aeb..eafafc6 100644
> >> --- a/include/clang/Basic/Attr.td
> >> +++ b/include/clang/Basic/Attr.td
> >> @@ -580,6 +580,12 @@ def Constructor : InheritableAttr {
> >>   let Documentation = [Undocumented]; }
> >>
> >> +def Convergent : InheritableAttr {
> >> +  let Spellings = [GNU<"convergent">];
> >
> > Is there a reason to not support this under CXX11<"clang",
> > "convergent"> as well?
>
> I’ve just used the most basic spelling, which fit the OpenCL case.
> I can add support for the CXX11 spelling if you find it valuable.
>
> >> +  let Subjects = SubjectList<[Function]>;  let Documentation =
> >> + [Undocumented];
> >
> > Please, no new undocumented attributes.
>
> Fixed, here is updated patch:
>
> ___
> cfe-commits mailing list
> cfe-commits@lists.llvm.org 
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


RE: [Clang] Convergent Attribute

2016-05-06 Thread Anastasia Stulova via cfe-commits
Hi Ettore,

LGTM generally!

I was just wondering whether it would make sense to restrict the usage of the 
attribute to OpenCL language i.e. to add  "let LangOpts = [OpenCL];" in the 
attribute definition.

Thanks!
Anastasia

-Original Message-
From: Ettore Speziale [mailto:speziale.ett...@gmail.com] 
Sent: 05 May 2016 01:48
To: Aaron Ballman
Cc: Ettore Speziale; Anastasia Stulova; Clang Commits
Subject: Re: [Clang] Convergent Attribute

Hello,

> I would appreciate a bit more background on this attribute's 
> semantics. How would a user know when to add this attribute to their 
> function definition? Are there other attributes that cannot be used in 
> conjunction with this one? Should this apply to member functions? What 
> about Objective-C methods?

The convergent attribute is meant to be used with languages supporting the SIMT 
execution model, like OpenCL.

I put the following example in the documentation:

  __attribute__((convergent)) __attribute__((pure)) int foo(void) {
int x;
...
barrier(CLK_GLOBAL_MEM_FENCE);
...
return x;
  }

  kernel void bar(global int *y) {
int z = foo();
*y = get_global_id() == 0 ? z : 0;
  }

The call to barrier must be either executed by all work-items in a work-group, 
or by none of them.
This is a requirement of OpenCL, and is left to the programmer to ensure that 
happens.

In the case of foo, there could be a problem.
If you do not mark it convergent, the LLVM sink pass push the call to foo to 
the then branch of the ternary operator, hence the program has been incorrectly 
optimized.

The LLVM convergent attribute has been introduced in order to solve this 
problem for intrinsic functions.
The goal of this patch is to expose that attribute at the CLANG level, so it 
can be used on all functions.

The user is supposed to add such attribute when the function requires 
convergent execution, like in the example above.

I’m not aware of any attribute that would conflict with convergent.

The convergent attribute can be applied as well to member functions.

The convergent attribute cannot be applied to Objective-C methods right now — 
it will be ignored:

test.c:14:27: warning: 'convergent' attribute only applies to functions 
[-Wignored-attributes]
- (void) x __attribute__((convergent));

Since convergent is meant for languages supporting the SIMT execution model, 
and to the best of my knowledge I’m not aware of any language based on 
Objective-C supporting that, I would guess there is no benefit in supporting 
convergent on ObjectiveC methods.

>> diff --git a/include/clang/Basic/Attr.td 
>> b/include/clang/Basic/Attr.td index df41aeb..eafafc6 100644
>> --- a/include/clang/Basic/Attr.td
>> +++ b/include/clang/Basic/Attr.td
>> @@ -580,6 +580,12 @@ def Constructor : InheritableAttr {
>>   let Documentation = [Undocumented]; }
>> 
>> +def Convergent : InheritableAttr {
>> +  let Spellings = [GNU<"convergent">];
> 
> Is there a reason to not support this under CXX11<"clang", 
> "convergent"> as well?

I’ve just used the most basic spelling, which fit the OpenCL case.
I can add support for the CXX11 spelling if you find it valuable.

>> +  let Subjects = SubjectList<[Function]>;  let Documentation = 
>> + [Undocumented];
> 
> Please, no new undocumented attributes.

Fixed, here is updated patch:

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-04 Thread Ettore Speziale via cfe-commits
Hello,

> I would appreciate a bit more background on this attribute's
> semantics. How would a user know when to add this attribute to their
> function definition? Are there other attributes that cannot be used in
> conjunction with this one? Should this apply to member functions? What
> about Objective-C methods?

The convergent attribute is meant to be used with languages supporting the SIMT 
execution model, like OpenCL.

I put the following example in the documentation:

  __attribute__((convergent)) __attribute__((pure)) int foo(void) {
int x;
...
barrier(CLK_GLOBAL_MEM_FENCE);
...
return x;
  }

  kernel void bar(global int *y) {
int z = foo();
*y = get_global_id() == 0 ? z : 0;
  }

The call to barrier must be either executed by all work-items in a work-group, 
or by none of them.
This is a requirement of OpenCL, and is left to the programmer to ensure that 
happens.

In the case of foo, there could be a problem.
If you do not mark it convergent, the LLVM sink pass push the call to foo to 
the then branch of the ternary operator, hence the program has been incorrectly 
optimized.

The LLVM convergent attribute has been introduced in order to solve this 
problem for intrinsic functions.
The goal of this patch is to expose that attribute at the CLANG level, so it 
can be used on all functions.

The user is supposed to add such attribute when the function requires 
convergent execution, like in the example above.

I’m not aware of any attribute that would conflict with convergent.

The convergent attribute can be applied as well to member functions.

The convergent attribute cannot be applied to Objective-C methods right now — 
it will be ignored:

test.c:14:27: warning: 'convergent' attribute only applies to functions 
[-Wignored-attributes]
- (void) x __attribute__((convergent));

Since convergent is meant for languages supporting the SIMT execution model, 
and to the best of my knowledge I’m not aware of any language based on 
Objective-C supporting that, I would guess there is no benefit in supporting 
convergent on ObjectiveC methods.

>> diff --git a/include/clang/Basic/Attr.td b/include/clang/Basic/Attr.td
>> index df41aeb..eafafc6 100644
>> --- a/include/clang/Basic/Attr.td
>> +++ b/include/clang/Basic/Attr.td
>> @@ -580,6 +580,12 @@ def Constructor : InheritableAttr {
>>   let Documentation = [Undocumented];
>> }
>> 
>> +def Convergent : InheritableAttr {
>> +  let Spellings = [GNU<"convergent">];
> 
> Is there a reason to not support this under CXX11<"clang",
> "convergent"> as well?

I’ve just used the most basic spelling, which fit the OpenCL case.
I can add support for the CXX11 spelling if you find it valuable.

>> +  let Subjects = SubjectList<[Function]>;
>> +  let Documentation = [Undocumented];
> 
> Please, no new undocumented attributes.

Fixed, here is updated patch:



convergent.diff
Description: Binary data


Thanks!

--
Ettore Speziale — Compiler Engineer
speziale.ett...@gmail.com
espezi...@apple.com
--

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [Clang] Convergent Attribute

2016-05-04 Thread Aaron Ballman via cfe-commits
On Tue, May 3, 2016 at 12:18 PM, Ettore Speziale
 wrote:
> Hello,
>
> the attached patch introduces the `convergent` attribute.
>
> It is meant to be lowered into the LLVM `convergent` attribute, to restrict 
> optimizations of attributed functions — e.g. you can attach convergent to 
> OpenCL’s barrier, and thus prevent a call site being moved to another 
> position which is not control equivalent.

I would appreciate a bit more background on this attribute's
semantics. How would a user know when to add this attribute to their
function definition? Are there other attributes that cannot be used in
conjunction with this one? Should this apply to member functions? What
about Objective-C methods?

> diff --git a/include/clang/Basic/Attr.td b/include/clang/Basic/Attr.td
> index df41aeb..eafafc6 100644
> --- a/include/clang/Basic/Attr.td
> +++ b/include/clang/Basic/Attr.td
> @@ -580,6 +580,12 @@ def Constructor : InheritableAttr {
>let Documentation = [Undocumented];
>  }
>
> +def Convergent : InheritableAttr {
> +  let Spellings = [GNU<"convergent">];

Is there a reason to not support this under CXX11<"clang",
"convergent"> as well?

> +  let Subjects = SubjectList<[Function]>;
> +  let Documentation = [Undocumented];

Please, no new undocumented attributes.

> +}
> +
>  def CUDAConstant : InheritableAttr {
>let Spellings = [GNU<"constant">];
>let Subjects = SubjectList<[Var]>;
> diff --git a/lib/CodeGen/CGCall.cpp b/lib/CodeGen/CGCall.cpp
> index 50ea7f7..3e69c79 100644
> --- a/lib/CodeGen/CGCall.cpp
> +++ b/lib/CodeGen/CGCall.cpp
> @@ -1626,6 +1626,8 @@ void CodeGenModule::ConstructAttributeList(
>FuncAttrs.addAttribute(llvm::Attribute::NoReturn);
>  if (TargetDecl->hasAttr())
>FuncAttrs.addAttribute(llvm::Attribute::NoDuplicate);
> +if (TargetDecl->hasAttr())
> +  FuncAttrs.addAttribute(llvm::Attribute::Convergent);
>
>  if (const FunctionDecl *Fn = dyn_cast(TargetDecl)) {
>AddAttributesFromFunctionProtoType(
> diff --git a/lib/Sema/SemaDeclAttr.cpp b/lib/Sema/SemaDeclAttr.cpp
> index cbc95dc..847ed6c 100644
> --- a/lib/Sema/SemaDeclAttr.cpp
> +++ b/lib/Sema/SemaDeclAttr.cpp
> @@ -5426,6 +5426,9 @@ static void ProcessDeclAttribute(Sema &S, Scope *scope, 
> Decl *D,
>case AttributeList::AT_Constructor:
>  handleConstructorAttr(S, D, Attr);
>  break;
> +  case AttributeList::AT_Convergent:
> +handleSimpleAttribute(S, D, Attr);
> +break;
>case AttributeList::AT_CXX11NoReturn:
>  handleSimpleAttribute(S, D, Attr);
>  break;
> diff --git a/test/CodeGen/attr-convergent.c b/test/CodeGen/attr-convergent.c
> new file mode 100644
> index 000..d759e75
> --- /dev/null
> +++ b/test/CodeGen/attr-convergent.c
> @@ -0,0 +1,18 @@
> +// RUN: %clang_cc1 -triple x86_64-apple-macosx10.7.0 %s -emit-llvm -o - | 
> FileCheck %s
> +
> +int f0(void) __attribute__((convergent));
> +
> +int f1(void) {
> +  return f0();
> +}
> +
> +// CHECK: define i32 @f1() [[ATTR_1:#[0-9]+]] {
> +// CHECK:   [[RET:%.+]] = call i32 @f0() [[ATTR_CS:#[0-9]+]]
> +// CHECK:   ret i32 [[RET]]
> +// CHECK: }
> +
> +// CHECK: declare i32 @f0() [[ATTR_0:#[0-9]+]]
> +
> +// CHECK-NOT: attributes [[ATTR_1]] = { convergent {{.*}} }
> +// CHECK: attributes [[ATTR_0]] = { convergent {{.*}} }
> +// CHECK: attributes [[ATTR_CS]] = { convergent }
> diff --git a/test/Sema/attr-convergent.c b/test/Sema/attr-convergent.c
> new file mode 100644
> index 000..d9a9db9
> --- /dev/null
> +++ b/test/Sema/attr-convergent.c
> @@ -0,0 +1,7 @@
> +// RUN: %clang_cc1 %s -verify -fsyntax-only
> +
> +int t0 __attribute__((convergent)); // expected-warning {{'convergent' 
> attribute only applies to functions}}
> +
> +void t1() __attribute__((convergent));
> +
> +void t2() __attribute__((convergent(2))); // expected-error {{'convergent' 
> attribute takes no arguments}}
>

~Aaron
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits