> On May 6, 2016, at 18:12, Richard Smith via cfe-commits 
> <cfe-commits@lists.llvm.org> wrote:
> 
> On Fri, May 6, 2016 at 4:20 PM, Matt Arsenault via cfe-commits 
> <cfe-commits@lists.llvm.org> wrote:
> On 05/06/2016 02:42 PM, David Majnemer via cfe-commits wrote:
> This example looks wrong to me. It doesn't seem meaningful for a function to 
> be both readonly and convergent, because convergent means the call has some 
> side-effect visible to other threads and readonly means the call has no 
> side-effects visible outside the function.
> This is not correct. It is valid for convergent operations to be 
> readonly/readnone. Barriers are a common case which do have side effects, but 
> there are also classes of GPU instructions which do not access memory and 
> still need the convergent semantics.
> 
> Can you give an example? It's not clear to me how a function could be both 
> convergent and satisfy the readnone requirement that it not "access[...] any 
> mutable state (e.g. memory, control registers, etc) visible to caller 
> functions". Synchronizing with other threads seems like it would cause such a 
> state change in an abstract sense. Is the critical distinction here that the 
> state mutation is visible to the code that spawned the gang of threads, but 
> not to other threads within the gang? (This seems like a bug in the 
> definition of readonly if so, because it means that a readonly call whose 
> result is unused cannot be deleted.)
> 
> I care about this because Clang maps __attribute__((pure)) to LLVM readonly, 
> and -- irrespective of the LLVM semantics -- a call to a function marked pure 
> is permitted to be deleted if the return value is unused, or to have multiple 
> calls CSE'd. As a result, inside Clang, we use that attribute to determine 
> whether an expression has side effects, and Clang's reasoning about these 
> things may also lead to miscompiles if a call to a function marked 
> __attribute__((pure, convergent)) actually can have a side effect.
> _______________________________________________
> cfe-commits mailing list
> cfe-commits@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

These are cross-lane communication operations that do not require 
synchronization within the wavefront. An example would be the amdgcn.mov.dpp 
instruction, which reads a register from a neighboring lane, or the CUDA warp 
vote functions. There is no other way for a workitem to access that 
information, which is private to the other workitem. There’s no observable 
global state from the perspective of a single lane. The individual registers 
changed aren’t visible to the spawning host program (perhaps with the exception 
of some debug hardware inspecting all of the individual registers). Deleting 
these calls would be perfectly acceptable if the result is unused.
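
To make that concrete, here is a rough host-side sketch in plain C++ of what a 
warp vote computes. It is only a model, not how the hardware or LLVM implements 
it: the name ballot, the explicit active_mask parameter, and the 32-lane limit 
are assumptions made up for illustration. The function writes no memory, so an 
unused result can be dropped, but its value depends on which lanes are active at 
the call site, which is why the call must not be made control-dependent on 
additional conditions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a warp vote: returns a bitmask with bit i set when
// lane i is active and its predicate is true. It only reads per-lane values
// and writes no memory or other shared state.
std::uint32_t ballot(const std::vector<bool>& pred, std::uint32_t active_mask) {
    std::uint32_t result = 0;
    for (std::size_t lane = 0; lane < pred.size() && lane < 32; ++lane) {
        const bool active = ((active_mask >> lane) & 1u) != 0;
        if (active && pred[lane]) {
            result |= (1u << lane);
        }
    }
    return result;
}

// Same predicates, different sets of active lanes. The first mask models a
// call where all four lanes are converged; the second models the same call
// after it was sunk into a branch that only lanes 0 and 1 took.
bool sinking_changes_result() {
    const std::vector<bool> pred = {true, false, true, true};
    const std::uint32_t converged = ballot(pred, 0xFu);
    const std::uint32_t divergent = ballot(pred, 0x3u);
    return converged != divergent;
}
```

In this model the two calls see identical inputs from the point of view of a 
single lane, yet they return different masks, so moving the call changes 
behavior even though nothing is written anywhere.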

-Matt