================
@@ -0,0 +1,180 @@
+===============================
+ AMDGPU Asynchronous Operations
+===============================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+Asynchronous operations are memory transfers (usually between global memory
+and LDS) that are completed independently at an unspecified scope. A thread that
+requests one or more asynchronous transfers can use *async markers* to track
+their completion. The thread waits for each marker to be *completed*, which
+indicates that requests initiated in program order before this marker have also
+completed.
+
+Operations
+==========
+
+``async_load_to_lds``
+---------------------
+
+.. code-block:: llvm
+
+  ; Legacy "LDS DMA" operations
+  void @llvm.amdgcn.load.to.lds(ptr %src, ptr %dst, ASYNC)
+  void @llvm.amdgcn.global.load.lds(ptr %src, ptr %dst, ASYNC)
+  void @llvm.amdgcn.raw.buffer.load.lds(ptr %src, ptr %dst, ASYNC)
+  void @llvm.amdgcn.raw.ptr.buffer.load.lds(ptr %src, ptr %dst, ASYNC)
+  void @llvm.amdgcn.struct.buffer.load.lds(ptr %src, ptr %dst, ASYNC)
+  void @llvm.amdgcn.struct.ptr.buffer.load.lds(ptr %src, ptr %dst, ASYNC)
+
+Requests an async operation that copies the specified number of bytes from the
+global/buffer pointer ``%src`` to the LDS pointer ``%dst``.
+
+The optional parameter ``ASYNC`` is a bit in the auxiliary argument to these
+intrinsics, as documented in :ref:`LDS DMA operations<amdgpu-lds-dma-bits>`.
+When set, it indicates that the compiler should not automatically track the
+completion of this operation.
+
+``@llvm.amdgcn.asyncmark()``
+----------------------------
+
+Creates an *async marker* to track all the async operations that are program
+ordered before this call. A marker M is said to be *completed* only when all
+async operations program ordered before M are reported by the implementation as
+having finished, and it is said to be *outstanding* otherwise.
+
+Thus we have the following sufficient condition:
+
+  An async operation X is *completed* at a program point P if there exists a
+  marker M such that X is program ordered before M, M is program ordered before
+  P, and M is completed. X is said to be *outstanding* at P otherwise.
----------------
arsenm wrote:

Isn't this just waitcnt? 

https://github.com/llvm/llvm-project/pull/173259