I hit a problem in one of my reduction test cases where the
GOACC_JOIN was getting cloned. Nvptx requires FORK and JOIN to be
single-entry, single-exit regions, or some form of thread divergence may
occur. When that happens, we cannot use the shfl instruction for
reductions or broadcasting (if
On 08/26/15 09:57, Cesar Philippidis wrote:
I hit a problem in one of my reduction test cases where the
GOACC_JOIN was getting cloned. Nvptx requires FORK and JOIN to be
single-entry, single-exit regions, or some form of thread divergence may
occur. When that happens, we cannot use the shfl