Bug#949767: clblas: *gemm wrong answers in out-of-order queues

2020-11-16 Thread Witold Baryluk
Source: clblas
Followup-For: Bug #949767
X-Debbugs-Cc: witold.bary...@gmail.com

Hi,

Has there been any progress on this?

Is this something that upstream should look into?

Thanks,
Witold


-- System Information:
Debian Release: bullseye/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 5.7.0-1-amd64 (SMP w/32 CPU threads)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled



Bug#949767: clblas: *gemm wrong answers in out-of-order queues

2020-01-27 Thread Rebecca N. Palmer

Control: retitle -1 clblas: *gemm wrong answers in out-of-order queues
Control: reassign -1 src:clblas
Control: found -1 2.12-1

I think I've found the actual bug, in clblas src/library/blas/xgemm.cc:
when given a single command queue, clblasGemm enqueues up to 4 kernels
but returns an event that depends only on the last of them. If the queue
is out-of-order, waiting on this event therefore doesn't necessarily wait
for all of the kernels to finish.


This was previously noticed in
https://github.com/clMathLibraries/clBLAS/issues/269#issuecomment-225453543
but not actually reported as a bug.


clblas includes a client/performance tester that creates an out-of-order
queue (at src/client/clfunc_common.hpp:306), which implies that clblas
intends to support such queues.  (We don't run clblas' own tests, possibly
because of https://github.com/clMathLibraries/clBLAS/issues/338.)


The real fix would be to return an event that depends on all the 
kernels' events (e.g. created with clEnqueueMarkerWithWaitList).
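
To illustrate the difference between the current behaviour and the
proposed fix, here is a minimal toy model. It is not clBLAS or OpenCL
code: each kernel is reduced to a hypothetical finish time, and the
names (FinishTimes, event_on_last_kernel, event_on_marker,
result_is_complete) are invented for this sketch. In real OpenCL the
second variant would roughly correspond to passing all the kernels'
events as the wait list of clEnqueueMarkerWithWaitList and returning
the marker's event.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model, not OpenCL: each "kernel" is reduced to the time at which it
// finishes, listed in enqueue order. In an out-of-order queue these times
// need not be increasing.
using FinishTimes = std::vector<int>;

// Behaviour described in the report: the returned event tracks only the
// last-enqueued kernel, so it fires when that kernel finishes.
int event_on_last_kernel(const FinishTimes& t) {
    assert(!t.empty());
    return t.back();
}

// Proposed fix (modelled): a marker that waits on every kernel's event
// fires only once the slowest kernel has finished.
int event_on_marker(const FinishTimes& t) {
    assert(!t.empty());
    return *std::max_element(t.begin(), t.end());
}

// True if every kernel has finished by the time the event fires.
bool result_is_complete(const FinishTimes& t, int event_time) {
    return std::all_of(t.begin(), t.end(),
                       [event_time](int finish) { return finish <= event_time; });
}
```

With increasing finish times (as in an in-order queue) both events
fire only after all kernels are done, which would be consistent with
the wrong answers showing up only on out-of-order queues.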


As a workaround for now, I intend to disable out-of-order queues in 
libgpuarray.  (It appears to be the only reverse dependency of clblas 
that also uses out-of-order queues.)