Hey,

The GEMM kernels are getting pretty tricky, with quite a few fallbacks
involved. This makes them hard to test, so I thought it would be a good
idea to discuss it. Basically, here is how it works:

A = [A1 A2; A3 A4]
B = [B1 B2; B3 B4]
C = [C1 C2; C3 C4]

Each matrix is divided into blocks according to the corresponding block
size of the template. For example, A1 is the part of A whose size is the
closest multiple of the size tuple (ML, KL) that still fits inside A, where
ML is the number of rows computed by each work group and KL is the "width
step" for computing the inner products (if the kernel uses local memory,
each work group loads successive blocks of size ML*KL).

A few kernels are enqueued so that:
C1 = A1*B1 [optimized kernel]
C1 += A2*B3 [fallback] if needed
C2 = A1*B2 [fallback] if needed
C2 += A2*B4 [fallback] if needed
etc...
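
To make the launch sequence above concrete, here is a minimal sketch in
plain Python (not ViennaCL code). The helper names, the NL block size for
the columns of B, and the choice to zero C and accumulate everywhere are
assumptions made for illustration; only the block split and the
optimized/fallback labelling come from the description above.

```python
def round_down(n, block):
    """Largest multiple of `block` that does not exceed n."""
    return (n // block) * block


def gemm_acc(A, B, C, rows, inner, cols):
    """C[i][j] += sum_k A[i][k] * B[k][j] over the given index ranges."""
    for i in rows:
        for j in cols:
            C[i][j] += sum(A[i][k] * B[k][j] for k in inner)


def blocked_gemm(A, B, ML, KL, NL):
    """Compute C = A*B as one 'optimized' launch plus clean-up launches.

    ML, KL, NL are hypothetical block sizes. Returns C and the list of
    launch kinds, in the order they were issued.
    """
    M, K, N = len(A), len(B), len(B[0])
    m1, k1, n1 = round_down(M, ML), round_down(K, KL), round_down(N, NL)
    C = [[0] * N for _ in range(M)]  # zeroed, so every launch can accumulate
    # Index 0 is the block-aligned part, index 1 the remainder.
    row_parts = [range(0, m1), range(m1, M)]
    inner_parts = [range(0, k1), range(k1, K)]
    col_parts = [range(0, n1), range(n1, N)]
    launches = []
    for bi, rows in enumerate(row_parts):
        for bj, cols in enumerate(col_parts):
            for bk, inner in enumerate(inner_parts):
                if len(rows) == 0 or len(cols) == 0 or len(inner) == 0:
                    continue  # "if needed": nothing to do for an empty block
                aligned = (bi, bj, bk) == (0, 0, 0)
                launches.append("optimized" if aligned else "fallback")
                gemm_acc(A, B, C, rows, inner, cols)
    return C, launches
```

With a 7x10 A, a 10x5 B and all block sizes set to 4, this issues one
"optimized" launch followed by seven "fallback" launches; with sizes that
are exact multiples of the block sizes, only the optimized launch remains.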

Basically, one optimized kernel does the bulk of the work, and the other
ones do the "clean-up". This works well for full matrices and ranges.
When slices are involved, things get more complicated. If the stride is on
the non-leading dimension (stride2 for column-major matrices), then it can
be incorporated in the optimized kernel (by inserting ld *= stride2 at the
beginning of the kernel). However, if stride1 > 1, then we need to use the
fallback kernel. This is a reasonable thing to do: in most applications I
know of, only one dimension is strided at a time (we want a subset of the
rows/columns of a given matrix).
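
The index arithmetic behind the ld *= stride2 trick can be checked with a
small sketch. It assumes a column-major layout where element (i, j) of a
slice lives at (start1 + i*stride1) + (start2 + j*stride2)*ld; the names
start1 and start2 and the helper functions are hypothetical, chosen to
match stride1/stride2 above.

```python
def slice_offset(i, j, start1, stride1, start2, stride2, ld):
    """Linear offset of slice element (i, j) in a column-major buffer."""
    return (start1 + i * stride1) + (start2 + j * stride2) * ld


def fold_stride2(start1, start2, stride2, ld):
    """Fold stride2 into the leading dimension (only valid if stride1 == 1).

    Returns (base, ld_eff) such that element (i, j) of the slice sits at
    base + i + j * ld_eff, i.e. it looks like a plain column-major matrix.
    """
    return start1 + start2 * ld, ld * stride2


def can_use_optimized_kernel(stride1):
    """A stride on the leading dimension cannot be folded away."""
    return stride1 == 1


# Every third column of a buffer with ld = 10, starting at (2, 1).
base, ld_eff = fold_stride2(start1=2, start2=1, stride2=3, ld=10)
for i in range(4):
    for j in range(3):
        assert slice_offset(i, j, 2, 1, 1, 3, 10) == base + i + j * ld_eff
```

When stride1 > 1 the rows themselves are no longer contiguous, so no choice
of base and leading dimension reproduces the offsets, which is why that
case goes to the fallback kernel.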

However, this becomes really messy to test! Basically, I think that, to
have an exhaustive enough test suite, we should go for:

- Matrices of complicated arbitrary sizes (143, 284, 395). It is important
to space them by more than 128, to be sure that A1, B1 and C1 are not square.
- Ranges of similar complicated sizes.
- "Optimized" ranges: (128, 256, 384) for example.
- Matrix row-wise slices, matrix column-wise slices, and matrix slices in
both directions.
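
As a rough sketch of what such a test matrix could look like, the snippet
below lists the cases from the list above and notes which code path each
one is expected to exercise. Several things here are assumptions and not
part of the proposal: a block multiple of 128, column-major storage, a
stride of 2 for the slices, "row-wise slice" meaning stride1 > 1, and the
reuse of the arbitrary sizes for the slice cases.

```python
BLOCK = 128  # assumed block multiple, suggested by the sizes in the list

ARBITRARY = (143, 284, 395)
OPTIMIZED = (128, 256, 384)

# (case name, sizes, (stride1, stride2)); the stride value 2 is illustrative.
CASES = [
    ("matrix", ARBITRARY, (1, 1)),
    ("range", ARBITRARY, (1, 1)),
    ("optimized range", OPTIMIZED, (1, 1)),
    ("row-wise slice", ARBITRARY, (2, 1)),
    ("column-wise slice", ARBITRARY, (1, 2)),
    ("slice in both directions", ARBITRARY, (2, 2)),
]


def expected_paths(sizes, strides):
    """Which paths a case should hit, under the assumptions stated above."""
    stride1, _stride2 = strides
    return {
        # Sizes that are not block multiples leave remainder blocks.
        "clean_up": any(s % BLOCK != 0 for s in sizes),
        # A stride on the leading dimension rules out the optimized kernel.
        "fallback_only": stride1 > 1,
    }


for name, sizes, strides in CASES:
    print(name, sizes, strides, expected_paths(sizes, strides))
```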

I am ready to rewrite the GEMM tests accordingly, but any thoughts on the
procedure would be appreciated!

Philippe