Hi,
> I believe I figured it out, your comment about the global sizes allowed
> me to realize the defaults don't account for a second dimension.
> Once I set that I am able to get the kernel to work properly. Thank you
> for listening and directing me to different points to check.
ah, great
Karl,
I believe I figured it out, your comment about the global sizes allowed me
to realize the defaults don't account for a second dimension. Once I
set that I am able to get the kernel to work properly. Thank you for
listening and directing me to different points to check.
Regards,
Charle
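
For readers following the thread, the effect described above can be reproduced on the host without any OpenCL runtime. The sketch below is only an emulation in plain C: the two loops stand in for get_global_id(0) and get_global_id(1), and the names emulate_launch and mark_element are hypothetical, not part of any library. It assumes a row-major result buffer and that a launch whose second dimension was never configured behaves as if that global size were 1.

```c
#include <stddef.h>

/* Hypothetical stand-in for a kernel body: one work-item marks the
 * output element (row, col) it is responsible for. */
static void mark_element(int *C, size_t ncols, size_t row, size_t col) {
    C[row * ncols + col] = 1;
}

/* Emulate an NDRange launch on the host. Passing global1 == 1 models a
 * launch where only dimension 0 was given a global size (an assumption
 * about the defaults, not a statement about any particular library).
 * Returns the number of output elements that were written. */
size_t emulate_launch(int *C, size_t nrows, size_t ncols,
                      size_t global0, size_t global1) {
    size_t written = 0;
    for (size_t g1 = 0; g1 < global1; ++g1) {      /* get_global_id(1) */
        for (size_t g0 = 0; g0 < global0; ++g0) {  /* get_global_id(0) */
            if (g0 < nrows && g1 < ncols) {        /* stay inside the matrix */
                mark_element(C, ncols, g0, g1);
                ++written;
            }
        }
    }
    return written;
}
```

With a 4-by-4 result, emulate_launch(C, 4, 4, 4, 1) touches only the four elements of the first column, while emulate_launch(C, 4, 4, 4, 4) touches all sixteen, which matches the symptom of the kernel appearing to stop after the first column.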
Hi Charles,
> Here is the current kernel
> with all the different attempts commented out (where MdimPad and PdimPad
> are the padded dimensions).
Where is the third dimension? Are you assuming C to be M-by-M?
> If I don't have a size condition check, the
> device quickly runs out of resources (
Hi Karl,
I have been trying on and off to get this to work and I am completely
stumped. I have reversed the kernel so it is for row-major format. Right
now, though, if I use the logic you suggested (switching the '<' for '>'),
the kernel stops after the first column. So only the first element of
Hey,
> Ah yes, thanks Karl. I remember that now. With that said, are there
> recommendations on how kernels should be written to address the padded
> columns? I am imagining some if/else or loop limits on indices but
> thought I would ask here before I start trying to do that. I am trying
> t
Ah yes, thanks Karl. I remember that now. With that said, are there
recommendations on how kernels should be written to address the padded
columns? I am imagining some if/else or loop limits on indices but thought
I would ask here before I start trying to do that. I am trying to look
through th
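
One common way to handle padded rows and columns is an early-exit guard on the work-item indices combined with loop limits that use the logical sizes, while the index arithmetic uses the padded strides. The following is a minimal host-side sketch in plain C rather than an actual OpenCL kernel; the loops in mat_mult_padded play the role of the global work size. The layout is an assumption and not taken from the thread: row-major storage, A is Mdim-by-Pdim with row stride PdimPad, and B (Pdim-by-Mdim) and C (Mdim-by-Mdim) both use row stride MdimPad. The function names are hypothetical.

```c
#include <stddef.h>

/* Body of one emulated work-item. Assumed layout (not from the thread):
 * row-major, A has row stride PdimPad, B and C have row stride MdimPad. */
static void mat_mult_item(int Mdim, int Pdim, int MdimPad, int PdimPad,
                          const int *A, const int *B, int *C,
                          int row, int col) {
    /* Work-items that land in the padding return without writing. */
    if (row >= Mdim || col >= Mdim) {
        return;
    }
    int acc = 0;
    /* Loop over the logical inner dimension, not the padded one, so that
     * whatever is stored in the padding never enters the sum. */
    for (int k = 0; k < Pdim; ++k) {
        acc += A[row * PdimPad + k] * B[k * MdimPad + col];
    }
    C[row * MdimPad + col] = acc;
}

/* Emulates a 2D launch whose global size is the padded extent in both
 * dimensions; row and col correspond to the two global ids. */
void mat_mult_padded(int Mdim, int Pdim, int MdimPad, int PdimPad,
                     const int *A, const int *B, int *C) {
    for (int row = 0; row < MdimPad; ++row) {
        for (int col = 0; col < MdimPad; ++col) {
            mat_mult_item(Mdim, Pdim, MdimPad, PdimPad, A, B, C, row, col);
        }
    }
}
```

The guard keeps the padded region of C untouched, and the padded strides keep each logical element at the offset the padded buffers actually use. In a real kernel the same guard would sit at the top of the kernel body, right after the global ids are read.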
Hi,
On 05/23/2016 05:38 PM, Charles Determan wrote:
> I am experimenting with the custom OpenCL kernel functionality,
> specifically a naive matrix multiplication as an example.
>
> My OpenCL Kernel:
> __kernel void iMatMult(const int Mdim, const int Pdim,
> __global const
I am experimenting with the custom OpenCL kernel functionality,
specifically a naive matrix multiplication as an example.
My OpenCL Kernel:
__kernel void iMatMult(const int Mdim, const int Pdim,
__global const int *A, __global const int *B,
__global int *C) {
// Get the