We basically have to follow these rules:

1. The range must be known prior to execution of a GPU code block.
2. The range cannot be changed during execution of a GPU code block.
3. Code blocks can only receive a single range; it can, however, be multidimensional.
4. Index keys used in a code block are immutable.
5. Code blocks can only use a single key (the GPU executes many instances in parallel, each with its own unique key).
6. Indexes are always an unsigned integer type.
7. OpenCL and CUDA have no access to global state.
8. GPU code blocks cannot allocate memory.
9. GPU code blocks cannot call CPU functions.
10. Atomics, though available on the GPU, are many times slower than on the CPU.
11. Separately running instances of the same code block on the GPU cannot have any interdependency on each other.
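
As a concrete reading of these rules, here is what a single instance of a conforming code block looks like written as ordinary D (purely illustrative; kernelBody is a made-up name, and on real hardware the GPU runtime would supply key):

// One instance of a rule-conforming code block, as plain D.
// The runtime supplies a unique unsigned key per instance (rules 5, 6);
// the body allocates nothing (rule 8), calls no CPU functions (rule 9),
// and instance i writes only c[i] (rule 11).
void kernelBody(size_t key, const(float)[] a, const(float)[] b, float[] c)
{
    c[key] = a[key] + b[key];
}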

Now if we are talking about HSA, or another similar setup, then a few of those rules don't apply or become fuzzy.

HSA does have limited access to global state; it can call CPU functions that are pure; and of course, because the CPU and GPU share the same virtual address space under HSA, most of memory is open for access.
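
D already has a pure attribute that expresses much of the guarantee such a CPU callee would need, which suggests a natural fit. For example (mapping pure to "callable from HSA kernels" is my speculation, not anything in the HSA spec):

// D's existing `pure` attribute already forbids reading or writing
// global mutable state, which is the property an HSA kernel would
// need from any CPU function it calls.
pure float scale(float x, float factor)
{
    return x * factor;  // no side effects, no global access
}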

HSA also manages memory via the hMMU, and there is no need for GPU memory management functions, as that is handled by the operating system and the video card drivers.

Basically, D would either need to opt out of legacy APIs such as OpenCL, CUDA, etc. (these are mostly tied to C/C++ anyway, and generally have ugly-as-sin syntax), or D would have to go the route of a full and safe GPU subset of features.

I don't think such a setup can be implemented simply as a library, as the GPU needs compiled source.
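
That is the problem with the library route today: the kernel ends up as a string compiled by the driver at runtime, which the D compiler never sees or type-checks. A rough sketch of what that looks like (the OpenCL C inside the string is standard; the surrounding D is just for illustration):

// With a library-only approach, the kernel is an opaque string
// handed to the driver at runtime -- invisible to D's type system.
enum addKernel = `
    __kernel void add(__global const float* a,
                      __global const float* b,
                      __global float* c)
    {
        size_t key = get_global_id(0);
        c[key] = a[key] + b[key];
    }
`;
// ...later handed off via clCreateProgramWithSource() and built
// with clBuildProgram() before it can run.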

If D were to implement GPGPU features, I would actually suggest starting by simply adding a microthreading function syntax, for example...

void example(aggregate in float[] a; key, in float[] b, out float[] c) {
    c[key] = a[key] + b[key];
}

By adding an aggregate keyword to the function, we can infer the range simply from the length of a[], without adding an extra set of brackets or anything similar.
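
Until such syntax exists, roughly the same ergonomics can be approximated with a template in current D, with the range likewise inferred from the first array's length (a hand-rolled sketch; microthread and its shape are my invention, and it runs on CPU threads, not the GPU):

import std.parallelism : parallel;
import std.range : iota;

// Stand-in for the proposed syntax: the iteration range is inferred
// from a.length, and the kernel body receives each key in parallel.
void microthread(alias kernel)(const float[] a, const float[] b, float[] c)
{
    foreach (key; parallel(iota(a.length)))
        kernel(key, a, b, c);
}

// Usage, mirroring the example above:
//   microthread!((key, a, b, c) { c[key] = a[key] + b[key]; })(a, b, c);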

This would make access to the GPU more generic. More importantly, because LLVM will support HSA, it removes the need to write the more complex support into dmd that OpenCL and CUDA would require; a few hints for the LLVM backend would be enough to generate the dual-bytecode ELF executables.
