OK, so the main point is to keep backward compatibility with POCL_DEVICE_*. As a first step, I will implement probe functions for pthread and basic using the POCL_DEVICE environment variables.

Each device driver maintainer will then be able to implement probe as desired.

In short: there will be no noticeable changes for legacy usage but new devices will be able to use the dynamic detection easily.
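
To make the legacy path concrete, here is a minimal sketch of what such an environment-driven probe could look like. It assumes the variable is named POCL_DEVICES and holds a whitespace-separated list of driver names; count_device_tokens and legacy_probe are hypothetical names for illustration, not existing pocl functions, and the default used when the variable is unset is also an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count how many times `name` appears as a
   whitespace-separated token in `list` (e.g. "pthread pthread basic"). */
unsigned int
count_device_tokens (const char *list, const char *name)
{
  unsigned int count = 0;
  size_t name_len = strlen (name);
  const char *p = list;

  while (*p != '\0')
    {
      /* Skip separators, then measure the next token. */
      p += strspn (p, " \t");
      size_t tok_len = strcspn (p, " \t");
      if (tok_len == name_len && strncmp (p, name, name_len) == 0)
        ++count;
      p += tok_len;
    }
  return count;
}

/* Hypothetical probe for a legacy driver: the device count comes from
   the environment instead of from hardware detection. */
unsigned int
legacy_probe (const char *driver_name, unsigned int default_count)
{
  const char *env = getenv ("POCL_DEVICES"); /* assumed variable name */
  if (env == NULL)
    return default_count; /* assumed default when the variable is unset */
  return count_device_tokens (env, driver_name);
}
```

A driver with real detection logic would simply replace the body of its probe and ignore the environment.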


On 04/04/2014 12:50 PM, Pekka Jääskeläinen wrote:
Hi Vincent,

On 04/04/2014 01:15 PM, Vincent Danjean wrote:
I think that, by default, the pthread driver should return the
number of usable cores in the current cpuset.
This is not a good idea in general as it generates one OpenCL
device per core/HW thread. Then the OpenCL app is required to use
multiple command queues to exploit all the HW threads in the CPU.

The multithreading of pthread is currently done at the granularity of
work-groups inside one device instance.

    Then, an environment variable can override this default. It would
be good if:
- like all other pocl envvars, this one also starts with POCL_
- its contents are similar to the contents of same-goal envvars in
    other environments. In particular, I am thinking of:
    * OMP_NUM_THREADS to define the number of threads to use
    * KMP_AFFINITY (Intel extension) that defines how logical thread
      numbers are mapped onto physical cores. I think the pocl pthread
      driver will need something similar (i.e. define which groups of
      threads (on which physical cores) will form a workgroup, ...)
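
A rough sketch of the "default from the cpuset, override from the environment" idea, for Linux only. The variable name POCL_NUM_THREADS is hypothetical (modeled on OMP_NUM_THREADS), as are the function names and the upper bound; sched_getaffinity and CPU_COUNT are the glibc interfaces for reading the current affinity mask.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdlib.h>

/* Number of cores the calling process may run on (its cpuset/affinity
   mask). Falls back to 1 if the mask cannot be read. */
int
usable_cores (void)
{
  cpu_set_t set;
  CPU_ZERO (&set);
  if (sched_getaffinity (0, sizeof (set), &set) != 0)
    return 1;
  int n = CPU_COUNT (&set);
  return n > 0 ? n : 1;
}

/* Parse a positive decimal count; return `fallback` for anything that
   is missing, malformed, or out of range. */
int
parse_thread_count (const char *text, int fallback)
{
  if (text == NULL || *text == '\0')
    return fallback;
  char *end = NULL;
  errno = 0;
  long v = strtol (text, &end, 10);
  if (errno != 0 || *end != '\0' || v < 1 || v > 65536)
    return fallback;
  return (int) v;
}

/* Hypothetical variable name; the default is the cpuset size. */
int
worker_thread_count (void)
{
  return parse_thread_count (getenv ("POCL_NUM_THREADS"), usable_cores ());
}
```

An affinity-style variable in the spirit of KMP_AFFINITY would need a richer syntax and is not covered here.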
pocl does not run a single work-item per thread in the pthread device
(as would be easy to do).

That approach quickly becomes very expensive with larger WGs and does not
exploit all parallel resources efficiently.

Now work-items inside a single WG are statically parallelized using finer
granularity parallel HW (multi-issue HW and/or SIMD instructions) using the
kernel compiler. Thread/task level parallelism is exploited at the
WG/kernel level.
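
As an illustration only (this is not pocl's actual code), the split described above can be sketched as follows: threads are handed whole work-groups, and the work-items of one group run as a plain loop in a single thread, which a compiler is free to vectorize. The kernel (out = 2 * in), the fixed thread count, and all names are made up for the example.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4 /* arbitrary for the example */

typedef struct
{
  const float *in;
  float *out;
  size_t num_groups;  /* total number of work-groups */
  size_t local_size;  /* work-items per work-group */
  size_t first_group; /* first work-group handled by this thread */
  size_t stride;      /* distance between this thread's work-groups */
} wg_task;

/* One work-group: the work-items are a simple loop inside one thread. */
static void
run_work_group (const wg_task *t, size_t group_id)
{
  size_t base = group_id * t->local_size;
  for (size_t lid = 0; lid < t->local_size; ++lid)
    t->out[base + lid] = 2.0f * t->in[base + lid];
}

/* Thread body: process every stride-th work-group. */
static void *
worker (void *arg)
{
  const wg_task *t = arg;
  for (size_t g = t->first_group; g < t->num_groups; g += t->stride)
    run_work_group (t, g);
  return NULL;
}

/* Distribute work-groups over threads; returns 0 on success and -1 if
   a thread could not be created (some groups are then left undone). */
int
run_kernel (const float *in, float *out, size_t num_groups,
            size_t local_size)
{
  pthread_t threads[NUM_THREADS];
  wg_task tasks[NUM_THREADS];
  int started = 0, rc = 0;

  for (int i = 0; i < NUM_THREADS; ++i)
    {
      tasks[i] = (wg_task){ in, out, num_groups, local_size,
                            (size_t) i, NUM_THREADS };
      if (pthread_create (&threads[i], NULL, worker, &tasks[i]) != 0)
        {
          rc = -1;
          break;
        }
      ++started;
    }
  for (int i = 0; i < started; ++i)
    pthread_join (threads[i], NULL);
  return rc;
}
```

With one thread per work-item instead, a 256-item group would need 256 threads plus barrier synchronization between them, which is the cost referred to above.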

    => if possible, using the same syntax and keywords (as much as
    possible) would be great for users
Yes, if there are earlier conventions on naming the envs, it is
a good idea to try to mimic them.

Finally, the device init operation would also be called only on
clGetDeviceIDs/clCreateContext when requesting a specific device, in
order to speed up initialization and use fewer resources.
There can be an envvar to limit the scanned platforms/devices in
enumerating functions (clIcdGetPlatformIDsKHR, clGetDeviceIDs, ...)
This could work well enough for backwards compatibility,
i.e., use POCL_DEVICES only for _limiting_ the set of probed
devices.
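
A small sketch of the "filter only" reading of POCL_DEVICES: when the variable is unset every driver is probed, otherwise only the drivers it names. The driver table, the stub probes, and the function names are hypothetical, and the whitespace-separated format is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical driver table entry: a name plus a probe callback that
   returns the number of devices found. */
typedef struct
{
  const char *name;
  unsigned int (*probe) (void);
} driver_entry;

/* Stand-in probes for the example; real drivers would detect hardware. */
unsigned int stub_probe_one (void) { return 1; }
unsigned int stub_probe_two (void) { return 2; }

/* A NULL filter (variable unset) allows every driver; otherwise the
   driver name must appear as a whitespace-separated token. */
int
driver_is_allowed (const char *filter, const char *name)
{
  if (filter == NULL)
    return 1;
  size_t name_len = strlen (name);
  const char *p = filter;
  while (*p != '\0')
    {
      p += strspn (p, " \t");
      size_t tok_len = strcspn (p, " \t");
      if (tok_len == name_len && strncmp (p, name, name_len) == 0)
        return 1;
      p += tok_len;
    }
  return 0;
}

/* Probe only the allowed drivers and return the total device count. */
unsigned int
probe_allowed (const driver_entry *drivers, size_t n, const char *filter)
{
  unsigned int total = 0;
  for (size_t i = 0; i < n; ++i)
    if (driver_is_allowed (filter, drivers[i].name))
      total += drivers[i].probe ();
  return total;
}
```

The caller would pass getenv ("POCL_DEVICES") as the filter, so existing setups that set the variable keep seeing only the drivers they listed.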

--
Clément
------------------------------------------------------------------------------
_______________________________________________
pocl-devel mailing list
https://lists.sourceforge.net/lists/listinfo/pocl-devel