Hi Matthias, all,

Matthias Noack <[email protected]> writes:
> the following talk from IWOCL'17 has some numbers on PoCL on Xeon (HSW), 
> and Xeon Phi (KNL) for two applications using PoCL and other SDKs:
>
> http://www.iwocl.org/wp-content/uploads/iwocl2017_matthias-noack-good-bad-ugly.pdf
>
> The numbers for the comet simulation (slide 18, pdf 53), show some weird 
> outliers every 32 work-items, when increasing the overall number of 
> work-items (particles).
>
> The HEOM code reveals a large performance gap in comparison with the 
> Intel SDK on KNL, while it looks quite competitive on HSW. This is 
> especially strange, as the Intel SDK has only AVX2 support, while PoCL 
> should be able to generate AVX-512 code using LLVM. I compiled PoCL and 
> LLVM on the target architectures, and made sure the result is an actual 
> Haswell and KNL build.
>
> Is anyone interested in working on improving these issues?

I use pocl extensively as my reference CL implementation on which all my
numerical research codes are based. I naturally have an interest in
making those run as fast as possible. While I (sadly) likely won't be
able to contribute myself, I will very likely be able to contribute
time in the form of student projects. If someone could identify
"student-sized" subprojects of this endeavor, that'd be hugely
helpful. :)

Andreas

_______________________________________________
pocl-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pocl-devel
