Pekka,

Thank you for the explanation.

At clEnqueueNDRangeKernel() time, all work-group details are available.
If I have a local work-group size of (2, 1, 1), can I take it that the
kernel is replicated/unrolled 2 * 1 * 1 = 2 times, regardless of the
global dimensions?

Thank you,
Chuck

On 8/21/2013 11:49 AM, Pekka Jääskeläinen wrote:
> Hi Chuck,
>
> On 08/21/2013 09:28 PM, Chuck Zhao wrote:
>> - when the kernel is vectorized, how many times does the kernel get
>> replicated ? (through either WorkitemReplicate or WorkitemUnroll).
>> - Who/How is this number decided?
> WorkitemReplicate and WorkitemLoops produce work-group functions for
> known local sizes. This means that they can be generated only
> when the local size is known (at enqueue).
>
> WorkitemReplicate always replicates the work-items fully (i.e. you
> get as many work-item copies as the local size is). WorkitemLoops
> unrolls the wiloop only if instructed. See
>
> http://pocl.sourceforge.net/docs/html/using.html#tuning-pocl-behavior
> -->POCL_WORK_GROUP_METHOD
>
> If you want to vectorize work groups, WorkitemReplicate can be used in
> combination with WIVectorize of pocl (branched BBVectorize). This is
> "deprecated" as I do not think it's a good idea to maintain the hackish
> WIVectorize out of LLVM tree (we implemented this for a research
> core and it was never really polished for wider use).
> This style of WG-vectorization is enabled with
> export POCL_WORK_GROUP_METHOD=repl
> export POCL_VECTORIZE_WORK_GROUPS=1
>
> In the future, better use and improve the WorkitemLoops+LoopVectorize
> (of LLVM upstream) instead. Then the wiloop unrolling decision should
> be left to the LLVM's LoopVectorizer. Enable this with
>
> export POCL_WORK_GROUP_METHOD=loopvec
>
> If you want to debug why your kernel does not vectorize horizontally,
> you can get some hints by enabling the -debug in the opt call of
> pocl-workgroup and grep for the vectorizer debug printouts.
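
To make the (2, 1, 1) question concrete, here is a minimal C sketch of how the two work-group methods described above might look conceptually. It is not pocl's actual generated code. All names (kernel_body, wg_replicated_2x1x1, wg_loop, run_replicated, run_loop) are hypothetical, and the kernel body is a made-up "double the input" example. It assumes a 1D range whose global size is a multiple of the local size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel body: out[gid] = 2 * in[gid]. */
void kernel_body(const int *in, int *out, size_t gid) {
    out[gid] = 2 * in[gid];
}

/* Replication style for local size (2, 1, 1): the body appears once per
 * work-item, so there are 2 * 1 * 1 = 2 copies in the work-group function. */
void wg_replicated_2x1x1(const int *in, int *out, size_t group_id) {
    const size_t base = group_id * 2;
    kernel_body(in, out, base + 0); /* work-item with local id 0 */
    kernel_body(in, out, base + 1); /* work-item with local id 1 */
}

/* Loop style: one copy of the body inside a work-item loop; whether the
 * loop is later unrolled or vectorized is a separate decision. */
void wg_loop(const int *in, int *out, size_t group_id, size_t local_size_x) {
    for (size_t lid = 0; lid < local_size_x; ++lid)
        kernel_body(in, out, group_id * local_size_x + lid);
}

/* The global size only sets how many times the work-group function is
 * called; it does not change the number of copies inside it. */
void run_replicated(const int *in, int *out, size_t global_size_x) {
    assert(global_size_x % 2 == 0);
    for (size_t g = 0; g < global_size_x / 2; ++g)
        wg_replicated_2x1x1(in, out, g);
}

void run_loop(const int *in, int *out, size_t global_size_x,
              size_t local_size_x) {
    assert(local_size_x > 0 && global_size_x % local_size_x == 0);
    for (size_t g = 0; g < global_size_x / local_size_x; ++g)
        wg_loop(in, out, g, local_size_x);
}
```

Under these assumptions, calling run_replicated(in, out, 4) and run_loop(in, out, 4, 2) on the same input gives identical results. Only the number of work-group function calls grows with the global size.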
