Hi all,

I am glad to see this cooperation between pocl and WFV :-) Great news!
As most of you know, I sadly have had almost no time for pocl development since I changed jobs. Pekka's description was clear and accurate; just my two cents on this...

> > A tail replication pass ensures all barriers are reachable by
> > only one preceding barrier to make the parallel region formation
> > (regions between barriers) feasible.
>
> This sounds very similar to our continuation-based approach. We create a
> function for each of these "parallel regions" as you call them.
> And yes, this is only possible with OpenCL's barrier specification (but
> anything else would not make sense anyway).

The approach is quite similar, you are right. I remember reading your WFV paper and realizing that both approaches, while using different algorithms, produce almost the same results. IIRC the main difference was that WFV did not take into account the requirement that all WIs reach the same barriers. That is, your "trampolines" only need to choose which continuation to run for the first WI of each WG; they could then run that same continuation unconditionally for the remaining WIs. I remember finding it surprising that you did not apply this optimization, given that a significant part of the paper was dedicated to de-conditionalizing intra-WI branches to create longer vectorizable regions.

Anyway, I might not remember it well, and I am writing from my phone as I am on holiday until Tuesday. I will take another look at this on Wednesday.

BR
Carlos
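
PS: To illustrate the trampoline point above, here is a minimal sketch in C. It assumes each parallel region has been outlined into a continuation function that returns the index of the next region. All names (wi_state, region_fn, run_work_group, the three toy regions) are hypothetical and are not taken from pocl or WFV code. It also assumes that the branch deciding the next region is uniform across the WG, which is what OpenCL's barrier rules imply.

```c
#define WG_SIZE 4
#define DONE (-1)

/* Hypothetical per-work-item state that stays live across barriers. */
typedef struct { int x; } wi_state;

/* A continuation runs one parallel region for one WI and returns the
   index of the next region to run, or DONE when the kernel finishes. */
typedef int (*region_fn)(wi_state *s, int local_id, int iters);

/* Toy regions: "iters" stands in for a WG-uniform kernel argument, so
   every WI leaves region0 towards the same barrier. */
static int region0(wi_state *s, int lid, int iters) {
    s->x = lid;
    return iters > 0 ? 1 : 2;
}
static int region1(wi_state *s, int lid, int iters) {
    (void)lid; (void)iters;
    s->x += 10;
    return 2;
}
static int region2(wi_state *s, int lid, int iters) {
    (void)lid; (void)iters;
    s->x *= 2;
    return DONE;
}

static const region_fn regions[] = { region0, region1, region2 };

/* Trampoline: only the first WI decides which continuation comes next.
   The remaining WIs run the current region and their return values are
   ignored, so the inner loop has no per-WI dispatch and is a plain
   candidate for vectorization. */
static void run_work_group(wi_state wg[WG_SIZE], int iters) {
    int next = 0;
    while (next != DONE) {
        region_fn f = regions[next];
        next = f(&wg[0], 0, iters);          /* choose once per WG */
        for (int lid = 1; lid < WG_SIZE; ++lid)
            (void)f(&wg[lid], lid, iters);   /* same region, unconditionally */
    }
}
```

A per-WI trampoline would instead store and switch on each WI's own return value. The version above is only valid because a conforming kernel cannot have WIs of one WG stopping at different barriers.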
