Excuse my confusion, but what do you mean by global addition? Something like c = a + b, where a, b, and c are vectors of single-precision floats in shared memory? Or is it double precision?

On 12 Aug 2015 22:15, "Joe Haywood" <[email protected]> wrote:
> I apologize in advance for asking the following questions because they are
> not directly related to pyopencl. Also, I realize opinions can be very
> diverse but I think you all might be able to help me. I am planning on
> purchasing a new laptop to have for programming at home. I am currently
> using a workstation with an NVIDIA 780 TI while at work. I have been able
> to get my pyopencl code to run at nearly the same speed as my CUDA code on
> this hardware. I have tried running the pyopencl code on an AMD FirePro
> V4800 and see serious speed degradation. According to the AMD profiler, the
> bottleneck is the global add. Also, a few websites suggest utilizing
> float4's would increase the speed, but programming float4s in this
> embarrassingly parallel Monte Carlo code is impractical due to branching.
> Further investigation using the old CompuBench website (early 2014 ish)
> confirmed the global addition on anything except NVIDIA was very slow. That
> was nearly 2 years ago. The compubench website no longer lists global add
> as an evaluation. So, in your experience is this still the case, that
> anything except Nvidia will be slow at global additions? Or have AMD and
> Intel "caught up"? I cannot find any laptops spec'd exactly the way I want,
> but the 2015 MacBook Pro is close. I just don't want to buy one and run the
> code and see it also suffers a terrible loss of speed. Finally, I noticed
> on the compubench website that the NVIDIA GTX 980M is equal or better than
> the GTX 780 TI in nearly all tests. If you have this hardware, can you
> confirm this with your own code? I can run some tests on my computer if
> someone has a 980M they would be willing to give me numbers on.
>
> Again, I apologize for being off topic, private messages might be best,
> and I appreciate your help.
>
> Thanks
> Reese
>
> Joe Reese Haywood, Ph.D., DABR
> Medical Physicist
> Johnson Family Cancer Center
> Mercy Health Muskegon
> 1440 E. Sherman Blvd, Suite 300
> Muskegon, MI 49444
> Phone: 231-672-2019
> Email: [email protected]
_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
