Excuse my confusion, but what do you mean by global addition? Do you mean
c = a + b, where a, b, and c are vectors of single-precision floats in shared
memory? Or is it double precision?
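
To make the question concrete, here is a minimal plain-Python sketch of the two readings of "global addition" I can think of. It is not PyOpenCL and not your code: the function names and the sample data (hits, dose) are made up for illustration, and the second reading (a tally that many work-items add into) is only my guess at what a Monte Carlo code might do.

```python
from array import array  # typecode "f" stores single-precision floats


def elementwise_add(a, b):
    """Reading 1: c[i] = a[i] + b[i], one independent output per work-item."""
    return array("f", (x + y for x, y in zip(a, b)))


def scatter_add(n_bins, bin_index, value):
    """Reading 2 (an assumption): many samples accumulate into a few bins.

    On a GPU each += here would have to be an atomic update to global
    memory, and updates that hit the same bin can end up serialized.
    """
    tally = array("f", [0.0] * n_bins)
    for i, v in zip(bin_index, value):
        tally[i] += v
    return tally


a = array("f", [1.0, 2.0, 3.0, 4.0])
b = array("f", [0.5, 0.5, 0.5, 0.5])
c = elementwise_add(a, b)

hits = [0, 2, 2, 1, 2]              # hypothetical bin index per sample
dose = [1.0, 0.25, 0.25, 2.0, 0.5]  # hypothetical value deposited per sample
tally = scatter_add(3, hits, dose)
```

If it is the first reading, every output element is written exactly once, so I would not expect it to be the bottleneck; if it is the second, the collisions on shared bins are the part I would look at.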
On 12 Aug 2015 22:15, "Joe Haywood" <[email protected]> wrote:

> I apologize in advance for asking the following questions because they are
> not directly related to pyopencl.  Also, I realize opinions can be very
> diverse but I think you all might be able to help me. I am planning on
> purchasing a new laptop to have for programming at home. I am currently
> using a workstation with an NVIDIA GTX 780 Ti while at work.  I have been able
> to get my pyopencl code to run at nearly the same speed as my CUDA code on
> this hardware. I have tried running the pyopencl code on an AMD FirePro
> V4800 and see serious speed degradation. According to the AMD profiler, the
> bottleneck is the global add. Also, a few websites suggest that using
> float4s would increase the speed, but programming float4s in this
> embarrassingly parallel Monte Carlo code is impractical due to branching.
> Further investigation using the old CompuBench website (around early 2014)
> confirmed that global addition on anything except NVIDIA was very slow. That
> was nearly 2 years ago. The CompuBench website no longer lists global add
> as an evaluation. So, in your experience, is it still the case that
> anything except NVIDIA will be slow at global additions? Or have AMD and
> Intel "caught up"? I cannot find any laptops spec'd exactly the way I want,
> but the 2015 MacBook Pro is close. I just don't want to buy one and run the
> code and see it also suffers a terrible loss of speed. Finally, I noticed
> on the CompuBench website that the NVIDIA GTX 980M is equal to or better than
> the GTX 780 Ti in nearly all tests. If you have this hardware, can you
> confirm this with your own code? I can run some tests on my computer if
> someone has a 980M they would be willing to give me numbers on.
>
> Again, I apologize for being off topic; private messages might be best,
> and I appreciate your help.
>
> Thanks
>
> Reese
>
> Joe Reese Haywood, Ph.D., DABR
> Medical Physicist
> Johnson Family Cancer Center
> Mercy Health Muskegon
> 1440 E. Sherman Blvd, Suite 300
> Muskegon, MI 49444
> Phone: 231-672-2019
> Email: [email protected]
>
> _______________________________________________
> PyOpenCL mailing list
> [email protected]
> http://lists.tiker.net/listinfo/pyopencl
>
>