No, UVA is not enabled on them; I already tried memcpy_dtod.
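
For reference, this is roughly the pattern from your [1] below that I
tried; a minimal sketch, assuming each process has its own context up, a
connected zmq socket `sock` on each side, and UVA-capable devices (which
mine apparently aren't):

    import pycuda.driver as drv

    nbytes = 4 * 1000000  # 1M float32

    # Producer: export an IPC handle for the allocation, not the data.
    d_src = drv.mem_alloc(nbytes)
    handle = drv.mem_get_ipc_handle(d_src)  # small, picklable token
    sock.send_pyobj((handle, nbytes))       # cheap: no array payload

    # Consumer: open the handle and copy device-to-device.
    handle, nbytes = sock.recv_pyobj()
    d_remote = drv.IPCMemoryHandle(handle)  # usable like a DeviceAllocation
    d_dst = drv.mem_alloc(nbytes)
    drv.memcpy_dtod(d_dst, d_remote, nbytes)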

In send_pyobj, the array would just be passed by reference though, right?
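
One quick way to check, I suppose: send_pyobj pickles its argument, so the
pickled size shows what actually crosses the socket (a minimal sketch):

    import pickle
    import numpy as np

    # send_pyobj(obj) is essentially send(pickle.dumps(obj)), so the
    # pickled size is what goes over the wire.
    small = np.zeros(1000, dtype=np.float32)
    large = np.zeros(1000000, dtype=np.float32)
    print(len(pickle.dumps(small)))  # a few KB: the whole buffer
    print(len(pickle.dumps(large)))  # ~4 MB: scales with the array, not a pointer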

Thanks

On Tuesday, November 17, 2015, Lev Givon <l...@columbia.edu> wrote:

> Received from Baskaran Sankaran on Tue, Nov 17, 2015 at 03:08:10PM EST:
> > @Lev, thanks for the tip; I will look into it.
> >
> > In the meantime, I am running into some speed issues. I notice that it
> > slows down progressively, by almost 50%, in just 7000 updates. It
> > starts at about 2.6 sec/mini-batch (average speed), but after 7000
> > mini-batches the time increases to 3.7 sec/mini-batch.
> >
> > I suspect that I may not be sending the host memory pointers but the
> > actual arrays, serialized by zmq's send_pyobj (see below in the code).
> > Could someone confirm whether I am doing it correctly? Should I just be
> > sending/receiving host memory pointers?
>
> You are transmitting the array contents. If you use IPC to send the GPU
> array pointers to both processes [1], you should be able to perform a
> device-to-device copy between the two memory locations even if you can't
> use P2P [2] (assuming that UVA is supported on both devices).
>
> [1] https://gist.github.com/e554b3985e196b07f93b
> [2] https://gist.github.com/3078644
> --
> Lev Givon
> Bionet Group | Neurokernel Project
> http://lebedov.github.io/
> http://neurokernel.github.io/
>
>
_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda