Great to see detailed information on this topic, Niketan; I guess I
missed it when you posted it initially.

Could you elaborate a little more on the programming model for when the
user wants to leverage the GPU? Also, today I can submit a job to Spark
using --jars and it will handle copying the dependencies to the worker
nodes. If my application wants to leverage the GPU, what extra
dependencies will be required on the worker nodes, and how are they
going to be installed/updated on the Spark cluster?



On Tue, May 3, 2016 at 1:26 PM, Niketan Pansare <npan...@us.ibm.com> wrote:

>
>
> Hi all,
>
> I have updated the design document for our GPU backend in the JIRA
> https://issues.apache.org/jira/browse/SYSTEMML-445. The implementation
> details are based on the prototype I created, which is available in PR
> https://github.com/apache/incubator-systemml/pull/131. Once we are done
> with the discussion, I can clean up and separate out the GPU backend
> into its own PR for easier review :)
>
> Here are the key design points. A GPU backend would implement two
> abstract classes:
>    1.   GPUContext
>    2.   GPUObject
>
>
>
> The GPUContext is responsible for GPU memory management and receives
> call-backs from SystemML's bufferpool on the following methods (a rough
> sketch follows the list):
>    1.   void acquireRead(MatrixObject mo)
>    2.   void acquireModify(MatrixObject mo)
>    3.   void release(MatrixObject mo, boolean isGPUCopyModified)
>    4.   void exportData(MatrixObject mo)
>    5.   void evict(MatrixObject mo)
>
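> For illustration, the abstract class would look roughly like this
> (imports omitted; MatrixObject is the existing SystemML cache object;
> only the five signatures above are fixed, the comments are just a
> sketch -- see the PR for the actual details):
>
>     public abstract class GPUContext {
>       // Called by the bufferpool before a GPU instruction reads mo;
>       // makes sure an up-to-date copy of the data exists on the device.
>       public abstract void acquireRead(MatrixObject mo);
>
>       // Called before a GPU instruction writes its output into mo.
>       public abstract void acquireModify(MatrixObject mo);
>
>       // Releases the matrix; isGPUCopyModified marks the device copy
>       // dirty so it is copied back before the host copy is used.
>       public abstract void release(MatrixObject mo, boolean isGPUCopyModified);
>
>       // Ensures the host copy is current before the matrix is exported.
>       public abstract void exportData(MatrixObject mo);
>
>       // Frees device memory for mo when the GPU runs out of space.
>       public abstract void evict(MatrixObject mo);
>     }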
>
>
> A GPUObject (like RDDObject and BroadcastObject) is stored in the
> CacheableData object. It contains the following methods, which are
> called back from the corresponding GPUContext (sketched after the
> list):
>    1.   void allocateMemoryOnDevice()
>    2.   void deallocateMemoryOnDevice()
>    3.   long getSizeOnDevice()
>    4.   void copyFromHostToDevice()
>    5.   void copyFromDeviceToHost()
>
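> Again, roughly (just a sketch; the PR has the actual code):
>
>     public abstract class GPUObject {
>       // Allocate / free the device-side buffer backing this matrix.
>       public abstract void allocateMemoryOnDevice();
>       public abstract void deallocateMemoryOnDevice();
>
>       // Size in bytes of the device allocation, used by the GPUContext
>       // for memory management and eviction decisions.
>       public abstract long getSizeOnDevice();
>
>       // Transfers between the host-side matrix and the device buffer.
>       public abstract void copyFromHostToDevice();
>       public abstract void copyFromDeviceToHost();
>     }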
>
>
> In the initial implementation, we will add JCudaContext and JCudaPointer,
> which will extend the above abstract classes respectively. The JCudaContext
> will be created by the ExecutionContextFactory depending on the
> user-specified accelerator. Analogous to MR/SPARK/CP, we will add a new
> ExecType, GPU, and implement GPU instructions.
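>
> To make the call-back protocol concrete, a GPU instruction would bracket
> its kernel launch with acquire/release calls, along these lines (the
> instruction, the getGPUContext() accessor, and the variable names are
> only illustrative, not the exact code in the PR):
>
>     // sketch of a GPU matrix-multiply instruction
>     public void processInstruction(ExecutionContext ec) {
>       MatrixObject in1 = ec.getMatrixObject(input1.getName());
>       MatrixObject in2 = ec.getMatrixObject(input2.getName());
>       MatrixObject out = ec.getMatrixObject(output.getName());
>
>       GPUContext gpu = ec.getGPUContext();  // illustrative accessor
>       gpu.acquireRead(in1);
>       gpu.acquireRead(in2);
>       gpu.acquireModify(out);
>       // ... launch the GEMM kernel via JCuda on the device copies ...
>       gpu.release(in1, false);
>       gpu.release(in2, false);
>       gpu.release(out, true);  // the output's device copy is now dirty
>     }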
>
> The above design is general enough that other people can implement
> custom accelerators (for example, OpenCL), and it also follows the
> design principles of our CP bufferpool.
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>



-- 
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/
