[ https://issues.apache.org/jira/browse/SPARK-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163712#comment-14163712 ]
RJ Nowling commented on SPARK-3785:
-----------------------------------

Part of my graduate work involved implementing physics simulations on GPUs and managing multi-user GPU clusters.

From a performance perspective, we saw 100x+ speedups on a single machine with a GPU versus multiple cores, using specialized GPU implementations such as OpenMM or Gromacs. But these were hand-optimized GPU implementations, pipelined to prevent unnecessary host/GPU copies and to do as much work as possible on the GPU.

For clusters, we'd see only a 2-5x speedup due to communication overhead between the host/GPU and other nodes. In those cases, you could only run a few iterations on the GPU before you had to communicate with other nodes.

Thus, GPUs are great if you're doing computation with a hand-optimized GPU implementation that runs for long periods before communicating outside the GPU. But I think you won't get much of a performance improvement from simple operations (like RDD operations) without explicit (and challenging) pipeline-optimization work.

I think the most practical case for Spark/GPU integration is jobs involving large chunks of image processing, rendering, linear algebra, etc. that can be done independently in each task. For example, Naive Bayes where the number of features is large enough to fit on the GPUs in a single node but there are many, many samples to classify. In this case, you may be able to use a GPU linear algebra library to do the GPU operations and move data asynchronously, in large chunks, to reduce the transfer overhead.

Further, GPU scheduling is immature: there is very little isolation, GPUs often get into bad states that require machine reboots, and there is no OS support, so scheduling is mostly done by each application. It's like MacOS 9 -- you have to hope each process is a responsible citizen. I think that would end up being a huge distraction for Spark's developers.
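The 100x-vs-2-5x gap described above is essentially an overhead-amortization effect, and can be sketched with a toy cost model. All numbers and the function name here are illustrative assumptions, not measurements from OpenMM, Gromacs, or Spark:

```python
def effective_speedup(compute_s, raw_speedup, transfer_s):
    """Toy model: the GPU shrinks compute time by raw_speedup, but each
    round of host/GPU and inter-node communication adds transfer_s of
    time that the GPU cannot hide."""
    gpu_total = compute_s / raw_speedup + transfer_s
    return compute_s / gpu_total

# Long-running, hand-pipelined kernel: transfers are negligible,
# so nearly the full raw speedup survives.
print(round(effective_speedup(compute_s=100.0, raw_speedup=100.0, transfer_s=0.1), 1))  # 90.9

# Cluster case: only a few iterations fit between communication rounds,
# so transfer time dominates and the win collapses into the 2-5x range.
print(round(effective_speedup(compute_s=10.0, raw_speedup=100.0, transfer_s=3.0), 1))   # 3.2
```

This is the same logic behind moving data "asynchronously and in large chunks": growing compute_s per transfer pushes the effective speedup back toward the raw one.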
I think [~srowen]'s point about calling GPU libraries from your Spark driver is probably the most practical solution.

> Support off-loading computations to a GPU
> -----------------------------------------
>
> Key: SPARK-3785
> URL: https://issues.apache.org/jira/browse/SPARK-3785
> Project: Spark
> Issue Type: Brainstorming
> Components: MLlib
> Reporter: Thomas Darimont
> Priority: Minor
>
> Are there any plans to add support for off-loading computations to the
> GPU, e.g. via an OpenCL binding?
> http://www.jocl.org/
> https://code.google.com/p/javacl/
> http://lwjgl.org/wiki/index.php?title=OpenCL_in_LWJGL

--
This message was sent by Atlassian JIRA (v6.3.4#6332)