Hi Alexander,

Thank you for your interest. We used an LR program derived from a Spark sample program (not from mllib or ml):
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/SparkLR.scala

Here are the Scala source files for the GPU and non-GPU versions.
GPU: https://github.com/kiszk/spark-gpu/blob/dev/examples/src/main/scala/org/apache/spark/examples/SparkGPULR.scala
non-GPU: https://github.com/kiszk/spark-gpu/blob/dev/examples/src/main/scala/org/apache/spark/examples/SparkLR.scala

Best Regards,
Kazuaki Ishizaki

From: "Ulanov, Alexander" <alexander.ula...@hpe.com>
To: Kazuaki Ishizaki/Japan/IBM@IBMJP, "dev@spark.apache.org" <dev@spark.apache.org>
Date: 2016/01/05 06:13
Subject: RE: Support off-loading computations to a GPU

Hi Kazuaki,

Sounds very interesting! Could you elaborate on your benchmark with regard to logistic regression (LR)? Did you compare your implementation with the current implementation of LR in Spark?

Best regards,
Alexander

From: Kazuaki Ishizaki [mailto:ishiz...@jp.ibm.com]
Sent: Sunday, January 03, 2016 7:52 PM
To: dev@spark.apache.org
Subject: Support off-loading computations to a GPU

Dear all,

We reopened the existing JIRA entry https://issues.apache.org/jira/browse/SPARK-3785 to support off-loading computations to a GPU by adding a description of our prototype. We are working to exploit GPUs on Spark effectively and easily at http://github.com/kiszk/spark-gpu. Please also visit our project page http://kiszk.github.io/spark-gpu/.

For now, we have added a new format for a partition in an RDD, which is a column-based structure in an array format, in addition to the current Iterator[T] format with Seq[T]. This reduces data serialization/deserialization and copy overhead between the CPU and the GPU.

Our prototype achieved more than a 3x performance improvement for a simple logistic regression program using an NVIDIA K40 card.

This JIRA entry (SPARK-3785) includes a link to a design document. We would be very glad to hear valuable feedback/suggestions/comments and to have great discussions about exploiting GPUs in Spark.

Best Regards,
Kazuaki Ishizaki
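
Note on the column-based partition format mentioned above: the following is a minimal sketch of the general idea, written in Python for brevity rather than in the prototype's Scala. It is not the prototype's actual data structure or API. The names DataPoint, ColumnarPartition, to_columnar, and row_at are hypothetical and chosen only for illustration. The sketch shows how a sequence of row objects can be flattened into one contiguous array per field, so that each field could be handed to a device in a single bulk copy instead of being serialized row by row.

```python
from array import array
from typing import List, NamedTuple


class DataPoint(NamedTuple):
    """Hypothetical row type: a feature vector x and a label y."""
    x: List[float]
    y: float


class ColumnarPartition(NamedTuple):
    """Hypothetical column-based partition: one flat array per field."""
    x: array  # all feature values, row after row; length = n_rows * dim
    y: array  # one label per row
    dim: int  # number of features per row


def to_columnar(rows: List[DataPoint]) -> ColumnarPartition:
    """Flatten row objects into contiguous per-field arrays of doubles."""
    dim = len(rows[0].x) if rows else 0
    xs = array("d")
    ys = array("d")
    for row in rows:
        if len(row.x) != dim:
            raise ValueError("all rows must have the same number of features")
        xs.extend(row.x)
        ys.append(row.y)
    return ColumnarPartition(xs, ys, dim)


def row_at(part: ColumnarPartition, i: int) -> DataPoint:
    """Rebuild row i from the columnar layout (the Iterator[T]-style view)."""
    start = i * part.dim
    return DataPoint(list(part.x[start:start + part.dim]), part.y[i])
```

In this layout, part.x.tobytes() yields a single contiguous buffer for all feature values, which is the property that makes a bulk host-to-device transfer possible. How the prototype actually stores and transfers its columns is described in the design document linked from SPARK-3785.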