I think Xiangrui's ALS code implements certain aspects of it. You may want to
check it out.
Best regards,
Wei
-
Wei Tan, PhD
Research Staff Member
IBM T. J. Watson Research Center
From: Xiangrui Meng men...@gmail.com
To: Duy Huynh duy.huynh@gmail.com
Thank you Debasish.
I am fine with either Scala or Java. I would like to get a quick
evaluation on the performance gain, e.g., ALS on GPU. I would like to try
whichever library does the business :)
Best regards,
Wei
Thank you all. Actually I was looking at JCUDA. Function-wise this may be
a perfect solution for offloading computation to the GPU. We will see how it
performs, especially with the Java binding.
Best regards,
Wei
Hi, I am trying to find a CUDA library in Scala, to see if some of the matrix
manipulation in MLlib can be sped up.
I googled a bit but found no active Scala+CUDA projects. Python is
supported by CUDA, though. Any suggestions on whether this idea makes
sense?
Best regards,
Wei
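Since JCUDA itself needs a CUDA-capable machine, the following is only a CPU
reference sketch of the kind of dense multiply one would offload; the comments
note where the GPU call would go. The matrix sizes and the helper method are
illustrative assumptions, not anyone's actual code from this thread.

```java
// CPU-side sketch of the dense multiply one might offload via JCUDA.
// With JCUDA, this operation would instead be handed to the GPU by copying
// a and b into device memory and invoking cuBLAS's sgemm through the Java
// binding, rather than running these loops on the CPU.
public class GemmSketch {

    // Naive single-threaded C = A * B on row-major float arrays (n x n).
    static float[] multiply(float[] a, float[] b, int n) {
        float[] c = new float[n * n];
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < n; k++) {
                float aik = a[i * n + k];
                for (int j = 0; j < n; j++) {
                    c[i * n + j] += aik * b[k * n + j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        // 2x2 example: [[1,2],[3,4]] * [[5,6],[7,8]] = [[19,22],[43,50]]
        float[] a = {1, 2, 3, 4};
        float[] b = {5, 6, 7, 8};
        float[] c = multiply(a, b, 2);
        System.out.println(c[0] + " " + c[1] + " " + c[2] + " " + c[3]);
    }
}
```

A benchmark would compare these loops against the GPU path at sizes typical
of the ALS factor matrices, since the host-to-device copy dominates for small
inputs.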
. Any idea on which
method is better?
Thanks!
Wei
-
Wei Tan, PhD
Research Staff Member
IBM T. J. Watson Research Center
http://researcher.ibm.com/person/us-wtan
From: Xiangrui Meng men...@gmail.com
To: Wei Tan/Watson/IBM@IBMUS,
Cc: user@spark.apache.org
Hi Deb, thanks for sharing your result. Please find my comments inline in
blue.
Best regards,
Wei
From: Debasish Das debasish.da...@gmail.com
To: Wei Tan/Watson/IBM@IBMUS,
Cc: Xiangrui Meng men...@gmail.com, user@spark.apache.org
Date: 08/17/2014 08:15 PM
Thanks for sharing your experience. I have had the same experience -- multiple
moderate JVMs beat a single huge JVM.
Besides the minor JVM startup overhead, is it always better to have
multiple JVMs rather than a single one?
Best regards,
Wei
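In Spark standalone mode, the multiple-moderate-JVM layout can be configured
in conf/spark-env.sh. A minimal sketch, assuming one large-memory box; the
sizes below are illustrative, not a recommendation:

```
# conf/spark-env.sh
SPARK_WORKER_INSTANCES=4   # several moderate workers instead of one huge one
SPARK_WORKER_MEMORY=24g    # heap per worker; keep the total below physical RAM
SPARK_WORKER_CORES=8       # cores per worker
```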
the two reduceByKey stages run in parallel given sufficient
capacity?
Best regards,
Wei
From: Sean Owen so...@cloudera.com
To: user@spark.apache.org,
Date: 07/15/2014 04:37 PM
Subject: Re: parallel stages?
The last two lines
Just curious: how about using Scala to drive the workflow? I guess if you
use other tools (Oozie, etc.) you lose the advantage of reading from an RDD --
you have to read from HDFS.
Best regards,
Wei
cache?
I will try more workers so that each JVM has a smaller heap.
Best regards,
Wei
From: Gaurav Jain ja...@student.ethz.ch
To: u
Thank you all for the advice, including (1) using the CMS GC, (2) using
multiple worker instances, and (3) using Tachyon.
I will try (1) and (2) first and report back what I find.
I will also try JDK 7 with the G1 GC.
Best regards,
Wei
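The GC choices above can be expressed as Spark configuration. A sketch using
the standard spark.executor.extraJavaOptions setting with stock HotSpot flags;
which collector wins is workload-dependent:

```
# conf/spark-defaults.conf
# (1) the CMS collector:
spark.executor.extraJavaOptions  -XX:+UseConcMarkSweepGC
# ...or, when trying JDK 7's G1 instead:
# spark.executor.extraJavaOptions  -XX:+UseG1GC
```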
BTW: nowadays a single machine with huge RAM (200 GB to 1 TB) is really
common. With virtualization you lose some performance. It would be ideal
to see some best practices on how to use Spark on these state-of-the-art
machines...
Best regards,
Wei
org.apache.spark.deploy.SparkSubmit spark-shell --class
org.apache.spark.repl.Main
Best regards,
Wei
    </version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>1.2.1</version>
  </dependency>
Best regards,
Wei
application (like wordcount)
and debug quickly on a remote Spark instance?
Thanks!
Wei
to run it in Hadoop. It is fairly complex and relies on
a lot of utility Java classes I wrote. Can I reuse the map function in
Java and port it to Spark?
Best regards,
Wei
Hello,
I am trying to use Spark in the following scenario:
I have code written for Hadoop and am now trying to migrate to Spark. The
mappers and reducers are fairly complex, so I wonder if I can reuse the
map() functions I already wrote for Hadoop (in Java) and use Spark to chain
them, mixing the Java
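One common way to reuse Hadoop map() logic in Spark (a sketch, not necessarily
the approach taken in this thread) is to factor the per-record transformation
out of the Mapper into a plain static method, then reference it from both the
Hadoop Mapper and a Spark flatMap. The extractWords helper below is a
hypothetical stand-in for the existing utility classes:

```java
import java.util.*;

// Sketch: share per-record logic between a Hadoop Mapper and a Spark job.
// extractWords is a hypothetical stand-in for the reusable mapper logic.
public class ReuseMapper {

    // Pure per-record logic, free of Hadoop Context/Writable types, so it
    // can be called from either framework.
    static List<String> extractWords(String line) {
        List<String> words = new ArrayList<>();
        for (String w : line.toLowerCase().split("\\s+")) {
            if (!w.isEmpty()) words.add(w);
        }
        return words;
    }

    // In Hadoop, the Mapper's map() would call extractWords(value.toString())
    // and emit each word. In Spark, the same method can be passed directly:
    //   lines.flatMap(line -> ReuseMapper.extractWords(line).iterator())
    public static void main(String[] args) {
        System.out.println(extractWords("Hello  Spark hello"));
        // prints [hello, spark, hello]
    }
}
```

The key design point is keeping the shared method free of Hadoop types
(Context, Text, Writable), so only thin adapters differ between the two
frameworks.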