Re: breeze DGEMM slow in spark

2014-05-18 Thread wxhsdp
Hi Xiangrui,
  I checked the stderr of the worker node, and yes, it failed to load the
  implementation from:
  com.github.fommil.netlib.NativeSystemBLAS...

  What do you mean by including breeze-natives or netlib:all?

  Things I've already done:
  1. added the breeze and breeze-natives dependencies to the sbt build file
  2. downloaded all breeze jars to the slaves
  3. added the jars to the classpath on the slaves
  4. ran ln -s libopenblas_nehalemp-r0.2.9.rc2.so libblas.so.3 and added it to
     LD_LIBRARY_PATH on the slaves

  Thank you for your help.
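
  For reference, "include breeze-natives or netlib:all" most likely refers to
  build dependencies along the lines of the sketch below. The versions are
  assumptions inferred from the jar names that appear elsewhere in this
  thread (breeze 0.7, netlib-java core 1.1.2), so adjust them to your build.

```scala
// build.sbt (sketch): versions are assumptions taken from jar names in this thread
libraryDependencies ++= Seq(
  "org.scalanlp" %% "breeze"         % "0.7",
  // brings in the netlib-java native wrappers (native_system and native_ref)
  "org.scalanlp" %% "breeze-natives" % "0.7"
)

// Alternative to breeze-natives: the netlib-java aggregate artifact.
// libraryDependencies += "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly()
```

  Either option only puts the JNI wrappers on the classpath. The system
  libblas.so.3 still has to be resolvable on every worker for
  NativeSystemBLAS to load.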



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/breeze-DGEMM-slow-in-spark-tp5950p5977.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.


Re: breeze DGEMM slow in spark

2014-05-18 Thread wxhsdp
Hi Xiangrui,

  You said:
  > It doesn't work if you put the netlib-native jar inside an assembly
  > jar. Try to mark it provided in the dependencies, and use --jars to
  > include them with spark-submit. -Xiangrui

  I'm not using an assembly jar that contains everything. I also marked the
  breeze dependencies as provided, manually downloaded the jars, and added
  them to the slave classpath, but it still doesn't work :(





Re: breeze DGEMM slow in spark

2014-05-18 Thread wxhsdp
ok

Spark Executor Command: java -cp
:/root/ephemeral-hdfs/conf:/root/.ivy2/cache/org.scala-lang/scala-library/jars/scala-library-2.10.4.jar:/root/.ivy2/cache/org.scalanlp/breeze_2.10/jars/breeze_2.10-0.7.jar:/root/.ivy2/cache/org.scalanlp/breeze-macros_2.10/jars/breeze-macros_2.10-0.3.jar:/root/.sbt/boot/scala-2.10.3/lib/scala-reflect.jar:/root/.ivy2/cache/com.thoughtworks.paranamer/paranamer/jars/paranamer-2.2.jar:/root/.ivy2/cache/com.github.fommil.netlib/core/jars/core-1.1.2.jar:/root/.ivy2/cache/net.sourceforge.f2j/arpack_combined_all/jars/arpack_combined_all-0.1.jar:/root/.ivy2/cache/net.sourceforge.f2j/arpack_combined_all/jars/arpack_combined_all-0.1-javadoc.jar:/root/.ivy2/cache/net.sf.opencsv/opencsv/jars/opencsv-2.3.jar:/root/.ivy2/cache/com.github.rwl/jtransforms/jars/jtransforms-2.4.0.jar:/root/.ivy2/cache/junit/junit/jars/junit-4.8.2.jar:/root/.ivy2/cache/org.apache.commons/commons-math3/jars/commons-math3-3.2.jar:/root/.ivy2/cache/org.spire-math/spire_2.10/jars/spire_2.10-0.7.1.jar:/root/.ivy2/cache/org.spire-math/spire-macros_2.10/jars/spire-macros_2.10-0.7.1.jar:/root/.ivy2/cache/com.typesafe/scalalogging-slf4j_2.10/jars/scalalogging-slf4j_2.10-1.0.1.jar:/root/.ivy2/cache/org.slf4j/slf4j-api/jars/slf4j-api-1.7.2.jar:/root/.ivy2/cache/org.scalanlp/breeze-natives_2.10/jars/breeze-natives_2.10-0.7.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_ref-osx-x86_64/jars/netlib-native_ref-osx-x86_64-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/native_ref-java/jars/native_ref-java-1.1.jar:/root/.ivy2/cache/com.github.fommil/jniloader/jars/jniloader-1.1.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_ref-linux-x86_64/jars/netlib-native_ref-linux-x86_64-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_ref-linux-i686/jars/netlib-native_ref-linux-i686-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_ref-win-x86_64/jars/netlib-native_ref-win-x86_64-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-na
tive_ref-win-i686/jars/netlib-native_ref-win-i686-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_ref-linux-armhf/jars/netlib-native_ref-linux-armhf-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_system-osx-x86_64/jars/netlib-native_system-osx-x86_64-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/native_system-java/jars/native_system-java-1.1.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_system-linux-x86_64/jars/netlib-native_system-linux-x86_64-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_system-linux-i686/jars/netlib-native_system-linux-i686-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_system-linux-armhf/jars/netlib-native_system-linux-armhf-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_system-win-x86_64/jars/netlib-native_system-win-x86_64-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_system-win-i686/jars/netlib-native_system-win-i686-1.1-natives.jar
::/root/spark/conf:/root/spark/assembly/target/scala-2.10/spark-assembly-1.0.0-hadoop1.0.4.jar
-Xms4096M -Xmx4096M





breeze DGEMM slow in spark

2014-05-17 Thread wxhsdp
Dear all,
  I'm testing double-precision matrix multiplication in Spark on EC2
  m1.large machines.
  I use the breeze linalg library, which internally calls a native
  library (OpenBLAS, Nehalem kernel, single-threaded).
m1.large:
model name  : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
cpu MHz : 1795.672
model name  : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
cpu MHz : 1795.672

os:
Linux ip-172-31-24-33 3.4.37-40.44.amzn1.x86_64 #1 SMP Thu Mar 21 01:17:08
UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

  Here's my test code:

  import breeze.linalg.DenseMatrix
  import scala.Array.ofDim
  import scala.util.Random

  def main(args: Array[String]) {
    val n = args(0).toInt     // matrix dimension
    val loop = args(1).toInt  // number of multiplications

    val ranGen = new Random

    // one flat array of n*n random doubles per iteration
    var arr = ofDim[Double](loop, n * n)

    for (i <- 0 until loop)
      for (j <- 0 until n * n) {
        arr(i)(j) = ranGen.nextDouble()
      }

    var time0 = System.currentTimeMillis()
    println("init time = " + time0)

    var c = new DenseMatrix[Double](n, n)

    var time1 = System.currentTimeMillis()
    println("start time = " + time1)

    for (i <- 0 until loop) {
      // both operands wrap the same backing array
      var a = new DenseMatrix[Double](n, n, arr(i))
      var b = new DenseMatrix[Double](n, n, arr(i))

      c :+= (a * b)
    }

    var time2 = System.currentTimeMillis()
    println("stop time = " + time2)
    println("init time = " + (time1 - time0))
    println("used time = " + (time2 - time1))
  }

  The two n=3584 matrix mult takes about 14s with the above test code. But
  when I put the matrix mult part in a Spark mapPartitions function:

  val b = a.mapPartitions { itr =>
    val arr = itr.toArray

    // timestamp here
    var a = new DenseMatrix[Double](n, n, arr)
    var b = new DenseMatrix[Double](n, n, arr)

    val c = a * b

    // timestamp here
    c.toIterator
  }

  the two n=3584 matrix mult takes about 50s!
  There's a shuffle operation before the matrix mult in Spark. During the
  shuffle phase the aggregated data is kept in memory on the reduce side;
  there is no spill to disk. So both cases are in-memory matrix mults, both
  have enough memory, and GC time is really small.

  So why is case 2 about 3.5x slower than case 1? Has anyone seen this
  before, and what DGEMM performance do you get in Spark? Thanks for any
  advice.
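
  To put the two timings on a common scale, they can be converted into a
  sustained floating-point rate. The sketch below uses the standard estimate
  of 2*n^3 operations for a dense n x n product. The object and method names
  are hypothetical, and it assumes each reported time covers a single
  product (pass 2 as the second argument if the run used loop = 2).

```scala
object DgemmRate {
  // A dense (n x n) * (n x n) product performs about 2 * n^3 floating-point operations.
  def flops(n: Int): Double = 2.0 * n.toDouble * n.toDouble * n.toDouble

  // Sustained rate in GFLOP/s for `multiplications` products finished in `seconds`.
  def gflops(n: Int, multiplications: Int, seconds: Double): Double =
    multiplications * flops(n) / seconds / 1e9

  def main(args: Array[String]): Unit = {
    val n = 3584
    val standalone = gflops(n, 1, 14.0) // case 1: plain JVM program
    val inSpark = gflops(n, 1, 50.0)    // case 2: inside mapPartitions
    println(f"standalone: $standalone%.2f GFLOP/s")
    println(f"in Spark:   $inSpark%.2f GFLOP/s")
    println(f"slowdown:   ${standalone / inSpark}%.2fx")
  }
}
```

  Under that assumption the standalone run sustains roughly 6.6 GFLOP/s and
  the Spark run roughly 1.8 GFLOP/s. A gap of that size is worth comparing
  against what a pure-Java BLAS fallback would deliver on the same machine.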
  





Re: breeze DGEMM slow in spark

2014-05-17 Thread wxhsdp
I think it may be related to m1.large, because I also tested on my laptop,
where the two cases take nearly the same amount of time.

my laptop:
model name  : Intel(R) Core(TM) i5-3380M CPU @ 2.90GHz
cpu MHz : 2893.549

os:
Linux ubuntu 3.11.0-12-generic #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 2013
x86_64 x86_64 x86_64 GNU/Linux






Re: breeze DGEMM slow in spark

2014-05-17 Thread Xiangrui Meng
You need to include breeze-natives or netlib:all to load the native
libraries. Check the log messages to ensure native libraries are used,
especially on the worker nodes. The easiest way to use OpenBLAS is
copying the shared library to /usr/lib/libblas.so.3 and
/usr/lib/liblapack.so.3. -Xiangrui
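
One way to check which implementation was actually loaded, besides reading
the logs, is to ask netlib-java directly on the driver and on the executors.
This is a minimal sketch: the object name and the slot count are
hypothetical, and it assumes netlib-java and Spark are on the classpath.

```scala
import com.github.fommil.netlib.BLAS
import org.apache.spark.{SparkConf, SparkContext}

object BlasCheck {
  // Class name of the BLAS backend netlib-java selected in the current JVM.
  def localImplementation: String = BLAS.getInstance().getClass.getName

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("BlasCheck"))
    println(s"driver: $localImplementation")

    // Hypothetical value: use at least the total number of executor cores
    // so that every worker is likely to run one of the tasks.
    val slots = 16
    sc.parallelize(1 to slots, slots)
      .map { _ =>
        // Evaluated inside the executor JVM, not on the driver.
        (java.net.InetAddress.getLocalHost.getHostName,
          BLAS.getInstance().getClass.getName)
      }
      .distinct()
      .collect()
      .foreach { case (host, impl) => println(s"$host: $impl") }

    sc.stop()
  }
}
```

A result of com.github.fommil.netlib.NativeSystemBLAS means the system
library was picked up. com.github.fommil.netlib.F2jBLAS means that JVM fell
back to the pure-Java implementation.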

On Sat, May 17, 2014 at 8:02 PM, wxhsdp wxh...@gmail.com wrote:
> i think maybe it's related to m1.large, because i also tested on my laptop,
> the two case cost nearly
> the same amount of time.
>
> my laptop:
> model name  : Intel(R) Core(TM) i5-3380M CPU @ 2.90GHz
> cpu MHz : 2893.549
>
> os:
> Linux ubuntu 3.11.0-12-generic #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 2013
> x86_64 x86_64 x86_64 GNU/Linux



