Dear all,
  I'm testing double-precision matrix multiplication in Spark on EC2
m1.large machines.
  I use the Breeze linalg library, which internally calls a native BLAS
(single-threaded OpenBLAS, Nehalem build).
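
  For reference, here is a minimal sketch of how one can check which BLAS
implementation actually gets loaded, assuming Breeze's netlib-java backend
(the com.github.fommil.netlib classes):

    import com.github.fommil.netlib.BLAS

    object CheckBlas {
      def main(args: Array[String]) {
        // Prints e.g. NativeSystemBLAS when a native library is picked up,
        // or F2jBLAS when netlib-java falls back to pure Java.
        println(BLAS.getInstance().getClass.getName)
      }
    }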

m1.large:
model name      : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
cpu MHz         : 1795.672
model name      : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
cpu MHz         : 1795.672

os:
Linux ip-172-31-24-33 3.4.37-40.44.amzn1.x86_64 #1 SMP Thu Mar 21 01:17:08
UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

  Here's my test code (it lives in a plain Scala object; these are the
imports it needs):

  import scala.util.Random
  import scala.Array.ofDim
  import breeze.linalg.DenseMatrix

  def main(args: Array[String]) {

    val n = args(0).toInt      // matrix dimension
    val loop = args(1).toInt   // number of multiplications to run

    val ranGen = new Random

    // 'loop' flattened n*n arrays of random doubles
    val arr = ofDim[Double](loop, n*n)

    for(i <- 0 until loop)
      for(j <- 0 until n*n) {
        arr(i)(j) = ranGen.nextDouble()
      }

    val time0 = System.currentTimeMillis()
    println("init time = "+time0)

    // n x n accumulator for the products
    val c = new DenseMatrix[Double](n,n)

    val time1 = System.currentTimeMillis()
    println("start time = "+time1)

    for(i <- 0 until loop) {
      val a = new DenseMatrix[Double](n,n,arr(i))
      val b = new DenseMatrix[Double](n,n,arr(i))

      c :+= (a * b)   // DGEMM through the native BLAS, accumulated in place
    }

    val time2 = System.currentTimeMillis()
    println("stop time = "+time2)
    println("init time = "+(time1-time0))
    println("used time = "+(time2-time1))
  }

  Multiplying two n=3584 matrices takes about 14 s with the test code above.
But when I put the matrix-multiplication part inside Spark's mapPartitions
function:

  val b = a.mapPartitions { itr =>
    val arr = itr.toArray   // the n*n doubles of this partition

    // timestamp here
    val m1 = new DenseMatrix[Double](n, n, arr)
    val m2 = new DenseMatrix[Double](n, n, arr)

    val c = m1 * m2

    // timestamp here
    c.data.iterator
  }
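
  The "timestamp here" markers look roughly like this in the real job (just a
sketch; the println ends up in the executor's stdout log, and n is captured
from the driver):

    val b = a.mapPartitions { itr =>
      val arr = itr.toArray

      val t0 = System.currentTimeMillis()
      val m1 = new DenseMatrix[Double](n, n, arr)
      val m2 = new DenseMatrix[Double](n, n, arr)
      val c = m1 * m2
      val t1 = System.currentTimeMillis()

      // per-partition DGEMM time, visible in the executor logs
      println("dgemm time = " + (t1 - t0) + " ms")
      c.data.iterator
    }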

  the same multiplication of two n=3584 matrices takes about 50 s!
  There is a shuffle before the matrix multiplication in the Spark job, but
during the shuffle phase the aggregated data stay in memory on the reduce
side; nothing spills to disk. So both cases are purely in-memory matrix
multiplications, both have plenty of memory, and GC time is really small.
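
  (For completeness, a rough sketch of how one can double-check GC on the
executors; this assumes Spark 1.0's spark.executor.extraJavaOptions setting,
and the flags are just standard HotSpot GC-logging options:)

    import org.apache.spark.{SparkConf, SparkContext}

    // GC pauses then show up in each executor's stdout/stderr log,
    // in addition to the per-task GC time column in the web UI.
    val conf = new SparkConf()
      .setAppName("dgemm-test")
      .set("spark.executor.extraJavaOptions",
           "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
    val sc = new SparkContext(conf)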

  So why is case 2 about 3.5x slower than case 1? Has anyone run into this
before, and what DGEMM performance do you see in Spark? Thanks for any
advice.
  


