[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2016-05-19 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/8648


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-10-17 Thread feynmanliang

Github user feynmanliang commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-148893887
  
@mengxr I've added benchmarks. Could you make another pass when you have a chance?



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-10-10 Thread feynmanliang

Github user feynmanliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/8648#discussion_r41700924
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -260,127 +263,126 @@ private[ann] trait ActivationFunction extends Serializable {
 }
 
 /**
- * Implements in-place application of functions
+ * Implements in-place application of functions.
  */
 private[ann] object ActivationFunction {
 
-  def apply(x: BDM[Double], y: BDM[Double], func: Double => Double): Unit = {
-var i = 0
-while (i < x.rows) {
-  var j = 0
-  while (j < x.cols) {
-y(i, j) = func(x(i, j))
-j += 1
-  }
-  i += 1
-}
+  def apply(x: BDM[Double], y: BDM[Double], func: UFunc with MappingUFunc)(
+  implicit impl: func.Impl[BDM[Double], BDM[Double]]): Unit = {
+y := func(x)
   }
 
   def apply(
-x1: BDM[Double],
-x2: BDM[Double],
-y: BDM[Double],
-func: (Double, Double) => Double): Unit = {
-var i = 0
-while (i < x1.rows) {
-  var j = 0
-  while (j < x1.cols) {
-y(i, j) = func(x1(i, j), x2(i, j))
-j += 1
-  }
-  i += 1
-}
+  x1: BDM[Double],
+  x2: BDM[Double],
+  y: BDM[Double],
+  func: UFunc with MappingUFunc)(
+  implicit impl: func.Impl2[BDM[Double], BDM[Double], BDM[Double]]): Unit = {
+y := func(x1, x2)
   }
 }
 
 /**
- * Implements SoftMax activation function
+ * Implements Softmax activation function.
  */
 private[ann] class SoftmaxFunction extends ActivationFunction {
   override def eval(x: BDM[Double], y: BDM[Double]): Unit = {
-var j = 0
-// find max value to make sure later that exponent is computable
-while (j < x.cols) {
-  var i = 0
-  var max = Double.MinValue
-  while (i < x.rows) {
-if (x(i, j) > max) {
-  max = x(i, j)
-}
-i += 1
-  }
-  var sum = 0.0
-  i = 0
-  while (i < x.rows) {
-val res = Math.exp(x(i, j) - max)
-y(i, j) = res
-sum += res
-i += 1
-  }
-  i = 0
-  while (i < x.rows) {
-y(i, j) /= sum
-i += 1
-  }
-  j += 1
+(0 until x.cols).foreach { j =>
+  // subtract max value to prevent overflow during exp
+  // does not affect correctness since we normalize right after
+  val maxVal = Bmax(x(::, j))
+  y(::, j) := breeze.numerics.exp(x(::, j) - maxVal)
+  y(::, j) :/= Bsum(y(::, j))
--- End diff --

@mengxr [Local benchmarks here](https://gist.github.com/feynmanliang/bc64b82a1258c4e86b9a).
Performance is more or less equivalent.
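
The diff comment above states that subtracting the column maximum does not affect correctness because the result is normalized right after. A short derivation of that claim, where $m$ is notation introduced here for the column maximum (`maxVal` in the diff):

```latex
\[
\operatorname{softmax}(x)_i
  = \frac{e^{x_i}}{\sum_{k} e^{x_k}}
  = \frac{e^{-m}\, e^{x_i}}{e^{-m} \sum_{k} e^{x_k}}
  = \frac{e^{x_i - m}}{\sum_{k} e^{x_k - m}},
\qquad m = \max_{k} x_k .
\]
```

Since $x_i - m \le 0$ for every $i$, each exponential is at most 1, so `exp` cannot overflow. The denominator is at least 1 because the maximal entry contributes $e^{0} = 1$, so the division is safe.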



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-22 Thread mengxr

Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/8648#discussion_r40146842
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -260,127 +263,126 @@ private[ann] trait ActivationFunction extends Serializable {
 }
 
 /**
- * Implements in-place application of functions
+ * Implements in-place application of functions.
  */
 private[ann] object ActivationFunction {
 
-  def apply(x: BDM[Double], y: BDM[Double], func: Double => Double): Unit = {
-var i = 0
-while (i < x.rows) {
-  var j = 0
-  while (j < x.cols) {
-y(i, j) = func(x(i, j))
-j += 1
-  }
-  i += 1
-}
+  def apply(x: BDM[Double], y: BDM[Double], func: UFunc with MappingUFunc)(
+  implicit impl: func.Impl[BDM[Double], BDM[Double]]): Unit = {
+y := func(x)
   }
 
   def apply(
-x1: BDM[Double],
-x2: BDM[Double],
-y: BDM[Double],
-func: (Double, Double) => Double): Unit = {
-var i = 0
-while (i < x1.rows) {
-  var j = 0
-  while (j < x1.cols) {
-y(i, j) = func(x1(i, j), x2(i, j))
-j += 1
-  }
-  i += 1
-}
+  x1: BDM[Double],
+  x2: BDM[Double],
+  y: BDM[Double],
+  func: UFunc with MappingUFunc)(
+  implicit impl: func.Impl2[BDM[Double], BDM[Double], BDM[Double]]): Unit = {
+y := func(x1, x2)
   }
 }
 
 /**
- * Implements SoftMax activation function
+ * Implements Softmax activation function.
  */
 private[ann] class SoftmaxFunction extends ActivationFunction {
   override def eval(x: BDM[Double], y: BDM[Double]): Unit = {
-var j = 0
-// find max value to make sure later that exponent is computable
-while (j < x.cols) {
-  var i = 0
-  var max = Double.MinValue
-  while (i < x.rows) {
-if (x(i, j) > max) {
-  max = x(i, j)
-}
-i += 1
-  }
-  var sum = 0.0
-  i = 0
-  while (i < x.rows) {
-val res = Math.exp(x(i, j) - max)
-y(i, j) = res
-sum += res
-i += 1
-  }
-  i = 0
-  while (i < x.rows) {
-y(i, j) /= sum
-i += 1
-  }
-  j += 1
+(0 until x.cols).foreach { j =>
+  // subtract max value to prevent overflow during exp
+  // does not affect correctness since we normalize right after
+  val maxVal = Bmax(x(::, j))
+  y(::, j) := breeze.numerics.exp(x(::, j) - maxVal)
+  y(::, j) :/= Bsum(y(::, j))
--- End diff --

@feynmanliang Could you run a micro-benchmark on this function? I think
this is the only place that might cause performance issues.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-14 Thread avulanov

Github user avulanov commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-140234035
  
@feynmanliang I suggest using native BLAS for testing. It is worth checking
the impact of using UFunc as well.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-139649707
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-139649708
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42348/
Test PASSed.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-11 Thread feynmanliang

Github user feynmanliang commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-139633496
  
@avulanov The benchmarking code is written against a WIP implementation; I 
sent you a PR for bringing it up to date.

LBFGS is taking a significantly long time on my machine:

![image](https://cloud.githubusercontent.com/assets/990069/9823779/512a0e28-587d-11e5-8c09-205fb4c0caa9.png)

I've removed the `flatten/flatMap` changes from this PR and will save them 
for when I have more time to properly perf test.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-139636610
  
[Test build #42348 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42348/consoleFull) for PR 8648 at commit [`f56e2d5`](https://github.com/apache/spark/commit/f56e2d5301f10f23c985defa63a4461a9e8d0f1b).



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-11 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-139649524
  
[Test build #42348 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42348/console) for PR 8648 at commit [`f56e2d5`](https://github.com/apache/spark/commit/f56e2d5301f10f23c985defa63a4461a9e8d0f1b).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-139635181
  
Merged build started.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-139635124
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-10 Thread brkyvz

Github user brkyvz commented on a diff in the pull request:

https://github.com/apache/spark/pull/8648#discussion_r39191027
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -573,47 +582,27 @@ private[ml] class FeedForwardModel private(
   case _ =>
 throw new UnsupportedOperationException("Non-functional layer not supported at the top")
 }
+// backward pass (back-propagate deltas given errors)
 deltas(L) = new BDM[Double](0, 0)
 deltas(L - 1) = newE
 for (i <- (L - 2) to (0, -1)) {
   deltas(i) = layerModels(i + 1).prevDelta(deltas(i + 1), outputs(i + 1))
 }
-val grads = new Array[Array[Double]](layerModels.length)
-for (i <- 0 until layerModels.length) {
-  val input = if (i==0) data else outputs(i - 1)
-  grads(i) = layerModels(i).grad(deltas(i), input)
+// forward pass (forward-propagate gradients given inputs)
+val grads = layerModels.zipWithIndex.map { case (layer, i) =>
+  val input = if (i == 0) data else outputs(i - 1)
+  layer.grad(deltas(i), input)
 }
-// update cumGradient
+// update cumulative gradients
 val cumGradientArray = cumGradient.toArray
-var offset = 0
-// TODO: extract roll
-for (i <- 0 until grads.length) {
-  val gradArray = grads(i)
-  var k = 0
-  while (k < gradArray.length) {
-cumGradientArray(offset + k) += gradArray(k)
-k += 1
-  }
-  offset += gradArray.length
+grads.flatten.zipWithIndex.foreach { case (newGrad, i) =>
+  cumGradientArray(i) += newGrad
 }
 newError
   }
 
-  // TODO: do we really need to copy the weights? they should be read-only
   override def weights(): Vector = {
-// TODO: extract roll
-var size = 0
-for (i <- 0 until layerModels.length) {
-  size += layerModels(i).size
-}
-val array = new Array[Double](size)
-var offset = 0
-for (i <- 0 until layerModels.length) {
-  val layerWeights = layerModels(i).weights().toArray
-  System.arraycopy(layerWeights, 0, array, offset, layerWeights.length)
-  offset += layerWeights.length
-}
-Vectors.dense(array)
+Vectors.dense(layerModels.flatMap(_.weights().toArray))
--- End diff --

I imagine `System.arraycopy` would be much faster and more efficient here.
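
The approach in the removed lines of the diff above can be sketched on plain arrays: size the output once, then copy each layer's weights into place with `System.arraycopy`. This is only an illustration under stated assumptions. `RollWeights` and `roll` are hypothetical names, not part of the PR, and the per-layer weights are assumed to be already available as `Array[Double]`.

```scala
// Hypothetical helper (not from the PR): concatenates per-layer weight arrays
// into one preallocated array instead of building it through flatMap.
object RollWeights {
  def roll(layers: Array[Array[Double]]): Array[Double] = {
    // Allocate the output once, sized to the total number of weights.
    val out = new Array[Double](layers.map(_.length).sum)
    var offset = 0
    var i = 0
    while (i < layers.length) {
      val w = layers(i)
      // Bulk-copy this layer's weights to their position in the output.
      System.arraycopy(w, 0, out, offset, w.length)
      offset += w.length
      i += 1
    }
    out
  }
}
```

The result has the same contents as `layers.flatten`. Whether it is measurably faster in this code path is what the benchmarks requested in this thread would have to show.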



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138447507
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42121/
Test FAILed.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138447506
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-08 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138447486
  
[Test build #42121 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42121/console) for PR 8648 at commit [`abdba81`](https://github.com/apache/spark/commit/abdba81135154aab3065fe8316ca51b1cd03885f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-08 Thread avulanov

Github user avulanov commented on a diff in the pull request:

https://github.com/apache/spark/pull/8648#discussion_r38969255
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -573,47 +582,27 @@ private[ml] class FeedForwardModel private(
   case _ =>
 throw new UnsupportedOperationException("Non-functional layer not supported at the top")
 }
+// backward pass (back-propagate deltas given errors)
 deltas(L) = new BDM[Double](0, 0)
 deltas(L - 1) = newE
 for (i <- (L - 2) to (0, -1)) {
   deltas(i) = layerModels(i + 1).prevDelta(deltas(i + 1), outputs(i + 1))
 }
-val grads = new Array[Array[Double]](layerModels.length)
-for (i <- 0 until layerModels.length) {
-  val input = if (i==0) data else outputs(i - 1)
-  grads(i) = layerModels(i).grad(deltas(i), input)
+// forward pass (forward-propagate gradients given inputs)
+val grads = layerModels.zipWithIndex.map { case (layer, i) =>
+  val input = if (i == 0) data else outputs(i - 1)
+  layer.grad(deltas(i), input)
 }
-// update cumGradient
+// update cumulative gradients
 val cumGradientArray = cumGradient.toArray
-var offset = 0
-// TODO: extract roll
-for (i <- 0 until grads.length) {
-  val gradArray = grads(i)
-  var k = 0
-  while (k < gradArray.length) {
-cumGradientArray(offset + k) += gradArray(k)
-k += 1
-  }
-  offset += gradArray.length
+grads.flatten.zipWithIndex.foreach { case (newGrad, i) =>
--- End diff --

Wouldn't `flatten` be expensive for an array of large arrays?
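
For context on the concern above: `flatten` on an array of arrays builds a new array, and `zipWithIndex` builds another array of pairs, before any accumulation happens. The removed lines avoid both by keeping a running offset. A minimal sketch of that offset-based accumulation on plain arrays follows. `AccumulateGrads` and `addInto` are hypothetical names, not from the PR, and `cum` is assumed to be at least as long as all gradients combined.

```scala
// Hypothetical helper (not from the PR): adds per-layer gradients into a flat
// cumulative array in place, without allocating any intermediate collection.
object AccumulateGrads {
  def addInto(cum: Array[Double], grads: Array[Array[Double]]): Unit = {
    var offset = 0
    var i = 0
    while (i < grads.length) {
      val g = grads(i)
      var k = 0
      while (k < g.length) {
        // Layer i's k-th gradient lives at position offset + k of the flat array.
        cum(offset + k) += g(k)
        k += 1
      }
      offset += g.length
      i += 1
    }
  }
}
```

This trades the brevity of the `flatten.zipWithIndex.foreach` version for zero extra allocation. Whether the difference matters in practice is an empirical question for the benchmark.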



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-08 Thread avulanov

Github user avulanov commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138679510
  
@feynmanliang Thank you for reviewing the code! I made one pass. It seems
that UFunc simplifies it a lot. However, I am not sure about `.flatten` and
`.flatMap` on an array of large arrays. We need to run a performance comparison.
Could you run the benchmark from https://github.com/avulanov/ann-benchmark
before and after the refactoring to see the difference?



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-08 Thread avulanov

Github user avulanov commented on a diff in the pull request:

https://github.com/apache/spark/pull/8648#discussion_r38969473
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -573,47 +582,27 @@ private[ml] class FeedForwardModel private(
   case _ =>
 throw new UnsupportedOperationException("Non-functional layer not supported at the top")
 }
+// backward pass (back-propagate deltas given errors)
 deltas(L) = new BDM[Double](0, 0)
 deltas(L - 1) = newE
 for (i <- (L - 2) to (0, -1)) {
   deltas(i) = layerModels(i + 1).prevDelta(deltas(i + 1), outputs(i + 1))
 }
-val grads = new Array[Array[Double]](layerModels.length)
-for (i <- 0 until layerModels.length) {
-  val input = if (i==0) data else outputs(i - 1)
-  grads(i) = layerModels(i).grad(deltas(i), input)
+// forward pass (forward-propagate gradients given inputs)
+val grads = layerModels.zipWithIndex.map { case (layer, i) =>
+  val input = if (i == 0) data else outputs(i - 1)
+  layer.grad(deltas(i), input)
 }
-// update cumGradient
+// update cumulative gradients
 val cumGradientArray = cumGradient.toArray
-var offset = 0
-// TODO: extract roll
-for (i <- 0 until grads.length) {
-  val gradArray = grads(i)
-  var k = 0
-  while (k < gradArray.length) {
-cumGradientArray(offset + k) += gradArray(k)
-k += 1
-  }
-  offset += gradArray.length
+grads.flatten.zipWithIndex.foreach { case (newGrad, i) =>
+  cumGradientArray(i) += newGrad
 }
 newError
   }
 
-  // TODO: do we really need to copy the weights? they should be read-only
   override def weights(): Vector = {
-// TODO: extract roll
-var size = 0
-for (i <- 0 until layerModels.length) {
-  size += layerModels(i).size
-}
-val array = new Array[Double](size)
-var offset = 0
-for (i <- 0 until layerModels.length) {
-  val layerWeights = layerModels(i).weights().toArray
-  System.arraycopy(layerWeights, 0, array, offset, layerWeights.length)
-  offset += layerWeights.length
-}
-Vectors.dense(array)
+Vectors.dense(layerModels.flatMap(_.weights().toArray))
--- End diff --

Ditto



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-08 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138648577
  
[Test build #42137 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42137/console) for PR 8648 at commit [`f6731ff`](https://github.com/apache/spark/commit/f6731ff10a5bf671d1d727cf590bccd7fb6e13c1).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class BlockFetchException(messages: String, throwable: Throwable)`




[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138637405
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-08 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138638322
  
[Test build #42137 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42137/consoleFull) for PR 8648 at commit [`f6731ff`](https://github.com/apache/spark/commit/f6731ff10a5bf671d1d727cf590bccd7fb6e13c1).



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138637439
  
Merged build started.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138648747
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42137/
Test PASSed.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138648745
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138383081
  
[Test build #42107 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42107/consoleFull) for PR 8648 at commit [`84f8bea`](https://github.com/apache/spark/commit/84f8bea0f4a3b24f8aae6eadd98d7166311f951d).



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138383214
  
[Test build #42107 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42107/console) for PR 8648 at commit [`84f8bea`](https://github.com/apache/spark/commit/84f8bea0f4a3b24f8aae6eadd98d7166311f951d).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138383217
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42107/
Test FAILed.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138415826
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138415825
  
[Test build #42113 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42113/console) for PR 8648 at commit [`22ba174`](https://github.com/apache/spark/commit/22ba174f1880fb99fe778bd2fafc54f0afe3ba01).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138415827
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42113/
Test FAILed.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138438023
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138438082
  
Merged build started.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138383216
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138439943
  
[Test build #42121 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42121/consoleFull) for PR 8648 at commit [`abdba81`](https://github.com/apache/spark/commit/abdba81135154aab3065fe8316ca51b1cd03885f).



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138382936
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138382940
  
Merged build started.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138414447
  
Merged build started.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138414336
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8648#issuecomment-138415609
  
[Test build #42113 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42113/consoleFull) for PR 8648 at commit [`22ba174`](https://github.com/apache/spark/commit/22ba174f1880fb99fe778bd2fafc54f0afe3ba01).



[GitHub] spark pull request: [SPARK-10478][ML] Performance, organization, a...

2015-09-07 Thread feynmanliang

GitHub user feynmanliang opened a pull request:

https://github.com/apache/spark/pull/8648

[SPARK-10478][ML] Performance, organization, and style improvements for multi-layer perceptron

* Changes manual iterations into higher-performance `UFunc`s, vectorized operations, and broadcasted operations
* Refactors multiple `while` and `for` loops into `foreach` and `map`s
* Fixes various style issues
* Adds comments and improves scaladocs
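
As a rough illustration of the loop refactoring described in the list above, here is a minimal sketch in plain Scala. It is not code from this pull request: `ElementwiseSketch`, `sigmoid`, `applyInPlaceLoop`, and `applyMapped` are hypothetical names, and it uses only standard-library arrays rather than the `UFunc`s the PR refers to.

```scala
// Hypothetical sketch; none of these names come from the PR itself.
object ElementwiseSketch {
  // Stand-in element-wise function (a logistic sigmoid).
  def sigmoid(x: Double): Double = 1.0 / (1.0 + math.exp(-x))

  // "Before" style: manual index-based iteration that mutates the array in place.
  def applyInPlaceLoop(data: Array[Double], f: Double => Double): Unit = {
    var i = 0
    while (i < data.length) {
      data(i) = f(data(i))
      i += 1
    }
  }

  // "After" style: the same element-wise transformation expressed with `map`,
  // returning a new array instead of mutating the input.
  def applyMapped(data: Array[Double], f: Double => Double): Array[Double] =
    data.map(f)
}
```

Both functions compute the same values; the `map` version is shorter and avoids mutable index bookkeeping, but it allocates a new array, so whether it is actually faster depends on the workload.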

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/feynmanliang/spark ann-improvements

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8648.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8648


commit 611c76f723405fe40ed1ba5c871d5e88b617aed3
Author: Feynman Liang 
Date:   2015-09-07T20:00:32Z

Documentation and indentation fixes

commit 7b192db010ff660da39404009ce883f903f9980a
Author: Feynman Liang 
Date:   2015-09-07T20:02:36Z

Refactors unneeded helpers

commit 12169d7236caaad0db57d941cbf47a947b61
Author: Feynman Liang 
Date:   2015-09-07T20:08:05Z

More doc and style fixes

commit bc52b652541f2d6c4508662af937e0a26e207651
Author: Feynman Liang 
Date:   2015-09-07T20:39:21Z

Cleans up documentation and uses functional code

commit 84f8bea0f4a3b24f8aae6eadd98d7166311f951d
Author: Feynman Liang 
Date:   2015-09-07T21:27:53Z

Vectorizes linalg using ufuncs and vector ops




