subject:"\[GitHub\] spark pull request #17373\: \[SPARK\-12664\]\[ML\] Expose probability in mlp model"

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-22 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17373


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-15 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133324363
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -527,9 +550,21 @@ private[ml] class FeedForwardModel private(
 
   override def predict(data: Vector): Vector = {
 val size = data.size
-val result = forward(new BDM[Double](size, 1, data.toArray))
+val result = forward(new BDM[Double](size, 1, data.toArray), true)
 Vectors.dense(result.last.toArray)
   }
+
+  override def predictRaw(data: Vector): Vector = {
+val size = data.size
+val result = forward(new BDM[Double](size, 1, data.toArray), false)
+Vectors.dense(result(result.length - 2).toArray)
+  }
+
+  override def raw2ProbabilityInPlace(data: Vector): Vector = {
+val dataMatrix = new BDM[Double](data.size, 1, data.toArray)
+layerModels.last.eval(dataMatrix, dataMatrix)
--- End diff --

Ping: If this proposal sounds good, then can you please update accordingly?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-15 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133322927
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifierSuite.scala
 ---
@@ -82,6 +83,49 @@ class MultilayerPerceptronClassifierSuite
 }
   }
 
+  test("strong dataset test") {
+val layers = Array[Int](4, 5, 5, 2)
+
+val strongDataset = Seq(
+  (Vectors.dense(1, 2, 3, 4), 0d, Vectors.dense(1d, 0d)),
+  (Vectors.dense(4, 3, 2, 1), 1d, Vectors.dense(0d, 1d)),
+  (Vectors.dense(1, 1, 1, 1), 0d, Vectors.dense(.5, .5)),
+  (Vectors.dense(1, 1, 1, 1), 1d, Vectors.dense(.5, .5))
+).toDF("features", "label", "expectedProbability")
+val trainer = new MultilayerPerceptronClassifier()
+  .setLayers(layers)
+  .setBlockSize(1)
+  .setSeed(123L)
+  .setMaxIter(100)
+  .setSolver("l-bfgs")
+val model = trainer.fit(strongDataset)
+val result = model.transform(strongDataset)
+model.setProbabilityCol("probability")
+MLTestingUtils.checkCopyAndUids(trainer, model)
+// result.select("probability").show(false)
+result.select("probability", "expectedProbability").collect().foreach {
+  case Row(p: Vector, e: Vector) =>
+assert(p ~== e absTol 1e-3)
+}
+  }
+
+  test("test model probability") {
+val layers = Array[Int](2, 5, 2)
+val trainer = new MultilayerPerceptronClassifier()
+  .setLayers(layers)
+  .setBlockSize(1)
+  .setSeed(123L)
+  .setMaxIter(100)
+  .setSolver("l-bfgs")
+val model = trainer.fit(dataset)
+model.setProbabilityCol("probability")
--- End diff --

Ping --- this should not be necessary


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-15 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133323889
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -374,6 +380,22 @@ private[ann] trait TopologyModel extends Serializable {
   def predict(data: Vector): Vector
 
   /**
+   * Raw prediction of the model
+   *
+   * @param data input data
+   * @return raw prediction
+   */
+  def predictRaw(data: Vector): Vector
--- End diff --

Ping: rename data -> features


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-14 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133081809
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -527,9 +550,21 @@ private[ml] class FeedForwardModel private(
 
   override def predict(data: Vector): Vector = {
 val size = data.size
-val result = forward(new BDM[Double](size, 1, data.toArray))
+val result = forward(new BDM[Double](size, 1, data.toArray), true)
 Vectors.dense(result.last.toArray)
   }
+
+  override def predictRaw(data: Vector): Vector = {
+val size = data.size
+val result = forward(new BDM[Double](size, 1, data.toArray), false)
+Vectors.dense(result(result.length - 2).toArray)
+  }
+
+  override def raw2ProbabilityInPlace(data: Vector): Vector = {
+val dataMatrix = new BDM[Double](data.size, 1, data.toArray)
+layerModels.last.eval(dataMatrix, dataMatrix)
--- End diff --

This assumes that the ```eval``` method can operate in-place.  That is fine 
for the last layer for MLP (SoftmaxLayerModelWithCrossEntropyLoss), but not OK 
in general.  More generally, these methods for classifiers should not go in the 
very general TopologyModel abstraction; that abstraction may be used in the 
future for regression as well.  I'd be fine with putting this 
classification-specific logic in MLP itself; we do not need to generalize the 
logic until we add other Classifiers, which might take a long time.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-14 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133079098
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -374,6 +380,22 @@ private[ann] trait TopologyModel extends Serializable {
   def predict(data: Vector): Vector
 
   /**
+   * Raw prediction of the model
+   *
+   * @param data input data
+   * @return raw prediction
+   */
+  def predictRaw(data: Vector): Vector
+
+  /**
+   * Probability of the model
--- End diff --

This documentation does not add any information.  Can you please link to 
ProbabilisticClassifier instead?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-14 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133079087
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -374,6 +380,22 @@ private[ann] trait TopologyModel extends Serializable {
   def predict(data: Vector): Vector
 
   /**
+   * Raw prediction of the model
--- End diff --

This documentation does not add any information.  Can you please link to 
ProbabilisticClassifier instead?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-14 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133079226
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -374,6 +380,22 @@ private[ann] trait TopologyModel extends Serializable {
   def predict(data: Vector): Vector
 
   /**
+   * Raw prediction of the model
+   *
+   * @param data input data
+   * @return raw prediction
+   */
+  def predictRaw(data: Vector): Vector
+
+  /**
+   * Probability of the model
+   *
+   * @param data input data
--- End diff --

"input data" sounds like input feature values, which is not correct.  Why 
not match the ProbabilisticClassifier API?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-14 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133076755
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifierSuite.scala
 ---
@@ -82,6 +83,49 @@ class MultilayerPerceptronClassifierSuite
 }
   }
 
+  test("strong dataset test") {
+val layers = Array[Int](4, 5, 5, 2)
--- End diff --

Can you make this test faster by using a simpler network, e.g., by removing 
one of the middle layers?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-14 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133075752
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifierSuite.scala
 ---
@@ -82,6 +83,49 @@ class MultilayerPerceptronClassifierSuite
 }
   }
 
+  test("strong dataset test") {
+val layers = Array[Int](4, 5, 5, 2)
+
+val strongDataset = Seq(
+  (Vectors.dense(1, 2, 3, 4), 0d, Vectors.dense(1d, 0d)),
+  (Vectors.dense(4, 3, 2, 1), 1d, Vectors.dense(0d, 1d)),
+  (Vectors.dense(1, 1, 1, 1), 0d, Vectors.dense(.5, .5)),
+  (Vectors.dense(1, 1, 1, 1), 1d, Vectors.dense(.5, .5))
+).toDF("features", "label", "expectedProbability")
+val trainer = new MultilayerPerceptronClassifier()
+  .setLayers(layers)
+  .setBlockSize(1)
+  .setSeed(123L)
+  .setMaxIter(100)
+  .setSolver("l-bfgs")
+val model = trainer.fit(strongDataset)
+val result = model.transform(strongDataset)
+model.setProbabilityCol("probability")
+MLTestingUtils.checkCopyAndUids(trainer, model)
--- End diff --

checkCopyAndUids is a generic test which should only be run in a single 
test; it does not need to be run in each test.  Please remove it from here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-14 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133075802
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifierSuite.scala
 ---
@@ -82,6 +83,49 @@ class MultilayerPerceptronClassifierSuite
 }
   }
 
+  test("strong dataset test") {
+val layers = Array[Int](4, 5, 5, 2)
+
+val strongDataset = Seq(
+  (Vectors.dense(1, 2, 3, 4), 0d, Vectors.dense(1d, 0d)),
+  (Vectors.dense(4, 3, 2, 1), 1d, Vectors.dense(0d, 1d)),
+  (Vectors.dense(1, 1, 1, 1), 0d, Vectors.dense(.5, .5)),
+  (Vectors.dense(1, 1, 1, 1), 1d, Vectors.dense(.5, .5))
+).toDF("features", "label", "expectedProbability")
+val trainer = new MultilayerPerceptronClassifier()
+  .setLayers(layers)
+  .setBlockSize(1)
+  .setSeed(123L)
+  .setMaxIter(100)
+  .setSolver("l-bfgs")
+val model = trainer.fit(strongDataset)
+val result = model.transform(strongDataset)
+model.setProbabilityCol("probability")
+MLTestingUtils.checkCopyAndUids(trainer, model)
+// result.select("probability").show(false)
--- End diff --

remove old comment


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-14 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133080046
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -361,9 +361,15 @@ private[ann] trait TopologyModel extends Serializable {
* Forward propagation
*
* @param data input data
+   * @param includeLastLayer include last layer when computing. In 
MultilayerPerceptronClassifier,
--- End diff --

This text is unclear.  This phrasing is better: "Include the last layer in 
the output. In MultilayerPerceptronClassifier, the last layer is always 
softmax; the last layer of outputs is needed for class predictions, but not for 
rawPrediction."


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-14 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133076328
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifierSuite.scala
 ---
@@ -82,6 +83,49 @@ class MultilayerPerceptronClassifierSuite
 }
   }
 
+  test("strong dataset test") {
--- End diff --

Make the test title more descriptive so it is clear what it is testing.  
E.g. "Predicted class probabilities: calibration on toy dataset"


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-14 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133082415
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifierSuite.scala
 ---
@@ -82,6 +83,49 @@ class MultilayerPerceptronClassifierSuite
 }
   }
 
+  test("strong dataset test") {
+val layers = Array[Int](4, 5, 5, 2)
+
+val strongDataset = Seq(
+  (Vectors.dense(1, 2, 3, 4), 0d, Vectors.dense(1d, 0d)),
+  (Vectors.dense(4, 3, 2, 1), 1d, Vectors.dense(0d, 1d)),
+  (Vectors.dense(1, 1, 1, 1), 0d, Vectors.dense(.5, .5)),
+  (Vectors.dense(1, 1, 1, 1), 1d, Vectors.dense(.5, .5))
+).toDF("features", "label", "expectedProbability")
+val trainer = new MultilayerPerceptronClassifier()
+  .setLayers(layers)
+  .setBlockSize(1)
+  .setSeed(123L)
+  .setMaxIter(100)
+  .setSolver("l-bfgs")
+val model = trainer.fit(strongDataset)
+val result = model.transform(strongDataset)
+model.setProbabilityCol("probability")
+MLTestingUtils.checkCopyAndUids(trainer, model)
+// result.select("probability").show(false)
+result.select("probability", "expectedProbability").collect().foreach {
+  case Row(p: Vector, e: Vector) =>
+assert(p ~== e absTol 1e-3)
+}
+  }
+
+  test("test model probability") {
+val layers = Array[Int](2, 5, 2)
+val trainer = new MultilayerPerceptronClassifier()
+  .setLayers(layers)
+  .setBlockSize(1)
+  .setSeed(123L)
+  .setMaxIter(100)
+  .setSolver("l-bfgs")
+val model = trainer.fit(dataset)
+model.setProbabilityCol("probability")
--- End diff --

That's the default already, right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-14 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133080187
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -527,9 +550,21 @@ private[ml] class FeedForwardModel private(
 
   override def predict(data: Vector): Vector = {
 val size = data.size
-val result = forward(new BDM[Double](size, 1, data.toArray))
+val result = forward(new BDM[Double](size, 1, data.toArray), true)
 Vectors.dense(result.last.toArray)
   }
+
+  override def predictRaw(data: Vector): Vector = {
+val size = data.size
--- End diff --

This temp val of "size" is only used once, so I recommend removing it to 
make the code clearer.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-14 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133082607
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifierSuite.scala
 ---
@@ -82,6 +83,49 @@ class MultilayerPerceptronClassifierSuite
 }
   }
 
+  test("strong dataset test") {
+val layers = Array[Int](4, 5, 5, 2)
+
+val strongDataset = Seq(
+  (Vectors.dense(1, 2, 3, 4), 0d, Vectors.dense(1d, 0d)),
+  (Vectors.dense(4, 3, 2, 1), 1d, Vectors.dense(0d, 1d)),
+  (Vectors.dense(1, 1, 1, 1), 0d, Vectors.dense(.5, .5)),
+  (Vectors.dense(1, 1, 1, 1), 1d, Vectors.dense(.5, .5))
+).toDF("features", "label", "expectedProbability")
+val trainer = new MultilayerPerceptronClassifier()
+  .setLayers(layers)
+  .setBlockSize(1)
+  .setSeed(123L)
+  .setMaxIter(100)
+  .setSolver("l-bfgs")
+val model = trainer.fit(strongDataset)
+val result = model.transform(strongDataset)
+model.setProbabilityCol("probability")
+MLTestingUtils.checkCopyAndUids(trainer, model)
+// result.select("probability").show(false)
+result.select("probability", "expectedProbability").collect().foreach {
+  case Row(p: Vector, e: Vector) =>
+assert(p ~== e absTol 1e-3)
+}
+  }
+
+  test("test model probability") {
+val layers = Array[Int](2, 5, 2)
+val trainer = new MultilayerPerceptronClassifier()
+  .setLayers(layers)
+  .setBlockSize(1)
+  .setSeed(123L)
+  .setMaxIter(100)
+  .setSolver("l-bfgs")
+val model = trainer.fit(dataset)
+model.setProbabilityCol("probability")
+val result = model.transform(dataset)
+val features2prob = udf { features: Vector => 
model.mlpModel.predict(features) }
+val cmpVec = udf { (v1: Vector, v2: Vector) => v1 ~== v2 relTol 1e-3 }
+assert(result.select(cmpVec(features2prob(col("features")), 
col("probability")))
--- End diff --

If this test fails, it will not give much info.  How about collecting the 
data and comparing on the driver?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

2017-08-14 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/17373#discussion_r133079322
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -374,6 +380,22 @@ private[ann] trait TopologyModel extends Serializable {
   def predict(data: Vector): Vector
 
   /**
+   * Raw prediction of the model
+   *
+   * @param data input data
+   * @return raw prediction
+   */
+  def predictRaw(data: Vector): Vector
--- End diff --

Please match the Classifier API here (data -> features) unless there's a 
reason to deviate


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

[GitHub] spark pull request #17373: [SPARK-12664][ML] Expose probability in mlp model

17 matches

Site Navigation

Mail list logo

Footer information