GitHub user schmit closed the pull request at:
https://github.com/apache/spark/pull/160
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
GitHub user schmit commented on the pull request:
https://github.com/apache/spark/pull/160#issuecomment-38319124
They are indeed either 0.0 or 1.0, as they are the true labels. However,
maybe > 0.5 is a bit safer, since these are implemented as doubles? That's my
argument f
GitHub user schmit commented on the pull request:
https://github.com/apache/spark/pull/160#issuecomment-38316626
Sorry, I am blind. Anyway, do you agree that > 0.5 makes most sense?
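To illustrate the point about labels stored as doubles, here is a minimal sketch. It is not code from the pull request: `LabelThreshold` and `isPositive` are hypothetical names, and the labels are assumed to be intended as exactly 0.0 or 1.0.

```scala
// Hypothetical helper, not taken from the pull request.
// Labels are assumed to be Doubles that are meant to be exactly 0.0 or 1.0.
object LabelThreshold {
  // Treat anything above 0.5 as the positive class, so a label that is
  // only approximately 1.0 still counts as positive.
  def isPositive(label: Double): Boolean = label > 0.5

  def main(args: Array[String]): Unit = {
    // The third value is the largest Double below 1.0.
    val labels = Seq(0.0, 1.0, 0.9999999999999999)
    // Exact equality rejects the third label; the threshold accepts it.
    println(labels.map(_ == 1.0))
    println(labels.map(isPositive))
  }
}
```

If the labels really are always exactly 0.0 or 1.0, both checks agree; the threshold only matters when a label has picked up rounding error somewhere upstream.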
GitHub user schmit commented on a diff in the pull request:
https://github.com/apache/spark/pull/160#discussion_r10734227
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/classification/BinaryClassificationModel.scala
---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the
GitHub user schmit commented on a diff in the pull request:
https://github.com/apache/spark/pull/160#discussion_r10732053
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/classification/BinaryClassificationModel.scala
---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the
GitHub user schmit commented on the pull request:
https://github.com/apache/spark/pull/160#issuecomment-38003806
On your more general remarks @srowen:
I think those are valid concerns, here is my reasoning for doing it this
way:
The predict function returns the label, but
GitHub user schmit commented on a diff in the pull request:
https://github.com/apache/spark/pull/160#discussion_r10731686
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/classification/BinaryClassificationModel.scala
---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the
GitHub user schmit opened a pull request:
https://github.com/apache/spark/pull/160
ROC area under the curve for binary classification
Implementation of the area under the receiver operating characteristic (ROC)
curve, a goodness-of-fit measure for binary classifiers.
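For readers unfamiliar with the measure, the following is a minimal local sketch of one way to compute it. This is not the implementation in the pull request: `AucSketch` and `areaUnderRoc` are hypothetical names, the input is assumed to be an in-memory sequence of (score, label) pairs rather than an RDD, and the quadratic pairwise formulation is chosen for clarity rather than speed.

```scala
// Hypothetical, non-distributed sketch; names are illustrative only.
object AucSketch {
  // Area under the ROC curve, computed as the probability that a randomly
  // chosen positive example scores higher than a randomly chosen negative
  // one, with ties counted as one half. Labels are assumed to be 0.0 or 1.0.
  def areaUnderRoc(scoreAndLabels: Seq[(Double, Double)]): Double = {
    val (pos, neg) = scoreAndLabels.partition { case (_, label) => label > 0.5 }
    require(pos.nonEmpty && neg.nonEmpty, "need at least one example of each class")
    // Score every (positive, negative) pair: 1.0 for a correct ordering,
    // 0.5 for a tie, 0.0 for an incorrect ordering.
    val wins = for {
      (p, _) <- pos
      (n, _) <- neg
    } yield if (p > n) 1.0 else if (p == n) 0.5 else 0.0
    wins.sum / (pos.size.toDouble * neg.size)
  }
}
```

As a usage example, `AucSketch.areaUnderRoc(Seq((0.9, 1.0), (0.8, 0.0), (0.7, 1.0), (0.1, 0.0)))` orders three of the four positive/negative pairs correctly and so evaluates to 0.75. A classifier that ranks every positive above every negative scores 1.0.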
JIRA:
https://spark-project.atlassian.net