[GitHub] spark pull request: ROC area under the curve for binary classifica...

2014-04-28 Thread schmit
Github user schmit closed the pull request at: https://github.com/apache/spark/pull/160

[GitHub] spark pull request: ROC area under the curve for binary classifica...

2014-03-21 Thread schmit
Github user schmit commented on the pull request: https://github.com/apache/spark/pull/160#issuecomment-38319124 They are indeed either 0.0 or 1.0, as they are the true labels. However, maybe > 0.5 is a bit safer, since these are implemented as doubles? That's my argument f
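
The point about comparing with > 0.5 rather than testing for exact equality can be illustrated with a small sketch. It is written in Python for brevity and is not the code from the pull request; the helper name is_positive is hypothetical. The values 0.0 and 1.0 are exactly representable as doubles, so an equality test works on clean labels, but a threshold also tolerates labels that picked up rounding error somewhere upstream.

```python
def is_positive(label: float) -> bool:
    """Treat a double-valued label as the positive class.

    Hypothetical helper: comparing against 0.5 avoids relying on exact
    floating-point equality with 1.0.
    """
    return label > 0.5


# A label that should be 1.0 but was produced by arithmetic:
almost_one = sum([0.1] * 10)   # not exactly 1.0 in double precision
exact_match = (almost_one == 1.0)
thresholded = is_positive(almost_one)
```

Here exact_match is False while thresholded is True, which is the case the threshold is meant to guard against.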

[GitHub] spark pull request: ROC area under the curve for binary classifica...

2014-03-21 Thread schmit
Github user schmit commented on the pull request: https://github.com/apache/spark/pull/160#issuecomment-38316626 Sorry, I am blind. Anyway, do you agree that > 0.5 makes most sense?

[GitHub] spark pull request: ROC area under the curve for binary classifica...

2014-03-18 Thread schmit
Github user schmit commented on a diff in the pull request: https://github.com/apache/spark/pull/160#discussion_r10734227 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/classification/BinaryClassificationModel.scala --- @@ -0,0 +1,68 @@ +/* + * Licensed to the

[GitHub] spark pull request: ROC area under the curve for binary classifica...

2014-03-18 Thread schmit
Github user schmit commented on a diff in the pull request: https://github.com/apache/spark/pull/160#discussion_r10732053 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/classification/BinaryClassificationModel.scala --- @@ -0,0 +1,68 @@ +/* + * Licensed to the

[GitHub] spark pull request: ROC area under the curve for binary classifica...

2014-03-18 Thread schmit
Github user schmit commented on the pull request: https://github.com/apache/spark/pull/160#issuecomment-38003806 On your more general remarks @srowen: I think those are valid concerns, here is my reasoning for doing it this way: The predict function returns the label, but

[GitHub] spark pull request: ROC area under the curve for binary classifica...

2014-03-18 Thread schmit
Github user schmit commented on a diff in the pull request: https://github.com/apache/spark/pull/160#discussion_r10731686 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/classification/BinaryClassificationModel.scala --- @@ -0,0 +1,68 @@ +/* + * Licensed to the

[GitHub] spark pull request: ROC area under the curve for binary classifica...

2014-03-16 Thread schmit
GitHub user schmit opened a pull request: https://github.com/apache/spark/pull/160 ROC area under the curve for binary classification Implementation of the area under the receiver operating characteristic (ROC) curve as a goodness-of-fit measure. JIRA: https://spark-project.atlassian.net
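
For readers unfamiliar with the metric, the following is a minimal sketch of one way to compute the area under the ROC curve. It is not the implementation from the pull request (which targets MLlib in Scala and is not reproduced in this listing); it is written in Python, the function name roc_auc is hypothetical, and it uses the pairwise (Mann-Whitney) formulation: the AUC equals the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as one half. Labels are assumed to be doubles equal to 0.0 or 1.0 and are split with the > 0.5 test discussed in the thread.

```python
from typing import Sequence


def roc_auc(labels: Sequence[float], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    Hypothetical helper for illustration. It runs in
    O(num_positives * num_negatives) time, so it suits small inputs only.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")

    # Labels are doubles (0.0 or 1.0); threshold at 0.5 instead of testing equality.
    positives = [s for y, s in zip(labels, scores) if y > 0.5]
    negatives = [s for y, s in zip(labels, scores) if not y > 0.5]
    if not positives or not negatives:
        raise ValueError("AUC is undefined unless both classes are present")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half a win
    return wins / (len(positives) * len(negatives))


example_auc = roc_auc([0.0, 0.0, 1.0, 1.0], [0.1, 0.4, 0.35, 0.8])
```

In the example, three of the four positive/negative pairs are ordered correctly, so example_auc is 0.75. A distributed implementation would more likely sort by score and accumulate counts rather than compare every pair.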