From: sam.sav...@barclays.com

According to https://spark.apache.org/docs/1.4.0/api/scala/index.html#org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
the constructor takes `RDD[(Double, Double)]`, meaning labels are Doubles. This seems odd; shouldn't the label be a Boolean? Similarly for MultilabelMetrics (i.e. it should be `RDD[(Array[Double], Array[Boolean])]`), and for MulticlassMetrics shouldn't the type of both be generic?

Additionally, it would be good if either the ROC output type was changed or another method was added that returned confusion matrices, so that the hard integer counts can be obtained before the divisions. E.g.

```
case class Confusion(tp: Int, fp: Int, fn: Int, tn: Int) {
  // bunch of methods for each of the things in the table here
  // https://en.wikipedia.org/wiki/Receiver_operating_characteristic
}
...
def confusions(): RDD[Confusion]
```
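
To make the suggestion a bit more concrete, here is a rough sketch of what I have in mind, written against plain Scala collections rather than an RDD so it runs without Spark. Everything in it is hypothetical: `Confusions`, the `confusions` signature, the derived-rate methods and the "score >= threshold means positive" rule are my assumptions, not anything that exists in MLlib. I have also used `Long` rather than `Int` for the counts, on the assumption that they could get large on real data.

```scala
// Hypothetical sketch only: none of these names exist in Spark MLlib.
final case class Confusion(tp: Long, fp: Long, fn: Long, tn: Long) {
  // Rates are derived on demand, so the raw counts stay available.
  def truePositiveRate: Double =
    if (tp + fn == 0) 0.0 else tp.toDouble / (tp + fn)

  def falsePositiveRate: Double =
    if (fp + tn == 0) 0.0 else fp.toDouble / (fp + tn)
}

object Confusions {
  // Returns one (threshold, Confusion) pair per distinct score,
  // highest threshold first. A record counts as predicted positive
  // when its score is >= the threshold (an assumption of this sketch).
  def confusions(scoreAndLabels: Seq[(Double, Boolean)]): Seq[(Double, Confusion)] = {
    val totalPos = scoreAndLabels.count(_._2).toLong
    val totalNeg = scoreAndLabels.size.toLong - totalPos
    val thresholds = scoreAndLabels.map(_._1).distinct.sortWith(_ > _)

    thresholds.map { t =>
      val predictedPos = scoreAndLabels.filter(_._1 >= t)
      val tp = predictedPos.count(_._2).toLong
      val fp = predictedPos.size.toLong - tp
      t -> Confusion(tp, fp, totalPos - tp, totalNeg - fp)
    }
  }
}
```

This rescans the data once per threshold, which is fine for illustration but would not be the way to do it on an RDD; the point is only that the ROC points fall out of the counts via `falsePositiveRate` and `truePositiveRate`, while the counts themselves remain accessible.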