Yes, good catch. I also think the 1.0 * is suboptimal as a cast to
double. I searched for similar issues and didn't see any. Open a PR --
I'm not even sure this is enough to warrant a JIRA, but feel free to
file one as well.
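What I have in mind is an explicit cast instead of multiplying by 1.0;
roughly something like this (just a sketch, reusing the variable names
from the docs snippet, with the same value-comparison predicate you
propose below):

long correct = predictionAndLabel.filter(/* same predicate, comparing values */).count();
// Cast before dividing so the division happens in floating point,
// not integer arithmetic.
double accuracy = (double) correct / test.count();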
On Mon, Nov 3, 2014 at 6:46 PM, Dariusz Kobylarz
darek.kobyl...@gmail.com wrote:
Hi,
I noticed a bug in the sample Java code on the MLlib - Naive Bayes docs page:
http://spark.apache.org/docs/1.1.0/mllib-naive-bayes.html
In the filter:
double accuracy = 1.0 * predictionAndLabel.filter(new Function<Tuple2<Double, Double>, Boolean>() {
    @Override public Boolean call(Tuple2<Double, Double> pl) {
      return pl._1() == pl._2();
    }
  }).count() / test.count();
it compares the Double objects by reference, whereas it should compare their values:
double accuracy = 1.0 * predictionAndLabel.filter(new Function<Tuple2<Double, Double>, Boolean>() {
    @Override public Boolean call(Tuple2<Double, Double> pl) {
      return pl._1().doubleValue() == pl._2().doubleValue();
    }
  }).count() / test.count();
The Java version's accuracy is always 0.0, while the Scala code outputs the
correct value, 1.0.
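In case it helps, here is a tiny standalone illustration of the boxing
pitfall (plain Java, not MLlib-specific; new Double(...) is used only to
guarantee two distinct objects, and autoboxed doubles behave the same way
in practice):

public class BoxedDoubleCompare {
  public static void main(String[] args) {
    Double a = new Double(1.0);
    Double b = new Double(1.0);
    // Reference comparison: false, because a and b are distinct objects
    // even though they hold the same value.
    System.out.println(a == b);
    // Value comparisons: both true.
    System.out.println(a.doubleValue() == b.doubleValue());
    System.out.println(a.equals(b));
  }
}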
Thanks,