Yes, good catch. I also think the "1.0 *" is a suboptimal way to cast to double. I searched for similar issues and didn't see any. Please open a PR. I'm not sure this is enough to warrant a JIRA, but feel free to open one as well.
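For anyone following along, here is a minimal standalone sketch of the two points, with no Spark dependency. The class and method names (DoubleCompare, sameReference, sameValue, accuracy) are hypothetical and exist only for illustration; they are not part of the docs example. On boxed Double operands, == compares object identity, so it is only true when both sides are the same object; unboxing with doubleValue() compares the numbers. An explicit (double) cast also states the floating-point division more directly than multiplying by 1.0.

```java
// Illustrative only: names here are hypothetical, not from the Spark docs.
class DoubleCompare {

    // == on two boxed Doubles compares references, not numeric values.
    static boolean sameReference(Double a, Double b) {
        return a == b;
    }

    // Unboxing first compares the primitive double values.
    static boolean sameValue(Double a, Double b) {
        return a.doubleValue() == b.doubleValue();
    }

    // Explicit cast forces floating-point division of two long counts.
    static double accuracy(long correct, long total) {
        return (double) correct / total;
    }

    public static void main(String[] args) {
        Double prediction = Double.valueOf(1.0);
        Double label = Double.valueOf(1.0);

        // Not guaranteed to be true: depends on whether both boxes are one object.
        System.out.println("reference comparison: " + sameReference(prediction, label));
        // True whenever the numeric values are equal.
        System.out.println("value comparison:     " + sameValue(prediction, label));
        System.out.println("accuracy(3, 4):       " + accuracy(3, 4));
    }
}
```

In the docs example the same change would be made inside the filter's call method, as in the corrected snippet quoted below.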
On Mon, Nov 3, 2014 at 6:46 PM, Dariusz Kobylarz <darek.kobyl...@gmail.com> wrote:
> Hi,
> I noticed a bug in the sample Java code on the MLlib Naive Bayes docs page:
> http://spark.apache.org/docs/1.1.0/mllib-naive-bayes.html
>
> In the filter:
>
> double accuracy = 1.0 * predictionAndLabel.filter(new
>     Function<Tuple2<Double, Double>, Boolean>() {
>   @Override public Boolean call(Tuple2<Double, Double> pl) {
>     return pl._1() == pl._2();
>   }
> }).count() / test.count();
>
> it compares the Double objects by reference, whereas it should compare
> their values:
>
> double accuracy = 1.0 * predictionAndLabel.filter(new
>     Function<Tuple2<Double, Double>, Boolean>() {
>   @Override public Boolean call(Tuple2<Double, Double> pl) {
>     return pl._1().doubleValue() == pl._2().doubleValue();
>   }
> }).count() / test.count();
>
> The Java version's accuracy is always 0.0. The Scala code outputs the
> correct value, 1.0.
>
> Thanks,