Hi,
I noticed a bug in the sample Java code on the MLlib - Naive Bayes docs page:
http://spark.apache.org/docs/1.1.0/mllib-naive-bayes.html

In the filter:

double accuracy = 1.0 * predictionAndLabel.filter(new Function<Tuple2<Double, Double>, Boolean>() {
    @Override public Boolean call(Tuple2<Double, Double> pl) {
      return pl._1() == pl._2();
    }
  }).count() / test.count();

it compares the Double objects by reference, whereas it should compare their values:

double accuracy = 1.0 * predictionAndLabel.filter(new Function<Tuple2<Double, Double>, Boolean>() {
    @Override public Boolean call(Tuple2<Double, Double> pl) {
      return pl._1().doubleValue() == pl._2().doubleValue();
    }
  }).count() / test.count();

With the original code, the Java version always reports an accuracy of 0.0,
whereas the Scala code outputs the correct value, 1.0.
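
For anyone who wants to see the difference in isolation, here is a minimal sketch outside of Spark. The class name DoubleEqualityDemo and the helper methods sameReference and sameValue are my own, made up for illustration; they are not part of the docs sample or of MLlib. It only assumes that the prediction and the label arrive as two separately boxed Double objects, which is what the Tuple2<Double, Double> in the filter holds.

```java
public class DoubleEqualityDemo {
    // Hypothetical helper: compares the two boxed objects by reference,
    // which is what `pl._1() == pl._2()` does in the docs sample.
    public static boolean sameReference(Double a, Double b) {
        return a == b;
    }

    // Hypothetical helper: unboxes both operands and compares the primitive
    // values, as in the proposed fix.
    public static boolean sameValue(Double a, Double b) {
        return a.doubleValue() == b.doubleValue();
    }

    public static void main(String[] args) {
        // Two boxed Doubles holding the same value, standing in for a
        // prediction and a label.
        Double prediction = Double.valueOf(1.0);
        Double label = Double.valueOf(1.0);

        System.out.println("reference comparison: " + sameReference(prediction, label));
        System.out.println("value comparison:     " + sameValue(prediction, label));
    }
}
```

The value comparison is true whenever the two numbers are equal. The reference comparison is true only if both variables point to the very same object, which I would not expect for two values boxed separately. Using pl._1().equals(pl._2()) should also work for class labels, although Double.equals treats NaN and signed zeros differently from the primitive == operator.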

Thanks,



