MLlib - Naive Bayes Java example bug

2014-11-03 Thread Dariusz Kobylarz

Hi,
I noticed a bug in the sample java code in MLlib - Naive Bayes docs page:
http://spark.apache.org/docs/1.1.0/mllib-naive-bayes.html

In the filter:

double accuracy = 1.0 * predictionAndLabel.filter(new Function<Tuple2<Double, Double>, Boolean>() {
    @Override public Boolean call(Tuple2<Double, Double> pl) {
      return pl._1() == pl._2();
    }
  }).count() / test.count();

it compares the Double objects by reference, whereas it should compare their values:

double accuracy = 1.0 * predictionAndLabel.filter(new Function<Tuple2<Double, Double>, Boolean>() {
    @Override public Boolean call(Tuple2<Double, Double> pl) {
      return pl._1().doubleValue() == pl._2().doubleValue();
    }
  }).count() / test.count();

With the Java version, the accuracy is always 0.0. The Scala code outputs the
correct value, 1.0.
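For anyone who wants to see the difference outside Spark, here is a minimal standalone sketch. The class name BoxedDoubleComparison and the helper valuesEqual are made up for illustration and are not part of the MLlib docs. It boxes the same value twice and compares the results three ways. The result of == depends on whether the two variables refer to the same object, so no output is claimed for that line; on typical JVMs the boxed values are separate objects.

```java
public class BoxedDoubleComparison {

    // Hypothetical helper: compares the unboxed primitive values,
    // the same way the corrected filter does.
    public static boolean valuesEqual(Double a, Double b) {
        return a.doubleValue() == b.doubleValue();
    }

    public static void main(String[] args) {
        // The same value boxed twice, standing in for a prediction and a label.
        Double prediction = Double.valueOf(1.0);
        Double label = Double.valueOf(1.0);

        // Reference comparison: true only if both variables refer to the same object.
        System.out.println("==            : " + (prediction == label));

        // Value comparisons: these look at the numbers, not the object identity.
        System.out.println("doubleValue() : " + valuesEqual(prediction, label));
        System.out.println("equals()      : " + prediction.equals(label));
    }
}
```

Double.equals also compares by value, although it treats NaN and signed zeros differently from the primitive == operator, so doubleValue() stays closest to what the Scala example does.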

Thanks,






Re: MLlib - Naive Bayes Java example bug

2014-11-03 Thread Sean Owen
Yes, good catch. I also think the 1.0 * is suboptimal as a cast to
double. I searched for similar issues and didn't see any. Open a PR.
I'm not even sure this is enough to warrant a JIRA, but feel free to
open one as well.
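To illustrate the point about the cast, here is a small sketch in plain Java with no Spark dependency. The class AccuracyRatio and the method accuracy are hypothetical names, and the two long arguments stand in for the results of the two count() calls. An explicit (double) cast on the numerator forces floating-point division and states the intent more directly than multiplying by 1.0.

```java
public class AccuracyRatio {

    // Hypothetical helper: 'correct' and 'total' stand in for the two count() results.
    public static double accuracy(long correct, long total) {
        // The cast promotes the numerator to double, so the division is floating-point.
        return (double) correct / total;
    }

    public static void main(String[] args) {
        long correct = 3;
        long total = 4;

        // Without a cast (or the 1.0 * trick), long / long truncates toward zero.
        System.out.println("integer division: " + (correct / total));
        System.out.println("with cast       : " + accuracy(correct, total));
    }
}
```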

On Mon, Nov 3, 2014 at 6:46 PM, Dariusz Kobylarz
darek.kobyl...@gmail.com wrote:
> Hi,
> I noticed a bug in the sample java code in MLlib - Naive Bayes docs page:
> http://spark.apache.org/docs/1.1.0/mllib-naive-bayes.html
>
> In the filter:
>
> double accuracy = 1.0 * predictionAndLabel.filter(new
> Function<Tuple2<Double, Double>, Boolean>() {
> @Override public Boolean call(Tuple2<Double, Double> pl) {
>   return pl._1() == pl._2();
> }
>   }).count() / test.count();
>
> it tests Double object by references whereas it should test their values:
>
> double accuracy = 1.0 * predictionAndLabel.filter(new
> Function<Tuple2<Double, Double>, Boolean>() {
> @Override public Boolean call(Tuple2<Double, Double> pl) {
>   return pl._1().doubleValue() == pl._2().doubleValue();
> }
>   }).count() / test.count();
>
> The Java version accuracy is always 0.0. Scala code outputs the correct
> value 1.0
>
> Thanks,




