You should never use the training data to measure your prediction accuracy. Always use a fresh dataset (test data) for this purpose.
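
To make that concrete, here is a minimal sketch in plain Java (deliberately not the Spark API) of holding out part of the data and scoring only on the held-out part. The class HoldoutAccuracy, its helper methods, and the weight vector are hypothetical names and values I made up for illustration; the weights just stand in for a model you would fit on the training part only. In Spark itself I believe the usual way to get the two parts is randomSplit on the RDD, but please check that against the API docs for your version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class HoldoutAccuracy {

    /** A labeled example: label is 0 (negative) or 1 (positive). */
    static final class Example {
        final int label;
        final double[] features;

        Example(int label, double[] features) {
            this.label = label;
            this.features = features;
        }
    }

    /** Shuffles a copy of the data with a fixed seed and returns [train, test]. */
    static List<List<Example>> split(List<Example> data, double trainFraction, long seed) {
        List<Example> shuffled = new ArrayList<>(data);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(trainFraction * shuffled.size());
        List<List<Example>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(0, cut)));
        parts.add(new ArrayList<>(shuffled.subList(cut, shuffled.size())));
        return parts;
    }

    /** Raw linear score w^T x (no intercept, to keep the sketch short). */
    static double score(double[] w, double[] x) {
        double s = 0.0;
        for (int i = 0; i < w.length; i++) {
            s += w[i] * x[i];
        }
        return s;
    }

    /** Fraction of examples where (score >= 0 ? 1 : 0) equals the label. */
    static double accuracy(double[] w, List<Example> examples) {
        if (examples.isEmpty()) {
            return 0.0;
        }
        int correct = 0;
        for (Example e : examples) {
            int predicted = score(w, e.features) >= 0 ? 1 : 0;
            if (predicted == e.label) {
                correct++;
            }
        }
        return (double) correct / examples.size();
    }

    public static void main(String[] args) {
        // Toy data, invented for this sketch.
        List<Example> data = Arrays.asList(
            new Example(1, new double[] {2.0, 1.0}),
            new Example(0, new double[] {1.0, 3.0}),
            new Example(1, new double[] {3.0, 0.5}),
            new Example(0, new double[] {0.5, 2.0}),
            new Example(1, new double[] {1.5, 1.0}),
            new Example(0, new double[] {1.0, 1.5}));

        List<List<Example>> parts = split(data, 0.7, 42L);
        List<Example> train = parts.get(0);
        List<Example> test = parts.get(1);

        // Hypothetical weights standing in for a model fitted on `train` only.
        double[] w = {1.0, -1.0};

        System.out.println("Train size = " + train.size() + ", test size = " + test.size());
        System.out.println("Held-out accuracy = " + accuracy(w, test));
    }
}
```

The point is only that the examples passed to accuracy(...) are ones the model never saw while training; the fixed seed keeps the split reproducible between runs.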

On Sun, Nov 29, 2015 at 8:36 AM, Jeff Zhang <[email protected]> wrote:

> I think this should represent the label of LabeledPoint (0 means negative, 1
> means positive)
> http://spark.apache.org/docs/latest/mllib-data-types.html#labeled-point
>
> The document you mention is for the mathematical formula, not the
> implementation.
>
> On Sun, Nov 29, 2015 at 9:13 AM, Tarek Elgamal <[email protected]>
> wrote:
>
>> According to the documentation
>> <http://spark.apache.org/docs/latest/mllib-linear-methods.html>, by
>> default, if w^T x ≥ 0 then the outcome is positive, and negative otherwise. I
>> suppose that w^T x is the "score" in my case. If score is more than 0 and the
>> label is positive, then I return 1 which is correct classification and I
>> return zero otherwise. Do you have any idea how to classify a point as
>> positive or negative using this score or another function?
>>
>> On Sat, Nov 28, 2015 at 5:14 AM, Jeff Zhang <[email protected]> wrote:
>>
>>>     if((score >=0 && label == 1) || (score <0 && label == 0))
>>>     {
>>>       return 1; //correct classification
>>>     }
>>>     else
>>>       return 0;
>>>
>>> I suspect score is always between 0 and 1
>>>
>>> On Sat, Nov 28, 2015 at 10:39 AM, Tarek Elgamal <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to run the straightforward example of SVM but I am getting
>>>> low accuracy (around 50%) when I predict using the same data I used for
>>>> training. I am probably doing the prediction in a wrong way. My code is
>>>> below. I would appreciate any help.
>>>>
>>>> import java.util.List;
>>>>
>>>> import org.apache.spark.SparkConf;
>>>> import org.apache.spark.SparkContext;
>>>> import org.apache.spark.api.java.JavaRDD;
>>>> import org.apache.spark.api.java.function.Function;
>>>> import org.apache.spark.api.java.function.Function2;
>>>> import org.apache.spark.mllib.classification.SVMModel;
>>>> import org.apache.spark.mllib.classification.SVMWithSGD;
>>>> import org.apache.spark.mllib.regression.LabeledPoint;
>>>> import org.apache.spark.mllib.util.MLUtils;
>>>>
>>>> import scala.Tuple2;
>>>> import edu.illinois.biglbjava.readers.LabeledPointReader;
>>>>
>>>> public class SimpleDistSVM {
>>>>   public static void main(String[] args) {
>>>>     SparkConf conf = new SparkConf().setAppName("SVM Classifier Example");
>>>>     SparkContext sc = new SparkContext(conf);
>>>>     String inputPath=args[0];
>>>>
>>>>     // Read training data
>>>>     JavaRDD<LabeledPoint> data = MLUtils.loadLibSVMFile(sc, inputPath).toJavaRDD();
>>>>
>>>>     // Run training algorithm to build the model.
>>>>     int numIterations = 3;
>>>>     final SVMModel model = SVMWithSGD.train(data.rdd(), numIterations);
>>>>
>>>>     // Clear the default threshold.
>>>>     model.clearThreshold();
>>>>
>>>>     // Predict points in test set and map to an RDD of 0/1 values where
>>>>     // 0 is misclassification and 1 is correct classification
>>>>     JavaRDD<Integer> classification = data.map(new Function<LabeledPoint, Integer>() {
>>>>       public Integer call(LabeledPoint p) {
>>>>         int label = (int) p.label();
>>>>         Double score = model.predict(p.features());
>>>>         if((score >=0 && label == 1) || (score <0 && label == 0))
>>>>         {
>>>>           return 1; //correct classification
>>>>         }
>>>>         else
>>>>           return 0;
>>>>       }
>>>>     });
>>>>
>>>>     // sum up all values in the rdd to get the number of correctly
>>>>     // classified examples
>>>>     int sum=classification.reduce(new Function2<Integer, Integer, Integer>()
>>>>     {
>>>>       public Integer call(Integer arg0, Integer arg1)
>>>>           throws Exception {
>>>>         return arg0+arg1;
>>>>       }});
>>>>
>>>>     // compute accuracy as the percentage of the correctly classified
>>>>     // examples
>>>>     double accuracy=((double)sum)/((double)classification.count());
>>>>     System.out.println("Accuracy = " + accuracy);
>>>>   }
>>>> }
>>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>>
>
> --
> Best Regards
>
> Jeff Zhang
>

--
Thanks & Regards,

Fazlan Nazeem

*Software Engineer*

*WSO2 Inc*

Mobile : +94772338839
[email protected]
