Hi, I built a transformer model around Spark's SVM for binary classification. I basically implemented the predictRaw method for the classifier and classification model of the Spark API:

    override def predictRaw(dataMatrix: Vector): Vector = {
      // raw margin: w . x + b
      val m = weights.toBreeze.dot(dataMatrix.toBreeze) + intercept
      // raw scores for the negative and the positive class
      Vectors.dense(-m, m)
    }

I have an imbalanced text dataset. For author classification with a bag-of-words model in a OneVsRest setting, the scores of logistic regression and naive Bayes are very high, but the scores of the SVM are very low. I am using the standard SVM parameters with a maximum of 3000 iterations in OneVsRest. What might be the problem?

I am using the same features (200125), labels (9), ~1500 training instances, ~500 test instances, and OneVsRest for all the compared settings.

Thanks in advance...

Hayri Volkan Agun
PhD. Student - Anadolu University