In the other thread I had an issue with Python, so here I tried switching to Scala. The code is:
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.linalg.SparseVector
import org.apache.spark.mllib.classification.NaiveBayes
import scala.collection.mutable.ArrayBuffer

// Keep only lines that are non-null and not blank after trimming trailing whitespace.
// (Renamed from isEmpty, which was misleading: it returns true for NON-empty lines.)
def isNonEmpty(a: String): Boolean =
  a != null && !a.replaceAll("""(?m)\s+$""", "").isEmpty()

// Each line looks like: "<label>\t<index:value> <index:value> ...".
def parsePoint(a: String): LabeledPoint = {
  val values = a.split('\t')
  val feat = values(1).split(' ')
  val indices = ArrayBuffer.empty[Int]
  val featValues = ArrayBuffer.empty[Double]
  for (f <- feat) {
    val q = f.split(':')
    if (q.length == 2) {
      indices += q(0).toInt
      featValues += q(1).toDouble
    }
  }
  val vector = new SparseVector(2357815, indices.toArray, featValues.toArray)
  LabeledPoint(values(0).toDouble, vector)
}

val data = sc.textFile("data.txt")
val nonEmptyLines = data.filter(isNonEmpty)
val points = nonEmptyLines.map(parsePoint)
points.cache()

val model = new NaiveBayes().run(points)

On Thu, Apr 24, 2014 at 6:57 PM, Xiangrui Meng <men...@gmail.com> wrote:
> Do you mind sharing more code and error messages? The information you
> provided is too little to identify the problem. -Xiangrui
>
> On Thu, Apr 24, 2014 at 1:55 PM, John King <usedforprinting...@gmail.com> wrote:
> > Last command was:
> >
> > val model = new NaiveBayes().run(points)
> >
> > On Thu, Apr 24, 2014 at 4:27 PM, Xiangrui Meng <men...@gmail.com> wrote:
> >> Could you share the command you used and more of the error message?
> >> Also, is it an MLlib specific problem? -Xiangrui
> >>
> >> On Thu, Apr 24, 2014 at 11:49 AM, John King
> >> <usedforprinting...@gmail.com> wrote:
> >> > ./spark-shell: line 153: 17654 Killed
> >> > $FWDIR/bin/spark-class org.apache.spark.repl.Main "$@"
> >> >
> >> > Any ideas?
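For what it's worth, the index:value feature parsing can be exercised on its own, without a Spark cluster, to rule out a data-format problem. A minimal sketch (parseFeatures is a hypothetical helper mirroring the loop inside parsePoint, and the sample string is made up):

```scala
import scala.collection.mutable.ArrayBuffer

object ParseCheck {
  // Split a feature string of the form "idx:value idx:value ..." into
  // parallel arrays of indices and values, skipping malformed tokens,
  // the same way the for-loop in parsePoint does.
  def parseFeatures(s: String): (Array[Int], Array[Double]) = {
    val indices = ArrayBuffer.empty[Int]
    val values = ArrayBuffer.empty[Double]
    for (f <- s.split(' ')) {
      val q = f.split(':')
      if (q.length == 2) {
        indices += q(0).toInt
        values += q(1).toDouble
      }
    }
    (indices.toArray, values.toArray)
  }

  def main(args: Array[String]): Unit = {
    val (idx, vals) = parseFeatures("3:1.0 17:2.5 42:0.5")
    assert(idx.sameElements(Array(3, 17, 42)))
    assert(vals.sameElements(Array(1.0, 2.5, 0.5)))
    println("parse ok")
  }
}
```

Running this over a few sample lines from data.txt would at least confirm the features parse into the shape SparseVector expects before involving MLlib.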