Spark mllib throwing error
./spark-shell: line 153: 17654 Killed $FWDIR/bin/spark-class org.apache.spark.repl.Main $@

Any ideas?
Re: Spark mllib throwing error
Could you share the command you used and more of the error message? Also, is it an MLlib specific problem?

-Xiangrui

On Thu, Apr 24, 2014 at 11:49 AM, John King usedforprinting...@gmail.com wrote:
Re: Spark mllib throwing error
Last command was:

val model = new NaiveBayes().run(points)

On Thu, Apr 24, 2014 at 4:27 PM, Xiangrui Meng men...@gmail.com wrote:
Re: Spark mllib throwing error
Do you mind sharing more code and error messages? The information you provided is too little to identify the problem.

-Xiangrui

On Thu, Apr 24, 2014 at 1:55 PM, John King usedforprinting...@gmail.com wrote:
Re: Spark mllib throwing error
In the other thread I had an issue with Python. In this issue, I tried switching to Scala. The code is:

import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.linalg.SparseVector
import org.apache.spark.mllib.classification.NaiveBayes
import scala.collection.mutable.ArrayBuffer

// Keep only lines that are non-empty after stripping trailing whitespace.
def isEmpty(a: String): Boolean = a != null && !a.replaceAll("(?m)\\s+$", "").isEmpty()

// Parse a "label \t idx:val idx:val ..." line into a LabeledPoint.
def parsePoint(a: String): LabeledPoint = {
  val values = a.split('\t')
  val feat = values(1).split(' ')
  val indices = ArrayBuffer.empty[Int]
  val featValues = ArrayBuffer.empty[Double]
  for (f <- feat) {
    val q = f.split(':')
    if (q.length == 2) {
      indices += q(0).toInt
      featValues += q(1).toDouble
    }
  }
  val vector = new SparseVector(2357815, indices.toArray, featValues.toArray)
  LabeledPoint(values(0).toDouble, vector)
}

val data = sc.textFile("data.txt")
val empty = data.filter(isEmpty)
val points = empty.map(parsePoint)
points.cache()
val model = new NaiveBayes().run(points)

On Thu, Apr 24, 2014 at 6:57 PM, Xiangrui Meng men...@gmail.com wrote:
Re: Spark mllib throwing error
I don't see anything wrong with your code. Could you do points.count() to see how many training examples you have? Also, make sure you don't have negative feature values. The error message you sent did not say NaiveBayes went wrong, but the Spark shell was killed.

-Xiangrui

On Thu, Apr 24, 2014 at 4:05 PM, John King usedforprinting...@gmail.com wrote:
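(In the spark-shell the counting check is just `points.count()`, and the negative-value check is a filter over `points`. As a standalone sketch of the same validation on the raw text format, without a SparkContext — the sample lines and the helper name `parseFeatures` are hypothetical:)

```scala
// Parse a "label \t idx:val idx:val ..." line into (index, value) pairs,
// mirroring the parsePoint logic from the earlier message.
def parseFeatures(line: String): Array[(Int, Double)] = {
  val feat = line.split('\t')(1).split(' ')
  feat.flatMap { f =>
    val q = f.split(':')
    if (q.length == 2) Some((q(0).toInt, q(1).toDouble)) else None
  }
}

val sample = Seq("1\t3:1.0 7:2.0", "0\t2:-0.5 9:1.0")
val parsed = sample.map(parseFeatures)

println(parsed.size)                      // number of training examples
println(parsed.count(_.exists(_._2 < 0))) // examples with negative feature values
```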
Re: Spark mllib throwing error
It just displayed this error and stopped on its own. Do the lines of code mentioned in the error have anything to do with it?

On Thu, Apr 24, 2014 at 7:54 PM, Xiangrui Meng men...@gmail.com wrote:
Re: Spark mllib throwing error
I only see one risk: if your feature indices are not sorted, it might have undefined behavior. Other than that, I don't see anything suspicious.

-Xiangrui

On Thu, Apr 24, 2014 at 4:56 PM, John King usedforprinting...@gmail.com wrote:
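(One way to guard against unsorted indices, sketched under the assumption that the (index, value) pairs come from a parser like the earlier parsePoint: sort by index, merging any duplicate indices by summing, before constructing the SparseVector. The sample pairs here are hypothetical.)

```scala
val pairs = Array((7, 2.0), (3, 1.0), (7, 0.5)) // unsorted, with a duplicate index

val cleaned = pairs
  .groupBy(_._1)                                // collect entries per index...
  .map { case (i, vs) => (i, vs.map(_._2).sum) } // ...and merge duplicates by summing
  .toArray
  .sortBy(_._1)                                 // strictly increasing indices

val (indices, values) = cleaned.unzip
// new SparseVector(2357815, indices, values)   // then build the vector as before
println(indices.mkString(","))
```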