I have a large dataset of car categories and car descriptions. Given the
description of a car, I want the program to predict that car's category.
I decided to use multinomial naive Bayes. I assigned a unique id to each
word and encoded my whole category,description data with those ids.

//My input
2,25187 15095 22608 28756 17862 29523 499 32681 9830 24957 18993 19501 16596 17953 16596
20,1846 29058 16252 20446 9835
52,16861 808 26785 17874 18993 18993 18993 18269 34157 33811 18437 6004 2791 27923 19141
...
...
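
For illustration only, here is a minimal sketch of how a line in this format could be turned into multinomial features: a term-count map over a fixed vocabulary, so that every example has the same dimension. This is plain Python, not the code that produced the errors below and not the MLlib API. The names parse_line and vocab_size are hypothetical, and the vocabulary size of 40000 is an assumption chosen only because it exceeds the largest id in the sample.

```python
from collections import Counter


def parse_line(line, vocab_size):
    """Parse 'label,id id id ...' into (label, {word_id: count}).

    vocab_size is the assumed total number of distinct word ids; every
    example is meant to share this one dimension, however many words
    its description contains.
    """
    label_text, _, words_text = line.partition(",")
    label = int(label_text)
    # Count how often each word id occurs in the description.
    counts = Counter(int(token) for token in words_text.split())
    for word_id in counts:
        if not 0 <= word_id < vocab_size:
            raise ValueError(
                "word id %d is outside [0, %d)" % (word_id, vocab_size)
            )
    return label, dict(sorted(counts.items()))


# Hypothetical vocabulary size, larger than any id in the sample input.
VOCAB_SIZE = 40000

label, counts = parse_line("20,1846 29058 16252 20446 9835", VOCAB_SIZE)
print(label, counts)
```

The (index, count) pairs from such a map are the kind of data a sparse vector of size vocab_size would hold. A bound such as [-13,13) looks like what a vector of length 13 would report, so it may be worth checking whether each vector was sized by the number of words in its line rather than by the vocabulary size. That is a guess from the messages, not a confirmed diagnosis.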

Why do I have errors like:

//Errors

3 ERROR Executor: Exception in task 0.0 in stage 211.0 (TID 392)
java.lang.IndexOutOfBoundsException: 13 not in [-13,13)

ERROR Executor: Exception in task 1.0 in stage 211.0 (TID 393)
java.lang.IndexOutOfBoundsException: 17 not in [-17,17)

ERROR TaskSetManager: Task 0 in stage 211.0 failed 1 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
stage 211.0 failed 1 times, most recent failure: Lost task 0.0 in stage
211.0 (TID 392, localhost): java.lang.IndexOutOfBoundsException: 13 not in
[-13,13)

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203)





