Please find the attached worker log. I could see stream closed exception On Wed, Jan 7, 2015 at 10:51 AM, Xiangrui Meng <men...@gmail.com> wrote:
> Could you attach the executor log? That may help identify the root > cause. -Xiangrui > > On Mon, Jan 5, 2015 at 11:12 PM, Priya Ch <learnings.chitt...@gmail.com> > wrote: > > Hi All, > > > > Word2Vec and TF-IDF algorithms in spark mllib-1.1.0 are working only in > > local mode and not on distributed mode. Null pointer exception has been > > thrown. Is this a bug in spark-1.1.0 ? > > > > Following is the code: > > def main(args:Array[String]) > > { > > val conf=new SparkConf > > val sc=new SparkContext(conf) > > val > > > documents=sc.textFile("hdfs://IMPETUS-DSRV02:9000/nlp/sampletext").map(_.split(" > > ").toSeq) > > val hashingTF = new HashingTF() > > val tf= hashingTF.transform(documents) > > tf.cache() > > val idf = new IDF().fit(tf) > > val tfidf = idf.transform(tf) > > val rdd=tfidf.map { vec => println("vector is...."+vec) > > (10) > > } > > rdd.saveAsTextFile("/home/padma/usecase") > > > > } > > > > > > > > > > Exception thrown: > > > > 15/01/06 12:36:09 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 > with > > 2 tasks > > 15/01/06 12:36:10 INFO cluster.SparkDeploySchedulerBackend: Registered > > executor: > > Actor[akka.tcp:// > sparkexecu...@impetus-dsrv05.impetus.co.in:33898/user/Executor#-1525890167 > ] > > with ID 0 > > 15/01/06 12:36:10 INFO scheduler.TaskSetManager: Starting task 0.0 in > stage > > 0.0 (TID 0, IMPETUS-DSRV05.impetus.co.in, NODE_LOCAL, 1408 bytes) > > 15/01/06 12:36:10 INFO scheduler.TaskSetManager: Starting task 1.0 in > stage > > 0.0 (TID 1, IMPETUS-DSRV05.impetus.co.in, NODE_LOCAL, 1408 bytes) > > 15/01/06 12:36:10 INFO storage.BlockManagerMasterActor: Registering block > > manager IMPETUS-DSRV05.impetus.co.in:35130 with 2.1 GB RAM > > 15/01/06 12:36:12 INFO network.ConnectionManager: Accepted connection > from > > [IMPETUS-DSRV05.impetus.co.in/192.168.145.195:46888] > > 15/01/06 12:36:12 INFO network.SendingConnection: Initiating connection > to > > [IMPETUS-DSRV05.impetus.co.in/192.168.145.195:35130] > > 15/01/06 12:36:12 INFO network.SendingConnection: Connected to > > [IMPETUS-DSRV05.impetus.co.in/192.168.145.195:35130], 1 messages pending > > 15/01/06 12:36:12 INFO storage.BlockManagerInfo: Added > broadcast_1_piece0 in > > memory on IMPETUS-DSRV05.impetus.co.in:35130 (size: 2.1 KB, free: 2.1 > GB) > > 15/01/06 12:36:12 INFO storage.BlockManagerInfo: Added > broadcast_0_piece0 in > > memory on IMPETUS-DSRV05.impetus.co.in:35130 (size: 10.1 KB, free: 2.1 > GB) > > 15/01/06 12:36:13 INFO storage.BlockManagerInfo: Added rdd_3_1 in memory > on > > IMPETUS-DSRV05.impetus.co.in:35130 (size: 280.0 B, free: 2.1 GB) > > 15/01/06 12:36:13 INFO storage.BlockManagerInfo: Added rdd_3_0 in memory > on > > IMPETUS-DSRV05.impetus.co.in:35130 (size: 416.0 B, free: 2.1 GB) > > 15/01/06 12:36:13 WARN scheduler.TaskSetManager: Lost task 1.0 in stage > 0.0 > > (TID 1, IMPETUS-DSRV05.impetus.co.in): java.lang.NullPointerException: > > org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) > > org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) > > > > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) > > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) > > org.apache.spark.rdd.RDD.iterator(RDD.scala:229) > > > org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62) > > org.apache.spark.scheduler.Task.run(Task.scala:54) > > > > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177) > > > > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > > > > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > > java.lang.Thread.run(Thread.java:722) > > > > > > Thanks, > > Padma Ch >
spark-rtauser-org.apache.spark.deploy.worker.Worker-1-IMPETUS-DSRV02.out
Description: Binary data
--------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org