Re: Spark Streaming not processing file with particular number of entries

2014-06-13 Thread Tathagata Das
This is very odd. If it is running fine on Mesos, I don't see an obvious reason why it won't work on a Spark standalone cluster. Is the 0.4-million-record file already present in the monitored directory when the context is started? In that case, the file will not be picked up (unless textFileStream is created
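
[Editor's note: a minimal Java sketch of the behavior described above, with an illustrative directory path and batch interval, not code from this thread. textFileStream only reports files that appear in the monitored directory after the StreamingContext starts, so a file that is already there is silently ignored.]

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Duration;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    public class TextFileStreamSketch {
      public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("TextFileStreamSketch");
        // 10-second batch interval, chosen arbitrarily for the example.
        JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(10000));

        // Only files created in this directory *after* start() are picked up;
        // files already present when the context starts are ignored.
        JavaDStream<String> lines = jssc.textFileStream("/path/to/monitored/dir");
        lines.count().print();

        jssc.start();
        // Move or copy the input file into the directory now, not before.
        jssc.awaitTermination();
      }
    }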

Re: Spark Streaming not processing file with particular number of entries

2014-06-13 Thread praveshjain1991
There doesn't seem to be any obvious reason - that's why it looks like a bug. The 0.4-million-record file is present in the directory when the context is started, the same as all the other files (which are processed just fine by the application). In the logs we can see that the file is being picked up by

Re: Spark Streaming not processing file with particular number of entries

2014-06-13 Thread praveshjain1991
If you look at the file 400k.output, you'll see the string file:/newdisk1/praveshj/pravesh/data/input/testing4lk.txt. This file contains 0.4 million records. So the file is being picked up, but the app goes on to hang later on. Also, you mentioned the term standalone cluster in your previous reply
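
[Editor's note: one way to see how far a batch gets is to log each batch's record count from foreachRDD; a batch that is picked up but never finishes counting then shows up clearly in the driver output. A rough, hypothetical Java sketch follows - the monitored directory is taken from the path quoted above, and the batch interval is illustrative.]

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.streaming.Duration;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    public class BatchCountDebug {
      public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("BatchCountDebug");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(10000));
        JavaDStream<String> lines = jssc.textFileStream("/newdisk1/praveshj/pravesh/data/input");

        // Counting inside foreachRDD forces each batch to be fully read, so the
        // driver output shows whether the 0.4-million-record batch ever completes.
        lines.foreachRDD(new Function<JavaRDD<String>, Void>() {
          @Override
          public Void call(JavaRDD<String> rdd) {
            System.out.println("Batch record count: " + rdd.count());
            return null;
          }
        });

        jssc.start();
        jssc.awaitTermination();
      }
    }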

Re: Spark Streaming not processing file with particular number of entries

2014-06-11 Thread praveshjain1991
Well, I was able to get it to work by running Spark over Mesos. But it looks like a bug when running Spark standalone. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-not-processing-file-with-particular-number-of-entries-tp6694p7382.html
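
[Editor's note: at the deployment level, the only intended difference between the run that worked and the one that hangs is the cluster manager selected by the master URL; a minimal Java sketch with hypothetical host names and default ports, not the poster's configuration.]

    import org.apache.spark.SparkConf;

    public class MasterUrlSketch {
      public static void main(String[] args) {
        // Standalone cluster manager (the setup reported to hang).
        SparkConf standalone = new SparkConf()
            .setAppName("StreamingLR")
            .setMaster("spark://master-host:7077");   // hypothetical host

        // Mesos cluster manager (the setup reported to work).
        SparkConf mesos = new SparkConf()
            .setAppName("StreamingLR")
            .setMaster("mesos://mesos-master:5050");  // hypothetical host

        System.out.println(standalone.get("spark.master"));
        System.out.println(mesos.get("spark.master"));
      }
    }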

Re: Spark Streaming not processing file with particular number of entries

2014-06-06 Thread praveshjain1991
Hi, I am using Spark 1.0.0 on a 3-node cluster with 1 master and 2 slaves. I am trying to run the LR algorithm over Spark Streaming. package org.apache.spark.examples.streaming; import java.io.BufferedReader; import java.io.BufferedWriter; import java.io.FileWriter; import
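
[Editor's note: the code in the original post is cut off after the imports. The following is a minimal, hypothetical sketch of one way to drive MLlib's logistic regression from a file-based DStream in Spark 1.0, assuming "LR" means logistic regression, a "label,f1 f2 f3 ..." record format, and an illustrative directory and batch interval. It is not the poster's actual program.]

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.mllib.classification.LogisticRegressionModel;
    import org.apache.spark.mllib.classification.LogisticRegressionWithSGD;
    import org.apache.spark.mllib.linalg.Vectors;
    import org.apache.spark.mllib.regression.LabeledPoint;
    import org.apache.spark.streaming.Duration;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    public class StreamingLRSketch {
      public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("StreamingLRSketch");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(10000));

        JavaDStream<String> lines = jssc.textFileStream("/path/to/monitored/dir");

        // Parse "label,f1 f2 f3 ..." records into LabeledPoints.
        JavaDStream<LabeledPoint> points = lines.map(new Function<String, LabeledPoint>() {
          @Override
          public LabeledPoint call(String line) {
            String[] parts = line.split(",");
            String[] raw = parts[1].split(" ");
            double[] features = new double[raw.length];
            for (int i = 0; i < raw.length; i++) {
              features[i] = Double.parseDouble(raw[i]);
            }
            return new LabeledPoint(Double.parseDouble(parts[0]), Vectors.dense(features));
          }
        });

        // Train a logistic regression model on each non-empty batch as it arrives.
        points.foreachRDD(new Function<JavaRDD<LabeledPoint>, Void>() {
          @Override
          public Void call(JavaRDD<LabeledPoint> rdd) {
            if (rdd.count() > 0) {
              LogisticRegressionModel model =
                  LogisticRegressionWithSGD.train(rdd.rdd(), 20);
              System.out.println("Trained model with weights: " + model.weights());
            }
            return null;
          }
        });

        jssc.start();
        jssc.awaitTermination();
      }
    }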

Re: Spark Streaming not processing file with particular number of entries

2014-06-05 Thread praveshjain1991
The same issue persists in Spark 1.0.0 as well (I was using 0.9.1 earlier). Any suggestions are welcome. -- Thanks -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-not-processing-file-with-particular-number-of-entries-tp6694p7056.html