This is very odd. If it is running fine on Mesos, I don't see an obvious
reason why it won't work on a Spark standalone cluster.
Is the 0.4-million-record file already present in the monitored directory when the
context is started? In that case, the file will not be picked up (unless the
underlying fileStream is created with newFilesOnly set to false).
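For context, Spark Streaming's file stream only picks up files whose modification time falls after the stream starts. The rough shape of that filtering rule can be sketched in plain Java (this is an illustration, not Spark's actual implementation; the class and method names here are made up):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class NewFileFilterSketch {

    // Illustrative stand-in for the "only files newer than the stream
    // start time are processed" rule applied to a monitored directory.
    static boolean isPickedUp(File f, long streamStartMillis) {
        return f.lastModified() >= streamStartMillis;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("monitored");

        // A file that existed before the streaming context started.
        File preexisting = new File(dir.toFile(), "preexisting.txt");
        Files.writeString(preexisting.toPath(), "old data");
        preexisting.setLastModified(System.currentTimeMillis() - 60_000);

        long streamStart = System.currentTimeMillis();

        // A file dropped into the directory after the stream started.
        File fresh = new File(dir.toFile(), "fresh.txt");
        Files.writeString(fresh.toPath(), "new data");
        fresh.setLastModified(streamStart + 60_000);

        System.out.println("preexisting picked up: "
                + isPickedUp(preexisting, streamStart));  // false
        System.out.println("fresh picked up: "
                + isPickedUp(fresh, streamStart));        // true
    }
}
```

If the 0.4-million-record file really is in the directory before the context starts, this filter alone would predict it is skipped, not hung on, which is why the log evidence below matters.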
There doesn't seem to be any obvious reason; that's why it looks like a bug.
The 0.4-million-record file is present in the directory when the context is
started, the same as all the other files (which are processed just fine by the
application). In the logs we can see that the file is being picked up by the
application.
If you look at the file 400k.output, you'll see the string
file:/newdisk1/praveshj/pravesh/data/input/testing4lk.txt
This file contains 0.4 million records. So the file is being picked up, but
the app goes on to hang later.
Also, you mentioned the term "standalone cluster" in your previous reply.
Well, I was able to get it to work by running Spark over Mesos, but it looks
like a bug when running Spark in standalone mode.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-not-processing-file-with-particular-number-of-entries-tp6694p7382.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi,
I am using Spark 1.0.0 on a 3-node cluster with 1 master and 2 slaves. I
am trying to run an LR algorithm over Spark Streaming.
package org.apache.spark.examples.streaming;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileWriter;
The same issue persists in Spark 1.0.0 as well (I was using 0.9.1 earlier). Any
suggestions are welcome.
--
Thanks
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-not-processing-file-with-particular-number-of-entries-tp6694p7056.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.