Hi,

I have a streaming application that reads batches from Flume, does some
transformations and then writes parquet files to HDFS.

The problem I have right now is that the scheduling delay is very high and
keeps growing over time. I have seen it reach 24 hours. The processing time
for each batch is usually steady at 50s or less.

The workers and master are idle most of the time. Any ideas why the
scheduling delay would be so high when the processing time is low?
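
For reference, here is a toy model of how scheduling delay behaves when
batches are processed one at a time, which is Spark Streaming's default. It
is only a sketch: the function name is hypothetical, and the 30s and 60s
batch intervals are made-up values, since the batch interval of this job is
not stated above. Only the 50s processing time comes from the numbers in
this message.

```python
def scheduling_delays(batch_interval_s, processing_time_s, num_batches):
    """Return the scheduling delay (seconds) of each batch.

    Assumes batches are processed strictly one after another and that
    every batch takes the same time to process (both are simplifications).
    """
    delays = []
    free_at = 0.0  # time at which the previous batch finishes processing
    for i in range(num_batches):
        # Batch i becomes ready at the end of its interval.
        submitted = (i + 1) * batch_interval_s
        # It can only start once the previous batch is done.
        start = max(submitted, free_at)
        delays.append(start - submitted)
        free_at = start + processing_time_s
    return delays


# Hypothetical intervals; 50s is the processing time quoted above.
print(scheduling_delays(60, 50, 4))  # interval longer than processing time
print(scheduling_delays(30, 50, 4))  # interval shorter than processing time
```

In this model the delay stays at zero whenever the processing time is below
the batch interval, and grows by a fixed amount per batch otherwise, so the
batch interval is the first number worth checking against the 50s figure.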

Thanks

Juan



