Hi, I have a streaming application that reads batches from Flume, does some transformations, and then writes Parquet files to HDFS.
The problem I have right now is that the scheduling delays are really high and keep growing over time. I have seen them go up to 24 hours. The processing time for each batch is usually steady at 50s or less. The workers and master are pretty much idle most of the time.

Any ideas why the scheduling delay would be so high when the processing time is low?

Thanks,
Juan