Re: Using dynamic allocation and shuffle service in Standalone Mode

2016-03-08 Thread jleaniz
You've got to start the shuffle service on all your workers. There's a script for that in the 'sbin' directory.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Using-dynamic-allocation-and-shuffle-service-in-Standalone-Mode-tp26430p26434.html
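For anyone landing on this thread later, here is a rough sketch of the application-side settings that usually go together with the worker-side shuffle service. The helper name, the master URL, and the executor bounds are made up for illustration; the property names are the standard Spark ones. It only assembles a spark-submit command line and does not start anything on the workers, so the shuffle service still has to be running there as described above.

```python
# Hypothetical helper (not part of Spark): assembles spark-submit arguments
# that turn on dynamic allocation together with the external shuffle service.
def dynamic_allocation_args(master_url, min_executors=1, max_executors=10):
    conf = {
        # Let Spark add and remove executors based on workload.
        "spark.dynamicAllocation.enabled": "true",
        # Executors fetch shuffle files through the external service,
        # so shuffle data survives when an executor is removed.
        "spark.shuffle.service.enabled": "true",
        # Illustrative bounds only; pick values that fit your cluster.
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }
    args = ["spark-submit", "--master", master_url]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # "master-host" is a placeholder for your standalone master.
    print(" ".join(dynamic_allocation_args("spark://master-host:7077")))
```

Append your application jar or script and its arguments to the returned list before running it.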

Streaming job delays

2016-03-08 Thread jleaniz
Hi, I have a streaming application that reads batches from Flume, does some transformations and then writes Parquet files to HDFS. The problem I have right now is that the scheduling delays are very high and keep growing over time. I have seen them go up to 24 hours. The processing