Maybe you are experiencing a problem with FileOutputCommitter vs. DirectCommitter while working with S3? Do you have HDFS available, so you can try it there to verify?
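One thing worth trying is the "version 2" FileOutputCommitter algorithm (available with Hadoop 2.7+), which commits task output directly at task-commit time instead of renaming everything serially at job commit. A minimal sketch, assuming you build your own SparkContext (the app name is illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: enable the v2 FileOutputCommitter algorithm (Hadoop 2.7+).
// It moves each task's output into place at task commit, avoiding the
// serial per-file rename/copy step at job commit that hurts on S3.
val conf = new SparkConf()
  .setAppName("s3-write-test") // hypothetical app name
  .set("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
val sc = new SparkContext(conf)
```

The same property can be passed on the command line via `spark-submit --conf`. Note this only speeds up the commit; it does not address S3's consistency caveats for direct-style committers.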
Committing in S3 will copy all partitions one by one from `_temporary` to your final destination bucket, so this stage might become a bottleneck (which is why reducing the number of partitions might "solve" it). There was a thread on this mailing list about this problem a few weeks ago.

On 7 March 2016 at 23:53, Andy Davidson <a...@santacruzintegration.com> wrote:

> We just deployed our first streaming apps. The next step is making sure
> they run reliably.
>
> We have spent a lot of time reading the various programming guides and
> looking at the standalone cluster manager app performance web pages.
>
> Looking at the streaming tab and the stages tab has been the most helpful
> in tuning our app. However, we do not understand how memory and # cores
> affect throughput and performance. Usually adding memory is the cheapest
> way to improve performance.
>
> When we have a single receiver, we call spark-submit --total-executor-cores
> 2. Changing the value does not seem to change throughput. Our bottleneck
> was S3 write time, saveAsTextFile(). Reducing the number of partitions
> dramatically reduces S3 write times.
>
> Adding memory also does not improve performance.
>
> *I would think that adding more cores would allow more concurrent tasks to
> run. That is to say, reducing num partitions should slow things down.*
>
> What are best practices?
>
> Kind regards
>
> Andy
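To illustrate the partition-reduction workaround: fewer output partitions means fewer files for the committer to copy out of `_temporary`, which is why it helps the S3 write stage even though it reduces task parallelism. A minimal sketch, assuming an existing SparkContext `sc`; the bucket paths and the partition count are hypothetical:

```scala
// Sketch: reduce the number of output files before writing to S3.
// coalesce() lowers the partition count without a full shuffle, so
// saveAsTextFile() produces fewer part-files for the committer to move.
val lines = sc.textFile("s3a://my-bucket/input/")   // hypothetical path
lines
  .coalesce(16)                                     // illustrative value; tune per data volume
  .saveAsTextFile("s3a://my-bucket/output/")        // hypothetical path
```

This is a trade-off rather than a fix: you shrink the commit-time copy at the cost of fewer concurrent write tasks, which is consistent with what you observed (fewer partitions being faster despite less parallelism).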