Thank you for your input. I will test it out and share my findings.


On Thursday, July 14, 2016, CosminC <ciob...@adobe.com> wrote:

> I didn't have time to investigate much further, but the one thing that
> stood out is that partitioning was no longer working on 1.6.1. This would
> definitely explain the 2x performance loss.
>
> Checking the 1.5.1 Spark logs for the same application showed that our
> partitioner was working correctly: after the DStream / RDD creation, a
> user session was only processed on a single machine. Running on top of
> 1.6.1, though, the session was processed on up to 4 machines in a 5-node
> cluster (including the driver), with a lot of redundant operations. We use
> a custom but very simple partitioner which extends HashPartitioner; it
> partitions on a case class that has a single String parameter.
>
> Speculative execution is turned off by default, and we never enabled it,
> so it's not that.
>
> Right now we're postponing any Spark upgrade, and we'll probably try to
> upgrade directly to Spark 2.0, hoping the partitioning issue is no longer
> present there.

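For reference, a partitioner along the lines described above might look like
the sketch below. This is a minimal illustration rather than the original
code: the SessionKey case class, its field name, and the partition count are
assumptions, since the actual implementation isn't shown in the thread.

    import org.apache.spark.HashPartitioner
    import org.apache.spark.rdd.RDD

    // Hypothetical key type: a case class with a single String parameter,
    // as described above. Case classes get structural hashCode/equals for
    // free, so HashPartitioner can send every record for a given session
    // to the same partition.
    case class SessionKey(sessionId: String)

    // A very simple custom partitioner extending HashPartitioner. With no
    // overrides it inherits getPartition, i.e. the key's hashCode modulo
    // the number of partitions.
    class SessionPartitioner(partitions: Int) extends HashPartitioner(partitions)

    // Usage sketch: pin each session's records to a single partition
    // (the partition count 16 is arbitrary here).
    def bySession(events: RDD[(SessionKey, String)]): RDD[(SessionKey, String)] =
      events.partitionBy(new SessionPartitioner(16))

With a setup like this, records for one key should stay on a single machine;
and since spark.speculation is false by default (as noted above), duplicate
speculative tasks can be ruled out as the cause of the redundant work.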