If you fill up the cache, 1.6.0+ will suffer performance degradation from GC thrashing. To avoid this, you can set spark.memory.useLegacyMode to true, lower spark.memory.fraction to 0.66, or set spark.executor.extraJavaOptions to -XX:NewRatio=3.
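For reference, these are plain Spark configuration properties, so they can be set on the SparkConf when the context is built (a sketch only - the three settings are alternative workarounds, not meant to be combined, and the values are the ones suggested above):

```scala
import org.apache.spark.SparkConf

// Sketch: the three workarounds above, applied via SparkConf.
// Pick one - they are alternatives, not a combined recipe.
val conf = new SparkConf()
  .set("spark.memory.useLegacyMode", "true")                   // revert to the pre-1.6 memory manager
//  .set("spark.memory.fraction", "0.66")                      // shrink unified memory, leaving more heap for user objects
//  .set("spark.executor.extraJavaOptions", "-XX:NewRatio=3")  // old gen gets 3x the young gen on executors
```

The same properties can equally be passed as `--conf` flags to spark-submit if you prefer not to touch the code.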
I think my colleague filed a ticket for this issue, but I can't find it now. So treat it as unverified rumor for now, and try it for yourself if you're out of better ideas :). Good luck!

On Sat, Jun 4, 2016 at 11:49 AM, Cosmin Ciobanu <ciob...@adobe.com> wrote:

> Microbatch is 20 seconds. We're not using window operations.
>
> The graphs are for a test cluster, and the entire load is artificially
> generated by load tests (100k / 200k generated sessions).
>
> We've performed a few more performance tests. On the same 5-node cluster,
> with the same application:
>
> - Spark 1.5.1 handled 170k+ generated sessions for 24 hours with no
>   scheduling delay. The limit seems to be around 180k, above which
>   scheduling delay starts to increase.
> - Spark 1.6.1 had a constant upward-trending scheduling delay from the
>   beginning for 100k+ generated sessions (this is also mentioned in the
>   initial post). The load test was stopped after 25 minutes, as scheduling
>   delay had reached 3.5 minutes.
>
> P.S. Florin and I will be in SF next week, attending the Spark Summit on
> Tuesday and Wednesday. We can meet and go into more details there - is
> anyone working on Spark Streaming available?
>
> Cosmin
>
> *From:* Mich Talebzadeh <mich.talebza...@gmail.com>
> *Date:* Saturday 4 June 2016 at 12:33
> *To:* Florin Broască <florin.broa...@gmail.com>
> *Cc:* David Newberger <david.newber...@wandcorp.com>, Adrian Tanase
> <atan...@adobe.com>, "user@spark.apache.org" <user@spark.apache.org>,
> ciobanu <ciob...@adobe.com>
> *Subject:* Re: [REPOST] Severe Spark Streaming performance degradation
> after upgrading to 1.6.1
>
> batch interval I meant
>
> thx
>
> Dr Mich Talebzadeh
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> On 4 June 2016 at 10:32, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
> I may have missed these, but:
>
> What are the window interval, windowLength, and slideInterval?
>
> Has the volume of ingest data (Kafka streaming) changed recently without
> you being aware of it?
>
> HTH
>
> On 4 June 2016 at 09:50, Florin Broască <florin.broa...@gmail.com> wrote:
>
> Hi David,
>
> Thanks for looking into this. This is how the processing time looks:
>
> [image: inline image 1]
>
> Appreciate any input,
> Florin
>
> On Fri, Jun 3, 2016 at 3:22 PM, David Newberger
> <david.newber...@wandcorp.com> wrote:
>
> What does your processing time look like? Is it consistently within that
> 20-second micro-batch window?
>
> *David Newberger*
>
> *From:* Adrian Tanase [mailto:atan...@adobe.com]
> *Sent:* Friday, June 3, 2016 8:14 AM
> *To:* user@spark.apache.org
> *Cc:* Cosmin Ciobanu
> *Subject:* [REPOST] Severe Spark Streaming performance degradation after
> upgrading to 1.6.1
>
> Hi all,
>
> Trying to repost this question from a colleague on my team - somehow his
> subscription is not active:
>
> http://apache-spark-user-list.1001560.n3.nabble.com/Severe-Spark-Streaming-performance-degradation-after-upgrading-to-1-6-1-td27056.html
>
> Appreciate any thoughts,
> -adrian
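On the window question upthread: in Spark Streaming, windowLength and slideInterval must both be multiples of the batch interval, since a window is assembled from whole micro-batches. A plain-Scala sketch of that relationship (no Spark required - the object name and the 60s/40s window values are illustrative, only the 20s batch interval comes from the thread):

```scala
object WindowSketch {
  // The thread's 20-second micro-batch; window parameters must be multiples of it.
  val batchIntervalSec = 20
  val windowLengthSec  = 60 // each window spans 3 consecutive micro-batches
  val slideIntervalSec = 40 // a new window starts every 2 micro-batches

  // batches(i) = sessions counted in micro-batch i.
  // Returns one aggregated total per complete window, like a windowed count would.
  def windowedCounts(batches: Seq[Int]): Seq[Int] = {
    val perWindow = windowLengthSec / batchIntervalSec
    val perSlide  = slideIntervalSec / batchIntervalSec
    batches
      .sliding(perWindow, perSlide)        // group batches into overlapping windows
      .filter(_.length == perWindow)       // drop the trailing partial window
      .map(_.sum)                          // aggregate within each window
      .toSeq
  }
}
```

With six batches of counts 1..6, the sketch yields two full windows: batches 1-3 and batches 3-5, whose sums overlap on batch 3 - which is exactly why windowed aggregations cost more per interval than plain per-batch ones.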