Spark Streaming getting slower

2016-06-09 Thread John Simon
Hi, I'm running Spark Streaming with Kafka Direct Stream; the batch interval is 10 seconds. After running for about 72 hours, the batch processing time almost doubles. I didn't find anything wrong in the JVM GC logs, but I did find that the broadcast variable reading time is increasing, like this: initially: ``` 1

Re: Spark Streaming getting slower

2016-06-09 Thread John Simon
Sorry, forgot to mention that I don't use broadcast variables. That's why I'm puzzled here. -- John Simon On Thu, Jun 9, 2016 at 11:09 AM, John Simon wrote: > Hi, > > I'm running Spark Streaming with Kafka Direct Stream, batch interval > is 10 seconds. > After running about 72 hours, the batch p

Long Running Spark Streaming getting slower

2016-06-10 Thread john.simon
FYI, we're running on AWS EMR with Spark version 1.6.1, in YARN client mode. Attached is the Spark application environment settings file. -- John Simon environment.txt (7K) <http://apache-spark-user-list.1001560.n3.nabble.com/attachment/27138/0/environment.txt> -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Lo

Re: Long Running Spark Streaming getting slower

2016-06-10 Thread Mich Talebzadeh
> mode. > Attached spark application environment settings file. > > -- > John Simon > > *environment.txt* (7K) Download Attachment > <http://apache-spark-user-list.1001560.n3.nabble.com/attachment/27138/0/environment.txt> > > --

Re: Long Running Spark Streaming getting slower

2016-06-10 Thread John Simon
FYI, we're running on AWS EMR with Spark version 1.6.1, in YARN client >> mode. >> Attached spark application environment settings file. >> >> -- >> John Simon >> >> *environment.txt* (7K) Download Attachment >> <http://apache-spark-user-list.10

Re: Long Running Spark Streaming getting slower

2016-06-10 Thread Mich Talebzadeh
>>> FYI, we're running on AWS EMR with Spark version 1.6.1, in YARN client >>> mode. >>> Attached spark application environment settings file. >>> >>> -- >>> John Simon >>> >>> *environment.txt* (7K) Download Attachment >>