Hi, According to the docs ( https://spark.apache.org/docs/latest/api/python/pyspark.streaming.html#pyspark.streaming.DStream.reduceByKeyAndWindow), filerFunc can be used to retain expiring keys. I do not want to retain any expiring key, so I do not understand how can this help me stabilize it. Please correct me if this is not the case.
I am also specifying both reduceFunc and invReduceFunc. Can you can a sample code of what you are using. Thanks. On Fri, Jun 17, 2016 at 3:43 AM, N B <nb.nos...@gmail.com> wrote: > We had this same issue with the reduceByKeyAndWindow API that you are > using. For fixing this issue, you have to use different flavor of that > API, specifically the 2 versions that allow you to give a 'Filter function' > to them. Putting in the filter functions helped stabilize our application > too. > > HTH > NB > > > On Sun, Jun 12, 2016 at 11:19 PM, Roshan Singh <singh.rosha...@gmail.com> > wrote: > >> Hi all, >> I have a python streaming job which is supposed to run 24x7. I am unable >> to stabilize it. The job just counts no of links shared in a 30 minute >> sliding window. I am using reduceByKeyAndWindow operation with a batch of >> 30 seconds, slide interval of 60 seconds. >> >> The kafka queue has a rate of nearly 2200 messages/second which can >> increase to 3000 but the mean is 2200. >> >> I have played around with batch size, slide interval, and by increasing >> parallelism with no fruitful result. These just delay the destabilization. >> >> GC time is usually between 60-100 ms. >> >> I also noticed that the jobs were not distributed to other nodes in the >> spark UI, for which I have used configured spark.locality.wait as 100ms. >> After which I have noticed that the job is getting distributed properly. >> >> I have a cluster of 6 slaves and one master each with 16 cores and 15gb >> of ram. >> >> Code and configuration: http://pastebin.com/93DMSiji >> >> Streaming screenshot: http://imgur.com/psNfjwJ >> >> I need help in debugging the issue. Any help will be appreciated. >> >> -- >> Roshan Singh >> >> > -- Roshan Singh http://roshansingh.in