Memory consumption and checkpointed data seem to increase incrementally when reduceByKeyAndWindow with inverse function is used with mapWithState in Stateful streaming

2017-07-14 Thread SRK
Hi, memory consumption and checkpointed data seem to increase incrementally when reduceByKeyAndWindow with an inverse function is used with mapWithState. My application uses stateful streaming with mapWithState. The keys generated by mapWithState are then used by reduceByKeyAndWindow to do
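
One behaviour that can produce this kind of steady growth is sketched below in plain Python. This is a simulation, not Spark code, and it is not a diagnosis of the application above. An incremental windowed reduce that subtracts expired batches with an inverse function keeps every key it has ever seen, even once its value has dropped back to zero, unless those keys are filtered out. The function name `sliding_counts` and the `drop_zero` flag are made up for illustration. Spark's inverse-function variant of reduceByKeyAndWindow accepts an optional filter function that plays a similar role; whether it helps here is an assumption to verify.

```python
from collections import deque

def sliding_counts(batches, window_len, drop_zero):
    """Simulate an incremental windowed sum over per-batch {key: count} dicts."""
    window = deque()   # batches currently inside the window
    totals = {}        # running per-key sums carried from slide to slide
    sizes = []         # number of keys held after each batch
    for batch in batches:
        window.append(batch)
        for key, n in batch.items():        # "reduce": add the new batch
            totals[key] = totals.get(key, 0) + n
        if len(window) > window_len:
            expired = window.popleft()
            for key, n in expired.items():  # "inverse": subtract the expired batch
                totals[key] -= n
        if drop_zero:                       # hypothetical filter step
            totals = {k: v for k, v in totals.items() if v != 0}
        sizes.append(len(totals))
    return sizes

# Every batch introduces one brand-new key, so old keys never come back.
batches = [{f"k{i}": 1} for i in range(6)]
unfiltered = sliding_counts(batches, window_len=2, drop_zero=False)
filtered = sliding_counts(batches, window_len=2, drop_zero=True)
print(unfiltered)  # state size without filtering
print(filtered)    # state size when zero-valued keys are dropped
```

Without the filter the retained state grows with every new key, while the filtered run stays bounded by the window length.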

Re: calculate diff of value and median in a group

2017-07-14 Thread roni
I was using the percentile_approx function on 100 GB of compressed data and it just hangs there. Any pointers?
On Wed, Mar 22, 2017 at 6:09 PM, ayan guha wrote:
> For median, use percentile_approx with 0.5 (50th percentile is the median)
>
> On Thu, Mar 23, 2017 at 11:01
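
For reference, the calculation named in the subject line (the difference between each value and the median of its group) can be sketched on a small in-memory sample. This is plain Python with an exact median from the standard library, not the approximate percentile_approx discussed in the thread, and the function name `diff_from_group_median` and the sample rows are invented for illustration. It may help when checking a distributed result on a small extract.

```python
from collections import defaultdict
from statistics import median

def diff_from_group_median(rows):
    """Return (key, value, value - median of that key's group) per input row."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    # Exact median per group; the 50th percentile is the median.
    medians = {key: median(values) for key, values in groups.items()}
    return [(key, value, value - medians[key]) for key, value in rows]

# Hypothetical sample data: two groups, "a" and "b".
rows = [("a", 1.0), ("a", 3.0), ("a", 8.0), ("b", 10.0), ("b", 20.0)]
result = diff_from_group_median(rows)
for row in result:
    print(row)
```

Group "a" has median 3.0 and group "b" has median 15.0, so each row's third field is its offset from that value.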