Amit, thanks for the prompt answer. Can you point me to where in the code the purge is done?

On Tue, Dec 15, 2015 at 11:42 AM, Amit Hadke <amit.ha...@gmail.com> wrote:

> Hi Hakim,
>
> RecordIterator will not hold all batches in memory. It holds only the
> batches since the last mark() operation.
> It will purge batches as the join moves along.
>
> The worst case is when there are lots of repeating values on the right
> side, which the iterator will hold in memory.
>
> ~ Amit.
>
> On Tue, Dec 15, 2015 at 11:23 AM, Abdel Hakim Deneche
> <adene...@maprtech.com> wrote:
>
> > Amit,
> >
> > I am looking at DRILL-4190, where one of the sort operators is hitting
> > its allocator limit when it's sending data downstream. This generally
> > happens when a downstream operator is holding those batches in memory
> > (e.g. Window Operator).
> >
> > The same query runs fine on 1.2.0, which seems to suggest that the
> > recent changes to MergeJoinBatch "may" be causing the issue.
> >
> > It looks like RecordIterator is holding all incoming batches in a
> > TreeRangeMap, and if I'm not mistaken it doesn't release anything until
> > it's closed. Is this correct?
> >
> > I am not familiar with how merge join used to work before
> > RecordIterator. Was it also the case that we held all incoming batches
> > in memory?
> >
> > Thanks
> >
> > --
> > Abdelhakim Deneche
> > Software Engineer
> > <http://www.mapr.com/>
> >
> > Now Available - Free Hadoop On-Demand Training
> > <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>

--
Abdelhakim Deneche
Software Engineer
<http://www.mapr.com/>
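
For readers following the thread, the behaviour Amit describes (keep only the batches since the last mark(), purge older ones as the join advances) can be illustrated with a small Java sketch. This is not Drill's RecordIterator code: the class MarkPurgeSketch, its methods, and the use of int[] as a stand-in for a record batch are all hypothetical, chosen only to show the idea.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical illustration only; not Drill's RecordIterator. */
final class MarkPurgeSketch {
    // Batches currently held in memory, oldest first.
    // An int[] stands in for a real record batch (assumption).
    private final Deque<int[]> held = new ArrayDeque<>();
    private int firstHeldIndex = 0; // global index of the oldest held batch
    private int currentIndex = -1;  // global index of the batch being read

    /** A new incoming batch arrives and becomes the current batch. */
    void addBatch(int[] batch) {
        held.addLast(batch);
        currentIndex = firstHeldIndex + held.size() - 1;
    }

    /** Mark the current position; batches before it can never be re-read. */
    void mark() {
        while (firstHeldIndex < currentIndex) {
            held.removeFirst(); // purge a batch that precedes the mark
            firstHeldIndex++;
        }
    }

    /** Number of batches still held in memory. */
    int heldBatches() {
        return held.size();
    }
}
```

In this sketch, memory stays bounded as long as mark() keeps advancing. If the mark cannot move (for example, a long run of repeated join keys on the right side, as in the worst case mentioned above), every batch added after the mark stays in the deque.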