Sorry for no attachment (Apache mail rules) -- Here is a link to the document:
DrillSpillmemoryforHashAggregation.pdf - https://drive.google.com/file/d/0ByUg32jfEW16ajNiQlVRczhPTjA/view?usp=sharing

-- Boaz

________________________________
From: Julian Hyde <jh...@apache.org>
Sent: Friday, January 13, 2017 11:00 PM
To: d...@drill.apache.org
Subject: Re: Drill: Memory Spilling for the Hash Aggregate Operator

The attachment didn't come through.

I'm hoping that you settled on a "hybrid" hash algorithm that can write to disk, or write to memory, and the cost of discovering that is wrong is not too great. With Goetz Graefe's hybrid hash join (which can be easily adapted to hybrid hash aggregate) if the input ALMOST fits in memory you could process most of it in memory, then revisit the stuff you spilled to disk.

> On Jan 13, 2017, at 7:46 PM, Boaz Ben-Zvi <bben-...@mapr.com> wrote:
>
> Hi Drill developers,
>
> Attached is a document describing the design for memory spilling
> implementation for the Hash Aggregate operator.
>
> Please send me any comments or questions,
>
> -- Boaz
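
For readers following the thread, the hybrid idea mentioned in the reply (aggregate what fits in memory, spill the rest, then revisit the spilled data) can be sketched roughly as below. This is only an illustrative Python sketch under stated assumptions, not Drill's implementation and not the design in the linked document. The names `hybrid_hash_sum`, `max_groups`, `num_partitions` and the depth cutoff of 8 are hypothetical, the aggregate is a simple per-key sum, and plain lists stand in for spill files on disk.

```python
def hybrid_hash_sum(rows, max_groups, num_partitions=4, depth=0):
    """Sum values per key while holding roughly max_groups groups in memory.

    rows: iterable of (key, value) pairs. Hypothetical sketch: spilled
    partitions are kept in plain lists as a stand-in for spill files.
    """
    if max_groups < 1:
        raise ValueError("max_groups must be at least 1")
    if depth >= 8:
        # Safety valve for this sketch: stop re-partitioning and
        # aggregate whatever is left entirely in memory.
        max_groups = float("inf")

    def part_of(key):
        # Mix in the depth so a revisited partition splits differently.
        return hash((depth, key)) % num_partitions

    in_memory = [dict() for _ in range(num_partitions)]  # partial sums
    spilled = {}  # partition index -> list of (key, value) rows "on disk"
    groups = 0    # number of groups currently held in memory

    for key, value in rows:
        p = part_of(key)
        if p in spilled:
            # This partition already lives on "disk": just append the row.
            spilled[p].append((key, value))
            continue
        if key in in_memory[p]:
            in_memory[p][key] += value
            continue
        # New group: make room first by spilling the largest partitions.
        while groups >= max_groups:
            resident = [i for i in range(num_partitions)
                        if i not in spilled and in_memory[i]]
            victim = max(resident, key=lambda i: len(in_memory[i]))
            # Partial sums are written out as ordinary rows; this works
            # because sums can be re-combined later.
            spilled[victim] = list(in_memory[victim].items())
            groups -= len(in_memory[victim])
            in_memory[victim] = {}
        if p in spilled:
            spilled[p].append((key, value))
        else:
            in_memory[p][key] = value
            groups += 1

    # Partitions that stayed in memory are complete; emit them first.
    result = {}
    for table in in_memory:
        result.update(table)
    # Then revisit each spilled partition as a smaller input of its own.
    for part_rows in spilled.values():
        result.update(
            hybrid_hash_sum(part_rows, max_groups, num_partitions, depth + 1))
    return result
```

When the input almost fits, only one or two partitions are spilled and the rest are finished in a single in-memory pass, which is the property the reply is asking for. Calling `hybrid_hash_sum(rows, max_groups=3)` on an input with more than three distinct keys forces the spill-and-revisit path.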