This is what I understand from existing implementation: - fix the merge fan in F, and number of tapes=F+1 - for each sorted run generated, we have a formula to determine which tape it belongs to - the sorted run is added to the end of the tape - all sorted runs have been added to tape. There is always an empty tape called output tape - add dummy runs if necessary, to each tape - merge the input tapes, write result to output tape, until one of the input tape is empty - repeat this process until 1 sorted run remains
I think the main difference is that at each step, the current impl does not merge the smallest k-runs. It merges from the beginning of each tape. For 1 pass, this does not make any difference. For 2 passes onwards there are some differences. In some complex query, where the memory is eaten by some other operators, more passes are needed and the differences become larger. This is the only improvement that can be achieved. Let me know if this does not make any sense. Regarding disk layout, I am open to suggestion whether we should reuse the tape module or not. On Wed, Feb 4, 2009 at 11:20 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Don Marvick <donmarv...@gmail.com> writes: > > Since nobody replied, I would give it a try. I am going to implement the > > merge pattern described in Knuth Page 365 (5.4.9), essentially it is as > > follow: > > - create initial runs using replacement selection (basically this is as > in > > the current implementation) > > - add enough dummy runs of size 0 so that the number of sorted run minus > one > > can be divided by k-1 (k is merge fan in) > > - repetitively merge k smallest runs until only 1 run left > > How is this better than, or even different from, what is there now? > > regards, tom lane >