Re: [HACKERS] Polyphase merge is obsolete
Hi, I planned to do some benchmarking on this patch, but apparently the patch no longer applies. Rebase please? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Polyphase merge is obsolete
On Wed, Aug 30, 2017 at 12:48 PM, Robert Haaswrote: > These are separate topics. They should each be discussed on their own > thread. Please don't hijack this thread to talk about something else. I don't think that that is a fair summary. Heikki has done a number of things in this area, of which this is only the latest. I'm saying "hey, have you thought about RS too?". Whether or not I'm "hijacking" this thread is, at best, ambiguous. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Polyphase merge is obsolete
On Wed, Aug 30, 2017 at 2:54 PM, Peter Geogheganwrote: > I noticed that this is in the upcoming CF 1 for v11. I'm signed up to review. > > I'd like to point out that replacement selection is also obsolete, > which is something I brought up recently [1]. I don't actually have > any feature-driven reason to want to kill replacement selection - it's > just an annoyance at this point. I do think that RS is more deserving > of being killed than Polyphase merge, because it actually costs users > something to continue to support it. The replacement_sort_tuples GUC > particularly deserves to be removed. > > It would be nice if killing RS was put in scope here. I'd appreciate > it, at least, since it would simplify the heap routines noticeably. > The original analysis that led to adding replacement_sort_tuples was > based on certain performance characteristics of merging that have > since changed by quite a bit, due to our work for v10. These are separate topics. They should each be discussed on their own thread. Please don't hijack this thread to talk about something else. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Polyphase merge is obsolete
On Mon, Feb 27, 2017 at 2:45 PM, Peter Geogheganwrote: > Since we have an awful lot of stuff in the last CF, and this patch > doesn't seem particularly strategic, I've marked it "Returned with > Feedback". I noticed that this is in the upcoming CF 1 for v11. I'm signed up to review. I'd like to point out that replacement selection is also obsolete, which is something I brought up recently [1]. I don't actually have any feature-driven reason to want to kill replacement selection - it's just an annoyance at this point. I do think that RS is more deserving of being killed than Polyphase merge, because it actually costs users something to continue to support it. The replacement_sort_tuples GUC particularly deserves to be removed. It would be nice if killing RS was put in scope here. I'd appreciate it, at least, since it would simplify the heap routines noticeably. The original analysis that led to adding replacement_sort_tuples was based on certain performance characteristics of merging that have since changed by quite a bit, due to our work for v10. [1] postgr.es/m/cah2-wzmmnjg_k0r9nqywmq3zjyjjk+hcbizynghay-zyjs6...@mail.gmail.com -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Polyphase merge is obsolete
On Mon, Jan 16, 2017 at 5:56 PM, Peter Geogheganwrote: > On Wed, Oct 12, 2016 at 10:16 AM, Heikki Linnakangas wrote: >> The number of *input* tapes we can use in each merge pass is still limited, >> by the memory needed for the tape buffers and the merge heap, but only one >> output tape is active at any time. The inactive output tapes consume very >> few resources. So the whole idea of trying to efficiently reuse input tapes >> as output tapes is pointless > > I picked this up again. The patch won't apply cleanly. Can you rebase? Since we have an awful lot of stuff in the last CF, and this patch doesn't seem particularly strategic, I've marked it "Returned with Feedback". Thanks -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Polyphase merge is obsolete
On Wed, Oct 12, 2016 at 10:16 AM, Heikki Linnakangaswrote: > The number of *input* tapes we can use in each merge pass is still limited, > by the memory needed for the tape buffers and the merge heap, but only one > output tape is active at any time. The inactive output tapes consume very > few resources. So the whole idea of trying to efficiently reuse input tapes > as output tapes is pointless I picked this up again. The patch won't apply cleanly. Can you rebase? Also, please look at my bugfix for logtape.c free block management [1] before doing so, as that might be prerequisite. Finally, I don't think that the Logical tape pause/resume idea is compelling. Is it hard to not do that, but still do everything else that you propose here? That's what I lean towards doing right now. Anyway, efficient use of tapes certainly mattered a lot more when rewinding meant sitting around for a magnetic tape deck to physically rewind. There is another algorithm in AoCP Vol III that lets us write to tapes backwards, actually, which is motivated by similar obsolete considerations about hardware. Why not write while we rewind, to avoid doing nothing else while rewinding?! Perhaps this patch should make a clean break from the "rewinding" terminology. Perhaps you should rename LogicalTapeRewindForRead() to LogicalTapePrepareForRead(), and so on. It's already a bit awkward that that routine is called LogicalTapeRewindForRead(), because it behaves significantly differently when a tape is frozen, and because the whole point of logtape.c is space reuse that is completely dissimilar to rewinding. (Space reuse is thus totally unlike how polyphase merge is supposed to reuse space, which is all about rewinding, and isn't nearly as eager. Same applies to K-way balanced merge, of course.) I think that the "rewinding" terminology does more harm than good, now that it doesn't even help the Knuth reader to match Knuth's remarks to what's going on in tuplesort.c. Just a thought. > Let's switch over to a simple k-way balanced merge. Because it's simpler. If > you're severely limited on memory, like when sorting 1GB of data with > work_mem='1MB' or less, it's also slightly faster. I'm not too excited about > the performance aspect, because in the typical case of a single-pass merge, > there's no difference. But it would be worth changing on simplicity grounds, > since we're mucking around in tuplesort.c anyway. I actually think that the discontinuities in the merge scheduling are worse than you suggest here. There doesn't have to be as extreme a difference between work_mem and the size of input as you describe here. As an example: create table seq_tab(t int); insert into seq_tab select generate_series(1, 1000); set work_mem = '4MB'; select count(distinct t) from seq_tab; The trace_sort output ends like this: 30119/2017-01-16 17:17:05 PST LOG: begin datum sort: workMem = 4096, randomAccess = f 30119/2017-01-16 17:17:05 PST LOG: switching to external sort with 16 tapes: CPU: user: 0.07 s, system: 0.00 s, elapsed: 0.06 s 30119/2017-01-16 17:17:05 PST LOG: starting quicksort of run 1: CPU: user: 0.07 s, system: 0.00 s, elapsed: 0.06 s 30119/2017-01-16 17:17:05 PST LOG: finished quicksort of run 1: CPU: user: 0.07 s, system: 0.00 s, elapsed: 0.07 s *** SNIP *** 30119/2017-01-16 17:17:08 PST LOG: finished writing run 58 to tape 0: CPU: user: 2.50 s, system: 0.27 s, elapsed: 2.78 s 30119/2017-01-16 17:17:08 PST LOG: using 4095 KB of memory for read buffers among 15 input tapes 30119/2017-01-16 17:17:08 PST LOG: finished 1-way merge step: CPU: user: 2.52 s, system: 0.28 s, elapsed: 2.80 s 30119/2017-01-16 17:17:08 PST LOG: finished 4-way merge step: CPU: user: 2.58 s, system: 0.30 s, elapsed: 2.89 s 30119/2017-01-16 17:17:08 PST LOG: finished 14-way merge step: CPU: user: 2.86 s, system: 0.34 s, elapsed: 3.20 s 30119/2017-01-16 17:17:08 PST LOG: finished 14-way merge step: CPU: user: 3.09 s, system: 0.41 s, elapsed: 3.51 s 30119/2017-01-16 17:17:09 PST LOG: finished 15-way merge step: CPU: user: 3.61 s, system: 0.52 s, elapsed: 4.14 s 30119/2017-01-16 17:17:09 PST LOG: performsort done (except 15-way final merge): CPU: user: 3.61 s, system: 0.52 s, elapsed: 4.14 s 30119/2017-01-16 17:17:10 PST LOG: external sort ended, 14678 disk blocks used: CPU: user: 4.93 s, system: 0.57 s, elapsed: 5.51 s (This is the test case that Cy posted earlier today, for the bug that was just fixed in master.) The first 1-way merge step is clearly kind of a waste of time. We incur no actual comparisons during this "merge", since there is only one real run from each input tape (all other active tapes contain only dummy runs). We are, in effect, just shoveling the tuples from that single run from one tape to another (from one range in the underlying logtape.c BufFile space to another). I've seen this quite a lot before, over the years, while working on sorting. It's not that big of a deal, but it's a degenerate case that
Re: [HACKERS] Polyphase merge is obsolete
On Wed, Oct 12, 2016 at 10:16 AM, Heikki Linnakangaswrote: > Let's switch over to a simple k-way balanced merge. Because it's simpler. If > you're severely limited on memory, like when sorting 1GB of data with > work_mem='1MB' or less, it's also slightly faster. I'm not too excited about > the performance aspect, because in the typical case of a single-pass merge, > there's no difference. But it would be worth changing on simplicity grounds, > since we're mucking around in tuplesort.c anyway. This analysis seems sound. I suppose we might as well simplify things while we're at it. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Polyphase merge is obsolete
On 10/12/2016 08:27 PM, Tom Lane wrote: Heikki Linnakangaswrites: The beauty of the polyphase merge algorithm is that it allows reusing input tapes as output tapes efficiently ... So the whole idea of trying to efficiently reuse input tapes as output tapes is pointless. It's been awhile since I looked at that code, but I'm quite certain that it *never* thought it was dealing with actual tapes. Rather, the point of sticking with polyphase merge was that it allowed efficient incremental re-use of temporary disk files, so that the maximum on-disk footprint was only about equal to the volume of data to be sorted, rather than being a multiple of that. Have we thrown that property away? No, there's no difference to that behavior. logtape.c takes care of incremental re-use of disk space, regardless of the merging pattern. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Polyphase merge is obsolete
Heikki Linnakangaswrites: > The beauty of the polyphase merge algorithm is that it allows reusing > input tapes as output tapes efficiently ... So the whole idea of trying to > efficiently reuse input tapes as output tapes is pointless. It's been awhile since I looked at that code, but I'm quite certain that it *never* thought it was dealing with actual tapes. Rather, the point of sticking with polyphase merge was that it allowed efficient incremental re-use of temporary disk files, so that the maximum on-disk footprint was only about equal to the volume of data to be sorted, rather than being a multiple of that. Have we thrown that property away? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers