Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-15 Thread Tom Lane
Alvaro Herrera writes: > They get moderated because the nore...@postgresql.org address which > appears as sender is not subscribed to any list. Ah-hah. > I also added that > address to the whitelist now, but whether that's a great fix in the long > run is debatable. Can/should we change the

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-15 Thread Alvaro Herrera
On 2020-Sep-15, Tom Lane wrote: > Peter Geoghegan writes: > > On Tue, Sep 15, 2020 at 3:02 PM Tom Lane wrote: > >> The tag is applied, though for some reason the pgsql-committers auto > >> e-mail about new tags hasn't been working lately. > > > Thanks. FWIW I did get the automated email

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-15 Thread Tom Lane
Peter Geoghegan writes: > On Tue, Sep 15, 2020 at 3:02 PM Tom Lane wrote: >> The tag is applied, though for some reason the pgsql-committers auto >> e-mail about new tags hasn't been working lately. > Thanks. FWIW I did get the automated email shortly after you sent this email. Yeah, it did

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-15 Thread Peter Geoghegan
On Tue, Sep 15, 2020 at 3:02 PM Tom Lane wrote: > The tag is applied, though for some reason the pgsql-committers auto > e-mail about new tags hasn't been working lately. Thanks. FWIW I did get the automated email shortly after you sent this email. -- Peter Geoghegan

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-15 Thread Tom Lane
I wrote: > I plan to tag rc1 in around six hours, ~2200UTC today, barring > trouble reports from packagers (none so far). Feel free to push your > patch once the tag appears. The tag is applied, though for some reason the pgsql-committers auto e-mail about new tags hasn't been working lately.

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-15 Thread Peter Geoghegan
On Mon, Sep 14, 2020 at 11:44 PM Jeff Davis wrote: > Attached. HashAgg seems to accurately reflect the filesize, as does > Sort. For the avoidance of doubt: I think that this is the right way to go, and that it should be committed shortly, before we stamp 13.0. The same goes for

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-15 Thread Tom Lane
Peter Geoghegan writes: > On Mon, Sep 14, 2020 at 8:48 PM Peter Geoghegan wrote: >> On Mon, Sep 14, 2020 at 8:07 PM Jeff Davis wrote: >>> Sure. Will backporting either patch into REL_13_STABLE now interfere >>> with RC1 release in any way? > It is okay to skip RC1 and commit the patch/patches

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-15 Thread Peter Geoghegan
On Mon, Sep 14, 2020 at 8:48 PM Peter Geoghegan wrote: > On Mon, Sep 14, 2020 at 8:07 PM Jeff Davis wrote: > > Sure. Will backporting either patch into REL_13_STABLE now interfere > > with RC1 release in any way? > > The RMT will discuss this. It is okay to skip RC1 and commit the patch/patches

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-15 Thread Peter Geoghegan
On Mon, Sep 14, 2020 at 11:52 PM Jeff Davis wrote: > On Thu, 2020-09-10 at 18:42 -0700, Peter Geoghegan wrote: > > We still need to put the reliance on ltsWriteBlock() allocating many > > blocks before they've been logically written on some kind of formal > > footing for Postgres 13 -- it is now

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-15 Thread Jeff Davis
On Thu, 2020-09-10 at 18:42 -0700, Peter Geoghegan wrote: > We still need to put the reliance on ltsWriteBlock() allocating many > blocks before they've been logically written on some kind of formal > footing for Postgres 13 -- it is now possible that an all-zero block > will be left behind even

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-15 Thread Jeff Davis
On Mon, 2020-09-14 at 20:48 -0700, Peter Geoghegan wrote: > On Mon, Sep 14, 2020 at 8:07 PM Jeff Davis wrote: > > Sure. Will backporting either patch into REL_13_STABLE now > > interfere > > with RC1 release in any way? > > The RMT will discuss this. > > It would help if there was a patch ready

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-14 Thread Peter Geoghegan
On Mon, Sep 14, 2020 at 8:07 PM Jeff Davis wrote: > Sure. Will backporting either patch into REL_13_STABLE now interfere > with RC1 release in any way? The RMT will discuss this. It would help if there was a patch ready to go. Thanks -- Peter Geoghegan

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-14 Thread Jeff Davis
On Mon, 2020-09-14 at 19:29 -0700, Peter Geoghegan wrote: > Let's assume that we'll teach LogicalTapeSetBlocks() to use > nBlocksWritten in place of nBlocksAllocated in all cases, as now > seems > likely. Rather than asserting "nBlocksWritten == nBlocksAllocated" > inside LogicalTapeSetBlocks()

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-14 Thread Peter Geoghegan
On Mon, Sep 14, 2020 at 7:09 PM Peter Geoghegan wrote: > Oh, wait. Of course the point was that we wouldn't even have to use > nBlocksAllocated in LogicalTapeSetBlocks() anymore -- we would make > the assumption that nBlocksWritten could be used for all callers in > all cases. Which is a

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-14 Thread Peter Geoghegan
On Mon, Sep 14, 2020 at 6:56 PM Peter Geoghegan wrote: > I'm not sure what I was talking about earlier when I connected this > with the main/instrumentation issue, since preallocation used by > logtape.c to help HashAggs-that-spill necessarily reserves blocks > without writing them out for a

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-14 Thread Peter Geoghegan
On Mon, Sep 14, 2020 at 5:50 PM Jeff Davis wrote: > Yes, it was apparently an oversight. Patch attached. This is closer to how logical tapes are used within tuplesort.c. I notice that this leads to about a 50% reduction in temp file usage for a test case involving very little work_mem (work_mem

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-14 Thread Jeff Davis
On Mon, 2020-09-14 at 15:54 -0700, Peter Geoghegan wrote: > Maybe the LogicalTapeRewindForRead() inconsistency you mention could > be fixed, which would enable the simplification you suggested. What > do > you think? Yes, it was apparently an oversight. Patch attached. RC1 was just stamped, are

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-14 Thread Peter Geoghegan
On Mon, Sep 14, 2020 at 3:54 PM Peter Geoghegan wrote: > If I add the assertion described above and run the regression tests, > it fails within "select_distinct" (and at other points). This is the > specific code: This variant of the same assertion works fine: +Assert(lts->enable_prealloc

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-14 Thread Peter Geoghegan
On Mon, Sep 14, 2020 at 3:39 PM Peter Geoghegan wrote: > I think that they are an exact match in practice (i.e. nBlocksWritten > == nBlocksAllocated), given when and how we call > LogicalTapeSetBlocks(). Just to be clear: this is only true for external sorts. The preallocation stuff can make

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-14 Thread Peter Geoghegan
On Mon, Sep 14, 2020 at 3:20 PM Jeff Davis wrote: > In the comment in the patch, you say: > > "In practice this probably doesn't matter because we'll be called after > the flush anyway, but be tidy." > > By which I assume you mean that LogicalTapeRewindForRead() will be > called before

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-14 Thread Peter Geoghegan
On Mon, Sep 14, 2020 at 3:24 PM Alvaro Herrera wrote: > I don't understand this patch. Or maybe I should say I don't understand > the code you're patching. Why isn't the correct answer *always* > nBlocksWritten? The comment in LogicalTapeSet says: I think that they are an exact match in

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-14 Thread Alvaro Herrera
On 2020-Sep-14, Peter Geoghegan wrote: > On Fri, Sep 11, 2020 at 6:37 PM Peter Geoghegan wrote: > > It would be awkward if we just used nBlocksWritten within > > LogicalTapeSetBlocks() in the case where we didn't preallocate (or in > > all cases). Not entirely sure what to do about that just

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-14 Thread Jeff Davis
On Mon, 2020-09-14 at 14:24 -0700, Peter Geoghegan wrote: > On Fri, Sep 11, 2020 at 6:37 PM Peter Geoghegan wrote: > > It would be awkward if we just used nBlocksWritten within > > LogicalTapeSetBlocks() in the case where we didn't preallocate (or > > in > > all cases). Not entirely sure what to

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-14 Thread Peter Geoghegan
On Fri, Sep 11, 2020 at 6:37 PM Peter Geoghegan wrote: > It would be awkward if we just used nBlocksWritten within > LogicalTapeSetBlocks() in the case where we didn't preallocate (or in > all cases). Not entirely sure what to do about that just yet. I guess that that's the logical thing to do,

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-11 Thread Peter Geoghegan
On Fri, Sep 11, 2020 at 6:29 PM Peter Geoghegan wrote: > I'll probably close out this open item tomorrow. I need to think about > it some more, but right now everything looks good. I think I'll > probably end up pushing a commit with more explanatory comments. That said, we still need to make

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-11 Thread Peter Geoghegan
On Thu, Sep 10, 2020 at 6:42 PM Peter Geoghegan wrote: > Obviously you must be wondering what the difference is, if it's not > just the nHoleBlocks thing. nBlocksAllocated is not necessarily equal > to nBlocksWritten (even when we ignore concatenation/nHoleBlocks), but > it's almost always equal

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-10 Thread Peter Geoghegan
On Tue, Sep 8, 2020 at 11:28 PM Jeff Davis wrote: > Preallocation showed significant gains for HashAgg, and BufFile doesn't > support sparse writes. So, for HashAgg, it seems like we should just > update the comment and consider it the price of using BufFile. > For Sort, we can just disable

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-09 Thread Jeff Davis
On Sat, 2020-09-05 at 12:03 -0700, Peter Geoghegan wrote: > The comments in ltsWriteBlock() added by the 2017 bugfix commit > 7ac4a389a7d clearly say that the zero block writing stuff is only > supposed to happen at the edge of a tape boundary, which ought to be > rare -- see the comment block in

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-08 Thread Jeff Davis
On Sat, 2020-09-05 at 12:03 -0700, Peter Geoghegan wrote: > We should totally disable the preallocation stuff for external sorts > in any case. External sorts are naturally characterized by relatively > large, distinct batching of reads and writes -- preallocation cannot > help. Patch attached to

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-05 Thread Peter Geoghegan
On Tue, Sep 1, 2020 at 5:24 PM Peter Geoghegan wrote: > On Tue, Sep 1, 2020 at 4:36 PM Alvaro Herrera > wrote: > > This open item hasn't received any replies. I think Peter knows how to > > fix it already, but no patch has been posted ... It'd be good to get a > > move on it. > > I picked this

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-01 Thread Peter Geoghegan
On Tue, Sep 1, 2020 at 4:36 PM Alvaro Herrera wrote: > This open item hasn't received any replies. I think Peter knows how to > fix it already, but no patch has been posted ... It'd be good to get a > move on it. I picked this up again today. It's not obvious what we should do. It's true that

Re: logtape.c stats don't account for unused "prefetched" block numbers

2020-09-01 Thread Alvaro Herrera
On 2020-Jul-30, Peter Geoghegan wrote: > Commit 896ddf9b added prefetching to logtape.c to avoid excessive > fragmentation in the context of hash aggs that spill and have many > batches/tapes. Apparently the preallocation doesn't actually perform > any filesystem operations, so the new mechanism

logtape.c stats don't account for unused "prefetched" block numbers

2020-07-30 Thread Peter Geoghegan
Commit 896ddf9b added prefetching to logtape.c to avoid excessive fragmentation in the context of hash aggs that spill and have many batches/tapes. Apparently the preallocation doesn't actually perform any filesystem operations, so the new mechanism should be zero overhead when "preallocated"