Re: [HACKERS] CLUSTER command progress monitor
On 2017/09/06 16:11, Michael Paquier wrote: On Wed, Sep 6, 2017 at 3:58 PM, Tatsuro Yamada wrote: I revised the patch like this: You should avoid top-posting. I see. I didn't change the name of the view (pg_stat_progress_cluster) because I'm not sure whether the new name (pg_stat_progress_reorg) is suitable or not. Here are some ideas: rewrite (incorrect for ALTER TABLE), organize (somewhat fine), order, operate (too general?), bundle, reform, assemble. Thanks for sharing your ideas. I searched for terms like "reform table", "reassemble table", and "reorganize table" on Google. I found "reorganize table" to be a better choice than the others because many DBMSs use this keyword. Therefore, I'll change the name like this: before: pg_stat_progress_cluster after: pg_stat_progress_reorg (I abbreviate reorganize to reorg.) Does anyone have any suggestions? I'll revise the patch. Regards, Tatsuro Yamada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: issue: record or row variable cannot be part of multiple-item INTO list
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation: not tested I'm afraid this patch conflicts with the current master branch. The new status of this patch is: Waiting on Author
Re: [HACKERS] Setting pd_lower in GIN metapage
On Thu, Sep 7, 2017 at 2:40 PM, Amit Langote wrote: > On 2017/09/07 13:09, Michael Paquier wrote: >> pd_lower should remain at 0 on pre-10 servers. > > Doesn't PageInit(), which is where any page gets initialized, has always > set pd_lower to SizeOfPageHeaderData? Yes, sorry. I had a slip of the mind here. -- Michael
Re: [HACKERS] Causal reads take II
On Thu, Aug 10, 2017 at 2:02 PM, Thomas Munro wrote: > Rebased after conflicting commit 030273b7. Now using format-patch > with a commit message to keep track of review/discussion history. TAP test 006_logical_decoding.pl failed with that version. I had missed some places that know how to decode wire protocol messages I modified. Fixed in the attached version. It might be a good idea to consolidate the message encoding/decoding logic into reusable routines, independently of this work. -- Thomas Munro http://www.enterprisedb.com synchronous-replay-v4.patch Description: Binary data
Re: [HACKERS] Setting pd_lower in GIN metapage
On 2017/09/07 13:09, Michael Paquier wrote: > On Thu, Sep 7, 2017 at 12:59 PM, Tom Lane wrote: >> The idea I'd had was to apply the masking only if pd_lower >= >> SizeOfPageHeaderData, or if you wanted to be stricter, only if >> pd_lower != 0. > > If putting a check, it seems to me that the stricter one makes the > most sense. OK, patches updated that way. > pd_lower should remain at 0 on pre-10 servers. Doesn't PageInit(), which is where any page gets initialized, has always set pd_lower to SizeOfPageHeaderData? Thanks, Amit From 681c0ea82d0d9acd37f1979ebe1918d8636b0d26 Mon Sep 17 00:00:00 2001 From: amit Date: Fri, 23 Jun 2017 11:20:41 +0900 Subject: [PATCH 1/3] Set pd_lower correctly in the GIN metapage. --- src/backend/access/gin/ginutil.c | 7 +++ src/backend/access/gin/ginxlog.c | 24 +--- 2 files changed, 16 insertions(+), 15 deletions(-) diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c index 136ea27718..ebac391818 100644 --- a/src/backend/access/gin/ginutil.c +++ b/src/backend/access/gin/ginutil.c @@ -374,6 +374,13 @@ GinInitMetabuffer(Buffer b) metadata->nDataPages = 0; metadata->nEntries = 0; metadata->ginVersion = GIN_CURRENT_VERSION; + + /* +* Set pd_lower just past the end of the metadata. This is not essential +* but it makes the page look compressible to xlog.c. 
+*/ + ((PageHeader) page)->pd_lower = + ((char *) metadata + sizeof(GinMetaPageData)) - (char *) page; } /* diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c index 7ba04e324f..f5c11b2d9a 100644 --- a/src/backend/access/gin/ginxlog.c +++ b/src/backend/access/gin/ginxlog.c @@ -514,7 +514,7 @@ ginRedoUpdateMetapage(XLogReaderState *record) Assert(BufferGetBlockNumber(metabuffer) == GIN_METAPAGE_BLKNO); metapage = BufferGetPage(metabuffer); - GinInitPage(metapage, GIN_META, BufferGetPageSize(metabuffer)); + GinInitMetabuffer(metabuffer); memcpy(GinPageGetMeta(metapage), &data->metadata, sizeof(GinMetaPageData)); PageSetLSN(metapage, lsn); MarkBufferDirty(metabuffer); @@ -656,7 +656,7 @@ ginRedoDeleteListPages(XLogReaderState *record) Assert(BufferGetBlockNumber(metabuffer) == GIN_METAPAGE_BLKNO); metapage = BufferGetPage(metabuffer); - GinInitPage(metapage, GIN_META, BufferGetPageSize(metabuffer)); + GinInitMetabuffer(metabuffer); memcpy(GinPageGetMeta(metapage), &data->metadata, sizeof(GinMetaPageData)); PageSetLSN(metapage, lsn); @@ -768,6 +768,7 @@ void gin_mask(char *pagedata, BlockNumber blkno) { Pagepage = (Page) pagedata; + PageHeader pagehdr = (PageHeader) page; GinPageOpaque opaque; mask_page_lsn(page); @@ -776,18 +777,11 @@ gin_mask(char *pagedata, BlockNumber blkno) mask_page_hint_bits(page); /* -* GIN metapage doesn't use pd_lower/pd_upper. Other page types do. Hence, -* we need to apply masking for those pages. +* For GIN_DELETED page, the page is initialized to empty. Hence, mask +* the page content. */ - if (opaque->flags != GIN_META) - { - /* -* For GIN_DELETED page, the page is initialized to empty. Hence, mask -* the page content. 
-*/ - if (opaque->flags & GIN_DELETED) - mask_page_content(page); - else - mask_unused_space(page); - } + if (opaque->flags & GIN_DELETED) + mask_page_content(page); + else if (pagehdr->pd_lower != 0) + mask_unused_space(page); } -- 2.11.0 From a605ae99664138aa827500216567137be97d0460 Mon Sep 17 00:00:00 2001 From: amit Date: Mon, 26 Jun 2017 15:13:32 +0900 Subject: [PATCH 2/3] Set pd_lower correctly in the BRIN index metapage --- src/backend/access/brin/brin_pageops.c | 7 +++ src/backend/access/brin/brin_xlog.c| 9 +++-- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/src/backend/access/brin/brin_pageops.c b/src/backend/access/brin/brin_pageops.c index 80f803e438..8762356433 100644 --- a/src/backend/access/brin/brin_pageops.c +++ b/src/backend/access/brin/brin_pageops.c @@ -491,6 +491,13 @@ brin_metapage_init(Page page, BlockNumber pagesPerRange, uint16 version) * revmap page to be created when the index is. */ metadata->lastRevmapPage = 0; + + /* +* Set pd_lower just past the end of the metadata. This is to log full +* page image of metapage in xloginsert.c. +*/ + ((PageHeader) page)->pd_lower = + ((char *) metadata + sizeof(BrinMetaPageData)) - (char *) page; } /* diff --git a/src/backend/access/brin/brin_xlog.c b/src/backend/access/brin/brin_xlog.c index dff7198a39..1309d44b04 100644 --- a
Re: [HACKERS] Parallel Append implementation
On Thu, Aug 31, 2017 at 12:47 PM, Amit Khandekar wrote: > The last updated patch needs a rebase. Attached is the rebased version. > Few comments on the first read of the patch: 1. @@ -279,6 +347,7 @@ void ExecReScanAppend(AppendState *node) { int i; + ParallelAppendDesc padesc = node->as_padesc; for (i = 0; i < node->as_nplans; i++) { @@ -298,6 +367,276 @@ ExecReScanAppend(AppendState *node) if (subnode->chgParam == NULL) ExecReScan(subnode); } + + if (padesc) + { + padesc->pa_first_plan = padesc->pa_next_plan = 0; + memset(padesc->pa_finished, 0, sizeof(bool) * node->as_nplans); + } + For rescan purposes, the parallel state should be reinitialized via ExecParallelReInitializeDSM. We need to do it that way mainly to avoid cases where the rescan machinery sometimes doesn't perform a rescan of all the nodes. See commit 41b0dd987d44089dc48e9c70024277e253b396b7. 2. + * shared next_subplan counter index to start looking for unfinished plan, Using "counter index" here seems slightly confusing; I think using just one of those two words would be better. 3. +/* + * exec_append_leader_next + * + * To be used only if it's a parallel leader. The backend should scan + * backwards from the last plan. This is to prevent it from taking up + * the most expensive non-partial plan, i.e. the first subplan. + * + */ +static bool +exec_append_leader_next(AppendState *state) From the above explanation, it is clear that you don't want the backend to pick an expensive plan for the leader, but the reason for this different treatment is not clear. 4. accumulate_partialappend_subpath() { .. + /* Add partial subpaths, if any. */ + return list_concat(partial_subpaths, apath_partial_paths); .. + return partial_subpaths; .. + if (is_partial) + return lappend(partial_subpaths, subpath); .. } In this function, instead of returning the partial_subpaths list from multiple places, you can just return it at the end and, in all other places, just append to it if required. That way the code will look clearer and simpler. 5.
* is created to represent the case that a relation is provably empty. + * */ typedef struct AppendPath Spurious line addition. 6. if (!node->as_padesc) { /* * This is Parallel-aware append. Follow it's own logic of choosing * the next subplan. */ if (!exec_append_seq_next(node)) I think this is the case of non-parallel-aware appends, but the comment indicates the opposite. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] merge psql ef/ev sf/sv handling functions
you can't do this sort of thing: -psql_error("The server (version %s) does not support editing function source.\n", +psql_error("The server (version %s) does not support editing %s.\n", formatPGVersionNumber(pset.sversion, false, - sverbuf, sizeof(sverbuf))); + sverbuf, sizeof(sverbuf)), + is_func ? "function source" : "view definitions"); It's too much of a pain in the rear for translators. Argh, indeed, I totally forgot about translations. Usually there is a _() hint for gettext. See https://www.postgresql.org/docs/devel/static/nls-programmer.html#nls-guidelines Usually we just use two independent messages, and that's what I did. Yep, makes sense. Thanks for the fix. -- Fabien.
Re: [HACKERS] ANALYZE command progress checker
On 2017/04/05 10:17, Masahiko Sawada wrote: On Wed, Apr 5, 2017 at 1:49 AM, Robert Haas wrote: On Tue, Apr 4, 2017 at 4:57 AM, Amit Langote wrote: Hmm, you're right. It could be counted with a separate variable initialized to 0 and incremented every time we decide to add a row to the final set of sampled rows, although different implementations of AcquireSampleRowsFunc have different ways of deciding if a given row will be part of the final set of sampled rows. On the other hand, if we decide to count progress in terms of blocks as you suggested, I'm afraid that FDWs won't be able to report the progress. I think it may be time to push this patch out to v11. It was submitted one day before the start of the last CommitFest, the design wasn't really right, and it's not clear even now that we know what the right design is. And we're pretty much out of time. +1 We're encountering the design issue, and it will take more time to find the right design, including FDW support. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center Hi Vinayak, I found a typo in your patch: s/taht/that/ sampling.c /* Report total number of blocks taht will be sampled */ Then, regarding FDW support, I believe that means postgres_fdw support (is my understanding right?). You might want to put pgstat_progress_update_param() into these functions: postgres_fdw.c - postgresAnalyzeForeignTable - postgresAcquireSampleRowsFunc And PgFdwAnalyzeState has these variables, so hopefully you can get the current number of rows as a progress indicator. struct PgFdwAnalyzeState ... /* collected sample rows */ HeapTuple *rows; /* array of size targrows */ int targrows; /* target # of sample rows */ int numrows;/* # of sample rows collected */ /* for random sampling */ double samplerows; /* # of rows fetched */ double rowstoskip; /* # of rows to skip before next sample */ ... I hope it will help you.
Regards, Tatsuro Yamada
Re: [HACKERS] Restricting maximum keep segments by repslots
Hello, At Fri, 1 Sep 2017 23:49:21 -0400, Peter Eisentraut wrote in <751e09c4-93e0-de57-edd2-e64c4950f...@2ndquadrant.com> > I'm still concerned about how the critical situation is handled. Your > patch just prints a warning to the log and then goes on -- doing what? > > The warning rolls off the log, and then you have no idea what happened, > or how to recover. The victims should be complaining in their log files, but, yes, I must admit that it extremely resembles /dev/null. And the catastrophe comes suddenly. > I would like a flag in pg_replication_slots, and possibly also a > numerical column that indicates how far away from the critical point > each slot is. That would be great for a monitoring system. Great! I'll do that right now. > -- > Peter Eisentraut http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services > Thanks. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
On Wed, Sep 6, 2017 at 4:01 PM, Fabien COELHO wrote: > > Applies, compiles, works for me. > > Very very minor comments that I should have noticed before, sorry for this > additional round trip. Thank you for the dedicated review! > > In the help line, move -I just after -i, to put short options in > alphabetical and decreasing importance order. On this line, also add the > information about the default, something like: > > -i, --ini... > -I, --custom=[...]+ (default "ctgvp") > ... Agreed. Attached latest patch. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center pgbench_custom_initialization_v12.patch Description: Binary data
[HACKERS] Re: DROP SUBSCRIPTION hangs if sub is disabled in the same transaction
On Wed, Sep 06, 2017 at 03:28:47PM +0900, Masahiko Sawada wrote: > On Mon, Sep 4, 2017 at 11:43 PM, Arseny Sher wrote: > > Arseny Sher writes: > > > >> Attached patch fixes this by stopping workers before RO drop, as > >> already done in case when we drop replication slot. > > > > Sorry, here is the patch. > > > > I could reproduce this issue; it's a bug. Added this to the open items. > The cause of this is commit 7e174fa7, which makes subscription DDLs kill > the apply worker only at transaction commit. I haven't looked at the patch > deeply yet, but I'm not sure the approach of killing the apply worker > before commit would be good. [Action required within three days. This is a generic notification.] The above-described topic is currently a PostgreSQL 10 open item. Peter, since you committed the patch believed to have created it, you own this open item. If some other commit is more relevant or if this does not belong as a v10 open item, please let us know. Otherwise, please observe the policy on open item ownership[1] and send a status update within three calendar days of this message. Include a date for your subsequent status update. Testers may discover new open items at any time, and I want to plan to get them all fixed well in advance of shipping v10. Consequently, I will appreciate your efforts toward speedy resolution. Thanks. [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
Re: [HACKERS] WIP: Failover Slots
On 14 August 2017 at 11:56, Craig Ringer wrote: > > I don't want to block failover slots on decoding on standby just because > decoding on standby would be nice to have. > However, during discussion with Thomas Munro a point has come up that does block failover slots as currently envisioned - silent timeline divergence. It's a solid reason why the current design and implementation are insufficient to solve the problem. This issue exists both with the original failover slots and with the model Robert and I were discussing. Say a decoding client has replayed from master up to commit of xid 42 at 1/1000 and confirmed flush, then a failover slots standby of the master is promoted. The standby has only received WAL from the failed master up to 1/500 with most recent xid 20. Now the standby does some other new xacts, pushing xid up to 30 at 1/1000 then continuing to insert until xid 50 at lsn 1/2000. Then the logical client reconnects. The logical client will connect to the failover slot fine, and start replay. But it'll ask for replay to start at 1/1000. The standby will happily fast-forward the slot (as it should), and start replay after 1/1000. But now we have silent divergence in timelines. The logical replica has received and committed xacts 20...42 at lsn 1/500 through 1/1000, but these are not present on the promoted master. And the replica has skipped over the new-master's xids 20...30 with lsns 1/500 through 1/1000, so they're present on the new master but not the replica. IMO, this shows that not including the timeline in replication origins was a bit of a mistake, since we'd trivially detect this if they were included - but it's a bit late now. And anyway, detection would just mean logical rep would break, which doesn't help much. The simplest fix, but rather limited, is to require that failover candidates be in synchronous_standby_names, and delay ReorderBufferCommit sending the actual commit message until all peers in s_s_n confirm flush of the commit lsn.
But that's not much good if you want sync rep for your logical connections too, and is generally a hack. A more general solution requires that masters be told which peers are failover candidates, so they can ensure ordering between logical decoding and physical failover candidates. This effectively adds another kind of sync rep, where we "wait for physical failover candidates to flush, and only then allow logical decoding". This actually seems pretty practical with the design Robert and I discussed, but it's definitely an expansion in scope. Alternately, we could require the decoding clients to keep an eye on the flush/replay positions of all failover candidates and delay commit+confirm of decoded xacts until the upstream's failover candidates have received and flushed up to that lsn. That starts to look a lot like a decoding-on-standby-based model for logical failover, where the downstream maintains slots on each failover candidate upstream. So yeah. More work needed here. Even if we suddenly decided the original failover slots model was OK, it's not sufficient to fully solve the problem. (It's something I'd thought about for BDR failover, but never applied to failover slots: the problem of detecting or preventing divergence when the logical client is ahead of physical receive at the time the physical standby is promoted.) -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Setting pd_lower in GIN metapage
On Thu, Sep 7, 2017 at 12:59 PM, Tom Lane wrote: > The idea I'd had was to apply the masking only if pd_lower >= > SizeOfPageHeaderData, or if you wanted to be stricter, only if > pd_lower != 0. If putting a check, it seems to me that the stricter one makes the most sense. pd_lower should remain at 0 on pre-10 servers. -- Michael
Re: [HACKERS] Setting pd_lower in GIN metapage
Michael Paquier writes: > On Thu, Sep 7, 2017 at 5:49 AM, Tom Lane wrote: >> I looked briefly at these patches. I'm not sure that it's safe for the >> mask functions to assume that meta pages always have valid pd_lower. >> What happens when replaying log data concerning an old index that doesn't >> have that field filled? > There will be inconsistency between the pages, and the masking check > will complain. That doesn't seem like a pleasant outcome to me. The WAL consistency check code is supposed to complain if there's some kind of replication or replay failure, and this cannot be categorized as either. The idea I'd had was to apply the masking only if pd_lower >= SizeOfPageHeaderData, or if you wanted to be stricter, only if pd_lower != 0. regards, tom lane
Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?
Hello, At Wed, 6 Sep 2017 12:23:53 -0700, Andres Freund wrote in <20170906192353.ufp2dq7wm5fd6...@alap3.anarazel.de> > On 2017-09-06 17:36:02 +0900, Kyotaro HORIGUCHI wrote: > > The problem is that the current ReadRecord needs the first one of > > a series of continuation records from the same source with the > > other part, the master in the case. > > What's the problem with that? We can easily keep track of the beginning > of a record, and only confirm the address before that. After a failure while reading a record locally, ReadRecord tries streaming to read from the beginning of the record, which is not on the master, then retries locally, and so on. This loops forever. > > A (or the) solution closed within the standby side is allowing it to > > read a series of continuation records from multiple sources. > > I'm not following. All we need to use is the beginning of the relevant > records, that's easy enough to keep track of. We don't need to read the > WAL or anything. The beginning is already tracked and there is nothing more to do. I reconsidered that approach and found that it doesn't need such destructive refactoring. The first *problem* was that WaitForWALToBecomeAvailable requests the beginning of a record, which is not on the page the function has been told to fetch. While tliRecPtr is still required to determine the TLI to request, it should request RecPtr to be streamed. The rest is to let XLogPageRead retry other sources immediately. To do this I made ValidXLogPageHeader@xlogreader.c public (and renamed it to XLogReaderValidatePageHeader). The attached patch fixes the problem and passes the recovery tests. However, a test for this problem is not added. It needs to go to the last page in a segment, then put a record that continues into the next segment, then kill the standby after it receives the previous segment but before it receives the whole record.
regards, -- Kyotaro Horiguchi NTT Open Source Software Center >From 8932a390bd3d1acfe5722bc62f42fc7e381ca617 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Thu, 7 Sep 2017 12:14:55 +0900 Subject: [PATCH 1/2] Allow switch WAL source midst of record. The corrent recovery machinary assumes the whole of a record is avaiable from single source. This prevents a standby from restarting under a certain condition. This patch allows source switching during reading a series of continuation records. --- src/backend/access/transam/xlog.c | 7 ++- src/backend/access/transam/xlogreader.c | 12 +--- src/include/access/xlogreader.h | 5 + 3 files changed, 16 insertions(+), 8 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index df4843f..eef3a97 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -11566,6 +11566,11 @@ retry: Assert(reqLen <= readLen); *readTLI = curFileTLI; + + if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, + (XLogPageHeader) readBuf)) + goto next_record_is_invalid; + return readLen; next_record_is_invalid: @@ -11700,7 +11705,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, } else { - ptr = tliRecPtr; + ptr = RecPtr; tli = tliOfPointInHistory(tliRecPtr, expectedTLEs); if (curFileTLI > 0 && tli < curFileTLI) diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index 0781a7b..aa05e3f 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -27,8 +27,6 @@ static bool allocate_recordbuf(XLogReaderState *state, uint32 reclength); -static bool ValidXLogPageHeader(XLogReaderState *state, XLogRecPtr recptr, - XLogPageHeader hdr); static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr, XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess); static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record, @@ -545,7 +543,7 @@ 
ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen) hdr = (XLogPageHeader) state->readBuf; - if (!ValidXLogPageHeader(state, targetSegmentPtr, hdr)) + if (!XLogReaderValidatePageHeader(state, targetSegmentPtr, hdr)) goto err; } @@ -582,7 +580,7 @@ ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen) /* * Now that we know we have the full header, validate it. */ - if (!ValidXLogPageHeader(state, pageptr, hdr)) + if (!XLogReaderValidatePageHeader(state, pageptr, hdr)) goto err; /* update read state information */ @@ -709,9 +707,9 @@ ValidXLogRecord(XLogReaderState *state, XLogRecord *record, XLogRecPtr recptr) /* * Validate a page header */ -static bool -ValidXLogPageHeader(XLogReaderState *state, XLogRecPtr recptr, - XLogPageHeader hdr) +bool +XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr, + XLogPageHeader hdr) { XLogRecPtr recaddr; XLogSegNo segno; diff --git a/src/include/access/xlogreader.h
Re: [HACKERS] Replication vs. float timestamps is a disaster
On 09/06/17 18:33, Omar Kilani wrote: > Is there anything people using float datetimes can do that isn't a > pg_dumpall | pg_restore to do a less painful update? > > We have several TB of data still using float datetimes and I'm trying > to figure out how we can move to 10 (currently on 9.6.x) without > massive amounts of $ in duplicated hardware or downtime. I ran into an analogous issue with endianness of PL/Java-defined datatypes, and ended up devising a procedure [1] for treating the type one way on output and another way on input, and then constructing an UPDATE query that just assigns the column to itself using a function that's essentially identity, but forces the output-input conversion (and doesn't look like a simple identity function to the query optimizer, which otherwise might optimize the whole update away). While the details of that were specific to PL/Java and byte order, something similar might work in your case if you can afford some period, either just before or just after migration, when your normal workload is blocked, long enough to run such updates on your several TB of data. Or if that's too big a chunk of downtime, you might be able to add a column in advance, of some user-defined type the same width as an integer datetime, populate that in advance, migrate, then do a quick catalog switcheroo of the column names and types. I've never done that, so someone else might have comments on its feasibility or the best way of doing it. > the exact crash, but the datetime values were either too small or too > large to fit into the integer datetimes field -- I can retry this if That does seem important to get the details on. If the integer datetime column won't represent your stuff (and the too-large or too-small values are necessary to represent), then some schema change might be necessary. 
Or, if the outlying values were sentinel values of some kind (I wonder, how else would you even have datetimes outside of the int64 range?), maybe they can be replaced with others that fit. -Chap [1]: https://tada.github.io/pljava/use/byteordermigrate.html
Re: [HACKERS] A design for amcheck heapam verification
On Wed, Aug 30, 2017 at 9:29 AM, Peter Geoghegan wrote: > On Wed, Aug 30, 2017 at 5:02 AM, Alvaro Herrera > wrote: >> Eh, if you want to optimize it for the case where debug output is not >> enabled, make sure to use ereport() not elog(). ereport() >> short-circuits evaluation of arguments, whereas elog() does not. > > I should do that, but it's still not really noticeable. Since this patch has now bit-rotted, I attach a new revision, V2. Apart from fixing some Makefile bitrot, this revision also makes some small tweaks as suggested by Thomas and Alvaro. The documentation is also revised and expanded, and now discusses practical aspects of the set membership being tested using a Bloom filter, how that relates to maintenance_work_mem, and so on. Note that this revision does not let the Bloom filter caller use their own dynamic shared memory, which is something that Thomas asked about. While that could easily be added, I think it should happen later. I really just wanted to make sure that my Bloom filter was not in some way fundamentally incompatible with Thomas' planned enhancements to (parallel) hash join. -- Peter Geoghegan From d4dda95dd41204315dc12936fac83d2df8676992 Mon Sep 17 00:00:00 2001 From: Peter Geoghegan Date: Thu, 24 Aug 2017 20:58:21 -0700 Subject: [PATCH 1/2] Add Bloom filter data structure implementation. A Bloom filter is a space-efficient, probabilistic data structure that can be used to test set membership. Callers will sometimes incur false positives, but never false negatives. The rate of false positives is a function of the total number of elements and the amount of memory available for the Bloom filter. Two classic applications of Bloom filters are cache filtering, and data synchronization testing. Any user of Bloom filters must accept the possibility of false positives as a cost worth paying for the benefit in space efficiency. 
---
 src/backend/lib/Makefile      |   4 +-
 src/backend/lib/README        |   2 +
 src/backend/lib/bloomfilter.c | 297 ++
 src/include/lib/bloomfilter.h |  26
 4 files changed, 327 insertions(+), 2 deletions(-)
 create mode 100644 src/backend/lib/bloomfilter.c
 create mode 100644 src/include/lib/bloomfilter.h

diff --git a/src/backend/lib/Makefile b/src/backend/lib/Makefile
index d1fefe4..191ea9b 100644
--- a/src/backend/lib/Makefile
+++ b/src/backend/lib/Makefile
@@ -12,7 +12,7 @@ subdir = src/backend/lib
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global

-OBJS = binaryheap.o bipartite_match.o dshash.o hyperloglog.o ilist.o \
-       knapsack.o pairingheap.o rbtree.o stringinfo.o
+OBJS = binaryheap.o bipartite_match.o bloomfilter.o dshash.o hyperloglog.o \
+       ilist.o knapsack.o pairingheap.o rbtree.o stringinfo.o

 include $(top_srcdir)/src/backend/common.mk

diff --git a/src/backend/lib/README b/src/backend/lib/README
index 5e5ba5e..376ae27 100644
--- a/src/backend/lib/README
+++ b/src/backend/lib/README
@@ -3,6 +3,8 @@ in the backend:

 binaryheap.c - a binary heap

+bloomfilter.c - probabilistic, space-efficient set membership testing
+
 hyperloglog.c - a streaming cardinality estimator

 pairingheap.c - a pairing heap

diff --git a/src/backend/lib/bloomfilter.c b/src/backend/lib/bloomfilter.c
new file mode 100644
index 000..e93f9b0
--- /dev/null
+++ b/src/backend/lib/bloomfilter.c
@@ -0,0 +1,297 @@
+/*-------------------------------------------------------------------------
+ *
+ * bloomfilter.c
+ *		Minimal Bloom filter
+ *
+ * A Bloom filter is a probabilistic data structure that is used to test an
+ * element's membership of a set. False positives are possible, but false
+ * negatives are not; a test of membership of the set returns either "possibly
+ * in set" or "definitely not in set". This can be very space efficient when
+ * individual elements are larger than a few bytes, because elements are hashed
+ * in order to set bits in the Bloom filter bitset.
+ *
+ * Elements can be added to the set, but not removed. The more elements that
+ * are added, the larger the probability of false positives. Caller must hint
+ * an estimated total size of the set when its Bloom filter is initialized.
+ * This is used to balance the use of memory against the final false positive
+ * rate.
+ *
+ * Copyright (c) 2017, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/backend/lib/bloomfilter.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include
+
+#include "access/hash.h"
+#include "lib/bloomfilter.h"
+#include "utils/memutils.h"
+
+#define MAX_HASH_FUNCS	10
+#define NBITS(filt)	((1 << (filt)->bloom_power))
+
+typedef struct bloom_filter
+{
+	/* 2 ^ bloom_power is the size of the bitset (in bits) */
+	int			bloom_power;
+	unsigned char *bitset;
+
+	/* K hash functions are used, which are randomly seeded */
+	int			k_hash_funcs;
+	uint32		seed;
+} bloom_filter;
+
+static int	pow2_tru
Re: [HACKERS] Setting pd_lower in GIN metapage
On Thu, Sep 7, 2017 at 10:42 AM, Amit Langote wrote: > I too tend to think that any users who use this masking facility would > know to expect to get these failures on upgraded clusters with invalid > pd_lower in meta pages. Yes, I don't think that an optimization reducing WAL that impacts all users should be stopped by a small set of users who use an option for development purposes. > (PS: I wonder if it is reasonable to allow configuring the error level > used when a masking failure occurs? Currently, checkXLogConsistency() > will abort the process (FATAL)) It definitely is worth it in my opinion, perhaps with an on/off switch to trigger a warning instead. The reason why we use FATAL now is to raise red flags more easily in any potential buildfarm runs: a startup process facing FATAL takes down the standby. >> Perhaps we should document this point for wal_consistency_checking? > > Do you mean permanently under wal_consistency_checking parameter > documentation or in the release notes under incompatibilities for the > affected index types? Under the parameter itself, in the spirit of a don't-do-that from an upgraded instance. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewriting the test of pg_upgrade as a TAP test
On Wed, Sep 6, 2017 at 10:05 PM, Peter Eisentraut wrote: > Please send a rebased patch. > > I propose splitting the single Perl script into three separate test > files: one for basic command-line option handling and so on (I would > like to expand that later), one for the main upgrade test, and one for > the funny database names tests. That makes sense. There will be additional overhead with the creation of an extra server though. > In the testing file, you removed the paragraph that explains how to do > cross-version upgrade testing. It's unfortunate that we would lose that > functionality. What can we do about that? Right, simply removing support for something which has been here for a long time is no fun. I think that we should add in PostgresNode objects a new bindir variable which will be used to define path to binaries. Any new node created needs to go through init() or init_from_backup(), so a node created with init() would set this bindir to what pg_config in PATH reports, or to the value defined by the caller if it is defined (let's use an option for %params). A node created from init_from_backup() inherits the path of its root node. This requires a bit of refactoring first. This could help also for cross version tests out of the code core. In the existing scripts, there are the following variables: - oldsrc, old version's source tree - oldbindir, old version's installed bin dir - bindir, this version's installed bin dir. - libdir, this version's installed lib dir bindir and libdir are pointing to the currently installed version in the tree, so we could do without it, no? oldbindir and oldsrc need to be kept around to enforce the position of binaries for the old version, as well as a proper shape of the original dump being compared (+ drop of the past functions). Then, for the pg_upgrade tests, let's check for ENV{oldbindir} and enforce the bindir value of the PostgresNode to-be-upgraded. 
And also for ENV{oldsrc}, first check if it is defined, and then do the existing psql/dump changes. So one, in order to run cross-version checks, would just need to rely on the fact that the version where installcheck runs is the new version. Does that sound acceptable? Looking at 5bab198, those don't run that often, but I definitely agree that breaking something for no apparent reason is not cool either ;p > We also need to have a plan for handling the build farm. Maybe keep the > vcregress.pl upgradecheck target as a thin wrapper for the time being? Or we could make upgradecheck a noop, then remove it once all the MSVC animals have upgraded to a newer version of the buildfarm client which does not use upgradecheck anymore (I am fine to send a patch or a pull request to Andrew for that). -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pluggable storage
On Fri, Sep 1, 2017 at 1:51 PM, Haribabu Kommi wrote: > Here I attached new set of patches that are rebased to the latest master. Hi Haribabu, Could you please post a new rebased version? 0007-Scan-functions-are-added-to-storage-AM.patch conflicts with recent changes. Thanks! -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting pd_lower in GIN metapage
On 2017/09/07 8:51, Michael Paquier wrote: > On Thu, Sep 7, 2017 at 5:49 AM, Tom Lane wrote: >> Amit Langote writes: >>> On 2017/08/22 9:39, Michael Paquier wrote: On Tue, Jun 27, 2017 at 4:56 PM, Amit Langote wrote: > I updated brin_mask() and spg_mask() in the attached updated patches so > that they consider meta pages as containing unused space. >> >> I looked briefly at these patches. I'm not sure that it's safe for the >> mask functions to assume that meta pages always have valid pd_lower. >> What happens when replaying log data concerning an old index that doesn't >> have that field filled? > > There will be inconsistency between the pages, and the masking check > will complain. My point here is that wal_consistency_checking is > primarily used by developers on newly-deployed clusters to check WAL > consistency by using installcheck. So an upgraded cluster would see > diffs because of that, but I would think that nobody would really face > them. I too tend to think that any users who use this masking facility would know to expect to get these failures on upgraded clusters with invalid pd_lower in meta pages. (PS: I wonder if it is reasonable to allow configuring the error level used when a masking failure occurs? Currently, checkXLogConsistency() will abort the process (FATAL)) > Perhaps we should document this point for wal_consistency_checking? Do you mean permanently under wal_consistency_checking parameter documentation or in the release notes under incompatibilities for the affected index types? Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewriting the test of pg_upgrade as a TAP test
On Wed, Sep 6, 2017 at 11:44 PM, Tom Lane wrote: > Alvaro Herrera writes: >> Peter Eisentraut wrote: >>> We also need to have a plan for handling the build farm. Maybe keep the >>> vcregress.pl upgradecheck target as a thin wrapper for the time being? > >> The buildfarm already runs "make check" in src/bin/ when TAP tests are >> enabled, which should be enough to trigger the rewritten test, no? > > I think Peter's on about the Windows case. Not sure how that's handled, > but it's not "make check". For MSVC, one can use "vcregress.bat upgradecheck". So perhaps we could keep upgradecheck for a short time but make it a noop instead with this patch, and then remove it once buildfarm animals are upgraded to a newer client version? I would prefer seeing a simple removal of upgradecheck at the end, and put all TAP tests for binaries under the bincheck path. This feels more natural. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.
On Thu, Sep 7, 2017 at 10:11 AM, Peter Eisentraut wrote: > On 9/4/17 06:03, Alvaro Herrera wrote: >> Tom Lane wrote: >>> Michael Paquier writes: I don't like breaking the abstraction of pg_log() with the existing flags with some kind of pg_debug() layer. The set of APIs present now in pg_rewind for logging is nice to have, and I think that those debug messages should be translated. So what about the attached? >>> >>> Your point about INT64_FORMAT not necessarily working with fprintf >>> is an outstanding reason not to keep it like it is. I've not reviewed >>> this patch in detail but I think this is basically the way to fix it. >> >> Actually this code goes through vsnprintf, not fprintf, which should be >> safe, so I removed that part of the comment, and pushed. > > Is there a reason this was not backpatched to 9.5? Indeed. Please note that cherry-picking the fix from 23c1f0a works just fine. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.
On 9/4/17 06:03, Alvaro Herrera wrote: > Tom Lane wrote: >> Michael Paquier writes: >>> I don't like breaking the abstraction of pg_log() with the existing >>> flags with some kind of pg_debug() layer. The set of APIs present now >>> in pg_rewind for logging is nice to have, and I think that those debug >>> messages should be translated. So what about the attached? >> >> Your point about INT64_FORMAT not necessarily working with fprintf >> is an outstanding reason not to keep it like it is. I've not reviewed >> this patch in detail but I think this is basically the way to fix it. > > Actually this code goes through vsnprintf, not fprintf, which should be > safe, so I removed that part of the comment, and pushed. Is there a reason this was not backpatched to 9.5? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Minor code improvement to postgresGetForeignPlan
On 2017/09/07 6:52, Tom Lane wrote: Tatsuro Yamada writes: The declaration of postgresGetForeignPlan uses baserel, but the actual definition uses foreignrel. It would be better to sync. Pushed, thanks. regards, tom lane Thanks! Regards, Tatsuro Yamada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting pd_lower in GIN metapage
On Thu, Sep 7, 2017 at 5:49 AM, Tom Lane wrote: > Amit Langote writes: >> On 2017/08/22 9:39, Michael Paquier wrote: >>> On Tue, Jun 27, 2017 at 4:56 PM, Amit Langote >>> wrote: I updated brin_mask() and spg_mask() in the attached updated patches so that they consider meta pages as containing unused space. > > I looked briefly at these patches. I'm not sure that it's safe for the > mask functions to assume that meta pages always have valid pd_lower. > What happens when replaying log data concerning an old index that doesn't > have that field filled? There will be inconsistency between the pages, and the masking check will complain. My point here is that wal_consistency_checking is primarily used by developers on newly-deployed clusters to check WAL consistency by using installcheck. So an upgraded cluster would see diffs because of that, but I would think that nobody would really face them. Perhaps we should document this point for wal_consistency_checking? Getting the patch discussed on this thread into v10 would have saved the day here, but this boat has sailed already. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Add --no-comments to skip COMMENTs with pg_dump
On Thu, Sep 7, 2017 at 1:43 AM, Robert Haas wrote: > On Wed, Sep 6, 2017 at 12:26 PM, Simon Riggs wrote: I'd personally be fine with --no-whatever for any whatever that might be a subsidiary property of database objects. We've got --no-security-labels, --no-tablespaces, --no-owner, and --no-privileges already, so what's wrong with --no-comments? (We've also got --no-publications; I think it's arguable whether that is the same kind of thing.) >>> >>> And --no-subscriptions in the same bucket. >> >> Yes, it is. I was suggesting that we remove those as well. FWIW, I do too. They are useful for given application code paths. > That seems like a non-starter to me. I have used those options many > times to solve real problems, and I'm sure other people have as well. > We wouldn't have ended up with all of these options if users didn't > want to control such things. As there begins to be many switches of this kind and much code duplication, I think that some refactoring into a more generic switch infrastructure would be nicer. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replication vs. float timestamps is a disaster
Omar Kilani writes: > Is there anything people using float datetimes can do that isn't a > pg_dumpall | pg_restore to do a less painful update? Um, not really. You may be stuck on 9.6 until you can spare the effort to convert. The physical representations of timestamps are totally different in the two cases. > I did attempt a pg_dumpall | pg_restore at one point but for whatever > reason we had data in tables that integer datetimes fails on (I forget > the exact crash, but the datetime values were either too small or too > large to fit into the integer datetimes field -- I can retry this if > it would be helpful). I'm pretty sure the minimum values are the same in both cases, to wit Julian day zero. As for the max, according to the old code comments * The upper limit for dates is 5874897-12-31, which is a bit less than what * the Julian-date code can allow. We use that same limit for timestamps when * using floating-point timestamps ... For integer timestamps, the upper * limit is 294276-12-31. I would hope that any timestamps you've got beyond 294276AD are erroneous data that you need to clean up anyway. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Red-Black tree traversal tests
[ btw, please don't cc pgsql-hackers-owner, the list moderators don't need the noise ] Aleksander Alekseev writes: > I would say that this patch is in a pretty good shape now. And I see a > 99% code coverage of rbtree.c. Let's see what committers think. I took a quick look through the patch --- haven't tried to compile it or anything like that --- and have a few comments: * There's some typos, eg extention should be extension, triversal should be traversal. Maybe try a spell checker? * The diff complains about lack of file-ending newlines in several places. * There's something weird at the start of test.c: @@ -0,0 +1,577 @@ +/*-- Maybe your compiler thinks that's a BOM? It's hard to see how it compiles otherwise. * I think it might be simpler to have the module expose just one SQL function that invokes all these individual tests internally. Less boilerplate text that way, and less to change if you add more tests later. Also, you might consider passing in TEST_SIZE as an argument of the SQL function instead of having it be hard-wired. * We don't do C++-style (//) comments around here. Please use C style. (You might consider running the code through pgindent, which will convert those comments automatically.) * It's also generally not per project style to use malloc/calloc/free directly in backend code; and it's certainly not cool to use malloc or calloc and then not check for a null result. Use palloc instead. Given the short runtime of this test, you likely needn't be too worried about pfree'ing stuff. * _PG_init() declaration seems to be a leftover? doesn't belong here either, as postgres.h will bring that in for you. * I know next to zip about red-black trees, but it disturbs me that the traversal tests use trees whose values were inserted in strictly increasing order. Seems like that's not proving as much as it could. I wonder how hard it would be to insert those integers in random order. 
* I'm not too pleased that the rb_find calls mostly just assume that the find won't return NULL. You should be testing for that and reporting the problem, not just dying on a null pointer crash if it happens. * Possibly the tests should exercise rb_delete on keys *not* present. And how about insertion of already-existing keys? Although that's a bit outside the scope of testing traversal, so if you want to leave it for some future expansion, that'd be OK. I'll set this back to Waiting on Author. I do encourage you to finish it up. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replication vs. float timestamps is a disaster
Hi, I know I'm 7 months late to this, but only just read the beta 4 release notes. Is there anything people using float datetimes can do that isn't a pg_dumpall | pg_restore to do a less painful update? We have several TB of data still using float datetimes and I'm trying to figure out how we can move to 10 (currently on 9.6.x) without massive amounts of $ in duplicated hardware or downtime. I did attempt a pg_dumpall | pg_restore at one point but for whatever reason we had data in tables that integer datetimes fails on (I forget the exact crash, but the datetime values were either too small or too large to fit into the integer datetimes field -- I can retry this if it would be helpful). Thanks. Regards, Omar On Mon, Feb 27, 2017 at 5:13 PM, Andres Freund wrote: > On 2017-02-27 17:00:23 -0800, Joshua D. Drake wrote: >> On 02/22/2017 02:45 PM, Tom Lane wrote: >> > Andres Freund writes: >> > > On 2017-02-22 08:43:28 -0500, Tom Lane wrote: >> > > > (To be concrete, I'm suggesting dropping --disable-integer-datetimes >> > > > in HEAD, and just agreeing that in the back branches, use of >> > > > replication >> > > > protocol with float-timestamp servers is not supported and we're not >> > > > going to bother looking for related bugs there. Given the lack of >> > > > field >> > > > complaints, I do not believe anyone cares.) >> > >> > What I *am* willing to spend time on is removing float-timestamp code >> > in HEAD. I've not yet heard anybody speak against doing that (or at >> > least, nothing I interpreted as a vote against it). If I've not heard >> > any complaints by tomorrow, I'll get started on that. >> >> Rip it out! 
> > Already happened: > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] The case for removing replacement selection sort
On Wed, Sep 6, 2017 at 2:47 PM, Thomas Munro wrote: >> I attach a patch to remove replacement selection, which I'll submit to CF 1. > > This breaks the documentation build, because > doc/src/sgml/release-9.6.sgml still contains linkend="guc-replacement-sort-tuples"> but you removed that id. Thanks for looking into it. I suppose that the solution is to change the 9.6 release notes to no longer have a replacement_sort_tuples link. Anyone else have an opinion on that? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Minor code improvement to postgresGetForeignPlan
Tatsuro Yamada writes: > The declaration of postgresGetForeignPlan uses baserel, but > the actual definition uses foreignrel. It would be better to sync. Pushed, thanks. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] The case for removing replacement selection sort
On Fri, Sep 1, 2017 at 6:00 AM, Peter Geoghegan wrote: > On Wed, Aug 30, 2017 at 4:59 PM, Peter Geoghegan wrote: >> I may submit the simple patch to remove replacement selection, if >> other contributors are receptive. Apart from everything else, the >> "incrementalism" of replacement selection works against cleverer batch >> memory management of the type I just outlined, which seems like a >> useful direction to take tuplesort in. > > I attach a patch to remove replacement selection, which I'll submit to CF 1. This breaks the documentation build, because doc/src/sgml/release-9.6.sgml still contains linkend="guc-replacement-sort-tuples"> but you removed that id. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] merge psql ef/ev sf/sv handling functions
Fabien COELHO writes: > Here is an attempt at merging some functions which removes 160 lines of > code. Pushed with minor adjustments. I thought a couple of variable names could be better chosen, but mostly, you can't do this sort of thing: -psql_error("The server (version %s) does not support editing function source.\n", +psql_error("The server (version %s) does not support editing %s.\n", formatPGVersionNumber(pset.sversion, false, - sverbuf, sizeof(sverbuf))); + sverbuf, sizeof(sverbuf)), + is_func ? "function source" : "view definitions"); It's too much of a pain in the rear for translators. See https://www.postgresql.org/docs/devel/static/nls-programmer.html#nls-guidelines Usually we just use two independent messages, and that's what I did. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting pd_lower in GIN metapage
Amit Langote writes: > On 2017/08/22 9:39, Michael Paquier wrote: >> On Tue, Jun 27, 2017 at 4:56 PM, Amit Langote >> wrote: >>> I updated brin_mask() and spg_mask() in the attached updated patches so >>> that they consider meta pages as containing unused space. I looked briefly at these patches. I'm not sure that it's safe for the mask functions to assume that meta pages always have valid pd_lower. What happens when replaying log data concerning an old index that doesn't have that field filled? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] why not parallel seq scan for slow functions
Robert Haas writes: > On Wed, Sep 6, 2017 at 3:41 PM, Tom Lane wrote: >> I'm not entirely following. I thought that add_path was set up to treat >> "can be parallelized" as an independent dimension of merit, so that >> parallel paths would always survive. > Here, the Gather path is not parallel-safe, but rather > parallel-restricted: Ah, right, the problem is with the Gather not its sub-paths. >> Might be a tad messy to rearrange things that way. > Why do you think I wanted you to do it? :-) I'll think about it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
I wrote: > Ah. I was not thinking of touching pg_atomic_read_u32/u64_impl, > although now that you mention it, it's not clear to me why we > couldn't simplify > - return *(&ptr->value); > + return ptr->value; Just to check, I applied that change to pg_atomic_read_u32_impl and pg_atomic_read_u64_impl, and recompiled. I get bit-for-bit the same backend executable. Maybe it would have an effect on some other compiler, but I doubt it, except perhaps at -O0. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] why not parallel seq scan for slow functions
On Wed, Sep 6, 2017 at 3:41 PM, Tom Lane wrote: > Robert Haas writes: >> In particular, as Jeff and Amit point out, it >> may well be that (a) before apply_projection_to_path(), the cheapest >> plan is non-parallel and (b) after apply_projection_to_path(), the >> cheapest plan would be a Gather plan, except that it's too late >> because we've already thrown that path out. > > I'm not entirely following. I thought that add_path was set up to treat > "can be parallelized" as an independent dimension of merit, so that > parallel paths would always survive. It treats parallel-safety as an independent dimension of merit; a parallel-safe plan is more meritorious than one of equal cost which is not. We need that so that because, for example, forming a partial path for a join means joining a partial path to a parallel-safe path. But that doesn't help us here; that's to make sure we can build the necessary stuff *below* the Gather. IOW, if we threw away parallel-safe paths because there was a cheaper parallel-restricted path, we might be unable to build a partial path for the join *at all*. Here, the Gather path is not parallel-safe, but rather parallel-restricted: it's OK for it to exist in a plan that uses parallelism (duh), but it can't be nested under another Gather (also duh, kinda). So before accounting for the differing projection cost, the Gather path is doubly inferior: it is more expensive AND not parallel-safe, whereas the competing non-parallel plan is both cheaper AND parallel-safe. After applying the expensive target list, the parallel-safe plan gets a lot more expensive, but the Gather path gets more expensive to a lesser degree because the projection step ends up below the Gather and thus happens in parallel, so now the Gather plan, still a loser on parallel-safety, is a winner on total cost and thus ought to have been retained and, in fact, ought to have won. Instead, we threw it out too early. 
>> What we ought to do, I think, is avoid generating gather paths until >> after we've applied the target list (and the associated costing >> changes) to both the regular path list and the partial path list. > > Might be a tad messy to rearrange things that way. Why do you think I wanted you to do it? :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] why not parallel seq scan for slow functions
Robert Haas writes: > In particular, as Jeff and Amit point out, it > may well be that (a) before apply_projection_to_path(), the cheapest > plan is non-parallel and (b) after apply_projection_to_path(), the > cheapest plan would be a Gather plan, except that it's too late > because we've already thrown that path out. I'm not entirely following. I thought that add_path was set up to treat "can be parallelized" as an independent dimension of merit, so that parallel paths would always survive. > What we ought to do, I think, is avoid generating gather paths until > after we've applied the target list (and the associated costing > changes) to both the regular path list and the partial path list. Might be a tad messy to rearrange things that way. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
Andres Freund writes: > On 2017-09-06 15:25:20 -0400, Tom Lane wrote: >> I think we can just use "old = ptr->value" to set up for the cmpxchg >> loop in every generic.h function that uses such a loop. > I think we might have been talking past each other - I thought you were > talking about changing the pg_atomic_read_u64_impl implementation for > external users. Ah. I was not thinking of touching pg_atomic_read_u32/u64_impl, although now that you mention it, it's not clear to me why we couldn't simplify - return *(&ptr->value); + return ptr->value; AFAIK, the compiler is entitled to, and does, simplify away that take-a-pointer-and-immediately-dereference-it dance. If it did not, a whole lot of standard array locutions would be much less efficient than they should be. What matters here is the volatile qualifier, which we've already got. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] why not parallel seq scan for slow functions
On Wed, Sep 6, 2017 at 3:18 PM, Tom Lane wrote: > Robert Haas writes: >> On Wed, Sep 6, 2017 at 1:47 PM, Tom Lane wrote: >>> If somebody's applying apply_projection_to_path to a path that's already >>> been add_path'd, that's a violation of the documented restriction. > >> /me is confused. Isn't that exactly what grouping_planner() is doing, >> and has done ever since your original pathification commit >> (3fc6e2d7f5b652b417fa6937c34de2438d60fa9f)? It's iterating over >> current_rel->pathlist, so surely everything in there has been >> add_path()'d. > > I think the assumption there is that we no longer care about validity of > the input Relation, since we won't be looking at it any more (and > certainly not adding more paths to it). If there's some reason why > that's not true, then maybe grouping_planner has a bug there. Right, that's sorta what I assumed. But I think that thinking is flawed in the face of parallel query, because of the fact that apply_projection_to_path() pushes down target list projection below Gather when possible. In particular, as Jeff and Amit point out, it may well be that (a) before apply_projection_to_path(), the cheapest plan is non-parallel and (b) after apply_projection_to_path(), the cheapest plan would be a Gather plan, except that it's too late because we've already thrown that path out. What we ought to do, I think, is avoid generating gather paths until after we've applied the target list (and the associated costing changes) to both the regular path list and the partial path list. Then the cost comparison is apples-to-apples. The use of apply_projection_to_path() on every path in the pathlist would be fine if it were adjusting all the costs by a uniform amount, but it isn't. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
On 2017-09-06 15:25:20 -0400, Tom Lane wrote: > Andres Freund writes: > > On 2017-09-06 15:12:13 -0400, Tom Lane wrote: > >> It looks to me like two of the three implementations promise no such > >> thing. > > > They're volatile vars, so why not? > > Yeah, but so are the caller's variables. That is, in > > pg_atomic_exchange_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 xchg_) > { > uint64 old; > old = ptr->value; > > ISTM that the compiler is required to actually fetch ptr->value, not > rely on some previous read of it. I do not think that (the first > version of) pg_atomic_read_u64_impl is adding any guarantee that wasn't > there already. > > >> Even if they somehow do, it hardly matters given that the cmpxchg loop > >> would be self-correcting. > > > Well, in this one instance maybe, hardly in others. > > All the functions involved use nigh-identical cmpxchg loops. > > > What are you suggesting as an alternative? > > I think we can just use "old = ptr->value" to set up for the cmpxchg > loop in every generic.h function that uses such a loop. I think we might have been talking past each other - I thought you were talking about changing the pg_atomic_read_u64_impl implementation for external users. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
Andres Freund writes: > On 2017-09-06 15:12:13 -0400, Tom Lane wrote: >> It looks to me like two of the three implementations promise no such >> thing. > They're volatile vars, so why not? Yeah, but so are the caller's variables. That is, in pg_atomic_exchange_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 xchg_) { uint64 old; old = ptr->value; ISTM that the compiler is required to actually fetch ptr->value, not rely on some previous read of it. I do not think that (the first version of) pg_atomic_read_u64_impl is adding any guarantee that wasn't there already. >> Even if they somehow do, it hardly matters given that the cmpxchg loop >> would be self-correcting. > Well, in this one instance maybe, hardly in others. All the functions involved use nigh-identical cmpxchg loops. > What are you suggesting as an alternative? I think we can just use "old = ptr->value" to set up for the cmpxchg loop in every generic.h function that uses such a loop. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?
On 2017-09-06 17:36:02 +0900, Kyotaro HORIGUCHI wrote: > Hi, > > At Mon, 4 Sep 2017 17:17:19 +0900, Michael Paquier > wrote in > > > On Mon, Sep 4, 2017 at 4:04 PM, Andres Freund wrote: > > > I've not read through the thread, but this seems like the wrong approach > > > to me. The receiving side should use a correct value, instead of putting > > > this complexity on the sender's side. > > > > Yes I agree with that. The current patch gives me a bad feeling to be > > honest with the way it does things.. > > The problem is that the current ReadRecord needs the first one of > a series of continuation records from the same source with the > other part, the master in the case. What's the problem with that? We can easily keep track of the beginning of a record, and only confirm the address before that. > A (or the) solution closed in the standby side is allowing to > read a series of continuation records from multiple sources. I'm not following. All we need to use is the beginning of the relevant records, that's easy enough to keep track of. We don't need to read the WAL or anything. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
On 2017-09-06 15:12:13 -0400, Tom Lane wrote: > Andres Freund writes: > > On 2017-09-06 14:31:26 -0400, Tom Lane wrote: > >> However, if that's the reasoning, why don't we make all of these > >> use simple reads? It seems unlikely that a locked read is free. > > > We don't really use locked reads? All the _atomic_ wrapper forces is an > > actual read from memory rather than a register. > > It looks to me like two of the three implementations promise no such > thing. They're volatile vars, so why not? > Even if they somehow do, it hardly matters given that the cmpxchg loop > would be self-correcting. Well, in this one instance maybe, hardly in others. > Mostly, though, I'm looking at the fallback pg_atomic_read_u64_impl > implementation (with a CAS), which seems far more expensive than can > be justified for this. What are you suggesting as an alternative? - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] why not parallel seq scan for slow functions
Robert Haas writes: > On Wed, Sep 6, 2017 at 1:47 PM, Tom Lane wrote: >> If somebody's applying apply_projection_to_path to a path that's already >> been add_path'd, that's a violation of the documented restriction. > /me is confused. Isn't that exactly what grouping_planner() is doing, > and has done ever since your original pathification commit > (3fc6e2d7f5b652b417fa6937c34de2438d60fa9f)? It's iterating over > current_rel->pathlist, so surely everything in there has been > add_path()'d. I think the assumption there is that we no longer care about validity of the input Relation, since we won't be looking at it any more (and certainly not adding more paths to it). If there's some reason why that's not true, then maybe grouping_planner has a bug there. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
Andres Freund writes: > On 2017-09-06 14:31:26 -0400, Tom Lane wrote: >> However, if that's the reasoning, why don't we make all of these >> use simple reads? It seems unlikely that a locked read is free. > We don't really use locked reads? All the _atomic_ wrapper forces is an > actual read from memory rather than a register. It looks to me like two of the three implementations promise no such thing. Even if they somehow do, it hardly matters given that the cmpxchg loop would be self-correcting. Mostly, though, I'm looking at the fallback pg_atomic_read_u64_impl implementation (with a CAS), which seems far more expensive than can be justified for this. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] why not parallel seq scan for slow functions
On Wed, Sep 6, 2017 at 1:47 PM, Tom Lane wrote: > Robert Haas writes: >> On Tue, Sep 5, 2017 at 4:34 AM, Amit Kapila wrote: >>> Okay, now I understand your point, but I think we already change the >>> cost of paths in apply_projection_to_path which is done after add_path >>> for top level scan/join paths. > >> Yeah. I think that's a nasty hack, and I think it's Tom's fault. :-) > > Yeah, and it's also documented: > > * This has the same net effect as create_projection_path(), except that if > * a separate Result plan node isn't needed, we just replace the given path's > * pathtarget with the desired one. This must be used only when the caller > * knows that the given path isn't referenced elsewhere and so can be modified > * in-place. > > If somebody's applying apply_projection_to_path to a path that's already > been add_path'd, that's a violation of the documented restriction. /me is confused. Isn't that exactly what grouping_planner() is doing, and has done ever since your original pathification commit (3fc6e2d7f5b652b417fa6937c34de2438d60fa9f)? It's iterating over current_rel->pathlist, so surely everything in there has been add_path()'d. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add psql variables showing server version and psql version.
Thus short, simple but meaningful examples which show how to do something useful with all that in the documentation may help people take advantage of these new features. I don't have an objection to providing an example. I wasn't terribly impressed with Simon's version, but maybe we can do better. That was just a quick idea to show how they could be used, probably it can be improved. I think it should be concrete, not partly abstract; and likely it needs to be with \if not the variable proper. ISTM that currently there is no "not". Maybe I do not understand your sentence. Given my experience with "\d*", I'm not sure I would assume that a new psql feature would work with older servers. I don't recall anyone questioning whether PQserverVersion() would work with older servers. Being able to tell what version the server is is sort of the point, no? If there were a restriction on what it would work with, I would agree that that needs to be documented. But I don't think "yes, it really works" is a helpful use of documentation space. Sure. I can use psql without knowing that there is a libpq underneath, that it contains a PQserverVersion function implemented since pg7, and that the SERVER_VERSION_* variables introduced in pg10 or pg11 rely on that thus it should work with pg7/8/9 as well. It is not obvious, as a new feature could depend on any combination of server side functions, protocol extension, client library functions or client code... -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
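Since this subthread keeps returning to what a useful documentation example would look like, here is a minimal sketch of the \if-based usage being discussed. It is only illustrative: it assumes a psql where variables are substituted inside backticks (the subject of the separate backtick-expansion thread) and it shells out to the external expr utility for the comparison, because \if itself only evaluates boolean literals, not expressions.

```
-- bail out of a script when the server is older than 9.6
-- (sketch only; assumes variable substitution inside backticks)
\if `expr :SERVER_VERSION_NUM \< 90600`
    \echo 'This script requires PostgreSQL 9.6 or later, server is' :SERVER_VERSION_NAME
    \quit
\endif
```

The design question in the thread is precisely whether such an example belongs in the documentation and, if so, how concrete it should be.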
Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan
> > > I think a GUC is a decent, though not perfect, mechanism for this. > This problem isn't restricted to PL/pgsql; indeed, the cases I've seen > have come via prepared queries, not PL/pgsql functions. Even without > that, one advantage of a GUC is that they are fairly broadly > understood and experienced users understand what they can do with > them. They can be set at various different scopes (system, user, > database, SET clause for a particular function) and it's relatively > convenient to do so. Some kind of PL/pgsql-specific PRAGMA syntax is > more likely to be overlooked by users who would actually benefit from > it, and also won't cover non-PL/pgsql cases. If we were going to go > the PRAGMA route, it would make more sense to me to define that as a > way of setting a GUC for the scope of one PL/pgsql block, like PRAGMA > SETTING(custom_plan_tries, 0). I think it is in general unfortunate > that we don't have a mechanism to change a GUC for the lifespan of one > particular query, like this: > > LET custom_plan_tries = 0 IN SELECT ... > > I bet a lot of people would find that quite convenient. The problem > of needing to set a planner GUC for one particular query is pretty > common, and Pavel is absolutely right that having to do SET beforehand > and RESET afterward is ugly. > As far as I know, currently only prepared statements and PL/pgSQL use the plan cache, so a special GUC makes sense only for these two environments. I understand that a GUC can be used, and makes sense, when users cannot modify the source code or when they want to apply the change globally. For PL/pgSQL there should be block- or statement-level syntax - PRAGMA is a well-known alternative from the Ada language. Maybe another syntax like BEGIN SET xxx = 1, yyy = 2 .. END ... theoretically we could introduce a block-level GUC. I don't like it, because there is already successful syntax in Ada and PL/SQL. 
Maybe it would be useful to enhance the PREPARE command to accept a second optional parameter for controlling the plan cache - I have no idea about the syntax. Regards Pavel > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
Re: [HACKERS] Fix performance of generic atomics
Hi, On 2017-09-06 14:31:26 -0400, Tom Lane wrote: > I wrote: > > Anyway, I don't have a big objection to applying this. My concern > > is more that we need to be taking a harder look at other parts of > > the atomics infrastructure, because tweaks there are likely to buy > > much more. > > I went ahead and pushed these patches, adding __sync_fetch_and_sub > since gcc seems to provide that on the same footing as these other > functions. > > Looking at these generic functions a bit closer, I notice that > most of them are coded like > > old = pg_atomic_read_u32_impl(ptr); > while (!pg_atomic_compare_exchange_u32_impl(ptr, &old, old | or_)) > /* skip */; > > but there's an exception: pg_atomic_exchange_u64_impl just does > > old = ptr->value; > while (!pg_atomic_compare_exchange_u64_impl(ptr, &old, xchg_)) > /* skip */; > > That's interesting. Why not pg_atomic_read_u64_impl there? Because our platforms don't generally guarantee that 64-bit reads are atomic: /* * 64 bit reads aren't safe on all platforms. In the generic * implementation implement them as a compare/exchange with 0. That'll * fail or succeed, but always return the old value. Possible might store * a 0, but only if the prev. value also was a 0 - i.e. harmless. */ pg_atomic_compare_exchange_u64_impl(ptr, &old, 0); so I *guess* I decided doing an unnecessary cmpxchg wasn't useful. But since then we've added an optimization for platforms where we know 64-bit reads have single-copy atomicity. So we could change that now. > I think that the simple read is actually okay as it stands: if we > are unlucky enough to get a torn read, the first compare/exchange > will fail to compare equal, and it will replace "old" with a valid > atomically-read value, and then the next loop iteration has a chance > to succeed. Therefore there's no need to pay the extra cost of a > locked read instead of an unlocked one. > > However, if that's the reasoning, why don't we make all of these > use simple reads? 
It seems unlikely that a locked read is free. We don't really use locked reads? All the _atomic_ wrapper forces is an actual read from memory rather than a register. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add psql variables showing server version and psql version.
Fabien COELHO writes: > Thus short, simple but meaningful examples which show how to do something > useful with all that in the documentation may help people take advantage > of these new features. I don't have an objection to providing an example. I wasn't terribly impressed with Simon's version, but maybe we can do better. I think it should be concrete, not partly abstract; and likely it needs to be with \if not the variable proper. > Given my experience with "\d*", I'm not sure I would assume that a new > psql feature would work with older servers. I don't recall anyone questioning whether PQserverVersion() would work with older servers. Being able to tell what version the server is is sort of the point, no? If there were a restriction on what it would work with, I would agree that that needs to be documented. But I don't think "yes, it really works" is a helpful use of documentation space. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
I wrote: > Anyway, I don't have a big objection to applying this. My concern > is more that we need to be taking a harder look at other parts of > the atomics infrastructure, because tweaks there are likely to buy > much more. I went ahead and pushed these patches, adding __sync_fetch_and_sub since gcc seems to provide that on the same footing as these other functions. Looking at these generic functions a bit closer, I notice that most of them are coded like old = pg_atomic_read_u32_impl(ptr); while (!pg_atomic_compare_exchange_u32_impl(ptr, &old, old | or_)) /* skip */; but there's an exception: pg_atomic_exchange_u64_impl just does old = ptr->value; while (!pg_atomic_compare_exchange_u64_impl(ptr, &old, xchg_)) /* skip */; That's interesting. Why not pg_atomic_read_u64_impl there? I think that the simple read is actually okay as it stands: if we are unlucky enough to get a torn read, the first compare/exchange will fail to compare equal, and it will replace "old" with a valid atomically-read value, and then the next loop iteration has a chance to succeed. Therefore there's no need to pay the extra cost of a locked read instead of an unlocked one. However, if that's the reasoning, why don't we make all of these use simple reads? It seems unlikely that a locked read is free. If there's actually a reason for pg_atomic_exchange_u64_impl to be different from the rest, it needs to have a comment explaining why. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add psql variables showing server version and psql version.
Hello, * Clarification that this will work for current AND past server versions The short answer is it works. I do not think we need a longer answer. To have something operational you have to know quite a bit of psql details (:-substitutions, backslash command logic, gset, if, quit...). Thus short, simple but meaningful examples which show how to do something useful with all that in the documentation may help people take advantage of these new features. Given my experience with "\d*", I'm not sure I would assume that a new psql feature would work with older servers. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] why not parallel seq scan for slow functions
Robert Haas writes: > On Tue, Sep 5, 2017 at 4:34 AM, Amit Kapila wrote: >> Okay, now I understand your point, but I think we already change the >> cost of paths in apply_projection_to_path which is done after add_path >> for top level scan/join paths. > Yeah. I think that's a nasty hack, and I think it's Tom's fault. :-) Yeah, and it's also documented: * This has the same net effect as create_projection_path(), except that if * a separate Result plan node isn't needed, we just replace the given path's * pathtarget with the desired one. This must be used only when the caller * knows that the given path isn't referenced elsewhere and so can be modified * in-place. If somebody's applying apply_projection_to_path to a path that's already been add_path'd, that's a violation of the documented restriction. It might be that we should just get rid of apply_projection_to_path and use create_projection_path, which is less mistake-prone at the cost of manufacturing another level of Path node. Now that that has the dummypp flag, it really shouldn't make any difference in terms of the accuracy of the cost estimates. > I'd feel a lot happier if Tom were to decide how this ought to be > fixed, because - in spite of some modifications by various parallel > query code - this is basically all his design and mostly his code. I can take a look, but not right away. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add citext_pattern_ops to citext contrib module
On Tue, Jul 18, 2017 at 5:18 AM, Alexey Chernyshov wrote: > Hi all, Hi Alexey, I took a look at your patch. Builds fine here, and passes the new tests. I'm new to this code, so take my review with a grain of salt. > The attached patch introduces citext_pattern_ops for citext extension type > like text_pattern_ops for text type. Here are operators ~<~, ~<=~, ~>~, ~>=~ > combined into citext_pattern_ops operator class. These operators simply > compare underlying citext values as C strings with memcmp() function. Are there any cases where performing the str_tolower with the default collation, then comparing byte-by-byte, could backfire? The added test cases don't make use of any multibyte/accented characters, so it's not clear to me yet what *should* be happening in those cases. It also might be a good idea to add some test cases that compare strings of different lengths, to exercise all the paths in internal_citext_pattern_cmp(). > +-- test citext_pattern_cmp() function explicetily. Spelling nitpick in the new SQL: s/explicetily/explicitly . --Jacob -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
Hi Jeff, On 09/05/2017 03:47 PM, Jeff Janes wrote: I ran pgbench (-M prepared) with synchronous_commit 'on' and 'off' using both logged and unlogged tables. Also ran an internal benchmark which didn't show anything either. What scale factor and client count? How many cores per socket? It looks like Sokolov was just starting to see gains at 200 clients on 72 cores, using -N transaction. I have done a run with scale factor 300, and another with 3000 on a 2S/28C/56T/256Gb w/ 2 x RAID10 SSD machine; up to 200 clients. I would consider the runs as "noise" as I'm seeing +-1% for all client counts, so nothing like Yura is seeing in [1] for the higher client counts. I did a run with -N too using scale factor 300, using the settings in [1], but with same result (+-1%). [1] https://www.postgresql.org/message-id/d62d7d9d473d07e172d799d5a57e7...@postgrespro.ru Best regards, Jesper -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER INDEX .. SET STATISTICS ... behaviour
On 6 September 2017 at 10:27, Tom Lane wrote: > Simon Riggs writes: >> Other than that, I'm good to commit. >> Any last minute objections? > > The docs and comments could stand a bit closer copy-editing by a > native English speaker. Other than that, it seemed sane in a > quick read-through. Will do -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] why not parallel seq scan for slow functions
On Tue, Sep 5, 2017 at 4:34 AM, Amit Kapila wrote: > Okay, now I understand your point, but I think we already change the > cost of paths in apply_projection_to_path which is done after add_path > for top level scan/join paths. Yeah. I think that's a nasty hack, and I think it's Tom's fault. :-) It's used in various places with comments like this: /* * The path might not return exactly what we want, so fix that. (We * assume that this won't change any conclusions about which was the * cheapest path.) */ And in another place: * In principle we should re-run set_cheapest() here to identify the * cheapest path, but it seems unlikely that adding the same tlist * eval costs to all the paths would change that, so we don't bother. I think these assumptions were a little shaky even before parallel query came along, but they're now outright false, because we're not adding the *same* tlist eval costs to all paths any more. The parallel paths are getting smaller costs. That probably doesn't matter much if the expressions in questions are things like a + b, but when as in Jeff's example it's slow(a), then it matters a lot. I'd feel a lot happier if Tom were to decide how this ought to be fixed, because - in spite of some modifications by various parallel query code - this is basically all his design and mostly his code. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER INDEX .. SET STATISTICS ... behaviour
Simon Riggs writes: > Other than that, I'm good to commit. > Any last minute objections? The docs and comments could stand a bit closer copy-editing by a native English speaker. Other than that, it seemed sane in a quick read-through. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
Robert Haas writes: > It's not a question of whether the return value is used, but of > whether the updated value of *old is used. Right, but if we re-read "old" in the loop, and if the primitive doesn't return "old" (or does, but call site ignores it) then in principle the CAS might be strength-reduced a bit. Whether that can win enough to be better than removing the unlocked read, I dunno. In the case at hand, looking at a loop like while (count-- > 0) { (void) pg_atomic_fetch_or_u32(myptr, 1); } with our HEAD code and RHEL6's gcc I get this for the inner loop: .L9: movl (%rdx), %eax movl %eax, %ecx orl $1, %ecx lock cmpxchgl %ecx, (%rdx) setz %cl testb %cl, %cl je .L9 subq $1, %rbx testq %rbx, %rbx jg .L9 Applying the proposed generic.h patch, it becomes .L10: movl (%rsi), %eax .L9: movl %eax, %ecx orl $1, %ecx lock cmpxchgl %ecx, (%rdx) setz %cl testb %cl, %cl je .L9 subq $1, %rbx testq %rbx, %rbx jg .L10 Note that in both cases the cmpxchgl is coming out of the asm construct in pg_atomic_compare_exchange_u32_impl from atomics/arch-x86.h, so that even if a simpler assembly instruction were possible given that we don't need %eax to be updated, there's no chance of that actually happening. This gets back to the point I made in the other thread that maybe the arch-x86.h asm sequences are not an improvement over the gcc intrinsics. The code is the same if the loop is modified to use the result of pg_atomic_fetch_or_u32, so I won't bother showing that. 
Adding the proposed generic-gcc.h patch, the loop reduces to .L11: lock orl $1, (%rax) subq $1, %rbx testq %rbx, %rbx jg .L11 or if we make the loop do junk += pg_atomic_fetch_or_u32(myptr, 1); then we get .L13: movl (%rsi), %eax .L10: movl %eax, %edi movl %eax, %ecx orl $1, %ecx lock cmpxchgl %ecx, (%rdx) jne .L10 addl %edi, %r8d subq $1, %rbx testq %rbx, %rbx jg .L13 So that last is slightly better than the generic.h + asm CAS version, perhaps not meaningfully so --- but it's definitely better when the compiler can see the result isn't used. In short then, given the facts on the ground here, in particular the asm-based CAS primitive at the heart of these loops, it's clear that taking the read out of the loop can't hurt. But that should be read as a rather narrow conclusion. With a different compiler and/or a different version of pg_atomic_compare_exchange_u32_impl, maybe the answer would be different. And of course it's moot once the generic-gcc.h patch is applied. Anyway, I don't have a big objection to applying this. My concern is more that we need to be taking a harder look at other parts of the atomics infrastructure, because tweaks there are likely to buy much more. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER INDEX .. SET STATISTICS ... behaviour
On 4 September 2017 at 10:30, Adrien Nayrat wrote: > On 09/04/2017 06:16 PM, Alexander Korotkov wrote: >> Looks good for me. I've integrated those changes in the patch. >> New revision is attached. > > Thanks, I changed status to "ready for committer". This looks useful and good to me. I've changed this line of code to return NULL rather than return tuple: if (!HeapTupleIsValid(tuple)) return tuple; Other than that, I'm good to commit. Any last minute objections? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Add --no-comments to skip COMMENTs with pg_dump
On Wed, Sep 6, 2017 at 12:26 PM, Simon Riggs wrote: >>> I'd personally be fine with --no-whatever for any whatever that might >>> be a subsidiary property of database objects. We've got >>> --no-security-labels, --no-tablespaces, --no-owner, and >>> --no-privileges already, so what's wrong with --no-comments? >>> >>> (We've also got --no-publications; I think it's arguable whether that >>> is the same kind of thing.) >> >> And --no-subscriptions in the same bucket. > > Yes, it is. I was suggesting that we remove those as well. That seems like a non-starter to me. I have used those options many times to solve real problems, and I'm sure other people have as well. We wouldn't have ended up with all of these options if users didn't want to control such things. > But back to the main point which is that --no-comments discards ALL > comments simply to exclude one pointless and annoying comment. That > runs counter to our stance that we do not allow silent data loss. I > want to solve the problem too. I accept that not everyone uses > comments, but if they do, spilling them all on the floor is a user > visible slip up that we should not be encouraging. Sorry y'all. /me shrugs. I agree that there could be better solutions to the original problem, but I think there's no real argument that a user can't exclude comments from a backup if they wish to do so. As the OP already pointed out, it can still be done without the switch; it's just more annoying. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Variable substitution in psql backtick expansion
Seeing it as is, it calls for having "SERVER_VERSION" as well, but I'm not sure of the better way to get it. I tried with "SELECT VERSION() AS SERVER_VERSION \gset" but varnames are lowerized. The problem there is you can't get version() without an extra round trip to the server --- and an extra logged query --- which people are going to complain about. Here is a PoC that does it through a guc, just like "server_version" (the short version) is transmitted, with a fallback if it is not there. Whether it is worth it is debatable, but I like the symmetry of having the same informations accessible the same way for client and server sides. -- Fabien.diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 5f59a38..8b69ed1 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7961,8 +7961,8 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir' -Reports the version number of the server. It is determined by the -value of PG_VERSION when building the server. +Reports the version number of the server as a short string. It is determined +by the value of PG_VERSION when building the server. @@ -7981,6 +7981,20 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir' + + server_version_raw (string) + + server_version_raw configuration parameter + + + + +Reports the version of the server as a long string. It is determined +by the value of PG_VERSION_STR when building the server. + + + + wal_block_size (integer) diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 5bdbc1e..1be57d2 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -3690,11 +3690,14 @@ bar +SERVER_VERSION SERVER_VERSION_NAME SERVER_VERSION_NUM -The server's version number as a string, for +The server's version number as a long string, for +example PostgreSQL 11devel ..., +as a short string, for example 9.6.2, 10.1 or 11beta1, and in numeric form, for example 90602 or 11. 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 246fea8..fd843d4 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -500,6 +500,7 @@ static char *locale_collate; static char *locale_ctype; static char *server_encoding_string; static char *server_version_string; +static char *server_version_raw_string; static int server_version_num; static char *timezone_string; static char *log_timezone_string; @@ -3298,6 +3299,18 @@ static struct config_string ConfigureNamesString[] = }, { + /* Can't be set in postgresql.conf */ + {"server_version_raw", PGC_INTERNAL, PRESET_OPTIONS, + gettext_noop("Shows the server version string."), + NULL, + GUC_REPORT | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE + }, + &server_version_raw_string, + PG_VERSION_STR, + NULL, NULL, NULL + }, + + { /* Not for general use --- used by SET ROLE */ {"role", PGC_USERSET, UNGROUPED, gettext_noop("Sets the current role."), diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index fe0b83e..e2ba8ee 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -3358,7 +3358,8 @@ void SyncVariables(void) { char vbuf[32]; - const char *server_version; + const char *server_version, + *server_version_raw; /* get stuff from connection */ pset.encoding = PQclientEncoding(pset.db); @@ -3385,6 +3386,17 @@ SyncVariables(void) snprintf(vbuf, sizeof(vbuf), "%d", pset.sversion); SetVariable(pset.vars, "SERVER_VERSION_NUM", vbuf); + server_version_raw = PQparameterStatus(pset.db, "server_version_raw"); + /* fall back again */ + if (!server_version_raw) + { + snprintf(vbuf, sizeof(vbuf), "PostgreSQL "); + formatPGVersionNumber(pset.sversion, true, vbuf + strlen(vbuf), + sizeof(vbuf) - strlen(vbuf)); + server_version_raw = vbuf; + } + SetVariable(pset.vars, "SERVER_VERSION", server_version_raw); + /* send stuff to it, too */ PQsetErrorVerbosity(pset.db, pset.verbosity); PQsetErrorContextVisibility(pset.db, pset.show_context); @@ -3403,6 +3415,7 @@ 
UnsyncVariables(void) SetVariable(pset.vars, "HOST", NULL); SetVariable(pset.vars, "PORT", NULL); SetVariable(pset.vars, "ENCODING", NULL); + SetVariable(pset.vars, "SERVER_VERSION", NULL); SetVariable(pset.vars, "SERVER_VERSION_NAME", NULL); SetVariable(pset.vars, "SERVER_VERSION_NUM", NULL); } diff --git a/src/interfaces/libpq/fe-protocol2.c b/src/interfaces/libpq/fe-protocol2.c index a58f701..4aa18fd 100644 --- a/src/interfaces/libpq/fe-protocol2.c +++ b/src/interfaces/libpq/fe-protocol2.c @@ -281,6 +281,10 @@ pqSetenvPoll(PGconn *conn) { char *ptr; + /* keep
Re: [HACKERS] Patch: Add --no-comments to skip COMMENTs with pg_dump
On 1 September 2017 at 22:08, Michael Paquier wrote: > On Sat, Sep 2, 2017 at 1:53 AM, Robert Haas wrote: >> On Mon, Aug 21, 2017 at 5:30 PM, Simon Riggs wrote: >>> Thinking ahead, are we going to add a new --no-objecttype switch every >>> time someone wants it? >> >> I'd personally be fine with --no-whatever for any whatever that might >> be a subsidiary property of database objects. We've got >> --no-security-labels, --no-tablespaces, --no-owner, and >> --no-privileges already, so what's wrong with --no-comments? >> >> (We've also got --no-publications; I think it's arguable whether that >> is the same kind of thing.) > > And --no-subscriptions in the same bucket. Yes, it is. I was suggesting that we remove those as well. But back to the main point, which is that --no-comments discards ALL comments simply to exclude one pointless and annoying comment. That runs counter to our stance that we do not allow silent data loss. I want to solve the problem too. I accept that not everyone uses comments, but if they do, spilling them all on the floor is a user-visible slip-up that we should not be encouraging. Sorry y'all. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Create replication slot in pg_basebackup if requested and not yet present
On 8/18/17 05:28, Michael Banck wrote: >>> Rebased, squashed and slightly edited version attached. I've added this >>> to the 2017-07 commitfest now as well: >>> >>> https://commitfest.postgresql.org/14/1112/ >> Can you rebase this past some conflicting changes? > Thanks for letting me know, PFA a rebased version. I have reviewed the thread so far. I think there is general agreement that something like this would be good to have. I have not found any explanation, however, of why the "if not exists" behavior is desirable, let alone as the default. I can only think of two workflows here: Either you have scripts for previous PG versions that create the slot externally, in which case you omit --create, or you use the new functionality to have pg_basebackup create the slot. I don't see any use for pg_basebackup to opportunistically use a slot if it happens to exist. Even if there is one, it should not be the default. So please change that. A minor point: I suggest printing the message about the replication slot being created *after* the slot has been created. This aligns with how logical replication subscriptions work. I don't see the need for printing a message about temporary slots. Since they are temporary, they will go away automatically, so there is nothing the user needs to know there. With these two changes, I think I'd be content with this patch. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UPDATE of partition key
On Mon, Sep 4, 2017 at 10:52 AM, Amit Khandekar wrote: > On 4 September 2017 at 07:43, Amit Kapila wrote: > Oops sorry. Now attached. I have done some basic testing and initial review of the patch. I have some comments/doubts. I will continue the review. + if (mtstate->operation == CMD_UPDATE && mtstate->mt_transition_capture) + ExecARUpdateTriggers(estate, resultRelInfo, InvalidOid, For passing an invalid ItemPointer we are using InvalidOid; this seems a bit odd to me. Are we using a similar convention some other place? I think it would be better to just pass 0? -- - if ((event == TRIGGER_EVENT_DELETE && delete_old_table) || - (event == TRIGGER_EVENT_UPDATE && update_old_table)) + if (oldtup != NULL && + ((event == TRIGGER_EVENT_DELETE && delete_old_table) || + (event == TRIGGER_EVENT_UPDATE && update_old_table))) { Tuplestorestate *old_tuplestore; - Assert(oldtup != NULL); Only for TRIGGER_EVENT_UPDATE is it possible that oldtup can be NULL, so we have added an extra check for oldtup and removed the Assert, but for TRIGGER_EVENT_DELETE we never expect it to be NULL. Is it better to put the Assert outside the condition check (Assert(oldtup != NULL || event == TRIGGER_EVENT_UPDATE))? The same goes for newtup. I think we should also explain in comments why oldtup or newtup can be NULL in the case of TRIGGER_EVENT_UPDATE --- +triggers affect the row being moved. As far as AFTER ROW +triggers are concerned, AFTER DELETE and +AFTER INSERT triggers are applied; but +AFTER UPDATE triggers are not applied +because the UPDATE has been converted to a +DELETE and INSERT. 
The above comment says that the AFTER UPDATE trigger is not fired, but the code below calls ExecARUpdateTriggers: + if (mtstate->operation == CMD_UPDATE && mtstate->mt_transition_capture) + ExecARUpdateTriggers(estate, resultRelInfo, InvalidOid, + NULL, + tuple, + NULL, + mtstate->mt_transition_capture); -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add psql variables showing server version and psql version.
Simon Riggs writes: > * Clarification that this will work for current AND past server versions The short answer is it works. I do not think we need a longer answer. (The existing comment in libpq says that the "select version()" method works in any server version that supports protocol v2, which means that we will derive a correct server version ID for any server that modern libpq is capable of connecting to at all. I've not gone back to re-verify what I wrote in 2003, but I see no reason to doubt it. We'd have been considerably more fussed about whether PQserverVersion worked correctly for ancient server versions back then than we need be today.) > * Clarification to avoid confusion between VERSION and SERVER_VERSION In what way are these variables not adequately specified already? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
On Wed, Sep 6, 2017 at 12:01 PM, Tom Lane wrote: >> This seems like a pretty sound argument to me. I think Tom's probably >> right that the changes in generic-gcc.h are the important ones, but >> I'm not sure that's an argument against patching generics.h. Given >> that pg_atomic_compare_exchange_u32_impl is defined to update *old >> there seems to be no reason to call pg_atomic_read_u32_impl every time >> through the loop. > > Probably not. I'm not quite 100% convinced of that, because of my > observation that gcc is capable of generating different and better > code for some of these primitives if it can prove that the return > value is not needed. It's not clear that that could apply in any > of these uses of pg_atomic_compare_exchange_u32_impl, though. > In any case, by my own argument, it shouldn't matter, because if > any of these are really performance-critical then we should be > looking for better ways. It's not a question of whether the return value is used, but of whether the updated value of *old is used. Unless I'm confused. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
Robert Haas writes: > On Wed, Sep 6, 2017 at 9:42 AM, Sokolov Yura > wrote: >> But I think, generic version still should be "fixed". >> If generic version is not reached on any platform, then why it is kept? >> If it is reached somewhere, then it should be improved. > This seems like a pretty sound argument to me. I think Tom's probably > right that the changes in generic-gcc.h are the important ones, but > I'm not sure that's an argument against patching generics.h. Given > that pg_atomic_compare_exchange_u32_impl is defined to update *old > there seems to be no reason to call pg_atomic_read_u32_impl every time > through the loop. Probably not. I'm not quite 100% convinced of that, because of my observation that gcc is capable of generating different and better code for some of these primitives if it can prove that the return value is not needed. It's not clear that that could apply in any of these uses of pg_atomic_compare_exchange_u32_impl, though. In any case, by my own argument, it shouldn't matter, because if any of these are really performance-critical then we should be looking for better ways. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add psql variables showing server version and psql version.
On 6 September 2017 at 06:38, Tom Lane wrote: > Simon Riggs writes: >> Based upon input from Tom and Fabien, I propose this additional doc patch. > > I do not think any of this is appropriate, particularly not the reference > to 7.0.3. OK, no problem. SERVER_VERSION_NUM is a great new feature. I think these points need further changes: * An example of the intended use of SERVER_VERSION_NUM in psql * Clarification that this will work for current AND past server versions * Clarification to avoid confusion between VERSION and SERVER_VERSION Thanks -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Assert that the correct locks are held when calling PageGetLSN()
On Tue, Sep 5, 2017 at 10:49 PM, Michael Paquier wrote: > On Wed, Sep 6, 2017 at 1:15 AM, Jacob Champion wrote: >> On Sun, Sep 3, 2017 at 11:14 PM, Michael Paquier >> wrote: >>> +#if defined(USE_ASSERT_CHECKING) && !defined(FRONTEND) >>> +void >>> +AssertPageIsLockedForLSN(Page page) >>> +{ >>> From a design point of view, bufmgr.c is an API interface for buffer >>> management on the backend-size, so just using FRONTEND there looks >>> wrong to me (bufmgr.h is already bad enough IMO now). >> >> The comments suggested that bufmgr.h could be incidentally pulled into >> frontend code through other headers. Or do you mean that I don't need >> to check for FRONTEND in the C source file (since, I assume, it's only >> ever being compiled for the backend)? I'm not sure what you mean by >> "bufmgr.h is already bad enough". > > I mean that this should not become worse without a better reason. This > patch makes it so. > >>> This bit in src/backend/access/transam/README may interest you: >>> Note that we must only use PageSetLSN/PageGetLSN() when we know the action >>> is serialised. Only Startup process may modify data blocks during recovery, >>> so Startup process may execute PageGetLSN() without fear of serialisation >>> problems. All other processes must only call PageSet/GetLSN when holding >>> either an exclusive buffer lock or a shared lock plus buffer header lock, >>> or be writing the data block directly rather than through shared buffers >>> while holding AccessExclusiveLock on the relation. >>> >>> So I see no problem with the exiting caller. >> >> Is heap_xlog_visible only called during startup? > > Recovery is more exact. When replaying a XLOG_HEAP2_VISIBLE record. > >> If so, is there a >> general rule for which locks are required during startup and which >> aren't? > > You can surely say that a process is fine to read a variable without a > lock if it is the only one updating it, other processes would need a > lock to read a variable. 
In this case the startup process is the only > one updating blocks, so that's at least one code path where the is no > need to take a lock when looking at a page LSN with PageGetLSN. Is there a way to programmatically determine whether or not we're in such a situation? That could be added to the assertion. The code here is pretty inconsistent on whether or not PageGetLSN is called inside or outside a lock -- see XLogReadBufferForRedoExtended for instance. > Another example is pageinspect which works on a copy of a page and > uses PageGetLSN, so no direct locks on the buffer page is needed. And again -- this case should be correctly handled by the new assertion. Copies of pages are not checked for locks; we can't recover a buffer header from a page that isn't part of the BufferBlocks array. > In short, it seems to me that this patch should be rejected in its > current shape. Is the half of the patch that switches PageGetLSN to BufferGetLSNAtomic correct, at least? > There is no need to put more constraints on a API which > does not need more constraints and which is used widely by extensions. The goal here is not to add more constraints, but to verify that the existing constraints are followed. Perhaps this patch doesn't do that well/correctly, but I want to make sure we're on the same page regarding the end goal. --Jacob -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
On Wed, Sep 6, 2017 at 9:42 AM, Sokolov Yura wrote: > Yes, you're right. > > But I think, generic version still should be "fixed". > If generic version is not reached on any platform, then why it is kept? > If it is reached somewhere, then it should be improved. This seems like a pretty sound argument to me. I think Tom's probably right that the changes in generic-gcc.h are the important ones, but I'm not sure that's an argument against patching generics.h. Given that pg_atomic_compare_exchange_u32_impl is defined to update *old there seems to be no reason to call pg_atomic_read_u32_impl every time through the loop. Whether doing so hurts performance in practice is likely to depend on a bunch of stuff, but we don't normally refuse to remove unnecessary code on the grounds that it will rarely be reached. Given that CPUs are weird, it's possible that there is some system where throwing an extra read of a value we already have into the loop works out to a win, but in the absence of evidence I'm reluctant to presume it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan
On Wed, Sep 6, 2017 at 11:03 AM, Tom Lane wrote: > That's fair enough. We need to have a discussion about exactly what > the knob does, which is distinct from the question of how you spell > the incantation for twiddling it. I'm dubious that a dumb "force a > custom plan" setting is going to solve all that many cases usefully. I think what people need is the ability to force the behavior in either direction - either insist on a custom plan, or insist on a generic plan. The former is what you need if the plan stinks, and the latter is what you need if replanning is a waste of effort. I have seen both cases. The latter has been a bigger problem than the former, because the former can be hacked around in various ugly and inefficient ways, but if we're adding a knob I think it should have three settings. There is perhaps an argument for even more configurability, like altering the number of plan tries from 5 to some other value, but I'm not clear that there's any use case for values other than 0, 5, and +infinity. >> I think it is in general unfortunate that we don't have a mechanism to >> change a GUC for the lifespan of one particular query, like this: > >> LET custom_plan_tries = 0 IN SELECT ... > > Hmm. I think the core problem here is that we're trying to control > the plancache, which is a pretty much behind-the-scenes mechanism. > Except in the case of an explicit PREPARE, you can't even see from > SQL that the cache is being used, or when it's used. So part of what > needs to be thought about, if we use the GUC approach, is when the > GUC's value is consulted. If we don't do anything special then > the GUC(s) would be consulted when retrieving plans from the cache, > and changes in their values from one retrieval to the next might > cause funny behavior. Maybe the relevant settings need to be captured > when the plancache entry is made ... not sure. What sort of funny behavior are you concerned about? 
It seems likely to me that in most cases the GUC will have the same value every time through, but if it doesn't, I'm not sure why we'd need to use the old value rather than the current one. Indeed, if the user changes the GUC from "force custom" to "force generic" and reruns the function, we want the new value to take effect, lest a POLA violation occur. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan
Robert Haas writes: > I don't think we can just indefinitely continue to resist > providing manual control over this behavior on the theory that some > day we'll fix it. That's fair enough. We need to have a discussion about exactly what the knob does, which is distinct from the question of how you spell the incantation for twiddling it. I'm dubious that a dumb "force a custom plan" setting is going to solve all that many cases usefully. > I think a GUC is a decent, though not perfect, mechanism for this. > This problem isn't restricted to PL/pgsql; indeed, the cases I've seen > have come via prepared queries, not PL/pgsql functions. That's 100% correct, and is actually the best reason not to consider a PRAGMA (or any other plpgsql-specific mechanism) as the incantation spelling. > I think it is in general unfortunate that we don't have a mechanism to > change a GUC for the lifespan of one particular query, like this: > LET custom_plan_tries = 0 IN SELECT ... Hmm. I think the core problem here is that we're trying to control the plancache, which is a pretty much behind-the-scenes mechanism. Except in the case of an explicit PREPARE, you can't even see from SQL that the cache is being used, or when it's used. So part of what needs to be thought about, if we use the GUC approach, is when the GUC's value is consulted. If we don't do anything special then the GUC(s) would be consulted when retrieving plans from the cache, and changes in their values from one retrieval to the next might cause funny behavior. Maybe the relevant settings need to be captured when the plancache entry is made ... not sure. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewriting the test of pg_upgrade as a TAP test
Alvaro Herrera writes: > Peter Eisentraut wrote: >> We also need to have a plan for handling the build farm. Maybe keep the >> vcregress.pl upgradecheck target as a thin wrapper for the time being? > The buildfarm already runs "make check" in src/bin/ when TAP tests are > enabled, which should be enough to trigger the rewritten test, no? I think Peter's on about the Windows case. Not sure how that's handled, but it's not "make check". regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan
On Tue, Sep 5, 2017 at 1:38 PM, Tom Lane wrote: > The complaint I have about PRAGMA is that it's yet another syntax for > accomplishing pretty much the same thing. If you don't like the GUC > solution, we've already got the "comp_option" syntax for static options > in plpgsql. Sure, that's not too pretty, but that's not a good reason > to invent yet another way to do it. On the general question of whether we should have something like this, I expressed a lot of doubt when e6faf910d75027bdce7cd0f2033db4e912592bcc first went in about whether that algorithm was really going to work, and nothing has happened since then to remove any of that doubt. It's pretty clear to me from experiences with customer problems that the heuristics we have often fail to do the right thing, either because queries with no hope of benefiting still replan 5 times - which can waste a ton of CPU when there are many different queries in the plan cache and many sessions - or because queries that would benefit in some cases give up on replanning before they hit a case where a parameter-specific plan helps. I don't think we can just indefinitely continue to resist providing manual control over this behavior on the theory that some day we'll fix it. It's been six years and we haven't made any significant progress. In some cases, a long delay without any progress might just point to a lack of effort that should have been applied, but in this case I think it's because the problem is incredibly hard. I think what we ideally want to do is notice whether the new bind variables cause a change in selectivity which is sufficient to justify a re-plan. If we annotated the original plan with markers indicating that it was valid for all values with a frequency of more than X and less than Y, for example, we could cover most cases involving equality; range queries would need some other kind of annotation. 
However, it's unclear how the planner could produce such annotations, and it's unclear how expensive checking against them would be if we had them. Barring somebody having a brilliant insight about how to make some system that's way better than what we have right now, I think we can't hold out much hope of any better fix than a manual knob. I think a GUC is a decent, though not perfect, mechanism for this. This problem isn't restricted to PL/pgsql; indeed, the cases I've seen have come via prepared queries, not PL/pgsql functions. Even without that, one advantage of a GUC is that they are fairly broadly understood and experienced users understand what they can do with them. They can be set at various different scopes (system, user, database, SET clause for a particular function) and it's relatively convenient to do so. Some kind of PL/pgsql-specific PRAGMA syntax is more likely to be overlooked by users who would actually benefit from it, and also won't cover non-PL/pgsql cases. If we were going to go the PRAGMA route, it would make more sense to me to define that as a way of setting a GUC for the scope of one PL/pgsql block, like PRAGMA SETTING(custom_plan_tries, 0). I think it is in general unfortunate that we don't have a mechanism to change a GUC for the lifespan of one particular query, like this: LET custom_plan_tries = 0 IN SELECT ... I bet a lot of people would find that quite convenient. The problem of needing to set a planner GUC for one particular query is pretty common, and Pavel is absolutely right that having to do SET beforehand and RESET afterward is ugly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
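[Editorial illustration, not part of the thread.] For readers who have not looked at the heuristic in question: after e6faf910d, the plan cache builds parameter-specific plans for the first few executions and then compares their average estimated cost against the generic plan's cost. A rough, hypothetical sketch of that decision with the proposed knob folded in (names and structure are illustrative, not the actual plancache.c code):

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical sketch of the replan decision discussed above, with a
 * settable "custom_plan_tries" knob.  Illustrative names only; not the
 * actual PostgreSQL source.  A fixed value of 5 corresponds to the
 * current "replan 5 times" behavior described in the thread.
 */
static bool
choose_custom_plan(int num_custom_plans, double avg_custom_cost,
				   double generic_cost, int custom_plan_tries)
{
	/* still in the trial period: build a parameter-specific plan */
	if (num_custom_plans < custom_plan_tries)
		return true;

	/* no custom plans were ever tried: nothing to compare against */
	if (num_custom_plans == 0)
		return false;

	/* keep the generic plan unless it is estimated to cost more */
	return generic_cost > avg_custom_cost;
}
```

With custom_plan_tries = 0 the trial period is skipped and the generic plan is always kept; with INT_MAX the cost comparison is never reached, forcing a custom plan every time. Those are the two extremes (plus today's 5) that the proposed 0/5/+infinity settings would expose.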
Re: [HACKERS] Fix bloom WAL tap test
On Wed, Sep 6, 2017 at 4:08 PM, Alexander Korotkov < a.korot...@postgrespro.ru> wrote: > I just realized that these lines of contrib/bloom/t/001_wal.pl don't > check that queries give same results on master and standby. They just > check that *return codes* of psql are equal. > > # Run test queries and compare their result >> my $master_result = $node_master->psql("postgres", $queries); >> my $standby_result = $node_standby->psql("postgres", $queries); >> is($master_result, $standby_result, "$test_name: query result matches"); > > > Attached patch fixes this problem by using safe_psql() which returns psql > output instead of return code. For safety, this patch replaces psql() with > safe_psql() in other places too. > > I think this should be backpatched to 9.6 as bugfix. > Also, it would be nice to call wal-check on bloom check (now wal-check isn't called even on check-world). See attached patch. My changes to Makefile could be cumbersome. Sorry for that, I don't have much experience with them... -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company wal-check-on-bloom-check.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewriting the test of pg_upgrade as a TAP test
Peter Eisentraut wrote: > I propose splitting the single Perl script into three separate test > files: one for basic command-line option handling and so on (I would > like to expand that later), one for the main upgrade test, and one for > the funny database names tests. Check. > We also need to have a plan for handling the build farm. Maybe keep the > vcregress.pl upgradecheck target as a thin wrapper for the time being? The buildfarm already runs "make check" in src/bin/ when TAP tests are enabled, which should be enough to trigger the rewritten test, no? -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Walsender timeouts and large transactions
I've changed the status to "Needs review" to gain more attention from others. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Is anything preventing us from allowing write to foreign tables from standby?
Hi! We're currently blocking writing queries on standby even if they are only modifying the contents of foreign tables. But do we have serious reasons for that? Keeping in mind FDW sharding, making FDW tables writable from a standby would be good for preventing a single-master bottleneck. I wrote a simple patch enabling writing to foreign tables from standbys. It works at first glance for me. -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company standby-fdw-writable.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
On 2017-09-06 16:36, Tom Lane wrote: Sokolov Yura writes: On 2017-09-06 15:56, Tom Lane wrote: The point I'm trying to make is that if tweaking generic.h improves performance then it's an indicator of missed cases in the less-generic atomics code, and the latter is where our attention should be focused. I think basically all of the improvement Sokolov got was from upgrading the coverage of generic-gcc.h. Not exactly. I've checked that the new version of the generic pg_atomic_fetch_or_u32 loop also gives an improvement. But once you put in the generic-gcc version, that's not reached anymore. Yes, you're right. But I think the generic version should still be "fixed". If the generic version is not reached on any platform, then why is it kept? If it is reached somewhere, then it should be improved. -- Sokolov Yura aka funny_falcon Postgres Professional: https://postgrespro.ru The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add psql variables showing server version and psql version.
Simon Riggs writes: > Based upon input from Tom and Fabien, I propose this additional doc patch. I do not think any of this is appropriate, particularly not the reference to 7.0.3. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
Sokolov Yura writes: > On 2017-09-06 15:56, Tom Lane wrote: >> The point I'm trying to make is that if tweaking generic.h improves >> performance then it's an indicator of missed cases in the less-generic >> atomics code, and the latter is where our attention should be focused. >> I think basically all of the improvement Sokolov got was from upgrading >> the coverage of generic-gcc.h. > Not exactly. I've checked, that new version of generic > pg_atomic_fetch_or_u32 > loop also gives improvement. But once you put in the generic-gcc version, that's not reached anymore. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix performance of generic atomics
On 2017-09-06 15:56, Tom Lane wrote: Simon Riggs writes: On 5 September 2017 at 21:23, Tom Lane wrote: Moreover, it matters which primitive you're testing, on which platform, with which compiler, because we have a couple of layers of atomic ops implementations. If there is no gain on 2-socket, at least there is no loss either. The point I'm trying to make is that if tweaking generic.h improves performance then it's an indicator of missed cases in the less-generic atomics code, and the latter is where our attention should be focused. I think basically all of the improvement Sokolov got was from upgrading the coverage of generic-gcc.h. regards, tom lane Not exactly. I've checked that the new version of the generic pg_atomic_fetch_or_u32 loop also gives an improvement. Without that check I would not have suggested fixing the generic atomic functions. Of course, the gcc intrinsic gives more gain. -- Sokolov Yura aka funny_falcon Postgres Professional: https://postgrespro.ru The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add psql variables showing server version and psql version.
On 5 September 2017 at 11:58, Fabien COELHO wrote: > > Hello Simon, > >> Does raise the further question of how psql behaves when we connect to >> a pre-10 server, so we have SERVER_VERSION_NUM but yet it is not set. >> How does this >> \if SERVER_VERSION_NUM < x > > > The if does not have expressions (yet), it just handles TRUE/ON/1 and > FALSE/0/OFF. > >> Do we need some macro or suggested special handling? > > > If "SERVER_VERSION_NUM" is available in 10, then: > > -- exit if version < 10 (\if is ignored and \q is executed) > \if false \echo "prior 10" \q \endif > > -- then test version through a server side expression, will work > SELECT :SERVER_VERSION_NUM < 11 AS prior_11 \gset > \if :prior_11 > -- version 10 > \else > -- version 11 or more > \endif Based upon input from Tom and Fabien, I propose this additional doc patch. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services clarify_psql_server_version.v1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Fix bloom WAL tap test
Hi! I just realized that these lines of contrib/bloom/t/001_wal.pl don't check that queries give the same results on master and standby. They just check that *return codes* of psql are equal. # Run test queries and compare their result > my $master_result = $node_master->psql("postgres", $queries); > my $standby_result = $node_standby->psql("postgres", $queries); > is($master_result, $standby_result, "$test_name: query result matches"); Attached patch fixes this problem by using safe_psql(), which returns the psql output instead of the return code. For safety, this patch replaces psql() with safe_psql() in other places too. I think this should be backpatched to 9.6 as a bugfix. -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company fix-bloom-wal-check.patch Description: Binary data
Re: [HACKERS] Rewriting the test of pg_upgrade as a TAP test
On 4/14/17 02:00, Michael Paquier wrote: > Attached is an updated patch to use --no-sync with pg_dumpall calls. Please send a rebased patch. I propose splitting the single Perl script into three separate test files: one for basic command-line option handling and so on (I would like to expand that later), one for the main upgrade test, and one for the funny database names tests. In the testing file, you removed the paragraph that explains how to do cross-version upgrade testing. It's unfortunate that we would lose that functionality. What can we do about that? We also need to have a plan for handling the build farm. Maybe keep the vcregress.pl upgradecheck target as a thin wrapper for the time being? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] <> join selectivity estimate question
Simon Riggs writes: > Why isn't this an open item for PG10? Why should it be? This behavior has existed for a long time. regards, tom lane
Re: [HACKERS] Fix performance of generic atomics
Simon Riggs writes: > On 5 September 2017 at 21:23, Tom Lane wrote: >> Moreover, it matters which primitive you're testing, on which platform, >> with which compiler, because we have a couple of layers of atomic ops >> implementations. > If there is no gain on 2-socket, at least there is no loss either. The point I'm trying to make is that if tweaking generic.h improves performance then it's an indicator of missed cases in the less-generic atomics code, and the latter is where our attention should be focused. I think basically all of the improvement Sokolov got was from upgrading the coverage of generic-gcc.h. regards, tom lane
Re: [HACKERS] <> join selectivity estimate question
On 6 September 2017 at 04:14, Ashutosh Bapat wrote: > On Fri, Jul 21, 2017 at 4:10 AM, Thomas Munro > wrote: >> >> Thanks. Bearing all that in mind, I ran through a series of test >> scenarios and discovered that my handling for JOIN_ANTI was wrong: I >> thought that I had to deal with inverting the result, but I now see >> that that's handled elsewhere (calc_joinrel_size_estimate() I think). >> So neqjoinsel should just treat JOIN_SEMI and JOIN_ANTI exactly the >> same way. > > I agree, esp. after looking at eqjoinsel_semi(), which is used for > both semi and anti joins, it becomes more clear. > >> >> That just leaves the question of whether we should try to handle the >> empty RHS and single-value RHS cases using statistics. My intuition >> is that we shouldn't, but I'll be happy to change my intuition and >> code that up if that is the feedback from planner gurus. > > Empty RHS can result from dummy relations also, which are produced by > constraint exclusion, so may be that's an interesting case. Single > value RHS may be interesting with partitioned table with all rows in a > given partition end up with the same partition key value. But may be > those are just different patches. I am not sure. > >> >> Please find attached a new version, and a test script I used, which >> shows a bunch of interesting cases. I'll add this to the commitfest. > > I added some "stable" tests to your patch taking inspiration from the > test SQL file. I think those will be stable across machines and runs. > Please let me know if those look good to you. Why isn't this an open item for PG10? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Fix performance of generic atomics
On 5 September 2017 at 21:23, Tom Lane wrote: > Jeff Janes writes: >> What scale factor and client count? How many cores per socket? It looks >> like Sokolov was just starting to see gains at 200 clients on 72 cores, >> using -N transaction. ... > Moreover, it matters which primitive you're testing, on which platform, > with which compiler, because we have a couple of layers of atomic ops > implementations. ... I think Sokolov was aiming at 4-socket servers specifically, rather than at a general performance gain. If there is no gain on 2-socket, at least there is no loss either. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] document and use SPI_result_code_string()
Michael Paquier writes: > Fine for 0002. This reminds me of LockGXact and RemoveGXact in > twophase.c, as well as _hash_squeezebucket that have some code paths > that cannot return... Any thoughts about having some kind of > PG_NOTREACHED defined to 0 which could be put in an assertion? Generally we just do "Assert(false)", maybe with "not reached" in a comment. I don't feel a strong need to invent a new way to do that. regards, tom lane
Re: [HACKERS] Fix performance of generic atomics
On 2017-09-06 14:54, Sokolov Yura wrote: On 2017-09-06 07:23, Tom Lane wrote: Jeff Janes writes: What scale factor and client count? How many cores per socket? It looks like Sokolov was just starting to see gains at 200 clients on 72 cores, using -N transaction. This means that Sokolov's proposed changes in atomics/generic.h ought to be uninteresting for performance on any platform we care about --- at least for pg_atomic_fetch_add_u32(). May I cite you this way: I apologize for the tone of my previous message. My tongue is my enemy. -- Sokolov Yura aka funny_falcon Postgres Professional: https://postgrespro.ru The Russian Postgres Company
Re: [HACKERS] Adding support for Default partition in partitioning
Hi Ashutosh, I have tried to address your comments in V27 patch series[1]. Please find my comments inlined. > >> > >> The current set of patches contains 6 patches as below: > >> > >> 0001: > >> Refactoring existing ATExecAttachPartition code so that it can be used > >> for > >> default partitioning as well > If I understand correctly these comments refer to patch 0001 of V25_rebase series, which is related to "Fix assumptions that get_qual_from_partbound()" and not refactoring. This is patch 0001 in V27 now. * Returns an expression tree describing the passed-in relation's partition > - * constraint. > + * constraint. If there are no partition constraints returns NULL e.g. in > case > + * default partition is the only partition. > The first sentence uses singular constraint. The second uses plural. Given > that > partition bounds together form a single constraint we should use singular > constraint in the second sentence as well. > I have changed the wording now. > > Do we want to add a similar comment in the prologue of > generate_partition_qual(). The current wording there seems to cover this > case, > but do we want to explicitly mention this case? > I have added a comment there. > > +if (!and_args) > +result = NULL; > While this is correct, I am increasingly seeing (and_args != NIL) usage. > Changed this to: + if (and_args == NIL) + result = NULL; > get_partition_qual_relid() is called from pg_get_partition_ > constraintdef(), > constr_expr = get_partition_qual_relid(relationId); > > /* Quick exit if not a partition */ > if (constr_expr == NULL) > PG_RETURN_NULL(); > The comment is now wrong since a default partition may have no > constraints. May > be rewrite it as simply, "Quick exit if no partition constraint." > Fixed. > > generate_partition_qual() has three callers and all of them are capable of > handling NIL partition constraint for default partition. 
May be it's > better to > mention in the commit message that we have checked that the callers of > this function > can handle NIL partition constraint. > Added in commit message. > >> > >> 0002: > >> This patch teaches the partitioning code to handle the NIL returned by > >> get_qual_for_list(). > >> This is needed because a default partition will not have any constraints > >> in case > >> it is the only partition of its parent. > Comments below refer to patch 0002 in V25_rebase(0003 in V25), which adds basic support for default partition, which is now 0002 in V27. > If the partition being dropped is the default partition, > heap_drop_with_catalog() locks default partition twice, once as the default > partition and the second time as the partition being dropped. So, it will > be > counted as locked twice. There doesn't seem to be any harm in this, since > the > lock will be help till the transaction ends, by when all the locks will be > released. > Fixed. > Same is the case with cache invalidation message. If we are dropping > default > partition, the cache invalidation message on "default partition" is not > required. Again this might be harmless, but better to avoid it. Fixed. > Similar problems exists with ATExecDetachPartition(), default partition > will be > locked twice if it's being detached. > In ATExecDetachPartition() we do not have OID of the relation being detached available at the time we lock default partition. Moreover, here we are taking an exclusive lock on default OID and an share lock on partition being detached. As you correctly said in your earlier comment that it will be counted as locked twice, which to me also seems harmless. As these locks are to be held till commit of the transaction nobody else is supposed to be releasing these locks in between. I am not able to visualize a problem here, but still I have tried to avoid locking the default partition table twice, please review the changes and let me know your thoughts. 
> +/* > + * If this is a default partition, pg_partitioned_table must have > it's > + * OID as value of 'partdefid' for it's parent (i.e. rel) entry. > + */ > +if (castNode(PartitionBoundSpec, boundspec)->is_default) > +{ > +Oidpartdefid; > + > +partdefid = get_default_partition_oid(RelationGetRelid(rel)); > +Assert(partdefid == inhrelid); > +} > Since an accidental change or database corruption may change the default > partition OID in pg_partition_table. An Assert won't help in such a case. > May > be we should throw an error or at least report a warning. If we throw an > error, > the table will become useless (or even the database will become useless > RelationBuildPartitionDesc is called from RelationCacheInitializePhase3() > on > such a corrupted table). To avoid that we may raise a warning. > > I have added a warning here instead of Assert. > I am wondering whether we could avoid call to get_default_partitio
Re: [HACKERS] 【ECPG】strncpy function does not set the end character '\0'
Michael Meskes writes: >> In the function ECPGnoticeReceiver, we use the strncpy function to copy >> the sqlstate to sqlca->sqlstate. The sqlca->sqlstate is defined with a >> size of 5, and the copy size is sizeof(sqlca->sqlstate). However, from the >> previous strcmp function, the sqlstate size may be 5, such as >> ECPG_SQLSTATE_INVALID_CURSOR_NAME. So the end character for >> sqlca->sqlstate may be missing. > Why do you think there should be one? My memory might be wrong but I > don't think it's supposed to be a null terminated string. That field is defined as char[5] in struct sqlca_t, so the intent is clearly that it not be null terminated. However, it looks to me like there'd be at least one alignment-padding byte after it, and that byte is likely to be 0 in a lot of situations (particularly for statically allocated sqlca_t's). So a lot of the time, you could get away with using strcmp() or other functions that expect null termination. I'm thinking therefore that there's probably code out there that tries to do strcmp(sqlca->sqlstate, "22000") or suchlike, and it works often enough that the authors haven't identified their bug. The question is do we want to try to make that be valid code. Changing the field declaration to char[5+1] would be easy enough, but I have no idea how many places in ecpglib would need to change to make sure that the last byte gets set to 0. regards, tom lane
Re: [HACKERS] Fix performance of generic atomics
On 2017-09-06 07:23, Tom Lane wrote: Jeff Janes writes: What scale factor and client count? How many cores per socket? It looks like Sokolov was just starting to see gains at 200 clients on 72 cores, using -N transaction. This means that Sokolov's proposed changes in atomics/generic.h ought to be uninteresting for performance on any platform we care about --- at least for pg_atomic_fetch_add_u32(). May I cite you this way: "Tom Lane says the PostgreSQL core team doesn't care about 4-socket, 72-core Intel Xeon servers" ??? To be honest, I didn't measure the exact impact of changing pg_atomic_fetch_add_u32. I found the issue with pg_atomic_fetch_or_u32, because it is highly contended both in LWLock and in buffer page state management. I found that changing the loop of the generic pg_atomic_fetch_or_u32 already gives an improvement. And I changed the loops of the other generic atomic functions to make them all uniform. It is bad style not to change worse (but correct) code into better (and also correct) code just because you can't measure the difference (in my opinion; perhaps I am mistaken). (And yes: the gcc intrinsic gives more improvement.) -- With regards, Sokolov Yura aka funny_falcon Postgres Professional: https://postgrespro.ru The Russian Postgres Company
Re: [HACKERS] <> join selectivity estimate question
On Fri, Jul 21, 2017 at 4:10 AM, Thomas Munro wrote: > > Thanks. Bearing all that in mind, I ran through a series of test > scenarios and discovered that my handling for JOIN_ANTI was wrong: I > thought that I had to deal with inverting the result, but I now see > that that's handled elsewhere (calc_joinrel_size_estimate() I think). > So neqjoinsel should just treat JOIN_SEMI and JOIN_ANTI exactly the > same way. I agree, esp. after looking at eqjoinsel_semi(), which is used for both semi and anti joins, it becomes more clear. > > That just leaves the question of whether we should try to handle the > empty RHS and single-value RHS cases using statistics. My intuition > is that we shouldn't, but I'll be happy to change my intuition and > code that up if that is the feedback from planner gurus. Empty RHS can result from dummy relations also, which are produced by constraint exclusion, so may be that's an interesting case. Single value RHS may be interesting with partitioned table with all rows in a given partition end up with the same partition key value. But may be those are just different patches. I am not sure. > > Please find attached a new version, and a test script I used, which > shows a bunch of interesting cases. I'll add this to the commitfest. I added some "stable" tests to your patch taking inspiration from the test SQL file. I think those will be stable across machines and runs. Please let me know if those look good to you. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index 23e5526..4c1bae6 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -2697,26 +2697,63 @@ neqjoinsel(PG_FUNCTION_ARGS) Oid eqop; float8 result; - /* - * We want 1 - eqjoinsel() where the equality operator is the one - * associated with this != operator, that is, its negator. 
- */ - eqop = get_negator(operator); - if (eqop) + + if (jointype == JOIN_SEMI || jointype == JOIN_ANTI) { - result = DatumGetFloat8(DirectFunctionCall5(eqjoinsel, - PointerGetDatum(root), - ObjectIdGetDatum(eqop), - PointerGetDatum(args), - Int16GetDatum(jointype), - PointerGetDatum(sjinfo))); + VariableStatData leftvar; + VariableStatData rightvar; + double nullfrac; + bool reversed; + HeapTuple statsTuple; + + get_join_variables(root, args, sjinfo, &leftvar, &rightvar, &reversed); + statsTuple = reversed ? rightvar.statsTuple : leftvar.statsTuple; + if (HeapTupleIsValid(statsTuple)) + nullfrac = ((Form_pg_statistic) GETSTRUCT(statsTuple))->stanullfrac; + else + nullfrac = 0.0; + ReleaseVariableStats(leftvar); + ReleaseVariableStats(rightvar); + + /* + * For semi-joins, if there is more than one distinct value in the RHS + * relation then every non-null LHS row must find a row to join since + * it can only be equal to one of them. We'll assume that there is + * always more than one distinct RHS value for the sake of stability, + * though in theory we could have special cases for empty RHS + * (selectivity = 0) and single-distinct-value RHS (selectivity = + * fraction of LHS that has the same value as the single RHS value). + * + * For anti-joins, if we use the same assumption that there is more + * than one distinct key in the RHS relation, then every non-null LHS + * row must be suppressed by the anti-join leaving only nullfrac. + */ + result = 1.0 - nullfrac; } else { - /* Use default selectivity (should we raise an error instead?) */ - result = DEFAULT_EQ_SEL; + /* + * We want 1 - eqjoinsel() where the equality operator is the one + * associated with this != operator, that is, its negator. 
+ */ + eqop = get_negator(operator); + if (eqop) + { + result = DatumGetFloat8(DirectFunctionCall5(eqjoinsel, + PointerGetDatum(root), + ObjectIdGetDatum(eqop), + PointerGetDatum(args), + Int16GetDatum(jointype), + PointerGetDatum(sjinfo))); + } + else + { + /* Use default selectivity (should we raise an error instead?) */ + result = DEFAULT_EQ_SEL; + } + result = 1.0 - result; } - result = 1.0 - result; + PG_RETURN_FLOAT8(result); } diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out index 9f4c88d..10bfb68 100644 --- a/src/test/regress/expected/join.out +++ b/src/test/regress/expected/join.out @@ -1845,6 +1845,89 @@ SELECT '' AS "xxx", * | 1 | 4 | one | -1 (1 row) +-- SEMI and ANTI join neq selectivity +ANALYZE J1_TBL; +ANALYZE J2_TBL; +EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF, SUMMARY OFF) +SELECT * FROM J1_TBL t1 +WHERE EXISTS (SELECT * FROM J2_TBL t2 WHERE t1.i <> t2.i); +QUERY PLAN +--- + Nested Loop Semi Join (actual rows=9 loops=1)
Re: [HACKERS] additional contrib test suites
On Sat, Aug 12, 2017 at 1:20 PM, Peter Eisentraut wrote: > Here are some small test suites for some contrib modules as well as > pg_archivecleanup that didn't have one previously, as well as one patch > to improve code coverage in a module. > > Will add to commit fest. Testing on different platforms and with > different build configurations would be useful. After applying these patches cleanly on top of 0b554e4e63a4ba4852c01951311713e23acdae02 and running "./configure --enable-tap-tests --with-tcl --with-python --with-perl --with-ldap --with-icu && make && make check-world" I saw this failure: cd . && TESTDIR='/home/travis/build/postgresql-cfbot/postgresql/src/bin/pg_archivecleanup' PATH="/home/travis/build/postgresql-cfbot/postgresql/tmp_install/usr/local/pgsql/bin:$PATH" LD_LIBRARY_PATH="/home/travis/build/postgresql-cfbot/postgresql/tmp_install/usr/local/pgsql/lib" PGPORT='65432' PG_REGRESS='/home/travis/build/postgresql-cfbot/postgresql/src/bin/pg_archivecleanup/../../../src/test/regress/pg_regress' /usr/bin/prove -I ../../../src/test/perl/ -I . t/*.pl t/010_pg_archivecleanup.pl .. 1/42 # Failed test 'fails if restart file name is not specified: matches' # at /home/travis/build/postgresql-cfbot/postgresql/src/bin/pg_archivecleanup/../../../src/test/perl/TestLib.pm line 330. # 'pg_archivecleanup: must specify oldest kept WAL file # Try "pg_archivecleanup --help" for more information. # ' # doesn't match '(?^:must specify restartfilename)' # Failed test 'fails with too many parameters: matches' # at /home/travis/build/postgresql-cfbot/postgresql/src/bin/pg_archivecleanup/../../../src/test/perl/TestLib.pm line 330. # 'pg_archivecleanup: too many command-line arguments # Try "pg_archivecleanup --help" for more information. # ' # doesn't match '(?^:too many parameters)' # Failed test 'fails with invalid restart file name: matches' # at /home/travis/build/postgresql-cfbot/postgresql/src/bin/pg_archivecleanup/../../../src/test/perl/TestLib.pm line 330. 
# 'pg_archivecleanup: invalid file name argument # Try "pg_archivecleanup --help" for more information. # ' # doesn't match '(?^:invalid filename)' # Looks like you failed 3 tests of 42. t/010_pg_archivecleanup.pl .. Dubious, test returned 3 (wstat 768, 0x300) Failed 3/42 subtests Test Summary Report --- t/010_pg_archivecleanup.pl (Wstat: 768 Tests: 42 Failed: 3) Failed tests: 12, 16, 18 Non-zero exit status: 3 Files=1, Tests=42, 0 wallclock secs ( 0.03 usr 0.00 sys + 0.05 cusr 0.00 csys = 0.08 CPU) Result: FAIL -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] [PATCH] Pageinspect - add functions on GIN and GiST indexes from gevel
On Sat, Jul 22, 2017 at 12:05 AM, Alexey Chernyshov wrote: > the following patch transfers functionality from gevel module > (http://www.sai.msu.su/~megera/wiki/Gevel) which provides functions for > analyzing GIN and GiST indexes to pageinspect. Gevel was originally > designed by Oleg Bartunov, and Teodor Sigaev for developers of GiST and > GIN indexes. > > Functions added: > - gist_stat(text) - shows statistics on GiST Tree > - gist_tree(text) - shows GiST tree > - gist_tree(text, int4) - shows GiST tree up to MAXLEVEL > - gist_print(text) - prints objects stored in GiST tree > - spgist_stat(text) - shows statistics on SP-GiST > - spgist_print(text) - prints objects stored in index > - gin_value_count() - originally gin_stat(text) - prints estimated counts > for index values > - gin_stats() - originally gin_statpage(text) - shows statistics > - gin_count_estimate(text, tsquery) - shows number of indexed rows matched > query > > Tests also transferred, docs for new functions are added. I run pgindent > over the code, but the result is different from those I expected, so I leave > pgindented one. > The patch is applicable to the commit > 866f4a7c210857aa342bf901558d170325094dde. Hi Alexey, This patch still applies but doesn't build after commits 2cd70845 and c6293249. ginfuncs.c:91:47: error: invalid type argument of ‘->’ (have ‘FormData_pg_attribute’) st->index->rd_att->attrs[st->attnum]->attbyval, ...several similar errors... For example, that expression needs to be changed to: TupleDescAttr(st->index->rd_att, attnum)->attbyval Here is some superficial review since I'm here: +/* + * process_tuples + * Retrieves the number of TIDs in a tuple. + */ +static void +process_tuple(FuncCallContext *funcctx, GinStatState * st, IndexTuple itup) "process_tuples" vs "process_tuple" + /* do no distiguish various null category */ "do not distinguish various null categories" + st->nulls[0] = (st->category == GIN_CAT_NORM_KEY) ? 
false : true; That's a long way to write (st->category != GIN_CAT_NORM_KEY)! + * We scaned the whole page, so we should take right page "scanned" +/* + * refind_position + * Refinds a previois position, at returns it has correctly + * set offset and buffer is locked. + */ "previous", s/, at returns/. On return/ + memset(st->nulls, false, 2 * sizeof(*st->nulls)); "false" looks strange here where an int is expected. The size expression is weird. I think you should just write: st->nulls[0] = false; st->nulls[1] = false; Investigating the types more closely, I see that 'nulls' is declared like this (in struct GinStatState): + char nulls[2]; Then later you do this: + htuple = heap_form_tuple(funcctx->attinmeta->tupdesc, + st->dvalues, + st->nulls); But that's not OK, heap_form_tuple takes bool *. bool and char are not interchangeable (they may have different sizes on some platforms, among other problems, even though usually they are both 1 byte). So I think you should declare it as "bool nulls[2]". Meanwhile in TypeStorage you have a member "bool *nulls", which you then initialise like so: + st->nulls = (char *) palloc((tupdesc->natts + 2) * sizeof(*st->nulls)); That cast is wrong. +/* + * gin_count_estimate + * Outputs number of indexed rows matched query. It doesn't touch heap at + * all. Maybe "... number of indexed rows matching query."? + if (attnum < 0 || attnum >= st->index->rd_att->natts) + elog(ERROR, "Wrong column's number"); + st->attnum = attnum; Maybe "invalid column number" or "invalid attribute number"? + elog(ERROR, "Column type is not a tsvector"); s/C/c/ (according to convention). + * Prints various stat about index internals. s/stat/stats/ + elog(ERROR, "Relation %s.%s is not an index", s/R/r/ + elog(ERROR, "Index %s.%s has wrong type", s/I/i/ + gin_value_count prints estimated counts for each + indexed values, single-argument function will return result for a first + column of index. For example: "... for each indexed value. 
The single argument version will return results for the first column of an index. For example:" + gin_count_estimate outputs number of indexed + rows matched query. It doesn't touch heap at all. For example: "... outputs the number of indexed rows matched by a query. ..." + gist_print prints objects stored in + GiST tree, works only if objects in index have + textual representation (type_out functions should be + implemented for the given object type). It's known to work with R-tree + GiST based index. Note, in example below, objects are + of type box. For example: "... prints objects stored in a GiST tree. it works only if the objects in the index have textual representation (type_out functions should be implemented for the given object type). It's known to work with R-tree GiST-based indexes. Note: in the example below, objects are of type box. For example:" + gin_value_count prints estimated counts fo