Re: to be a multirange or not be, that's the question
On Saturday, November 6, 2021, Jaime Casanova wrote:
> Ok, subject was a bit philosophical but this message I just found is
> quite confusing.
>
> """
> regression=# select cast(null as anyrange) &> cast(null as anymultirange);
> ERROR: argument declared anymultirange is not a multirange type but type
> anymultirange
> """

I get that it is hard to parse, but it is unambiguously correct. It does seem to presume that one understands how polymorphic pseudo-types work (the supplied query was written without that knowledge applied). I can envision some improvements here, though they would depend a lot on the actual implementation.

I'm also curious as to why we get as far as the operator invocation, when casting literals to polymorphic pseudo-types is supposedly an invalid thing to do.

David J.
to be a multirange or not be, that's the question
Ok, subject was a bit philosophical but this message I just found is quite confusing.

"""
regression=# select cast(null as anyrange) &> cast(null as anymultirange);
ERROR: argument declared anymultirange is not a multirange type but type anymultirange
"""

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
Re: archive modules
On 2021/11/03 0:03, Bossart, Nathan wrote:
> On 11/1/21, 9:44 PM, "Fujii Masao" wrote:
>> What is the main motivation of this patch? I was thinking that it's for
>> parallelizing WAL archiving. But as far as I read the patch very briefly,
>> the WAL file name is still passed to the archive callback function one by
>> one.
>
> The main motivation is to provide a way to archive without shelling out.
> This reduces the amount of overhead, which can improve the archival rate
> significantly.

It would be helpful if you could share how much this approach reduces the overhead.

> It should also make it easier to archive more safely. For example, many
> of the common shell commands used for archiving won't fsync the data, but
> it isn't too hard to do so via C.

But probably we can do the same thing even by using the existing shell interface? For example, we could implement and provide a C program that fsyncs the file, and users could just use it in archive_command.

> The current proposal doesn't introduce any extra infrastructure for
> batching or parallelism, but it is probably still possible. I would like
> to eventually add batching, but for now I'm only focused on introducing
> basic archive module support.

Understood. I agree that it's reasonable to implement these gradually.

>> Are you planning to extend this mechanism to other WAL archiving-related
>> commands like restore_command? I can imagine that those who use an
>> archive library (rather than shell) would like to use the same mechanism
>> for WAL restore.
>
> I would like to do this eventually, but my current proposal is limited to
> archive_command.

Understood.

>> I think that it's worth adding this module into core rather than handling
>> it as a test module. It provides a very basic WAL archiving feature, but
>> (I guess) it's enough for some users.
>
> Do you think it should go into contrib?

Yes, at least for me.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
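The fsync point discussed above can be sketched in standalone C. This is a hypothetical illustration using only POSIX calls, not the proposed archive-module API; `durable_copy`, `write_file`, and `file_equals` are invented names. It shows the two durability steps a plain `cp` in archive_command skips: fsyncing the copied file, and fsyncing its directory so the new directory entry itself survives a crash.

```c
#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Copy src to dst, then fsync dst and dst's directory.
 * Returns 0 on success, -1 on any error (a real implementation
 * would preserve errno and report details). */
static int
durable_copy(const char *src, const char *dst)
{
	char		buf[8192];
	char		dirbuf[1024];
	ssize_t		n = 0;
	int			in, out, dfd, rc = 0;

	if ((in = open(src, O_RDONLY)) < 0)
		return -1;
	if ((out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600)) < 0)
	{
		close(in);
		return -1;
	}
	while ((n = read(in, buf, sizeof(buf))) > 0)
	{
		if (write(out, buf, n) != n)
		{
			rc = -1;			/* treat a short write as an error */
			break;
		}
	}
	if (n < 0)
		rc = -1;
	/* fsync the file itself: this is the step `cp` skips */
	if (rc == 0 && fsync(out) < 0)
		rc = -1;
	close(in);
	if (close(out) < 0)
		rc = -1;
	if (rc < 0)
		return -1;
	/* fsync the directory so the new entry is durable too */
	strncpy(dirbuf, dst, sizeof(dirbuf) - 1);
	dirbuf[sizeof(dirbuf) - 1] = '\0';
	if ((dfd = open(dirname(dirbuf), O_RDONLY)) < 0)
		return -1;
	rc = fsync(dfd);
	close(dfd);
	return rc < 0 ? -1 : 0;
}

/* Small helpers for exercising the sketch. */
static int
write_file(const char *path, const char *content)
{
	FILE	   *f = fopen(path, "w");
	int			rc;

	if (!f)
		return -1;
	rc = (fputs(content, f) >= 0) ? 0 : -1;
	return fclose(f) == 0 ? rc : -1;
}

static int
file_equals(const char *path, const char *content)
{
	char		buf[256] = {0};
	FILE	   *f = fopen(path, "r");
	size_t		n;

	if (!f)
		return 0;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	return n == strlen(content) && memcmp(buf, content, n) == 0;
}
```

The same shape could in principle live behind either interface under discussion: linked into the server as an archive callback, or compiled as a standalone program invoked from archive_command.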
Re: Allow escape in application_name
On 2021/11/05 12:17, Kyotaro Horiguchi wrote:

I'm not sure that that distinction is so clear for users. So I feel we want a notice, something like this. But it doesn't seem correct as it is. When the user name of the session consists of non-ASCII characters, %u is finally seen as a sequence of '?'. It is not an embedded string in pgfdw_application_name.

I didn't notice this behavior.

"pgfdw_application_name is set to the application_name of a pgfdw connection after placeholder conversion, thus the resulting string is subject to the same restrictions as application_name." Maybe followed by: "If the session user name or database name contains non-ASCII characters, they are replaced by question marks '?'."

If we add something to the docs about this, we should clearly describe what postgres_fdw.application_name does and what postgres_fdw on a foreign server does. BTW, this kind of information is already commented in option.c. Probably it's better to mention the limitations on not only non-ASCII characters but also the length of the application name.

 * Unlike application_name GUC, don't set GUC_IS_NAME flag nor check_hook
 * to allow postgres_fdw.application_name to be any string more than
 * NAMEDATALEN characters and to include non-ASCII characters. Instead,
 * remote server truncates application_name of remote connection to less
 * than NAMEDATALEN and replaces any non-ASCII characters in it with a '?'
 * character.

If possible, I'd like to see this change as a separate patch and commit it first, because this is the description for the existing parameter postgres_fdw.application_name.

I'd like to hear more opinions about this from others. But *if* there is actually no use case, I'd like not to add the feature, to make the code simpler.

I think padding is useful because it aligns the significant content of log lines by equating the length of the leading fixed components. application_name doesn't make any other visible components unaligned, even when shown in the pg_stat_activity view. It is not useful in that sense. Padding makes the string itself look maybe nicer, but gives no other advantage. I doubt people would complain about the lack of a padding feature. In addition to that, since pgfdw_application_name and log_line_prefix work in different ways with regard to multibyte characters, we don't need to struggle so hard to consolidate their behavior.

# I noticed that the padding feature doesn't consider multibyte
# characters. (I'm not suggesting they ought to be handled
# "properly".)

In short, I'm +0.7 for removing it.

So our current consensus is to remove the padding part from postgres_fdw.application_name.

+ case 'u':
+ Assert(MyProcPort != NULL);

Isn't it enough to perform this assertion check only once at the top of parse_pgfdw_appname()?

Yeah, either way, we should treat them in the same way. We can force parse_pgfdw_appname() not to be called if MyProcPort does not exist, but I don't think it is needed now.

Yes. (I assume you said "it is needed now".) I'm not sure how to force that, but if it means a NULL MyProcPort causes a crash, I think crashing the server is needlessly aggressive as the penalty.

I said "Yes" to Kuroda-san's comment "I don't think it is needed now". So I meant that "it is NOT needed now". Sorry for the unclear comment. His idea was to skip calling parse_pgfdw_appname() if MyProcPort is NULL, so as to prevent parse_pgfdw_appname() from seeing a NULL pointer in MyProcPort. But he thought it's not necessary now, and I agree with him, because the idea would cause more confusing behavior.

It seems to me that postgres_fdw assumes a valid user id, but makes no use of the database, server port, and process id. What I thought here is that making it an assertion is too much, so just ignoring the replacement is also fine with me. That being said, if we are eager not to have unused code paths, it is reasonable enough. I don't object strongly to replacing it with an assertion.

So no one strongly objects to the addition of the assertion?
Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
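The multibyte-padding caveat noted in this thread can be demonstrated with plain C. This is a standalone sketch, not postgres_fdw or log_line_prefix code; the helper names are invented. The point: printf-style field width counts bytes, not display columns or characters, so `%-*s` padding stops aligning as soon as the value contains multibyte characters.

```c
#include <assert.h>
#include <stdio.h>

/* Count UTF-8 code points by skipping continuation bytes
 * (bytes of the form 10xxxxxx). */
static int
utf8_chars(const char *s)
{
	int			n = 0;

	for (; *s; s++)
		if (((unsigned char) *s & 0xC0) != 0x80)
			n++;
	return n;
}

/* Bytes produced by "%-*s" padding: width is counted in bytes. */
static int
padded_bytes(const char *s, int width)
{
	char		buf[64];

	return snprintf(buf, sizeof(buf), "%-*s", width, s);
}

/* Code points in the padded result. */
static int
padded_chars(const char *s, int width)
{
	char		buf[64];

	snprintf(buf, sizeof(buf), "%-*s", width, s);
	return utf8_chars(buf);
}
```

A 4-byte ASCII user name padded to width 10 gets six pad spaces (ten characters total), while a 9-byte three-character Japanese name gets only one pad space (four characters total), so columns that follow the padded field no longer line up.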
Re: Parallel Full Hash Join
> Rebased patches attached. I will change status back to "Ready for Committer"

The CI showed a crash on freebsd, which I reproduced.
https://cirrus-ci.com/task/5203060415791104

The crash is evidenced in 0001 - but only ~15% of the time.

I think it's the same thing which was committed and then reverted here, so maybe I'm not saying anything new.
https://commitfest.postgresql.org/33/3031/
https://www.postgresql.org/message-id/flat/20200929061142.ga29...@paquier.xyz

(gdb) p pstate->build_barrier->phase
Cannot access memory at address 0x7f82e0fa42f4

#1  0x7f13de34f801 in __GI_abort () at abort.c:79
#2  0x5638e6a16d28 in ExceptionalCondition (conditionName=conditionName@entry=0x5638e6b62850 "!pstate || BarrierPhase(&pstate->build_barrier) >= PHJ_BUILD_RUN", errorType=errorType@entry=0x5638e6a6f00b "FailedAssertion", fileName=fileName@entry=0x5638e6b625be "nodeHash.c", lineNumber=lineNumber@entry=3305) at assert.c:69
#3  0x5638e678085b in ExecHashTableDetach (hashtable=0x5638e8e6ca88) at nodeHash.c:3305
#4  0x5638e6784656 in ExecShutdownHashJoin (node=node@entry=0x5638e8e57cb8) at nodeHashjoin.c:1400
#5  0x5638e67666d8 in ExecShutdownNode (node=0x5638e8e57cb8) at execProcnode.c:812
#6  ExecShutdownNode (node=0x5638e8e57cb8) at execProcnode.c:772
#7  0x5638e67cd5b1 in planstate_tree_walker (planstate=planstate@entry=0x5638e8e58580, walker=walker@entry=0x5638e6766680 , context=context@entry=0x0) at nodeFuncs.c:4009
#8  0x5638e67666b2 in ExecShutdownNode (node=0x5638e8e58580) at execProcnode.c:792
#9  ExecShutdownNode (node=0x5638e8e58580) at execProcnode.c:772
#10 0x5638e67cd5b1 in planstate_tree_walker (planstate=planstate@entry=0x5638e8e58418, walker=walker@entry=0x5638e6766680 , context=context@entry=0x0) at nodeFuncs.c:4009
#11 0x5638e67666b2 in ExecShutdownNode (node=0x5638e8e58418) at execProcnode.c:792
#12 ExecShutdownNode (node=node@entry=0x5638e8e58418) at execProcnode.c:772
#13 0x5638e675f518 in ExecutePlan (execute_once=, dest=0x5638e8df0058, direction=, numberTuples=0, sendTuples=, operation=CMD_SELECT, use_parallel_mode=, planstate=0x5638e8e58418, estate=0x5638e8e57a10) at execMain.c:1658
#14 standard_ExecutorRun () at execMain.c:410
#15 0x5638e6763e0a in ParallelQueryMain (seg=0x5638e8d823d8, toc=0x7f13df4e9000) at execParallel.c:1493
#16 0x5638e663f6c7 in ParallelWorkerMain () at parallel.c:1495
#17 0x5638e68542e4 in StartBackgroundWorker () at bgworker.c:858
#18 0x5638e6860f53 in do_start_bgworker (rw=) at postmaster.c:5883
#19 maybe_start_bgworkers () at postmaster.c:6108
#20 0x5638e68619e5 in sigusr1_handler (postgres_signal_arg=) at postmaster.c:5272
#21 <signal handler called>
#22 0x7f13de425ff7 in __GI___select (nfds=nfds@entry=7, readfds=readfds@entry=0x7ffef03b8400, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7ffef03b8360) at ../sysdeps/unix/sysv/linux/select.c:41
#23 0x5638e68620ce in ServerLoop () at postmaster.c:1765
#24 0x5638e6863bcc in PostmasterMain () at postmaster.c:1473
#25 0x5638e658fd00 in main (argc=8, argv=0x5638e8d54730) at main.c:198
XLogReadRecord() error in XlogReadTwoPhaseData()
On Sun, Oct 24, 2021 at 04:35:02PM -0700, Noah Misch wrote:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kittiwake&dt=2021-10-24%2012%3A01%3A10
> got an interesting v9.6 failure [...]:
>
> 2021-10-24 14:25:29.263 CEST [34569:175] pgbench ERROR: could not read
> two-phase state from xlog at 0/158F4E0
> 2021-10-24 14:25:29.263 CEST [34569:176] pgbench STATEMENT: COMMIT PREPARED 'c1';

As a first step, let's report the actual XLogReadRecord() error message. Attached. All the other sites that expect no error already do this.

Author: Noah Misch
Commit: Noah Misch

    Report any XLogReadRecord() error in XlogReadTwoPhaseData().

    Buildfarm member kittiwake has witnessed errors at this site. The site
    discarded key facts.

    Back-patch to 9.6 (all supported versions).

    Reviewed by FIXME.

    Discussion: https://postgr.es/m/FIXME

diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index ef4b5f6..28b153a 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1397,10 +1397,18 @@ XlogReadTwoPhaseData(XLogRecPtr lsn, char **buf, int *len)
 	record = XLogReadRecord(xlogreader, &errormsg);
 
 	if (record == NULL)
-		ereport(ERROR,
-				(errcode_for_file_access(),
-				 errmsg("could not read two-phase state from WAL at %X/%X",
-						LSN_FORMAT_ARGS(lsn))));
+	{
+		if (errormsg)
+			ereport(ERROR,
+					(errcode_for_file_access(),
+					 errmsg("could not read two-phase state from WAL at %X/%X: %s",
+							LSN_FORMAT_ARGS(lsn), errormsg)));
+		else
+			ereport(ERROR,
+					(errcode_for_file_access(),
+					 errmsg("could not read two-phase state from WAL at %X/%X",
+							LSN_FORMAT_ARGS(lsn))));
+	}
 
 	if (XLogRecGetRmid(xlogreader) != RM_XACT_ID ||
 		(XLogRecGetInfo(xlogreader) & XLOG_XACT_OPMASK) != XLOG_XACT_PREPARE)
Re: inefficient loop in StandbyReleaseLockList()
Andres Freund writes:
> On 2021-11-06 18:32:54 -0400, Tom Lane wrote:
>> Good point. The note at list_delete_last that it's O(1) isn't really
>> on point --- instead, the text for list_delete_first should be like
>>
>> + * Note that this takes time proportional to the length of the list,
>> + * since the remaining entries must be moved. Consider reversing the
>> + * list order so that you can use list_delete_last() instead. However,
>> + * if that causes you to replace lappend() with lcons(), you haven't
>> + * improved matters.

> LGTM

Done that way, then.

			regards, tom lane
Re: inefficient loop in StandbyReleaseLockList()
On 2021-11-06 18:32:54 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2021-11-06 14:06:12 -0400, Tom Lane wrote:
> >> + * Note that this takes time proportional to the length of the list,
> >> + * since the remaining entries must be moved.
> >> */
> >> List *
> >> list_delete_first(List *list)
>
> > Perhaps we could point to list_delete_last()? But it's an improvement
> > without that too.
>
> Good point. The note at list_delete_last that it's O(1) isn't really
> on point --- instead, the text for list_delete_first should be like
>
> + * Note that this takes time proportional to the length of the list,
> + * since the remaining entries must be moved. Consider reversing the
> + * list order so that you can use list_delete_last() instead. However,
> + * if that causes you to replace lappend() with lcons(), you haven't
> + * improved matters.

LGTM
Re: inefficient loop in StandbyReleaseLockList()
Andres Freund writes:
> On 2021-11-06 14:06:12 -0400, Tom Lane wrote:
>> + * Note that this takes time proportional to the length of the list,
>> + * since the remaining entries must be moved.
>> */
>> List *
>> list_delete_first(List *list)

> Perhaps we could point to list_delete_last()? But it's an improvement without
> that too.

Good point. The note at list_delete_last that it's O(1) isn't really
on point --- instead, the text for list_delete_first should be like

+ * Note that this takes time proportional to the length of the list,
+ * since the remaining entries must be moved. Consider reversing the
+ * list order so that you can use list_delete_last() instead. However,
+ * if that causes you to replace lappend() with lcons(), you haven't
+ * improved matters.

			regards, tom lane
Re: inefficient loop in StandbyReleaseLockList()
Hi,

On 2021-11-06 14:06:12 -0400, Tom Lane wrote:
> I wrote:
> > Hm. I think it's not the only list function with O(N) behavior;
> > in fact there used to be more such functions than there are now.
> > But I could get behind a patch that annotates all of them.

Personally I think the delete-first case is particularly easy to run into, due to implementing FIFO-like behaviour. But I'm i

> Here's a quick hack at that. Having done it, I'm not sure if it's
> really worth the trouble or not ... thoughts?

In favor of adding them.

> @@ -870,6 +890,9 @@ list_delete_oid(List *list, Oid datum)
>  * where the intent is to alter the list rather than just traverse it.
>  * Beware that the list is modified, whereas the Lisp-y coding leaves
>  * the original list head intact in case there's another pointer to it.
> + *
> + * Note that this takes time proportional to the length of the list,
> + * since the remaining entries must be moved.
>  */
>  List *
>  list_delete_first(List *list)

Perhaps we could point to list_delete_last()? But it's an improvement without that too.

This reminds me that I wanted a list splicing operation for ilist.h...

Greetings,

Andres Freund
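The trade-off discussed in this thread (list_delete_first() shifting every remaining entry, versus reversing the list and using list_delete_last()) can be illustrated with a toy array-backed list. These are invented types, not the real pg_list.h API; the model just counts how many elements get shifted.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX 100

/* Toy model of an array-backed list like PostgreSQL's List. */
typedef struct
{
	int			items[TOY_MAX];
	int			len;
	long		moved;			/* total elements shifted so far */
} ToyList;

static void
toy_append(ToyList *l, int v)
{
	l->items[l->len++] = v;		/* like lappend(): O(1) amortized */
}

static int
toy_delete_first(ToyList *l)
{
	int			v = l->items[0];

	/* every remaining entry shifts down: O(N) per call */
	memmove(&l->items[0], &l->items[1], (l->len - 1) * sizeof(int));
	l->moved += l->len - 1;
	l->len--;
	return v;
}

static int
toy_delete_last(ToyList *l)
{
	return l->items[--l->len];	/* O(1): no data moved */
}

/* Drain an n-entry list front-to-back; return total elements moved. */
static long
fifo_moves_delete_first(int n)
{
	ToyList		l = {{0}, 0, 0};

	for (int i = 0; i < n; i++)
		toy_append(&l, i);
	while (l.len > 0)
		(void) toy_delete_first(&l);
	return l.moved;
}

/* Same FIFO order: append once in reverse, then pop from the back. */
static long
fifo_moves_delete_last(int n)
{
	ToyList		l = {{0}, 0, 0};

	for (int i = n - 1; i >= 0; i--)
		toy_append(&l, i);
	while (l.len > 0)
		(void) toy_delete_last(&l);
	return l.moved;
}
```

Draining 10 entries front-to-back shifts 9 + 8 + ... + 0 = 45 elements in this model, while the reversed variant shifts none. Note Tom's caveat still applies: reversing only wins if you can still build the list cheaply with append; replacing lappend() with lcons() just moves the O(N) work to the insertion side.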
Re: amcheck's verify_heapam(), and HOT chain verification
On Fri, Nov 5, 2021 at 7:51 PM Peter Geoghegan wrote:
> Here are some specific checks I have in mind:

One more for the list:

* Validate PageIsAllVisible() for each page.

In other words, pg_visibility should be merged with verify_heapam.c (or at least pg_visibility's pg_check_frozen() and pg_check_visible() functions should be moved, merged, or whatever). This would mean that verify_heapam() would directly check whether the page-level PD_ALL_VISIBLE flag contradicts either the tuple headers of tuples with storage on the page, the presence (or absence) of LP_DEAD stub line pointers on the page, or the corresponding visibility map bit (e.g., VISIBILITYMAP_ALL_VISIBLE) for the page.

There is value in teaching verify_heapam() about any possible problem, including with the visibility map, but it's certainly less valuable than the HOT chain verification stuff -- and probably trickier to get right. I'm mentioning it now to be exhaustive, but it's less of a priority for me personally. I am quite willing to help out with all this, if you're interested.

One more thing about HOT chain validation: I can give you another example bug of the kind I'd expect verify_heapam() to catch only with full HOT chain validation. This one is a vintage MultiXact bug that has the same basic HOT-chain-corruption, looks-like-index-corruption-but-isn't quality as the more memorable freeze-the-dead bug (this one was fixed by commit 6bfa88ac):

https://www.postgresql.org/message-id/CAM3SWZTMQiCi5PV5OWHb%2BbYkUcnCk%3DO67w0cSswPvV7XfUcU5g%40mail.gmail.com

In general I think that reviewing historic examples of pernicious corruption bugs is a valuable exercise when designing tools like amcheck. Maybe even revert the fix during testing, to be sure it would have been caught had the final tool been available. History doesn't repeat itself, but it does rhyme.

-- 
Peter Geoghegan
Re: amcheck's verify_heapam(), and HOT chain verification
On Fri, Nov 5, 2021 at 8:18 PM Mark Dilger wrote:
> Thanks, Peter, for this analysis. It's strategically important work.

What you've done so far has every chance of catching corruption involving storage issues, which is great. But we ought to do more to catch certain known general patterns of corruption. The so-called "freeze the dead" bug (see commit 9c2f0a6c for the final fix) is the single weirdest and scariest bug that I can recall (there were a couple of incorrect bug fixes that had to be reverted), and it is very much the kind of thing that I'd like to see verify_heapam.c handle exhaustively. That would de-risk a lot of things. Note that there have been a variety of HOT chain corruption bugs besides that one, so it's really a general class of corruption, not one specific bug.

The heapallindexed verification option actually detected the "freeze the dead" bug [1], although that was a total accident -- that checking option happens to use CREATE INDEX infrastructure, which happens to sanitize HOT chains by calling heap_get_root_tuples() and detecting contradictions -- heap_get_root_tuples() is code that is right next to the pruning code I mentioned. That level of detection is certainly better than nothing, but it's still not good enough -- we need this in verify_heapam, too.

Here's why I think that we need very thorough HOT chain verification that lives directly in verify_heapam: First of all, the heapallindexed option is rather expensive, whereas the additional overhead for verify_heapam would be minimal -- you probably wouldn't even need to make it optional. Second of all, why should you need to use bt_index_check() to detect what is, after all, heap corruption? And third of all, the approach of detecting corruption by outsourcing detection to heapam_index_build_range_scan() (which calls heap_get_root_tuples() itself) probably risks missing corrupt HOT chains where the corruption doesn't look a certain way -- it was not designed with amcheck in mind.
heapam_index_build_range_scan() + heap_get_root_tuples() will notice a heap-only tuple without a parent, which is a good start. But what about an LP_REDIRECT item whose lp_off (i.e., page offset number redirect) somehow points to any item on the page other than a valid heap-only tuple? It also doesn't catch contradictory commit states for heap-only tuples that form a HOT chain through their t_ctid links. I'm sure that there are other things that I haven't thought of, too -- there is a lot going on here.

This HOT chain verification business is a little like the checking that we do in verify_nbtree.c, actually. The individual tuples (in this case heap tuples) may well not be corrupt (or demonstrably corrupt) in isolation. But taken together, in a particular page offset number sequence order, they can nevertheless *imply* corruption -- it's implied if the tuples tell a contradictory story about what's going on at the level of the whole page. A HOT chain LP_REDIRECT is, after all, a little like a "mini index" stored on a heap page. Sequential scans don't care about LP_REDIRECTs at all. But index scans expect a great deal from LP_REDIRECTs.

[1] https://www.postgresql.org/message-id/cah2-wznm4rcrhfaiwkpwtpew2bxdtgrozk7jwwgucxeh3d1...@mail.gmail.com

-- 
Peter Geoghegan
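The page-level contradiction check described above (an LP_REDIRECT must land on a valid heap-only tuple on the same page) can be sketched as a toy model. The item kinds and field names here are invented for illustration; this is not the real heap page format or the verify_heapam code.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum
{
	ITEM_UNUSED,
	ITEM_NORMAL,				/* tuple with storage */
	ITEM_REDIRECT,				/* like LP_REDIRECT */
	ITEM_DEAD					/* like an LP_DEAD stub */
} ItemKind;

typedef struct
{
	ItemKind	kind;
	bool		heap_only;		/* like HEAP_ONLY_TUPLE; for ITEM_NORMAL */
	int			target;			/* redirect target index; for ITEM_REDIRECT */
} ToyItem;

/* Every redirect must point at an on-page, heap-only tuple. */
static bool
redirects_ok(const ToyItem *items, int nitems)
{
	for (int i = 0; i < nitems; i++)
	{
		int			t;

		if (items[i].kind != ITEM_REDIRECT)
			continue;
		t = items[i].target;
		if (t < 0 || t >= nitems)
			return false;		/* points off the page */
		if (items[t].kind != ITEM_NORMAL || !items[t].heap_only)
			return false;		/* must land on a heap-only tuple */
	}
	return true;
}

/* Three tiny "pages" for demonstration. */
static bool
demo_valid_redirect(void)
{
	ToyItem		page[] = {{ITEM_REDIRECT, false, 1}, {ITEM_NORMAL, true, 0}};

	return redirects_ok(page, 2);
}

static bool
demo_redirect_to_dead(void)
{
	ToyItem		page[] = {{ITEM_REDIRECT, false, 1}, {ITEM_DEAD, false, 0}};

	return redirects_ok(page, 2);
}

static bool
demo_redirect_off_page(void)
{
	ToyItem		page[] = {{ITEM_REDIRECT, false, 7}};

	return redirects_ok(page, 1);
}
```

A real implementation would layer further checks on top of the same traversal: t_ctid chain linkage, xmin/xmax matching along the chain, and the commit-state contradictions mentioned above.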
Re: Draft release notes for next week's releases
Peter Geoghegan writes: > You probably won't miss this, but just in case: Alexander's very > recent "Reset lastOverflowedXid on standby when needed" commit needs > to be listed now. Yup, I saw it. regards, tom lane
Re: Draft release notes for next week's releases
On Fri, Nov 5, 2021 at 5:27 PM Tom Lane wrote: > As usual, please send any comments/corrections by Sunday. You probably won't miss this, but just in case: Alexander's very recent "Reset lastOverflowedXid on standby when needed" commit needs to be listed now. -- Peter Geoghegan
Re: inefficient loop in StandbyReleaseLockList()
I wrote:
> Hm. I think it's not the only list function with O(N) behavior;
> in fact there used to be more such functions than there are now.
> But I could get behind a patch that annotates all of them.

Here's a quick hack at that. Having done it, I'm not sure if it's
really worth the trouble or not ... thoughts?

			regards, tom lane

diff --git a/src/backend/nodes/list.c b/src/backend/nodes/list.c
index 12847e35ea..247032124d 100644
--- a/src/backend/nodes/list.c
+++ b/src/backend/nodes/list.c
@@ -410,6 +410,9 @@ insert_new_cell(List *list, int pos)
 /*
  * Insert the given datum at position 'pos' (measured from 0) in the list.
  * 'pos' must be valid, ie, 0 <= pos <= list's length.
+ *
+ * Note that this takes time proportional to the distance to the end of the
+ * list, since the following entries must be moved.
  */
 List *
 list_insert_nth(List *list, int pos, void *datum)
@@ -460,6 +463,9 @@ list_insert_nth_oid(List *list, int pos, Oid datum)
  * value, rather than continuing to use the pointer passed as the
  * second argument.
  *
+ * Note that this takes time proportional to the length of the list,
+ * since the existing entries must be moved.
+ *
  * Caution: before Postgres 8.0, the original List was unmodified and
  * could be considered to retain its separate identity.  This is no longer
  * the case.
@@ -525,6 +531,10 @@ lcons_oid(Oid datum, List *list)
  * Callers should be sure to use the return value as the new pointer to the
  * concatenated list: the 'list1' input pointer may or may not be the same
  * as the returned pointer.
+ *
+ * Note that this takes at least time proportional to the length of list2.
+ * It'd typically be the case that we have to enlarge list1's storage,
+ * probably adding time proportional to the length of list1.
  */
 List *
 list_concat(List *list1, const List *list2)
@@ -623,6 +633,8 @@ list_truncate(List *list, int new_size)
  * Return true iff 'datum' is a member of the list.  Equality is
  * determined via equal(), so callers should ensure that they pass a
  * Node as 'datum'.
+ *
+ * This does a simple linear search --- avoid using it on long lists.
  */
 bool
 list_member(const List *list, const void *datum)
@@ -706,6 +718,9 @@ list_member_oid(const List *list, Oid datum)
  * Delete the n'th cell (counting from 0) in list.
  *
  * The List is pfree'd if this was the last member.
+ *
+ * Note that this takes time proportional to the distance to the end of the
+ * list, since the following entries must be moved.
  */
 List *
 list_delete_nth_cell(List *list, int n)
@@ -777,6 +792,9 @@ list_delete_nth_cell(List *list, int n)
  *
  * The List is pfree'd if this was the last member.  However, we do not
  * touch any data the cell might've been pointing to.
+ *
+ * Note that this takes time proportional to the distance to the end of the
+ * list, since the following entries must be moved.
  */
 List *
 list_delete_cell(List *list, ListCell *cell)
@@ -787,6 +805,8 @@ list_delete_cell(List *list, ListCell *cell)
 /*
  * Delete the first cell in list that matches datum, if any.
  * Equality is determined via equal().
+ *
+ * This does a simple linear search --- avoid using it on long lists.
  */
 List *
 list_delete(List *list, void *datum)
@@ -870,6 +890,9 @@ list_delete_oid(List *list, Oid datum)
  * where the intent is to alter the list rather than just traverse it.
  * Beware that the list is modified, whereas the Lisp-y coding leaves
  * the original list head intact in case there's another pointer to it.
+ *
+ * Note that this takes time proportional to the length of the list,
+ * since the remaining entries must be moved.
  */
 List *
 list_delete_first(List *list)
@@ -885,8 +908,8 @@ list_delete_first(List *list)
 /*
  * Delete the last element of the list.
  *
- * This is the opposite of list_delete_first(), but is noticeably cheaper
- * with a long list, since no data need be moved.
+ * This is the opposite of list_delete_first(), but is O(1),
+ * since no data need be moved.
  */
 List *
 list_delete_last(List *list)
@@ -910,6 +933,9 @@ list_delete_last(List *list)
  * Delete the first N cells of the list.
  *
  * The List is pfree'd if the request causes all cells to be deleted.
+ *
+ * Note that this takes time proportional to the distance to the end of the
+ * list, since the following entries must be moved.
  */
 List *
 list_delete_first_n(List *list, int n)
@@ -989,8 +1015,10 @@ list_delete_first_n(List *list, int n)
  * you probably want to use list_concat_unique() instead to avoid wasting
  * the storage of the old x list.
  *
- * This function could probably be implemented a lot faster if it is a
- * performance bottleneck.
+ * Note that this takes time proportional to the product of the list
+ * lengths, so beware of using it on long lists.  (We could probably
+ * improve that, but really you should be using some other data structure
+ * if this'd be a performance bottleneck.)
  */
 List *
 list_union(const List *list1,
Re: Commitfest 2021-11 Patch Triage - Part 1
On 11/5/21 22:15, Daniel Gustafsson wrote:
> ...
>
> 1651: GROUP BY optimization
> ===========================
> This is IMO a desired optimization, and after all the heavy lifting by
> Tomas the patch seems to be in pretty good shape.

I agree it's desirable. To move the patch forward, I need some feedback on the costing questions mentioned in my last review, and on the overall approach to planning (considering multiple GROUP BY variants).

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: lastOverflowedXid does not handle transaction ID wraparound
Hi!

On Fri, Nov 5, 2021 at 10:31 AM Kyotaro Horiguchi wrote:
> At Thu, 4 Nov 2021 01:07:05 +0300, Alexander Korotkov wrote in
> > On Wed, Nov 3, 2021 at 8:51 PM Simon Riggs wrote:
> > > It is however, an undocumented modularity violation. I think that is
> > > acceptable because of the ProcArrayLock traffic, but needs to have a
> > > comment to explain this at the call to
> > > ExpireOldKnownAssignedTransactionIds() i.e. " and potentially reset
> > > lastOverflowedXid", as well as a comment on the
> > > ExpireOldKnownAssignedTransactionIds() function.
> >
> > Thank you for your feedback. Please find the revised patch attached.
> > It incorporates these function comment changes altogether with minor
> > editing and the commit message. Let me know if you have further
> > suggestions.
> >
> > I'm going to push this if no objections.
>
> Thanks for taking a look on and refining this, Simon and Alex! (while
> I was sick in bed X:)
>
> It looks good to me except the commit Author doesn't contain the name
> of Alexander Korotkov?

Thank you for the suggestion. And thanks to everybody for the feedback. Pushed!

-- 
Regards,
Alexander Korotkov
Re: GiST operator class for bool
On 11/3/21 16:18, Tomas Vondra wrote:
> Hi,
>
> I looked at this patch today - it's pretty simple and in pretty good
> shape, I can't think of anything that'd need fixing. Perhaps the test
> might also do EXPLAIN like for other types, to verify the new index is
> actually used. But that's minor enough to handle during commit.
>
> I've marked this as RFC and will get it committed in a day or two.

Pushed, after adding some simple EXPLAIN to the regression test. Thanks for the patch!

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Schema variables - new implementation for Postgres 15
On Sat, Nov 6, 2021 at 15:57, Justin Pryzby wrote:
> On Sat, Nov 06, 2021 at 04:45:19AM +0100, Pavel Stehule wrote:
> > On Wed, Nov 3, 2021 at 14:05, Tomas Vondra <tomas.von...@enterprisedb.com> wrote:
> > > 1) Not sure why we need to call this "schema variables". Most objects
> > > are placed in a schema, and we don't say "schema tables" for example.
> > > And it's CREATE VARIABLE and not CREATE SCHEMA VARIABLE, so it's a bit
> > > inconsistent.
>
> +1
>
> At least the error messages need to be consistent.
> It doesn't make sense to have both of these:
>
> + elog(ERROR, "cache lookup failed for schema variable %u", varid);
> + elog(ERROR, "cache lookup failed for variable %u", varid);
>
> > Yes, there is inconsistency, but I think it is necessary. The name
> > "variable" is too generic. Theoretically we can use other adjectives like
> > session variables or global variables and the name will be valid. But it
> > doesn't describe the fundamentals of design. This is similar to the
> > package's variables from PL/SQL. These variables are global, session's
> > variables too. But the usual name is "package variables". So schema
> > variables are assigned to schemes, and I think a good name can be "schema
> > variables". But it is not necessary to repeat keyword schema in the CREATE
> > COMMAND.
> >
> > My opinion is not too strong in this case, and I can accept just
> > "variables" or "session's variables" or "global variables", but I am not
> > sure if these names describe this feature well, because still they are too
> > generic. There are too many different implementations of session global
> > variables (see PL/SQL or T-SQL or DB2).
>
> I would prefer "session variable".
>
> To me, this feature seems similar to a CTE (which exists for a single
> statement), or a temporary table (which exists for a single transaction).
> So "session" conveys a lot more of its meaning than "schema".

It depends on where you are looking. There are two perspectives - data and metadata. If I use the data perspective, then it is session related. If I use the metadata perspective, then it can be persistent or temporary, like tables. I see a strong similarity with Global Temporary Tables - but I think the naming "local temporary tables" and "global temporary tables" can be unintuitive or confusing for a lot of people too.

Anyway, if people try to find this feature on Google, then they will probably use the keywords "session variables", so maybe my preference for more technical terminology is obscure and not practical, and the name "session variables" can be more practical for other people. If I use the system used for GTT, then the exact name can be "Global Session Variable". Can we use this name? Or shortly just "Session Variables", because we don't support local session variables now. What do you think about it?

Regards

Pavel

> But don't rename everything just for me...
>
> --
> Justin
Re: Schema variables - new implementation for Postgres 15
On Sat, Nov 06, 2021 at 04:45:19AM +0100, Pavel Stehule wrote:
> On Wed, Nov 3, 2021 at 14:05, Tomas Vondra <tomas.von...@enterprisedb.com> wrote:
> > 1) Not sure why we need to call this "schema variables". Most objects
> > are placed in a schema, and we don't say "schema tables" for example.
> > And it's CREATE VARIABLE and not CREATE SCHEMA VARIABLE, so it's a bit
> > inconsistent.

+1

At least the error messages need to be consistent. It doesn't make sense to have both of these:

+ elog(ERROR, "cache lookup failed for schema variable %u", varid);
+ elog(ERROR, "cache lookup failed for variable %u", varid);

> Yes, there is inconsistency, but I think it is necessary. The name
> "variable" is too generic. Theoretically we can use other adjectives like
> session variables or global variables and the name will be valid. But it
> doesn't describe the fundamentals of design. This is similar to the
> package's variables from PL/SQL. These variables are global, session's
> variables too. But the usual name is "package variables". So schema
> variables are assigned to schemes, and I think a good name can be "schema
> variables". But it is not necessary to repeat keyword schema in the CREATE
> COMMAND.
>
> My opinion is not too strong in this case, and I can accept just
> "variables" or "session's variables" or "global variables", but I am not
> sure if these names describe this feature well, because still they are too
> generic. There are too many different implementations of session global
> variables (see PL/SQL or T-SQL or DB2).

I would prefer "session variable".

To me, this feature seems similar to a CTE (which exists for a single statement), or a temporary table (which exists for a single transaction). So "session" conveys a lot more of its meaning than "schema".

But don't rename everything just for me...

-- 
Justin
Re: Draft release notes for next week's releases
Justin Pryzby writes: > On Fri, Nov 05, 2021 at 08:27:49PM -0400, Tom Lane wrote: >> + one more row than requested, so tha it can determine whether the > A little typo: tha it > [ etc ] Ugh. These notes were prepared in a bit more haste than usual, and I guess it shows :-(. Will fix, thanks for proofreading! regards, tom lane
Re: using extended statistics to improve join estimates
Hi Tomas: This is exactly the patch I want - thanks for the patch! On Thu, Oct 7, 2021 at 3:33 AM Tomas Vondra wrote: > 3) estimation by join pairs > > At the moment, the estimates are calculated for pairs of relations, so > for example given a query > > explain analyze > select * from t1 join t2 on (t1.a = t2.a and t1.b = t2.b) >join t3 on (t1.b = t3.b and t2.c = t3.c); > > we'll estimate the first join (t1,t2) just fine, but then the second > join actually combines (t1,t2,t3). What the patch currently does is it > splits it into (t1,t2) and (t2,t3) and estimates those. Actually I can't understand how this works even for a simpler example. Let's say we have a query like this (using ONLY t2's columns to join t3): select * from t1 join t2 on (t1.a = t2.a and t1.b = t2.b) join t3 on (t2.c = t3.c and t2.d = t3.d); Then it works well for JoinRel(t1, t2) and JoinRel(t2, t3). But when it comes to JoinRel(t1, t2, t3), we don't maintain an MCV on the join rel, so it is hard to make it work. Here I see your solution is to split it into (t1, t2) and (t2, t3) and estimate those. But how does this help to estimate the size of JoinRel(t1, t2, t3)? > I wonder if this > should actually combine all three MCVs at once - we're pretty much > combining the MCVs into one large MCV representing the join result. > I guess we can keep the MCVs on the joinrel for these matches. Take the above query I provided as an example, and suppose the MCV data are as below: t1(a, b): (1, 2) -> 0.1 (1, 3) -> 0.2 (2, 3) -> 0.5 (2, 8) -> 0.1 t2(a, b): (1, 2) -> 0.2 (1, 3) -> 0.1 (2, 4) -> 0.2 (2, 10) -> 0.1 After t1.a = t2.a AND t1.b = t2.b, we can build the MCV as below: (1, 2, 1, 2) -> 0.1 * 0.2 (1, 3, 1, 3) -> 0.2 * 0.1 And record the total MCV frequency as (0.1 + 0.2 + 0.5 + 0.1) * (0.2 + 0.1 + 0.2 + 0.1). With this design, the number of MCV items on the joinrel would be no larger than on either base rel. 
and since we handle the eqjoin as well, we can even record the items as (1, 2) -> 0.1 * 0.2 (1, 3) -> 0.2 * 0.1. As for when we should maintain the JoinRel's MCV data: rather than building it right after the JoinRel size is estimated, we could compute it only when it is needed. For example: select * from t1 join t2 on (t1.a = t2.a and t1.b = t2.b) join t3 on (t2.c = t3.c and t2.d = t3.d); we don't need to maintain the MCV on (t1, t2, t3) since nothing else needs it at all. However, I haven't checked the code closely enough to see whether it (lazily computing the MCV on a joinrel) is convenient to do. > But I haven't done that yet, as it requires the MCVs to be combined > using the join clauses (overlap in a way), but I'm not sure how likely > that is in practice. In the example it could help, but that's a bit > artificial example. > > > 4) still just inner equi-joins > > I haven't done any work on extending this to outer joins etc. Adding > outer and semi joins should not be complicated, mostly copying and > tweaking what eqjoinsel() does. > Overall, thanks for the feature, and I expect there are more cases to handle during the discussion. To make the review process more efficient, I suggest that we split the patch into smaller ones and review/commit them separately once we have roughly finalized the design. For example: Patch 1 -- require both sides to have extended statistics. Patch 2 -- require one side to have extended statistics and the other side to have a per-column MCV. Patch 3 -- handle the case like WHERE t1.a = t2.a and t1.b = Const; Patch 4 -- handle the case for 3+ table joins. Patch 5 -- support outer joins. I think we can do this if we are sure that each individual patch would work in some cases and would not make any other case worse. If you agree with this, I can do the splitting work during my review process. -- Best Regards Andy Fan (https://www.aliyun.com/)
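The MCV-combination arithmetic in the worked example above can be sketched in a few lines. This is a hypothetical illustration of the idea only, not code from the patch; the dictionaries and the combine_mcvs() helper are made up for this example.

```python
# Hypothetical sketch of combining two per-relation MCV lists under
# t1.a = t2.a AND t1.b = t2.b, using the frequencies from the example above.
# Not code from the patch; data and helper are illustrative.

t1_mcv = {(1, 2): 0.1, (1, 3): 0.2, (2, 3): 0.5, (2, 8): 0.1}   # t1(a, b)
t2_mcv = {(1, 2): 0.2, (1, 3): 0.1, (2, 4): 0.2, (2, 10): 0.1}  # t2(a, b)

def combine_mcvs(mcv_a, mcv_b):
    """Items whose join-key values match on both sides get the product of
    the per-relation frequencies; non-matching items drop out entirely."""
    return {key: fa * mcv_b[key] for key, fa in mcv_a.items() if key in mcv_b}

join_mcv = combine_mcvs(t1_mcv, t2_mcv)
# Only (1, 2) and (1, 3) match: frequencies 0.1 * 0.2 and 0.2 * 0.1

# Fraction of the t1 x t2 cross product covered by the two MCV lists,
# usable for scaling the estimate of the non-MCV remainder.
mcv_coverage = sum(t1_mcv.values()) * sum(t2_mcv.values())  # 0.9 * 0.6

print(join_mcv, mcv_coverage)
```

As the sketch shows, the combined list ends up with only two items, fewer than either four-item input, which matches the observation that the joinrel MCV cannot grow beyond either base relation's MCV under these equality clauses.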
AW: VS2022: Support Visual Studio 2022 on Windows
Now it is three days before the release of VS2022. I updated to the latest Preview 7 (= RC3) and recompiled PG14 64-bit Release without issues. There seem to be few internal differences from previous versions in the tools used for building Postgres. My intention with early support is to catch the momentum and signal to users: "We support current tools". The risks seem non-existent. I updated the patch to reflect the VisualStudioVersion for Preview 7, which is the version number compiled into the main devenv.exe image. This version number seems to be of no interest elsewhere in the postgres source tree. I will reflect any updates after the official release on Monday, November 8. Hans Buschmann 0001_support_vs2022_v2.patch
Re: Extending amcheck to check toast size and compression
> On Nov 5, 2021, at 8:56 PM, Tom Lane wrote: > > Some of the buildfarm is unimpressed with this --- looks like the test > output is less stable than you thought. Yes, it does seem so. I had to play with it a bit to be sure the test itself is faulty, and I believe that it is. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company