Re: [HACKERS] Causal reads take II
On Sun, Jul 30, 2017 at 7:07 AM, Dmitry Dolgov <9erthali...@gmail.com> wrote:
> I looked through the code of `synchronous-replay-v1.patch` a bit and ran a few
> tests. I didn't manage to break anything, except one mysterious error that I've
> got only once on one of my replicas, but I couldn't reproduce it yet. The
> interesting thing is that this error did not affect the other replica or the
> primary. Just in case, here is the log for this error (maybe you can see
> something obvious that I've not noticed):
>
> ```
> LOG: could not remove directory "pg_tblspc/47733/PG_10_201707211/47732": Directory not empty
> CONTEXT: WAL redo at 0/125F4D90 for Tablespace/DROP: 47733
> LOG: could not remove directory "pg_tblspc/47733/PG_10_201707211": Directory not empty
> CONTEXT: WAL redo at 0/125F4D90 for Tablespace/DROP: 47733
> LOG: could not remove directory "pg_tblspc/47733/PG_10_201707211/47732": Directory not empty
> CONTEXT: WAL redo at 0/125F4D90 for Tablespace/DROP: 47733
> LOG: could not remove directory "pg_tblspc/47733/PG_10_201707211": Directory not empty
> CONTEXT: WAL redo at 0/125F4D90 for Tablespace/DROP: 47733
> LOG: directories for tablespace 47733 could not be removed
> HINT: You can remove the directories manually if necessary.
> CONTEXT: WAL redo at 0/125F4D90 for Tablespace/DROP: 47733
> FATAL: could not create directory "pg_tblspc/47734/PG_10_201707211/47732": File exists
> CONTEXT: WAL redo at 0/125F5768 for Storage/CREATE: pg_tblspc/47734/PG_10_201707211/47732/47736
> LOG: startup process (PID 8034) exited with exit code 1
> LOG: terminating any other active server processes
> LOG: database system is shut down
> ```

Hmm. The first error ("could not remove directory") could perhaps be explained by temporary files from concurrent backends, leaked files from earlier crashes, or copying a pgdata directory over the top of an existing one as a way to set it up, leaving behind some files from an earlier test?
The second error ("could not create directory") is a bit stranger though... I think this must come from TablespaceCreateDbspace(): it must have stat()'d the file and got ENOENT, decided to create the directory, acquired TablespaceCreateLock, stat()'d the file again and found it still absent, then run mkdir() on the parents and got EEXIST, and finally on the directory to be created, and surprisingly got EEXIST. That means that someone must have concurrently created the directory. Perhaps in your testing you accidentally copied a pgdata directory over the top of it while it was running? In any case, I'm struggling to see how anything in this patch would affect anything at the REDO level.

> And speaking about the code, so far I have just a few notes (some of them
> merely questions):
>
> * In general the idea behind this patch sounds interesting to me, but it
> relies heavily on time synchronization. As mentioned in the documentation:
> "Current hardware clocks, NTP implementations and public time servers are
> unlikely to allow the system clocks to differ more than tens or hundreds of
> milliseconds, and systems synchronized with dedicated local time servers may
> be considerably more accurate." But as far as I remember from my own
> experience, sometimes it may not be that trivial on something like AWS
> because of virtualization. Maybe it's an unreasonable fear, but is it
> possible to address this problem somehow?

Oops, I had managed to lose an important hunk that deals with detecting excessive clock drift (ie badly configured servers) while rebasing a couple of versions back. Here is a version to put it back.
With that change, if you disable NTP and manually set your standby's clock to be more than 1.25s behind the primary (assuming synchronous_replay_lease_time is set to the default of 5s), synchronous_replay should become unavailable and you should see this error in the standby's log:

```
ereport(LOG,
        (errmsg("the primary server's clock time is too far ahead for synchronous_replay"),
         errhint("Check your servers' NTP configuration or equivalent.")));
```

One way to test this without messing with your NTP setting or involving two different computers is to modify this code temporarily in WalSndKeepalive:

```
now = GetCurrentTimestamp() + 1250100;
```

This is a best effort intended to detect a system not running ntpd at all or talking to an insane time server. Fundamentally, this proposal is based on the assumption that you can get your system clocks into sync within a tolerance that we feel confident estimating an upper bound for. It does appear that some Amazon OS images come with NTP disabled; that's a problem if you want to use this feature, but if you're running a virtual server without an ntpd you'll pretty soon drift seconds to minutes off UTC time and get "unavailable for synchronous replay" errors from this patch (and possibly the LOG message above, depending on the direction of drift).
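The numbers above imply a drift budget of a quarter of the lease time (1.25 s for the default 5 s lease), which the 1250100 µs nudge just exceeds. A standalone sketch of that arithmetic follows; the function names are illustrative and the lease/4 ratio is inferred from the numbers above, not read from the patch:

```python
# Illustrative sketch of the standby-side clock-drift check described above.
# The lease_time / 4 threshold and all names are assumptions for
# illustration; this is not the actual code from synchronous-replay-v1.patch.

def max_tolerated_drift_us(lease_time_us: int) -> int:
    """Drift budget: a quarter of synchronous_replay_lease_time."""
    return lease_time_us // 4

def primary_clock_too_far_ahead(primary_ts_us: int, standby_ts_us: int,
                                lease_time_us: int) -> bool:
    """True when the keepalive's primary timestamp is ahead of the standby's
    clock by more than the drift budget, making the lease unusable."""
    return primary_ts_us - standby_ts_us > max_tolerated_drift_us(lease_time_us)

LEASE_TIME_US = 5_000_000  # default synchronous_replay_lease_time = 5s

# A keepalive nudged forward by 1250100us (as in the WalSndKeepalive hack
# above) just exceeds the 1.25s budget; 1250000us does not:
print(primary_clock_too_far_ahead(1_250_100, 0, LEASE_TIME_US))  # True
print(primary_clock_too_far_ahead(1_250_000, 0, LEASE_TIME_US))  # False
```

The same arithmetic explains why the suggested `+ 1250100` tweak is the smallest convenient value that trips the check.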
Re: [HACKERS] Adding support for Default partition in partitioning
On Sat, Jul 29, 2017 at 2:55 AM, Robert Haas wrote:
> On Fri, Jul 28, 2017 at 9:30 AM, Ashutosh Bapat wrote:
>> 0004 patch
>> The patch adds another column partdefid to catalog pg_partitioned_table. The
>> column gives the OID of the default partition for a given partitioned table.
>> This means that the default partition's OID is stored in two places: 1. in the
>> default partition table's pg_class entry, and 2. in pg_partitioned_table.
>> There is no way to detect when these two go out of sync. Keeping those two in
>> sync is also a maintenance burden. Given that the default partition's OID is
>> required only while adding/dropping a partition, which is a less frequent
>> operation, it won't hurt to join a few catalogs (pg_inherits and pg_class in
>> this case) to find out the default partition's OID. That occasional
>> performance hit is worth avoiding the maintenance burden.
>
> Performance isn't the only consideration here. We also need to think
> about locking and concurrency. I think that most operations that
> involve locking the parent will also involve locking the default
> partition. However, we can't safely build a relcache entry for a
> relation before we've got some kind of lock on it. We can't assume
> that there is no concurrent DDL going on before we take some lock. We
> can't assume invalidation messages are processed before we have taken
> some lock. If we read multiple catalog tuples, they may be from
> different points in time. If we can figure out everything we need to
> know from one or two syscache lookups, it may be easier to verify that
> the code is bug-free vs. having to do something more complicated.

The code takes a lock on the parent relation. While that lock is held, nobody else can change the partitions of that relation, and hence nobody can change the default partition. heap_drop_with_catalog() has code to lock the parent.
Looking up the pg_inherits catalog for its partitions, followed by identifying the partition which has the default partition bound specification, while holding the lock on the parent, should be safe. Any changes to partition bounds or partitions would require a lock on the parent. In order to prevent any buggy code from changing the default partition without sufficient locks, we should lock the default partition after it's found and check the default partition bound specification again. Will that work?

> Now that having been said, I'm not taking the position that Jeevan's
> patch (based on Amit Langote's idea) has definitely got the right
> idea, just that you should think twice before shooting down the
> approach.

If we can avoid the problems specified by Amit Langote, I am fine with the approach of reading the default partition OID from the relcache as well. But I am not able to devise a solution to those problems.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
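The find-then-lock-then-recheck protocol proposed above can be sketched as the following standalone simulation. All names are illustrative; this is not PostgreSQL code, just the shape of the locking argument:

```python
# Illustrative simulation of the find-then-lock-then-recheck protocol
# described above. All names are made up; this is not PostgreSQL code.

import threading

parent_lock = threading.RLock()        # stands in for the parent's lock
default_part_lock = threading.RLock()  # lock on the default partition itself
partitions = {"p_default": "DEFAULT", "p1": "FOR VALUES IN (1)"}

def find_default_partition():
    """Scan the partition list (think pg_inherits) for the DEFAULT bound."""
    for name, bound in partitions.items():
        if bound == "DEFAULT":
            return name
    return None

def lock_default_partition():
    # Hold the parent's lock while scanning, so the partition list cannot
    # legitimately change underneath us.
    with parent_lock:
        name = find_default_partition()
        if name is None:
            return None
        # Lock the partition we found, then re-check its bound: buggy code
        # that changed the default partition without sufficient locks would
        # be caught here.
        with default_part_lock:
            if partitions.get(name) != "DEFAULT":
                raise RuntimeError("default partition changed concurrently")
            return name

print(lock_default_partition())  # p_default
```

The recheck after taking the partition's own lock is what turns a silent inconsistency into a detectable error.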
Re: [HACKERS] On Complex Source Code Reading Strategy
On Fri, Jul 28, 2017 at 6:23 AM, Craig Ringer wrote:
> On 28 July 2017 at 07:45, Tom Lane wrote:
>> Peter Geoghegan writes:
>> > 2. Start somewhere. I have no idea where that should be, but it has to
>> > be some particular place that seems interesting to you.
>>
>> Don't forget to start with the available documentation, ie
>> https://www.postgresql.org/docs/devel/static/internals.html
>> You should certainly read
>> https://www.postgresql.org/docs/devel/static/overview.html
>> and depending on what your interests are, there are probably other
>> chapters of Part VII that are worth your time.
>>
>> Also keep an eye out for README files in the part of the source
>> tree you're browsing in.
>
> In fact, even though you won't initially understand much from some of them,
> reading most of
>
>     find src/ -name README\*
>
> can be quite useful. It's nearly time for me to do that again myself; each
> time I absorb more.
>
> There are very useful comments at the start of some of the source files too.
> Unfortunately in some cases the really important explanation will be on some
> function that you won't know to look for, not the comment at the top of the
> file, so there's an element of discovery there.
>
> I'd start with the docs as Tom suggested, then
>
> * https://www.postgresql.org/developer/
> * https://momjian.us/main/presentations/internals.html
> * https://wiki.postgresql.org/wiki/Development_information
> * https://wiki.postgresql.org/wiki/So,_you_want_to_be_a_developer%3F
> * https://wiki.postgresql.org/wiki/Developer_FAQ
>
> (some of which need to be added to the "developer information" wiki page, I
> think)

Thanks, dear hackers. This is an enormous help for me. I think recommending specific techniques/tools like cscope, `find src/ -name README\*`, and others you might know that make life easy with PG hacking can be a great help for beginners.
Regards,
Zeray
Re: [HACKERS] pl/perl extension fails on Windows
Hi Christoph,

On Mon, Jul 31, 2017 at 2:44 AM, Christoph Berg wrote:
> Re: Tom Lane 2017-07-28 <3254.1501276...@sss.pgh.pa.us>
>> Christoph Berg writes:
>> > The plperl segfault on Debian's kfreebsd port I reported back in 2013
>> > is also still present:
>> > https://www.postgresql.org/message-id/20130515064201.GC704%40msgid.df7cb.de
>> > https://buildd.debian.org/status/fetch.php?pkg=postgresql-10=kfreebsd-amd64=10~beta2-1=1499947011=0
>>
>> So it'd be interesting to know if it's any better with HEAD ...
>
> Unfortunately not:
>
> == creating database "pl_regression" ==
> CREATE DATABASE
> ALTER DATABASE
> == installing plperl ==
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> connection to server was lost
>
> The only interesting line in log/postmaster.log is a log_line_prefix-less
>
> Util.c: loadable library and perl binaries are mismatched (got handshake key 0xd500080, needed 0xd600080)
>
> ... which is unchanged from the beta2 output.

I am not able to reproduce this issue on my Windows or Linux box. Have you deleted the Util.c and SPI.c files before starting the build? The point is, these files are generated at build time, and if they already exist I think they are not regenerated. I would suggest deleting both .c files, rebuilding your source, and then giving it a try. Here are the results I got on Windows and Linux respectively on HEAD.

On Windows:

C:\Users\ashu\git-clone-postgresql\postgresql\src\tools\msvc>vcregress.bat PLCHECK
Checking plperl
(using postmaster on localhost, default port)
== dropping database "pl_regression" ==
NOTICE: database "pl_regression" does not exist, skipping
DROP DATABASE
== creating database "pl_regression" ==
CREATE DATABASE
ALTER DATABASE
== installing plperl ==
CREATE EXTENSION
== installing plperlu ==
CREATE EXTENSION
== running regression test queries ==
test plperl ... ok
test plperl_lc ... ok
test plperl_trigger ...
ok
test plperl_shared ... ok
test plperl_elog ... ok
test plperl_util ... ok
test plperl_init ... ok
test plperlu ... ok
test plperl_array ... ok
test plperl_plperlu ... ok
== All 10 tests passed. ==

On Linux:

LD_LIBRARY_PATH="/home/ashu/git-clone-postgresql/postgresql/tmp_install/home/ashu/git-clone-postgresql/postgresql/TMP/postgres/lib" ../../../src/test/regress/pg_regress --temp-instance=./tmp_check --inputdir=. --bindir= --dbname=pl_regression --load-extension=plperl --load-extension=plperlu plperl plperl_lc plperl_trigger plperl_shared plperl_elog plperl_util plperl_init plperlu plperl_array
== creating temporary instance ==
== initializing database system ==
== starting postmaster ==
running on port 50848 with PID 18140
== creating database "pl_regression" ==
CREATE DATABASE
ALTER DATABASE
== installing plperl ==
CREATE EXTENSION
== installing plperlu ==
CREATE EXTENSION
== running regression test queries ==
test plperl ... ok
test plperl_lc ... ok
test plperl_trigger ... ok
test plperl_shared ... ok
test plperl_elog ... ok
test plperl_util ... ok
test plperl_init ... ok
test plperlu ... ok
test plperl_array ... ok
== shutting down postmaster ==
== removing temporary instance ==

--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
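The handshake-key mismatch in the quoted log is an instance of a general pattern: a key baked into a generated file at build time must match the key of the interpreter that loads it, which is why stale generated files break after a perl upgrade. A toy sketch of that check follows; all values and names here are illustrative, not perl's actual key derivation:

```python
# Illustrative sketch of why stale generated files cause a handshake-key
# mismatch like the one above: the generated file bakes in a key at build
# time, and the loader compares it with the key of the interpreter actually
# running. The key layout and names are made up for illustration.

def handshake_key(api_version: int, flags: int) -> int:
    """A toy key combining an API version and build flags."""
    return (api_version << 20) | flags

baked_in_key = handshake_key(0xd5, 0x80)   # key compiled into a stale Util.c
current_key = handshake_key(0xd6, 0x80)    # key of the perl now installed

def load_module(baked: int, current: int) -> str:
    if baked != current:
        raise RuntimeError(
            f"loadable library and perl binaries are mismatched "
            f"(got handshake key {baked:#x}, needed {current:#x})")
    return "loaded"

# Regenerating the file (deleting Util.c/SPI.c and rebuilding) rebakes the
# key, so both sides match again:
print(load_module(current_key, current_key))  # loaded
```

With the toy layout above the stale key comes out as 0xd500080 and the current one as 0xd600080, matching the values in the log.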
Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables
On Fri, Jul 14, 2017 at 3:02 AM, Ashutosh Bapat wrote:
> Here's revised patch set with only 0004 revised. That patch deals with
> creating multi-level inheritance hierarchy from multi-level partition
> hierarchy. The original logic of recursively calling
> inheritance_planner()'s guts over the inheritance hierarchy required
> that for every such recursion we flatten many lists created by that
> code. Recursion also meant that root->append_rel_list is traversed as
> many times as the number of partitioned partitions in the hierarchy.
> Instead the revised version keeps the iterative shape of
> inheritance_planner() intact, thus naturally creating flat lists,
> iterates over root->append_rel_list only once and is still easy to
> read and maintain.

0001-0003 look basically OK to me, modulo some cosmetic stuff. Regarding 0004:

+    if (brel->reloptkind != RELOPT_BASEREL &&
+        brte->relkind != RELKIND_PARTITIONED_TABLE)

I spent a lot of time staring at this code before I figured out what was going on here. We're iterating over simple_rel_array, so the reloptkind must be RELOPT_OTHER_MEMBER_REL if it isn't RELOPT_BASEREL. But does that guarantee that rtekind is RTE_RELATION such that brte->relkind will be initialized to a value? I'm not 100% sure. I think it would be clearer to write this test like this:

    Assert(IS_SIMPLE_REL(brel));
    if (brel->reloptkind == RELOPT_OTHER_MEMBER_REL &&
        (brte->rtekind != RTE_RELATION ||
         brte->relkind != RELKIND_PARTITIONED_TABLE))
        continue;

Note that the way you wrote the comment says if it *is* another REL, not if it's *not* a baserel; it's good if those kinds of little details match between the code and the comments. It is not clear to me what the motivation is for the API changes in expand_inherited_rtentry. They don't appear to be necessary. If they are necessary, you need to do a more thorough job updating the comments.
This one, in particular:

 * If so, add entries for all the child tables to the query's
 * rangetable, and build AppendRelInfo nodes for all the child tables
 * and add them to root->append_rel_list. If not, clear the entry's

And the comments could maybe say something like "We return the list of appinfos rather than directly appending it to append_rel_list because $REASON."

- * is a partitioned table.
+ * RTE simply duplicates the parent *partitioned* table.
  */
-    if (childrte->relkind != RELKIND_PARTITIONED_TABLE)
+    if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh)

This is another case where it's hard to understand the test from the comments.

+ * In case of multi-level inheritance hierarchy, for every child we require
+ * PlannerInfo of its immediate parent. Hence we save those in a an array

Say why. Also, need to fix "a an".

I'm a little bit surprised that this patch doesn't make any changes to allpaths.c or relnode.c. It looks to me like we'll generate paths for the new RTEs that are being added. Are we going to end up with multiple levels of Append nodes, then? Does the way consider_parallel is propagated up and down in set_append_rel_size() and set_append_rel_pathlist() really work with multiple levels? Maybe this is all fine; I haven't tried to verify it in depth. Overall I think this is a reasonable direction to go, but I'm worried that there may be bugs lurking -- other code that needs adjusting that hasn't been found, really.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
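The iterative flattening described in the 0004 summary — one pass over the multi-level hierarchy producing flat lists, instead of recursing and re-flattening per level — boils down to the work-list pattern below. This is illustrative Python with made-up data shapes, not the planner code:

```python
# Illustrative sketch of iterating over a multi-level partition hierarchy
# once to build a flat list, as opposed to recursing and flattening at each
# level. The data shapes are made up; this is not planner code.

hierarchy = {
    "parent": ["part1", "part2"],      # part2 is itself partitioned
    "part2": ["part2_1", "part2_2"],
}

def flatten_iteratively(root: str) -> list:
    """Single pass with an explicit work list; children of partitioned
    partitions are appended to the same flat output list."""
    result = []
    work = [root]
    while work:
        rel = work.pop(0)
        for child in hierarchy.get(rel, []):
            result.append(child)
            if child in hierarchy:     # partitioned partition: expand later
                work.append(child)
    return result

print(flatten_iteratively("parent"))  # ['part1', 'part2', 'part2_1', 'part2_2']
```

The explicit work list is what lets a single loop replace per-level recursion, so the output list never needs re-flattening.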
Re: [HACKERS] Missing comment for max_logical_replication_workers in postgresql.conf.sample
> Attached patch looks good except for the excessive tab stops:
>
> +                                        # (change requires restart)
>
> I will commit/push this with the excessive tab stops removed if
> there's no objection.

Done. Each fix was pushed as a separate commit because the versions needing a back-patch differ.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
[HACKERS] 10 beta docs: different replication solutions
We don't seem to describe logical replication on
https://www.postgresql.org/docs/10/static/different-replication-solutions.html

The attached patch adds a section.

Steve

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 138bdf2..1329d1f 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -158,6 +158,26 @@ protocol to make nodes agree on a serializable transactional order.
+
+
+ Logical Replication
+
+
+
+ Logical replication allows a database server to send a stream of
+ data modifications to another server.
+ PostgreSQL logical replication constructs
+ a stream of logical data modifications from the WAL.
+
+
+ Logical replication allows the data changes from individual tables
+ to be replicated. Logical replication doesn't require a particular server
+ to be designated as a master or a slave but allows data to flow in multiple
+ directions. For more information on logical replication, see .
+
+
+
 Trigger-Based Master-Standby Replication
@@ -290,6 +310,7 @@ protocol to make nodes agree on a serializable transactional order.
 Shared Disk Failover
 File System Replication
 Write-Ahead Log Shipping
+ Logical Replication
 Trigger-Based Master-Standby Replication
 Statement-Based Replication Middleware
 Asynchronous Multimaster Replication
@@ -304,6 +325,7 @@ protocol to make nodes agree on a serializable transactional order.
 NAS
 DRBD
 Streaming Repl.
+ Logical Repl.
 Slony
 pgpool-II
 Bucardo
@@ -315,6 +337,7 @@ protocol to make nodes agree on a serializable transactional order.
 shared disk
 disk blocks
 WAL
+ Logical decoding
 table rows
 SQL
 table rows
@@ -330,6 +353,7 @@ protocol to make nodes agree on a serializable transactional order.
+
@@ -337,6 +361,7 @@ protocol to make nodes agree on a serializable transactional order.
+
@@ -349,6 +374,7 @@ protocol to make nodes agree on a serializable transactional order.
+
@@ -360,6 +386,7 @@ protocol to make nodes agree on a serializable transactional order.
 with sync off
+
@@ -371,6 +398,7 @@ protocol to make nodes agree on a serializable transactional order.
 with sync on
+
@@ -385,6 +413,7 @@ protocol to make nodes agree on a serializable transactional order.
+
@@ -393,6 +422,7 @@ protocol to make nodes agree on a serializable transactional order.
+
@@ -403,6 +433,7 @@ protocol to make nodes agree on a serializable transactional order.
+
Re: [HACKERS] [BUGS] BUG #14759: insert into foreign data partitions fail
Thank you for weighing in and reviewing the patch.

On 2017/07/28 20:55, Etsuro Fujita wrote:
> On 2017/07/26 15:29, Amit Langote wrote:
>> On 2017/07/25 9:43, David G. Johnston wrote:
>>> On Mon, Jul 24, 2017 at 5:19 PM, Amit Langote wrote:
>>> I'm curious what the other limitations are...
>
> I think COPY has the same limitation as INSERT.

Yes. I updated the patch to mention that as well.

>> When I first wrote that documentation line (I am assuming you're asking
>> about "although these have some limitations that normal tables do not"), I
>> was thinking about the fact that the core system does not enforce
>> (locally) any constraints defined on foreign tables. Since we allow
>> inserting data into partitions directly, it is imperative that we enforce
>> the "partition constraint" along with the traditional constraints such as
>> NOT NULL and CHECK constraints, which we can do for local table partitions
>> but not for foreign table ones.
>>
>> Anyway, attached patch documents all these limitations about foreign table
>> partitions more prominently.
>
> Typo: s/the they is/they are/

Fixed in the attached.

Thanks,
Amit

From 2b9c28808b725ed4551a2876a187531439f13928 Mon Sep 17 00:00:00 2001
From: amit
Date: Mon, 3 Apr 2017 16:45:15 +0900
Subject: [PATCH] Clarify that partition constraint is not enforced on foreign tables
---
 doc/src/sgml/ddl.sgml                      | 16 +---
 doc/src/sgml/ref/create_foreign_table.sgml | 17 +++--
 2 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index b05a9c2150..a0ab648928 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2988,9 +2988,11 @@ VALUES ('Albany', NULL, NULL, 'NY');
 Partitions can also be foreign tables (see ),
-although these have some limitations that normal tables do not. For
-example, data inserted into the partitioned table is not routed to
-foreign table partitions.
+although they have some limitations that normal tables do not.
For
+example, routing the data inserted (or copied) into the partitioned table
+to foreign table partitions is not supported, nor are the partition
+constraints enforced when the data is directly inserted (or copied) into
+them.
@@ -3297,6 +3299,14 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
 not the partitioned table.
+
+
+ Routing tuples to partitions that are foreign tables is not supported.
+ So, if a tuple inserted (or copied) into the table routes to one of
+ the foreign partitions, an error will occur.
+
diff --git a/doc/src/sgml/ref/create_foreign_table.sgml b/doc/src/sgml/ref/create_foreign_table.sgml
index 065c982082..43a6cbcfab 100644
--- a/doc/src/sgml/ref/create_foreign_table.sgml
+++ b/doc/src/sgml/ref/create_foreign_table.sgml
@@ -79,7 +79,9 @@ CHECK ( expression ) [ NO INHERIT ]
 If PARTITION OF clause is specified then the table is created
 as a partition of parent_table with specified
- bounds.
+ bounds. Note that routing tuples to partitions that are foreign tables
+ is not supported. So, if a tuple inserted (or copied) into the table
+ routes to one of the foreign partitions, an error will occur.
@@ -279,16 +281,19 @@ CHECK ( expression ) [ NO INHERIT ]
 Notes
-Constraints on foreign tables (such as CHECK
-or NOT NULL clauses) are not enforced by the
-core PostgreSQL system, and most foreign data wrappers
-do not attempt to enforce them either; that is, the constraint is
+Constraints (both the user-defined constraints such as CHECK
+or NOT NULL clauses and the partition constraint) are not
+enforced by the core PostgreSQL system, and most foreign
+data wrappers do not attempt to enforce them either; that is, they are
 simply assumed to hold true. There would be little point in such
 enforcement since it would only apply to rows inserted or updated via
 the foreign table, and not to rows modified by other means, such as
 directly on the remote server.
 Instead, a constraint attached to a foreign table should
 represent a constraint that is being enforced by
-the remote server.
+the remote server. That becomes especially important if the table is
+being used in a partition hierarchy, where it is recommended to add
+a constraint matching the partition constraint expression on
+the remote table.
--
2.11.0
Re: [HACKERS] Refreshing subscription relation state inside a transaction block
On Fri, Jul 28, 2017 at 1:13 PM, Masahiko Sawada wrote:
> On Thu, Jul 27, 2017 at 9:31 AM, Masahiko Sawada wrote:
>> On Wed, Jul 26, 2017 at 10:29 PM, Robert Haas wrote:
>>> On Mon, Jun 19, 2017 at 4:30 AM, Masahiko Sawada wrote:
> I think that either of the options you suggested now would be better.
> We'll need that for stopping the tablesync which is in progress during
> DropSubscription as well, as those will currently still keep running. I
> guess we could simply just register a syscache callback; the only problem
> with that is we'd need to AcceptInvalidationMessages regularly while we
> do the COPY, which is not exactly free, so maybe killing at the end of
> transaction would be better (both for refresh and drop)?
>
> Attached patch makes the table sync worker check its relation subscription
> state at the end of COPY and exit if it has disappeared, instead of
> killing the table sync worker in DDL. There is still a problem that a
> table sync worker for a large table can hold a slot for a long time even
> after its state is deleted. But that would be a PG11 item.
>>>
>>> Do we still need to do something about this? Should it be an open item?
>>
>> Thank you for looking at this.
>>
>> Yeah, I think it should be added to the open item list. The patch is
>> updated by Petr and discussed on another thread[1] that also addresses
>> other issues of the subscription code. The 0004 patch of that thread is
>> an updated version of the patch attached on this thread.
>
> Does anyone have any opinions? Barring any objections, I'll add this
> to the open item list.

Added it.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
[HACKERS] Re: [BUGS] BUG #14758: Segfault with logical replication on a function index
Moved to -hackers.

On Sat, Jul 29, 2017 at 4:35 AM, Scott Milliken wrote:
> Thank you Masahiko! I've tested and confirmed that this patch fixes the
> problem.

Thank you for the testing. This issue should be added to the open items, since it causes a server crash. I'll add it.

> On Fri, Jul 28, 2017 at 3:07 AM, Masahiko Sawada wrote:
>> On Mon, Jul 24, 2017 at 4:22 PM, wrote:
>> > The following bug has been logged on the website:
>> >
>> > Bug reference: 14758
>> > Logged by: Scott Milliken
>> > Email address: sc...@deltaex.com
>> > PostgreSQL version: 10beta2
>> > Operating system: Linux 4.10.0-27-generic #30~16.04.2-Ubuntu S
>> > Description:
>> >
>> > I'm testing logical replication on 10beta2, and found a segfault that I can
>> > reliably reproduce with an index on a not-actually immutable function.
>> >
>> > Here's the function in question:
>> >
>> > ```
>> > CREATE OR REPLACE FUNCTION public.immutable_random(integer)
>> > RETURNS double precision
>> > LANGUAGE sql
>> > IMMUTABLE
>> > AS $function$SELECT random();
>> > $function$;
>> > ```
>> >
>> > It's not actually immutable since it's calling random (a hack to get an
>> > efficient random sort on a table).
>> >
>> > (Aside: I'd understand if it errored on creation of the index, but would
>> > really prefer to keep using this instead of tablesample because it's fast,
>> > deterministic, and doesn't have sampling biases like the SYSTEM sampling.)
>> >
>> > Here's full reproduction instructions:
>> >
>> > Primary:
>> > ```
>> > mkdir -p /tmp/test-seg0
>> > PGPORT=5301 initdb -D /tmp/test-seg0
>> > echo "wal_level = logical" >> /tmp/test-seg0/postgresql.conf
>> > PGPORT=5301 pg_ctl -D /tmp/test-seg0 start
>> > for (( ; ; )); do if pg_isready -d postgres -p 5301; then break; fi; sleep 1; done
>> > psql -p 5301 postgres -c "CREATE USER test WITH PASSWORD 'test' SUPERUSER CREATEDB CREATEROLE LOGIN REPLICATION BYPASSRLS;"
>> > createdb -p 5301 -E utf8 test
>> >
>> > psql -p 5301 -U test test -c "CREATE TABLE testtbl (id int, name text);"
>> > psql -p 5301 -U test test -c "ALTER TABLE testtbl ADD CONSTRAINT testtbl_pkey PRIMARY KEY (id);"
>> > psql -p 5301 -U test test -c "CREATE PUBLICATION testpub FOR TABLE testtbl;"
>> > psql -p 5301 -U test test -c "INSERT INTO testtbl (id, name) VALUES (1, 'a');"
>> > ```
>> >
>> > Secondary:
>> > ```
>> > mkdir -p /tmp/test-seg1
>> > PGPORT=5302 initdb -D /tmp/test-seg1
>> > PGPORT=5302 pg_ctl -D /tmp/test-seg1 start
>> > for (( ; ; )); do if pg_isready -d postgres -p 5302; then break; fi; sleep 1; done
>> > psql -p 5302 postgres -c "CREATE USER test WITH PASSWORD 'test' SUPERUSER CREATEDB CREATEROLE LOGIN REPLICATION BYPASSRLS;"
>> > createdb -p 5302 -E utf8 test
>> >
>> > psql -p 5302 -U test test -c "CREATE TABLE testtbl (id int, name text);"
>> > psql -p 5302 -U test test -c "ALTER TABLE testtbl ADD CONSTRAINT testtbl_pkey PRIMARY KEY (id);"
>> > psql -p 5302 -U test test -c 'CREATE FUNCTION public.immutable_random(integer) RETURNS double precision LANGUAGE sql IMMUTABLE AS $function$ SELECT random(); $function$'
>> > psql -p 5302 -U test test -c "CREATE INDEX ix_testtbl_random ON testtbl USING btree (immutable_random(id));"
>> > psql -p 5302 -U test test -c "CREATE SUBSCRIPTION test0_testpub CONNECTION 'port=5301 user=test dbname=test' PUBLICATION testpub;"
>> > ```
>> >
>> > The secondary crashes with a segfault:
>> >
>> > ```
>> > 2017-07-23 23:55:37.961 PDT [4823] LOG: logical replication table
>> > synchronization worker for subscription "test0_testpub", table "testtbl"
>> > has started
>> > 2017-07-23 23:55:38.244 PDT [4758] LOG: worker process: logical replication
>> > worker for subscription 16396 sync 16386 (PID 4823) was terminated by signal
>> > 11: Segmentation fault
>> > 2017-07-23 23:55:38.244 PDT [4758] LOG: terminating any other active server
>> > processes
>> > 2017-07-23 23:55:38.245 PDT [4763] WARNING: terminating connection because
>> > of crash of another server process
>> > 2017-07-23 23:55:38.245 PDT [4763] DETAIL: The postmaster has commanded
>> > this server process to roll back the current transaction and exit, because
>> > another server process exited abnormally and possibly corrupted shared memory.
>> > 2017-07-23 23:55:38.245 PDT [4763] HINT: In a moment you should be able to
>> > reconnect to the database and repeat your command.
>> > 2017-07-23 23:55:38.247 PDT [4758] LOG: all server processes terminated;
>> > reinitializing
>> > 2017-07-23 23:55:38.256 PDT [4826] LOG: database system was interrupted;
>> > last known up at 2017-07-23 23:55:36 PDT
>> > 2017-07-23 23:55:38.809 PDT [4826] LOG: database system was not properly
>> > shut down; automatic recovery in progress
>> > 2017-07-23 23:55:38.812 PDT [4826] LOG:
Re: [HACKERS] segfault in HEAD when too many nested functions call
On 31/07/2017 01:47, Andres Freund wrote: > Julien, could you quickly verify that everything's good for you now too? > I just checked on current HEAD (cc9f08b6b813e30789100b6b34110d8be1090ba0) and everything's good for me. Thanks! -- Julien Rouhaud http://dalibo.com - http://dalibo.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] segfault in HEAD when too many nested functions call
Hi, On 2017-07-29 16:14:08 -0400, Tom Lane wrote: > Andres Freund writes: > > [ 0002-Move-ExecProcNode-from-dispatch-to-function-pointer-.patch ] > > Here's a reviewed version of this patch. Thanks! I pushed both now. > I added dummy ExecProcNodeMtd functions to the various node types that > lacked them because they expect to be called through MultiExecProcNode > instead. In the old coding, trying to call ExecProcNode on one of those > node types would have led to a useful error message; as you had it, > it'd have dumped core, which is not an improvement. Ok, makes sense. > Also, I removed the ExecReScan stanza from ExecProcNodeFirst; that > should surely be redundant, because we should only get to that function > through ExecProcNode(). If somehow it's not redundant, please add a > comment explaining why not. Makes sense too. > Some other minor cosmetic changes, mostly comment wordsmithing. Thanks! Julien, could you quickly verify that everything's good for you now too? Regards, Andres
Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?
Just to continue the above, I can confirm that adding a simple call to SSL_CTX_set_session_id_context() to be_tls_init() with some arbitrary const value fixes the error for me. Attached is a patch (ideally a test should be done for this, but that's beyond what I can invest at the moment, let me know if it's absolutely necessary). On Mon, Jul 31, 2017 at 1:15 AM, Shay Rojansky wrote: > Hi Tom. > > Again, I know little about this, but from what I understand PostgreSQL > wouldn't actually need to do/implement anything here - the session ticket > might be used only to abbreviate the SSL handshake (this would explain why > it's on by default without any application support). In other words, simply > setting the session context id may make the problem go away and at the same > time unlock the abbreviated SSL handshake optimization. I could be wrong > about this though. > > Whether the above is correct or not, SSL resumption - which removes a > network roundtrip from the connection process - may be a worthy > optimization even for long-lived connections such as PostgreSQL, although > obviously much less valuable than, say, short-lived HTTP connections. > > But regardless, it seems that as you say: if you *don't* want to support > resumption, you're required to explicitly disable it with > SSL_OP_NO_TICKET. > > Just to give some context, Npgsql has its own, internal TLS implementation > which does not implement session tickets at the client side - this is the > workaround currently used. However, it would be much better if the standard > .NET SSL implementation could be used instead (i.e. I'm hoping a backport > would be possible here). > > On Sun, Jul 30, 2017 at 10:59 PM, Tom Lane wrote: > >> I wrote: >> > I think what you need to do is tell SslStream not to expect that PG >> > servers will do session resumption. (I'm a bit astonished that that >> > would be its default assumption in the first place, but whatever.)
>> >> Actually, after a bit of further googling, it seems that the brain >> damage here may be on the server side. It seems that OpenSSL will >> send a session ticket if requested, even though the surrounding >> application has given it no means to identify the session (!?). >> Apparently we need to pass SSL_OP_NO_TICKET to SSL_CTX_set_options >> to prevent that from happening. >> >> regards, tom lane >> > > openssl-set-session-id-context.patch Description: Binary data
Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?
Hi Tom. Again, I know little about this, but from what I understand PostgreSQL wouldn't actually need to do/implement anything here - the session ticket might be used only to abbreviate the SSL handshake (this would explain why it's on by default without any application support). In other words, simply setting the session context id may make the problem go away and at the same time unlock the abbreviated SSL handshake optimization. I could be wrong about this though. Whether the above is correct or not, SSL resumption - which removes a network roundtrip from the connection process - may be a worthy optimization even for long-lived connections such as PostgreSQL, although obviously much less valuable than, say, short-lived HTTP connections. But regardless, it seems that as you say: if you *don't* want to support resumption, you're required to explicitly disable it with SSL_OP_NO_TICKET. Just to give some context, Npgsql has its own, internal TLS implementation which does not implement session tickets at the client side - this is the workaround currently used. However, it would be much better if the standard .NET SSL implementation could be used instead (i.e. I'm hoping a backport would be possible here). On Sun, Jul 30, 2017 at 10:59 PM, Tom Lane wrote: > I wrote: > > I think what you need to do is tell SslStream not to expect that PG > > servers will do session resumption. (I'm a bit astonished that that > > would be its default assumption in the first place, but whatever.) > > Actually, after a bit of further googling, it seems that the brain > damage here may be on the server side. It seems that OpenSSL will > send a session ticket if requested, even though the surrounding > application has given it no means to identify the session (!?). > Apparently we need to pass SSL_OP_NO_TICKET to SSL_CTX_set_options > to prevent that from happening. > > regards, tom lane >
Re: [HACKERS] pl/perl extension fails on Windows
Christoph Berg writes: > Re: Tom Lane 2017-07-28 <3254.1501276...@sss.pgh.pa.us> >> Christoph Berg writes: >>> The plperl segfault on Debian's kfreebsd port I reported back in 2013 >>> is also still present: >> So it'd be interesting to know if it's any better with HEAD ... > Unfortunately not: > The only interesting line in log/postmaster.log is a log_line_prefix-less > Util.c: loadable library and perl binaries are mismatched (got handshake key > 0xd500080, needed 0xd600080) > ... which is unchanged from the beta2 output. Well, that's quite interesting, because it implies that this is indeed the same type of problem. I wonder why the patch didn't fix it? regards, tom lane
Re: [HACKERS] pl/perl extension fails on Windows
Re: Tom Lane 2017-07-28 <3254.1501276...@sss.pgh.pa.us> > Christoph Berg writes: > > The plperl segfault on Debian's kfreebsd port I reported back in 2013 > > is also still present: > > https://www.postgresql.org/message-id/20130515064201.GC704%40msgid.df7cb.de > > https://buildd.debian.org/status/fetch.php?pkg=postgresql-10=kfreebsd-amd64=10~beta2-1=1499947011=0 > > So it'd be interesting to know if it's any better with HEAD ... Unfortunately not: == creating database "pl_regression" == CREATE DATABASE ALTER DATABASE == installing plperl == server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. connection to server was lost The only interesting line in log/postmaster.log is a log_line_prefix-less Util.c: loadable library and perl binaries are mismatched (got handshake key 0xd500080, needed 0xd600080) ... which is unchanged from the beta2 output. Christoph
Re: [HACKERS] PL_stashcache, or, what's our minimum Perl version?
"Tels" writes: > On Sun, July 30, 2017 12:22 pm, Tom Lane wrote: >> Yeah, I looked into that. The closest candidate I can find is that >> perl 5.10.1 contains Test::More 0.92. However, it's not real clear >> to me exactly which files I'd need to pull out of 5.10.1 and inject into >> an older tarball --- the layout seems a lot different from a standalone >> package. > So basically the two files: > http://search.cpan.org/src/EXODIST/Test-Simple-1.302086/lib/Test/More.pm > http://search.cpan.org/src/EXODIST/Test-Simple-1.302086/lib/Test/Builder/Module.pm > might do the trick. Thanks for the hint. I transplanted these files out of a 5.10.1 tarball into 5.8.3, then built as usual: lib/Test/Simple.pm lib/Test/More.pm lib/Test/Builder.pm lib/Test/Builder/Module.pm The result seems to work, although it fails a few of 5.8.3's tests, probably because I didn't copy over the relevant test scripts. It's good enough to run PG's tests though. regards, tom lane
Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?
I wrote: > I think what you need to do is tell SslStream not to expect that PG > servers will do session resumption. (I'm a bit astonished that that > would be its default assumption in the first place, but whatever.) Actually, after a bit of further googling, it seems that the brain damage here may be on the server side. It seems that OpenSSL will send a session ticket if requested, even though the surrounding application has given it no means to identify the session (!?). Apparently we need to pass SSL_OP_NO_TICKET to SSL_CTX_set_options to prevent that from happening. regards, tom lane
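[Editor's illustration] For anyone who wants to see the option bit Tom names without building against libpq, the same OpenSSL flag is exposed through Python's ssl module. This is only a hedged sketch of what SSL_OP_NO_TICKET does to a server-side TLS context, not the PostgreSQL patch itself:

```python
import ssl

# A server-side TLS context (PROTOCOL_TLS_SERVER needs Python 3.6+).
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)

# OP_NO_TICKET maps to OpenSSL's SSL_OP_NO_TICKET: the server stops
# issuing classic (pre-TLS 1.3) session tickets, so clients cannot
# attempt the stateless resumption that triggers the
# "session id context uninitialized" error described in this thread.
ctx.options |= ssl.OP_NO_TICKET

print(bool(ctx.options & ssl.OP_NO_TICKET))  # True: tickets disabled
```

The flag only suppresses ticket issuance; enabling resumption properly would instead require the session-id-context call discussed below in the thread.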
Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?
Shay Rojansky writes: > When trying to connect with Npgsql to PostgreSQL with client authentication > (PG has ssl_ca_file set), the first connection works just fine. The second > connection, however, fails and the PostgreSQL logs contain the message > "session id context uninitialized". This occurs when using .NET's default > SSL implementation, SslStream, which supports session resumption - the > session connection's ClientHello message contains a session ticket from the > first session, triggering the issue. AFAIK Postgres doesn't support session resumption. If I am correctly understanding what that is supposed to provide, it would require saving all of a backend's internal state on the off chance that somebody would request resuming the session later. I do not think we are going there. The idea makes sense for servers with relatively lightweight per-session state, but that ain't us. I think what you need to do is tell SslStream not to expect that PG servers will do session resumption. (I'm a bit astonished that that would be its default assumption in the first place, but whatever.) regards, tom lane
[HACKERS] PostgreSQL not setting OpenSSL session id context?
Dear hackers, a long-standing issue reported by users of the Npgsql .NET driver for PostgreSQL may have its roots on the PostgreSQL side. I'm far from being an SSL/OpenSSL expert so please be patient if the terms/analysis are incorrect. When trying to connect with Npgsql to PostgreSQL with client authentication (PG has ssl_ca_file set), the first connection works just fine. The second connection, however, fails and the PostgreSQL logs contain the message "session id context uninitialized". This occurs when using .NET's default SSL implementation, SslStream, which supports session resumption - the session connection's ClientHello message contains a session ticket from the first session, triggering the issue. From some research, it seems that for session resumption/reuse to work, the SSL/TLS server must call SSL_CTX_set_session_id_context and/or SSL_set_session_id_context with some arbitrary binary data, to distinguish between contexts/applications. A grep in the PostgreSQL source for "set_session_id_context" doesn't yield anything. Can someone with more knowledge confirm whether an issue exists on the PostgreSQL side? If so, it seems completely trivial to fix this. Thanks, Shay
Re: [HACKERS] PL_stashcache, or, what's our minimum Perl version?
Moin, On Sun, July 30, 2017 12:22 pm, Tom Lane wrote: > "Tels" writes: >> On Sun, July 30, 2017 1:21 am, Tom Lane wrote: So the question is, does anyone care? I wouldn't except that our documentation appears to claim that we work with Perl "5.8 or later". > >> Not sure how often People use old Perl versions out in the field. I'd >> venture this is either happens with "ancient" stuff, e.g. where people >> just can't or want upgrade. >> Otherwise, an up-to-date OS is just necessary for security, anyway, and >> that would contain a Perl from this decade, wouldn't it? > > Well, that's not really the point, IMO. The reason I'm interested in > this is the same reason I run some buildfarm critters on ancient > platforms: if we do something that breaks backwards compatibility > with old software, we should know it and make a deliberate decision > that it's okay. (And update the relevant compatibility claims in our > docs.) Moving the compatibility goalposts without knowing it isn't > good, especially if it happens in supposedly-stable release branches. Sure, I was just pointing out that moving the goalpost forward knowingly, as you put it, can be ok vs. trying to support ancient software at all costs. Reads to me as we are in agreement here. >>> I am unable to confirm our claim that we work with Test::More 0.82, >>> because CPAN has only versions from a year or three back. (Anyone >>> know of a more, er, comprehensive archive than CPAN?) It's also >>> interesting to speculate about how old a version of IPC::Run is new >>> enough, but I see no easy way to get much data on that either. > >> Test::More has been bundled with Perl since 5.6.2 (you can use >> "corelist" >> to check for these things), so if all fails, it might be possible to >> extract a version from a Perl distribution [4]. > > Yeah, I looked into that. The closest candidate I can find is that > perl 5.10.1 contains Test::More 0.92.
However, it's not real clear > to me exactly which files I'd need to pull out of 5.10.1 and inject into > an older tarball --- the layout seems a lot different from a standalone > package. Module distributions contain a MANIFEST like this: http://search.cpan.org/~exodist/Test-Simple/MANIFEST And while reconstructing a module for an old version can be quite a hassle, it looks like Test::More is just plain Perl and only uses Test::Builder::Module. So basically the two files: http://search.cpan.org/src/EXODIST/Test-Simple-1.302086/lib/Test/More.pm http://search.cpan.org/src/EXODIST/Test-Simple-1.302086/lib/Test/Builder/Module.pm might do the trick. Unless PG also uses Test::Simple or other modules, which I haven't checked; then you just need to add these, too. And basically, you put these files into a subdirectory, and use: use lib "mydir"; to tell Perl to load modules from there first. Here is a sample archive with a modern Test::More and an old Test::More from 5.10.1. There are two scripts to see how they are loaded and used. http://bloodgate.com/tmp/test-more-test.tar.gz One thing to watch out for is that this would use the old Test::More, but any module it loads (or the script uses) comes from the system Perl. So the test might not be 100% accurate, f.i. if a newer IO::Scalar is used with the old Test::More. You could also compile and install an old Perl version into a local subdirectory and test with that. That takes a bit more time, though. Hope this helps, Tels
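[Editor's illustration] Tels's `use lib "mydir"` trick has direct analogues in other languages. As a rough sketch (the `shim` module and directory here are invented for the example), the same "load my local copy before the system one" behaviour looks like this in Python:

```python
import os
import sys
import tempfile

# Drop an "old" copy of a module into a private directory.
libdir = tempfile.mkdtemp()
with open(os.path.join(libdir, "shim.py"), "w") as f:
    f.write("VERSION = 'local-copy'\n")

# Equivalent of Perl's `use lib "mydir"`: prepending to sys.path makes
# this directory searched before the system module path, so the local
# copy of the module wins over any system-wide installation.
sys.path.insert(0, libdir)

import shim
print(shim.VERSION)  # local-copy
```

As with Tels's caveat, anything the shimmed module itself imports still comes from the system installation, so mixed-version testing is not fully isolated.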
[HACKERS] [GSOC][weekly report 8] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions
In the last week, I focused on tuning the performance of the skip list and fixed several bugs. 1. As only out-conflicts are checked in RWConflictExists, I removed all modifications concerned with in-conflicts; 2. If the conflict list is too short, I inserted an item just like inserting into an ordered linked list, that is, comparing items existing in the list one by one to find the right position. Using a skip list is slow when the list is short. 3. I stopped using the abstract skip list interface; I was afraid that it would bring a little overhead, but results showed that it had no influence. I will roll it back if necessary. Sadly, the performance is not better than the original version, but it is the highest among all my previous attempts. It behaves like the hash table approach: tracking conflicts is faster but inserting conflicts is slower. In a quick test, the CPU cycles consumed by the two functions RWConflictExists/SetRWConflict are as below.

                | RWConflictExists | SetRWConflict
    linked list | 1.39%            | 0.14%
    skip list   | 1.15%            | 0.35%

According to the suggestions of Alvaro, I'll give a detailed performance report tomorrow. -- Sincerely Mengxing Liu skip-list-for-conflict-tracking.patch Description: Binary data
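[Editor's illustration] For readers unfamiliar with the data structure being tuned here, this is a minimal skip list sketch in Python (insert and search only, with randomized levels). It shows the general technique, not the C implementation in the attached patch:

```python
import random

MAX_LEVEL = 8  # cap on tower height; real implementations size this to N

class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level  # one next-pointer per level

class SkipList:
    def __init__(self):
        self.head = Node(None, MAX_LEVEL)  # sentinel with full tower
        self.level = 1

    def _random_level(self):
        # Each extra level appears with probability 1/2.
        lvl = 1
        while lvl < MAX_LEVEL and random.random() < 0.5:
            lvl += 1
        return lvl

    def search(self, key):
        node = self.head
        # Descend from the highest level, moving right while keys are smaller.
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

    def insert(self, key):
        update = [self.head] * MAX_LEVEL  # rightmost node left of key, per level
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = Node(key, lvl)
        for i in range(lvl):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new
```

The report's point 2 corresponds to skipping this machinery entirely for short lists, where a plain ordered scan is cheaper than maintaining the towers.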
Re: [HACKERS] [GSOC] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions
Thanks for your reply. Actually, the result without "rdtsc" is reasonable. I used perf to analyze the performance and found that even though the function tracking conflicts (RWConflictExists) was faster, the function inserting conflicts (SetRWConflict) was much slower. According to your suggestion, I found there were many more wait events on predicate_lock_manager. That means the slower SetRWConflict resulted in more lock conflicts. So I took some effort to make it faster in the last days. Why is the code with "rdtsc" much faster? I thought that might be caused by some mistake. When I changed to another machine to run the code, this phenomenon didn't happen anymore. -Original Message- From: "Robert Haas" Sent Time: 2017-07-29 02:46:47 (Saturday) To: "Mengxing Liu" Cc: "Alvaro Herrera" , kgrittn , "pgsql-hackers@postgresql.org" Subject: Re: [HACKERS] [GSOC] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions On Wed, Jul 26, 2017 at 11:41 AM, Mengxing Liu wrote: Hi, all. There was a very strange phenomenon I couldn't explain. So I was wondering if you can help me. I was trying to replace the linked list with a skip list in the serializable transaction object for faster conflict tracking. But the performance is bad. So I used the instruction "rdtsc" to compare the speed of my skip list and the original linked list. The skip list was about 1.5x faster. The interesting thing is that if I added the instruction "rdtsc" at the end of the function "RWConflictExists", the performance of the whole system was increased by at most 3 times! Here is the result.

    benchmarks        | without rdtsc | with rdtsc
    simple read/write | 4.91          | 14.16
    ssibench          | 9.72          | 10.24
    tpcb              | 26.45         | 26.38

(The simple read/write benchmark has the highest number of conflicts.) The patch is attached.
All the difference of the two columns is with/without a simple line of code: __asm__ __volatile__ ("rdtsc"); But I don't know why this instruction will influence the performance so much! Lock contention is really expensive, so a slight delay that is just long enough to prevent the contention from happening can sometimes improve performance. This example is surprisingly dramatic, though. Of course, we can't commit it this way -- it will break on non-x86. I would suggest that you gather information on what wait events are occurring in the "without rdtsc" case. Like this: $ script $ psql psql=> select wait_event from pg_stat_activity; psql=> \watch 0.5 ...run test in another window... ^C \q ^D ...use awk or perl or something to count up the wait events and see where the contention is happening... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sincerely Mengxing Liu
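[Editor's illustration] The "awk or perl or something" step at the end of Robert's recipe is just a tally over the captured wait_event column. A hedged sketch (the sample lines are made up; a real run would read the `script` transcript of the psql \watch output instead):

```python
from collections import Counter

# Stand-in for the wait_event values captured from repeated
# `select wait_event from pg_stat_activity` snapshots.
sample = """\
predicate_lock_manager
ClientRead
predicate_lock_manager
WALWriteLock
predicate_lock_manager
"""

counts = Counter(line.strip() for line in sample.splitlines() if line.strip())
for event, n in counts.most_common():
    print(n, event)  # most frequently sampled (most contended) event first
```

The event that dominates this tally is where the contention is happening, which is exactly the diagnosis Mengxing reports for predicate_lock_manager above.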
Re: [HACKERS] PL_stashcache, or, what's our minimum Perl version?
Noah Misch writes: > On Sun, Jul 30, 2017 at 12:05:10PM -0400, Tom Lane wrote: >> Well, OK, but I'd still like to tweak configure so that it records >> an absolute path for prove rather than just setting PROVE=prove. >> That way you'd at least be able to tell from the configure log >> whether you are possibly at risk. > That's an improvement. The reason it does that seems to be that we use AC_CHECK_PROGS rather than AC_PATH_PROGS for locating "prove". I can see no particular consistency to the decisions made in configure.in about which to use: AC_CHECK_PROGS(GCOV, gcov) AC_CHECK_PROGS(LCOV, lcov) AC_CHECK_PROGS(GENHTML, genhtml) AC_CHECK_PROGS(DTRACE, dtrace) AC_CHECK_PROGS(XML2_CONFIG, xml2-config) AC_CHECK_PROGS(DBTOEPUB, dbtoepub) AC_CHECK_PROGS(XMLLINT, xmllint) AC_CHECK_PROGS(XSLTPROC, xsltproc) AC_CHECK_PROGS(OSX, [osx sgml2xml sx]) AC_CHECK_PROGS(FOP, fop) AC_CHECK_PROGS(PROVE, prove) versus AC_PATH_PROG(TAR, tar) PGAC_PATH_BISON PGAC_PATH_FLEX PGAC_PATH_PERL PGAC_PATH_PYTHON AC_PATH_PROG(ZIC, zic) PGAC_PATH_TCLCONFIGSH([$with_tclconfig]) I'm tempted to propose that we switch *all* of these uses of AC_CHECK_PROGS to AC_PATH_PROGS. regards, tom lane
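[Editor's illustration] The behavioural difference Tom is pointing at (record only the bare program name vs. record the absolute path actually found) can be mimicked in a few lines. A hedged sketch, with `check_prog`/`path_prog` as invented names standing in for the two Autoconf macros:

```python
import shutil

def check_prog(name):
    # AC_CHECK_PROGS-style: confirm the program exists on PATH, but
    # record only the bare name, so the log cannot show which binary
    # would actually be run.
    return name if shutil.which(name) else None

def path_prog(name):
    # AC_PATH_PROGS-style: record the absolute path that was found,
    # so a prove/perl mismatch is visible in the configure log.
    return shutil.which(name)

print(check_prog("sh"))  # e.g. sh
print(path_prog("sh"))   # e.g. /bin/sh
```

Both checks succeed on the same system, but only the second leaves a log entry you can correlate with the interpreter chosen elsewhere in configure.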
Re: [HACKERS] PL_stashcache, or, what's our minimum Perl version?
On Sun, Jul 30, 2017 at 12:05:10PM -0400, Tom Lane wrote: > Noah Misch writes: > > On Sun, Jul 30, 2017 at 01:21:28AM -0400, Tom Lane wrote: > >> I think it'd be a good idea to insist that "prove" be in > >> the same directory we found "perl" in. > > > Nah; on my machines, I use /usr/bin/perl and ~/sw/cpan/bin/prove. The > > latter > > is built against the former, so there's no particular hazard. > > Well, OK, but I'd still like to tweak configure so that it records > an absolute path for prove rather than just setting PROVE=prove. > That way you'd at least be able to tell from the configure log > whether you are possibly at risk. That's an improvement.
Re: [HACKERS] PL_stashcache, or, what's our minimum Perl version?
"Tels" writes: > On Sun, July 30, 2017 1:21 am, Tom Lane wrote: >>> So the question is, does anyone care? I wouldn't except that our >>> documentation appears to claim that we work with Perl "5.8 or later". > Not sure how often People use old Perl versions out in the field. I'd > venture this is either happens with "ancient" stuff, e.g. where people > just can't or want upgrade. > Otherwise, an up-to-date OS is just necessary for security, anyway, and > that would contain a Perl from this decade, wouldn't it? Well, that's not really the point, IMO. The reason I'm interested in this is the same reason I run some buildfarm critters on ancient platforms: if we do something that breaks backwards compatibility with old software, we should know it and make a deliberate decision that it's okay. (And update the relevant compatibility claims in our docs.) Moving the compatibility goalposts without knowing it isn't good, especially if it happens in supposedly-stable release branches. >> I am unable to confirm our claim that we work with Test::More 0.82, >> because CPAN has only versions from a year or three back. (Anyone >> know of a more, er, comprehensive archive than CPAN?) It's also >> interesting to speculate about how old a version of IPC::Run is new >> enough, but I see no easy way to get much data on that either. > Test::More has been bundled with Perl since 5.6.2 (you can use "corelist" > to check for these things), so if all fails, it might be possible to > extract a version from a Perl distribution [4]. Yeah, I looked into that. The closest candidate I can find is that perl 5.10.1 contains Test::More 0.92. However, it's not real clear to me exactly which files I'd need to pull out of 5.10.1 and inject into an older tarball --- the layout seems a lot different from a standalone package. regards, tom lane
Re: [HACKERS] PL_stashcache, or, what's our minimum Perl version?
Noah Misch writes: > On Sun, Jul 30, 2017 at 01:21:28AM -0400, Tom Lane wrote: >> I think it'd be a good idea to insist that "prove" be in >> the same directory we found "perl" in. > Nah; on my machines, I use /usr/bin/perl and ~/sw/cpan/bin/prove. The latter > is built against the former, so there's no particular hazard. Well, OK, but I'd still like to tweak configure so that it records an absolute path for prove rather than just setting PROVE=prove. That way you'd at least be able to tell from the configure log whether you are possibly at risk. regards, tom lane
Re: [HACKERS] PL_stashcache, or, what's our minimum Perl version?
On Sun, Jul 30, 2017 at 01:21:28AM -0400, Tom Lane wrote: > I wrote: > > So the question is, does anyone care? I wouldn't except that our > > documentation appears to claim that we work with Perl "5.8 or later". > > And the lack of field complaints suggests strongly that nobody else > > cares. So I'm inclined to think we just need to be more specific > > about the minimum Perl version --- but what exactly? > > I've done some more research on this. It seems to me there are four > distinct components to any claim about whether we work with a particular > Perl version: > > 1. Can it run the build-related Perl scripts needed to build PG from > a bare git checkout, on non-Windows platforms? > 2. Can it run our MSVC build scripts? > 3. Does pl/perl compile (and pass its regression tests) against it? > 4. Can it run our TAP tests? > 5.8.3 does appear to succeed at points 1,3,4. Now, to get through > the TAP tests you need to install IPC::Run, and you also need to > update Test::More because the version shipped with 5.8.3 is too old. > But at least the failures that you get from lacking these are pretty > clear. > Anyway, pending some news about compatibility of the MSVC scripts, > I think we ought to adjust our docs to state that 5.8.3 is the > minimum supported Perl version. Works for me. I wouldn't wait for testing of the MSVC scripts. Strawberry Perl started with 5.8.8. ActivePerl is far older, but it distributes old versions to subscription holders only. Besides, the main advantage of old-version support is letting folks use a packaged/preinstalled binary, and that doesn't apply on Windows. > Also, I got seriously confused at one point during these tests because, > while our configure script carefully sets PERL to an absolute path name, > it's content to set PROVE to "prove", paying no attention to whether > that version of "prove" matches "perl". Is it really okay to run the > TAP tests with a different perl version than we selected for other > purposes? 
Typically yes, though one can construct exceptions. > I think it'd be a good idea to insist that "prove" be in > the same directory we found "perl" in. Nah; on my machines, I use /usr/bin/perl and ~/sw/cpan/bin/prove. The latter is built against the former, so there's no particular hazard. nm
Re: [HACKERS] PL_stashcache, or, what's our minimum Perl version?
Moin Tom, On Sun, July 30, 2017 1:21 am, Tom Lane wrote: > I wrote: >> So the question is, does anyone care? I wouldn't except that our >> documentation appears to claim that we work with Perl "5.8 or later". >> And the lack of field complaints suggests strongly that nobody else >> cares. So I'm inclined to think we just need to be more specific >> about the minimum Perl version --- but what exactly? My 0.02 cent on Perl versions: Not sure how often people use old Perl versions out in the field. I'd venture this either happens with "ancient" stuff, e.g. where people just can't or won't upgrade. Otherwise, an up-to-date OS is just necessary for security anyway, and that would contain a Perl from this decade, wouldn't it? If someone is still running Perl 5.8.3, they also got glorious Unicode support circa Unicode 4.0 or thereabouts... > I've done some more research on this. It seems to me there are four > distinct components to any claim about whether we work with a particular > Perl version: > > 1. Can it run the build-related Perl scripts needed to build PG from > a bare git checkout, on non-Windows platforms? > 2. Can it run our MSVC build scripts? > 3. Does pl/perl compile (and pass its regression tests) against it? > 4. Can it run our TAP tests? > > I have no info to offer about point #2, but I did some tests with > ancient 5.8.x Perl versions on my buildfarm hosts prairiedog and > gaur. (I would have liked to use something faster, but these Perl > versions failed to build at all on more recent platforms, and > I thought it rather pointless to try to hack them enough to build.) > > I find that perl 5.8.0 and later seem to succeed at point #1, > but you need at least 5.8.1 to compile plperl. Also, on prairiedog > 5.8.1 fails plperl's regression tests because of some seemingly > utf8-related bug. That may be an ancient-macOS problem more than > anything else, so I didn't poke at it too hard.
> > The really surprising thing I found out is that you can't run the > TAP tests on anything older than 5.8.3, because that's when they > added "prove". I'm bemused by the idea that such a fundamental > facility would get added in a third-digit minor release, but there > you have it. (Now, in fairness, this amounted to importing a feature > that already existed on CPAN, but still...) > > 5.8.3 does appear to succeed at points 1,3,4. Now, to get through > the TAP tests you need to install IPC::Run, and you also need to > update Test::More because the version shipped with 5.8.3 is too old. > But at least the failures that you get from lacking these are pretty > clear. > > I am unable to confirm our claim that we work with Test::More 0.82, > because CPAN has only versions from a year or three back. (Anyone > know of a more, er, comprehensive archive than CPAN?) It's also > interesting to speculate about how old a version of IPC::Run is new > enough, but I see no easy way to get much data on that either. > > Anyway, pending some news about compatibility of the MSVC scripts, > I think we ought to adjust our docs to state that 5.8.3 is the > minimum supported Perl version. > > Also, I got seriously confused at one point during these tests because, > while our configure script carefully sets PERL to an absolute path name, > it's content to set PROVE to "prove", paying no attention to whether > that version of "prove" matches "perl". Is it really okay to run the > TAP tests with a different perl version than we selected for other > purposes? I think it'd be a good idea to insist that "prove" be in > the same directory we found "perl" in. Thank you for the analysis. I agree about "prove". As for Test::More: Test::More has been bundled with Perl since 5.6.2 (you can use "corelist" to check for these things), so if all fails, it might be possible to extract a version from a Perl distribution [4]. 
CPAN authors are encouraged to clean out old versions due to the sheer size of the archive (not all got the memo, tho... *cough*), and most mirrors only carry the current files. The original author is Michael G. Schwern [0], but it seems he cleaned house :) You might have luck with a mirror [1] that didn't clean out, or with archive.org.

With Test::More it gets a bit confusing, as it is part of Test::Simple [2], which in turn is part of Test2, which is now on github [3]. It's Test modules all the way down. I'm not sure you'd find old Test::More versions ready-to-use in there. My apologies if you knew that already.

However, I do happen to have a large archive with Perl releases and CPAN modules. It was first mirrored in mid-2015, so anything that was deleted before 2015 I unfortunately can't help you with. But if you need a specific module version, just ping me and I can see if it's in there.

Hope this helps,

Tels

[0]: http://ftp.nluug.nl/languages/perl/CPAN/authors/id/M/MS/MSCHWERN/
[1]: http://mirrors.cpan.org/
[2]: http://search.cpan.org/~exodist/Test-Simple-1.302086/
[3]:
Re: [HACKERS] Page Scan Mode in Hash Index
Hi,

On Wed, May 10, 2017 at 2:28 PM, Ashutosh Sharma wrote:
> While doing the code coverage testing of the v7 patch shared with [1], I found that there are a few lines of code in _hash_next() that are redundant and need to be removed. I basically came to know this while testing the scenario where a hash index scan starts when a split is in progress. I have removed those lines and attached is the newer version of the patch.

Please find the new version of the patches for page-at-a-time scan in hash index, rebased on top of the latest commit in the master branch. Also, I have run the pgindent script with pg_bsd_indent version 2.0 on all the modified files. Thanks.

> [1] - https://www.postgresql.org/message-id/CAE9k0Pmn92Le0iOTU5b6087eRXzUnkq1PKcihxtokHJ9boXCBg%40mail.gmail.com
>
> On Mon, May 8, 2017 at 6:55 PM, Ashutosh Sharma wrote:
>> Hi,
>>
>> On Tue, Apr 4, 2017 at 3:56 PM, Amit Kapila wrote:
>>> On Sun, Apr 2, 2017 at 4:14 AM, Ashutosh Sharma wrote:
>>>> Please note that these patches need to be applied on top of [1].
>>>
>>> Few more review comments:
>>>
>>> 1.
>>> - page = BufferGetPage(so->hashso_curbuf);
>>> + blkno = so->currPos.currPage;
>>> + if (so->hashso_bucket_buf == so->currPos.buf)
>>> + {
>>> + buf = so->currPos.buf;
>>> + LockBuffer(buf, BUFFER_LOCK_SHARE);
>>> + page = BufferGetPage(buf);
>>> + }
>>>
>>> Here, you are assuming that only the bucket page can be pinned, but I think that assumption is not right. When _hash_kill_items() is called before moving to the next page, there could be a pin on an overflow page. You need some logic to check if the buffer is pinned, and then just lock it. I think once you do that, it might not be convenient to release the pin at the end of this function.
>>
>> Yes, there are a few cases where we might have a pin on overflow pages too, and we need to handle such cases in _hash_kill_items. I have taken care of this in the attached v7 patch. Thanks.
>>
>>> Add some comments on top of _hash_kill_items to explain the new changes, or say something like "See _bt_killitems for details".
>>
>> Added some more comments on the new changes.
>>
>>> 2.
>>> + /*
>>> + * We save the LSN of the page as we read it, so that we know whether it
>>> + * safe to apply LP_DEAD hints to the page later. This allows us to drop
>>> + * the pin for MVCC scans, which allows vacuum to avoid blocking.
>>> + */
>>> + so->currPos.lsn = PageGetLSN(page);
>>> +
>>>
>>> The second part of the above comment doesn't make sense, because vacuum will still be blocked as we hold the pin on the primary bucket page.
>>
>> That's right. It doesn't make sense because we won't allow vacuum to start. I have removed it.
>>
>>> 3.
>>> + {
>>> + /*
>>> + * No more matching tuples were found. return FALSE
>>> + * indicating the same. Also, remember the prev and
>>> + * next block number so that if fetching tuples using
>>> + * cursor we remember the page from where to start the
>>> + * scan.
>>> + */
>>> + so->currPos.prevPage = (opaque)->hasho_prevblkno;
>>> + so->currPos.nextPage = (opaque)->hasho_nextblkno;
>>>
>>> You can't read opaque without holding the lock, and by this time it has already been released.
>>
>> I have corrected it. Please refer to the attached v7 patch.
>>
>>> Also, I think if you want to save the position for cursor movement, then you need to save the position of the last bucket when the scan completes the overflow chain; however, as you have written it, it will always be an invalid block number. I think there is a similar problem with the prev block number.
>>
>> Did you mean last bucket or last page? If it is the last page, then I am already storing it.
>>
>>> 4.
>>> +_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum,
>>> + OffsetNumber maxoff, ScanDirection dir)
>>> +{
>>> + HashScanOpaque so = (HashScanOpaque) scan->opaque;
>>> + IndexTuple itup;
>>> + int itemIndex;
>>> +
>>> + if (ScanDirectionIsForward(dir))
>>> + {
>>> + /* load items[] in ascending order */
>>> + itemIndex = 0;
>>> +
>>> + /* new page, relocate starting position by binary search */
>>> + offnum = _hash_binsearch(page, so->hashso_sk_hash);
>>>
>>> What is the need to find the offset number when this function already has it as an input parameter?
>>
>> It's not required. I have removed it.
>>
>>> 5.
>>> +_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum,
>>> + OffsetNumber maxoff, ScanDirection dir)
>>>
>>> I think maxoff is not required to be passed as a parameter to this function, as you need it only for a forward scan. You can compute it locally.
>>
>> It is required for both forward and backward scans. However, I am not passing it to _hash_load_qualified_items() now.
>>
>>> 6.
>>> +_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum,
>>> + OffsetNumber