Re: pgsql: Allow extensions to install built as well as unbuilt headers.
> "Michael" == Michael Paquier writes: >> What version of GNU Make is on there, do you know? Tom? I don't see >> it mentioned in the output anywhere. Michael> I don't know it. Consulting old manuals suggests it may have version 3.80 (c. 2002), which lacks the 'else if..' syntax introduced in 3.81 (c. 2006). -- Andrew (irc:RhodiumToad)
Re: pgsql: Allow extensions to install built as well as unbuilt headers.
On Thu, Sep 06, 2018 at 06:26:51AM +0100, Andrew Gierth wrote:
> What version of GNU Make is on there, do you know? Tom? I don't see it
> mentioned in the output anywhere.

I don't know it. What I can see is that the use of "else ifdef" is
forbidden. You could work around the problem with more ifdef-only
tweaks, but it seems to me that it would be cleaner to somewhat reduce
the level of dependency between all the different header concepts.

By the way, when you nest two or more levels of ifdef/endif in pgxs.mk,
putting a comment next to each endif to say which conditional it closes
improves readability.

--
Michael
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello.

At Wed, 05 Sep 2018 20:02:04 +0900, Etsuro Fujita wrote in
<5b8fb7ac.5020...@lab.ntt.co.jp>:

> > I spent more time looking at the patch. ISTM that the patch well
> > suppresses the effect of the tuple-descriptor expansion by making
> > changes to code in the planner and executor (and ruleutils.c), but
> > I'm still not sure that the patch is the right direction to go in,
> > because ISTM that expanding the tuple descriptor on the fly might be
> > a wart.
>
> > > The expansion should be safe if the expanded descriptor has the
> > > same definitions for base columns and all the extended columns are
> > > junk. The junk columns should be ignored by unrelated nodes, and
> > > they are passed safely as long as ForeignModify passes tuples
> > > as-is from the underlying ForeignScan to ForeignUpdate/Delete.
>
> > I'm not sure that would be really safe. Does that work well with
> > EvalPlanQual, for example?
>
> I was wrong here; I assumed that we supported late locking for an
> UPDATE or DELETE on a foreign table, and I was a bit concerned that
> the approach you proposed might not work well with EvalPlanQual, but
> as described in fdwhandler.sgml, the core doesn't support that:
>
>     For an UPDATE or DELETE on a foreign table, it is recommended that
>     the ForeignScan operation on the target table perform early
>     locking on the rows that it fetches, perhaps via the equivalent of
>     SELECT FOR UPDATE. An FDW can detect whether a table is an
>     UPDATE/DELETE target at plan time by comparing its relid to
>     root->parse->resultRelation, or at execution time by using
>     ExecRelationIsTargetRelation(). An alternative possibility is to
>     perform late locking within the ExecForeignUpdate or
>     ExecForeignDelete callback, but no special support is provided
>     for this.
>
> So, there would be no need to worry about EvalPlanQual. Sorry for the
> noise.

I don't think it was noise at all. Thank you for the pointer.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
Re: pgsql: Allow extensions to install built as well as unbuilt headers.
> "Michael" == Michael Paquier writes: Michael> prairiedog is unhappy with this commit: What version of GNU Make is on there, do you know? Tom? I don't see it mentioned in the output anywhere. -- Andrew.
Re: pgsql: Allow extensions to install built as well as unbuilt headers.
Hi Andrew,

On Wed, Sep 05, 2018 at 09:45:49PM +0000, Andrew Gierth wrote:
> Allow extensions to install built as well as unbuilt headers.
>
> Commit df163230b overlooked the case that an out-of-tree extension
> might need to build its header files (e.g. via ./configure). If it is
> also doing a VPATH build, the HEADERS_* rules in the original commit
> would then fail to find the files, since they would be looking only
> under $(srcdir) and not in the build directory.
>
> Fix by adding HEADERS_built and HEADERS_built_$(MODULE) which behave
> like DATA_built in that they look in the build dir rather than the
> source dir (and also make the files dependencies of the "all" target).
>
> No Windows support appears to be needed for this, since it is only
> relevant to out-of-tree builds (no support exists in Mkvcbuild.pm to
> build extension header files in any case).

prairiedog is unhappy with this commit:

make -C ../../../contrib/spi
../../src/makefiles/pgxs.mk:140: Extraneous text after `else' directive
../../src/makefiles/pgxs.mk:144: *** only one `else' per conditional.  Stop.
make[2]: *** [submake-contrib-spi] Error 2
make[2]: *** Waiting for unfinished jobs

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2018-09-05%2023%3A21%3A01

--
Michael
Re: Implement predicate propagation for non-equivalence clauses
Hi,

On Wed, Sep 5, 2018 at 2:56 PM, Tom Lane wrote:
> Richard Guo writes:
> > In this patch, we are trying to do the similar deduction from
> > non-equivalence clauses, that is, A=B AND f(A) implies A=B AND f(A)
> > AND f(B), under some restrictions on f.
>
> Uh, *what* restrictions on f()? In general the above equivalence
> does not hold, at least not for any data type more complicated than
> integers; and we do not have any semantic model for deciding
> which functions it would be correct for.

Exactly! The operator in f() should be at least in the same opfamily as
the equivalence class containing A,B. Besides, as far as I can tell,
the clause in f() should not contain volatile functions or subplans.
I am not sure whether these restrictions are enough to make it safe.

> One simple example to show what I'm talking about is that float8 zero
> and minus zero are equal according to float8eq (assuming IEEE float
> arithmetic); but they aren't equivalent for any function f() that is
> sensitive to the sign or the text representation of the value.
> The numeric data type likewise has values that are "equal" without
> being identical for all purposes, eg 0.0 vs 0.000. Or consider
> citext.

Thanks for the example. Heikki materialized this example as:

create table a (f float8);
create table b (f float8);
insert into a values ('0'), ('-0');
insert into b values ('0'), ('-0');
select * from a, b where a.f = b.f and a.f::text <> '-0';

Running that query, this patch gives a wrong result. Will address this
in v2.

> The existing planner deduction rules for equivalence classes are
> carefully designed to ensure that we only generate derived clauses
> using operators from the same operator class or family, so that
> it's on the opclass author to ensure that the operators have
> consistent semantics. I don't think we can extrapolate from that to
> any random function that accepts the datatype.
>
> regards, tom lane
Re: Implement predicate propagation for non-equivalence clauses
On Wed, Sep 5, 2018 at 2:55 PM, Heikki Linnakangas wrote:
> On 05/09/18 09:34, Richard Guo wrote:
> > Hi,
> >
> > As we know, the current planner will generate additional restriction
> > clauses from equivalence clauses. This will generally lower the
> > total cost because some of the tuples may be filtered out before
> > joins.
> >
> > In this patch, we are trying to do the similar deduction from
> > non-equivalence clauses, that is, A=B AND f(A) implies A=B AND f(A)
> > AND f(B), under some restrictions on f.
>
> I haven't read the patch in detail, but that really only applies under
> special circumstances. Tom caught me making that assumption just
> recently
> (https://www.postgresql.org/message-id/8003.1527092720%40sss.pgh.pa.us).
> I think the restriction here is that f(x) must be an operator that's
> in the same operator family as the = operator. In a quick
> read-through, it's not clear to me what conditions are in the patch
> now. Please have a comment somewhere to list them explicitly.

Right. Above all, the operator in f(x) should be in the same opfamily
as the equivalence class. We neglected that in this patch, and it can
result in a wrong plan. In addition, f(x) should not contain volatile
functions or subplans. Will address this in v2 and list the conditions
in a comment. Thanks!

> > This patch will introduce extra cost for relation scans, due to the
> > cost of evaluating the new implied quals. Meanwhile, since the extra
> > filter may reduce the number of tuples returned by the scan, it may
> > lower the cost of following joins. So, whether we will get a better
> > plan depends on the selectivity of the implied quals.
>
> Perhaps we should evaluate the selectivity of the clause, and only add
> them if they seem helpful, based on the cost vs. selectivity?
>
> At least in this case from the regression tests:
>
>  explain (costs off)
>    select * from ec0 a, ec1 b
>    where a.ff = b.ff and a.ff = 43::bigint::int8alias1;
>  -          QUERY PLAN
>  ------------------------------------------
>  +                        QUERY PLAN
>  +----------------------------------------------------------
>   Nested Loop
>     -> Index Scan using ec0_pkey on ec0 a
>          Index Cond: (ff = '43'::int8alias1)
>     -> Index Scan using ec1_pkey on ec1 b
>          Index Cond: (ff = a.ff)
>  -        Filter: (f1 < '5'::int8alias1)
>  +        Filter: ((f1 < '5'::int8alias1) AND (ff = '43'::int8alias1))
>   (6 rows)
>
> the new qual is redundant with the Index Condition. If we could avoid
> generating such redundant quals, that would be good.

Nice point. I am not sure how complex it would be to evaluate the
selectivity of the new qual before applying it, but that deserves a
try.

> - Heikki
Re: JIT compiling with LLVM v12
On Wed, Sep 05, 2018 at 11:55:39AM -0700, Andres Freund wrote:
> On 2018-08-22 06:20:21 +0000, Noah Misch wrote:
> > I see jit slows the regression tests considerably:
>
> Is this with LLVM assertions enabled or not?

Without, I think. I configured them like this:

cmake -G Ninja -DCMAKE_INSTALL_PREFIX=$HOME/sw/nopath/llvm -DCMAKE_BUILD_TYPE=MinSizeRel -DLLVM_USE_LINKER=gold -DLLVM_TARGETS_TO_BUILD=X86 ../llvm
cmake -G Ninja -DCMAKE_INSTALL_PREFIX=$HOME/sw/nopath/llvm-el32 -DCMAKE_BUILD_TYPE=MinSizeRel -DLLVM_USE_LINKER=gold -DLLVM_PARALLEL_LINK_JOBS=1 ../llvm

> > # mips32el, assert, w/o llvm (buildfarm member topminnow) [1]
> > 28min install-check-*
> > 35min check-pg_upgrade
> >
> > # mips32el, assert, w/ llvm 6.0.1 [1]
> > 63min install-check-*
> > 166min check-pg_upgrade
>
> But this seems so absurdly large a difference that I kinda think LLVM
> assertions (which are really expensive and add O(N) operations in a
> bunch of places) might be to blame.

The 2018-08-25 and 2018-09-01 published runs were far less bad. Most of
the blame goes to the reason given in the footnote (competing load on a
shared machine), not to JIT.
Re: pgsql: Clean up after TAP tests in oid2name and vacuumlo.
On Wed, Sep 05, 2018 at 01:05:42PM -0700, Michael Paquier wrote:
> As far as I can see, the proposal is not stone-cold dead and could
> have a future, so I'll begin a new thread with a more complete patch.

Done. Let's move the discussion here, then:
https://www.postgresql.org/message-id/20180906014849.gg2...@paquier.xyz

--
Michael
Add extension options to control TAP and isolation tests
Hi all,

A recent thread on pgsql-committers discussed the fact that we lacked a
bit of infrastructure to allow extensions to control isolation and TAP
tests:
https://www.postgresql.org/message-id/20180905174527.ga2...@paquier.xyz

Attached is a patch which is the result of the previous thread, where a
couple of variables are added for extension authors:
- ISOLATION, similar to REGRESS for pg_regress, which lists isolation
  tests.
- ISOLATION_OPTS, which can be used to pass an option set to
  pg_isolation_regress.
- TAP_TESTS, a switch to enable running TAP tests.

This results in a bit of cleanup in the Makefiles of modules we ship
with upstream:
- modules/brin
- modules/commit_ts
- modules/test_pg_dump
- contrib/bloom, where the TAP tests were actually not running.
- contrib/oid2name
- contrib/test_decoding, which keeps the same feature set, and is
  heavily cleaned up (it missed for example the use of the existing
  NO_INSTALLCHECK).
- contrib/vacuumlo

Thoughts?
--
Michael

diff --git a/contrib/bloom/Makefile b/contrib/bloom/Makefile
index 13bd397b70..da9553a2d0 100644
--- a/contrib/bloom/Makefile
+++ b/contrib/bloom/Makefile
@@ -8,6 +8,7 @@ DATA = bloom--1.0.sql
 PGFILEDESC = "bloom access method - signature file based index"
 
 REGRESS = bloom
+TAP_TESTS = 1
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
@@ -19,6 +20,3 @@ top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 include $(top_srcdir)/contrib/contrib-global.mk
 endif
-
-wal-check: temp-install
-	$(prove_check)
diff --git a/contrib/oid2name/Makefile b/contrib/oid2name/Makefile
index 908e078714..361a80a7a1 100644
--- a/contrib/oid2name/Makefile
+++ b/contrib/oid2name/Makefile
@@ -6,11 +6,11 @@ PGAPPICON = win32
 PROGRAM = oid2name
 OBJS = oid2name.o $(WIN32RES)
 
+TAP_TESTS = 1
+
 PG_CPPFLAGS = -I$(libpq_srcdir)
 PG_LIBS_INTERNAL = $(libpq_pgport)
 
-EXTRA_CLEAN = tmp_check
-
 ifdef USE_PGXS
 PG_CONFIG = pg_config
 PGXS := $(shell $(PG_CONFIG) --pgxs)
@@ -21,9 +21,3 @@ top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 include $(top_srcdir)/contrib/contrib-global.mk
 endif
-
-check:
-	$(prove_check)
-
-installcheck:
-	$(prove_installcheck)
diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index afcab930f7..8cd83a763f 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -3,9 +3,20 @@ MODULES = test_decoding
 PGFILEDESC = "test_decoding - example of a logical decoding output plugin"
 
-# Note: because we don't tell the Makefile there are any regression tests,
-# we have to clean those result files explicitly
-EXTRA_CLEAN = $(pg_regress_clean_files)
+EXTRA_INSTALL=contrib/test_decoding
+
+REGRESS = ddl xact rewrite toast permissions decoding_in_xact \
+	decoding_into_rel binary prepared replorigin time messages \
+	spill slot truncate
+ISOLATION = mxact delayed_startup ondisk_startup concurrent_ddl_dml \
+	oldest_xmin snapshot_transfer
+
+REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
+ISOLATION_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
+
+# Disabled because these tests require "wal_level=logical", which
+# typical installcheck users do not have (e.g. buildfarm clients).
+NO_INSTALLCHECK = 1
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
@@ -18,52 +29,8 @@ include $(top_builddir)/src/Makefile.global
 include $(top_srcdir)/contrib/contrib-global.mk
 endif
 
-# Disabled because these tests require "wal_level=logical", which
-# typical installcheck users do not have (e.g. buildfarm clients).
-installcheck:;
-
-# But it can nonetheless be very helpful to run tests on preexisting
-# installation, allow to do so, but only if requested explicitly.
-installcheck-force: regresscheck-install-force isolationcheck-install-force
-
-check: regresscheck isolationcheck
-
-submake-regress:
-	$(MAKE) -C $(top_builddir)/src/test/regress all
-
-submake-isolation:
-	$(MAKE) -C $(top_builddir)/src/test/isolation all
-
-submake-test_decoding:
-	$(MAKE) -C $(top_builddir)/contrib/test_decoding
-
-REGRESSCHECKS=ddl xact rewrite toast permissions decoding_in_xact \
-	decoding_into_rel binary prepared replorigin time messages \
-	spill slot truncate
-
-regresscheck: | submake-regress submake-test_decoding temp-install
-	$(pg_regress_check) \
-	    --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf \
-	    $(REGRESSCHECKS)
-
-regresscheck-install-force: | submake-regress submake-test_decoding temp-install
-	$(pg_regress_installcheck) \
-	    $(REGRESSCHECKS)
-
-ISOLATIONCHECKS=mxact delayed_startup ondisk_startup concurrent_ddl_dml \
-	oldest_xmin snapshot_transfer
-
-isolationcheck: | submake-isolation submake-test_decoding temp-install
-	$(pg_isolation_regress_check) \
-	    --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf \
-	    $(ISOLATIONCHECKS)
-
-isolationcheck-install-force: all | submake-isolation submake-test_decoding temp-install
-	$(pg_isolation_regress_installcheck) \
-	    $(ISOLATIONCHECKS)
RE: automatic restore point
-----Original Message-----
From: Yotsunaga, Naoki [mailto:yotsunaga.na...@jp.fujitsu.com]
Sent: Tuesday, June 26, 2018 10:18 AM
To: Postgres hackers
Subject: automatic restore point

> Hi, I attached a patch to output the LSN before execution to the
> server log when executing a specific command and accidentally erasing
> data.

Since there was an error in the attached patch, I have attached a
modified patch.

--
Naoki Yotsunaga
Re: On the need for a snapshot in exec_bind_message()
On 2018-09-05 19:11:56 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2018-09-05 18:45:52 -0400, Tom Lane wrote:
> >> The snapshot has little to do with the query plan, usually. It's
> >> about what view of the database the executed query will see, and
> >> particularly about what view the parameter input functions will
> >> see, if they look.
>
> > The snapshot in exec_bind_message() shouldn't be about what the
> > executed query sees, no? Sure, we'll evaluate input params and might
> > replan etc, but other than that?
>
> As things stand, yeah, it's not used for execution.
>
> > It'd not be insane to rejigger things so there's a version of
> > PushActiveSnapshot() that doesn't eagerly compute the snapshot, but
> > instead does so on first access. Obviously we couldn't use that
> > everywhere, but exec_bind_message() seems like a prime case for it.
>
> I dislike this proposal, because it introduces an element of
> uncertainty into when the snapshot is taken. It doesn't seem a lot
> different from saying that we'll try to save a gettimeofday call by
> postponing calculation of the statement_timestamp until somebody asks
> for it. Then it's no longer the statement start time, but some
> hard-to-define time within execution.

I don't think that really is comparable. The timestamp really exists on
a (well, for our purposes) absolute scale. That's much less clear for
snapshots in read committed: without access to the database or
returning results, the moment the snapshot has been taken really
doesn't mean that much - the session could just have been scheduled out
right before taking the snapshot (or have waited for the procarray
lock, to be more on topic).

Additionally, this isn't the snapshot for the actual query execution -
it's the one for the parameter evaluation. And we *already* take a
separate snapshot for that, distinct from the one actually executing
the query, so skew between the two really can't be meaningful.

> I think the other idea of trying to keep the bind-time snapshot in use
> throughout execution is more worth pursuing.

Hm, that seems to have its own set of problems, some of them
nontrivial. It's e.g. perfectly legal to bind multiple statements and
then execute them - and I think it's quite possible that pgjdbc would
generate something like that. And then there's:

> The main problem with it, now that I think harder, is that we need the
> execution snapshot to be taken after we've acquired whatever locks the
> query needs.

Greetings,

Andres Freund
Re: On the need for a snapshot in exec_bind_message()
> > Queries stop getting re-optimized after 5 times, unless better plans
> > are to be had. In the absence of schema changes or a changing search
> > path, why is the snapshot needed?
>
> The snapshot has little to do with the query plan, usually. It's about
> what view of the database the executed query will see, and
> particularly about what view the parameter input functions will see,
> if they look.
>
> You could maybe argue that immutable input functions shouldn't need
> snapshots, but there are enough not-immutable ones that I don't think
> that gets you very far. In any case, we'd still need a snapshot for
> query execution. The set of queries that could possibly never need
> a snapshot at all is probably not large enough to be interesting.

I would think that the vast majority of bind variables in customer
queries involve built-in data types whose input functions are
immutable.

As far as the "view of the database the executed query will see" goes,
either the database itself changes, which should trigger catcache
invalidations that can be accounted for, or the "view" changes, as with
search_path, which can be tested for. I would think the optimization
would be applicable most of the time. What matters isn't the number of
obscure user-created non-immutable input functions that exist; it is
the frequency with which immutable input functions are used in real
applications. This is a 50% performance improvement in point lookups on
larger boxes.
Re: On the need for a snapshot in exec_bind_message()
Andres Freund writes:
> On 2018-09-05 18:45:52 -0400, Tom Lane wrote:
>> The snapshot has little to do with the query plan, usually. It's
>> about what view of the database the executed query will see, and
>> particularly about what view the parameter input functions will see,
>> if they look.

> The snapshot in exec_bind_message() shouldn't be about what the
> executed query sees, no? Sure, we'll evaluate input params and might
> replan etc, but other than that?

As things stand, yeah, it's not used for execution.

> It'd not be insane to rejigger things so there's a version of
> PushActiveSnapshot() that doesn't eagerly compute the snapshot, but
> instead does so on first access. Obviously we couldn't use that
> everywhere, but exec_bind_message() seems like a prime case for it.

I dislike this proposal, because it introduces an element of
uncertainty into when the snapshot is taken. It doesn't seem a lot
different from saying that we'll try to save a gettimeofday call by
postponing calculation of the statement_timestamp until somebody asks
for it. Then it's no longer the statement start time, but some
hard-to-define time within execution.

I think the other idea of trying to keep the bind-time snapshot in use
throughout execution is more worth pursuing. The main problem with it,
now that I think harder, is that we need the execution snapshot to be
taken after we've acquired whatever locks the query needs. But we
actually know the set of required locks, if we have a cached plan at
hand. (It's *possible* that that set would change, if we're forced to
replan, but it's unlikely.) So you could imagine rejiggering things so
that we do the acquire-locks bit before doing parameter input, and
unless the lock list changes, we can keep the parameter-input snapshot
in force throughout execution. Not quite sure how to make that play
with custom plans though.

regards, tom lane
Re: *_to_xml() should copy SPI_processed/SPI_tuptable
Chapman Flack writes:
> On 09/05/18 18:07, Tom Lane wrote:
>> * Replace SPI_tuptable et al with macros that access fields in the
>> current SPI stack level (similar to the way that, eg, errno works
>> on most modern platforms). This seems do-able, if a bit grotty.

> It would mean they'd have to *be* in the stack frame, where they
> currently aren't;

Right, I was assuming that was obvious ...

> Another alternative might be to have SPI_connect save them and
> SPI_finish put them back, which leaves you just responsible for
> reasoning about your own code. You'd still be expected to save them
> across your own uses of other SPI calls, but no longer exposed to
> spooky action at a distance from nested uses of SPI in stuff you call.

Hmm. I'd thought about that briefly and concluded that it didn't offer
a full fix, but on reflection it's not clear why it'd be any less of a
fix than the macroized-SPI_tuptable approach. You end up with
per-SPI-stack-level storage either way, and while that isn't perfect it
does go a long way, as you say. More, this has the huge advantage of
being back-patchable, because there'd be no API/ABI change.

regards, tom lane
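[For concreteness, the save/restore idea could look something like this
minimal sketch; the outer_* fields and function names are hypothetical
stand-ins, not the actual spi.c code.]

    #include "postgres.h"
    #include "executor/spi.h"

    /* Hypothetical fields added to spi.c's per-connection stack entry. */
    typedef struct
    {
        /* ... existing _SPI_connection fields ... */
        uint64          outer_processed;    /* caller's SPI_processed */
        SPITupleTable  *outer_tuptable;     /* caller's SPI_tuptable */
    } SPIStackEntrySketch;

    static void
    spi_connect_sketch(SPIStackEntrySketch *entry)
    {
        /* Save the caller's result globals when a nested SPI use begins... */
        entry->outer_processed = SPI_processed;
        entry->outer_tuptable = SPI_tuptable;
    }

    static void
    spi_finish_sketch(SPIStackEntrySketch *entry)
    {
        /*
         * ...and restore them when it ends, instead of just zeroing them,
         * so nested SPI users can no longer clobber their caller's values.
         */
        SPI_processed = entry->outer_processed;
        SPI_tuptable = entry->outer_tuptable;
    }

Because only SPI_connect/SPI_finish change, callers keep the same API
and ABI, which is what makes this variant back-patchable.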
Re: On the need for a snapshot in exec_bind_message()
On 2018-09-05 18:45:52 -0400, Tom Lane wrote:
> Daniel Wood writes:
> >>> exec_bind_message()
> >>>     PushActiveSnapshot(GetTransactionSnapshot());
> >>>
> >>> If there were no input functions that needed this, nor reparsing
> >>> or reanalyzing needed, and we knew this up front, it'd be a huge
> >>> win.
>
> >> Unfortunately, that's not the case, so I think trying to get rid of
> >> this call is a nonstarter.
>
> > Queries stop getting re-optimized after 5 times, unless better plans
> > are to be had. In the absence of schema changes or a changing search
> > path, why is the snapshot needed?
>
> The snapshot has little to do with the query plan, usually. It's about
> what view of the database the executed query will see, and
> particularly about what view the parameter input functions will see,
> if they look.

The snapshot in exec_bind_message() shouldn't be about what the
executed query sees, no? Sure, we'll evaluate input params and might
replan etc, but other than that?

> You could maybe argue that immutable input functions shouldn't need
> snapshots, but there are enough not-immutable ones that I don't think
> that gets you very far. In any case, we'd still need a snapshot for
> query execution. The set of queries that could possibly never need
> a snapshot at all is probably not large enough to be interesting.

It'd not be insane to rejigger things so there's a version of
PushActiveSnapshot() that doesn't eagerly compute the snapshot, but
instead does so on first access. Obviously we couldn't use that
everywhere, but exec_bind_message() seems like a prime case for it.

Greetings,

Andres Freund
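[A minimal sketch of what such a lazily-computed snapshot push might
look like; the LazySnapshot type and function names are hypothetical,
and real code would also have to integrate with the active-snapshot
stack and resource-owner machinery, which this sketch ignores.]

    #include "postgres.h"
    #include "utils/snapmgr.h"

    /* Hypothetical lazily-materialized active snapshot slot. */
    typedef struct LazySnapshot
    {
        Snapshot    snap;           /* NULL until first access */
    } LazySnapshot;

    /* Register the slot without calling GetTransactionSnapshot() yet. */
    static void
    PushLazySnapshot(LazySnapshot *lazy)
    {
        lazy->snap = NULL;
    }

    /*
     * First access materializes the snapshot.  If no input function or
     * replanning ever asks for it, GetSnapshotData() never runs - which
     * is where the hoped-for exec_bind_message() win would come from.
     */
    static Snapshot
    GetLazySnapshot(LazySnapshot *lazy)
    {
        if (lazy->snap == NULL)
            lazy->snap = GetTransactionSnapshot();
        return lazy->snap;
    }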
Re: On the need for a snapshot in exec_bind_message()
Daniel Wood writes:
>>> exec_bind_message()
>>>     PushActiveSnapshot(GetTransactionSnapshot());
>>>
>>> If there were no input functions that needed this, nor reparsing or
>>> reanalyzing needed, and we knew this up front, it'd be a huge win.

>> Unfortunately, that's not the case, so I think trying to get rid of
>> this call is a nonstarter.

> Queries stop getting re-optimized after 5 times, unless better plans
> are to be had. In the absence of schema changes or a changing search
> path, why is the snapshot needed?

The snapshot has little to do with the query plan, usually. It's about
what view of the database the executed query will see, and particularly
about what view the parameter input functions will see, if they look.

You could maybe argue that immutable input functions shouldn't need
snapshots, but there are enough not-immutable ones that I don't think
that gets you very far. In any case, we'd still need a snapshot for
query execution. The set of queries that could possibly never need a
snapshot at all is probably not large enough to be interesting.

regards, tom lane
Re: *_to_xml() should copy SPI_processed/SPI_tuptable
On 09/05/18 18:07, Tom Lane wrote:
> * Replace SPI_tuptable et al with macros that access fields in the
> current SPI stack level (similar to the way that, eg, errno works
> on most modern platforms). This seems do-able, if a bit grotty.

It would mean they'd have to *be* in the stack frame, where they
currently aren't; they're ignored by SPI_connect and just zeroed by
SPI_finish, on the grounds that they're transient and you've copied
them if you care.

Another alternative might be to have SPI_connect save them and
SPI_finish put them back, which leaves you just responsible for
reasoning about your own code. You'd still be expected to save them
across your own uses of other SPI calls, but no longer exposed to
spooky action at a distance from nested uses of SPI in stuff you call.

-Chap
Re: unaccent(text) fails depending on search_path (WAS: pg_upgrade fails saying function unaccent(text) doesn't exist)
[ redirecting to pgsql-hackers ]

I wrote:
> Gunnlaugur Thor Briem writes:
>> SET search_path = "$user"; SELECT public.unaccent('foo');
>> SET
>> ERROR: text search dictionary "unaccent" does not exist

> Meh. I think we need the attached, or something just about like it.
>
> It's barely possible that there's somebody out there who's relying on
> setting the search path to allow choosing among multiple "unaccent"
> dictionaries. But there are way more people whose functions are
> broken due to the recent search-path-tightening changes.

Here's a slightly more efficient version.

regards, tom lane

diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index 247c202..dbf2bb9 100644
*** a/contrib/unaccent/unaccent.c
--- b/contrib/unaccent/unaccent.c
***************
*** 20,26 ****
--- 20,28 ----
  #include "tsearch/ts_locale.h"
  #include "tsearch/ts_public.h"
  #include "utils/builtins.h"
+ #include "utils/lsyscache.h"
  #include "utils/regproc.h"
+ #include "utils/syscache.h"
  
  PG_MODULE_MAGIC;
*************** unaccent_dict(PG_FUNCTION_ARGS)
*** 376,382 ****
  	if (PG_NARGS() == 1)
  	{
! 		dictOid = get_ts_dict_oid(stringToQualifiedNameList("unaccent"), false);
  		strArg = 0;
  	}
  	else
--- 378,398 ----
  	if (PG_NARGS() == 1)
  	{
! 		/*
! 		 * Use the "unaccent" dictionary that is in the same schema that this
! 		 * function is in.
! 		 */
! 		Oid			procnspid = get_func_namespace(fcinfo->flinfo->fn_oid);
! 		const char *dictname = "unaccent";
! 
! 		dictOid = GetSysCacheOid2(TSDICTNAMENSP,
! 								  PointerGetDatum(dictname),
! 								  ObjectIdGetDatum(procnspid));
! 		if (!OidIsValid(dictOid))
! 			ereport(ERROR,
! 					(errcode(ERRCODE_UNDEFINED_OBJECT),
! 					 errmsg("text search dictionary \"%s.%s\" does not exist",
! 							get_namespace_name(procnspid), dictname)));
  		strArg = 0;
  	}
  	else
Re: On the need for a snapshot in exec_bind_message()
> > exec_bind_message()
> >     PushActiveSnapshot(GetTransactionSnapshot());
> >
> > If there were no input functions that needed this, nor reparsing or
> > reanalyzing needed, and we knew this up front, it'd be a huge win.
>
> Unfortunately, that's not the case, so I think trying to get rid of
> this call is a nonstarter.

Queries stop getting re-optimized after 5 times, unless better plans
are to be had. In the absence of schema changes or a changing search
path, why is the snapshot needed?

- Dan
Re: *_to_xml() should copy SPI_processed/SPI_tuptable
Chapman Flack writes:
> In xml.c, query_to_xml_internal() contains a loop that refers
> to SPI_processed every iteration:
>
>     for (i = 0; i < SPI_processed; i++)
>         SPI_sql_row_to_xmlelement(i, result, tablename, nulls,
>                                   tableforest, targetns, top_level);
>
> likewise, the function it is calling contains a loop that
> repeatedly dereferences SPI_tuptable:
>
>     for (i = 1; i <= SPI_tuptable->tupdesc->natts; i++)
>         ...
>         ... map_sql_value_to_xml_value(colval,
>                        SPI_gettypeid(SPI_tuptable->tupdesc, ...
>
> and map_sql_value_to_xml_value can use OidOutputFunctionCall
> [ which could call arbitrary code that clobbers SPI_tuptable ]

Ugh. I wonder how many other places have latent issues of the same
kind.

> The essence of the problem here seems to be simply that these routines
> in xml.c are not making local-variable copies of SPI_processed and
> SPI_tuptable, as the docs for SPI_execute recommend.

I think that's blaming the messenger, TBH. The *real* problem is the
damfool decision to use global variables for this in the first place.

Should we change that somehow? Really the right API would have entailed
something like

    int SPI_execute(const char *src, bool read_only, long tcount,
                    SPITupleTable **tuptable, uint64 *processed);

... and I guess we'd have to think about SPI_lastoid too, so maybe
returning a struct would be smarter than a pointer-per-value.

So some possible alternatives are:

* Just do it. Probably breaks too much 3rd-party code to be very
popular, though.

* Invent SPI_execute_r() and so on with APIs as above, and deprecate
the old functions, but don't remove 'em right away. Grotty, and it
does little to fix any latent problems that may exist outside core.

* Replace SPI_tuptable et al with macros that access fields in the
current SPI stack level (similar to the way that, eg, errno works
on most modern platforms). This seems do-able, if a bit grotty.

None of this is very back-patchable, so I guess we need a one-off
backpatched change for query_to_xml_internal and anyplace else that
looks to be at serious risk.

regards, tom lane
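[For illustration, the errno-style macro approach (the third
alternative) might look like the following sketch. The SPIStackEntry
layout and the spi_current_entry() helper are hypothetical stand-ins
for spi.c's existing per-connection stack; this is not the committed
design.]

    #include "postgres.h"

    typedef struct SPITupleTable SPITupleTable;   /* as in executor/spi.h */

    /* Sketch of per-stack-level result storage (hypothetical layout). */
    typedef struct SPIStackEntry
    {
        uint64          processed;  /* rows from this level's last query */
        SPITupleTable  *tuptable;   /* this level's last result */
    } SPIStackEntry;

    /* Returns the current stack level, much as spi.c's _SPI_current
     * pointer already tracks it (hypothetical accessor). */
    extern SPIStackEntry *spi_current_entry(void);

    /*
     * Redefine the globals as macros over the current stack level
     * (replacing the extern variable declarations in executor/spi.h),
     * so a nested SPI user - e.g. inside a datatype output function -
     * sees and clobbers only its own level's fields, never its
     * caller's.
     */
    #define SPI_processed (spi_current_entry()->processed)
    #define SPI_tuptable  (spi_current_entry()->tuptable)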
Re: PostgreSQL does not start when max_files_per_process> 1200 on Windows 7
When max_files_per_process is 1201 or more, the server does not start.
I tested and debugged this on Windows 7. On Windows 10 everything works
correctly.

I found this takes place after the call to count_usable_fds(int
max_to_probe, int *usable_fds, int *already_open). The error occurred
on the first call to

    selres = select(nSockets, &rmask, NULL, NULL, &timeout)

from ServerLoop. Error 0xC0000142: "{DLL Initialization Failed}
Initialization of the dynamic link library %hs failed. The process is
terminating abnormally."

I temporarily moved the call to set_max_safe_fds() to the top of
PostmasterMain, and the error moved to ReadFile in the function
pipe_read_line(char *cmd, char *line, int maxsize):

    if (!ReadFile(childstdoutrddup, lineptr, maxsize - (lineptr - line),
                  &bytesread, NULL))

I have not yet managed to reproduce the error in a standalone example.
Replacing dup() of stdin with dup() of a file handle resolves the
problem.

Victor Spirin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

05.09.2018 19:24, Andres Freund wrote:
> Hi,
>
> On 2018-09-05 13:06:06 +0300, Victor Spirin wrote:
> > I have a bug in Windows 7 with max_files_per_process > 1200. Using
> > dup(0) in the function count_usable_fds more than 1200 times
> > (0 = stdin) causes further unpredictable errors with file
> > operations.
>
> Could you expand a bit on this?
>
> Greetings,
>
> Andres Freund
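[For context, a simplified sketch of the dup(0) probing that
count_usable_fds() performs at startup; the real function in
src/backend/storage/file/fd.c allocates its array dynamically and
handles more cases, so treat this only as an approximation.]

    #include <errno.h>
    #include <unistd.h>

    /*
     * Probe how many file descriptors we can open by repeatedly dup'ing
     * stdin (fd 0), then close them all again.  With
     * max_files_per_process > 1200 on Windows 7, it is this burst of
     * dup(0) calls that reportedly leaves the process in a broken state.
     */
    static int
    count_usable_fds_sketch(int max_to_probe)
    {
        int     fds[4096];      /* assume max_to_probe <= 4096 here */
        int     used = 0;

        while (used < max_to_probe)
        {
            int     thisfd = dup(0);

            if (thisfd < 0)
                break;          /* EMFILE/ENFILE: out of descriptors */
            fds[used++] = thisfd;
        }

        for (int j = 0; j < used; j++)
            close(fds[j]);

        return used;
    }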
Re: Collation versioning
On Wed, Sep 5, 2018 at 12:10 PM Christoph Berg wrote:
> > So, it's not ideal but perhaps worth considering on the grounds that
> > it's better than nothing?
>
> Ack.

Ok, here's a little patch like that.

postgres=# select collname, collcollate, collversion from pg_collation
           where collname = 'en_NZ';
 collname | collcollate | collversion
----------+-------------+-------------
 en_NZ    | en_NZ.utf8  | 2.24
(1 row)

-- 
Thomas Munro
http://www.enterprisedb.com
*_to_xml() should copy SPI_processed/SPI_tuptable
I encountered this in 9.5 and haven't yet written a reproducer for 10
or 11, but the code in question looks the same in master.

In xml.c, query_to_xml_internal() contains a loop that refers to
SPI_processed every iteration:

    for (i = 0; i < SPI_processed; i++)
        SPI_sql_row_to_xmlelement(i, result, tablename, nulls,
                                  tableforest, targetns, top_level);

likewise, the function it is calling contains a loop that repeatedly
dereferences SPI_tuptable:

    for (i = 1; i <= SPI_tuptable->tupdesc->natts; i++)
        ...
        ... map_sql_value_to_xml_value(colval,
                       SPI_gettypeid(SPI_tuptable->tupdesc, ...

and map_sql_value_to_xml_value can use OidOutputFunctionCall. If that
attribute's data type has an output function that uses SPI, the next
iteration here dereferences a null SPI_tuptable.

The root of the problem does not seem to be an output function that
uses SPI; indeed, the chance that one might do so seems to have been
the original motivation for OutputFunctionCall(). The essence of the
problem here seems to be simply that these routines in xml.c are not
making local-variable copies of SPI_processed and SPI_tuptable, as the
docs for SPI_execute recommend.

Would it be worth also expanding that note in the docs? It currently
says:

    All SPI query-execution functions set both SPI_processed and
    SPI_tuptable. Save these two global variables into local procedure
    variables if you need to access the result table of SPI_execute or
    another query-execution function across later calls.

but might it bear emphasis that those later calls can happen indirectly
from things you call (e.g. output functions)? Those will be nested in
their own SPI_connect/SPI_finish, which a reasonable person might guess
would save/restore those globals, but they don't; SPI_finish merely
clears them, on the assumption that you've got your local copies if you
need them.

-Chap
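[Applied to the loops above, the documented local-copy discipline would
look roughly like this minimal sketch; the helper name and the per-row
work are illustrative only, not the actual xml.c fix.]

    #include "postgres.h"
    #include "executor/spi.h"

    static void
    emit_rows_sketch(const char *query)
    {
        /* Copy the globals immediately after executing the query... */
        if (SPI_execute(query, true, 0) != SPI_OK_SELECT)
            return;

        uint64          proc = SPI_processed;
        SPITupleTable  *tuptable = SPI_tuptable;

        for (uint64 i = 0; i < proc; i++)
        {
            /*
             * ...so that even if an output function reached from here
             * uses SPI internally (clobbering SPI_processed and
             * SPI_tuptable), the local copies stay valid for the rest
             * of the loop.
             */
            char   *val = SPI_getvalue(tuptable->vals[i],
                                       tuptable->tupdesc, 1);

            (void) val;         /* placeholder for real per-row work */
        }
    }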
Re: Bug in ginRedoRecompress that causes opaque data on page to be overrun
On Wed, Sep 5, 2018 at 5:05 PM Alexander Korotkov wrote:
> On Wed, Sep 5, 2018 at 2:39 PM Alexander Korotkov wrote:
> > On Wed, Sep 5, 2018 at 12:26 PM Alexander Korotkov wrote:
> > > On Wed, Sep 5, 2018 at 1:45 AM R, Siva wrote:
> > > > On Tue, Sep 4, 2018 at 09:16 PM, Alexander Korotkov wrote:
> > > > > Do you have a test scenario for reproduction of this issue? We
> > > > > need it to ensure that the fix is correct.
> > > >
> > > > Unfortunately, I do not have a way of reproducing this issue.
> > > > So far I have tried a workload consisting of inserts (of the
> > > > same attribute value that is indexed), batch deletes of rows
> > > > and vacuum interleaved with engine crash/restarts.
> > >
> > > Issue reproduction and testing is essential for a bug fix.
> > > Remember, last time you reported a GIN bug [1], after issue
> > > reproduction it appeared that we had more things to fix. It's
> > > quite clear to me that if the segment list contains
> > > GIN_SEGMENT_INSERT before GIN_SEGMENT_DELETE, then it might lead
> > > to wrong behavior in ginRedoRecompress(). But it's not yet clear
> > > what code path could lead to such a segment list... I'll explore
> > > the code more and probably will come up with some idea.
> >
> > Aha, I've managed to reproduce this.
> > 1. Apply ginRedoRecompress-asserts.patch, which adds assertions to
> > ginRedoRecompress() detecting past-opaque writes.
>
> It was wrong, sorry. It appears that I put strict inequality into the
> asserts, while there should be loose inequality. A correct version of
> the patch is attached. And the scenario I've posted isn't really
> reproducing the bug...

Finally I managed to reproduce the bug. The scenario is as follows.
The underlying idea is that when an insertion of multiple tuples goes
to the beginning of the page, and this insertion succeeds only thanks
to the collapse of some short segments together, then this insertion
wouldn't fit the page if given alone.

create table test (i integer primary key, a int[]) with (fillfactor = 50);
insert into test (select id, array[id%2]::int[] from generate_series(1,15376) id);
create index test_idx on test using gin(a) with (fastupdate = off);
update test set a = '{1}' where i % 4 = 0 or i % 16 = 2 or i % 64 in (6, 46, 36) or i % 256 = 54;
vacuum test;
insert into test (select id + 16376, '{0}' from generate_series(1,5180) id);
update test set a = '{1}' where i <= 15376 and i % 256 in (182, 198);
vacuum test;
alter index test_idx set (fastupdate = on);
delete from test where i <= 134 and a = '{1}';
vacuum test;
insert into test (select id+3, '{0}' from generate_series(1,110) id);
vacuum test;

With ginRedoRecompress-asserts-v2.patch, the following assertion fires:

TRAP: FailedAssertion("!(segptr + newsegsize + (szleft - segsize) <= ( ((void) ((_Bool) (! (!(PageValidateSpecialPointer(page))) || (ExceptionalCondition("!(PageValidateSpecialPointer(page))", ("FailedAssertion"), "ginxlog.c", 289), 0, (char *) ((char *) (page) + ((PageHeader) (page))->pd_special) ))", File: "ginxlog.c", Line: 289)

-- 
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: On the need for a snapshot in exec_bind_message()
Hi,

On 2018-09-05 12:31:04 -0700, Daniel Wood wrote:
> NOTE:
>
> In GetSnapshotData, because pgxact is declared volatile, the compiler
> will not reduce the following two IF tests into a single test:
>
>     if (pgxact->vacuumFlags & PROC_IN_LOGICAL_DECODING)
>         continue;
>
>     if (pgxact->vacuumFlags & PROC_IN_VACUUM)
>         continue;
>
> You can reduce the code path in the inner loop by coding this as:
>
>     if (pgxact->vacuumFlags & (PROC_IN_LOGICAL_DECODING|PROC_IN_VACUUM))
>         continue;
>
> I'm still working on quantifying any gain. Note it isn't just one L1
> cache fetch and one conditional branch eliminated. Due to the update
> frequency of the pgxact cache line, for single-statement TXNs, a
> certain number of full cache misses, due to invalidation, occur when a
> given pgxact is updated between the first fetch of vacuumFlags and the
> second fetch.

These two obviously could be combined, but I think we should just get
rid of the volatiles. With a small bit of work they shouldn't be
required anymore these days.

Greetings,

Andres Freund
Re: pgsql: Clean up after TAP tests in oid2name and vacuumlo.
On Wed, Sep 05, 2018 at 03:20:03PM -0300, Alvaro Herrera wrote:
> On 2018-Sep-05, Michael Paquier wrote:
> > It would be better to start a new thread rather than posting
> > a new patch, but I'd rather take the temperature first.
>
> Slightly feverish, it appears.

As far as I can see, the proposal is not stone-cold dead and could have
a future, so I'll begin a new thread with a more complete patch.

--
Michael
Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes
I wrote:
> Andres Freund writes:
> > One concern I have with your approach is that it isn't particularly
> > bullet-proof for cases where the rebuild is triggered by something
> > that doesn't hold a conflicting lock.

> Wouldn't that be a bug in the something-else?

So where are we on this? Should I proceed with my patch, or are we
going to do further investigation? Does anyone want to do an actual
patch review?

I think it's important to have some fix in place in time for the next
11beta release, so time grows short ...

regards, tom lane
Re: On the need for a snapshot in exec_bind_message()
Daniel Wood writes:
> In particular:
>
>     exec_bind_message()
>         PushActiveSnapshot(GetTransactionSnapshot());
>
> If there were no input functions that needed this, nor reparsing or
> reanalyzing needed, and we knew this up front, it'd be a huge win.

Unfortunately, that's not the case, so I think trying to get rid of
this call is a nonstarter. What we have kicked around a bit is trying
to get rid of the additional snapshot-taking at the start of execution,
so that the snapshot taken at BIND time serves all the way through the
query. That'd require a fair amount of refactoring I think, but at
least it's not a broken idea on its face.

> In GetSnapshotData, because pgxact is declared volatile, the compiler
> will not reduce the following two IF tests into a single test:
>
>     if (pgxact->vacuumFlags & PROC_IN_LOGICAL_DECODING)
>         continue;
>
>     if (pgxact->vacuumFlags & PROC_IN_VACUUM)
>         continue;

That, on the other hand, is just crummy coding :-(

> I'm still working on quantifying any gain.

It'll be interesting to see if you can show visible improvement from
merging those. It's enough of a hotspot that I wouldn't be surprised
to find some, but actual numbers would be nice.

regards, tom lane
On the need for a snapshot in exec_bind_message()
In particular:

    exec_bind_message()
        PushActiveSnapshot(GetTransactionSnapshot());

Suppressing this, I've achieved over 1.9M TXNs a second on a
select-only pgbench on a 48-core box. It is about 50% faster with this
change. The CPU usage of GetSnapshotData drops from about 22% to 4.5%.

If there were no input functions that needed this, nor reparsing or
reanalyzing needed, and we knew this up front, it'd be a huge win. We
could test for a number of conditions on the first parse/optimization
of the query and set a flag to suppress this for subsequent executions.

NOTE:

In GetSnapshotData, because pgxact is declared volatile, the compiler
will not reduce the following two IF tests into a single test:

    if (pgxact->vacuumFlags & PROC_IN_LOGICAL_DECODING)
        continue;

    if (pgxact->vacuumFlags & PROC_IN_VACUUM)
        continue;

You can reduce the code path in the inner loop by coding this as:

    if (pgxact->vacuumFlags & (PROC_IN_LOGICAL_DECODING|PROC_IN_VACUUM))
        continue;

I'm still working on quantifying any gain. Note it isn't just one L1
cache fetch and one conditional branch eliminated. Due to the update
frequency of the pgxact cache line, for single-statement TXNs, a
certain number of full cache misses, due to invalidation, occur when a
given pgxact is updated between the first fetch of vacuumFlags and the
second fetch.
Re: Bug fix for glibc broke freebsd build in REL_11_STABLE
Andrew Gierth writes:
> The problem is that if we're relying on -fexcess-precision=standard
> semantics in places besides infinity checks, then we won't get those
> semantics on clang/i386/no-sse2 since it has no comparable option.
> (What are we doing about compilers for x86-32 other than clang and
> gcc?)

Well, the bigger problem is that we don't really know exactly where
nonstandard precision semantics might cause visible behavior changes.
It's just about a dead certainty, IMO, that there are some other places
we don't know about because the regression tests don't expose them; or
that future code rearrangements might create new problems.

Given that (if I understand Andres correctly) -fexcess-precision=standard
behavior is required by C99, it's hard to get excited about expending a
whole lot of work here. I'm fine with adding some compiler options or
telling the user to do so, but I don't think we should do anything much
beyond that.

regards, tom lane
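[As a standalone illustration - not code from the thread - of the
excess-precision hazard being discussed:]

    #include <math.h>
    #include <stdio.h>

    int
    main(void)
    {
        volatile double big = 1e308;    /* volatile defeats constant folding */
        double          d = big * 10.0; /* overflows double: should be +Inf */

        /*
         * Under C99 semantics (gcc -fexcess-precision=standard, or any
         * SSE2 build), the product is rounded to double width and
         * isinf(d) is true.  With x87 80-bit evaluation and no standard
         * excess-precision mode (e.g. clang on i386 without SSE2), the
         * product can still be held in a long-double register, so
         * isinf(d) may be false.
         */
        printf("isinf: %d\n", isinf(d));
        return 0;
    }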
Re: Bug fix for glibc broke freebsd build in REL_11_STABLE
> "Tom" == Tom Lane writes: >> At the very minimum, I believe FreeBSD project policy would require >> that the i386 packages for postgresql be built to run without SSE2. Tom> Well, can we tell them they have to use gcc to compile, or will Tom> that be against their policy too? That's no problem, lots of ports have USE_GCC=yes (or some specified version or minimum version of gcc). -- Andrew (irc:RhodiumToad)
Re: Collation versioning
Re: Thomas Munro 2018-09-05
> > Hopefully that version info is fine-grained enough.
>
> Hmm. I was looking for the locale data version, not libc.so itself. I
> realise they come ultimately from the same source package, but are
> the locale definitions and libc6 guaranteed to be updated at the same
> time? I see that the locales package depends on libc-bin >> 2.27 (no
> upper bound), which in turn depends on libc6 >> 2.27, << 2.28. So
> perhaps you can have a system with locales 2.27 and libc6 2.28?

No, because libc6.deb "breaks" locales << 2.27:

Package: libc6
Source: glibc
Version: 2.27-5
Breaks: [...] locales (<< 2.27), locales-all (<< 2.27),

(I can't tell off-hand why this isn't just a stricter dependency in
locales.deb, but it's probably because this variant works better for
upgrades.)

> I also wonder if there are some configurations where they could get
> out of sync because of manual use of locale-gen or something like
> that. Clearly most systems would have them in sync through apt-get
> upgrade though, so maybe gnu_get_libc_version() would work well in
> practice?

I'd hope so. I'm more worried about breakage because of fixes applied
within one glibc version (2.27-5 vs 2.27-6), but I guess Debian's glibc
maintainers are clueful enough not to do that.

Re: Thomas Munro 2018-09-05
> And even if they are, what if your cluster is still running and still
> has the older libc.so.6 mapped in? Newly forked backends will see new
> locale data but gnu_get_libc_version() will return the old string.
> (Pointed out off-list by Andres.) Eventually you restart your cluster
> and start seeing the error.

That problem isn't protected against by PG itself. I've seen clusters
that were upgraded on disk but not restarted yet where every plpgsql
invocation was throwing symbol errors. So I guess we don't have to try
harder for libc.

> So, it's not ideal but perhaps worth considering on the grounds that
> it's better than nothing?

Ack.

Christoph
Re: Bug fix for glibc broke freebsd build in REL_11_STABLE
Andrew Gierth writes:
> "Tom" == Tom Lane writes:
> Tom> If you wanted to argue that the set of people who still want to
> Tom> run PG on pre-SSE2 hardware is the empty set, that'd be an
> Tom> easier sell really.

> At the very minimum, I believe FreeBSD project policy would require
> that the i386 packages for postgresql be built to run without SSE2.

Well, can we tell them they have to use gcc to compile, or will that be
against their policy too?

regards, tom lane
Re: Bug fix for glibc broke freebsd build in REL_11_STABLE
> "Peter" == Peter Eisentraut writes: > On 05/09/2018 18:42, Andres Freund wrote: >> Realistically we're going to be running into old versions of clang >> for a long time. And the importance of running i386 without SSE2 >> surely isn't increasing. So I don't really see an urgent need to do >> anything about it. And if it gets fixed, and we care, we can just >> add a clang version check to the test. Peter> Another option perhaps is to let this be and accept it as Peter> alternative floating point behavior. We already have some of Peter> those. If it was only a matter of error handling, then the best fix would likely to be just avoiding __builtin_isinf if (clang AND i386 AND not sse2). The problem is that if we're relying on -fexcess-precision=standard semantics in places besides infinity checks, then we won't get those semantics on clang/i386/no-sse2 since it has no comparable option. (What are we doing about compilers for x86-32 other than clang and gcc?) -- Andrew (irc:RhodiumToad)
Re: Bug fix for glibc broke freebsd build in REL_11_STABLE
> "Tom" == Tom Lane writes: Tom> If you wanted to argue that the set of people who still want to Tom> run PG on pre-SSE2 hardware is the empty set, that'd be an easier Tom> sell really. At the very minimum, I believe FreeBSD project policy would require that the i386 packages for postgresql be built to run without SSE2. Tom> But what I'm really concerned about here, given that we're Tom> apparently talking about an ABI change, No, there's no ABI change involved. Tom> I'm actually a bit surprised to hear that you can just randomly Tom> add -msse2 on BSDen. Do they ship separate copies of libc et al to Tom> make that work? -msse2 just tells the compiler to assume that the SSE2 instructions and registers are available for use (and if so, it will always use them for floats in preference to the x87 registers where possible), it doesn't change the ABI: all function results are still bounced through the x87 register stack (and float/double parameters are not passed in registers anyway). -- Andrew (irc:RhodiumToad)
Re: JIT compiling with LLVM v12
Hi,

On 2018-08-22 06:20:21 +0000, Noah Misch wrote:
> I see jit slows the regression tests considerably:

Is this with LLVM assertions enabled or not? The differences seem
bigger than what I'm observing, especially on the mips animal - which I
observe uses a separately installed LLVM build.

On my machine, with master, on a postgres assert build with a non-debug
LLVM build:

PGOPTIONS='-c jit=0' time make -Otarget -j10 -s check-world && echo success || echo f
240.37user 55.55system 2:08.17elapsed 230%CPU (0avgtext+0avgdata 66264maxresident)k

PGOPTIONS='-c jit=1' time make -Otarget -j10 -s check-world && echo success || echo f
253.02user 55.77system 2:16.22elapsed 226%CPU (0avgtext+0avgdata 54756maxresident)k

Using your command, on a postgres optimized build with a non-debug LLVM
build:

> # x86_64, non-assert, w/o llvm
> $ for n in 1 2 3; do env time make -C src/bin/pg_upgrade check; done 2>&1 | grep elapsed
> 7.64user 4.24system 0:36.40elapsed 32%CPU (0avgtext+0avgdata 36712maxresident)k
> 8.09user 4.50system 0:37.71elapsed 33%CPU (0avgtext+0avgdata 36712maxresident)k
> 7.53user 4.18system 0:36.54elapsed 32%CPU (0avgtext+0avgdata 36712maxresident)k
>
> # x86_64, non-assert, w/ llvm trunk
> $ for n in 1 2 3; do env time make -C src/bin/pg_upgrade check; done 2>&1 | grep elapsed
> 9.58user 5.79system 0:49.61elapsed 30%CPU (0avgtext+0avgdata 36712maxresident)k
> 9.47user 5.92system 0:47.84elapsed 32%CPU (0avgtext+0avgdata 36712maxresident)k
> 9.09user 5.51system 0:47.94elapsed 30%CPU (0avgtext+0avgdata 36712maxresident)k

andres@alap4:~/build/postgres/master-optimize/vpath$ for n in 1 2 3; do PGOPTIONS='-cjit=0' env time make -C src/bin/pg_upgrade check; done 2>&1 | grep elapsed
8.01user 3.63system 0:39.88elapsed 29%CPU (0avgtext+0avgdata 50196maxresident)k
7.96user 3.86system 0:39.70elapsed 29%CPU (0avgtext+0avgdata 50064maxresident)k
7.96user 3.80system 0:37.17elapsed 31%CPU (0avgtext+0avgdata 50148maxresident)k

andres@alap4:~/build/postgres/master-optimize/vpath$ for n in 1 2 3; do PGOPTIONS='-cjit=1' env time make -C src/bin/pg_upgrade check; done 2>&1 | grep elapsed
7.88user 3.76system 0:44.98elapsed 25%CPU (0avgtext+0avgdata 50092maxresident)k
7.99user 3.72system 0:46.53elapsed 25%CPU (0avgtext+0avgdata 50036maxresident)k
7.88user 3.87system 0:45.26elapsed 25%CPU (0avgtext+0avgdata 50132maxresident)k

So here the difference is smaller, but not hugely so.

> # mips32el, assert, w/o llvm (buildfarm member topminnow) [1]
> 28min install-check-*
> 35min check-pg_upgrade
>
> # mips32el, assert, w/ llvm 6.0.1 [1]
> 63min install-check-*
> 166min check-pg_upgrade

But this seems so absurdly large a difference that I kinda think LLVM
assertions (which are really expensive and add O(N) operations in a
bunch of places) might be to blame.

Greetings,

Andres Freund
Re: pread() and pwrite()
Hi,

On 07/26/2018 10:04 PM, Thomas Munro wrote:
> > Done.
>
> Rebased.

This needs a rebase again. Once the conflicts are resolved, the patch
passes make check-world, and an strace analysis shows that the
associated read()/write() calls have been turned into
pread64()/pwrite64(). All remaining lseek()'s are SEEK_END's.

Best regards,
Jesper
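[For readers unfamiliar with the change, the syscall-level effect is
roughly the following; an illustrative sketch, not code from the
patch.]

    #include <unistd.h>

    /* Before: two syscalls, and the shared file position is mutated. */
    static ssize_t
    read_at_offset_old(int fd, void *buf, size_t len, off_t offset)
    {
        if (lseek(fd, offset, SEEK_SET) < 0)
            return -1;
        return read(fd, buf, len);
    }

    /* After: one syscall, no file-position side effect - which is why
     * strace shows pread64()/pwrite64() and almost no lseek(). */
    static ssize_t
    read_at_offset_new(int fd, void *buf, size_t len, off_t offset)
    {
        return pread(fd, buf, len, offset);
    }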
Re: JIT compiling with LLVM v12
Hi,

On 2018-08-25 21:34:22 -0400, Robert Haas wrote:
> On Wed, Aug 22, 2018 at 6:43 PM, Andres Freund wrote:
> > Now you can say that'd be solved by bumping the cost up, sure. But
> > obviously the row / cost model is pretty much out of whack here, I
> > don't see how we can make reasonable decisions in a trivial query
> > that has a misestimation by five orders of magnitude.
>
> Before JIT, it didn't matter whether the costing was wrong, provided
> that the path with the lowest cost was the cheapest path (or at least
> close enough to the cheapest path not to bother anyone).

I don't think that's really true. Due to cost fuzzing, absurdly high
costs very commonly cause genuinely different planning choices to not
have a large enough cost influence to matter.

> I'd guess that, as you read this, you're thinking, well, but if I'd
> added JIT and non-JIT paths for every option, it would have doubled
> the number of paths, and that would have slowed the planner down way
> too much. That's certainly true, but my point is just that the
> problem is probably not as simple as "the defaults are too low". I
> think the problem is more fundamentally that the model you've chosen
> is kinda broken.

Right. And that's why I repeatedly brought up this part in
discussions... I still think it's a reasonable compromise, but it
certainly has costs.

I'm also doubtful that just adding a separate path for JIT (with a
significantly smaller cpu_*_cost or such) would really have helped in
the cases with borked estimations - we'd *still* end up choosing JITing
if the loop count is absurd, just because the cost is high. There *are*
cases where it helps - if all the cost is incurred, say, due to random
page fetches, then JITing isn't going to help that much.

> I'm not saying I know how you could have done any better, but I do
> think we're going to have to try to figure out something to do about
> it, because saying, "check-pg_upgrade is 4x slower, but that's just
> because of all those bad estimates" is not going to fly.

That I'm unconvinced by, however. This was on some quite slow machine
and/or with LLVM assertions enabled - the performance difference on a
normal machine is smaller:

$ PGOPTIONS='-cjit=0' time make -s check
...
5.21user 2.11system 0:24.95elapsed 29%CPU (0avgtext+0avgdata 54212maxresident)k
20976inputs+340848outputs (14major+342228minor)pagefaults 0swaps

$ PGOPTIONS='-cjit=1' time make -s check
...
5.33user 2.01system 0:30.49elapsed 24%CPU (0avgtext+0avgdata 54236maxresident)k
0inputs+340856outputs (0major+342616minor)pagefaults 0swaps

But also, importantly, I think there are actual advantages to
triggering JIT in some places in the regression tests. There are
buildfarm animals exercising the path where everything is JITed, but
that's not really helpful during development.

> Those bad estimates were harmlessly bad before,

I think that's not true.

Greetings,

Andres Freund
Re: pg_verify_checksums failure with hash indexes
On Wed, Sep 05, 2018 at 12:16:00AM -0400, Tom Lane wrote:
> Amit Kapila writes:
> > Does anybody else have any idea on how we can write a test for a
> > non-default block size, or whether we already have anything similar?
>
> Build with a non-default BLCKSZ and see if the regression tests pass.
> There's no way that a build with BLCKSZ x can run any tests for
> BLCKSZ y.

Or we could implement block-level configuration at initdb time? That's
what Andres has done for WAL segment size recently.

/me hides and runs fast

> Note that you can expect some plan variations from a different BLCKSZ,
> so there'd be at least a few "failures" in the regression tests,
> which'd require manual inspection. Otherwise this could be delegated
> to a buildfarm animal using a nonstandard BLCKSZ.

Last time I did that I saw only plan diffs, which was a couple of weeks
ago.

--
Michael
Re: pgsql: Clean up after TAP tests in oid2name and vacuumlo.
On 2018-Sep-05, Michael Paquier wrote: > On Wed, Sep 05, 2018 at 11:39:50AM -0300, Alvaro Herrera wrote: > > Should this be used in src/test/modules/{brin,test_commit_ts} also? > > Hmm, I have not thought about those two. commit_ts relies on REGRESS > being defined so that it does its cleanup. brin is more interesting: it > directly notes that it needs to tweak its Makefile to do the cleanup, > and it uses isolation tests. Wouldn't it be better to add a > second extra switch for isolation tests similar to REGRESS? We could > call it ISOLATION, and it would list the set of isolation tests you > could run from an extension. That would be useful for a lot of folks. > > Thoughts? Yeah, +1 for that. > It would be better to start a new thread rather than posting > a new patch, but I'd rather take the temperature first. Slightly feverish, it appears. -- Álvaro Herrera  https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Bug fix for glibc broke freebsd build in REL_11_STABLE
On 2018-09-05 14:08:11 -0400, Tom Lane wrote: > Andres Freund writes: > > On 2018-09-05 10:05:26 -0400, Tom Lane wrote: > >> One thought is that maybe we should provide a way to override this, > >> in case somebody really can't or doesn't want to use -msse2, and > >> is willing to put up with platform-dependent float behavior. > > > IDK, people that are capable of making that decision can just hack > > configure. > > "Just hacking configure" is a pretty high bar for most folk. Sure. But being able to accurately judge whether you're ok that math behaviour doesn't work as documented seems like a high bar too. And this'd only be relevant if they insist on using clang rather than gcc. > If you wanted to argue that the set of people who still want to run PG > on pre-SSE2 hardware is the empty set, that'd be an easier sell > really. I think it's an empty set, yea, especially with clang. > But what I'm really concerned about here, given that we're apparently > talking about an ABI change, is that someone might want to run on a > platform where the libraries insist on the x87 way. > I'm actually a bit surprised to hear that you can just randomly add > -msse2 on BSDen. Do they ship separate copies of libc et al to make > that work? I don't think we're talking about an ABI change - with -msse2 the calling conventions are unchanged, the stuff just gets moved out of the x87 registers into SSE ones / xmm for the actual math. Which in turn solves our "80bit register" problem. There really shouldn't be any need for libc6 itself to care. Greetings, Andres Freund
Re: Bug fix for glibc broke freebsd build in REL_11_STABLE
Andres Freund writes: > On 2018-09-05 10:05:26 -0400, Tom Lane wrote: >> One thought is that maybe we should provide a way to override this, >> in case somebody really can't or doesn't want to use -msse2, and >> is willing to put up with platform-dependent float behavior. > IDK, people that are capable of making that decision can just hack > configure. "Just hacking configure" is a pretty high bar for most folk. If you wanted to argue that the set of people who still want to run PG on pre-SSE2 hardware is the empty set, that'd be an easier sell really. But what I'm really concerned about here, given that we're apparently talking about an ABI change, is that someone might want to run on a platform where the libraries insist on the x87 way. I'm actually a bit surprised to hear that you can just randomly add -msse2 on BSDen. Do they ship separate copies of libc et al to make that work? regards, tom lane
Re: Funny hang on PostgreSQL 10 during parallel index scan on slave
On Wed, Sep 5, 2018 at 10:13 AM Chris Travers wrote: > On Wed, Sep 5, 2018 at 6:55 PM Andres Freund wrote: >> > On Wed, Sep 5, 2018 at 6:40 PM Chris Travers >> > wrote: >> > >> Do you mean this loop in dsm_impl_posix_resize() is getting >> > >> interrupted constantly and never completing? >> > >> >> > >> /* We may get interrupted, if so just retry. */ >> > >> do >> > >> { >> > >> rc = posix_fallocate(fd, 0, size); >> > >> } while (rc == EINTR); >> > >> >> >> Probably worthwhile to check that the dsm code is properly robust if >> errors are thrown from within here. Yeah, currently dsm_impl_posix_resize() returns and lets dsm_impl_posix() clean up (close(), shm_unlink()) before raising errors. We can't just let CHECK_FOR_INTERRUPTS() take a non-local exit. Some refactoring involving PG_TRY()/PG_CATCH() may be the simplest way forward. -- Thomas Munro http://www.enterprisedb.com
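A hedged sketch of what that refactoring might look like - the variables (fd, size, name) are those of the surrounding dsm_impl_posix() code, but this is only an outline of the shape, not the committed change:

/* Sketch: let CHECK_FOR_INTERRUPTS() throw, but guarantee the
 * half-created segment is released before the error propagates. */
PG_TRY();
{
    do
    {
        rc = posix_fallocate(fd, 0, size);
        if (rc == EINTR)
            CHECK_FOR_INTERRUPTS();    /* may ereport() and unwind */
    } while (rc == EINTR);
}
PG_CATCH();
{
    close(fd);
    shm_unlink(name);
    PG_RE_THROW();
}
PG_END_TRY();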
Re: pgsql: Clean up after TAP tests in oid2name and vacuumlo.
Michael Paquier writes: > On Wed, Sep 05, 2018 at 11:39:50AM -0300, Alvaro Herrera wrote: >> Should this be used in src/test/modules/{brin,test_commit_ts} also? > Hmm, I have not thought about those two. commit_ts relies on REGRESS > being defined so that it does its cleanup. brin is more interesting: it > directly notes that it needs to tweak its Makefile to do the cleanup, > and it uses isolation tests. Wouldn't it be better to add a > second extra switch for isolation tests similar to REGRESS? We could > call it ISOLATION, and it would list the set of isolation tests you > could run from an extension. That would be useful for a lot of folks. > Thoughts? It would be better to start a new thread rather than posting > a new patch, but I'd rather take the temperature first. Yeah, if we're going to introduce infrastructure for TAP tests, it'd make sense to do so for isolation tests as well. regards, tom lane
Re: pgsql: Clean up after TAP tests in oid2name and vacuumlo.
On Wed, Sep 05, 2018 at 11:39:50AM -0300, Alvaro Herrera wrote: > Should this be used in src/test/modules/{brin,test_commit_ts} also? Hmm, I have not thought about those two. commit_ts relies on REGRESS being defined so that it does its cleanup. brin is more interesting: it directly notes that it needs to tweak its Makefile to do the cleanup, and it uses isolation tests. Wouldn't it be better to add a second extra switch for isolation tests similar to REGRESS? We could call it ISOLATION, and it would list the set of isolation tests you could run from an extension. That would be useful for a lot of folks. Thoughts? It would be better to start a new thread rather than posting a new patch, but I'd rather take the temperature first. > Why do you guys design Makefile rules in pgsql-committers? That was a bad idea, and a reaction to what Tom committed for the cleanup of the new TAP tests I had added previously. -- Michael signature.asc Description: PGP signature
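To make the proposal concrete, a hypothetical extension Makefile using such a variable might look like this; the ISOLATION name and semantics are only being floated in this thread, so treat it as a sketch mirroring the existing REGRESS convention:

# Hypothetical: ISOLATION lists isolation specs (specs/*.spec) the
# way REGRESS lists regression scripts (sql/*.sql).
MODULES = my_ext
REGRESS = basic              # ordinary regression tests, as today
ISOLATION = concurrent_ops   # proposed: isolation specs to run

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)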
Re: Bug in ginRedoRecompress that causes opaque data on page to be overrun
On Wed, Sep 5, 2018 at 4:59 AM, R, Siva wrote: > Hi, > > We recently encountered an issue where the opaque data flags on a gin data > leaf page were corrupted while replaying a gin insert WAL record. Upon > further examination of the redo code, we found a bug in ginRedoRecompress > code, which extracts the WAL information and updates the page. > > Specifically, when a new segment is inserted in the middle of a page, a > memmove operation is performed [1] at the current point in the page to make > room for the new segment. If this segment insertion is followed by delete > segment actions that are yet to be processed and the total data size is very > close to GinDataPageMaxDataSize, then we may move the data portion beyond > the boundary, corrupting the opaque data. > > One way of solving this problem is to perform the replay work in a scratch > space and perform a sanity check on the total size of the data portion before > copying it back to the actual page. While it involves an additional memory > allocation and memcpy operations, it is safer and similar to the 'do' code > path, where we make a copy of all segments past the first modified segment > before placing them back on the page [2]. > Hmm, could you share the sequence of WAL records that was applied to the broken page? I suspect the segment list contains GIN_SEGMENT_REPLACE before GIN_SEGMENT_INSERT. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
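A rough sketch of the scratch-space idea from the report, using the page macros from ginblock.h; the replay loop itself is elided, so this is an outline of the shape of the approach rather than a finished fix:

/* Replay segment edits into a scratch buffer, then size-check before
 * touching the real page, so the opaque area can never be overwritten. */
char       *scratch = palloc(BLCKSZ);
Pointer     origdata = (Pointer) GinDataLeafPageGetPostingList(page);
Size        newsize = 0;

memcpy(scratch, origdata, GinDataLeafPageGetPostingListSize(page));
/* ... apply the GIN_SEGMENT_INSERT/REPLACE/DELETE actions from the WAL
 * record to scratch, accumulating newsize ... */
if (newsize > GinDataPageMaxDataSize)
    elog(ERROR, "GIN posting list would overflow page");
memcpy(origdata, scratch, newsize);
GinDataPageSetDataSize(page, newsize);
pfree(scratch);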
Re: Funny hang on PostgreSQL 10 during parallel index scan on slave
On Wed, Sep 5, 2018 at 6:55 PM Andres Freund wrote: > Hi, > > On 2018-09-05 18:48:44 +0200, Chris Travers wrote: > > Will submit a patch here shortly. Thanks! Should we do for master and > > 10? Or 9.6 too? > > Please don't top-post on this list. This needs to be done in all > branches where the posix_fallocate call is present. > > > > Yep, Maybe we should check for signals there. > > > > > > On Wed, Sep 5, 2018 at 5:27 PM Thomas Munro < thomas.mu...@enterprisedb.com> > > > wrote: > > > > > >> On Wed, Sep 5, 2018 at 8:23 AM Chris Travers < chris.trav...@adjust.com> > > >> wrote: > > >> > 1. The query is in a parallel index scan or similar > > >> > 2. A process is executing a parallel plan and allocating a significant > > >> chunk of memory (2MB for example) in dynamic shared memory. > > >> > 3. The startup process goes into a loop where it sends a sigusr1, > > >> sleeps 5ms, and sends another sigusr1 etc. > > >> > 4. The sigusr1 aborts the system call, which is then retried. > > >> > 5. Because the system call takes more than 5ms, we end up in an > > >> endless loop > > What you're presumably encountering here is a recovery conflict. > Agreed, but the question is how to correct what is a fairly interesting race condition. > > > > On Wed, Sep 5, 2018 at 6:40 PM Chris Travers > > wrote: > > >> Do you mean this loop in dsm_impl_posix_resize() is getting > > >> interrupted constantly and never completing? > > >> > > >> /* We may get interrupted, if so just retry. */ > > >> do > > >> { > > >> rc = posix_fallocate(fd, 0, size); > > >> } while (rc == EINTR); > > >> > > Probably worthwhile to check that the dsm code is properly robust if > errors are thrown from within here. > Will check that too. Thanks! > > > Greetings, > > Andres Freund > -- Best Regards, Chris Travers Head of Database Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: Bug fix for glibc broke freebsd build in REL_11_STABLE
Andres Freund writes: > On 2018-09-05 18:55:34 +0200, Peter Eisentraut wrote: >> Another option perhaps is to let this be and accept it as alternative >> floating point behavior. We already have some of those. > -many. We'd directly violate our own error rules. I think that just from a regression-test-maintenance standpoint, this approach would be a mess. We could expect for instance that such problems would come and go in weird places in the geometric tests. regards, tom lane
Re: Bug fix for glibc broke freebsd build in REL_11_STABLE
On 2018-09-05 18:55:34 +0200, Peter Eisentraut wrote: > On 05/09/2018 18:42, Andres Freund wrote: > > Realistically we're going to be running into old versions of clang for a > > long time. And the importance of running i386 without SSE2 surely isn't > > increasing. So I don't really see an urgent need to do anything about > > it. And if it gets fixed, and we care, we can just add a clang version > > check to the test. > > Another option perhaps is to let this be and accept it as alternative > floating point behavior. We already have some of those. -many. We'd directly violate our own error rules. I'm actually personally in favor of not throwing error when float overflows - it's imo not actually useful and costs performance - but sometimes throwing an error depending on the specific register allocator behaviour of a specific version of a compiler is bad. It's really weird to return [+-]Infinity depending on just *how much* you overflowed. Greetings, Andres Freund
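For reference, the error rule being defended here is the documented float behavior, easy to demonstrate:

-- PostgreSQL reports float8 overflow as an error rather than
-- silently returning Infinity:
SELECT 1e308::float8 * 10;
-- ERROR:  value out of range: overflow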
Re: Funny hang on PostgreSQL 10 during parallel index scan on slave
Hi, On 2018-09-05 18:48:44 +0200, Chris Travers wrote: > Will submit a patch here shortly. Thanks! Should we do for master and > 10? Or 9.6 too? Please don't top-post on this list. This needs to be done in all branches where the posix_fallocate call is present. > > Yep, Maybe we should check for signals there. > > > > On Wed, Sep 5, 2018 at 5:27 PM Thomas Munro > > wrote: > > > >> On Wed, Sep 5, 2018 at 8:23 AM Chris Travers > >> wrote: > >> > 1. The query is in a parallel index scan or similar > >> > 2. A process is executing a parallel plan and allocating a significant > >> chunk of memory (2MB for example) in dynamic shared memory. > >> > 3. The startup process goes into a loop where it sends a sigusr1, > >> sleeps 5ms, and sends another sigusr1 etc. > >> > 4. The sigusr1 aborts the system call, which is then retried. > >> > 5. Because the system call takes more than 5ms, we end up in an > >> endless loop What you're presumably encountering here is a recovery conflict. > On Wed, Sep 5, 2018 at 6:40 PM Chris Travers > wrote: > >> Do you mean this loop in dsm_impl_posix_resize() is getting > >> interrupted constantly and never completing? > >> > >> /* We may get interrupted, if so just retry. */ > >> do > >> { > >> rc = posix_fallocate(fd, 0, size); > >> } while (rc == EINTR); > >> Probably worthwhile to check that the dsm code is properly robust if errors are thrown from within here. Greetings, Andres Freund
Re: Bug fix for glibc broke freebsd build in REL_11_STABLE
On 05/09/2018 18:42, Andres Freund wrote: > Realistically we're going to be running into old versions of clang for a > long time. And the importance of running i386 without SSE2 surely isn't > increasing. So I don't really see an urgent need to do anything about > it. And if it gets fixed, and we care, we can just add a clang version > check to the test. Another option perhaps is to let this be and accept it as alternative floating point behavior. We already have some of those. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Funny hang on PostgreSQL 10 during parallel index scan on slave
On Wed, Sep 5, 2018 at 9:49 AM Chris Travers wrote: > > Will submit a patch here shortly. Thanks! Should we do for master and 10? > Or 9.6 too? > > On Wed, Sep 5, 2018 at 6:40 PM Chris Travers wrote: >> >> Yep, Maybe we should check for signals there. Yeah, it seems reasonable to check for interrupts here. It would need to be back-patched as far as 9.4. -- Thomas Munro http://www.enterprisedb.com
Re: Funny hang on PostgreSQL 10 during parallel index scan on slave
Will submit a patch here shortly. Thanks! Should we do for master and 10? Or 9.6 too? On Wed, Sep 5, 2018 at 6:40 PM Chris Travers wrote: > Yep, Maybe we should check for signals there. > > On Wed, Sep 5, 2018 at 5:27 PM Thomas Munro > wrote: > >> On Wed, Sep 5, 2018 at 8:23 AM Chris Travers >> wrote: >> > 1. The query is in a parallel index scan or similar >> > 2. A process is executing a parallel plan and allocating a significant >> chunk of memory (2MB for example) in dynamic shared memory. >> > 3. The startup process goes into a loop where it sends a sigusr1, >> sleeps 5ms, and sends another sigusr1 etc. >> > 4. The sigusr1 aborts the system call, which is then retried. >> > 5. Because the system call takes more than 5ms, we end up in an >> endless loop >> >> Do you mean this loop in dsm_impl_posix_resize() is getting >> interrupted constantly and never completing? >> >> /* We may get interrupted, if so just retry. */ >> do >> { >> rc = posix_fallocate(fd, 0, size); >> } while (rc == EINTR); >> >> -- >> Thomas Munro >> http://www.enterprisedb.com >> > > > -- > Best Regards, > Chris Travers > Head of Database > > Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com > Saarbrücker Straße 37a, 10405 Berlin > > -- Best Regards, Chris Travers Head of Database Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Proposal to add work_mem option to postgres_fdw module
On 01/09/2018 06:33, Shinoda, Noriyoshi (PN Japan GCS Delivery) wrote: > Certainly the PQconndefaults function specifies the Debug flag for the "options" option. > I think that eliminating the Debug flag is the simplest solution. > With the attached patch, GUCs can be specified with the following syntax. > > CREATE SERVER remsvr1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'host 1', ..., options '-c work_mem=64MB -c geqo=off'); > ALTER SERVER remsvr1 OPTIONS (SET options '-c work_mem=64MB -c geqo=off'); > > However, I am afraid that this patch will change the behavior of the official API PQconndefaults. It doesn't change the behavior of the API, it just changes the result of the API function, which is legitimate and the reason we have the API function in the first place. I think this patch is fine. I'll work on committing it. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Bug fix for glibc broke freebsd build in REL_11_STABLE
Hi, On 2018-09-05 10:05:26 -0400, Tom Lane wrote: > Peter Eisentraut writes: > > On 05/09/2018 02:51, Andres Freund wrote: > >> My current proposal is thus to do add a check that does > >> #if defined(__clang__) && defined(__i386__) && !defined(__SSE2_MATH__) > >> something-that-fails > >> #endif > >> in an autoconf test, and have configure complain if that fails. > > > Would this not be invalidated if the bug you have filed gets fixed? Realistically we're going to be running into old versions of clang for a long time. And the importance of running i386 without SSE2 surely isn't increasing. So I don't really see an urgent need to do anything about it. And if it gets fixed, and we care, we can just add a clang version check to the test. > Perhaps, but how would autoconf tell that? I wouldn't have any confidence > in a runtime test, as it might or might not trigger the bug depending on > all sorts of factors like -O level. > > In any case, it seems like "use -msse2 on x86" is good advice from a > performance standpoint even if it doesn't prevent a compiler bug. Indeed. > One thought is that maybe we should provide a way to override this, > in case somebody really can't or doesn't want to use -msse2, and > is willing to put up with platform-dependent float behavior. IDK, people that are capable of making that decision can just hack configure. Greetings, Andres Freund
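Spelled out, the probe could be as small as this translation unit, which a configure test would simply try to compile; this is a sketch of the idea, not the committed configure change:

/* Fails to compile exactly in the problematic configuration:
 * clang targeting i386 without SSE2 math enabled. */
#if defined(__clang__) && defined(__i386__) && !defined(__SSE2_MATH__)
#error "clang on i386 requires -msse2 for conformant float arithmetic"
#endif

int main(void) { return 0; }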
Re: Funny hang on PostgreSQL 10 during parallel index scan on slave
Yep, Maybe we should check for signals there. On Wed, Sep 5, 2018 at 5:27 PM Thomas Munro wrote: > On Wed, Sep 5, 2018 at 8:23 AM Chris Travers > wrote: > > 1. The query is in a parallel index scan or similar > > 2. A process is executing a parallel plan and allocating a significant > chunk of memory (2MB for example) in dynamic shared memory. > > 3. The startup process goes into a loop where it sends a sigusr1, > sleeps 5ms, and sends another sigusr1 etc. > > 4. The sigusr1 aborts the system call, which is then retried. > > 5. Because the system call takes more than 5ms, we end up in an endless > loop > > Do you mean this loop in dsm_impl_posix_resize() is getting > interrupted constantly and never completing? > > /* We may get interrupted, if so just retry. */ > do > { > rc = posix_fallocate(fd, 0, size); > } while (rc == EINTR); > > -- > Thomas Munro > http://www.enterprisedb.com > -- Best Regards, Chris Travers Head of Database Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: Collation versioning
On Wed, Sep 5, 2018 at 8:20 AM Thomas Munro wrote: > On Wed, Sep 5, 2018 at 3:35 AM Christoph Berg wrote: > > int main (void) { puts (gnu_get_libc_version ()); return 0; } > > > > $ ./a.out > > 2.27 > > Hmm. I was looking for locale data version, not libc.so itself. I > realise they come ultimately from the same source package, but are the > locale definitions and libc6 guaranteed to be updated at the same > time? And even if they are, what if your cluster is still running and still has the older libc.so.6 mapped in? Newly forked backends will see new locale data but gnu_get_libc_version() will return the old string. (Pointed out off-list by Andres.) Eventually you restart your cluster and start seeing the error. So, it's not ideal but perhaps worth considering on the grounds that it's better than nothing? -- Thomas Munro http://www.enterprisedb.com
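For comparison, the catalog already carries a per-collation version for ICU; the open question in this thread is what string to record for libc collations. The existing machinery (PostgreSQL 10 and later) is visible directly:

-- collversion is currently populated only for ICU collations
-- (collprovider = 'i'); libc collations ('c') have NULL there.
SELECT collname, collprovider, collversion
FROM pg_collation
ORDER BY collprovider, collname;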
Re: [HACKERS] Bug in to_timestamp().
On Wed, Sep 5, 2018, 6:35 PM Alexander Korotkov wrote: > On Wed, Sep 5, 2018 at 3:10 PM amul sul wrote: > > On Wed, Sep 5, 2018 at 3:05 PM Alexander Korotkov > > wrote: > > > On Wed, Sep 5, 2018 at 1:22 AM David G. Johnston > > > wrote: > > > > From those results the question is how important is it to force the > following breakage on our users (i.e., introduce FX exact symbol matching): > > > > > > > > SELECT to_timestamp('97/Feb/16', 'FXYY:Mon:DD'); > > > > - to_timestamp > > > > --- > > > > - Sun Feb 16 00:00:00 1997 PST > > > > -(1 row) > > > > - > > > > +ERROR: unexpected character "/", expected character ":" > > > > +HINT: In FX mode, punctuation in the input string must exactly > match the format string. > > > > > > > > There seemed to be some implicit approvals of this breakage some 30 > emails and 10 months ago but given that this is the only change from a > correct result to a failure I'd like to officially put it out there for > opinion/vote gathering. Mine is a -1; though keeping the distinction > between space and non-alphanumeric characters is expected. > > > > > > Do I understand correctly that you're -1 to changes to FX mode, but no > > > objection to changes in non-FX mode? > > > > > Ditto. > > So, if no objections for non-FX mode changes, then I'll extract that > part and commit it separately. > Yeah, that makes sense to me, thank you. Regards, Amul >
Re: PL/Python: Remove use of simple slicing API
On 29/08/2018 11:37, Peter Eisentraut wrote: > I have found some dying code in PL/Python. > > The simple slicing API (sq_slice, sq_ass_slice) has been deprecated > since Python 2.0 and has been removed altogether in Python 3, so we can > remove those functions from the PLyResult class. Instead, the non-slice > mapping functions mp_subscript and mp_ass_subscript can take slice > objects as an index. Since we just pass the index through to the > underlying list object, we already support that. Test coverage was > already in place. committed -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: PostgreSQL does not start when max_files_per_process > 1200 on Windows 7
Hi, On 2018-09-05 13:06:06 +0300, Victor Spirin wrote: > I have a bug in Windows 7 with max_files_per_process> 1200. > > Using dup (0) in the function count_usable_fds more than 1200 times (0 = > stdin) causes further unpredictable errors with file operations. Could you expand a bit on this? Greetings, Andres Freund
Re: Prevent concurrent DROP SCHEMA when certain objects are being initially created in the namespace
Hi, On 2018-09-05 01:05:54 -0400, Tom Lane wrote: > Andres Freund writes: > > On September 4, 2018 9:11:25 PM PDT, Tom Lane wrote: > >> I think that line of thought leads to an enormous increase in locking > >> overhead, for which we'd get little if any gain in usability. So my > >> inclination is to make an engineering judgment that we won't fix this. > > > Haven't we already significantly started down this road, to avoid a lot of > > the "tuple concurrently updated" type errors? > > Not that I'm aware of. We do not take locks on schemas, nor functions, > nor any other of the object types I mentioned. Well, we kinda do, during some of their own DDL. CF AcquireDeletionLock(), RangeVarGetAndCheckCreationNamespace(), and other LockDatabaseObject() callers. The RangeVarGetAndCheckCreationNamespace() even locks the schema an object is created in, which is pretty much what we're discussing here. I think the problem with the current logic is more that the findDependentObjects() scan doesn't use a "dirty" scan, so it doesn't ever get to see conflicting operations. > > Would expanding this a bit further really be that noticeable? > > Frankly, I think it would be not so much "noticeable" as "disastrous". > > Making the overhead tolerable would require very large compromises > in coverage, perhaps like "we'll only lock during DDL not DML". > At which point I'd question why bother. We've seen no field reports > (that I can recall offhand, anyway) that trace to not locking these > objects. Why would "we'll only lock during DDL not DML" be such a large compromise? To me that's a pretty darn reasonable subset - preventing corruption of the catalog contents is, uh, good? Greetings, Andres Freund
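For reference, the existing DDL-side protection Andres mentions boils down to calls of this shape; the lock level and call site here are illustrative only:

/* Take a lock on the schema's pg_namespace entry, the way
 * RangeVarGetAndCheckCreationNamespace() protects object creation. */
LockDatabaseObject(NamespaceRelationId, namespaceId, 0, AccessShareLock);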
Re: Out arguments name of "pg_identify_object_as_address" function in 9.5.14 and 11beta3
On 2018-Sep-05, Tom Lane wrote: > Alvaro Herrera writes: > > That said, I haven't heard of anyone using these functions in code yet, > > so if we change it in 11 or 12 nobody is going to complain. > > ... and that's pretty much my feeling. It seems really unlikely that > anyone's using named-argument notation for pg_get_object_address, and > even if they are, it wouldn't be very painful to change, or just not > use the notation if they need cross-branch compatibility. I think it's > more useful in the long run to make the names consistent. > > Will go take care of it. Agreed, thanks. If you haven't touched the docs yet, here's the change -- 9.5/9.6/10 need a slight adjustment from 11/master. -- Álvaro Herrera  https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index bb794e044f..853e5fee7c 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -17701,7 +17701,7 @@ SELECT collation for ('foo' COLLATE "de_DE"); pg_identify_object_as_address(catalog_id oid, object_id oid, object_sub_id integer) - type text, name text[], args text[] + type text, object_names text[], object_args text[] get external representation of a database object's address @@ -17745,7 +17745,8 @@ SELECT collation for ('foo' COLLATE "de_DE"); information is independent of the current server, that is, it could be used to identify an identically named object in another server. type identifies the type of database object; - name and args are text arrays that together + object_names and object_args + are text arrays that together form a reference to the object. These three columns can be passed to pg_get_object_address to obtain the internal address of the object. diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 5e95325250..7f903fa2d1 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -17016,7 +17016,7 @@ SELECT collation for ('foo' COLLATE "de_DE"); pg_identify_object_as_address(catalog_id oid, object_id oid, object_sub_id integer) - type text, name text[], args text[] + type text, object_names text[], object_args text[] get external representation of a database object's address @@ -17060,7 +17060,7 @@ SELECT collation for ('foo' COLLATE "de_DE"); information is independent of the current server, that is, it could be used to identify an identically named object in another server. type identifies the type of database object; - name and args are text arrays that together + object_names and object_args are text arrays that together form a reference to the object. These three columns can be passed to pg_get_object_address to obtain the internal address of the object.
Re: Out arguments name of "pg_identify_object_as_address" function in 9.5.14 and 11beta3
Alvaro Herrera writes: > On 2018-Sep-03, Tom Lane wrote: >> I do not think we can change the names of the output arguments; >> it'd break existing queries. However, renaming input arguments >> shouldn't affect anything. So I propose we make pg_get_object_address' >> input arguments be named type,object_names,object_args for >> consistency with the other function, and update the docs to match. > Hmm, I don't think it's possible to rename input args without breaking > working code either: Yeah, Andrew noted the same ... > That said, I haven't heard of anyone using these functions in code yet, > so if we change it in 11 or 12 nobody is going to complain. ... and that's pretty much my feeling. It seems really unlikely that anyone's using named-argument notation for pg_get_object_address, and even if they are, it wouldn't be very painful to change, or just not use the notation if they need cross-branch compatibility. I think it's more useful in the long run to make the names consistent. Will go take care of it. regards, tom lane
Re: Funny hang on PostgreSQL 10 during parallel index scan on slave
On Wed, Sep 5, 2018 at 8:23 AM Chris Travers wrote: > 1. The query is in a parallel index scan or similar > 2. A process is executing a parallel plan and allocating a significant chunk of memory (2MB for example) in dynamic shared memory. > 3. The startup process goes into a loop where it sends a sigusr1, sleeps 5ms, and sends another sigusr1 etc. > 4. The sigusr1 aborts the system call, which is then retried. > 5. Because the system call takes more than 5ms, we end up in an endless loop Do you mean this loop in dsm_impl_posix_resize() is getting interrupted constantly and never completing? /* We may get interrupted, if so just retry. */ do { rc = posix_fallocate(fd, 0, size); } while (rc == EINTR); -- Thomas Munro http://www.enterprisedb.com
Funny hang on PostgreSQL 10 during parallel index scan on slave
Hi all; For the last few months we have been facing a funny problem on a slave where queries go to 100% cpu usage and never finish, causing the recovery process to hang and the replica to fall behind. Over time we ruled out a lot of causes and were banging our heads against this one. Today we got a break in it when we attached a debugger to various processes even without debugging symbols. Not only did we get useful stack traces from the hung query but attaching a debugger to the startup process caused the query to finish. This has shown up in 10.2 and 10.5. Based on the stack traces we have concluded the following seems to happen: 1. The query is in a parallel index scan or similar 2. A process is executing a parallel plan and allocating a significant chunk of memory (2MB for example) in dynamic shared memory. 3. The startup process goes into a loop where it sends a sigusr1, sleeps 5ms, and sends another sigusr1 etc. 4. The sigusr1 aborts the system call, which is then retried. 5. Because the system call takes more than 5ms, we end up in an endless loop I see one of two possible solutions here. 1. Exponential backoff in sending signals, to maybe 1s max. 2. Checking for signals before retrying the system call (I am not 100% sure yet where that check would go, but I am looking into it). Any feedback or thoughts before we look at implementing a patch? -- Best Regards, Chris Travers Head of Database Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
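A sketch of what option 1 might look like in the startup process's conflict-resolution loop; conflict_still_exists() and send_signal_to_backend() are hypothetical stand-ins for the real recovery-conflict machinery, shown only to illustrate the backoff:

long    sleep_us = 5000;    /* start at 5ms, as today */

while (conflict_still_exists())
{
    send_signal_to_backend(SIGUSR1);       /* hypothetical helper */
    pg_usleep(sleep_us);
    sleep_us = Min(sleep_us * 2, 1000000); /* back off, capped at 1s */
}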
Re: Collation versioning
On Wed, Sep 5, 2018 at 3:35 AM Christoph Berg wrote: > Re: Thomas Munro 2018-09-04 > > > I was reminded about that by recent news > > about an upcoming glibc/CLDR resync that is likely to affect > > PostgreSQL users (though, I guess, probably only when they do a major > > OS upgrade). > > Or replicating/restoring a database to a newer host. Yeah. > > Does anyone know > > of a way to extract a version string from glibc using existing > > interfaces? I heard there was an undocumented way but I haven't been > > able to find it -- probably because I was, erm, looking in the > > documentation. > > That sounds more robust. Googling around: > > https://www.linuxquestions.org/questions/linux-software-2/how-to-check-glibc-version-263103/ > > #include > #include > int main (void) { puts (gnu_get_libc_version ()); return 0; } > > $ ./a.out > 2.27 > > Hopefully that version info is fine-grained enough. Hmm. I was looking for locale data version, not libc.so itself. I realise they come ultimately from the same source package, but are the locale definitions and libc6 guaranteed to be updated at the same time? I see that the locales package depends on libc-bin >> 2.27 (no upper bound), which in turn depends on libc6 >> 2.27, << 2.28. So perhaps you can have a system with locales 2.27 and libc6 2.28? I also wonder if there are some configurations where they could get out of sync because of manual use of locale-gen or something like that. Clearly most systems would have them in sync through apt-get upgrade though, so maybe gnu_get_libc_version() would work well in practice? -- Thomas Munro http://www.enterprisedb.com
Re: pgsql: Clean up after TAP tests in oid2name and vacuumlo.
On 2018-Sep-04, Michael Paquier wrote: > OK. I have dug into that, and finished with the attached. What do you > think? One thing is that the definition of submake is moving out of > REGRESS, and .PHONY gets defined. Should this be used in src/test/modules/{brin,test_commit_ts} also? Why do you guys design Makefile rules in pgsql-committers? -- Álvaro Herrera  https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Bug in ginRedoRecompress that causes opaque data on page to be overrun
On Wed, Sep 5, 2018 at 2:39 PM Alexander Korotkov wrote: > On Wed, Sep 5, 2018 at 12:26 PM Alexander Korotkov > wrote: > > On Wed, Sep 5, 2018 at 1:45 AM R, Siva wrote: > > > On Tue, Sep 4, 2018 at 09:16 PM, Alexander Korotkov > > > wrote: > > > > Do you have a test scenario for reproduction of this issue? We need > > > > it to ensure that fix is correct. > > > > > > Unfortunately, I do not have a way of reproducing this issue. > > > So far I have tried a workload consisting of inserts (of the > > > same attribute value that is indexed), batch deletes of rows > > > and vacuum interleaved with engine crash/restarts. > > > > Issue reproduction and testing are essential for a bug fix. Remember > > last time you reported a GIN bug [1], after issue reproduction it > > appeared that we had more things to fix. It's quite clear to me that > > if the segment list contains GIN_SEGMENT_INSERT before GIN_SEGMENT_DELETE, > > then it might lead to wrong behavior in ginRedoRecompress(). But it's > > not yet clear what code path could lead to such a segment > > list... I'll explore the code more and probably will come up with some idea. > > Aha, I've managed to reproduce this. > 1. Apply ginRedoRecompress-asserts.patch, which adds assertions to > ginRedoRecompress() detecting writes past the opaque data. That was wrong, sorry. It appears that I put strict inequality into the asserts, while there should be loose inequality. A correct version of the patch is attached. And the scenario I've posted doesn't really reproduce the bug... -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company ginRedoRecompress-asserts-v2.patch Description: Binary data
Re: Bug fix for glibc broke freebsd build in REL_11_STABLE
Peter Eisentraut writes: > On 05/09/2018 02:51, Andres Freund wrote: >> My current proposal is thus to do add a check that does >> #if defined(__clang__) && defined(__i386__) && !defined(__SSE2_MATH__) >> something-that-fails >> #endif >> in an autoconf test, and have configure complain if that fails. > Would this not be invalidated if the bug you have filed gets fixed? Perhaps, but how would autoconf tell that? I wouldn't have any confidence in a runtime test, as it might or might not trigger the bug depending on all sorts of factors like -O level. In any case, it seems like "use -msse2 on x86" is good advice from a performance standpoint even if it doesn't prevent a compiler bug. One thought is that maybe we should provide a way to override this, in case somebody really can't or doesn't want to use -msse2, and is willing to put up with platform-dependent float behavior. regards, tom lane
Re: pointless check in RelationBuildPartitionDesc
On 2018-Sep-05, Amit Langote wrote: > On 2018/09/05 1:50, Alvaro Herrera wrote: > > Proposed patch. Checking isnull in an elog(ERROR) is important, because > > the column is not marked NOT NULL. This is not true for other columns > > where we simply do Assert(!isnull). > > Looks good. Thanks for taking care of other sites as well. > > @@ -14705,7 +14705,9 @@ ATExecDetachPartition(Relation rel, RangeVar *name) > > (void) SysCacheGetAttr(RELOID, tuple, Anum_pg_class_relpartbound, &isnull); - Assert(!isnull); + if (isnull) + elog(ERROR, "null relpartbound for relation %u", + RelationGetRelid(partRel)); > > In retrospect, I'm not sure why this piece of code is here at all; maybe > just remove the SysCacheGetAttr and Assert? Yeah, good idea, will do. -- Álvaro Herrera  https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Bug in to_timestamp().
On Wed, Sep 5, 2018 at 3:10 PM amul sul wrote: > On Wed, Sep 5, 2018 at 3:05 PM Alexander Korotkov > wrote: > > On Wed, Sep 5, 2018 at 1:22 AM David G. Johnston > > wrote: > > > From those results the question is how important is it to force the > > > following breakage on our users (i.e., introduce FX exact symbol > > > matching): > > > > > > SELECT to_timestamp('97/Feb/16', 'FXYY:Mon:DD'); > > > - to_timestamp > > > --- > > > - Sun Feb 16 00:00:00 1997 PST > > > -(1 row) > > > - > > > +ERROR: unexpected character "/", expected character ":" > > > +HINT: In FX mode, punctuation in the input string must exactly match > > > the format string. > > > > > > There seemed to be some implicit approvals of this breakage some 30 > > > emails and 10 months ago but given that this is the only change from a > > > correct result to a failure I'd like to officially put it out there for > > > opinion/vote gathering. Mine is a -1; though keeping the distinction > > > between space and non-alphanumeric characters is expected. > > > > Do I understand correctly that you're -1 to changes to FX mode, but no > > objection to changes in non-FX mode? > > > Ditto. So, if no objections for non-FX mode changes, then I'll extract that part and commit it separately. -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [WIP] [B-Tree] Retail IndexTuple deletion
Hi, I prepared the next version of the background worker (cleaner) based on the retail indextuple deletion patch. This version shows stable behavior on regression tests and pgbench workloads. In this version: 1. Only AccessShareLock is acquired during cleanup of heap and index relations. 2. An 'aggressive' cleanup strategy is introduced - conditional cleanup locks are not used. 3. Only in-memory blocks are cleaned up. 4. The cleaner calls heap_page_prune() before cleaning up a block. Benchmarks - Two factors were evaluated: performance (tps) and relation bloat. Before each test, some rarefaction of pgbench_accounts was modeled by deleting 10% of tuples in each block. Also, I tested normal and Gaussian distributions of queries on the pgbench_accounts relation. Autovacuum uses default settings. Script: pgbench -i -s 10 psql -c $"DELETE FROM pgbench_accounts WHERE (random() < 0.1);" psql -c $"VACUUM;" psql -c $"CREATE INDEX pgbench_accounts_ext ON public.pgbench_accounts USING btree (abalance);" && pgbench -T 3600 -c 32 -j 8 -M prepared -P 600 NORMAL distribution: average tps = 1045 (cleaner); = 1077 (autovacuum) Relations size at the end of test, MB: pgbench_accounts: 128 (cleaner); 128 (autovacuum) pgbench_branches: 0.1 (cleaner); 2.1 (autovacuum) pgbench_tellers: 0.4 (cleaner); 2.8 (autovacuum) pgbench_accounts_pkey: 21 (cleaner); 43 (autovacuum) pgbench_accounts_ext: 48 (cleaner); 56 (autovacuum) Gaussian distribution: average tps = 213 (cleaner); = 213 (autovacuum) Relations size at the end of test, MB: pgbench_accounts: 128 (cleaner); 128 (autovacuum) pgbench_accounts_ext: 22 (cleaner); 29 (autovacuum) Conclusions --- 1. For retail indextuple deletion purposes I replaced ItemIdSetDead() with ItemIdMarkDead() in the heap_page_prune_execute() operation. As a result, in the case of 100% filling of each relation block we get some bloat of the heap and index. When the blocks already have free space, the cleaner can delay heap and index bloat without a vacuum. 2. The cleaner works fine in the case of skewed access frequency across relation blocks. 3. The cleaner does not cause a decrease in performance. -- Andrey Lepikhov Postgres Professional https://postgrespro.com The Russian Postgres Company >From b465357b8a884cac3b503ad171976d3afd31f574 Mon Sep 17 00:00:00 2001 From: "Andrey V.
Lepikhov" Date: Wed, 5 Sep 2018 10:39:11 +0300 Subject: [PATCH 5/5] Heap and Index cleaner --- .../postgres_fdw/expected/postgres_fdw.out| 30 ++-- contrib/postgres_fdw/sql/postgres_fdw.sql | 4 +- src/backend/access/heap/pruneheap.c | 6 +- src/backend/access/nbtree/nbtree.c| 15 +- src/backend/access/transam/xact.c | 4 + src/backend/catalog/system_views.sql | 11 ++ src/backend/commands/vacuumlazy.c | 44 +++-- src/backend/postmaster/Makefile | 2 +- src/backend/postmaster/pgstat.c | 36 src/backend/postmaster/postmaster.c | 160 +- src/backend/storage/buffer/bufmgr.c | 60 ++- src/backend/storage/buffer/localbuf.c | 24 +++ src/backend/storage/ipc/ipci.c| 3 + src/backend/storage/lmgr/lwlocknames.txt | 1 + src/backend/storage/lmgr/proc.c | 5 +- src/backend/tcop/postgres.c | 12 ++ src/backend/utils/adt/pgstatfuncs.c | 2 + src/backend/utils/hash/Makefile | 2 +- src/backend/utils/init/miscinit.c | 3 +- src/backend/utils/init/postinit.c | 11 +- src/include/commands/progress.h | 12 ++ src/include/commands/vacuum.h | 3 + src/include/pgstat.h | 14 +- src/include/storage/buf_internals.h | 1 + src/include/storage/bufmgr.h | 5 +- src/include/storage/pmsignal.h| 2 + src/test/regress/expected/rules.out | 18 +- src/test/regress/expected/triggers.out| 2 +- src/test/regress/input/constraints.source | 12 +- src/test/regress/output/constraints.source| 34 ++-- src/test/regress/sql/rules.sql| 2 +- src/test/regress/sql/triggers.sql | 2 +- 32 files changed, 458 insertions(+), 84 deletions(-) diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index f5498c62bd..622bea2a7d 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -5496,19 +5496,25 @@ ALTER SERVER loopback OPTIONS (DROP extensions); INSERT INTO ft2 (c1,c2,c3) SELECT id, id % 10, to_char(id, 'FM0') FROM generate_series(2001, 2010) id; EXPLAIN (verbose, costs off) -UPDATE ft2 SET c3 = 'bar' WHERE postgres_fdw_abs(c1) > 2000 RETURNING *;-- can't be pushed down -QUERY PLAN -
Re: [HACKERS] Bug in to_timestamp().
On Wed, Sep 5, 2018 at 3:05 PM Alexander Korotkov wrote: > > On Wed, Sep 5, 2018 at 1:22 AM David G. Johnston > wrote: > > On Mon, Sep 3, 2018 at 2:30 PM, Alexander Korotkov > > wrote: > >> > >> The current version of patch doesn't really distinguish spaces and > >> delimiters in format string in non-FX mode. So, spaces and delimiters > >> are forming single group. For me Oracle behavior is ridiculous at > >> least because it doesn't allow cases when input string exactly matches > >> format string. > >> > >> This one fails: > >> SELECT to_timestamp('2018- -01 02', '- -MM DD') FROM dual > > > > Related to below - since this works today it should continue to work. I > > was under the wrong impression that we followed their behavior today and > > were pondering deviating from it because of its ridiculousness. > > Right, that's just incompatibility with Oracle behavior, not with our > previous behavior. > > >> > The couple of regression tests that change do so for the better. It > >> > would be illuminating to set this up as two patches though, one > >> > introducing all of the new regression tests against the current code and > >> > then a second patch with the changed behavior with only the affected > >> > tests. > >> > >> OK, here you go. 0001-to_timestamp-regression-test-v17.patch > >> introduces changes in regression tests and their output for current > >> master, while 0002-to_timestamp-format-checking-v17.patch contain > >> changes to to_timestamp itself. > >> > > > > From those results the question is how important is it to force the > > following breakage on our users (i.e., introduce FX exact symbol matching): > > > > SELECT to_timestamp('97/Feb/16', 'FXYY:Mon:DD'); > > - to_timestamp > > --- > > - Sun Feb 16 00:00:00 1997 PST > > -(1 row) > > - > > +ERROR: unexpected character "/", expected character ":" > > +HINT: In FX mode, punctuation in the input string must exactly match the > > format string. > > > > There seemed to be some implicit approvals of this breakage some 30 emails > > and 10 months ago but given that this is the only change from a correct > > result to a failure I'd like to officially put it out there for > > opinion/vote gathering. Mine is a -1; though keeping the distinction > > between space and non-alphanumeric characters is expected. > > Do I understand correctly that you're -1 to changes to FX mode, but no > objection to changes in non-FX mode? > Ditto. Regards, Amul Sul.
Re: Bug in ginRedoRecompress that causes opaque data on page to be overrun
On Wed, Sep 5, 2018 at 12:26 PM Alexander Korotkov wrote: > On Wed, Sep 5, 2018 at 1:45 AM R, Siva wrote: > > On Tue, Sep 4, 2018 at 09:16 PM, Alexander Korotkov > > wrote: > > > Do you have a test scenario for reproduction of this issue? We need > > > it to ensure that fix is correct. > > > > Unfortunately, I do not have a way of reproducing this issue. > > So far I have tried a workload consisting of inserts (of the > > same attribute value that is indexed), batch deletes of rows > > and vacuum interleaved with engine crash/restarts. > > Issue reproduction and testing are essential for a bug fix. Remember > last time you reported a GIN bug [1], after issue reproduction it > appeared that we had more things to fix. It's quite clear to me that > if the segment list contains GIN_SEGMENT_INSERT before GIN_SEGMENT_DELETE, > then it might lead to wrong behavior in ginRedoRecompress(). But it's > not yet clear what code path could lead to such a segment > list... I'll explore the code more and probably will come up with some idea. Aha, I've managed to reproduce this. 1. Apply ginRedoRecompress-asserts.patch, which adds assertions to ginRedoRecompress() detecting writes past the opaque data. 2. Setup streaming replication. 3. Execute the following on the master. create or replace function test () returns void as $$ declare i int; begin FOR i IN 1..1000 LOOP insert into test values ('{1}'); end loop; end $$ language plpgsql; create table test (a int[]); insert into test (select '{}'::int[] from generate_series(1,1)); insert into test (select '{1}'::int[] from generate_series(1,10)); create index test_idx on test using gin(a) with (fastupdate = off); delete from test where a = '{}'::int[]; vacuum test; select test(); So, since we managed to reproduce this, I'm going to test and review your fix. -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c index 7a1e94a1d56..301c527cab9 100644 --- a/src/backend/access/gin/ginxlog.c +++ b/src/backend/access/gin/ginxlog.c @@ -276,6 +276,7 @@ ginRedoRecompress(Page page, ginxlogRecompressDataLeaf *data) break; case GIN_SEGMENT_INSERT: +Assert(segptr + newsegsize + szleft < PageGetSpecialPointer(page)); /* make room for the new segment */ memmove(segptr + newsegsize, segptr, szleft); /* copy the new segment in place */ @@ -285,6 +286,8 @@ ginRedoRecompress(Page page, ginxlogRecompressDataLeaf *data) break; case GIN_SEGMENT_REPLACE: +Assert(segptr + newsegsize + (szleft - segsize) < PageGetSpecialPointer(page)); +Assert(segptr + szleft < PageGetSpecialPointer(page)); /* shift the segments that follow */ memmove(segptr + newsegsize, segptr + segsize,
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
(2018/08/30 21:58), Etsuro Fujita wrote: (2018/08/30 20:37), Kyotaro HORIGUCHI wrote: At Fri, 24 Aug 2018 21:45:35 +0900, Etsuro Fujita wrote in<5b7ffdef.6020...@lab.ntt.co.jp> (2018/08/21 11:01), Kyotaro HORIGUCHI wrote: At Tue, 14 Aug 2018 20:49:02 +0900, Etsuro Fujita wrote in<5b72c1ae.8010...@lab.ntt.co.jp> (2018/08/09 22:04), Etsuro Fujita wrote: (2018/08/08 17:30), Kyotaro HORIGUCHI wrote: I spent more time looking at the patch. ISTM that the patch well suppresses the effect of the tuple-descriptor expansion by making changes to code in the planner and executor (and ruleutils.c), but I'm still not sure that the patch is the right direction to go in, because ISTM that expanding the tuple descriptor on the fly might be a wart. The expansion should be safe if the expanded descriptor has the same definitions for base columns and all the extended columns are junk. The junk columns should be ignored by unrelated nodes and they are passed safely as far as ForeignModify passes tuples as is from the underlying ForeignScan to ForeignUpdate/Delete. I'm not sure that would be really safe. Does that work well with EvalPlanQual, for example? I was wrong here; I assumed here that we supported late locking for an UPDATE or DELETE on a foreign table, and I was a bit concerned that the approach you proposed might not work well with EvalPlanQual, but as described in fdwhandler.sgml, the core doesn't provide support for that: For an UPDATE or DELETE on a foreign table, it is recommended that the ForeignScan operation on the target table perform early locking on the rows that it fetches, perhaps via the equivalent of SELECT FOR UPDATE. An FDW can detect whether a table is an UPDATE/DELETE target at plan time by comparing its relid to root->parse->resultRelation, or at execution time by using ExecRelationIsTargetRelation(). An alternative possibility is to perform late locking within the ExecForeignUpdate or ExecForeignDelete callback, but no special support is provided for this. So, there would be no need to worry about EvalPlanQual. Sorry for the noise. Best regards, Etsuro Fujita
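A minimal sketch of the plan-time check described in the quoted documentation; the function name and placement are illustrative, not from any particular FDW:

/* Inside an FDW's planning callbacks: detect whether this foreign
 * table is the target of the UPDATE/DELETE being planned. */
static bool
rel_is_modify_target(PlannerInfo *root, RelOptInfo *baserel)
{
    /* baserel->relid is a range-table index, directly comparable
     * to root->parse->resultRelation as the docs suggest. */
    return baserel->relid == root->parse->resultRelation;
}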
Re: Collation versioning
Re: Thomas Munro 2018-09-04 > I was reminded about that by recent news > about an upcoming glibc/CLDR resync that is likely to affect > PostgreSQL users (though, I guess, probably only when they do a major > OS upgrade). Or replicating/restoring a database to a newer host. > Does anyone know > of a way to extract a version string from glibc using existing > interfaces? I heard there was an undocumented way but I haven't been > able to find it -- probably because I was, erm, looking in the > documentation. That sounds more robust. Googling around: https://www.linuxquestions.org/questions/linux-software-2/how-to-check-glibc-version-263103/ #include <gnu/libc-version.h> #include <stdio.h> int main (void) { puts (gnu_get_libc_version ()); return 0; } $ ./a.out 2.27 Hopefully that version info is fine-grained enough. Christoph
PostgreSQL does not start when max_files_per_process > 1200 on Windows 7
Hi, I have hit a bug on Windows 7 with max_files_per_process > 1200. Calling dup(0) in the function count_usable_fds more than 1200 times (0 = stdin) causes further unpredictable errors with file operations. When I open a real file and use its descriptor for the dup, no error occurs. In the patch I open the file postgresql.conf and use its descriptor instead. -- Victor Spirin Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c index 8dd51f1..79054e5 100644 --- a/src/backend/storage/file/fd.c +++ b/src/backend/storage/file/fd.c @@ -828,6 +828,15 @@ count_usable_fds(int max_to_probe, int *usable_fds, int *already_open) ereport(WARNING, (errmsg("getrlimit failed: %m"))); #endif /* HAVE_GETRLIMIT */ + int fd_test = 0; +#ifdef WIN32 + /* +* we have error on Windows7 with max_files_per_process > 1200 when dup(0) - stdin +* make test on postgresql.conf file +*/ + fd_test = _open(ConfigFileName, _O_RDONLY); + if (fd_test < 0) fd_test = 0; /* if was error then = stdin */ +#endif /* dup until failure or probe limit reached */ for (;;) { @@ -843,7 +852,7 @@ count_usable_fds(int max_to_probe, int *usable_fds, int *already_open) break; #endif - thisfd = dup(0); + thisfd = dup(fd_test); if (thisfd < 0) { /* Expect EMFILE or ENFILE, else it's fishy */ @@ -869,7 +878,9 @@ count_usable_fds(int max_to_probe, int *usable_fds, int *already_open) /* release the files we opened */ for (j = 0; j < used; j++) close(fd[j]); - +#ifdef WIN32 + if (fd_test > 0) _close(fd_test); +#endif pfree(fd); /*
Re: [HACKERS] Bug in to_timestamp().
On Wed, Sep 5, 2018 at 1:22 AM David G. Johnston wrote: > On Mon, Sep 3, 2018 at 2:30 PM, Alexander Korotkov > wrote: >> >> The current version of patch doesn't really distinguish spaces and >> delimiters in format string in non-FX mode. So, spaces and delimiters >> are forming single group. For me Oracle behavior is ridiculous at >> least because it doesn't allow cases when input string exactly matches >> format string. >> >> This one fails: >> SELECT to_timestamp('2018- -01 02', '- -MM DD') FROM dual > > Related to below - since this works today it should continue to work. I was > under the wrong impression that we followed their behavior today and were > pondering deviating from it because of its ridiculousness. Right, that's just incompatibility with Oracle behavior, not with our previous behavior. >> > The couple of regression tests that change do so for the better. It would >> > be illuminating to set this up as two patches though, one introducing all >> > of the new regression tests against the current code and then a second >> > patch with the changed behavior with only the affected tests. >> >> OK, here you go. 0001-to_timestamp-regression-test-v17.patch >> introduces changes in regression tests and their output for current >> master, while 0002-to_timestamp-format-checking-v17.patch contain >> changes to to_timestamp itself. >> > > From those results the question is how important is it to force the following > breakage on our users (i.e., introduce FX exact symbol matching): > > SELECT to_timestamp('97/Feb/16', 'FXYY:Mon:DD'); > - to_timestamp > --- > - Sun Feb 16 00:00:00 1997 PST > -(1 row) > - > +ERROR: unexpected character "/", expected character ":" > +HINT: In FX mode, punctuation in the input string must exactly match the > format string. > > There seemed to be some implicit approvals of this breakage some 30 emails > and 10 months ago but given that this is the only change from a correct > result to a failure I'd like to officially put it out there for opinion/vote > gathering. Mine is a -1; though keeping the distinction between space and > non-alphanumeric characters is expected. Do I understand correctly that you're -1 to changes to FX mode, but no objection to changes in non-FX mode? -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Bug in ginRedoRecompress that causes opaque data on page to be overrun
On Wed, Sep 5, 2018 at 1:45 AM R, Siva wrote: > On Tue, Sep 4, 2018 at 09:16 PM, Alexander Korotkov > wrote: > > Do you have a test scenario for reproduction of this issue? We need > > it to ensure that fix is correct. > > Unfortunately, I do not have a way of reproducing this issue. > So far I have tried a workload consisting of inserts (of the > same attribute value that is indexed), batch deletes of rows > and vacuum interleaved with engine crash/restarts. Issue reproduction and testing are essential for a bug fix. Remember last time you reported a GIN bug [1], after issue reproduction it appeared that we had more things to fix. It's quite clear to me that if the segment list contains GIN_SEGMENT_INSERT before GIN_SEGMENT_DELETE, then it might lead to wrong behavior in ginRedoRecompress(). But it's not yet clear what code path could lead to such a segment list... I'll explore the code more and probably will come up with some idea. Links [1] https://www.postgresql.org/message-id/flat/1531867212836.63354%40amazon.com -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: A strange GiST error message or fillfactor of GiST build
> On 5 Sep 2018, at 13:29, Kyotaro HORIGUCHI wrote: > > We don't preserve it for Andrey's use case. Just my 2 cents: that was a hacky use case for development reasons. I think that removing fillfactor is a good idea and your latest patch looks good from my POV. Best regards, Andrey Borodin.
Re: A strange GiST error message or fillfactor of GiST build
At Tue, 4 Sep 2018 13:05:55 +0300, Alexander Korotkov wrote in
> On Tue, Sep 4, 2018 at 12:16 PM Kyotaro HORIGUCHI wrote:
> > I agree that fillfactor should be ignored in certain cases
> > (inserting the first tuple into empty pages or something like
> > that).  Even though GiST doesn't need fillfactor at all, it is
> > defined independently from AM instances
>
> What exactly do you mean?  Yes, fillfactor is defined in the StdRdOptions
> struct, which is shared across many access methods.  But some of them
> use fillfactor, while others don't.  For instance, GIN and BRIN don't
> support fillfactor at all.

Yes.  That was with the intent of keeping Andrey's use case, so I proposed
just disabling it rather than removing the code altogether.

> > and it gives some
> > (undocumented) convenience in testing even on GiST.
>
> Do you keep in mind some particular use cases?  If so, could you
> please share them.  It would let me and others understand what
> behavior we need to preserve and why.

I misunderstood the consensus here; I myself don't have a proper use case.
The consensus is: fillfactor is useless for GiST, we don't preserve it for
Andrey's use case, and no one objects to ignoring it.  Is that right?

Well, I propose the first attached patch instead.  It just removes
fillfactor handling from GiST.

Apart from the fillfactor removal, we have an internal error for a
root-split failure caused by too-long tuples, but this should be a
user-facing error message like the one in gistSplit ("index row size %zu
exceeds ...").

=# create table y as select cube(array(SELECT random() as a FROM
   generate_series(1,600))) from generate_series(1,1e3,1);
=# create index on y using gist(cube);
ERROR:  failed to add item to index page in "y_cube_idx"

The attached second patch changes the message in this case:

ERROR:  size of 2 index rows (9640 bytes) exceeds maximum 8152 of the root
page for the index "y_cube_idx"

This was one of my original motivations here, but now it might be a separate
issue.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

>From e5abf14d5d7de4256a7d40a0e5783655a99c2515 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi
Date: Wed, 5 Sep 2018 09:49:58 +0900
Subject: [PATCH 1/2] Ignore fillfactor in GiST

fillfactor for GiST doesn't work as expected, since GiST doesn't perform
sorted bulk insertion to pack pages as tightly as possible.  Therefore we
remove the fillfactor capability from it.  It is still allowed to be set
for backward compatibility, but is just ignored.
---
 doc/src/sgml/ref/create_index.sgml     |  8 +++-
 src/backend/access/common/reloptions.c |  4 ++--
 src/backend/access/gist/gist.c         | 13 ++---
 src/backend/access/gist/gistbuild.c    | 18 +++---
 src/backend/access/gist/gistutil.c     |  5 ++---
 src/include/access/gist_private.h      | 12 +++-
 6 files changed, 23 insertions(+), 37 deletions(-)

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 3c1223b324..e2a77e0d4d 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -390,7 +390,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ]
-     The B-tree, hash, GiST and SP-GiST index methods all accept this parameter:
+     The B-tree, hash and SP-GiST index methods all accept this parameter:
@@ -412,6 +412,12 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ]
+
+     GiST also accepts this parameter just for historical reasons but
+     ignores it.
+
+
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index db84da0678..9c9589a78e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -186,12 +186,12 @@ static relopt_int intRelOpts[] =
 	{
 		{
 			"fillfactor",
-			"Packs gist index pages only to this percentage",
+			"This is ignored but exists for historical reasons",
 			RELOPT_KIND_GIST,
 			ShareUpdateExclusiveLock	/* since it applies only to later
 										 * inserts */
 		},
-		GIST_DEFAULT_FILLFACTOR, GIST_MIN_FILLFACTOR, 100
+		0, 10, 100
 	},
 	{
 		{
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 8a42effdf7..5915ab2cf2 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -172,7 +172,7 @@ gistinsert(Relation r, Datum *values, bool *isnull,
 						 values, isnull, true /* size is currently bogus */ );
 	itup->t_tid = *ht_ctid;
 
-	gistdoinsert(r, itup, 0, giststate);
+	gistdoinsert(r, itup, giststate);
 
 	/* cleanup */
 	MemoryContextSwitchTo(oldCxt);
@@ -212,7 +212,7 @@ gistinsert(Relation r, Datum *values, bool *isnull,
  * Returns 'true' if the page was split, 'false' otherwise.
  */
 bool
-gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
+gistplacetopage(Relation rel, GISTSTATE *giststate,
 				Buffer buffer,
 				IndexTuple *itup, int ntup, OffsetNumber o
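As a usage note, with the first patch applied a fillfactor reloption on a
GiST index should still be accepted but have no effect; a minimal SQL sketch
(hypothetical names, based on the patch's stated intent):

CREATE TABLE gist_test (p point);
-- accepted for backward compatibility, but now ignored for GiST
CREATE INDEX gist_test_idx ON gist_test USING gist (p)
    WITH (fillfactor = 50);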
Re: Bug fix for glibc broke freebsd build in REL_11_STABLE
On 05/09/2018 02:51, Andres Freund wrote:
> My current proposal is thus to add a check that does
>   #if defined(__clang__) && defined(__i386__) && !defined(__SSE2_MATH__)
>   something-that-fails
>   #endif
> in an autoconf test, and have configure complain if that fails.
> Something roughly along the lines of
> "Compiling PostgreSQL with clang, on 32bit x86, requires SSE2 support.
> Use -msse2 or use gcc."

Would this not be invalidated if the bug you have filed gets fixed?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Prevent concurrent DROP SCHEMA when certain objects are being initially created in the namespace
> Something which would be good to have for all those queries is a set of
> isolation tests.  No need for multiple specs, you could just use one
> spec with one session defining all the object types you would like to
> work on.  How did you find this object list?  Did you test all the
> objects available manually?

Attached is the isolation spec file.  I originally was only going to fix the
simple CREATE TYPE scenario, but decided to look up other objects that can
reside in namespaces and ended up finding a handful of others.  I tested
each one manually before and after adding the AccessShareLock acquisition on
the schema.

> I think that line of thought leads to an enormous increase in locking
> overhead, for which we'd get little if any gain in usability.  So my
> inclination is to make an engineering judgment that we won't fix this.

As I was creating this patch, I had similar feelings about the locking
overhead and was curious how others would feel about it as well.

Regards,
Jimmy

On Tue, Sep 4, 2018 at 10:05 PM, Tom Lane wrote:
> Andres Freund writes:
> > On September 4, 2018 9:11:25 PM PDT, Tom Lane wrote:
> >> I think that line of thought leads to an enormous increase in locking
> >> overhead, for which we'd get little if any gain in usability.  So my
> >> inclination is to make an engineering judgment that we won't fix this.
>
> > Haven't we already significantly started down this road, to avoid a lot
> > of the "tuple concurrently updated" type errors?
>
> Not that I'm aware of.  We do not take locks on schemas, nor functions,
> nor any other of the object types I mentioned.
>
> > Would expanding this a bit further really be that noticeable?
>
> Frankly, I think it would be not so much "noticeable" as "disastrous".
> Making the overhead tolerable would require very large compromises
> in coverage, perhaps like "we'll only lock during DDL not DML".
> At which point I'd question why bother.  We've seen no field reports
> (that I can recall offhand, anyway) that trace to not locking these
> objects.
>
> 			regards, tom lane

concurrent-schema-drop.spec
Description: Binary data
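For readers unfamiliar with the isolation tester format Michael suggests, a
minimal sketch of what one session pair might look like (hypothetical names
and steps, not the contents of the attached spec):

setup
{
  CREATE SCHEMA regress_schema;
}

teardown
{
  DROP SCHEMA IF EXISTS regress_schema CASCADE;
}

session "s1"
step "s1_begin"  { BEGIN; }
step "s1_create" { CREATE TYPE regress_schema.ctype AS (a int, b text); }
step "s1_commit" { COMMIT; }

session "s2"
step "s2_drop"   { DROP SCHEMA regress_schema CASCADE; }

# With the patch, s1's CREATE TYPE holds AccessShareLock on the schema until
# commit, so the concurrent DROP SCHEMA blocks instead of leaving an
# orphaned type behind.
permutation "s1_begin" "s1_create" "s2_drop" "s1_commit"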