Re: [HACKERS] segfault in 9.5alpha - plpgsql function, implicit cast and IMMUTABLE cast function
On Fri, Jul 17, 2015 at 11:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: Geoff Winkless pgsqlad...@geoff.dj writes: While doing some testing of 9.5a one of my colleagues (not on list) found a reproducible server segfault. Hm, looks like commit 1345cc67bbb014209714af32b5681b1e11eaf964 is to blame: memory management for the plpgsql cast cache needs to be more complicated than I realized :-(. And this issue is already fixed by 0fc94a5b. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BRIN index and aborted transaction
Tatsuo Ishii wrote: When a transaction aborts, it seems a BRIN index leaves summary data which is not valid any more. Is this an expected behavior? I guess the answer is yes, because it does not affect correctness of a query result, but I would like to make sure.

You're right, that is not rolled back (just like any other index type, actually).

Second question is when the wrong summary data is gone? It seems vacuum does not help. Do I have to recreate the index (or reindex)?

Yeah, that's a bit of an open problem: we don't have any mechanism to mark a block range as needing resummarization, yet. I don't have any great ideas there, TBH. Some options that were discussed but never led anywhere:

1. Whenever a heap tuple is deleted that's the minimum or maximum for a column, mark the index tuple as needing resummarization. On a future vacuuming pass the index would be updated. (I think this works for minmax, but I don't see how to apply it to inclusion.)
2. Have block ranges be resummarized randomly during vacuum.
3. Have index tuples last for only X number of transactions, marking them as needing resummarization when that expires.
4. Have a user-invoked function that re-runs summarization. That way the user can implement any of the above policies, or others.

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
[HACKERS] Re: [COMMITTERS] pgsql: Retain comments on indexes and constraints at ALTER TABLE ... TY
Heikki Linnakangas wrote: On 07/17/2015 05:40 PM, Michael Paquier wrote: On Fri, Jul 17, 2015 at 11:16 PM, Kevin Grittner kgri...@ymail.com wrote: Heikki Linnakangas heikki.linnakan...@iki.fi wrote: This fixes bug #13126, reported by Kirill Simonov. It looks like you missed something with the addition of AT_ReAddComment: test_ddl_deparse.c:80:11: warning: enumeration value 'AT_ReAddComment' not handled in switch [-Wswitch] switch (subcmd->subtype) ^ Oops. If someone could pick up the attached (backpatch to 9.5 needed)...

Hmm, that function is pretty fragile; it will segfault on any AT_* type that it doesn't recognize. Thankfully you get that compiler warning, but we have added AT_* type codes before in minor releases.

Yeah, that module was put together in a bit of a rush when I decided to remove the JSON deparsing part of the DDL patch.

I couldn't quite figure out what the purpose of that module is, as there is no documentation or README or file-header comments on it.

Well, since it's in src/test/modules I thought it was clear that the intention is just to be able to test the pg_ddl_command type -- obviously not. I will add a README or something.

If it's there just so you can run the regression tests that come with it, it might make sense to just add a default case to that switch to handle any unrecognized commands, and perhaps even remove the cases for the currently untested subcommands, as it's just dead code.

Well, I would prefer to have an output that says "unrecognized" and then add more test cases to the SQL files so that there's not so much dead code. I prefer that to removing the C support code, because then as we add extra tests we don't need to modify the C source.

If it's supposed to act like a sample that you can copy-paste and modify, then perhaps that would still be the best option, and add a comment there explaining that it only cares about the most common subtypes but you can add handlers for others if necessary.
I wasn't thinking of having it be useful for copy-paste. My longer-term plan is to have the JSON deparsing extension live in src/extensions. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] pgbench stats per script and other stuff
Oops, as usual I forgot something... This v2 removes old stats code that had been left in comments and simplifies the logic when counting lag times, as they are now taken into account at the end of the transaction instead of at the beginning.

This patch adds per-script statistics and other improvements to pgbench. Rationale: Josh asked for the per-script stats :-)

Some restructuring is done so that all stats (-l --aggregate-interval --progress --per-script-stats, latency lag...) share the same structures and functions to accumulate data. This greatly limits the growth of pgbench from this patch (+17 lines).

In passing, remove the distinction between internal and external scripts. Pgbench just executes scripts; some of them may be internal... As a side effect, all scripts can be combined: pgbench -B -N -S -f ... would execute 4 scripts, 3 of them internal (tpc-b, simple-update, select-only) plus an externally supplied one. Also add a weight option to change the probability of choosing some scripts when several are available.

Hmmm... Not sure that the --per-script-stats option is really useful. The stats could always be shown when several scripts are executed?

sh ./pgbench -T 3 -B -N -w 2 -S -w 7 --per-script-stats starting vacuum...end.
transaction type: multiple scripts scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 duration: 3 s number of transactions actually processed: 3192 latency average: 0.940 ms tps = 1063.756045 (including connections establishing) tps = 1065.412737 (excluding connections establishing) SQL script 0: builtin: TPC-B (sort of) - weight is 1 - 297 transactions (tps = 98.977301) - latency average = 3.001 ms - latency stddev = 1.320 ms SQL script 1: builtin: simple update - weight is 2 - 621 transactions (tps = 206.952539) - latency average = 2.506 ms - latency stddev = 1.194 ms SQL script 2: builtin: select only - weight is 7 - 2274 transactions (tps = 757.826205) - latency average = 0.236 ms - latency stddev = 0.083 ms -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 2517a3a..eb571a8 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -261,6 +261,18 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ benchmarking arguments: variablelist + varlistentry + termoption-B/option/term + termoption--tpc-b/option/term + listitem + para +Built-in TPC-B like test which updates three tables and performs +an insert on a fourth. See below for details. +This is the default if no bench is explicitely specified. + /para + /listitem + /varlistentry + varlistentry termoption-c/option replaceableclients//term @@ -313,8 +325,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ para Read transaction script from replaceablefilename/. See below for details. -option-N/option, option-S/option, and option-f/option -are mutually exclusive. +Multiple option-f/ options are allowed. /para /listitem /varlistentry @@ -404,10 +415,10 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--skip-some-updates/option/term listitem para -Do not update structnamepgbench_tellers/ and -structnamepgbench_branches/. 
-This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Built-in test which updates only one table compared to option-B/. +This avoids update contention on structnamepgbench_tellers/ +and structnamepgbench_branches/, but it makes the test case +even less like TPC-B. /para /listitem /varlistentry @@ -499,9 +510,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ Report the specified scale factor in applicationpgbench/'s output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the structnamepgbench_branches/ table. However, when testing -custom benchmarks (option-f/ option), the scale factor -will be reported as 1 unless this option is used. +rows in the structnamepgbench_branches/ table. +However, when testing only custom benchmarks (option-f/ option), +the scale factor will be reported as 1 unless this option is used. /para /listitem /varlistentry @@ -511,7 +522,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--select-only/option/term listitem para -Perform select-only transactions instead of TPC-B-like test. +Built-in test with select-only transactions. /para /listitem /varlistentry @@ -552,6 +563,20 @@
Re: [HACKERS] BRIN index and aborted transaction
Alvaro, Thank you for the explanation. It's really helpful.

Second question is when the wrong summary data is gone? It seems vacuum does not help. Do I have to recreate the index (or reindex)?

Yeah, that's a bit of an open problem: we don't have any mechanism to mark a block range as needing resummarization, yet. I don't have any great ideas there, TBH. Some options that were discussed but never led anywhere:

1. Whenever a heap tuple is deleted that's the minimum or maximum for a column, mark the index tuple as needing resummarization. On a future vacuuming pass the index would be updated. (I think this works for minmax, but I don't see how to apply it to inclusion.)
2. Have block ranges be resummarized randomly during vacuum.
3. Have index tuples last for only X number of transactions, marking them as needing resummarization when that expires.
4. Have a user-invoked function that re-runs summarization. That way the user can implement any of the above policies, or others.

What about doing resummarization while rechecking the heap data?

Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] [PERFORM] intel s3500 -- hot stuff
On 18/07/2015 12:03, Julien Rouhaud wrote: On 10/12/2014 17:52, Jeff Janes wrote: On Tue, Dec 9, 2014 at 12:43 PM, Bruce Momjian br...@momjian.us mailto:br...@momjian.us wrote: On Mon, Dec 8, 2014 at 03:40:43PM -0600, Merlin Moncure wrote: Did not see consistent measurable gains beyond 256 effective_io_concurrency. Interesting that a setting of '2' (the lowest possible setting with the feature actually working) is pessimal.

Very interesting. When we added a per-tablespace random_page_cost, there was a suggestion that we might want to add a per-tablespace effective_io_concurrency someday:

What I'd really like to see is to have effective_io_concurrency work on other types of scans. It's clearly a barn burner on fast storage and perhaps the default should be something other than '1'. Spinning storage is clearly dead and SSDs seem to really benefit from the posix readahead API.

I haven't played much with SSD, but effective_io_concurrency can be a big win even on spinning disk.

Well, the real question is knowing which blocks to request before actually needing them. With a bitmap scan, that is easy --- I am unclear how to do it for other scans. We already have kernel read-ahead for sequential scans, and any index scan that hits multiple rows will probably already be using a bitmap heap scan.

If the index scan is used to provide ordering as well as selectivity, then it will resist being converted to a bitmap scan. Also it won't convert to a bitmap scan solely to get credit for the use of effective_io_concurrency, as that setting doesn't enter into planning decisions. For a regular index scan, it should be easy to prefetch table blocks for all the tuples that will need to be retrieved based on the current index leaf page, for example. Looking ahead across leaf page boundaries would be harder.
I also think that having effective_io_concurrency for other nodes than bitmap scans would be really great, but for now having a per-tablespace effective_io_concurrency is simpler to implement and will already help, so here's a patch to implement it. I'm also adding it to the next commitfest. I didn't know that the thread must exist on -hackers to be able to add a commitfest entry, so I transferred the thread here. Sorry for the double post. -- Julien Rouhaud http://dalibo.com - http://dalibo.org

diff --git a/doc/src/sgml/ref/create_tablespace.sgml b/doc/src/sgml/ref/create_tablespace.sgml index 5756c3e..cf08408 100644 --- a/doc/src/sgml/ref/create_tablespace.sgml +++ b/doc/src/sgml/ref/create_tablespace.sgml @@ -104,14 +104,15 @@ CREATE TABLESPACE replaceable class=parametertablespace_name/replaceable listitem para A tablespace parameter to be set or reset. Currently, the only -available parameters are varnameseq_page_cost/ and -varnamerandom_page_cost/. Setting either value for a particular -tablespace will override the planner's usual estimate of the cost of -reading pages from tables in that tablespace, as established by -the configuration parameters of the same name (see -xref linkend=guc-seq-page-cost, -xref linkend=guc-random-page-cost). This may be useful if one -tablespace is located on a disk which is faster or slower than the +available parameters are varnameseq_page_cost/, +varnamerandom_page_cost/ and varnameeffective_io_concurrency/. +Setting either value for a particular tablespace will override the +planner's usual estimate of the cost of reading pages from tables in +that tablespace, as established by the configuration parameters of the +same name (see xref linkend=guc-seq-page-cost, +xref linkend=guc-random-page-cost, +xref linkend=guc-effective-io-concurrency). This may be useful if +one tablespace is located on a disk which is faster or slower than the remainder of the I/O subsystem.
/para /listitem diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c index 8176b6a..fb24d74 100644 --- a/src/backend/access/common/reloptions.c +++ b/src/backend/access/common/reloptions.c @@ -232,6 +232,18 @@ static relopt_int intRelOpts[] = }, -1, 64, MAX_KILOBYTES }, + { + { + effective_io_concurrency, + Number of simultaneous requests that can be handled efficiently by the disk subsystem., + RELOPT_KIND_TABLESPACE + }, +#ifdef USE_PREFETCH + 1, 0, MAX_IO_CONCURRENCY +#else + 0, 0, 0 +#endif + }, /* list terminator */ {{NULL}} @@ -1387,7 +1399,8 @@ tablespace_reloptions(Datum reloptions, bool validate) int numoptions; static const relopt_parse_elt tab[] = { {random_page_cost, RELOPT_TYPE_REAL, offsetof(TableSpaceOpts, random_page_cost)}, - {seq_page_cost, RELOPT_TYPE_REAL, offsetof(TableSpaceOpts, seq_page_cost)}
Re: [HACKERS] WAL test/verification tool
On Sat, Jul 18, 2015 at 8:28 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Thomas Munro wrote: I have heard rumours of a tool that could verify or compare the effects of applying WAL records for testing/development purposes, but I've been unable to track it down or find out if it was publicly released. Does anyone know the status of that or what it was called? http://www.postgresql.org/message-id/cab7npqt6k-weggl0sg00ql2tctvn3n4taibyboni_jrf4cy...@mail.gmail.com Yes, and unfortunately I have no plans to work on the page masking facility (to be able to compare two pages correctly by masking fields that may not be consistent at WAL replay) and this patch in the short term. There is another thing I am interested in getting done for 9.6 which will get my focus until September... -- Michael
Re: [HACKERS] pg_resetsysid
On 2015-07-18 02:29, Peter Eisentraut wrote: On 6/14/15 11:29 AM, Petr Jelinek wrote: 0002 - Adds pg_resetsysid utility which changes the system id to a newly generated one. 0003 - Adds -s option to pg_resetxlog to change the system id to the one specified - this is separate from the other one as it can be potentially more dangerous.

Adding a new top-level binary creates a lot of maintenance overhead (e.g., documentation, man page, translations, packaging), and it seems silly to create a new tool whose only purpose is to change one number in one file. If we're going to have this in pg_resetxlog anyway, why not create another option letter that assigns a generated ID? As the documentation says, resetting the system ID clears the WAL, so it's not like this is a completely danger-free situation.

Well, last time I submitted this I did it exactly as you propose, but there was a long discussion about this changing the target audience of pg_resetxlog, and the view was that it would be better as a separate binary from pg_resetxlog. It might be more future-proof to have a more generic binary which can do all the less dangerous work that pg_resetxlog does (which currently is probably only -c and the newly proposed -s). Something like pg_setcontroldata, but that's too long. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
[HACKERS] pg_uuid_t support in contrib/btree_gist
I'd like to add uuid support to contrib/btree_gist. Proposal:

* Add `contrib/btree_gist/btree_uuid.c' with an implementation based on the gbtree_ninfo type description
* Modify `contrib/btree_gist/btree_utils_num.c' to support pg_uuid_t
* Modify `contrib/btree_gist/btree_gist.h' to add a gbt_t_uuid type
* Add `contrib/btree_gist/btree_gist--1.2.sql' with gist_uuid_ops
* Modify `Makefile' to build btree_uuid
* Modify `contrib/btree_gist/btree_gist.control' to set default_version to 1.2
* Regression tests
* I've probably missed something, but that's the basic idea

Question: pg_uuid_t is an opaque type in `src/include/utils/uuid.h'. To put this type in a struct for both a new uuidKEY and the gbtree_ninfo type description support, we need the implementation of the struct that is currently hidden in `src/backend/utils/adt/uuid.c'. We could get around this by using the gbtree_vinfo type description for uuid, but this is not a var type, so that smells a little to me.

Is modifying `src/include/utils/uuid.h' to leak the implementation details of pg_uuid_t from `src/backend/utils/adt/uuid.c' an option? It seems pretty drastic to make that change just for contrib/btree_gist, but the 16-byte char[] representation seems fairly standard to me. I don't know why this would be needed at some point in the future, but another possible implementation could be a hi/lo int64 pair to represent the uuid type (like Java), so leaking the implementation details would potentially make btree_gist a dependency of any such future change.

A third option is to have our own implementation of pg_uuid_t or something similar in btree_gist. Having two independent implementations of pg_uuid_t in postgres and btree_gist smells even worse and opens us up to dependency issues (that may very well be caught by the regression tests).

It's not obvious to me how to proceed on this. (I'm a first-time poster, so please correct any procedures or etiquette I'm missing.)
Re: [HACKERS] object_classes array is broken, again
Robert Haas wrote: The transforms patch seems to have forgotten to add TransformRelationId to object_classes[], much like the RLS patch forgot to add PolicyRelationId in the same place. Fixing this is easy, but ISTM that we need to insert some sort of a guard to prevent people from continuing to forget this, because it's apparently quite easy to do. Perhaps add_object_address should Assert(OidIsValid(object_classes[oclass])).

The problem is that there aren't enough callers of add_object_address: there are many indexes of that array that aren't ever accessed, and so it's not obvious when the array is broken. If we were to put OCLASS_CLASS at the end instead of at the beginning, that would fix the problem by making it immediately obvious when things get broken this way, because the value used in the most common case would shift around every time we add another value. (Of course, we'd have to instruct people to not add new members after the pg_class entry.)

I just tried this, and it's a bit nasty: while it does cause the tests to fail, it's not at all obvious that there's a connection between the failure and object_classes[]. I think we can solve that by adding a comment to ObjectClass. Here's the first hunk in the large regression failure:

*** /pgsql/source/master/src/test/regress/expected/triggers.out 2015-05-22 20:09:28.936186873 +0200 --- /home/alvherre/Code/pgsql/build/master/src/test/regress/results/triggers.out 2015-07-18 17:26:13.
664764070 +0200 *** *** 1429,1437 (4 rows) DROP TABLE city_table CASCADE; - NOTICE: drop cascades to 2 other objects - DETAIL: drop cascades to view city_view - drop cascades to view european_city_view DROP TABLE country_table; -- Test pg_trigger_depth() create table depth_a (id int not null primary key); --- 1429,1434 -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services

diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c index c1212e9..0107c53 100644 --- a/src/backend/catalog/dependency.c +++ b/src/backend/catalog/dependency.c @@ -127,7 +127,6 @@ typedef struct * See also getObjectClass(). */ static const Oid object_classes[MAX_OCLASS] = { - RelationRelationId, /* OCLASS_CLASS */ ProcedureRelationId, /* OCLASS_PROC */ TypeRelationId,/* OCLASS_TYPE */ CastRelationId,/* OCLASS_CAST */ @@ -158,7 +157,9 @@ static const Oid object_classes[MAX_OCLASS] = { DefaultAclRelationId, /* OCLASS_DEFACL */ ExtensionRelationId, /* OCLASS_EXTENSION */ EventTriggerRelationId, /* OCLASS_EVENT_TRIGGER */ - PolicyRelationId /* OCLASS_POLICY */ + PolicyRelationId, /* OCLASS_POLICY */ + TransformRelationId, /* OCLASS_TRANSFORM */ + RelationRelationId /* OCLASS_CLASS */ };

diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h index 5da18c2..6f4802d 100644 --- a/src/include/catalog/dependency.h +++ b/src/include/catalog/dependency.h @@ -112,11 +112,10 @@ typedef struct ObjectAddresses ObjectAddresses; /* * This enum covers all system catalogs whose OIDs can appear in - * pg_depend.classId or pg_shdepend.classId. + * pg_depend.classId or pg_shdepend.classId. See also object_classes[].
*/ typedef enum ObjectClass { - OCLASS_CLASS,/* pg_class */ OCLASS_PROC,/* pg_proc */ OCLASS_TYPE,/* pg_type */ OCLASS_CAST,/* pg_cast */ @@ -149,6 +148,11 @@ typedef enum ObjectClass OCLASS_EVENT_TRIGGER, /* pg_event_trigger */ OCLASS_POLICY,/* pg_policy */ OCLASS_TRANSFORM, /* pg_transform */ + /* + * Keep this previous-to-last, see + * https://www.postgresql.org/message-id/ + */ + OCLASS_CLASS,/* pg_class */ MAX_OCLASS /* MUST BE LAST */ } ObjectClass;
Re: [HACKERS] pg_uuid_t support in contrib/btree_gist
Jarred Ward jar...@webriots.com writes: pg_uuid_t is an opaque type in `src/include/utils/uuid.h'. To put this type in a struct for both a new uuidKEY and the gbtree_ninfo type description support we need the implementation of the struct that is currently hidden in `src/backend/utils/adt/uuid.c'. Yeah, I'd just move it into uuid.h. There's about 0 chance of needing to change it, and we haven't hesitated to expose the internals of many other data types in their respective headers. regards, tom lane
Re: [HACKERS] pg_resetsysid
On 7/18/15 9:42 AM, Petr Jelinek wrote: Well, last time I submitted this I did it exactly as you propose, but there was a long discussion about this changing the target audience of pg_resetxlog and the view was that it would be better as a separate binary from pg_resetxlog.

In my reading of the thread, I did not get the sense that that was the consensus. There were certainly a lot of different opinions, but specifically some people ended up withdrawing their objections to using pg_resetxlog.

It might be more future-proof to have a more generic binary which can do all the less dangerous work that pg_resetxlog does (which currently is probably only -c and the newly proposed -s).

I don't buy the more or less dangerous argument. Many tools can be dangerous. cp can be dangerous if you overwrite the wrong file. pg_restore can be dangerous if you give it the wrong options. Changing the system ID is also dangerous, as it can break replication and truncate the WAL.

Right now, changing the system ID is an obscure step in some obscure workflow related to some obscure feature. That is not to say it's invalid, but it's not something that we can present to a normal user as the official workflow. Just adding little tools of the nature "whack this around until it's in the right shape for this other thing" is just adding complications on top of complications. If we want to turn this into a less dangerous and more user-facing feature, I would like to see a complete workflow of how this would be used. Maybe we'll come up with a better solution. For example, why couldn't pg_basebackup take care of this?

Something like pg_setcontroldata, but that's too long.

Well, there is nothing so far saying that pg_controldata couldn't also write to pg_control. It's not called pg_getcontroldata. ;-)