Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
On Tue, Jan 27, 2015 at 6:08 PM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: Anyway, I think it's reasonably clear now that pgaudit is unlikely to make it into 9.5 in any form, so I'll find something else to do. Well, I am marking this patch as rejected then... Let's continue the discussion in any case. Perhaps we could reach a clearer spec about what we want and how to shape it. -- Michael
Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory
On Thu, Feb 12, 2015 at 08:52:56AM -0500, Robert Haas wrote: BTW, I'm not all that thrilled with the deserialized object terminology. I found myself repeatedly tripping up on which form was serialized and which de-. If anyone's got a better naming idea I'm willing to adopt it. My first thought is that we should form some kind of TOAST-like backronym, like Serialization Avoidance Loading and Access Device (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I don't think there is anything per se wrong with the terms serialization and deserialization; indeed, I used the same ones in the parallel-mode stuff. But they are fairly general terms, so it might be nice to have something more specific that applies just to this particular usage. The words that sprung to mind for me were: packed/unpacked. Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ He who writes carelessly confesses thereby at the very outset that he does not attach much importance to his own thoughts. -- Arthur Schopenhauer
Re: [HACKERS] pg_rewind in contrib
On Mon, Jan 19, 2015 at 2:50 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Mon, Jan 19, 2015 at 2:38 PM, Michael Paquier michael.paqu...@gmail.com wrote: Heikki Linnakangas wrote: Addressed most of your comments, and Michael's. Another version attached. Extra thing: src/bin/pg_rewind/Makefile surely forgets to clean up the stuff from the regression tests:

+clean distclean maintainer-clean:
+	rm -f pg_rewind$(X) $(OBJS) xlogreader.c

Heikki, what's your status here? -- Michael
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0
On Tue, Feb 10, 2015 at 9:21 AM, Peter Geoghegan p...@heroku.com wrote: * There was some squashing of commits, since Andres felt that they weren't all useful as separate commits. I've still split out the RTE permissions commit, as well as the RLS commit (plus the documentation and test commits, FWIW). I hope that this will make it easier to review parts of the patch, without being generally excessive. When documentation and tests are left out, the entire patch series is left at: Patch moved to next CF. -- Michael
Re: [HACKERS] multiple backends attempting to wait for pincount 1
On 2015-02-13 00:27:04 -0500, Tom Lane wrote: Two different CLOBBER_CACHE_ALWAYS critters recently reported exactly the same failure pattern on HEAD: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=markhordt=2015-02-06%2011%3A59%3A59 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tickdt=2015-02-12%2010%3A22%3A57 Those are rather strange, yea. Unfortunately both report a relatively large number of changes since the last run... I'd say we have a problem. I'd even go so far as to say that somebody has completely broken locking, because this looks like autovacuum and manual vacuuming are hitting the same table at the same time. Hm. It seems likely that that would show up more widely. Oddly enough other CLOBBER_CACHE animals, like http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=jaguarundidt=2015-02-12%2013%3A03%3A00 , that run more frequently have not reported a problem. Neither has leech which IIRC runs on the same system... One avenue to look at is my changes around both buffer pinning and interrupt handling... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
On Fri, Feb 13, 2015 at 6:12 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Fri, Feb 13, 2015 at 4:59 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote: Where are we on this? AFAIK, we have now a feature with no documentation and no example in-core to test those custom routine APIs, hence moved to next CF. Now Hanada-san is working on the example module that use this new infrastructure on top of postgres_fdw. Probably, he will submit the patch within a couple of days, for the upcoming commit fest. I am a bit surprised by that. Are you planning to give up on the ctidscan module module and Sorry I typed the wrong key. So... Are you planning to give up on the ctidscan module and submit only the module written by Hanada-san on top of postgres_fdw? As I imagine that the goal is just to have a test module to run the APIs why would the module submitted by Hanada-san be that necessary? -- Michael
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
On Fri, Feb 13, 2015 at 4:59 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote: Where are we on this? AFAIK, we have now a feature with no documentation and no example in-core to test those custom routine APIs, hence moved to next CF. Now Hanada-san is working on the example module that use this new infrastructure on top of postgres_fdw. Probably, he will submit the patch within a couple of days, for the upcoming commit fest. I am a bit surprised by that. Are you planning to give up on the ctidscan module module and Regarding to the documentation, a consensus was to make up a wikipage to edit the description by everyone, then it shall become source of SGML file. The latest one is here: https://wiki.postgresql.org/wiki/CustomScanInterface OK. This looks like a good base. It would be good to have an actual patch for review as well at this stage. -- Michael
Re: [HACKERS] gcc5: initdb produces gigabytes of _fsm files
What does the ASM look like? It's a fairly quick way to tell whether the fail is optimization or memory corruption. Apologies if I'm explaining how to extract albumen to your elderly relative... On 12 February 2015 at 23:16, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Christoph Berg c...@df7cb.de writes: gcc5 is lurking in Debian experimental, and it's breaking initdb. Yeah, I just heard the same about Red Hat as well: https://bugzilla.redhat.com/show_bug.cgi?id=1190978 Not clear if it's an outright compiler bug or they've just found some creative new way to make an optimization assumption we're violating. Apparently, it's the former. See https://bugzilla.redhat.com/show_bug.cgi?id=1190978#c3 I will be unamused if the gcc boys try to make an argument that they did some valid optimization here. regards, tom lane
Re: [HACKERS] Patch to support SEMI and ANTI join removal
On 2015-02-13 17:06:14 +0900, Michael Paquier wrote: On Fri, Feb 13, 2015 at 4:57 PM, Marko Tiikkaja ma...@joh.to wrote: On 2/13/15 8:52 AM, Michael Paquier wrote: On Sun, Nov 23, 2014 at 8:23 PM, David Rowley dgrowle...@gmail.com wrote: As the patch stands there's still a couple of FIXMEs in there, so there's still a bit of work to do yet. Comments are welcome Hm, if there is still work to do, we may as well mark this patch as rejected as-is, also because it stands in this state for a couple of months. I didn't bring this up before, but I'm pretty sure this patch should be marked returned with feedback. From what I've understood, rejected means we don't want this thing, not in this form or any other. That doesn't seem to be the case for this patch, nor for a few others marked rejected in the currently in-progress commit fest. In the new CF app, marking a patch as returned with feedback adds it automatically to the next commit fest. And note that it is actually what I did for now to move on to the next CF in the doubt: https://commitfest.postgresql.org/3/27/ But if nothing is done, we should as well mark it as rejected. Not based on the fact that it is rejected based on its content, but to not bloat the CF app with entries that have no activity for months. Then the CF app needs to be fixed. Marking patches as rejected on these grounds is a bad idea. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [PATCH] add ssl_protocols configuration option
On Thu, Jan 15, 2015 at 4:17 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Mon, Dec 15, 2014 at 11:15 PM, Alex Shulgin a...@commandprompt.com wrote: Michael Paquier michael.paqu...@gmail.com writes: Perhaps ssloptions.[ch], unless you plan to add non-option-related code there later? I don't think anything else than common options-related code fits in there, so ssloptions.c makes sense to me. BTW, there is no Regent code in your openssl.c, so the copyright statement is incorrect. Good catch, I just blindly copied that from some other file. There have been arguments in favor and against this patch... But marking it as returned with feedback because of a lack of activity in the last couple of weeks. Feel free to update if you think that's not correct, and please move this patch to commit fest 2014-12. Attached is a new version that addresses the earlier feedback: renamed the added *.[ch] files and removed incorrect copyright line. I'm moving this to the current CF. This patch has been at a status quo since the beginning of this CF, and the arguments stand the same. If there is nothing happening maybe it would be better to mark it as returned with feedback? Thoughts? I am not sure where we are moving on here, and if anybody cares much about this patch... Hence marked as rejected. Feel free to complain... -- Michael
Re: [HACKERS] assessing parallel-safety
On Fri, Feb 13, 2015 at 2:40 AM, Robert Haas robertmh...@gmail.com wrote: On Thu, Feb 12, 2015 at 3:52 PM, Amit Kapila amit.kapil...@gmail.com wrote: Probably not, because many queries will scan multiple relations, and we want to do all of this work just once per query. By this, do you mean to say that if there is any parallel-unsafe expression (function call) in query, then we won't parallelize any part of query, if so why is that mandatory? Because of stuff like set_config() and txid_current(), which will fail outright in parallel mode. Also because the user may have defined a function that updates some other table in the database, which will also fail outright if called in parallel mode. Instead of failing, we want those kinds of things to fall back to a non-parallel plan. Can't we parallelize scan on a particular relation if all the expressions in which that relation is involved are parallel-safe No, because the execution of that node can be interleaved in arbitrary fashion with all the other nodes in the plan tree. Once we've got parallel mode active, all the related prohibitions apply to *everything* we do thereafter, not just that one node. Okay, got the point. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] SSL information view
On Fri, Feb 13, 2015 at 9:07 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Tue, Feb 3, 2015 at 9:36 PM, Magnus Hagander mag...@hagander.net wrote: On Tue, Feb 3, 2015 at 6:42 AM, Michael Paquier michael.paqu...@gmail.com wrote: Where are we on this patch? No new version has been provided and there have been comments provided by Heikki here (5491e547.4040...@vmware.com) and by Alexei here (87ppbqz00h@commandprompt.com). Yeah, it's on my shame list. I'm definitely planning to get a new version in before the next CF and to try to work quicker with it that time to finish it off on time. Thanks for the reminder! Magnus, are you planning to work on this item of your shame list soon? Could you clarify the status of this patch? I do, and I hope to work on it over the next week or two. However, feel free to bump it to the next CF -- I'll pick it up halfway in between if necessary. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
Where are we on this? AFAIK, we have now a feature with no documentation and no example in-core to test those custom routine APIs, hence moved to next CF. Now Hanada-san is working on the example module that uses this new infrastructure on top of postgres_fdw. Probably, he will submit the patch within a couple of days, for the upcoming commit fest. Regarding the documentation, a consensus was to set up a wiki page so everyone can edit the description; it shall then become the source of the SGML file. The latest one is here: https://wiki.postgresql.org/wiki/CustomScanInterface Anyway, the next commit fest shall start within a couple of days. I'd like to have discussion for the feature. Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei kai...@ak.jp.nec.com -Original Message- From: Michael Paquier [mailto:michael.paqu...@gmail.com] Sent: Friday, February 13, 2015 4:38 PM To: Kaigai Kouhei(海外 浩平) Cc: Robert Haas; Tom Lane; pgsql-hackers@postgreSQL.org; Shigeru Hanada Subject: ##freemail## Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API) On Thu, Jan 15, 2015 at 8:02 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote: On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote: When custom-scan node replaced a join-plan, it shall have at least two child plan-nodes. The callback handler of PlanCustomPath needs to be able to call create_plan_recurse() to transform the underlying path-nodes to plan-nodes, because this custom-scan node may take other built-in scan or sub-join nodes as its inner/outer input. In case of FDW, it shall kick any underlying scan relations to remote side, thus we may not expect ForeignScan has underlying plans... Do you have an example of this? Yes, even though full code set is too large for patch submission... https://github.com/pg-strom/devel/blob/master/src/gpuhashjoin.c#L1880 This create_gpuhashjoin_plan() is PlanCustomPath callback of GpuHashJoin.
It takes GpuHashJoinPath inherited from CustomPath that has multiple underlying scan/join paths. Once it is called back from the backend, it also calls create_plan_recurse() to make inner/outer plan nodes according to the paths. As a result, we can see the following query execution plan, in which CustomScan takes underlying scan plans.

postgres=# EXPLAIN SELECT * FROM t0 NATURAL JOIN t1 NATURAL JOIN t2;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Custom Scan (GpuHashJoin)  (cost=2968.00..140120.31 rows=3970922 width=143)
   Hash clause 1: (aid = aid)
   Hash clause 2: (bid = bid)
   Bulkload: On
   ->  Custom Scan (GpuScan) on t0  (cost=500.00..57643.00 rows=409 width=77)
   ->  Custom Scan (MultiHash)  (cost=734.00..734.00 rows=4 width=37)
         hash keys: aid
         nBatches: 1  Buckets: 46000  Memory Usage: 99.99%
         ->  Seq Scan on t1  (cost=0.00..734.00 rows=4 width=37)
   ->  Custom Scan (MultiHash)  (cost=734.00..734.00 rows=4 width=37)
         hash keys: bid
         nBatches: 1  Buckets: 46000  Memory Usage: 49.99%
         ->  Seq Scan on t2  (cost=0.00..734.00 rows=4 width=37)
(13 rows)

Where are we on this? AFAIK, we have now a feature with no documentation and no example in-core to test those custom routine APIs, hence moved to next CF. -- Michael
Re: [HACKERS] SSL information view
On Tue, Feb 3, 2015 at 9:36 PM, Magnus Hagander mag...@hagander.net wrote: On Tue, Feb 3, 2015 at 6:42 AM, Michael Paquier michael.paqu...@gmail.com wrote: Where are we on this patch? No new version has been provided and there have been comments provided by Heikki here (5491e547.4040...@vmware.com) and by Alexei here (87ppbqz00h@commandprompt.com). Yeah, it's on my shame list. I'm definitely planning to get a new version in before the next CF and to try to work quicker with it that time to finish it off on time. Thanks for the reminder! Magnus, are you planning to work on this item of your shame list soon? Could you clarify the status of this patch? -- Michael
Re: [HACKERS] Patch to support SEMI and ANTI join removal
On Fri, Feb 13, 2015 at 4:57 PM, Marko Tiikkaja ma...@joh.to wrote: On 2/13/15 8:52 AM, Michael Paquier wrote: On Sun, Nov 23, 2014 at 8:23 PM, David Rowley dgrowle...@gmail.com wrote: As the patch stands there's still a couple of FIXMEs in there, so there's still a bit of work to do yet. Comments are welcome Hm, if there is still work to do, we may as well mark this patch as rejected as-is, also because it stands in this state for a couple of months. I didn't bring this up before, but I'm pretty sure this patch should be marked returned with feedback. From what I've understood, rejected means we don't want this thing, not in this form or any other. That doesn't seem to be the case for this patch, nor for a few others marked rejected in the currently in-progress commit fest. In the new CF app, marking a patch as returned with feedback adds it automatically to the next commit fest. And note that it is actually what I did for now to move on to the next CF in the doubt: https://commitfest.postgresql.org/3/27/ But if nothing is done, we should as well mark it as rejected. Not based on the fact that it is rejected based on its content, but to not bloat the CF app with entries that have no activity for months. -- Michael
Re: [HACKERS] Patch to support SEMI and ANTI join removal
On Fri, Feb 13, 2015 at 5:12 PM, Andres Freund and...@2ndquadrant.com wrote: On 2015-02-13 17:06:14 +0900, Michael Paquier wrote: On Fri, Feb 13, 2015 at 4:57 PM, Marko Tiikkaja ma...@joh.to wrote: On 2/13/15 8:52 AM, Michael Paquier wrote: On Sun, Nov 23, 2014 at 8:23 PM, David Rowley dgrowle...@gmail.com wrote: As the patch stands there's still a couple of FIXMEs in there, so there's still a bit of work to do yet. Comments are welcome Hm, if there is still work to do, we may as well mark this patch as rejected as-is, also because it stands in this state for a couple of months. I didn't bring this up before, but I'm pretty sure this patch should be marked returned with feedback. From what I've understood, rejected means we don't want this thing, not in this form or any other. That doesn't seem to be the case for this patch, nor for a few others marked rejected in the currently in-progress commit fest. In the new CF app, marking a patch as returned this feedback adds it automatically to the next commit fest. And note that it is actually what I did for now to move on to the next CF in the doubt: https://commitfest.postgresql.org/3/27/ But if nothing is done, we should as well mark it as rejected. Not based on the fact that it is rejected based on its content, but to not bloat the CF app with entries that have no activity for months. Then the CF app needs to be fixed. Marking patches as rejected on these grounds is a bad idea. Yup, definitely the term is incorrect. We need Returned with feedback but please do not add it to the next CF dear CF app. -- Michael
Re: [HACKERS] Review of GetUserId() Usage
On Fri, Dec 5, 2014 at 11:28 PM, Stephen Frost sfr...@snowman.net wrote: * Stephen Frost (sfr...@snowman.net) wrote: * Stephen Frost (sfr...@snowman.net) wrote: 3. It messes around with pg_signal_backend(). There are currently no cases in which pg_signal_backend() throws an error, which is good, because it lets you write queries against pg_stat_activity() that don't fail halfway through, even if you are missing permissions on some things. This patch introduces such a case, which is bad. Good point, I'll move that check up into the other functions, which will allow for a more descriptive error as well. Err, I'm missing something here, as pg_signal_backend() is a misc.c static internal function? How would you be calling it from a query against pg_stat_activity()? I'm fine making the change anyway, just curious.. Updated patch attached which moves the ereport() out of pg_signal_backend() and into its callers, as the other permissions checks are done, and includes the documentation changes. The error messages are minimally changed to match the new behavior. Moving to next CF, this patch did not get reviews. -- Michael
[HACKERS] restrict global access to be readonly
Hi PG Developers, I didn’t find any convenient way to restrict access to PostgreSQL databases to be read-only for all users. I need it in the following scenarios: A) Planned switch-over from master to slave. We want to minimize the impact of planned switch-overs. While we switch from master to slave, we would like to allow read-only transactions to run on the original master until the switch-over completes and the new master starts taking user connections (we do the switch with a virtual IP mechanism). I didn’t find a way to do this on the database server side. Sure, we can utilize the runtime parameter default_transaction_read_only; however, it does not restrict users from changing the transaction attribute to non-read-only mode, so it is not safe. B) Blocking write access when a storage constraint is reached. We have massive PostgreSQL instances which are sold to external users with specific storage constraints and prices. When the storage constraint for a specific instance is reached, we would rather change the instance to read-only (then notify the user to clean up data or buy more storage) than shut down the instance. Our current solution is putting a recovery.conf file into the instance (killing all existing connections) and restarting the instance to get it into recovery mode (which is read-only), which is not pretty. C) Blocking write access when an instance has expired. Similar to B), when a user’s contract with us for his/her instance expires, we want to first block write access rather than shut down the instance completely. Having said that, it would be very nice if there were a command like “SET GLOBAL_ACCESS TO READONLY | READWRITE” which does the job for the whole instance. I guess there could be others who want this feature too. So, has anyone considered or discussed adding such a command? Is there anyone working on it (I would like to work on it if not)? Sincerely, Guangzhou
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
Sorry I typed the wrong key. So... Are you planning to give up on the ctidscan module and submit only the module written by Hanada-san on top of postgres_fdw? As I imagine that the goal is just to have a test module to run the APIs why would the module submitted by Hanada-san be that necessary? No. The ctidscan module is a reference implementation towards the existing custom-scan interface that just supports relation scan with own way, but no support for relations join at this moment. The upcoming enhancement to postgres_fdw will support remote join, that looks like a scan on pseudo materialized relation on local side. It is the proof of the concept to the new interface I like to discuss in this thread. Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei kai...@ak.jp.nec.com -Original Message- From: Michael Paquier [mailto:michael.paqu...@gmail.com] Sent: Friday, February 13, 2015 6:17 PM To: Kaigai Kouhei(海外 浩平) Cc: Robert Haas; Tom Lane; pgsql-hackers@postgreSQL.org; Shigeru Hanada Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API) On Fri, Feb 13, 2015 at 6:12 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Fri, Feb 13, 2015 at 4:59 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote: Where are we on this? AFAIK, we have now a feature with no documentation and no example in-core to test those custom routine APIs, hence moved to next CF. Now Hanada-san is working on the example module that use this new infrastructure on top of postgres_fdw. Probably, he will submit the patch within a couple of days, for the upcoming commit fest. I am a bit surprised by that. Are you planning to give up on the ctidscan module module and Sorry I typed the wrong key. So... Are you planning to give up on the ctidscan module and submit only the module written by Hanada-san on top of postgres_fdw? As I imagine that the goal is just to have a test module to run the APIs why would the module submitted by Hanada-san be that necessary? 
-- Michael
[HACKERS] question on Postgres smart shutdown mode
Hi All, I have a question on PG smart shutdown mode. When shutting down Postgres by issuing a smart shutdown (SIGTERM) request, is there a way for the client to be notified of this shutdown event? I tried PG_NOTIFY, but I cannot get any notification events when this happens. BTW, I am relatively new to Postgres. Thanks, Bo
Re: [HACKERS] BRIN range operator class
On Thu, Feb 12, 2015 at 3:34 AM, Emre Hasegeli e...@hasegeli.com wrote: Thank you for looking at my patch again. New version is attached with a lot of changes and point data type support. Patch is moved to next CF 2015-02 as work is still going on. -- Michael
Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates
On Thu, Jan 29, 2015 at 8:28 AM, Peter Geoghegan p...@heroku.com wrote: On Mon, Jan 26, 2015 at 11:21 PM, Andreas Karlsson andr...@proxel.se wrote: Do you also think the SQL functions should be named numeric_int128_sum, numeric_int128_avg, etc? Some quick review comments. These apply to int128-agg-v5.patch. Andreas, are you planning to continue working on this patch? Peter has provided some comments that are still unanswered. -- Michael
[HACKERS] Refactoring GUC unit conversions
In the redesign checkpoint_segments patch, Robert suggested keeping the settings' base unit as number of segments, but allow conversions from MB, GB etc. I started looking into that and found that adding a new unit to guc.c is quite messy. The conversions are done with complicated if-switch-case constructs. Attached is a patch to refactor that, making the conversions table-driven. This doesn't change functionality, it just makes the code nicer. Any objections?

- Heikki

From a053c61c333687224d33a18a2a299c4dc2eb6bfe Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 13 Feb 2015 15:24:50 +0200
Subject: [PATCH 1/1] Refactor unit conversions code in guc.c.

Replace the if-switch-case constructs with two conversion tables,
containing all the supported conversions between human-readable unit
strings and the base units used in GUC variables. This makes the code
easier to read, and makes adding new units simpler.
---
 src/backend/utils/misc/guc.c | 425 +++
 src/include/utils/guc.h      |   2 +
 2 files changed, 188 insertions(+), 239 deletions(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9572777..59e25af 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -97,20 +97,6 @@
 #define CONFIG_EXEC_PARAMS_NEW "global/config_exec_params.new"
 #endif
 
-#define KB_PER_MB (1024)
-#define KB_PER_GB (1024*1024)
-#define KB_PER_TB (1024*1024*1024)
-
-#define MS_PER_S 1000
-#define S_PER_MIN 60
-#define MS_PER_MIN (1000 * 60)
-#define MIN_PER_H 60
-#define S_PER_H (60 * 60)
-#define MS_PER_H (1000 * 60 * 60)
-#define MIN_PER_D (60 * 24)
-#define S_PER_D (60 * 60 * 24)
-#define MS_PER_D (1000 * 60 * 60 * 24)
-
 /*
  * Precision with which REAL type guc values are to be printed for GUC
  * serialization.
@@ -666,6 +652,88 @@ const char *const config_type_names[] =
 	/* PGC_ENUM */ "enum"
 };
 
+/*
+ * Unit conversions tables.
+ *
+ * There are two tables, one for memory units, and another for time units.
+ * For each supported conversion from one unit to another, we have an entry
+ * in the conversion table.
+ *
+ * To keep things simple, and to avoid intermediate-value overflows,
+ * conversions are never chained. There needs to be a direct conversion
+ * between all units.
+ *
+ * The conversions from each base unit must be kept in order from greatest
+ * to smallest unit; convert_from_base_unit() relies on that. (The order of
+ * the base units does not matter.)
+ */
+#define MAX_UNIT_LEN 3	/* length of longest recognized unit string */
+
+typedef struct
+{
+	char	unit[MAX_UNIT_LEN + 1];	/* unit, as a string, like "kB" or "min" */
+	int		base_unit;		/* GUC_UNIT_XXX */
+	int		multiplier;		/* If positive, multiply the value with this for
+							 * unit -> base_unit conversion. If negative,
+							 * divide (with the absolute value) */
+} unit_conversion;
+
+/* Ensure that the constants in the tables don't overflow or underflow */
+#if BLCKSZ < 1024 || BLCKSZ > (1024*1024)
+#error BLCKSZ must be between 1KB and 1MB
+#endif
+#if XLOG_BLCKSZ < 1024 || XLOG_BLCKSZ > (1024*1024)
+#error XLOG_BLCKSZ must be between 1KB and 1MB
+#endif
+
+static const char *memory_units_hint =
+	gettext_noop("Valid units for this parameter are \"kB\", \"MB\", \"GB\", and \"TB\".");
+
+static const unit_conversion memory_unit_conversion_table[] =
+{
+	{ "TB", GUC_UNIT_KB, 1024*1024*1024 },
+	{ "GB", GUC_UNIT_KB, 1024*1024 },
+	{ "MB", GUC_UNIT_KB, 1024 },
+	{ "kB", GUC_UNIT_KB, 1 },
+
+	{ "TB", GUC_UNIT_BLOCKS, (1024*1024*1024) / (BLCKSZ / 1024) },
+	{ "GB", GUC_UNIT_BLOCKS, (1024*1024) / (BLCKSZ / 1024) },
+	{ "MB", GUC_UNIT_BLOCKS, 1024 / (BLCKSZ / 1024) },
+	{ "kB", GUC_UNIT_BLOCKS, -(BLCKSZ / 1024) },
+
+	{ "TB", GUC_UNIT_XBLOCKS, (1024*1024*1024) / (XLOG_BLCKSZ / 1024) },
+	{ "GB", GUC_UNIT_XBLOCKS, (1024*1024) / (XLOG_BLCKSZ / 1024) },
+	{ "MB", GUC_UNIT_XBLOCKS, 1024 / (XLOG_BLCKSZ / 1024) },
+	{ "kB", GUC_UNIT_XBLOCKS, -(XLOG_BLCKSZ / 1024) },
+
+	{ "" }	/* end of table marker */
+};
+
+static const char *time_units_hint =
+	gettext_noop("Valid units for this parameter are \"ms\", \"s\", \"min\", \"h\", and \"d\".");
+
+static const unit_conversion time_unit_conversion_table[] =
+{
+	{ "d", GUC_UNIT_MS, 1000 * 60 * 60 * 24 },
+	{ "h", GUC_UNIT_MS, 1000 * 60 * 60 },
+	{ "min", GUC_UNIT_MS, 1000 * 60 },
+	{ "s", GUC_UNIT_MS, 1000 },
+	{ "ms", GUC_UNIT_MS, 1 },
+
+	{ "d", GUC_UNIT_S, 60 * 60 * 24 },
+	{ "h", GUC_UNIT_S, 60 * 60 },
+	{ "min", GUC_UNIT_S, 60 },
+	{ "s", GUC_UNIT_S, 1 },
+	{ "ms", GUC_UNIT_S, -1000 },
+
+	{ "d", GUC_UNIT_MIN, 60 * 24 },
+	{ "h", GUC_UNIT_MIN, 60 },
+	{ "min", GUC_UNIT_MIN, 1 },
+	{ "s", GUC_UNIT_MIN, -60 },
+	{ "ms", GUC_UNIT_MIN, -1000 * 60 },
+
+	{ "" }	/* end of table marker */
+};
 
 /*
  * Contents of GUC tables
@@ -5018,6 +5086,85 @@ ReportGUCOption(struct config_generic * record)
 }
 
 /*
+ * Convert a value from one of the human-friendly units ("kB", "min" etc.)
+ * to a given base unit. 'value' and 'unit'
Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg
On Thu, Jan 29, 2015 at 7:25 AM, Tomas Vondra tomas.von...@2ndquadrant.com wrote: attached is v9 of the patch, modified along the lines of Tom's comments: Moved this patch to next CF, hopefully it will get more attention, and a reviewer. -- Michael
[HACKERS] anyarray
Some users of the intarray contrib module wish to use its features with other kinds of arrays, not only the int4 type. The suggested module generalizes intarray over other (not all) types of pgsql. Anyarray also provides a calculation of the similarity of two one-dimensional arrays, similar to the smlar module. The anyarray module doesn't provide anything similar to the query_int feature of intarray, because that feature is very hard to implement (it requires a new pseudo-type anyquery), it is close to impossible to have operation extensibility, and its complicated queries are hidden from pgsql's optimizer (like jsquery). As far as I know, query_int isn't very popular for now. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: [HACKERS] Logical Replication Helpers WIP for discussion
On 13/02/15 08:48, Michael Paquier wrote: On Mon, Dec 22, 2014 at 10:26 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Dec 19, 2014 at 8:40 AM, Petr Jelinek p...@2ndquadrant.com wrote: What I hope to get from this is agreement on the general approach and protocol so that we can have a common base which will both make it easier to create external logical replication solutions and also eventually lead to full logical replication inside core PostgreSQL. The protocol is a really important topic which deserves its own discussion. Andres has mentioned some of the ideas he has in mind - which I think are similar to what you did here - but there hasn't really been a thread devoted to discussing that topic specifically. I think that would be a good idea: lay out what you have in mind, and why, and solicit feedback. Looking at this patch, I don't see that we actually gain much here except a decoder plugin that speaks a special protocol for a special background worker that has not been presented yet. What is actually the value of having that defined as a contrib/ module in core? Note that we already have test_decoding to basically test the logical decoding facility, used at least at the SQL level to get logical changes decoded. Based on those reasons I am planning to mark this as rejected (it has no documentation either). So please speak up if you think otherwise, but it seems to me that this could live happily out of core. I think you are missing the point of this; it's not meant to be committed in this form at all, and even less as a contrib module. It was meant as a basis for in-core logical replication discussion, but sadly I didn't really have time to pursue it in this CF in the end. 
-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] compress method for spgist - 2
Now that the input data type and leaf data type can be different, which one is attType? It's the leaf data type, as the patch stands. I renamed that to attLeafType, and went fixing all the references to it. In most places it's just a matter of search and replace, but what about the reconstructed datum? Done. Now there are separate attType and attLeafType, which describe the input/output and leaf types. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ spgist_compress_method-5.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Help me! Why did the salve stop suddenly ?
On Thu, Feb 12, 2015 at 3:27 AM, hailong Li shuhujiaol...@gmail.com wrote: Hi, dear pgsql-hackers Please have a look at https://wiki.postgresql.org/wiki/Guide_to_reporting_problems This is the wrong mailing list for this sort of question, and your report is pretty unclear, so it's hard to tell what might have gone wrong. Please repost to a more appropriate mailing list taking note of the above guidelines, and you may get a more helpful answer. Alternatively, contact a commercial PostgreSQL support company and contract with them for help, as again suggested in the above guide. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC 2015 - mentors, students and admins.
On 12 February 2015 at 14:55, Alexander Korotkov aekorot...@gmail.com wrote: Hi! On Mon, Feb 9, 2015 at 11:52 PM, Thom Brown t...@linux.com wrote: Google Summer of Code 2015 is approaching. I'm intending on registering PostgreSQL again this year. Before I do that, I'd like to have an idea of how many people are interested in being either a student or a mentor. I'm ready to participate as a mentor again! Thanks for the interest from mentors and students so far. Could I interest more of you in volunteering as mentors this year? I'll be registering the project with Google soon, so I'd like an idea of how many mentors we'll have available. Thom
[HACKERS] why does find_my_exec resolve symlinks?
Here is a scenario: ./configure --prefix=/usr/local/pgsql/9.4.1 make make install ln -s 9.4.1 /usr/local/pgsql/9.4 PATH=/usr/local/pgsql/9.4/bin:$PATH And then when 9.4.2 comes out, the symlink is updated. I think this sort of setup in variations is not uncommon. When building other software against that installation, it would use pg_config --includedir, pg_config --libdir, etc., but that points to /usr/local/pgsql/9.4.1/lib instead of /usr/local/pgsql/9.4/lib, because find_my_exec() goes out of its way to resolve symlinks in the returned path. If the other software saves an rpath or the bindir or something during the build, then if I later clear out the old 9.4.1 installation, the other software will break. The reason for this behavior is commit 336969e490d71c316a42fabeccda87f798e562dd Author: Tom Lane t...@sss.pgh.pa.us Date: Sat Nov 6 23:06:29 2004 + Add code to find_my_exec() to resolve a symbolic link down to the actual executable location. This allows people to continue to use setups where, eg, postmaster is symlinked from a convenient place. Per gripe from Josh Berkus. I don't quite understand what setup Josh was using there. Is there a way we can consolidate these situations? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Redesigning checkpoint_segments
On 02/04/2015 11:41 PM, Josh Berkus wrote: On 02/04/2015 12:06 PM, Robert Haas wrote: On Wed, Feb 4, 2015 at 1:05 PM, Josh Berkus j...@agliodbs.com wrote: Let me push max_wal_size and min_wal_size again as our new parameter names, because: * does what it says on the tin * new user friendly * encourages people to express it in MB, not segments * very different from the old name, so people will know it works differently That's not bad. If we added a hard WAL limit in a future release, how would that fit into this naming scheme? Well, first, nobody's at present proposing a patch to add a hard limit, so I'm reluctant to choose non-obvious names to avoid conflict with a feature nobody may ever write. There are a number of reasons a hard limit would be difficult and/or undesirable. If we did add one, I'd suggest calling it wal_size_limit or something similar. However, we're most likely to only implement the limit for archives, which means that it might actually be called archive_buffer_limit or something more to the point. Ok, I don't hear any loud objections to min_wal_size and max_wal_size, so let's go with that then. Attached is a new version of this. It now comes in four patches. The first three are just GUC-related preliminary work, the first of which I posted on a separate thread today. - Heikki From a053c61c333687224d33a18a2a299c4dc2eb6bfe Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas heikki.linnakangas@iki.fi Date: Fri, 13 Feb 2015 15:24:50 +0200 Subject: [PATCH 1/4] Refactor unit conversions code in guc.c. Replace the if-switch-case constructs with two conversion tables, containing all the supported conversions between human-readable unit strings and the base units used in GUC variables. This makes the code easier to read, and makes adding new units simpler. 
---
 src/backend/utils/misc/guc.c | 425 +++
 src/include/utils/guc.h | 2 +
 2 files changed, 188 insertions(+), 239 deletions(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9572777..59e25af 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -97,20 +97,6 @@
 #define CONFIG_EXEC_PARAMS_NEW "global/config_exec_params.new"
 #endif

-#define KB_PER_MB (1024)
-#define KB_PER_GB (1024*1024)
-#define KB_PER_TB (1024*1024*1024)
-
-#define MS_PER_S 1000
-#define S_PER_MIN 60
-#define MS_PER_MIN (1000 * 60)
-#define MIN_PER_H 60
-#define S_PER_H (60 * 60)
-#define MS_PER_H (1000 * 60 * 60)
-#define MIN_PER_D (60 * 24)
-#define S_PER_D (60 * 60 * 24)
-#define MS_PER_D (1000 * 60 * 60 * 24)
-
 /*
  * Precision with which REAL type guc values are to be printed for GUC
  * serialization.
@@ -666,6 +652,88 @@ const char *const config_type_names[] =
 	/* PGC_ENUM */ "enum"
 };

+/*
+ * Unit conversions tables.
+ *
+ * There are two tables, one for memory units, and another for time units.
+ * For each supported conversion from one unit to another, we have an entry
+ * in the conversion table.
+ *
+ * To keep things simple, and to avoid intermediate-value overflows,
+ * conversions are never chained. There needs to be a direct conversion
+ * between all units.
+ *
+ * The conversions from each base unit must be kept in order from greatest
+ * to smallest unit; convert_from_base_unit() relies on that. (The order of
+ * the base units does not matter.)
+ */
+#define MAX_UNIT_LEN 3	/* length of longest recognized unit string */
+
+typedef struct
+{
+	char		unit[MAX_UNIT_LEN + 1];	/* unit, as a string, like "kB" or "min" */
+	int			base_unit;		/* GUC_UNIT_XXX */
+	int			multiplier;		/* If positive, multiply the value with this
+								 * for unit -> base_unit conversion.  If negative,
+								 * divide (with the absolute value) */
+} unit_conversion;
+
+/* Ensure that the constants in the tables don't overflow or underflow */
+#if BLCKSZ < 1024 || BLCKSZ > (1024*1024)
+#error BLCKSZ must be between 1KB and 1MB
+#endif
+#if XLOG_BLCKSZ < 1024 || XLOG_BLCKSZ > (1024*1024)
+#error XLOG_BLCKSZ must be between 1KB and 1MB
+#endif
+
+static const char *memory_units_hint =
+	gettext_noop("Valid units for this parameter are \"kB\", \"MB\", \"GB\", and \"TB\".");
+
+static const unit_conversion memory_unit_conversion_table[] =
+{
+	{ "TB", GUC_UNIT_KB, 1024*1024*1024 },
+	{ "GB", GUC_UNIT_KB, 1024*1024 },
+	{ "MB", GUC_UNIT_KB, 1024 },
+	{ "kB", GUC_UNIT_KB, 1 },
+
+	{ "TB", GUC_UNIT_BLOCKS, (1024*1024*1024) / (BLCKSZ / 1024) },
+	{ "GB", GUC_UNIT_BLOCKS, (1024*1024) / (BLCKSZ / 1024) },
+	{ "MB", GUC_UNIT_BLOCKS, 1024 / (BLCKSZ / 1024) },
+	{ "kB", GUC_UNIT_BLOCKS, -(BLCKSZ / 1024) },
+
+	{ "TB", GUC_UNIT_XBLOCKS, (1024*1024*1024) / (XLOG_BLCKSZ / 1024) },
+	{ "GB", GUC_UNIT_XBLOCKS, (1024*1024) / (XLOG_BLCKSZ / 1024) },
+	{ "MB", GUC_UNIT_XBLOCKS, 1024 / (XLOG_BLCKSZ / 1024) },
+	{ "kB", GUC_UNIT_XBLOCKS, -(XLOG_BLCKSZ / 1024) },
+
+	{ "" }		/* end of table marker */
+};
+
+static const char *time_units_hint =
+	gettext_noop("Valid units for this parameter are \"ms\", \"s\",
Re: [HACKERS] Refactoring GUC unit conversions
On 2/13/15 7:26 AM, Heikki Linnakangas wrote: In the redesign checkpoint_segments patch, Robert suggested keeping the settings' base unit as number of segments, but allow conversions from MB, GB etc. I started looking into that and found that adding a new unit to guc.c is quite messy. The conversions are done with complicated if-switch-case constructs. Attached is a patch to refactor that, making the conversions table-driven. This doesn't change functionality, it just makes the code nicer. Looks good, but shouldn't there be a check for a unit that's neither memory or time? -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup may fail to send feedbacks.
On Tue, Feb 10, 2015 at 7:48 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Hello, The attached patch is a fix for walreceiver not using gettimeofday, and a fix for receivelog using it. XLogWalRcvProcessMsg doesn't send feedback when processing a 'w'=WAL record packet. So the same thing as with pg_basebackup and pg_receivexlog will occur on walsender. Exiting the for(;;) loop by TimestampDifferenceExceeds just before line 439 is an equivalent measure to what I proposed for receivelog.c, but calling XLogWalRcvSendHSFeedback(false) there is seemingly simpler (though I feel a bit uncomfortable with the latter). I'm concerned about the performance degradation from calling gettimeofday() per record. Or other measures? Introduce a maximum number of records that we can receive and process between feedbacks? For example, we can change XLogWalRcvSendHSFeedback() so that it's called only once per at least 8 records. I'm not sure what number is good, though... At the beginning of the while (len > 0) { if (len > 0) { block, last_recv_timestamp is always kept up to date (by using gettimeofday()). So we can use the variable instead of gettimeofday() in my previous proposal. Yes, but something like last_recv_timestamp doesn't exist in REL9_1_STABLE. So you don't think that your proposed fix should be back-patched to all supported versions. Right? The start time of the timeout could be last_recv_timestamp at the beginning of the while loop, since it is a bit earlier than the actual time at that point. Sounds strange to me. I think that last_recv_timestamp should be compared with the time when the last feedback message was sent, e.g., maybe you can expose sendTime in XLogWalRcvSendReply() as a global variable, and use it to compare with last_recv_timestamp. When we get out of the WAL receive loop due to the timeout check which your patch added, XLogWalRcvFlush() is always executed. I don't think that the current WAL file needs to be flushed at that time. 
Another solution would be using RegisterTimeout/enable_timeout_after, and it seemed to work for me. It can throw out the time-counting stuff in the loop we are talking about and that of XLogWalRcvSendHSFeedback and XLogWalRcvSendReply, but it might be a bit too large for the gain. Yes, sounds overkill. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Refactoring GUC unit conversions
On 02/13/2015 07:34 PM, Jim Nasby wrote: On 2/13/15 7:26 AM, Heikki Linnakangas wrote: In the redesign checkpoint_segments patch, Robert suggested keeping the settings' base unit as number of segments, but allow conversions from MB, GB etc. I started looking into that and found that adding a new unit to guc.c is quite messy. The conversions are done with complicated if-switch-case constructs. Attached is a patch to refactor that, making the conversions table-driven. This doesn't change functionality, it just makes the code nicer. Looks good, but shouldn't there be a check for a unit that's neither memory or time? Can you elaborate? We currently only support units for memory and time settings. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multiple backends attempting to wait for pincount 1
On 2015-02-13 22:33:35 +, Kevin Grittner wrote: Andres Freund and...@2ndquadrant.com wrote: On 2015-02-13 00:27:04 -0500, Tom Lane wrote: I'd say we have a problem. I'd even go so far as to say that somebody has completely broken locking, because this looks like autovacuum and manual vacuuming are hitting the same table at the same time. One avenue to look are my changes around both buffer pinning and interrupt handling... I found a way to cause this reliably on my machine and did a bisect. That pointed to commit 675f55e1d9bcb9da4323556b456583624a07 For the record, I would build and start the cluster, start two psql sessions, and paste this into the first session: drop table if exists m; create table m (id int primary key); insert into m select generate_series(1, 100) x; checkpoint; vacuum analyze; checkpoint; delete from m where id between 50 and 100; begin; declare c cursor for select * from m; fetch c; fetch c; fetch c; As soon as I saw the fetches execute I hit Enter on this in the other psql session: vacuum freeze m; It would block, and then within a minute (i.e., autovacuum_naptime) I would get the error. Great! Thanks for that piece of detective work. I've been travelling until an hour ago and not looked yet. How did you get to that recipe? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Strange assertion using VACOPT_FREEZE in vacuum.c
On 2/12/15 10:54 PM, Michael Paquier wrote: Hi all, When calling vacuum(), there is the following assertion using VACOPT_FREEZE: Assert((vacstmt->options & VACOPT_VACUUM) || !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE))); I think that this should be changed with sanity checks based on the parameter values of freeze_* in VacuumStmt, as we do not set up VACOPT_FREEZE when VACUUM is used without options in parentheses, for something like that:

 Assert((vacstmt->options & VACOPT_VACUUM) ||
-       !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
+       ((vacstmt->options & VACOPT_FULL) == 0 &&
+        vacstmt->freeze_min_age < 0 &&
+        vacstmt->freeze_table_age < 0 &&
+        vacstmt->multixact_freeze_min_age < 0 &&
+        vacstmt->multixact_freeze_table_age < 0));

This would also have the advantage of limiting the use of VACOPT_FREEZE in the query parser. A patch is attached. Thoughts? Looks good. Should we also assert that if VACOPT_FREEZE is set then all the other stuff is 0? I don't know what kind of sanity checks we normally try to put on the parser, but that seems like a possible hole. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multiple backends attempting to wait for pincount 1
On 2015-02-13 23:05:16 +0000, Kevin Grittner wrote: Andres Freund and...@2ndquadrant.com wrote: How did you get to that recipe? I have been working on some patches to allow vacuum to function in the face of long-held snapshots. (I'm struggling to get them into presentable shape for the upcoming CF.) I was devising the most diabolical cases I could to try to break my patched code and started seeing this error. I was panicked that I had introduced the bug, but on comparing to the master branch I found I was able to cause it there, too. So I saw this a couple days before the report on list, and had some cases that *sometimes* caused the error. I tweaked until it seemed to be pretty reliable, and then used that for the bisect. I still consider you to be the uncontested champion of diabolical test cases, but I'm happy to have hit upon one that was useful here. ;-) Hah. Not sure if that's something to be proud of :P I don't think it's actually 675333 at fault here. I think it's a long-standing bug in LockBufferForCleanup() that can just be hit much more easily with the new interrupt code. Imagine what happens in LockBufferForCleanup() when ProcWaitForSignal() returns spuriously - something it's documented to possibly do (and which got more likely with the new patches). In the normal case UnpinBuffer() will have unset BM_PIN_COUNT_WAITER - but on a spurious return it'll still be set and LockBufferForCleanup() will see it still set. If you just gdb into the VACUUM process with 6647248e370884 checked out, and do a PGSemaphoreUnlock(&MyProc->sem), you'll hit it as well. I think we should simply move the buf->flags &= ~BM_PIN_COUNT_WAITER (inside LockBuffer) to LockBufferForCleanup, besides the PinCountWaitBuf = NULL. Afaics, that should do the trick. 
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Refactoring GUC unit conversions
On 2/13/15 11:44 AM, Heikki Linnakangas wrote: On 02/13/2015 07:34 PM, Jim Nasby wrote: On 2/13/15 7:26 AM, Heikki Linnakangas wrote: In the redesign checkpoint_segments patch, Robert suggested keeping the settings' base unit as number of segments, but allow conversions from MB, GB etc. I started looking into that and found that adding a new unit to guc.c is quite messy. The conversions are done with complicated if-switch-case constructs. Attached is a patch to refactor that, making the conversions table-driven. This doesn't change functionality, it just makes the code nicer. Looks good, but shouldn't there be a check for a unit that's neither memory or time? Can you elaborate? We currently only support units for memory and time settings. I'm thinking an Assert in case someone screws up the function call. But perhaps I'm just being paranoid. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multiple backends attempting to wait for pincount 1
Andres Freund and...@2ndquadrant.com wrote: On 2015-02-13 00:27:04 -0500, Tom Lane wrote: I'd say we have a problem. I'd even go so far as to say that somebody has completely broken locking, because this looks like autovacuum and manual vacuuming are hitting the same table at the same time. One avenue to look are my changes around both buffer pinning and interrupt handling... I found a way to cause this reliably on my machine and did a bisect. That pointed to commit 675f55e1d9bcb9da4323556b456583624a07 For the record, I would build and start the cluster, start two psql sessions, and paste this into the first session: drop table if exists m; create table m (id int primary key); insert into m select generate_series(1, 100) x; checkpoint; vacuum analyze; checkpoint; delete from m where id between 50 and 100; begin; declare c cursor for select * from m; fetch c; fetch c; fetch c; As soon as I saw the fetches execute I hit Enter on this in the other psql session: vacuum freeze m; It would block, and then within a minute (i.e., autovacuum_naptime) I would get the error. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multiple backends attempting to wait for pincount 1
Andres Freund and...@2ndquadrant.com wrote: How did you get to that recipe? I have been working on some patches to allow vacuum to function in the face of long-held snapshots. (I'm struggling to get them into presentable shape for the upcoming CF.) I was devising the most diabolical cases I could to try to break my patched code and started seeing this error. I was panicked that I had introduced the bug, but on comparing to the master branch I found I was able to cause it there, too. So I saw this a couple days before the report on list, and had some cases that *sometimes* caused the error. I tweaked until it seemed to be pretty reliable, and then used that for the bisect. I still consider you to be the uncontested champion of diabolical test cases, but I'm happy to have hit upon one that was useful here. ;-) -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_regress writes into source tree
On 12/18/2014 06:05 PM, Andrew Dunstan wrote: On 12/18/2014 03:02 AM, Michael Paquier wrote: On Fri, Dec 12, 2014 at 10:45 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Another thing in that patch was that I had to add the sql/ directory to the source tree, but other than that .gitignore file it was empty. Maybe pg_regress should create the sql/ directory in the build dir if it doesn't exist. This is only a problem if a pg_regress suite only runs stuff from input/, because otherwise the sql/ dir already exists in the source. +1 for having pg_regress create the sql/ directory when it does not exist. Current behavior is annoying when modules having only tests in input/... That seems like a separate issue. I think Peter should commit his patch and backpatch it immediately, and we can deal with the missing sql directory when someone sends in a patch. What's happened on this? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSL renegotiation and other related woes
On 02/12/2015 01:41 AM, Andres Freund wrote: Hi, On 2015-02-05 23:03:02 +0200, Heikki Linnakangas wrote: On 01/26/2015 12:14 PM, Andres Freund wrote: Hi, When working on getting rid of ImmediateInterruptOK I wanted to verify that ssl still works correctly. Turned out it didn't. But neither did it in master. Turns out there are two major things we do wrong: 1) We ignore the rule that once called and returning SSL_ERROR_WANT_(READ|WRITE), SSL_read()/SSL_write() have to be called again with the same parameters. Unfortunately that rule doesn't mean just that the same parameters have to be passed in, but also that we can't just constantly switch between _read()/_write(). Especially nonblocking backend code (i.e. walsender) and the whole frontend code violate this rule. I don't buy that. Googling around, and looking at the man pages, I just don't see anything to support that claim. I believe it is supported to switch between reads and writes. There are at least two implementations that seem to have workarounds to achieve something like that: Apache's mod_ssl and the ssl layer for libevent. 2) We start renegotiations in be_tls_write() while in nonblocking mode, but don't properly retry to handle socket readiness. There's a loop that retries handshakes twenty times (???), but what actually is needed is to call SSL_get_error() and ensure that there's actually data available. Yeah, that's just crazy. Actually, I don't think we need to call SSL_do_handshake() at all. At least on modern OpenSSL versions. OpenSSL internally uses a state machine, and SSL_read() and SSL_write() calls will complete the handshake just as well. Some blaming seems to show that that's been the case for a long time. 2) is easy enough to fix, but 1) is pretty hard. Before anybody says that 1) isn't an important rule: It reliably causes connection aborts within a couple renegotiations. The higher the latency the higher the likelihood of aborts. Even locally it doesn't take very long to abort. 
Errors usually are something like "SSL connection has been closed unexpectedly" or "SSL Error: sslv3 alert unexpected message" and a host of other similar messages. There's a couple reports of those in the archives and I've seen many more in client logs. Yeah, I can reproduce that. There's clearly something wrong. I believe this is an OpenSSL bug. I traced the "unexpected message" error to this code in OpenSSL's ssl3_read_bytes() function: Yep, I got to that as well.

    case SSL3_RT_APPLICATION_DATA:
        /*
         * At this point, we were expecting handshake data, but have
         * application data. If the library was running inside ssl3_read()
         * (i.e. in_read_app_data is set) and it makes sense to read
         * application data at this point (session renegotiation not yet
         * started), we will indulge it.
         */
        if (s->s3->in_read_app_data &&
            (s->s3->total_renegotiations != 0) &&
            (((s->state & SSL_ST_CONNECT) &&
              (s->state >= SSL3_ST_CW_CLNT_HELLO_A) &&
              (s->state <= SSL3_ST_CR_SRVR_HELLO_A)
             ) || ((s->state & SSL_ST_ACCEPT) &&
                   (s->state <= SSL3_ST_SW_HELLO_REQ_A) &&
                   (s->state >= SSL3_ST_SR_CLNT_HELLO_A)
                  )
            ))
        {
            s->s3->in_read_app_data = 2;
            return (-1);
        }
        else
        {
            al = SSL_AD_UNEXPECTED_MESSAGE;
            SSLerr(SSL_F_SSL3_READ_BYTES, SSL_R_UNEXPECTED_RECORD);
            goto f_err;
        }

So that quite clearly throws an error, *unless* the original application call was SSL_read(). As you also concluded, OpenSSL doesn't like it when SSL_write() is called when it's in renegotiation. I think that's OpenSSL's fault and it should indulge the application data message even in SSL_write(). That'd be nice, but I think there were multiple reports to them about this and there wasn't much of a response. Maybe it's time to try again. In theory, I guess we should do similar hacks in the server, and always call SSL_read() before SSL_write(), but it seems to work without it. Or maybe not; OpenSSL server and client code is not symmetric, so perhaps it works in server mode without that. 
I think we should do it on the server side - got some server side errors when I cranked up the pg_receivexlog's status interval and set the wal sender timeout very low. I've not been able to reproduce any errors on the server side, with my latest client-only patch. Not sure why. I would assume that the server has the same problem, but perhaps there are some mitigating factors that make it impossible or at least less likely there. I'm planning to commit and backpatch the first two of the attached patches. The first is essentially the same as what I've posted before, with some minor cleanup. I realized that the hack can be isolated completely in fe-secure-openssl.c by putting the kill-switch in the custom BIO_read routine, instead of
Re: [HACKERS] RangeType internal use
On 2/10/15 2:04 PM, David Fetter wrote: Yeah, but people expect to be able to partition on ranges that are not all of equal width. I think any proposal that we shouldn't support that is the kiss of death for a feature like this - it will be so restricted as to eliminate 75% of the use cases. Well, that's debatable IMO (especially your claim that variable-size partitions would be needed by a majority of users). It's ubiquitous. Time range partition sets almost always have some sets with finite range and at least one range with infinity in it: current end to infinity, and somewhat less frequently in my experience, -infinity to some arbitrary start. We could instead handle that with a generic "this doesn't fit in any other partition" capability. Presumably that would be easy if we're building this on top of inheritance features. If we exclude the issue of needing one or two oddball partitions for +/- infinity, I expect that fixed-size partitions would actually cover 80-90% of cases. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RangeType internal use
On Fri, Feb 13, 2015 at 3:13 PM, Jim Nasby jim.na...@bluetreble.com wrote: If we exclude the issue of needing one or two oddball partitions for +/- infinity, I expect that fixed sized partitions would actually cover 80-90% of cases. That would not be true in our case. The data is not at all evenly distributed over the partitioning key. We would need something more like: values a, b, and c get their own partitions and everything else goes in partition d.
Re: [HACKERS] RangeType internal use
On Fri, Feb 13, 2015 at 03:13:11PM -0600, Jim Nasby wrote: On 2/10/15 2:04 PM, David Fetter wrote: Yeah, but people expect to be able to partition on ranges that are not all of equal width. I think any proposal that we shouldn't support that is the kiss of death for a feature like this - it will be so restricted as to eliminate 75% of the use cases. Well, that's debatable IMO (especially your claim that variable-size partitions would be needed by a majority of users). It's ubiquitous. Time range partition sets almost always have some sets with finite range and at least one range with infinity in it: current end to infinity, and somewhat less frequently in my experience, -infinity to some arbitrary start. We could instead handle that with a generic "this doesn't fit in any other partition" capability. Presumably that would be easy if we're building this on top of inheritance features. If we exclude the issue of needing one or two oddball partitions for +/- infinity, I expect that fixed-size partitions would actually cover 80-90% of cases. Is "partition the domain" really that big an ask? Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] assessing parallel-safety
On Fri, Feb 13, 2015 at 12:10 AM, Noah Misch n...@leadboat.com wrote: Given your wish to optimize, I recommend first investigating the earlier thought to issue eval_const_expressions() once per planner() instead of once per subquery_planner(). Compared to the parallelModeRequired/parallelModeOK idea, it would leave us with a more maintainable src/backend/optimizer. I won't object to either design, though. In off-list discussions with Tom Lane, he pressed hard on the question of whether we can zero out the number of functions that are parallel-unsafe (i.e. can't be run while parallel even in the master) vs. parallel-restricted (must be run in the master rather than elsewhere). The latter category can be handled by strictly local decision-making, without needing to walk the entire plan tree; e.g. parallel seq scan can look like this:

Parallel Seq Scan on foo
  Filter: a = pg_backend_pid()
  Parallel Filter: b = 1

And, indeed, I was pleasantly surprised when surveying the catalogs by how few functions were truly unsafe, vs. merely needing to be restricted to the master. But I can't convince myself that there's any sane way of allowing database writes even in the master; creating new combo CIDs there seems disastrous, and users will be sad if a parallel plan is chosen for some_plpgsql_function_that_does_updates() and this then errors out because of parallel mode. Tom also argued that (1) trying to assess parallel-safety before preprocess_expressions() was doomed to fail, because preprocess_expressions() can introduce additional function calls via, at least, inlining and default argument insertion and (2) preprocess_expressions() can't be moved earlier without changing the semantics. I'm not sure if he's right, but those are sobering conclusions. Andres pointed out to me via IM that inlining is dismissable here; if inlining introduces a parallel-unsafe construct, the inlined function was mislabeled to begin with, and the user has earned the error message they get. 
Default argument insertion might not be dismissable, although the practical risks seem low. It's pretty awful that this is such a thorny issue, and the patch may be rather cringe-worthy, given the (to me) essentially reasonable nature of wanting to know whether a query has any references to a function with a certain property in it. Your survey of parallel safety among built-in functions was on-target. Thanks for having a look. A healthy amount of the way the parallel mode stuff works came from your brain, so your opinion, while always valuable, is especially appreciated here. I have a lot of confidence in your judgement about this stuff. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] question on Postgres smart shutdown mode
On Fri, Feb 13, 2015 at 2:15 AM, wei sun wei.sun...@gmail.com wrote: I have a question on PG smart shutdown mode. When shutting down Postgres by issuing a smart shutdown (SIGTERM) request, is there a way for a client to be notified of this shutdown event? I tried PG_NOTIFY, but I cannot get any notification events when this happens. BTW, I am relatively new to Postgres. Unfortunately, no, there isn't a way for the client to be notified. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory
On 2/13/15 2:04 AM, Martijn van Oosterhout wrote: On Thu, Feb 12, 2015 at 08:52:56AM -0500, Robert Haas wrote: BTW, I'm not all that thrilled with the deserialized object terminology. I found myself repeatedly tripping up on which form was serialized and which de-. If anyone's got a better naming idea I'm willing to adopt it. My first thought is that we should form some kind of TOAST-like backronym, like Serialization Avoidance Loading and Access Device (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I don't think there is anything per se wrong with the terms serialization and deserialization; indeed, I used the same ones in the parallel-mode stuff. But they are fairly general terms, so it might be nice to have something more specific that applies just to this particular usage. The words that sprung to mind for me were: packed/unpacked. +1 After thinking about it, I don't think having a more distinctive name (like TOAST) is necessary for this feature. TOAST is something that's rather visible to end users, whereas packing would only matter to someone creating a new varlena type. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Odd behavior of updatable security barrier views on foreign tables
Etsuro, * Etsuro Fujita (fujita.ets...@lab.ntt.co.jp) wrote: On 2015/02/11 4:06, Stephen Frost wrote: * Etsuro Fujita (fujita.ets...@lab.ntt.co.jp) wrote: On 2015/02/10 7:23, Dean Rasheed wrote: Sorry, I didn't have time to look at this properly. My initial thought is that expand_security_qual() needs to request a lock on rows coming from the relation it pushes down into a subquery if that relation was the result relation, because otherwise it won't have any locks, since preprocess_rowmarks() only adds PlanRowMarks to non-target relations. That seems close to what I had in mind; expand_security_qual() needs to request a FOR UPDATE lock on rows coming from the relation it pushes down into a subquery only when that relation is the result relation and a *foreign table*. I had been trying to work out an FDW-specific way to address this, but I think Dean's right that this should be addressed in expand_security_qual(), which means it'll apply to all cases and not just these FDW calls. I don't think that's actually an issue though, and it'll match up to how SELECT FOR UPDATE is handled today. Sorry, my explanation was not accurate, but I also agree with Dean's idea. In the above, I just wanted to make it clear that such a lock request done by expand_security_qual() should be limited to the case where the relation that is a former result relation is a foreign table. We aren't doing that for the other cases, and so I don't think it makes sense to do it here. These should all be handled the same way. If it's OK, I'll submit a patch for that, maybe early next week. Not really necessary, I have the code for it, just need to test, etc. Thanks! Stephen
[HACKERS] chkpass with RANDOMIZE_ALLOCATED_MEMORY
Hi,

It has been observed on a RANDOMIZE_ALLOCATED_MEMORY-enabled PG95 build that chkpass is failing because of uninitialized memory, and this seems to be a false alarm. I have added code snippets below to explain, i.e.

postgres=# CREATE EXTENSION chkpass;
WARNING: type input function chkpass_in should not be volatile
CREATE EXTENSION
postgres=# CREATE TABLE test_tbl ( name text, pass chkpass );
CREATE TABLE
postgres=# INSERT INTO test_tbl VALUES('foo','foo1');
WARNING: type chkpass has unstable input conversion for "foo1"
LINE 1: INSERT INTO test_tbl VALUES('foo','foo1');
                                          ^
INSERT 0 1

It gives the warning "type chkpass has unstable input conversion" because the chkpass structure holds uninitialized memory areas that are left unused. When chkpass_in() is called with input "foo1", it allocates 16 bytes of randomized memory for chkpass, i.e.

contrib/chkpass/chkpass.c:

/*
 * This is the internal storage format for CHKPASSs.
 * 15 is all I need but add a little buffer
 */
typedef struct chkpass
{
	char		password[16];
} chkpass;

In chkpass_in():

result = (chkpass *) palloc(sizeof(chkpass));

*result memory*
0x7ffc93047a68: 0xdd 0xde 0xdf 0xe0 0xe1 0xe2 0xe3 0xe4
0x7ffc93047a70: 0xe5 0xe6 0xe7 0xe8 0xe9 0xea 0xeb 0xec

It then copies the string returned from the crypt() function (16 bytes max), stopping at the null character, i.e.

strlcpy(result->password, crypt_output, sizeof(result->password));

*crypt_output memory*
0x7fff7d1577f0: 0x6a 0x6e 0x6d 0x54 0x44 0x53 0x55 0x33
0x7fff7d1577f8: 0x4b 0x48 0x52 0x55 0x6f 0x00 0x00 0x00

*result memory*
0x7ffc93047a68: 0x6a 0x6e 0x6d 0x54 0x44 0x53 0x55 0x33
0x7ffc93047a70: 0x4b 0x48 0x52 0x55 0x6f 0x00 0xeb 0xec

This leaves the last 2 bytes untouched, which is the cause of the "unstable input conversion" warning. I think these uninitialized memory areas are false alarms. 
Along with this, there seems to be another issue that trips up RANDOMIZE_ALLOCATED_MEMORY comparison logic: the crypt() function is being passed a random salt value for DES-based results. PFA patch; to generate consistent results, it uses a constant salt value. It seems a minor inconvenience, but it will make the results better. Please do let me know if I missed something or more information is required. Thanks. Regards, Muhammad Asif Naeem chkpass_RANDOMIZE_ALLOCATED_MEMORY_v1.patch Description: Binary data
Re: [HACKERS] SSL information view
On Fri, Feb 13, 2015 at 5:31 PM, Magnus Hagander mag...@hagander.net wrote: On Fri, Feb 13, 2015 at 9:07 AM, Michael Paquier michael.paqu...@gmail.com wrote: Magnus, are you planning to work on this item of your shame list soon? Could you clarify the status of this patch? I do, and I hope to work on it over the next week or two. However, feel free to bump it to the next CF -- I'll pick it up halfway in between if necessary. OK, I moved it to 2015-02 with the same status, "Waiting on Author". -- Michael
Re: [HACKERS] Support UPDATE table SET(*)=...
Hi all, Sorry for the delay. Please find attached the latest version of the UPDATE (*) patch. The patch implements the review comments, and Tom's gripe about touching transformTargetList is addressed now. I have added regression tests and simplified the parser representation a bit. Regards, Atri

diff --git a/doc/src/sgml/ref/update.sgml b/doc/src/sgml/ref/update.sgml
index 35b0699..1f68bdf 100644
--- a/doc/src/sgml/ref/update.sgml
+++ b/doc/src/sgml/ref/update.sgml
@@ -25,7 +25,9 @@ PostgreSQL documentation
 UPDATE [ ONLY ] <replaceable class="PARAMETER">table_name</replaceable> [ * ] [ [ AS ] <replaceable class="parameter">alias</replaceable> ]
     SET { <replaceable class="PARAMETER">column_name</replaceable> = { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } |
           ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } [, ...] ) |
-          ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( <replaceable class="PARAMETER">sub-SELECT</replaceable> )
+          ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( <replaceable class="PARAMETER">sub-SELECT</replaceable> ) |
+          ( <replaceable class="PARAMETER">*</replaceable> [, ...] ) = ( { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } [, ...] ) |
+          ( <replaceable class="PARAMETER">*</replaceable> [, ...] ) = ( <replaceable class="PARAMETER">sub-SELECT</replaceable> )
         } [, ...]
     [ FROM <replaceable class="PARAMETER">from_list</replaceable> ]
     [ WHERE <replaceable class="PARAMETER">condition</replaceable> | WHERE CURRENT OF <replaceable class="PARAMETER">cursor_name</replaceable> ]
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index a68f2e8..6d08dbd 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -1937,6 +1937,101 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
 	nsitem->p_lateral_only = false;
 	nsitem->p_lateral_ok = true;
 
+	/*
+	 * Check if (SET(*) = SELECT ...) is present. If it is present we
+	 * need to resolve and populate the remaining needed MultiAssignRefs in the
+	 * target list. We modify target list in place and add needed MultiAssignRefs.
+	 */
+	if (list_length(stmt->targetList) == 1)
+	{
+		ResTarget  *current_val = linitial(stmt->targetList);
+
+		/* Currently there is no way that ResTarget node's name value in UPDATE command
+		 * is set to NULL except for UPDATE SET (*) case.
+		 * Hence we can safely depend on name value being NULL as a check for
+		 * presence of UPDATE SET (*) case.
+		 */
+		if (current_val->name == NULL)
+		{
+			List	   *rel_cols_list;
+			List	   *expanded_tlist = NULL;
+			List	   *sub_list = NULL;
+			ListCell   *lc_val;
+			ListCell   *lc_relcol;
+			int			rteindex = 0;
+			int			sublevels_up = 0;
+			int			i = 0;
+
+			rteindex = RTERangeTablePosn(pstate, pstate->p_target_rangetblentry,
+										 &sublevels_up);
+
+			expandRTE(pstate->p_target_rangetblentry, rteindex, sublevels_up,
+					  current_val->location, false,
+					  &(rel_cols_list), NULL);
+
+			/* SET(*) = (SELECT ...) case */
+			if (IsA(current_val->val, MultiAssignRef))
+			{
+				MultiAssignRef *orig_val = (MultiAssignRef *) (current_val->val);
+
+				orig_val->ncolumns = list_length(rel_cols_list);
+
+				Assert(sub_list == NULL);
+
+				sub_list = list_make1(orig_val);
+
+				/* Change targetlist to have corresponding ResTarget nodes
+				 * as corresponding to the columns in target relation */
+				for (i = 1; i < list_length(rel_cols_list); i++)
+				{
+					MultiAssignRef *r = makeNode(MultiAssignRef);
+
+					r->source = orig_val->source;
+					r->colno = i + 1;
+					r->ncolumns = orig_val->ncolumns;
+
+					lappend(sub_list, r);
+				}
+			}
+			else if (IsA(current_val->val, List))	/* SET(*) = (val, val,...) case */
+			{
+				Assert(sub_list == NULL);
+				sub_list = (List *) (current_val->val);
+			}
+			else
+			{
+				elog(ERROR, "Unknown type in UPDATE command");
+			}
+
+			if (list_length(rel_cols_list) != list_length(sub_list))
+				elog(ERROR, "number of columns does not match number of values");
+
+			/* Change targetlist to have corresponding ResTarget nodes
+			 * as corresponding to the columns in target relation */
+			forboth(lc_val, sub_list, lc_relcol, rel_cols_list)
+			{
+				ResTarget  *current_res = makeNode(ResTarget);
+
+				current_res->name = strVal(lfirst(lc_relcol));
+				current_res->val = (Node *) (lfirst(lc_val));
+
+				if (expanded_tlist == NULL)
+				{
+					expanded_tlist = list_make1(current_res);
+				}
+				else
+				{
+					lappend(expanded_tlist, current_res);
+				}
+			}
+
+			stmt->targetList = expanded_tlist;
+		}
+	}
+
 	qry->targetList = transformTargetList(pstate, stmt->targetList,
 										  EXPR_KIND_UPDATE_SOURCE);
@@ -1982,18 +2077,20 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
 			continue;
 		}
 		if (origTargetList == NULL)
-			elog(ERROR, "UPDATE target count mismatch --- internal error");
+			elog(ERROR, "UPDATE target count mismatch
Re: [HACKERS] assessing parallel-safety
On Fri, Feb 13, 2015 at 05:13:06PM -0500, Robert Haas wrote: On Fri, Feb 13, 2015 at 12:10 AM, Noah Misch n...@leadboat.com wrote: Given your wish to optimize, I recommend first investigating the earlier thought to issue eval_const_expressions() once per planner() instead of once per subquery_planner(). Compared to the parallelModeRequired/parallelModeOK idea, it would leave us with a more maintainable src/backend/optimizer. I won't object to either design, though. In off-list discussions with Tom Lane, he pressed hard on the question of whether we can zero out the number of functions that are parallel-unsafe (i.e. can't be run while parallel even in the master) vs. parallel-restricted (must be run in the master rather than elsewhere). The latter category can be handled by strictly local decision-making, without needing to walk the entire plan tree; e.g. parallel seq scan can look like this:

Parallel Seq Scan on foo
  Filter: a = pg_backend_pid()
  Parallel Filter: b = 1

And, indeed, I was pleasantly surprised when surveying the catalogs by how few functions were truly unsafe, vs. merely needing to be restricted to the master. But I can't convince myself that there's any sane way of allowing database writes even in the master; creating new combo CIDs there seems disastrous, and users will be sad if a parallel plan is chosen for some_plpgsql_function_that_does_updates() and this then errors out because of parallel mode. Yep. The scarcity of parallel-unsafe, built-in functions reflects the dominant subject matter of built-in functions. User-defined functions are more diverse. It would take quite a big hammer to beat the parallel-unsafe category into irrelevancy. 
Tom also argued that (1) trying to assess parallel-safety before preprocess_expressions() was doomed to fail, because preprocess_expressions() can introduce additional function calls via, at least, inlining and default argument insertion and (2) preprocess_expressions() can't be moved earlier without changing the semantics. I'm not sure if he's right, but those are sobering conclusions. Andres pointed out to me via IM that inlining is dismissable here; if inlining introduces a parallel-unsafe construct, the inlined function was mislabeled to begin with, and the user has earned the error message they get. Default argument insertion might not be dismissable, although the practical risks seem low. All implementation difficulties being equal, I would opt to check for parallel safety after inserting default arguments and before inlining. Checking before inlining reveals the mislabeling every time instead of revealing it only when inline_function() gives up. Timing of the parallel safety check relative to default argument insertion matters less. Remember, the risk is merely that a user will find cause to remove a parallel-safe marking where he/she expected the system to deduce parallel unsafety. If implementation difficulties lead to some other timings, that won't bother me. Thanks, nm
Re: [HACKERS] Strange assertion using VACOPT_FREEZE in vacuum.c
On Sat, Feb 14, 2015 at 8:10 AM, Jim Nasby wrote: On 2/12/15 10:54 PM, Michael Paquier wrote: Hi all, When calling vacuum(), there is the following assertion using VACOPT_FREEZE:

Assert((vacstmt->options & VACOPT_VACUUM) ||
       !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));

I think that this should be changed with sanity checks based on the parameter values of freeze_* in VacuumStmt, as we do not set up VACOPT_FREEZE when VACUUM is used without options in parentheses, for something like that:

 Assert((vacstmt->options & VACOPT_VACUUM) ||
-       !(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)));
+       ((vacstmt->options & VACOPT_FULL) == 0 &&
+        vacstmt->freeze_min_age < 0 &&
+        vacstmt->freeze_table_age < 0 &&
+        vacstmt->multixact_freeze_min_age < 0 &&
+        vacstmt->multixact_freeze_table_age < 0));

This would also have the advantage of limiting the use of VACOPT_FREEZE in the query parser. A patch is attached. Thoughts? Looks good. Should we also assert that if VACOPT_FREEZE is set then all the other stuff is 0? I don't know what kind of sanity checks we normally try to put on the parser, but that seems like a possible hole. Possible, but this would require at least changing gram.y to update VacuumStmt->options to set VACOPT_FREEZE for the case where VACUUM is parsed without parentheses. I'd rather simply rely on the freeze parameters for simplicity. That is at least what I gather by looking at where VACOPT_FREEZE is used. -- Michael
Re: [HACKERS] pg_regress writes into source tree
On Sat, Feb 14, 2015 at 6:24 AM, Andrew Dunstan and...@dunslane.net wrote: On 12/18/2014 06:05 PM, Andrew Dunstan wrote: On 12/18/2014 03:02 AM, Michael Paquier wrote: On Fri, Dec 12, 2014 at 10:45 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Another thing in that patch was that I had to add the sql/ directory to the source tree, but other than that .gitignore file it was empty. Maybe pg_regress should create the sql/ directory in the build dir if it doesn't exist. This is only a problem if a pg_regress suite only runs stuff from input/, because otherwise the sql/ dir already exists in the source. +1 for having pg_regress create the sql/ directory when it does not exist. The current behavior is annoying for modules that have only tests in input/... That seems like a separate issue. I think Peter should commit his patch and backpatch it immediately, and we can deal with the missing sql directory when someone sends in a patch. What's happened on this? Nothing has been committed, and as far as I understood this patch could simply have been pushed... -- Michael
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0
On 2/9/15 6:21 PM, Peter Geoghegan wrote: Thanks for taking a look at it. That's somewhat cleaned up in the attached patch series - V2.2. In patch 1, sepgsql is also affected by this commit. Note that this commit necessitates an initdb, since stored ACLs are broken. Doesn't that warrant bumping catversion? Patch 2:

+ * When killing a speculatively-inserted tuple, we set xmin to invalid and
+if (!(xlrec->flags & XLOG_HEAP_KILLED_SPECULATIVE_TUPLE))

When doing this, should we also set the HEAP_XMIN_INVALID hint bit? *reads more of patch* Ok, I see we're not doing this because the check for a super-deleted tuple is already cheap. Probably worth mentioning that in the comment... ExecInsert():

+ * We don't suppress the effects (or, perhaps, side-effects) of
+ * BEFORE ROW INSERT triggers. This isn't ideal, but then we
+ * cannot proceed with even considering uniqueness violations until
+ * these triggers fire on the one hand, but on the other hand they
+ * have the ability to execute arbitrary user-defined code which
+ * may perform operations entirely outside the system's ability to
+ * nullify.

I'm a bit confused as to why we're calling out BEFORE triggers here... hasn't this always been true for both BEFORE *and* AFTER triggers? The comment makes me wonder if there's some magic that happens with AFTER...

+spec != SPEC_NONE? HEAP_INSERT_SPECULATIVE:0

Non-standard formatting. Given the size of the patch and the work already in it, I'd just leave it for the next formatting run; I only mention it in case someone has some compelling reason not to. ExecLockUpdateTuple():

+ * Try to lock tuple for update as part of speculative insertion. If
+ * a qual originating from ON CONFLICT UPDATE is satisfied, update
+ * (but still lock row, even though it may not satisfy estate's
+ * snapshot).

I find this confusing... which row is being locked? The previously inserted one? Perhaps a better wording would be "satisfied, update. Lock the original row even if it doesn't satisfy estate's snapshot." 
infer_unique_index():

+/*
+ * We need not lock the relation since it was already locked, either by
+ * the rewriter or when expand_inherited_rtentry() added it to the query's
+ * rangetable.
+ */
+relationObjectId = rt_fetch(parse->resultRelation, parse->rtable)->relid;
+
+relation = heap_open(relationObjectId, NoLock);

Seems like there should be an Assert here. Also, the comment should probably go before the heap_open call. heapam_xlog.h:

+/* reuse xl_heap_multi_insert-only bit for xl_heap_delete */

I wish this would mention why it's safe to do this. Also, the comment mentions xl_heap_delete when the #define is for XLOG_HEAP_KILLED_SPECULATIVE_TUPLE; presumably that's wrong. Perhaps:

/*
 * Reuse the XLOG_HEAP_LAST_MULTI_INSERT bit for
 * XLOG_HEAP_KILLED_SPECULATIVE_TUPLE. This is safe because we never do
 * multi-inserts for INSERT ON CONFLICT.
 */

I'll review the remaining patches later. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com