[HACKERS] Obsolete mention of src/tools/backend
Hi,

Commit 63f1ccd got rid of src/tools/backend and hence
src/tools/backend/index.html. But the lmgr README still directs the reader
to the aforementioned file. Attached removes this obsolete reference.

Thanks,
Amit

diff --git a/src/backend/storage/lmgr/README b/src/backend/storage/lmgr/README
index 6bc7efc..8898e25 100644
--- a/src/backend/storage/lmgr/README
+++ b/src/backend/storage/lmgr/README
@@ -56,9 +56,9 @@ two lock methods: DEFAULT and USER.
 Lock modes describe the type of the lock (read/write or shared/exclusive).
 In principle, each lock method can have its own set of lock modes with
 different conflict rules, but currently DEFAULT and USER methods use
-identical lock mode sets. See src/tools/backend/index.html and
-src/include/storage/lock.h for more details. (Lock modes are also called
-lock types in some places in the code and documentation.)
+identical lock mode sets. See src/include/storage/lock.h for more details.
+(Lock modes are also called lock types in some places in the code and
+documentation.)
 
 There are two main methods for recording locks in shared memory. The primary
 mechanism uses two main structures: the per-lockable-object LOCK struct, and

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] subxcnt defined as signed integer in SnapshotData and SerializeSnapshotData
On 7 May 2015 at 21:40, Michael Paquier michael.paqu...@gmail.com wrote:

Hi all,

Coverity is complaining about the following assertion introduced in commit
924bcf4 (parallel stuff, SerializeSnapshot@snapmgr.c):

+ Assert(snapshot->xcnt >= 0);

Now the thing is that this assertion does not make much sense, because
SnapshotData defines xcnt as uint32 in snapshot.h. While we could simply
remove this assertion, I am wondering if we could not change subxcnt to
uint32 instead. SnapshotData has been introduced in 2008 by d43b085, with
this comment:

+ int32 subxcnt; /* # of xact ids in subxip[], -1 if overflow */

The comment regarding negative values was removed in efc16ea5. Now, by
looking at the code on HEAD, I am seeing no code paths that make use of
negative values of subxcnt. Perhaps I am missing something?

So the comment is wrong? It does not get set to -1 at overflow anymore?

-- 
Simon Riggs  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
On 7 May 2015 at 18:37, Andres Freund and...@anarazel.de wrote:

I don't see a problem at all, with one exception: If we want the AS to be
optional like in a bunch of other places, we have to either promote VALUES
to a reserved keyword, only accept unreserved keywords, or play precedence
games. I think it'd be perfectly fine to not make AS optional.

Although I've always used AS in all contexts because I think the language is
horribly unclear without it, it seems obtuse to allow its absence in the
SQL-conforming parts of the language and not elsewhere.

Is anyone really using VALUES as a non-keyword? It's reserved in all the SQL
standards, which seems like storing up trouble for anyone using it otherwise.

Geoff
Re: [HACKERS] commitfest app bug/feature
ISTM that an additional Duplicate or To remove status could be a tag for
admins to remove the entries?

This looks like overkill to me. Entries with the same description headline
mean the same thing.

Sure. My point was to provide a means to signal explicitly that an entry can
be removed, as removing it is not an option for the lay man, and the other
statuses do not correspond to reality (why put Rejected or Returned with
feedback, when that is not the case?). Moreover the descriptions of
different submissions might be slightly different. ISTM that sending a mail
to -hackers to remove spurious entries in the CF app is a little overkill
too.

-- 
Fabien.
Re: [HACKERS] Obsolete mention of src/tools/backend
Amit,

* Amit Langote (langote_amit...@lab.ntt.co.jp) wrote:
Commit 63f1ccd got rid of src/tools/backend and hence
src/tools/backend/index.html. But the lmgr README still directs the reader
to the aforementioned file. Attached removes this obsolete reference.

Pushed, thanks!

Stephen
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
So I've committed the patch yesterday evening. I'm pretty sure there'll be some more minor things to change. But overall I feel good about the current state. It'd be quite helpful if others could read the docs, specifically for insert, and comment whether they're understandable. I've spent a fair amount of time trying to make them somewhat simpler, but I do think I only succeeded partially. And there's also my own brand of english to consider ;) Phew.
Re: [HACKERS] transforms vs. CLOBBER_CACHE_ALWAYS
Christian Ullrich ch...@chrullrich.net writes:
* Peter Eisentraut wrote:
On 4/30/15 2:49 PM, Andrew Dunstan wrote:
friarbird is a FreeBSD buildfarm animal running with -DCLOBBER_CACHE_ALWAYS.
It usually completes a run in about 6.5 hours. However, it's been stuck
since Monday running the plpython regression tests. The only relevant commit
seems to be the transforms feature.

I can reproduce it. I'll look into it.

I looked over the CCA animals and noticed that pademelon and gaur are
apparently unaffected;

pademelon and gaur do not run CCA (if they did, it would take weeks for a
run to complete :-().

			regards, tom lane
Re: [HACKERS] Streaming replication and WAL archive interactions
On 04/22/2015 10:07 AM, Michael Paquier wrote:
On Wed, Apr 22, 2015 at 3:38 PM, Heikki Linnakangas hlinn...@iki.fi wrote:
I feel that the best approach is to archive the last, partial segment, but
with the .partial suffix. I don't see any plausible real-world setup where
the current behavior would be better. I don't really see much need to
archive the partial segment at all, but there's also no harm in doing it, as
long as it's clearly marked with the .partial suffix.

Well, as long as it is clearly archived at promotion, even with a suffix, I
guess that I am fine... This will need some tweaking of restore_command for
existing applications, but as long as it is clearly documented I am fine.
Shouldn't this be a different patch though?

Ok, I came up with the attached, which adds the .partial suffix to the
partial WAL segment that's archived after promotion. I couldn't find any
natural place to talk about it in the docs, though. I think after the docs
changes from the main patch are applied, it would be natural to mention this
in "Continuous archiving in standby", so I'll add that later. Barring
objections, I'll push this later tonight.

Now that we got this last-partial-segment problem out of the way, I'm going
to try fixing the problem you (Michael) pointed out about relying on the
pgstat file. Meanwhile, I'd love to get more feedback on the rest of the
patch, and the documentation.

- Heikki

From 15c123141d1eef0d6b05a384d1c5c202ffa04a84 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas heikki.linnakangas@iki.fi
Date: Fri, 8 May 2015 12:04:46 +0300
Subject: [PATCH 1/2] Add macros to check if a filename is a WAL segment or
 other such file.

We had many instances of the strlen + strspn combination to check for that.
This makes the code a bit easier to read.
---
 src/backend/access/transam/xlog.c      | 11 +++
 src/backend/replication/basebackup.c   |  7 ++-
 src/bin/pg_basebackup/pg_receivexlog.c | 16 ++--
 src/bin/pg_resetxlog/pg_resetxlog.c    |  8 ++--
 src/include/access/xlog_internal.h     | 18 ++
 5 files changed, 31 insertions(+), 29 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 92822a1..5097173 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3577,8 +3577,7 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr)
 	while ((xlde = ReadDir(xldir, XLOGDIR)) != NULL)
 	{
 		/* Ignore files that are not XLOG segments */
-		if (strlen(xlde->d_name) != 24 ||
-			strspn(xlde->d_name, "0123456789ABCDEF") != 24)
+		if (!IsXLogFileName(xlde->d_name))
 			continue;
 
 		/*
@@ -3650,8 +3649,7 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI)
 	while ((xlde = ReadDir(xldir, XLOGDIR)) != NULL)
 	{
 		/* Ignore files that are not XLOG segments */
-		if (strlen(xlde->d_name) != 24 ||
-			strspn(xlde->d_name, "0123456789ABCDEF") != 24)
+		if (!IsXLogFileName(xlde->d_name))
 			continue;
 
 		/*
@@ -3839,10 +3837,7 @@ CleanupBackupHistory(void)
 	while ((xlde = ReadDir(xldir, XLOGDIR)) != NULL)
 	{
-		if (strlen(xlde->d_name) > 24 &&
-			strspn(xlde->d_name, "0123456789ABCDEF") == 24 &&
-			strcmp(xlde->d_name + strlen(xlde->d_name) - strlen(".backup"),
-				   ".backup") == 0)
+		if (IsBackupHistoryFileName(xlde->d_name))
 		{
 			if (XLogArchiveCheckDone(xlde->d_name))
 			{
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 3563fd9..de103c6 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -350,17 +350,14 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir)
 		while ((de = ReadDir(dir, "pg_xlog")) != NULL)
 		{
 			/* Does it look like a WAL segment, and is it in the range? */
-			if (strlen(de->d_name) == 24 &&
-				strspn(de->d_name, "0123456789ABCDEF") == 24 &&
+			if (IsXLogFileName(de->d_name) &&
 				strcmp(de->d_name + 8, firstoff + 8) >= 0 &&
 				strcmp(de->d_name + 8, lastoff + 8) <= 0)
 			{
 				walFileList = lappend(walFileList, pstrdup(de->d_name));
 			}
 			/* Does it look like a timeline history file? */
-			else if (strlen(de->d_name) == 8 + strlen(".history") &&
-					 strspn(de->d_name, "0123456789ABCDEF") == 8 &&
-					 strcmp(de->d_name + 8, ".history") == 0)
+			else if (IsTLHistoryFileName(de->d_name))
 			{
 				historyFileList = lappend(historyFileList, pstrdup(de->d_name));
 			}
diff --git a/src/bin/pg_basebackup/pg_receivexlog.c b/src/bin/pg_basebackup/pg_receivexlog.c
index e77d2b6..53802af 100644
--- a/src/bin/pg_basebackup/pg_receivexlog.c
+++ b/src/bin/pg_basebackup/pg_receivexlog.c
@@ -188,23 +188,11 @@ FindStreamingStart(uint32 *tli)
 		/*
 		 * Check if the filename looks like an xlog file, or a .partial file.
-		 * Xlog files are always 24 characters, and .partial files are 32
-		 * characters.
 		 */
-		if (strlen(dirent->d_name) == 24)
-		{
-			if (strspn(dirent->d_name, "0123456789ABCDEF") != 24)
-				continue;
+		if
Re: [HACKERS] Broken --dry-run mode in pg_rewind
On 8 May 2015, at 16:11, Stephen Frost sfr...@snowman.net wrote:
* Heikki Linnakangas (hlinn...@iki.fi) wrote:
On 05/08/2015 03:39 PM, Michael Paquier wrote:
On Fri, May 8, 2015 at 9:34 PM, Heikki Linnakangas hlinn...@iki.fi wrote:
On 05/08/2015 03:25 PM, Vladimir Borodin wrote:
Seems that pg_rewind does not account for the --dry-run option properly. A
simple fix for that is attached.

No, the --dry-run takes effect later. It performs all the actions it
normally would, including reading files from the source, except for actually
writing anything in the target. See the dry-run checks in file_ops.c.

Urgh, my test script had an error and I grepped only in pg_rewind.c. Sorry
for the noise.

Even if the patch sent is incorrect, shouldn't there be some process bypass
in updateControlFile() and createBackupLabel() in case of a --dry-run?

They both use open_target_file() and write_target_file(), which check for
--dry-run and do nothing if it's set.

Hmm, I wonder if we should print something else than "Done!" at the end, if
run in --dry-run mode. Or give some indication around the time it says
"Rewinding from last common checkpoint at ...", that it's running in dry-run
mode and won't actually modify anything. The progress messages are a bit
alarming if you don't realize that it's skipping all the writes.

That would be really nice. Wouldn't hurt to also augment that rather
doom-looking "point of no return" comment with a note that says writes won't
happen if in dry-run. :)

For my 2c anyway.

Thanks!

Stephen

-- 
May the force be with you…
https://simply.name
Re: [HACKERS] Modify pg_stat_get_activity to build a tuplestore
Stephen Frost wrote:
As I mentioned on another thread, you're certainly right about that, and
here's the first broken-out patch, which just changes pg_stat_get_activity
to build and return a tuplestore in a single function call instead of using
the old-style multiple-invocation call method. Given the simplicity and the
lack of any user-visible change, and that it's an independently useful
change regardless of what happens with the other changes, I'm planning to
move forward with this pretty quickly, barring concerns, of course.

Um. Abhijit mentioned to me the idea of implementing function execution code
in fmgr that would allow for calling functions lazily (i.e. stop calling it
once enough rows have been returned). One of the reasons not to do that is
that most functions use the materialize mode rather than value-per-call, so
the optimization is pointless because the whole result is computed anyway.
If we change more functions to materialize, we will make that sort of
optimization even more distant.

What is the reason for this change anyway?

-- 
Álvaro Herrera  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] [COMMITTERS] pgsql: Add transforms feature
Peter Eisentraut pete...@gmx.net writes:
On 5/3/15 2:27 PM, Tom Lane wrote:
2. Preventing undefined-symbol errors at link time will hide actual coding
errors, not only the intended cross-plugin reference cases. We have
repeatedly seen the buildfarm members that complain about this find actual
bugs that were missed on laxer platforms. Therefore I think that this
approach is going to lead to masking bugs that we'll find only by much more
expensive/painful means than noticing a buildfarm failure.

I appreciate this issue, and I have actually done quite a bit of research in
the hope of being able to provide similar functionality on other platforms.

I agree that it's probably impractical to persuade toolchains to do this if
they don't want to. But that doesn't mean that it's a good idea to remove
the whole class of error checks on platforms where it *is* possible to make
the check. The entire reason that we have a buildfarm is, in essence, to get
the union of all checks that are possible on any supported platform.

Moreover, I'm not sure this error checking actually buys us much in
practice. A typoed symbol will be flagged by a compiler warning, and any
undefined symbol will be caught by the test suite as soon as the module is
loaded. So I can't imagine what those buildfarm complaints are, although I'd
be interested to look into them and analyze the circumstances if someone
could point me to some concrete cases.

On many platforms, undefined symbols will only be complained of if you
actually try to call them, so I think this position places undue faith in
the code coverage of our buildfarm tests. (As a counterexample, consider
SSL, which I don't think we're exercising at all in the buildfarm because of
the difficulty of setup. And we have had bugs of this sort versus OpenSSL,
cf commit c9e1ad7fa.)

Surely there are other ways to deal with this requirement that don't require
forcing undefined symbols to be treated as valid. For instance, our existing
find_rendezvous_variable() infrastructure might offer a usable solution for
passing a function pointer from plugin A to plugin B.

The rendezvous variables were invented for a different problem, namely that
you can load modules in arbitrary order, and only the first guy initializes
a variable. That's not the functionality that we need.

Sure it is. Look at what plpgsql uses it for: to export a PLpgSQL_plugin
struct containing function pointers that might be used by other extensions.
That seems to me to be exactly the same case as what we're talking about
here.

Also, any such approach would likely entail casting all functions and
variables through common pointer types, which would lose all kinds of
compiler checking. (All rendezvous variables are void pointers.)

Only with very poor coding style. Properly declared function pointers are
pretty type-safe.

It would put quite a bit of extra burden on the authors of modules such as
hstore or postgis to not only maintain a list of exported functions but also
test those interfaces.

Well, that actually gets to the core of my point: if a module author exports
a struct full of function pointers then he has put a stake in the ground as
to exactly what his exported API *is*. With what you've done for transforms,
what is the exported API of hstore or plpython or plperl? Conceivably, any
non-static function in those libraries is now fair game for somebody else to
call. We have this problem already with respect to the core code, and it's
nasty. We should not be encouraging the same situation to develop for
extensions.

I do not understand the second part of your point. Surely, if an extension
author expects other extensions to call function X, he's got the same
testing problem regardless of how the other extensions would reach X
exactly.

			regards, tom lane
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 8 May 2015 at 16:03, Andres Freund and...@anarazel.de wrote:
So I've committed the patch yesterday evening. I'm pretty sure there'll be
some more minor things to change. But overall I feel good about the current
state. It'd be quite helpful if others could read the docs, specifically for
insert, and comment whether they're understandable. I've spent a fair amount
of time trying to make them somewhat simpler, but I do think I only
succeeded partially. And there's also my own brand of english to consider ;)

Omitted only has one m. There's an extra space in error . (See. Otherwise it
reads fine to me, although I've only skimmed it.

I may have misunderstood: there is only one ON CONFLICT action allowed? I
thought the previous version suggested multiple possible targets and
actions; this suggests that while there can be multiple targets the action
is always the same. So I thought I could have

INSERT INTO distributors (did, dname)
  ON CONFLICT (did) DO UPDATE dname=target.dname
  ON CONFLICT (dname) DO NOTHING;

Did I misunderstand?

Finally there's no

INSERT INTO distributors (did, dname)
  SELECT did, dname FROM otherdists
  ON CONFLICT (did) DO NOTHING;

example (or similar); do we think people will be smart enough to realise
that's possible without one?

Geoff
Re: [HACKERS] Broken --dry-run mode in pg_rewind
On 8 May 2015, at 16:39, Vladimir Borodin r...@simply.name wrote:
On 8 May 2015, at 16:11, Stephen Frost sfr...@snowman.net wrote:
* Heikki Linnakangas (hlinn...@iki.fi) wrote:
On 05/08/2015 03:39 PM, Michael Paquier wrote:
On Fri, May 8, 2015 at 9:34 PM, Heikki Linnakangas hlinn...@iki.fi wrote:
On 05/08/2015 03:25 PM, Vladimir Borodin wrote:
Seems that pg_rewind does not account for the --dry-run option properly. A
simple fix for that is attached.

No, the --dry-run takes effect later. It performs all the actions it
normally would, including reading files from the source, except for actually
writing anything in the target. See the dry-run checks in file_ops.c.

Urgh, my test script had an error and I grepped only in pg_rewind.c. Sorry
for the noise.

Even if the patch sent is incorrect, shouldn't there be some process bypass
in updateControlFile() and createBackupLabel() in case of a --dry-run?

They both use open_target_file() and write_target_file(), which check for
--dry-run and do nothing if it's set.

Hmm, I wonder if we should print something else than "Done!" at the end, if
run in --dry-run mode. Or give some indication around the time it says
"Rewinding from last common checkpoint at ...", that it's running in dry-run
mode and won't actually modify anything. The progress messages are a bit
alarming if you don't realize that it's skipping all the writes.

That would be really nice.

Added comments in all places mentioned in this thread.

pg_rewind_dry_run_comments.patch
Description: Binary data

Wouldn't hurt to also augment that rather doom-looking "point of no return"
comment with a note that says writes won't happen if in dry-run. :)

For my 2c anyway.

Thanks!

Stephen

-- 
May the force be with you…
https://simply.name
Re: [HACKERS] transforms vs. CLOBBER_CACHE_ALWAYS
* Peter Eisentraut wrote:
On 4/30/15 2:49 PM, Andrew Dunstan wrote:
friarbird is a FreeBSD buildfarm animal running with -DCLOBBER_CACHE_ALWAYS.
It usually completes a run in about 6.5 hours. However, it's been stuck
since Monday running the plpython regression tests. The only relevant commit
seems to be the transforms feature.

I can reproduce it. I'll look into it.

I looked over the CCA animals and noticed that pademelon and gaur are
apparently unaffected; what they have in common is the OS (HP-UX) and the
Python version (2.5). There's nothing I can do about OS-related differences,
but I thought I'd check the Python angle. With Python 2.5.6, jaguarundi
locks up on the plpython tests just the same as with 3.4, and friarbird with
2.7. So that is not the culprit, either.

I ran make check by hand, and noticed three tests where it seemed to hang (I
gave it at least three minutes each, and the functions in the queries are
simple):

plpython_spi        SELECT cursor_plan();
plpython_setof      SELECT test_setof_as_list(1, 'list');
plpython_composite  SELECT multiout_simple_setof();

These are the only plpython tests that mention SETOF at all, and the queries
that hung are the first ones in their respective tests to actually build a
set. Does that help?

-- 
Christian
Re: [HACKERS] Disabling trust/ident authentication configure option
--On 6 May 2015 16:28:43 -0400 Andrew Dunstan and...@dunslane.net wrote:

Single user sessions would work, but peer authentication is also still
available and should be the preferred method to reset passwords when trust
is disabled, so this should not be an issue.

(Personally I think there's a very good case for completely ripping out
RFC1413 ident auth. I've not seen it used in a great long while, and it's
always been a security risk.)

I have the same feeling. I haven't seen it used anywhere in the last 6+
years and I personally think it's a relic... so +1.

-- 
Thanks
Bernd
Re: [HACKERS] Modify pg_stat_get_activity to build a tuplestore
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
Stephen Frost wrote:
As I mentioned on another thread, you're certainly right about that, and
here's the first broken-out patch, which just changes pg_stat_get_activity
to build and return a tuplestore in a single function call instead of using
the old-style multiple-invocation call method. Given the simplicity and the
lack of any user-visible change, and that it's an independently useful
change regardless of what happens with the other changes, I'm planning to
move forward with this pretty quickly, barring concerns, of course.

Um. Abhijit mentioned to me the idea of implementing function execution code
in fmgr that would allow for calling functions lazily (i.e. stop calling it
once enough rows have been returned). One of the reasons not to do that is
that most functions use the materialize mode rather than value-per-call, so
the optimization is pointless because the whole result is computed anyway.
If we change more functions to materialize, we will make that sort of
optimization even more distant.

With pg_stat_get_activity, at least, we already deal with this in a way: you
can pass in the specific pid you want to look at and then we'll just return
the one row that has that pid. I'm not saying that such an optimization
isn't a good idea (I'm all for it, in fact), but it seems unlikely to help
this particular function much. Wouldn't we also have to change the API to
support that, as there are functions which expect to be called all the way
through and do something at the end?

I've long wanted to get at information like whether there is a LIMIT or a
WHERE clause or set of conditionals which applies to the results of a given
function, and have that exposed to all PLs (pl/pgsql being the big one
there). That'd be far more valuable than this short-circuiting, not that I'm
against having both, of course.

What is the reason for this change anyway?

Hrm, guess I wasn't clear enough in the commit message. The reason is that
the default roles patch turns it into a helper function which takes
arguments and is then called from multiple functions which are exposed at
the SQL level. That's much simpler when working with a tuplestore, since we
can create the tuplestore in the calling function and have the helper just
populate it for us, based on the parameters passed in.

Thanks!

Stephen
Re: [HACKERS] subxcnt defined as signed integer in SnapshotData and SerializeSnapshotData
On 8 May 2015 at 13:02, Michael Paquier michael.paqu...@gmail.com wrote:
On Fri, May 8, 2015 at 3:55 PM, Simon Riggs si...@2ndquadrant.com wrote:
On 7 May 2015 at 21:40, Michael Paquier michael.paqu...@gmail.com wrote:

Hi all,

Coverity is complaining about the following assertion introduced in commit
924bcf4 (parallel stuff, SerializeSnapshot@snapmgr.c):

+ Assert(snapshot->xcnt >= 0);

Now the thing is that this assertion does not make much sense, because
SnapshotData defines xcnt as uint32 in snapshot.h. While we could simply
remove this assertion, I am wondering if we could not change subxcnt to
uint32 instead. SnapshotData has been introduced in 2008 by d43b085, with
this comment:

+ int32 subxcnt; /* # of xact ids in subxip[], -1 if overflow */

The comment regarding negative values was removed in efc16ea5. Now, by
looking at the code on HEAD, I am seeing no code paths that make use of
negative values of subxcnt. Perhaps I am missing something?

So the comment is wrong? It does not get set to -1 at overflow anymore?

SnapshotData.suboverflowed is used instead. Have a look at efc16ea5 in
procarray.c to convince yourself:

@@ -785,16 +1121,17 @@ GetSnapshotData(Snapshot snapshot)
 	 *
 	 * Again, our own XIDs are not included in the snapshot.
 	 */
-	if (subcount >= 0 && proc != MyProc)
+	if (!suboverflowed && proc != MyProc)
 	{
 		if (proc->subxids.overflowed)
-			subcount = -1;		/* overflowed */
+			suboverflowed = true;
 		else

I think that we should redefine subxcnt as uint32 for consistency with xcnt,
and remove the two assertions that 924bcf4 has introduced. I could get a
patch quickly done FWIW.

(uint32) +1

-- 
Simon Riggs  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 8 May 2015 at 16:51, Andres Freund and...@anarazel.de wrote:
On 2015-05-08 16:36:07 +0100, Geoff Winkless wrote:
I thought the previous version suggested multiple possible targets and
actions; this suggests that while there can be multiple targets the action
is always the same.

I don't think any version of the patch included that functionality. I can
see it being useful, but it'd make a bunch of things more complicated, so I
doubt we'll get there for 9.5.

I'm not particularly bothered by it - I only see it ever being useful in
extreme edge cases, and I certainly wouldn't want it to hold up the release
- but it was the only reason I could see for requiring a target_clause in
the first place (I expect that's why I misread the docs previously!). If you
can't specify different actions based on the target_clause, what's the point
in forcing the user to enumerate it?

example (or similar); do we think people will be smart enough to realise
that's possible without one?

Hm. I'm tempted to say that the synopsis makes that clear enough.

Unless I'm missing it, it's really only in the "This happens on a row-by-row
basis" and "deterministic" paragraphs that you even mention multi-line
inserts in the ON CONFLICT section.

I personally never check such examples though, so maybe I'm the wrong person
to judge.

I'm afraid I'm the sort of person who goes straight to the examples :) Maybe
I'll just suggest then that there's a _potential for confusion_ if you only
give examples of the first kind - people might place some inference on the
fact that the examples only show single-row INSERTs.

Geoff
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 2015-05-08 12:32:10 -0400, Tom Lane wrote:
Andres Freund and...@anarazel.de writes:
So I've committed the patch yesterday evening. I'm pretty sure there'll be
some more minor things to change. But overall I feel good about the current
state.

Looks like there's a CLOBBER_CACHE_ALWAYS issue ...
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguarundi&dt=2015-05-08%2011%3A52%3A00

Hm. I can't immediately think of anything CCA-relevant in the patch. I do
wonder if it's possible that this is a consequence of indcheckxmin. The
unique index is created just before the failing statement, after the table
has had a couple of updates. And the following statement succeeds.

Currently index inference ignores indexes that aren't yet valid according to
indcheckxmin. I'm tempted to think that is a mistake. I think we should
throw an error instead, possibly at execution rather than plan time. It'll
be rather confusing for users if an INSERT ... ON CONFLICT fails shortly
after an index creation, but not later. Not that an error message will make
them happy, but it'll be much less confusing.

Greetings,

Andres Freund
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
Kouhei Kaigai kai...@ak.jp.nec.com writes: I've been trying to code-review this patch, because the documentation seemed several bricks shy of a load, and I find myself entirely confused by the fdw_ps_tlist and custom_ps_tlist fields. Main-point of your concern is lack of documentation/comments to introduce how does the pseudo-scan targetlist works here, isn't it?? Well, there's a bunch of omissions and outright errors in the docs and comments, but this is the main issue that I was uncertain how to fix from looking at the patch. Also, if that is what they're for (ie, to allow the FDW to redefine the scan tuple contents) it would likely be better to decouple that feature from whether the plan node is for a simple scan or a join. In this version, we don't intend FDW/CSP to redefine the contents of scan tuples, even though I want off-loads heavy targetlist calculation workloads to external computing resources in *the future version*. I do not think it's a good idea to introduce such a field now and then redefine how it works and what it's for in a future version. We should not be moving the FDW APIs around more than we absolutely have to, especially not in ways that wouldn't throw an obvious compile error for un-updated code. Also, the longer we wait to make a change that we know we want, the more pain we inflict on FDW authors (simply because there will be more of them a year from now than there are today). The business about resjunk columns in that list also seems a bit half baked, or at least underdocumented. I'll add source code comments to introduce how does it works any when does it have resjunk=true. It will be a bit too deep to be introduced in the SGML file. I don't actually see a reason for resjunk marking in that list at all, if what it's for is to define the contents of the scan tuple. 
I think we should just s/ExecCleanTypeFromTL/ExecTypeFromTL/ in nodeForeignscan and nodeCustom, and get rid of the sanity check in create_foreignscan_plan (which is pretty pointless anyway, considering the number of other ways you could screw up that tlist without it being detected). I'm also inclined to rename the fields to fdw_scan_tlist/custom_scan_tlist, which would better reflect what they do, and to change the API of make_foreignscan() to add a parameter corresponding to the scan tlist. It's utterly bizarre and error-prone that this patch has added a field that the FDW is supposed to set and not changed make_foreignscan to match. I do not think that this should have gotten committed without an attendant proof-of-concept patch to postgres_fdw, so that the logic could be tested. Hanada-san is now working according to the comments from Robert. That's nice, but 9.5 feature freeze is only a week away. I don't have a lot of confidence that this stuff is actually in a state where we won't regret shipping it in 9.5. regards, tom lane
Re: [HACKERS] transforms vs. CLOBBER_CACHE_ALWAYS
* Tom Lane wrote: Christian Ullrich ch...@chrullrich.net writes: * Peter Eisentraut wrote: On 4/30/15 2:49 PM, Andrew Dunstan wrote: friarbird is a FreeBSD buildfarm animal running with -DCLOBBER_CACHE_ALWAYS. It usually completes a run in about 6.5 hours. However, it's been stuck since Monday running the plpython regression tests. The only relevant commit seems to be the transforms feature. I can reproduce it. I'll look into it. I looked over the CCA animals and noticed that pademelon and gaur are apparently unaffected; pademelon and gaur do not run CCA (if they did, it would take weeks for a run to complete :-(). Ah. I obviously had associated the note icon with CCA. Sorry. OTOH, if I had noticed that, I might not have gone into more detail ... -- Christian
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
Andres Freund and...@anarazel.de writes: So I've committed the patch yesterday evening. I'm pretty sure there'll be some more minor things to change. But overall I feel good about the current state. Looks like there's a CLOBBER_CACHE_ALWAYS issue ... http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguarundi&dt=2015-05-08%2011%3A52%3A00 regards, tom lane
[HACKERS] Re: [COMMITTERS] pgsql: Teach autovacuum about multixact member wraparound.
Robert Haas rh...@postgresql.org wrote: This patch leaves unsolved the problem of ensuring that emergency autovacuums are triggered even when autovacuum=off. We'll need to fix that via a separate patch. I think we also need something like your previously-posted multixact-truncate-race.patch as a follow-on. I'm not at all convinced that there is no possibility of truncating new data as things stand; and even if that is true now, it seems terribly fragile. (Only mentioning it because it wasn't noted in the commit message.) -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 2015-05-08 19:32:02 +0200, Andres Freund wrote: If the failure is indeed caused by checkxmin (trying to reproduce right now), we can just remove the updates in that subsection of the tests. They're not relevant. Hm. Or easier and uglier, replace the CREATE INDEX statements with CREATE INDEX CONCURRENTLY. There's a fair bunch of tests in that file, and keeping them update-free will be annoying. Greetings, Andres Freund
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 05/08/2015 08:25 PM, Tom Lane wrote: Andres Freund and...@anarazel.de writes: On 2015-05-08 12:32:10 -0400, Tom Lane wrote: Looks like there's a CLOBBER_CACHE_ALWAYS issue ... http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguarundi&dt=2015-05-08%2011%3A52%3A00 Currently index inference ignores indexes that aren't yet valid according to checkxmin. I'm tempted to think that is a mistake. I think we should throw an error instead; possibly at execution rather than plan time. If that's what is happening, then throwing an error is not going to fix the problem of the regression test results being unstable; it'll just be a different error that might or might not get thrown depending on timing. Why does INSERT ON CONFLICT pay attention to indcheckxmin? The uniqueness check only cares about the most recent committed version of the tuple, and the index is good for that use immediately. If there was a problem there, the uniqueness check in a normal insert would have the same problem. - Heikki
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 2015-05-08 16:36:07 +0100, Geoff Winkless wrote: Omitted only has one m. There's an extra space in error . (See. Otherwise it reads fine to me, although I've only skimmed it. Thanks, I'll push fixes for those. I may have misunderstood: there is only one ON CONFLICT action allowed? Yes. I thought the previous version suggested multiple possible targets and actions, this suggests that while there can be multiple targets the action is always the same. I don't think any version of the patch included that functionality. I can see it being useful, but it'd make a bunch of things more complicated, so I doubt we'll get there for 9.5. So I thought I could have INSERT INTO distributors (did, dname) ON CONFLICT (did) DO UPDATE dname=target.dname ON CONFLICT (dname) DO NOTHING; Did I misunderstand? Finally there's no INSERT INTO distributors (did, dname) SELECT did, dname FROM otherdists ON CONFLICT (did) DO NOTHING; example (or similar); do we think people will be smart enough to realise that's possible without one? Hm. I'm tempted to say that the synopsis makes that clear enough. I personally never check such examples though, so maybe I'm the wrong person to judge.
Re: [HACKERS] [COMMITTERS] pgsql: Teach autovacuum about multixact member wraparound.
On Fri, May 8, 2015 at 1:23 PM, Kevin Grittner kgri...@ymail.com wrote: Robert Haas rh...@postgresql.org wrote: This patch leaves unsolved the problem of ensuring that emergency autovacuums are triggered even when autovacuum=off. We'll need to fix that via a separate patch. I think we also need something like your previously-posted multixact-truncate-race.patch as a follow-on. I'm not at all convinced that there is no possibility of truncating new data as things stand; and even if that is true now, it seems terribly fragile. (Only mentioning it because it wasn't noted in the commit message.) I agree. I'm writing up a summary of open issues now. I didn't mention it in the commit message because it's not related to the autovacuum part of the problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
Andres Freund and...@anarazel.de writes: On 2015-05-08 12:32:10 -0400, Tom Lane wrote: Looks like there's a CLOBBER_CACHE_ALWAYS issue ... http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguarundi&dt=2015-05-08%2011%3A52%3A00 Currently index inference ignores indexes that aren't yet valid according to checkxmin. I'm tempted to think that is a mistake. I think we should throw an error instead; possibly at execution rather than plan time. If that's what is happening, then throwing an error is not going to fix the problem of the regression test results being unstable; it'll just be a different error that might or might not get thrown depending on timing. regards, tom lane
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 2015-05-08 13:25:22 -0400, Tom Lane wrote: Andres Freund and...@anarazel.de writes: On 2015-05-08 12:32:10 -0400, Tom Lane wrote: Looks like there's a CLOBBER_CACHE_ALWAYS issue ... http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguarundi&dt=2015-05-08%2011%3A52%3A00 Currently index inference ignores indexes that aren't yet valid according to checkxmin. I'm tempted to think that is a mistake. I think we should throw an error instead; possibly at execution rather than plan time. If that's what is happening, then throwing an error is not going to fix the problem of the regression test results being unstable; it'll just be a different error that might or might not get thrown depending on timing. Yep, that was more a comment about how the behaviour generally should be. If the failure is indeed caused by checkxmin (trying to reproduce right now), we can just remove the updates in that subsection of the tests. They're not relevant. Greetings, Andres Freund
[HACKERS] ALTER SYSTEM and ParseConfigFile()
Greetings, While working through the pg_file_settings patch, I came across this comment above ParseConfigFp() (which is called by ParseConfigFile()): src/backend/utils/misc/guc-file.l:603 -- * Output parameters: * head_p, tail_p: head and tail of linked list of name/value pairs * * *head_p and *tail_p must be initialized to NULL before calling the outer * recursion level. On exit, they contain a list of name-value pairs read * from the input file(s). -- However, in 65d6e4c, ProcessConfigFile(), which isn't part of the recursion, was updated with a second call to ParseConfigFile (for the PG_AUTOCONF_FILENAME file), passing in the head and tail values which had been set by the first call. I'm a bit nervous that there might be an issue here due to how flex errors are handled and the recursion, though it might also be fine (but then why comment about it?). In any case, either the comment needs to be changed, or we should be passing clean NULL variables to ParseConfigFile and then combining the results in ProcessConfigFile(). This is pretty orthogonal to the changes for pg_file_settings, so I'm going to continue working through that and hopefully get it wrapped up soon. Thanks! Stephen
Re: [HACKERS] ALTER SYSTEM and ParseConfigFile()
Stephen Frost sfr...@snowman.net writes: Greetings, While working through the pg_file_settings patch, I came across this comment above ParseConfigFp() (which is called by ParseConfigFile()): src/backend/utils/misc/guc-file.l:603 -- * Output parameters: * head_p, tail_p: head and tail of linked list of name/value pairs * * *head_p and *tail_p must be initialized to NULL before calling the outer * recursion level. On exit, they contain a list of name-value pairs read * from the input file(s). -- However, in 65d6e4c, ProcessConfigFile(), which isn't part of the recursion, was updated with a second call to ParseConfigFile (for the PG_AUTOCONF_FILENAME file), passing in the head and tail values which had been set by the first call. I'm a bit nervous that there might be an issue here due to how flex errors are handled and the recursion, though it might also be fine (but then why comment about it?). In any case, either the comment needs to be changed, or we should be passing clean NULL variables to ParseConfigFile and then combining the results in ProcessConfigFile(). I think the code is OK, but yeah, this comment should be changed to reflect the idea that the function will append entries to an existing list of name/value pairs (and thus, that head_p/tail_p are not output but in/out parameters). regards, tom lane
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
Andres Freund and...@anarazel.de writes: On 2015-05-08 11:10:00 -0700, Peter Geoghegan wrote: +1. I knew we should have done this before commit. Hrmpf. I couldn't hit the problem with CCA unfortunately, even after a bunch of tries; quite possibly it's too fast on my laptop. Maybe just hold an open transaction in another session while you do what the regression test does? I think this is probably not a matter of CCA per se but just timing. It's unfortunate that the test in question is run serially without other transactions occurring concurrently, as that would likely have exposed the problem far sooner. regards, tom lane
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
Andres Freund and...@anarazel.de writes: I think Peter (on IM) just found a more likely explanation than mine. index_close(idxRel, NoLock); heap_close(relation, NoLock); candidates = lappend_oid(candidates, idxForm->indexrelid); ... Yes. idxForm points into idxRel->rd_index. Ooops. But shouldn't that have failed 100% of the time in a CCA build? Or is the candidates list fairly noncritical? regards, tom lane
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 2015-05-08 11:10:00 -0700, Peter Geoghegan wrote: +1. I knew we should have done this before commit. Hrmpf. I couldn't hit the problem with CCA unfortunately, even after a bunch of tries; quite possibly it's too fast on my laptop. So I'll just have to remove the check and we'll see whether it makes jaguarundi happy over the next week or so.
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
On Fri, May 8, 2015 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: That's nice, but 9.5 feature freeze is only a week away. I don't have a lot of confidence that this stuff is actually in a state where we won't regret shipping it in 9.5. Yeah. The POC you were asking for upthread certainly exists and has for a while, or I would not have committed this. But I do not think it likely that the postgres_fdw support will be ready for 9.5. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 2015-05-08 14:59:22 -0400, Tom Lane wrote: Andres Freund and...@anarazel.de writes: I think Peter (on IM) just found a more likely explanation than mine. index_close(idxRel, NoLock); heap_close(relation, NoLock); candidates = lappend_oid(candidates, idxForm->indexrelid); ... Yes. idxForm points into idxRel->rd_index. Ooops. But shouldn't that have failed 100% of the time in a CCA build? Or is the candidates list fairly noncritical? Afaics RELCACHE_FORCE_RELEASE is independent of CCA (should we change that?). So this is probably a bug, just not a relevant bug. I can't see how it'd take effect in this case. Greetings, Andres Freund
Re: [HACKERS] deparsing utility commands
Alvaro Herrera wrote: Here's a cleaned up version of this patch; I threw together a very quick test module, and updated a conflicting OID. As far as I can tell, I'm only missing the documentation updates before this is push-able. Here is a complete version. Barring serious problems, I intend to commit this on Monday. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 2015-05-08 14:30:46 -0400, Tom Lane wrote: Maybe just hold an open transaction in another session while you do what the regression test does? I think this is probably not a matter of CCA per se but just timing. It's unfortunate that the test in question is run serially without other transactions occurring concurrently, as that would likely have exposed the problem far sooner. I think Peter (on IM) just found a more likely explanation than mine. index_close(idxRel, NoLock); heap_close(relation, NoLock); candidates = lappend_oid(candidates, idxForm->indexrelid); ... Yes. idxForm points into idxRel->rd_index. He's working on a fix for both this and removal of the still unnecessary indcheckxmin check. Greetings, Andres Freund
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On Fri, May 8, 2015 at 11:35 AM, Andres Freund and...@anarazel.de wrote: I think Peter (on IM) just found a more likely explanation than mine. index_close(idxRel, NoLock); heap_close(relation, NoLock); candidates = lappend_oid(candidates, idxForm->indexrelid); ... Yes. idxForm points into idxRel->rd_index. He's working on a fix for both this and removal of the still unnecessary indcheckxmin check. This was my (last minute) blunder, in case it matters. Attached patch fixes the bug, and removes the unnecessary indcheckxmin check. -- Peter Geoghegan

diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 8bcc506..1d555ed 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -547,15 +547,11 @@ infer_arbiter_indexes(PlannerInfo *root)
 			goto next;
 
 		/*
-		 * If the index is valid, but cannot yet be used, ignore it. See
-		 * src/backend/access/heap/README.HOT for discussion.
-		 */
-		if (idxForm->indcheckxmin &&
-			!TransactionIdPrecedes(HeapTupleHeaderGetXmin(idxRel->rd_indextuple->t_data),
-								   TransactionXmin))
-			goto next;
-
-		/*
+		 * Note that we do not perform a get_relation_info()-style check
+		 * against indcheckxmin here to eliminate candidates, because
+		 * uniqueness checking only cares about the most recently committed
+		 * tuple versions.
+		 *
 		 * Look for match on "ON constraint_name" variant, which may not be
 		 * unique constraint.  This can only be a constraint name.
 		 */
@@ -566,10 +562,10 @@ infer_arbiter_indexes(PlannerInfo *root)
 					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 					 errmsg("ON CONFLICT DO UPDATE not supported with exclusion constraints")));
 
+			candidates = lappend_oid(candidates, idxForm->indexrelid);
 			list_free(indexList);
 			index_close(idxRel, NoLock);
 			heap_close(relation, NoLock);
-			candidates = lappend_oid(candidates, idxForm->indexrelid);
 			return candidates;
 		}
 		else if (indexOidFromConstraint != InvalidOid)
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On Fri, May 8, 2015 at 12:00 PM, Peter Geoghegan p...@heroku.com wrote: On Fri, May 8, 2015 at 11:59 AM, Tom Lane t...@sss.pgh.pa.us wrote: Ooops. But shouldn't that have failed 100% of the time in a CCA build? Or is the candidates list fairly noncritical? The candidates list is absolutely critical. However, the problematic code path is only hit a couple of times in the regression tests. -- Peter Geoghegan
Re: [HACKERS] multixacts woes
On 2015-05-08 14:32:14 -0400, Robert Haas wrote: On Fri, May 8, 2015 at 2:27 PM, Andres Freund and...@anarazel.de wrote: On 2015-05-08 14:15:44 -0400, Robert Haas wrote: Apparently, we have been hanging our hat since the release of 9.3.0 on the theory that the average multixact won't ever have more than two members, and therefore the members SLRU won't overwrite itself and corrupt data. It's essentially a much older problem - it has essentially existed since multixacts were introduced (8.1?). The consequences of it were much lower before 9.3 though. OK, I wasn't aware of that. What exactly were the consequences before 9.3? I think just problems when locking a row. That's obviously much less bad than problems when reading a row. FWIW, I intend to either work on this myself, or help whoever seriously tackles this, in the next cycle. That would be great. With this I mean freeze avoidance. While I obviously, having proposed it as well at some point, think that freeze maps are a possible solution, I'm not yet sure that it's the best solution. I'll investigate what resources EnterpriseDB can commit to this. Cool. Greetings, Andres Freund
Re: [HACKERS] deparsing utility commands
On Fri, May 8, 2015 at 3:14 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Alvaro Herrera wrote: Here's a cleaned up version of this patch; I threw together a very quick test module, and updated a conflicting OID. As far as I can tell, I'm only missing the documentation updates before this is push-able. Here is a complete version. Barring serious problems, I intend to commit this on Monday. You forgot to attach the patches! Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
* Tom Lane (t...@sss.pgh.pa.us) wrote: I wrote: Peter Geoghegan p...@heroku.com writes: On Fri, May 8, 2015 at 11:59 AM, Tom Lane t...@sss.pgh.pa.us wrote: Ooops. But shouldn't that have failed 100% of the time in a CCA build? Or is the candidates list fairly noncritical? The candidates list is absolutely critical. Oh, I was confusing CCA with RELCACHE_FORCE_RELEASE, which does something a bit different. Actually, looking closer, the quoted code is simply not broken without RELCACHE_FORCE_RELEASE: without that, neither heap_close nor index_close will do anything that could cause a cache flush. So while it's certainly good practice to move that lappend_oid call up, it does not explain the observed symptoms. We still need some more investigation here. Couldn't a cache flush request come from another backend? Although this isn't being run in a parallel group, is it? Maybe a delayed signal that happens to show up late at just the right time? Dunno if we've ever actually seen that but the thought occurred to me. Thanks! Stephen
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Fri, May 8, 2015 at 4:23 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Thu, May 7, 2015 at 7:55 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On 5/7/15, Sawada Masahiko sawada.m...@gmail.com wrote: On Wed, May 6, 2015 at 5:42 AM, Robert Haas robertmh...@gmail.com wrote: On Tue, May 5, 2015 at 11:10 AM, Sawada Masahiko sawada.m...@gmail.com wrote: On Fri, May 1, 2015 at 9:04 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Apr 30, 2015 at 11:05 PM, Sawada Masahiko sawada.m...@gmail.com wrote: VACUUM has both syntaxes: with parentheses and without parentheses. I think we should have both syntaxes for REINDEX like VACUUM does, because it would be a pain to put parentheses whenever we want to do REINDEX. Are the parentheses optional in the REINDEX command? No. The unparenthesized VACUUM syntax was added back before we realized that that kind of syntax is a terrible idea. It requires every option to be a keyword, and those keywords have to be in a fixed order. I believe the intention is to keep the old VACUUM syntax around for backward-compatibility, but not to extend it. Same for EXPLAIN and COPY. REINDEX will have only one option, VERBOSE, for now. Even in a situation like this, where it's not clear that any additional option will be added to REINDEX, do we still need to require parentheses? In my opinion, yes. The whole point of a flexible options syntax is that we can add new options without changing the grammar. That involves some compromise on the syntax, which doesn't bother me a bit. Our previous experiments with this for EXPLAIN and COPY and VACUUM have worked out quite well, and I see no reason for pessimism here. I agree that the flexible option syntax does not need to change the grammar whenever we add new options. The attached patch is changed based on your suggestion. And the patch for reindexdb is also attached. Feedback is welcome. 
Also I'm not sure that both implementation and documentation regarding the VERBOSE option should be optional. I don't know what this means. Sorry for confusing you. Please ignore this. Sorry, I forgot to attach the files. I applied the two patches to master and I got some errors when compiling: tab-complete.c: In function ‘psql_completion’: tab-complete.c:3338:12: warning: left-hand operand of comma expression has no effect [-Wunused-value] {"TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE", NULL}; ^ tab-complete.c:3338:21: warning: left-hand operand of comma expression has no effect [-Wunused-value] {"TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE", NULL}; ^ tab-complete.c:3338:31: warning: left-hand operand of comma expression has no effect [-Wunused-value] {"TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE", NULL}; ^ tab-complete.c:3338:41: warning: left-hand operand of comma expression has no effect [-Wunused-value] {"TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE", NULL}; ^ tab-complete.c:3338:53: warning: left-hand operand of comma expression has no effect [-Wunused-value] {"TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE", NULL}; ^ tab-complete.c:3338:5: warning: statement with no effect [-Wunused-value] {"TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE", NULL}; ^ tab-complete.c:3338:59: error: expected ‘;’ before ‘}’ token {"TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE", NULL}; ^ tab-complete.c:3340:22: error: ‘list_REINDEX’ undeclared (first use in this function) COMPLETE_WITH_LIST(list_REINDEX); ^ tab-complete.c:169:22: note: in definition of macro ‘COMPLETE_WITH_LIST’ completion_charpp = list; \ ^ tab-complete.c:3340:22: note: each undeclared identifier is reported only once for each function it appears in COMPLETE_WITH_LIST(list_REINDEX); ^ tab-complete.c:169:22: note: in definition of macro ‘COMPLETE_WITH_LIST’ completion_charpp = list; \ ^ make[3]: *** [tab-complete.o] Error 1 make[3]: *** Waiting for unfinished jobs make[2]: *** [install-psql-recurse] Error 2 make[2]: *** Waiting for unfinished jobs make[1]: *** [install-bin-recurse] 
Error 2 make: *** [install-src-recurse] Error 2 Looking at the code I think you removed one line accidentally from tab-complete.c: $ git diff src/bin/psql/tab-complete.c diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c index 750e29d..55b0df5 100644 --- a/src/bin/psql/tab-complete.c +++ b/src/bin/psql/tab-complete.c @@ -3335,7 +3335,6 @@ psql_completion(const char *text, int start, int end) /* REINDEX */ else if (pg_strcasecmp(prev_wd, "REINDEX") == 0) { - static const
Re: [HACKERS] deparsing utility commands
Alvaro Herrera wrote: Alvaro Herrera wrote: Here's a cleaned up version of this patch; I threw together a very quick test module, and updated a conflicting OID. As far as I can tell, I'm only missing the documentation updates before this is push-able. Here is a complete version. Barring serious problems, I intend to commit this on Monday. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index fb39731..04762e8 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -18270,6 +18270,97 @@ CREATE EVENT TRIGGER test_table_rewrite_oid /programlisting /para /sect2 + + sect2 id=pg-event-trigger-ddl-command-end-functions + titleCapturing Changes at Command End/title + + indexterm +primarypg_event_trigger_ddl_commands/primary + /indexterm + + para +functionpg_event_trigger_ddl_commands/ returns a list of +acronymDDL/acronym (data definition language) commands executed by +each user action, when invoked in a function attached to a +literalddl_command_end/ event trigger. If called in any other +context, an error is raised. +functionpg_event_trigger_ddl_commands/ returns one row for each +base command executed; some commands that are a single SQL sentence +may return more than one row. This function returns the following +columns: + +informaltable + tgroup cols=3 + thead + row +entryName/entry +entryType/entry +entryDescription/entry + /row + /thead + + tbody + row +entryliteralclassid/literal/entry +entrytypeOid/type/entry +entryOID of catalog the object belongs in/entry + /row + row +entryliteralobjid/literal/entry +entrytypeOid/type/entry +entryOID of the object in the catalog/entry + /row + row +entryliteralobjsubid/literal/entry +entrytypeinteger/type/entry +entryObject sub-id (e.g. 
attribute number for columns)/entry + /row + row +entryliteralcommand_tag/literal/entry +entrytypetext/type/entry +entrycommand tag/entry + /row + row +entryliteralobject_type/literal/entry +entrytypetext/type/entry +entryType of the object/entry + /row + row +entryliteralschema_name/literal/entry +entrytypetext/type/entry +entry + Name of the schema the object belongs in, if any; otherwise literalNULL/. + No quoting is applied. +/entry + /row + row +entryliteralobject_identity/literal/entry +entrytypetext/type/entry +entry + Text rendering of the object identity, schema-qualified. Each and every + identifier present in the identity is quoted if necessary. +/entry + /row + row +entryliteralin_extension/literal/entry +entrytypebool/type/entry +entrywhether the command is part of an extension script/entry + /row + row +entryliteralcommand/literal/entry +entrytypepg_ddl_command/type/entry +entry + A complete representation of the command, in internal format. + This cannot be output directly, but it can be passed to other + functions to obtain different pieces of information about the + command. 
+/entry + /row + /tbody + /tgroup +/informaltable + /para + /sect2 /sect1 /chapter diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c index 8e75c27..943909c 100644 --- a/src/backend/catalog/aclchk.c +++ b/src/backend/catalog/aclchk.c @@ -48,6 +48,7 @@ #include catalog/pg_ts_config.h #include catalog/pg_ts_dict.h #include commands/dbcommands.h +#include commands/event_trigger.h #include commands/proclang.h #include commands/tablespace.h #include foreign/foreign.h @@ -56,6 +57,7 @@ #include parser/parse_func.h #include parser/parse_type.h #include utils/acl.h +#include utils/aclchk_internal.h #include utils/builtins.h #include utils/fmgroids.h #include utils/lsyscache.h @@ -65,32 +67,6 @@ /* - * The information about one Grant/Revoke statement, in internal format: object - * and grantees names have been turned into Oids, the privilege list is an - * AclMode bitmask. If 'privileges' is ACL_NO_RIGHTS (the 0 value) and - * all_privs is true, 'privileges' will be internally set to the right kind of - * ACL_ALL_RIGHTS_*, depending on the object type (NB - this will modify the - * InternalGrant struct!) - * - * Note: 'all_privs' and 'privileges' represent object-level privileges only. - * There might also be column-level privilege specifications, which are - * represented in col_privs (this is a list of untransformed AccessPriv nodes). - * Column privileges are only valid for objtype ACL_OBJECT_RELATION. -
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On Fri, May 8, 2015 at 11:06 AM, Andres Freund and...@anarazel.de wrote: On 2015-05-08 20:37:15 +0300, Heikki Linnakangas wrote: Why does INSERT ON CONFLICT pay attention to indcheckxmin? Uniqueness check only cares about the most recent committed version of the tuple, and the index is good for that use immediately. If there was a problem there, the uniqueness check in a normal insert would have the same problem. Yea, that's a good angle to view this from. That'd not be the case, I think, if we'd allow this to be used on system relations. But since neither creating an index on system relations, nor using INSERT ON CONFLICT on them is supported... +1. I knew we should have done this before commit. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] multixacts woes
My colleague Thomas Munro and I have been working with Alvaro, and also with Kevin and Amit, to fix bug #12990, a multixact-related data corruption bug. I somehow did not realize until very recently that we actually use two SLRUs to keep track of multixacts: one for the multixacts themselves (pg_multixacts/offsets) and one for the members (pg_multixacts/members). Confusingly, members are sometimes called offsets, and offsets are sometimes called IDs, or multixacts. If either of these SLRUs wraps around, we get data loss. This comment in multixact.c explains it well: /* * Since multixacts wrap differently from transaction IDs, this logic is * not entirely correct: in some scenarios we could go for longer than 2 * billion multixacts without seeing any data loss, and in some others we * could get in trouble before that if the new pg_multixact/members data * stomps on the previous cycle's data. For lack of a better mechanism we * use the same logic as for transaction IDs, that is, start taking action * halfway around the oldest potentially-existing multixact. */ multiWrapLimit = oldest_datminmxid + (MaxMultiXactId >> 1); if (multiWrapLimit < FirstMultiXactId) multiWrapLimit += FirstMultiXactId; Apparently, we have been hanging our hat since the release of 9.3.0 on the theory that the average multixact won't ever have more than two members, and therefore the members SLRU won't overwrite itself and corrupt data. This is not good enough: we need to prevent multixact IDs from wrapping around, and we separately need to prevent multixact members from wrapping around, and the current code was conflating those things in a way that simply didn't work. Recent commits by Alvaro and by me have mostly fixed this, but there are a few loose ends: 1. I believe that there is still a narrow race condition that can cause the multixact code to go crazy and delete all of its data when operating very near the threshold for member space exhaustion. 
See http://www.postgresql.org/message-id/ca+tgmozihwybetx8nzzptosjprg2kcr-nawgajkzclcbvj1...@mail.gmail.com for the scenario and proposed fix. 2. We have some logic that causes autovacuum to run in spite of autovacuum=off when wraparound threatens. My commit 53bb309d2d5a9432d2602c93ed18e58bd2924e15 provided most of the anti-wraparound protections for multixact members that exist for multixact IDs and for regular XIDs, but this remains an outstanding issue. I believe I know how to fix this, and will work up an appropriate patch based on some of Thomas's earlier work. 3. It seems to me that there is a danger that some users could see extremely frequent anti-mxid-member-wraparound vacuums as a result of this work. Granted, that beats data corruption or errors, but it could still be pretty bad. The default value of autovacuum_multixact_freeze_max_age is 400000000. Anti-mxid-member-wraparound vacuums kick in when you exceed 25% of the addressable space, or 1073741824 total members. So, if your typical multixact has more than 1073741824/400000000 = ~2.68 members, you're going to see more autovacuum activity as a result of this change. We're effectively capping autovacuum_multixact_freeze_max_age at 1073741824/(average size of your multixacts). If your multixacts are just a couple of members (like 3 or 4) this is probably not such a big deal. If your multixacts typically run to 50 or so members, your effective freeze age is going to drop from 400m to ~21.4m. At that point, I think it's possible that relminmxid advancement might start to force full-table scans more often than would be required for relfrozenxid advancement. If so, that may be a problem for some users. What can we do about this? Alvaro proposed back-porting his fix for bug #8470, which avoids locking a row if a parent subtransaction already has the same lock. Alvaro tells me (via chat) that on some workloads this can dramatically reduce multixact size, which is certainly appealing. 
But the fix looks fairly invasive - it changes the return value of HeapTupleSatisfiesUpdate in certain cases, for example - and I'm not sure it's been thoroughly code-reviewed by anyone, so I'm a little nervous about the idea of back-porting it at this point. I am inclined to think it would be better to release the fixes we have - after handling items 1 and 2 - and then come back to this issue. Another thing to consider here is that if the high rate of multixact consumption is organic rather than induced by lots of subtransactions of the same parent locking the same tuple, this fix won't help. Another thought that occurs to me is that if we had a freeze map, it would radically decrease the severity of this problem, because freezing would become vastly cheaper. I wonder if we ought to try to get that into 9.5, even if it means holding up 9.5. Quite aside from multixacts, repeated wraparound autovacuuming of static data is
Re: [HACKERS] multixacts woes
On Fri, May 8, 2015 at 2:27 PM, Andres Freund and...@anarazel.de wrote: On 2015-05-08 14:15:44 -0400, Robert Haas wrote: Apparently, we have been hanging our hat since the release of 9.3.0 on the theory that the average multixact won't ever have more than two members, and therefore the members SLRU won't overwrite itself and corrupt data. It's essentially a much older problem - it has essentially existed since multixacts were introduced (8.1?). The consequences of it were much lower before 9.3 though. OK, I wasn't aware of that. What exactly were the consequences before 9.3? I'm not inclined to backport it at this stage. Maybe if we get some field reports about too many anti-wraparound vacuums due to this, *and* the code has been tested in 9.5. That was about what I was thinking, too. Another thought that occurs to me is that if we had a freeze map, it would radically decrease the severity of this problem, because freezing would become vastly cheaper. I wonder if we ought to try to get that into 9.5, even if it means holding up 9.5 I think that's not realistic. Doing this right isn't easy. And doing it wrong can lead to quite bad results, i.e. data corruption. Doing it under the pressure of delaying a release further and further seems like recipe for disaster. Those are certainly good things to worry about. FWIW, I intend to either work on this myself, or help whoever seriously tackles this, in the next cycle. That would be great. I'll investigate what resources EnterpriseDB can commit to this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
Robert Haas robertmh...@gmail.com writes: On Fri, May 8, 2015 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: That's nice, but 9.5 feature freeze is only a week away. I don't have a lot of confidence that this stuff is actually in a state where we won't regret shipping it in 9.5. Yeah. The POC you were asking for upthread certainly exists and has for a while, or I would not have committed this. But I do not think it likely that the postgres_fdw support will be ready for 9.5. Well, we have two alternatives. I can keep hacking on this and get it to a state where it seems credible to me, but we won't have any proof that it actually works (though perhaps we could treat any problems as bugs that should hopefully get found before 9.5 ships, if a postgres_fdw patch shows up in the next few months). Or we could revert the whole thing and bounce it to the 9.6 cycle. I don't really like doing the latter, but I'm pretty uncomfortable with committing to published FDW APIs that are (a) as messy as this and (b) practically untested. The odds that something slipped through the cracks are high. Aside from the other gripes I raised, I'm exceedingly unhappy with the ad-hoc APIs proposed for GetForeignJoinPaths and set_join_pathlist_hook. It's okay for internal calls in joinpath.c to look like that, but exporting that set of parameters seems like pure folly. We've changed those parameter lists repeatedly (for instance in 9.2 and again in 9.3); the odds that they'll need to change again in future approach 100%. One way we could reduce the risk of code breakage there is to stuff all or most of those parameters into a struct. This might result in a small slowdown for the internal calls, or then again maybe not --- there probably aren't many architectures that can pass 10 parameters in registers anyway. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On Fri, May 8, 2015 at 11:59 AM, Tom Lane t...@sss.pgh.pa.us wrote: Ooops. But shouldn't that have failed 100% of the time in a CCA build? Or is the candidates list fairly noncritical? The candidates list is absolutely critical. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
Peter Geoghegan p...@heroku.com writes: On Fri, May 8, 2015 at 11:59 AM, Tom Lane t...@sss.pgh.pa.us wrote: Ooops. But shouldn't that have failed 100% of the time in a CCA build? Or is the candidates list fairly noncritical? The candidates list is absolutely critical. Oh, I was confusing CCA with RELCACHE_FORCE_RELEASE, which does something a bit different. I wonder whether we should get rid of that symbol and just drive the test in RelationClose off CLOBBER_CACHE_ALWAYS. (Ditto for CATCACHE_FORCE_RELEASE.) Or maybe make CLOBBER_CACHE_ALWAYS #define the other two symbols. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
* Peter Geoghegan (p...@heroku.com) wrote: On Fri, May 8, 2015 at 12:00 PM, Peter Geoghegan p...@heroku.com wrote: On Fri, May 8, 2015 at 11:59 AM, Tom Lane t...@sss.pgh.pa.us wrote: Ooops. But shouldn't that have failed 100% of the time in a CCA build? Or is the candidates list fairly noncritical? The candidates list is absolutely critical. However, the problematic code path is only a couple of times in the regression test. To Tom's point, it shouldn't actually matter how many times it's in the regression test, should it? I'm not saying you're wrong about the cause, but it's certainly interesting that it didn't consistently blow up with CCA.. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
I wrote: Peter Geoghegan p...@heroku.com writes: On Fri, May 8, 2015 at 11:59 AM, Tom Lane t...@sss.pgh.pa.us wrote: Ooops. But shouldn't that have failed 100% of the time in a CCA build? Or is the candidates list fairly noncritical? The candidates list is absolutely critical. Oh, I was confusing CCA with RELCACHE_FORCE_RELEASE, which does something a bit different. Actually, looking closer, the quoted code is simply not broken without RELCACHE_FORCE_RELEASE: without that, neither heap_close nor index_close will do anything that could cause a cache flush. So while it's certainly good practice to move that lappend_oid call up, it does not explain the observed symptoms. We still need some more investigation here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Thu, May 7, 2015 at 7:55 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On 5/7/15, Sawada Masahiko sawada.m...@gmail.com wrote: On Wed, May 6, 2015 at 5:42 AM, Robert Haas robertmh...@gmail.com wrote: On Tue, May 5, 2015 at 11:10 AM, Sawada Masahiko sawada.m...@gmail.com wrote: On Fri, May 1, 2015 at 9:04 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Apr 30, 2015 at 11:05 PM, Sawada Masahiko sawada.m...@gmail.com wrote: VACUUM has both syntaxes: with parentheses and without parentheses. I think we should have both syntaxes for REINDEX like VACUUM does, because it would be a pain to put parentheses whenever we want to do REINDEX. Are the parentheses optional in the REINDEX command? No. The unparenthesized VACUUM syntax was added back before we realized that that kind of syntax is a terrible idea. It requires every option to be a keyword, and those keywords have to be in a fixed order. I believe the intention is to keep the old VACUUM syntax around for backward-compatibility, but not to extend it. Same for EXPLAIN and COPY. REINDEX will have only one option, VERBOSE, for now. Even though it's not clear whether additional options will be added to REINDEX later, do we still need to require parentheses now? In my opinion, yes. The whole point of a flexible options syntax is that we can add new options without changing the grammar. That involves some compromise on the syntax, which doesn't bother me a bit. Our previous experiments with this for EXPLAIN and COPY and VACUUM have worked out quite well, and I see no reason for pessimism here. I agree that the flexible option syntax does not need grammar changes whenever we add new options. The attached patch is changed based on your suggestion. The patch for reindexdb is also attached. Please give feedback. Also I'm not sure whether both the implementation and documentation regarding the VERBOSE option should be optional. I don't know what this means. Sorry for confusing you. 
Please ignore this. Sorry, I forgot attach files. I applied the two patches to master and I got some errors when compile: tab-complete.c: In function ‘psql_completion’: tab-complete.c:3338:12: warning: left-hand operand of comma expression has no effect [-Wunused-value] {TABLE, INDEX, SYSTEM, SCHEMA, DATABASE, NULL}; ^ tab-complete.c:3338:21: warning: left-hand operand of comma expression has no effect [-Wunused-value] {TABLE, INDEX, SYSTEM, SCHEMA, DATABASE, NULL}; ^ tab-complete.c:3338:31: warning: left-hand operand of comma expression has no effect [-Wunused-value] {TABLE, INDEX, SYSTEM, SCHEMA, DATABASE, NULL}; ^ tab-complete.c:3338:41: warning: left-hand operand of comma expression has no effect [-Wunused-value] {TABLE, INDEX, SYSTEM, SCHEMA, DATABASE, NULL}; ^ tab-complete.c:3338:53: warning: left-hand operand of comma expression has no effect [-Wunused-value] {TABLE, INDEX, SYSTEM, SCHEMA, DATABASE, NULL}; ^ tab-complete.c:3338:5: warning: statement with no effect [-Wunused-value] {TABLE, INDEX, SYSTEM, SCHEMA, DATABASE, NULL}; ^ tab-complete.c:3338:59: error: expected ‘;’ before ‘}’ token {TABLE, INDEX, SYSTEM, SCHEMA, DATABASE, NULL}; ^ tab-complete.c:3340:22: error: ‘list_REINDEX’ undeclared (first use in this function) COMPLETE_WITH_LIST(list_REINDEX); ^ tab-complete.c:169:22: note: in definition of macro ‘COMPLETE_WITH_LIST’ completion_charpp = list; \ ^ tab-complete.c:3340:22: note: each undeclared identifier is reported only once for each function it appears in COMPLETE_WITH_LIST(list_REINDEX); ^ tab-complete.c:169:22: note: in definition of macro ‘COMPLETE_WITH_LIST’ completion_charpp = list; \ ^ make[3]: *** [tab-complete.o] Error 1 make[3]: *** Waiting for unfinished jobs make[2]: *** [install-psql-recurse] Error 2 make[2]: *** Waiting for unfinished jobs make[1]: *** [install-bin-recurse] Error 2 make: *** [install-src-recurse] Error 2 Looking at the code I think you remove one line accidentally from tab-complete.c: $ git diff 
src/bin/psql/tab-complete.c diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c index 750e29d..55b0df5 100644 --- a/src/bin/psql/tab-complete.c +++ b/src/bin/psql/tab-complete.c @@ -3335,7 +3335,6 @@ psql_completion(const char *text, int start, int end) /* REINDEX */ else if (pg_strcasecmp(prev_wd, REINDEX) == 0) { - static const char *const list_REINDEX[] = {TABLE, INDEX, SYSTEM, SCHEMA, DATABASE, NULL}; COMPLETE_WITH_LIST(list_REINDEX); The attached fix it and now seems good to me. Regards, --
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 2015-05-08 20:37:15 +0300, Heikki Linnakangas wrote: Why does INSERT ON CONFLICT pay attention to indcheckxmin? Uniqueness check only cares about the most recent committed version of the tuple, and the index is good for that use immediately. If there was a problem there, the uniqueness check in a normal insert would have the same problem. Yea, that's a good angle to view this from. That'd not be the case, I think, if we'd allow this to be used on system relations. But since neither creating an index on system relations, nor using INSERT ON CONFLICT on them is supported... Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multixacts woes
Hi, On 2015-05-08 14:15:44 -0400, Robert Haas wrote: Apparently, we have been hanging our hat since the release of 9.3.0 on the theory that the average multixact won't ever have more than two members, and therefore the members SLRU won't overwrite itself and corrupt data. It's essentially a much older problem - it has essentially existed since multixacts were introduced (8.1?). The consequences of it were much lower before 9.3 though. 3. It seems to me that there is a danger that some users could see extremely frequent anti-mxid-member-wraparound vacuums as a result of this work. Granted, that beats data corruption or errors, but it could still be pretty bad. It's certainly possible to have workloads triggering that, but I think it's relatively uncommon. In most cases I've checked, the multixact consumption rate is much lower than the xid consumption. There are some exceptions, but often that's pretty bad code. At that point, I think it's possible that relminmxid advancement might start to force full-table scans more often than would be required for relfrozenxid advancement. If so, that may be a problem for some users. I think it's the best we can do right now. What can we do about this? Alvaro proposed back-porting his fix for bug #8470, which avoids locking a row if a parent subtransaction already has the same lock. Alvaro tells me (via chat) that on some workloads this can dramatically reduce multixact size, which is certainly appealing. But the fix looks fairly invasive - it changes the return value of HeapTupleSatisfiesUpdate in certain cases, for example - and I'm not sure it's been thoroughly code-reviewed by anyone, so I'm a little nervous about the idea of back-porting it at this point. I am inclined to think it would be better to release the fixes we have - after handling items 1 and 2 - and then come back to this issue. 
Another thing to consider here is that if the high rate of multixact consumption is organic rather than induced by lots of subtransactions of the same parent locking the same tuple, this fix won't help. I'm not inclined to backport it at this stage. Maybe if we get some field reports about too many anti-wraparound vacuums due to this, *and* the code has been tested in 9.5. Another thought that occurs to me is that if we had a freeze map, it would radically decrease the severity of this problem, because freezing would become vastly cheaper. I wonder if we ought to try to get that into 9.5, even if it means holding up 9.5. I think that's not realistic. Doing this right isn't easy. And doing it wrong can lead to quite bad results, i.e. data corruption. Doing it under the pressure of delaying a release further and further seems like a recipe for disaster. Quite aside from multixacts, repeated wraparound autovacuuming of static data is a progressively more serious problem as data set sizes and transaction volumes increase. Yes. Agreed. The possibility that multixact freezing may in some scenarios exacerbate that problem is just icing on the cake. The fundamental problem is that a 32-bit address space just isn't that big on modern hardware, and the problem is worse for multixact members than it is for multixact IDs, because a given multixact only consumes one multixact ID, but as many slots in the multixact member space as it has members. FWIW, I intend to either work on this myself, or help whoever seriously tackles this, in the next cycle. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
* Tom Lane (t...@sss.pgh.pa.us) wrote: Peter Geoghegan p...@heroku.com writes: On Fri, May 8, 2015 at 11:59 AM, Tom Lane t...@sss.pgh.pa.us wrote: Ooops. But shouldn't that have failed 100% of the time in a CCA build? Or is the candidates list fairly noncritical? The candidates list is absolutely critical. Oh, I was confusing CCA with RELCACHE_FORCE_RELEASE, which does something a bit different. I wonder whether we should get rid of that symbol and just drive the test in RelationClose off CLOBBER_CACHE_ALWAYS. (Ditto for CATCACHE_FORCE_RELEASE.) Or maybe make CLOBBER_CACHE_ALWAYS #define the other two symbols. Ah. Seems like that'd make sense to me, though I guess I'd prefer just driving it all off of CCA. Thanks! Stephen signature.asc Description: Digital signature
[HACKERS] Postgres GSSAPI Encryption
Hello! Today, there exists GSSAPI authentication support in Postgres. I plan to extend this work to include encryption as well, but wanted to get your input on that first since you've probably thought about this already. From what I can tell, the auth/encryption layer is very nicely designed for extensibility; adding an encryption mechanism should just involve adding another option to the read/write functions in {f,b}e-secure.c, and then creating {f,b}e-secure-gssapi.c (or similar) to populate from. We'd I think also want a new kind of HBA entry (probably something along the lines of `hostgss` to contrast with `hostssl`), but I'm not sure what we'd want to do for the counterpart of `hostnossl` (`hostnogss`? But then do we need `hostnogssnossl`? Is this even a useful setting to have?), so that will probably require broader discussion. Finally, I think there are a couple different ways the protocol could look. I'd ideally like to opportunistically encrypt all GSSAPI-authenticated connections and fallback to non-encrypted when the other end doesn't support it (possibly dropping the connection if this causes it to not match HBA), but I'm not sure if this will work with the existing code. A (less-nice) option could be to add a new authentication method for gss-encryption, which feels aesthetically misplaced. The approach used for SSL today could probably be adopted as a third approach, but I don't really see a gain from doing it this way since encryption happens after authentication (opposite of SSL) in GSSAPI. I'm interested to hear your thoughts on this. Thanks! --Robbie signature.asc Description: PGP signature
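The HBA entry types floated in the mail above might look something like this in pg_hba.conf (hypothetical syntax mirroring hostssl/hostnossl; nothing here is implemented, and the keywords are exactly the open question being discussed):

```
# TYPE      DATABASE  USER  ADDRESS       METHOD
hostgss     all       all   0.0.0.0/0     gss     # require GSSAPI-encrypted connections
hostnogss   all       all   10.0.0.0/8    md5     # refuse GSSAPI encryption on this range
```

Whether a combined form like hostnogssnossl is needed, as the mail asks, depends on whether the two encryption mechanisms should be excludable independently or together.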
Re: [HACKERS] multixacts woes
Josh Berkus wrote: I have a couple workloads in my pool which do consume mxids faster than xids, due to (I think) exceptional numbers of FK conflicts. It's definitely unusual, though, and I'm sure they'd rather have corruption protection and endure some more vacuums. If we do this, though, it might be worthwhile to backport the multixact age function, so that affected users can check and schedule mxact wraparound vacuums themselves, something you currently can't do on 9.3. Backporting that is difficult in core, but you can do it with an extension without too much trouble. Also, the multixact age function does not give you the oldest member, which is what you need to properly monitor the whole of this; you can add that to an extension too. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
Stephen Frost sfr...@snowman.net writes: * Tom Lane (t...@sss.pgh.pa.us) wrote: Actually, looking closer, the quoted code is simply not broken without RELCACHE_FORCE_RELEASE: without that, neither heap_close nor index_close will do anything that could cause a cache flush. So while it's certainly good practice to move that lappend_oid call up, it does not explain the observed symptoms. We still need some more investigation here. Couldn't a cache flush request come from another backend? Although this isn't being run in a parallel group, is it? Maybe a delayed signal that happens to show up late at just the right time? Dunno if we've ever actually seen that, but the thought occurred to me. A signal could certainly have arrived in that interval, but it wouldn't be serviced in that interval --- or if it was, we have far worse problems than this one. Nothing interesting should happen except at (at least) a CHECK_FOR_INTERRUPTS() point, and there is none in this code sequence. I'm back to suspecting that the indcheckxmin issue is the true cause of the buildfarm failure, though we lack an explanation why Andres failed to reproduce it ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multixacts woes
Robert Haas wrote: My colleague Thomas Munro and I have been working with Alvaro, and also with Kevin and Amit, to fix bug #12990, a multixact-related data corruption bug. Thanks for this great summary of the situation. 1. I believe that there is still a narrow race condition that can cause the multixact code to go crazy and delete all of its data when operating very near the threshold for member space exhaustion. See http://www.postgresql.org/message-id/ca+tgmozihwybetx8nzzptosjprg2kcr-nawgajkzclcbvj1...@mail.gmail.com for the scenario and proposed fix. I agree that there is a problem here. 2. We have some logic that causes autovacuum to run in spite of autovacuum=off when wraparound threatens. My commit 53bb309d2d5a9432d2602c93ed18e58bd2924e15 provided most of the anti-wraparound protections for multixact members that exist for multixact IDs and for regular XIDs, but this remains an outstanding issue. I believe I know how to fix this, and will work up an appropriate patch based on some of Thomas's earlier work. I believe autovacuum=off is fortunately uncommon, but certainly getting this issue fixed is a good idea. 3. It seems to me that there is a danger that some users could see extremely frequent anti-mxid-member-wraparound vacuums as a result of this work. I agree with the idea that the long term solution to this issue is to make the freeze process cheaper. I don't have any good ideas on how to make this less severe in the interim. You say the fix for #8470 is not tested thoroughly enough to back-patch it just yet, and I can get behind that; so let's wait until 9.5 has been tested a bit more. 
Another avenue not mentioned and possibly worth exploring is making some more use of the multixact cache, and reusing multixacts that were previously issued and have the same effects as the one you're interested in: for instance, if you want a multixact with locking members (10,20,30) and you have one for (5,10,20,30) but transaction 5 has finished, then essentially both have the same semantics (because locks don't have any effect once the transaction has finished), so we can use it instead of creating a new one. I have no idea how to implement this; obviously, having to run TransactionIdIsCurrentTransactionId for each member on each multixact in the cache each time you want to create a new multixact is not very reasonable. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multixacts woes
On 05/08/2015 11:27 AM, Andres Freund wrote: Hi, On 2015-05-08 14:15:44 -0400, Robert Haas wrote: 3. It seems to me that there is a danger that some users could see extremely frequent anti-mxid-member-wraparound vacuums as a result of this work. Granted, that beats data corruption or errors, but it could still be pretty bad. It's certainly possible to have workloads triggering that, but I think it's relatively uncommon. In most cases I've checked the multixact consumption rate is much lower than the xid consumption. There are some exceptions, but often that's pretty bad code. I have a couple workloads in my pool which do consume mxids faster than xids, due to (I think) exceptional numbers of FK conflicts. It's definitely unusual, though, and I'm sure they'd rather have corruption protection and endure some more vacuums. If we do this, though, it might be worthwhile to backport the multixact age function, so that affected users can check and schedule mxact wraparound vacuums themselves, something you currently can't do on 9.3. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multixacts woes
On 2015-05-08 12:57:17 -0700, Josh Berkus wrote: I have a couple workloads in my pool which do consume mxids faster than xids, due to (I think) execeptional numbers of FK conflicts. It's definitely unusual, though, and I'm sure they'd rather have corruption protection and endure some more vacuums. If we do this, though, it might be worthwhile to backport the multixact age function, so that affected users can check and schedule mxact wraparound vacuums themselves, something you currently can't do on 9.3. That's not particularly realistic due to the requirement of an initdb to change the catalog. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 2015-05-08 15:22:09 -0400, Tom Lane wrote: I'm back to suspecting that the indcheckxmin issue is the true cause of the buildfarm failure Me too. though we lack an explanation why Andres failed to reproduce it ... My laptop is probably a good bit faster than jaguarundi, particularly in a single threaded workload, as that test, due to turbo boost. Friarbird didn't trigger it either so far. It's quite possible that it requires a concurrent analyze or so to run to occur in the wrong moment. I've pushed the fixes to those two. I guess we'll have to wait a couple (slow) buildfarm cycles to see whether it fixes things. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 2015-05-08 22:29:47 +0200, Andres Freund wrote: On 2015-05-08 15:22:09 -0400, Tom Lane wrote: I'm back to suspecting that the indcheckxmin issue is the true cause of the buildfarm failure though we lack an explanation why Andres failed to reproduce it ... My laptop is probably a good bit faster than jaguarundi, particularly in a single-threaded workload like that test, due to turbo boost. Friarbird didn't trigger it either so far. It's quite possible that it requires a concurrent analyze or so to run at the wrong moment. prairiedog, without CCA, failed as well http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2015-05-08%2019%3A55%3A11 different test, but again directly after index creation. So I hope it's indeed the indcheckxmin thing. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
... btw, I just noticed something that had escaped me because it seems so obviously wrong that I had not even stopped to consider the possibility that the code was doing what it's doing. To wit, that the planner supposes that two foreign tables are potentially remote-joinable if they share the same underlying FDW handler function. Not the same server, and not even the same pg_foreign_data_wrapper entry, but the pg_proc entry for the handler function. I think this is fundamentally bogus. Under what circumstances are we not just laying off the need to check same server origin onto the FDW? How is it that the urgent need for the FDW to check for that isn't even mentioned in the documentation? I think that we'd really be better off insisting on same server (as in same pg_foreign_server OID), hence automatically same FDW, and what's even more important, same user mapping for any possible query execution context. The possibility that there are some corner cases where some FDWs could optimize other scenarios seems to me to be poor return for the bugs and security holes that will arise any time typical FDWs forget to check this. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Broken --dry-run mode in pg_rewind
On 05/08/2015 03:39 PM, Michael Paquier wrote: On Fri, May 8, 2015 at 9:34 PM, Heikki Linnakangas hlinn...@iki.fi wrote: On 05/08/2015 03:25 PM, Vladimir Borodin wrote: Seems, that pg_rewind does not account --dry-run option properly. A simple fix for that is attached. No, the --dry-run takes effect later. It performs all the actions it normally would, including reading files from the source, except for actually writing anything in the target. See the dry-run checks in file_ops.c Even if the patch sent is incorrect, shouldn't there be some process bypass in updateControlFile() and createBackupLabel() in case of a --dry-run? They both use open_target_file() and write_target_file(), which check for --dry-run and do nothing if it's set. Hmm, I wonder if we should print something else than Done! at the end, if run in --dry-run mode. Or give some indication around the time it says Rewinding from last common checkpoint at ..., that it's running in dry-run mode and won't actually modify anything. The progress messages are a bit alarming if you don't realize that it's skipping all the writes. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] is possible to upgrade from 9.2 to 9.4 with pg_upgrade
2015-05-07 13:43 GMT+02:00 Alvaro Herrera alvhe...@2ndquadrant.com: The problem is here:

[root@ps-test5:/etc/puppet/modules/postgresql/files] pg_controldata /mnt/ebs/pgsql/data
pg_control version number:            922
Catalog version number:               201302181

The catversion for 9.2 is 201204301; you have modified it with your patches in a way that breaks this check in pg_upgrade: Yes, it was the reason. Thank you very much for help. Regards Pavel

/*
 * If the old server is before the MULTIXACT_FORMATCHANGE_CAT_VER change
 * (see pg_upgrade.h) and the new server is after, then we don't copy
 * pg_multixact files, but we need to reset pg_control so that the new
 * server doesn't attempt to read multis older than the cutoff value.
 */
if (old_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER &&
    new_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)

pg_upgrade behaves differently if the source catversion is earlier than this value:

/*
 * pg_multixact format changed in 9.3 commit 0ac5ad5134f2769ccbaefec73844f85,
 * (Improve concurrency of foreign key locking) which also updated catalog
 * version to this value.  pg_upgrade behavior depends on whether old and new
 * server versions are both newer than this, or only the new one is.
 */
#define MULTIXACT_FORMATCHANGE_CAT_VER 201301231

because it expects to see the oldest multixact id in pg_controldata, but 9.2 did not have that. You either need to change your database's catversion, or patch your pg_upgrade so that it knows to consider your catversion part of 9.2 instead of 9.3. -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Broken --dry-run mode in pg_rewind
* Heikki Linnakangas (hlinn...@iki.fi) wrote: On 05/08/2015 03:39 PM, Michael Paquier wrote: On Fri, May 8, 2015 at 9:34 PM, Heikki Linnakangas hlinn...@iki.fi wrote: On 05/08/2015 03:25 PM, Vladimir Borodin wrote: Seems, that pg_rewind does not account --dry-run option properly. A simple fix for that is attached. No, the --dry-run takes effect later. It performs all the actions it normally would, including reading files from the source, except for actually writing anything in the target. See the dry-run checks in file_ops.c Even if the patch sent is incorrect, shouldn't there be some process bypass in updateControlFile() and createBackupLabel() in case of a --dry-run? They both use open_target_file() and write_target_file(), which check for --dry-run and do nothing if it's set. Hmm, I wonder if we should print something else than Done! at the end, if run in --dry-run mode. Or give some indication around the time it says Rewinding from last common checkpoint at ..., that it's running in dry-run mode and won't actually modify anything. The progress messages are a bit alarming if you don't realize that it's skipping all the writes. Wouldn't hurt to also augment that rather doom-looking point of no return comment with a note that says writes won't happen if in dry-run. :) For my 2c anyway. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] initdb -S and tablespaces
On Fri, May 8, 2015 at 7:53 PM, Andres Freund and...@anarazel.de wrote: On 2015-05-04 14:23:16 -0400, Robert Haas wrote: On Fri, May 1, 2015 at 10:41 AM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: As for the non-backpatchable 0002, I agree with Andres that it should be included in 9.5; but I take it you're still not convinced? No, I'm not convinced. That patch will protect you in one extremely specific scenario: you turn off fsync, do some stuff, shut down, turn fsync back on again, and start up. Hm. ISTM it'd not be hard to actually make it safe in nearly all situations. What about tracking the last checkpoint's fsync setting and doing a fsync_pgdata() in the checkpointer at the end of a checkpoint if the previous setting was off and the current one is on? Combined with doing so at startup if the settings changed between runs, that should give pretty decent protection. And seems fairly simple to implement. That seems a bit better. I think it's really important, if we're going to start to try to make fsync=off anything other than a toy, that we document really clearly the circumstances in which it is or is not safe: - If you crash while fsync=off, your cluster may be corrupted. - If you crash while fsync=on, but it was off at the last checkpoint, your cluster may be corrupted. - If you turn fsync=off, do stuff, turn fsync=on, and checkpoint successfully, a subsequent crash should not corrupt anything. Of course, even the last one isn't totally bullet-proof. Suppose one backend fails to absorb the new setting for some reason... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
On Fri, May 8, 2015 at 2:51 PM, Tom Lane t...@sss.pgh.pa.us wrote: Well, we have two alternatives. I can keep hacking on this and get it to a state where it seems credible to me, but we won't have any proof that it actually works (though perhaps we could treat any problems as bugs that should hopefully get found before 9.5 ships, if a postgres_fdw patch shows up in the next few months). Or we could revert the whole thing and bounce it to the 9.6 cycle. I don't really like doing the latter, but I'm pretty uncomfortable with committing to published FDW APIs that are (a) as messy as this and (b) practically untested. The odds that something slipped through the cracks are high. A lot of work went into this patch. I think it would be a shame to revert it. I'd even rather ship something imperfect or somewhat unstable and change it later than give up and roll it all back. Aside from the other gripes I raised, I'm exceedingly unhappy with the ad-hoc APIs proposed for GetForeignJoinPaths and set_join_pathlist_hook. It's okay for internal calls in joinpath.c to look like that, but exporting that set of parameters seems like pure folly. We've changed those parameter lists repeatedly (for instance in 9.2 and again in 9.3); the odds that they'll need to change again in future approach 100%. One way we could reduce the risk of code breakage there is to stuff all or most of those parameters into a struct. This might result in a small slowdown for the internal calls, or then again maybe not --- there probably aren't many architectures that can pass 10 parameters in registers anyway. Putting it into a structure certainly seems fine. I think it's pretty silly to assume that the FDW APIs are frozen or we're never going to change them. 
There was much discussion of the merits of exposing that information or not, and I was (and am) convinced that the FDWs need access to most if not all of that stuff, and that removing access to it will cripple the facility and result in mountains of duplicated and inefficient code. If in the future we compute more or different stuff there, I expect there's a good chance that FDWs will need to be updated to look at that stuff too. Of course, I don't object to maximizing our chances of not needing an API break, but I will be neither surprised nor disappointed if such efforts fail. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cube extension kNN support
Sergey Konoplev wrote: On Thu, Mar 27, 2014 at 3:26 PM, Sergey Konoplev gray...@gmail.com wrote: On Sun, Sep 22, 2013 at 4:38 PM, Stas Kelvich stas.kelv...@gmail.com wrote: Here is the patch that introduces kNN search for cubes with euclidean, taxicab and chebyshev distances. What is the status of this patch? Referring to our private conversation with Alexander Korotkov, the patch is in WIP state currently, and, hopefully, will be ready by 9.5. I'm ready to actively participate in its testing on a real world production set of data. This patch doesn't seem to have received an updated version. Should we just punt on it? The assumption would be that Stas or Alexander will be re-submitting this for 9.6. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
On Fri, May 8, 2015 at 5:48 PM, Tom Lane t...@sss.pgh.pa.us wrote: ... btw, I just noticed something that had escaped me because it seems so obviously wrong that I had not even stopped to consider the possibility that the code was doing what it's doing. To wit, that the planner supposes that two foreign tables are potentially remote-joinable if they share the same underlying FDW handler function. Not the same server, and not even the same pg_foreign_data_wrapper entry, but the pg_proc entry for the handler function. I think this is fundamentally bogus. Under what circumstances are we not just laying off the need to check same server origin onto the FDW? How is it that the urgent need for the FDW to check for that isn't even mentioned in the documentation? I think that we'd really be better off insisting on same server (as in same pg_foreign_server OID), hence automatically same FDW, and what's even more important, same user mapping for any possible query execution context. The possibility that there are some corner cases where some FDWs could optimize other scenarios seems to me to be poor return for the bugs and security holes that will arise any time typical FDWs forget to check this. I originally wanted to go quite the other way with this and check for join pushdown via handler X any time at least one of the two relations involved used handler X, even if the other one used some other handler or was a plain table. In particular, it seems to me quite plausible to want to teach an FDW that a certain local table is replicated on a remote node, allowing a join between a foreign table and a plain table to be pushed down. This infrastructure can't be used that way anyhow, so maybe there's no harm in tightening it up, but I'm wary of circumscribing what FDW authors can do. 
I think it's better to be rather expansive in terms of when we call them and let them return without doing anything some of the time than to define the situations in which we call them too narrowly and end up ruling out interesting use cases. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
Andres Freund and...@anarazel.de writes: prairiedog, without CCA, failed as well http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2015-05-08%2019%3A55%3A11 different test, but again directly after index creation. So I hope it's indeed the indcheckxmin thing. Oh, interesting. That definitely suggests it's about machine speed/timing not CCA per se (prairiedog being quite slow). Which would fit with the indcheckxmin theory. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
2015-05-09 2:46 GMT+09:00 Tom Lane t...@sss.pgh.pa.us: Kouhei Kaigai kai...@ak.jp.nec.com writes: I've been trying to code-review this patch, because the documentation seemed several bricks shy of a load, and I find myself entirely confused by the fdw_ps_tlist and custom_ps_tlist fields. The main point of your concern is the lack of documentation/comments explaining how the pseudo-scan targetlist works here, isn't it? Well, there's a bunch of omissions and outright errors in the docs and comments, but this is the main issue that I was uncertain how to fix from looking at the patch. Also, if that is what they're for (ie, to allow the FDW to redefine the scan tuple contents) it would likely be better to decouple that feature from whether the plan node is for a simple scan or a join. In this version, we don't intend FDW/CSP to redefine the contents of scan tuples, even though I want to off-load heavy targetlist calculation workloads to external computing resources in *the future version*. I do not think it's a good idea to introduce such a field now and then redefine how it works and what it's for in a future version. We should not be moving the FDW APIs around more than we absolutely have to, especially not in ways that wouldn't throw an obvious compile error for un-updated code. Also, the longer we wait to make a change that we know we want, the more pain we inflict on FDW authors (simply because there will be more of them a year from now than there are today). Ah, my sentence above didn't intend to reuse the existing field for different work in the future version. It's just what I want to support in the future version. Yep, I see. It is not a good idea to redefine the existing field for a different purpose silently. It's not my plan. The business about resjunk columns in that list also seems a bit half baked, or at least underdocumented. I'll add source code comments to explain how it works and when it has resjunk=true. 
It will be a bit too deep to be introduced in the SGML file. I don't actually see a reason for resjunk marking in that list at all, if what it's for is to define the contents of the scan tuple. I think we should just s/ExecCleanTypeFromTL/ExecTypeFromTL/ in nodeForeignscan and nodeCustom, and get rid of the sanity check in create_foreignscan_plan (which is pretty pointless anyway, considering the number of other ways you could screw up that tlist without it being detected). http://www.postgresql.org/message-id/9a28c8860f777e439aa12e8aea7694f8010d7...@bpxm15gp.gisp.nec.co.jp Does the explanation in the above post make sense? The *_ps_tlist is not only used as the basis of the scan-tuple descriptor, but also to resolve var-nodes with varno==INDEX_VAR in the EXPLAIN command. On the other hand, the existence of junk entries (which are referenced in external computing resources only) may cause unnecessary projection. So, I want to distinguish the target-entries that form the basis of the scan-tuple descriptor from ones that exist just for the EXPLAIN command. I'm also inclined to rename the fields to fdw_scan_tlist/custom_scan_tlist, which would better reflect what they do, and to change the API of make_foreignscan() to add a parameter corresponding to the scan tlist. It's utterly bizarre and error-prone that this patch has added a field that the FDW is supposed to set and not changed make_foreignscan to match. OK, I'll do both changes. The name ps_tlist is short for pseudo-scan target-list, so fdw_scan_tlist/custom_scan_tlist match the intention well. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres GSSAPI Encryption
Robbie, * Robbie Harwood (rharw...@redhat.com) wrote: Today, there exists GSSAPI authentication support in Postgres. I plan to extend this work to include encryption as well, but wanted to get your input on that first since you've probably thought about this already. Great! From what I can tell, the auth/encryption layer is very nicely designed for extensibility; adding an encryption mechanism should just involve adding another option to the read/write functions in {f,b}e-secure.c, and then creating {f,b}e-secure-gssapi.c (or similar) to populate from. It was reworked recently thanks to Heikki. We'd I think also want a new kind of HBA entry (probably something along the lines of `hostgss` to contrast with `hostssl`), but I'm not sure what we'd want to do for the counterpart of `hostnossl` (`hostnogss`? But then do we need `hostnogssnossl`? Is this even a useful setting to have?), so that will probably require broader discussion. Are you intending to support GSSAPI encryption *without* using the GSSAPI authentication mechanism? If not, maybe we can come up with a way to have an option to the GSSAPI auth mechanism that behaves the way we want for GSSAPI encryption. Perhaps something like: encryption = (none | request | require) If you need an option for it to still be able to fall-thru to the next pg_hba line, that'd be more difficult, but is that really a sensible use-case? Finally, I think there are a couple different ways the protocol could look. I'd ideally like to opportunistically encrypt all GSSAPI-authenticated connections and fallback to non-encrypted when the other end doesn't support it (possibly dropping the connection if this causes it to not match HBA), but I'm not sure if this will work with the existing code. A (less-nice) option could be to add a new authentication method for gss-encryption, which feels aesthetically misplaced. 
The approach used for SSL today could probably be adopted as a third approach, but I don't really see a gain from doing it this way since encryption happens after authentication (opposite of SSL) in GSSAPI. I'd definitely like to see us opportunistically encrypt all GSSAPI authenticated connections also. The main case to consider is how to support migrating users who are currently using GSSAPI + SSL to GSSAPI auth+encryption, and how to support them if they want to continue to use GSSAPI just for user auth and use SSL for encryption. Thanks! Stephen signature.asc Description: Digital signature
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
2015-05-09 3:51 GMT+09:00 Tom Lane t...@sss.pgh.pa.us: Robert Haas robertmh...@gmail.com writes: On Fri, May 8, 2015 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: That's nice, but 9.5 feature freeze is only a week away. I don't have a lot of confidence that this stuff is actually in a state where we won't regret shipping it in 9.5. Yeah. The POC you were asking for upthread certainly exists and has for a while, or I would not have committed this. But I do not think it likely that the postgres_fdw support will be ready for 9.5. Well, we have two alternatives. I can keep hacking on this and get it to a state where it seems credible to me, but we won't have any proof that it actually works (though perhaps we could treat any problems as bugs that should hopefully get found before 9.5 ships, if a postgres_fdw patch shows up in the next few months). Or we could revert the whole thing and bounce it to the 9.6 cycle. I don't really like doing the latter, but I'm pretty uncomfortable with committing to published FDW APIs that are (a) as messy as this and (b) practically untested. The odds that something slipped through the cracks are high. Aside from the other gripes I raised, I'm exceedingly unhappy with the ad-hoc APIs proposed for GetForeignJoinPaths and set_join_pathlist_hook. It's okay for internal calls in joinpath.c to look like that, but exporting that set of parameters seems like pure folly. We've changed those parameter lists repeatedly (for instance in 9.2 and again in 9.3); the odds that they'll need to change again in future approach 100%. One way we could reduce the risk of code breakage there is to stuff all or most of those parameters into a struct. This might result in a small slowdown for the internal calls, or then again maybe not --- there probably aren't many architectures that can pass 10 parameters in registers anyway. Is it like a following structure definition? 
typedef struct
{
    PlannerInfo *root;
    RelOptInfo  *joinrel;
    RelOptInfo  *outerrel;
    RelOptInfo  *innerrel;
    List        *restrictlist;
    JoinType     jointype;
    SpecialJoinInfo *sjinfo;
    SemiAntiJoinFactors *semifactors;
    Relids       param_source_rels;
    Relids       extra_lateral_rels;
} SetJoinPathListArgs;

I agree with the idea. It also helps CSP driver implementations that call the next driver already chained at installation time.

    if (set_join_pathlist_next)
        set_join_pathlist_next(args);

is a more stable manner than

    if (set_join_pathlist_next)
        set_join_pathlist_next(root, joinrel, outerrel, innerrel,
                               restrictlist, jointype, sjinfo, semifactors,
                               param_source_rels, extra_lateral_rels);

Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
2015-05-09 6:48 GMT+09:00 Tom Lane t...@sss.pgh.pa.us: ... btw, I just noticed something that had escaped me because it seems so obviously wrong that I had not even stopped to consider the possibility that the code was doing what it's doing. To wit, that the planner supposes that two foreign tables are potentially remote-joinable if they share the same underlying FDW handler function. Not the same server, and not even the same pg_foreign_data_wrapper entry, but the pg_proc entry for the handler function. I think this is fundamentally bogus. Under what circumstances are we not just laying off the need to check same server origin onto the FDW? How is it that the urgent need for the FDW to check for that isn't even mentioned in the documentation? Indeed. Comparison of fdw_handler may cause unexpected behavior. I agree it needs to be fixed up. I think that we'd really be better off insisting on same server (as in same pg_foreign_server OID), hence automatically same FDW, and what's even more important, same user mapping for any possible query execution context. The possibility that there are some corner cases where some FDWs could optimize other scenarios seems to me to be poor return for the bugs and security holes that will arise any time typical FDWs forget to check this. The former version of the foreign/custom-join patch did check for joinable relations using the FDW server id; however, it was changed to the current form because it may allow additional optimization opportunities - in case multiple foreign servers have the same remote host, access credentials and so on... Also, I understand your concern about potential security holes by oversight. It is a weighing-scales issue; however, it seems to me the benefit from the potential optimization case does not outweigh the security-hole risk. So, I'll make a patch to change the logic that checks for joinable foreign-tables. 
Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: knowing detail of config files via SQL
Sawada, * Sawada Masahiko (sawada.m...@gmail.com) wrote: Thank you for reviewing. I agree with this. Attached patch is updated version v10. Committed with quite a few additional changes and improvements. Please take a look, test, and let me know if you see any issues or have any concerns. Thanks! Stephen signature.asc Description: Digital signature
[HACKERS] Broken --dry-run mode in pg_rewind
Hi all. Seems, that pg_rewind does not account --dry-run option properly. A simple fix for that is attached. pg_rewind_dry_run_fix.patch Description: Binary data --May the force be with you…https://simply.name
Re: [HACKERS] Broken --dry-run mode in pg_rewind
On 05/08/2015 03:25 PM, Vladimir Borodin wrote: Hi all. Seems, that pg_rewind does not account --dry-run option properly. A simple fix for that is attached. No, the --dry-run takes effect later. It performs all the actions it normally would, including reading files from the source, except for actually writing anything in the target. See the dry-run checks in file_ops.c - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Broken --dry-run mode in pg_rewind
On Fri, May 8, 2015 at 9:34 PM, Heikki Linnakangas hlinn...@iki.fi wrote: On 05/08/2015 03:25 PM, Vladimir Borodin wrote: Seems, that pg_rewind does not account --dry-run option properly. A simple fix for that is attached. No, the --dry-run takes effect later. It performs all the actions it normally would, including reading files from the source, except for actually writing anything in the target. See the dry-run checks in file_ops.c Even if the patch sent is incorrect, shouldn't there be some process bypass in updateControlFile() and createBackupLabel() in case of a --dry-run? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] subxcnt defined as signed integer in SnapshotData and SerializeSnapshotData
On Fri, May 8, 2015 at 3:55 PM, Simon Riggs si...@2ndquadrant.com wrote: On 7 May 2015 at 21:40, Michael Paquier michael.paqu...@gmail.com wrote: Hi all, Coverity is complaining about the following assertion introduced in commit 924bcf4 (parallel stuff, SerializeSnapshot@snapmgr.c):

+ Assert(snapshot->xcnt >= 0);

Now the thing is that this assertion does not make much sense, because SnapshotData defines xcnt as uint32 in snapshot.h. While we could simply remove this assertion, I am wondering if we could not change subxcnt to uint32 instead. SnapshotData has been introduced in 2008 by d43b085, with this comment:

+ int32 subxcnt;	/* # of xact ids in subxip[], -1 if overflow */

Comment regarding negative values removed in efc16ea5. Now, by looking at the code on HEAD, I am seeing no code paths that make use of negative values of subxcnt. Perhaps I am missing something? So the comment is wrong? It does not set to -1 at overflow anymore? SnapshotData.suboverflowed is used instead. Have a look at efc16ea5 in procarray.c to convince yourself:

@@ -785,16 +1121,17 @@ GetSnapshotData(Snapshot snapshot)
 	 * Again, our own XIDs are not included in the snapshot.
 	 */
-	if (subcount >= 0 && proc != MyProc)
+	if (!suboverflowed && proc != MyProc)
 	{
 		if (proc->subxids.overflowed)
-			subcount = -1;	/* overflowed */
+			suboverflowed = true;
 		else

I think that we should redefine subxcnt as uint32 for consistency with xcnt, and remove the two assertions that 924bcf4 has introduced. I could get a patch quickly done FWIW. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Async execution of postgres_fdw.
Stephen Frost sfr...@snowman.net writes: I'm all for improving performance of postgres_fdw and would like to see us support sending queries off to be worked asynchronously, but starting execution on the remote server during ExecInitNode is against the documented FDW API spec. I discussed exactly this issue over a year ago here: http://www.postgresql.org/message-id/20131104032604.gb2...@tamriel.snowman.net Sadly, there weren't any direct responses to that email, but I do recall having a discussion on another thread (or in person?) with Tom where we ended up agreeing that we can't simply remove that requirement from the docs or the API. Yeah. There are at least a couple of reasons why not: * ExecInitNode only creates the runtime data structures, it should not begin execution. It's possible for example that the scan will never be iterated at all; say it's on the inside of a nestloop and the outer relation turns out to be empty. It's not apparent why starting the remote query a few microseconds sooner is worth the risk of demanding useless computation. * If the scan is parameterized (again, it's on the inside of a nestloop, and the outer relation is supplying join key values), those parameter values are simply not available at ExecInitNode time. Also, so far as a quick review of the actual patch goes, I would really like to see this lose the PFC wrapper layer, which accounts for 95% of the code churn in the patch and doesn't seem to add any actual value. What it does add is unchecked malloc failure conditions. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Remove obsolete mention of src/tools/backend
On Friday, May 8, 2015, Amit Langote langote_amit...@lab.ntt.co.jp wrote:
> Hi,
> Commit 63f1ccd got rid of src/tools/backend and hence
> src/tools/backend/index.html. But the lmgr README still directs the
> reader to the aforementioned file. Attached removes this obsolete
> reference.

Please ignore this. This has already been fixed. Sorry about the noise.

Amit
Re: [HACKERS] Cube extension kNN support
Hi!

The patch is pretty much ready; the last issue was about the changed extension interface, so there should be a migration script and a version bump. Attaching a version with all the migration stuff.

distances2r4.patch
Description: Binary data

Stas.

On 09 May 2015, at 05:20, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> Sergey Konoplev wrote:
>> On Thu, Mar 27, 2014 at 3:26 PM, Sergey Konoplev gray...@gmail.com wrote:
>>> On Sun, Sep 22, 2013 at 4:38 PM, Stas Kelvich stas.kelv...@gmail.com wrote:
>>>> Here is the patch that introduces kNN search for cubes with euclidean,
>>>> taxicab and chebyshev distances.
>>> What is the status of this patch?
>> Referring to our private conversation with Alexander Korotkov, the
>> patch is in WIP state currently and, hopefully, will be ready by 9.5.
>> I'm ready to actively participate in its testing on a real-world
>> production set of data.
>
> This patch doesn't seem to have received an updated version. Should we
> just punt on it? The assumption would be that Stas or Alexander will be
> re-submitting this for 9.6.
>
> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
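For reference, the three metrics the patch names, computed point-to-point. This is a simplified illustration with invented helper names — the cube extension itself works on n-dimensional boxes, not bare points. Note that for pure kNN ordering, squared euclidean distance sorts identically to true euclidean distance, so the sketch skips the sqrt:

```c
#include <assert.h>

/* absolute difference of one coordinate, avoiding a libm dependency */
static double
coord_diff(double x, double y)
{
	double	d = x - y;

	return (d < 0.0) ? -d : d;
}

/* squared euclidean distance: same kNN ordering as the true distance */
static double
dist_euclid_sq(const double *a, const double *b, int dim)
{
	double	s = 0.0;

	for (int i = 0; i < dim; i++)
		s += (a[i] - b[i]) * (a[i] - b[i]);
	return s;
}

/* taxicab (L1): sum of per-coordinate differences */
static double
dist_taxicab(const double *a, const double *b, int dim)
{
	double	s = 0.0;

	for (int i = 0; i < dim; i++)
		s += coord_diff(a[i], b[i]);
	return s;
}

/* chebyshev (L-infinity): largest per-coordinate difference */
static double
dist_chebyshev(const double *a, const double *b, int dim)
{
	double	m = 0.0;

	for (int i = 0; i < dim; i++)
		if (coord_diff(a[i], b[i]) > m)
			m = coord_diff(a[i], b[i]);
	return m;
}
```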
Re: [HACKERS] PATCH: remove nclients/nthreads constraint from pgbench
Minor v2 update to rename a badly chosen variable.

--
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index a808546..2517a3a 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -326,8 +326,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>dbname</replaceable>
       <para>
        Number of worker threads within <application>pgbench</application>.
        Using more than one thread can be helpful on multi-CPU machines.
-       The number of clients must be a multiple of the number of threads,
-       since each thread is given the same number of client sessions to manage.
+       Clients are distributed as evenly as possible among available threads.
        Default is 1.
       </para>
      </listitem>
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 06a4dfd..5343837 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2762,6 +2762,7 @@ main(int argc, char **argv)
 	int			c;
 	int			nclients = 1;	/* default number of simulated clients */
 	int			nthreads = 1;	/* default number of threads */
+	int			nclients_dealt; /* clients distributed between threads */
 	int			is_init_mode = 0;	/* initialize mode? */
 	int			is_no_vacuum = 0;	/* no vacuum at all before testing? */
 	int			do_vacuum_accounts = 0; /* do vacuum accounts before testing? */
@@ -3122,12 +3123,6 @@ main(int argc, char **argv)
 	if (nxacts <= 0 && duration <= 0)
 		nxacts = DEFAULT_NXACTS;

-	if (nclients % nthreads != 0)
-	{
-		fprintf(stderr, "number of clients (%d) must be a multiple of number of threads (%d)\n", nclients, nthreads);
-		exit(1);
-	}
-
 	/* --sampling-rate may be used only with -l */
 	if (sample_rate > 0.0 && !use_log)
 	{
@@ -3328,19 +3323,24 @@ main(int argc, char **argv)
 	/* set up thread data structures */
 	threads = (TState *) pg_malloc(sizeof(TState) * nthreads);
+
+	nclients_dealt = 0;
+
 	for (i = 0; i < nthreads; i++)
 	{
 		TState	   *thread = &threads[i];

 		thread->tid = i;
-		thread->state = &state[nclients / nthreads * i];
-		thread->nstate = nclients / nthreads;
+		thread->state = &state[nclients_dealt];
+		thread->nstate =
+			(nclients - nclients_dealt + nthreads - i - 1) / (nthreads - i);
 		thread->random_state[0] = random();
 		thread->random_state[1] = random();
 		thread->random_state[2] = random();
 		thread->throttle_latency_skipped = 0;
 		thread->latency_late = 0;
+		nclients_dealt += thread->nstate;
+
 		if (is_latencies)
 		{
 			/* Reserve memory for the thread to store per-command latencies */
@@ -3364,6 +3364,9 @@ main(int argc, char **argv)
 		}
 	}

+	/* all clients must be assigned to a thread */
+	Assert(nclients_dealt == nclients);
+
 	/* get start up time */
 	INSTR_TIME_SET_CURRENT(start_time);
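The new nstate expression is a ceiling division over the clients still to be dealt. Extracted into a standalone helper (deal_clients is an invented name for illustration), it is easy to check that per-thread counts differ by at most one and always sum to nclients:

```c
#include <assert.h>

/*
 * Deal nclients among nthreads as evenly as possible: thread i takes
 * ceil(remaining_clients / remaining_threads), the same formula as the
 * pgbench patch.  Counts differ by at most one and sum to nclients.
 */
static void
deal_clients(int nclients, int nthreads, int *nstate)
{
	int		dealt = 0;

	for (int i = 0; i < nthreads; i++)
	{
		nstate[i] = (nclients - dealt + nthreads - i - 1) / (nthreads - i);
		dealt += nstate[i];
	}
}
```

For example, 10 clients on 4 threads deal out as 3, 3, 2, 2 — which is exactly the invariant the patch's final Assert(nclients_dealt == nclients) guards.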
Re: [HACKERS] GSSAPI, SSPI - include_realm default
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Dec 9, 2014 at 05:38:25PM -0500, Stephen Frost wrote:
>> My comment that include_realm is supported back to 8.4 was because
>> there is an expectation that a pg_hba.conf file can be used unchanged
>> across several major releases. So when 9.5 comes out and people update
>> their pg_hba.conf files for 9.5, those files will still work in old
>> releases. But the time to do those updates is then, not now.
>>
>> The back-branches are being patched to discourage using the default
>> because it's not a secure approach. New users start using PG all the
>> time, and so changing the existing documentation is worthwhile to
>> ensure those new users understand. A note in the release notes for
>> whichever minor release the change to the documentation shows up in
>> would be a good way to make existing users aware of the change and
>> hopefully encourage them to review their configuration. If we don't
>> agree that the change should be made then we can discuss that, but
>> everyone commenting so far has agreed on the change.
>
> Where are we on this?

Thanks for the prod on this. I've now committed the changes which were discussed. Please let me know if you see any issues or have any concerns.

Thanks again!

Stephen

signature.asc
Description: Digital signature
Re: [HACKERS] a fast bloat measurement tool (was Re: Measuring relation free space)
Hi,

On 2015-04-24 08:46:48 +0530, Abhijit Menon-Sen wrote:
> diff --git a/contrib/pgstattuple/pgstatbloat.c b/contrib/pgstattuple/pgstatbloat.c
> new file mode 100644
> index 000..612e22b
> --- /dev/null
> +++ b/contrib/pgstattuple/pgstatbloat.c
> @@ -0,0 +1,389 @@
> +/*
> + * contrib/pgstattuple/pgstatbloat.c
> + *
> + * Abhijit Menon-Sen a...@2ndquadrant.com
> + * Portions Copyright (c) 2001,2002 Tatsuo Ishii (from pgstattuple)

I think for new extensions we don't really add authors and such anymore.

> + * Permission to use, copy, modify, and distribute this software and
> + * its documentation for any purpose, without fee, and without a
> + * written agreement is hereby granted, provided that the above
> + * copyright notice and this paragraph and the following two
> + * paragraphs appear in all copies.
> + *
> + * IN NO EVENT SHALL THE AUTHOR BE LIABLE TO ANY PARTY FOR DIRECT,
> + * INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING
> + * LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS
> + * DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED
> + * OF THE POSSIBILITY OF SUCH DAMAGE.
> + *
> + * THE AUTHOR SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS
> + * IS" BASIS, AND THE AUTHOR HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE,
> + * SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
> + */

Shouldn't be here in a contrib module.

> +PG_FUNCTION_INFO_V1(pgstatbloat);

> +CREATE FUNCTION pgstatbloat(IN reloid regclass,
> +    OUT table_len BIGINT,            -- physical table length in bytes
> +    OUT scanned_percent FLOAT8,      -- what percentage of the table's pages was scanned
> +    OUT approx_tuple_count BIGINT,   -- estimated number of live tuples
> +    OUT approx_tuple_len BIGINT,     -- estimated total length in bytes of live tuples
> +    OUT approx_tuple_percent FLOAT8, -- live tuples in % (based on estimate)
> +    OUT dead_tuple_count BIGINT,     -- exact number of dead tuples
> +    OUT dead_tuple_len BIGINT,       -- exact total length in bytes of dead tuples
> +    OUT dead_tuple_percent FLOAT8,   -- dead tuples in % (based on estimate)
> +    OUT free_space BIGINT,           -- exact free space in bytes
> +    OUT free_percent FLOAT8)         -- free space in %

Hm. I do wonder if this should really be called 'statbloat'. Perhaps it'd more appropriately be called pg_estimate_bloat or somesuch? Also, is free_space really exact? The fsm isn't byte exact.

> +static Datum
> +build_output_type(pgstatbloat_output_type *stat, FunctionCallInfo fcinfo)
> +{
> +#define NCOLUMNS 10
> +#define NCHARS 32
> +
> +	HeapTuple	tuple;
> +	char	   *values[NCOLUMNS];
> +	char		values_buf[NCOLUMNS][NCHARS];
> +	int			i;
> +	double		tuple_percent;
> +	double		dead_tuple_percent;
> +	double		free_percent;	/* free/reusable space in % */
> +	double		scanned_percent;
> +	TupleDesc	tupdesc;
> +	AttInMetadata *attinmeta;
> +
> +	/* Build a tuple descriptor for our result type */
> +	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
> +		elog(ERROR, "return type must be a row type");
> +
> +	/*
> +	 * Generate attribute metadata needed later to produce tuples from raw C
> +	 * strings
> +	 */
> +	attinmeta = TupleDescGetAttInMetadata(tupdesc);
> +
> +	if (stat->table_len == 0)
> +	{
> +		tuple_percent = 0.0;
> +		dead_tuple_percent = 0.0;
> +		free_percent = 0.0;
> +	}
> +	else
> +	{
> +		tuple_percent = 100.0 * stat->tuple_len / stat->table_len;
> +		dead_tuple_percent = 100.0 * stat->dead_tuple_len / stat->table_len;
> +		free_percent = 100.0 * stat->free_space / stat->table_len;
> +	}
> +
> +	scanned_percent = 0.0;
> +	if (stat->total_pages != 0)
> +	{
> +		scanned_percent = 100 * stat->scanned_pages / stat->total_pages;
> +	}
> +
> +	for (i = 0; i < NCOLUMNS; i++)
> +		values[i] = values_buf[i];
> +	i = 0;
> +	snprintf(values[i++], NCHARS, INT64_FORMAT, stat->table_len);
> +	snprintf(values[i++], NCHARS, "%.2f", scanned_percent);
> +	snprintf(values[i++], NCHARS, INT64_FORMAT, stat->tuple_count);
> +	snprintf(values[i++], NCHARS, INT64_FORMAT, stat->tuple_len);
> +	snprintf(values[i++], NCHARS, "%.2f", tuple_percent);
> +	snprintf(values[i++], NCHARS, INT64_FORMAT, stat->dead_tuple_count);
> +	snprintf(values[i++], NCHARS, INT64_FORMAT, stat->dead_tuple_len);
> +	snprintf(values[i++], NCHARS, "%.2f", dead_tuple_percent);
> +	snprintf(values[i++], NCHARS, INT64_FORMAT, stat->free_space);
> +	snprintf(values[i++], NCHARS, "%.2f", free_percent);
> +
> +	tuple = BuildTupleFromCStrings(attinmeta, values);
> +
> +	return
Re: [HACKERS] Async execution of postgres_fdw.
Kyotaro,

* Kyotaro HORIGUCHI (horiguchi.kyot...@lab.ntt.co.jp) wrote:
> The attached is the fixed patch. It apparently improves the performance
> for the test case shown in the previous mail, in which the average
> tuple length is about 140 bytes.

I'm all for improving performance of postgres_fdw and would like to see us support sending queries off to be worked asynchronously, but starting execution on the remote server during ExecInitNode is against the documented FDW API spec. I discussed exactly this issue over a year ago here:

http://www.postgresql.org/message-id/20131104032604.gb2...@tamriel.snowman.net

Sadly, there weren't any direct responses to that email, but I do recall having a discussion on another thread (or in person?) with Tom where we ended up agreeing that we can't simply remove that requirement from the docs or the API.

I certainly appreciate that you've put quite a bit of effort into this, but I'm afraid we can't accept it while it starts running a query on the remote side during the ExecInitNode phase. The query can not start executing on the remote side until IterateForeignScan is called.

You might consider looking at the other suggestion in that email with regard to adding an Async mechanism to the executor. I didn't get to the point of writing code, but I did think about it a fair bit and still believe that could work.

I'm not going to change the status of this patch in the CommitFest at this time, in case anyone else feels I've misunderstood or not correctly analyzed what the patch does (I'll admit, I've only read it and not actually compiled it or run it with a debugger, but I'm pretty sure my reading of what's happening is correct), but I'm afraid this is going to have to end up as Rejected.

Thanks!

Stephen
Re: [HACKERS] initdb -S and tablespaces
On 2015-05-04 14:23:16 -0400, Robert Haas wrote:
> On Fri, May 1, 2015 at 10:41 AM, Abhijit Menon-Sen a...@2ndquadrant.com wrote:
>> As for the non-backpatchable 0002, I agree with Andres that it should
>> be included in 9.5; but I take it you're still not convinced?
>
> No, I'm not convinced. That patch will protect you in one extremely
> specific scenario: you turn off fsync, do some stuff, shut down, turn
> fsync back on again, and start up.

Hm. ISTM it'd not be hard to actually make it safe in nearly all situations. What about tracking the last checkpoint's fsync setting and doing a fsync_pgdata() in the checkpointer at the end of a checkpoint if the previous setting was off and the current one is on? Combined with doing so at startup if the setting changed between runs, that should give pretty decent protection. And it seems fairly simple to implement.

Greetings,

Andres Freund
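As a toy model of that idea — all names here (CkptState, checkpoint, full_syncs) are invented for illustration; the real change would live in the checkpointer and compare against a setting persisted across runs — the rule is simply "flush everything once whenever fsync flips from off to on":

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Remember the fsync setting in effect at the previous checkpoint.
 * If it has flipped from off to on, flush the whole data directory
 * once (full_syncs stands in for a call to fsync_pgdata()) before
 * trusting durability again.
 */
typedef struct
{
	bool	prev_fsync;		/* setting recorded at the last checkpoint */
	int		full_syncs;		/* times we had to fsync everything */
} CkptState;

static void
checkpoint(CkptState *st, bool fsync_now)
{
	if (fsync_now && !st->prev_fsync)
		st->full_syncs++;	/* off -> on transition: flush everything */
	st->prev_fsync = fsync_now;
}
```

The same comparison at startup (against the setting saved from the previous run) would cover the shut-down-and-restart scenario Robert describes.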