initdb help message about WAL segment size

2018-03-10 Thread Magnus Hagander
I propose the attached patch, aligning the help message with the docs. Any
reason not to?

(yes, I confused myself by trying --wal-segsize=1MB instead of just
--wal-segsize=1. I blame just stepping off an intercontinental flight and a
very early morning :P)
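
(For the record, what does work is the bare number of megabytes, along the
lines of:

    initdb -D datadir --wal-segsize=1

since the option rejects anything that isn't a plain number.)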

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 1f7f2aaa54..65eba7d42f 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -2323,7 +2323,7 @@ usage(const char *progname)
 	printf(_("  -U, --username=NAME   database superuser name\n"));
 	printf(_("  -W, --pwpromptprompt for a password for the new superuser\n"));
 	printf(_("  -X, --waldir=WALDIR   location for the write-ahead log directory\n"));
-	printf(_("  --wal-segsize=SIZEsize of wal segment size\n"));
+	printf(_("  --wal-segsize=SIZEsize of wal segment size in megabytes\n"));
 	printf(_("\nLess commonly used options:\n"));
 	printf(_("  -d, --debug   generate lots of debugging output\n"));
 	printf(_("  -k, --data-checksums  use data page checksums\n"));


Re: Parallel Aggregates for string_agg and array_agg

2018-03-10 Thread David Rowley
On 11 March 2018 at 12:11, Tomas Vondra  wrote:
> On 03/05/2018 04:51 AM, David Rowley wrote:
>> On 5 March 2018 at 04:54, Tomas Vondra  wrote:
>> Consider the following slightly backward-looking case;
>>
>> select string_agg(',', x::text) from generate_Series(1,10)x;
>>   string_agg
>> ----------------------
>>  ,2,3,4,5,6,7,8,9,10,
>>
>> Here the delimiter is the number, not the ','. Of course, the
>> delimiter for the first aggregated result is not shown.  The previous
>> implementation of the transfn for this aggregate just threw away the
>> first delimiter, but that's no good for parallel aggregates as the
>> transfn may be getting called in a parallel worker, in which case
>> we'll need that delimiter in the combine function to properly join to
>> the other partial aggregated results matching the same group.
>>
>
> Hmmm, you're right, I hadn't considered that the delimiter might be
> variable. But isn't that just yet another hint that while StringInfo was
> suitable for the aggregate state of the non-parallel string_agg, it may
> not be for the parallel version?
>
> What if the parallel string_agg instead used something like this:
>
> struct StringAggState
> {
> char   *delimiter;/* first delimiter */
> StringInfoData  str;  /* partial aggregate state */
> } StringAggState;
>
> I think that would make the code cleaner, compared to using the cursor
> field for that purpose. But maybe it'd make it not possible to share
> code with the non-parallel aggregate, not sure.

It would be possible to use something like that. The final function
would just need to take 'str' and ignore 'delimiter', whereas the
combine function would need both. However, you could optimize the
above to just have a single StringInfoData and have a pointer to the
start of the actual data (beyond the first delimiter), that would save
us a call to palloc and also allow the state's data to exist in one
contiguous block of memory, which will be more cache friendly when it
comes to reading it back again.  The pointer here would, of course,
have to be an offset into the data array, since repallocs would cause
problems as the state grew.
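
To make that concrete, here is a rough sketch (not the actual patch code)
of what the combine step looks like when the state is one StringInfo whose
cursor records where the real data starts, i.e. just past the first
delimiter:

    /*
     * Sketch only: combine two partial string_agg states.  state2 still
     * carries its leading delimiter, which is exactly what we need to
     * join the two partial results for the same group.
     */
    appendBinaryStringInfo(state1, state2->data, state2->len);

    /*
     * The final function then returns everything after the first
     * delimiter, e.g. something like:
     *
     *   cstring_to_text_with_len(state->data + state->cursor,
     *                            state->len - state->cursor);
     */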

This is kinda the option I was going for with using the cursor. I
didn't think that was going to be a problem. It seems that the cursor
was invented so that users of StringInfo could store a marker of some
sort along with the string. The comment for cursor reads:

 * cursor is initialized to zero by makeStringInfo or initStringInfo,
 * but is not otherwise touched by the stringinfo.c routines.
 * Some routines use it to scan through a StringInfo.

The use of the cursor field does not interfere with pqformat.c's use
of it, since the serialization functions create a new StringInfo for
the 'result'. If anything tries to send the internal state directly,
then that's certainly broken; it needs to be serialized before that
can happen.

I don't quite see how inventing a new struct would make the code
cleaner. It would make the serialization and deserialization functions
have to write and read more fields for the lengths of the data
contained in the state.

I understand that the cursor field is used in pqformat.c quite a bit,
but I don't quite understand why it should get to use that field the
way it wants, but the serialization and deserialization functions
shouldn't.  I'd understand that if we were trying to phase out the
cursor field from StringInfoData, but I can't imagine that's going to
happen.

Of course, with all that said, if you really strongly object to how
I've done this, then I'll change it to use a new struct type. I had
just tried to make the patch footprint as small as possible, and the
above text is just me explaining my reasoning behind this, not me
objecting to your request to change it. Let me know your thoughts
after reading the above.

In the meantime, I've attached an updated patch. The only change it
contains is updating the "Partial Mode" column in the docs from "No"
to "Yes".

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


combinefn_for_string_and_array_aggs_v6.patch
Description: Binary data


Re: [HACKERS] MERGE SQL Statement for PG11

2018-03-10 Thread Peter Geoghegan
On Sat, Mar 10, 2018 at 9:22 PM, Pavan Deolasee
 wrote:
> Ok. I will look at it. I think it shouldn't be too difficult and the
> original restriction was mostly a fallout of expecting CHECK constraint
> style expressions there.

Good, thanks.

> Ok. OVERRIDING is done. I think we can support ruleutils easily too. I don't
> know how to test that though.

Glad to hear it.

> I thought for a while about this and even tried multiple approaches before
> settling for what we have today. The biggest challenge is that
> inheritance/partition tables take completely different paths for INSERTs and
> UPDATE/DELETE. The RIGHT OUTER JOIN makes it kinda difficult because the
> regular UPDATE/DELETE code path ends up throwing duplicates when the source
> table is joined with individual partitions. IIRC that's the sole reason why
> I had to settle on pushing the JOIN underneath, giving it SELECT-like
> treatment and then handling UPDATE/DELETE in the executor.

It sounds like we should try to thoroughly understand why these
duplicates arose. Did you actually call EvalPlanQualSetPlan() for all
subplans at the time?

> Ok. If you've something which is workable, then great. But AFAICS this is
> what the original patch was doing until we came to support partitioning.
> Even with partitioning I could get everything to work, without duplicating
> the RTE, except the duplicate rows issue. I don't know how to solve that
> without doing what I've done or completely rewriting UPDATE/DELETE handling
> for inheritance/partition table. If you or others have better ideas, they
> are most welcome.

I don't claim that what I wrote was workable with partitioning. But
I'm not getting how we can get away with not calling
EvalPlanQualSetPlan() for child plans, or something like it, as things
are.

> Right. The entire purpose of having two different RTEs is to work around
> this problem. I explained this approach here [1]. I didn't receive any
> objections then, but that's mostly because nobody read it carefully. As I
> said, if we have an alternate feasible and better mechanism, let's go for it
> as long as efforts are justifiable.

FWIW, you're right that I didn't give that aspect much thought until
quite recently. I'm no expert on partitioning.

As you know, there is an ON CONFLICT DO UPDATE + partitioning patch in
the works from Alvaro. In your explanation about that approach that
you cited, you wondered what the trouble might have been with ON
CONFLICT + partitioning, and supposed that the issues were similar
there. Are they? Has that turned up much?

-- 
Peter Geoghegan



Re: Faster inserts with mostly-monotonically increasing values

2018-03-10 Thread Pavan Deolasee
On Sat, Mar 10, 2018 at 12:11 AM, Claudio Freire 
wrote:

> On Fri, Mar 9, 2018 at 2:54 PM, Pavan Deolasee 
> wrote:
> >
> >
>
> >
> > So yes, the benefits of the patch go down with higher number of clients,
> but
> > it does not entirely vanish.
>
> What if you implement my suggestion?
>
> That should improve the multi-client case considerably.
>


Yes, I will try that next - it seems like a good idea. So the idea would
be: check if the block is still the rightmost block and the insertion-key
is greater than the first key in the page. If those conditions are
satisfied, then we do a regular binary search within the page to find the
correct location. This might add an overhead of binary search when keys are
strictly ordered and a single client is inserting the data. If that becomes
a concern, we might be able to detect that special case and optimise
for it as well.
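
In rough pseudocode (the helper names below are illustrative, not actual
nbtree functions), the fast path would look something like:

    /* Sketch only: try to reuse the cached rightmost leaf block. */
    if (cached_rightmost_block != InvalidBlockNumber)
    {
        buf = ReadBuffer(index, cached_rightmost_block);
        /* ... lock the buffer exclusively ... */

        if (page_is_rightmost(buf) &&
            key_greater_than_first_item(buf, insertion_key))
        {
            /* Conditions hold: binary-search this page for the slot. */
            offset = binary_search_page(buf, insertion_key);
        }
        else
        {
            /* Page split or cache stale: fall back to a root descent. */
            /* ... release the buffer and do the normal search ... */
        }
    }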

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] MERGE SQL Statement for PG11

2018-03-10 Thread Pavan Deolasee
On Sun, Mar 11, 2018 at 9:27 AM, Peter Geoghegan  wrote:

>
> >>
> >> We're talking about the scantuple here. It's not like excluded.*.
>
> I often care about things like system columns not because of the
> user-visible functionality, but because it reassures me that the
> design is robust.
>
>
Ok. I will look at it. I think it shouldn't be too difficult and the
original restriction was mostly a fallout of expecting CHECK constraint
style expressions there.


> >> I think that this is a blocker, unfortunately.
> >
> > You mean OVERRIDING or ruleutils?
>
> I meant OVERRIDING, but ruleutils seems like something we need, too.
>

Ok. OVERRIDING is done. I think we can support ruleutils easily too. I
don't know how to test that though.


>
> >> >> * I still think we probably need to add something to ruleutils.c, so
> >> >> that MERGE Query structs can be deparsed -- see get_query_def().
> >> >
> >> > Ok. I will look at it. Not done in this version though.
> >>
> >> I also wonder what it would take to support referencing CTEs. Other
> >> implementations do have this. Why can't we?
> >
> >
> > Hmm. I will look at them. But TBH I would like to postpone them to v12 if
> > they turn out to be tricky. We have already covered a very large ground
> in
> > the last few weeks, but we're approaching the feature freeze date.
>
> I quickly implemented CTE support myself (not wCTE support, since
> MERGE doesn't use RETURNING), and it wasn't tricky. It seems to work
> when I mechanically duplicate the approach taken with other types of
> DML statement in the parser. I have written a few tests, and so far it
> holds up.
>

Ok, thanks. I started doing something similar, but great if you have
already implemented. I will focus on other things for now.


>
> I also undertook something quite a bit harder: Changing the
> representation of the range table from parse analysis on. As I
> mentioned in passing at one point, I'm concerned about the fact that
> there are two RTEs for the target relation in all cases. You
> introduced a separate Query.resultRelation-style RTI index,
> Query.mergeTarget_relation, and we see stuff like this through every
> stage of query processing, from parse analysis right through to
> execution. One problem with the existing approach is that it leaves
> many cases where EXPLAIN MERGE shows the target relation alias "t" as
> "t_1" for some planner nodes, though not for others. More importantly,
> I suspect that having two distinct RTEs has introduced special cases
> that are not really needed, and will be more bug-prone (in fact, there
> are bugs already, which I'll get to at the end). I think that it's
> fair to say that what happens in the patch with EvalPlanQual()'s RTI
> argument is ugly, especially because we also have a separate
> resultRelInfo that we *don't* use. We should aspire to have a MERGE
> implementation that isn't terribly different to UPDATE or DELETE,
> especially within the executor.
>

I thought for a while about this and even tried multiple approaches before
settling for what we have today. The biggest challenge is that
inheritance/partition tables take completely different paths for INSERTs
and UPDATE/DELETE. The RIGHT OUTER JOIN makes it kinda difficult because
the regular UPDATE/DELETE code path ends up throwing duplicates when the
source table is joined with individual partitions. IIRC that's the sole
reason why I had to settle on pushing the JOIN underneath, giving it
SELECT-like treatment and then handling UPDATE/DELETE in the executor.


>
> I wrote a very rough patch that arranged for the MERGE rtable to have
> only a single relation RTE. My approach was to teach
> transformFromClauseItem() and friends to recognize that they shouldn't
> create a new RTE for the MERGE target RangeVar. I actually added some
> hard coding to getRTEForSpecialRelationTypes() for this, which is ugly
> as sin, but the general approach is likely salvageable (we could
> invent a special type of RangeVar to do this, perhaps). The point here
> is that everything just works if there isn't a separate RTE for the
> join's leftarg and the setTargetTable() target, with exactly one
> exception: partitioning becomes *thoroughly* broken. Every single
> partitioning test fails with "ERROR: no relation entry for relid 1",
> and occasionally some other "can't happen" error. This looks like it
> would be hard to fix -- at least, I'd find it hard to fix.
>
>
Ok. If you've something which is workable, then great. But AFAICS this is
what the original patch was doing until we came to support partitioning.
Even with partitioning I could get everything to work, without duplicating
the RTE, except the duplicate rows issue. I don't know how to solve that
without doing what I've done or completely rewriting UPDATE/DELETE handling
for inheritance/partition table. If you or others have better ideas, they
are most welcome.


> This is an extract from header comments for inheritance_planner()
> (master branch):
>
>  * We 

Re: [HACKERS] MERGE SQL Statement for PG11

2018-03-10 Thread Peter Geoghegan
On Fri, Mar 9, 2018 at 6:55 AM, Pavan Deolasee  wrote:
>> * Is this actually needed at all?:
>>
>> > +   /* In MERGE when and condition, no system column is allowed */
>> > +   if (pstate->p_expr_kind == EXPR_KIND_MERGE_WHEN_AND &&
>> > +   attnum < InvalidAttrNumber &&
>> > +   !(attnum == TableOidAttributeNumber || attnum ==
>> > ObjectIdAttributeNumber))
>> > +   ereport(ERROR,
>> > +   (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
>> > +errmsg("system column \"%s\" reference in WHEN AND
>> > condition is invalid",
>> > +   colname),
>> > +parser_errposition(pstate, location)));
>>
>> We're talking about the scantuple here. It's not like excluded.*.

I often care about things like system columns not because of the
user-visible functionality, but because it reassures me that the
design is robust.

>> I think that this is a blocker, unfortunately.
>
> You mean OVERRIDING or ruleutils?

I meant OVERRIDING, but ruleutils seems like something we need, too.

>> >> * I still think we probably need to add something to ruleutils.c, so
>> >> that MERGE Query structs can be deparsed -- see get_query_def().
>> >
>> > Ok. I will look at it. Not done in this version though.
>>
>> I also wonder what it would take to support referencing CTEs. Other
>> implementations do have this. Why can't we?
>
>
> Hmm. I will look at them. But TBH I would like to postpone them to v12 if
> they turn out to be tricky. We have already covered a very large ground in
> the last few weeks, but we're approaching the feature freeze date.

I quickly implemented CTE support myself (not wCTE support, since
MERGE doesn't use RETURNING), and it wasn't tricky. It seems to work
when I mechanically duplicate the approach taken with other types of
DML statement in the parser. I have written a few tests, and so far it
holds up.
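
To be concrete, the kind of statement this enables looks something like
this (table and column names made up), the CTE simply being another
relation that the MERGE can reference in its USING clause:

WITH recent AS (
    SELECT id, val FROM staging WHERE loaded_at > now() - interval '1 day'
)
MERGE INTO target t
USING recent s ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET val = s.val
WHEN NOT MATCHED THEN
    INSERT (id, val) VALUES (s.id, s.val);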

I also undertook something quite a bit harder: Changing the
representation of the range table from parse analysis on. As I
mentioned in passing at one point, I'm concerned about the fact that
there are two RTEs for the target relation in all cases. You
introduced a separate Query.resultRelation-style RTI index,
Query.mergeTarget_relation, and we see stuff like this through every
stage of query processing, from parse analysis right through to
execution. One problem with the existing approach is that it leaves
many cases where EXPLAIN MERGE shows the target relation alias "t" as
"t_1" for some planner nodes, though not for others. More importantly,
I suspect that having two distinct RTEs has introduced special cases
that are not really needed, and will be more bug-prone (in fact, there
are bugs already, which I'll get to at the end). I think that it's
fair to say that what happens in the patch with EvalPlanQual()'s RTI
argument is ugly, especially because we also have a separate
resultRelInfo that we *don't* use. We should aspire to have a MERGE
implementation that isn't terribly different to UPDATE or DELETE,
especially within the executor.

I wrote a very rough patch that arranged for the MERGE rtable to have
only a single relation RTE. My approach was to teach
transformFromClauseItem() and friends to recognize that they shouldn't
create a new RTE for the MERGE target RangeVar. I actually added some
hard coding to getRTEForSpecialRelationTypes() for this, which is ugly
as sin, but the general approach is likely salvageable (we could
invent a special type of RangeVar to do this, perhaps). The point here
is that everything just works if there isn't a separate RTE for the
join's leftarg and the setTargetTable() target, with exactly one
exception: partitioning becomes *thoroughly* broken. Every single
partitioning test fails with "ERROR: no relation entry for relid 1",
and occasionally some other "can't happen" error. This looks like it
would be hard to fix -- at least, I'd find it hard to fix.

This is an extract from header comments for inheritance_planner()
(master branch):

 * We have to handle this case differently from cases where a source relation
 * is an inheritance set. Source inheritance is expanded at the bottom of the
 * plan tree (see allpaths.c), but target inheritance has to be expanded at
 * the top.  The reason is that for UPDATE, each target relation needs a
 * different targetlist matching its own column set.  Fortunately,
 * the UPDATE/DELETE target can never be the nullable side of an outer join,
 * so it's OK to generate the plan this way.

Of course, that isn't true of MERGE: The MERGE target *can* be the
nullable side of an outer join. That's probably a big complication for
using one target RTE. Your approach to implementing partitioning [1]
seems to benefit from having two different RTEs, in a way -- you
sidestep the restriction. As you put it, "Since MERGE need both the
facilities [both INSERT and UPDATE facilities], I'd to pretty much
merge both the machineries". As I 

Re: disable SSL compression?

2018-03-10 Thread Peter Eisentraut
On 3/9/18 09:06, Magnus Hagander wrote:
> What platform does that actually work out of the box on? I have
> customers who actively want to use it (for compression, not security --
> replication across limited and metered links), and the amount of
> workarounds they have to put in place OS level to get it working is
> increasingly complicated.

It was disabled in OpenSSL 1.1.0:

  *) CRIME protection: disable compression by default, even if OpenSSL is
 compiled with zlib enabled. Applications can still enable compression
 by calling SSL_CTX_clear_options(ctx, SSL_OP_NO_COMPRESSION), or by
 using the SSL_CONF library to configure compression.
 [Emilia Käsper]

So for your purposes, you could add a server option to turn it back on.

Such a server option would also be useful for those users who are using
OpenSSL <1.1.0 and want to turn off compression on the server side.
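
A minimal sketch of what that could look like in the backend's SSL setup,
assuming a hypothetical boolean GUC named ssl_compression (not an existing
setting):

    /* Hypothetical option; OpenSSL >= 1.1.0 disables compression by default. */
    if (ssl_compression)
        SSL_CTX_clear_options(context, SSL_OP_NO_COMPRESSION);
    else
        SSL_CTX_set_options(context, SSL_OP_NO_COMPRESSION);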

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: VACUUM FULL vs dropped columns

2018-03-10 Thread Andrew Dunstan
On Sun, Mar 11, 2018 at 9:49 AM, Tom Lane  wrote:
> Andrew Dunstan  writes:
>> Why does VACUUM FULL cause the size of this table with a single
>> dropped column (1 out of 1000) to double?
>
> VACUUM FULL will rewrite the tuples with a null bitmap where they
> had none before (cf reform_and_rewrite_tuple).  That's only a rather
> marginal increase in the tuple size, but in this particular example,
> it pushes the tuples from just under half a page to just over, so
> that you go from 2 tuples/page to 1.
>


Aha! Thanks for the explanation.

cheers

andrew



-- 
Andrew Dunstan            https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: VACUUM FULL vs dropped columns

2018-03-10 Thread Tom Lane
Andrew Dunstan  writes:
> Why does VACUUM FULL cause the size of this table with a single
> dropped column (1 out of 1000) to double?

VACUUM FULL will rewrite the tuples with a null bitmap where they
had none before (cf reform_and_rewrite_tuple).  That's only a rather
marginal increase in the tuple size, but in this particular example,
it pushes the tuples from just under half a page to just over, so
that you go from 2 tuples/page to 1.
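
Back-of-the-envelope, for 1000 int4 columns on a default 8kB page: without
a null bitmap the tuple is roughly 24 bytes of header plus 4000 bytes of
data, ~4024 bytes; the bitmap adds ceil(1000/8) = 125 bytes, so after
alignment the header grows to about 152 bytes and the tuple to ~4152 bytes.
Two of the former (plus line pointers) still fit in the ~8160 bytes of
usable page space, two of the latter do not.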

regards, tom lane



Re: Parallel Aggregates for string_agg and array_agg

2018-03-10 Thread Tomas Vondra


On 03/05/2018 04:51 AM, David Rowley wrote:
> On 5 March 2018 at 04:54, Tomas Vondra  wrote:
>> 1) There seems to be forgotten declaration of initArrayResultInternal in
>> arrayfuncs.c. I suppose you've renamed it to initArrayResultWithSize and
>> moved it to a header file, and forgotten to remove this bit.
> 
> Oops. Must be left over from trying things another way. Removed.
> 
>> 2) bytea_string_agg_deserialize has this comment:
>>
>>  /*
>>   * data: technically we could reuse the buf.data buffer, but that is
>>   * slightly larger than we need due to the extra 4 bytes for the cursor
>>   */
>>
>> I find the argument "it has 4 extra bytes, so we'll allocate new chunk"
>> somewhat weak. We do allocate the memory in 2^n chunks anyway, so the
>> new chunk is likely to be much larger anyway. I'm not saying we need to
>> reuse the buffer, IMHO the savings will be non-measurable.
> 
> Agreed. I've removed that part of the comment.
> 
>> 3) string_agg_transfn and bytea_string_agg_transfn say this"
>>
>>  /*
>>   * We always append the delimiter text before the value text, even
>>   * with the first aggregated item.  The reason for this is that if
>>   * this state needs to be combined with another state using the
>>   * aggregate's combine function, then we need to have the delimiter
>>   * for the first aggregated item.  The first delimiter will be
>>   * stripped off in the final function anyway.  We use a little cheat
>>   * here and store the position of the actual first value (after the
>>   * delimiter) using the StringInfo's cursor variable.  This relies on
>>   * the cursor not being used for any other purpose.
>>   */
>>
>> How does this make the code any simpler, compared to not adding the
>> delimiter (as before) and adding it when combining the values (at which
>> point you should know when it's actually needed)?
> 
> The problem is that if we don't record the first delimiter then we
> won't know what it is when it comes to combining the states.
> 
> Consider the following slightly backward-looking case;
> 
> select string_agg(',', x::text) from generate_Series(1,10)x;
>   string_agg
> ----------------------
>  ,2,3,4,5,6,7,8,9,10,
> 
> Here the delimiter is the number, not the ','. Of course, the
> delimiter for the first aggregated result is not shown.  The previous
> implementation of the transfn for this aggregate just threw away the
> first delimiter, but that's no good for parallel aggregates as the
> transfn may be getting called in a parallel worker, in which case
> we'll need that delimiter in the combine function to properly join to
> the other partial aggregated results matching the same group.
> 

Hmmm, you're right, I hadn't considered that the delimiter might be
variable. But isn't that just yet another hint that while StringInfo was
suitable for the aggregate state of the non-parallel string_agg, it may
not be for the parallel version?

What if the parallel string_agg instead used something like this:

struct StringAggState
{
char   *delimiter;/* first delimiter */
StringInfoData  str;  /* partial aggregate state */
} StringAggState;

I think that would make the code cleaner, compared to using the cursor
field for that purpose. But maybe it'd make it not possible to share
code with the non-parallel aggregate, not sure.

I always considered the cursor field to be meant for scanning through
StringInfo, so using it for this purpose seems a bit like a misuse.

> Probably I could improve the comment a bit. I understand that having a
> variable delimiter is not commonplace in the normal world. I certainly
> had never considered it before working on this.  I scratched my head a
> bit when doing this and thought about inventing a new trans type, but
> I decided that the most efficient design for an aggregate state was to
> store the delimiter and data all in one buffer and have a pointer to
> the start of the actual data... StringInfo has exactly what's required
> if you use the cursor as the pointer, so I didn't go and reinvent the
> wheel.
> 

I wonder why the variable delimiter is not mentioned in the docs ... (at
least I haven't found anything mentioning the behavior).

> I've now changed the comment to read:
> 
> /*
> * It's important that we store the delimiter text for all aggregated
> * items, even the first one, which at first thought you might think
> * could just be discarded.  The reason for this is that if this
> * function is being called from a parallel worker, then we'll need
> * the first delimiter in order to properly combine the partially
> * aggregated state with the states coming from other workers.  In the
> * final output, the first delimiter will be stripped off of the final
> * aggregate state, but in order to know where the actual first data
> * is we must store the position of the first data value somewhere.
> * Conveniently, StringInfo has a cursor property which can be used
> * to serve our needs here.
> */
> 
> I've 

Intermittent pg_ctl failures on Windows

2018-03-10 Thread Tom Lane
The buildfarm's Windows members occasionally show weird pg_ctl failures,
for instance this recent case:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2018-03-10%2020%3A30%3A20

### Restarting node "master"
# Running: pg_ctl -D 
G:/prog/bf/root/HEAD/pgsql.build/src/test/recovery/tmp_check/t_006_logical_decoding_master_data/pgdata
 -l 
G:/prog/bf/root/HEAD/pgsql.build/src/test/recovery/tmp_check/log/006_logical_decoding_master.log
 restart
waiting for server to shut down done
server stopped
waiting for server to startThe process cannot access the file because it is 
being used by another process.
 stopped waiting
pg_ctl: could not start server
Examine the log output.
Bail out!  system pg_ctl failed

or this one:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2017-12-29%2023%3A30%3A24

### Stopping node "subscriber" using mode fast
# Running: pg_ctl -D 
c:/prog/bf/root/HEAD/pgsql.build/src/test/subscription/tmp_check/t_001_rep_changes_subscriber_data/pgdata
 -m fast stop
waiting for server to shut downpg_ctl: could not open PID file 
"c:/prog/bf/root/HEAD/pgsql.build/src/test/subscription/tmp_check/t_001_rep_changes_subscriber_data/pgdata/postmaster.pid":
 Permission denied
Bail out!  system pg_ctl failed

I'd been writing these off as Microsoft randomness and/or antivirus
interference, but it suddenly occurred to me that there might be a
consistent explanation: since commit f13ea95f9, when pg_ctl is waiting
for server start/stop, it is trying to read postmaster.pid more-or-less
concurrently with the postmaster writing to that file.  On Unix that's not
much of a problem, but I believe that on Windows you have to specifically
open the file with sharing enabled, or you get error messages like these.
The postmaster should be enabling sharing, because port.h redirects
open/fopen to pgwin32_open/pgwin32_fopen which enable the sharing flags.
But it only does that #ifndef FRONTEND.  So pg_ctl is just using naked
open(), which could explain these failures.
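
For illustration, a share-enabled open of just that file on the frontend
side might look roughly like this (cf. what pgwin32_open does for the
backend; pid_path here is a placeholder):

    HANDLE h = CreateFile(pid_path, GENERIC_READ,
                          FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                          NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);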

If this theory is accurate, it should be pretty easy to replicate the
problem if you modify the postmaster to hold postmaster.pid open longer
when rewriting it, e.g. stick fractional-second sleeps into CreateLockFile
and AddToDataDirLockFile.

I'm not in a position to investigate this in detail nor test a fix,
but I think somebody should.

regards, tom lane



VACUUM FULL vs dropped columns

2018-03-10 Thread Andrew Dunstan
While doing some testing I noticed this, which seems somewhat perverse:

create table t();
insert into t select from generate_series(1,1);
select 'alter table t ' || string_agg(' add column c'||x::text||' int
default ' ||x::text,',')
from generate_series(1,1000) x \gexec

create table t_dropped();
insert into t_dropped select from generate_series(1,1);
select 'alter table t_dropped ' || string_agg(' add column
c'||x::text||' int default ' ||x::text,',')
from generate_series(1,1000)  x \gexec

alter table t_dropped drop column c900;

select pg_total_relation_size('t') as size_t,
pg_total_relation_size('t_dropped') as size_t_dropped;

  size_t  | size_t_dropped
--+
 4096 |   4096
(1 row)


vacuum full t;
vacuum full t_dropped;

select pg_total_relation_size('t') as size_t,
pg_total_relation_size('t_dropped') as size_t_dropped;

  size_t  | size_t_dropped
--+
 4096 |   8192
(1 row)


Why does VACUUM FULL cause the size of this table with a single
dropped column (1 out of 1000) to double?

cheers

andrew


-- 
Andrew Dunstan            https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Bogus use of canonicalize_qual

2018-03-10 Thread Tom Lane
I wrote:
> Whilst fooling about with predtest.c, I noticed a rather embarrassing
> error.  Consider the following, rather silly, CHECK constraint:
> ...
> So, what to do?  We have a few choices, none ideal:

I'd been assuming that we need to back-patch a fix for this, but after
further reflection, I'm not so sure.  The bug is only triggered by fairly
silly CHECK constraints, and given that it's been there a long time (at
least since 9.2 according to my tests) without any field reports, it seems
likely that nobody is writing such silly CHECK constraints.

If we suppose that we only need to fix it in HEAD, the most attractive
answer is to add a parameter distinguishing WHERE and CHECK arguments
to canonicalize_qual.  That allows symmetrical simplification of constant-
NULL subexpressions in the two cases, and the fact that the caller now
has to make an explicit choice of WHERE vs CHECK semantics might help
discourage people from applying the function in cases where it's not
clear which one applies.  PFA a patch that does it like that.
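
To illustrate the semantic difference the new parameter is about: a
constant-NULL subexpression acts as "false" in WHERE but as "satisfied" in
CHECK. For instance (a made-up example, not the constraint from my earlier
message):

    -- WHERE: the NULL result excludes the row
    SELECT 1 WHERE (1 <> 1 OR NULL);               -- returns no rows

    -- CHECK: the NULL result lets the row through
    CREATE TABLE c (x int CHECK (x <> 1 OR NULL));
    INSERT INTO c VALUES (1);                      -- succeeds

so simplifying such subexpressions with WHERE semantics is wrong for CHECK.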

I'm a little unhappy with what I learned about the PARTITION code while
doing this :-(.  It's pretty schizophrenic about whether partition
constraints are implicit-AND or explicit-AND format, and I do not think
that the construction of default-partition constraints is done in a
desirable fashion either.  But I mostly resisted the temptation to touch
that logic in this patch.

Comments, objections?

regards, tom lane

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index fcf7655..786c05d 100644
*** a/src/backend/catalog/partition.c
--- b/src/backend/catalog/partition.c
*************** get_proposed_default_constraint(List *ne
*** 3204,3215 ****
  	defPartConstraint = makeBoolExpr(NOT_EXPR,
  	 list_make1(defPartConstraint),
  	 -1);
  	defPartConstraint =
  		(Expr *) eval_const_expressions(NULL,
  		(Node *) defPartConstraint);
! 	defPartConstraint = canonicalize_qual(defPartConstraint);
  
! 	return list_make1(defPartConstraint);
  }
  
  /*
--- 3204,3217 ----
  	defPartConstraint = makeBoolExpr(NOT_EXPR,
  	 list_make1(defPartConstraint),
  	 -1);
+ 
+ 	/* Simplify, to put the negated expression into canonical form */
  	defPartConstraint =
  		(Expr *) eval_const_expressions(NULL,
  		(Node *) defPartConstraint);
! 	defPartConstraint = canonicalize_qual(defPartConstraint, true);
  
! 	return make_ands_implicit(defPartConstraint);
  }
  
  /*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index e559afb..46a648a 100644
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
*************** PartConstraintImpliedByRelConstraint(Rel
*** 13719,13725 ****
  		 * fail to detect valid matches without this.
  		 */
  		cexpr = eval_const_expressions(NULL, cexpr);
! 		cexpr = (Node *) canonicalize_qual((Expr *) cexpr);
  
  		existConstraint = list_concat(existConstraint,
  	  make_ands_implicit((Expr *) cexpr));
--- 13719,13725 ----
  		 * fail to detect valid matches without this.
  		 */
  		cexpr = eval_const_expressions(NULL, cexpr);
! 		cexpr = (Node *) canonicalize_qual((Expr *) cexpr, true);
  
  		existConstraint = list_concat(existConstraint,
  	  make_ands_implicit((Expr *) cexpr));
*************** ATExecAttachPartition(List **wqueue, Rel
*** 14058,14067 ****
  	/* Skip validation if there are no constraints to validate. */
  	if (partConstraint)
  	{
  		partConstraint =
  			(List *) eval_const_expressions(NULL,
  			(Node *) partConstraint);
! 		partConstraint = (List *) canonicalize_qual((Expr *) partConstraint);
  		partConstraint = list_make1(make_ands_explicit(partConstraint));
  
  		/*
--- 14058,14075 ----
  	/* Skip validation if there are no constraints to validate. */
  	if (partConstraint)
  	{
+ 		/*
+ 		 * Run the partition quals through const-simplification similar to
+ 		 * check constraints.  We skip canonicalize_qual, though, because
+ 		 * partition quals should be in canonical form already; also, since
+ 		 * the qual is in implicit-AND format, we'd have to explicitly convert
+ 		 * it to explicit-AND format and back again.
+ 		 */
  		partConstraint =
  			(List *) eval_const_expressions(NULL,
  			(Node *) partConstraint);
! 
! 		/* XXX this sure looks wrong */
  		partConstraint = list_make1(make_ands_explicit(partConstraint));
  
  		/*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 14b7bec..24e6c46 100644
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** preprocess_expression(PlannerInfo *root,
*** 988,994 ****
  	 */
  	if (kind == EXPRKIND_QUAL)
  	{
! 		expr = (Node *) canonicalize_qual((Expr *) expr);
  
  #ifdef OPTIMIZER_DEBUG
  		printf("After canonicalize_qual()\n");
--- 988,994 ----
  	 */
  	if (kind == EXPRKIND_QUAL)
  	{
! 		expr = (Node *) 

Re: [PROPOSAL] timestamp informations to pg_stat_statements

2018-03-10 Thread Tomas Vondra


On 03/10/2018 04:43 PM, legrand legrand wrote:
> +1
> Having the time of first occurence of a statement is very usefull
> for trouble shouting, it permits for exemple to retrieve the order of
> operations in some complex cases (and thoses informations aren't
> taken by any third party collecting tool, that will only be able to
> provide a time range of occurence).
> 

I really don't see how this would be useful in reconstructing order of
operations, particularly in complex cases where I'd expect the queries
to be executed repeatedly / interleaved in different ways. Furthermore,
it would only work for the very first execution of all statements, so
you would probably have to reset the stats over and over - which seems
to directly contradict the purpose of pg_stat_statements (aggregation of
data over longer periods of time).

So unfortunately this seems rather useless, and log_statement=all seems
like a much better / more reliable approach to achieve that.

I also doubt it really allows computation of averages, e.g. queries per
second, because practical workloads usually have daily/weekly patterns
(and different queries may follow different patterns).

So I agree with Peter Geoghegan that tools regularly snapshotting
pg_stat_statements and processing that are a better approach. I've seen
a bunch of simple scripts doing just that, actually.

> I thought that pgss rows were removed randomly when the max number of
> rows was reached; wouldn't having last_executed information permit a
> kind of LRU removal?
> 

No, the entries are not removed randomly. We track "usage" for each
entry (essentially +1 each time the query gets executed), with a decay
factor applied on each eviction (and we evict 5% at a time).
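
Schematically (a sketch of the logic, not the exact contrib code), the
eviction looks like:

    /* Sort entries so the least-used ones come first, then drop ~5%. */
    qsort(entries, nentries, sizeof(pgssEntry *), entry_usage_cmp);
    nvictims = Max(10, nentries * 5 / 100);
    for (i = 0; i < nvictims; i++)
        hash_search(pgss_hash, &entries[i]->key, HASH_REMOVE, NULL);
    /* Surviving entries have their usage multiplied by a decay factor. */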

It's not immediately obvious why something based on time of the
first/last execution would be better than the current algorithm.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [RFC] What would be difficult to make data models pluggable for making PostgreSQL a multi-model database?

2018-03-10 Thread Henry
I just came across the following paper and project and thought I would
share:

It seems some incremental graph query extensions to SQL could add some
powerful capabilities to PostgreSQL (without having to think about a
complete query language replacement). The incremental change could include:
paths as first class citizens and a new MATCH keyword.

This is what I was reading:

https://arxiv.org/abs/1712.01550  "G-CORE: A Core for Future Graph Query
Languages"

*"Path as first-class citizens. The notion of Path is fundamental for graph
databases, because it introduces an intermediate abstraction level that
allows to represents how elements in a graph are related. The facilities
provided by a graph query language to manipulate paths (i.e. describe,
search, filter, count, annotate, return, etc.) increase the expressivity of
the language. Particularly, the ability to return paths enables the user to
post-process paths within the query language rather that in an ad-hoc
manner [15]." *

They have an open source parser for G-Core here:
https://github.com/ldbc/ldbc_gcore_parser

"This is a G-Core query example which matches persons with the name “John
Doe” together with indirect friends that live in the same city and returns
a table with the names of these friends."

SELECT m.lastName + ', ' + m.firstName AS friendName
 MATCH (n:Person)-/<:knows*>/->(m:Person)
 WHERE n.firstName = 'John' AND n.lastName = 'Doe'
   AND (n)-[:isLocatedIn]->()<-[:isLocatedIn]-(m)

Oracle has a similar graph query language as well: http://pgql-lang.org

SELECT p2.name AS friend_of_friend
  FROM facebook_graph                              /* In the Facebook graph..   */
 MATCH (p1:Person) -/:friend_of{2}/-> (p2:Person)  /* ..match two-hop friends.. */
 WHERE p1.name = 'Mark'                            /* ..of Mark.                */


And Microsoft SQL server added limited MATCH capability:

https://docs.microsoft.com/en-us/sql/t-sql/queries/match-sql-graph



On Sun, Dec 3, 2017 at 2:37 PM Deep-Impact 
wrote:

> From: Tom Lane
> It sounds like what you want is to replace all of Postgres except
> the name.  I'm not clear on the point.
>
>
> The point is to make PostgreSQL a versatile database suitable even for
> niche use cases.  I want more people to love and rely on PostgreSQL.
> Ideally, I want to see various data models incorporated in core from
> the beginning, but it would be difficult.  So, this pluggable data
> model is necessary to foster data model development both inside and
> outside the PostgreSQL community.  With that said, I hope PostgreSQL
> will someday absorb those data models, just like PL/pgSQL is now in
> core.
>
> But for now, I think the relational data model will continue to play a
> central role.  I don't think it's worth making the current relational
> data model implementation a plugged module.  It will depend on the
> software design and source code cleanness whether to do that.
>
> I don't understand yet how painful it would be to support other data
> models as first-class citizens, and I may be a reckless man who
> doesn't know fear.  So I wish you could help and pave this difficult
> way together.
>
> Regards
> MauMau
>
>
>


Re: All Taxi Services need Index Clustered Heap Append

2018-03-10 Thread legrand legrand
Hello,

Would the following custom solution:

- a pre-loaded table with rows sorted by id and ts,
  containing null values for the other columns and
  enough free space per block to permit updates in place,
- having a (btree or brin) index on (id, ts),
- loaded using UPDATEs instead of INSERTs for each new measure

have a chance of being a good workaround?
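
For clarity, the idea is something like this (schema and numbers made up):

CREATE TABLE measures (
    id   int,
    ts   timestamptz,
    lat  float8,
    lon  float8
) WITH (fillfactor = 50);   -- leave free space for updates in place

-- pre-load placeholder rows, already sorted by (id, ts)
INSERT INTO measures (id, ts)
SELECT id, ts
FROM generate_series(1, 100) AS id,
     generate_series(date_trunc('minute', now()),
                     date_trunc('minute', now()) + interval '1 hour',
                     interval '1 minute') AS ts
ORDER BY id, ts;

CREATE INDEX ON measures (id, ts);

-- each new measure then fills in its pre-allocated row
UPDATE measures
   SET lat = 59.33, lon = 18.07
 WHERE id = 42 AND ts = date_trunc('minute', now());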

Regards
PAscal




--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: [HACKERS] [PATCH] Incremental sort

2018-03-10 Thread Alexander Korotkov
On Sat, Mar 10, 2018 at 6:42 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> On Thu, Mar 8, 2018 at 2:49 AM, Tomas Vondra  > wrote:
> Thank you very much for testing and benchmarking.  I'll investigate
> the regressions you found.
>
>
>> Now, there's a caveat in those tests - the data set is synthetic and
>> perfectly random, i.e. all groups equally likely, no correlations or
>> anything like that.
>>
>> I wonder what is the "worst case" scenario, i.e. how to construct a data
>> set with particularly bad behavior of the incremental sort.
>
>
> I think that depends on the reason for the bad behavior of incremental sort.
> For example, our quicksort implementation behaves very well on
> presorted data, but incremental sort appears to be not so good in
> this case, as Heikki showed upthread.  That prompted me to test
> presorted datasets (which appeared to be the "worst case") more intensively.
> But I suspect that the regressions you found have another reason, and
> correspondingly the "worst case" would also be different.
> Once I've investigated the reason for the regressions, I'll try to
> construct a "worst case" as well.
>

After some investigation of the benchmark results, I found two sources of
regressions in incremental sort.

Case 1: Underlying node scan loss is bigger than incremental sort win

= 33 [Wed Mar  7 10:14:14 CET 2018] scale:1000 groups:10
work_mem:64MB incremental:on max_workers:0 =
SELECT * FROM s_1 ORDER BY a, b
   QUERY
PLAN
-
 Limit  (cost=1588080.84..1588080.84 rows=1 width=20) (actual
time=5874.527..5874.527 rows=0 loops=1)
   ->  Incremental Sort  (cost=119371.51..1488081.45 rows=939 width=20)
(actual time=202.842..5653.224 rows=1000 loops=1)
 Sort Key: s_1.a, s_1.b
 Presorted Key: s_1.a
 Sort Method: external merge  Disk: 29408kB
 Sort Groups: 11
 ->  Index Scan using s_1_a_idx on s_1  (cost=0.43..323385.52
rows=939 width=20) (actual time=0.051..1494.105 rows=1000 loops=1)
 Planning time: 0.269 ms
 Execution time: 5877.367 ms
(9 rows)

= 37 [Wed Mar  7 10:15:51 CET 2018] scale:1000 groups:10
work_mem:64MB incremental:off max_workers:0 =
SELECT * FROM s_1 ORDER BY a, b
  QUERY PLAN

--
 Limit  (cost=1656439.93..1656439.93 rows=1 width=20) (actual
time=4741.716..4741.716 rows=0 loops=1)
   ->  Sort  (cost=1531440.69..1556440.54 rows=939 width=20) (actual
time=3522.156..4519.278 rows=1000 loops=1)
 Sort Key: s_1.a, s_1.b
 Sort Method: external merge  Disk: 293648kB
 ->  Seq Scan on s_1  (cost=0.00..163694.39 rows=939 width=20)
(actual time=0.021..650.322 rows=1000 loops=1)
 Planning time: 0.249 ms
 Execution time: 4777.088 ms
(7 rows)

In this case the optimizer decided that "Index Scan + Incremental Sort"
would be cheaper than "Seq Scan + Sort".  But it appears that the amount of
time we lose by selecting Index Scan over Seq Scan is bigger than the
amount of time we win by selecting Incremental Sort over Sort.  I would
note that the regular Sort consumes about 10X more disk space.  I bet that
all of this space fit into the OS cache of the test machine, but the
optimizer expected actual IO to take place in this case, which made the
actual time inconsistent with the costing.

Case 2: Underlying node is not parallelized

= 178 [Wed Mar  7 11:18:53 CET 2018] scale:1000 groups:100
work_mem:8MB incremental:on max_workers:2 =
SELECT * FROM s_2 ORDER BY a, b, c
 QUERY
PLAN

 Limit  (cost=1179047.88..1179047.88 rows=1 width=20) (actual
time=4819.999..4819.999 rows=0 loops=1)
   ->  Incremental Sort  (cost=89.04..1079047.34 rows=1054 width=20)
(actual time=0.203..4603.197 rows=1000 loops=1)
 Sort Key: s_2.a, s_2.b, s_2.c
 Presorted Key: s_2.a, s_2.b
 Sort Method: quicksort  Memory: 135kB
 Sort Groups: 10201
 ->  Index Scan using s_2_a_b_idx on s_2  (cost=0.43..406985.62
rows=1054 width=20) (actual time=0.052..1461.177 rows=1000 loops=1)
 Planning time: 0.313 ms
 Execution time: 4820.037 ms
(9 rows)

= 182 [Wed Mar  7 11:20:11 CET 2018] scale:1000 groups:100
work_mem:8MB incremental:off max_workers:2 =
SELECT * FROM s_2 ORDER BY a, b, c
 QUERY
PLAN

Re: [HACKERS] [PATCH] Incremental sort

2018-03-10 Thread Alexander Korotkov
On Thu, Mar 8, 2018 at 2:49 AM, Tomas Vondra 
wrote:

> OK, the revised patch works fine - I've done a lot of testing and
> benchmarking, and not a single segfault or any other crash.
>
> Regarding the benchmarks, I generally used queries of the form
>
> SELECT * FROM (SELECT * FROM t ORDER BY a) foo ORDER BY a,b
>
> with the first sort done in various ways:
>
> * regular Sort node
> * indexes with Index Scan
> * indexes with Index Only Scan
>
> and all these three options with and without LIMIT (the limit was set to
> 1% of the source table).
>
> I've also varied parallelism (max_parallel_workers_per_gather was set to
> either 0 or 2), work_mem (from 4MB to 256MB) and data set size (tables
> from 1000 rows to 10M rows).
>
> All of this may seem like an overkill, but I've found a couple of
> regressions thanks to that.
>
> The full scripts and results are available here:
>
> https://github.com/tvondra/incremental-sort-tests
>
> The queries actually executed are a bit more complicated, to eliminate
> overhead due to data transfer to client etc. The same approach was used
> in the other sorting benchmarks we've done in the past.
>
> I'm attaching results for two scales - 10k and 10M rows, preprocessed
> into .ods format. I haven't looked at the other scales yet, but I don't
> expect any surprises there.
>
> Each .ods file contains raw data for one of the tests (matching the .sh
> script filename), pivot table, and comparison of durations with and
> without the incremental sort.
>
> In general, I think the results look pretty impressive. Almost all the
> comparisons are green, which means "faster than master" - usually by
> tens of percent (without limit), or by up to ~95% (with LIMIT).
>
> There are a couple of regressions in two cases sort-indexes and
> sort-indexes-ios.
>
> Oh the small dataset this seems to be related to the number of groups
> (essentially, number of distinct values in a column). My assumption is
> that there is some additional overhead when "switching" between the
> groups, and with many groups it's significant enough to affect results
> on these tiny tables (where master only takes ~3ms to do the sort). The
> slowdown seems to be
>
> On the large data set it seems to be somehow related to both work_mem
> and number of groups, but I didn't have time to investigate that yet
> (there are explain analyze plans in the results, so feel free to look).
>
> In general, I think this looks really nice. It's certainly awesome with
> the LIMIT case, as it allows us to leverage indexes on a subset of the
> ORDER BY columns.
>

Thank you very much for testing and benchmarking.  I'll investigate
the regressions you found.


> Now, there's a caveat in those tests - the data set is synthetic and
> perfectly random, i.e. all groups equally likely, no correlations or
> anything like that.
>
> I wonder what is the "worst case" scenario, i.e. how to construct a data
> set with particularly bad behavior of the incremental sort.


I think that depends on the reason for the bad behavior of incremental sort.
For example, our quicksort implementation behaves very well on
presorted data, but incremental sort appears to be not so good in
this case, as Heikki showed upthread.  That prompted me to test
presorted datasets (which appeared to be the "worst case") more intensively.
But I suspect that the regressions you found have another reason, and
correspondingly the "worst case" would also be different.
Once I've investigated the reason for the regressions, I'll try to
construct a "worst case" as well.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [PROPOSAL] timestamp informations to pg_stat_statements

2018-03-10 Thread legrand legrand
+1
Having the time of the first occurrence of a statement is very useful for
troubleshooting; it permits, for example, retrieving the order of operations
in some complex cases (and that information isn't captured by any third-party
collecting tool, which will only be able to provide a time range of
occurrence).

I thought that pgss rows were removed randomly when the max number of rows
was reached; wouldn't having last_executed information permit a kind of LRU
removal?

Regards
PAscal





--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: Online enabling of checksums

2018-03-10 Thread Michael Banck
Hi,

I had a closer look at v3 of the patch now.

On Sat, Mar 03, 2018 at 07:23:31PM +0100, Magnus Hagander wrote:
> Attached is a rebased patch which removes this optimization, updates the
> pg_proc entry for the new format, and changes pg_verify_checksums to use -r
> instead of -o for relfilenode.
 
The patch applies fine with minimal fuzz and compiles with no warnings;
make check and the added isolation tests, as well as the added checksum
tests pass.

If I blindly run "SELECT pg_enable_data_checksums();" on a new cluster, I
get:

|postgres=# SELECT pg_enable_data_checksums();
| pg_enable_data_checksums 
|--
| t
|(1 row)
|
|postgres=# SHOW data_checksums ;
| data_checksums 
|
| inprogress
|(1 row)


However, inspecting the log one sees:

|2018-03-10 14:15:57.702 CET [3313] ERROR:  Database template0 does not allow 
connections.
|2018-03-10 14:15:57.702 CET [3313] HINT:  Allow connections using ALTER 
DATABASE and try again.
|2018-03-10 14:15:57.702 CET [3152] LOG:  background worker "checksum helper 
launcher" (PID 3313) exited with exit code 1

and the background worker is no longer running without any obvious hint
to the client.

I am aware that this has been discussed already, but as-is the user
experience is pretty bad; I think pg_enable_data_checksums() should either
bail out earlier if it cannot connect to all databases, or the behavior
should be better documented.

Otherwise, if I allow connections to template0, the patch works as
expected, I have not managed to break it so far.
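
For reference, the workaround amounts to something like:

ALTER DATABASE template0 WITH ALLOW_CONNECTIONS true;
SELECT pg_enable_data_checksums();
-- ... wait until SHOW data_checksums reports "on" ...
ALTER DATABASE template0 WITH ALLOW_CONNECTIONS false;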

Some further review comments:

> diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
> index f4bc2d4161..89afecb341 100644
> --- a/doc/src/sgml/wal.sgml
> +++ b/doc/src/sgml/wal.sgml
> @@ -230,6 +230,73 @@
[...]
> +   
> +Checksums can be enabled or disabled online, by calling the appropriate
> +functions.
> +Disabling of checksums takes effect immediately when the function is 
> called.
> +   
> +
> +   
> +Enabling checksums will put the cluster in inprogress 
> mode.
> +During this time, checksums will be written but not verified. In 
> addition to
> +this, a background worker process is started that enables checksums on 
> all
> +existing data in the cluster. Once this worker has completed processing 
> all
> +databases in the cluster, the checksum mode will automatically switch to
> +on.
> +   
> +
> +   
> +If the cluster is stopped while in inprogress mode, 
> for
> +any reason, then this process must be restarted manually. To do this,
> +re-execute the function pg_enable_data_checksums()
> +once the cluster has been restarted.
> +   

Maybe document the above issue here, unless it is clear that the
template0-needs-to-allow-connections issue will go away before the patch
is pushed.

> diff --git a/src/backend/access/transam/xlog.c 
> b/src/backend/access/transam/xlog.c
> index 47a6c4d895..56aaa88de1 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c

[...]

> +void
> +SetDataChecksumsOn(void)
> +{
> + if (!DataChecksumsEnabledOrInProgress())
> + elog(ERROR, "Checksums not enabled or in progress");
> +
> + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> +
> + if (ControlFile->data_checksum_version != 
> PG_DATA_CHECKSUM_INPROGRESS_VERSION)
> + {
> + LWLockRelease(ControlFileLock);
> + elog(ERROR, "Checksums not in in_progress mode");

The string used in "SHOW data_checksums" is "inprogress", not
"in_progress".

[...]

> @@ -7769,6 +7839,16 @@ StartupXLOG(void)
>   CompleteCommitTsInitialization();
>  
>   /*
> +  * If we reach this point with checksums in inprogress state, we notify
> +  * the user that they need to manually restart the process to enable
> +  * checksums.
> +  */
> + if (ControlFile->data_checksum_version == 
> PG_DATA_CHECKSUM_INPROGRESS_VERSION)
> + ereport(WARNING,
> + (errmsg("checksum state is \"in progress\" with 
> no worker"),
> +  errhint("Either disable or enable checksums by 
> calling the pg_disable_data_checksums() or pg_enable_data_checksums() 
> functions.")));

Again, string is "inprogress", not "in progress", not sure if that
matters.

> diff --git a/src/backend/postmaster/checksumhelper.c 
> b/src/backend/postmaster/checksumhelper.c
> new file mode 100644
> index 00..6aa71bcf30
> --- /dev/null
> +++ b/src/backend/postmaster/checksumhelper.c
> @@ -0,0 +1,631 @@
> +/*-
> + *
> + * checksumhelper.c
> + * Backend worker to walk the database and write checksums to pages

Backend or Background?

[...]

> +typedef struct ChecksumHelperShmemStruct
> +{
> + pg_atomic_flag launcher_started;
> + boolsuccess;
> + boolprocess_shared_catalogs;
> + /* Parameter values  set on start */

double space.

[...]

> 

Re: [HACKERS] GUC for cleanup indexes threshold.

2018-03-10 Thread Alexander Korotkov
On Fri, Mar 9, 2018 at 9:40 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> On Fri, Mar 9, 2018 at 3:12 PM, Masahiko Sawada 
> wrote:
>
>> Attached an updated patch
>>
> fixed these issue. Will review the patch again.
>
>
> Thank you!
>

I've fixed a bug: _bt_vacuum_needs_cleanup() didn't release a lock when
the metapage is in the old format.
The bug was found by Darafei Praliaskouski and Grigory Smolkin, who tested
this patch.
A revised patch is attached.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0001-lazy-btree-cleanup-5.patch
Description: Binary data


Re: [WIP PATCH] Index scan offset optimisation using visibility map

2018-03-10 Thread Michail Nikolaev
Hello.

Andrey, Tels - thanks for review.

> It could be named "SkipTuples" (e.g. this is the number of tuples we need
> to skip, not the number we have skipped), and the other one then
> "iss_SkipTuplesRemaining" so they are consistent with each other.

Agreed, done.

> Also, I think that this check could be removed from the loop. We do not
> expect that its state will change during execution, do we?

Removed.

Patch is attached, github is updated too.

Michail.


offset_index_only_v4.patch
Description: Binary data


Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-10 Thread Tomas Vondra
On 03/10/2018 02:08 PM, Mark Dilger wrote:
> 
>> On Mar 3, 2018, at 2:40 PM, Tomas Vondra  
>> wrote:
>>
>> An updated patch version, fixing the breakage caused by fd1a421fe6
>> twiddling with pg_proc.
> 
> Hi Tomas, thanks again for this most useful patch!
> 
> Perhaps this is intentional, but there seems to be a place in 
> src/backend/parser/parse_utilcmd.c
> that is overlooked in your recent patch set.  The simplest fix would be:
> 

Yeah, this is a consequence of 5564c11815486bdfe87eb46ebc7c070293fa6956,
which fixed a place we forgot to modify in pg10. Will fix.


thanks

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Sample values for pg_stat_statements

2018-03-10 Thread Tomas Vondra
Hi,

I've looked at this patch today. I like the idea / intent in general, as
it helps with some investigation tasks. That being said, I have a couple
of questions/comments based on a read-through of the patch:


1) I see you've renamed the .sql script from 1.4 to 1.6. I thought we'd
abandoned that approach some time ago, and now only do the upgrade
scripts. That is, keep the 1.4 script and then add 1.4--1.5 and 1.5--1.6.
That's how other extensions do it now, at least - see btree_gin for
example. But maybe pg_stat_statements has to do it this way for some
reason, not sure?

2) The patch should have updated doc/src/sgml/pgstatstatements.sgml

3) Do we really need both collect_consts and collect_params? I can't
really imagine wanting to set only one of those options. In any case,
the names seem unnecessarily abbreviated - just use collect_constants
and collect_parameters.

4) The width_threshold GUC name seems rather weird. I mean, I wouldn't
use "threshold" in this context, and it's really unclear what size it
refers to. We do have a precedent, though: track_activity_query_size for
pg_stat_activity. So I suggest using either parameters_size or
max_parameters_size (prefixed by "pg_stat_statements." of course).

5) I don't quite see why keeping the first set of parameter values we
happen to see would be particularly useful. For example, I'm way more
interested in the values for the longest execution - why not keep those?

6) I suggest to use the same naming style as the existing functions, so
for example CollectParams should be pgss_CollectParams (and it's missing
a comment too).

7) There are a couple of places where the code style violates project
rules, e.g. by placing {} around a single command in if-statement.

8) I see Andres mentioned possible privacy issues (not quite sure what
the 'data minimalism' he mentioned refers to). I'm not sure it's a problem,
considering it can be disabled and it's subject to the usual role check
(superuser / pg_read_all_stats). Unfortunately we can't use the same
approach as pg_stat_activity (only showing data for the user's own queries).
Well, we could keep per-user samples, but that might considerably
inflate the file size.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: disable SSL compression?

2018-03-10 Thread Craig Ringer
On 9 March 2018 at 14:17, Gasper Zejn  wrote:

> On 09. 03. 2018 06:24, Craig Ringer wrote:
>
> I'm totally unconvinced by the threat posed by exploiting a client by
> tricking it into requesting protocol compression - or any other protocol
> change the client lib doesn't understand - with a connection option in
> PGOPTIONS or the "options" connstring entry. The attacker must be able to
> specify either environment variables (in which case I present "LD_PRELOAD")
> or the connstr. If they can set a connstr they can direct the client to
> talk to a different host that tries to exploit the connecting client in
> whatever manner they wish by sending any custom crafted messages they like.
>
> If the attacker has access to client process or environment, he's already
> won and this is not where the compression vulnerability lies.
>
>
I'm aware. That's a reference to Tom's often-stated objection to using a
GUC as a client flag to enable new server-to-client protocol messages, not
anything re SSL.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-10 Thread Mark Dilger

> On Mar 3, 2018, at 2:40 PM, Tomas Vondra  wrote:
> 
> An updated patch version, fixing the breakage caused by fd1a421fe6
> twiddling with pg_proc.

Hi Tomas, thanks again for this most useful patch!

Perhaps this is intentional, but there seems to be a place in 
src/backend/parser/parse_utilcmd.c
that is overlooked in your recent patch set.  The simplest fix would be:

diff --git a/src/backend/parser/parse_utilcmd.c 
b/src/backend/parser/parse_utilcmd.c
index 0fd14f43c6..6ec7818f31 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1661,6 +1661,10 @@ generateClonedExtStatsStmt(RangeVar *heapRel, Oid 
heapRelid,
stat_types = lappend(stat_types, 
makeString("ndistinct"));
else if (enabled[i] == STATS_EXT_DEPENDENCIES)
stat_types = lappend(stat_types, 
makeString("dependencies"));
+   else if (enabled[i] == STATS_EXT_MCV)
+   stat_types = lappend(stat_types, makeString("mcv"));
+   else if (enabled[i] == STATS_EXT_HISTOGRAM)
+   stat_types = lappend(stat_types, 
makeString("histogram"));
else
elog(ERROR, "unrecognized statistics kind %c", 
enabled[i]);
}

diff --git a/src/test/regress/expected/create_table_like.out 
b/src/test/regress/expected/create_table_like.out
index 52ff18c8ca..d7454648fc 100644
--- a/src/test/regress/expected/create_table_like.out
+++ b/src/test/regress/expected/create_table_like.out
@@ -243,7 +243,7 @@ Indexes:
 Check constraints:
 "ctlt1_a_check" CHECK (length(a) > 2)
 Statistics objects:
-"public"."ctlt_all_a_b_stat" (ndistinct, dependencies) ON a, b FROM 
ctlt_all
+"public"."ctlt_all_a_b_stat" (ndistinct, dependencies, mcv, histogram) ON 
a, b FROM ctlt_all
 
 SELECT c.relname, objsubid, description FROM pg_description, pg_index i, 
pg_class c WHERE classoid = 'pg_class'::regclass AND objoid = i.indexrelid AND 
c.oid = i.indexrelid AND i.indrelid = 'ctlt_all'::regclass ORDER BY c.relname, 
objsubid;
 relname | objsubid | description 


Otherwise, perhaps you could include a comment about why STATS_EXT_MCV and
STATS_EXT_HISTOGRAM are not handled in this case.

mark







Re: Concurrency bug in UPDATE of partition-key

2018-03-10 Thread Amit Kapila
On Thu, Mar 8, 2018 at 4:55 PM, Amit Khandekar  wrote:
> (Mail subject changed; original thread : [1])
>
> On 8 March 2018 at 11:57, Amit Khandekar  wrote:
>>> Clearly, ExecUpdate() while moving rows between partitions is missing out on
>>> re-constructing the to-be-updated tuple, based on the latest tuple in the
>>> update chain. Instead, it's simply deleting the latest tuple and inserting a
>>> new tuple in the new partition based on the old tuple. That's simply wrong.
>>
>> You are right. This need to be fixed. This is a different issue than
>> the particular one that is being worked upon in this thread, and both
>> these issues have different fixes.
>>
>
> As discussed above, there is a concurrency bug where during UPDATE of
> a partition-key, another transaction T2 updates the same row.
>

We have one more concurrency-related behavior that would be
impacted by row movement.  Basically, SELECT ... FOR KEY SHARE will now
block an UPDATE that routes the tuple to a different partition,
even if the update doesn't change the key (primary/unique key) column.
Example:

create table foo (a int2, b text) partition by list (a);
create table foo1 partition of foo for values IN (1);
create table foo2 partition of foo for values IN (2);
insert into foo values(1, 'Initial record');

Session-1
---
begin;
select * from foo where a=1 for key share;

Session-2
---
begin;
update foo set a=1, b= b|| '-> update 1' where a=1;
update foo set a=2, b= b|| '-> update 2' where a=1;  --this will block

You can see that when the update moves the row to a different partition,
it blocks, because internally it performs a delete.  Since we have
already documented that such an update is internally Delete+Insert,
this behavior is expected, but I think it is better if we update the
docs (Row-level locks section) as well.

In particular, I am talking about the text below from the Row-level locks section [1]:
FOR KEY SHARE
Behaves similarly to FOR SHARE, except that the lock is weaker: SELECT
FOR UPDATE is blocked, but not SELECT FOR NO KEY UPDATE. A key-shared
lock blocks other transactions from performing DELETE or any UPDATE
that changes the key values, but not other UPDATE,

[1] - 
https://www.postgresql.org/docs/devel/static/explicit-locking.html#LOCKING-ROWS
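
As a side note, while the second update in session-2 is waiting, a third
session can confirm that it is blocked on a lock; just a quick sketch using
the standard view:

select pid, wait_event_type, wait_event, state, query
  from pg_stat_activity
 where wait_event_type = 'Lock';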

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Online enabling of checksums

2018-03-10 Thread Michael Banck
Hi,

On Mon, Mar 05, 2018 at 11:09:02AM +0100, Magnus Hagander wrote:
> On Mon, Mar 5, 2018 at 10:43 AM, Michael Banck 
> wrote:
> > I still find that confusing, but maybe it's just me. I thought the one
> > in the pageheader is the "expected" checksum, and we compare the "found"
> > or "computed/calculated" (in the page itself) against it.
> >
> > I had the same conversation with an external tool author, by the way:
> 
> Maybe we should just say "on disk" for the one that's on disk, would that
> break the confusion? So "calculated %X, found %X on disk"?

I found that there is a precedent in bufpage.c:

|ereport(WARNING,
|        (errcode(ERRCODE_DATA_CORRUPTED),
|         errmsg("page verification failed, calculated checksum %u but expected %u",
|                checksum, p->pd_checksum)));

Apart from the fact that it doesn't print the value in hex (which I
find strange), it sounds like a sensible message to me. But "found %X on
disk" would work as well, I guess.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer



Re: remove pg_class.relhaspkey

2018-03-10 Thread Tomas Vondra


On 02/26/2018 07:23 AM, Michael Paquier wrote:
> On Mon, Feb 26, 2018 at 12:45:48AM -0500, Tom Lane wrote:
>> Michael Paquier  writes:
>>> On Sat, Feb 24, 2018 at 10:21:44PM -0500, Tom Lane wrote:
 We've discussed that at least twice before, and not pulled the trigger
 for fear of breaking client code.
>>
>>> Speaking of which, I have looked at where relhaspkey is being used.  And
>>> there are a couple of things using it:
>>> - Greenplum has a consistency checker tool using it.
>>> - https://github.com/no0p/pgsampler
>>> - https://searchcode.com/codesearch/view/54937539/
>>> - http://codegist.net/code/postgres%20drop%20tables/
>>> - 
>>> https://hackage.haskell.org/package/relational-schemas-0.1.3.4/src/src/Database/Relational/Schema/PgCatalog/PgClass.hs
>>
>> Thanks for poking around.  Did you happen to notice how many of these
>> clients are taking pains to deal with the existing inaccuracy of
>> relhaspkey (ie, that it remains set after the pkey has been dropped)?
> 
> As far as I looked at things.  Those clients rely on how optimistic
> relhaspkey is.  In short, if it is set to true, there can be primary
> key definitions.  If set to false, then it is sure that no primary key
> definition can be found.  If the flag is true, then those clients just
> do an extra lookup on pg_index with indisprimary.  I think that this
> just complicates the code involved though.  If looking for primary keys
> it is way better to just scan directly pg_index.
> 
>> I think there's possibly an argument that relhaspkey should be dropped
>> because it's an attractive nuisance, encouraging clients to assume
>> more than they should about what it means.  But we don't have a lot
>> of evidence for such an argument right now.
> 
> The only breakage I could imagine here is an application which
> thinks relhaspkey set to true implies that a primary key *has* to be
> present. I have to admit that I have not found such a case. Still I
> would not be surprised if there are such applications unaware of
> being broken, particularly plpgsql functions.

I agree with this sentiment - I don't think those flags are particularly
helpful for client applications, and would vote +1 for removal.

Even if the application handles them correctly (i.e. rechecks existence
when relhaspkey=true), the assumption is that this saves a measurable
amount of work. I'm not so sure about that, considering pretty much all
tables have both primary keys and indexes. OTOH it certainly does make
the code more complicated.

For internal usage it might have made a difference back when those flags
were introduced, but the relcache should deal with this efficiently now,
as pointed out in [1]. In any case, we're not using relhaspkey
internally at all. So +1 to get rid of it.
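
For reference, the direct pg_index lookup mentioned upthread is a one-liner
anyway; a minimal sketch ('some_table' is just a placeholder name):

SELECT i.indexrelid::regclass
  FROM pg_catalog.pg_index i
 WHERE i.indrelid = 'some_table'::regclass
   AND i.indisprimary;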

For the other flags we would probably need to test what impact it would
have (e.g. a table with no indexes, many indexes on other tables, and
something calling get_relation_info a lot). But this patch proposes to
remove only relhaspkey.


[1]
https://www.postgresql.org/message-id/CA%2BTgmoYJu24Y8uUAJ4zeQAhoYxLmFxcy%2B5Hdij9ehjoxKo3j3g%40mail.gmail.com


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key

2018-03-10 Thread Amit Kapila
On Fri, Mar 9, 2018 at 3:18 PM, amul sul  wrote:
> On Thu, Mar 8, 2018 at 12:31 PM, Amit Kapila  wrote:
>> On Thu, Mar 8, 2018 at 11:04 AM, Pavan Deolasee
>>
>>> This is just one example. I am almost certain there are many such cases that
>>> will require careful attention.
>>>
>>
>> Right, I think we should be able to detect and fix such cases.
>>
>
> I found a couple of places (in heap_lock_updated_tuple, rewrite_heap_tuple,
> EvalPlanQualFetch & heap_lock_updated_tuple_rec) where ItemPointerEquals is
> use to check tuple has been updated/deleted.  With the proposed patch
> ItemPointerEquals() will no longer work as before, we required addition check
> for updated/deleted tuple, proposed the same in latest patch[1]. Do let me 
> know
> your thoughts/suggestions on this, thanks.
>

I think you have identified the places correctly.  I have a few
suggestions, though.

1.
- if (!ItemPointerEquals(>t_self, ctid))
+ if (!(ItemPointerEquals(>t_self, ctid) ||
+   (!ItemPointerValidBlockNumber(ctid) &&
+(ItemPointerGetOffsetNumber(>t_self) ==   /* TODO: Condn.
should be macro? */
+ ItemPointerGetOffsetNumber(ctid)

Can't we write this and similar tests as:
ItemPointerValidBlockNumber(ctid) &&
!ItemPointerEquals(>t_self, ctid)?  It will be a bit simpler to
understand and serve the purpose.

2.

if (mytup.t_data->t_infomask & HEAP_XMAX_INVALID ||
  ItemPointerEquals(_self, _data->t_ctid) ||
+ !HeapTupleHeaderValidBlockNumber(mytup.t_data) ||
  HeapTupleHeaderIsOnlyLocked(mytup.t_data))

I think it is better to place the HeapTupleHeaderValidBlockNumber check
before ItemPointerEquals, so that we first validate that the block number
is valid and only then compare the complete CTID.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: using index or check in ALTER TABLE SET NOT NULL

2018-03-10 Thread Sergei Kornilov
Hello
My patch does not apply after commit 5748f3a0aa7cf78ac6979010273bd9d50869bb8e,
so here is an update against current master. The NOT NULL constraint is
immutable too, so there are no changes in PartConstraintImpliedByRelConstraint
except the rename and a comment fix.

In this patch version I also reverted the tests to the v4 state: I use a DEBUG
ereport instead of INFO, and that code path is not tested. Please tell me if I
should change the tests in some way.
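
To illustrate the intended behavior (a minimal sketch; table and constraint
names are placeholders): if a valid CHECK constraint already rejects NULLs,
SET NOT NULL can skip the full table scan; otherwise the table is scanned as
before.

create table t (id integer);
alter table t add constraint t_id_not_null check (id is not null);
-- the existing CHECK constraint implies the new NOT NULL, so no scan is needed
alter table t alter column id set not null;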

regards, Sergei

diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index afe2139..783a0ef 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -215,8 +215,15 @@ WITH ( MODULUS numeric_literal, REM
 
  
   These forms change whether a column is marked to allow null
-  values or to reject null values.  You can only use SET
-  NOT NULL when the column contains no null values.
+  values or to reject null values.
+ 
+
+ 
+  You can only use SET NOT NULL when the column 
+  contains no null values. Full table scan is performed to check 
+  this condition unless there are valid CHECK 
+  constraints that prohibit NULL values for specified column.
+  In the latter case table scan is skipped.
  
 
  
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index fcf7655..4fec36d 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -1258,7 +1258,7 @@ check_default_allows_bound(Relation parent, Relation default_rel,
 	 * not contain any row that would belong to the new partition, we can
 	 * avoid scanning the default partition.
 	 */
-	if (PartConstraintImpliedByRelConstraint(default_rel, def_part_constraints))
+	if (ConstraintImpliedByRelConstraint(default_rel, def_part_constraints))
 	{
 		ereport(INFO,
 (errmsg("updated partition constraint for default partition \"%s\" is implied by existing constraints",
@@ -1302,8 +1302,8 @@ check_default_allows_bound(Relation parent, Relation default_rel,
 			 * that it will not contain any row that would belong to the new
 			 * partition, we can avoid scanning the child table.
 			 */
-			if (PartConstraintImpliedByRelConstraint(part_rel,
-	 def_part_constraints))
+			if (ConstraintImpliedByRelConstraint(part_rel,
+ def_part_constraints))
 			{
 ereport(INFO,
 		(errmsg("updated partition constraint for default partition \"%s\" is implied by existing constraints",
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index e559afb..b6d395b 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -162,7 +162,7 @@ typedef struct AlteredTableInfo
 	/* Information saved by Phases 1/2 for Phase 3: */
 	List	   *constraints;	/* List of NewConstraint */
 	List	   *newvals;		/* List of NewColumnValue */
-	bool		new_notnull;	/* T if we added new NOT NULL constraints */
+	bool		verify_new_notnull; /* T if we should recheck NOT NULL */
 	int			rewrite;		/* Reason for forced rewrite, if any */
 	Oid			newTableSpace;	/* new tablespace; 0 means no change */
 	bool		chgPersistence; /* T if SET LOGGED/UNLOGGED is used */
@@ -377,6 +377,7 @@ static ObjectAddress ATExecDropNotNull(Relation rel, const char *colName, LOCKMO
 static void ATPrepSetNotNull(Relation rel, bool recurse, bool recursing);
 static ObjectAddress ATExecSetNotNull(AlteredTableInfo *tab, Relation rel,
  const char *colName, LOCKMODE lockmode);
+static bool NotNullImpliedByRelConstraints(Relation rel, Form_pg_attribute attr);
 static ObjectAddress ATExecColumnDefault(Relation rel, const char *colName,
 	Node *newDefault, LOCKMODE lockmode);
 static ObjectAddress ATExecAddIdentity(Relation rel, const char *colName,
@@ -4371,7 +4372,7 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode)
 			 * Test the current data within the table against new constraints
 			 * generated by ALTER TABLE commands, but don't rebuild data.
 			 */
-			if (tab->constraints != NIL || tab->new_notnull ||
+			if (tab->constraints != NIL || tab->verify_new_notnull ||
 tab->partition_constraint != NULL)
 ATRewriteTable(tab, InvalidOid, lockmode);
 
@@ -4532,7 +4533,7 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
 	}
 
 	notnull_attrs = NIL;
-	if (newrel || tab->new_notnull)
+	if (newrel || tab->verify_new_notnull)
 	{
 		/*
 		 * If we are rebuilding the tuples OR if we added any new NOT NULL
@@ -5545,7 +5546,7 @@ ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
 		 * null, but since it's filled specially, there's no need to test
 		 * anything.)
 		 */
-		tab->new_notnull |= colDef->is_not_null;
+		tab->verify_new_notnull |= colDef->is_not_null;
 	}
 
 	/*
@@ -5948,8 +5949,17 @@ ATExecSetNotNull(AlteredTableInfo *tab, Relation rel,
 
 		CatalogTupleUpdate(attr_rel, >t_self, tuple);
 
-		/* Tell Phase 3 it needs to test the constraint */
-		tab->new_notnull = true;
+		/*
+		 * 

Re: Fixes for missing schema qualifications

2018-03-10 Thread Noah Misch
On Fri, Mar 09, 2018 at 04:55:38PM +0900, Michael Paquier wrote:
> --- a/src/backend/catalog/information_schema.sql
> +++ b/src/backend/catalog/information_schema.sql
> @@ -186,7 +186,7 @@ CREATE FUNCTION _pg_interval_type(typid oid, mod int4) 
> RETURNS text
>  AS
>  $$SELECT
>CASE WHEN $1 IN (1186) /* interval */
> -   THEN upper(substring(format_type($1, $2) from 'interval[()0-9]* 
> #"%#"' for '#'))
> +   THEN pg_catalog.upper(substring(pg_catalog.format_type($1, $2) 
> from 'interval[()0-9]* #"%#"' for '#'))
> ELSE null
>END$$;
>  
> diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
> index 3560318749..f345572c8c 100644
> --- a/src/bin/psql/command.c
> +++ b/src/bin/psql/command.c
> @@ -4483,7 +4483,7 @@ get_create_object_cmd(EditableObjectType obj_type, Oid 
> oid,
>   printfPQExpBuffer(query,
> "SELECT 
> nspname, relname, relkind, "
> 
> "pg_catalog.pg_get_viewdef(c.oid, true), "
> -   
> "array_remove(array_remove(c.reloptions,'check_option=local'),'check_option=cascaded')
>  AS reloptions, "
> +   
> "pg_catalog.array_remove(pg_catalog.array_remove(c.reloptions,'check_option=local'),'check_option=cascaded')
>  AS reloptions, "
> "CASE WHEN 
> 'check_option=local' = ANY (c.reloptions) THEN 'LOCAL'::text "
> "WHEN 
> 'check_option=cascaded' = ANY (c.reloptions) THEN 'CASCADED'::text ELSE NULL 
> END AS checkoption "
> "FROM 
> pg_catalog.pg_class c "
> diff --git a/src/test/isolation/isolationtester.c 
> b/src/test/isolation/isolationtester.c
> index 4ecad038bd..64d666f5cd 100644
> --- a/src/test/isolation/isolationtester.c
> +++ b/src/test/isolation/isolationtester.c
> @@ -184,7 +184,7 @@ main(int argc, char **argv)
>   PQclear(res);
>  
>   /* Get the backend pid for lock wait checking. */
> - res = PQexec(conns[i], "SELECT pg_backend_pid()");
> + res = PQexec(conns[i], "SELECT pg_catalog.pg_backend_pid()");
>   if (PQresultStatus(res) == PGRES_TUPLES_OK)
>   {
>   if (PQntuples(res) == 1 && PQnfields(res) == 1)
> diff --git a/src/test/modules/worker_spi/worker_spi.c 
> b/src/test/modules/worker_spi/worker_spi.c
> index 3b98b1682b..547bdb26c4 100644
> --- a/src/test/modules/worker_spi/worker_spi.c
> +++ b/src/test/modules/worker_spi/worker_spi.c
> @@ -115,7 +115,9 @@ initialize_worker_spi(worktable *table)
>  
>   /* XXX could we use CREATE SCHEMA IF NOT EXISTS? */
>   initStringInfo();
> - appendStringInfo(, "select count(*) from pg_namespace where nspname 
> = '%s'",
> + appendStringInfo(,
> +  "select pg_catalog.count(*) "
> +  "from pg_catalog.pg_namespace where 
> nspname = '%s'",

This qualifies some functions, but it leaves plenty of unqualified operators.
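
For example, a fully qualified form of the worker_spi query above would look
something like this (a sketch; the literal schema name is a placeholder):

select pg_catalog.count(*)
  from pg_catalog.pg_namespace
 where nspname operator(pg_catalog.=) 'worker_spi';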



Re: Fixes for missing schema qualifications

2018-03-10 Thread Michael Paquier
On Fri, Mar 09, 2018 at 09:35:22AM -0500, David Steele wrote:
> These look sane to me.  Did you check the back branches for anything
> that might not exist in HEAD?

I did, but I have not spotted anything extra.  It's impossible to say that I
did not miss one, though; such scanning is tiring.
--
Michael

