Re: Add timeline to partial WAL segments

2019-01-04 Thread David Steele

On 1/5/19 3:31 AM, Michael Paquier wrote:

On Mon, Dec 31, 2018 at 01:07:34PM +0200, David Steele wrote:

In short, the *initial* name of the WAL file is set to what it should be if
it doesn't complete so we don't need to run around and try to rename files
on failure.  Only on success do we need to rename.

Sound plausible?


It does.


Excellent.  This now requires work on my part so I have marked the patch 
"Waiting on Author".


Thanks,
--
-David
da...@pgmasters.net



Re: Add timeline to partial WAL segments

2019-01-04 Thread Michael Paquier
On Mon, Dec 31, 2018 at 01:07:34PM +0200, David Steele wrote:
> In short, the *initial* name of the WAL file is set to what it should be if
> it doesn't complete so we don't need to run around and try to rename files
> on failure.  Only on success do we need to rename.
> 
> Sound plausible?

It does.
--
Michael




Re: Unable to `make install` on MacOS in the latest master (68a13f28be)

2019-01-04 Thread Michael Paquier
On Thu, Jan 03, 2019 at 11:27:34AM -0500, Andrew Alsup wrote:
> Thanks for the help on "make distclean". That did the trick. I will be
> more careful when pulling master. Somehow, I hadn't been hit with this
> before, which was just dumb luck. Thanks for helping me out.

A more drastic method is to run this from the top of the tree:
git clean -d -x -f

That's really efficient when using the git repository directly.
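(Running it with -n first, i.e. "git clean -d -x -n", previews what would be
removed without deleting anything.)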
--
Michael




Re: Use atexit() in initdb and pg_basebackup

2019-01-04 Thread Michael Paquier
On Fri, Jan 04, 2019 at 04:35:51PM -0300, Alvaro Herrera wrote:
> On 2018-Dec-29, Peter Eisentraut wrote:
>> @@ -3438,5 +3437,8 @@ main(int argc, char *argv[])
>>  
>>  destroyPQExpBuffer(start_db_cmd);
>>  
>> +/* prevent cleanup */
>> +made_new_pgdata = found_existing_pgdata = made_new_xlogdir = 
>> found_existing_xlogdir = false;
>> +
>>  return 0;
>>  }
> 
> This is a bit ugly, but meh.
> 
> Other than the first point, LGTM.

Re-meuh (the French version of "meh").  That's partly a problem of this patch,
because all those flags get reset.  I think it would be cleaner to replace all
those boolean flags with a simple bits16 or such, which would make the cleanup
reset much cleaner and less error-prone if more flag types are added in the
future.
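
Something along these lines is what I mean (sketch only; the flag and function
names here are just for illustration, not the actual initdb variables):

#include <stdint.h>

typedef uint16_t bits16;		/* c.h already provides this typedef */

#define MADE_NEW_PGDATA			((bits16) 1 << 0)
#define FOUND_EXISTING_PGDATA	((bits16) 1 << 1)
#define MADE_NEW_XLOGDIR		((bits16) 1 << 2)
#define FOUND_EXISTING_XLOGDIR	((bits16) 1 << 3)

static bits16 cleanup_flags = 0;

static void
cleanup_directories_atexit(void)
{
	if (cleanup_flags & MADE_NEW_PGDATA)
	{
		/* remove the data directory we created, e.g. rmtree(pg_data, true) */
	}
	/* ... likewise for the other three bits ... */
}

/* ... and at the end of a successful main(): cleanup_flags = 0; */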
--
Michael




Re: A few new options for vacuumdb

2019-01-04 Thread Michael Paquier
On Fri, Jan 04, 2019 at 11:49:46PM +, Bossart, Nathan wrote:
> 0004 introduces a slight change to existing behavior.  Currently, if
> you specify a missing table, vacuumdb will process each table until
> it reaches the nonexistent one, at which point it will fail.  After
> 0004 is applied, vacuumdb will fail during the catalog query, and no
> tables will be processed.

I have not looked at the patch set in details, but that would make
vacuumdb more consistent with the way VACUUM works with multiple
relations, which sounds like a good thing.
--
Michael




Re: commit fest app: Authors

2019-01-04 Thread Michael Paquier
On Fri, Jan 04, 2019 at 10:05:25AM -0500, Tom Lane wrote:
> Magnus Hagander  writes:
>> On Fri, Jan 4, 2019 at 12:26 PM Peter Eisentraut <
>> peter.eisentr...@2ndquadrant.com> wrote:
>>> Why does the commit fest app not automatically fill in the author for a
>>> new patch?
> 
>> I'm not sure it's a good idea to *enforce* it -- because you can't register
>> an author until they have logged into the CF app once. And I think
>> registering it with no author is better than registering it with the wrong
>> author.
> 
> Yeah --- I've not checked in detail, but I supposed that most/all of the
> cases of that correspond to authors with no community account to connect
> the patch to.

Agreed.  It is worse to not track a patch than to track it without an
author, who can be guessed from the thread attached to the patch
anyway...
--
Michael




Re: [PATCH] Log PostgreSQL version number on startup

2019-01-04 Thread Michael Paquier
On Fri, Jan 04, 2019 at 06:54:51PM -0500, Stephen Frost wrote:
> * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
>> Do we want to do the whole version string, or just "PostgreSQL 12devel"?
> 
> The whole thing.

I would prefer the whole string as well, since it is useful for checking
details beyond the minor version, such as a build tagged with an internal
build-system number added via configure's --with-extra-version option.
--
Michael




Re: ALTER TABLE with multiple SET NOT NULL

2019-01-04 Thread Tom Lane
Allison Kaptur  writes:
> I encountered a surprising error when writing a migration that both added a
> primary key to a table and added a new NOT NULL column. It threw the error "
> column "col_d" contains null values", even though I supplied a default. The
> migration looks like this:
> CREATE TABLE new_table AS SELECT col_a, col_b, col_c from existing_table;
> ALTER TABLE new_table
> ADD COLUMN col_d UUID UNIQUE NOT NULL DEFAULT uuid_generate_v4(),
> ADD PRIMARY KEY (col_a, col_b, col_c);

Hm, this can be made a good deal more self-contained:

regression=# create table t1 (a int);
CREATE TABLE
regression=# insert into t1 values(1);
INSERT 0 1
regression=# alter table t1 add column b float8 not null default random(),
add primary key(a);
ERROR:  column "b" contains null values

It fails like that as far back as I tried (8.4).  I'm guessing that we're
doing the ALTER steps in the wrong order, but haven't looked closer than
that.

Interestingly, in v11 and HEAD it works if you use a constant default,
suggesting that the fast-default feature is at least adjacent to the
problem.

regards, tom lane



Re: jsonpath

2019-01-04 Thread Alexander Korotkov
On Tue, Dec 4, 2018 at 2:23 AM Nikita Glukhov  wrote:
> 2) We define both DCH_FF# and DCH_ff#, but we never ever use the
> lower-case version. Heck, it's not mentioned even in DCH_keywords, which
> does this:
>
>   ...
>   {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>   ...
>   {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>   ...
>
> Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
> "day".
>
> Yes, "ff#" are mapped to DCH_FF# like "mi" is mapped DCH_MI.
>
> "Day", "day" are not mapped to DCH_DAY because they determine letter case in 
> the
> output, but "ff1" and "FF#" output contains only digits.

Right, DCH_poz is also the offset into the DCH_keywords array.  So, if the
array has an entry for "ff1", the enum should have a DCH_ff1 member in the
same position.

I got some other questions regarding this patchset.

1) Why do we parse FF7-FF9 if we don't support them anyway?  Without
defining FF7-FF9 we also wouldn't need to throw errors for them
everywhere, which would save some lines of code.

2) + DCH_to_char_fsec("%01d", in->fsec / INT64CONST(10));
Why do we use INT64CONST() here and in the similar places assuming
that fsec is only uint32?

3) wrapItem() is unused in
0002-Jsonpath-engine-and-operators-v21.patch, but used in
0006-Jsonpath-syntax-extensions-v21.patch.  Please, move it to
0006-Jsonpath-syntax-extensions-v21.patch?

4) I also got these couple of warning during compilation.

jsonpath_exec.c:1485:1: warning: unused function
'recursiveExecuteNested' [-Wunused-function]
recursiveExecuteNested(JsonPathExecContext *cxt, JsonPathItem *jsp,
^
1 warning generated.
jsonpath_scan.l:444:6: warning: implicit declaration of function
'jsonpath_yyparse' is invalid in C99 [-Wimplicit-function-declaration]
if (jsonpath_yyparse((void*)&parseresult) != 0)
^
1 warning generated.

Perhaps recursiveExecuteNested() is unused in this patchset.  It's
probably used by some subsequent SQL/JSON-related patchset, so please
move it there.

5) I think each usage of PG_TRY()/PG_CATCH() deserves comment
describing why it's safe to use without subtransaction.  In fact we
should just state that no function called inside performs data
modification.

--
Alexander Korotkov

Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Allow auto_explain to log to NOTICE

2019-01-04 Thread Michael Paquier
On Fri, Jan 04, 2019 at 02:45:55PM +0100, Daniel Gustafsson wrote:
> (Michael: sorry for not having responded to your comments on the patch, $life
> has had little time over for hacking lately)

No worries, I understand.

> There is that.  We might not be excited about writing tests for this
> contrib module but someone else might want to use this for testing
> their application in a similar manner to how pg_regress tests?

If we don't introduce it, there is no way to know.  Still, if somebody
needs to create regression tests they could just use EXPLAIN for the same
purpose.  However, the option still seems useful to me for getting plans
with the most generic output, so I support the idea.  If others don't feel
so, I am fine with giving up as well.
--
Michael




Re: [PATCH] Log PostgreSQL version number on startup

2019-01-04 Thread Stephen Frost
Greetings,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 21/11/2018 15:46, Christoph Berg wrote:
> > 2018-11-21 15:19:47.394 CET [24453] LOG:  starting PostgreSQL 12devel on 
> > x86_64-pc-linux-gnu, compiled by gcc (Debian 8.2.0-9) 8.2.0, 64-bit
> 
> Do we want to do the whole version string, or just "PostgreSQL 12devel"?

The whole thing.

Thanks!

Stephen




Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

2019-01-04 Thread Joerg Sonnenberger
On Fri, Jan 04, 2019 at 02:36:15PM -0800, Andres Freund wrote:
> Hi,
> 
> On 2019-01-04 16:43:39 -0500, Tom Lane wrote:
> > Joerg Sonnenberger  writes:
> > >> * What's the generator written in?  (if the answer's not "Perl", wedging
> > >> it into our build processes might be painful)
> >
> > > Plain C, nothing really fancy in it.
> >
> > That's actually a bigger problem than you might think, because it
> > doesn't fit in very nicely in a cross-compiling build: we might not
> > have any C compiler at hand that generates programs that can execute
> > on the build machine.  That's why we prefer Perl for tools that need
> > to execute during the build.  However, if the code is pretty small
> > and fast, maybe translating it to Perl is feasible.  Or perhaps
> > we could add sufficient autoconfiscation infrastructure to identify
> > a native C compiler.  It's not very likely that there isn't one,
> > but it is possible that nothing we learned about the configured
> > target compiler would apply to it :-(

There is a pre-made autoconf macro that does the basic glue for
CC_FOR_BUILD; it has already been used by various projects, including
libXt and friends.

> I think it might be ok if we included the output of the generator in the
> buildtree? Not being able to add keywords while cross-compiling sounds like
> an acceptable restriction to me.  I assume we'd likely grow further users
> of such a generator over time, and some of the input lists might be big
> enough that we'd not want to force it to be recomputed on every machine.

This is quite reasonable as well.  I wouldn't worry about the size of the
input list at all: processing the Webster dictionary takes less than 0.4s
on my laptop for 235k entries.

Joerg



Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

2019-01-04 Thread Andres Freund
Hi,

On 2019-01-04 16:43:39 -0500, Tom Lane wrote:
> Joerg Sonnenberger  writes:
> >> * What's the generator written in?  (if the answer's not "Perl", wedging
> >> it into our build processes might be painful)
>
> > Plain C, nothing really fancy in it.
>
> That's actually a bigger problem than you might think, because it
> doesn't fit in very nicely in a cross-compiling build: we might not
> have any C compiler at hand that generates programs that can execute
> on the build machine.  That's why we prefer Perl for tools that need
> to execute during the build.  However, if the code is pretty small
> and fast, maybe translating it to Perl is feasible.  Or perhaps
> we could add sufficient autoconfiscation infrastructure to identify
> a native C compiler.  It's not very likely that there isn't one,
> but it is possible that nothing we learned about the configured
> target compiler would apply to it :-(

I think it might be ok if we included the output of the generator in the
buildtree? Not being able to add keywords while cross-compiling sounds like
an acceptable restriction to me.  I assume we'd likely grow further users
of such a generator over time, and some of the input lists might be big
enough that we'd not want to force it to be recomputed on every machine.

Greetings,

Andres Freund



Re: Feature: triggers on materialized views

2019-01-04 Thread Mitar
Hi!

I am new to contributing to PostgreSQL and this is my first time having
patches in a commit fest, so I am not sure about the details of the process
here, but I assume that replying to and discussing the patch during this
period is one of the activities, so I am replying to the comment.  If I
should wait or something like that, please advise.

On Fri, Jan 4, 2019 at 3:23 AM Peter Eisentraut
 wrote:
> > A summary of the patch: This patch enables adding AFTER triggers (both
> > ROW and STATEMENT) on materialized views. They are fired when doing
> > REFRESH MATERIALIZED VIEW CONCURRENTLY for rows which have changed.
>
> What bothers me about this patch is that it subtly changes what a
> trigger means.  It currently means, say, INSERT was executed on this
> table.  You are expanding that to mean, a row was inserted into this
> table -- somehow.

Aren't almost all statements these days generated by some sort of automatic
logic, which generates those INSERTs, and then you get triggers on them?
I am not sure where this big difference in your view is coming from.
Triggers are not defined as "a user-made INSERT was executed on this table".
I think they have always been defined as "an INSERT modified this table",
no matter where that insert came from (from a user, from some other trigger,
from a backup process).  I mean, this is the beauty of declarative
programming: you define it once and you do not care who triggers it.

Materialized views are anyway just a built-in implementation of tables plus
triggers to rerun the query; you could reconstruct them manually.  So why
would triggers not be called when the table is being modified through
INSERTs?  If PostgreSQL has such a feature, why make it limited and
artificially less powerful?  Triggers are impossible here literally only
because there is an "if trigger on a materialized view, throw an exception"
check.

> Triggers should generally refer to user-facing commands

So triggers on table A are not run when some other trigger on table B
decides to insert data into table A?  Not true.  I think triggers have
never cared who or where an INSERT came from.  They just fire -- from a
user, from another trigger, or from some built-in PostgreSQL procedure
called REFRESH.

> Could you not make a trigger on REFRESH itself?

If you mean whether I could simulate this somehow before or after I call
REFRESH, then not really.  I would no longer have access to the previous
state of the table to compute the diff.  Moreover, I would have to
recompute the diff, which REFRESH already did once.

I could implement materialized views myself using regular tables and
triggers. And then have triggers after change on that table. But this
sounds very sad.

Or are you saying that we should introduce a whole new type of trigger, a
REFRESH trigger, which would be valid only on materialized views and would
get OLD and NEW relations for the previous and new state?  I think this
could be an option, but it would require much more work and more changes
to the API.  Is this what the community would prefer?

> This is also a problem, because it would allow bypassing the trigger
> accidentally.

Sure, this is why it is useful to explain that CONCURRENT REFRESH uses
INSERT/UPDATE/DELETE, which is why you get triggers, while plain REFRESH
does not (but is faster).  I explained this in the documentation.

But yes, this is a downside.  I looked into calling row-level triggers
after a regular REFRESH as well, but it seems it would introduce a lot of
overhead and special handling.  I tried implementing it as TRUNCATE +
INSERTs instead of a heap swap and it is 2x slower.

> Moreover, consider that there could be updatable materialized views,
> just like there are updatable normal views.  And there could be triggers
> on those updatable materialized views.  Those would look similar but
> work quite differently from what you are proposing here.

Hm, not really.  I would claim they would behave exactly the same.  An
AFTER INSERT trigger on a materialized view would fire for rows which have
changed, whether the user updated the materialized view directly or
CONCURRENT REFRESH inserted a row.  In both cases the same trigger runs
because a row was inserted into the materialized view.  Pretty nice.


Mitar

-- 
http://mitar.tnode.com/
https://twitter.com/mitar_m



Re: Delay locking partitions during query execution

2019-01-04 Thread David Rowley
On Sat, 5 Jan 2019 at 03:12, Tomas Vondra  wrote:
> >>
> >> partitions      0     100    1000   10000
> >> -----------------------------------------
> >> master         19    1590    2090     128
> >> patched        18    1780    6820    1130
> >>
> >> So, that's nice. I wonder why the throughput drops so fast between 1k
> >> and 10k partitions, but I'll look into that later.
> >
> > Those look strange. Why is it so slow with the non-partitioned case?
> > I'd have expected that to be the fastest result.
> >
>
> Because there are 1M rows in the table, and it's doing a seqscan.

Of course. My test did the same, but I didn't consider that because I
had so few rows per partition.  Adding an index would likely have made it
a more sensible comparison.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

2019-01-04 Thread Tom Lane
Joerg Sonnenberger  writes:
> On Fri, Jan 04, 2019 at 03:31:11PM -0500, Tom Lane wrote:
>> The sample hash function certainly looks great.  I'm not terribly on board
>> with using |0x20 as a substitute for lower-casing, but that's a minor
>> detail.

> Yeah, I've included that part more because I don't know the current use
> cases enough. If all instances are already doing lower-casing in
> advance, it is trivial to drop.

I think we probably don't need that, because we'd always need to generate
a lower-cased version of the input anyway: either to compare to the
potential keyword match, or to use as the normalized identifier if it
turns out not to be a keyword.  I don't think there are any cases where
it's useful to delay downcasing till after the keyword lookup.

>> * What's the generator written in?  (if the answer's not "Perl", wedging
>> it into our build processes might be painful)

> Plain C, nothing really fancy in it.

That's actually a bigger problem than you might think, because it
doesn't fit in very nicely in a cross-compiling build: we might not
have any C compiler at hand that generates programs that can execute
on the build machine.  That's why we prefer Perl for tools that need
to execute during the build.  However, if the code is pretty small
and fast, maybe translating it to Perl is feasible.  Or perhaps
we could add sufficient autoconfiscation infrastructure to identify
a native C compiler.  It's not very likely that there isn't one,
but it is possible that nothing we learned about the configured
target compiler would apply to it :-(

>> * What license is it under?

> Two clause BSD license.

OK, that works, at least.

regards, tom lane



Re: explain plans with information about (modified) gucs

2019-01-04 Thread Sergei Agalakov

Hi,

every now and then I have to investigate an execution plan that is
strange in some way and I can't reproduce the same behavior. Usually
it's simply due to data distribution changing since the problem was
observed (say, after a nightly batch load/update).

In many cases it however may be due to some local GUC tweaks, usually
addressing some query specific issues (say, disabling nested loops or
lowering join_collapse_limit). I've repeatedly run into cases where the
GUC was not properly reset to the "regular" value, and it's rather
difficult to identify this is what's happening. Or cases with different
per-user settings and connection pooling (SET SESSION AUTHORIZATION /
ROLE etc.).

So I propose to extend EXPLAIN output with an additional option, which
would include information about modified GUCs in the execution plan
(disabled by default, of course):

test=# explain (gucs) select * from t;
                      QUERY PLAN
------------------------------------------------------
 Seq Scan on t  (cost=0.00..35.50 rows=2550 width=4)
 GUCs: application_name = 'x', client_encoding = 'UTF8',
       cpu_tuple_cost = '0.01'
(2 rows)

Of course, this directly applies to auto_explain too, which gets a new
option log_gucs.

The patch is quite trivial, but there are about three open questions:

1) names of the options

I'm not particularly happy with calling the option "gucs" - it's an
acronym and many users have little idea what GUC stands for. So I think
a better name would be desirable, but I'm not sure what that would be.
Options? Parameters?

2) format of output

At this point the names/values are simply formatted into a one-line
string. That's not particularly readable, and it's not very useful for
the YAML/JSON formats I guess. So adding each modified GUC as an extra
text property would be better.
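
For the structured formats, each setting could become its own property via
the existing explain.c helpers, e.g. roughly (sketch only, from inside
explain.c; the "GUCs" grouping label is just an example):

	ExplainOpenGroup("GUCs", "GUCs", true, es);
	ExplainPropertyText("application_name", "x", es);
	ExplainPropertyText("cpu_tuple_cost", "0.01", es);
	ExplainCloseGroup("GUCs", "GUCs", true, es);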

3) identifying modified (and interesting) GUCs

We certainly don't want to include all GUCs, so the question is how to
decide which GUCs are interesting. The simplest approach would be to
look for GUCs that changed in the session (source == PGC_S_SESSION), but
that does not help with SET SESSION AUTHORIZATION / ROLE cases. So we
probably want (source > PGC_S_ARGV), but probably not PGC_S_OVERRIDE
because that includes irrelevant options like wal_buffers etc.

For now I've used

  /* return only options that were modified (not as in config file) */
  if ((conf->source <= PGC_S_ARGV) || (conf->source == PGC_S_OVERRIDE))
continue;

which generally does the right thing, although it also includes stuff
like application_name or client_encoding. But perhaps it'd be better to
whitelist the GUCs in some way, because some of the user-defined GUCs
may be sensitive and should not be included in plans.

Opinions?


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


This patch correlates with my proposal
"add session information column to pg_stat_statements"
https://www.postgresql.org/message-id/3aa097d7-7c47-187b-5913-db8366cd4491%40gmail.com
Both address the problem of identifying the factors that lead to different
execution plans for the same SQL statement.  You are interested in the
current settings that affect the execution plan; I'm concerned about
historical data in pg_stat_statements.  From my experience the settings that
most often cause trouble are current_schemas/search_path and current_user.

Please keep in mind that the approach you use to extend the explain plan
functionality will probably eventually be used to extend pg_stat_statements
as well.  I think the list of GUCs reported by explain plan should be
structured like JSON, something like

extended_settings:
{
  "current_schemas" : ["pg_catalog", "s1", "s2", "public"],
  "current_user" : "user1",
  "enable_nestloop" : "off",
  "work_mem" : "32MB"
}

This is less important for your explain use case, but it is important for
the pg_stat_statements case.  pg_stat_statements is often accessed by
monitoring and reporting tools, and it would be good to have the data there
in a structured, easily parsed format.


Thank you,

Sergei Agalakov



Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-04 Thread Tom Lane
I wrote:
> If you reverse out just that change you'll see why I added it: without it,
> the plan for the earlier "test a corner case in which we shouldn't apply
> the star-schema optimization" isn't optimized as much as I thought it
> should be.

Hmm ... looking at this closer, it seems like this patch probably breaks
what that regression test case was actually meant to test, ie once we've
deleted the VALUES subselects from the jointree, it's likely that the
planner join-ordering mistake it was testing for can no longer happen,
because the join just plain doesn't exist.

I'll plan to deal with that by running the test case with actual small
tables instead of VALUES subselects.  It might be useful to run the test
case in its current form as an exercise for the RTE_RESULT optimizations,
but that would be separate from its current intention.

regards, tom lane



Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

2019-01-04 Thread Joerg Sonnenberger
On Fri, Jan 04, 2019 at 03:31:11PM -0500, Tom Lane wrote:
> John Naylor  writes:
> > On 1/3/19, Joerg Sonnenberger  wrote:
> >> I was pointed at your patch on IRC and decided to look into adding my
> >> own pieces. What I can provide you is a fast perfect hash function
> >> generator.  I've attached a sample hash function based on the current
> >> main keyword list. hash() essentially gives you the number of the only
> >> possible match, a final strcmp/memcmp is still necessary to verify that
> >> it is an actual keyword though. The |0x20 can be dropped if all cases
> >> have pre-lower-cased the input already. This would replace the binary
> >> search in the lookup functions. Returning offsets directly would be easy
> >> as well. That allows writing a single string where each entry is prefixed
> >> with a type mask, the token id, the length of the keyword and the actual
> >> keyword text. Does that sound useful to you?
> 
> > Judging by previous responses, there is still interest in using
> > perfect hash functions, so thanks for this. I'm not knowledgeable
> > enough to judge its implementation, so I'll leave that for others.
> 
> We haven't actually seen the implementation, so it's hard to judge ;-).

It's a temporary hacked version of nbperf in the NetBSD tree.

> The sample hash function certainly looks great.  I'm not terribly on board
> with using |0x20 as a substitute for lower-casing, but that's a minor
> detail.

Yeah, I've included that part more because I don't know the current use
cases enough. If all instances are already doing lower-casing in
advance, it is trivial to drop.

> The foremost questions in my mind are:
> 
> * What's the generator written in?  (if the answer's not "Perl", wedging
> it into our build processes might be painful)

Plain C, nothing really fancy in it.

> * What license is it under?

Two clause BSD license.

> * Does it always succeed in producing a single-level lookup table?

This question is a bit tricky.  The short answer is: yes.  The longer
answer: the chosen hash function in the example is very simple (e.g.
just two variations of a DJB-style hash), so with that: no, not without
potentially fiddling a bit with the hash function if things ever get
nasty, like two keywords hitting a funnel for both variants.  The
main concern for the choice was speed.  When using two families of
independent hash functions, the generator requires probabilistic linear
time in the number of keys.  That means that with a strong enough hash
function like the Jenkins hash used elsewhere in PG, it will succeed
very fast.  So if it fails on new keywords, making the mixing a bit
stronger should be enough.
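
For illustration, the kind of lookup this enables could look roughly like the
sketch below.  Everything in it is invented for the example -- a toy hash over
three keywords standing in for the generated function, arbitrary token
numbers -- so the real generator output would of course differ:

#include <string.h>

/* keywords concatenated, each entry prefixed with its length byte */
static const char kw_string[] = "\x06" "action" "\x03" "add" "\x03" "abs";
static const unsigned short kw_offsets[] = {0, 7, 11};	/* indexed by hash() */
static const short kw_tokens[] = {101, 102, 100};		/* arbitrary token ids */

/* toy stand-in for the generated perfect hash; expects a NUL-terminated word */
static unsigned int
hash(const char *word)
{
	return (unsigned char) word[1] % 3;
}

static int
keyword_lookup(const char *word, size_t len)
{
	const char *entry = kw_string + kw_offsets[hash(word)];

	/* hash() only proposes a candidate; verify before believing it */
	if ((unsigned char) entry[0] == len && memcmp(entry + 1, word, len) == 0)
		return kw_tokens[hash(word)];
	return -1;					/* not a keyword */
}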

Joerg



Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

2019-01-04 Thread John Naylor
On 12/27/18, Tom Lane  wrote:
> If you really are hot about saving that other 440 bytes, the way to
> do it would be to drop the struct entirely and use two parallel
> arrays, an int16[] for value and a char[] (or better uint8[]) for
> category.  Those would be filled by reading kwlist.h twice with
> different definitions for PG_KEYWORD.  Not sure it's worth the
> trouble though --- in particular, not clear that it's a win from
> the standpoint of number of cache lines touched.

Understood. That said, after re-implementing all keyword lookups, I
wondered if there'd be a notational benefit to dropping the struct,
especially since as yet no caller uses both token and category. It
makes pl_scanner.c and its reserved keyword list a bit nicer, and gets
rid of the need to force frontend to have 'zero' token numbers, but
I'm not sure it's a clear win. I've attached a patch (applies on top
of v6), gzipped to avoid confusing the cfbot.

-John Naylor


keyword-nostruct.patch.gz
Description: GNU Zip compressed data


Re: pgsql: Remove WITH OIDS support, change oid catalog column visibility.

2019-01-04 Thread Andres Freund
Hi,

On 2019-01-03 13:40:42 -0500, Tom Lane wrote:
> I noticed that this patch has broken restores of existing dump files:
> 
> psql:testbed.public.backup:82: ERROR:  unrecognized configuration parameter 
> "default_with_oids"
> 
> Quite aside from the fact that this won't work if the user tries to
> restore in single-transaction or no-error mode, this is really really
> unhelpful because it's impossible to tell whether the error is
> ignorable or not, except by digging through the dump file.
> 
> What you should do is restore that GUC, but add a check-hook that throws
> an error for an attempt to set it to "true".

Makes sense, I (or a colleague) will look into that next week. Wanna get my
head out of pluggable storage first...
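
For reference, the kind of check hook meant is roughly the sketch below
(assuming the usual bool-GUC hook signature from guc.h; the hook name and the
error wording here are just placeholders):

static bool
check_default_with_oids(bool *newval, void **extra, GucSource source)
{
	if (*newval)
	{
		/* WITH OIDS is gone, so only "false" can be accepted */
		GUC_check_errcode(ERRCODE_FEATURE_NOT_SUPPORTED);
		GUC_check_errmsg("tables declared WITH OIDS are not supported");
		return false;
	}
	return true;
}

plus re-adding a default_with_oids entry to ConfigureNamesBool in guc.c that
points at it.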

Greetings,

Andres Freund



Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

2019-01-04 Thread Tom Lane
Andres Freund  writes:
> On 2019-01-04 12:26:18 -0500, Tom Lane wrote:
>> I'm wondering where we want to go with this.  Is anyone excited
>> about pursuing the perfect-hash-function idea?  (Joerg's example
>> function looked pretty neat to me.)  If we are going to do that,
>> does it make sense to push this version beforehand?

> I think it does make sense to push this version beforehand. Most of
> the code would be needed anyway, so it's not like this is going to
> cause a lot of churn.

Yeah, I'm leaning in that direction too, first on the grounds of
"don't let the perfect be the enemy of the good", and second because
if we do end up with perfect hashing, we'd still need a table-generation
step.  The build infrastructure this adds would support a generator
that produces perfect hashes just as well as what this is doing,
even if we end up having to whack the API of ScanKeywordLookup around
some more.  So barring objections, I'll have a look at pushing this,
and then we can think about using perfect hashing instead.

regards, tom lane



Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

2019-01-04 Thread Tom Lane
John Naylor  writes:
> On 1/3/19, Joerg Sonnenberger  wrote:
>> I was pointed at your patch on IRC and decided to look into adding my
>> own pieces. What I can provide you is a fast perfect hash function
>> generator.  I've attached a sample hash function based on the current
>> main keyword list. hash() essentially gives you the number of the only
>> possible match, a final strcmp/memcmp is still necessary to verify that
>> it is an actual keyword though. The |0x20 can be dropped if all cases
>> have pre-lower-cased the input already. This would replace the binary
>> search in the lookup functions. Returning offsets directly would be easy
>> as well. That allows writing a single string where each entry is prefixed
>> with a type mask, the token id, the length of the keyword and the actual
>> keyword text. Does that sound useful to you?

> Judging by previous responses, there is still interest in using
> perfect hash functions, so thanks for this. I'm not knowledgeable
> enough to judge its implementation, so I'll leave that for others.

We haven't actually seen the implementation, so it's hard to judge ;-).

The sample hash function certainly looks great.  I'm not terribly on board
with using |0x20 as a substitute for lower-casing, but that's a minor
detail.

The foremost questions in my mind are:

* What's the generator written in?  (if the answer's not "Perl", wedging
it into our build processes might be painful)

* What license is it under?

* Does it always succeed in producing a single-level lookup table?

regards, tom lane



Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

2019-01-04 Thread Andres Freund
On 2019-01-04 12:26:18 -0500, Tom Lane wrote:
> I wrote:
> > Andres Freund  writes:
> >> On 2018-12-29 16:59:52 -0500, John Naylor wrote:
> >>> I think 0001 with complete keyword lookup replacement is in decent
> >>> enough shape to post. Make check-world passes. A few notes and
> >>> caveats:
> 
> >> I tried to take this for a spin, and for me the build fails because various
> >> frontend programs don't have KeywordOffsets/Strings defined, but reference it
> >> through various functions exposed to the frontend (like fmtId()).  That I see
> >> that error but you don't is probably related to me using -fuse-ld=gold in
> >> CFLAGS.
> 
> > I was just about to point out that the cfbot is seeing that too ...
> 
> Aside from the possible linkage problem, this will need a minor rebase
> over 4879a5172, which rearranged some of plpgsql's calls of
> ScanKeywordLookup.
> 
> While I don't think it's going to be hard to resolve these issues,
> I'm wondering where we want to go with this.  Is anyone excited
> about pursuing the perfect-hash-function idea?  (Joerg's example
> function looked pretty neat to me.)  If we are going to do that,
> does it make sense to push this version beforehand?

I think it does make sense to push this version beforehand. Most of
the code would be needed anyway, so it's not like this is going to
cause a lot of churn.

Greetings,

Andres Freund



Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

2019-01-04 Thread John Naylor
On 1/3/19, Joerg Sonnenberger  wrote:
> Hello John,
> I was pointed at your patch on IRC and decided to look into adding my
> own pieces. What I can provide you is a fast perfect hash function
> generator.  I've attached a sample hash function based on the current
> main keyword list. hash() essentially gives you the number of the only
> possible match, a final strcmp/memcmp is still necessary to verify that
> it is an actual keyword though. The |0x20 can be dropped if all cases
> have pre-lower-cased the input already. This would replace the binary
> search in the lookup functions. Returning offsets directly would be easy
> as well. That allows writing a single string where each entry is prefixed
> with a type mask, the token id, the length of the keyword and the actual
> keyword text. Does that sound useful to you?

Judging by previous responses, there is still interest in using
perfect hash functions, so thanks for this. I'm not knowledgeable
enough to judge its implementation, so I'll leave that for others.

-John Naylor



Re: Arrays of domain returned to client as non-builtin oid describing the array, not the base array type's oid

2019-01-04 Thread Tom Lane
Alvaro Herrera  writes:
> On 2019-Jan-02, James Robinson wrote:
>> So -- do others find this inconsistent, or is it just me and I should
>> work on having psycopg2 be able to learn the type mapping itself if I
>> don't want to do SQL-side casts? I'll argue that if scalar projections
>> erase the domain's oid, then array projections ought to as well.

> Sounds reasonable.

> Do you have a link to a previous discussion that rejected changing the
> returned OID to that of the domain array?  I want to know what the argument
> is, other than backwards compatibility.

TBH I doubt it was ever discussed; I don't recall having thought about
doing that while working on c12d570fa.

> Disregarding the size/shape of a patch to change this, I wonder what's
> the cost of the change.

It could be kind of expensive.  The only way to find out whether an array
is over a domain type is to drill down to the element type and see.  Then
if it is, we'd have to drill down to the domain base type, after which we
could use its typarray field.  So that means at least one additional
syscache lookup each time we determine which type OID to report.

I think there are also corner cases to worry about, in particular what
if the base type lacks a typarray entry?  This would happen at least
for domains over arrays.  We don't have arrays of arrays according to
the type system, but arrays of domains over arrays allow you to kind
of fake that.  I don't see a way to report a valid description of the
data type while still abstracting out the domain in that case.
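
For illustration, the lookups that implies would be roughly the following
(sketch only, using the usual backend catalog-lookup helpers; not a proposed
patch, and the function name is invented):

/* Assumes the usual backend headers (lsyscache.h, parse_coerce.h, pg_type.h). */
static Oid
report_type_for_array(Oid typid)
{
	Oid			elemtype = get_element_type(typid);

	if (OidIsValid(elemtype) && get_typtype(elemtype) == TYPTYPE_DOMAIN)
	{
		Oid			basetype = getBaseType(elemtype);
		Oid			basearray = get_array_type(basetype);

		/* domains over arrays may have no typarray; keep the original then */
		if (OidIsValid(basearray))
			return basearray;
	}
	return typid;
}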

> I mean, how many clients are going to be broken
> if we change it?

This possibility only came in with v11, so probably there are few if any
use-cases of arrays-of-domains in the wild yet, and few or no clients
with intelligence about it.  I don't think that backwards compatibility
would be a show-stopper argument against changing it, if we could satisfy
ourselves about the above points.

Having said that: in the end, the business of flattening scalar domains
was mainly meant to help simple clients handle simple cases simply.
I'm not sure that array cases fall into that category at all, so I'm
not that excited about adding complexity/cycles for this.

regards, tom lane



Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

2019-01-04 Thread John Naylor
On 12/30/18, Andres Freund  wrote:
> I tried to take this for a spin, and for me the build fails because various
> frontend programs don't have KeywordOffsets/Strings defined, but reference it
> through various functions exposed to the frontend (like fmtId()).  That I see
> that error but you don't is probably related to me using -fuse-ld=gold in
> CFLAGS.
>
> I can "fix" this by including kwlist_d.h in common/keywords.c
> regardless of FRONTEND. That also lead me to discover that the build
> dependencies somewhere aren't correctly set-up, because I need to
> force a clean rebuild to trigger the problem again, just changing
> keywords.c back doesn't trigger the problem.

Hmm, that was a typo, and I didn't notice even when I found I had to
include kwlist_d.h in ecpg/keywords.c. :-(  I've fixed both of those
in the attached v6.

As far as dependencies, I'm far from sure I have it up to par. That
piece could use some discussion.

On 1/4/19, Tom Lane  wrote:
> Aside from the possible linkage problem, this will need a minor rebase
> over 4879a5172, which rearranged some of plpgsql's calls of
> ScanKeywordLookup.
>
> While I don't think it's going to be hard to resolve these issues,
> I'm wondering where we want to go with this.  Is anyone excited
> about pursuing the perfect-hash-function idea?  (Joerg's example
> function looked pretty neat to me.)  If we are going to do that,
> does it make sense to push this version beforehand?

If it does, for v6 I've also done the rebase, updated the copyright
year, and fixed an error in MSVC.

-John Naylor
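
(For readers skimming the digest: the lookup the patch implements is roughly
the sketch below, reusing the KeywordString/KeywordOffsets names that appear
in the diff; see the patch itself for the real code.)

#include <stdint.h>
#include <string.h>

static int
keyword_lookup(const char *word,
			   const char *KeywordString,
			   const uint16_t *KeywordOffsets,
			   int NumKeywords)
{
	int			low = 0;
	int			high = NumKeywords - 1;

	while (low <= high)
	{
		int			middle = (low + high) / 2;
		int			cmp = strcmp(word, KeywordString + KeywordOffsets[middle]);

		if (cmp == 0)
			return middle;		/* index into the per-keyword token/category data */
		if (cmp < 0)
			high = middle - 1;
		else
			low = middle + 1;
	}
	return -1;					/* not a keyword */
}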
From fb846eeabd450b6932e57f7e8d4f0aebc125d178 Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Fri, 4 Jan 2019 14:35:40 -0500
Subject: [PATCH v6] Use offset-based keyword lookup.

Replace binary search over an array of ScanKeyword structs with a binary
search over an array of offsets into a keyword string. Access auxillary
data only after a keyword hit. This has better locality of reference and
a smaller memory footprint.
---
 .../pg_stat_statements/pg_stat_statements.c   |   2 +
 src/backend/parser/parser.c   |   4 +-
 src/backend/parser/scan.l |  38 ++--
 src/backend/utils/adt/misc.c  |   2 +-
 src/backend/utils/adt/ruleutils.c |   9 +-
 src/common/.gitignore |   1 +
 src/common/Makefile   |  25 ++-
 src/common/keywords.c |  42 +++--
 src/fe_utils/string_utils.c   |   9 +-
 src/include/common/keywords.h |  19 +-
 src/include/parser/kwlist.h   |   2 +-
 src/include/parser/scanner.h  |   8 +-
 src/interfaces/ecpg/preproc/.gitignore|   2 +
 src/interfaces/ecpg/preproc/Makefile  |  20 ++-
 src/interfaces/ecpg/preproc/c_keywords.c  |  65 ++-
 src/interfaces/ecpg/preproc/c_kwlist.h|  53 ++
 src/interfaces/ecpg/preproc/ecpg_keywords.c   |  81 ++---
 src/interfaces/ecpg/preproc/ecpg_kwlist.h |  68 +++
 src/interfaces/ecpg/preproc/keywords.c|   6 +-
 src/interfaces/ecpg/preproc/pgc.l |  22 +--
 src/interfaces/ecpg/preproc/preproc_extern.h  |   8 +-
 src/pl/plpgsql/src/.gitignore |   2 +
 src/pl/plpgsql/src/Makefile   |  16 +-
 src/pl/plpgsql/src/pl_reserved_kwlist.h   |  55 ++
 src/pl/plpgsql/src/pl_scanner.c   | 167 --
 src/pl/plpgsql/src/pl_unreserved_kwlist.h | 111 
 src/tools/gen_keywords.pl | 136 ++
 src/tools/msvc/Solution.pm|  44 +
 src/tools/msvc/clean.bat  |   6 +
 29 files changed, 696 insertions(+), 327 deletions(-)
 create mode 100644 src/common/.gitignore
 create mode 100644 src/interfaces/ecpg/preproc/c_kwlist.h
 create mode 100644 src/interfaces/ecpg/preproc/ecpg_kwlist.h
 create mode 100644 src/pl/plpgsql/src/pl_reserved_kwlist.h
 create mode 100644 src/pl/plpgsql/src/pl_unreserved_kwlist.h
 create mode 100644 src/tools/gen_keywords.pl

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index e8ef966bb5..4776de7595 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -3076,6 +3076,8 @@ fill_in_constant_lengths(pgssJumbleState *jstate, const char *query,
 	yyscanner = scanner_init(query,
 			 &yyextra,
 			 ScanKeywords,
+			 KeywordString,
+			 KeywordOffsets,
 			 NumScanKeywords);
 
 	/* we don't want to re-emit any escape string warnings */
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index 7e9b1222fd..19fa7bb8ae 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -40,8 +40,8 @@ raw_parser(const char *str)
 	int			yyresult;
 
 	/* initialize the flex scanner */
-	yyscanner = scanner_init(str, &yyextra.core_yy_extra,
-			 ScanKeywords, Nu

Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-04 Thread Tom Lane
David Rowley  writes:
> I've just looked over the v4 patch. I agree that having an RTE for a
> from-less SELECT seems like a good idea.

Thanks for looking!

> While reading the patch, I noted the following:
> 1. It's more efficient to use bms_next_member as it does not need to
> re-skip 0 words on each iteration. (Likely bms_first_member() is not
> needed anywhere in the code base)

Sure.  As the comment says, this isn't meant to be super efficient for
multiple removable RTEs, but we might as well use the other API.

> 2. The following comment seems to indicate that we can go ahead and
> make parallelise the result processing, but the code still defers to
> the checks below and may still end up not parallelising if say,
> there's a non-parallel safe function call in the SELECT's target list.

Adjusted the comment.

> 3. You may as well just ditch the variable and just do:
> Assert(rel->relid > 0);
> Assert(planner_rt_fetch(rel->relid, root)->rtekind == RTE_RESULT);
> instead of:
> RangeTblEntry *rte PG_USED_FOR_ASSERTS_ONLY;
> There are a few other cases doing just that in costsize.c

Meh.  I'm not really on board with doing it that way, as it'll just
mean more to change if the code is ever modified to look at other
fields of the RTE.  Still, I see your point that other places in
costsize.c are doing it without PG_USED_FOR_ASSERTS_ONLY, so changed.

> 4. I don't quite understand why this changed in join.out

> @@ -3596,7 +3588,7 @@ select t1.* from
>> Hash Right Join
>   Output: i8.q2
>   Hash Cond: ((NULL::integer) = i8b1.q2)
> - ->  Hash Left Join
> + ->  Hash Join

> Can you explain why that's valid?

Excellent question.  The reason that plan changed is the logic I added
in find_nonnullable_rels_walker:

+ * If the PHV's syntactic scope is exactly one rel, it will be forced
+ * to be evaluated at that rel, and so it will behave like a Var of
+ * that rel: if the rel's entire output goes to null, so will the PHV.

In this case there's a PHV wrapped around b2.d2, and this change allows
reduce_outer_joins_pass2 to detect that the i8-to-b2 left join can
be simplified to an inner join because the qual just above it (on the
left join of b1 to i8/b2) is strict for b2; that is, the condition
(b2.d2 = b1.q2) cannot succeed for a null-extended i8/b2 row.

If you reverse out just that change you'll see why I added it: without it,
the plan for the earlier "test a corner case in which we shouldn't apply
the star-schema optimization" isn't optimized as much as I thought it
should be.

v5 attached; this responds to your comments plus Alexander's earlier
gripe about not getting a clean build with --disable-cassert.
No really substantive changes though.

regards, tom lane

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index e8ef966..6696f92 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2404,6 +2404,8 @@ JumbleRangeTable(pgssJumbleState *jstate, List *rtable)
 			case RTE_NAMEDTUPLESTORE:
 APP_JUMB_STRING(rte->enrname);
 break;
+			case RTE_RESULT:
+break;
 			default:
 elog(ERROR, "unrecognized RTE kind: %d", (int) rte->rtekind);
 break;
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index bb92d9d..b3894d0 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -5361,7 +5361,7 @@ INSERT INTO ft2 (c1,c2,c3) VALUES (1200,999,'foo') RETURNING tableoid::regclass;
QUERY PLAN
 -
  Insert on public.ft2
-   Output: (tableoid)::regclass
+   Output: (ft2.tableoid)::regclass
Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
->  Result
  Output: 1200, 999, NULL::integer, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone, NULL::character varying, 'ft2   '::character(10), NULL::user_enum
diff --git a/src/backend/executor/execAmi.c b/src/backend/executor/execAmi.c
index 12cb18c..cd77f99 100644
--- a/src/backend/executor/execAmi.c
+++ b/src/backend/executor/execAmi.c
@@ -437,9 +437,12 @@ ExecSupportsMarkRestore(Path *pathnode)
 return ExecSupportsMarkRestore(((ProjectionPath *) pathnode)->subpath);
 			else if (IsA(pathnode, MinMaxAggPath))
 return false;	/* childless Result */
+			else if (IsA(pathnode, GroupResultPath))
+return 

Re: Use atexit() in initdb and pg_basebackup

2019-01-04 Thread Alvaro Herrera
On 2018-Dec-29, Peter Eisentraut wrote:

> @@ -387,6 +388,7 @@ StreamLog(void)
>   if (!conn)
>   /* Error message already written in GetConnection() */
>   return;
> + atexit(disconnect_atexit);
>  
>   if (!CheckServerVersionForStreaming(conn))
>   {

Seems you're registering the atexit cb twice here; you should only do so
in the first "!conn" block.

It would be nicer to be able to call atexit() in GetConnection() instead
of at each callsite, but that would require a place to save each conn
struct into, which is probably more work than warranted.
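
For illustration, one way that could look (sketch only; the function name and
error handling are simplified, not the actual pg_basebackup code):

#include <stdlib.h>
#include "libpq-fe.h"

static PGconn *atexit_conn = NULL;

static void
disconnect_atexit(void)
{
	if (atexit_conn != NULL)
		PQfinish(atexit_conn);
}

static PGconn *
get_connection_sketch(const char *conninfo)
{
	PGconn	   *conn = PQconnectdb(conninfo);

	if (PQstatus(conn) != CONNECTION_OK)
	{
		PQfinish(conn);
		return NULL;
	}
	if (atexit_conn == NULL)
		atexit(disconnect_atexit);	/* register the callback exactly once */
	atexit_conn = conn;				/* remember the connection to close */
	return conn;
}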

> @@ -3438,5 +3437,8 @@ main(int argc, char *argv[])
>  
>   destroyPQExpBuffer(start_db_cmd);
>  
> + /* prevent cleanup */
> + made_new_pgdata = found_existing_pgdata = made_new_xlogdir = 
> found_existing_xlogdir = false;
> +
>   return 0;
>  }

This is a bit ugly, but meh.

Other than the first point, LGTM.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Arrays of domain returned to client as non-builtin oid describing the array, not the base array type's oid

2019-01-04 Thread Alvaro Herrera
On 2019-Jan-02, James Robinson wrote:

> So -- do others find this inconsistent, or is it just me and I should
> work on having psycopg2 be able to learn the type mapping itself if I
> don't want to do SQL-side casts? I'll argue that if scalar projections
> erase the domain's oid, then array projections ought to as well.

Sounds reasonable.

Do you have a link to a previous discussion that rejected changing the
returned OID to that of the domain array?  I want to know what the argument
is, other than backwards compatibility.

Disregarding the size/shape of a patch to change this, I wonder what's
the cost of the change.  I mean, how many clients are going to be broken
if we change it?  And by contrast, how many apps are going to work
better with array-on-domain if we change it?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

2019-01-04 Thread Peter Geoghegan
Hi Alexander,

On Fri, Jan 4, 2019 at 7:40 AM Alexander Korotkov
 wrote:
> I'm starting to look at this patchset.  Not ready to post detail
> review, but have a couple of questions.

Thanks for taking a look!

> Yes, it shouldn't be too hard, but it seems like we have to keep two
> branches of code for different handling of duplicates.  Is that true?

Not really. If you take a look at v9, you'll see the approach I've
taken is to make insertion scan keys aware of which rules apply (the
"heapkeyspace" field field controls this). I think that there are
about 5 "if" statements for that outside of amcheck. It's pretty
manageable.

I like to imagine that the existing code already has unique keys, but
nobody ever gets to look at the final attribute. It works that way
most of the time -- the only exception is insertion with user keys
that aren't unique already. Note that the way we move left on equal
pivot tuples, rather than right (rather than following the pivot's
downlink) wasn't invented by Postgres to deal with the lack of unique
keys. That's actually a part of the Lehman and Yao design itself.
Almost all of the special cases are optimizations rather than truly
necessary infrastructure.

> I didn't get the point of this paragraph.  Does it might happen that
> first right tuple is under tuple size restriction, but new pivot tuple
> is beyond that restriction?  If so, would we have an error because of
> too long pivot tuple?  If not, I think this needs to be explained
> better.

The v9 version of the function _bt_check_third_page() shows what it
means (comments on this will be improved in v10, too).  The old limit
of 2712 bytes still applies to pivot tuples, while a new, lower limit
of 2704 bytes applies to non-pivot tuples.  This difference is
necessary because an extra MAXALIGN() quantum could be needed to add a
heap TID to a pivot tuple during truncation in the worst case.  To
users, the limit is 2704 bytes, because that's the limit that actually
needs to be enforced during insertion.
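
For concreteness, with the default 8 kB block size the arithmetic works out
roughly as below (a back-of-the-envelope of the BTMaxItemSize computation,
assuming 8-byte MAXALIGN, a 24-byte page header, three 4-byte line pointers,
16 bytes of btree special space, and a 6-byte heap TID):

    (8192 - MAXALIGN(24 + 12) - MAXALIGN(16)) / 3 = (8192 - 40 - 16) / 3 = 2712
    2712 - MAXALIGN(6) = 2712 - 8 = 2704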

We never actually say "1/3 of a page means 2704 bytes" in the docs,
since the definition was always a bit fuzzy. There will need to be a
compatibility note in the release notes, though.
-- 
Peter Geoghegan



Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

2019-01-04 Thread Tom Lane
I wrote:
> Andres Freund  writes:
>> On 2018-12-29 16:59:52 -0500, John Naylor wrote:
>>> I think 0001 with complete keyword lookup replacement is in decent
>>> enough shape to post. Make check-world passes. A few notes and
>>> caveats:

> >> I tried to take this for a spin, and for me the build fails because various
>> frontend programs don't have KeywordOffsets/Strings defined, but reference it
>> through various functions exposed to the frontend (like fmtId()).  That I see
>> that error but you don't is probably related to me using -fuse-ld=gold in
>> CFLAGS.

> I was just about to point out that the cfbot is seeing that too ...

Aside from the possible linkage problem, this will need a minor rebase
over 4879a5172, which rearranged some of plpgsql's calls of
ScanKeywordLookup.

While I don't think it's going to be hard to resolve these issues,
I'm wondering where we want to go with this.  Is anyone excited
about pursuing the perfect-hash-function idea?  (Joerg's example
function looked pretty neat to me.)  If we are going to do that,
does it make sense to push this version beforehand?

regards, tom lane



Re: Potentially undocumented behaviour change in Postgres 11 concerning OLD record in an after insert trigger

2019-01-04 Thread Pavel Stehule
pá 4. 1. 2019 v 17:44 odesílatel Tom Lane  napsal:

> Kristjan Tammekivi  writes:
> > I've noticed a change in the behaviour in triggers / hstores in Postgres
> > 11.1 when compared to Postgres 10.5.
> > [ reference to OLD in an insert trigger doesn't throw error anymore ]
>
> Hmm.  This seems to be a side effect of the changes we (I) made in v11
> to rationalize the handling of NULL vs ROW(NULL,NULL,...) composite
> values in plpgsql.  The "unassigned" trigger row variables are now
> acting as though they are plain NULL values.  I'm not sure now whether
> I realized that this would happen --- it looks like there are not and
> were not any regression test cases that would throw an error for the
> disallowed-reference case, so it's quite possible that it just escaped
> notice.
>
> The old behavior was pretty darn squirrely; in particular it would let
> you parse but not execute a reference to "OLD.column", a behavior that
> could not occur for any other composite variable.  Now that'll just
> return NULL.  In a green field I don't think there'd be complaints
> about this behavior --- I know lots of people have spent considerable
> effort programming around the other behavior.
>
> While I haven't looked closely, I think duplicating the old behavior
> would require adding a special-purpose flag to plpgsql DTYPE_REC
> variables, which'd cost a little performance (extra error checks
> in very hot code paths) and possibly break compatibility with
> pldebugger, if there's been a v11 release of that.
>
> So I'm a bit inclined to accept this behavior change and adjust
> the documentation to say that OLD/NEW are "null" rather than
> "unassigned" when not relevant.
>

It is maybe an unwanted effect, but it is not bad from my view.  The new
behavior is consistent - the initial value of the variables is just NULL.

+1

Pavel


> Thoughts?
>
> regards, tom lane
>
>


Re: Potentially undocumented behaviour change in Postgres 11 concerning OLD record in an after insert trigger

2019-01-04 Thread Tom Lane
Kristjan Tammekivi  writes:
> I've noticed a change in the behaviour in triggers / hstores in Postgres
> 11.1 when compared to Postgres 10.5.
> [ reference to OLD in an insert trigger doesn't throw error anymore ]

Hmm.  This seems to be a side effect of the changes we (I) made in v11
to rationalize the handling of NULL vs ROW(NULL,NULL,...) composite
values in plpgsql.  The "unassigned" trigger row variables are now
acting as though they are plain NULL values.  I'm not sure now whether
I realized that this would happen --- it looks like there are not and
were not any regression test cases that would throw an error for the
disallowed-reference case, so it's quite possible that it just escaped
notice.

The old behavior was pretty darn squirrely; in particular it would let
you parse but not execute a reference to "OLD.column", a behavior that
could not occur for any other composite variable.  Now that'll just
return NULL.  In a green field I don't think there'd be complaints
about this behavior --- I know lots of people have spent considerable
effort programming around the other behavior.

While I haven't looked closely, I think duplicating the old behavior
would require adding a special-purpose flag to plpgsql DTYPE_REC
variables, which'd cost a little performance (extra error checks
in very hot code paths) and possibly break compatibility with
pldebugger, if there's been a v11 release of that.

So I'm a bit inclined to accept this behavior change and adjust
the documentation to say that OLD/NEW are "null" rather than
"unassigned" when not relevant.

Thoughts?

regards, tom lane



RE: Shared Memory: How to use SYSV rather than MMAP ?

2019-01-04 Thread REIX, Tony
Hi Thomas,


I'm back from vacations.

Thanks for the 2 patches!

I've also seen the discussion around this subject.  Very interesting.  Should I
wait for a decision to be taken, or should I study and experiment with your
patches first?

I'm now in the process of studying with gprof what Postgres does on AIX 
compared to Linux/Power, in order to understand where time goes.


Regards,


Cordialement,

Tony Reix

tony.r...@atos.net

ATOS / Bull SAS
ATOS Expert
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net

From: Thomas Munro 
Sent: Wednesday, 26 December 2018 00:28:54
To: REIX, Tony
Cc: Andres Freund; Robert Haas; Pg Hackers; EMPEREUR-MOT, SYLVIE
Subject: Re: Shared Memory: How to use SYSV rather than MMAP ?

On Wed, Dec 19, 2018 at 4:17 AM REIX, Tony  wrote:
> Here is the patch we are using now on AIX for enabling SysV shm for AIX, 
> which improves greatly the performance on AIX.
>
> It is compile time.
>
> It seems to me that you'd like this to become a shared_memory_type GUC. 
> Correct? However, I do not know how to do that.
>
>
> Even as-is, this patch would greatly improve the performance of PostgreSQL 
> v11.1 in the field on AIX machines. So, we'd like this change to be available 
> for AIX asap.
>
>
> What are the next steps to get this patch accepted? or What are your 
> suggestions for improving it?

Hi Tony,

Since it's not fixing a bug, we wouldn't back-patch that into existing
releases.  But I agree that we should do something like this for
PostgreSQL 12, and I think we should make it user configurable.

Here is a quick rebase of Andres's shared_memory_type patch for
master, so that you can put shared_memory_type=sysv in postgresql.conf
to get the old pre-9.3 behaviour (this may also be useful for other
operating systems).  Here also is a "blind" patch that makes it
respect huge_pages=try/on on AIX (or at least, I think it does; I
don't have an AIX to try it, it almost certainly needs some
adjustments).  Thoughts?


--
Thomas Munro
http://www.enterprisedb.com



Re: New GUC to sample log queries

2019-01-04 Thread Adrien Mobile
On 4 January 2019 13:16:48 GMT+01:00, Peter Eisentraut 
 wrote:
>On 19/11/2018 14:40, Tomas Vondra wrote:
>> On 11/19/18 2:57 AM, Michael Paquier wrote:
>>> On Sun, Nov 18, 2018 at 12:18:33PM +0100, Dmitry Dolgov wrote:
 Since it's hard to come up with a concise name that will mention
>sampling rate
 in the context of min_duration_statement, I think it's fine to name
>this
 configuration "log_sample_rate", as long as it's dependency from
 log_min_duration_statements is clearly explained in the
>documentation.
>>>
>>> log_sample_rate looks fine to me as a name.
>> 
>> That seems far too short to me - the name should indicate it applies
>to 
>> statement logging. I'd say log_statement_sample_rate is better.
>
>It was committed with this name, but then one has to wonder why it
>doesn't also apply to log_statement.
>
>-- 
>Peter Eisentraut  http://www.2ndQuadrant.com/
>PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Yes, but maybe log_min_duration_statements_sample is too long? 



Re: insensitive collations

2019-01-04 Thread Daniel Verite
Peter Eisentraut wrote:

> Here is an updated patch.


When using GROUP BY and ORDER BY on a field with a non-deterministic
collation, this pops out:

CREATE COLLATION myfr (locale='fr-u-ks-level1',
  provider='icu', deterministic=false);

=# select n from (values ('été' collate "myfr"), ('ete')) x(n)
  group by 1 order by 1 ;
  n  
-
 ete
(1 row)

=#  select n from (values ('été' collate "myfr"), ('ete')) x(n)
  group by 1 order by 1 desc;
  n  
-
 été
(1 row)

The single-row output differs depending on whether it's sorted in the ASC or
DESC direction, even though in theory ORDER BY is done after GROUP
BY, so it shouldn't make that difference.

EXPLAIN shows that the sort is done before grouping, which might
explain why it happens, but isn't that plan incorrect given the context?

postgres=# explain select n from (values ('été' collate "myfr"), ('ete'))
x(n)
  group by 1 order by 1  desc;
QUERY PLAN
--
 Group  (cost=0.04..0.04 rows=2 width=32)
   Group Key: "*VALUES*".column1
   ->  Sort  (cost=0.04..0.04 rows=2 width=32)
 Sort Key: "*VALUES*".column1 COLLATE myfr DESC
 ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=32)
(5 rows)


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: Log a sample of transactions

2019-01-04 Thread Adrien Mobile
On 4 January 2019 13:20:09 GMT+01:00, Peter Eisentraut 
 wrote:
>On 12/12/2018 22:32, Adrien Nayrat wrote:
>> Per idea of Nikolay Samokhvalov[1] I propose this patch to add the
>possibility
>> to log all statements from a fraction of transactions.
>> 
>> I have several questions:
>>   * Do we want this feature?
>
>It's not clear to me what the use is.  The per-statement sampling
>allows
>you to capture slow queries without overwhelming the logs.  We don't
>have any logging of slow transactions or any other transaction scope
>logging, so what will this sample?
>
>-- 
>Peter Eisentraut  http://www.2ndQuadrant.com/
>PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Hello,
When you troubleshoot application issues with multi-statement transactions, you 
may have to log all queries to find all the statements of one transaction. With 
high throughput, it can be hard to log all queries without causing trouble. 
The goal of this patch is to be able to log a sample of transactions. I agree 
the use case is specific.
Regards 



Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

2019-01-04 Thread Alexander Korotkov
Hi!

I'm starting to look at this patchset.  I'm not ready to post a detailed
review yet, but I have a couple of questions.

On Wed, Sep 19, 2018 at 9:24 PM Peter Geoghegan  wrote:
> I still haven't managed to add pg_upgrade support, but that's my next
> step. I am more or less happy with the substance of the patch in v5,
> and feel that I can now work backwards towards figuring out the best
> way to deal with on-disk compatibility. It shouldn't be too hard --
> most of the effort will involve coming up with a good test suite.

Yes, it shouldn't be too hard, but it seems like we have to keep two
branches of code for different handling of duplicates.  Is that true?

+ * In the worst case (when a heap TID is appended) the size of the returned
+ * tuple is the size of the first right tuple plus an additional MAXALIGN()
+ * quantum.  This guarantee is important, since callers need to stay under
+ * the 1/3 of a page restriction on tuple size.  If this routine is ever
+ * taught to truncate within an attribute/datum, it will need to avoid
+ * returning an enlarged tuple to caller when truncation + TOAST compression
+ * ends up enlarging the final datum.

I didn't get the point of this paragraph.  Might it happen that the
first right tuple is under the tuple size restriction, but the new pivot tuple
is beyond that restriction?  If so, would we get an error because of
a too-long pivot tuple?  If not, I think this needs to be explained
better.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Shared Memory: How to use SYSV rather than MMAP ?

2019-01-04 Thread Peter Eisentraut
On 27/12/2018 00:53, Thomas Munro wrote:
> mmap and sysv were the same, but there
> did seem to be a measurable speed-up available at high client counts
> with kern.ipc.shm_use_phys=1 and thus for sysv only, for people
> prepared to set 3 sysctls and this proposed new GUC.  Maybe the effect
> would be greater with bigger shared_buffers or smaller pages (this
> test ran on super pages)?  More work required to figure that out.

Could you get a similar effect for mmap by using mlock() to prevent the
pages from being swapped?
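
Something along these lines is what I have in mind - a minimal standalone
sketch only, not the backend code; the segment size is arbitrary, and pinning
usually requires a raised RLIMIT_MEMLOCK:

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int
main(void)
{
	size_t		size = 16 * 1024 * 1024;	/* stand-in for the shared memory segment */
	void	   *addr;

	addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
				MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
	{
		perror("mmap");
		return 1;
	}

	/* Pin the pages in RAM so they cannot be paged out. */
	if (mlock(addr, size) != 0)
		perror("mlock");		/* non-fatal for this demo */

	memset(addr, 0, size);		/* touch the pages */

	munlock(addr, size);
	munmap(addr, size);
	return 0;
}

Whether that actually reproduces the shm_use_phys behaviour is a separate
question, of course.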

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: commit fest app: Authors

2019-01-04 Thread Tom Lane
Magnus Hagander  writes:
> On Fri, Jan 4, 2019 at 12:26 PM Peter Eisentraut <
> peter.eisentr...@2ndquadrant.com> wrote:
>> Why does the commit fest app not automatically fill in the author for a
>> new patch?

> I'm not sure it's a good idea to *enforce* it -- because you can't register
> an author until they have logged into the CF app once. And I think
> registering it with no author is better than registering it with the wrong
> author.

Yeah --- I've not checked in detail, but I supposed that most/all of the
cases of that correspond to authors with no community account to connect
the patch to.

regards, tom lane



Re: FETCH FIRST clause PERCENT option

2019-01-04 Thread Tomas Vondra
On 1/4/19 8:08 AM, Surafel Temesgen wrote:
> 
> 
> 
> On Tue, Jan 1, 2019 at 10:08 PM Tomas Vondra
> mailto:tomas.von...@2ndquadrant.com>> wrote:
> 
> The execution part of the patch seems to be working correctly, but I
> think there's an improvement - we don't need to execute the outer plan
> to completion before emitting the first row. For example, let's say the
> outer plan produces 1 rows in total and we're supposed to return the
> first 1% of those rows. We can emit the first row after fetching the
> first 100 rows, we don't have to wait for fetching all 10k rows.
> 
> 
> my other concern is that in this case we are going to determine the limit and
> the rows that are safe to return on the fly,
> which requires additional computation, and I fear that it will not perform
> better in most cases
> 

I very much doubt recomputing the number of rows to return will be
measurable at all, considering those are int64 values. It's likely
negligible compared to all the other overhead (reading rows from the
subplan, etc.).

And even if it turns out expensive, there are ways to make it less
expensive (e.g. not computing it every time, but only when it actually
can increment the value - for example with 1% rows to return, it's
enough to recompute it every 100 rows).

It also very much depends on which use cases you consider important.
Clearly, you're only thinking about the case that actually requires
fetching everything. But there's also the use case where you have
another limit node somewhere above.

So let's do a simple "cost" analysis of these two use cases:

1) case #1 (no LIMIT above, requires fetching everything)

Let's assume the "cost" of the current approach (fetching everything in
advance, computing node->count once) is the baseline here.

With the incremental approach, there likely is some extra overhead.
Let's overshoot it a bit - with 5% overhead, it's 1.05 of the baseline.

2) case #2 (LIMIT/EXISTS above, only needs to fetch the first few rows,
perhaps just a single one)

Here, we take the incremental approach as the baseline. With the "fetch
everything first" approach we need to evaluate the whole subplan and
fetch all the results, instead of just some small fraction. But the
exact ratio is pretty much arbitrary - I can easily construct queries
that need to fetch 100x or 10x more data from the subplan.

So essentially using the "incremental fetch" approach means we may pay
5% overhead in cases where we need to fetch everything, while the "fetch
everything first" approach means we'll add arbitrary amount of overhead
to cases where we don't need to fetch everything.

Of course, one way to deal with this would be to argue that the cases
benefiting from the incremental approach are very rare / non-existent.
Perhaps that's the case, although I don't see why it would be.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: START/END line number for COPY FROM

2019-01-04 Thread Peter Eisentraut
On 20/12/2018 14:02, Surafel Temesgen wrote:
> Currently we can skip header line on COPY FROM but having the ability to
> skip and stop copying at any line can use to divide long copy operation
> and enable to copy a subset of the file and skipping footer.

It seems a bit fragile to me if I want to skip a footer and need to
figure out the total line count, subtract one, and then oh, was it zero-
or one-based.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Fast path for empty relids in check_outerjoin_delay()

2019-01-04 Thread Peter Eisentraut
On 12/12/2018 08:32, Richard Guo wrote:
> This small revise is not expected to bring performance improvements, but
> can improve the readability of the code that for empty relids, the qual
> is always considered as being not-outerjoin_delayed.

I think code readability and maintainability would be improved by having
fewer special cases and fast paths.  In this particular case, I'd even
remove the existing fast path and just let the subsequent loop run zero
times.  I doubt there is a performance benefit to the existing coding.
And if there is no performance benefit, then it's not really a fast
path, just another path.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: FETCH FIRST clause PERCENT option

2019-01-04 Thread Tomas Vondra



On 1/4/19 7:40 AM, Surafel Temesgen wrote:
> 
> 
> On Thu, Jan 3, 2019 at 4:51 PM Tomas Vondra
> mailto:tomas.von...@2ndquadrant.com>> wrote:
> 
> 
> On 1/3/19 1:00 PM, Surafel Temesgen wrote:
> > Hi
> >
> > On Tue, Jan 1, 2019 at 10:08 PM Tomas Vondra
> >  
>  >> wrote:
> 
> >     The execution part of the patch seems to be working correctly,
> but I
> >     think there's an improvement - we don't need to execute the
> outer plan
> >     to completion before emitting the first row. For example,
> let's say the
> >     outer plan produces 1 rows in total and we're supposed to
> return the
> >     first 1% of those rows. We can emit the first row after
> fetching the
> >     first 100 rows, we don't have to wait for fetching all 10k rows.
> >
> >
> > but total rows count is not given how can we determine safe to
> return row
> >
> 
> But you know how many rows were fetched from the outer plan, and this
> number only grows grows. So the number of rows returned by FETCH FIRST
> ... PERCENT also only grows. For example with 10% of rows, you know that
> once you reach 100 rows you should emit ~10 rows, with 200 rows you know
> you should emit ~20 rows, etc. So you may track how many rows we're
> supposed to return / returned so far, and emit them early.
> 
> 
> 
> 
> yes, that is clear, but I don't find it easy to put that into a formula. Maybe
> someone with good mathematics will help
> 

What formula? All the math remains exactly the same, you just need to
update the number of rows to return and track how many rows are already
returned.

I haven't tried doing that, but AFAICS you'd need to tweak how/when
node->count is computed - instead of computing it only once it needs to
be updated after fetching each row from the subplan.

Furthermore, you'll need to stash the subplan rows somewhere (into a
tuplestore probably), and whenever the node->count value increments,
you'll need to grab a row from the tuplestore and return that (i.e.
tweak node->position and set node->subSlot).
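
To make that bookkeeping concrete, here is a self-contained toy model of the
flow (plain C; a couple of counters stand in for the tuplestore and the
subplan, so it only shows when rows become available, not how the executor
node would actually be written):

#include <stdio.h>

int
main(void)
{
	const int	total_rows = 1000;	/* not known to the node in advance */
	const double percent = 1.0;		/* FETCH FIRST 1 PERCENT ROWS ONLY */
	int			buffered = 0;		/* rows stashed so far ("tuplestore") */
	int			emitted = 0;		/* rows already handed to the caller */

	for (int row = 1; row <= total_rows; row++)
	{
		int			allowed;

		buffered++;										/* tuplestore_puttupleslot() */
		allowed = (int) (percent * buffered / 100.0);	/* recomputed node->count */

		while (emitted < allowed)
		{
			emitted++;									/* tuplestore_gettupleslot() */
			printf("emit row %d after reading %d subplan rows\n", emitted, row);
		}
	}
	return 0;
}

With 1000 input rows and 1 percent it emits a row after every 100 rows read,
so the first row comes out long before the subplan is exhausted.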

I hope that makes sense. The one thing I'm not quite sure about is
whether tuplestore allows adding and getting rows at the same time.

Does that make sense?

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Delay locking partitions during query execution

2019-01-04 Thread Tomas Vondra



On 1/4/19 1:53 AM, David Rowley wrote:
> On Fri, 4 Jan 2019 at 13:01, Tomas Vondra  
> wrote:
>> On 1/3/19 11:57 PM, David Rowley wrote:
>>> You'll know you're getting a generic plan when you see "Filter (a =
>>> $1)" and see "Subplans Removed: " below the Append.
>>>
>>
>> Indeed, with prepared statements I now see some improvements:
>>
>> partitions      0    100   1000   10000
>> -----------------------------------------
>> master         19   1590   2090     128
>> patched        18   1780   6820    1130
>>
>> So, that's nice. I wonder why the throughput drops so fast between 1k
>> and 10k partitions, but I'll look into that later.
> 
> Those look strange. Why is it so slow with the non-partitioned case?
> I'd have expected that to be the fastest result.
> 

Because there are 1M rows in the table, and it's doing a seqscan.

That also makes the other cases difficult to compare, because with few
partitions there will be multiple pages per partition, scanned
sequentially. And with many partitions it's likely only a single page
with a couple of rows on it. I'll think about constructing a better
benchmark, to make it easier to compare - perhaps by using a single row
per table and/or adding indexes. Or something ...

>> Does this mean this optimization can only ever work with prepared
>> statements, or can it be made to work with regular plans too?
> 
> That's a good question.  I can confirm it's only of any use if run-time
> pruning occurs during init plan. For this patch to be any use, you'll
> need to see a "Subplans Removed: " in the query's EXPLAIN ANALYZE
> output.  If you don't see this then all Append/MergeAppend subplans
> were initialised and the relation lock would have been obtained
> regardless of if delaylock is set for the relation. The effect of the
> patch here would just have been to obtain the lock during the first
> call to ExecGetRangeTableRelation() for that relation instead of
> during AcquireExecutorLocks().  There may actually be a tiny overhead
> in this case since AcquireExecutorLocks() must skip the delaylock
> relations, but they'll get locked later anyway. I doubt you could
> measure that though.
> 
> When run-time pruning is able to prune partitions before execution
> starts then the optimisation is useful since AcquireExecutorLocks()
> won't obtain the lock and ExecGetRangeTableRelation() won't be called
> for all pruned partition's rels as we don't bother to init the
> Append/MergeAppend subplan for those.
> 
> I'm a little unsure if there are any cases where this type of run-time
> pruning can occur when PREPAREd statements are not in use.  Initplan
> parameters can't prune before executor run since we need to run the
> executor to obtain the values of those. Likewise for evaluation of
> volatile functions. So I think run-time pruning before initplan is
> only ever going to happen for PARAM_EXTERN type parameters, i.e. with
> PREPAREd statements (REF: analyze_partkey_exprs() partprune.c).
> Without PREPAREd statements, if the planner itself was unable to prune
> the partitions it would already have obtained the lock during
> planning, so AcquireExecutorLocks(), in this case, would bump into the
> local lock hash table entry and forego trying to obtain the lock
> itself.  That's not free, but it's significantly faster than obtaining
> a lock.
> 
> Or in short... it's only good for prepared statements where the
> statement's parameters allow for run-time pruning. However, that's a
> pretty large case since the planner is still very slow at planning for
> large numbers of partitions, meaning it's common (or at least it will
> be) for people to use PREPAREd statement and plan_cache_mode =
> force_generic_plan;
> 

OK, thanks for the explanation. One more reason to use prepared
statements in such cases ...

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: proposal - plpgsql unique statement id

2019-01-04 Thread Pavel Stehule
On Fri, 4 Jan 2019 at 13:58, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 13/11/2018 14:35, Pavel Stehule wrote:
> > I am trying to enhance plpgsql_check with profiler functionality and I
> > have to solve how to identify every plpgsql statement across different
> > sessions. There is lineno, but surely it is not guaranteed to be unique. I propose
> > to introduce a stmtid into every statement structure. This number will be
> > unique inside a function.
>
> That seems reasonable enough, although I wouldn't use 0 as a valid stmtid.
>

done


> A documenting comment near PLpgSQL_stmt might be nice.
>

 done

Regards

Pavel


> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 0e9ea10596..9e98134c05 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -368,6 +368,8 @@ do_compile(FunctionCallInfo fcinfo,
 
 	function->fn_prokind = procStruct->prokind;
 
+	function->nstatements = 0;
+
 	/*
 	 * Initialize the compiler, particularly the namespace stack.  The
 	 * outermost namespace contains function parameters and other special
@@ -911,6 +913,8 @@ plpgsql_compile_inline(char *proc_source)
  true);
 	function->found_varno = var->dno;
 
+	function->nstatements = 0;
+
 	/*
 	 * Now parse the function's text
 	 */
@@ -1020,6 +1024,7 @@ add_dummy_return(PLpgSQL_function *function)
 
 		new = palloc0(sizeof(PLpgSQL_stmt_block));
 		new->cmd_type = PLPGSQL_STMT_BLOCK;
+		new->stmtid = ++function->nstatements;
 		new->body = list_make1(function->action);
 
 		function->action = new;
@@ -1031,6 +1036,7 @@ add_dummy_return(PLpgSQL_function *function)
 
 		new = palloc0(sizeof(PLpgSQL_stmt_return));
 		new->cmd_type = PLPGSQL_STMT_RETURN;
+		new->stmtid = ++function->nstatements;
 		new->expr = NULL;
 		new->retvarno = function->out_param_varno;
 
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 59063976b3..55c041c175 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -414,6 +414,7 @@ pl_block		: decl_sect K_BEGIN proc_sect exception_sect K_END opt_label
 
 		new->cmd_type	= PLPGSQL_STMT_BLOCK;
 		new->lineno		= plpgsql_location_to_lineno(@2);
+		new->stmtid 	= ++plpgsql_curr_compile->nstatements;
 		new->label		= $1.label;
 		new->n_initvars = $1.n_initvars;
 		new->initvarnos = $1.initvarnos;
@@ -918,6 +919,7 @@ stmt_call		: K_CALL
 		new = palloc0(sizeof(PLpgSQL_stmt_call));
 		new->cmd_type = PLPGSQL_STMT_CALL;
 		new->lineno = plpgsql_location_to_lineno(@1);
+		new->stmtid = ++plpgsql_curr_compile->nstatements;
 		new->expr = read_sql_stmt("CALL ");
 		new->is_call = true;
 
@@ -932,6 +934,7 @@ stmt_call		: K_CALL
 		new = palloc0(sizeof(PLpgSQL_stmt_call));
 		new->cmd_type = PLPGSQL_STMT_CALL;
 		new->lineno = plpgsql_location_to_lineno(@1);
+		new->stmtid = ++plpgsql_curr_compile->nstatements;
 		new->expr = read_sql_stmt("DO ");
 		new->is_call = false;
 
@@ -947,6 +950,7 @@ stmt_assign		: assign_var assign_operator expr_until_semi
 		new = palloc0(sizeof(PLpgSQL_stmt_assign));
 		new->cmd_type = PLPGSQL_STMT_ASSIGN;
 		new->lineno   = plpgsql_location_to_lineno(@1);
+		new->stmtid = ++plpgsql_curr_compile->nstatements;
 		new->varno = $1->dno;
 		new->expr  = $3;
 
@@ -962,6 +966,7 @@ stmt_getdiag	: K_GET getdiag_area_opt K_DIAGNOSTICS getdiag_list ';'
 		new = palloc0(sizeof(PLpgSQL_stmt_getdiag));
 		new->cmd_type = PLPGSQL_STMT_GETDIAG;
 		new->lineno   = plpgsql_location_to_lineno(@1);
+		new->stmtid	  = ++plpgsql_curr_compile->nstatements;
 		new->is_stacked = $2;
 		new->diag_items = $4;
 
@@ -1149,6 +1154,7 @@ stmt_if			: K_IF expr_until_then proc_sect stmt_elsifs stmt_else K_END K_IF ';'
 		new = palloc0(sizeof(PLpgSQL_stmt_if));
 		new->cmd_type	= PLPGSQL_STMT_IF;
 		new->lineno		= plpgsql_location_to_lineno(@1);
+		new->stmtid		= ++plpgsql_curr_compile->nstatements;
 		new->cond		= $2;
 		new->then_body	= $3;
 		new->elsif_list = $4;
@@ -1253,6 +1259,7 @@ stmt_loop		: opt_loop_label K_LOOP loop_body
 		new = palloc0(sizeof(PLpgSQL_stmt_loop));
 		new->cmd_type = PLPGSQL_STMT_LOOP;
 		new->lineno   = plpgsql_location_to_lineno(@2);
+		new->stmtid   = ++plpgsql_curr_compile->nstatements;
 		new->label	  = $1;
 		new->body	  = $3.stmts;
 
@@ -1270,6 +1277,7 @@ stmt_while		: opt_loop_label K_WHILE expr_until_loop loop_body
 		new = palloc0(sizeof(PLpgSQL_stmt_while));
 		new->cmd_type = PLPGSQL_STMT_WHILE;
 		new->lineno   = plpgsql_location_to_lineno(@2);
+		new->stmtid	  = ++plpgsql_curr_compile->nstatements;
 		new->label	  = $1;
 		new->cond	  = $3;
 		new->body	  = $4.stmts;
@@ -1290,6 +1298,7 @@ stmt_for		: opt_loop_label K_FOR for_control loop

Re: Allow auto_explain to log to NOTICE

2019-01-04 Thread Daniel Gustafsson
(Michael: sorry for not having responded to your comments on the patch; $life
has left little time for hacking lately)

> On 4 Jan 2019, at 13:49, Michael Paquier  wrote:
> 
> On Fri, Jan 04, 2019 at 01:06:24PM +0100, Peter Eisentraut wrote:
>> Do we really want to add user-facing options just to be able to run
>> tests?  Should we write the tests differently instead?

I voiced this concern upthread as well, not being sure if it’s worth the added
complexity.

> The take is that the output of the plans generated includes data which
> is run-dependent because the duration of the plan is generated
> unconditionally.  One way to write generic tests considering this
> would be to use a TAP test, but I feel that's overdoing it just for
> this case.
> 
> Being able to control if the plan duration shows up still looks like
> an interesting option to me independently of adding tests.

There is that.  We might not be excited about writing tests for this contrib
module but someone else might want to use this for testing their application in
a similar manner to how pg_regress tests?

cheers ./daniel


Re: [PATCH] Pass COPT and PROFILE to CXXFLAGS as well

2019-01-04 Thread Michael Paquier
On Fri, Jan 04, 2019 at 11:41:15AM +0100, Peter Eisentraut wrote:
> Note that pgxs supports PG_CPPFLAGS for adding custom pieces to CPPFLAGS
> in a safe way.  Maybe we should add an equivalent for CFLAGS?  It's just
> that it hasn't been needed so far.

+1.  Wouldn't it make sense to also have PG_LDFLAGS?  In some stuff I
maintain, I have been abusing PG_CPPFLAGS to pass some flags, which
is not really right.  We also have contrib/passwordcheck/Makefile
abusing it to set -DUSE_CRACKLIB.
--
Michael


signature.asc
Description: PGP signature


Re: Custom text type for title text

2019-01-04 Thread Greg Stark
Policy on contrib has shifted over time. But generally we want to encourage
a lively ecosystem of extensions maintained outside of the Postgres source
tree so we avoid adding things to contrib when there's no particular
advantage.

The most common reason things are added to contrib is when the extension is
closely tied to internals and needs to be maintained along with changes to
internals. Modules like that are hard to maintain separately. But modules
that use documented general extensibility APIs should be able to be stable
across versions and live outside contrib.

On Thu 3 Jan 2019, 23:54 Daniel Heath wrote:
> Would this also be appropriate for inclusion as contrib? I'm unfamiliar
> with the policy for what is / is not included there.
>
> Thanks,
> Daniel Heath
>
>
> On Fri, Jan 4, 2019, at 9:47 AM, Fabrízio de Royes Mello wrote:
>
>
>
> On Thu, 3 Jan 2019 at 20:22, Daniel Heath
> wrote:
>
> Hi All,
>
> I've frequently seen an issue in applications which store titles (eg of
> books, events, user profiles) where duplicate values are not properly
> vetted.
>
> The 'citext' type is helpful here, but I'd be keen to go further.
>
> I propose a 'titletext' type, which has the following properties when
> compared for equality:
>  * Case insensitivity (like 'citext')
>  * Only considers characters in [:alnum:] (that is, ignores spaces,
> punctuation, etc)
>
> This would be useful for a range of situations where it's important to
> avoid entering duplicate values.
>
> Given the discussion at
> https://www.postgresql.org/message-id/CAKFQuwY9u14TqG8Yzj%3DfAB0tydvvtK7ibgFEx3tegbPWsGjJpg%40mail.gmail.com
> 
>  I'd
> lean towards making this type not automatically coerce to text (to avoid
> surprising behaviour when comparing text to titletext).
>
> Is a suitable patch likely to be accepted?
>
>
> You don't need to touch the core to do that. Just implement it as an
> extension and share it through some channel like pgxn.org.
>
> Note that citext also is an extension and released as a contrib module.
>
> Regards,
>
> --
>Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
>PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
>
>
>


Re: GiST VACUUM

2019-01-04 Thread Andrey Borodin
On 3 Jan 2019, at 23:47, Andrey Borodin wrote:
> 
>> * Bitmapset stores 32 bit signed integers, but BlockNumber is unsigned. So 
>> this will fail with an index larger than 2^31 blocks. That's perhaps 
>> academical, I don't think anyone will try to create a 16 TB GiST index any 
>> time soon. But it feels wrong to introduce an arbitrary limitation like that.
> Looks like Bitmapset is unsuitable for collecting block numbers due to the 
> interface. Let's just create a custom container for this?
> Or there is one other option: for each block number, throw away the sign bit and 
> consider the page potentially internal and potentially an empty leaf if (blkno & 
> 0x7FFFFFFF) is in the bitmapset.
> And then we will have to create at least one 17TB GiST index to check that it actually 
> works.

Heikki, what do you think: is implementing our own radix tree a viable solution 
for this?
I've written a working implementation with a 4-level statically typed tree. If we 
follow this route, there should probably be tests for this data structure.
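
To illustrate the shape I have in mind, here is a standalone sketch of the
idea only (not the attached radix_tree.diff): four levels of 8 bits over a
32-bit BlockNumber, with a 256-bit bitmap at the leaf level:

#include <stdint.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdio.h>

typedef uint32_t BlockNumber;

/* a leaf covers 256 consecutive block numbers as a bitmap */
typedef struct Leaf   { uint8_t bits[32]; } Leaf;
typedef struct Level3 { Leaf   *child[256]; } Level3;
typedef struct Level2 { Level3 *child[256]; } Level2;
typedef struct Level1 { Level2 *child[256]; } Level1;

static void
radix_set(Level1 *root, BlockNumber blkno)
{
	uint8_t		b1 = blkno >> 24;
	uint8_t		b2 = blkno >> 16;
	uint8_t		b3 = blkno >> 8;
	uint8_t		b4 = blkno;

	if (root->child[b1] == NULL)
		root->child[b1] = calloc(1, sizeof(Level2));
	if (root->child[b1]->child[b2] == NULL)
		root->child[b1]->child[b2] = calloc(1, sizeof(Level3));
	if (root->child[b1]->child[b2]->child[b3] == NULL)
		root->child[b1]->child[b2]->child[b3] = calloc(1, sizeof(Leaf));
	root->child[b1]->child[b2]->child[b3]->bits[b4 >> 3] |= 1 << (b4 & 7);
}

static bool
radix_test(const Level1 *root, BlockNumber blkno)
{
	uint8_t		b1 = blkno >> 24;
	uint8_t		b2 = blkno >> 16;
	uint8_t		b3 = blkno >> 8;
	uint8_t		b4 = blkno;

	if (root->child[b1] == NULL ||
		root->child[b1]->child[b2] == NULL ||
		root->child[b1]->child[b2]->child[b3] == NULL)
		return false;
	return (root->child[b1]->child[b2]->child[b3]->bits[b4 >> 3] >> (b4 & 7)) & 1;
}

int
main(void)
{
	Level1	   *root = calloc(1, sizeof(Level1));

	radix_set(root, 42);
	radix_set(root, 3000000000u);	/* above 2^31, where a signed Bitmapset member breaks */
	printf("%d %d %d\n", radix_test(root, 42),
		   radix_test(root, 3000000000u), radix_test(root, 43));
	return 0;
}

Unlike Bitmapset this has no problem with block numbers above 2^31, at the
cost of lazily allocated intermediate nodes.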

Best regards, Andrey Borodin.


radix_tree.diff
Description: Binary data


Re: proposal: plpgsql pragma statement

2019-01-04 Thread Pavel Stehule
On Fri, 4 Jan 2019 at 14:07, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 06/12/2018 18:27, Pavel Stehule wrote:
> > For my purpose I can imagine PRAGMA on function level with same syntax
> > like PL/SQL - I need to push somewhere some information that I can use
> > for plpgsql_check to protect users against false alarms. The locality in
> > this moment is not too important for me. But I prefer solution that
> doesn't look too strange, and is possible just by changing the plpgsql
> parser.
>
> When you are about to warn about a particular statement, you have the
> statement's line number, so you can look up the source code and check
> for any disable-this-warning comments.
>

That means writing my own lexer and preparsing the source code before I start
checking.

I think a block-level PRAGMA is a significantly better solution.



> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: proposal: plpgsql pragma statement

2019-01-04 Thread Peter Eisentraut
On 06/12/2018 18:27, Pavel Stehule wrote:
> For my purpose I can imagine PRAGMA on function level with same syntax
> like PL/SQL - I need to push somewhere some information that I can use
> for plpgsql_check to protect users against false alarms. The locality in
> this moment is not too important for me. But I prefer solution that
> doesn't look too strange, and is possible just by changing the plpgsql parser.

When you are about to warn about a particular statement, you have the
statement's line number, so you can look up the source code and check
for any disable-this-warning comments.
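
A rough standalone sketch of that lookup (the marker text here is made up;
plpgsql_check would define its own):

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true if the given (1-based) line of the function source text
 * contains the marker comment.
 */
static bool
line_has_marker(const char *src, int lineno, const char *marker)
{
	const char *p = src;
	const char *end;
	size_t		len;
	size_t		mlen = strlen(marker);

	for (int line = 1; line < lineno && p != NULL; line++)
	{
		p = strchr(p, '\n');
		if (p != NULL)
			p++;				/* start of the next line */
	}
	if (p == NULL)
		return false;			/* source has fewer lines than lineno */

	end = strchr(p, '\n');
	len = end ? (size_t) (end - p) : strlen(p);

	for (size_t i = 0; mlen <= len && i + mlen <= len; i++)
		if (strncmp(p + i, marker, mlen) == 0)
			return true;
	return false;
}

int
main(void)
{
	const char *src =
		"BEGIN\n"
		"  PERFORM f(x); -- pragma: ignore-warning\n"
		"  RETURN;\n"
		"END";

	printf("%d\n", line_has_marker(src, 2, "pragma: ignore-warning"));
	return 0;
}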

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: proposal - plpgsql unique statement id

2019-01-04 Thread Peter Eisentraut
On 13/11/2018 14:35, Pavel Stehule wrote:
> I am trying to enhance plpgsql_check with profiler functionality and I
> have to solve how to identify every plpgsql statement across different
> sessions. There is lineno, but surely it is not guaranteed to be unique. I propose
> to introduce a stmtid into every statement structure. This number will be
> unique inside a function.

That seems reasonable enough, although I wouldn't use 0 as a valid stmtid.

A documenting comment near PLpgSQL_stmt might be nice.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: don't create storage when unnecessary

2019-01-04 Thread Michael Paquier
On Fri, Jan 04, 2019 at 09:19:25AM -0300, Alvaro Herrera wrote:
> I'll do it.

Thanks for taking care of it.  Please note that commands/copy.c also
makes use of RELKIND_CAN_HAVE_STORAGE() as of bf491a9.  Could you
split the renaming into a commit independent of what is being
discussed on this thread?  These are separate issues in my opinion.
--
Michael


signature.asc
Description: PGP signature


Re: Using POPCNT and other advanced bit manipulation instructions

2019-01-04 Thread David Rowley
On Thu, 20 Dec 2018 at 23:59, Jose Luis Tallon
 wrote:
> IMVHO: Please do not disregard potential optimization by the compiler
> around those calls.. o_0  That might explain the reduced performance
> improvement observed.

It was a speedup that I measured. Did you see something else?

> > What I'm really looking for by posting now are reasons why we can't do
> > this. I'm also interested in getting some testing done on older
> > machines, particularly machines with processors that are from before
> > 2007, both AMD and Intel.
>
> I can offer a 2005-vintage Opteron 2216 rev3 (bought late 2007) to test
> on. Feel free to toss me some test code.
>
> cpuinfo flags:fpu de tsc msr pae mce cx8 apic mca cmov pat clflush
> mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow
> rep_good nopl extd_apicid eagerfpu pni cx16 hypervisor lahf_lm
> cmp_legacy 3dnowprefetch vmmcall
>
> >   2007-2008 seems to be around the time both
> > AMD and Intel added support for POPCNT and LZCNT, going by [4].

It would be really good if you could git clone a copy of master and
patch it with the patch from earlier in the thread and see if you
encounter any issues running make check-world.

I'm a bit uncertain if passing -mpopcnt to a recent gcc would result
in the popcnt instruction being compiled in if the machine doing the
compiling had no support for that.

Likely it would be simple to test that with:

echo "int main(char **argv, int argc) { return
__builtin_popcount(argc); }" > popcnt.c && gcc popcnt.c -S -mpopcnt &&
cat popcnt.s | grep pop

I see a "popcntl" in there on my machine.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Allow auto_explain to log to NOTICE

2019-01-04 Thread Michael Paquier
On Fri, Jan 04, 2019 at 01:06:24PM +0100, Peter Eisentraut wrote:
> Do we really want to add user-facing options just to be able to run
> tests?  Should we write the tests differently instead?

The take is that the output of the plans generated includes data which
is run-dependent because the duration of the plan is generated
unconditionally.  One way to write generic tests considering this
would be to use a TAP test, but I feel that's overdoing it just for
this case.

Being able to control if the plan duration shows up still looks like
an interesting option to me independently of adding tests.
--
Michael


signature.asc
Description: PGP signature


Re: Statement-level Triggers For Uniqueness Checks

2019-01-04 Thread Peter Eisentraut
On 25/12/2018 00:56, Corey Huinker wrote:
> The regression diff (attached) seems to imply that the triggers simply
> are not firing, though.

The reason for this was explained by Dean.  If you take out the check
that he mentioned, then your trigger fires but crashes.  In your changed
unique_key_recheck(), "slot" is not initialized before use (or ever).

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Using POPCNT and other advanced bit manipulation instructions

2019-01-04 Thread David Rowley
Thanks for looking at this.

On Thu, 20 Dec 2018 at 23:56, Dmitry Dolgov <9erthali...@gmail.com> wrote:
> I've checked for Clang 6, it turns out that indeed it generates popcnt without
> any macro, but only in one place for bloom_prop_bits_set. After looking at 
> this
> function it seems that it would be beneficial to actually use popcnt there 
> too.

Yeah, that's the pattern that's mentioned in
https://lemire.me/blog/2016/05/23/the-surprising-cleverness-of-modern-compilers/
It would need to be changed to call the popcount function.  This
existing case makes me a bit more worried that some extension could be
using a similar pattern and end up being compiled with -mpopcnt due to
pg_config having that CFLAG. That's all fine until the binary makes
its way over to a machine without that instruction.

> > I am able to measure performance gains from the patch.  In a 3.4GB
> > table containing a single column with just 10 statistics targets, I
> > got the following times after running ANALYZE on the table.
>
> I've tested it too a bit, and got similar results when the patched version is
> slightly faster. But then I wonder if popcnt is the best solution here, since
> after some short research I found a paper [1], where authors claim that:
>
> Maybe surprisingly, we show that a vectorized approach using SIMD
> instructions can be twice as fast as using the dedicated instructions on
> recent Intel processors.
>
>
> [1]: https://arxiv.org/pdf/1611.07612.pdf

I can't imagine that using the number_of_ones[] array processing
8-bits at a time would be slower than POPCNT though.
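
For anyone who wants to check that on their own hardware, a small standalone
comparison along these lines should do (the table initialisation just mirrors
the number_of_ones[] idea; __builtin_popcountll assumes gcc or clang):

#include <stdint.h>
#include <stdio.h>

/* 8-bit lookup table, built at startup */
static uint8_t number_of_ones[256];

static void
init_table(void)
{
	for (int i = 1; i < 256; i++)
		number_of_ones[i] = (uint8_t) ((i & 1) + number_of_ones[i >> 1]);
}

/* count the set bits one byte at a time via the table */
static int
popcount64_table(uint64_t x)
{
	int			count = 0;

	while (x != 0)
	{
		count += number_of_ones[x & 0xFF];
		x >>= 8;
	}
	return count;
}

int
main(void)
{
	uint64_t	v = 0xF0F0F0F0F0F0F0F0ULL;

	init_table();
	printf("table: %d  builtin: %d\n",
		   popcount64_table(v), __builtin_popcountll(v));
	return 0;
}

Timing a loop over the two functions, compiled with and without -mpopcnt,
should give a rough answer.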

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: pg_dump multi VALUES INSERT

2019-01-04 Thread Alvaro Herrera
FWIW you can insert multiple zero-column rows with "insert into ..
select union all select union all select".

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Log a sample of transactions

2019-01-04 Thread Peter Eisentraut
On 12/12/2018 22:32, Adrien Nayrat wrote:
> Per idea of Nikolay Samokhvalov[1] I propose this patch to add the possibility
> to log all statements from a fraction of transactions.
> 
> I have several questions:
>   * Do we want this feature?

It's not clear to me what the use is.  The per-statement sampling allows
you to capture slow queries without overwhelming the logs.  We don't
have any logging of slow transactions or any other transaction scope
logging, so what will this sample?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2019-01-04 Thread Pavel Stehule
> About syntax: I vote for the current syntax "reindex table CONCURRENTLY
> tablename". This looks consistent with the existing CREATE INDEX CONCURRENTLY
> and REFRESH MATERIALIZED VIEW CONCURRENTLY.
>

+1

Pavel


> regards, Sergei
>
> [1]:
> https://www.postgresql.org/message-id/CAB7nPqT%2B6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw%40mail.gmail.com
>
>


Re: don't create storage when unnecessary

2019-01-04 Thread Alvaro Herrera
On 2019-Jan-04, Peter Eisentraut wrote:

> On 18/12/2018 08:18, Amit Langote wrote:
> > Sorry to be saying it late, but I have to agree with Horiguchi-san here
> > that RELKIND_HAS_STORAGE sounds better and is clear.
> 
> I think I agree.  Does someone want to send a patch?

I'll do it.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2019-01-04 Thread Sergei Kornilov
Hello
Thank you! I reviewed the new patch version. It applies, builds and passes tests. The code 
looks good, but I noticed some new behavior:

> postgres=# reindex (verbose) table CONCURRENTLY measurement ;
> WARNING:  REINDEX of partitioned tables is not yet implemented, skipping 
> "measurement"
> NOTICE:  table "measurement" has no indexes
> REINDEX
> postgres=# \d measurement
> Partitioned table "public.measurement"
>   Column   |  Type   | Collation | Nullable | Default 
> ---+-+---+--+-
>  city_id   | integer |   | not null | 
>  logdate   | date|   | not null | 
>  peaktemp  | integer |   |  | 
>  unitsales | integer |   |  | 
> Partition key: RANGE (logdate)
> Indexes:
> "measurement_logdate_idx" btree (logdate)
> Number of partitions: 0

NOTICE seems unnecessary here.

Unfortunately, concurrent reindex loses comments; reproducer:

> create table testcomment (i int);
> create index testcomment_idx1 on testcomment (i);
> comment on index testcomment_idx1 IS 'test comment';
> \di+ testcomment_idx1
> reindex table testcomment ;
> \di+ testcomment_idx1 # ok
> reindex table CONCURRENTLY testcomment ;
> \di+ testcomment_idx1 # we lose comment

Also, I think we need to change REINDEX to "REINDEX (without 
CONCURRENTLY)" in the ACCESS EXCLUSIVE section of the table-level lock 
modes documentation (to be similar to the REFRESH MATERIALIZED VIEW and CREATE 
INDEX descriptions).

About reindexing invalid indexes - I found one good question in the archives [1]: what 
about toast indexes?
I checked it now: I am able to drop an invalid toast index, but I cannot drop a 
redundant valid index.
Reproduce:
session 1: begin; select from test_toast ... for update;
session 2: reindex table CONCURRENTLY test_toast ;
session 2: interrupt by ctrl+C
session 1: commit
session 2: reindex table test_toast ;
and now we have two toast indexes. DROP INDEX is able to remove only invalid 
ones. Valid index gives "ERROR:  permission denied: 
"pg_toast_16426_index_ccnew" is a system catalog"

About syntax: I vote for the current syntax "reindex table CONCURRENTLY tablename". 
This looks consistent with the existing CREATE INDEX CONCURRENTLY and REFRESH 
MATERIALIZED VIEW CONCURRENTLY.

regards, Sergei

[1]: 
https://www.postgresql.org/message-id/CAB7nPqT%2B6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw%40mail.gmail.com



Re: New GUC to sample log queries

2019-01-04 Thread Peter Eisentraut
On 19/11/2018 14:40, Tomas Vondra wrote:
> On 11/19/18 2:57 AM, Michael Paquier wrote:
>> On Sun, Nov 18, 2018 at 12:18:33PM +0100, Dmitry Dolgov wrote:
>>> Since it's hard to come up with a concise name that will mention sampling 
>>> rate
>>> in the context of min_duration_statement, I think it's fine to name this
>>> configuration "log_sample_rate", as long as it's dependency from
>>> log_min_duration_statements is clearly explained in the documentation.
>>
>> log_sample_rate looks fine to me as a name.
> 
> That seems far too short to me - the name should indicate it applies to 
> statement logging. I'd say log_statement_sample_rate is better.

It was committed with this name, but then one has to wonder why it
doesn't also apply to log_statement.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_dump multi VALUES INSERT

2019-01-04 Thread David Rowley
On Mon, 31 Dec 2018 at 18:58, Surafel Temesgen  wrote:
> On Fri, Dec 28, 2018 at 6:46 PM Fabien COELHO  wrote:
>> > At first i also try to do it like that but it seems the function will
>> > became long and more complex to me
>>
>> Probably. But calling it with size 100 should result in the same behavior,
>> so it is really just an extension of the preceeding one? Or am I missing
>> something?
>>
>
> Specifying table data using single-value insert statements and user-specified 
> multi-value insert statements
> have enough difference to demand separate functions, and they are not 
> the same thing that should be implemented
> with the same function. Regarding code duplication, I think the solution is 
> making that code a separate function
> and calling it at the appropriate place.

I don't really buy this. I've just hacked up a version of
dumpTableData_insert() which supports a variable number of rows per
statement. It seems fairly clean and easy to me. Likely the fact that
this is very possible greatly increases the chances of this getting in
since it gets rid of the code duplication. I did also happen to move
the row building code out of the function into its own function, but
that's not really required. I just did that so I could see all the
code in charge of terminating each statement on my screen without
having to scroll.  I've not touched any of the plumbing work to plug
the rows_per_statement variable into the command line argument. So
it'll need a bit of merge work with the existing patch.   There will
need to be some code that ensures that the user does not attempt to
have 0 rows per statement. The code I wrote won't behave well if that
happens.

... Checks existing patch ...

I see you added a test, but missed checking for 0. That needs to be fixed.

+ if (dopt.dump_inserts_multiple < 0)
+ {
+ write_msg(NULL, "argument of --insert-multi must be positive number\n");
+ exit_nicely(1);
+ }
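
A minimal sketch of the tightened check, just reusing the names from the
quoted patch (illustrative only):

+ if (dopt.dump_inserts_multiple <= 0)
+ {
+ write_msg(NULL, "argument of --insert-multi must be a positive number\n");
+ exit_nicely(1);
+ }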

I also didn't adopt passing the rows-per-statement into the FETCH.  I
think that's a very bad idea and we should keep that strictly at 100.
I don't see any reason to tie the two together. If a user wants 10
rows per statement, do we really want to FETCH 10 times more often?
And what happens when they want 1 million rows per statement? We've no
reason to run out of memory from this since we're just dumping the
rows out to the archive on each row.

1.

+Specify the number of values per INSERT command.
+This will make the dump file smaller than --inserts
+and it is faster to reload but lack per row data lost on error
+instead entire affected insert statement data lost.

Unsure what you mean about "data lost".  It also controls the number
of "rows" per INSERT statement, not the number of
values.

I think it would be fine just to say:

+When using --inserts, this allows the maximum number
+of rows per INSERT statement to be specified.
+This setting defaults to 1.

I've used "maximum" there as the final statement on each table can
have less and also 0-column tables will always be 1 row per statement.

2. Is --insert-multi a good name? What if they do --insert-multi=1?
That's not very "multi".  Is --rows-per-insert better?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


multi-row-inserts_for_pg_dump_drowley.patch
Description: Binary data


Re: Allow auto_explain to log to NOTICE

2019-01-04 Thread Peter Eisentraut
On 26/12/2018 06:42, Michael Paquier wrote:
> On Tue, Oct 30, 2018 at 11:44:42AM +0100, Daniel Gustafsson wrote:
>> Circling back to this, I updated the patch with providing another
>> option as I couldn’t think of another way to do it cleanly.  I’ll
>> add the patch to the next CF but as it’s just about to start it
>> should be moved to the next once started.

Do we really want to add user-facing options just to be able to run
tests?  Should we write the tests differently instead?


-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: postgres_fdw: oddity in costing aggregate pushdown paths

2019-01-04 Thread Etsuro Fujita
(2018/12/28 17:28), Etsuro Fujita wrote:
> I noticed that I forgot to modify the cost for evaluating the PathTarget
> for each output row according to this change :(.  Attached is a patch
> for that.

On reflection, I noticed these issues in estimate_path_cost_size, in addition to that:

1) In the case of a foreign-grouping path, we failed to adjust the
PathTarget eval cost when using the remote estimates, which I think
would be needed because the PathTarget expressions cannot always be
pre-computed as entries of the fdw_scan_tlist for the foreign-grouping path.

2) We also failed to factor in the eval cost for the foreign-scan and
foreign-join cases, with/without the remote estimates; in the scan/join
cases, the PathTarget might contain PHVs, so the cost of evaluating PHVs
should be charged.  Currently, PHVs are evaluated locally, so the cost
of PHV expressions should also be factored in when using the remote
estimates.

Here is a new version of the patch.

Best regards,
Etsuro Fujita
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 2ac9d7b..3304ebe 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2667,6 +2667,9 @@ estimate_path_cost_size(PlannerInfo *root,
 	Cost		total_cost;
 	Cost		cpu_per_tuple;
 
+	/* Make sure the core code has set up the relation's reltarget */
+	Assert(foreignrel->reltarget);
+
 	/*
 	 * If the table or the server is configured to use remote estimates,
 	 * connect to the foreign server and execute EXPLAIN to estimate the
@@ -2744,6 +2747,25 @@ estimate_path_cost_size(PlannerInfo *root,
 		cost_qual_eval(&local_cost, local_param_join_conds, root);
 		startup_cost += local_cost.startup;
 		total_cost += local_cost.per_tuple * retrieved_rows;
+
+		/* Add in the tlist eval cost for each output row */
+		startup_cost += foreignrel->reltarget->cost.startup;
+		total_cost += foreignrel->reltarget->cost.per_tuple * rows;
+
+		/*
+		 * In the case of a foreign-grouping path, the fdw_scan_tlist would
+		 * have contained tlist expressions that will be pushed down to the
+		 * remote as-is including at least grouping expressions; adjust the
+		 * eval cost.
+		 */
+		if (IS_UPPER_REL(foreignrel))
+		{
+			QualCost	tlist_cost;
+
+			cost_qual_eval(&tlist_cost, fdw_scan_tlist, root);
+			startup_cost -= tlist_cost.startup;
+			total_cost -= tlist_cost.per_tuple * rows;
+		}
 	}
 	else
 	{
@@ -2842,19 +2864,19 @@ estimate_path_cost_size(PlannerInfo *root,
 			nrows = clamp_row_est(nrows * fpinfo->joinclause_sel);
 			run_cost += nrows * remote_conds_cost.per_tuple;
 			run_cost += fpinfo->local_conds_cost.per_tuple * retrieved_rows;
+
+			/* Add in the tlist eval cost for each output row */
+			startup_cost += foreignrel->reltarget->cost.startup;
+			run_cost += foreignrel->reltarget->cost.per_tuple * rows;
 		}
 		else if (IS_UPPER_REL(foreignrel))
 		{
 			PgFdwRelationInfo *ofpinfo;
-			PathTarget *ptarget = foreignrel->reltarget;
 			AggClauseCosts aggcosts;
 			double		input_rows;
 			int			numGroupCols;
 			double		numGroups = 1;
 
-			/* Make sure the core code set the pathtarget. */
-			Assert(ptarget != NULL);
-
 			/*
 			 * This cost model is mixture of costing done for sorted and
 			 * hashed aggregates in cost_agg().  We are not sure which
@@ -2918,26 +2940,22 @@ estimate_path_cost_size(PlannerInfo *root,
 			 * Startup cost includes:
 			 *	  1. Startup cost for underneath input relation
 			 *	  2. Cost of performing aggregation, per cost_agg()
-			 *	  3. Startup cost for PathTarget eval
 			 *-
 			 */
 			startup_cost = ofpinfo->rel_startup_cost;
 			startup_cost += aggcosts.transCost.startup;
 			startup_cost += aggcosts.transCost.per_tuple * input_rows;
 			startup_cost += (cpu_operator_cost * numGroupCols) * input_rows;
-			startup_cost += ptarget->cost.startup;
 
 			/*-
 			 * Run time cost includes:
 			 *	  1. Run time cost of underneath input relation
 			 *	  2. Run time cost of performing aggregation, per cost_agg()
-			 *	  3. PathTarget eval cost for each output row
 			 *-
 			 */
 			run_cost = ofpinfo->rel_total_cost - ofpinfo->rel_startup_cost;
 			run_cost += aggcosts.finalCost * numGroups;
 			run_cost += cpu_tuple_cost * numGroups;
-			run_cost += ptarget->cost.per_tuple * numGroups;
 
 			/* Accout for the eval cost of HAVING quals, if any */
 			if (root->parse->havingQual)
@@ -2952,6 +2970,10 @@ estimate_path_cost_size(PlannerInfo *root,
 startup_cost += fpinfo->local_conds_cost.startup;
 run_cost += fpinfo->local_conds_cost.per_tuple * retrieved_rows;
 			}
+
+			/* Add in the tlist eval cost for each output row */
+			startup_cost += foreignrel->reltarget->cost.startup;
+			run_cost += foreignrel->reltarget->cost.per_tuple * rows;
 		}
 		else
 		{
@@ -2970,6 +2992,10 @@ estimate_path_cost_size(PlannerInfo *root,
 			startup_cost += foreignrel->baserestrictcost.startup;
 			cpu_per_tuple = cpu_tuple_cost + foreignrel->baserestrictcost.per_tuple;
 			ru

Re: commit fest app: Authors

2019-01-04 Thread Magnus Hagander
On Fri, Jan 4, 2019 at 12:26 PM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> Why does the commit fest app not automatically fill in the author for a
> new patch?
>
> And relatedly, every commit fest, there are a few patches registered
> without authors, probably because of the above behavior.
>
> Could this be improved?
>

Can't say I recall why it was done that way originally. I assume you're
suggesting the author should default to whoever adds it to the app? Or to
somehow try to match it out of the attached email?

I'm guessing the original discussion could have something to do with a time
when a lot of authors didn't register their own patches (think good old
wiki days), it was just a third party who filled it all out. But today
there is probably no reason not to default it to that, as long as it can be
changed.

I'm not sure it's a good idea to *enforce* it -- because you can't register
an author until they have logged into the CF app once. And I think
registering it with no author is better than registering it with the wrong
author.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: don't create storage when unnecessary

2019-01-04 Thread Peter Eisentraut
On 18/12/2018 08:18, Amit Langote wrote:
> Sorry to be saying it late, but I have to agree with Horiguchi-san here
> that RELKIND_HAS_STORAGE sounds better and is clear.

I think I agree.  Does someone want to send a patch?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



commit fest app: Authors

2019-01-04 Thread Peter Eisentraut
Why does the commit fest app not automatically fill in the author for a
new patch?

And relatedly, every commit fest, there are a few patches registered
without authors, probably because of the above behavior.

Could this be improved?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Feature: triggers on materialized views

2019-01-04 Thread Peter Eisentraut
On 28/12/2018 08:43, Mitar wrote:
> A summary of the patch: This patch enables adding AFTER triggers (both
> ROW and STATEMENT) on materialized views. They are fired when doing
> REFRESH MATERIALIZED VIEW CONCURRENTLY for rows which have changed.

What bothers me about this patch is that it subtly changes what a
trigger means.  It currently means, say, INSERT was executed on this
table.  You are expanding that to mean, a row was inserted into this
table -- somehow.

Triggers should generally refer to user-facing commands.  Could you not
make a trigger on REFRESH itself?

> Triggers are not fired if you call REFRESH without CONCURRENTLY. This
> is based on some discussion on the mailing list because implementing
> it for without CONCURRENTLY would require us to add logic for firing
> triggers where there was none before (and is just an efficient heap
> swap).

This is also a problem, because it would allow bypassing the trigger
accidentally.

Moreover, consider that there could be updatable materialized views,
just like there are updatable normal views.  And there could be triggers
on those updatable materialized views.  Those would look similar but
work quite differently from what you are proposing here.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: minor fix in CancelVirtualTransaction

2019-01-04 Thread Peter Eisentraut
On 06/12/2018 21:37, Alvaro Herrera wrote:
> When scanning the list of virtual transactions looking for one
> particular vxid, we cancel the transaction *and* break out of the loop
> only if both backendId and localTransactionId matches us.  The canceling
> part is okay, but limiting the break to that case is pointless: if we
> found the correct backendId, there cannot be any other entry with the
> same backendId.  Rework the loop so that we break out of it if backendId
> matches even if the localTransactionId part doesn't.

Your reasoning seems correct to me.

Maybe add a code comment along the lines of "once we have found the
right ... we don't need to check the remaining ...".

Or, you can make this even more clear by comparing the backendId
directly with the proc entry:

if (vxid.backendId == proc->backendId)
{
if (vxid.localTransactionId == proc->lxid)
{

}
break;
}

Then the logic you are trying to achieve would be obvious.
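
For illustration, the whole loop might then end up looking roughly like
this (a sketch only, assuming the rest of CancelVirtualTransaction in
procarray.c stays as it is):

    for (index = 0; index < arrayP->numProcs; index++)
    {
        int         pgprocno = arrayP->pgprocnos[index];
        PGPROC     *proc = &allProcs[pgprocno];

        if (vxid.backendId == proc->backendId)
        {
            if (vxid.localTransactionId == proc->lxid)
            {
                proc->recoveryConflictPending = true;
                pid = proc->pid;
                if (pid != 0)
                    (void) SendProcSignal(pid, sigmode, vxid.backendId);
            }

            /*
             * Only one ProcArray entry can have this backendId, so we can
             * stop looking whether or not the lxid matched.
             */
            break;
        }
    }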

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [PATCH] Pass COPT and PROFILE to CXXFLAGS as well

2019-01-04 Thread Peter Eisentraut
On 21/11/2018 14:28, Christoph Berg wrote:
> The context here is that we want to use the *FLAGS from pg_config for
> compiling PG extension packages, but add additional *FLAGS from the
> extension build environment. Merging the pg_config CFLAGS with the
> environment CFLAGS seemed hard/weird/error-prone, so what we are doing
> now is 
> https://salsa.debian.org/postgresql/postgresql-common/blob/master/pg_buildext#L53-55
> 
> if [ "${CFLAGS:-}" ]; then
> export COPT="$CFLAGS"
> fi

Note that pgxs supports PG_CPPFLAGS for adding custom pieces to CPPFLAGS
in a safe way.  Maybe we should add an equivalent for CFLAGS?  It's just
that it hasn't been needed so far.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: Update ssl test certificates and keys

2019-01-04 Thread Michael Paquier
On Thu, Jan 03, 2019 at 03:36:36PM +0100, Peter Eisentraut wrote:
> done

Thanks, Peter.
--
Michael




Re: pg_dump multi VALUES INSERT

2019-01-04 Thread Surafel Temesgen
On Thu, Jan 3, 2019 at 1:38 AM David Rowley 
wrote:

> On Thu, 3 Jan 2019 at 01:50, Surafel Temesgen 
> wrote:
> > On Mon, Dec 31, 2018 at 12:38 PM David Rowley <
> david.row...@2ndquadrant.com> wrote:
> >> Just looking at the v5 patch, it seems not to handle 0 column tables
> correctly.
>
> > The attached patch contains a fix for it
>
> + /* if it is zero-column table then we're done */
> + if (nfields == 0)
> + {
> + archputs(insertStmt->data, fout);
> + continue;
> + }
>
> So it looks like you're falling back on one INSERT per row for this case.
> Given that this function is meant to be emitting 'dump_inserts_multiple'
> rows per INSERT, I think the comment should explain why we can't do a
> multi-row insert here; "we're done" is just not enough detail.
>
>
Right, the attached patch adds a more detailed comment.
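
Something along these lines, for example (a sketch only; the wording in the
attached patch may differ):

    /*
     * A zero-column table cannot use the multi-row VALUES syntax, since
     * there is nothing to put between the parentheses.  Fall back to
     * emitting one INSERT ... DEFAULT VALUES statement per row.
     */
    if (nfields == 0)
    {
        archputs(insertStmt->data, fout);
        continue;
    }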

regards
Surafel
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2015410a42..ee94d1d293 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -775,6 +775,18 @@ PostgreSQL documentation
   
  
 
+ 
+  --insert-multi
+  
+   
+Specify the number of rows per INSERT command.
+This makes the dump file smaller than --inserts and is faster to
+reload, but on error the data of the entire affected INSERT
+statement is lost rather than just the single failing row.
+   
+  
+ 
+
  
   --load-via-partition-root
   
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..73a243ecb0 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -72,6 +72,7 @@ typedef struct _restoreOptions
 	int			dropSchema;
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;	/* Skip comments */
@@ -144,6 +145,7 @@ typedef struct _dumpOptions
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 341b1a51f2..050ef89650 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -313,6 +313,7 @@ main(int argc, char **argv)
 	int			plainText = 0;
 	ArchiveFormat archiveFormat = archUnknown;
 	ArchiveMode archiveMode;
+	char   *p;
 
 	static DumpOptions dopt;
 
@@ -359,6 +360,7 @@ main(int argc, char **argv)
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
 		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"insert-multi", required_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, "e_all_identifiers, 1},
@@ -557,6 +559,27 @@ main(int argc, char **argv)
 dosync = false;
 break;
 
+			case 8:			/* inserts values number */
+errno = 0;
+dopt.dump_inserts_multiple = strtol(optarg, &p, 10);
+if (p == optarg || *p != '\0')
+{
+	write_msg(NULL, "argument of --insert-multi must be a number\n");
+	exit_nicely(1);
+}
+if (errno == ERANGE)
+{
+	write_msg(NULL, "argument of --insert-multi exceeds integer range.\n");
+	exit_nicely(1);
+}
+if (dopt.dump_inserts_multiple < 0)
+{
+	write_msg(NULL, "argument of --insert-multi must be positive number\n");
+	exit_nicely(1);
+}
+
+break;
+
 			default:
 fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 exit_nicely(1);
@@ -607,8 +630,9 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts ||
+		dopt.dump_inserts_multiple))
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts, --insert-multi or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -877,6 +901,7 @@ main(int argc, char **argv)
 	ropt->use_setsessauth = dopt.use_setsessauth;
 	ropt->disable_dollar_quoting = dopt.disable_dollar_quoting;
 	ropt->dump_inserts = dopt.dump_inserts;
+	ropt->dump_inserts_multiple = dopt.dump_inserts_multiple;
 	ropt->no_comments = dopt.no_comments;
 	ropt->no_publications = dopt.no_publications;
 	ropt->no_security_labels = dopt.no_security_labels;
@@ -967,6 +992,7 @@ help(const char *progname)
 	printf(_("  --exclude-table-data=TABLE   do NOT dump data for the named table(s)\n"));
 	printf(_("  --if-exists  use IF EXISTS when dropping object

Re: [HACKERS] logical decoding of two-phase transactions

2019-01-04 Thread Nikhil Sontakke
Hi Arseny,

> I hadn't checked whether my concerns were addressed in the latest
> version though.
>

I'd like to believe that the latest patch set tries to address some
(if not all) of your concerns. Can you please take a look and let me
know?

Regards,
Nikhil

--
Nikhil Sontakke
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/



Re: [HACKERS] logical decoding of two-phase transactions

2019-01-04 Thread Nikhil Sontakke
Hi Tomas,

> Thanks for the updated patch - I've started working on a review, with
> the hope of getting it committed sometime in 2019-01. But the patch
> bit-rotted again a bit (probably due to d3c09b9b), which broke the last
> part. Can you post a fixed version?
>

PFA, updated patch set.

Regards,
Nikhil

> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Nikhil Sontakke
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/


0001-Cleaning-up-of-flags-in-ReorderBufferTXN-structure.Jan4.patch
Description: Binary data


0002-Support-decoding-of-two-phase-transactions-at-PREPAR.Jan4.patch
Description: Binary data


0003-Gracefully-handle-concurrent-aborts-of-uncommitted-t.Jan4.patch
Description: Binary data


0004-Teach-test_decoding-plugin-to-work-with-2PC.Jan4.patch
Description: Binary data