Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-23 Thread Masahiko Sawada
On Thu, Apr 20, 2017 at 9:31 AM, Kyotaro HORIGUCHI
 wrote:
> Ok, I got the point.
>
> At Wed, 19 Apr 2017 17:39:01 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
>  wrote in 
> <20170419.173901.16598616.horiguchi.kyot...@lab.ntt.co.jp>
>> > >> |
>> > >> | Quorum-based synchronous replication is basically more
>> > >> | efficient than priority-based one when you specify multiple
>> > >> | standbys in synchronous_standby_names and want
>> > >> | to synchronously replicate transactions to two or more of
>> > >> | them.
>
> "Some" means "not all".
>
>> > >> | In the priority-based case, the replication master
>> > >> | must wait for a reply from the slowest standby in the
>> > >> | required number of standbys in priority order, which may
>> > >> | be slower than the rest.
>
>
> Quorum-based synchronous replication is expected to be more
> efficient than priority-based replication when your master doesn't
> need to be in sync with all of the standbys nominated by
> synchronous_standby_names.  While a quorum-based replication
> master waits only for a specified number of the fastest standbys,
> a priority-based replication master must wait for the standbys at
> the top of the list, which may be slower than the rest.

This description looks good to me. I've updated the patch based on
this description and attached it.
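For illustration (not part of the patch text), the two setups being contrasted look
roughly like this in postgresql.conf, with s1..s3 as placeholder standby names:

    # priority-based: wait for the two standbys that are highest in list order
    synchronous_standby_names = 'FIRST 2 (s1, s2, s3)'

    # quorum-based: wait for any two of the listed standbys, whichever reply first
    synchronous_standby_names = 'ANY 2 (s1, s2, s3)'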

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


quorum_repl_doc_improve_v3.patch
Description: Binary data



Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Andres Freund


On April 23, 2017 10:31:18 PM PDT, Petr Jelinek  
wrote:
>On 24/04/17 04:31, Petr Jelinek wrote:
>So actually maybe running regression tests through it might be a
>reasonable approach if we add a new make target for it.

That sounds like a good plan.


>Note that the first patch is huge. That's because I needed to add
>alternative output for largeobject test because it uses fastpath
>function calls which are not allowed over replication protocol.

There's no need for that restriction, is there?  At least for db walsenders...

>As part of this I realized that the parser fallback that I wrote
>originally for handling SQL in walsender is not robust enough, as the
>lexer would fail on some valid SQL statements. So there is also a second
>patch that does this using a different approach which allows any SQL
>statement.

Haven't looked at the new thing yet, but I'm not particularly surprised...  
Wonder if there's a good way to fully integrate both grammars, without 
exploding keywords.

Andres

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




[HACKERS] visual studio 2017 build support

2017-04-23 Thread Haribabu Kommi
Attached is a small patch that adds build support for
Visual Studio 2017.

The tools version number is still 14.X, even though the VS 2017
product version is 15.0. I adjusted the version handling accordingly.

Regards,
Hari Babu
Fujitsu Australia


vs2017_build_support.patch
Description: Binary data



Re: [HACKERS] pg_basebackup issue

2017-04-23 Thread David G. Johnston
For reference this has been asked, and eventually answered on -general at:

https://www.postgresql.org/message-id/flat/CAKFQuwZDS7nA0SvVnumwjHBxz4CWKQm3bVNTHVeWdtAW_oXNJg%40mail.gmail.com#cakfquwzds7na0svvnumwjhbxz4cwkqm3bvnthvewdtaw_ox...@mail.gmail.com

Further comments below; partly a rehash of the conclusion drawn by Adrian
Klaver on that thread.

On Sun, Apr 23, 2017 at 11:55 AM, chiru r  wrote:

>
> postgres=#
> postgres=# create user backup_admin password 'X';
> CREATE ROLE
> postgres=# create role dba_admin SUPERUSER REPLICATION;
> CREATE ROLE
> postgres=# grant dba_admin to backup_admin;
> GRANT ROLE
> postgres=# alter user backup_admin set role to dba_admin;
> ALTER ROLE
>
> postgres=# \du
>                                      List of roles
>   Role name   |                          Attributes                         |  Member of
> --------------+-------------------------------------------------------------+-------------
>  backup_admin |                                                             | {dba_admin}
>  dba_admin    | Superuser, Cannot login, Replication                        | {}
>  postgres     | Superuser, Create role, Create DB, Replication, Bypass RLS  | {}
>
> [postgres@pgserver ~]$ mkdir online_backups1
> [postgres@pgserver ~]$ /opt/PostgreSQL/9.5/bin/pg_basebackup --format=t --pgdata=online_backups1 -p 5432 -U backup_admin -x -z --verbose
> pg_basebackup: could not connect to server: FATAL:  must be superuser or replication role to start walsender
>
> *Please help me understand why pg_basebackup is throwing FATAL when I use
> backup_admin.*
>
>
pg_basebackup is dying because the role specified, -U backup_admin,
has neither the SUPERUSER nor the REPLICATION privilege, since those two
privileges are not passed down via inheritance.  This is a feature.
As noted on the other thread, one could ask for a new feature (via another
role attribute) that tells PostgreSQL to pass down those privileges via
inheritance.  That seems like the most useful solution if one believes
that having such an attribute would be an improvement over explicitly
defining whether specific login roles are replication or, the
all-inclusive, superuser.

The reason the "ALTER USER .. SET ROLE TO" doesn't make any difference here
is because pg_backup doesn't specify a database and the table
pg_db_role_setting, which where that command stores its data, is only
consulted after a successful connection to a specific database has been
established.  That doesn't happen here.
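(For the archives: given that these attributes deliberately don't inherit, the
straightforward workaround is simply to grant the attribute directly to the login
role, along the lines of the following, reusing the role name from the example above:

    ALTER ROLE backup_admin REPLICATION;

After that, pg_basebackup -U backup_admin should be allowed to start a walsender.)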

*Is there any limitation in the pg_basebackup utility?*
>

I suppose so,
if you look at it from the standpoint that pg_basebackup operates at the
physical data file level and not the SQL level.

Others with more authority will voice their own opinions, but I'm not sure
that changing the inheritance behavior is an option at this point.  Making ALTER
USER ... SET ROLE work here is plausible but hackish.  The new role
attribute just seems messy.

While I guess I get the appeal of having everything defined via group roles
and implicit inheritance, it still sounds like a purely aesthetic preference,
which is contrary to existing design decisions.

David J.


Re: [HACKERS] TAP tests - installcheck vs check

2017-04-23 Thread Andrew Dunstan


On 04/23/2017 10:33 PM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> AFAICT, unlike the pg_regress checks, which in the installcheck case run
>> against a running instance of postgres, for TAP tests the only
>> difference is that for the check case a temp install is done,
>> possibly with some extra contrib modules. Is that correct? If it is, why
>> aren't we providing an installcheck target for tests like recovery? In at
>> least one case (buildfarm animal jacana) installs are quite expensive (2 or 3
>> minutes), and if they are pointless, as seems to be the case here, why
>> can't we just avoid them?
> A lot of those test cases involve setting non-default configuration
> parameters and/or stopping/starting the postmaster.  So I can't see how
> we would run them against a pre-existing live cluster, which is the usual
> meaning of "make installcheck".
>
> I think what you're imagining is skipping redundant builds of the
> "tmp_install" tree by using an installation tree with a temporary $PGDATA
> directory.  That seems like a fine idea, but we need another word for it.
>
>   


That's actually the current meaning of installcheck w.r.t. TAP. See
Makefile.global.in. It's what we run (mostly) in the buildfarm for the
bin tests.
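Roughly, using one of the bin suites as an example (and assuming a tree configured
with --enable-tap-tests), the two spellings currently behave like this:

    # builds a throwaway install under tmp_install/ and runs the TAP suite against it
    make -C src/bin/pg_basebackup check

    # reuses an already-installed tree (prove_installcheck), but the TAP tests
    # still create their own temporary data directories
    make -C src/bin/pg_basebackup installcheck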

I agree entirely that it's confusing as heck. +1 for inventing a new name.

cheers

andrew

-- 
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] [COMMITTERS] pgsql: Replication lag tracking for walsenders

2017-04-23 Thread Tom Lane
Thomas Munro  writes:
> On Sun, Apr 23, 2017 at 6:01 PM, Tom Lane  wrote:
>> Fair enough.  But I'd still like an explanation of why only about
>> half of the population is showing a failure here.  Seems like every
>> machine should be seeing the LSN as moving backwards in this test.
>> So (a) why aren't they all failing, and (b) should we change the
>> test to make sure every platform sees that happening?

> Every machine sees the LSN moving backwards, but the code path that
> had the assertion is only reached if it decides to interpolate, which is
> timing dependent: there needs to be a future sample in the lag
> tracking buffer, which I guess is not the case in those runs.

I'm dissatisfied with this explanation because if it's just timing,
it doesn't seem very likely that some machines would reproduce the
failure every single time while others never would.  Maybe that can be
blamed on kernel scheduler vagaries + different numbers of cores, but
I can't escape the feeling that there's something here we've not
fully understood.

While chasing after this earlier today, I turned on some debug logging
and noted that the standby's reports look like

2017-04-23 15:46:46.206 EDT [34829] LOG:  database system is ready to accept read only connections
2017-04-23 15:46:46.212 EDT [34834] LOG:  fetching timeline history file for timeline 2 from primary server
2017-04-23 15:46:46.212 EDT [34834] LOG:  started streaming WAL from primary at 0/300 on timeline 1
2017-04-23 15:46:46.213 EDT [34834] LOG:  sending write 0/302 flush 0/3028470 apply 0/3028470
2017-04-23 15:46:46.214 EDT [34834] LOG:  replication terminated by primary server
2017-04-23 15:46:46.214 EDT [34834] DETAIL:  End of WAL reached on timeline 1 at 0/3028470.
2017-04-23 15:46:46.214 EDT [34834] LOG:  sending write 0/3028470 flush 0/3028470 apply 0/3028470
2017-04-23 15:46:46.214 EDT [34830] LOG:  new target timeline is 2
2017-04-23 15:46:46.214 EDT [34834] LOG:  restarted WAL streaming at 0/300 on timeline 2
2017-04-23 15:46:46.228 EDT [34834] LOG:  sending write 0/302 flush 0/3028470 apply 0/3028470

So you're right that the standby's reported "write" position can go
backward, but it seems pretty darn odd that the flush and apply
positions didn't go backward too.  Is there a bug there?

I remain of the opinion that if we can't tell from the transmitted
data whether a timeline switch has caused the position to go backward,
then that's a protocol shortcoming that ought to be fixed.

regards, tom lane




Re: [HACKERS] identity columns

2017-04-23 Thread Michael Paquier
On Mon, Apr 24, 2017 at 10:03 AM, Vitaly Burovoy
 wrote:
> On 4/23/17, Robert Haas  wrote:
>> On Thu, Apr 20, 2017 at 12:05 AM, Vitaly Burovoy
>>  wrote:
>> But why do we need it?  Instead of:
>>
>> ADD GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
>> SET GENERATED { ALWAYS | BY DEFAULT }
>> DROP IDENTITY [ IF EXISTS ]
>>
>> Why not just:
>>
>> SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
>> DROP IDENTITY [ IF EXISTS ]
>>
>> Surely the ALTER TABLE command can tell whether the column is already
>> GENERATED, so the first form could make it generated if it's not and
>> adjust the ALWAYS/BY DEFAULT property if it is.
>
> I thought exactly that way, but Peter gave an explanation[1].
> I had to search for a different way because no one joined the
> discussion at that time.
> One of Peter's reasons was to make "SET GENERATED" follow the
> standard (i.e. raise an error).
> I asked whether "IF NOT EXISTS" would work for him instead of "ADD GENERATED".
> The answer[2] was "It could be done", but "it is very difficult to implement".
>
> So I wonder why there is no desire to commit the adjustment patch.

Same line of thought here. As far as I understand, ADD GENERATED and
SET GENERATED have a lot in common, and SET GENERATED follows the SQL
spec while ADD GENERATED does not, so I see no good reason not to
simplify the interface by keeping SET GENERATED and dropping ADD. This
would be less confusing to users.
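To make the comparison concrete, a sketch of the two interfaces being discussed,
with placeholder table/column names:

    -- committed interface: a separate ADD to create the identity property
    ALTER TABLE tbl ALTER COLUMN col ADD GENERATED ALWAYS AS IDENTITY;
    ALTER TABLE tbl ALTER COLUMN col SET GENERATED BY DEFAULT;
    ALTER TABLE tbl ALTER COLUMN col DROP IDENTITY IF EXISTS;

    -- simplification proposed above: SET also creates the property when it is missing
    ALTER TABLE tbl ALTER COLUMN col SET GENERATED ALWAYS AS IDENTITY;
    ALTER TABLE tbl ALTER COLUMN col DROP IDENTITY IF EXISTS;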
-- 
Michael




Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-23 Thread Michael Paquier
On Sun, Apr 23, 2017 at 10:15 AM, Petr Jelinek
 wrote:
> On 21/04/17 06:11, Michael Paquier wrote:
>> On Fri, Apr 21, 2017 at 12:29 AM, Peter Eisentraut
>>  wrote:
>> Hmm. I have been actually looking at this solution and I am having
>> doubts regarding its robustness. In short this would need to be
>> roughly a two-step process:
>> - In PostmasterStateMachine(), SIGUSR2 is sent to the checkpointer to
>> make it call ShutdownXLOG(). Before doing that, a first signal should
>> be sent to all the WAL senders with
>> SignalSomeChildren(BACKEND_TYPE_WALSND). SIGUSR2 or SIGINT could be
>> used.
>> - At reception of this signal, all WAL senders switch to a stopping
>> state, refusing commands that can generate WAL.
>> - Checkpointer looks at the state of all WAL senders, looping with a
>> sleep call of a couple of ms, refusing to launch the shutdown
>> checkpoint until all WAL senders have switched to the
>> stopping state.
>> - In reaper(), once the checkpointer is confirmed as stopped, signal the
>> WAL senders again, and tell them to perform the last loop.

OK, I have been hacking on that, finishing with the attached. In the
attached patch I am using SIGUSR2 to instruct the WAL senders to prepare for
stopping, and SIGINT to handle the last WAL flush loop. The shutdown
checkpoint moves on only once all active WAL senders are marked with a
STOPPING state. Reviews are welcome.

>> After that, I got a second, more simple idea.
>> CheckpointerShmem->ckpt_flags holds the information about checkpoints
>> currently running, so we could have the WAL senders look at this data
>> and prevent any commands generating WAL. The checkpointer may be
>> already stopped at the moment the WAL senders finish their loop, so we
>> need also to check if the checkpointer is running or not on those code
>> paths. Such safeguards may actually be enough for the problem of this
>> thread. Thoughts?
>>
>
> Hmm but how do we handle statements that are already in progress by the
> time ckpt_flags changes?

Yup, this does not handle race conditions well.
-- 
Michael


walsender-chkpt-v1.patch
Description: Binary data



Re: [HACKERS] TAP tests - installcheck vs check

2017-04-23 Thread Tom Lane
Andrew Dunstan  writes:
> AFAICT, unlike the pg_regress checks, which in the installcheck case run
> against a running instance of postgres, for TAP tests the only
> difference is that for the check case a temp install is done,
> possibly with some extra contrib modules. Is that correct? If it is, why
> aren't we providing an installcheck target for tests like recovery? In at
> least one case (buildfarm animal jacana) installs are quite expensive (2 or 3
> minutes), and if they are pointless, as seems to be the case here, why
> can't we just avoid them?

A lot of those test cases involve setting non-default configuration
parameters and/or stopping/starting the postmaster.  So I can't see how
we would run them against a pre-existing live cluster, which is the usual
meaning of "make installcheck".

I think what you're imagining is skipping redundant builds of the
"tmp_install" tree by using an installation tree with a temporary $PGDATA
directory.  That seems like a fine idea, but we need another word for it.

regards, tom lane




Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Andres Freund
On 2017-04-24 04:27:58 +0200, Petr Jelinek wrote:
> On 24/04/17 01:43, Andres Freund wrote:
> > 
> >> BTW while looking at the code, I don't understand why we call
> >> latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just
> >> the SetLatch be enough (they both end up calling sendSelfPipeByte()
> >> eventually)?
> > 
> > Historic raisins - there didn't use to be a SetLatch in
> > procsignal_sigusr1_handler. That changed when I whacked around catchup &
> > notify to be based on latches ([1] and following).
> > 
> > [1] 
> > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=59f71a0d0b56b2df48db4bf1738aece5551f7a47
> > 
> 
> Okay, but why call both SetLatch and latch_sigusr1_handler? What does
> that buy us?

Nothing.  It's how the code evolved, we can change that.
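(For the archives, the shape both handlers could presumably collapse to is the usual
latch-only pattern, shown here as an untested sketch with an arbitrary handler name:

    static void
    wakeup_handler(SIGNAL_ARGS)
    {
        int         save_errno = errno;

        SetLatch(MyLatch);      /* performs the self-pipe write itself if needed */

        errno = save_errno;
    }

i.e. the extra latch_sigusr1_handler() call buys nothing on top of SetLatch().)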




Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Andres Freund
On 2017-04-24 04:26:16 +0200, Petr Jelinek wrote:
> > WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler
> > currently sets a different variable from postgres.c, but that seems like
> > a bad idea, because afaics we'll plainly ignore SIGHUPS unless in
> > WalSndLoop, WalSndWriteData,WalSndWaitForWal.  That actually seems like
> > an active bug to me?
> > 
> 
> Hmm I don't think good solution is to use same variable though,
> walsender needs to do slightly different thing on SIGHUP,

Hm? You mean SyncRepInitConfig()?  I don't think that matters, because
it's only needed if config is changed while streaming.

> plus the
> got_SIGHUP is not the type of variable we want to export. I wonder if it
> would be enough to just add check for got_SIGHUP somewhere at the
> beginning of exec_replication_command().

I don't think that'd be sufficient, because we really want config
changes to get picked up while idle, and that won't work from within
exec_replication_command().  And that pretty much means it'll have to
happen outside of walsender.c.

FWIW, while it's not pretty, I actually see very little reason not to
share got_SIGHUP (under a less generic name) between different
types of processes (i.e. putting it in miscadmin.h or such). There's
no benefit in duplicating it everywhere, and without sharing it you run
into the issue presented here.
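To illustrate what sharing the flag would buy, the walsender side could then
presumably be reduced to something like the following, run wherever the process goes
idle (a sketch only; the flag's name and exact placement are precisely the open
questions above):

    if (got_SIGHUP)
    {
        got_SIGHUP = false;
        ProcessConfigFile(PGC_SIGHUP);
        SyncRepInitConfig();    /* walsender-specific extra step */
    }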

Greetings,

Andres Freund




Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Petr Jelinek
On 24/04/17 02:04, Andres Freund wrote:
> Hi,
> 
> Oh, and one more point: There desperately need to be tests exercising
> "SQL via walsender". Including the issue of parallelism here, the missed
> cancel handler, timeouts, and such.  One way to do so is to use
> pg_regress with an adjusted connection string (specifying
> replication=database), doing the same for isolationtester, or using
> dblink.
> 

Well, there are some tests (logical replication uses that
functionality), but it's not quite clear to me what exactly to run there.
I assume you don't want to rerun the whole regression suite against the
walsender, given the time it would take in normal test runs (although I could
see doing that optionally somehow), but then what should be picked from it?

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Petr Jelinek
On 24/04/17 01:43, Andres Freund wrote:
> 
>> BTW while looking at the code, I don't understand why we call
>> latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just
>> the SetLatch be enough (they both end up calling sendSelfPipeByte()
>> eventually)?
> 
> Historic raisins - there didn't use to be a SetLatch in
> procsignal_sigusr1_handler. That changed when I whacked around catchup &
> notify to be based on latches ([1] and following).
> 
> [1] 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=59f71a0d0b56b2df48db4bf1738aece5551f7a47
> 

Okay, but why call both SetLatch and latch_sigusr1_handler? What does
that buy us?

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Petr Jelinek
On 24/04/17 01:59, Andres Freund wrote:
> Hi,
> 
> On 2017-04-22 17:53:19 +0200, Petr Jelinek wrote:
>> Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it
>> does not break anything for existing walsender usage.
> 
>> diff --git a/src/backend/replication/logical/launcher.c 
>> b/src/backend/replication/logical/launcher.c
>> index 761fbfa..e9dd886 100644
>> --- a/src/backend/replication/logical/launcher.c
>> +++ b/src/backend/replication/logical/launcher.c
>> @@ -254,7 +254,7 @@ logicalrep_worker_launch(Oid dbid, Oid subid, const char 
>> *subname, Oid userid,
>>  BackgroundWorkerbgw;
>>  BackgroundWorkerHandle *bgw_handle;
>>  int i;
>> -int slot;
>> +int slot = 0;
>>  LogicalRepWorker   *worker = NULL;
>>  int nsyncworkers = 0;
>>  TimestampTz now = GetCurrentTimestamp();
> 
> That seems independent and unnecessary?
> 

Yeah, that leaked in from a different patch I was working on in parallel.

> 
>> -/* SIGUSR1: set flag to send WAL records */
>> -static void
>> -WalSndXLogSendHandler(SIGNAL_ARGS)
>> -{
>> -int save_errno = errno;
>> -
>> -latch_sigusr1_handler();
>> -
>> -errno = save_errno;
>> -}
> 
>>  pqsignal(SIGPIPE, SIG_IGN);
>> -pqsignal(SIGUSR1, WalSndXLogSendHandler);   /* request WAL sending 
>> */
>> +pqsignal(SIGUSR1, procsignal_sigusr1_handler);
> 
> Those aren't actually entirely equivalent. I'm not sure it matters much,
> but WalSndXLogSendHandler didn't include a SetLatch(), but
> procsignal_sigusr1_handler() does.  Normally redundant SetLatch() calls
> will just be elided, but what if walsender already woke up and did it's
> work?
> 

They aren't exactly the same, no, but I don't see any harm in what
procsignal_sigusr1_handler does. I mean, the worst-case scenario is that the
latch is set and the walsender checks for work.

> I think it'd be better to have PostgresMain() set up signals by default,
> and then have WalSndSignals() overwrites the ones it needs separate.

Hmm that sounds like a good idea.

> WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler
> currently sets a different variable from postgres.c, but that seems like
> a bad idea, because afaics we'll plainly ignore SIGHUPS unless in
> WalSndLoop, WalSndWriteData,WalSndWaitForWal.  That actually seems like
> an active bug to me?
> 

Hmm, I don't think using the same variable is a good solution though;
the walsender needs to do a slightly different thing on SIGHUP, plus
got_SIGHUP is not the type of variable we want to export. I wonder if it
would be enough to just add a check for got_SIGHUP somewhere at the
beginning of exec_replication_command().

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] some review comments on logical rep code

2017-04-23 Thread Masahiko Sawada
On Sat, Apr 22, 2017 at 4:26 AM, Fujii Masao  wrote:
> On Fri, Apr 21, 2017 at 4:02 PM, Masahiko Sawada  
> wrote:
>> On Fri, Apr 21, 2017 at 1:19 AM, Fujii Masao  wrote:
>>> On Tue, Apr 18, 2017 at 5:16 PM, Masahiko Sawada  
>>> wrote:
 On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI
  wrote:
> Hi,
>
> Thank you for the revised version.
>
> At Mon, 17 Apr 2017 23:29:28 +0900, Masahiko Sawada 
>  wrote in 
> 
>> On Mon, Apr 17, 2017 at 9:13 PM, Masahiko Sawada  
>> wrote:
>> > On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI
>> >  wrote:
>> >> At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada 
>> >>  wrote in 
>> >> 

Re: [HACKERS] identity columns

2017-04-23 Thread Vitaly Burovoy
On 4/23/17, Robert Haas  wrote:
> On Thu, Apr 20, 2017 at 12:05 AM, Vitaly Burovoy
>  wrote:
>> OK. Let's go through it again.
>> IDENTITY is a property of a column. There are no syntax to change any
>> property of any DB object via the "ADD" syntax.
>> Yes, a structure (a sequence) is created. But in fact it cannot be
>> independent from the column at all (I remind you that according to the
>> standard it should be unnamed sequence and there are really no way to
>> do something with it but via the column's DDL).
>
> I agree that ADD is a little odd here, but it doesn't seem terrible.

Yes, it is not terrible, but why do we need to support an odd syntax
if we can use a correct one?
If we leave it as it was committed, we have to support it for years.

> But why do we need it?  Instead of:
>
> ADD GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
> SET GENERATED { ALWAYS | BY DEFAULT }
> DROP IDENTITY [ IF EXISTS ]
>
> Why not just:
>
> SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
> DROP IDENTITY [ IF EXISTS ]
>
> Surely the ALTER TABLE command can tell whether the column is already
> GENERATED, so the first form could make it generated if it's not and
> adjust the ALWAYS/BY DEFAULT property if it is.

I thought exactly that way, but Peter gave an explanation[1].
I had to search for a different way because no one joined the
discussion at that time.
One of Peter's reasons was to make "SET GENERATED" follow the
standard (i.e. raise an error).
I asked whether "IF NOT EXISTS" would work for him instead of "ADD GENERATED".
The answer[2] was "It could be done", but "it is very difficult to implement".

So I wonder why there is no desire to commit the adjustment patch.

>> It is even hard to detect which sequence (since they have names) is
>> owned by the column:
>>
>> postgres=# CREATE TABLE xxx(i int generated always as identity, j
>> serial);
>> CREATE TABLE
>> postgres=# \d xxx*
>> Table "public.xxx"
>>  Column |  Type   | Collation | Nullable |Default
>> +-+---+--+
>>  i  | integer |   | not null | generated always as identity
>>  j  | integer |   | not null | nextval('xxx_j_seq'::regclass)
>>
>>  Sequence "public.xxx_i_seq"
>>Column   |  Type   | Value
>> +-+---
>>  last_value | bigint  | 1
>>  log_cnt| bigint  | 0
>>  is_called  | boolean | f
>>
>>  Sequence "public.xxx_j_seq"
>>Column   |  Type   | Value
>> +-+---
>>  last_value | bigint  | 1
>>  log_cnt| bigint  | 0
>>  is_called  | boolean | f
>> Owned by: public.xxx.j
>>
>> I can only guess that "public.xxx_i_seq" is owned by "public.xxx.i";
>> nothing proves that.
>> Whereas for a regular sequence there are two pieces of evidence ("Default"
>> and "Owned by").
>
> This seems like a psql deficiency that should be fixed.

It was part of the explanation of why IDENTITY is a property and why
"SET" should therefore be used instead of "ADD".

>> Also the created sequence cannot be deleted (only with the column) or
>> left after the column is deleted.
>
> This does not seem like a problem.  In fact I'd say that's exactly the
> desirable behavior.

Again, this is not about a problem; it is part of the explanation.

>> The "[ NOT ] EXISTS" is a common Postgres' syntax extension for
>> creating/updating objects in many places. That's why I think it should
>> be used instead of introducing the new "ADD" syntax which contradicts
>> the users' current experience.
>
> As noted above, I don't understand why we need either ADD or IF NOT
> EXISTS.  Why can't SET just, eh, set the property to the desired
> state?

+1

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

[1] https://postgr.es/m/497c40af-bd7a-5cb3-d028-d91434639...@2ndquadrant.com
[2] https://postgr.es/m/59d8e32a-14de-6f45-c2b0-bb52e4a75...@2ndquadrant.com
-- 
Best regards,
Vitaly Burovoy




Re: [HACKERS] TAP tests - installcheck vs check

2017-04-23 Thread Michael Paquier
On Mon, Apr 24, 2017 at 7:29 AM, Andrew Dunstan
 wrote:
>
> AFAICT, unlike the pg_regress checks, which in the installcheck case run
> against a running instance of postgres, for TAP tests the only
> difference is that for the check case a temp install is done,
> possibly with some extra contrib modules. Is that correct? If it is, why
> aren't we providing an installcheck target for tests like recovery? In at
> least one case (buildfarm animal jacana) installs are quite expensive (2 or 3
> minutes), and if they are pointless, as seems to be the case here, why
> can't we just avoid them?

install.pl deploys by default the DLLs of the modules needed for the tests,
so no objections. Don't you think the TAP support scripts in src/test/perl
should be installed as well? I think that would make sense for
consistency with what the *nix platforms do, but there is no real
PGXS installation there. So perhaps they could be deployed in a
different path like scripts/perl?
-- 
Michael
VMware vCenter Server
www.vmware.com




Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Andres Freund
Hi,

On 2017-04-23 16:59:41 -0700, Andres Freund wrote:
> Hi,
> 
> On 2017-04-22 17:53:19 +0200, Petr Jelinek wrote:
> > Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it
> > does not break anything for existing walsender usage.
> 
> > diff --git a/src/backend/replication/logical/launcher.c 
> > b/src/backend/replication/logical/launcher.c
> > index 761fbfa..e9dd886 100644
> > --- a/src/backend/replication/logical/launcher.c
> > +++ b/src/backend/replication/logical/launcher.c
> > @@ -254,7 +254,7 @@ logicalrep_worker_launch(Oid dbid, Oid subid, const 
> > char *subname, Oid userid,
> > BackgroundWorkerbgw;
> > BackgroundWorkerHandle *bgw_handle;
> > int i;
> > -   int slot;
> > +   int slot = 0;
> > LogicalRepWorker   *worker = NULL;
> > int nsyncworkers = 0;
> > TimestampTz now = GetCurrentTimestamp();
> 
> That seems independent and unnecessary?
> 
> 
> > -/* SIGUSR1: set flag to send WAL records */
> > -static void
> > -WalSndXLogSendHandler(SIGNAL_ARGS)
> > -{
> > -   int save_errno = errno;
> > -
> > -   latch_sigusr1_handler();
> > -
> > -   errno = save_errno;
> > -}
> 
> > pqsignal(SIGPIPE, SIG_IGN);
> > -   pqsignal(SIGUSR1, WalSndXLogSendHandler);   /* request WAL sending 
> > */
> > +   pqsignal(SIGUSR1, procsignal_sigusr1_handler);
> 
> Those aren't actually entirely equivalent. I'm not sure it matters much,
> but WalSndXLogSendHandler didn't include a SetLatch(), but
> procsignal_sigusr1_handler() does.  Normally redundant SetLatch() calls
> will just be elided, but what if walsender already woke up and did it's
> work?
> 
> I think it'd be better to have PostgresMain() set up signals by default,
> and then have WalSndSignals() overwrites the ones it needs separate.
> WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler
> currently sets a different variable from postgres.c, but that seems like
> a bad idea, because afaics we'll plainly ignore SIGHUPS unless in
> WalSndLoop, WalSndWriteData,WalSndWaitForWal.  That actually seems like
> an active bug to me?

Oh, and one more point: There desperately need to be tests exercising
"SQL via walsender". Including the issue of parallelism here, the missed
cancel handler, timeouts, and such.  One way to do so is to use
pg_regress with an adjusted connection string (specifying
replication=database), doing the same for isolationtester, or using
dblink.
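(For a quick manual check of that path, a database-connected walsender can be reached
with an ordinary libpq connection string, e.g. something like

    psql "dbname=postgres replication=database" -c "SELECT 1"

though how much SQL that accepts today obviously depends on the fallback discussed above.)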

- Andres




Re: [HACKERS] A note about debugging TAP failures

2017-04-23 Thread Andres Freund
On 2017-04-23 11:31:05 +0900, Michael Paquier wrote:
> Keeping folders in case of failures is something that I have been
> advocating in favor of for some time, but this never got into the tree
> :(

I don't think it'd be OK to do so unless the randomness of the dirnames
is changed, as you'd just accumulate cruft forever without manual
intervention.  So that's not too surprising, IMO...

Greetings,

Andres Freund




Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-23 Thread Noah Misch
On Thu, Apr 20, 2017 at 11:34:34PM -0700, Noah Misch wrote:
> On Fri, Apr 21, 2017 at 01:20:05PM +0900, Masahiko Sawada wrote:
> > On Fri, Apr 21, 2017 at 12:02 PM, Noah Misch  wrote:
> > > On Wed, Apr 19, 2017 at 01:52:53PM +0900, Masahiko Sawada wrote:
> > >> On Wed, Apr 19, 2017 at 12:34 PM, Noah Misch  wrote:
> > >> > On Sun, Apr 16, 2017 at 07:25:28PM +0900, Fujii Masao wrote:
> > >> >> As I told firstly this is not a bug. There are some proposals for 
> > >> >> better design
> > >> >> of priority column in pg_stat_replication, but we've not reached the 
> > >> >> consensus
> > >> >> yet. So I think that it's better to move this open item to "Design 
> > >> >> Decisions to
> > >> >> Recheck Mid-Beta" section so that we can hear more opinions.
> > >> >
> > >> > I'm reading that some people want to report NULL priority, some people 
> > >> > want to
> > >> > report a constant 1 priority, and nobody wants the current behavior.  
> > >> > Is that
> > >> > an accurate summary?
> > >>
> > >> Yes, I think that's correct.
> > >
> > > Okay, but ...
> > >
> > >> FWIW the reason of current behavior is that it would be useful for the
> > >> user who is willing to switch from ANY to FIRST. They can know which
> > >> standbys will become sync or potential.
> > >
> > > ... does this mean you personally want to keep the current behavior?  If 
> > > not,
> > > has some other person stated a wish to keep the current behavior?
> > 
> > No, I want to change the current behavior. IMO it's better to set
> > priority 1 for all standbys in the quorum set. I guess there is no longer
> > anyone who supports the current behavior.
> 
> In that case, this open item is not eligible for section "Design Decisions to
> Recheck Mid-Beta".  That section is for items where we'll probably change
> nothing, but we plan to recheck later just in case.  Here, we expect to change
> the behavior; the open question is which replacement behavior to prefer.
> 
> Fujii, as the owner of this open item, you are responsible for moderating the
> debate until there's adequate consensus to make a particular change or to keep
> the current behavior after all.  Please proceed to do that.  Beta testers
> deserve a UI they may like, not a UI you already plan to change later.

Please observe the policy on open item ownership[1] and send a status update
within three calendar days of this message.  Include a date for your
subsequent status update.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Andres Freund
Hi,

On 2017-04-22 17:53:19 +0200, Petr Jelinek wrote:
> Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it
> does not break anything for existing walsender usage.

> diff --git a/src/backend/replication/logical/launcher.c 
> b/src/backend/replication/logical/launcher.c
> index 761fbfa..e9dd886 100644
> --- a/src/backend/replication/logical/launcher.c
> +++ b/src/backend/replication/logical/launcher.c
> @@ -254,7 +254,7 @@ logicalrep_worker_launch(Oid dbid, Oid subid, const char 
> *subname, Oid userid,
>   BackgroundWorkerbgw;
>   BackgroundWorkerHandle *bgw_handle;
>   int i;
> - int slot;
> + int slot = 0;
>   LogicalRepWorker   *worker = NULL;
>   int nsyncworkers = 0;
>   TimestampTz now = GetCurrentTimestamp();

That seems independent and unnecessary?


> -/* SIGUSR1: set flag to send WAL records */
> -static void
> -WalSndXLogSendHandler(SIGNAL_ARGS)
> -{
> - int save_errno = errno;
> -
> - latch_sigusr1_handler();
> -
> - errno = save_errno;
> -}

>   pqsignal(SIGPIPE, SIG_IGN);
> - pqsignal(SIGUSR1, WalSndXLogSendHandler);   /* request WAL sending 
> */
> + pqsignal(SIGUSR1, procsignal_sigusr1_handler);

Those aren't actually entirely equivalent. I'm not sure it matters much,
but WalSndXLogSendHandler didn't include a SetLatch(), while
procsignal_sigusr1_handler() does.  Normally redundant SetLatch() calls
will just be elided, but what if the walsender already woke up and did its
work?

I think it'd be better to have PostgresMain() set up the signals by default,
and then have WalSndSignals() overwrite the ones it needs to handle differently.
WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler
currently sets a different variable than postgres.c does, but that seems like
a bad idea, because afaics we'll plainly ignore SIGHUPs unless we're in
WalSndLoop, WalSndWriteData or WalSndWaitForWal.  That actually seems like
an active bug to me?

Greetings,

Andres Freund




Re: [HACKERS] pg_dump emits ALTER TABLE ONLY partitioned_table

2017-04-23 Thread Stephen Frost
Noah, all,

On Sun, Apr 23, 2017 at 19:52 Noah Misch  wrote:

> On Sat, Apr 22, 2017 at 01:14:08PM -0700, Noah Misch wrote:
> > On Thu, Apr 20, 2017 at 09:53:28PM -0400, Stephen Frost wrote:
> > > * Noah Misch (n...@leadboat.com) wrote:
> > > > On Mon, Apr 17, 2017 at 03:41:25PM -0400, Stephen Frost wrote:
> > > > > I've put up a new patch for review on the thread and plan to commit
> > > > > that tomorrow, assuming there isn't anything further.  That should
> > > > > resolve the immediate issue, but I do think there's some merit to Amit's
> > > > > follow-on patches and will continue to discuss those and may commit one
> > > > > or both of those later this week.
> > > >
> > > > This PostgreSQL 10 open item is past due for your status update.  Kindly send
> > > > a status update within 24 hours, and include a date for your subsequent status
> > > > update.  Refer to the policy on open item ownership:
> > > > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
> > >
> > > I thought I'd be able to get to this today, but other activities have
> > > taken up the time I had planned to work on this.  I'll be on it again
> > > tomorrow morning and will provide an update (most likely a couple of
> > > commits) by COB tomorrow.
> >
> > This PostgreSQL 10 open item is again past due for your status update.  Kindly
> > send a status update within 24 hours, and include a date for your subsequent
> > status update.  Refer to the policy on open item ownership:
> > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
>
> IMMEDIATE ATTENTION REQUIRED.  This PostgreSQL 10 open item is long past due
> for your status update.  Please reacquaint yourself with the policy on open
> item ownership[1] and then reply immediately.  If I do not hear from you by
> 2017-04-25 00:00 UTC, I will transfer this item to release management team
> ownership without further notice.


The status is simply that I've been considering Robert's comments regarding
the documentation and have had a busy weekend. I'll provide an update
tomorrow.

Thanks!

Stephen



Re: [HACKERS] A note about debugging TAP failures

2017-04-23 Thread Michael Paquier
On Sun, Apr 23, 2017 at 10:05 PM, Craig Ringer
 wrote:
> On 23 Apr. 2017 10:32, "Michael Paquier"  wrote:
> On Sun, Apr 23, 2017 at 7:48 AM, Daniel Gustafsson  wrote:
>> Skipping the tempdir and instead using ${testname}_data_${name} without a
>> random suffix, we can achieve this with something along the lines of the
>> attached PoC.  It works as now (retain on failure, remove on success unless
>> overridden) but that logic can easily be turned around if we want that.  If
>> it’s of interest I can pursue this after some sleep (tomorrow has become
>> today at this point).
>
> Yes, something like that may make sense as well for readability.
>
> Keeping folders in case of failures is something that I have been
> advocating in favor of for some time, but this never got into the tree
> :(
>
> Huh? We do keep test temp datadirs etc in case of failure. Just not on
> success.

Yes, you are right. I thought this was not the case. Happy to be wrong.

> Our definition of failure there sucks a bit though. We keep the datadirs if
> any test fails in a script. If the script itself crashes we still blow
> away the datadirs, which is kind of counterintuitive.

Yes, I agree that it would make sense to keep them around in this case
as well; having the data folder may help in debugging tests in some
cases.

> I'd like to change the __DIE__ sig handler to only delete on a clean script
> exit code, TAP reporting success, and if some env var like PG_TESTS_NOCLEAN
> is undefined. The latter could also be used in pg_regress etc.

This looks like a sensible plan.
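(With that in place, debugging a failing suite would presumably look something like
the following, PG_TESTS_NOCLEAN being the not-yet-existing variable proposed above:

    PG_TESTS_NOCLEAN=1 make -C src/test/recovery check
    # node data directories under src/test/recovery/tmp_check/ are then left for inspection
)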
-- 
Michael




Re: [HACKERS] pg_dump emits ALTER TABLE ONLY partitioned_table

2017-04-23 Thread Noah Misch
On Sat, Apr 22, 2017 at 01:14:08PM -0700, Noah Misch wrote:
> On Thu, Apr 20, 2017 at 09:53:28PM -0400, Stephen Frost wrote:
> > * Noah Misch (n...@leadboat.com) wrote:
> > > On Mon, Apr 17, 2017 at 03:41:25PM -0400, Stephen Frost wrote:
> > > > I've put up a new patch for review on the thread and plan to commit
> > > > that tomorrow, assuming there isn't anything further.  That should
> > > > resolve the immediate issue, but I do think there's some merit to Amit's
> > > > follow-on patches and will continue to discuss those and may commit one
> > > > or both of those later this week.
> > > 
> > > This PostgreSQL 10 open item is past due for your status update.  Kindly send
> > > a status update within 24 hours, and include a date for your subsequent status
> > > update.  Refer to the policy on open item ownership:
> > > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
> > 
> > I thought I'd be able to get to this today, but other activities have
> > taken up the time I had planned to work on this.  I'll be on it again
> > tomorrow morning and will provide an update (most likely a couple of
> > commits) by COB tomorrow.
> 
> This PostgreSQL 10 open item is again past due for your status update.  Kindly
> send a status update within 24 hours, and include a date for your subsequent
> status update.  Refer to the policy on open item ownership:
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

IMMEDIATE ATTENTION REQUIRED.  This PostgreSQL 10 open item is long past due
for your status update.  Please reacquaint yourself with the policy on open
item ownership[1] and then reply immediately.  If I do not hear from you by
2017-04-25 00:00 UTC, I will transfer this item to release management team
ownership without further notice.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Andres Freund
On 2017-04-21 04:20:26 +0200, Petr Jelinek wrote:
> Looks like SIGUSR1 being different is problem here - it's normally used
> to . I also noticed that we don't handle SIGINT (query cancel).

I think we really need to unify the paths between walsender and normal
backends to a much larger degree.


> BTW while looking at the code, I don't understand why we call
> latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just
> the SetLatch be enough (they both end up calling sendSelfPipeByte()
> eventually)?

Historic raisins - there didn't use to be a SetLatch in
procsignal_sigusr1_handler. That changed when I whacked around catchup &
notify to be based on latches ([1] and following).

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=59f71a0d0b56b2df48db4bf1738aece5551f7a47

- Andres




Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-23 Thread Andres Freund
Hi,

On 2017-04-23 11:05:44 +0100, Simon Riggs wrote:
> > Yikes.  This is clearly way undertested.  It's also pretty scary that
> > the code has recently been whacked out quite heavily (both 9.6 and
> > master), without hitting anything around this - doesn't seem to bode
> > well for how in-depth the testing was.
> 
> Obviously if there is a bug it's because tests didn't find it and
> therefore it is by definition undertested for that specific bug.

Sure. But there's bugs that are hard to catch, and there's bugs that are
easily testable. This seems to largely fall into the "relatively easy to
test" camp.


> I'm not sure what other conclusion you wish to draw, if any?

That the whacking around of twophase.c in several of the
last releases didn't yield enough verified testing infrastructure.


> >> It's not clear to me how much potential this has to create user data
> >> corruption, but it doesn't look good at first glance.  Discuss.
> >
> > Hm. I think it can cause wrong tqual.c results in some edge cases.
> > During HS, lastOverflowedXid will be set in some cases, and then
> > XidInMVCCSnapshot etc will not find the actual toplevel xid, which'll in
> > turn cause lookups snapshot->subxip (where all HS xids reside)
> > potentially return wrong results.
> >
> > I was about to say that I don't see how it could result in persistent
> > corruption however - the subtrans lookups are only done for
> > (snapshot->xmin, snapshot->xmax] and subtrans is truncated
> > regularly. But these days CHECKPOINT_END_OF_RECOVERY isn't obligatory
> > anymore, so that might be delayed.  Hm.
> 
> I've not found any reason, yet, to believe we return wrong answers in
> any case, even though the transient data structure pg_subtrans is
> corrupted by the switched call Tom discovers.

I think I pointed out a danger above. Consider what happens if a query on
a standby has a suboverflowed snapshot:

Snapshot
GetSnapshotData(Snapshot snapshot)
{
...
        if (TransactionIdPrecedesOrEquals(xmin, procArray->lastOverflowedXid))
            suboverflowed = true;
    }
..
    snapshot->suboverflowed = suboverflowed;
}

In that case we rely on pg_subtrans for visibility determinations:

bool
HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
                       Buffer buffer)
{
...
    if (!HeapTupleHeaderXminCommitted(tuple))
    {
...
        else if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot))
            return false;

and

static bool
XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
{
...
    if (!snapshot->takenDuringRecovery)
    {
...
    else
    {
...
        if (snapshot->suboverflowed)
        {
            /*
             * Snapshot overflowed, so convert xid to top-level.  This is safe
             * because we eliminated too-old XIDs above.
             */
            xid = SubTransGetTopmostTransaction(xid);

If the subxid->xid mapping doesn't actually exist - as is the case
with this bug afaics - we'll not get the correct toplevel
transaction.  Which'll mean the following block:

        /*
         * We now have either a top-level xid higher than xmin or an
         * indeterminate xid. We don't know whether it's top level or subxact
         * but it doesn't matter. If it's present, the xid is visible.
         */
        for (j = 0; j < snapshot->subxcnt; j++)
        {
            if (TransactionIdEquals(xid, snapshot->subxip[j]))
                return true;
        }

won't work correctly if suboverflowed.

I don't see anything preventing wrong results here?


Andres Freund




Re: [HACKERS] Removing select(2) based latch (was Unportable implementation of background worker start)

2017-04-23 Thread Andres Freund
Hi,

On 2017-04-20 17:27:42 -0400, Tom Lane wrote:
> In short: yeah, let's nuke the WAIT_USE_SELECT implementation.
> It's dead code and it's unlikely to get resurrected.

Done.


> BTW, noting that SUSv2 specifies <poll.h> not <sys/poll.h>, I wonder
> whether we couldn't drop configure's test for the latter along with
> the
> 
> #ifdef HAVE_SYS_POLL_H
> #include <sys/poll.h>
> #endif
> 
> stanzas we have in a couple of places.  But that's a separate issue.

Done, too, in a separate commit.

- Andres




Re: [HACKERS] valgrind error in subscription code

2017-04-23 Thread Andres Freund
On 2017-04-22 21:08:18 +0200, Petr Jelinek wrote:
> On 22/04/17 20:31, Andres Freund wrote:
> > Hi,
> > 
> > I enabled skink / the valgrind animal to run the TAP tests too (hugely
> > increasing the test time :(), and that paid off:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink=2017-04-22%2004%3A52%3A13
> > 
> > ==606== VALGRINDERROR-BEGIN
> > ==606== Conditional jump or move depends on uninitialised value(s)
> > ==606==at 0x46A207: logicalrep_rel_open (relation.c:361)
> > ==606==by 0x472D12: copy_table (tablesync.c:669)
> > ==606==by 0x473186: LogicalRepSyncTableStart (tablesync.c:803)
> > ==606==by 0x475287: ApplyWorkerMain (worker.c:1530)
> > ==606==by 0x440AFD: StartBackgroundWorker (bgworker.c:835)
> > ==606==by 0x44E48A: do_start_bgworker (postmaster.c:5577)
> > ==606==by 0x44E59F: maybe_start_bgworker (postmaster.c:5761)
> > ==606==by 0x44F11D: sigusr1_handler (postmaster.c:5015)
> > ==606==by 0x4E430BF: ??? (in /lib/x86_64-linux-gnu/libpthread-2.24.so)
> > ==606==by 0x6FB8212: __select_nocancel (syscall-template.S:84)
> > ==606==by 0x44F868: ServerLoop (postmaster.c:1693)
> > ==606==by 0x450C53: PostmasterMain (postmaster.c:1337)
> > ==606==  Uninitialised value was created by a heap allocation
> > ==606==at 0x605382: MemoryContextAlloc (mcxt.c:729)
> > ==606==by 0x5E4E6A: DynaHashAlloc (dynahash.c:266)
> > ==606==by 0x5E4EEE: element_alloc (dynahash.c:1637)
> > ==606==by 0x5E5018: get_hash_entry (dynahash.c:1248)
> > ==606==by 0x5E5898: hash_search_with_hash_value (dynahash.c:1033)
> > ==606==by 0x5E5A0D: hash_search (dynahash.c:890)
> > ==606==by 0x469D38: logicalrep_relmap_update (relation.c:179)
> > ==606==by 0x472D05: copy_table (tablesync.c:666)
> > ==606==by 0x473186: LogicalRepSyncTableStart (tablesync.c:803)
> > ==606==by 0x475287: ApplyWorkerMain (worker.c:1530)
> > ==606==by 0x440AFD: StartBackgroundWorker (bgworker.c:835)
> > ==606==by 0x44E48A: do_start_bgworker (postmaster.c:5577)
> > ==606== 
> > ==606== VALGRINDERROR-END
> 
> Thanks, here is patch to fix that - I also removed the individual
> settings of everything to NULL/0/InvalidOid etc and just replaced it all
> with memset.

I've pushed this, and a fix for the (afaik harmless) replication origins
bleat.  Let's see what skink says - I wasn't patient enough to locally
verify that your fix indeed fixes things.

- Andres




Re: [HACKERS] [COMMITTERS] pgsql: Replication lag tracking for walsenders

2017-04-23 Thread Thomas Munro
On Sun, Apr 23, 2017 at 6:01 PM, Tom Lane  wrote:
> Thomas Munro  writes:
>> On Sun, Apr 23, 2017 at 3:41 AM, Tom Lane  wrote:
>>> As for this patch itself, is it reasonable to try to assert that the
>>> timeline has in fact changed?
>
>> The protocol doesn't include the timeline in reply messages, so it's
>> not clear how the upstream server would know what timeline the standby
>> thinks it's dealing with in any given reply message.  The sending
>> server has its own idea of the current timeline but it's not in sync
>> with the stream of incoming replies.
>
> Fair enough.  But I'd still like an explanation of why only about
> half of the population is showing a failure here.  Seems like every
> machine should be seeing the LSN as moving backwards in this test.
> So (a) why aren't they all failing, and (b) should we change the
> test to make sure every platform sees that happening?

Every machine sees the LSN moving backwards, but the code path that
had the assertion is only reached if it decides to interpolate, which is
timing dependent: there needs to be a future sample in the lag
tracking buffer, which I guess is not the case in those runs.  For
example, "jacana" is one of the animals that reported the assertion
failure for 009_twophase.pl but not the one for
004_timeline_switch.pl, in this run:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana=2017-04-20%2021%3A00%3A20

This standby2 log fragment corresponds to the LSN zigzag manoeuvre:

2017-04-20 22:04:51.744 GMT [1984] LOG:  fetching timeline history file for timeline 2 from primary server
2017-04-20 22:04:51.744 GMT [1984] LOG:  started streaming WAL from primary at 0/300 on timeline 1
2017-04-20 22:04:51.744 GMT [1984] LOG:  replication terminated by primary server
2017-04-20 22:04:51.744 GMT [1984] DETAIL:  End of WAL reached on timeline 1 at 0/3028D50.
2017-04-20 22:04:51.744 GMT [1708] LOG:  new target timeline is 2
2017-04-20 22:04:51.744 GMT [1984] LOG:  restarted WAL streaming at 0/300 on timeline 2

It looks approximately the same on my development machine where I got
the assertion failure until commit 84638808:

2017-04-24 09:04:46.563 NZST [95164] LOG:  fetching timeline history file for timeline 2 from primary server
2017-04-24 09:04:46.563 NZST [95164] LOG:  started streaming WAL from primary at 0/300 on timeline 1
2017-04-24 09:04:46.564 NZST [95164] LOG:  replication terminated by primary server
2017-04-24 09:04:46.564 NZST [95164] DETAIL:  End of WAL reached on timeline 1 at 0/3028470.
2017-04-24 09:04:46.565 NZST [95160] LOG:  new target timeline is 2
2017-04-24 09:04:46.565 NZST [95164] LOG:  restarted WAL streaming at 0/300 on timeline 2

-- 
Thomas Munro
http://www.enterprisedb.com




[HACKERS] TAP tests - installcheck vs check

2017-04-23 Thread Andrew Dunstan

AFAICT, unlike the pg_regress checks, which in the installcheck case run
against a running instance of postgres, for TAP tests the only
difference is that for the check case a temp install is done,
possibly with some extra contrib modules. Is that correct? If it is, why
aren't we providing an installcheck target for tests like recovery? In at
least one case (buildfarm animal jacana) installs are quite expensive (2 or 3
minutes), and if they are pointless, as seems to be the case here, why
can't we just avoid them?


cheers

andrew

-- 
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] valgrind error in subscription code

2017-04-23 Thread Petr Jelinek
On 22/04/17 21:16, Andres Freund wrote:
> Hi,
> 
> On 2017-04-22 21:08:18 +0200, Petr Jelinek wrote:
>> Thanks, here is patch to fix that - I also removed the individual
>> settings of everything to NULL/0/InvalidOid etc and just replaced it all
>> with memset.
> 
> Cool.
> 
>> diff --git a/src/backend/replication/logical/relation.c 
>> b/src/backend/replication/logical/relation.c
>> index 875a081..5bc54dd 100644
>> --- a/src/backend/replication/logical/relation.c
>> +++ b/src/backend/replication/logical/relation.c
>> @@ -141,19 +141,10 @@ logicalrep_relmap_free_entry(LogicalRepRelMapEntry 
>> *entry)
>>  pfree(remoterel->attnames);
>>  pfree(remoterel->atttyps);
>>  }
>> -remoterel->attnames = NULL;
>> -remoterel->atttyps = NULL;
>> -
>>  bms_free(remoterel->attkeys);
>> -remoterel->attkeys = NULL;
>>  
>>  if (entry->attrmap)
>>  pfree(entry->attrmap);
>> -
> 
> Btw, I think it's a good pattern to zero things like attrmap after
> freeing.  Based on a minute of looking it's unclear to me if
> logicalrep_relmap_update() could be called again, if e.g. one of the
> pallocs after the logicalrep_relmap_free_entry() errors out.  I think
> you essentially addressed that with the memset, so that's good.
> 

Yes, memset is called immediately after logicalrep_relmap_free_entry()
which is why I found it redundant to set things to NULL there. We could
put the memset directly inside logicalrep_relmap_free_entry() and then
call either logicalrep_relmap_free_entry() or memset in the caller, if
that would make you feel better about it.
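
As a rough C sketch of the caller-side pattern being discussed (based on the
snippet quoted above; the surrounding code of logicalrep_relmap_update() is
paraphrased from memory rather than quoted, so treat it as illustrative):

	entry = (LogicalRepRelMapEntry *)
		hash_search(LogicalRepRelMap, (void *) &remoterel->remoteid,
					HASH_ENTER, &found);
	if (found)
		logicalrep_relmap_free_entry(entry);	/* pfrees the old allocations */

	/*
	 * Zero the whole entry before repopulating it, so an error in any later
	 * palloc cannot leave stale pointers behind.  The hash key
	 * (remoterel.remoteid) is re-set immediately below.
	 */
	memset(entry, 0, sizeof(LogicalRepRelMapEntry));
	entry->remoterel.remoteid = remoterel->remoteid;

Moving the memset into logicalrep_relmap_free_entry() itself, as suggested
above, would keep the "freed implies zeroed" invariant in one place.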

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-23 Thread Simon Riggs
On 23 April 2017 at 18:41, Simon Riggs  wrote:
> On 23 April 2017 at 17:17, Tom Lane  wrote:
>> Simon Riggs  writes:
 Also, when I fix that, it gets further but still crashes at the same
 Assert in SubTransSetParent.  The proximate cause this time seems to be
 that RecoverPreparedTransactions's calculation of overwriteOK is wrong:
 it's computing that as "false", but in reality the subtrans link in
 question has already been set.
>>
>>> Not sure about that, investigating.
>>
>> As a quick hack, I just hotwired RecoverPreparedTransactions to set
>> overwriteOK = true always, and with that and the SubTransSetParent
>> argument-order fix, HEAD passes the recovery tests.  Maybe we can
>> be smarter than that, but this might be a good short-term fix to get
>> the buildfarm green again.
>
> That would work. I've been looking into a fix I can explain, but "do
> it always" may actually be it.
>
> OK, I'll do that.

Done, tests pass again for me now.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] vcregress support for single TAP tests

2017-04-23 Thread Andrew Dunstan

Here's a patch that will allow calling vcregress.pl to run a single TAP
test set. It would work like this:


vcregress.pl src/test/recovery true


The second argument if true (in the perl sense, of course) would trigger
a temp install before running the tests. It defaults to off, in an
attempt to minimize the unnecessary running of installs.

Once we have this I will switch the buildfarm client to use it. The
previous bincheck procedure is kept to avoid breakage of scripts,
buildfarm clients etc.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/tools/msvc/vcregress.pl b/src/tools/msvc/vcregress.pl
index 8933920..56c7a2d 100644
--- a/src/tools/msvc/vcregress.pl
+++ b/src/tools/msvc/vcregress.pl
@@ -34,7 +34,7 @@ if (-e "src/tools/msvc/buildenv.pl")
 
 my $what = shift || "";
 if ($what =~
-/^(check|installcheck|plcheck|contribcheck|modulescheck|ecpgcheck|isolationcheck|upgradecheck|bincheck|recoverycheck)$/i
+/^(check|installcheck|plcheck|contribcheck|modulescheck|ecpgcheck|isolationcheck|upgradecheck|bincheck|recoverycheck|taptest)$/i
   )
 {
 	$what = uc $what;
@@ -54,13 +54,6 @@ copy("$Config/dummy_seclabel/dummy_seclabel.dll", "src/test/regress");
 
 $ENV{PATH} = "$topdir/$Config/libpq;$ENV{PATH}";
 
-my $schedule = shift;
-unless ($schedule)
-{
-	$schedule = "serial";
-	$schedule = "parallel" if ($what eq 'CHECK' || $what =~ /PARALLEL/);
-}
-
 if ($ENV{PERL5LIB})
 {
 	$ENV{PERL5LIB} = "$topdir/src/tools/msvc;$ENV{PERL5LIB}";
@@ -90,13 +83,14 @@ my %command = (
	ISOLATIONCHECK => \&isolationcheck,
	BINCHECK   => \&bincheck,
	RECOVERYCHECK  => \&recoverycheck,
-	UPGRADECHECK   => \&upgradecheck,);
+	UPGRADECHECK   => \&upgradecheck,
+	TAP_TEST   => \&taptest,);
 
 my $proc = $command{$what};
 
 exit 3 unless $proc;
 
-&$proc();
+&$proc(@_);
 
 exit 0;
 
@@ -104,6 +98,7 @@ exit 0;
 
 sub installcheck
 {
+	my $schedule = shift || 'serial';
 	my @args = (
 		"../../../$Config/pg_regress/pg_regress",
 		"--dlpath=.",
@@ -119,6 +114,7 @@ sub installcheck
 
 sub check
 {
+	my $schedule = shift || 'parallel';
 	InstallTemp();
 	chdir "${topdir}/src/test/regress";
 	my @args = (
@@ -145,7 +141,7 @@ sub ecpgcheck
 	exit $status if $status;
 	InstallTemp();
 	chdir "$topdir/src/interfaces/ecpg/test";
-	$schedule = "ecpg";
+	my $schedule = "ecpg";
 	my @args = (
 		"../../../../$Config/pg_regress_ecpg/pg_regress_ecpg",
 		"--bindir=",
@@ -219,6 +215,17 @@ sub bincheck
 	exit $mstat if $mstat;
 }
 
+sub taptest
+{
+	my ($dir, $needs_install) = @_;
+
+	die "no tests found!" unless -d "$topdir/$dir/t";
+
+	InstallTemp() if $needs_install;
+	my $status = tap_check("$topdir/$dir");
+	exit $status if $status;
+}
+
 sub plcheck
 {
 	chdir "../../pl";
@@ -516,7 +523,6 @@ sub fetchRegressOpts
 	$m =~ s{\\\r?\n}{}g;
 	if ($m =~ /^\s*REGRESS_OPTS\s*\+?=(.*)/m)
 	{
-
 		# Substitute known Makefile variables, then ignore options that retain
 		# an unhandled variable reference.  Ignore anything that isn't an
 		# option starting with "--".
@@ -596,9 +602,10 @@ sub InstallTemp
 sub usage
 {
 	print STDERR
-	  "Usage: vcregress.pl <mode> [ <schedule> ]\n\n",
+	  "Usage: vcregress.pl <mode> [ arg [ , arg] ... ]\n\n",
	  "Options for <mode>:\n",
 	  "  bincheck   run tests of utilities in src/bin/\n",
+	  "  taptest   run an arbitrary TAP test\n",
 	  "  check  deploy instance and run regression tests on it\n",
 	  "  contribcheck   run tests of modules in contrib/\n",
 	  "  ecpgcheck  run regression tests of ECPG\n",
@@ -608,8 +615,11 @@ sub usage
 	  "  plcheckrun tests of PL languages\n",
 	  "  recoverycheck  run recovery test suite\n",
 	  "  upgradecheck   run tests of pg_upgrade\n",
-	  "\nOptions for <schedule>:\n",
+	  "\nOptions for arg: (used by check and installcheck)\n",
 	  "  serial serial mode\n",
-	  "  parallel   parallel mode\n";
+	  "  parallel   parallel mode\n",
+	  "\nOptions for args for taptest\n",
+	  "  test-dir   (required) directory where tests reside\n",
+	  "  install-needed (optional) if non-zero a temp install will be done\n";
 	exit(1);
 }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] identity columns

2017-04-23 Thread Robert Haas
On Thu, Apr 20, 2017 at 12:05 AM, Vitaly Burovoy
 wrote:
>> I am still not fond of this change.  There is precedent all over the
>> place for having separate commands for creating a structure, changing a
>> structure, and removing a structure.  I don't understand what the
>> problem with that is.

I agree.  That's not intrinsically a problem.

> OK. Let's go through it again.
> IDENTITY is a property of a column. There are no syntax to change any
> property of any DB object via the "ADD" syntax.
> Yes, a structure (a sequence) is created. But in fact it cannot be
> independent from the column at all (I remind you that according to the
> standard it should be an unnamed sequence and there is really no way to
> do anything with it except via the column's DDL).

I agree that ADD is a little odd here, but it doesn't seem terrible.
But why do we need it?  Instead of:

ADD GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
SET GENERATED { ALWAYS | BY DEFAULT }
DROP IDENTITY [ IF EXISTS ]

Why not just:

SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
DROP IDENTITY [ IF EXISTS ]

Surely the ALTER TABLE command can tell whether the column is already
GENERATED, so the first form could make it generated if it's not and
adjust the ALWAYS/BY DEFAULT property if it is.
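
To make that concrete, a sketch of how the reduced command set might look in
use (table and column names invented; this illustrates the syntax proposed
above, not necessarily what gets committed):

    CREATE TABLE t (id int NOT NULL);

    -- makes id an identity column if it is not one already,
    -- otherwise just switches between ALWAYS and BY DEFAULT
    ALTER TABLE t ALTER COLUMN id SET GENERATED ALWAYS AS IDENTITY;
    ALTER TABLE t ALTER COLUMN id SET GENERATED BY DEFAULT AS IDENTITY;

    ALTER TABLE t ALTER COLUMN id DROP IDENTITY IF EXISTS;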

> It is even hard to detect which sequence (since they have names) is
> owned by the column:
>
> postgres=# CREATE TABLE xxx(i int generated always as identity, j serial);
> CREATE TABLE
> postgres=# \d xxx*
>                              Table "public.xxx"
>  Column |  Type   | Collation | Nullable |            Default
> --------+---------+-----------+----------+--------------------------------
>  i      | integer |           | not null | generated always as identity
>  j      | integer |           | not null | nextval('xxx_j_seq'::regclass)
>
>   Sequence "public.xxx_i_seq"
>    Column   |  Type   | Value
> ------------+---------+-------
>  last_value | bigint  | 1
>  log_cnt    | bigint  | 0
>  is_called  | boolean | f
>
>   Sequence "public.xxx_j_seq"
>    Column   |  Type   | Value
> ------------+---------+-------
>  last_value | bigint  | 1
>  log_cnt    | bigint  | 0
>  is_called  | boolean | f
> Owned by: public.xxx.j
>
> I can only guess that "public.xxx_i_seq" is owned by "public.xxx.i",
> nothing proves that.
> Whereas for regular sequence there are two evidences ("Default" and "Owned 
> by").

This seems like a psql deficiency that should be fixed.

> Also the created sequence cannot be deleted (only with the column) or
> left after the column is deleted.

This does not seem like a problem.  In fact I'd say that's exactly the
desirable behavior.

> The "[ NOT ] EXISTS" is a common Postgres' syntax extension for
> creating/updating objects in many places. That's why I think it should
> be used instead of introducing the new "ADD" syntax which contradicts
> the users' current experience.

As noted above, I don't understand why we need either ADD or IF NOT
EXISTS.  Why can't SET just, eh, set the property to the desired
state?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench tap tests & minor fixes

2017-04-23 Thread Fabien COELHO


Hello Nikolay,


>> Hmmm. The pre-existing TAP test in pgbench is about concurrent commits, it
>> is not to test pgbench itself. Pgbench allows to run some programmable
>> clients in parallel very easily, which cannot be done simply otherwise. I
>> think that having it there would encourage such uses.

> Since none of us is Tom Lane, I think we have no moral right to encourage
> somebody doing something in postgres.

I do not get it.

> People do what they think better to do.

Probably.

>> [...] abstraction. For me, all pg standard executables could have their
>> methods in PostgresNode.pm.

> you are speaking about
> local $ENV{PGPORT} = $self->port;
> ?

Yes. My point is that this is the internal stuff of PostgresNode and that
it should stay inside that class. The point of the class is to organize
postgres instances for testing, including client-server connections. From
an object-oriented point of view, the method to identify the server could
vary, so the testing part should not need to know about it, unless what is
tested is precisely these connection variations, in which case it makes
sense to have it there.

> Why do you need it here after all? Lots of TAP tests for bin utilities runs
> them using command_like function from TestLib.pm and need no setting
> $ENV{PGPORT}.

The tests I've seen that do not need it do not connect to the server.
In order to connect to the server they go through variants from
PostgresNode which set the variables before calling the TestLib function.

> Is pgbench so special? If it is so, may be it is good reason to
> fix utilities from TestLib.pm, so they can take port from PostgresNode.

Nope. TestLib does not know about PostgresNode, the idea is rather that
PostgresNode knows and wraps around TestLib when needed.
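
To illustrate, the kind of wrapper being argued for looks roughly like this
(a sketch only -- the method name, arguments and checks are mine, not the
patch's actual code):

	# in PostgresNode.pm
	sub pgbench
	{
		my ($self, $opts, $test_name) = @_;

		local $ENV{PGPORT} = $self->port;	# the node knows how to reach itself

		my ($stdout, $stderr) = ('', '');
		my $ok = IPC::Run::run([ 'pgbench', split(/\s+/, $opts) ],
							   '>', \$stdout, '2>', \$stderr);
		ok($ok, "$test_name: pgbench run succeeds");
		return ($stdout, $stderr);
	}

so that a test script only says $node->pgbench('-t 10 -c 2', 'simple run')
and never needs to know how the node is addressed.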


> If in these tests we can use command_like instead of pgbench_likes it
> would increase maintainability of the code.

"command_like" variants do not look precisely at all results (status,
stdout, stderr) and are limited to what they check (eg one regex at a
time). Another thing I do not like is that they expect commands as a list
of pieces, which is not very readable.

Now I can add a command_likes which allows to manage status, stdout and
stderr and add that in TestLib & PostgresNode.

> If we need more close integration with PostgresNode then command_like
> provides, then may be we should improve command_like or add another function
> that provides things that are needed.

Yep, this is possible.

> psql -- is a core utility for postgres node. pgbench is optional.

AFAICS pgbench is a core utility as well. I do not know how not to compile
pg without pgbench. It was optional when in contrib, it is not anymore.

>>> I think all the data from this file should be somehow imported into
>>> 001_pgbench.pl and these files should be created only when tests is
>>> running, and then removed when testing is done.

>> Hmmm. I thought of that but was very unconvinced by the approach: pgbench
>> basically always requires a file, so what the stuff was doing was writing
>> the script into a file, checking for possible errors when doing so, then
>> unlinking the file and checking for errors again after the run.

> I think I was wrong about deleting file after tests are finished. Better keep
> them.

Hmmm... Then what is the point not to have them as files to begin with?

>> Then you have to do some escaping in the pgbench scripts themselves,
>> and the perl script becomes pretty horrible and unreadable with plenty
>> of perl, SQL, backslash commands in strings...

> Perl provides a lot of tools for escaping.

Sure. It does not mean that Perl is the best place to store dozens of
pgbench scripts.

> If one chooses the right one, there would be no need in additional
> backslashes.

This does not fix the issue with having a mixture of files and languages
in the test script, which I think is basically a bad idea for readability.

>> Finally, if the script is inside the perl script it makes it hard to
>> run the test outside of it when a problem is found, so it is a pain.

> I've took back my words about deleting. After a first run one will have these
> files "in flesh" so they would be available for further experiments.

I'm lost. So where is the problem with having them as files in the first
place, if you want to keep them anyway?

> I would speak again about maintainability. Having 36 files, most of them <100b
> size -- is a very bad idea for maintainability.

I do not understand why. Running from files is a pgbench requirement. Each
test file is quite well defined, could be commented more about what it is
testing, and it is easy to see which pgbench run uses it.

> If each commit of new functionality would add 36 small files, we will
> drown in these files quite soon. These files should be generated on fly.
> I am 100% sure of it.

Good for you:-)

I think that it would lead to a horrible test script. I can write horrible
test scripts, but I wish to avoid it.

> PS. I've read the perl code through much more carefully. All other things are
> looking 

Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-04-23 Thread Jan Michálek
2017-04-19 10:05 GMT+02:00 Jan Michálek :

> 2017-04-19 9:18 GMT+02:00 Fabien COELHO :
>
>>
>> I still do not understand "why" this variant vs CommonMark or whatever
 other version.

>>>
>>> Because of simply implementation and readability (looks similar to
>>> aligned
>>> format) and it is comfortable to edit generated table (changing values,
>>> aligning columns etc.).
>>>
>>
>> Hmmm. Why not.
>>
>> Sorry, maybe I`m not understanding, there is problems with characters like
>>> pipe in cells, pipe should be escaped. What other special markdown
>>> characters? Escaping html code in cells?
>>>
>>
>> Markdown include characters/sequences which are interpreted as markers:
>> _Italic_, **Bold**, *** => horizontal rules, > block quote... `inline
>> code`... If they are interpreted within a table cell then probably they
>> should be escaped somehow.
>>
>
>
I have treated "_*|<>"

jelen=# SELECT E'**1**\nrrr';
| **RECORD 1** |  |
|--|--|
| ?column? | \*\*1\*\*/b |


jelen=#
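
For what it's worth, a tiny sketch of the kind of cell escaping being
discussed -- not the patch's actual code, and the character set is only the
one mentioned above plus backslash and backtick:

	/* Escape characters that markdown would treat as markup inside a cell. */
	static void
	markdown_escape(const char *cell, FILE *fout)
	{
		const char *p;

		for (p = cell; *p; p++)
		{
			if (strchr("\\`*_|<>", *p) != NULL)
				fputc('\\', fout);
			fputc(*p, fout);
		}
	}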

> I`m able to sanitize characters, but complex sequences will be problem. I
> will look on this, but I don`t know, if I`m able to do this.
>
> My main interest on this was in rst. I`m using markdown only in github
> issues and my knowldge about md is poor.
>
>
>>
>> Main of the functionality is used from aligned format. I tested returned
> tables in retext and it works. If i have another character than
> standart
> pipe, it shouldn`t work.
>
> Sure. ISTM that you are currently using U+2502 instead of pipe, hence
 my
 point.

>>>
>>> Could you send me example where?
>>>
>>
>> I already did in the first mail with the example output copy pasted from
>> psql. Some characters are pipe and others are BOX DRAWINGS LIGHT VERTICAL
>> characters.
>>
>> Maybe this is because I have in my ~/.psqlrc:
>>
>>  \pset linestyle unicode
>>  \pset border 2
>>
>
>> in which case your reuse of the the aligned stuff should take care of the
>> border setting to avoid using special UTF8 characters..
>>
>
I corrected it.

jelen=# \pset linestyle unicode
Line style is unicode.
jelen=# SELECT 1,2,3,4;
| **RECORD 1** |   |
|--|---|
| ?column? | 1 |
| ?column? | 2 |
| ?column? | 3 |
| ?column? | 4 |


jelen=#

Regards
Jan



>
> Yes, it looks it is done by linestyle.
>
>
> jelen=# SELECT 1;
>
> | ?column? |
> |--|
> |1 |
>
>
> (1 row)
>
> jelen=# \pset linestyle unicode
> Line style is unicode.
> jelen=# SELECT 1;
>
> │ ?column? │
> |--|
> │1 │
>
>
> (1 row)
>
> jelen=#
>
> I have prepared linestyle for rst and md, but I can`t switch linestyle
> outside, because if i did it
> \pset linestyle
> wrote "markdown" or "rst".
> I see, problem is only in cells borders, I will correct this.
>
> Jan
>
>
>>
>> --
>> Fabien.
>>
>
>
>
> --
> Jelen
> Starší čeledín datovýho chlíva
>



-- 
Jelen
Starší čeledín datovýho chlíva
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 3b86612862..97d8612359 100644
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
***
*** 2536,2542  lo_import 152801
aligned, wrapped,
html, asciidoc,
latex (uses tabular),
!   latex-longtable, or
troff-ms.
Unique abbreviations are allowed.  (That would mean one letter
is enough.)
--- 2536,2543 
aligned, wrapped,
html, asciidoc,
latex (uses tabular),
! 		  latex-longtable, 
! 		  rst, markdown, or
troff-ms.
Unique abbreviations are allowed.  (That would mean one letter
is enough.)
***
*** 2564,2570  lo_import 152801
  

The html, asciidoc, latex,
!   latex-longtable, and troff-ms
formats put out tables that are intended to
be included in documents using the respective mark-up
language. They are not complete documents! This might not be
--- 2565,2572 
  

The html, asciidoc, latex,
! 		  latex-longtable, troff-ms,
! 		  markdown and rst
formats put out tables that are intended to
be included in documents using the respective mark-up
language. They are not complete documents! This might not be
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 859ded71f6..5bca425d22 100644
*** a/src/bin/psql/command.c
--- b/src/bin/psql/command.c
***
*** 3693,3698  _align2string(enum printFormat in)
--- 3693,3704 
  		case PRINT_TROFF_MS:
  			return "troff-ms";
  			break;
+ 		case PRINT_MARKDOWN:
+ 			return "markdown";
+ 			break;
+ 		case PRINT_RST:
+ 			return "rst";
+ 			break;
  	}
  	return "unknown";
  }
***
*** 3764,3772  do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
  			

[HACKERS] pg_basebackup issue

2017-04-23 Thread chiru r
Hi Team,

I am using PostgreSQL 9.5. I created a backup_admin user and a dba_admin
ROLE with SUPERUSER and REPLICATION, then granted the dba_admin role to
the backup_admin user and ran the pg_basebackup utility as backup_admin.
But I am not able to use pg_basebackup as the backup_admin user; it fails
with the FATAL below.

pg_basebackup: could not connect to server: FATAL:  must be superuser or
replication role to start walsender

So far I have only observed this issue when using pg_basebackup as the
backup_admin user.

Please help me understand why pg_basebackup throws this FATAL when I use
backup_admin.

Is there any limitation in the pg_basebackup utility?

The process I am following for the backup_admin user:

postgres=# select version();
 version
--
 PostgreSQL 9.5.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2
20080704 (Red Hat 4.1.2-55), 64-bit
(1 row)

postgres=#
postgres=# create user backup_admin password 'X';
CREATE ROLE
postgres=# create role dba_admin SUPERUSER REPLICATION;
CREATE ROLE
postgres=# grant dba_admin to backup_admin;
GRANT ROLE
postgres=# alter user backup_admin set role to dba_admin;
ALTER ROLE

postgres=# \du
   List of roles
  Role name   |                         Attributes                         |  Member of
--------------+------------------------------------------------------------+-------------
 backup_admin |                                                            | {dba_admin}
 dba_admin    | Superuser, Cannot login, Replication                       | {}
 postgres     | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

[postgres@pgserver ~]$ mkdir online_backups1
[postgres@pgserver ~]$ /opt/PostgreSQL/9.5/bin/pg_basebackup  --format=t
--pgdata=online_backups1 -p 5432 -U backup_admin  -x -z  --verbose
pg_basebackup: could not connect to server: FATAL:  must be superuser or
replication role to start walsender

Please help me understand why pg_basebackup throws this FATAL when I use
backup_admin.

Is there any limitation in the pg_basebackup utility?

For information, pg_basebackup works fine for the postgres user and
completes successfully:

[postgres@pgserver ~]$ /opt/PostgreSQL/9.5/bin/pg_basebackup  --format=t
--pgdata=online_backups -p 5432 -U postgres  -x -z  --verbose
transaction log start point: 0/228 on timeline 1
transaction log end point: 0/2000130
pg_basebackup: base backup completed
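
For context when reading the above: role attributes such as SUPERUSER and
REPLICATION are never inherited through role membership, so the role that
actually opens the replication connection must carry the attribute itself.
If that is indeed what is happening here, the usual fix is along these
lines (a sketch only; the password is a placeholder):

ALTER ROLE backup_admin REPLICATION;

-- or create the login role with the attribute in the first place:
CREATE ROLE backup_admin LOGIN REPLICATION PASSWORD 'secret';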

Thanks,
Chiru


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-23 Thread Simon Riggs
On 23 April 2017 at 17:17, Tom Lane  wrote:
> Simon Riggs  writes:
>>> Also, when I fix that, it gets further but still crashes at the same
>>> Assert in SubTransSetParent.  The proximate cause this time seems to be
>>> that RecoverPreparedTransactions's calculation of overwriteOK is wrong:
>>> it's computing that as "false", but in reality the subtrans link in
>>> question has already been set.
>
>> Not sure about that, investigating.
>
> As a quick hack, I just hotwired RecoverPreparedTransactions to set
> overwriteOK = true always, and with that and the SubTransSetParent
> argument-order fix, HEAD passes the recovery tests.  Maybe we can
> be smarter than that, but this might be a good short-term fix to get
> the buildfarm green again.

That would work. I've been looking into a fix I can explain, but "do
it always" may actually be it.

OK, I'll do that.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-23 Thread Tom Lane
Simon Riggs  writes:
> On 23 April 2017 at 00:55, Tom Lane  wrote:
>> It's not clear to me how much potential this has to create user data
>> corruption, but it doesn't look good at first glance.  Discuss.

> SubTransSetParent() only matters for the case where we are half-way
> through a commit with a large commit. Since we do atomic updates of
> commit and subcommmit when on same page, this problem would only
> matter when top level xid and other subxids were separated across
> multiple pages and the issue would only affect a read only query
> checking visibility during the commit for that prepared transaction.
> Furthermore, the nature of the corruption is that we set the xid to
> point to the subxid; since we never mark a top-level commit as
> subcommitted, subtrans would never be consulted and so the corruption
> would not lead to any incorrect answer even during the window of
> exposure. So it looks to me like this error shouldn't cause a problem.

Still trying to wrap my head around this argument ... I agree that
incorrectly setting the parent's pg_subtrans entry can't cause a
visible problem, because it will never be consulted.  However, failing
to set the child's entry seems like it could cause transient problems.
There would only be a risk during the eventual commit of the prepared
transaction, and only when the pg_xact entries span multiple pages,
but then there would be a window where the child xact is marked
subcommitted but it has a zero entry in pg_subtrans.  That would
result in a WARNING from TransactionIdDidCommit, and a false result,
which depending on timing might mean that commit of the overall
transaction appears non-atomic to onlookers.  (That is, the parent
xact might already appear committed to them, but the subtransaction
not yet.)

I can't find any indication in the archives that anyone's ever reported
seeing that WARNING, though that might only mean that they'd not noticed
it.  But in any case, it seems like the worst consequence is a narrow
window for seeing non-atomic commit of a prepared transaction on a standby
server.  That seems pretty unlikely to cause real-world problems.

The other point about overwriteOK not being set when it needs to be
also seems like it could not cause any problems in production builds,
since that flag is only examined by an Assert.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench tap tests & minor fixes

2017-04-23 Thread Nikolay Shaplov
В письме от 20 апреля 2017 19:14:34 пользователь Fabien COELHO написал:

> >> (1) extends the existing perl tap test infrastructure with methods to
> >> test
> >> pgbench, i.e. "pgbench" which runs a pgbench test and "pgbench_likes"
> >> which allows to check for expectations.
> > 
> > I do not think it is good idea adding this functions to the
> > PostgresNode.pm.
> I thought it was:-)
> 
> > They are pgbench specific.
> 
> Sure.
> 
> > I do not think anybody will need them outside of pgbench tests.
> 
> Hmmm. The pre-existing TAP test in pgbench is about concurrent commits, it
> is not to test pgbench itself. Pgbench allows to run some programmable
> clients in parallel very easily, which cannot be done simply otherwise. I
> think that having it there would encourage such uses.
Since none of us is Tom Lane, I think we have no moral right to encourage 
somebody doing something in postgres. People do what they think better to do. 
If somebody need this functions for his tests, he/she can move them to public 
space. If somebody sure he would use them soon, and tell about it in this 
list, it would be good reason to make them public too. Until then I would offer 
to consider these function private business of pgbench testing and keep them 
somewhere inside src/bin/pgbench directory


> Another point is that the client needs informations held as attributes in
> the node in order to connect to the server. Having it outside would mean
> picking the attributes inside, which is pretty unclean as it breaks the
> abstraction. For me, all pg standard executables could have their methods
> in PostgresNode.pm.
you are speaking about 
local $ENV{PGPORT} = $self->port;
?
Why do you need it here after all? Lots of TAP tests for bin utilities runs 
them using command_like function from TestLib.pm and need no setting 
$ENV{PGPORT}. Is pgbench so special? If it is so, may be it is good reason to 
fix utilities from TestLib.pm, so they can take port from PostgresNode.

> Finally, it does not cost anything to have it there.
The price is maintainability. If in these tests we can use command_like 
instead of pgbench_likes it would increase maintainability of the code.


> > May be you should move some code into some kind of helpers and keep them
> > in
> > PostgresNode.pm.
> 
> Hmmm. I can put a "run any client" function with the same effect and pass
> an additional argument to tell that pgbench should be run, but this looks
> quite contrived for no special reason.
I read the existing code more carefully now, and as far as I can see most 
console utilities uses command_like for testing. I think the best way would be 
to use it for testing. Or write a warping around it, if we need additional 
specific behavior for pgbench.

If we need more close integration with PostgresNode then command_like 
provides, then may be we should improve command_like or add another function 
that provides things that are needed. 


> > Or write universal functions that can be used for testing any postgres
> > console tool. Then they can be placed in PostgresNode.pm
> 
> There are already "psql"-specific functions... Why not pgbench specific
> ones?
psql -- is a core utility for postgres node. pgbench is optional. 

> >> (3) add a lot of new very small tests so that coverage jumps from very
> >> low
> >> to over 90% for source files. [...]
> > 
> > What tool did you use to calculate the coverage?
> 
> I followed the documentated recipee:
> 
> https://www.postgresql.org/docs/devel/static/regress-coverage.html
Thanks...


> > Lots of small test looks good at first glance, except the point that
> > adding
> > lots of small files with pgbench scripts is not great idea. This makes
> > code
> > tree too overloaded with no practical reason. I am speaking of
> > src/bin/pgbench/t/scripts/001_0* files.
> > 
> > I think all the data from this file should be somehow imported into
> > 001_pgbench.pl and these files should be created only when tests is
> > running, and then removed when testing is done.
> 
> Hmmm. I thought of that but was very unconvinced by the approach: pgbench
> basically always requires a file, so what the stuff was doing was writing 
> the script into a file, checking for possible errors when doing so, then
> unlinking the file and checking for errors again after the run. 
I think I was wrong about deleting file after tests are finished. Better keep 
them.

> Then you
> have to do some escaping in the pgbench scripts themselves, and the perl 
> script becomes pretty horrible and unreadable with plenty of perl, SQL,
> backslash commands in strings...
Perl provides a lot of tools for escaping. If one chooses the right one, 
there would be no need in additional backslashes.

> Finally, if the script is inside the perl
> script it makes it hard to run the test outside of it when a problem is
> found, so it is a pain.
I've took back my words about deleting. After a first run one will have these 
files "in flesh" so they would be available for further experiments.

Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-23 Thread Tom Lane
Simon Riggs  writes:
>> Also, when I fix that, it gets further but still crashes at the same
>> Assert in SubTransSetParent.  The proximate cause this time seems to be
>> that RecoverPreparedTransactions's calculation of overwriteOK is wrong:
>> it's computing that as "false", but in reality the subtrans link in
>> question has already been set.

> Not sure about that, investigating.

As a quick hack, I just hotwired RecoverPreparedTransactions to set
overwriteOK = true always, and with that and the SubTransSetParent
argument-order fix, HEAD passes the recovery tests.  Maybe we can
be smarter than that, but this might be a good short-term fix to get
the buildfarm green again.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A note about debugging TAP failures

2017-04-23 Thread Tom Lane
Michael Banck  writes:
> On Sat, Apr 22, 2017 at 02:05:13PM -0700, Andres Freund wrote:
>> Agreed.  If paths are reproducible and cleaned up on next run it's also
>> much less of an issue to just leave them around till the next run.
>> Which we imo also should do for non-failing tmp_check directories.

> In general yes, but I think it should be checked what the overall 
> storage requirements will be if one set of all data directories is kept
> around.

> I was surprised the src/bin/pg_basebackup/t TAP tests took up several
> hundred megabytes (IIRC) when I ran them, so if other tests are similar,
> it might make a few animals run out of diskspace as soon as this is
> deployed without a heads-up to the operators.

src/test/recovery/ alone eats 2.6GB if one sets CLEANUP => 0 as I did
upthread.  So I think leaving them there by default is a nonstarter.
It might be okay to leave failed tests' dirs in place, but even there,
I'd personally prefer a manual control knob.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing select(2) based latch (was Unportable implementation of background worker start)

2017-04-23 Thread Robert Haas
On Thu, Apr 20, 2017 at 5:50 PM, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2017-04-20 17:27:42 -0400, Tom Lane wrote:
>>> In short: yeah, let's nuke the WAIT_USE_SELECT implementation.
>>> It's dead code and it's unlikely to get resurrected.
>
>> Ok, cool.  v10 or wait till v11?  I see very little reason to wait
>> personally.
>
> I feel no need to wait on that.  Code removal is not a "new feature".

One can imagine a situation in which code removal seemed to carry a
risk of destabilizing something, but the change under discussion here
seems more likely to improve stability rather than to regress
anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2017-04-23 Thread Robert Haas
On Thu, Apr 20, 2017 at 5:24 PM, Claudio Freire  wrote:
>> What's not clear to me is how sensitive the performance of vacuum is
>> to the number of cycles used here.  For a large index, the number of
>> searches will presumably be quite large, so it does seem worth
>> worrying about performance.  But if we just always used a binary
>> search, would that lose enough performance with small numbers of
>> segments that anyone would care?  If so, maybe we need to use linear
>> search for small numbers of segments and switch to binary search with
>> larger numbers of segments.
>
> I just went and tested.

Thanks!

> That's after inlining the compare on both the linear and sequential
> code, and it seems it lets the compiler optimize the binary search to
> the point where it outperforms the sequential search.
>
> That's not the case when the compare isn't inlined.
>
> That seems in line with [1], that show the impact of various
> optimizations on both algorithms. It's clearly a close enough race
> that optimizations play a huge role.
>
> Since we're not likely to go and implement SSE2-optimized versions, I
> believe I'll leave the binary search only. That's the attached patch
> set.

That sounds reasonable based on your test results.  I guess part of
what I was wondering is whether a vacuum on a table large enough to
require multiple gigabytes of work_mem isn't likely to be I/O-bound
anyway.  If so, a few cycles one way or the other other isn't likely
to matter much.  If not, where exactly are all of those CPU cycles
going?
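
For readers following along, a sketch of the kind of lookup being discussed;
the segment layout, the names, and the use of vac_cmp_itemptr as the
comparator are placeholders rather than the patch's actual definitions:

	/* One chunk of the segmented dead-TID array (sketch only). */
	typedef struct DeadTupleSegment
	{
		ItemPointerData first;		/* first TID stored in this segment */
		int			num_tids;
		ItemPointerData *tids;		/* sorted array of dead TIDs */
	} DeadTupleSegment;

	static bool
	dead_tid_lookup(DeadTupleSegment *segs, int nsegs, ItemPointer item)
	{
		int			lo = 0,
					hi = nsegs - 1,
					seg = -1;

		/* Binary search for the last segment starting at or before item. */
		while (lo <= hi)
		{
			int			mid = lo + (hi - lo) / 2;

			if (ItemPointerCompare(item, &segs[mid].first) >= 0)
			{
				seg = mid;
				lo = mid + 1;
			}
			else
				hi = mid - 1;
		}

		if (seg < 0)
			return false;

		/* Then binary search within that segment. */
		return bsearch(item, segs[seg].tids, segs[seg].num_tids,
					   sizeof(ItemPointerData), vac_cmp_itemptr) != NULL;
	}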

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Do we need multi-column frequency/histogram stats? WAS Re: [HACKERS] Statistics "dependency"

2017-04-23 Thread Tomas Vondra

On 04/23/2017 04:16 PM, Bruce Momjian wrote:

On Sun, Apr 23, 2017 at 10:01:16AM -0400, Bruce Momjian wrote:

On Sun, Apr 23, 2017 at 11:44:12AM +0100, Simon Riggs wrote:

For us "functional dependency" would sound like something to do with
functions (e.g. CREATE FUNCTION), so just "dependency" appears to me
to be the best term for this.

There are multiple statistics for dependency stored, hence
"dependencies". I don't like it, but it's the best term I can see at
present.


OK, thank you for the reply, and I am sorry I forgot the previous
discussion.  I just wanted to re-check we had researched this.  Thanks.


(Email subject updated.)

Actually, I have a larger question that I was thinking about.  Because
we already have lots of per-column stats, and now the dependency score,
is it possible to mix the per-column stats and dependency score in a way
that multi-column frequency/histogram stats are not necessary?  That
might be a less costly approach I had not considered.


Certainly not. Functional dependencies are "global" statistics, and only 
a very specific type of them. It only tells you that a particular column 
"implies" another column, i.e. knowledge of a value in A means there's 
only a single possible value in "B". That has a number of implications:


* It only works for equality conditions. No inequalities or so.

* It assumes the queries are "consistent" with the functional dependencies.

* There are dependencies/correlations that are not functional 
dependencies, while MCV/histograms would help.


Functional dependencies were the simplest type of extended statistics, 
and so they were the first one to be implemented (and to introduce all the 
infrastructure). But we still need the other types.


The other types of statistics actually track correlation between values 
in the columns, not just "column A implies column B".
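
As a concrete (if simplistic) example of the dependencies statistics being
discussed -- the table is invented, and the CREATE STATISTICS syntax shown is
the one that ended up in the PostgreSQL 10 documentation; the exact syntax was
still being adjusted around this time:

CREATE TABLE zipcodes (zip text, city text);
-- zip functionally determines city
CREATE STATISTICS zip_city_dep (dependencies) ON zip, city FROM zipcodes;
ANALYZE zipcodes;

-- the planner can now avoid multiplying the two selectivities for
--   WHERE zip = '90210' AND city = 'Beverly Hills'
-- but, per the list above, this helps equality clauses only.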



regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statistics "dependency"

2017-04-23 Thread Tomas Vondra

On 04/23/2017 12:44 PM, Simon Riggs wrote:

On 23 April 2017 at 09:17, Dean Rasheed  wrote:

On 23 April 2017 at 03:37, Bruce Momjian  wrote:

In looking at the new multi-column statistics "dependency" option in
Postgres 10, I am quite confused by the term "dependency".  Wouldn't
"correlation" be clearer and less confusing as "column dependency"
already means something else.



I also asked that exactly that question...


Actually, the terms "dependency" and "correlation" are both quite
broad terms that cover a whole range of other different things, and
hence could be misleading. The precise term for this is "functional
dependency" [1], so if anything, the option name should be
"functional_dependencies" or some shortening of that, keeping a part
of each of those words.


...and got that answer also.

For us "functional dependency" would sound like something to do with
functions (e.g. CREATE FUNCTION), so just "dependency" appears to me
to be the best term for this.



Not really. Functional dependency is a term well-defined in relational 
algebra, particularly in the definition of normal forms. It has nothing to 
do with functions, and I'm sure it's not the only possible "ambiguity".


I actually considered functional_dependency when working on this, and I 
think it might be a better choice, but it seemed a bit longish.



There are multiple statistics for dependency stored, hence
"dependencies". I don't like it, but its the best term I can see at
present.


That is a good point, actually. It should probably be plural.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Do we need multi-column frequency/histogram stats? WAS Re: [HACKERS] Statistics "dependency"

2017-04-23 Thread Bruce Momjian
On Sun, Apr 23, 2017 at 10:01:16AM -0400, Bruce Momjian wrote:
> On Sun, Apr 23, 2017 at 11:44:12AM +0100, Simon Riggs wrote:
> > For us "functional dependency" would sound like something to do with
> > functions (e.g. CREATE FUNCTION), so just "dependency" appears to me
> > to be the best term for this.
> > 
> > There are multiple statistics for dependency stored, hence
> > "dependencies". I don't like it, but it's the best term I can see at
> > present.
> 
> OK, thank you for the reply, and I am sorry I forgot the previous
> discussion.  I just wanted to re-check we had researched this.  Thanks.

(Email subject updated.)

Actually, I have a larger question that I was thinking about.  Because
we already have lots of per-column stats, and now the dependency score,
is it possible to mix the per-column stats and dependency score in a way
that multi-column frequency/histogram stats are not necessary?  That
might be a less costly approach I had not considered.

Or did I miss that discussion too?  ;-)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statistics "dependency"

2017-04-23 Thread Bruce Momjian
On Sun, Apr 23, 2017 at 11:44:12AM +0100, Simon Riggs wrote:
> For us "functional dependency" would sound like something to do with
> functions (e.g. CREATE FUNCTION), so just "dependency" appears to me
> to be the best term for this.
> 
> There are multiple statistics for dependency stored, hence
> "dependencies". I don't like it, but it's the best term I can see at
> present.

OK, thank you for the reply, and I am sorry I forgot the previous
discussion.  I just wanted to re-check we had researched this.  Thanks.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A note about debugging TAP failures

2017-04-23 Thread Craig Ringer
On 23 Apr. 2017 10:32, "Michael Paquier"  wrote:

> On Sun, Apr 23, 2017 at 7:48 AM, Daniel Gustafsson  wrote:
>> Skipping the tempdir and instead using ${testname}_data_${name} without a
>> random suffix, we can achieve this with something along the lines of the
>> attached PoC.  It works as now (retain of failure, remove on success unless
>> overridden) but that logic can easily be turned around if we want that.  If
>> it’s of interest I can pursue this after some sleep (tomorrow has become
>> today at this point).
>
> Yes, something like that may make sense as well for readability.
>
> Keeping folders in case of failures is something that I have been
> advocating in favor of for some time, but this never got into the tree
> :(


Huh? We do keep test temp datadirs etc in case of failure. Just not on
success.

Our definition of failure there sucks a bit though. We keep the datadirs if
any test fails in a script. If the script itself crashes we still blow
away the datadirs, which is kind of counterintuitive.

I'd like to change the __DIE__ sig handler to only delete on a clean script
exit code, TAP reporting success, and if some env var like PG_TESTS_NOCLEAN
is undefined. The latter could also be used in pg_regress etc.
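
Roughly something like this (a sketch only, not tested; PG_TESTS_NOCLEAN is a
name I am inventing here, and it is written as an END block rather than a
__DIE__ handler):

	END
	{
		# $? is the exit status the script is about to return.
		my $ok = ($? == 0)
		  && Test::More->builder->is_passing
		  && !$ENV{PG_TESTS_NOCLEAN};

		# Temp dirs were created with CLEANUP => 1; keep them unless all is well.
		$File::Temp::KEEP_ALL = 1 unless $ok;
	}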



> There is a patch from Horiguchi-san that allows actually such a thing,
> have a look at patch 0006 on this email:
> https://www.postgresql.org/message-id/CAMsr+YEc2Ek=qjfj2jev7spz69poduoujxbowzmppvydeie...@mail.gmail.com
>
> The same concept rebased gives the attached. I didn't double-check the
> compatibility with past versions of Test::More though..
> --
> Michael


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statistics "dependency"

2017-04-23 Thread Simon Riggs
On 23 April 2017 at 09:17, Dean Rasheed  wrote:
> On 23 April 2017 at 03:37, Bruce Momjian  wrote:
>> In looking at the new multi-column statistics "dependency" option in
>> Postgres 10, I am quite confused by the term "dependency".  Wouldn't
>> "correlation" be clearer and less confusing as "column dependency"
>> already means something else.
>>

I also asked that exactly that question...

> Actually, the terms "dependency" and "correlation" are both quite
> broad terms that cover a whole range of other different things, and
> hence could be misleading. The precise term for this is "functional
> dependency" [1], so if anything, the option name should be
> "functional_dependencies" or some shortening of that, keeping a part
> of each of those words.

...and got that answer also.

For us "functional dependency" would sound like something to do with
functions (e.g. CREATE FUNCTION), so just "dependency" appears to me
to be the best term for this.

There are multiple statistics for dependency stored, hence
"dependencies". I don't like it, but it's the best term I can see at
present.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Replication lag tracking for walsenders

2017-04-23 Thread Simon Riggs
On 22 April 2017 at 16:41, Tom Lane  wrote:
> Thomas Munro  writes:
>> The assertion fails reliably for me, because standby2's reported write
>> LSN jumps backwards after the timeline changes: for example I see
>> 302 then 3028470 then 302 followed by a normal progression.
>> Surprisingly, 004_timeline_switch.pl reports success anyway.  I'm not
>> sure why the test fails sometimes on tern, but you can see that even
>> when it passed on tern the assertion had failed.
>
> Whoa.  This just turned into a much larger can of worms than I expected.
> How can it be that processes are getting assertion crashes and yet the
> test framework reports success anyway?  That's impossibly
> broken/unacceptable.

Agreed, thanks for fixing.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-23 Thread Simon Riggs
On 23 April 2017 at 01:19, Andres Freund  wrote:
> On 2017-04-22 19:55:18 -0400, Tom Lane wrote:
>> Now that we've got consistent failure reports about the 009_twophase.pl
>> recovery test, I set out to find out why it's failing.  It looks to me
>> like the reason is that this (twophase.c:2145):
>>
>> SubTransSetParent(xid, subxid, overwriteOK);
>>
>> ought to be this:
>>
>> SubTransSetParent(subxid, xid, overwriteOK);
>>
>> because the definition of SubTransSetParent is
>>
>> void
>> SubTransSetParent(TransactionId xid, TransactionId parent, bool overwriteOK)
>>
>> not the other way 'round.
>>
>> While "git blame" blames this line on the recent commit 728bd991c,
>> that just moved the call from somewhere else.  AFAICS this has actually
>> been wrong since StandbyRecoverPreparedTransactions was written,
>> in 361bd1662 of 2010-04-13.
>
>> Also, when I fix that, it gets further but still crashes at the same
>> Assert in SubTransSetParent.  The proximate cause this time seems to be
>> that RecoverPreparedTransactions's calculation of overwriteOK is wrong:
>> it's computing that as "false", but in reality the subtrans link in
>> question has already been set.
>>
>
> Yikes.  This is clearly way undertested.  It's also pretty scary that
> the code has recently been whacked out quite heavily (both 9.6 and
> master), without hitting anything around this - doesn't seem to bode
> well for how in-depth the testing was.

Obviously if there is a bug it's because tests didn't find it and
therefore it is by definition undertested for that specific bug.

I'm not sure what other conclusion you wish to draw, if any?


>> It's not clear to me how much potential this has to create user data
>> corruption, but it doesn't look good at first glance.  Discuss.
>
> Hm. I think it can cause wrong tqual.c results in some edge cases.
> During HS, lastOverflowedXid will be set in some cases, and then
> XidInMVCCSnapshot etc will not find the actual toplevel xid, which'll in
> turn cause lookups snapshot->subxip (where all HS xids reside)
> potentially return wrong results.
>
> I was about to say that I don't see how it could result in persistent
> corruption however - the subtrans lookups are only done for
> (snapshot->xmin, snapshot->xmax] and subtrans is truncated
> regularly. But these days CHECKPOINT_END_OF_RECOVERY isn't obligatory
> anymore, so that might be delayed.  Hm.

I've not found any reason, yet, to believe we return wrong answers in
any case, even though the transient data structure pg_subtrans is
corrupted by the switched call Tom discovers.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-23 Thread Simon Riggs
On 23 April 2017 at 00:55, Tom Lane  wrote:
> Now that we've got consistent failure reports about the 009_twophase.pl
> recovery test, I set out to find out why it's failing.  It looks to me
> like the reason is that this (twophase.c:2145):
>
> SubTransSetParent(xid, subxid, overwriteOK);
>
> ought to be this:
>
> SubTransSetParent(subxid, xid, overwriteOK);
>
> because the definition of SubTransSetParent is
>
> void
> SubTransSetParent(TransactionId xid, TransactionId parent, bool overwriteOK)
>
> not the other way 'round.
>
> While "git blame" blames this line on the recent commit 728bd991c,
> that just moved the call from somewhere else.  AFAICS this has actually
> been wrong since StandbyRecoverPreparedTransactions was written,
> in 361bd1662 of 2010-04-13.

Thanks for finding that.

> It's not clear to me how much potential this has to create user data
> corruption, but it doesn't look good at first glance.  Discuss.

Well, strangely, for a different reason I was looking at that particular
call a couple of days back. I didn't spot that issue, but I was
wondering why we even make that call at that time. My conclusion was
that it could mostly be optimized away, since it isn't needed very
often, but it's not really worth optimizing.

SubTransSetParent() only matters for the case where we are half-way
through a commit with a large commit. Since we do atomic updates of
commit and subcommmit when on same page, this problem would only
matter when top level xid and other subxids were separated across
multiple pages and the issue would only affect a read only query
checking visibility during the commit for that prepared transaction.
Furthermore, the nature of the corruption is that we set the xid to
point to the subxid; since we never mark a top-level commit as
subcommitted, subtrans would never be consulted and so the corruption
would not lead to any incorrect answer even during the window of
exposure. So it looks to me like this error shouldn't cause a problem.

We can fix that, but...

> Also, when I fix that, it gets further but still crashes at the same
> Assert in SubTransSetParent.  The proximate cause this time seems to be
> that RecoverPreparedTransactions's calculation of overwriteOK is wrong:
> it's computing that as "false", but in reality the subtrans link in
> question has already been set.

Not sure about that, investigating.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A note about debugging TAP failures

2017-04-23 Thread Michael Banck
On Sat, Apr 22, 2017 at 02:05:13PM -0700, Andres Freund wrote:
> On 2017-04-22 16:22:59 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > On 2017-04-22 13:51:30 -0400, Tom Lane wrote:
> > >> I think we need to fix TestLib and/or PostgresNode so that there's a way
> > >> to make TAP tests not auto-clean their data directories at end of run,
> > >> without having to resort to editing the script like this.
> > 
> > > I think leaving the directory around in case of failures would be a
> > > pretty bare minimum.  It's sometimes also useful to keep the remains if
> > > everything was apparently successfull, but it's far less important imo.
> > 
> > In the particular case I was interested in here, the test script thought
> > everything was successful :-(.  I'm working on fixing that little problem,
> > but I do not believe that the TAP scripts are so bulletproof that there
> > will never be a need to override their judgment.
> 
> Agreed.  If paths are reproducible and cleaned up on next run it's also
> much less of an issue to just leave them around till the next run.
> Which we imo also should do for non-failing tmp_check directories.

In general yes, but I think it should be checked what the overall 
storage requirements will be if one set of all data directories is kept
around.

I was surprised the src/bin/pg_basebackup/t TAP tests took up several
hundred megabytes (IIRC) when I ran them, so if other tests are similar,
it might make a few animals run out of diskspace as soon as this is
deployed without a heads-up to the operators.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statistics "dependency"

2017-04-23 Thread Dean Rasheed
On 23 April 2017 at 03:37, Bruce Momjian  wrote:
> In looking at the new multi-column statistics "dependency" option in
> Postgres 10, I am quite confused by the term "dependency".  Wouldn't
> "correlation" be clearer and less confusing as "column dependency"
> already means something else.
>

Actually, the terms "dependency" and "correlation" are both quite
broad terms that cover a whole range of other different things, and
hence could be misleading. The precise term for this is "functional
dependency" [1], so if anything, the option name should be
"functional_dependencies" or some shortening of that, keeping a part
of each of those words.

Regards,
Dean

[1] https://en.wikipedia.org/wiki/Functional_dependency


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] recovery tests vs windows

2017-04-23 Thread Andrew Dunstan


On 04/22/2017 04:43 PM, Andrew Dunstan wrote:
> After we got over the Test::More version issue, the recovery tests
> proceeded to fail fairly spectacularly in a test run on jacana.
>
>
> Test 6 fails because there is a CR in the returned stdout from psql. I'm
> inclined to adjust that in PostgresNode::safe_psql so we don't have to
> do it all over the place.
>
> It looks like at least some tests are failing because of confusion
> between Windows paths and the MSys virtualized paths. For example, I see
> this:
>
> 2017-04-22 10:01:14.436 EDT [2276] LOG:  archive command failed with
> exit code 1
> 2017-04-22 10:01:14.436 EDT [2276] DETAIL:  The failed archive
> command was: copy "pg_wal\00010001"
> 
> "/home/pgrunner/bf/root/HEAD/pgsql.build/src/test/recovery/tmp_check/data_master_TCeC/archives\00010001"
> The system cannot find the path specified.
> 0 file(s) copied.
>
> Paths that go in recovery.conf or archive commands must of course be
> pure Windows paths. In this case the path would have been correct if
> prefixed with "c:/Mingw/Msys/1.0"
>
> I'll try to come up with a fix for that.
>
>


All the issues should be fixed by the attached patch.

cheers

andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index e42eb88..51cbec8 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -101,6 +101,15 @@ our @EXPORT = qw(
 
 our ($test_localhost, $test_pghost, $last_port_assigned, @all_nodes);
 
+# Windows path to virtual file system root
+
+our $vfs_path = '';
+if ($Config{osname} eq 'msys')
+{
+	$vfs_path = `cd / && pwd -W`;
+	chomp $vfs_path;
+}
+
 INIT
 {
 
@@ -763,7 +772,7 @@ standby_mode=on
 sub enable_restoring
 {
 	my ($self, $root_node) = @_;
-	my $path = $root_node->archive_dir;
+	my $path = $vfs_path . $root_node->archive_dir;
 	my $name = $self->name;
 
 	print "### Enabling WAL restore for node \"$name\"\n";
@@ -791,7 +800,7 @@ standby_mode = on
 sub enable_archiving
 {
 	my ($self) = @_;
-	my $path   = $self->archive_dir;
+	my $path   = $vfs_path . $self->archive_dir;
 	my $name   = $self->name;
 
 	print "### Enabling WAL archiving for node \"$name\"\n";
@@ -979,6 +988,7 @@ sub safe_psql
 		print "\n End standard error\n";
 	}
 
+	$stdout =~ s/\r//g if $TestLib::windows_os;
 	return $stdout;
 }
 
@@ -1579,6 +1589,9 @@ sub pg_recvlogical_upto
 		}
 	};
 
+	$stdout =~ s/\r//g if $TestLib::windows_os;
+	$stderr =~ s/\r//g if $TestLib::windows_os;
+
 	if (wantarray)
 	{
 		return ($ret, $stdout, $stderr, $timeout);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Replication lag tracking for walsenders

2017-04-23 Thread Tom Lane
Thomas Munro  writes:
> On Sun, Apr 23, 2017 at 3:41 AM, Tom Lane  wrote:
>> As for this patch itself, is it reasonable to try to assert that the
>> timeline has in fact changed?

> The protocol doesn't include the timeline in reply messages, so it's
> not clear how the upstream server would know what timeline the standby
> thinks it's dealing with in any given reply message.  The sending
> server has its own idea of the current timeline but it's not in sync
> with the stream of incoming replies.

Fair enough.  But I'd still like an explanation of why only about
half of the population is showing a failure here.  Seems like every
machine should be seeing the LSN as moving backwards in this test.
So (a) why aren't they all failing, and (b) should we change the
test to make sure every platform sees that happening?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers