Re: [HACKERS] BUG: pg_dump generates corrupted gzip file in Windows

2017-03-24 Thread Craig Ringer
On 24 March 2017 at 14:07, Kuntal Ghosh  wrote:
> On Fri, Mar 24, 2017 at 11:28 AM, Kuntal Ghosh
>  wrote:
>> Hello,
>> In Windows, if one needs to take a dump in plain text format (this is
>> the default option, or can be specified using -Fp) with some level of
>> compression (-Z[0-9]), an output file has to
>> be specified. Otherwise, if the output is redirected to stdout, it'll
>> create a corrupted dump (cmd is set to ASCII mode, so it'll put
>> carriage returns in the file).
> To reproduce the issue, please use the following command in windows cmd:
>
> pg_dump -Z 9 test > E:\test_xu.backup
> pg_dump -Fp -Z 9 test > E:\test_xu.backup

This is a known problem. It is not specific to PostgreSQL; it affects
any software that attempts to use stdin/stdout on Windows via cmd,
where it is not 8-bit clean.

We don't just refuse to run with stdout as a destination because it's
perfectly sensible if you're not using cmd.exe. pg_dump cannot, as far
as I know, tell whether it's being invoked by cmd or something else.

If you have concrete ideas on how to improve this they'd be welcomed.
Is there anywhere you expected to find info in the docs? Do you know
of a way to detect in Windows if the output stream is not 8-bit clean
from within the application program? ... other?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [Proposal] Make the optimiser aware of partitions ordering

2017-03-24 Thread Ashutosh Bapat
On Mon, Mar 20, 2017 at 8:33 PM, Ronan Dunklau  wrote:
> On lundi 20 mars 2017 15:52:03 CET Robert Haas wrote:
>> On Mon, Mar 20, 2017 at 6:31 AM, Ronan Dunklau 
> wrote:
>> > With range partitioning, we guarantee that each partition contains non-
>> > overlapping values. Since we know the range allowed for each partition, it
>> > is possible to sort them according to the partition key (as is done
>> > already for looking up partitions).
>> >
>> > Thus, we can generate sorted Append plans instead of MergeAppend when
>> > sorting occurs on the partition key.
>>
>> Great idea.  This is too late for v10 at this point, but please add it
>> to the next CommitFest so we don't forget about it.
>
> I know it is too late, and thought that it was too early to add it to the
> commitfest properly since so many design decisions should be discussed. Thanks
> for the feedback, I added it.

Thanks for working on this. I was thinking along the same lines.

I will try to get back to review/work on this for v11. In the
meantime, I am working on partition-wise joins [1]. In those patches,
I have added a structure called PartitionScheme, which represents how
a relation is partitioned. For base relations it's mostly a copy of
PartitionDesc and PartitionKey, but it then bubbles up across joins,
with each partitioned join getting the relevant partitioning scheme.
If you could base your design such that it uses PartitionScheme, it
could be used for joins, and when Jeevan's patch for partition-wise
aggregate [2] comes along, it could probably be used for grouping as
well. A rough sketch of the structure is below the references.

[1] 
https://www.postgresql.org/message-id/CAFjFpRcMWwepj-Do1otxQ-GApGPSZ1FmH7YQvQTwzQOGczq_sw%40mail.gmail.com
[2] http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg308861.html
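
For reference, roughly what PartitionScheme carries (a sketch from
memory; field names are illustrative, the authoritative definition is
in the patches at [1]):

typedef struct PartitionSchemeData
{
	char		strategy;		/* list or range partitioning */
	int16		partnatts;		/* number of partition key columns */
	Oid		   *partopfamily;	/* operator family per key column */
	Oid		   *partopcintype;	/* opclass input type per key column */
} PartitionSchemeData;

typedef struct PartitionSchemeData *PartitionScheme;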

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Parallel Append implementation

2017-03-24 Thread Rajkumar Raghuwanshi
On Fri, Mar 24, 2017 at 12:38 AM, Amit Khandekar  wrote:
> Meanwhile, attached is a WIP patch v10. The only change in this patch
> w.r.t. the last patch (v9) is that this one has a new function defined
> append_nonpartial_cost(). Just sending this to show what the algorithm
> looks like; haven't yet called it.
>

Hi,

I have applied the patch on the latest pg sources (commit
457a4448732881b5008f7a3bcca76fc299075ac3). configure and make all
install ran successfully, but initdb failed with the error below.

[edb@localhost bin]$ ./initdb -D data
The files belonging to this database system will be owned by user "edb".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory data ... ok
creating subdirectories ... ok
selecting default max_connections ... sh: line 1:  3106 Aborted
 (core dumped)
"/home/edb/WORKDB/PG3/postgresql/inst/bin/postgres" --boot -x0 -F -c
max_connections=100 -c shared_buffers=1000 -c
dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
sh: line 1:  3112 Aborted (core dumped)
"/home/edb/WORKDB/PG3/postgresql/inst/bin/postgres" --boot -x0 -F -c
max_connections=50 -c shared_buffers=500 -c
dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
sh: line 1:  3115 Aborted (core dumped)
"/home/edb/WORKDB/PG3/postgresql/inst/bin/postgres" --boot -x0 -F -c
max_connections=40 -c shared_buffers=400 -c
dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
sh: line 1:  3118 Aborted (core dumped)
"/home/edb/WORKDB/PG3/postgresql/inst/bin/postgres" --boot -x0 -F -c
max_connections=30 -c shared_buffers=300 -c
dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
sh: line 1:  3121 Aborted (core dumped)
"/home/edb/WORKDB/PG3/postgresql/inst/bin/postgres" --boot -x0 -F -c
max_connections=20 -c shared_buffers=200 -c
dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
sh: line 1:  3124 Aborted (core dumped)
"/home/edb/WORKDB/PG3/postgresql/inst/bin/postgres" --boot -x0 -F -c
max_connections=10 -c shared_buffers=100 -c
dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
10
selecting default shared_buffers ... sh: line 1:  3127 Aborted
(core dumped)
"/home/edb/WORKDB/PG3/postgresql/inst/bin/postgres" --boot -x0 -F -c
max_connections=10 -c shared_buffers=16384 -c
dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
400kB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... TRAP:
FailedAssertion("!(LWLockTranchesAllocated >=
LWTRANCHE_FIRST_USER_DEFINED)", File: "lwlock.c", Line: 501)
child process was terminated by signal 6: Aborted
initdb: removing data directory "data"

[edb@localhost bin]$

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation




Re: [HACKERS] ANALYZE command progress checker

2017-03-24 Thread vinayak


On 2017/03/23 15:04, Haribabu Kommi wrote:



On Wed, Mar 22, 2017 at 8:11 PM, vinayak wrote:



On 2017/03/21 21:25, Haribabu Kommi wrote:



On Tue, Mar 21, 2017 at 3:41 PM, vinayak
<pokale_vinayak...@lab.ntt.co.jp> wrote:

Thank you for testing the patch on Windows platform.


Thanks for the updated patch.

It works well for a normal relation.  But for a relation that
contains child tables,
the PROGRESS_ANALYZE_NUM_ROWS_SAMPLED produces wrong results.


Thank you for reviewing the patch.
The attached patch implements a way to report the sampled rows count
from acquire_sample_rows() even when it is called for child tables.

How about adding another phase called
PROGRESS_ANALYZE_PHASE_COLLECT_INHERIT_SAMPLE_ROWS
and setting this phase only when it is an inheritance analyze
operation?  Also, adding some explanation to the ROWS_SAMPLED phase of
how these sampled rows are calculated for inheritance tables would
give good analyze progress for a relation that contains child
relations as well.

I have added the phase called
PROGRESS_ANALYZE_PHASE_COLLECT_INH_SAMPLE_ROWS.
I have also updated the documentation.


Thanks for the updated patch. I will check it.

The ANALYZE command takes a long time in the computing statistics
phase, so I think we can add some column or phase so that the user can
easily follow the progress.
How about adding a new column like "num_rows_processed" that counts
rows as the statistics of the specified columns are computed?


I prefer a column with rows processed instead of a phase,
because we already set the compute stats phase, and showing the
number of rows processed as the progress there will be good.

How about separating the computing "inheritance statistics" phase
from computing the regular "single table" statistics?
Comments?


Yes, it will be good to show both states (collecting inherited sample
rows and computing inherited stats), so that the current status of a
relation that contains child relations is clearly visible to the
user.

I have updated the patch.
I have added the column "num_rows_processed" and the phase "computing
inherited statistics" to the view.


=# \d+ pg_stat_progress_analyze
                      View "pg_catalog.pg_stat_progress_analyze"
         Column         |  Type   | Collation | Nullable | Default | Storage  | Description
------------------------+---------+-----------+----------+---------+----------+-------------
 pid                    | integer |           |          |         | plain    |
 datid                  | oid     |           |          |         | plain    |
 datname                | name    |           |          |         | plain    |
 relid                  | oid     |           |          |         | plain    |
 phase                  | text    |           |          |         | extended |
 num_target_sample_rows | bigint  |           |          |         | plain    |
 num_rows_sampled       | bigint  |           |          |         | plain    |
 num_rows_processed     | bigint  |           |          |         | plain    |

View definition:
 SELECT s.pid,
s.datid,
d.datname,
s.relid,
CASE s.param1
WHEN 0 THEN 'initializing'::text
WHEN 1 THEN 'collecting sample rows'::text
WHEN 2 THEN 'collecting inherited sample rows'::text
WHEN 3 THEN 'computing statistics'::text
WHEN 4 THEN 'computing inherited statistics'::text
ELSE NULL::text
END AS phase,
s.param2 AS num_target_sample_rows,
s.param3 AS num_rows_sampled,
s.param4 AS num_rows_processed
   FROM pg_stat_get_progress_info('ANALYZE'::text) s(pid, datid, relid,
param1, param2, param3, param4, param5, param6, param7, param8, param9,
param10)
     LEFT JOIN pg_database d ON s.datid = d.oid;

Regards,
Vinayak Pokale
NTT Open Source Software Center


pg_stat_progress_analyze_v6.patch
Description: binary/octet-stream



Re: [HACKERS] segfault in hot standby for hash indexes

2017-03-24 Thread Amit Kapila
On Fri, Mar 24, 2017 at 12:25 PM, Ashutosh Sharma  wrote:
>>
>> I think it would have been probably okay to use *int* for ntuples as
>> that matches with what you are actually assigning in the function.
>
> okay, corrected it. Attached is newer version of patch.
>

Thanks, this version looks good to me.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Parallel Append implementation

2017-03-24 Thread Amit Khandekar
On 24 March 2017 at 13:11, Rajkumar Raghuwanshi
 wrote:
> I have applied the patch on the latest pg sources (commit
> 457a4448732881b5008f7a3bcca76fc299075ac3). configure and make all
> install ran successfully, but initdb failed with the error below.

> FailedAssertion("!(LWLockTranchesAllocated >=
> LWTRANCHE_FIRST_USER_DEFINED)", File: "lwlock.c", Line: 501)

Thanks for reporting, Rajkumar.

With the new PARALLEL_APPEND tranche ID, the LWTRANCHE_FIRST_USER_DEFINED
value has crossed 64, so we need to increase the initial size of
LWLockTrancheArray from 64 to 128. Attached is the updated patch; a
rough sketch of the allocation in question is below.
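
For reference, this is roughly the allocation being resized, modeled
on the code around the failing Assert in lwlock.c (the actual patch
may differ in detail):

	if (LWLockTrancheArray == NULL)
	{
		LWLockTranchesAllocated = 128;	/* previously 64 */
		LWLockTrancheArray = (LWLockTranche **)
			MemoryContextAllocZero(TopMemoryContext,
					LWLockTranchesAllocated * sizeof(LWLockTranche *));
		Assert(LWLockTranchesAllocated >= LWTRANCHE_FIRST_USER_DEFINED);
	}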


ParallelAppend_v11.patch
Description: Binary data



Re: [HACKERS] pageinspect and hash indexes

2017-03-24 Thread Amit Kapila
On Fri, Mar 24, 2017 at 9:50 AM, Ashutosh Sharma  wrote:
> On Fri, Mar 24, 2017 at 9:46 AM, Ashutosh Sharma  
> wrote:
>>
>> Thanks for reviewing my patch. I have removed the extra white space.
>> Attached are both the patches.
>
> Sorry, I mistakenly attached the wrong patch. Here is the correct
> set of patches.
>

The latest version of the patches looks fine to me.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-03-24 Thread Amit Langote
Hi Ashutosh,

On 2017/03/23 21:48, Ashutosh Bapat wrote:
>>> I have fixed all the issues reported till now.

In patch 0007, the following code in have_partkey_equi_join() looks
potentially unsafe:

/*
 * If the clause refers to different partition keys from
 * both relations, it can not be used for partition-wise join.
 */
if (ipk1 != ipk2)
continue;

/*
 * The clause allows partition-wise join if only it uses the same
 * operator family as that specified by the partition key.
 */
if (!list_member_oid(rinfo->mergeopfamilies,
 part_scheme->partopfamily[ipk1]))
continue;

What if ipk1 and ipk2 both turn out to be -1? Accessing
part_scheme->partopfamily[ipk1] would be incorrect, no?
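
If so, a defensive guard along these lines (illustrative only, not
from the patch; it assumes ipk1/ipk2 are left as -1 when an expression
matches no partition key) would avoid the out-of-bounds access:

    /* Skip clauses whose expressions matched no partition key. */
    if (ipk1 < 0 || ipk2 < 0)
        continue;

    if (ipk1 != ipk2)
        continue;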

Thanks,
Amit






Re: [HACKERS] BUG: pg_dump generates corrupted gzip file in Windows

2017-03-24 Thread Kuntal Ghosh
On Fri, Mar 24, 2017 at 12:35 PM, Craig Ringer  wrote:
> On 24 March 2017 at 14:07, Kuntal Ghosh  wrote:
>> On Fri, Mar 24, 2017 at 11:28 AM, Kuntal Ghosh
>>  wrote:
>>> Hello,
>>> In Windows, if one needs to take a dump in plain text format (this is
>>> the default option, or can be specified using -Fp) with some level of
>>> compression (-Z[0-9]), an output file has to
>>> be specified. Otherwise, if the output is redirected to stdout, it'll
>>> create a corrupted dump (cmd is set to ASCII mode, so it'll put
>>> carriage returns in the file).
>> To reproduce the issue, please use the following command in windows cmd:
>>
>> pg_dump -Z 9 test > E:\test_xu.backup
>> pg_dump -Fp -Z 9 test > E:\test_xu.backup
>
> This is a known problem. It is not specific to PostgreSQL; it affects
> any software that attempts to use stdin/stdout on Windows via cmd,
> where it is not 8-bit clean.
>
> We don't just refuse to run with stdout as a destination because it's
> perfectly sensible if you're not using cmd.exe. pg_dump cannot, as far
> as I know, tell whether it's being invoked by cmd or something else.
AFAIU, if we use binary mode, the output is stored byte for byte. In
ASCII mode, cmd pokes its nose in and does CR/LF conversions on its
own. So, whenever we want compression on a plain-text dump file, we
can set the stdout mode to O_BINARY. Is it a wrong approach?
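Something like this is what I have in mind (untested sketch, not the
actual pg_dump code; note that _setmode() also returns the previous
translation mode, so the same call can detect whether text-mode CR/LF
translation was active):

#include <stdio.h>
#ifdef WIN32
#include <io.h>
#include <fcntl.h>
#endif

static void
force_binary_stdout(void)
{
#ifdef WIN32
	int		prev = _setmode(_fileno(stdout), _O_BINARY);

	if (prev == -1)
		fprintf(stderr, "could not switch stdout to binary mode\n");
	else if (prev == _O_TEXT)
		fprintf(stderr, "stdout was in text mode; CR/LF translation was active\n");
#endif
}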

> If you have concrete ideas on how to improve this they'd be welcomed.
> Is there anywhere you expected to find info in the docs? Do you know
> of a way to detect in Windows if the output stream is not 8-bit clean
> from within the application program? ... other?
Actually, I'm not that familiar with the Windows environment, but I
couldn't find any note to users in the pg_dump documentation regarding
this issue. In cmd, if someone needs a plain-text dump in compressed
format, they should specify the output file; otherwise they may run
into the above problem. However, if a dump is corrupted due to the
above issue, a fix for that is provided in [1]. Should we include this
in the documentation?



[1] http://www.gzip.org/
Use fixgz.c to remove the extra CR (carriage return) bytes.
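
For reference, the repair fixgz performs is essentially the following
(minimal sketch; like fixgz itself, it assumes every CR preceding an
LF was inserted by text-mode translation, so it can mis-fire if the
compressed data legitimately contained CR followed by LF):

#include <stdio.h>

int
main(int argc, char **argv)
{
	FILE   *in;
	FILE   *out;
	int		c1;
	int		c2;

	if (argc != 3 ||
		(in = fopen(argv[1], "rb")) == NULL ||
		(out = fopen(argv[2], "wb")) == NULL)
		return 1;

	c1 = getc(in);
	while ((c2 = getc(in)) != EOF)
	{
		if (!(c1 == '\r' && c2 == '\n'))
			putc(c1, out);		/* keep c1 unless it is a CR before LF */
		c1 = c2;
	}
	if (c1 != EOF)
		putc(c1, out);			/* write the final byte */

	fclose(in);
	fclose(out);
	return 0;
}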

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Logical replication existing data copy

2017-03-24 Thread Michael Banck
Hi,

On Fri, Mar 24, 2017 at 12:32:28AM +0100, Petr Jelinek wrote:
> Yes, I also forgot to check if the table actually exists on subscriber
> when fetching them in CREATE SUBSCRIPTION (we have check during
> replication but not there).
> 
> Attached patches should fix both issues.

I no longer get a segfault, thanks.

However, replication also seems not to work. I'm using the following
script right now:

--8<--
#!/bin/sh

set -x

#rm -rf data_* pg*.log

initdb --pgdata=data_pg1 1> /dev/null 2>&1
sed -i -e 's/^#wal_level.=.replica/wal_level = logical/' data_pg1/postgresql.conf
pg_ctl --pgdata=data_pg1 -l pg1.log start 1> /dev/null
psql -c "CREATE TABLE t1(id int);"
pg_basebackup --pgdata=data_pg2
sed -i -e 's/^#port.=.5432/port = 5433/' data_pg2/postgresql.conf
psql -c "INSERT INTO t1 VALUES(1);"
pg_ctl --pgdata=data_pg2 -l pg2.log start 1> /dev/null
psql -c "CREATE PUBLICATION pub1;"
psql --port=5433 -c "CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres' PUBLICATION pub1 WITH (COPY DATA);"
sleep 6
psql -c 'SELECT application_name, state FROM pg_stat_replication;'
psql --port=5433 -c "SELECT COUNT(*) FROM t1;"
--8<--

(I had to add the sleep 6 - 5 wasn't always enough - in order to get the
subscription from 'catchup' to 'streaming', which was a bit surprising
to me)

This is the output:

--8<--
+ initdb --pgdata=data_pg1
+ sed -i -e s/^#wal_level.=.replica/wal_level = logical/
data_pg1/postgresql.conf
+ pg_ctl --pgdata=data_pg1 -l pg1.log start
+ psql -c CREATE TABLE t1(id int);
CREATE TABLE
+ pg_basebackup --pgdata=data_pg2
+ sed -i -e s/^#port.=.5432/port = 5433/ data_pg2/postgresql.conf
+ psql -c INSERT INTO t1 VALUES(1);
INSERT 0 1
+ pg_ctl --pgdata=data_pg2 -l pg2.log start
+ psql -c CREATE PUBLICATION pub1;
CREATE PUBLICATION
+ psql --port=5433 -c CREATE SUBSCRIPTION sub1 CONNECTION
'dbname=postgres' PUBLICATION pub1 WITH (COPY DATA);
NOTICE:  created replication slot "sub1" on publisher
NOTICE:  synchronized table states
CREATE SUBSCRIPTION
+ sleep 6
+ psql -c SELECT application_name, state FROM pg_stat_replication;
 application_name |   state   
--+---
 sub1 | streaming
(1 row)

+ psql --port=5433 -c SELECT COUNT(*) FROM t1;
 count 
---
 0
(1 row)
--8<--

I would have assumed the inserted value to be visible on the standby.
If I insert further values, they don't show up, either.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer




Re: [HACKERS] identity columns

2017-03-24 Thread Vitaly Burovoy
On 3/23/17, Peter Eisentraut  wrote:
> On 3/23/17 06:09, Vitaly Burovoy wrote:
>> I think we'll end up with "DROP IDENTITY IF EXISTS" to avoid raising
>> an exception and "ADD OR SET" if your grammar remains.
>
> That sounds reasonable to me.

It would be great if "DROP IDENTITY IF EXISTS" were in the current patch,
since we don't have any disagreements about the "DROP IDENTITY" behavior,
and "IF EXISTS" is easy to implement there.

>> Right. From that PoV IDENTITY also changes a default value: "SET (ADD
>> ... AS?) IDENTITY" works as setting a default to "nextval(...)"
>> whereas "DROP IDENTITY" works as setting it back to NULL.
>
> But dropping and re-adding an identity destroys state, so it's not quite
> the same.

How does dropping and re-adding affect the choice between "ADD" and "SET"?
If you drop the identity property, the column's state is destroyed
whatever grammar variation you are using.

> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


I have an idea. What about the next version of DDL:

DROP IDENTITY [ IF EXISTS ]
SET GENERATED { ALWAYS | BY DEFAULT } [ IF NOT EXISTS ] | SET ...

That version:
1. does not introduce a new "ADD" variation;
2. without "IF NOT EXISTS" follows the standard;
3. with "IF NOT EXISTS" sets a column's identity property or alters it
(works like "CREATE OR REPLACE" for functions).

-- 
Best regards,
Vitaly Burovoy




Re: [HACKERS] Logical replication existing data copy

2017-03-24 Thread Mark Kirkwood

On 24/03/17 12:32, Petr Jelinek wrote:


On 24/03/17 00:14, Mark Kirkwood wrote:

On 24/03/17 02:00, Peter Eisentraut wrote:

On 3/21/17 21:38, Peter Eisentraut wrote:

This patch is looking pretty good to me, modulo the failing pg_dump
tests.

Attached is a fixup patch.  I have mainly updated some comments and
variable naming for (my) clarity.  No functional changes.

Committed all that.


Testing now this patch is in, I'm unable to create a subscription:

(master)

bench=# CREATE PUBLICATION pgbench
 FOR TABLE pgbench_accounts , pgbench_branches,
   pgbench_tellers, pgbench_history
 WITH (PUBLISH INSERT, PUBLISH UPDATE, PUBLISH DELETE);

(slave)

bench=# CREATE SUBSCRIPTION pgbench
 CONNECTION 'port=5447 user=postgres dbname=bench'
 PUBLICATION pgbench
 WITH (COPY DATA);
ERROR:  duplicate key value violates unique constraint
"pg_subscription_rel_srrelid_srsubid_index"
DETAIL:  Key (srrelid, srsubid)=(0, 16389) already exists.

This is a pair of freshly initdb'ed instances, the master has a size 100
pgbench schema.

I'm guessing this is a different bug from the segfault also reported


Yes, I also forgot to check if the table actually exists on subscriber
when fetching them in CREATE SUBSCRIPTION (we have check during
replication but not there).

Attached patches should fix both issues.




Yep, does seem to.

I note that (probably intentional) specifying 'COPY DATA' does not
'CREATE TABLES' for you... ok, I probably didn't read ...something
somewhere... but anyway, after creating the tables it all seems to work.
Nice.


However one minor observation - as Michael Banck noted - the elapsed
time for the slave to catch up after running:


$ pgbench -c8 -T600 bench

on the master was (subjectively) much longer than for physical streaming 
replication. Is this expected?


regards

Mark






Re: [HACKERS] Adding support for Default partition in partitioning

2017-03-24 Thread Rahila Syed
Hello Rushabh,

Thank you for reviewing.
I have addressed all your comments in the attached patch. The attached
patch currently throws an error if a new partition is added after a
default partition.

> Rather than adding the check for default here, I think this should be
> handled inside get_qual_for_list().
I have moved the check inside get_qual_for_partbound(), as some
operations need to be done before calling get_qual_for_list() for
default partitions.

Thank you,
Rahila Syed

On Tue, Mar 21, 2017 at 11:36 AM, Rushabh Lathia 
wrote:

> I picked this up for review and noticed that the patch does not compile
> cleanly in my environment.
>
> partition.c: In function ‘RelationBuildPartitionDesc’:
> partition.c:269:6: error: ISO C90 forbids mixed declarations and code
> [-Werror=declaration-after-statement]
>   Const*val = lfirst(c);
>   ^
> partition.c: In function ‘generate_partition_qual’:
> partition.c:1590:2: error: ISO C90 forbids mixed declarations and code
> [-Werror=declaration-after-statement]
>   PartitionBoundSpec *spec = (PartitionBoundSpec *)bound;
>   ^
> partition.c: In function ‘get_partition_for_tuple’:
> partition.c:1820:5: error: array subscript has type ‘char’
> [-Werror=char-subscripts]
>  result = parent->indexes[partdesc->boundinfo->def_index];
>  ^
> partition.c:1824:16: error: assignment makes pointer from integer without
> a cast [-Werror]
>  *failed_at = RelationGetRelid(parent->reldesc);
> ^
> cc1: all warnings being treated as errors
>
> Apart from this, I was reading patch here are few more comments:
>
> 1) Variable initialization happens in two places.
>
> @@ -166,6 +170,8 @@ RelationBuildPartitionDesc(Relation rel)
>  /* List partitioning specific */
>  PartitionListValue **all_values = NULL;
>  boolfound_null = false;
> +boolfound_def = false;
> +intdef_index = -1;
>  intnull_index = -1;
>
>  /* Range partitioning specific */
> @@ -239,6 +245,8 @@ RelationBuildPartitionDesc(Relation rel)
>  i = 0;
>  found_null = false;
>  null_index = -1;
> +found_def = false;
> +def_index = -1;
>  foreach(cell, boundspecs)
>  {
>  ListCell   *c;
> @@ -249,6 +257,15 @@ RelationBuildPartitionDesc(Relation rel)
>
>
> 2)
>
> @@ -1558,6 +1586,19 @@ generate_partition_qual(Relation rel)
>  bound = stringToNode(TextDatumGetCString(boundDatum));
>  ReleaseSysCache(tuple);
>
> +/* Return if it is a default list partition */
> +PartitionBoundSpec *spec = (PartitionBoundSpec *)bound;
> +ListCell *cell;
> +foreach(cell, spec->listdatums)
>
> More comments on the above hunk are needed.
>
> Rather than adding the check for default here, I think this should be
> handled inside
> get_qual_for_list().
>
> 3) Code is not aligned with the existing style
>
> diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
> index 6316688..ebb7db7 100644
> --- a/src/backend/parser/gram.y
> +++ b/src/backend/parser/gram.y
> @@ -2594,6 +2594,7 @@ partbound_datum:
>  Sconst{ $$ = makeStringConst($1, @1); }
>  | NumericOnly{ $$ = makeAConst($1, @1); }
>  | NULL_P{ $$ = makeNullAConst(@1); }
> +| DEFAULT  { $$ = (Node *)makeDefElem("DEFAULT", NULL, @1); }
>  ;
>
>
> 4) Unnecessary hunk:
>
> @@ -2601,7 +2602,6 @@ partbound_datum_list:
>  | partbound_datum_list ',' partbound_datum
>  { $$ = lappend($1, $3); }
>  ;
> -
>
> Note: these are just initial review comments; I am yet to do the
> detailed review
> and testing of the patch.
>
> Thanks.
>
> On Mon, Mar 20, 2017 at 9:27 AM, Rahila Syed 
> wrote:
>
>> Hello,
>>
>> Please find attached a rebased patch with support for pg_dump. I am
>> working on the patch
>> to handle adding a new partition after a default partition by throwing an
>> error if
>> conflicting rows exist in default partition and adding the partition
>> successfully otherwise.
>> Will post an updated patch by tomorrow.
>>
>> Thank you,
>> Rahila Syed
>>
>> On Mon, Mar 13, 2017 at 5:42 AM, Robert Haas 
>> wrote:
>>
>>> On Fri, Mar 10, 2017 at 2:17 PM, Peter Eisentraut
>>>  wrote:
>>> > On 3/2/17 21:40, Robert Haas wrote:
>>> >> On the point mentioned above, I
>>> >> don't think adding a partition should move tuples, necessarily; seems
>>> >> like it would be good enough - maybe better - for it to fail if there
>>> >> are any that would need to be moved.
>>> >
>>> > ISTM that the uses cases of various combinations of adding a default
>>> > partition, adding another partition after it, removing the default
>>> > partition, clearing out the default partition in order to add more
>>> > nondefault partitions, and so on, need to be more clearly spelled out
>>> > for each partitioning type.  We also need to consider that pg_dump and
>>> > pg_upgrade need to be able

Re: [HACKERS] Enabling parallelism for queries coming from SQL or other PL functions

2017-03-24 Thread Rafia Sabih
On Thu, Mar 23, 2017 at 11:26 PM, Robert Haas  wrote:

>
> The changes to the plpgsql code don't look so good to me.  The change
> to exec_stmt_return_query fixes the same bug that I mentioned in the
> email linked above, but only half of it -- it corrects the RETURN
> QUERY EXECUTE case but not the RETURN QUERY case.  And it's clearly a
> separate change; that part is a bug fix, not an enhancement.

My bad. Since you have given this as a separate patch in the link
upthread, I suppose nothing is expected from me regarding this
right now.

> Some of
> the other changes depend on whether we're in a trigger, which seems
> irrelevant to whether we can use parallelism. Even if the outer query
> is doing writes, we can still use parallelism for queries inside the
> trigger function if warranted.  It's probably a rare case to have
> queries inside a trigger that are expensive enough to justify such
> handling but I don't see the value of putting in special-case logic to
> prevent it.
>
Fixed. I had confused it with not allowing parallel workers when an
update command is in progress.

> I suspect that code fails to achieve its goals anyway.  At the top of
> exec_eval_expr(), you call exec_prepare_plan() and unconditionally
> pass CURSOR_OPT_PARALLEL_OK, so when that function returns, expr->plan
> might now be a parallel plan.  If we reach the call to
> exec_run_select() further down in that function, and if we happen to
> pass false, it's not going to matter, because exec_run_select() is
> going to find the plan already initialized.
>
True, fixed.
The attached patch is to be applied over [1].

[1]  
https://www.postgresql.org/message-id/CA%2BTgmoZ_ZuH%2BauEeeWnmtorPsgc_SmP%2BXWbDsJ%2BcWvWBSjNwDQ%40mail.gmail.com
-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


pl_parallel_opt_support_v4.patch
Description: Binary data



Re: [HACKERS] Logical replication existing data copy

2017-03-24 Thread Erik Rijkers

On 2017-03-24 10:45, Mark Kirkwood wrote:


However one minor observation - as Michael Banck noted - the elapsed
time for slave to catch up after running:

$ pgbench -c8 -T600 bench

on the master was (subjectively) much longer than for physical
streaming replication. Is this expected?



I think you probably want to do (on the slave) :

  alter role  set synchronous_commit = off;

otherwise it's indeed extremely slow.









Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-24 Thread Amit Kapila
On Fri, Mar 24, 2017 at 12:25 AM, Pavan Deolasee
 wrote:
>
>
> On Thu, Mar 23, 2017 at 7:53 PM, Amit Kapila 
> wrote:
>>
>> >
>>
>> I am not sure on what basis user can set such parameters, it will be
>> quite difficult to tune those parameters.  I think the point is
>> whatever threshold we keep, once that is crossed, it will perform two
>> scans for all the indexes.
>
>
> Well, that applies to even vacuum parameters, no?
>

I don't know how much we can directly compare the usability of the new
parameters you are proposing here to existing parameters.

> The general sense I've got
> here is that we're ok to push some work in background if it helps the
> real-time queries, and I kinda agree with that.
>

I don't think we can define this work as "some" work; it can be a lot
of work depending on the number of indexes.  Also, I think for some
cases it will generate maintenance work without generating benefit.
For example, when there is one index on a table and there are updates
for that index column.

> Having said that, I see many ways we can improve on this later. Like we can
> track somewhere else information about tuples which may have received WARM
> updates (I think it will need to be a per-index bitmap or so) and use that
> to do WARM chain conversion in a single index pass.
>

Sure, if we had some way to do it in a single pass, or to do most of
the work in the foreground process (like the dead-marking idea we have
for indexes), that would be better.

> But this is clearly not
> PG 10 material.
>

I don't see much discussion about this aspect of the patch, so I am not
sure if it is acceptable to increase the cost of vacuum.  Now, I don't
know if your idea of GUCs makes it such that the additional cost occurs
seldom and the additional pass has a minimal impact, which would make
it acceptable.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix and simplify check for whether we're running as Windows serv

2017-03-24 Thread Heikki Linnakangas

On 03/22/2017 07:44 PM, Robert Haas wrote:

On Wed, Mar 22, 2017 at 10:13 AM, Alvaro Herrera
 wrote:

Heikki Linnakangas wrote:

I did some archeology, and found CheckTokenMembership() in MinGW's w32api
packages version 3.14
(https://sourceforge.net/projects/mingw/files/MinGW/Base/w32api/w32api-3.14/,
in include/winbase.h). According to the timestamps on that download page,
that was released in 2009. That was the oldest version I could find, so it
might go even further back.

Dave, do you know exactly what version of MinGW narwhal is running? And how
difficult is it to upgrade to something slightly more modern? Ease of
upgrade is another good data point on how far we need to support old
versions.


Given that this was backpatched and that it broke narwhal in all
branches, I think the solution needs to make narwhal work again without
requiring it to upgrade; so we should acquire CheckTokenMembership via
dynloading just like we do the other functions.  If we want to require a
newer mingw version in pg10, that's acceptable, but it should be a
separate patch.


+1 for not moving the minimum system requirements in the back-branches.


Ok. I reverted this patch in the back-branches, and applied the much 
less invasive "V2" patch [1] instead. HEAD is unchanged, so narwhal 
still fails there.


Dave: the consensus is that we no longer support the old version of 
MinGW that narwhal is using, for PostgreSQL v 10. Can you modify the 
configuration of narwhal to not try building 'master' anymore, or 
upgrade the toolchain, please?


[1] 
https://www.postgresql.org/message-id/CAB7nPqSvfu%3DKpJ%3DNX%2BYAHmgAmQdzA7N5h31BjzXeMgczhGCC%2BQ%40mail.gmail.com


- Heikki





Re: [HACKERS] BUG: pg_dump generates corrupted gzip file in Windows

2017-03-24 Thread Kuntal Ghosh
On Fri, Mar 24, 2017 at 2:17 PM, Kuntal Ghosh
 wrote:
> On Fri, Mar 24, 2017 at 12:35 PM, Craig Ringer  wrote:
>> On 24 March 2017 at 14:07, Kuntal Ghosh  wrote:
>>> On Fri, Mar 24, 2017 at 11:28 AM, Kuntal Ghosh
>>>  wrote:
>>>> Hello,
>>>> In Windows, if one needs to take a dump in plain text format (this is
>>>> the default option, or can be specified using -Fp) with some level of
>>>> compression (-Z[0-9]), an output file has to
>>>> be specified. Otherwise, if the output is redirected to stdout, it'll
>>>> create a corrupted dump (cmd is set to ASCII mode, so it'll put
>>>> carriage returns in the file).
>>> To reproduce the issue, please use the following command in windows cmd:
>>>
>>> pg_dump -Z 9 test > E:\test_xu.backup
>>> pg_dump -Fp -Z 9 test > E:\test_xu.backup
>>
>> This is a known problem. It is not specific to PostgreSQL; it affects
>> any software that attempts to use stdin/stdout on Windows via cmd,
>> where it is not 8-bit clean.
>>
>> We don't just refuse to run with stdout as a destination because it's
>> perfectly sensible if you're not using cmd.exe. pg_dump cannot, as far
>> as I know, tell whether it's being invoked by cmd or something else.
> AFAIU, if we use binary mode, the output is stored byte for byte. In
> ASCII mode, cmd pokes its nose in and does CR/LF conversions on its
> own. So, whenever we want compression on a plain-text dump file, we
> can set the stdout mode to O_BINARY. Is it a wrong approach?
With help from Ashutosh Sharma, I tested this in a Windows
environment. Sadly, it still doesn't work. :( IMHO, we should document
the issue somewhere.



-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-03-24 Thread Amit Langote
Hi Ashutosh,

On 2017/03/23 21:48, Ashutosh Bapat wrote:
>>> I have fixed all the issues reported till now.

I've tried to fix your 0012 patch (Multi-level partitioned table
expansion) considering your message earlier on this thread [1].
Especially the fact that no AppendRelInfo and RelOptInfo are allocated for
partitioned child tables as of commit d3cc37f1d [2].  I've fixed
expand_inherited_rtentry() such that an AppendRelInfo *is* allocated for
partitioned child RTEs whose rte->inh is set to true.  Such an RTE is
recursively expanded with that RTE as the parent.

Also, as I mentioned elsewhere [3], the multi-level inheritance expansion
of partitioned tables will break update/delete on partitioned tables,
because inheritance_planner() is not ready to handle inheritance sets
structured that way.  I tried to refactor inheritance_planner() such that
its core logic can be recursively invoked for partitioned child RTEs.  The
resulting child paths and other auxiliary information related to planning
across the hierarchy are maintained in one place using a struct to hold
the same in a few flat lists.  The refactoring didn't break any existing
tests and a couple of new tests are added to check that it indeed works
for multi-level partitioned tables expanded using the new multi-level structure.

There are some test failures in 0014 (Multi-level partition-wise join
tests), probably because of the changes I made to 0012; I didn't get
time to check why. I have verified with an example that multi-level
join planning still works, so it's not completely broken either.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CAFjFpRefs5ZMnxQ2vP9v5zOtWtNPuiMYc01sb1SWjCOB1CT%3DuQ%40mail.gmail.com

[2] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=d3cc37f1d

[3]
https://www.postgresql.org/message-id/744d20fe-fc7b-f89e-8d06-6496ec537b86%40lab.ntt.co.jp
From 7051b9cb4760908e23e64969988b58fcb466868e Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat 
Date: Fri, 10 Feb 2017 13:50:14 +0530
Subject: [PATCH 12/14] Multi-level partitioned table expansion.

Construct inheritance hierarchy of a partitioned table to reflect the
partition hierarchy.

Commit d3cc37f1d got rid of AppendRelInfos for *all* partitioned child
tables.  But we'd need one for recursively expanded partitioned
children.  In its absence, some of the later planning stages would not be
able to access the partitions below the first level, because they are not
directly linked to the root parent.  Fix that by allocating one.  That
means such children also get a RelOptInfo, which also wasn't allocated
previously.

Expanding partitioned table inheritance this new way means we need to
teach a few places how to traverse the hierarchy, such as propagating
lateral join information from the root table down the partition
hierarchy.

Further, inheritance_planner() needs refactoring to enable recursive
invocation of its core logic.  While the query_planner() can handle the
new partition hierarchical structure just fine because of the way the
relevant code works, inheritance_planner() would miss any child tables
below the first level, because it can't do recursion.  Refactor it so
that it can.
---
 src/backend/optimizer/plan/initsplan.c |  14 +-
 src/backend/optimizer/plan/planner.c   | 290 +
 src/backend/optimizer/prep/prepunion.c |  69 ++--
 src/test/regress/expected/inherit.out  |  22 +++
 src/test/regress/sql/inherit.sql   |  17 ++
 5 files changed, 292 insertions(+), 120 deletions(-)

diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index b4ac224a7a..20774b29fe 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -630,7 +630,19 @@ create_lateral_join_info(PlannerInfo *root)
 	{
 		RelOptInfo *brel = root->simple_rel_array[rti];
 
-		if (brel == NULL || brel->reloptkind != RELOPT_BASEREL)
+		if (brel == NULL)
+			continue;
+
+		/*
+		 * If an "other rel" RTE is a "partitioned table", we must propagate
+		 * the lateral info inherited all the way from the root parent to its
+		 * children. That's because the children are not linked directly with
+		 * the root parent via AppendRelInfo's unlike in case of a regular
+		 * inheritance set (see expand_inherited_rtentry()).  Failing to
+		 * do this would result in those children not getting marked with the
+		 * appropriate lateral info.
+		 */
+		if (brel->reloptkind != RELOPT_BASEREL && !brel->part_scheme)
 			continue;
 
 		if (root->simple_rte_array[rti]->inh)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 4487683196..59cb559b9e 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -91,10 +91,28 @@ typedef struct
 	List	   *groupClause;	/* overrides parse->groupClause */
 } standard_qp_extra;
 
+/* Result of a given invocation of inheritance_planner_guts() */
+typedef struct
+{
+	I

Re: [HACKERS] Backend crash on non-exclusive backup cancel

2017-03-24 Thread Teodor Sigaev

Thank you all, pushed

Michael Paquier wrote:

On Fri, Mar 24, 2017 at 1:45 AM, Teodor Sigaev  wrote:

I believe patch looks good and it's ready to commit.


Thanks for the review!


As I understand, it fixes a bug introduced by
commit 7117685461af50f50c03f43e6a622284c8d54694
Date:   Tue Apr 5 20:03:49 2016 +0200

Implement backup API functions for non-exclusive backups


Indeed.


And, suppose, it should be backpatched to 9.6?


Yes, that's where non-exclusive backups have been introduced.



--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-24 Thread Pavan Deolasee
On Fri, Mar 24, 2017 at 4:04 PM, Amit Kapila 
wrote:

> On Fri, Mar 24, 2017 at 12:25 AM, Pavan Deolasee
>  wrote:
> >
> >
> > On Thu, Mar 23, 2017 at 7:53 PM, Amit Kapila 
> >
> > The general sense I've got
> > here is that we're ok to push some work in background if it helps the
> > real-time queries, and I kinda agree with that.
> >
>
> I don't think we can define this work as "some" work, it can be a lot
> of work depending on the number of indexes.  Also, I think for some
> cases it will generate maintenance work without generating benefit.
> For example, when there is one index on a table and there are updates
> for that index column.
>
>
That's a fair point. I think we can address this though. At the end of
the first index scan we would know how many WARM pointers the index has
and whether it's worth doing a second scan. For the case you mentioned,
we will do a second scan just on that one index, skip all other indexes,
and still achieve the same result. On the other hand, if one index
receives many updates and other indexes are rarely updated, then we
might leave behind a few WARM chains and won't be able to do IOS on
those pages. But given the premise that the other indexes receive rare
updates, it may not be a problem. Note: the code is not currently
written that way, but it should be a fairly small change.

The other thing that we didn't talk about is that vacuum will need to
track dead tuples and WARM candidate chains separately, which increases
memory overhead. So for very large tables, and for the same amount of
maintenance_work_mem, one round of vacuum will be able to clean fewer
pages. We could work out a more compact representation, but that is not
done currently.


>
> > But this is clearly not
> > PG 10 material.
> >
>
> I don't see much discussion about this aspect of the patch, so not
> sure if it is acceptable to increase the cost of vacuum.  Now, I don't
> know if your idea of GUC's make it such that the additional cost will
> occur seldom and this additional pass has a minimal impact which will
> make it acceptable.


Yeah, I agree. I'm trying to schedule some more benchmarks, but any help is
appreciated.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


[HACKERS] [psql] patch to fix ordering in words_after_create array

2017-03-24 Thread Rushabh Lathia
Hi All,

While looking at the code around tab-complete.c, I
found that the ordering in the words_after_create array is not
correct for DEFAULT PRIVILEGES, which was added by
the commit below:

commit d7d77f3825122bde55be9e06f6c4851028b99795
Author: Peter Eisentraut 
Date:   Thu Mar 16 18:54:28 2017 -0400

psql: Add completion for \help DROP|ALTER

PFA patch to fix the same.

Thanks,
Rushabh Lathia
www.EnterpriseDB.com
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index f749406..02a1571 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -999,8 +999,8 @@ static const pgsql_thing_t words_after_create[] = {
 	{"CONFIGURATION", Query_for_list_of_ts_configurations, NULL, THING_NO_SHOW},
 	{"CONVERSION", "SELECT pg_catalog.quote_ident(conname) FROM pg_catalog.pg_conversion WHERE substring(pg_catalog.quote_ident(conname),1,%d)='%s'"},
 	{"DATABASE", Query_for_list_of_databases},
-	{"DICTIONARY", Query_for_list_of_ts_dictionaries, NULL, THING_NO_SHOW},
 	{"DEFAULT PRIVILEGES", NULL, NULL, THING_NO_CREATE | THING_NO_DROP},
+	{"DICTIONARY", Query_for_list_of_ts_dictionaries, NULL, THING_NO_SHOW},
 	{"DOMAIN", NULL, &Query_for_list_of_domains},
 	{"EVENT TRIGGER", NULL, NULL},
 	{"EXTENSION", Query_for_list_of_extensions},



[HACKERS] Re: [COMMITTERS] pgsql: Fix and simplify check for whether we're running as Windows serv

2017-03-24 Thread Dave Page
On Friday, March 24, 2017, Heikki Linnakangas  wrote:

> On 03/22/2017 07:44 PM, Robert Haas wrote:
>
>> On Wed, Mar 22, 2017 at 10:13 AM, Alvaro Herrera
>>  wrote:
>>
>>> Heikki Linnakangas wrote:
>>>
 I did some archeology, and found CheckTokenMembership() in MinGW's
 w32api
 packages version 3.14
 (https://sourceforge.net/projects/mingw/files/MinGW/Base/
 w32api/w32api-3.14/,
 in include/winbase.h). According to the timestamps on that download
 page,
 that was released in 2009. That was the oldest version I could find, so
 it
 might go even further back.

 Dave, do you know exactly what version of MinGW narwhal is running? And
 how
 difficult is it to upgrade to something slightly more modern? Ease of
 upgrade is another good data point on how far we need to support old
 versions.

>>>
>>> Given that this was backpatched and that it broke narwhal in all
>>> branches, I think the solution needs to make narwhal work again without
>>> requiring it to upgrade; so we should acquire CheckTokenMembership via
>>> dynloading just like we do the other functions.  If we want to require a
>>> newer mingw version in pg10, that's acceptable, but it should be a
>>> separate patch.
>>>
>>
>> +1 for not moving the minimum system requirements in the back-branches.
>>
>
> Ok. I reverted this patch in the back-branches, and applied the much less
> invasive "V2" patch [1] instead. HEAD is unchanged, so narwhal still fails
> there.
>
> Dave: the consensus is that we no longer support the old version of MinGW
> that narwhal is using, for PostgreSQL v 10. Can you modify the
> configuration of narwhal to not try building 'master' anymore, or upgrade
> the toolchain, please?
>

I've disabled it. Thanks.


-- 
Dave Page
PostgreSQL Core Team
http://www.postgresql.org/


Re: [HACKERS] scram and \password

2017-03-24 Thread Heikki Linnakangas

On 03/23/2017 06:41 AM, Michael Paquier wrote:

And after a lookup the failure is here:
-   result = get_role_password(port->user_name, &shadow_pass, logdetail);
+   shadow_pass = get_role_password(port->user_name, logdetail);
if (result == STATUS_OK)
result is never set up in this code path, so that may crash.


Ah, of course. For some reason, I had -Wno-maybe-uninitialized in my
configure command line. Without that, gcc even warns about that.


Fixed, and pushed. Thanks!

- Heikki





Re: [HACKERS] Logical replication existing data copy

2017-03-24 Thread Petr Jelinek
On 24/03/17 11:23, Erik Rijkers wrote:
> On 2017-03-24 10:45, Mark Kirkwood wrote:
>>
>> However one minor observation - as Michael Banck noted - the elapsed
>> time for slave to catch up after running:
>>
>> $ pgbench -c8 -T600 bench
>>
>> on the master was (subjectively) much longer than for physical
>> streaming replication. Is this expected?
>>
> 
> I think you probably want to do (on the slave) :
> 
>   alter role  set synchronous_commit = off;
> 
> otherwise it's indeed extremely slow.
> 

Yes that might be one reason, see
https://www.postgresql.org/message-id/08b7053b-5679-0733-3af7-00b8cde62...@2ndquadrant.com
(although it probably needs rebase now)

Also don't forget that the snapbuild performance fixes haven't been
committed yet so it can take long time to even start.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] increasing the default WAL segment size

2017-03-24 Thread Stephen Frost
Peter,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> There is a function for that.
[...]
> There is not a function for that, but there could be one.

I'm not sure you've really considered what you're suggesting here.

We need to make sure we have every file between two LSNs.  Yes, we
could step the LSN forward one byte at a time, calling the appropriate
function for every single byte, to make sure that we have that file, but
that really isn't a reasonable approach.  Nor would it be reasonable if
we go on the assumption that WAL files can't be less than 1MB.

Beyond that, this also bakes in an assumption that we would then require
access to a database (of a current enough version to have the functions
needed too!) to connect to and run these functions, which is a poor
design.  If the user is using a remote system to gather the WAL on, that
system may not have easy access to PG.  Further, backup tools will want
to do things like off-line verification that the backup is complete,
perhaps in another environment entirely which doesn't have PG, or maybe
where what they're trying to do is make sure that a given backup is good
before starting a restore to bring PG back up.

Also, given that one of the things we're talking about here is
specifically that we want to be able to change the WAL size for
different databases, you would have to make sure that the database
you're running these functions on uses the same WAL file size as the one
which is being backed up.

No, I don't agree that we can claim the LSN -> WAL filename mapping is
an internal PG detail that we can whack around because there are
functions to calculate the answer.  External utilities need to be able
to perform that translation and we need to document for them how to do
so correctly.
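
To be concrete, the mapping external tools bake in today is
essentially the XLogFileName() macro; here is a sketch for the
default 16MB segment size (illustrative, written against the current
naming scheme):

#include <stdio.h>
#include <stdint.h>

#define WAL_SEG_SIZE	(16 * 1024 * 1024)
#define SEGS_PER_XLOGID	(UINT64_C(0x100000000) / WAL_SEG_SIZE)

static void
wal_file_name(char *fname, size_t len, uint32_t tli, uint64_t lsn)
{
	uint64_t	segno = lsn / WAL_SEG_SIZE;

	snprintf(fname, len, "%08X%08X%08X",
			 (unsigned) tli,
			 (unsigned) (segno / SEGS_PER_XLOGID),
			 (unsigned) (segno % SEGS_PER_XLOGID));
}

This is exactly the kind of formula that needs to stay documented if
tools are to keep computing it off-line.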

Thanks!

Stephen




Re: [HACKERS] Re: Declarative partitioning optimization for large amount of partitions

2017-03-24 Thread Aleksander Alekseev
Hi Anastasia,

Thanks a lot for a review!

As was mentioned above, there is also a bottleneck in the
find_all_inheritors procedure. It turned out the problem there is
similar and could easily be fixed as well. The corresponding patch is
attached to this message. To keep things in order I'm attaching the
previous patch as well.

On Wed, Mar 22, 2017 at 11:53:45AM +, Anastasia Lubennikova wrote:
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   tested, passed
> Spec compliant:   tested, passed
> Documentation:not tested
> 
> The patch looks good to me.
> It applies clearly, passes all the tests and eliminates the bottleneck 
> described in the first message.
> And as I can see from the thread, there were no objections from others,
> except a few minor comments about code style, which are fixed in the last 
> version of the patch.
> So I mark it "Ready for committer".
> 
> The new status of this patch is: Ready for Committer

-- 
Best regards,
Aleksander Alekseev
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 7cacb1e9b2..a22a5a37c8 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -161,6 +161,20 @@ typedef struct TabStatusArray
 static TabStatusArray *pgStatTabList = NULL;
 
 /*
+ * pgStatTabHash entry
+ */
+typedef struct TabStatHashEntry
+{
+	Oid t_id;
+	PgStat_TableStatus* tsa_entry;
+} TabStatHashEntry;
+
+/*
+ * Hash table for O(1) t_id -> tsa_entry lookup
+ */
+static HTAB *pgStatTabHash = NULL;
+
+/*
  * Backends store per-function info that's waiting to be sent to the collector
  * in this hash table (indexed by function OID).
  */
@@ -815,7 +829,13 @@ pgstat_report_stat(bool force)
 pgstat_send_tabstat(this_msg);
 this_msg->m_nentries = 0;
 			}
+
+			/*
+			 * Entry will be zeroed soon so we need to remove it from the lookup table.
+			 */
+			hash_search(pgStatTabHash, &entry->t_id, HASH_REMOVE, NULL);
 		}
+
 		/* zero out TableStatus structs after use */
 		MemSet(tsa->tsa_entries, 0,
 			   tsa->tsa_used * sizeof(PgStat_TableStatus));
@@ -1667,59 +1687,77 @@ pgstat_initstats(Relation rel)
 }
 
 /*
+ * Make sure pgStatTabList and pgStatTabHash are initialized.
+ */
+static void
+make_sure_stat_tab_initialized()
+{
+	HASHCTL ctl;
+
+	if (pgStatTabList)
+		return;
+
+	Assert(!pgStatTabHash);
+
+	memset(&ctl, 0, sizeof(ctl));
+	ctl.keysize = sizeof(Oid);
+	ctl.entrysize = sizeof(TabStatHashEntry);
+	ctl.hcxt = TopMemoryContext;
+
+	pgStatTabHash = hash_create("pgstat t_id to tsa_entry lookup table",
+		TABSTAT_QUANTUM, &ctl, HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+	pgStatTabList = (TabStatusArray *) MemoryContextAllocZero(TopMemoryContext,
+sizeof(TabStatusArray));
+}
+
+/*
  * get_tabstat_entry - find or create a PgStat_TableStatus entry for rel
  */
 static PgStat_TableStatus *
 get_tabstat_entry(Oid rel_id, bool isshared)
 {
+	TabStatHashEntry* hash_entry;
 	PgStat_TableStatus *entry;
 	TabStatusArray *tsa;
-	TabStatusArray *prev_tsa;
-	int			i;
+	bool found;
+
+	make_sure_stat_tab_initialized();
+
+	/*
+	 * Find an entry or create a new one.
+	 */
+	hash_entry = hash_search(pgStatTabHash, &rel_id, HASH_ENTER, &found);
+	if(found)
+		return hash_entry->tsa_entry;
 
 	/*
-	 * Search the already-used tabstat slots for this relation.
+	 * `hash_entry` was just created and now we have to fill it.
+	 * First make sure there is a free space in a last element of pgStatTabList.
 	 */
-	prev_tsa = NULL;
-	for (tsa = pgStatTabList; tsa != NULL; prev_tsa = tsa, tsa = tsa->tsa_next)
+	tsa = pgStatTabList;
+	while(tsa->tsa_used == TABSTAT_QUANTUM)
 	{
-		for (i = 0; i < tsa->tsa_used; i++)
+		if(tsa->tsa_next == NULL)
 		{
-			entry = &tsa->tsa_entries[i];
-			if (entry->t_id == rel_id)
-return entry;
+			tsa->tsa_next = (TabStatusArray *) MemoryContextAllocZero(TopMemoryContext,
+		sizeof(TabStatusArray));
 		}
 
-		if (tsa->tsa_used < TABSTAT_QUANTUM)
-		{
-			/*
-			 * It must not be present, but we found a free slot instead. Fine,
-			 * let's use this one.  We assume the entry was already zeroed,
-			 * either at creation or after last use.
-			 */
-			entry = &tsa->tsa_entries[tsa->tsa_used++];
-			entry->t_id = rel_id;
-			entry->t_shared = isshared;
-			return entry;
-		}
+		tsa = tsa->tsa_next;
 	}
 
 	/*
-	 * We ran out of tabstat slots, so allocate more.  Be sure they're zeroed.
-	 */
-	tsa = (TabStatusArray *) MemoryContextAllocZero(TopMemoryContext,
-	sizeof(TabStatusArray));
-	if (prev_tsa)
-		prev_tsa->tsa_next = tsa;
-	else
-		pgStatTabList = tsa;
-
-	/*
-	 * Use the first entry of the new TabStatusArray.
+	 * Add an entry.
 	 */
 	entry = &tsa->tsa_entries[tsa->tsa_used++];
 	entry->t_id = rel_id;

Re: [HACKERS] Monitoring roles patch

2017-03-24 Thread Stephen Frost
Peter,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 3/22/17 09:17, Stephen Frost wrote:
> >> If we do it via GRANTs instead, then users can easily extend it.
> > The intent here is that users will *also* be able to do it via GRANTs if
> > they wish to.
> 
> But why not do it with GRANTs in the first place then?

This is akin to asking why we need GRANT ALL and ALTER DEFAULT PRIVs.

Would it be technically possible to make users jump through hoops every
time they set up a new system to create their own monitor role that
would then have the right set of privileges for *that* version of PG?
Yes, but it's not exactly friendly.  The whole idea here is that the
tool authors will be able to tell the users that they just need to GRANT
this one role, not a script of 20 GRANT statements, oh, and that it
won't break when doing upgrades.

If we make the users run all the statements individually then they'll
also have to get an updated script for the next version of PG too
because we will have added things that the tools will want access to.

Further, they'll have to make sure to install all the contrib modules
they want to use before running the GRANT script which is provided, or
deal with GRANT'ing the rights out after installing some extension.

With the approach that Dave and I are advocating, we can avoid all of
that.  Contrib modules can bake-in GRANTs to the appropriate roles,
upgrades can be handled smoothly even when we add new capabilities which
are appropriate, users have a simple and straight-forward way to set up
good monitoring, and tool authors will know what permissions are
available and can finally have a better answer than "well, just make the
monitor user superuser if you want to avoid all these complexities."

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] increasing the default WAL segment size

2017-03-24 Thread David Steele

On 3/24/17 12:27 AM, Peter Eisentraut wrote:

On 3/23/17 16:58, Stephen Frost wrote:

The backup tools need to also get the LSN from the pg_stop_backup and
verify that they have the WAL file associated with that LSN.


There is a function for that.


They also
need to make sure that they have all of the WAL files between the
starting LSN and the ending LSN.  Doing that requires understanding how
the files are named to make sure there aren't any missing.


There is not a function for that, but there could be one.


A function would be nice, but tools often cannot depend on the database 
being operational, so it's still necessary to re-implement such checks. 
Having a sane sequence of WAL file names makes that easier.
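
For what it's worth, here is a minimal standalone sketch (not code from 
any existing tool) of the arithmetic involved: deriving the WAL segment 
file name that contains a given LSN, assuming the historical 16MB segment 
size.  It mirrors the backend's XLogFileName() macro; walking segno from 
the start LSN's segment to the stop LSN's segment enumerates exactly the 
files that must all be present.

	#include <stdint.h>
	#include <stdio.h>

	#define WAL_SEG_SIZE	((uint64_t) 16 * 1024 * 1024)
	#define SEGS_PER_XLOGID	(UINT64_C(0x100000000) / WAL_SEG_SIZE)

	/* Produces the 24-hex-digit name: timeline, high half, low half. */
	static void
	wal_file_for_lsn(uint32_t tli, uint64_t lsn, char fname[25])
	{
		uint64_t	segno = lsn / WAL_SEG_SIZE;

		snprintf(fname, 25, "%08X%08X%08X",
				 (unsigned int) tli,
				 (unsigned int) (segno / SEGS_PER_XLOGID),
				 (unsigned int) (segno % SEGS_PER_XLOGID));
	}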


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-24 Thread Aleksander Alekseev
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

This patch looks good to me. As I understand it, we have both a complete feature 
and a consensus in the thread here. If there are no objections, I'm marking this 
patch as "Ready for Committer".

The new status of this patch is: Ready for Committer

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] increasing the default WAL segment size

2017-03-24 Thread Stephen Frost
Jeff,

* Jeff Janes (jeff.ja...@gmail.com) wrote:
> On Thu, Mar 23, 2017 at 1:45 PM, Peter Eisentraut <
> peter.eisentr...@2ndquadrant.com> wrote:
> > On 3/22/17 17:33, David Steele wrote:
> > > and I doubt that most tool writers would be quick to
> > > add support for a feature that very few people (if any) use.
> >
> > I'm not one of those tool writers, although I have written my share of
> > DBA scripts over the years, but I wonder why those tools would really
> > care.  They are handed files with predetermined names to archive, and
> > for restore files with predetermined names are requested back from them.
> >  What else do they need?  If something is missing that requires them to
> > parse file names, then maybe that should be added.
> 
> I have a pg_restore which predicts the file 5 files ahead of the one it was
> asked for, and initiates a pre-fetch and decompression of it. Then it
> delivers the file it was asked for, either by pulling it out of the
> pre-staging area set up by the N-5th invocation, or by going directly to
> the archive to get it.  This speeds up play-back dramatically when the
> files are stored compressed and non-local.

Ah, yes, that is on our road-map for pgBackrest to do also, along with
parallel WAL fetch which also needs to figure out the WAL names before
being asked for them.

We do already have parallel push, which also needs to figure out what
the upcoming file names are going to be so we can find them and push
them when they're indicated as ready in archive_status.  Perhaps we
could just push whatever is ready and remember everything that was
pushed for when PG asks, but that is really not ideal.

> That is why I need to know how the files are numbered.  I don't think that
> that makes much of a difference, though.  Any change is going to break
> that, no matter which change.  Then I'll fix it.

Right, but the discussion here is actually about the idea that we're
going to encourage people to use the initdb-time option to change the
WAL size, meaning you'll need to deal with different WAL sizes and
different naming due to that, and then we're going to turn around in the
very next release and break the naming, meaning you'll have to adjust
your tools first for the different possible WAL sizes in PG10 and then
adjust again for the different naming in PG11.

I'm trying to suggest that if we're going to do this that, perhaps, we
should try to make both the changes in one release instead of across
two.

> If we are going to break it, I'd prefer to just do away with the 'segment'
> thing altogether.  You have timelines, and you have files.  That's it.

I'm not sure I follow this proposal.  We have to know which WAL file has
which LSN in it; how do you do that with just 'timelines and files'?

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Suppress Clang 3.9 warnings

2017-03-24 Thread Aleksander Alekseev
Hi Tom,

Since no one seems to be particularly excited about this patch I'm
marking it as "Returned with feedback" to save reviewers time.

On Wed, Mar 15, 2017 at 12:21:21PM -0400, Tom Lane wrote:
> Noah Misch  writes:
> > On Wed, Mar 15, 2017 at 10:57:15AM -0400, Tom Lane wrote:
> >> Seems like the correct solution is to
> >> absorb that fix, either by updating to a newer autoconf release or by
> >> carrying our own version of AC_CHECK_DECLS until they come out with one.
> 
> > As you mention upthread, that Autoconf commit is still newer than every
> > Autoconf release.  (Last release was 58 months ago.)  Altering configure.ac 
> > to
> > work around the bug would be reasonable, but it feels heavy relative to the
> > benefit of suppressing some warnings.
> 
> It does seem like rather a lot of work, but I think it's preferable to
> hacking up the coding in port.h.  Mainly because we could booby-trap the
> substitute AC_CHECK_DECLS to make sure we revert it whenever autoconf 2.70
> does materialize (a check on m4_PACKAGE_VERSION, like the one at
> configure.in line 22, ought to do the trick); whereas I do not think
> we'd remember to de-kluge port.h if we kluge around it there.
> 
> I'm fine with leaving it alone, too.
> 
>   regards, tom lane

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


Re: [HACKERS] Backend crash on non-exclusive backup cancel

2017-03-24 Thread Michael Paquier
On Fri, Mar 24, 2017 at 7:58 PM, Teodor Sigaev  wrote:
> Thank you all, pushed.

Thanks.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] scram and \password

2017-03-24 Thread Michael Paquier
On Fri, Mar 24, 2017 at 8:36 PM, Heikki Linnakangas  wrote:
> On 03/23/2017 06:41 AM, Michael Paquier wrote:
>>
>> And after a lookup the failure is here:
>> -   result = get_role_password(port->user_name, &shadow_pass, logdetail);
>> +   shadow_pass = get_role_password(port->user_name, logdetail);
>> if (result == STATUS_OK)
>> result is never setup in this code path, so that may crash.
>
> Ah, of course. For some reason, I had -Wno-maybe-uninitialized in my
> configure command line. Without that, gcc even warns about that.
>
> Fixed, and pushed. Thanks!

OK, cool.

In order to close this thread, I propose to reuse the patches I sent
here to make scram_build_verifier() available to frontends:
https://www.postgresql.org/message-id/CAB7nPqT4yc3u8wspYkWbG088Ndp6asMH3=zb___ck89ctvz...@mail.gmail.com

And on top of it, modify \password so that it generates an md5 verifier
for pre-9.6 servers and a scram one for post-10 servers by looking at
the backend version of the current connection. What do you think?
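
For what it's worth, a rough sketch of what that version check could look 
like on the client side.  PQserverVersion() and PQencryptPassword() are 
existing libpq API; scram_build_verifier() is the function proposed above 
for export to frontends, so its signature here is only an assumption:

	/* hypothetical psql helper */
	static char *
	encrypt_password_for_server(PGconn *conn, const char *user,
								const char *password)
	{
		if (PQserverVersion(conn) >= 100000)	/* v10 or later */
			return scram_build_verifier(user, password, 0); /* assumed */
		else
			return PQencryptPassword(password, user);	/* md5 */
	}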
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] scram and \password

2017-03-24 Thread Heikki Linnakangas

On 03/24/2017 03:02 PM, Michael Paquier wrote:

In order to close this thread, I propose to reuse the patches I sent
here to make scram_build_verifier() available to frontends:
https://www.postgresql.org/message-id/CAB7nPqT4yc3u8wspYkWbG088Ndp6asMH3=zb___ck89ctvz...@mail.gmail.com

And on top of it, modify \password so that it generates an md5 verifier
for pre-9.6 servers and a scram one for post-10 servers by looking at
the backend version of the current connection. What do you think?


Yep, sounds like a plan.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-24 Thread Amit Kapila
On Thu, Mar 23, 2017 at 3:54 PM, Pavan Deolasee
 wrote:
> Thanks Amit. v19 addresses some of the comments below.
>
> On Thu, Mar 23, 2017 at 10:28 AM, Amit Kapila 
> wrote:
>>
>> On Wed, Mar 22, 2017 at 4:06 PM, Amit Kapila 
>> wrote:
>> > On Tue, Mar 21, 2017 at 6:47 PM, Pavan Deolasee
>> >  wrote:
>
>>
>> 5.
>> +btrecheck(Relation indexRel, IndexInfo *indexInfo, IndexTuple indexTuple,
>> + Relation heapRel, HeapTuple heapTuple)
>> {
>> ..
>> + if (!datumIsEqual(values[i - 1], indxvalue, att->attbyval,
>> + att->attlen))
>> ..
>> }
>>
>> Will this work if the index is using non-default collation?
>>
>
> Not sure I understand that. Can you please elaborate?
>

I was worried about the case where the index is created with a
non-default collation: will datumIsEqual() suffice the need?  Thinking
about it again, I think it will, because the index tuple stores the
value exactly as it is in the heap tuple.  However, today it occurred
to me: how will this work for toasted index values (index value >
TOAST_INDEX_TARGET)?  It is mentioned on top of datumIsEqual() that it
probably won't work for toasted values.  Have you considered that
point?
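
(For illustration only, one conceivable shape of a fix: detoast varlena 
values before the byte-wise comparison.  PG_DETOAST_DATUM and datumIsEqual 
are existing backend APIs; "equal", "values", "indxvalue" and "att" follow 
the names in the hunk quoted above, and whether this is the right fix for 
btrecheck is of course for the patch authors to decide.)

	if (att->attlen == -1)	/* varlena: may be compressed in the index */
	{
		Datum	hval = PointerGetDatum(PG_DETOAST_DATUM(values[i - 1]));
		Datum	ival = PointerGetDatum(PG_DETOAST_DATUM(indxvalue));

		equal = datumIsEqual(hval, ival, att->attbyval, att->attlen);
	}
	else
		equal = datumIsEqual(values[i - 1], indxvalue,
							 att->attbyval, att->attlen);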

>>
>> 6.
>> +++ b/src/backend/access/nbtree/nbtxlog.c
>> @@ -390,83 +390,9 @@ btree_xlog_vacuum(XLogReaderState *record)
>> -#ifdef UNUSED
>>   xl_btree_vacuum *xlrec = (xl_btree_vacuum *) XLogRecGetData(record);
>>
>>   /*
>> - * This section of code is thought to be no longer needed, after analysis
>> - * of the calling paths. It is retained to allow the code to be
>> reinstated
>> - * if a flaw is revealed in that thinking.
>> - *
>> ..
>>
>> Why does this patch need to remove the above code under #ifdef UNUSED
>>
>
> Yeah, it isn't strictly necessary. But that dead code was coming in the way
> and hence I decided to strip it out. I can put it back if it's an issue or
> remove that as a separate commit first.
>

I think it is better to keep unrelated changes out of the patch.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Declarative partitioning optimization for large amount of partitions

2017-03-24 Thread Aleksander Alekseev
Also I would like to share some benchmark results.

Before applying any patches (copied from one of previous messages):

```
latency average = 1153.495 ms
latency stddev = 154.366 ms
tps = 6.061104 (including connections establishing)
tps = 6.061211 (excluding connections establishing)
```

After applying first patch (copied as well):

```
latency average = 853.862 ms
latency stddev = 112.270 ms
tps = 8.191861 (including connections establishing)
tps = 8.192028 (excluding connections establishing)
```

After applying both patches:

```
latency average = 533.646 ms
latency stddev = 86.311 ms
tps = 13.110328 (including connections establishing)
tps = 13.110608 (excluding connections establishing)
```

Small number of partitions (2 to be exact), no patches:

```
latency average = 0.928 ms
latency stddev = 0.296 ms
tps = 7535.224897 (including connections establishing)
tps = 7536.145457 (excluding connections establishing)
```

Small number of partitions, both patches applied:

```
latency average = 0.915 ms
latency stddev = 0.269 ms
tps = 7638.344922 (including connections establishing)
tps = 7639.164769 (excluding connections establishing)
```

As you can see these patches don't make things worse in any regard.

Perf shows that both bottlenecks are gone. Before [1] and after [2].

[1] http://afiskon.ru/s/00/2008c4ae66_temp.png
[2] http://afiskon.ru/s/a5/fd81628a3a_temp.png

On Fri, Mar 24, 2017 at 03:17:44PM +0300, Aleksander Alekseev wrote:
> Hi Anastasia,
> 
> Thanks a lot for a review!
> 
> As was mentioned above there is also a bottleneck in find_all_inheritors
> procedure. Turned out the problem there is similar and it could be easily
> fixed as well. Corresponding patch is attached to this message. To keep
> things in order I'm attaching the previous patch as well.

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


Re: [HACKERS] increasing the default WAL segment size

2017-03-24 Thread David Steele

On 3/23/17 4:45 PM, Peter Eisentraut wrote:

On 3/22/17 17:33, David Steele wrote:

I think if we don't change the default size it's very unlikely I would
use alternate WAL segment sizes or recommend that anyone else does, at
least in v10.

I simply don't think it would get the level of testing required to be
production worthy


I think we could tweak the test harnesses to run all the tests with
different segment sizes.  That would get pretty good coverage.


I would want to see 1,16,64 at a minimum.  More might be nice but that 
gets a bit ridiculous at some point.  I would be fine with different 
critters having different defaults.  I don't think that each critter 
needs to test each value.



More generally, the methodology that we should not add an option unless
we also change the default because the option would otherwise not get
enough testing is a bit dubious.


Generally, I would agree, but I think this is a special case.  This 
option has been around for a long time and we are just now exposing it 
in a way that's useful to end users.  It's easy to see how various 
assumptions may have arisen around the default and led to code that is 
not quite right when using different values.  Even if that's not true in 
the backend code, it might affect bin, and certainly affects third party 
tools.


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] create_unique_path and GEQO

2017-03-24 Thread Tom Lane
Ashutosh Bapat  writes:
>> Do you have test case, which can reproduce the issue you
>> explained above?

> No. It would require some surgery in standard_planner() to measure the
> memory consumed in the planner context OR build the code with
> SHOW_MEMORY_STATS defined and dump memory context statistics and check
> planner context memory usage. I don't think I can produce a testcase
> quickly right now. But then, I think the problem is quite apparent
> from the code inspection alone, esp. comparing what mark_dummy_rel()
> does with what create_unique_path() is doing.

Yeah.  I think the code in mark_dummy_rel is newer and better-thought-out
than what's in create_unique_path.  It probably makes sense to change over.
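
For reference, the idiom mark_dummy_rel uses is roughly the following (a 
sketch, not the committed code); the point is that the path is built in the 
same context the RelOptInfo lives in, so it survives GEQO's temporary 
planning contexts:

	MemoryContext	oldcontext;
	UniquePath	   *pathnode;

	/* allocate in the context the rel itself was allocated in */
	oldcontext = MemoryContextSwitchTo(GetMemoryChunkContext(rel));

	pathnode = makeNode(UniquePath);
	/* ... fill in the UniquePath fields ... */

	MemoryContextSwitchTo(oldcontext);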

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication existing data copy

2017-03-24 Thread Peter Eisentraut
On 3/23/17 19:32, Petr Jelinek wrote:
> Yes, I also forgot to check if the table actually exists on subscriber
> when fetching them in CREATE SUBSCRIPTION (we have check during
> replication but not there).

I think for this we can probably just change the missing_ok argument of
RangeVarGetRelid() to false.

Unless we want the custom error message, in which case we need to change
AlterSubscription_refresh(), because right now it errors out because
missing_ok = false.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning optimization for large amount of partitions

2017-03-24 Thread Simon Riggs
On 1 March 2017 at 01:36, Amit Langote  wrote:

> I don't know which way you're thinking of fixing this, but a planner patch
> to implement faster partition-pruning will have taken care of this, I
> think.  As you may know, even declarative partitioned tables currently
> depend on constraint exclusion for partition-pruning and planner's current
> approach of handling inheritance requires to open all the child tables
> (partitions), whereas the new approach hopefully shouldn't need to do
> that.  I am not sure if looking for a more localized fix for this would be
> worthwhile, although I may be wrong.

What "new approach" are we discussing?

Is there a patch or design discussion?

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication existing data copy

2017-03-24 Thread Petr Jelinek
On 24/03/17 15:05, Peter Eisentraut wrote:
> On 3/23/17 19:32, Petr Jelinek wrote:
>> Yes, I also forgot to check if the table actually exists on subscriber
>> when fetching them in CREATE SUBSCRIPTION (we have check during
>> replication but not there).
> 
> I think for this we can probably just change the missing_ok argument of
> RangeVarGetRelid() to false.
> 
> Unless we want the custom error message, in which case we need to change
> AlterSubscription_refresh(), because right now it errors out because
> missing_ok = false.
> 

You are right, stupid me.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
>From 81c7750095cfac35fad9a841309b2c5501e59a62 Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Fri, 24 Mar 2017 00:24:47 +0100
Subject: [PATCH 2/2] Check that published table exists on subscriber

---
 src/backend/commands/subscriptioncmds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 8f201b2..efe70b4 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -416,7 +416,7 @@ CreateSubscription(CreateSubscriptionStmt *stmt, bool isTopLevel)
 RangeVar   *rv = (RangeVar *) lfirst(lc);
 Oid			relid;
 
-relid = RangeVarGetRelid(rv, AccessShareLock, true);
+relid = RangeVarGetRelid(rv, AccessShareLock, false);
 
 SetSubscriptionRelState(subid, relid, table_state,
 		InvalidXLogRecPtr);
-- 
2.7.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_serial early wraparound

2017-03-24 Thread Anastasia Lubennikova
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

Hi, I've tried to review this patch, but it seems that I'm missing something 
essential.
You claim that SLRUs now support five-digit segment names, while in slru.h
on current master I see the following:

 * Note: slru.c currently assumes that segment file names will be four hex
 * digits.  This sets a lower bound on the segment size (64K transactions
 * for 32-bit TransactionIds).
 */
#define SLRU_PAGES_PER_SEGMENT  32

/* Maximum length of an SLRU name */
#define SLRU_MAX_NAME_LENGTH32

Could you please clarify the idea of the patch? Is it still relevant?

I've also run your test script.
pg_clog was renamed to pg_xact, so it needs to be changed accordingly:
echo "Contents of pg_clog:"
  ls $PGDATA/pg_xact/


The test shows failed assertion:

== setting next xid to 1073741824 =
Transaction log reset
waiting for server to start2017-03-24 17:05:19.897 MSK [1181] LOG:  
listening on IPv4 address "127.0.0.1", port 5432
2017-03-24 17:05:19.981 MSK [1181] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.5432"
2017-03-24 17:05:20.081 MSK [1183] LOG:  database system was shut down at 
2017-03-24 17:05:19 MSK
2017-03-24 17:05:20.221 MSK [1181] LOG:  database system is ready to accept 
connections
 done
server started
vacuumdb: vacuuming database "postgres"
vacuumdb: vacuuming database "template0"
vacuumdb: vacuuming database "template1"
TRAP: FailedAssertion("!(TransactionIdPrecedesOrEquals(oldestXact, 
ShmemVariableCache->oldestXid))", File: "clog.c", Line: 669)
vacuumdb: vacuuming of database "template1" failed: server closed the 
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2017-03-24 17:05:21.541 MSK [1181] LOG:  server process (PID 1202) was 
terminated by signal 6: Aborted
2017-03-24 17:05:21.541 MSK [1181] DETAIL:  Failed process was running: VACUUM 
(FREEZE);

The new status of this patch is: Waiting on Author

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should we cacheline align PGXACT?

2017-03-24 Thread Simon Riggs
On 10 March 2017 at 13:08, Alexander Korotkov 
wrote:

Results look good for me.  Idea of committing both of patches looks
> attractive.
>

I'll commit mine since I understand what it does. I'll look at the other
one also, but won't commit yet.


> We have pretty much acceleration for read-only case and small acceleration
> for read-write case.
> I'll run benchmark on 72-cores machine as well.
>

No need to wait for that.

-- 
Simon Riggs                http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Adding the optional clause 'AS' in CREATE TRIGGER

2017-03-24 Thread David Steele

Hi Naoki,

On 3/17/17 11:58 AM, Surafel Temesgen wrote:


I am sending the review of this patch


This thread has been idle for a week.  Please respond and/or post a new 
patch by 2017-03-28 00:00 AoE (UTC-12) or this submission will be marked 
"Returned with Feedback".


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] kNN for SP-GiST

2017-03-24 Thread David Steele

On 3/21/17 11:45 AM, David Steele wrote:

Hi Nikita,

On 3/9/17 8:52 AM, Alexander Korotkov wrote:


I take a look to this patchset.  My first notes are following.


This thread has been idle for quite a while.  Please respond and/or post
a new patch by 2017-03-24 00:00 AoE (UTC-12) or this submission will be
marked "Returned with Feedback".


This submission has been marked "Returned with Feedback".  Please feel 
free to resubmit to a future commitfest.


Regards,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-03-24 Thread David Steele

Hi Ivan,

On 3/21/17 1:06 PM, David Steele wrote:

Hi Ivan,

On 3/12/17 10:20 PM, Thomas Munro wrote:

On Fri, Mar 10, 2017 at 1:49 AM, Ivan Kartyshov
 wrote:

Here I attached rebased patch waitlsn_10dev_v3 (core feature)
I will leave the choice of implementation (core/contrib) to the
discretion
of the community.

Will be glad to hear your suggestion about syntax, code and patch.


Hi Ivan,

Here is some feedback based on a first read-through of the v4 patch.
I haven't tested it yet.


This thread has been idle for over a week.  Please respond and/or post a
new patch by 2017-03-24 00:00 AoE (UTC-12) or this submission will be
marked "Returned with Feedback".


This submission has been marked "Returned with Feedback".  Please feel 
free to resubmit to a future commitfest.


Regards,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan [take-2]

2017-03-24 Thread David Steele

Hi Kaigai,

On 3/21/17 1:11 PM, David Steele wrote:

On 3/13/17 3:25 AM, Jeevan Chalke wrote:


I have reviewed this patch further and here are my comments:


This thread has been idle for over a week.  Please respond and/or post a
new patch by 2017-03-24 00:00 AoE (UTC-12) or this submission will be
marked "Returned with Feedback".


This submission has been marked "Returned with Feedback".  Please feel 
free to resubmit to a future commitfest.


Regards,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Potential data loss of 2PC files

2017-03-24 Thread Teodor Sigaev

And the renaming of pg_clog to pg_xact is also my fault. Attached is
an updated patch.


Thank you. One more question: what about symlinks? If a DBA moves, for example, 
pg_xact to another disk and leaves a symlink in the data directory, an fsync 
on the symlink will presumably do nothing.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Generic type subscripting

2017-03-24 Thread David Steele

Hi Dmitry,

On 3/21/17 4:42 PM, Dmitry Dolgov wrote:

On 21 March 2017 at 18:16, David Steele 
> wrote:


This thread has been idle for over a week.


Yes, sorry for the late reply. I'm still trying to find a better solution
for some of the problems that arose in this patch.


On 15 March 2017 at 00:10, Tom Lane  wrote:

> Dmitry Dolgov <9erthali...@gmail.com> writes:

> I looked through this a bit.


Do you have an idea when you will have a patch ready?  We are now into 
the last week of the commitfest.  I see one question for Tom, but it's 
not clear that this would prevent you from producing a new patch.


Please post a new patch by 2017-03-28 00:00 AoE (UTC-12) or this 
submission will be marked "Returned with Feedback".


Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Fix slot name change handling for subscriptions

2017-03-24 Thread Petr Jelinek
Hi,

ALTER SUBSCRIPTION ... WITH (SLOT NAME = foo) will make the worker die
with an error about the subscription changing unexpectedly. It's my
oversight in the original logical replication patch-set. The attached
patch fixes it to behave the same way as other changes to subscription
options.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
>From aaa54e5a608987449542da65fa029afa73d4373b Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Tue, 21 Mar 2017 22:04:57 +0100
Subject: [PATCH] Handle change of slotname in logical replication apply

---
 src/backend/replication/logical/worker.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c2ccab7..fc01cd3 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -1352,6 +1352,21 @@ reread_subscription(void)
 	}
 
 	/*
+	 * We need to make a new connection to the new slot if the slot name
+	 * has changed, so exit here as well in that case.
+	 */
+	if (strcmp(newsub->slotname, MySubscription->slotname) != 0)
+	{
+		ereport(LOG,
+				(errmsg("logical replication worker for subscription \"%s\" will "
+						"restart because the replication slot name was changed",
+						MySubscription->name)));
+
+		walrcv_disconnect(wrconn);
+		proc_exit(0);
+	}
+
+	/*
 	 * Exit if publication list was changed. The launcher will start
 	 * new worker.
 	 */
@@ -1383,8 +1398,7 @@ reread_subscription(void)
 	}
 
 	/* Check for other changes that should never happen too. */
-	if (newsub->dbid != MySubscription->dbid ||
-		strcmp(newsub->slotname, MySubscription->slotname) != 0)
+	if (newsub->dbid != MySubscription->dbid)
 	{
 		elog(ERROR, "subscription %u changed unexpectedly",
 			 MyLogicalRepWorker->subid);
-- 
2.7.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication origin tracking fix

2017-03-24 Thread Petr Jelinek
On 10/03/17 05:59, Petr Jelinek wrote:
> Hi,
> 
> while discussing with Craig issues around restarting logical replication
> stream related to the patch he posted [1], I realized that we track
> wrong origin LSN in the logical replication apply.
> 
> We currently track commit_lsn, which is the *start* of the commit record;
> what we need to track is end_lsn, which is the *end* of the commit record.
> Otherwise we might request a transaction that was already replayed if the
> subscriber instance has crashed right after commit.
> 
> Attached patch fixes that.
> 

Rebase after table copy patch got committed.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
>From 2a312a4ecce09aab6c72f5f039c49b2300b3f3a4 Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Thu, 9 Mar 2017 12:54:43 +0100
Subject: [PATCH] Fix remote position tracking in logical replication

We need to set the origin remote position to end_lsn, not commit_lsn,
as commit_lsn is the start of the commit record, and we use the origin
remote position as the start position when restarting the replication
stream. If we used commit_lsn, we could request data that we had already
received from the remote server after a crash of the downstream server.
---
 src/backend/replication/logical/worker.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index bbf3506..c2ccab7 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -421,9 +421,6 @@ apply_handle_begin(StringInfo s)
 
 	logicalrep_read_begin(s, &begin_data);
 
-	replorigin_session_origin_timestamp = begin_data.committime;
-	replorigin_session_origin_lsn = begin_data.final_lsn;
-
 	remote_final_lsn = begin_data.final_lsn;
 
 	in_remote_transaction = true;
@@ -443,14 +440,18 @@ apply_handle_commit(StringInfo s)
 
 	logicalrep_read_commit(s, &commit_data);
 
-	Assert(commit_data.commit_lsn == replorigin_session_origin_lsn);
-	Assert(commit_data.committime == replorigin_session_origin_timestamp);
-
 	Assert(commit_data.commit_lsn == remote_final_lsn);
 
 	/* The synchronization worker runs in single transaction. */
 	if (IsTransactionState() && !am_tablesync_worker())
 	{
+		/*
+		 * Update origin state so we can restart streaming from correct
+		 * position in case of crash.
+		 */
+		replorigin_session_origin_lsn = commit_data.end_lsn;
+		replorigin_session_origin_timestamp = commit_data.committime;
+
 		CommitTransactionCommand();
 
 		store_flush_position(commit_data.end_lsn);
-- 
2.7.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical replication apply to run with sync commit off by default

2017-03-24 Thread Petr Jelinek
On 21/03/17 22:37, Petr Jelinek wrote:
> On 21/03/17 18:54, Robert Haas wrote:
>> On Mon, Mar 20, 2017 at 7:56 PM, Petr Jelinek
>>  wrote:
>>> On 18/03/17 13:31, Petr Jelinek wrote:
 On 07/03/17 06:23, Petr Jelinek wrote:
> there has been discussion at the logical replication initial copy thread
> [1] about making apply work with sync commit off by default for
> performance reasons and adding option to change that per subscription.
>
> Here I propose patch to implement this - it adds boolean column
> subssynccommit to pg_subscription catalog which decides if
> synchronous_commit should be off or local for apply. And it adds
> SYNCHRONOUS_COMMIT = boolean to the list of WITH options for CREATE and
> ALTER SUBSCRIPTION. When nothing is specified it will set it to false.
>
> The patch is built on top of copy patch currently as there are conflicts
> between the two and this helps a bit with testing of copy patch.
>
> [1]
> https://www.postgresql.org/message-id/CA+TgmoY7Lk2YKArcp4O=Qu=xoor8j71mad1oteojawmuje3...@mail.gmail.com
>

 I rebased this patch against recent changes and the latest version of
 copy patch.
>>>

Rebase after table copy patch got committed.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
>From 4e56f693a0b82d116930a03d63c7d034ef004f25 Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Mon, 6 Mar 2017 13:07:45 +0100
Subject: [PATCH] Add option to modify sync commit per subscription

This also changes default behaviour of subscription workers to
synchronous_commit = off
---
 doc/src/sgml/catalogs.sgml | 11 +++
 doc/src/sgml/ref/alter_subscription.sgml   |  2 ++
 doc/src/sgml/ref/create_subscription.sgml  | 12 
 src/backend/catalog/pg_subscription.c  |  1 +
 src/backend/commands/subscriptioncmds.c| 49 --
 src/backend/replication/logical/launcher.c |  2 +-
 src/backend/replication/logical/worker.c   | 10 ++
 src/bin/pg_dump/pg_dump.c  | 12 +++-
 src/bin/pg_dump/pg_dump.h  |  1 +
 src/bin/pg_dump/t/002_pg_dump.pl   |  2 +-
 src/bin/psql/describe.c|  5 ++-
 src/include/catalog/pg_subscription.h  | 11 ---
 src/test/regress/expected/subscription.out | 27 
 src/test/regress/sql/subscription.sql  |  3 +-
 14 files changed, 116 insertions(+), 32 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c531c73..9e072ea 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6414,6 +6414,17 @@
  
 
  
+  subsynccommit
+  bool
+  
+  
+   If true, the apply for the subscription will run with
+   synchronous_commit set to local.
+   Otherwise it will have it set to off.
+  
+ 
+
+ 
   subconninfo
   text
   
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 6f94247..572e370 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -26,6 +26,8 @@ ALTER SUBSCRIPTION name WITH ( where suboption can be:
 
 SLOT NAME = slot_name
+SLOT NAME = slot_name
+| SYNCHRONOUS_COMMIT = boolean
 
 ALTER SUBSCRIPTION name SET PUBLICATION publication_name [, ...] { REFRESH WITH ( puboption [, ... ] ) | NOREFRESH }
 ALTER SUBSCRIPTION name REFRESH PUBLICATION WITH ( puboption [, ... ] )
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 8f3c30b..479f4e4 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -32,6 +32,7 @@ CREATE SUBSCRIPTION subscription_nameslot_name
 | COPY DATA | NOCOPY DATA
+| SYNCHRONOUS_COMMIT = boolean
 | NOCONNECT
 
  
@@ -148,6 +149,17 @@ CREATE SUBSCRIPTION subscription_name
 

+SYNCHRONOUS_COMMIT = boolean
+
+ 
+  The value of this parameter overrides the
+  synchronous_commit setting.
+  The default value is false.
+ 
+
+   
+
+   
 NOCONNECT
 
  
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index e420ec1..e7a1634 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -68,6 +68,7 @@ GetSubscription(Oid subid, bool missing_ok)
 	sub->name = pstrdup(NameStr(subform->subname));
 	sub->owner = subform->subowner;
 	sub->enabled = subform->subenabled;
+	sub->synccommit = subform->subsynccommit;
 
 	/* Get conninfo */
 	datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 0784ca7..8f201b2 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -60,12 +60,13 @@ static List *fetch_table_list(

Re: [HACKERS] Index usage for elem-contained-by-const-range clauses

2017-03-24 Thread David Steele

Hi Pritam,

On 3/17/17 5:41 PM, Pritam Baral wrote:


So sorry. I'm attaching the correct version of the original with this,
in case you want to test the limited implementation, because I still
have to go through Tom's list of suggestions.

BTW, the patch is for applying on top of REL9_6_2, and while I
suspect it may work on master too, I haven't tested it since the
original submission (Feb 23).


Also, I noticed that the patch doesn't have regression tests.  Some mention
of this optimization in the docs is also nice to have.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


This thread has been idle for a week.  Please respond and/or post a new 
patch by 2017-03-28 00:00 AoE (UTC-12) or this submission will be marked 
"Returned with Feedback".


Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Logical replication SnapBuildInitalSnapshot spelling

2017-03-24 Thread Marko Tiikkaja
Hi,

Commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920 seems to have introduced an
alternative spelling of "initial".  Fixed in the attached.


.m


logical_inital.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_wal_write statistics view

2017-03-24 Thread David Steele

On 3/16/17 1:54 AM, vinayak wrote:


On 2017/03/16 14:46, Haribabu Kommi wrote:


As the view name already contains WAL, I am not sure
whether it is required to include WAL in every column name.
I am fine with changing it if others share the opinion of
adding WAL to the column names.


Ok.


So what is the status of this patch now?  Should it be marked "Ready for 
Committer"?


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] Problem in using pgbench's --connect(-C) and --rate=rate(-R rate) options together.

2017-03-24 Thread Fabien COELHO


Hello Teodor,

Hi, the patch looks good, except: why do you remove the initialization of 
is_throttled? I suppose it's just a typo?


No, it is really needed so that the lag measure is correct.

Without the is_throttled change:

 sh> ./pgbench -T 3 -R 10 -C -S -P 1
 starting vacuum...end.
 progress: 1.0 s, 9.0 tps, lat 8.278 ms stddev 9.134, lag  2049638230412146.250 
ms
 progress: 2.0 s, 12.0 tps, lat 4.897 ms stddev 3.961, lag 2.219 ms
 transaction type: 
 scaling factor: 1
 query mode: simple
 number of clients: 1
 number of threads: 1
 duration: 3 s
 number of transactions actually processed: 33
 latency average = 5.602 ms
 latency stddev = 5.820 ms
 rate limit schedule lag: avg 558992244657859.500 (max 18446744073709264.000) ms
 tps = 11.024559 (including connections establishing)
 tps = 12.183456 (excluding connections establishing)

With the is_throttled change:

 ./pgbench -T 3 -R 10 -C -S -P 1
 starting vacuum...end.
 progress: 1.0 s, 11.0 tps, lat 3.742 ms stddev 2.161, lag 1.658 ms
 progress: 2.0 s, 7.0 tps, lat 2.985 ms stddev 0.496, lag 0.276 ms
 transaction type: 
 scaling factor: 1
 query mode: simple
 number of clients: 1
 number of threads: 1
 duration: 3 s
 number of transactions actually processed: 25
 latency average = 3.312 ms
 latency stddev = 1.518 ms
 rate limit schedule lag: avg 0.894 (max 7.031) ms
 tps = 8.398353 (including connections establishing)
 tps = 9.069456 (excluding connections establishing)

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LWLock optimization for multicore Power machines

2017-03-24 Thread David Steele

Hi Alexander,

On 3/16/17 1:35 PM, David Steele wrote:

On 2/21/17 9:54 AM, Bernd Helmle wrote:

Am Dienstag, den 14.02.2017, 15:53 +0300 schrieb Alexander Korotkov:

+1
And you could try to use pg_wait_sampling
 to sampling of wait
events.


I've tried this with your example from your blog post[1] and got this:

(pgbench scale 1000)

pgbench -Mprepared -S -n -c 100 -j 100 -T 300 -P2 pgbench2

SELECT-only:

SELECT * FROM profile_log ;
             ts             |  event_type   |     event     | count 
----------------------------+---------------+---------------+-------
 2017-02-21 15:21:52.45719  | LWLockNamed   | ProcArrayLock |     8
 2017-02-21 15:22:11.19594  | LWLockTranche | lock_manager  |     1
 2017-02-21 15:22:11.19594  | LWLockNamed   | ProcArrayLock |    24
 2017-02-21 15:22:31.220803 | LWLockNamed   | ProcArrayLock |     1
 2017-02-21 15:23:01.255969 | LWLockNamed   | ProcArrayLock |     1
 2017-02-21 15:23:11.272254 | LWLockNamed   | ProcArrayLock |     2
 2017-02-21 15:23:41.313069 | LWLockNamed   | ProcArrayLock |     1
 2017-02-21 15:24:31.37512  | LWLockNamed   | ProcArrayLock |    19
 2017-02-21 15:24:41.386974 | LWLockNamed   | ProcArrayLock |     1
 2017-02-21 15:26:41.530399 | LWLockNamed   | ProcArrayLock |     1
(10 rows)

writes pgbench runs have far more events logged, see the attached text
file. Maybe this is of interest...


[1] http://akorotkov.github.io/blog/2016/03/25/wait_monitoring_9_6/


This patch applies cleanly at cccbdde.  It doesn't break compilation on
amd64, but I can't test on a Power-based machine.

Alexander, have you had a chance to look at Bernd's results?


I'm marking this submission "Waiting for Author" as your input seems to 
be required.


This thread has been idle for a week.  Please respond and/or post a new 
patch by 2017-03-28 00:00 AoE (UTC-12) or this submission will be marked 
"Returned with Feedback".


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-24 Thread David Steele

Hi Vaishnavi,

On 3/19/17 9:32 PM, Vaishnavi Prabakaran wrote:

On Fri, Mar 17, 2017 at 12:37 AM, Daniel Verite 

Daniel, any input here?


> I would also like to hear Craig's opinion on it before applying this fix
> to the original patch, just to make sure am not missing anything here.


Craig?



Am going to include the test which you shared in the test patch. Please
let me know if you want to cover any more specific cases to gain
confidence.


I have marked this submission "Waiting for Author" since it appears a 
new patch is required based on the input above and the new test.


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning optimization for large amount of partitions

2017-03-24 Thread Aleksander Alekseev
Hi Simon,

> > I don't know which way you're thinking of fixing this, but a planner patch
> > to implement faster partition-pruning will have taken care of this, I
> > think.  As you may know, even declarative partitioned tables currently
> > depend on constraint exclusion for partition-pruning and planner's current
> > approach of handling inheritance requires to open all the child tables
> > (partitions), whereas the new approach hopefully shouldn't need to do
> > that.  I am not sure if looking for a more localized fix for this would be
> > worthwhile, although I may be wrong.
> 
> What "new approach" are we discussing?
> Is there a patch or design discussion?

I think what was meant was my colleague Dmitry Ivanov's plan to
implement partition-pruning. I've just spoken with Dmitry about this
matter. Unless someone else is already working on this optimization,
we would like to work on it together. Unfortunately there is no patch
or design discussion of partition-pruning in this commitfest.

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


Re: [HACKERS] [PATCH] Generic type subscripting

2017-03-24 Thread Dmitry Dolgov
On 24 March 2017 at 15:39, David Steele  wrote:
>
> Do you have an idea when you will have a patch ready?

Yes, I'll prepare a new version with the most important changes in two days.


Re: [HACKERS] cast result of copyNode()

2017-03-24 Thread David Steele

On 3/21/17 6:52 PM, Mark Dilger wrote:



On Mar 21, 2017, at 2:13 PM, David Steele  wrote:

Hi Mark,

On 3/9/17 3:34 PM, Peter Eisentraut wrote:

On 3/7/17 18:27, Mark Dilger wrote:

You appear to be using a #define macro to wrap a function of the same name
with the code:

#define copyObject(obj) ((typeof(obj)) copyObject(obj))


Yeah, that's a bit silly.  Here is an updated version that changes that.


Do you know when you'll have a chance to take a look at the updated patch?


The patch applies cleanly, compiles, and passes all the regression tests
for me on my laptop.  Peter appears to have renamed the function copyObject
as copyObjectImpl, which struck me as odd when I first saw it, but I don't have
a better name in mind, so that seems ok.

If the purpose of this patch is to avoid casting so many things down to
(Node *), perhaps some additional work along the lines of the patch I'm
attaching is appropriate.  (This patch applies on top of Peter's v2 patch.)
The idea is to keep objects as (Expr *) where appropriate, rather than
casting through (Node *) quite so much.

I'm not certain whether this is (a) merely a bad idea, (b) a different idea
than what Peter is proposing, and as such should be submitted independently,
or (c) something that ought to be included in Peter's patch prior to commit.
I only applied this idea to one file, and maybe not completely in that file,
because I'd like feedback before going any further along these lines.


I have marked this "Waiting on Author" pending Peter's input.

--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-24 Thread Tom Lane
Another modest proposal:

I'm not really sold on the approach of using EEOP_FETCHSOME opcodes to
trigger initial tupleslot de-forming.  Certainly we want to have a single
slot_getsomeattrs call per source slot, but as-is, we need a separate
traversal over the expression tree just to precompute the max attribute
number needed.  That can't be good for expression compile speed, and
it introduces nontrivial bug risks because the code that does that
is completely decoupled from the code that emits the EEOP_VAR opcodes
(which are what's really relying on the de-forming to have happened).

What do you think about a design like this:

* Drop the FETCHSOME opcodes.

* Add fields to struct ExprState that will hold the maximum inner,
outer, and scan attribute numbers needed.

* ExecInitExpr initializes those fields to zero, and then during
ExecInitExprRec, whenever we generate an EEOP_VAR opcode, we do e.g.

state->last_inner_attno = Max(state->last_inner_attno,
  variable->varattno);

* ExecInitExprSlots, get_last_attnums_walker, etc all go away.

* In the startup segment of ExecInterpExpr, add

if (state->last_inner_attno > 0)
slot_getsomeattrs(innerslot, state->last_inner_attno);
if (state->last_outer_attno > 0)
slot_getsomeattrs(outerslot, state->last_outer_attno);
if (state->last_scan_attno > 0)
slot_getsomeattrs(scanslot, state->last_scan_attno);

This would be a little different from the existing code as far as runtime
branch-prediction behavior goes, but it's not apparent to me that it'd be
any worse.  Also, for JIT purposes it'd still be entirely possible to
compile the slot_getsomeattrs calls in-line; you'd just be looking to a
different part of the ExprState struct to find out what to do.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Generic type subscripting

2017-03-24 Thread Tom Lane
David Steele  writes:
> Do you have an idea when you will have a patch ready?  We are now into 
> the last week of the commitfest.  I see one question for Tom, but it's 
> not clear that this would prevent you from producing a new patch.

FWIW, I'm up to my eyeballs in Andres' faster-expressions patch, and
won't have time to think about this one for at least a couple more
days.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Declarative partitioning optimization for large amount of partitions

2017-03-24 Thread Teodor Sigaev

things in order I'm attaching the previous patch as well.


Patches look good, but I have a few notes:

1 step1: Why do you need TabStatHashEntry at all? TabStatHashEntry.t_id is never 
read, so the hash entry could be just a pointer to PgStat_TableStatus.


2 step1: In pgstat_report_stat() you remove entries from the hash one by one 
until you have removed them all. Isn't it better to hash_destroy/hash_create, 
or even to let the hash live in a separate memory context and just reset it? 
Something like the sketch below.
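
(A minimal sketch of that suggestion, using the existing dynahash API; the 
lazy re-creation relies on make_sure_stat_tab_initialized() from the patch:)

	/* In pgstat_report_stat(), instead of one HASH_REMOVE per entry: */
	hash_destroy(pgStatTabHash);
	pgStatTabHash = NULL;	/* rebuilt lazily by get_tabstat_entry() */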


3 step1: Again, in pgstat_report_stat(), all-zero entries aren't deleted from 
the hash although they are free from the point of view of pgStatTabList.



4 step2: The same as 1) applies to SeenRelsEntry->rel_id, but it isn't even 
initialized anywhere.




--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - allow to store select results into variables

2017-03-24 Thread Fabien COELHO


Hello Rafia,


if (my_command->argc > 2)
+ syntax_error(source, lineno, my_command->line, my_command->argv[0],
+ "at most on argument expected", NULL, -1);

I suppose you mean 'one' argument here.


Indeed.


Apart from that indentation is not correct as per pgindent, please check.


I guess that you are referring to switch/case indentation, which my emacs 
does not do as expected.


Please find attached a v8 which hopefully fixes these two issues.

--
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 1eee8dc..1108fcb 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -816,6 +816,51 @@ pgbench  options  dbname
   
 
   
+   
+
+ \gcset [prefix] or
+ \gset [prefix]
+
+
+
+ 
+  These commands may be used to end SQL queries, replacing a semicolon.
+  \gcset replaces an embedded semicolon (\;) within
+  a compound SQL command, and \gset replaces the final
+  semicolon (;) that ends the SQL command.
+ 
+
+ 
+  When these commands are used, the preceding SQL query is expected to
+  return one row, the columns of which are stored into variables named after
+  column names, and prefixed with prefix if provided.
+ 
+
+ 
+  The following example puts the final account balance from the first query
+  into variable abalance, and fills variables
+  one, two and p_three with
+  integers from a compound query.
+
+UPDATE pgbench_accounts
+  SET abalance = abalance + :delta
+  WHERE aid = :aid
+  RETURNING abalance \gset
+-- compound of two queries
+SELECT 1 AS one, 2 AS two \gcset
+SELECT 3 AS three \gset p_
+
+ 
+
+ 
+  
+\gcset and \gset commands do not work when
+empty SQL queries appear within a compound SQL command.
+  
+ 
+
+   
+

 
  \set varname expression
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 78f1e6b..ac9289b 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -375,11 +375,14 @@ static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 
 typedef struct
 {
-	char	   *line;			/* text of command line */
+	char	   *line;			/* first line for short display */
+	char	   *lines;			/* full multi-line text of command */
 	int			command_num;	/* unique index of this Command struct */
 	int			type;			/* command type (SQL_COMMAND or META_COMMAND) */
 	int			argc;			/* number of command words */
 	char	   *argv[MAX_ARGS]; /* command word list */
+	int			compound;   /* last compound command (number of \;) */
+	char	  **gset;   /* per-compound command prefix */
 	PgBenchExpr *expr;			/* parsed expression, if needed */
 	SimpleStats stats;			/* time spent in this command */
 } Command;
@@ -1225,6 +1228,105 @@ getQueryParams(CState *st, const Command *command, const char **params)
 		params[i] = getVariable(st, command->argv[i + 1]);
 }
 
+/* read all responses from backend */
+static bool
+read_response(CState *st, char **gset)
+{
+	PGresult   *res;
+	int			compound = -1;
+
+	while ((res = PQgetResult(st->con)) != NULL)
+	{
+		compound += 1;
+
+		switch (PQresultStatus(res))
+		{
+			case PGRES_COMMAND_OK: /* non-SELECT commands */
+			case PGRES_EMPTY_QUERY: /* may be used for testing no-op overhead */
+				if (gset[compound] != NULL)
+				{
+					fprintf(stderr,
+							"client %d file %d command %d compound %d: "
+							"\\gset expects a row\n",
+							st->id, st->use_file, st->command, compound);
+					st->ecnt++;
+					return false;
+				}
+				break; /* OK */
+
+			case PGRES_TUPLES_OK:
+				if (gset[compound] != NULL)
+				{
+					/* store result into variables */
+					int ntuples = PQntuples(res),
+						nfields = PQnfields(res),
+						f;
+
+					if (ntuples != 1)
+					{
+						fprintf(stderr,
+								"client %d file %d command %d compound %d: "
+								"expecting one row, got %d\n",
+								st->id, st->use_file, st->command, compound, ntuples);
+						st->ecnt++;
+						PQclear(res);
+						discard_response(st);
+						return false;
+					}
+
+					for (f = 0; f < nfields; f++)
+					{
+						char *varname = PQfname(res, f);
+						if (*gset[compound] != '\0')
+							varname = psprintf("%s%s", gset[compound], varname);
+
+						/* store result as a string */
+						if (!putVariable(st, "gset", varname,
+										 PQgetvalue(res, 0, f)))
+						{
+							/* internal error, should it rather abort? */
+							fprintf(stderr,
+									"client %d file %d command %d compound %d: "
+									"error storing into var %s\n",
+									st->id, st->use_file, st->command, compound,
+									varname);
+							st->ecnt++;
+							PQclear(res);
+							discard_response(st);
+							return false;
+						}
+
+						if (*gset[compound] != '\0')
+							free(varname);
+					}
+				}
+				break;	/* OK */
+
+			default:
+				/* everything else is unexpected, so probably an error */
+				fprintf(stderr,
+						"client %d file %d aborted in command %d

Re: [HACKERS] Supporting huge pages on Windows

2017-03-24 Thread David Steele

Hi Ashutosh,

On 3/22/17 8:52 AM, Amit Kapila wrote:

On Wed, Mar 22, 2017 at 12:07 AM, David Steele  wrote:


Amit, Magnus, you are signed up as reviewers for this patch.  Do you know
when you'll have a chance to take a look?



I have provided my feedback and I could not test it on my machine.  I
think Ashutosh Sharma has tested it.  I can give it another look, but
not sure if it helps.


I know you are not signed up as a reviewer, but have you taken a look at 
this patch?


Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should we cacheline align PGXACT?

2017-03-24 Thread Robert Haas
On Fri, Mar 24, 2017 at 10:12 AM, Simon Riggs  wrote:
> On 10 March 2017 at 13:08, Alexander Korotkov 
> wrote:
>> Results look good for me.  Idea of committing both of patches looks
>> attractive.
>
> I'll commit mine since I understand what it does. I'll look at the other one
> also, but won't commit yet.

Might've been nice to include reviewers, testers, and a link to the
mailing list discussion in the commit message.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Generic type subscripting

2017-03-24 Thread Arthur Zakirov

On 24.03.2017 18:29, Tom Lane wrote:

David Steele  writes:

Do you have an idea when you will have a patch ready?  We are now into
the last week of the commitfest.  I see one question for Tom, but it's
not clear that this would prevent you from producing a new patch.


FWIW, I'm up to my eyeballs in Andres' faster-expressions patch, and
won't have time to think about this one for at least a couple more
days.

regards, tom lane




I can try to review a new version of the patch when Dmitry sends it, 
if no one objects. Besides, I've seen the previous versions of the 
patch from the previous commitfest.


--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Declarative partitioning optimization for large amount of partitions

2017-03-24 Thread Teodor Sigaev

Sorry, 1) and 4) are my fault; the comment in hsearch.h says:
* ... The hash key
* is expected to be at the start of the caller's hash entry data structure.

Oops, forgot that.
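
In other words, a dynahash entry must lead with its key, so it cannot be
reduced to a bare pointer. A minimal sketch of the required layout (the
types are the patch's own; the comments here are added for illustration):

typedef struct TabStatHashEntry
{
	Oid					t_id;		/* hash key; must come first per hsearch.h */
	PgStat_TableStatus *tsa_entry;	/* payload: pointer into pgStatTabList */
} TabStatHashEntry;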

Teodor Sigaev wrote:

things in order I'm attaching the previous patch as well.


Patches look good, but I have some notices:

1 step1 Why do you need TabStatHashEntry at all? TabStatHashEntry.t_id is never
read back, so the hash entry could be just a pointer to PgStat_TableStatus.

2 step1 In pgstat_report_stat() you remove entries from the hash one by one
until all are removed. Isn't it better to hash_destroy/hash_create, or even to
let the hash live in a separate memory context and just reset that?

3 step1 Again, in pgstat_report_stat(), all-zero entries aren't deleted from the
hash although they will be freed from the point of view of pgStatTabList.


4 step 2. The same as 1) about SeenRelsEntry->rel_id, but it isn't even
initialized anywhere.





--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] exposing wait events for non-backends (was: Tracking wait event for latches)

2017-03-24 Thread Robert Haas
On Thu, Mar 23, 2017 at 7:29 AM, Michael Paquier
 wrote:
> On Thu, Mar 23, 2017 at 8:19 PM, Kuntal Ghosh
>  wrote:
>> Hence, to be consistent with others, bgworker processes can be
>> initialized from BackgroundWorkerInitializeConnectionBy[Oid].
>
> Yeah, I am fine with that. Thanks for the updated versions.

I think this is still not good.  The places where pgstat_bestart() has
been added are not even correct.  For example, the call added to
BackgroundWriterMain() occurs after the section that does
error-recovery, so it would get repeated every time the background
writer recovers from an error.  There are similar problems elsewhere.
Furthermore, although in theory there's an idea here that we're making
it no longer the responsibility of InitPostgres() to call
pgstat_bestart(), the patch as proposed only removes one of the two
calls, so we really don't even have a consistent practice.  I think
it's better to go with the idea of having InitPostgres() be
responsible for calling this for regular backends, and
AuxiliaryProcessMain() for auxiliary backends.  That involves
substantially fewer calls to pgstat_bestart() and they are spread
across only two functions, which IMHO makes fewer bugs of omission a
lot less likely.

Updated patch with that change attached.  In addition to that
modification, I made some other alterations:

- I changed the elog() for the can't-happen case in pgstat_bestart()
from PANIC to FATAL.  The contents of shared memory aren't corrupted,
so I think PANIC is overkill.

- I tweaked the comment in WalSndLoop() just before the
pgstat_report_activity() call to accurately reflect what the effect of
that call now is.

- I changed the column ordering in pg_stat_get_activity() to put
backend_type with the other columns that appear in pg_stat_activity,
rather than (as the patch did) grouping it with the ones that appear
in pg_stat_ssl.

- I modified the code to tolerate a NULL return from
AuxiliaryPidGetProc().  I am pretty sure that without that there's a
race condition that could lead to a crash if somebody tried to call
this function just as an auxiliary process was terminating.

- I updated the documentation slightly.

- I rebased over some conflicting commits.

If there aren't objections, I plan to commit this version.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


more-pg-stat-activity.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Append implementation

2017-03-24 Thread Amit Khandekar
On 24 March 2017 at 00:38, Amit Khandekar  wrote:
> On 23 March 2017 at 16:26, Amit Khandekar  wrote:
>> On 23 March 2017 at 05:55, Robert Haas  wrote:
>>>
>>> So in your example we do this:
>>>
>>> C[0] += 20;
>>> C[1] += 16;
>>> C[2] += 10;
>>> /* C[2] is smaller than C[0] or C[1] at this point, so we add the next
>>> path to C[2] */
>>> C[2] += 8;
>>> /* after the previous line, C[1] is now the smallest, so add to that
>>> entry next */
>>> C[1] += 3;
>>> /* now we've got C[0] = 20, C[1] = 19, C[2] = 18, so add to C[2] */
>>> C[2] += 1;
>>> /* final result: C[0] = 20, C[1] = 19, C[2] = 19 */
>>>
>>> Now we just take the highest entry that appears in any array, which in
>>> this case is C[0], as the total cost.
>>
>> Wow. The way your final result exactly tallies with my algorithm
>> result is very interesting. This looks like some maths or computer
>> science theory that I am not aware of.
>>
>> I am currently coding the algorithm using your method.
>

> While I was coding this, I was considering whether Path->rows should
> also be calculated the way total cost is, i.e. separately for
> non-partial and partial subpaths. I think for rows, we can just take
> total_rows divided by workers for non-partial paths, and this
> approximation should suffice. It looks odd for rows to be treated with
> the same algorithm we chose for total cost of non-partial paths.

Attached is the patch v12, where the Path->rows calculation for non-partial
paths is kept separate from the way total cost is computed for non-partial
paths. rows for non-partial paths is calculated as total_rows divided
by the number of workers, as an approximation. The rows for partial paths
are then just added one by one.

>
> Meanwhile, attached is a WIP patch v10. The only change in this patch
> w.r.t. the last patch (v9) is that this one has a new function defined
> append_nonpartial_cost(). Just sending this to show how the algorithm
> looks like; haven't yet called it.

Now append_nonpartial_cost() is used, and it is tested.
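
For illustration, here is a small standalone C sketch of the greedy bucket
scheme quoted above. The names are invented and this is not the patch's
actual append_nonpartial_cost(); it assumes the subpath costs are already
sorted in descending order:

#include <stdio.h>

#define MAX_WORKERS 64

/*
 * Each non-partial subpath cost goes into whichever worker bucket
 * currently has the smallest total; the estimate is the largest
 * bucket once all costs are placed.
 */
static double
greedy_nonpartial_cost(const double *costs, int ncosts, int nworkers)
{
	double		bucket[MAX_WORKERS] = {0};
	double		max_cost = 0.0;

	for (int i = 0; i < ncosts; i++)
	{
		int			min_w = 0;

		for (int w = 1; w < nworkers; w++)
			if (bucket[w] < bucket[min_w])
				min_w = w;
		bucket[min_w] += costs[i];
	}
	for (int w = 0; w < nworkers; w++)
		if (bucket[w] > max_cost)
			max_cost = bucket[w];
	return max_cost;
}

int
main(void)
{
	double		costs[] = {20, 16, 10, 8, 3, 1};

	/* reproduces the worked example above: prints 20 */
	printf("%g\n", greedy_nonpartial_cost(costs, 6, 3));
	return 0;
}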

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company


ParallelAppend_v12.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

2017-03-24 Thread Robert Haas
On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs  wrote:
> Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
>
> For normal commits and aborts we already reset PgXact->xmin
> Avoiding touching highly contented shmem improves concurrent
> performance.
>
> Simon Riggs

I'm getting occasional crashes with backtraces that look like this:

#0  0x7fff9679c286 in __pthread_kill ()
#1  0x7fff94e1a9f9 in pthread_kill ()
#2  0x7fff9253a9a3 in abort ()
#3  0x000107e0659e in ExceptionalCondition (conditionName=, errorType=0x6 , fileName=, lineNumber=) at assert.c:54
#4  0x000107e4be2b in AtEOXact_Snapshot (isCommit=<value temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
snapmgr.c:1154
#5  0x000107a76c06 in CleanupTransaction () at xact.c:2643
#6  0x000107a76267 in CommitTransactionCommand () at xact.c:2818
#7  0x000107cecfc2 in exec_simple_query
(query_string=0x7f975481e640 "ABORT TRANSACTION") at postgres.c:2461
#8  0x000107ceabb7 in PostgresMain (argc=, argv=, dbname=, username=) at postgres.c:4071
#9  0x000107c6bb58 in PostmasterMain (argc=, argv=) at postmaster.c:4317
#10 0x000107be5cdd in main (argc=, argv=) at main.c:228

I suspect that is the fault of this patch.  Please fix or revert.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Declarative partitioning optimization for large amount of partitions

2017-03-24 Thread Aleksander Alekseev
Hi Teodor,

Thanks a lot for a review!

> > step1 In pgstat_report_stat() you remove entries from the hash one by one
> > until all are removed. Isn't it better to hash_destroy/hash_create, or
> > even to let the hash live in a separate memory context and just reset
> > that?

Agree, fixed.

> > step1 Again, in pgstat_report_stat(), all-zero entries aren't deleted from
> > the hash although they will be freed from the point of view of pgStatTabList.

Good point! Fixed.

-- 
Best regards,
Aleksander Alekseev
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index b704788eb5..4060f241e2 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -161,6 +161,20 @@ typedef struct TabStatusArray
 static TabStatusArray *pgStatTabList = NULL;
 
 /*
+ * pgStatTabHash entry
+ */
+typedef struct TabStatHashEntry
+{
+	Oid t_id;
+	PgStat_TableStatus* tsa_entry;
+} TabStatHashEntry;
+
+/*
+ * Hash table for O(1) t_id -> tsa_entry lookup
+ */
+static HTAB *pgStatTabHash = NULL;
+
+/*
  * Backends store per-function info that's waiting to be sent to the collector
  * in this hash table (indexed by function OID).
  */
@@ -824,6 +838,12 @@ pgstat_report_stat(bool force)
 	}
 
 	/*
+	 * pgStatTabHash is outdated at this point, so we have to clean it.
+	 */
+	hash_destroy(pgStatTabHash);
+	pgStatTabHash = NULL;
+
+	/*
 	 * Send partial messages.  Make sure that any pending xact commit/abort
 	 * gets counted, even if there are no table stats to send.
 	 */
@@ -1668,59 +1688,87 @@ pgstat_initstats(Relation rel)
 }
 
 /*
+ * Make sure pgStatTabList and pgStatTabHash are initialized.
+ */
+static void
+make_sure_stat_tab_initialized(void)
+{
+	HASHCTL			ctl;
+	MemoryContext	new_ctx;
+
+	if(!pgStatTabList)
+	{
+		/* This is the first time this procedure is called */
+		pgStatTabList = (TabStatusArray *) MemoryContextAllocZero(TopMemoryContext,
+sizeof(TabStatusArray));
+	}
+
+	if(pgStatTabHash)
+		return;
+
+	/* Hash table was freed or never existed.  */
+
+	new_ctx = AllocSetContextCreate(
+		TopMemoryContext,
+		"PGStatLookupHashTableContext",
+		ALLOCSET_DEFAULT_SIZES);
+
+	memset(&ctl, 0, sizeof(ctl));
+	ctl.keysize = sizeof(Oid);
+	ctl.entrysize = sizeof(TabStatHashEntry);
+	ctl.hcxt = new_ctx;
+
+	pgStatTabHash = hash_create("pgstat t_id to tsa_entry lookup hash table",
+		TABSTAT_QUANTUM, &ctl, HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+}
+
+/*
  * get_tabstat_entry - find or create a PgStat_TableStatus entry for rel
  */
 static PgStat_TableStatus *
 get_tabstat_entry(Oid rel_id, bool isshared)
 {
+	TabStatHashEntry* hash_entry;
 	PgStat_TableStatus *entry;
 	TabStatusArray *tsa;
-	TabStatusArray *prev_tsa;
-	int			i;
+	bool found;
+
+	make_sure_stat_tab_initialized();
 
 	/*
-	 * Search the already-used tabstat slots for this relation.
+	 * Find an entry or create a new one.
 	 */
-	prev_tsa = NULL;
-	for (tsa = pgStatTabList; tsa != NULL; prev_tsa = tsa, tsa = tsa->tsa_next)
+	hash_entry = hash_search(pgStatTabHash, &rel_id, HASH_ENTER, &found);
+	if(found)
+		return hash_entry->tsa_entry;
+
+	/*
+	 * `hash_entry` was just created and now we have to fill it.
+	 * First make sure there is free space in the last element of pgStatTabList.
+	 */
+	tsa = pgStatTabList;
+	while(tsa->tsa_used == TABSTAT_QUANTUM)
 	{
-		for (i = 0; i < tsa->tsa_used; i++)
+		if(tsa->tsa_next == NULL)
 		{
-			entry = &tsa->tsa_entries[i];
-			if (entry->t_id == rel_id)
-return entry;
+			tsa->tsa_next = (TabStatusArray *) MemoryContextAllocZero(TopMemoryContext,
+		sizeof(TabStatusArray));
 		}
 
-		if (tsa->tsa_used < TABSTAT_QUANTUM)
-		{
-			/*
-			 * It must not be present, but we found a free slot instead. Fine,
-			 * let's use this one.  We assume the entry was already zeroed,
-			 * either at creation or after last use.
-			 */
-			entry = &tsa->tsa_entries[tsa->tsa_used++];
-			entry->t_id = rel_id;
-			entry->t_shared = isshared;
-			return entry;
-		}
+		tsa = tsa->tsa_next;
 	}
 
 	/*
-	 * We ran out of tabstat slots, so allocate more.  Be sure they're zeroed.
-	 */
-	tsa = (TabStatusArray *) MemoryContextAllocZero(TopMemoryContext,
-	sizeof(TabStatusArray));
-	if (prev_tsa)
-		prev_tsa->tsa_next = tsa;
-	else
-		pgStatTabList = tsa;
-
-	/*
-	 * Use the first entry of the new TabStatusArray.
+	 * Add an entry.
 	 */
 	entry = &tsa->tsa_entries[tsa->tsa_used++];
 	entry->t_id = rel_id;
 	entry->t_shared = isshared;
+
+	/*
+	 * Add a corresponding entry to pgStatTabHash.
+	 */
+	hash_entry->tsa_entry = entry;
 	return entry;
 }
 
@@ -1732,22 +1780,19 @@ get_tabstat_entry(Oid rel_id, bool isshared)
 PgStat_TableStatus *
 find_tabstat_entry(Oid rel_id)
 {
-	PgStat_TableStatus *entry;
-	TabStatusArray *tsa;
-	int			i;
+	TabStatHashEntry* hash_entry;
 
-	for (tsa = pgStatTabList; tsa != NULL; tsa = tsa->tsa_next)
-	{
-		for (i = 0; i < tsa->tsa_used; i++)
-		{
-			entry = &tsa->tsa_entries[i];
-			if (entry->t_id == rel_id)
-return entry;
-		}
-	}
+	/*
+	 * There are no entries at a

Re: [HACKERS] Re: [BUGS] Problem in using pgbench's --connect(-C) and --rate=rate(-R rate) options together.

2017-03-24 Thread Teodor Sigaev

No, it is really needed so that the lag measure is correct.

Thank you, pushed

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Monitoring roles patch

2017-03-24 Thread Robert Haas
On Fri, Mar 24, 2017 at 8:30 AM, Stephen Frost  wrote:
>> But why not do it with GRANTs in the first place then?
>
> This is akin to asking why do we need GRANT ALL and ALTER DEFAULT PRIVs.

Not really.  ALTER DEFAULT PRIVILEGES affects what happens for future
objects, which isn't necessary here at all.  The monitoring tool only
needs to be granted the minimum set of privileges that it currently
needs, not future privileges which it or other monitoring tools may
need in future versions.  GRANT ALL is closer, but GRANT .. ON ALL
TABLES IN SCHEMA really is just a convenience function.  You would
still have the capability to do the exact same thing without that
function; it would just be more work.  And the same is true here.  I
understand why you and Dave want to make this a single command: that's
easier.  But, like GRANT ON ALL TABLES, it's also less flexible.  If
you want to grant on all tables except one, you can't use that magic
syntax any more, and so here.  pg_monitor will either target one
particular monitoring solution (hey, maybe it'll be the one my
employer produces!) or the judgement of one particular person
(whatever Stephen Frost thinks it SHOULD do in a given release,
independent of what tools on the ground do) and those aren't good
definitions.

> If we make the users run all the statements individually then they'll
> also have to get an updated script for the next version of PG too
> because we will have added things that the tools will want access to.

That's possible, but it really depends on the tool, not on core
PostgreSQL.  The tool should be the one providing the script, because
the tool is what knows its own permissions requirements.  Users who
care about security won't want to grant every privilege given by a
pg_monitor role to a tool that only needs a subset of those
privileges.

> Further, they'll have to make sure to install all the contrib modules
> they want to use before running the GRANT script which is provided, or
> deal with GRANT'ing the rights out after installing some extension.

That doesn't sound very hard.

> With the approach that Dave and I are advocating, we can avoid all of
> that.  Contrib modules can bake-in GRANTs to the appropriate roles,
> upgrades can be handled smoothly even when we add new capabilities which
> are appropriate, users have a simple and straight-forward way to set up
> good monitoring, and tool authors will know what permissions are
> available and can finally have a better answer than "well, just make the
> monitor user superuser if you want to avoid all these complexities."

They can have that anyway.  They just have to run a script provided by
the tool rather than one provided by the core project as a
one-size-fits-all solution for every tool.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

2017-03-24 Thread Robert Haas
On Fri, Mar 24, 2017 at 12:14 PM, Robert Haas  wrote:
> On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs  wrote:
>> Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
>>
>> For normal commits and aborts we already reset PgXact->xmin
>> Avoiding touching highly contented shmem improves concurrent
>> performance.
>>
>> Simon Riggs
>
> I'm getting occasional crashes with backtraces that look like this:
>
> #4  0x000107e4be2b in AtEOXact_Snapshot (isCommit=<value temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
> snapmgr.c:1154
> #5  0x000107a76c06 in CleanupTransaction () at xact.c:2643
>
> I suspect that is the fault of this patch.  Please fix or revert.

Also, the entire buildfarm is turning red.

longfin, spurfowl, and magpie all show this assertion failure in the
log.  I haven't checked the others.

TRAP: FailedAssertion("!(MyPgXact->xmin == 0)", File: "snapmgr.c", Line: 1154)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)

2017-03-24 Thread Robert Haas
On Fri, Mar 24, 2017 at 2:27 AM, Craig Ringer  wrote:
> On 24 March 2017 at 02:29, Robert Haas  wrote:
>> On Tue, Mar 21, 2017 at 11:35 PM, Craig Ringer  wrote:
>>> Changes made per discussion.
>>
>> Committed 0001.
>
> Much appreciated.
>
> Here's the 2nd patch rebased on top of master, with the TAP test
> included this time. Looks ready to go.

Committed.

> I really appreciate the time you've taken to look at this. Point me at
> anything from your team you want some outside review on.

Thanks, that is a valuable offer which I will be pleased to accept!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel "return query" is no good

2017-03-24 Thread Robert Haas
On Thu, Mar 23, 2017 at 1:53 PM, Alvaro Herrera
 wrote:
> Robert Haas wrote:
>> I guess the downside of back-patching this is that it could cause a
>> plan change for somebody which ends up being worse.  On the whole,
>> serial execution of queries intended to be run in parallel isn't
>> likely to work out well, but it's always possible somebody has a cases
>> where it happens to be winning, and this could break it.  So maybe I
>> should do this only in master?  Thoughts?
>
> I think that the chances of someone depending on a parallel plan running
> serially by accident which is better than the non-parallel plan, are
> pretty slim.
>
> +1 for back-patching.

All right, done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-24 Thread Fabien COELHO


Hello Corey,


v24 highlights:

- finally using git format-patch
- all conditional slash commands broken out into their own functions
(exec_command_$NAME) , each one tests if it's in an active branch, and if
it's not it consumes the same number of parameters, but discards them.
comments for each slash-command family were left as-is, may need to be
bulked up.
- TAP tests discarded in favor of one ON_ERROR_STOP test for invalid \elif
placement
- documentation recommending ON_ERROR_STOP removed, because invalid
expressions no longer throw off if-endif balance
- documentation updated to reflect that contextually-correct-but-invalid
boolean expressions are treated as false
- psql_get_variable has a passthrough void pointer now, but I ended up not
needing it. Instead, all slash commands in false blocks either fetch
OT_NO_EVAL or OT_WHOLE_LINE options. If I'm missing something, let me know.


A few comments about the patch.

Patch applies. "make check" ok.

As already pointed out, there is one useless file in the patch.

Although currently there is only one expected argument for boolean 
expressions, the n² concatenation algorithm in gather_boolean_expression 
is not very elegant. Is there some string buffer data structure which 
could be used instead?
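
One candidate already used throughout psql is libpq's PQExpBuffer (from
pqexpbuffer.h), which grows geometrically so appends stay amortized O(1).
A hypothetical sketch, with invented names, of gathering tokens that way:

#include "pqexpbuffer.h"

static char *
gather_tokens(char **tokens, int ntokens)
{
	PQExpBufferData buf;
	int			i;

	initPQExpBuffer(&buf);
	for (i = 0; i < ntokens; i++)
	{
		if (i > 0)
			appendPQExpBufferChar(&buf, ' ');
		appendPQExpBufferStr(&buf, tokens[i]);
	}
	return buf.data;			/* malloc'd; caller frees with free() */
}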


ISTM that ignore_boolean_expression may call free on a NULL pointer if the 
expression is empty?


Generally I find the per-command functions rather an improvement.

However there is an impact on testing because of all these changes. ISTM 
that test cases should reflect this situation and test that \cd, \edit, 
... are indeed ignored properly, taking into account their expected args...


In "exec_command_connect" an argument is changed from "-reuse-previous" to 
"-reuse-previous=", not sure why.


There seems to be pattern repetition for _ev _ef and _sf _sv. Would it 
make sense to create a function instead of keeping the initial copy-paste?


I think that some functions could be used for some repeated cases such as 
"discard one arg", "discard one or two arg", "discard whole line", for the 
various inactive branches, so as to factor out code.


I would suggest to put together all if-related backslash command, 
so that the stack management is all in one function instead of 4.


For pset the inactive branch does OT_NORMAL instead of OT_NOT_EVAL, not 
sure why.


--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Monitoring roles patch

2017-03-24 Thread Dave Page
On Fri, Mar 24, 2017 at 4:24 PM, Robert Haas  wrote:
>
>> If we make the users run all the statements individually then they'll
>> also have to get an updated script for the next version of PG too
>> because we will have added things that the tools will want access to.
>
> That's possible, but it really depends on the tool, not on core
> PostgreSQL.  The tool should be the one providing the script, because
> the tool is what knows its own permissions requirements.  Users who
> care about security won't want to grant every privilege given by a
> pg_monitor role to a tool that only needs a subset of those
> privileges.

The upshot of this would be that every time there's a database server
upgrade that changes the permissions required somehow, the user has to
login to every server they have and run a script. It is no longer a
one-time thing, which makes it vastly more painful to deal with
upgrades.

>> With the approach that Dave and I are advocating, we can avoid all of
>> that.  Contrib modules can bake-in GRANTs to the appropriate roles,
>> upgrades can be handled smoothly even when we add new capabilities which
>> are appropriate, users have a simple and straight-forward way to set up
>> good monitoring, and tool authors will know what permissions are
>> available and can finally have a better answer than "well, just make the
>> monitor user superuser if you want to avoid all these complexities."
>
> They can have that anyway.  They just have to run a script provided by
> the tool rather than one provided by the core project as a
> one-size-fits-all solution for every tool.

Do you object to having individual default roles specifically for
cases where there are superuser-only checks at the moment that prevent
GRANT? e.g. pg_show_all_settings for SHOW, pg_show_all_stats for
pg_tablespace_size and friends, pg_stat_statements for, well,
pg_stat_statements and so on? It would be an inferior solution in my
opinion, given that it would demonstrably cause users more work, but
at least we could do something.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG: pg_dump generates corrupted gzip file in Windows

2017-03-24 Thread Robert Haas
On Fri, Mar 24, 2017 at 6:44 AM, Kuntal Ghosh
 wrote:
>> AFAICU, if we use binary mode, output is stored bit by bit. In ASCII
>> mode, cmd pokes its nose and does CR / LF conversions on its own. So,
>> whenever we want compression on a plain-text dump file, we can set the
>> stdout mode to O_BINARY. Is it a wrong approach?
> With the help from Ashutosh Sharma, I tested this in Windows
> environment. Sadly, it still doesn't work. :( IMHO, we should document
> the issue somewhere.

Why not?  I mean, if there's code there to force the output into
binary mode, does that not work for the -Fc case?  And if it does work
for the -Fc case, then why doesn't it also work for -Z9?
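
For reference, forcing a stream into binary mode on Windows is normally done
with the CRT's _setmode(); a minimal sketch of that call (this is the
standard Windows API, not necessarily pg_dump's actual code path):

#ifdef WIN32
#include <stdio.h>
#include <io.h>
#include <fcntl.h>

/* Stop the Windows CRT from translating LF to CRLF on stdout. */
static void
set_stdout_binary(void)
{
	if (_setmode(_fileno(stdout), _O_BINARY) == -1)
		fprintf(stderr, "could not switch stdout to binary mode\n");
}
#endif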

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Supporting huge pages on Windows

2017-03-24 Thread Ashutosh Sharma
Hi,

On Fri, Mar 24, 2017 at 9:00 PM, David Steele  wrote:
> Hi Ashutosh,
>
> On 3/22/17 8:52 AM, Amit Kapila wrote:
>>
>> On Wed, Mar 22, 2017 at 12:07 AM, David Steele 
>> wrote:
>>>
>>>
>>> Amit, Magnus, you are signed up as reviewers for this patch.  Do you know
>>> when you'll have a chance to take a look?
>>>
>>
>> I have provided my feedback and I could not test it on my machine.  I
>> think Ashutosh Sharma has tested it.  I can give it another look, but
>> not sure if it helps.
>
>
> I know you are not signed up as a reviewer, but have you taken a look at this
> patch?

Well, I had a quick look into the patch just because I wanted to test
it, as I have a Windows setup. But I never looked into the patch
from a reviewer's perspective. The intention was to reverify the test
results shared by Tsunakawa.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Monitoring roles patch

2017-03-24 Thread Robert Haas
On Fri, Mar 24, 2017 at 12:46 PM, Dave Page  wrote:
> On Fri, Mar 24, 2017 at 4:24 PM, Robert Haas  wrote:
>>> If we make the users run all the statements individually then they'll
>>> also have to get an updated script for the next version of PG too
>>> because we will have added things that the tools will want access to.
>>
>> That's possible, but it really depends on the tool, not on core
>> PostgreSQL.  The tool should be the one providing the script, because
>> the tool is what knows its own permissions requirements.  Users who
>> care about security won't want to grant every privilege given by a
>> pg_monitor role to a tool that only needs a subset of those
>> privileges.
>
> The upshot of this would be that every time there's a database server
> upgrade that changes the permissions required somehow, the user has to
> login to every server they have and run a script. It is no longer a
> one-time thing, which makes it vastly more painful to deal with
> upgrades.

So, I might be all wet here, but I would have expected that changes on
the TOOL side would be vastly more frequent.  I mean, we do not change
what a certain builtin permission does very often.  If we add
pg_read_all_settings, what is the likelihood that the remit of that
role is ever going to change?  I would judge that what is vastly more
likely is that a new version of some tool would start needing that
privilege (or some other) where it didn't before.  If that happens,
and the script is on the tool side, then you just add a new line to
the script.  If the script is built into initdb, then you've got to
wait for the next major release before you can update the definition
of pg_monitor - and maybe argue with other tool authors with different
requirements.

>>> With the approach that Dave and I are advocating, we can avoid all of
>>> that.  Contrib modules can bake-in GRANTs to the appropriate roles,
>>> upgrades can be handled smoothly even when we add new capabilities which
>>> are appropriate, users have a simple and straight-forward way to set up
>>> good monitoring, and tool authors will know what permissions are
>>> available and can finally have a better answer than "well, just make the
>>> monior user superuser if you want to avoid all these complexities."
>>
>> They can have that anyway.  They just have to run a script provided by
>> the tool rather than one provided by the core project as a
>> one-size-fits-all solution for every tool.
>
> Do you object to having individual default roles specifically for
> cases where there are superuser-only checks at the moment that prevent
> GRANT? e.g. pg_show_all_settings for SHOW, pg_show_all_stats for
> pg_tablespace_size and friends, pg_stat_statements for, well,
> pg_stat_statements and so on? It would be an inferior solution in my
> opinion, given that it would demonstrably cause users more work, but
> at least we could do something.

I think pg_show_all_settings is brilliant.  It's narrowly targeted and
in no way redundant with what can anyway be done with GRANT otherwise.
As far as the others, I think that we should just let people GRANT
privileges one by one whenever that's possible, and use built-in roles
where it isn't.  So I'm fine with this:

-if (tblspcOid != MyDatabaseTableSpace)
+if (tblspcOid != MyDatabaseTableSpace &&
+!is_member_of_role(GetUserId(), DEFAULT_ROLE_READ_ALL_STATS))

But I don't like this:

+GRANT EXECUTE ON FUNCTION pgstattuple(text) TO pg_read_all_stats;

My position is that execute permission on a function is already
grantable, so granting it by default to a built-in role is just
removing flexibility which would otherwise be available to the user.
I do understand that in some cases that may result in a long list of
GRANT commands to make a particular monitoring tool work, but I think
the right solution is to package those commands in an extension or
script distributed with that tool.  When there's a permission that is
reasonably necessary for monitoring tools (or some other reasonable
purpose) and we can't arrange for that permission to be given via a
GRANT EXECUTE ON FUNCTION, then I support adding built-in roles for
those things.
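
A tool-provided grant script of the kind described here can be quite small.
A sketch, assuming the pg_read_all_stats role from the patch under
discussion; the role name is hypothetical and the privilege list is
per-tool:

CREATE ROLE acme_monitor LOGIN;
GRANT pg_read_all_stats TO acme_monitor;       -- built-in role where plain GRANT can't reach
GRANT EXECUTE ON FUNCTION pgstattuple(text) TO acme_monitor;  -- grantable one by one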

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] increasing the default WAL segment size

2017-03-24 Thread Beena Emerson
Hello,

On Wed, Mar 22, 2017 at 9:41 PM, Kuntal Ghosh 
wrote:

> On Wed, Mar 22, 2017 at 3:14 PM, Beena Emerson 
> wrote:
> > PFA an updated patch which fixes a minor bug I found. It only increases the
> > string size in pretty_wal_size function.
> > The 01-add-XLogSegmentOffset-macro.patch has also been rebased.
> Thanks for the updated versions. Here is a partial review of the patch:
> Thanks for the updated versions. Here is a partial review of the patch:
>
> In pg_standby.c and pg_waldump.c,
> + XLogPageHeader hdr = (XLogPageHeader) buf;
> + XLogLongPageHeader NewLongPage = (XLogLongPageHeader) hdr;
> +
> + XLogSegSize = NewLongPage->xlp_seg_size;
> It waits until the file is at least XLOG_BLCKSZ, then gets the
> expected final size from XLogPageHeader. This looks really clean
> compared to the previous approach.
>

thank you for testing. PFA the rebased patch incorporating your comments.


>
> + * Verify that the min and max wal_size meet the minimum requirements.
> Better to write min_wal_size and max_wal_size.
>

Updated wherever applicable.


>
> + errmsg("Insufficient value for \"min_wal_size\"")));
> "min_wal_size %d is too low" may be? Use lower case for error
> messages. Same for max_wal_size.
>

updated.


>
> + /* Set the XLogSegSize */
> + XLogSegSize = ControlFile->xlog_seg_size;
> +
> A call to IsValidXLogSegSize() will be good after this, no?
>

Is it necessary? We already have the CRC check for ControlFiles. Is that
not enough?
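
The check being discussed is a cheap structural sanity test on top of the
CRC. Its exact definition isn't shown in this thread, but such a macro is
typically of this shape (an assumption for illustration, not quoted from
the patch):

#define IsValidXLogSegSize(size) \
	(((size) & ((size) - 1)) == 0 &&	/* power of 2 */ \
	 (size) >= (1024 * 1024) && (size) <= (1024 * 1024 * 1024))

The CRC only shows the control file wasn't corrupted after it was written;
it doesn't by itself prove the stored value is one the server can use.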


>
> + /* Update variables using XLogSegSize */
> + check_wal_size();
> The method name looks somewhat misleading compared to the comment for
> it, doesn't it?
>

The method name is correct since it only checks whether the value provided is
sufficient (at least twice the segment size). I have updated the comment to say
"Check and update variables dependent on XLogSegSize".


> This patch introduces a new guc_unit having values in MB for
> max_wal_size and min_wal_size. I'm not sure about the upper limit
> which is set to INT_MAX for 32-bit systems as well. Is it needed to
> define something like MAX_MEGABYTES similar to MAX_KILOBYTES?
> It is worth mentioning that GUC_UNIT_KB can't be used in this case
> since MAX_KILOBYTES is INT_MAX / 1024(<2GB) on 32-bit systems. That's
> not a sufficient value for min_wal_size/max_wal_size.
>

You are right. I can use the same value as upper limit for GUC_UNIT_MB, we
could probably change the name of MAX_KILOBYTES to something more general
for both GUC_UNIT_MB and GUC_UNIT_KB. The max size in 32-bit systems would
be 2TB.


> While testing with pg_waldump, I got the following error.
> bin/pg_waldump -p master/pg_wal/ -s 0/0100
> Floating point exception (core dumped)
> Stack:
> #0  0x004039d6 in ReadPageInternal ()
> #1  0x00404c84 in XLogFindNextRecord ()
> #2  0x00401e08 in main ()
> I think that the problem is in following code:
> /* parse files as start/end boundaries, extract path if not specified */
> if (optind < argc)
> {
> 
> + if (!RetrieveXLogSegSize(full_path))
> ...
> }
> In this case, RetrieveXLogSegSize is conditionally called. So, if the
> condition is false, XLogSegSize will not be initialized.
>
>
pg_waldump code has been updated.




-- 

Beena Emerson

EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


02-initdb-walsegsize-v8.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Report the number of skipped frozen pages by manual VACUUM

2017-03-24 Thread Fujii Masao
On Thu, Mar 23, 2017 at 4:28 AM, Fujii Masao  wrote:
> On Wed, Mar 15, 2017 at 7:51 PM, Masahiko Sawada  
> wrote:
>> On Wed, Mar 15, 2017 at 1:09 PM, Yugo Nagata  wrote:
>>> On Fri, 10 Mar 2017 20:08:42 +0900
>>> Masahiko Sawada  wrote:
>>>
 On Fri, Mar 10, 2017 at 3:58 PM, Jim Nasby  wrote:
 > On 3/6/17 8:34 PM, Masahiko Sawada wrote:
 >>
 >> I don't think it can say "1 frozen pages" because the number of
 >> skipped pages according to visibility map is always more than 32
 >> (SKIP_PAGES_THRESHOLD).
 >
 >
 > That's just an artifact of how the VM currently works. I'm not a fan of
 > cross dependencies like that unless there's a pretty good reason.

 Thank you for the comment.
 Agreed. Attached fixed version patch.
>>>
>>> It seems that it says "frozen pages" if pinskipped_pages is not zero,
>>> rather than about frozenskipped_pages.
>>>
>>> How about writing as below?
>>>
>>> appendStringInfo(&buf, ngettext("Skipped %u page due to buffer pins, ",
>>>                                 "Skipped %u pages due to buffer pins, ",
>>>                                 vacrelstats->pinskipped_pages),
>>>                  vacrelstats->pinskipped_pages);
>>> appendStringInfo(&buf, ngettext("%u frozen page.\n", "%u frozen pages.\n",
>>>                                 vacrelstats->frozenskipped_pages),
>>>                  vacrelstats->frozenskipped_pages);
>>>
>>
>> Yeah, you're right. I've attached the updated version patch
>> incorporated your comment and fixing documentation.
>
> The patch looks very simple and good to me.
> Barring objection, I will commit this.

Pushed.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

2017-03-24 Thread Robert Haas
On Fri, Mar 24, 2017 at 12:27 PM, Robert Haas  wrote:
> On Fri, Mar 24, 2017 at 12:14 PM, Robert Haas  wrote:
>> On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs  wrote:
>>> Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
>>>
>>> For normal commits and aborts we already reset PgXact->xmin
>>> Avoiding touching highly contented shmem improves concurrent
>>> performance.
>>>
>>> Simon Riggs
>>
>> I'm getting occasional crashes with backtraces that look like this:
>>
>> #4  0x000107e4be2b in AtEOXact_Snapshot (isCommit=<value temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
>> snapmgr.c:1154
>> #5  0x000107a76c06 in CleanupTransaction () at xact.c:2643
>>
>> I suspect that is the fault of this patch.  Please fix or revert.
>
> Also, the entire buildfarm is turning red.
>
> longfin, spurfowl, and magpie all show this assertion failure in the
> log.  I haven't checked the others.
>
> TRAP: FailedAssertion("!(MyPgXact->xmin == 0)", File: "snapmgr.c", Line: 1154)

Another thing that is interesting is that when I run make -j8
check-world, the overall tests appear to succeed even though there are
failures mid-way through:

test tablefunc... FAILED (test process exited with exit code 2)

...but then later we end with:

ok
All tests successful.
Files=11, Tests=80, 251 wallclock secs ( 0.07 usr  0.02 sys + 19.77
cusr 14.45 csys = 34.31 CPU)
Result: PASS

real4m27.421s
user3m50.047s
sys1m31.937s

That's unrelated to the current problem of course, but it seems to
suggest that make's -j option doesn't entirely do what you'd expect
when used with make check-world.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

2017-03-24 Thread Simon Riggs
On 24 March 2017 at 16:14, Robert Haas  wrote:

> I suspect that is the fault of this patch.  Please fix or revert.

Will revert then fix.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

2017-03-24 Thread Andres Freund
On 2017-03-24 13:50:54 -0400, Robert Haas wrote:
> On Fri, Mar 24, 2017 at 12:27 PM, Robert Haas  wrote:
> > On Fri, Mar 24, 2017 at 12:14 PM, Robert Haas  wrote:
> >> On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs  
> >> wrote:
> >>> Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
> >>>
> >>> For normal commits and aborts we already reset PgXact->xmin
> >>> Avoiding touching highly contented shmem improves concurrent
> >>> performance.
> >>>
> >>> Simon Riggs
> >>
> >> I'm getting occasional crashes with backtraces that look like this:
> >>
> >> #4  0x000107e4be2b in AtEOXact_Snapshot (isCommit=<value temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
> >> snapmgr.c:1154
> >> #5  0x000107a76c06 in CleanupTransaction () at xact.c:2643
> >>
> >> I suspect that is the fault of this patch.  Please fix or revert.
> >
> > Also, the entire buildfarm is turning red.
> >
> > longfin, spurfowl, and magpie all show this assertion failure in the
> > log.  I haven't checked the others.
> >
> > TRAP: FailedAssertion("!(MyPgXact->xmin == 0)", File: "snapmgr.c", Line: 
> > 1154)
> 
> Another thing that is interesting is that when I run make -j8
> check-world, the overall tests appear to succeed even though there are
> failures mid-way through:
> 
> test tablefunc... FAILED (test process exited with exit code 
> 2)
> 
> ...but then later we end with:
> 
> ok
> All tests successful.
> Files=11, Tests=80, 251 wallclock secs ( 0.07 usr  0.02 sys + 19.77
> cusr 14.45 csys = 34.31 CPU)
> Result: PASS

> real4m27.421s
> user3m50.047s
> sys1m31.937s

> That's unrelated to the current problem of course, but it seems to
> suggest that make's -j option doesn't entirely do what you'd expect
> when used with make check-world.
> 

That's likely the output of a different test from the one that failed.
It's a lot easier to see the result if you're doing
&& echo success || echo failure

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-24 Thread Pavan Deolasee
On Fri, Mar 24, 2017 at 6:46 PM, Amit Kapila 
wrote:

>
>
> I was worried for the case if the index is created non-default
> collation, will the datumIsEqual() suffice the need.  Now again
> thinking about it, I think it will because in the index tuple we are
> storing the value as in heap tuple.  However today it occurred to me
> how will this work for toasted index values (index value >
> TOAST_INDEX_TARGET).  It is mentioned on top of datumIsEqual() that it
> probably won't work for toasted values.  Have you considered that
> point?
>
>
No, I haven't, and thanks for bringing that up. And now that I think more
about it and see the code, I think the naive way of just comparing index
attribute values against heap values is probably wrong. The example of
TOAST_INDEX_TARGET is one such case, but I wonder if there are other
varlena attributes that we might store differently in heap and index;
index_form_tuple() -> heap_fill_tuple() seem to do some churning for varlena.
It's not clear to me if index_getattr will return values which are
binary comparable to heap values. I wonder if calling index_form_tuple on
the heap values, fetching attributes via index_getattr on both index
tuples and then doing a binary compare is a more robust idea. Or maybe
that's just duplicating efforts.
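
Something along the lines of this sketch, then (a hypothetical helper using
PG-10-era backend APIs from access/itup.h, utils/datum.h and utils/rel.h;
whether the extra tuple formation is worth the cost is exactly the open
question):

static bool
index_attrs_binary_equal(Relation indexRel, IndexTuple itup,
						 Datum *heapValues, bool *heapIsnull)
{
	TupleDesc	itupdesc = RelationGetDescr(indexRel);
	/* push the heap values through the same varlena handling as the index */
	IndexTuple	heapItup = index_form_tuple(itupdesc, heapValues, heapIsnull);
	bool		equal = true;
	int			attno;

	for (attno = 1; attno <= itupdesc->natts && equal; attno++)
	{
		bool		isnull1,
					isnull2;
		Datum		d1 = index_getattr(itup, attno, itupdesc, &isnull1);
		Datum		d2 = index_getattr(heapItup, attno, itupdesc, &isnull2);

		if (isnull1 != isnull2)
			equal = false;
		else if (!isnull1 &&
				 !datumIsEqual(d1, d2,
							   itupdesc->attrs[attno - 1]->attbyval,
							   itupdesc->attrs[attno - 1]->attlen))
			equal = false;
	}

	pfree(heapItup);
	return equal;
}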

While looking at this problem, it occurred to me that the assumptions made
for hash indexes are also wrong :-( Hash indexes have the same problem as
expression indexes: a change in the heap value may not necessarily cause a
change in the hash key. If we don't detect that, we will end up having two
identical hash keys with the same TID pointer. This will cause the
duplicate key scans problem, since hashrecheck will return true for both
hash entries. That's a bummer as far as supporting WARM for hash indexes is
concerned, unless we find a way to avoid duplicate index entries.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] extended statistics: n-distinct

2017-03-24 Thread Alvaro Herrera
Pushed this after some more tweaking.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_serial early wraparound

2017-03-24 Thread Thomas Munro
On Sat, Mar 25, 2017 at 3:11 AM, Anastasia Lubennikova
 wrote:
> Hi, I've tried to review this patch, but it seems that I miss something 
> essential.

Hi Anastasia,

Thanks for looking at this.

> You claim that SLRUs now support five digit segment name, while in slru.h
> at current master I see the following:
>
>  * Note: slru.c currently assumes that segment file names will be four hex
>  * digits.  This sets a lower bound on the segment size (64K transactions
>  * for 32-bit TransactionIds).
>  */
> #define SLRU_PAGES_PER_SEGMENT  32
>
> /* Maximum length of an SLRU name */
> #define SLRU_MAX_NAME_LENGTH32

That comment is out of date.  Commit 638cf09e extended SLRUs to
support 5-digit names, to support pg_multixact.  And I see now that
commit 73c986ad more recently created the possibility of 6-character
SLRU file names for pg_commit_ts.

> Could you please clarify the idea of the patch? Is it still relevant?

The idea is simply to remove some strange old code including scary
error messages that is no longer needed.  In my study of predicate.c
for other reasons, I noticed this in passing and thought I'd tidy it
up.  Because I have tangled with pg_multixact and seen 5-character
SLRU files with my own eyes, I knew that the restriction that
motivated this code was no longer valid.

> I've also run your test script.
> pg_clog was renamed to pg_xact, so it need to be changed accordingly
> echo "Contents of pg_clog:"
>   ls $PGDATA/pg_xact/

Right.

> The test shows failed assertion:
>
> == setting next xid to 1073741824 =
> Transaction log reset
> waiting for server to start2017-03-24 17:05:19.897 MSK [1181] LOG:  
> listening on IPv4 address "127.0.0.1", port 5432
> 2017-03-24 17:05:19.981 MSK [1181] LOG:  listening on Unix socket 
> "/tmp/.s.PGSQL.5432"
> 2017-03-24 17:05:20.081 MSK [1183] LOG:  database system was shut down at 
> 2017-03-24 17:05:19 MSK
> 2017-03-24 17:05:20.221 MSK [1181] LOG:  database system is ready to accept 
> connections
>  done
> server started
> vacuumdb: vacuuming database "postgres"
> vacuumdb: vacuuming database "template0"
> vacuumdb: vacuuming database "template1"
> TRAP: FailedAssertion("!(TransactionIdPrecedesOrEquals(oldestXact, 
> ShmemVariableCache->oldestXid))", File: "clog.c", Line: 669)
> vacuumdb: vacuuming of database "template1" failed: server closed the 
> connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> 2017-03-24 17:05:21.541 MSK [1181] LOG:  server process (PID 1202) was 
> terminated by signal 6: Aborted
> 2017-03-24 17:05:21.541 MSK [1181] DETAIL:  Failed process was running: 
> VACUUM (FREEZE);

My cheap trick for moving the xid around the clock quickly to test
wraparound scenarios no longer works, since this new assertion was
added in ea42cc18.  That was committed just a few hours before you
tested this.  Bad luck for me!

> The new status of this patch is: Waiting on Author

It's not urgent, it's just cleanup work, so I've now moved it to the
next commitfest.  I will try to figure out a new way to demonstrate
that it works correctly without having to ask a reviewer to disable
any assertions.  Thanks again.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-24 Thread Andres Freund
Hi,

On 2017-03-24 11:26:27 -0400, Tom Lane wrote:
> Another modest proposal:
> 
> I'm not really sold on the approach of using EEOP_FETCHSOME opcodes to
> trigger initial tupleslot de-forming.  Certainly we want to have a single
> slot_getsomeattrs call per source slot, but as-is, we need a separate
> traversal over the expression tree just to precompute the max attribute
> number needed.  That can't be good for expression compile speed, and
> it introduces nontrivial bug risks because the code that does that
> is completely decoupled from the code that emits the EEOP_VAR opcodes
> (which are what's really relying on the de-forming to have happened).

Hm.  We had the separate traversal for projections for a long while, and
I don't think there've been a lot of changes to the extraction of the
last attribute number.  I'm very doubtful that the cost of traversing
the expression twice is meaningful in comparison to the other costs.


> What do you think about a design like this:
> 
> * Drop the FETCHSOME opcodes.
> 
> * Add fields to struct ExprState that will hold the maximum inner,
> outer, and scan attribute numbers needed.
> 
> * ExecInitExpr initializes those fields to zero, and then during
> ExecInitExprRec, whenever we generate an EEOP_VAR opcode, we do e.g.
> 
>   state->last_inner_attno = Max(state->last_inner_attno,
> variable->varattno);
> 
> * ExecInitExprSlots, get_last_attnums_walker, etc all go away.
> 
> * In the startup segment of ExecInterpExpr, add
> 
>   if (state->last_inner_attno > 0)
>   slot_getsomeattrs(innerslot, state->last_inner_attno);
>   if (state->last_outer_attno > 0)
>   slot_getsomeattrs(outerslot, state->last_outer_attno);
>   if (state->last_scan_attno > 0)
>   slot_getsomeattrs(scanslot, state->last_scan_attno);
> 
> This would be a little different from the existing code as far as runtime
> branch-prediction behavior goes, but it's not apparent to me that it'd be
> any worse.

I'd be surprised if it weren't.

I'm not super strongly against this setup, but I fail to see the benefit
of whacking this around.  I've benchmarked the previous/current setup
fairly extensively, and I'd rather not redo that.  In contrast to the
other changes you've talked about, this definitely is in the hot-path...


> Also, for JIT purposes it'd still be entirely possible to compile the
> slot_getsomeattrs calls in-line; you'd just be looking to a different
> part of the ExprState struct to find out what to do.

Yea, we could do that.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Create replication slot in pg_basebackup if requested and not yet present

2017-03-24 Thread Michael Banck
Hi,

On Thu, Mar 23, 2017 at 12:41:54PM +0100, Magnus Hagander wrote:
> On Tue, Mar 21, 2017 at 8:34 PM, Robert Haas  wrote:
> > So I tend to think that there should always be some explicit user
> > action to cause the creation of a slot, like --create-slot-if-needed
> > or --create-slot=name.  That still won't prevent careless use of that
> > option but it's less dangerous than assuming that a user who refers to
> > a nonexistent slot intended to create it when, perhaps, they just
> > typo'd it.
> 
> Well, the explicit user action would be "--slot". But sure, I can
> definitely live with a --create-if-not-exists. 

Can we just make that option create the slot and not worry if one exists
already? CreateReplicationSlot() can be told not to mind about already
existing slots, and I don't see a huge point in erroring out if the slot
exists already, unless somebody can show that this leads to data loss
somehow.

> The important thing, to me, is that you can run it as a single
> command, which makes it a lot easier to work with. (And not like we
> currently have for pg_receivewal which requires a separate command to
> create the slot)

Oh, that is how it works with pg_receivewal; I have to admit I've never
used it, so I was a bit confused about this when I read its code.

So in that case I think we don't necessarily need to have the same user
interface at all. I first thought about just adding "-C, --create" (as
in "--create --slot=foo"), but on second thought this looked a bit
shortsighted - who knows what flashy thing pg_basebackup might create in
5 years... So I settled on --create-slot, which is only slightly more to
type (albeit repetitive, "--create-slot --slot=foo"), and adding a short
option "-C" would be fine I think: "-C -S foo".

So attached is a patch with adds that option. If people really think it
should be --create-slot-if-not-exists instead I can update the patch, of
course.

I again added a second patch with some further refactoring which makes
it print a message on temporary slot creation in verbose mode.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
>From 4e37e110ac402e67874f729832b330a837284d4b Mon Sep 17 00:00:00 2001
From: Michael Banck 
Date: Fri, 24 Mar 2017 18:27:47 +0100
Subject: [PATCH 1/2] Add option to create a replication slot in pg_basebackup
 if not yet present.

When requesting a particular replication slot, the new option --create-slot
tries to create it before starting to replicate from it, in case it does not
exist already.
---
 doc/src/sgml/ref/pg_basebackup.sgml  | 12 
 src/bin/pg_basebackup/pg_basebackup.c| 44 +++-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 23 +--
 3 files changed, 75 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index e1cec9d..789f649 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -269,6 +269,18 @@ PostgreSQL documentation
  
 
  
+  -C
+  --create-slot
+  
+   
+This option can only be used together with the --slot 
+	option.  It causes the specified slot name to be created if it does not
+exist already.  
+   
+  
+ 
+
+ 
   -T olddir=newdir
   --tablespace-mapping=olddir=newdir
   
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 0a4944d..2af2e22 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -92,6 +92,7 @@ static pg_time_t last_progress_report = 0;
 static int32 maxrate = 0;		/* no limit by default */
 static char *replication_slot = NULL;
 static bool temp_replication_slot = true;
+static bool create_slot = false;
 
 static bool success = false;
 static bool made_new_pgdata = false;
@@ -337,6 +338,7 @@ usage(void)
 			 " write recovery.conf for replication\n"));
 	printf(_("  -S, --slot=SLOTNAMEreplication slot to use\n"));
 	printf(_("  --no-slot  prevent creation of temporary replication slot\n"));
+	printf(_("  -C, --create-slot  create replication slot if not present already\n"));
 	printf(_("  -T, --tablespace-mapping=OLDDIR=NEWDIR\n"
 	  " relocate tablespace in OLDDIR to NEWDIR\n"));
 	printf(_("  -X, --wal-method=none|fetch|stream\n"
@@ -581,6 +583,19 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier)
 	else
 		param->temp_slot = temp_replication_slot;
 
+	/*
+	 * Create permanent physical replication slot if requested.
+	 */
+	if (replication_slot && create_slot)
+	{
+		if (verbose)
+			fprintf(stderr, _("%s: creating replication slot \"%s\"\n"),
+	p

Re: [HACKERS] multivariate statistics (v25)

2017-03-24 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Here's a rebased series on top of today's a3eac988c267.  I call this
> v28.
> 
> I put David's pg_dump and COMMENT patches as second in line, just after
> the initial infrastructure patch.  I suppose those three have to be
> committed together, while the others (which add support for additional
> statistic types) can rightly remain as separate commits.

As I said in another thread, I pushed parts 0002,0003,0004.  Tomas said
he would try to rebase patches 0001,0005,0006 on top of what was
committed.  My intention is to give that one a look as soon as it is
available.  So we will have n-distinct and functional dependencies in
PG10.  It sounds unlikely that we will get MCVs and histograms in, since
they're each a lot of code.
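
For the archives, a sketch of what the committed parts buy us, using the
CREATE STATISTICS syntax as it ended up shipping in PG10 (the grammar was
still being adjusted around this time, so treat the exact spelling as
illustrative):

    CREATE TABLE t (a int, b int);
    INSERT INTO t SELECT i / 100, i / 100 FROM generate_series(1, 10000) i;
    CREATE STATISTICS s1 (ndistinct, dependencies) ON a, b FROM t;
    ANALYZE t;
    -- the GROUP BY estimate now comes from s1 instead of multiplying
    -- the per-column ndistinct estimates together
    EXPLAIN SELECT count(*) FROM t GROUP BY a, b;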

I suppose we need 0011 too (psql tab completion), but that can wait.

-- 
Álvaro Herrera        https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Implement multivariate n-distinct coefficients

2017-03-24 Thread Robert Haas
On Fri, Mar 24, 2017 at 1:16 PM, Alvaro Herrera  wrote:
> Implement multivariate n-distinct coefficients

dromedary and arapaima have failures like this, which seems likely
related to this commit:

  EXPLAIN
   SELECT COUNT(*) FROM ndistinct GROUP BY a, d;
   QUERY PLAN
  -
!  HashAggregate  (cost=225.00..235.00 rows=1000 width=16)
 Group Key: a, d
!->  Seq Scan on ndistinct  (cost=0.00..150.00 rows=1 width=8)
  (3 rows)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enabling parallelism for queries coming from SQL or other PL functions

2017-03-24 Thread Robert Haas
On Fri, Mar 24, 2017 at 5:57 AM, Rafia Sabih
 wrote:
>> I suspect that code fails to achieve its goals anyway.  At the top of
>> exec_eval_expr(), you call exec_prepare_plan() and unconditionally
>> pass CURSOR_OPT_PARALLEL_OK, so when that function returns, expr->plan
>> might now be a parallel plan.  If we reach the call to
>> exec_run_select() further down in that function, and if we happen to
>> pass false, it's not going to matter, because exec_run_select() is
>> going to find the plan already initialized.
>>
> True, fixed.
> The attached patch is to be applied over [1].

After some scrutiny I didn't find anything particularly wrong with
this, with the exception that exec_eval_expr() was passing false as
the parallelOK argument to exec_run_select(), which is inconsistent
with that function's earlier use of CURSOR_OPT_PARALLEL_OK to plan the
same query.  I fixed that by ripping out the parallelOK argument
altogether and just passing CURSOR_OPT_PARALLEL_OK when portalP ==
NULL.  The only reason I added parallelOK in the first place was
because of that RETURN QUERY stuff which subsequent study has shown to
be misguided.
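
In other words, the committed logic amounts to something like this
(paraphrased sketch, not the exact pl_exec.c source):

    static int
    exec_run_select(PLpgSQL_execstate *estate, PLpgSQL_expr *expr,
                    long maxtuples, Portal *portalP)
    {
        /*
         * Plan with CURSOR_OPT_PARALLEL_OK only when no portal is
         * wanted; a portal may be read piecemeal, which a parallel
         * plan cannot support.
         */
        if (expr->plan == NULL)
            exec_prepare_plan(estate, expr,
                              portalP == NULL ? CURSOR_OPT_PARALLEL_OK : 0);
        /* ... create the portal or run the plan to completion ... */
    }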

Committed that way; please let me know if you see any problems.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication existing data copy

2017-03-24 Thread Peter Eisentraut
On 3/24/17 10:09, Petr Jelinek wrote:
> On 24/03/17 15:05, Peter Eisentraut wrote:
>> On 3/23/17 19:32, Petr Jelinek wrote:
>>> Yes, I also forgot to check if the table actually exists on subscriber
>>> when fetching them in CREATE SUBSCRIPTION (we have check during
>>> replication but not there).
>>
>> I think for this we can probably just change the missing_ok argument of
>> RangeVarGetRelid() to false.
>>
>> Unless we want the custom error message, in which case we need to change
>> AlterSubscription_refresh(), because right now it errors out because
>> missing_ok = false.
>>
> 
> You are right, stupid me.

Committed this version.
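
For the archives, the shape of the fix is essentially this one-liner
(sketch, assuming the usual RangeVarGetRelid() signature):

    /*
     * missing_ok = false: the lookup itself errors out if the table
     * does not exist, so no separate existence check is needed
     */
    relid = RangeVarGetRelid(rv, AccessShareLock, false);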

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication existing data copy

2017-03-24 Thread Peter Eisentraut
On 3/24/17 05:22, Michael Banck wrote:
> However, replication also seems to not work, I'm using the following
> script right now:

The problem is that your publication does not include any tables.
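
Something along these lines is needed on the publisher before tables get
copied or replicated (sketch with made-up names):

    -- a publication created bare replicates nothing:
    CREATE PUBLICATION mypub;
    ALTER PUBLICATION mypub ADD TABLE mytable;
    -- or create it with tables (or FOR ALL TABLES) up front:
    -- CREATE PUBLICATION mypub FOR TABLE mytable;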

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] increasing the default WAL segment size

2017-03-24 Thread Robert Haas
On Wed, Mar 22, 2017 at 6:05 PM, David Steele  wrote:
>> Wait, really?  I thought you abandoned this approach because there's
>> then no principled way to handle WAL segments of less than the default
>> size.
>
> I did say that, but I thought I had hit on a compromise.
>
> But, as I originally pointed out the hex characters in the filename are not
> aligned correctly for > 8 bits (< 16MB segments) and using different
> alignments just made it less consistent.

I don't think I understand what the compromise is.  Are you saying we
should have one rule for segments < 16MB and another rule for
segments > 16MB?  I think using two different rules for file naming
depending on the segment size will be a negative for both tool authors
and ordinary users.
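
To spell out why the naming is sensitive to the segment size, here is a
toy model of the XLogFileName() calculation (illustrative only, not
PostgreSQL source): a name is timeline, "xlog id", and segment within
that id, where the number of segments per id is 2^32 divided by the
segment size.

    #include <stdio.h>
    #include <stdint.h>

    static void
    seg_name(unsigned tli, uint64_t segno, uint64_t seg_size)
    {
        uint64_t    segs_per_id = UINT64_C(0x100000000) / seg_size;

        printf("%08X%08X%08X\n", tli,
               (unsigned) (segno / segs_per_id),
               (unsigned) (segno % segs_per_id));
    }

    int
    main(void)
    {
        /* 16MB segments: the low field runs 00..FF before the id bumps */
        seg_name(1, 255, UINT64_C(16) << 20);   /* ...00000000000000FF */
        seg_name(1, 256, UINT64_C(16) << 20);   /* ...0000000100000000 */
        /* 64MB segments: it only runs 00..3F, breaking the familiar
         * 16MB sequence that tools tend to hardcode */
        seg_name(1, 63, UINT64_C(64) << 20);    /* ...000000000000003F */
        seg_name(1, 64, UINT64_C(64) << 20);    /* ...0000000100000000 */
        return 0;
    }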

> It would be OK if we were willing to drop the 1,2,4,8 segment sizes because
> then the alignment would make sense and not change the current 16MB
> sequence.

Well, that is true.  But the thing I'm trying to do here is to keep
this patch down to what actually needs to be changed in order to
accomplish the original purpose, not squeeze more and more changes
into it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

