Re: Server won't start with fallback setting by initdb.

2018-02-27 Thread Michael Paquier
On Wed, Feb 14, 2018 at 10:08:06AM +0900, Kyotaro HORIGUCHI wrote:
> Definitely.  Another rationale for the value is that the regression
> tests fail with numbers less than 20, so it's not 11 but 20.  Currently
> the regression tests should succeed with that number of connections, as
> written in parallel_schedule, and I've read (in Robert's mail, but I
> haven't confirmed) that the tests for parallel scans are designed to use
> up to 20 connections.

I just noticed that this thread is registered in the next CF, so I have
switched this patch to ready for committer.  In the documentation,
guc-max-connections (config.sgml) mentions that the default value of
max_connections is typically 100, but it could be less as determined by
initdb.  Do we want to specify as well that initdb uses 100 as an upper
bound and 20 as a lower bound when it does its evaluation?  I would
think that this is not worth mentioning in the docs, but as initdb is
directly mentioned..
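
For reference, the probing works roughly like this (a simplified C
sketch; the candidate values match the 100..20 range discussed here, but
test_server_start() is a hypothetical stand-in for initdb's actual
config-testing machinery):

/* Simplified sketch of initdb's max_connections probing. */
static const int trial_conns[] = {100, 50, 40, 30, 20};

static int
probe_max_connections(void)
{
	int			i;

	/* Try candidates from highest to lowest; keep the first that works. */
	for (i = 0; i < (int) (sizeof(trial_conns) / sizeof(trial_conns[0])); i++)
	{
		if (test_server_start(trial_conns[i]))	/* hypothetical helper */
			return trial_conns[i];
	}
	return trial_conns[i - 1];	/* nothing worked; fall back to 20 */
}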
--
Michael


signature.asc
Description: PGP signature


Re: PATCH: Configurable file mode mask

2018-02-27 Thread Michael Paquier
On Tue, Feb 27, 2018 at 03:52:32PM -0500, David Steele wrote:
> On 1/30/18 3:01 AM, Michael Paquier wrote:
> The purpose of this test is to ensure nothing else is in the directory.
> As the tests get more complex I think keeping track of the state will be
> important.  In other words, this is really an assertion that the current
> test state is correct, not that the prior command succeeded.

Good point.  Objection removed.

>> Another test that could be easily included is with -o, then create a
>> table and check for its OID value.  Also for -x, then just use
>> txid_current to check the value consistency.  For -c, with
>> track_commit_timestamp set to on, you can set a couple of values and
>> then check for pg_last_committed_xact() which would be NULL if you use
>> an XID older than the minimum set for example.  All those are simple
>> enough so they could easily be added in the first version of this test
>> suite.
> 
> I would prefer to leave this for another patch.

OK, no problems with that either.

>>> 2) 02-mkdir
>>>
>>> Converts mkdir() calls to MakeDirectoryDefaultPerm() if the original
>>> call used default permissions.
>> 
>> So the only call not converted to that API is in ipc.c...  OK, this one
>> is pretty simple, so nothing really to say about it; the real meat comes
>> with patch 3.  I would look favorably on this patch introducing
>> file_perm.h, which would include both PG_FILE_MODE_DEFAULT and
>> PG_DIR_MODE_DEFAULT, and having the frontends use those values.  This
>> would make the switch to 0003 a bit easier, as pg_resetwal's diffs show
>> for example.
> 
> Agreed.  I thought about this originally but could not come up with a
> good split.  I spent some time on it and came up with something that I
> think works (and would make sense to commit separately).

OK, thanks.

>> +command_ok(
>> +   ['chmod', "-R", 'g+rX', "$pgdata"],
>> +   'add group perms to PGDATA');
>>
>> That would blow up on Windows.  You should replace that with Perl's chmod
>> and look at its return result for sanity checks, likely with ($cnt != 0).
>> And let's not use icacls please...
> 
> Fixed with a new function, add_pg_data_group_perm().

That's basically a recursive chmod, so wouldn't chmod_recursive be a
better fit?  I could imagine that this would be useful as well for
removing group permissions, so the new mode could be specified as an
argument.

>> +if ($params{has_group_access})
>> +{
>> +push(@{$params{extra}}, '-g');
>> +}
>> No need to introduce a new parameter here; please directly use extra =>
>> [ '-g' ] in the routines calling PostgresNode::init.
> 
> Other code needs to know if group access is enabled on the node (see
> RewindTest.pm) so it's not just a matter of passing the option.

I don't quite understand.  I have no objection to extending
setup_cluster() with a group_access argument so the tests run both with
and without group access, but I'd rather keep the low-level API clean of
anything fancy if we can use the facility which already exists.

> New patches are attached.  The end result is almost identical to v6 but
> code was moved from 03 to 02 as per Michael's suggestion.

Thanks for the updated versions.

> 1) 01-pgresetwal-test
> 
> Adds a very basic test suite for pg_resetwal.

Okay.  This one looks fine to me.  It could be passed down to a
committer.

> 2) 02-file-perm
> 
> Adds a new header (file_perm.h) and makes all front-end utilities use
> the new constants for setting umask.  Converts mkdir() calls in the
> backend to MakeDirectoryDefaultPerm() if the original
> call used default permissions.  Adds tests to make sure permissions in
> PGDATA are set correctly by the front-end tools.

I had a more simple (complicated?) thing in mind for the split, which
would actually cause this patch to be split into two more:
1) Introduce file_perm.h and replace all the existing values for modes
and masks with what the header introduces.  This allows checking that
the new header is not forgotten anywhere, especially in the frontends.
2) Introduce MakeDirectoryDefaultPerm, which switches the backend calls
of mkdir() to the new API.
3) Add the grouping facilities, which updates the default masks and
modes of file_perm.h.

Grouping 1) and 2) together as you do makes sense as well, I guess; in
my proposal the mkdir calls in the backend would be touched twice, but
things would get more incremental.

+/*
+ * Default mode for created files.
+ */
+#define PG_FILE_MODE_DEFAULT   (S_IRUSR | S_IWUSR | S_IRGRP)
+
+/*
+ * Default mode for directories.
+ */
+#define PG_DIR_MODE_DEFAULT    (S_IRWXU | S_IRGRP | S_IXGRP)
For the first cut, this should not use S_IRGRP in the first declaration,
nor (S_IRGRP | S_IXGRP) in the second declaration.
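
In other words, the first cut would strip the group bits, something like
this sketch:

/* Sketch of the first-cut defaults, before group access is added. */
#define PG_FILE_MODE_DEFAULT    (S_IRUSR | S_IWUSR)
#define PG_DIR_MODE_DEFAULT     (S_IRWXU)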

-if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+if (script == NULL && (script = fopen(output_path, "w")) == NULL)
Why are those calls in pg_upgrade changed?

TestLib.pm does not need the group-related extensions, r

Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key

2018-02-27 Thread Rajkumar Raghuwanshi
On Wed, Feb 14, 2018 at 5:44 PM, Amit Kapila 
wrote:

> +# Concurrency error from GetTupleForTrigger
> +# Concurrency error from ExecLockRows
>
> I think you don't need to mention above sentences in spec files.
> Apart from that, your patch looks good to me.  I have marked it as
> Ready For Committer.
>

I too have tested this feature with the isolation framework and this
looks good to me.

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation


RE: [bug fix] pg_rewind takes long time because it mistakenly copies data files

2018-02-27 Thread Tsunakawa, Takayuki
From: Michael Paquier [mailto:mich...@paquier.xyz]
> So I would propose to just do that later.  I have looked a second time at
> your patch, attached is the set of tests I have run:

Thanks so much, that has helped me a lot!

> I have one small comment though.  The comment block at the beginning of
> isRelDataFile() refers to "pg_tblspc/<tblspc_oid>/PG_9.4_201403261/".
> Could you switch that to "pg_tblspc/<tblspc_oid>/<tblspc_version>/"?
> That's not directly the fault of your patch, but as long as we are on it..

Done, thanks again for marking the CF entry.

Regards
Takayuki Tsunakawa



pg_rewind_pathcmp_v2.patch
Description: pg_rewind_pathcmp_v2.patch


Re: [bug fix] pg_rewind takes long time because it mistakenly copies data files

2018-02-27 Thread Michael Paquier
On Mon, Feb 26, 2018 at 06:24:02PM +0900, Michael Paquier wrote:
> Anything like that would be work only for HEAD I think as that's a bit
> of refactoring.  And indeed it could give you a good introduction to the
> TAP facility.

So I would propose to just do that later.  I have looked a second time
at your patch, attached is the set of tests I have run:
- With assertions enabled, I see pg_rewind failing on an assertion as
mentioned upthread.
- With assertions disabled (look at rewind_logs.log as generated by the
script attached), all the files from the tablespace are copied, with
all the file chunks fetched at the second phase of the rewind.
- With the patch, both the assertion failure and the handling of
tablespace files are fixed.  Again, in the logs produced by the script
you would see that tablespace files are not completely copied anymore,
and that only portions of them are.

I have one small comment though.  The comment block at the beginning of
isRelDataFile() refers to "pg_tblspc/<tblspc_oid>/PG_9.4_201403261/".
Could you switch that to "pg_tblspc/<tblspc_oid>/<tblspc_version>/"?
That's not directly the fault of your patch, but as long as we are on
it..

So I am marking this as ready for committer.  Thanks for the report,
Tsunakawa-san!
--
Michael
#!/bin/bash

set -e

PRIMARY_PGDATA=$HOME/data/primary
STANDBY_PGDATA=$HOME/data/standby
PRIMARY_TBSPACE=$HOME/data/tbspace
STANDBY_TBSPACE=$HOME/data/tbspace_standby

rm -rf $PRIMARY_PGDATA $STANDBY_PGDATA
rm -rf $PRIMARY_TBSPACE
rm -rf $STANDBY_TBSPACE

mkdir $PRIMARY_TBSPACE

# Create primary with its tablespace
initdb --noclean --nosync -D $PRIMARY_PGDATA
# NOTE: heredoc bodies below are assumed, representative values; the
# originals were mangled when this message was archived.
cat >> $PRIMARY_PGDATA/postgresql.conf <<EOF
wal_level = replica
wal_log_hints = on
max_wal_senders = 4
hot_standby = on
port = 5432
EOF
pg_ctl start -D $PRIMARY_PGDATA
psql -c "CREATE TABLESPACE tbspace LOCATION '$PRIMARY_TBSPACE'" postgres

# Create a standby and start it
pg_basebackup -D $STANDBY_PGDATA -p 5432 \
	--tablespace-mapping=$PRIMARY_TBSPACE=$STANDBY_TBSPACE
cat >> $STANDBY_PGDATA/postgresql.conf <<EOF
port = 5433
EOF
cat > $STANDBY_PGDATA/recovery.conf <<EOF
standby_mode = on
primary_conninfo = 'port=5432'
EOF
pg_ctl start -D $STANDBY_PGDATA

# Promote the standby, then rewind the old primary
pg_ctl promote -D $STANDBY_PGDATA
pg_ctl stop -D $PRIMARY_PGDATA
pg_rewind --debug --source-server="port=5433 dbname=postgres" \
	--target-pgdata=$PRIMARY_PGDATA > $HOME/data/rewind_logs.log
echo "port=5432" >> $PRIMARY_PGDATA/postgresql.conf
cat > $PRIMARY_PGDATA/recovery.conf <<EOF
standby_mode = on
primary_conninfo = 'port=5433'
EOF

signature.asc
Description: PGP signature


Re: MSVC build of 9.5.12 is broken?

2018-02-27 Thread Victor Wagner
On Wed, 28 Feb 2018 10:45:24 +0900,
Michael Paquier wrote:


> Replace backslashes by forward slashes in MSVC build code
> 
> This makes it possible to run some stages of these build scripts on
> non-Windows systems.  That way, we can more easily test whether file
> moves or makefile changes might break the MSVC build.

It would be nice to be able to at least run perl -wc on these scripts
on a non-Windows platform.

Unfortunately, subsequent development seems to have broken this.

Now Mkvcbuild.pm depends on the Win32 module and the Win32API::File
module, neither of which exists on non-Win32 platforms.

Or does a fake Win32 module exist somewhere which would satisfy the
dependencies and do nothing?



-- 
   Victor Wagner 



Re: TODO item for broken \s with libedit seems fixed

2018-02-27 Thread Tatsuo Ishii
> Tatsuo Ishii  writes:
>> Would you like to do this yourself? Or shall I do this?
> 
> Go ahead, I have a bunch of other stuff to do ...

Done. Once my editing is confirmed ok, I will delete the item.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Let's remove DSM_IMPL_NONE.

2018-02-27 Thread Tom Lane
Robert Haas  writes:
> On Tue, Feb 27, 2018 at 2:50 PM, Andres Freund  wrote:
>> What I'm concerned about isn't so much testing paths specific to
>> dynamic_shared_memory_type=none, but paths where we currently need
>> fallbacks for the case we couldn't actually allocate dynamic shared
>> memory. Which I think we at least somewhat gracefully need to deal with.

> Well, it's not very hard to just hack the code to make dsm_create()
> always fail, or fail X% of the time, if you're so inclined.  I agree
> that -c dynamic_shared_memory_type=none is a little more convenient
> than sticking something like that into the code, but I don't think
> it's sufficiently more convenient to justify keeping an option we
> don't otherwise want.

I'd be in favor of having some normally-ifdef'd-out facility for causing
failures of this kind.  (As I mentioned upthread, I do not think "always
fail" is sufficient.)  That's very different from having a user-visible
option, though.  We don't expose a GUC that enables CLOBBER_FREED_MEMORY,
to take a relevant example.
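
For illustration, such a facility could be as small as this at the top
of dsm_create() (a sketch only; the FAULT_INJECTION_DSM symbol and the
10% rate are invented):

#ifdef FAULT_INJECTION_DSM
	/* Fail a fraction of allocations, not only the first one. */
	if (random() % 100 < 10)
		ereport(ERROR,
				(errcode(ERRCODE_INSUFFICIENT_RESOURCES),
				 errmsg("fault injection: could not create dynamic shared memory segment")));
#endif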

regards, tom lane



Re: Let's remove DSM_IMPL_NONE.

2018-02-27 Thread Robert Haas
On Tue, Feb 27, 2018 at 2:50 PM, Andres Freund  wrote:
> What I'm concerned about isn't so much testing paths specific to
> dynamic_shared_memory_type=none, but paths where we currently need
> fallbacks for the case we couldn't actually allocate dynamic shared
> memory. Which I think we at least somewhat gracefully need to deal with.

Well, it's not very hard to just hack the code to make dsm_create()
always fail, or fail X% of the time, if you're so inclined.  I agree
that -c dynamic_shared_memory_type=none is a little more convenient
than sticking something like that into the code, but I don't think
it's sufficiently more convenient to justify keeping an option we
don't otherwise want.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index

2018-02-27 Thread Robert Haas
On Mon, Feb 26, 2018 at 7:51 PM, Thomas Munro
 wrote:
> Thinking about how to tune that got me thinking about a simple middle
> way we could perhaps consider...
>
> What if we just always locked pseudo page numbers using hash_value %
> max_predicate_locks_per_relation (= effectively 31 by default)?  Then
> you'd have lower collision rates on small hash indexes, you'd never
> have to deal with page splits, and you'd never exceed
> max_predicate_locks_per_relation so you'd never escalate to relation
> level locks on busy systems.  On the downside, you'd have eg ~3%
> chance of collision instead of a 1/hash_maxbucket chance of collision,
> so it gets a bit worse for large indexes on systems that are not busy
> enough to exceed max_predicate_locks_per_relation.  You win some, you
> lose some...

Hmm, yeah, that definitely has some appeal.  On the other hand,
there's a lot of daylight between locking hv % 2^32 and locking hv %
31; the former is going to potentially blow out the lock table really
fast, while the latter is potentially going to create an uncomfortable
number of false collisions.  One could imagine a modulus larger than
31 and smaller than 4294967296, although I don't have a principled
suggestion for how to pick it.  On the third hand, people who are
using SSI heavily may well have increased
max_predicate_locks_per_relation and with this proposal that just
kinda does what you want.

I don't really know how we can judge the merits of any particular
modulus (or of committing the patch at all) without some test results
showing that it helps reduce rollbacks or increase performance or
something.  Otherwise we're just guessing.  It does however seem to me
that locking the hash value % (something) is better than basing the
locking on bucket or page numbers.  Basing it on page numbers strictly
speaking cannot work, since the same tuple could be present in any
page in the bucket chain; you'd have to lock the page number of the
head of the bucket chain.  There is however no advantage of doing that
over locking the bucket number directly.  Moreover, locking the bucket
number directly forces you to worry about splits, whereas if you lock
hv % (something) then you don't have to care.
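
Concretely, the locking step would reduce to something like this (a
sketch; PREDLOCK_MODULUS stands in for whatever modulus is eventually
chosen):

#define PREDLOCK_MODULUS	1024	/* assumed; somewhere between 31 and 2^32 */

/* Predicate-lock a pseudo page number derived from the hash value, so
 * that bucket splits never move the lock target. */
static void
predicate_lock_hash_value(Relation rel, uint32 hashvalue, Snapshot snapshot)
{
	PredicateLockPage(rel, (BlockNumber) (hashvalue % PREDLOCK_MODULUS),
					  snapshot);
}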

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: jsonpath

2018-02-27 Thread Robert Haas
On Mon, Feb 26, 2018 at 10:34 AM, Nikita Glukhov
 wrote:
> Attached 10th version of the jsonpath patches.
>
> 1. Fixed error handling in arithmetic operators.
>
>Now run-time errors in arithmetic operators are caught (added
>PG_TRY/PG_CATCH around the operators' function calls) and converted into
>Unknown values in predicates, as required by the standard:

I think we really need to rename PG_TRY and PG_CATCH or rethink this
whole interface so that people stop thinking they can use it to
prevent errors from being thrown.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Reopen logfile on SIGHUP

2018-02-27 Thread Michael Paquier
On Tue, Feb 27, 2018 at 05:52:20PM -0500, Tom Lane wrote:
> It already does treat SIGUSR1 as a log rotation request.  Apparently
> the point of this patch is that some people don't find that easy enough
> to use, which is fair, because finding out the collector's PID from
> outside isn't very easy.

True enough.  The syslogger does not show up in pg_stat_activity either,
so I think that being able to do so would help for this case.
--
Michael


signature.asc
Description: PGP signature


Re: MSVC build of 9.5.12 is broken?

2018-02-27 Thread Michael Paquier
On Tue, Feb 27, 2018 at 04:14:02PM +0300, Victor Wagner wrote:
> How interesting - somewhere between 9.3 (for which the patch was made) and
> 9.5, the path to the Makefile in the Windows-specific script was changed
> from Windows-style separators to Unix-style, and this broke the context.

You are looking for that, which was added during 9.4's development:
commit: 854adb83711da8fda2a8f028c27ad8956179c04a
author: Peter Eisentraut 
date: Sat, 25 Apr 2015 08:58:01 -0400
Replace backslashes by forward slashes in MSVC build code

This makes it possible to run some stages of these build scripts on
non-Windows systems.  That way, we can more easily test whether file
moves or makefile changes might break the MSVC build.

Peter Eisentraut and Michael Paquier
--
Michael


signature.asc
Description: PGP signature


Re: prokind column (was Re: [HACKERS] SQL procedures)

2018-02-27 Thread Michael Paquier
On Tue, Feb 27, 2018 at 11:50:31AM -0500, Tom Lane wrote:
> We support the various psql/describe.c features against old servers,
> so it would be quite inconsistent for tab completion not to work
> similarly.  There are some gaps in that already, as per the other
> thread, but we should clean those up not add more.

I'll try to study this thread a bit more as I lack context, so I'll
reply there.  What I am scared of is that Thomas Munro has done a heroic
effort to increase the readability of psql's tab completion during the
9.6 dev cycle, and it would be sad to reduce the quality of this code.
--
Michael


signature.asc
Description: PGP signature


Re: compress method for spgist - 2

2018-02-27 Thread Daniel Gustafsson
> On 04 Jan 2018, at 06:17, Dagfinn Ilmari Mannsåker  wrote:
> 
> Teodor Sigaev  writes:
> 
>>> Now, this patch is ready for committer from my point of view.
>> 
>> Thank you, pushed
> 
> This patch added two copies of the poly_ops row to the "Built-in SP-GiST
> Operator Classes" table in spgist.sgml.  The attached patched removes
> one of them.

Patch looks good, marked as Ready for Committer in the CF app.

cheers ./daniel


Re: jsonlog logging only some messages?

2018-02-27 Thread Michael Paquier
On Tue, Feb 27, 2018 at 04:50:24PM +, Greg Stark wrote:
> I would actually lean another way. If jsonlog opened *.json and wrote
> there then it could leave the output_to_server field unchanged. It
> does look like this might be a bit awkward with yet more of the static
> functions needing to be duplicated.

Which brings up why the chunked protocol that the syslogger uses is
useful: with a custom file you would need either to introduce a global
locking mechanism or to create a background worker that jsonlog can feed
its data to, so that the log file does not end up with overlapping
writes.  This bgworker could use its own protocol or the same protocol
as the syslogger, resulting in a sort of secondary syslogger process.
If you feel motivated to code anything like that, I'll be happy to merge
it into my tree with jsonlog :)
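
To illustrate why: without a single writer process, every backend would
need something like a global lock around each append (a sketch;
JsonlogLock and jsonlog_fd are invented names, not real APIs):

static void
jsonlog_write(const char *buf, size_t len)
{
	/* Serialize appends so concurrent backends cannot interleave writes. */
	LWLockAcquire(JsonlogLock, LW_EXCLUSIVE);
	if (write(jsonlog_fd, buf, len) != (ssize_t) len)
		elog(WARNING, "could not write to json log file: %m");
	LWLockRelease(JsonlogLock);
}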
--
Michael


signature.asc
Description: PGP signature


Re: Online enabling of checksums

2018-02-27 Thread Daniel Gustafsson
> On 28 Feb 2018, at 00:51, Andrey Borodin  wrote:
> 
>> On 25 Feb 2018, at 21:17, Magnus Hagander wrote:
>> 
>> PFA an updated patch that adds this, and also fixes the problem in 
>> pg_verify_checksums spotted by Michael Banck. 
> 
> I had to set ALLOW_CONNECTIONS on template0 to make it work:
> postgres=# 2018-02-27 21:29:05.993 +05 [57259] ERROR:  Database template0 
> does not allow connections.
> Is it a problem with my installation or some regression in the patch?

This is due to a limitation that applies to bgworkers: in order to add
checksums, the bgworker must connect to the database, and template0 does not
allow connections by default.  There is a discussion, and a patch, for
lifting this restriction, but until it makes it in (if it does), the user
will have to allow connections manually.  For reference, the thread for
allowing bgworkers to bypass allowconn is here:
allowconn in bgworker is here:

https://www.postgresql.org/message-id/cabuevewwt9zmonbmrff0owneon3dapgovzwhan0bukxaqy3...@mail.gmail.com

> 2018-02-27 21:40:47.132 +05 [57402] HINT:  either disable or enable checksums 
> by calling the pg_data_checksums_enable()/disable() functions
> Function names are wrong in this hint: pg_enable_data_checksums()

Fixed.  The format of this hint (and errmsg) is actually also incorrect, which
I fixed while in there.

> The code is nice and clear. One minor spot in the comment
>> This option can only _be_ enabled when data checksums are enabled.

Fixed.

> Is there any way we could provide this functionality for previous versions
> (9.6, 10)? Like implementing a utility for offline checksum enabling,
> without WAL-logging, surely.

While outside the scope of the patch in question (since it deals with enabling
checksums online), such a utility should be perfectly possible to write.

Thanks for reviewing!

cheers ./daniel



online_checksums3.diff
Description: Binary data


Re: [HACKERS] path toward faster partition pruning

2018-02-27 Thread Amit Langote
On 2018/02/28 1:05, Robert Haas wrote:
> On Mon, Feb 26, 2018 at 10:59 PM, Amit Langote
>  wrote:
>> You may say that partition bounds might have to be different too in this
>> case and hence partition-wise join won't occur anyway, but I'm wondering
>> if the mismatch of partcollation itself isn't enough to conclude that?
> 
> Yeah, you're right.  I think that this is just a bug in partition-wise
> join, and that the partition scheme should just be using partcollation
> instead of parttypcoll, as in the attached.

Ah, OK. I was missing that there is no need to have both parttypcoll and
partcollation in PartitionSchemeData, as the Vars in rel->partexprs are
built from a bare PartitionKey (not PartitionSchemeData), and after that
point, parttypcoll no longer needs to be kept around.

I noticed that there is a typo in the patch.

+   memcpy(part_scheme->partcollation, partkey->parttypcoll,

s/parttypcoll/partcollation/g

BTW, should there be a relevant test in partition_join.sql?  If yes,
attached a patch (partitionwise-join-collation-test-1.patch) to add one.

Also attached updated version of your patch (fixed the typo).

Thanks,
Amit
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 60f21711f4..b799e249db 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1891,7 +1891,7 @@ find_partition_scheme(PlannerInfo *root, Relation relation)
 				   sizeof(Oid) * partnatts) != 0 ||
 			memcmp(partkey->partopcintype, part_scheme->partopcintype,
 				   sizeof(Oid) * partnatts) != 0 ||
-			memcmp(partkey->parttypcoll, part_scheme->parttypcoll,
+			memcmp(partkey->partcollation, part_scheme->partcollation,
 				   sizeof(Oid) * partnatts) != 0)
 			continue;
 
@@ -1926,8 +1926,8 @@ find_partition_scheme(PlannerInfo *root, Relation relation)
 	memcpy(part_scheme->partopcintype, partkey->partopcintype,
 		   sizeof(Oid) * partnatts);
 
-	part_scheme->parttypcoll = (Oid *) palloc(sizeof(Oid) * partnatts);
-	memcpy(part_scheme->parttypcoll, partkey->parttypcoll,
+	part_scheme->partcollation = (Oid *) palloc(sizeof(Oid) * partnatts);
+	memcpy(part_scheme->partcollation, partkey->partcollation,
 		   sizeof(Oid) * partnatts);
 
 	part_scheme->parttyplen = (int16 *) palloc(sizeof(int16) * partnatts);
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index db8de2dfd0..d576aa7350 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -351,7 +351,7 @@ typedef struct PartitionSchemeData
 	int16		partnatts;		/* number of partition attributes */
 	Oid		   *partopfamily;	/* OIDs of operator families */
 	Oid		   *partopcintype;	/* OIDs of opclass declared input data types */
-	Oid		   *parttypcoll;	/* OIDs of collations of partition keys. */
+	Oid		   *partcollation;	/* OIDs of partitioning collations */
 
 	/* Cached information about partition key data types. */
 	int16	   *parttyplen;
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 4fccd9ae54..6323a13777 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -1869,3 +1869,31 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_n t1 FULL JOIN prt1 t2 ON (t1.c = t2.c);
    ->  Seq Scan on prt1_n_p2 t1_1
 (10 rows)
 
+--
+-- No partition-wise join if partitioning collation doesn't match
+--
+CREATE TABLE posix_text (a text) PARTITION BY RANGE (a COLLATE "POSIX");
+CREATE TABLE posix_text1 PARTITION OF posix_text FOR VALUES FROM ('a') TO ('m');
+CREATE TABLE posix_text2 PARTITION OF posix_text FOR VALUES FROM ('m') TO ('z ');
+CREATE TABLE c_text (a text) PARTITION BY RANGE (a COLLATE "C");
+CREATE TABLE c_text1 PARTITION OF c_text FOR VALUES FROM ('a') TO ('m');
+CREATE TABLE c_text2 PARTITION OF c_text FOR VALUES FROM ('m') TO ('z ');
+EXPLAIN (COSTS OFF)
+SELECT * FROM posix_text p JOIN c_text c ON (p.a = c.a);
+          QUERY PLAN           
+-------------------------------
+ Merge Join
+   Merge Cond: (p.a = c.a)
+   ->  Sort
+         Sort Key: p.a
+         ->  Append
+               ->  Seq Scan on posix_text1 p
+               ->  Seq Scan on posix_text2 p_1
+   ->  Sort
+         Sort Key: c.a
+         ->  Append
+               ->  Seq Scan on c_text1 c
+               ->  Seq Scan on c_text2 c_1
+(12 rows)
+
+DROP TABLE posix_text, c_text;
diff --git a/src/test/regress/sql/partition_join.sql b/src/test/regress/sql/partition_join.sql
index a2d8b1be55..0df7df487d 100644
--- a/src/test/regress/sql/partition_join.sql
+++ b/s

Re: Let's remove DSM_IMPL_NONE.

2018-02-27 Thread Michael Paquier
On Tue, Feb 27, 2018 at 02:53:55PM -0500, Tom Lane wrote:
> Andres Freund  writes:
>> On 2018-02-27 14:41:47 -0500, Tom Lane wrote:
>>> What I didn't understand about it was what kind of testing this'd make
>>> harder.  If we desupport dynamic_shared_memory_type=none, there aren't
>>> any code paths that need to cope with the case, and we should just
>>> remove any code that thereby becomes unreachable.
> 
>> What I'm concerned about isn't so much testing paths specific to
>> dynamic_shared_memory_type=none, but paths where we currently need
>> fallbacks for the case we couldn't actually allocate dynamic shared
>> memory. Which I think we at least somewhat gracefully need to deal with.
> 
> Ah.  That's a fair point, but I do not think
> dynamic_shared_memory_type=none is a good substitute for having a way to
> provoke allocation failures.  That doesn't let you test recovery from
> situations where your first allocation works and second one fails, for
> example; and cleanup from that sort of case is likely to be more
> complicated than the trivial case.

dynamic_shared_memory_type is used in six code paths where it is aimed
at doing sanity checks:
- Mute DSM initialization at postmaster startup.
- Mute cleanup of previous DSM segments from a past postmaster.
- Block backend startup if there is no DSM.
- Mute parallel query in planner.
- Mute parallel query for parallel btree builds.
- Mute creation of parallel contexts in executor.
The point is that there are other control mechanisms for each one of
them.  Mainly, for the executor portion, the number of workers is
controlled by other GUCs on the planner side.
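
Each of those spots boils down to a guard of this shape, which removing
the option would delete (sketch):

	/* e.g. before planning anything parallel (sketch) */
	if (dynamic_shared_memory_type == DSM_IMPL_NONE)
		return;					/* no DSM, so no parallelism */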
--
Michael


signature.asc
Description: PGP signature


Re: Let's remove DSM_IMPL_NONE.

2018-02-27 Thread Michael Paquier
On Tue, Feb 27, 2018 at 02:00:36PM -0500, Robert Haas wrote:
> I think the two issues you are pointing out are the same issue --
> namely, if initdb probes for a max_connections setting with an invalid
> DSM implementation configured, it will fail the test for every value
> of max_connections and thus select the lowest possible value.  The
> solution to this is presumably just to make sure we choose the DSM
> implementation before we do the max_connections probing so that we can
> pass through the correct value there, which I think is what your patch
> does.

Yes, that's what the thing does.  It moves the routine to find the DSM
implementation before computing max_connections.
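
So the relevant part of initdb's setup becomes, roughly (a sketch only;
the function names follow initdb.c):

	/* Pick a working DSM implementation first ... */
	dynamic_shared_memory_type = choose_dsm_implementation();

	/* ... so the test server started while probing max_connections and
	 * shared_buffers runs with a valid setting. */
	test_config_settings();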
--
Michael


signature.asc
Description: PGP signature


ON CONFLICT DO UPDATE for partitioned tables

2018-02-27 Thread Alvaro Herrera
I updated Amit Langote's patch for INSERT ON CONFLICT DO UPDATE[1].
Following the lead of edd44738bc88 ("Be lazier about partition tuple
routing.") this incarnation only does the necessary push-ups for the
specific partition that needs it, at execution time.  As far as I can
tell, it works as intended.

I chose to refuse the case where the DO UPDATE clause causes the tuple
to move to another partition (i.e. you're updating the partition key of
the tuple).  While it's probably possible to implement that, it doesn't
seem a very productive use of time.

However, there is a shortcoming in the design: it fails if there are
multiple levels of partitioning, because there is no easy (efficient)
way to map the index OID across more than one level.  I had already
mentioned this shortcoming regarding Amit's patch.  So this case (which
I added in the regression tests) fails unexpectedly:

-- multiple-layered partitioning
create table parted_conflict_test (a int primary key, b text) partition by 
range (a);
create table parted_conflict_test_1 partition of parted_conflict_test
  for values from (0) to (1) partition by range (a);
create table parted_conflict_test_1_1 partition of parted_conflict_test_1
  for values from (0) to (100);
insert into parted_conflict_test values ('10', 'ten');
insert into parted_conflict_test values ('10', 'ten two')
  on conflict (a) do update set b = excluded.b;
ERROR:  invalid ON CONFLICT DO UPDATE specification
DETAIL:  An inferred index was not found in partition 
"parted_conflict_test_1_1".

So the problem here is that my MapPartitionIndexList() implementation is
too stupid.  I think it would be smarter to use the ResultRelInfo
instead of bare Relation, for one.  But that still doesn't answer how to
find a "path" from root to leaf partition, which is what I'd need to
verify that there are valid pg_inherits relationships for the partition
indexes.  I'm probably missing something about the partitionDesc or
maybe the partitioned_rels lists that would help me do it efficiently,
but I hope to figure it out soon.

One idea I was toying with is to add RelationData->rd_children as a list
of OIDs of children relations.  So it should be possible to walk the
list from the root to the desired descendant, without having to scan
pg_inherits repeatedly ... although that would probably require doing
relation_open() for every index, which sounds undesirable.

(ISTM that having RelationData->rd_children might be a good idea in
general anyway -- I mean to speed up some code that currently scans
pg_inherits via find_inheritance_children.  However, since the partition
descriptor itself is already in relcache, maybe this doesn't matter too
much.)

Another idea is to abandon the notion that we need to find a path from
parent index to descendant index, and just run the inference algorithm
again on the partition.  I'm not sure how much I like this idea, yet.

Anyway, while this is still WIP, I think it works correctly for the case
where there is a single partition level.


[1] https://postgr.es/m/c1651d5b-7bd6-b7e7-e1cc-16ecfe2c0...@lab.ntt.co.jp

-- 
Álvaro Herrera
From a2517bd315034de7f6c5a4728f66729136918e88 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 27 Feb 2018 20:52:56 -0300
Subject: [PATCH v1] fix ON CONFLICT DO UPDATE for partitioned tables

---
 src/backend/catalog/pg_inherits.c             |  72 ++++++++
 src/backend/executor/execPartition.c          |  29 +++
 src/backend/executor/nodeModifyTable.c        |  33 +++-
 src/backend/optimizer/util/plancat.c          |   3 +-
 src/backend/parser/analyze.c                  |   7 --
 src/include/catalog/pg_inherits_fn.h          |   3 +
 src/test/regress/expected/insert_conflict.out | 113 ++++++++---
 src/test/regress/sql/insert_conflict.sql      |  75 +++++---
 8 files changed, 311 insertions(+), 24 deletions(-)

diff --git a/src/backend/catalog/pg_inherits.c b/src/backend/catalog/pg_inherits.c
index 5a5beb9273..e1a46bcd2b 100644
--- a/src/backend/catalog/pg_inherits.c
+++ b/src/backend/catalog/pg_inherits.c
@@ -407,6 +407,78 @@ typeInheritsFrom(Oid subclassTypeId, Oid superclassTypeId)
 }
 
 /*
+ * Given a list of index OIDs in the rootrel, return a list of OIDs of the
+ * corresponding indexes in the partrel.  If any index in the rootrel does not
+ * correspond to any index in the child, an error is raised.
+ *
+ * This processes the index list for INSERT ON CONFLICT DO UPDATE at execution
+ * time.  This fact is hardcoded in the error messages.
+ *
+ * XXX this implementation fails if the partition is not a direct child of
+ * rootrel.
+ */
+List *
+MapPartitionIndexList(Relation rootrel, Relation partrel, List *indexlist)
+{
+	List	   *result = NIL;
+	List	   *partIdxs;
+	Relation	inhRel;
+	ScanKeyData key;
+	ListCell   *cell;
+
+	partIdxs = RelationGetIndexList(partrel);
+	/* quick exit if partition has no indexes */
+	if (partIdxs == NIL)
+

Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

2018-02-27 Thread Thomas Munro
On Wed, Jan 3, 2018 at 4:31 AM, Shubham Barai  wrote:
> I have created new isolation tests. Please have a look at
> updated patch.

Hi Shubham,

Could we please have a rebased version of the gin one?

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Reopen logfile on SIGHUP

2018-02-27 Thread Tom Lane
"David G. Johnston"  writes:
> On Tue, Feb 27, 2018 at 4:12 PM, Tom Lane  wrote:
>> IOW, I think a fair response to this is "if you're using logrotate with
>> Postgres, you're doing it wrong".  That was of some use back before we
>> spent so much sweat on the syslogger, but it's not a reasonable setup
>> today.

> A couple of weeks ago a message was posted to general [1] in which I
> concluded the desired behavior is not supported natively.  I'm curious
> whether better advice than mine can be given ...

> https://www.postgresql.org/message-id/flat/CAKoQ0XHAy9De1C8gxUWHSW6w5iKcqX03wyWGe_%2Bc8NxJccCBHw%40mail.gmail.com#cakoq0xhay9de1c8gxuwhsw6w5ikcqx03wywge_+c8nxjccc...@mail.gmail.com

The particular behavior that guy wanted would require some new %-escape
in the log_filename parameter.  Essentially we'd need to keep an
increasing sequence counter for log files and have it wrap around at some
user-specified count (5 in his example), then add a %-escape to include
the counter value in the generated filename.  It's not an unreasonable
idea, if somebody wanted to code it up.
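
A sketch of the counter half of that idea (every name here is invented;
nothing like this exists yet):

	static int	log_seq = 0;	/* invented: per-syslogger sequence counter */

	/* Advance and wrap at the user-specified limit, e.g. 5. */
	log_seq = (log_seq + 1) % log_filename_wrap_count;

	/* An invented %-escape in log_filename would then expand to log_seq,
	 * producing e.g. postgresql-3.log. */
	snprintf(filename, MAXPGPATH, "postgresql-%d.log", log_seq);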

regards, tom lane



Re: Reopen logfile on SIGHUP

2018-02-27 Thread Andres Freund
On 2018-02-27 16:20:28 -0700, David G. Johnston wrote:
> On Tue, Feb 27, 2018 at 4:12 PM, Tom Lane  wrote:
> 
> > IOW, I think a fair response to this is "if you're using logrotate with
> > Postgres, you're doing it wrong".  That was of some use back before we
> > spent so much sweat on the syslogger, but it's not a reasonable setup
> > today.
> >
> 
> A couple of weeks ago a message was posted to general [1] in which I
> concluded the desired behavior is not supported natively.  I'm curious
> whether better advice than mine can be given ...
> 
> https://www.postgresql.org/message-id/flat/CAKoQ0XHAy9De1C8gxUWHSW6w5iKcqX03wyWGe_%2Bc8NxJccCBHw%40mail.gmail.com#cakoq0xhay9de1c8gxuwhsw6w5ikcqx03wywge_+c8nxjccc...@mail.gmail.com

That link appears to be broken.  Real one:
https://www.postgresql.org/message-id/cakoq0xhay9de1c8gxuwhsw6w5ikcqx03wywge_+c8nxjccc...@mail.gmail.com



Re: Reopen logfile on SIGHUP

2018-02-27 Thread David G. Johnston
On Tue, Feb 27, 2018 at 4:12 PM, Tom Lane  wrote:

> IOW, I think a fair response to this is "if you're using logrotate with
> Postgres, you're doing it wrong".  That was of some use back before we
> spent so much sweat on the syslogger, but it's not a reasonable setup
> today.
>

A couple of weeks ago a message was posted to general [1] in which I
concluded the desired behavior is not supported natively.  I'm curious
whether better advice than mine can be given ...

https://www.postgresql.org/message-id/flat/CAKoQ0XHAy9De1C8gxUWHSW6w5iKcqX03wyWGe_%2Bc8NxJccCBHw%40mail.gmail.com#cakoq0xhay9de1c8gxuwhsw6w5ikcqx03wywge_+c8nxjccc...@mail.gmail.com

David J.


Re: Sigh, I broke crake again

2018-02-27 Thread Andrew Dunstan
On Wed, Feb 28, 2018 at 6:08 AM, Tom Lane  wrote:

> I wrote:
> > I had not intended to back-patch, since those changes were just cosmetic,
> > but it might be the best way to preserve the XversionUpgrade tests.
>
> After closer study, it seems like the least painful answer is:
>
> 1. In HEAD, revert the renaming of int44in/int44out to
> city_budget_in/_out; the gain in obviousness isn't worth the screwing
> around we'd have to do to cope with it in cross-version upgrade testing.
>
> 2. In the back branches, drop the CREATE FUNCTION commands for
> funny_dup17 and boxarea, which are used nowhere so they can easily
> be dispensed with.
>
> We could ask TestUpgradeXversion.pm to drop the latter two functions,
> similarly to what it's already doing for oldstyle_length.  But
> oldstyle_length was actually being used for something in the older
> branches so it was worth the trouble to cope with a cross-branch
> difference.  These two don't seem worth changing the buildfarm for.
I see you've reverted the change that caused the issue. The fact is that
we're always going to get this sort of issue from time to time with
cross-version testing. FWIW, my experience in testing the module offline
over an extended period (several years) is that the breakage isn't very
frequent. I'm always willing to make small tweaks to accommodate things
like name changes. It's not terribly difficult. I think it's sufficiently
important that we have good cross-version upgrade testing that we can
tolerate a little temporary breakage occasionally.

cheers

andrew


Re: Reopen logfile on SIGHUP

2018-02-27 Thread Tom Lane
Andres Freund  writes:
> On 2018-02-27 22:32:41 +, Greg Stark wrote:
>> On 27 February 2018 at 14:41, Anastasia Lubennikova
>>  wrote:
>>> Small patch in the attachment implements logfile reopening on SIGHUP.

>> HUP will cause Postgres to reload its config files. That seems like a
>> fine time to reopen the log files as well but it would be nice if
>> there was also some way to get it to *just* do that and not reload the
>> config files.

> Is that an actually important thing to be able to do?

Yeah, after further consideration I'm having a hard time seeing the point
of this patch.  The syslogger already has plenty sufficient knobs for
controlling when it rotates its output file.  If you're not using those,
I think the answer is to start using them, not to make the syslogger's
behavior even more complicated so you can avoid learning about them.

IOW, I think a fair response to this is "if you're using logrotate with
Postgres, you're doing it wrong".  That was of some use back before we
spent so much sweat on the syslogger, but it's not a reasonable setup
today.

There'd be a point to this perhaps in configurations *not* using the
syslogger, but it's patching the wrong place for that case.  (I'm
not sure there is a right place, unfortunately --- we don't have any
good way to redirect postmaster stderr after launch, since so many
processes would have to individually redirect.)

regards, tom lane



Re: Unexpected behavior with transition tables in update statement trigger

2018-02-27 Thread Tom Kazimiers

On Wed, Feb 28, 2018 at 10:27:23AM +1300, Thomas Munro wrote:

> Tom K, if you need a workaround before 10.4 comes out in May[1], you
> could try selecting the whole transition table into a CTE up front.
> Something like WITH my_copy AS (SELECT * FROM new_table) SELECT * FROM
> my_copy UNION ALL SELECT * FROM my_copy should work.
>
> [1] https://www.postgresql.org/developer/roadmap/


Thanks, that's what I am going to do.

Cheers,
Tom



Re: Reopen logfile on SIGHUP

2018-02-27 Thread Tom Lane
Greg Stark  writes:
> On 27 February 2018 at 14:41, Anastasia Lubennikova
>  wrote:
>> Small patch in the attachment implements logfile reopening on SIGHUP.
>> It only affects the file accessed by the logging collector, whose name
>> you can check with pg_current_logfile().

> HUP will cause Postgres to reload its config files. That seems like a
> fine time to reopen the log files as well but it would be nice if
> there was also some way to get it to *just* do that and not reload the
> config files.

There's already a pretty substantial amount of logic in syslogger.c
to decide whether to force a rotation if any of the logging collection
parameters changed.  I don't especially like the proposed patch, aside
from its lack of error handling, because it is completely disconnected
from that logic and thus is likely to produce unnecessary thrashing
of the output file.

> I wonder if it would be easiest to just have the syslogger watch for
> some other signal as well (I'm guessing the syslogger doesn't use
> relcache invalidations so it could reuse USR1 for example). That would
> be a bit inconvenient as the admins would have to find the syslogger
> and deliver the signal directly, rather than through the postmaster
> but it would be pretty easy for them.

It already does treat SIGUSR1 as a log rotation request.  Apparently
the point of this patch is that some people don't find that easy enough
to use, which is fair, because finding out the collector's PID from
outside isn't very easy.

regards, tom lane



Re: Reopen logfile on SIGHUP

2018-02-27 Thread Andres Freund
On 2018-02-27 22:32:41 +, Greg Stark wrote:
> On 27 February 2018 at 14:41, Anastasia Lubennikova
>  wrote:
> 
> > Small patch in the attachment implements logfile reopening on SIGHUP.
> > It only affects the file accessed by the logging collector, whose name
> > you can check with pg_current_logfile().
> 
> HUP will cause Postgres to reload its config files. That seems like a
> fine time to reopen the log files as well but it would be nice if
> there was also some way to get it to *just* do that and not reload the
> config files.

Is that an actually important thing to be able to do?


> I wonder if it would be easiest to just have the syslogger watch for
> some other signal as well (I'm guessing the syslogger doesn't use
> relcache invalidations so it could reuse USR1 for example). That would
> be a bit inconvenient as the admins would have to find the syslogger
> and deliver the signal directly, rather than through the postmaster
> but it would be pretty easy for them.

-many. We have been "signal starved" a number of times, and definitely
shouldn't waste one on this.

- Andres



Re: Reopen logfile on SIGHUP

2018-02-27 Thread Greg Stark
On 27 February 2018 at 14:41, Anastasia Lubennikova
 wrote:

> Small patch in the attachment implements logfile reopening on SIGHUP.
> It only affects the file accessed by the logging collector, whose name
> you can check with pg_current_logfile().

HUP will cause Postgres to reload its config files. That seems like a
fine time to reopen the log files as well but it would be nice if
there was also some way to get it to *just* do that and not reload the
config files.

I wonder if it would be easiest to just have the syslogger watch for
some other signal as well (I'm guessing the syslogger doesn't use
relcache invalidations so it could reuse USR1 for example). That would
be a bit inconvenient as the admins would have to find the syslogger
and deliver the signal directly, rather than through the postmaster
but it would be pretty easy for them.

-- 
greg



Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes.

2018-02-27 Thread Peter Geoghegan
On Thu, Feb 22, 2018 at 6:32 AM, Robert Haas  wrote:
>> Attached patch does this. I cannot recreate this issue locally, but
>> this should still fix it on skink.
>
> Committed.

Looks like it worked.

-- 
Peter Geoghegan



Re: postgres_fdw: perform UPDATE/DELETE .. RETURNING on a join directly

2018-02-27 Thread Tom Lane
Etsuro Fujita  writes:
> (2018/02/11 6:24), Tom Lane wrote:
>> However, jaguarundi still shows a problem:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguarundi&dt=2018-02-10%2008%3A41%3A32

> I ran the postgres_fdw regression test with no sleep two times in a 
> CLOBBER_CACHE_ALWAYS-enabled build, and then the regression test with 
> the sleep (60 seconds) two times, but I couldn't reproduce that in either
> case.  I suspect the changes in the order of the RETURNING output there
> were still caused by autovacuum kicking in partway through the run.  So
> to make the regression test more stable against autovacuum, I'd propose 
> to modify the regression test to disable autovacuum for the remote table 
> (ie "S 1"."T 1") (and perform VACUUM ANALYZE to that table manually 
> instead) in hopes of getting that fixed.  Attached is a patch for that. 
>   I think changes added by the previous patch wouldn't make sense 
> anymore, so I removed those changes.

Ping?  We're still seeing those failures on jaguarundi.

regards, tom lane



Re: [HACKERS] MERGE SQL Statement for PG11

2018-02-27 Thread Peter Geoghegan
On Tue, Feb 27, 2018 at 10:19 AM, Pavan Deolasee
 wrote:
>> I attach a rough example of this, that uses plpgsql.
>
> Thanks for writing the sample code. I understand you probably don't mean to
> suggest that we need to mimic the behaviour of the plpgsql code and the
> semantics offered by MERGE would most likely be different than what the
> plpgsql sample does. Because there are several problems with the plpgsql
> code:
>
> - It would never turn a MATCHED case into a NOT MATCHED case because of
> concurrent UPDATE/DELETE
> - The WHERE clauses attached to the UPDATE/DELETE statement should be using
> the quals attached to the WHEN clauses to ensure they are evaluated on the
> new version of the row, if needed.

It's definitely possible to poke holes in my plpgsql example, most (or
all) of which are not fixable within the confines of what plpgsql can
do. I still think that it's really useful to frame the discussion with
examples of the kind of procedural code MERGE replaces, though,
because:

* That's the explicit purpose of MERGE, according to the SQL standard.
Everyone is very clear that the join outputs rows that are separately
inserted, updated, or deleted (additional "WHEN ... AND" quals are
evaluated separately). We should be clear on that, too. We're quite
specifically replacing procedural code that follows a general pattern.
We might even give an example of such procedural code in the docs, as
SQL Server does.

* It shows that in some ways, the INSERT/UPDATE/DELETE parts are
separate "zero or one row" statements. They could do their own
snapshot acquisitions within RC mode, for example. Also, you don't get
the ON CONFLICT behavior with excluded being affected by BEFORE ROW
triggers within UPDATE expression evaluation for ON CONFLICT DO
UPDATE.

* My example is buggy, but seemingly only in a way that is just about
unavoidable -- all the bugs are in RC mode with concurrent writes.
Poking holes in what I came up with is actually useful, and may be
less confusing than discussing the same issues some other way.

There are very few users in the world that would understand these
issues. Moreover, if all affected users that have this kind of code in
the wild were to somehow magically develop a strong understanding of
this stuff overnight, even then I'm not sure that much would change.
They still use RC mode for all the usual reasons, and they mostly
won't have any way of fixing concurrency issues that is actually
acceptable to them. In short, I'm not sure that I can fix the problems
with my plpgsql code, so what chance do they have of fixing theirs?

>> >> Some novel new behavior -- "EPQ with a twist"-- is clearly necessary.
>> >> I feel a bit uneasy about it because anything that anybody suggests is
>> >> likely to be at least a bit arbitrary (EPQ itself is kind of
>> >> arbitrary). We only get to make a decision on how "EPQ with a twist"
>> >> will work once, and that should be a decision that is made following
>> >> careful deliberation. Ambiguity is much more likely to kill a patch
>> >> than a specific technical defect, at least in my experience. Somebody
>> >> can usually just fix a technical defect.
>
>
> TBH that's one reason why I like Simon's proposed behaviour of throwing
> errors in case of corner cases. I am not suggesting that's what we do at the
> end, but it's definitely worth considering.

I now feel like Simon's suggestion of throwing an error in corner
cases isn't so bad. It still seems like we could do better, but the
more I think about it, the less that seems like a cop-out. My reasons
are:

* As I already said, my plpgsql example, while buggy, might actually
be the safest way to get the intended behavior in RC mode today. I can
definitely imagine a way of dealing with concurrency that is both
safer and less prone to throwing weird errors, but the fact remains
that my example is the state of the art here, in a way.

* Simon has already introduced something that looks like "EPQ with a
twist" to me -- the steps that happen before he even raises this
error. IOW, he does something extra that is EPQ-like. He likely does a
lot better than my plpgsql example manages to, I think.

* I suspect that the kind of users that really like the ON CONFLICT DO
UPDATE's simplicity (in terms of what it guarantees them) are unlikely
to care about MERGE at all. The kind of user that cares about MERGE is
likely to have at least heard of the isolation levels.

* I do see an important difference between making likely-unexpected
errors in RC mode very unlikely, and making them *impossible*. This
patch is not ON CONFLICT DO UPDATE, though, and that strong guarantee
simply isn't on the table.

* We can all agree that *not* raising an error in the specific way
Simon proposes is possible, somehow or other. We also all agree that
avoiding the broader category of RC errors can only be taken so far
(e.g. in any event duplicate violations errors are entirely possible,
in RC mode, when a MERGE inserts a row). So this is a

Re: Unexpected behavior with transition tables in update statement trigger

2018-02-27 Thread Tom Kazimiers

On Tue, Feb 27, 2018 at 03:58:14PM -0500, Tom Lane wrote:

> Thomas Munro  writes:
>> Here's a new version with tuplestore_select_read_pointer() added in
>> another place where it was lacking, and commit message.  Moving to
>> -hackers, where patches go.
>
> Pushed, along with a regression test based on your example.
> Unfortunately, this came in a bit too late for this week's releases :-(


Ah, so close. :-) Regardless, thanks to both of you for fixing this and
committing the fix to master.  I am looking forward to the release that
includes it.


Cheers,
Tom



Re: Unexpected behavior with transition tables in update statement trigger

2018-02-27 Thread Thomas Munro
On Wed, Feb 28, 2018 at 9:58 AM, Tom Lane  wrote:
> Thomas Munro  writes:
>> Here's a new version with tuplestore_select_read_pointer() added in
>> another place where it was lacking, and commit message.  Moving to
>> -hackers, where patches go.
>
> Pushed, along with a regression test based on your example.
> Unfortunately, this came in a bit too late for this week's releases :-(

Thanks!

Tom K, if you need a workaround before 10.4 comes out in May[1], you
could try selecting the whole transition table into a CTE up front.
Something like WITH my_copy AS (SELECT * FROM new_table) SELECT * FROM
my_copy UNION ALL SELECT * FROM my_copy should work.

[1] https://www.postgresql.org/developer/roadmap/

-- 
Thomas Munro
http://www.enterprisedb.com



Re: PATCH: Exclude unlogged tables from base backups

2018-02-27 Thread David Steele
On 1/29/18 8:10 PM, Masahiko Sawada wrote:
> On Tue, Jan 30, 2018 at 5:45 AM, Adam Brightwell
>>
>> If it is agreed that the temp file exclusion should be submitted as a
>> separate patch, then I will mark 'ready for committer'.
> 
> Agreed, please mark this patch as "Ready for Committer".

Attached is a rebased patch that applies cleanly.

Thanks,
-- 
-David
da...@pgmasters.net
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 3cec9e0b0c..a46c857e48 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2552,6 +2552,12 @@ The commands accepted in walsender mode are:
         with <filename>pgsql_tmp</filename>.
        </para>
       </listitem>
+      <listitem>
+       <para>
+        Unlogged relations, except for the init fork which is required to
+        recreate the (empty) unlogged relation on recovery.
+       </para>
+      </listitem>
       <listitem>
        <para>
         <filename>pg_wal</filename>, including subdirectories. If the backup is run
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 185f32a5f9..eb6eb7206d 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -26,6 +26,7 @@
 #include "nodes/pg_list.h"
 #include "pgtar.h"
 #include "pgstat.h"
+#include "port.h"
 #include "postmaster/syslogger.h"
 #include "replication/basebackup.h"
 #include "replication/walsender.h"
@@ -33,6 +34,7 @@
 #include "storage/dsm_impl.h"
 #include "storage/fd.h"
 #include "storage/ipc.h"
+#include "storage/reinit.h"
 #include "utils/builtins.h"
 #include "utils/ps_status.h"
 #include "utils/relcache.h"
@@ -958,12 +960,44 @@ sendDir(const char *path, int basepathlen, bool sizeonly, List *tablespaces,
 	char		pathbuf[MAXPGPATH * 2];
 	struct stat statbuf;
 	int64		size = 0;
+	const char *lastDir;		/* Split last dir from parent path. */
+	bool		isDbDir = false;	/* Does this directory contain relations? */
+
+	/*
+	 * Determine if the current path is a database directory that can
+	 * contain relations.
+	 *
+	 * Start by finding the location of the delimiter between the parent
+	 * path and the current path.
+	 */
+	lastDir = last_dir_separator(path);
+
+	/* Does this path look like a database path (i.e. all digits)? */
+	if (lastDir != NULL &&
+		strspn(lastDir + 1, "0123456789") == strlen(lastDir + 1))
+	{
+		/* Part of path that contains the parent directory. */
+		int			parentPathLen = lastDir - path;
+
+		/*
+		 * Mark path as a database directory if the parent path is either
+		 * $PGDATA/base or a tablespace version path.
+		 */
+		if (strncmp(path, "./base", parentPathLen) == 0 ||
+			(parentPathLen >= (sizeof(TABLESPACE_VERSION_DIRECTORY) - 1) &&
+			 strncmp(lastDir - (sizeof(TABLESPACE_VERSION_DIRECTORY) - 1),
+					 TABLESPACE_VERSION_DIRECTORY,
+					 sizeof(TABLESPACE_VERSION_DIRECTORY) - 1) == 0))
+			isDbDir = true;
+	}
 
 	dir = AllocateDir(path);
 	while ((de = ReadDir(dir, path)) != NULL)
 	{
 		int			excludeIdx;
 		bool		excludeFound;
+		ForkNumber	relForkNum; /* Type of fork if file is a relation */
+		int			relOidChars;	/* Chars in filename that are the rel oid */
 
 		/* Skip special stuff */
 		if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
@@ -1007,6 +1041,36 @@ sendDir(const char *path, int basepathlen, bool sizeonly, List *tablespaces,
 		if (excludeFound)
 			continue;
 
+		/* Exclude all forks for unlogged tables except the init fork */
+		if (isDbDir &&
+			parse_filename_for_nontemp_relation(de->d_name, &relOidChars,
+												&relForkNum))
+		{
+			/* Never exclude init forks */
+			if (relForkNum != INIT_FORKNUM)
+			{
+				char		initForkFile[MAXPGPATH];
+				char		relOid[OIDCHARS + 1];
+
+				/*
+				 * If any other type of fork, check if there is an init fork
+				 * with the same OID. If so, the file can be excluded.
+				 */
+				strncpy(relOid, de->d_name, relOidChars);
+				snprintf(initForkFile, sizeof(initForkFile), "%s/%s_init",
+						 path, relOid);
+
+

Re: [HACKERS] [POC] Faster processing at Gather node

2018-02-27 Thread Andres Freund
Hi,

On 2018-02-27 16:03:17 -0500, Robert Haas wrote:
> On Wed, Feb 7, 2018 at 1:41 PM, Andres Freund  wrote:
> > Well, it's more than just systems like that - for 64bit atomics we
> > sometimes do fall back to spinlock based atomics on 32bit systems, even
> > if they support 32 bit atomics.
> 
> I built with -m32 on my laptop and tried "select aid, count(*) from
> pgbench_accounts group by 1 having count(*) > 1" on pgbench at scale
> factor 100 with pgbench_accounts_pkey dropped and
> max_parallel_workers_per_gather set to 10 on (a) commit
> 0b5e33f667a2042d7022da8bef31a8be5937aad1 (I know this is a little old,
> but I think it doesn't matter), (b) same plus
> shm-mq-less-spinlocks-v3, and (c) same plus shm-mq-less-spinlocks-v3
> and shm-mq-reduce-receiver-latch-set-v2.
> 
> (a) 16563.790 ms, 16625.257 ms, 16496.062 ms
> (b) 17217.051 ms, 17157.745 ms, 17225.755 ms [median to median +3.9% vs. (a)]
> (c) 15491.947 ms, 15455.840 ms, 15452.649 ms [median to median -7.0%
> vs. (a), -10.2% vs (b)]
> 
> Do you think that's a problem?  If it is, what do you think we should
> do about it?  It seems to me that it's probably OK because (1) with
> both patches we still come out ahead and (2) 32-bit systems will
> presumably continue to become rarer as time goes on, but you might
> disagree.

No, I think this is fairly reasonable. A fairly extreme use case on a 32
bit machine regressing a bit, while gaining performance in the other case?
That works for me.


> OK, I'll try to check how feasible that would be.

cool.

Greetings,

Andres Freund



Re: [HACKERS] [POC] Faster processing at Gather node

2018-02-27 Thread Robert Haas
On Wed, Feb 7, 2018 at 1:41 PM, Andres Freund  wrote:
> Well, it's more than just systems like that - for 64bit atomics we
> sometimes do fall back to spinlock based atomics on 32bit systems, even
> if they support 32 bit atomics.

I built with -m32 on my laptop and tried "select aid, count(*) from
pgbench_accounts group by 1 having count(*) > 1" on pgbench at scale
factor 100 with pgbench_accounts_pkey dropped and
max_parallel_workers_per_gather set to 10 on (a) commit
0b5e33f667a2042d7022da8bef31a8be5937aad1 (I know this is a little old,
but I think it doesn't matter), (b) same plus
shm-mq-less-spinlocks-v3, and (c) same plus shm-mq-less-spinlocks-v3
and shm-mq-reduce-receiver-latch-set-v2.

(a) 16563.790 ms, 16625.257 ms, 16496.062 ms
(b) 17217.051 ms, 17157.745 ms, 17225.755 ms [median to median +3.9% vs. (a)]
(c) 15491.947 ms, 15455.840 ms, 15452.649 ms [median to median -7.0%
vs. (a), -10.2% vs (b)]

Do you think that's a problem?  If it is, what do you think we should
do about it?  It seems to me that it's probably OK because (1) with
both patches we still come out ahead and (2) 32-bit systems will
presumably continue to become rarer as time goes on, but you might
disagree.

>> > Hm, do all paths here guarantee that mq->mq_detached won't be stored on
>> > the stack / register in the first iteration, and then not reread any
>> > further? I think it's fine because every branch of the if below ends up
>> > in a syscall / barrier, but it might be worth noting on that here.
>>
>> Aargh.  I hate compilers.  I added a comment.  Should I think about
>> changing mq_detached to pg_atomic_uint32 instead?
>
> I think a pg_compiler_barrier() would suffice to alleviate my concern,
> right? If you wanted to go for an atomic, using pg_atomic_flag would
> probably be more appropriate than pg_atomic_uint32.

Hmm, all right, I'll add pg_compiler_barrier().
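
Concretely, the hazard and the fix look something like this (toy names,
not the actual shm_mq.c code):

#include "postgres.h"
#include "port/atomics.h"		/* pg_compiler_barrier() */

typedef struct toy_mq
{
	bool		mq_detached;	/* set to true by the departing peer */
} toy_mq;

static void
toy_wait_for_peer(toy_mq *mq)
{
	while (!mq->mq_detached)
	{
		/* ... move a bounded amount of data through the ring ... */

		/*
		 * Without this, the compiler may legally keep mq_detached in a
		 * register across iterations and never re-read it, missing the
		 * peer's detach.  Branches that end in a syscall or a real memory
		 * barrier are already safe, since those also act as compiler
		 * barriers.
		 */
		pg_compiler_barrier();
	}
}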

>> >> - /* Write as much data as we can via a single 
>> >> memcpy(). */
>> >> + /*
>> >> +  * Write as much data as we can via a single 
>> >> memcpy(). Make sure
>> >> +  * these writes happen after the read of 
>> >> mq_bytes_read, above.
>> >> +  * This barrier pairs with the one in 
>> >> shm_mq_inc_bytes_read.
>> >> +  */
>> >
>> > s/above/above. Otherwise a newer mq_bytes_read could become visible
>> > before the corresponding reads have fully finished./?
>>
>> I don't find that very clear.  A newer mq_bytes_read could become
>> visible whenever, and a barrier doesn't prevent that from happening.
>
> Well, my point was that the barrier prevents the the write to
> mq_bytes_read becoming visible before the corresponding reads have
> finished. Which then would mean the memcpy() could overwrite them. And a
> barrier *does* prevent that from happening.

I think we're talking about the same thing, but not finding each
others' explanations very clear.

>> Hmm, I'm not sure I understand what you're suggesting, here.  In
>> general, we return with the data for the current message unconsumed,
>> and then consume it the next time the function is called, so that
>> (except when the message wraps the end of the buffer) we can return a
>> pointer directly into the buffer rather than having to memcpy().  What
>> this patch does is postpone consuming the data further, either until
>> we can free up at least a quarter of the ring buffer or until we need
>> to wait for more data. It seemed worthwhile to free up space in the
>> ring buffer occasionally even if we weren't to the point of waiting
>> yet, so that the sender has an opportunity to write new data into that
>> space if it happens to still be running.
>
> What I'm trying to suggest is that instead of postponing an update of
> mq_bytes_read (by storing amount of already performed reads in
> mqh_consume_pending), we continue to update mq_bytes_read but only set
> the latch if your above thresholds are crossed. That way a burst of
> writes can fully fill the ringbuffer, but the cost of doing a SetLatch()
> is amortized. In my testing SetLatch() was the expensive part, not the
> necessary barriers in shm_mq_inc_bytes_read().

OK, I'll try to check how feasible that would be.
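
Roughly, the shape I'd try is something like this (hypothetical field
names; the real shm_mq struct differs):

#include "postgres.h"
#include "port/atomics.h"		/* pg_read_barrier() */
#include "storage/latch.h"

typedef struct toy_mq
{
	Size		ring_size;				/* bytes in the ring buffer */
	Size		bytes_read;				/* consumer position, read by sender */
	Size		bytes_read_at_wakeup;	/* consumer-private bookkeeping */
	Latch	   *sender_latch;
} toy_mq;

/* Publish consumed bytes promptly, but wake the sender only occasionally. */
static void
toy_inc_bytes_read(toy_mq *mq, Size n)
{
	pg_read_barrier();			/* ring reads must finish before the update */
	mq->bytes_read += n;		/* sender can reuse this space right away */

	/* Amortize the expensive SetLatch() over a quarter of the ring. */
	if (mq->bytes_read - mq->bytes_read_at_wakeup >= mq->ring_size / 4)
	{
		mq->bytes_read_at_wakeup = mq->bytes_read;
		SetLatch(mq->sender_latch);
	}
}

A real version would still need to set the latch unconditionally before
sleeping, so a blocked sender can't be stranded below the threshold.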

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Registering LWTRANCHE_PARALLEL_HASH_JOIN

2018-02-27 Thread Thomas Munro
On Wed, Feb 28, 2018 at 8:39 AM, Robert Haas  wrote:
> On Sat, Feb 10, 2018 at 6:07 PM, Thomas Munro
>  wrote:
>> I forgot to register a display name for LWTRANCHE_PARALLEL_HASH_JOIN,
>> the tranche ID used by the LWLock that Parallel Hash uses when handing
>> out chunks of memory.  Please see attached.
>
> I think that you need to insert some weasel words into the
> documentation for this, because I don't think it's really accurate to
> say that it's only used when trying to acquire a new chunk of memory.
>
> Or maybe I'm wrong and it's altogether accurate ... but
> ExecParallelHashMergeCounters doesn't look like an allocation to me,
> and ExecParallelHashTuplePrealloc doesn't really look like an
> allocation either.

Ok.  How about this?

I noticed that some of the descriptions don't attempt to explain what
activity the lock protects at all, they just say "Waiting for $BLAH
lock".  I went the other way and covered the various different uses.
There are four uses for the lock but only three things in my list,
because I think "allocate" covers both ExecParallelHashTupleAlloc()
and ExecParallelHashTuplePrealloc().
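
For reference, the registration itself is a one-liner at startup;
roughly like this (the extension-side equivalent is shown for contrast,
and the exact display name is whatever the patch settles on):

#include "postgres.h"
#include "storage/lwlock.h"

static void
register_tranches(void)
{
	int			tranche_id;

	/* In core: give the fixed tranche ID a display name at startup. */
	LWLockRegisterTranche(LWTRANCHE_PARALLEL_HASH_JOIN, "parallel_hash_join");

	/* An extension would use a dynamically assigned ID instead: */
	tranche_id = LWLockNewTrancheId();
	LWLockRegisterTranche(tranche_id, "my_extension_lock");
}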

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Register-and-document-LWTRANCHE_PARALLEL_HASH_JOI-v2.patch
Description: Binary data


Re: [HACKERS] [bug-fix] Cannot select big bytea values (~600MB)

2018-02-27 Thread Robert Haas
On Tue, Feb 27, 2018 at 2:17 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> +1.  We don't have to support everything, but things that don't work
>> should fail on insertion, not retrieval.  Otherwise what we have is
>> less a database and more a data black hole.
>
> That sounds nice as a principle but I'm not sure how workable it really
> is.  Do you want to reject text strings that fit fine in, say, LATIN1
> encoding, but might be overlength if some client tries to read them in
> UTF8 encoding?  (bytea would have a comparable problem with escape vs hex
> representation, for instance.)  Should the limit vary depending on how
> many columns are in the table?  Should we account for client-side tuple
> length restrictions?

I suppose what I really want is for the limit on how big the retrieved
data can be to be large enough that people stop hitting it.

> Anyway, as Alvaro pointed out upthread, we've been down this particular
> path before and it didn't work out.  We need to learn something from that
> failure and decide how to move forward.

Yep.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Unexpected behavior with transition tables in update statement trigger

2018-02-27 Thread Tom Lane
Thomas Munro  writes:
> Here's a new version with tuplestore_select_read_pointer() added in
> another place where it was lacking, and commit message.  Moving to
> -hackers, where patches go.

Pushed, along with a regression test based on your example.
Unfortunately, this came in a bit too late for this week's releases :-(

regards, tom lane



Re: PATCH: Configurable file mode mask

2018-02-27 Thread David Steele
Hi Michael,

Thanks for having a look at the patches.

On 1/30/18 3:01 AM, Michael Paquier wrote:
> On Mon, Jan 29, 2018 at 04:29:08PM -0500, David Steele wrote:
>>
>> Adds a *very* basic test suite for pg_resetwal.  I was able to make this
>> utility core dump (floating point errors) pretty easily with empty or
>> malformed pg_control files so I focused on WAL reset functionality plus
>> the basic help/command tests that every utility has.
> 
> A little is better than nothing, so that's an improvement, thanks!
> 
> +# Remove WAL from pg_wal and make sure it gets rebuilt
> +$node->stop;
> +
> +unlink("$pgwal/00010001") == 1
> +   or die("unable to remove 00010001");
> +
> +is_deeply(
> +   [sort(slurp_dir($pgwal))], [sort(qw(. .. archive_status))], 'no WAL');
> This is_deeply test serves little purpose.  The segment gets removed
> directly so I would just remove it.

The purpose of this test is to ensure nothing else is in the directory.
As the tests get more complex I think keeping track of the state will be
important.  In other words, this is really an assertion that the current
test state is correct, not that the prior command succeeded.

> Another test that could be easily included is with -o, then create a
> table and check for its OID value.  Also for -x, then just use
> txid_current to check the value consistency.  For -c, with
> track_commit_timestamp set to on, you can set a couple of values and
> then check for pg_last_committed_xact() which would be NULL if you use
> an XID older than the minimum set for example.  All those are simple
> enough so they could easily be added in the first version of this test
> suite.

I would prefer to leave this for another patch.

> You need to update src/bin/pg_resetxlog/.gitignore to include
> tmp_check/.

Added.

>> 2) 02-mkdir
>>
>> Converts mkdir() calls to MakeDirectoryDefaultPerm() if the original
>> call used default permissions.
> 
> So the only call not converted to that API is in ipc.c...  OK, this one
> is pretty simple so nothing really to say about it, the real meat comes
> with patch 3.  I would rather see with a good eye if this patch
> introduces file_perm.h which includes both PG_FILE_MODE_DEFAULT and
> PG_DIR_MODE_DEFAULT, and have the frontends use those values.  This
> would make the switch to 0003 a bit easier if you look at pg_resetwal's
> diffs for example.

Agreed.  I thought about this originally but could not come up with a
good split.  I spent some time on it and came up with something that I
think works (and would make sense to commit separately).

>> 3) 03-group
>>
>> Allow group access on PGDATA.  This includes backend changes, utility
>> changes (initdb, pg_ctl, pg_upgrade, etc.) and tests for each utility.
> 
> +++ b/src/include/common/file_perm.h
> + *
> + * This module is located in common so the backend can use the constants.
> + *
> This is the case of more or less all the content in src/common/, except
> for the things which are frontend-only, so this comment could just be
> removed.

Removed.

> +command_ok(
> +   ['chmod', "-R", 'g+rX', "$pgdata"],
> +   'add group perms to PGDATA');
>
> That would blow up on Windows.  You should replace that by perl's chmod
> and look at its return result for sanity checks, likely with ($cnt != 0).
> And let's not use icacls please...

Fixed with a new function, add_pg_data_group_perm().

> +if ($params{has_group_access})
> +{
> +push(@{$params{extra}}, '-g');
> +}
> No need to introduce a new parameter here, please use directly extra =>
> [ '-g' ] in the routines calling PostgresNode::init.

Other code needs to know if group access is enabled on the node (see
RewindTest.pm) so it's not just a matter of passing the option.

> The efforts put in the tests are good.

Thanks!

New patches are attached.  The end result is almost identical to v6 but
code was moved from 03 to 02 as per Michael's suggestion.

1) 01-pgresetwal-test

Adds a very basic test suite for pg_resetwal.

2) 02-file-perm

Adds a new header (file_perm.h) and makes all front-end utilities use
the new constants for setting umask.  Converts mkdir() calls in the
backend to MakeDirectoryDefaultPerm() if the original
call used default permissions.  Adds tests to make sure permissions in
PGDATA are set correctly by the front-end tools.

3) 03-group

Allow group access on PGDATA.  This includes backend changes, utility
changes (initdb, pg_ctl, pg_upgrade, etc.) and tests for each utility.

Thanks!
-- 
-David
da...@pgmasters.net
diff --git a/src/bin/pg_resetwal/.gitignore b/src/bin/pg_resetwal/.gitignore
index 236abb4323..a950255fd7 100644
--- a/src/bin/pg_resetwal/.gitignore
+++ b/src/bin/pg_resetwal/.gitignore
@@ -1 +1,2 @@
 /pg_resetwal
+/tmp_check
diff --git a/src/bin/pg_resetwal/Makefile b/src/bin/pg_resetwal/Makefile
index 5ab7ad33e0..13c9ca6279 100644
--- a/src/bin/pg_resetwal/Makefile
+++ b/src/bin/pg_resetwal/Makefile
@@ -33,3 +33,9 @@ uninstall:
 
 clean distclean maintai

Re: Unexpected behavior with transition tables in update statement trigger

2018-02-27 Thread Tom Kazimiers

On Tue, Feb 27, 2018 at 02:52:02PM +1300, Thomas Munro wrote:

On Tue, Feb 27, 2018 at 4:18 AM, Tom Kazimiers  wrote:
It would be great if this or a similar fix would make it into the 
next official release.


Here's a new version with tuplestore_select_read_pointer() added in
another place where it was lacking, and commit message.  Moving to
-hackers, where patches go.


Thanks and just to confirm the obvious: the new patch still fixes this 
bug in both my test trigger function and my real trigger function.



Here's a shorter repro.  On master it prints:

NOTICE:  count = 1
NOTICE:  count union = 1

With the patch the second number is 2, as it should be.

CREATE TABLE test (i int);
INSERT INTO test VALUES (1);

CREATE OR REPLACE FUNCTION my_trigger_fun() RETURNS trigger
LANGUAGE plpgsql AS
$$
BEGIN
  RAISE NOTICE 'count = %', (SELECT COUNT(*) FROM new_test);
  RAISE NOTICE 'count union = %', (SELECT COUNT(*)
    FROM (SELECT * FROM new_test UNION ALL SELECT * FROM new_test) ss);
  RETURN NULL;
END;
$$;

CREATE TRIGGER my_trigger AFTER UPDATE ON test
REFERENCING NEW TABLE AS new_test OLD TABLE as old_test
FOR EACH STATEMENT EXECUTE PROCEDURE my_trigger_fun();

UPDATE test SET i = i;


That's a much cleaner repro (and test case I suppose), thanks.

Tom



Re: Let's remove DSM_INPL_NONE.

2018-02-27 Thread Tom Lane
Andres Freund  writes:
> On 2018-02-27 14:41:47 -0500, Tom Lane wrote:
>> What I didn't understand about it was what kind of testing this'd make
>> harder.  If we desupport dynamic_shared_memory_type=none, there aren't
>> any code paths that need to cope with the case, and we should just
>> remove any code that thereby becomes unreachable.

> What I'm concerned about isn't so much testing paths specific to
> dynamic_shared_memory_type=none, but paths where we currently need
> fallbacks for the case we couldn't actually allocate dynamic shared
> memory. Which I think we at least somewhat gracefully need to deal with.

Ah.  That's a fair point, but I do not think
dynamic_shared_memory_type=none is a good substitute for having a way to
provoke allocation failures.  That doesn't let you test recovery from
situations where your first allocation works and second one fails, for
example; and cleanup from that sort of case is likely to be more
complicated than the trivial case.

regards, tom lane



Re: Let's remove DSM_INPL_NONE.

2018-02-27 Thread Andres Freund
On 2018-02-27 14:41:47 -0500, Tom Lane wrote:
> Robert Haas  writes:
> > On Thu, Feb 15, 2018 at 1:00 PM, Andres Freund  wrote:
> >> Hm, I'm not quite convinced by this. Seems to make testing a bunch of
> >> codepaths harder.  I think it's fine to say that pg doesn't work
> >> correctly with them disabled though.
> 
> > I'm not sure I understand this.  Do you mean we should just add a
> > disclaimer to the documentation?
> 
> What I didn't understand about it was what kind of testing this'd make
> harder.  If we desupport dynamic_shared_memory_type=none, there aren't
> any code paths that need to cope with the case, and we should just
> remove any code that thereby becomes unreachable.

What I'm concerned about isn't so much testing paths specific to
dynamic_shared_memory_type=none, but paths where we currently need
fallbacks for the case we couldn't actually allocate dynamic shared
memory. Which I think we at least somewhat gracefully need to deal with.

Greetings,

Andres Freund



Re: Wait event names mismatch: oldserxid

2018-02-27 Thread Robert Haas
On Fri, Feb 9, 2018 at 8:53 AM, Michael Paquier  wrote:
> So the docs look correct to me on this side.  What I find weird is the
> phrasing to define oldserxid.  Instead of that, the current description:
> Waiting to I/O on an oldserxid buffer.
> I would suggest "waiting *for* I/O"
>
> A small patch is attached.

Agreed.  Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Let's remove DSM_INPL_NONE.

2018-02-27 Thread Tom Lane
Robert Haas  writes:
> On Thu, Feb 15, 2018 at 1:00 PM, Andres Freund  wrote:
>> Hm, I'm not quite convinced by this. Seems to make testing a bunch of
>> codepaths harder.  I think it's fine to say that pg doesn't work
>> correctly with them disabled though.

> I'm not sure I understand this.  Do you mean we should just add a
> disclaimer to the documentation?

What I didn't understand about it was what kind of testing this'd make
harder.  If we desupport dynamic_shared_memory_type=none, there aren't
any code paths that need to cope with the case, and we should just
remove any code that thereby becomes unreachable.

regards, tom lane



Re: Registering LWTRANCHE_PARALLEL_HASH_JOIN

2018-02-27 Thread Robert Haas
On Sat, Feb 10, 2018 at 6:07 PM, Thomas Munro
 wrote:
> I forgot to register a display name for LWTRANCHE_PARALLEL_HASH_JOIN,
> the tranche ID used by the LWLock that Parallel Hash uses when handing
> out chunks of memory.  Please see attached.

I think that you need to insert some weasel words into the
documentation for this, because I don't think it's really accurate to
say that it's only used when trying to acquire a new chunk of memory.

Or maybe I'm wrong and it's altogether accurate ... but
ExecParallelHashMergeCounters doesn't look like an allocation to me,
and ExecParallelHashTuplePrealloc doesn't really look like an
allocation either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Sigh, I broke crake again

2018-02-27 Thread Tom Lane
I wrote:
> I had not intended to back-patch, since those changes were just cosmetic,
> but it might be the best way to preserve the XversionUpgrade tests.

After closer study, it seems like the least painful answer is:

1. In HEAD, revert the renaming of int44in/int44out to
city_budget_in/_out; the gain in obviousness isn't worth the screwing
around we'd have to do to cope with it in cross-version upgrade testing.

2. In the back branches, drop the CREATE FUNCTION commands for
funny_dup17 and boxarea, which are used nowhere so they can easily
be dispensed with.

We could ask TestUpgradeXversion.pm to drop the latter two functions,
similarly to what it's already doing for oldstyle_length.  But
oldstyle_length was actually being used for something in the older
branches so it was worth the trouble to cope with a cross-branch
difference.  These two don't seem worth changing the buildfarm for.

regards, tom lane



Re: Let's remove DSM_INPL_NONE.

2018-02-27 Thread Robert Haas
On Thu, Feb 15, 2018 at 1:00 PM, Andres Freund  wrote:
> Hm, I'm not quite convinced by this. Seems to make testing a bunch of
> codepaths harder.  I think it's fine to say that pg doesn't work
> correctly with them disabled though.

I'm not sure I understand this.  Do you mean we should just add a
disclaimer to the documentation?

As long as dynamic_shared_memory_type=none is a supported
configuration, we can't use DSM for anything critical, but there's a
desire to use it in things like AV and the stats collector, which are
pretty critical.  Instead of removing it, we could do something like
this:


If you configure dynamic_shared_memory_type=none, parallel query will
be disabled.  Also, the stats collector will fail to run, so there
will be no statistics to trigger autovacuum, and your database will
become horribly bloated.  Actually, that would be true anyway even if
the stats collector somehow managed to work, because autovacuum itself
will also fail.  Moreover, because we've decided that you really have
to have dynamic shared memory, there will probably be other things
that fail too, either now or in the future.  Basically, if you
configure dynamic_shared_memory_type=none, expect a swift and painful
death.


...but I'm not sure that's really a great option.  It seems like
leaving user-visible knobs lying around that break the entire world is
a bad plan.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [bug-fix] Cannot select big bytea values (~600MB)

2018-02-27 Thread Tom Lane
Robert Haas  writes:
> +1.  We don't have to support everything, but things that don't work
> should fail on insertion, not retrieval.  Otherwise what we have is
> less a database and more a data black hole.

That sounds nice as a principle but I'm not sure how workable it really
is.  Do you want to reject text strings that fit fine in, say, LATIN1
encoding, but might be overlength if some client tries to read them in
UTF8 encoding?  (bytea would have a comparable problem with escape vs hex
representation, for instance.)  Should the limit vary depending on how
many columns are in the table?  Should we account for client-side tuple
length restrictions?

Anyway, as Alvaro pointed out upthread, we've been down this particular
path before and it didn't work out.  We need to learn something from that
failure and decide how to move forward.

regards, tom lane



Re: ALTER TABLE ADD COLUMN fast default

2018-02-27 Thread Andres Freund
Hi,

On 2018-02-27 14:29:44 +1030, Andrew Dunstan wrote:
> This was debated quite some time ago. See for example
> 
> The consensus then seemed to be to store them in pg_attribute. If not,
> I think we'd need a new catalog - pg_attrdef doesn't seem right. They
> would have to go on separate rows, and we'd be storing values rather
> than expressions, so it would get somewhat ugly.

I know that I'm concerned with storing a number of large values in
pg_attribute, and I haven't seen that argument addressed much in a
quick scan of the discussion.


> >> @@ -293,10 +408,18 @@ heap_fill_tuple(TupleDesc tupleDesc,
> >>   * 
> >>   */
> >>  bool
> >> -heap_attisnull(HeapTuple tup, int attnum)
> >> +heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
> >>  {
> >> + /*
> >> +  * We allow a NULL tupledesc for relations not expected to have
> >> +  * missing values, such as catalog relations and indexes.
> >> +  */
> >> + Assert(!tupleDesc || attnum <= tupleDesc->natts);
> >
> > This seems fairly likely to hide bugs.

> How? There are plenty of cases where we simply expect quite reasonably
> that there won't be any missing attributes, and we don't need a
> tupleDesc at all in such cases.

Failing to pass a tupledesc for tables that can have defaults will not
have directly visible issues; you'll just erroneously get back a null.


> >> @@ -508,6 +508,8 @@ tlist_matches_tupdesc(PlanState *ps, List *tlist, 
> >> Index varno, TupleDesc tupdesc
> >>   return false;   /* out of order */
> >>   if (att_tup->attisdropped)
> >>   return false;   /* table contains dropped 
> >> columns */
> >> + if (att_tup->atthasmissing)
> >> + return false;   /* table contains cols with 
> >> missing values */
> >
> > Hm, so incrementally building a table definitions with defaults, even if
> > there aren't ever any rows, or if there's a subsequent table rewrite,
> > will cause slowdowns. If so, shouldn't a rewrite remove all
> > atthasmissing values?

> Yes, I think so. Working on it. OTOH, I'm somewhat mystified by having
> had to make this change.  Still, it looks like dropping a column has
> the same effect.

What exactly is mystifying you? That the new default stuff doesn't work
when the physical tlist optimization is in use?


> >> @@ -561,10 +568,73 @@ RelationBuildTupleDesc(Relation relation)
> >>   
> >> MemoryContextAllocZero(CacheMemoryContext,
> >>
> >>   relation->rd_rel->relnatts *
> >>
> >>   sizeof(AttrDefault));
> >> - attrdef[ndef].adnum = attp->attnum;
> >> + attrdef[ndef].adnum = attnum;
> >>   attrdef[ndef].adbin = NULL;
> >> +
> >>   ndef++;
> >>   }
> >> +
> >> + /* Likewise for a missing value */
> >> + if (attp->atthasmissing)
> >> + {
> >
> > As I've written somewhere above, I don't think it's a good idea to do
> > this preemptively. Tupledescs are very commonly copied, and the
> > missing attributes are quite likely never used. IOW, we should just
> > remember the attmissingattr value and fill in the corresponding
> > attmissingval on-demand.

> Defaults are frequently not used either, yet this function fills those
> in regardless. I don't see why we need to treat missing values
> differently.

It's already hurting us quite a bit in workloads with a lot of trivial
queries. Adding more makes it worse.


> Attached is the current state of things, with the fixes indicated
> above, as well as some things pointed out by David Rowley.
> 
> None of this seems to have had much effect on performance.
> 
> Here's what oprofile reports from a run of pgbench with
> create-alter.sql and the query "select sum(c1000) from t":
> 
> Profiling through timer interrupt
> samples  %image name   symbol name
> 2258428.5982  postgres ExecInterpExpr
> 1195015.1323  postgres slot_getmissingattrs
> 5471  6.9279  postgres AllocSetAlloc
> 3018  3.8217  postgres TupleDescInitEntry
> 2042  2.5858  postgres MemoryContextAllocZeroAligned
> 1807  2.2882  postgres SearchCatCache1
> 1638  2.0742  postgres expression_tree_walker
> 
> 
> That's very different from what we see on the master branch. The time
> spent in slot_getmissingattrs is perhaps not unexpected, but I don't
> (at least yet) understand why we're getting so much time spent in
> ExecInterpExpr, which doesn't show up at all when the same benchmark
> is run on master.

I'd guess that's becau

Re: Let's remove DSM_INPL_NONE.

2018-02-27 Thread Robert Haas
On Thu, Feb 15, 2018 at 5:58 AM, Kyotaro HORIGUCHI
 wrote:
> It is almost just deletion work, but I see two issues raised so
> far.

I think the two issues you are pointing out are the same issue --
namely, if initdb probes for a max_connections setting with an invalid
DSM implementation configured, it will fail the test for every value
of max_connections and thus select the lowest possible value.  The
solution to this is presumably just to make sure we choose the DSM
implementation before we do the max_connections probing so that we can
pass through the correct value there, which I think is what your patch
does.
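
In initdb.c terms, the gist is just reordering two existing steps,
something like the following (simplified; the real plumbing of the
chosen value into the probe command line differs):

/* Pick a workable DSM implementation first ... */
dynamic_shared_memory_type = choose_dsm_implementation();

/*
 * ... so that the max_connections/shared_buffers trials run with a valid
 * dynamic_shared_memory_type instead of failing every probe and settling
 * on the minimum values.
 */
test_config_settings();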

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: atomics.h may not be included from frontend code

2018-02-27 Thread Andres Freund
On 2018-02-27 13:43:23 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2018-02-27 10:36:01 -0500, Robert Haas wrote:
> >> However, those are presumably rare configurations that many people
> >> (including many developers) don't care about.
> 
> > I don't think that's quite true anymore. We e.g. now rely on 64bit
> > atomics being emulated on some machines, and they're unavailable on a
> > bigger number of systems than atomics proper, particularly 32bit
> systems.  There's also hppa, of course ;)
> 
> I had the idea that there were also still some MIPS machines out there
> with no real atomics support.  If there's not, I wouldn't complain
> hard about deciding to desupport HPPA, whenever we want to rip out
> the fallbacks.

There are definitely several machines that we currently support without
good enough atomics support. I don't *think* mips is among them; the
architectural manuals I looked at when implementing it suggested that
ll/sc type support is available everywhere.

https://wiki.postgresql.org/wiki/Atomics
lacking 32 bit atomics: armv5, sparcv8, pa-risc, m32r, i386
lacking 64 bit atomics: some ppc, armv6, i486, !gcc !msvc x86 in 32bit mode

None of these seem extremely interesting personally, but I'm way happier
to drop platforms than some others ;)


> I am not sure though that we want to promise that atomics.h would work
> in arbitrary frontend environments in any case.  Those functions are
> very specifically intended to do what PG needs them to do; do we
> really want them to try to serve multiple masters?

I think if we'd allow the client to implement a locking mechanism for
the fallback implementation, there'd not be a big issue. Besides the
fallback code there's really not that much PG specificity in there...

Greetings,

Andres Freund



Re: Sigh, I broke crake again

2018-02-27 Thread Tom Lane
Andres Freund  writes:
> On 2018-02-27 13:36:59 -0500, Tom Lane wrote:
>> Does it seem practical to adjust TestUpgradeXversion.pm to cope with
>> this change?  An alternative answer is to put back C-language stubs
>> in regress.c for the removed functions, but that seems a bit grotty.

> Could we just drop the relevant functions in the course of the test?

The change I was thinking of was to have TestUpgradeXversion.pm do that.
If we wanted to modify the back branches themselves to help with this,
what I'd be inclined to do is back-patch the regression test changes,
or some subset thereof, so that the back branches have the same idea
of which functions are in regress.so as HEAD does.

I had not intended to back-patch, since those changes were just cosmetic,
but it might be the best way to preserve the XversionUpgrade tests.

regards, tom lane



Re: Proposal for changes in official Docker image

2018-02-27 Thread Максим Кольцов
2018-02-27 21:48 GMT+03:00 Robert Haas :
> On Sat, Feb 17, 2018 at 2:47 PM, Максим Кольцов  wrote:
>> First of all, thanks for the great app :)
>>
>> I started using PgAdmin with docker image (dpage/pgadmin4) a few weeks
>> ago, however I thought that it had some issues, so I decided to make
>> my own image. Some of the advantages:
>>
>> - Use alpine linux instead of centos to greatly reduce image size
>> (170MB vs 560MB)
>> - Use lightweight pure-python HTTP server waitress instead of heavy
>> apache/mod_wsgi
>> - Use python 3.6
>>
>> You can test the image at https://hub.docker.com/r/maksbotan/pgadmin4/
>> Readme contains more detailed explanation and usage instructions.
>>
>> The Dockerfile is hosted at github: 
>> https://github.com/maksbotan/pgadmin4_docker
>>
>> If you find my work useful, I'd love to make a contribution with these
>> scripts, after some discussion with pgadmin developers and further
>> improvements.
>
> This mailing list is for PostgreSQL, not pgadmin.

Yeah, I already figured that out, wrote to the correct list and
started conversation with Dave.
Sorry!

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



Re: Proposal for changes in official Docker image

2018-02-27 Thread Robert Haas
On Sat, Feb 17, 2018 at 2:47 PM, Максим Кольцов  wrote:
> First of all, thanks for the great app :)
>
> I started using PgAdmin with docker image (dpage/pgadmin4) a few weeks
> ago, however I thought that it had some issues, so I decided to make
> my own image. Some of the advantages:
>
> - Use alpine linux instead of centos to greatly reduce image size
> (170MB vs 560MB)
> - Use lightweight pure-python HTTP server waitress instead of heavy
> apache/mod_wsgi
> - Use python 3.6
>
> You can test the image at https://hub.docker.com/r/maksbotan/pgadmin4/
> Readme contains more detailed explanation and usage instructions.
>
> The Dockerfile is hosted at github: 
> https://github.com/maksbotan/pgadmin4_docker
>
> If you find my work useful, I'd love to make a contribution with these
> scripts, after some discussion with pgadmin developers and further
> improvements.

This mailing list is for PostgreSQL, not pgadmin.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [bug-fix] Cannot select big bytea values (~600MB)

2018-02-27 Thread Robert Haas
On Fri, Feb 16, 2018 at 2:00 PM, Anna Akenteva
 wrote:
> It's not necessarily my goal. My goal is to avoid the confusing situation
> where you insert something into a table and suddenly everything seems to
> break for no reason and you don't get any information on what to do next. As
> I see it, it could be solved with:
> a) allowing big bytea values to be inserted, but making sure that doing so
> doesn't cause problems (which I tried to do with my patch)
> b) prohibiting inserting the kind of data that will cause problems
> c) informing the user about the issue (maybe documenting this behaviour or
> giving a more informative error message)

+1.  We don't have to support everything, but things that don't work
should fail on insertion, not retrieval.  Otherwise what we have is
less a database and more a data black hole.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: atomics.h may not be included from frontend code

2018-02-27 Thread Tom Lane
Andres Freund  writes:
> On 2018-02-27 10:36:01 -0500, Robert Haas wrote:
>> However, those are presumably rare configurations that many people
>> (including many developers) don't care about.

> I don't think that's quite true anymore. We e.g. now rely on 64bit
> atomics being emulated on some machines, and they're unavailable on a
> bigger number of systems than atomics proper, particularly 32bit
> systems.  There's also hppa, of course ;)

I had the idea that there were also still some MIPS machines out there
with no real atomics support.  If there's not, I wouldn't complain
hard about deciding to desupport HPPA, whenever we want to rip out
the fallbacks.

I am not sure though that we want to promise that atomics.h would work
in arbitrary frontend environments in any case.  Those functions are
very specifically intended to do what PG needs them to do; do we
really want them to try to serve multiple masters?

regards, tom lane



Re: Sigh, I broke crake again

2018-02-27 Thread Andres Freund
Hi,

On 2018-02-27 13:36:59 -0500, Tom Lane wrote:
> I just committed some regression test cleanup that removed or renamed
> a couple of C-language functions in regress.so.  I see that crake is
> now failing XversionUpgrade tests, and although I can't see the details,
> I bet the problem is from trying to load CREATE FUNCTION commands from
> the old regression databases that now don't point to an existing C
> function.
> 
> Does it seem practical to adjust TestUpgradeXversion.pm to cope with
> this change?  An alternative answer is to put back C-language stubs
> in regress.c for the removed functions, but that seems a bit grotty.

Could we just drop the relevant functions in the course of the test?

Greetings,

Andres Freund



Re: [HACKERS] Partition-wise aggregation/grouping

2018-02-27 Thread Robert Haas
On Tue, Feb 27, 2018 at 4:29 AM, Jeevan Chalke
 wrote:
> Hi Robert,
> I had a look over the testcase he provided and observed that when we create a
> Gather Merge path over the cheapest partial path by sorting it explicitly
> (as generate_gather_paths won't consider it), we accidentally used the
> cheapest partial path from the input_rel to create the Gather Merge; instead
> we need the cheapest partial path from the partially_grouped_rel.
>
> Attached fix_aggref_in_non-agg_error.patch fixing this.

Oops.  Thanks, committed.

> test_for_aggref_in_non-agg_error.patch has a testcase reported by Rajkumar
> which I have added in a aggregates.sql.

Didn't commit this; I think that's overkill.

> While doing so, I have observed few cleanup changes, added those in
> misc_cleanup.patch.

Committed those.

> While re-basing my partitionwise aggregate changes, I observed that when we
> want to create partial aggregation paths for a child partition, we don't
> need to add Gather or Gather Merge on top of it as we first want to append
> them all and then want to stick a gather on it. So it will be better to have
> that code part in a separate function so that we can call it from required
> places.
>
> I have attached patch (create_non_partial_paths.patch) for it including all
> above fix.

I don't like that very much.  For one thing, the name
create_non_partial_paths() is not very descriptive at all.  For
another thing, it somewhat renders add_paths_to_partial_grouping_rel()
a misnomer, as that function then adds only partial paths.  I think
what you should just do is have the main patch add a test for
rel->reloptkind == RELOPT_UPPER_REL; if true, add the Gather paths; if
not, skip it.  Then it will be skipped for RELOPT_OTHER_UPPER_REL
which is what we want.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Sigh, I broke crake again

2018-02-27 Thread Tom Lane
I just committed some regression test cleanup that removed or renamed
a couple of C-language functions in regress.so.  I see that crake is
now failing XversionUpgrade tests, and although I can't see the details,
I bet the problem is from trying to load CREATE FUNCTION commands from
the old regression databases that now don't point to an existing C
function.

Does it seem practical to adjust TestUpgradeXversion.pm to cope with
this change?  An alternative answer is to put back C-language stubs
in regress.c for the removed functions, but that seems a bit grotty.

regards, tom lane



Re: [HACKERS] MERGE SQL Statement for PG11

2018-02-27 Thread Pavan Deolasee
On Fri, Feb 16, 2018 at 6:37 AM, Peter Geoghegan  wrote:

>
> ISTM that a MERGE isn't really a thing that replaces 2 or 3 other DML
> statements, at least in most cases. It's more like a replacement for
> procedural code with an outer join, with an INSERT, UPDATE or DELETE
> that affects zero or one rows inside the procedural loop that
> processes matching/non-matching rows. The equivalent procedural code
> could ultimately perform *thousands* of snapshot acquisitions for
> thousands of RC DML statements. MERGE is sometimes explained in terms
> of "here is the kind of procedural code that you don't have to write
> anymore, thanks to MERGE" -- that's what the code looks like.
>
> I attach a rough example of this, that uses plpgsql.
>

Thanks for writing the sample code. I understand you probably don't mean to
suggest that we need to mimic the behaviour of the plpgsql code, and the
semantics offered by MERGE would most likely be different from what the
plpgsql sample does, because there are several problems with the plpgsql
code:

- It would never turn a MATCHED case into a NOT MATCHED case because of
concurrent UPDATE/DELETE
- The WHERE clauses attached to the UPDATE/DELETE statement should be using
the quals attached to the WHEN clauses to ensure they are evaluated on the
new version of the row, if needed.


>
> >> Some novel new behavior -- "EPQ with a twist"-- is clearly necessary.
> >> I feel a bit uneasy about it because anything that anybody suggests is
> >> likely to be at least a bit arbitrary (EPQ itself is kind of
> >> arbitrary). We only get to make a decision on how "EPQ with a twist"
> >> will work once, and that should be a decision that is made following
> >> careful deliberation. Ambiguity is much more likely to kill a patch
> >> than a specific technical defect, at least in my experience. Somebody
> >> can usually just fix a technical defect.
>

TBH that's one reason why I like Simon's proposed behaviour of throwing
errors in case of corner cases. I am not suggesting that's what we do at
the end, but it's definitely worth considering.


> >
> >
> > While I agree, I think we need to make these decisions in a time bound
> > fashion. If there is too much ambiguity, then it's not a bad idea to
> settle
> > for throwing appropriate errors instead of providing semantically wrong
> > answers, even in some remote corner case.
>
> Everything is still on the table, I think.
>

Ok.


>
> > Ok. I am now back from holidays and I will too start thinking about this.
> > I've also requested a colleague to help us with comparing it against
> > Oracle's behaviour. That's not a gold standard for us, but knowing how
> > other major databases handle RC conflicts is not a bad idea.
>
> The fact that Oracle doesn't allow WHEN MATCHED ... AND quals did seem
> like it might be significant to me.
>
>
Here are some observations from Rahila's analysis so far. I must say,
Oracle's handling seems quite inconsistent, especially the conditions under
which it sometimes re-evaluates the join and sometimes doesn't.

- Oracle does not support multiple WHEN MATCHED clauses. So the question of
re-checking all WHEN clauses does not arise.

- Only one UPDATE and one DELETE clause is supported. The DELETE must be
used in conjunction with UPDATE.

- The DELETE clause is invoked iff the UPDATE clause is invoked. It works
on the updated rows. Since the row is already updated (and locked) by the
MERGE, DELETE action never blocks on a concurrent update/delete

- MERGE does not allow updating the column used in the JOIN's ON qual

- In case of concurrent UPDATE, the join is re-evaluated iff the concurrent
  UPDATE updates (modifies?) the same column that MERGE is updating OR a
  column that MERGE is referencing in the WHERE clause is updated by the
  concurrent update. IOW if the MERGE and the concurrent UPDATE are
  operating on different columns, the join is NOT re-evaluated, thus
  possibly invoking a WHEN MATCHED action on a row which no longer matches
  the join condition.

- In case of concurrent DELETE, the join is re-evaluated and the action may
change from MATCHED to NOT MATCHED

I am curiously surprised by its behaviour of re-evaluating the join only
when certain columns are updated. It looks to me that, irrespective of what
we choose, our implementation would be much superior to what Oracle offers.

BTW I've sent v17a of the patch, which is very close to being complete from
my perspective (except some documentation fixes/improvements). The only
thing pending is the decision to accept or change the currently implemented
concurrency semantics.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: atomics.h may not be included from frontend code

2018-02-27 Thread Andres Freund
Hi,

On 2018-02-27 10:36:01 -0500, Robert Haas wrote:
> On Tue, Feb 27, 2018 at 8:49 AM, Aleksander Alekseev
> We tried to follow commit messages [1] and discussions [2]. However no matter
> how you try to look on this code it's weird.

I don't see how that makes the code weird. Not fit for your purpose,
sure, but weird?


> > We would like to know whether you share this concern and whether it
> > would be a good idea to try to refactor the code so that atomics could
> > be used not only from the backend.
> 
> I think the concern on the referenced threads was that atomics might
> be implemented using spinlocks if we don't have a real atomics
> implementation; and those in turn might be implemented as semaphores
> if we don't have a real spinlock implementation.  When we emulate
> atomics using spinlocks, we use a fixed-size array of spinlocks stored
> in shared memory; and when we emulate spinlocks using semaphores, we
> use the semaphores in each PGPROC.  So those emulation layers are
> deeply tied into the backend's shared-memory architecture, and
> untangling them might be a bit complicated.

Right.


> However, those are presumably rare configurations that many people
> (including many developers) don't care about.

I don't think that's quite true anymore. We e.g. now rely on 64bit
atomics being emulated on some machines, and they're unavailable on a
bigger number of systems than atomics proper, particularly 32bit
systems.  There's also hppa, of course ;)


> If #include "atomics.h" worked from all frontend code except where one
> of those emulation layers were in use, that would represent an
> improvement over the current status quo for people doing out-of-core
> development against PostgreSQL.

I think if we wanted to enable this code to be used from frontend, we'd
have to implement the various fallback paths in a bit different way,
so they can be supplied by frontend code.  And then we can rely on it in
frontend code for real, if we want to.

Greetings,

Andres Freund



Re: [HACKERS] GUC for cleanup indexes threshold.

2018-02-27 Thread Alexander Korotkov
On Fri, Jan 12, 2018 at 7:05 AM, Masahiko Sawada 
wrote:

> On Wed, Jan 10, 2018 at 11:28 AM, Masahiko Sawada 
> wrote:
> > On Sun, Jan 7, 2018 at 8:40 AM, Peter Geoghegan  wrote:
> >> On Sat, Jan 6, 2018 at 2:20 PM, Stephen Frost 
> wrote:
>  > IIRC the patches that makes the cleanup scan skip has a problem
>  > pointed by Peter[1], that is that we stash an XID when a btree page
> is
>  > deleted, which is used to determine when it's finally safe to
> recycle
>  > the page. Yura's patch doesn't have that problem?
>  >
>  > [1] https://www.postgresql.org/message-id/CAH2-Wz%3D1%
> 3Dt5fcGGfarQGcAWBqaCh%2BdLMjpYCYHpEyzK8Qg6OrQ%40mail.gmail.com
> >>
> >>> Masahiko Sawada, if this patch isn't viable or requires serious rework
> >>> to be acceptable, then perhaps we should change its status to 'returned
> >>> with feedback' and you can post a new patch for the next commitfest..?
> >>
> >> I believe that the problem that I pointed out with freezing/wraparound
> >> is a solvable problem. If we think about it carefully, we will come up
> >> with a good solution. I have tried to get the ball rolling with my
> >> pd_prune_xid suggestion. I think it's important to not lose sight of
> >> the fact that the page deletion/recycling XID thing is just one detail
> >> that we need to think about some more.
> >>
> >> I cannot fault Sawada-san for waiting to hear other people's views
> >> before proceeding. It really needs to be properly discussed.
> >>
> >
> > Thank you for commenting.
> >
> > IIUC we have two approaches: one idea is based on Peter's suggestion.
> > We can use pd_prune_xid to store epoch of xid of which the page is
> > deleted. That way, we can correctly mark deleted pages as recyclable
> > without breaking on-disk format.
> >
> > Another idea is suggested by  Sokolov Yura. His original patch makes
> > btree have a flag in btpo_flags that implies the btree has deleted but
> > not recyclable page or not. I'd rather want to store it as bool in
> > BTMetaPageData. Currently btm_version is defined as uint32 but I think
> > we won't use all of them. If we store it in part of btm_version we
> > don't break on-disk format. However, we're now assuming that the
> > vacuum on btree index always scans whole btree rather than a part, and
> > this approach will nurture it more. It might be possible that it will
> > become a restriction in the future.
> >
>
> On third thought, it's similar to what we are doing for heaps but we
> can store the oldest btpo.xact of a btree index to somewhere(metapage
> or pg_class.relfrozenxid?) after vacuumed. We can skip cleanup
> vacuuming as long as the xid is not more than a certain threshold old
> (say vacuum_index_cleanup_age). Combining with new parameter I
> proposed before, the condition of doing cleanup index vacuum will
> become as follows.
>
> if (there is garbage on heap)
>Vacuum index, and update oldest btpo.xact
> else /* there is no garbage on heap */
> {
> if (# of scanned pages > (nblocks *
> vacuum_cleanup_index_scale_factor) OR oldest btpo.xact is more than
> vacuum_index_cleanup_age old?))
> Cleanup vacuum index, and update oldest btpo.xact
> }
>
> In current index vacuuming, we scan the whole index if there is even
> one garbage tuple on the heap (I'd like to improve that someday), which
> also leads to updating the oldest btpo.xact at the time. If
> vacuum_cleanup_index_scale_factor is high enough, we can skip
> scanning the whole index as long as either the table isn't modified for 2
> billion transactions or the oldest btpo.xact isn't more than
> vacuum_index_cleanup_age transactions old. But if there is an open
> transaction for a very long time, we might not be able to mark deleted
> pages as recyclable. Some tricks might be required. Feedback is
> welcome.


Storing the oldest btpo.xact seems like a good idea to me.  So, by comparing
the oldest btpo.xact to RecentGlobalXmin we can determine if there are
some recyclable pages which are really going to be recycled during a
possible cleanup?  Assuming that index page deletions are rare, that
seems like a sufficient criterion to run the cleanup.  Also, this criterion
will eliminate the risk of wraparound, because we'll recycle index pages
no later than we do now.

Another case when running cleanup is necessary (or at least desirable)
is stalled index statistics.  Assuming that index statistics can be stalled
only when there are no deletions from the index (otherwise the index would
be scanned during vacuum), we can use the number of index insertions as a
measure of statistics staleness.

So, what about following logic?

if (there is garbage on heap)
    Vacuum index, and update oldest btpo.xact
else /* there is no garbage on heap */
{
    if (number of inserted index tuples since last btvacuumscan() /
        total number of index tuples >= some threshold
        OR oldest btpo.xact is older than RecentGlobalXmin)
        Cleanup vacuum index, and update oldest btpo.xact
}
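
If it is easier to read as code, a rough sketch of that check (the
function and parameter names are hypothetical; only RecentGlobalXmin and
the xid comparison come from the backend):

#include "postgres.h"
#include "access/transam.h"		/* TransactionIdPrecedes() */
#include "utils/snapmgr.h"		/* RecentGlobalXmin */

static bool
btree_cleanup_needed(double inserts_since_cleanup,
					 double total_index_tuples,
					 TransactionId oldest_btpo_xact,
					 double cleanup_scale_factor)
{
	/* Statistics look stale: enough inserts since the last full scan. */
	if (total_index_tuples > 0 &&
		inserts_since_cleanup / total_index_tuples >= cleanup_scale_factor)
		return true;

	/* Some deleted pages have become recyclable: a cleanup scan pays off. */
	if (TransactionIdIsValid(oldest_btpo_xact) &&
		TransactionIdPrecedes(oldest_btpo_xact, RecentGlobalXmin))
		return true;

	return false;
}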

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Online enabling of checksums

2018-02-27 Thread Andrey Borodin
Hi, Magnus!

> On 25 Feb 2018, at 21:17, Magnus Hagander wrote:
> 
> PFA an updated patch that adds this, and also fixes the problem in 
> pg_verify_checksums spotted by Michael Banck. 

I had to set ALLOW_CONNECTIONS on template0 to make it work:
postgres=# 2018-02-27 21:29:05.993 +05 [57259] ERROR:  Database template0 does
not allow connections.
Is it a problem with my installation or some regression in the patch?

2018-02-27 21:40:47.132 +05 [57402] HINT:  either disable or enable checksums 
by calling the pg_data_checksums_enable()/disable() functions
Function names are wrong in this hint: pg_enable_data_checksums()

The code is nice and clear. One minor spot in the comment
>This option can only _be_ enabled when data checksums are enabled.

Is there any way we could provide this functionality for previous versions
(9.6, 10)? Like implementing a utility for offline checksum enabling, without
WAL-logging, surely.


Thanks for the patch!

Best regards, Andrey Borodin.


Re: jsonlog logging only some messages?

2018-02-27 Thread Greg Stark
On 27 February 2018 at 02:04, Michael Paquier  wrote:
> On Mon, Feb 26, 2018 at 05:38:56PM +, Greg Stark wrote:
>> I tried loading the jsonlog module from
>> https://github.com/michaelpq/pg_plugins into Postgres 9.6.
>>
>> However it seems it resulted in logs only for session log messages but
>> not any background worker log messages. We have log_checkpoints set
>> but there were no log messages in the json log about checkpoints. Nor
>> were there any from autovacuum.
>
> Hm.  I have just loaded jsonlog on a 9.6 and 10 instance where
> log_checkpoints is enabled with this background worker which logs a
> simple string every 10s:
> https://github.com/michaelpq/pg_plugins/tree/master/hello_world
>
> Both checkpoint logs and the logs of the bgworker are showing up for me.

Weird. I guess I have some more debugging with gdb to do.

>> Also I have log_destination set to stderr,cvslog and as I understand
>> it the jsonlog module effectively overrides the "stderr" output which
>> goes to the file named *.log (which I find super confusing,
>> incidentally). But I was expecting to get the csv file as well. We
>> didn't, we only got the *.log file with the (partial) json logs. Is
>> there something I'm missing here or is this unexpected?
>
> Yeah, that's unfortunately expected...  jsonlog enforces
> edata->output_to_server to false, which has the disadvantage of disabling
> extra log outputs, but also has the advantage of preventing duplicate
> entries into stderr.  One way that I can see to solve this would be to
> patch the backend and replace output_to_server with a filter which
> allows a plugin to choose which extra output types are allowed.  In
> this case you don't want stderr later on, but would still want csvlog to
> happen, instead of having an all-or-nothing switch.  I haven't tested,
> but it may be possible to have jsonlog force Log_destination to
> use only csvlog after it generates its entry, so that stderr is not
> duplicated but csvlog still gets through.  Not sure how you would reset
> the parameter though, so you may perhaps want to invoke an extra plugin
> which outputs to csvlog as jsonlog cascades through things.


I would actually lean another way. If jsonlog opened *.json and wrote
there then it could leave the output_to_server field unchanged. It
does look like this might be a bit awkward with yet more of the static
functions needing to be duplicated.
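
Something like the following shape, leaving output_to_server alone (a
sketch of the direction, not jsonlog's actual code):

#include "postgres.h"
#include "utils/elog.h"

static emit_log_hook_type prev_emit_log_hook = NULL;

static void
jsonlog_emit_log(ErrorData *edata)
{
	/* ... render edata as JSON and append it to a dedicated *.json file ... */

	/*
	 * Leave edata->output_to_server untouched, so stderr/csvlog routing
	 * continues exactly as log_destination dictates.
	 */
	if (prev_emit_log_hook)
		prev_emit_log_hook(edata);
}

void
_PG_init(void)
{
	prev_emit_log_hook = emit_log_hook;
	emit_log_hook = jsonlog_emit_log;
}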




-- 
greg



Re: prokind column (was Re: [HACKERS] SQL procedures)

2018-02-27 Thread Tom Lane
Catalin Iacob  writes:
> On Tue, Feb 27, 2018 at 4:03 AM, Michael Paquier  wrote:
>> I would just recommend users to use a version of psql matching
>> the one of the server instead of putting an extra load of maintenance
>> into psql for years to come

> Breaking tab completion in new psql against old servers might be
> acceptable as it's a fringe feature, but I don't think your
> recommendation of matching versions is practical. Lots of people
> manage multiple server versions and using the latest psql for all of
> them is currently, as far as I know, a perfectly supported way of
> doing that, getting new psql features and keeping compatibility. I
think it would be a pity to lose that.

We support the various psql/describe.c features against old servers,
so it would be quite inconsistent for tab completion not to work
similarly.  There are some gaps in that already, as per the other
thread, but we should clean those up not add more.

regards, tom lane



Re: [HACKERS] GUC for cleanup indexes threshold.

2018-02-27 Thread Alexander Korotkov
On Wed, Nov 29, 2017 at 6:06 PM, Masahiko Sawada 
wrote:

> On Wed, Nov 29, 2017 at 11:05 PM, Simon Riggs 
> wrote:
> > On 25 September 2017 at 22:34, Kyotaro HORIGUCHI
> >  wrote:
> >
> >>> > Here is a small patch that skips scanning btree index if no pending
> >>> > deleted pages exists.
> >>> > It detects this situation by comparing pages_deleted with pages_free.
> >>
> >> It seems to work to prevent needless cleanup scans.
> >
> > So this leaves us in the situation that
> >
> > 1. Masahiko's patch has unresolved problems
> > 2. Yura's patch works and is useful
> >
> > Unless there is disagreement on the above, it seems we should apply
> > Yura's patch (an edited version, perhaps).
> >
>
> IIRC the patches that makes the cleanup scan skip has a problem
> pointed by Peter[1], that is that we stash an XID when a btree page is
> deleted, which is used to determine when it's finally safe to recycle
> the page. Yura's patch doesn't have that problem?
>
> [1] https://www.postgresql.org/message-id/CAH2-Wz%3D1%
> 3Dt5fcGGfarQGcAWBqaCh%2BdLMjpYCYHpEyzK8Qg6OrQ%40mail.gmail.com


Yes, I think Yura's patch doesn't have that problem, because it skips
cleanup only when there are no recyclable pages.  And that means there
are no btpo.xact values stored, so no XIDs that could wrap around.

BTW, I've rebased Yura's patch.  I think this patch has following issues
for now:

1) Adding BTP_NEED_NO_CLEANUP as per-page flag doesn't look nice.
2) In the append-only case, index statistics can lag indefinitely.
3) Patch definitely needs more comments.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0001-lazy-btree-cleanup-2.patch
Description: Binary data


Re: prokind column (was Re: [HACKERS] SQL procedures)

2018-02-27 Thread Catalin Iacob
On Tue, Feb 27, 2018 at 4:03 AM, Michael Paquier  wrote:
> I would just recommend users to use a version of psql matching
> the one of the server instead of putting an extra load of maintenance
> into psql for years to come

Breaking tab completion in new psql against old servers might be
acceptable as it's a fringe feature, but I don't think your
recommendation of matching versions is practical. Lots of people
manage multiple server versions and using the latest psql for all of
them is currently, as far as I know, a perfectly supported way of
doing that, getting new psql features and keeping compatibility. I
think it would be a pity to lose that.



Re: Reopen logfile on SIGHUP

2018-02-27 Thread Dagfinn Ilmari Mannsåker
Anastasia Lubennikova  writes:

> A large percentage of postgres installations, for example the PGDG packages
> for Debian/Ubuntu, assume by default that log management will be
> handled by external tools such as logrotate.
>
> Unfortunately such tools have no way to tell postgres to reopen the log
> file after rotation and are forced to use a copy-truncate strategy, which
> leads to a loss of log messages; that is unacceptable.
>
> The small patch in the attachment implements logfile reopening on SIGHUP.
> It only affects the file accessed by the logging collector, whose name you
> can check with pg_current_logfile().
>
> I hope you will find this feature useful.

+1 for the feature, but:

>   syslogFile = logfile_open(last_file_name, "a", false);

This will cause a fatal error if opening the logfile fails for any
reason (even transient errors like ENFILE/EMFILE).  There is already the
logfile_rotate() function that can reopen log files safely based on time
and date limits.  I'd suggest extending that by adding a config option
that controls whether to always reopen the log file on SIGHUP.
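For illustration, a minimal sketch of that safer shape in SysLoggerMain()'s
SIGHUP branch, assuming a hypothetical boolean GUC log_reopen_on_sighup;
the key point is to open the new handle first and swap it in only on
success:

```c
if (got_SIGHUP)
{
	got_SIGHUP = false;
	ProcessConfigFile(PGC_SIGHUP);

	/* "log_reopen_on_sighup" is an assumed GUC, not an existing one */
	if (log_reopen_on_sighup && last_file_name != NULL)
	{
		/* allow_errors = true: logfile_open() only warns and returns NULL */
		FILE	   *fh = logfile_open(last_file_name, "a", true);

		if (fh != NULL)			/* keep the old handle on transient failure */
		{
			fclose(syslogFile);
			syslogFile = fh;
		}
	}
}
```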

- ilmari
-- 
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.   - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.  - Calle Dybedahl



Re: [HACKERS] path toward faster partition pruning

2018-02-27 Thread Robert Haas
On Mon, Feb 26, 2018 at 10:59 PM, Amit Langote
 wrote:
> You may say that partition bounds might have to be different too in this
> case and hence partition-wise join won't occur anyway, but I'm wondering
> if the mismatch of partcollation itself isn't enough to conclude that?

Yeah, you're right.  I think that this is just a bug in partition-wise
join, and that the partition scheme should just be using partcollation
instead of parttypcoll, as in the attached.
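For clarity, the shape of the fix in find_partition_scheme() is roughly
this (the attached patch is authoritative; a partcollation member of
PartitionSchemeData replacing parttypcoll is assumed here):

```c
/* Match on the collation actually used to partition, not the type default */
if (memcmp(partkey->partcollation, part_scheme->partcollation,
		   sizeof(Oid) * partkey->partnatts) != 0)
	continue;			/* schemes differ; keep scanning the list */
```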

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


partition-scheme-collation.patch
Description: Binary data


Re: [HACKERS] [PATCH] Vacuum: Update FSM more frequently

2018-02-27 Thread Claudio Freire
On Tue, Feb 6, 2018 at 4:56 AM, Masahiko Sawada  wrote:
> For vacuuming the FSM of an index, we might have to consider
> vacuuming it after lazy_vacuum_index.

I've been thinking about that, and I think you're right.

So here's a fourth patch that adds that to nbtree's bulkdelete implementation.
Seems to be the best place to add such a thing.

GIN and GIST don't delete pages until vacuumcleanup, so they can't do
the same, sadly.
From 505f3143f85d42cea5adf6f04332443a61edcac0 Mon Sep 17 00:00:00 2001
From: Claudio Freire 
Date: Tue, 27 Feb 2018 12:51:46 -0300
Subject: [PATCH] Index vacuum: Vacuum FSM after each bulkdelete call

If any pages have been deleted during bulkdelete, vacuum
the FSM to expose those pages to concurrent activity.
Try to avoid redundant FSM vacuum at vacuumcleanup.
---
 src/backend/access/nbtree/nbtree.c| 22 --
 src/backend/access/spgist/spgvacuum.c | 18 --
 2 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 8158508..d673b88 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -798,6 +798,12 @@ btbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 		cycleid = _bt_start_vacuum(rel);
 
 		btvacuumscan(info, stats, callback, callback_state, cycleid);
+
+		if (stats->pages_deleted > 0)
+		{
+			/* vacuum the FSM to expose deleted pages, if any */
+			IndexFreeSpaceMapVacuum(info->index);
+		}
 	}
 	PG_END_ENSURE_ERROR_CLEANUP(_bt_end_vacuum_callback, PointerGetDatum(rel));
 	_bt_end_vacuum(rel);
@@ -813,6 +819,8 @@ btbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 IndexBulkDeleteResult *
 btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 {
+	bool	needs_fsm_vacuum;
+
 	/* No-op in ANALYZE ONLY mode */
 	if (info->analyze_only)
 		return stats;
@@ -825,15 +833,25 @@ btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 *
 	 * Since we aren't going to actually delete any leaf items, there's no
 	 * need to go through all the vacuum-cycle-ID pushups.
+	 *
+	 * If there was a btbulkdelete call, it will vacuum the FSM too if it
+	 * deleted any pages, so we can skip our FSM vacuum in that case only.
 	 */
 	if (stats == NULL)
 	{
 		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
 		btvacuumscan(info, stats, NULL, NULL, 0);
+
+		needs_fsm_vacuum = true;
 	}
+	else
+		needs_fsm_vacuum = (stats->pages_deleted == 0);
 
-	/* Finally, vacuum the FSM */
-	IndexFreeSpaceMapVacuum(info->index);
+	if (needs_fsm_vacuum)
+	{
+		/* Finally, vacuum the FSM */
+		IndexFreeSpaceMapVacuum(info->index);
+	}
 
 	/*
 	 * It's quite possible for us to be fooled by concurrent page splits into
diff --git a/src/backend/access/spgist/spgvacuum.c b/src/backend/access/spgist/spgvacuum.c
index 72839cb..e9ed3fb 100644
--- a/src/backend/access/spgist/spgvacuum.c
+++ b/src/backend/access/spgist/spgvacuum.c
@@ -898,6 +898,12 @@ spgbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 
 	spgvacuumscan(&bds);
 
+	if (stats->pages_deleted > 0)
+	{
+		/* vacuum the FSM to expose deleted pages, if any */
+		IndexFreeSpaceMapVacuum(info->index);
+	}
+
 	return stats;
 }
 
@@ -918,6 +924,7 @@ spgvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 {
 	Relation	index = info->index;
 	spgBulkDeleteState bds;
+	bool		needs_fsm_vacuum;
 
 	/* No-op in ANALYZE ONLY mode */
 	if (info->analyze_only)
@@ -938,10 +945,17 @@ spgvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 		bds.callback_state = NULL;
 
 		spgvacuumscan(&bds);
+
+		needs_fsm_vacuum = true;
 	}
+	else
+		needs_fsm_vacuum = stats->pages_deleted == 0;
 
-	/* Finally, vacuum the FSM */
-	IndexFreeSpaceMapVacuum(index);
+	if (needs_fsm_vacuum)
+	{
+		/* Finally, vacuum the FSM */
+		IndexFreeSpaceMapVacuum(index);
+	}
 
 	/*
 	 * It's quite possible for us to be fooled by concurrent tuple moves into
-- 
1.8.4.5



Re: atomics.h may not be included from frontend code

2018-02-27 Thread Robert Haas
On Tue, Feb 27, 2018 at 8:49 AM, Aleksander Alekseev
 wrote:
> We would like to know whether you share this concern and whether it
> would be a good idea to try to refactor the code so that atomics could
> be used not only from the backend.

I think the concern on the referenced threads was that atomics might
be implemented using spinlocks if we don't have a real atomics
implementation; and those in turn might be implemented as semaphores
if we don't have a real spinlock implementation.  When we emulate
atomics using spinlocks, we use a fixed-size array of spinlocks stored
in shared memory; and when we emulate spinlocks using semaphore, we
use the semaphores in each PGPROC.  So those emulation layers are
deeply tied into the backend's shared-memory architecture, and
untangling them might be a bit complicated.

However, those are presumably rare configurations that many people
(including many developers) don't care about.  If #include "atomics.h"
worked from all frontend code except where one of those emulation
layers were in use, that would represent an improvement over the
current status quo for people doing out-of-core development against
PostgreSQL.  On the other hand, it would also make it quite easy for
the use of atomics to creep into frontend code within PG itself, and
that's not OK so long as configurations that lack atomics and/or
spinlocks are considered supported.  So that's something to think
about.

I have to imagine that systems which lack atomics and/or spinlocks are
almost extinct.  It looks like we have at least one buildfarm member
running with each of --disable-spinlocks and --disable-atomics, but
systems that organically lack spinlocks and/or atomics must by now be
extremely rare.  At some point, I think we should just de-support such
configurations, but I'm not sure if we're at that point yet.
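If we went that route, a rough sketch of the weaker guard might look like
this, assuming the simulation markers (PG_HAVE_ATOMIC_U32_SIMULATION and
friends) could be tested at that point in the header - in reality the
fallback header defines them later, so the real fix would need more care:

```c
#if defined(FRONTEND) && \
	(defined(PG_HAVE_ATOMIC_U32_SIMULATION) || \
	 defined(PG_HAVE_ATOMIC_U64_SIMULATION))
#error "atomics.h may not be included from frontend code on this platform"
#endif
```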

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: MSVC builld of 9.5.12 is broken?

2018-02-27 Thread Noah Misch
On Tue, Feb 27, 2018 at 11:43:34AM +0100, Magnus Hagander wrote:
> On Tue, Feb 27, 2018 at 11:27 AM, Victor Wagner  wrote:
> > These errors seems to be caused by commit  91f3ffc5249
> > which improves readability of src/bin/scripts/Makefile for
> > humans by moving list of common object files into variable.
> >
> > Unfortunately, it makes this Makefile unreadable for perl scripts
> > in src/tools/msvc, which generate Microsift Visual Studio projects.

> It's also interesting to note that this did not break in HEAD, 10 or 9.6.
> And none of those actually have the SCRIPTS_COMMON code.
> 
> I'm unsure why this was introduced in 9.5 and earlier, but not in the newer
> ones.  This smells like a possible backpatch mistake, in which case that
> part should probably be backed out of the old branches rather than teaching
> mkvcbuild about it.
> 
> Noah? Can you confirm if it was intentional?

It was intentional; the release made common.c depend on dumputils.c in 9.5 and
earlier, so everything that needs common.c now needs all four files.  (From 9.6,
relevant code had moved to libpgfeutils.)  Nonetheless, your fix was correct.
Thanks for pushing it.



Re: [HACKERS] [PATCH] Vacuum: Update FSM more frequently

2018-02-27 Thread Claudio Freire
On Mon, Feb 26, 2018 at 10:20 PM, Masahiko Sawada  wrote:
> Thank you for updating patches!
>
> 0001 patch looks good to me except for the following unnecessary empty lines.
>
> +             * If there are no indexes then we should periodically vacuum the FSM
> +             * on huge relations to make free space visible early.
> +             */
> +            else if (nindexes == 0 && fsm_updated_pages > vacuum_fsm_every_pages)
> +            {
> +                    /* Vacuum the Free Space Map */
> +                    FreeSpaceMapVacuum(onerel, max_freespace);
> +                    fsm_updated_pages = 0;
> +                    max_freespace = 0;
> +            }
> +
> +
> +            /*
>
> @@ -913,10 +967,14 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
>
>                         vmbuffer, InvalidTransactionId,
>                         VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN);
>                 END_CRIT_SECTION();
> +
> }
>
> 0002 patch looks good to me.
>
> For the 0003 patch, I think the fsm_set() function name could be
> misread, because it actually not only sets the value but also returns
> it. fsm_set_and_get() might be better?
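For the record, the renamed helper amounts to something like this (an
illustrative body built on the slot-level primitives in fsmpage.c; the
patch's actual implementation may differ):

```c
/* Record free space for a slot and hand back what actually got stored. */
static uint8
fsm_set_and_get(Page page, int slot, uint8 newValue)
{
	fsm_set_avail(page, slot, newValue);
	return fsm_get_avail(page, slot);
}
```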

Attached is the patchset, modified as requested.
From bc79add4171beb1c8a0a714436cb4d7c27d6a2fc Mon Sep 17 00:00:00 2001
From: Claudio Freire 
Date: Fri, 28 Jul 2017 21:31:59 -0300
Subject: [PATCH 1/3] Vacuum: Update FSM more frequently

Make Vacuum update the FSM more frequently, to avoid the case where
autovacuum fails to reach the point where it updates the FSM in
highly contended tables.

Introduce a tree pruning threshold to FreeSpaceMapVacuum that avoids
recursing into branches that already contain enough free space, to
avoid having to traverse the whole FSM and thus induce quadratic
costs. Intermediate FSM vacuums are only supposed to make enough
free space visible to avoid extension until the final (non-partial)
FSM vacuum.

Run partial FSM vacuums after each index pass, which is a point
at which whole ranges of the heap have been thoroughly cleaned, and
we can expect no further updates to those ranges of the FSM save
for concurrent activity. When there are no indexes, and thus no
index passes, run partial FSM vacuums every 8GB of dirtied pages
or 1/8th of the relation, whichever is highest. This allows some
partial work to be made visible without incurring quadratic cost.

In any case, the FSM is small in relation to the table, so even when
quadratic cost is involved, it should not be problematic. Index
passes already incur quadratic cost, and the addition of the FSM
is unlikely to be measurable.

Run a partial FSM vacuum with a low pruning threshold right at
the beginning of the VACUUM to finish up any work left over from
prior canceled vacuum runs, something that is common in highly
contended tables when running only autovacuum.
---
 src/backend/access/brin/brin.c|  2 +-
 src/backend/access/brin/brin_pageops.c| 10 ++--
 src/backend/commands/vacuumlazy.c | 81 ---
 src/backend/storage/freespace/freespace.c | 27 +--
 src/backend/storage/freespace/indexfsm.c  |  2 +-
 src/include/storage/freespace.h   |  2 +-
 6 files changed, 105 insertions(+), 19 deletions(-)

diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 68b3371..24d6df7 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -1461,5 +1461,5 @@ brin_vacuum_scan(Relation idxrel, BufferAccessStrategy strategy)
 	 * the way to the top.
 	 */
 	if (vacuum_fsm)
-		FreeSpaceMapVacuum(idxrel);
+		FreeSpaceMapVacuum(idxrel, 0);
 }
diff --git a/src/backend/access/brin/brin_pageops.c b/src/backend/access/brin/brin_pageops.c
index 60a7025..d4c7a87 100644
--- a/src/backend/access/brin/brin_pageops.c
+++ b/src/backend/access/brin/brin_pageops.c
@@ -136,7 +136,7 @@ brin_doupdate(Relation idxrel, BlockNumber pagesPerRange,
 brin_initialize_empty_new_buffer(idxrel, newbuf);
 			UnlockReleaseBuffer(newbuf);
 			if (extended)
-FreeSpaceMapVacuum(idxrel);
+FreeSpaceMapVacuum(idxrel, 0);
 		}
 		return false;
 	}
@@ -156,7 +156,7 @@ brin_doupdate(Relation idxrel, BlockNumber pagesPerRange,
 brin_initialize_empty_new_buffer(idxrel, newbuf);
 			UnlockReleaseBuffer(newbuf);
 			if (extended)
-FreeSpaceMapVacuum(idxrel);
+FreeSpaceMapVacuum(idxrel, 0);
 		}
 		return false;
 	}
@@ -211,7 +211,7 @@ brin_doupdate(Relation idxrel, BlockNumber pagesPerRange,
 		LockBuffer(oldbuf, BUFFER_LOCK_UNLOCK);
 
 		if (extended)
-			FreeSpaceMapVacuum(idxrel);
+			FreeSpaceMapVacuum(idxrel, 0);
 
 		return true;
 	}
@@ -313,7 +313,7 @@ brin_doupdate(Relation idxrel, BlockNumber pagesPerRange,
 		{
 			Assert(BlockNumberIsValid(newblk));
 			RecordPageWithFreeSpace(idxrel, newblk, freespace);
-			FreeSpaceMapVacuum(idxrel);
+			FreeSpaceMapVacuum(idxrel, 0);
 		}
 
 		return true;
@@ -457,7 +457,7 @@ brin_doinsert(Relatio

Reopen logfile on SIGHUP

2018-02-27 Thread Anastasia Lubennikova
A large percentage of postgres installations, for example PGDG packages
for Debian/Ubuntu, assume by default that log management will be handled
by external tools such as logrotate.


Unfortunately such tools have no way to tell postgres to reopen the log
file after rotation and are forced to use a copy-truncate strategy that
leads to a loss of log messages, which is unacceptable.


The small patch in the attachment implements logfile reopening on SIGHUP.
It only affects the file accessed by the logging collector, whose name
you can check with pg_current_logfile().


I hope you will find this feature useful.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 3b3ea19d2dfb88dd0e3b31338a4b72a1cd0183e2
Author: Anastasia 
Date:   Tue Feb 27 17:27:25 2018 +0300

Reopen logfile and csv logfile on SIGHUP.

diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index f70eea3..54a5bdf 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -306,6 +306,24 @@ SysLoggerMain(int argc, char *argv[])
 		if (got_SIGHUP)
 		{
 			got_SIGHUP = false;
+
+			/* Reopen last logfiles on SIGHUP */
+			if (last_file_name != NULL)
+			{
+				fclose(syslogFile);
+				syslogFile = logfile_open(last_file_name, "a", false);
+				ereport(LOG,
+						(errmsg("Reopen current logfile on SIGHUP")));
+			}
+
+			if (last_csv_file_name != NULL)
+			{
+				fclose(csvlogFile);
+				csvlogFile = logfile_open(last_csv_file_name, "a", false);
+				ereport(LOG,
+						(errmsg("Reopen current CSV logfile on SIGHUP")));
+			}
+
 			ProcessConfigFile(PGC_SIGHUP);
 
 			/*


Re: TODO item for broken \s with libedit seems fixed

2018-02-27 Thread Patrick Krecker


> On Feb 27, 2018, at 06:25, Tom Lane  wrote:
> 
> Tatsuo Ishii  writes:
>> Would you like to do this yourself? Or shall I do this?
> 
> Go ahead, I have a bunch of other stuff to do ...
> 
>regards, tom lane

I don’t think I have permission to edit the page, otherwise I would be happy to.

Thanks,
Patrick


Re: TODO item for broken \s with libedit seems fixed

2018-02-27 Thread Tom Lane
Tatsuo Ishii  writes:
> Would you like to do this yourself? Or shall I do this?

Go ahead, I have a bunch of other stuff to do ...

regards, tom lane



atomics.h may not be included from frontend code

2018-02-27 Thread Aleksander Alekseev
Hello hackers,

My colleague Anastasia Lubennikova and I were discussing a weird piece
of code in src/include/port/atomics.h:

```
#ifdef FRONTEND
#error "atomics.h may not be included from frontend code"
#endif
```

We tried to follow the commit messages [1] and discussions [2]. However, no
matter how you look at this code, it's weird.

Basically it says that atomics are written in such a way that they can be
used from backend code but can't be used from frontend code. So if you want
to write a cross-platform parallel incremental backup tool (which is what
Anastasia is currently working on) you have to use C++ and std::atomic
(since MS doesn't develop C anymore) or write something like:

```
#undef FRONTEND
#include "port/atomics.h"
#define FRONTEND
```

We would like to know whether you share this concern and whether it
would be a good idea to try to refactor the code so that atomics could
be used not only from the backend.

[1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=4eda0a64705
[2]: https://postgr.es/m/20150806070902.ge12...@awork2.anarazel.de

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


Re: MSVC builld of 9.5.12 is broken?

2018-02-27 Thread Victor Wagner
On Tue, 27 Feb 2018 14:14:04 +0100
Magnus Hagander  wrote:

> 
> Thanks. I've pushed this for 9.3-9.5.
> 
> Please verify that it looks good if you can (it takes a while for the
> buildfarm to get around to it)

Works at least for 9.5, although I haven't run tap tests on this build.

---



Re: Re: ERROR: Aggref found in non-Agg plan node (introduced in commit 3bf05e096b9f8375e640c5d7996aa57efd7f240c)

2018-02-27 Thread Jeevan Chalke
On Tue, Feb 27, 2018 at 4:55 PM, Andreas Joseph Krogh 
wrote:

> On Tuesday 27 February 2018 at 12:16:42, Jeevan Chalke <
> jeevan.cha...@enterprisedb.com>:
>
> Hi,
>
> On Tue, Feb 27, 2018 at 3:46 PM, Andreas Joseph Krogh 
> wrote:
>>
>> $subject
>>
>> The query I'm using is 14K and I have not produced a test-case, so this
>> is just a heads-up.
>>
>
> Rajkumar off-list shared the same issue with me. And I have posted a fix for
> the same (https://www.postgresql.org/message-id/CAM2+6=X9kxQoL2ZqZ00E
> 6asbt9z+rfywbomhxj0+8fpaymz...@mail.gmail.com)
>
> Can you please check if that fixes your test-case or not?
>
>
>
> Applying fix_aggref_in_non-agg_error.patch (only) fixes the issue
>

Thank you Andreas for confirming.


>
> --
> Andreas Joseph Krogh
> ​
>



-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: MSVC builld of 9.5.12 is broken?

2018-02-27 Thread Magnus Hagander
On Tue, Feb 27, 2018 at 2:05 PM, Victor Wagner  wrote:

> On Tue, 27 Feb 2018 13:35:33 +0100
> Magnus Hagander  wrote:
>
>
> > > Note that said commit (91f3ffc5249) is not limited to rearranging
> > > makefile. It also changes a lot into C code itself. So it is not a
> > > question of reverting commit - it is making new commit, which
> > > reverts changes in just one file.
> >
> >
> > Oh, I missed that.
> >
> > I think we should revert *just the changes to the Makefile*, and of
> > course leave the rest of the commit. Can you confirm if that fixes
> > the problem?
>
> It seems that it is not so easy. I've tried to revert the changes in the
> Makefile, and found out that some utilities (createlang, droplang,
> pg_isready) now need more common object files than they needed before
> this commit.
>
> Attached is a patch to the makefile which fixes the problem for the 9.3
> branch. I think it should do for 9.4 and 9.5 too.
>

Thanks. I've pushed this for 9.3-9.5.

Please verify that it looks good if you can (it takes a while for the
buildfarm to get around to it)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: MSVC builld of 9.5.12 is broken?

2018-02-27 Thread Victor Wagner
On Tue, 27 Feb 2018 13:49:29 +0100
Magnus Hagander  wrote:


 
 
> Does not apply cleanly to 9.5 at least for me. Probably easy to fix,
> but I still feel we shouldn't mess around with the buildsystem in
> back branches unless we actually have to.

How interesting - somewhere between 9.3 (for which the patch was made) and
9.5, the path to the Makefile in the Windows-specific script was changed
from Windows-style separators to Unix-style, and this broke the patch
context.

-- 



Re: MSVC builld of 9.5.12 is broken?

2018-02-27 Thread Sandeep Thakkar
Hi Magnus

On Tue, Feb 27, 2018 at 6:05 PM, Magnus Hagander 
wrote:

> On Tue, Feb 27, 2018 at 11:53 AM, Victor Wagner 
> wrote:
>
>> On Tue, 27 Feb 2018 11:43:34 +0100
>> Magnus Hagander  wrote:
>>
>> > On Tue, Feb 27, 2018 at 11:27 AM, Victor Wagner 
>> > wrote:
>> >
>> > > Hello, hackers.
>> > >
>> > > I've tried to build the latest state of the REL9_5_STABLE branch
>> > > (commit 1f19e46124eee8c6a54834) and under Win32 encountered the
>> > > following errors:
>> > >
>> [skip]
>> >
>> > It's also interesting to note that this did not break in HEAD, 10 or
>> > 9.6. And none of those actually have the SCRIPTS_COMMON code.
>>
>> It seems that it early stages of 9.6 cycle there was another approach
>> taken to improve readability of this Makefile - just all common code
>> put into one C file. So there is no need for SCRIPTS_COMMON variable,
>> because its name is longer than name of common.o which would be its sole
>> contents.
>>
>> > I'm unsure why this was introduced in 9.5 and earlier, but not in the
>> > newer ones.  This smells like a possible backpatch mistake, in which
>> > case that part should probably be backed out of the old branches
>> > rather than teaching mkvcbuild about it.
>>
>> Note that said commit (91f3ffc5249) is not limited to rearranging
>> makefile. It also changes a lot into C code itself. So it is not a
>> question of reverting commit - it is making new commit, which reverts
>> changes in just one file.
>
>
> Oh, I missed that.
>
> I think we should revert *just the changes to the Makefile*, and of course
> leave the rest of the commit. Can you confirm if that fixes the problem?
>
That fixes some of the errors, but the following errors are still seen:

 common.obj : error LNK2019: unresolved external symbol _fmtQualifiedId
referenced in function _appendQualifiedRelation
[D:\pginstaller.auto\postgres.windows\createlang.vcxproj]
 common.obj : error LNK2019: unresolved external symbol
_appendStringLiteralConn referenced in function _appendQualifiedRelation
[D:\pginstaller.auto\postgres.windows\createlang.vcxproj]
 .\Release\createlang\createlang.exe : fatal error LNK1120: 2 unresolved
externals [D:\pginstaller.auto\postgres.windows\createlang.vcxproj]

"D:\pginstaller.auto\postgres.windows\pgsql.sln" (default target) (1) ->
"D:\pginstaller.auto\postgres.windows\droplang.vcxproj" (default target)
(95) ->
 common.obj : error LNK2019: unresolved external symbol _fmtQualifiedId
referenced in function _appendQualifiedRelation
[D:\pginstaller.auto\postgres.windows\droplang.vcxproj]
 common.obj : error LNK2019: unresolved external symbol
_appendStringLiteralConn referenced in function _appendQualifiedRelation
[D:\pginstaller.auto\postgres.windows\droplang.vcxproj]
 .\Release\droplang\droplang.exe : fatal error LNK1120: 2 unresolved
externals [D:\pginstaller.auto\postgres.windows\droplang.vcxproj]

"D:\pginstaller.auto\postgres.windows\pgsql.sln" (default target) (1) ->
"D:\pginstaller.auto\postgres.windows\pg_isready.vcxproj" (default target)
(100) ->
 common.obj : error LNK2019: unresolved external symbol _fmtQualifiedId
referenced in function _appendQualifiedRelation
[D:\pginstaller.auto\postgres.windows\pg_isready.vcxproj]
 common.obj : error LNK2019: unresolved external symbol
_appendStringLiteralConn referenced in function _appendQualifiedRelation
[D:\pginstaller.auto\postgres.windows\pg_isready.vcxproj]
 .\Release\pg_isready\pg_isready.exe : fatal error LNK1120: 2 unresolved
externals [D:\pginstaller.auto\postgres.windows\pg_isready.vcxproj]
--


> --
>  Magnus Hagander
>  Me: https://www.hagander.net/ 
>  Work: https://www.redpill-linpro.com/ 
>



-- 
Sandeep Thakkar
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: MSVC builld of 9.5.12 is broken?

2018-02-27 Thread Victor Wagner
On Tue, 27 Feb 2018 13:35:33 +0100
Magnus Hagander  wrote:


> > Note that said commit (91f3ffc5249) is not limited to rearranging
> > makefile. It also changes a lot into C code itself. So it is not a
> > question of reverting commit - it is making new commit, which
> > reverts changes in just one file.  
> 
> 
> Oh, I missed that.
> 
> I think we should revert *just the changes to the Makefile*, and of
> course leave the rest of the commit. Can you confirm if that fixes
> the problem?

It seems that it is not so easy. I've tried to revert the changes in the
Makefile, and found out that some utilities (createlang, droplang,
pg_isready) now need more common object files than they needed before
this commit.

Attached is a patch to the makefile which fixes the problem for the 9.3
branch. I think it should do for 9.4 and 9.5 too.

--
 
 

diff --git a/src/bin/scripts/Makefile b/src/bin/scripts/Makefile
old mode 100644
new mode 100755
index 996db500e9..0dba267c20
--- a/src/bin/scripts/Makefile
+++ b/src/bin/scripts/Makefile
@@ -25,17 +25,17 @@ all: $(PROGRAMS)
 %: %.o $(WIN32RES)
 	$(CC) $(CFLAGS) $^ $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
 
-SCRIPTS_COMMON = common.o dumputils.o kwlookup.o keywords.o
-createdb: createdb.o $(SCRIPTS_COMMON) | submake-libpq submake-libpgport
-createlang: createlang.o $(SCRIPTS_COMMON) print.o mbprint.o | submake-libpq submake-libpgport
-createuser: createuser.o $(SCRIPTS_COMMON) | submake-libpq submake-libpgport
-dropdb: dropdb.o $(SCRIPTS_COMMON) | submake-libpq submake-libpgport
-droplang: droplang.o $(SCRIPTS_COMMON) print.o mbprint.o | submake-libpq submake-libpgport
-dropuser: dropuser.o $(SCRIPTS_COMMON) | submake-libpq submake-libpgport
-clusterdb: clusterdb.o $(SCRIPTS_COMMON) | submake-libpq submake-libpgport
-vacuumdb: vacuumdb.o $(SCRIPTS_COMMON) | submake-libpq submake-libpgport
-reindexdb: reindexdb.o $(SCRIPTS_COMMON) | submake-libpq submake-libpgport
-pg_isready: pg_isready.o $(SCRIPTS_COMMON) | submake-libpq submake-libpgport
+createdb: createdb.o common.o dumputils.o kwlookup.o keywords.o | submake-libpq submake-libpgport
+createlang: createlang.o common.o dumputils.o kwlookup.o keywords.o print.o mbprint.o | submake-libpq submake-libpgport
+createuser: createuser.o common.o dumputils.o kwlookup.o keywords.o | submake-libpq submake-libpgport
+dropdb: dropdb.o common.o dumputils.o kwlookup.o keywords.o | submake-libpq submake-libpgport
+droplang: droplang.o common.o dumputils.o kwlookup.o keywords.o print.o mbprint.o | submake-libpq submake-libpgport
+dropuser: dropuser.o common.o dumputils.o kwlookup.o keywords.o | submake-libpq submake-libpgport
+clusterdb: clusterdb.o common.o dumputils.o kwlookup.o keywords.o | submake-libpq submake-libpgport
+vacuumdb: vacuumdb.o common.o dumputils.o kwlookup.o keywords.o | submake-libpq submake-libpgport
+reindexdb: reindexdb.o common.o dumputils.o kwlookup.o keywords.o | submake-libpq submake-libpgport
+pg_isready: pg_isready.o common.o dumputils.o kwlookup.o keywords.o | submake-libpq submake-libpgport
+
 
 dumputils.c keywords.c: % : $(top_srcdir)/src/bin/pg_dump/%
 	rm -f $@ && $(LN_S) $< .
@@ -67,5 +67,5 @@ uninstall:
 
 clean distclean maintainer-clean:
 	rm -f $(addsuffix $(X), $(PROGRAMS)) $(addsuffix .o, $(PROGRAMS))
-	rm -f $(SCRIPTS_COMMON) print.o mbprint.o $(WIN32RES)
+	rm -f common.o dumputils.o kwlookup.o keywords.o print.o mbprint.o $(WIN32RES)
 	rm -f dumputils.c print.c mbprint.c kwlookup.c keywords.c


Re: MSVC builld of 9.5.12 is broken?

2018-02-27 Thread Magnus Hagander
On Tue, Feb 27, 2018 at 1:42 PM, Victor Wagner  wrote:

> On Tue, 27 Feb 2018 11:43:34 +0100
> Magnus Hagander  wrote:
>
>
> > I'm unsure why this was introduced in 9.5 and earlier, but not in the
> > newer ones.  This smells like a possible backpatch mistake, in which
> > case that part should probably be backed out of the old branches
> > rather than teaching mkvcbuild about it.
>
> Patch for Mkvcbuild.pm is actually quite simple (see attachment)


Does not apply cleanly to 9.5 at least for me. Probably easy to fix, but I
still feel we shouldn't mess around with the buildsystem in back branches
unless we actually have to.


> Really, I have a more complicated patch, which supports recursive
> substitution as gmake does. It was developed to simplify the inclusion of
> PGXS extensions into the contrib tree. (Authors of modern extensions often
> use complex Makefile constructs.)
>

That's of course not going to get backpatched.



>
> I've not presented it to the community because I hoped that the current MSVC
> build system would soon be replaced by cmake.
>

I don't know how you define "soon". Some people have been saying "soon" for
almost 10 years on that, so I wouldn't hold my breath :P


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: MSVC builld of 9.5.12 is broken?

2018-02-27 Thread Victor Wagner
On Tue, 27 Feb 2018 11:43:34 +0100
Magnus Hagander  wrote:


> I'm unsure why this was introduced in 9.5 and earlier, but not in the
> newer ones.  This smells like a possible backpatch mistake, in which
> case that part should probably be backed out of the old branches
> rather than teaching mkvcbuild about it.

Patch for Mkvcbuild.pm is actually quite simple (see attachment).

Really, I have a more complicated patch, which supports recursive
substitution as gmake does. It was developed to simplify the inclusion of
PGXS extensions into the contrib tree. (Authors of modern extensions often
use complex Makefile constructs.)

I've not presented it to the community because I hoped that the current
MSVC build system would soon be replaced by cmake.

-- 


diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
old mode 100644
new mode 100755
index 2e19aa7035..2bd506bb90
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -686,6 +686,8 @@ sub mkvcbuild
 
 	$mf = Project::read_file('src\bin\scripts\Makefile');
 	$mf =~ s{\\s*[\r\n]+}{}mg;
+	$mf =~ m{SCRIPTS_COMMON\s*=\s*(.*)$}m;
+	my @common_files = split /\s+/, $1;
 	$mf =~ m{PROGRAMS\s*=\s*(.*)$}m
 	  || die 'Could not match in bin\scripts\Makefile' . "\n";
 	foreach my $prg (split /\s+/, $1)
@@ -694,6 +696,7 @@ sub mkvcbuild
 		$mf =~ m{$prg\s*:\s*(.*)$}m
 		  || die 'Could not find script define for $prg' . "\n";
 		my @files = split /\s+/, $1;
+		push @files,@common_files if (grep {$_ eq '$(SCRIPTS_COMMON)'} @files);
 		foreach my $f (@files)
 		{
 			$f =~ s/\.o$/\.c/;


Re: MSVC builld of 9.5.12 is broken?

2018-02-27 Thread Magnus Hagander
On Tue, Feb 27, 2018 at 11:53 AM, Victor Wagner  wrote:

> On Tue, 27 Feb 2018 11:43:34 +0100
> Magnus Hagander  wrote:
>
> > On Tue, Feb 27, 2018 at 11:27 AM, Victor Wagner 
> > wrote:
> >
> > > Hello, hackers.
> > >
> > > I've tried to build the latest state of the REL9_5_STABLE branch
> > > (commit 1f19e46124eee8c6a54834) and under Win32 encountered the
> > > following errors:
> > >
> [skip]
> >
> > It's also interesting to note that this did not break in HEAD, 10 or
> > 9.6. And none of those actually have the SCRIPTS_COMMON code.
>
> It seems that in the early stages of the 9.6 cycle another approach was
> taken to improve the readability of this Makefile - all common code was
> just put into one C file. So there is no need for a SCRIPTS_COMMON
> variable, because its name is longer than the name of common.o, which
> would be its sole contents.
>
> > I'm unsure why this was introduced in 9.5 and earlier, but not in the
> > newer ones.  This smells like a possible backpatch mistake, in which
> > case that part should probably be backed out of the old branches
> > rather than teaching mkvcbuild about it.
>
> Note that said commit (91f3ffc5249) is not limited to rearranging
> makefile. It also changes a lot into C code itself. So it is not a
> question of reverting commit - it is making new commit, which reverts
> changes in just one file.


Oh, I missed that.

I think we should revert *just the changes to the Makefile*, and of course
leave the rest of the commit. Can you confirm if that fixes the problem?


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] Add support for tuple routing to foreign partitions

2018-02-27 Thread Etsuro Fujita

(2018/02/21 20:54), Etsuro Fujita wrote:

void
BeginForeignRouting();

Prepare for a tuple-routing operation on a foreign table. This is called
from ExecSetupPartitionTupleRouting and ExecInitPartitionInfo.


I modified execPartition.c so that this callback routine is called from
a single function that I added to execPartition.c; it is called the
first time the foreign partition is chosen as the target partition to
route a tuple to.  That removes CheckValidResultRel, the
tuple-conversion setup, and the FDW initialization for each UPDATE
subplan from ExecSetupPartitionTupleRouting, so it minimizes the
possibly-useless overhead in that function.
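In rough outline, the single function defers the per-partition setup
until the partition is first chosen (all names below are illustrative,
not necessarily the patch's):

```c
/*
 * Sketch only: "ri_PartitionIsInitialized" is an assumed flag, not an
 * existing ResultRelInfo field, and the BeginForeignRouting argument
 * list is likewise assumed.
 */
static void
ExecInitRoutingInfoOnce(EState *estate, ResultRelInfo *partrel)
{
	if (partrel->ri_PartitionIsInitialized)
		return;

	CheckValidResultRel(partrel, CMD_INSERT);

	if (partrel->ri_FdwRoutine != NULL &&
		partrel->ri_FdwRoutine->BeginForeignRouting != NULL)
		partrel->ri_FdwRoutine->BeginForeignRouting(estate, partrel);

	partrel->ri_PartitionIsInitialized = true;
}
```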


Changes other than that are:

* Fixed typo and revised code/comments
* Added more regression tests
* Added docs

Attached is a new version of the patch set.

Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***
*** 375,386  static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
--- 375,398 
  static void create_cursor(ForeignScanState *node);
  static void fetch_more_data(ForeignScanState *node);
  static void close_cursor(PGconn *conn, unsigned int cursor_number);
+ static PgFdwModifyState *create_fdw_modify_state(ModifyTableState *mtstate,
+ 		ResultRelInfo *resultRelInfo,
+ 		CmdType operation,
+ 		int subplan_index,
+ 		char *query,
+ 		List *target_attrs,
+ 		bool has_returning,
+ 		List *retrieved_attrs);
  static void prepare_foreign_modify(PgFdwModifyState *fmstate);
  static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
  		 ItemPointer tupleid,
  		 TupleTableSlot *slot);
+ static int execute_prep_stmt(PgFdwModifyState *fmstate,
+   const char **p_values,
+   TupleTableSlot *slot);
  static void store_returning_result(PgFdwModifyState *fmstate,
  	   TupleTableSlot *slot, PGresult *res);
+ static void finish_foreign_modify(PgFdwModifyState *fmstate);
  static List *build_remote_returning(Index rtindex, Relation rel,
  	   List *returningList);
  static void rebuild_fdw_scan_tlist(ForeignScan *fscan, List *tlist);
***
*** 1678,1695  postgresBeginForeignModify(ModifyTableState *mtstate,
  		   int eflags)
  {
  	PgFdwModifyState *fmstate;
! 	EState	   *estate = mtstate->ps.state;
! 	CmdType		operation = mtstate->operation;
! 	Relation	rel = resultRelInfo->ri_RelationDesc;
! 	RangeTblEntry *rte;
! 	Oid			userid;
! 	ForeignTable *table;
! 	UserMapping *user;
! 	AttrNumber	n_params;
! 	Oid			typefnoid;
! 	bool		isvarlena;
! 	ListCell   *lc;
! 	TupleDesc	tupdesc = RelationGetDescr(rel);
  
  	/*
  	 * Do nothing in EXPLAIN (no ANALYZE) case.  resultRelInfo->ri_FdwState
--- 1690,1699 
  		   int eflags)
  {
  	PgFdwModifyState *fmstate;
! 	char	   *query;
! 	List	   *target_attrs;
! 	bool		has_returning;
! 	List	   *retrieved_attrs;
  
  	/*
  	 * Do nothing in EXPLAIN (no ANALYZE) case.  resultRelInfo->ri_FdwState
***
*** 1698,1779  postgresBeginForeignModify(ModifyTableState *mtstate,
  	if (eflags & EXEC_FLAG_EXPLAIN_ONLY)
  		return;
  
- 	/* Begin constructing PgFdwModifyState. */
- 	fmstate = (PgFdwModifyState *) palloc0(sizeof(PgFdwModifyState));
- 	fmstate->rel = rel;
- 
- 	/*
- 	 * Identify which user to do the remote access as.  This should match what
- 	 * ExecCheckRTEPerms() does.
- 	 */
- 	rte = rt_fetch(resultRelInfo->ri_RangeTableIndex, estate->es_range_table);
- 	userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
- 
- 	/* Get info about foreign table. */
- 	table = GetForeignTable(RelationGetRelid(rel));
- 	user = GetUserMapping(userid, table->serverid);
- 
- 	/* Open connection; report that we'll create a prepared statement. */
- 	fmstate->conn = GetConnection(user, true);
- 	fmstate->p_name = NULL;		/* prepared statement not made yet */
- 
  	/* Deconstruct fdw_private data. */
! 	fmstate->query = strVal(list_nth(fdw_private,
! 	 FdwModifyPrivateUpdateSql));
! 	fmstate->target_attrs = (List *) list_nth(fdw_private,
! 			  FdwModifyPrivateTargetAttnums);
! 	fmstate->has_returning = intVal(list_nth(fdw_private,
! 			 FdwModifyPrivateHasReturning));
! 	fmstate->retrieved_attrs = (List *) list_nth(fdw_private,
!  FdwModifyPrivateRetrievedAttrs);
! 
! 	/* Create context for per-tuple temp workspace. */
! 	fmstate->temp_cxt = AllocSetContextCreate(estate->es_query_cxt,
! 			  "postgres_fdw temporary data",
! 			  ALLOCSET_SMALL_SIZES);
! 
! 	/* Prepare for input conversion of RETURNING results. */
! 	if (fmstate->has_returning)
! 		fmstate->attinmeta = TupleDescGetAttInMetadata(tupdesc);
! 
! 	/* Prepare for output conversion of parameters used in prepared stmt. */
! 	n_params = list_length(fmstate->target_attrs) + 1;
! 	fmstate->p_flinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * n_params);
! 	fmstate->p_nums = 0;
! 
! 	if (operation == CMD_UPDATE || operati

Sv: Re: ERROR: Aggref found in non-Agg plan node (introduced in commit 3bf05e096b9f8375e640c5d7996aa57efd7f240c)

2018-02-27 Thread Andreas Joseph Krogh
On Tuesday 27 February 2018 at 12:16:42, Jeevan Chalke
<jeevan.cha...@enterprisedb.com> wrote:
> Hi,
>
> On Tue, Feb 27, 2018 at 3:46 PM, Andreas Joseph Krogh
> <andr...@visena.com> wrote:
>> $subject
>>
>> The query I'm using is 14K and I have not produced a test-case, so this
>> is just a heads-up.
>
> Rajkumar off-list shared the same issue with me. And I have posted a fix
> for the same (
> https://www.postgresql.org/message-id/CAM2+6=x9kxqol2zqz00e6asbt9z+rfywbomhxj0+8fpaymz...@mail.gmail.com
> )
>
> Can you please check if that fixes your test-case or not?

Applying fix_aggref_in_non-agg_error.patch (only) fixes the issue.

--
Andreas Joseph Krogh




Re: parallel append vs. simple UNION ALL

2018-02-27 Thread Rajkumar Raghuwanshi
On Sat, Feb 24, 2018 at 2:55 AM, Robert Haas  wrote:

> 0001 is pretty much the same as the subquery-smarts.patch file I
> attached to the previous email.  I don't see much reason not to go
> ahead and commit this, although it could use a test case.  It makes
> the simple/flattened case work.  After some study I think that the
> gather-parameter handling is correct, although if somebody felt like
> reviewing that portion especially I wouldn't say no.
>

I have applied 0001 on pg-head, and while playing with regression tests, it
crashed with the below test case.

postgres=# SET min_parallel_table_scan_size=0;
SET
postgres=# SELECT * FROM information_schema.foreign_data_wrapper_options
ORDER BY 1, 2, 3;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

--logfile
2018-02-26 22:06:07.331 IST [43508] LOG:  database system is ready to
accept connections
TRAP: FailedAssertion("!(subpath->parallel_safe)", File: "pathnode.c",
Line: 1813)
2018-02-26 22:06:42.345 IST [43508] LOG:  server process (PID 43519) was
terminated by signal 6: Aborted
2018-02-26 22:06:42.345 IST [43508] DETAIL:  Failed process was running:
SELECT * FROM information_schema.foreign_data_wrapper_options ORDER BY 1,
2, 3;

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation


Re: ERROR: Aggref found in non-Agg plan node (introduced in commit 3bf05e096b9f8375e640c5d7996aa57efd7f240c)

2018-02-27 Thread Jeevan Chalke
Hi,

On Tue, Feb 27, 2018 at 3:46 PM, Andreas Joseph Krogh 
wrote:

> $subject
>
> The query I'm using is 14K and I have not produced a test-case, so this is
> just a heads-up.
>

Rajkumar off-list shared the same issue with me. And I have posted a fix for
the same (
https://www.postgresql.org/message-id/CAM2+6=x9kxqol2zqz00e6asbt9z+rfywbomhxj0+8fpaymz...@mail.gmail.com
)

Can you please check if that fixes your test-case or not?

Thanks



>
> --
> Andreas Joseph Krogh
>
>



-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: MSVC builld of 9.5.12 is broken?

2018-02-27 Thread Victor Wagner
On Tue, 27 Feb 2018 11:43:34 +0100
Magnus Hagander  wrote:

> On Tue, Feb 27, 2018 at 11:27 AM, Victor Wagner 
> wrote:
> 
> > Hello, hackers.
> >
> > I've tried to build the latest state of the REL9_5_STABLE branch
> > (commit 1f19e46124eee8c6a54834) and under Win32 encountered the
> > following errors:
> >
[skip]
> 
> It's also interesting to note that this did not break in HEAD, 10 or
> 9.6. And none of those actually have the SCRIPTS_COMMON code.

It seems that in the early stages of the 9.6 cycle another approach was
taken to improve the readability of this Makefile - all common code was
just put into one C file. So there is no need for a SCRIPTS_COMMON
variable, because its name is longer than the name of common.o, which
would be its sole contents.

> I'm unsure why this was introduced in 9.5 and earlier, but not in the
> newer ones.  This smells like a possible backpatch mistake, in which
> case that part should probably be backed out of the old branches
> rather than teaching mkvcbuild about it.

Note that said commit (91f3ffc5249) is not limited to rearranging
makefile. It also changes a lot into C code itself. So it is not a
question of reverting commit - it is making new commit, which reverts
changes in just one file.



> Noah? Can you confirm if it was intentional?
 




Re: MSVC builld of 9.5.12 is broken?

2018-02-27 Thread Magnus Hagander
On Tue, Feb 27, 2018 at 11:27 AM, Victor Wagner  wrote:

> Hello, hackers.
>
> I've tried to build the latest state of the REL9_5_STABLE branch (commit
> 1f19e46124eee8c6a54834) and under Win32 encountered the following errors:
>
>
>   C:\Program Files (x86)\Microsoft Visual Studio
> 12.0\VC\bin\amd64\link.exe /ERRORREPORT:QUEUE /OUT:".\Release\createdb\
> createdb.exe"
> /INCREMENTAL:NO /NOLOGO
> "C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\zlib\lib\zdll.
> lib"
> "C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\openssl\lib\VC\
> ssleay32MD.lib"
> "C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\openssl\lib\VC\
> libeay32MD.lib"
> "C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\libintl\lib\
> libintl.lib"
> "C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\
> dependencies\iconv\lib\iconv.lib"
> "C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\libxml2\lib\
> libxml2.lib"
> "C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\libxslt\lib\
> libxslt.lib"
> "C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\icu\lib\icuin.
> lib"
> "C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\icu\lib\icuuc.
> lib"
> "C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\wineditline\
> lib64\edit.lib"
> Release/libpq/libpq.lib Release/libpgcommon/libpgcommon.lib
> Release/libpgport/libpgport.lib ws2_32.lib kernel32.lib user32.lib
> gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib
> oleaut32.lib uuid.lib odbc32.lib
> odbccp32.lib /NODEFAULTLIB:libc /MANIFEST /MANIFESTUAC:"level='asInvoker'
> uiAccess='false'" /manifest:embed /DEBUG /PDB:".\Release\createdb\
> createdb.pdb"
> /SUBSYSTEM:CONSOLE /STACK:"4194304" /TLBID:1 /DYNAMICBASE:NO /NXCOMPAT
> /IMPLIB:".\Release\createdb\createdb.lib"
> /MACHINE:X64 /ignore:4197 .\Release\createdb\win32ver.res
> .\Release\createdb\createdb.obj
> "C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\
> postgresql\postgrespro-9.5.12.1\Release\libpq\libpq.lib"
> "C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\
> postgresql\postgrespro-9.5.12.1\Release\libpgcommon\libpgcommon.lib"
> "C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\
> postgresql\postgrespro-9.5.12.1\Release\libpgport\libpgport.lib"
> createdb.obj : error LNK2019: unresolved external symbol
> handle_help_version_opts referenced in function main
> [C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\
> postgresql\postgrespro-9.5.12.1\createdb.vcxproj]
> createdb.obj : error LNK2019: unresolved external symbol
> connectMaintenanceDatabase referenced in function main
> [C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\
> postgresql\postgrespro-9.5.12.1\createdb.vcxproj]
> createdb.obj : error LNK2019: unresolved external symbol fmtId
> referenced in function main
> [C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\
> postgresql\postgrespro-9.5.12.1\createdb.vcxproj]
> createdb.obj : error LNK2019: unresolved external symbol
> appendStringLiteralConn referenced in function main
> [C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\
> postgresql\postgrespro-9.5.12.1\createdb.vcxproj]
> .\Release\createdb\createdb.exe :
> fatal error LNK1120: 4 unresolved externals
> [C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\
> postgresql\postgrespro-9.5.12.1\createdb.vcxproj]
> Done Building Project
> "C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\
> postgresql\postgrespro-9.5.12.1\createdb.vcxproj" (default
> targets) -- FAILED.
>
> And many more similar errors for all applications in src/bin/scripts.
> The problem is reproducible with both MSVC 2010 and MSVC 2013, and with
> both 32- and 64-bit builds.
>
> These errors seem to be caused by commit 91f3ffc5249,
> which improves the readability of src/bin/scripts/Makefile for
> humans by moving the list of common object files into a variable.
>
> Unfortunately, it makes this Makefile unreadable for the perl scripts
> in src/tools/msvc, which generate the Microsoft Visual Studio projects.
>
> So, Mkvcbuild.pm should be fixed somehow as well. As this module
> consists entirely of special cases and hidden knowledge, the simplest
> way to fix it is to add special parsing of the SCRIPTS_COMMON variable.
>

(This can also be seen on the buildfarm which is red for 9.5 and earlier
with msvc)

It's also interesting to note that this did not break in HEAD, 10 or 9.6.
And none of those actually have the SCRIPTS_COMMON code.

I'm unsure why this was introduced in 9.5 and earlier, but not in the newer
ones.  This smells like a possible backpatch mistake, in which case that
part should probably be backed out of the old branches rather than teaching
mkvcbuild about it.

Noah? Can you confirm if it was intentional?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


MSVC builld of 9.5.12 is broken?

2018-02-27 Thread Victor Wagner
Hello, hackers.

I've tried to build the latest state of the REL9_5_STABLE branch (commit
1f19e46124eee8c6a54834) and under Win32 encountered the following errors:


  C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64\link.exe 
/ERRORREPORT:QUEUE /OUT:".\Release\createdb\createdb.exe" 
/INCREMENTAL:NO /NOLOGO
"C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\zlib\lib\zdll.lib"
"C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\openssl\lib\VC\ssleay32MD.lib"
"C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\openssl\lib\VC\libeay32MD.lib"
"C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\libintl\lib\libintl.lib"
"C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\iconv\lib\iconv.lib"
"C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\libxml2\lib\libxml2.lib"
"C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\libxslt\lib\libxslt.lib"
"C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\icu\lib\icuin.lib"
"C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\icu\lib\icuuc.lib"
"C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\dependencies\wineditline\lib64\edit.lib"
Release/libpq/libpq.lib Release/libpgcommon/libpgcommon.lib
Release/libpgport/libpgport.lib ws2_32.lib kernel32.lib user32.lib
gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib
oleaut32.lib uuid.lib odbc32.lib
odbccp32.lib /NODEFAULTLIB:libc /MANIFEST /MANIFESTUAC:"level='asInvoker'
uiAccess='false'" /manifest:embed /DEBUG /PDB:".\Release\createdb\createdb.pdb"
/SUBSYSTEM:CONSOLE /STACK:"4194304" /TLBID:1 /DYNAMICBASE:NO /NXCOMPAT 
/IMPLIB:".\Release\createdb\createdb.lib"
/MACHINE:X64 /ignore:4197 .\Release\createdb\win32ver.res 
.\Release\createdb\createdb.obj
"C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\postgresql\postgrespro-9.5.12.1\Release\libpq\libpq.lib"
"C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\postgresql\postgrespro-9.5.12.1\Release\libpgcommon\libpgcommon.lib"
"C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\postgresql\postgrespro-9.5.12.1\Release\libpgport\libpgport.lib"
createdb.obj : error LNK2019: unresolved external symbol
handle_help_version_opts referenced in function main
[C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\postgresql\postgrespro-9.5.12.1\createdb.vcxproj]
createdb.obj : error LNK2019: unresolved external symbol
connectMaintenanceDatabase referenced in function main
[C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\postgresql\postgrespro-9.5.12.1\createdb.vcxproj]
createdb.obj : error LNK2019: unresolved external symbol fmtId
referenced in function main
[C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\postgresql\postgrespro-9.5.12.1\createdb.vcxproj]
createdb.obj : error LNK2019: unresolved external symbol
appendStringLiteralConn referenced in function main
[C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\postgresql\postgrespro-9.5.12.1\createdb.vcxproj]
 .\Release\createdb\createdb.exe :
fatal error LNK1120: 4 unresolved externals
[C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\postgresql\postgrespro-9.5.12.1\createdb.vcxproj]
Done Building Project
"C:\pgbuild\pgpro-9.5-msvc2013-amd64\builddir\postgresql\postgrespro-9.5.12.1\createdb.vcxproj"
 (default
targets) -- FAILED.

And many more similar errors for all applications in src/bin/scripts.
The problem is reproducible with both MSVC 2010 and MSVC 2013, and with
both 32- and 64-bit builds.

These errors seem to be caused by commit 91f3ffc5249,
which improves the readability of src/bin/scripts/Makefile for
humans by moving the list of common object files into a variable.

Unfortunately, it makes this Makefile unreadable for the perl scripts
in src/tools/msvc, which generate the Microsoft Visual Studio projects.

So, Mkvcbuild.pm should be fixed somehow as well. As this module
consists entirely of special cases and hidden knowledge, the simplest
way to fix it is to add special parsing of the SCRIPTS_COMMON variable.

--



--



server crash in nodeAppend.c

2018-02-27 Thread Rajkumar Raghuwanshi
Hi,

I am getting a server crash on PG-HEAD with the below test case.

SET parallel_setup_cost=0;
SET parallel_tuple_cost=0;
create or replace function foobar() returns setof text as
$$ select 'foo'::varchar union all select 'bar'::varchar ; $$
language sql stable;

postgres=# select foobar();
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

--logfile
2018-02-26 22:15:52.908 IST [61738] LOG:  database system is ready to
accept connections
TRAP: FailedAssertion("!(whichplan >= 0 && whichplan <= node->as_nplans)",
File: "nodeAppend.c", Line: 365)
2018-02-26 22:17:50.441 IST [61738] LOG:  server process (PID 61748) was
terminated by signal 6: Aborted
2018-02-26 22:17:50.441 IST [61738] DETAIL:  Failed process was running:
select foobar();


Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation


Fix a typo in walsender.c

2018-02-27 Thread atorikoshi

Hi,

Attached is a minor patch for a typo: s/log/lag

Regards,

--
Atsushi Torikoshi
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
diff --git a/src/backend/replication/walsender.c 
b/src/backend/replication/walsender.c
index d46374d..8bb1919 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1245,7 +1245,7 @@ WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr 
lsn, TransactionId xid,
 /*
  * LogicalDecodingContext 'update_progress' callback.
  *
- * Write the current position to the log tracker (see XLogSendPhysical).
+ * Write the current position to the lag tracker (see XLogSendPhysical).
  */
 static void
 WalSndUpdateProgress(LogicalDecodingContext *ctx, XLogRecPtr lsn, 
TransactionId xid)


ERROR: Aggref found in non-Agg plan node (introduced in commit 3bf05e096b9f8375e640c5d7996aa57efd7f240c)

2018-02-27 Thread Andreas Joseph Krogh
$subject
 
The query I'm using is 14K and I have not produced a test-case, so this is 
just a heads-up.
 
--
Andreas Joseph Krogh


