Re: fixing more format truncation issues
On Wed, Feb 28, 2018 at 11:14:23PM -0500, Peter Eisentraut wrote:
> AFAICT, the issues addressed here either can't really happen without
> trying very hard, or would cause harmless output truncation.  Still, it
> seems good to clean this up properly and not rely on made-up buffer size
> guesses that turn out to be wrong, even if we don't want to adopt the
> warning options by default.

Good idea.

> One issue that is of external interest is that I increase BGW_MAXLEN
> from 64 to 96.  Apparently, the old value would cause the bgw_name of
> logical replication workers to be truncated in some circumstances.  I
> have also seen truncated background worker names with third-party
> packages, so giving some more room here would be useful.

OK, no complaints about that.

@@ -89,7 +89,7 @@ static Datum
 build_pgstattuple_type(pgstattuple_type *stat, FunctionCallInfo fcinfo)
 {
 #define NCOLUMNS	9
-#define NCHARS	32
+#define NCHARS	314

So this one is caused by the output of %.2f...

Enabling the warning options by default would generate some useless noise
if the patch is left as-is, as a couple of the warnings are not addressed.
Please see the full report attached.  Is that intentional?  I am using
GCC 7.3 here.
interval.c: In function ‘AppendSeconds’:
interval.c:759:22: warning: ‘%0*d’ directive output between 1 and 2147483648 bytes may exceed minimum required size of 4095 [-Wformat-overflow=]
   sprintf(cp, "%02d.%0*d", abs(sec), precision, (int) Abs(fsec));

pg_rusage.c:64:5: note: in expansion of macro ‘_’
     _("CPU: user: %d.%02d s, system: %d.%02d s, elapsed: %d.%02d s"),
     ^
pg_rusage.c:63:2: note: ‘snprintf’ output between 51 and 108 bytes into a destination of size 100
  snprintf(result, sizeof(result),
--
Michael

:0:0: note: this is the location of the previous definition
In file included from ../../../src/include/postgres.h:46:0,
                 from be-secure-openssl.c:17:
be-secure-openssl.c: In function ‘SSLerrmessage’:
../../../src/include/c.h:1009:20: warning: ‘%lu’ directive output may be truncated writing between 1 and 20 bytes into a region of size 17 [-Wformat-truncation=]
 #define gettext(x) (x)
                    ^
../../../src/include/c.h:1015:14: note: in expansion of macro ‘gettext’
 #define _(x) gettext(x)
              ^~~
be-secure-openssl.c:1023:35: note: in expansion of macro ‘_’
  snprintf(errbuf, sizeof(errbuf), _("SSL error code %lu"), ecode);
                                   ^
be-secure-openssl.c:1023:2: note: ‘snprintf’ output between 17 and 36 bytes into a destination of size 32
  snprintf(errbuf, sizeof(errbuf), _("SSL error code %lu"), ecode);
  ^~~~

postgres.c: In function ‘check_log_duration’:
postgres.c:2156:36: warning: ‘snprintf’ output may be truncated before the last format character [-Wformat-truncation=]
    snprintf(msec_str, 32, "%ld.%03d",
                                    ^
postgres.c:2156:4: note: ‘snprintf’ output between 6 and 33 bytes into a destination of size 32
    snprintf(msec_str, 32, "%ld.%03d",
    ^~
     secs * 1000 + msecs, usecs % 1000);
     ~~

formatting.c: In function ‘DCH_to_char’:
formatting.c:2455:17: warning: ‘%0*d’ directive output between 1 and 2147483648 bytes may exceed minimum required size of 4095 [-Wformat-overflow=]
     sprintf(s, "%0*d", S_FM(n->suffix) ? 0 : (tm->tm_hour >= 0) ? 2 : 3,
     ^~~~
formatting.c:2455:16: note: assuming directive output of 11 bytes
     sprintf(s, "%0*d", S_FM(n->suffix) ? 0 : (tm->tm_hour >= 0) ? 2 : 3,
                ^~
formatting.c:2463:17: warning: ‘%0*d’ directive output between 1 and 2147483648 bytes may exceed minimum required size of 4095 [-Wformat-overflow=]
     sprintf(s, "%0*d", S_FM(n->suffix) ? 0 : (tm->tm_hour >= 0) ? 2 : 3,
     ^~~~
formatting.c:2463:16: note: assuming directive output of 11 bytes
     sprintf(s, "%0*d", S_FM(n->suffix) ? 0 : (tm->tm_hour >= 0) ? 2 : 3,
                ^~
formatting.c:2470:17: warning: ‘%0*d’ directive output between 1 and 2147483648 bytes may exceed minimum required size of 4095 [-Wformat-overflow=]
     sprintf(s, "%0*d", S_FM(n->suffix) ? 0 : (tm->tm_min >= 0) ? 2 : 3,
     ^~~~
formatting.c:2470:16: note: assuming directive output of 11 bytes
     sprintf(s, "%0*d", S_FM(n->suffix) ? 0 : (tm->tm_min >= 0) ? 2 : 3,
                ^~
formatting.c:2477:17: warning: ‘%0*d’ directive output between 1 and 2147483648 bytes may exceed minimum required size of 4095 [-Wformat-overflow=]
     sprintf(s, "%0*d", S_FM(n->suffix) ? 0 : (tm->tm_sec >= 0) ? 2 : 3,
     ^~~~
formatting.c:2477:16: note: assuming directive output of 11 bytes
     sprintf(s, "%0*d", S_FM(n->suffix) ? 0 : (tm->tm_sec >= 0) ? 2 : 3,
                ^~
formatting.c:2538:19: warning: ‘%0*d’ directive output between 1 and 2147483648 bytes may exceed minimum req
Re: inserts into partitioned table may cause crash
Fujita-san,

Thanks for the updates, and sorry I couldn't reply sooner.

On 2018/03/06 21:26, Etsuro Fujita wrote:
> One thing I notice while working on this is this in ExecInsert/CopyFrom,
> which I moved to ExecPrepareTupleRouting as-is for the former:
>
>     /*
>      * If we're capturing transition tuples, we might need to convert from
>      * the partition rowtype to parent rowtype.
>      */
>     if (mtstate->mt_transition_capture != NULL)
>     {
>         if (resultRelInfo->ri_TrigDesc &&
>             (resultRelInfo->ri_TrigDesc->trig_insert_before_row ||
>              resultRelInfo->ri_TrigDesc->trig_insert_instead_row))
>         {
>             /*
>              * If there are any BEFORE or INSTEAD triggers on the partition,
>              * we'll have to be ready to convert their result back to
>              * tuplestore format.
>              */
>             mtstate->mt_transition_capture->tcs_original_insert_tuple = NULL;
>             mtstate->mt_transition_capture->tcs_map =
>                 TupConvMapForLeaf(proute, rootRelInfo, leaf_part_index);
>         }
>
> Do we need to consider INSTEAD triggers here?  The partition is either a
> plain table or a foreign table, so I don't think it can have those
> triggers.  Am I missing something?

I think you're right.  We don't need to consider INSTEAD OF triggers here
if we know we're dealing with a partition, which cannot have those.  Just
to make sure, a word from Thomas would help.

On 2018/03/12 12:25, Etsuro Fujita wrote:
> (2018/03/09 20:18), Etsuro Fujita wrote:
>> Here are updated patches for PG10 and HEAD.
>>
>> Other changes:
>> * Add regression tests based on your test cases shown upthread
>
> I added a little bit more regression tests and revised comments.  Please
> find attached an updated patch.

Thanks for adding the tests.

I was concerned that ExecMaterializeSlot will be called twice now -- first
in ExecPrepareTupleRouting and then in ExecInsert -- instead of only once
in ExecInsert as with the existing code, but perhaps it doesn't matter
much because most of the work would be done in the first call anyway.
Sorry for the nitpicking that I should've brought up before, but doesn't
ExecPrepareTupleRouting do all the work needed for routing a tuple, and
hence isn't the name a bit misleading?  Maybe ExecPerformTupleRouting?

Btw, I noticed that the patches place ExecPrepareTupleRouting (both the
declaration and the definition) at different relative locations within
nodeModifyTable.c in the HEAD branch vs. in the 10 branch.  It might be a
good idea to bring them to the same relative location to avoid hassle when
back-patching relevant patches in the future.  I did that in the attached
updated version (v4) of the patch for HEAD, which does not make any other
changes.  Although the patch for PG-10 is unchanged, I still changed its
file name to contain v4.

Regards,
Amit

*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
*************** CopyFrom(CopyState cstate)
*** 2656,2668 ****
  			if (cstate->transition_capture != NULL)
  			{
  				if (resultRelInfo->ri_TrigDesc &&
! 					(resultRelInfo->ri_TrigDesc->trig_insert_before_row ||
! 					 resultRelInfo->ri_TrigDesc->trig_insert_instead_row))
  				{
  					/*
! 					 * If there are any BEFORE or INSTEAD triggers on the
! 					 * partition, we'll have to be ready to convert their
! 					 * result back to tuplestore format.
  					 */
  					cstate->transition_capture->tcs_original_insert_tuple = NULL;
  					cstate->transition_capture->tcs_map =
--- 2656,2667 ----
  			if (cstate->transition_capture != NULL)
  			{
  				if (resultRelInfo->ri_TrigDesc &&
! 					resultRelInfo->ri_TrigDesc->trig_insert_before_row)
  				{
  					/*
! 					 * If there are any BEFORE triggers on the partition,
! 					 * we'll have to be ready to convert their result back to
! 					 * tuplestore format.
  					 */
  					cstate->transition_capture->tcs_original_insert_tuple = NULL;
  					cstate->transition_capture->tcs_map =
*************** CopyFrom(CopyState cstate)
*** 2803,2814 ****
  			 * tuples inserted by an INSERT command.
  			 */
  			processed++;
!
Re: remove pg_class.relhaspkey
On Sat, Mar 10, 2018 at 01:52:56PM +0100, Tomas Vondra wrote:
> I agree with this sentiment - I don't think those flags are particularly
> helpful for client applications, and would vote +1 for removal.

OK, so I can see that we are moving toward a consensus here.

> For the other flags we would probably need to test what impact it would
> have (e.g. table with no indexes, many indexes on other tables, and
> something calling get_relation_info a lot).  But this patch proposes to
> remove only relhaspkey.

Yes, you are right here.  Peter, would you do that within this commit fest
or not?  As we are half-way through the commit fest, we could also cut the
apple in half and just remove relhaspkey for now, as that's a no-brainer.
So I would suggest doing just that and considering this patch as done.
Attached is a rebased patch; there were some conflicts in pg_class.h, by
the way.
--
Michael

From 1601b84f9ef2e8d0467b109c0b1d3df435edb59a Mon Sep 17 00:00:00 2001
From: Michael Paquier
Date: Wed, 14 Mar 2018 14:46:43 +0900
Subject: [PATCH] Remove pg_class.relhaspkey

It's not used for anything internally, and it can't be relied on for
external uses, so it can just be removed.

Patch by Peter Eisentraut.
---
 doc/src/sgml/catalogs.sgml          |  9 -
 src/backend/catalog/heap.c          |  1 -
 src/backend/catalog/index.c         | 32 ++-
 src/backend/commands/vacuum.c       | 10 --
 src/backend/rewrite/rewriteDefine.c |  1 -
 src/include/catalog/pg_class.h      | 38 ++---
 6 files changed, 20 insertions(+), 71 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index a0e6d7062b..30e6741305 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1848,15 +1848,6 @@ SCRAM-SHA-256$:&l
-     <row>
-      <entry><structfield>relhaspkey</structfield></entry>
-      <entry><type>bool</type></entry>
-      <entry></entry>
-      <entry>
-       True if the table has (or once had) a primary key
-      </entry>
-     </row>
-
      <row>
       <entry><structfield>relhasrules</structfield></entry>
       <entry><type>bool</type></entry>

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index cf36ce4add..3d80ff9e5b 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -798,7 +798,6 @@ InsertPgClassTuple(Relation pg_class_desc,
 	values[Anum_pg_class_relnatts - 1] = Int16GetDatum(rd_rel->relnatts);
 	values[Anum_pg_class_relchecks - 1] = Int16GetDatum(rd_rel->relchecks);
 	values[Anum_pg_class_relhasoids - 1] = BoolGetDatum(rd_rel->relhasoids);
-	values[Anum_pg_class_relhaspkey - 1] = BoolGetDatum(rd_rel->relhaspkey);
 	values[Anum_pg_class_relhasrules - 1] = BoolGetDatum(rd_rel->relhasrules);
 	values[Anum_pg_class_relhastriggers - 1] = BoolGetDatum(rd_rel->relhastriggers);
 	values[Anum_pg_class_relrowsecurity - 1] = BoolGetDatum(rd_rel->relrowsecurity);

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 431bc31969..9e2dd0e729 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -125,7 +125,7 @@ static void UpdateIndexRelation(Oid indexoid, Oid heapoid,
 				bool isvalid,
 				bool isready);
 static void index_update_stats(Relation rel,
-				   bool hasindex, bool isprimary,
+				   bool hasindex,
 				   double reltuples);
 static void IndexCheckExclusion(Relation heapRelation,
 					Relation indexRelation,
@@ -1162,7 +1162,6 @@ index_create(Relation heapRelation,
 	 */
 	index_update_stats(heapRelation,
 					   true,
-					   isprimary,
 					   -1.0);

 	/* Make the above update visible */
 	CommandCounterIncrement();
@@ -1364,21 +1363,6 @@ index_constraint_create(Relation heapRelation,
 									   InvalidOid, conOid, indexRelationId, true);
 	}

-	/*
-	 * If needed, mark the table as having a primary key.  We assume it can't
-	 * have been so marked already, so no need to clear the flag in the other
-	 * case.
-	 *
-	 * Note: this might better be done by callers.  We do it here to avoid
-	 * exposing index_update_stats() globally, but that wouldn't be necessary
-	 * if relhaspkey went away.
-	 */
-	if (mark_as_primary)
-		index_update_stats(heapRelation,
-						   true,
-						   true,
-						   -1.0);
-
 	/*
 	 * If needed, mark the index as primary and/or deferred in pg_index.
 	 *
@@ -2041,7 +2025,6 @@ FormIndexDatum(IndexInfo *indexInfo,
  * to ensure we can do all the necessary work in just one update.
  *
  * hasindex: set relhasindex to this value
- * isprimary: if true, set relhaspkey true; else no change
  * reltuples: if >= 0, set reltuples to this value; else no change
  *
  * If reltuples >= 0, relpages and relallvisible are also updated (using
@@ -2058,7 +2041,6 @@ FormIndexDatum(IndexInfo *indexInfo,
 static void
 index_update_stats(Relation rel,
 				   bool hasindex,
-				   bool isprimary,
 				   double reltuples)
 {
 	Oid			relid = RelationGetRelid(rel);
@@ -2088,7 +2070,7 @@ index_update_stats(Relation rel,
 	 * It is safe to use a non-transactional update even though our
 	 * transaction could still fail before committing.  Set
Re: Faster inserts with mostly-monotonically increasing values
On Wed, Mar 14, 2018 at 10:58 AM, Claudio Freire wrote:
>
> I'm thinking there could be contention on some lock somewhere.
>
> Can you attach the benchmark script you're using so I can try to
> reproduce it?

I am using a very ad-hoc script, but here it is.  It assumes the presence
of a branch named "btree_rightmost" with the patched code.  You will need
to make the necessary adjustments, of course.

Thanks,
Pavan

--
Pavan Deolasee
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

start_options="-c shared_buffers=16GB -c max_wal_size=64GB -c min_wal_size=16GB -c checkpoint_timeout=60min -c bgwriter_delay=100 -c bgwriter_lru_maxpages=100 -c bgwriter_lru_multiplier=3 -c bgwriter_flush_after=256 -c checkpoint_completion_target=0.9"
PGDATA=/data/pgsql

echo "INSERT INTO testtab(b) SELECT generate_series(1,10);" > s1.sql

for b in master btree_rightmost; do
    git checkout $b > /dev/null 2>&1
    make -s clean > /dev/null 2>&1
    ./configure --prefix=$HOME/pg-install/$b
    make -s -j8 install > /dev/null 2>&1
    export PATH=$HOME/pg-install/$b/bin:$PATH
    for c in 1 2 4 8; do
        pg_ctl -D $PGDATA -w stop
        rm -rf $PGDATA
        initdb -D $PGDATA
        pg_ctl -D $PGDATA -o "$start_options" -w start -l logfile.$b
        psql -c "CREATE TABLE testtab (a bigserial UNIQUE, b bigint);" postgres
        echo "$b $c" >> results.txt
        num_txns_per_client=$((1024 / $c))
        time pgbench -n -l -c $c -j 1 -t $num_txns_per_client -f s1.sql postgres >> results.txt
    done
done
Re: add queryEnv to ExplainOneQuery_hook
Michael Paquier writes: > On Wed, Mar 14, 2018 at 05:36:26PM +1300, Thomas Munro wrote: >> Hmm. I suppose we could have invented a new extended hook with a >> different name and back-patched it so that PG10 would support both. >> Then binary compatibility with existing compiled extensions wouldn't >> be affected AFAICS, but you could use the new extended hook in (say) >> 10.4 or higher. Then for PG11 (or later) we could remove the old hook >> and just keep the new one. I suppose that option is still technically >> open to us, though I'm not sure of the committers' appetite for messing >> with back branches like that. > The interactions between both hooks would not be difficult to define: if > the original hook is not defined, just do not trigger the second. Still > that's too late for v10, so I would rather let it go. New features are > not backpatched. Yeah. There would be no good way for a v10 extension to know whether the additional hook is available --- it would have to know that at compile time, and it can't --- so I don't see that this is practical. Ideally we'd have noticed the problem before v10 got out ... so my own takeaway here is that this is a reminder to extension authors that they ought to test their stuff during beta period. regards, tom lane
Re: planner bug regarding lateral and subquery?
Hi David and Stephen,

On 2018/03/14 12:59, David G. Johnston wrote:
> On Tuesday, March 13, 2018, Tatsuro Yamada <yamada.tats...@lab.ntt.co.jp> wrote:
>> Hi Hackers,
>>
>> I found a bug, maybe.  If it is possible to get an explain command
>> result from the query below successfully, I think that means the query
>> is executable.
>
> There is a difference between being executable, being compilable, and
> being able to execute to completion, at runtime, on specific data.
> You've proven the former, but as the error indicates, specific data
> causes the complete execution of the query to fail.
>
> I can write "select cola / colb from tbl", and as long as there are no
> zeros in colb the query will complete, but if there is one you get a
> divide-by-zero runtime error.  This is similar.
>
> David J.

Thank you for the explanation.  I understand that it's a runtime error
that cannot be avoided in the parser or planner. :)

Thanks,
Tatsuro Yamada
Re: add queryEnv to ExplainOneQuery_hook
On Wed, Mar 14, 2018 at 05:36:26PM +1300, Thomas Munro wrote:
> Hmm.  I suppose we could have invented a new extended hook with a
> different name and back-patched it so that PG10 would support both.
> Then binary compatibility with existing compiled extensions wouldn't
> be affected AFAICS, but you could use the new extended hook in (say)
> 10.4 or higher.  Then for PG11 (or later) we could remove the old hook
> and just keep the new one.  I suppose that option is still technically
> open to us, though I'm not sure of the committers' appetite for messing
> with back branches like that.

The interactions between both hooks would not be difficult to define: if
the original hook is not defined, just do not trigger the second.  Still,
that's too late for v10, so I would rather let it go.  New features are
not backpatched.
--
Michael
Re: Faster inserts with mostly-monotonically increasing values
On Wed, Mar 14, 2018 at 1:36 AM, Pavan Deolasee wrote: > > > On Sun, Mar 11, 2018 at 9:18 PM, Claudio Freire > wrote: >> >> On Sun, Mar 11, 2018 at 2:27 AM, Pavan Deolasee >> >> > >> > Yes, I will try that next - it seems like a good idea. So the idea would >> > be: >> > check if the block is still the rightmost block and the insertion-key is >> > greater than the first key in the page. If those conditions are >> > satisfied, >> > then we do a regular binary search within the page to find the correct >> > location. This might add an overhead of binary search when keys are >> > strictly >> > ordered and a single client is inserting the data. If that becomes a >> > concern, we might be able to look for that special case too and optimise >> > for >> > it too. >> >> Yeah, pretty much that's the idea. Beware, if the new item doesn't >> fall in the rightmost place, you still need to check for serialization >> conflicts. > > > So I've been toying with this idea since yesterday and I am quite puzzled > with the results. See the attached patch which compares the insertion key > with the last key inserted by this backend, if the cached block is still the > rightmost block in the tree. I initially only compared with the first key in > the page, but I tried this version because of the strange performance > regression which I still have no answers. > > For a small number of clients, the patched version does better. But as the > number of clients go up, the patched version significantly underperforms > master. I roughly counted the number of times the fastpath is taken and I > noticed that almost 98% inserts take the fastpath. I first thought that the > "firstkey" location in the page might be becoming a hot-spot for concurrent > processes and hence changed that to track the per-backend last offset and > compare against that the next time. But that did not help much. 
> +---------+-------------------------------+--------------------------------+
> | clients | Master - Avg load time in sec | Patched - Avg load time in sec |
> +---------+-------------------------------+--------------------------------+
> |       1 | 500.0725203                   | 347.632079                     |
> |       2 | 308.4580771                   | 263.9120163                    |
> |       4 | 359.4851779                   | 514.7187444                    |
> |       8 | 476.4062592                   | 780.2540855                    |
> +---------+-------------------------------+--------------------------------+
>
> The perf data does not show anything interesting either.  I mean there is
> a reduction in CPU time spent in btree-related code in the patched
> version, but the overall execution time to insert the same number of
> records goes up significantly.
>
> Perf (master):
> ==============
>
> +   72.59%     1.81%  postgres  postgres  [.] ExecInsert
> +   47.55%     1.27%  postgres  postgres  [.] ExecInsertIndexTuples
> +   44.24%     0.48%  postgres  postgres  [.] btinsert
> -   42.40%     0.87%  postgres  postgres  [.] _bt_doinsert
>    - 41.52% _bt_doinsert
>       + 21.14% _bt_search
>       + 12.57% _bt_insertonpg
>       + 2.03% _bt_binsrch
>         1.60% _bt_mkscankey
>         1.20% LWLockAcquire
>       + 1.03% _bt_freestack
>         0.67% LWLockRelease
>         0.57% _bt_check_unique
>    + 0.87% _start
> +   26.03%     0.95%  postgres  postgres  [.] ExecScan
> +   21.14%     0.82%  postgres  postgres  [.] _bt_search
> +   20.70%     1.31%  postgres  postgres  [.] ExecInterpExpr
> +   19.05%     1.14%  postgres  postgres  [.] heap_insert
> +   18.84%     1.16%  postgres  postgres  [.] nextval_internal
> +   18.08%     0.84%  postgres  postgres  [.] ReadBufferExtended
> +   17.24%     2.03%  postgres  postgres  [.] ReadBuffer_common
> +   12.57%     0.59%  postgres  postgres  [.] _bt_insertonpg
> +   11.12%     1.63%  postgres  postgres  [.] XLogInsert
> +    9.90%     0.10%  postgres  postgres  [.] _bt_relandgetbuf
> +    8.97%     1.16%  postgres  postgres  [.] LWLockAcquire
> +    8.42%     2.03%  postgres  postgres  [.] XLogInsertRecord
> +    7.26%     1.01%  postgres  postgres  [.] _bt_binsrch
> +    7.07%     1.20%  postgres  postgres  [.] RelationGetBufferForTuple
> +    6.27%     4.92%  postgres  postgres  [.] _bt_compare
> +    5.97%     0.63%  postgres  postgres  [.] read_seq_tuple.isra.3
> +    5.70%     4.89%  postgres  postgres  [.] hash_search_with_hash_value
> +    5.44%     5.44%  postgres  postgres  [.] LWLockAttemptLock
>
> Perf (Patched):
>
Re: [bug fix] Cascaded standby cannot start after a clean shutdown
On Tue, Feb 27, 2018 at 05:15:29AM +, Tsunakawa, Takayuki wrote:
> That was my first thought, and I gave it up.  As you say,
> XLogReadRecord() could allocate up to 1 GB of memory for a garbage.
> That allocation can fail due to memory shortage, which prevents the
> recovery from proceeding.

Even with that, the resulting patch is sort of ugly...  So it seems to me
that the conclusion to this thread is that there is no clear winner, and
that the problem is so unlikely to happen that it is not worth the
performance impact of zeroing all the WAL pages fetched.

Still, the attached has the advantage of not causing the startup process
to fail suddenly because of the allocation failure, but to fail afterwards
when validating the record's contents.  It has the disadvantage of
temporarily allocating up to 1 GB of memory for some of the garbage, but
that allocation is short-lived.  So that would switch the failure from a
FATAL allocation failure taking down Postgres to something which will
cause fetching of new WAL records to be tried again.  All in all, that
would be a win for the case reported on this thread.

Thoughts from anybody?
--
Michael

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 3a86f3497e..5bf50a2656 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -25,6 +25,10 @@
 #include "common/pg_lzcompress.h"
 #include "replication/origin.h"

+#ifndef FRONTEND
+#include "utils/memutils.h"
+#endif
+
 static bool allocate_recordbuf(XLogReaderState *state, uint32 reclength);

 static bool ValidXLogPageHeader(XLogReaderState *state, XLogRecPtr recptr,
@@ -309,7 +313,14 @@ XLogReadRecord(XLogReaderState *state, XLogRecPtr RecPtr, char **errormsg)
 	 * rest of the header after reading it from the next page.  The xl_tot_len
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * record" code path below; otherwise we might fail to apply
-	 * ValidXLogRecordHeader at all.
+	 * ValidXLogRecordHeader at all.  Note that in much unlucky circumstances,
+	 * the random data read from a recycled segment could allow the set of
+	 * sanity checks to pass.  If the garbage data read in xl_tot_len is
+	 * higher than MaxAllocSize on the backend, then the startup process
+	 * would retry to fetch fresh WAL data.  If this is lower than
+	 * MaxAllocSize, then a more-than-necessary memory allocation will
+	 * happen for a short amount of time, until the WAL reader fails at
+	 * validating the next record's data.
 	 */
 	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
 	{
@@ -321,7 +332,11 @@ XLogReadRecord(XLogReaderState *state, XLogRecPtr RecPtr, char **errormsg)
 	else
 	{
 		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
+		if (total_len < SizeOfXLogRecord
+#ifndef FRONTEND
+			|| !AllocSizeIsValid(total_len)
+#endif
+			)
 		{
 			report_invalid_record(state,
 								  "invalid record length at %X/%X: wanted %u, got %u",
Re: planner bug regarding lateral and subquery?
Hi Stephen,

On 2018/03/14 12:36, Stephen Frost wrote:
> Greetings,
>
> * Tatsuro Yamada (yamada.tats...@lab.ntt.co.jp) wrote:
>> I found a bug, maybe.
>
> I don't think so...
>
>> * Result of Select: failed
>>
>> # select subq_1.c0 from test as ref_0,
>>     lateral (select subq_0.c0 as c0
>>              from (select ref_0.c2 as c0,
>>                           (select c1 from test) as c1
>>                    from test as ref_1
>>                    where (select c3 from test) is NULL) as subq_0
>>              right join test as ref_2 on (subq_0.c1 = ref_2.c1)) as subq_1;
>> ERROR:  more than one row returned by a subquery used as an expression
>
> You don't need LATERAL or anything complicated to reach that error,
> simply do:
>
> =*> select * from test where (select c1 from test) is null;
> ERROR:  more than one row returned by a subquery used as an expression
>
> The problem there is that the WHERE clause is trying to evaluate an
> expression, which is "(select c1 from test) is null", and you aren't
> allowed to have multiple rows returned from that subquery (otherwise,
> how would we know which row to compare in the expression..?).
>
> If you're actually intending to refer to the 'c3' column from test
> through the lateral join, you would just refer to it as 'ref_0.c3', as
> you do in another part of that query.

Thanks for your reply.

The query is not useful for me; it's just a test query for the planner,
generated by sqlsmith. :)

My question is: was it possible to handle the error only in the executor
phase?  I expected that it would be checked in the parsing or planning
phase.

Thanks,
Tatsuro Yamada
Re: Ambigous Plan - Larger Table on Hash Side
On Tue, Mar 13, 2018 at 4:32 PM, Narendra Pradeep U U wrote:
> Hi,
>       Thanks everyone for your suggestions.  I would like to add the
> explain analyze of both plans so that we can have a broader picture.
>
> I have a work_mem of 1000 MB.
>
> The plan which we get regularly, with the table analyzed:
>
> tpch=# explain analyze select b from tab2 left join tab1 on a = b;
>                                                        QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------
>  Hash Left Join  (cost=945515.68..1071064.34 rows=78264 width=4) (actual time=9439.410..20445.620 rows=78264 loops=1)
>    Hash Cond: (tab2.b = tab1.a)
>    ->  Seq Scan on tab2  (cost=0.00..1129.64 rows=78264 width=4) (actual time=0.006..5.116 rows=78264 loops=1)
>    ->  Hash  (cost=442374.30..442374.30 rows=30667630 width=4) (actual time=9133.593..9133.593 rows=30667722 loops=1)
>          Buckets: 33554432  Batches: 2  Memory Usage: 801126kB
>          ->  Seq Scan on tab1  (cost=0.00..442374.30 rows=30667630 width=4) (actual time=0.030..3584.652 rows=30667722 loops=1)
>  Planning time: 0.055 ms
>  Execution time: 20472.603 ms
> (8 rows)
>
> I reproduced the other plan by not analysing the smaller table.
>
> tpch=# explain analyze select b from tab2 left join tab1 on a = b;
>                                                        QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------
>  Hash Right Join  (cost=2102.88..905274.97 rows=78039 width=4) (actual time=15.331..7590.406 rows=78264 loops=1)
>    Hash Cond: (tab1.a = tab2.b)
>    ->  Seq Scan on tab1  (cost=0.00..442375.48 rows=30667748 width=4) (actual time=0.046..2697.480 rows=30667722 loops=1)
>    ->  Hash  (cost=1127.39..1127.39 rows=78039 width=4) (actual time=15.133..15.133 rows=78264 loops=1)
>          Buckets: 131072  Batches: 1  Memory Usage: 3776kB
>          ->  Seq Scan on tab2  (cost=0.00..1127.39 rows=78039 width=4) (actual time=0.009..5.516 rows=78264 loops=1)
>  Planning time: 0.053 ms
>  Execution time: 7592.688 ms
> (8 rows)

I am surprised to see the estimates very close to the actual values even
without analysing the small table.

> The actual plan seems to be slower.
> The smaller table (tab2) has each row duplicated exactly 8 times, and
> all the rows in the larger table (tab1) are distinct.  What may be the
> exact reason, and can we fix this?

After analysing the small table, the first plan is chosen as the cheapest.
This means that the plan with the smaller table being hashed has a higher
cost than the plan with the larger table being hashed.  We need to examine
that costing to see what went wrong.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Re: [HACKERS] GUC for cleanup indexes threshold.
On Sat, Mar 10, 2018 at 3:40 AM, Alexander Korotkov wrote:
> On Fri, Mar 9, 2018 at 3:12 PM, Masahiko Sawada wrote:
>> On Fri, Mar 9, 2018 at 8:43 AM, Alexander Korotkov wrote:
>>> 2) These parameters are reset during btbulkdelete() and set during
>>> btvacuumcleanup().
>>
>> Can't we set these parameters even during btbulkdelete()?  By keeping
>> them up to date, we will be able to avoid unnecessary cleanup vacuums
>> even after an index bulk-delete.
>
> We certainly can update cleanup-related parameters during btbulkdelete().
> However, in this case we would update the B-tree meta-page during each
> VACUUM cycle.  That may cause some overhead for non-append-only
> workloads.  I don't think this overhead would be noticeable, because in
> non-append-only scenarios VACUUM typically writes much more information.
> But I would like this patch, which is oriented to append-only workloads,
> to be as harmless as possible for other workloads.

Which overhead are you referring to here?  I guess the overhead is only
calculating the oldest btpo.xact, and I think that would be harmless.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: add queryEnv to ExplainOneQuery_hook
On Wed, Mar 14, 2018 at 2:57 PM, Jim Finnerty wrote: > Passing NULL in place of queryEnv in PG10 causes a failure in installcheck > tests portals and plpgsql. > > For PG10, if you want a both an ExplainOneQuery hook and clean run of > installcheck, is there a better workaround than to (a) pass NULL in place of > queryEnv, and (b) to comment out the portals and plpgsql tests? Hi Jim, I can't think of a good way right now. It's unfortunate that we couldn't back-patch 4d41b2e0 because 10 was out the door; but perhaps you can? Hmm. I suppose we could have invented a new extended hook with a different name and back-patched it so that PG10 would support both. Then binary compatibility with existing compiled extensions wouldn't be affected AFAICS, but you could use the new extended hook in (say) 10.4 or higher. Then for PG11 (or later) we could remove the old hook and just keep the new one. I suppose that option is still technically open to us, though I'm not sure of the committers' appetite for messing with back branches like that. -- Thomas Munro http://www.enterprisedb.com
Re: Faster inserts with mostly-monotonically increasing values
On Sun, Mar 11, 2018 at 9:18 PM, Claudio Freire wrote:
> On Sun, Mar 11, 2018 at 2:27 AM, Pavan Deolasee wrote:
>>
>> Yes, I will try that next - it seems like a good idea.  So the idea
>> would be: check if the block is still the rightmost block and the
>> insertion-key is greater than the first key in the page.  If those
>> conditions are satisfied, then we do a regular binary search within the
>> page to find the correct location.  This might add an overhead of
>> binary search when keys are strictly ordered and a single client is
>> inserting the data.  If that becomes a concern, we might be able to
>> look for that special case too and optimise for it too.
>
> Yeah, pretty much that's the idea.  Beware, if the new item doesn't
> fall in the rightmost place, you still need to check for serialization
> conflicts.

So I've been toying with this idea since yesterday and I am quite puzzled
with the results.  See the attached patch, which compares the insertion
key with the last key inserted by this backend, if the cached block is
still the rightmost block in the tree.  I initially only compared with the
first key in the page, but I tried this version because of the strange
performance regression for which I still have no answers.

For a small number of clients, the patched version does better.  But as
the number of clients goes up, the patched version significantly
underperforms master.  I roughly counted the number of times the fastpath
is taken and noticed that almost 98% of inserts take the fastpath.  I
first thought that the "firstkey" location in the page might be becoming a
hot-spot for concurrent processes and hence changed that to track the
per-backend last offset and compare against that the next time.  But that
did not help much.
+---------+-------------------------------+--------------------------------+
| clients | Master - Avg load time in sec | Patched - Avg load time in sec |
+---------+-------------------------------+--------------------------------+
| 1       | 500.0725203                   | 347.632079                     |
+---------+-------------------------------+--------------------------------+
| 2       | 308.4580771                   | 263.9120163                    |
+---------+-------------------------------+--------------------------------+
| 4       | 359.4851779                   | 514.7187444                    |
+---------+-------------------------------+--------------------------------+
| 8       | 476.4062592                   | 780.2540855                    |
+---------+-------------------------------+--------------------------------+

The perf data does not show anything interesting either. I mean there is a
reduction in CPU time spent in btree related code in the patched version, but
the overall execution time to insert the same number of records goes up
significantly.

Perf (master):
===
+   72.59%  1.81%  postgres  postgres  [.] ExecInsert
+   47.55%  1.27%  postgres  postgres  [.] ExecInsertIndexTuples
+   44.24%  0.48%  postgres  postgres  [.] btinsert
-   42.40%  0.87%  postgres  postgres  [.] _bt_doinsert
   - 41.52% _bt_doinsert
      + 21.14% _bt_search
      + 12.57% _bt_insertonpg
      +  2.03% _bt_binsrch
         1.60% _bt_mkscankey
         1.20% LWLockAcquire
      +  1.03% _bt_freestack
         0.67% LWLockRelease
         0.57% _bt_check_unique
   +  0.87% _start
+   26.03%  0.95%  postgres  postgres  [.] ExecScan
+   21.14%  0.82%  postgres  postgres  [.] _bt_search
+   20.70%  1.31%  postgres  postgres  [.] ExecInterpExpr
+   19.05%  1.14%  postgres  postgres  [.] heap_insert
+   18.84%  1.16%  postgres  postgres  [.] nextval_internal
+   18.08%  0.84%  postgres  postgres  [.] ReadBufferExtended
+   17.24%  2.03%  postgres  postgres  [.] ReadBuffer_common
+   12.57%  0.59%  postgres  postgres  [.] _bt_insertonpg
+   11.12%  1.63%  postgres  postgres  [.] XLogInsert
+    9.90%  0.10%  postgres  postgres  [.] _bt_relandgetbuf
+    8.97%  1.16%  postgres  postgres  [.] LWLockAcquire
+    8.42%  2.03%  postgres  postgres  [.] XLogInsertRecord
+    7.26%  1.01%  postgres  postgres  [.] _bt_binsrch
+    7.07%  1.20%  postgres  postgres  [.] RelationGetBufferForTuple
+    6.27%  4.92%  postgres  postgres  [.] _bt_compare
+    5.97%  0.63%  postgres  postgres  [.] read_seq_tuple.isra.3
+    5.70%  4.89%  postgres  postgres  [.] hash_search_with_hash_value
+    5.44%  5.44%  postgres  postgres  [.] LWLockAttemptLock

Perf (Patched):
===
+   69.33%  2.36%  postgres  postgres  [.] ExecInsert
+   35.21%  0.64%  postgres  postgres  [.] ExecInsertIndexTuples
-   32.14%  0.45%  postgres  postgres  [.] btinser
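The fastpath condition discussed in this thread can be sketched in a few lines. This is a hedged Python sketch of the decision only (function and parameter names are hypothetical); the real logic lives in C in _bt_doinsert, and as Claudio notes a non-rightmost landing still needs a serialization-conflict check.

```python
def choose_insert_path(cached_block, rightmost_block, new_key, last_key_on_page):
    # Take the fastpath only when the backend's cached block is still the
    # rightmost leaf AND the new key sorts after the keys already on that
    # page; anything else falls back to a full descent from the root.
    if (cached_block is not None
            and cached_block == rightmost_block
            and new_key > last_key_on_page):
        return "fastpath"
    return "descend"
```

With monotonically increasing keys from a single client, almost every insert satisfies the condition, which matches the ~98% fastpath rate observed above.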
Re: planner bug regarding lateral and subquery?
On Tuesday, March 13, 2018, Tatsuro Yamada wrote:
> Hi Hackers,
>
> I found a bug, maybe.
> If it is able to get an explain command result from below query
> successfully, I think that it means the query is executable.

There is a difference between a query being executable (compilable) and being
able to execute to completion at runtime on specific data. You've proven the
former, but, as the error indicates, specific data causes the complete
execution of the query to fail. I can write "select cola / colb from tbl" and
as long as there are no zeros in colb the query will complete, but if there is
one you get a divide-by-zero runtime error. This is similar.

David J.
Re: planner bug regarding lateral and subquery?
Greetings, * Tatsuro Yamada (yamada.tats...@lab.ntt.co.jp) wrote: > I found a bug, maybe. I don't think so... > * Result of Select: failed > > # select > subq_1.c0 > from > test as ref_0, > lateral (select subq_0.c0 as c0 >from > (select ref_0.c2 as c0, > (select c1 from test) as c1 from test as ref_1 >where (select c3 from test) is NULL) as subq_0 >right join test as ref_2 >on (subq_0.c1 = ref_2.c1 )) as subq_1; > > ERROR: more than one row returned by a subquery used as an expression You don't need LATERAL or anything complicated to reach that error, simply do: =*> select * from test where (select c1 from test) is null; ERROR: more than one row returned by a subquery used as an expression The problem there is that the WHERE clause is trying to evaluate an expression, which is "(select c1 from test) is null" and you aren't allowed to have multiple rows returned from that subquery (otherwise, how would we know which row to compare in the expression..?). If you're actually intending to refer to the 'c3' column from the test through the lateral join, you would just refer to it as 'ref_0.c3', as you do in another part of that query. Thanks! Stephen signature.asc Description: PGP signature
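The rule Stephen describes — a subquery used as an expression may return at most one row, and zero rows yield NULL — can be mimicked in a short sketch. This is illustrative Python, not PostgreSQL code; the real check is the ereport in nodeSubplan.c quoted elsewhere in the thread.

```python
def scalar_subquery(rows):
    # Mimics the executor's check for a subquery used as an expression:
    # more than one row is a cardinality violation; zero rows yield NULL
    # (None here); exactly one row yields its single column's value.
    if len(rows) > 1:
        raise ValueError(
            "more than one row returned by a subquery used as an expression")
    return rows[0][0] if rows else None
```

So `(select c1 from test)` succeeds while `test` has one row, and starts failing as soon as a second row is inserted, independently of how the planner arranged the surrounding joins.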
Re:Re: Re: [GSOC 18] Performance Farm Project——Initialization Project
Hi Dave, I am willing to use React to re-create the front-end application. Since I plan to use separate front-end and back-end development methods, this means that there will be no html files in Django applications. Front-end and back-end applications will interact via restful api. I will use react to rewrite some existing html files if needed. Before initializing the react application, I want to learn more about your tendency toward front-end technology. - About React: Which version of React (15.x or 16) and react-router (2.x or 4) do you tend to use? - About Package Manager: Do you tend to use yarn or npm? - About the UI library: I guess you might prefer Bootstrap or Material-UI. At the same time I hope you can help me understand the functional requirements of this project more clearly. Here are some of my thoughts on PerfFarm: - I see this comment in the client folder (In line 15 of the "pgperffarm \ client \ benchmarks \ pgbench.py" file): ''' # TODO allow running custom scripts, not just the default ''' Will PerfFarm have many test items so that the test item search function or even the test item sort function is expected to be provided? - What value will be used to determine if a machine's performance is improving or declining? - After the user logs in to the site, I think it may be necessary to provide a page for the user to browse or manage his own machine. Is there any function that requires users to log in before they can use it? - When I registered a machine on BuildFarm, I did not receive the confirmation email immediately but several days later. In PerfFarm, when a user registers a machine, will the administrator review it before sending a confirmation email? - I see BuildFarm assigning an animal name to each registered machine. Will PerfFarm also have this interesting feature? My postgresql.org community account is: - Username: maleicacid - Email: cs_maleica...@163.com I hope to get the commit permission of the pgperffarm.git repository. 
I am willing to continue coding and complete the project on the existing code.Looking forward to your reply. Best Regards, Hongyuan Ma (cs_maleica...@163.com) At 2018-03-13 03:03:08, "Dave Page" wrote: Hi On Mon, Mar 12, 2018 at 9:57 AM, Hongyuan Ma wrote: Hi Dave, Thank you for your reminder. This is indeed my negligence. In fact, I have browsed the code in the pgperffarm.git repository, including the client folder and the web folder. However, I found that the web folder has some unfinished html files and does not contain model class files. And the project used Django 1.8 without importing the Django REST Framework (When I talked to Mark about the PerfFarm project, he told me he insisted on using Django 1.11 and considered using the REST framework). So I mistakenly thought that the code in the web folder had been shelved. Nope, not at all. It just wasn't updated to meet the latest PG infrastructure requirements yet (basically just the update to Django 11). The rest should be fine as-is. In the newly initialized Django application, I upgraded its version and assigned the directory to make the project structure clearer. I want to use a separate front-end development approach (The front-end applications will use vue.js for programming.). I hope you can allow me to use it instead of the old one. I am willing to integrate the auth module into this new application as soon as possible. I would much prefer to see jQuery + React, purely because there are likely more PostgreSQL developers (particularly from the pgAdmin team) that know those technologies. It is important to consider long term maintenance as well as ease of initial development with any project. Thanks. 
Regards, Hongyuan Ma (cs_maleica...@163.com) 在 2018-03-12 02:26:43,"Dave Page" 写道: Hi Maybe I’m missing something (I’ve been offline a lot recently for unavoidable reasons), but the perf farm project already has a Django backend initialised and configured to work with community auth, on community infrastructure. https://git.postgresql.org/gitweb/?p=pgperffarm.git;a=summary On Sunday, March 11, 2018, Hongyuan Ma wrote: Hello, mark I initialized a Django project and imported the Django REST Framework. Its github address is: https://github.com/PGPerfFarm/server-code I created some model classes and also wrote scripts in the dbtools folder to import simulation data for development. I am hesitant to use admin or xadmin as the administrator's backend for the site (I prefer to use xadmin). I also initialized the website's front-end application. Here is its address: https://github.com/PGPerfFarm/front-end-code.git I wrote the header component as shown: I hope this effect can enhance the future user experience:). This application uses vue.js and element-ui, but if you insist on using other js libraries or not using js, please let me know. I will empty this project and then rewrite it as you wish. My next step is to d
Re: [HACKERS] path toward faster partition pruning
On 14 March 2018 at 06:54, Jesper Pedersen wrote: > * "Add more expression types..." -- Are you planning to add more of > these ? Otherwise change the comment Run-time pruning will deal with Param types here. The comment there might be just to remind us all that the function must remain generic enough so we can support more node types later. I don't particularly need the comment for that patch, and I'm not quite sure if I should be removing it in that patch. My imagination is not stretching far enough today to think what we could use beyond Const and Param. I don't feel strongly about the comment either way. This is just to let you know that there are more up and coming things to do in that spot. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
planner bug regarding lateral and subquery?
Hi Hackers,

I found a bug, maybe.
If it is able to get an explain command result from below query successfully,
I think that it means the query is executable. However, I got an error by
executing the query without an explain command. I guess that the planner
makes a wrong plan. I share a reproduction procedure and query results on
3b7ab4380440d7b14ee390fabf39f6d87d7491e2.

* Reproduction

  create table test (c1 integer, c2 integer, c3 text);
  insert into test values (1, 3, 'a');
  insert into test values (2, 4, 'b');

  explain (costs off)
  select subq_1.c0
  from
    test as ref_0,
    lateral (select subq_0.c0 as c0
             from
               (select ref_0.c2 as c0,
                       (select c1 from test) as c1
                from test as ref_1
                where (select c3 from test) is NULL) as subq_0
               right join test as ref_2
                 on (subq_0.c1 = ref_2.c1)) as subq_1;

  select subq_1.c0
  from
    test as ref_0,
    lateral (select subq_0.c0 as c0
             from
               (select ref_0.c2 as c0,
                       (select c1 from test) as c1
                from test as ref_1
                where (select c3 from test) is NULL) as subq_0
               right join test as ref_2
                 on (subq_0.c1 = ref_2.c1)) as subq_1;

* Result of Explain: succeeded

                 QUERY PLAN
  ---------------------------------------------
   Nested Loop
     InitPlan 1 (returns $0)
       ->  Seq Scan on test
     InitPlan 2 (returns $1)
       ->  Seq Scan on test test_1
     ->  Seq Scan on test ref_0
     ->  Nested Loop Left Join
           Join Filter: ($1 = ref_2.c1)
           ->  Seq Scan on test ref_2
           ->  Materialize
                 ->  Result
                       One-Time Filter: ($0 IS NULL)
                       ->  Seq Scan on test ref_1

* Result of Select: failed

  ERROR:  more than one row returned by a subquery used as an expression

* The error message came from here

  ./src/backend/executor/nodeSubplan.c

    if (found &&
        (subLinkType == EXPR_SUBLINK ||
         subLinkType == MULTIEXPR_SUBLINK ||
         subLinkType == ROWCOMPARE_SUBLINK))
        ereport(ERROR,
                (errcode(ERRCODE_CARDINALITY_VIOLATION),
                 errmsg("more than one row returned by a subquery used as an expression")));

Thanks,
Tatsuro Yamada
Re: Fixes for missing schema qualifications
On Tue, Mar 13, 2018 at 07:30:23PM -0700, David G. Johnston wrote:
> On Tue, Mar 13, 2018 at 6:50 PM, Michael Paquier wrote:
>> To simplify user's life, we could also recommend just to users to issue
>> a ALTER FUNCTION SET search_path to fix the problem for all functions,
>> that's easier to digest.
>
> I'm unclear as to what scope you are suggesting the above advice (and
> option #1) apply to. All pg_catalog/information_schema functions or all
> functions including those created by users?

I am suggesting that in order to keep the release notes of the next minor
versions simple, while still patching information_schema.sql internally with
the changes based on operator(pg_catalog.foo) for all functions, so that new
deployments get the right call.
--
Michael

signature.asc Description: PGP signature
Re: Changing the autovacuum launcher scheduling; oldest table first algorithm
On Tue, Mar 6, 2018 at 11:27 PM, Alvaro Herrera wrote: > Hello > > I haven't read your respective patches yet, but both these threads > brought to memory a patch I proposed a few years ago that I never > completed: > > https://www.postgresql.org/message-id/flat/20130124215715.GE4528%40alvh.no-ip.org Thank you for sharing the thread. > > In that thread I posted a patch to implement a prioritisation scheme for > autovacuum, based on an equation which was still under discussion when > I abandoned it. Chris Browne proposed a crazy equation to mix in both > XID age and fraction of dead tuples; probably that idea is worth > studying further. I tried to implement that in my patch but I probably > didn't do it correctly (because, as I recall, it failed to work as > expected). Nowadays I think we would also consider the multixact freeze > age, too. > > Maybe that's worth giving a quick look in case some of the ideas there > are useful for the patches now being proposed. Yeah, that's definitely useful for the patches. I'll look at this and will discuss that here. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: Fixes for missing schema qualifications
On Tue, Mar 13, 2018 at 6:50 PM, Michael Paquier wrote: > On Sat, Mar 10, 2018 at 08:36:34AM +, Noah Misch wrote: > > This qualifies some functions, but it leaves plenty of unqualified > operators. > > Yeah, I know that, and i don't have a perfect reply to offer to you. > There are a couple of methods that we could use to tackle that: > 1) For functions, enforce search_path with a SET search_path = > 'pg_catalog' command. However this has a performance impact. > 2) Enforce operators qualification with operator(pg_catalog.foo). This > has no impact on performance, but repeating that all over the place is > rather ugly, particularly for psql's describe.c and tab-completion.c. > 3) Tweak dynamically search_path before running a query: > - Save the existing search_path value by issuing SHOW search_path. > - Use ALWAYS_SECURE_SEARCH_PATH_SQL to enforce the path. > - Set back search_path based on the previous value. > This logic can happen in a dedicated wrapper, but this impacts > performance as it requires extra round trips to the server. > > For information_schema.sql, we are talking about tweaking 12 functions. > So I think that we could live with 2). That seems ideal. > To simplify user's life, we > could also recommend just to users to issue a ALTER FUNCTION SET > search_path to fix the problem for all functions, that's easier to > digest. > I'm unclear as to what scope you are suggesting the above advice (and option #1) apply to. All pg_catalog/information_schema functions or all functions including those created by users? > > For the rest, which basically concerns psql, I have been thinking that > actually using 2) would be the most painful approach, still something > which does not impact the user experience, while 3) is easier to > back-patch by minimizing the code footprint and avoids also any kind of > future problems. 
> In furtherance of option 2 is there some way to execute a query (at least in a development build) with no search_path in place - thus requiring every object reference to be schema-qualified - and in doing so all such unadorned operators/functions/relations would fail to be found quickly at parse time? Given the number of user-hours spent running describe commands and tab-completion the extra round-time solution is definitely less appealing in terms of long term time expended. David J.
Re: User defined data types in Logical Replication
On Wed, Mar 7, 2018 at 9:19 PM, Masahiko Sawada wrote:
> On Wed, Mar 7, 2018 at 2:52 AM, Alvaro Herrera wrote:
>> Masahiko Sawada wrote:
>>> On Tue, Mar 6, 2018 at 8:35 AM, Alvaro Herrera wrote:
>>
>>> > Therefore, I'm inclined to make this function raise a warning, then
>>> > return a substitute value (something like "unrecognized type XYZ").
>>> > [...]
>>>
>>> I agree with you about not hiding the actual reason for the error but
>>> if we raise a warning at logicalrep_typmap_gettypname don't we call
>>> slot_store_error_callback recursively?
>>
>> Hmm, now that you mention it, I don't really know. I think it's
>> supposed not to happen, since calling ereport() again opens a new
>> recursion level, but then maybe errcontext doesn't depend on the
>> recursion level ... I haven't checked. This is why the TAP test would
>> be handy :-)
>
> The calling ereport opens a new recursion level. The calling ereport
> with error doesn't return to caller but the calling with warning does.
> So the recursively calling ereport(WARNING) ends up with exceeding the
> errordata stack size. So it seems to me that we can set errcontext in
> logicalrep_typmap_gettypname() instead of raising warning or error.
>
>>
>>> Agreed. Will add a TAP test.
>>
>> Great. This patch waits on that, then.
>>
>
> Okay. I think the most simple and convenient way to reproduce this
> issue is to call an elog(LOG) in input function of a user-defined data
> type. So I'm thinking to create the test in src/test/module directory.

After more thought, I think that since the errors in
logicalrep_typmap_gettypname are not relevant to the actual error (i.e. the
type conversion error), it would not be a good idea to show the error message
of "could not find data type entry" in errcontext. It might be more
appropriate if we return a substitute value ("unrecognized type" or
"unrecognized built-in type") without raising either an error or a warning.
Thoughts?
Regarding to regression test, I added a new test module test_subscription that creates a new user-defined data type. In a subscription regression test, using test_subscription we make subscriber call slot_store_error_callback and check if the subscriber can correctly look up both remote and local data type strings. One downside of this regression test is that in a failure case, the duration of the test will be long (up to 180sec) because it has to wait for the polling timeout. Attached an updated patch with a regression test. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center fix_slot_store_error_callback_v13.patch Description: Binary data
Re: PQHost() undefined behavior if connecting string contains both host and hostaddr types
On Fri, Mar 09, 2018 at 04:42:30PM -0500, Peter Eisentraut wrote: > It seems, however, that PGhost() has always been broken for hostaddr > use. In 9.6 (before the multiple-hosts stuff was introduced), when > connecting to "hostaddr=127.0.0.1", PGhost() returns "/tmp". Urgh. > > I think we should decide what PGhost() is supposed to mean when hostaddr > is in use, and then make a fix for that consistently across all versions. Hm. The only inconsistency I can see here is when "host" is not defined but "hostaddr" is, so why not make PQhost return the value of "hostaddr" in this case? -- Michael signature.asc Description: PGP signature
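Michael's suggestion amounts to a simple fallback order, which can be modelled as below. This is a hedged Python sketch only — the function name is hypothetical and the real logic is C inside libpq's PQhost; "/tmp" stands in for the compile-time default socket directory that 9.6 wrongly returned for hostaddr-only connections.

```python
DEFAULT_PGSOCKET_DIR = "/tmp"  # stand-in for the compiled-in default

def effective_pqhost(host, hostaddr):
    # Proposed behaviour: prefer "host" when set, otherwise fall back to
    # "hostaddr", and only report the default socket directory when
    # neither connection parameter was given.
    if host:
        return host
    if hostaddr:
        return hostaddr
    return DEFAULT_PGSOCKET_DIR
```

Under this rule, connecting with only "hostaddr=127.0.0.1" would report "127.0.0.1" rather than the socket directory.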
Re: ALTER TABLE ADD COLUMN fast default
> On Mar 14, 2018, at 10:58 AM, David Rowley > wrote: > > On 14 March 2018 at 11:36, Andrew Dunstan > wrote: >> Here are the benchmark results from the v15 patch. Fairly similar to >> previous results. I'm going to run some profiling again to see if I >> can identify any glaring hotspots. I do suspect that the "physical >> tlist" optimization sometimes turns out not to be one. It seems >> perverse to be able to improve a query's performance by dropping a >> column. > > Can you explain what "fdnmiss" is that appears in the results? > > It’s the patched code run against a materialized version of the table, i.e. one with no missing attributes. Cheers Andrew
Re: Fixes for missing schema qualifications
On Wed, Mar 14, 2018 at 10:50:38AM +0900, Michael Paquier wrote: > For the rest, which basically concerns psql, I have been thinking that > actually using 2) would be the most painful approach, still something > which does not impact the user experience, while 3) is easier to > back-patch by minimizing the code footprint and avoids also any kind of > future problems. Actually, there is also the case of pgbench, where there are two items to be careful about: 1) Tweak correctly parameters in built-in benchmark queries. 2) Make pgbench gain an extra option allowing one to run the benchmark on a wanted schema. 1) is essential to do, 2) perhaps less as custom benchmarks can always be designed so the query strings are secured. Most users don't run pgbench on multi-user, untrusted systems as well (right?), giving less value to 2). -- Michael signature.asc Description: PGP signature
Re: neqjoinsel versus "refresh materialized view concurrently"
On Tue, Mar 13, 2018 at 4:57 PM, Thomas Munro wrote: > On Wed, Mar 14, 2018 at 12:29 PM, Tom Lane wrote: > > Thomas Munro writes: > >> There is a fundamental and complicated estimation problem lurking here > >> of course and I'm not sure what to think about that yet. Maybe there > >> is a very simple fix for this particular problem: > > > > Ah, I see you thought of the same hack I did. > > > > I think this may actually be a good fix, and here's the reason: this plan > > is in fact being driven entirely off planner default estimates, because > > we don't have any estimation code that knows what to do with > > "wholerowvar *= wholerowvar". I'm suspicious that we could drop the > > preceding ANALYZE as being a waste of cycles, except maybe it's finding > > out the number of rows for us. In any case, LIMIT 1 is only a good idea > > to the extent that the planner knows what it's doing, and this is an > > example where it demonstrably doesn't and won't any time soon. > > Hmm. I wonder if the ANALYZE might have been needed to avoid the > nested loop plan at some point in history. > > Here's a patch to remove LIMIT 1, which fixes the plan for Jeff's test > scenario and some smaller and larger examples I tried. The query is > already executed with SPI_execute(..., 1) so it'll give up after one > row anyway. The regression test includes a case that causes a row to > be produced here and that's passing ('ERROR: new data for > materialized view "mvtest_mv" contains duplicate rows without any null > columns'). > Is there any good way to make the regression tests fail if the plan reverts to the bad one? The only thing I can think of would be to make the table bigger so the regression tests becomes "noticeably slower", but that is pretty vague and not user friendly to formally pass and just hope it is slow enough for someone to investigate. Cheers, Jeff
Re: Fixes for missing schema qualifications
On Tue, Mar 13, 2018 at 05:19:50PM -0700, David G. Johnston wrote:
> On Tue, Mar 13, 2018 at 5:11 PM, Tatsuo Ishii wrote:
>> select pg_catalog.count(*) from pg_catalog.pg_namespace where
>> pg_catalog.nameeq(nspname, '%s');
>>
>
> I'd rather write that:
>
> select [...] where nspname operator(pg_catalog.=) '%s'
>
> Introducing undocumented implementation functions to these queries is
> undesirable; and besides, indexing and equivalence relies on operators and
> not the underlying functions so there would be some risk of performance
> issues if the functions were used directly.

Directly calling the functions that operators use under the hood is a
potential solution, though I would discard it as it becomes harder for the
reader to understand that an operator is intended. Things become even more
confusing when dealing with input parameters of different types for simple
maths like addition, multiplication or division using several kinds of
integers, leading to more complications in maintaining this code in the
future. And the operator is careful about doing proper casting itself.
--
Michael

signature.asc Description: PGP signature
Re: Fixes for missing schema qualifications
On Sat, Mar 10, 2018 at 08:36:34AM +, Noah Misch wrote:
> This qualifies some functions, but it leaves plenty of unqualified operators.

Yeah, I know that, and I don't have a perfect reply to offer to you. There
are a couple of methods that we could use to tackle that:

1) For functions, enforce search_path with a SET search_path = 'pg_catalog'
command. However this has a performance impact.

2) Enforce operator qualification with operator(pg_catalog.foo). This has no
impact on performance, but repeating that all over the place is rather ugly,
particularly for psql's describe.c and tab-completion.c.

3) Tweak search_path dynamically before running a query:
- Save the existing search_path value by issuing SHOW search_path.
- Use ALWAYS_SECURE_SEARCH_PATH_SQL to enforce the path.
- Set back search_path based on the previous value.
This logic can happen in a dedicated wrapper, but this impacts performance as
it requires extra round trips to the server.

For information_schema.sql, we are talking about tweaking 12 functions. So I
think that we could live with 2). To simplify user's life, we could also
recommend to users to just issue an ALTER FUNCTION SET search_path to fix the
problem for all functions, that's easier to digest.

For the rest, which basically concerns psql, I have been thinking that
actually using 2) would be the most painful approach, still something which
does not impact the user experience, while 3) is easier to back-patch by
minimizing the code footprint and avoids also any kind of future problems.

Thoughts?
--
Michael

signature.asc Description: PGP signature
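Method 3) — save, pin, and restore search_path around each query — can be wrapped roughly as follows. This is a hedged Python sketch assuming a DB-API-style cursor; quoting of the restored value is deliberately simplified here, and the two extra statements are the round-trip cost Michael mentions.

```python
def run_with_secure_path(cur, query):
    # Save the caller's search_path, pin it to pg_catalog while the query
    # runs, then restore it -- costing two extra server round trips.
    cur.execute("SHOW search_path")
    old_path = cur.fetchone()[0]
    cur.execute("SET search_path = pg_catalog")
    try:
        cur.execute(query)
        return cur.fetchall()
    finally:
        # NOTE: a real implementation must quote old_path safely.
        cur.execute("SET search_path = " + old_path)
```

The restore runs in a finally block so the session's search_path is put back even when the wrapped query raises.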
Re: ExplainProperty* and units
On 14 March 2018 at 13:27, Andres Freund wrote: > In the attached *POC* patch I've added a 'unit' parameter to the numeric > ExplainProperty* functions, which EXPLAIN_FORMAT_TEXT adds to the > output. This can avoid the above and other similar branches (of which > the JIT patch would add a number). > Comments? This seems like a good change. It would be nice to see us completely get rid of all the appendStringInfo* calls, apart from in the functions which meant to handle the behaviour specific to the explain format. This is a step towards that, so has my vote. Functions like show_hash_info are making it difficult to do more in this area. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: PATCH: Configurable file mode mask
On Tue, Mar 13, 2018 at 12:19:07PM -0400, David Steele wrote: > I'll attach new patches in a reply to [1] once I have made the changes > Tom requested. Cool, thanks for your patience. Looking forward to seeing those. I'll spend time on 0003 at the same time. Let's also put the additional umask calls for pg_rewind and pg_resetwal into a separate patch. -- Michael signature.asc Description: PGP signature
Re: PATCH: Configurable file mode mask
On Tue, Mar 13, 2018 at 01:28:17PM -0400, Tom Lane wrote: > David Steele writes: >> On 3/12/18 3:28 AM, Michael Paquier wrote: >>> In pg_rewind and pg_resetwal, isn't that also a portion which is not >>> necessary without the group access feature? > >> These seem like a good idea to me with or without patch 03. Some of our >> front-end tools (initdb, pg_upgrade) were setting umask and others >> weren't. I think it's more consistent (and safer) if they all do, at >> least if they are writing into PGDATA. > > +1 ... see a926eb84e for an example of how easy it is to screw up if > the process's overall umask is permissive. Okay. A suggestion that I have here would be to split those extra calls into a separate patch. That's a useful self-contained improvement. -- Michael signature.asc Description: PGP signature
Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
On Tue, 13 Mar 2018 11:29:03 -0400 Tom Lane wrote: > David Gould writes: > > I have attached the patch we are currently using. It applies to 9.6.8. I > > have versions for older releases in 9.4, 9.5, 9.6. I fails to apply to 10, > > and presumably head but I can update it if there is any interest. > > > The patch has three main features: > > - Impose an ordering on the autovacuum workers worklist to avoid > > the need for rechecking statistics to skip already vacuumed tables. > > - Reduce the frequency of statistics refreshes > > - Remove the AutovacuumScheduleLock > > As per the earlier thread, the first aspect of that needs more work to > not get stuck when the worklist has long tasks near the end. I don't > think you're going to get away with ignoring that concern. I agree that the concern needs to be addressed. The other concern is that sorting by oid is fairly arbitrary, the essential part is that there is some determinate order. > Perhaps we could sort the worklist by decreasing table size? That's not > an infallible guide to the amount of time that a worker will need to > spend, but it's sure safer than sorting by OID. That is better. I'll modify it to also prioritize tables that have relpages and reltuples = 0 which usually means the table has no stats at all. Maybe use oid to break ties. > Alternatively, if we decrease the frequency of stats refreshes, how > much do we even need to worry about reordering the worklist? The stats refresh in the current scheme is needed to make sure that two different workers don't vacuum the same table in close succession. It doesn't actually work, and it costs the earth. The patch imposes an ordering to prevent workers trying to claim recently vacuumed tables. This removes the need for the stats refresh. > In any case, I doubt anyone will have any appetite for back-patching > such a change. 
> I'd recommend that you clean up your patch and rebase to HEAD, then submit
> it into the September commitfest (either on a new thread or a continuation
> of the old #13750 thread, not this one).

I had in mind to make a more comprehensive patch to try to make autovacuum
more responsive when there are lots of tables, but was a bit shy of the
reception. I'll try again with this one (in a new thread) based on the
suggestions. Thanks!

-dg

--
David Gould da...@sonic.net
If simplicity worked, the world would be overrun with insects.
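The ordering being converged on above — tables with no stats at all first, then by decreasing size, with OID as a deterministic tiebreaker — can be sketched as below. This is a hedged Python sketch with a hypothetical tuple layout, not the actual autovacuum worklist code.

```python
def order_worklist(tables):
    # tables: iterable of (oid, relpages, reltuples) tuples (assumed layout).
    # Never-analyzed tables (relpages == reltuples == 0) sort first, then
    # larger tables before smaller ones, with oid breaking any ties so the
    # ordering is determinate across workers.
    def key(t):
        oid, relpages, reltuples = t
        has_stats = not (relpages == 0 and reltuples == 0)
        return (has_stats, -relpages, oid)
    return sorted(tables, key=key)
```

A determinate shared ordering is what lets workers avoid re-fetching statistics just to skip tables another worker has recently vacuumed.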
Re: Fixes for missing schema qualifications
On Tue, Mar 13, 2018 at 5:26 PM, Tatsuo Ishii wrote: > Next question is, should we update the manual? There are bunch of > places where example queries are shown without schema qualifications. > > I hope that isn't deemed necessary. David J.
Re: ALTER TABLE ADD COLUMN fast default
Hi, On 2018-03-14 09:06:54 +1030, Andrew Dunstan wrote: > I do suspect that the "physical tlist" optimization sometimes turns > out not to be one. It seems perverse to be able to improve a query's > performance by dropping a column. Yea, it's definitely not always a boon. Especially w/ the newer v10+ project code. I suspect we should largely get rid of it in v12, which presumably will see a storage layer abstraction... It'll be even more worthwhile to get rid of it if we manage to avoid deforming columns that aren't needed but are lower than the currently required columns. I still think we should build bitmasks of required columns during planning. Greetings, Andres Freund
Re: ALTER TABLE ADD COLUMN fast default
On 14 March 2018 at 11:36, Andrew Dunstan wrote: > Here are the benchmark results from the v15 patch. Fairly similar to > previous results. I'm going to run some profiling again to see if I > can identify any glaring hotspots. I do suspect that the "physical > tlist" optimization sometimes turns out not to be one. It seems > perverse to be able to improve a query's performance by dropping a > column. Can you explain what "fdnmiss" is that appears in the results? -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
ExplainProperty* and units
Hi, while adding EXPLAIN support for JITing (displaying time spent etc), I got annoyed by the amount of duplication required. There's a fair amount of if (es->format == EXPLAIN_FORMAT_TEXT) appendStringInfo(es->str, "Execution time: %.3f ms\n", 1000.0 * totaltime); else ExplainPropertyFloat("Execution Time", 1000.0 * totaltime, which is fairly redundant. In the attached *POC* patch I've added a 'unit' parameter to the numeric ExplainProperty* functions, which EXPLAIN_FORMAT_TEXT adds to the output. This can avoid the above and other similar branches (of which the JIT patch would add a number). The most valid counterargument I see is that in many cases, particularly inside plans, we have more specific output for text mode anyway. Which means there we'll not benefit much. But I think that's a) considerably done due to backward compatibility concerns b) verbosity concerns inside plans, which obviously can be complicated. Therefore I think it's perfectly reasonable to avoid specific branches for data that's only going to be displayed once per plan? We also could add separate ExplainProperty*Unit(...) functions, but I don't really see a need. Comments? 
Greetings,

Andres Freund

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 900fa74e85e..d5d1363d8e1 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -123,8 +123,8 @@ static void ExplainSubPlans(List *plans, List *ancestors,
 				const char *relationship, ExplainState *es);
 static void ExplainCustomChildren(CustomScanState *css,
 				List *ancestors, ExplainState *es);
-static void ExplainProperty(const char *qlabel, const char *value,
-				bool numeric, ExplainState *es);
+static void ExplainProperty(const char *qlabel, const char *unit,
+				const char *value, bool numeric, ExplainState *es);
 static void ExplainDummyGroup(const char *objtype, const char *labelname,
 				ExplainState *es);
 static void ExplainXMLTag(const char *tagname, int flags, ExplainState *es);
@@ -549,11 +549,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	{
 		double		plantime = INSTR_TIME_GET_DOUBLE(*planduration);
 
-		if (es->format == EXPLAIN_FORMAT_TEXT)
-			appendStringInfo(es->str, "Planning time: %.3f ms\n",
-							 1000.0 * plantime);
-		else
-			ExplainPropertyFloat("Planning Time", 1000.0 * plantime, 3, es);
+		ExplainPropertyFloat("Planning Time", "ms", 1000.0 * plantime, 3, es);
 	}
 
 	/* Print info about runtime of triggers */
@@ -585,14 +581,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	 * the output). By default, ANALYZE sets SUMMARY to true.
 	 */
 	if (es->summary && es->analyze)
-	{
-		if (es->format == EXPLAIN_FORMAT_TEXT)
-			appendStringInfo(es->str, "Execution time: %.3f ms\n",
-							 1000.0 * totaltime);
-		else
-			ExplainPropertyFloat("Execution Time", 1000.0 * totaltime,
-								 3, es);
-	}
+		ExplainPropertyFloat("Execution Time", "ms", 1000.0 * totaltime, 3,
+							 es);
 
 	ExplainCloseGroup("Query", NULL, true, es);
 }
@@ -764,8 +754,9 @@ report_triggers(ResultRelInfo *rInfo, bool show_relname, ExplainState *es)
 			ExplainPropertyText("Constraint Name", conname, es);
 		ExplainPropertyText("Relation", relname, es);
 		if (es->timing)
-			ExplainPropertyFloat("Time", 1000.0 * instr->total, 3, es);
-		ExplainPropertyFloat("Calls", instr->ntuples, 0, es);
+			ExplainPropertyFloat("Time", "ms", 1000.0 * instr->total, 3,
+								 es);
+		ExplainPropertyFloat("Calls", NULL, instr->ntuples, 0, es);
 	}
 
 	if (conname)
@@ -1280,10 +1271,10 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	}
 	else
 	{
-		ExplainPropertyFloat("Startup Cost", plan->startup_cost, 2, es);
-		ExplainPropertyFloat("Total Cost", plan->total_cost, 2, es);
-		ExplainPropertyFloat("Plan Rows", plan->plan_rows, 0, es);
-		ExplainPropertyInteger("Plan Width", plan->plan_width, es);
+		ExplainPropertyFloat("Startup Cost", NULL, plan->startup_cost, 2, es);
+		ExplainPropertyFloat("Total Cost", NULL, plan->total_cost, 2, es);
+		ExplainPropertyFloat("Plan Rows", NULL, plan->plan_rows, 0, es);
+		ExplainPropertyInteger("Plan Width", NULL, plan->plan_width, es);
 	}
 }
@@ -1323,11 +1314,11 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	{
 		if (es->timing)
 		{
-			ExplainPropertyFloat("Actual Startup Time", startup_sec, 3, es);
-			ExplainPropertyFloat("Actual Total Time", total_sec, 3, es);
+			ExplainPropertyFloat("Actual Startup Time", NULL, startup_sec, 3, es);
+			ExplainPropertyFloat("Actual Total Time", NULL, total_sec, 3, es);
 		}
-		ExplainPropertyFloat("Actual Rows", rows, 0, es);
-		ExplainPropertyFloat("Actual Loops", nloops, 0, es);
+		ExplainPropertyFloat("Actual Rows", NULL, rows, 0, es);
+		ExplainPropertyFloat("Actual Loops", NULL, nloops, 0, es);
 	}
 	else if (es->analyze)
@@ -1338,11 +1329,11
Re: Fixes for missing schema qualifications
>> select pg_catalog.count(*) from pg_catalog.pg_namespace where >> pg_catalog.nameeq(nspname, '%s'); >> >> > I'd rather write that: > > select [...] where nspname operator(pg_catalog.=) '%s' > > Introducing undocumented implementation functions to these queries is > undesirable; and besides, indexing and equivalence relies on operators and > not the underlying functions so there would be some risk of performance > issues if the functions were used directly. Thanks. Yours looks much better. Next question is, should we update the manual? There are bunch of places where example queries are shown without schema qualifications. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: Fixes for missing schema qualifications
On Tue, Mar 13, 2018 at 5:11 PM, Tatsuo Ishii wrote: > >>> +"select pg_catalog.count(*) " > >>> +"from pg_catalog.pg_namespace > where nspname = '%s'", > >> > >> This qualifies some functions, but it leaves plenty of unqualified > operators. > > Oops. I meant: > > select pg_catalog.count(*) from pg_catalog.pg_namespace where > pg_catalog.nameeq(nspname, '%s'); > > I'd rather write that: select [...] where nspname operator(pg_catalog.=) '%s' Introducing undocumented implementation functions to these queries is undesirable; and besides, indexing and equivalence relies on operators and not the underlying functions so there would be some risk of performance issues if the functions were used directly. David J.
Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
On 14 March 2018 at 09:25, Robert Haas wrote: > What do you think about the idea of using a projection path as a proxy > path instead of inventing a new method? It seems simple enough to do: > > new_path = (Path *) create_projection_path(root, new_rel, old_path, > old_path->pathtarget); > > ...when we need a proxy path. I'm very open to finding a better way to do this, but does that not just handle the targetlist issue? The proxy path also carries information which allows the translation of Vars in other places in the plan from the old rel into the vars of the new rel. Join conditions, sort clauses etc. Wouldn't a ProjectionPath just need the same additional translation fields that I've bolted onto AppendPath to make it work properly? I've looked at the projection path code but got a bit confused with the dummypp field. I see where it gets set, but not where it gets checked. A comment in create_projection_plan() seems to explain that dummypp is pretty useless since targetlists are modified sometimes, so I'm a bit unsure what the point of it is. Maybe that just goes to show that my understanding of projection paths is not that great. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Fixes for missing schema qualifications
>>> +"select pg_catalog.count(*) " >>> +"from pg_catalog.pg_namespace where >>> nspname = '%s'", >> >> This qualifies some functions, but it leaves plenty of unqualified operators. > > So this should go something like this? > > select pg_catalog.count(*) from pg_catalog.pg_namespace where nameeq(nspname, > '%s'); Oops. I meant: select pg_catalog.count(*) from pg_catalog.pg_namespace where pg_catalog.nameeq(nspname, '%s'); Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: Fixes for missing schema qualifications
>> + "select pg_catalog.count(*) " >> + "from pg_catalog.pg_namespace where >> nspname = '%s'", > > This qualifies some functions, but it leaves plenty of unqualified operators. So this should go something like this? select pg_catalog.count(*) from pg_catalog.pg_namespace where nameeq(nspname, '%s'); Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: SQL/JSON: functions
On 14 Mar 2018 01:54, "Michael Paquier" wrote: On Tue, Mar 13, 2018 at 04:08:01PM +0300, Oleg Bartunov wrote: > The docs are here > https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md > > It's not easy to write docs for SQL/JSON in xml, so I decided to write in more > friendly way. We'll have time to convert it to postgres format. If you aim at getting a feature committed first without its documentation, and getting the docs written after the feature freeze using a dedicated open item or such, Exactly. SQL/JSON is rather complex thing and "converting" the standard to the user level understanding is a separate challenge and I'd like to continue to work on it. It's mostly written, we need to understand how to organize it. this is much acceptable in my opinion and the CF is running short in time. -- Michael
Re: neqjoinsel versus "refresh materialized view concurrently"
On Wed, Mar 14, 2018 at 12:29 PM, Tom Lane wrote: > Thomas Munro writes: >> There is a fundamental and complicated estimation problem lurking here >> of course and I'm not sure what to think about that yet. Maybe there >> is a very simple fix for this particular problem: > > Ah, I see you thought of the same hack I did. > > I think this may actually be a good fix, and here's the reason: this plan > is in fact being driven entirely off planner default estimates, because > we don't have any estimation code that knows what to do with > "wholerowvar *= wholerowvar". I'm suspicious that we could drop the > preceding ANALYZE as being a waste of cycles, except maybe it's finding > out the number of rows for us. In any case, LIMIT 1 is only a good idea > to the extent that the planner knows what it's doing, and this is an > example where it demonstrably doesn't and won't any time soon. Hmm. I wonder if the ANALYZE might have been needed to avoid the nested loop plan at some point in history. Here's a patch to remove LIMIT 1, which fixes the plan for Jeff's test scenario and some smaller and larger examples I tried. The query is already executed with SPI_execute(..., 1) so it'll give up after one row anyway. The regression test includes a case that causes a row to be produced here and that's passing ('ERROR: new data for materialized view "mvtest_mv" contains duplicate rows without any null columns'). -- Thomas Munro http://www.enterprisedb.com 0001-Fix-performance-regression-in-REFRESH-MATERIALIZED-V.patch Description: Binary data
Re: PATCH: Configurable file mode mask
Chapman, * Chapman Flack (c...@anastigmatix.net) wrote: > I'm not advocating the Sisyphean task of having PG incorporate > knowledge of all those possibilities. I'm advocating the conservative > approach: have PG be that well-behaved application that those extended > semantics are generally all designed to play well with, and just not > do stuff that obstructs or tramples over what the admin takes care > to set up. I think we get that you're advocating removing the checks and permissions-setting that the PG tools do, however... > I wonder how complicated it would have to be really. On any system > with a POSIX base, I guess it's possible for PGDATA to have an S_ISVTX > "sticky" bit in the mode, which does have an official significance (but > one that only affects whether non-postgres can rename or unlink things > in the directory, which might be of little practical significance). > Perhaps its meaning could be overloaded with "the admin is handling > the permissions, thank you", and postmaster and various command-line > utilities could see it, and refrain from any gratuitous chmods or > refusals to function. > > Or, if overloading S_ISVTX seems in poor taste, what would be wrong > with simply checking for an empty file PERMISSIONS-ARE-MANAGED in > PGDATA and responding the same way? > > Or, assuming some form of ACL is available, just let the admin > change the owner and group of PGDATA to other than postgres, > grant no access to other, and give rwx to postgres in the ACL? None of these suggestions really sound workable to me. 
I certainly don't think we should be overloading the meaning of a specific and entirely independent filesystem permission (and really don't even want to imagine what it would require to do something like that on Windows...), dropping an empty file in the directory strikes me as a ripe target for confusion, and we have specific checks to try and make sure we are running as the owner that owns the directory with some pretty good reasons around that (avoiding multiple postmasters running in the same PGDATA directory, specifically). > PG could then reason as follows: * I do not own this directory. > * I am not the group of this directory. * It grants no access to other. > * Yet, I find myself listing and accessing files in it without > difficulty. * The admin has set this up for me in a way I do not > understand. * I will refrain from messing with it. Removing the check that says "we aren't going to try to run PG in this directory if we aren't the owner of it" also doesn't seem like it's necessairly a great plan. > Three ideas off the top of my head. Probably more where they came from. None of them really seem workable though. On the other hand, let's consider what this patch actually ends up doing when POSIX ACLs are involved. This allows ACLs to be used where they weren't before and with appropriate defaults set even to work for new files being created, which wouldn't work before. 
Yes, if you create a default ACL which says "grant this other user write
access to files in the PG data directory" then that won't actually be
honored because we will chmod(640) the file and the POSIX ACL system
actually works with the user/group privilege system, like so:

Create the "data" dir, as initdb would:

➜  ~ mkdir xyz
➜  ~ chmod 750 xyz
➜  ~ ls -ld xyz
drwxr-x--- 2 sfrost sfrost 4096 Mar 13 19:07 xyz

Set a default ACL to allow the "daemon" user read/write access to files
created:

➜  ~ setfacl -dm u:daemon:rw xyz
➜  ~ getfacl xyz
# file: xyz
# owner: sfrost
# group: sfrost
user::rwx
group::r-x
other::---
default:user::rwx
default:user:daemon:rw-
default:group::r-x
default:mask::rwx
default:other::---

Create a file the way PG would:

➜  ~ touch xyz/a
➜  ~ chmod 640 xyz/a
➜  ~ getfacl xyz/a
# file: xyz/a
# owner: sfrost
# group: sfrost
user::rw-
user:daemon:rw-         #effective:r--
group::r-x              #effective:r--
mask::r--
other::---

The daemon user ends up with read-only access (note the '#effective',
which shows that the POSIX ACL system isn't overriding the "regular"
ACLs).

... but that's basically what we want. Multiple users having write access
to the data directory could be quite bad as you might possibly get two
postmasters running against the same data directory at the same time and
there's basically no case where that's a good thing to have happen.

This change does let users grant out read access to other users/groups,
even beyond what's possible using the traditional user/group system, so
this opens up a lot more possible options for advanced users, provided
they set the defaults appropriately at the directory level (which,
presumably, an administrator versed in POSIX ACLs and wishing to use them
would know, or would figure out quickly).

Yes, perhaps there's some argument to be made that we should have an
option where we don't force any privileges, but that can certainly be
considered a future capability and what's being implemented here doesn't
JIT compiling with LLVM v12
Hi,

I've pushed a revised and rebased version of my JIT patchset.
The git tree is at
  https://git.postgresql.org/git/users/andresfreund/postgres.git
in the jit branch
  https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit

There's nothing hugely exciting, mostly lots of cleanups.

- added some basic EXPLAIN output, displaying JIT options and time spent
  jitting (see todo below)

   JIT:
     Functions: 9
     Generation Time: 4.604
     Inlining: false
     Inlining Time: 0.000
     Optimization: false
     Optimization Time: 0.585
     Emission Time: 12.858

- Fixed bugs around alignment computations in tuple deforming. Wasn't
  able to trigger any bad consequences, but it was clearly wrong.
- Worked a lot on making code more pgindent safe. There's still some
  minor layout damage, but it's mostly ok now. For that I had to add a
  bunch of helpers that make the code shorter
- Freshly emitted functions now have proper attributes indicating
  architecture, floating point behaviour etc. That's what previously
  prevented the inliner of doing its job without forcing its hand. That
  yields a bit of a speedup.
- reduced size of code a bit by deduplicating code, in particular don't
  "manually" create signatures for function declarations anymore.
  Besides deduplicating, this also ensures code generation time errors
  when function signatures change.
- fixed a number of FIXMEs etc
- added a lot of comments
- portability fixes (OSX, freebsd)

Todo:
- some build issues with old clang versions pointed out by Thomas Munro
- when to take jit_expressions into account (both exec and plan or just
  latter)
- EXPLAIN for queries that are JITed should display units. Starting
  thread about effort to not duplicate code for that
- more explanations of type & function signature syncing
- GUC docs (including postgresql.conf.sample)

Thanks everyone, particularly Peter in this update, for helping me along!

Regards,

Andres
Re: [HACKERS] path toward faster partition pruning
By the way, I checked whether patch 0002 (additional tests) had an effect
on coverage, and couldn't detect any changes in terms of lines/functions.
Were you able to find any bugs in your code thanks to the new tests that
would not have been covered by existing tests?

--
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: neqjoinsel versus "refresh materialized view concurrently"
Thomas Munro writes: > There is a fundamental and complicated estimation problem lurking here > of course and I'm not sure what to think about that yet. Maybe there > is a very simple fix for this particular problem: Ah, I see you thought of the same hack I did. I think this may actually be a good fix, and here's the reason: this plan is in fact being driven entirely off planner default estimates, because we don't have any estimation code that knows what to do with "wholerowvar *= wholerowvar". I'm suspicious that we could drop the preceding ANALYZE as being a waste of cycles, except maybe it's finding out the number of rows for us. In any case, LIMIT 1 is only a good idea to the extent that the planner knows what it's doing, and this is an example where it demonstrably doesn't and won't any time soon. regards, tom lane
Re: neqjoinsel versus "refresh materialized view concurrently"
Thomas Munro writes: > This looks like an invisible correlation problem. Yeah --- the planner has no idea that the join rows satisfying newdata.* *= newdata2.* are likely to be exactly the ones not satisfying newdata.ctid <> newdata2.ctid. It's very accidental that we got a good plan before. I've not looked to see where this query is generated, but I wonder if we could make things better by dropping the LIMIT 1 and instead using an executor count parameter to stop early. regards, tom lane
Re: SQL/JSON: functions
On 2018-03-14 07:54:35 +0900, Michael Paquier wrote: > On Tue, Mar 13, 2018 at 04:08:01PM +0300, Oleg Bartunov wrote: > > The docs are here > > https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md > > > > It's not easy to write docs for SQL/JSON in xml, so I decided to write in > > more > > friendly way. We'll have time to convert it to postgres format. > > If you aim at getting a feature committed first without its > documentation, and getting the docs written after the feature freeze > using a dedicated open item or such, this is much acceptable in my > opinion and the CF is running short in time. Given that this patch still uses PG_TRY/CATCH around as wide paths of code as a whole ExecEvalExpr() invocation, basically has gotten no review, I don't see this going anywhere for v11. Greetings, Andres Freund
Re: neqjoinsel versus "refresh materialized view concurrently"
On Wed, Mar 14, 2018 at 11:34 AM, Thomas Munro wrote:
> LOG: duration: 26101.612 ms plan:
> Query Text: SELECT newdata FROM pg_temp_3.pg_temp_16452 newdata WHERE
> newdata IS NOT NULL AND EXISTS (SELECT * FROM pg_temp_3.pg_temp_16452
> newdata2 WHERE newdata2 IS NOT NULL AND newdata2
> OPERATOR(pg_catalog.*=) newdata AND newdata2.ctid
> OPERATOR(pg_catalog.<>) newdata.ctid) LIMIT 1
> Limit (cost=0.00..90.52 rows=1 width=28) (actual
> time=26101.608..26101.608 rows=0 loops=1)
>   ->  Nested Loop Semi Join (cost=0.00..225220.96 rows=2488 width=28)
> (actual time=26101.606..26101.606 rows=0 loops=1)
>         Join Filter: ((newdata2.ctid <> newdata.ctid) AND (newdata.* *=
> newdata2.*))
>         Rows Removed by Join Filter: 2500
>         ->  Seq Scan on pg_temp_16452 newdata (cost=0.00..73.00
> rows=4975 width=34) (actual time=0.022..15.448 rows=5000 loops=1)
>               Filter: (newdata.* IS NOT NULL)
>         ->  Materialize (cost=0.00..97.88 rows=4975 width=34) (actual
> time=0.000..0.500 rows=5000 loops=5000)
>               ->  Seq Scan on pg_temp_16452 newdata2 (cost=0.00..73.00
> rows=4975 width=34) (actual time=0.010..4.033 rows=5000 loops=1)
>                     Filter: (newdata2.* IS NOT NULL)

This plan is chosen because we're looking for just one row (LIMIT 1)
that has equal data but a different ctid. In this case we're not going
to find one, so we'll pay the full enormous cost of the nested loop,
but the startup cost is estimated as 0 and we think we are going to
find a row straight away. That's because we don't know that it's
unlikely for there to be a row with the same columns but a different
ctid.

There is a fundamental and complicated estimation problem lurking here
of course and I'm not sure what to think about that yet.

Maybe there is a very simple fix for this particular problem:

--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -660,7 +660,7 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 					 "(SELECT * FROM %s newdata2 WHERE newdata2 IS NOT NULL "
 					 "AND newdata2 OPERATOR(pg_catalog.*=) newdata "
 					 "AND newdata2.ctid OPERATOR(pg_catalog.<>) "
-					 "newdata.ctid) LIMIT 1",
+					 "newdata.ctid)",
 					 tempname, tempname);
 	if (SPI_execute(querybuf.data, false, 1) != SPI_OK_SELECT)
 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);

That gets me back to the sort-merge plan, but maybe it's too
superficial.

--
Thomas Munro
http://www.enterprisedb.com
Re: SQL/JSON: functions
On Tue, Mar 13, 2018 at 04:08:01PM +0300, Oleg Bartunov wrote: > The docs are here > https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md > > It's not easy to write docs for SQL/JSON in xml, so I decided to write in more > friendly way. We'll have time to convert it to postgres format. If you aim at getting a feature committed first without its documentation, and getting the docs written after the feature freeze using a dedicated open item or such, this is much acceptable in my opinion and the CF is running short in time. -- Michael signature.asc Description: PGP signature
Re: [HACKERS] path toward faster partition pruning
Amit Langote wrote:

> I will continue working on improving the comments / cleaning things up and
> post a revised version soon, but until then please look at the attached.

I tried to give this a read. It looks pretty neat stuff -- as far as I
can tell, it follows Robert's sketch for how this should work. The fact
that it's under-commented makes me unable to follow it too closely though
(I felt like adding a few "wtf?" comments here and there), so it's taking
me a bit to follow things in detail. Please do submit improved versions
as you have them.

I think you're using an old version of pg_bsd_indent.

In particular need of commentary:

* match_clause_to_partition_key() should indicate which params are output
  and what do they get

* get_steps_using_prefix already has a comment, but it doesn't really
  explain much. (I'm not sure why you use the term "tuple" here. I mean,
  mathematically it probably makes sense, but in the overall context it
  seems just confusing.)

--
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: ALTER TABLE ADD COLUMN fast default
On Tue, Mar 13, 2018 at 10:58 PM, Andrew Dunstan wrote: > On Tue, Mar 13, 2018 at 2:40 PM, Andrew Dunstan > wrote: > >>> >>> Going by the commitfest app, the patch still does appear to be waiting >>> on Author. Never-the-less, I've made another pass over it and found a >>> few mistakes and a couple of ways to improve things: >>> >> >> working on these. Should have a new patch tomorrow. >> > > > Here is a patch that attends to most of these, except that I haven't > re-indented it. > > Your proposed changes to slot_getmissingattrs() wouldn't work, but I > have rewritten it in a way that avoids the double work you disliked. > > I'll rerun the benchmark tests I posted upthread and let you know the results. > Here are the benchmark results from the v15 patch. Fairly similar to previous results. I'm going to run some profiling again to see if I can identify any glaring hotspots. I do suspect that the "physical tlist" optimization sometimes turns out not to be one. It seems perverse to be able to improve a query's performance by dropping a column. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services results.t100r50k.v15 Description: Binary data results.t100r64.v15 Description: Binary data
Re: [submit code] I develop a tool for pgsql, how can I submit it
2018-03-13 12:19 GMT-03:00 leap :
> I develop a tool for pgsql, I want to submit it on pgsql.
> how can I submit it?

As Laurenz said, a tool doesn't necessarily have to be part of
PostgreSQL. Sometimes a lot of people ask for a tool, someone writes it,
and the community decides to maintain it. I'm not sure this is the case
for your tool. The pro is that the tool will be maintained by a group of
experienced programmers (I'm not saying you can't have this group outside
the PostgreSQL project. You may have enough interest if people are
excited by it). The con about this direction is that the tool development
cycle is tied to PostgreSQL (which means a 1-year cycle and slow
development over time).

Since I do a lot of debug it seems very useful for me.

If you decide to convince people about the usefulness of your tool, you
should: (i) remove some overlap between the tool and core, (ii) clean up
your code to follow the PostgreSQL guidelines, (iii) reuse core
functions, (iv) reuse constants instead of redefining them and (v)
document everything. Steps (i), (iii) and (iv) may require some code
rearrangement in PostgreSQL. Even if you decide to continue the
development outside PostgreSQL, those steps would improve your code.

--
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Re: neqjoinsel versus "refresh materialized view concurrently"
On Wed, Mar 14, 2018 at 8:07 AM, Jeff Janes wrote:
> The following commit has caused a devastating performance regression
> in concurrent refresh of MV:
>
> commit 7ca25b7de6aefa5537e0dbe56541bc41c0464f97
> Author: Tom Lane
> Date: Wed Nov 29 22:00:29 2017 -0500
>
>     Fix neqjoinsel's behavior for semi/anti join cases.
>
> The below reproduction goes from taking about 1 second to refresh, to taking
> an amount of time I don't have the patience to measure.
>
> drop table foobar2 cascade;
> create table foobar2 as select * from generate_series(1,20);
> create materialized view foobar3 as select * from foobar2;
> create unique index on foobar3 (generate_series );
> analyze foobar3;
> refresh materialized view CONCURRENTLY foobar3 ;
>
> When I interrupt the refresh, I get a message including this line:
>
> CONTEXT: SQL statement "SELECT newdata FROM pg_temp_3.pg_temp_16420 newdata
> WHERE newdata IS NOT NULL AND EXISTS (SELECT * FROM pg_temp_3.pg_temp_16420
> newdata2 WHERE newdata2 IS NOT NULL AND newdata2 OPERATOR(pg_catalog.*=)
> newdata AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid) LIMIT 1"
>
> So it makes sense that the commit in question could have caused a change in
> the execution plan. Because these are temp tables, I can't easily get my
> hands on them to investigate further.

Ouch. A quadratic join. This looks like an invisible correlation problem.

load 'auto_explain';
set auto_explain.log_min_duration = 0;
set auto_explain.log_analyze = true;
drop table if exists t cascade;
create table t as select generate_series(1, 5000);
create materialized view mv as select * from t;
create unique index on mv(generate_series);
analyze mv;
refresh materialized view concurrently mv;

HEAD:

LOG: duration: 26101.612 ms plan:
Query Text: SELECT newdata FROM pg_temp_3.pg_temp_16452 newdata WHERE
newdata IS NOT NULL AND EXISTS (SELECT * FROM pg_temp_3.pg_temp_16452
newdata2 WHERE newdata2 IS NOT NULL AND newdata2 OPERATOR(pg_catalog.*=)
newdata AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid) LIMIT 1
Limit (cost=0.00..90.52 rows=1 width=28) (actual time=26101.608..26101.608 rows=0 loops=1)
  ->  Nested Loop Semi Join (cost=0.00..225220.96 rows=2488 width=28) (actual time=26101.606..26101.606 rows=0 loops=1)
        Join Filter: ((newdata2.ctid <> newdata.ctid) AND (newdata.* *= newdata2.*))
        Rows Removed by Join Filter: 2500
        ->  Seq Scan on pg_temp_16452 newdata (cost=0.00..73.00 rows=4975 width=34) (actual time=0.022..15.448 rows=5000 loops=1)
              Filter: (newdata.* IS NOT NULL)
        ->  Materialize (cost=0.00..97.88 rows=4975 width=34) (actual time=0.000..0.500 rows=5000 loops=5000)
              ->  Seq Scan on pg_temp_16452 newdata2 (cost=0.00..73.00 rows=4975 width=34) (actual time=0.010..4.033 rows=5000 loops=1)
                    Filter: (newdata2.* IS NOT NULL)

And with commit 7ca25b7de6aefa5537e0dbe56541bc41c0464f97 reverted:

LOG: duration: 36.358 ms plan:
Query Text: SELECT newdata FROM pg_temp_3.pg_temp_16470 newdata WHERE
newdata IS NOT NULL AND EXISTS (SELECT * FROM pg_temp_3.pg_temp_16470
newdata2 WHERE newdata2 IS NOT NULL AND newdata2 OPERATOR(pg_catalog.*=)
newdata AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid) LIMIT 1
Limit (cost=756.95..939.50 rows=1 width=28) (actual time=36.354..36.354 rows=0 loops=1)
  ->  Merge Semi Join (cost=756.95..2947.51 rows=12 width=28) (actual time=36.352..36.352 rows=0 loops=1)
        Merge Cond: (newdata.* *= newdata2.*)
        Join Filter: (newdata2.ctid <> newdata.ctid)
        Rows Removed by Join Filter: 5000
        ->  Sort (cost=378.48..390.91 rows=4975 width=34) (actual time=9.622..10.300 rows=5000 loops=1)
              Sort Key: newdata.* USING *<
              Sort Method: quicksort  Memory: 622kB
              ->  Seq Scan on pg_temp_16470 newdata (cost=0.00..73.00 rows=4975 width=34) (actual time=0.021..4.986 rows=5000 loops=1)
                    Filter: (newdata.* IS NOT NULL)
        ->  Sort (cost=378.48..390.91 rows=4975 width=34) (actual time=7.378..8.010 rows=5000 loops=1)
              Sort Key: newdata2.* USING *<
              Sort Method: quicksort  Memory: 622kB
              ->  Seq Scan on pg_temp_16470 newdata2 (cost=0.00..73.00 rows=4975 width=34) (actual time=0.017..3.034 rows=5000 loops=1)
                    Filter: (newdata2.* IS NOT NULL)

--
Thomas Munro
http://www.enterprisedb.com
Re: JIT compiling with LLVM v11
Hi,

On 2018-03-14 10:32:40 +1300, Thomas Munro wrote:
> I decided to try this on a CentOS 7.2 box. It has LLVM 3.9 in the
> 'epel' package repo, but unfortunately it only has clang 3.4.

That's a bit odd, given llvm and clang really live in the same repo...

> clang: error: unknown argument: '-fexcess-precision=standard'
> clang: error: unknown argument: '-flto=thin'
>
> Ok, so I hacked src/Makefile.global.in to remove -flto=thin.

I think I can actually get rid of that entirely.

> It looks like -fexcess-precision=standard is coming from a configure
> test that was run against ${CC}, not against ${CLANG}, so I hacked
> the generated src/Makefile.global to remove that too, just to see if
> I could get past that.

Yea, I'd hoped we could avoid duplicating all the configure tests, but
maybe not :(.

> Then I could build successfully and make check passed. I did see one
> warning:
>
> In file included from execExpr.c:39:
> ../../../src/include/jit/jit.h:36:3: warning: redefinition of typedef
> 'JitProviderCallbacks' is a C11 feature [-Wtypedef-redefinition]
> } JitProviderCallbacks;
>   ^
> ../../../src/include/jit/jit.h:22:37: note: previous definition is here
> typedef struct JitProviderCallbacks JitProviderCallbacks;
>                                     ^

Yep. Removed the second, superfluous typedef. Will push a heavily
rebased version in a bit; it will include a fix for this.

Greetings,

Andres Freund
Re: JIT compiling with LLVM v11
On Thu, Mar 1, 2018 at 9:02 PM, Andres Freund wrote:
> Biggest changes:
> - LLVM 3.9 - master are now supported. This includes a good chunk of
>   work by Pierre Ducroquet.

I decided to try this on a CentOS 7.2 box. It has LLVM 3.9 in the
'epel' package repo, but unfortunately it only has clang 3.4. I
suppose it's important to make this work for RHEL7 using only
dependencies that can be met by the vendor package repos? Maybe
someone who knows more about CentOS/RHEL could tell me if I'm mistaken
and there is a way to get a more modern clang from a reputable repo
that our packages could depend on, though I realize that clang is only
a build dependency, not a runtime one. I'm unsure how that constrains
things.

clang: "clang version 3.4.2 (tags/RELEASE_34/dot2-final)"
gcc and g++: "gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC)"
llvm: "3.9.1"

First problem:

clang: error: unknown argument: '-fexcess-precision=standard'
clang: error: unknown argument: '-flto=thin'

Ok, so I hacked src/Makefile.global.in to remove -flto=thin. It looks
like -fexcess-precision=standard is coming from a configure test that
was run against ${CC}, not against ${CLANG}, so I hacked the generated
src/Makefile.global to remove that too, just to see if I could get
past that. I don't know if there was another way to control floating
point precision in ancient clang before they adopted the
GCC-compatible flag, but it would seem slightly fishy to have .o files
and .bc files compiled with different floating-point settings, because
then you could get different answers depending on whether your
expression is JITted.

Then I could build successfully and make check passed.
I did see one warning:

In file included from execExpr.c:39:
../../../src/include/jit/jit.h:36:3: warning: redefinition of typedef
'JitProviderCallbacks' is a C11 feature [-Wtypedef-redefinition]
} JitProviderCallbacks;
  ^
../../../src/include/jit/jit.h:22:37: note: previous definition is here
typedef struct JitProviderCallbacks JitProviderCallbacks;
                                    ^

That's a legit complaint.

-- 
Thomas Munro
http://www.enterprisedb.com
Re: parallel append vs. simple UNION ALL
On Tue, Mar 13, 2018 at 12:35 AM, Ashutosh Bapat wrote:
> It looks like it was not changed in all the places. make failed. I
> have fixed all the instances of these two functions in the attached
> patchset (only 0003 changes). Please check.

Oops. Thanks. I'm going to go ahead and commit 0001 here.

Any more thoughts on the rest?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: PATCH: Configurable file mode mask
On 03/13/2018 02:47 PM, Tom Lane wrote:
> Well, to be blunt, what we target is POSIX-compatible systems. If
> you're telling us to worry about non-POSIX filesystem semantics,
> and your name is not Microsoft, it's going to be a hard sell.
> We have enough to do just keeping up with that scope of target
> systems.

So, how many POSIX-compatible systems are available today (if any)
where you can actually safely assume that there are no additional
security/access-control-related considerations in effect beyond three
user bits/three group bits/three other bits, and not be wrong?

I'm not advocating the Sisyphean task of having PG incorporate
knowledge of all those possibilities. I'm advocating the conservative
approach: have PG be that well-behaved application that those extended
semantics are generally all designed to play well with, and just not
do stuff that obstructs or tramples over what the admin takes care to
set up.

On 03/13/2018 03:44 PM, Stephen Frost wrote:
> * Chapman Flack (c...@anastigmatix.net) wrote:
>> So my suggestion boils down to PG having at least an option, somehow,
>> to be well-behaved in that sense.
>
> I'm afraid that we haven't got any great answer to that "somehow". I
> was hoping you might have some other ideas beyond command-line
> switches which could leave the system in an inconsistent state a bit
> too easily.

I wonder how complicated it would have to be, really. On any system
with a POSIX base, I guess it's possible for PGDATA to have an S_ISVTX
"sticky" bit in its mode, which does have an official significance
(but one that only affects whether non-postgres users can rename or
unlink things in the directory, which might be of little practical
significance). Perhaps its meaning could be overloaded with "the admin
is handling the permissions, thank you", and the postmaster and
various command-line utilities could see it and refrain from any
gratuitous chmods or refusals to function.

Or, if overloading S_ISVTX seems in poor taste, what would be wrong
with simply checking for an empty file PERMISSIONS-ARE-MANAGED in
PGDATA and responding the same way?

Or, assuming some form of ACL is available, just let the admin change
the owner and group of PGDATA to other than postgres, grant no access
to other, and give rwx to postgres in the ACL? PG could then reason as
follows:

* I do not own this directory.
* I am not in the group of this directory.
* It grants no access to other.
* Yet, I find myself listing and accessing files in it without
  difficulty.
* The admin has set this up for me in a way I do not understand.
* I will refrain from messing with it.

Three ideas off the top of my head. Probably more where they came
from. :)

-Chap
Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
On Mon, Feb 19, 2018 at 4:02 AM, David Rowley wrote:
> On 19 February 2018 at 18:01, David Rowley wrote:
>> On 19 February 2018 at 15:11, Tomas Vondra wrote:
>>> and perhaps we should do s/isproxy/is_proxy/ which seems like the usual
>>> naming for boolean variables.
>>
>> You're right. I'll change that and post an updated patch. I'll also
>> reprocess your email again and try to improve some comments to make
>> the intent of the code more clear.
>
> I've made a few changes to the patch. "isproxy" is now "is_proxy".
> I've made the comments a bit more verbose at the top of the definition
> of struct AppendPath. Also shuffled some comments around in allpaths.c
>
> I've attached both a delta patch with just these changes, and also a
> completely new patch based on a recent master.

While I was working on
http://postgr.es/m/ca+tgmoakt5gmahbpwgqrr2nadfomaonoxyowhrdvfgws34t...@mail.gmail.com
I noticed that a projection path is very close to being usable as a
proxy path, because we've already got code to strip out unnecessary
proxy paths at plan generation time. I noticed that there were a few
problems with that which I wrote patches to fix (see 0001, 0002
attached to that email), but they seem to be only minor issues.

What do you think about the idea of using a projection path as a proxy
path instead of inventing a new method? It seems simple enough to do:

new_path = (Path *) create_projection_path(root, new_rel, old_path,
                                           old_path->pathtarget);

...when we need a proxy path.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: PATCH: Configurable file mode mask
Greetings Chapman,

* Chapman Flack (c...@anastigmatix.net) wrote:
> On 03/13/2018 01:50 PM, Stephen Frost wrote:
> > PG will stat PGDATA and conclude that the system is saying that group
> > access has been granted on PGDATA and will do the same for subsequent
> > files created later on. ... The only issue that remains is that PG
> > doesn't understand or work with POSIX ACLs or Linux capabilities,
> > but that's not anything new.
>
> What's new is that it is now pretending even more extravagantly to
> understand what it doesn't understand. Where it would previously draw
> an incorrect conclusion and refuse to start, which is annoying but
> not very difficult to work around if need be, it would now draw the
> same incorrect conclusion and then actively go about making the real
> world embody the incorrect conclusion, contrary to the admin's efforts.

I have to say that I disagree about it being "easy to work around" PG
refusing to start. Also, we're not pretending any more or less; we're
sticking to exactly what we do understand, which is the 80's unix
permission system, as you put it.

The options that I see here are to stick with the user/group system
and our naive understanding of it, to go whole-hog and try to
completely understand everything (with lots of #ifdef's, as
discussed), or to completely remove all checks. We don't have a clear
proposal for the last, and it strikes me as at least unlikely to go
over well anyway, given all of the discussion here about trying to
simply change the one check we have.

> >> umask inherited by the postmaster. A --permission-transparency option
> >> would simply use open and mkdir in the traditional ways, allowing
> >> the chosen umask, ACL defaults, SELinux contexts, etc., to do their
> >> thing, and would avoid then stepping on the results with explicit
> >> chmods (and of course skip the
> >> I-refuse-to-start-because-I-misunderstand-your-setup check).
> >> ...
> > I'm a fan of this idea in general, but it's unclear how that
> > --permission-transparency option would work in practice. Are you
> > suggesting that it be a compile-time flag, which would certainly work
> > but also then have to be debated among packagers as to the right
> > setting and if there's any need to be concerned about users
> > misunderstanding it, or a flag for each program,
>
> I was thinking of a command-line option ...
>
> > which hardly seems ideal as some invocations
> > of programs might forget to specify that, leading to inconsistent
> > permissions, or something else..?
>
> ... but I see how that gets complicated with the various other
> command-line utilities included.

Indeed.

> > .. we'd have to build in complete
> > support for POSIX ACLs and Linux capabilities if we go down a route
>
> I'm wary of an argument that we can't do better except by building
> in complete support for POSIX ACLs, and capabilities (and NFSv4
> ACLs, and SELinux, and AppArmor, and #ifdef and #ifdef and #ifdef).

I don't think I meant to imply that we can't do better; I was just
trying to enumerate what I saw as the different options.

> It seems to me that, in most cases, the designers of these sorts of
> extensions to old traditional Unix behavior take great pains to design
> them such that they can still usefully function in the presence of
> programs that "don't pay attention to or understand or use" them, as
> long as those programs are in some sense well-behaved, and not going
> out of their way with active steps that insist on or impose permissions
> that aren't appropriate under the non-understood circumstances.
>
> So my suggestion boils down to PG having at least an option, somehow,
> to be well-behaved in that sense.

I'm afraid that we haven't got any great answer to that "somehow". I
was hoping you might have some other ideas beyond command-line
switches, which could leave the system in an inconsistent state a bit
too easily.
Unless there's a better way, the approach proposed by Tom (originally)
and implemented by David seems like the way to go, and is at least an
improvement over the current situation.

Thanks!

Stephen
Re: [submit code] I develop a tool for pgsql, how can I submit it
leap wrote:
> I develop a tool for pgsql, and I want to submit it on pgsql.
> How can I submit it?
>
> address: https://github.com/leapking/pgcheck

I would leave it on GitHub and add some documentation. A lot of great
tools for PostgreSQL are not part of the core distribution; that
doesn't mean that they are unloved.

Yours,
Laurenz Albe
Re: [HACKERS] taking stdbool.h into use
On 3/1/18 23:34, Michael Paquier wrote:
> Indeed, this is wrong. Peter, why did you suddenly switch this patch
> to ready for committer? The patch is waiting for your input, as you
> mentioned that the GIN portion of this patch series is not completely
> baked yet. So I have switched the patch to this state.

After more digging, there are more problems with having a bool that is
not 1 byte. For example, pg_control has a bool field, so with a
different bool size, pg_control would be laid out differently. That
would require changing all the mentions of bool to bool8 where they
end up on disk somehow, as I had already done for the system catalog
structures, but we don't know all the other places that would be
affected.

So I'm going back to my proposal from December: just use stdbool.h
when sizeof(bool) == 1, and add a static assertion to prevent other
configurations.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

>From daedc3ea6d044dde18cc0c630085a7f55e764794 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Tue, 19 Dec 2017 13:54:05 -0500
Subject: [PATCH v6 1/2] Add static assertions about size of GinTernaryValue
 vs bool

We need these to be the same because some code casts between pointers
to them.
---
 src/backend/utils/adt/tsginidx.c | 4 +++-
 src/include/access/gin.h         | 4 ++--
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/tsginidx.c b/src/backend/utils/adt/tsginidx.c
index de59e6417e..00e32b2570 100644
--- a/src/backend/utils/adt/tsginidx.c
+++ b/src/backend/utils/adt/tsginidx.c
@@ -309,7 +309,9 @@ gin_tsquery_consistent(PG_FUNCTION_ARGS)
 	 * query.
 	 */
 	gcv.first_item = GETQUERY(query);
-	gcv.check = check;
+	StaticAssertStmt(sizeof(GinTernaryValue) == sizeof(bool),
+					 "sizes of GinTernaryValue and bool are not equal");
+	gcv.check = (GinTernaryValue *) check;
 	gcv.map_item_operand = (int *) (extra_data[0]);
 	gcv.need_recheck = recheck;

diff --git a/src/include/access/gin.h b/src/include/access/gin.h
index 0acdb88241..3d8a130b69 100644
--- a/src/include/access/gin.h
+++ b/src/include/access/gin.h
@@ -51,8 +51,8 @@ typedef struct GinStatsData
 /*
  * A ternary value used by tri-consistent functions.
  *
- * For convenience, this is compatible with booleans. A boolean can be
- * safely cast to a GinTernaryValue.
+ * This must be of the same size as a bool because some code will cast a
+ * pointer to a bool to a pointer to a GinTernaryValue.
  */
 typedef char GinTernaryValue;

base-commit: 17bb62501787c56e0518e61db13a523d47afd724
-- 
2.16.2

>From 67f32f4195632fd1068bf3ad0bf93c12a865c7d4 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Tue, 19 Dec 2017 14:24:43 -0500
Subject: [PATCH v6 2/2] Use stdbool.h if suitable

Using the standard bool type provided by C allows some recent
compilers and debuggers to give better diagnostics. Also, some
extension code and third-party headers are increasingly pulling in
stdbool.h, so it's probably saner if everyone uses the same
definition.

But PostgreSQL code is not prepared to handle bool of a size other
than 1, so we keep our own old definition if we encounter a stdbool.h
with a bool of a different size. (Apparently, that includes Power CPUs
and some very old compilers that declared bool to be an enum.)

We have static assertions in critical places that check that bool is
of the right size.
--- configure | 213 - configure.in | 7 ++ src/include/c.h| 14 ++- src/include/pg_config.h.in | 9 ++ src/pl/plperl/plperl.h | 10 +-- 5 files changed, 205 insertions(+), 48 deletions(-) diff --git a/configure b/configure index 3943711283..c670becd16 100755 --- a/configure +++ b/configure @@ -1999,116 +1999,116 @@ $as_echo "$ac_res" >&6; } } # ac_fn_c_check_func -# ac_fn_c_check_member LINENO AGGR MEMBER VAR INCLUDES -# -# Tries to find if the field MEMBER exists in type AGGR, after including -# INCLUDES, setting cache variable VAR accordingly. -ac_fn_c_check_member () +# ac_fn_c_check_type LINENO TYPE VAR INCLUDES +# --- +# Tests whether TYPE exists after having included INCLUDES, setting cache +# variable VAR accordingly. +ac_fn_c_check_type () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack - { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2.$3" >&5 -$as_echo_n "checking for $2.$3... " >&6; } -if eval \${$4+:} false; then : + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5 +$as_echo_n "checking for $2... " >&6; } +if eval \${$3+:} false; then : $as_echo_n "(cached) " >&6 else + eval "$3=no" cat conf
Re: JIT compiling with LLVM v11
Andres Freund writes:
> On 2018-03-13 14:36:44 -0400, Robert Haas wrote:
>> I realize that EXPLAIN (JIT OFF) may sound like it's intended to
>> disable JIT itself

> Yea, that's what I'm concerned about.

>> , but I think it's pretty clear that EXPLAIN (BUFFERS OFF) does not
>> disable the use of actual buffers, only the display of buffer-related
>> information.

> Hm.

FWIW, I agree with Robert's preference for just JIT here. The "info"
bit isn't conveying anything. And we've never had any EXPLAIN options
that actually change the behavior of the explained command, only ones
that change the amount of info displayed. I don't see why we'd
consider JIT an exception to that.

			regards, tom lane
neqjoinsel versus "refresh materialized view concurrently"
The following commit has caused a devastating performance regression
in concurrent refresh of MVs:

commit 7ca25b7de6aefa5537e0dbe56541bc41c0464f97
Author: Tom Lane
Date:   Wed Nov 29 22:00:29 2017 -0500

    Fix neqjoinsel's behavior for semi/anti join cases.

The reproduction below goes from taking about 1 second to refresh to
taking an amount of time I don't have the patience to measure.

drop table foobar2 cascade;
create table foobar2 as select * from generate_series(1,20);
create materialized view foobar3 as select * from foobar2;
create unique index on foobar3 (generate_series);
analyze foobar3;
refresh materialized view CONCURRENTLY foobar3;

When I interrupt the refresh, I get a message including this line:

CONTEXT: SQL statement "SELECT newdata FROM pg_temp_3.pg_temp_16420
newdata WHERE newdata IS NOT NULL AND EXISTS (SELECT * FROM
pg_temp_3.pg_temp_16420 newdata2 WHERE newdata2 IS NOT NULL AND
newdata2 OPERATOR(pg_catalog.*=) newdata AND newdata2.ctid
OPERATOR(pg_catalog.<>) newdata.ctid) LIMIT 1"

So it makes sense that the commit in question could have caused a
change in the execution plan. Because these are temp tables, I can't
easily get my hands on them to investigate further.

Cheers,

Jeff
Re: Fix error in ECPG while connection handling
> Thanks for spotting and fixing. I will push the patch as soon as I'm > online again. > Thanks Michael for taking care of this. Regards, Jeevan Ladhe.
Re: PATCH: Configurable file mode mask
Chapman Flack writes:
> On 03/13/2018 01:50 PM, Stephen Frost wrote:
>> I'll point out that PG does run on quite a few other OS's beyond Linux.

> I'll second that, as I think it is making my argument. When I can
> supply three or four examples of semantic subtleties that undermine
> PG's assumptions under Linux alone, the picture when broadened to
> those quite-a-few other platforms as well certainly doesn't become
> simpler!

Well, to be blunt, what we target is POSIX-compatible systems. If
you're telling us to worry about non-POSIX filesystem semantics, and
your name is not Microsoft, it's going to be a hard sell. We have
enough to do just keeping up with that scope of target systems.

			regards, tom lane
Re: proposal: schema variables
2018-03-13 10:54 GMT+01:00 Pavel Luzanov:
> Pavel Stehule wrote
> > Now, there is one important question - storage - Postgres stores all
> > objects to files - only memory storage is not designed yet. This is
> > the part where I need help.
>
> Oh, I do not feel confident in such questions.
> Maybe you can get some ideas from an extension with similar
> functionality: https://github.com/postgrespro/pg_variables

Unfortunately not - it doesn't implement this functionality.

Regards

Pavel

> -
> Pavel Luzanov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
Re: JIT compiling with LLVM v11
Hi,

On 2018-03-13 14:36:44 -0400, Robert Haas wrote:
> On Mon, Mar 12, 2018 at 5:04 PM, Andres Freund wrote:
> > Currently a handful of explain outputs in the regression tests change
> > output when compiled with JITing. Therefore I'm thinking of adding a
> > JITINFO or similar option, which can be set to false for those tests?
>
> Can we spell that JIT or at least JIT_INFO?

The latter works; I don't have a strong opinion on that. For now I've
just tied it to COSTS off.

> I realize that EXPLAIN (JIT OFF) may sound like it's intended to
> disable JIT itself

Yea, that's what I'm concerned about.

> , but I think it's pretty clear that EXPLAIN (BUFFERS OFF) does not
> disable the use of actual buffers, only the display of buffer-related
> information.

Hm.

Greetings,

Andres Freund
Re: JIT compiling with LLVM v11
On Mon, Mar 12, 2018 at 5:04 PM, Andres Freund wrote:
> Currently a handful of explain outputs in the regression tests change
> output when compiled with JITing. Therefore I'm thinking of adding a
> JITINFO or similar option, which can be set to false for those tests?

Can we spell that JIT or at least JIT_INFO? I realize that EXPLAIN
(JIT OFF) may sound like it's intended to disable JIT itself, but I
think it's pretty clear that EXPLAIN (BUFFERS OFF) does not disable
the use of actual buffers, only the display of buffer-related
information.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] why not parallel seq scan for slow functions
On Wed, Feb 14, 2018 at 5:37 AM, Amit Kapila wrote:
> Your concern is valid, but doesn't the same problem exist in the other
> approach as well? In that one we can also call adjust_paths_for_srfs
> after generating a gather path, which means that we might lose some
> opportunity to reduce the relative cost of parallel paths due to
> tlists having SRFs. Also, a similar problem can happen in
> create_order_paths for the cases described in the example above.

You're right. I think cleaning all of this up for v11 is too much to
consider, but please tell me your opinion of the attached patch set.
Here, instead of ripping the problematic logic out of
apply_projection_to_path, what I've done is just removed several of
the callers of apply_projection_to_path. I think that the work of
removing the other callers of that function could be postponed to a
future release, but we'd still get some benefit now, and this shows
the direction I have in mind.

I'm going to explain what the patches do one by one, but out of order,
because I backed into the need for the earlier patches as a result of
troubleshooting the later ones in the series. Note that the patches
need to be applied in order even though I'm explaining them out of
order.

0003 introduces a new upper relation to represent the result of
applying the scan/join target to the topmost scan/join relation. I'll
explain further down why this seems to be needed. Since each path must
belong to only one relation, this involves using
create_projection_path() for the non-partial pathlist, as we already
do for the partial pathlist, rather than apply_projection_to_path().
This is probably marginally more expensive, but I'm hoping it doesn't
matter. (However, I have not tested.) Each non-partial path in the
topmost scan/join rel produces a corresponding path in the new upper
rel. The same is done for partial paths if the scan/join target is
parallel-safe; otherwise we can't.
0004 causes the main part of the planner to skip calling
generate_gather_paths() for the topmost scan/join rel. This logic is
mostly stolen from your patch. If the scan/join target is NOT
parallel-safe, we perform generate_gather_paths() on the topmost
scan/join rel. If the scan/join target IS parallel-safe, we do it on
the post-projection rel introduced by 0003 instead. This is the patch
that actually fixes Jeff's original complaint.

0005 goes a bit further and removes a bunch of logic from
create_ordered_paths(). The removed logic tries to satisfy the query's
ordering requirement via cheapest_partial_path + Sort + Gather Merge.
Instead, it adds an additional path to the new upper rel added in
0003. This again avoids a call to apply_projection_to_path() which
could cause projection to be pushed down after costing has already
been fixed. Therefore, it gains the same advantages as 0004 for
queries that would use this sort of plan. Currently, this loses the
ability to set limit_tuples for the Sort path; that doesn't look too
hard to fix, but I haven't done it. If we decide to proceed with this
overall approach, I'll see about getting it sorted out.

Unfortunately, when I initially tried this approach, a number of
things broke due to the fact that create_projection_path() was not
exactly equivalent to apply_projection_to_path(). This initially
surprised me, because create_projection_plan() contains logic to
eliminate the Result node that is very similar to the logic in
apply_projection_to_path(). If apply_projection_to_path() sees that
the subordinate node is projection-capable, it applies the revised
target list to the child; if not, it adds a Result node.
create_projection_plan() does the same thing. However,
create_projection_plan() can lose the physical-tlist optimization for
the subordinate node; it forces an exact projection even if the parent
node doesn't require one.
0001 fixes this, so that we get the same plans with
create_projection_path() that we would with
apply_projection_to_path(). I think this patch is a good idea
regardless of what we decide to do about the rest of this, because it
can potentially avoid losing the physical-tlist optimization in any
place where create_projection_path() is used.

It turns out that 0001 doesn't manage to preserve the physical-tlist
optimization when the projection path is attached to an upper
relation. 0002 fixes this.

If we decide to go forward with this approach, it may make sense to
merge some of these when actually committing, but I found it useful to
separate them for development and testing purposes, and for clarity
about what was getting changed at each stage.

Please let me know your thoughts.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

0005-Remove-explicit-path-construction-logic-in-create_or.patch
Description: Binary data

0004-Postpone-generate_gather_paths-for-topmost-scan-join.patch
Description: Binary data

0003-Add-new-upper-rel-to-represent-projecting-toplevel-
Re: JIT compiling with LLVM v11
On 2018-03-13 10:25:49 -0400, Peter Eisentraut wrote:
> On 3/12/18 17:04, Andres Freund wrote:
> > │ JIT: │
> > │   Functions: 4 │
> > │   Inlining: false │
> > │   Inlining Time: 0.000 │
> > │   Optimization: false │
> > │   Optimization Time: 5.023 │
> > │   Emission Time: 34.987 │
>
> The time quantities need some units.
>
> > │ Execution time: 46.277 ms │
>
> like this :)

Yea, I know. I was planning to start a thread about that. explain.c is
littered with code like

    if (es->format == EXPLAIN_FORMAT_TEXT)
        appendStringInfo(es->str, "Planning time: %.3f ms\n",
                         1000.0 * plantime);
    else
        ExplainPropertyFloat("Planning Time", 1000.0 * plantime, 3, es);

which, to me, is bonkers. I think we should add a 'const char *unit'
parameter to at least ExplainProperty{Float,Integer,Long}. Or a *Unit
version of them doing so, allowing a bit more gradual change?

Greetings,

Andres Freund
Re: TupleTableSlot abstraction
On 2018-03-13 15:18:34 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
> > Hi,
> >
> > I've recently been discussing with Robert how to abstract
> > TupleTableSlots in the light of upcoming developments around
> > abstracting storage away. Besides that aspect I think we also need
> > to make them more abstract to later deal with vectorized execution,
> > but I'm more fuzzy on the details there.
>
> Hi, is this patch proposed for pg11?

I wish, but I don't see it happening unless I get a time-compression
device from somewhere :(

Greetings,

Andres Freund
Re: PATCH: Configurable file mode mask
On 03/13/2018 01:50 PM, Stephen Frost wrote:
> Yes, that's true, but PG has never paid attention to POSIX ACLs or
> Linux capabilities. Changing it to do so is quite a bit beyond the
> scope...

I think we're largely in agreement here, as my aim was not to advocate
that PG should work harder to understand the subtleties of every
system it could be installed on, but rather that it should work less
hard at pretending to understand them when it doesn't, and thus avoid
obstructing the admin, who presumably does.

> I'll point out that PG does run on quite a few other OS's beyond Linux.

I'll second that, as I think it is making my argument. When I can
supply three or four examples of semantic subtleties that undermine
PG's assumptions under Linux alone, the picture when broadened to
those quite-a-few other platforms as well certainly doesn't become
simpler!

>> but then sooner or later it will still end up making assumptions
>> that aren't true under, say, SELinux, so there's another #ifdef,
>> and where does it end?
>
> I agree with this general concern. :)

That's probably where it became clear that I'm not advocating an
add-#ifdefs-for-everything approach.

> PG will stat PGDATA and conclude that the system is saying that group
> access has been granted on PGDATA and will do the same for subsequent
> files created later on. ... The only issue that remains is that PG
> doesn't understand or work with POSIX ACLs or Linux capabilities,
> but that's not anything new.

What's new is that it is now pretending even more extravagantly to
understand what it doesn't understand. Where it would previously draw
an incorrect conclusion and refuse to start, which is annoying but not
very difficult to work around if need be, it would now draw the same
incorrect conclusion and then actively go about making the real world
embody the incorrect conclusion, contrary to the admin's efforts.

>> umask inherited by the postmaster. A --permission-transparency
>> option would simply use open and mkdir in the traditional ways,
>> allowing the chosen umask, ACL defaults, SELinux contexts, etc., to
>> do their thing, and would avoid then stepping on the results with
>> explicit chmods (and of course skip the
>> I-refuse-to-start-because-I-misunderstand-your-setup check).
>> ...
>
> I'm a fan of this idea in general, but it's unclear how that
> --permission-transparency option would work in practice. Are you
> suggesting that it be a compile-time flag, which would certainly work
> but also then have to be debated among packagers as to the right
> setting and if there's any need to be concerned about users
> misunderstanding it, or a flag for each program,

I was thinking of a command-line option ...

> which hardly seems ideal as some invocations of programs might forget
> to specify that, leading to inconsistent permissions, or something
> else..?

... but I see how that gets complicated with the various other
command-line utilities included.

> .. we'd have to build in complete support for POSIX ACLs and Linux
> capabilities if we go down a route

I'm wary of an argument that we can't do better except by building in
complete support for POSIX ACLs, and capabilities (and NFSv4 ACLs, and
SELinux, and AppArmor, and #ifdef and #ifdef and #ifdef).

It seems to me that, in most cases, the designers of these sorts of
extensions to old traditional Unix behavior take great pains to design
them such that they can still usefully function in the presence of
programs that "don't pay attention to or understand or use" them, as
long as those programs are in some sense well-behaved, and not going
out of their way with active steps that insist on or impose
permissions that aren't appropriate under the non-understood
circumstances.

So my suggestion boils down to PG having at least an option, somehow,
to be well-behaved in that sense.

-Chap
Re: INOUT parameters in procedures
2018-03-13 14:14 GMT+01:00 Peter Eisentraut < peter.eisentr...@2ndquadrant.com>: > On 3/8/18 02:25, Pavel Stehule wrote: > > It looks like some error in this concept. The rules for enabling > > overwriting procedures should modified, so this collision should not be > > done. > > > > When I using procedure from PL/pgSQL, then it is clear, so I place on > > *OUT position variables. But when I call procedure from top, then I'll > > pass fake parameters to get some result. > > What we'll probably want to do here is to make the OUT parameters part > of the identity signature of procedures, unlike in functions. This > should be a straightforward change, but it will require some legwork in > many parts of the code. > yes > > >if (argmodes && (argmodes[i] == PROARGMODE_INOUT || > > argmodes[i] == PROARGMODE_OUT)) > > + { > > + Param *param; > > > > Because PROARGMODE_OUT are disallowed, then this check is little bit > > messy. Please, add some comment. > > Fixed. > > I discovered another issue, in LANGUAGE SQL procedures. Currently, if > you make a CALL with an INOUT parameter in an SQL procedure, the output > is thrown away (unless it's the last command). I would like to keep > open the option of assigning the results by name, like we do in > PL/pgSQL. So in this patch I have made a change to prohibit calling > procedures with INOUT parameters in LANGUAGE SQL routines (see > check_sql_fn_statements()). What do you think? > Disabling it is probably the best that is possible now. The variables in SQL are more like named parameters than variables; there is no need to complicate it. Regards Pavel > > -- > Peter Eisentraut http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >
Re: TupleTableSlot abstraction
Andres Freund wrote: > Hi, > > I've recently been discussing with Robert how to abstract > TupleTableSlots in the light of upcoming developments around abstracting > storage away. Besides that aspect I think we also need to make them > more abstract to later deal with vectorized execution, but I'm more fuzzy > on the details there. Hi, is this patch proposed for pg11? -- Álvaro Herrera  https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Inquiry regarding the projects under GSOC
Hello Postgres Team, I am Ankit, a 3rd-year B.Tech student from India. I am really interested in working with your organisation as an intern to fix issues and bugs, as I love challenges, and since the docs state that some features might not be possible to implement, I would love to try. My question is: what current bug fixes or feature updates are available for interns to work on? Thanks!!
[submit code] I develop a tool for pgsql, how can I submit it
Hello! I have developed a tool for PostgreSQL and would like to contribute it to the project. How can I submit it? Address: https://github.com/leapking/pgcheck
Re: [HACKERS] path toward faster partition pruning
Hi Amit, On 03/13/2018 07:37 AM, Amit Langote wrote: > I will continue working on improving the comments / cleaning things up and > post a revised version soon, but until then please look at the attached. Passes check-world. Some minor comments:

0001: Ok
0002: Ok
0003:
* Trailing white space
* pruning.c
  - partkey_datum_from_expr
    * "Add more expression types..." -- Are you planning to add more of these? Otherwise change the comment
  - get_partitions_for_null_keys
    * Documentation for method
    * 'break;' missing for _HASH and default case
  - get_partitions_for_keys
    * 'break;'s are outside of the 'case' blocks
    * The 'switch(opstrategy)'s could use some {} blocks
    * 'break;' missing from default
  - perform_pruning_combine_step
    * Documentation for method
* nodeFuncs.c
  - Missing 'break;'s to follow style
0004: Ok

Best regards, Jesper
Re: PATCH: Configurable file mode mask
Greetings Chapman, * Chapman Flack (c...@anastigmatix.net) wrote: > On 03/13/2018 10:40 AM, Stephen Frost wrote: > > ... Ultimately, the default which makes sense here isn't a > > one-size-fits-all but is system dependent and the administrator should > > be able to choose what permissions they want to have. > > Hear, hear. Returning for a moment again to > > https://www.postgresql.org/message-id/8b16cc30-0187-311e-04b5-1f32446d89dd%40anastigmatix.net > > we see that a stat() returning mode 0750 on a modern system may not > even mean there is any group access at all. In that example, the > datadir had these permissions: Yes, that's true, but PG has never paid attention to POSIX ACLs or Linux capabilities. Changing it to do so is quite a bit beyond the scope of this particular patch and I don't see anything in what this patch is doing which would preclude someone from putting in that effort in the future. > While PostgreSQL does its stat() and interprets the mode as if it is > still on a Unix box from the '80s, two things have changed underneath > it: POSIX ACLs and Linux capabilities. Capabilities take the place of > the former specialness of root, who now needs to be granted r-x > explicitly in the ACL to be able to read stuff there at all, and there > is clearly no access to group and no access to other. It would be hard > for anybody to call this an insecure configuration. But when you stat() > an object with a POSIX ACL, you get the 'mask' value where the 'group' > bits used to go, so postgres sees this as 0750, thinks the 5 represents > group access, and refuses to start. Purely a mistake. I'll point out that PG does run on quite a few other OS's beyond Linux. > It's the kind of mistake that is inherent in this sort of check, > which tries to draw conclusions from the semantics it assumes, while > systems evolve and semantics march along. One hypothetical fix would > be to add: > > #ifdef HAS_POSIX_ACLS > ... 
check if there's really an ACL here, and the S_IRWXG bits are > really just the mask, or even try to pass judgment on whether the > admin's chosen ACL is adequately secure ... > #endif > > but then sooner or later it will still end up making assumptions > that aren't true under, say, SELinux, so there's another #ifdef, > and where does it end? I agree with this general concern. > On 03/13/2018 11:00 AM, Tom Lane wrote: > > In a larger sense, this fails to explain the operating principle, > > namely what I stated above, that we allow the existing permissions > > on PGDATA to decide what we allow as group permissions. > > I admit I've only skimmed the patches, trying to determine what > that will mean in practice. In a case like the ACL example above, > does this mean that postgres will stat PGDATA, conclude incorrectly > that group access is granted, and then, based on that, actually go > granting unwanted group access for real on newly-created files > and directories? PG will stat PGDATA and conclude that the system is saying that group access has been granted on PGDATA and will do the same for subsequent files created later on. This is new in PG, so there isn't any concern about this causing problems in an existing environment- you couldn't have had those ACLs on an existing PGDATA dir in the first place, as you note above. The only issue that remains is that PG doesn't understand or work with POSIX ACLs or Linux capabilities, but that's not anything new. > On 03/13/2018 10:45 AM, David Steele wrote: > > As Stephen notes, this can be enforced by the user if they want to, > > and without much effort (and with better tools). > > To me, that seems really the key. 
An --allow-group-access option is > nice (but, as we see, misleading if its assumptions are outdated > regarding how the filesystem works), but I would plug even harder for > a --permission-transparency option, which would just assume that the > admin is making security arrangements, through mechanisms that > postgres may or may not even understand. The admin can create ACLs > with default entries that propagate to newly created objects. > SELinux contexts can work in similar ways. The admin controls the > umask inherited by the postmaster. A --permission-transparency option > would simply use open and mkdir in the traditional ways, allowing > the chosen umask, ACL defaults, SELinux contexts, etc., to do their > thing, and would avoid then stepping on the results with explicit > chmods (and of course skip the I-refuse-to-start-because-I- > misunderstand-your-setup check). > > It wouldn't be for every casual install, but it would be the best > way to stay out of the way of an admin securing a system with modern > facilities. > > A lot of the design effort put into debating what postgres itself > should or shouldn't insist on could then, perhaps, go into writing > postgres-specific configuration rule packages for some of those > better configuration-checking tools, and there it might even be > possible to write tests that incorporate knowledge of ACLs, SELinux, etc.
Re: Additional Statistics Hooks
Mat Arye writes: > An example, of a case a protransform type system would not be able to > optimize is mathematical operator expressions like bucketing integers by > decile --- (integer / 10) * 10. Uh, why not? An estimation function that is specific to integer divide shouldn't have much trouble figuring out that x/10 has one-tenth as many distinct values as x does. I'd certainly rather have that knowledge associated directly with int4div, and the corresponding knowledge about date_trunc associated with that function, and similar knowledge about extension-provided operators provided by the extensions, than try to maintain a hook function that embeds all such knowledge. > I also think that the point with extended statistics is a good one and > points to the need for more experimentation/experience which I think > a C hook is better suited for. Putting in a hook will allow extension > writers like us to experiment and figure out the kinds of transform on > statistics that are useful while having > a small footprint on the core. If you're experimenting you might as well just change the source code. A hook is only useful if you're trying to ship something for production, and I doubt that factorizing things this way is a credible production solution. regards, tom lane
Re: PATCH: Configurable file mode mask
On 03/13/2018 10:40 AM, Stephen Frost wrote: > * Michael Paquier (mich...@paquier.xyz) wrote: >> Well, one thing is that the current checks in the postmaster make sure >> that a data folder is never using anything else than 0700. From a >> security point of view, making it possible to allow a postmaster to >> start with 0750 is a step backwards ... > Lastly, the user *is* able to enforce the privileges on the data > directory if they wish to, using tools such as tripwire which are built > specifically to provide such checks and to do so regularly instead of > the extremely ad-hoc check provided by PG. > > ... Ultimately, the default which makes sense here isn't a > one-size-fits-all but is system dependent and the administrator should > be able to choose what permissions they want to have. Hear, hear. Returning for a moment again to https://www.postgresql.org/message-id/8b16cc30-0187-311e-04b5-1f32446d89dd%40anastigmatix.net we see that a stat() returning mode 0750 on a modern system may not even mean there is any group access at all. In that example, the datadir had these permissions: # getfacl . # file: . # owner: postgres # group: postgres user::rwx user:root:r-x group::--- mask::r-x other::--- While PostgreSQL does its stat() and interprets the mode as if it is still on a Unix box from the '80s, two things have changed underneath it: POSIX ACLs and Linux capabilities. Capabilities take the place of the former specialness of root, who now needs to be granted r-x explicitly in the ACL to be able to read stuff there at all, and there is clearly no access to group and no access to other. It would be hard for anybody to call this an insecure configuration. But when you stat() an object with a POSIX ACL, you get the 'mask' value where the 'group' bits used to go, so postgres sees this as 0750, thinks the 5 represents group access, and refuses to start. Purely a mistake. 
It's the kind of mistake that is inherent in this sort of check, which tries to draw conclusions from the semantics it assumes, while systems evolve and semantics march along. One hypothetical fix would be to add: #ifdef HAS_POSIX_ACLS ... check if there's really an ACL here, and the S_IRWXG bits are really just the mask, or even try to pass judgment on whether the admin's chosen ACL is adequately secure ... #endif but then sooner or later it will still end up making assumptions that aren't true under, say, SELinux, so there's another #ifdef, and where does it end? On 03/13/2018 11:00 AM, Tom Lane wrote: > In a larger sense, this fails to explain the operating principle, > namely what I stated above, that we allow the existing permissions > on PGDATA to decide what we allow as group permissions. I admit I've only skimmed the patches, trying to determine what that will mean in practice. In a case like the ACL example above, does this mean that postgres will stat PGDATA, conclude incorrectly that group access is granted, and then, based on that, actually go granting unwanted group access for real on newly-created files and directories? That doesn't seem ideal. On 03/13/2018 10:45 AM, David Steele wrote: > As Stephen notes, this can be enforced by the user if they want to, > and without much effort (and with better tools). To me, that seems really the key. An --allow-group-access option is nice (but, as we see, misleading if its assumptions are outdated regarding how the filesystem works), but I would plug even harder for a --permission-transparency option, which would just assume that the admin is making security arrangements, through mechanisms that postgres may or may not even understand. The admin can create ACLs with default entries that propagate to newly created objects. SELinux contexts can work in similar ways. The admin controls the umask inherited by the postmaster. 
A --permission-transparency option would simply use open and mkdir in the traditional ways, allowing the chosen umask, ACL defaults, SELinux contexts, etc., to do their thing, and would avoid then stepping on the results with explicit chmods (and of course skip the I-refuse-to-start-because-I-misunderstand-your-setup check). It wouldn't be for every casual install, but it would be the best way to stay out of the way of an admin securing a system with modern facilities. A lot of the design effort put into debating what postgres itself should or shouldn't insist on could then, perhaps, go into writing postgres-specific configuration rule packages for some of those better configuration-checking tools, and there it might even be possible to write tests that incorporate knowledge of ACLs, SELinux, etc. -Chap
Re: PATCH: Configurable file mode mask
David Steele writes: > On 3/12/18 3:28 AM, Michael Paquier wrote: >> In pg_rewind and pg_resetwal, isn't that also a portion which is not >> necessary without the group access feature? > These seem like a good idea to me with or without patch 03. Some of our > front-end tools (initdb, pg_upgrade) were setting umask and others > weren't. I think it's more consistent (and safer) if they all do, at > least if they are writing into PGDATA. +1 ... see a926eb84e for an example of how easy it is to screw up if the process's overall umask is permissive. regards, tom lane
Re: [WIP PATCH] Index scan offset optimisation using visibility map
Hello. Tom, thanks a lot for your thorough review. > What you've done to > IndexNext() is a complete disaster from a modularity standpoint: it now > knows all about the interactions between index_getnext, index_getnext_tid, > and index_fetch_heap. I was looking into the current IndexOnlyNext implementation as a starting point - and it knows about index_getnext_tid and index_fetch_heap already. At the same time I was trying to keep the patch non-invasive. Patched IndexNext now only knows about index_getnext_tid and index_fetch_heap - the same as IndexOnlyNext. But yes, it probably could be done better. > I'm not sure about a nicer way to refactor that, but I'd suggest that > maybe you need an additional function in indexam.c that hides all this > knowledge about the internal behavior of an IndexScanDesc. I'll try to split index_getnext into two functions. A new one will do everything index_getnext does except index_fetch_heap. > Or that is, it knows too much and still not enough, > because it's flat out wrong for the case that xs_continue_hot is set. > You can't call index_getnext_tid when that's still true from last time. Oh.. Yes, clear error here. > The PredicateLockPage call also troubles me quite a bit, not only from > a modularity standpoint but because that implies a somewhat-user-visible > behavioral change when this optimization activates. People who are using > serializable mode are not going to find it to be an improvement if their > queries fail and need retries a lot more often than they did before. > I don't know if that problem is bad enough that we should disable skipping > when serializable mode is active, but it's something to think about. Current IndexOnlyScan already does that. And I think a user should expect such a change in serializable mode.
> You haven't documented the behavior required for tuple-skipping in any > meaningful fashion, particularly not the expectation that the child plan > node will still return tuples that just need not contain any valid > content. Only nodeLimit could receive such tuples and they are immediately discarded. I'll add a comment about it. > is a gross hack and probably wrong. You could use ExecStoreAllNullTuple, > perhaps. Oh, nice, missed it. > I'm disturbed by the fact that you've not taught the planner about the > potential cost saving from this, so that it won't have any particular > reason to pick a regular indexscan over some other plan type when this > optimization applies. Maybe there's no practical way to do that, or maybe > it wouldn't really matter in practice; I've not looked into it. But not > doing anything feels like a hack. I was trying to do it. But the current planner architecture does not provide a nice way to achieve it, due to the way it handles limit and offset. So I think it is better to avoid this for now. > Setting this back to Waiting on Author. I'll try to make the required changes in a few days. Thanks.
Re: PATCH: Unlogged tables re-initialization tests
Dagfinn Ilmari Mannsåker wrote: > $SIG{__DIE__} gets called even for exceptions that would be caught by a > surrounding eval block, so this should at the very least be: > > $SIG{__DIE__} = sub { BAIL_OUT(@_) unless $^S }; > > However, that is still wrong, because die() and BAIL_OUT() mean > different things: die() aborts the current test script, while BAIL_OUT() > aborts the entire 'prove' run, i.e. all subsequent scripts in the same > test directory. Sounds like 'die' is the correct thing, then, and that BAIL_OUT should be called sparingly ... for example this one in PostgresNode::start seems like an overreaction: BAIL_OUT("node \"$name\" is already running") if defined $self->{_pid}; Yes? -- Álvaro Herrera  https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: PATCH: Exclude temp relations from base backup
Hi, On 2/28/18 10:55 AM, David Steele wrote: > This is a follow-up patch from the exclude unlogged relations discussion > [1]. > > The patch excludes temporary relations during a base backup using the > existing looks_like_temp_rel_name() function for identification. > > It shares code to identify database directories from [1], so for now > that has been duplicated in this patch to make it independent. I'll > rebase depending on what gets committed first. Updated the patch to change die() to BAIL_OUT() and use append_to_file() as suggested for another test patch [1]. Regards, -- -David da...@pgmasters.net [1] https://www.postgresql.org/message-id/6bc5d931-5b00-279f-f65a-26e32de400a6%40pgmasters.net From c83f3ae09bacb961c511ebe9cb1f9f79bf1e3d29 Mon Sep 17 00:00:00 2001 From: David Steele Date: Tue, 13 Mar 2018 12:22:24 -0400 Subject: [PATCH 1/1] Exclude temporary relations from base backup. Exclude temporary relations during a base backup using the existing looks_like_temp_rel_name() function for identification. It shares code to identify database directories with [1], so for now that has been duplicated in this patch to make it independent. I'll rebase depending on what gets committed first. [1] https://www.postgresql.org/message-id/04791bab-cb04-ba43-e9c0-664a4c1ffb2c%40pgmasters.net --- doc/src/sgml/protocol.sgml | 2 +- src/backend/replication/basebackup.c | 41 +++ src/backend/storage/file/fd.c| 3 +- src/bin/pg_basebackup/t/010_pg_basebackup.pl | 49 +++- src/include/storage/fd.h | 1 + 5 files changed, 92 insertions(+), 4 deletions(-) diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml index 4fd61d7c2d..629a462574 100644 --- a/doc/src/sgml/protocol.sgml +++ b/doc/src/sgml/protocol.sgml @@ -2565,7 +2565,7 @@ The commands accepted in replication mode are: Various temporary files and directories created during the operation of the PostgreSQL server, such as any file or directory beginning - with pgsql_tmp. + with pgsql_tmp and temporary relations. 
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c index 185f32a5f9..f0c3d13b2b 100644 --- a/src/backend/replication/basebackup.c +++ b/src/backend/replication/basebackup.c @@ -26,6 +26,7 @@ #include "nodes/pg_list.h" #include "pgtar.h" #include "pgstat.h" +#include "port.h" #include "postmaster/syslogger.h" #include "replication/basebackup.h" #include "replication/walsender.h" @@ -958,6 +959,36 @@ sendDir(const char *path, int basepathlen, bool sizeonly, List *tablespaces, char pathbuf[MAXPGPATH * 2]; struct stat statbuf; int64 size = 0; + const char *lastDir; /* Split last dir from parent path. */ + bool isDbDir = false; /* Does this directory contain relations? */ + + /* +* Determine if the current path is a database directory that can +* contain relations. +* +* Start by finding the location of the delimiter between the parent +* path and the current path. +*/ + lastDir = last_dir_separator(path); + + /* Does this path look like a database path (i.e. all digits)? */ + if (lastDir != NULL && + strspn(lastDir + 1, "0123456789") == strlen(lastDir + 1)) + { + /* Part of path that contains the parent directory. */ + int parentPathLen = lastDir - path; + + /* +* Mark path as a database directory if the parent path is either +* $PGDATA/base or a tablespace version path.
+*/ + if (strncmp(path, "./base", parentPathLen) == 0 || + (parentPathLen >= (sizeof(TABLESPACE_VERSION_DIRECTORY) - 1) && +strncmp(lastDir - (sizeof(TABLESPACE_VERSION_DIRECTORY) - 1), +TABLESPACE_VERSION_DIRECTORY, +sizeof(TABLESPACE_VERSION_DIRECTORY) - 1) == 0)) + isDbDir = true; + } dir = AllocateDir(path); while ((de = ReadDir(dir, path)) != NULL) @@ -1007,6 +1038,16 @@ sendDir(const char *path, int basepathlen, bool sizeonly, List *tablespaces, if (excludeFound) continue; + /* Exclude temporary relations */ + if (isDbDir && looks_like_temp_rel_name(de->d_name)) + { + elog(DEBUG2, +"temporary relation file \"%s\" excluded from backup", +de->d_name); + + continue; + } + snprintf(pathbuf, sizeof(pathbuf), "%s/%s", path, de->d_name);
Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)
On Mon, Mar 12, 2018 at 02:11:42PM +0100, Julian Markwort wrote: > > In same manner pg_stat_statements_good_plan_reset() and > > pg_stat_statements_bad_plan_reset() functions can be combined in > > function: > > > pg_stat_statements_plan_reset(in queryid bigint, in good boolean, in bad > > boolean) > > This is also sensible, however if any more kinds of plans would be added in > the future, there would be a lot of flags in this function. I think having > varying amounts of flags (between extension versions) as arguments to the > function also makes any upgrade paths difficult. Thinking more about this, we > could also use function overloading to avoid this. > An alternative would be to have the caller pass the type of plan he wants to > reset. Omitting the type would result in the deletion of all plans, maybe? > pg_stat_statements_plan_reset(in queryid bigint, in plan_type text) Yes, it is a good idea. But maybe instead of passing an empty string as the plan type we should overload the function with only a queryid argument. I think it is a common practice in PostgreSQL. Otherwise, allowing an empty string as the plan type may confuse users. So the function declarations may be the following: pg_stat_statements_plan_reset(in queryid bigint, in plan_type text) pg_stat_statements_plan_reset(in queryid bigint) + interquartile_dist = 2.0*(0.6745 * sqrt(e->counters.sum_var_time / e->counters.calls)); I think it would be better to have defines for the 2.0 and 0.6745 values for the sake of readability. -- Arthur Zakirov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Re: Additional Statistics Hooks
On Tue, Mar 13, 2018 at 6:31 AM, David Rowley wrote: > On 13 March 2018 at 11:44, Tom Lane wrote: > > While it would certainly be nice to have better behavior for that, > > "add a hook so users who can write C can fix it by hand" doesn't seem > > like a great solution. On top of the sheer difficulty of writing a > > hook function, you'd have the problem that no pre-written hook could > > know about all available functions. I think somehow we'd need a way > > to add per-function knowledge, perhaps roughly like the protransform > > feature. > I think this isn't either-or. I think a general hook can be useful for extensions that want to optimize particular data distributions/workloads using domain-knowledge about functions common for those workloads. That way users working with that data can use extensions to optimize workloads without writing C themselves. I also think a protransform like feature would add a lot of power to the native planner but this could take a while to get into core properly and may not handle all kinds of data distributions/cases. An example, of a case a protransform type system would not be able to optimize is mathematical operator expressions like bucketing integers by decile --- (integer / 10) * 10. This is somewhat analogous to date_trunc in the integer space and would also change the number of resulting distinct rows. > > I always imagined that extended statistics could be used for this. > Right now the estimates are much better when you create an index on > the function, but there's no real reason to limit the stats that are > gathered to just plain columns + expression indexes. > > I believe I'm not the only person to have considered this. Originally > extended statistics were named multivariate statistics. I think it was > Dean and I (maybe others too) that suggested to Tomas to give the > feature a more generic name so that it can be used for a more general > purpose later. 
> I also think that the point with extended statistics is a good one and points to the need for more experimentation/experience which I think a C hook is better suited for. Putting in a hook will allow extension writers like us to experiment and figure out the kinds of transform on statistics that are useful while having a small footprint on the core. I think designing a protransform-like system would benefit from more experience with the kinds of transformations that are useful. For example, can anything be done if the interval passed to date_trunc is not constant, or is it not even worth bothering with that case? Maybe extended statistics is a better approach, etc. > > -- > David Rowley http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: PATCH: Configurable file mode mask
Hi Michael, On 3/12/18 3:28 AM, Michael Paquier wrote: > On Fri, Mar 09, 2018 at 01:51:14PM -0500, David Steele wrote: >> How about a GUC that enforces one mode or the other on startup? Default >> would be 700. The GUC can be set automatically by initdb based on the >> -g option. We had this GUC originally, but since the front-end tools >> can't read it we abandoned it. Seems like it would be good as an >> enforcing mechanism, though. > > Hm. OK. I can see the whole set of points about that. Please let me > think a bit more about that bit. Do you think that there could be a > pool of users willing to switch from one mode to another? Compared to > your v1, we could indeed have a GUC which enforces a restriction to not > allow group access, and enabled by default. As the commit fest is > running and we don't have a clear picture yet, I am afraid that it may > be better to move that to v12, and focus on getting patches 1 and 2 > committed. This will provide a good base for the next move. > > There are three places where things are still not correct: > > - if (chmod(location, S_IRWXU) != 0) > + current_umask = umask(0); > + umask(current_umask); > + > + if (chmod(location, PG_DIR_MODE_DEFAULT & ~current_umask) != 0) > This is in tablespace.c. I have moved this hunk to 03 and used only PG_DIR_MODE_DEFAULT instead. > @@ -185,6 +186,9 @@ main(int argc, char **argv) > exit(1); > } > > + /* Set dir/file mode mask */ > + umask(PG_MODE_MASK_DEFAULT); > + > In pg_rewind and pg_resetwal, isn't that also a portion which is not > necessary without the group access feature? These seem like a good idea to me with or without patch 03. Some of our front-end tools (initdb, pg_upgrade) were setting umask and others weren't. I think it's more consistent (and safer) if they all do, at least if they are writing into PGDATA. > This is all I have basically for patch 2, which would be good for > shipping. Thanks! 
I'll attach new patches in a reply to [1] once I have made the changes Tom requested. -- -David da...@pgmasters.net [1] https://www.postgresql.org/message-id/22928.1520953220%40sss.pgh.pa.us signature.asc Description: OpenPGP digital signature
Re: PATCH: Configurable file mode mask
Greetings, * David Steele (da...@pgmasters.net) wrote: > On 3/13/18 11:00 AM, Tom Lane wrote: > > Stephen Frost writes: > >> * Michael Paquier (mich...@paquier.xyz) wrote: > >>> If the problem is parsing, it could as well be more portable to put that > >>> in the control file, no? > > > >> Then we'd need a tool to allow changing it for people who do want to > >> change it. There's no reason we should have to have an extra tool for > >> this- an administrator who chooses to change the privileges on the data > >> folder should be able to do so and I don't think anyone is going to > >> thank us for requiring them to use some special tool to do so for > >> PostgreSQL. > > > > FWIW, I took a quick look through this patch and don't have any problem > > with the approach, which appears to be "use the data directory's > > startup-time permissions as the status indicator". I am kinda wondering > > about this though: > > > > +(These files can confuse pg_ctl.) If group > > read > > +access is enabled on the data directory and an unprivileged user in the > > +PostgreSQL group is performing the backup, > > then > > +postmaster.pid will not be readable and must be > > +excluded. > > > > If we're allowing group read on the data directory, why should that not > > include postmaster.pid? There's nothing terribly security-relevant in > > there, and being able to find out the postmaster PID seems useful. I can > > see the point of restricting server key files, as documented elsewhere, > > but it's possible to keep those somewhere else so they don't cause > > problems for simple backup software. > > I'm OK with that. I had a look at the warnings regarding the required > mode of postmaster.pid in miscinit.c (889-911) and it seems to me they > still hold true if the mode is 640 instead of 600. > > Do you agree, Tom? Stephen? > > If so, I'll make those changes. 
I agree that we can still consider an EPERM-error case as being ok even with the changes proposed, and with the same-user check happening earlier in checkDataDir(), we won't even get to the point of looking at the pid file if the userid's don't match. The historical comment about the old datadir permissions can likely just be removed, perhaps replaced with a bit more commentary above that check in checkDataDir(). The open() call should also fail if we only have group-read privileges on the file (0640), but surely the kill() will in any case. Thanks! Stephen signature.asc Description: PGP signature
Re: Google Summer of Code: Potential Applicant
Greetings, * Aleksander Alekseev (a.aleks...@postgrespro.ru) wrote: > Craig, I believe you probably did something wrong if you had to work > with some library directly. Actually you generate classes from text > description and just use them. I worked with Thrift some time ago, in > 2015 [1]. I wouldn't call it awkward. Protobuf is fine too, but > unfortunately we don't have any Protobuf-related projects this time. Just to be clear, the list on the wiki is just a set of suggestions; students are welcome to propose their own projects as well. > Also it's probably worth noticing that the GSoC project doesn't imply > using any existing libraries, only the binary format which is quite > stable. A student proposal should really include information about what other libraries, if any, are being considered for the project, as that will play into the consideration of whether it's something we would be interested in including in PG or not. Thanks! Stephen
Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
David Gould writes: > I have attached the patch we are currently using. It applies to 9.6.8. I > have versions for older releases in 9.4, 9.5, 9.6. It fails to apply to 10, > and presumably head, but I can update it if there is any interest. > The patch has three main features: > - Impose an ordering on the autovacuum workers worklist to avoid > the need for rechecking statistics to skip already vacuumed tables. > - Reduce the frequency of statistics refreshes > - Remove the AutovacuumScheduleLock As per the earlier thread, the first aspect of that needs more work to not get stuck when the worklist has long tasks near the end. I don't think you're going to get away with ignoring that concern. Perhaps we could sort the worklist by decreasing table size? That's not an infallible guide to the amount of time that a worker will need to spend, but it's sure safer than sorting by OID. Alternatively, if we decrease the frequency of stats refreshes, how much do we even need to worry about reordering the worklist? In any case, I doubt anyone will have any appetite for back-patching such a change. I'd recommend that you clean up your patch and rebase to HEAD, then submit it into the September commitfest (either on a new thread or a continuation of the old #13750 thread, not this one). regards, tom lane
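The ordering Tom suggests (largest tables first, rather than OID order) can be sketched with a plain comparator; the struct and field names below are hypothetical stand-ins for illustration, not the actual autovacuum data structures:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Sketch: sort the autovacuum worklist by decreasing table size so that
 * large tables, which likely need the most work, are picked up first.
 * "WorklistItem" and "relpages" are illustrative names only.
 */
typedef struct WorklistItem
{
	unsigned int reloid;
	long		relpages;		/* stand-in for table size */
} WorklistItem;

static int
worklist_cmp(const void *a, const void *b)
{
	const WorklistItem *wa = (const WorklistItem *) a;
	const WorklistItem *wb = (const WorklistItem *) b;

	/* larger tables sort first */
	if (wa->relpages > wb->relpages)
		return -1;
	if (wa->relpages < wb->relpages)
		return 1;
	return 0;
}
```

The worklist would then be ordered with `qsort(list, n, sizeof(WorklistItem), worklist_cmp)` before workers start consuming it.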
Re: Additional Statistics Hooks
On Tue, Mar 13, 2018 at 6:56 AM, Ashutosh Bapat < ashutosh.ba...@enterprisedb.com> wrote: > On Tue, Mar 13, 2018 at 4:14 AM, Tom Lane wrote: > > Mat Arye writes: > >> So the use-case is an analytical query like > > > >> SELECT date_trunc('hour', time) AS MetricMinuteTs, AVG(value) as avg > >> FROM hyper > >> WHERE time >= '2001-01-04T00:00:00' AND time <= '2001-01-05T01:00:00' > >> GROUP BY MetricMinuteTs > >> ORDER BY MetricMinuteTs DESC; > > > >> Right now this query will choose a much-less-efficient GroupAggregate > plan > >> instead of a HashAggregate. It will choose this because it thinks the > >> number of groups > >> produced here is 9,000,000 because that's the number of distinct time > >> values there are. > >> But, because date_trunc "buckets" the values there will be about 24 > groups > >> (1 for each hour). > > > > While it would certainly be nice to have better behavior for that, > > "add a hook so users who can write C can fix it by hand" doesn't seem > > like a great solution. On top of the sheer difficulty of writing a > > hook function, you'd have the problem that no pre-written hook could > > know about all available functions. I think somehow we'd need a way > > to add per-function knowledge, perhaps roughly like the protransform > > feature. > > Like cost associated with a function, we may associate mapping > cardinality with a function. It tells how many distinct input values > map to 1 output value. By input value, I mean input argument tuple. In > Mat's case the mapping cardinality will be 12. The number of distinct > values that function may output is estimated as number of estimated > rows / mapping cardinality of that function. > I think this is complicated by the fact that the mapping cardinality is not a constant per function but depends on the constant given as the first argument to the function and the granularity of the underlying data (do you have a second-granularity or microsecond granularity). 
I actually think the logic for the estimate here should be (max(time) - min(time)) / interval. I think that to be fully general you need to allow functions on statistics to determine the estimate. > > -- > Best Wishes, > Ashutosh Bapat > EnterpriseDB Corporation > The Postgres Database Company >
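A minimal sketch of the range-based group estimate proposed above, with illustrative names and timestamps expressed in seconds; this is an assumption about how such an estimator could be shaped, not planner code:

```c
#include <assert.h>

/*
 * Sketch: instead of assuming one group per distinct input value, derive
 * the expected number of buckets for a bucketing function (such as
 * date_trunc) from the column's observed range.  The result is clamped to
 * [1, ndistinct] since bucketing can never produce more groups than there
 * are distinct inputs.
 */
static double
estimate_bucket_groups(double min_ts, double max_ts,
					   double bucket_width, double ndistinct)
{
	double		groups;

	if (bucket_width <= 0)
		return ndistinct;		/* cannot bucket; fall back */

	groups = (max_ts - min_ts) / bucket_width + 1.0;

	/* clamp: at least one group, never more than the distinct inputs */
	if (groups < 1.0)
		groups = 1.0;
	if (groups > ndistinct)
		groups = ndistinct;
	return groups;
}
```

For the query in the thread (a roughly one-day range bucketed by hour), this yields about 25 groups instead of 9,000,000.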
Re: PATCH: Configurable file mode mask
On 3/13/18 11:00 AM, Tom Lane wrote: > Stephen Frost writes: >> * Michael Paquier (mich...@paquier.xyz) wrote: >>> If the problem is parsing, it could as well be more portable to put that >>> in the control file, no? > >> Then we'd need a tool to allow changing it for people who do want to >> change it. There's no reason we should have to have an extra tool for >> this- an administrator who chooses to change the privileges on the data >> folder should be able to do so and I don't think anyone is going to >> thank us for requiring them to use some special tool to do so for >> PostgreSQL. > > FWIW, I took a quick look through this patch and don't have any problem > with the approach, which appears to be "use the data directory's > startup-time permissions as the status indicator". I am kinda wondering > about this though: > > +(These files can confuse pg_ctl.) If group > read > +access is enabled on the data directory and an unprivileged user in the > +PostgreSQL group is performing the backup, > then > +postmaster.pid will not be readable and must be > +excluded. > > If we're allowing group read on the data directory, why should that not > include postmaster.pid? There's nothing terribly security-relevant in > there, and being able to find out the postmaster PID seems useful. I can > see the point of restricting server key files, as documented elsewhere, > but it's possible to keep those somewhere else so they don't cause > problems for simple backup software. I'm OK with that. I had a look at the warnings regarding the required mode of postmaster.pid in miscinit.c (889-911) and it seems to me they still hold true if the mode is 640 instead of 600. Do you agree, Tom? Stephen? If so, I'll make those changes. > Also, in general, this patch desperately needs a round of copy-editing, > particularly with attention to the comments. 
For instance, there are a > minimum of three grammatical errors in this one comment: > > + * Group read/execute may optional be enabled on PGDATA so any frontend tools > + * That write into PGDATA must know what mask to set and the permissions to > + * use for creating files and directories. > In a larger sense, this fails to explain the operating principle, > namely what I stated above, that we allow the existing permissions > on PGDATA to decide what we allow as group permissions. If you > don't already understand that, you're going to have a hard time > extracting it from either file_perm.h or file_perm.c. (The latter > fails to even explain what the argument of DataDirectoryMask is.) I'll do comment/doc review for the next round of patches. Thanks, -- -David da...@pgmasters.net
Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
David Gould writes: > I also thought about the theory and am confident that there really is no way > to trick it. Basically if there are enough pages that are different to affect > the overall density, say 10% empty or so, there is no way a random sample > larger than a few hundred probes can miss them no matter how big the table is. > If there are few enough pages to "hide" from the sample, then they are so few > they don't matter anyway. > After all this my vote is for back patching too. I don't see any case where > the patched analyze is or could be worse than what we are doing. I'm happy to > provide my test cases if anyone is interested. Yeah, you have a point. I'm still worried about unexpected side-effects, but it seems like overall this is very unlikely to hurt anyone. I'll back-patch (minus the removal of the unneeded vac_estimate_reltuples argument). regards, tom lane
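The density extrapolation being defended here can be sketched as follows; this is an illustrative simplification of the idea, not the actual ANALYZE or vac_estimate_reltuples code:

```c
#include <assert.h>

/*
 * Sketch: treat the tuple density observed in ANALYZE's random page
 * sample as representative of the whole table and extrapolate it across
 * all pages, rather than blending it with the previous reltuples value.
 * Per the argument above, a sample of a few hundred pages cannot miss a
 * density skew large enough to matter.
 */
static double
analyze_estimate_reltuples(double total_pages,
						   double scanned_pages,
						   double scanned_tuples)
{
	if (scanned_pages <= 0)
		return 0;
	/* tuples per sampled page, extrapolated over every page */
	return (scanned_tuples / scanned_pages) * total_pages;
}
```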
Re: committing inside cursor loop
On 3/6/18 07:48, Ildus Kurbangaliev wrote: > Although as was discussed before it seems inconsistent without ROLLBACK > support. There was a little discussion about it, but no replies. Maybe > the status of the patch should be changed to 'Waiting on author' until > the end of discussion. I'm wondering what the semantics of it should be. For example, consider this: drop table test1; create table test1 (a int, b int); insert into test1 values (1, 11), (2, 22), (3, 33); do language plpgsql $$ declare x int; begin for x in update test1 set b = b + 1 where b > 20 returning a loop raise info 'x = %', x; if x = 2 then rollback; end if; end loop; end; $$; The ROLLBACK call in the first loop iteration undoes the UPDATE command that drives the loop. Is it then sensible to continue the loop? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Another oddity in handling of WCO constraints in postgres_fdw
Greetings, * Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote: > On Mon, Mar 12, 2018 at 1:25 PM, Etsuro Fujita > wrote: > > (2018/03/09 20:55), Etsuro Fujita wrote: > >> (2018/03/08 14:24), Ashutosh Bapat wrote: > >>> For local constraints to be enforced, we use remote > >>> constraints. For local WCO we need to use remote WCO. That means we > >>> create many foreign tables pointing to same local table on the foreign > >>> server through many views, but it's not impossible. > >> > >> Maybe I don't understand this correctly, but I guess that it would be > >> the user's responsibility to not create foreign tables in such a way. > > > > I think I misunderstood your words. Sorry for that. I think what you > > proposed would be a solution for this issue, but I'm not sure that's a good > > one because that wouldn't work for the data sources that don't support views > > with WCO options. > > Our solution for the constraints doesn't work with the data sources > (like flat files) which don't support constraints. So, that argument > doesn't help. It would really help to have some examples of exactly what is being proposed here wrt solutions. WCO is defined at a view level, so I'm not following the notion that we're going to depend on something remote to enforce the WCO when the remote object is just a regular table that you can't define a WCO on top of. That's not the case when we're talking about foreign tables vs. local tables, so it's not the same. I certainly don't think we should require a remote view to exist to perform the WCO check. If the remote WCO is a view itself then I would expect it to operate in the same manner as WCO on local views does- you can have them defined as being cascaded or not. In other words, there is no case where we have a "foreign view." Views are always local. 
A "foreign table" could actually be a view on the remote side, in which case we still treat it as a table in the local database, but WCO doesn't come up in that case at all; there's no way to define WCO on a table, foreign or not. If WCO is defined on the view on the remote server, then it should operate properly and not require anything from the local side. Thanks! Stephen
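For reference, the point being made above is that WITH CHECK OPTION attaches only to views, never to tables; a minimal illustration:

```sql
-- WITH CHECK OPTION can only be attached to a view, never a table
-- (foreign or local).
CREATE TABLE accounts (id int, balance numeric);

CREATE VIEW positive_accounts AS
    SELECT id, balance FROM accounts WHERE balance > 0
    WITH CASCADED CHECK OPTION;

-- Allowed: the new row remains visible through the view.
INSERT INTO positive_accounts VALUES (1, 100);

-- Rejected with a check-option violation: the new row would not be
-- visible through the view.
INSERT INTO positive_accounts VALUES (2, -5);
```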
Re: Google Summer of Code: Potential Applicant
Thanks, Aleksander! > On 13 March 2018, at 19:03, Aleksander Alekseev > wrote: > >> Do you have any project related insights as to what I should put in >> there? > Christos, as far as I remember, a good proposal must have a schedule, implementation details, and deliverables. It is also good to show that you are capable of executing the project, for example by mentioning your previous projects (commercial, educational, or pet projects), achievements, etc. GSoC typically has 3 milestones, and usually each milestone must produce some viable result. There are exact dates, but here I'll put a sketch. Algorithms: June - implement introsort and timsort; July - design benchmarks, implement some other hashtables; August - if benchmarks are successful, then propose a patch to the commitfest and review other patches from the commitfest, else implement more algorithms. amcheck: June - implement checks for GIN (like b-tree in b-tree, resembles existing amcheck); July - checks for Hash, BRIN and SP-GiST; August - RUM, patch, commitfest, reviews. Best regards, Andrey Borodin.
Re: PATCH: Configurable file mode mask
Stephen Frost writes: > * Michael Paquier (mich...@paquier.xyz) wrote: >> If the problem is parsing, it could as well be more portable to put that >> in the control file, no? > Then we'd need a tool to allow changing it for people who do want to > change it. There's no reason we should have to have an extra tool for > this- an administrator who chooses to change the privileges on the data > folder should be able to do so and I don't think anyone is going to > thank us for requiring them to use some special tool to do so for > PostgreSQL. FWIW, I took a quick look through this patch and don't have any problem with the approach, which appears to be "use the data directory's startup-time permissions as the status indicator". I am kinda wondering about this though: +(These files can confuse pg_ctl.) If group read +access is enabled on the data directory and an unprivileged user in the +PostgreSQL group is performing the backup, then +postmaster.pid will not be readable and must be +excluded. If we're allowing group read on the data directory, why should that not include postmaster.pid? There's nothing terribly security-relevant in there, and being able to find out the postmaster PID seems useful. I can see the point of restricting server key files, as documented elsewhere, but it's possible to keep those somewhere else so they don't cause problems for simple backup software. Also, in general, this patch desperately needs a round of copy-editing, particularly with attention to the comments. For instance, there are a minimum of three grammatical errors in this one comment: + * Group read/execute may optional be enabled on PGDATA so any frontend tools + * That write into PGDATA must know what mask to set and the permissions to + * use for creating files and directories. In a larger sense, this fails to explain the operating principle, namely what I stated above, that we allow the existing permissions on PGDATA to decide what we allow as group permissions. 
If you don't already understand that, you're going to have a hard time extracting it from either file_perm.h or file_perm.c. (The latter fails to even explain what the argument of DataDirectoryMask is.) regards, tom lane