Re: [HACKERS] Indirect indexes

2016-12-28 Thread Alvaro Herrera
Here's v3.  This one applies on top of the "interesting-attrs" patch I
sent a few hours ago:
https://www.postgresql.org/message-id/20161228232018.4hc66ndrzpz4g4wn@alvherre.pgsql
and contains a number of bugfixes that I discovered on my own (and which
therefore require no further explanation, since evidently no-one has
reviewed the patch yet).

One exception: in htup_details.h I changed the definition of
HeapTupleHeaderIsHeapOnly() so that it returns a true boolean rather
than a random bit in a wider integer.  Without this change, my compiler
complains when the result is passed as a bool argument into a function.
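
For illustration, the kind of change meant here is roughly the following (a sketch only, using the real HEAP_ONLY_TUPLE flag but not necessarily matching the patch hunk character for character):

/* before: the test yields the raw flag bit inside a wider integer */
#define HeapTupleHeaderIsHeapOnly(tup) \
( \
	(tup)->t_infomask2 & HEAP_ONLY_TUPLE \
)

/* after: coerce the result to a true boolean, safe to pass as a bool */
#define HeapTupleHeaderIsHeapOnly(tup) \
( \
	((tup)->t_infomask2 & HEAP_ONLY_TUPLE) != 0 \
)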

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 1b45a4c..9f899c7 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -92,6 +92,7 @@ brinhandler(PG_FUNCTION_ARGS)
amroutine->amstorage = true;
amroutine->amclusterable = false;
amroutine->ampredlocks = false;
+   amroutine->amcanindirect = false;
amroutine->amkeytype = InvalidOid;
 
amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index f07eedc..1bc91d2 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -49,6 +49,7 @@ ginhandler(PG_FUNCTION_ARGS)
amroutine->amstorage = true;
amroutine->amclusterable = false;
amroutine->ampredlocks = false;
+   amroutine->amcanindirect = false;
amroutine->amkeytype = InvalidOid;
 
amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index b8aa9bc..4ec34d5 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -69,6 +69,7 @@ gisthandler(PG_FUNCTION_ARGS)
amroutine->amstorage = true;
amroutine->amclusterable = true;
amroutine->ampredlocks = false;
+   amroutine->amcanindirect = false;
amroutine->amkeytype = InvalidOid;
 
amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 1fa087a..a2cf278 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -66,6 +66,7 @@ hashhandler(PG_FUNCTION_ARGS)
amroutine->amstorage = false;
amroutine->amclusterable = false;
amroutine->ampredlocks = false;
+   amroutine->amcanindirect = false;
amroutine->amkeytype = INT4OID;
 
amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 1dbefda..71d5fbc 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -3411,6 +3411,8 @@ simple_heap_delete(Relation relation, ItemPointer tid)
  * crosscheck - if not InvalidSnapshot, also check old tuple against this
  * wait - true if should wait for any conflicting update to commit/abort
  * hufd - output parameter, filled in failure cases (see below)
+ * unchanged_ind_cols - output parameter; bits set for unmodified columns
+ * that are indexed by indirect indexes
  * lockmode - output parameter, filled with lock mode acquired on tuple
  *
  * Normal, successful return value is HeapTupleMayBeUpdated, which
@@ -3433,13 +3435,15 @@ simple_heap_delete(Relation relation, ItemPointer tid)
 HTSU_Result
 heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
-   HeapUpdateFailureData *hufd, LockTupleMode *lockmode)
+   HeapUpdateFailureData *hufd, Bitmapset **unchanged_ind_cols,
+   LockTupleMode *lockmode)
 {
HTSU_Result result;
TransactionId xid = GetCurrentTransactionId();
Bitmapset  *hot_attrs;
Bitmapset  *key_attrs;
Bitmapset  *id_attrs;
+   Bitmapset  *indirect_attrs;
Bitmapset  *interesting_attrs;
Bitmapset  *modified_attrs;
ItemId  lp;
@@ -3496,14 +3500,16 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
 	 * Note that we get a copy here, so we need not worry about relcache flush
 	 * happening midway through.
 	 */
-	hot_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_ALL);
+	hot_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_HOT);
 	key_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_KEY);
 	id_attrs = RelationGetIndexAttrBitmap(relation,
 										  INDEX_ATTR_BITMAP_IDENTITY_KEY);
+	indirect_attrs = RelationGetIndexAttrBitmap(relation,
+												INDEX_ATT

Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-12-28 Thread Dilip Kumar
On Fri, Dec 23, 2016 at 8:28 AM, Amit Kapila  wrote:
> The results look positive.  Do you think we can conclude based on all
> the tests you and Dilip have done, that we can move forward with this
> patch (in particular group-update) or do you still want to do more
> tests?   I am aware that in one of the tests we have observed that
> reducing contention on CLOGControlLock has increased the contention on
> WALWriteLock, but I feel we can leave that point as a note to
> committer and let him take a final call.  From the code perspective
> already Robert and Andres have taken one pass of review and I have
> addressed all their comments, so surely more review of code can help,
> but I think that is not a big deal considering patch size is
> relatively small.

I have done one more pass of the review today. I have few comments.

+ if (nextidx != INVALID_PGPROCNO)
+ {
+ /* Sleep until the leader updates our XID status. */
+ for (;;)
+ {
+ /* acts as a read barrier */
+ PGSemaphoreLock(&proc->sem);
+ if (!proc->clogGroupMember)
+ break;
+ extraWaits++;
+ }
+
+ Assert(pg_atomic_read_u32(&proc->clogGroupNext) == INVALID_PGPROCNO);
+
+ /* Fix semaphore count for any absorbed wakeups */
+ while (extraWaits-- > 0)
+ PGSemaphoreUnlock(&proc->sem);
+ return true;
+ }

1. extraWaits is used only locally in this block, so I guess we can
declare it inside this block.

2. It seems that we have missed one unlock in the case of absorbed
wakeups. You have initialised extraWaits with -1, so if there is one
extra wakeup then extraWaits will become 0 (meaning we have made one
extra call to PGSemaphoreLock, and it's our responsibility to fix that
up, as the leader will Unlock only once). But it appears that in such a
case we will not make any call to PGSemaphoreUnlock. Am I missing something?
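
To spell out the accounting (a sketch based on the snippet quoted above, not
the actual patch): every PGSemaphoreLock() call that returns while
clogGroupMember is still set has absorbed a wakeup meant for something else,
and each of those must be paid back with exactly one PGSemaphoreUnlock().

	int			extraWaits = 0;	/* absorbed wakeups; starting at -1 pays back one too few */

	for (;;)
	{
		/* acts as a read barrier */
		PGSemaphoreLock(&proc->sem);
		if (!proc->clogGroupMember)
			break;				/* this wakeup was the leader's; we are done */
		extraWaits++;			/* not ours; remember to give it back */
	}

	/* Fix semaphore count for any absorbed wakeups */
	while (extraWaits-- > 0)
		PGSemaphoreUnlock(&proc->sem);

With N absorbed wakeups the counter has to end at N so that exactly N unlocks
are issued.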



-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Reporting planning time with EXPLAIN

2016-12-28 Thread Ashutosh Bapat
On Wed, Dec 28, 2016 at 10:55 PM, Stephen Frost  wrote:
> Tom,
>
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Stephen Frost  writes:
>> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> >> I think it's an awful choice of name; it has nothing to do with either
>> >> the functionality or the printed name of the field.
>>
>> > As an example, we might some day wish to include a summary of buffer
>> > information at the bottom too when 'buffers' is used.  The proposed
>> > 'summary' option would cover that nicely, but 'timing' wouldn't.  That's
>> > actually why I was thinking summary might be a good option to have.
>>
>> What, would this option then turn off the total-time displays by default?
>
> To retain our current mixed behavior with explain vs. explain-analyze,
> we'd have to say it defaults to off for explain and on with analyze.  I
> don't particularly like that, and would rather we just default it to on,
> but that would mean adjusting the regression tests.
>
>> I do not see that being a reasonable thing to do.  Basically, you're
>> taking what seems like a very general-purpose option name and nailing
>> it down to mean "print planning time".  You aren't going to be able
>> to change that later.
>
> No, that's not what I was suggesting to do and I disagree that we
> couldn't ever change it later.  If we want it to mean "print planning
> time" and only ever that then I agree that calling it "summary" isn't a
> good option.

No, that wasn't my intention either. I have clarified it in my mail.

>
>> > No, but consider how the docs for the current 'timing' option would have
>> > to be rewritten.
>>
>> Well, sure, they'd have to be rewritten, but I think this definition
>> would actually be more orthogonal.
>
> This definition would have two completely different meanings- one for
> when analyze is used, and one for when it isn't.
>
>> > We would also have to say something like "the default when not using
>> > 'analyze' is off, but with 'analyze' the default is on" which seems
>> > pretty grotty to me.
>>
>> But the default for TIMING already does depend on ANALYZE.
>
> I would argue that timing can only actually be used with analyze today,
> which makes sense when you consider that timing is about enabling or
> disabling per-node timing information.  Redefining it to mean something
> else isn't particularly different from redefining 'summary' later to
> mean something else.
>
>> > Then again, from a *user's* perspective, it should just be included by
>> > default.
>>
>> Actually, the reason it hasn't gotten included is probably that the
>> use-case for it is very small.  If you just do psql \timing on an
>> EXPLAIN, you get something close enough to the planning time.  I don't
>> mind adding this as an option, but claiming that it's so essential
>> that it should be there by default is silly.  People would have asked
>> for it years ago if it were all that important.
>
> I don't buy this argument.  Planning time is (hopefully, anyway...) a
> rather small amount of time which means that the actual results from
> \timing (or, worse, the timing info from other tools like pgAdmin) is
> quite far off.  On a local instance with a simple plan, you can get an
> order-of-magnitude difference between psql's \timing output and the
> actual planning time, throw in a few or even 10s of ms of network
> latency and you might as well forget about trying to figure out what
> the planning time actually is.

+1. On my machine

regression=# \timing
Timing is on.
regression=# explain select * from pg_class c, pg_type t where
c.reltype = t.oid;

[...] clipped plan

Time: 1.202 ms
regression=# \timing
Timing is off.
regression=# explain analyze select * from pg_class c, pg_type t where
c.reltype = t.oid;

[...] clipped plan

 Planning time: 0.332 ms
 Execution time: 1.670 ms
(8 rows)

\timing output is way off from the actual planning time.

Take another example

regression=# explain analyze select * from pg_class c, pg_type t,
pg_inherits i where c.reltype = t.oid and i.inhparent = c.oid;
QUERY PLAN
[ ... ] clipped plan

 Planning time: 0.592 ms
 Execution time: 2.294 ms
(13 rows)

regression=# \timing
Timing is on.
regression=# explain select * from pg_class c, pg_type t, pg_inherits
i where c.reltype = t.oid and i.inhparent = c.oid;
[...] clipped plan

Time: 1.831 ms

The planning time has almost doubled, but what \timing reported has
only grown by approximately 50%.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Reporting planning time with EXPLAIN

2016-12-28 Thread Ashutosh Bapat
On Wed, Dec 28, 2016 at 10:30 PM, Tom Lane  wrote:
> Stephen Frost  writes:
>> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>>> I think it's an awful choice of name; it has nothing to do with either
>>> the functionality or the printed name of the field.
>
>> As an example, we might some day wish to include a summary of buffer
>> information at the bottom too when 'buffers' is used.  The proposed
>> 'summary' option would cover that nicely, but 'timing' wouldn't.  That's
>> actually why I was thinking summary might be a good option to have.
>
> What, would this option then turn off the total-time displays by default?
> I do not see that being a reasonable thing to do.  Basically, you're
> taking what seems like a very general-purpose option name and nailing
> it down to mean "print planning time".  You aren't going to be able
> to change that later.

I don't intend to use "summary" to print only planning time. As
Stephen has pointed out in his mail, it can be expanded later to
include other things. But I guess the documentation changes I included
in the patch are the reason behind your objection.


+SUMMARY
+
+ 
+  Include planning time, except when used with EXECUTE.
+  Since EXPLAIN EXECUTE displays plan for a prepared
+  query, i.e. a query whose plan is already created, the planning time is
+  not available when EXPLAIN EXECUTE is executed.
+  It defaults to FALSE.
+ 
+
+   
+

I think I did a bad job there. Sorry for that. I think we should
reword the paragraph as

 "Include summary of planning and execution of query. When used
without ANALYZE it prints planning time. When used
with ANALYZE, this option is considered to be
TRUE overriding user specified value and prints
execution time. Since EXPLAIN EXECUTE displays plan
for a prepared query, i.e. a query whose plan is already created, the
planning time is
 not available when EXPLAIN EXECUTE is executed. It
defaults to FALSE."

We can add more things to this later.

I am not very happy with the sentence explaining ANALYZE, but that's
how it is today. We can change that. With ANALYZE, SUMMARY is ON if
the user doesn't specify SUMMARY. But if the user specifies SUMMARY OFF
with ANALYZE, we won't print execution and planning time; that is a
conscious decision by the user not to print those things. That would
make the documentation straightforward.

I am not so happy with EXPLAIN EXECUTE either, but it would be better
to clarify the situation. Or we can print planning time as 0 for
EXPLAIN EXECUTE. We can do better there as well. We can print planning
time if the cached plan was invalidated and required planning,
otherwise print 0. That would be a helpful diagnostic.

I do think that there is some merit in reporting planning time as a
whole, just like execution time. Planning time is usually so small that
users don't care about how it's split across the various phases of
planning. But with more and more complex queries and more and more
planning techniques, it becomes essential to know module-wise (join
planner, subquery planner, upper planner) timings. Developers would
certainly like that, and advanced users who try to control the
optimizer might find it helpful too. In that case, the total planning
time becomes a "summary": TIMING would control reporting the granular
planning times and SUMMARY would control reporting the overall planning
time. I don't intend to add granular timings right now, and that wasn't
something I was thinking of while writing this patch.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] pg_stat_activity.waiting_start

2016-12-28 Thread Robert Haas
On Wed, Dec 28, 2016 at 1:06 PM, Tom Lane  wrote:
> Jim Nasby  writes:
>> On 12/28/16 11:25 AM, Tom Lane wrote:
>>> The idea of just capturing the wait start for heavyweight locks, and
>>> not other lock types, still seems superior to any of the alternatives
>>> that have been suggested ...
>
>> Is some kind of alarm a viable option for the others? If setting the
>> alarm is cheap,
>
> Setting an alarm is certain to require a gettimeofday and/or a kernel
> call.  It is far from cheap.

If one were OK with a really-really-rough value for when the wait
started, you could have a slot for a timestamp in PGPROC that is
cleared by pgstat_report_wait_end() but not set by
pgstat_report_wait_start().  Then you could have a background process
that goes through and sets it for all processes advertising a wait
event every second, or every 10 seconds, or every half second, or
whatever you want.  Of course this doesn't eliminate the overhead but
it pushes it into the background which, if there are idle CPUs, is
almost as good.
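
For concreteness, a rough sketch of that background stamping
(wait_start_stamp is a hypothetical PGPROC field and SampleWaitStarts a
made-up function; none of this exists in the tree):

	/* pgstat_report_wait_end() would clear the stamp along with the event:
	 *		MyProc->wait_start_stamp = 0;
	 */

	/* run by a background process once per sampling interval */
	static void
	SampleWaitStarts(void)
	{
		TimestampTz	now = GetCurrentTimestamp();
		int			i;

		for (i = 0; i < (int) ProcGlobal->allProcCount; i++)
		{
			PGPROC	   *proc = &ProcGlobal->allProcs[i];

			/* stamp only backends advertising a wait that isn't stamped yet */
			if (proc->wait_event_info != 0 && proc->wait_start_stamp == 0)
				proc->wait_start_stamp = now;
		}
	}

The reported start time would then only be accurate to within the sampling
interval, which is exactly the trade-off described above.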

I previously proposed something similar to this as a way of profiling
waits and I believe you weren't very supportive of it, but I still
think these kinds of ideas have some legs.  We can neither take the
approach that timestamp overhead is irrelevant nor the position that
timestamps are expensive so let's not have any.  Some third way has to
be found.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Reporting planning time with EXPLAIN

2016-12-28 Thread Amit Kapila
On Wed, Dec 28, 2016 at 8:41 PM, Stephen Frost  wrote:
> * Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote:
>> >> One can use this option as
>> >> postgres=# explain (summary on) select * from pg_class c, pg_type t
>> >> where c.reltype = t.oid;
>> >> QUERY PLAN
>> >> --
>> >>  Hash Join  (cost=17.12..35.70 rows=319 width=511)
>> >>Hash Cond: (c.reltype = t.oid)
>> >>->  Seq Scan on pg_class c  (cost=0.00..14.19 rows=319 width=259)
>> >>->  Hash  (cost=12.61..12.61 rows=361 width=256)
>> >>  ->  Seq Scan on pg_type t  (cost=0.00..12.61 rows=361 width=256)
>> >>  Planning time: 48.823 ms
>> >> (6 rows)
>> >>
>> >> When analyze is specified, summary is also set to ON. By default this
>> >> flag is OFF.
>> >>
>> >
>> > I am not sure whether using *summary* to print just planning time is a
>> > good idea.  Another option could be SUMMARY_PLAN_TIME.
>>
>> I have just used the same name as the boolean which controls the
>> printing of planning time. Suggestions are welcome though. We haven't
>> used words with "_" for EXPLAIN options, so I am not sure about
>> SUMMARY_PLAN_TIME.
>
> Using 'summary' seems entirely reasonable to me, I don't think we need
> to complicate it by saying 'summary_plan_time' - I know that I'd hate to
> have to write that out.
>
>> > + /* Execution time matters only when analyze is requested */
>> > + if (es->summary && es->analyze)
>> >
>> > Do you really need es->summary in above check?
>
> I'm pretty sure we do.
>
> EXPLAIN (ANALYZE, SUMMARY OFF)
>
> Should not print the summary (which means it shouldn't print either the
> planning or execution time).
>

Hmm, have you checked what output the patch gives for the above command?
For that usage, it will print both planning and execution time.  IIUC,
the patch doesn't do what you have in mind.  As far as I understand,
the proposed usage of the summary parameter was only for planning
time.  Now, if you want it to mean that it controls the last two lines
of EXPLAIN output (planning and execution time), then I think the patch
requires some change.
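
For illustration, the semantics Stephen describes would amount to something
like this near the end of ExplainOnePlan() (rough sketch; es->summary comes
from the proposed patch and the *_ms variables are placeholders):

	/* SUMMARY gates both trailing lines; ANALYZE only changes its default */
	if (es->summary)
	{
		ExplainPropertyFloat("Planning time", planning_ms, 3, es);
		if (es->analyze)
			ExplainPropertyFloat("Execution time", execution_ms, 3, es);
	}

whereas, as far as I can see, the patch as posted prints the execution time
whenever ANALYZE is on, regardless of SUMMARY.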

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Duplicate node tag assignments

2016-12-28 Thread Alvaro Herrera
Amit Kapila wrote:
> On Wed, Dec 28, 2016 at 10:03 PM, Tom Lane  wrote:

> > Or we could just abandon the manually-assigned breaks in the list
> > altogether, and let NodeTags run from 1 to whatever.  This would
> > slightly complicate debugging, in that the numeric values of node
> > tags would change more than they used to, but on the whole that does
> > not sound like a large problem.
> 
> Yeah.  In most cases, during debugging I use the tag for typecasting
> the node to see the values of the particular node type.

I do likewise.  The actual numeric values don't matter to me one bit.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Duplicate node tag assignments

2016-12-28 Thread Amit Kapila
On Wed, Dec 28, 2016 at 10:03 PM, Tom Lane  wrote:
> By chance I happened to notice that the recent partition patch pushed
> us over the number of available node tags between
>
> T_A_Expr = 900,
>
> and
>
> T_TriggerData = 950,/* in commands/trigger.h */
>
> Specifically we now have some of the replication grammar node type
> codes conflicting with some of the "random other stuff" tags.
> It's not terribly surprising that that hasn't caused visible problems,
> because of the relatively localized use of replication grammar nodes,
> but it's still a Bad Thing.  And it's especially bad that apparently
> no one's compiler has complained about it.
>
> So we need to renumber enum NodeTag a bit, which is no big deal,
> but I think we had better take thought to prevent a recurrence
> (which might have worse consequences next time).
>
> One idea is to add StaticAsserts someplace asserting that there's
> still daylight at each manually-assigned break in the NodeTag list.
> But I'm not quite sure how to make that work.  For example, asserting
> that T_PlanInvalItem < T_PlanState won't help if people add new nodes
> after PlanInvalItem and don't think to update the Assert.
>
> Or we could just abandon the manually-assigned breaks in the list
> altogether, and let NodeTags run from 1 to whatever.  This would
> slightly complicate debugging, in that the numeric values of node
> tags would change more than they used to, but on the whole that does
> not sound like a large problem.
>

Yeah.  In most cases, during debugging I use the tag for typecasting
the node to see the values of the particular node type.

>  When you're working in gdb, say,
> it's easy enough to convert:
>
> (gdb) p (int) T_CreateReplicationSlotCmd
> $8 = 950
> (gdb) p (enum NodeTag) 949
> $9 = T_BaseBackupCmd
>
> So I'm leaning to the second, more drastic, solution.
>

Sounds sensible.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Proposal : Parallel Merge Join

2016-12-28 Thread Dilip Kumar
On Thu, Dec 29, 2016 at 3:15 AM, Tomas Vondra
 wrote:
> FWIW, I've done quite a bit of testing on this patch, and also on the other
> patches adding parallel index scans and bitmap heap scan. I've been running
> TPC-H and TPC-DS on 16GB data sets with each patch, looking for regressions
> or crashes.

Thanks for looking into this.

>
> I haven't found any of that so far, which is good of course. It however
> seems the plan changes only for very few queries in those benchmarks with
> any of the patches, even after tweaking the costs to make parallel plans
> more likely.

You can also try reducing random_page_cost (that will help parallel
merge join with index scan), in case your data fits in memory and you
are ensuring a warm-cache environment.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] pg_stat_activity.waiting_start

2016-12-28 Thread Amit Kapila
On Wed, Dec 28, 2016 at 10:55 PM, Tom Lane  wrote:
> Jim Nasby  writes:
>> On 12/28/16 7:10 AM, Amit Kapila wrote:
>>> Can we think of introducing new guc trace_system_waits or something
>>> like that which will indicate that the sessions will report the value
>>> of wait_start in pg_stat_activity?
>
>> In my experience the problem with those kind of settings is that they're
>> never enabled when you actually need them.
>

I agree with that, but I think it might be better than nothing
especially for LWLock waits.

> Yeah.  And they especially won't be enabled in production situations,
> if people find out that they cause lots of overhead.
>

What kind of overhead are we expecting here?  It probably depends on
the workload, but it is quite possible that in most workloads it is in
the single digits.  I agree that in an ideal situation we should not
allow even 1% overhead; however, if people are interested in doing that
detailed level of monitoring at the expense of some performance hit,
then why not give them the option to do so.

>> I think it'd be much better
>> to find a way to always capture wait_starts that are over some minimum
>> duration, where collection overhead won't matter but you still have some
>> good info about what's going on. For pg_stat_activity I'd think that
>> threshold would be on the order of 50-100ms, though maybe there's other
>> places where a tighter tolerance would help.
>
> The idea of just capturing the wait start for heavyweight locks, and
> not other lock types, still seems superior to any of the alternatives
> that have been suggested ...
>

If we want to capture waits without any system-level parameter, then
that seems the better option, and maybe later on we can capture I/O
waits as well.  However, I feel that, as proposed, the name of the
parameter wait_start can confuse users, considering we have wait_event
and wait_event_type for generic waits.



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Support for pg_receivexlog --format=plain|tar

2016-12-28 Thread Michael Paquier
On Wed, Dec 28, 2016 at 9:31 PM, Magnus Hagander  wrote:
> Conditional tests? It probably wouldn't hurt to have them, but that would be
> something more generic (like we'd need something to actually validate it --
> but it would make sense to have a test that, with compression enabled, would
> verify if the uncompressed file turns out to be exactly 16Mb for example).

Checking whether the build is compiled with libz, via SQL or using
pg_config, is the way to go here. A minimum is doable.

>> A couple of things to be aware of though:
>> - gzopen, gzwrite and gzclose are used to handle the gz files. That's
>> inconsistent with the tar method that is based mainly on deflate() and
>> more low level routines.
>
> But chosen for simplicity, I assume?

Yep. That's quite in-line with the current code.
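
For anyone not familiar with that zlib interface, the gz* calls in question
behave roughly like this (illustrative helper only, not code from the patch):

	#include <stdbool.h>
	#include <zlib.h>

	static bool
	write_gz_file(const char *path, const char *buf, unsigned len)
	{
		gzFile		fp = gzopen(path, "wb");	/* NULL on failure */

		if (fp == NULL)
			return false;

		/* gzwrite returns the number of uncompressed bytes written, 0 on error */
		if (gzwrite(fp, buf, len) != (int) len)
		{
			gzclose(fp);
			return false;
		}

		/* gzclose flushes and finalizes the gzip stream */
		return gzclose(fp) == Z_OK;
	}

which is quite a bit simpler than driving deflate() by hand, as the tar path
does.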

>> - I have switched the directory method to use a file pointer instead
>> of a file descriptor as gzwrite returns int as the number of
>> uncompressed bytes written.
>
> I don't really follow that reasoning :) Why does the directory method have
> to change to use a filepointer because of that?

The only reason is that write() returns size_t and fwrite returns int,
while gzwrite() returns int. It seems more consistent to use fwrite()
in this case. Or we don't bother about my nitpicking and just cast
stuff.

>> What do you think about this approach? I'll add that to the next CF.
>
> I haven't reviweed the code in detail but yes, I think this approach is the
> right one.

OK, thanks. I'll think about those conditional tests, correct one or
two things in the patch and submit again soon.
-- 
Michael




[HACKERS] rewrite HeapSatisfiesHOTAndKey

2016-12-28 Thread Alvaro Herrera
Pursuant to my comments at
https://www.postgresql.org/message-id/20161223192245.hx4rbrxbrwtgwj6i@alvherre.pgsql
and because of a stupid bug I found in my indirect indexes patch, I
rewrote (read: removed) HeapSatisfiesHOTAndKey.  The replacement
function HeapDetermineModifiedColumns returns a bitmapset with a bit set
for each modified column, for those columns that are listed as
"interesting" -- currently that set is the ID columns, the "key"
columns, and the indexed columns.  The new code is much simpler, at the
expense of a few bytes of additional memory used during heap_update().
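
In essence the replacement looks like this (simplified sketch; the real
function reuses the existing heap_tuple_attr_equals() helper and the usual
system-attribute offset for bitmap members):

	static Bitmapset *
	HeapDetermineModifiedColumns(Relation relation, Bitmapset *interesting_cols,
								 HeapTuple oldtup, HeapTuple newtup)
	{
		Bitmapset  *modified = NULL;
		int			attidx = -1;

		while ((attidx = bms_next_member(interesting_cols, attidx)) >= 0)
		{
			/* bitmap members are offset by FirstLowInvalidHeapAttributeNumber */
			AttrNumber	attnum = attidx + FirstLowInvalidHeapAttributeNumber;

			if (!heap_tuple_attr_equals(RelationGetDescr(relation),
										attnum, oldtup, newtup))
				modified = bms_add_member(modified, attidx);
		}

		return modified;
	}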

All tests pass.

Both WARM and indirect indexes should be able to use this infrastructure
in a way that better serves them than the current HeapSatisfiesHOTAndKey
(both patches modify that function in a rather ugly way).  I intend to
get this pushed shortly unless objections are raised.

The new coding gives up the ability to stop the check early, as soon as
the three result booleans could be determined.  However, both patches
were adding reasons to avoid stopping early anyway, so I don't think
this is a significant loss.

Pavan, please rebase your WARM patch on top of this and let me know how
you like it.  I'll post a new version of indirect indexes later this
week.

Comments welcome.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index ea579a0..1dbefda 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -95,11 +95,8 @@ static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf,
 					Buffer newbuf, HeapTuple oldtup,
 					HeapTuple newtup, HeapTuple old_key_tup,
 					bool all_visible_cleared, bool new_all_visible_cleared);
-static void HeapSatisfiesHOTandKeyUpdate(Relation relation,
-					Bitmapset *hot_attrs,
-					Bitmapset *key_attrs, Bitmapset *id_attrs,
-					bool *satisfies_hot, bool *satisfies_key,
-					bool *satisfies_id,
+static Bitmapset *HeapDetermineModifiedColumns(Relation relation,
+					Bitmapset *interesting_cols,
 					HeapTuple oldtup, HeapTuple newtup);
 static bool heap_acquire_tuplock(Relation relation, ItemPointer tid,
 					LockTupleMode mode, LockWaitPolicy wait_policy,
@@ -3443,6 +3440,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
Bitmapset  *hot_attrs;
Bitmapset  *key_attrs;
Bitmapset  *id_attrs;
+   Bitmapset  *interesting_attrs;
+   Bitmapset  *modified_attrs;
ItemId  lp;
HeapTupleData oldtup;
HeapTuple   heaptup;
@@ -3460,9 +3459,6 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
pagefree;
boolhave_tuple_lock = false;
booliscombo;
-   boolsatisfies_hot;
-   boolsatisfies_key;
-   boolsatisfies_id;
booluse_hot_update = false;
boolkey_intact;
boolall_visible_cleared = false;
@@ -3504,6 +3500,10 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
key_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_KEY);
 	id_attrs = RelationGetIndexAttrBitmap(relation,
 										  INDEX_ATTR_BITMAP_IDENTITY_KEY);
+   interesting_attrs = bms_add_members(NULL, hot_attrs);
+   interesting_attrs = bms_add_members(interesting_attrs, key_attrs);
+   interesting_attrs = bms_add_members(interesting_attrs, id_attrs);
+
 
block = ItemPointerGetBlockNumber(otid);
buffer = ReadBuffer(relation, block);
@@ -3524,7 +3524,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
Assert(ItemIdIsNormal(lp));
 
/*
-* Fill in enough data in oldtup for HeapSatisfiesHOTandKeyUpdate to 
work
+* Fill in enough data in oldtup for HeapDetermineModifiedColumns to 
work
 * properly.
 */
oldtup.t_tableOid = RelationGetRelid(relation);
@@ -3551,6 +3551,13 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
}
 
/*
+* Determine which columns are being modified by the update.
+*/
+
+	modified_attrs = HeapDetermineModifiedColumns(relation, interesting_attrs,
+												  &oldtup, newtup);
+
+

Re: [HACKERS] Improving RLS planning

2016-12-28 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Here's an updated version of the RLS planning patch that gets rid of
>> the incorrect interaction with Query.hasRowSecurity and adjusts
>> terminology as agreed.

> I've spent a fair bit of time going over this change to understand it,
> how it works, and how it changes the way RLS and security barrier views
> work.

Thanks for the review.  Attached is an updated patch that I believe
addresses all of the review comments so far.  The code is unchanged from
v2, but I improved the README, some comments, and the regression tests.

> Are you thinking that we will be able to remove the cut-out in
> is_simple_subquery() which currently punts whenever an RTE is marked as
> security_barrier?

Yeah, that's the long-term plan, but it's not done yet.

> Speaking of which, it seems like we should probably update the README to
> include some mention, at least, of what we're doing today when it comes
> to joins which involve security barrier entanglements.

I tweaked the new section in README to point out specifically that
security views aren't flattened.

>> ! * In addition to the quals inherited from the parent, we might 
>> have
>> ! * securityQuals associated with this particular child node.  
>> (This
>> ! * won't happen in inheritance cases, only with appendrels 
>> originating
>> ! * from UNION ALL.)

> Right, this won't happen in inheritance cases because we explicitly
> don't consider the quals of the children when querying through the
> parent, similar to how we don't consider the GRANT-based permissions on
> the child tables.  This is mentioned elsewhere but might make sense to
> also mention it here, or at least say 'see expand_inherited_rtentry()'.

Comment adjusted.

>> +/* Reject if it is potentially postponable by security considerations */

> The first comment makes a lot of sense, but the one-liner doesn't seem
> as clear, to me anyway.

Not sure how to make it better.

> The result of the above, as I understand it, is that security_level will
> either be zero, or the restrictinfo will be leakproof, no?  Meaning that
> ec_max_security will either be zero, or the functions involved will be
> leakproof, right?

Right.  We still have to check other member operators of the opfamily,
if we need to select one, but we at least know that the original clauses
are safe.

> Perhaps it's more difficult than it's worth to come up with cases that
> cover the other code paths involved, but it seems like it might be good
> to at least try to as it's likely to happen in more cases now that we're
> returning (or should be, at least) InvalidOid due to the only operators
> found being leaky ones.

To test this you need a btree opclass that contains some leakproof and
some non-leakproof operators, which is a mite unusual, but fortunately
we already have a test (equivclass.sql) that creates such a situation.
I added a test there that demonstrates the planner backing off an
equivalence-class deduction in the presence of lower-level security
quals.  We might have to tweak the test in future if we refine these
rules, but that seems fine.

> As discussed previously, this looks like a good, practical, hack, but I
> feel a little bad that we don't mention it anywhere except in this
> comment.  Is it too low-level to get a mention in the README?

Done.

I also adjusted the test that was collapsing to a dummy query, and
updated the expected results for a couple of new queries that weren't
there two months ago.

regards, tom lane



new-rls-planning-implementation-3.patch.gz
Description: new-rls-planning-implementation-3.patch.gz



Re: [HACKERS] Proposal : Parallel Merge Join

2016-12-28 Thread Tomas Vondra

Hi,

On 12/21/2016 04:53 PM, Dilip Kumar wrote:

On Wed, Dec 21, 2016 at 8:39 PM, Robert Haas  wrote:

Committed the refactoring patch with some mild cosmetic adjustments.


Thanks..


As to the second patch:

+if (jointype == JOIN_UNIQUE_INNER)
+jointype = JOIN_INNER;

Isn't this dead code?  save_jointype might that value, but jointype won't.


Yes, it is.

I have fixed this, and also observed that comment for
try_partial_mergejoin_path header was having some problem, fixed that
too.



FWIW, I've done quite a bit of testing on this patch, and also on the 
other patches adding parallel index scans and bitmap heap scan. I've 
been running TPC-H and TPC-DS on 16GB data sets with each patch, looking 
for regressions or crashes.


I haven't found any of that so far, which is good of course. It however 
seems the plan changes only for very few queries in those benchmarks 
with any of the patches, even after tweaking the costs to make parallel 
plans more likely.


I'm going to try with larger scales and also --enable-cassert and post 
the results during CF 2017-1.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Logical tape pause/resume

2016-12-28 Thread Peter Geoghegan
On Wed, Dec 21, 2016 at 4:25 AM, Heikki Linnakangas  wrote:
> In the meanwhile, Robert committed the cap on the number of tapes. Since
> that's in, I'm not sure if the pause/resume part of this is worthwhile.
> Maybe it is.

I rebased my parallel tuplesort patch on top of what you committed a
few days back (your 0001-* patch). It definitely made my changes to
logtape.c significantly simpler, which was a big improvement.

I would be inclined to not go forward with 0002-* though, because I
think it's cleaner for the parallel tuplesort patch to have workers
rely on the tape freezing code to flush the last block out to make
state available in temp files for the leader to process/merge. The
memory savings that remain on the table are probably not measurable if
we were to fix them, given the work we've already done, palloc()
fragmentation, and so on.

-- 
Peter Geoghegan




Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-12-28 Thread Claudio Freire
On Wed, Dec 28, 2016 at 3:41 PM, Claudio Freire  wrote:
>> Anyway, I found the problem that had caused segfault.
>>
>> for (segindex = 0; segindex <= vacrelstats->dead_tuples.last_seg; tupindex =
>> 0, segindex++)
>> {
>> DeadTuplesSegment *seg =
>> &(vacrelstats->dead_tuples.dead_tuples[segindex]);
>> intnum_dead_tuples = seg->num_dead_tuples;
>>
>> while (tupindex < num_dead_tuples)
>> ...
>>
>> You rely on the value of tupindex here, while during the very first pass the
>> 'tupindex' variable
>> may contain any garbage. And it happend that on my system there was negative
>> value
>> as I found inspecting core dump:
>>
>> (gdb) info locals
>> num_dead_tuples = 5
>> tottuples = 0
>> tupindex = -1819017215
>>
>> Which leads to failure in the next line
>> tblk = ItemPointerGetBlockNumber(&seg->dead_tuples[tupindex]);
>>
>> The solution is to move this assignment inside the cycle.
>
> Good catch. I read that line suspecting that very same thing but
> somehow I was blind to it.
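
In other words, the fix boils down to resetting tupindex at the top of each
segment instead of in the for-increment clause (sketch based on the quoted
loop, not the exact v4 code):

	for (segindex = 0; segindex <= vacrelstats->dead_tuples.last_seg; segindex++)
	{
		DeadTuplesSegment *seg = &(vacrelstats->dead_tuples.dead_tuples[segindex]);
		int			num_dead_tuples = seg->num_dead_tuples;

		tupindex = 0;			/* previously left uninitialized on the first pass */
		while (tupindex < num_dead_tuples)
		{
			tblk = ItemPointerGetBlockNumber(&seg->dead_tuples[tupindex]);
			/* ... process and advance tupindex as before ... */
		}
	}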

Attached v4 patches with the requested fixes.
From 988527b49ec0f48ea7fc85c7533e8eaa07d6fc3b Mon Sep 17 00:00:00 2001
From: Claudio Freire 
Date: Fri, 2 Sep 2016 23:33:48 -0300
Subject: [PATCH 1/2] Vacuum: prefetch buffers on backward scan

This patch speeds up truncation by prefetching buffers during
the backward scan phase just prior to truncation. This optimization
helps rotating media because disk rotation induces a buffer-to-buffer
latency of roughly a whole rotation when reading backwards, since such
disks are usually optimized for forward scans. Prefetching buffers
in larger chunks ameliorates the issue considerably and is otherwise
harmless. The prefetch amount should also not be an issue on SSD,
which won't benefit from the optimization, but won't be hurt either.
---
 src/backend/commands/vacuumlazy.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index b5fb325..854bce3 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -1825,7 +1825,7 @@ lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats)
 static BlockNumber
 count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 {
-	BlockNumber blkno;
+	BlockNumber blkno, prefetchBlkno;
 	instr_time	starttime;
 
 	/* Initialize the starttime if we check for conflicting lock requests */
@@ -1833,6 +1833,8 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 
 	/* Strange coding of loop control is needed because blkno is unsigned */
 	blkno = vacrelstats->rel_pages;
+	prefetchBlkno = blkno & ~0x1f;
+	prefetchBlkno = (prefetchBlkno > 32) ? prefetchBlkno - 32 : 0;
 	while (blkno > vacrelstats->nonempty_pages)
 	{
 		Buffer		buf;
@@ -1882,6 +1884,23 @@ count_nondeletable_pages(Relation onerel, LVRelStats *vacrelstats)
 
 		blkno--;
 
+		/*
+		 * Speed up reads on rotating rust by prefetching a few blocks ahead.
+		 * Backward scans on rotary disks are slow if done one buffer at a time
+		 * because readahead won't kick in on most OSes, and each buffer will
+		 * have to wait for the platter to do a full rotation.
+		 * Should be harmless on SSD.
+		 */
+		if (blkno <= prefetchBlkno) {
+			BlockNumber pblkno;
+			if (prefetchBlkno >= 32)
+				prefetchBlkno -= 32;
+			else
+				prefetchBlkno = 0;
+			for (pblkno = prefetchBlkno; pblkno < blkno; pblkno++)
+				PrefetchBuffer(onerel, MAIN_FORKNUM, pblkno);
+		}
+
 		buf = ReadBufferExtended(onerel, MAIN_FORKNUM, blkno,
  RBM_NORMAL, vac_strategy);
 
-- 
1.8.4.5

From d360720415a2db5d5285a757c9a06e7ed854c5c5 Mon Sep 17 00:00:00 2001
From: Claudio Freire 
Date: Mon, 12 Sep 2016 23:36:42 -0300
Subject: [PATCH 2/2] Vacuum: allow using more than 1GB work mem

Turn the dead_tuples array into a structure composed of several
exponentially bigger arrays, to enable usage of more than 1GB
of work mem during vacuum and thus reduce the number of full
index scans necessary to remove all dead tids when the memory is
available.
---
 src/backend/commands/vacuumlazy.c | 302 ++
 1 file changed, 237 insertions(+), 65 deletions(-)

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 854bce3..8b714e0 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -93,11 +93,40 @@
 #define LAZY_ALLOC_TUPLES		MaxHeapTuplesPerPage
 
 /*
+ * Minimum (starting) size of the dead_tuples array segments. Will allocate
+ * space for 128MB worth of tid pointers in the first segment, further segments
+ * will grow in size exponentially. Don't make it too small or the segment list
+ * will grow bigger than the sweetspot for search efficiency on big vacuums.
+ */
+#define LAZY_MIN_TUPLES		Max(MaxHeapTuplesPerPage, (128<<20) / sizeof(ItemPointerData))
+
+/*
  * Before we consider skipping a page that's marked as clean in
  * visibility map, we must've seen at least th

Re: [HACKERS] merging some features from plpgsql2 project

2016-12-28 Thread Pavel Stehule
2016-12-28 20:25 GMT+01:00 Jim Nasby :

> On 12/28/16 12:51 PM, Pavel Stehule wrote:
>
>> Now, the incompatibility can be hard issue - it is big question if we
>> lock some users on old versions because some users can save to lines of
>> code. Introduction of ROW_COUNT is lowly incompatibility - it can be
>> simply detected - but for example change of behave of FOUND variable is
>> terrible, because the code will be quietly calculate differently.
>> sometimes we can break code - probably people will not be happy, but
>> sometimes we can change the results - it can be big fail. So on one side
>> is big costs. On second side is few lines less code.
>>
>
> That's my whole point of why this needs to be settable at a global level:
> so that people with a lot of legacy code can set the OLD behavior at a
> global level, and deal with the old code over time.
>
> If there's no global setting then there are only two choices: we default
> to new behavior and force everyone to add a bunch of stuff to *every*
> function they have (loads of complaints), or we default to old behavior and
> no one bothers to even adopt the new usage because they have to add extra
> stuff to every function. Either way is a failure. This is why I think there
> MUST be some way to control this at a higher level than per function.
>
>
We can have both - plpgsql.variable_conflict can be a precedent.


> Certainly GUCs aren't the only option, we could invent something else. One
> feature I could see being useful is being able to set a default on a schema
> level, which isn't currently possible with a GUC. But I can certainly see
> database and global settings being useful, and perhaps per-user as well.
> GUCs already have those.


Yes, without a GUC you cannot set the behavior of plpgsql globally.

Regards

Pavel


>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532)
>


Re: [HACKERS] merging some features from plpgsql2 project

2016-12-28 Thread Jim Nasby

On 12/28/16 12:51 PM, Pavel Stehule wrote:

Now, the incompatibility can be hard issue - it is big question if we
lock some users on old versions because some users can save to lines of
code. Introduction of ROW_COUNT is lowly incompatibility - it can be
simply detected - but for example change of behave of FOUND variable is
terrible, because the code will be quietly calculate differently.
sometimes we can break code - probably people will not be happy, but
sometimes we can change the results - it can be big fail. So on one side
is big costs. On second side is few lines less code.


That's my whole point of why this needs to be settable at a global 
level: so that people with a lot of legacy code can set the OLD behavior 
at a global level, and deal with the old code over time.


If there's no global setting then there are only two choices: we default 
to new behavior and force everyone to add a bunch of stuff to *every* 
function they have (loads of complaints), or we default to old behavior 
and no one bothers to even adopt the new usage because they have to add 
extra stuff to every function. Either way is a failure. This is why I 
think there MUST be some way to control this at a higher level than per 
function.


Certainly GUCs aren't the only option, we could invent something else. 
One feature I could see being useful is being able to set a default on a 
schema level, which isn't currently possible with a GUC. But I can 
certainly see database and global settings being useful, and perhaps 
per-user as well. GUCs already have those.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] Improving RLS planning

2016-12-28 Thread Tom Lane
Stephen Frost  writes:
> * Dean Rasheed (dean.a.rash...@gmail.com) wrote:
>> Hmm. I've not read any of the new code yet, but the fact that this
>> test now reduces to a one-time filter makes it effectively useless as
>> a test of qual evaluation order because it has deduced that it doesn't
>> need to evaluate them. I would suggest replacing the qual with
>> something that can't be reduced, perhaps "2*a = 6".

> That's a good thought, I agree.

[ getting back to this patch finally... ]  I made the suggested change
to that test case, and what I see is a whole lot of "NOTICE: snooped value
= whatever" outputs.  The fact that there are none in the current test
output is because in

UPDATE v1 SET a=100 WHERE snoop(a) AND leakproof(a) AND a = 3;

we currently decide that the subquery can't be flattened, but we then push
down the two leakproof quals into it, so that they get evaluated ahead of
the snoop() call.  The revised code doesn't do that, allowing snoop() to
be called on rows that will fail the other two quals --- but AFAICS,
that's a feature not a bug.  There is no security-based argument why
snoop() can't go before them, and on cost grounds it should.

I'd leave it as shown in the attached diff fragment, except that I'm
a bit worried about possible platform dependency of the output.  The
hashing occurring in the subplans shouldn't affect output order, but
I'm not sure if we want a test output like this or not.  Thoughts?

regards, tom lane


*** SELECT * FROM v1 WHERE a=8;
*** 2114,2198 
  (4 rows)
  
  EXPLAIN (VERBOSE, COSTS OFF)
! UPDATE v1 SET a=100 WHERE snoop(a) AND leakproof(a) AND a = 3;
!  QUERY PLAN   
  
! 

!  Update on public.t1 t1_4
!Update on public.t1 t1_4
!Update on public.t11 t1
!Update on public.t12 t1
!Update on public.t111 t1
!->  Subquery Scan on t1
   Output: 100, t1.b, t1.c, t1.ctid
!  Filter: snoop(t1.a)
!  ->  LockRows
!Output: t1_5.ctid, t1_5.a, t1_5.b, t1_5.c, t1_5.ctid, 
t12.ctid, t12.tableoid
!->  Nested Loop Semi Join
!  Output: t1_5.ctid, t1_5.a, t1_5.b, t1_5.c, t1_5.ctid, 
t12.ctid, t12.tableoid
!  ->  Seq Scan on public.t1 t1_5
!Output: t1_5.ctid, t1_5.a, t1_5.b, t1_5.c
!Filter: ((t1_5.a > 5) AND (t1_5.a = 3) AND 
leakproof(t1_5.a))
!  ->  Append
!->  Seq Scan on public.t12
!  Output: t12.ctid, t12.tableoid, t12.a
!  Filter: (t12.a = 3)
!->  Seq Scan on public.t111
!  Output: t111.ctid, t111.tableoid, t111.a
!  Filter: (t111.a = 3)
!->  Subquery Scan on t1_1
!  Output: 100, t1_1.b, t1_1.c, t1_1.d, t1_1.ctid
!  Filter: snoop(t1_1.a)
!  ->  LockRows
!Output: t11.ctid, t11.a, t11.b, t11.c, t11.d, t11.ctid, 
t12_1.ctid, t12_1.tableoid
!->  Nested Loop Semi Join
!  Output: t11.ctid, t11.a, t11.b, t11.c, t11.d, t11.ctid, 
t12_1.ctid, t12_1.tableoid
!  ->  Seq Scan on public.t11
!Output: t11.ctid, t11.a, t11.b, t11.c, t11.d
!Filter: ((t11.a > 5) AND (t11.a = 3) AND 
leakproof(t11.a))
!  ->  Append
!->  Seq Scan on public.t12 t12_1
!  Output: t12_1.ctid, t12_1.tableoid, t12_1.a
!  Filter: (t12_1.a = 3)
!->  Seq Scan on public.t111 t111_1
!  Output: t111_1.ctid, t111_1.tableoid, 
t111_1.a
!  Filter: (t111_1.a = 3)
!->  Subquery Scan on t1_2
!  Output: 100, t1_2.b, t1_2.c, t1_2.e, t1_2.ctid
!  Filter: snoop(t1_2.a)
!  ->  LockRows
!Output: t12_2.ctid, t12_2.a, t12_2.b, t12_2.c, t12_2.e, 
t12_2.ctid, t12_3.ctid, t12_3.tableoid
!->  Nested Loop Semi Join
!  Output: t12_2.ctid, t12_2.a, t12_2.b, t12_2.c, t12_2.e, 
t12_2.ctid, t12_3.ctid, t12_3.tableoid
!  ->  Seq Scan on public.t12 t12_2
!Output: t12_2.ctid, t12_2.a, t12_2.b, t12_2.c, 
t12_2.e
!Filter: ((t12_2.a > 5) AND (t12_2.a = 3) AND 
leakproof(t12_2.a))
!  ->  Append
!->  Seq Scan on public.t12 t12_3
!  Output: t12_3.ctid, t12_3.tableoid, t12_3.a
!  Filter: (t1

Re: [HACKERS] merging some features from plpgsql2 project

2016-12-28 Thread Pavel Stehule
2016-12-28 18:54 GMT+01:00 Jim Nasby :

> On 12/28/16 7:16 AM, Pavel Stehule wrote:
>
>>
>>
>> 2016-12-28 5:09 GMT+01:00 Jim Nasby > >:
>>
>> On 12/27/16 4:56 PM, Merlin Moncure wrote:
>>
>> On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule
>> mailto:pavel.steh...@gmail.com>> wrote:
>> Which is why this is an external fork of plpgsql.
>>
>>
>> ok. Just I would not to repeat Perl6 or Python3 story - it is big
>> adventure, but big fail too
>>
>
> Yeah, creating an entirely "new" PL to deal with compatibility doesn't
> seem like a good idea to me.
>
> ** The real problem is that we have no mechanism for allowing a PL's
>> language/syntax/API to move forward without massive backwards
>> compatibility problems. **
>>
>>
>> We have not, but there are few possibilities:
>>
>> 1. enhance #option command
>> 2. we can introduce PRAGMA command
>> https://en.wikipedia.org/wiki/Ada_(programming_language)#Pragmas
>>
>
> See separate reply.
>
> 
>
> I'm honestly surprised (even shocked) that you've never run into any
>> of the problems plpgsql2 is trying to solve. I've hit all those
>> problems except for OUT parameters. I'd say the order they're listed
>> in actually corresponds to how often I hit the problems.
>>
>>
>> I hit lot of older harder (now solved) issues - now, with more
>> experience I am able to see these issues. And I wrote plpgsql_check,
>> partially for self too. Years ago I prefer safe expressions.
>>
>
> Recognizing a problem ahead of time (or having plpgsql_check do it for
> you) still means you have to find a way to work around it. In some cases
> (ie: STRICT), that workaround can be a serious PITA. Better to just
> eliminate the problem itself.
>
> I think trying to move the ball forward in a meaningful way without
>> breaking compatibility is a lost cause. Some of these issues could
>> be addressed by adding more syntax, but even that has limits (do we
>> really want another variation of STRICT that allows only 0 or 1
>> rows?). And there's no way to fix your #1 item below without
>> breaking compatibility.
>>
>>
>> I think so there is way with extra check, or with persistent plpgsql
>> options - just use it, please. Some checks are clear, some other not.
>>
>
> I will assert that there will ALWAYS be problems that you can't plaster
> over with some kind of extra checking (like plpgsql_check). At some point,
> in order to fix those, you have to somehow break compatibility.


> Look at libpq as an example. There's a reason we're on protocol V3.
>
> If you know ALGOL family languages, then it is not problem. What is a
>>
>
> Lets be realistic... what % of our users have even heard of ALGOL, let
> alone used it? :)


Not too many - but the problem is not BEGIN/END. I wrote PL/PSM, where
BEGIN/END doesn't exist. The functionality was the same as PL/pgSQL - and
there was nobody who used it.


>
> harder problem for people is different implementation of mix SQL and PL
>> - different than Oracle, or MSSQL. Our model is better, simpler but
>> different. It is difficult for people without knowleadge of differences
>> between functions and procedures. Partially we badly speaking so our
>> void functions are procedures.
>>
>
> I suspect that's only confusing for people coming from Oracle (which of
> course is a non-trivial number of people).
>
> #6: The variations of syntax between the FOR variants is annoying
>> (specifically, FOREACH necessitating the ARRAY keyword).
>>
>>
>> this is design - FOR is old PL/SQL syntax. FOREACH is prepared for
>> extending
>>
>
> Understood. It still sucks though. :)
>
> #8: EVERYTHING command option should accept a variable. In
>> particular, RAISE should accept a variable for level, but there's
>> other cases of this I've run into. I'd also be nice if you could
>> plop variables into SQL commands where you'd have an identifier,
>> though presumably that would require some kind of explicit variable
>> identifier.
>>
>>
>> It is hiding dynamic SQL - I am strongly against it - minimally due
>> performance issues. Important functionality should not be hidden.
>>
>
> There's definitely ways around the performance issue. I do agree that it
> needs to be clear when you're doing something dynamic so it's not
> accidental. One way to do that would be to add support for variable
> decorators and mandate the use of decorators when using a variable for an
> identifier.
>
> That said, *every* option to RAISE can be a variable except the level.
> That's just plain silly and should be fixed.


I am sorry - I read it wrong. If there is no parser issue, then it can
be fixed simply.


>
>
> #13: cstring support would allow a lot more people to experiment
>> with things like custom types. Yes, plpgsql might be slow as hell
>> for this, but sometimes that doesn't matter. Even if it does, it can
>> be a lot easier to prototype in some

Re: [HACKERS] merging some features from plpgsql2 project

2016-12-28 Thread Pavel Stehule
2016-12-28 19:23 GMT+01:00 Jim Nasby :

> On 12/28/16 12:15 PM, Pavel Stehule wrote:
>
>> GUC are fragile - the source code and settings can be separated.
>>
>
> *Can* be, but they don't *have* to be. That's a huge feature, not a bug.
>
> Our #option is more robust, because source code holds all flags required
>> for execution. So I would to see a mechanism, that will be strongly
>> joined with code.
>>
>
> That means you must ALWAYS specify, which is an enormous pain. It
> basically guarantees that users will NEVER switch to the new syntax.
>
> Using function assigned GUC is similar, but it is looking less robust -
>> and some editors can forgot this information.
>>
>
> If you forget then you get an error. Then you remember.
>
> Lots of issues we can solve with plpgsql.extra_errors / extra_warnings - but
>> probably not all - for example the issue of the FOUND variable, or
>> introducing a new automatic variable ROW_COUNT. PL/pgSQL - like PL/SQL - is
>> safe: it offers the statement GET DIAGNOSTICS, but I understand that it
>> isn't fun to write GET DIAGNOSTICS rc = ROW_COUNT over and over. So some
>> shortcuts can be nice, but there is a risk that a shortcut breaks existing
>> code, and the costs/benefits are individual. There can never be 100%
>> agreement. So some customisation would be good.
>>
>
> That's the whole point of having settings to deal with incompatibilities:
> so we can actually fix these warts without breaking everyone's code, yet
> also make it clear to users that they should stop using the warts and
> instead use the new and improved syntax.


Now, incompatibility can be a hard issue - it is a big question whether we lock
some users onto old versions just so other users can save two lines of code.
Introducing ROW_COUNT is a low-impact incompatibility - it can be detected
simply - but, for example, changing the behaviour of the FOUND variable is
terrible, because the code will quietly calculate differently. Sometimes we can
break code - people will probably not be happy - but sometimes we silently
change the results, and that can be a big failure. So on one side there are big
costs. On the other side, a few lines less code.

Regards

Pavel


>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532)
>


Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-12-28 Thread Claudio Freire
On Wed, Dec 28, 2016 at 10:26 AM, Anastasia Lubennikova
 wrote:
> 27.12.2016 20:14, Claudio Freire:
>
> On Tue, Dec 27, 2016 at 10:41 AM, Anastasia Lubennikova
>  wrote:
>
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x006941e7 in lazy_vacuum_heap (onerel=0x1ec2360,
> vacrelstats=0x1ef6e00) at vacuumlazy.c:1417
> 1417        tblk = ItemPointerGetBlockNumber(&seg->dead_tuples[tupindex]);
> (gdb) bt
> #0  0x006941e7 in lazy_vacuum_heap (onerel=0x1ec2360,
> vacrelstats=0x1ef6e00) at vacuumlazy.c:1417
> #1  0x00693dfe in lazy_scan_heap (onerel=0x1ec2360, options=9,
> vacrelstats=0x1ef6e00, Irel=0x1ef7168, nindexes=2, aggressive=1 '\001')
> at vacuumlazy.c:1337
> #2  0x00691e66 in lazy_vacuum_rel (onerel=0x1ec2360, options=9,
> params=0x7ffe0f866310, bstrategy=0x1f1c4a8) at vacuumlazy.c:290
> #3  0x0069191f in vacuum_rel (relid=1247, relation=0x0, options=9,
> params=0x7ffe0f866310) at vacuum.c:1418
>
> Those line numbers don't match my code.
>
> Which commit are you based on?
>
> My tree is (currently) based on 71f996d22125eb6cfdbee6094f44370aa8dec610
>
>
> Hm, my branch is based on 71f996d22125eb6cfdbee6094f44370aa8dec610 as well.
> I merely applied patches
> 0001-Vacuum-prefetch-buffers-on-backward-scan-v3.patch
> and 0002-Vacuum-allow-using-more-than-1GB-work-mem-v3.patch
> then ran configure and make as usual.
> Am I doing something wrong?

Doesn't sound wrong. Maybe it's my tree that's the unclean one. I'll have to
do a clean checkout to verify.

> Anyway, I found the problem that had caused segfault.
>
> for (segindex = 0; segindex <= vacrelstats->dead_tuples.last_seg; tupindex =
> 0, segindex++)
> {
> DeadTuplesSegment *seg =
> &(vacrelstats->dead_tuples.dead_tuples[segindex]);
> int         num_dead_tuples = seg->num_dead_tuples;
>
> while (tupindex < num_dead_tuples)
> ...
>
> You rely on the value of tupindex here, but during the very first pass the
> 'tupindex' variable may contain garbage. And it happened that on my system
> there was a negative value, as I found by inspecting the core dump:
>
> (gdb) info locals
> num_dead_tuples = 5
> tottuples = 0
> tupindex = -1819017215
>
> Which leads to a failure in the next line
> tblk = ItemPointerGetBlockNumber(&seg->dead_tuples[tupindex]);
>
> The solution is to move this assignment inside the loop.

Good catch. I read that line suspecting that very same thing but
somehow I was blind to it.
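Just to spell out the fix being described (a sketch over the names in the
quoted snippet, not the final patch text): tupindex has to be reset at the top
of each segment iteration, otherwise the very first pass uses whatever garbage
the variable held before the loop.

    for (segindex = 0; segindex <= vacrelstats->dead_tuples.last_seg; segindex++)
    {
        DeadTuplesSegment *seg = &(vacrelstats->dead_tuples.dead_tuples[segindex]);
        int         num_dead_tuples = seg->num_dead_tuples;

        tupindex = 0;           /* the assignment, moved inside the loop */
        while (tupindex < num_dead_tuples)
        {
            /* ... process seg->dead_tuples[tupindex] ... */
            tupindex++;
        }
    }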

> I've read the second patch.
>
> 1. What is the reason to inline vac_cmp_itemptr() ?

Performance, mostly. By inlining, the compiler can apply some transformations
that wouldn't be possible otherwise. During the binary search, this matters
performance-wise.
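To illustrate (a sketch, not the exact function from the patch): the comparator
is small enough that marking it static inline lets the compiler fold it into
the binary-search loop instead of paying a call per comparison.

    static inline int
    vac_cmp_itemptr_sketch(ItemPointer lhs, ItemPointer rhs)
    {
        BlockNumber  lblk = ItemPointerGetBlockNumber(lhs);
        BlockNumber  rblk = ItemPointerGetBlockNumber(rhs);
        OffsetNumber loff;
        OffsetNumber roff;

        /* Order by block number first ... */
        if (lblk != rblk)
            return (lblk < rblk) ? -1 : 1;

        /* ... then by offset within the block. */
        loff = ItemPointerGetOffsetNumber(lhs);
        roff = ItemPointerGetOffsetNumber(rhs);
        if (loff != roff)
            return (loff < roff) ? -1 : 1;

        return 0;
    }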

> 2. +#define LAZY_MIN_TUPLES    Max(MaxHeapTuplesPerPage, (128<<20) / sizeof(ItemPointerData))
> What does 128<<20 mean? Why not 1<<27? I'd ask you to replace it with a
> named constant, or at least add a comment.

I thought it was more readable like that; since 1<<20 is known to be
"MB", that reads as "128 MB".

I guess I can add a comment saying so.
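Something along these lines, perhaps (the macro body is the quoted patch line;
the comment is only the suggested addition, not committed text):

    /*
     * 128 << 20 is 128 * 2^20 bytes, i.e. "128 MB"; divided by
     * sizeof(ItemPointerData) (6 bytes) that is roughly 22 million TIDs.
     * The Max() keeps the value from ever dropping below what a single
     * heap page can hold.
     */
    #define LAZY_MIN_TUPLES \
        Max(MaxHeapTuplesPerPage, (128 << 20) / sizeof(ItemPointerData))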

> I'll share my results of performance testing it in a few days.

Thanks




Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Pavel Stehule
2016-12-28 19:17 GMT+01:00 Jim Nasby :

> On 12/28/16 11:29 AM, Fabien COELHO wrote:
>
>>
>> Hello Jim,
>>
>> 1) Variables would be completely non-transactional. [...] A solution
>>> to this problem would be to provide a plpgsql equivalent to plperl or
>>> plpython's session hashes.
>>>
>>
>> That is what I have in mind with "session variables" à la MS/MY SQL, but
>> at the SQL level, not PL/pgSQL level.
>>
>
> I'm just saying there might be use for a plpgsql equivalent to the session
> hashes that other PLs provide, but that's a different issue.
>
> users are forced to create accessor functions,
>>>
>>
>> Yes if the private variable should be accessed. If the variable is
>> private, then it is private and nothing is needed. Idem for public.
>>
>
> Why force the extra busywork? Just allow for them to be public.
>
> For that matter, if we're going to start introducing private objects, that
> certainly needs to be thought through.
>
> and you run a serious risk of confusion from getting the function
>>> ownerships wrong.
>>>
>>
>> One can get the permissions on special session variable wrong as well...
>> I do not see how it differs.
>>
>
> It's a lot harder to mess up an explicit grant than it is to mess up
> object ownership.
>
> More importantly, the security definer trick you're suggesting has a
>>> fatal flaw: you can't call one SECDEF function from another SECDEF
>>> function.
>>>
>>
>> I do not see why there would be such a restriction?
>>
>>  postgres@db> CREATE FUNCTION secfunc() RETURNS TEXT SECURITY DEFINER
>>   AS $$ SELECT CURRENT_USER::TEXT; $$ LANGUAGE SQL;
>>
>>  fabien@db> CREATE FUNCTION secfunc2() RETURNS TEXT SECURITY DEFINER
>> AS $$ SELECT secfunc() || ' - ' || CURRENT_USER; $$ LANGUAGE
>> SQL;
>>
>>  *@db> SELECT secfunc2(); -- returns: "postgres - fabien" from both
>> sides...
>>
>
> Perhaps I've got the restrictions on SECDEF wrong, but I know there's
> problems with it. Certainly one issue is you can't change roles back to the
> calling user.
>
> Maybe they would be workable in this case, but it's just a bunch of extra
> busywork for the user that serves no real purpose.
>
> We should protect for the possibility of truly global (as in
>>> cross-session) variables.
>>>
>>
>> Yes, why not... Although having "cross-session session variable" seems
>> to create some problems of its own... Would they be cross-database as
>> well?
>>
>
> Yes. It'd be a shared catalog.
>
> ...
>
> Yes, you could simulate the same thing with functions, but why make
>>> users do all that work if we could easily provide the same functionality?
>>>
>>
>> The "easily" is unclear. E.g. if specially declared, permission-controlled,
>> partially persistent session variables preclude future basic session
>> variables, or their efficiency, or their syntax, it would be a problem.
>> Hence the point of discussing before proceeding.
>>
>
> Then IMHO what needs to happen is to have a discussion on actual syntax
> instead of calling into question the value of the feature. Following this
> thread has been very painful because the communications have not been very
> clear. Focus on grammar would probably be a big improvement in that regard.


I don't think so. There are some significant questions:

1. Should "variables" be fixed in a schema? Should the metadata be persistent?
2. Should we use GRANT/REVOKE statements for "variables"?
3. Should a "variable" name be unique within a schema, like tables, indexes and
sequences?
4. Should the content of "variables" be non-transactional or transactional?
5. What modes should we support, and which will be the default: "unshared" or
"shared"?

That is all. The whole discussion is about these questions. These questions
create a multidimensional space that can be covered, but we cannot implement
everything in one stage.

We can have schema-unbound variables declared by

DECLARE @xx AS int, and accessed with get('@xx') and set('@xx', val)

and we can have schema-bound variables created by

CREATE VARIABLE public.xx AS int, and accessed with get('public.xx') and
set('public.xx', val)

We can have both - I don't see any problem. Implementing the DECLARE statement
doesn't rule out implementing the CREATE statement. I can understand that
somebody who doesn't need secure variables doesn't need the CREATE statement.
But if you do need secure variables, then some PRIVATE flag is at best
redundant with our standard security design.

My proposal doesn't eliminate Fabien's proposal - Fabien's proposal may be good
enough for Fabien, and for interactive work, but it is not good enough for some
checking and security usage.

Regards

Pavel






>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532)
>


Re: [HACKERS] merging some features from plpgsql2 project

2016-12-28 Thread Jim Nasby

On 12/28/16 12:15 PM, Pavel Stehule wrote:

GUCs are fragile - the source code and the settings can become separated.


*Can* be, but they don't *have* to be. That's a huge feature, not a bug.


Our #option is more robust, because the source code holds all the flags
required for execution. So I would like to see a mechanism that is strongly
joined with the code.


That means you must ALWAYS specify, which is an enormous pain. It 
basically guarantees that users will NEVER switch to the new syntax.



Using a function-assigned GUC is similar, but it looks less robust -
and some editors can forget this information.


If you forget then you get an error. Then you remember.


Lots of issues we can solve with plpgsql.extra_errors / extra_warnings - but
probably not all - for example the issue of the FOUND variable, or introducing
a new automatic variable ROW_COUNT. PL/pgSQL - like PL/SQL - is safe: it offers
the statement GET DIAGNOSTICS, but I understand that it isn't fun to write
GET DIAGNOSTICS rc = ROW_COUNT over and over. So some shortcuts can be nice,
but there is a risk that a shortcut breaks existing code, and the
costs/benefits are individual. There can never be 100% agreement. So some
customisation would be good.


That's the whole point of having settings to deal with 
incompatibilities: so we can actually fix these warts without breaking 
everyone's code, yet also make it clear to users that they should stop 
using the warts and instead use the new and improved syntax.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Joe Conway
On 12/28/2016 10:17 AM, Jim Nasby wrote:
> Then IMHO what needs to happen is to have a discussion on actual syntax
> instead of calling into question the value of the feature. Following
> this thread has been very painful because the communications have not
> been very clear. Focus on grammar would probably be a big improvement in
> that regard.

+1

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Jim Nasby

On 12/28/16 11:29 AM, Fabien COELHO wrote:


Hello Jim,


1) Variables would be completely non-transactional. [...] A solution
to this problem would be to provide a plpgsql equivalent to plperl or
plpython's session hashes.


That is what I have in mind with "session variables" à la MS/MY SQL, but
at the SQL level, not PL/pgSQL level.


I'm just saying there might be use for a plpgsql equivalent to the 
session hashes that other PLs provide, but that's a different issue.



users are forced to create accessor functions,


Yes if the private variable should be accessed. If the variable is
private, then it is private and nothing is needed. Idem for public.


Why force the extra busywork? Just allow for them to be public.

For that matter, if we're going to start introducing private objects, 
that certainly needs to be thought through.



and you run a serious risk of confusion from getting the function
ownerships wrong.


One can get the permissions on special session variable wrong as well...
I do not see how it differs.


It's a lot harder to mess up an explicit grant than it is to mess up 
object ownership.



More importantly, the security definer trick you're suggesting has a
fatal flaw: you can't call one SECDEF function from another SECDEF
function.


I do not see why there would be such a restriction?

 postgres@db> CREATE FUNCTION secfunc() RETURNS TEXT SECURITY DEFINER
  AS $$ SELECT CURRENT_USER::TEXT; $$ LANGUAGE SQL;

 fabien@db> CREATE FUNCTION secfunc2() RETURNS TEXT SECURITY DEFINER
AS $$ SELECT secfunc() || ' - ' || CURRENT_USER; $$ LANGUAGE
SQL;

 *@db> SELECT secfunc2(); -- returns: "postgres - fabien" from both
sides...


Perhaps I've got the restrictions on SECDEF wrong, but I know there's 
problems with it. Certainly one issue is you can't change roles back to 
the calling user.


Maybe they would be workable in this case, but it's just a bunch of 
extra busywork for the user that serves no real purpose.



We should protect for the possibility of truly global (as in
cross-session) variables.


Yes, why not... Although having "cross-session session variable" seems
to create some problems of its own... Would they be cross-database as well?


Yes. It'd be a shared catalog.

...


Yes, you could simulate the same thing with functions, but why make
users do all that work if we could easily provide the same functionality?


The "easily" is unclear. E.g. if specially declared, permission-controlled,
partially persistent session variables preclude future basic session
variables, or their efficiency, or their syntax, it would be a problem. Hence
the point of discussing before proceeding.


Then IMHO what needs to happen is to have a discussion on actual syntax 
instead of calling into question the value of the feature. Following 
this thread has been very painful because the communications have not 
been very clear. Focus on grammar would probably be a big improvement in 
that regard.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] merging some features from plpgsql2 project

2016-12-28 Thread Pavel Stehule
2016-12-28 18:54 GMT+01:00 Jim Nasby :

> On 12/28/16 7:16 AM, Pavel Stehule wrote:
>
>> ** The real problem is that we have no mechanism for allowing a PL's
>> language/syntax/API to move forward without massive backwards
>> compatibility problems. **
>>
>>
>> We do not, but there are a few possibilities:
>>
>> 1. enhance #option command
>> 2. we can introduce PRAGMA command
>> https://en.wikipedia.org/wiki/Ada_(programming_language)#Pragmas
>> 
>>
>
> I wanted to break this out separately, because IMO it's the real heart of
> the matter.
>
> I think it would be silly not to allow a global setting of compatibility.
> You certainly don't want to force people to stick magic keywords in their
> code forevermore.
>
> To that end, would GUCs be a workable answer here? That should give you
> the ability to control incompatibilities at a function, user, database and
> global level. It would also allow you to chose between raising a WARNING vs
> a FATAL.
>

GUCs are fragile - the source code and the settings can become separated.

Our #option is more robust, because the source code holds all the flags
required for execution. So I would like to see a mechanism that is strongly
joined with the code.

Using a function-assigned GUC is similar, but it looks less robust - and
some editors can forget this information.

Lots of issues we can solve with plpgsql.extra_errors / extra_warnings - but
probably not all - for example the issue of the FOUND variable, or introducing
a new automatic variable ROW_COUNT. PL/pgSQL - like PL/SQL - is safe: it offers
the statement GET DIAGNOSTICS, but I understand that it isn't fun to write
GET DIAGNOSTICS rc = ROW_COUNT over and over. So some shortcuts can be nice,
but there is a risk that a shortcut breaks existing code, and the
costs/benefits are individual. There can never be 100% agreement. So some
customisation would be good.



>
> I realize we've had some bad experiences with compatibility GUCs in the
> past, but I'd argue we've also had some good experiences. I see that
> add_missing_from is now completely gone, for example, presumably with no
> complaints. There's probably several other compatibility GUCs we could
> remove now.
>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532)
>


Re: [HACKERS] pg_stat_activity.waiting_start

2016-12-28 Thread Tom Lane
Jim Nasby  writes:
> On 12/28/16 11:25 AM, Tom Lane wrote:
>> The idea of just capturing the wait start for heavyweight locks, and
>> not other lock types, still seems superior to any of the alternatives
>> that have been suggested ...

> Is some kind of alarm a viable option for the others? If setting the 
> alarm is cheap,

Setting an alarm is certain to require a gettimeofday and/or a kernel
call.  It is far from cheap.

regards, tom lane




Re: [HACKERS] pg_stat_activity.waiting_start

2016-12-28 Thread Jim Nasby

On 12/28/16 11:25 AM, Tom Lane wrote:

The idea of just capturing the wait start for heavyweight locks, and
not other lock types, still seems superior to any of the alternatives
that have been suggested ...


Is some kind of alarm a viable option for the others? If setting the 
alarm is cheap, you could just set one for say 5ms when you have to wait 
on a lock. If setting one is too expensive perhaps one could be set at 
transaction start that looks at a global (and the global would be set 
when a wait started). Obviously that means there's inaccuracy to the 
true time spent waiting, but I think that's certainly fine for 
pg_stat_activity. Most importantly, it would mean that if something has 
gone horribly wrong you'd at least have some kind of relatively accurate 
duration to work from.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] merging some features from plpgsql2 project

2016-12-28 Thread Jim Nasby

On 12/28/16 7:16 AM, Pavel Stehule wrote:



2016-12-28 5:09 GMT+01:00 Jim Nasby <jim.na...@bluetreble.com>:

On 12/27/16 4:56 PM, Merlin Moncure wrote:

On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule
<pavel.steh...@gmail.com> wrote:
Which is why this is an external fork of plpgsql.


OK. I just would not want to repeat the Perl 6 or Python 3 story - it is a big
adventure, but a big failure too.


Yeah, creating an entirely "new" PL to deal with compatibility doesn't 
seem like a good idea to me.



** The real problem is that we have no mechanism for allowing a PL's
language/syntax/API to move forward without massive backwards
compatibility problems. **


We do not, but there are a few possibilities:

1. enhance #option command
2. we can introduce PRAGMA command
https://en.wikipedia.org/wiki/Ada_(programming_language)#Pragmas


See separate reply.




I'm honestly surprised (even shocked) that you've never run into any
of the problems plpgsql2 is trying to solve. I've hit all those
problems except for OUT parameters. I'd say the order they're listed
in actually corresponds to how often I hit the problems.


I hit lots of older, harder (now solved) issues - now, with more
experience, I am able to see these issues. And I wrote plpgsql_check,
partially for myself too. Years ago I preferred safe expressions.


Recognizing a problem ahead of time (or having plpgsql_check do it for 
you) still means you have to find a way to work around it. In some cases 
(ie: STRICT), that workaround can be a serious PITA. Better to just 
eliminate the problem itself.



I think trying to move the ball forward in a meaningful way without
breaking compatibility is a lost cause. Some of these issues could
be addressed by adding more syntax, but even that has limits (do we
really want another variation of STRICT that allows only 0 or 1
rows?). And there's no way to fix your #1 item below without
breaking compatibility.


I think there is a way with extra checks, or with persistent plpgsql
options - just use them, please. Some checks are clear, some others not.


I will assert that there will ALWAYS be problems that you can't plaster 
over with some kind of extra checking (like plpgsql_check). At some 
point, in order to fix those, you have to somehow break compatibility.


Look at libpq as an example. There's a reason we're on protocol V3.


If you know ALGOL family languages, then it is not a problem. What is a


Let's be realistic... what % of our users have even heard of ALGOL, let 
alone used it? :)



harder problem for people is our different implementation of mixing SQL and PL
- different from Oracle or MSSQL. Our model is better and simpler, but
different. It is difficult for people without knowledge of the differences
between functions and procedures. Partly we are at fault for calling our
void functions procedures.


I suspect that's only confusing for people coming from Oracle (which of 
course is a non-trivial number of people).



#6: The variations of syntax between the FOR variants is annoying
(specifically, FOREACH necessitating the ARRAY keyword).


this is by design - FOR is the old PL/SQL syntax. FOREACH is prepared for
future extension


Understood. It still sucks though. :)


#8: EVERYTHING command option should accept a variable. In
particular, RAISE should accept a variable for level, but there's
other cases of this I've run into. It'd also be nice if you could
plop variables into SQL commands where you'd have an identifier,
though presumably that would require some kind of explicit variable
identifier.


It is hiding dynamic SQL - I am strongly against it - at minimum due to
performance issues. Important functionality should not be hidden.


There's definitely ways around the performance issue. I do agree that it 
needs to be clear when you're doing something dynamic so it's not 
accidental. One way to do that would be to add support for variable 
decorators and mandate the use of decorators when using a variable for 
an identifier.


That said, *every* option to RAISE can be a variable except the level. 
That's just plain silly and should be fixed.



#13: cstring support would allow a lot more people to experiment
with things like custom types. Yes, plpgsql might be slow as hell
for this, but sometimes that doesn't matter. Even if it does, it can
be a lot easier to prototype in something other than C. (Granted, I
think there's some non-plpgsql stuff that would need to happen to
allow this.)


Not sure about it (I have really, really bad experience with some
developers regarding performance) - but PL/Python and PL/Perl can do it well,
and I miss some possibilities - we could use transformations more - SQL/MM
is based on new datatypes and transformations.


Well, there's probably some other things that could be done to make 
plpgsql perform better in this regard. One thing I've wondered about is 
allowing array-like access to a plain

Re: [HACKERS] merging some features from plpgsql2 project

2016-12-28 Thread Jim Nasby

On 12/28/16 7:16 AM, Pavel Stehule wrote:

** The real problem is that we have no mechanism for allowing a PL's
language/syntax/API to move forward without massive backwards
compatibility problems. **


We do not, but there are a few possibilities:

1. enhance #option command
2. we can introduce PRAGMA command
https://en.wikipedia.org/wiki/Ada_(programming_language)#Pragmas



I wanted to break this out separately, because IMO it's the real heart 
of the matter.


I think it would be silly not to allow a global setting of 
compatibility. You certainly don't want to force people to stick magic 
keywords in their code forevermore.


To that end, would GUCs be a workable answer here? That should give you 
the ability to control incompatibilities at a function, user, database 
and global level. It would also allow you to chose between raising a 
WARNING vs a FATAL.


I realize we've had some bad experiences with compatibility GUCs in the 
past, but I'd argue we've also had some good experiences. I see that 
add_missing_from is now completely gone, for example, presumably with no 
complaints. There's probably several other compatibility GUCs we could 
remove now.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Pavel Stehule
2016-12-28 17:53 GMT+01:00 Jim Nasby :

> On 12/28/16 9:57 AM, Fabien COELHO wrote:
>
>> * Other later triggers, etc, also reference USER_IS_AUDITOR
>>>
>>
>> The variable is not directly referenced, one would have to call
>> isUserAuditor() to access the private session value, but then you can
>> GRANT/REVOKE whatever you want on the access function.
>>
>
> Why force users to create Yet Another Function as a getter?
>
> There's 2 big points that I think keep getting missed:
>
> 1) Variables would be completely non-transactional. The only way you can
> do that today is to use a "non-standard" language (such as plperl or
> plpython), or by creating a custom GUC (which is ugly because it
> necessitates changing postgresql.conf and is only text). A solution to this
> problem would be to provide a plpgsql equivalent to plperl or plpython's
> session hashes. I'm sure there are use cases that would be satisfied by
> simple doing that, but...
>
> 2) Variables provide permissions. Theoretically you could allow the
> hypothetical plpgsql session variables in (1) to be marked private, but
> that means you now have to keep all those variables on a per-role basis,
> users are forced to create accessor functions, and you run a serious risk
> of confusion from getting the function ownerships wrong. That certainly
> seems no better than defining permanent variables and giving them
> permissions (as Pavel suggested). More importantly, the security definer
> trick you're suggesting has a fatal flaw: you can't call one SECDEF
> function from another SECDEF function. So as soon as you have multiple
> privileged roles making use of variables, there's a serious risk of not
> being able to make use of these private variables at all.
>
> Now maybe pg_class is absolutely the wrong place to store info about
> predefined variables, but that's an implementation detail, not a design
> flaw.
>

We can talk about implementation - for me a variable is an object that holds
data - moreover, I would like to anchor this object in a schema without
possible collisions with tables. I plan to support SELECT for access and
UPDATE for changes in the future - so using pg_class is natural. On the other
hand, lots of fields in pg_class are not used for variables at all. I would
like to anchor variables in some schema, but I would not want to have to
resolve possible collisions between variables and other SQL objects - due to
possible direct access inside SQL statements in the future.



>
> Some other points:
> We should protect for the possibility of truly global (as in
> cross-session) variables. Presumably these would have to be pre-defined via
> DDL before use. These would be uniquely valuable as a means of
> communication between sessions that are connected to different databases. I
> could also see use in cross-database in-memory queues. AFAIK both of these
> would be pretty easy to do with the shared memory infrastructure we now
> have.
>


I didn't write anything that would close off this possibility.


>
> It would be nice if we could come up with a plan for what permanently
> defined temp tables looked like, so the syntax and operation was similar to
> the permanently defined session variables that Pavel is proposing. That
> said, given how long that has been an open issue I think it's completely
> unfair to stonewall this feature if we can't get permanent temp tables
> figured out.
>
> While permanent temp tables would eliminate some objections to store
> "session variables", the fact still remains that any kind of table would
> still be MVCC, and that is NOT always what you want.
>
> It would be nice if whatever syntax was decided for defined session
> variables allowed room for "variables" that were actually MVCC, because
> sometimes that actually is what you want. Yes, you could simulate the same
> thing with functions, but why make users do all that work if we could
> easily provide the same functionality? These should probably be called
> something other than "variables", but presumably all the other syntax and
> settings could be the same. Again, it's not the job of this proposal to
> boil that ocean, but it would be nice to leave the option open.
>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532)
>


Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Fabien COELHO


Hello Jim,

1) Variables would be completely non-transactional. [...] A solution to 
this problem would be to provide a plpgsql equivalent to plperl or 
plpython's session hashes.


That is what I have in mind with "session variables" à la MS/MY SQL, but 
at the SQL level, not PL/pgSQL level.


2) Variables provide permissions. Theoretically you could allow the 
hypothetical plpgsql session variables in (1) to be marked private, but 
that means you now have to keep all those variables on a per-role basis,


No, I think a common "session hash" could be used, with some basic prefix 
convention; there is no need for a per-role hash.



users are forced to create accessor functions,


Yes if the private variable should be accessed. If the variable is 
private, then it is private and nothing is needed. Idem for public.


and you run a serious risk of confusion from getting the function 
ownerships wrong.


One can get the permissions on special session variable wrong as well... I 
do not see how it differs.


That certainly seems no better than defining permanent variables and 
giving them permissions (as Pavel suggested).


As far as permissions is concerned, ISTM that it is the same.

The point is that there would be only the usual fast "session variables", like 
MS/MY/Oracle SQL, and they could be used for Pavel's rather special use case 
as well.


More importantly, the security definer trick you're suggesting has a 
fatal flaw: you can't call one SECDEF function from another SECDEF 
function.


I do not see why there would be such a restriction?

 postgres@db> CREATE FUNCTION secfunc() RETURNS TEXT SECURITY DEFINER
  AS $$ SELECT CURRENT_USER::TEXT; $$ LANGUAGE SQL;

 fabien@db> CREATE FUNCTION secfunc2() RETURNS TEXT SECURITY DEFINER
AS $$ SELECT secfunc() || ' - ' || CURRENT_USER; $$ LANGUAGE SQL;

 *@db> SELECT secfunc2(); -- returns: "postgres - fabien" from both sides...

So as soon as you have multiple privileged roles making use of 
variables, there's a serious risk of not being able to make use of these 
private variables at all.


?

Now maybe pg_class is absolutely the wrong place to store info about 
predefined variables, but that's an implementation detail, not a design 
flaw.


Yes, but is not my only point.

We should protect for the possibility of truly global (as in cross-session) 
variables.


Yes, why not... Although having "cross-session session variable" seems to 
create some problems of its own... Would they be cross-database as well?



Presumably these would have to be pre-defined via DDL before use.


Probably.


These would be uniquely valuable as a means of communication between sessions
that are connected to different databases.


Why not.

I could also see use in cross-database in-memory queues. AFAIK both of 
these would be pretty easy to do with the shared memory infrastructure 
we now have.


Yep, that would mean keeping some "session hash" in shared buffers.

It would be nice if we could come up with a plan for what permanently defined 
temp tables looked like, so the syntax and operation was similar to the 
permanently defined session variables that Pavel is proposing.


Yes. And basic session variables as well.

That said, given how long that has been an open issue I think it's 
completely unfair to stonewall this feature if we can't get permanent 
temp tables figured out.


...

While permanent temp tables would eliminate some objections to store "session 
variables", the fact still remains that any kind of table would still be 
MVCC, and that is NOT always what you want.


Yes, I understood that. But sometimes it may be what is wanted.

It would be nice if whatever syntax was decided for defined session variables 
allowed room for "variables" that were actually MVCC, because sometimes that 
actually is what you want.


Yep, that is what I'm arguing about, having a global variable design, and 
then maybe inside this design just a special case implemented.


Yes, you could simulate the same thing with functions, but why make 
users do all that work if we could easily provide the same 
functionality?


The "easily" is unclear. E.g. if specially declared, permission-controlled, 
partially persistent session variables preclude future basic session 
variables, or their efficiency, or their syntax, it would be a problem. Hence 
the point of discussing before proceeding.


These should probably be called something other than 
"variables", but presumably all the other syntax and settings could be 
the same. Again, it's not the job of this proposal to boil that ocean, 
but it would be nice to leave the option open.


Yep.

--
Fabien.


Re: [HACKERS] Reporting planning time with EXPLAIN

2016-12-28 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> >> I think it's an awful choice of name; it has nothing to do with either
> >> the functionality or the printed name of the field.
> 
> > As an example, we might some day wish to include a summary of buffer
> > information at the bottom too when 'buffers' is used.  The proposed
> > 'summary' option would cover that nicely, but 'timing' wouldn't.  That's
> > actually why I was thinking summary might be a good option to have.
> 
> What, would this option then turn off the total-time displays by default?

To retain our current mixed behavior with explain vs. explain-analyze,
we'd have to say it defaults to off for explain and on with analyze.  I
don't particularly like that, and would rather we just default it to on,
but that would mean adjusting the regression tests.

> I do not see that being a reasonable thing to do.  Basically, you're
> taking what seems like a very general-purpose option name and nailing
> it down to mean "print planning time".  You aren't going to be able
> to change that later.

No, that's not what I was suggesting to do and I disagree that we
couldn't ever change it later.  If we want it to mean "print planning
time" and only ever that then I agree that calling it "summary" isn't a
good option.

> > No, but consider how the docs for the current 'timing' option would have
> > to be rewritten.
> 
> Well, sure, they'd have to be rewritten, but I think this definition
> would actually be more orthogonal.

This definition would have two completely different meanings- one for
when analyze is used, and one for when it isn't.

> > We would also have to say something like "the default when not using
> > 'analyze' is off, but with 'analyze' the default is on" which seems
> > pretty grotty to me.
> 
> But the default for TIMING already does depend on ANALYZE.

I would argue that timing can only actually be used with analyze today,
which makes sense when you consider that timing is about enabling or
disabling per-node timing information.  Redefining it to mean something
else isn't particularly different from redefining 'summary' later to
mean something else.

> > Then again, from a *user's* perspective, it should just be included by
> > default.
> 
> Actually, the reason it hasn't gotten included is probably that the
> use-case for it is very small.  If you just do psql \timing on an
> EXPLAIN, you get something close enough to the planning time.  I don't
> mind adding this as an option, but claiming that it's so essential
> that it should be there by default is silly.  People would have asked
> for it years ago if it were all that important.

I don't buy this argument.  Planning time is (hopefully, anyway...) a
rather small amount of time which means that the actual results from
\timing (or, worse, the timing info from other tools like pgAdmin) is
quite far off.  On a local instance with a simple plan, you can get an
order-of-magnitude difference between psql's \timing output and the
actual planning time, throw in a few or even 10s of ms of network
latency and you might as well forget about trying to figure out what
the planning time actually is.

> > Having the summary option exposed would also provide a way for Andres to
> > do what he wanted to originally from the referred-to thread.  There may
> > be other pieces to address if the plan might involve platform-specific
> > details about sorts, etc, but from what he was suggesting that wouldn't
> > be an issue for his initial case, and as Robert mentioned on that
> > thread, we could do something about those other cases too.  I don't
> > think having 'timing' or 'whatever controls showing planning and total
> > execution times at the bottom' would make sense as an option to disable
> > showing platform-specific sort or hashing info though.
> 
> Again, you're proposing that you can add an option today and totally
> redefine what it means tomorrow.  I do not think that's a plan.

The above paragraph was intended to suggest that we could add 'summary'
now to control the last few lines which are displayed after the plan
(which would be a consistent definition of 'summary', even if we added
things to the summary contents later) and that we could then add an
independent option to control the output of plan nodes like 'sort' to
allow for platform-independent output.  I was not suggesting that
'summary' would control what 'sort' produces.

Thanks!

Stephen




Re: [HACKERS] pg_stat_activity.waiting_start

2016-12-28 Thread Tom Lane
Jim Nasby  writes:
> On 12/28/16 7:10 AM, Amit Kapila wrote:
>> Can we think of introducing new guc trace_system_waits or something
>> like that which will indicate that the sessions will report the value
>> of wait_start in pg_stat_activity?

> In my experience the problem with those kind of settings is that they're 
> never enabled when you actually need them.

Yeah.  And they especially won't be enabled in production situations,
if people find out that they cause lots of overhead.

> I think it'd be much better 
> to find a way to always capture wait_starts that are over some minimum 
> duration, where collection overhead won't matter but you still have some 
> good info about what's going on. For pg_stat_activity I'd think that 
> threshold would be on the order of 50-100ms, though maybe there's other 
> places where a tighter tolerance would help.

The idea of just capturing the wait start for heavyweight locks, and
not other lock types, still seems superior to any of the alternatives
that have been suggested ...

regards, tom lane




Re: [HACKERS] pg_stat_activity.waiting_start

2016-12-28 Thread Jim Nasby

On 12/28/16 7:10 AM, Amit Kapila wrote:

Can we think of introducing new guc trace_system_waits or something
like that which will indicate that the sessions will report the value
of wait_start in pg_stat_activity?  The default value of such a
parameter can be false which means wait_start will be shown as NULL in
pg_stat_activity and when it is enabled the wait_start can show the
time as proposed in this thread.


In my experience the problem with those kind of settings is that they're 
never enabled when you actually need them. I think it'd be much better 
to find a way to always capture wait_starts that are over some minimum 
duration, where collection overhead won't matter but you still have some 
good info about what's going on. For pg_stat_activity I'd think that 
threshold would be on the order of 50-100ms, though maybe there's other 
places where a tighter tolerance would help.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] Hooks

2016-12-28 Thread Jim Nasby

On 12/28/16 10:43 AM, David Fetter wrote:

Callbacks aren't easy to find either.

Should callbacks be another chapter in the docs?


That would also be nice, but I suspect that will be harder than finding 
all the hooks.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] Hooks

2016-12-28 Thread Jim Nasby

On 12/27/16 11:14 PM, David Fetter wrote:

Sure, but that seems like an effort somewhat orthogonal to the one I
proposed, which is to get some user-facing i.e. SGML docs up for the
current hooks.


My point was that a (presumably small) amount of effort towards 
earmarking hooks in code so that a tool can find them would be a step in 
the direction of allowing documentation to be generated direct from 
code, which several people agreed would be very worthwhile.


Is that effort *required* to meet your goal? No.

Does the journey of 1,000 miles start with a single step? Yes. So if 
there's a reasonably easy way to start that journey here, please 
consider doing so.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] Reporting planning time with EXPLAIN

2016-12-28 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> I think it's an awful choice of name; it has nothing to do with either
>> the functionality or the printed name of the field.

> As an example, we might some day wish to include a summary of buffer
> information at the bottom too when 'buffers' is used.  The proposed
> 'summary' option would cover that nicely, but 'timing' wouldn't.  That's
> actually why I was thinking summary might be a good option to have.

What, would this option then turn off the total-time displays by default?
I do not see that being a reasonable thing to do.  Basically, you're
taking what seems like a very general-purpose option name and nailing
it down to mean "print planning time".  You aren't going to be able
to change that later.

> No, but consider how the docs for the current 'timing' option would have
> to be rewritten.

Well, sure, they'd have to be rewritten, but I think this definition
would actually be more orthogonal.

> We would also have to say something like "the default when not using
> 'analyze' is off, but with 'analyze' the default is on" which seems
> pretty grotty to me.

But the default for TIMING already does depend on ANALYZE.

> Then again, from a *user's* perspective, it should just be included by
> default.

Actually, the reason it hasn't gotten included is probably that the
use-case for it is very small.  If you just do psql \timing on an
EXPLAIN, you get something close enough to the planning time.  I don't
mind adding this as an option, but claiming that it's so essential
that it should be there by default is silly.  People would have asked
for it years ago if it were all that important.

> Having the summary option exposed would also provide a way for Andres to
> do what he wanted to originally from the referred-to thread.  There may
> be other pieces to address if the plan might involve platform-specific
> details about sorts, etc, but from what he was suggesting that wouldn't
> be an issue for his initial case, and as Robert mentioned on that
> thread, we could do something about those other cases too.  I don't
> think having 'timing' or 'whatever controls showing planning and total
> execution times at the bottom' would make sense as an option to disable
> showing platform-specific sort or hashing info though.

Again, you're proposing that you can add an option today and totally
redefine what it means tomorrow.  I do not think that's a plan.

regards, tom lane




Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Jim Nasby

On 12/28/16 9:57 AM, Fabien COELHO wrote:

* Other later triggers, etc, also reference USER_IS_AUDITOR


The variable is not directly referenced, one would have to call
isUserAuditor() to access the private session value, but then you can
GRANT/REVOKE whatever you want on the access function.


Why force users to create Yet Another Function as a getter?

There's 2 big points that I think keep getting missed:

1) Variables would be completely non-transactional. The only way you can 
do that today is to use a "non-standard" language (such as plperl or 
plpython), or by creating a custom GUC (which is ugly because it 
necessitates changing postgresql.conf and is only text). A solution to 
this problem would be to provide a plpgsql equivalent to plperl or 
plpython's session hashes. I'm sure there are use cases that would be 
satisfied by simple doing that, but...


2) Variables provide permissions. Theoretically you could allow the 
hypothetical plpgsql session variables in (1) to be marked private, but 
that means you now have to keep all those variables on a per-role basis, 
users are forced to create accessor functions, and you run a serious 
risk of confusion from getting the function ownerships wrong. That 
certainly seems no better than defining permanent variables and giving 
them permissions (as Pavel suggested). More importantly, the security 
definer trick you're suggesting has a fatal flaw: you can't call one 
SECDEF function from another SECDEF function. So as soon as you have 
multiple privileged roles making use of variables, there's a serious 
risk of not being able to make use of these private variables at all.


Now maybe pg_class is absolutely the wrong place to store info about 
predefined variables, but that's an implementation detail, not a design 
flaw.


Some other points:
We should protect for the possibility of truly global (as in 
cross-session) variables. Presumably these would have to be pre-defined 
via DDL before use. These would be uniquely valuable as a means of 
communication between sessions that are connected to different 
databases. I could also see use in cross-database in-memory queues. 
AFAIK both of these would be pretty easy to do with the shared memory 
infrastructure we now have.


It would be nice if we could come up with a plan for what permanently 
defined temp tables looked like, so the syntax and operation was similar 
to the permanently defined session variables that Pavel is proposing. 
That said, given how long that has been an open issue I think it's 
completely unfair to stonewall this feature if we can't get permanent 
temp tables figured out.


While permanent temp tables would eliminate some objections to store 
"session variables", the fact still remains that any kind of table would 
still be MVCC, and that is NOT always what you want.


It would be nice if whatever syntax was decided for defined session 
variables allowed room for "variables" that were actually MVCC, because 
sometimes that actually is what you want. Yes, you could simulate the 
same thing with functions, but why make users do all that work if we 
could easily provide the same functionality? These should probably be 
called something other than "variables", but presumably all the other 
syntax and settings could be the same. Again, it's not the job of this 
proposal to boil that ocean, but it would be nice to leave the option open.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] Duplicate node tag assignments

2016-12-28 Thread Tom Lane
Andres Freund  writes:
> On 2016-12-28 11:33:31 -0500, Tom Lane wrote:
>> By chance I happened to notice that the recent partition patch pushed
>> us over the number of available node tags between
>> So I'm leaning to the second, more drastic, solution.  Thoughts?

> Alternatively we could add a -Wduplicate-enum/-Werror=duplicate-enum for
> clang. That'd protect against that in all enums...

Meh ... I'm not sure that we want to forbid it in *all* enums, just this
one.  Anyway, a lot of us are not using clang, and I could easily see
wasting a great deal of time identifying a bug caused by this sort of
conflict.

regards, tom lane




Re: [HACKERS] Reporting planning time with EXPLAIN

2016-12-28 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote:
> >>> I am not sure whether using *summary* to print just planning time is a
> >>> good idea.  Another option could be SUMMARY_PLAN_TIME.
> 
> > Using 'summary' seems entirely reasonable to me,
> 
> I think it's an awful choice of name; it has nothing to do with either
> the functionality or the printed name of the field.  And I could imagine
> future uses of "summary" to mean something much different, like say an
> actual summary.  (The dictionary meaning of "summary" is "a brief
> restatement of the main points of something"; adding new information
> is not even approximately within the meaning.)

I don't particularly agree with this.  It's a summary of the timing
information today (and at least one dictionary agrees that "summary" can
also mean "Formed into a sum") and in the future it certainly could be
expanded on, in which case I'd expect the 'summary' option to control
whatever else is added there.

As an example, we might some day wish to include a summary of buffer
information at the bottom too when 'buffers' is used.  The proposed
'summary' option would cover that nicely, but 'timing' wouldn't.  That's
actually why I was thinking summary might be a good option to have.

> How about just saying that the existing TIMING option turns this on,
> if it's specified without ANALYZE?  Right now that combination draws
> an error:
> 
>   regression=# explain (timing on) select 1;
>   ERROR:  EXPLAIN option TIMING requires ANALYZE
> 
> so there's no existing usage that this would break.

No, but consider how the docs for the current 'timing' option would have
to be rewritten.  Further, I'd expect to include examples of the ability
to show planning time in 14.1 as that's something that regular users
will actually be using, unlike 'timing's current usage which is, in my
experience anyway, not terribly common.

We would also have to say something like "the default when not using
'analyze' is off, but with 'analyze' the default is on" which seems
pretty grotty to me.

Then again, from a *user's* perspective, it should just be included by
default.  The reason it isn't hasn't got anything to do with what our
*users* want, in my opinion at least.

Having the summary option exposed would also provide a way for Andres to
do what he wanted to originally from the referred-to thread.  There may
be other pieces to address if the plan might involve platform-specific
details about sorts, etc, but from what he was suggesting that wouldn't
be an issue for his initial case, and as Robert mentioned on that
thread, we could do something about those other cases too.  I don't
think having 'timing' or 'whatever controls showing planning and total
execution times at the bottom' would make sense as an option to disable
showing platform-specific sort or hashing info though.

Thanks!

Stephen




Re: [HACKERS] [PATCH] Fix for documentation of timestamp type

2016-12-28 Thread Cynthia Shang
The latest patch attachment has a couple typos in it ("storead" instead of 
"stored"). I interpreted the final suggestion in the thread to mean 1) default 
stores in microseconds 2) deprecated compile-time option stores as seconds.  If 
these assumptions are correct then the suggestion in the thread (minus 
"instead" as Tom suggested) provided below should be incorporated and attached 
as a patch to this thread. Therefore I recommend an "Awaiting Author" status.

When timestamp values are stored as eight-byte integers (currently the 
default), microsecond precision is available over the full range of values.  In 
this case, the internal representation is the number of microseconds before or 
after midnight 2000-01-01. When timestamp values are stored as double 
precision floating-point numbers (a deprecated compile-time option), the 
internal representation is the number of seconds before or after midnight 
2000-01-01.  With this representation, the effective limit of precision might 
be less than 6; in practice, microsecond precision is achieved for dates within 
a few years of 2000-01-01, but the precision degrades for dates further away. 
Note that using floating-point datetimes allows a larger range of 
timestamp values to be represented than shown above: from 4713 BC 
up to 5874897 AD.
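As a quick arithmetic illustration of the default integer representation
described above (a standalone sketch, not PostgreSQL source; the type name is
made up):

    #include <stdint.h>

    /* Default representation: microseconds before/after 2000-01-01 00:00:00. */
    typedef int64_t pg_timestamp_sketch;

    /*
     * 2000-01-03 00:00:00 is exactly two days after the epoch:
     * 2 days * 86400 s/day * 1,000,000 usec/s = 172,800,000,000.
     */
    static const pg_timestamp_sketch two_days_after_epoch = INT64_C(172800000000);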

Thanks,
-Cynthia




Re: [HACKERS] Hooks

2016-12-28 Thread David Fetter
On Wed, Dec 28, 2016 at 12:19:11PM +0800, Craig Ringer wrote:
> On 28 December 2016 at 12:15, Jim Nasby  wrote:
> 
> > Can we reduce the scope of this to a manageable starting point?
> > I'm guessing that all existing hooks share certain characteristics
> > that it'd be pretty easy to detect. If you can detect the hook
> > (which I guess means finding a static variable with hook in the
> > name) then you can verify that there's an appropriate comment
> > block. I'm guessing someone familiar with tools like doxygen could
> > set that up without too much effort, and I'd be surprised if the
> > community had a problem with it.
> 
> Let's just make sure the comment blocks are nice and grep-able too.
> 
> I think this is a great idea FWIW. Discovering the extension points
> within Pg isn't easy.
> 
> Callbacks aren't easy to find either.

Should callbacks be another chapter in the docs?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Duplicate node tag assignments

2016-12-28 Thread Andres Freund
Hi,

On 2016-12-28 11:33:31 -0500, Tom Lane wrote:
> By chance I happened to notice that the recent partition patch pushed
> us over the number of available node tags between
> 
>   T_A_Expr = 900,

> So I'm leaning to the second, more drastic, solution.  Thoughts?

Alternatively we could add a -Wduplicate-enum/-Werror=duplicate-enum for
clang. That'd protect against that in all enums...
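
For illustration only (made-up tag names and values), this is the kind of
implicit collision such a warning reports; clang warns when an enumerator
implicitly picks up a value that another enumerator already has, which is
exactly the NodeTag situation:

typedef enum DemoTag
{
	T_GroupOneStart = 949,
	T_GroupOneNext,				/* implicitly 950 */
	T_GroupTwoStart = 950		/* manually assigned break: now a duplicate */
} DemoTag;
/* clang with -Wduplicate-enum warns here that T_GroupOneNext is implicitly
 * assigned 950, a value another element has already been assigned. */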

Andres




[HACKERS] Duplicate node tag assignments

2016-12-28 Thread Tom Lane
By chance I happened to notice that the recent partition patch pushed
us over the number of available node tags between

T_A_Expr = 900,

and

T_TriggerData = 950,		/* in commands/trigger.h */

Specifically we now have some of the replication grammar node type
codes conflicting with some of the "random other stuff" tags.
It's not terribly surprising that that hasn't caused visible problems,
because of the relatively localized use of replication grammar nodes,
but it's still a Bad Thing.  And it's especially bad that apparently
no one's compiler has complained about it.

So we need to renumber enum NodeTag a bit, which is no big deal,
but I think we had better take thought to prevent a recurrence
(which might have worse consequences next time).

One idea is to add StaticAsserts someplace asserting that there's
still daylight at each manually-assigned break in the NodeTag list.
But I'm not quite sure how to make that work.  For example, asserting
that T_PlanInvalItem < T_PlanState won't help if people add new nodes
after PlanInvalItem and don't think to update the Assert.
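
A minimal sketch of that idea (plain C11 _Static_assert here; in-tree it would
use the StaticAssert macros from c.h), which also shows the weakness just
described - the assertion only guards the tags actually named in it:

/* Assert there is still daylight between the end of one manually-numbered
 * group and the start of the next.  If new tags are appended after
 * T_PlanInvalItem without this line being updated, they are not covered. */
_Static_assert(T_PlanInvalItem < T_PlanState,
			   "node tags have overflowed into the next manually-numbered group");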

Or we could just abandon the manually-assigned breaks in the list
altogether, and let NodeTags run from 1 to whatever.  This would
slightly complicate debugging, in that the numeric values of node
tags would change more than they used to, but on the whole that does
not sound like a large problem.  When you're working in gdb, say,
it's easy enough to convert:

(gdb) p (int) T_CreateReplicationSlotCmd
$8 = 950
(gdb) p (enum NodeTag) 949
$9 = T_BaseBackupCmd

So I'm leaning to the second, more drastic, solution.  Thoughts?

regards, tom lane




Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Pavel Stehule
2016-12-28 16:57 GMT+01:00 Fabien COELHO :

>
> My 0.02€ to try to illustrate a possible private session variable based
> implementation for this use case:
>
> * Session starts
>>
>
> \c app
>
> * app does SELECT setup_user('user-auth-key-data', 'some-other-blob')
>>
>
> SELECT setup_user('fjshdfjkshfjks', 'jklfsjfklsjfk');
>
> **  setup_user is SECURITY DEFINER to 'appadmin'
>>
>
> -- appadmin did:
> CREATE FUNCTION setup_user(TEXT, TEXT)
>   RETURNS BOOLEAN SECURITY DEFINER AS $$
>
> **  'appadmin' owns a variable IS_AUDITOR. Other roles have only read
>> access to it.
>>
>
>   not sure how it is used afterwards... is it the same as USER_IS_AUDITOR?
>
> **  setup_user(...) does whatever expensive/slow work it has to do
>>
>
> ... checks, updates, whatever
>
> **   setup_user sets USER_IS_AUDITOR var
>>
>
> -- declare a private session variable
> DECLARE @user_is_auditor BOOLEAN PRIVATE;
> -- set its value to whatever appropriate
> SET @user_is_auditor = ???;
> --- returns its value
> RETURN @user_is_auditor;
> $$ LANGUAGE xxx;
>
> * Later RLS policies simply reference USER_IS_AUDITOR var. They don't
>> need to know the 'user-auth-key-data', or do whatever expensive
>> processing that it does.
>>
>
> -- appadmin did:
> CREATE FUNCTION isUserAuditor()
>RETURNS BOOLEAN SECURITY DEFINER AS $$
>-- say variable is just confirmed if it exists already in session?
>DECLARE @user_is_auditor BOOLEAN PRIVATE;
>RETURN @user_is_auditor;
> $$ LANGUAGE xxx;
>
> * Other later triggers, etc, also reference USER_IS_AUDITOR
>>
>
> The variable is not directly referenced, one would have to call
> isUserAuditor() to access the private session value, but then you can
> GRANT/REVOKE whatever you want on the access function.
>
> * User cannot make themselves an auditor by SETting USER_IS_AUDITOR
>>
>
> Indeed, the user cannot access the private variable, only appadmin can,
> and probably root could.
>
> The user could create its own private session variable @user_is_auditor,
> or a public session variable of the same name. That would be distinct
> variables which would not influence isUserAuditor which would use its own.
>

So which is worse - I added one new entry in pg_class and one entry in
pg_attribute.  You wrote two pg_proc entries (functions) - and on top of that
you have to ensure the consistency of those functions.

Regards

Pavel




>
> --
> Fabien.


Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Fabien COELHO


My 0.02€ to try to illustrate a possible private session variable based 
implementation for this use case:



* Session starts


\c app


* app does SELECT setup_user('user-auth-key-data', 'some-other-blob')


SELECT setup_user('fjshdfjkshfjks', 'jklfsjfklsjfk');


**  setup_user is SECURITY DEFINER to 'appadmin'


-- appadmin did:
CREATE FUNCTION setup_user(TEXT, TEXT)
  RETURNS BOOLEAN SECURITY DEFINER AS $$


**  'appadmin' owns a variable IS_AUDITOR. Other roles have only read
access to it.


  not sure how it is used afterwards... is it the same as USER_IS_AUDITOR?


**  setup_user(...) does whatever expensive/slow work it has to do


... checks, updates, whatever


**   setup_user sets USER_IS_AUDITOR var


-- declare a private session variable
DECLARE @user_is_auditor BOOLEAN PRIVATE;
-- set its value to whatever appropriate
SET @user_is_auditor = ???;
--- returns its value
RETURN @user_is_auditor;
$$ LANGUAGE xxx;


* Later RLS policies simply reference USER_IS_AUDITOR var. They don't
need to know the 'user-auth-key-data', or do whatever expensive
processing that it does.


-- appadmin did:
CREATE FUNCTION isUserAuditor()
   RETURNS BOOLEAN SECURITY DEFINER AS $$
   -- say variable is just confirmed if it exists already in session?
   DECLARE @user_is_auditor BOOLEAN PRIVATE;
   RETURN @user_is_auditor;
$$ LANGUAGE xxx;


* Other later triggers, etc, also reference USER_IS_AUDITOR


The variable is not directly referenced, one would have to call 
isUserAuditor() to access the private session value, but then you can 
GRANT/REVOKE whatever you want on the access function.



* User cannot make themselves an auditor by SETting USER_IS_AUDITOR


Indeed, the user cannot access the private variable, only appadmin can, 
and probably root could.


The user could create its own private session variable @user_is_auditor, 
or a public session variable of the same name. That would be distinct 
variables which would not influence isUserAuditor which would use its own.


--
Fabien.


Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Pavel Stehule
2016-12-28 16:42 GMT+01:00 Fabien COELHO :

>
> Hello,
>
> Why should they? If it is a session variable, being created when needed or
>>> used with the right type could be enough?
>>>
>>
>> You cannot to trust some fuzzy object - or you have to play hard game with
>> securing content - hashing, coding, decoding - it is slow, cpu intensive
>>
>
> I'm afraid that I do not understand. I do not think that a hash get in
> memory is particularly costly. I do not see how you could do less that that
> to manage variables with associated values.
>
> Currently the big issue of plpgsql_check is work with temporary tables.
>>>
>>
>> No, I mean so when you use temporary tables inside plpgsql functions, then
>> the static analyze like plpgsql check is almost impossible.
>>
>
> I cannot to speak instead you, but lot of people prefer static analyze of
>> code.
>>
>
> Hmmm... I know a little bit about that field, ISTM that you are speaking
> of the current capabilities of a particular static analysis tool, but I'm
> not sure that the tool capabilities could not be enhanced to manage more
> things.
>

It cannot be - static analysis is limited to the function scope - in static
analysis you don't know the order of calls.


>
> The static analyze can be done only on static (persistent metadata).
>>
>
> A session variable is a bit like a global temporary variable. A static
> analysis tool could take into account a global temporary variable.
>
> You cannot  do it with dynamic (unfixed in schema) objects.
>>
>
> The private session variables I suggested have a fixed (static) name, and
> their type could be infered by a static analysis tool, eg:
>

>   ...
>   DECLARE @foo BOOLEAN PRIVATE;
>   -- I know that there is private a boolean variable "@foo" of unknown
> value
>   SET @foo = TRUE;
>   --- I know that @foo is true...
>   ...
>

This is not too friendly - you have to repeat the DECLARE in every function.
What if somebody somewhere writes @fooo, or uses DECLARE @foo INTEGER instead?
There is a lot of room for errors.


Regards

Pavel




>
> --
> Fabien.
>


Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Fabien COELHO


Hello,


Why should they? If it is a session variable, being created when needed or
used with the right type could be enough?


You cannot to trust some fuzzy object - or you have to play hard game with
securing content - hashing, coding, decoding - it is slow, cpu intensive


I'm afraid that I do not understand. I do not think that a hash get in
memory is particularly costly. I do not see how you could do less than
that to manage variables with associated values.



Currently the big issue of plpgsql_check is work with temporary tables.


No, I mean so when you use temporary tables inside plpgsql functions, then
the static analyze like plpgsql check is almost impossible.



I cannot to speak instead you, but lot of people prefer static analyze of
code.


Hmmm... I know a little bit about that field, ISTM that you are speaking 
of the current capabilities of a particular static analysis tool, but I'm 
not sure that the tool capabilities could not be enhanced to manage more 
things.



The static analyze can be done only on static (persistent metadata).


A session variable is a bit like a global temporary variable. A static 
analysis tool could take into account a global temporary variable.



You cannot  do it with dynamic (unfixed in schema) objects.


The private session variables I suggested have a fixed (static) name, and
their type could be inferred by a static analysis tool, eg:


  ...
  DECLARE @foo BOOLEAN PRIVATE;
  -- I know that there is private a boolean variable "@foo" of unknown value
  SET @foo = TRUE;
  --- I know that @foo is true...
  ...

--
Fabien.




Re: [HACKERS] make more use of RoleSpec struct

2016-12-28 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> Stephen Frost wrote:
> > * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> 
> > > The only functional issue might be the removal of the IsA() checks.  If
> > > we don't cast any Node before passing it to any of those functions,
> > > there should be no problem because any misfeasance will be reported as a
> > > compile-time warning.  Perhaps it's worth adding IsA() checks in loops
> > > such as the one in roleSpecsToIds().
> > 
> > Maybe inside of an Assert() though..?
> 
> Yeah, I was of two minds when writing that para.  Maybe roleSpecsToIds
> (and all similar functions, if any others exist) has a limited enough
> caller base that it's trivial to audit that no bogus callsite exists.

Well, I think an Assert() is fine to make sure someone doesn't add a new
call that passes in something else where we don't have the compile-time
type checking happening.

Thanks!

Stephen




Re: [HACKERS] Reporting planning time with EXPLAIN

2016-12-28 Thread Alvaro Herrera
Tom Lane wrote:

> How about just saying that the existing TIMING option turns this on,
> if it's specified without ANALYZE?  Right now that combination draws
> an error:
> 
>   regression=# explain (timing on) select 1;
>   ERROR:  EXPLAIN option TIMING requires ANALYZE
> 
> so there's no existing usage that this would break.

Sounds much more reasonable to me than SUMMARY.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] make more use of RoleSpec struct

2016-12-28 Thread Alvaro Herrera
Stephen Frost wrote:
> * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:

> > The only functional issue might be the removal of the IsA() checks.  If
> > we don't cast any Node before passing it to any of those functions,
> > there should be no problem because any misfeasance will be reported as a
> > compile-time warning.  Perhaps it's worth adding IsA() checks in loops
> > such as the one in roleSpecsToIds().
> 
> Maybe inside of an Assert() though..?

Yeah, I was of two minds when writing that para.  Maybe roleSpecsToIds
(and all similar functions, if any others exist) has a limited enough
caller base that it's trivial to audit that no bogus callsite exists.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Reporting planning time with EXPLAIN

2016-12-28 Thread Tom Lane
Stephen Frost  writes:
> * Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote:
>>> I am not sure whether using *summary* to print just planning time is a
>>> good idea.  Another option could be SUMMARY_PLAN_TIME.

> Using 'summary' seems entirely reasonable to me,

I think it's an awful choice of name; it has nothing to do with either
the functionality or the printed name of the field.  And I could imagine
future uses of "summary" to mean something much different, like say an
actual summary.  (The dictionary meaning of "summary" is "a brief
restatement of the main points of something"; adding new information
is not even approximately within the meaning.)

How about just saying that the existing TIMING option turns this on,
if it's specified without ANALYZE?  Right now that combination draws
an error:

regression=# explain (timing on) select 1;
ERROR:  EXPLAIN option TIMING requires ANALYZE

so there's no existing usage that this would break.

regards, tom lane




Re: [HACKERS] Hooks

2016-12-28 Thread David Fetter
On Wed, Dec 28, 2016 at 03:07:52PM +0900, Michael Paquier wrote:
> On Wed, Dec 28, 2016 at 2:14 PM, David Fetter  wrote:
> > Here's everything that matches ^\w+_hook$ that I've found so far in
> > git master.  There are very likely false positives in this list.
> >
> > [... long list of hooks ...]
> >
> > Some appear to be client-side, some server-side.  I hope that no hook
> > is named other than that way, and I'll do my best to check.
> 
> This list includes some of the hooks of psql used to assign a pset
> variable. You had better just list the backend-side ones, or if you
> want just those in src/backend/. Those are the ones that matter to
> extension and plugin developers.

Thanks for the pointer!

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] make more use of RoleSpec struct

2016-12-28 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> Peter Eisentraut wrote:
> > Here is a small cleanup patch to make more use of the RoleSpec
> > struct/node throughout the parser to avoid casts and make the code more
> > self-documenting.
> 
> This makes sense to me.  I started to do this at some point when I was
> writing RoleSpec; eventually got annoyed about fixing the fallout (I was
> working to get somebody else's patch committed) and went back to how it
> is.  Glad to see it done.
> 
> The only functional issue might be the removal of the IsA() checks.  If
> we don't cast any Node before passing it to any of those functions,
> there should be no problem because any misfeasance will be reported as a
> compile-time warning.  Perhaps it's worth adding IsA() checks in loops
> such as the one in roleSpecsToIds().

Maybe inside of an Assert() though..?

Thanks!

Stephen




Re: [HACKERS] Reporting planning time with EXPLAIN

2016-12-28 Thread Stephen Frost
* Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote:
> >> One can use this option as
> >> postgres=# explain (summary on) select * from pg_class c, pg_type t
> >> where c.reltype = t.oid;
> >> QUERY PLAN
> >> --
> >>  Hash Join  (cost=17.12..35.70 rows=319 width=511)
> >>Hash Cond: (c.reltype = t.oid)
> >>->  Seq Scan on pg_class c  (cost=0.00..14.19 rows=319 width=259)
> >>->  Hash  (cost=12.61..12.61 rows=361 width=256)
> >>  ->  Seq Scan on pg_type t  (cost=0.00..12.61 rows=361 width=256)
> >>  Planning time: 48.823 ms
> >> (6 rows)
> >>
> >> When analyze is specified, summary is also set to ON. By default this
> >> flag is OFF.
> >>
> >
> > I am not sure whether using *summary* to print just planning time is a
> > good idea.  Another option could be SUMMARY_PLAN_TIME.
> 
> I have just used the same name as the boolean which controls the
> printing of planning time. Suggestions are welcome though. We haven't
> used words with "_" for EXPLAIN options, so I am not sure about
> SUMMARY_PLAN_TIME.

Using 'summary' seems entirely reasonable to me; I don't think we need
to complicate it by saying 'summary_plan_time' - I know that I'd hate to
have to write that out.

> > + /* Execution time matters only when analyze is requested */
> > + if (es->summary && es->analyze)
> >
> > Do you really need es->summary in above check?

I'm pretty sure we do.

EXPLAIN (ANALYZE, SUMMARY OFF)

Should not print the summary (which means it shouldn't print either the
planning or execution time).

> I think es->summary controls printing overall timing, planning as well
> as execution (hence probably the name "summary").

I am imagining it controlling if we print the summary or not, where the
summary today is the last two lines:

 Planning time: 14.020 ms
 Execution time: 79.555 ms

Thanks!

Stephen




Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Craig Ringer
On 28 December 2016 at 22:04, Pavel Stehule  wrote:

> For security the variable should be persistent.
>
> If you would to do statical analyse (what you usually would), then variable
> should be persistent.
>
> Currently the big issue of plpgsql_check is work with temporary tables.
> Local objects or dynamic sql is stop for static check.

Someone asked me off-list what use cases such a thing would have,
since it seems not to be spelled out very clearly in this discussion.
I think we're all assuming knowledge here.

So.

* Session starts
* app does SELECT setup_user('user-auth-key-data', 'some-other-blob')
**  setup_user is SECURITY DEFINER to 'appadmin'
**  'appadmin' owns a variable IS_AUDITOR. Other roles have only read
access to it.
**  setup_user(...) does whatever expensive/slow work it has to do
**   setup_user sets USER_IS_AUDITOR var
* Later RLS policies simply reference USER_IS_AUDITOR var. They don't
need to know the 'user-auth-key-data', or do whatever expensive
processing that it does.
* Other later triggers, etc, also reference USER_IS_AUDITOR
* User cannot make themselves an auditor by SETting USER_IS_AUDITOR

That's the general idea.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Pavel Stehule
2016-12-28 15:38 GMT+01:00 Fabien COELHO :

>
> For security the variable should be persistent.
>>
>
> Why should they? If it is a session variable, being created when needed or
> used with the right type could be enough?
>

You cannot trust some fuzzy object - or you have to play a hard game with
securing the content - hashing, encoding, decoding - it is slow and CPU intensive


>
> If you would to do statical analyse (what you usually would), then variable
>> should be persistent.
>>
>
> I do not understand what static analysis you would need/want to do on
> session variables.
>
> Currently the big issue of plpgsql_check is work with temporary tables.
>>
>
> Do you mean that temporary table are too slow/costly?
>

No, I mean that when you use temporary tables inside plpgsql functions,
static analysis like plpgsql_check is almost impossible.


>
> Local objects or dynamic sql is stop for static check.
>>
>
> Hmm. If something is dynamic, it is not static, but I do not understand
> your point.
>

I cannot speak for you, but a lot of people prefer static analysis of code.
Static analysis can be done only on static (persistent) metadata.  You cannot
do it with dynamic objects (not fixed in the schema).

regards

Pavel



>
> --
> Fabien.
>


Re: [HACKERS] make more use of RoleSpec struct

2016-12-28 Thread Alvaro Herrera
Peter Eisentraut wrote:
> Here is a small cleanup patch to make more use of the RoleSpec
> struct/node throughout the parser to avoid casts and make the code more
> self-documenting.

This makes sense to me.  I started to do this at some point when I was
writing RoleSpec; eventually got annoyed about fixing the fallout (I was
working to get somebody else's patch committed) and went back to how it
is.  Glad to see it done.

The only functional issue might be the removal of the IsA() checks.  If
we don't cast any Node before passing it to any of those functions,
there should be no problem because any misfeasance will be reported as a
compile-time warning.  Perhaps it's worth adding IsA() checks in loops
such as the one in roleSpecsToIds().

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Fabien COELHO



For security the variable should be persistent.


Why should they? If it is a session variable, being created when needed or 
used with the right type could be enough?



If you would to do statical analyse (what you usually would), then variable
should be persistent.


I do not understand what static analysis you would need/want to do on 
session variables.



Currently the big issue of plpgsql_check is work with temporary tables.


Do you mean that temporary table are too slow/costly?


Local objects or dynamic sql is stop for static check.


Hmm. If something is dynamic, it is not static, but I do not understand 
your point.


--
Fabien.




Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Fabien COELHO



Also, I'm not yet convinced that simple privatizable transcient/session
variables would not be enough to fit the use case, [...]



So... maybe? The main question then becomes how you integrate access control.


For what it's worth, permissions on persistent functions could be used to 
control access to private-to-a-role transcient/session variables, see:


https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg300651.html

The good point is that the implementation on top of session variables 
would be trivial and very efficient, just prefix the variable with the 
owner id in the key-value storage. The bad point is that there is no 
full-featured GRANT on the variable itself, and it looks adhoc, somehow.
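
Purely as an illustration of the "prefix with the owner id" idea (none of this
API exists; the names and the role OID are made up):

#include <stdio.h>

/* Build the key under which a private session variable would be stored:
 * the owning role's OID, a separator, then the variable name. */
static void
make_private_key(char *buf, size_t buflen, unsigned int owner_oid, const char *varname)
{
	snprintf(buf, buflen, "%u.%s", owner_oid, varname);
}

int
main(void)
{
	char	key[128];

	make_private_key(key, sizeof(key), 16384, "user_is_auditor");
	printf("%s\n", key);		/* prints "16384.user_is_auditor" */
	return 0;
}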


--
Fabien.




[HACKERS] make more use of RoleSpec struct

2016-12-28 Thread Peter Eisentraut
Here is a small cleanup patch to make more use of the RoleSpec
struct/node throughout the parser to avoid casts and make the code more
self-documenting.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

>From 6f65b08e55d0a1005536c0661c7f857dac29d3bf Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 27 Dec 2016 12:00:00 -0500
Subject: [PATCH] Make more use of RoleSpec struct

Most code was casting this through a generic Node.  By declaring
everything as RoleSpec appropriately, we can remove a bunch of casts and
ad-hoc node type checking.
---
 src/backend/catalog/aclchk.c   |  6 +++---
 src/backend/commands/policy.c  |  2 +-
 src/backend/commands/user.c| 10 +-
 src/backend/parser/gram.y  | 17 +
 src/backend/utils/adt/acl.c| 28 ++--
 src/include/nodes/parsenodes.h | 22 +++---
 src/include/utils/acl.h|  8 
 7 files changed, 39 insertions(+), 54 deletions(-)

diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 3086021432..fb6c276eaf 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -423,7 +423,7 @@ ExecuteGrantStmt(GrantStmt *stmt)
 grantee_uid = ACL_ID_PUBLIC;
 break;
 			default:
-grantee_uid = get_rolespec_oid((Node *) grantee, false);
+grantee_uid = get_rolespec_oid(grantee, false);
 break;
 		}
 		istmt.grantees = lappend_oid(istmt.grantees, grantee_uid);
@@ -922,7 +922,7 @@ ExecAlterDefaultPrivilegesStmt(ParseState *pstate, AlterDefaultPrivilegesStmt *s
 grantee_uid = ACL_ID_PUBLIC;
 break;
 			default:
-grantee_uid = get_rolespec_oid((Node *) grantee, false);
+grantee_uid = get_rolespec_oid(grantee, false);
 break;
 		}
 		iacls.grantees = lappend_oid(iacls.grantees, grantee_uid);
@@ -1012,7 +1012,7 @@ ExecAlterDefaultPrivilegesStmt(ParseState *pstate, AlterDefaultPrivilegesStmt *s
 		{
 			RoleSpec   *rolespec = lfirst(rolecell);
 
-			iacls.roleid = get_rolespec_oid((Node *) rolespec, false);
+			iacls.roleid = get_rolespec_oid(rolespec, false);
 
 			/*
 			 * We insist that calling user be a member of each target role. If
diff --git a/src/backend/commands/policy.c b/src/backend/commands/policy.c
index 6da3205c9e..3746bf1f9f 100644
--- a/src/backend/commands/policy.c
+++ b/src/backend/commands/policy.c
@@ -177,7 +177,7 @@ policy_role_list_to_array(List *roles, int *num_roles)
 		}
 		else
 			role_oids[i++] =
-ObjectIdGetDatum(get_rolespec_oid((Node *) spec, false));
+ObjectIdGetDatum(get_rolespec_oid(spec, false));
 	}
 
 	return role_oids;
diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
index adc6b99b21..5aaf6cf088 100644
--- a/src/backend/commands/user.c
+++ b/src/backend/commands/user.c
@@ -449,7 +449,7 @@ CreateRole(ParseState *pstate, CreateRoleStmt *stmt)
 	foreach(item, addroleto)
 	{
 		RoleSpec   *oldrole = lfirst(item);
-		HeapTuple	oldroletup = get_rolespec_tuple((Node *) oldrole);
+		HeapTuple	oldroletup = get_rolespec_tuple(oldrole);
 		Oid			oldroleid = HeapTupleGetOid(oldroletup);
 		char	   *oldrolename = NameStr(((Form_pg_authid) GETSTRUCT(oldroletup))->rolname);
 
@@ -1396,7 +1396,7 @@ roleSpecsToIds(List *memberNames)
 
 	foreach(l, memberNames)
 	{
-		Node	   *rolespec = (Node *) lfirst(l);
+		RoleSpec   *rolespec = (RoleSpec *) lfirst(l);
 		Oid			roleid;
 
 		roleid = get_rolespec_oid(rolespec, false);
@@ -1493,7 +1493,7 @@ AddRoleMems(const char *rolename, Oid roleid,
 			ereport(ERROR,
 	(errcode(ERRCODE_INVALID_GRANT_OPERATION),
 	 (errmsg("role \"%s\" is a member of role \"%s\"",
-		rolename, get_rolespec_name((Node *) memberRole);
+		rolename, get_rolespec_name(memberRole);
 
 		/*
 		 * Check if entry for this role/member already exists; if so, give
@@ -1508,7 +1508,7 @@ AddRoleMems(const char *rolename, Oid roleid,
 		{
 			ereport(NOTICE,
 	(errmsg("role \"%s\" is already a member of role \"%s\"",
-		 get_rolespec_name((Node *) memberRole), rolename)));
+		 get_rolespec_name(memberRole), rolename)));
 			ReleaseSysCache(authmem_tuple);
 			continue;
 		}
@@ -1619,7 +1619,7 @@ DelRoleMems(const char *rolename, Oid roleid,
 		{
 			ereport(WARNING,
 	(errmsg("role \"%s\" is not a member of role \"%s\"",
-		 get_rolespec_name((Node *) memberRole), rolename)));
+		 get_rolespec_name(memberRole), rolename)));
 			continue;
 		}
 
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 834a00971a..08cf5b78f5 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -144,7 +144,7 @@ static Node *makeBitStringConst(char *str, int location);
 static Node *makeNullAConst(int location);
 static Node *makeAConst(Value *v, int location);
 static Node *makeBoolAConst(bool state, int location);
-static Node *makeRoleSpec(RoleSpecType type, int location);
+static RoleSpec *makeRoleSpec(RoleSpecType 

Re: [HACKERS] [PATCH] Fix minor race in commit_ts SLRU truncation vs lookups

2016-12-28 Thread Petr Jelinek
On 28/12/16 15:01, Craig Ringer wrote:
> Hi all
> 
> There's a minor race between commit_ts SLRU truncation and concurrent
> commit_ts lookups, where a lookup can check the lower valid bound xid
> without knowing it's already been truncated away. This would result in
> a SLRU lookup error.
> 
> It's pretty low-harm since it's hard to trigger and the only problem
> is an error being emitted when we should otherwise return null/zero.
> Most notably you have to pass an xid that used to be within the
> datfrozenxid but due to a concurrent vacuum has just moved outside it.
> This can't happen if you're passing the xmin of a tuple that still
> exists so it only matters for callers passing arbitrary XIDs in via
> pg_xact_commit_timestamp(...).
> 
> The race window is bigger on standby because there we don't find out
> about the advance of the lower commit ts bound until the next
> checkpoint. But you still have to be looking at very old xids that
> don't exist on the heap anymore.
> 
> We might as well fix it in HEAD, but it's totally pointless to
> back-patch, and the only part of the race that can be realistically
> hit is on standby, where we can't backpatch a fix w/o changing the
> xlog format. Nope. We could narrow the scope by limiting commit_ts
> slru truncation to just before a checkpoint, but given how hard this
> is to hit... I don't care.
> 
> (This came up as part of the investigation I've been doing on the
> txid_status thread, where Robert pointed out a similar problem that
> can arise where txid_status races with clog truncation. I noticed the
> issue with standby while looking into that.)
> 

Hi,

I remember thinking this might affect commit_ts when I was reading that
thread, but didn't have time to investigate it myself yet. Thanks for
doing all the work yourself.

About the patch, it looks good to me for master with the minor exception
that:
> + appendStringInfo(buf, "pageno %d, xid %u",
> + trunc.pageno, trunc.oldestXid);

This should probably say oldestXid instead of xid in the text description.

About back-patching, I wonder if standby could be modified to move
oldestXid based on pageno reverse calculation, but it's going to be
slightly ugly.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Pavel Stehule
2016-12-28 15:00 GMT+01:00 Craig Ringer :

> On 28 December 2016 at 21:19, Fabien COELHO  wrote:
>
> > Also, I'm not yet convinced that simple privatizable transcient/session
> > variables would not be enough to fit the use case, so that for the same
> > price there would be session variables for all, not only special ones
> with
> > permissions.
>
> Since, unlike Oracle, we don't have compiled packages or plan-caching
> above the session level, there's not the same hard requirement for the
> variable definition to be persistent.
>
> So... maybe? The main question then becomes how you integrate access
> control.
>

For security the variable should be persistent.

If you want to do static analysis (which you usually would), then the
variable should be persistent.

Currently the big issue for plpgsql_check is working with temporary tables.
Local objects or dynamic SQL are a stop for static checks.

Regards

Pavel


>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Craig Ringer
On 28 December 2016 at 21:19, Fabien COELHO  wrote:

> Also, I'm not yet convinced that simple privatizable transcient/session
> variables would not be enough to fit the use case, so that for the same
> price there would be session variables for all, not only special ones with
> permissions.

Since, unlike Oracle, we don't have compiled packages or plan-caching
above the session level, there's not the same hard requirement for the
variable definition to be persistent.

So... maybe? The main question then becomes how you integrate access control.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] [PATCH] Fix minor race in commit_ts SLRU truncation vs lookups

2016-12-28 Thread Craig Ringer
Hi all

There's a minor race between commit_ts SLRU truncation and concurrent
commit_ts lookups, where a lookup can check the lower valid bound xid
without knowing it's already been truncated away. This would result in
a SLRU lookup error.

It's pretty low-harm since it's hard to trigger and the only problem
is an error being emitted when we should otherwise return null/zero.
Most notably you have to pass an xid that used to be within the
datfrozenxid but due to a concurrent vacuum has just moved outside it.
This can't happen if you're passing the xmin of a tuple that still
exists so it only matters for callers passing arbitrary XIDs in via
pg_xact_commit_timestamp(...).

The race window is bigger on standby because there we don't find out
about the advance of the lower commit ts bound until the next
checkpoint. But you still have to be looking at very old xids that
don't exist on the heap anymore.

We might as well fix it in HEAD, but it's totally pointless to
back-patch, and the only part of the race that can be realistically
hit is on standby, where we can't backpatch a fix w/o changing the
xlog format. Nope. We could narrow the scope by limiting commit_ts
slru truncation to just before a checkpoint, but given how hard this
is to hit... I don't care.

(This came up as part of the investigation I've been doing on the
txid_status thread, where Robert pointed out a similar problem that
can arise where txid_status races with clog truncation. I noticed the
issue with standby while looking into that.)

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 361a92ecbe45226ed25ce87da6a8c7bd749cca4a Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 28 Dec 2016 21:23:59 +0800
Subject: [PATCH] Fix a minor race between commit_ts slru truncation and
 lookups

vac_truncate_clog was truncating the commit timestamp SLRU before
it advanced the oldest xid limit for commit timestamp lookups. This
created a small race window where a concurrent pg_xact_commit_timestamp
calling TransactionIdGetCommitTsData(...) could check the oldest
xid after the SLRU page containing it is truncated away but before
the threshold is updated.

Fix this by advancing the lower bound before removing the relevant SLRU pages.

A larger race window also existed on a standby. The lower bound for commit
timestamp validity was only advanced on a standby when we replayed a checkpoint.
This could happen a long time after the commit timestamp SLRU truncation.

This race only affects XIDs that vac_truncate_clog has determined are no longer
referenced in the system, so the minimum datfrozenxid has advanced past them.
It's never going to be triggered by code that looks up commit timestamps of
rows it's just read during an in-progress transaction. Also, the worst outcome
should the race be triggered is an I/O error from the SLRU code (where the
commit ts lookup should more correctly return null). So this race is pretty
harmless.

The nearly identical clog code has the same race, but it doesn't matter there
since there's no way for users to pass arbitrary XIDs to look them up in clog.
---
 src/backend/access/rmgrdesc/committsdesc.c |  8 +---
 src/backend/access/transam/commit_ts.c | 23 +++
 src/backend/commands/vacuum.c  |  3 ++-
 src/include/access/commit_ts.h |  8 
 4 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/src/backend/access/rmgrdesc/committsdesc.c b/src/backend/access/rmgrdesc/committsdesc.c
index 527e5dc..2b117e4 100644
--- a/src/backend/access/rmgrdesc/committsdesc.c
+++ b/src/backend/access/rmgrdesc/committsdesc.c
@@ -33,10 +33,12 @@ commit_ts_desc(StringInfo buf, XLogReaderState *record)
 	}
 	else if (info == COMMIT_TS_TRUNCATE)
 	{
-		int			pageno;
+		xl_commit_ts_truncate trunc;
 
-		memcpy(&pageno, rec, sizeof(int));
-		appendStringInfo(buf, "%d", pageno);
+		memcpy(&trunc, rec, SizeOfCommitTsTruncate);
+
+		appendStringInfo(buf, "pageno %d, xid %u",
+			trunc.pageno, trunc.oldestXid);
 	}
 	else if (info == COMMIT_TS_SETTS)
 	{
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index a5b270c..86ac1d6 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -113,7 +113,7 @@ static bool CommitTsPagePrecedes(int page1, int page2);
 static void ActivateCommitTs(void);
 static void DeactivateCommitTs(void);
 static void WriteZeroPageXlogRec(int pageno);
-static void WriteTruncateXlogRec(int pageno);
+static void WriteTruncateXlogRec(int pageno, TransactionId oldestXid);
 static void WriteSetTimestampXlogRec(TransactionId mainxid, int nsubxids,
 		 TransactionId *subxids, TimestampTz timestamp,
 		 RepOriginId nodeid);
@@ -824,7 +824,7 @@ TruncateCommitTs(TransactionId oldestXact)
 		return;	/* nothing to remove */
 
 	/* Write XLOG record */
-	WriteTruncateXlogRec(cutoffPage);
+	WriteTruncateXlogRec

Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Pavel Stehule
2016-12-28 4:40 GMT+01:00 Craig Ringer :

> Fabien, I don't really see the point of "persistent variables". What
> benefit do they add over relations?
>
> You can add a simple function to fetch a tuple if you want it not to
> look like a subquery. Do it with heap access in C if you like, save
> the (trivial) planning costs.
>
>
> I do see value to two different things discussed here:
>
> * Pavel's proposal for persistent-declaration, non-persistent-value
> session variables with access control. These will be very beneficial
> with RLS, where we need very fast lookups. Their purpose is that you
> can set up expensive state with SECURITY DEFINER functions, C-level
> functions, etc, then test it very cheaply later from RLS and from
> other functions. Their advantage over relations is very cheap, fast
> access.
>
>   I can maybe see global temporary relations being an alternative to
> these, if we optimise by using a tuplestore to back them and only
> switch to a relfilenode if the relation grows. The pg_catalog entries
> would be persistent so you could GRANT or REVOKE access to them, etc.
> Especially if we optimised the 1-row case this could work. It'd be
> less like what Oracle does, but might let us re-use more functionality
> and have fewer overlapping features. Pavel?
>

This is a hard question - I have different expectations from variables and
from tables. Sure, there can be an optimization for one-row global temporary
tables, but it fails on the first update. And with this optimization we should
not use the current heap access functionality - or we increase the complexity
of already quite complex code. From a successful global temp tables
implementation I am expecting an unbloated catalogue and all relation-related
functionality - indexes, column statistics, usage statistics, MVCC. I can't
imagine a one-row optimization there. Moreover it has a limit - it is pretty
hard to implement expandable types there.

I see it as very useful to define a variable as memory-based, NOT ACID,
non-MVCC storage (always). Tables are ACI(D) and MVCC. Minimalism is great,
but it has to have a practical limit - we want variables inside/outside
plpgsql, although we can use temporary tables.

There is a secondary question of whether we can introduce some better
interface for writing getter/setter functions, where current relations can
be used.







>
>
> * Fabien's earlier mention of transient session / query variables,
> a-la MySQL or MS-SQL. They're obviously handy for more general purpose
> programming, but our strict division between SQL and plpgsql makes
> them a bit less useful than in MS-SQL with T-SQL. I think it's a very
> separate topic to this and should be dealt with in a separate thread
> if/when someone wants to work on them.
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] Reporting planning time with EXPLAIN

2016-12-28 Thread Ashutosh Bapat
>>
>> One can use this option as
>> postgres=# explain (summary on) select * from pg_class c, pg_type t
>> where c.reltype = t.oid;
>> QUERY PLAN
>> --
>>  Hash Join  (cost=17.12..35.70 rows=319 width=511)
>>Hash Cond: (c.reltype = t.oid)
>>->  Seq Scan on pg_class c  (cost=0.00..14.19 rows=319 width=259)
>>->  Hash  (cost=12.61..12.61 rows=361 width=256)
>>  ->  Seq Scan on pg_type t  (cost=0.00..12.61 rows=361 width=256)
>>  Planning time: 48.823 ms
>> (6 rows)
>>
>> When analyze is specified, summary is also set to ON. By default this
>> flag is OFF.
>>
>
> I am not sure whether using *summary* to print just planning time is a
> good idea.  Another option could be SUMMARY_PLAN_TIME.

I have just used the same name as the boolean which controls the
printing of planning time. Suggestions are welcome though. We haven't
used words with "_" for EXPLAIN options, so I am not sure about
SUMMARY_PLAN_TIME.

>
> + /* Execution time matters only when analyze is requested */
> + if (es->summary && es->analyze)
>
> Do you really need es->summary in above check?

I think es->summary controls printing overall timing, planning as well
as execution (hence probably the name "summary"). Earlier it was
solely controlled by es->analyze, but now that it's exposed, we do want
to check whether analyze was also true, as without analyze there cannot
be an execution time. So, I changed the earlier condition if (es->summary)
to include es->analyze. Can we use only analyze? Probably yes. The question
there is why we didn't do it that way to start with, or why summary was made
to control the execution time report. Probably the author thought that at
some point in the future we might separate those two. So, I have left summary
there. I don't have a problem with removing it.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-12-28 Thread Anastasia Lubennikova

27.12.2016 20:14, Claudio Freire:

On Tue, Dec 27, 2016 at 10:41 AM, Anastasia Lubennikova
  wrote:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x006941e7 in lazy_vacuum_heap (onerel=0x1ec2360,
vacrelstats=0x1ef6e00) at vacuumlazy.c:1417
1417tblk =
ItemPointerGetBlockNumber(&seg->dead_tuples[tupindex]);
(gdb) bt
#0  0x006941e7 in lazy_vacuum_heap (onerel=0x1ec2360,
vacrelstats=0x1ef6e00) at vacuumlazy.c:1417
#1  0x00693dfe in lazy_scan_heap (onerel=0x1ec2360, options=9,
vacrelstats=0x1ef6e00, Irel=0x1ef7168, nindexes=2, aggressive=1 '\001')
 at vacuumlazy.c:1337
#2  0x00691e66 in lazy_vacuum_rel (onerel=0x1ec2360, options=9,
params=0x7ffe0f866310, bstrategy=0x1f1c4a8) at vacuumlazy.c:290
#3  0x0069191f in vacuum_rel (relid=1247, relation=0x0, options=9,
params=0x7ffe0f866310) at vacuum.c:1418

Those line numbers don't match my code.

Which commit are you based on?

My tree is (currently) based on 71f996d22125eb6cfdbee6094f44370aa8dec610


Hm, my branch is based on 71f996d22125eb6cfdbee6094f44370aa8dec610 as well.
I merely applied patches 
0001-Vacuum-prefetch-buffers-on-backward-scan-v3.patch

and 0002-Vacuum-allow-using-more-than-1GB-work-mem-v3.patch
then ran configure and make as usual.
Am I doing something wrong?

Anyway, I found the problem that had caused segfault.

for (segindex = 0; segindex <= vacrelstats->dead_tuples.last_seg; tupindex = 0, segindex++)
{
	DeadTuplesSegment *seg = &(vacrelstats->dead_tuples.dead_tuples[segindex]);
	int			num_dead_tuples = seg->num_dead_tuples;

	while (tupindex < num_dead_tuples)
	...

You rely on the value of tupindex here, while during the very first pass
the 'tupindex' variable may contain any garbage. And it happened that on my
system there was a negative value, as I found by inspecting the core dump:

(gdb) info locals
num_dead_tuples = 5
tottuples = 0
tupindex = -1819017215

Which leads to a failure on the next line:
tblk = ItemPointerGetBlockNumber(&seg->dead_tuples[tupindex]);

The solution is to move this assignment inside the loop.
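
A sketch of that fix - the reset moves out of the for-loop's increment clause
and into the loop body, so the very first pass is covered as well:

for (segindex = 0; segindex <= vacrelstats->dead_tuples.last_seg; segindex++)
{
	DeadTuplesSegment *seg = &(vacrelstats->dead_tuples.dead_tuples[segindex]);
	int			num_dead_tuples = seg->num_dead_tuples;

	tupindex = 0;			/* reset for every segment, including the first */
	while (tupindex < num_dead_tuples)
	...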

I've read the second patch.

1. What is the reason to inline vac_cmp_itemptr()?

2. +#define LAZY_MIN_TUPLES	Max(MaxHeapTuplesPerPage, (128<<20) / sizeof(ItemPointerData))
What does 128<<20 mean? Why not 1<<27? I'd ask you to replace it with a
named constant, or at least add a comment.

I'll share my results of performance testing it in a few days.

--
Anastasia Lubennikova
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Pavel Stehule
2016-12-28 14:19 GMT+01:00 Fabien COELHO :

>
> Hello Craig,
>
> Fabien, I don't really see the point of "persistent variables". What
>> benefit do they add over relations?
>>
>
> A relation is a set of values, a variable is a scalar with one value.
>
> It is always possible to declare a set and use it as a singleton, but
> somehow it seems cleaner to ask for what you want and have the database
> maintain the singleton property just like any other constraint.
>
> Behind the scene a "persistent variable" would probably be implemented as
> a row in a special table or some kind of one-row table... So there is no
> deep semantical difference, but mostly a syntactic one: you ask for a
> variable and you use it as a variable, i.e. there can be a simple well
> integrated syntax to get its value without having to "SELECT FROM" or
> resorting to functions.
>
> You can add a simple function to fetch a tuple if you want it not to
>> look like a subquery.
>>
>
> ISTM that if there are some kind of (persistent/session/...) variables,
> there should be a simple direct way of getting its value, like @var or &var
> or whatever. If one must write pg_get_variable_value('var')::ZZZ, it
> somehow defeats the purpose, as "(SELECT var FROM some_table)" is shorter.
>

Just a note - the getter function returns a typed value, so no other casting
is necessary.


>
> I do see value to two different things discussed here:
>>
>> * Pavel's proposal for persistent-declaration, non-persistent-value
>> session variables with access control. [...]
>>
>
> Yep, that is one. I missed the half-persistence property at the
> beginning...
>
> * Fabien's earlier mention of transient session / query variables, a-la
>> [...] I think it's a very separate topic to this and should be dealt with
>> in a separate thread if/when someone wants to work on them.
>>
>
> Yes and no: ISTM that at least a global design should be discussed
> *before* some kind of special-case variables (session-alive,
> persistent-in-existence-but-not-in-value, not-transactional,
> subject-to-permissions, not-subject-to-constraints...) are introduced, so
> that the special case does not preclude the possible future existence of
> other types of variables.
>
> Then I would be more at ease with having a special case implemented first,
> knowing that others may come and fit neatly, both semantically and
> syntaxically.
>
> I'm bothered by the half-persistence proposed, because it interferes both
> with possible session (light-weight, only in memory) and persistent
> (heavy-weight, in catalog) variables.
>
> Also, I'm not yet convinced that simple privatizable transcient/session
> variables would not be enough to fit the use case, so that for the same
> price there would be session variables for all, not only special ones with
> permissions.
>
> --
> Fabien.
>


Re: [HACKERS] Reporting planning time with EXPLAIN

2016-12-28 Thread Michael Banck
On Wed, Dec 28, 2016 at 02:08:55PM +0100, Michael Banck wrote:
> On Tue, Dec 27, 2016 at 09:26:21AM -0500, Stephen Frost wrote:
> > * Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote:
> > > We report planning and execution time when EXPLAIN ANALYZE is issued.
> > > We do not have facility to report planning time as part EXPLAIN
> > > output. In order to get the planning time, one has to issue EXPLAIN
> > > ANALYZE which involves executing the plan, which is unnecessary.
> > 
> > After reading that, rather long, thread, I agree that just having it be
> > 'summary' is fine.  We don't really want to make it based off of
> > 'timing' or 'costs' or 'verbose', those are different things.
> 
> I'm wondering, why is 'timing' (in the EXPLAIN scope) a different thing
> to planning time?  It might be that people don't expect it at this point
> and external tools might break (dunno), but in order to print the
> planning time, 'timing' sounds clearer than 'summary' to me.

OK, after also rereading the thread, this indeed seems to be very
complicated, and I don't want to reopen this can of worms, sorry.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer




Re: [HACKERS] proposal: session server side variables

2016-12-28 Thread Fabien COELHO


Hello Craig,


Fabien, I don't really see the point of "persistent variables". What
benefit do they add over relations?


A relation is a set of values, a variable is a scalar with one value.

It is always possible to declare a set and use it as a singleton, but 
somehow it seems cleaner to ask for what you want and have the database 
maintain the singleton property just like any other constraint.


Behind the scene a "persistent variable" would probably be implemented as 
a row in a special table or some kind of one-row table... So there is no 
deep semantical difference, but mostly a syntactic one: you ask for a 
variable and you use it as a variable, i.e. there can be a simple well 
integrated syntax to get its value without having to "SELECT FROM" or 
resorting to functions.



You can add a simple function to fetch a tuple if you want it not to
look like a subquery.


ISTM that if there are some kind of (persistent/session/...) variables, 
there should be a simple direct way of getting its value, like @var or 
&var or whatever. If one must write pg_get_variable_value('var')::ZZZ, it 
somehow defeats the purpose, as "(SELECT var FROM some_table)" is shorter.



I do see value to two different things discussed here:

* Pavel's proposal for persistent-declaration, non-persistent-value
session variables with access control. [...]


Yep, that is one. I missed the half-persistence property at the 
beginning...


* Fabien's earlier mention of transient session / query variables, a-la 
[...] I think it's a very separate topic to this and should be dealt 
with in a separate thread if/when someone wants to work on them.


Yes and no: ISTM that at least a global design should be discussed 
*before* some kind of special-case variables (session-alive, 
persistent-in-existence-but-not-in-value, not-transactional, 
subject-to-permissions, not-subject-to-constraints...) are introduced, so 
that the special case does not preclude the possible future existence of 
other types of variables.


Then I would be more at ease with having a special case implemented first, 
knowing that others may come and fit neatly, both semantically and 
syntactically.


I'm bothered by the half-persistence proposed, because it interferes both 
with possible session (light-weight, only in memory) and persistent 
(heavy-weight, in catalog) variables.


Also, I'm not yet convinced that simple privatizable transcient/session 
variables would not be enough to fit the use case, so that for the same 
price there would be session variables for all, not only special ones with 
permissions.


--
Fabien.




Re: [HACKERS] merging some features from plpgsql2 project

2016-12-28 Thread Pavel Stehule
2016-12-28 5:09 GMT+01:00 Jim Nasby :

> On 12/27/16 4:56 PM, Merlin Moncure wrote:
>
>> On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule 
>> wrote:
>>
>>> First I describe my initial position. I am strongly against introduction
>>> "new" language - plpgsql2 or new plpgsql, or any else. The trust of
>>> developers to us is important and introduction of any not compatible or
>>> different feature has to have really big reason. PostgreSQL is
>>> conservative
>>> environment, and PLpgSQL should not be a exception. More - I have not any
>>>
>>
> Which is why this is an external fork of plpgsql.
>

OK. I just would not want to repeat the Perl 6 or Python 3 story - it is a
big adventure, but a big failure too.


>
> ** The real problem is that we have no mechanism for allowing a PL's
> language/syntax/API to move forward without massive backwards compatibility
> problems. **
>

We do not, but there are a few possibilities:

1. enhance #option command
2. we can introduce PRAGMA command
https://en.wikipedia.org/wiki/Ada_(programming_language)#Pragmas
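
For illustration, option 1 would extend the existing per-function directive
mechanism; the PRAGMA line below is a purely hypothetical sketch of option 2,
while #variable_conflict is an existing directive:

CREATE OR REPLACE FUNCTION f() RETURNS integer AS $$
#variable_conflict error       -- existing "#option"-style directive
-- PRAGMA strict_mode;         -- hypothetical Ada-style pragma, does not exist today
DECLARE
    x integer := 1;
BEGIN
    RETURN x;
END;
$$ LANGUAGE plpgsql;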


>
> This is NOT unique to plpgsql. plpython (for one) definitely has some
> stupidity that will require an API break to fix.
>
> A secondary issue is the lack of a blessed collection of extensions. If we
> had that we could maintain some of this stuff outside of the core release
> schedule, as well as provide more room for people to run experimental
> versions of extensions if they desired. If we had this then perhaps
> plpgsql_check would become a viable answer to some of this (though IMHO
> plpgsql_check is just a work-around for our lack of dealing with API
> compatibility).
>

plpgsql_check can do some tests that are impossible in plpgsql itself - both
from a performance and from a feature point of view. But some "blessed
collection of extensions" would be nice, even more so if it were joined with
some automatic test and build tools. Although a lot of extensions are really
mature, knowledge about these extensions is minimal - and building extensions
on Windows is still hard work (for a Linux developer).
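
For readers unfamiliar with it, a rough usage sketch of plpgsql_check (assuming
the extension's plpgsql_check_function() entry point; the signature f(int) is a
placeholder):

CREATE EXTENSION plpgsql_check;
SELECT * FROM plpgsql_check_function('f(int)');  -- static checks that plain CREATE FUNCTION cannot perform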


> information from my customers, colleagues about missing features in this
>>> language.  If there is some gaps, then it is in outer environment - IDE,
>>> deployment, testing,
>>>
>>
> I'm honestly surprised (even shocked) that you've never run into any of
> the problems plpgsql2 is trying to solve. I've hit all those problems
> except for OUT parameters. I'd say the order they're listed in actually
> corresponds to how often I hit the problems.
>

I hit a lot of older, harder (now solved) issues - now, with more experience, I
am able to see these issues. And I wrote plpgsql_check partially for myself
too. Years ago I preferred safe expressions.


>
> Breaking language compatibility is a really big deal.  There has to be
>> a lot of benefits to the effort and you have to make translation from
>> plpgsql1 to plpgsql2 really simple.  You have made some good points on
>>
>
> I think trying to move the ball forward in a meaningful way without
> breaking compatibility is a lost cause. Some of these issues could be
> addressed by adding more syntax, but even that has limits (do we really
> want another variation of STRICT that allows only 0 or 1 rows?). And
> there's no way to fix your #1 item below without breaking compatibility.
>

I think there is a way with the extra checks, or with persistent plpgsql
options - just use them, please. Some checks are clear-cut, some others are not.
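
For reference, the existing extra-checks mechanism referred to here can already
be enabled per session or made persistent (the database name is a placeholder):

SET plpgsql.extra_warnings TO 'shadowed_variables';
SET plpgsql.extra_errors   TO 'none';
ALTER DATABASE mydb SET plpgsql.extra_warnings TO 'all';  -- persistent, per-database setting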


> There *are* other ways this could be done, besides creating a different
> PL. One immediate possibility is custom GUCs; there may be other options.
>
> #1 problem with plpgsql in my point of view is that the language and
>> grammar are not supersets of sql.  A lot of PLPGSQL keywords (EXECUTE,
>> BEGIN, INTO, END) have incompatible meanings with our SQL
>> implementation.  IMNSHO, SQL ought to give the same behavior inside or
>> outside of plpgsql.  It doesn't, and this is one of the reasons why
>> plpgsql may not be a good candidate for stored procedure
>> implementation.
>>
>
> While this doesn't bug me, it's got to be confusing as hell for newbies.
>

If you know ALGOL-family languages, then it is not a problem. What is harder
for people is our different implementation of mixing SQL and PL - different
from Oracle or MSSQL. Our model is better and simpler, but different. It is
difficult for people without knowledge of the differences between functions
and procedures. Partially it is our own fault for loosely saying that our
void functions are procedures.


>
> #2 problem with plpgsql is after function entry it's too late to do
>> things like set transaction isolation level and change certain kinds
>> of variables (like statement_timeout).  This is very obnoxious, I
>> can't wrap the database in an API 100%; the application has to manage
>> things that really should be controlled in SQL.
>>
>
> +1
>
> #3 problem with plpgsql is complete lack of inlining.  inlining
>> function calls in postgres is a black art even for very trivial cases.
>> This makes it hard for us to write quick thing

Re: [HACKERS] Reporting planning time with EXPLAIN

2016-12-28 Thread Michael Banck
On Tue, Dec 27, 2016 at 09:26:21AM -0500, Stephen Frost wrote:
> * Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote:
> > We report planning and execution time when EXPLAIN ANALYZE is issued.
> > We do not have facility to report planning time as part EXPLAIN
> > output. In order to get the planning time, one has to issue EXPLAIN
> > ANALYZE which involves executing the plan, which is unnecessary.
> 
> After reading that, rather long, thread, I agree that just having it be
> 'summary' is fine.  We don't really want to make it based off of
> 'timing' or 'costs' or 'verbose', those are different things.

I'm wondering, why is 'timing' (in the EXPLAIN scope) a different thing
from planning time?  It might be that people don't expect it at this point
and external tools might break (dunno), but in order to print the
planning time, 'timing' sounds clearer than 'summary' to me.
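
For context, today the TIMING option only affects per-node instrumentation
under ANALYZE, e.g.:

EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM pg_class;  -- runs the query but skips per-node timing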


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer




Re: [HACKERS] pg_stat_activity.waiting_start

2016-12-28 Thread Amit Kapila
On Sat, Dec 24, 2016 at 7:46 AM, Tom Lane  wrote:
> Stephen Frost  writes:
>> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>>> The difficulty with that is it'd require a gettimeofday() call for
>>> every wait start.  Even on platforms where those are relatively cheap,
>>> the overhead would be nasty --- and on some platforms, it'd be
>>> astonishingly bad.  We sweated quite a lot to get the overhead of
>>> pg_stat_activity wait monitoring down to the point where it would be
>>> tolerable for non-heavyweight locks, but I'm afraid this would push
>>> it back into the not-tolerable range.
>
>> Could we handle this like log_lock_waits..?
>
> Well, that only applies to heavyweight locks, which do a gettimeofday
> anyway in order to schedule the deadlock-check timeout.  If you were
> willing to populate this new column only for heavyweight locks, maybe it
> could be done for minimal overhead.  But that would be backsliding
> quite a lot compared to what we just did to extend pg_stat_activity's
> coverage of lock types.
>

Can we think of introducing a new GUC, trace_system_waits or something
like that, which will indicate whether sessions report the value of
wait_start in pg_stat_activity?  The default value of such a parameter
could be false, which means wait_start would be shown as NULL in
pg_stat_activity, and when it is enabled wait_start can show the
time as proposed in this thread.
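
A sketch of what that might look like from the user side (both the
trace_system_waits parameter and the wait_start column are proposals from this
thread, not existing features):

SET trace_system_waits = on;   -- hypothetical GUC proposed above
SELECT pid, wait_event_type, wait_event, wait_start
  FROM pg_stat_activity
 WHERE wait_event IS NOT NULL; -- wait_start would stay NULL unless the GUC is enabled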

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Reporting planning time with EXPLAIN

2016-12-28 Thread Amit Kapila
On Tue, Dec 27, 2016 at 1:27 PM, Ashutosh Bapat
 wrote:
> Hi,
> We report planning and execution time when EXPLAIN ANALYZE is issued.
> We do not have facility to report planning time as part EXPLAIN
> output. In order to get the planning time, one has to issue EXPLAIN
> ANALYZE which involves executing the plan, which is unnecessary.
>

+1.  I think getting to know the planning time is useful for cases when we
are making changes in the planner, to know if the changes have introduced
any regression.

>
> One can use this option as
> postgres=# explain (summary on) select * from pg_class c, pg_type t
> where c.reltype = t.oid;
> QUERY PLAN
> --
>  Hash Join  (cost=17.12..35.70 rows=319 width=511)
>Hash Cond: (c.reltype = t.oid)
>->  Seq Scan on pg_class c  (cost=0.00..14.19 rows=319 width=259)
>->  Hash  (cost=12.61..12.61 rows=361 width=256)
>  ->  Seq Scan on pg_type t  (cost=0.00..12.61 rows=361 width=256)
>  Planning time: 48.823 ms
> (6 rows)
>
> When analyze is specified, summary is also set to ON. By default this
> flag is OFF.
>

I am not sure whether using *summary* to print just planning time is a
good idea.  Another option could be SUMMARY_PLAN_TIME.

+ /* Execution time matters only when analyze is requested */
+ if (es->summary && es->analyze)

Do you really need es->summary in above check?

We should update the documentation of the EXPLAIN command, but maybe that can
wait till we finalize the specs.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-12-28 Thread Anastasia Lubennikova

27.12.2016 16:54, Alvaro Herrera:

Anastasia Lubennikova wrote:


I ran configure using following set of flags:
  ./configure --enable-tap-tests --enable-cassert --enable-debug
--enable-depend CFLAGS="-O0 -g3 -fno-omit-frame-pointer"
And then ran make check. Here is the stacktrace:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x006941e7 in lazy_vacuum_heap (onerel=0x1ec2360, vacrelstats=0x1ef6e00)
    at vacuumlazy.c:1417
1417        tblk = ItemPointerGetBlockNumber(&seg->dead_tuples[tupindex]);

This doesn't make sense, since the patch removes the "tupindex"
variable in that function.



Sorry, I didn't get what you mean.

In  0002-Vacuum-allow-using-more-than-1GB-work-mem-v3.patch

-    tupindex = 0;
-    while (tupindex < vacrelstats->num_dead_tuples)
+    segindex = 0;
+    tottuples = 0;
+    for (segindex = 0; segindex <= vacrelstats->dead_tuples.last_seg; tupindex = 0, segindex++)
     {

the variable is still there; it was just moved into the loop.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Support for pg_receivexlog --format=plain|tar

2016-12-28 Thread Magnus Hagander
On Wed, Dec 28, 2016 at 6:58 AM, Michael Paquier 
wrote:

> On Wed, Dec 28, 2016 at 3:12 AM, Magnus Hagander 
> wrote:
> > On Tue, Dec 27, 2016 at 1:16 PM, Michael Paquier <
> michael.paqu...@gmail.com>
> > wrote:
> >> On Tue, Dec 27, 2016 at 6:34 PM, Magnus Hagander 
> >> wrote:
> >> > On Tue, Dec 27, 2016 at 2:23 AM, Michael Paquier
> >> > 
> >> > wrote:
> >> >> Magnus, you have mentioned me as well that you had a couple of ideas
> >> >> on the matter, feel free to jump in and let's mix our thoughts!
> >> >
> >> >
> >> > Yeah, I've been wondering what the actual usecase is here :)
> >>
> >> There is value to compress segments finishing with trailing zeros,
> >> even if they are not the species with the highest representation in
> >> the WAL archive.
> >
> > Agreed on that part -- that's the value in compression though, and not
> > necessarily the TAR format itself.
> >
> > Is there any value of the TAR format *without* compression in your
> scenario?
>
> Hm. I cannot think of one.
>
> >> > I can see the point of being able to compress the individual segments
> >> > directly in pg_receivexlog in smaller systems though, without the need
> >> > to
> >> > rely on an external compression program as well. But in that case, is
> >> > there
> >> > any reason we need to wrap it in a tarfile, and can't just write it to
> >> > .gz natively?
> >>
> >> You mean having a --compress=0|9 option that creates individual gz
> >> files for each segment? Definitely we could just do that. It would be
> >
> > Yes, that's what I meant.
>
> OK, I have bitten the bullet and attached is a patch to add just
> --compress=0|9. So there is one .gz file generated per segment, and
> things are rather straight-forward. Adding tests is unfortunately not
> in scope as not all builds are compiled with libz.
>

Conditional tests? It probably wouldn't hurt to have them, but that would
be something more generic (we'd need something to actually validate the result
-- but it would make sense to have a test that, with compression enabled,
verifies that the uncompressed file turns out to be exactly 16MB, for
example).


> A couple of things to be aware of though:
> - gzopen, gzwrite and gzclose are used to handle the gz files. That's
> inconsistent with the tar method, which is mainly based on deflate() and
> more low-level routines.
>

But chosen for simplicity, I assume?



> - I have switched the directory method to use a file pointer instead
> of a file descriptor as gzwrite returns int as the number of
> uncompressed bytes written.
>

I don't really follow that reasoning :) Why does the directory method have
to change to use a file pointer because of that?



> - history and timeline files are gzip'd as well. Perhaps we don't want
> to do that.
>

I think it makes sense to compress everything.



> What do you think about this approach? I'll add that to the next CF.


I haven't reviewed the code in detail but yes, I think this approach is the
right one.

//Magnus


Re: [HACKERS] parallelize queries containing subplans

2016-12-28 Thread Rafia Sabih
On Wed, Dec 28, 2016 at 11:47 AM, Amit Kapila  wrote:
>
> Currently, queries that have references to SubPlans or
> AlternativeSubPlans are considered parallel-restricted.  I think we
> can lift this restriction in many cases especially when SubPlans are
> parallel-safe.  To make this work, we need to propagate the
> parallel-safety information from path node to plan node and the same
> could be easily done while creating a plan.  Another option could be
> that instead of propagating parallel-safety information from path to
> plan, we can find out from the plan if it is parallel-safe (doesn't
> contain any parallel-aware node) by traversing whole plan tree, but I
> think it is a waste of cycles.  Once we have parallel-safety
> information in the plan, we can use that for detection of
> parallel-safe expressions in max_parallel_hazard_walker().  Finally,
> we can pass all the subplans to workers during plan serialization in
> ExecSerializePlan().  This will enable workers to execute subplans
> that are referred in parallel part of the plan.  Now, we might be able
> to optimize it such that we pass only subplans that are referred in
> parallel portion of plan, but I am not sure if it is worth the trouble
> because it is one-time cost and much lesser than other things we do
> (like creating
> dsm, launching workers).
>
> Attached patch implements the above idea.  This will enable
> parallelism for queries containing un-correlated subplans, an example
> of which is as follows:
>
> set parallel_tuple_cost=0;
> set parallel_setup_cost=0;
> set min_parallel_relation_size=50;
>
> create table t1 (i int, j int, k int);
> create table t2 (i int, j int, k int);
>
> insert into t1 values (generate_series(1,10)*random(),
> generate_series(5,50)*random(),
> generate_series(8,80)*random());
> insert into t2 values (generate_series(4,10)*random(),
> generate_series(5,90)*random(),
> generate_series(2,10)*random());
>
>
> Plan without Patch
> -
> postgres=# explain select * from t1 where t1.i not in (select t2.i
> from t2 where t2.i in (1,2,3));
>   QUERY PLAN
> ---
>  Seq Scan on t1  (cost=110.84..411.72 rows=8395 width=12)
>Filter: (NOT (hashed SubPlan 1))
>SubPlan 1
>  ->  Seq Scan on t2  (cost=0.00..104.50 rows=2537 width=4)
>Filter: (i = ANY ('{1,2,3}'::integer[]))
> (5 rows)
>
> Plan with Patch
> 
> postgres=# explain select * from t1 where t1.i not in (select t2.i
> from t2 where t2.i in (1,2,3));
>QUERY PLAN
> -
>  Gather  (cost=110.84..325.30 rows=8395 width=12)
>Workers Planned: 1
>->  Parallel Seq Scan on t1  (cost=110.84..325.30 rows=4938 width=12)
>  Filter: (NOT (hashed SubPlan 1))
>  SubPlan 1
>->  Seq Scan on t2  (cost=0.00..104.50 rows=2537 width=4)
>  Filter: (i = ANY ('{1,2,3}'::integer[]))
> (7 rows)
>
> We have observed that Q-16 in TPC-H have been improved with the patch
> and the analysis of same will be shared by my colleague Rafia.
>
To study the effect of uncorrelated sub-plan pushdown on TPC-H and
TPC-DS benchmark queries, we performed some experiments; the
execution time results are summarised as follows:

Query  | HEAD (s) | Patches (s) | Scale factor
-------+----------+-------------+-------------
DS-Q45 |       35 |          19 |          100
H-Q16  |      812 |         645 |          300
H-Q16  |       49 |          37 |           20

Execution times in the above table are in seconds. A detailed analysis
of this experiment follows.
Additional patches applied in this analysis are,
Parallel index scan [1]
Parallel index-only scan [2]
Parallel merge-join [3]
The system setup used for this experiment is,

Server parameter settings:
work_mem = 500 MB,
max_parallel_workers_per_gather = 4,
random_page_cost = seq_page_cost = 0.1 = parallel_tuple_cost,
shared_buffers = 1 GB
Machine used: IBM Power, 4 socket machine, 512 GB RAM

TPC-DS scale factor = 100 (approx size of database is 150 GB)

Query 45, which takes around 35 seconds on head, completes in 19
seconds with these patches. The point to note here is that without
this patch pushing down uncorrelated subplans, the hash join which uses a
subplan in its join filter could not be pushed to workers, and hence the query
was unable to use parallelism enough; with this patch parallelism
is available up to the topmost join node. The output of the explain analyse
statement of this query on both head and with patches is given in
attached file ds_q45.txt.

On further evaluating these patches on TPC-H queries at different
scale factors, we came across query 16: for TPC-H scale factor 20 and the
aforementioned parameter settings, with the change of
max_parallel_workers_per_gather = 2, it took 37 seconds with the
patches and 49 seconds on head.

Re: [HACKERS] Measuring replay lag

2016-12-28 Thread Thomas Munro
On Thu, Dec 22, 2016 at 10:14 AM, Thomas Munro
 wrote:
> On Thu, Dec 22, 2016 at 2:14 AM, Fujii Masao  wrote:
>> I agree that the capability to measure the remote_apply lag is very useful.
>> Also I want to measure the remote_write and remote_flush lags, for example,
>> in order to diagnose the cause of replication lag.
>
> Good idea.  I will think about how to make that work.

Here is an experimental version that reports the write, flush and
apply lag separately as requested.  This is done with three separate
(lsn, timestamp) buffers on the standby side.  The GUC is now called
replication_lag_sample_interval.  Not tested much yet.

postgres=# select application_name, write_lag, flush_lag, replay_lag
from pg_stat_replication ;
 application_name |    write_lag    |    flush_lag    |   replay_lag
------------------+-----------------+-----------------+-----------------
 replica1         | 00:00:00.032408 | 00:00:00.032409 | 00:00:00.697858
 replica2         | 00:00:00.032579 | 00:00:00.03258  | 00:00:00.551125
 replica3         | 00:00:00.033686 | 00:00:00.033687 | 00:00:00.670571
 replica4         | 00:00:00.032861 | 00:00:00.032862 | 00:00:00.521902
(4 rows)

-- 
Thomas Munro
http://www.enterprisedb.com


replay-lag-v15.patch
Description: Binary data



[HACKERS] parallelize queries containing initplans

2016-12-28 Thread Amit Kapila
Seeing the subject line, one might wonder why we need to consider
parallelizing queries containing initplans differently from
queries containing subplans, considering that I posted a mail to
achieve the latter a few hours back.  The reason is that both are treated
differently with respect to parallelism and otherwise as well and both
can be parallelized in a different way depending on the design we
choose.  InitPlans can be used in three forms: (a) a Param node
representing a single scalar result, (b) a row comparison tree
containing multiple Param nodes, or (c) a NULL constant for a MULTIEXPR
subquery, whereas SubPlans are used as SubPlan nodes. Here, I am
primarily interested in parallelizing queries that contain InitPlans
of the form (a) and the reason is that I have seen that form used more
as compared to other forms (primarily based on a study of TPC-H and
TPC-DS workloads).  However, if we find that parallelizing other forms
can be done along with it easily, then that is excellent.  To start
with let us see the plan of TPC-H query (Q-22) and understand how it
can be improved.

Limit
   InitPlan 1 (returns $0)
 ->  Finalize Aggregate
   ->  Gather
 Workers Planned: 2
 ->  Partial Aggregate
   ->  Parallel Seq Scan on customer customer_1
 Filter: (...)
   ->  GroupAggregate
 Group Key: ("substring"((customer.c_phone)::text, 1, 2))
 ->  Sort
   Sort Key: ("substring"((customer.c_phone)::text, 1, 2))
   ->  Nested Loop Anti Join
 ->  Seq Scan on customer
   Filter: ((c_acctbal > $0) AND (...)))
 ->  Index Only Scan using idx_orders_custkey on orders
   Index Cond: (o_custkey = customer.c_custkey)


In the above plan, we can see that the join on the customer and orders
tables (Nested Loop Anti Join) is not parallelized even though we have
the capability to parallelize Nested Loop Joins. The reason for not
choosing the parallel plan is that one of the nodes (Seq Scan on
customer) refers to the initplan, and we consider such nodes
parallel-restricted, which means they can't be parallelized.  Now, I
could see three ways of parallelizing such a query.  The first way is
that we just push parallel-safe initplans to workers and allow them to
execute them; the drawback of this approach is that it won't be able to
push initplans in cases as shown above where the initplan is
parallel-unsafe (contains a Gather node), and second that we will lose the
expectation of single evaluation.  The second way is that we always
execute the initplan in the master backend and pass the resultant
value to the workers; this will allow the above form of plan to push
initplans to workers and hence can help in enabling parallelism for
other nodes in the plan tree.   The drawback of the second approach is
that we need to evaluate the initplan before it is actually required
which means that we might evaluate it even when it is not required.  I
am not sure if it is always safe to assume that we can evaluate the
initplan before pushing it to workers especially for the cases when it
is far enough down in the plan tree which we are parallelizing,
however, I think we can assume it when the initplan is above the plan
tree where it is used (like in the above case).  The third way is that
we allow Gather node to be executed below another Gather node, but I
think that will be bad especially for the plans like above because
each worker needs to further spawn another set of workers to evaluate
the initplan, which could be done once.  Now we can build some way such
that only one of the workers executes such an initplan and share the
values with other workers, but I think overall this requires much more
effort than first or second approach.

Among the three approaches, the first seems to be simpler than the
other two, but I feel if we just do that then we leave a lot on the
table.   Another way to accomplish this project could be that we do a
mix of first and second such that when the initplan is above the plan
tree to be parallelized, then use the second approach (one-time
evaluation by master backend and share the result with workers),
otherwise use the first approach of pushing down the initplan to
workers.

Thoughts?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries

2016-12-28 Thread Fabien COELHO


Hello Tom,

[...] It's touching every single utility statement type, which is not 
only pretty invasive in itself, but will break any pending patches that 
add more utility statement types.


Yep, but it is limited to headers and the break is trivial...

And you've not followed through on the implications of adding those 
fields in, for instance, src/backend/nodes/;


Hmmm, probably you are pointing out structure read/write functions.

I would have hoped that such code would be automatically derived from the 
corresponding struct definition...



I understand that you suggest to add a new intermediate structure [...] 
So that raw_parser would return a List instead of a 
List, and the statements would be unmodified.


Yeah, that's what I was thinking of.  There aren't very many places that
would need to know about that, I believe; [...]


Hmmm. I've run into a tiny small ever so little detail in trying to apply 
this clever approach...


For fixing the information in pg_stat_statements, the location data must be 
transported from the parsed node to the query to the planned node, because 
the latter two node types are passed to different hooks.


Now the detail is that utility statements, which seem to be nearly all of 
them except select/update/delete/insert, do not have plans: the statement 
itself is its own plan... so there is no place to store the location & 
length.


As adding such fields to store the information in the structures is a no-go 
area, I'm hesitating about the course to follow to provide a solution 
acceptable to you.


Would you have any other clever proposition or advice about how to 
proceed?


I thought of creating yet another node type, "utility plan with locations", or 
maybe reusing the "parsed node" for the purpose, but then it has to be 
managed in quite a few places as well.


For the parser output, there are 4 raw_parser calls but also 10 
pg_parse_query calls to manage. I'm not sure that going this way will fit 
the "few lines of code" bill...


--
Fabien.




Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)

2016-12-28 Thread Craig Ringer
On 23 December 2016 at 18:00, Craig Ringer  wrote:

> I'll have to follow up with a patch soon, as it's Toddler o'Clock.

Here we go.

This patch advances oldestXid, under XidGenLock, _before_ truncating clog.

txid_status() holds XidGenLock from when it tests oldestXid until it's
done looking up clog, thus eliminating the race.

CLOG_TRUNCATE records now contain the oldestXid, so they can advance
oldestXid on a standby, or when we've truncated clog since the most
recent checkpoint on the master during recovery. It's advanced under
XidGenLock during redo to protect against this race on standby.

As outlined in my prior mail I think this is the right approach. I
don't like taking XidGenLock twice, but we don't advance datfrozenxid
much so it's not a big concern. While a separate ClogTruncationLock
could be added like in my earlier patch, oldestXid is currently under
XidGenLock and I'd rather not change that.

The biggest change here is that oldestXid is advanced separately to
the vac limits in the rest of ShmemVariableCache. As far as I can tell
we don't prevent two manual VACUUMs on different DBs from trying to
concurrently run vac_truncate_clog, so this has to be safe against two
invocations racing each other. Rather than try to lock out such
concurrency, the patch ensures that oldestXid can never go backwards.
It doesn't really matter if the vac limits go backwards, it's no worse
than what can already happen in the current code.

We cannot advance the vacuum limits before we truncate the clog away,
in case someone tries to access a very new xid (if we're near
wraparound)

I'm pretty sure that commit timestamps suffer from the same flaw as
Robert identified upthread with clog. This patch fixes the clog race,
but not the similar one in commit timestamps. Unlike the clog race
with txid_status(), the commit timestamps one is already potentially
user-visible since we allow arbitrary xids to be looked up for commit
timestamps. I'll address that separately.
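
For reference, the intended client-side usage of the new function is roughly as
follows (the xid value is illustrative):

BEGIN;
SELECT txid_current();        -- e.g. 12345; the application stores this value
-- ... work ...
COMMIT;                       -- the connection may be lost around here
-- later, from a new connection:
SELECT txid_status(12345);    -- committed, aborted, in progress, or NULL if too old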

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From a61d9b81c82f3241e2fc47caf261b6f7bedf82d9 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Thu, 22 Dec 2016 13:07:00 +0800
Subject: [PATCH] Introduce txid_status(bigint) to get status of an xact

If an application loses its connection while a COMMIT request is in
flight, the backend crashes mid-commit, etc, then an application may
not be sure whether or not a commit completed successfully or was
rolled back. While two-phase commit solves this it does so at a
considerable overhead, so introduce a lighter alternative.

txid_status(bigint) lets an application determine the status of a
commit based on an xid-with-epoch as returned by txid_current() or
similar. Status may be committed, aborted, in-progress (including
prepared xacts) or null if the xact is too old for its commit status
to still be retained because it has passed the wrap-around epoch
boundary.

Applications must call txid_current() in their transactions to make
much use of this since PostgreSQL does not automatically report an xid
to the client when one is assigned.

Introduces TransactionIdInRecentPast(...) for the use of other
functions that need similar logic in future.

There was previously no way to look up an arbitrary xid without
running the risk of having clog truncated out from under you. This
hasn't been a problem because anything looking up xids in clog knows
they're protected by datminxid, but that's not the case for arbitrary
user-supplied XIDs. clog was truncated before we advanced oldestXid, so
taking XidGenLock was insufficient. There's no way to look up an
SLRU with soft failure. To address this, advance oldestXid under XidGenLock
before we truncate clog rather than after, so concurrent access is safe.

Craig Ringer, Robert Haas
---
 doc/src/sgml/func.sgml |  27 
 src/backend/access/rmgrdesc/clogdesc.c |  12 +++-
 src/backend/access/transam/clog.c  |  33 +++---
 src/backend/access/transam/varsup.c|  38 ++-
 src/backend/access/transam/xlog.c  |  17 +++--
 src/backend/commands/vacuum.c  |  13 
 src/backend/utils/adt/txid.c   | 116 +
 src/include/access/clog.h  |   5 ++
 src/include/access/transam.h   |   2 +
 src/include/catalog/pg_proc.h  |   2 +
 src/include/utils/builtins.h   |   1 +
 src/test/regress/expected/txid.out |  68 +++
 src/test/regress/sql/txid.sql  |  38 +++
 13 files changed, 355 insertions(+), 17 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 10e3186..8b28dc6 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17182,6 +17182,10 @@ SELECT collation for ('foo' COLLATE "de_DE");
 txid_visible_in_snapshot

 
+   
+txid_status
+   
+

 The functions shown in 
 provide server transaction information in a

Re: [HACKERS] Faster methods for getting SPI results

2016-12-28 Thread Craig Ringer
On 28 December 2016 at 12:32, Jim Nasby  wrote:
> On 12/27/16 9:10 PM, Craig Ringer wrote:
>>
>> On 28 December 2016 at 09:58, Jim Nasby  wrote:
>>
>>> I've looked at this some more, and ITSM that the only way to do this
>>> without
>>> some major surgery is to create a new type of Destination specifically
>>> for
>>> SPI that allows for the execution of an arbitrary C function for each
>>> tuple
>>> to be sent.
>>
>>
>> That sounds a lot more sensible than the prior proposals. Callback driven.
>
>
> Are there other places this would be useful? I'm reluctant to write all of
> this just to discover it doesn't help performance at all, but if it's useful
> on its own I can just submit it as a stand-alone patch.

I don't have a use for it personally. In BDR and pglogical anything
that does work with nontrivial numbers of tuples uses lower level
scans anyway.

I expect anything that uses the SPI to run arbitrary user queries
could have a use for something like this though. Any PL, for one.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] postgres_fdw bug in 9.6

2016-12-28 Thread Ashutosh Bapat
On Wed, Dec 28, 2016 at 1:29 PM, Etsuro Fujita
 wrote:
> On 2016/12/28 15:54, Ashutosh Bapat wrote:
>>
>> On Wed, Dec 28, 2016 at 9:20 AM, Etsuro Fujita
>>  wrote:
>>>
>>> On 2016/12/27 22:03, Ashutosh Bapat wrote:

 If mergejoin_allowed is true and mergeclauselist is non-NIL but
 hashclauselist is NIL (that's rare but there can be types that have merge
 operators but not hash operators), we will end up returning NULL. I
 think we want to create a merge join in that case. I think the order
 of conditions should be 1. check hashclause_list then create hash join
 2. else check if merge allowed, create merge join. It looks like that
 would cover all the cases, if there aren't any hash clauses, and also
 no merge clauses, we won't be able to implement a FULL join, so it
 will get rejected during path creation itself.
>
>
>>> Right, maybe we can do that by doing similar things as in
>>> match_unsorted_outer
>>> and/or sort_inner_and_outer.  But as you mentioned, the case is rare, so
>>> the
>>> problem would be whether it's worth complicating the code (and if it's
>>> worth, whether we should do that at the first version of the function).
>
>
>> All I am requesting is changing the order of conditions. That doesn't
>> complicate the code.
>
>
> I might have misunderstood your words, but you are saying we should consider
> mergejoin paths with some mergeclauses in the case where hashclauses is NIL,
> right?  To do so, we would need to consider the sort orders of outer/inner
> paths, which would make the code complicated.

Hmm. If I understand the patch correctly, it does not return any path
when merge join is allowed and there are merge clauses but no hash
clauses. In this case we will not create a foreign join path, losing
some optimization. If we remove GetExistingLocalJoinPath, which
returns a path in those cases as well, we have a regression in
performance.

>
 The reason we chose to pick up an existing path was that the
 discussion in thread [1] concluded the efficiency of the local plan
 wasn't a concern for EPQ. Are we now saying something otherwise?
>
>
>>> No, I won't.  Usually, the overhead would be negligible, but in some
>>> cases
>>> where there are many concurrent updates, the overhead might not be
>>> negligible due to many EPQ rechecks.  So it would be better to have an
>>> efficient local plan.
>
>
>> All that the EPQ rechecks do is apply the join and other quals again
>> on the base relation rows. Will choice of plan affect the efficiency?
>
>
> Merge or hash joins would need extra steps to start that work (for example,
> building a hash table from the inner relation for a hash join.)

Hmm, I agree.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] postgres_fdw bug in 9.6

2016-12-28 Thread Etsuro Fujita

On 2016/12/28 15:54, Ashutosh Bapat wrote:

On Wed, Dec 28, 2016 at 9:20 AM, Etsuro Fujita
 wrote:

On 2016/12/27 22:03, Ashutosh Bapat wrote:

If mergejoin_allowed is true and mergeclauselist is non-NIL but
hashclauselist is NIL (that's rare but there can be types that have merge
operators but not hash operators), we will end up returning NULL. I
think we want to create a merge join in that case. I think the order
of conditions should be 1. check hashclause_list then create hash join
2. else check if merge allowed, create merge join. It looks like that
would cover all the cases, if there aren't any hash clauses, and also
no merge clauses, we won't be able to implement a FULL join, so it
will get rejected during path creation itself.



Right, maybe we can do that by doing similar things as in match_unsorted_outer
and/or sort_inner_and_outer.  But as you mentioned, the case is rare, so the
problem would be whether it's worth complicating the code (and if it's
worth, whether we should do that at the first version of the function).



All I am requesting is changing the order of conditions. That doesn't
complicate the code.


I might have misunderstood your words, but you are saying we should 
consider mergejoin paths with some mergeclauses in the case where 
hashclauses is NIL, right?  To do so, we would need to consider the sort 
orders of outer/inner paths, which would make the code complicated.



The reason we chose to pick up an existing path was that the
discussion in thread [1] concluded the efficiency of the local plan
wasn't a concern for EPQ. Are we now saying something otherwise?



No, I won't.  Usually, the overhead would be negligible, but in some cases
where there are many concurrent updates, the overhead might not be
negligible due to many EPQ rechecks.  So it would be better to have an
efficient local plan.



All that the EPQ rechecks do is apply the join and other quals again
on the base relation rows. Will choice of plan affect the efficiency?


Merge or hash joins would need extra steps to start that work (for 
example, building a hash table from the inner relation for a hash join.)


Best regards,
Etsuro Fujita



