Re: Compiler warnings with --enable-dtrace

2018-05-11 Thread Peter Geoghegan
On Mon, May 7, 2018 at 9:42 AM, Tom Lane  wrote:
> Thomas Munro  writes:
>> --enable-dtrace produces compiler warnings about const correctness,
>> except on macOS.  That's because Apple's dtrace produces function
>> declarations in probes.h that take strings as const char * whereas
>> AFAIK on all other operating systems they take char * (you can see
>> that possibly recent difference in Apple's version of dt_header_decl()
>> in dt_program.c).  People have complained before[1].
>
> Yeah, it's a bit annoying, although AFAICT hardly anyone uses dtrace.

You're probably right about that, but the "--enable-dtrace" configure
option is another matter. I started to use it for release builds on my
personal Linux system a few months back, though not because I am a
SystemTap user.

lwn.net had a great (subscriber only) article just today about new
Linux tooling for USDT probes [1]. The BPF/BCC + USDT stuff only
became available in the last year or so. It seems like a technology
that has the potential to be enormously useful for debugging complex
production issues, while still being relatively easy to use. The BCC
"trace" utility [2] can be used to produce simple one-liners that spit
out interesting information about a running Postgres instance. It
seems to be surprisingly low overhead in many cases.

I've been meaning to do a write-up on all of this to make it more
accessible, though it's already quite accessible.

[1] https://lwn.net/Articles/753601/
[2] https://github.com/iovisor/bcc/blob/master/tools/trace_example.txt
-- 
Peter Geoghegan



Re: allow psql to watch \dt

2018-05-11 Thread Tom Lane
Justin Pryzby  writes:
> I thought that would be desirable, although I don't see any better way of
> getting there than this.

Hm, but a lot of the \d commands involve more than one underlying query,
as well as a bunch of postprocessing.  I doubt that the approach you seem
to be using here can handle such cases.

I think you have also stomped all over the semantics of
query-buffer-related commands that are executed in the vicinity of a \d.
Up to now, \d didn't change the query buffer.  That has its uses, eg

select somecol
\dt some*   -- how's that table spelled again?
from sometable;

We could maybe think about making it work by moving the \watch repetition
up a level, so that exec_command() as a whole would be repeated ... but
I wonder what people would think repetition of other commands such as \i,
\e, \r, \if, etc etc should mean.

On the whole I think this is not a can of worms I want to open.  There's
a clear distinction right now between plain SQL and backslash commands,
and this is going to fuzz that in ways that are hard to predict the
consequences of.

regards, tom lane



allow psql to watch \dt

2018-05-11 Thread Justin Pryzby
I thought that would be desirable, although I don't see any better way of
getting there than this.

I don't see other commands for which \watch is wanted... but who am I to
say that watching extension creation isn't useful?  So I imagine this should be
generalized to save the query buffer for all \d commands.

BTW, for amusement value, I briefly looked at implementing it for \!

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 4c85f43..580d708 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -68,7 +68,7 @@ static backslashResult exec_command_copy(PsqlScanState scan_state, bool active_b
 static backslashResult exec_command_copyright(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_crosstabview(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_d(PsqlScanState scan_state, bool active_branch,
-				  const char *cmd);
+				  const char *cmd, PQExpBuffer query_buf);
 static backslashResult exec_command_edit(PsqlScanState scan_state, bool active_branch,
 					 PQExpBuffer query_buf, PQExpBuffer previous_buf);
 static backslashResult exec_command_ef_ev(PsqlScanState scan_state, bool active_branch,
@@ -310,7 +310,7 @@ exec_command(const char *cmd,
 	else if (strcmp(cmd, "crosstabview") == 0)
 		status = exec_command_crosstabview(scan_state, active_branch);
 	else if (cmd[0] == 'd')
-		status = exec_command_d(scan_state, active_branch, cmd);
+		status = exec_command_d(scan_state, active_branch, cmd, query_buf);
 	else if (strcmp(cmd, "e") == 0 || strcmp(cmd, "edit") == 0)
 		status = exec_command_edit(scan_state, active_branch,
 								   query_buf, previous_buf);
@@ -693,7 +693,7 @@ exec_command_crosstabview(PsqlScanState scan_state, bool active_branch)
  * \d* commands
  */
 static backslashResult
-exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
+exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd, PQExpBuffer query_buf)
 {
 	backslashResult status = PSQL_CMD_SKIP_LINE;
 	bool		success = true;
@@ -720,7 +720,7 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 					success = describeTableDetails(pattern, show_verbose, show_system);
 				else
 					/* standard listing of interesting things */
-					success = listTables("tvmsE", NULL, show_verbose, show_system);
+					success = listTables("tvmsE", NULL, show_verbose, show_system, query_buf);
 				break;
 			case 'A':
 				success = describeAccessMethods(pattern, show_verbose);
@@ -794,7 +794,7 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 			case 'i':
 			case 's':
 			case 'E':
-				success = listTables(&cmd[1], pattern, show_verbose, show_system);
+				success = listTables(&cmd[1], pattern, show_verbose, show_system, query_buf);
 				break;
 			case 'r':
 				if (cmd[2] == 'd' && cmd[3] == 's')
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 410131e..b637e1b 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3404,7 +3404,7 @@ listDbRoleSettings(const char *pattern, const char *pattern2)
  * (any order of the above is fine)
  */
 bool
-listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSystem)
+listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSystem, PQExpBuffer buf)
 {
 	bool		showTables = strchr(tabtypes, 't') != NULL;
 	bool		showIndexes = strchr(tabtypes, 'i') != NULL;
@@ -3413,7 +3413,6 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
 	bool		showSeq = strchr(tabtypes, 's') != NULL;
 	bool		showForeign = strchr(tabtypes, 'E') != NULL;
 
-	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
 	static const bool translate_columns[] = {false, false, true, false, false, false, false};
@@ -3422,13 +3421,13 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
 	if (!(showTables || showIndexes || showViews || showMatViews || showSeq || showForeign))
 		showTables = showViews = showMatViews = showSeq = showForeign = true;
 
-	initPQExpBuffer(&buf);
+	initPQExpBuffer(buf);
 
 	/*
 	 * Note: as

Re: PANIC during crash recovery of a recently promoted standby

2018-05-11 Thread Michael Paquier
On Fri, May 11, 2018 at 12:09:58PM -0300, Alvaro Herrera wrote:
> Yeah, I had this exact comment, but I was unable to come up with a test
> case that would cause a problem.

pg_ctl promote would wait for the control file to be updated, so you
cannot use it in the TAP tests to trigger the promotion.  Still I think
I found one after waking up?  Please note I have not tested it:
- Use a custom trigger file and then trigger promotion with a signal.
- Use a sleep command in recovery_end_command to increase the window, as
what matters is sleeping after CreateEndOfRecoveryRecord updates the
control file.
- Issue a restart point on the standby, which will update the control
file.
- Stop the standby with immediate mode.
- Start the standby, it should see unreferenced pages.

> Hmm.  Can we change the control file in released branches?  (It should
> be possible to make the new server understand both old and new formats,
> but I think this is breaking new ground and it looks easy to introduce
> more bugs there.)

We definitely can't, even if the new value is added at the end of
DBState :( 

A couple of wild ideas, not tested, again after waking up:
1) We could also abuse existing values by using the existing
DB_IN_CRASH_RECOVERY or DB_STARTUP.  Still that's not completely true as
the cluster may be open for business as a hot standby.
2) Invent a new special value for XLogRecPtr, normally impossible to
reach, which uses high bits.
--
Michael




Re: Having query cache in core

2018-05-11 Thread Andres Freund
On 2018-05-12 08:20:13 +1000, CK Tan wrote:
> On Sat, May 12, 2018 at 8:18 AM, Tatsuo Ishii  wrote:
> 
> > >
> > > How do you handle tables hiding behind views? Also how do cached entries
> > > in pgpool know if some tables are modified without going through pgpool, e.g.
> > > plpgsql or a trigger or via psql directly?
> >
> > Pgpool-II does not invalidate cache entries for views, triggers and
> > others. That's a limitation of the implementation in Pgpool-II.
> >
> > I think an in-core query cache would not have that limitation because it
> > would have full access to the system catalogs and WAL.
> >
> >
> Yes. That's my point. You can't do it outside the core.

There are a lot of possibilities between "external daemon" and "in
core". ISTM that it's likely that the current extension API already makes
this possible. But if not, it seems like adding the few missing hooks
wouldn't be that much work.

Greetings,

Andres Freund



Re: Having query cache in core

2018-05-11 Thread CK Tan
On Sat, May 12, 2018 at 8:18 AM, Tatsuo Ishii  wrote:

> >
> > How do you handle tables hiding behind views? Also how do cached entries
> > in pgpool know if some tables are modified without going through pgpool, e.g.
> > plpgsql or a trigger or via psql directly?
>
> Pgpool-II does not invalidate cache entries for views, triggers and
> others. That's a limitation of the implementation in Pgpool-II.
>
> I think an in-core query cache would not have that limitation because it
> would have full access to the system catalogs and WAL.
>
>
Yes. That's my point. You can't do it outside the core.

-cktan


Re: Having query cache in core

2018-05-11 Thread Tatsuo Ishii
>> > I think you need to know which tables are involved and if they were
>> > modified.
>>
>> Of course. While creating a cache entry for a SELECT, we need to
>> analyze it and extract tables involved in the SELECT. The information
>> should be stored along with the cache entry. If any of the tables were
>> modified, cache entries using the table must be removed.
>> (these are already implemented in Pgpool-II's in memory query cache)
>>
> 
> How do you handle tables hiding behind views? Also how do cached entries
> in pgpool know if some tables are modified without going through pgpool, e.g.
> plpgsql or a trigger or via psql directly?

Pgpool-II does not invalidate cache entries for views, triggers and
others. That's a limitation of the implementation in Pgpool-II.

I think an in-core query cache would not have that limitation because it
would have full access to the system catalogs and WAL.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Having query cache in core

2018-05-11 Thread Tatsuo Ishii
> On 11.05.2018 18:01, Tatsuo Ishii wrote:
>> Plus checking username is neccessary (otherwise any user could
>> retrieve a cache for a table lookup which is not permitted by other
>> users).
> 
> as the tables a cached query operated on are known anyway -- that's
> needed to purge cache entries when table content changes -- schema- and
> table-level SELECT privileges can be checked ... I'm not fully sure about
> how MySQL handles its column-level privileges in that respect, something
> I'd need to try out ...

I am not talking about cache invalidation. If a cache entry is created
for a table which is only accessible by user A, the cache entry should
be hit only for A, not for anyone else. Otherwise it will be a serious
security problem.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: perlcritic: Missing "return"

2018-05-11 Thread Andrew Dunstan


On 05/11/2018 09:59 AM, Mike Blackwell wrote:
> After applying the perlcritic overrides Andrew used for the buildfarm,
> one of the most common remaining level 4 warnings in the PostgreSQL
> source, with 186 occurrences, is 'Subroutine does not end with "return"'.
>
> The point of this warning is that, in Perl, falling off the end of a
> subroutine returns the result of the last statement.  Therefor  one
> should explicitly 'return;' to make it clear the caller is not
> expecting that result as the return value. 
>
> I believe Andrew took the approach of adding return at the end of all
> functions for the buildfarm code.  Would the project prefer the same? 
> The other option would be disable the warning, based on a policy of
> always explicitly using 'return' when returning a value.
>
> Thoughts?
>
>


Part of the reasoning behind this perlcritic policy is that falling off
the end of a function can leak information. That's why I went and
cleaned it up in the buildfarm code. I don't have terribly strong
feelings either way, but if Mike wants to do the leg work then I favor
accepting a patch to clean this up.
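A minimal Perl sketch of the leak this policy guards against (hypothetical code, not from the buildfarm):

```perl
use strict;
use warnings;

my %password = (admin => 's3cret');

# Without an explicit return, a sub returns the value of its last
# evaluated statement -- here, the password lookup.
sub log_attempt {
    my ($user) = @_;
    warn "login attempt by $user\n";
    $password{$user};    # implicitly returned to the caller
}

# The perlcritic-clean version returns nothing explicitly.
sub log_attempt_safe {
    my ($user) = @_;
    warn "login attempt by $user\n";
    return;
}

my $leaked = log_attempt('admin');    # $leaked now holds the password
```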

cheers

andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Postgres 11 release notes

2018-05-11 Thread Bruce Momjian
On Fri, May 11, 2018 at 10:20:17PM +0300, Teodor Sigaev wrote:
> >>857f9c36cda520030381bd8c2af20adf0ce0e1d4 Skip full index scan during cleanup
> >>of B-tree indexes when possible
> >
> >I read that and thought it was too detailed to be in the release notes.
> >It is not that it is unimportant, but it is hard to see how people would
> >notice the difference or change their behavior based on this change.
> Hm, it could be something like:
> Add the possibility to skip scanning indexes on append-only tables; it
> decreases vacuum time on such tables.
> 
> >
> >>c0cbe00fee6d0a5e0ec72c6d68a035e674edc4cc Add explicit cast from scalar jsonb
> >>to all numeric and bool types.
> >>
> >>Pls, edit:
> >>Add function json(b)_to_tsvector to create usable text search queries
> >>matching JSON/JSONB values (Dmitry Dolgov)
> >>to
> >>Add function json(b)_to_tsvector to create usable vectors to search in
> >>json(b) (Dmitry Dolgov)
> >>or somehow else. Your wording is about a query, but actually those functions
> >>are about how to create a tsvector from json(b)
> >
> >OK, how is this text?
> >
> > Add function json(b)_to_tsvector() to create
> > text search query for matching JSON/JSONB
> > values (Dmitry Dolgov)
> Not a query - *_to_tsvector functions produce a tsvector. So:
> 
> Add function json(b)_to_tsvector() to use customizable
> full text search over json(b) columns.

Done and URL updated.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Postgres 11 release notes

2018-05-11 Thread Bruce Momjian
On Fri, May 11, 2018 at 12:28:57PM -0700, Peter Geoghegan wrote:
> On Fri, May 11, 2018 at 12:17 PM, Peter Geoghegan  wrote:
> > * Suggest replacement sort item be phrased as: "Remove the
> > configuration parameter replacement_sort_tuples.  The
> > replacement selection sort algorithm is no longer used."
> 
> Also, it should be moved to "Migration to Version 11", since the only
> issue here is compatibility with older versions.

Done.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Postgres 11 release notes

2018-05-11 Thread Bruce Momjian
On Fri, May 11, 2018 at 12:17:47PM -0700, Peter Geoghegan wrote:
> On Fri, May 11, 2018 at 12:04 PM, Andres Freund  wrote:
> > I don't think the table being 'append-only' is necessary?  Nor does it
> > have to be a manual vacuum. And 'needless index scan' sounds less than
> > it is/was, namely a full scan of the index. Perhaps something like:
> >
> >   Allow vacuum to skip doing a full scan of btree indexes after VACUUM,
> >   if not necessary.
> >
> > or something like that?
> 
> I suggest "Allow vacuuming to avoid full index scans for indexes when
> there are no dead tuples found in a table. Where necessary, the
> behavior can be adjusted via the new configuration parameter
> vacuum_cleanup_index_scale_factor."
> 
> Also:
> 
> * "Allow indexes to be built in parallel" should specify that it only
> works for B-Tree index builds.
> 
> * Suggest replacement sort item be phrased as: "Remove the
> configuration parameter replacement_sort_tuples.  The
> replacement selection sort algorithm is no longer used."

All done.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Postgres 11 release notes

2018-05-11 Thread Peter Geoghegan
On Fri, May 11, 2018 at 12:17 PM, Peter Geoghegan  wrote:
> * Suggest replacement sort item be phrased as: "Remove the
> configuration parameter replacement_sort_tuples.  The
> replacement selection sort algorithm is no longer used."

Also, it should be moved to "Migration to Version 11", since the only
issue here is compatibility with older versions.

-- 
Peter Geoghegan



Re: Postgres 11 release notes

2018-05-11 Thread Justin Pryzby
On Fri, May 11, 2018 at 12:22:08PM -0700, Andres Freund wrote:
> Btw, is it just me, or do the commit and docs confusingly say stalled when
> stale is intended?

Should be fixed since yesterday's 8e12f4a250d250a89153da2eb9b91c31bb80c483 ?

Justin



Re: Postgres 11 release notes

2018-05-11 Thread Andres Freund
On 2018-05-11 15:11:31 -0400, Bruce Momjian wrote:
> On Fri, May 11, 2018 at 04:00:58PM -0300, Alvaro Herrera wrote:
> > Bruce Momjian wrote:
> > 
> > > OK, so what is the text that people will understand?  This?
> > > 
> > >   Prevent manual VACUUMs on append-only tables from performing
> > >   needless index scans
> > 
> > Make vacuum cheaper by avoiding scans of btree indexes when not
> > necessary
> > ?
> > 
> > Why "manual vacuum"?  It's a problem with vacuums invoked from
> > autovacuum too.
> 
> Uh, from the commit it says:
> 
>   When workload on particular table is append-only, then autovacuum
>   isn't intended to touch this table. However, user may run vacuum
>   manually in order to fill visibility map and get benefits of
>   
>   index-only scans.  Then ambulkdelete wouldn't be called for
>   indexes of such table (because no heap tuples were deleted), only
>  ---
  amvacuumcleanup would be called.  In this case, amvacuumcleanup
>   would perform full index scan for two objectives: put recyclable
>   pages into free space map and update index statistics.
> 
> Why would autovacuum run on a table with no expired index entries?

Anti-Wraparound is one case. One where it's really painful to take
forever on lots of unchanged tables. Lots of hot updates are another.

Btw, is it just me, or do the commit and docs confusingly say stalled when
stale is intended?


> Personally, I think the fact that autovacuum doesn't run on such tables
> and therefore doesn't automatically do index-only scans is a problem. I
> tried to fix it years ago, but failed and gave up.

I'm really unhappy about that too.

Greetings,

Andres Freund



Re: Postgres 11 release notes

2018-05-11 Thread Teodor Sigaev

857f9c36cda520030381bd8c2af20adf0ce0e1d4 Skip full index scan during cleanup
of B-tree indexes when possible


I read that and thought it was too detailed to be in the release notes.
It is not that it is unimportant, but it is hard to see how people would
notice the difference or change their behavior based on this change.

Hm, it could be something like:
Add the possibility to skip scanning indexes on append-only tables; it
decreases vacuum time on such tables.





c0cbe00fee6d0a5e0ec72c6d68a035e674edc4cc Add explicit cast from scalar jsonb
to all numeric and bool types.

Pls, edit:
Add function json(b)_to_tsvector to create usable text search queries
matching JSON/JSONB values (Dmitry Dolgov)
to
Add function json(b)_to_tsvector to create usable vectors to search in
json(b) (Dmitry Dolgov)
or somehow else. Your wording is about a query, but actually those functions
are about how to create a tsvector from json(b)


OK, how is this text?

 Add function json(b)_to_tsvector() to create
 text search query for matching JSON/JSONB
 values (Dmitry Dolgov)

Not a query - *_to_tsvector functions produce a tsvector. So:

Add function json(b)_to_tsvector() to use 
customizable full text search over json(b) columns.



--
Teodor Sigaev  E-mail: teo...@sigaev.ru
  WWW: http://www.sigaev.ru/



Re: Postgres 11 release notes

2018-05-11 Thread Peter Geoghegan
On Fri, May 11, 2018 at 12:04 PM, Andres Freund  wrote:
> I don't think the table being 'append-only' is necessary?  Nor does it
> have to be a manual vacuum. And 'needless index scan' sounds less than
> it is/was, namely a full scan of the index. Perhaps something like:
>
>   Allow vacuum to skip doing a full scan of btree indexes after VACUUM,
>   if not necessary.
>
> or something like that?

I suggest "Allow vacuuming to avoid full index scans for indexes when
there are no dead tuples found in a table. Where necessary, the
behavior can be adjusted via the new configuration parameter
vacuum_cleanup_index_scale_factor."

Also:

* "Allow indexes to be built in parallel" should specify that it only
works for B-Tree index builds.

* Suggest replacement sort item be phrased as: "Remove the
configuration parameter replacement_sort_tuples.  The
replacement selection sort algorithm is no longer used."

-- 
Peter Geoghegan



Re: Postgres 11 release notes

2018-05-11 Thread Bruce Momjian
On Fri, May 11, 2018 at 11:59:12AM -0700, Andres Freund wrote:
> Hi,
> 
> Quick notes:
> >   
> >Add Just-In-Time (JIT) compilation of plans
> >run by the executor
> >(Andres Freund)
> >   
> >  
> 
> It's currently not yet compilation of entire plans, but only parts. I
> think that's a relevant distinction because I'd like to add that as a
> feature to v12 ;). How about just adding 'parts of' or such?  I'd also
rephrase the 'plans run by the executor' part a bit. How about:
> 
>   Add Just-In-Time (JIT) compilation of parts of
>   queries, to accelerate their execution.

OK, new text:

Add Just-In-Time (JIT) compilation of some
parts of query plans to improve execution speed (Andres Freund)

> >  
> >
> >
> >   
> >Add configure flag --with-llvm to test for
> >LLVM support (Andres Freund)
> >   
> >  
> >
> >  
> >
> >
> >   
> >Have configure check for the availability of a C++ compiler
> >(Andres Freund)
> >   
> >  
> >
> >  
> 
> I wonder if we shouldn't omit those, considering them part of the JIT
> entry?  Not quite sure what our policy there is.

OK, removed.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Postgres 11 release notes

2018-05-11 Thread Bruce Momjian
On Fri, May 11, 2018 at 04:00:58PM -0300, Alvaro Herrera wrote:
> Bruce Momjian wrote:
> 
> > OK, so what is the text that people will understand?  This?
> > 
> > Prevent manual VACUUMs on append-only tables from performing
> > needless index scans
> 
> Make vacuum cheaper by avoiding scans of btree indexes when not
> necessary
> ?
> 
> Why "manual vacuum"?  It's a problem with vacuums invoked from
> autovacuum too.

Uh, from the commit it says:

When workload on particular table is append-only, then autovacuum
isn't intended to touch this table. However, user may run vacuum
manually in order to fill visibility map and get benefits of

index-only scans.  Then ambulkdelete wouldn't be called for
indexes of such table (because no heap tuples were deleted), only
   ---
amvacuumcleanup would be called.  In this case, amvacuumcleanup
would perform full index scan for two objectives: put recyclable
pages into free space map and update index statistics.

Why would autovacuum run on a table with no expired index entries?

Personally, I think the fact that autovacuum doesn't run on such tables
and therefore doesn't automatically do index-only scans is a problem. I
tried to fix it years ago, but failed and gave up.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Postgres 11 release notes

2018-05-11 Thread Andres Freund
On 2018-05-11 14:59:04 -0400, Bruce Momjian wrote:
> On Fri, May 11, 2018 at 11:50:51AM -0700, Andres Freund wrote:
> > On 2018-05-11 14:44:06 -0400, Bruce Momjian wrote:
> > > On Fri, May 11, 2018 at 07:49:50PM +0300, Teodor Sigaev wrote:
> > > > 
> > > > 
> > > > Bruce Momjian wrote:
> > > > >I have committed the first draft of the Postgres 11 release notes.  I
> > > > >will add more markup soon.  You can view the most current version here:
> > > > >
> > > > >   http://momjian.us/pgsql_docs/release-11.html
> > > > >
> > > > >I expect a torrent of feedback.  ;-)
> > > > Hi!
> > > > 
> > > > Seems, you miss:
> > > > 857f9c36cda520030381bd8c2af20adf0ce0e1d4 Skip full index scan during 
> > > > cleanup
> > > > of B-tree indexes when possible
> > > 
> > > I read that and thought it was too detailed to be in the release notes.
> > > It is not that it is unimportant, but it is hard to see how people would
> > > notice the difference or change their behavior based on this change.
> > 
> > It's a *huge* performance problem in larger installations
> > currently. When you have a multi-TB relation and a correspondingly large
> > index, the VM allows making the heap cleanups cheap, but then the
> > index scan takes just about forever.  I know at least one large PG user
> > that moved off postgres because of it.  This won't solve all of those
> > concerns, but it definitely is crucial to know for such users.
> > 
> > People would notice by vacuums of large relations not taking forever
> > anymore. And the behaviour change would be to a) upgrade b) tune the
> > associated reloption/GUC.
> 
> OK, so what is the text that people will understand?  This?
> 
>   Prevent manual VACUUMs on append-only tables from performing
>   needless index scans

I don't think the table being 'append-only' is necessary?  Nor does it
have to be a manual vacuum. And 'needless index scan' sounds less than
it is/was, namely a full scan of the index. Perhaps something like:

  Allow vacuum to skip doing a full scan of btree indexes after VACUUM,
  if not necessary.

or something like that?


> You can see why I was hesitant to include it, based on this text, but I
> am happy to add it.

I can't. Even if the above were accurate it'd be relevant information.

Greetings,

Andres Freund



Re: Postgres 11 release notes

2018-05-11 Thread Alvaro Herrera
Andres Freund wrote:

> >   
> >Add configure flag --with-llvm to test for
> >LLVM support (Andres Freund)
> >   
> >   
> >Have configure check for the availability of a C++ compiler
> >(Andres Freund)
> >   
> 
> I wonder if we shouldn't omit those, considering them part of the JIT
> entry?  Not quite sure what our policy there is.

I don't see why users would be interested in these items at all, so +1
for omitting them.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Postgres 11 release notes

2018-05-11 Thread Alvaro Herrera
Bruce Momjian wrote:

> OK, so what is the text that people will understand?  This?
> 
>   Prevent manual VACUUMs on append-only tables from performing
>   needless index scans

Make vacuum cheaper by avoiding scans of btree indexes when not
necessary
?

Why "manual vacuum"?  It's a problem with vacuums invoked from
autovacuum too.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Postgres 11 release notes

2018-05-11 Thread Andres Freund
Hi,

Quick notes:
>   
>Add Just-In-Time (JIT) compilation of plans
>run by the executor
>(Andres Freund)
>   
>  

It's currently not yet compilation of entire plans, but only parts. I
think that's a relevant distinction because I'd like to add that as a
feature to v12 ;). How about just adding 'parts of' or such?  I'd also
rephrase the 'plans run by the executor' part a bit. How about:

  Add Just-In-Time (JIT) compilation of parts of
  queries, to accelerate their execution.


>  
>
>
>   
>Add configure flag --with-llvm to test for
>LLVM support (Andres Freund)
>   
>  
>
>  
>
>
>   
>Have configure check for the availability of a C++ compiler
>(Andres Freund)
>   
>  
>
>  

I wonder if we shouldn't omit those, considering them part of the JIT
entry?  Not quite sure what our policy there is.

Greetings,

Andres Freund



Re: Postgres 11 release notes

2018-05-11 Thread Bruce Momjian
On Fri, May 11, 2018 at 11:50:51AM -0700, Andres Freund wrote:
> On 2018-05-11 14:44:06 -0400, Bruce Momjian wrote:
> > On Fri, May 11, 2018 at 07:49:50PM +0300, Teodor Sigaev wrote:
> > > 
> > > 
> > > Bruce Momjian wrote:
> > > >I have committed the first draft of the Postgres 11 release notes.  I
> > > >will add more markup soon.  You can view the most current version here:
> > > >
> > > > http://momjian.us/pgsql_docs/release-11.html
> > > >
> > > >I expect a torrent of feedback.  ;-)
> > > Hi!
> > > 
> > > Seems, you miss:
> > > 857f9c36cda520030381bd8c2af20adf0ce0e1d4 Skip full index scan during 
> > > cleanup
> > > of B-tree indexes when possible
> > 
> > I read that and thought it was too detailed to be in the release notes.
> > It is not that it is unimportant, but it is hard to see how people would
> > notice the difference or change their behavior based on this change.
> 
> It's a *huge* performance problem in larger installations
> currently. When you have a multi-TB relation and correspondingly large
> relation, the VM allows to make the heap cleanups cheap, but then the
> index scan takes just about forever.  I know at least one large PG user
> that moved off postgres because of it.  This won't solve all of those
> concerns, but it definitely is crucial to know for such users.
> 
> People would notice by vacuums of large relations not taking forever
> anymore. And the behaviour change would be to a) upgrade b) tune the
> associated reloption/GUC.

OK, so what is the text that people will understand?  This?

Prevent manual VACUUMs on append-only tables from performing
needless index scans

You can see why I was hesitant to include it, based on this text, but I
am happy to add it.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Postgres 11 release notes

2018-05-11 Thread Andres Freund
On 2018-05-11 14:44:06 -0400, Bruce Momjian wrote:
> On Fri, May 11, 2018 at 07:49:50PM +0300, Teodor Sigaev wrote:
> > 
> > 
> > Bruce Momjian wrote:
> > >I have committed the first draft of the Postgres 11 release notes.  I
> > >will add more markup soon.  You can view the most current version here:
> > >
> > >   http://momjian.us/pgsql_docs/release-11.html
> > >
> > >I expect a torrent of feedback.  ;-)
> > Hi!
> > 
> > Seems, you miss:
> > 857f9c36cda520030381bd8c2af20adf0ce0e1d4 Skip full index scan during cleanup
> > of B-tree indexes when possible
> 
> I read that and thought it was too detailed to be in the release notes.
> It is not that it is unimportant, but it is hard to see how people would
> notice the difference or change their behavior based on this change.

It's a *huge* performance problem in larger installations
currently. When you have a multi-TB relation and correspondingly large
indexes, the visibility map allows heap cleanups to be cheap, but then the
index scan takes just about forever.  I know at least one large PG user
that moved off postgres because of it.  This won't solve all of those
concerns, but it definitely is crucial to know for such users.

People would notice by vacuums of large relations not taking forever
anymore. And the behaviour change would be to a) upgrade b) tune the
associated reloption/GUC.

Greetings,

Andres Freund



Re: Postgres 11 release notes

2018-05-11 Thread Bruce Momjian
On Fri, May 11, 2018 at 01:40:54PM -0400, Chapman Flack wrote:
> On 05/11/2018 11:08 AM, Bruce Momjian wrote:
> 
> > http://momjian.us/pgsql_docs/release-11.html
> > 
> > I expect a torrent of feedback.  ;-)
> 
> Very superficial things:
> 
> 
> Add predicate locking for Hash, GiST, and GIN indexes
> 
>   s/likelyhood/likelihood/
> 
> Add extension jsonb_plpython
> 
>   There are two such entries; one looks like it should be plperl.
> 
> Improve the speed of aggregate computations
> 
>   This entry is missing a bullet.

Great, all fixed, and the URL contents are updated with this and
previous suggestions, plus more markup.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Postgres 11 release notes

2018-05-11 Thread Bruce Momjian
On Fri, May 11, 2018 at 07:49:50PM +0300, Teodor Sigaev wrote:
> 
> 
> Bruce Momjian wrote:
> >I have committed the first draft of the Postgres 11 release notes.  I
> >will add more markup soon.  You can view the most current version here:
> >
> > http://momjian.us/pgsql_docs/release-11.html
> >
> >I expect a torrent of feedback.  ;-)
> Hi!
> 
> It seems you missed:
> 857f9c36cda520030381bd8c2af20adf0ce0e1d4 Skip full index scan during cleanup
> of B-tree indexes when possible

I read that and thought it was too detailed to be in the release notes.
It is not that it is unimportant, but it is hard to see how people would
notice the difference or change their behavior based on this change.

> c0cbe00fee6d0a5e0ec72c6d68a035e674edc4cc Add explicit cast from scalar jsonb
> to all numeric and bool types.
> 
> Please edit:
> Add function json(b)_to_tsvector to create usable text search queries
> matching JSON/JSONB values (Dmitry Dolgov)
> to
> Add function json(b)_to_tsvector to create usable vectors to search in
> json(b) (Dmitry Dolgov)
> or something else. Your wording is about queries, but those functions are
> actually about how to create a tsvector from json(b).

OK, how is this text?

Add function json(b)_to_tsvector() to create
text search query for matching JSON/JSONB
values (Dmitry Dolgov)

I have made this change but can adjust it some more.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Clock with Adaptive Replacement

2018-05-11 Thread ben.manes
I have been working on caching as a hobby project for the last few years and
would be happy to collaborate on an analysis and design. I've tackled a lot
of these issues, wrote widely used implementations, and co-authored a paper
for ACM's Transactions on Storage on an eviction policy. A short summary of
my results is described in this HighScalability article [1].

I very much agree with the requests for traces to use real-world data. I
wrote a simulator that supports a large number of policies and traces [2].
It was invaluable. I believe ARC's DS1 should be the closest to a Postgres
workload. Their OLTP trace appears to be the I/O of the WAL, making it
recency-biased and not very interesting.

** Eviction **
ARC-based policies (CAR, CART) are only modestly better than LRU in most
workloads. They do much better on Zipf distributions, but poorly in loopy
workloads like glimpse. I think the problem is that they do not capture and
use the history (ghost lists) very effectively.

LIRS-based policies (ClockPro, FRD) are very strong. Unfortunately the
papers do a poor job of describing them, they are complex to implement, and
most implementations are flawed in ways that diminish the hit rate. As
described in its paper, LIRS has an unbounded history that can cause memory
leaks if not capped, and the pruning is O(n).

I try to avoid Clock-based policies in my designs due to their O(n) scans.
This can cause latency spikes that are difficult to reproduce because it
depends on the policy's state. Random sampling policies don't have this
problem, but tend to underperform due to a lack of history. I prefer
amortized O(1) policies, but those are difficult due to concurrency (solved
below).

I chose TinyLFU with an admission window, called W-TinyLFU. This uses a
frequency sketch with saturating counters for its history (e.g.,
CountMinSketch with 4-bit counters). The counters are halved every sample
period, e.g. 10x the maximum size, which efficiently ages the LFU. Instead
of trying to order for eviction, TinyLFU avoids cache pollution by rejecting
candidates with a low probability of reuse. This is done by checking the
candidate's frequency against the eviction policy's victim, and admitting
only if the candidate offers a better hit rate. The LRU admission window
helps for recency-based traces where the candidate is rejected prematurely
and when accepted is unlikely to be used again soon. This simple and compact
scheme has the highest hit rate in most real-world workloads that I have
tested. To bypass the ACM paywall, you can use the author link on my github
project's README [3].
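(For illustration, the admission scheme described above might be sketched
roughly as follows. This is a hypothetical, simplified Python sketch, not
Caffeine's actual implementation; the sketch width, depth, and sample
period are arbitrary placeholder values.)

```python
import hashlib

class FrequencySketch:
    """Simplified count-min sketch with 4-bit saturating counters,
    halved each sample period to age out stale history (TinyLFU-style)."""
    def __init__(self, width=1024, depth=4, sample_factor=10):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]
        self.size = 0
        self.sample_period = sample_factor * width

    def _indexes(self, key):
        # Derive one column per row from a single deterministic hash.
        h = hashlib.blake2b(str(key).encode()).digest()
        for row in range(self.depth):
            yield row, int.from_bytes(h[row * 4:row * 4 + 4], "little") % self.width

    def record(self, key):
        for row, col in self._indexes(key):
            if self.table[row][col] < 15:          # 4-bit saturating counter
                self.table[row][col] += 1
        self.size += 1
        if self.size >= self.sample_period:        # halve counters to age history
            self.table = [[c // 2 for c in row] for row in self.table]
            self.size //= 2

    def estimate(self, key):
        return min(self.table[row][col] for row, col in self._indexes(key))

def admit(sketch, candidate, victim):
    """TinyLFU admission: replace the victim only if the candidate's
    estimated frequency suggests a better hit rate."""
    return sketch.estimate(candidate) > sketch.estimate(victim)
```

The point of the admission check is that the cache never has to maintain a
total eviction order by frequency; it only compares two entries at eviction
time, which keeps the bookkeeping compact.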

We are currently working on making W-TinyLFU adaptive by dynamically
resizing the window size. I can send the draft paper privately if anyone is
interested. I am a Bay Area engineer working with researchers at Israel's
Technion, so the mixture has been fruitful.

** Concurrency **
As you know, LRU's O(1) nature is very attractive but the list manipulations
are not friendly for multi-threading. The lock hold times are small, but the
access rate is high and causes a lot of contention.

The solution is to use the WAL approach to record the event into a buffer
and replay it under a tryLock. This way adding to the ring buffer is a cheap
CAS and threads won't block as their work has been handed off. When the
buffers fill up they are replayed under the lock in a batch, thereby
isolating the policy from concurrent mutations. I use multiple, lossy ring
buffers to capture reads and replaying is delayed until beneficial. For
writes, I use a ring buffer that is replayed immediately and when full
causes back pressure (though it is rarely filled).
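(A rough single-process Python illustration of the buffering pattern
described above. Real implementations use lock-free ring buffers and CAS;
this sketch only approximates that with a bounded list and a try-lock, so
it shows the structure, not the concurrency guarantees.)

```python
import threading
from collections import OrderedDict

class BufferedLRU:
    """Sketch of the read-buffer pattern: reads append to a bounded,
    lossy buffer instead of touching the LRU list directly; the buffer
    is drained under a try-lock so readers never block on the policy."""
    def __init__(self, capacity=128, buffer_size=64):
        self.capacity = capacity
        self.data = OrderedDict()            # key -> value, in LRU order
        self.read_buffer = []
        self.buffer_size = buffer_size
        self.policy_lock = threading.Lock()

    def get(self, key):
        value = self.data.get(key)
        if value is not None:
            if len(self.read_buffer) < self.buffer_size:
                self.read_buffer.append(key)   # lossy: drop events when full
            self._try_drain()
        return value

    def put(self, key, value):
        with self.policy_lock:                 # writes replay immediately
            self.data[key] = value
            self.data.move_to_end(key)
            while len(self.data) > self.capacity:
                self.data.popitem(last=False)  # evict the LRU victim

    def _try_drain(self):
        # try-lock: if another thread is already replaying, just return;
        # the buffered events will be replayed later.
        if self.policy_lock.acquire(blocking=False):
            try:
                for key in self.read_buffer:
                    if key in self.data:
                        self.data.move_to_end(key)  # record recency
                self.read_buffer.clear()
            finally:
                self.policy_lock.release()
```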

In my benchmarks [4] the overhead is quite small on a Zipf workload (to
simulate hot entries). The read throughput is 33% of a lock-free hash table
and scales linearly. The write throughput is nearly the same due to
contention when updating the hot entries. The read throughput is high enough
to meet performance budgets and its overhead is only observable in synthetic
micro-benchmarks.

The benefit of this approach is that for a very small penalty to reads, the
policy is mostly isolated from the multi-threading. This allows efficient
and complex O(1) algorithms that are not thread-safe. For example I use
W-TinyLFU (LRU+SLRU) for eviction and a Hierarchical TimerWheel for
expiration. While the initial infrastructure is more work, it's been a boon
due to composability, a simpler mental model, and not having to use
algorithms that degrade in difficult to debug ways.

Hope that helps,
Ben

[1] http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html
[2] https://github.com/ben-manes/caffeine/wiki/Simulator
[3] https://github.com/ben-manes/caffeine
[4] https://github.com/ben-manes/caffeine/wiki/Benchmarks



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: Postgres 11 release notes

2018-05-11 Thread Bruce Momjian
On Fri, May 11, 2018 at 06:29:39PM +0200, Fabien COELHO wrote:
> 
> Hello Bruce,
> 
> > http://momjian.us/pgsql_docs/release-11.html
> 
> >I expect a torrent of feedback.  ;-)
> 
> Here is some, for things I know about:
> 
> >>Add major scripting features to pgbench (Fabien Coelho)
> 
> I assume that you are referring to "bc7fa0c1". I do not think that anything
> qualifies as "major".
> 
> I would rather describe it as "Add support for NULL and boolean, as well as
> assorted operators and functions, to pgbench expressions". Or something in
> better English.

OK, new wording is:

Add pgbench expressions support for NULLs, booleans, and some
functions and operators (Fabien Coelho)

> >>Add \if macro support to pgbench (Fabien Coelho)
> 
> I would remove the word "macro", as it is not a pre-compilation feature, it
> is just a somewhat simple standard conditional.

New wording is:

Add \if conditional support to pgbench (Fabien Coelho)

> >On a personal note, I want to apologize for not adequately championing
> >two features that I think have strategic significance but didn't make it
> >into Postgres 11:  parallel FDW access and improved multi-variate
> >statistics.
> 
> I'm planning to try to help with the latter.

Great, thanks.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Postgres 11 release notes

2018-05-11 Thread Chapman Flack
On 05/11/2018 11:08 AM, Bruce Momjian wrote:

>   http://momjian.us/pgsql_docs/release-11.html
> 
> I expect a torrent of feedback.  ;-)

Very superficial things:


Add predicate locking for Hash, GiST, and GIN indexes

  s/likelyhood/likelihood/

Add extension jsonb_plpython

  There are two such entries; one looks like it should be plperl.

Improve the speed of aggregate computations

  This entry is missing a bullet.

-Chap



Re: Postgres 11 release notes

2018-05-11 Thread Teodor Sigaev



Bruce Momjian wrote:

I have committed the first draft of the Postgres 11 release notes.  I
will add more markup soon.  You can view the most current version here:

http://momjian.us/pgsql_docs/release-11.html

I expect a torrent of feedback.  ;-)

Hi!

It seems you missed:
857f9c36cda520030381bd8c2af20adf0ce0e1d4 Skip full index scan during cleanup of 
B-tree indexes when possible
c0cbe00fee6d0a5e0ec72c6d68a035e674edc4cc Add explicit cast from scalar jsonb to 
all numeric and bool types.


Please edit:
Add function json(b)_to_tsvector to create usable text search queries matching
JSON/JSONB values (Dmitry Dolgov)

to
Add function json(b)_to_tsvector to create usable vectors to search in json(b)
(Dmitry Dolgov)
or something else. Your wording is about queries, but those functions are
actually about how to create a tsvector from json(b).


Thank you!

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/



Re: Having query cache in core

2018-05-11 Thread Hartmut Holzgraefe

On 11.05.2018 18:01, Tatsuo Ishii wrote:

Plus checking username is necessary (otherwise any user could
retrieve a cache for a table lookup which is not permitted by other
users).


as the tables a cached query operated on are known anyway -- it's needed
to purge cache entries when table content changes -- schema and table
level SELECT privileges can be checked ... I'm not fully sure about
how MySQL handles its column level privileges in that respect, something
I'd need to try out ...

--
hartmut



Re: Postgres 11 release notes

2018-05-11 Thread Fabien COELHO


Hello Bruce,


http://momjian.us/pgsql_docs/release-11.html



I expect a torrent of feedback.  ;-)


Here is some, for things I know about:


Add major scripting features to pgbench (Fabien Coelho)


I assume that you are referring to "bc7fa0c1". I do not think that anything
qualifies as "major".


I would rather describe it as "Add support for NULL and boolean, as well 
as assorted operators and functions, to pgbench expressions". Or something 
in better English.



Add \if macro support to pgbench (Fabien Coelho)


I would remove the word "macro", as it is not a pre-compilation feature, 
it is just a somewhat simple standard conditional.



On a personal note, I want to apologize for not adequately championing
two features that I think have strategic significance but didn't make it
into Postgres 11:  parallel FDW access and improved multi-variate
statistics.


I'm planning to try to help with the latter.

--
Fabien.



Re: Compiler warnings with --enable-dtrace

2018-05-11 Thread David Pacheco
On Sat, May 5, 2018 at 6:22 AM, Thomas Munro 
wrote:

> Hi hackers,
>
> --enable-dtrace produces compiler warnings about const correctness,
> except on macOS.  That's because Apple's dtrace produces function
> declarations in probes.h that take strings as const char * whereas
> AFAIK on all other operating systems they take char * (you can see
> that possibly recent difference in Apple's version of dt_header_decl()
> in dt_program.c).  People have complained before[1].
>
> Maybe we should do what the Perl people do[2] and post-process the
> generated header file to add const qualifiers?  Please see attached.
>
> I have just added --enable-dtrace to my build farm animal elver so
> these warnings should appear at the next build.  I wonder if the
> owners of damselfly, castoroides, protosciurus (CCed) would consider
> adding it for them too so that we could get some coverage of this
> build option on Illumos and Solaris.
>
> [1] https://www.postgresql.org/message-id/flat/38D06FCCB225BA1C6699D4E7%
> 40amenophis
> [2] https://github.com/Perl/perl5/blob/a385812b685b3164e706880a72ee60
> c9cc9573e4/Makefile.SH#L870
>
> --
> Thomas Munro
> http://www.enterprisedb.com
>
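(For what it's worth, the post-processing idea mentioned in the quoted mail
could look roughly like the following Python sketch. The `extern void`
declaration shape is an assumption about what the generated probes.h
contains, not verified against any particular dtrace version; the real fix
would presumably be applied from the Makefile.)

```python
import re

def add_const_to_probe_decls(header_text):
    """Rewrite generated probe declarations so string arguments are
    taken as 'const char *' instead of 'char *'. Hypothetical sketch;
    only lines that look like probe declarations are touched."""
    out = []
    for line in header_text.splitlines(keepends=True):
        if line.lstrip().startswith("extern void"):
            # Negative lookbehind avoids producing 'const const char *'.
            line = re.sub(r"(?<!const )char \*", "const char *", line)
        out.append(line)
    return "".join(out)
```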


I've updated the damselfly (illumos build farm member) configuration to add
"--enable-dtrace":
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=damselfly=2018-05-11%2015%3A55%3A03

Thanks for the heads up!

-- Dave


Re: Having query cache in core

2018-05-11 Thread Tatsuo Ishii
> that's almost actually how the MySQL query cache works: the query
> cache
> lookup kicks in even before the query is parsed, to get maximum gain
> from cache hits.
> 
> It does not just take the query text into account alone though,
> but also the current default database, protocol version,

> and character
> set/collation settings of the client session, as all these may have
> an influence on the actual result values, too ...

Plus checking username is necessary (otherwise any user could
retrieve a cache for a table lookup which is not permitted by other
users).
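(A toy sketch of what such a lookup key might include, combining the query
text with the session state mentioned above. The field names are
illustrative only, not any server's actual fields.)

```python
import hashlib

def cache_key(query_text, database, username, protocol_version, client_encoding):
    """Build a query-cache lookup key from the raw query text plus the
    per-session state that can change the result bytes: current database,
    user (for privilege separation), protocol version, and encoding."""
    parts = (query_text, database, username, str(protocol_version), client_encoding)
    # NUL separator prevents ambiguity between adjacent fields.
    return hashlib.sha256("\x00".join(parts).encode()).hexdigest()
```

Including the username in the key is the simplest way to get the privilege
separation described above, at the cost of not sharing cache entries
between users with identical privileges.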

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: Having query cache in core

2018-05-11 Thread CK Tan
On Fri, May 11, 2018, 10:26 PM Tatsuo Ishii  wrote:

>
> >
> > I think you need to know which tables are involved and if they were
> > modified.
>
> Of course. While creating a cache entry for a SELECT, we need to
> analyze it and extract tables involved in the SELECT. The information
> should be stored along with the cache entry. If any of the tables were
> modified, cache entries using the table must be removed.
> (these are already implemented in Pgpool-II's in memory query cache)
>

How do you handle tables hiding behind views? Also, how do cached entries
in Pgpool know that some tables are modified without going through Pgpool,
e.g. from PL/pgSQL, a trigger, or via psql directly?

=cktan


Re: Having query cache in core

2018-05-11 Thread Tatsuo Ishii
>> If any of the tables were modified, cache entries using the table must be
> removed.
>> (these are already implemented in Pgpool-II's in memory query cache)
> 
> How do you identify updates made from a pl/pgsql procedure?

Pgpool-II does not invalidate query cache in that case.

I think in-core query cache could deal with the case (probably as
Michael suggested).
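(As a rough illustration of the table-based invalidation scheme being
discussed here, with no claim to match Pgpool-II's actual implementation:
each cached SELECT records the tables it read, and any write to a table
drops every entry that depends on it.)

```python
from collections import defaultdict

class QueryCache:
    """Sketch of table-based cache invalidation."""
    def __init__(self):
        self.results = {}                  # cache key -> result rows
        self.deps = defaultdict(set)       # table name -> dependent keys

    def store(self, key, tables, result):
        # Record the tables extracted when the SELECT was analyzed.
        self.results[key] = result
        for table in tables:
            self.deps[table].add(key)

    def lookup(self, key):
        return self.results.get(key)

    def invalidate_table(self, table):
        # A write to 'table' removes every cache entry that read it.
        for key in self.deps.pop(table, set()):
            self.results.pop(key, None)
```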

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: PANIC during crash recovery of a recently promoted standby

2018-05-11 Thread Alvaro Herrera
Michael Paquier wrote:
> On Thu, May 10, 2018 at 10:52:12AM +0530, Pavan Deolasee wrote:
> > I propose that we should always clear the minRecoveryPoint after promotion
> > to ensure that crash recovery always runs to the end if a just-promoted
> > standby crashes before completing its first regular checkpoint. A WIP patch
> > is attached.
> 
> I have been playing with your patch and upgraded the test to check as
> well for cascading standbys.  We could use that in the final patch.
> That's definitely something to add in the recovery test suite, and the
> sleep phases should be replaced by waits on replay and/or flush.
> 
> Still, that approach looks sensitive to me.  A restart point could be
> running while the end-of-recovery record is inserted, so your patch
> could update minRecoveryPoint to InvalidXLogRecPtr, and then a restart
> point would happily update again the control file's minRecoveryPoint to
> lastCheckPointEndPtr because it would see that the former is older than
> lastCheckPointEndPtr (let's not forget that InvalidXLogRecPtr is 0), so
> you could still crash on invalid pages?

Yeah, I had this exact comment, but I was unable to come up with a test
case that would cause a problem.

> I need to think a bit more about that stuff, but one idea would be to
> use a special state in the control file to mark it as ending recovery,
> this way we would control race conditions with restart points.

Hmm.  Can we change the control file in released branches?  (It should
be possible to make the new server understand both old and new formats,
but I think this is breaking new ground and it looks easy to introduce
more bugs there.)

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Postgres 11 release notes

2018-05-11 Thread Bruce Momjian
I have committed the first draft of the Postgres 11 release notes.  I
will add more markup soon.  You can view the most current version here:

http://momjian.us/pgsql_docs/release-11.html

I expect a torrent of feedback.  ;-)

On a personal note, I want to apologize for not adequately championing
two features that I think have strategic significance but didn't make it
into Postgres 11:  parallel FDW access and improved multi-variate
statistics.  I hope to do a better job during Postgres 12.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: [HACKERS] Surjective functional indexes

2018-05-11 Thread Simon Riggs
On 11 May 2018 at 16:37, Andres Freund  wrote:
> On 2018-05-11 14:56:12 +0200, Simon Riggs wrote:
>> On 11 May 2018 at 05:32, Andres Freund  wrote:
>> > No. Simon just claimed it's not actually a concern:
>> > https://www.postgresql.org/message-id/canp8+j+vtskphep_gmqmeqdwakst2kbotee0yz-my+agh0a...@mail.gmail.com
>> >
>> > And yes, it got committed without doing squat to address the
>> > architectural concerns.
>>
>> "Squat" means "zero, nothing" to me. So that comment would be inaccurate.
>
> Yes, I know it means that. And I don't see how it is inaccurate.
>
>
>> I have no problem if you want to replace this with an even better
>> design in a later release.
>
> Meh. The author / committer should get a patch into the right shape

They have done, at length. Claiming otherwise is just trash talk.

As you pointed out, the design of the patch avoids layering violations
that could have led to architectural objections. Are you saying I
should have ignored your words and rewritten the patch to introduce a
layering violation? What other objection do you think has not been
addressed?

-- 
Simon Riggs  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_locale compilation error with Visual Studio 2017

2018-05-11 Thread Tom Lane
Sandeep Thakkar  writes:
> I found the same error was raised by someone in pgsql-general lists but
> don't see the submitted patch was committed. Here is the discussion link
> https://www.postgresql.org/message-id/CAL5LfdQdxt9u1W4oJmELybVBJE3X4rtNrNp62NNFw9n%3DXd-wTQ%40mail.gmail.com

Yeah, that patch was rejected as unmaintainable, and it still is.
Somebody needs to do some research in Microsoft's docs and find out
what is the MS-approved way of getting the locale name now.

In the other thread, Munro wondered if ResolveLocaleName() would do
the trick, but AFAICS nobody has tried it.

Another idea would be to replace the whole mess with a lookup table,
although how to fill the table or maintain it in future would be
problematic too.

regards, tom lane



Re: [HACKERS] Surjective functional indexes

2018-05-11 Thread Andres Freund
On 2018-05-11 14:56:12 +0200, Simon Riggs wrote:
> On 11 May 2018 at 05:32, Andres Freund  wrote:
> > No. Simon just claimed it's not actually a concern:
> > https://www.postgresql.org/message-id/canp8+j+vtskphep_gmqmeqdwakst2kbotee0yz-my+agh0a...@mail.gmail.com
> >
> > And yes, it got committed without doing squat to address the
> > architectural concerns.
> 
> "Squat" means "zero, nothing" to me. So that comment would be inaccurate.

Yes, I know it means that. And I don't see how it is inaccurate.


> I have no problem if you want to replace this with an even better
> design in a later release.

Meh. The author / committer should get a patch into the right shape, not
other people that are concerned with the consequences.

Greetings,

Andres Freund



Re: [HACKERS] Surjective functional indexes

2018-05-11 Thread David G. Johnston
On Fri, May 11, 2018 at 4:58 AM, Konstantin Knizhnik <
k.knizh...@postgrespro.ru> wrote:

>
> Sorry, maybe I do not completely understand you.
> So what happened before this patch:
>
> - On update, postgres compares the old and new values of all changed
> attributes to determine whether they actually changed.
> - If the value of some indexed attribute changed, then hot update is not
> applicable and we have to rebuild indexes.
> - Otherwise hot update is used and indexes should not be updated.
>
> What is changed:
>
> - When some attribute on which a functional index depends is changed,
> then we calculate the value of the index expression. This is done using the
> existing FormIndexDatum function, which stores the calculated expression
> value in the provided slot. This evaluation of index expressions and their
> comparison is done in the access/heap/heapam.c file.
> - Only if the old and new values of the index expression differ is hot
> update really not applicable.
>
>
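(The decision procedure quoted above could be sketched like this; a
simplified Python illustration only, not the actual heapam.c logic, and
the row/index representations are invented for the example.)

```python
def hot_update_applicable(old_row, new_row, indexes):
    """Decide whether a heap-only-tuple (HOT) update is possible: for
    an expression index whose input columns changed, recompute the
    expression on the old and new rows and allow HOT only if the
    result is unchanged."""
    changed = {c for c in new_row if old_row.get(c) != new_row.get(c)}
    for index in indexes:
        if not (set(index["columns"]) & changed):
            continue                 # no indexed input column changed
        expr = index.get("expression")
        if expr is None:
            return False             # plain column index: value changed
        if expr(old_row) != expr(new_row):
            return False             # index expression result changed
    return True
```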
Thanks.

So, in my version of layman's terms, before this patch we treated simple
column referenced indexes and expression indexes differently because in a
functional index we didn't actually compare the expression results, only
the original values of the depended-on columns.  With this patch both are
treated the same - and in a manner that I would think would be
normal/expected.  Treating expression indexes this way, however, is
potentially more costly than before and thus we want to provide the user a
way to say "hey, PG, don't bother with the extra effort of comparing the
entire expression, it ain't gonna matter since if I change the depended-on
values odds are the expression result will change as well."  Changing an
option named "recheck_on_update" doesn't really communicate that to me (I
dislike the word recheck, too implementation specific).  Something like
"only_compare_dependencies_on_update (default false)" would do a better job
at that - tell me what the abnormal behavior is, not the normal, and lets
me enable it instead of disabling the default behavior and not know what it
is falling back to.  The description for the parameter does a good job of
describing this - just suggesting that the name match how the user might
see things.

On the whole the change makes sense - and the general assumptions around
runtime cost seem sound and support the trade-off simply re-computing the
expression on the old tuple versus engineering an inexpensive way to
retrieve the existing value from the index.

I don't have a problem with "total expression cost" being used as a
heuristic - though maybe that would mean we should make this an enum with
(off, on, never/always) with the third option overriding all heuristics.

David J.


perlcritic: Missing "return"

2018-05-11 Thread Mike Blackwell
After applying the perlcritic overrides Andrew used for the buildfarm, one
of the most common remaining level 4 warnings in the PostgreSQL source,
with 186 occurrences, is 'Subroutine does not end with "return"'.

The point of this warning is that, in Perl, falling off the end of a
subroutine returns the result of the last statement.  Therefore one should
explicitly 'return;' to make it clear that the caller is not meant to use
that result as the return value.

I believe Andrew took the approach of adding return at the end of all
functions for the buildfarm code.  Would the project prefer the same?  The
other option would be to disable the warning, based on a policy of always
explicitly using 'return' when returning a value.

Thoughts?

Mike B


Re: Considering signal handling in plpython again

2018-05-11 Thread Heikki Linnakangas


On 11 May 2018 10:01:56 EEST, Hubert Zhang  wrote:
>2. Add a flag in hook function to indicate whether to call
>Py_AddPendingCall.
>This is straightforward.(I prefer it)

Yeah, that's what I had in mind, too. A global bool variable that's set when 
you enter libpython, and cleared on return. We need to handle nesting, i.e. if a
PL/python function runs a slow query with SPI, and cancellation happens during 
that. And the case that the SPI query calls another PL/python function.

- Heikki



Re: Having query cache in core

2018-05-11 Thread Hartmut Holzgraefe

On 11.05.2018 14:26, Tatsuo Ishii wrote:

If any of the tables were
modified, cache entries using the table must be removed.
(these are already implemented in Pgpool-II's in memory query cache)


... and this is the expensive part in the MySQL implementation that
has rendered the query cache mostly useless for the last decade or so:

Unless you come up with a clever lockless implementation concurrent
writes on the same table effectively become serialized by this,
creating serious contention problems.

Peter Zaitsev once listed several points that could be improved to
make the query cache somewhat useful again, but in the end no one
really seemed to be interested in doing so, including
Percona themselves, as apparently even without the contention issues
there are only few workloads these days that would significantly profit
from cached result sets.

https://www.percona.com/blog/2011/04/10/should-we-give-a-mysqlquery-cache-a-second-chance/

Maybe this list can be taken as a "what to avoid" hint sheet ...

--
hartmut



Re: [HACKERS] Surjective functional indexes

2018-05-11 Thread Simon Riggs
On 11 May 2018 at 05:32, Andres Freund  wrote:
> On 2018-05-10 23:25:58 -0400, Robert Haas wrote:
>> On Thu, Mar 1, 2018 at 2:48 PM, Andres Freund  wrote:
>> > I still don't think, as commented upon by Tom and me upthread, that we
>> > want this feature in the current form.
>>
>> Was this concern ever addressed, or did the patch just get committed anyway?
>
> No. Simon just claimed it's not actually a concern:
> https://www.postgresql.org/message-id/canp8+j+vtskphep_gmqmeqdwakst2kbotee0yz-my+agh0a...@mail.gmail.com
>
> And yes, it got committed without doing squat to address the
> architectural concerns.

"Squat" means "zero, nothing" to me. So that comment would be inaccurate.

I've spent a fair amount of time reviewing and grooming the patch, and
I am happy that Konstantin has changed his patch significantly in
response to those reviews, as well as explaining why the patch is fine
as it is. It's a useful patch that PostgreSQL needs, so all good.

I have no problem if you want to replace this with an even better
design in a later release.

-- 
Simon Riggs  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Indexes on partitioned tables and foreign partitions

2018-05-11 Thread Simon Riggs
On 9 May 2018 at 17:33, Robert Haas  wrote:
> On Wed, May 9, 2018 at 11:20 AM, Simon Riggs  wrote:
>> On 9 May 2018 at 16:15, Robert Haas  wrote:
>>> On Wed, May 9, 2018 at 11:14 AM, Simon Riggs  wrote:
 On 9 May 2018 at 16:10, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, May 9, 2018 at 9:08 AM, Simon Riggs  
>> wrote:
>>> Shouldn't the fix be to allow creation of indexes on foreign tables?
>>> (Maybe they would be virtual or foreign indexes??)
>
>> It might be useful to invent the concept of a foreign index, but not
>> for v11 a month after feature freeze.
>
> Yeah.  That's a can of worms we can *not* open at this stage.

 Lucky nobody suggested that then, eh? Robert's just making a joke.
>>>
>>> Someone did suggest that.  It was you.
>>
>> Oh, you weren't joking. I think we're having serious problems with
>> people putting words in my mouth again then.
>>
>> Please show me where I suggested doing anything for v11?
>
> Come on, Simon.  It's in the quoted text.

No, it's not.

>  I realize you didn't say
> v11 specifically, but this is the context of a patch that is proposed
> a bug-fix for v11.  If you meant that we should apply the patch as
> proposed now, or some other one, and do the other thing later, you
> could have said so.

I think it's reasonable to expect you to interpret my words sensibly,
rather than in some more dramatic form where I seem to break rules
with every phrase.

-- 
Simon Riggs  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: postgres_fdw: Oddity in pushing down inherited UPDATE/DELETE joins to remote servers

2018-05-11 Thread Etsuro Fujita
(2018/05/11 16:19), Amit Langote wrote:
> On 2018/05/11 16:12, Amit Langote wrote:
>> Just to clarify, does this problem only arise because there is a pushed
>> down join involving the child?  That is, does the problem only occur as of
>> the following commit:
>>
>> commit 1bc0100d270e5bcc980a0629b8726a32a497e788
>> Author: Robert Haas
>> Date:   Wed Feb 7 15:34:30 2018 -0500
>>
>>  postgres_fdw: Push down UPDATE/DELETE joins to remote servers.
>>
>> In other words, do we need to back-patch this up to 9.5 which added
>> foreign table inheritance?
> 
> Oops, it should have been clear by the subject line that the problem
> didn't exist before that commit.  Sorry.

No.  In theory, I think we could consider this as an older bug added in
9.5, because in case of inherited UPDATE/DELETE, the PlannerInfo passed
to PlanForeignModify doesn't match the one the FDW saw at Path creation
time, as you mentioned in a previous email, while in case of
non-inherited UPDATE/DELETE, the PlannerInfo passed to that function
matches the one the FDW saw at that time.  I think that's my fault :(.
But considering that there seem to be no field reports on that, I don't
think we need back-patching up to 9.5.

Best regards,
Etsuro Fujita



Re: postgres_fdw: Oddity in pushing down inherited UPDATE/DELETE joins to remote servers

2018-05-11 Thread Etsuro Fujita
Hi Amit,

(2018/05/11 16:12), Amit Langote wrote:
> On 2018/05/10 21:41, Etsuro Fujita wrote:
>> I think the reason for that is: in that case we try to find the target
>> foreign-join RelOptInfo using find_join_rel in postgresPlanDirectModify,
>> but can't find it, because the given root is the *parent* root and
>> doesn't have join RelOptInfos with it.  To fix this, I'd like to propose
>> to modify make_modifytable so that in case of an inherited
>> UPDATE/DELETE, it passes to PlanDirectModify the per-child modified
>> subroot, not the parent root, for the FDW to get the foreign-join
>> RelOptInfo from the given root in PlanDirectModify.  I think that that
>> would be more consistent with the non-inherited UPDATE/DELETE case in
>> that the FDW can look at any join RelOptInfos in the given root in
>> PlanDirectModify, which I think would be a good thing because some FDWs
>> might need to do that for some reason.  For the same reason, I'd also
>> like to propose to pass to PlanForeignModify the per-child modified
>> subroot, not the parent root, as for PlanDirectModify.  Attached is a
>> proposed patch for that.  I'll add this to the open items list for PG11.
> 
> So IIUC, we must pass the per-child PlannerInfo here, because that's what
> would have been used for the planning join between the child (ft1 in your
> example) and the other table (ft2 in your example).  So that's where the
> RelOptInfo's corresponding to planning for the child, including that for
> the pushed-down join, would be stored.

Yeah, I think so too.

> Anyway, the patch and tests it adds look good.

Thanks for the review!

I added an assertion test to make_modifytable to match that in
create_modifytable_path.  Attached is an updated version of the patch.

Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/expected/postgres_fdw.out
--- b/contrib/postgres_fdw/expected/postgres_fdw.out
***
*** 7370,7375  drop table bar cascade;
--- 7370,7450 
  NOTICE:  drop cascades to foreign table bar2
  drop table loct1;
  drop table loct2;
+ -- Test pushing down UPDATE/DELETE joins to the remote server
+ create table parent (a int, b text);
+ create table loct1 (a int, b text);
+ create table loct2 (a int, b text);
+ create foreign table remt1 (a int, b text)
+   server loopback options (table_name 'loct1');
+ create foreign table remt2 (a int, b text)
+   server loopback options (table_name 'loct2');
+ alter foreign table remt1 inherit parent;
+ insert into remt1 values (1, 'foo');
+ insert into remt1 values (2, 'bar');
+ insert into remt2 values (1, 'foo');
+ insert into remt2 values (2, 'bar');
+ analyze remt1;
+ analyze remt2;
+ explain (verbose, costs off)
+ update parent set b = parent.b || remt2.b from remt2 where parent.a = remt2.a returning *;
+   QUERY PLAN   
+ ---
+  Update on public.parent
+Output: parent.a, parent.b, remt2.a, remt2.b
+Update on public.parent
+Foreign Update on public.remt1
+->  Nested Loop
+  Output: parent.a, (parent.b || remt2.b), parent.ctid, remt2.*, remt2.a, remt2.b
+  Join Filter: (parent.a = remt2.a)
+  ->  Seq Scan on public.parent
+Output: parent.a, parent.b, parent.ctid
+  ->  Foreign Scan on public.remt2
+Output: remt2.b, remt2.*, remt2.a
+Remote SQL: SELECT a, b FROM public.loct2
+->  Foreign Update
+  Remote SQL: UPDATE public.loct1 r4 SET b = (r4.b || r2.b) FROM public.loct2 r2 WHERE ((r4.a = r2.a)) RETURNING r4.a, r4.b, r2.a, r2.b
+ (14 rows)
+ 
+ update parent set b = parent.b || remt2.b from remt2 where parent.a = remt2.a returning *;
+  a |   b| a |  b  
+ ---++---+-
+  1 | foofoo | 1 | foo
+  2 | barbar | 2 | bar
+ (2 rows)
+ 
+ explain (verbose, costs off)
+ delete from parent using remt2 where parent.a = remt2.a returning parent;
+ QUERY PLAN
+ --
+  Delete on public.parent
+Output: parent.*
+Delete on public.parent
+Foreign Delete on public.remt1
+->  Nested Loop
+  Output: parent.ctid, remt2.*
+  Join Filter: (parent.a = remt2.a)
+  ->  Seq Scan on public.parent
+Output: parent.ctid, parent.a
+  ->  Foreign Scan on public.remt2
+Output: remt2.*, remt2.a
+Remote SQL: SELECT a, b FROM public.loct2
+->  Foreign Delete
+  Remote SQL: DELETE FROM public.loct1 r4 USING public.loct2 r2 WHERE ((r4.a = r2.a)) RETURNING r4.a, r4.b
+ (14 rows)
+ 
+ delete from parent using remt2 

Re: Having query cache in core

2018-05-11 Thread Vladimir Sitnikov
> If any of the tables were modified, cache entries using the table must be
removed.
> (these are already implemented in Pgpool-II's in memory query cache)

How do you identify updates made from a pl/pgsql procedure?

Vladimir


Re: Having query cache in core

2018-05-11 Thread Tatsuo Ishii
>> Thanks for the advice. But I rather thought about bypassing the raw
>> parser and the planner. i.e. use the query string (or its hash) as the
>> index of the query cache.
>>
> 
> I think you need to know which tables are involved and if they were
> modified.

Of course. While creating a cache entry for a SELECT, we need to
analyze it and extract tables involved in the SELECT. The information
should be stored along with the cache entry. If any of the tables were
modified, cache entries using the table must be removed.
(these are already implemented in Pgpool-II's in memory query cache)
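The scheme described above — a cache keyed by the query text, with each entry carrying the set of tables the SELECT touches so that a write to any of them drops the dependent entries — can be sketched like this. This is an illustrative Python model, not Pgpool-II's actual C implementation:

```python
class QueryCache:
    def __init__(self):
        self.entries = {}    # query text -> (cached result, tables used)
        self.by_table = {}   # table name -> set of query texts using it

    def put(self, query, result, tables):
        self.entries[query] = (result, frozenset(tables))
        for t in tables:
            self.by_table.setdefault(t, set()).add(query)

    def get(self, query):
        entry = self.entries.get(query)
        return entry[0] if entry is not None else None

    def invalidate_table(self, table):
        # Called whenever the table is modified (INSERT/UPDATE/DELETE, DDL).
        for query in self.by_table.pop(table, set()):
            entry = self.entries.pop(query, None)
            # Unlink the dropped query from its other tables too.
            if entry is not None:
                for t in entry[1]:
                    self.by_table.get(t, set()).discard(query)
```

The open question raised downthread — updates made from inside a pl/pgsql procedure — is exactly the case where the table set is not visible from the query text alone.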

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: [HACKERS] Surjective functional indexes

2018-05-11 Thread Konstantin Knizhnik



On 11.05.2018 07:48, David G. Johnston wrote:
On Thursday, February 1, 2018, Konstantin Knizhnik 
> wrote:



Old + New for check = 2
plus calculate again in index = 3


Yes, we have to calculate the value of the index expression for the
original and the updated version of the record. If they are equal,
then that is all we have to do with this index: HOT update is
applicable.
In this case we calculate the index expression twice.
But if they are not equal, then HOT update is not applicable and
we have to update the index. In that case the expression will be
calculated one more time, so three times in total.
This is why, if calculation of the index expression is very expensive,
the effect of this optimization may be negative even if the value of
the index expression is not changed.


For the old/new comparison and the re-calculate-if-changed dynamics:
is this a side effect of separation of concerns only, or is there some
other reason the old computed value already stored in the index isn't
compared to the one and only function result of the new tuple, which,
if found to be different, is then stored?  One function invocation,
which has to happen anyway, and one extra equality check.  It seems like
this could be used for non-functional indexes too, so that mere
presence in the update list doesn't negate HOT if the column didn't
actually change (if I'm not mis-remembering something here, anyway...)


Also, create index page doc typo from site:  "with an total" s/b "with 
a total" (expression cost less than 1000) - maybe add a comma for 1,000


David J.




Sorry, maybe I do not completely understand you.
So what happened before this patch:

- On update, postgres compares the old and new values of all changed
attributes to determine whether they are actually changed.
- If the value of some indexed attribute is changed, then HOT update is not
applicable and we have to rebuild indexes.

- Otherwise HOT update is used and indexes are not updated.

What is changed:

- When some attribute on which a functional index depends is
changed, we calculate the value of the index expression. This is done using
the existing FormIndexDatum function, which stores the calculated expression
value in the provided slot. This evaluation of index expressions and their
comparison is done in the access/heap/heapam.c file.
- Only if the old and new values of the index expression are different is
HOT update really not applicable.
- In this case we have to rebuild indexes. This is done by
ExecInsertIndexTuples in executor/execIndexing.c, which calls
FormIndexDatum once again to calculate the index expression.


So in principle, it is certainly possible to store the value of the index
expression calculated in ProjIndexIsUnchanged and reuse it in
ExecInsertIndexTuples.

But I do not know the right place (context) where this value could be stored.
It would also add a dependency between the heapam and execIndexing
modules. It is also necessary to take into account that
ProjIndexIsUnchanged is not always called, so the index expression value may
not be present.


Finally, most index expressions used in practice are not expensive.
It is usually something like extraction of a JSON component. The cost of
executing such a function is much smaller than the cost of extracting and
unpacking the toasted JSON value. So I do not think that such an
optimization would be really useful, while it is obvious that it would
significantly complicate the code.

Also please notice that FormIndexDatum is used in some other places:

  utils/adt/selfuncs.c
  utils/sort/tuplesort.c
  commands/constraint.c

So this patch just adds two more calls of this function.
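The two-versus-three evaluation trade-off described in this mail can be sketched as follows. This is an illustrative Python model of the control flow, not the actual heapam.c/execIndexing.c code:

```python
def try_hot_update(index_expr, old_tuple, new_tuple):
    """Return (hot_applicable, evaluations_of_index_expr)."""
    evals = 0
    # ProjIndexIsUnchanged: evaluate the expression for both tuple versions.
    old_val = index_expr(old_tuple); evals += 1
    new_val = index_expr(new_tuple); evals += 1
    if old_val == new_val:
        return True, evals           # HOT update: index left untouched
    # Not HOT: rebuilding the index entry evaluates the expression once
    # more, as ExecInsertIndexTuples calls FormIndexDatum again.
    _index_entry = index_expr(new_tuple); evals += 1
    return False, evals
```

With a cheap expression (e.g. extracting one JSON field) the extra evaluation is negligible; with an expensive one, the three-call path is where the optimization can turn into a net loss.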

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Having query cache in core

2018-05-11 Thread Hartmut Holzgraefe

On 11.05.2018 11:12, Tatsuo Ishii wrote:

Thanks for the advice. But I rather thought about bypassing the raw
parser and the planner. i.e. use the query string (or its hash) as the
index of the query cache.


that's almost exactly how the MySQL query cache works: the query cache
lookup kicks in even before the query is parsed, to get maximum gain
from cache hits.

It does not take the query text into account alone, though,
but also the current default database, protocol version, and character
set/collation settings of the client session, as all these may have
an influence on the actual result values, too ...
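A pre-parse cache therefore has to fold that session state into its lookup key, roughly like this. The field list below is illustrative, not MySQL's exact one:

```python
import hashlib

def cache_key(query, default_db, protocol_version, charset, collation):
    # Anything besides the query text that can change the result bytes
    # must be part of the key, since no parsing has happened yet.
    parts = (query, default_db, str(protocol_version), charset, collation)
    return hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()
```

The same query text issued against a different default database then misses the cache instead of returning another database's rows.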

--
Hartmut Holzgraefe, Principal Support Engineer (EMEA)
MariaDB Corporation | http://www.mariadb.com/



Re: Needless additional partition check in INSERT?

2018-05-11 Thread Amit Khandekar
On 11 May 2018 at 14:50, Amit Khandekar  wrote:
> On 10 May 2018 at 15:26, David Rowley  wrote:
>> Yeah, the comments do need work. In order to make it a bit easier to
>> document I changed the way that check_partition_constr is set. This is
>> now done with an if/else if/else clause for both COPY and INSERT.
>>
>> Hopefully, that's easier to understand and prevents further mistakes.
>>
>> Patch attached.
>
> The patch looks good in terms of handling the redundant constraint check.
>
> Since this patch also does some minor code restructuring with the if
> conditions, below are some comments on them :
>
> With the patch, the if condition uses ri_PartitionCheck, and further
> ExecConstraints() also uses ri_PartitionCheck to decide whether to
> call ExecPartitionCheck(). So instead, if we just don't bother about
> ri_PartitionCheck outside ExecConstraints() and let ExecConstraints()
> handle it, then the if conditions would get simpler :
>
> if (resultRelInfo->ri_PartitionRoot == NULL ||
> (resultRelInfo->ri_TrigDesc &&
>  resultRelInfo->ri_TrigDesc->trig_insert_before_row))
>check_partition_constr = true;
> else
>check_partition_constr = false;

This looks better (it will avoid unnecessary ExecConstraints() call) :

if (resultRelInfo->ri_PartitionRoot == NULL ||
 (resultRelInfo->ri_TrigDesc &&
  resultRelInfo->ri_TrigDesc->trig_insert_before_row))
check_partition_constr = resultRelInfo->ri_PartitionCheck;
else
   check_partition_constr = false;

>
> --
> Thanks,
> -Amit Khandekar
> EnterpriseDB Corporation
> The Postgres Database Company



-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company



Re: Needless additional partition check in INSERT?

2018-05-11 Thread Amit Khandekar
On 10 May 2018 at 15:26, David Rowley  wrote:
> Yeah, the comments do need work. In order to make it a bit easier to
> document I changed the way that check_partition_constr is set. This is
> now done with an if/else if/else clause for both COPY and INSERT.
>
> Hopefully, that's easier to understand and prevents further mistakes.
>
> Patch attached.

The patch looks good in terms of handling the redundant constraint check.

Since this patch also does some minor code restructuring with the if
conditions, below are some comments on them :

With the patch, the if condition uses ri_PartitionCheck, and further
ExecConstraints() also uses ri_PartitionCheck to decide whether to
call ExecPartitionCheck(). So instead, if we just don't bother about
ri_PartitionCheck outside ExecConstraints() and let ExecConstraints()
handle it, then the if conditions would get simpler :

if (resultRelInfo->ri_PartitionRoot == NULL ||
(resultRelInfo->ri_TrigDesc &&
 resultRelInfo->ri_TrigDesc->trig_insert_before_row))
   check_partition_constr = true;
else
   check_partition_constr = false;

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company



Re: Having query cache in core

2018-05-11 Thread CK Tan
On Fri, May 11, 2018, 7:13 PM Tatsuo Ishii  wrote:

> > You could probably write an extension for that, though. I think the
> > planner hook and custom scans give you enough flexibility to do that
> > without modifying the server code.
>
> Thanks for the advice. But I rather thought about bypassing the raw
> parser and the planner. i.e. use the query string (or its hash) as the
> index of the query cache.
>

I think you need to know which tables are involved and if they were
modified.

Regards,
-cktan


Re: Having query cache in core

2018-05-11 Thread Tatsuo Ishii
> You could probably write an extension for that, though. I think the
> planner hook and custom scans give you enough flexibility to do that
> without modifying the server code.

Thanks for the advice. But I rather thought about bypassing the raw
parser and the planner. i.e. use the query string (or its hash) as the
index of the query cache.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: [HACKERS] asynchronous execution

2018-05-11 Thread Kyotaro HORIGUCHI
Hello. This is the new version of $Subject.

But this is not just a rebased version. While fixing
serious conflicts, I refactored the patch, and I believe it has become
far more readable than its previous shape.

# 0003 lacks changes of postgres_fdw.out now.

- Waiting queue manipulation is moved into new functions. It had
  a bug where the same node could be inserted into the queue more than
  once; that is now fixed.

- postgresIterateForeignScan had a somewhat tricky structure to
  merge similar procedures, so it could not be called easy to read at
  all. Now it is far simpler and more straightforward.

- Still this works only on Append/ForeignScan.

> > The attached PoC patch theoretically has no impact on the normal
> > code paths and just brings gain in async cases.

I performed almost the same test as before but with:

- partition tables
  (There should be no difference with inheritance.)

- added tests for fetch_size of 200 and 1000 as well as 100.

  100 unreasonably magnifies the lag from context switching on a
  single poor box in tests D/F below. (They became about twice as
  fast after adding a small delay (1000 calls of
  clock_gettime() (*1)) just before epoll_wait so that it doesn't
  sleep (I suppose)...)

- Table size of test B is one tenth of the previous size, the
  same as one partition.

*1: The reason for choosing that function is that I first found that the
queries get way faster just by prefixing them with "explain analyze"..

> patched(ms)  unpatched(ms)  gain(%)
> A: simple table scan     :  3562.32  3444.81  -3.4
> B: local partitioning    :  1451.25  1604.38   9.5
> C: single remote table   :  8818.92  9297.76   5.1
> D: sharding (single con) :  5966.14  6646.73  10.2
> E: sharding (multi con)  :  1802.25  6515.49  72.3

fetch_size = 100
patched(ms)  unpatched(ms)  gain(%)
A: simple table scan     :   3033.48   2997.44   -1.2
B: local partitioning    :   1405.52   1426.66    1.5
C: single remote table   :   8335.50   8463.22    1.5
D: sharding (single con) :   6862.92   6820.97   -0.6
E: sharding (multi con)  :   2185.84   6733.63   67.5
F: partition (single con):   6818.13   6741.01   -1.1
G: partition (multi con) :   2150.58   6407.46   66.4

fetch_size = 200
patched(ms)  unpatched(ms)   gain(%)
A: simple table scan :   
B: local partitioning:   
C: single remote table   :   
D: sharding (single con) :   
E: sharding (multi con)  :   
F: partition (single con):   
G: partition (multi con) :   

fetch_size = 1000
patched(ms)  unpatched(ms)  gain(%)
A: simple table scan     :   3050.31   2980.29   -2.3
B: local partitioning    :   1401.34   1419.54    1.3
C: single remote table   :   8375.4    8445.27    0.8
D: sharding (single con) :   3935.97   4737.84   16.9
E: sharding (multi con)  :   1330.44   4752.87   72.0
F: partition (single con):   3997.63   4747.44   15.8
G: partition (multi con) :   1323.02   4807.72   72.5

Async append doesn't affect non-async path at all so B is
expected to get no degradation. It seems within error.

D and F are the gains when all foreign tables share one connection,
and E and G are the gains when every foreign table has a dedicated
connection.

I will repost after filling the blank portion of the tables and
complete regression of the patch next week. Sorry for the
incomplete post.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 7ad4210dd20b6672367255492e2b1d95cd90b122 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Mon, 22 May 2017 12:42:58 +0900
Subject: [PATCH 1/3] Allow wait event set to be registered to resource owner

WaitEventSet needs to be released using resource owner for a certain
case. This change adds WaitEventSet reowner and allow the creator of a
WaitEventSet to specify a resource owner.
---
 src/backend/libpq/pqcomm.c|  2 +-
 src/backend/storage/ipc/latch.c   | 18 ++-
 src/backend/storage/lmgr/condition_variable.c |  2 +-
 src/backend/utils/resowner/resowner.c | 67 +++
 src/include/storage/latch.h   |  4 +-
 src/include/utils/resowner_private.h  |  8 
 6 files changed, 96 insertions(+), 5 deletions(-)

diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index a4f6d4deeb..890972b9b8 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -220,7 +220,7 @@ pq_init(void)
 (errmsg("could not set socket to nonblocking mode: %m")));
 #endif
 
-	FeBeWaitSet = CreateWaitEventSet(TopMemoryContext, 3);
+	FeBeWaitSet = CreateWaitEventSet(TopMemoryContext, NULL, 3);
 	AddWaitEventToSet(FeBeWaitSet, WL_SOCKET_WRITEABLE, MyProcPort->sock,
 	  NULL, NULL);
 	

Re: postgres_fdw: Oddity in pushing down inherited UPDATE/DELETE joins to remote servers

2018-05-11 Thread Amit Langote
On 2018/05/11 16:12, Amit Langote wrote:
> Just to clarify, does this problem only arise because there is a pushed
> down join involving the child?  That is, does the problem only occur as of
> the following commit:
> 
> commit 1bc0100d270e5bcc980a0629b8726a32a497e788
> Author: Robert Haas 
> Date:   Wed Feb 7 15:34:30 2018 -0500
> 
> postgres_fdw: Push down UPDATE/DELETE joins to remote servers.
> 
> In other words, do we need to back-patch this up to 9.5 which added
> foreign table inheritance?

Oops, it should have been clear by the subject line that the problem
didn't exist before that commit.  Sorry.

Thanks,
Amit




Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.

2018-05-11 Thread Ashutosh Bapat
On Thu, May 10, 2018 at 6:23 PM, Etsuro Fujita
 wrote:
> (2018/05/10 13:04), Ashutosh Bapat wrote:
>>
>> On Wed, May 9, 2018 at 5:20 PM, Etsuro Fujita
>>   wrote:
>>>
>>> (2018/04/25 18:51), Ashutosh Bapat wrote:

 Actually I noticed that ConvertRowtypeExpr are used to cast a child's
 whole row reference expression (not just a Var node) into that of its
 parent and not. For example a cast like NULL::child::parent produces a
 ConvertRowtypeExpr encapsulating a NULL constant node and not a Var
 node. We are interested only in ConvertRowtypeExprs embedding Var
 nodes with Var::varattno = 0. I have changed this code to use function
 is_converted_whole_row_reference() instead of the above code with
 Assert. In order to use that function, I had to take it out of
 setrefs.c and put it in nodeFuncs.c which seemed to be a better fit.
>>>
>>>
>>> This change seems a bit confusing to me because the flag bits
>>> "PVC_INCLUDE_CONVERTROWTYPES" and "PVC_RECURSE_CONVERTROWTYPES" passed to
>>> pull_var_clause look as if it handles any ConvertRowtypeExpr nodes from a
>>> given clause, but really, it only handles ConvertRowtypeExprs you
>>> mentioned
>>> above.  To make that function easy to understand and use, I think it'd be
>>> better to use the IsA(node, ConvertRowtypeExpr) test as in the first
>>> version
>>> of the patch, instead of is_converted_whole_row_reference, which would be
>>> more consistent with other cases such as PlaceHolderVar.
>>
>>
>> I agree that using is_converted_whole_row_reference() is not
>> consistent with the other nodes that are handled by pull_var_clause().
>> I also agree that PVC_*_CONVERTROWTYPES doesn't reflect exactly what's
>> being done with those options. But using
>> is_converted_whole_row_reference() is the correct thing to do since we
>> are interested only in the whole-row references embedded in
>> ConvertRowtypeExpr. There can be anything encapsulated in
>> ConvertRowtypeExpr(), a non-shippable function for example. We don't
>> want to try to push that down in postgres_fdw's case. Neither in other
>> cases.
>
>
> Yeah, but I think the IsA test would be sufficient to make things work
> correctly because we can assume that there aren't any other nodes than Var,
> PHV, and CRE translating a wholerow value of a child table into a wholerow
> value of its parent table's rowtype in the expression clause to which we
> apply pull_var_clause with PVC_INCLUDE_CONVERTROWTYPES.

Not really.
Consider the following case using the same tables fprt1 and fprt2 in the
test sql/postgres_fdw.sql:
create function row_as_is(a ftprt2_p1) returns ftprt2_p1 as $$begin
return a; end;$$ language plpgsql;

With the change you propose i.e.

diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index f972712..088da14 100644
--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -646,8 +646,9 @@ pull_var_clause_walker(Node *node,
pull_var_clause_context *context)
else
elog(ERROR, "PlaceHolderVar found where not expected");
}
-   else if (is_converted_whole_row_reference(node))
+   else if (IsA(node, ConvertRowtypeExpr))
{
+   Assert(is_converted_whole_row_reference(node));
if (context->flags & PVC_INCLUDE_CONVERTROWTYPES)
{
context->varlist = lappend(context->varlist, node);


following query crashes at Assert(is_converted_whole_row_reference(node));

If we remove that assert, it would end up pushing down row_as_is(),
which is a local user defined function, to the foreign server. That's
not desirable since the foreign server may not have that user defined
function.
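The distinction being argued here — a ConvertRowtypeExpr only counts as a converted whole-row reference when the expression it wraps is (possibly via nested conversions) a whole-row Var, i.e. varattno == 0 — can be sketched over a toy AST. These dataclasses are illustrative stand-ins, not PostgreSQL's C node structs:

```python
from dataclasses import dataclass

@dataclass
class Var:
    varno: int
    varattno: int          # 0 means a whole-row reference

@dataclass
class FuncExpr:
    funcname: str          # e.g. a local user-defined function

@dataclass
class ConvertRowtypeExpr:
    arg: object            # may itself be a nested ConvertRowtypeExpr

def is_converted_whole_row_reference(node):
    # A bare IsA()-style type test would also match a CRE wrapping, say,
    # a local function call, which must not be pushed down.
    if not isinstance(node, ConvertRowtypeExpr):
        return False
    # Peel nested conversions (multi-level inheritance).
    while isinstance(node.arg, ConvertRowtypeExpr):
        node = node.arg
    return isinstance(node.arg, Var) and node.arg.varattno == 0
```

Under this model, `ConvertRowtypeExpr(FuncExpr("row_as_is"))` is rejected, which is exactly the case the plain IsA test would wrongly accept.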

>
> BTW, one thing I noticed about the new version of the patch is: there is yet
> another case where pull_var_clause produces an error.  Here is an example:
>
> postgres=# create table t1 (a int, b text);
> CREATE TABLE
> postgres=# create table t2 (a int, b text);
> CREATE TABLE
> postgres=# create foreign table ft1 (a int, b text) server loopback options
> (table_name 't1');
> CREATE FOREIGN TABLE
> postgres=# create foreign table ft2 (a int, b text) server loopback options
> (table_name 't2');
> CREATE FOREIGN TABLE
> postgres=# insert into ft1 values (1, 'foo');
> INSERT 0 1
> postgres=# insert into ft1 values (2, 'bar');
> INSERT 0 1
> postgres=# insert into ft2 values (1, 'test1');
> INSERT 0 1
> postgres=# insert into ft2 values (2, 'test2');
> INSERT 0 1
> postgres=# analyze ft1;
> ANALYZE
> postgres=# analyze ft2;
> ANALYZE
> postgres=# create table parent (a int, b text);
> CREATE TABLE
> postgres=# alter foreign table ft1 inherit parent;
> ALTER FOREIGN TABLE
> postgres=# explain verbose update parent set b = ft2.b from ft2 where
> parent.a = ft2.a returning parent;
> ERROR:  ConvertRowtypeExpr found where not expected
>
> To produce this, apply the patch in [1] as 

Re: postgres_fdw: Oddity in pushing down inherited UPDATE/DELETE joins to remote servers

2018-05-11 Thread Amit Langote
Fujita-san,

On 2018/05/10 21:41, Etsuro Fujita wrote:
> I think the reason for that is: in that case we try to find the target
> foreign-join RelOptInfo using find_join_rel in postgresPlanDirectModify,
> but can't find it, because the given root is the *parent* root and
> doesn't have join RelOptInfos with it.  To fix this, I'd like to propose
> to modify make_modifytable so that in case of an inherited
> UPDATE/DELETE, it passes to PlanDirectModify the per-child modified
> subroot, not the parent root, for the FDW to get the foreign-join
> RelOptInfo from the given root in PlanDirectModify.  I think that that
> would be more consistent with the non-inherited UPDATE/DELETE case in
> that the FDW can look at any join RelOptInfos in the given root in
> PlanDirectModify, which I think would be a good thing because some FDWs
> might need to do that for some reason.  For the same reason, I'd also
> like to propose to pass to PlanForeignModify the per-child modified
> subroot, not the parent root, as for PlanDirectModify.  Attached is a
> proposed patch for that.  I'll add this to the open items list for PG11.

So IIUC, we must pass the per-child PlannerInfo here, because that's what
would have been used for the planning join between the child (ft1 in your
example) and the other table (ft2 in your example).  So that's where the
RelOptInfo's corresponding to planning for the child, including that for
the pushed-down join, would be stored.

Just to clarify, does this problem only arise because there is a pushed
down join involving the child?  That is, does the problem only occur as of
the following commit:

commit 1bc0100d270e5bcc980a0629b8726a32a497e788
Author: Robert Haas 
Date:   Wed Feb 7 15:34:30 2018 -0500

postgres_fdw: Push down UPDATE/DELETE joins to remote servers.

In other words, do we need to back-patch this up to 9.5 which added
foreign table inheritance?

Anyway, the patch and tests it adds look good.

Thanks,
Amit




Re: Considering signal handling in plpython again

2018-05-11 Thread Hubert Zhang
Thanks Heikki and Robert for your comments.

I reviewed Heikki's patch and let's enhance it.


As Heikki mentioned, there is a problem when no Python code is being
executed. I tested it in the following cases:
"select pysleep();" and then type ctrl-c: query cancelled
successfully. (Patch works :))
"select * from a, b, c, d;" and then type ctrl-c: query cancelled
successfully.
"select pysleep();": it terminates immediately (without typing ctrl-c)
because the Py_AddPendingCall registered during the previous query is
still pending. (The problem Heikki described.)

To fix this problem, we need to make a query like "select * from a, b, c, d;"
not call Py_AddPendingCall.
There are at least 3 ways:
1. Add a flag in the hook function to indicate whether to call the hook
function at all.
In the current patch the hook function does two things: a. call
Py_AddPendingCall;
b. call the previous hook (prev_cancel_pending_hook).
So this method risks skipping the previous hook as a side effect. To
improve it, we would need to change the hook in postgres.c to a hook array,
and let each hook function do only one thing.
2. Add a flag in the hook function to indicate whether to call Py_AddPendingCall.
This is straightforward. (I prefer it.)
3. Delete the hook after it has been used. This relates to the lifecycle
of signal hooks. In the current patch the handler hook is registered
inside PLy_initialize(), which is called for
every plpython_call_handler().  I would prefer to register the hook just once,
in _PG_init(), for each extension. If we follow this way, deleting the hook
is not needed.
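Option 2 above — a permanently registered hook that chains to the previous one and queues the pending call only while the interpreter is active — can be modeled like this. The names and the string standing in for Py_AddPendingCall are illustrative, not the actual backend or CPython API:

```python
class CancelHookChain:
    def __init__(self, prev_hook=None):
        self.prev_hook = prev_hook
        self.interpreter_active = False   # option 2's guard flag
        self.pending = []                 # queued pending calls

    def fire(self):
        # Called whenever InterruptPending is set in the backend.
        if self.interpreter_active:
            # Stands in for Py_AddPendingCall(...): only queue the Python
            # exception while Python code is actually running.
            self.pending.append("SetInterrupt")
        if self.prev_hook:                # always chain to the previous hook
            self.prev_hook()
```

With the flag down, an interrupt during a plain SQL query chains straight through without leaving a stale pending call for the next plpython invocation.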

Any comments?



On Thu, May 10, 2018 at 10:50 PM, Heikki Linnakangas 
wrote:

> On 10/05/18 09:32, Hubert Zhang wrote:
>
>> Hi all,
>>
>> I want to support canceling for a plpython query which may be a busy loop.
>>
>> I found some discussions on pgsql-hackers 2 years ago. Below is the link.
>>
>> https://www.postgresql.org/message-id/CAFYwGJ3+Xg7EcL2nU-MxX
>> 6p+o6c895pm3myz-b+9n9dffeh...@mail.gmail.com
>>
>> Mario wrote a patch to fix this problem at that time
>> *https://github.com/CartoDB/postgres/commit/d587d8a6e4f035cc
>> 45e1d84fc46aa7c3ab0344c3
>> > 45e1d84fc46aa7c3ab0344c3>*
>> > 45e1d84fc46aa7c3ab0344c3>
>>
>> The main logic is to register a new signal handler for SIGINT/SIGTERM
>> and link the old signal handler in the chain.
>>
>> static int
>> PLy_python_interruption_handler(void *arg)
>> {
>> PyErr_SetString(PyExc_RuntimeError, "test except");
>> return -1;  /* Py_AddPendingCall callbacks return int; -1 = exception set */
>> }
>> static void
>> PLy_handle_interrupt(int sig)
>> {
>> // custom interruption
>> int added = Py_AddPendingCall(PLy_python_interruption_handler, NULL);
>> if (coreIntHandler) {
>> (*coreIntHandler)(sig);
>> }
>> }
>>
>> Does anyone have some comments on this patch?
>>
>
> PostgreSQL assumes to have control of all the signals. Although I don't
> foresee any changes in this area any time soon, there's no guarantee that
> overriding the SIGINT/SIGTERM will do what you want in the future. Also,
> catching SIGINT/SIGTERM still won't react to recovery conflict interrupts.
>
> In that old thread, I think the conclusion was that we should provide a
> hook in the backend for this, rather than override the signal handler
> directly. We could then call the hook whenever InterruptPending is set.
> No-one got around to write a patch to do that, though.
>
> As for me, I think handler function should call PyErr_SetInterrupt()
>> instead of PyErr_SetString(PyExc_RuntimeError, "test except");
>>
>
> Hmm. I tested that, but because the Python's default SIGINT handler is not
> installed, PyErr_SetInterrupt() will actually throw an error:
>
> TypeError: 'NoneType' object is not callable
>
> I came up with the attached patch, which is similar to Mario's, but it
> introduces a new "hook" for this.
>
> One little problem remains: The Py_AddPendingCall() call is made
> unconditionally in the signal handler, even if no Python code is currently
> being executed. The pending call is queued up until the next time you run a
> PL/Python function, which could be long after the original statement was
> canceled. We need some extra checks to only raise the Python exception, if
> the Python interpreter is currently active.
>
> - Heikki
>



-- 
Thanks

Hubert Zhang


Re: Needless additional partition check in INSERT?

2018-05-11 Thread Amit Langote
On 2018/05/11 15:27, Michael Paquier wrote:
> That's really up to the patch
> author at the end (I prefer matching with NULL, but usually it is better
> to comply with the surroundings for consistency).

Yeah.  I think in this case I'll have to withdraw my comment because most
places that check ri_PartitionRoot do *not* compare to NULL, so what's in
the David's patch is better left the way it is, if only for consistency.
Sorry about the noise.

Thanks,
Amit




Re: Should we add GUCs to allow partition pruning to be disabled?

2018-05-11 Thread Michael Paquier
On Fri, May 11, 2018 at 12:59:27PM +0900, Amit Langote wrote:
> On 2018/05/11 2:13, Robert Haas wrote:
>> On Thu, May 10, 2018 at 12:58 PM, Alvaro Herrera
>>  wrote:
>>> David G. Johnston wrote:
 As a user I don't really need to know which model is implemented and the
 name doesn't necessarily imply the implementation.  Pruning seems to be the
 commonly-used term for this feature and we should stick with that.
>>>
>>> I agree with this conclusion.  So we have it right and we shouldn't
>>> change it.
>> 
>> +1.
> 
> +1 from me too.

+1.
--
Michael




Re: Needless additional partition check in INSERT?

2018-05-11 Thread Amit Langote
On 2018/05/11 15:12, David Rowley wrote:
> Thanks for looking
> 
> On 11 May 2018 at 17:48, Amit Langote  wrote:
>> By the way,
>>
>> +!resultRelInfo->ri_PartitionRoot)
>>
>> This should be resultRelInfo->ri_PartitionRoot == NULL, because the above
>> gives an impression that ri_PartitionRoot is a Boolean.
> 
> If this is some new coding rule, then that's the first I've heard of it.

No, I don't know of any official coding rule either.

> Scanning over the result of git grep -E "if \(!\w{1,}\)" it looks like
> we have a bit of cleanup work to do before we can comply.
> 
> FWIW, I've previously been told off for the opposite.

OK, no problem if you would like to leave it the way it is, as it may just
be a matter of personal preference.  I just told you what I recall being
told a few times, and it was easy to spot in such a small patch.

Thanks,
Amit




Re: Needless additional partition check in INSERT?

2018-05-11 Thread Michael Paquier
On Fri, May 11, 2018 at 06:12:38PM +1200, David Rowley wrote:
> On 11 May 2018 at 17:48, Amit Langote  wrote:
>> By the way,
>>
>> +!resultRelInfo->ri_PartitionRoot)
>>
>> This should be resultRelInfo->ri_PartitionRoot == NULL, because the above
>> gives an impression that ri_PartitionRoot is a Boolean.
> 
> If this is some new coding rule, then that's the first I've heard of it.
> 
> Scanning over the result of git grep -E "if \(!\w{1,}\)" it looks like
> we have a bit of cleanup work to do before we can comply.
> 
> FWIW, I've previously been told off for the opposite.

NULL maps to 0 so that's not really worth worrying about.  A lot of code paths
use one way or the other for pointers.  That's really up to the patch
author at the end (I prefer matching with NULL, but usually it is better
to comply with the surroundings for consistency).
--
Michael




Re: Needless additional partition check in INSERT?

2018-05-11 Thread David Rowley
Thanks for looking

On 11 May 2018 at 17:48, Amit Langote  wrote:
> By the way,
>
> +!resultRelInfo->ri_PartitionRoot)
>
> This should be resultRelInfo->ri_PartitionRoot == NULL, because the above
> gives an impression that ri_PartitionRoot is a Boolean.

If this is some new coding rule, then that's the first I've heard of it.

Scanning over the result of git grep -E "if \(!\w{1,}\)" it looks like
we have a bit of cleanup work to do before we can comply.

FWIW, I've previously been told off for the opposite.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services