Re: PATCH: Configurable file mode mask

2018-01-30 Thread Michael Paquier
On Mon, Jan 29, 2018 at 04:29:08PM -0500, David Steele wrote:
> OK, that being the case I have piggy-backed on the current pg_upgrade
> tests in the same way that --wal-segsize did.
> 
> There are now three patches:
> 
> 1) 01-pgresetwal-test
> 
> Adds a *very* basic test suite for pg_resetwal.  I was able to make this
> utility core dump (floating point errors) pretty easily with empty or
> malformed pg_control files so I focused on WAL reset functionality plus
> the basic help/command tests that every utility has.

A little is better than nothing, so that's an improvement, thanks!

+# Remove WAL from pg_wal and make sure it gets rebuilt
+$node->stop;
+
+unlink("$pgwal/00010001") == 1
+   or die("unable to remove 00010001");
+
+is_deeply(
+   [sort(slurp_dir($pgwal))], [sort(qw(. .. archive_status))], 'no WAL');
This is_deeply test serves little purpose.  The segment gets removed
directly so I would just remove it.

Another test that could be easily included is with -o, then create a
table and check for its OID value.  Also for -x, then just use
txid_current to check the value consistency.  For -c, with
track_commit_timestamp set to on, you can set a couple of values and
then check for pg_last_committed_xact() which would be NULL if you use
an XID older than the minimum set for example.  All those are simple
enough so they could easily be added in the first version of this test
suite.
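
For the -x case, an untested sketch of what I have in mind (the XID value
here is only an example, kept low enough to stay within the pg_xact segment
that already exists after initdb):

  # Advance the next transaction ID with pg_resetwal -x, then check
  # that the server hands out XIDs from at least that point.
  $node->stop;
  command_ok([ 'pg_resetwal', '-x', '1000', $node->data_dir ],
      'pg_resetwal -x runs');
  $node->start;
  my $xid = $node->safe_psql('postgres', 'SELECT txid_current();');
  ok($xid >= 1000, 'next transaction ID was advanced');

The -o and -c cases could follow the same pattern, checking the OID of a
freshly-created table and pg_last_committed_xact() respectively.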

You need to update src/bin/pg_resetxlog/.gitignore to include
tmp_check/.

> 2) 02-mkdir
> 
> Converts mkdir() calls to MakeDirectoryDefaultPerm() if the original
> call used default permissions.

So the only call not converted to that API is in ipc.c...  OK, this one
is pretty simple so there is not much to say about it; the real meat comes
with patch 3.  I would see it with a good eye if this patch introduced
file_perm.h, which included both PG_FILE_MODE_DEFAULT and
PG_DIR_MODE_DEFAULT, and had the frontends use those values.  This
would make the switch to 0003 a bit easier if you look at pg_resetwal's
diffs, for example.

> 3) 03-group
> 
> Allow group access on PGDATA.  This includes backend changes, utility
> changes (initdb, pg_ctl, pg_upgrade, etc.) and tests for each utility.

+++ b/src/include/common/file_perm.h
+ *
+ * This module is located in common so the backend can use the constants.
+ *
This is the case of more or less all the content in src/common/, except
for the things which are frontend-only, so this comment could just be
removed.

+command_ok(
+   ['chmod', "-R", 'g+rX', "$pgdata"],
+   'add group perms to PGDATA');
That would blow up on Windows.  You should replace that by perl's chmod
and look at its return result for sanity checks, likely with ($cnt != 0).
And let's not use icacls please...
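
Something like this (untested, and the target modes here are just the
group-access ones this patch is introducing) should behave the same on
Windows:

  use File::Find;

  # perl's chmod works everywhere; recurse with File::Find and make
  # sure that at least one entry was actually changed.
  my $cnt = 0;
  find(
      sub { $cnt += chmod((-d $_) ? 0750 : 0640, $_); },
      $pgdata);
  ok($cnt > 0, 'add group perms to PGDATA');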

+if ($params{has_group_access})
+{
+push(@{$params{extra}}, '-g');
+}
No need to introduce a new parameter here, please use directly extra =>
[ '-g' ] in the routines calling PostgresNode::init.
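
In other words, just:

  # initdb the node with group access enabled
  $node->init(extra => [ '-g' ]);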

The efforts put in the tests are good.

Patch 0003 needs a very careful review...  We don't want to introduce any
kind of race condition here.  I am not familiar enough with the
proposed patch yet, but it is giving me some chills.

You may want to run pgindent once on your patch set.
--
Michael




Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2018-01-30 Thread Tatsuro Yamada

On 2018/01/24 1:08, Pavel Stehule wrote:



2018-01-22 23:15 GMT+01:00 Stephen Frost <sfr...@snowman.net>:

Pavel,

* Pavel Stehule (pavel.steh...@gmail.com) wrote:
> here is a GUC based patch for plancache controlling. Looks so this code is
> working.

This really could use a new thread, imv.  This thread is a year old and
about a completely different feature than what you've implemented here.


true, but now it is too late


> It is hard to create regress tests. Any ideas?

Honestly, my idea for this would be to add another option to EXPLAIN
which would make it spit out generic and custom plans, or something of
that sort.


I looked there, but it needs a cyclic dependency between CachedPlan and
PlannedStmt. It needs more code than this patch and the code will not be
nicer. Instead I tried to play with prepared statements.

Please review your patches before sending them and don't send in patches
which have random unrelated whitespace changes.

 > diff --git a/src/backend/utils/cache/plancache.c 
b/src/backend/utils/cache/plancache.c
 > index ad8a82f1e3..cc99cf6dcc 100644
 > --- a/src/backend/utils/cache/plancache.c
 > +++ b/src/backend/utils/cache/plancache.c
 > @@ -1031,6 +1033,12 @@ choose_custom_plan(CachedPlanSource *plansource, 
ParamListInfo boundParams)
 >       if (IsTransactionStmtPlan(plansource))
 >               return false;
 >
 > +     /* See if settings wants to force the decision */
 > +     if (plancache_mode & PLANCACHE_FORCE_GENERIC_PLAN)
 > +             return false;
 > +     if (plancache_mode & PLANCACHE_FORCE_CUSTOM_PLAN)
 > +             return true;

Not a big deal, but I probably would have just expanded the conditional
checks against cursor_options (just below) rather than adding
independent if() statements.


I don't think the proposed change is better - my opinion is not strong - and
the committer can change it easily


 > diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
 > index ae22185fbd..4ce275e39d 100644
 > --- a/src/backend/utils/misc/guc.c
 > +++ b/src/backend/utils/misc/guc.c
 > @@ -3916,6 +3923,16 @@ static struct config_enum ConfigureNamesEnum[] =
 >               NULL, NULL, NULL
 >       },
 >
 > +     {
 > +             {"plancache_mode", PGC_USERSET, QUERY_TUNING_OTHER,
 > +                     gettext_noop("Forces use of custom or generic 
plans."),
 > +                     gettext_noop("It can control query plan cache.")
 > +             },
 > +             &plancache_mode,
 > +             PLANCACHE_DEFAULT, plancache_mode_options,
 > +             NULL, NULL, NULL
 > +     },

That long description is shorter than the short description.  How about:

short: Controls the planner's selection of custom or generic plan.
long: Prepared queries have custom and generic plans and the planner
will attempt to choose which is better; this can be set to override
the default behavior.


changed - thank you for text


(either that, or just ditch the long description)

 > diff --git a/src/include/utils/plancache.h 
b/src/include/utils/plancache.h
 > index 87fab19f3c..962895cc1a 100644
 > --- a/src/include/utils/plancache.h
 > +++ b/src/include/utils/plancache.h
 > @@ -143,7 +143,6 @@ typedef struct CachedPlan
 >       MemoryContext context;          /* context containing this 
CachedPlan */
 >  } CachedPlan;
 >
 > -
 >  extern void InitPlanCache(void);
 >  extern void ResetPlanCache(void);
 >

Random unrelated whitespace change...


fixed


This is also missing documentation updates.


I wrote some basic docs, but a native speaker should write more about the
strategies used.


Setting to Waiting for Author, but with those changes I would think we
could mark it ready-for-committer, it seems pretty straight-forward to
me and there seems to be general agreement (now) that it's worthwhile to
have.

Thanks!


attached updated patch

Regards

Pavel


Stephen


Hi Pavel,

I tested your patch on a044378ce2f6268a996c8cce2b7bfb5d82b05c90.
I share my results to you.

 - Result of patch command

$ patch -p1 < guc-plancache_mode-180123.patch
patching file doc/src/sgml/config.sgml
Hunk #1 succeeded at 4400 (offset 13 lines).
patching file src/backend/utils/cache/plancache.c
patching file src/backend/utils/misc/guc.c
patching file src/include/utils/plancache.h
patching file src/test/regress/expected/plancache.out
patching file src/test/regress/sql/plancache.sql

 - Result of regression test

===
 All 186 tests passed.
===

Regards,
Tatsuro Yamada




Re: [HACKERS] generated columns

2018-01-30 Thread Michael Paquier
On Sat, Jan 27, 2018 at 05:05:14PM -0500, Peter Eisentraut wrote:
> This is expanded in the rewriter, so there is no context like that.
> This is exactly how views work, e.g.,
> 
> create table t1 (id int, length int);
> create view v1 as select id, length * 10 as length_in_nanometers
> from t1;
> insert into t1 values (1, 5);
> select * from v1;
> ERROR:  integer out of range
> 
> I think this is not a problem in practice.

Yeah, I tend to have the same opinion while doing a second pass on the
patch proposed on this thread.  You could add more context when using STORED
columns, but for VIRTUAL that does not make much sense as the handling
of values is close to views.  Materialized views are handled the same
way as well: you don't get any context when facing an overflow when
either creating the matview or refreshing it.
--
Michael




csv format for psql

2018-01-30 Thread Daniel Verite
 Hi,


This patch implements csv as an output format in psql
(\pset format csv). It's quite similar to the unaligned format,
except that it applies CSV quoting rules (obviously!) and that
it prints no footer and no title.
As with unaligned, a header with column names is output unless
tuples_only is on. It also supports the fieldsep/fieldsep_zero
and recordsep/recordsep_zero settings.
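
To illustrate, a session with the patch applied could look like this (the
query and the comma separator are just an example):

  test=> \pset format csv
  Output format is csv.
  test=> \pset fieldsep ','
  Field separator is ",".
  test=> SELECT 1 AS id, 'hello, world' AS msg;
  id,msg
  1,"hello, world"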

Most of the time, the need for CSV is covered by \copy or COPY with
the CSV option, but there are some cases where it would be more
practical to have it as an output format in psql.

* \copy does not interpolate psql variables and is a single-line
command, so making a query fit these constraints can be cumbersome.
It can be worked around by defining a temporary view and
\copy from that view, but that doesn't work in a read-only context
such as when connected to a standby.

* the server-side COPY TO STDOUT can also be used from psql,
typically with psql -c "COPY (query) TO STDOUT CSV" > file.csv,
but that does not allow extracting multiple result sets per script.
COPY is also more rigid than psql in the options to delimit
fields and records.

* copy with csv can't help for the output of meta-commands
such as \gx, \crosstabview, \l, \d ... whereas a CSV format within psql
does work with these.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 7ea7edc..ebb3d35 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -246,7 +246,7 @@ EOF
   
   
   Use separator as the
-  field separator for unaligned output. This is equivalent to
+  field separator for unaligned and csv outputs. This is equivalent to
   \pset fieldsep or \f.
   
   
@@ -376,7 +376,7 @@ EOF
   
   
   Use separator as the
-  record separator for unaligned output. This is equivalent to
+  record separator for unaligned and csv outputs. This is equivalent to
   \pset recordsep.
   
   
@@ -551,8 +551,8 @@ EOF
   --field-separator-zero
   
   
-  Set the field separator for unaligned output to a zero byte.  This is
-  equvalent to \pset fieldsep_zero.
+  Set the field separator for unaligned and csv outputs to a zero byte.
+  This is equivalent to \pset fieldsep_zero.
   
   
 
@@ -562,8 +562,9 @@ EOF
   --record-separator-zero
   
   
-  Set the record separator for unaligned output to a zero byte.  This is
-  useful for interfacing, for example, with xargs -0.
+  Set the record separator for unaligned and csv outputs to a zero byte.
+  This is useful for interfacing, for example, with
+  xargs -0.
   This is equivalent to \pset recordsep_zero.
   
   
@@ -1918,9 +1919,9 @@ Tue Oct 26 21:40:57 CEST 1999
 
 
 
-Sets the field separator for unaligned query output. The default
-is the vertical bar (|). It is equivalent to
-\pset fieldsep.
+Sets the field separator for unaligned and csv query outputs. The
+default is the vertical bar (|). It is equivalent
+to \pset fieldsep.
 
 
   
@@ -2527,8 +2528,8 @@ lo_import 152801
   fieldsep
   
   
-  Specifies the field separator to be used in unaligned output
-  format. That way one can create, for example, tab- or
+  Specifies the field separator to be used in unaligned and csv output
+  formats. That way one can create, for example, tab- or
   comma-separated output, which other programs might prefer. To
   set a tab as field separator, type \pset fieldsep
   '\t'. The default field separator is
@@ -2541,8 +2542,8 @@ lo_import 152801
   fieldsep_zero
   
   
-  Sets the field separator to use in unaligned output format to a zero
-  byte.
+  Sets the field separator to use in unaligned or csv output formats to
+  a zero byte.
   
   
   
@@ -2565,9 +2566,13 @@ lo_import 152801
   format
   
   
-  Sets the output format to one of unaligned,
-  aligned, wrapped,
-  html, asciidoc,
+  Sets the output format to one of
+  unaligned,
+  aligned,
+  csv,
+  wrapped,
+  html,
+  asciidoc,
   latex (uses tabular),
   latex-longtable, or
   troff-ms.
@@ -2582,6 +2587,12 @@ lo_import 152801
   format).
   
 
+  csv format is similar to
+  unaligned, except that column contents are
+  enclosed in double quotes and quoted when necessary according to the
+  rules of the CSV format, and that no title or footer are printed.
+  
+
   aligned format is the standard, 
human-readable,
   nicely formatte

Re: [HACKERS] datetime.h defines like PM conflict with external libraries

2018-01-30 Thread Michael Meskes
> Michael, is there any problem including datatype/* headers in ecpg
> that
> are frontend clean? I see no such usage so far, that's why I'm
> asking.

When the pgtypes library was created we tried to include only the bits
and pieces needed to not create unnecessary hassles, but if it compiles
cleanly I'm fine either way. I'm assuming you're talking about
including the files for compiling ecpg, not as externally visible
header files, right?

michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL



Re: [HACKERS][PATCH] Applying PMDK to WAL operations for persistent memory

2018-01-30 Thread Yoshimi Ichiyanagi

Fri, 19 Jan 2018 09:42:25 -0500, Robert Haas wrote:
>
>I think that you really need to include the checkpoints in the tests.
>I would suggest setting max_wal_size and/or checkpoint_timeout so that
>you reliably complete 2 checkpoints in a 30-minute test, and then do a
>comparison on that basis.

Experimental setup:
-
Server: HP ProLiant DL360 Gen9
CPU:Xeon E5-2667 v4 (3.20GHz); 2 processors(without HT)
DRAM:   DDR4-2400; 32 GiB/processor
(8GiB/socket x 4 sockets/processor) x 2 processors
NVDIMM: DDR4-2133; 32 GiB/processor
(node 0: 8GiB/socket x 2 sockets/processor,
 node 1: 8GiB/socket x 6 sockets/processor)
HDD:Seagate Constellation2 2.5inch SATA 3.0. 6Gb/s 1TB 7200rpm x 1
SATA-SSD: Crucial_CT500MX200SSD1 (SATA 3.2, SATA 6Gb/s)
OS:   Ubuntu 16.04, linux-4.12
DAX FS:   ext4
PMDK: master(at)Aug 30, 2017
PostgreSQL: master
Note: I bound the postgres processes to one NUMA node,
  and the benchmarks to the other NUMA node.
-

postgresql.conf
-
# - Settings -
wal_level = replica
fsync = on
synchronous_commit = on
wal_sync_method = pmem_drain/fdatasync/open_datasync
full_page_writes = on
wal_compression = off

# - Checkpoints -
checkpoint_timeout = 12min
max_wal_size = 20GB
min_wal_size = 20GB
-

Executed commands:

# numactl -N 1 pg_ctl start -D [PG_DIR] -l [LOG_FILE]
# numactl -N 0 pgbench -s 200 -i [DB_NAME]
# numactl -N 0 pgbench -c 32 -j 32 -T 1800 -r [DB_NAME] -M prepared


The results:

A) Applied the patches to PG src, and compiled PG with libpmem
B) Applied the patches to PG src, and compiled PG without libpmem
C) Original PG

The averages of running pgbench three times on *PMEM* are:
A)
wal_sync_method = pmem_drain  tps = 41660.42524
wal_sync_method = open_datasync   tps = 39913.49897
wal_sync_method = fdatasync   tps = 39900.83396

C)
wal_sync_method = open_datasync   tps = 40335.50178
wal_sync_method = fdatasync   tps = 40649.57772


The averages of running pgbench three times on *SATA-SSD* are:
B)
wal_sync_method = open_datasync   tps = 7224.07146
wal_sync_method = fdatasync   tps = 7222.19177

C)
wal_sync_method = open_datasync   tps = 7258.79093
wal_sync_method = fdatasync   tps = 7263.19878


>From the above results, it show that wal_sync_method=pmem_drain was
about faster than wal_sync_method=open_datasync/fdatasync.
When pgbench ran on SATA-SSD, wal_sync_method=fdatasync was as fast
as wal_sync_method=open_datasync.


>> Do you know any good WAL I/O intensive benchmarks? DBT2?
>
>pgbench is quite a WAL-intensive benchmark; it is much more
>write-heavy than what most systems experience in real life, at least
>in my experience.  Your comparison of DAX FS to DAX FS + PMDK is very
>interesting, but in real life the bandwidth of DAX FS is already so
>high -- and the latency so low -- that I think most real-world
>workloads won't gain very much.  At least, that is my impression based
>on internal testing EnterpriseDB did a few months back.  (Thanks to
>Mithun and Kuntal for that work.)

In the near future, many physical devices will send sensor data
(IoT deployments might consume tens of gigabits of network bandwidth).
The amount of data inserted into the DB will increase significantly.
I think that PMEM will be needed for use cases like IoT.



Thu, 25 Jan 2018 09:30:45 -0500, Robert Haas wrote:
>Well, some day persistent memory may be a common enough storage
>technology that such a change makes sense, but these days most people
>have either SSD or spinning disks, where the change would probably be
>a net negative.  It seems more like something we might think about
>changing in PG 20 or PG 30.
>

Oracle and Microsoft SQL Server already support PMEM [1][2].
I think it is not too early for PostgreSQL to support PMEM.

[1] http://dbheartbeat.blogspot.jp/2017/11/doag-2017-oracle-18c-dbim-oracle.htm
[2] 
https://www.snia.org/sites/default/files/PM-Summit/2018/presentations/06_PM_Summit_2018_Talpey-Final_Post-CORRECTED.pdf

-- 
Yoshimi Ichiyanagi




Regarding drop index

2018-01-30 Thread Abinaya Kajendiran
Hello all,

We are building an in-memory index extension for Postgres. For a DROP INDEX
query, does Postgres notify the extension through the index access methods?

Currently, we are using event triggers to capture DROP INDEX. If there is a
better way, please do suggest it.
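
For reference, here is roughly what we do today (a simplified sketch; the
function and trigger names are just examples):

  CREATE FUNCTION capture_drop_index() RETURNS event_trigger
  LANGUAGE plpgsql AS $$
  DECLARE
      obj record;
  BEGIN
      -- sql_drop triggers can inspect the objects that were dropped
      FOR obj IN SELECT * FROM pg_event_trigger_dropped_objects()
      LOOP
          IF obj.object_type = 'index' THEN
              RAISE NOTICE 'index dropped: %', obj.object_identity;
          END IF;
      END LOOP;
  END;
  $$;

  CREATE EVENT TRIGGER on_drop_index
      ON sql_drop
      EXECUTE PROCEDURE capture_drop_index();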

Thanks in advance.

Regards,
Abinaya K


Re: csv format for psql

2018-01-30 Thread Fabien COELHO


Hello Daniel,


This patch implements csv as an output format in psql
(\pset format csv).


Would you consider registering it in the next CF?

--
Fabien.



Re: non-bulk inserts and tuple routing

2018-01-30 Thread Etsuro Fujita

(2018/01/25 18:52), Amit Langote wrote:

On 2018/01/25 18:30, Etsuro Fujita wrote:

The patches apply cleanly and compile successfully, but make check fails
in an assert-enabled build.


Hmm, I can't seem to reproduce the failure with v4 patches I posted
earlier today.



Can you please post the errors you're seeing?


A quick debug showed that the failure was due to a segmentation fault 
caused by this change to ExecSetupPartitionTupleRouting (in patch 
v4-0003-During-tuple-routing-initialize-per-partition-obj):


-   boolis_update = false;

+   boolis_update;

I modified that patch to initialize is_update to false as before. 
With the modified version, make check passed successfully.


I'll review the patch in more detail!

Best regards,
Etsuro Fujita



Re: [HACKERS] PoC: full merge join on comparison clause

2018-01-30 Thread Alexander Kuzmenkov

On 29.01.2018 08:40, Ashutosh Bapat wrote:

  Maybe it's better to separate these two into separate patches so that
it's easy to review patches.
OK, I'll try doing this. For now, moving the patch entry to the next 
commitfest.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: User defined data types in Logical Replication

2018-01-30 Thread atorikoshi

Hi,

The discussion seems to be ongoing, but I became a reviewer of this issue
several days ago, so I've tested the latest patch,
'fix_slot_store_error_callback_v12.patch'.

I reproduced the ERROR below by inserting elog() into the INPUT function
of a CREATE TYPE, and confirmed that the above patch solves the problem.

>>  ERROR:  XX000: cache lookup failed for type X

I also ran make check-world and there was no error.

Is the only remaining topic memory leaks?


On 2018/01/09 17:22, Masahiko Sawada wrote:

>> I suppose it's not a common use pattern, but it'd be good
>> to avoid everlasting memleaks.
>
> I agree. Can we remove entry from hash table in the callbacks instead
> of setting InvalidOid when invalidate caches?


--
Atsushi Torikoshi
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: non-bulk inserts and tuple routing

2018-01-30 Thread Amit Langote
Fujita-san,

On 2018/01/30 18:22, Etsuro Fujita wrote:
> (2018/01/25 18:52), Amit Langote wrote:
>> On 2018/01/25 18:30, Etsuro Fujita wrote:
>>> The patches apply cleanly and compile successfully, but make check fails
>>> in an assert-enabled build.
>>
>> Hmm, I can't seem to reproduce the failure with v4 patches I posted
>> earlier today.
> 
>> Can you please post the errors you're seeing?
> 
> A quick debug showed that the failure was due to a segmentation fault
> caused by this change to ExecSetupPartitionTupleRouting (in patch
> v4-0003-During-tuple-routing-initialize-per-partition-obj):
> 
> -    bool    is_update = false;
> 
> +    bool    is_update;
> 
> I modified that patch to initialize the is_update to false as before. With
> the modified version, make check passed successfully.

Oops, my bad.

> I'll review the patch in more detail!

Thank you.  Will wait for your comments before sending a new version then.

Regards,
Amit




Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Simon Riggs
On 29 January 2018 at 22:31, Peter Geoghegan  wrote:

> I don't think that there should be any error, so I can't say.

>> I argued that was possible and desirable, but you argued it was not
>> and got everybody else to agree with you. I'm surprised to see you
>> change your mind on that.
>
> You're mistaken. Nothing I've said here is inconsistent with my
> previous remarks on how we deal with concurrency.

Please see here

https://www.postgresql.org/message-id/20171102191636.GA27644%40marmot

On 2 November 2017 at 19:16, Peter Geoghegan  wrote:
> Simon Riggs  wrote:
>>
>> So if I understand you correctly, in your view MERGE should just fail
>> with an ERROR if it runs concurrently with other DML?
>
>
> That's certainly my opinion on the matter. It seems like that might be
> the consensus, too.


You've changed your position, which is good, thanks. No problem at all.

The proposal you make here had already been discussed in detail by
Pavan and myself. My understanding of that discussion was that he
thinks it might be possible, but I had said we must stick to the
earlier agreement on how to proceed. I am willing to try to produce
fewer concurrent errors, since that was an important point from
earlier work.

My only issue now is to make sure this does not cause confusion and
ask if that changes the views of others.



> According to your documentation, "MERGE provides a single SQL
> statement that can conditionally INSERT, UPDATE or DELETE rows, a task
> that would otherwise require multiple procedural language statements".
> But you're introducing a behavior/error that would not occur in
> equivalent procedural client code consisting of an UPDATE followed by
> a (conditionally executed) INSERT statement when run in READ COMMITTED
> mode. You actually get exactly the concurrency issue that you cite as
> unacceptable in justifying your serialization error with such
> procedural code (when the UPDATE didn't affect any rows, only
> following EPQ walking the UPDATE chain from the snapshot-visible
> tuple, making the client code decide to do an INSERT on the basis of
> information "from the future").

"You're introducing a behavior/error"... No, I advocate nothing in the
patch, I am merely following the agreement made here:

https://www.postgresql.org/message-id/CA%2BTgmoYOyX4nyu9mbMdYTLzT9X-1RptxaTKSQfbSdpVGXgeAJQ%40mail.gmail.com

Robert, Stephen, may we attempt to implement option 4 as Peter now suggests?

> 4. Implement MERGE, while attempting to avoid concurrent ERRORs in
> cases where that is possible.

I will discuss in more detail at the Brussels Dev meeting and see if
we can achieve consensus on how to proceed.



v14 posted with changes requested by multiple people. Patch status:
Needs Review.

Current summary: 0 wrong answers; 2 ERRORs raised need better
handling; some open requests for change/enhancement.

I will open a wiki page to track open items by the end of the week.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: FOR EACH ROW triggers on partitioned tables

2018-01-30 Thread Amit Langote
On 2018/01/30 5:30, Peter Eisentraut wrote:
> On 1/23/18 17:10, Alvaro Herrera wrote:
>> The main question is this: when running the trigger function, it is
>> going to look as it is running in the context of the partition, not in
>> the context of the parent partitioned table (TG_RELNAME etc).  That
>> seems mildly ugly: some users may be expecting that the partitioning
>> stuff is invisible to the rest of the system, so if you have triggers on
>> a regular table and later on decide to partition that table, the
>> behavior of triggers will change, which is maybe unexpected.  Maybe this
>> is not really a problem, but I'm not sure and would like further
>> opinions.
> 
> One could go either way on this, but I think reporting the actual table
> partition is acceptable and preferable.

+1

> If you are writing a generic
> trigger function, maybe to dump out all columns, you want to know the
> physical table and its actual columns.  It's easy[citation needed] to
> get the partition root for a given table, if the trigger code needs
> that.  The other way around is not possible.

I guess you mean the root where a trigger originated, that is, the ancestor
table on which an inherited trigger was originally defined.  It is
possible for a trigger to be defined on an intermediate parent and not the
topmost root in a partition tree.

I see that the only parent-child relationship for triggers created
recursively is recorded in the form of a dependency.  I wonder why not add a
flag in, say, pg_trigger to indicate that a trigger may have been created
recursively.  With the patch committed for inherited indexes, I can see that
inheritance is explicitly recorded in pg_inherits because indexes are
relations, so in the indexes' case it's possible to get the parent in
which a given inherited index originated.

> Similarly, transition tables should be OK.  You only get the current
> partition to look at, of course.

+1

> The function name CloneRowTriggersOnPartition() confused me.  A more
> accurate phrasing might be CloneRowTriggersToPartition(), or maybe
> reword altogether.

CloneRowTriggers*For*Partition()?

Thanks,
Amit




Re: non-bulk inserts and tuple routing

2018-01-30 Thread Etsuro Fujita

(2018/01/30 18:39), Amit Langote wrote:

Will wait for your comments before sending a new version then.


Ok, I'll post my comments as soon as possible.

Best regards,
Etsuro Fujita



Re: [HACKERS] Subscription code improvements

2018-01-30 Thread Masahiko Sawada
On Fri, Jan 26, 2018 at 11:41 AM, Peter Eisentraut
 wrote:
> On 1/24/18 02:33, Masahiko Sawada wrote:
>> Thank you for notification. Since it seems to me that no one is
>> interested in this patch, it would be better to close out this patch.
>> This patch is a refactoring patch but subscription code seems to work
>> fine so far. If a problem appears around subscriptions, I might
>> propose it again.
>
> I have looked at the patches again.

Thank you for looking at this.

>  They seem generally reasonable, but
> I don't see quite why we want or need them.  More details would help
> review them.  Do they fix any bugs, or are they just rearranging code?
>

The 0002 patch rearranges the code. Currently SetSubscriptionRelState()
not only updates but also registers a record, controlled by the
update_only argument. This patch splits SetSubscriptionRelState() into
AddSubscriptionRelState() and UpdateSubscriptionRelstate(). Patches 0001,
0003 and 0004 are improvements: 0001 improves messaging during worker
startup, and 0004, which requires 0003, reduces the lock level for
DROP SUBSCRIPTION to RowExclusiveLock.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: Add RANGE with values and exclusions clauses to the Window Functions

2018-01-30 Thread Oliver Ford
On Tuesday, 30 January 2018, Tom Lane  wrote:

> Another thing I'm a little confused by is the precise API for the in_range
> support functions (the lack of any documentation for it doesn't help).
> I wonder why you chose to provide two support functions per datatype
> combination rather than one with an additional boolean argument.
> In fact, it almost seems like the "end" flag could already do the
> job, though I may be missing something.  As-is, it seems like this
> setup involves a lot of duplicate code and catalog entries ... what
> are we gaining from that?
>
> regards, tom lane
>

We could instead remove the "desc" functions and flip the values of both
"preceding" and "end" for a descending order. It just needs an extra bool
in the parsenode/plannode structs to send to nodeWindowAgg.

I used two functions because it seemed cleaner to me to get the Oid of the
function in the parser for both ordering types, so then nodeWindowAgg
doesn't have to know about sort order (doesn't have to have extra
conditionals for the two). But yes it does increase the catalog and code
size so is probably better removed.

I will send out v10 soon with the desc functions removed and the
EXCLUDE_NO_OTHERS define removed.


Re: [PATCH] pgbench - refactor some connection finish/null into common function

2018-01-30 Thread Fabien COELHO


Hello Doug,


This patch refactors all of the connection state PQfinish() and NULL’ing into a 
single function.
Excludes PQfinish() in doConnect().


My 0.02€:

The argument could be "PGconn **" instead of a "CState *"?
If so, it may be used in a few more places. What is your opinion?
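
That is, roughly (the function name here is only for illustration):

  /* Close the connection if open, and forget about it. */
  static void
  finishCon(PGconn **con)
  {
      if (*con != NULL)
      {
          PQfinish(*con);
          *con = NULL;
      }
  }

so the helper would work on any bare PGconn * the program holds, not only
on a CState.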

I'm fine with this kind of factorization which takes out a three-line 
pattern, but I'm wondering whether it would please committers.


--
Fabien.

Re: [HACKERS] path toward faster partition pruning

2018-01-30 Thread Kyotaro HORIGUCHI
Hello, let me make some comments.

At Tue, 30 Jan 2018 09:57:44 +0900, Amit Langote wrote in
<4a7dda08-b883-6e5e-b0bf-f5ce95584...@lab.ntt.co.jp>
> Hi Jesper.
> 
> On 2018/01/29 22:13, Jesper Pedersen wrote:
> > Hi Amit,
> > 
> > On 01/26/2018 04:17 AM, Amit Langote wrote:
> >> I made a few of those myself in the updated patches.
> >>
> >> Thanks a lot again for your work on this.
> >>
> > 
> > This needs a rebase.
> 
> AFAICS, v22 cleanly applies to HEAD (c12693d8f3 [1]), compiles, and make
> check passes.

Yes, it cleanly applies to HEAD and seems working.

0001 seems fine.

I have some random comments on 0002.

-extract_partition_key_clauses implicitly assumes that the
 commutator of an inequality operator is the same as the original
 operator (commutation for such operators is an identity
 function).

 I believe this is always true for a sane operator definition, but
 wouldn't it be better to use the commutator, if any, instead of
 opclause->opno as the source of the negator? (Attached diff 1)

 
-get_partitions_from_or_clause_args throws away arg_partset and sets
 all bits when the partition constraint doesn't refute the whole
 partition. get_partitions_from_clauses eventually does the same
 thing, but it's a waste of cycles and looks weird. It seems it was
 intended to return immediately there.

  >   /* Couldn't eliminate any of the partitions. */
  >   partdesc = RelationGetPartitionDesc(context->relation);
  > - arg_partset = bms_add_range(NULL, 0, partdesc->nparts - 1);
  > + return bms_add_range(NULL, 0, partdesc->nparts - 1);
  > }
  >  
  > subcontext.rt_index = context->rt_index;
  > subcontext.relation = context->relation;
  > subcontext.clauseinfo = &partclauseinfo;
 !> arg_partset = get_partitions_from_clauses(&subcontext);

-get_partitions_from_or_clause_args converts IN (null) into a
 nulltest, and the nulltest doesn't exclude a child whose partition
 key column can be null.

 drop table if exists p;
 create table p (a int, b int) partition by list (a);
 create table c1 partition of p for values in (1, 5, 7);
 create table c2 partition of p for values in (4, 6, null);
 insert into p values (1, 0), (null, 0);
 
 explain select tableoid::regclass, * from p where a in (1, null);
 >QUERY PLAN
 > -
 >  Result  (cost=0.00..76.72 rows=22 width=12)
 >->  Append  (cost=0.00..76.50 rows=22 width=12)
 >  ->  Seq Scan on c1  (cost=0.00..38.25 rows=11 width=12)
 >Filter: (a = ANY ('{1,NULL}'::integer[]))
 >  ->  Seq Scan on c2  (cost=0.00..38.25 rows=11 width=12)
 >Filter: (a = ANY ('{1,NULL}'::integer[]))

 Although the clause "a in (null)" doesn't match the (null, 0)
 row, so it does no harm in the end, I don't think this is the right
 behavior. A null in an SAOP rightop should just be ignored during
 partition pruning. Or is there some purpose to this behavior?


- In extract_bounding_datums, clauseinfo is set twice to the same
  value.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index ab17524..a2488ab 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -2111,7 +2111,6 @@ extract_partition_key_clauses(PartitionKey partkey, List *clauses,
 			PartClause *pc;
 			Oid			partopfamily = partkey->partopfamily[i];
 			Oid			partcoll = partkey->partcollation[i];
-			Oid			commutator = InvalidOid;
 			AttrNumber	partattno = partkey->partattrs[i];
 			Expr	   *partexpr = NULL;
 
@@ -2144,6 +2143,7 @@ extract_partition_key_clauses(PartitionKey partkey, List *clauses,
 			if (IsA(clause, OpExpr))
 			{
 OpExpr	   *opclause = (OpExpr *) clause;
+Oid			comparator = opclause->opno;
 Expr	   *leftop,
 		   *rightop,
 		   *valueexpr;
@@ -2161,13 +2161,14 @@ extract_partition_key_clauses(PartitionKey partkey, List *clauses,
 	valueexpr = rightop;
 else if (EXPR_MATCHES_PARTKEY(rightop, partattno, partexpr))
 {
-	valueexpr = leftop;
-
-	commutator = get_commutator(opclause->opno);
+	Oid commutator = get_commutator(opclause->opno);
 
 	/* nothing we can do unless we can swap the operands */
 	if (!OidIsValid(commutator))
 		continue;
+
+	valueexpr = leftop;
+	comparator = commutator;
 }
 else
 	/* Clause does not match this partition key. */
@@ -2212,7 +2213,7 @@ extract_partition_key_clauses(PartitionKey partkey, List *clauses,
 	 * equality operator *and* this is a list partitioned
 	 * table, we can use it prune partitions.
 	 */
-	negator = get_negator(opclause->opno);
+	negator = get_negator(comparator);
 	if (OidIsValid(negator) &&
 		op_in_opfamily(negator, partopfamily))
 	{
@@ -2236,7 +2237,7 @@ extract_partition_key_clauses(PartitionKey partkey, List *clauses,
 }
 
 pc = (PartClause 

Re: [HACKERS] path toward faster partition pruning

2018-01-30 Thread Jesper Pedersen

Hi Amit,

On 01/29/2018 07:57 PM, Amit Langote wrote:

This needs a rebase.


AFAICS, v22 cleanly applies to HEAD (c12693d8f3 [1]), compiles, and make
check passes.



It was a rebase error; I should have checked against a clean master.

Sorry for the noise.

Best regards,
 Jesper




Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2018-01-30 Thread Pavel Stehule
2018-01-30 9:06 GMT+01:00 Tatsuro Yamada :

> On 2018/01/24 1:08, Pavel Stehule wrote:
>
>>
>>
>> 2018-01-22 23:15 GMT+01:00 Stephen Frost > sfr...@snowman.net>>:
>>
>> Pavel,
>>
>>
>> * Pavel Stehule (pavel.steh...@gmail.com > pavel.steh...@gmail.com>) wrote:
>> > here is a GUC based patch for plancache controlling. Looks so this
>> code is
>> > working.
>>
>> This really could use a new thread, imv.  This thread is a year old
>> and
>> about a completely different feature than what you've implemented
>> here.
>>
>>
>> true, but now it is too late
>>
>>
>> > It is hard to create regress tests. Any ideas?
>>
>> Honestly, my idea for this would be to add another option to EXPLAIN
>> which would make it spit out generic and custom plans, or something of
>> that sort.
>>
>>
>> I looked there, but it needs cycle dependency between CachedPlan and
>> PlannedStmt. It needs more code than this patch and code will not be nicer.
>> I try to do some game with prepared statements
>>
>> Please review your patches before sending them and don't send in
>> patches
>> which have random unrelated whitespace changes.
>>
>>  > diff --git a/src/backend/utils/cache/plancache.c
>> b/src/backend/utils/cache/plancache.c
>>  > index ad8a82f1e3..cc99cf6dcc 100644
>>  > --- a/src/backend/utils/cache/plancache.c
>>  > +++ b/src/backend/utils/cache/plancache.c
>>  > @@ -1031,6 +1033,12 @@ choose_custom_plan(CachedPlanSource
>> *plansource, ParamListInfo boundParams)
>>  >   if (IsTransactionStmtPlan(plansource))
>>  >   return false;
>>  >
>>  > + /* See if settings wants to force the decision */
>>  > + if (plancache_mode & PLANCACHE_FORCE_GENERIC_PLAN)
>>  > + return false;
>>  > + if (plancache_mode & PLANCACHE_FORCE_CUSTOM_PLAN)
>>  > + return true;
>>
>> Not a big deal, but I probably would have just expanded the
>> conditional
>> checks against cursor_options (just below) rather than adding
>> independent if() statements.
>>
>>
>> I don't think so proposed change is better - My opinion is not strong -
>> and commiter can change it simply
>>
>>
>>  > diff --git a/src/backend/utils/misc/guc.c
>> b/src/backend/utils/misc/guc.c
>>  > index ae22185fbd..4ce275e39d 100644
>>  > --- a/src/backend/utils/misc/guc.c
>>  > +++ b/src/backend/utils/misc/guc.c
>>  > @@ -3916,6 +3923,16 @@ static struct config_enum
>> ConfigureNamesEnum[] =
>>  >   NULL, NULL, NULL
>>  >   },
>>  >
>>  > + {
>>  > + {"plancache_mode", PGC_USERSET, QUERY_TUNING_OTHER,
>>  > + gettext_noop("Forces use of custom or
>> generic plans."),
>>  > + gettext_noop("It can control query plan
>> cache.")
>>  > + },
>>  > + &plancache_mode,
>>  > + PLANCACHE_DEFAULT, plancache_mode_options,
>>  > + NULL, NULL, NULL
>>  > + },
>>
>> That long description is shorter than the short description.  How
>> about:
>>
>> short: Controls the planner's selection of custom or generic plan.
>> long: Prepared queries have custom and generic plans and the planner
>> will attempt to choose which is better; this can be set to override
>> the default behavior.
>>
>>
>> changed - thank you for text
>>
>>
>> (either that, or just ditch the long description)
>>
>>  > diff --git a/src/include/utils/plancache.h
>> b/src/include/utils/plancache.h
>>  > index 87fab19f3c..962895cc1a 100644
>>  > --- a/src/include/utils/plancache.h
>>  > +++ b/src/include/utils/plancache.h
>>  > @@ -143,7 +143,6 @@ typedef struct CachedPlan
>>  >   MemoryContext context;  /* context containing this
>> CachedPlan */
>>  >  } CachedPlan;
>>  >
>>  > -
>>  >  extern void InitPlanCache(void);
>>  >  extern void ResetPlanCache(void);
>>  >
>>
>> Random unrelated whitespace change...
>>
>>
>> fixed
>>
>>
>> This is also missing documentation updates.
>>
>>
>> I wrote some basic doc, but native speaker should to write more words
>> about used strategies.
>>
>>
>> Setting to Waiting for Author, but with those changes I would think we
>> could mark it ready-for-committer, it seems pretty straight-forward to
>> me and there seems to be general agreement (now) that it's worthwhile
>> to
>> have.
>>
>> Thanks!
>>
>>
>> attached updated patch
>>
>> Regards
>>
>> Pavel
>>
>>
>> Stephen
>>
>
> Hi Pavel,
>
> I tested your patch on a044378ce2f6268a996c8cce2b7bfb5d82b05c90.
> I share my results to you.
>
>  - Result of patch command
>
> $ patch -p1 < guc-plancache_mode-180123.patch
> patching file doc/src/sgml/config.sgml
> Hunk #1 succeeded at 4400 (offset 13 lines).
> patching file src/backend/utils/cache/plancache.c
> patching file src/backend/utils/m

Re: csv format for psql

2018-01-30 Thread Pavel Stehule
2018-01-30 9:31 GMT+01:00 Daniel Verite :

>  Hi,
>
>
> This patch implements csv as an output format in psql
> (\pset format csv). It's quite similar to the unaligned format,
> except that it applies CSV quoting rules (obviously!) and that
> it prints no footer and no title.
> As with unaligned, a header with column names is output unless
> tuples_only is on. It also supports the fieldsep/fielsep_zero
> and recordsep/recordsep_zero settings.
>
> Most of times, the need for CSV is covered by \copy or COPY with
> the CSV option, but there are some cases where it would be more
> practical to have it as an output format in psql.
>

I absolutely agree


> * \copy does not interpolate psql variables and is a single-line
> command, so making a query fit these contraints can be cumbersome.
> It can be got around by defining a temporary view and
> \copy from that view, but that doesn't work in a read-only context
> such as when connected to a standby.
>
> * the server-side COPY TO STDOUT can also be used from psql,
> typically with psql -c "COPY (query) TO STDOUT CSV" > file.csv,
> but that's too simple to extract multiple result sets per script.
> COPY is also more rigid than psql in the options to delimit
> fields and records.
>
> * copy with csv can't help for the output of meta-commands
> such as \gx, \crosstabview, \l, \d ... whereas a CSV format within psql
> does work with these.
>

It is great - I have missed this feature for a long time - it is useful for
scripting, ETL, ...

This format is important enough that a dedicated short or long option would
be practical (it would be printed in the help) -

something like --csv

I found one issue - PostgreSQL's default field separator is "|". Maybe it is
a good time to use the more common ","?

Or, when the field separator was not explicitly defined, use "," for CSV and
"|" for the other formats, although that could be a little bit messy.

Thank you

Pavel


>
>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>


Re: [HACKERS][PATCH] Applying PMDK to WAL operations for persistent memory

2018-01-30 Thread Robert Haas
On Tue, Jan 30, 2018 at 3:37 AM, Yoshimi Ichiyanagi
 wrote:
> Oracle and Microsoft SQL Server suported PMEM [1][2].
> I think it is not too early for PostgreSQL to support PMEM.

I agree; it's good to have the option available for those who have
access to the hardware.

If you haven't added your patch to the next CommitFest, please do so.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Robert Haas
On Mon, Jan 29, 2018 at 10:32 AM, Simon Riggs  wrote:
> The code is about 1200 lines and has extensive docs, comments and tests.
>
> There are no contentious infrastructure changes, so the debate around
> concurrency is probably the main one. So it looks to me like
> meaningful review has taken place, though I know Andrew and Pavan have
> also looked at it in detail.

Only design-level review, not detailed review of the code.  To be
clear, I think the design-level review was quite productive and I'm
glad it happened, but it's not a substitute for someone going over the
code in detail to look for problems.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Robert Haas
On Mon, Jan 29, 2018 at 7:15 PM, Michael Paquier
 wrote:
> On Mon, Jan 29, 2018 at 04:34:48PM +, Simon Riggs wrote:
>> In terms of timing of commits, I have marked the patch Ready For
>> Committer. To me that signifies that it is ready for review by a
>> Committer prior to commit.
>
> My understanding of this meaning is different than yours.  It should not
> be the author's role to mark his own patch as ready for committer, but
> the role of one or more people who have reviewed in-depth the proposed
> patch and feature concepts.  If you can get a committer-level individual
> to review your patch, then good for you.  But review basics need to
> happen first.  And based on my rough lookup of this thread this has not
> happened yet.  Other people on this thread are pointing out that as
> well.

+1 to all of that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Robert Haas
On Mon, Jan 29, 2018 at 12:03 PM, Simon Riggs  wrote:
> Partitioning doesn't look too bad, so that looks comfortable for PG11,
> assuming it doesn't hit some unhandled complexity.
>
> Including RLS in the first commit/release turns this into a high risk
> patch. Few people use it, but if they do, they don't want me putting a
> hole in their battleship (literally) should we discover some weird
> unhandled logic in a complex new command.
>
> My recommendation would be to support that later for those that use
> it. For those that don't, it doesn't matter so can also be done later.

-1.  Every other feature we've added recently, including partitioning,
has had to decide what to do about RLS before the initial commit, and
this feature shouldn't be exempt.  In general, newer features need to
work with older features unless there is some extremely good
architectural reason why that is unreasonably difficult.  If that is
the case here, I don't see that you've made an argument for it.  The
proper way to avoid having you put a hole in their battleship is good
code, proper code review, and good testing, not leaving that case off
to one side.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Robert Haas
On Tue, Jan 30, 2018 at 4:45 AM, Simon Riggs  wrote:
> "You're introducing a behavior/error"... No, I advocate nothing in the
> patch, I am merely following the agreement made here:
>
> https://www.postgresql.org/message-id/CA%2BTgmoYOyX4nyu9mbMdYTLzT9X-1RptxaTKSQfbSdpVGXgeAJQ%40mail.gmail.com
>
> Robert, Stephen, may we attempt to implement option 4 as Peter now suggests?

Simon, I don't think you should represent Peter as having changed his
position when he has already said that he didn't.  To be honest, the
discussion up to this point suggests to me that Peter has a
significantly better grip of the issues than you do, yet your posts
convey, at least to me, that you're quite sure he's wrong about a lot
of stuff, including whether or not his current statements are
compatible with his previous statements.  I think that the appropriate
course of action is for you and he to spend a few more emails trying
to actually get on the same page here.

As far as I am able to understand, the substantive issue here is what
to do when we match an invisible tuple rather than a visible tuple.
The patch currently throws a serialization error on the basis that you
(Simon) thought that's what was previously agreed.  Peter is arguing
that we don't normally issue a serialization error at READ COMMITTED
(which I think is true) and proposed that we instead try to INSERT.  I
don't necessarily think that's different from consensus to implement
option #3 from 
https://www.postgresql.org/message-id/CA%2BTgmoYOyX4nyu9mbMdYTLzT9X-1RptxaTKSQfbSdpVGXgeAJQ%40mail.gmail.com
because that point #3 says that we're not going to try to AVOID errors
under concurrency, not that we're going to create NEW errors.  In
other words, I understand Peter, then and now, to be saying that MERGE
should behave just as if invisible tuples didn't exist at all; if that
leads some other part of the system to throw an ERROR, then that's
what happens.  Presumably, in a case like this, that would be a common
outcome, because the merge would be performed on the basis of a unique
key and so inserting would trigger a duplicate key violation.  But
maybe not, because I don't think MERGE requires there to be a unique
key on that column, so maybe the insert would just work, or maybe the
conflicting transaction would abort just in time to let it work
anyway.

It is possible that I am confused, here, among other reasons because I
haven't read the code, but if I'm not confused then Peter's analysis
is correct.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Add RANGE with values and exclusions clauses to the Window Functions

2018-01-30 Thread Erik Rijkers

On 2018-01-30 17:08, Oliver Ford wrote:

On Tue, Jan 30, 2018 at 10:48 AM, Oliver Ford  wrote:


I will send out v10 soon with the desc functions removed and the
EXCLUDE_NO_OTHERS define removed.


Here it is. Exclude No Others is still in the parser, but does
nothing. All desc functions are removed, replaced with a sortByAsc
bool. It no longer changes catversion.


There must be a small difference here but I don't even see it...

Sorry to be bothering you with these tiny things :)

thanks,

Erik Rijkers




Re: Add RANGE with values and exclusions clauses to the Window Functions

2018-01-30 Thread Erik Rijkers

Once more trying to attach the regression.diffs


On 2018-01-30 17:31, Erik Rijkers wrote:

On 2018-01-30 17:08, Oliver Ford wrote:
On Tue, Jan 30, 2018 at 10:48 AM, Oliver Ford  
wrote:


I will send out v10 soon with the desc functions removed and the
EXCLUDE_NO_OTHERS define removed.


Here it is. Exclude No Others is still in the parser, but does
nothing. All desc functions are removed, replaced with a sortByAsc
bool. It no longer changes catversion.


There must be a small difference here but I don't even see it...

Sorry to be bothering you with these tiny things :)

thanks,

Erik Rijkers
*** /var/data1/pg_stuff/pg_sandbox/pgsql.frame_range/src/test/regress/expected/window.out	2018-01-30 17:13:41.463633724 +0100
--- /var/data1/pg_stuff/pg_sandbox/pgsql.frame_range/src/test/regress/results/window.out	2018-01-30 17:16:59.607871830 +0100
***
*** 1172,1178 
  (10 rows)
  
  SELECT pg_get_viewdef('v_window');
!  pg_get_viewdef
  ---
SELECT i.i, +
   sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows+
--- 1172,1178 
  (10 rows)
  
  SELECT pg_get_viewdef('v_window');
! pg_get_viewdef 
  ---
SELECT i.i, +
   sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows+
***
*** 1198,1204 
  (10 rows)
  
  SELECT pg_get_viewdef('v_window');
!   pg_get_viewdef 
  -
SELECT i.i,   +
   sum(i.i) OVER (ORDER BY i.i GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows+
--- 1198,1204 
  (10 rows)
  
  SELECT pg_get_viewdef('v_window');
!  pg_get_viewdef  
  -
SELECT i.i,   +
   sum(i.i) OVER (ORDER BY i.i GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows+

==



WINDOW RANGE patch versus leakproofness

2018-01-30 Thread Tom Lane
I am wondering whether we need to worry about leakproofness in connection
with adding in_range support functions to btree opclasses.  My initial
conclusion is not, but let me lay out the reasoning so people can check
it --- I might be missing something.

The first question is whether we need to encourage in_range functions
to be leakproof.  In the submitted patch, they mostly are not, because
they will happily blurt out the value of the "offset" argument if it's
negative, eg

if (offset < 0)
ereport(ERROR,
(errcode(ERRCODE_WINDOWING_ERROR),
 errmsg("RANGE offsets cannot be negative. invalid value %d", 
offset)));

(I'm not concerned here about whether this message text meets our style
guidelines, only about whether it should include the value of "offset".)
I think we could drop printing the value without a great deal of loss of
user-friendliness, although it is probably worth something.  But there
are other cases that would be harder to prevent, for example an in_range
function might rely on a subtraction function that will complain on
overflow.  So on the whole it'd be nicer if there were not a reason to
be concerned about leakiness of these functions.

The second question is whether there's even any need to worry about
leakproofness of support functions for window functions.  Window function
calls are only allowed in a SELECT's target list and ORDER BY, not in
anything that could get pushed underneath a qual, so my first thought is
that we do not actually care.  If we do care, there are weaknesses in this
area already, because contain_leaked_vars doesn't consider WindowFuncs at
all, let alone dig into the other support functions (such as equality)
that a WindowFunc node will use.  But ISTM that's not an oversight, just
not bothering to write code that would be dead.

So I'm thinking that (a) we do not need to check for leaky functions used
in window support, and (b) therefore there's no need to avoid leaky
behavior in in_range support functions.  Objections?

regards, tom lane



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Simon Riggs
On 30 January 2018 at 16:27, Robert Haas  wrote:

> As far as I am able to understand, the substantive issue here is what
> to do when we match an invisible tuple rather than a visible tuple.
> The patch currently throws a serialization error on the basis that you
> (Simon) thought that's what was previously agreed.

Correct. We discussed this and agreed on point (3) below:

> 3. Implement MERGE, but without attempting to avoid concurrent ERRORs (Peter)
>
> 4. Implement MERGE, while attempting to avoid concurrent ERRORs in
> cases where that is possible.

Acting in good faith, in respect of all of your wishes, I implemented
exactly that and not what I had personally argued in favour of.

> Peter is arguing
> that we don't normally issue a serialization error at READ COMMITTED
> (which I think is true) and proposed that we instead try to INSERT.

Which IMHO is case 4 since it would avoid a concurrent ERROR. This
meets exactly my original implementation goals as clearly stated on
this thread, so of course I agree with him and have already said I am
happy to change the code, though I am still wary of the dangers he
noted upthread.

If you now agree with doing that and are happy that there are no
dangers, then I'm happy we now have consensus again and we can
continue implementing MERGE for PG11.

This is a good outcome, thanks, our users will be happy.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Temporary tables prevent autovacuum, leading to XID wraparound

2018-01-30 Thread Robert Haas
On Mon, Jan 29, 2018 at 1:33 AM, Tsunakawa, Takayuki
 wrote:
>> Since the above idea would be complex a bit, as an
>> alternated idea it might be better to specify the number of worker to launch
>> per database by a new GUC parameter or something. If the number of worker
>> running on the database exceeds that limit, the launcher doesn't select
>> the database even if the database is about to wraparound.
>
> I'm afraid the GUC would be difficult for the user to understand and tune.

I agree.  It's autovacuum's job to do the correct thing.  If it's not,
then we need figure out how to make it do the right thing.  Adding a
GUC seems like saying we don't know what the right thing to do is but
we hope the user does know.  That's not a good answer.

Unfortunately, I think a full solution to the problem of allocating AV
workers to avoid wraparound is quite complex.  Suppose that database A
will force a shutdown due to impending wraparound in 4 hours and
database B will force a shutdown in 12 hours.  On first blush, it
seems that we should favor adding workers to A.  But it might be that
database A needs only 2 hours of vacuuming to avoid a shutdown whereas
B needs 12 hours.  In that case, it seems that we ought to instead
favor adding workers to B.   However, it might be the case that A has
more tables coming due for wraparound 6 hours from now, and we need
another 15 hours of vacuuming to avoid that shutdown.  That would
favor adding workers to A.  Then again, it might be that A and B
already both have workers, and that adding yet another worker to A
won't speed anything up (because only large tables remain to be
processed and each has a worker already), whereas adding a worker to B
would speed things up (because it still has a big table that we could
immediately start to vacuum for wraparound).  In that case, perhaps we
ought to instead add a worker to B.  But, thinking some more, it seems
like that should cause autovacuum_vacuum_cost_limit to be reallocated
among the workers, making the existing vacuum processes take longer,
which might actually make a bad situation worse.  It seems possible
that the right answer could be to start no new autovacuum worker at
all.

Given all of the foregoing this seems like a very hard problem.  I
can't even articulate a clear set of rules for what our priorities
should be, and it seems that such rules would depend on the rate at
which we're consuming XIDs, how close we are in each database to a
wraparound shutdown, what tables exist in each database, how big the
not-all-frozen part of each one is, how big their indexes are, how
much they're holding back relfrozenxid, and which ones already have
workers, among other things.  I think it's quite possible that we can
come up with something that's better than what we have now without
embarking on a huge project, but it's not going to be anywhere near
perfect because this is really complicated, and there's a real risk
that we'll just make some cases better and others worse rather than
actually coming out ahead overall.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Robert Haas
On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund  wrote:
> It's an optional dependency, and it doesn't increase build time that
> much... If we were to move the llvm interfacing code to a .so, there'd
> not even be a packaging issue, you can just package that .so separately
> and get errors if somebody tries to enable LLVM without that .so being
> installed.

I suspect that would be really valuable.  If 'yum install
postgresql-server' (or your favorite equivalent) sucks down all of
LLVM, some people are going to complain, either because they are
trying to build little tiny machine images or because they are subject
to policies which preclude the presence of a compiler on a production
server.  If you can do 'yum install postgresql-server' without
additional dependencies and 'yum install postgresql-server-jit' to
make it go faster, that issue is solved.

Unfortunately, that has the pretty significant downside that a lot of
people who actually want the postgresql-server-jit package will not
realize that they need to install it, which sucks.  But I think it
might still be the better way to go.  Anyway, it's for individual
packagers to cope with that problem; as far as the patch goes, +1 for
structuring things in a way which gives packagers the option to divide
it up that way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Robert Haas
On Wed, Jan 24, 2018 at 2:20 AM, Andres Freund  wrote:
> == Error handling ==
>
> There's two aspects to error handling.
>
> Firstly, generated (LLVM IR) and emitted functions (mmap()ed segments)
> need to be cleaned up both after a successful query execution and after
> an error.  I've settled on a fairly boring resowner based mechanism. On
> errors all expressions owned by a resowner are released, upon success
> expressions are reassigned to the parent / released on commit (unless
> executor shutdown has cleaned them up of course).

Cool.

> A second, less pretty and newly developed, aspect of error handling is
> OOM handling inside LLVM itself. The above resowner based mechanism
> takes care of cleaning up emitted code upon ERROR, but there's also the
> chance that LLVM itself runs out of memory. LLVM by default does *not*
> use any C++ exceptions. Its allocations are primarily funneled through
> the standard "new" handlers, and some direct use of malloc() and
> mmap(). For the former a 'new handler' exists
> http://en.cppreference.com/w/cpp/memory/new/set_new_handler for the
> latter LLVM provides callbacks that get called upon failure
> (unfortunately mmap() failures are treated as fatal rather than OOM
> errors).
> What I've chosen to do, and I'd be interested to get some input about
> that, is to have two functions that LLVM using code must use:
>   extern void llvm_enter_fatal_on_oom(void);
>   extern void llvm_leave_fatal_on_oom(void);
> before interacting with LLVM code (ie. emitting IR, or using the above
> functions) llvm_enter_fatal_on_oom() needs to be called.
>
> When a libstdc++ new or LLVM error occurs, the handlers set up by the
> above functions trigger a FATAL error. We have to use FATAL rather than
> ERROR, as we *cannot* reliably throw ERROR inside a foreign library
> without risking corrupting its internal state.
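
A minimal sketch of what such enter/leave wrappers could look like on top of
the LLVM C API's fatal-error handler is below.  This is an illustration only,
not the code from the patch; the real thing also has to hook the C++ new
handler, which is not shown, and the error text here is made up:

#include "postgres.h"

#include <llvm-c/ErrorHandling.h>

/* Called by LLVM instead of abort() when it hits a fatal condition. */
static void
fatal_llvm_error_handler(const char *reason)
{
    /* FATAL, not ERROR: we must not longjmp back into LLVM. */
    ereport(FATAL,
            (errcode(ERRCODE_OUT_OF_MEMORY),
             errmsg("fatal llvm error: %s", reason)));
}

void
llvm_enter_fatal_on_oom(void)
{
    LLVMInstallFatalErrorHandler(fatal_llvm_error_handler);
}

void
llvm_leave_fatal_on_oom(void)
{
    LLVMResetFatalErrorHandler();
}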

That bites, although it's probably tolerable if we expect such errors
only in exceptional situations such as a needed shared library failing
to load or something. Killing the session when we run out of memory
during JIT compilation is not very nice at all.  Does the LLVM library
have any useful hooks that we can leverage here, like a hypothetical
function LLVMProvokeFailureAsSoonAsConvenient()?  The equivalent
function for PostgreSQL would do { InterruptPending = true;
QueryCancelPending = true; }.  And maybe LLVMSetProgressCallback()
that would get called periodically and let us set a handler that could
check for interrupts on the PostgreSQL side and then call
LLVMProvokeFailureAsSoonAsConvenient() as applicable?  This problem
can't be completely unique to PostgreSQL; anybody who is using LLVM
for JIT from a long-running process needs a solution, so you might
think that the library would provide one.

> This facility allows us to get the bitcode for all operators
> (e.g. int8eq, float8pl etc), without maintaining two copies. The way
> I've currently set it up is that, if --with-llvm is passed to configure,
> all backend files are also compiled to bitcode files.  These bitcode
> files get installed into the server's
>   $pkglibdir/bitcode/postgres/
> under their original subfolder, eg.
>   ~/build/postgres/dev-assert/install/lib/bitcode/postgres/utils/adt/float.bc
> Using existing LLVM functionality (for parallel LTO compilation),
> additionally an index over these is stored to
>   $pkglibdir/bitcode/postgres.index.bc

That sounds pretty sweet.

> When deciding to JIT for the first time, $pkglibdir/bitcode/ is scanned
> for all .index.bc files and a *combined* index over all these files is
> built in memory.  The reason for doing so is that it allows "easy"
> access to inlining for extensions - they can install code into
>   $pkglibdir/bitcode/[extension]/
> accompanied by
>   $pkglibdir/bitcode/[extension].index.bc
> just alongside the actual library.

But that means that if an extension is installed after the initial
scan has been done, concurrent sessions won't notice the new files.
Maybe that's OK, but I wonder if we can do better.

> Do people feel these should be hidden behind #ifdefs, always present but
> prevented from being set to a meaningful value, or unrestricted?

We shouldn't allow non-superusers to set any GUC that dumps files to
the data directory or provides an easy to way to crash the server, run
the machine out of memory, or similar.  GUCs that just print stuff, or
make queries faster/slower, can be set by anyone, I think.  I favor
having the debugging stuff available in the default build.  This
feature has a chance of containing bugs, and those bugs will be hard
to troubleshoot if the first step in getting information on what went
wrong is "recompile".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Robert Haas
On Tue, Jan 30, 2018 at 11:56 AM, Simon Riggs  wrote:
> Which IMHO is case 4 since it would avoid a concurrent ERROR. This
> meets exactly my original implementation goals as clearly stated on
> this thread, so of course I agree with him and have already said I am
> happy to change the code, though I am still wary of the dangers he
> noted upthread.
>
> If you now agree with doing that and are happy that there are no
> dangers, then I'm happy we now have consensus again and we can
> continue implementing MERGE for PG11.

I can't certify that there are no dangers because I haven't studied it
in that much detail, and I still don't think this is the same thing as
#4 for the reasons I already stated.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Andres Freund
Hi,

On 2018-01-30 13:57:50 -0500, Robert Haas wrote:
> > When a libstdc++ new or LLVM error occurs, the handlers set up by the
> > above functions trigger a FATAL error. We have to use FATAL rather than
> > ERROR, as we *cannot* reliably throw ERROR inside a foreign library
> > without risking corrupting its internal state.
> 
> That bites, although it's probably tolerable if we expect such errors
> only in exceptional situations such as a needed shared library failing
> to load or something. Killing the session when we run out of memory
> during JIT compilation is not very nice at all.  Does the LLVM library
> have any useful hooks that we can leverage here, like a hypothetical
> function LLVMProvokeFailureAsSoonAsConvenient()?

I don't see how that'd help if a memory allocation fails? We can't just
continue in that case? You could arguably have a reserve memory pool that
you release in that case and then try to continue, but that seems
awfully fragile.


> The equivalent function for PostgreSQL would do { InterruptPending =
> true; QueryCancelPending = true; }.  And maybe
> LLVMSetProgressCallback() that would get called periodically and let
> us set a handler that could check for interrupts on the PostgreSQL
> side and then call LLVMProvokeFailureAsSoonAsConvenient() as
> applicable?  This problem can't be completely unique to PostgreSQL;
> anybody who is using LLVM for JIT from a long-running process needs a
> solution, so you might think that the library would provide one.

The ones I looked at just error out.  Needing to handle OOM in a soft-fail
manner isn't actually that common a demand, I guess :/.


> > for all .index.bc files and a *combined* index over all these files is
> > built in memory.  The reason for doing so is that it allows "easy"
> > access to inlining for extensions - they can install code into
> >   $pkglibdir/bitcode/[extension]/
> > accompanied by
> >   $pkglibdir/bitcode/[extension].index.bc
> > just alongside the actual library.
> 
> But that means that if an extension is installed after the initial
> scan has been done, concurrent sessions won't notice the new files.
> Maybe that's OK, but I wonder if we can do better.

I mean we could periodically rescan, rescan after sighup, or such? But
that seems like something for later to me. It's not going to be super
common to install new extensions while a lot of sessions are
running. And things will work in that case, the functions just won't get 
inlined...


> > Do people feel these should be hidden behind #ifdefs, always present but
> > prevented from being set to a meaningful value, or unrestricted?
> 
> We shouldn't allow non-superusers to set any GUC that dumps files to
> the data directory or provides an easy to way to crash the server, run
> the machine out of memory, or similar.

I don't buy the OOM one - there's so so so many of those already...

The profiling one does dump to ~/.debug/jit/ - it seems a bit annoying
if profiling can only be done by a superuser? Hm :/

Greetings,

Andres Freund



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Peter Geoghegan
On Tue, Jan 30, 2018 at 8:27 AM, Robert Haas  wrote:
> As far as I am able to understand, the substantive issue here is what
> to do when we match an invisible tuple rather than a visible tuple.
> The patch currently throws a serialization error on the basis that you
> (Simon) thought that's what was previously agreed.  Peter is arguing
> that we don't normally issue a serialization error at READ COMMITTED
> (which I think is true) and proposed that we instead try to INSERT.  I
> don't necessarily think that's different from consensus to implement
> option #3 from 
> https://www.postgresql.org/message-id/CA%2BTgmoYOyX4nyu9mbMdYTLzT9X-1RptxaTKSQfbSdpVGXgeAJQ%40mail.gmail.com
> because that point #3 says that we're not going to try to AVOID errors
> under concurrency, not that we're going to create NEW errors.

> In other words, I understand Peter, then and now, to be saying that MERGE
> should behave just as if invisible tuples didn't exist at all; if that
> leads some other part of the system to throw an ERROR, then that's
> what happens.

Yes, I am still saying that.

What's at issue here specifically is the exact behavior of
EvalPlanQual() in the context of having *multiple* sets of WHEN quals
that need to be evaluated one at a time (in addition to conventional
EPQ join quals). This is a specific, narrow question about the exact
steps that are taken by EPQ when we have to switch between WHEN
MATCHED and WHEN NOT MATCHED cases *as we walk the UPDATE chain*.

Right now, I suspect that we will require some minor variation of
EPQ's logic to account for new risks. The really interesting question
is what happens when we walk the UPDATE chain, while reevaluating EPQ
quals alongside WHEN quals, and then determine that no UPDATE/DELETE
should happen for the first WHEN case -- what then? I suspect that we
may not want to start from scratch (from the MVCC-visible tuple) as we
reach the second or subsequent WHEN case, but that's a very tentative
view, and I definitely want to hear more opinions on it. (Simon wants to
just throw a serialization error here instead, even in READ COMMITTED
mode, which I see as a cop-out.)

Note in particular that this EPQ question has nothing to do with
seeing tuples that are not either visible to our MVCC snapshot, or
visible to EPQ through an UPDATE chain (which starts from the MVCC
visible tuple). The idea that I have done some kind of about-face on
how concurrency should work is just plain wrong. It is not a helpful
way of framing things. What I am talking about here is very
complicated, but also really narrow.

> Presumably, in a case like this, that would be a common
> outcome, because the merge would be performed on the basis of a unique
> key and so inserting would trigger a duplicate key violation.  But
> maybe not, because I don't think MERGE requires there to be a unique
> key on that column, so maybe the insert would just work, or maybe the
> conflicting transaction would abort just in time to let it work
> anyway.

I think that going on to INSERT having decided against an UPDATE only
having done an EPQ walk (rather than throwing a serialization error)
is very likely to result in the INSERT succeeding, actually. But there
is no guarantee that you won't get a duplicate violation, because
there is nothing to stop a concurrent *INSERT* with the same PK value.
(That's something that's *always* true, regardless of whether or not
somebody needs to do EPQ.)

-- 
Peter Geoghegan



Re: [HACKERS] postgres_fdw bug in 9.6

2018-01-30 Thread Robert Haas
On Sat, Jan 20, 2018 at 12:20 AM, Tom Lane  wrote:
> It looks like Etsuro-san's proposed patch locks down the choice of
> plan more tightly, which is probably a reasonable answer.

OK, committed.  I added a comment along the lines you suggested
previously, since this no longer seems like a generic test that
happens to involve a bunch of merge joins.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Robert Haas
On Tue, Jan 30, 2018 at 2:08 PM, Andres Freund  wrote:
>> That bites, although it's probably tolerable if we expect such errors
>> only in exceptional situations such as a needed shared library failing
>> to load or something. Killing the session when we run out of memory
>> during JIT compilation is not very nice at all.  Does the LLVM library
>> have any useful hooks that we can leverage here, like a hypothetical
>> function LLVMProvokeFailureAsSoonAsConvenient()?
>
> I don't see how that'd help if a memory allocation fails? We can't just
> continue in that case? You could arguably have reserve memory pool that
> you release in that case and then try to continue, but that seems
> awfully fragile.

Well, I'm just asking what the library supports.  For example:

https://curl.haxx.se/libcurl/c/CURLOPT_PROGRESSFUNCTION.html

If you had something like that, you could arrange to safely interrupt
the library the next time the progress-function was called.
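
Roughly this pattern, purely as an illustration (the callback name and the
global flag are invented here, and it's libcurl rather than LLVM, but it
shows the kind of hook I mean):

#include <stdbool.h>
#include <curl/curl.h>

static volatile bool interrupt_pending = false;

/*
 * libcurl calls this periodically during a transfer; returning a nonzero
 * value makes it abort the operation at the next convenient point.
 */
static int
progress_cb(void *clientp, double dltotal, double dlnow,
            double ultotal, double ulnow)
{
    (void) clientp;
    (void) dltotal;
    (void) dlnow;
    (void) ultotal;
    (void) ulnow;

    return interrupt_pending ? 1 : 0;
}

static void
run_transfer(CURL *curl)
{
    curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0L);
    curl_easy_setopt(curl, CURLOPT_PROGRESSFUNCTION, progress_cb);
    (void) curl_easy_perform(curl);
}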

> The ones I looked at just error out.  Needing to handle OOM in a soft-fail
> manner isn't actually that common a demand, I guess :/.

Bummer.

> I mean we could periodically rescan, rescan after sighup, or such? But
> that seems like something for later to me. It's not going to be super
> common to install new extensions while a lot of sessions are
> running. And things will work in that case, the functions just won't get 
> inlined...

Fair enough.

>> > Do people feel these should be hidden behind #ifdefs, always present but
>> > prevented from being set to a meaningful value, or unrestricted?
>>
>> We shouldn't allow non-superusers to set any GUC that dumps files to
>> the data directory or provides an easy to way to crash the server, run
>> the machine out of memory, or similar.
>
> I don't buy the OOM one - there's so so so many of those already...
>
> The profiling one does dump to ~/.debug/jit/ - it seems a bit annoying
> if profiling can only be done by a superuser? Hm :/

The server's ~/.debug/jit?  Or are you somehow getting the output to the client?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Dagfinn Ilmari Mannsåker
Robert Haas  writes:

> Unfortunately, that has the pretty significant downside that a lot of
> people who actually want the postgresql-server-jit package will not
> realize that they need to install it, which sucks.  But I think it
> might still be the better way to go.  Anyway, it's for individual
> packagers to cope with that problem; as far as the patch goes, +1 for
> structuring things in a way which gives packagers the option to divide
> it up that way.

I don't know about rpm/yum/dnf, but in dpkg/apt one could declare that
postgresql-server recommends postgresql-server-jit, which installs the
package by default, but can be overridden by config or on the command
line.

- ilmari
-- 
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
 the consequences of."  -- Skud's Meta-Law



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Andres Freund
Hi,

On 2018-01-30 15:06:02 -0500, Robert Haas wrote:
> On Tue, Jan 30, 2018 at 2:08 PM, Andres Freund  wrote:
> >> That bites, although it's probably tolerable if we expect such errors
> >> only in exceptional situations such as a needed shared library failing
> >> to load or something. Killing the session when we run out of memory
> >> during JIT compilation is not very nice at all.  Does the LLVM library
> >> have any useful hooks that we can leverage here, like a hypothetical
> >> function LLVMProvokeFailureAsSoonAsConvenient()?
> >
> > I don't see how that'd help if a memory allocation fails? We can't just
> > continue in that case? You could arguably have reserve memory pool that
> > you release in that case and then try to continue, but that seems
> > awfully fragile.
> 
> Well, I'm just asking what the library supports.  For example:
> 
> https://curl.haxx.se/libcurl/c/CURLOPT_PROGRESSFUNCTION.html

I get that type of function; what I don't understand is how that applies to
OOM:

> If you had something like that, you could arrange to safely interrupt
> the library the next time the progress-function was called.

Yea, but how are you going to *get* to the next time, given that an
allocator just couldn't allocate memory? You can't just return a NULL
pointer because the caller will use that memory?


> > The profiling one does dump to ~/.debug/jit/ - it seems a bit annoying
> > if profiling can only be done by a superuser? Hm :/
> 
> The server's ~/.debug/jit?  Or are you somehow getting the output to the 
> client?

Yes, the server's - I'm not sure I understand the "client" bit? It's
about perf profiling, which isn't available to the client either?


Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Tomas Vondra


On 01/30/2018 12:24 AM, Andres Freund wrote:
> Hi,
> 
> On 2018-01-30 00:16:46 +0100, Tomas Vondra wrote:
>> FWIW I've installed llvm 5.0.1 from distribution package, and now
>> everything builds fine (I don't even need the configure tweak).
>>
>> I think I had to build the other binaries because there was no 5.x llvm
>> back then, but it's too far back so I don't remember.
>>
>> Anyway, seems I'm fine for now.
> 
> Phew, I'm relieved.  I'd guess you built a 5.0 version while 5.0 was
> still in development, so not all 5.0 functionality was available. Hence
> the inconsistent looking result.  While I think we can support 4.0
> without too much problem, there's obviously no point in trying to
> support old between releases versions...
> 

That's quite possible, but I don't really remember :-/

But I ran into another issue today, where everything builds fine (llvm
5.0.1, gcc 6.4.0), but at runtime I get errors like this:

ERROR:
LLVMCreateMemoryBufferWithContentsOfFile(/home/tomas/pg-llvm/lib/postgresql/llvmjit_types.bc)
failed: No such file or directory

It seems the llvmjit_types.bc file ended up in the parent directory
(/home/tomas/pg-llvm/lib/) for some reason. After simply copying it to
the expected place everything started working.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Peter Geoghegan
On Tue, Jan 30, 2018 at 8:56 AM, Simon Riggs  wrote:
>> Peter is arguing
>> that we don't normally issue a serialization error at READ COMMITTED
>> (which I think is true) and proposed that we instead try to INSERT.
>
> Which IMHO is case 4 since it would avoid a concurrent ERROR.

It would avoid the error that you added in v13 of your patch, a type
of error that I never proposed or even considered.

> This meets exactly my original implementation goals as clearly stated on
> this thread

I'm glad that you see it that way. I didn't and don't, though I
objected to your characterization mostly because it muddied some
already rather muddy waters. I'm happy to let it go now, though.

> If you now agree with doing that and are happy that there are no
> dangers, then I'm happy we now have consensus again and we can
> continue implementing MERGE for PG11.

My outline from earlier today about EPQ handling, and how it needs to
deal with multiple independent sets of WHEN ... AND quals is, as I
said, very tentative. There isn't even consensus in my own mind about
this, much less between everyone that has taken an interest in this
project.

I'm glad that we all seem to agree that serialization failures as a
way of dealing with concurrency issues in READ COMMITTED mode are a
bad idea. Unfortunately, I still think that we have a lot of work
ahead of us when it comes to agreeing to the right semantics with READ
COMMITTED conflict handling with multiple WHEN ... AND quals.

I see that your v14 still has the serialization error, even though
it's now clear that nobody wants to go that way. So...where do we go
from here? (For the avoidance of doubt, this is *not* a rhetorical
question.)

-- 
Peter Geoghegan



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread David Fetter
On Tue, Jan 30, 2018 at 01:46:37PM -0500, Robert Haas wrote:
> On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund  wrote:
> > It's an optional dependency, and it doesn't increase build time
> > that much... If we were to move the llvm interfacing code to a
> > .so, there'd not even be a packaging issue, you can just package
> > that .so separately and get errors if somebody tries to enable
> > LLVM without that .so being installed.
> 
> I suspect that would be really valuable.  If 'yum install
> postgresql-server' (or your favorite equivalent) sucks down all of
> LLVM,

As I understand it, LLVM is organized in such a way as not to require
this.  Andres, am I understanding correctly that what you're using
doesn't require much of LLVM at runtime?

> some people are going to complain, either because they are
> trying to build little tiny machine images or because they are
> subject to policies which preclude the presence of a compiler on a
> production server.  If you can do 'yum install postgresql-server'
> without additional dependencies and 'yum install
> postgresql-server-jit' to make it go faster, that issue is solved.

Would you consider it solved if there were some very small part of the
LLVM (or similar JIT-capable) toolchain added as a dependency, or does
it need to stay optional long into the future?

> Unfortunately, that has the pretty significant downside that a lot of
> people who actually want the postgresql-server-jit package will not
> realize that they need to install it, which sucks.

It does indeed.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Andres Freund
Hi,

On 2018-01-30 22:57:06 +0100, David Fetter wrote:
> On Tue, Jan 30, 2018 at 01:46:37PM -0500, Robert Haas wrote:
> > On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund  wrote:
> > > It's an optional dependency, and it doesn't increase build time
> > > that much... If we were to move the llvm interfacing code to a
> > > .so, there'd not even be a packaging issue, you can just package
> > > that .so separately and get errors if somebody tries to enable
> > > LLVM without that .so being installed.
> > 
> > I suspect that would be really valuable.  If 'yum install
> > postgresql-server' (or your favorite equivalent) sucks down all of
> > LLVM,
> 
> As I understand it, LLVM is organized in such a way as not to require
> this.  Andres, am I understanding correctly that what you're using
> doesn't require much of LLVM at runtime?

I'm not sure what you exactly mean. Yes, you need the llvm library at
runtime. Perhaps you're thinking of clang or llvm binaries? The
latter we do *not* need.

What's required is something like:
$ apt show libllvm5.0
Package: libllvm5.0
Version: 1:5.0.1-2
Priority: optional
Section: libs
Source: llvm-toolchain-5.0
Maintainer: LLVM Packaging Team 
Installed-Size: 56.9 MB
Depends: libc6 (>= 2.15), libedit2 (>= 2.11-20080614), libffi6 (>= 3.0.4), 
libgcc1 (>= 1:3.4), libstdc++6 (>= 6), libtinfo5 (>= 6), zlib1g (>= 1:1.2.0)
Breaks: libllvm3.9v4
Replaces: libllvm3.9v4
Homepage: http://www.llvm.org/
Tag: role::shared-lib
Download-Size: 13.7 MB
APT-Manual-Installed: no
APT-Sources: http://debian.osuosl.org/debian unstable/main amd64 Packages
Description: Modular compiler and toolchain technologies, runtime library
 LLVM is a collection of libraries and tools that make it easy to build
 compilers, optimizers, just-in-time code generators, and many other
 compiler-related programs.
 .
 This package contains the LLVM runtime library.

So ~14MB to download, ~57MB on disk.  We only need a subset of
libllvm5.0, and LLVM allows building such a subset. But obviously
distributions aren't going to target their LLVM just for postgres.


> > Unfortunately, that has the pretty significant downside that a lot of
> > people who actually want the postgresql-server-jit package will not
> > realize that they need to install it, which sucks.
> 
> It does indeed.

With things like apt recommends and such I don't think this is a huge
problem.  It'll be installed by default unless somebody is on a space
constrained system and doesn't want that...

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Andres Freund
Hi,

On 2018-01-30 13:46:37 -0500, Robert Haas wrote:
> On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund  wrote:
> > It's an optional dependency, and it doesn't increase build time that
> > much... If we were to move the llvm interfacing code to a .so, there'd
> > not even be a packaging issue, you can just package that .so separately
> > and get errors if somebody tries to enable LLVM without that .so being
> > installed.
> 
> I suspect that would be really valuable.  If 'yum install
> postgresql-server' (or your favorite equivalent) sucks down all of
> LLVM, some people are going to complain, either because they are
> trying to build little tiny machine images or because they are subject
> to policies which preclude the presence of a compiler on a production
> server.  If you can do 'yum install postgresql-server' without
> additional dependencies and 'yum install postgresql-server-jit' to
> make it go faster, that issue is solved.

So, I'm working on that now.  In the course of this I'll be
painfully rebasing and renaming a lot of code, which I'd like not to repeat
unnecessarily.

Right now there primarily is:

src/backend/lib/llvmjit.c - infrastructure, optimization, error handling
src/backend/lib/llvmjit_{error,wrap,inline}.cpp - expose more stuff to C
src/backend/executor/execExprCompile.c - emit LLVM IR for expressions
src/backend/access/common/heaptuple.c - emit LLVM IR for deforming

Given that we need a shared library it'll be best buildsystem wise if
all of this is in a directory, and there's a separate file containing
the stubs that call into it.

I'm not quite sure where to put the code. I'm a bit inclined to add a
new
src/backend/jit/
because we're dealing with code from across different categories? There
we could have a pgjit.c with the stubs, and llvmjit/ with the llvm
specific code?

Alternatively I'd say we put the stub into src/backend/executor/pgjit.c,
and the actual llvm using code into src/backend/executor/llvmjit/?

Comments?

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread David Fetter
On Tue, Jan 30, 2018 at 02:08:30PM -0800, Andres Freund wrote:
> Hi,
> 
> On 2018-01-30 22:57:06 +0100, David Fetter wrote:
> > On Tue, Jan 30, 2018 at 01:46:37PM -0500, Robert Haas wrote:
> > > On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund  wrote:
> > > > It's an optional dependency, and it doesn't increase build
> > > > time that much... If we were to move the llvm interfacing code
> > > > to a .so, there'd not even be a packaging issue, you can just
> > > > package that .so separately and get errors if somebody tries
> > > > to enable LLVM without that .so being installed.
> > > 
> > > I suspect that would be really valuable.  If 'yum install
> > > postgresql-server' (or your favorite equivalent) sucks down all
> > > of LLVM,
> > 
> > As I understand it, LLVM is organized in such a way as not to
> > require this.  Andres, am I understanding correctly that what
> > you're using doesn't require much of LLVM at runtime?
> 
> I'm not sure what you exactly mean. Yes, you need the llvm library
> at runtime. Perhaps you're thinking of clang or llvm binaries?
> The latter we do *not* need.

I was, and glad I understood correctly.

> What's required is something like:
> $ apt show libllvm5.0
> Package: libllvm5.0
> Version: 1:5.0.1-2
> Priority: optional
> Section: libs
> Source: llvm-toolchain-5.0
> Maintainer: LLVM Packaging Team 
> Installed-Size: 56.9 MB
> Depends: libc6 (>= 2.15), libedit2 (>= 2.11-20080614), libffi6 (>= 3.0.4), 
> libgcc1 (>= 1:3.4), libstdc++6 (>= 6), libtinfo5 (>= 6), zlib1g (>= 1:1.2.0)
> Breaks: libllvm3.9v4
> Replaces: libllvm3.9v4
> Homepage: http://www.llvm.org/
> Tag: role::shared-lib
> Download-Size: 13.7 MB
> APT-Manual-Installed: no
> APT-Sources: http://debian.osuosl.org/debian unstable/main amd64 Packages
> Description: Modular compiler and toolchain technologies, runtime library
>  LLVM is a collection of libraries and tools that make it easy to build
>  compilers, optimizers, just-in-time code generators, and many other
>  compiler-related programs.
>  .
>  This package contains the LLVM runtime library.
> 
> So ~14MB to download, ~57MB on disk.  We only need a subset of
> libllvm5.0, and LLVM allows building such a subset. But obviously
> distributions aren't going to target their LLVM just for postgres.

True, although if they're using an LLVM only for PostgreSQL and care
about 57MB of disk, they're probably also ready to do that work.

> > > Unfortunately, that has the pretty significant downside that a
> > > lot of people who actually want the postgresql-server-jit
> > > package will not realize that they need to install it, which
> > > sucks.
> > 
> > It does indeed.
> 
> With things like apt recommends and such I don't think this is a
> huge problem.  It'll be installed by default unless somebody is on a
> space constrained system and doesn't want that...

Don't most of the wins for JITing come in the OLAP space anyway?  I'm
having trouble picturing a severely space-constrained OLAP system, but
of course it's a possible scenario.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Jason Petersen
> On Jan 30, 2018, at 2:08 PM, Andres Freund  wrote:
> 
> With things like apt recommends and such I don't think this is a huge problem.


I don’t believe there is a similar widely-supported dependency type in yum/rpm, 
though. rpm 4.12 adds support for Weak Dependencies, which have 
Recommends/Suggests-style semantics, but AFAIK it’s not going to be on most RPM 
machines (I haven’t checked most OSes yet, but IIRC it’s mostly a Fedora thing 
at this point?)

Which means in the rpm packages we’ll have to decide whether this is required 
or must be opt-in by end users (which as discussed would hurt adoption).

--
Jason Petersen
Software Engineer | Citus Data
303.736.9255
ja...@citusdata.com



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2018-01-30 Thread Andreas Karlsson

On 01/21/2018 10:36 PM, Mark Rofail wrote:

== The @>> operator

A previous version of your patch added the "anyelement <<@ anyarray"
operator to avoid having to build arrays, but that part was reverted
due to a bug.

I am not an expert on the gin code, but as far as I can tell it would
be relatively simple to fix that bug. Just allocate an array of
Datums of length one where you put the element you are searching for
(or maybe a copy of it).

The @>> operator is now restored and functioning correctly; all issues with
contrib libraries have been resolved


Hi,

I looked some at your anyarray @>> anyelement code and sadly it does not 
look like the index code could work. The issue I see is that 
ginqueryarrayextract() needs to make a copy of the search key but to do 
so it needs to know the type of anyelement (to know if it needs to 
detoast, etc). But there is as far as I can tell no way to check the 
type of anyelement in this context.


Am I missing something?

Andreas



Re: Regarding drop index

2018-01-30 Thread Ivan Novick
Hi Abinaya,

Have you seen some benchmarks showing that an in-memory index will fix some
bottlenecks?

Do you have an idea how much of a gain this will provide?

Can you point to a github repo for it?

Cheers,
Ivan

On Tue, Jan 30, 2018 at 1:12 AM, Abinaya Kajendiran 
wrote:

> Hello all,
>
> We are building an in-memory index extension for postgres. For drop index
> query, does postgres notify me through index access methods?
>
> Currently, we are using event triggers to capture drop index. If there is
> a better way, please do suggest.
>
> Thanks in advance.
>
> Regards,
> Abinaya K
>
>


-- 
Ivan Novick, Product Manager Pivotal Greenplum
inov...@pivotal.io --  (Mobile) 408-230-6491
https://www.youtube.com/GreenplumDatabase


Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Peter Geoghegan
On Tue, Jan 30, 2018 at 8:27 AM, Robert Haas  wrote:
> As far as I am able to understand, the substantive issue here is what
> to do when we match an invisible tuple rather than a visible tuple.
> The patch currently throws a serialization error on the basis that you
> (Simon) thought that's what was previously agreed.  Peter is arguing
> that we don't normally issue a serialization error at READ COMMITTED
> (which I think is true) and proposed that we instead try to INSERT.  I
> don't necessarily think that's different from consensus to implement
> option #3 from 
> https://www.postgresql.org/message-id/CA%2BTgmoYOyX4nyu9mbMdYTLzT9X-1RptxaTKSQfbSdpVGXgeAJQ%40mail.gmail.com
> because that point #3 says that we're not going to try to AVOID errors
> under concurrency, not that we're going to create NEW errors.

I should have mentioned earlier that you have this exactly right: I do
not want to make any special effort to avoid duplicate violation
errors. I also don't want to create any novel new kind of error (e.g.,
READ COMMITTED serialization errors).

That having been said, I think that Simon was correct to propose a
novel solution. It just seemed like READ COMMITTED serialization
errors were the wrong novel solution, because that takes the easy way
out. ISTM that the right thing is to adapt EvalPlanQual() (or READ
COMMITTED conflict handling more generally) to the new complication
that is multiple "WHEN ... AND" quals (that need to be evaluated one
at a time, a bona fide new requirement). In short, his novel solution
seemed much too novel.

As I've pointed out already, we will define MERGE to users as
something that "provides a single SQL statement that can conditionally
INSERT, UPDATE or DELETE rows, a task that would otherwise require
multiple procedural language statements". I believe that MERGE's
charter should be to live up to that definition in the least
surprising way possible, up to and including preserving the
maybe-surprising aspects of how multiple procedural language
statements can behave when the system does READ COMMITTED conflict
handling. That's my opinion in a nutshell.

-- 
Peter Geoghegan



RE: Temporary tables prevent autovacuum, leading to XID wraparound

2018-01-30 Thread Tsunakawa, Takayuki
From: Robert Haas [mailto:robertmh...@gmail.com]
> Unfortunately, I think a full solution to the problem of allocating AV
> workers to avoid wraparound is quite complex.

Yes, that easily puts my small brain into an infinite loop...

> Given all of the foregoing this seems like a very hard problem.  I can't
> even articulate a clear set of rules for what our priorities should be,
> and it seems that such rules would depend on the rate at which we're consuming
> XIDs, how close we are in each database to a wraparound shutdown, what tables
> exist in each database, how big the not-all-frozen part of each one is,
> how big their indexes are, how much they're holding back relfrozenxid, and
> which ones already have workers, among other things.  I think it's quite
> possible that we can come up with something that's better than what we have
> now without embarking on a huge project, but it's not going to be anywhere
> near perfect because this is really complicated, and there's a real risk
> that we'll just make some cases better and others worse rather than
> actually coming out ahead overall.

So a simple improvement would be to assign workers fairly to databases facing a 
wraparound risk, as Sawada-san suggested.

One ultimate solution should be the undo-based MVCC that makes vacuuming 
unnecessary, which you proposed about a year ago...

Regards
Takayuki Tsunakawa




Re: [HACKERS] GnuTLS support

2018-01-30 Thread Andreas Karlsson

On 01/26/2018 03:54 AM, Peter Eisentraut wrote:

On 1/25/18 20:10, Michael Paquier wrote:

Peter, could you change ssl_version() and ssl_cipher() in sslinfo at the
same time please? I think that those should use the generic backend-side
APIs as well. sslinfo depends heavily on OpenSSL, OK, but if possible
getting this code more generic will help users of sslinfo to get
something partially working with other SSL implementations natively.


sslinfo is currently entirely dependent on OpenSSL, so I don't think
it's useful to throw in one or two isolated API changes.

I'm thinking maybe we should get rid of sslinfo and fold everything into
pg_stat_ssl.


I think sslinfo should either use the pg_tls_get_* functions or be 
removed. I do not like having an OpenSSL specific extension. One issue 
though is that pg_tls_get_* truncates strings to a given length while 
sslinfo allocates a copy and is therefore only limited by the maximum 
size of text, but this may not be an issue in practice.


Andreas



Re: Regarding drop index

2018-01-30 Thread Peter Eisentraut
On 1/30/18 04:12, Abinaya Kajendiran wrote:
>     We are building in-memory index extension for postgres. For drop
> index query, does postgres notify me through index access methods?

No, the access methods just write into blocks for the file they are told
about.  The deleting of that file is not handled by the access methods.

> Currently, we are using event triggers to capture drop index. If there
> is a better way, please do suggest.

An event trigger will probably do for now.  But you are venturing into
uncharted territory, so you will have to find your own best solution.
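
For illustration, the C side of such an event trigger function might look
roughly like this (the function name is made up; in an sql_drop trigger you
would normally also call pg_event_trigger_dropped_objects() to find out
exactly which indexes went away):

#include "postgres.h"
#include "commands/event_trigger.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(inmem_index_drop_hook);

Datum
inmem_index_drop_hook(PG_FUNCTION_ARGS)
{
    EventTriggerData *trigdata;

    if (!CALLED_AS_EVENT_TRIGGER(fcinfo))
        elog(ERROR, "not called by event trigger manager");

    trigdata = (EventTriggerData *) fcinfo->context;

    /* attached with CREATE EVENT TRIGGER ... ON sql_drop EXECUTE PROCEDURE ... */
    if (strcmp(trigdata->tag, "DROP INDEX") == 0)
        elog(LOG, "in-memory index extension saw a DROP INDEX");

    PG_RETURN_NULL();
}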

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: FOR EACH ROW triggers on partitioned tables

2018-01-30 Thread Peter Eisentraut
On 1/30/18 04:49, Amit Langote wrote:
>> If you are writing a generic
>> trigger function, maybe to dump out all columns, you want to know the
>> physical table and its actual columns.  It's easy[citation needed] to
>> get the partition root for a given table, if the trigger code needs
>> that.  The other way around is not possible.
> 
> I guess you mean the root where a trigger originated, that is, ancestor
> table on which an inherited trigger was originally defined.  It is
> possible for a trigger to be defined on an intermediate parent and not the
> topmost root in a partition tree.

OK, so maybe not so "easy".

But this muddies the situation even further.  You could be updating
table A, which causes an update in intermediate partition B, which
causes an update in leaf partition C, which fires a trigger that was
logically defined on B and has a local child on C.  Under this proposal,
the trigger will see TG_RELNAME = C.  You could make arguments that the
trigger should also somehow know about B (where the trigger was defined)
and A (what the user actually targeted in their statement).  I'm not
sure how useful these would be.  But if you want to cover everything,
you'll need three values.

I think the patch can go ahead as proposed, and the other things could
be future separate additions.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2018-01-30 Thread Andreas Karlsson

On 01/26/2018 03:28 AM, Stephen Frost wrote:

I'm a big fan of this patch but it doesn't appear to have made any
progress in quite a while.  Is there any chance we can get an updated
patch and perhaps get another review before the end of this CF...?


Sorry, as you may have guessed I do not have the time right now to get 
this updated during this commitfest.



Refactoring this to have an internal representation between
ProcessUtility() and DefineIndex doesn't sound too terrible and if it
means the ability to reuse that, seems like it'd be awful nice to do
so..


I too like the concept, but have not had the time to look into it.

Andreas



Re: [Sender Address Forgery]Re: FOR EACH ROW triggers on partitioned tables

2018-01-30 Thread Amit Langote
On 2018/01/31 9:44, Peter Eisentraut wrote:
> On 1/30/18 04:49, Amit Langote wrote:
>>> If you are writing a generic
>>> trigger function, maybe to dump out all columns, you want to know the
>>> physical table and its actual columns.  It's easy[citation needed] to
>>> get the partition root for a given table, if the trigger code needs
>>> that.  The other way around is not possible.
>>
>> I guess you mean the root where a trigger originated, that is, ancestor
>> table on which an inherited trigger was originally defined.  It is
>> possible for a trigger to be defined on an intermediate parent and not the
>> topmost root in a partition tree.
> 
> OK, so maybe not so "easy".
> 
> But this muddies the situation even further.  You could be updating
> table A, which causes an update in intermediate partition B, which
> causes an update in leaf partition C, which fires a trigger that was
> logically defined on B and has a local child on C.  Under this proposal,
> the trigger will see TG_RELNAME = C.  You could make arguments that the
> trigger should also somehow know about B (where the trigger was defined)
> and A (what the user actually targeted in their statement).  I'm not
> sure how useful these would be.  But if you want to cover everything,
> you'll need three values.
> 
> I think the patch can go ahead as proposed, and the other things could
> be future separate additions.

Yeah, I see no problem with going ahead with the patch as it is for now.

Thanks,
Amit




Re: User defined data types in Logical Replication

2018-01-30 Thread Masahiko Sawada
On Tue, Jan 30, 2018 at 6:36 PM, atorikoshi
 wrote:
> Hi,
>
> It seems to be in the middle of discussion, but I became a reviewer of
> this problem several days ago so I've tested the latest patch
> 'fix_slot_store_error_callback_v12.patch'.
>
> I reproduced the below ERROR by inserting elog() to INPUT function
> of CREATE TYPE, and confirmed that above patch solves the problem.
>
>>>  ERROR:  XX000: cache lookup failed for type X
>
> I also ran make check-world and there was no error.
>
> Is the only remaining topic memory leaks?
>

Yeah, but I think that the patch for avoiding memleaks should be a
separate patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Thomas Munro
On Wed, Jan 31, 2018 at 11:57 AM, Andres Freund  wrote:
> On 2018-01-30 13:46:37 -0500, Robert Haas wrote:
>> On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund  wrote:
>> > It's an optional dependency, and it doesn't increase build time that
>> > much... If we were to move the llvm interfacing code to a .so, there'd
>> > not even be a packaging issue, you can just package that .so separately
>> > and get errors if somebody tries to enable LLVM without that .so being
>> > installed.
>>
>> I suspect that would be really valuable.  If 'yum install
>> postgresql-server' (or your favorite equivalent) sucks down all of
>> LLVM, some people are going to complain, either because they are
>> trying to build little tiny machine images or because they are subject
>> to policies which preclude the presence of a compiler on a production
>> server.  If you can do 'yum install postgresql-server' without
>> additional dependencies and 'yum install postgresql-server-jit' to
>> make it go faster, that issue is solved.
>
> So, I'm working on that now.  In the course of this I'll be
> painfully rebasing and renaming a lot of code, which I'd like not to repeat
> unnecessarily.
>
> Right now there primarily is:
>
> src/backend/lib/llvmjit.c - infrastructure, optimization, error handling
> src/backend/lib/llvmjit_{error,wrap,inline}.cpp - expose more stuff to C
> src/backend/executor/execExprCompile.c - emit LLVM IR for expressions
> src/backend/access/common/heaptuple.c - emit LLVM IR for deforming
>
> Given that we need a shared library it'll be best buildsystem wise if
> all of this is in a directory, and there's a separate file containing
> the stubs that call into it.
>
> I'm not quite sure where to put the code. I'm a bit inclined to add a
> new
> src/backend/jit/
> because we're dealing with code from across different categories? There
> we could have a pgjit.c with the stubs, and llvmjit/ with the llvm
> specific code?
>
> Alternatively I'd say we put the stub into src/backend/executor/pgjit.c,
> and the actual llvm using code into src/backend/executor/llvmjit/?
>
> Comments?

I'm just starting to look at this (amazing) work, and I don't have a
strong opinion yet.  But certainly, making it easy for packagers to
put the -jit stuff into a separate package for the reasons already
given sounds sensible to me.  Some systems package LLVM as one
gigantic package that'll get you 1GB of compiler/debugger/other stuff
and perhaps violate local rules by installing a compiler when you
really just wanted libLLVM{whatever}.so.  I guess it should be made
very clear to users (explain plans, maybe startup message, ...?)
whether JIT support is active/installed so that people are at least
very aware when they encounter a system that is interpreting stuff it
could be compiling.   Putting all the JIT into a separate directory
under src/backend/jit certainly looks sensible at first glance, but
I'm not sure.

Incidentally, from commit fdc6c7a6dddbd6df63717f2375637660bcd00fc6
(HEAD -> jit, andresfreund/jit) on your branch I get:

ccache c++ -Wall -Wpointer-arith -fno-strict-aliasing -fwrapv -g -g
-O2 -fno-exceptions -I../../../src/include
-I/usr/local/llvm50/include -DLLVM_BUILD_GLOBAL_ISEL
-D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-I/usr/local/include  -c -o llvmjit_error.o llvmjit_error.cpp -MMD -MP
-MF .deps/llvmjit_error.Po
In file included from llvmjit_error.cpp:26:
In file included from ../../../src/include/lib/llvmjit.h:48:
In file included from /usr/local/llvm50/include/llvm-c/Types.h:17:
In file included from /usr/local/llvm50/include/llvm/Support/DataTypes.h:33:
/usr/include/c++/v1/cmath:555:1: error: templates must have C++ linkage
template 
^~~~
llvmjit_error.cpp:24:1: note: extern "C" language linkage
specification begins here
extern "C"
^

$ c++ -v
FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on
LLVM 4.0.0)

This seems to be a valid complaint.  I don't think you should be
(indirectly) wrapping Types.h in extern "C".  At a guess, your
llvmjit.h should be doing its own #ifdef __cplusplus'd linkage
specifiers, so you can use it from C or C++, but making sure that you
don't #include LLVM's headers from a bizarro context where __cplusplus
is defined but the linkage is unexpectedly already "C"?
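
Something roughly along these lines is what I have in mind (just a sketch,
reusing a couple of the declarations mentioned upthread, not the header as
it exists in your tree):

#ifndef LLVMJIT_H
#define LLVMJIT_H

/* Pull LLVM's headers in outside of any extern "C" block. */
#include <llvm-c/Types.h>

#ifdef __cplusplus
extern "C"
{
#endif

extern void llvm_enter_fatal_on_oom(void);
extern void llvm_leave_fatal_on_oom(void);

#ifdef __cplusplus
}
#endif

#endif   /* LLVMJIT_H */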

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] GUC for cleanup indexes threshold.

2018-01-30 Thread Masahiko Sawada
On Fri, Jan 12, 2018 at 1:05 PM, Masahiko Sawada  wrote:
> On Wed, Jan 10, 2018 at 11:28 AM, Masahiko Sawada  
> wrote:
>> On Sun, Jan 7, 2018 at 8:40 AM, Peter Geoghegan  wrote:
>>> On Sat, Jan 6, 2018 at 2:20 PM, Stephen Frost  wrote:
> > IIRC the patch that makes the cleanup scan skippable has a problem
> > pointed out by Peter[1], that is that we stash an XID when a btree page is
> > deleted, which is used to determine when it's finally safe to recycle
> > the page. Yura's patch doesn't have that problem?
> >
> > [1] 
> > https://www.postgresql.org/message-id/CAH2-Wz%3D1%3Dt5fcGGfarQGcAWBqaCh%2BdLMjpYCYHpEyzK8Qg6OrQ%40mail.gmail.com
>>>
 Masahiko Sawada, if this patch isn't viable or requires serious rework
 to be acceptable, then perhaps we should change its status to 'returned
 with feedback' and you can post a new patch for the next commitfest..?
>>>
>>> I believe that the problem that I pointed out with freezing/wraparound
>>> is a solvable problem. If we think about it carefully, we will come up
>>> with a good solution. I have tried to get the ball rolling with my
>>> pd_prune_xid suggestion. I think it's important to not lose sight of
>>> the fact that the page deletion/recycling XID thing is just one detail
>>> that we need to think about some more.
>>>
>>> I cannot fault Sawada-san for waiting to hear other people's views
>>> before proceeding. It really needs to be properly discussed.
>>>
>>
>> Thank you for commenting.
>>
>> IIUC we have two approaches: one idea is based on Peter's suggestion.
>> We can use pd_prune_xid to store the epoch of the xid with which the page is
>> deleted. That way, we can correctly mark deleted pages as recyclable
>> without breaking on-disk format.
>>
>> Another idea was suggested by Sokolov Yura. His original patch makes
>> btree keep a flag in btpo_flags that indicates whether the btree has any
>> deleted but not yet recyclable pages. I'd rather want to store it as a bool in
>> BTMetaPageData. Currently btm_version is defined as uint32 but I think
>> we won't use all of them. If we store it in part of btm_version we
>> don't break on-disk format. However, we're now assuming that the
>> vacuum on a btree index always scans the whole btree rather than a part, and
>> this approach will reinforce that even more. It might be possible that it will
>> become a restriction in the future.
>>
>
> On third thought, it's similar to what we are doing for heaps but we
> can store the oldest btpo.xact of a btree index somewhere (metapage
> or pg_class.relfrozenxid?) after vacuuming. We can skip cleanup
> vacuuming as long as the xid is not more than a certain threshold old
> (say vacuum_index_cleanup_age). Combined with the new parameter I
> proposed before, the condition for doing cleanup index vacuum will
> become as follows.
>
> if (there is garbage on heap)
>Vacuum index, and update oldest btpo.xact
> else /* there is no garbage on heap */
> {
> if (# of scanned pages > (nblocks *
> vacuum_cleanup_index_scale_factor) OR oldest btpo.xact is more than
> vacuum_index_cleanup_age old?))
> Cleanup vacuum index, and update oldest btpo.xact
> }
>
> In current index vacuuming, the whole index is scanned if there is
> even one garbage tuple on the heap (I'd like to improve that someday),
> which also leads to updating the oldest btpo.xact at that time. If
> vacuum_cleanup_index_scale_factor is high enough, we can skip
> scanning the whole index as long as either the table isn't modified for 2
> billion transactions or the oldest btpo.xact isn't more than
> vacuum_index_cleanup_age transactions old. But if there is an open
> transaction for a very long time, we might not be able to mark deleted
> pages as recyclable. Some tricks might be required. Feedback is
> welcome.
>

Since this patch still needs to be discussed before proceeding, I've
marked it as "Needs Review". Feedback is very welcome.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Andres Freund
On 2018-01-31 14:42:26 +1300, Thomas Munro wrote:
> I'm just starting to look at this (amazing) work, and I don't have a
> strong opinion yet.  But certainly, making it easy for packagers to
> put the -jit stuff into a separate package for the reasons already
> given sounds sensible to me.  Some systems package LLVM as one
> gigantic package that'll get you 1GB of compiler/debugger/other stuff
> and perhaps violate local rules by installing a compiler when you
> really just wanted libLLVM{whatever}.so.  I guess it should be made
> very clear to users (explain plans, maybe startup message, ...?)

I'm not quite sure I understand. You mean have it display whether
available? I think my plan is to "just" set jit_expressions=on (or
whatever we're going to name it) and fail if the prerequisites aren't
available. I personally don't think this should be enabled by default,
definitely not in the first release.

> $ c++ -v
> FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on
> LLVM 4.0.0)
> 
> This seems to be a valid complaint.  I don't think you should be
> (indirectly) wrapping Types.h in extern "C".  At a guess, your
> llvmjit.h should be doing its own #ifdef __cplusplus'd linkage
> specifiers, so you can use it from C or C++, but making sure that you
> don't #include LLVM's headers from a bizarro context where __cplusplus
> is defined but the linkage is unexpectedly already "C"?

Hm, this seems like a bit of pointless nitpickery by the compiler to me,
but I guess...

Greetings,

Andres Freund



Re: [HACKERS] postgres_fdw bug in 9.6

2018-01-30 Thread Etsuro Fujita

(2018/01/31 4:56), Robert Haas wrote:

On Sat, Jan 20, 2018 at 12:20 AM, Tom Lane  wrote:

It looks like Etsuro-san's proposed patch locks down the choice of
plan more tightly, which is probably a reasonable answer.


OK, committed.  I added a comment along the lines you suggested
previously, since this no longer seems like a generic test that
happens to involve a bunch of merge joins.


Thank you!

Best regards,
Etsuro Fujita



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Thomas Munro
On Wed, Jan 31, 2018 at 3:05 PM, Andres Freund  wrote:
> On 2018-01-31 14:42:26 +1300, Thomas Munro wrote:
>> I'm just starting to look at this (amazing) work, and I don't have a
>> strong opinion yet.  But certainly, making it easy for packagers to
>> put the -jit stuff into a separate package for the reasons already
>> given sounds sensible to me.  Some systems package LLVM as one
>> gigantic package that'll get you 1GB of compiler/debugger/other stuff
>> and perhaps violate local rules by installing a compiler when you
>> really just wanted libLLVM{whatever}.so.  I guess it should be made
>> very clear to users (explain plans, maybe startup message, ...?)
>
> I'm not quite sure I understand. You mean have it display whether
> available? I think my plan is to "just" set jit_expressions=on (or
> whatever we're going to name it) fail if the prerequisites aren't
> available. I personally don't think this should be enabled by default,
> definitely not in the first release.

I assumed (incorrectly) that you wanted it to default to on if
available, so I was suggesting making it obvious to end users if
they've accidentally forgotten to install -jit.  If it's not enabled
until you actually ask for it and trying to enable it when it's not
installed barfs, then that seems sensible.

>> This seems to be a valid complaint.  I don't think you should be
>> (indirectly) wrapping Types.h in extern "C".  At a guess, your
>> llvmjit.h should be doing its own #ifdef __cplusplus'd linkage
>> specifiers, so you can use it from C or C++, but making sure that you
>> don't #include LLVM's headers from a bizarro context where __cplusplus
>> is defined but the linkage is unexpectedly already "C"?
>
> Hm, this seems like a bit of pointless nitpickery by the compiler to me,
> but I guess...

Well that got me curious about how GCC could possibly be accepting
that (it certainly doesn't like extern "C" template ... any more than
the next compiler).  I dug a bit and realised that it's the stdlib
that's different:  libstdc++ has its own extern "C++" in ,
while  libc++ doesn't.
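
To illustrate the suggestion, a header that wants to be usable from both C
and C++ can carry its own linkage guards instead of relying on the includer's
context. A minimal sketch (the file and function names are made up here, this
is not the actual llvmjit.h):

```
/* myjit.h - hypothetical header, for illustration only */
#ifndef MYJIT_H
#define MYJIT_H

#ifdef __cplusplus
extern "C"
{
#endif

/* C-callable entry points are declared with C linkage... */
extern void myjit_compile_expr(void *state);

#ifdef __cplusplus
}
#endif

/* ...while C++-only headers (such as LLVM's) would be included outside
 * of any extern "C" block, never wrapped by the includer. */

#endif   /* MYJIT_H */
```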

-- 
Thomas Munro
http://www.enterprisedb.com



Re: JIT compiling with LLVM v9.0

2018-01-30 Thread Andres Freund
Hi,

On 2018-01-31 15:48:09 +1300, Thomas Munro wrote:
> On Wed, Jan 31, 2018 at 3:05 PM, Andres Freund  wrote:
> > I'm not quite sure I understand. You mean have it display whether
> > available? I think my plan is to "just" set jit_expressions=on (or
> > whatever we're going to name it) fail if the prerequisites aren't
> > available. I personally don't think this should be enabled by default,
> > definitely not in the first release.
> 
> I assumed (incorrectly) that you wanted it to default to on if
> available, so I was suggesting making it obvious to end users if
> they've accidentally forgotten to install -jit.  If it's not enabled
> until you actually ask for it and trying to enable it when it's not
> installed barfs, then that seems sensible.

I'm open to changing my mind on it, but it seems a bit weird that a
feature that relies on a shlib being installed magically turns itself on
if available. And leaving that angle aside, ISTM that it's a complex
enough feature that it should be opt-in for the first release... Think we
roughly did that right for e.g. parallelism.
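
One way to make "fail if the prerequisites aren't available" concrete is a
GUC check hook that rejects turning the setting on when the provider library
is missing. This is only a sketch; the hook and probe names below are
assumptions, not what the patch actually does:

```
/* Hypothetical check hook for a jit_expressions-style boolean GUC. */
static bool
check_jit_expressions(bool *newval, void **extra, GucSource source)
{
    /* jit_provider_available() is a made-up probe for the shlib. */
    if (*newval && !jit_provider_available())
    {
        GUC_check_errmsg("JIT support is not available");
        GUC_check_errdetail("The JIT provider library is not installed on this server.");
        return false;
    }
    return true;
}
```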

Greetings,

Andres Freund



Re: Wait for parallel workers to attach

2018-01-30 Thread Amit Kapila
On Mon, Jan 29, 2018 at 8:25 PM, Robert Haas  wrote:
> On Sat, Jan 27, 2018 at 3:14 AM, Amit Kapila  wrote:
>> During the recent development of parallel operation (parallel create
>> index)[1], a need has been arised for $SUBJECT.  The idea is to allow
>> leader backend to rely on number of workers that are successfully
>> started.  This API allows leader to wait for all the workers to start
>> or fail even if one of the workers fails to attach.  We consider
>> workers started/attached once they are attached to error queue.  This
>> will ensure that any error after the workers are attached won't be
>> silently ignored by leader.
>
> known_started_workers looks a lot like any_message_received.  Perhaps
> any_message_received should be renamed to known_started_workers and
> reused here.
>

Sure, that sounds good to me.  Do you prefer a separate patch for
renaming any_message_received to known_started_workers or is it okay
to have it along with the main patch?

>  After all, if we know that a worker was started, there's
> no need for WaitForParallelWorkersToFinish to again call
> GetBackgroundWorkerPid() for it.
>

I think in above sentence you wanted to say
WaitForParallelWorkersToAttach, not WaitForParallelWorkersToFinish.
Am I right?

> I think that you shouldn't need the 10ms delay loop; waiting forever
> should work.  If a work fails to start, the postmaster should send
> SIGUSR1 which should set our latch.
>

I am not getting what exactly you are suggesting here.  The wait loop
is intended for the case when some workers are not started.  We want
to wait for sometime before checking again whether workers are
started. I wanted to avoid busy looping waiting for some worker to
start.  I think in most cases we don't need to wait, but for some
corner cases where postmaster didn't get chance to start a worker, we
should avoid busy looping waiting for a worker to start.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Wait for parallel workers to attach

2018-01-30 Thread Robert Haas
On Tue, Jan 30, 2018 at 10:10 PM, Amit Kapila  wrote:
>> known_started_workers looks a lot like any_message_received.  Perhaps
>> any_message_received should be renamed to known_started_workers and
>> reused here.
>
> Sure, that sounds good to me.  Do you prefer a separate patch for
> renaming any_message_received to known_started_workers or is it okay
> to have it along with the main patch?

A single patch sounds OK.

>>  After all, if we know that a worker was started, there's
>> no need for WaitForParallelWorkersToFinish to again call
>> GetBackgroundWorkerPid() for it.
>
> I think in above sentence you wanted to say
> WaitForParallelWorkersToAttach, not WaitForParallelWorkersToFinish.
> Am I right?

Yes.

>> I think that you shouldn't need the 10ms delay loop; waiting forever
>> should work.  If a work fails to start, the postmaster should send
>> SIGUSR1 which should set our latch.
>>
> I am not getting what exactly you are suggesting here.  The wait loop
> is intended for the case when some workers are not started.  We want
> to wait for sometime before checking again whether workers are
> started. I wanted to avoid busy looping waiting for some worker to
> start.  I think in most cases we don't need to wait, but for some
> corner cases where postmaster didn't get chance to start a worker, we
> should avoid busy looping waiting for a worker to start.

I agree we need to avoid busy-looping.  What I'm saying is that we
don't need a timeout.  Why do you think we need a timeout?  We have
bgw_notify_pid so that we will get a signal instead of having to poll.
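
A rough sketch of what that latch-based wait could look like, with no timeout
at all (the status check below is a made-up placeholder, not code from the
patch):

```
/* Wait until every launched worker has either attached to its error
 * queue or failed to start; the postmaster's SIGUSR1 (via
 * bgw_notify_pid) sets our latch, so a plain WaitLatch() is enough. */
for (;;)
{
    if (all_workers_attached_or_failed())   /* hypothetical check */
        break;

    (void) WaitLatch(MyLatch,
                     WL_LATCH_SET | WL_POSTMASTER_DEATH, -1,
                     WAIT_EVENT_BGWORKER_STARTUP);
    ResetLatch(MyLatch);
    CHECK_FOR_INTERRUPTS();
}
```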

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [PATCH]make pg_rewind to not copy useless WAL files

2018-01-30 Thread Michael Paquier
On Fri, Jan 26, 2018 at 11:36:09AM +0900, Michael Paquier wrote:
> In short, it seems really to me that we should reject the approach as
> proposed, and replace it with something that prevents the fetching of
> any WAL segments from the source server. I think that we could consider
> as well removing all WAL segments on the primary from the point WAL
> forked, as those created between the last checkpoint before WAL forked
> up to the point where WAL forked are useful anyway. But those are bonus
> points to keep a minimalistic amount of space for the rewound node as
> they finish by being recycled anyway. For those reasons I think that the
> patch should be marked as returned with feedback.

Hearing nothing and as the commit fest is coming to a close, I am
marking this patch as returned with feedback.  Feel free to correct me
if you think this is not adapted.
--
Michael


signature.asc
Description: PGP signature


Re: Wait for parallel workers to attach

2018-01-30 Thread Amit Kapila
On Wed, Jan 31, 2018 at 8:46 AM, Robert Haas  wrote:
> On Tue, Jan 30, 2018 at 10:10 PM, Amit Kapila  wrote:
>> I am not getting what exactly you are suggesting here.  The wait loop
>> is intended for the case when some workers are not started.  We want
>> to wait for sometime before checking again whether workers are
>> started. I wanted to avoid busy looping waiting for some worker to
>> start.  I think in most cases we don't need to wait, but for some
>> corner cases where postmaster didn't get chance to start a worker, we
>> should avoid busy looping waiting for a worker to start.
>
> I agree we need to avoid busy-looping.  What I'm saying is that we
> don't need a timeout.  Why do you think we need a timeout?
>

I thought we need it for worker startup, but now after again looking
at the code, it seems we do notify at worker startup as well.   So, we
don't need a timeout.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



[PATCH] Add missing type conversion functions for PL/Python

2018-01-30 Thread Haozhou Wang
Hi All,

PL/Python already has different type conversion functions to
convert a PostgreSQL datum to a Python object. However, the conversion
functions from Python object to PostgreSQL datum only cover the Boolean,
Bytea and String types.

In this patch, we add the rest of the Integer and Float related datatype
conversion functions, which can greatly increase the performance of data
conversion, especially when returning a large array.
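
To illustrate the idea, a dedicated numeric conversion can skip the string
round-trip that the generic conversion path would otherwise take. The
following is only a simplified sketch, not the code from the attached patch:

```
#include "postgres.h"
#include <Python.h>

/* Hypothetical converter: Python float -> float8 Datum. */
static Datum
example_python_to_float8(PyObject *obj, bool *isnull)
{
    double      val;

    if (obj == Py_None)
    {
        *isnull = true;
        return (Datum) 0;
    }

    val = PyFloat_AsDouble(obj);        /* also accepts ints via __float__ */
    if (val == -1.0 && PyErr_Occurred())
        elog(ERROR, "could not convert Python object to float8");

    *isnull = false;
    return Float8GetDatum(val);
}
```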

We did a quick test about the performance of returning array in PL/Python:

The following UDF is used for test:

```
CREATE OR REPLACE FUNCTION pyoutfloat8(num int) RETURNS float8[] AS $$
return [x/3.0 for x in range(num)]
$$ LANGUAGE plpythonu;
```

The test command is

```
select count(pyoutfloat8(n));
```

The count is used to avoid large output, where n is the number of elements
in the returned array, and the size ranges from 1.5k to 15m.

Size of Array    | 1.5k      | 15k       | 150k       | 1.5m       | 15m
Origin           | 2.324ms   | 19.834ms  | 194.991ms  | 1927.28ms  | 19982.1ms
With this patch  | 1.168ms   | 3.052ms   | 21.888ms   | 213.39ms   | 2138.5ms

All tests for both PG and PL/Python pass.

Thanks very much.


-- 
Regards,
Haozhou


0001-Add-missing-type-conversion-functions-for-PL-Python.patch
Description: Binary data


Re: [HACKERS] [PATCH] Improve geometric types

2018-01-30 Thread Kyotaro HORIGUCHI
At Sun, 21 Jan 2018 21:59:19 +0100, Emre Hasegeli  wrote in 

> New versions are attached including all changes we discussed.

Thanks for the new version.

# there are many changes from the previous version..

About 0001 and 0002.

1."COPT=-DGEODEBUG make" complains as follows.

  | geo_ops.c:2445:62: error: invalid type argument of unary ‘*’ (have ‘float8 
{aka double}’)
  |   printf("dist_ppoly_internal- segment 0/n distance is %f\n", *result);

2. line_construct_pm has been renamed to line_construct. I
   noticed that the patch adds the second block for "(m == 0.0)"
   (from the ealier versions) but it seems to work exactly as the
   same to the "else" block. We need a comment about the reason
   for the seemingly redundant second block.

3. point_sl can return -0.0 and that is a thing that this patch
   intends to avoid. line_invsl has the same problem.

4. lseg_interpt_line is doing as follows.

   >  if (FPeq(lseg->p[0].x, interpt.x) && FPeq(lseg->p[0].y, interpt.y))
   >*result = lseg->p[0];
   >  else if (FPeq(lseg->p[1].x, interpt.x) && FPeq(lseg->p[1].y, interpt.y))
   >*result = lseg->p[1];

   I suppose we can use point_eq_point for this purpose.

   >  if (point_eq_point(&lseg->p[0], &interpt))
   >*result = lseg->p[0];
   >  else if (point_eq_point(&lseg->p[1], &interpt))
   >*result = lseg->p[1];

   However I'm not sure whether adjusting the intersection to the
   endpoints of the segment is good or not. Adjusting onto the line
   can be better in another case. lseg_interpt_lseg, for
   instance, checks lseg_contain_point on the line parameter of
   lseg_interpt_line.

# I'll be back later..

regards

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




A typo in error message

2018-01-30 Thread Alexander Lakhin

Hello,

Please consider committing the attached patch to fix a typo in error 
message.


Best regards,

--
Alexander Lakhin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index e0d96d7..4894812 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -668,7 +668,7 @@ transformColumnDefinition(CreateStmtContext *cxt, ColumnDef *column)
 	if (cxt->ofType)
 		ereport(ERROR,
 	(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-	 errmsg("identity colums are not supported on typed tables")));
+	 errmsg("identity columns are not supported on typed tables")));
 	if (cxt->partbound)
 		ereport(ERROR,
 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
diff --git a/src/test/regress/expected/identity.out b/src/test/regress/expected/identity.out
index 87ef0d3..627389b 100644
--- a/src/test/regress/expected/identity.out
+++ b/src/test/regress/expected/identity.out
@@ -349,7 +349,7 @@ DROP USER regress_user1;
 -- typed tables (currently not supported)
 CREATE TYPE itest_type AS (f1 integer, f2 text, f3 bigint);
 CREATE TABLE itest12 OF itest_type (f1 WITH OPTIONS GENERATED ALWAYS AS IDENTITY); -- error
-ERROR:  identity colums are not supported on typed tables
+ERROR:  identity columns are not supported on typed tables
 DROP TYPE itest_type CASCADE;
 -- table partitions (currently not supported)
 CREATE TABLE itest_parent (f1 date NOT NULL, f2 text, f3 bigint) PARTITION BY RANGE (f1);


Re: A typo in error message

2018-01-30 Thread Michael Paquier
On Wed, Jan 31, 2018 at 08:47:57AM +0300, Alexander Lakhin wrote:
> Please consider committing the attached patch to fix a typo in error
> message.

Yeah..

>   if (cxt->ofType)
>   ereport(ERROR,
>   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> -  errmsg("identity colums are not 
> supported on typed tables")));
> +  errmsg("identity columns are not 
> supported on typed tables")));
>   if (cxt->partbound)
>   ereport(ERROR,
>   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),

The indentation for this ereport() is weird as well here.
--
Michael


signature.asc
Description: PGP signature


Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2018-01-30 Thread Masahiko Sawada
On Sat, Jan 20, 2018 at 7:08 AM, Tomas Vondra
 wrote:
> On 01/19/2018 03:34 PM, Tomas Vondra wrote:
>> Attached is v5, fixing a silly bug in part 0006, causing segfault when
>> creating a subscription.
>>
>
> Meh, there was a bug in the sgml docs ( vs. ),
> causing another failure. Hopefully v6 will pass the CI build, it does
> pass a build with the same parameters on my system.

Thank you for working on this. This patch would be helpful for
synchronous replication.

I haven't looked at the code deeply yet, but I've reviewed the v6
patch set especially on subscriber side. All of the patches can be
applied to current HEAD cleanly. Here is review comment.


CREATE SUBSCRIPTION commands accept work_mem < 64, but it leads to an ERROR
on the publisher side when starting replication. Probably we should check
the value on the subscriber side as well.


When streaming = on, if we drop the subscription in the middle of
receiving streamed changes, DROP SUBSCRIPTION could leak tmp files
(.changes file and .subxacts file). It also happens when a
transaction on the upstream aborted without an abort record.


Since we can change both the streaming option and the work_mem option with
ALTER SUBSCRIPTION, the documentation of ALTER SUBSCRIPTION needs to be updated.


If we create a subscription without any options, both
pg_subscription.substream and pg_subscription.subworkmem are set to
null. However, since GetSubscription isn't aware of NULL, we start the
replication with invalid options as follows.
LOG:  received replication command: START_REPLICATION SLOT "hoge_sub"
LOGICAL 0/0 (proto_version '2', work_mem '893219954', streaming 'on',
publication_names '"hoge_pub"')

I think we can set substream to false and subworkmem to -1 instead of
null, and then make libpqrcv_startstreaming not set the streaming option
if stream is -1.


Some WARNING messages appeared. Maybe these are for debugging purposes?

WARNING:  updating stream stats 0x1c12ef8 4 3 65604
WARNING: UpdateSpillStats: updating stats 0x1c12ef8 0 0 0 39 41 2632080

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2018-01-30 Thread Tatsuro Yamada

On 2018/01/24 1:08, Pavel Stehule wrote:



2018-01-22 23:15 GMT+01:00 Stephen Frost mailto:sfr...@snowman.net>>:

Pavel,

* Pavel Stehule (pavel.steh...@gmail.com ) 
wrote:
> here is a GUC based patch for plancache controlling. Looks so this code is
> working.

This really could use a new thread, imv.  This thread is a year old and
about a completely different feature than what you've implemented here.


true, but now it is too late


> It is hard to create regress tests. Any ideas?

Honestly, my idea for this would be to add another option to EXPLAIN
which would make it spit out generic and custom plans, or something of
that sort.


I looked there, but it needs cycle dependency between CachedPlan and 
PlannedStmt. It needs more code than this patch and code will not be nicer. I 
try to do some game with prepared statements

Please review your patches before sending them and don't send in patches
which have random unrelated whitespace changes.

 > diff --git a/src/backend/utils/cache/plancache.c 
b/src/backend/utils/cache/plancache.c
 > index ad8a82f1e3..cc99cf6dcc 100644
 > --- a/src/backend/utils/cache/plancache.c
 > +++ b/src/backend/utils/cache/plancache.c
 > @@ -1031,6 +1033,12 @@ choose_custom_plan(CachedPlanSource *plansource, 
ParamListInfo boundParams)
 >       if (IsTransactionStmtPlan(plansource))
 >               return false;
 >
 > +     /* See if settings wants to force the decision */
 > +     if (plancache_mode & PLANCACHE_FORCE_GENERIC_PLAN)
 > +             return false;
 > +     if (plancache_mode & PLANCACHE_FORCE_CUSTOM_PLAN)
 > +             return true;

Not a big deal, but I probably would have just expanded the conditional
checks against cursor_options (just below) rather than adding
independent if() statements.


I don't think so proposed change is better - My opinion is not strong - and 
commiter can change it simply


 > diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
 > index ae22185fbd..4ce275e39d 100644
 > --- a/src/backend/utils/misc/guc.c
 > +++ b/src/backend/utils/misc/guc.c
 > @@ -3916,6 +3923,16 @@ static struct config_enum ConfigureNamesEnum[] =
 >               NULL, NULL, NULL
 >       },
 >
 > +     {
 > +             {"plancache_mode", PGC_USERSET, QUERY_TUNING_OTHER,
 > +                     gettext_noop("Forces use of custom or generic 
plans."),
 > +                     gettext_noop("It can control query plan cache.")
 > +             },
 > +             &plancache_mode,
 > +             PLANCACHE_DEFAULT, plancache_mode_options,
 > +             NULL, NULL, NULL
 > +     },

That long description is shorter than the short description.  How about:

short: Controls the planner's selection of custom or generic plan.
long: Prepared queries have custom and generic plans and the planner
will attempt to choose which is better; this can be set to override
the default behavior.


changed - thank you for text


(either that, or just ditch the long description)

 > diff --git a/src/include/utils/plancache.h 
b/src/include/utils/plancache.h
 > index 87fab19f3c..962895cc1a 100644
 > --- a/src/include/utils/plancache.h
 > +++ b/src/include/utils/plancache.h
 > @@ -143,7 +143,6 @@ typedef struct CachedPlan
 >       MemoryContext context;          /* context containing this 
CachedPlan */
 >  } CachedPlan;
 >
 > -
 >  extern void InitPlanCache(void);
 >  extern void ResetPlanCache(void);
 >

Random unrelated whitespace change...


fixed


This is also missing documentation updates.


I wrote some basic doc, but native speaker should to write more words about 
used strategies.


Setting to Waiting for Author, but with those changes I would think we
could mark it ready-for-committer, it seems pretty straight-forward to
me and there seems to be general agreement (now) that it's worthwhile to
have.

Thanks!


attached updated patch

Regards

Pavel


Stephen


Hi Pavel,

I tested your patch on a044378ce2f6268a996c8cce2b7bfb5d82b05c90.
I share my results to you.

 - Result of patch command

$ patch -p1 < guc-plancache_mode-180123.patch
patching file doc/src/sgml/config.sgml
Hunk #1 succeeded at 4400 (offset 13 lines).
patching file src/backend/utils/cache/plancache.c
patching file src/backend/utils/misc/guc.c
patching file src/include/utils/plancache.h
patching file src/test/regress/expected/plancache.out
patching file src/test/regress/sql/plancache.sql

 - Result of regression test

===
 All 186 tests passed.
===

Regards,
Tatsuro Yamada




Re: [HACKERS] generated columns

2018-01-30 Thread Michael Paquier
On Sat, Jan 27, 2018 at 05:05:14PM -0500, Peter Eisentraut wrote:
> This is expanded in the rewriter, so there is no context like that.
> This is exactly how views work, e.g.,
> 
> create table t1 (id int, length int);
> create view v1 as select id, length * 10 as length_in_nanometers
> from t1;
> insert into t1 values (1, 5);
> select * from v1;
> ERROR:  integer out of range
> 
> I think this is not a problem in practice.

Yeah, I tend to have the same opinion while doing a second pass on the
patch proposed on this thread.  You could get more context when using STORED
columns, but for VIRTUAL that does not make as much sense, as the handling
of values is close to views.  That's the same handling for materialized
views as well: you don't get any context when facing an overflow when
either creating the matview or refreshing it.
--
Michael


signature.asc
Description: PGP signature


csv format for psql

2018-01-30 Thread Daniel Verite
 Hi,


This patch implements csv as an output format in psql
(\pset format csv). It's quite similar to the unaligned format,
except that it applies CSV quoting rules (obviously!) and that
it prints no footer and no title.
As with unaligned, a header with column names is output unless
tuples_only is on. It also supports the fieldsep/fieldsep_zero
and recordsep/recordsep_zero settings.
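
For reference, the quoting rule mentioned above boils down to something like
this (a simplified sketch, not the patch's actual code):

```
#include <stdio.h>
#include <string.h>

/* Quote a CSV field when it contains the separator, a double quote or an
 * end-of-line character; embedded double quotes are doubled. */
static void
csv_print_field(FILE *fout, const char *value, char sep)
{
    if (strchr(value, sep) != NULL || strpbrk(value, "\"\r\n") != NULL)
    {
        const char *p;

        fputc('"', fout);
        for (p = value; *p; p++)
        {
            if (*p == '"')
                fputc('"', fout);
            fputc(*p, fout);
        }
        fputc('"', fout);
    }
    else
        fputs(value, fout);
}
```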

Most of the time, the need for CSV is covered by \copy or COPY with
the CSV option, but there are some cases where it would be more
practical to have it as an output format in psql.

* \copy does not interpolate psql variables and is a single-line
command, so making a query fit these constraints can be cumbersome.
It can be got around by defining a temporary view and
\copy from that view, but that doesn't work in a read-only context
such as when connected to a standby.

* the server-side COPY TO STDOUT can also be used from psql,
typically with psql -c "COPY (query) TO STDOUT CSV" > file.csv,
but that's too simple to extract multiple result sets per script.
COPY is also more rigid than psql in the options to delimit
fields and records.

* copy with csv can't help for the output of meta-commands
such as \gx, \crosstabview, \l, \d ... whereas a CSV format within psql
does work with these.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 7ea7edc..ebb3d35 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -246,7 +246,7 @@ EOF
   
   
   Use separator as the
-  field separator for unaligned output. This is equivalent to
+  field separator for unaligned and csv outputs. This is equivalent to
   \pset fieldsep or \f.
   
   
@@ -376,7 +376,7 @@ EOF
   
   
   Use separator as the
-  record separator for unaligned output. This is equivalent to
+  record separator for unaligned and csv outputs. This is equivalent to
   \pset recordsep.
   
   
@@ -551,8 +551,8 @@ EOF
   --field-separator-zero
   
   
-  Set the field separator for unaligned output to a zero byte.  This is
-  equvalent to \pset fieldsep_zero.
+  Set the field separator for unaligned and csv outputs to a zero byte.
+  This is equivalent to \pset fieldsep_zero.
   
   
 
@@ -562,8 +562,9 @@ EOF
   --record-separator-zero
   
   
-  Set the record separator for unaligned output to a zero byte.  This is
-  useful for interfacing, for example, with xargs -0.
+  Set the record separator for unaligned and csv outputs to a zero byte.
+  This is useful for interfacing, for example, with
+  xargs -0.
   This is equivalent to \pset recordsep_zero.
   
   
@@ -1918,9 +1919,9 @@ Tue Oct 26 21:40:57 CEST 1999
 
 
 
-Sets the field separator for unaligned query output. The default
-is the vertical bar (|). It is equivalent to
-\pset fieldsep.
+Sets the field separator for unaligned and csv query outputs. The
+default is the vertical bar (|). It is equivalent
+to \pset fieldsep.
 
 
   
@@ -2527,8 +2528,8 @@ lo_import 152801
   fieldsep
   
   
-  Specifies the field separator to be used in unaligned output
-  format. That way one can create, for example, tab- or
+  Specifies the field separator to be used in unaligned and csv output
+  formats. That way one can create, for example, tab- or
   comma-separated output, which other programs might prefer. To
   set a tab as field separator, type \pset fieldsep
   '\t'. The default field separator is
@@ -2541,8 +2542,8 @@ lo_import 152801
   fieldsep_zero
   
   
-  Sets the field separator to use in unaligned output format to a zero
-  byte.
+  Sets the field separator to use in unaligned or csv output formats to
+  a zero byte.
   
   
   
@@ -2565,9 +2566,13 @@ lo_import 152801
   format
   
   
-  Sets the output format to one of unaligned,
-  aligned, wrapped,
-  html, asciidoc,
+  Sets the output format to one of
+  unaligned,
+  aligned,
+  csv,
+  wrapped,
+  html,
+  asciidoc,
   latex (uses tabular),
   latex-longtable, or
   troff-ms.
@@ -2582,6 +2587,12 @@ lo_import 152801
   format).
   
 
+  csv format is similar to
+  unaligned, except that column contents are
+  enclosed in double quotes and quoted when necessary according to the
+  rules of the CSV format, and that no title or footer are printed.
+  
+
   aligned format is the standard, 
human-readable,
   nicely formatte

Re: [HACKERS] datetime.h defines like PM conflict with external libraries

2018-01-30 Thread Michael Meskes
> Michael, is there any problem including datatype/* headers in ecpg
> that
> are frontend clean? I see no such usage so far, that's why I'm
> asking.

When the pgtypes library was created we tried to include only the bits
and pieces needed to not create unnecessary hassles, but if it compiles
cleanly I'm fine either way. I'm assuming you're talking about
including the files for compiling ecpg, not as externally visible
header files, right?

michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL



Re: [HACKERS][PATCH] Applying PMDK to WAL operations for persistent memory

2018-01-30 Thread Yoshimi Ichiyanagi

On Fri, 19 Jan 2018 09:42:25 -0500, Robert Haas wrote:
>
>I think that you really need to include the checkpoints in the tests.
>I would suggest setting max_wal_size and/or checkpoint_timeout so that
>you reliably complete 2 checkpoints in a 30-minute test, and then do a
>comparison on that basis.

Experimental setup:
-
Server: HP ProLiant DL360 Gen9
CPU:Xeon E5-2667 v4 (3.20GHz); 2 processors(without HT)
DRAM:   DDR4-2400; 32 GiB/processor
(8GiB/socket x 4 sockets/processor) x 2 processors
NVDIMM: DDR4-2133; 32 GiB/processor
(node 0: 8GiB/socket x 2 sockets/processor,
 node 1: 8GiB/socket x 6 sockets/processor)
HDD:Seagate Constellation2 2.5inch SATA 3.0. 6Gb/s 1TB 7200rpm x 1
SATA-SSD: Crucial_CT500MX200SSD1 (SATA 3.2, SATA 6Gb/s)
OS:   Ubuntu 16.04, linux-4.12
DAX FS:   ext4
PMDK: master(at)Aug 30, 2017
PostgreSQL: master
Note: I bound the postgres processes to one NUMA node,
  and the benchmarks to other NUMA node.
-

postgresql.conf
-
# - Settings -
wal_level = replica
fsync = on
synchronous_commit = on
wal_sync_method = pmem_drain/fdatasync/open_datasync
full_page_writes = on
wal_compression = off

# - Checkpoints -
checkpoint_timeout = 12min
max_wal_size = 20GB
min_wal_size = 20GB
-

Executed commands:

# numactl -N 1 pg_ctl start -D [PG_DIR] -l [LOG_FILE]
# numactl -N 0 pgbench -s 200 -i [DB_NAME]
# numactl -N 0 pgbench -c 32 -j 32 -T 1800 -r [DB_NAME] -M prepared


The results:

A) Applied the patches to PG src, and compiled PG with libpmem
B) Applied the patches to PG src, and compiled PG without libpmem
C) Original PG

The averages of running pgbench three times on *PMEM* are:
A)
wal_sync_method = pmem_drain  tps = 41660.42524
wal_sync_method = open_datasync   tps = 39913.49897
wal_sync_method = fdatasync   tps = 39900.83396

C)
wal_sync_method = open_datasync   tps = 40335.50178
wal_sync_method = fdatasync   tps = 40649.57772


The averages of running pgbench three times on *SATA-SSD* are:
B)
wal_sync_method = open_datasync   tps = 7224.07146
wal_sync_method = fdatasync   tps = 7222.19177

C)
wal_sync_method = open_datasync   tps = 7258.79093
wal_sync_method = fdatasync   tps = 7263.19878


From the above results, it appears that wal_sync_method=pmem_drain was
faster than wal_sync_method=open_datasync/fdatasync.
When pgbench ran on the SATA-SSD, wal_sync_method=fdatasync was as fast
as wal_sync_method=open_datasync.
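
For readers unfamiliar with libpmem, the pmem_drain path amounts to copying
into a DAX-mapped file and draining the CPU store buffers instead of calling
fdatasync(). A minimal illustration (not the patch's code) looks like this:

```
#include <libpmem.h>
#include <string.h>

/* Append one WAL write to a memory-mapped segment and make it durable. */
static void
append_to_mapped_segment(char *seg_base, size_t offset,
                         const void *buf, size_t len, int is_pmem)
{
    if (is_pmem)
    {
        pmem_memcpy_nodrain(seg_base + offset, buf, len);
        pmem_drain();           /* wait for the stores to reach the media */
    }
    else
    {
        memcpy(seg_base + offset, buf, len);
        pmem_msync(seg_base + offset, len);     /* fall back to msync */
    }
}
```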


>> Do you know any good WAL I/O intensive benchmarks? DBT2?
>
>pgbench is quite a WAL-intensive benchmark; it is much more
>write-heavy than what most systems experience in real life, at least
>in my experience.  Your comparison of DAX FS to DAX FS + PMDK is very
>interesting, but in real life the bandwidth of DAX FS is already so
>high -- and the latency so low -- that I think most real-world
>workloads won't gain very much.  At least, that is my impression based
>on internal testing EnterpriseDB did a few months back.  (Thanks to
>Mithun and Kuntal for that work.)

In the near future, many physical devices will send sensing data
(IoT devices might saturate tens of gigabits of network bandwidth).
The amount of data inserted into the DB will increase significantly.
I think that PMEM will be needed for use cases like IoT.



On Thu, 25 Jan 2018 09:30:45 -0500, Robert Haas wrote:
>Well, some day persistent memory may be a common enough storage
>technology that such a change makes sense, but these days most people
>have either SSD or spinning disks, where the change would probably be
>a net negative.  It seems more like something we might think about
>changing in PG 20 or PG 30.
>

Oracle and Microsoft SQL Server support PMEM [1][2].
I think it is not too early for PostgreSQL to support PMEM.

[1] http://dbheartbeat.blogspot.jp/2017/11/doag-2017-oracle-18c-dbim-oracle.htm
[2] 
https://www.snia.org/sites/default/files/PM-Summit/2018/presentations/06_PM_Summit_2018_Talpey-Final_Post-CORRECTED.pdf

-- 
Yoshimi Ichiyanagi




Regarding drop index

2018-01-30 Thread Abinaya Kajendiran
Hello all,

We are building an in-memory index extension for Postgres. For a DROP INDEX
query, does Postgres notify the extension through the index access methods?

Currently, we are using event triggers to capture DROP INDEX. If there is a
better way, please do suggest it.

Thanks in advance.

Regards,
Abinaya K


Re: csv format for psql

2018-01-30 Thread Fabien COELHO


Hello Daniel,


This patch implements csv as an output format in psql
(\pset format csv).


Would you consider registering it in the next CF?

--
Fabien.



Re: non-bulk inserts and tuple routing

2018-01-30 Thread Etsuro Fujita

(2018/01/25 18:52), Amit Langote wrote:

On 2018/01/25 18:30, Etsuro Fujita wrote:

The patches apply cleanly and compile successfully, but make check fails
in an assert-enabled build.


Hmm, I can't seem to reproduce the failure with v4 patches I posted
earlier today.



Can you please post the errors you're seeing?


A quick debug showed that the failure was due to a segmentation fault 
caused by this change to ExecSetupPartitionTupleRouting (in patch 
v4-0003-During-tuple-routing-initialize-per-partition-obj):


-   bool    is_update = false;

+   bool    is_update;

I modified that patch to initialize the is_update to false as before. 
With the modified version, make check passed successfully.


I'll review the patch in more detail!

Best regards,
Etsuro Fujita



Re: [HACKERS] PoC: full merge join on comparison clause

2018-01-30 Thread Alexander Kuzmenkov

On 29.01.2018 08:40, Ashutosh Bapat wrote:

  Maybe it's better to separate these two into separate patches so that
it's easy to review patches.
OK, I'll try doing this. For now, moving the patch entry to the next 
commitfest.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: User defined data types in Logical Replication

2018-01-30 Thread atorikoshi

Hi,

It seems to be in the middle of discussion, but I became a reviewer of
this problem several days ago so I've tested the latest patch
'fix_slot_store_error_callback_v12.patch'.

I reproduced the below ERROR by inserting elog() into the INPUT function
of CREATE TYPE, and confirmed that the above patch solves the problem.

>>  ERROR:  XX000: cache lookup failed for type X

I also ran make check-world and there was no error.

Is the only remaining topic memory leaks?


On 2018/01/09 17:22, Masahiko Sawada wrote:

>> I suppose it's not a common use pattern, but it'd be good
>> to avoid everlasting memleaks.
>
> I agree. Can we remove entry from hash table in the callbacks instead
> of setting InvalidOid when invalidate caches?


--
Atsushi Torikoshi
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: non-bulk inserts and tuple routing

2018-01-30 Thread Amit Langote
Fujita-san,

On 2018/01/30 18:22, Etsuro Fujita wrote:
> (2018/01/25 18:52), Amit Langote wrote:
>> On 2018/01/25 18:30, Etsuro Fujita wrote:
>>> The patches apply cleanly and compile successfully, but make check fails
>>> in an assert-enabled build.
>>
>> Hmm, I can't seem to reproduce the failure with v4 patches I posted
>> earlier today.
> 
>> Can you please post the errors you're seeing?
> 
> A quick debug showed that the failure was due to a segmentation fault
> caused by this change to ExecSetupPartitionTupleRouting (in patch
> v4-0003-During-tuple-routing-initialize-per-partition-obj):
> 
> -    bool    is_update = false;
> 
> +    bool    is_update;
> 
> I modified that patch to initialize the is_update to false as before. With
> the modified version, make check passed successfully.

Oops, my bad.

> I'll review the patch in more detail!

Thank you.  Will wait for your comments before sending a new version then.

Regards,
Amit




Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Simon Riggs
On 29 January 2018 at 22:31, Peter Geoghegan  wrote:

> I don't think that there should be any error, so I can't say.

>> I argued that was possible and desirable, but you argued it was not
>> and got everybody else to agree with you. I'm surprised to see you
>> change your mind on that.
>
> You're mistaken. Nothing I've said here is inconsistent with my
> previous remarks on how we deal with concurrency.

Please see here

https://www.postgresql.org/message-id/20171102191636.GA27644%40marmot

On 2 November 2017 at 19:16, Peter Geoghegan  wrote:
> Simon Riggs  wrote:
>>
>> So if I understand you correctly, in your view MERGE should just fail
>> with an ERROR if it runs concurrently with other DML?
>
>
> That's certainly my opinion on the matter. It seems like that might be
> the consensus, too.


You've changed your position, which is good, thanks. No problem at all.

The proposal you make here had already been discussed in detail by
Pavan and myself. My understanding of that discussion was that he
thinks it might be possible, but I had said we must stick to the
earlier agreement on how to proceed. I am willing to try to produce
fewer concurrent errors, since that was an important point from
earlier work.

My only issue now is to make sure this does not cause confusion and
ask if that changes the views of others.



> According to your documentation, "MERGE provides a single SQL
> statement that can conditionally INSERT, UPDATE or DELETE rows, a task
> that would otherwise require multiple procedural language statements".
> But you're introducing a behavior/error that would not occur in
> equivalent procedural client code consisting of an UPDATE followed by
> a (conditionally executed) INSERT statement when run in READ COMMITTED
> mode. You actually get exactly the concurrency issue that you cite as
> unacceptable in justifying your serialization error with such
> procedural code (when the UPDATE didn't affect any rows, only
> following EPQ walking the UPDATE chain from the snapshot-visible
> tuple, making the client code decide to do an INSERT on the basis of
> information "from the future").

"You're introducing a behavior/error"... No, I advocate nothing in the
patch, I am merely following the agreement made here:

https://www.postgresql.org/message-id/CA%2BTgmoYOyX4nyu9mbMdYTLzT9X-1RptxaTKSQfbSdpVGXgeAJQ%40mail.gmail.com

Robert, Stephen, may we attempt to implement option 4 as Peter now suggests?

> 4. Implement MERGE, while attempting to avoid concurrent ERRORs in
> cases where that is possible.

I will discuss in more detail at the Brussels Dev meeting and see if
we can achieve consensus on how to proceed.



v14 posted with changes requested by multiple people. Patch status:
Needs Review.

Current summary: 0 wrong answers; 2 ERRORs raised need better
handling; some open requests for change/enhancement.

I will open a wiki page to track open items by the end of the week.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: FOR EACH ROW triggers on partitioned tables

2018-01-30 Thread Amit Langote
On 2018/01/30 5:30, Peter Eisentraut wrote:
> On 1/23/18 17:10, Alvaro Herrera wrote:
>> The main question is this: when running the trigger function, it is
>> going to look as it is running in the context of the partition, not in
>> the context of the parent partitioned table (TG_RELNAME etc).  That
>> seems mildly ugly: some users may be expecting that the partitioning
>> stuff is invisible to the rest of the system, so if you have triggers on
>> a regular table and later on decide to partition that table, the
>> behavior of triggers will change, which is maybe unexpected.  Maybe this
>> is not really a problem, but I'm not sure and would like further
>> opinions.
> 
> One could go either way on this, but I think reporting the actual table
> partition is acceptable and preferable.

+1

> If you are writing a generic
> trigger function, maybe to dump out all columns, you want to know the
> physical table and its actual columns.  It's easy[citation needed] to
> get the partition root for a given table, if the trigger code needs
> that.  The other way around is not possible.

I guess you mean the root where a trigger originated, that is, the ancestor
table on which an inherited trigger was originally defined.  It is
possible for a trigger to be defined on an intermediate parent and not the
topmost root in a partition tree.

I see that the only parent-child relationship for triggers created
recursively is recorded in the form of a dependency.  I wonder why not have a
flag in, say, pg_trigger to indicate that a trigger may have been created
recursively.  With the committed patch for inherited indexes, I can see that
inheritance is explicitly recorded in pg_inherits because indexes are
relations, so it's possible in the indexes' case to get the parent in
which a given inherited index originated.

> Similarly, transition tables should be OK.  You only get the current
> partition to look at, of course.

+1

> The function name CloneRowTriggersOnPartition() confused me.  A more
> accurate phrasing might be CloneRowTriggersToPartition(), or maybe
> reword altogether.

CloneRowTriggers*For*Partition()?

Thanks,
Amit




Re: non-bulk inserts and tuple routing

2018-01-30 Thread Etsuro Fujita

(2018/01/30 18:39), Amit Langote wrote:

Will wait for your comments before sending a new version then.


Ok, I'll post my comments as soon as possible.

Best regards,
Etsuro Fujita



Re: [HACKERS] Subscription code improvements

2018-01-30 Thread Masahiko Sawada
On Fri, Jan 26, 2018 at 11:41 AM, Peter Eisentraut
 wrote:
> On 1/24/18 02:33, Masahiko Sawada wrote:
>> Thank you for notification. Since it seems to me that no one is
>> interested in this patch, it would be better to close out this patch.
>> This patch is a refactoring patch but subscription code seems to work
>> fine so far. If a problem appears around subscriptions, I might
>> propose it again.
>
> I have looked at the patches again.

Thank you for looking at this.

>  They seem generally reasonable, but
> I don't see quite why we want or need them.  More details would help
> review them.  Do they fix any bugs, or are they just rearranging code?
>

The 0002 patch rearranges the code. Currently SetSubscriptionRelState()
not only updates but also registers a record, and this is controlled by
the update_only argument. This patch splits SetSubscriptionRelState() into
AddSubscriptionRelState() and UpdateSubscriptionRelState(). The 0001, 0003
and 0004 patches are improvement patches. The 0001 patch improves messaging
during worker startup. The 0004 patch, which requires the 0003 patch,
reduces the lock level for DROP SUBSCRIPTION to RowExclusiveLock.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: Add RANGE with values and exclusions clauses to the Window Functions

2018-01-30 Thread Oliver Ford
On Tuesday, 30 January 2018, Tom Lane  wrote:

> Another thing I'm a little confused by is the precise API for the in_range
> support functions (the lack of any documentation for it doesn't help).
> I wonder why you chose to provide two support functions per datatype
> combination rather than one with an additional boolean argument.
> In fact, it almost seems like the "end" flag could already do the
> job, though I may be missing something.  As-is, it seems like this
> setup involves a lot of duplicate code and catalog entries ... what
> are we gaining from that?
>
> regards, tom lane
>

We could instead remove the "desc" functions and flip the values of both
"preceding" and "end" for a descending order. It just needs an extra bool
in the parsenode/plannode structs to send to nodeWindowAgg.

I used two functions because it seemed cleaner to me to get the Oid of the
function in the parser for both ordering types, so then nodeWindowAgg
doesn't have to know about sort order (doesn't have to have extra
conditionals for the two). But yes it does increase the catalog and code
size so is probably better removed.

I will send out v10 soon with the desc functions removed and the
EXCLUDE_NO_OTHERS define removed.


Re: [PATCH] pgbench - refactor some connection finish/null into common function

2018-01-30 Thread Fabien COELHO


Hello Doug,


This patch refactors all of the connection state PQfinish() and NULL’ing into a 
single function.
Excludes PQfinish() in doConnect().


My 0.02€:

The argument could be "PGconn **" instead of a "CState *"?
If so, it may be used in a few more places. What is your opinion?

I'm fine with this kind of factorization which takes out a three-line 
pattern, but I'm wondering whether it would please committers.
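
For what it's worth, the factored-out helper would be tiny either way; a
sketch of the PGconn ** variant suggested above (the helper name here is
made up):

```
#include <libpq-fe.h>

/* Close a connection if open and clear the pointer, replacing the
 * repeated PQfinish()/NULL pattern. */
static void
finishCon(PGconn **conn)
{
    if (*conn != NULL)
    {
        PQfinish(*conn);
        *conn = NULL;
    }
}
```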


--
Fabien.

Re: [HACKERS] path toward faster partition pruning

2018-01-30 Thread Kyotaro HORIGUCHI
Hello, let me make some comments.

At Tue, 30 Jan 2018 09:57:44 +0900, Amit Langote 
 wrote in 
<4a7dda08-b883-6e5e-b0bf-f5ce95584...@lab.ntt.co.jp>
> Hi Jesper.
> 
> On 2018/01/29 22:13, Jesper Pedersen wrote:
> > Hi Amit,
> > 
> > On 01/26/2018 04:17 AM, Amit Langote wrote:
> >> I made a few of those myself in the updated patches.
> >>
> >> Thanks a lot again for your work on this.
> >>
> > 
> > This needs a rebase.
> 
> AFAICS, v22 cleanly applies to HEAD (c12693d8f3 [1]), compiles, and make
> check passes.

Yes, it cleanly applies to HEAD and seems working.

0001 seems fine.

I have some random comments on 0002.

-extract_partition_key_clauses implicitly assumes that the
 commutator of an inequality operator is the same as the original
 operator. (Commutation for such operators is an identity
 function.)

 I believe it is always true for a sane operator definition, but
 wouldn't it perhaps be better to use the commutator instead of
 opclause->opno as the source of the negator, if any? (Attached diff 1)

 
-get_partitions_from_or_clause_args abandons arg_partset with all
 bits set when the partition constraint doesn't refute the whole
 partition. Finally get_partitions_from_clauses does the same
 thing, but that's a waste of cycles and looks weird. It seems to have
 been intended to return immediately there.

  >   /* Couldn't eliminate any of the partitions. */
  >   partdesc = RelationGetPartitionDesc(context->relation);
  > - arg_partset = bms_add_range(NULL, 0, partdesc->nparts - 1);
  > + return bms_add_range(NULL, 0, partdesc->nparts - 1);
  > }
  >  
  > subcontext.rt_index = context->rt_index;
  > subcontext.relation = context->relation;
  > subcontext.clauseinfo = &partclauseinfo;
 !> arg_partset = get_partitions_from_clauses(&subcontext);

-get_partitions_from_or_clause_args converts IN (null) into a
 nulltest, and the nulltest doesn't exclude a child whose
 partition key column can be null.

 drop table if exists p;
 create table p (a int, b int) partition by list (a);
 create table c1 partition of p for values in (1, 5, 7);
 create table c2 partition of p for values in (4, 6, null);
 insert into p values (1, 0), (null, 0);
 
 explain select tableoid::regclass, * from p where a in (1, null);
 >QUERY PLAN
 > -
 >  Result  (cost=0.00..76.72 rows=22 width=12)
 >->  Append  (cost=0.00..76.50 rows=22 width=12)
 >  ->  Seq Scan on c1  (cost=0.00..38.25 rows=11 width=12)
 >Filter: (a = ANY ('{1,NULL}'::integer[]))
 >  ->  Seq Scan on c2  (cost=0.00..38.25 rows=11 width=12)
 >Filter: (a = ANY ('{1,NULL}'::integer[]))

 Although the clause "a in (null)" doesn't match the (null, 0)
 row, so it does no harm in the end, I don't think this is the right
 behavior. A null in an SAOP rightop should just be ignored during
 partition pruning (see the sketch after this list). Or is there any
 purpose to this behavior?


- In extract_bounding_datums, clauseinfo is set twice to the same
  value.
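
As referenced above, ignoring nulls while expanding the SAOP's array could
look roughly like the following; this is only a sketch of the technique, not
code taken from the patch:

```
#include "postgres.h"
#include "utils/array.h"
#include "utils/lsyscache.h"

/* Collect the non-null elements of an array Datum; a NULL element can
 * never satisfy "= ANY (...)", so it contributes nothing to pruning. */
static int
collect_nonnull_elements(Datum arraydatum, Oid elemtype, Datum **values_out)
{
    ArrayType  *arr = DatumGetArrayTypeP(arraydatum);
    int16       elemlen;
    bool        elembyval;
    char        elemalign;
    Datum      *elems;
    bool       *nulls;
    int         nelems;
    int         nvalues = 0;
    int         i;

    get_typlenbyvalalign(elemtype, &elemlen, &elembyval, &elemalign);
    deconstruct_array(arr, elemtype, elemlen, elembyval, elemalign,
                      &elems, &nulls, &nelems);

    *values_out = (Datum *) palloc(nelems * sizeof(Datum));
    for (i = 0; i < nelems; i++)
    {
        if (nulls[i])
            continue;           /* skip NULL: it cannot match anything */
        (*values_out)[nvalues++] = elems[i];
    }
    return nvalues;
}
```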

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index ab17524..a2488ab 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -2111,7 +2111,6 @@ extract_partition_key_clauses(PartitionKey partkey, List *clauses,
 			PartClause *pc;
 			Oid			partopfamily = partkey->partopfamily[i];
 			Oid			partcoll = partkey->partcollation[i];
-			Oid			commutator = InvalidOid;
 			AttrNumber	partattno = partkey->partattrs[i];
 			Expr	   *partexpr = NULL;
 
@@ -2144,6 +2143,7 @@ extract_partition_key_clauses(PartitionKey partkey, List *clauses,
 			if (IsA(clause, OpExpr))
 			{
 OpExpr	   *opclause = (OpExpr *) clause;
+Oid			comparator = opclause->opno;
 Expr	   *leftop,
 		   *rightop,
 		   *valueexpr;
@@ -2161,13 +2161,14 @@ extract_partition_key_clauses(PartitionKey partkey, List *clauses,
 	valueexpr = rightop;
 else if (EXPR_MATCHES_PARTKEY(rightop, partattno, partexpr))
 {
-	valueexpr = leftop;
-
-	commutator = get_commutator(opclause->opno);
+	Oid commutator = get_commutator(opclause->opno);
 
 	/* nothing we can do unless we can swap the operands */
 	if (!OidIsValid(commutator))
 		continue;
+
+	valueexpr = leftop;
+	comparator = commutator;
 }
 else
 	/* Clause does not match this partition key. */
@@ -2212,7 +2213,7 @@ extract_partition_key_clauses(PartitionKey partkey, List *clauses,
 	 * equality operator *and* this is a list partitioned
 	 * table, we can use it prune partitions.
 	 */
-	negator = get_negator(opclause->opno);
+	negator = get_negator(comparator);
 	if (OidIsValid(negator) &&
 		op_in_opfamily(negator, partopfamily))
 	{
@@ -2236,7 +2237,7 @@ extract_partition_key_clauses(PartitionKey partkey, List *clauses,
 }
 
 pc = (PartClause 

Re: [HACKERS] path toward faster partition pruning

2018-01-30 Thread Jesper Pedersen

Hi Amit,

On 01/29/2018 07:57 PM, Amit Langote wrote:

This needs a rebase.


AFAICS, v22 cleanly applies to HEAD (c12693d8f3 [1]), compiles, and make
check passes.



It was a rebase error; I should have checked against a clean master.

Sorry for the noise.

Best regards,
 Jesper




Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2018-01-30 Thread Pavel Stehule
2018-01-30 9:06 GMT+01:00 Tatsuro Yamada :

> On 2018/01/24 1:08, Pavel Stehule wrote:
>
>>
>>
>> 2018-01-22 23:15 GMT+01:00 Stephen Frost > sfr...@snowman.net>>:
>>
>> Pavel,
>>
>>
>> * Pavel Stehule (pavel.steh...@gmail.com > pavel.steh...@gmail.com>) wrote:
>> > here is a GUC based patch for plancache controlling. Looks so this
>> code is
>> > working.
>>
>> This really could use a new thread, imv.  This thread is a year old
>> and
>> about a completely different feature than what you've implemented
>> here.
>>
>>
>> true, but now it is too late
>>
>>
>> > It is hard to create regress tests. Any ideas?
>>
>> Honestly, my idea for this would be to add another option to EXPLAIN
>> which would make it spit out generic and custom plans, or something of
>> that sort.
>>
>>
>> I looked there, but it needs cycle dependency between CachedPlan and
>> PlannedStmt. It needs more code than this patch and code will not be nicer.
>> I try to do some game with prepared statements
>>
>> Please review your patches before sending them and don't send in
>> patches
>> which have random unrelated whitespace changes.
>>
>>  > diff --git a/src/backend/utils/cache/plancache.c
>> b/src/backend/utils/cache/plancache.c
>>  > index ad8a82f1e3..cc99cf6dcc 100644
>>  > --- a/src/backend/utils/cache/plancache.c
>>  > +++ b/src/backend/utils/cache/plancache.c
>>  > @@ -1031,6 +1033,12 @@ choose_custom_plan(CachedPlanSource
>> *plansource, ParamListInfo boundParams)
>>  >   if (IsTransactionStmtPlan(plansource))
>>  >   return false;
>>  >
>>  > + /* See if settings wants to force the decision */
>>  > + if (plancache_mode & PLANCACHE_FORCE_GENERIC_PLAN)
>>  > + return false;
>>  > + if (plancache_mode & PLANCACHE_FORCE_CUSTOM_PLAN)
>>  > + return true;
>>
>> Not a big deal, but I probably would have just expanded the
>> conditional
>> checks against cursor_options (just below) rather than adding
>> independent if() statements.
>>
>>
>> I don't think so proposed change is better - My opinion is not strong -
>> and commiter can change it simply
>>
>>
>>  > diff --git a/src/backend/utils/misc/guc.c
>> b/src/backend/utils/misc/guc.c
>>  > index ae22185fbd..4ce275e39d 100644
>>  > --- a/src/backend/utils/misc/guc.c
>>  > +++ b/src/backend/utils/misc/guc.c
>>  > @@ -3916,6 +3923,16 @@ static struct config_enum
>> ConfigureNamesEnum[] =
>>  >   NULL, NULL, NULL
>>  >   },
>>  >
>>  > + {
>>  > + {"plancache_mode", PGC_USERSET, QUERY_TUNING_OTHER,
>>  > + gettext_noop("Forces use of custom or
>> generic plans."),
>>  > + gettext_noop("It can control query plan
>> cache.")
>>  > + },
>>  > + &plancache_mode,
>>  > + PLANCACHE_DEFAULT, plancache_mode_options,
>>  > + NULL, NULL, NULL
>>  > + },
>>
>> That long description is shorter than the short description.  How
>> about:
>>
>> short: Controls the planner's selection of custom or generic plan.
>> long: Prepared queries have custom and generic plans and the planner
>> will attempt to choose which is better; this can be set to override
>> the default behavior.
>>
>>
>> changed - thank you for text
>>
>>
>> (either that, or just ditch the long description)
>>
>>  > diff --git a/src/include/utils/plancache.h
>> b/src/include/utils/plancache.h
>>  > index 87fab19f3c..962895cc1a 100644
>>  > --- a/src/include/utils/plancache.h
>>  > +++ b/src/include/utils/plancache.h
>>  > @@ -143,7 +143,6 @@ typedef struct CachedPlan
>>  >   MemoryContext context;  /* context containing this
>> CachedPlan */
>>  >  } CachedPlan;
>>  >
>>  > -
>>  >  extern void InitPlanCache(void);
>>  >  extern void ResetPlanCache(void);
>>  >
>>
>> Random unrelated whitespace change...
>>
>>
>> fixed
>>
>>
>> This is also missing documentation updates.
>>
>>
>> I wrote some basic doc, but native speaker should to write more words
>> about used strategies.
>>
>>
>> Setting to Waiting for Author, but with those changes I would think we
>> could mark it ready-for-committer, it seems pretty straight-forward to
>> me and there seems to be general agreement (now) that it's worthwhile
>> to
>> have.
>>
>> Thanks!
>>
>>
>> attached updated patch
>>
>> Regards
>>
>> Pavel
>>
>>
>> Stephen
>>
>
> Hi Pavel,
>
> I tested your patch on a044378ce2f6268a996c8cce2b7bfb5d82b05c90.
> I share my results with you.
>
>  - Result of patch command
>
> $ patch -p1 < guc-plancache_mode-180123.patch
> patching file doc/src/sgml/config.sgml
> Hunk #1 succeeded at 4400 (offset 13 lines).
> patching file src/backend/utils/cache/plancache.c
> patching file src/backend/utils/m
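
For reference, a minimal session exercising the proposed GUC might look like
the sketch below (the value names follow the PLANCACHE_FORCE_GENERIC_PLAN and
PLANCACHE_FORCE_CUSTOM_PLAN constants quoted above; the exact spellings and
the table t are assumptions, not taken from the patch text):

    CREATE TABLE t (a int PRIMARY KEY);
    PREPARE q(int) AS SELECT * FROM t WHERE a = $1;

    SET plancache_mode = 'force_custom_plan';
    EXECUTE q(1);   -- planned with the actual parameter value on every call

    SET plancache_mode = 'force_generic_plan';
    EXECUTE q(1);   -- the cached generic plan is used from the first call

    RESET plancache_mode;   -- back to the default cost-based choice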

Re: csv format for psql

2018-01-30 Thread Pavel Stehule
2018-01-30 9:31 GMT+01:00 Daniel Verite :

>  Hi,
>
>
> This patch implements csv as an output format in psql
> (\pset format csv). It's quite similar to the unaligned format,
> except that it applies CSV quoting rules (obviously!) and that
> it prints no footer and no title.
> As with unaligned, a header with column names is output unless
> tuples_only is on. It also supports the fieldsep/fieldsep_zero
> and recordsep/recordsep_zero settings.
>
> Most of the time, the need for CSV is covered by \copy or COPY with
> the CSV option, but there are some cases where it would be more
> practical to have it as an output format in psql.
>

I absolutely agree


> * \copy does not interpolate psql variables and is a single-line
> command, so making a query fit these constraints can be cumbersome.
> It can be got around by defining a temporary view and
> \copy from that view, but that doesn't work in a read-only context
> such as when connected to a standby.
>
> * the server-side COPY TO STDOUT can also be used from psql,
> typically with psql -c "COPY (query) TO STDOUT CSV" > file.csv,
> but that approach is too limited to extract multiple result sets per script.
> COPY is also more rigid than psql in the options to delimit
> fields and records.
>
> * copy with csv can't help with the output of meta-commands
> such as \gx, \crosstabview, \l, \d ... whereas a CSV format within psql
> does work with these.
>

This is great - I have missed this feature for a long time - it is
interesting for scripting, ETL, ...

This format is important enough that a special short or long option would be
practical (it would be printed in the help)

something like --csv

I found one issue - PostgreSQL's default field separator is "|". Maybe this is
a good time to switch to the more common ","?

Or, when the field separator has not been explicitly defined, use "," for CSV
and "|" for the other formats, although that can be a little bit messy.

Thank you

Pavel


>
>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>
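
Putting the above together, a psql session with the proposed format might look
like the sketch below (fieldsep is set explicitly since the default-separator
question raised above is still open; the output shown is simply what standard
CSV quoting rules would produce for these values):

    \pset format csv
    \pset fieldsep ','
    SELECT 1 AS id, 'hello, world' AS msg;
    id,msg
    1,"hello, world"

With tuples_only on, only the data line would be printed, without the header.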


Re: [HACKERS][PATCH] Applying PMDK to WAL operations for persistent memory

2018-01-30 Thread Robert Haas
On Tue, Jan 30, 2018 at 3:37 AM, Yoshimi Ichiyanagi
 wrote:
> Oracle and Microsoft SQL Server supported PMEM [1][2].
> I think it is not too early for PostgreSQL to support PMEM.

I agree; it's good to have the option available for those who have
access to the hardware.

If you haven't added your patch to the next CommitFest, please do so.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Robert Haas
On Mon, Jan 29, 2018 at 10:32 AM, Simon Riggs  wrote:
> The code is about 1200 lines and has extensive docs, comments and tests.
>
> There are no contentious infrastructure changes, so the debate around
> concurrency is probably the main one. So it looks to me like
> meaningful review has taken place, though I know Andrew and Pavan have
> also looked at it in detail.

Only design-level review, not detailed review of the code.  To be
clear, I think the design-level review was quite productive and I'm
glad it happened, but it's not a substitute for someone going over the
code in detail to look for problems.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Robert Haas
On Mon, Jan 29, 2018 at 7:15 PM, Michael Paquier
 wrote:
> On Mon, Jan 29, 2018 at 04:34:48PM +, Simon Riggs wrote:
>> In terms of timing of commits, I have marked the patch Ready For
>> Committer. To me that signifies that it is ready for review by a
>> Committer prior to commit.
>
> My understanding of this meaning is different than yours.  It should not
> be the author's role to mark his own patch as ready for committer, but
> the role of one or more people who have reviewed in-depth the proposed
> patch and feature concepts.  If you can get a committer-level individual
> to review your patch, then good for you.  But review basics need to
> happen first.  And based on my rough lookup of this thread this has not
> happened yet.  Other people on this thread are pointing out that as
> well.

+1 to all of that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Robert Haas
On Mon, Jan 29, 2018 at 12:03 PM, Simon Riggs  wrote:
> Partitioning doesn't look too bad, so that looks comfortable for PG11,
> assuming it doesn't hit some unhandled complexity.
>
> Including RLS in the first commit/release turns this into a high risk
> patch. Few people use it, but if they do, they don't want me putting a
> hole in their battleship (literally) should we discover some weird
> unhandled logic in a complex new command.
>
> My recommendation would be to support that later for those that use
> it. For those that don't, it doesn't matter so can also be done later.

-1.  Every other feature we've added recently, including partitioning,
has had to decide what to do about RLS before the initial commit, and
this feature shouldn't be exempt.  In general, newer features need to
work with older features unless there is some extremely good
architectural reason why that is unreasonably difficult.  If that is
the case here, I don't see that you've made an argument for it.  The
proper way to avoid having you put a hole in their battleship is good
code, proper code review, and good testing, not leaving that case off
to one side.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-30 Thread Robert Haas
On Tue, Jan 30, 2018 at 4:45 AM, Simon Riggs  wrote:
> "You're introducing a behavior/error"... No, I advocate nothing in the
> patch, I am merely following the agreement made here:
>
> https://www.postgresql.org/message-id/CA%2BTgmoYOyX4nyu9mbMdYTLzT9X-1RptxaTKSQfbSdpVGXgeAJQ%40mail.gmail.com
>
> Robert, Stephen, may we attempt to implement option 4 as Peter now suggests?

Simon, I don't think you should represent Peter as having changed his
position when he has already said that he didn't.  To be honest, the
discussion up to this point suggests to me that Peter has a
significantly better grip of the issues than you do, yet your posts
convey, at least to me, that you're quite sure he's wrong about a lot
of stuff, including whether or not his current statements are
compatible with his previous statements.  I think that the appropriate
course of action is for you and he to spend a few more emails trying
to actually get on the same page here.

As far as I am able to understand, the substantive issue here is what
to do when we match an invisible tuple rather than a visible tuple.
The patch currently throws a serialization error on the basis that you
(Simon) thought that's what was previously agreed.  Peter is arguing
that we don't normally issue a serialization error at READ COMMITTED
(which I think is true) and proposed that we instead try to INSERT.  I
don't necessarily think that's different from consensus to implement
option #3 from 
https://www.postgresql.org/message-id/CA%2BTgmoYOyX4nyu9mbMdYTLzT9X-1RptxaTKSQfbSdpVGXgeAJQ%40mail.gmail.com
because that point #3 says that we're not going to try to AVOID errors
under concurrency, not that we're going to create NEW errors.  In
other words, I understand Peter, then and now, to be saying that MERGE
should behave just as if invisible tuples didn't exist at all; if that
leads some other part of the system to throw an ERROR, then that's
what happens.  Presumably, in a case like this, that would be a common
outcome, because the merge would be performed on the basis of a unique
key and so inserting would trigger a duplicate key violation.  But
maybe not, because I don't think MERGE requires there to be a unique
key on that column, so maybe the insert would just work, or maybe the
conflicting transaction would abort just in time to let it work
anyway.

It is possible that I am confused, here, among other reasons because I
haven't read the code, but if I'm not confused then Peter's analysis
is correct.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
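
To make the scenario concrete, here is a sketch using the SQL-standard syntax
the patch follows (the table, column, and key names are purely illustrative):

    CREATE TABLE target (id int PRIMARY KEY, val text);
    CREATE TABLE source (id int, val text);
    INSERT INTO target VALUES (1, 'old');
    INSERT INTO source VALUES (1, 'new');

    -- Session A, concurrently: UPDATE target SET id = 2 WHERE id = 1;
    -- (or DELETE FROM target WHERE id = 1;) and COMMIT while session B's
    -- MERGE is already running.

    -- Session B, READ COMMITTED:
    MERGE INTO target t
    USING source s ON t.id = s.id
    WHEN MATCHED THEN UPDATE SET val = s.val
    WHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val);

    -- The open question: raise a serialization error here (what the patch
    -- currently does), or behave as if the now-invisible tuple did not
    -- exist and fall through to the NOT MATCHED arm, where the INSERT may
    -- or may not then fail with a duplicate-key error.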


