Re: [HACKERS] Add support for tuple routing to foreign partitions

2018-03-19 Thread Amit Langote
On 2018/03/19 20:25, Amit Langote wrote:
> That's all I have for now.

While testing this patch, I noticed a crash when performing EXPLAIN on
update of a partition tree containing foreign partitions.  Crash occurs in
postgresEndForeignRouting() due to the following Assert failing:

  Assert(fmstate != NULL);

It seems the problem is that ExecCleanupTupleRouting() invokes the
EndForeignRouting() function even if ri_PartitionIsValid is not set.  So I
suppose we need this:

 /*
- * If this is INSERT/UPDATE, allow any FDWs to shut down
+ * If this is INSERT/UPDATE, allow any FDWs to shut down if they have
+ * initialized tuple routing information at all.
  */
 if (node &&
+resultRelInfo->ri_PartitionIsValid &&
 resultRelInfo->ri_FdwRoutine != NULL &&
 resultRelInfo->ri_FdwRoutine->EndForeignRouting != NULL)
 resultRelInfo->ri_FdwRoutine->EndForeignRouting(node->ps.state,

BTW, the patch needs to be rebased because of two commits this morning:
ee49f [1] and ee0a1fc84 [2].

Thanks,
Amit

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ee49f

[2] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ee0a1fc84




XID-assigned idle transactions affect vacuum's job.

2018-03-19 Thread Masahiko Sawada
Hi,

Long-running transactions often annoy users because, while such a
transaction exists on a database, vacuum cannot reclaim dead tuples
efficiently. There are several reasons why they exist, but a common case
is that users or applications forget to commit or roll back a
transaction. That is, the transaction is not executing SQL and its state
shows as 'idle in transaction' in pg_stat_activity. Such transactions
don't affect vacuum's job as long as they have neither an assigned
transaction id nor a snapshot. However, if they have an xid they affect
vacuum's job even if they don't have a snapshot.

I think that to decide which deleted tuples must be preserved we don't
need to care about a backend's PGXACT.xid but must care about its
PGXACT.xmin. But the current GetOldestXmin considers both of them. I
guess one reason why GetOldestXmin does so is that it's also used to
determine where to truncate pg_subtrans. Is there any other reason? If
not, I'd like to change GetOldestXmin so that it looks only at
PGXACT.xmin for vacuum purposes. Addressing this issue would be
especially helpful for users who use the read committed transaction
isolation level.
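As a quick illustration (the table name is arbitrary), a session can hold
back vacuum this way even though it holds no snapshot:

-- session 1: acquire an xid, then go idle without committing
BEGIN;
SELECT txid_current();      -- forces assignment of a transaction id
-- session now shows as 'idle in transaction' in pg_stat_activity

-- session 2: dead tuples newer than that xid cannot be reclaimed
DELETE FROM test_tbl;
VACUUM VERBOSE test_tbl;    -- reports dead row versions that cannot be removed yet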

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: Problem while setting the fpw with SIGHUP

2018-03-19 Thread Dilip Kumar
On Tue, Mar 20, 2018 at 11:26 AM, Michael Paquier 
wrote:

> On Tue, Mar 20, 2018 at 10:43:55AM +0530, Dilip Kumar wrote:
> > I think that, like WALWriterProcess, we need to call InitXLogInsert for
> > CheckpointerProcess as well as for BgWriterProcess,
> > because earlier they were calling InitXLogInsert while checking
> > RecoveryInProgress before inserting WAL.
>
> /* don't set signals, bgwriter has its own agenda */
> +   InitXLOGAccess();
> +   InitXLogInsert()
>
> This is wrong, as the checkpointer is started as well on standbys, and
> that InitXLOGAccess initializes things for WAL generation like
> ThisTimeLineID.  So you should just call InitXLogInsert(), and a comment
> would be welcome for both the bgwriter and the checkpointer.
>

Yeah, you are right.  Fixed.
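For anyone following along, the corrected change boils down to something
like the following sketch against the switch in AuxiliaryProcessMain()
(placement and comments are illustrative, not the exact hunk from the v3
patch):

        case BgWriterProcess:
            /* don't set signals, bgwriter has its own agenda */
            /* allocate WAL-insert workspace; unlike InitXLOGAccess(), safe on a standby */
            InitXLogInsert();
            BackgroundWriterMain();
            proc_exit(1);       /* should never return */

        case CheckpointerProcess:
            /* don't set signals, checkpointer has its own agenda */
            InitXLogInsert();
            CheckpointerMain();
            proc_exit(1);       /* should never return */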

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


xlog-insert-critical-v3.patch
Description: Binary data


Re: [HACKERS] taking stdbool.h into use

2018-03-19 Thread Tom Lane
Michael Paquier  writes:
> On Tue, Mar 13, 2018 at 03:25:39PM -0400, Peter Eisentraut wrote:
>> So I'm going back to my proposal from December, to just use stdbool.h
>> when sizeof(bool) == 1, and add a static assertion to prevent other
>> configurations.

> So, on one side of the ring, we have more complicated patches to include
> so that support for sizeof(bool) == 4 becomes possible in the backend
> code, and on the opposite side one patch which restricts the use of
> stdbool.h to the case where the size is 1.  A size of 4 bytes for bool is
> defined in stdbool.h on a small set of platforms, so it could be
> tempting to use what is proposed here, but that feels like a
> half-baked integration.  Thoughts from others?

I think it'd be worth identifying exactly which platforms have
sizeof(bool) different from 1.  Are any of them things that anyone
cares about going forward?  The point of this patch is to ease
future development of extensions, but it's unlikely any extension
authors care about portability onto, say, macOS 10.4 (prairiedog).

regards, tom lane



Re: [HACKERS] taking stdbool.h into use

2018-03-19 Thread Michael Paquier
On Tue, Mar 13, 2018 at 03:25:39PM -0400, Peter Eisentraut wrote:
> After more digging, there are more problems with having a bool that is
> not 1 byte.  For example, pg_control has a bool field, so with a
> different bool size, pg_control would be laid out differently.  That
> would require changing all the mentions of bool to bool8 where they end
> up on disk somehow, as I had already done for the system catalog
> structures, but we don't know all the other places that would be affected.
> 
> So I'm going back to my proposal from December, to just use stdbool.h
> when sizeof(bool) == 1, and add a static assertion to prevent other
> configurations.

So, on one side of the ring, we have more complicated patches to include
so that support for sizeof(bool) == 4 becomes possible in the backend
code, and on the opposite side one patch which restricts the use of
stdbool.h to the case where the size is 1.  A size of 4 bytes for bool is
defined in stdbool.h on a small set of platforms, so it could be
tempting to use what is proposed here, but that feels like a
half-baked integration.  Thoughts from others?
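For reference, the check Peter proposes amounts to something like this
standalone sketch (PostgreSQL itself would use its own static-assert
macro rather than bare C11 _Static_assert; the message text is illustrative):

#include <stdbool.h>

/* Refuse to compile on any platform where stdbool.h's bool is not 1 byte. */
_Static_assert(sizeof(bool) == 1, "sizeof(bool) must be 1");

int
main(void)
{
    return 0;
}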
--
Michael


signature.asc
Description: PGP signature


Re: Problem while setting the fpw with SIGHUP

2018-03-19 Thread Michael Paquier
On Tue, Mar 20, 2018 at 10:43:55AM +0530, Dilip Kumar wrote:
> I think like WALWriterProcess, we need to call InitXLogInsert for the
> CheckpointerProcess as well as for the BgWriterProcess
> because earlier they were calling InitXLogInsert while check
> RecoveryInProgress before inserting the WAL.

/* don't set signals, bgwriter has its own agenda */
+   InitXLOGAccess();
+   InitXLogInsert()

This is wrong, as the checkpointer is started as well on standbys, and
that InitXLOGAccess initializes things for WAL generation like
ThisTimeLineID.  So you should just call InitXLogInsert(), and a comment
would be welcome for both the bgwriter and the checkpointer.
--
Michael


signature.asc
Description: PGP signature


Re: jsonpath

2018-03-19 Thread Tom Lane
Andrew Dunstan  writes:
>> That seems like a quite limited list of functions.  What about
>> reworking them to provide a way of calling them without risk of
>> exception?

> I haven't seen a response to this email. Do we need one before
> proceeding any further with jsonpath?

I've not been following this thread in detail, but IMO any code anywhere
that thinks that no error can be thrown out of non-straight-line code is
broken beyond redemption.  What, for example, happens if we get ENOMEM
within one of the elog.c functions?

I did look through 0007-jsonpath-arithmetic-error-handling-v12.patch,
and I can't believe that's seriously proposed for commit.  It's making
some pretty fragile changes in error handling, and so far as I can
find there is not even one line of commentary as to what the new
design rules are supposed to be.  Even if it's completely bug-free
today (which I would bet against), how could we keep it so?

regards, tom lane



Re: Problem while setting the fpw with SIGHUP

2018-03-19 Thread Dilip Kumar
On Fri, Mar 16, 2018 at 10:53 AM, Michael Paquier 
wrote:

> On Tue, Mar 13, 2018 at 05:04:04PM +0900, Michael Paquier wrote:
> > Instead of doing what you are suggesting, why not moving
> > InitXLogInsert() out of InitXLOGAccess() and change InitPostgres() so as
> > the allocations for WAL inserts is done unconditionally?  This has
> > the cost of also making this allocation even for backends which are
> > started during recovery, still we are talking about allocating a couple
> > of bytes in exchange of addressing completely all race conditions in
> > this area.  InitXLogInsert() does not depend on any post-recovery data
> > like ThisTimeLineId, so a split is possible.
>
> I have been hacking things this way, and it seems to me that it takes
> care of all this class of problems.  CreateCheckPoint() actually
> mentions that InitXLogInsert() cannot be called in a critical section,
> so the comments don't match the code.  I also think that we still want
> to be able to use RecoveryInProgress() in critical sections to do
> decision-making for the generation of WAL records
>

Thanks for the patch, the idea looks good to me.  Please find some comments
and updated patch.

I think that, like WALWriterProcess, we need to call InitXLogInsert for
CheckpointerProcess as well as for BgWriterProcess,
because earlier they were calling InitXLogInsert while checking
RecoveryInProgress before inserting WAL.

see below crash:
#0  0x7f89273a65d7 in raise () from /lib64/libc.so.6
#1  0x7f89273a7cc8 in abort () from /lib64/libc.so.6
#2  0x009fd24e in errfinish (dummy=0) at elog.c:556
#3  0x009ff70b in elog_finish (elevel=20, fmt=0xac0d1d "too much WAL data") at elog.c:1378
#4  0x00558766 in XLogRegisterData (data=0xf3efac "\001", len=1) at xloginsert.c:330
#5  0x0055080e in UpdateFullPageWrites () at xlog.c:9569
#6  0x007ea831 in UpdateSharedMemoryConfig () at checkpointer.c:1360
#7  0x007e95b1 in CheckpointerMain () at checkpointer.c:370
#8  0x00561680 in AuxiliaryProcessMain (argc=2,
argv=0x7fffcfd4bec0) at bootstrap.c:447

I have modified your patch and called InitXLogInsert for CheckpointerProcess
and BgWriterProcess as well. After that the
issue is resolved and fpw gets set properly.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


xlog-insert-critical-v2.patch
Description: Binary data


Re: jsonpath

2018-03-19 Thread Andrew Dunstan
On Tue, Mar 20, 2018 at 3:36 PM, Andrew Dunstan
 wrote:
> On Fri, Mar 2, 2018 at 8:27 AM, Alexander Korotkov
>  wrote:
>> On Fri, Mar 2, 2018 at 12:40 AM, Nikita Glukhov 
>> wrote:
>>>
>>> On 28.02.2018 06:55, Robert Haas wrote:
>>>
 On Mon, Feb 26, 2018 at 10:34 AM, Nikita Glukhov
  wrote:
>
> Attached 10th version of the jsonpath patches.
>
> 1. Fixed error handling in arithmetic operators.
>
> Now run-time errors in arithmetic operators are caught (added
> PG_TRY/PG_CATCH around the operators' function calls) and converted
> into
> Unknown values in predicates as required by the standard:

 I think we really need to rename PG_TRY and PG_CATCH or rethink this
 whole interface so that people stop thinking they can use it to
 prevent errors from being thrown.
>>>
>>>
>>> I understand that it is unsafe to call an arbitrary function inside PG_TRY
>>> without
>>> rethrowing caught errors in PG_CATCH, but in jsonpath only the
>>> following
>>> numeric and datetime functions with known behavior are called inside
>>> PG_TRY
>>> and only errors of category ERRCODE_DATA_EXCEPTION are caught:
>>>
>>> numeric_add()
>>> numeric_mul()
>>> numeric_div()
>>> numeric_mod()
>>> numeric_float8()
>>> float8in()
>>> float8_numeric()
>>> to_datetime()
>>
>>
>> That seems like a quite limited list of functions.  What about reworking
>> them
>> to provide a way of calling them without risk of exception?  For example, we
>> can
>> have a numeric_add_internal() function which fills a given data structure with
>> error information instead of throwing the error.  numeric_add() would be a
>> wrapper over numeric_add_internal(), which throws an error if the corresponding
>> data structure is filled.  In jsonpath we can call numeric_add_internal()
>> and
>> interpret errors in another way.  That seems better than using PG_TRY
>> and PG_CATCH.
>>
>
>
> I haven't seen a response to this email. Do we need one before
> proceeding any further with jsonpath?
>


Apologies. I see the reply now.

cheers

andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

2018-03-19 Thread Andrew Dunstan
On Fri, Mar 2, 2018 at 8:27 AM, Alexander Korotkov
 wrote:
> On Fri, Mar 2, 2018 at 12:40 AM, Nikita Glukhov 
> wrote:
>>
>> On 28.02.2018 06:55, Robert Haas wrote:
>>
>>> On Mon, Feb 26, 2018 at 10:34 AM, Nikita Glukhov
>>>  wrote:

 Attached 10th version of the jsonpath patches.

 1. Fixed error handling in arithmetic operators.

 Now run-time errors in arithmetic operators are caught (added
 PG_TRY/PG_CATCH around the operators' function calls) and converted
 into
 Unknown values in predicates as required by the standard:
>>>
>>> I think we really need to rename PG_TRY and PG_CATCH or rethink this
>>> whole interface so that people stop thinking they can use it to
>>> prevent errors from being thrown.
>>
>>
>> I understand that it is unsafe to call an arbitrary function inside PG_TRY
>> without
>> rethrowing caught errors in PG_CATCH, but in jsonpath only the
>> following
>> numeric and datetime functions with known behavior are called inside
>> PG_TRY
>> and only errors of category ERRCODE_DATA_EXCEPTION are caught:
>>
>> numeric_add()
>> numeric_mul()
>> numeric_div()
>> numeric_mod()
>> numeric_float8()
>> float8in()
>> float8_numeric()
>> to_datetime()
>
>
> That seems like a quite limited list of functions.  What about reworking
> them
> to provide a way of calling them without risk of exception?  For example, we
> can
> have a numeric_add_internal() function which fills a given data structure with
> error information instead of throwing the error.  numeric_add() would be a
> wrapper over numeric_add_internal(), which throws an error if the corresponding
> data structure is filled.  In jsonpath we can call numeric_add_internal()
> and
> interpret errors in another way.  That seems better than using PG_TRY
> and PG_CATCH.
>


I haven't seen a response to this email. Do we need one before
proceeding any further with jsonpath?
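For concreteness, the non-throwing wrapper pattern Alexander describes
above might look roughly like this; the struct layout and the _internal
function are illustrative only, not existing PostgreSQL API:

/* Illustrative sketch only -- not actual PostgreSQL code. */
typedef struct ArithError
{
    bool        has_error;
    int         sqlerrcode;     /* e.g. ERRCODE_DIVISION_BY_ZERO */
    const char *message;
} ArithError;

/* Does the arithmetic; reports failures through *err instead of ereport(). */
static Numeric numeric_add_internal(Numeric a, Numeric b, ArithError *err);

/* SQL-callable wrapper keeps today's throwing behavior. */
Datum
numeric_add(PG_FUNCTION_ARGS)
{
    ArithError  err = {0};
    Numeric     res = numeric_add_internal(PG_GETARG_NUMERIC(0),
                                           PG_GETARG_NUMERIC(1),
                                           &err);

    if (err.has_error)
        ereport(ERROR, (errcode(err.sqlerrcode), errmsg("%s", err.message)));
    PG_RETURN_NUMERIC(res);
}

/* jsonpath would call numeric_add_internal() directly and turn a filled-in
 * ArithError into an SQL/JSON Unknown value instead of raising an error. */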

cheers

andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Partition-wise aggregation/grouping

2018-03-19 Thread Ashutosh Bapat
On Mon, Mar 19, 2018 at 11:15 PM, Robert Haas  wrote:
> On Fri, Mar 16, 2018 at 1:50 PM, Ashutosh Bapat
>  wrote:
>>> This patch also renames can_parallel_agg to
>>> can_partial_agg and removes the parallelism-specific bits from it.
>>
>> I think we need to update the comments in this function to use phrase
>> "partial aggregation" instead of "parallel aggregation". And I think
>> we need to change the conditions as well. For example if
>> parse->groupClause == NIL, why can't we do partial aggregation? This
>> is the classic case where we will need partial aggregation. Probably
>> we should test this with Jeevan's patches for partition-wise aggregate
>> to see if it considers partition-wise aggregate or not.
>
> I think the case where we have neither any aggregates nor a grouping
> clause is where we are doing SELECT DISTINCT.

That's a case which would also benefit from partial partition-wise
grouping, where each partition produces its own distinct values, thus
reducing the number of rows that flow through an Append node (or,
better, over the network), and the finalization step removes the
duplicates. In fact, what we are attempting here is partition-wise
grouping, and partial aggregation happens to be a requirement for doing
that only if there are aggregates. If there are no aggregates, we still
benefit from partition-wise grouping. So, can_partial_agg should only
tell whether we can calculate partial aggregates. If there are
aggregates present and can_partial_agg is false, we cannot attempt
partition-wise grouping. But if there are no aggregates and
can_partial_agg is false, that shouldn't prohibit us from using
partition-wise aggregation.
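Restated as pseudocode (the function and variable names are made up,
just to capture the rule):

/* sketch of the rule described above */
if (parse->hasAggs)
{
    if (!can_partial_agg)
        return;                         /* partition-wise grouping impossible */
    consider_partitionwise_grouping();  /* needs partial aggregation */
}
else
    consider_partitionwise_grouping();  /* plain grouping, e.g. SELECT DISTINCT */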

> Something like SELECT
> COUNT(*) FROM ... is not this case because that has an aggregate.
>
> I'm sort of on the fence as to whether and how to update the comments.
> I agree that it's a little strange to leave the comments here
> referencing parallel aggregation when the function has been renamed to
> is_partial_agg(), but a simple search-and-replace doesn't necessarily
> improve the situation very much.

Hmm, agreed. And as you have mentioned downthread, it's not the job of
this patchset to attempt that. Maybe I will try providing a patch to
update the comments once we have committed PWA.

> Most notably, hasNonSerial is
> irrelevant for partial but non-parallel aggregation, but we still have
> the test because we haven't done the work to really do the right thing
> here, which is to separately track whether we can do parallel partial
> aggregation (either hasNonPartial or hasNonSerial is a problem) and
> non-parallel partial aggregation (only hasNonPartial is a problem).
> This needs a deeper reassessment, but I don't think that can or should
> be something we try to do right now.

I think this bit is easy to mention in the comments. We could say that
since partial aggregation currently relies on serialization and
deserialization while calculating partial aggregates, even though that
step is not strictly needed here, we still keep the hasNonSerial test
for partial aggregation to work. A future improvement that avoids
serialization and deserialization in partial aggregation when no
parallel query is involved should remove this condition from here and
treat it as a requirement only for parallel aggregation.

>
>> Or, when parse->groupingSets is true, I can see why we can't use
>> parallel query, but we can still compute partial aggregates. This
>> condition doesn't hurt since partition-wise aggregation bails out when
>> there are grouping sets, so it's not that harmful here.
>
> I haven't thought deeply about what will break when GROUPING SETS are
> in use, but it's not the purpose of this patch set to make them work
> where they didn't before.  The point of hoisting the first two tests
> out of this function was just to avoid doing repeated work when
> partition-wise aggregate is in use.  Those two tests could conceivably
> produce different results for different children, whereas the
> remaining tests won't give different answers.  Let's not get
> distracted by the prospect of improving the tests.  I suspect that's
> not anyway so simple to achieve as what you seem to be speculating
> here...

+1.


>
> I'm fine with using a structure to bundle details that need to be sent
> to the FDW, but why should the FDW need to know about
> can_sort/can_hash?  I suppose if it needs to know about
> can_partial_agg then it doesn't really cost anything to pass through
> all the flags, but I doubt whether the FDW has any use for the others.
> Anyway, if that's the goal, let's just make sure that the set of
> things we're passing that way are exactly the set of things that we
> think the FDW might need.

I am speculating that an FDW or a custom planner hook which does some
local processing, or hints something to the foreign server, could use
those flags. But I agree that unless we see such a requirement we
shouldn't expose those in the structure. It would be difficult to
remove them later.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB

Re: [HACKERS] GUC for cleanup indexes threshold.

2018-03-19 Thread Kyotaro HORIGUCHI
At Tue, 20 Mar 2018 13:57:19 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20180320.135719.90053076.horiguchi.kyot...@lab.ntt.co.jp>
> At Mon, 19 Mar 2018 20:50:48 +0900, Masahiko Sawada  
> wrote in 
> > On Mon, Mar 19, 2018 at 2:45 PM, Kyotaro HORIGUCHI
> >  wrote:
> > > At Mon, 19 Mar 2018 11:12:58 +0900, Masahiko Sawada 
> > >  wrote in 
> > > 
> > >> On Wed, Mar 14, 2018 at 9:25 PM, Alexander Korotkov
> > >>  wrote:
> > >> > On Wed, Mar 14, 2018 at 7:40 AM, Masahiko Sawada 
> > >> > 
> > >> > wrote:
> > >> >>
> > >> >> On Sat, Mar 10, 2018 at 3:40 AM, Alexander Korotkov
> > >> >>  wrote:
> > >> >> > On Fri, Mar 9, 2018 at 3:12 PM, Masahiko Sawada 
> > >> >> > 
> > >> >> > wrote:
> > >> >> >>
> > >> >> >> On Fri, Mar 9, 2018 at 8:43 AM, Alexander Korotkov
> > >> >> >>  wrote:
> > >> >> >> > 2) These parameters are reset during btbulkdelete() and set 
> > >> >> >> > during
> > >> >> >> > btvacuumcleanup().
> > >> >> >>
> > >> >> >> Can't we set these parameters even during btbulkdelete()? By 
> > >> >> >> keeping
> > >> >> them up to date, we will be able to avoid unnecessary cleanup 
> > >> >> >> vacuums
> > >> >> >> even after index bulk-delete.
> > >> >> >
> > >> >> >
> > >> >> > We certainly can update cleanup-related parameters during
> > >> >> > btbulkdelete().
> > >> >> > However, in this case we would update B-tree meta-page during each
> > >> >> > VACUUM cycle.  That may cause some overhead for non append-only
> > >> >> > workloads.  I don't think this overhead would be sensible, because 
> > >> >> > in
> > >> >> > non append-only scenarios VACUUM typically writes much more of
> > >> >> > information.
> > >> >> > But I would like this oriented to append-only workload patch to be
> > >> >> > as harmless as possible for other workloads.
> > >> >>
> > >> >> What overhead are you referring here? I guess the overhead is only the
> > >> >> calculating the oldest btpo.xact. And I think it would be harmless.
> > >> >
> > >> >
> > >> > I meant overhead of setting last_cleanup_num_heap_tuples after every
> > >> > btbulkdelete with wal-logging of meta-page.  I bet it also would be
> > >> > harmless, but I think that needs some testing.
> > >>
> > >> Agreed.
> > >>
> > >> After more thought, it might be too late but we can consider the
> > >> possibility of another idea proposed by Peter. Attached patch
> > >> addresses the original issue of index cleanups by storing the epoch
> > >> number of page deletion XID into PageHeader->pd_prune_xid which is
> > >> 4byte field.
> > >
> > > Mmm. It seems to me that the story is returning to the
> > > beginning. Could I try retelling the story?
> > >
> > > I understand that the initial problem was that vacuum runs apparently
> > > unnecessary full-scan on indexes many times. The reason for that
> > > is the fact that a cleanup scan may leave some (or many under
> > > certain condition) dead pages not-recycled but we don't know
> > > whether a cleanup is needed or not. They will be staying left
> > > forever unless we run additional cleanup-scans at the appropriate
> > > timing.
> > >
> > > (If I understand it correctly,) Sawada-san's latest proposal is
> > > (fundamentally the same to the first one,) just skipping the
> > > cleanup scan if the vacuum scan just before found that the number
> > > of *live* tuples has increased. If there were many deletions and
> > > insertions but no increase of total number of tuples, we don't
> > > have a cleanup. Consequently it had a wraparound problem and it
> > > is addressed in this version.
> > 
> > No, it doesn't have a wraparound problem. The patch based on Peter's
> > idea that I proposed adds an epoch number to the page deletion xid and
> > compares them when we judge whether the page is recyclable or not. It's
> > something like creating a 64-bit xid out of the deletion xid. Also, if
> > there is even one deletion, the bulk-delete will be performed instead of
> > the index cleanup. So with this patch we do the index cleanup only when
> > the reltuples of the table has increased by a fraction of
> > vacuum_index_cleanup_scale_factor from the previous stats. It doesn't
> > need to do the index cleanup by any xid thresholds.
> 
> 
> Perhaps you misunderstood me. I know the last patch doesn't have (or
> at least intends to get rid of) the problem, and I wrote that
> the problem was introduced by your *first* patch.
> 
> > > (ditto.) Alexander proposed to record the oldest xid of
> > > recyclable pages in metapage (and the number of tuples at the
> > > last cleanup). This prevents needless cleanup scan and surely
> > > runs cleanups to remove all recyclable pages.
> > 
> > Yes, but the concerns we discussed are that we need extra WAL-logging
> > for updating the metapage and it works only for append-only case. If
> > we also want to support more cases we will need to update the metapage
> > during bulk-delete. The overhead of WAL-logging would be harmless but
> > should be tested as Alexander mentioned.
> 
> Agreed.
> 
> > > I think that we can acc

Re: [HACKERS] GUC for cleanup indexes threshold.

2018-03-19 Thread Kyotaro HORIGUCHI
At Mon, 19 Mar 2018 20:50:48 +0900, Masahiko Sawada  
wrote in 
> On Mon, Mar 19, 2018 at 2:45 PM, Kyotaro HORIGUCHI
>  wrote:
> > At Mon, 19 Mar 2018 11:12:58 +0900, Masahiko Sawada  
> > wrote in 
> > 
> >> On Wed, Mar 14, 2018 at 9:25 PM, Alexander Korotkov
> >>  wrote:
> >> > On Wed, Mar 14, 2018 at 7:40 AM, Masahiko Sawada 
> >> > wrote:
> >> >>
> >> >> On Sat, Mar 10, 2018 at 3:40 AM, Alexander Korotkov
> >> >>  wrote:
> >> >> > On Fri, Mar 9, 2018 at 3:12 PM, Masahiko Sawada 
> >> >> > 
> >> >> > wrote:
> >> >> >>
> >> >> >> On Fri, Mar 9, 2018 at 8:43 AM, Alexander Korotkov
> >> >> >>  wrote:
> >> >> >> > 2) These parameters are reset during btbulkdelete() and set during
> >> >> >> > btvacuumcleanup().
> >> >> >>
> >> >> >> Can't we set these parameters even during btbulkdelete()? By keeping
> >> >> them up to date, we will be able to avoid unnecessary cleanup vacuums
> >> >> >> even after index bulk-delete.
> >> >> >
> >> >> >
> >> >> > We certainly can update cleanup-related parameters during
> >> >> > btbulkdelete().
> >> >> > However, in this case we would update B-tree meta-page during each
> >> >> > VACUUM cycle.  That may cause some overhead for non append-only
> >> >> > workloads.  I don't think this overhead would be sensible, because in
> >> >> > non append-only scenarios VACUUM typically writes much more of
> >> >> > information.
> >> >> > But I would like this oriented to append-only workload patch to be
> >> >> > as harmless as possible for other workloads.
> >> >>
> >> >> What overhead are you referring here? I guess the overhead is only the
> >> >> calculating the oldest btpo.xact. And I think it would be harmless.
> >> >
> >> >
> >> > I meant overhead of setting last_cleanup_num_heap_tuples after every
> >> > btbulkdelete with wal-logging of meta-page.  I bet it also would be
> >> > harmless, but I think that needs some testing.
> >>
> >> Agreed.
> >>
> >> After more thought, it might be too late but we can consider the
> >> possibility of another idea proposed by Peter. Attached patch
> >> addresses the original issue of index cleanups by storing the epoch
> >> number of page deletion XID into PageHeader->pd_prune_xid which is
> >> 4byte field.
> >
> > Mmm. It seems to me that the story is returning to the
> > beginning. Could I try retelling the story?
> >
> > I understand that the initial problem was that vacuum runs apparently
> > unnecessary full-scan on indexes many times. The reason for that
> > is the fact that a cleanup scan may leave some (or many under
> > certain condition) dead pages not-recycled but we don't know
> > whether a cleanup is needed or not. They will be staying left
> > forever unless we run additional cleanup-scans at the appropriate
> > timing.
> >
> > (If I understand it correctly,) Sawada-san's latest proposal is
> > (fundamentally the same to the first one,) just skipping the
> > cleanup scan if the vacuum scan just before found that the number
> > of *live* tuples has increased. If there were many deletions and
> > insertions but no increase of total number of tuples, we don't
> > have a cleanup. Consequently it had a wraparound problem and it
> > is addressed in this version.
> 
> No, it doesn't have a wraparound problem. The patch based on Peter's
> idea that I proposed adds an epoch number to the page deletion xid and
> compares them when we judge whether the page is recyclable or not. It's
> something like creating a 64-bit xid out of the deletion xid. Also, if
> there is even one deletion, the bulk-delete will be performed instead of
> the index cleanup. So with this patch we do the index cleanup only when
> the reltuples of the table has increased by a fraction of
> vacuum_index_cleanup_scale_factor from the previous stats. It doesn't
> need to do the index cleanup by any xid thresholds.
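(To illustrate the "64-bit xid" idea in the quoted text -- a generic
sketch, not the code from the proposed patch; names are invented:)

/* Widen a 32-bit xid with its epoch so comparisons survive xid wraparound. */
static inline uint64
full_xid(uint32 epoch, TransactionId xid)
{
    return ((uint64) epoch << 32) | xid;
}

/* A deleted page is recyclable once its deletion xid is older than the
 * oldest xid any running transaction could still see. */
static bool
page_is_recyclable(uint32 del_epoch, TransactionId del_xid,
                   uint32 cur_epoch, TransactionId oldest_xmin)
{
    return full_xid(del_epoch, del_xid) < full_xid(cur_epoch, oldest_xmin);
}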


Perhaps you misunderstood me. I know the last patch doesn't have (or
at least intends to get rid of) the problem, and I wrote that
the problem was introduced by your *first* patch.

> > (ditto.) Alexander proposed to record the oldest xid of
> > recyclable pages in metapage (and the number of tuples at the
> > last cleanup). This prevents needless cleanup scan and surely
> > runs cleanups to remove all recyclable pages.
> 
> Yes, but the concerns we discussed are that we need extra WAL-logging
> for updating the metapage and it works only for append-only case. If
> we also want to support more cases we will need to update the metapage
> during bulk-delete. The overhead of WAL-logging would be harmless but
> should be tested as Alexander mentioned.

Agreed.

> > I think that we can accept Sawada-san's proposal if we accept the
> > fact that indexes can retain recyclable pages for a long
> > time. (Honestly I don't think so.)
> >
> > If (as I might have mentioned as the same upthread for Yura's
> > patch,) we accept to hold the information on index meta page,
> > Alexander's way would be preferable. The difference between Yura's
> > and Alexander's is that the former runs a cleanup scan if a recyclable
>

Re: ON CONFLICT DO UPDATE for partitioned tables

2018-03-19 Thread Amit Langote
On 2018/03/20 13:30, Amit Langote wrote:
> I have incorporated your patch in the main patch after updating the
> comments a bit.  Also, now that ee49f49 is in [1], the transition
> table related tests I proposed yesterday pass nicely.  Instead of posting
> as a separate patch, I have merged it with the main patch.  So now that
> planner refactoring is unnecessary, attached is just one patch.

Sorry, I forgot to remove a hunk in the patch affecting
src/include/optimizer/prep.h.  Fixed in the attached updated version.

Thanks,
Amit
>From 793b407545b0d24715f0d44a9a546689e9a4282a Mon Sep 17 00:00:00 2001
From: amit 
Date: Tue, 20 Mar 2018 10:09:38 +0900
Subject: [PATCH v7] Fix ON CONFLICT to work with partitioned tables

Author: Amit Langote, Alvaro Herrera, Etsuro Fujita
---
 doc/src/sgml/ddl.sgml |  15 --
 src/backend/catalog/heap.c|   2 +-
 src/backend/catalog/partition.c   |  62 +--
 src/backend/commands/tablecmds.c  |  15 +-
 src/backend/executor/execPartition.c  | 237 --
 src/backend/executor/nodeModifyTable.c|  93 --
 src/backend/parser/analyze.c  |   7 -
 src/include/catalog/partition.h   |   2 +-
 src/include/executor/execPartition.h  |  10 ++
 src/include/nodes/execnodes.h |   1 +
 src/test/regress/expected/insert_conflict.out |  73 ++--
 src/test/regress/expected/triggers.out|  33 
 src/test/regress/sql/insert_conflict.sql  |  64 +--
 src/test/regress/sql/triggers.sql |  33 
 14 files changed, 551 insertions(+), 96 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 3a54ba9d5a..8805b88d82 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3324,21 +3324,6 @@ ALTER TABLE measurement ATTACH PARTITION 
measurement_y2008m02
 
  
   
-   Using the ON CONFLICT clause with partitioned tables
-   will cause an error if the conflict target is specified (see
-for more details on how the clause
-   works).  Therefore, it is not possible to specify
-   DO UPDATE as the alternative action, because
-   specifying the conflict target is mandatory in that case.  On the other
-   hand, specifying DO NOTHING as the alternative action
-   works fine provided the conflict target is not specified.  In that case,
-   unique constraints (or exclusion constraints) of the individual leaf
-   partitions are considered.
-  
- 
-
- 
-  
When an UPDATE causes a row to move from one
partition to another, there is a chance that another concurrent
UPDATE or DELETE misses this row.
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 3d80ff9e5b..13489162df 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -1776,7 +1776,7 @@ heap_drop_with_catalog(Oid relid)
elog(ERROR, "cache lookup failed for relation %u", relid);
if (((Form_pg_class) GETSTRUCT(tuple))->relispartition)
{
-   parentOid = get_partition_parent(relid);
+   parentOid = get_partition_parent(relid, false);
LockRelationOid(parentOid, AccessExclusiveLock);
 
/*
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 786c05df73..8dc73ae092 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -192,6 +192,7 @@ static int  
get_partition_bound_num_indexes(PartitionBoundInfo b);
 static int get_greatest_modulus(PartitionBoundInfo b);
 static uint64 compute_hash_value(int partnatts, FmgrInfo *partsupfunc,
 Datum *values, 
bool *isnull);
+static Oid get_partition_parent_recurse(Relation inhRel, Oid relid, bool 
getroot);
 
 /*
  * RelationBuildPartitionDesc
@@ -1384,24 +1385,43 @@ check_default_allows_bound(Relation parent, Relation 
default_rel,
 
 /*
  * get_partition_parent
+ * Obtain direct parent or topmost ancestor of given relation
  *
- * Returns inheritance parent of a partition by scanning pg_inherits
+ * Returns direct inheritance parent of a partition by scanning pg_inherits;
+ * or, if 'getroot' is true, the topmost parent in the inheritance hierarchy.
  *
  * Note: Because this function assumes that the relation whose OID is passed
  * as an argument will have precisely one parent, it should only be called
  * when it is known that the relation is a partition.
  */
 Oid
-get_partition_parent(Oid relid)
+get_partition_parent(Oid relid, bool getroot)
+{
+   RelationinhRel;
+   Oid parentOid;
+
+   inhRel = heap_open(InheritsRelationId, AccessShareLock);
+
+   parentOid = get_partition_parent_recurse(inhRel, relid, getroot);
+   if (parentOid == InvalidOid)
+   elog(ERROR, "could not find parent of rel

Re: [HACKERS] plpgsql - additional extra checks

2018-03-19 Thread Pavel Stehule
2018-03-19 21:47 GMT+01:00 Tomas Vondra :

> Hi,
>
> I'm looking at the updated patch (plpgsql-extra-check-180316.patch), and
> this time it applies and builds OK. The one thing I noticed is that the
> documentation still uses the old wording for strict_multi_assignement:
>
> WARNING:  Number of evaluated fields does not match expected.
> HINT:  strict_multi_assignement check of extra_warnings is active.
> WARNING:  Number of evaluated fields does not match expected.
> HINT:  strict_multi_assignement check of extra_warnings is active.
>
> This was reworded to "Number of source and target fields in assignment
> does not match."
>

fixed

Regards

Pavel


>
> Otherwise it seems fine to me, and I'm tempted to mark it RFC once the
> docs get fixed. Stephen, any objections?
>
> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 6c25116538..2027e35063 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -5003,7 +5003,7 @@ a_output := a_output || $$ if v_$$ || referrer_keys.kind || $$ like '$$
 
   
   
-   Additional Compile-time Checks
+   Additional Compile-time and Run-time Checks
 

 To aid the user in finding instances of simple but common problems before
@@ -5015,26 +5015,62 @@ a_output := a_output || $$ if v_$$ || referrer_keys.kind || $$ like '$$
 so you are advised to test in a separate development environment.

 
- 
-  These additional checks are enabled through the configuration variables
-  plpgsql.extra_warnings for warnings and
-  plpgsql.extra_errors for errors. Both can be set either to
-  a comma-separated list of checks, "none" or "all".
-  The default is "none". Currently the list of available checks
-  includes only one:
-  
-   
-shadowed_variables
-
- 
-  Checks if a declaration shadows a previously defined variable.
- 
-
-   
-  
+   
+Setting plpgsql.extra_warnings, or
+plpgsql.extra_errors, as appropriate, to all
+is encouraged in development and/or testing environments.
+   
+
+   
+These additional checks are enabled through the configuration variables
+plpgsql.extra_warnings for warnings and
+plpgsql.extra_errors for errors. Both can be set either to
+a comma-separated list of checks, "none" or
+"all". The default is "none". Currently
+the list of available checks includes only one:
+
+ 
+  shadowed_variables
+  
+   
+Checks if a declaration shadows a previously defined variable.
+   
+  
+ 
 
-  The following example shows the effect of plpgsql.extra_warnings
-  set to shadowed_variables:
+ 
+  strict_multi_assignment
+  
+   
+Some PL/PgSQL commands allow assigning values
+to more than one variable at a time, such as SELECT INTO.  Typically,
+the number of target variables and the number of source variables should
+match, though PL/PgSQL will use NULL for
+missing values and extra variables are ignored.  Enabling this check
+will cause PL/PgSQL to throw a WARNING or
+ERROR whenever the number of target variables and the number of source
+variables are different.
+   
+  
+ 
+
+ 
+  too_many_rows
+  
+   
+Enabling this check will cause PL/PgSQL to
+check if a given query returns more than one row when an
+INTO clause is used.  As an INTO statement will only
+ever use one row, having a query return multiple rows is generally
+either inefficient and/or nondeterministic and therefore is likely an
+error.
+   
+  
+ 
+
+
+The following example shows the effect of plpgsql.extra_warnings
+set to shadowed_variables:
 
 SET plpgsql.extra_warnings TO 'shadowed_variables';
 
@@ -5050,8 +5086,38 @@ LINE 3: f1 int;
 ^
 CREATE FUNCTION
 
- 
- 
+The below example shows the effects of setting
+plpgsql.extra_warnings to
+strict_multi_assignment:
+
+SET plpgsql.extra_warnings TO 'strict_multi_assignment';
+
+CREATE OR REPLACE FUNCTION public.foo()
+ RETURNS void
+ LANGUAGE plpgsql
+AS $$
+DECLARE
+  x int;
+  y int;
+BEGIN
+  SELECT 1 INTO x, y;
+  SELECT 1, 2 INTO x, y;
+  SELECT 1, 2, 3 INTO x, y;
+END;
+$$
+
+SELECT foo();
+WARNING:  Number of source and target fields in assignment does not match.
+HINT:  strict_multi_assignement check of extra_warnings is active.
+WARNING:  Number of source and target fields in assignment does not match.
+HINT:  strict_multi_assignement check of extra_warnings is active.
+ foo 
+-
+ 
+(1 row)
+
+   
+  
  
 
   
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 38ea7a091f..2d3dc66726 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -3929,6 +3929,12 @@ exec_stmt_execsql(PLpgSQL_execstate *estate,
 	lon

Re: ON CONFLICT DO UPDATE for partitioned tables

2018-03-19 Thread Amit Langote
Fujita-san,

On 2018/03/19 21:59, Etsuro Fujita wrote:
> (2018/03/18 13:17), Alvaro Herrera wrote:
>> Alvaro Herrera wrote:
>> The only thing that I remain unhappy about this patch is the whole
>> adjust_and_expand_partition_tlist() thing.  I fear we may be doing
>> redundant and/or misplaced work.  I'll look into it next week.
> 
> I'm still reviewing the patches, but I really agree on that point.  As
> Pavan mentioned upthread, the onConflictSet tlist for the root parent,
> from which we create a translated onConflictSet tlist for a partition,
> would have already been processed by expand_targetlist() to contain all
> missing columns as well, so I think we could create the tlist for the
> partition by simply re-ordering the expression-converted tlist (ie,
> conv_setproj) based on the conversion map for the partition.  The attached
> defines a function for that, which could be called instead of calling
> adjust_and_expand_partition_tlist().  This would allow us to get rid of
> planner changes from the patches.  Maybe I'm missing something, though.

Thanks for the patch.  I can confirm your proposed
adjust_onconflictset_tlist() is enough to replace adjust_inherited_tlist()
+ expand_targetlist() combination (aka
adjust_and_expand_partition_tlist()), thus rendering the planner changes
in this patch unnecessary.  I tested it with a partition tree involving
partitions of varying attribute numbers (dropped columns included) and it
seems to work as expected (as also exercised in regression tests) as shown
below.

Partitioned table p has partitions p1, p2, p3, p4, and p5 whose attributes
look like this; shown as (colname: attnum, ...).

p:  (a: 1, b: 4)
p1: (a: 1, b: 4)
p2: (a: 2, b: 4)
p3: (a: 1, b: 3)
p4: (a: 3, b: 8)
p5: (a: 1, b: 5)

You may notice that all partitions but p1 will have a tuple conversion map
and hence will undergo adjust_onconflictset_tlist() treatment.

insert into p values (1, 'a') on conflict (a) do update set b = excluded.b
where excluded.b = 'b';
INSERT 0 1

insert into p values (1, 'b') on conflict (a) do update set b = excluded.b
where excluded.b = 'b';
INSERT 0 1

insert into p values (1, 'c') on conflict (a) do update set b = excluded.b
where excluded.b = 'b';
INSERT 0 0

insert into p values (1) on conflict (a) do update set b = excluded.b
where excluded.b = 'b';
INSERT 0 0

insert into p values (2, 'a') on conflict (a) do update set b = excluded.b
where excluded.b = 'b';
INSERT 0 1

insert into p values (2, 'b') on conflict (a) do update set b = excluded.b
where excluded.b = 'b';
INSERT 0 1

insert into p values (2, 'c') on conflict (a) do update set b = excluded.b
where excluded.b = 'b';
INSERT 0 0

insert into p values (5, 'a') on conflict (a) do update set b = excluded.b
where excluded.b = 'b';
INSERT 0 1

insert into p values (5, 'b') on conflict (a) do update set b = excluded.b
where excluded.b = 'b';
INSERT 0 1

insert into p values (5, 'c') on conflict (a) do update set b = excluded.b
where excluded.b = 'b';
INSERT 0 0

insert into p values (5) on conflict (a) do update set b = excluded.b
where excluded.b = 'b';
INSERT 0 0

select tableoid::regclass, * from p;
 tableoid | a | b
--+---+---
 p1   | 1 | b
 p2   | 2 | b
 p5   | 5 | b
(3 rows)

I have incorporated your patch in the main patch after updating the
comments a bit.  Also, now that ee49f49 is in [1], the transition
table related tests I proposed yesterday pass nicely.  Instead of posting
as a separate patch, I have merged it with the main patch.  So now that
planner refactoring is unnecessary, attached is just one patch.

Thanks,
Amit

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ee49f49
>From ac4f82d67720994ff8c632515bcf6760542c0d2f Mon Sep 17 00:00:00 2001
From: amit 
Date: Tue, 20 Mar 2018 10:09:38 +0900
Subject: [PATCH v6] Fix ON CONFLICT to work with partitioned tables

Author: Amit Langote, Alvaro Herrera, Etsuro Fujita
---
 doc/src/sgml/ddl.sgml |  15 --
 src/backend/catalog/heap.c|   2 +-
 src/backend/catalog/partition.c   |  62 +--
 src/backend/commands/tablecmds.c  |  15 +-
 src/backend/executor/execPartition.c  | 237 --
 src/backend/executor/nodeModifyTable.c|  93 --
 src/backend/parser/analyze.c  |   7 -
 src/include/catalog/partition.h   |   2 +-
 src/include/executor/execPartition.h  |  10 ++
 src/include/nodes/execnodes.h |   1 +
 src/include/optimizer/prep.h  |   9 +
 src/test/regress/expected/insert_conflict.out |  73 ++--
 src/test/regress/expected/triggers.out|  33 
 src/test/regress/sql/insert_conflict.sql  |  64 +--
 src/test/regress/sql/triggers.sql |  33 
 15 files changed, 560 insertions(+), 96 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 3a54ba9d5a..8805b88d82 100644
--- a/doc/src/sgml/dd

Re: Trigger file behavior with the standby

2018-03-19 Thread Michael Paquier
On Mon, Mar 19, 2018 at 01:27:21PM -0700, Keiko Oda wrote:
> I'm seeing the following behavior with a trigger file, which is very
> confusing to me; I'd like to get some advice on what the expected
> behavior of the trigger file with the standby is.

This portion from the docs includes your answer:
https://www.postgresql.org/docs/devel/static/warm-standby.html#STANDBY-SERVER-OPERATION
"Standby mode is exited and the server switches to normal operation when
pg_ctl promote is run or a trigger file is found (trigger_file). Before
failover, any WAL immediately available in the archive or in pg_wal will
be restored, but no attempt is made to connect to the master."

So when creating a trigger file or signaling for promotion, any WAL
files available are first fetched, and then promotion happens.  In your
case all the WAL segments from the archives are retrieved first.
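As an illustration (the paths are arbitrary), with a standby configured
like this:

    # recovery.conf on the standby
    standby_mode = 'on'
    restore_command = 'cp /path/to/archive/%f %p'
    trigger_file = '/tmp/postgresql.trigger'

running "touch /tmp/postgresql.trigger" makes the standby first restore
whatever segments restore_command can still fetch, and only then exit
recovery and promote.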
--
Michael


signature.asc
Description: PGP signature


Re: IndexJoin memory problem using spgist and boxes

2018-03-19 Thread Tom Lane
Alexander Kuzmenkov  writes:
> The updated version looks good to me.

LGTM too.  Pushed with some trivial cosmetic adjustments.

regards, tom lane



Re: pg_get_functiondef forgets about most GUC_LIST_INPUT GUCs

2018-03-19 Thread Kyotaro HORIGUCHI
At Mon, 19 Mar 2018 23:07:13 -0400, Tom Lane  wrote in 
<10037.1521515...@sss.pgh.pa.us>
> Michael Paquier  writes:
> > On Mon, Mar 19, 2018 at 07:15:36PM -0400, Tom Lane wrote:
> >> This is a good thing not least because all the GUC_LIST_QUOTE variables
> >> are in core.  I've been trying to think of a way that we could have
> >> correct behavior for variables that the core backend doesn't know about
> >> and whose extension shlibs aren't currently loaded, and I can't come up
> >> with one.  So I think that in practice, we have to document that
> >> GUC_LIST_QUOTE is unsupported for extension variables
> 
> > I would propose to add a sentence on the matter at the bottom of the
> > CREATE FUNCTION page.  Even with that, the handling of items in the list
> > is incorrect with any patches on this thread: the double quotes should
> > be applied to each element of the list, not the whole list.
> 
> No, because all the places we are worried about are interpreting the
> contents of proconfig or setconfig columns, and we know that that info
> has been through flatten_set_variable_args().  If that function thought
> that GUC_LIST_QUOTE was applicable, it already did the appropriate
> quoting, and we can't quote over that without making a mess.  But if it
> did not think that GUC_LIST_QUOTE was applicable, then its output can
> just be treated as a single string, and that will work fine.
> 
> Our problem basically is that if we don't know the variable that was
> being processed, we can't be sure whether GUC_LIST_QUOTE was used.
> I don't currently see a solution to that other than insisting that
> GUC_LIST_QUOTE only be used for known core GUCs.

The documentation of the SET command says the following. (CREATE
FUNCTION says something a bit different but seems to work the same way.)

https://www.postgresql.org/docs/10/static/sql-set.html

> SET [ SESSION | LOCAL ] configuration_parameter { TO | = } { value | 'value' 
> | DEFAULT }
and

> value
> 
>   New value of parameter. Values can be specified as string
>   constants, identifiers, numbers, or comma-separated lists of
>   these, as appropriate for the particular parameter. DEFAULT can
>   be written to specify resetting the parameter to its default
>   value (that is, whatever value it would have had if no SET had
>   been executed in the current session).

According to this description, a comma-separated list enclosed
in single quotes is a valid list value. (I remember I was
trapped by that description.)

This may be a stupid question, but couldn't we treat a quoted
comma-separated list as a bare list if it is the only value in the
argument? I think no one sets values containing commas for
temp_tablespaces, *_preload_libraries or search_path.

But of course it may break something, and may break some
extensions.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***
*** 6859,6864  GetConfigOptionResetString(const char *name)
--- 6859,6891 
  	return NULL;
  }
  
+ /*
+  * quote_list_identifiers
+  *		quote each element in the given list string
+  */
+ static const char *
+ quote_list_identifiers(char *str)
+ {
+ 	StringInfoData buf;
+ 	List		   *namelist;
+ 	ListCell	   *lc;
+ 
+ 	/* Just quote the whole if this is not a list */
+ 	if (!SplitIdentifierString(str, ',', &namelist))
+ 		return quote_identifier(str);
+ 
+ 	initStringInfo(&buf);
+ 	foreach (lc, namelist)
+ 	{
+ 		char *elmstr = (char *) lfirst(lc);
+ 
+ 		if (lc != list_head(namelist))
+ 			appendStringInfoString(&buf, ", ");
+ 		appendStringInfoString(&buf, quote_identifier(elmstr));
+ 	}
+ 
+ 	return buf.data;
+ }
  
  /*
   * flatten_set_variable_args
***
*** 6973,6979  flatten_set_variable_args(const char *name, List *args)
  	 * quote it if it's not a vanilla identifier.
  	 */
  	if (flags & GUC_LIST_QUOTE)
! 		appendStringInfoString(&buf, quote_identifier(val));
  	else
  		appendStringInfoString(&buf, val);
  }
--- 7000,7018 
  	 * quote it if it's not a vanilla identifier.
  	 */
  	if (flags & GUC_LIST_QUOTE)
! 	{
! 		if (list_length(args) > 1)
! 			appendStringInfoString(&buf, quote_identifier(val));
! 		else
! 		{
! 			/*
! 			 * If we have only one elment, it may be a list
! 			 * that is quoted as a whole.
! 			 */
! 			appendStringInfoString(&buf,
!    quote_list_identifiers(val));
! 		}
! 	}
  	else
  		appendStringInfoString(&buf, val);
  }


Re: inserts into partitioned table may cause crash

2018-03-19 Thread Etsuro Fujita

(2018/03/20 9:34), Amit Langote wrote:

On 2018/03/20 5:54, Alvaro Herrera wrote:

Etsuro Fujita wrote:


Thanks for the updated patches!  I think the patches are in good shape, but
I did a bit of editorial things; added a bit more comments for
ExecPrepareTupleRouting and adjusted regression test stuff to match the
existing ones.  Attached are the updated patches for HEAD and PG10. Those
changes are just editorial, so let's ask for the committer review.


Thanks, looks good!  I made another pass over the comments (a few looked
too much like they were in a larger function) and pushed to both
branches.


Thank you!


Thanks, Alvaro and Amit!

Best regards,
Etsuro Fujita



Re: configure's checks for --enable-tap-tests are insufficient

2018-03-19 Thread Michael Paquier
On Mon, Mar 19, 2018 at 10:54:59PM -0400, Chapman Flack wrote:
> Ehh, I'm with Craig. The *perl project* identifies which packages (and
> versions) are part of perl core in each perl version ... not Red Hat.
> 
> http://cpansearch.perl.org/src/BINGOS/Module-CoreList-5.20180220/lib/Module/CoreList.pm
> 
> Red Hat's practice seems super reckless given that there's an actual
> module (itself part of core since 5.8.9) that lets you check what
> modules you can safely assume are there. I'm sure I'm not the only one
> to have written code relying on it.

At the same time I don't think that there is much we can do to reduce
the split in those distributions.  Even if it were to be changed in the
future, that won't affect existing deployments, past OS base images, or
maintenance releases.  So, to simplify the life of anybody running the
TAP tests on those distributions, I would be on board with Tom in adding
more checks at the configure level.
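For example, the gist of such a check is simply verifying that the
modules the TAP suite needs actually load (IPC::Run being the one that
is never in core anyway); the exact configure wiring is a separate matter:

    $ perl -e 'use Test::More; use Time::HiRes; use IPC::Run;' \
        || echo "Perl modules required for --enable-tap-tests are missing"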
--
Michael


signature.asc
Description: PGP signature


Re: pg_get_functiondef forgets about most GUC_LIST_INPUT GUCs

2018-03-19 Thread Tom Lane
Michael Paquier  writes:
> On Mon, Mar 19, 2018 at 07:15:36PM -0400, Tom Lane wrote:
>> This is a good thing not least because all the GUC_LIST_QUOTE variables
>> are in core.  I've been trying to think of a way that we could have
>> correct behavior for variables that the core backend doesn't know about
>> and whose extension shlibs aren't currently loaded, and I can't come up
>> with one.  So I think that in practice, we have to document that
>> GUC_LIST_QUOTE is unsupported for extension variables

> I would propose to add a sentence on the matter at the bottom of the
> CREATE FUNCTION page.  Even with that, the handling of items in the list
> is incorrect with any patches on this thread: the double quotes should
> be applied to each element of the list, not the whole list.

No, because all the places we are worried about are interpreting the
contents of proconfig or setconfig columns, and we know that that info
has been through flatten_set_variable_args().  If that function thought
that GUC_LIST_QUOTE was applicable, it already did the appropriate
quoting, and we can't quote over that without making a mess.  But if it
did not think that GUC_LIST_QUOTE was applicable, then its output can
just be treated as a single string, and that will work fine.

Our problem basically is that if we don't know the variable that was
being processed, we can't be sure whether GUC_LIST_QUOTE was used.
I don't currently see a solution to that other than insisting that
GUC_LIST_QUOTE only be used for known core GUCs.

regards, tom lane



Re: file cloning in pg_upgrade and CREATE DATABASE

2018-03-19 Thread Michael Paquier
On Mon, Mar 19, 2018 at 04:14:15PM +0900, Michael Paquier wrote:
> Two other things I have noticed as well:
> 1) src/bin/pg_rewind/copy_fetch.c could benefit from similar speed-ups I
> think when copying data from source to target using the local mode of
> pg_rewind.  This could really improve cases where new relations are
> added after a promotion.
> 2) XLogFileCopy() uses a copy logic as well.  For large segments things
> could be improved, however we need to be careful about filling in the
> end of segments with zeros.

I have been thinking about this patch over the night, and here is a list
of bullet points which would be nice to tackle:
- Remove the current diff in copydir.
- Extend copy_file so as it is able to use fcopyfile.
- Move the work done in pg_upgrade into a common API which can also be
used by pg_rewind (see the sketch after this list).  One place would be
to have a frontend-only API in src/common which does the leg work.  I
would recommend working only on file descriptors, for consistency with
copy_file_range.
- Add proper wait events for the backend calls.  Those are missing for
copy_file_range and copyfile.
- For XLogFileCopy, the problem may be trickier as the tail of a segment
is filled with zeroes, so dropping it from the first version of the
patch sounds wiser.
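A minimal sketch of what such a frontend-only, fd-based helper could
look like (assuming Linux copy_file_range(); the function name and the
fallback policy are illustrative):

#define _GNU_SOURCE
#include <unistd.h>

/* Copy nbytes from src_fd to dst_fd using copy_file_range(); the caller is
 * expected to fall back to plain read()/write() when this returns -1. */
static int
copy_fd_range_all(int src_fd, int dst_fd, size_t nbytes)
{
    while (nbytes > 0)
    {
        ssize_t nwritten = copy_file_range(src_fd, NULL, dst_fd, NULL, nbytes, 0);

        if (nwritten < 0)
            return -1;          /* not supported here; use the fallback path */
        if (nwritten == 0)
            break;              /* reached EOF on the source */
        nbytes -= nwritten;
    }
    return 0;
}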

The patch is switched to waiting on author, and I have set myself as a
reviewer.

Thanks,
--
Michael


signature.asc
Description: PGP signature


Re: configure's checks for --enable-tap-tests are insufficient

2018-03-19 Thread Chapman Flack
On 03/19/18 22:40, Tom Lane wrote:
> Craig Ringer  writes:
>> Yes, looks like tests are necessary. I'd argue it's "test for broken Perl"
>> but apparently Fedora think they should split the Perl core dist into tiny
>> pieces.
> 
> FWIW, the package boundaries are the same on RHEL6, and probably long
> before that.  It's possible that Red Hat have trimmed back which packages
> they consider part of a basic Perl installation.

Ehh, I'm with Craig. The *perl project* identifies which packages (and
versions) are part of perl core in each perl version ... not Red Hat.

http://cpansearch.perl.org/src/BINGOS/Module-CoreList-5.20180220/lib/Module/CoreList.pm

Red Hat's practice seems super reckless given that there's an actual
module (itself part of core since 5.8.9) that lets you check what
modules you can safely assume are there. I'm sure I'm not the only one
to have written code relying on it.

-Chap



Re: configure's checks for --enable-tap-tests are insufficient

2018-03-19 Thread Tom Lane
Craig Ringer  writes:
> On 19 March 2018 at 23:56, Tom Lane  wrote:
>> ... much excavation finds a complaint about Time::HiRes not being installed

> That's absurd. Time::HiRes is a core module, as is Test::More. Sigh.

> Yes, looks like tests are necessary. I'd argue it's "test for broken Perl"
> but apparently Fedora think they should split the Perl core dist into tiny
> pieces.

FWIW, the package boundaries are the same on RHEL6, and probably long
before that.  It's possible that Red Hat have trimmed back which packages
they consider part of a basic Perl installation.  I don't have any data
on whether that's moved over time, but it wouldn't surprise me given
the long-term shifts in Fedora's orientation.

regards, tom lane



Re: configure's checks for --enable-tap-tests are insufficient

2018-03-19 Thread Craig Ringer
On 19 March 2018 at 23:56, Tom Lane  wrote:

> $ make -j check-world
> ... runs for awhile and then fails
> ... much excavation finds a complaint about Time::HiRes not being installed
> $ sudo yum install perl-Time-HiRes
>
>
That's absurd. Time::HiRes is a core module, as is Test::More. Sigh.

Yes, looks like tests are necessary. I'd argue it's "test for broken Perl"
but apparently Fedora think they should split the Perl core dist into tiny
pieces.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: INOUT parameters in procedures

2018-03-19 Thread Peter Eisentraut
On 3/19/18 03:25, Rushabh Lathia wrote:
> For the FUNCTION when we have single OUT/INOUT parameter 
> the return type for that function will be set to the type of OUT parameter.
> But in case of PROCEDURE, it's always RECORDOID, why this inconsistency?

For procedures, this is just an implementation detail.  The CALL command
returns a row in any case, so if we set the return type to a scalar
type, we'd have to add special code to reassemble a row anyway.  For
functions, the inconsistency is (arguably) worth it, because it affects
how functions can be written and called, but for procedures, there would
be no point.
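
For illustration, a minimal sketch of what that means in practice
(hypothetical procedure name, using the syntax from this patch series):

CREATE PROCEDURE double_it(INOUT a int, INOUT b text)
LANGUAGE plpgsql
AS $$
BEGIN
  a := a * 2;
  b := b || '!';
END;
$$;

-- CALL always returns a row, here with columns (a, b) = (42, 'hi!')
CALL double_it(21, 'hi');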

> Above test throws an error saying calling procedures with output
> arguments are not supported in SQL functions.  Whereas similar test
> do work with SQL functions:

This was discussed earlier in the thread.

The behavior of output parameters in functions was, AFAICT, invented by
us.  But for procedures, the SQL standard specifies it, so there might
be some differences.

> ERROR:  calling procedures with output arguments is not supported in SQL
> functions
> CONTEXT:  SQL function "ptest4b"
> 
> Here error message says that calling procedures with output arguments is not
> supported in SQL functions.  Whereas here it's getting called from the SQL
> procedure.  So error message needs to be changed. 

Well, I don't think we are going to change every single error message
from "function" to a separate function and procedure variant.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: committing inside cursor loop

2018-03-19 Thread Peter Eisentraut
On 3/14/18 08:05, Ildus Kurbangaliev wrote:
>> The ROLLBACK call in the first loop iteration undoes the UPDATE
>> command that drives the loop.  Is it then sensible to continue the
>> loop?
>>
> I think that in the first place ROLLBACK was prohibited because of cases
> like this, but it seems to safe to continue the loop when portal
> strategy is PORTAL_ONE_SELECT.

Maybe, but even plain SELECT commands can have side effects.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_get_functiondef forgets about most GUC_LIST_INPUT GUCs

2018-03-19 Thread Michael Paquier
On Mon, Mar 19, 2018 at 07:15:36PM -0400, Tom Lane wrote:
> This is a good thing not least because all the GUC_LIST_QUOTE variables
> are in core.  I've been trying to think of a way that we could have
> correct behavior for variables that the core backend doesn't know about
> and whose extension shlibs aren't currently loaded, and I can't come up
> with one.  So I think that in practice, we have to document that
> GUC_LIST_QUOTE is unsupported for extension variables

I would propose to add a sentence on the matter at the bottom of the
CREATE FUNCTION page.  Even with that, the handling of items in the list
is incorrect with any patches on this thread: the double quotes should
be applied to each element of the list, not the whole list.  For example
with HEAD and this function:
=# CREATE or replace FUNCTION func_with_set_params() RETURNS integer
AS 'select 1;'
LANGUAGE SQL
SET search_path TO 'pg_catalog, public'
SET wal_consistency_checking TO heap, heap2
SET session_preload_libraries TO 'foo, bar'
IMMUTABLE STRICT;

Then retrieving the function definition results in the following:
=# SELECT pg_get_functiondef('func_with_set_params'::regproc);
CREATE OR REPLACE FUNCTION public.func_with_set_params()
RETURNS integer
 LANGUAGE sql
 IMMUTABLE STRICT
 SET search_path TO "pg_catalog, public"
 SET wal_consistency_checking TO 'heap, heap2'
 SET session_preload_libraries TO '"foo, bar"'
AS $function$select 1;$function$

And that's wrong as "pg_catalog, public" is only a one-element list, no?
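
For reference, a minimal illustration of the two readings (not taken from
the patch):

SET search_path TO "pg_catalog, public";    -- one element: a schema named "pg_catalog, public"
SET search_path TO "pg_catalog", "public";  -- two elements: pg_catalog and public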

> (and, perhaps,
> enforce this in DefineCustomStringVariable? or is that cure worse than
> the disease?)

Extension authors can just mark the variables they define with
GUC_LIST_QUOTE, so enforcing it would be a step backwards in my
opinion.
--
Michael




Re: inserts into partitioned table may cause crash

2018-03-19 Thread Amit Langote
On 2018/03/20 5:54, Alvaro Herrera wrote:
> Etsuro Fujita wrote:
> 
>> Thanks for the updated patches!  I think the patches are in good shape, but
>> I did a bit of editorial things; added a bit more comments for
>> ExecPrepareTupleRouting and adjusted regression test stuff to match the
>> existing ones.  Attached are the updated patches for HEAD and PG10. Those
>> changes are just editorial, so let's ask for the committer review.
> 
> Thanks, looks good!  I made another pass over the comments (a few looked
> too much like they were in a larger function) and pushed to both
> branches.

Thank you!

Regards,
Amit




Re: pg_get_functiondef forgets about most GUC_LIST_INPUT GUCs

2018-03-19 Thread Tom Lane
Kyotaro HORIGUCHI  writes:
> At Fri, 16 Mar 2018 21:15:54 +0900, Michael Paquier  
> wrote in <20180316121554.ga2...@paquier.xyz>
>> Let's be clear.  I have listed all the variables in the patch to gather
>> more easily opinions, and because it is easier to review the whole stack
>> this way. I personally think that the only variables where the patch
>> makes sense are:
>> - DateStyle
>> - search_path
>> - plpgsql.extra_errors
>> - plpgsql.extra_warnings
>> - wal_consistency_checking
>> So I would be incline to drop the rest from the patch.  If there are
>> authors of popular extensions willing to get this support, let's update
>> the list once they argue about it and only if it makes sense.  However,
>> as far as I can see, there are no real candidates.  So let's keep the
>> list simple.

> FWIW +1 from me. It seems reasonable as the amendment to the
> current status.

It suddenly struck me that the scope of the patch is wider than it needs
to be.  We don't need special behavior for all GUC_LIST variables, only
for GUC_LIST_QUOTE variables.  (For example, SET datestyle = 'iso, mdy'
works just as well as SET datestyle = iso, mdy.)

This is a good thing not least because all the GUC_LIST_QUOTE variables
are in core.  I've been trying to think of a way that we could have
correct behavior for variables that the core backend doesn't know about
and whose extension shlibs aren't currently loaded, and I can't come up
with one.  So I think that in practice, we have to document that
GUC_LIST_QUOTE is unsupported for extension variables (and, perhaps,
enforce this in DefineCustomStringVariable? or is that cure worse than
the disease?)

regards, tom lane



Re: PostgreSQL 10: Segmentation fault when using GROUPING SETS with all unsortable columns

2018-03-19 Thread Andres Freund
Hi,

On 2018-03-17 23:43:22 +, Huong Dangminh wrote:
> Hi,
> 
> I have found a case which could get segmentation fault when using GROUPING 
> SETS.
> It occurs when columns in GROUPING SETS are all unsortable but hashable.
> 
> Attached grouping_sets_segv.zip include module to reproduce this problem.
> 
> I think it made from below change.
> 
> https://www.postgresql.org/docs/current/static/release-10.html#id-1.11.6.8.5.3.6
> ---
> Allow hashed aggregation to be used with grouping sets (Andrew Gierth)
> ---
> https://github.com/postgres/postgres/commit/b5635948ab165b6070e7d05d111f966e07570d81

Andrew, any comments?

Regards,

Andres



RE: PostgreSQL 10: Segmentation fault when using GROUPING SETS with all unsortable columns

2018-03-19 Thread Huong Dangminh
Hi,

> I have found a case which could get segmentation fault when using GROUPING
> SETS.
> It occurs when columns in GROUPING SETS are all unsortable but hashable.

I have added this thread to the current CF.
Please let me know if you need more information.


---
Thanks and best regards,
Dang Minh Huong
NEC Solution Innovators, Ltd.
http://www.nec-solutioninnovators.co.jp/en/




Re: Problems with Error Messages wrt Domains, Checks

2018-03-19 Thread Tom Lane
"David G. Johnston"  writes:
> On Mon, Mar 19, 2018 at 8:33 AM, Tom Lane  wrote:
> Another one that seems to fall into the "fairly large-scale work" would be
> the:
> ERROR:  more than one row returned by a subquery used as an expression
> (that's it, nothing else prints in psql when I run the offending query -
> using "--echo-all" to at least see what query I sent caused the issue)

Yeah, this falls into the general category of unlocalized run-time errors.
I still think that emitting an error cursor would be a general-purpose
answer that would improve the user experience for this and many other
cases.  You could imagine specific behaviors that would be more optimal
for this specific error report, but the approach of solving this issue
one error at a time doesn't scale even a little bit.

> Printing out the offending expression would be nice if possible - printing
> the set (size > 1) of values that were returned would probably help as
> well, though usually the problem data is in a where clause while the set
> would contain target list data - and its those where clause items that
> matter more.  In the case of a correlated subquery exhibiting this problem
> the outer query vars being passed in is what would be most helpful.

IIRC, this error is thrown as soon as we get a second row out of the
subquery; we don't know what the ultimate result-set size would be,
and I rather doubt that finding that out would be helpful very often.
Nor does it seem like figuring out how to identify the parameter values
passed down to the subquery (if any) would really repay the effort.
The set that the executor thinks it's passing down might not have that
much to do with what the user thinks the query is doing.

regards, tom lane



Re: Problems with Error Messages wrt Domains, Checks

2018-03-19 Thread David G. Johnston
On Mon, Mar 19, 2018 at 8:33 AM, Tom Lane  wrote:

> "David G. Johnston"  writes:
> > Frankly, I'm not seeing "invalid constant regular expressions" as being a
> > large scale problem - but I'll agree that having the error include the
> > actual literal being parsed as a RegEx should be done.
>
> Agreed.  Doing anything about the other stuff discussed in this thread is
> fairly large-scale work, but adjusting our regex error messages is easy.
>

Another one that seems to fall into the "fairly large-scale work" would be
the:

ERROR:  more than one row returned by a subquery used as an expression
(that's it, nothing else prints in psql when I run the offending query -
using "--echo-all" to at least see what query I sent caused the issue)

Found this via Google

https://www.postgresql.org/message-id/flat/001201ce70d8%24718bd780%2454a38680%24%40kapila%40huawei.com#001201ce70d8$718bd780$54a38680$@kap...@huawei.com

Printing out the offending expression would be nice if possible - printing
the set (size > 1) of values that were returned would probably help as
well, though usually the problem data is in a where clause while the set
would contain target list data - and it's those where clause items that
matter more.  In the case of a correlated subquery exhibiting this problem,
the outer query vars being passed in are what would be most helpful.

David J.

P.S. I consider this to be on-topic for the general "hard to debug" topic
this thread covers even if it doesn't involve domains... I may gripe more
prominently later...


Re: neqjoinsel versus "refresh materialized view concurrently"

2018-03-19 Thread Tom Lane
Thomas Munro  writes:
> On Wed, Mar 14, 2018 at 2:56 PM, Jeff Janes  wrote:
>> Is there any good way to make the regression tests fail if the plan reverts
>> to the bad one?  The only thing I can think of would be to make the table
>> bigger so the regression tests becomes "noticeably slower", but that is
>> pretty vague and not user friendly to formally pass and just hope it is slow
>> enough for someone to investigate.

> I can't think of a good way.  I guess it can still pick a nested loop
> if it thinks there'll only be a couple of loops.  This patch tells it
> to pay attention to the total cost, not the startup cost, so as soon
> as it thinks there is more than a hand full of rows the quadratic cost
> will exceed the sort/merge's logarithmic cost.

Right.  After further thought, the key point here is that in non-error
cases the query will produce no rows, meaning that it must be executed
to completion before we can be sure of that.  But applying a LIMIT
encourages the planner to pick a fast-start (slow-completion) plan,
which is not going to be what we want.  If in fact the query were going
to produce a lot of rows, and the planner could accurately predict that,
then maybe a LIMIT would be useful --- but there's no hope of estimates
on wholerowvar *= wholerowvar being accurate any time soon, let alone
correctly handling the correlation with ctid <> ctid.  So the LIMIT
is just an invitation to trouble and we may as well remove it.

Committed that way.  I also changed EXISTS(SELECT * ...) to
EXISTS(SELECT 1 ...), in hopes of saving a few microseconds of
parsing effort.

> Since I've had hash joins on the mind recently I couldn't help
> noticing that you can't get a hash join out of this query's "record
> image" based join qual (or even a regular row-based =).

Yeah, because there's no hash support for recordimage.  So there's even
less reason to be worried about how smart the planner is for this query:
basically, it might do a nestloop for a very small number of rows, but
otherwise it's gonna have to go for a merge join.

My earlier thought that we might be able to skip the ANALYZE step
seems wrong, though.  It's true that it does little for this query,
but the follow-on query to build a diff table can probably make
good use of the stats.

regards, tom lane



Re: Error detail/hint style fixup

2018-03-19 Thread Daniel Gustafsson
> On 19 Mar 2018, at 17:47, Tom Lane  wrote:
> 
> Daniel Gustafsson  writes:
>> Attached patch ensures that (i) details and hints have leading 
>> capitalization,
>> have double spaces after punctuation and ends with period; (ii) context 
>> should
>> not be capitalized and should not end with period; (iii) test .out files 
>> match
>> the changes.
> 
> +1 for cleaning this up, but I wonder if you've gone far enough in
> adjusting the style of errcontext() messages.

Thanks for looking/reviewing.

> I'm thinking what we should actually be printing is more like
> 
> CONTEXT:  while opening cursor on dblink connection named "unnamed"

I agree that the contexts in dblink were pretty unhelpful with redundant
language.  I took a stab at this; the attached patch extends dblink_res_error()
to improve the context.  Including the cursor name in the context seemed to be
in line with the errmsgs in dblink.

Looking around at other errcontext’s I can’t see any other cases that warrant
attention.

cheers ./daniel



errdetail_hint_style_v2.patch
Description: Binary data


Re: inserts into partitioned table may cause crash

2018-03-19 Thread Alvaro Herrera
Etsuro Fujita wrote:

> Thanks for the updated patches!  I think the patches are in good shape, but
> I did a bit of editorial things; added a bit more comments for
> ExecPrepareTupleRouting and adjusted regression test stuff to match the
> existing ones.  Attached are the updated patches for HEAD and PG10. Those
> changes are just editorial, so let's ask for the committer review.

Thanks, looks good!  I made another pass over the comments (a few looked
too much like they were in a larger function) and pushed to both
branches.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Compile error while building postgresql 10.3

2018-03-19 Thread Terry Phelps
I put that include from /usr/local/include because 'configure' wasn't
finding readline.h (I think). I'll look into this.

Thanks again for the help.

On Mon, Mar 19, 2018 at 4:44 PM, Tom Lane  wrote:

> Terry Phelps  writes:
> > Thank you for your help. That resolved the problem. My bad.
> > The build ran much further and then got another error, which I'll mention
> > here, and go research it, since it could be just my bleeding edge source
> > code.
>
> > cc -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement
> > -Wendif-labels -Wmissing-format-attribute -Wformat-security
> > -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
> > -I/usr/local/include -D_THREAD_SAFE -pthread -D_REENTRANT -D_THREAD_SAFE
> > -D_POSIX_PTHREAD_SEMANTICS -fPIC -DPIC -I../include
> > -I../../../../src/interfaces/ecpg/include -DFRONTEND
> > -I../../../../src/include   -DSO_MAJOR_VERSION=3  -c -o datetime.o
> > datetime.c
> > datetime.c:332:1: error: conflicting types for 'PGTYPESdate_defmt_asc'
> > PGTYPESdate_defmt_asc(date * d, const char *fmt, const char *str)
> > ^
> > /usr/local/include/pgtypes_date.h:24:12: note: previous declaration is
> here
> > extern int  PGTYPESdate_defmt_asc(date *, const char *, char *);
> > ^
> > 1 error generated.
>
> It looks like you've got -I/usr/local/include in front of the build's
> own -I switches, and that's allowing it to pull in back-version copies
> of PG-related include files instead of the ones in the source tree.
>
> I'm not totally sure, but if you inject -I/usr/local/include through
> CPPFLAGS not CFLAGS, configure might do the right thing automatically.
> Otherwise, you could manually edit CPPFLAGS in src/Makefile.global
> after configuring to get the right -I order.
>
> regards, tom lane
>


Re: [HACKERS] plpgsql - additional extra checks

2018-03-19 Thread Tomas Vondra
Hi,

I'm looking at the updated patch (plpgsql-extra-check-180316.patch), and
this time it applies and builds OK. The one thing I noticed is that the
documentation still uses the old wording for strict_multi_assignement:

WARNING:  Number of evaluated fields does not match expected.
HINT:  strict_multi_assignement check of extra_warnings is active.
WARNING:  Number of evaluated fields does not match expected.
HINT:  strict_multi_assignement check of extra_warnings is active.

This was reworded to "Number of source and target fields in assignment
does not match."

Otherwise it seems fine to me, and I'm tempted to mark it RFC once the
docs get fixed. Stephen, any objections?

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Compile error while building postgresql 10.3

2018-03-19 Thread Tom Lane
Thomas Munro  writes:
> /usr/local/include/pgtypes_date.h:24:12: note: previous declaration is here
> extern int  PGTYPESdate_defmt_asc(date *, const char *, char *);
>
> BTW it looks like 0e1539ba0d0a added const qualifiers to that function
> but didn't update the documentation in doc/src/sgml/ecpg.sgml.

Good catch, please send a patch.

regards, tom lane



Re: Compile error while building postgresql 10.3

2018-03-19 Thread Tom Lane
Terry Phelps  writes:
> Thank you for your help. That resolved the problem. My bad.
> The build ran much further and then got another error, which I'll mention
> here, and go research it, since it could be just my bleeding edge source
> code.

> cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
> -Wendif-labels -Wmissing-format-attribute -Wformat-security
> -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
> -I/usr/local/include -D_THREAD_SAFE -pthread -D_REENTRANT -D_THREAD_SAFE
> -D_POSIX_PTHREAD_SEMANTICS -fPIC -DPIC -I../include
> -I../../../../src/interfaces/ecpg/include -DFRONTEND
> -I../../../../src/include   -DSO_MAJOR_VERSION=3  -c -o datetime.o
> datetime.c
> datetime.c:332:1: error: conflicting types for 'PGTYPESdate_defmt_asc'
> PGTYPESdate_defmt_asc(date * d, const char *fmt, const char *str)
> ^
> /usr/local/include/pgtypes_date.h:24:12: note: previous declaration is here
> extern int  PGTYPESdate_defmt_asc(date *, const char *, char *);
> ^
> 1 error generated.

It looks like you've got -I/usr/local/include in front of the build's
own -I switches, and that's allowing it to pull in back-version copies
of PG-related include files instead of the ones in the source tree.

I'm not totally sure, but if you inject -I/usr/local/include through
CPPFLAGS not CFLAGS, configure might do the right thing automatically.
Otherwise, you could manually edit CPPFLAGS in src/Makefile.global
after configuring to get the right -I order.
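
In other words, something like this (a sketch; adjust the paths as needed):

CPPFLAGS=-I/usr/local/include LDFLAGS=-L/usr/local/lib ./configure
gmake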

regards, tom lane



Re: Compile error while building postgresql 10.3

2018-03-19 Thread Thomas Munro
On Tue, Mar 20, 2018 at 9:28 AM, Terry Phelps  wrote:
> Thank you for your help. That resolved the problem. My bad.
>
> The build ran much further and then got another error, which I'll mention
> here, and go research it, since it could be just my bleeding edge source
> code.
>
> cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
> -Wendif-labels -Wmissing-format-attribute -Wformat-security
> -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
> -I/usr/local/include -D_THREAD_SAFE -pthread -D_REENTRANT -D_THREAD_SAFE
> -D_POSIX_PTHREAD_SEMANTICS -fPIC -DPIC -I../include
> -I../../../../src/interfaces/ecpg/include -DFRONTEND
> -I../../../../src/include   -DSO_MAJOR_VERSION=3  -c -o datetime.o
> datetime.c
> datetime.c:332:1: error: conflicting types for 'PGTYPESdate_defmt_asc'
> PGTYPESdate_defmt_asc(date * d, const char *fmt, const char *str)
> ^
> /usr/local/include/pgtypes_date.h:24:12: note: previous declaration is here
> extern int  PGTYPESdate_defmt_asc(date *, const char *, char *);

That header is coming from /usr/local/include instead of your
development tree, from an older version of PG from before const was
added there.

BTW it looks like 0e1539ba0d0a added const qualifiers to that function
but didn't update the documentation in doc/src/sgml/ecpg.sgml.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Compile error while building postgresql 10.3

2018-03-19 Thread Terry Phelps
Thank you for your help. That resolved the problem. My bad.

The build ran much further and then got another error, which I'll mention
here, and go research it, since it could be just my bleeding edge source
code.

cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Wendif-labels -Wmissing-format-attribute -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
-I/usr/local/include -D_THREAD_SAFE -pthread -D_REENTRANT -D_THREAD_SAFE
-D_POSIX_PTHREAD_SEMANTICS -fPIC -DPIC -I../include
-I../../../../src/interfaces/ecpg/include -DFRONTEND
-I../../../../src/include   -DSO_MAJOR_VERSION=3  -c -o datetime.o
datetime.c
datetime.c:332:1: error: conflicting types for 'PGTYPESdate_defmt_asc'
PGTYPESdate_defmt_asc(date * d, const char *fmt, const char *str)
^
/usr/local/include/pgtypes_date.h:24:12: note: previous declaration is here
extern int  PGTYPESdate_defmt_asc(date *, const char *, char *);
^
1 error generated.
gmake[4]: *** [: datetime.o] Error 1



On Mon, Mar 19, 2018 at 4:22 PM, Tom Lane  wrote:

> Terry Phelps  writes:
> > I ran configure like this, because it got errors otherwise. O. I
> wonder
> > if CFLAGS caused this.
>
> > LDFLAGS=-L/usr/local/lib CFLAGS=-I/usr/local/include  ./configure
>
> That would've been fine, but configure then adds onto what you specified
> as CFLAGS.
>
> > And then I ran gmake like this:
> > gmake  LDFLAGS=-L/usr/local/lib CFLAGS=-I/usr/local/include
>
> And here you overrode the additions.  Just do "gmake" without these
> overrides.  If you look into src/Makefile.global you should see that
> your initial specifications got into the selected flag variables.
>
> regards, tom lane
>


Trigger file behavior with the standby

2018-03-19 Thread Keiko Oda
Hello,

I'm seeing the following behavior with a trigger file which is very
confusing to me, and I'd like to get some advice on what the expected
behavior of the trigger file with a standby is.
(I'm cross-posting this from pgsql-general as I didn't get a response
there.)

1. setup the replication, with the standby having the following
recovery.conf

  # we use wal-e
  restore_command = 'wal-e wal-fetch  "%f" "%p"'
  standby_mode = 'true'
  trigger_file = '/my/path/to/trigger-file/STANDBY_OFF'
  recovery_target_timeline = 'latest'
  primary_conninfo = 'host=myhost port=5432 user=foo
password=verysecurepassword'

2. create a trigger file while standby is having a "lag" (and replication
is not streaming, but file-based log-shipping at this point)
3. looks like Postgres doesn't recognize a trigger file at all, standby
keeps replaying/recovering WALs
  * tried to see if Postgres is doing anything with DEBUG5 log, but it
doesn't say anything about a trigger file
  * also tried to restart Postgres, sending SIGUSR1, etc. to see if it
helps but it just keeps replaying WALs
4. once the standby has "caught up" with the leader (replayed all WALs and
is about to switch to streaming replication), Postgres finally realizes
that there is a trigger file and does the failover

The doc (
https://www.postgresql.org/docs/current/static/warm-standby-failover.html)
says:

> To trigger failover of a log-shipping standby server, run pg_ctl promote
or create a trigger file with the file name and path specified by the
trigger_file setting in recovery.conf.

So, I'd expect that the standby will trigger a failover as soon as we
create a trigger file at step 2. However, the failover doesn't happen until
step 4 above, and the gap between step 2 and step 4 can sometimes be many hours.

I've reproduced this with Postgres 9.4 and 9.5. I kinda gave up on
reproducing it with 10 for now (not that I wasn't able to, more that the
prep takes time and I'm postponing it), but I'm happy to try it if that helps.
Please let me know if there is any other information I could provide.

Thanks!
Keiko


Re: Compile error while building postgresql 10.3

2018-03-19 Thread Tom Lane
Terry Phelps  writes:
> I ran configure like this, because it got errors otherwise. O. I wonder
> if CFLAGS caused this.

> LDFLAGS=-L/usr/local/lib CFLAGS=-I/usr/local/include  ./configure

That would've been fine, but configure then adds onto what you specified
as CFLAGS.

> And then I ran gmake like this:
> gmake  LDFLAGS=-L/usr/local/lib CFLAGS=-I/usr/local/include

And here you overrode the additions.  Just do "gmake" without these
overrides.  If you look into src/Makefile.global you should see that
your initial specifications got into the selected flag variables.

regards, tom lane



Re: Compile error while building postgresql 10.3

2018-03-19 Thread Terry Phelps
Only slightly different. Here is exactly what I entered to get the error:

I ran configure like this, because it got errors otherwise. O. I wonder
if CFLAGS caused this.

LDFLAGS=-L/usr/local/lib CFLAGS=-I/usr/local/include  ./configure


And then I ran gmake like this:

gmake  LDFLAGS=-L/usr/local/lib CFLAGS=-I/usr/local/include


And I again touched CFLAGS. I fear that I caused this.


On Mon, Mar 19, 2018 at 4:15 PM, Tom Lane  wrote:

> Terry Phelps  writes:
> > I did:
> > cd src/port
> > gmake -s clean
> > gmake
>
> > It says:
>
> > cc -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement
> > -Wendif-labels -Wmissing-format-attribute -Wformat-security
> > -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
> > -I/usr/local/include -msse4.2 -I../../src/port -DFRONTEND
> > -I../../src/include-c -o pg_crc32c_sse42.o pg_crc32c_sse42.c
>
> Now I'm even more confused, because that's fine --- and you'll notice
> it didn't fail.  So why doesn't this agree with your original run?
> Did you do anything different from plain "gmake" that time?
>
> regards, tom lane
>


Re: Compile error while building postgresql 10.3

2018-03-19 Thread Tom Lane
Terry Phelps  writes:
> I did:
> cd src/port
> gmake -s clean
> gmake

> It says:

> cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
> -Wendif-labels -Wmissing-format-attribute -Wformat-security
> -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
> -I/usr/local/include -msse4.2 -I../../src/port -DFRONTEND
> -I../../src/include-c -o pg_crc32c_sse42.o pg_crc32c_sse42.c

Now I'm even more confused, because that's fine --- and you'll notice
it didn't fail.  So why doesn't this agree with your original run?
Did you do anything different from plain "gmake" that time?

regards, tom lane



Re: Compile error while building postgresql 10.3

2018-03-19 Thread Terry Phelps
The above was in response to Tom Lane's request to show him a listing of:

cd src/port
gmake

I didn't get any errors when I did that.

On Mon, Mar 19, 2018 at 4:11 PM, Andres Freund  wrote:

> Hi,
>
> On 2018-03-19 16:07:17 -0400, Terry Phelps wrote:
> > I did:
> > cd src/port
> > gmake -s clean
> > gmake
> >
> > It says:
> >
> > cc -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement
> > -Wendif-labels -Wmissing-format-attribute -Wformat-security
> > -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
> > -I/usr/local/include -msse4.2 -I../../src/port -DFRONTEND
> > -I../../src/include-c -o pg_crc32c_sse42.o pg_crc32c_sse42.c
>
> > cc -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement
> > -Wendif-labels -Wmissing-format-attribute -Wformat-security
> > -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
> > -I/usr/local/include -msse4.2  -I../../src/port  -I../../src/include   -c
> > pg_crc32c_sse42.c -o pg_crc32c_sse42_srv.o
>
> So the build actually succeeds now? Or are you getting the error while
> you're in some other directory?
>
> Greetings,
>
> Andres Freund
>


Re: Compile error while building postgresql 10.3

2018-03-19 Thread Andres Freund
Hi,

On 2018-03-19 16:07:17 -0400, Terry Phelps wrote:
> I did:
> cd src/port
> gmake -s clean
> gmake
> 
> It says:
> 
> cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
> -Wendif-labels -Wmissing-format-attribute -Wformat-security
> -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
> -I/usr/local/include -msse4.2 -I../../src/port -DFRONTEND
> -I../../src/include-c -o pg_crc32c_sse42.o pg_crc32c_sse42.c

> cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
> -Wendif-labels -Wmissing-format-attribute -Wformat-security
> -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
> -I/usr/local/include -msse4.2  -I../../src/port  -I../../src/include   -c
> pg_crc32c_sse42.c -o pg_crc32c_sse42_srv.o

So the build actually succeeds now? Or are you getting the error while
you're in some other directory?

Greetings,

Andres Freund



Re: Compile error while building postgresql 10.3

2018-03-19 Thread Terry Phelps
I did:
cd src/port
gmake -s clean
gmake

It says:

gmake -C ../backend submake-errcodes
gmake[1]: Entering directory '/usr/home/tgphelps/postgresql/src/backend'
gmake[1]: Nothing to be done for 'submake-errcodes'.
gmake[1]: Leaving directory '/usr/home/tgphelps/postgresql/src/backend'
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Wendif-labels -Wmissing-format-attribute -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
-I/usr/local/include -msse4.2 -I../../src/port -DFRONTEND
-I../../src/include-c -o pg_crc32c_sse42.o pg_crc32c_sse42.c
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Wendif-labels -Wmissing-format-attribute -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
-I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include-c
-o pg_crc32c_sb8.o pg_crc32c_sb8.c
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Wendif-labels -Wmissing-format-attribute -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
-I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include-c
-o pg_crc32c_choose.o pg_crc32c_choose.c
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Wendif-labels -Wmissing-format-attribute -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
-I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include-c
-o chklocale.o chklocale.c
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Wendif-labels -Wmissing-format-attribute -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
-I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include-c
-o erand48.o erand48.c
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Wendif-labels -Wmissing-format-attribute -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
-I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include-c
-o inet_net_ntop.o inet_net_ntop.c
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Wendif-labels -Wmissing-format-attribute -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
-I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include-c
-o noblock.o noblock.c
echo "#define PGBINDIR \"/usr/local/pgsql/bin\"" >pg_config_paths.h
echo "#define PGSHAREDIR \"/usr/local/pgsql/share\"" >>pg_config_paths.h
echo "#define SYSCONFDIR \"/usr/local/pgsql/etc\"" >>pg_config_paths.h
echo "#define INCLUDEDIR \"/usr/local/pgsql/include\"" >>pg_config_paths.h
echo "#define PKGINCLUDEDIR \"/usr/local/pgsql/include\""
>>pg_config_paths.h
echo "#define INCLUDEDIRSERVER \"/usr/local/pgsql/include/server\""
>>pg_config_paths.h
echo "#define LIBDIR \"/usr/local/pgsql/lib\"" >>pg_config_paths.h
echo "#define PKGLIBDIR \"/usr/local/pgsql/lib\"" >>pg_config_paths.h
echo "#define LOCALEDIR \"/usr/local/pgsql/share/locale\""
>>pg_config_paths.h
echo "#define DOCDIR \"/usr/local/pgsql/share/doc/\"" >>pg_config_paths.h
echo "#define HTMLDIR \"/usr/local/pgsql/share/doc/\"" >>pg_config_paths.h
echo "#define MANDIR \"/usr/local/pgsql/share/man\"" >>pg_config_paths.h
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Wendif-labels -Wmissing-format-attribute -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
-I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include-c
-o path.o path.c
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Wendif-labels -Wmissing-format-attribute -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
-I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include-c
-o pgcheckdir.o pgcheckdir.c
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Wendif-labels -Wmissing-format-attribute -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
-I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include-c
-o pgmkdirp.o pgmkdirp.c
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Wendif-labels -Wmissing-format-attribute -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
-I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include-c
-o pgsleep.o pgsleep.c
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Wendif-labels -Wmissing-format-attribute -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument
-I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include-c
-o pgstrcasecmp.o pgstrcasecmp.c
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Wendif-labels -Wmissing-format-attribute -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-unuse

Re: Compile error while building postgresql 10.3

2018-03-19 Thread Terry Phelps
Tom, I'll get what you asked for in a minute. But first, I want to make
sure that y'all see that the compiler is clang, and not gcc. Perhaps that's
not important.

On Mon, Mar 19, 2018 at 4:01 PM, Tom Lane  wrote:

> Andres Freund  writes:
> > There seems to be something sketchy afoot here, even outside of
> > CFLAGS_SSE42 itself. From the original email:
>
> > cc -I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include
> > -c -o pg_crc32c_sse42.o pg_crc32c_sse42.c
>
> > isn't this missing a number of important flags? Like at least
> > -fno-strict-aliasing -fwrapv -fexcess-precision=standard?
>
> Good point ... seems like we lost *all* CFLAGS not just the SSE42 ones.
> I believe the options we see here are all from CPPFLAGS not CFLAGS.
>
> Terry, could we see a full "make" trace from src/port/?  Something like
>
> cd src/port
> make -s clean
> make
>
> I'm curious whether the flags lossage affects all .c files in that
> directory, or only ones that are trying to add on custom flags.
>
> regards, tom lane
>


Re: Compile error while building postgresql 10.3

2018-03-19 Thread Tom Lane
Andres Freund  writes:
> There seems to be something sketchy afoot here, even outside of
> CFLAGS_SSE42 itself. From the original email:

> cc -I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include
> -c -o pg_crc32c_sse42.o pg_crc32c_sse42.c

> isn't this missing a number of important flags? Like at least
> -fno-strict-aliasing -fwrapv -fexcess-precision=standard?

Good point ... seems like we lost *all* CFLAGS not just the SSE42 ones.
I believe the options we see here are all from CPPFLAGS not CFLAGS.

Terry, could we see a full "make" trace from src/port/?  Something like

cd src/port
make -s clean
make

I'm curious whether the flags lossage affects all .c files in that
directory, or only ones that are trying to add on custom flags.

regards, tom lane



Re: Compile error while building postgresql 10.3

2018-03-19 Thread Terry Phelps
I get the following, which appears to be what you were expecting.

$ grep sse4 src/Makefile.global
CFLAGS_SSE42 = -msse4.2
PG_CRC32C_OBJS = pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_choose.o

Gmake's version is:

$ gmake --version
GNU Make 4.2.1
Built for amd64-portbld-freebsd11.1

I installed this via "pkg install gmake". I didn't build it myself.

On Mon, Mar 19, 2018 at 3:50 PM, Tom Lane  wrote:

> [ please keep the list cc'd ]
>
> Terry Phelps  writes:
> > I can barely read a configure.in file, but here's what I think you're
> > asking for. If not, I'll try again:
>
> > configure:15453: checking for _mm_crc32_u8 and _mm_crc32_u32 with
> > CFLAGS=-msse4.2
> > configure:15475: cc -o conftest -Wall -Wmissing-prototypes
> -Wpointer-arith
> > -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
> > -Wformat-security -fno-strict-aliasing -fwrapv
> > -Wno-unused-command-line-argument -I/usr/local/include -msse4.2
> > -L/usr/local/lib  conftest.c -lz -lreadline -lcrypt -lm  >&5
> > configure:15475: $? = 0
> > configure:15484: result: yes
>
> Interesting.  So it looks like configure *did* decide that -msse4.2
> is needed.  What do you get from "grep sse4 src/Makefile.global" ?
> What I'd expect is
>
> CFLAGS_SSE42 = -msse4.2
> PG_CRC32C_OBJS = pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_choose.o
>
> If you see that, then the next question is why CFLAGS_SSE42 isn't
> getting propagated into the build of pg_crc32c_sse42.o, which would
> seem to suggest a problem with gmake.  What make version are you
> using?
>
> regards, tom lane
>


Re: Compile error while building postgresql 10.3

2018-03-19 Thread Andres Freund
On 2018-03-19 15:50:10 -0400, Tom Lane wrote:
> [ please keep the list cc'd ]
> 
> Terry Phelps  writes:
> > I can barely read a configure.in file, but here's what I think you're
> > asking for. If not, I'll try again:
> 
> > configure:15453: checking for _mm_crc32_u8 and _mm_crc32_u32 with
> > CFLAGS=-msse4.2
> > configure:15475: cc -o conftest -Wall -Wmissing-prototypes -Wpointer-arith
> > -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
> > -Wformat-security -fno-strict-aliasing -fwrapv
> > -Wno-unused-command-line-argument -I/usr/local/include -msse4.2
> > -L/usr/local/lib  conftest.c -lz -lreadline -lcrypt -lm  >&5
> > configure:15475: $? = 0
> > configure:15484: result: yes
> 
> Interesting.  So it looks like configure *did* decide that -msse4.2
> is needed.  What do you get from "grep sse4 src/Makefile.global" ?
> What I'd expect is
> 
> CFLAGS_SSE42 = -msse4.2
> PG_CRC32C_OBJS = pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_choose.o
> 
> If you see that, then the next question is why CFLAGS_SSE42 isn't
> getting propagated into the build of pg_crc32c_sse42.o, which would
> seem to suggest a problem with gmake.  What make version are you
> using?

There seems to be something sketchy afoot here, even outside of
CFLAGS_SSE42 itself. From the original email:

cc -I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include
-c -o pg_crc32c_sse42.o pg_crc32c_sse42.c

isn't this missing a number of important flags? Like at least
-fno-strict-aliasing -fwrapv -fexcess-precision=standard?

Greetings,

Andres Freund



Re: [HACKERS] why not parallel seq scan for slow functions

2018-03-19 Thread Robert Haas
On Sat, Mar 17, 2018 at 1:16 AM, Amit Kapila  wrote:
> Test-1
> --
> DO $$
> DECLARE count integer;
> BEGIN
> For count In 1..100 Loop
> Execute 'explain Select ten from tenk1';
> END LOOP;
> END;
> $$;
>
> In the above block, I am explaining the simple statement which will
> have just one path, so there will be one additional path projection
> and removal cycle for this statement.  I have just executed the above
> block in psql by having \timing option 'on' and the average timing for
> ten runs on HEAD is  21292.388 ms, with patches (0001.* ~ 0003) is
> 22405.2466 ms and with patches (0001.* ~ 0005.*) is 22537.1362.  These
> results indicate that there is approximately 5~6% of the increase in
> planning time.

Ugh.  I'm able to reproduce this, more or less -- with master, this
test took 42089.484 ms, 41935.849 ms, 42519.336 ms on my laptop, but
with 0001-0003 applied, 43925.959 ms, 43619.004 ms, 43648.426 ms.
However I have a feeling there's more going on here, because the
following patch on top of 0001-0003 made the time go back down to
42353.548, 41797.757 ms, 41891.194 ms.

diff --git a/src/backend/optimizer/plan/planner.c
b/src/backend/optimizer/plan/planner.c
index bf0b3e75ea..0542b3e40b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1947,12 +1947,19 @@ grouping_planner(PlannerInfo *root, bool
inheritance_update,
 {
 Path   *subpath = (Path *) lfirst(lc);
 Path   *path;
+Path   *path2;

 Assert(subpath->param_info == NULL);
-path = (Path *) create_projection_path(root, tlist_rel,
+path2 = (Path *) create_projection_path(root, tlist_rel,
subpath, scanjoin_target);
-add_path(tlist_rel, path);
+path = (Path *) apply_projection_to_path(root, tlist_rel,
+   subpath, scanjoin_target);
+if (path == path2)
+elog(ERROR, "oops");
+lfirst(lc) = path;
 }
+tlist_rel->pathlist = current_rel->pathlist;
+current_rel->pathlist = NIL;

 /*
  * If we can produce partial paths for the tlist rel, for possible use

It seems pretty obvious that creating an extra projection path that is
just thrown away can't "really" be making this faster, so there's
evidently some other effect here involving how the code is laid out,
or CPU cache effects, or, uh, something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-19 Thread Tomas Vondra
Hi Arthur,

I went through the patch - just skimming through the diffs, will do more
testing tomorrow. Here are a few initial comments.

1) max_shared_dictionaries_size / PGC_POSTMASTER

I'm not quite sure why the GUC is defined as PGC_POSTMASTER, i.e. why it
can't be changed after server start. That seems like a fairly useful
thing to do (e.g. increase the limit while the server is running), and
after looking at the code I think it shouldn't be difficult to change.

The other thing I'd suggest is handling "-1" as "no limit".
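
Something along these lines in guc.c's ConfigureNamesInt is what I have in
mind (just a sketch, not tested):

{
    {"max_shared_dictionaries_size", PGC_SIGHUP, RESOURCES_MEM,
        gettext_noop("Sets the maximum size of all shared text search dictionaries."),
        NULL,
        GUC_UNIT_KB
    },
    &max_shared_dictionaries_size,
    100 * 1024, -1, MAX_KILOBYTES,
    NULL, NULL, NULL
},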


2) max_shared_dictionaries_size / size of number

Some of the comments dealing with the GUC treat it as a number of
dictionaries (instead of a size). I suppose that's due to how the
original patch was implemented.


3) Assert(max_shared_dictionaries_size);

I'd say that assert is not very clear - it should be

Assert(max_shared_dictionaries_size > 0);

or something along the lines. It's also a good idea to add a comment
explaining the assert, say

/* we can only get here when shared dictionaries are enabled */
Assert(max_shared_dictionaries_size > 0);

4) I took the liberty of rewording some of the docs/comments. See the
attached diffs, that should apply on top of 0003 and 0004 patches.
Please, treat those as mere suggestions.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6862d5e..6747fe2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1433,23 +1433,23 @@ include_dir 'conf.d'
   
   

-Sets the maximum size of all text search dictionaries loaded into shared
-memory.  The default is 100 megabytes (100MB).  This
-parameter can only be set at server start.
+Specifies the amount of shared memory to be used to store full-text search
+search dictionaries.  The default is 100 megabytes (100MB).
+This parameter can only be set at server start.

 

-Currently controls only loading of Ispell
-dictionaries (see ).
-After compiling the dictionary it will be copied into shared memory.
-Another backends on first use of the dictionary will use it from shared
-memory, so it doesn't need to compile the dictionary second time.
+Currently only Ispell dictionaries (see
+) may be loaded into
+shared memory. The first backend requesting the dictionary will
+build it and copy it into shared memory, so that other backends can
+reuse it without having to build it again.

 

-If total size of simultaneously loaded dictionaries reaches the maximum
-allowed size then a new dictionary will be loaded into local memory of
-a backend.
+If the size of simultaneously loaded dictionaries reaches the maximum
+allowed size, additional dictionares will be loaded into private backend
+memory (effectively disabling the sharing).

   
  
diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c
index bfc5292..22d58a0 100644
--- a/src/backend/tsearch/ts_shared.c
+++ b/src/backend/tsearch/ts_shared.c
@@ -22,7 +22,7 @@
 
 
 /*
- * Hash table structures
+ * Hash table entries representing shared dictionaries.
  */
 typedef struct
 {
@@ -37,7 +37,8 @@ typedef struct
 static dshash_table *dict_table = NULL;
 
 /*
- * Shared struct for locking
+ * Information about the main shmem segment, used to coordinate
+ * access to the hash table and dictionaries.
  */
 typedef struct
 {
@@ -53,8 +54,8 @@ typedef struct
 static TsearchCtlData *tsearch_ctl;
 
 /*
- * GUC variable for maximum number of shared dictionaries. Default value is
- * 100MB.
+ * Maximum allowed amount of shared memory for shared dictionaries,
+ * in kilobytes. Default value is 100MB.
  */
 int			max_shared_dictionaries_size = 100 * 1024;
 
@@ -213,7 +202,7 @@ ts_dict_shmem_location(DictInitData *initoptions,
 
 /*
  * Release memory occupied by the dictionary. Function just unpins DSM mapping.
- * If nobody else hasn't mapping to this DSM then unping DSM segment.
+ * If nobody else hasn't mapping to this DSM then unpin DSM segment.
  *
  * dictid: Oid of the dictionary.
  */
@@ -312,6 +301,7 @@ init_dict_table(void)
 	MemoryContext old_context;
 	dsa_area   *dsa;
 
+	/* bail out if shared dictionaries not allowed */
 	if (max_shared_dictionaries_size == 0)
 		return;
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 172627a..b10ec48 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2939,7 +2939,7 @@ static struct config_int ConfigureNamesInt[] =
 			gettext_noop("Currently controls only loading of Ispell dictionaries. "
 		 "If total size of simultaneously loaded dictionaries "
 		 "reaches the maximum allowed size then a new dictionary "
-		

Re: Compile error while building postgresql 10.3

2018-03-19 Thread Tom Lane
[ please keep the list cc'd ]

Terry Phelps  writes:
> I can barely read a configure.in file, but here's what I think you're
> asking for. If not, I'll try again:

> configure:15453: checking for _mm_crc32_u8 and _mm_crc32_u32 with
> CFLAGS=-msse4.2
> configure:15475: cc -o conftest -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
> -Wformat-security -fno-strict-aliasing -fwrapv
> -Wno-unused-command-line-argument -I/usr/local/include -msse4.2
> -L/usr/local/lib  conftest.c -lz -lreadline -lcrypt -lm  >&5
> configure:15475: $? = 0
> configure:15484: result: yes

Interesting.  So it looks like configure *did* decide that -msse4.2
is needed.  What do you get from "grep sse4 src/Makefile.global" ?
What I'd expect is

CFLAGS_SSE42 = -msse4.2
PG_CRC32C_OBJS = pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_choose.o

If you see that, then the next question is why CFLAGS_SSE42 isn't
getting propagated into the build of pg_crc32c_sse42.o, which would
seem to suggest a problem with gmake.  What make version are you
using?

regards, tom lane



Re: Compile error while building postgresql 10.3

2018-03-19 Thread Tom Lane
Terry Phelps  writes:
> Just for fun, I am trying to build postgres from source on FreeBSD 11. Yes,
> I know I don't need to, and I have already installed the 10.3 server and
> client packages, and they run fine. I did a 'git clone' today, and have
> hours-old source code.

FWIW, development-code issues generally belong on -hackers, so forwarding
this there.

> So, I install the prereqs, and did a 'configure' and 'make', and expected
> it to 'just work'. However, I get this compile error:

> cc -I/usr/local/include -I../../src/port -DFRONTEND -I../../src/include
> -c -o pg_crc32c_sse42.o pg_crc32c_sse42.c
> pg_crc32c_sse42.c:37:18: error: always_inline function '_mm_crc32_u64'
> requires
>   target feature 'ssse3', but would be inlined into function
>   'pg_comp_crc32c_sse42' that is compiled without support for 'ssse3'
> crc = (uint32) _mm_crc32_u64(crc, *((const uint64 *) p));
>^
> pg_crc32c_sse42.c:44:9: error: always_inline function '_mm_crc32_u32'
> requires
>   target feature 'ssse3', but would be inlined into function
>   'pg_comp_crc32c_sse42' that is compiled without support for 'ssse3'
> crc = _mm_crc32_u32(crc, *((const unsigned int *) p));
>   ^
> pg_crc32c_sse42.c:63:9: error: always_inline function '_mm_crc32_u8'
> requires
>   target feature 'ssse3', but would be inlined into function
>   'pg_comp_crc32c_sse42' that is compiled without support for 'ssse3'
> crc = _mm_crc32_u8(crc, *p);
>   ^
> 3 errors generated.
> 
> I googled, and search the archives, and don't see anything applicable. My C
> compiler is:
> $ cc --version
> FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM
> 4.0.0)
> Target: x86_64-unknown-freebsd11.1
> Thread model: posix

Huh.  Apparently that compiler is stricter about the use of "ssse3" than
anything we've tested this code on before.

It's interesting that your "cc" line for pg_crc32c_sse42.c shows no sign
of any added options for SSE4.2 compiler intrinsics, which seems to be
what's needed here.  The configure script is supposed to detect whether
such options are needed, but it looks like it failed to do so correctly.
Can you look into config.log and see what happened corresponding to this
bit of configure.in?

# Check for Intel SSE 4.2 intrinsics to do CRC calculations.
#
# First check if the _mm_crc32_u8 and _mm_crc32_u64 intrinsics can be used
# with the default compiler flags. If not, check if adding the -msse4.2
# flag helps. CFLAGS_SSE42 is set to -msse4.2 if that's required.
PGAC_SSE42_CRC32_INTRINSICS([])
if test x"$pgac_sse42_crc32_intrinsics" != x"yes"; then
  PGAC_SSE42_CRC32_INTRINSICS([-msse4.2])
fi
AC_SUBST(CFLAGS_SSE42)

regards, tom lane



Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-19 Thread Arthur Zakirov
On Mon, Mar 19, 2018 at 07:40:54PM +0100, Tomas Vondra wrote:
> 
> 
> On 03/19/2018 07:07 PM, Andres Freund wrote:
> > You've to manually configure a setting that can only be set at server
> > start.  You can't set it as big as necessary because it might use up
> > memory better used for other things.  It needs the full space for
> > dictionaries even if the majority of it never will be needed.  All of
> > those aren't needed in an mmap world.
> > 
> 
> Which is not quite true, because that's not what the patch does.
> 
> Each dictionary is loaded into a separate dsm segment when needed, which
> is then stored in a dhash table. So most of what you wrote is not really
> true - the patch does not pre-allocate the space, and the setting might
> be set even after server start (it's not defined like that currently,
> but that should be trivial to change).

Oh, it's true. I had plans to fix it, but somehow I forgot to allow changing
the max_shared_dictionaries_size GUC via pg_reload_conf(). I'll fix it and
send a new version of the patch.

> > To me it seems we'll end up needing a heck of a lot more code that
> > the OS already implements if we do it ourselves.
> > 
> 
> Like what? Which features do you expect to need much more code?
> 
> The automated reloading will need a fairly small amount of code - the
> main issue is deciding when to reload, and as I mentioned before that's
> more complicated than you seem to believe. In fact, it may not even be
> possible - there's no way to decide if all files are already updated.
> Currently we kinda ignore that, on the assumption that dictionaries
> change only rarely. We may do the same thing and reload the dict if at
> least one file changes. In any case, the amount of code is trivial.
> 
> In fact, it may be more complicated in the mmap case - how do you update
> a dictionary that is already mapped to multiple processes?
> 
> The eviction is harder - I'll give you that. But then again, I'm not
> sure the mmap approach is really what we want here - it seems better to
> evict the whole dictionary, than some random pages from many of them.

Agreed. The mmap approach requires the same code, plus code to handle the
cache files which will be mapped into memory. With the mmap approach we need
to solve the same issues we face now, and more. Also, we need to somehow
automatically reload dictionaries in both cases.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: IndexJoin memory problem using spgist and boxes

2018-03-19 Thread Alexander Kuzmenkov
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

The updated version looks good to me. make installcheck under valgrind finds no 
errors. I also tried the T1 and T5 datasets, here are the timings and memory 
usage:
          T1            T5
Vanilla   2G/24.2 s     over 7G/didn't wait
Patch v2  144M/23.0 s   148M/108 s

The new status of this patch is: Ready for Committer


Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-19 Thread Tomas Vondra


On 03/19/2018 07:07 PM, Andres Freund wrote:
> On 2018-03-19 14:52:34 +0100, Tomas Vondra wrote:
>> On 03/19/2018 02:34 AM, Andres Freund wrote:
>>> Hi,
>>>
>>> On 2018-03-19 01:52:41 +0100, Tomas Vondra wrote:
 I do agree with that. We have a working well-understood dsm-based
 solution, addressing the goals initially explained in this thread.
>>>
>>> Well, it's also awkward and manual to use. I do think that's
>>> something we've to pay attention to.
>>>
>>
>> Awkward in what sense?
> 
> You've to manually configure a setting that can only be set at server
> start.  You can't set it as big as necessary because it might use up
> memory better used for other things.  It needs the full space for
> dictionaries even if the majority of it never will be needed.  All of
> those aren't needed in an mmap world.
> 

Which is not quite true, because that's not what the patch does.

Each dictionary is loaded into a separate dsm segment when needed, which
is then stored in a dhash table. So most of what you wrote is not really
true - the patch does not pre-allocate the space, and the setting might
be set even after server start (it's not defined like that currently,
but that should be trivial to change).

> 
>> So, I'm not at all convinced the mmap approach is actually better
>> than the dsm one. And I believe that if we come up with a good way
>> to automate some of the tasks, I don't see why that would be
>> possible with mmap and not with dsm.
> 
> To me it seems we'll end up needing a heck of a lot more code that
> the OS already implements if we do it ourselves.
> 

Like what? Which features do you expect to need much more code?

The automated reloading will need a fairly small amount of code - the
main issue is deciding when to reload, and as I mentioned before that's
more complicated than you seem to believe. In fact, it may not even be
possible - there's no way to decide if all files are already updated.
Currently we kinda ignore that, on the assumption that dictionaries
change only rarely. We may do the same thing and reload the dict if at
least one file changes. In any case, the amount of code is trivial.
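
For illustration, a rough sketch of the "reload if at least one of the
files changed" rule (the function and everything around it is hypothetical,
not code from the patch):

#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Return true if any of the files backing an Ispell dictionary (dict,
 * affix, stopwords) is newer than the timestamp recorded when the shared
 * copy was built.  A failed stat() is treated as "changed" so that the
 * next lookup attempts a rebuild.
 */
static bool
dict_files_changed(const char *dictfile, const char *afffile,
                   const char *stopfile, time_t loaded_mtime)
{
    const char *files[] = {dictfile, afffile, stopfile};
    int         i;

    for (i = 0; i < 3; i++)
    {
        struct stat st;

        if (files[i] == NULL)
            continue;           /* e.g. no stopword file configured */
        if (stat(files[i], &st) != 0)
            return true;        /* missing or unreadable: force reload */
        if (st.st_mtime > loaded_mtime)
            return true;        /* at least one file is newer */
    }
    return false;
}

As noted above, even this simple rule cannot tell whether *all* of the
files have already been replaced, only that something changed.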

In fact, it may be more complicated in the mmap case - how do you update
a dictionary that is already mapped to multiple processes?

The eviction is harder - I'll give you that. But then again, I'm not
sure the mmap approach is really what we want here - it seems better to
evict the whole dictionary, than some random pages from many of them.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-19 Thread Andres Freund
On 2018-03-19 14:52:34 +0100, Tomas Vondra wrote:
> On 03/19/2018 02:34 AM, Andres Freund wrote:
> > Hi,
> > 
> > On 2018-03-19 01:52:41 +0100, Tomas Vondra wrote:
> >> I do agree with that. We have a working well-understood dsm-based
> >> solution, addressing the goals initially explained in this thread.
> > 
> > Well, it's also awkward and manual to use. I do think that's
> > something we've to pay attention to.
> > 
> 
> Awkward in what sense?

You've to manually configure a setting that can only be set at server
start.  You can't set it as big as necessary because it might use up
memory better used for other things.  It needs the full space for
dictionaries even if the majority of it never will be needed.  All of
those aren't needed in an mmap world.


> So, I'm not at all convinced the mmap approach is actually better than
> the dsm one. And I believe that if we come up with a good way to
> automate some of the tasks, I don't see why that would be possible
> with mmap and not with dsm.

To me it seems we'll end up needing a heck of a lot more code that the
OS already implements if we do it ourselves.

Greetings,

Andres Freund



Re: [HACKERS] Partition-wise aggregation/grouping

2018-03-19 Thread Robert Haas
On Fri, Mar 16, 2018 at 1:50 PM, Ashutosh Bapat
 wrote:
>> This patch also renames can_parallel_agg to
>> can_partial_agg and removes the parallelism-specific bits from it.
>
> I think we need to update the comments in this function to use phrase
> "partial aggregation" instead of "parallel aggregation". And I think
> we need to change the conditions as well. For example if
> parse->groupClause == NIL, why can't we do partial aggregation? This
> is the classical case when we will need partial aggregation. Probably
> we should test this with Jeevan's patches for partition-wise aggregate
> to see if it considers partition-wise aggregate or not.

I think the case where we have neither any aggregates nor a grouping
clause is where we are doing SELECT DISTINCT.  Something like SELECT
COUNT(*) FROM ... is not this case because that has an aggregate.

I'm sort of on the fence as to whether and how to update the comments.
I agree that it's a little strange to leave the comments here
referencing parallel aggregation when the function has been renamed to
is_partial_agg(), but a simple search-and-replace doesn't necessarily
improve the situation very much.  Most notably, hasNonSerial is
irrelevant for partial but non-parallel aggregation, but we still have
the test because we haven't done the work to really do the right thing
here, which is to separately track whether we can do parallel partial
aggregation (either hasNonPartial or hasNonSerial is a problem) and
non-parallel partial aggregation (only hasNonPartial is a problem).
This needs a deeper reassessment, but I don't think that can or should
be something we try to do right now.

> OR When parse->groupingSets is true, I can see why we can't use
> parallel query, but we can still compute partial aggregates. This
> condition doesn't hurt since partition-wise aggregation bails out when
> there are grouping sets, so it's not that harmful here.

I haven't thought deeply about what will break when GROUPING SETS are
in use, but it's not the purpose of this patch set to make them work
where they didn't before.  The point of hoisting the first two tests
out of this function was just to avoid doing repeated work when
partition-wise aggregate is in use.  Those two tests could conceivably
produce different results for different children, whereas the
remaining tests won't give different answers.  Let's not get
distracted by the prospect of improving the tests.  I suspect that's
not anyway so simple to achieve as what you seem to be speculating
here...

>> I am sort of unclear whether we need/want GroupPathExtraData at all.
>> What determines whether something gets passed via GroupPathExtraData
>> or just as a separate argument?  If we have a rule that stuff that is
>> common to all child grouped rels goes in there and other stuff
>> doesn't, or stuff postgres_fdw needs goes in there and other stuff
>> doesn't, then that might be OK.  But I'm not sure that there is such a
>> rule in the v20 patches.
>
> We have a single FDW hook for all the upper relations and that hook
> can not accept grouping specific arguments. Either we need a separate
> FDW hook for grouping OR we need some way of passing upper relation
> specific information down to an FDW. I think some FDWs and extensions
> will be happy if we provide them readymade decisions for can_sort,
> can_hash, can_partial_agg etc. It will be good if they don't have to
> translate the grouping target and havingQual for every child twice,
> once for core and second time in the FDW. In all it looks like we need
> some structure to hold that information so that we can pass it down
> the hook. I am fine with two structures one variable and other
> invariable. An upper operation can have one of them or both.

I'm fine with using a structure to bundle details that need to be sent
to the FDW, but why should the FDW need to know about
can_sort/can_hash?  I suppose if it needs to know about
can_partial_agg then it doesn't really cost anything to pass through
all the flags, but I doubt whether the FDW has any use for the others.
Anyway, if that's the goal, let's just make sure that the set of
things we're passing that way are exactly the set of things that we
think the FDW might need.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Partition-wise aggregation/grouping

2018-03-19 Thread Robert Haas
On Fri, Mar 16, 2018 at 1:50 PM, Ashutosh Bapat
 wrote:
> Ok. That looks good.

Here's an updated version.  In this version, based on a voice
discussion with Ashutosh and Jeevan, I adjusted 0001 to combine it
with an earlier idea of splitting Gather/Gather Merge path generation
out of the function that creates partially aggregated paths.  The idea
here is that create_ordinary_gather_paths() could first call
create_partial_grouping_paths(), then add additional paths which might
be partial or non-partial by invoking the partition-wise aggregate
logic, then call gather_grouping_paths() and set_cheapest() to
finalize the partially grouped rel.  Also, I added draft commit
messages.

With this patch set applied, the key bit of logic in
create_ordinary_grouping_paths() ends up looking like this:

if (grouped_rel->consider_parallel && input_rel->partial_pathlist != NIL
&& (flags & GROUPING_CAN_PARTIAL_AGG) != 0)
{
partially_grouped_rel =
create_partial_grouping_paths(root,
  grouped_rel,
  input_rel,
  gd,
  can_sort,
  can_hash,
  &agg_final_costs);
gather_grouping_paths(root, partially_grouped_rel);
set_cheapest(partially_grouped_rel);
}

I imagine that what the main partition-wise aggregate patch would do
is (1) change the conditions under which
create_partial_grouping_paths() gets called, (2) postpone
gather_grouping_paths() and set_cheapest() until after partition-wise
aggregate had been done, doing them only if partially_grouped_rel !=
NULL.  Partition-wise aggregate will need to happen before
add_paths_to_grouping_rel(), though, so that the latter function can
try a FinalizeAggregate node on top of an Append added by
partition-wise aggregate.

This is a bit strange, because it will mean that partition-wise
aggregate will be attempted BEFORE adding ordinary aggregate paths to
grouped_rel but AFTER adding them to partially_grouped_rel.  We could
fix that by splitting add_paths_to_grouping_rel() into two functions,
one of which performs full aggregation directly and the other of which
tries finishing partial aggregation.  I'm unsure that's a good idea
though: it would mean that we have very similar logic in two different
functions that could get out of sync as a result of future code
changes, and it's not really fixing any problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


0003-Don-t-pass-the-grouping-target-around-unnecessarily.patch
Description: Binary data


0002-Determine-grouping-strategies-in-create_grouping_pat.patch
Description: Binary data


0001-Defer-creation-of-partially-grouped-relation-until-i.patch
Description: Binary data


Re: MCV lists for highly skewed distributions

2018-03-19 Thread John Naylor
On 3/19/18, Dean Rasheed  wrote:
> As promised, here is a new patch, with comment updates, per John and
> Tomas' suggestions, plus the continuity correction, which seemed just
> about worthwhile.

Great. I'm happy with the behavior of the patch. I've marked it ready
for committer.

> I repeated these tests with a 2-SD continuity-corrected threshold and
> the results fell somewhere between the 2-SD and 2.5-SD cases. My
> inclination is to go with that, as something that strikes a reasonable
> balance, and represents a distinct improvement over master in a number
> of different areas.
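
(For reference, the general shape of such a threshold -- not necessarily the
exact formula used in the patch: if n rows are sampled from a table of N
rows and a value has frequency p, its sample count K is hypergeometric with

    E[K] = n\,p, \qquad \mathrm{Var}(K) = n\,p\,(1-p)\,\frac{N-n}{N-1}.

A 2-SD threshold with continuity correction then keeps a value in the MCV
list roughly when

    K - \tfrac{1}{2} \;>\; n\,\bar{p} + 2\,\sqrt{n\,\bar{p}\,(1-\bar{p})\,\frac{N-n}{N-1}},

where \bar{p} is the frequency an average non-MCV value would have.  Raising
the multiplier from 2 to 2.5 makes the test stricter, i.e. produces shorter
MCV lists, which matches the column trend in the table below.)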

I also ran some tests with a hand-edited continuity correction last
night, but just now got around to looking at the results (queries
attached). I ran largely as before, but did 20 analyze runs with each
file instead of 10, using the V2 patch, the CC patch, and the V2 patch
with 2.5 threshold. I took out a bunch of the uniform tests to save
time since they largely behave the same.

-
Non-uniform:

To reduce noise for analysis, I tried filtering out the least and
greatest distinct value before taking the average for the
underestimation ratio for most common non-MCV. I also removed results
where all 3 patches had a full MCV list. There was still enough noise
that it's impossible to discern differences between patches that are
very similar. It's not like against HEAD where there were clear
differences in this test, so I won't post those numbers.

Looking at the number of MCVs, it's a little clearer. With few
exceptions, the number either stays the same or decreases slightly
going left to right:

 title                                    | v20_num_mcvs | cc05_num_mcvs | v25_num_mcvs
------------------------------------------+--------------+---------------+--------------
 Exp. dist. (N=500k, sd=0.25 (beta))      |            3 |             3 |            3
 Exp. dist. (N=500k, sd=0.50 (beta))      |            5 |             6 |            5
 Exp. dist. (N=500k, sd=1.00 (beta))      |            9 |             9 |            9
 Exp. dist. (N=500k, sd=2.00 (beta))      |           15 |            15 |           15
 Exp. dist. (N=500k, sd=3.00 (beta))      |           22 |            21 |           21
 Exp. dist. (N=500k, sd=4.00 (beta))      |           27 |            27 |           26
 Exp. dist. (N=500k, sd=5.00 (beta))      |           33 |            32 |           30
 Exp. dist. (N=500k, sd=10.00 (beta))     |           60 |            58 |           56
 Exp. dist. (N=500k, sd=20.00 (beta))     |          100 |            99 |           97
 Laplace dist. (N=500k, b=0.25 (lambda))  |            5 |             6 |            5
 Laplace dist. (N=500k, b=0.50 (lambda))  |            9 |             8 |            7
 Laplace dist. (N=500k, b=1.00 (lambda))  |           15 |            15 |           15
 Laplace dist. (N=500k, b=2.00 (lambda))  |           27 |            26 |           26
 Laplace dist. (N=500k, b=3.00 (lambda))  |           38 |            37 |           36
 Laplace dist. (N=500k, b=4.00 (lambda))  |           48 |            47 |           45
 Laplace dist. (N=500k, b=5.00 (lambda))  |           58 |            57 |           55
 Laplace dist. (N=500k, b=10.00 (lambda)) |          100 |            99 |           97
 MNM (N=2000, peaks=20, sep=20, sd=15)    |          100 |           100 |           62
 Normal dist. (N=500k, sd=1)              |            7 |             7 |            7
 Normal dist. (N=500k, sd=2)              |           15 |            15 |           14
 Normal dist. (N=500k, sd=3)              |           21 |            21 |           20
 Normal dist. (N=500k, sd=4)              |           28 |            26 |           26
 Normal dist. (N=500k, sd=5)              |           33 |            33 |           33
 Normal dist. (N=500k, sd=6)              |           39 |            38 |           38
 Normal dist. (N=500k, sd=7)              |           45 |            45 |           44
 Normal dist. (N=500k, sd=8)              |           51 |            49 |           49
 Normal dist. (N=500k, sd=9)              |           56 |            56 |           54
 Normal dist. (N=500k, sd=10)             |           63 |            62 |           60
 Normal dist. (N=500k, sd=15)             |           89 |            89 |           86
 Pareto dist. (N=500, a=1.00)             |           70 |            65 |           56
 Pareto dist. (N=500, a=2.00)             |           20 |            19 |           17
 Pareto dist. (N=500, a=3.00)             |           10 |             9 |            9
 Pareto dist. (N=500, a=4.00)             |            6 |             6 |            6
 Pareto dist. (N=500, a=5.00)             |            5 |             5 |            4
(34 rows)


-
Uniform:

Ideally, we want an empty MCV list unless we're sampling a large
portion of the table. As Dean mentioned, this is not as important as
getting the non-uniform case right. That said, it's nice to be
confident we can keep out flotsam. The number generally gets smaller
as we go left to right.

 title                                 | v20_num_mcvs | cc05_num_mcvs | v25_num_mcvs
---------------------------------------+--------------+---------------+--------------
 Corr. unif. dist. (N=1000k, Nd=1000)  |           13 |             9 |            2
 Corr. uni

Re: Error detail/hint style fixup

2018-03-19 Thread Tom Lane
Daniel Gustafsson  writes:
> Attached patch ensures that (i) details and hints have leading capitalization,
> have double spaces after punctuation and ends with period; (ii) context should
> not be capitalized and should not end with period; (iii) test .out files match
> the changes.

+1 for cleaning this up, but I wonder if you've gone far enough in
adjusting the style of errcontext() messages.  The style guide says
"Context strings should normally not be complete sentences", with
the expectation that we're typically going to concatenate several
of them into what amounts to a stack trace.  And while the guide
doesn't say in so many words "describe the context in which the
error occurred", that's surely what you're supposed to do.
So I'm thinking that, eg,

- errcontext("Error occurred on dblink connection named \"%s\": %s.",
+ errcontext("error occurred on dblink connection named \"%s\": %s",
 dblink_context_conname, dblink_context_msg)));

is not getting the job done; at least the "error occurred" part is simply
redundant given that this is a context string.  Looking at the actual uses
of this, eg

 NOTICE:  relation "foobar" does not exist
-CONTEXT:  Error occurred on dblink connection named "unnamed": could not open 
cursor.
+CONTEXT:  error occurred on dblink connection named "unnamed": could not open 
cursor

I'm thinking what we should actually be printing is more like

CONTEXT:  while opening cursor on dblink connection named "unnamed"

That'd require fixing the callers of this subroutine too, but maybe
it's worth doing.
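
For illustration only, the reworded context might be produced by having the
callers pass a short action phrase (the variable names here are made up, not
the actual dblink code):

 errcontext("while %s on dblink connection named \"%s\"",
            dblink_context_action, dblink_context_conname)

so that each call site supplies "opening cursor", "executing query", etc.,
and the connection-level subroutine no longer needs a canned sentence.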

regards, tom lane



Re: Problems with Error Messages wrt Domains, Checks

2018-03-19 Thread David G. Johnston
On Mon, Mar 19, 2018 at 9:11 AM, Tom Lane  wrote:

> "David G. Johnston"  writes:
> > On Mon, Mar 19, 2018 at 8:33 AM, Tom Lane  wrote:
> >> But I wonder if we wouldn't be better off to put the regex into a
> >> detail line, ie
> >>  errmsg("invalid regular expression: %s", ...),
> >>  errdetail("Regular expression is \"%s\".", ...),
>
> > I'd consider at least supplying the first 30 or so characters (or maybe
> up
> > to the first newline, whichever is shorter) in the main message and then
> > the entire regex in the detail line.
>
> That seems like a lot more complication than this is worth, and it'd be
> confusing to users as well, if things are formatted differently for short
> and long regexes.
> ​
>

They would be formatted the same every time - just need to remember to add
the extra function call around the expression variable in the errmsg case.
Definitely not married to the idea though.

Saying "begins with" and then showing the entire regex shouldn't cause too
much confusion I'd hope.


> I thought
> about trying to discourage deviations by using common error-reporting
> subroutines, but there are enough legit differences in what needs to
> be done that that might not work well.
>

​Two support functions might suffice:​

present_regexp_for_errmsg(regex_var)
present_regexp_for_detailmsg(regex_var)

We can at least centralize how the expression itself is string-ified.
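
A minimal sketch of the errmsg-side helper (hypothetical; pnstrdup() is the
existing backend helper for length-limited copies, and a real version would
also have to respect multibyte boundaries, e.g. with pg_mbcliplen()):

#define REGEXP_ERRMSG_MAXLEN 30

/*
 * Return a palloc'd prefix of the regex for use in the primary error
 * message: the first 30 characters or up to the first newline,
 * whichever comes first.
 */
static char *
present_regexp_for_errmsg(const char *regex)
{
    int         len = 0;

    while (regex[len] != '\0' && regex[len] != '\n' &&
           len < REGEXP_ERRMSG_MAXLEN)
        len++;
    return pnstrdup(regex, len);
}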

David J.


Re: Problems with Error Messages wrt Domains, Checks

2018-03-19 Thread Tom Lane
"David G. Johnston"  writes:
> On Mon, Mar 19, 2018 at 8:33 AM, Tom Lane  wrote:
>> But I wonder if we wouldn't be better off to put the regex into a
>> detail line, ie
>>  errmsg("invalid regular expression: %s", ...),
>>  errdetail("Regular expression is \"%s\".", ...),

> I'd consider at least supplying the first 30 or so characters (or maybe up
> to the first newline, whichever is shorter) in the main message and then
> the entire regex in the detail line.

That seems like a lot more complication than this is worth, and it'd be
confusing to users as well, if things are formatted differently for short
and long regexes.

Also, by my count there are at least eight places in the code that
need to emit messages like this; if we turn formatting the messages into
a major production, people will take shortcuts.  Some already have, eg
spell.c is failing to report pg_regexec failures at all.  I thought
about trying to discourage deviations by using common error-reporting
subroutines, but there are enough legit differences in what needs to
be done that that might not work well.

regards, tom lane



Re: parallel append vs. simple UNION ALL

2018-03-19 Thread Robert Haas
On Fri, Mar 16, 2018 at 7:35 AM, Rajkumar Raghuwanshi
 wrote:
> On Fri, Mar 9, 2018 at 1:04 AM, Robert Haas  wrote:
>> Great.  Committed 0001.  Are you planning any further testing of this
>> patch series?
>
> Sorry I missed the mail.
> Yes, I have further tested patches and find no more issues.

OK, thanks to both you and Ashutosh Bapat.  Committed 0002 and 0003.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



configure's checks for --enable-tap-tests are insufficient

2018-03-19 Thread Tom Lane
So I just went through a rather annoying process trying to run
check-world on a Fedora 26 box where I hadn't done it before:

$ ./configure --enable-tap-tests
... fails, whining about IPC::Run
$ sudo yum install perl-IPC-Run
$ ./configure --enable-tap-tests
... fails, whining about prove
... figure out where Fedora hides prove
$ sudo yum install perl-Test-Harness
$ ./configure --enable-tap-tests
... works this time
... make, then
$ make -j check-world
... runs for awhile and then fails
... much excavation finds a complaint about Test::More not being installed
... figure out where Fedora hides Test::More
$ sudo yum install perl-Test-Simple
$ make -j check-world
... runs for awhile and then fails
... much excavation finds a complaint about Time::HiRes not being installed
$ sudo yum install perl-Time-HiRes
$ make -j check-world
... finally, it works!

Now, perhaps this is an indictment of Red Hat's choices about how to
package-ize the Perl world and which packages belong in a default
workstation installation.  But we haven't made it any easier.

I'm thinking that at minimum we ought to make configure --enable-tap-tests
check for Test::More and Time::HiRes as well as IPC::Run.  It might be
useful for somebody to repeat this exercise on a minimal Debian-oid
installation and see if there's a different set of pain points.

regards, tom lane



Re: Problems with Error Messages wrt Domains, Checks

2018-03-19 Thread David G. Johnston
On Mon, Mar 19, 2018 at 8:33 AM, Tom Lane  wrote:

> "David G. Johnston"  writes:
> > Frankly, I'm not seeing "invalid constant regular expressions" as being a
> > large scale problem - but I'll agree that having the error include the
> > actual literal being parsed as a RegEx should be done.
>
> Agreed.  Doing anything about the other stuff discussed in this thread is
> fairly large-scale work, but adjusting our regex error messages is easy.
> ​[...]
>
> But I wonder if we wouldn't be better off to put the regex into a
> detail line, ie
>
>  errmsg("invalid regular expression: %s", ...),
>  errdetail("Regular expression is \"%s\".", ...),
>
> The reason I'm concerned about that is I've seen some pretty hairy
> regexes, way longer than are reasonable to include in a primary
> error message.  The one-line way is nice for short regexes, but
> it could get out of hand.


​I write many of those - albeit less so when working with database embedded
expressions as opposed to application-layer.

I'd consider at least supplying the first 30 or so characters (or maybe up
to the first newline, whichever is shorter) in the main message and then
the entire regex in the detail line.

invalid regular expression starting with: %s

I think having the typical regex in the message will aid users that use
client interfaces that don't propagate error detail by default - and it
should be sufficient to narrow down, if not pinpoint, the offending regex
in most cases.  And, faced with multiple, the user can add a leading
comment to the regexp to help narrow down the options if necessary.

David J.


Error detail/hint style fixup

2018-03-19 Thread Daniel Gustafsson
When fixing review comments for error message style on another patch, I noticed
that there were a few error details/hints/contexts that weren’t following the
style guide in the documentation.  This might be intentional from when they
were added, or we intentionally avoid changing after the fact to not impact
translations too much, or we just don’t care (didn’t find any past mention in
the archives though), but my OCD cannot unsee it now so I figured I'd send it
and see.

Attached patch ensures that (i) details and hints have leading
capitalization, double spaces after punctuation, and end with a period;
(ii) context messages are not capitalized and do not end with a period;
(iii) test .out files match the changes.
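
As a concrete (made-up) example of a message that follows those rules:

    ereport(ERROR,
            (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
             errmsg("could not frobnicate relation"),
             errdetail("The relation has not been initialized.  Initialization requires a lock."),
             errhint("Run the initialization function first."),
             errcontext("while frobnicating relation \"%s\"", relname)));

Here the detail and hint are capitalized complete sentences ending in a
period, with two spaces after sentence-ending punctuation, while the context
string stays lower-case and unpunctuated; relname is just a placeholder.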

cheers ./daniel



errdetail_hint_style.patch
Description: Binary data


Re: inserts into partitioned table may cause crash

2018-03-19 Thread Robert Haas
On Mon, Mar 19, 2018 at 9:38 AM, Alvaro Herrera  wrote:
> Etsuro Fujita wrote:
>> Thanks for the updated patches!  I think the patches are in good shape, but
>> I did a bit of editorial things; added a bit more comments for
>> ExecPrepareTupleRouting and adjusted regression test stuff to match the
>> existing ones.  Attached are the updated patches for HEAD and PG10. Those
>> changes are just editorial, so let's ask for the committer review.
>
> Thanks, I'll give these a look now.

Thanks!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Problems with Error Messages wrt Domains, Checks

2018-03-19 Thread Tom Lane
"David G. Johnston"  writes:
> Frankly, I'm not seeing "invalid constant regular expressions" as being a
> large scale problem - but I'll agree that having the error include the
> actual literal being parsed as a RegEx should be done.

Agreed.  Doing anything about the other stuff discussed in this thread is
fairly large-scale work, but adjusting our regex error messages is easy.

At least, it is if we can get consensus on what they should look like :-).
There's at least one place that already includes the regex proper in
its error, in hba.c:

ereport(LOG,
(errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
 errmsg("invalid regular expression \"%s\": %s",
parsedline->ident_user + 1, errstr)));

But I wonder if we wouldn't be better off to put the regex into a
detail line, ie

 errmsg("invalid regular expression: %s", ...),
 errdetail("Regular expression is \"%s\".", ...),

The reason I'm concerned about that is I've seen some pretty hairy
regexes, way longer than are reasonable to include in a primary
error message.  The one-line way is nice for short regexes, but
it could get out of hand.  Also, for the principal use-cases in regexp.c,
we could avoid changing the primary message text from what it is now.
That might prevent some unnecessary client breakage (not that people
are supposed to be testing message string contents, but ...)
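
Concretely, the two-part style could end up looking like this (a sketch;
"regex_str" stands for whatever variable holds the offending pattern and
"errstr" for the text returned by pg_regerror()):

    ereport(ERROR,
            (errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
             errmsg("invalid regular expression: %s", errstr),
             errdetail("Regular expression is \"%s\".", regex_str)));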

Thoughts?

regards, tom lane



Re: Faster inserts with mostly-monotonically increasing values

2018-03-19 Thread Claudio Freire
On Mon, Mar 19, 2018 at 11:57 AM, Pavan Deolasee
 wrote:
>
>
> On Thu, Mar 15, 2018 at 7:51 AM, Claudio Freire 
> wrote:
>>
>>
>>
>> I applied the attached patches on top of your patch, so it would be
>> nice to see if you can give it a try in your test hardware to see
>> whether it helps.
>
>
> I can confirm that I no longer see any regression at 8 or even 16 clients.
> In fact, the patched version consistently does better than master even at
> higher numbers of clients.
>
> The only thing I am a bit puzzled about is that I am no longer able to
> reproduce the 40-50% gains that I'd earlier observed with a single client.
> I now get 20-25% with a smaller number of clients and 10-15% with a larger
> number of clients. I haven't been able to establish why it's happening, but
> since it's a different AWS instance (though of the same type), I am
> inclined to blame it on that for now.

Your original patch also skipped the check for serializable conflicts.

*IF* that was correct, it probably further reduced contention. I'm not
entirely sure if skipping that check is correct, but if it is, you can
still accomplish this by checking whether the new key is beyond the
current rightmost key, and setting a flag so that the check is later
skipped.

But there are lots of complex interactions in that code and I'm no
expert there, I'd prefer if someone more knowledgeable could confirm
whether it's safe to skip that check or not. Or leave it just in case.



Re: Faster inserts with mostly-monotonically increasing values

2018-03-19 Thread Pavan Deolasee
On Thu, Mar 15, 2018 at 7:51 AM, Claudio Freire 
wrote:

>
>
> I applied the attached patches on top of your patch, so it would be
> nice to see if you can give it a try in your test hardware to see
> whether it helps.
>

I can confirm that I no longer see any regression at 8 or even 16 clients.
In fact, the patched version consistently does better than master even at
higher numbers of clients.

The only thing I am a bit puzzled about is that I am no longer able to
reproduce the 40-50% gains that I'd earlier observed with a single client.
I now get 20-25% with a smaller number of clients and 10-15% with a larger
number of clients. I haven't been able to establish why it's happening, but
since it's a different AWS instance (though of the same type), I am
inclined to blame it on that for now.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] GUC for cleanup indexes threshold.

2018-03-19 Thread Claudio Freire
On Mon, Mar 19, 2018 at 8:50 AM, Masahiko Sawada  wrote:
>>> require the bulk-delete method of scanning whole index and of logging
>>> WAL. But it leads some extra overhead. With this patch we no longer
>>> need to depend on the full scan on b-tree index. This might be useful
>>> for a future when we make the bulk-delete of b-tree index not scan
>>> whole index.
>>
>> Perhaps I'm taking something incorrectly, but is it just the
>> result of skipping 'maybe needed' scans without considering the
>> actual necessity?
>
> I meant to scan only the index pages that are relevant to the garbage TIDs
> in a table. The current b-tree index bulk-deletion is very slow and
> heavy because we always scan the whole index even if there is only one
> dead tuple in a table. To address this problem I'm thinking of a way to
> make bulk-delete not scan the whole index if there are only a few dead
> tuples in a table. That is, we do index scans to collect the stack of
> btree pages and reclaim garbage. Maybe we would still do a full index scan
> if there are a lot of dead tuples, which would be similar to what we're
> doing when planning access paths.

In theory it's not very difficult to do. I was pondering doing some
PoC after the other vacuum patches get through.

The TL;DR version is: as long as there's enough maintenance_work_mem (MWM)
to fit the keys, they can be stashed before vacuuming the heap and then used
to perform index scans for index cleanup instead. If MWM runs out, it just
goes back to bulk-delete scans (that would imply there's a lot to clean up
anyway, so index scans wouldn't be worth it). A finer decision on which
technique is likely faster can also be made with random_page_cost, as
sketched below.
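
A toy illustration of that last point (the cost model and every name here
are invented for the example, not taken from any patch):

#include <stdbool.h>

/*
 * Guess whether cleaning up n_dead index entries via individual index
 * descents (random I/O, roughly tree_height + 1 pages per key) looks
 * cheaper than one sequential bulk-delete pass over all index_pages.
 */
static bool
prefer_targeted_cleanup(double n_dead, double index_pages,
                        double tree_height,
                        double random_page_cost, double seq_page_cost)
{
    double      targeted = n_dead * (tree_height + 1.0) * random_page_cost;
    double      bulk = index_pages * seq_page_cost;

    return targeted < bulk;
}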

So yes, let's not paint ourselves into a corner where full index scans
are required for correct operation; that would make the above
impossible.



Re: Problems with Error Messages wrt Domains, Checks

2018-03-19 Thread David G. Johnston
On Sat, Mar 17, 2018 at 12:54 PM, john frazer 
wrote:

>
> -- Forwarded message --
> From: john frazer 
> Date: Sat, Mar 17, 2018 at 6:28 PM
> Subject: Re: Problems with Error Messages wrt Domains, Checks
> To: "David G. Johnston" 
>
> > As such, it could be anywhere in my many, many kLOCs big DB
> definition. I cannot even search the RegEx with a RegEx because all I know
> is some parenthesis is missing, somewhere:
> >
> > ​Well, the error does point to the first statement in the chain of
> issues - working backward a couple of steps is possible.
>
> In this particular case that is possible, and I did manage to do it. The
> point is that in the
> general case the faulty regular expression could be anywhere, and there's
> no clue given at all.
>

​Frankly, I'm not seeing "invalid constant regular expressions" as being a
large scale problem - but I'll agree that having the error include the
actual literal being parsed as a RegEx should be done.  If the targeted
reporting (e.g., stack trace) gets fixed as a side-effect of the more
annoying type input errors - which usually involves dynamic data - that
would be swell.


> > FAILURE: before the insert statement, everything runs dandy. We
> could have built an entire data warehouse application on top of a table
> definition that can never be syntactically processed but which will only
> fail when someone accidentally tries to insert a line.
> >
> > ​Since this is going to fail every single time you add a record I'm
> lacking sympathy here.  "Accidentally tries to insert a line" - if the
> table wasn't meant to be used why does it exist in the first place?​  And
> if it is intended to be used then functional testing should quickly point
> out something like this.
>
> But there clearly can be tables that are used only now and then and might
> get checked for absence of rows. But regardless, I think the point stands
> that ideally you shouldn't be able to successfully declare nonsensical
> objects and only be told so by runtime errors (with defective contexts)
> under some kinds of usage patterns, and in most cases, pgSQL does keep
> that promise.
>

I'm not disagreeing but I'm also not part of the solution.  In terms of
importance I'd say it's not that high, given that I've never really felt the
lack personally.  An invalid object, even though it doesn't fail at
creation, usually fails immediately upon its first use, which happens soon
enough after creation as to make pin-pointing its location generally
trivial.


>
> > ​I suppose the best practice when dealing with a lack of information in
> the error handle code flows is to limit the amount of context that is in
> scope by unit testing.  And while they are absolutely short-comings
> overcoming them has cost in terms of both developer effort and, more
> importantly, runtime performance.
>
> I'm afraid no amount of unit testing of the DDL code can do this for me.
> Yes,
> in the first reported cases (the invalid RegExp), I can make sure I use
> each expression at least once so unsyntactic ones will show themselves.
> But
> in the other two cases, well, the production environment in which this
> came up
> has an insert statement that takes data from a largish source into the
> target table
> (20k rows of altogether >2m rows), and I *can't* unit test that data.
>

​I'd be inclined to not constrain the table itself at all and instead
perform soft validation post-load.​  You can process and remove offending
records and then add the constraints as a sanity check/documentation.


> FWIW the workaround that I've found is this:
>
> create table X.table_with_constraints (
>   my_column text,
>   constraint "my_column must start with 'x'"  check ( Q.starts_with_x(
> my_column ) ),
>   constraint "my_column must have 3 chrs" check ( Q.has_3_characters(
> my_column ) ) );
>
> In other words, I dispense with domains and use (small, boolean) functions
> (defined as `select` one-liners)
> because only then do I get told what piece of data comes down the wrong
> way and where.
> It's a shame because this is essentially what I expect to do in a language
> like
> JavaScript.
>
>
​Yes, hopefully we can decide and implement value reporting for v12 (and
consider adding it in for v11) which would at least avoid the need for the
functions and thus rely on just the named constraints.

Though since it's just you and me on this thread, there is no one speaking
up who can write a patch...

David J.


Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-19 Thread Tomas Vondra
On 03/19/2018 02:34 AM, Andres Freund wrote:
> Hi,
> 
> On 2018-03-19 01:52:41 +0100, Tomas Vondra wrote:
>> I do agree with that. We have a working well-understood dsm-based
>> solution, addressing the goals initially explained in this thread.
> 
> Well, it's also awkward and manual to use. I do think that's
> something we've to pay attention to.
> 

Awkward in what sense?

I don't think the manual aspect is an issue. Currently we have no way to
reload the dictionary, except for restarting all the backends. I don't
see that as a particularly convenient solution. Also, this is pretty
much how the shared_ispell extension works, although you might argue
that was more due to the limitation of how shared memory could be used
in extensions before DSM was introduced. In any case, I've never heard
complaints about this aspect of the extension.

There are about two things that might be automated - reloading of
dictionaries and evicting them when hitting the memory limit. I have
tried to implement that in the shared_ispell dictionary but it's a bit
more complicated than it looks.

For example, it seems obvious to reload the dictionary when the file
timestamp changes. But in fact there are three files - dict, affixes,
stopwords. So will you reload when a single file changes? All of them?
Keep in mind that the new version of dictionary may use different
affixes, so a reload at the wrong moment may result in broken result.

> 
>> I wonder how much of this patch would be affected by the switch 
>> from dsm to mmap? I guess the memory limit would get mostly 
>> irrelevant (mmap would rely on the OS to page the memory in/out 
>> depending on memory pressure), and so would the UNLOAD/RELOAD 
>> commands (because each backend would do it's own mmap).
> 
> Those seem fairly major.
> 

I'm not sure I'd say those are major. And you might also see the lack of
these capabilities as negative points for the mmap approach.

So, I'm not at all convinced the mmap approach is actually better than
the dsm one. And I believe that if we come up with a good way to
automate some of the tasks, I don't see why that would be possible
with mmap and not with dsm.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: inserts into partitioned table may cause crash

2018-03-19 Thread Alvaro Herrera
Etsuro Fujita wrote:

> Thanks for the updated patches!  I think the patches are in good shape, but
> I did a bit of editorial things; added a bit more comments for
> ExecPrepareTupleRouting and adjusted regression test stuff to match the
> existing ones.  Attached are the updated patches for HEAD and PG10. Those
> changes are just editorial, so let's ask for the committer review.

Thanks, I'll give these a look now.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: ON CONFLICT DO UPDATE for partitioned tables

2018-03-19 Thread Etsuro Fujita

(2018/03/18 13:17), Alvaro Herrera wrote:

Alvaro Herrera wrote:
The only thing that I remain unhappy about this patch is the whole
adjust_and_expand_partition_tlist() thing.  I fear we may be doing
redundant and/or misplaced work.  I'll look into it next week.


I'm still reviewing the patches, but I really agree on that point.  As 
Pavan mentioned upthread, the onConflictSet tlist for the root parent, 
from which we create a translated onConflictSet tlist for a partition, 
would have already been processed by expand_targetlist() to contain all 
missing columns as well, so I think we could create the tlist for the 
partition by simply re-ordering the expression-converted tlist (ie, 
conv_setproj) based on the conversion map for the partition.  The attached
patch defines a function for that, which could be called instead of
calling adjust_and_expand_partition_tlist().  This would allow us to get
rid of the planner changes from the patches.  Maybe I'm missing something,
though.


Best regards,
Etsuro Fujita
*** src/backend/executor/execPartition.c.org	2018-03-19 21:32:52.0 +0900
--- src/backend/executor/execPartition.c	2018-03-19 21:44:17.0 +0900
***
*** 15,24 
--- 15,26 
  #include "postgres.h"
  
  #include "catalog/pg_inherits_fn.h"
+ #include "catalog/pg_type.h"
  #include "executor/execPartition.h"
  #include "executor/executor.h"
  #include "mb/pg_wchar.h"
  #include "miscadmin.h"
+ #include "nodes/makefuncs.h"
  #include "optimizer/prep.h"
  #include "utils/lsyscache.h"
  #include "utils/rls.h"
***
*** 37,42 
--- 39,45 
  	 Datum *values,
  	 bool *isnull,
  	 int maxfieldlen);
+ static List *adjust_onconflictset_tlist(List *tlist, TupleConversionMap *map);
  
  /*
   * ExecSetupPartitionTupleRouting - sets up information needed during
***
*** 554,565 
  		firstVarno, partrel,
  		firstResultRel, NULL);
  
! 			conv_setproj =
! adjust_and_expand_partition_tlist(RelationGetDescr(firstResultRel),
!   RelationGetDescr(partrel),
!   RelationGetRelationName(partrel),
!   firstVarno,
!   conv_setproj);
  
  			tupDesc = ExecTypeFromTL(conv_setproj, partrelDesc->tdhasoid);
  			part_conflproj_slot = ExecInitExtraTupleSlot(mtstate->ps.state,
--- 557,563 
  		firstVarno, partrel,
  		firstResultRel, NULL);
  
! 			conv_setproj = adjust_onconflictset_tlist(conv_setproj, map);
  
  			tupDesc = ExecTypeFromTL(conv_setproj, partrelDesc->tdhasoid);
  			part_conflproj_slot = ExecInitExtraTupleSlot(mtstate->ps.state,
***
*** 1091,1093 
--- 1089,1153 
  
  	return buf.data;
  }
+ 
+ /*
+  * Adjust the targetlist entries of a translated ON CONFLICT UPDATE operation
+  *
+  * The expressions have already been fixed, but we need to re-order the tlist
+  * so that the target resnos match the child table.
+  *
+  * The given tlist has already been through expression_tree_mutator;
+  * therefore the TargetEntry nodes are fresh copies that it's okay to
+  * scribble on.
+  */
+ static List *
+ adjust_onconflictset_tlist(List *tlist, TupleConversionMap *map)
+ {
+ 	List	   *new_tlist = NIL;
+ 	TupleDesc	tupdesc = map->outdesc;
+ 	AttrNumber *attrMap = map->attrMap;
+ 	int			numattrs = tupdesc->natts;
+ 	int			attrno;
+ 
+ 	for (attrno = 1; attrno <= numattrs; attrno++)
+ 	{
+ 		Form_pg_attribute att_tup = TupleDescAttr(tupdesc, attrno - 1);
+ 		TargetEntry *tle;
+ 
+ 		if (attrMap[attrno - 1] != 0)
+ 		{
+ 			Assert(!att_tup->attisdropped);
+ 
+ 			/* Get the corresponding tlist entry from the given tlist */
+ 			tle = (TargetEntry *) list_nth(tlist, attrMap[attrno - 1] - 1);
+ 
+ 			/* Get the resno right */
+ 			if (tle->resno != attrno)
+ tle->resno = attrno;
+ 		}
+ 		else
+ 		{
+ 			Node	   *expr;
+ 
+ 			Assert(att_tup->attisdropped);
+ 
+ 			/* Insert NULL for dropped column */
+ 			expr = (Node *) makeConst(INT4OID,
+ 	  -1,
+ 	  InvalidOid,
+ 	  sizeof(int32),
+ 	  (Datum) 0,
+ 	  true, /* isnull */
+ 	  true /* byval */ );
+ 
+ 			tle = makeTargetEntry((Expr *) expr,
+   attrno,
+   pstrdup(NameStr(att_tup->attname)),
+   false);
+ 		}
+ 
+ 		new_tlist = lappend(new_tlist, tle);
+ 	}
+ 
+ 	return new_tlist;
+ }


Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-19 Thread Ildus Kurbangaliev
On Mon, 19 Mar 2018 14:06:50 +0300
Arthur Zakirov  wrote:

> 
> I believe mmap requires completely rewriting the 0003 part of the patch
> and small changes in 0005.
> 
> > In any case, I suggest to polish the dsm-based patch, and see if we
> > can get that one into PG11.  
> 
> Yes, we have more time in future commitfests if the dsm-based patch
> isn't approved.
> 

Hi, I'm not sure about the mmap approach; it would just bring other
problems. I like the dsm approach because it doesn't invent any new
files in the database, whereas the mmap approach will possibly require a
new folder in the data directory and management of a bunch of new files,
with additional issues related to pg_upgrade and so on. Also, with the dsm
approach, if someone needs to update dictionaries then he (or his package
manager) can just replace the files and be done with it.

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: Online enabling of checksums

2018-03-19 Thread Andrey Borodin
Hi Magnus!

> On 19 March 2018, at 16:57, Magnus Hagander  wrote:
> 
> On Mon, Mar 19, 2018 at 12:24 PM, Magnus Hagander  > wrote:
> On Mon, Mar 19, 2018 at 11:40 AM, Andrey Borodin  > wrote:
> Hi, Daniel!
> If we commit online checksums before SLRU checksums, we will need very neat 
> hacks if we decide to protect SLRU eventually.
> 
> What do you think about this problem?
> 
> One would be adjusted to work with the other, yes. It makes no sense to now 
> allow online enabling once SLRU protection is in there, and it doesn't make 
> sense for either of these patches to be blocking the other one for commit, 
> though it would of course be best to get both included.
> 
> 
> Makes no sense to *not* allow it, of course. Meaning yes, that should be 
> handled.
> 
> We don't need to convert from "page format with no support for checksums" 
> (pre-11) to "page format with support for checksums" (11+) online.
> 
> We do need to convert from "page format with support for checksums but no 
> checksums enabled" (11+) to "checksums enabled" online.

Oh, yes, you are right. Everything is fine, sorry for the noise.

Best regards, Andrey Borodin.

Re: Online enabling of checksums

2018-03-19 Thread Magnus Hagander
On Mon, Mar 19, 2018 at 12:24 PM, Magnus Hagander 
wrote:

> On Mon, Mar 19, 2018 at 11:40 AM, Andrey Borodin 
> wrote:
>
>> Hi, Daniel!
>>
>> > On 19 March 2018, at 4:01, Daniel Gustafsson 
>> wrote:
>> >
>> > Fixed in patch just posted in 84693D0C-772F-45C2-88A1-85B498
>> 3a5...@yesql.se
>> > (version 5). Thanks!
>>
>>
>> I've been hacking a bit in a neighboring thread and came across one
>> interesting thing. There was a patch on this CF on enabling checksums
>> for SLRU. The thing is, CLOG is not protected with checksums right now,
>> and the bad thing about it is that there's no reserved place for
>> checksums in SLRU. And this conversion from a page without a checksum to
>> a page with a checksum is quite impossible to do online.
>>
>> If we commit online checksums before SLRU checksums, we will need very
>> neat hacks if we decide to protect SLRU eventually.
>>
>> What do you think about this problem?
>>
>
> One would be adjusted to work with the other, yes. It makes no sense to
> now allow online enabling once SLRU protection is in there, and it doesn't
> make sense for either of these patches to be blocking the other one for
> commit, though it would of course be best to get both included.
>


Makes no sense to *not* allow it, of course. Meaning yes, that should be
handled.

We don't need to convert from "page format with no support for checksums"
(pre-11) to "page format with support for checksums" (11+) online.

We do need to convert from "page format with support for checksums but no
checksums enabled" (11+) to "checksums enabled" online.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] GUC for cleanup indexes threshold.

2018-03-19 Thread Masahiko Sawada
On Mon, Mar 19, 2018 at 2:45 PM, Kyotaro HORIGUCHI
 wrote:
> At Mon, 19 Mar 2018 11:12:58 +0900, Masahiko Sawada  
> wrote in 
>> On Wed, Mar 14, 2018 at 9:25 PM, Alexander Korotkov
>>  wrote:
>> > On Wed, Mar 14, 2018 at 7:40 AM, Masahiko Sawada 
>> > wrote:
>> >>
>> >> On Sat, Mar 10, 2018 at 3:40 AM, Alexander Korotkov
>> >>  wrote:
>> >> > On Fri, Mar 9, 2018 at 3:12 PM, Masahiko Sawada 
>> >> > wrote:
>> >> >>
>> >> >> On Fri, Mar 9, 2018 at 8:43 AM, Alexander Korotkov
>> >> >>  wrote:
>> >> >> > 2) These parameters are reset during btbulkdelete() and set during
>> >> >> > btvacuumcleanup().
>> >> >>
>> >> >> Can't we set these parameters even during btbulkdelete()? By keeping
>> >> >> them up to date, we will able to avoid an unnecessary cleanup vacuums
>> >> >> even after index bulk-delete.
>> >> >
>> >> >
>> >> > We certainly can update cleanup-related parameters during
>> >> > btbulkdelete().
>> >> > However, in this case we would update B-tree meta-page during each
>> >> > VACUUM cycle.  That may cause some overhead for non append-only
>> >> > workloads.  I don't think this overhead would be sensible, because in
>> >> > non append-only scenarios VACUUM typically writes much more of
>> >> > information.
>> >> > But I would like this oriented to append-only workload patch to be
>> >> > as harmless as possible for other workloads.
>> >>
>> >> What overhead are you referring here? I guess the overhead is only the
>> >> calculating the oldest btpo.xact. And I think it would be harmless.
>> >
>> >
>> > I meant overhead of setting last_cleanup_num_heap_tuples after every
>> > btbulkdelete with wal-logging of meta-page.  I bet it also would be
>> > harmless, but I think that needs some testing.
>>
>> Agreed.
>>
>> After more thought, it might be too late but we can consider the
>> possibility of another idea proposed by Peter. Attached patch
>> addresses the original issue of index cleanups by storing the epoch
>> number of the page deletion XID into PageHeader->pd_prune_xid, which is
>> a 4-byte field.
>
> Mmm. It seems to me that the story is returning to the
> beginning. Could I try retelling the story?
>
> I understand that the initial problem was that vacuum apparently runs
> unnecessary full scans on indexes many times. The reason for that
> is the fact that a cleanup scan may leave some (or, under certain
> conditions, many) dead pages not recycled, but we don't know
> whether a cleanup is needed or not. They will be left there
> forever unless we run additional cleanup scans at the appropriate
> time.
>
> (If I understand it correctly,) Sawada-san's latest proposal is
> (fundamentally the same as the first one,) just skipping the
> cleanup scan if the preceding vacuum scan found that the number
> of *live* tuples has increased. If there were many deletions and
> insertions but no increase in the total number of tuples, we don't
> do a cleanup. Consequently it had a wraparound problem, and that
> is addressed in this version.

No, it doesn't have a wraparound problem. The patch I proposed based on
Peter's idea adds an epoch number for the page deletion xid and compares
them when we judge whether the page is recyclable or not. It's
something like creating a 64-bit xid out of the deletion xid. Also, if
there is even one deletion, bulk-delete will be performed instead of the
index cleanup. So with this patch we do the index cleanup only when the
reltuples of the table has increased by a fraction of
vacuum_index_cleanup_scale_factor since the previous stats. There is no
need to trigger the index cleanup based on any xid threshold.
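
A minimal, self-contained sketch of the comparison being described, with
made-up names (the real patch would presumably work in terms of the btree
page opaque data and the xid/epoch returned by GetNextXidAndEpoch()):

#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Combine a wraparound epoch and a 32-bit xid into one 64-bit value. */
static inline uint64_t
xid_with_epoch(uint32_t epoch, TransactionId xid)
{
    return ((uint64_t) epoch << 32) | xid;
}

/*
 * A deleted btree page is recyclable once the 64-bit XID recorded at
 * deletion time (epoch stored in pd_prune_xid, xid in btpo.xact) is older
 * than the oldest transaction that might still see the page; comparing
 * the 64-bit values avoids 32-bit wraparound trouble.
 */
static inline bool
deleted_page_recyclable(uint32_t page_epoch, TransactionId page_xact,
                        uint32_t oldest_epoch, TransactionId oldest_xmin)
{
    return xid_with_epoch(page_epoch, page_xact) <
           xid_with_epoch(oldest_epoch, oldest_xmin);
}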

> (ditto.) Alexander proposed to record the oldest xid of
> recyclable pages in metapage (and the number of tuples at the
> last cleanup). This prevents needless cleanup scan and surely
> runs cleanups to remove all recyclable pages.

Yes, but the concerns we discussed are that we need extra WAL-logging
for updating the metapage and that it works only for the append-only case.
If we also want to support more cases, we will need to update the metapage
during bulk-delete. The overhead of WAL-logging would be harmless but
should be tested, as Alexander mentioned.

>
> I think that we can accept Sawada-san's proposal if we accept the
> fact that indexes can retain recyclable pages for a long
> time. (Honestly I don't think so.)
>
> If (as I might have mentioned upthread for Yura's
> patch) we accept holding the information on the index meta page,
> Alexander's way would be preferable. The difference between Yura's
> and Alexander's is that the former runs a cleanup scan if a recyclable
> page is present but the latter avoids that before any recyclable
> pages are known to be removed.
>
>>   Comparing to the current proposed patch this patch
>> needs neither the page upgrade code nor extra WAL-logging. If
>
> # By the way, my proposal was storing the information, as Yura
> # proposed, in the stats collector. The information may be
> # available a bit late, but it doesn't harm. This doesn't need
> # extra WAL logging nor 

Re: inserts into partitioned table may cause crash

2018-03-19 Thread Etsuro Fujita

(2018/03/19 17:48), Amit Langote wrote:

On 2018/03/16 20:37, Etsuro Fujita wrote:

(2018/03/14 17:25), Etsuro Fujita wrote:

(2018/03/14 14:54), Amit Langote wrote:

Btw, I noticed that the patches place ExecPrepareTupleRouting (both the
declaration and the definition) at different relative locations within
nodeModifyTable.c in the HEAD branch vs. in the 10 branch. It might be a
good idea to bring them to the same relative location to avoid hassle
when
back-patching relevant patches in the future. I did that in the attached
updated version (v4) of the patch for HEAD, which does not make any other
changes. Although, the patch for PG-10 is unchanged, I still changed its
file name to contain v4.


That's a good idea! Thanks for the updated patches!


Sorry, I didn't look at those patches carefully, but I noticed that while
the patch for PG10 has put the definition of that function after the
definitions of functions such as fireBSTriggers, fireASTriggers and
ExecSetupTransitionCaptureState, the patch for HEAD puts it *before*
the definitions of these functions (note: these functions don't have their
declarations at the file header), which seems a bit inconsistent.  ISTM
that the v3 patches for PG10 and HEAD had already put that function in
the right place in terms of that relative ordering.


That's correct, except v3 had added the definition of
ExecPrepareTupleRouting after those of ExecSetupChildParentMapForSubplan,
ExecSetupChildParentMapForTcs, etc., that are new in 11dev branch.

I've fixed the patch for HEAD to move the ExecPrepareTupleRouting
definition to appear after those of fireBSTriggers, fireASTriggers, and
ExecSetupTransitionCaptureState.

Attached are v5 patches for HEAD and 10 branches.


Thanks for the updated patches!  I think the patches are in good shape, 
but I did a bit of editorial things; added a bit more comments for 
ExecPrepareTupleRouting and adjusted regression test stuff to match the 
existing ones.  Attached are the updated patches for HEAD and PG10. 
Those changes are just editorial, so let's ask for the committer review.


Best regards,
Etsuro Fujita
*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
***
*** 2656,2668  CopyFrom(CopyState cstate)
  			if (cstate->transition_capture != NULL)
  			{
  if (resultRelInfo->ri_TrigDesc &&
! 	(resultRelInfo->ri_TrigDesc->trig_insert_before_row ||
! 	 resultRelInfo->ri_TrigDesc->trig_insert_instead_row))
  {
  	/*
! 	 * If there are any BEFORE or INSTEAD triggers on the
! 	 * partition, we'll have to be ready to convert their
! 	 * result back to tuplestore format.
  	 */
  	cstate->transition_capture->tcs_original_insert_tuple = NULL;
  	cstate->transition_capture->tcs_map =
--- 2656,2667 
  			if (cstate->transition_capture != NULL)
  			{
  if (resultRelInfo->ri_TrigDesc &&
! 	resultRelInfo->ri_TrigDesc->trig_insert_before_row)
  {
  	/*
! 	 * If there are any BEFORE triggers on the partition,
! 	 * we'll have to be ready to convert their result back to
! 	 * tuplestore format.
  	 */
  	cstate->transition_capture->tcs_original_insert_tuple = NULL;
  	cstate->transition_capture->tcs_map =
***
*** 2803,2814  CopyFrom(CopyState cstate)
  			 * tuples inserted by an INSERT command.
  			 */
  			processed++;
  
! 			if (saved_resultRelInfo)
! 			{
! resultRelInfo = saved_resultRelInfo;
! estate->es_result_relation_info = resultRelInfo;
! 			}
  		}
  	}
  
--- 2802,2814 
  			 * tuples inserted by an INSERT command.
  			 */
  			processed++;
+ 		}
  
! 		/* Restore the saved ResultRelInfo */
! 		if (saved_resultRelInfo)
! 		{
! 			resultRelInfo = saved_resultRelInfo;
! 			estate->es_result_relation_info = resultRelInfo;
  		}
  	}
  
*** a/src/backend/executor/nodeModifyTable.c
--- b/src/backend/executor/nodeModifyTable.c
***
*** 62,67  static bool ExecOnConflictUpdate(ModifyTableState *mtstate,
--- 62,71 
  	 EState *estate,
  	 bool canSetTag,
  	 TupleTableSlot **returning);
+ static TupleTableSlot *ExecPrepareTupleRouting(ModifyTableState *mtstate,
+ 		EState *estate,
+ 		ResultRelInfo *targetRelInfo,
+ 		TupleTableSlot *slot);
  
  /*
   * Verify that the tuples to be produced by INSERT or UPDATE match the
***
*** 259,265  ExecInsert(ModifyTableState *mtstate,
  {
  	HeapTuple	tuple;
  	ResultRelInfo *resultRelInfo;
- 	ResultRelInfo *saved_resultRelInfo = NULL;
  	Relation	resultRelationDesc;
  	Oid			newId;
  	List	   *recheckIndexes = NIL;
--- 263,268 
***
*** 275,374  ExecInsert(ModifyTableState *mtstate,
  	 * get information on the (current) result relation
  	 */
  	resultRelInfo = estate->es_result_relation_info;
- 
- 	/* Determine the partition to heap_insert the tuple into */
- 	if (mtstate->mt_partition_dispatch_info)
- 	{
- 		int			leaf_part_index;
- 		TupleConversionMap *map;
- 
- 		/*
- 		 * A

Re: [HACKERS] Add support for tuple routing to foreign partitions

2018-03-19 Thread Amit Langote
Fujita-san,

Thanks for sending the updated patches.

On 2018/02/27 21:01, Etsuro Fujita wrote:
> (2018/02/21 20:54), Etsuro Fujita wrote:
>> void
>> BeginForeignRouting();
>>
>> Prepare for a tuple-routing operation on a foreign table. This is called
>> from ExecSetupPartitionTupleRouting and ExecInitPartitionInfo.
> 
> I modified execPartition.c so that this callback routine is called from a
> single function that I added to execPartition.c and it is called the first
> time the foreign partition is chosen as the target partition to route a
> tuple to.  That removes CheckValidResultRel, the tuple-conversion setup,
> and the FDW initialization for each UPDATE subplan from
> ExecSetupPartitionTupleRouting, so it would minimize the possibly-useless
> overhead in doing that function.
> 
> Changes other than that are:
> 
> * Fixed typo and revised code/comments
> * Added more regression tests
> * Added docs
> 
> Attached is a new version of the patch set.

Patches still seem to apply cleanly to HEAD, compile fine, tests pass.

* Comments postgres-fdw-refactoring-1.patch:

1. Just a minor nitpick: wouldn't it be better to call it
create_foreign_modify_state just like its "finish" counterpart that's
named finish_foreign_modify?

Other than that, it looks good to me.

* Comments on foreign-routing-fdwapi-1.patch:

1. In the following sentence, s/rows/a tuple/g, to be consistent with
other text added by the patch

+
+ If the ExecForeignRouting pointer is set to
+ NULL, attempts to route rows to the foreign table
will fail
+ with an error message.
+


2. If I understand the description you provided in [1] correctly, the
point of having ri_PartitionIsValid and ExecInitExtraPartitionInfo() is to
avoid possibly-redundantly performing the following two steps in
ExecSetupPartitionTupleRouting() for *reused* partition ResultRelInfos
that may not be used for tuple routing after all:

 - create the parent_child_tupconv_maps[] entry for it
 - perform FDW tuple routing initialization.

If that's true, the following comment could be expanded just a little bit
to make that clearer:

/*
 * ExecInitPartitionExtraInfo
 *  Do the remaining initialization work for the given partition

3. You removed the following from ExecInitPartitionInfo:

 /*
- * Verify result relation is a valid target for an INSERT.  An UPDATE
of a
- * partition-key becomes a DELETE+INSERT operation, so this check is
still
- * required when the operation is CMD_UPDATE.
- */
-CheckValidResultRel(leaf_part_rri, CMD_INSERT);

but, only added back the following in ExecInsert:

+/*
+ * Verify the specified partition is a valid target for INSERT if we
+ * didn't yet.
+ */
+if (!resultRelInfo->ri_PartitionIsValid)
+{
+CheckValidResultRel(resultRelInfo, CMD_INSERT);

Maybe the new comment should also have a "Note: " line saying something
about this code being invoked as part of an UPDATE.
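
Something along these lines, reusing the wording of the comment that was
removed from ExecInitPartitionInfo (a sketch only, not a hunk from the
patch):

/*
 * Verify the specified partition is a valid target for INSERT, if we
 * didn't do so yet.
 *
 * Note: an UPDATE of a partition key becomes a DELETE+INSERT operation,
 * so this check is also reached when the original operation is
 * CMD_UPDATE.
 */
if (!resultRelInfo->ri_PartitionIsValid)
    CheckValidResultRel(resultRelInfo, CMD_INSERT);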

Tests and documentation added by this patch look quite good.

That's all I have for now.

Thanks,
Amit

[1] https://www.postgresql.org/message-id/5A95487E.9050808%40lab.ntt.co.jp




Re: Online enabling of checksums

2018-03-19 Thread Magnus Hagander
On Mon, Mar 19, 2018 at 11:40 AM, Andrey Borodin wrote:

> Hi, Daniel!
>
> > On 19 March 2018, at 4:01, Daniel Gustafsson wrote:
> >
> > Fixed in patch just posted in 84693D0C-772F-45C2-88A1-
> 85b4983a5...@yesql.se
> > (version 5). Thanks!
>
>
> I've been hacking a bit in neighboring thread.
> And come across one interesting thing. There was a patch on this CF on
> enabling checksums for SLRU. The thing is CLOG is not protected with
> checksums right now. But the bad thing about it is that there's no reserved
> place for checksums in SLRU.
> And this conversion from page without checksum to page with checksum is
> quite impossible online.
>
> If we commit online checksums before SLRU checksums, we will need very
> neat hacks if we decide to protect SLRU eventually.
>
> What do you think about this problem?
>

One would be adjusted to work with the other, yes. It makes no sense to not
allow online enabling once SLRU protection is in there, and it doesn't make
sense for either of these patches to block the other one for commit, though
it would of course be best to get both included.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-19 Thread Arthur Zakirov
Arthur Zakirov wrote:
> I've planned only to improve the documentation a little. Also it seems I
> should change 0004 part, I found that extension upgrade scripts may be made
> in wrong way.

I've attached a new version of the patch. In this version I removed
0004-Update-tmplinit-arguments-v6.patch; in my opinion it handled
extension upgrades in the wrong way. If I'm not mistaken, there is
currently no way to upgrade a template's init function signature, and I
didn't find a way to change init_method(internal) to init_method(internal,
internal) within an extension's upgrade script.

Therefore I added 0002-Change-tmplinit-argument-v7.patch. Now a
DictInitData struct is passed to a template's init method. It contains the
necessary data, dictoptions and dictid, so there is no need to change the
method's signature.

The other parts of the patch are the same, except that they now use the
DictInitData structure.
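
For reference, the struct boils down to something like this (a simplified
sketch; see the patch for the exact definition, the comments here are only
illustrative):

typedef struct DictInitData
{
	List	   *dictoptions;	/* options from the dictionary definition */
	Oid			dictid;			/* OID of the dictionary being initialized */
} DictInitData;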

On Mon, Mar 19, 2018 at 01:52:41AM +0100, Tomas Vondra wrote:
> I wonder how much of this patch would be affected by the switch from dsm
> to mmap? I guess the memory limit would get mostly irrelevant (mmap
> would rely on the OS to page the memory in/out depending on memory
> pressure), and so would the UNLOAD/RELOAD commands (because each backend
> would do it's own mmap).

I believe mmap would require completely rewriting the 0003 part of the
patch and small changes in 0005.

> In any case, I suggest to polish the dsm-based patch, and see if we can
> get that one into PG11.

Yes, we have more time in future commitfests if the dsm-based patch isn't
approved.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index b9fdd77e19..e071994523 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
 #define tmpalloc(sz)  MemoryContextAlloc(Conf->buildCxt, (sz))
 #define tmpalloc0(sz)  MemoryContextAllocZero(Conf->buildCxt, (sz))
 
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
 /*
  * Prepare for constructing an ISpell dictionary.
  *
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char 
*flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + 
strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
-   ? cpstrdup(Conf, flag) : VoidString;
+   ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
 }
 
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, 
CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
-   entry->flag.s = cpstrdup(Conf, s);
+   entry->flag.s = tmpstrdup(s);
 
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1538,9 @@ nextline:
return;
 
 isnewformat:
+   pfree(recoded);
+   pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 56ede37089..e11d1129e9 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
 Datum
 dintdict_init(PG_FUNCTION_ARGS)
 {
-   List   *dictoptions = (List *) PG_GETARG_POINTER(0);
+   DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt*d;
ListCell   *l;
 
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
 
-   foreach(l, dictoptions)
+   foreach(l, init_data->dictoptions)
{
DefElem*defel = (DefElem *) lfirst(l);
 
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index a79ece240c..c3146bae3c 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
 Datum
 dxsyn_init(PG_FUNCTION_ARGS)
 {
-   List   *dictoptions = (List *) PG_GETARG_POINTER(0);
+   DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn*d;
ListCell   *l;
char   *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
 
-   foreach(l, dictoptions)
+   foreach(l, init_data->dictoptions)
{
DefElem*defel = (DefElem *) lfirst(l);
 
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index 247c202755..2e66331ed8 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -267,12 +267,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
 Datum
 unaccent_init(PG_FUNCTION_ARGS)
 {
-   List   *dictoptions = (List *) PG_GETARG_POINTER(0);
+   DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar   

Re: User defined data types in Logical Replication

2018-03-19 Thread Masahiko Sawada
On Mon, Mar 19, 2018 at 12:50 PM, Masahiko Sawada  wrote:
> On Fri, Mar 16, 2018 at 10:24 AM, Alvaro Herrera
>  wrote:
>> Masahiko Sawada wrote:
>>> On Thu, Mar 15, 2018 at 9:41 AM, Alvaro Herrera  
>>> wrote:
>>
>>> > I think this is a worthwhile test, but IMO it should be improved a bit
>>> > before we include it.  Also, we can come up with a better name for the
>>> > test surely, not just refer to this particular bug. Maybe "typemap".
>>>
>>> It might be useful if we have the facility of TAP test to check the
>>> log message and regexp-match the message to the expected string.
>>
>> Something similar to PostgresNode::issues_sql_like() perhaps?
>>
>
> Yeah, I didn't know that but I think it's a good idea. Unlike
> issues_sql_like() we don't issue anything to the subscriber. So maybe
> we need truncate logfile before insertion and verify logfile of
> particular period. The test code would be like follows.
>
> $node_subscriber->safe_psql('postgres', 'CREATE SUBSCRIPTION...");
> truncate $node_subscriber->logfile, 0;
> $node_publisher->safe_psql('postgres', 'INSERT .. ')
> my $log = TestLib::slurp_file($node_subscriber->logfile);
>
> # Verify logs
> like($log, qr/processing remote data for replication target relation
> "public.test" column "b", remote type dummyint, local type dummyint/,
> 'callback function of datatype conversion1');
> like($log, qr/processing remote data for replication target relation
> "public.test" column "a", remote type dummytext, local type
> dummytext/, 'callback function of datatype conversion2');
>
> Thoughts?
>

Attached is an updated test patch that adds the above test (0002 patch).
Since existing test functions are enough for this test case, I didn't
create new ones. Also, I found that the local data type name in the data
type conversion log message isn't schema-qualified, whereas the remote
data type always is. The attached 0001 patch fixes that.
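
To illustrate the 0001 change: the idea is just to format the local type
the same way the remote type name is formatted (a sketch only, not the
exact hunk; the message text is abbreviated and variable names are
placeholders):

	/* before: the local type name is printed unqualified */
	errcontext("... remote type %s, local type %s",
			   remotetypname, format_type_be(localtypoid));

	/* after: schema-qualify it, as the remote type name already is */
	errcontext("... remote type %s, local type %s",
			   remotetypname, format_type_be_qualified(localtypoid));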

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


0001-Qualify-datatype-name-in-log-of-data-type-conversion.patch
Description: Binary data


0002-test-module.patch
Description: Binary data


Re: Online enabling of checksums

2018-03-19 Thread Andrey Borodin
Hi, Daniel!

> On 19 March 2018, at 4:01, Daniel Gustafsson wrote:
> 
> Fixed in patch just posted in 84693d0c-772f-45c2-88a1-85b4983a5...@yesql.se
> (version 5). Thanks!


I've been hacking a bit in a neighboring thread and came across one
interesting thing. There is a patch in this CF for enabling checksums for
SLRU. The thing is, CLOG is not protected with checksums right now, and
the bad part is that there is no reserved space for a checksum in SLRU
pages. That makes converting a page without a checksum to a page with a
checksum pretty much impossible to do online.

If we commit online checksums before SLRU checksums, we will need very neat 
hacks if we decide to protect SLRU eventually.

What do you think about this problem?

Best regards, Andrey Borodin.


Re: [HACKERS] path toward faster partition pruning

2018-03-19 Thread David Rowley
On 19 March 2018 at 23:03, Amit Langote  wrote:
> Just recently, I replied to a pgsql-bugs report by someone who had OOM
> kill a backend running `delete from
> partitioned_table_with_7202_partitions` on their test system [1].  That'd
> be because running inheritance_planner on a partitioned table doesn't cope
> very well beyond a few hundred partitions, as we've also written in our
> partitioning/inheritance documentation.

Hmm, yeah, that's unfortunate. I'd not done much study of the
inheritance planner before, but now that I understand a bit more about
it, I see how that could happen: nparts^2 RelOptInfos will be created
for such problems. My patch should help with that, provided some pruning
actually takes place, but it makes the problem very slightly worse if
none can be pruned.

> On 2018/03/19 16:18, David Rowley wrote:
>> Likely there's more than could be squeezed out of this if we could get
>> the grouping_planner() to somehow skip creating paths and performing
>> the join search. But that patch is not nearly as simple as the
>> attached.
>
> Yeah, that'd be nice.  Do you think that we cannot fix update/delete on
> partitioned tables until we have such a patch though?  IOW, did you intend
> the patch you posted to just be a PoC to demonstrate that we can save tons
> just by not doing grouping_planner() on pruned partitions?

The patch was meant as a PoC. I think the performance of the patch is
acceptable without any additional optimisation work. That would be nice,
but any more code that's added would need more reviewer and committer
time, both of which are finite, especially so before the PG11 code
cutoff.

I think it would be a shame to tell people partitioning is now usable for
a decent number of partitions, provided you don't need to perform any
OLTP UPDATE/DELETE operations on the partitions. For the few lines of
code the proposed patch takes, I think it's worth considering for PG11,
but only once your work has gone in. I certainly wouldn't want this to
hold your work back.

> BTW, maybe you know, but if we want this to prune same partitions as are
> pruned during select (due to the new pruning facility), we'd need to teach
> get_relation_constraints() to not fetch the partition constraint
> (RelationGetPartitionQual) at all.  My patch currently teaches it to avoid
> fetching the partition constraint only for select.  If we include the
> partition constraint in the list of constraints returned by
> get_relation_constraints, we'd still be redundantly executing the
> constraint exclusion logic for the selected partitions via the
> grouping_planner() call on those partitions.

I'd not thought of that. It seems more like a performance optimisation
than something that's required for correctness. Removing that would
probably make constraint_exclusion = 'partition' pretty useless.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [Patch] Checksums for SLRU files

2018-03-19 Thread Andrey Borodin
Hi, Ivan!

> On 5 March 2018, at 20:58, Andrey Borodin wrote:
> 
> I've found that there are few more places with SLRU items per page

I was looking into this patch mainly because I was reviewing the other
checksums patch in a different thread, but the purpose of this patch seems
viable to me. After looking into it I'd like to propose some
editorialization:
0. Removed the GUC: ignore_checksum_failure looks good enough
1. Removed the copy on read from disk; I also do not see a reason to copy
the page before write
2. Multis are upgraded by a WAL reset, CLOG is actually upgraded
3. Updated all places where the SLRU block size was used


Best regards, Andrey Borodin.


0001-SLRU-checksums-patch.patch
Description: Binary data


Re: [HACKERS] path toward faster partition pruning

2018-03-19 Thread Amit Langote
Hi David.

On 2018/03/19 16:18, David Rowley wrote:
> On 17 March 2018 at 01:55, Amit Langote  wrote:
>> Hope the attached version is easier to understand.
> 
> Hi Amit,
> 
> Thanks for making the updates. I'll look at them soon.
> 
> I've been thinking about how we're making these improvements for
> SELECT only.  If planning for an UPDATE or DELETE of a partitioned
> table then since the inheritance planner is planning each partition
> individually we gain next to nothing from this patch.

Unfortunately, yes. :-(

Just recently, I replied to a pgsql-bugs report by someone who had OOM
kill a backend running `delete from
partitioned_table_with_7202_partitions` on their test system [1].  That'd
be because running inheritance_planner on a partitioned table doesn't cope
very well beyond a few hundred partitions, as we've also written in our
partitioning/inheritance documentation.

> Generally, it seems the aim of this patch is to improve the usability
> of partitions in an OLTP type workload, most likely OLAP does not
> matter as much since planner overhead, in that case, is generally less
> of a concern.

Yes, that makes sense.

> I experimented with the attached small patch to see if the situation
> could be improved if we first plan the entire query with all
> partitions then ignore dummy rels when planning for each individual
> partition.
> 
> I used something along the lines of:
> 
> # create table listp (a int, b int) partition by list(a);
> # select 'create table listp'||x||' partition of listp for values
> in('||x||');' from generate_series(1, )x;
> $ echo explain update listp set b = 1 where a = 1; > bench.sql
> $ pgbench -f bench.sql -n -T 30 postgres
> 
> where  started at 1 and went up in powers of 2 until 1024.
> 
> Unpatched = your v35 patch
> Patched = your v35 + the attached.
> 
> The TPS result from a 30-second pgbench run of the above query showed:
> 
> Partitions = 1
> Unpatched: 7323.3
> Patched: 6573.2 (-10.24%)
> 
> Partitions = 2
> Unpatched: 6784.8
> Patched: 6377.1 (-6.01%)
> 
> Partitions = 4
> Unpatched: 5903.0
> Patched: 6106.8 (3.45%)
> 
> Partitions = 8
> Unpatched: 4582.0
> Patched: 5579.9 (21.78%)
> 
> Partitions = 16
> Unpatched: 3131.5
> Patched: 4521.2 (44.38%)
> 
> Partitions = 32
> Unpatched: 1779.8
> Patched: 3387.8 (90.35%)
> 
> Partitions = 64
> Unpatched: 821.9
> Patched: 2245.4 (173.18%)
> 
> Partitions = 128
> Unpatched: 322.2
> Patched: 1319.6 (309.56%)
> 
> Partitions = 256
> Unpatched: 84.3
> Patched: 731.7 (768.27%)
> 
> Partitions = 512
> Unpatched: 22.5
> Patched: 382.8 (1597.74%)
> 
> Partitions = 1024
> Unpatched: 5.5
> Patched: 150.1 (2607.83%)
>
> Which puts the crossover point at just 4 partitions, and just a small
> overhead for 1, 2 and probably 3 partitions. The planner generated a
> plan 26 times faster (!) with 1024 partitions.

Nice!

> Likely there's more than could be squeezed out of this if we could get
> the grouping_planner() to somehow skip creating paths and performing
> the join search. But that patch is not nearly as simple as the
> attached.

Yeah, that'd be nice.  Do you think that we cannot fix update/delete on
partitioned tables until we have such a patch though?  IOW, did you intend
the patch you posted to just be a PoC to demonstrate that we can save tons
just by not doing grouping_planner() on pruned partitions?

BTW, maybe you know, but if we want this to prune same partitions as are
pruned during select (due to the new pruning facility), we'd need to teach
get_relation_constraints() to not fetch the partition constraint
(RelationGetPartitionQual) at all.  My patch currently teaches it to avoid
fetching the partition constraint only for select.  If we include the
partition constraint in the list of constraints returned by
get_relation_constraints, we'd still be redundantly executing the
constraint exclusion logic for the selected partitions via the
grouping_planner() call on those partitions.
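
To illustrate what the patch currently does there (just a rough sketch of
the idea, not the actual hunk; variable names are only illustrative):

	List	   *partqual;

	/*
	 * Skip fetching the partition constraint for SELECT, where the new
	 * pruning code has already proven that this partition matches the
	 * query, so constraint exclusion need not be applied to it again.
	 */
	if (root->parse->commandType == CMD_SELECT)
		partqual = NIL;
	else
		partqual = RelationGetPartitionQual(relation);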

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/fecdef72-8c2a-0794-8e0a-2ad76db82...@lab.ntt.co.jp




Re: [bug fix] ECPG: freeing memory for pgtypes crashes on Windows

2018-03-19 Thread Kyotaro HORIGUCHI
Hello.

The objective of this patch looks reasonable, and it doesn't affect ecpg
applications except for the problematic case that happens only on
Windows. So the open points here are only the documentation, the new
function name, and how we should place the new definition.

At Mon, 5 Feb 2018 00:53:47 +, "Tsunakawa, Takayuki" 
 wrote in 
<0A3221C70F24FB45833433255569204D1F8AE06B@G01JPEXMBYT05>
> From: Thomas Munro [mailto:thomas.mu...@enterprisedb.com]
> > +#ifndef PGTYPES_FREE
> > +#define PGTYPES_FREE
> > + extern void PGTYPES_free(void *ptr);
> > +#endif
> > 
> > It seems quite strange to repeat this in pgtypes_date.h, pgtypes_interval.h
> > and pgtypes_numeric.h.  I guess you might not want to introduce a new common
> > header file so that his can be back-patched more easily?  Not sure if there
> > is a project policy about that, but it seems unfortunate to introduce
> > maintenance burden by duplicating this.
> 
> Your guess is correct.  I wanted to avoid the duplication, but there was no 
> good place to put this without requiring existing applications to change 
> their #include directives.  
> 
> 
> > +   PGTYPES_free()/ instead of
> > free().
> > 
> > The "/" needs to move inside then closing tag.
> 
> Thanks, fixed.

The PGTYPES* types have corresponding PGTYPES*_free() functions. I
found it a bit strange that the new function is named PGTYPES_free(),
which looks as if it were generic for all PGTYPES* types. Perhaps
PGTYPESchar_free() or PGTYPEStext_free() would be preferable.

The documentation for PQescapeByteaConn says:

https://www.postgresql.org/docs/10/static/libpq-exec.html#LIBPQ-PQESCAPEBYTEACONN

> PQescapeByteaConn returns an escaped version of the from
> parameter binary string in memory allocated with malloc(). This
> memory should be freed using PQfreemem() when the result is no
> longer needed.

A similar description is needed for the four counterparts of the new
free function; otherwise users don't have a definite suggestion on where
to use it.

I cannot (or am quite reluctant to ^^;) reproduce the problem; instead,
I looked over the test/* files and the replacements look correct.

I agree with Thomas that the definition of PGTYPES_free shouldn't be
scattered around. There are some header files that have only one or two
function definitions, and I believe that a new header file having only
this definition is preferable to the current shape.

# By the way, I think that multiple occurrences of a prototype for the
# same function don't cause an error as long as they are identical. Or
# do they for CL?

> Your guess is correct.  I wanted to avoid the duplication, but
> there was no good place to put this without requiring existing
> applications to change their #include directives.

I suppose that it is enough to let the pgtypes headers
(pgtypes_(timestamp|numeric|interval).h) include the new common header.
*I* think that is better than having multiple definitions of the same
function placed here and there, and applications don't need to be
changed for this.
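
Something as small as this would do (a sketch; the header file name is
only a placeholder):

/* pgtypes.h -- hypothetical common header, included from pgtypes_date.h,
 * pgtypes_interval.h, pgtypes_numeric.h and pgtypes_timestamp.h */
#ifndef PGTYPES_H
#define PGTYPES_H

/* Free memory the pgtypes library allocated with its own malloc(), so
 * that allocation and release happen in the same C runtime on Windows. */
extern void PGTYPES_free(void *ptr);

#endif							/* PGTYPES_H */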

# Anyway, existing applications need to replace free() with
# PGTYPES_free().

I don't think this needs a more profound review, so I'll mark this
Ready for Committer after confirming the amendment.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Typo in be_tls_write

2018-03-19 Thread Magnus Hagander
On Mon, Mar 19, 2018 at 10:28 AM, Daniel Gustafsson  wrote:

> Spotted a typo in be_tls_write(), fix attached.
>

Applied, thanks.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Typo in be_tls_write

2018-03-19 Thread Daniel Gustafsson
Spotted a typo in be_tls_write(), fix attached.

cheers ./daniel



typo-tls_write.patch
Description: Binary data


Re: ON CONFLICT DO UPDATE for partitioned tables

2018-03-19 Thread Amit Langote
On 2018/03/19 16:45, Amit Langote wrote:
> I have tried to make these changes and attached are the updated patches
> containing those, including the change I suggested for 0001 (that is,
> getting rid of mt_onconflict).  I also expanded some comments in 0003
> while making those changes.

I realized that there should be a test where a transition table is
involved in an ON CONFLICT DO UPDATE on a partitioned table, due to a
relevant statement-level trigger on the table; something like the
attached.  But due to a bug being discussed over at [1], we won't get the
correct expected output for the test until the latest patch submitted for
that bug [2] is committed as a bug fix.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/ba19eff9-2120-680e-4671-55a9bea9454f%40lab.ntt.co.jp

[2]
https://www.postgresql.org/message-id/df921671-32df-45ea-c0e4-9b51ee86ba3b%40lab.ntt.co.jp
>From 152c6d5afed21e775caf4862c00e5c6388f7403b Mon Sep 17 00:00:00 2001
From: amit 
Date: Mon, 19 Mar 2018 17:13:08 +0900
Subject: [PATCH] More tests for ON CONFLICT DO UPDATE on partitioned tables

For transition tables of the DO UPDATE action.
---
 src/test/regress/expected/triggers.out | 33 +
 src/test/regress/sql/triggers.sql  | 33 +
 2 files changed, 66 insertions(+)

diff --git a/src/test/regress/expected/triggers.out 
b/src/test/regress/expected/triggers.out
index 99be9ac6e9..e8b849f773 100644
--- a/src/test/regress/expected/triggers.out
+++ b/src/test/regress/expected/triggers.out
@@ -2328,6 +2328,39 @@ insert into my_table values (3, 'CCC'), (4, 'DDD')
 NOTICE:  trigger = my_table_update_trig, old table = (3,CCC), (4,DDD), new 
table = (3,CCC:CCC), (4,DDD:DDD)
 NOTICE:  trigger = my_table_insert_trig, new table = 
 --
+-- now using a partitioned table
+--
+create table iocdu_tt_parted (a int primary key, b text) partition by list (a);
+create table iocdu_tt_parted1 partition of iocdu_tt_parted for values in (1);
+create table iocdu_tt_parted2 partition of iocdu_tt_parted for values in (2);
+create table iocdu_tt_parted3 partition of iocdu_tt_parted for values in (3);
+create table iocdu_tt_parted4 partition of iocdu_tt_parted for values in (4);
+create trigger iocdu_tt_parted_insert_trig
+  after insert on iocdu_tt_parted referencing new table as new_table
+  for each statement execute procedure dump_insert();
+create trigger iocdu_tt_parted_update_trig
+  after update on iocdu_tt_parted referencing old table as old_table new table 
as new_table
+  for each statement execute procedure dump_update();
+-- inserts only
+insert into iocdu_tt_parted values (1, 'AAA'), (2, 'BBB')
+  on conflict (a) do
+  update set b = iocdu_tt_parted.b || ':' || excluded.b;
+NOTICE:  trigger = iocdu_tt_parted_update_trig, old table = , new table 
= 
+NOTICE:  trigger = iocdu_tt_parted_insert_trig, new table = (1,AAA), (2,BBB)
+-- mixture of inserts and updates
+insert into iocdu_tt_parted values (1, 'AAA'), (2, 'BBB'), (3, 'CCC'), (4, 
'DDD')
+  on conflict (a) do
+  update set b = iocdu_tt_parted.b || ':' || excluded.b;
+ERROR:  new row for relation "iocdu_tt_parted1" violates partition constraint
+DETAIL:  Failing row contains (2, BBB).
+-- updates only
+insert into iocdu_tt_parted values (3, 'CCC'), (4, 'DDD')
+  on conflict (a) do
+  update set b = iocdu_tt_parted.b || ':' || excluded.b;
+NOTICE:  trigger = iocdu_tt_parted_update_trig, old table = , new table 
= 
+NOTICE:  trigger = iocdu_tt_parted_insert_trig, new table = (3,CCC), (4,DDD)
+drop table iocdu_tt_parted;
+--
 -- Verify that you can't create a trigger with transition tables for
 -- more than one event.
 --
diff --git a/src/test/regress/sql/triggers.sql 
b/src/test/regress/sql/triggers.sql
index 3354f4899f..3773c6bc98 100644
--- a/src/test/regress/sql/triggers.sql
+++ b/src/test/regress/sql/triggers.sql
@@ -1773,6 +1773,39 @@ insert into my_table values (3, 'CCC'), (4, 'DDD')
   update set b = my_table.b || ':' || excluded.b;
 
 --
+-- now using a partitioned table
+--
+
+create table iocdu_tt_parted (a int primary key, b text) partition by list (a);
+create table iocdu_tt_parted1 partition of iocdu_tt_parted for values in (1);
+create table iocdu_tt_parted2 partition of iocdu_tt_parted for values in (2);
+create table iocdu_tt_parted3 partition of iocdu_tt_parted for values in (3);
+create table iocdu_tt_parted4 partition of iocdu_tt_parted for values in (4);
+create trigger iocdu_tt_parted_insert_trig
+  after insert on iocdu_tt_parted referencing new table as new_table
+  for each statement execute procedure dump_insert();
+create trigger iocdu_tt_parted_update_trig
+  after update on iocdu_tt_parted referencing old table as old_table new table 
as new_table
+  for each statement execute procedure dump_update();
+
+-- inserts only
+insert into iocdu_tt_parted values (1, 'AAA'), (2, 'BBB')
+  on conflict (a) do
+  update set b = iocdu_tt_parted.b || ':' || excluded.b;
+
+-- mixture of inserts and updates
+insert into iocdu_t

Re: inserts into partitioned table may cause crash

2018-03-19 Thread Amit Langote
Fujita-san,

Thanks for the review.

On 2018/03/16 20:37, Etsuro Fujita wrote:
> (2018/03/14 17:25), Etsuro Fujita wrote:
>> (2018/03/14 14:54), Amit Langote wrote:
>>> Btw, I noticed that the patches place ExecPrepareTupleRouting (both the
>>> declaration and the definition) at different relative locations within
>>> nodeModifyTable.c in the HEAD branch vs. in the 10 branch. It might be a
>>> good idea to bring them to the same relative location to avoid hassle
>>> when
>>> back-patching relevant patches in the future. I did that in the attached
>>> updated version (v4) of the patch for HEAD, which does not make any other
>>> changes. Although, the patch for PG-10 is unchanged, I still changed its
>>> file name to contain v4.
>>
>> That's a good idea! Thanks for the updated patches!
> 
> Sorry, I didn't look at those patches carefully, but I noticed that while
> the patch for PG10 has put the definition of that function after the
> definitions of functions such as fireBSTriggers, fireASTriggers and
> ExecSetupTransitionCaptureState, but the patch for HEAD puts it *before*
> the definitions of these functions (note: these functions don't have their
> declarations at the file header), which seems a bit inconsistent.  ISTM
> that the v3 patches for PG10 and HEAD have already put that function in
> the right place in terms of that relativity.

That's correct, except that v3 had added the definition of
ExecPrepareTupleRouting after those of ExecSetupChildParentMapForSubplan,
ExecSetupChildParentMapForTcs, etc., which are new in the 11dev branch.

I've fixed the patch for HEAD to move the ExecPrepareTupleRouting
definition to appear after those of fireBSTriggers, fireASTriggers, and
ExecSetupTransitionCaptureState.

Attached are v5 patches for HEAD and 10 branches.

Thanks,
Amit
*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
***
*** 2656,2668  CopyFrom(CopyState cstate)
if (cstate->transition_capture != NULL)
{
if (resultRelInfo->ri_TrigDesc &&
!   
(resultRelInfo->ri_TrigDesc->trig_insert_before_row ||
!
resultRelInfo->ri_TrigDesc->trig_insert_instead_row))
{
/*
!* If there are any BEFORE or INSTEAD 
triggers on the
!* partition, we'll have to be ready to 
convert their
!* result back to tuplestore format.
 */

cstate->transition_capture->tcs_original_insert_tuple = NULL;
cstate->transition_capture->tcs_map =
--- 2656,2667 
if (cstate->transition_capture != NULL)
{
if (resultRelInfo->ri_TrigDesc &&
!   
resultRelInfo->ri_TrigDesc->trig_insert_before_row)
{
/*
!* If there are any BEFORE triggers on 
the partition,
!* we'll have to be ready to convert 
their result back to
!* tuplestore format.
 */

cstate->transition_capture->tcs_original_insert_tuple = NULL;
cstate->transition_capture->tcs_map =
***
*** 2803,2814  CopyFrom(CopyState cstate)
 * tuples inserted by an INSERT command.
 */
processed++;
  
!   if (saved_resultRelInfo)
!   {
!   resultRelInfo = saved_resultRelInfo;
!   estate->es_result_relation_info = resultRelInfo;
!   }
}
}
  
--- 2802,2814 
 * tuples inserted by an INSERT command.
 */
processed++;
+   }
  
!   /* Restore the saved ResultRelInfo */
!   if (saved_resultRelInfo)
!   {
!   resultRelInfo = saved_resultRelInfo;
!   estate->es_result_relation_info = resultRelInfo;
}
}
  
*** a/src/backend/executor/nodeModifyTable.c
--- b/src/backend/executor/nodeModifyTable.c
***
*** 62,67  static bool ExecOnConflictUpdate(ModifyTableState *mtstate,
--- 62,71 
 EState *estate,
 bool canSetTag,
 TupleTableSlot **returning);
+ static TupleTableSlot *E
