Re: Table refer leak in logical replication

2021-04-19 Thread Amit Langote
On Sat, Apr 17, 2021 at 10:32 PM Amit Kapila  wrote:
> On Fri, Apr 16, 2021 at 11:55 AM Michael Paquier  wrote:
> > On Tue, Apr 06, 2021 at 02:25:05PM +0900, Amit Langote wrote:
> >
> > Attached is v5 that I am finishing with.  Much more could be done but
> > I don't want to do something too invasive at this stage of the game.
> > There are a couple of extra relations in terms of relations opened for
> > a partitioned table within create_estate_for_relation() when
> > redirecting to the tuple routing, but my guess is that this would be
> > better in the long-term.
> >
>
> Hmm, I am not sure if it is a good idea to open indexes needlessly
> especially when it is not done in the previous code.
>
> @@ -1766,8 +1771,11 @@ apply_handle_tuple_routing(ResultRelInfo *relinfo,
>   slot_getallattrs(remoteslot);
>   }
>   MemoryContextSwitchTo(oldctx);
> +
> + ExecOpenIndices(partrelinfo_new, false);
>   apply_handle_insert_internal(partrelinfo_new, estate,
>   remoteslot_part);
> + ExecCloseIndices(partrelinfo_new);
>   }
>
> It seems you forgot to open the indexes before apply_handle_delete_internal.
>
> I am not sure if it is a good idea to do the refactoring related to
> indexes or other things to fix a minor bug in commit 1375422c. It
> might be better to add a simple fix like what Hou-San has originally
> proposed [1] because even using ExecInitResultRelation might not be
> the best thing as it is again trying to open a range table which we
> have already opened in logicalrep_rel_open.

FWIW, I agree with fixing this bug of 1375422c in the least scary
manner possible.  Hou-san proposed that we add the ResultRelInfo
that apply_handle_{insert|update|delete} initialize themselves to
es_opened_result_relations.  I would prefer that only
ExecInitResultRelation() add anything to es_opened_result_relations,
to avoid future maintenance problems.  Instead, a fix as simple as
Hou-san's proposed fix would be to add an ExecCloseResultRelations()
call at the end of each of apply_handle_{insert|update|delete}.  That
would fix the originally reported leak, because
ExecCloseResultRelations() has this:

/* Close any relations that have been opened by ExecGetTriggerResultRel(). */
foreach(l, estate->es_trig_target_relations)
{

We do end up, though, with the reality that trigger.c now opens the
replication target relation on its own (adding it to
es_trig_target_relations) in order to fire its triggers.
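
To make that concrete, the simple fix would amount to something like the
following at the tail of each apply handler (an illustrative sketch
against worker.c, not the actual proposed patch):

 	ExecResetTupleTable(estate->es_tupleTable, false);
+	/* Close any relations opened by ExecGetTriggerResultRel(). */
+	ExecCloseResultRelations(estate);
 	FreeExecutorState(estate);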

I am also not opposed to reviewing the changes of 1375422c in light of
these findings while we still have time.  For example, it might be
nice for ExecInitResultRelation to accept a Relation pointer that the
callers in copyfrom.c and worker.c can use to pass their locally
opened Relation.  In that case, ExecInitResultRelation() would perform
InitResultRelInfo() with that Relation pointer, instead of getting one
using ExecGetRangeTableRelation(), which is what causes the Relation
to be opened again.

--
Amit Langote
EDB: http://www.enterprisedb.com




Re: Performance Evaluation of Result Cache by using TPC-DS

2021-04-19 Thread David Rowley
On Wed, 14 Apr 2021 at 17:11, Yuya Watari  wrote:
> I ran query 62 by "EXPLAIN (ANALYZE, TIMING OFF)" and normally. I
> attached these execution results to this e-mail. At this time, I
> executed each query only once (not twice). The results are as follows.

Thanks for running that again.  I see from the EXPLAIN ANALYZE output
that the planner did cost the Result Cache plan slightly more
expensive than the Hash Join plan.  It's likely that add_path() did
not consider the Hash Join plan to be worth keeping because it was not
more than 1% better than the Result Cache plan: STD_FUZZ_FACTOR is set
so that new paths need to be at least 1% better than existing paths in
order to be kept.  That's pretty unfortunate, but it alone does not
mean the costs are incorrect.  It would be good to know whether that's
the case for the other queries too.
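
(For anyone unfamiliar with the fuzzy comparison, a minimal standalone
sketch of the rule, using PostgreSQL's STD_FUZZ_FACTOR value of 1.01;
the helper name here is mine, not the planner's:)

#include <stdio.h>
#include <stdbool.h>

#define STD_FUZZ_FACTOR 1.01	/* matches the definition in pathnode.c */

/* true if new_cost beats old_cost by more than the 1% fuzz */
static bool
fuzzily_cheaper(double new_cost, double old_cost)
{
	return new_cost * STD_FUZZ_FACTOR < old_cost;
}

int
main(void)
{
	/* A plan costing 0.5% less is "fuzzily the same", so it can be discarded. */
	printf("%d\n", fuzzily_cheaper(99.5, 100.0));	/* 0 */
	/* A plan costing 2% less is genuinely cheaper and will be kept. */
	printf("%d\n", fuzzily_cheaper(98.0, 100.0));	/* 1 */
	return 0;
}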

To test that, I've set up TPC-DS locally; however, it would be good if
you could send me the list of indexes that you've created.  I see the
tool from the Transaction Processing Performance Council for TPC-DS
only comes with the list of tables.

Can you share the output of:

select pg_get_indexdef(indexrelid) from pg_index where indrelid::regclass in (
'call_center',
'catalog_page',
'catalog_returns',
'catalog_sales',
'customer',
'customer_address',
'customer_demographics',
'date_dim',
'dbgen_version',
'household_demographics',
'income_band',
'inventory',
'item',
'promotion',
'reason',
'ship_mode',
'store',
'store_returns',
'store_sales',
'time_dim')
order by indrelid;

from your TPC-DS database?

David




Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb

2021-04-19 Thread Amul Sul
On Mon, Apr 19, 2021 at 12:25 PM Michael Paquier  wrote:
>
> On Fri, Apr 09, 2021 at 06:45:45PM -0400, Alvaro Herrera wrote:
> > We forgot this patch earlier in the commitfest.  Do people think we
> > should still get it in on this cycle?  I'm +1 on that, since it's a
> > safety feature poised to prevent more bugs than it's likely to
> > introduce.
>
> No objections from here to do that now even after feature freeze.  I
> also wonder, while looking at that, why you don't just remove the last
> call within src/backend/catalog/heap.c.  This way, nobody is tempted
> to use RelationOpenSmgr() anymore, and it could just be removed from
> rel.h.

Agreed, I did the same in the attached version, thanks.
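
For readers of the archive, the helper being added has roughly this
shape (a sketch reconstructed from the discussion; the attached patch
is authoritative):

static inline SMgrRelation
RelationGetSmgr(Relation rel)
{
	/* Open (and re-own) the smgr relation on first use after a cache flush. */
	if (unlikely(rel->rd_smgr == NULL))
		smgrsetowner(&(rel->rd_smgr),
					 smgropen(rel->rd_node, rel->rd_backend));
	return rel->rd_smgr;
}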

Regards,
Amul

P.S. commitfest entry https://commitfest.postgresql.org/33/3084/


v2_Add-RelationGetSmgr-inline-function.patch
Description: Binary data


Re: Replication slot stats misgivings

2021-04-19 Thread Masahiko Sawada
On Mon, Apr 19, 2021 at 2:14 PM Amit Kapila  wrote:
>
> On Mon, Apr 19, 2021 at 9:00 AM Masahiko Sawada  wrote:
> >
> > On Fri, Apr 16, 2021 at 2:58 PM Amit Kapila  wrote:
> > >
> > >
> > > 4.
> > > +CREATE VIEW pg_stat_replication_slots AS
> > > +SELECT
> > > +s.slot_name,
> > > +s.spill_txns,
> > > +s.spill_count,
> > > +s.spill_bytes,
> > > +s.stream_txns,
> > > +s.stream_count,
> > > +s.stream_bytes,
> > > +s.total_txns,
> > > +s.total_bytes,
> > > +s.stats_reset
> > > +FROM pg_replication_slots as r,
> > > +LATERAL pg_stat_get_replication_slot(slot_name) as s
> > > +WHERE r.datoid IS NOT NULL; -- excluding physical slots
> > > ..
> > > ..
> > >
> > > -/* Get the statistics for the replication slots */
> > > +/* Get the statistics for the replication slot */
> > >  Datum
> > > -pg_stat_get_replication_slots(PG_FUNCTION_ARGS)
> > > +pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
> > >  {
> > >  #define PG_STAT_GET_REPLICATION_SLOT_COLS 10
> > > - ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
> > > + text *slotname_text = PG_GETARG_TEXT_P(0);
> > > + NameData slotname;
> > >
> > > I think with the above changes getting all the slot stats has become
> > > much costlier. Is there any reason why we can't get all the stats from
> > > the new hash_table in one shot and return them to the user?
> >
> > I think the advantage of this approach would be that it can avoid
> > showing the stats for already-dropped slots. Like other statistics
> > views such as pg_stat_all_tables and pg_stat_all_functions, searching
> > the stats by the name obtained from pg_replication_slots can show only
> > available slot stats even if the hash table has garbage slot stats.
> >
>
> Sounds reasonable. However, if the create_slot message is missed, it
> will show an empty row for it. See below:
>
> postgres=# select slot_name, total_txns from pg_stat_replication_slots;
>  slot_name | total_txns
> -----------+------------
>  s1        |          0
>  s2        |          0
>            |
> (3 rows)
>
> Here, I have manually (via debugger) skipped sending the create_slot
> message for the third slot, and we show an empty row for it. This
> won't happen for pg_stat_all_tables, as it will set 0 or other initial
> values in such a case. I think we need to address this case.

Good catch. I think it's better to set all the counters to 0 and
stats_reset to NULL.
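
Perhaps something along these lines in pg_stat_get_replication_slot()
(a sketch with assumed variable names, not the actual patch):

	/*
	 * No stats entry for a live slot, e.g. because the create_slot
	 * message was lost: report zeroed counters instead of an empty
	 * row, leaving only stats_reset NULL.
	 */
	if (slotent == NULL)
	{
		memset(&allzero, 0, sizeof(allzero));
		slotent = &allzero;
	}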

>
> > Given that pg_stat_replication_slots doesn’t show garbage slot stats
> > even if it has, I thought we can avoid checking those garbage stats
> > frequently. It should not essentially be a problem for the hash table
> > to have entries up to max_replication_slots regardless of live or
> > already-dropped.
> >
>
> Yeah, but I guess we still might not save much by not doing it,
> especially because for the other cases like tables/functions, we are
> doing it without any threshold limit.

Agreed.

I've attached the updated patch, please review it.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


v8-0001-Use-HTAB-for-replication-slot-statistics.patch
Description: Binary data


Re: Performance degradation of REFRESH MATERIALIZED VIEW

2021-04-19 Thread Kyotaro Horiguchi
At Mon, 19 Apr 2021 13:32:31 +0900, Masahiko Sawada  
wrote in 
> On Fri, Apr 16, 2021 at 12:16 PM Kyotaro Horiguchi
>  wrote:
> > AFAICS the page is always empty when RelationGetBufferForTuple
> > returned a valid vmbuffer.  So the "if" should be an "assert" instead.
> 
> There is a chance that RelationGetBufferForTuple() returns a valid
> vmbuffer but the page is not empty, since RelationGetBufferForTuple()
> checks without a lock if the page is empty. But when it comes to
> HEAP_INSERT_FROZEN cases it actually doesn’t happen at least for now
> since only one process inserts tuples into the relation. Will fix.

Yes.  It seems to me that it is cleaner if RelationGetBufferForTuple
returns a vmbuffer only when the caller needs to change the vm state.
Thanks.

> > And, the patch changes the value of all_frozen_set to false when the
> > page was already all-frozen (thus not empty). It would be fine since
> > we don't need to change the visibility of the page in that case but
> > the variable name is no longer correct.  set_all_visible or such?
> 
> It seems to me that the variable name all_frozen_set corresponds to
> XLH_INSERT_ALL_FROZEN_SET but I see your point. How about
> set_all_frozen instead since we set all-frozen bits (also implying
> setting all-visible)?

Right. However, "if (set_all_frozen) then set all_visible" looks like
a bug^^;.  all_frozen_set looks better in that context than
set_all_frozen, so I withdraw the comment.

> BTW I found the following description of XLH_INSERT_ALL_FROZEN_SET but
> there is no all_visible_set anywhere:
> 
> /* all_frozen_set always implies all_visible_set */
> #define XLH_INSERT_ALL_FROZEN_SET   (1<<5)
> 
> I'll update those comments as well.

FWIW, it seems like a shorthand of "ALL_FROZEN_SET implies ALL_VISIBLE
being set together".  The current comment works for me.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


Doubt with [ RANGE partition with TEXT datatype ]

2021-04-19 Thread Prabhat Sahu
Hi All,

Please help me out with my doubt about RANGE partitioning with the TEXT datatype:

postgres=# create table tab1 (col1 text) PARTITION BY RANGE (col1);
CREATE TABLE

postgres=# create table p1 (col1 text);
CREATE TABLE

-- Partition with range from '5' to '10' shows error:
postgres=# alter table tab1 attach partition p1 for values from ('5') to
('10');
ERROR:  empty range bound specified for partition "p1"
LINE 1: ...r table tab1 attach partition p1 for values from ('5') to ('...
 ^
DETAIL:  Specified lower bound ('5') is greater than or equal to upper
bound ('10').

-- Whereas, partition with range from '5' to '9' is working fine as below:
postgres=# alter table tab1 attach partition p1 for values from ('5') to
('9');
ALTER TABLE

If this behavior is expected, kindly let me know how to represent the
range from '5' to '10' with a text datatype column.
Is there any specific restriction for RANGE partitioned tables with a
TEXT datatype column?

A similar test scenario works fine with the INTEGER datatype.

-- 

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Re: Table refer leak in logical replication

2021-04-19 Thread Amit Langote
On Fri, Apr 16, 2021 at 3:24 PM Michael Paquier  wrote:
> On Tue, Apr 06, 2021 at 02:25:05PM +0900, Amit Langote wrote:
> > While updating the patch to do so, it occurred to me that maybe we
> > could move the ExecInitResultRelation() call into
> > create_estate_for_relation() too, in the spirit of removing
> > duplicative code.  See attached updated patch.  Actually I remember
> > proposing that as part of the commit you shared in your earlier email,
> > but for some reason it didn't end up in the commit.  I now think maybe
> > we should do that after all.
>
> So, I have been studying 1375422c and this thread.  Using
> ExecCloseRangeTableRelations() when cleaning up the executor state is
> reasonable as of the equivalent call to ExecInitRangeTable().  Now,
> there is something that itched me quite a lot while reviewing the
> proposed patch: ExecCloseResultRelations() is missing, but other
> code paths make an effort to mirror any calls of ExecInitRangeTable()
> with it.  Looking closer, I think that the worker code is actually
> confused with the handling of the opening and closing of the indexes
> needed by a result relation once you use that, because some code paths
> related to tuple routing for partitions may, or may not, need to do
> that.  However, once the code integrates with ExecInitRangeTable() and
> ExecCloseResultRelations(), the index handlings could get much simpler
> to understand as the internal apply routines for INSERT/UPDATE/DELETE
> have no need to think about the opening or closing them.  Well,
> mostly, to be honest.

To bring up another point, if we were to follow the spirit of the
recent c5b7ba4e67a, whereby we moved ExecOpenIndices() from
ExecInitModifyTable() into ExecInsert() and ExecUpdate(), that is,
from during the initialization phase of an INSERT/UPDATE to its actual
execution, we could even consider performing Exec{Open|Close}Indices()
for a replication target relation in
ExecSimpleRelation{Insert|Update}.  The ExecOpenIndices() performed in
worker.c is pointless in some cases, for example, when
ExecSimpleRelation{Insert|Update} end up skipping the insert/update of
a tuple due to BR triggers.
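
In patch form, that idea would look roughly like this against
ExecSimpleRelationInsert() (illustrative only, not a worked-out patch):

 	if (!skip_tuple)
 	{
+		/* Open the indexes only once we know the tuple will be inserted. */
+		ExecOpenIndices(resultRelInfo, false);
 		...
 		if (resultRelInfo->ri_NumIndices > 0)
 			recheckIndexes = ExecInsertIndexTuples(...);
+		ExecCloseIndices(resultRelInfo);
 	}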

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: logical replication empty transactions

2021-04-19 Thread Peter Smith
On Thu, Apr 15, 2021 at 4:39 PM Ajin Cherian  wrote:
>
>
>
> On Thu, Apr 15, 2021 at 1:29 PM Ajin Cherian  wrote:
>>
>>
>> I've rebased the patch and made changes so that it supports
>> "streaming in-progress transactions" and handles logical decoding
>> messages (transactional and non-transactional).
>> I see that this patch not only makes sure that empty transactions are
>> not sent, but also calls OutputPluginUpdateProgress when an empty
>> transaction is not sent; as a result, the confirmed_flush_lsn keeps moving.
>> I also see no hangs when synchronous_standby is configured.
>> Do let me know your thoughts on this patch.

REVIEW COMMENTS

I applied this patch to today's HEAD and successfully ran "make check"
and also the subscription TAP tests.

Here are some review comments:

--

1. The patch v3 applied OK but with whitespace warnings

[postgres@CentOS7-x64 oss_postgres_2PC]$ git apply
../patches_misc/v3-0001-Skip-empty-transactions-for-logical-replication.patch
../patches_misc/v3-0001-Skip-empty-transactions-for-logical-replication.patch:98:
indent with spaces.
/* output BEGIN if we haven't yet, avoid for streaming and
non-transactional messages */
../patches_misc/v3-0001-Skip-empty-transactions-for-logical-replication.patch:99:
indent with spaces.
if (!data->xact_wrote_changes && !in_streaming && transactional)
warning: 2 lines add whitespace errors.

--

2. Please create a CF entry in [1] for this patch.

--

3. Patch comment

The comment describes the problem and then suddenly just says
"Postpone the BEGIN message until the first change."

I suggest changing it to say something more like... "(blank line) This
patch addresses the above problem by postponing the BEGIN message
until the first change."

--

4. pgoutput.h

Maybe for consistency with the context member, the comment for the new
member should be to the right instead of above it?

@@ -20,6 +20,9 @@ typedef struct PGOutputData
  MemoryContext context; /* private memory context for transient
  * allocations */

+ /* flag indicating whether messages have previously been sent */
+ boolxact_wrote_changes;
+

--

5. pgoutput.h

+ /* flag indicating whether messages have previously been sent */

"previously been sent" --> "already been sent" ??

--

6. pgoutput.h - misleading member name

Actually, now that I have read the rest of the code and seen how this
member is used, I feel that this name is very misleading. E.g. in the
"streaming" case you are still writing changes but are not setting
this member at all, so it does not always mean what it says.

I feel a better name for this would be something like
"sent_begin_txn". Then if you have sent BEGIN it is true, and if you
haven't sent BEGIN it is false. Naming it this way eliminates the
ambiguity.

(This makes my feedback #5 redundant because the comment will be a bit
different if you do this).

--

7. pgoutput.c - function pgoutput_begin_txn

@@ -345,6 +345,23 @@ pgoutput_startup(LogicalDecodingContext *ctx,
OutputPluginOptions *opt,
 static void
 pgoutput_begin_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)
 {

I guess that you still needed to pass the txn because that is how the
API is documented, right?

But I am wondering if you ought to flag it as unused so you won't get
some BF machine giving warnings about it.

e.g. Syntax like this?

pgoutput_begin_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)
{
	(void) txn;
	...

--

8. pgoutput.c - function pgoutput_begin_txn

@@ -345,6 +345,23 @@ pgoutput_startup(LogicalDecodingContext *ctx,
OutputPluginOptions *opt,
 static void
 pgoutput_begin_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)
 {
+ PGOutputData *data = ctx->output_plugin_private;
+
+ /*
+ * Don't send BEGIN message here. Instead, postpone it until the first
+ * change. In logical replication, a common scenario is to replicate a set
+ * of tables (instead of all tables) and transactions whose changes were on
+ * table(s) that are not published will produce empty transactions. These
+ * empty transactions will send BEGIN and COMMIT messages to subscribers,
+ * using bandwidth on something with little/no use for logical replication.
+ */
+ data->xact_wrote_changes = false;
+ elog(LOG,"Holding of begin");
+}

Why is this loglevel LOG? Looks like leftover debugging.

--

9. pgoutput.c - function pgoutput_commit_txn

@@ -384,8 +401,14 @@ static void
 pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
  XLogRecPtr commit_lsn)
 {
+ PGOutputData *data = ctx->output_plugin_private;
+
  OutputPluginUpdateProgress(ctx);

+ /* skip COMMIT message if nothing was sent */
+ if (!data->xact_wrote_changes)
+ return;
+

In the case where you decide to do nothing, does it make sense that
the function OutputPluginUpdateProgress(ctx) is still called?
I thought perhaps your new check should come first, so this call
would never happen.
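
That is, something like this (a sketch of the suggestion; note it would
also skip the progress update for empty transactions, which interacts
with the confirmed_flush_lsn point made earlier in the thread):

 pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 					XLogRecPtr commit_lsn)
 {
 	PGOutputData *data = ctx->output_plugin_private;

-	OutputPluginUpdateProgress(ctx);
-
 	/* skip COMMIT message if nothing was sent */
 	if (!data->xact_wrote_changes)
 		return;

+	OutputPluginUpdateProgress(ctx);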

--

10. pgoutput.c - variable declaration

Re: Performance degradation of REFRESH MATERIALIZED VIEW

2021-04-19 Thread Masahiko Sawada
On Mon, Apr 19, 2021 at 5:04 PM Kyotaro Horiguchi
 wrote:
>
> At Mon, 19 Apr 2021 13:32:31 +0900, Masahiko Sawada  
> wrote in
> > On Fri, Apr 16, 2021 at 12:16 PM Kyotaro Horiguchi
> >  wrote:
> > > AFAICS the page is always empty when RelationGetBufferForTuple
> > > returned a valid vmbuffer.  So the "if" should be an "assert" instead.
> >
> > There is a chance that RelationGetBufferForTuple() returns a valid
> > vmbuffer but the page is not empty, since RelationGetBufferForTuple()
> > checks without a lock if the page is empty. But when it comes to
> > HEAP_INSERT_FROZEN cases it actually doesn’t happen at least for now
> > since only one process inserts tuples into the relation. Will fix.
>
> Yes.  It seems to me that it is cleaner if RelationGetBufferForTuple
> returns a vmbuffer only when the caller needs to change the vm state.
> Thanks.
>
> > > And, the patch changes the value of all_frozen_set to false when the
> > > page was already all-frozen (thus not empty). It would be fine since
> > > we don't need to change the visibility of the page in that case but
> > > the variable name is no longer correct.  set_all_visible or such?
> >
> > It seems to me that the variable name all_frozen_set corresponds to
> > XLH_INSERT_ALL_FROZEN_SET but I see your point. How about
> > set_all_frozen instead since we set all-frozen bits (also implying
> > setting all-visible)?
>
> Right. However, "if (set_all_frozen) then set all_visible" looks like
> a bug^^;.  all_frozen_set looks better in that context than
> set_all_frozen, so I withdraw the comment.
>
> > BTW I found the following description of XLH_INSERT_ALL_FROZEN_SET but
> > there is no all_visible_set anywhere:
> >
> > /* all_frozen_set always implies all_visible_set */
> > #define XLH_INSERT_ALL_FROZEN_SET   (1<<5)
> >
> > I'll update those comments as well.
>
> FWIW, it seems like a shorthand of "ALL_FROZEN_SET implies ALL_VISIBLE
> being set together". The current comment works for me.
>

Okay, I've updated the patch accordingly. Please review it.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


skip_vmbuffer_for_frozen_tuple_insertion_v2.patch
Description: Binary data


Re: Table refer leak in logical replication

2021-04-19 Thread Amit Kapila
On Mon, Apr 19, 2021 at 1:51 PM Amit Langote  wrote:
>
> On Fri, Apr 16, 2021 at 3:24 PM Michael Paquier  wrote:
> > On Tue, Apr 06, 2021 at 02:25:05PM +0900, Amit Langote wrote:
> > > While updating the patch to do so, it occurred to me that maybe we
> > > could move the ExecInitResultRelation() call into
> > > create_estate_for_relation() too, in the spirit of removing
> > > duplicative code.  See attached updated patch.  Actually I remember
> > > proposing that as part of the commit you shared in your earlier email,
> > > but for some reason it didn't end up in the commit.  I now think maybe
> > > we should do that after all.
> >
> > So, I have been studying 1375422c and this thread.  Using
> > ExecCloseRangeTableRelations() when cleaning up the executor state is
> > reasonable as of the equivalent call to ExecInitRangeTable().  Now,
> > there is something that itched me quite a lot while reviewing the
> > proposed patch: ExecCloseResultRelations() is missing, but other
> > code paths make an effort to mirror any calls of ExecInitRangeTable()
> > with it.  Looking closer, I think that the worker code is actually
> > confused with the handling of the opening and closing of the indexes
> > needed by a result relation once you use that, because some code paths
> > related to tuple routing for partitions may, or may not, need to do
> > that.  However, once the code integrates with ExecInitRangeTable() and
> > ExecCloseResultRelations(), the index handlings could get much simpler
> > to understand as the internal apply routines for INSERT/UPDATE/DELETE
> > have no need to think about the opening or closing them.  Well,
> > mostly, to be honest.
>
> To bring up another point, if we were to follow the spirit of the
> recent c5b7ba4e67a, whereby we moved ExecOpenIndices() from
> ExecInitModifyTable() into ExecInsert() and ExecUpdate(), that is,
> from during the initialization phase of an INSERT/UPDATE to its actual
> execution, we could even consider performing Exec{Open|Close}Indices()
> for a replication target relation in
> ExecSimpleRelation{Insert|Update}.  The ExecOpenIndices() performed in
> worker.c is pointless in some cases, for example, when
> ExecSimpleRelation{Insert|Update} end up skipping the insert/update of
> a tuple due to BR triggers.
>

Yeah, that is also worth considering and sounds like a good idea. But,
as I mentioned before it might be better to consider this separately.

-- 
With Regards,
Amit Kapila.




Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb

2021-04-19 Thread Kyotaro Horiguchi
At Mon, 19 Apr 2021 12:56:18 +0530, Amul Sul  wrote in 
> On Mon, Apr 19, 2021 at 12:25 PM Michael Paquier  wrote:
> >
> > On Fri, Apr 09, 2021 at 06:45:45PM -0400, Alvaro Herrera wrote:
> > > We forgot this patch earlier in the commitfest.  Do people think we
> > > should still get it in on this cycle?  I'm +1 on that, since it's a
> > > safety feature poised to prevent more bugs than it's likely to
> > > introduce.
> >
> > No objections from here to do that now even after feature freeze.  I
> > also wonder, while looking at that, why you don't just remove the last
> > call within src/backend/catalog/heap.c.  This way, nobody is tempted
> > to use RelationOpenSmgr() anymore, and it could just be removed from
> > rel.h.
> 
> Agreed, I did the same in the attached version, thanks.

+   smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BLOOM_METAPAGE_BLKNO,
  (char *) metapage, true);
-   log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
+   log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,

At log_newpage, the index is guaranteed to have rd_smgr, so I prefer
to leave that line alone.  I don't mind other successive calls, if any,
since what I don't like is the notation there.

> P.S. commitfest entry https://commitfest.postgresql.org/33/3084/

Isn't this a kind of open item?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Doubt with [ RANGE partition with TEXT datatype ]

2021-04-19 Thread Amit Langote
Hi Prabhat,

On Mon, Apr 19, 2021 at 5:13 PM Prabhat Sahu
 wrote:
>
> Hi All,
>
> Please help me out with my doubt about RANGE partitioning with the TEXT datatype:
>
> postgres=# create table tab1 (col1 text) PARTITION BY RANGE (col1);
> CREATE TABLE
>
> postgres=# create table p1 (col1 text);
> CREATE TABLE
>
> -- Partition with range from '5' to '10' shows error:
> postgres=# alter table tab1 attach partition p1 for values from ('5') to 
> ('10');
> ERROR:  empty range bound specified for partition "p1"
> LINE 1: ...r table tab1 attach partition p1 for values from ('5') to ('...
>  ^
> DETAIL:  Specified lower bound ('5') is greater than or equal to upper bound 
> ('10').
>
> -- Whereas, partition with range from '5' to '9' is working fine as below:
> postgres=# alter table tab1 attach partition p1 for values from ('5') to 
> ('9');
> ALTER TABLE

Well, that is how comparing text values works.  If you are expecting
the comparisons to follow numerical rules, use a numeric data type.

> If this behavior is expected, kindly let me know how to represent the range
> from '5' to '10' with a text datatype column.

I don't know why you want to use the text type for the column and these
particular values for the partition bounds, but one workaround would
be to use '05' instead of '5'.
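
A minimal standalone illustration of the point, for the curious
(PostgreSQL's text comparison is collation-aware, but for ASCII digits
the ordering matches plain byte comparison):

#include <stdio.h>
#include <string.h>

int
main(void)
{
	/* Text compares character by character, so '1' vs '5' decides it. */
	printf("%d\n", strcmp("10", "5") < 0);	/* 1: '10' sorts before '5' */
	printf("%d\n", strcmp("5", "9") < 0);	/* 1: so FROM '5' TO '9' works */
	printf("%d\n", strcmp("05", "10") < 0);	/* 1: zero-padding restores numeric order */
	return 0;
}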


--
Amit Langote
EDB: http://www.enterprisedb.com




Re: Windows default locale vs initdb

2021-04-19 Thread Pavel Stehule
po 19. 4. 2021 v 7:43 odesílatel Thomas Munro 
napsal:

> Hi,
>
> Moving this topic into its own thread from the one about collation
> versions, because it concerns pre-existing problems, and that thread
> is long.
>
> Currently initdb sets up template databases with old-style Windows
> locale names reported by the OS, and they seem to have caused us quite
> a few problems over the years:
>
> db29620d "Work around Windows locale name with non-ASCII character."
> aa1d2fc5 "Another attempt at fixing Windows Norwegian locale."
> db477b69 "Deal with yet another issue related to "Norwegian (Bokmål)"..."
> 9f12a3b9 "Tolerate version lookup failure for old style Windows locale..."
>
> ... and probably more, and also various threads about, for example,
> "German_German.1252" vs "German_Switzerland.1252" which seem to get
> confused or badly canonicalised or rejected somewhere in the mix.
>
> I hadn't focused on any of that before, being a non-Windows-user, but
> the entire contents of win32setlocale.c supports the theory that
> Windows' manual meant what it said when it said[1]:
>
> "We do not recommend this form for locale strings embedded in
> code or serialized to storage, because these strings are more likely
> to be changed by an operating system update than the locale name
> form."
>
> I suppose that was the only form available at the time the code was
> written, so there was no choice.  The question we asked ourselves
> multiple times in the other thread was how we're supposed to get to
> the modern BCP 47 form when creating the template databases.  It looks
> like one possibility, since Vista, is to call
> GetUserDefaultLocaleName()[2], which doesn't appear to have been
> discussed before on this list.  That doesn't allow you to ask for the
> default for each individual category, but I don't know if that is even
> a concept for Windows user settings.  It may be that some of the other
> nearby functions give a better answer for some reason.  But one thing
> is clear from a test that someone kindly ran for me: it reports
> standardised strings like "en-NZ", not strings like "English_New
> Zealand.1252".
>
> No patch, but I wondered if any Windows hackers have any feedback on
> relative sanity of trying to fix all these problems this way.
>

Last weekend I talked with one user about an interesting (and messy)
issue. They needed to create a new database with Czech collation on Azure
SAS. There was no entry in pg_collation for the Czech language. The reply
from Microsoft support was to use CREATE DATABASE xxx TEMPLATE 'template0'
ENCODING 'utf8' LOCALE 'cs_CZ.UTF8', and it was working.
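
(For reference, the GetUserDefaultLocaleName() call Thomas mentions
above is about this simple — a minimal Windows-only sketch:)

#include <windows.h>
#include <stdio.h>

int
main(void)
{
	wchar_t		name[LOCALE_NAME_MAX_LENGTH];

	/* Reports the BCP 47 form, e.g. "en-NZ", not "English_New Zealand.1252". */
	if (GetUserDefaultLocaleName(name, LOCALE_NAME_MAX_LENGTH) == 0)
	{
		fprintf(stderr, "GetUserDefaultLocaleName failed: %lu\n",
				(unsigned long) GetLastError());
		return 1;
	}
	wprintf(L"%ls\n", name);
	return 0;
}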

Regards

Pavel


> [1]
> https://docs.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings?view=msvc-160
> [2]
> https://docs.microsoft.com/en-us/windows/win32/api/winnls/nf-winnls-getuserdefaultlocalename
>
>
>


Re: Table refer leak in logical replication

2021-04-19 Thread Amit Kapila
On Mon, Apr 19, 2021 at 12:32 PM Amit Langote  wrote:
>
> On Sat, Apr 17, 2021 at 10:32 PM Amit Kapila  wrote:
> > On Fri, Apr 16, 2021 at 11:55 AM Michael Paquier  
> > wrote:
> > > On Tue, Apr 06, 2021 at 02:25:05PM +0900, Amit Langote wrote:
> > >
> > > Attached is v5 that I am finishing with.  Much more could be done but
> > > I don't want to do something too invasive at this stage of the game.
> > > There are a couple of extra relations in terms of relations opened for
> > > a partitioned table within create_estate_for_relation() when
> > > redirecting to the tuple routing, but my guess is that this would be
> > > better in the long-term.
> > >
> >
> > Hmm, I am not sure if it is a good idea to open indexes needlessly
> > especially when it is not done in the previous code.
> >
> > @@ -1766,8 +1771,11 @@ apply_handle_tuple_routing(ResultRelInfo *relinfo,
> >   slot_getallattrs(remoteslot);
> >   }
> >   MemoryContextSwitchTo(oldctx);
> > +
> > + ExecOpenIndices(partrelinfo_new, false);
> >   apply_handle_insert_internal(partrelinfo_new, estate,
> >   remoteslot_part);
> > + ExecCloseIndices(partrelinfo_new);
> >   }
> >
> > It seems you forgot to open the indexes before
> > apply_handle_delete_internal.
> >
> > I am not sure if it is a good idea to do the refactoring related to
> > indexes or other things to fix a minor bug in commit 1375422c. It
> > might be better to add a simple fix like what Hou-San has originally
> > proposed [1] because even using ExecInitResultRelation might not be
> > the best thing as it is again trying to open a range table which we
> > have already opened in logicalrep_rel_open.
>
> FWIW, I agree with fixing this bug of 1375422c in the least scary
> manner possible.  Hou-san proposed that we add the ResultRelInfo
> that apply_handle_{insert|update|delete} initialize themselves to
> es_opened_result_relations.  I would prefer that only
> ExecInitResultRelation() add anything to es_opened_result_relations,
> to avoid future maintenance problems.  Instead, a fix as simple as
> Hou-san's proposed fix would be to add an ExecCloseResultRelations()
> call at the end of each of apply_handle_{insert|update|delete}.
>

Yeah, that will work too but might look a bit strange. BTW, how is
that taken care of in ExecuteTruncateGuts? I mean, we do add rels
there, as in Hou-San's patch, without calling
ExecCloseResultRelations; the rels are probably closed when we close
the relation in worker.c, but what about the memory for the list?

-- 
With Regards,
Amit Kapila.




Re: Doubt with [ RANGE partition with TEXT datatype ]

2021-04-19 Thread Prabhat Sahu
On Mon, Apr 19, 2021 at 2:16 PM Amit Langote 
wrote:

> Hi Prabhat,
>
> On Mon, Apr 19, 2021 at 5:13 PM Prabhat Sahu
>  wrote:
> >
> > Hi All,
> >
> > Please help me out with my doubt about RANGE partitioning with the TEXT datatype:
> >
> > postgres=# create table tab1 (col1 text) PARTITION BY RANGE (col1);
> > CREATE TABLE
> >
> > postgres=# create table p1 (col1 text);
> > CREATE TABLE
> >
> > -- Partition with range from '5' to '10' shows error:
> > postgres=# alter table tab1 attach partition p1 for values from ('5') to
> ('10');
> > ERROR:  empty range bound specified for partition "p1"
> > LINE 1: ...r table tab1 attach partition p1 for values from ('5') to
> ('...
> >  ^
> > DETAIL:  Specified lower bound ('5') is greater than or equal to upper
> bound ('10').
> >
> > -- Whereas, partition with range from '5' to '9' is working fine as
> below:
> > postgres=# alter table tab1 attach partition p1 for values from ('5') to
> ('9');
> > ALTER TABLE
>
> Well, that is how comparing text values works.  If you are expecting
> the comparisons to follow numerical rules, use a numeric data type.
>
> > If this behavior is expected, kindly let me know how to represent the
> > range from '5' to '10' with a text datatype column.
>
> I don't know why you want to use the text type for the column and these
> particular values for the partition bounds, but one workaround would
> be to use '05' instead of '5'.
>

While testing some PG behavior, I came across this scenario/doubt.
Thank you, Amit, for the clarification.

-- 

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Re: Table refer leak in logical replication

2021-04-19 Thread Amit Langote
On Mon, Apr 19, 2021 at 6:03 PM Amit Kapila  wrote:
> On Mon, Apr 19, 2021 at 12:32 PM Amit Langote  wrote:
> > On Sat, Apr 17, 2021 at 10:32 PM Amit Kapila  
> > wrote:
> > > On Fri, Apr 16, 2021 at 11:55 AM Michael Paquier  
> > > wrote:
> > > > Attached is v5 that I am finishing with.  Much more could be done but
> > > > I don't want to do something too invasive at this stage of the game.
> > > > There are a couple of extra relations in terms of relations opened for
> > > > a partitioned table within create_estate_for_relation() when
> > > > redirecting to the tuple routing, but my guess is that this would be
> > > > better in the long-term.
> > > >
> > >
> > > Hmm, I am not sure if it is a good idea to open indexes needlessly
> > > especially when it is not done in the previous code.
> > >
> > > @@ -1766,8 +1771,11 @@ apply_handle_tuple_routing(ResultRelInfo *relinfo,
> > >   slot_getallattrs(remoteslot);
> > >   }
> > >   MemoryContextSwitchTo(oldctx);
> > > +
> > > + ExecOpenIndices(partrelinfo_new, false);
> > >   apply_handle_insert_internal(partrelinfo_new, estate,
> > >   remoteslot_part);
> > > + ExecCloseIndices(partrelinfo_new);
> > >   }
> > >
> > > It seems you forgot to open the indexes before
> > > apply_handle_delete_internal.
> > >
> > > I am not sure if it is a good idea to do the refactoring related to
> > > indexes or other things to fix a minor bug in commit 1375422c. It
> > > might be better to add a simple fix like what Hou-San has originally
> > > proposed [1] because even using ExecInitResultRelation might not be
> > > the best thing as it is again trying to open a range table which we
> > > have already opened in logicalrep_rel_open.
> >
> > FWIW, I agree with fixing this bug of 1375422c in the least scary
> > manner possible.  Hou-san proposed that we add the ResultRelInfo
> > that apply_handle_{insert|update|delete} initialize themselves to
> > es_opened_result_relations.  I would prefer that only
> > ExecInitResultRelation() add anything to es_opened_result_relations,
> > to avoid future maintenance problems.  Instead, a fix as simple as
> > Hou-san's proposed fix would be to add an ExecCloseResultRelations()
> > call at the end of each of apply_handle_{insert|update|delete}.
> >
>
> Yeah, that will work too but might look a bit strange. BTW, how is
> that taken care of in ExecuteTruncateGuts? I mean, we do add rels
> there, as in Hou-San's patch, without calling
> ExecCloseResultRelations; the rels are probably closed when we close
> the relation in worker.c, but what about the memory for the list?

It seems I had forgotten the code I had written myself.  The following
is from ExecuteTruncateGuts():

/*
 * To fire triggers, we'll need an EState as well as a ResultRelInfo for
 * each relation.  We don't need to call ExecOpenIndices, though.
 *
 * We put the ResultRelInfos in the es_opened_result_relations list, even
 * though we don't have a range table and don't populate the
 * es_result_relations array.  That's a bit bogus, but it's enough to make
 * ExecGetTriggerResultRel() find them.
 */
estate = CreateExecutorState();
resultRelInfos = (ResultRelInfo *)
palloc(list_length(rels) * sizeof(ResultRelInfo));
resultRelInfo = resultRelInfos;
foreach(cell, rels)
{
Relation    rel = (Relation) lfirst(cell);

InitResultRelInfo(resultRelInfo,
  rel,
  0,    /* dummy rangetable index */
  NULL,
  0);
estate->es_opened_result_relations =
lappend(estate->es_opened_result_relations, resultRelInfo);
resultRelInfo++;
}

So, that is exactly what Hou-san's patch did.  Although, the comment
does admit that doing this is a bit bogus, and was maybe written (by
Heikki IIRC) as a caution against repeating the pattern.


--
Amit Langote
EDB: http://www.enterprisedb.com




Re: Table refer leak in logical replication

2021-04-19 Thread Michael Paquier
On Mon, Apr 19, 2021 at 02:33:10PM +0530, Amit Kapila wrote:
> On Mon, Apr 19, 2021 at 12:32 PM Amit Langote  wrote:
>> FWIW, I agree with fixing this bug of 1375422c in the least scary
>> manner possible.  Hou-san proposed that we add the ResultRelInfo
>> that apply_handle_{insert|update|delete} initialize themselves to
>> es_opened_result_relations.  I would prefer that only
>> ExecInitResultRelation() add anything to es_opened_result_relations,
>> to avoid future maintenance problems.  Instead, a fix as simple as
>> Hou-san's proposed fix would be to add an ExecCloseResultRelations()
>> call at the end of each of apply_handle_{insert|update|delete}.
> 
> Yeah, that will work too but might look a bit strange. BTW, how is
> that taken care of in ExecuteTruncateGuts? I mean, we do add rels
> there, as in Hou-San's patch, without calling
> ExecCloseResultRelations; the rels are probably closed when we close
> the relation in worker.c, but what about the memory for the list?

TRUNCATE relies on FreeExecutorState() for that, no?  FWIW, I'd rather
agree to use what has been proposed with es_opened_result_relations
like TRUNCATE does rather than attempt to use ExecInitResultRelation()
combined with potentially asymmetric calls to
ExecCloseResultRelations().
--
Michael


signature.asc
Description: PGP signature


Re: Table refer leak in logical replication

2021-04-19 Thread Amit Kapila
On Mon, Apr 19, 2021 at 3:02 PM Michael Paquier  wrote:
>
> On Mon, Apr 19, 2021 at 02:33:10PM +0530, Amit Kapila wrote:
> > On Mon, Apr 19, 2021 at 12:32 PM Amit Langote  
> > wrote:
> >> FWIW, I agree with fixing this bug of 1375422c in the least scary
> >> manner possible.  Hou-san proposed that we add the ResultRelInfo
> >> that apply_handle_{insert|update|delete} initialize themselves to
> >> es_opened_result_relations.  I would prefer that only
> >> ExecInitResultRelation() add anything to es_opened_result_relations,
> >> to avoid future maintenance problems.  Instead, a fix as simple as
> >> Hou-san's proposed fix would be to add an ExecCloseResultRelations()
> >> call at the end of each of apply_handle_{insert|update|delete}.
> >
> > > Yeah, that will work too but might look a bit strange. BTW, how is
> > > that taken care of in ExecuteTruncateGuts? I mean, we do add rels
> > > there, as in Hou-San's patch, without calling
> > > ExecCloseResultRelations; the rels are probably closed when we close
> > > the relation in worker.c, but what about the memory for the list?
>
> TRUNCATE relies on FreeExecutorState() for that, no?
>

I am not sure about that because it doesn't seem to be allocated in
es_query_cxt. Note, we switch to oldcontext in the
CreateExecutorState.

-- 
With Regards,
Amit Kapila.




Re: Old Postgresql version on i7-1165g7

2021-04-19 Thread Yura Sokolov

Yura Sokolov wrote 2021-04-18 23:29:

Tom Lane wrote 2021-04-13 17:45:

Justin Pryzby  writes:

On Fri, Apr 09, 2021 at 04:28:25PM +0300, Yura Sokolov wrote:
Occasionally I found I'm not able to `make check` old Postgresql
versions.


I've bisected between REL_11_0 and "Rename pg_rewind's copy_file_range()"
and found 372728b0d49552641f0ea83d9d2e08817de038fa "Replace our
traditional initial-catalog-data format with a better design."
This is the first commit where `make check` doesn't fail during initdb
on my machine.


This doesn't make much sense or help much, since 372728b doesn't
actually change the catalogs, or any .c file.


It could make sense if some part of the toolchain that was previously
used to generate postgres.bki doesn't work right on that machine.
Overall though I'd have thought that 372728b would increase not
decrease our toolchain footprint.  It also seems unlikely that a
recent Ubuntu release would contain toolchain bugs that we hadn't
already heard about.


You used make clean too, right?


Really, when bisecting, you need to use "make distclean" or even
"git clean -dfx" between steps, or you may get bogus results,
because our makefiles aren't that great about tracking dependencies,
especially when you move backwards in the history.


Yep, "git clean -dfx" did the job. "make distclean" didn't, btw.
I've had "src/backend/catalog/schemapg.h" file in source tree
generated with "make submake-generated-headers" on REL_13_0.
It were not shown with "git status", therefore I didn't notice its
existence. It were not deleted neither with "make distclean", nor with
"git clean -dx" I tried before. Only "git clean -dfx" deletes it.

Thank you for the suggestion, Tom. You've saved my sanity.

Regards,
Yura Sokolov.




Re: Table refer leak in logical replication

2021-04-19 Thread Amit Kapila
On Mon, Apr 19, 2021 at 3:12 PM Amit Kapila  wrote:
>
> On Mon, Apr 19, 2021 at 3:02 PM Michael Paquier  wrote:
> >
> > On Mon, Apr 19, 2021 at 02:33:10PM +0530, Amit Kapila wrote:
> > > On Mon, Apr 19, 2021 at 12:32 PM Amit Langote  
> > > wrote:
> > >> FWIW, I agree with fixing this bug of 1375422c in the least scary
> > >> manner possible.  Hou-san proposed that we add the ResultRelInfo
> > >> that apply_handle_{insert|update|delete} initialize themselves to
> > >> es_opened_result_relations.  I would prefer that only
> > >> ExecInitResultRelation() add anything to es_opened_result_relations,
> > >> to avoid future maintenance problems.  Instead, a fix as simple as
> > >> Hou-san's proposed fix would be to add an ExecCloseResultRelations()
> > >> call at the end of each of apply_handle_{insert|update|delete}.
> > >
> > > > Yeah, that will work too but might look a bit strange. BTW, how is
> > > > that taken care of in ExecuteTruncateGuts? I mean, we do add rels
> > > > there, as in Hou-San's patch, without calling
> > > > ExecCloseResultRelations; the rels are probably closed when we close
> > > > the relation in worker.c, but what about the memory for the list?
> >
> > TRUNCATE relies on FreeExecutorState() for that, no?
> >
>
> I am not sure about that because it doesn't seem to be allocated in
> es_query_cxt. Note, we switch to oldcontext in the
> CreateExecutorState.
>

I have just checked that the memory for the list is allocated in
ApplyMessageContext. So, it appears a memory leak to me unless I am
missing something.

-- 
With Regards,
Amit Kapila.




Re: Table refer leak in logical replication

2021-04-19 Thread Amit Kapila
On Mon, Apr 19, 2021 at 3:25 PM Amit Kapila  wrote:
>
> On Mon, Apr 19, 2021 at 3:12 PM Amit Kapila  wrote:
> >
> > On Mon, Apr 19, 2021 at 3:02 PM Michael Paquier  wrote:
> > >
> > > On Mon, Apr 19, 2021 at 02:33:10PM +0530, Amit Kapila wrote:
> > > > On Mon, Apr 19, 2021 at 12:32 PM Amit Langote  
> > > > wrote:
> > > >> FWIW, I agree with fixing this bug of 1375422c in the least scary
> > > >> manner possible.  Hou-san proposed that we add the ResultRelInfo
> > > >> that apply_handle_{insert|update|delete} initialize themselves to
> > > >> es_opened_result_relations.  I would prefer that only
> > > >> ExecInitResultRelation() add anything to es_opened_result_relations,
> > > >> to avoid future maintenance problems.  Instead, a fix as simple as
> > > >> Hou-san's proposed fix would be to add an ExecCloseResultRelations()
> > > >> call at the end of each of apply_handle_{insert|update|delete}.
> > > >
> > > > Yeah, that will work too but might look a bit strange. BTW, how is
> > > > that taken care of in ExecuteTruncateGuts? I mean, we do add rels
> > > > there, as in Hou-San's patch, without calling
> > > > ExecCloseResultRelations; the rels are probably closed when we close
> > > > the relation in worker.c, but what about the memory for the list?
> > >
> > > TRUNCATE relies on FreeExecutorState() for that, no?
> > >
> >
> > I am not sure about that because it doesn't seem to be allocated in
> > es_query_cxt. Note, we switch to oldcontext in the
> > CreateExecutorState.
> >
>
> I have just checked that the memory for the list is allocated in
> ApplyMessageContext. So, it appears a memory leak to me unless I am
> missing something.
>

It seems like the memory will be freed after we apply the truncate
because we reset the ApplyMessageContext after applying each message,
so maybe we don't need to bother about it.
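
(The relevant pattern in the apply loop, roughly — a from-memory
sketch, not exact code:)

	/* in LogicalRepApplyLoop(), per message: */
	MemoryContextSwitchTo(ApplyMessageContext);
	apply_dispatch(&s);							/* e.g. apply_handle_truncate() */
	MemoryContextReset(ApplyMessageContext);	/* frees the rels list */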

-- 
With Regards,
Amit Kapila.




Re: Windows default locale vs initdb

2021-04-19 Thread Andrew Dunstan
On Mon, Apr 19, 2021 at 4:53 AM Pavel Stehule 
wrote:

>
>
> po 19. 4. 2021 v 7:43 odesílatel Thomas Munro 
> napsal:
>
>> Hi,
>>
>> Moving this topic into its own thread from the one about collation
>> versions, because it concerns pre-existing problems, and that thread
>> is long.
>>
>> Currently initdb sets up template databases with old-style Windows
>> locale names reported by the OS, and they seem to have caused us quite
>> a few problems over the years:
>>
>> db29620d "Work around Windows locale name with non-ASCII character."
>> aa1d2fc5 "Another attempt at fixing Windows Norwegian locale."
>> db477b69 "Deal with yet another issue related to "Norwegian (Bokmål)"..."
>> 9f12a3b9 "Tolerate version lookup failure for old style Windows locale..."
>>
>> ... and probably more, and also various threads about, for example,
>> "German_German.1252" vs "German_Switzerland.1252" which seem to get
>> confused or badly canonicalised or rejected somewhere in the mix.
>>
>> I hadn't focused on any of that before, being a non-Windows-user, but
>> the entire contents of win32setlocale.c supports the theory that
>> Windows' manual meant what it said when it said[1]:
>>
>> "We do not recommend this form for locale strings embedded in
>> code or serialized to storage, because these strings are more likely
>> to be changed by an operating system update than the locale name
>> form."
>>
>> I suppose that was the only form available at the time the code was
>> written, so there was no choice.  The question we asked ourselves
>> multiple times in the other thread was how we're supposed to get to
>> the modern BCP 47 form when creating the template databases.  It looks
>> like one possibility, since Vista, is to call
>> GetUserDefaultLocaleName()[2], which doesn't appear to have been
>> discussed before on this list.  That doesn't allow you to ask for the
>> default for each individual category, but I don't know if that is even
>> a concept for Windows user settings.  It may be that some of the other
>> nearby functions give a better answer for some reason.  But one thing
>> is clear from a test that someone kindly ran for me: it reports
>> standardised strings like "en-NZ", not strings like "English_New
>> Zealand.1252".
>>
>> No patch, but I wondered if any Windows hackers have any feedback on
>> relative sanity of trying to fix all these problems this way.
>>
>
> Last weekend I talked with one user about one interesting (and messing)
> issue. They needed to create a new database with Czech collation on Azure
> SAS. There was not any entry in pg_collation for Czech language. The reply
> from Microsoft support was to use CREATE DATABASE xxx TEMPLATE 'template0'
> ENCODING 'utf8' LOCALE 'cs_CZ.UTF8' and it was working.
>
>
>
My understanding from Microsoft staff at conferences is that Azure's
PostgreSQL SAS runs on Linux, not Windows.

cheers

andrew


Re: Windows default locale vs initdb

2021-04-19 Thread Pavel Stehule
po 19. 4. 2021 v 12:52 odesílatel Andrew Dunstan 
napsal:

>
>
> On Mon, Apr 19, 2021 at 4:53 AM Pavel Stehule 
> wrote:
>
>>
>>
>> po 19. 4. 2021 v 7:43 odesílatel Thomas Munro 
>> napsal:
>>
>>> Hi,
>>>
>>> Moving this topic into its own thread from the one about collation
>>> versions, because it concerns pre-existing problems, and that thread
>>> is long.
>>>
>>> Currently initdb sets up template databases with old-style Windows
>>> locale names reported by the OS, and they seem to have caused us quite
>>> a few problems over the years:
>>>
>>> db29620d "Work around Windows locale name with non-ASCII character."
>>> aa1d2fc5 "Another attempt at fixing Windows Norwegian locale."
>>> db477b69 "Deal with yet another issue related to "Norwegian (Bokmål)"..."
>>> 9f12a3b9 "Tolerate version lookup failure for old style Windows
>>> locale..."
>>>
>>> ... and probably more, and also various threads about, for example,
>>> "German_German.1252" vs "German_Switzerland.1252" which seem to get
>>> confused or badly canonicalised or rejected somewhere in the mix.
>>>
>>> I hadn't focused on any of that before, being a non-Windows-user, but
>>> the entire contents of win32setlocale.c supports the theory that
>>> Windows' manual meant what it said when it said[1]:
>>>
>>> "We do not recommend this form for locale strings embedded in
>>> code or serialized to storage, because these strings are more likely
>>> to be changed by an operating system update than the locale name
>>> form."
>>>
>>> I suppose that was the only form available at the time the code was
>>> written, so there was no choice.  The question we asked ourselves
>>> multiple times in the other thread was how we're supposed to get to
>>> the modern BCP 47 form when creating the template databases.  It looks
>>> like one possibility, since Vista, is to call
>>> GetUserDefaultLocaleName()[2], which doesn't appear to have been
>>> discussed before on this list.  That doesn't allow you to ask for the
>>> default for each individual category, but I don't know if that is even
>>> a concept for Windows user settings.  It may be that some of the other
>>> nearby functions give a better answer for some reason.  But one thing
>>> is clear from a test that someone kindly ran for me: it reports
>>> standardised strings like "en-NZ", not strings like "English_New
>>> Zealand.1252".
>>>
>>> No patch, but I wondered if any Windows hackers have any feedback on
>>> relative sanity of trying to fix all these problems this way.
>>>
>>
>> Last weekend I talked with one user about one interesting (and messing)
>> issue. They needed to create a new database with Czech collation on Azure
>> SAS. There was not any entry in pg_collation for Czech language. The reply
>> from Microsoft support was to use CREATE DATABASE xxx TEMPLATE 'template0'
>> ENCODING 'utf8' LOCALE 'cs_CZ.UTF8' and it was working.
>>
>>
>>
> My understanding from Microsoft staff at conferences is that Azure's
> PostgreSQL SAS runs on  linux, not WIndows.
>

I had different information, but still there was something wrong,
because no Czech locale was in pg_collation.



>
> cheers
>
> andrew
>


Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb

2021-04-19 Thread Amul Sul
On Mon, Apr 19, 2021 at 2:05 PM Kyotaro Horiguchi
 wrote:
>
> At Mon, 19 Apr 2021 12:56:18 +0530, Amul Sul  wrote in
> > On Mon, Apr 19, 2021 at 12:25 PM Michael Paquier  
> > wrote:
> > >
> > > On Fri, Apr 09, 2021 at 06:45:45PM -0400, Alvaro Herrera wrote:
> > > > We forgot this patch earlier in the commitfest.  Do people think we
> > > > should still get it in on this cycle?  I'm +1 on that, since it's a
> > > > safety feature poised to prevent more bugs than it's likely to
> > > > introduce.
> > >
> > > No objections from here to do that now even after feature freeze.  I
> > > also wonder, while looking at that, why you don't just remove the last
> > > call within src/backend/catalog/heap.c.  This way, nobody is tempted
> > > to use RelationOpenSmgr() anymore, and it could just be removed from
> > > rel.h.
> >
> > Agreed, I did the same in the attached version, thanks.
>
> +   smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BLOOM_METAPAGE_BLKNO,
>   (char *) metapage, true);
> -   log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
> +   log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
>
> At log_newpage, the index is guaranteed to have rd_smgr, so I prefer
> to leave that line alone.  I don't mind other successive calls, if any,
> since what I don't like is the notation there.
>

Perhaps, but it isn't that bad. It is good to follow the practice of
using RelationGetSmgr() for rd_smgr access, IMHO.

> > P.S. commitfest entry https://commitfest.postgresql.org/33/3084/
>
> Isn't this a kind of open item?
>

Sorry, I didn't get you. Do I need to move this to some other bucket?

Regards,
Amul




Re: Replication slot stats misgivings

2021-04-19 Thread vignesh C
On Fri, Apr 16, 2021 at 3:16 PM Amit Kapila  wrote:
>
> On Mon, Apr 12, 2021 at 2:57 PM vignesh C  wrote:
> >
> > On Sat, Mar 20, 2021 at 9:26 AM Amit Kapila  wrote:
> > >
> > > On Sat, Mar 20, 2021 at 12:22 AM Andres Freund  wrote:
> > > >
> > > > And then more generally about the feature:
> > > > - If a slot was used to stream out a large amount of changes (say an
> > > >   initial data load), but then replication is interrupted before the
> > > >   transaction is committed/aborted, stream_bytes will not reflect the
> > > >   many gigabytes of data we may have sent.
> > > >
> > >
> > > We can probably update the stats each time we spilled or streamed the
> > > transaction data but it was not clear at that stage whether or how
> > > much it will be useful.
> > >
> >
> > I felt we can update the replication slot statistics data each time we
> > spill/stream the transaction data instead of accumulating the
> > statistics and updating at the end. I have tried this in the attached
> > patch and the statistics data were getting updated.
> > Thoughts?
> >
>
> Did you check if we can update the stats when we release the slot as
> discussed above? I am not sure if it is easy to do at the time of slot
> release because this information might not be accessible there and in
> some cases, we might have already released the decoding
> context/reorderbuffer where this information is stored. It might be
> okay to update this when we stream or spill but let's see if we can do
> it easily at the time of slot release.
>

I have made the changes to update the replication statistics at
replication slot release. Please find the patch attached for the same.
Thoughts?

Regards,
Vignesh
From 2b92c8f2824e1ef731b3c79fa50b6afb75314f76 Mon Sep 17 00:00:00 2001
From: vignesh 
Date: Mon, 19 Apr 2021 14:54:27 +0530
Subject: [PATCH] Update decoding stats while releasing replication slot.

Currently replication slot statistics are updated at commit/rollback;
if the transaction is interrupted, the stats might not get updated.
Fixed this by updating statistics during replication slot release. As
part of the fix, moved the statistics variables from ReorderBuffer to
the ReplicationSlot structure to make them accessible at
ReplicationSlotRelease.
---
 src/backend/replication/logical/decode.c  |  6 +-
 src/backend/replication/logical/logical.c | 71 +++
 .../replication/logical/reorderbuffer.c   | 39 +-
 src/backend/replication/slot.c|  6 ++
 src/include/replication/logical.h |  1 -
 src/include/replication/reorderbuffer.h   | 23 --
 src/include/replication/slot.h| 24 +++
 7 files changed, 95 insertions(+), 75 deletions(-)

diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 7924581cdc..6fbf22f345 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -750,7 +750,7 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
 	 * not clear that sending more or less frequently than this would be
 	 * better.
 	 */
-	UpdateDecodingStats(ctx);
+	UpdateDecodingStats();
 }
 
 /*
@@ -832,7 +832,7 @@ DecodePrepare(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
 	 * not clear that sending more or less frequently than this would be
 	 * better.
 	 */
-	UpdateDecodingStats(ctx);
+	UpdateDecodingStats();
 }
 
 
@@ -889,7 +889,7 @@ DecodeAbort(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
 	}
 
 	/* update the decoding stats */
-	UpdateDecodingStats(ctx);
+	UpdateDecodingStats();
 }
 
 /*
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index 35b0c67641..22b30ffdef 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -197,6 +197,15 @@ StartupDecodingContext(List *output_plugin_options,
 		LWLockRelease(ProcArrayLock);
 	}
 
+	slot->spillTxns = 0;
+	slot->spillCount = 0;
+	slot->spillBytes = 0;
+	slot->streamTxns = 0;
+	slot->streamCount = 0;
+	slot->streamBytes = 0;
+	slot->totalTxns = 0;
+	slot->totalBytes = 0;
+
 	ctx->slot = slot;
 
 	ctx->reader = XLogReaderAllocate(wal_segment_size, NULL, cleanup_cb);
@@ -1770,44 +1779,46 @@ ResetLogicalStreamingState(void)
  * Report stats for a slot.
  */
 void
-UpdateDecodingStats(LogicalDecodingContext *ctx)
+UpdateDecodingStats()
 {
-	ReorderBuffer *rb = ctx->reorder;
 	PgStat_ReplSlotStats repSlotStat;
+	ReplicationSlot *slot = MyReplicationSlot;
+
+	Assert(MyReplicationSlot != NULL);
 
 	/* Nothing to do if we don't have any replication stats to be sent. */
-	if (rb->spillBytes <= 0 && rb->streamBytes <= 0 && rb->totalBytes <= 0)
+	if (slot->spillBytes <= 0 && slot->streamBytes <= 0 && slot->totalBytes <= 0)
 		return;
 
 	elog(DEBUG2, "UpdateDecodingStats: updating stats %p %lld %lld %lld %lld %lld %lld %lld %lld",
-		 rb,
-		 (long long) rb->spillTxns,
-		 (long long) rb->spillCount,
-		 (long 

Re: Performance degradation of REFRESH MATERIALIZED VIEW

2021-04-19 Thread Bharath Rupireddy
On Mon, Apr 19, 2021 at 1:57 PM Masahiko Sawada  wrote:
>
> On Mon, Apr 19, 2021 at 5:04 PM Kyotaro Horiguchi
>  wrote:
> >
> > At Mon, 19 Apr 2021 13:32:31 +0900, Masahiko Sawada  
> > wrote in
> > > On Fri, Apr 16, 2021 at 12:16 PM Kyotaro Horiguchi
> > >  wrote:
> > > > AFAICS the page is always empty when RelationGetBufferForTuple
> > > > returned a valid vmbuffer.  So the "if" should be an "assert" instead.
> > >
> > > There is a chance that RelationGetBufferForTuple() returns a valid
> > > vmbuffer but the page is not empty, since RelationGetBufferForTuple()
> > > checks without a lock if the page is empty. But when it comes to
> > > HEAP_INSERT_FROZEN cases it actually doesn’t happen at least for now
> > > since only one process inserts tuples into the relation. Will fix.
> >
> > Yes.  It seems to me that it is cleaner that RelationGetBufferForTuple
> > returns vmbuffer only when the caller needs to change vm state.
> > Thanks.
> >
> > > > And, the patch changes the value of all_frozen_set to false when the
> > > > page was already all-frozen (thus not empty). It would be fine since
> > > > we don't need to change the visibility of the page in that case but
> > > > the variable name is no longer correct.  set_all_visible or such?
> > >
> > > It seems to me that the variable name all_frozen_set corresponds to
> > > XLH_INSERT_ALL_FROZEN_SET but I see your point. How about
> > > set_all_frozen instead since we set all-frozen bits (also implying
> > > setting all-visible)?
> >
> > Right. However, "if (set_all_frozen) then "set all_visible" looks like
> > a bug^^;.  all_frozen_set looks better in that context than
> > set_all_frozen. So I withdraw the comment.
> >
> > > BTW I found the following description of XLH_INSERT_ALL_FROZEN_SET but
> > > there is no all_visible_set anywhere:
> > >
> > > /* all_frozen_set always implies all_visible_set */
> > > #define XLH_INSERT_ALL_FROZEN_SET   (1<<5)
> > >
> > > I'll update those comments as well.
> >
> > FWIW, it seems like a shorthand of "ALL_FROZEN_SET implies ALL_VISIBLE
> > to be set together". The current comment is working to me.
> >
>
> Okay, I've updated the patch accordingly. Please review it.

I was reading the patch, just found some typos: it should be "a frozen
tuple" not "an frozen tuple".

+ * Also pin visibility map page if we're inserting an frozen tuple into
+ * If we're inserting an frozen entry into empty page, pin the

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Table refer leak in logical replication

2021-04-19 Thread Amit Langote
On Mon, Apr 19, 2021 at 7:00 PM Amit Kapila  wrote:
> On Mon, Apr 19, 2021 at 3:25 PM Amit Kapila  wrote:
> > On Mon, Apr 19, 2021 at 3:12 PM Amit Kapila  wrote:
> > > On Mon, Apr 19, 2021 at 3:02 PM Michael Paquier  
> > > wrote:
> > > > On Mon, Apr 19, 2021 at 02:33:10PM +0530, Amit Kapila wrote:
> > > > > On Mon, Apr 19, 2021 at 12:32 PM Amit Langote 
> > > > >  wrote:
> > > > >> FWIW, I agree with fixing this bug of 1375422c in as least scary
> > > > >> manner as possible.  Hou-san proposed that we add the ResultRelInfo
> > > > >> that apply_handle_{insert|update|delete} initialize themselves to
> > > > >> es_opened_result_relations.  I would prefer that only
> > > > >> ExecInitResultRelation() add anything to es_opened_result_relations()
> > > > >> to avoid future maintenance problems.  Instead, a fix as simple as 
> > > > >> the
> > > > >> Hou-san's proposed fix would be to add a ExecCloseResultRelations()
> > > > >> call at the end of each of apply_handle_{insert|update|delete}.
> > > > >
> > > > > Yeah, that will work too but might look a bit strange. BTW, how that
> > > > > is taken care of for ExecuteTruncateGuts? I mean we do add rels there
> > > > > like Hou-San's patch without calling ExecCloseResultRelations, the
> > > > > rels are probably closed when we close the relation in worker.c but
> > > > > what about memory for the list?
> > > >
> > > > TRUNCATE relies on FreeExecutorState() for that, no?
> > > >
> > >
> > > I am not sure about that because it doesn't seem to be allocated in
> > > es_query_cxt. Note, we switch to oldcontext in the
> > > CreateExecutorState.
> > >
> >
> > I have just checked that the memory for the list is allocated in
> > ApplyMessageContext. So, it appears a memory leak to me unless I am
> > missing something.
> >
>
> It seems like the memory will be freed after we apply the truncate
> because we reset the ApplyMessageContext after applying each message,
> so maybe we don't need to bother about it.

Yes, ApplyMessageContext seems short-lived enough for this not to
matter.  In the case of ExecuteTruncateGuts(), it's PortalContext, but
we don't seem to bother about leaking into that one too.
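
For context, the apply loop's per-message handling looks roughly like
this (a from-memory sketch of the worker.c pattern, not the exact
code), which is why anything allocated while applying one message is
short-lived:

    MemoryContext oldctx = MemoryContextSwitchTo(ApplyMessageContext);

    apply_dispatch(&s);         /* handle a single replication message */

    MemoryContextSwitchTo(oldctx);
    MemoryContextReset(ApplyMessageContext);    /* frees per-message allocations */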

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements

2021-04-19 Thread Chengxi Sun
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

Hi, Andrei

I tested your patch, and it works well. I also prefer a timestamp instead
of the dealloc number.
I think it can provide more useful details about query statements.

Regards,
Martin Sun

Re: Reduce the number of special cases to build contrib modules on windows

2021-04-19 Thread David Rowley
On Wed, 3 Mar 2021 at 22:37, David Rowley  wrote:
> I've attached a rebased patch.

I've rebased this again.

I also moved away from using hash tables for storing references and
libraries.  I was having some problems getting psql to compile due to
the order of the dependencies being reversed due to the order being at
the mercy of Perl's hash function. There's mention of this in
Makefile.global.in:

# libpq_pgport is for use by client executables (not libraries) that use libpq.
# We force clients to pull symbols from the non-shared libraries libpgport
# and libpgcommon rather than pulling some libpgport symbols from libpq just
# because libpq uses those functions too.  This makes applications less
# dependent on changes in libpq's usage of pgport (on platforms where we
# don't have symbol export control for libpq).  To do this we link to
# pgport before libpq.  This does cause duplicate -lpgport's to appear
# on client link lines, since that also appears in $(LIBS).
# libpq_pgport_shlib is the same idea, but for use in client shared libraries.

I switched these back to arrays but added an additional check to only
add new items to the array if we don't already have an element with
the same value.
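
The idea is plain order-preserving de-duplication; expressed in C it
would look something like this (illustrative only -- the build script
itself is Perl):

    /*
     * Append item to list[0 .. *n - 1] only if it is not already present,
     * keeping first-insertion order (which is what keeps libpgport ahead
     * of libpq on the link line).  Requires <string.h> for strcmp().
     */
    static void
    append_unique(const char *list[], int *n, const char *item)
    {
        for (int i = 0; i < *n; i++)
            if (strcmp(list[i], item) == 0)
                return;         /* already present: keep the earlier position */
        list[(*n)++] = item;
    }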

I've attached the diffs in the *.vcxproj files between patched and unpatched.

David
diff -u "L:\\proj_std/autoinc.vcxproj" "L:\\proj_mod/autoinc.vcxproj"
--- "L:\\proj_std/autoinc.vcxproj"  2021-04-19 23:05:12.549209700 +1200
+++ "L:\\proj_mod/autoinc.vcxproj"  2021-04-19 23:02:23.365767000 +1200
@@ -52,7 +52,7 @@
 
   Disabled
   
src/include;src/include/port/win32;src/include/port/win32_msvc;%(AdditionalIncludeDirectories)
-  
WIN32;_WINDOWS;__WINDOWS__;__WIN32__;WIN32_STACK_RLIMIT=4194304;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_DEBUG;DEBUG=1%(PreprocessorDefinitions)
+  
WIN32;_WINDOWS;__WINDOWS__;__WIN32__;WIN32_STACK_RLIMIT=4194304;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;REFINT_VERBOSE;_DEBUG;DEBUG=1%(PreprocessorDefinitions)
   false
   MultiThreadedDebugDLL
   
4018;4244;4273;4102;4090;4267;%(DisableSpecificWarnings)
@@ -100,7 +100,7 @@
 
   Full
   
src/include;src/include/port/win32;src/include/port/win32_msvc;%(AdditionalIncludeDirectories)
-  
WIN32;_WINDOWS;__WINDOWS__;__WIN32__;WIN32_STACK_RLIMIT=4194304;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;%(PreprocessorDefinitions)
+  
WIN32;_WINDOWS;__WINDOWS__;__WIN32__;WIN32_STACK_RLIMIT=4194304;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;REFINT_VERBOSE;%(PreprocessorDefinitions)
   true
   MultiThreadedDLL
   
4018;4244;4273;4102;4090;4267;%(DisableSpecificWarnings)
diff -u "L:\\proj_std/dblink.vcxproj" "L:\\proj_mod/dblink.vcxproj"
--- "L:\\proj_std/dblink.vcxproj"   2021-04-19 23:05:12.757441500 +1200
+++ "L:\\proj_mod/dblink.vcxproj"   2021-04-19 23:02:23.571066000 +1200
@@ -51,7 +51,7 @@
   
 
   Disabled
-  
src/include;src/include/port/win32;src/include/port/win32_msvc;src\interfaces\libpq;src/backend;%(AdditionalIncludeDirectories)
+  
src/include;src/include/port/win32;src/include/port/win32_msvc;src/backend;src\interfaces\libpq;%(AdditionalIncludeDirectories)
   
WIN32;_WINDOWS;__WINDOWS__;__WIN32__;WIN32_STACK_RLIMIT=4194304;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_DEBUG;DEBUG=1%(PreprocessorDefinitions)
   false
   MultiThreadedDebugDLL
@@ -99,7 +99,7 @@
   
 
   Full
-  
src/include;src/include/port/win32;src/include/port/win32_msvc;src\interfaces\libpq;src/backend;%(AdditionalIncludeDirectories)
+  
src/include;src/include/port/win32;src/include/port/win32_msvc;src/backend;src\interfaces\libpq;%(AdditionalIncludeDirectories)
   
WIN32;_WINDOWS;__WINDOWS__;__WIN32__;WIN32_STACK_RLIMIT=4194304;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;%(PreprocessorDefinitions)
   true
   MultiThreadedDLL
diff -u "L:\\proj_std/hstore_plpython3.vcxproj" 
"L:\\proj_mod/hstore_plpython3.vcxproj"
--- "L:\\proj_std/hstore_plpython3.vcxproj" 2021-04-19 23:05:12.944178400 
+1200
+++ "L:\\proj_mod/hstore_plpython3.vcxproj" 2021-04-19 23:02:23.750051600 
+1200
@@ -51,7 +51,7 @@
   
 
   Disabled
-  
src/include;src/include/port/win32;src/include/port/win32_msvc;src/pl/plpython;C:\python-3.9.1-embed/include;contrib;%(AdditionalIncludeDirectories)
+  
src/include;src/include/port/win32;src/include/port/win32_msvc;C:\python-3.9.1-embed/include;contrib;src/pl/plpython;%(AdditionalIncludeDirectories)
   
WIN32;_WINDOWS;__WINDOWS__;__WIN32__;WIN32_STACK_RLIMIT=4194304;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;PLPYTHON_LIBNAME="plpython3";_DEBUG;DEBUG=1%(PreprocessorDefinitions)
   false
   MultiThreadedDebugDLL
@@ -70,7 +70,7 @@
 
 
   .\Debug\hstore_plpython3\hstore_plpython3.dll
-  
Debug/postgres/postgres.lib;Debug/plpython3/plpython3.lib;C:\python-3.9.1-embed/Libs/python39.lib;Debug/postgres/postgres.lib;Debug/postgres/pos

Re: Table refer leak in logical replication

2021-04-19 Thread Michael Paquier
On Mon, Apr 19, 2021 at 08:09:05PM +0900, Amit Langote wrote:
> On Mon, Apr 19, 2021 at 7:00 PM Amit Kapila  wrote:
>> It seems like the memory will be freed after we apply the truncate
>> because we reset the ApplyMessageContext after applying each message,
>> so maybe we don't need to bother about it.
> 
> Yes, ApplyMessageContext seems short-lived enough for this not to
> matter.  In the case of ExecuteTruncateGuts(), it's PortalContext, but
> we don't seem to bother about leaking into that one too.

Sorry for the dumb question because I have not studied this part of
the code in any extensive way, but how many changes at maximum can be
applied within a single ApplyMessageContext?  I am wondering if we
could run into problems depending on the number of relations touched
within a single message, or if there are any patches that could run
into problems because of this limitation, meaning that we may want to
add a proper set of comments within this area to document the
limitations that apply to a DML operation.
--
Michael


signature.asc
Description: PGP signature


Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb

2021-04-19 Thread Justin Pryzby
On Mon, Apr 19, 2021 at 04:27:25PM +0530, Amul Sul wrote:
> On Mon, Apr 19, 2021 at 2:05 PM Kyotaro Horiguchi  
> wrote:
> >
> > At Mon, 19 Apr 2021 12:56:18 +0530, Amul Sul  wrote in
> > > On Mon, Apr 19, 2021 at 12:25 PM Michael Paquier  
> > > wrote:
> > > >
> > > > On Fri, Apr 09, 2021 at 06:45:45PM -0400, Alvaro Herrera wrote:
> > > > > We forgot this patch earlier in the commitfest.  Do people think we
> > > > > should still get it in on this cycle?  I'm +1 on that, since it's a
> > > > > safety feature poised to prevent more bugs than it's likely to
> > > > > introduce.
> > > >
> > > > No objections from here to do that now even after feature freeze.  I
> > > > also wonder, while looking at that, why you don't just remove the last
> > > > call within src/backend/catalog/heap.c.  This way, nobody is tempted
> > > > to use RelationOpenSmgr() anymore, and it could just be removed from
> > > > rel.h.
> > >
> > > Agree, did the same in the attached version, thanks.
> >
> > +   smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, 
> > BLOOM_METAPAGE_BLKNO,
> >   (char *) metapage, true);
> > -   log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
> > +   log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, 
> > INIT_FORKNUM,
> >
> > At the log_newpage, index is guaranteed to have rd_smgr. So I prefer
> > to leave the line alone.  I don't mind other successive calls, if any,
> > since what I don't like is the notation there.
> >
> 
> Perhaps, but it isn't that bad. It is good to follow the practice of
> using RelationGetSmgr() for rd_smgr access, IMHO.
> 
> > > P.S. commitfest entry https://commitfest.postgresql.org/33/3084/
> >
> > Isn't this a kind of open item?
> >
> 
> Sorry, I didn't get you. Do I need to move this to some other bucket?

It's not a new feature, and shouldn't wait for July's CF since it's targeting
v14.

The original crash was fixed by Tom by commit 9d523119f.

So it's not exactly an "open item" for v14, but there's probably no better
place for it, so you could add it if you think it's at risk of being forgotten
(again).

https://wiki.postgresql.org/wiki/PostgreSQL_14_Open_Items

-- 
Justin




Re: Table refer leak in logical replication

2021-04-19 Thread Amit Kapila
On Mon, Apr 19, 2021 at 5:20 PM Michael Paquier  wrote:
>
> On Mon, Apr 19, 2021 at 08:09:05PM +0900, Amit Langote wrote:
> > On Mon, Apr 19, 2021 at 7:00 PM Amit Kapila  wrote:
> >> It seems like the memory will be freed after we apply the truncate
> >> because we reset the ApplyMessageContext after applying each message,
> >> so maybe we don't need to bother about it.
> >
> > Yes, ApplyMessageContext seems short-lived enough for this not to
> > matter.  In the case of ExecuteTruncateGuts(), it's PortalContext, but
> > we don't seem to bother about leaking into that one too.
>
> Sorry for the dumb question because I have not studied this part of
> the code in any extensive way, but how many changes at maximum can be
> applied within a single ApplyMessageContext?
>

It is one change for Insert/Update/Delete. See apply_dispatch() for
different change messages.

-- 
With Regards,
Amit Kapila.




Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb

2021-04-19 Thread Michael Paquier
On Mon, Apr 19, 2021 at 05:35:52PM +0900, Kyotaro Horiguchi wrote:
> Isn't this a kind of open item?

This does not qualify as an open item because it is not an actual bug
IMO, neither is it a defect of the existing code, so it seems
appropriate to me to not list it.
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements

2021-04-19 Thread Andrei Zubkov
Hi, Martin

On Mon, 2021-04-19 at 11:39 +, Chengxi Sun wrote:
> I tested your patch, and it works well. I also prefer a timestamp
> instead of the dealloc number.
> I think it can provide more useful details about query statements.
> 
Thank you for your review.
Certainly, the timestamp is valuable here. The deallocation number is
only a workaround for the unlikely case when timestamping would cost
too much. It seems that this can happen only when a significant number
of statements cause new entries in the pg_stat_statements hashtable.
However, in such a case using the pg_stat_statements extension might be
quite difficult anyway.
-- 
Andrei Zubkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company






Re: multi-install PostgresNode fails with older postgres versions

2021-04-19 Thread Andrew Dunstan


On 4/17/21 12:35 PM, Andrew Dunstan wrote:
>
>> OK, here is more WIP on this item. I have drawn substantially on Mark's
>> and Jehan-Guillaime's work, but it doesn't really resemble either, and I
>> take full responsibility for it.
>>
>> The guiding principles have been:
>>
>> . avoid doing version tests, or capability tests which are the moral
>> equivalent, and rely instead on pure overloading.
>>
>> . avoid overriding large pieces of code.
>>
>>
>> The last has involved breaking up some large subroutines (like init)
>> into pieces which can more sensibly be overridden. Breaking them up
>> isn't a bad thing to do anyway.
>>
>> There is a new PostgresVersion object, but it's deliberately very
>> minimal. Since we're not doing version tests we don't need more complex
>> routines.
>>
>>
>
> I should have also attached my test program - here it is. Also, I have
> been testing with the binaries which I've published here:
>  along with some saved by my
> buildfarm animal.
>
>

I've been thinking about this some more over the weekend. I'm not really
happy with any of the three approaches to this problem:


a) Use version or capability tests in the main package

b) No changes to main package, use overrides

c) Changes to main package to allow smaller overrides


The problem is that a) and c) involve substantial changes to the main
PostgresNode package, while b) involves overriding large functions (like
init) sometimes for quite trivial changes.

I think therefore I'm inclined for now to do nothing for old version
compatibility. I would commit the fix for the IPC::Run caching glitch,
and version detection. I would add a warning if the module is used with
a version <= 11.

The original goal of these changes was to allow testing of combinations
of different builds with openssl and nss, which doesn't involve old
version compatibility.

As far as I know, without any compatibility changes the module is fully
compatible with releases 13 and 12, and with releases 11 and 10 so long
as you don't want a standby, and with releases 9.6 and 9.5 if you also
don't want a backup. That makes it suitable for a lot of testing without
any attempt at version compatibility.

We can revisit compatibility further in the next release.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Forget close an open relation in ReorderBufferProcessTXN()

2021-04-19 Thread Amit Kapila
On Sat, Apr 17, 2021 at 12:01 PM Amit Kapila  wrote:
>
> On Fri, Apr 16, 2021 at 11:24 PM Andres Freund  wrote:
> >
> >
> > > I think it is also important to *not* acquire any lock on relation
> > > otherwise it can lead to some sort of deadlock or infinite wait in the
> > > decoding process. Consider a case for operations like Truncate (or if
> > > the user has acquired an exclusive lock on the relation in some other
> > > way say via Lock command) which acquires an exclusive lock on
> > > relation, it won't get replicated in synchronous mode (when
> > > synchronous_standby_name is configured). The truncate operation will
> > > wait for the transaction to be replicated to the subscriber and the
> > > decoding process will wait for the Truncate operation to finish.
> >
> > However, this cannot be really relied upon for catalog tables. An output
> > function might acquire locks or such. But for those we do not need to
> > decode contents...
> >
>
> I see that if we define a user_catalog_table (create table t1_cat(c1
> int) WITH(user_catalog_table = true);), we are able to decode
> operations like (insert, truncate) on such a table. What do you mean
> by "But for those we do not need to decode contents"?
>

I think we are allowed to decode the operations on user catalog tables
because we are using RelationIsLogicallyLogged() instead of
RelationIsAccessibleInLogicalDecoding() in ReorderBufferProcessTXN().
Based on this discussion, I think we should not be allowing decoding
of operations on user catalog tables, so we should use
RelationIsAccessibleInLogicalDecoding to skip such ops in
ReorderBufferProcessTXN(). Am I missing something?

Can you please clarify?
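
For reference, the two checks differ roughly as follows (paraphrased
from memory of rel.h; treat this as a sketch rather than the exact
source):

    /* decode changes to this relation? (plain user tables only) */
    #define RelationIsLogicallyLogged(relation) \
        (XLogLogicalInfoActive() && \
         RelationNeedsWAL(relation) && \
         !IsCatalogRelation(relation))

    /* readable during decoding? (system catalogs and user catalogs) */
    #define RelationIsAccessibleInLogicalDecoding(relation) \
        (XLogLogicalInfoActive() && \
         RelationNeedsWAL(relation) && \
         (IsCatalogRelation(relation) || RelationIsUsedAsCatalogTable(relation)))

A user_catalog_table passes the first test (it is not a system
catalog), which is why its changes currently get decoded.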

-- 
With Regards,
Amit Kapila.




Re: multi-install PostgresNode fails with older postgres versions

2021-04-19 Thread Michael Paquier
On Mon, Apr 19, 2021 at 08:11:01AM -0400, Andrew Dunstan wrote:
> As far as I know, without any compatibility changes the module is fully
> compatible with releases 13 and 12, and with releases 11 and 10 so long
> as you don't want a standby, and with releases 9.6 and 9.5 if you also
> don't want a backup. That makes it suitable for a lot of testing without
> any attempt at version compatibility.
> 
> We can revisit compatibility further in the next release.

Agreed, and I am not convinced that there is any strong need for any
of that in the close future either, as long as there are no
ground-breaking compatibility changes.

How far does the buildfarm test pg_upgrade?  One thing that I
personally care about here is the possibility to make pg_upgrade's
test.sh become a TAP test.  However, I am also pretty sure that we
could apply some local changes to the TAP test of pg_upgrade itself to
not require any wide changes to PostgresNode.pm either to make the
central logic as simple as possible with all the stable branches still
supported or even older ones.  Having compatibility for free down to
12 is nice enough IMO for most of the core logic, and pg_upgrade would
also work just fine down to 9.5 without any extra changes because we
don't care there about standbys or backups.
--
Michael


signature.asc
Description: PGP signature


Do we need to update copyright for PG11 branch

2021-04-19 Thread bchen90
hi, all

  Recently, I found that the copyright info for the PG11 branch is still
"Portions Copyright (c) *1996-2018*, PostgreSQL Global Development Group".
Do we need to update it?

  regards, ChenBo



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Table refer leak in logical replication

2021-04-19 Thread Amit Langote
On Mon, Apr 19, 2021 at 6:32 PM Michael Paquier  wrote:
> On Mon, Apr 19, 2021 at 02:33:10PM +0530, Amit Kapila wrote:
> > On Mon, Apr 19, 2021 at 12:32 PM Amit Langote  
> > wrote:
> >> FWIW, I agree with fixing this bug of 1375422c in as least scary
> >> manner as possible.  Hou-san proposed that we add the ResultRelInfo
> >> that apply_handle_{insert|update|delete} initialize themselves to
> >> es_opened_result_relations.  I would prefer that only
> >> ExecInitResultRelation() add anything to es_opened_result_relations()
> >> to avoid future maintenance problems.  Instead, a fix as simple as the
> >> Hou-san's proposed fix would be to add a ExecCloseResultRelations()
> >> call at the end of each of apply_handle_{insert|update|delete}.
> >
> > Yeah, that will work too but might look a bit strange. BTW, how that
> > is taken care of for ExecuteTruncateGuts? I mean we do add rels there
> > like Hou-San's patch without calling ExecCloseResultRelations, the
> > rels are probably closed when we close the relation in worker.c but
> > what about memory for the list?
>
> ... FWIW, I'd rather
> agree to use what has been proposed with es_opened_result_relations
> like TRUNCATE does rather than attempt to use ExecInitResultRelation()
> combined with potentially asymmetric calls to
> ExecCloseResultRelations().

Okay, how about the attached then?  I decided to go with just
finish_estate(), because we no longer have to do anything relation
specific there.
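
A minimal sketch of what such a finish_estate() could do, assuming the
relations were registered in es_opened_result_relations (an
illustration, not necessarily the attached patch's exact code):

    static void
    finish_estate(EState *estate)
    {
        /* closes everything registered in es_opened_result_relations */
        ExecCloseResultRelations(estate);
        ExecResetTupleTable(estate->es_tupleTable, false);
        FreeExecutorState(estate);
    }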

-- 
Amit Langote
EDB: http://www.enterprisedb.com


fix_relcache_leak_in_lrworker_v6.patch
Description: Binary data


Commit 86dc90056 - Rework planning and execution of UPDATE and DELETE

2021-04-19 Thread Rushabh Lathia
Hi.

With the commit mentioned in the $subject, I am seeing the
change in behaviour with the varlena header size.  Please
consider the below test:

postgres@83795=#CREATE TABLE test_storage_char(d char(20));
CREATE TABLE
postgres@83795=#INSERT INTO test_storage_char SELECT REPEAT('e', 20);
INSERT 0 1
postgres@83795=#SELECT d, pg_column_size(d) FROM test_storage_char;
  d   | pg_column_size
--+
  | 21
(1 row)

postgres@83795=#ALTER TABLE test_storage_char ALTER COLUMN  d SET STORAGE
PLAIN;
ALTER TABLE
postgres@83795=#SELECT d, pg_column_size(d) FROM test_storage_char;
  d   | pg_column_size
--+
  | 21
(1 row)

postgres@83795=#UPDATE test_storage_char SET d='ab' WHERE d LIKE '%e%';
UPDATE 1
postgres@83795=#SELECT d, pg_column_size(d) FROM test_storage_char;
  d   | pg_column_size
--+
 ab   | 24
(1 row)

After changing the STORAGE for the column and UPDATE, pg_column_size
now returns the size as 24.

*BEFORE Commit 86dc90056:*

postgres@129158=#SELECT d, pg_column_size(d) FROM test_storage_char;
  d   | pg_column_size
--+
 ab   | 21
(1 row)

I am not sure whether this change is expected? Or missing something
in the toasting the attribute?


Thanks,
Rushabh Lathia
www.EnterpriseDB.com


Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb

2021-04-19 Thread Tom Lane
Michael Paquier  writes:
> On Mon, Apr 19, 2021 at 05:35:52PM +0900, Kyotaro Horiguchi wrote:
>> Isn't this a kind of open item?

> This does not qualify as an open item because it is not an actual bug
> IMO, neither is it a defect of the existing code, so it seems
> appropriate to me to not list it.

Agreed, but by the same token, rushing it into v14 doesn't have any
clear benefit.  I'd be inclined to leave it for v15 at this point,
especially since we don't seem to have 100% consensus on the details.

regards, tom lane




Re: Do we need to update copyright for PG11 branch

2021-04-19 Thread Tom Lane
bchen90  writes:
>   Recently, I found the copyright info for PG11 branch still is "Portions
> Copyright (c) *1996-2018*, PostgreSQL Global Development Group". Do we need
> to update it?

No, that's not our practice.

regards, tom lane




Re: Performance degradation of REFRESH MATERIALIZED VIEW

2021-04-19 Thread Masahiko Sawada
On Mon, Apr 19, 2021 at 8:04 PM Bharath Rupireddy
 wrote:
>
> On Mon, Apr 19, 2021 at 1:57 PM Masahiko Sawada  wrote:
> >
> > On Mon, Apr 19, 2021 at 5:04 PM Kyotaro Horiguchi
> >  wrote:
> > >
> > > At Mon, 19 Apr 2021 13:32:31 +0900, Masahiko Sawada 
> > >  wrote in
> > > > On Fri, Apr 16, 2021 at 12:16 PM Kyotaro Horiguchi
> > > >  wrote:
> > > > > AFAICS the page is always empty when RelationGetBufferForTuple
> > > > > returned a valid vmbuffer.  So the "if" should be an "assert" instead.
> > > >
> > > > There is a chance that RelationGetBufferForTuple() returns a valid
> > > > vmbuffer but the page is not empty, since RelationGetBufferForTuple()
> > > > checks without a lock if the page is empty. But when it comes to
> > > > HEAP_INSERT_FROZEN cases it actually doesn’t happen at least for now
> > > > since only one process inserts tuples into the relation. Will fix.
> > >
> > > Yes.  It seems to me that it is cleaner that RelationGetBufferForTuple
> > > returns vmbuffer only when the caller needs to change vm state.
> > > Thanks.
> > >
> > > > > And, the patch changes the value of all_frozen_set to false when the
> > > > > page was already all-frozen (thus not empty). It would be fine since
> > > > > we don't need to change the visibility of the page in that case but
> > > > > the variable name is no longer correct.  set_all_visible or such?
> > > >
> > > > It seems to me that the variable name all_frozen_set corresponds to
> > > > XLH_INSERT_ALL_FROZEN_SET but I see your point. How about
> > > > set_all_frozen instead since we set all-frozen bits (also implying
> > > > setting all-visible)?
> > >
> > > Right. However, "if (set_all_frozen) then "set all_visible" looks like
> > > a bug^^;.  all_frozen_set looks better in that context than
> > > set_all_frozen. So I withdraw the comment.
> > >
> > > > BTW I found the following description of XLH_INSERT_ALL_FROZEN_SET but
> > > > there is no all_visible_set anywhere:
> > > >
> > > > /* all_frozen_set always implies all_visible_set */
> > > > #define XLH_INSERT_ALL_FROZEN_SET   (1<<5)
> > > >
> > > > I'll update those comments as well.
> > >
> > > FWIW, it seems like a shorthand of "ALL_FROZEN_SET implies ALL_VISIBLE
> > > to be set together". The current comment is working to me.
> > >
> >
> > Okay, I've updated the patch accordingly. Please review it.
>
> I was reading the patch, just found some typos: it should be "a frozen
> tuple" not "an frozen tuple".
>
> + * Also pin visibility map page if we're inserting an frozen tuple into
> + * If we're inserting an frozen entry into empty page, pin 
> the

Thank you for the comment.

I’ve updated the patch including the above comment.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 13396eb7f2..65ec6118b9 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2065,7 +2065,6 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	Buffer		buffer;
 	Page		page = NULL;
 	Buffer		vmbuffer = InvalidBuffer;
-	bool		starting_with_empty_page;
 	bool		all_visible_cleared = false;
 	bool		all_frozen_set = false;
 	uint8		vmstatus = 0;
@@ -2082,8 +2081,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	 * Find buffer to insert this tuple into.  If the page is all visible,
 	 * this will also pin the requisite visibility map page.
 	 *
-	 * Also pin visibility map page if COPY FREEZE inserts tuples into an
-	 * empty page. See all_frozen_set below.
+	 * Also pin visibility map page if we're inserting a frozen tuple into
+	 * an empty page. See all_frozen_set below.
 	 */
 	buffer = RelationGetBufferForTuple(relation, heaptup->t_len,
 	   InvalidBuffer, options, bistate,
@@ -2093,22 +2092,23 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	/*
 	 * If we're inserting frozen entry into an empty page,
 	 * set visibility map bits and PageAllVisible() hint.
-	 *
-	 * If we're inserting frozen entry into already all_frozen page,
-	 * preserve this state.
 	 */
-	if (options & HEAP_INSERT_FROZEN)
+	if (options & HEAP_INSERT_FROZEN && BufferIsValid(vmbuffer))
 	{
-		page = BufferGetPage(buffer);
+		Assert(visibilitymap_pin_ok(BufferGetBlockNumber(buffer), vmbuffer));
 
-		starting_with_empty_page = PageGetMaxOffsetNumber(page) == 0;
+		page = BufferGetPage(buffer);
+		vmstatus = visibilitymap_get_status(relation,
+			BufferGetBlockNumber(buffer), &vmbuffer);
 
-		if (visibilitymap_pin_ok(BufferGetBlockNumber(buffer), vmbuffer))
-			vmstatus = visibilitymap_get_status(relation,
- BufferGetBlockNumber(buffer), &vmbuffer);
+		/* The page must be empty */
+		Assert(PageGetMaxOffsetNumber(page) == 0);
 
-		if ((starting_with_empty_page || vmstatus & VISIBILITYMAP_ALL_FROZEN))
-			all_frozen_set = true;
+		/*
+		 * If we're inserting frozen entry into empty page, we will set
+		 * all-visible to page 

Re: multi-install PostgresNode fails with older postgres versions

2021-04-19 Thread Andrew Dunstan


On 4/19/21 8:32 AM, Michael Paquier wrote:
> On Mon, Apr 19, 2021 at 08:11:01AM -0400, Andrew Dunstan wrote:
>> As far as I know, without any compatibility changes the module is fully
>> compatible with releases 13 and 12, and with releases 11 and 10 so long
>> as you don't want a standby, and with releases 9.6 and 9.5 if you also
>> don't want a backup. That makes it suitable for a lot of testing without
>> any attempt at version compatibility.
>>
>> We can revisit compatibility further in the next release.
> Agreed, and I am not convinced that there is any strong need for any
> of that in the close future either, as long as there are no
> ground-breaking compatibility changes.
>
> How far does the buildfarm test pg_upgrade?  One thing that I
> personally care about here is the possibility to make pg_upgrade's
> test.sh become a TAP test.  However, I am also pretty sure that we
> could apply some local changes to the TAP test of pg_upgrade itself to
> not require any wide changes to PostgresNode.pm either to make the
> central logic as simple as possible with all the stable branches still
> supported or even older ones.  Having compatibility for free down to
> 12 is nice enough IMO for most of the core logic, and pg_upgrade would
> also work just fine down to 9.5 without any extra changes because we
> don't care there about standbys or backups.


The buildfarm tests self-targeted pg_upgrade by calling the built-in
tests (make check / vcregress.pl upgradecheck).


However, for cross version testing the regime is quite different. The
cross version module doesn't ever construct a repo. Rather, it tries to
upgrade a repo saved from a prior run. So all it does is some
adjustments for things that have changed between releases and then calls
pg_upgrade. See



Note that we currently test upgrades down to 9.2 on crake. However, now
that I have some working binaries for really old releases, I might extend
that all the way back to 8.4 at some stage. pg_upgrade and pg_dump/pg_restore
testing are the major use cases I can see for backwards compatibility -
pg_dump is still supposed to be able to go back into the dim dark ages,
which is why I built the old binaries all the way back to 7.2.





It's just occurred to me that a much nicer way of doing this
PostgresNode stuff would be to have a function that instead of appending
to the config file would adjust it. Then we wouldn't need all those
little settings functions to be overridden - the subclasses could just
post-process the config files. I'm going to try that and see what I can
come up with. I think it will look heaps nicer.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com






Re: Windows default locale vs initdb

2021-04-19 Thread Dave Page
On Mon, Apr 19, 2021 at 11:52 AM Andrew Dunstan  wrote:

>
> My understanding from Microsoft staff at conferences is that Azure's
> PostgreSQL SaaS runs on Linux, not Windows.
>

This is from a regular Azure Database for PostgreSQL single server:

postgres=> select version();
  version

 PostgreSQL 11.6, compiled by Visual C++ build 1800, 64-bit
(1 row)

And this is from the new Flexible Server preview:

postgres=> select version();
 version

-
 PostgreSQL 12.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit
(1 row)

So I guess it's a case of "it depends".

-- 
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com


Re: Commit 86dc90056 - Rework planning and execution of UPDATE and DELETE

2021-04-19 Thread Tom Lane
Rushabh Lathia  writes:
> With the commit mentioned in the $subject, I am seeing the
> change in behaviour with the varlena header size.

Interesting.  AFAICS, the new behavior is correct and the old is wrong.
SET STORAGE PLAIN is supposed to disable use of TOAST features, including
short varlena headers.  So now that's being honored by the UPDATE, but
before it was not.  I have no idea exactly why that changed though ---
I'd expect that to be implemented in low-level tuple-construction logic
that the planner rewrite wouldn't have changed.
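
For what it's worth, the 21-versus-24 difference is exactly the varlena
header size: a 1-byte short header plus 20 bytes of data, versus a full
4-byte header plus the same 20 bytes.  A trivial standalone check of
the arithmetic (plain C, not using PostgreSQL's varatt macros):

    #include <stdio.h>

    int
    main(void)
    {
        int datalen = 20;   /* char(20) payload, single-byte characters */
        int short_hdr = 1;  /* short varlena header, payloads up to 126 bytes */
        int full_hdr = 4;   /* full 4-byte varlena header */

        printf("short-header size: %d\n", datalen + short_hdr);  /* 21 */
        printf("full-header size:  %d\n", datalen + full_hdr);   /* 24 */
        return 0;
    }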

regards, tom lane




Re: Commit 86dc90056 - Rework planning and execution of UPDATE and DELETE

2021-04-19 Thread Amit Langote
On Mon, Apr 19, 2021 at 10:00 PM Rushabh Lathia
 wrote:
>
> Hi.
>
> With the commit mentioned in the $subject, I am seeing the
> change in behaviour with the varlena header size.  Please
> consider the below test:
>
> postgres@83795=#CREATE TABLE test_storage_char(d char(20));
> CREATE TABLE
> postgres@83795=#INSERT INTO test_storage_char SELECT REPEAT('e', 20);
> INSERT 0 1
> postgres@83795=#SELECT d, pg_column_size(d) FROM test_storage_char;
>   d   | pg_column_size
> --+
>   | 21
> (1 row)
>
> postgres@83795=#ALTER TABLE test_storage_char ALTER COLUMN  d SET STORAGE 
> PLAIN;
> ALTER TABLE
> postgres@83795=#SELECT d, pg_column_size(d) FROM test_storage_char;
>   d   | pg_column_size
> --+
>   | 21
> (1 row)
>
> postgres@83795=#UPDATE test_storage_char SET d='ab' WHERE d LIKE '%e%';
> UPDATE 1
> postgres@83795=#SELECT d, pg_column_size(d) FROM test_storage_char;
>   d   | pg_column_size
> --+
>  ab   | 24
> (1 row)
>
> After changing the STORAGE for the column and UPDATE, pg_column_size
> now returns the size as 24.
>
> BEFORE Commit 86dc90056:
>
> postgres@129158=#SELECT d, pg_column_size(d) FROM test_storage_char;
>   d   | pg_column_size
> --+
>  ab   | 21
> (1 row)
>
> I am not sure whether this change is expected? Or missing something
> in the toasting the attribute?

I haven't studied this closely enough yet to say if the new behavior
is correct or not, but can say why this has changed.

Before 86dc90056, the new tuple to pass to ExecUpdate would be
computed with a TupleDesc that uses pg_type.typstorage for the column
instead of the column's actual pg_attribute.attstorage.  That's
because the new tuple would be computed from the subplan's targetlist
and the TupleDesc for that targetlist is computed with no regard to
where the computed tuple will go; IOW ignoring the target table's
actual TupleDesc.

After 86dc90056, the new tuple is computed with the target table's
actual TupleDesc, so the new value respects the column's attstorage,
which makes me think the new behavior is not wrong.

Will look more closely tomorrow.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: multi-install PostgresNode fails with older postgres versions

2021-04-19 Thread Mark Dilger



> On Apr 19, 2021, at 5:11 AM, Andrew Dunstan  wrote:
> 
> I think therefore I'm inclined for now to do nothing for old version
> compatibility.

I agree with waiting until the v15 development cycle.

> I would commit the fix for the IPC::Run caching glitch,
> and version detection

Thank you.

> I would add a warning if the module is used with
> a version <= 11.

Sounds fine for now.

> The original goal of these changes was to allow testing of combinations
> of different builds with openssl and nss, which doesn't involve old
> version compatibility.

Hmm.  I think different folks had different goals.  My personal interest is to 
write automated tests which spin up older servers, create data that cannot be 
created on newer servers (such as heap tuples with HEAP_MOVED_IN or 
HEAP_MOVED_OFF bits set), upgrade, and test that new code handles the old data 
correctly.  I think this is not only useful for our test suites as a community, 
but is also useful for companies providing support services who need to 
reproduce problems that customers are having on clusters that have been 
pg_upgraded across large numbers of postgres versions.

> As far as I know, without any compatibility changes the module is fully
> compatible with releases 13 and 12, and with releases 11 and 10 so long
> as you don't want a standby, and with releases 9.6 and 9.5 if you also
> don't want a backup. That makes it suitable for a lot of testing without
> any attempt at version compatibility.
> 
> We can revisit compatibility further in the next release.

Sounds good.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: track_planning causing performance regression

2021-04-19 Thread Fujii Masao




On 2021/04/19 8:36, Justin Pryzby wrote:

> Reviewing this change which was committed last year as
> 321fa6a4a26c9b649a0fbec9fc8b019f19e62289
> 
> On Fri, Jul 03, 2020 at 03:57:38PM +0900, Fujii Masao wrote:
> > On 2020/07/03 13:05, Pavel Stehule wrote:
> > > On Fri, Jul 3, 2020 at 4:39, Fujii Masao wrote:
> > > 
> > > Maybe it can be documented that enabling this option can have a
> > > negative impact on performance.
> > 
> > Yes. What about adding either of the following into the doc?
> > 
> >  Enabling this parameter may incur a noticeable performance penalty.
> > 
> > or
> > 
> >  Enabling this parameter may incur a noticeable performance penalty,
> >  especially when a fewer kinds of queries are executed on many
> >  concurrent connections.
> 
> Something seems wrong with this sentence, and I'm not sure what it's
> trying to say.  Is this right?


pg_stat_statements uses a different spinlock for each kind of query.
So the fewer kinds of queries that many sessions execute, the fewer
spinlocks they try to acquire, and the more they compete for the same
one. This may lead to spinlock contention and significant performance
degradation. This is what the statement is trying to say.
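
The per-entry update pattern is roughly the following (a sketch in the
style of pg_stat_statements, not its exact code; entry, kind, and
total_time are assumed to be in scope); every session running the same
normalized query serializes on that one entry's mutex:

    /* sketch: bump the counters of one pg_stat_statements entry */
    volatile pgssEntry *e = (volatile pgssEntry *) entry;

    SpinLockAcquire(&e->mutex);     /* all sessions of this query queue here */
    e->counters.calls[kind] += 1;
    e->counters.total_time[kind] += total_time;
    SpinLockRelease(&e->mutex);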

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: track_planning causing performance regression

2021-04-19 Thread Justin Pryzby
On Mon, Apr 19, 2021 at 11:44:05PM +0900, Fujii Masao wrote:
> On 2021/04/19 8:36, Justin Pryzby wrote:
> > Reviewing this change which was committed last year as
> > 321fa6a4a26c9b649a0fbec9fc8b019f19e62289
> > 
> > On Fri, Jul 03, 2020 at 03:57:38PM +0900, Fujii Masao wrote:
> > > On 2020/07/03 13:05, Pavel Stehule wrote:
> > > > On Fri, Jul 3, 2020 at 4:39, Fujii Masao
> > > > wrote:
> > > > 
> > > > Maybe it can be documented that enabling this option can have a
> > > > negative impact on performance.
> > > 
> > > Yes. What about adding either of the following into the doc?
> > > 
> > >  Enabling this parameter may incur a noticeable performance penalty.
> > > 
> > > or
> > > 
> > >  Enabling this parameter may incur a noticeable performance penalty,
> > >  especially when a fewer kinds of queries are executed on many
> > >  concurrent connections.
> > 
> > Something seems wrong with this sentence, and I'm not sure what it's
> > trying to say.  Is this right?
> 
> pg_stat_statements uses a different spinlock for each kind of query.
> So the fewer kinds of queries that many sessions execute, the fewer
> spinlocks they try to acquire, and the more they compete for the same
> one. This may lead to spinlock contention and significant performance
> degradation. This is what the statement is trying to say.

What does "kind" mean ?  I think it means a "normalized" query or a "query
structure".

"a fewer kinds" is wrong, so I think the docs should say "a small number of
queries" or maybe:

> > >  Enabling this parameter may incur a noticeable performance penalty,
> > >  especially when similar queries are run by many concurrent connections
> > >  and compete to update the same pg_stat_statements entry

-- 
Justin




Re: Commit 86dc90056 - Rework planning and execution of UPDATE and DELETE

2021-04-19 Thread Tom Lane
I wrote:
> Rushabh Lathia  writes:
>> With the commit mentioned in the $subject, I am seeing the
>> change in behaviour with the varlena header size.

> Interesting.  AFAICS, the new behavior is correct and the old is wrong.
> SET STORAGE PLAIN is supposed to disable use of TOAST features, including
> short varlena headers.  So now that's being honored by the UPDATE, but
> before it was not.  I have no idea exactly why that changed though ---
> I'd expect that to be implemented in low-level tuple-construction logic
> that the planner rewrite wouldn't have changed.

Oh, after a bit of tracing I see it.  In v13, the new value gets
short-header-ified when a tuple is constructed here:

/*
 * Ensure input tuple is the right format for the target relation.
 */
if (node->mt_scans[node->mt_whichplan]->tts_ops != planSlot->tts_ops)
{
    ExecCopySlot(node->mt_scans[node->mt_whichplan], planSlot);
    planSlot = node->mt_scans[node->mt_whichplan];
}

where the target slot has been made like this:

mtstate->mt_scans[i] =
    ExecInitExtraTupleSlot(mtstate->ps.state,
                           ExecGetResultType(mtstate->mt_plans[i]),
                           table_slot_callbacks(resultRelInfo->ri_RelationDesc));

So that's using a tupdesc that's been constructed according to the
default properties of the column datatypes, in particular attstorage
will be 'x' for the 'd' column.  Later we transpose the data into
a slot that actually has the target table's rowtype, but the damage
is already done; the value isn't un-short-headerized at that point.
(I wonder if that should be considered a bug?)

86dc90056 got rid of the intermediate mt_scans slots, so the 'ab'
value only gets put into a slot that has the table's real descriptor,
and it never loses its original 4-byte header.

I observe that the INSERT code path still does the wrong thing:

regression=# insert into test_storage_char values('foo');
INSERT 0 1
regression=# SELECT d, pg_column_size(d) FROM test_storage_char;
  d   | pg_column_size 
--+
 ab   | 24
 foo  | 21
(2 rows)

Maybe we oughta try to fix that sometime.  It doesn't seem terribly
high-priority though.

regards, tom lane




Re: [HACKERS] [PATCH] Caching for stable expressions with constant arguments v3

2021-04-19 Thread Robert Haas
On Sat, Dec 17, 2011 at 12:25 PM Marti Raudsepp  wrote:
> Ah, one more trick for testing this patch: if you build with
> -DDEBUG_CACHEEXPR=1 then EXPLAIN VERBOSE displays cached
> subexpressions between a CACHE[...] marker.

Unless I missed something, this 2011 thread is the latest one about
this patch, and the patch was never applied, and nothing similar ever
got done either. Is that correct? If so, was there any particular
reason why this wasn't judged to be a good idea, or did it just not
get enough attention?

This was on my mind today because of a FDW pushdown question that came
up. Someone observed that a predicate of the form foreign_table_column
pushable_operator hairy_but_stable_expression is not pushed down. In
theory, it could be: just compute the value of
hairy_but_stable_expression locally, and then send the result across.
The trick is that it requires identifying a maximal stable
subexpression. In this case, the whole expression isn't stable,
because every row will provide a different value for
foreign_table_column, but the subexpression which is the right child
of the toplevel OpExpr is stable. Identifying that seems like it could
be somewhat expensive and I seem to vaguely recall some discussion of
that issue, but I think not on this thread. But if this patch - or
some other one - happened to inject a CacheExpr at the right place in
the plan tree, then postgres_fdw could leverage that existing
determination in a pretty straightforward way, precomputing the cached
value locally and then pushing the clause if otherwise safe.
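
A rough sketch of the test involved (a hypothetical helper, not
postgres_fdw's actual code): the non-Var side of the OpExpr is safe to
precompute locally and ship as a constant if it contains no Vars and
nothing volatile -- stable functions being acceptable precisely because
the value would be computed locally once per execution:

    /* sketch: can 'expr' be evaluated locally and its result shipped? */
    static bool
    is_locally_precomputable(Node *expr)
    {
        return !contain_var_clause(expr) &&
            !contain_volatile_functions(expr);
    }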

That's also just a fringe benefit. Marti's test results show
significant speedup in all-local cases, which seem likely to benefit a
pretty broad class of queries. We'd probably have to give some thought
to how the technique would work with Andres's rewrite of expression
evaluation, but I imagine that's a solvable problem.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Since '2001-09-09 01:46:40'::timestamp microseconds are lost when extracting epoch

2021-04-19 Thread Tom Lane
Peter Eisentraut  writes:
> The extract(julian from timestamp) is still a bit in the slow mode, but 
> as I previously stated, it's not documented and gives the wrong result, 
> so it's not clear whether it should be fixed and what it should do.  I 
> think I'll register that part as an open item in any case, to see what 
> we should do about that.

I looked into this issue.  It's not quite true that the behavior is
entirely undocumented: Appendix B (datetime.sgml) says

   In the Julian Date system, each day has a sequential number, starting
   from JD 0 (which is sometimes called the Julian Date).
   JD 0 corresponds to 1 January 4713 BC in the Julian calendar, or
   24 November 4714 BC in the Gregorian calendar.  Julian Date counting
   is most often used by astronomers for labeling their nightly observations,
   and therefore a date runs from noon UTC to the next noon UTC, rather than
   from midnight to midnight: JD 0 designates the 24 hours from noon UTC on
   24 November 4714 BC to noon UTC on 25 November 4714 BC.

   Although PostgreSQL supports Julian Date notation for
   input and output of dates (and also uses Julian dates for some internal
   datetime calculations), it does not observe the nicety of having dates
   run from noon to noon.  PostgreSQL treats a Julian Date
   as running from midnight to midnight.

That last bit requires clarification: we treat a Julian date as running
from *local* midnight to local midnight (ie in the active timezone, not
UTC midnight).  So far as I can see, the behavior of extract(julian) is
consistent with that definition:

regression=# show timezone;
 TimeZone 
--
 America/New_York
(1 row)

regression=# select date_part('julian', '2021-04-19 00:00:01-04'::timestamptz);
 date_part 
---
 2459324.11574
(1 row)

regression=# select date_part('julian', '2021-04-19 23:59:00-04'::timestamptz);
 date_part  

 2459324.999306
(1 row)

regression=# select date_part('julian', '2021-04-19'::date);
 date_part 
---
   2459324
(1 row)

I don't see that to_char's J mode differs from this, either.

So I don't think there's any code change required (unless you are still
worried about speed).  What we do need is documentation fixes:

* clarify the above bit about local vs UTC midnight

* document the existence of the julian field for date_part/extract

* fix this bit in the to_char docs to agree with reality,
ie s/UTC/local time/:

   
J
Julian Day (integer days since November 24, 4714 BC at midnight UTC)
   

Perhaps it'd be worth documenting that you can get the standard
astronomical definition of Julian date by transposing to time zone UTC-12
before converting.  But I think trying to change PG's behavior at this
point would be a bad idea.

(We could also consider back-patching these doc fixes.)
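
For anyone wanting to sanity-check the arithmetic, the day number can
be computed with the classic Fliegel & Van Flandern formula (a
standalone sketch, independent of PostgreSQL's own date2j()); for
2021-04-19 it yields 2459324, matching the extract output above:

    #include <stdio.h>

    /*
     * Fliegel & Van Flandern (1968).  Relies on truncating integer
     * division; valid for Gregorian dates in the common era.
     */
    static long
    julian_day(int y, int m, int d)
    {
        return d - 32075L
            + 1461L * (y + 4800 + (m - 14) / 12) / 4
            + 367L * (m - 2 - (m - 14) / 12 * 12) / 12
            - 3L * ((y + 4900 + (m - 14) / 12) / 100) / 4;
    }

    int
    main(void)
    {
        printf("%ld\n", julian_day(2021, 4, 19));   /* prints 2459324 */
        return 0;
    }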

regards, tom lane




Re: Commit 86dc90056 - Rework planning and execution of UPDATE and DELETE

2021-04-19 Thread Robert Haas
On Mon, Apr 19, 2021 at 10:34 AM Amit Langote  wrote:
> After 86dc90056, the new tuple is computed with the target table's
> actual TupleDesc, so the new value respects the column's attstorage,
> which makes me think the new behavior is not wrong.

I would not have expected SET STORAGE PLAIN to disable the use of
short varlena headers. *Maybe* at some point in time there was enough
code that couldn't operate directly on short varlenas to justify a
theory that in some circumstances eschewing short headers would save
on CPU cycles. But surely in 2021 this is not true and this behavior
is not plausibly desired by anyone.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Commit 86dc90056 - Rework planning and execution of UPDATE and DELETE

2021-04-19 Thread Tom Lane
Robert Haas  writes:
> On Mon, Apr 19, 2021 at 10:34 AM Amit Langote  wrote:
>> After 86dc90056, the new tuple is computed with the target table's
>> actual TupleDesc, so the new value respects the column's attstorage,
>> which makes me think the new behavior is not wrong.

> I would not have expected SET STORAGE PLAIN to disable the use of
> short varlena headers.

Au contraire.  The reason that mode exists at all (for varlena types)
is to support data types that haven't been updated for TOAST.  Perhaps
that's now the empty set, but it's not really our job to take away the
capability.  If you really want MAIN you should say that, not quibble
that PLAIN doesn't mean what it's always been understood to mean.

I don't think that this behavior quite breaks such data types, because
if you actually have a type like that then you've set typstorage = PLAIN
and we will not allow there to be any tupdescs in the system that differ
from that.  The issue is just that if you set a particular column of
an otherwise-toastable type to be PLAIN then we're not terribly rigorous
about enforcing that, because values that have been toasted can get into
the column without being fully detoasted.  (I've not checked, but I
suspect that you could also get a compressed or toasted-out-of-line value
into such a column if you tried hard enough.)

Related to this is that when you update some other column(s) of the table,
we don't try to force detoasting of existing values in a column recently
set to PLAIN.  Personally I think that's fine, so it means that the
lack of rigor is inherent.

regards, tom lane




Re: Re: Core dump happens when execute sql CREATE VIEW v1(c1) AS (SELECT ('4' COLLATE "C")::INT FROM generate_series(1, 10));

2021-04-19 Thread Yulin PEI
Consider the SQL statement SELECT (('1' COLLATE "C") || (B'1'));.
Intuitively, the result should be '11', and the result is '11' in the
PG 13.2 release as well.

The call stack is make_fn_arguments -> coerce_type, which means that the
parameter "Node *node" of function coerce_type could be a CollateExpr node.
Let's look at your patch:

```
// node is ('1' COLLATE "C")
// targetType is varbit and it is non-collatable
if (IsA(node, CollateExpr) && type_is_collatable(targetTypeId))
{
    // we will not reach here

    CollateExpr *coll = (CollateExpr *) node;
    CollateExpr *newcoll = makeNode(CollateExpr);

    ...

    // An error will be generated: "failed to find conversion function"
}

```

So I suggest:

```
// node is ('1' COLLATE "C")
if (IsA(node, CollateExpr))
{
    CollateExpr *coll = (CollateExpr *) node;
    CollateExpr *newcoll = makeNode(CollateExpr);

    // targetType is varbit and it is non-collatable
    if (!type_is_collatable(targetTypeId))
    {
        // Try to convert '1' (string) to varbit.  We do not make a new
        // CollateExpr here, but don't forget to coerce coll->arg.
        return coerce_type(pstate, (Node *) coll->arg,
                           inputTypeId, targetTypeId, targetTypeMod,
                           ccontext, cformat, location);
    }
    ...
}

```







From: Tom Lane
Sent: April 19, 2021 1:46
To: Yulin PEI
Cc: pgsql-hackers@lists.postgresql.org
Subject: Re: Re: Core dump happens when execute sql CREATE VIEW v1(c1) AS (SELECT
('4' COLLATE "C")::INT FROM generate_series(1, 10));

Yulin PEI  writes:
> After several tests, I found that this patch do not fix the bug well.

What do you think is wrong with it?

> So the attachment is my patch and it works well as far as I tested.

This seems equivalent to the already-committed patch [1] except that
it wastes a makeNode call in the coerce-to-uncollatable-type case.

regards, tom lane

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c402b02b9fb53aee2a26876de90a8f95f9a9be92


Re: More info on pg_stat_activity Wait Event Name when is DataFileRead

2021-04-19 Thread PegoraroF10
I'm sure the problem was hardware, and I hope it does not occur anymore.
If I have logical replication and on the replica I do a VACUUM FULL,
CLUSTER, or any other exclusive-lock operation, replication will wait
for that. What I was thinking about was a timeout to release that lock,
or, in my situation, a hardware problem. If, N seconds later, that file
is still not released, then change the wait event from DataFileRead to
DataFileRead + relfilenode.



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Reduce the number of special cases to build contrib modules on windows

2021-04-19 Thread Alvaro Herrera
> diff --git a/src/tools/msvc/MSBuildProject.pm 
> b/src/tools/msvc/MSBuildProject.pm
> index ebb169e201..68606a296d 100644
> --- a/src/tools/msvc/MSBuildProject.pm
> +++ b/src/tools/msvc/MSBuildProject.pm
> @@ -310,11 +310,12 @@ sub WriteItemDefinitionGroup
>   my $targetmachine =
> $self->{platform} eq 'Win32' ? 'MachineX86' : 'MachineX64';
>  
> - my $includes = $self->{includes};
> - unless ($includes eq '' or $includes =~ /;$/)
> + my $includes = "";
> + foreach my $inc (@{ $self->{includes} })
>   {
> - $includes .= ';';
> + $includes .= $inc . ";";
>   }

Perl note: you can do this more easily as 

  my $includes = join ';', @{$self->{includes}};
  $includes .= ';' unless $includes eq '';

-- 
Álvaro Herrera         39°49'30"S 73°17'W
 Are you not unsure you want to delete Firefox?
   [Not unsure] [Not not unsure][Cancel]
   http://smylers.hates-software.com/2008/01/03/566e45b2.html




Re: Windows default locale vs initdb

2021-04-19 Thread Andrew Dunstan


On 4/19/21 10:26 AM, Dave Page wrote:
>
>
> On Mon, Apr 19, 2021 at 11:52 AM Andrew Dunstan wrote:
>
>
> My understanding from Microsoft staff at conferences is that
> Azure's PostgreSQL SAS runs on  linux, not WIndows.
>
>
> This is from a regular Azure Database for PostgreSQL single server:
>
> postgres=> select version();
>                           version
> -------------------------------------------------------------
>  PostgreSQL 11.6, compiled by Visual C++ build 1800, 64-bit
> (1 row)
>
> And this is from the new Flexible Server preview:
>
> postgres=> select version();
>                                                    version
> -----------------------------------------------------------------------------------------------------------
>  PostgreSQL 12.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
> 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit
> (1 row)
>
> So I guess it's a case of "it depends".
>

Good to know. A year or two back at more than one conference I tried to enlist 
some of these folks in helping us with Windows PostgreSQL and their reply was 
that they knew nothing about it because they were on Linux :-) I guess things 
change over time.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: pg_amcheck option to install extension

2021-04-19 Thread Andrew Dunstan


On 4/18/21 7:32 PM, Alvaro Herrera wrote:
> On 2021-Apr-18, Andrew Dunstan wrote:
>
>> On 4/17/21 3:43 PM, Mark Dilger wrote:
>>> I'd also like your impressions on whether we're likely to move
>>> contrib/amcheck into core anytime soon.  If so, is it worth adding
>>> an option that we'll soon need to deprecate?
>> I think if it stays as an extension it will stay in contrib. But it sure
>> feels very odd to have a core bin program that relies on a contrib
>> extension. It seems one or the other is misplaced.
> I've proposed in the past that we should have a way to provide
> extensions other than contrib -- specifically src/extensions/ -- and
> then have those extensions installed together with the rest of core.
> Then it would be perfectly legitimate to have src/bin/pg_amcheck that
> depending that extension.  I agree that the current situation is not
> great.
>


OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
pg_amcheck should move there. I can organize that if there's agreement.
Or else let's move amcheck as Alvaro suggests.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: multi-install PostgresNode fails with older postgres versions

2021-04-19 Thread Andrew Dunstan

On 4/19/21 10:43 AM, Mark Dilger wrote:
>
>> On Apr 19, 2021, at 5:11 AM, Andrew Dunstan  wrote:
>>
>> I think therefore I'm inclined for now to do nothing for old version
>> compatibility.
> I agree with waiting until the v15 development cycle.
>
>> I would commit the fix for the IPC::Run caching glitch,
>> and version detection
> Thank you.
>
>> I would add a warning if the module is used with
>> a version <= 11.
> Sounds fine for now.
>
>> The original goal of these changes was to allow testing of combinations
>> of different builds with openssl and nss, which doesn't involve old
>> version compatibility.
> Hmm.  I think different folks had different goals.  My personal interest is 
> to write automated tests which spin up older servers, create data that cannot 
> be created on newer servers (such as heap tuples with HEAP_MOVED_IN or 
> HEAP_MOVED_OFF bits set), upgrade, and test that new code handles the old 
> data correctly.  I think this is not only useful for our test suites as a 
> community, but is also useful for companies providing support services who 
> need to reproduce problems that customers are having on clusters that have 
> been pg_upgraded across large numbers of postgres versions.
>
>> As far as I know, without any compatibility changes the module is fully
>> compatible with releases 13 and 12, and with releases 11 and 10 so long
>> as you don't want a standby, and with releases 9.6 and 9.5 if you also
>> don't want a backup. That makes it suitable for a lot of testing without
>> any attempt at version compatibility.
>>
>> We can revisit compatibility further in the next release.
> Sounds good.


I'll work on this. Meanwhile FTR here's my latest revision - it's a lot
less invasive of the main module, so it seems much more palatable to me,
and still passes my test down to 7.2.


cheers


andrew


-- 

Andrew Dunstan
EDB: https://www.enterprisedb.com

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index e209ea7163..c6086101f2 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -96,6 +96,7 @@ use File::Spec;
 use File::stat qw(stat);
 use File::Temp ();
 use IPC::Run;
+use PostgresVersion;
 use RecursiveCopy;
 use Socket;
 use Test::More;
@@ -127,6 +128,20 @@ INIT
 	$last_port_assigned = int(rand() * 16384) + 49152;
 }
 
+# Current dev version, for which we have no subclass
+# When a new stable branch is made this and the subclass hierarchy below
+# need to be adjusted.
+my $devtip = 14;
+
+INIT
+{
+	# sanity check to make sure there is a subclass for the last stable branch
+	my $last_child = 'PostgresNodeV_' . ($devtip -1);
+	eval "${last_child}->can('get_new_node') || die('not found');";
+	die "No child package $last_child found" if $@;
+}
+
+
 =pod
 
 =head1 METHODS
@@ -347,9 +362,12 @@ about this node.
 sub info
 {
 	my ($self) = @_;
+	my $varr = $self->{_pg_version};
+	my $vstr = join('.', @$varr) if ref $varr;
 	my $_info = '';
 	open my $fh, '>', \$_info or die;
 	print $fh "Name: " . $self->name . "\n";
+	print $fh "Version: " . $vstr . "\n" if $vstr;
 	print $fh "Data directory: " . $self->data_dir . "\n";
 	print $fh "Backup directory: " . $self->backup_dir . "\n";
 	print $fh "Archive directory: " . $self->archive_dir . "\n";
@@ -438,7 +456,7 @@ sub init
 	mkdir $self->backup_dir;
 	mkdir $self->archive_dir;
 
-	TestLib::system_or_bail('initdb', '-D', $pgdata, '-A', 'trust', '-N',
+	TestLib::system_or_bail('initdb', '-D', $pgdata, ($self->_initdb_flags),
 		@{ $params{extra} });
 	TestLib::system_or_bail($ENV{PG_REGRESS}, '--config-auth', $pgdata,
 		@{ $params{auth_extra} });
@@ -465,31 +483,36 @@ sub init
 
 	if ($params{allows_streaming})
 	{
-		if ($params{allows_streaming} eq "logical")
-		{
-			print $conf "wal_level = logical\n";
-		}
-		else
-		{
-			print $conf "wal_level = replica\n";
-		}
-		print $conf "max_wal_senders = 10\n";
-		print $conf "max_replication_slots = 10\n";
-		print $conf "wal_log_hints = on\n";
-		print $conf "hot_standby = on\n";
-		# conservative settings to ensure we can run multiple postmasters:
-		print $conf "shared_buffers = 1MB\n";
-		print $conf "max_connections = 10\n";
-		# limit disk space consumption, too:
-		print $conf "max_wal_size = 128MB\n";
+		$self->_init_streaming($conf, $params{allows_streaming})
 	}
 	else
 	{
-		print $conf "wal_level = minimal\n";
-		print $conf "max_wal_senders = 0\n";
+		$self->_init_wal_level_minimal($conf);
 	}
 
 	print $conf "port = $port\n";
+
+	$self->_init_network($conf, $use_tcp, $host);
+
+	close $conf;
+
+	chmod($self->group_access ? 0640 : 0600, "$pgdata/postgresql.conf")
+	  or die("unable to set permissions for $pgdata/postgresql.conf");
+
+	$self->set_replication_conf if $params{allows_streaming};
+	$self->enable_archiving if $params{has_archiving};
+	return;
+}
+
+
+# methods used in init() which can be overridden in older versions
+
+sub _initdb_flags {	return ('-A', 'trust', '-N'); }
+
+sub _init_network
+{
+	my ($self, $conf,

Re: pg_amcheck option to install extension

2021-04-19 Thread Mark Dilger



> On Apr 19, 2021, at 9:32 AM, Andrew Dunstan  wrote:
> 
> 
> On 4/18/21 7:32 PM, Alvaro Herrera wrote:
>> On 2021-Apr-18, Andrew Dunstan wrote:
>> 
>>> On 4/17/21 3:43 PM, Mark Dilger wrote:
 I'd also like your impressions on whether we're likely to move
 contrib/amcheck into core anytime soon.  If so, is it worth adding
 an option that we'll soon need to deprecate?
>>> I think if it stays as an extension it will stay in contrib. But it sure
>>> feels very odd to have a core bin program that relies on a contrib
>>> extension. It seems one or the other is misplaced.
>> I've proposed in the past that we should have a way to provide
>> extensions other than contrib -- specifically src/extensions/ -- and
>> then have those extensions installed together with the rest of core.
>> Then it would be perfectly legitimate to have src/bin/pg_amcheck that
>> depending that extension.  I agree that the current situation is not
>> great.
>> 
> 
> 
> OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
> pg_amcheck should move there. I can organize that if there's agreement.
> Or else let's move amcheck as Alvaro suggests.

Ah, no.  I wrote pg_amcheck in contrib originally, and moved it to src/bin as 
requested during the v14 development cycle.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: terminate called after throwing an instance of 'std::bad_alloc'

2021-04-19 Thread Andres Freund
Hi,

On 2021-04-17 19:13:24 -0500, Justin Pryzby wrote:
> I'm now realizing that that's RAM use for a single query, not from continuous
> leaks across multiple queries.

What's the memory usage with inlining disabled, and whats the usage
without jit?


> This is still true with the patch even if I
> #define LLVMJIT_LLVM_CONTEXT_REUSE_MAX 1
>
>   PID USER  PR  NIVIRTRESSHR S  %CPU %MEM TIME+ COMMAND
> 28438 postgres  20   0 3854264   2.8g  29428 R  98.7 33.2   8:56.79 postgres: 
> telsasoft ts 192.168.122.11(53614) BIND
>
> python3 ./jitleak.py # runs telsasoft reports
> INFO:  recreating LLVM context after 2 uses
> INFO:  recreating LLVM context after 2 uses
> INFO:  recreating LLVM context after 2 uses
> INFO:  recreating LLVM context after 2 uses
> INFO:  recreating LLVM context after 2 uses
> PID 27742 finished running report; est=None rows=40745; cols=34; ... 
> duration:538
> INFO:  recreating LLVM context after 81492 uses
>
> I did:
>
> -   llvm_llvm_context_reuse_count = 0;
> Assert(llvm_context != NULL);
> +   elog(INFO, "recreating LLVM context after %zu uses", 
> llvm_llvm_context_reuse_count);
> +   llvm_llvm_context_reuse_count = 0;
>
> Maybe we're missing this condition somehow ?
> if (llvm_jit_context_in_use_count == 0 &&

Do you utilize cursors or such?


> Also, I just hit this assertion by cancelling the query with ^C / sigint.  But
> I don't have a reprodcer for it.
>
> < 2021-04-17 19:14:23.509 ADT telsasoft >PANIC:  LLVMJitContext in use count 
> not 0 at exit (is 1)

Hm, I'll try to see how that happens - if not, do you have a backtrace?

Greetings,

Andres Freund




Re: Commit 86dc90056 - Rework planning and execution of UPDATE and DELETE

2021-04-19 Thread Robert Haas
On Mon, Apr 19, 2021 at 12:13 PM Tom Lane  wrote:
> Au contraire.  The reason that mode exists at all (for varlena types)
> is to support data types that haven't been updated for TOAST.  Perhaps
> that's now the empty set, but it's not really our job to take away the
> capability.  If you really want MAIN you should say that, not quibble
> that PLAIN doesn't mean what it's always been understood to mean.

This kind of begs the question of whether you have the right idea
about what PLAIN has always been understood to mean, and whether
everyone understands it the same way. I formed my understanding of
what PLAIN is understood to mean by reading the ALTER TABLE .. SET
STORAGE documentation, and there's no real hint in there that this is
some kind of backward-compatibility only feature. Rather, I read that
paragraph to suggest that you can choose between the four options as a
way of getting best performance. Both external storage and compression
are trade-offs: they make the tuples smaller, which can be good for
performance, but they also make the toasted columns more expensive to
access, which can be bad for performance. It seems completely
reasonable to suppose that some workloads may benefit from a
non-default TOAST strategy; e.g. if you often access only a few
columns but scan a lot of rows, toasting wins; if you typically access
every column but only a few rows via an index scan, not toasting wins.
And if that is your idea about what the feature does - an idea that
seems perfectly defensible given what the documentation says about
this - then I think it's going to be surprising to find that it also
inhibits 1-byte headers from being used. But, IDK, maybe nobody will
care (or maybe I'm the only one who will be surprised).

Perhaps this whole area needs a broader rethink at some point. I'm
inclined to think that compatibility with varlena data types that
haven't been updated since PostgreSQL 8.3 came out is almost a
non-goal and maybe we ought to just kick such data types and the
associated code paths to the curb. It's unlikely that they get much
testing. On the other hand, perhaps we'd like to have more control
over the decision to compress or store externally than we have
currently. I think if I were designing this from scratch, I'd want one
switch for whether it's OK to compress, with values meaning "yes,"
"no," and "only if stored externally," a second switch for the
*length* at which external storage should be used (so that I can push
out rarely-used columns at lower size thresholds and commonly-used
ones at higher thresholds), and a third for what should happen if we
do the stuff allowed by the first two switches and the tuple still
doesn't fit, with values meaning "fail" and "externalize anyway".
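
To make that concrete, those three switches might be spelled out as in the
sketch below.  This is purely illustrative (none of these names exist in
PostgreSQL); it just restates the proposal as code:

```
#include <stdint.h>

/* switch 1: is it OK to compress? */
typedef enum CompressPolicy
{
	COMPRESS_YES,
	COMPRESS_NO,
	COMPRESS_IF_EXTERNAL		/* only once stored externally */
} CompressPolicy;

/* switch 3: what to do if the tuple still doesn't fit */
typedef enum OverflowPolicy
{
	OVERFLOW_FAIL,
	OVERFLOW_EXTERNALIZE_ANYWAY
} OverflowPolicy;

typedef struct ColumnStorageOpts
{
	CompressPolicy	compress;			/* switch 1 */
	int32_t			extern_threshold;	/* switch 2: length at which to
										 * store externally */
	OverflowPolicy	on_overflow;		/* switch 3 */
} ColumnStorageOpts;
```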

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Re: Re: Core dump happens when execute sql CREATE VIEW v1(c1) AS (SELECT ('4' COLLATE "C")::INT FROM generate_series(1, 10));

2021-04-19 Thread Tom Lane
Yulin PEI  writes:
> Let's look at your patch:

> ```
> // node is ('1' COLLATE "C")
> // targetType is varbit and it is non-collatable
> if (IsA(node, CollateExpr) && type_is_collatable(targetTypeId))
> {

> // we will not reach here.

That's not the committed patch, though.  I realized after posting
it that it didn't maintain the same behavior in coerce_type as
coerce_to_target_type.  But the actually-committed fix does, and
as I said, what you're suggesting seems equivalent though a bit
messier.

regards, tom lane




Re: pg_amcheck option to install extension

2021-04-19 Thread Robert Haas
On Mon, Apr 19, 2021 at 12:37 PM Mark Dilger
 wrote:
> > OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
> > pg_amcheck should move there. I can organize that if there's agreement.
> > Or else let's move amcheck as Alvaro suggests.
>
> Ah, no.  I wrote pg_amcheck in contrib originally, and moved it to src/bin as 
> requested during the v14 development cycle.

Yeah, I am not that excited about moving this again. I realize it was
never committed anywhere else, but it was moved at least once during
development. And I don't see that moving it to contrib really fixes
anything anyway here, except perhaps conceptually. Maybe inventing
src/extensions is the right idea, but there's no real need to do that
at this point in the release cycle, and it doesn't actually fix
anything either. The only thing that's really needed here is to either
(a) teach the test script to install amcheck as a separate step or (b)
teach pg_amcheck to install amcheck in a user-specified schema. If we
do that, AIUI, this issue is fixed regardless of whether we move any
source code around, and if we don't do that, AIUI, this issue is not
fixed no matter how much source code we move.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg_amcheck option to install extension

2021-04-19 Thread Tom Lane
Andrew Dunstan  writes:
> OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
> pg_amcheck should move there. I can organize that if there's agreement.
> Or else let's move amcheck as Alvaro suggests.

FWIW, I think that putting them both in contrib makes the most
sense from a structural standpoint.

Either way, though, you'll still need the proposed option to
let the executable issue a CREATE EXTENSION to get the shlib
loaded.  Unless somebody is proposing that the extension be
installed-by-default like plpgsql, and that I am unequivocally
not for.

regards, tom lane




Re: More info on pg_stat_activity Wait Event Name when is DataFileRead

2021-04-19 Thread Robert Haas
On Mon, Apr 19, 2021 at 12:17 PM PegoraroF10  wrote:
> I'm sure the problem was hardware, and I hope it does not occur anymore.
> If I have logical replication and on the replica I do a VACUUM FULL, CLUSTER,
> or any other EXCLUSIVE LOCK operation, replication will wait for that.  What I
> was thinking about was a timeout to release that lock, or in my situation to
> detect a hardware problem: if N seconds later that file is still not released,
> then change DataFileRead to DataFileRead + relfilenode.

But how would we implement that with reasonable efficiency? If we
called setitimer() before every read() call to set the timeout, and
then again to clear it after the read(), that would probably be
hideously expensive. Perhaps it would work to have a background
"heartbeat" process that pings every backend in the system every 1s or
something like that, and make the signal handler do this, but that
supposes that the signal handler would have ready access to this
information, which doesn't seem totally straightforward to arrange,
and that it would be OK for the signal handler to grab a lock to
update shared memory, which as things stand today is definitely not
safe.
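
For concreteness, the per-read() setitimer() arrangement dismissed above would
look roughly like the sketch below, in plain POSIX C rather than anything that
exists in PostgreSQL (read_with_timeout is a made-up helper).  The two
setitimer() calls bracketing every read() are exactly the overhead in question:

```
#include <signal.h>
#include <sys/time.h>
#include <unistd.h>

static volatile sig_atomic_t read_timed_out = 0;

/* SIGALRM handler: async-signal-safe, does nothing but set a flag */
static void
alarm_handler(int signo)
{
	read_timed_out = 1;
}

/*
 * Caller must have installed alarm_handler for SIGALRM via sigaction(),
 * without SA_RESTART, so that read() fails with EINTR on timeout.
 */
static ssize_t
read_with_timeout(int fd, void *buf, size_t len, int timeout_secs)
{
	struct itimerval arm = {{0, 0}, {timeout_secs, 0}};
	struct itimerval disarm = {{0, 0}, {0, 0}};
	ssize_t		rc;

	read_timed_out = 0;
	setitimer(ITIMER_REAL, &arm, NULL);		/* extra syscall #1 */
	rc = read(fd, buf, len);
	setitimer(ITIMER_REAL, &disarm, NULL);	/* extra syscall #2 */
	return rc;
}
```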

I am not trying to say that there is no way that something like this
could be made to work. There's probably something that can be done. I
don't think I know what that thing is, though.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: multi-install PostgresNode fails with older postgres versions

2021-04-19 Thread Jehan-Guillaume de Rorthais
On Mon, 19 Apr 2021 07:43:58 -0700
Mark Dilger  wrote:

> > On Apr 19, 2021, at 5:11 AM, Andrew Dunstan  wrote:
> > 
> > I think therefore I'm inclined for now to do nothing for old version
> > compatibility.  
> 
> I agree with waiting until the v15 development cycle.

Agree.




Re: Commit 86dc90056 - Rework planning and execution of UPDATE and DELETE

2021-04-19 Thread Tom Lane
Robert Haas  writes:
> On Mon, Apr 19, 2021 at 12:13 PM Tom Lane  wrote:
>> Au contraire.  The reason that mode exists at all (for varlena types)
>> is to support data types that haven't been updated for TOAST.

> This kind of begs the question of whether you have the right idea
> about what PLAIN has always been understood to mean, and whether
> everyone understands it the same way. I formed my understanding of
> what PLAIN is understood to mean by reading the ALTER TABLE .. SET
> STORAGE documentation, and there's no real hint in there that this is
> some kind of backward-compatibility only feature.

That doco is explaining the user's-eye view of it.  Places addressed
to datatype developers, such as the CREATE TYPE reference page, see
it a bit differently.  CREATE TYPE for instance points out that

All storage values other than plain imply that the functions of the
data type can handle values that have been toasted, as described in ...

> I think if I were designing this from scratch, I'd want one
> switch for whether it's OK to compress, with values meaning "yes,"
> "no," and "only if stored externally," a second switch for the
> *length* at which external storage should be used (so that I can push
> out rarely-used columns at lower size thresholds and commonly-used
> ones at higher thresholds), and a third for what should happen if we
> do the stuff allowed by the first two switches and the tuple still
> doesn't fit, with value meaning "fail" and "externalize anyway".

Yeah, I don't think the existing options for attstorage have much
to recommend them except backwards compatibility.  But if we do
redesign them, I'd still say there should be a way for a data
type to say that it doesn't support these weird header hacks that
we've invented.  The notion that short header doesn't cost anything
seems extremely Intel-centric to me.

regards, tom lane




Re: Commit 86dc90056 - Rework planning and execution of UPDATE and DELETE

2021-04-19 Thread Robert Haas
On Mon, Apr 19, 2021 at 1:03 PM Tom Lane  wrote:
> That doco is explaining the users-eye view of it.  Places addressed
> to datatype developers, such as the CREATE TYPE reference page, see
> it a bit differently.  CREATE TYPE for instance points out that
>
> All storage values other than plain imply that the functions of the
> data type can handle values that have been toasted, as described in ...

Interesting. It feels to me like SET STORAGE PLAIN is really trying
to be two different things. Either you want to inhibit
compression and external storage for performance reasons, or your data
type can't support either one. Maybe we should separate those
concepts, since there's no mode right now that says "don't ever
compress, and externalize only if there's absolutely no other way,"
and there's no way to disable compression and externalization without
also killing off short headers. :-(

> The notion that short header doesn't cost anything seems extremely 
> Intel-centric to me.

I don't think so. It's true that Intel is very forgiving about
unaligned accesses compared to some other architectures, but I think
if you have a terabyte of data, you want it to fit into as few disk
pages as possible pretty much no matter what architecture you're
using. The dominant costs are going to be the I/O costs, not the CPU
costs of dealing with unaligned bytes. In fact, even if you have a
gigabyte of data, I bet it's *still* better to use a more compact
on-disk representation. Now, the dominant cost is going to be pumping
the data through the L3 CPU cache, which is still - I think - going to
be quite a lot more important than the CPU costs of dealing with
unaligned bytes. The CPU bus is an I/O bottleneck not unlike the disk
itself, just at a higher rate of speed which is still way slower than
the CPU speed. Now if you have a megabyte of data, or better yet a
kilobyte of data, then I think optimizing for CPU efficiency may well
be the right thing to do. I don't know how much 4-byte varlena headers
really save there, but if I were designing a storage representation
for very small data sets, I'd definitely be thinking about how I could
waste space to shave cycles.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg_amcheck option to install extension

2021-04-19 Thread Mark Dilger



> On Apr 19, 2021, at 9:53 AM, Tom Lane  wrote:
> 
> Andrew Dunstan  writes:
>> OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
>> pg_amcheck should move there. I can organize that if there's agreement.
>> Or else let's move amcheck as Alvaro suggests.
> 
> FWIW, I think that putting them both in contrib makes the most
> sense from a structural standpoint.

That was my original thought also, largely from a package management 
perspective.  Just as an example, postgresql-client and postgresql-contrib are 
separate rpms.  There isn't much point to having pg_amcheck installed as part 
of the postgresql-client package while having amcheck in the postgresql-contrib 
package which might not be installed.

A counter argument is that amcheck is server side, and pg_amcheck is client 
side.  Having pg_amcheck installed on a system makes sense if you are 
connecting to a server on a different system.

> On Mar 11, 2021, at 12:12 AM, Peter Eisentraut 
>  wrote:
> 
> I want to register, if we are going to add this, it ought to be in src/bin/.  
> If we think it's a useful tool, it should be there with all the other useful 
> tools.
> 
> I realize there is a dependency on a module in contrib, and it's probably now 
> not the time to re-debate reorganizing contrib.  But if we ever get to that, 
> this program should be the prime example why the current organization is 
> problematic, and we should be prepared to make the necessary moves then.

This was the request that motivated the move to src/bin.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: multi-install PostgresNode fails with older postgres versions

2021-04-19 Thread Jehan-Guillaume de Rorthais
On Mon, 19 Apr 2021 12:37:08 -0400
Andrew Dunstan  wrote:

> 
> On 4/19/21 10:43 AM, Mark Dilger wrote:
> >
> >> On Apr 19, 2021, at 5:11 AM, Andrew Dunstan  wrote:
> >>
> >> I think therefore I'm inclined for now to do nothing for old version
> >> compatibility.
> > I agree with waiting until the v15 development cycle.
> >
> >> I would commit the fix for the IPC::Run caching glitch,
> >> and version detection
> > Thank you.
> >
> >> I would add a warning if the module is used with
> >> a version <= 11.
> > Sounds fine for now.
> >
> >> The original goal of these changes was to allow testing of combinations
> >> of different builds with openssl and nss, which doesn't involve old
> >> version compatibility.
> > Hmm.  I think different folks had different goals.  My personal interest is
> > to write automated tests which spin up older servers, create data that
> > cannot be created on newer servers (such as heap tuples with HEAP_MOVED_IN
> > or HEAP_MOVED_OFF bits set), upgrade, and test that new code handles the
> > old data correctly.  I think this is not only useful for our test suites as
> > a community, but is also useful for companies providing support services
> > who need to reproduce problems that customers are having on clusters that
> > have been pg_upgraded across large numbers of postgres versions.
> >
> >> As far as I know, without any compatibility changes the module is fully
> >> compatible with releases 13 and 12, and with releases 11 and 10 so long
> >> as you don't want a standby, and with releases 9.6 and 9.5 if you also
> >> don't want a backup. That makes it suitable for a lot of testing without
> >> any attempt at version compatibility.
> >>
> >> We can revisit compatibility further in the next release.
> > Sounds good.
> 
> 
> I'll work on this. Meanwhile FTR here's my latest revision - it's a lot
> less invasive of the main module, so it seems much more palatable to me,
> and still passes my test down to 7.2.

I spent a fair bit of time wondering how useful it could be to either maintain
such a module in core, including for external needs, or create a separate
external project with a different release/distribution/packaging policy.

Wherever the module is maintained, the goal would be to address broader
needs, eg. adding a switch_wal() method or wait_for_archive(), supporting
replication, backups, etc for multi-old-deprecated-PostgreSQL versions.

To be honest I have mixed feelings. I feel this burden shouldn't be carried
by core, which has restricted needs compared to external projects. On the
other hand, maintaining an external project which shares 90% of the code seems
to be useless duplication and backport effort. Moreover, Craig Ringer already
opened the door for external use of PostgresNode with his effort to
install/package it, see:
https://www.postgresql.org/message-id/CAGRY4nxxKSFJEgVAv5YAk%3DbqULtFmNw7gEJef0CCgzpNy6O%3D-w%40mail.gmail.com

Thoughts?




Re: multi-install PostgresNode fails with older postgres versions

2021-04-19 Thread Mark Dilger



> On Apr 19, 2021, at 10:25 AM, Jehan-Guillaume de Rorthais  
> wrote:
> 
> On Mon, 19 Apr 2021 12:37:08 -0400
> Andrew Dunstan  wrote:
> 
>> 
>> On 4/19/21 10:43 AM, Mark Dilger wrote:
>>> 
 On Apr 19, 2021, at 5:11 AM, Andrew Dunstan  wrote:
 
 I think therefore I'm inclined for now to do nothing for old version
 compatibility.
>>> I agree with waiting until the v15 development cycle.
>>> 
 I would commit the fix for the IPC::Run caching glitch,
 and version detection
>>> Thank you.
>>> 
 I would add a warning if the module is used with
 a version <= 11.
>>> Sounds fine for now.
>>> 
 The original goal of these changes was to allow testing of combinations
 of different builds with openssl and nss, which doesn't involve old
 version compatibility.
>>> Hmm.  I think different folks had different goals.  My personal interest is
>>> to write automated tests which spin up older servers, create data that
>>> cannot be created on newer servers (such as heap tuples with HEAP_MOVED_IN
>>> or HEAP_MOVED_OFF bits set), upgrade, and test that new code handles the
>>> old data correctly.  I think this is not only useful for our test suites as
>>> a community, but is also useful for companies providing support services
>>> who need to reproduce problems that customers are having on clusters that
>>> have been pg_upgraded across large numbers of postgres versions.
>>> 
 As far as I know, without any compatibility changes the module is fully
 compatible with releases 13 and 12, and with releases 11 and 10 so long
 as you don't want a standby, and with releases 9.6 and 9.5 if you also
 don't want a backup. That makes it suitable for a lot of testing without
 any attempt at version compatibility.
 
 We can revisit compatibility further in the next release.
>>> Sounds good.
>> 
>> 
>> I'll work on this. Meanwhile FTR here's my latest revision - it's a lot
>> less invasive of the main module, so it seems much more palatable to me,
>> and still passes my test down to 7.2.
> 
> I spent a fair bit of time wondering how useful it could be to either
> maintain such a module in core, including for external needs, or create a
> separate external project with a different release/distribution/packaging
> policy.
> 
> Wherever the module is maintained, the goal would be to address broader
> needs, eg. adding a switch_wal() method or wait_for_archive(), supporting
> replication, backups, etc for multi-old-deprecated-PostgreSQL versions.
>
> To be honest I have mixed feelings. I feel this burden shouldn't be carried
> by core, which has restricted needs compared to external projects. On the
> other hand, maintaining an external project which shares 90% of the code
> seems to be useless duplication and backport effort. Moreover, Craig Ringer
> already opened the door for external use of PostgresNode with his effort to
> install/package it, see:
> https://www.postgresql.org/message-id/CAGRY4nxxKSFJEgVAv5YAk%3DbqULtFmNw7gEJef0CCgzpNy6O%3D-w%40mail.gmail.com
> 
> Thoughts?

The community needs a single shared PostgresNode implementation that can be 
used by scripts which reproduce bugs.  For bugs that can only be triggered by 
cross version upgrades, the scripts need a PostgresNode implementation which 
can work across versions.  Likewise for bugs that can only be triggered when 
client applications connect to servers of a different version.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Bogus collation version recording in recordMultipleDependencies

2021-04-19 Thread Andres Freund
Hi,

On 2021-04-18 11:29:42 -0400, Tom Lane wrote:
> I'm not sure that an error in this direction is all that much more
> problematic than the other direction.  If it's okay to claim that
> indexes need to be rebuilt when they don't really, then we could just
> drop this entire overcomplicated infrastructure and report that all
> indexes need to be rebuilt after any collation version change.

That doesn't ring true to me. There's a huge difference between needing
to rebuild all indexes, especially primary key indexes which often are
over int8 etc, and unnecessarily needing to rebuild indexes doing
comparatively rare things.

Greetings,

Andres Freund




Re: terminate called after throwing an instance of 'std::bad_alloc'

2021-04-19 Thread Justin Pryzby
On Mon, Apr 19, 2021 at 09:41:30AM -0700, Andres Freund wrote:
> On 2021-04-17 19:13:24 -0500, Justin Pryzby wrote:
> > I'm now realizing that that's RAM use for a single query, not from 
> > continuous
> > leaks across multiple queries.
> 
> What's the memory usage with inlining disabled, and whats the usage
> without jit?

PGOPTIONS="-c jit=off -c jit_inline_above_cost=0 -c jit_above_cost=0 -c 
client_min_messages=debug -c log_executor_stats=on" python3 `pwd`/jitleak.py
!   [5.272604 s user, 0.329912 s system total]
!   120948 kB max resident size

PGOPTIONS="-c jit=on -c jit_inline_above_cost=0 -c jit_above_cost=0 -c 
client_min_messages=debug -c log_executor_stats=on" python3 `pwd`/jitleak.py
DEBUG:  time to inline: 0.003s, opt: 0.000s, emit: 0.003s
... repeated many times ...
!   [585.090087 s user, 31.835228 s system total]
!   4639556 kB max resident size

PGOPTIONS="-c jit=on -c jit_inline_above_cost=-1 -c jit_above_cost=0 -c 
client_min_messages=debug -c log_executor_stats=on" python3 `pwd`/jitleak.py
DEBUG:  time to inline: 0.000s, opt: 0.000s, emit: 0.003s
... repeated many times ... which is confusing, since inlining is disabled ...
!   [318.556489 s user, 8.413007 s system total]
!   159436 kB max resident size

> > Maybe we're missing this condition somehow ?
> > if (llvm_jit_context_in_use_count == 0 &&
> 
> Do you utilize cursors or such?

Not in this backend/process.
I believe postgis/mapserver uses cursors in a separate process, though...

Our report runs like:
SELECT large_object_oid FROM ... WHERE name=$1 AND user=$2 ... FOR UPDATE -- 
our reports are stored as LOs and read by readlo
begin;
SET enable_nestloop=off; -- I'm sorry to say but nothing else avoids
occasional pathological query plans
SELECT report text...;
rollback;

> > Also, I just hit this assertion by cancelling the query with ^C / sigint.  
> > But
> > I don't have a reprodcer for it.
> >
> > < 2021-04-17 19:14:23.509 ADT telsasoft >PANIC:  LLVMJitContext in use 
> > count not 0 at exit (is 1)
> 
> Hm, I'll try to see how that happens - if not, do you have a backtrace?

This may be because I backpatched to v13, which had one conflict (0a2bc5d61).
Maybe this is relevant, too:  bab150045

If you could provide a patch for v13, it's a good bet there's no issue that I
didn't cause.

I'm able to reproduce the crash like this, with my patch to v13, but not your
patch on master.

python3 -c "import pg; db=pg.DB('dbname=postgres'); db.query('SET 
jit_above_cost=0; SET jit_inline_above_cost=0; SET jit=on; SET 
client_min_messages=debug'); db.query('begin'); db.query_formatted('SELECT 1 
FROM generate_series(1,99)a WHERE a=%s', [1], inline=False);"

-- 
Justin




Re: multi-install PostgresNode fails with older postgres versions

2021-04-19 Thread Jehan-Guillaume de Rorthais
On Mon, 19 Apr 2021 10:35:39 -0700
Mark Dilger  wrote:

> > On Apr 19, 2021, at 10:25 AM, Jehan-Guillaume de Rorthais 
> > wrote:
> > 
> > On Mon, 19 Apr 2021 12:37:08 -0400
> > Andrew Dunstan  wrote:
> >   
> >> 
> >> On 4/19/21 10:43 AM, Mark Dilger wrote:  
> >>>   
>  On Apr 19, 2021, at 5:11 AM, Andrew Dunstan  wrote:
>  
>  I think therefore I'm inclined for now to do nothing for old version
>  compatibility.  
> >>> I agree with waiting until the v15 development cycle.
> >>>   
>  I would commit the fix for the IPC::Run caching glitch,
>  and version detection  
> >>> Thank you.
> >>>   
>  I would add a warning if the module is used with
>  a version <= 11.  
> >>> Sounds fine for now.
> >>>   
>  The original goal of these changes was to allow testing of combinations
>  of different builds with openssl and nss, which doesn't involve old
>  version compatibility.  
> >>> Hmm.  I think different folks had different goals.  My personal interest
> >>> is to write automated tests which spin up older servers, create data that
> >>> cannot be created on newer servers (such as heap tuples with HEAP_MOVED_IN
> >>> or HEAP_MOVED_OFF bits set), upgrade, and test that new code handles the
> >>> old data correctly.  I think this is not only useful for our test suites
> >>> as a community, but is also useful for companies providing support
> >>> services who need to reproduce problems that customers are having on
> >>> clusters that have been pg_upgraded across large numbers of postgres
> >>> versions. 
>  As far as I know, without any compatibility changes the module is fully
>  compatible with releases 13 and 12, and with releases 11 and 10 so long
>  as you don't want a standby, and with releases 9.6 and 9.5 if you also
>  don't want a backup. That makes it suitable for a lot of testing without
>  any attempt at version compatibility.
>  
>  We can revisit compatibility further in the next release.  
> >>> Sounds good.  
> >> 
> >> 
> >> I'll work on this. Meanwhile FTR here's my latest revision - it's a lot
> >> less invasive of the main module, so it seems much more palatable to me,
> >> and still passes my test down to 7.2.  
> > 
> > I spent a fair bit of time wondering how useful it could be to either
> > maintain such a module in core, including for external needs, or create a
> > separate external project with a different release/distribution/packaging
> > policy.
> > 
> > Wherever the module is maintained, the goal would be to address broader
> > needs, eg. adding a switch_wal() method or wait_for_archive(), supporting
> > replication, backups, etc for multi-old-deprecated-PostgreSQL versions.
> >
> > To be honest I have mixed feelings. I feel this burden shouldn't be carried
> > by core, which has restricted needs compared to external projects. On the
> > other hand, maintaining an external project which shares 90% of the code
> > seems to be useless duplication and backport effort. Moreover, Craig Ringer
> > already opened the door for external use of PostgresNode with his effort to
> > install/package it, see:
> > https://www.postgresql.org/message-id/CAGRY4nxxKSFJEgVAv5YAk%3DbqULtFmNw7gEJef0CCgzpNy6O%3D-w%40mail.gmail.com
> > 
> > Thoughts?  
> 
> The community needs a single shared PostgresNode implementation that can be
> used by scripts which reproduce bugs.

Which means it could be OK to have a PostgresNode implementation, living in
the core source tree, which supports broader needs than core's own (older
versions and some more methods)? Did I understand correctly?

If this is correct, I suppose this effort could be committed early in the v15 cycle?

Does it deserve some effort to build some dedicated TAP tests for these
modules? I already have a small patch for this waiting on my disk for some more
tests and review...

Regards




Re: Bogus collation version recording in recordMultipleDependencies

2021-04-19 Thread Tom Lane
Andres Freund  writes:
> On 2021-04-18 11:29:42 -0400, Tom Lane wrote:
>> I'm not sure that an error in this direction is all that much more
>> problematic than the other direction.  If it's okay to claim that
>> indexes need to be rebuilt when they don't really, then we could just
>> drop this entire overcomplicated infrastructure and report that all
>> indexes need to be rebuilt after any collation version change.

> That doesn't ring true to me. There's a huge difference between needing
> to rebuild all indexes, especially primary key indexes which often are
> over int8 etc, and unnecessarily needing to rebuild indexes doing
> comparatively rare things.

It would not be that hard to exclude indexes on int8, or other cases
that clearly have zero collation dependencies.  And I think I might
have some faith in such a solution.  Right now I have zero faith
that the patch as it stands gives trustworthy answers.

I think that the real fundamental bug is supposing that static analysis
can give 100% correct answers.  Even if it did do so in a given state
of the database, consider this counterexample:

create type myrow as (f1 int, f2 int);
create table mytable (id bigint, r1 myrow, r2 myrow);
create index myindex on mytable(id) where r1 < r2;
alter type myrow add attribute f3 text;

myindex is recorded as having no collation dependency, but that is
now wrong: the row comparison in the index predicate now involves the
collatable text attribute f3.

regards, tom lane




when the startup process doesn't

2021-04-19 Thread Robert Haas
Hi,

I've noticed that customers not infrequently complain that they start
postgres and then the system doesn't come up for a while and they have
no idea what's going on and are (understandably) worried. There are
probably a number of reasons why this can happen, but the ones that
seem to come up most often in my experience are (1) SyncDataDirectory
takes a long time, (b) ResetUnloggedRelations takes a long time, and
(c) there's a lot of WAL to apply so that takes a long time. It's
possible to distinguish this last case from the other two by looking
at the output of 'ps', but that's not super-convenient if your normal
method of access to the server is via libpq, and it only works if you
are monitoring it as it's happening rather than looking at the logs
after-the-fact. I am not sure there's any real way to distinguish the
other two cases without using strace or gdb or similar.

It seems to me that we could do better. One approach would be to try
to issue a log message periodically - maybe once per minute, or some
configurable interval, e.g. perhaps add messages something like this:

LOG:  still syncing data directory, elapsed time %ld.%03d ms, current path %s
LOG:  data directory sync complete after %ld.%03d ms
LOG:  still resetting unlogged relations, elapsed time %ld.%03d ms,
current path %s
LOG:  unlogged relations reset after %ld.%03d ms
LOG:  still performing crash recovery, elapsed time %ld.%03d ms,
current LSN %08X/%08X

We already have a message when redo is complete, so there's no need
for another one. The implementation here doesn't seem too hard either:
the startup process would set a timer; when the timer expires, the
signal handler sets a flag; and at a convenient point we notice the flag
is set and respond by printing a message and clearing the flag.
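
In sketch form, that pattern could look like this (illustrative only; the
startup_progress_* names are made up, though elog() and SIGNAL_ARGS are
PostgreSQL's real logging call and signal-handler macro):

```
#include "postgres.h"

static volatile sig_atomic_t startup_progress_due = 0;

/* interval-timer signal handler: does nothing but set the flag */
static void
startup_progress_alarm(SIGNAL_ARGS)
{
	startup_progress_due = 1;
}

/* called at convenient points in the sync/reset/redo loops */
static void
maybe_report_startup_progress(const char *activity, const char *detail)
{
	if (startup_progress_due)
	{
		startup_progress_due = 0;
		elog(LOG, "still %s, %s", activity, detail);
		/* re-arm the interval timer here for the next report */
	}
}
```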

Another possible approach would be to accept connections for
monitoring purposes even during crash recovery. We can't allow access
to any database at that point, since the system might not be
consistent, but we could allow something like a replication connection
(the non-database-associated variant). Maybe it would be precisely a
replication connection and we'd just refuse all but a subset of
commands, or maybe it would be some other kinds of thing. But either
way you'd be able to issue a command in some mini-language saying "so,
tell me how startup is going" and it would reply with a result set of
some kind.

If I had to pick one of these two ideas, I'd pick the one the
log-based solution, since it seems easier to access and simplifies
retrospective analysis, but I suspect SQL access would be quite useful
for some users too, especially in cloud environments where "just log
into the machine and have a look" is not an option.

Thoughts?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Bogus collation version recording in recordMultipleDependencies

2021-04-19 Thread Peter Geoghegan
On Mon, Apr 19, 2021 at 10:53 AM Tom Lane  wrote:
> I think that the real fundamental bug is supposing that static analysis
> can give 100% correct answers.  Even if it did do so in a given state
> of the database, consider this counterexample:
>
> create type myrow as (f1 int, f2 int);
> create table mytable (id bigint, r1 myrow, r2 myrow);
> create index myindex on mytable(id) where r1 < r2;
> alter type myrow add attribute f3 text;
>
> myindex is recorded as having no collation dependency, but that is
> now wrong.

Is it really the case that static analysis of the kind that you'd need
to make this 100% robust is fundamentally impossible? I find that
proposition hard to believe.

I'm not sure that you were making a totally general statement, rather
than a statement about the patch/implementation, so perhaps I just
missed the point.

-- 
Peter Geoghegan




Re: Bogus collation version recording in recordMultipleDependencies

2021-04-19 Thread Peter Geoghegan
On Sun, Apr 18, 2021 at 4:23 AM Julien Rouhaud  wrote:
> So IIUC the issue here is that the code could previously record useless
> collation version dependencies in somes cases, which could lead to false
> positive possible corruption messages (and of course additional bloat on
> pg_depend).  False positive messages can't be avoided anyway, as a collation
> version update may not corrupt the actually indexed set of data, especially 
> for
> glibc.

This argument seems completely absurd to me.

-- 
Peter Geoghegan




Re: Windows default locale vs initdb

2021-04-19 Thread Peter Eisentraut

On 19.04.21 07:42, Thomas Munro wrote:

> It looks
> like one possibility, since Vista, is to call
> GetUserDefaultLocaleName()[2], which doesn't appear to have been
> discussed before on this list.  That doesn't allow you to ask for the
> default for each individual category, but I don't know if that is even
> a concept for Windows user settings.


pg_newlocale_from_collation() doesn't support collcollate != collctype 
on Windows anyway, so that wouldn't be an issue.





Re: multi-install PostgresNode fails with older postgres versions

2021-04-19 Thread Mark Dilger



> On Apr 19, 2021, at 10:50 AM, Jehan-Guillaume de Rorthais  
> wrote:
> 
>> The community needs a single shared PostgresNode implementation that can be
>> used by scripts which reproduce bugs.
> 
> Which means it could be OK to have a PostgresNode implementation, living in
> the core source tree, which supports broader needs than core's own (older
> versions and some more methods)? Did I understand correctly?

Yes, I believe it should be in core.

I don't know about "some more methods", as it depends which methods you are 
proposing.

> If this is correct, I suppose this effort could be committed early in the
> v15 cycle?

I don't care to speculate on that yet.

> Does it deserve some effort to build some dedicated TAP tests for these
> modules? I already have a small patch for this waiting on my disk for some 
> more
> tests and review...

I did that, too, in the 0002 version of my patch.  Perhaps we need to merge 
your work and mine.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Bogus collation version recording in recordMultipleDependencies

2021-04-19 Thread Tom Lane
Peter Geoghegan  writes:
> On Mon, Apr 19, 2021 at 10:53 AM Tom Lane  wrote:
>> I think that the real fundamental bug is supposing that static analysis
>> can give 100% correct answers.

> Is it really the case that static analysis of the kind that you'd need
> to make this 100% robust is fundamentally impossible? I find that
> proposition hard to believe.

I didn't mean to imply that it's necessarily theoretically impossible,
but given our lack of visibility into what a function or operator
will do, plus the way that the collation feature was bolted on
with minimal system-level redesign, it's sure pretty darn hard.
Code like record_eq is doing a lot at runtime that we can't really
see from static analysis.

Anyway, given the ALTER TYPE ADD ATTRIBUTE counterexample, I'm
definitely starting to lean towards "revert and try again in v15".
I feel we'd be best off to consider functions/operators that
operate on container types to be "maybe"s rather than certainly
safe or certainly not safe.  I think that such things appear
sufficiently rarely in index specifications that it's not worth it
to try to do an exact analysis of them, even if we were sure we
could get that 100% right.  But that doesn't seem to be an idea that
can trivially be added to the current design.

regards, tom lane




Re: pg_amcheck option to install extension

2021-04-19 Thread Andrew Dunstan


On 4/19/21 1:25 PM, Mark Dilger wrote:
>
>> On Apr 19, 2021, at 9:53 AM, Tom Lane  wrote:
>>
>> Andrew Dunstan  writes:
>>> OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
>>> pg_amcheck should move there. I can organize that if there's agreement.
>>> Or else let's move amcheck as Alvaro suggests.
>> FWIW, I think that putting them both in contrib makes the most
>> sense from a structural standpoint.
> That was my original thought also, largely from a package management 
> perspective.  Just as an example, postgresql-client and postgresql-contrib 
> are separate rpms.  There isn't much point to having pg_amcheck installed as 
> part of the postgresql-client package while having amcheck in the 
> postgresql-contrib package which might not be installed.
>
> A counter argument is that amcheck is server side, and pg_amcheck is client 
> side.  Having pg_amcheck installed on a system makes sense if you are 
> connecting to a server on a different system.


There are at least two other client side programs in contrib. So this
argument doesn't quite hold water from a consistency POV.



>
>> On Mar 11, 2021, at 12:12 AM, Peter Eisentraut 
>>  wrote:
>>
>> I want to register, if we are going to add this, it ought to be in src/bin/. 
>>  If we think it's a useful tool, it should be there with all the other 
>> useful tools.
>>
>> I realize there is a dependency on a module in contrib, and it's probably 
>> now not the time to re-debate reorganizing contrib.  But if we ever get to 
>> that, this program should be the prime example why the current organization 
>> is problematic, and we should be prepared to make the necessary moves then.
> This was the request that motivated the move to src/bin.
>


I missed that, so I guess maybe I can't complain too loudly. But if I'd
seen it I would have disagreed. :-)


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Synchronous commit behavior during network outage

2021-04-19 Thread Ondřej Žižka

Hello all,
I would like to know your opinion on the following behaviour I see in a
PostgreSQL setup with synchronous replication.


This behaviour happens in a special use case. In this use case, there 
are 2 synchronous replicas with the following config (truncated):


- 2 nodes
- synchronous_standby_names='*'
- synchronous_commit=remote_apply


With this setup, run the following steps ("LAN down" means the LAN between
the master and the replica is down):

-
postgres=# truncate table a;
TRUNCATE TABLE
postgres=# insert into a values (1); -- LAN up, insert has been applied 
to replica.

INSERT 0 1
Now I turn off the LAN on the standby server:
postgres=# insert into a values (2); --LAN down, waiting for a 
confirmation from sync replica. In this situation cancel it (press CTRL+C)

^CCancel request sent
WARNING:  canceling wait for synchronous replication due to user request
DETAIL:  The transaction has already committed locally, but might not
have been replicated to the standby.

INSERT 0 1
There will be a warning in the server log that the commit was performed only locally:
2021-04-12 19:55:53.063 CEST [26104] WARNING:  canceling wait for 
synchronous replication due to user request
2021-04-12 19:55:53.063 CEST [26104] DETAIL:  The transaction has 
already committed locally, but might not have been replicated to the 
standby.


postgres=# insert into a values (2); --LAN down, waiting for sync 
replica, second attempt, cancel it as well (CTRL+C)

^CCancel request sent
WARNING:  canceling wait for synchronous replication due to user request
DETAIL:  The transaction has already committed locally, but might not 
have been replicated to the standby.

INSERT 0 1
postgres=# update a set n=3 where n=2; --LAN down, waiting for sync 
replica, cancel it (CTRL+C)

^CCancel request sent
WARNING:  canceling wait for synchronous replication due to user request
DETAIL:  The transaction has already committed locally, but might not 
have been replicated to the standby.

UPDATE 2
postgres=# update a set n=3 where n=2; -- run the same update; because
data from the previous attempt was committed on the master, it succeeds,
but changes no rows

UPDATE 0
postgres=# select * from a;
 n
---
 1
 3
 3
(3 rows)
postgres=#


Now, there is only the value 1 in the table on the sync replica (no other
values); the data is not in sync. This is expected: after the LAN is restored,
the data will come back into sync. But if the primary node fails and we fail
over to the replica before the LAN is back up, or the primary's storage is
destroyed before the data syncs to the replica, we will lose data even though
the client received a successful commit (with a warning).
From the synchronous_commit=remote_write level and "higher", I would expect
that when the remote action (no matter whether flush, write or apply) has not
happened, I would not receive a confirmation of the commit (even with a
warning). Something like: if there is no commit on the sync replica, there is
no commit on the primary, and if someone performs the steps above, the whole
transaction will not send a confirmation.


This can cause issues if the application receives a confirmation of success
and performs some follow-up steps, e.g. creates a user account and sends a
request to the mail system or a VPN system to create a matching account. If
the scenario above happens, there can be a VPN account that has no presence
in the central database, which can be a security issue.


I hope I explained it sufficiently. :-)

Do you think it would be possible to implement a process that would
solve this use case?


Thank you
Ondrej



Re: proposal - log_full_scan

2021-04-19 Thread Pavel Stehule
On Sun, Apr 18, 2021 at 16:09, Pavel Stehule wrote:

>
>
> On Sun, Apr 18, 2021 at 14:28, Julien Rouhaud wrote:
>
>> On Sun, Apr 18, 2021 at 06:21:56AM +0200, Pavel Stehule wrote:
>> >
>> > The extension like pg_qualstat is good, but it does different work.
>>
>> Yes definitely.  It was just an idea if you needed something right now
>> that
>> could more or less do what you needed, not saying that we shouldn't
>> improve the
>> core :)
>>
>> > In
>> > complex applications I need to detect buggy (forgotten) queries - last
>> week
>> > I found two queries over bigger tables without predicates. So the
>> qualstat
>> > doesn't help me.
>>
>> Also not totally helpful but powa was created to detect problematic
>> queries in
>> such cases.  It wouldn't say if it's because of a seq scan or not (so yes
>> again
>> we need to improve that), but it would give you the slowest (or top
>> consumer
>> for any resource) for a given time interval.
>>
>> > This is an application for a government with a few (but typical for
>> > government) specifics: 1) the life cycle is short (one month), 2)
>> > there is no slow start - from the first moment the application will be
>> > used by hundreds of thousands of people, 3) the application is very
>> > public - so any issues are very interesting for the press and very
>> > unpleasant for politicians, and in the next step for all suppliers
>> > (there are high penalties for failures), and the admins are not happy
>> > with external extensions, 4) the budget is not too big - there is no
>> > performance testing environment
>> >
>> > First stages are covered well today. We can log and process very slow
>> > queries, and fix them immediately - with CREATE INDEX CONCURRENTLY I can
>> > do it well on production servers too without high risk.
>> >
>> > But the detection of bad but not-too-slow queries is hard. And as an
>> > external consultant I am not able to install any external extensions in
>> > the production environment for fixing hot issues. The risk is not
>> > acceptable for the project managers, and I understand. So I have to use
>> > only the tools available in Postgres.
>>
>> Yes, I agree that having an additional, more specialized tool in core
>> postgres would definitely help in similar scenarios.
>>
>> I think that having some kind of threshold for seq scans (like the
>> mentioned auto_explain.log_seqscan = XXX) in auto_explain would be the
>> best approach, as you really need the plan to know why a seq scan was
>> chosen and whether it was a reasonable choice or not.
>>
>
> I would like to write this for core and for auto_explain too. I have
> been in situations where I could not use even auto_explain, although
> this extension is widely used and the risk is low.
>
> When I detect the query, I can run EXPLAIN manually. But sure, I think
> it can work well inside auto_explain.
>

I tried to write the patch. The core version is harder work than I
expected, because there is no good information about whether a query is a
top-level query or not, so it is not easy to decide whether we should log.

On the other hand, the modification of auto_explain is easy.

I am sending the patch.
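
For illustration, the new GUC is meant to be combined with the existing
auto_explain settings, e.g. in postgresql.conf (the threshold value below
is only an example):

shared_preload_libraries = 'auto_explain'
auto_explain.log_seqscan = 100000   # log plans whose seq scans produce at least this many tuples
auto_explain.log_min_duration = -1  # duration-based logging can stay disabled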

Regards

Pavel


> Regards
>
> Pavel
>
>
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 445bb37191..5754e25594 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -17,8 +17,11 @@
 #include "access/parallel.h"
 #include "commands/explain.h"
 #include "executor/instrument.h"
+#include "nodes/nodeFuncs.h"
+#include "parser/parsetree.h"
 #include "jit/jit.h"
 #include "utils/guc.h"
+#include "utils/lsyscache.h"
 
 PG_MODULE_MAGIC;
 
@@ -35,6 +38,7 @@ static int	auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
 static int	auto_explain_log_level = LOG;
 static bool auto_explain_log_nested_statements = false;
 static double auto_explain_sample_rate = 1;
+static int auto_explain_log_seqscan = -1;	/* tuples */
 
 static const struct config_enum_entry format_options[] = {
 	{"text", EXPLAIN_FORMAT_TEXT, false},
@@ -65,7 +69,7 @@ static int	nesting_level = 0;
 static bool current_query_sampled = false;
 
 #define auto_explain_enabled() \
-	(auto_explain_log_min_duration >= 0 && \
+	((auto_explain_log_min_duration >= 0 || auto_explain_log_seqscan != -1) && \
 	 (nesting_level == 0 || auto_explain_log_nested_statements) && \
 	 current_query_sampled)
 
@@ -105,6 +109,18 @@ _PG_init(void)
 			NULL,
 			NULL);
 
+	DefineCustomIntVariable("auto_explain.log_seqscan",
+			"Sets the minimum number of tuples produced by sequential scans above which plans will be logged.",
+			"Zero prints all plans that contain a seqscan node. -1 turns this feature off.",
+			&auto_explain_log_seqscan,
+			-1,
+			-1, INT_MAX,
+			PGC_SUSET,
+			0,
+			NULL,
+			NULL,
+			NULL);
+
 	DefineCustomBoolVariable("auto_explain.log_analyze",
 			 "Use EXPLAIN ANALYZE for plan logging.",
 			 NULL,
@@ -274,7 +290,9 @@ explai

Re: Bogus collation version recording in recordMultipleDependencies

2021-04-19 Thread Peter Geoghegan
On Mon, Apr 19, 2021 at 11:49 AM Tom Lane  wrote:
> I didn't mean to imply that it's necessarily theoretically impossible,
> but given our lack of visibility into what a function or operator
> will do, plus the way that the collation feature was bolted on
> with minimal system-level redesign, it's sure pretty darn hard.
> Code like record_eq is doing a lot at runtime that we can't really
> see from static analysis.

It's worth pointing out that code like record_eq is not (or at least
should not be) fundamentally unpredictable and unruly. The fact that
record_eq does typecache lookups and whatnot seems to me to be an
implementation detail. What record_eq is entitled to assume about
collations could be formalized by some general high-level
specification. It ought to be possible to do this, just as it ought to
be possible for us to statically determine if a composite type is safe
to use with B-Tree deduplication.

Whether or not it's worth the trouble is another matter, but it might
be if a single effort solved a bunch of related problems, not just the
collation dependency problem.

> Anyway, given the ALTER TYPE ADD ATTRIBUTE counterexample, I'm
> definitely starting to lean towards "revert and try again in v15".

The counterexample concerns me because it seems to indicate a lack of
sophistication in how dependencies are managed with corner cases -- I
don't think that it's okay to leave the behavior unspecified in a
stable release. But I also think that we should consider if code like
record_eq is in fact the real problem (or just the lack of any general
specification that constrains code like it in useful ways, perhaps).
This probably won't affect whether or not the patch gets reverted now,
but it still matters.

-- 
Peter Geoghegan




Re: Bogus collation version recording in recordMultipleDependencies

2021-04-19 Thread Thomas Munro
On Tue, Apr 20, 2021 at 5:53 AM Tom Lane  wrote:
> I think that the real fundamental bug is supposing that static analysis
> can give 100% correct answers.  ...

Well, the goal was to perform analysis to the extent possible
statically since that would cover the vast majority of cases and is
practically all you can do.  Clearly there is always going to be a
category of invisible dependencies inside procedural code in general
(halting problem).  We did think about the idea of using new
declarations about functions/operators to know which ones actually
care about collation, rather than assuming that they all do (bugs
aside), as an optimisation, and then that mechanism could in theory
also be used to say that functions that don't appear to depend on
collations actually do so internally, but that all seemed like vast
overkill, so we left it for possible later improvements.  The question
on my mind is whether reverting the feature and trying again for 15
could produce anything fundamentally better at a design level, or
would just fix problems in the analyser code that we could fix right
now.  For example, if you think there actually is a potential better
plan than using pg_depend for this, that'd definitely be good to know
about.

> ... Even if it did do so in a given state
> of the database, consider this counterexample:
>
> create type myrow as (f1 int, f2 int);
> create table mytable (id bigint, r1 myrow, r2 myrow);
> create index myindex on mytable(id) where r1 < r2;
> alter type myrow add attribute f3 text;
>
> myindex is recorded as having no collation dependency, but that is
> now wrong.

Hrmph.  Yeah.  (Once f3 exists, r1 < r2 has to compare text columns, so
it silently comes to depend on the default collation, while myindex's
recorded dependencies were computed when myrow held only int columns.)
We didn't consider types that change later like this, and handling those
correctly does seem to warrant some more thought and work than we perhaps
have time for.  My first thought is that we'd need to teach it to trigger
reanalysis.




Re: pg_amcheck contrib application

2021-04-19 Thread Robert Haas
On Thu, Apr 15, 2021 at 1:07 PM Mark Dilger
 wrote:
> I have added the verb "has" rather than "contains" because "has" is more 
> consistent with the phrasing of other similar corruption reports.

That makes sense.

I think it's odd that a range of extraneous chunks is collapsed into a
single report if the size of each chunk happens to be
TOAST_MAX_CHUNK_SIZE and not otherwise. Why not just remember the
first and last extraneous chunk and the size of each? If the next
chunk you see is the next one in sequence and the same size as all the
others, extend your notion of the sequence end by 1. Otherwise, report
the range accumulated so far. It seems to me that this wouldn't be any
more code than you have now, and might actually be less.
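
Something like this, as a rough sketch (not tested; the type and function
names are made up, report_extraneous_range() stands in for however the
checker actually reports the problem, and any still-active run must be
flushed once the last chunk has been scanned):

/* assumes PostgreSQL's c.h, or <stdbool.h>, for bool */
static void report_extraneous_range(int first, int last, int size);

typedef struct
{
	int		first_chunk;	/* first chunk of the current run */
	int		last_chunk;		/* last chunk of the current run */
	int		chunk_size;		/* size shared by all chunks in the run */
	bool	active;			/* true while a run is being accumulated */
} ExtraneousRun;

static void
note_extraneous_chunk(ExtraneousRun *run, int chunkno, int size)
{
	if (run->active && chunkno == run->last_chunk + 1 &&
		size == run->chunk_size)
	{
		/* next chunk in sequence with the same size: extend the run */
		run->last_chunk = chunkno;
		return;
	}
	if (run->active)
		report_extraneous_range(run->first_chunk, run->last_chunk,
								run->chunk_size);
	/* otherwise start a new run at this chunk */
	run->first_chunk = run->last_chunk = chunkno;
	run->chunk_size = size;
	run->active = true;
}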

I think that report_missing_chunks() could probably just report the
range of missing chunks and not bother reporting how big they were
expected to be. But, if it is going to report how big they were
expected to be, I think it should have only 2 cases rather than 4:
either a range of missing chunks of equal size, or a single missing
chunk of some size. If, as I propose, it doesn't report the expected
size, then you still have just 2 cases: a range of missing chunks, or
a single missing chunk.

Somehow I have a hard time feeling confident that check_toast_tuple()
is going to do the right thing. The logic is really complex and hard
to understand. 'chunkno' is a counter that advances every time we move
to the next chunk, and 'curchunk' is the value we actually find in the
TOAST tuple. This terminology is not easy to understand. Most messages
now report 'curchunk', but some still report 'chunkno'. Why does
'chunkno' need to exist at all? AFAICS the combination of 'curchunk'
and 'tctx->last_chunk_seen' ought to be sufficient. I can see no
particular reason why what you're calling 'chunkno' needs to exist
even as a local variable, let alone be printed out. Either we haven't
yet validated that the chunk_id extracted from the tuple is non-null
and greater than the last chunk number we saw, in which case we can
just complain about it if we find it to be otherwise, or we have
already done that validation, in which case we should complain about
that value and not 'chunkno' in any subsequent messages.

The conditionals between where you set max_valid_prior_chunk and where
you set last_chunk_seen seem hard to understand, particularly the
bifurcated way that missing chunks are reported. Initial missing
chunks are detected by (chunkno == 0 && max_valid_prior_chunk >= 0)
and later missing chunks are detected by (chunkno > 0 &&
max_valid_prior_chunk > tctx->last_chunk_seen). I'm not sure if this
is correct; I find it hard to get my head around what
max_valid_prior_chunk is supposed to represent. But in any case I
think it can be written more simply. Just keep track of what chunk_id
we expect to extract from the next TOAST tuple. Initially it's 0.
Then:

if (chunkno < tctx->expected_chunkno)
{
    // "toast value %u index scan returned chunk number %d when chunk %d
    // was expected"
    // don't modify tctx->expected_chunkno here, just hope the next thing
    // matches our previous expectation
}
else
{
    if (chunkno > tctx->expected_chunkno)
    {
        // chunks are missing from tctx->expected_chunkno through
        // Min(chunkno - 1, tctx->final_expected_chunk), provided that
        // the latter value is greater than or equal to the former
    }
    tctx->expected_chunkno = chunkno + 1;
}

If you do this, you only need to report extraneous chunks when chunkno
> tctx->final_expected_chunk, since chunkno < 0 is guaranteed to
trigger the first of the two complaints shown above.

In check_tuple_attribute I suggest "bool valid = false" rather than
"bool invalid = true". I think it's easier to understand.

I object to check_toasted_attribute() using 'chunkno' in a message for
the same reasons as above in regards to check_toast_tuple() i.e. I
think it's a concept which should not exist.

I think this patch could possibly be split up into multiple patches.
There's some question in my mind whether it's getting too late to
commit any of this, since some of it looks suspiciously like new
features after feature freeze. However, I kind of hate to ship this
release without at least doing something about the chunkno vs.
curchunk stuff, which is even worse in the committed code than in your
patch, and which I think will confuse the heck out of users if those
messages actually fire for anyone.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Bogus collation version recording in recordMultipleDependencies

2021-04-19 Thread Tom Lane
Thomas Munro  writes:
> ... The question
> on my mind is whether reverting the feature and trying again for 15
> could produce anything fundamentally better at a design level, or
> would just fix problems in the analyser code that we could fix right
> now.

Well, as I said, I think what we ought to do is treat any record-accepting
functions/operators as "don't know, better assume it's collation
dependent".  What's not clear to me is how that concept could be
shoehorned into the existing design.

> For example, if you think there actually is a potential better
> plan than using pg_depend for this, that'd definitely be good to know
> about.

I really dislike using pg_depend, for a couple of reasons:

* You've broken the invariant that dependencies on pinned objects
are never recorded.  Now, some of them exist, for reasons having
nothing to do with the primary goals of pg_depend.  If that's not
a sign of bad relational design, I don't know what is.  I didn't
look at the code, but I wonder if you didn't have to lobotomize
some error checks in dependency.c because of that.  (Perhaps
some sort of special-case representation for the default
collation would help here?)

* pg_depend used to always be all-not-null.  Now, most rows in it
will need a nulls bitmap, adding 8 bytes per row (on maxalign=8
hardware) to what had been fairly narrow rows.  By my arithmetic
that's 13.3% bloat in what is already one of our largest
catalogs.  That's quite unpleasant.  (It would actually be
cheaper to store an empty-string refobjversion for non-collation
entries; a single-byte string would fit into the pad space
after deptype, adding nothing to the row width.)
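
(For the record, the arithmetic seems to be: a pg_depend row currently
takes about 60 bytes - a maxaligned 24-byte tuple header plus 25 bytes of
data for the six 4-byte columns and the 1-byte deptype, maxaligned to 56,
plus a 4-byte line pointer - so 8 extra bytes per row is 8/60, roughly
13.3%.)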

> Hrmph.  Yeah.  We didn't consider types that change later like this,
> and handling those correctly does seem to warrant some more thought
> and work than we perhaps have time for.  My first thought is that we'd
> need to teach it to trigger reanalysis.

That seems like a nonstarter, even before you think about race
conditions.

regards, tom lane




Re: Allowing to create LEAKPROOF functions to non-superuser

2021-04-19 Thread Robert Haas
On Fri, Apr 16, 2021 at 3:57 AM Noah Misch  wrote:
> On Mon, Apr 12, 2021 at 02:35:27PM -0700, Andres Freund wrote:
> > On 2021-04-12 17:14:20 -0400, Tom Lane wrote:
> > > I doubt that falsely labeling a function LEAKPROOF can get you more
> > > than the ability to read data you're not supposed to be able to read
> > > ... but that ability is then available to all users, or at least all
> > > users who can execute the function in question.  So it definitely is a
> > > fairly serious security hazard, and one that's not well modeled by
> > > role labels.  If you give somebody e.g. pg_read_all_data privileges,
> > > you don't expect that that means they can give it to other users.
>
> I do expect that, essentially.  Like Andres describes for BYPASSRLS, they can
> create and GRANT a SECURITY DEFINER function that performs an arbitrary query
> and returns a refcursor (or stores the data to a table of the caller's
> choosing, etc.).  Unlike BYPASSRLS, they can even make pg_read_all_data own
> the function, making the situation persist after one drops the actor's role
> and that role's objects.

Yes. I think that if someone can read all the data, it's unworkable to
suppose that they can't find a way to delegate that ability to others.
If nothing else, a station wagon full of tapes has a lot of bandwidth.

> > A user with BYPASSRLS can create public security definer functions
> > returning data. If the concern is a BYPASSRLS user intentionally
> > exposing data, then there's not a meaningful increase to allow defining
> > LEAKPROOF functions.
>
> Hence, I do find it reasonable to let pg_read_all_data be sufficient for
> setting LEAKPROOF.  I would not consult datdba, because datdba currently has
> no special read abilities.  It feels too weird to let BYPASSRLS start
> affecting non-RLS access controls.  A reasonable person may assume that
> BYPASSRLS has no consequences until someone uses CREATE POLICY.  That said, I
> wouldn't be horrified if BYPASSRLS played a part.  BYPASSRLS, like
> pg_read_all_data, clearly isn't something to grant lightly.

I agree that datdba doesn't seem like quite the right thing, but I'm
not sure I agree with the rest. How can we say that leakproof is a
non-RLS access control? Its only purpose is to keep RLS secure, so I
guess I'd be inclined to think that of the two, BYPASSRLS is more
closely related to the topic at hand.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Allowing to create LEAKPROOF functions to non-superuser

2021-04-19 Thread Tom Lane
Robert Haas  writes:
> On Fri, Apr 16, 2021 at 3:57 AM Noah Misch  wrote:
>> Hence, I do find it reasonable to let pg_read_all_data be sufficient for
>> setting LEAKPROOF.  I would not consult datdba, because datdba currently has
>> no special read abilities.  It feels too weird to let BYPASSRLS start
>> affecting non-RLS access controls.  A reasonable person may assume that
>> BYPASSRLS has no consequences until someone uses CREATE POLICY.  That said, I
>> wouldn't be horrified if BYPASSRLS played a part.  BYPASSRLS, like
>> pg_read_all_data, clearly isn't something to grant lightly.

> I agree that datdba doesn't seem like quite the right thing, but I'm
> not sure I agree with the rest. How can we say that leakproof is a
> non-RLS access control? Its only purpose is to keep RLS secure, so I
> guess I'd be inclined to think that of the two, BYPASSRLS is more
> closely related to the topic at hand.

Umm ... I'm pretty sure LEAKPROOF also affects optimization around
"security barrier" views, which I wouldn't call RLS.  Out of these
options, I'd prefer granting the ability to pg_read_all_data.

regards, tom lane




Re: "could not find pathkey item to sort" for TPC-DS queries 94-96

2021-04-19 Thread James Coleman
On Sun, Apr 18, 2021 at 1:21 PM Tom Lane  wrote:
>
> I wrote:
> > I think it's time for some refactoring of this code so that we can
> > actually share the logic.  Accordingly, I propose the attached.
>
> After sleeping on it, here's an improved version that gets rid of
> an unnecessary assumption about ECs usually not containing both
> parallel-safe and parallel-unsafe members.  I'd tried to do this
> yesterday but didn't like the amount of side-effects on createplan.c
> (caused by the direct call sites not being passed the "root" pointer).
> However, we can avoid refactoring createplan.c APIs by saying that it's
> okay to pass root = NULL to find_computable_ec_member if you're not
> asking it to check parallel safety.  And there's not really a need to
> put a parallel-safety check into find_ec_member_matching_expr at all;
> that task can be left with the one caller that cares.

I like the refactoring here.

Two things I wonder:
1. Should we add tests for the relabel code path?
2. It'd be nice not to have the IS_SRF_CALL duplicated, but that might
add enough complexity that it's not worth it.

Thanks,
James Coleman




Re: Implementing Incremental View Maintenance

2021-04-19 Thread Andrew Dunstan


On 4/7/21 5:25 AM, Yugo NAGATA wrote:
> Hi, 
>
> I rebased the patch because the cfbot failed.
>
> Regards,
> Yugo Nagata



This patch (v22c) just crashed for me with an assertion failure on
Fedora 31. Here's the stack trace:


[New LWP 333090]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: andrew regression [local]
INSERT    '.
Program terminated with signal SIGABRT, Aborted.
#0  0x7f8981caa625 in raise () from /lib64/libc.so.6
#0  0x7f8981caa625 in raise () from /lib64/libc.so.6
#1  0x7f8981c938d9 in abort () from /lib64/libc.so.6
#2  0x0094a54a in ExceptionalCondition
(conditionName=conditionName@entry=0xa91dae "queryDesc->sourceText !=
NULL", errorType=errorType@entry=0x99b468 "FailedAssertion",
fileName=fileName@entry=0xa91468
"/home/andrew/pgl/pg_head/src/backend/executor/execMain.c",
lineNumber=lineNumber@entry=199) at
/home/andrew/pgl/pg_head/src/backend/utils/error/assert.c:69
#3  0x006c0e17 in standard_ExecutorStart (queryDesc=0x226af98,
eflags=0) at /home/andrew/pgl/pg_head/src/backend/executor/execMain.c:199
#4  0x006737b2 in refresh_matview_datafill (dest=0x21cf428,
query=, queryEnv=0x2245fd0,
resultTupleDesc=0x7ffd5e764888, queryString=0x0) at
/home/andrew/pgl/pg_head/src/backend/commands/matview.c:719
#5  0x00678042 in calc_delta (queryEnv=0x2245fd0,
tupdesc_new=0x7ffd5e764888, tupdesc_old=0x7ffd5e764880,
dest_new=0x21cf428, dest_old=0x0, query=0x2246108, rte_path=0x2228a60,
table=) at
/home/andrew/pgl/pg_head/src/backend/commands/matview.c:2907
#6  IVM_immediate_maintenance (fcinfo=) at
/home/andrew/pgl/pg_head/src/backend/commands/matview.c:1683
#7  0x0069e483 in ExecCallTriggerFunc (trigdata=0x7ffd5e764bb0,
tgindx=2, finfo=0x22345f8, instr=0x0, per_tuple_context=0x2245eb0) at
/home/andrew/pgl/pg_head/src/backend/commands/trigger.c:2142
#8  0x0069fc4c in AfterTriggerExecute (trigdesc=0x2233db8,
trigdesc=0x2233db8, trig_tuple_slot2=0x0, trig_tuple_slot1=0x0,
per_tuple_context=0x2245eb0, instr=0x0, finfo=0x2234598,
relInfo=0x2233ba0, event=0x222d380, estate=0x2233710) at
/home/andrew/pgl/pg_head/src/backend/commands/trigger.c:4041
#9  afterTriggerInvokeEvents (events=0x21cece8, firing_id=1,
estate=0x2233710, delete_ok=false) at
/home/andrew/pgl/pg_head/src/backend/commands/trigger.c:4255
#10 0x006a4173 in AfterTriggerEndQuery
(estate=estate@entry=0x2233710) at
/home/andrew/pgl/pg_head/src/backend/commands/trigger.c:4632
#11 0x006c04c8 in standard_ExecutorFinish (queryDesc=0x2237300)
at /home/andrew/pgl/pg_head/src/backend/executor/execMain.c:436
#12 0x008415d8 in ProcessQuery (plan=,
sourceText=0x21490a0 "INSERT INTO mv_base_b VALUES(5,105);", params=0x0,
queryEnv=0x0, dest=0x2221010, qc=0x7ffd5e764f00) at
/home/andrew/pgl/pg_head/src/backend/tcop/pquery.c:190
#13 0x008417f2 in PortalRunMulti (portal=portal@entry=0x21ac3c0,
isTopLevel=isTopLevel@entry=true,
setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x2221010,
altdest=altdest@entry=0x2221010, qc=qc@entry=0x7ffd5e764f00) at
/home/andrew/pgl/pg_head/src/backend/tcop/pquery.c:1267
#14 0x00842415 in PortalRun (portal=portal@entry=0x21ac3c0,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true,
run_once=run_once@entry=true, dest=dest@entry=0x2221010,
altdest=altdest@entry=0x2221010, qc=0x7ffd5e764f00) at
/home/andrew/pgl/pg_head/src/backend/tcop/pquery.c:779
#15 0x0083e3ca in exec_simple_query (query_string=0x21490a0
"INSERT INTO mv_base_b VALUES(5,105);") at
/home/andrew/pgl/pg_head/src/backend/tcop/postgres.c:1196
#16 0x00840075 in PostgresMain (argc=argc@entry=1,
argv=argv@entry=0x7ffd5e765450, dbname=,
username=) at
/home/andrew/pgl/pg_head/src/backend/tcop/postgres.c:4458
#17 0x007b8054 in BackendRun (port=,
port=) at
/home/andrew/pgl/pg_head/src/backend/postmaster/postmaster.c:4488
#18 BackendStartup (port=) at
/home/andrew/pgl/pg_head/src/backend/postmaster/postmaster.c:4210
#19 ServerLoop () at
/home/andrew/pgl/pg_head/src/backend/postmaster/postmaster.c:1742
#20 0x007b8ebf in PostmasterMain (argc=argc@entry=8,
argv=argv@entry=0x21435c0) at
/home/andrew/pgl/pg_head/src/backend/postmaster/postmaster.c:1414
#21 0x0050e030 in main (argc=8, argv=0x21435c0) at
/home/andrew/pgl/pg_head/src/backend/main/main.c:209
$1 = {si_signo = 6, si_errno = 0, si_code = -6, _sifields = {_pad =
{333090, 500, 0 }, _kill = {si_pid = 333090, si_uid =
500}, _timer = {si_tid = 333090, si_overrun = 500, si_sigval =
{sival_int = 0, sival_ptr = 0x0}}, _rt = {si_pid = 333090, si_uid = 500,
si_sigval = {sival_int = 0, sival_ptr = 0x0}}, _sigchld = {si_pid =
333090, si_uid = 500, si_status = 0, si_utime = 0, si_stime = 0},
_sigfault = {si_addr = 0x1f400051522, _addr_lsb = 0, _addr_bnd = {_lower
= 0x0, _upper = 0x0}}, _sigpoll = {si_band = 2147483981090, si_fd = 0}}}


cheers


andrew

Re: "could not find pathkey item to sort" for TPC-DS queries 94-96

2021-04-19 Thread James Coleman
On Sat, Apr 17, 2021 at 3:39 PM Tom Lane  wrote:
> ...
> Also, I don't much care for either the name or API of
> find_em_expr_usable_for_sorting_rel.  The sole current caller only
> really needs a boolean result, and if it did need more than that
> it'd likely need the whole EquivalenceMember not just the em_expr
> (certainly createplan.c does).  So 0002 attached is some bikeshedding
> on that.  I kept that separate because it might be wise to do it only
> in HEAD, just in case somebody out there is calling the function from
> an extension.

I forgot to comment on this in my previous email, but it seems to me
that relation_has_safe_ec_member, while less wordy, isn't quite
descriptive enough. Perhaps something like
relation_has_sort_safe_ec_member?

James Coleman




Re: Allowing to create LEAKPROOF functions to non-superuser

2021-04-19 Thread Robert Haas
On Mon, Apr 19, 2021 at 4:32 PM Tom Lane  wrote:
> Robert Haas  writes:
> > On Fri, Apr 16, 2021 at 3:57 AM Noah Misch  wrote:
> >> Hence, I do find it reasonable to let pg_read_all_data be sufficient for
> >> setting LEAKPROOF.  I would not consult datdba, because datdba currently 
> >> has
> >> no special read abilities.  It feels too weird to let BYPASSRLS start
> >> affecting non-RLS access controls.  A reasonable person may assume that
> >> BYPASSRLS has no consequences until someone uses CREATE POLICY.  That 
> >> said, I
> >> wouldn't be horrified if BYPASSRLS played a part.  BYPASSRLS, like
> >> pg_read_all_data, clearly isn't something to grant lightly.
>
> > I agree that datdba doesn't seem like quite the right thing, but I'm
> > not sure I agree with the rest. How can we say that leakproof is a
> > non-RLS access control? Its only purpose is to keep RLS secure, so I
> > guess I'd be inclined to think that of the two, BYPASSRLS is more
> > closely related to the topic at hand.
>
> Umm ... I'm pretty sure LEAKPROOF also affects optimization around
> "security barrier" views, which I wouldn't call RLS.  Out of these
> options, I'd prefer granting the ability to pg_read_all_data.

Oops, I forgot about security_barrier views, which is rather
embarrassing since I committed them. So, yeah, I agree:
pg_read_all_data is better.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Reduce the number of special cases to build contrib modules on windows

2021-04-19 Thread Andrew Dunstan


On 4/19/21 12:24 PM, Alvaro Herrera wrote:
>> diff --git a/src/tools/msvc/MSBuildProject.pm 
>> b/src/tools/msvc/MSBuildProject.pm
>> index ebb169e201..68606a296d 100644
>> --- a/src/tools/msvc/MSBuildProject.pm
>> +++ b/src/tools/msvc/MSBuildProject.pm
>> @@ -310,11 +310,12 @@ sub WriteItemDefinitionGroup
>>  my $targetmachine =
>>$self->{platform} eq 'Win32' ? 'MachineX86' : 'MachineX64';
>>  
>> -my $includes = $self->{includes};
>> -unless ($includes eq '' or $includes =~ /;$/)
>> +my $includes = "";
>> +foreach my $inc (@{ $self->{includes} })
>>  {
>> -$includes .= ';';
>> +$includes .= $inc . ";";
>>  }
> Perl note: you can do this more easily as 
>
>   my $includes = join ';', @{$self->{includes}};
>   $includes .= ';' unless $includes eq '';
>

or even more simply:


my $includes = join ';', @{$self->{includes}}, "";
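
(The trailing empty string makes join emit one extra ';': joining
('a', 'b', '') yields "a;b;", while an empty includes list still yields
"".)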

cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: "could not find pathkey item to sort" for TPC-DS queries 94-96

2021-04-19 Thread Tom Lane
James Coleman  writes:
> I forgot to comment on this in my previous email, but it seems to me
> that relation_has_safe_ec_member, while less wordy, isn't quite
> descriptive enough. Perhaps something like
> relation_has_sort_safe_ec_member?

I'm not wedded to that name, certainly, but it seems like neither
of these is quite getting at the issue.  An EC can be sorted on,
by definition, but there are some things we don't want to sort
on till the final output step.  I was trying to think of something
using the terminology "early sort", but didn't much like
"relation_has_early_sortable_ec_member" or obvious variants of that.

regards, tom lane




Re: Allowing to create LEAKPROOF functions to non-superuser

2021-04-19 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Mon, Apr 19, 2021 at 4:32 PM Tom Lane  wrote:
> > Robert Haas  writes:
> > > On Fri, Apr 16, 2021 at 3:57 AM Noah Misch  wrote:
> > >> Hence, I do find it reasonable to let pg_read_all_data be sufficient for
> > >> setting LEAKPROOF.  I would not consult datdba, because datdba currently 
> > >> has
> > >> no special read abilities.  It feels too weird to let BYPASSRLS start
> > >> affecting non-RLS access controls.  A reasonable person may assume that
> > >> BYPASSRLS has no consequences until someone uses CREATE POLICY.  That 
> > >> said, I
> > >> wouldn't be horrified if BYPASSRLS played a part.  BYPASSRLS, like
> > >> pg_read_all_data, clearly isn't something to grant lightly.
> >
> > > I agree that datdba doesn't seem like quite the right thing, but I'm
> > > not sure I agree with the rest. How can we say that leakproof is a
> > > non-RLS access control? Its only purpose is to keep RLS secure, so I
> > > guess I'd be inclined to think that of the two, BYPASSRLS is more
> > > closely related to the topic at hand.
> >
> > Umm ... I'm pretty sure LEAKPROOF also affects optimization around
> > "security barrier" views, which I wouldn't call RLS.  Out of these
> > options, I'd prefer granting the ability to pg_read_all_data.
> 
> Oops, I forgot about security_barrier views, which is rather
> embarrassing since I committed them. So, yeah, I agree:
> pg_read_all_data is better.

I'm not really sure that attaching it to pg_read_all_data makes sense.

In general, I've been frustrated by the places where we lump privileges
together rather than having a way to distinctly GRANT capabilities
independently- that's more-or-less exactly what led me to work on
implementing the role system in the first place, and later the
predefined roles.

I do think it's good to reduce the number of places that require
superuser, in general, but I'm not sure that letting a non-superuser mark
functions as leakproof makes sense.

Here's what I'd ask Andrey- what's the actual use-case here?  Are these
cases where users are actually adding new functions which they believe
are leakproof, where those functions don't already require superuser to
be created?  Clearly, if they're in an untrusted language and you have to
be a superuser to install them in the first place, then they should just
mark the function as leakproof when they install it.  If these are
trusted-language functions, I'd be curious to actually see them, as I
have doubts about whether they're actually leakproof.

Or is the actual use-case here that they just want to mark functions we
know aren't leakproof as leakproof anyway because they aren't getting
the performance they want?

If it's the latter, as I suspect it is, then I don't really think the
use-case justifies any change on our part- the right answer is to make
those functions actually leakproof, or write ones which are.
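
(Recall what LEAKPROOF promises: the function reveals no information
about its arguments other than through its return value - in particular,
it must not raise errors that depend on the argument values.  That bar is
higher than it looks, which is why marking a function leakproof purely
for performance is so dangerous.)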

Thanks,

Stephen



