date:20180302

Craig Ringer  writes:
> On 3 March 2018 at 13:08, Peter Eisentraut  wrote:
>> I think one thing to try would to define a special kind of exception
>> that can safely be caught and ignored.  Then, input functions can
>> communicate benign parse errors by doing their own cleanup first, then
>> throwing this exception, and then the COPY subsystem can deal with it.

> That makes sense. We'd only use the error code range in question when it
> was safe to catch without re-throw, and we'd have to enforce rules around
> using a specific memory context.

I don't think that can possibly work.  It would only be safe if, between
the thrower and the catcher, there were no other levels of control
operating according to the normal error-handling rules.  But input
functions certainly cannot assume that they are only called by COPY,
so how could they safely throw a "soft error"?

regards, tom lane

Re: psql tab completion for ALTER INDEX SET

2018-03-02 Thread Masahiko Sawada

On Sat, Mar 3, 2018 at 1:43 AM, Fujii Masao  wrote:
> On Fri, Mar 2, 2018 at 3:52 PM, Masahiko Sawada  wrote:
>> Hi,
>>
>> I found that tab completion for ALTER INDEX SET [tab] doesn't support
>> the reloptions of brin and gist. Attached patch adds "buffering",
>> "pages_per_range" and "autosummarize" options.
>
> Thanks for the patch! Committed.
>

Thank you!

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2018-03-02 Thread Craig Ringer

On 3 March 2018 at 13:08, Peter Eisentraut  wrote:

> On 1/22/18 21:33, Craig Ringer wrote:
> > We don't have much in the way of rules about what input functions can or
> > cannot do, so you can't assume much about their behaviour and what must
> > / must not be cleaned up. Nor can you just reset the state in a heavy
> > handed manner like (say) plpgsql does.
>
> I think one thing to try would to define a special kind of exception
> that can safely be caught and ignored.  Then, input functions can
> communicate benign parse errors by doing their own cleanup first, then
> throwing this exception, and then the COPY subsystem can deal with it.
>

That makes sense. We'd only use the error code range in question when it
was safe to catch without re-throw, and we'd have to enforce rules around
using a specific memory context. Of course no LWLocks could be held, but
that's IIRC true when throwing anyway unless you plan to proc_exit() in
your handler.

People will immediately ask for it to handle RI errors too, so something
similar would be needed there. But frankly, Pg's RI handling for bulk
loading desperately needs a major change in how it works to make it
efficient anyway, the current model of individual row triggers is horrible
for bulk load performance.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2018-03-02 Thread Erik Rijkers


On 2018-03-03 01:55, Tomas Vondra wrote:

Hi there,

attached is an updated patch fixing all the reported issues (a bit more
about those below).


Hi,

0007-Track-statistics-for-streaming-spilling.patch  won't apply.  All 
the other patches apply ok.


patch complaints with:

patching file doc/src/sgml/monitoring.sgml
patching file src/backend/catalog/system_views.sql
Hunk #1 succeeded at 734 (offset 2 lines).
patching file src/backend/replication/logical/reorderbuffer.c
patching file src/backend/replication/walsender.c
patching file src/include/catalog/pg_proc.h
Hunk #1 FAILED at 2903.
1 out of 1 hunk FAILED -- saving rejects to file 
src/include/catalog/pg_proc.h.rej

patching file src/include/replication/reorderbuffer.h
patching file src/include/replication/walsender_private.h
patching file src/test/regress/expected/rules.out
Hunk #1 succeeded at 1861 (offset 2 lines).

Attached the produced reject file.


thanks,

Erik Rijkers--- src/include/catalog/pg_proc.h
+++ src/include/catalog/pg_proc.h
@@ -2903,7 +2903,7 @@
 DESCR("statistics: information about currently active backends");
 DATA(insert OID = 3318 (  pg_stat_get_progress_info			  PGNSP PGUID 12 1 100 0 0 f f f f t t s r 1 0 2249 "25" "{25,23,26,26,20,20,20,20,20,20,20,20,20,20}" "{i,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{cmdtype,pid,datid,relid,param1,param2,param3,param4,param5,param6,param7,param8,param9,param10}" _null_ _null_ pg_stat_get_progress_info _null_ _null_ _null_ ));
 DESCR("statistics: information about progress of backends running maintenance command");
-DATA(insert OID = 3099 (  pg_stat_get_wal_senders	PGNSP PGUID 12 1 10 0 0 f f f f f t s r 0 0 2249 "" "{23,25,3220,3220,3220,3220,1186,1186,1186,23,25}" "{o,o,o,o,o,o,o,o,o,o,o}" "{pid,state,sent_lsn,write_lsn,flush_lsn,replay_lsn,write_lag,flush_lag,replay_lag,sync_priority,sync_state}" _null_ _null_ pg_stat_get_wal_senders _null_ _null_ _null_ ));
+DATA(insert OID = 3099 (  pg_stat_get_wal_senders	PGNSP PGUID 12 1 10 0 0 f f f f f t s r 0 0 2249 "" "{23,25,3220,3220,3220,3220,1186,1186,1186,23,25,20,20,20,20,20,20}" "{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{pid,state,sent_lsn,write_lsn,flush_lsn,replay_lsn,write_lag,flush_lag,replay_lag,sync_priority,sync_state,spill_txns,spill_count,spill_bytes,stream_txns,stream_count,stream_bytes}" _null_ _null_ pg_stat_get_wal_senders _null_ _null_ _null_ ));
 DESCR("statistics: information about currently active replication");
 DATA(insert OID = 3317 (  pg_stat_get_wal_receiver	PGNSP PGUID 12 1 0 0 0 f f f f f f s r 0 0 2249 "" "{23,25,3220,23,3220,23,1184,1184,3220,1184,25,25}" "{o,o,o,o,o,o,o,o,o,o,o,o}" "{pid,status,receive_start_lsn,receive_start_tli,received_lsn,received_tli,last_msg_send_time,last_msg_receipt_time,latest_end_lsn,latest_end_time,slot_name,conninfo}" _null_ _null_ pg_stat_get_wal_receiver _null_ _null_ _null_ ));
 DESCR("statistics: information about WAL receiver");

Re: GSOC 2018 ideas

2018-03-02 Thread Charles Cui

Got it, Aleksander! Will study these documents carefully!

2018-02-26 4:21 GMT-08:00 Aleksander Alekseev :

> Hello Charles,
>
> > I saw PostgreSQL is selected in GSOC 2018 and pretty interested in the
> > ideas of thrift data types support that proposed by you. So, I want to
> > prepare for a proposal based on this idea.
>
> Glad you are interested in this project!
>
> > Can I have more detailed information of what documents or code that I
> > need to understand?
>
> I would recommend the following documents and code:
>
> * Source code of pg_protobuf
>   https://github.com/afiskon/pg_protobuf
> * "Writing Postgres Extensions" tutorial series by Manuel Kniep
>   http://big-elephants.com/2015-10/writing-postgres-extensions-part-i/
> * "So you want to make an extension?" talk by Keith Fiske
>   http://slides.keithf4.com/extension_dev/#/
> * Apache Thrift official website
>   https://thrift.apache.org/
> * Also a great explanation of the Thrift format can be found in the
>   book "Designing Data-Intensive Applications" by Martin Kleppmann
>   http://dataintensive.net/
>
> > Also, if this idea is allocated to other student (or in other worlds,
> > you prefer some student to work on it), do let me know, so that I can
> > pick some other project in PostgreSQL. Any comments or suggestions are
> > welcomed!
>
> To my best knowledge currently there are no other students interested in
> this particular work.
>
> --
> Best regards,
> Aleksander Alekseev
>

Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

On 1/22/18 21:33, Craig Ringer wrote:
> We don't have much in the way of rules about what input functions can or
> cannot do, so you can't assume much about their behaviour and what must
> / must not be cleaned up. Nor can you just reset the state in a heavy
> handed manner like (say) plpgsql does.

I think one thing to try would to define a special kind of exception
that can safely be caught and ignored.  Then, input functions can
communicate benign parse errors by doing their own cleanup first, then
throwing this exception, and then the COPY subsystem can deal with it.

Anyway, this patch is clearly not doing this, so I'm setting it to
returned with feedback.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: non-bulk inserts and tuple routing

On 2018-02-22 11:10:57 -0500, Robert Haas wrote:
> On Tue, Feb 20, 2018 at 8:06 PM, Amit Langote
>  wrote:
> >> Attached is an updated version for that.
> >
> > Thanks for updating the patch.
> 
> Committed with a few changes.  The big one was that I got rid of the
> local variable is_update in ExecSetupPartitionTupleRouting.  That
> saved a level of indentation on a substantial chunk of code, and it
> turns out that test was redundant anyway.

Btw, are there cases where this could change explain output?  If there's
subplan references or such in any of returning / wcte expressions,
they'd not get added at explain time.  It's probably fine because add
the expressions also "staticly" in ExecInitModifyTable()?

Greetings,

Andres Freund

Re: heap_lock_updated_tuple_rec can leak a buffer refcount

2018-03-02 Thread Amit Kapila

On Sat, Mar 3, 2018 at 3:26 AM, Tom Lane  wrote:
> Alexander Kuzmenkov  writes:
>> Looks like a leak indeed, the fix seems right.
>
>
> Hence, I propose the attached patch.  The test lobotomization
> (the "if (1) //" change) isn't meant for commit but shows how I tested
> the take-the-pin paths.  This passes make check-world with or without
> the lobotomization change.
>

Thanks for taking care of this.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: non-bulk inserts and tuple routing

Hi,

On 2018-02-22 11:10:57 -0500, Robert Haas wrote:
> On Tue, Feb 20, 2018 at 8:06 PM, Amit Langote
>  wrote:
> >> Attached is an updated version for that.
> >
> > Thanks for updating the patch.
> 
> Committed with a few changes.  The big one was that I got rid of the
> local variable is_update in ExecSetupPartitionTupleRouting.  That
> saved a level of indentation on a substantial chunk of code, and it
> turns out that test was redundant anyway.

I noticed that this patch broke my JIT patch when I force JITing to be
used everywhere (obviously pointless for perf reasons, but good for
testing).  Turns out to be a single line.

ExecInitPartitionInfo has the following block:
foreach(ll, wcoList)
{
WithCheckOption *wco = castNode(WithCheckOption, lfirst(ll));
ExprState  *wcoExpr = ExecInitQual(castNode(List, wco->qual),
   mtstate->mt_plans[0]);

wcoExprs = lappend(wcoExprs, wcoExpr);
}

note how it is passing mtstate->mt_plans[0] as the parent node for the
expression.  I don't quite know why mtstate->mt_plans[0] was chosen
here, it doesn't seem right. The WCO will not be executed in that
context. Note that the ExecBuildProjectionInfo() call a few lines below
also uses a different context.

For JITing that fails because the compiled deform will assume the tuples
look like mt_plans[0]'s scantuples. But we're not dealing with those,
we're dealing with tuples returned by the relevant subplan.

Also note that the code before this commit used
ExprState  *wcoExpr = 
ExecInitQual(castNode(List, wco->qual),

   >ps);
i.e. the ModifyTableState node, as I'd expect.

Greetings,

Andres Freund

Re: zheap: a new storage format for PostgreSQL

2018-03-02 Thread Mark Kirkwood




On 03/03/18 05:03, Robert Haas wrote:

On Fri, Mar 2, 2018 at 5:35 AM, Alexander Korotkov
 wrote:

I would propose "zero-bloat heap" disambiguation of zheap.  Seems like fair
enough explanation for me without need to rename :)

It will be possible to bloat a zheap table in certain usage patterns.
For example, if you bulk-load the table with a ton of data, commit the
transaction, delete every other row, and then never insert any more
rows ever again, the table is bloated: it's twice as large as it
really needs to be, and we have no provision for shrinking it.  In
general, I think it's very hard to keep bulk deletes from leaving
bloat in the table, and to the extent that it *is* possible, we're not
doing it.  One could imagine, for example, an index-organized table
that automatically combines adjacent pages when they're empty enough,
and that also relocates data to physically lower-numbered pages
whenever possible.  Such a storage engine might automatically shrink
the on-disk footprint after a large delete, but we have no plans to go
in that direction.

Rather, our assumption is that the bloat most people care about comes
from updates.  By performing updates in-place as often as possible, we
hope to avoid bloating both the heap (because we're not adding new row
versions to it which then have to be removed) and the indexes (because
if we don't add new row versions at some other TID, then we don't need
to add index pointers to that new TID either, or remove the old index
pointers to the old TID).  Without delete-marking, we can basically
optimize the case that is currently handled via HOT updates: no
indexed columns have changed.  However, the in-place update has a
major advantage that it still works even when the page is completely
full, provided that the row does not expand.  As Amit's results show,
that can hugely reduce bloat and increase performance in the face of
long-running concurrent transactions.  With delete-marking, we can
also optimize the case where indexed columns have been changed.  We
don't know exactly how well this will work yet because the code isn't
written and therefore can't be benchmarked, but am hopeful that that
in-place updates will be a big win here too.

So, I would not describe a zheap table as zero-bloat, but it should
involve a lot less bloat than our standard heap.



For folk doing ETL type data warehousing this should be great, as the 
typical workload tends to be like: COPY (or similar) from foreign data 
source, then do several sets of UPDATES to fix/check/scrub the 
data...which tends to result in huge bloat with the current heap design 
(despite telling people 'you can do it another way to' to avoid bloat - 
I guess it seems to be more intuitive to just to do it as described).


regards
Mark

Re: zheap: a new storage format for PostgreSQL

2018-03-02 Thread Amit Kapila

On Fri, Mar 2, 2018 at 7:06 PM, Aleksander Alekseev
 wrote:
> Hello Amit,
>
>> Sometime back Robert has proposed a solution to reduce the bloat in
>> PostgreSQL [1] which has some other advantages of its own as well.  To
>> recap, in the existing heap, we always create a new version of a tuple on
>> an update which must eventually be removed by periodic vacuuming or by
>> HOT-pruning, but still in many cases space is never reclaimed completely.
>> A similar problem occurs for tuples that are deleted.  This leads to bloat
>> in the database.
>
> This is an impressive work!
>

Thanks.

> Personally I would like to note that performance is probably not a
> priority at this stage.
>

Right, but we are also trying to see that we just don't fall off the
cliff for some more common workloads.

> Most important parts, in my humble opinion at
> least, are correctness, maintainability (tests, documentation, how
> readable the code is), extendability (e.g. an ability to add point in
> time recovery in the future), interfaces and heap format.
>

+1.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.

2018-03-02 Thread David Gould

On Fri, 02 Mar 2018 17:17:29 -0500
Tom Lane  wrote:

> But by the same token, analyze only looked at 0.0006 of the pages.  It's
> nice that for you, that's enough to get a robust estimate of the density
> everywhere; but I have a nasty feeling that that won't hold good for
> everybody.

My grasp of statistics is somewhat weak, so please inform me if I've got
this wrong, but every time I've looked into it I've found that one can get
pretty good accuracy and confidence with fairly small samples. Typically 1000
samples will serve no matter the population size if the desired margin of
error is 5%. Even with 99% confidence and a 1% margin of error it takes less
than 20,000 samples. See the table at:

http://www.research-advisors.com/tools/SampleSize.htm

Since we have by default 3 sample pages and since ANALYZE takes some
trouble to get a random sample I think we really can rely on the results of
extrapolating reltuples from analyze.

-dg

-- 
David Gould   da...@sonic.net
If simplicity worked, the world would be overrun with insects.

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

On 3/2/18 8:54 PM, Tomas Vondra wrote:
> On 03/03/2018 02:37 AM, David Steele wrote:
>> On 3/2/18 8:01 PM, Andres Freund wrote:
>>> On 2018-03-03 02:00:46 +0100, Tomas Vondra wrote:
 That is somewhat misleading, I think. You're right the last version was
 submitted on 2018-01-19, but the next review arrived on 2018-01-31, i.e.
 right at the end of the CF. So it's not like the patch was sitting there
 with unresolved issues. Based on that review the patch was marked as RWF
 and thus not moved to 2018-03 automatically.
>>>
>>> I don't see how this changes anything.
>>
>> I agree that things could be clearer, and Andres has produced a great
>> document that we can build on.  The old one had gotten a bit stale.
>>
>> However, I think it's pretty obvious that a CF entry should be 
>> accompanied with a patch. It sounds like the timing was awkward but
>> you still had 28 days to produce a new patch.
> 
> Based on internal discussion I'm not so sure about the "pretty obvious"
> part. It certainly wasn't that obvious to me, otherwise I'd submit the
> revised patch earlier - hindsight is 20/20.

Indeed it is.  Be assured that nobody takes pleasure in pushing patches,
but we have limited resources and must make some choices.

>> I also notice that you submitted 7 patches in this CF but are
>> reviewing zero.
> 
> I've volunteered to review a couple of patches at the FOSDEM Developer
> Meeting - I thought Stephen was entering that into the CF app, not sure
> where it got lost.

There are plenty of patches that need review, so go for it.

Regards,
-- 
-David
da...@pgmasters.net

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

On 03/03/2018 02:37 AM, David Steele wrote:
> On 3/2/18 8:01 PM, Andres Freund wrote:
>> On 2018-03-03 02:00:46 +0100, Tomas Vondra wrote:
>>> That is somewhat misleading, I think. You're right the last version was
>>> submitted on 2018-01-19, but the next review arrived on 2018-01-31, i.e.
>>> right at the end of the CF. So it's not like the patch was sitting there
>>> with unresolved issues. Based on that review the patch was marked as RWF
>>> and thus not moved to 2018-03 automatically.
>>
>> I don't see how this changes anything.
> 
> I agree that things could be clearer, and Andres has produced a great
> document that we can build on.  The old one had gotten a bit stale.
> 
> However, I think it's pretty obvious that a CF entry should be 
> accompanied with a patch. It sounds like the timing was awkward but
> you still had 28 days to produce a new patch.
> 

Based on internal discussion I'm not so sure about the "pretty obvious"
part. It certainly wasn't that obvious to me, otherwise I'd submit the
revised patch earlier - hindsight is 20/20.

> I also notice that you submitted 7 patches in this CF but are
> reviewing zero.
> 

I've volunteered to review a couple of patches at the FOSDEM Developer
Meeting - I thought Stephen was entering that into the CF app, not sure
where it got lost.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

On 3/2/18 8:01 PM, Andres Freund wrote:
> On 2018-03-03 02:00:46 +0100, Tomas Vondra wrote:
>> That is somewhat misleading, I think. You're right the last version was
>> submitted on 2018-01-19, but the next review arrived on 2018-01-31, i.e.
>> right at the end of the CF. So it's not like the patch was sitting there
>> with unresolved issues. Based on that review the patch was marked as RWF
>> and thus not moved to 2018-03 automatically.
> 
> I don't see how this changes anything.

I agree that things could be clearer, and Andres has produced a great
document that we can build on.  The old one had gotten a bit stale.

However, I think it's pretty obvious that a CF entry should be
accompanied with a patch.  It sounds like the timing was awkward but you
still had 28 days to produce a new patch.

I also notice that you submitted 7 patches in this CF but are reviewing
zero.

-- 
-David
da...@pgmasters.net

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

On 2018-03-03 02:34:06 +0100, Tomas Vondra wrote:
> On 03/03/2018 02:01 AM, Andres Freund wrote:
> > On 2018-03-03 02:00:46 +0100, Tomas Vondra wrote:
> >> That is somewhat misleading, I think. You're right the last version
> >> was submitted on 2018-01-19, but the next review arrived on
> >> 2018-01-31, i.e. right at the end of the CF. So it's not like the
> >> patch was sitting there with unresolved issues. Based on that
> >> review the patch was marked as RWF and thus not moved to 2018-03
> >> automatically.
> > 
> > I don't see how this changes anything.
> > 
> 
> You've used "The patch hasn't moved forward since 2018-01-19," as an
> argument why the patch is not eligible for 2018-03. I suggest that
> argument is misleading, because patches generally do not move without
> reviews, and it's difficult to respond to a review that arrives on the
> last day of a commitfest.
> 
> Consider that without the review, the patch would end up with NR status,
> and would be moved to the next CF automatically. Isn't that a bit weird?

Not sure I follow. The point is that nobody would have complained if
you'd moved the patch into this fest if you'd updated it *before* it
started?

Greetings,

Andres Freund

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

On 03/03/2018 02:01 AM, Andres Freund wrote:
> On 2018-03-03 02:00:46 +0100, Tomas Vondra wrote:
>> That is somewhat misleading, I think. You're right the last version
>> was submitted on 2018-01-19, but the next review arrived on
>> 2018-01-31, i.e. right at the end of the CF. So it's not like the
>> patch was sitting there with unresolved issues. Based on that
>> review the patch was marked as RWF and thus not moved to 2018-03
>> automatically.
> 
> I don't see how this changes anything.
> 

You've used "The patch hasn't moved forward since 2018-01-19," as an
argument why the patch is not eligible for 2018-03. I suggest that
argument is misleading, because patches generally do not move without
reviews, and it's difficult to respond to a review that arrives on the
last day of a commitfest.

Consider that without the review, the patch would end up with NR status,
and would be moved to the next CF automatically. Isn't that a bit weird?

kind regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [COMMITTERS] pgsql: Add much-more-extensive TAP tests for pgbench.

Peter Eisentraut  writes:
> On 9/8/17 09:32, Tom Lane wrote:
>> Add much-more-extensive TAP tests for pgbench.

> The use of done_testing() raises the requirement of Test::More from 0.82
> to 0.87.  That seems OK, but we should update the version requirement in
> TestLib.pm.  Any concerns?

Probably not, given that none of the buildfarm machines have whined
(although I'm not sure if all the older ones run the TAP tests?).

I have some recollection that when I was trying to set up
the-oldest-supported-Perl-version on prairiedog, I couldn't even
find Test::More 0.82 anywhere.  What's actually there is a good
bit newer, anyway.

regards, tom lane

Re: [HACKERS] user-defined numeric data types triggering ERROR: unsupported type


On 03/03/2018 01:56 AM, Tom Lane wrote:
> Tomas Vondra  writes:
>> OK, time to revive this old thread ...
 [ scalarineqsel may fall over when used by extension operators ]
> 
>> Attached is a minimal fix adding a flag to convert_numeric_to_scalar,
>> tracking when it fails because of unsupported data type. If any of the 3
>> calls (value + lo/hi boundaries) sets it to 'true' we simply fall back
>> to the default estimate (0.5) within the bucket.
> 
> I think this is a little *too* minimal, because it only covers
> convert_numeric_to_scalar and not the other sub-cases in which
> convert_to_scalar will throw an error instead of returning "false".
> I realize that that would be enough for your use-case, but I think
> we need to think more globally.  So please fix the other ones too.
> 

Will do.

> I notice BTW that whoever shoehorned in the bytea case failed to
> pay attention to the possibility that not all three inputs are
> of the same type; so that code is flat out broken, and capable
> of crashing if fed dissimilar types.  We ought to deal with that
> while we're at it, since (IMO) the goal is to make convert_to_scalar
> fail-soft for any combination of supplied data types.
> 

OK, I'll look into that too.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

On 2018-03-03 02:00:46 +0100, Tomas Vondra wrote:
> That is somewhat misleading, I think. You're right the last version was
> submitted on 2018-01-19, but the next review arrived on 2018-01-31, i.e.
> right at the end of the CF. So it's not like the patch was sitting there
> with unresolved issues. Based on that review the patch was marked as RWF
> and thus not moved to 2018-03 automatically.

I don't see how this changes anything.

- Andres

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

On 03/02/2018 09:05 PM, Andres Freund wrote:
> Hi,
> 
> On 2018-03-01 21:39:36 -0500, David Steele wrote:
>> On 3/1/18 9:33 PM, Tomas Vondra wrote:
>>> On 03/02/2018 02:12 AM, Andres Freund wrote:
 Hm, this CF entry is marked as needs review as of 2018-03-01 12:54:48,
 but I don't see a newer version posted?

>>>
>>> Ah, apologies - that's due to moving the patch from the last CF (it was
>>> marked as RWF so I had to reopen it before moving it). I'll submit a new
>>> version of the patch shortly, please mark it as WOA until then.
>>
>> Marked as Waiting on Author.
> 
> Sorry to be the hard-ass, but given this patch hasn't been moved forward
> since 2018-01-19, I'm not sure why it's elegible to be in this CF in the
> first place?
> 

That is somewhat misleading, I think. You're right the last version was
submitted on 2018-01-19, but the next review arrived on 2018-01-31, i.e.
right at the end of the CF. So it's not like the patch was sitting there
with unresolved issues. Based on that review the patch was marked as RWF
and thus not moved to 2018-03 automatically.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] user-defined numeric data types triggering ERROR: unsupported type

Tomas Vondra  writes:
> OK, time to revive this old thread ...
>>> [ scalarineqsel may fall over when used by extension operators ]

> Attached is a minimal fix adding a flag to convert_numeric_to_scalar,
> tracking when it fails because of unsupported data type. If any of the 3
> calls (value + lo/hi boundaries) sets it to 'true' we simply fall back
> to the default estimate (0.5) within the bucket.

I think this is a little *too* minimal, because it only covers
convert_numeric_to_scalar and not the other sub-cases in which
convert_to_scalar will throw an error instead of returning "false".
I realize that that would be enough for your use-case, but I think
we need to think more globally.  So please fix the other ones too.

I notice BTW that whoever shoehorned in the bytea case failed to
pay attention to the possibility that not all three inputs are
of the same type; so that code is flat out broken, and capable
of crashing if fed dissimilar types.  We ought to deal with that
while we're at it, since (IMO) the goal is to make convert_to_scalar
fail-soft for any combination of supplied data types.

regards, tom lane

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

Hi there,

attached is an updated patch fixing all the reported issues (a bit more
about those below).

The main change in this patch version is reworked logging of subxact
assignments, which needs to be done immediately for incremental decoding
to work properly.

The previous patch versions did that by logging a separate xlog record,
which however had rather noticeable space overhead (~40% on a worst-case
test - tiny table, no FPWs, ...). While in practice the overhead would
be much closer to 0%, it still seemed unacceptable.

Andres proposed doing something like we do with replication origins in
XLogRecordAssemble, i.e. inventing a special block, and embedding the
assignment info into that (in the next xlog record). This turned out to
be working quite well, and the worst-case space overhead dropped to ~5%.

I have attempted to do something like that with the invalidations, which
is the other thing that needs to be logged immediately for incremental
decoding to work correctly. The plan was to use the same approach as for
assignments, i.e. embed the invalidations into the next xlog record and
stop sending them in the commit message. That however turned out to be
much more complicated - the embedding is fairly trivial, of course, but
unlike assignments the invalidations are needed for hot standbys. If we
only send them incrementally, I think the standby would have to collect
from the WAL records, and store them in a way that survives restarts.

So for invalidations the patch uses the original approach with a new
type xlog record type (ignored by standby), and still logging the
invalidations in commit record (which is that the standby relies on).

On 02/01/2018 11:50 PM, Tomas Vondra wrote:
> On 01/31/2018 07:53 AM, Masahiko Sawada wrote:
> ...
>> 
>> CREATE SUBSCRIPTION commands accept work_mem < 64 but it leads ERROR
>> on publisher side when starting replication. Probably we should check
>> the value on the subscriber side as well.
>>

Added.

>> 
>> When streaming = on, if we drop subscription in the middle of
>> receiving stream changes, DROP SUBSCRIPTION could leak tmp files
>> (.chages file and .subxacts file). Also it also happens when a
>> transaction on upstream aborted without abort record.
>>

Right. The files would get cleaned up eventually during restart (just
like other temporary files), but leaking them after DROP SUBSCRIPTION is
not cool. So I've added a simple tracking of files (or rather streamed
XIDs) in the worker, and clean them explicitly on exit.

>> 
>> Since we can change both streaming option and work_mem option by ALTER
>> SUBSCRIPTION, documentation of ALTER SUBSCRIPTION needs to be updated.
>>

Yep, I've added note that work_mem and streaming can also be changed.
Those changes won't be applied to the already running worker, though.

>> 
>> If we create a subscription without any options, both
>> pg_subscription.substream and pg_subscription.subworkmem are set to
>> null. However, since GetSubscription are't aware of NULL we start the
>> replication with invalid options like follows.
>> LOG:  received replication command: START_REPLICATION SLOT "hoge_sub"
>> LOGICAL 0/0 (proto_version '2', work_mem '893219954', streaming 'on',
>> publication_names '"hoge_pub"')
>>
>> I think we can set substream to false and subworkmem to -1 instead of
>> null, and then makes libpqrcv_startstreaming not set streaming option
>> if stream is -1.
>>

Good catch! I've done pretty much what you suggested here, i.e. store
-1/false instead and then handle that in libpqrcv_startstreaming.

>> 
>> Some WARNING messages appeared. Maybe these are for debug purpose?
>>
>> WARNING:  updating stream stats 0x1c12ef8 4 3 65604
>> WARNING: UpdateSpillStats: updating stats 0x1c12ef8 0 0 0 39 41 2632080
>>

Yeah, those should be removed.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

0001-Introduce-logical_work_mem-to-limit-ReorderBuffer.patch.gz
Description: application/gzip

0002-Immediatel-WAL-log-assignments.patch.gz
Description: application/gzip

0003-Issue-individual-invalidations-with-wal_level-logica.patch.gz
Description: application/gzip

0004-Extend-the-output-plugin-API-with-stream-methods.patch.gz
Description: application/gzip

0005-Implement-streaming-mode-in-ReorderBuffer.patch.gz
Description: application/gzip

0006-Add-support-for-streaming-to-built-in-replication.patch.gz
Description: application/gzip

0007-Track-statistics-for-streaming-spilling.patch.gz
Description: application/gzip

0008-BUGFIX-make-sure-subxact-is-marked-as-is_known_as_su.patch.gz
Description: application/gzip

0009-BUGFIX-set-final_lsn-for-subxacts-before-cleanup.patch.gz
Description: application/gzip

Re: [COMMITTERS] pgsql: Add much-more-extensive TAP tests for pgbench.

On 9/8/17 09:32, Tom Lane wrote:
> Add much-more-extensive TAP tests for pgbench.
> 
> Fabien Coelho, reviewed by Nikolay Shaplov and myself

The use of done_testing() raises the requirement of Test::More from 0.82
to 0.87.  That seems OK, but we should update the version requirement in
TestLib.pm.  Any concerns?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: line_perp() (?-|) is broken.

Kyotaro HORIGUCHI  writes:
> I happend to see a strange geometric calcualtion on master/HEAD.
> ...
> Instead, calculating inner product of the two direction vectors
> works as expected.
>   (l1->A * l2->A) + (l1->B * l2->B) == 0

This seems to be a strict subset of the changes in Emre Hasgeli's
latest geometric-types patch.  I'm disinclined to commit it unless
that patch gets rejected, as it'll just require Emre to rebase again.

However, for either patch, I'm a bit concerned about using FPzero()
on the inner product result.  To the extent that the EPSILON bound
has any useful meaning at all, it needs to mean a maximum difference
between two coordinate values.  The inner product has units of coordinates
squared, so it seems like EPSILON isn't an appropriate threshold there.

Possibly this objection is pointless, because I'm not at all sure that
the existing code is careful about how it uses FPeq/FPzero; perhaps
we're applying EPSILON to all manner of numbers that don't have units
of the coordinate space.  But this won't help make it better.

regards, tom lane

Re: JIT compiling with LLVM v11

On 2018-03-02 16:29:54 -0800, Andres Freund wrote:
> >  #include 
> >  #include 
> > 
> > It seems that it was intended that way anyway, since llvmjit.h contains
> > its own provisions for extern C.
> 
> Hrmpf, yea, I broke that the third time now.  I'm actually inclined to
> add an appropriate #ifdef ... #error so it's not repeated, what do you
> think?

Hm, don't think that's easily possible :(

- Andres

Re: [HACKERS] path toward faster partition pruning

2018-03-02 Thread David Rowley

On 3 March 2018 at 04:47, Robert Haas  wrote:
> On Fri, Mar 2, 2018 at 6:21 AM, Amit Langote
>  wrote:
>> 2. Removing redundant clauses from each list, that is,
>>remove_redundant_clauses() to produce lists with just one member per
>>operator strategy for each partition key,
>
> I don't see that there's a real need to perform step 2 at all.  I
> mean, if you have x > $1 and x > $2 in the query, you can just compute
> the set of partitions for the first clause, compute the set of
> partitions for the second clause, and then intersect.  That doesn't
> seem obviously worse than deciding which of $1 and $2 is greater and
> then pruning only based on whichever one is greater in this case.

This is an interesting idea. It may simplify the code quite a bit as
the clause reduction code is quite bulky. However, unless I'm
mistaken, for this to work, certain inputs will need significantly
more processing to determine the minimum set of matching partitions.

Let's look at the following perhaps unlikely case. (I picked an
extreme case to demonstrate why this may be an inferior method)

Given the table abc (...) partition by range (a,b,c), with the query:

select * from abc where a >= 1 and a >= 2 and a >= 3 and b >= 1 and b
>= 2 and b = 3 and c >= 1 and c >= 2 and c = 3;

We would likely still be parsing those clauses into some struct like
PartitionClauseInfo and would end up with some arrays or Lists with
the clauses segmented by partition key.

It appears to me, for your method to work we'd need to try every
combination of the clauses matching each partition key, which in this
case is 3 * 3 * 3 searches. Amit's current method is 1 search, after
the clause reduction which is 3 + 3 + 3 (O(N) per partition key)

I've tried to think of a more genuine poor performing case for this
with IN or NOT IN lists, but I can't quite see it since NOT IN will
only be supported by LIST partitioning, which can only have a single
partition key and IN would be OR conditions, each of which would be
evaluated in a different round of looping. Although I'm not ruling out
that my imagination is just not good enough.

With that considered, is it still a good idea to do it this way?

Or maybe I've misunderstood the idea completely?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: JIT compiling with LLVM v11

Hi,

On 2018-03-02 19:13:01 -0500, Peter Eisentraut wrote:
> On 3/1/18 03:02, Andres Freund wrote:
> > I've pushed a revised version of my JIT patchset.
> > The git tree is at
> >   https://git.postgresql.org/git/users/andresfreund/postgres.git
> > in the jit branch
> >   
> > https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit
> 
> (testing 2e15e8b8100a61ec092a1e5b2db4a93f07a64cbd)
> 
> I'm having an interesting time getting this to build on macOS.

Sorry for that...

> First, you need to use a CXX that is reasonably similar to the CC.
> Otherwise, the CXX will complain about things like attributes not
> being supported etc.  That's not surprising, but it's a support issue
> that we'll have to prepare ourselves for.

Right.

> Using my standard set of CC=gcc-7 and CXX=g++-7, the build fails with
> 
> g++-7: error: unrecognized command line option '-stdlib=libc++'
> 
> That comes from llvm-config --cxxflags, which was apparently made for
> /usr/bin/cc (which is clang).

> I see here the same problems as we had in the olden days with Perl,
> where it gave us a bunch of compiler flags that applied to the system
> compiler but not the compiler currently in use.  We should just take the
> flags that we really need, like -I and -L.  Maybe we don't need it at all.

It's actually already filtered, I just added -std*, because of selecting
the c++ standard, I guess I need to filter more aggressively.  This is
fairly fairly annoying.

> Using this patch gets it past that:
> 
> diff --git a/src/backend/jit/llvm/llvmjit_inline.cpp
> b/src/backend/jit/llvm/llvmjit_inline.cpp
> index d4204d2cd2..ad87cfd2d9 100644
> --- a/src/backend/jit/llvm/llvmjit_inline.cpp
> +++ b/src/backend/jit/llvm/llvmjit_inline.cpp
> @@ -22,7 +22,6 @@
>  extern "C"
>  {
>  #include "postgres.h"
> -#include "jit/llvmjit.h"
> 
>  #include 
>  #include 
> @@ -35,6 +34,8 @@ extern "C"
>  #include "storage/fd.h"
>  }
> 
> +#include "jit/llvmjit.h"
> +
>  #include 
>  #include 
> 
> It seems that it was intended that way anyway, since llvmjit.h contains
> its own provisions for extern C.

Hrmpf, yea, I broke that the third time now.  I'm actually inclined to
add an appropriate #ifdef ... #error so it's not repeated, what do you
think?

> Then, I'm getting this error:

> It seems the problem here is linking C++ with the C compiler.  If I
> hack in to use c++ in the above command, it continues, and the build
> completes.

Yea, I was afraid of that, even if I didn't see it locally.
Unfortunately Makefile.shlib has a bunch of references both to
$(COMPILER) and $(CC).  Most of the relevant platforms (using llvmjit on
hpux seems like and edge case somebody desiring it can fix) use
$(COMPILER).

Does putting an
override COMPILER = $(CXX) $(CFLAGS)

into src/backend/jit/llvm/Makefile work?  It does force the use of CXX
for all important platforms if I see it correctly. Verified that it
works on linux.

> configure didn't find any of the LLVMOrc* symbols it was looking for.
> Is that a problem?  They seem to be for some debugging support.

That's not a problem, except that the symbols won't be registered with
the debugger, which is a bit annoying for backtraces. I tried to have
configure throw errors in cases llvm is too old or such.

> So, how do I turn this on then?  I see a bunch of new GUC settings
> that are all off by default.  Which ones turn the feature(s) on?

Hm, I'll switch them on in the development branch. Independent of the
final decision that's definitely the right thing for now.  The "full
capability" of the patchset is used if you turn on these three GUCs:

-c jit_expressions=1
-c jit_tuple_deforming=1
-c jit_perform_inlining=1

If you set -c log_min_messages=debug one and run a query you'd see
something like:
2018-03-02 16:27:19.717 PST [11077][3/8] DEBUG:  time to inline: 0.087s
2018-03-02 16:27:19.724 PST [11077][3/8] DEBUG:  time to opt: 0.007s
2018-03-02 16:27:19.750 PST [11077][3/8] DEBUG:  time to emit: 0.027s

I think I should just remove jit_tuple_deforming=1,
jit_perform_inlining=1, they're better done via the cost settings (-1
disables).  I think having -c jit_expressions is helpful leaving the
cost settings aside, because it allows to enable/disable jitting
wholesale without changing cost settings, which seems good.

Greetings,

Andres Freund

Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.

2018-03-02 Thread David Gould

On Thu, 01 Mar 2018 18:49:20 -0500
Tom Lane  wrote:

> The sticking point in my mind right now is, if we do that, what to do with
> VACUUM's estimates.  If you believe the argument in the PDF that we'll
> necessarily overshoot reltuples in the face of declining true density,
> then it seems like that argument applies to VACUUM as well.  However,
> VACUUM has the issue that we should *not* believe that it looked at a
> random sample of pages.  Maybe the fact that it looks exactly at the
> changed pages causes it to see something less than the overall density,
> cancelling out the problem, but that seems kinda optimistic.

For what it's worth, I think the current estimate formula for VACUUM is
pretty reasonable. Consider a table T with N rows and P pages clustered
on serial key k. Assume reltuples is initially correct.

Then after:

 delete from T where k < 0.2 * (select max k from T);
 vacuum T;

Vacuum will touch the first 20% of the pages due to visibility map, the sample
will have 0 live rows, scanned pages will be 0.2 * P.

Then according to the current code:

old_density = old_rel_tuples / old_rel_pages;
new_density = scanned_tuples / scanned_pages;
multiplier = (double) scanned_pages / (double) total_pages;
updated_density = old_density + (new_density - old_density) * multiplier;
return floor(updated_density * total_pages + 0.5);

the new density will be:

   N/P + (0/0.2*P - N/P) * 0.2
 = N/P - N/P * 0.2
 = 0.8 * N/P

New reltuples estimate will be 0.8 * old_reltuples. Which is what we wanted.

If we evenly distribute the deletes across the table:

  delete from T where rand() < 0.2;

Then vacuum will scan all the pages, the sample will have 0.8 * N live rows,
scanned pages will be 1.0 * P. The new density will be

   N/P + (0.8 * N/1.0*P - N/P) * 1.0
 = N/P + (0.8 N/P - N/P)
 = N/P - 0.2 * N/P
 = 0.8 * N/P

Which again gives new reltuples as 0.8 * old_reltuples and is again correct.

I believe that given a good initial estimate of reltuples and relpages and
assuming that the pages vacuum does not scan do not change density then the
vacuum calculation does the right thing.

However, for ANALYZE the case is different.

-dg

-- 
David Gould   da...@sonic.net
If simplicity worked, the world would be overrun with insects.

Re: JIT compiling with LLVM v11

On 3/1/18 03:02, Andres Freund wrote:
> I've pushed a revised version of my JIT patchset.
> The git tree is at
>   https://git.postgresql.org/git/users/andresfreund/postgres.git
> in the jit branch
>   
> https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit

(testing 2e15e8b8100a61ec092a1e5b2db4a93f07a64cbd)

I'm having an interesting time getting this to build on macOS.

First, you need to use a CXX that is reasonably similar to the CC.
Otherwise, the CXX will complain about things like attributes not
being supported etc.  That's not surprising, but it's a support issue
that we'll have to prepare ourselves for.

Using my standard set of CC=gcc-7 and CXX=g++-7, the build fails with

g++-7: error: unrecognized command line option '-stdlib=libc++'

That comes from llvm-config --cxxflags, which was apparently made for
/usr/bin/cc (which is clang).

I see here the same problems as we had in the olden days with Perl,
where it gave us a bunch of compiler flags that applied to the system
compiler but not the compiler currently in use.  We should just take the
flags that we really need, like -I and -L.  Maybe we don't need it at all.

Trying it again then with CC=/usr/bin/cc and CXX=/usr/bin/c++, it
fails with

In file included from llvmjit_inline.cpp:25:
In file included from ../../../../src/include/jit/llvmjit.h:16:
In file included from
/usr/local/Cellar/llvm/5.0.1/include/llvm-c/Types.h:17:
In file included from
/usr/local/Cellar/llvm/5.0.1/include/llvm/Support/DataTypes.h:33:
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/cmath:555:1:
error: templates must have C++ linkage

Using this patch gets it past that:

diff --git a/src/backend/jit/llvm/llvmjit_inline.cpp
b/src/backend/jit/llvm/llvmjit_inline.cpp
index d4204d2cd2..ad87cfd2d9 100644
--- a/src/backend/jit/llvm/llvmjit_inline.cpp
+++ b/src/backend/jit/llvm/llvmjit_inline.cpp
@@ -22,7 +22,6 @@
 extern "C"
 {
 #include "postgres.h"
-#include "jit/llvmjit.h"

 #include 
 #include 
@@ -35,6 +34,8 @@ extern "C"
 #include "storage/fd.h"
 }

+#include "jit/llvmjit.h"
+
 #include 
 #include 

It seems that it was intended that way anyway, since llvmjit.h contains
its own provisions for extern C.

Then, I'm getting this error:

/usr/bin/cc -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
-Wformat-security -fno-strict-aliasing -fwrapv
-Wno-unused-command-line-argument -g -O2 -Wno-deprecated-declarations
-Werror   -bundle -multiply_defined suppress -o llvmjit.so  llvmjit.o
llvmjit_error.o llvmjit_inline.o llvmjit_wrap.o llvmjit_expr.o
llvmjit_deform.o -L../../../../src/port -L../../../../src/common
-L/usr/local/lib -L/usr/local/opt/openssl/lib
-L/usr/local/opt/readline/lib -L/usr/local/Cellar/libxml2/2.9.7/lib
-L/usr/local/Cellar/llvm/5.0.1/lib  -L/usr/local/lib
-L/usr/local/opt/openssl/lib -L/usr/local/opt/readline/lib
-Wl,-dead_strip_dylibs  -Werror  -lLLVMPasses -lLLVMipo
-lLLVMInstrumentation -lLLVMVectorize -lLLVMLinker -lLLVMIRReader
-lLLVMAsmParser -lLLVMOrcJIT -lLLVMX86Disassembler -lLLVMX86AsmParser
-lLLVMX86CodeGen -lLLVMGlobalISel -lLLVMSelectionDAG -lLLVMAsmPrinter
-lLLVMDebugInfoCodeView -lLLVMDebugInfoMSF -lLLVMCodeGen
-lLLVMScalarOpts -lLLVMInstCombine -lLLVMTransformUtils -lLLVMBitWriter
-lLLVMX86Desc -lLLVMMCDisassembler -lLLVMX86Info -lLLVMX86AsmPrinter
-lLLVMX86Utils -lLLVMMCJIT -lLLVMExecutionEngine -lLLVMTarget
-lLLVMAnalysis -lLLVMProfileData -lLLVMRuntimeDyld -lLLVMDebugInfoDWARF
-lLLVMObject -lLLVMMCParser -lLLVMMC -lLLVMBitReader -lLLVMCore
-lLLVMBinaryFormat -lLLVMSupport -lLLVMDemangle -lcurses -lz -lm
-bundle_loader ../../../../src/backend/postgres
Undefined symbols for architecture x86_64:
  "std::__1::error_code::message() const", referenced from:
  _LLVMPrintModuleToFile in libLLVMCore.a(Core.cpp.o)
  _LLVMCreateMemoryBufferWithContentsOfFile in libLLVMCore.a(Core.cpp.o)
  _LLVMCreateMemoryBufferWithSTDIN in libLLVMCore.a(Core.cpp.o)
  _LLVMTargetMachineEmitToFile in libLLVMTarget.a(TargetMachineC.cpp.o)
  llvm::errorToErrorCode(llvm::Error) in libLLVMSupport.a(Error.cpp.o)
  llvm::ECError::log(llvm::raw_ostream&) const in
libLLVMSupport.a(Error.cpp.o)

llvm::SectionMemoryManager::finalizeMemory(std::__1::basic_string*) in
libLLVMExecutionEngine.a(SectionMemoryManager.cpp.o)
[snipped about 900 lines]

It seems the problem here is linking C++ with the C compiler.  If I
hack in to use c++ in the above command, it continues, and the build
completes.

configure didn't find any of the LLVMOrc* symbols it was looking for.
Is that a problem?  They seem to be for some debugging support.


So, how do I turn this on then?  I see a bunch of new GUC settings
that are all off by default.  Which ones turn the feature(s) on?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7

Re: Online enabling of checksums


On 03/02/2018 11:01 PM, Magnus Hagander wrote:
> On Fri, Mar 2, 2018 at 5:50 PM, Tomas Vondra
> > wrote:
> 
> 
> 
> On 03/02/2018 02:35 PM, Magnus Hagander wrote:
> >
> >
> > On Wed, Feb 28, 2018 at 6:06 PM, Robert Haas  
> > >> wrote:
> >
> >     On Sun, Feb 25, 2018 at 9:54 AM, Magnus Hagander
> >     
> >> wrote:
> >     > Also if that wasn't clear -- we only do the full page write if 
> there isn't
> >     > already a checksum on the page and that checksum is correct.
> >
> >     Hmm.
> >
> >     Suppose that on the master there is a checksum on the page and that
> >     checksum is correct, but on the standby the page contents differ in
> >     some way that we don't always WAL-log, like as to hint bits, and 
> there
> >     the checksum is incorrect.  Then you'll enable checksums when the
> >     standby still has some pages without valid checksums, and disaster
> >     will ensue.
> >
> >     I think this could be hard to prevent if checksums are turned on and
> >     off multiple times.
> >
> >
> > Do we ever make hintbit changes on the standby for example? If so, it
> > would definitely cause problems. I didn't realize we did, actually...
> >
> 
> I don't think we do. SetHintBits does TransactionIdIsValid(xid) and
> AFAIK that can't be true on a standby.
> 
> > I guess we could get there even if we don't by:
> > * All checksums are correct
> > * Checkums are disabled (which replicates)
> > * Non-WAL logged change on the master, which updates checksum but does
> > *not* replicate
> > * Checksums re-enabled
> > * Worker sees the checksum as correct, and thus does not force a full
> > page write.
> > * Worker completes and flips checksums on which replicates. At this
> > point, if the replica reads the page, boom.
> >
> 
> Maybe.
> 
> My understanding of Robert's example is that you can start with an
> instance that has wal_log_hints=off, and so pages on master/standby may
> not be 100% identical. Then we do the online checksum thing, and the
> standby may get pages with incorrect checksums.
> 
> 
> No, in that case the master will issue full page writes for *all* pages,
> since they dind't hvae a checksum. The current patch only avoids doing
> that if the checksum on the master is correct, which it isn't when you
> start from checksums=off.  So this particular problem only shows up if
> you iterate between off/on/off multiple times.
> 

Hmmm, OK. So we need to have a valid checksum on a page, disable
checksums, set some hint bits on the page (which won't be WAL-logged),
enable checksums again and still get a valid checksum even with the new
hint bits? That's possible, albeit unlikely.

> 
> > I guess we have to remove that optimisation. It's definitely a
> > bummer, but I don't think it's an absolute dealbreaker.
> >
> 
> I agree it's not a deal-breaker. Or at least I don't see why it should
> be - any other maintenance activity on the database (freezing etc.) will
> also generate full-page writes.
> 
> 
> Yes.
> 
>  
> 
> The good thing is the throttling also limits the amount of WAL, so it's
> possible to prevent generating too many checkpoints etc.
> 
> I suggest we simply:
> 
> 1) set the checksums to in-progress
> 2) wait for a checkpoint
> 3) use the regular logic for full-pages (i.e. first change after
> checkpoint does a FPW)
> 
> 
> This is very close to what it does now, except it does not wait for a
> checkpoint in #2. Why does it need that?
> 

To guarantee that the page has a FPW with all the hint bits, before we
start messing with the checksums (or that setting the checksum itself
triggers a FPW).

> 
> BTW speaking of checkpoints, I see ChecksumHelperLauncherMain does
> 
>     RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT | \
>                       CHECKPOINT_IMMEDIATE);
> 
> I'm rather unhappy about that - immediate checkpoints have massive
> impact on production systems, so we try not doing them (That's one of
> the reasons why CREATE DATABASE is somewhat painful). It usually
> requires a bit of thinking about when to do such commands. But in this
> case it's unpredictable when exactly the checksumming completes, so it
> may easily be in the middle of peak activity.
> 
> Why not to simply wait for regular spread checkpoint, the way
> pg_basebackup does it?
> 
> 
> Actually, that was my original idea. I changed it for testing, and shuld
> go change it back.
> 

OK

> 
> 
> > We could say that we keep the

Re: Documenting commitfest Rules

On 2018-03-02 18:08:00 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > - that there's a single point documenting the state of the patch, to
> >   avoid situations where different people interpret a thread differently
> >   without noticing.
> 
> I think that third point is at best an idealized statement, and it's not
> very reflective of actual practice.  We're not great about updating the CF
> entry's state, and even if we were, I don't think there's a bright line
> between Needs Review and Waiting On Author.  There may have been enough
> feedback provided so that the author has something to do, yet not so much
> that there's no point in further review.

Yea, I think it's a bit of hit/miss. But I do think that *if it happens*
the discussion around that often provides some clarification.


> But anyway you said you wanted to summarize current actual practice,
> and this isn't quite what really happens IME.

I think it's OK to state some of the aspirational goals, even if they
only halfway work.  And I think we should evolve the practices after
agreeing on what they currently are ;)


> There are a couple of meta-goals as well, although I'm not sure whether
> they belong in this document:
> 
> * Encourage people to review other people's patches.  This isn't just
> to make the patches better, it's to make the reviewers better: they
> gain familiarity with the PG code base.

> * Ensure that committers don't have to *always* feel guilty about
> not working on other people's patches instead of their own.  Otherwise
> we'd just stay in CF mode all the time.

Oh, yes, I think both of these belong.


> > Submitting a patch as a commitfest entry to a specific commitfest
> > implies a statement by the author that the patch needs input from
> > others. That input can be agreement on design decisions, high level code
> > review, testing, etc.
> 
> ... or even just that the author would like somebody to commit it.

Oh, right ;)


> Also, there's at least one rule you forgot to cover, concerning
> asking people to review patches more or less proportionally to the
> amount of patches they've submitted.  This is surely a lot squishier
> than the other rules, but without it, nothing much happens.

Agreed.

I think we actually should be much more aggressive about it too.

Greetings,

Andres Freund

Re: Testing "workers launched" in expected output? Really?

2018-03-02 Thread Craig Ringer

On 3 March 2018 at 04:27, Robert Haas  wrote:

> On Fri, Mar 2, 2018 at 3:13 PM, Tom Lane  wrote:
> > My point is that just because it isn't falling over on
> > relatively-lightly-loaded buildfarm machines doesn't mean that it won't
> > fall over in other environments.
>
> It doesn't mean it will, either.
>
> > I should think that your recent experience with postgres_fdw (which is
> > evidently still broken, BTW, see rhinoceros) would have convinced you
> > that environmentally dependent regression test results are something to
> > studiously avoid.
>
> I suspect the problem with the test is that it's testing for a plan
> that is only slightly better than the next-best plan, and so it takes
> very little to push it over the edge.  That's sort of an environment
> issue, but it's not exactly the same thing you're worried about here.
>
> > We have enough trouble with unforeseen test deltas;
> > putting in tests that have a blatantly obvious failure mechanism is
> > just taunting the software gods.  Who tend to take their revenge in
> > unpleasant ways.
> >
> > If you insist on actual failures, perhaps I'll go set up about four
> > concurrently-scheduled animals on prairiedog's host, and wait to see
> > what happens ...
>
> I will be curious to see the results of that experiment.  Unless you
> press the machine so hard that fork() starts returning EAGAIN, I don't
> see why that should cause this to fail.  And if you do that, then it's
> going to fail far beyond the ability of suppressing Workers Launched
> to cover the problem up.  However, it's certainly possible that you've
> thought of some failure mode that I've missed, or that there is one
> which neither of us has considered.
>


We already perpetrate all sorts of horrid things to work around
pg_regress's literal whole-file expected output matching. Hard-to-maintain
and impossible-to-document alternates files being high on my list.

Rather than adding to the list of things we hide and suppress to cover the
deficiencies in our regression test results pattern matching I wonder if
it's time to fix the pattern matching. Provide for some limited wildcard
features in pg_regress output pattern files, for example.

Easier said than done without requiring a total change in how all expected
files are written, unfortunately. "magic tags" to open a regexp section of
expected file are all well and good, but then we can't just use diff
anymore...

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Documenting commitfest Rules

Andres Freund  writes:
> Thus I'm trying to summarize my understanding in this email. It probably
> doesn't 100% match other people's understanding. After we've resolved
> those differences, we should update the wiki page, and delete outdated
> content.

> Commitfests primarily exists so:

> - we do not accidentally loose track of patch submissions that deserve
>   attention, that previously happened frequently
> - that reviewers and committers quickly can find patches deserving their
>   attention (i.e. ones in in needs-review and ready-for-committer
>   respectively)
> - that there's a single point documenting the state of the patch, to
>   avoid situations where different people interpret a thread differently
>   without noticing.

I think that third point is at best an idealized statement, and it's not
very reflective of actual practice.  We're not great about updating the CF
entry's state, and even if we were, I don't think there's a bright line
between Needs Review and Waiting On Author.  There may have been enough
feedback provided so that the author has something to do, yet not so much
that there's no point in further review.  So maybe there's something that
can be improved about the CF tool there.  But anyway you said you wanted
to summarize current actual practice, and this isn't quite what really
happens IME.

There are a couple of meta-goals as well, although I'm not sure whether
they belong in this document:

* Encourage people to review other people's patches.  This isn't just
to make the patches better, it's to make the reviewers better: they
gain familiarity with the PG code base.

* Ensure that committers don't have to *always* feel guilty about
not working on other people's patches instead of their own.  Otherwise
we'd just stay in CF mode all the time.

> Submitting a patch as a commitfest entry to a specific commitfest
> implies a statement by the author that the patch needs input from
> others. That input can be agreement on design decisions, high level code
> review, testing, etc.

... or even just that the author would like somebody to commit it.

Also, there's at least one rule you forgot to cover, concerning
asking people to review patches more or less proportionally to the
amount of patches they've submitted.  This is surely a lot squishier
than the other rules, but without it, nothing much happens.

regards, tom lane

Re: [HACKERS] log_destination=file

2018-03-02 Thread Magnus Hagander

On Fri, Mar 2, 2018 at 7:29 PM, Tom Lane  wrote:

> Robert Haas  writes:
> > On Mon, Jan 22, 2018 at 4:25 PM, Tomas Vondra
> >  wrote:
> >> Sorry for the naive question, but which of these bottlenecks are we
> >> actually hitting? I don't recall dealing with an actual production
> >> system where the log collector would be an issue, so I don't have a very
> >> good idea where the actual bottleneck is in this case.
>
> > Unfortunately, neither do I -- but I do believe that the problem is real.
>
> Not to state the obvious ... but if we do not know where the bottleneck
> is, isn't it a bit premature to be proposing changes to "fix" it?
>
> But that argument just leads to the same conclusion that seems to be
> prevalent at the moment, ie boot this to v12.
>

Just to be clear, it has already been booted to v12.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/

Documenting commitfest Rules

Hi Tomas, Everyone.

This seems like a good thing to discuss separately from the original
thread.

On 2018-03-02 22:06:32 +0100, Tomas Vondra wrote:
> Can you guys please point me to the CF rules that say this? Because my
> understanding (and not just mine, AFAICS) was obviously different.
> Clearly there's a disconnect somewhere.

I think we generally have documented the CF process very poorly.  Way
too much of it was discussed in-person at developer meetings (with notes
taken), is buried in deep threads arguing about release management or
individual commitfests.  A lot of the wiki information is outdated too.

Thus I'm trying to summarize my understanding in this email. It probably
doesn't 100% match other people's understanding. After we've resolved
those differences, we should update the wiki page, and delete outdated
content.

A single Commitfest is commonly called CF.

== Goal ==

Commitfests primarily exists so:

- we do not accidentally loose track of patch submissions that deserve
  attention, that previously happened frequently
- that reviewers and committers quickly can find patches deserving their
  attention (i.e. ones in in needs-review and ready-for-committer
  respectively)
- that there's a single point documenting the state of the patch, to
  avoid situations where different people interpret a thread differently
  without noticing.

== Entries ==

A commitfest entry is a patch, or set of patches, implementing some goal
(e.g. a new feature, some cleanup, improved docs). Each commitfest entry
is associated with one or more email threads, that discuss the feature.

A patch can be authored by multiple people, and the set of people
actively working on a patch can change.

Submitting a patch as a commitfest entry to a specific commitfest
implies a statement by the author that the patch needs input from
others. That input can be agreement on design decisions, high level code
review, testing, etc.

A CF entry can be in multiple states:

- Needs Review: this means that the patch author needs input to
  continue. While there may be some concurrent activity, the author will
  stall at some point without he input.

- Waiting for Author: there has been feedback to the author that makes
  continuing review unproductive. The most desirable case for that is
  because there has been a detailed review describing issues with the
  current implementation, which the patch author is expected to address
  or discuss.  Another reason might be that the current patch doesn't
  apply to the source tree anymore, and a rebased version of the patch
  is needed.

- Ready for Committer: reviewers have reviewed the patch and are,
  presumably after a few rounds of patch updates, happy with the current
  state and think the patch needs the attention of a committer. That can
  be because the patch is ready to commit, or because the design has
  evolved as far as it can without validation from somebody potentially
  willing to commit the patch.

- Committed: The patches associated with the CF entry have been
  committed.

- Rejected: The project has decided that the patch in the current form
  is not desirable.

- Returned with Feedback: The patch has gotten review indicating that it
  needs nontrivial changes, and the author did not have time and
  resources to perform them within the time of the current commitfest,
  or shortly thereafter.  It might be set because the author indicated
  so explicitly, stopped responding, etc.

- Moved to next CF: The patch was in needs review state, but either the
  current commitfest is about to end, or the patch is clearly in a state
  that is inappropriate for the current commitfest. The latter usually
  only matters in the last commitfest in a release cycle, more about
  that below.

== Commitfest Process ==

Commitfests start at a certain date. To avoid timezone issues, the
cutoff is when the start day has begun everywhere on the world.

All patches to be considered in that CF have to be submitted the day
before the start date, and either be in Needs Review or Ready for
Committer state.  That requirement exists because the goal of commitfest
is to provide input to patch authors so the patch can evolve and to not
forget patches that ought to be committed.

During a commitfest reviewers look at patches and provide feedback to
the authors. In the best case reviewers respond within a few days,
adapting the patch, answering changes, etc.

If the reviewers think a CF entry is ready for commit (or need design
input from a committer), the patch is set to Ready for Committer. Some
committer will hopefully come along and either commit the patch, or
explain what should happen in their opinion. Thus the followup states
can be Committed, Needs Review, Waiting on Author.

If patches are in Waiting for Author state for a while, after a
"warning" by the CF manager or reviewers, the patch will be marked as
Returned with Feedback, signaling that the project is not expected to
continue working on

Re: [PATCH] Opclass parameters

2018-03-02 Thread Nikita Glukhov

On 02.03.2018 19:12, Nikolay Shaplov wrote:

В письме от 1 марта 2018 23:02:20 пользователь Oleg Bartunov написал:

2. Your patch does not provide any example of your new tool usage. In my
prototype patch I've shown the implementation of opclass options for
intarray. May be you should do the same. (Use my example it will be more
easy then do it from scratch). It will give more understanding of how
this improvement can be used.

Hey, look on patches, there are many examples !

Oups...Sorry. I've looked at the patch from commitfest
https://commitfest.postgresql.org/17/1559/ and it have shown only the first
file. And When I read the letter I did not pay attention to attachments at
all. So I was sure there is only one there.

Yes. Now I see seven examples. But I think seven it is too many.
For each case a reviewer should make consideration if this parameter worth
moving to opclass options, or fixed definition in the C code is quite ok.
Doing it for whole bunch, may make it messy. I think, it would be good to
commit an implementation of opclass options, with a good example of usage. And
then commit patches for all cases where these options can be used.

There are 5 examples for GiST opclasses, not 7, and they are almost
identical -- in all of them added 'siglen' parameter for signature length
specification.

But since it is now "Rejected with feedback", let's wait till autumn.

We don't want to wait that long. But now we only need to сome to an agreement
about CREATE INDEX syntax and where to store the opclass parameters.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] [POC] Faster processing at Gather node

2018-03-02 Thread Tels

Hello Robert,

On Fri, March 2, 2018 12:22 pm, Robert Haas wrote:
> On Wed, Feb 28, 2018 at 10:06 AM, Robert Haas 
> wrote:
>> [ latest patches ]
>
> Committed.  Thanks for the review.

Cool :)

There is a typo, tho:

+   /*
+* If the counterpary is known to have attached, we can read mq_receiver
+* without acquiring the spinlock and assume it isn't NULL.  Otherwise,
+* more caution is needed.
+*/

s/counterpary/counterparty/;

Sorry, only noticed while re-reading the thread.

Also, either a double space is missing, or one is too many:

+   /*
+* Separate prior reads of mq_ring from the increment of mq_bytes_read
+* which follows.  Pairs with the full barrier in shm_mq_send_bytes(). 
We
+* only need a read barrier here because the increment of mq_bytes_read 
is
+* actually a read followed by a dependent write.
+*/

("  Pairs ..." vs. ". We only ...")

Best regards,

Tels

Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.

David Gould  writes:
> I'm confused at this point, I provided a patch that addresses this and a
> test case. We seem to be discussing everything as if we first noticed the
> issue. Have you reviewed the patch and and attached analysis and tested it?
> Please commment on that?

I've looked at the patch.  The questions in my mind are

(1) do we really want to go over to treating ANALYZE's tuple density
result as gospel, contradicting the entire thrust of the 2011 discussion?

(2) what should we do in the VACUUM case?  Alexander's argument seems
to apply with just as much force to the VACUUM case, so either you
discount that or you conclude that VACUUM needs adjustment too.

> This tables reltuples is 18 times the actual row count. It will never converge
> because with 5953 pages analyze can only adjust reltuples by 0.0006 each 
> time.

But by the same token, analyze only looked at 0.0006 of the pages.  It's
nice that for you, that's enough to get a robust estimate of the density
everywhere; but I have a nasty feeling that that won't hold good for
everybody.  The whole motivation of the 2011 discussion, and the issue
that is also seen in some other nearby discussions, is that the density
can vary wildly.

regards, tom lane

Re: prokind column (was Re: [HACKERS] SQL procedures)

On 2/28/18 17:37, Peter Eisentraut wrote:
> On 2/28/18 15:45, Tom Lane wrote:
>> I have reviewed this patch and attach an updated version below.
>> I've rebased it up to today, fixed a few minor errors, and adopted
>> most of Michael's suggestions.  Also, since I remain desperately
>> unhappy with putting zeroes into prorettype, I changed it to not
>> do that ;-) ... now procedures have VOIDOID as their prorettype,
>> and it will be substantially less painful to allow them to return
>> some other scalar result in future, should we wish to.  I believe
>> I've found all the places that were relying on prorettype == 0 as
>> a substitute for prokind == 'p'.
> 
> I have just posted "INOUT parameters in procedures", which contains some
> of those same changes.  So I think we're on the same page.  I will work
> on consolidating this.

Committed this so far.  More to come on touching this up.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Online enabling of checksums

2018-03-02 Thread Magnus Hagander

On Fri, Mar 2, 2018 at 5:50 PM, Tomas Vondra 
wrote:

>
>
> On 03/02/2018 02:35 PM, Magnus Hagander wrote:
> >
> >
> > On Wed, Feb 28, 2018 at 6:06 PM, Robert Haas  > > wrote:
> >
> > On Sun, Feb 25, 2018 at 9:54 AM, Magnus Hagander
> > > wrote:
> > > Also if that wasn't clear -- we only do the full page write if
> there isn't
> > > already a checksum on the page and that checksum is correct.
> >
> > Hmm.
> >
> > Suppose that on the master there is a checksum on the page and that
> > checksum is correct, but on the standby the page contents differ in
> > some way that we don't always WAL-log, like as to hint bits, and
> there
> > the checksum is incorrect.  Then you'll enable checksums when the
> > standby still has some pages without valid checksums, and disaster
> > will ensue.
> >
> > I think this could be hard to prevent if checksums are turned on and
> > off multiple times.
> >
> >
> > Do we ever make hintbit changes on the standby for example? If so, it
> > would definitely cause problems. I didn't realize we did, actually...
> >
>
> I don't think we do. SetHintBits does TransactionIdIsValid(xid) and
> AFAIK that can't be true on a standby.
>
> > I guess we could get there even if we don't by:
> > * All checksums are correct
> > * Checkums are disabled (which replicates)
> > * Non-WAL logged change on the master, which updates checksum but does
> > *not* replicate
> > * Checksums re-enabled
> > * Worker sees the checksum as correct, and thus does not force a full
> > page write.
> > * Worker completes and flips checksums on which replicates. At this
> > point, if the replica reads the page, boom.
> >
>
> Maybe.
>
> My understanding of Robert's example is that you can start with an
> instance that has wal_log_hints=off, and so pages on master/standby may
> not be 100% identical. Then we do the online checksum thing, and the
> standby may get pages with incorrect checksums.
>

No, in that case the master will issue full page writes for *all* pages,
since they dind't hvae a checksum. The current patch only avoids doing that
if the checksum on the master is correct, which it isn't when you start
from checksums=off.  So this particular problem only shows up if you
iterate between off/on/off multiple times.




> > I guess we have to remove that optimisation. It's definitely a
> > bummer, but I don't think it's an absolute dealbreaker.
> >
>
> I agree it's not a deal-breaker. Or at least I don't see why it should
> be - any other maintenance activity on the database (freezing etc.) will
> also generate full-page writes.
>

Yes.



> The good thing is the throttling also limits the amount of WAL, so it's
> possible to prevent generating too many checkpoints etc.
>
> I suggest we simply:
>
> 1) set the checksums to in-progress
> 2) wait for a checkpoint
> 3) use the regular logic for full-pages (i.e. first change after
> checkpoint does a FPW)
>

This is very close to what it does now, except it does not wait for a
checkpoint in #2. Why does it need that?


BTW speaking of checkpoints, I see ChecksumHelperLauncherMain does
>
> RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT | \
>   CHECKPOINT_IMMEDIATE);
>
> I'm rather unhappy about that - immediate checkpoints have massive
> impact on production systems, so we try not doing them (That's one of
> the reasons why CREATE DATABASE is somewhat painful). It usually
> requires a bit of thinking about when to do such commands. But in this
> case it's unpredictable when exactly the checksumming completes, so it
> may easily be in the middle of peak activity.
>
> Why not to simply wait for regular spread checkpoint, the way
> pg_basebackup does it?
>

Actually, that was my original idea. I changed it for testing, and shuld go
change it back.



> We could say that we keep the optimisation if wal_level=minimal for
> > example, because then we know there is no replica. But I doubt
> > that's worth it?
> >
>
> If it doesn't require a lot of code, why not? But I don't really see
> much point in doing that.
>
>
Yeah, I doubt there are a lot of people using "minimal" these days, not
since we changed the default.

//Magnus

Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.

2018-03-02 Thread David Gould

On Fri, 2 Mar 2018 18:47:44 +0300
Alexander Kuzmenkov  wrote:

> The calculation I made for the first step applies to the next steps too, 
> with minor differences. So, the estimate increases at each step. Just 
> out of interest, I plotted the reltuples for 60 steps, and it doesn't 
> look like it's going to converge anytime soon (see attached).
> Looking at the formula, this overshoot term is created when we multiply 
> the old density by the new number of pages. I'm not sure how to fix 
> this. I think we could average the number of tuples, not the densities. 
> The attached patch demonstrates what I mean.

I'm confused at this point, I provided a patch that addresses this and a
test case. We seem to be discussing everything as if we first noticed the
issue. Have you reviewed the patch and and attached analysis and tested it?
Please commment on that?

Thanks.

Also, here is a datapoint that I found just this morning on a clients
production system:

INFO:  "staging_xyz": scanned 3 of   pages, containing 63592 live rows and 
964346 dead rows;
3 rows in sample, 1959918155 estimated total rows

# select (5953.0/3*63592)::int as nrows;
   nrows  
---
 105988686

This tables reltuples is 18 times the actual row count. It will never converge
because with 5953 pages analyze can only adjust reltuples by 0.0006 each 
time.

It will also almost never get vacuumed because the autovacuum threshold of
0.2 * 1959918155 = 391983631 about 3.7 times larger than the actual row count.

The submitted patch is makes analyze effective in setting reltuples to within
a few percent of the count(*) value.

-dg

-- 
David Gould   da...@sonic.net
If simplicity worked, the world would be overrun with insects.

Re: heap_lock_updated_tuple_rec can leak a buffer refcount

Alexander Kuzmenkov  writes:
> Looks like a leak indeed, the fix seems right.

Yup, it's a leak.  It's hard to hit because you need to be starting
with an update of a tuple in an all-visible page; otherwise we never
pin the vm page so there's nothing to leak.  But if you lobotomize
the test a few lines above so that it always pins the vm page, then
the regression tests (specifically combocid) reveal the leak, and
show that the proposed patch indeed fixes it.

However ... with said lobotomization, the isolation tests trigger an
Assert(BufferIsPinned(buffer)) inside visibilitymap_pin, showing that
there's another bug here too.  That seems to be because at the bottom
of the outer loop, we do

if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);

and then loop back around with vmbuffer still not equal to InvalidBuffer.
This causes the next loop iteration's visibilitymap_pin call to think it
needs to release that vmbuffer pin a second time; kaboom.

And eyeing this, I see still a third problem: if either of the "goto l4"
jumps occur, we'll loop back to l4 with vmbuffer possibly pinned, and then
if the new page isn't all-visible, we'll just set vmbuffer = InvalidBuffer
and leak the pin that way.  (If it is all-visible, we unpin the old page
correctly in the visibilitymap_pin call, but that can charitably be
described as accidental.)

In short, this is pretty darn broken.  We need to treat the vmbuffer
variable honestly as state that may persist across either the outer loop
or the "goto l4" sub-loop.  Furthermore, it's not really cool to use
"vmbuffer == InvalidBuffer" as the indicator of whether we acquired the
vmbuffer pin pre-lock.  To do that, we'd be forced to drop the old pin in
the not-all-visible path, even though we might need it right back again.
Also, remembering that one vm page serves many heap pages, even if we have
a vm pin for the "wrong" page it's conceivable that it'd be the right one
for the next time we actually need it.  So we should use the
visibilitymap_pin API the way it's meant to be used, and hold any vm pin
we've acquired until the very end of the function.

Hence, I propose the attached patch.  The test lobotomization
(the "if (1) //" change) isn't meant for commit but shows how I tested
the take-the-pin paths.  This passes make check-world with or without
the lobotomization change.

regards, tom lane

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index dc762f9..42eac78 100644
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
*** heap_lock_updated_tuple_rec(Relation rel
*** 5677,5682 
--- 5677,5683 
  new_xmax;
  	TransactionId priorXmax = InvalidTransactionId;
  	bool		cleared_all_frozen = false;
+ 	bool		pinned_desired_page;
  	Buffer		vmbuffer = InvalidBuffer;
  	BlockNumber block;
  
*** heap_lock_updated_tuple_rec(Relation rel
*** 5698,5704 
  			 * chain, and there's no further tuple to lock: return success to
  			 * caller.
  			 */
! 			return HeapTupleMayBeUpdated;
  		}
  
  l4:
--- 5699,5706 
  			 * chain, and there's no further tuple to lock: return success to
  			 * caller.
  			 */
! 			result = HeapTupleMayBeUpdated;
! 			goto out_unlocked;
  		}
  
  l4:
*** l4:
*** 5710,5719 
  		 * someone else might be in the middle of changing this, so we'll need
  		 * to recheck after we have the lock.
  		 */
! 		if (PageIsAllVisible(BufferGetPage(buf)))
  			visibilitymap_pin(rel, block, );
  		else
! 			vmbuffer = InvalidBuffer;
  
  		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
  
--- 5712,5724 
  		 * someone else might be in the middle of changing this, so we'll need
  		 * to recheck after we have the lock.
  		 */
! 		if (1) // (PageIsAllVisible(BufferGetPage(buf)))
! 		{
  			visibilitymap_pin(rel, block, );
+ 			pinned_desired_page = true;
+ 		}
  		else
! 			pinned_desired_page = false;
  
  		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
  
*** l4:
*** 5722,5729 
  		 * all visible while we were busy locking the buffer, we'll have to
  		 * unlock and re-lock, to avoid holding the buffer lock across I/O.
  		 * That's a bit unfortunate, but hopefully shouldn't happen often.
  		 */
! 		if (vmbuffer == InvalidBuffer && PageIsAllVisible(BufferGetPage(buf)))
  		{
  			LockBuffer(buf, BUFFER_LOCK_UNLOCK);
  			visibilitymap_pin(rel, block, );
--- 5727,5739 
  		 * all visible while we were busy locking the buffer, we'll have to
  		 * unlock and re-lock, to avoid holding the buffer lock across I/O.
  		 * That's a bit unfortunate, but hopefully shouldn't happen often.
+ 		 *
+ 		 * Note: in some paths through this function, we will reach here
+ 		 * holding a pin on a vm page that may or may not be the one matching
+ 		 * this page.  If this page isn't all-visible, we won't use the vm
+ 		 * page, but we hold onto such a pin till the end of

Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2018-03-02 Thread Pavel Stehule

2018-03-02 3:43 GMT+01:00 Pavel Stehule :

>
>
> 2018-03-02 3:38 GMT+01:00 Andres Freund :
>
>> On 2018-03-02 03:13:04 +0100, Pavel Stehule wrote:
>> > 2018-03-01 23:10 GMT+01:00 Andres Freund :
>> >
>> > > On 2018-01-23 17:08:56 +0100, Pavel Stehule wrote:
>> > > > 2018-01-22 23:15 GMT+01:00 Stephen Frost :
>> > > > > This really could use a new thread, imv.  This thread is a year
>> old and
>> > > > > about a completely different feature than what you've implemented
>> here.
>> > > > >
>> > > >
>> > > > true, but now it is too late
>> > >
>> > > At the very least the CF entry could be renamed moved out the
>> procedual
>> > > language category?
>> > >
>> >
>> > Why not?
>>
>> Because the patch adds GUCs that don't have a direct connection
>> toprocedual languages?  And the patch's topic still says "plpgsql plan
>> cache behave" which surely is misleading.
>>
>> Seems fairly obvious that neither category nor name is particularly
>> descriptive of the current state?
>>
>>
> ok
>
>
>> > Have you idea, what category is best?
>>
>> Server Features? Misc?  And as a title something about "add GUCs to
>> control custom plan logic"?
>>
>>
> I'll move it there.
>

done

Pavel


>
> Regards
>
> Pavel
>
>>
>> Greetings,
>>
>> Andres Freund
>>
>
>

Re: Re: [HACKERS] plpgsql - additional extra checks

2018-03-02 Thread Pavel Stehule

Hi

2018-03-01 21:14 GMT+01:00 David Steele :

> Hi Pavel,
>
> On 1/7/18 3:31 AM, Pavel Stehule wrote:
> >
> > There, now it's in the correct Waiting for Author state. :)
> >
> > thank you for comments. All should be fixed in attached patch
>
> This patch no longer applies (and the conflicts do not look trivial).
> Can you provide a rebased patch?
>
> $ git apply -3 ../other/plpgsql-extra-check-180107.patch
> error: patch failed: src/pl/plpgsql/src/pl_exec.c:5944
> Falling back to three-way merge...
> Applied patch to 'src/pl/plpgsql/src/pl_exec.c' with conflicts.
> U src/pl/plpgsql/src/pl_exec.c
>
> Marked as Waiting on Author.
>

I am sending updated code. It reflects Tom's changes - now, the rec is used
as row type too, so the checks must be on two places. With this update is
related one change. When result is empty, then the extra checks doesn't
work - PLpgSQL runtime doesn't pass necessary tupledesc. But usually, when
result is empty, then there are not problems with missing values, because
every value is NULL.

Regards

Pavel



>
> Thanks,
> --
> -David
> da...@pgmasters.net
>
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index c1e3c6a19d..1657ec3fb5 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -4987,7 +4987,7 @@ a_output := a_output || $$ if v_$$ || referrer_keys.kind || $$ like '$$
 
   
   
-   Additional Compile-time Checks
+   Additional Compile-time and Run-time Checks
 

 To aid the user in finding instances of simple but common problems before
@@ -4999,26 +4999,62 @@ a_output := a_output || $$ if v_$$ || referrer_keys.kind || $$ like '$$
 so you are advised to test in a separate development environment.

 
- 
-  These additional checks are enabled through the configuration variables
-  plpgsql.extra_warnings for warnings and
-  plpgsql.extra_errors for errors. Both can be set either to
-  a comma-separated list of checks, "none" or "all".
-  The default is "none". Currently the list of available checks
-  includes only one:
-  
-   
-shadowed_variables
-
- 
-  Checks if a declaration shadows a previously defined variable.
- 
-
-   
-  
+   
+Setting plpgsql.extra_warnings, or
+plpgsql.extra_errors, as appropriate, to all
+is encouraged in development and/or testing environments.
+   
+
+   
+These additional checks are enabled through the configuration variables
+plpgsql.extra_warnings for warnings and
+plpgsql.extra_errors for errors. Both can be set either to
+a comma-separated list of checks, "none" or
+"all". The default is "none". Currently
+the list of available checks includes only one:
+
+ 
+  shadowed_variables
+  
+   
+Checks if a declaration shadows a previously defined variable.
+   
+  
+ 
 
-  The following example shows the effect of plpgsql.extra_warnings
-  set to shadowed_variables:
+ 
+  strict_multi_assignment
+  
+   
+Some PL/PgSQL commands allow assigning values
+to more than one variable at a time, such as SELECT INTO.  Typically,
+the number of target variables and the number of source variables should
+match, though PL/PgSQL will use NULL for
+missing values and extra variables are ignored.  Enabling this check
+will cause PL/PgSQL to throw a WARNING or
+ERROR whenever the number of target variables and the number of source
+variables are different.
+   
+  
+ 
+
+ 
+  too_many_rows
+  
+   
+Enabling this check will cause PL/PgSQL to
+check if a given query returns more than one row when an
+INTO clause is used.  As an INTO statement will only
+ever use one row, having a query return multiple rows is generally
+either inefficient and/or nondeterministic and therefore is likely an
+error.
+   
+  
+ 
+
+
+The following example shows the effect of plpgsql.extra_warnings
+set to shadowed_variables:
 
 SET plpgsql.extra_warnings TO 'shadowed_variables';
 
@@ -5034,8 +5070,38 @@ LINE 3: f1 int;
 ^
 CREATE FUNCTION
 
- 
- 
+The below example shows the effects of setting
+plpgsql.extra_warnings to
+strict_multi_assignment:
+
+SET plpgsql.extra_warnings TO 'strict_multi_assignment';
+
+CREATE OR REPLACE FUNCTION public.foo()
+ RETURNS void
+ LANGUAGE plpgsql
+AS $$
+DECLARE
+  x int;
+  y int;
+BEGIN
+  SELECT 1 INTO x, y;
+  SELECT 1, 2 INTO x, y;
+  SELECT 1, 2, 3 INTO x, y;
+END;
+$$
+
+SELECT foo();
+WARNING:  Number of evaluated fields does not match expected.
+HINT:  strict_multi_assignement check of extra_warnings is active.
+WARNING:  Number of evaluated fields does not match expected.
+HINT:  strict_multi_assignement check of extra_warnings is active.
+ foo 
+-
+ 
+(1 row)
+
+   
+  
  
 
   
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index

Re: Cached/global query plans, autopreparation

2018-03-02 Thread Pavel Stehule

2018-03-02 21:51 GMT+01:00 Tom Lane :

> Andres Freund  writes:
> > On 2018-03-02 15:29:09 -0500, Bruce Momjian wrote:
> >> While I have heard people complain about how other databases cache
> >> prepare plans, I have heard few complaints about the Postgres approach,
> >> and I haven't even heard of people asking to control the documented
> "five
> >> or more" behavior.
>
> > This *constantly* is a problem.
>
> Yeah, I've certainly heard complaints about it.  I do agree with
> Bruce's conclusion that we should try to improve that behavior;
> but it's not entirely clear how.  (A user-frobbable knob isn't
> necessarily the best answer.)
>

Can be this problem reduced if we can count number of possible paths?

Maybe it can work for some simple queries, what is majority in pgbench.

When I migrate from Oracle, there was a issue slow planning of very complex
views - probably optimization on most common values can work well.

Still I have a idea about some optimization based not on searching the best
plan of one parameter vektor, but for searching the best plan for all
possible vectors - or best worst case plan.

I don't think so this issue is solvable without changing optimization
method.

Or don't lost time with probably useless work and move forward to dynamic
execution - for example - dynamic switch from nested loop, to hashjoin to
mergejoin ...

regards, tom lane
>
>

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

On 03/02/2018 09:21 PM, David Steele wrote:
> On 3/2/18 3:06 PM, Robert Haas wrote:
>> On Thu, Mar 1, 2018 at 9:33 PM, Tomas Vondra
>>  wrote:
>>> Ah, apologies - that's due to moving the patch from the last CF (it was
>>> marked as RWF so I had to reopen it before moving it). I'll submit a new
>>> version of the patch shortly, please mark it as WOA until then.
>>
>> So, the way it's supposed to work is you resubmit the patch first and
>> then re-activate the CF entry.  If you get to re-activate the CF entry
>> without actually updating the patch, and then submit the patch
>> afterwards, then the CF deadline becomes largely meaningless.  I think
>> a new patch should rejected as untimely.
> 
> Hmmm, I missed that implication last night.  I'll mark this Returned
> with Feedback.
> 
> Tomas, please move to the next CF once you have an updated patch.
> 

Can you guys please point me to the CF rules that say this? Because my
understanding (and not just mine, AFAICS) was obviously different.
Clearly there's a disconnect somewhere.

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Cached/global query plans, autopreparation

Andres Freund  writes:
> On 2018-03-02 15:29:09 -0500, Bruce Momjian wrote:
>> While I have heard people complain about how other databases cache
>> prepare plans, I have heard few complaints about the Postgres approach,
>> and I haven't even heard of people asking to control the documented "five
>> or more" behavior.

> This *constantly* is a problem.

Yeah, I've certainly heard complaints about it.  I do agree with
Bruce's conclusion that we should try to improve that behavior;
but it's not entirely clear how.  (A user-frobbable knob isn't
necessarily the best answer.)

regards, tom lane

Re: [HACKERS] Creating backup history files for backups taken from standbys

Hi,

On 3/2/18 1:03 PM, Fujii Masao wrote:
> On Fri, Mar 2, 2018 at 1:07 PM, Michael Paquier  wrote:
> 
>> We would talk about two backups running
>> simultaneously on a standby, which would overlap with each other to
>> generate a file aimed only at being helpful for debugging purposes, and
>> we provide no information now for backups taken from standbys.  We could
>> of course make that logic a bit smarter by checking if there is an
>> extsing file with the same name and create a new file with a different
>> name.  But is that worth the complication? That's where I am not
>> convinced, and that's the reason why this patch is doing things this
>> way.
> 
> What about including the backup history file in the base backup instead of
> creating it in the standby's pg_wal and archiving it? As the good side effect
> of this approach, we can use the backup history file for debugging purpose
> even when WAL archiving is not enabled.

I don't think this is a good idea, even if it does solve the race
condition.  First, it would be different from the way primary-style
backups generate this file.  Second, these files would not be excluded
from future restore / backup cycles so they could build up after a while
and cause confusion.  I guess we could teach PG to delete them or
pg_basebackup to ignore them, but that doesn't seem worth the effort.

Regards,
-- 
-David
da...@pgmasters.net

Re: Cached/global query plans, autopreparation

On 2018-03-02 15:29:09 -0500, Bruce Momjian wrote:
> Postgres uses a conservative method for reusing plans with previous
> constants, as described in the PREPARE manual page:
> 
>   https://www.postgresql.org/docs/10/static/sql-prepare.html
>   Prepared statements can use generic plans rather than re-planning with
>   each set of supplied EXECUTE values. This occurs immediately for 
> prepared
>   statements with no parameters; otherwise it occurs only after five or 
> more
>   executions produce plans whose estimated cost average (including 
> planning
>   overhead) is more expensive than the generic plan cost estimate. Once
>   a generic plan is chosen, it is used for the remaining lifetime of the
>   prepared statement. Using EXECUTE values which are rare in columns with
>   many duplicates can generate custom plans that are so much cheaper than
>   the generic plan, even after adding planning overhead, that the generic
>   plan might never be used.
> 
> While I have heard people complain about how other databases cache
> prepare plans, I have heard few complaints about the Postgres approach,
> and I haven't even heard of people asking to control the documented "five
> or more" behavior.

This *constantly* is a problem.


Greetings,

Andres Freund

Re: Cached/global query plans, autopreparation

2018-03-02 Thread Bruce Momjian

On Thu, Feb 15, 2018 at 03:00:17PM +0100, Shay Rojansky wrote:
> Just wanted to say that I've seen more than 10% improvement in some real-world
> application when preparation was done properly. Also, I'm assuming that
> implementing this wouldn't involve "rewriting substantial part of Postgres
> core", and that even 10% is quite a big gain, especially if it's a 
> transparent/
> free one as far as the user is concerned (no application changes).

I would like to step back on this issue.  Ideally, every query would get
re-optimized because we can only be sure the plan is optimal when we use
supplied constants to generate the plan.  But, of course, parsing and
planning take time, so there ideally would be an way to avoid it.  The
question is always how much time will be saved by avoiding
parsing/planning, and what risk is there of suboptimal plans.

Postgres uses a conservative method for reusing plans with previous
constants, as described in the PREPARE manual page:

https://www.postgresql.org/docs/10/static/sql-prepare.html
Prepared statements can use generic plans rather than re-planning with
each set of supplied EXECUTE values. This occurs immediately for 
prepared
statements with no parameters; otherwise it occurs only after five or 
more
executions produce plans whose estimated cost average (including 
planning
overhead) is more expensive than the generic plan cost estimate. Once
a generic plan is chosen, it is used for the remaining lifetime of the
prepared statement. Using EXECUTE values which are rare in columns with
many duplicates can generate custom plans that are so much cheaper than
the generic plan, even after adding planning overhead, that the generic
plan might never be used.

While I have heard people complain about how other databases cache
prepare plans, I have heard few complaints about the Postgres approach,
and I haven't even heard of people asking to control the documented "five
or more" behavior.

I also know that other database products have more sophisticated prepare
usage, but they might have higher parse/plan overhead, or they might be
more flexible in handling specialized workloads, which Postgres might
not want to handle, given the costs/complexity/overhead.

So, the directions for improvement are:

1  Improve the existing "five or more" behavior
2  Automatically prepare queries that are not sent as prepared queries
3  Share plans among sessions

While #1 would be nice, #2 increases the number of applications that can
silently benefit from prepared queries, and #3 improves the number of
cases that query plans can be reused.  The issue with #3 is that the
constants used are no longer local to the session (which is the same
issue with connection poolers reusing prepared plans).  When different
sessions with potentially more varied constants reuse plans, the
probability of suboptimal plans increases.

I think the fact that pgbench shows a 2x improvement for prepared
statements, and real world reports are a 10% improvement means we need
to have a better understanding of exactly what workloads can benefit
from this, and a comprehensive analysis of all three areas of
improvement.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +

Re: Testing "workers launched" in expected output? Really?

On Fri, Mar 2, 2018 at 3:13 PM, Tom Lane  wrote:
> My point is that just because it isn't falling over on
> relatively-lightly-loaded buildfarm machines doesn't mean that it won't
> fall over in other environments.

It doesn't mean it will, either.

> I should think that your recent experience with postgres_fdw (which is
> evidently still broken, BTW, see rhinoceros) would have convinced you
> that environmentally dependent regression test results are something to
> studiously avoid.

I suspect the problem with the test is that it's testing for a plan
that is only slightly better than the next-best plan, and so it takes
very little to push it over the edge.  That's sort of an environment
issue, but it's not exactly the same thing you're worried about here.

> We have enough trouble with unforeseen test deltas;
> putting in tests that have a blatantly obvious failure mechanism is
> just taunting the software gods.  Who tend to take their revenge in
> unpleasant ways.
>
> If you insist on actual failures, perhaps I'll go set up about four
> concurrently-scheduled animals on prairiedog's host, and wait to see
> what happens ...

I will be curious to see the results of that experiment.  Unless you
press the machine so hard that fork() starts returning EAGAIN, I don't
see why that should cause this to fail.  And if you do that, then it's
going to fail far beyond the ability of suppressing Workers Launched
to cover the problem up.  However, it's certainly possible that you've
thought of some failure mode that I've missed, or that there is one
which neither of us has considered.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

On 3/2/18 3:06 PM, Robert Haas wrote:
> On Thu, Mar 1, 2018 at 9:33 PM, Tomas Vondra
>  wrote:
>> Ah, apologies - that's due to moving the patch from the last CF (it was
>> marked as RWF so I had to reopen it before moving it). I'll submit a new
>> version of the patch shortly, please mark it as WOA until then.
> 
> So, the way it's supposed to work is you resubmit the patch first and
> then re-activate the CF entry.  If you get to re-activate the CF entry
> without actually updating the patch, and then submit the patch
> afterwards, then the CF deadline becomes largely meaningless.  I think
> a new patch should rejected as untimely.

Hmmm, I missed that implication last night.  I'll mark this Returned
with Feedback.

Tomas, please move to the next CF once you have an updated patch.

Thanks,
-- 
-David
da...@pgmasters.net

Re: 2018-03 Commitfest Summary (Andres #1)

2018-03-02 Thread Peter Geoghegan

On Fri, Mar 2, 2018 at 1:47 AM, Fabien COELHO  wrote:
> On the "adequate return" point, my opinion is that currently pgbench is just
> below the feature set needed to be generally usable, so not improving it is
> a self-fullfilling ensurance that it will not be used further. Once the
> "right" feature set is reached (for me, at least extracting query output
> into variables, having conditionals, possibly a few more functions if some
> benches use them), whether it would be actually more widely used by both
> developers and users is an open question.

FWIW, I think that pgbench would become a lot more usable if someone
maintained a toolset for managing pgbench. Something similar to Greg
Smith's pgbench-tools project, but with additional features for
instrumenting the server. There would be a lot of value in integrating
it with third party tooling, such as perf and BCC, and in making it
easy for non-experts to run relevant, representative tests.

Things like the rate limiting and alternative distributions were
sorely needed, but there are diminishing returns. It's pretty clear to
me that much of the remaining low hanging fruit is outside of pgbench
itself. None of the more recent pgbench enhancements seem to make it
easier to use.

-- 
Peter Geoghegan

Re: Testing "workers launched" in expected output? Really?

Robert Haas  writes:
> On Fri, Mar 2, 2018 at 2:28 PM, Tom Lane  wrote:
>> ... now I am on the warpath.  I have no idea whether or not the diff
>> here is significant --- maybe it is --- but I am desperately unhappy
>> that we have expected-output files that will fail if fewer than the
>> expected number of workers launched.

> Unless this is causing actual failures I don't think we should change
> it.  It would be very sad if we started routinely getting Workers
> Launched < Workers Planned due to some newly-introduced bug and had no
> idea it was happening because we'd hidden it to avoid imaginary
> buildfarm failures.

My point is that just because it isn't falling over on
relatively-lightly-loaded buildfarm machines doesn't mean that it won't
fall over in other environments.  You don't want packagers cursing your
name while trying to push out an emergency security fix because the
regression tests are unstable in their environment.  Or worse, they might
adopt the expedient of skipping "make check" in their package recipes,
which is not a good idea from anyone's standpoint.

I should think that your recent experience with postgres_fdw (which is
evidently still broken, BTW, see rhinoceros) would have convinced you
that environmentally dependent regression test results are something to
studiously avoid.  We have enough trouble with unforeseen test deltas;
putting in tests that have a blatantly obvious failure mechanism is
just taunting the software gods.  Who tend to take their revenge in
unpleasant ways.

If you insist on actual failures, perhaps I'll go set up about four
concurrently-scheduled animals on prairiedog's host, and wait to see
what happens ...

regards, tom lane

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

On Thu, Mar 1, 2018 at 9:33 PM, Tomas Vondra
 wrote:
> Ah, apologies - that's due to moving the patch from the last CF (it was
> marked as RWF so I had to reopen it before moving it). I'll submit a new
> version of the patch shortly, please mark it as WOA until then.

So, the way it's supposed to work is you resubmit the patch first and
then re-activate the CF entry.  If you get to re-activate the CF entry
without actually updating the patch, and then submit the patch
afterwards, then the CF deadline becomes largely meaningless.  I think
a new patch should rejected as untimely.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

Hi,

On 2018-03-01 21:39:36 -0500, David Steele wrote:
> On 3/1/18 9:33 PM, Tomas Vondra wrote:
> > On 03/02/2018 02:12 AM, Andres Freund wrote:
> > > Hm, this CF entry is marked as needs review as of 2018-03-01 12:54:48,
> > > but I don't see a newer version posted?
> > > 
> > 
> > Ah, apologies - that's due to moving the patch from the last CF (it was
> > marked as RWF so I had to reopen it before moving it). I'll submit a new
> > version of the patch shortly, please mark it as WOA until then.
> 
> Marked as Waiting on Author.

Sorry to be the hard-ass, but given this patch hasn't been moved forward
since 2018-01-19, I'm not sure why it's elegible to be in this CF in the
first place?

Greetings,

Andres Freund

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

Julian Markwort  writes:
> Andres Freund wrote on 2018-03-02:
>> and I'd checked that 1.5 already exists. But you just renamed the file,
>> presumably because it's essentially rewriting the whole file?  I'm not
>> sure I'm a big fan of doing so, because that makes testing the upgrade
>> path more work.

> You're right about 1.5 already existing. I wasn't sure about the versioning 
> policy for extensions and seeing as version 1.5 only changed a few grants I 
> quasi reused 1.5 for my intentions.

Nope, that's totally wrong.  You can get away with that if we've not
already shipped a 1.5 release --- but we did ship it in v10, so that
version is frozen now.  You need to make your changes in a 1.5--1.6
upgrade file.  Otherwise there's no clean path for existing installations
to upgrade to the new version.

regards, tom lane

Re: postgres_fdw: perform UPDATE/DELETE .. RETURNING on a join directly

On Fri, Mar 2, 2018 at 1:20 PM, Robert Haas  wrote:
> On Fri, Mar 2, 2018 at 5:35 AM, Etsuro Fujita
>  wrote:
>> Agreed.  Better safe than sorry, so I disabled autovacuum for all the tables
>> created in the postgres_fdw regression test, except the ones with no data
>> modification created in the sections "test handling of collations" and "test
>> IMPORT FOREIGN SCHEMA".  I'm attaching a patch for that.
>
> I have committed this patch.  Whee!

And that seems to have made things worse.  The failures on rhinocerous
look related:

https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=rhinoceros=HEAD

And so does this failure on chub:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=chub=2018-03-02%2016%3A10%3A02

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Testing "workers launched" in expected output? Really?

On Fri, Mar 2, 2018 at 2:28 PM, Tom Lane  wrote:
> So buildfarm member piculet just fell over like this:
>
> == pgsql.build/src/test/regress/regression.diffs 
> ==
> *** 
> /home/andres/build/buildfarm-piculet/HEAD/pgsql.build/../pgsql/src/test/regress/expected/select_parallel.out
> 2018-02-28 16:10:01.986941733 +
> --- 
> /home/andres/build/buildfarm-piculet/HEAD/pgsql.build/src/test/regress/results/select_parallel.out
>   2018-03-02 19:13:57.843939790 +
> ***
> *** 485,495 
>   QUERY PLAN
>   --
>Aggregate (actual rows=1 loops=1)
> !->  Nested Loop (actual rows=98000 loops=1)
>->  Seq Scan on tenk2 (actual rows=10 loops=1)
>  Filter: (thousand = 0)
>  Rows Removed by Filter: 9990
> !  ->  Gather (actual rows=9800 loops=10)
>  Workers Planned: 4
>  Workers Launched: 4
>  ->  Parallel Seq Scan on tenk1 (actual rows=1960 loops=50)
> --- 485,495 
>   QUERY PLAN
>   --
>Aggregate (actual rows=1 loops=1)
> !->  Nested Loop (actual rows=97836 loops=1)
>->  Seq Scan on tenk2 (actual rows=10 loops=1)
>  Filter: (thousand = 0)
>  Rows Removed by Filter: 9990
> !  ->  Gather (actual rows=9784 loops=10)
>  Workers Planned: 4
>  Workers Launched: 4
>  ->  Parallel Seq Scan on tenk1 (actual rows=1960 loops=50)
>
> ==
>
> and now I am on the warpath.  I have no idea whether or not the diff
> here is significant --- maybe it is --- but I am desperately unhappy
> that we have expected-output files that will fail if fewer than the
> expected number of workers launched.  I find that absolutely
> unacceptable.  It reminds me entirely too much of when I had to package
> MySQL for Red Hat, and half the time the package builds failed in
> Red Hat's buildfarm, because their tests weren't robust about passing
> on heavily loaded machines.  I won't stand for our tests becoming
> like that.
>
> Perhaps we could deal with this by suppressing the Workers Planned/
> Launched lines when we are suppressing costs?

/me blinks.

So you're not upset about the thing that actually caused the failure,
which looks very likely to be a bug in the commits I pushed today, but
are instead upset about something that didn't cause a failure but
which is shiny and nearby?

Unless this is causing actual failures I don't think we should change
it.  It would be very sad if we started routinely getting Workers
Launched < Workers Planned due to some newly-introduced bug and had no
idea it was happening because we'd hidden it to avoid imaginary
buildfarm failures.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: WIP Patch: Precalculate stable functions, infrastructure v1

Hi,

On 2018-03-02 11:22:01 +0300, Marina Polyakova wrote:
> I fixed the failure that Thomas pointed out to me, and I'm finishing work on
> it, but it took me a while to study this part of the executor..

I unfortunately think that makes this too late for v11, and we should
mark this as returned with feedback.

Greetings,

Andres Freund

Re: 2018-03 Commitfest Summary (Andres #3)

On 2018-03-02 12:22:15 +0300, Ildus Kurbangaliev wrote:
> On Thu, 1 Mar 2018 20:34:11 -0800
> Andres Freund  wrote:
> > - Custom compression methods
> > 
> >   RFC. While marked as ready for committer, I think this is mostly
> >   because some high level design input is needed.  It's huge, and so
> > far only seems to add already existing compression support.
> > 
> >   Not seing this for v11.

> This patch is not about adding new compression algorithms, it's about
> adding a new access method type which could be used for new compression
> methods.

Yes, but that's not a contradiction to what I wrote? Currently we don't
gain any practical improvements for users with this patch, no?  It'd
"just" allow extension authors to provide benefit.

Greetings,

Andres Freund

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

2018-03-02 Thread Julian Markwort

Andres Freund wrote on 2018-03-02:
> Yea, I misread the diff to think you added a conflicting version. Due
> to:
> -DATA =3D pg_stat_statements--1.4.sql pg_stat_statements--1.4--1.5.sql \
> +DATA =3D pg_stat_statements--1.5.sql pg_stat_statements--1.4--1.5.sql \

> and I'd checked that 1.5 already exists. But you just renamed the file,
> presumably because it's essentially rewriting the whole file?  I'm not
> sure I'm a big fan of doing so, because that makes testing the upgrade
> path more work.

You're right about 1.5 already existing. I wasn't sure about the versioning 
policy for extensions and seeing as version 1.5 only changed a few grants I 
quasi reused 1.5 for my intentions.

> What workload did you run? read/write or readonly? This seems like a
> feature were readonly makes a lot more sense. But ~1800 tps strongly
> suggests that's not what you did?

I'm sorry I forgot to mention this; I ran all tests as read-write.

> > With pg_stat_statements on, the latter test (10 minutes) resulted in 1833 
> > tps, while the patched version resulted in 1700 tps, so a little over 7% 
> > overhead? Well, the "control run", without pg_stat_statements delivered 
> > only 1806 tps, so variance seems to be quite high.

> That's quite some overhead, I'd say.

Yes, but I wouldn't give a warranty that it is neither more nor less overhead 
than 7%, seeing as for my testing, the tps were higher for (unmodified) pgss 
enabled vs no pgss at all.

> > If anybody has any recommendations for a setup that generates less 
> > variance, I'll try this again.

> I'd suggest disabling turboost, in my experience that makes tests
> painful to repeat, because it'll strongly depend on the current HW
> temperature.
This might be a problem for average systems but I'm fairly certain this isn't 
the issue here.

I might try some more benchmarks and will in particular look into running 
read-only tests, as the aforementioned 840 EVO SSD ist -comparatively speaking- 
pretty slow.
Do you have any recommendations as to what constitutes adequate testing times 
regarding pgbench?

Kind regards
Julian

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

Andres Freund  writes:
> Yea, I misread the diff to think you added a conflicting version. Due
> to:
> -DATA =3D pg_stat_statements--1.4.sql pg_stat_statements--1.4--1.5.sql \
> +DATA =3D pg_stat_statements--1.5.sql pg_stat_statements--1.4--1.5.sql \

> and I'd checked that 1.5 already exists. But you just renamed the file,
> presumably because it's essentially rewriting the whole file?  I'm not
> sure I'm a big fan of doing so, because that makes testing the upgrade
> path more work.

Yeah, that is not project style anymore.  Just add a delta file and leave
it at that.  See e.g. de1d042f5979bc1388e9a6d52a4d445342b04932 for an
example.

(We might from time to time roll up the deltas and replace the base
file with a newer version, but I think that should happen in separate
housekeeping commits.)

regards, tom lane

Re: 2018-03 Commitfest Summary (Andres #1)

Bcc: 
Reply-To: 

Hi,

On 2018-03-02 10:47:01 +0100, Fabien COELHO wrote:
> For instance, I used extensively tps throttling, latencies and timeouts
> measures when developping and testing the checkpointer sorting & throttling
> patch.

That doesn't say that much about proposed feature additions, we didn't
say that feature isn't useful?

> I can stop doing both if the project decides that improving pgbench
> capabilities is against its interest.

That's not what we said. There's a difference between "we do not want to
improve pgbench" and "the cost/benefit balance seems to have shifted
over time, and the marginal benefit of proposed features isn't that
high".

> As pgbench patches can stay ready-to-committers for half a dozen CF, I'm not
> sure the strain on the committer time is that heavy:-)

That's just plain wrong. Even patches that linger cost time and attention.

> > > As already said, the motivation is that it is a preparation for a (much)
> > > larger patch which would move pgbench expressions to fe utils and use them
> > > in "psql".
> > 
> > You could submit it together with that.
> 
> Sure, I could. My previous experience is that maintaining a set of dependent
> patches is tiresome, and it does not help much with testing and reviewing
> either.

I'm exactly of the opposite opinion. Submitting things out of context,
without seeing at least drafts of later patches, is a lot more work and
doesn't allow to see the big picture.

> So I'm doing things one (slow) step at a time, especially as each
> time I've submitted patches which were doing more than one thing I was asked
> to disentangle features and restructuring.

There's a difference between maintaining a set of patches in a queue,
nicely split up, and submitting them entirely independently.

Greetings,

Andres Freund

Re: [HACKERS] pgbench randomness initialization

Fabien COELHO  writes:
>> Hm ... so I tried to replicate this problem, and failed to: the log files
>> get made under the VPATH build directory, as desired, even without this
>> patch.  Am I doing something wrong, or is this platform-dependent somehow?

> As I recall, it indeed works if the source directories are rw, but fails 
> if they are ro because then the local mkdir fails. So you would have to do 
> a vpath build with sources that are ro to get the issue the patch is 
> fixing.

Ah, right you are.  Apparently, if the source is rw, the temporary files
in question are made there but immediately deleted, so the bug isn't
obvious.

Fix verified and pushed.

regards, tom lane

Re: 2018-03 Commitfest Summary (Andres #1)

Hi,

On 2018-03-02 11:06:12 +0100, Fabien COELHO wrote:
> A lot of patches do not even get a review: no immediate interest or more
> often no ressources currently available, patch stays put, I'm fine with
> that.

Well, even if that's the case that's not free of cost to transport them
from fest to fest. Going through them continually isn't free. At some
point we just gotta decide it's an undesirable feature.

> Now, some happy patches actually get reviews and are switched to ready,
> which shows that somebody saw enough interest in them to spend some time to
> improve them.
> 
> If committers ditch these reviewed patches on weak ground (eg "I do not need
> this feature so nobody should need it"), it is both in contradiction with
> the fact that someone took the time to review it, and is highly demotivating
> for people who do participate to the reviewing process and contribute to
> hopefully improve these patches, because the reviewing time just goes to the
> drain in the end even when the patch is okay.

The consequence of this appears to be that we should integrate
everything that anybody decided worthy enough to review. That just
doesn't make sense, we can't maintain the project that way, nor will the
design be even remotely coherent.

Sure it's a balancing act, but nobody denies that.

> So for me killing ready patches in the end of the process and on weak ground
> can only make the process worse. The project is shooting itself in the foot,
> and you cannot complain later that there is not enough reviewers.

How would you want it to work otherwise? Merge everything that somebody
found review worthy? Have everything immediately reviewed by committers?
Neither of those seem realistic.

Greetings,

Andres Freund

Re: [HACKERS] Early locking option to parallel backup

On 2018-03-02 11:42:06 -0500, Robert Haas wrote:
> On Fri, Mar 2, 2018 at 2:29 AM, Andres Freund  wrote:
> > There seems to to be consensus in this thread that the approach Lucas
> > proposed isn't what we want, and that instead some shared lock based
> > approach is desirable.  As that has been the case for ~1.5 months, I
> > propose we mark this as returned with feedback?
> 
> Yes, that seems pretty clear-cut to me.  It would be totally unfair if
> a patch that hasn't been updated since November were allowed to submit
> a new version after the start of the final CommitFest.  We shouldn't
> be working on anything now that hasn't been under active development
> recently; we have enough things (and then some) that have.

Done.

Re: [Patch] Checksums for SLRU files

On 2018-03-02 11:49:05 -0500, Robert Haas wrote:
> On Thu, Mar 1, 2018 at 8:25 PM, Andres Freund  wrote:
> > On 2018-02-02 11:37:34 +1300, Thomas Munro wrote:
> >> > 3. pg_upgrade isn't considered.  This patch should provide upgrading 
> >> > SLRUs
> >> > to adopt changed useful size of page.  That seems to be hardest patch of
> >> > this patch to be written.
> >>
> >> +1
> >>
> >> I think we'd want pg_upgrade tests showing an example of each SLRU
> >> growing past one segment, and then being upgraded, and then being
> >> accessed in various different pages and segment files, so that we can
> >> see that we're able to shift the data to the right place successfully.
> >> For example I think I'd want to see that a single aborted transaction
> >> surrounded by many committed transactions shows up in the right place
> >> after an upgrade.
> >
> > This patch is in the 2018-03 CF, but I don't see any progress since the
> > last comments. As it has been Waiting on author since the last CF, I
> > think we should mark this as returned with feedback.
> 
> +1.

Done.

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

On 2018-03-02 18:07:32 +0100, Julian Markwort wrote:
> Andres Freund wrote on 2018-03-01:
> > I think the patch probably doesn't apply anymore, due to other changes
> > to pg_stat_statements since its posting. Could you refresh?
> 
> pgss_plans_v02.patch applies cleanly to master, there were no changes to 
> pg_stat_statements since the copyright updates at the beginning of January.
> (pgss_plans_v02.patch is attached to message 
> 1bd396a9-4573-55ad-7ce8-fe7adffa1...@uni-muenster.de and can be found in the 
> current commitfest as well.)

Yea, I misread the diff to think you added a conflicting version. Due
to:
-DATA =3D pg_stat_statements--1.4.sql pg_stat_statements--1.4--1.5.sql \
+DATA =3D pg_stat_statements--1.5.sql pg_stat_statements--1.4--1.5.sql \

and I'd checked that 1.5 already exists. But you just renamed the file,
presumably because it's essentially rewriting the whole file?  I'm not
sure I'm a big fan of doing so, because that makes testing the upgrade
path more work.

> I've tried to gather some meaningful results, however either my testing 
> methodology was flawed (as variance between all my passes of pgbench was 
> rather high) or the takeaway is that the feature only generates little 
> overhead.
> This is what I've run on my workstation using a Ryzen 1700 and 16GB of RAM 
> and an old Samsung 840 Evo as boot drive, which also held the database:
> The database used for the tests was dropped and pgbench initialized anew for 
> each test (pgss off, pgss on, pgss on with plan collection) using a scaling 
> of 16437704*0.003~=50 (roughly what the phoronix test suite uses for a buffer 
> test).
> Also similar to the phoronix test suite, I used 8 jobs and 32 connections for 
> a normal multithreaded load.
> 
> I then ran 10 passes, each for 60 seconds, with a 30 second pause between 
> them, as well as another test which ran for 10 minutes.

What workload did you run? read/write or readonly? This seems like a
feature were readonly makes a lot more sense. But ~1800 tps strongly
suggests that's not what you did?


> With pg_stat_statements on, the latter test (10 minutes) resulted in 1833 
> tps, while the patched version resulted in 1700 tps, so a little over 7% 
> overhead? Well, the "control run", without pg_stat_statements delivered only 
> 1806 tps, so variance seems to be quite high.

That's quite some overhead, I'd say.


> If anybody has any recommendations for a setup that generates less variance, 
> I'll try this again.

I'd suggest disabling turboost, in my experience that makes tests
painful to repeat, because it'll strongly depend on the current HW
temperature.

Greetings,

Andres Freund

Testing "workers launched" in expected output? Really?

So buildfarm member piculet just fell over like this:

== pgsql.build/src/test/regress/regression.diffs 
==
*** 
/home/andres/build/buildfarm-piculet/HEAD/pgsql.build/../pgsql/src/test/regress/expected/select_parallel.out
2018-02-28 16:10:01.986941733 +
--- 
/home/andres/build/buildfarm-piculet/HEAD/pgsql.build/src/test/regress/results/select_parallel.out
  2018-03-02 19:13:57.843939790 +
***
*** 485,495 
  QUERY PLAN
  --
   Aggregate (actual rows=1 loops=1)
!->  Nested Loop (actual rows=98000 loops=1)
   ->  Seq Scan on tenk2 (actual rows=10 loops=1)
 Filter: (thousand = 0)
 Rows Removed by Filter: 9990
!  ->  Gather (actual rows=9800 loops=10)
 Workers Planned: 4
 Workers Launched: 4
 ->  Parallel Seq Scan on tenk1 (actual rows=1960 loops=50)
--- 485,495 
  QUERY PLAN
  --
   Aggregate (actual rows=1 loops=1)
!->  Nested Loop (actual rows=97836 loops=1)
   ->  Seq Scan on tenk2 (actual rows=10 loops=1)
 Filter: (thousand = 0)
 Rows Removed by Filter: 9990
!  ->  Gather (actual rows=9784 loops=10)
 Workers Planned: 4
 Workers Launched: 4
 ->  Parallel Seq Scan on tenk1 (actual rows=1960 loops=50)

==

and now I am on the warpath.  I have no idea whether or not the diff
here is significant --- maybe it is --- but I am desperately unhappy
that we have expected-output files that will fail if fewer than the
expected number of workers launched.  I find that absolutely
unacceptable.  It reminds me entirely too much of when I had to package
MySQL for Red Hat, and half the time the package builds failed in
Red Hat's buildfarm, because their tests weren't robust about passing
on heavily loaded machines.  I won't stand for our tests becoming
like that.

Perhaps we could deal with this by suppressing the Workers Planned/
Launched lines when we are suppressing costs?

regards, tom lane

Re: [PATCH] get rid of StdRdOptions, use individual binary reloptions representation for each relation kind instead

Hi,

On 2018-03-02 20:22:21 +0300, Nikolay Shaplov wrote:
> Since I get a really big patch as a result, it was decided to commit it in 
> parts.

I get that, but I strongly suggest not creating 10 loosely related
threads, but keeping it as a patch series in one thread. It's really
hard to follow for people not continually paying otherwise.  Having to
search the list, collect together multiple threads, trying to read them
in some chronological order... That's not a reasonably efficient way to
interact, therefore the likelihood of timely review will be reduced.

Greetings,

Andres Freund

Re: Kerberos test suite

2018-03-02 Thread Robbie Harwood

Thomas Munro  writes:

> Peter Eisentraut  wrote:
>> On 2/27/18 00:56, Thomas Munro wrote:
>>> FWIW it passes for me if I add this:
>>>
>>> +elsif ($^O eq 'freebsd')
>>> +{
>>> +   $krb5_bin_dir = '/usr/local/bin';
>>> +   $krb5_sbin_dir = '/usr/local/sbin';
>>
>> I suppose you only need the second one, right?  The first one should be
>> in the path.
>
> Erm.  Turned out I had a couple of different versions in my path, but
> your test only works with the binaries in the paths shown above (the
> binaries installed from ports, or rather pkg install krb5, not the
> ones from the base system).
>
> munro@asterix $ /usr/bin/krb5-config --version
> FreeBSD heimdal 1.1.0

This one's Heimdal; the test is intended to only work with MIT.  Perhaps
it should check the output of krb5-config instead of failing
confusingly?

> munro@asterix $ /usr/local/bin/krb5-config --version
> Kerberos 5 release 1.15.2

Thanks,
--Robbie


signature.asc
Description: PGP signature

Re: Kerberos test suite

2018-03-02 Thread Robbie Harwood

Michael Paquier  writes:

> On Wed, Feb 14, 2018 at 09:27:04AM -0500, Peter Eisentraut wrote:
>
>> (If it appears to hang for you in the "setting up Kerberos" step, you
>> might need more entropy/wait a while.  That problem appears to be
>> limited to some virtual machine setups, but the specifics are not
>> clear.)
>
> That's one of those "move your mouse" or "type randomly your keyboard"
> to generate more entropy for the installation setup?

Right.

Whether this message can show up will depend on how the krb5 was built
(and how new it is).  krb5 > 1.15 has support for getrandom(), so on
most Linux distros, if the machine has successfully ever created 256
bits of entropy, it won't block.  On other platforms, use of
/dev/urandom requires a build flag.

Thanks,
--Robbie

signature.asc
Description: PGP signature

Re: IndexJoin memory problem using spgist and boxes

2018-03-02 Thread Alexander Kuzmenkov


Hi Anton,

I can reproduce the high memory consumption with your queries.

Looking at the patch, I see that you changed the lifetime of the 
temporary context from per-tuple to per-index-scan. It is not obvious 
that this change is correct. Could you explain, what memory context are 
involved in the scan, and what their lifetimes are, before and after 
your changes? What are these memory allocations that are causing the 
high consumption, and what code makes them? This will make it easier to 
understand your changes.




https://www.postgresql.org/message-id/flat/CAPqRbE5vTGWCGrOc91Bmu-0o7CwsU0UCnAshOtpDR8cSpSjy0g%40mail.gmail.com#capqrbe5vtgwcgroc91bmu-0o7cwsu0ucnashotpdr8cspsj...@mail.gmail.com


Also, this link doesn't open for me.

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] Partition-wise aggregation/grouping

On Thu, Mar 1, 2018 at 4:52 PM, Robert Haas  wrote:
> This is not a full review, but I'm out of time for right now.

Another thing I see here now is that create_grouping_paths() and
create_child_grouping_paths() are extremely similar.  Isn't there some
way we can refactor things so that we can reuse the code instead of
duplicating it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] log_destination=file

Robert Haas  writes:
> On Mon, Jan 22, 2018 at 4:25 PM, Tomas Vondra
>  wrote:
>> Sorry for the naive question, but which of these bottlenecks are we
>> actually hitting? I don't recall dealing with an actual production
>> system where the log collector would be an issue, so I don't have a very
>> good idea where the actual bottleneck is in this case.

> Unfortunately, neither do I -- but I do believe that the problem is real.

Not to state the obvious ... but if we do not know where the bottleneck
is, isn't it a bit premature to be proposing changes to "fix" it?

But that argument just leads to the same conclusion that seems to be
prevalent at the moment, ie boot this to v12.

regards, tom lane

Re: [HACKERS] log_destination=file

On Mon, Jan 22, 2018 at 4:25 PM, Tomas Vondra
 wrote:
> Sorry for the naive question, but which of these bottlenecks are we
> actually hitting? I don't recall dealing with an actual production
> system where the log collector would be an issue, so I don't have a very
> good idea where the actual bottleneck is in this case.

Unfortunately, neither do I -- but I do believe that the problem is real.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: WIP: a way forward on bootstrap data

John Naylor  writes:
> Version 8, rebased against 76b6aa41f41d.

I took a preliminary look through this, without yet attempting to execute
the script against HEAD.  I have a few thoughts:

* I'm inclined not to commit the conversion scripts to the repo.  I doubt
there are third parties out there with a need for them, and if they do
need them they can get 'em out of this thread in the mailing list
archives.  (If anyone has a different opinion about that, speak up!)

* I notice you have a few "preliminary cleanup" changes here and there
in the series, such as fixing the inconsistent spelling of
Anum_pg_init_privs_initprivs.  These could be applied before we start
the main conversion process, and I'm inclined to do that since we could
get 'em out of the way early.  Ideally, the main conversion would only
touch the header files and related scripts/makefiles.

* I'm a little disturbed by the fact that 0002 has to "re-doublequote
values that are macros expanded by initdb.c".  I see that there are only
a small number of affected places, so maybe it's not really worth working
harder, but I worry that something might get missed.  Is there any way to
include this consideration in the automated conversion, or at least to
verify that we found all the places to quote?  Or, seeing that 0004 seems
to be introducing some quoting-related hacks to genbki.pl anyway, maybe
we could take care of the issue there?

* In 0003, I'd recommend leaving the re-indentation to happen in the next
perltidy run (assuming perltidy would fix that, which I think is true but
I might be wrong).  It's just creating more review work to do it here.
In any case, the patch summary line is pretty misleading since it's
*not* just reindenting, but also refactoring genbki.pl.  (BTW, if that
refactoring would work on the script as it is, maybe that's another
thing we could do early?  The more we can do before "flag day", the
better IMO.)

* In 0006, I'm not very pleased with the introduction of
"Makefile.headers".  I'd keep those macros where they are in
catalog/Makefile.  backend/Makefile doesn't need to know about that,
especially since it's doing an unconditional invocation of
catalog/Makefile anyway.  It could just do something like

submake-schemapg:
$(MAKE) -C catalog generated-headers

and leave it to catalog/Makefile to know what needs to happen for
both schemapg.h and the other generated files.

Overall, though, this is looking pretty promising.

regards, tom lane

Re: postgres_fdw: perform UPDATE/DELETE .. RETURNING on a join directly

On Fri, Mar 2, 2018 at 5:35 AM, Etsuro Fujita
 wrote:
> Agreed.  Better safe than sorry, so I disabled autovacuum for all the tables
> created in the postgres_fdw regression test, except the ones with no data
> modification created in the sections "test handling of collations" and "test
> IMPORT FOREIGN SCHEMA".  I'm attaching a patch for that.

I have committed this patch.  Whee!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PATCH] Verify Checksums during Basebackups

On Fri, Mar 2, 2018 at 6:23 AM, Magnus Hagander  wrote:
> Another quick note -- we need to assert that the size of the buffer is
> actually divisible by BLCKSZ. I don't think it's a common scenario, but it
> could break badly if somebody changes BLCKSZ. Either that or perhaps just
> change the TARSENDSIZE to be a multiple of BLCKSZ.

I think that this patch needs to support all block sizes that are
otherwise supported -- failing an assertion doesn't seem like a
reasonable option, unless it only happens for block sizes we don't
support anyway.

+1 for the feature in general.  I think this would help a lot of peple.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] Creating backup history files for backups taken from standbys

2018-03-02 Thread Fujii Masao

On Fri, Mar 2, 2018 at 12:26 PM, Andres Freund  wrote:
> Hi!
>
> On 2018-03-02 02:29:13 +0900, Fujii Masao wrote:
>> On Fri, Feb 2, 2018 at 2:06 PM, Michael Paquier
>>  wrote:
>> > On Fri, Feb 02, 2018 at 12:47:26AM +0900, Fujii Masao wrote:
>> >> The patch basically looks good to me. Here are some small comments.
>> >>
>> >>
>> >>   The backup history file is not created in the database cluster 
>> >> backed up.
>> >>
>> >>
>> >> The above should be deleted in pg_basebackup.sgml.
>> >>
>> >> * During recovery, since we don't use the end-of-backup WAL
>> >> * record and don't write the backup history file, the
>> >>
>> >> This comment needs to be updated in xlog.c.
>> >
>> > Thanks Fujii-san for the review.  Indeed those portions need a refresh..
>>
>> Thanks for updating the patch!
>
> You'd claimed this patch as a committer. You're still planning to commit
> it?

Yes. Thanks for the ping!

Regards,

-- 
Fujii Masao

Re: [HACKERS] Creating backup history files for backups taken from standbys

2018-03-02 Thread Fujii Masao

On Fri, Mar 2, 2018 at 1:07 PM, Michael Paquier  wrote:
> On Fri, Mar 02, 2018 at 02:29:13AM +0900, Fujii Masao wrote:
>> + * write a backup history file with the same name.
>>
>> So more than one backup history files with the same name
>> but the diffferent content can be created and archived.
>> Isn't this problematic because the backup history file that
>> users want to use later might be overwritten unexpectedly?
>
> Yeah, that's the intention behind the patch.  Would that actually happen
> in practice though?

Yes, I think. During recovery, all the pg_basebackup would use the same
backup starting point and create the backup history file with the same name
until the next restartpoint runs. So unless the restartpoint interval is short,
ISTM that such a problematic case can happen even in practice. No?

> We would talk about two backups running
> simultaneously on a standby, which would overlap with each other to
> generate a file aimed only at being helpful for debugging purposes, and
> we provide no information now for backups taken from standbys.  We could
> of course make that logic a bit smarter by checking if there is an
> extsing file with the same name and create a new file with a different
> name.  But is that worth the complication? That's where I am not
> convinced, and that's the reason why this patch is doing things this
> way.

What about including the backup history file in the base backup instead of
creating it in the standby's pg_wal and archiving it? As the good side effect
of this approach, we can use the backup history file for debugging purpose
even when WAL archiving is not enabled.

Regards,

-- 
Fujii Masao

Re: heap_lock_updated_tuple_rec can leak a buffer refcount

2018-03-02 Thread Alexander Kuzmenkov

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

Looks like a leak indeed, the fix seems right.

Re: [HACKERS] [POC] Faster processing at Gather node

On Wed, Feb 28, 2018 at 10:06 AM, Robert Haas  wrote:
> [ latest patches ]

Committed.  Thanks for the review.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PATCH] get rid of StdRdOptions, use individual binary reloptions representation for each relation kind instead

2018-03-02 Thread Nikolay Shaplov

В письме от 1 марта 2018 16:15:32 пользователь Andres Freund написал:

> > This is part or my bigger patch
> > https://www.postgresql.org/message-id/flat/2146419.veIEZdk4E4@x200m#21464
> > 19.veIEZdk4E4@x200m we've decided to commit by smaller parts.
> 
> I've not read that thread. Is this supposed to be a first step towards
> something larger?

I've started from the idea of opclass options. Same as here 
https://commitfest.postgresql.org/17/1559/
And found out that current reloptions code is not flexible at all, and could 
not be easily reused for other kind of options (that are not reloptions).

So I tried to change reloptions code to make universal, and then use it for 
any option purposes.

In the patch https://commitfest.postgresql.org/14/992/ I created a new file 
options.c, that works with options as with an abstraction. It has some option 
definition structure, that has all information about how this options should 
be parsed and transformed into binary representation, I called this structure 
OptionsCatalog, but name can be changed. And it also have all kind of 
functions that do all needed tranformation with options.

Then reloptions.c uses functions from options options.c for all options 
transformations, and takes care about relation specificity of reloptions.

While doing all of these, I first of all had to adjust code that was written 
around reloptions, so that there code would stay in tune with new reloptions 
code. And second, I met places where current reloption related code were 
imperfect, and I decided to change it too...

Since I get a really big patch as a result, it was decided to commit it in 
parts.

This patch is the adjusting of reloption related code. It does not change a 
great deal, but when you first commit it, the main reloptions refactoring 
patch will no longer look like a big mess, it will became much more logical.

Enum options patch is more about making coder a little bit more perfect. It is 
not really needed for the main patch, but it is more easy to introduce it 
before, then after.
https://commitfest.postgresql.org/17/1489/

There is one more patch, that deals with toast.* options for tables with no 
TOAST, but we can set it aside, it is independent enough, to deal with it 
later... 
https://commitfest.postgresql.org/17/1486/

The patch with reloptions tests I've written while working on this patch, were 
already commited
https://commitfest.postgresql.org/15/1314/


> > So for example if you set custom fillfactor value for some index, then it
> > will lead to allocating 84 bytes of memory (sizeof StdRdOptions on i386)
> > and only 8 bytes will be actually used (varlena header and fillfactor
> > value). 74 bytes is not much, but allocating them for each index for no
> > particular reason is bad idea, as I think.
> I'm not sure this is a particularly strong motivator though?  Does the
> patch have a goal besides saving a few bytes?

My real motivation for this patch is to make code more uniform. So the patch 
over it will be much clearer. And to make result code more clear and uniform 
too.

I guess long time ago, the StdRdOptions were the way to make code uniform. 
There were only one binary representation of options, and all relation were 
using it. Quite uniform.

But now, some relation kinds uses it's own options binary representations. 
Some still uses StdRdOptions, but only some elements from it. It is not 
uniform enough for me.

If start using individual binary representation for each relation kind, then 
code will be uniform and homogeneous again, and my next patch over it will be 
more easy to read and understand. 

> > Moreover when I wrote my big reloption refactoring patch, I came to "one
> > reloption kind - one binary representation" philosophy. It allows to write
> > code with more logic in it.
> I have no idea what this means?
I hope I managed to explain it above... May be I am not good enough with 
explaining, or with English. Or with explaining in English. If it is not clear 
enough, please tell me, what points are not clear, I'll try to explain more...


> Are you aiming this for v11 or v12?
I hope to commit StdRdOptions and Enum_Options patches before v11. 
Hope it does not go against any rules, since there patches does not change any 
big functionality, just do some refactoring and optimization, of the code that 
already exists.

As for main reloptions refactoring patch, if there is a chance to get commited 
before v11 release, it would be great. If not, I hope to post it to 
commitfest, right after v11 were released, so it would be possible to commit 
opclass options upon it before v12 release. 


-- 
Do code for fun.

signature.asc
Description: This is a digitally signed message part.

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

On 03/02/2018 04:29 AM, Andres Freund wrote:
> Hi,
> 
> On 2018-02-24 23:01:59 +0100, Tomas Vondra wrote:
>> Sadly, this patch series does not seem to move forward very much, and
>> I'm not sure how to change that :-/
> 
> What's your estimate about the patchset's maturity?
> 

It's dying of old age.

On a more serious note, I think the basics are pretty solid - both the
theory and the code (which mostly builds on what was introduced by the
CREATE STATISTICS thing in PG10).

I'm sure there are things to fix, but I don't expect radical reworks.
There are limitations I'd like to relax (say, allowing expressions
etc.), but those are clearly PG12 stuff at this point.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: WIP: BRIN multi-range indexes

On 03/02/2018 05:08 AM, Tom Lane wrote:
> Andres Freund  writes:
>> On 2018-02-25 01:30:47 +0100, Tomas Vondra wrote:
>>> Note: Currently, this only works with float8-based data types.
>>> Supporting additional data types is not a big issue, but will
>>> require extending the opclass with "subtract" operator (used to
>>> compute distance between values when merging ranges).
> 
>> Based on Tom's past stances I'm a bit doubtful he'd be happy with such a
>> restriction.  Note that something similar-ish also has come up in
>> 0a459cec96.
> 

That restriction was lifted quite a long time ago, so now both index
types support pretty much the same data types as the original BRIN (with
the reltime/abstime exception, discussed in this thread earlier).

>> I kinda wonder if there's any way to not have two similar but not
>> equal types of logic here?
> 
> Hm. I wonder what the patch intends to do with subtraction overflow, 
> or infinities, or NaNs. Just as with the RANGE patch, it does not 
> seem to me that failure is really an acceptable option. Indexes are 
> supposed to be able to index whatever the column datatype can store.
> 

I admit that's something I haven't thought about very much. I'll look
into that, of course, but the indexes are only using the deltas to pick
which ranges to merge, so I think in the worst case it may results in
sub-optimal index. But let me check what the RANGE patch did.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: jsonlog logging only some messages?

2018-03-02 Thread Greg Stark

On 27 February 2018 at 16:50, Greg Stark  wrote:
> On 27 February 2018 at 02:04, Michael Paquier  wrote:
>> On Mon, Feb 26, 2018 at 05:38:56PM +, Greg Stark wrote:
>>
>> Hm.  I have just loaded jsonlog on a 9.6 and 10 instance where
>> log_checkpoints is enabled with this background worker which logs a
>> simple string every 10s:
>> https://github.com/michaelpq/pg_plugins/tree/master/hello_world
>>
>> Both checkpoint logs and the logs of the bgworker are showing up for me.
>
> Weird. I guess I have some more debugging with gdb to do.

I think I see what's going on. The log_min_messages is set to the
default value of warning. Which is a higher level than "LOG" so this
code in jsonlog.c is ignoring these messages:

if (edata->elevel < log_min_messages)
return;

But the normal processing for logs uses is_log_level_output to compare
error levels with log_min_messages which treats LOG (and
LOG_SERVER_ONLY) as logically equivalent to ERROR:

/*
 * is_log_level_output -- is elevel logically >= log_min_level?
 *
 * We use this for tests that should consider LOG to sort out-of-order,
 * between ERROR and FATAL.  Generally this is the right thing for testing
 * whether a message should go to the postmaster log, whereas a simple >=
 * test is correct for testing whether the message should go to the client.
 */
static bool
is_log_level_output(int elevel, int log_min_level)
{
if (elevel == LOG || elevel == LOG_SERVER_ONLY)
{
if (log_min_level == LOG || log_min_level <= ERROR)
return true;
}
else if (log_min_level == LOG)
{
/* elevel != LOG */
if (elevel >= FATAL)
return true;
}
/* Neither is LOG */
else if (elevel >= log_min_level)
return true;

return false;
}

-- 
greg

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

2018-03-02 Thread Julian Markwort

Andres Freund wrote on 2018-03-01:
> I think the patch probably doesn't apply anymore, due to other changes
> to pg_stat_statements since its posting. Could you refresh?

pgss_plans_v02.patch applies cleanly to master, there were no changes to 
pg_stat_statements since the copyright updates at the beginning of January.
(pgss_plans_v02.patch is attached to message 
1bd396a9-4573-55ad-7ce8-fe7adffa1...@uni-muenster.de and can be found in the 
current commitfest as well.)

> I've not done any sort of review. Scrolling through I noticed //
> comments which aren't pg coding style.

I'll fix that along with any other problems that might be found in a review.


> I'd like to see a small benchmark showing the overhead of the feature.
> Both in runtime and storage size.

I've tried to gather some meaningful results, however either my testing 
methodology was flawed (as variance between all my passes of pgbench was rather 
high) or the takeaway is that the feature only generates little overhead.
This is what I've run on my workstation using a Ryzen 1700 and 16GB of RAM and 
an old Samsung 840 Evo as boot drive, which also held the database:
The database used for the tests was dropped and pgbench initialized anew for 
each test (pgss off, pgss on, pgss on with plan collection) using a scaling of 
16437704*0.003~=50 (roughly what the phoronix test suite uses for a buffer 
test).
Also similar to the phoronix test suite, I used 8 jobs and 32 connections for a 
normal multithreaded load.

I then ran 10 passes, each for 60 seconds, with a 30 second pause between them, 
as well as another test which ran for 10 minutes.

With pg_stat_statements on, the latter test (10 minutes) resulted in 1833 tps, 
while the patched version resulted in 1700 tps, so a little over 7% overhead? 
Well, the "control run", without pg_stat_statements delivered only 1806 tps, so 
variance seems to be quite high.

The results of the ten successive tests, each running 60 seconds and then 
waiting for 30 seconds, are displayed in the attached plot.
I've tinkered with different settings with pgbench for quite some time now and 
all I can come up with are runs with high variance between them.

If anybody has any recommendations for a setup that generates less variance, 
I'll try this again.

Finally, the more interesting metric regarding this patch is the size of the 
pg_stat_statements.stat file, which stores all the metrics while the database 
is shut down. I reckon that the size of pgss_query_texts.stat (which holds only 
the query strings and plan strings while the database is running) will be 
similar, however it might fluctuate more as new strings are simply appended to 
the file until the garbagecollector decides that it has to be cleaned up.
After running the aforementioned tests, the file was 8566 bytes in size for 
pgss in it's unmodified form, while the tests resulted in 32607 bytes for the 
pgss that collects plans as well. This seems reasonable as plans strings are 
usually longer than the statements from which they result. Worst case, the 
pg_stat_statements.stat holds two plans for each type of statement.
I've not tested the length of the file with different encodings, such as JSON, 
YAML, or XML, however I do not expect any hugely different results.

Greetings
Julian


pgss_plans_pgbench.pdf
Description: Adobe PDF document

Re: New gist vacuum.

2018-03-02 Thread Andrey Borodin



> 2 марта 2018 г., в 21:25, Tom Lane  написал(а):
> 
> Andrey Borodin  writes:
>> So, I agree, unconditional counting is a good idea. Here's the v3 patch.
> 
> Pushed with trivial cosmetic adjustments.
> 
> I've marked the CF entry as committed; please make a new CF entry in
> 2018-09 for the other patch.  I'd also suggest starting a new email
> thread for that.  Linking the CF entry to a years-old thread doesn't
> make it easy for people to find what's the current submission.

Thanks, Tom!

Yes, I'll definitely start new thread for that patch. This thread had split 
unexpectedly, and I see it's not a convenient way. I've learned no to do it 
this way anymore :)

Best regards, Andrey Borodin.

Re: Rewrite of pg_dump TAP tests

2018-03-02 Thread Stephen Frost

Andres,

* Andres Freund (and...@anarazel.de) wrote:
> On 2018-02-26 13:15:04 -0500, Stephen Frost wrote:
> > Attached is a patch (which applies cleaning against a2a2205, but not so
> > much anymore, obviously, but I will fix after the releases) which
> > greatly improves the big pg_dump TAP tests.  There's probably more which
> > can be done, but I expect people will be much happier with this.  The
> > cliff-notes are:
> 
> Could you update?

Working on it and will provide an update, hopefully by the end of the
weekend.

> Do you think this should be backpatched so we can backpatch tests when
> backpatching fixes?

That's an interesting question.  I hadn't planned to and it wouldn't be
trivial, but if others would like to see these changes back-patched then
I'm willing to put in the work to do it.

Thanks!

Stephen

signature.asc
Description: PGP signature

Re: [PROPOSAL] Nepali Snowball dictionary

2018-03-02 Thread Arthur Zakirov

On Thu, Mar 01, 2018 at 10:23:11PM -0800, Andres Freund wrote:
> What is that entry for, if I may ask?  We need to wait for them to merge
> it, then sync the snowball code, including the nepali dictionary. This
> doesn't realistically seem doable for this commitfest. Therefore I think
> this should be marked as 'returned with feedback'.

I understand the point. I marked the patch as 'Returned with feedback'
by myself.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

Re: Online enabling of checksums

On 03/02/2018 02:35 PM, Magnus Hagander wrote:
> 
> 
> On Wed, Feb 28, 2018 at 6:06 PM, Robert Haas  > wrote:
> 
> On Sun, Feb 25, 2018 at 9:54 AM, Magnus Hagander
> > wrote:
> > Also if that wasn't clear -- we only do the full page write if there 
> isn't
> > already a checksum on the page and that checksum is correct.
> 
> Hmm.
> 
> Suppose that on the master there is a checksum on the page and that
> checksum is correct, but on the standby the page contents differ in
> some way that we don't always WAL-log, like as to hint bits, and there
> the checksum is incorrect.  Then you'll enable checksums when the
> standby still has some pages without valid checksums, and disaster
> will ensue.
> 
> I think this could be hard to prevent if checksums are turned on and
> off multiple times.
> 
> 
> Do we ever make hintbit changes on the standby for example? If so, it
> would definitely cause problems. I didn't realize we did, actually...
> 

I don't think we do. SetHintBits does TransactionIdIsValid(xid) and
AFAIK that can't be true on a standby.

> I guess we could get there even if we don't by:
> * All checksums are correct
> * Checkums are disabled (which replicates)
> * Non-WAL logged change on the master, which updates checksum but does
> *not* replicate
> * Checksums re-enabled
> * Worker sees the checksum as correct, and thus does not force a full
> page write.
> * Worker completes and flips checksums on which replicates. At this
> point, if the replica reads the page, boom.
> 

Maybe.

My understanding of Robert's example is that you can start with an
instance that has wal_log_hints=off, and so pages on master/standby may
not be 100% identical. Then we do the online checksum thing, and the
standby may get pages with incorrect checksums.

> I guess we have to remove that optimisation. It's definitely a
> bummer, but I don't think it's an absolute dealbreaker.
> 

I agree it's not a deal-breaker. Or at least I don't see why it should
be - any other maintenance activity on the database (freezing etc.) will
also generate full-page writes.

The good thing is the throttling also limits the amount of WAL, so it's
possible to prevent generating too many checkpoints etc.

I suggest we simply:

1) set the checksums to in-progress
2) wait for a checkpoint
3) use the regular logic for full-pages (i.e. first change after
checkpoint does a FPW)

BTW speaking of checkpoints, I see ChecksumHelperLauncherMain does

RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT | \
  CHECKPOINT_IMMEDIATE);

I'm rather unhappy about that - immediate checkpoints have massive
impact on production systems, so we try not doing them (That's one of
the reasons why CREATE DATABASE is somewhat painful). It usually
requires a bit of thinking about when to do such commands. But in this
case it's unpredictable when exactly the checksumming completes, so it
may easily be in the middle of peak activity.

Why not to simply wait for regular spread checkpoint, the way
pg_basebackup does it?

> We could say that we keep the optimisation if wal_level=minimal for 
> example, because then we know there is no replica. But I doubt
> that's worth it?
> 

If it doesn't require a lot of code, why not? But I don't really see
much point in doing that.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [Patch] Checksums for SLRU files

On Thu, Mar 1, 2018 at 8:25 PM, Andres Freund  wrote:
> On 2018-02-02 11:37:34 +1300, Thomas Munro wrote:
>> > 3. pg_upgrade isn't considered.  This patch should provide upgrading SLRUs
>> > to adopt changed useful size of page.  That seems to be hardest patch of
>> > this patch to be written.
>>
>> +1
>>
>> I think we'd want pg_upgrade tests showing an example of each SLRU
>> growing past one segment, and then being upgraded, and then being
>> accessed in various different pages and segment files, so that we can
>> see that we're able to shift the data to the right place successfully.
>> For example I think I'd want to see that a single aborted transaction
>> surrounded by many committed transactions shows up in the right place
>> after an upgrade.
>
> This patch is in the 2018-03 CF, but I don't see any progress since the
> last comments. As it has been Waiting on author since the last CF, I
> think we should mark this as returned with feedback.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Rangejoin rebased

On Fri, Mar 2, 2018 at 11:12 AM, Alexander Kuzmenkov
 wrote:
> On 16.01.2018 10:49, Jeff Davis wrote:
>> My proposed fix is to make an internal opfamily identical to the
>> external one, such that it's not recognized as part of the same EC,
>> and the planner won't try to eliminate it. It loses out on potential
>> optimizations, but those are mostly theoretical since the btree
>> opclass ordering for ranges is not very interesting to a user.
>
> I think I figured out what to do with missing sort directions. We can change
> select_outer_pathkeys_for_merge() to generate the pathkeys we need. Also,
> find_mergeclauses_for_outer_pathkeys() has to be changed too, so that it
> knows which pathkeys are compatible to which range join clauses.
>
> About the patch, do I understand it right that you are working on the next
> version now?

I think we are quite clearly past the deadline to submit a new patch
for inclusion in v11 at this point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: psql tab completion for ALTER INDEX SET

2018-03-02 Thread Fujii Masao

On Fri, Mar 2, 2018 at 3:52 PM, Masahiko Sawada  wrote:
> Hi,
>
> I found that tab completion for ALTER INDEX SET [tab] doesn't support
> the reloptions of brin and gist. Attached patch adds "buffering",
> "pages_per_range" and "autosummarize" options.

Thanks for the patch! Committed.

Regards,

-- 
Fujii Masao

Re: [HACKERS] Early locking option to parallel backup

On Fri, Mar 2, 2018 at 2:29 AM, Andres Freund  wrote:
> There seems to to be consensus in this thread that the approach Lucas
> proposed isn't what we want, and that instead some shared lock based
> approach is desirable.  As that has been the case for ~1.5 months, I
> propose we mark this as returned with feedback?

Yes, that seems pretty clear-cut to me.  It would be totally unfair if
a patch that hasn't been updated since November were allowed to submit
a new version after the start of the final CommitFest.  We shouldn't
be working on anything now that hasn't been under active development
recently; we have enough things (and then some) that have.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Online enabling of checksums