Re: Fix compilation failure against LLVM 11

2020-04-26 Thread Michael Paquier
On Sat, Apr 25, 2020 at 09:41:20PM -0700, Jesse Zhang wrote:
> I searched my inbox and the archive, strange that nobody else is seeing
> this.
> 
> Turns out that LLVM has recently removed "llvm/IR/CallSite.h" in
> (unreleased) version 11 [1][2]. To fix the build I tried conditionally
> (on LLVM_VERSION_MAJOR < 11) including CallSite.h, but that looks yuck.
> Then I poked at llvmjit_inline.cpp a bit and found that CallSite.h
> doesn't seem to be really necessary. PFA a patch that simply removes
> this #include.
> 
> In addition, I've done the due diligence of trying to build against
> LLVM versions 8, 9, 10.
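
For reference, the conditional include Jesse describes trying (and rejecting
as ugly) would look roughly like this -- a sketch of the workaround, not the
attached patch; LLVM_VERSION_MAJOR comes from the LLVM configuration headers
that llvmjit_inline.cpp already pulls in:

    #if LLVM_VERSION_MAJOR < 11
    #include <llvm/IR/CallSite.h>
    #endif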

LLVM 11 has not been released yet.  Do you think that this part, or even
more, is subject to change before the 11 release?  My take would be to wait
longer before fixing this issue and to make sure that our code is fixed
when their code is GA'd.
--
Michael




Re: WAL usage calculation patch

2020-04-26 Thread Michael Paquier
On Mon, Apr 27, 2020 at 08:35:51AM +0530, Amit Kapila wrote:
> On Thu, Apr 23, 2020 at 2:35 PM Amit Kapila  wrote:
>> On Thu, Apr 23, 2020 at 12:16 PM Peter Eisentraut 
>>  wrote:
>>>  The internal symbol for the WAL record is
>>> XLOG_FPI and xlogdesc.c prints it as "FPI".
> 
> Julien, Peter, others do you have any opinion here?  I think it is
> better if we decide on one of FPW or FPI and make the changes at all
> places for this patch.

It seems to me that Peter is right here.  A full-page write is the
action of writing a full-page image, so if you consider only a way to
define the static data of a full page and/or a quantity associated with
it, we should talk about full-page images.
--
Michael




Re: doc review for v13

2020-04-26 Thread Michael Paquier
On Sun, Apr 26, 2020 at 08:59:05PM -0400, Tom Lane wrote:
> James Coleman  writes:
>> On Sun, Apr 26, 2020 at 12:13 PM Justin Pryzby  wrote:
>>> I think my text is correct.  This would *also* be correct:
>>> |   If any CHECK constraint on the table being
>>> |   attached is marked NO INHERIT, the command will 
>>> fail;
>>> But not the hybrid: "If any OF THE .. is .."
> 
>> "any of the...are" sounds more natural to my ears,
> 
> Yeah, I think the same.  If you want to argue grammar, I'd point
> out that the "any" could refer to several of the constraints,
> making it correct to use the plural verb.  The alternative that
> Justin mentions could be written as "If any one constraint is ...",
> which is correct in that form; but the plural way seems less stilted.

Hm, okay.  There are still pieces in those patches about which I am
not sure, so I have set that aside for the time being.

Anyway, I have applied patch 12, and reported the typos from imath.c
directly to upstream:
https://github.com/creachadair/imath/issues/45
https://github.com/creachadair/imath/issues/46
--
Michael




Re: Setting min/max TLS protocol in clientside libpq

2020-04-26 Thread Michael Paquier
On Sun, Apr 26, 2020 at 11:20:01PM +0200, Daniel Gustafsson wrote:
> That was the preferred name by Michael too elsewhere in the thread, so went
> ahead and made it so.

Thanks Daniel.

>> I would, however, prefer to also rename the internal symbols.
> 
> Done in the attached v2.

What you have here looks fine to me.  Peter, what do you think?
--
Michael




Re: 2pc leaks fds

2020-04-26 Thread Kyotaro Horiguchi
At Fri, 24 Apr 2020 11:48:46 -0400, Alvaro Herrera  
wrote in 
> On 2020-Apr-24, Kyotaro Horiguchi wrote:
> 
> > At Thu, 23 Apr 2020 19:16:03 -0400, Alvaro Herrera 
> >  wrote in 
> 
> > > Here's a first attempt at that.  The segment_open/close callbacks are
> > > now given at XLogReaderAllocate time, and are passed the XLogReaderState
> > > pointer.  I wrote a comment to explain that the page_read callback can
> > > use WALRead() if it wishes to do so; but if it does, then segment_open
> > > has to be provided.  segment_close is mandatory (since we call it at
> > > XLogReaderFree).
> 
> > I modestly object to so many callback functions.  FWIW I'm writing
> > this with [1] in my mind.
> > [1] 
> > https://www.postgresql.org/message-id/20200422.101246.331162888498679491.horikyota.ntt%40gmail.com
> 
> Hmm.  Looking at your 0001, I think there's nothing in that patch that's
> not compatible with my proposed API change.
> 
> 0002 is a completely different story of course; but that patch is a
> radical change of spirit for xlogreader, in the sense that it's no
> longer a "reader", but rather just an interpreter of bytes from a WAL
> byte sequence into WAL records; and shifts the responsibility of the
> actual reading to the caller.  That's why xlogreader no longer receives
> the page_read callback (and why it doesn't need the segment_open,
> segment_close callbacks).

Sorry for the ambiguity; I didn't mean that I mind that this conflicts
with my patch, or that I don't want this to be committed. It is easily
rebased on this patch.  What I was anxious about is that the new
callback struct might be more flexible than required. So I "mildly"
objected, and I won't be disappointed if this patch is committed.

> I have to admit that until today I hadn't realized that that's what your
> patch series was doing.  I'm not familiar with how you intend to
> implement WAL encryption on top of this, but on first blush I'm not
> liking this proposed design too much.

Right. I might be going into too much detail, but it simplifies the call
tree. Anyway, that is another discussion, though :)

> > An open-callback is bound to a read-callback. A close-callback is
> > bound to the way the read-callback opens a segment (or the
> > open-callback).  I'm afraid that only adding "cleanup" callback might
> > be sufficient.
> 
> Well, the complaint is that the current layering is weird, in that there
> are two levels at which we pass callbacks: one is XLogReaderAllocate,
> where you specify the page_read callback; and the other layer is inside
> the page_read callback, if that layer uses the WALRead auxiliary
> function.  The thing that my patch is doing is pass all three callbacks
> at the XLogReaderAllocate level.  So when xlogreader drills down to
> read_page, xlogreader already has the segment_open callback handy if it
> needs it.  Conceptually, this seems more sensible.

It looks as if the open/read/close callbacks are generic and on
the same interface layer, but actually the open-callback is dedicated to
WALRead and it is useless when the read-callback doesn't use
WALRead. What I was anxious about is that the open-callback
uselessly exposes the internals of the read-callback.

> I think a "cleanup" callback might also be sensible in general terms,
> but we have a problem with the specifics -- namely that the "state" that
> we need to clean up (the file descriptor of the open segment) is part of
> xlogreader's state.  And we obviously cannot remove the FD from
> XLogReaderState, because we need the FD to do things with it to
> obtain data from the file.

I meant, concretely, that we would only have read- and cleanup-callbacks in
the xlogreader state.  The caller of XLogReaderAllocate specifies the
cleanup-callback that is to be used to clean up what the
read-callback left behind, in the same manner as this patch does.
The only reason it is not named close-callback is that it is used only
when the xlogreader state is destroyed. So I'm fine with
read/close-callbacks.
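
To make the two shapes concrete, here is a purely illustrative sketch; the
type and field names are my own, not from either patch:

    #include "access/xlogreader.h"

    /* (a) What the proposed patch does: all callbacks are handed over at
     * XLogReaderAllocate() time. */
    typedef struct XLogReaderCallbacksSketch
    {
        XLogPageReadCB  page_read;      /* may use WALRead() internally */
        void          (*segment_open) (XLogReaderState *state,
                                       XLogSegNo nextSegNo, TimeLineID *tli_p);
        void          (*segment_close) (XLogReaderState *state); /* at XLogReaderFree() */
    } XLogReaderCallbacksSketch;

    /* (b) The alternative suggested above: only a read callback plus a generic
     * cleanup callback; how segments are opened stays private to page_read. */
    typedef struct XLogReaderCallbacksSketchAlt
    {
        XLogPageReadCB  page_read;
        void          (*cleanup) (XLogReaderState *state);       /* at XLogReaderFree() */
    } XLogReaderCallbacksSketchAlt;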

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: proposal - plpgsql - all plpgsql auto variables should be constant

2020-04-26 Thread Pavel Stehule
po 27. 4. 2020 v 5:02 odesílatel Kyotaro Horiguchi 
napsal:

> At Fri, 24 Apr 2020 16:47:28 +0200, Pavel Stehule 
> wrote in
> > pá 24. 4. 2020 v 16:07 odesílatel Tom Lane  napsal:
> >
> > > Ashutosh Bapat  writes:
> > > > On Fri, Apr 24, 2020 at 12:24 PM Pavel Stehule <
> pavel.steh...@gmail.com>
> > > wrote:
> > > >> plpgsql generates a lot of auto variables - FOUND, SQLERRM, cycle's
> > > >> control variable, TG_WHEN, TG_OP, ..
> > > >> Currently these variables are not protected, which can be a source of
> > > >> problems, mainly for inexperienced users. I propose marking these
> > > >> variables as constant.
> > >
> > > > +1 for general idea.
> > >
> > > I'm skeptical.  If we'd marked them that way from day one, it would
> have
> > > been fine, but to change it now is a whole different discussion.  I
> think
> > > the odds that anybody will thank us are much smaller than the odds that
> > > there will be complaints.  In particular, I'd be just about certain
> that
> > > there are people out there who are changing FOUND and loop control
> > > variables manually, and they will not appreciate us breaking their
> code.
> > >
> >
> > This is not a black/white issue. Maybe it can make sense to modify the FOUND
> > variable, but modification of the control variable makes no sense. The
> > updated value is rewritten by the runtime on every iteration. You cannot use
> > modification of the control variable to skip some iterations like in C.
>
> It seems to me that the loop structure is not a parallel of for() in C. It
> is rather a parallel of Perl's foreach or Python's "for i in range()".
> So it is natural to me that i is assignable and is reset to
> the next value at every iteration.  I believe that there are many
> existing cases where the control variable is modified in a loop.
>

It is based on the PL/SQL language, and that language is based on Ada.

There the loop parameter is constant:

https://www.adaic.org/resources/add_content/standards/05aarm/html/AA-5-5.html

Regards

Pavel


> On the other hand, I'm not sure about FOUND and the similar variables, and I
> don't have a firm opinion on them. I don't see a use case where they need
> to be assignable. However, I don't see a clear reason they mustn't be
> assignable, either. (And the behavior is documented at least for FOUND.)
>
> > > As for the trigger variables specifically, what is the rationale
> > > for marking TG_OP read-only but not OLD and NEW?  But it is dead
> > > certain that we won't get away with making the latter two read-only.
> > >
> >
> > For BEFORE triggers, NEW has to be updatable. The others could maybe be
> > protected, but that is a little bit different kind of information.
> >
> > > In short, -1.  This ship sailed about twenty years ago.
>
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>


Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

2020-04-26 Thread David Rowley
On Mon, 27 Apr 2020 at 15:12, James Coleman  wrote:
> While working on this I noticed that dynahash.c line 499 has this assertion:
>
> Assert(info->entrysize >= info->keysize);
>
> Do you by any chance know why the entry would need to be larger than the key?

Larger or equal. They'd be equal if the key was the data, since
you do need to store at least the key.  Looking at the code for
examples where dynahash is used in that situation, I see
_hash_finish_split().
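
For illustration (not from any patch in this thread), a dynahash table whose
entry is nothing but the key, so that entrysize == keysize and the assertion
still holds:

    HASHCTL     ctl;
    HTAB       *seen_blocks;
    bool        found;
    BlockNumber blkno = 42;

    memset(&ctl, 0, sizeof(ctl));
    ctl.keysize = sizeof(BlockNumber);
    ctl.entrysize = sizeof(BlockNumber);        /* entry is just the key */

    seen_blocks = hash_create("seen blocks", 256, &ctl,
                              HASH_ELEM | HASH_BLOBS);
    (void) hash_search(seen_blocks, &blkno, HASH_ENTER, &found);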

David




Re: Why are wait events not reported even though it reads/writes a timeline history file?

2020-04-26 Thread Fujii Masao




On 2020/04/24 11:29, Masahiro Ikeda wrote:

> Hi,
> 
> There are two pieces of code that are unexpected to me regarding wait events
> for the timeline history file.
> Please let me know your thoughts on whether we need to change them.
> 
> 
> 1. readTimeLineHistory() function in timeline.c
> 
> The readTimeLineHistory() reads a timeline history file,
> but it doesn't report "WAIT_EVENT_TIMELINE_HISTORY_READ".


Yeah, this sounds strange.


> In my understanding, sscanf() is a blocking read.
> So, it's important to report a wait event.


Shouldn't the wait event be reported during fgets() rather than sscanf()?


> 2. writeTimeLineHistory() function in timeline.c
> 
> The writeTimeLineHistory() function may write a timeline history file twice,
> but it reports "WAIT_EVENT_TIMELINE_HISTORY_WRITE" only once.
> 
> It makes sense to report a wait event twice, because both of them use write().


Yes.


> I attached a patch to point out the code line numbers.
> 
> I checked the commit log in which "WAIT_EVENT_TIMELINE_HISTORY_READ" and
> "WAIT_EVENT_TIMELINE_HISTORY_WRITE" were committed, and the discussion about
> it, but I can't find the reason.
> 
> Please give me your comments.
> If we need to change this, I can make a patch to fix it.


Thanks! I agree to fix those issues.


> By the way, which is correct: "timeline's history file" or "timeline history
> file"?
> timeline.c has both. In my understanding, the latter is correct. If so, I
> will modify that together.


Maybe both are correct?? I have no strong opinion about this.
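
To make the first point concrete, a minimal sketch (an illustration, not a
patch) of reporting the read wait event around the fgets() loop in
readTimeLineHistory(), reusing the fd and fline variables already there:

    for (;;)
    {
        char   *result;

        /* report the wait only around the actual read, not the parsing */
        pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_READ);
        result = fgets(fline, sizeof(fline), fd);
        pgstat_report_wait_end();

        if (result == NULL)
            break;

        /* ... existing sscanf() parsing of fline ... */
    }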

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Binary search in ScalarArrayOpExpr for OR'd constant arrays

2020-04-26 Thread James Coleman
On Sun, Apr 26, 2020 at 7:41 PM James Coleman  wrote:
>
> On Sun, Apr 26, 2020 at 4:49 PM Tomas Vondra
>  wrote:
> >
> > On Sun, Apr 26, 2020 at 02:46:19PM -0400, James Coleman wrote:
> > >On Sat, Apr 25, 2020 at 8:31 PM Tomas Vondra
> > > wrote:
> > >>
> > >> On Sat, Apr 25, 2020 at 06:47:41PM -0400, James Coleman wrote:
> > >> >On Sat, Apr 25, 2020 at 5:41 PM David Rowley 
wrote:
> > >> >>
> > >> >> On Sun, 26 Apr 2020 at 00:40, Tomas Vondra <
tomas.von...@2ndquadrant.com> wrote:
> > >> >> > This reminds me our attempts to add bloom filters to hash
joins, which I
> > >> >> > think ran into mostly the same challenge of deciding when the
bloom
> > >> >> > filter can be useful and is worth the extra work.
> > >> >>
> > >> >> Speaking of that, it would be interesting to see how a test where
you
> > >> >> write the query as IN(VALUES(...)) instead of IN() compares. It
would
> > >> >> be interesting to know if the planner is able to make a more
suitable
> > >> >> choice and also to see how all the work over the years to improve
Hash
> > >> >> Joins compares to the bsearch with and without the bloom filter.
> > >> >
> > >> >It would be interesting.
> > >> >
> > >It also makes one wonder about optimizing these into hash
> > >> >joins...which I'd thought about over at [1]. I think it'd be a very
> > >> >significant effort though.
> > >> >
> > >>
> > >> I modified the script to also do the join version of the query. I can
> > >> only run it on my laptop at the moment, so the results may be a bit
> > >> different from those I shared before, but it's interesting I think.
> > >>
> > >> In most cases it's comparable to the binsearch/bloom approach, and in
> > >> some cases it actually beats them quite significantly. It seems to
> > >> depend on how expensive the comparison is - for "int" the comparison
is
> > >> very cheap and there's almost no difference. For "text" the
comparison
> > >> is much more expensive, and there are significant speedups.
> > >>
> > >> For example the test with 100k lookups in array of 10k elements and
10%
> > >> match probability, the timings are these
> > >>
> > >>master:  62362 ms
> > >>binsearch: 201 ms
> > >>bloom:  65 ms
> > >>hashjoin:   36 ms
> > >>
> > >> I do think the explanation is fairly simple - the bloom filter
> > >> eliminates about 90% of the expensive comparisons, so it's 20ms plus
> > >> some overhead to build and check the bits. The hash join probably
> > >> eliminates a lot of the remaining comparisons, because the hash table
> > >> is sized to have one tuple per bucket.
> > >>
> > >> Note: I also don't claim the PoC has the most efficient bloom filter
> > >> implementation possible. I'm sure it could be made faster.
> > >>
> > >> Anyway, I'm not sure transforming this to a hash join is worth the
> > >> effort - I agree that seems quite complex. But perhaps this suggest
we
> > >> should not be doing binary search and instead just build a simple
hash
> > >> table - that seems much simpler, and it'll probably give us about the
> > >> same benefits.
> > >
> > >That's actually what I originally thought about doing, but I chose
> > >binary search since it seemed a lot easier to get off the ground.
> > >
> >
> > OK, that makes perfect sense.
> >
> > >If we instead build a hash is there anything else we need to be
> > >concerned about? For example, work mem? I suppose for the binary
> > >search we already have to expand the array, so perhaps it's not all
> > >that meaningful relative to that...
> > >
> >
> > I don't think we need to be particularly concerned about work_mem. We
> > don't care about it now, and it's not clear to me what we could do about
> > it - we already have the array in memory anyway, so it's a bit futile.
> > Furthermore, if we need to care about it, it probably applies to the
> > binary search too.
> >
> > >I was looking earlier at what our standard hash implementation was,
> > >and it seemed less obvious what was needed to set that up (so binary
> > >search seemed a faster proof of concept). If you happen to have any
> > >pointers to similar usages I should look at, please let me know.
> > >
> >
> > I think the hash join implementation is far too complicated. It has to
> > care about work_mem, so it implements batching, etc. That's a lot of
> > complexity we don't need here. IMO we could use either the usual
> > dynahash, or maybe even the simpler simplehash.
> >
> > FWIW it'd be good to verify the numbers I shared, i.e. checking that the
> > benchmarks makes sense and running it independently. I'm not aware of
> > any issues but it was done late at night and only ran on my laptop.
>
> Some quick calculations (don't have the scripting in a form I can
> attach yet; using this as an opportunity to hack on a genericized
> performance testing framework of sorts) suggest your results are
> correct. I was also testing on my laptop, but I showed 1.) roughly
> equivalent results for IN (VALUES ...) and IN () for integers,
> but when I sw

Re: WAL usage calculation patch

2020-04-26 Thread Amit Kapila
On Thu, Apr 23, 2020 at 2:35 PM Amit Kapila  wrote:
>
> On Thu, Apr 23, 2020 at 12:16 PM Peter Eisentraut
>  wrote:
>
> >  The internal symbol for the WAL record is
> > XLOG_FPI and xlogdesc.c prints it as "FPI".
> >
>
> That is just one way/reason we log the page.  There are others as
> well.  I thought that here we are computing the number of full-page writes
> that happened in the system due to various reasons, like (a) a page is
> operated upon for the first time after a checkpoint, (b) logging the XLOG_FPI
> record, (c) the GUC for the WAL consistency checker is on, etc.  If we look at
> XLogRecordAssemble, where we decide to log this information, there is a
> comment "  log a full-page write for the current block." and there
> was an existing variable named 'fpw_lsn', which indicates to an extent
> that what we are computing in this patch is full-page writes.  But
> there is a reference to full-page image as well.  I think that, as
> full_page_writes is an exposed variable that is well understood,
> exposing information with a similar name via this patch doesn't sound
> illogical to me. Whatever we use here, we need to be consistent
> throughout; even pg_stat_statements would need to name the exposed variable
> wal_fpi instead of wal_fpw.
> 
> To me, full-page writes sounds more appealing alongside the other WAL usage
> variables like records and bytes. I might be more used to this term as
> 'fpw'; that is why it seemed better to me.  OTOH, if most of us think
> that full-page image is better suited here, I am fine with changing
> it at all places.
>

Julien, Peter, others do you have any opinion here?  I think it is
better if we decide on one of FPW or FPI and make the changes at all
places for this patch.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: proposal - plpgsql - all plpgsql auto variables should be constant

2020-04-26 Thread Kyotaro Horiguchi
At Fri, 24 Apr 2020 16:47:28 +0200, Pavel Stehule  
wrote in 
> pá 24. 4. 2020 v 16:07 odesílatel Tom Lane  napsal:
> 
> > Ashutosh Bapat  writes:
> > > On Fri, Apr 24, 2020 at 12:24 PM Pavel Stehule 
> > wrote:
> > >> plpgsql generates a lot of auto variables - FOUND, SQLERRM, cycle's
> > >> control variable, TG_WHEN, TG_OP, ..
> > >> Currently these variables are not protected, which can be a source of
> > >> problems, mainly for inexperienced users. I propose marking these
> > >> variables as constant.
> >
> > > +1 for general idea.
> >
> > I'm skeptical.  If we'd marked them that way from day one, it would have
> > been fine, but to change it now is a whole different discussion.  I think
> > the odds that anybody will thank us are much smaller than the odds that
> > there will be complaints.  In particular, I'd be just about certain that
> > there are people out there who are changing FOUND and loop control
> > variables manually, and they will not appreciate us breaking their code.
> >
> 
> This is not a black/white issue. Maybe it can make sense to modify the FOUND
> variable, but modification of the control variable makes no sense. The
> updated value is rewritten by the runtime on every iteration. You cannot use
> modification of the control variable to skip some iterations like in C.

It seems to me that the loop structure is not a parallel of for() in C. It
is rather a parallel of Perl's foreach or Python's "for i in range()".
So it is natural to me that i is assignable and is reset to
the next value at every iteration.  I believe that there are many
existing cases where the control variable is modified in a loop.

On the other hand, I'm not sure about FOUND and the similar variables, and I
don't have a firm opinion on them. I don't see a use case where they need
to be assignable. However, I don't see a clear reason they mustn't be
assignable, either. (And the behavior is documented at least for FOUND.)

> > As for the trigger variables specifically, what is the rationale
> > for marking TG_OP read-only but not OLD and NEW?  But it is dead
> > certain that we won't get away with making the latter two read-only.
> >
> 
> For BEFORE triggers, NEW has to be updatable. The others could maybe be
> protected, but that is a little bit different kind of information.
>
> > In short, -1.  This ship sailed about twenty years ago.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Poll: are people okay with function/operator table redesign?

2020-04-26 Thread Tom Lane
"Jonathan S. Katz"  writes:
> Can you try

>  #docContent p {
> -  margin-bottom: 1rem !important;
> +  margin-bottom: 1rem;
>  }

> and see how it looks?

In some desultory looking around, I couldn't find anyplace in the
existing text that that changes at all.  And it does make the
revised table markup render the way I want ... so +1.

regards, tom lane




Re: +(pg_lsn, int8) and -(pg_lsn, int8) operators

2020-04-26 Thread Kyotaro Horiguchi
At Fri, 24 Apr 2020 12:15:26 +0200, Jehan-Guillaume de Rorthais 
 wrote in 
> On Fri, 24 Apr 2020 16:24:14 +0900
> Michael Paquier  wrote:
> 
> > On Thu, Apr 23, 2020 at 08:09:22AM -0400, Robert Haas wrote:
> > > For anyone who missed it, this idea was popular on Twitter:
> > > 
> > > https://twitter.com/fujii_masao/status/1252652020487487488  
> > 
> > (For the sake of the archives)
> > To which Alvaro, Robert, Fabrízio de Royes Mello, Julien Rouhaud and I
> > answered positively to.
> 
> And me, discretely, with a little heart.

+1.  I actually sometimes need it.

By the way, -(pg_lsn, pg_lsn) yields a numeric. I feel that it could be
confusing that the new operators take a bigint.  We need to cast the
second term to bigint in the following expression.

'2/20'::pg_lsn + ('1/10'::pg_lsn - '1/5'::pg_lsn)

The new + operator is not commutative. I'm not sure it is the right
design to make it commutative, but it would be irritating if it is
not. (Or maybe we should implement them as functions rather than
operators..)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Poll: are people okay with function/operator table redesign?

2020-04-26 Thread Jonathan S. Katz
On 4/26/20 3:21 PM, Jonathan S. Katz wrote:
> On 4/26/20 1:40 PM, Tom Lane wrote:
>> I wrote:
>>> Alvaro Herrera  writes:
> 
>> There is a small problem with getting this to work in the webstyle
>> HTML: somebody decided it would be a great idea to have a global
>> override on paragraph margin-bottom settings.  For the purposes of
>> this test I just deleted that from main.css, but I suppose we want
>> some more-nuanced solution in reality.
> 
> I have to see why that is. I traced it back to the original "bring doc
> styles up to modern website" patch (66798351) and there is missing
> context. Anyway, I'd like to test it before a wholesale removal (there
> is often a strong correlation between "!important" and "hack", so I'll
> want to further dive into it).
> 
> I'll have some time to play around with the CSS tonight.

Can you try

 #docContent p {
-  margin-bottom: 1rem !important;
+  margin-bottom: 1rem;
 }

and see how it looks?

Jonathan





Re: doc review for v13

2020-04-26 Thread Tom Lane
James Coleman  writes:
> On Sun, Apr 26, 2020 at 12:13 PM Justin Pryzby  wrote:
>> I think my text is correct.  This would *also* be correct:
>> |   If any CHECK constraint on the table being
>> |   attached is marked NO INHERIT, the command will 
>> fail;
>> But not the hybrid: "If any OF THE .. is .."

> "any of the...are" sounds more natural to my ears,

Yeah, I think the same.  If you want to argue grammar, I'd point
out that the "any" could refer to several of the constraints,
making it correct to use the plural verb.  The alternative that
Justin mentions could be written as "If any one constraint is ...",
which is correct in that form; but the plural way seems less stilted.

regards, tom lane




Re: doc review for v13

2020-04-26 Thread James Coleman
On Sun, Apr 26, 2020 at 12:13 PM Justin Pryzby  wrote:
>
> On Tue, Apr 14, 2020 at 02:47:54PM +0900, Michael Paquier wrote:
> > On Sun, Apr 12, 2020 at 04:35:45PM -0500, Justin Pryzby wrote:
> > > Added a few more.
> > > And rebased on top of dbc60c5593f26dc777a3be032bff4fb4eab1ddd1
> >
> > Thanks for the patch set, I have applied the most obvious parts (more
> > or less 1/3) to reduce the load.  Here is a review of the rest.
>
> Thanks - attached are the remaining undisputed portions..
>
> > > +++ b/doc/src/sgml/ref/alter_table.sgml
> > > @@ -889,7 +889,7 @@ WITH ( MODULUS  > > class="parameter">numeric_literal, REM
> > >from the parent table will be created in the partition, if they 
> > > don't
> > >already exist.
> > >If any of the CHECK constraints of the table 
> > > being
> > > -  attached is marked NO INHERIT, the command will 
> > > fail;
> > > +  attached are marked NO INHERIT, the command 
> > > will fail;
> > >such constraints must be recreated without the
> > >NO INHERIT clause.
> > >   
> >
> > It seems to me that both are actually correct here.
>
> I think my text is correct.  This would *also* be correct:
>
> |   If any CHECK constraint on the table being
> |   attached is marked NO INHERIT, the command will 
> fail;
>
> But not the hybrid: "If any OF THE .. is .."

"any of the...are" sounds more natural to my ears, and some searching
yielded some grammar sites that agree (specifically that "any of" is
only used with singular verbs if the construction is uncountable or
negative).

However there are also multiple claims by grammarians that either
singular or plural verbs are acceptable with the "any of"
construction. So that's not all that helpful.

James




Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

2020-04-26 Thread James Coleman
On Sun, Apr 26, 2020 at 4:49 PM Tomas Vondra
 wrote:
>
> On Sun, Apr 26, 2020 at 02:46:19PM -0400, James Coleman wrote:
> >On Sat, Apr 25, 2020 at 8:31 PM Tomas Vondra
> > wrote:
> >>
> >> On Sat, Apr 25, 2020 at 06:47:41PM -0400, James Coleman wrote:
> >> >On Sat, Apr 25, 2020 at 5:41 PM David Rowley  wrote:
> >> >>
> >> >> On Sun, 26 Apr 2020 at 00:40, Tomas Vondra 
> >> >>  wrote:
> >> >> > This reminds me our attempts to add bloom filters to hash joins, 
> >> >> > which I
> >> >> > think ran into mostly the same challenge of deciding when the bloom
> >> >> > filter can be useful and is worth the extra work.
> >> >>
> >> >> Speaking of that, it would be interesting to see how a test where you
> >> >> write the query as IN(VALUES(...)) instead of IN() compares. It would
> >> >> be interesting to know if the planner is able to make a more suitable
> >> >> choice and also to see how all the work over the years to improve Hash
> >> >> Joins compares to the bsearch with and without the bloom filter.
> >> >
> >> >It would be interesting.
> >> >
> >It also makes one wonder about optimizing these into hash
> >> >joins...which I'd thought about over at [1]. I think it'd be a very
> >> >significant effort though.
> >> >
> >>
> >> I modified the script to also do the join version of the query. I can
> >> only run it on my laptop at the moment, so the results may be a bit
> >> different from those I shared before, but it's interesting I think.
> >>
> >> In most cases it's comparable to the binsearch/bloom approach, and in
> >> some cases it actually beats them quite significantly. It seems to
> >> depend on how expensive the comparison is - for "int" the comparison is
> >> very cheap and there's almost no difference. For "text" the comparison
> >> is much more expensive, and there are significant speedups.
> >>
> >> For example the test with 100k lookups in array of 10k elements and 10%
> >> match probability, the timings are these
> >>
> >>master:  62362 ms
> >>binsearch: 201 ms
> >>bloom:  65 ms
> >>hashjoin:   36 ms
> >>
> >> I do think the explanation is fairly simple - the bloom filter
> >> eliminates about 90% of the expensive comparisons, so it's 20ms plus
> >> some overhead to build and check the bits. The hash join probably
> >> eliminates a lot of the remaining comparisons, because the hash table
> >> is sized to have one tuple per bucket.
> >>
> >> Note: I also don't claim the PoC has the most efficient bloom filter
> >> implementation possible. I'm sure it could be made faster.
> >>
> >> Anyway, I'm not sure transforming this to a hash join is worth the
> >> effort - I agree that seems quite complex. But perhaps this suggest we
> >> should not be doing binary search and instead just build a simple hash
> >> table - that seems much simpler, and it'll probably give us about the
> >> same benefits.
> >
> >That's actually what I originally thought about doing, but I chose
> >binary search since it seemed a lot easier to get off the ground.
> >
>
> OK, that makes perfect sense.
>
> >If we instead build a hash is there anything else we need to be
> >concerned about? For example, work mem? I suppose for the binary
> >search we already have to expand the array, so perhaps it's not all
> >that meaningful relative to that...
> >
>
> I don't think we need to be particularly concerned about work_mem. We
> don't care about it now, and it's not clear to me what we could do about
> it - we already have the array in memory anyway, so it's a bit futile.
> Furthermore, if we need to care about it, it probably applies to the
> binary search too.
>
> >I was looking earlier at what our standard hash implementation was,
> >and it seemed less obvious what was needed to set that up (so binary
> >search seemed a faster proof of concept). If you happen to have any
> >pointers to similar usages I should look at, please let me know.
> >
>
> I think the hash join implementation is far too complicated. It has to
> care about work_mem, so it implements batching, etc. That's a lot of
> complexity we don't need here. IMO we could use either the usual
> dynahash, or maybe even the simpler simplehash.
>
> FWIW it'd be good to verify the numbers I shared, i.e. checking that the
> benchmarks makes sense and running it independently. I'm not aware of
> any issues but it was done late at night and only ran on my laptop.

Some quick calculations (don't have the scripting in a form I can
attach yet; using this as an opportunity to hack on a genericized
performance testing framework of sorts) suggest your results are
correct. I was also testing on my laptop, but I showed 1.) roughly
equivalent results for IN (VALUES ...) and IN () for integers,
but when I switch to (short; average 3 characters long) text values I
show the hash join on VALUES is twice as fast as the binary search.

Given that, I'm planning to implement this as a hash lookup and share
a revised patch.

James




Re: Setting min/max TLS protocol in clientside libpq

2020-04-26 Thread Daniel Gustafsson
> On 26 Apr 2020, at 14:01, Peter Eisentraut  
> wrote:
> 
> On 2020-04-24 14:03, Daniel Gustafsson wrote:
>>> On 24 Apr 2020, at 12:56, Peter Eisentraut 
>>>  wrote:
>>> 
>>> Can we reconsider whether we really want to name the new settings like 
>>> "sslminprotocolversion", or whether we could add some underscores, both for 
>>> readability and for consistency with the server-side options?
>> That was brought up by Michael in the thread, but none of us followed up on 
>> it
>> it seems.  The current name was chosen to be consistent with the already
>> existing ssl* client-side settings, but I don't really have strong opinions 
>> on
>> if that makes sense or not.  Perhaps use ssl_m{in|max}_protocolversion to 
>> make
>> it more readable?
> 
> The names on the backend side are ssl_{min|max}_protocol_version.

That was the preferred name by Michael too elsewhere in the thread, so went
ahead and made it so.

>> The attached renames the userfacing setting, but keeps the environment 
>> variable
>> without underscores as most settings have env vars without underscores.
> 
> Keeping the environment variable as is seems fine (also consistent with 
> "channel_binding").
> 
> I would, however, prefer to also rename the internal symbols.

Done in the attached v2.
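
For anyone trying the patch, a minimal client pinning the TLS range with the
renamed parameters might look like this (a sketch; the exact parameter
spelling follows whatever the final patch settles on):

    #include <stdio.h>
    #include <stdlib.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        PGconn *conn = PQconnectdb("host=localhost dbname=postgres "
                                   "sslmode=require "
                                   "ssl_min_protocol_version=TLSv1.2 "
                                   "ssl_max_protocol_version=TLSv1.3");

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return EXIT_FAILURE;
        }

        printf("connected with TLS inside the requested range\n");
        PQfinish(conn);
        return EXIT_SUCCESS;
    }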

cheers ./daniel



minmaxproto_naming-v2.patch
Description: Binary data


Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

2020-04-26 Thread Tomas Vondra

On Sun, Apr 26, 2020 at 02:46:19PM -0400, James Coleman wrote:

On Sat, Apr 25, 2020 at 8:31 PM Tomas Vondra
 wrote:


On Sat, Apr 25, 2020 at 06:47:41PM -0400, James Coleman wrote:
>On Sat, Apr 25, 2020 at 5:41 PM David Rowley  wrote:
>>
>> On Sun, 26 Apr 2020 at 00:40, Tomas Vondra  
wrote:
>> > This reminds me our attempts to add bloom filters to hash joins, which I
>> > think ran into mostly the same challenge of deciding when the bloom
>> > filter can be useful and is worth the extra work.
>>
>> Speaking of that, it would be interesting to see how a test where you
>> write the query as IN(VALUES(...)) instead of IN() compares. It would
>> be interesting to know if the planner is able to make a more suitable
>> choice and also to see how all the work over the years to improve Hash
>> Joins compares to the bsearch with and without the bloom filter.
>
>It would be interesting.
>
>It also makes one wonder about optimizing these into hash
>joins...which I'd thought about over at [1]. I think it'd be a very
>significant effort though.
>

I modified the script to also do the join version of the query. I can
only run it on my laptop at the moment, so the results may be a bit
different from those I shared before, but it's interesting I think.

In most cases it's comparable to the binsearch/bloom approach, and in
some cases it actually beats them quite significantly. It seems to
depend on how expensive the comparison is - for "int" the comparison is
very cheap and there's almost no difference. For "text" the comparison
is much more expensive, and there are significant speedups.

For example the test with 100k lookups in array of 10k elements and 10%
match probability, the timings are these

   master:  62362 ms
   binsearch: 201 ms
   bloom:  65 ms
   hashjoin:   36 ms

I do think the explanation is fairly simple - the bloom filter
eliminates about 90% of the expensive comparisons, so it's 20ms plus
some overhead to build and check the bits. The hash join probably
eliminates a lot of the remaining comparisons, because the hash table
is sized to have one tuple per bucket.

Note: I also don't claim the PoC has the most efficient bloom filter
implementation possible. I'm sure it could be made faster.

Anyway, I'm not sure transforming this to a hash join is worth the
effort - I agree that seems quite complex. But perhaps this suggest we
should not be doing binary search and instead just build a simple hash
table - that seems much simpler, and it'll probably give us about the
same benefits.


That's actually what I originally thought about doing, but I chose
binary search since it seemed a lot easier to get off the ground.



OK, that makes perfect sense.


If we instead build a hash is there anything else we need to be
concerned about? For example, work mem? I suppose for the binary
search we already have to expand the array, so perhaps it's not all
that meaningful relative to that...



I don't think we need to be particularly concerned about work_mem. We
don't care about it now, and it's not clear to me what we could do about
it - we already have the array in memory anyway, so it's a bit futile.
Furthermore, if we need to care about it, it probably applies to the
binary search too.


I was looking earlier at what our standard hash implementation was,
and it seemed less obvious what was needed to set that up (so binary
search seemed a faster proof of concept). If you happen to have any
pointers to similar usages I should look at, please let me know.



I think the hash join implementation is far too complicated. It has to
care about work_mem, so it implements batching, etc. That's a lot of
complexity we don't need here. IMO we could use either the usual
dynahash, or maybe even the simpler simplehash.
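
By way of illustration (not part of any patch here), instantiating simplehash
for Datum keys looks roughly like this; the hash and equality helpers below
are placeholders, and tidbitmap.c and execGrouping.c are in-tree users of the
same template:

    #include "postgres.h"
    #include "common/hashfn.h"

    typedef struct saop_entry
    {
        Datum   key;        /* the array element */
        char    status;     /* state field required by simplehash */
    } saop_entry;

    #define SH_PREFIX           saop
    #define SH_ELEMENT_TYPE     saop_entry
    #define SH_KEY_TYPE         Datum
    #define SH_KEY              key
    #define SH_HASH_KEY(tb, k)  murmurhash32((uint32) (k))   /* placeholder */
    #define SH_EQUAL(tb, a, b)  ((a) == (b))                  /* placeholder */
    #define SH_SCOPE            static inline
    #define SH_DECLARE
    #define SH_DEFINE
    #include "lib/simplehash.h"

    /*
     * Usage sketch:
     *    saop_hash *tab = saop_create(CurrentMemoryContext, nelems, NULL);
     *    saop_insert(tab, value, &found);
     *    if (saop_lookup(tab, probe) != NULL) ...
     */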

FWIW it'd be good to verify the numbers I shared, i.e. checking that the
benchmarks makes sense and running it independently. I'm not aware of
any issues but it was done late at night and only ran on my laptop.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Poll: are people okay with function/operator table redesign?

2020-04-26 Thread Jonathan S. Katz
On 4/26/20 1:40 PM, Tom Lane wrote:
> I wrote:
>> Alvaro Herrera  writes:

> There is a small problem with getting this to work in the webstyle
> HTML: somebody decided it would be a great idea to have a global
> override on paragraph margin-bottom settings.  For the purposes of
> this test I just deleted that from main.css, but I suppose we want
> some more-nuanced solution in reality.

I have to see why that is. I traced it back to the original "bring doc
styles up to modern website" patch (66798351) and there is missing
context. Anyway, I'd like to test it before a wholesale removal (there
is often a strong correlation between "!important" and "hack", so I'll
want to further dive into it).

I'll have some time to play around with the CSS tonight.

Jonathan





Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

2020-04-26 Thread James Coleman
On Sat, Apr 25, 2020 at 8:31 PM Tomas Vondra
 wrote:
>
> On Sat, Apr 25, 2020 at 06:47:41PM -0400, James Coleman wrote:
> >On Sat, Apr 25, 2020 at 5:41 PM David Rowley  wrote:
> >>
> >> On Sun, 26 Apr 2020 at 00:40, Tomas Vondra  
> >> wrote:
> >> > This reminds me our attempts to add bloom filters to hash joins, which I
> >> > think ran into mostly the same challenge of deciding when the bloom
> >> > filter can be useful and is worth the extra work.
> >>
> >> Speaking of that, it would be interesting to see how a test where you
> >> write the query as IN(VALUES(...)) instead of IN() compares. It would
> >> be interesting to know if the planner is able to make a more suitable
> >> choice and also to see how all the work over the years to improve Hash
> >> Joins compares to the bsearch with and without the bloom filter.
> >
> >It would be interesting.
> >
> >It also makes one wonder about optimizing these into hash
> >joins...which I'd thought about over at [1]. I think it'd be a very
> >significant effort though.
> >
>
> I modified the script to also do the join version of the query. I can
> only run it on my laptop at the moment, so the results may be a bit
> different from those I shared before, but it's interesting I think.
>
> In most cases it's comparable to the binsearch/bloom approach, and in
> some cases it actually beats them quite significantly. It seems to
> depend on how expensive the comparison is - for "int" the comparison is
> very cheap and there's almost no difference. For "text" the comparison
> is much more expensive, and there are significant speedups.
>
> For example the test with 100k lookups in array of 10k elements and 10%
> match probability, the timings are these
>
>master:  62362 ms
>binsearch: 201 ms
>bloom:  65 ms
>hashjoin:   36 ms
>
> I do think the explanation is fairly simple - the bloom filter
> eliminates about 90% of the expensive comparisons, so it's 20ms plus
> some overhead to build and check the bits. The hash join probably
> eliminates a lot of the remaining comparisons, because the hash table
> is sized to have one tuple per bucket.
>
> Note: I also don't claim the PoC has the most efficient bloom filter
> implementation possible. I'm sure it could be made faster.
>
> Anyway, I'm not sure transforming this to a hash join is worth the
> effort - I agree that seems quite complex. But perhaps this suggest we
> should not be doing binary search and instead just build a simple hash
> table - that seems much simpler, and it'll probably give us about the
> same benefits.

That's actually what I originally thought about doing, but I chose
binary search since it seemed a lot easier to get off the ground.

If we instead build a hash is there anything else we need to be
concerned about? For example, work mem? I suppose for the binary
search we already have to expand the array, so perhaps it's not all
that meaningful relative to that...

I was looking earlier at what our standard hash implementation was,
and it seemed less obvious what was needed to set that up (so binary
search seemed a faster proof of concept). If you happen to have any
pointers to similar usages I should look at, please let me know.

James




Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-04-26 Thread Justin Pryzby
On Sat, Apr 11, 2020 at 08:33:52PM -0500, Justin Pryzby wrote:
> > That's the last known issue with the patch.  I doubt anyone will elect to 
> > pick
> > it up in the next 8 hours, but I think it's in very good shape for v14 :)
> 
> I tweaked some comments and docs and plan to mark this RfC.

Rebased onto d12bdba77b0fce9df818bc84ad8b1d8e7a96614b

Restored two tests from Alexey's original patch which exposed an issue with
REINDEX DATABASE when allow_system_table_mods=off.

-- 
Justin
>From b29f3b29287188bf4fd3ff2289cb1dfbb09ff99f Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 2 Apr 2020 14:20:11 -0500
Subject: [PATCH v21 1/5] tab completion for reindex(verbose)..

..which was added at ecd222e77 for v9.5.

This handles "verbose" itself as well as the following word.

Separate commit as this could be backpatched to v12 (but backpatching further
is less trivial, due to improvements added at 121213d9d).
---
 src/bin/psql/tab-complete.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index f6fd623c98..8178e69575 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3418,7 +3418,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("DATA");
 
 /* REINDEX */
-	else if (Matches("REINDEX"))
+	else if (Matches("REINDEX") || Matches("REINDEX", "(*)"))
 		COMPLETE_WITH("TABLE", "INDEX", "SYSTEM", "SCHEMA", "DATABASE");
 	else if (Matches("REINDEX", "TABLE"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexables,
@@ -3440,6 +3440,17 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH_QUERY(Query_for_list_of_schemas);
 	else if (Matches("REINDEX", "SYSTEM|DATABASE", "CONCURRENTLY"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_databases);
+	else if (HeadMatches("REINDEX", "(*") &&
+			 !HeadMatches("REINDEX", "(*)"))
+	{
+		/*
+		 * This fires if we're in an unfinished parenthesized option list.
+		 * get_previous_words treats a completed parenthesized option list as
+		 * one word, so the above test is correct.
+		 */
+		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
+			COMPLETE_WITH("VERBOSE");
+	}
 
 /* SECURITY LABEL */
 	else if (Matches("SECURITY"))
-- 
2.17.0

>From eff5654089dc199d596535fcc22b8f4a2b7e27a6 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 27 Mar 2020 17:50:46 -0500
Subject: [PATCH v21 2/5] Change REINDEX/CLUSTER to accept an option list..

..like EXPLAIN (..), VACUUM (..), and ANALYZE (..).

Change docs in the style of VACUUM.  See also: 52dcfda48778d16683c64ca4372299a099a15b96
---
 doc/src/sgml/ref/cluster.sgml| 29 ++---
 doc/src/sgml/ref/reindex.sgml| 43 +---
 src/backend/commands/cluster.c   | 23 -
 src/backend/commands/indexcmds.c | 41 --
 src/backend/nodes/copyfuncs.c|  2 ++
 src/backend/nodes/equalfuncs.c   |  2 ++
 src/backend/parser/gram.y| 35 +++---
 src/backend/tcop/utility.c   | 36 +++---
 src/bin/psql/tab-complete.c  | 23 +
 src/include/commands/cluster.h   |  3 ++-
 src/include/commands/defrem.h|  7 +++---
 src/include/nodes/parsenodes.h   |  2 ++
 12 files changed, 180 insertions(+), 66 deletions(-)

diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 4da60d8d56..a6df8a3d81 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -21,8 +21,13 @@ PostgreSQL documentation
 
  
 
-CLUSTER [VERBOSE] table_name [ USING index_name ]
-CLUSTER [VERBOSE]
+CLUSTER [VERBOSE] [ ( option [, ...] ) ] table_name [ USING index_name ]
+CLUSTER [VERBOSE] [ ( option [, ...] ) ]
+
+where option can be one of:
+
+VERBOSE [ boolean ]
+
 
  
 
@@ -81,6 +86,16 @@ CLUSTER [VERBOSE]
   Parameters
 
   
+   
+VERBOSE
+
+ 
+  Prints a progress report as each table is clustered.
+
+ 
+
+   
+

 table_name
 
@@ -100,13 +115,19 @@ CLUSTER [VERBOSE]

 

-VERBOSE
+boolean
 
  
-  Prints a progress report as each table is clustered.
+  Specifies whether the selected option should be turned on or off.
+  You can write TRUE, ON, or
+  1 to enable the option, and FALSE,
+  OFF, or 0 to disable it.  The
+  boolean value can also
+  be omitted, in which case TRUE is assumed.
  
 

+
   
  
 
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index c54a7c420d..71941e52e3 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -25,7 +25,7 @@ REINDEX [ ( option [, ...] ) ] { IN
 
 where option can be one of:
 
-VERBOSE
+VERBOSE [ boolean ]
 
  
 
@@ -141,19 +141,6 @@ REINDEX [ ( option [, ...] ) ] { IN
 

 
-   
-name
-
- 
-  The name of the specific index, table, or database to be
-  reindexed.  Index and table names can be sche

Re: doc review for v13

2020-04-26 Thread Justin Pryzby
On Tue, Apr 14, 2020 at 02:47:54PM +0900, Michael Paquier wrote:
> On Sun, Apr 12, 2020 at 04:35:45PM -0500, Justin Pryzby wrote:
> > Added a few more.
> > And rebased on top of dbc60c5593f26dc777a3be032bff4fb4eab1ddd1
> 
> Thanks for the patch set, I have applied the most obvious parts (more
> or less 1/3) to reduce the load.  Here is a review of the rest.

Thanks - attached are the remaining undisputed portions..

> > +++ b/doc/src/sgml/ref/alter_table.sgml
> > @@ -889,7 +889,7 @@ WITH ( MODULUS  > class="parameter">numeric_literal, REM
> >from the parent table will be created in the partition, if they don't
> >already exist.
> >If any of the CHECK constraints of the table being
> > -  attached is marked NO INHERIT, the command will 
> > fail;
> > +  attached are marked NO INHERIT, the command will 
> > fail;
> >such constraints must be recreated without the
> >NO INHERIT clause.
> >   
>
> It seems to me that both are actually correct here.

I think my text is correct.  This would *also* be correct:

|   If any CHECK constraint on the table being
|   attached is marked NO INHERIT, the command will fail;

But not the hybrid: "If any OF THE .. is .."

-- 
Justin
>From 5f499fec02301696705be5b9f6e1b0c4fa1dba16 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 29 Mar 2020 21:31:08 -0500
Subject: [PATCH v4 01/12] doc: percent encoding

commit e0ed6817c0ee218a3681920807404603e042ff04
Author: Michael Paquier 
---
 doc/src/sgml/libpq.sgml | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 75d2224a61..8b9af95c14 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -925,11 +925,11 @@ postgresql:///mydb?host=localhost&port=5433

 

-Connection URI needs to be encoded with 
+A connection URI needs to be encoded with 
 https://tools.ietf.org/html/rfc3986#section-2.1";>Percent-encoding 
-if it includes symbols with special meaning in any of its parts. 
-Here is an example where equal sign (=) is replaced
-with %3D and whitespace character with
+if it includes symbols with special meanings in any of its parts. 
+Here is an example where an equal sign (=) is replaced
+with %3D and a whitespace character with
 %20:
 
 postgresql://user@localhost:5433/mydb?options=-c%20synchronous_commit%3Doff
-- 
2.17.0

>From 1df7e0445624948b63d776f6894da145db8622cc Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 19:43:22 -0500
Subject: [PATCH v4 02/12] docs: btree deduplication

commit 0d861bbb702f8aa05c2a4e3f1650e7e8df8c8c27
Author: Peter Geoghegan 
---
 doc/src/sgml/btree.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml
index e9cab4a55d..ff1e49e509 100644
--- a/doc/src/sgml/btree.sgml
+++ b/doc/src/sgml/btree.sgml
@@ -609,7 +609,7 @@ equalimage(opcintype oid) returns bool
   
   
Deduplication works by periodically merging groups of duplicate
-   tuples together, forming a single posting list tuple for each
+   tuples together, forming a single posting list tuple for each
group.  The column key value(s) only appear once in this
representation.  This is followed by a sorted array of
TIDs that point to rows in the table.  This
-- 
2.17.0

>From 383aabd1e2fe497f40710904d897f088607e08a2 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 29 Mar 2020 21:22:43 -0500
Subject: [PATCH v4 03/12] doc: pg_stat_progress_basebackup

commit e65497df8f85ab9b9084c928ff69f384ea729b24
Author: Fujii Masao 
---
 doc/src/sgml/monitoring.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 6562cc400b..d5d9da659d 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4577,8 +4577,8 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
   waiting for checkpoint to finish
   
The WAL sender process is currently performing
-   pg_start_backup to set up for
-   taking a base backup, and waiting for backup start
+   pg_start_backup to prepare to
+   take a base backup, and waiting for the start-of-backup
checkpoint to finish.
   
  
-- 
2.17.0

>From 44cc3b9422619ff068e72fbad3f0eec103594717 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 5 Apr 2020 22:35:09 -0500
Subject: [PATCH v4 04/12] Fix docs: "time of the"

commit 9e257a181cc1dc5e19eb5d770ce09cc98f470f5f
Author: Andrew Dunstan 
Date:   Sun Mar 24 11:27:20 2013 -0400

Add parallel pg_dump option.

commit 5ab892c391c6bc54a00e7a8de5cab077cabace6a
Author: Michael Paquier 
Date:   Sat Jul 27 22:21:18 2019 +0900

Add support for --jobs in reindexdb

commit a17923204736d8842eade3517d6a8ee81290fca4
Author: Alvaro Herrera 
Date:   Fri Jan 23 15:02:45 2015 -0300

vacuumdb: enable parallel mode
---
 doc/src/sgml

Re: Subplan result caching

2020-04-26 Thread Andy Fan
On Sun, Apr 26, 2020 at 5:49 PM David Rowley  wrote:

> On Sun, 26 Apr 2020 at 19:08, Andy Fan  wrote:
> > If we want to handle this case as well, one of the changes would
> > be that it needs to cache multiple records for one input parameter, or
> > return one row each time but return multiple times for one input parameter.
> > Tuplestore may be a good option for this case since it has full
> > functionality like tuple_puttuple, tuple_gettuple. But if we implement it
> > with tuplestore, the next question is how to control the memory usage for
> > this Node.  We can use a dedicated memory context to know how much memory
> > this node used in total, but we can't stop the tuplestore from using more
> > memory.  Or we can force-set the current tuplestore->state to TTS_WRITEFILE
> > and set the allowedMem to 0 for the following tuplestore, after we find too
> > much memory is used. However this looks a bit of a hack.
>
> I didn't imagine a tuplestore would be that useful for this. A node
> like this will do its best work when the ratio of n_values /
> distinct_values of the parameters is high. The planner can often not
> be that great at knowing the number of distinct values, especially so
> when there is more than one expression to estimate the number of
> distinct values for. (we added extended statistics to try to help with
> that).  I think this node will do its best when the time spent for a
> cache miss is barely any more expensive than scanning the subnode to
> get the results.  If we can do that then we'll see fewer regressions
> for when we inject one of these nodes where it'll do no good, e.g when
> we'll never get a repeated value.  If we start spilling these tuples
> out to disk then it adds overhead which might never pay off.
>
> I'd suggest a hash table to act as an MRU cache.  We'd just evict old
> values when we run out of space, i.e consume all of work_mem.
>
> I've got a bunch of code locally which is still a work in progress to
> do this. I'll finish it off and post it here.


I was thinking that we may have to maintain some extra state if we use a hash
table rather than a tuplestore, but that might not be a major concern.  I can
wait and see your patch.

Best Regards
Andy Fan


Re: Setting min/max TLS protocol in clientside libpq

2020-04-26 Thread Peter Eisentraut

On 2020-04-24 14:03, Daniel Gustafsson wrote:

> On 24 Apr 2020, at 12:56, Peter Eisentraut wrote:
>
>> Can we reconsider whether we really want to name the new settings like
>> "sslminprotocolversion", or whether we could add some underscores, both for
>> readability and for consistency with the server-side options?
>
> That was brought up by Michael in the thread, but none of us followed up on
> it, it seems.  The current name was chosen to be consistent with the already
> existing ssl* client-side settings, but I don't really have strong opinions on
> if that makes sense or not.  Perhaps use ssl_m{in|max}_protocolversion to make
> it more readable?

The names on the backend side are ssl_{min|max}_protocol_version.

> The attached renames the user-facing setting, but keeps the environment
> variable without underscores as most settings have env vars without
> underscores.

Keeping the environment variable as is seems fine (also consistent with
"channel_binding").

I would, however, prefer to also rename the internal symbols.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Subplan result caching

2020-04-26 Thread David Rowley
On Sun, 26 Apr 2020 at 19:08, Andy Fan  wrote:
> If we want to handle this case as well, one of the changes would
> be that it needs to cache multiple records for one input parameter, or return
> one row each time but return multiple times for one input parameter.
> Tuplestore may be a good option for this case since it has full functionality
> like tuple_puttuple, tuple_gettuple. But if we implement it with tuplestore,
> the next question is how to control the memory usage for this Node.
> We can use a dedicated memory context to know how much memory
> this node used in total, but we can't stop the tuplestore from using more
> memory.  Or we can force-set the current tuplestore->state to TTS_WRITEFILE
> and set the allowedMem to 0 for the following tuplestore, after we find too
> much memory is used. However this looks a bit of a hack.

I didn't imagine a tuplestore would be that useful for this. A node
like this will do its best work when the ratio of n_values /
distinct_values of the parameters is high. The planner can often not
be that great at knowing the number of distinct values, especially so
when there is more than one expression to estimate the number of
distinct values for. (we added extended statistics to try to help with
that).  I think this node will do its best when the time spent for a
cache miss is barely any more expensive than scanning the subnode to
get the results.  If we can do that then we'll see fewer regressions
for when we inject one of these nodes where it'll do no good, e.g. when
we'll never get a repeated value.  If we start spilling these tuples
out to disk then it adds overhead which might never pay off.
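
For illustration only (the tables here are hypothetical, not taken from the
patch or this thread), this is the shape of query such a cache node is aimed
at: the inner side is probed once per outer row, but with only a handful of
distinct parameter values, so most probes are cache hits.

-- Hypothetical tables: order_lines has many rows but only a few distinct
-- customer_id values, so most executions of the parameterized subquery
-- repeat a parameter value that has already been seen.
SELECT ol.id,
       (SELECT count(*)
          FROM orders o
         WHERE o.customer_id = ol.customer_id) AS order_count
FROM order_lines ol;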

I'd suggest a hash table to act as an MRU cache.  We'd just evict old
values when we run out of space, i.e. consume all of work_mem.

I've got a bunch of code locally which is still a work in progress to
do this. I'll finish it off and post it here.

David




Re: WIP: Aggregation push-down

2020-04-26 Thread Andy Fan
On Fri, Apr 24, 2020 at 8:10 PM Antonin Houska  wrote:

> Andy Fan  wrote:
>
> > The more tests on your patch, the more powerful I feel it is!
>
> Thanks for the appreciation. Given the poor progress it's increasingly hard
> for me to find motivation to work on it. I'll try to be more professional
> :-)
>
>
Let's not give up :)  I see your patch can push down the aggregation in at
least 3 cases:

1).  The group by clause exists in the join eq clause.
2).  The group by clause doesn't exist in the join eq clause, but we have a
pk on one side of the join eq clause.
3).  The group by clause doesn't exist in the join eq clause, and we don't
have a pk either.
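
To make the first two cases concrete, here is a minimal sketch with
hypothetical tables t1 and t2 (my illustration, not taken from the patch):

-- Case 1: the grouping column t2.a is also the join column, so the aggregate
-- over t1 can be computed below the join, once per value of a.
SELECT t2.a, sum(t1.x)
FROM t1 JOIN t2 ON t1.a = t2.a
GROUP BY t2.a;

-- Case 2: the grouping column t2.b is not in the join clause, but t2.id is a
-- primary key, so each pushed-down group of t1 (grouped by its join key)
-- joins to at most one t2 row, which is what keeps the push-down correct.
SELECT t2.b, sum(t1.x)
FROM t1 JOIN t2 ON t1.t2_id = t2.id
GROUP BY t2.b;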

Tom explained the correctness of the first 2 cases well [1], and case 3) is
probably correct as well, but it is a bit hard to understand.  If that is the
case for other reviewers too, it probably makes the review harder.

So my little suggestion is: can we split the patch into smaller commits to
handle each case?  For example: commits 1 & 2 handle cases 1 & 2, commit 3
handles join relations as well, commit 4 handles case 3, commit 5 avoids the
two-phase aggregation for case 2, commit 6 introduces the aggmultifn, and
commit 7 handles the outer join case Ashutosh raised at [2].  However, not
all the cases need to be handled at one time.

Actually, when I read the code I wanted to do such a separation on my own to
verify that my understanding is correct, but I don't have the time for it
right now.  It may be much easier for you since you are the author.  I will
be happy to help in any case after I close my current work item (hopefully
within 2 months).

> At the code level, I did some slight changes on init_grouping_targets
> which may
> > make the code easier to read.  You are free to use/not use it.
>
> I'm going to accept your change of create_rel_agg_info(), but I hesitate
> about
> the changes to init_grouping_targets().
>
> First, is it worth spending CPU cycles on the construction of an extra list
> grouping_columns? Is there a corner case in which we cannot simply pass
> grouping_columns=target->exprs to check_functional_grouping()?
>
> Second, it's obvious that you prefer the style
>
> foreach ()
> {
> if ()
>...
> else if ()
>...
> else
>...
> }
>
> over this
>
> foreach ()
> {
> if ()
> {
>...
>continue;
> }
>
> if ()
> {
>...
>continue;
> }
>
> ...
> }
>
> I often prefer the latter and I see that the existing planner code uses
> this
> style quite often too. I think the reason is that it allows for more
> complex
> tests, while the "else-if style" requires all tests to take place inside
> the
> "if ()" expression. However, if several (not necessarily tightly related)
> tests become "compressed" this way, it's less obvious where to put
> comments.  Indeed it seems that some comments got lost due to your changes.


Your explanation looks reasonable, thanks for that.  The changes also used
some built-in functions, like tlist_member, to avoid one more level of
for-loop.

As for the high-level design, based on my current knowledge, there is
probably no need to change it.

[1] https://www.postgresql.org/message-id/9726.1542577439%40sss.pgh.pa.us
[2]
https://www.postgresql.org/message-id/flat/CAFjFpRdpeMTd8kYbM_x0769V-aEKst5Nkg3%2BcoG%3D8ki7s8Zqjw%40mail.gmail.com


Re: [Proposal] Global temporary tables

2020-04-26 Thread 曾文旌


> On Apr 23, 2020, at 3:43 PM, Pavel Stehule wrote:
> 
> 
> 
> On Thu, Apr 23, 2020 at 9:10 AM, 曾文旌 wrote:
> 
> 
>> On Apr 22, 2020, at 10:50 PM, Pavel Stehule wrote:
>> 
>> 
>> 
>> On Wed, Apr 22, 2020 at 16:38, Prabhat Sahu
>> <prabhat.s...@enterprisedb.com> wrote:
>> 
>> 
>> On Wed, Apr 22, 2020 at 2:49 PM, 曾文旌 wrote:
>>> 
>>> Although the implementation of GTT is different, I think TRUNCATE on 
>>> Postgres (when it is really finalized) can remove session metadata of GTT 
>>> too (and reduce the usage counter). It is not a critical feature, but I think 
>>> it should not be hard to implement. For practical reasons it would be nice to 
>>> have a tool to refresh a GTT without the necessity of closing the session. 
>>> TRUNCATE can be this tool.
> Sorry, I don't quite understand what you mean, could you describe it in 
> detail? 
> In my opinion the TRUNCATE GTT cannot clean up data in other sessions, 
> especially clean up local buffers in other sessions.
> 
> It is about the possibility to force-reset a GTT to the default empty state for all 
> sessions.
> 
> Maybe that is somewhat what your TRUNCATE DROP does, but I don't think this 
> design (TRUNCATE DROP) is good, because then the user has to know implementation 
> details.
> 
> I prefer something like TRUNCATE tab WITH OPTION (GLOBAL, FORCE) - "GLOBAL" .. 
> applies to all sessions, FORCE tries to do it without waiting on some global lock, 
> tries to do it immediately with the possibility of cancelling some statements and 
> rolling back some sessions.
> 
> instead of GLOBAL maybe we can use "ALLSESSION", or "ALL SESSION", or something else
> 
> But I like the possible terminology LOCAL x GLOBAL for GTT. What do I mean? Some 
> statements like "TRUNCATE" can work (by default) in "local" mode .. it has an 
> impact on the current session only. But sometimes it can be executed in "global" 
> mode with effect on all sessions.
TRUNCATE GTT GLOBAL is like the DROP GTT FORCE you mentioned before.
I think this requires identifying sessions that have initialized the storage 
file but have no actual data.
And handling local buffers and locks in other sessions is also difficult.
It may be harder than dropping the GTT with force, which can kill other sessions, 
but TRUNCATE GTT would prefer not to do that.
This doesn't seem to meet the basic conditions; it's not easy.
So, I want to put this feature into the next releases, along with DROP GTT FORCE.
Also, in view of your comments, I have rolled back the TRUNCATE GTT DROP feature.



Wenjing

> 
> 
> 
>> Yes, I think we need a way to delete the GTT local storage without closing 
>> the session.
>> 
>> I provide the TRUNCATE tablename DROP to clear the data in the GTT and 
>> delete the storage files.
>> This feature requires the current transaction to commit immediately after it 
>> finishes truncate.
>> 
>> Hi Wenjing,
>> Thanks for the patch (v30) with the new syntax support (TRUNCATE 
>> table_name DROP) for deleting storage files after TRUNCATE on GTT.
>> 
>> This syntax looks strange, and I don't think it solves anything in 
>> practical life, because without a lock the table will be used within a few 
>> seconds by other sessions.
> 
> If a dba wants to delete or modify a GTT, he can use locks to help him make 
> the change. 
> 
> postgres=# begin;
> BEGIN
> postgres=*# LOCK TABLE gtt2 IN ACCESS EXCLUSIVE MODE;
> postgres=*# select * from pg_gtt_attached_pids ;
> 
> Kill session or let session do TRUNCATE tablename DROP
> 
> postgres=*# drop table gtt2;
> DROP TABLE
> postgres=*# commit;
> COMMIT
> 
> yes, a user can lock a table. But I don't think that is a user-friendly design. I 
> don't remember any statement in Postgres where I have to use table locks 
> explicitly.
> 
> For builtin commands it should be done transparently (for user).
It can be improved, like DROP GTT FORCE.

> 
> Regards
> 
> Pavel
>  
> 
>> 
>> This is the same topic as when we talked about ALTER - when and where the changes 
>> should be applied. 
>> 
>> The CLUSTER command works only on session-private data, so it should not 
>> need some special lock or special cleaning beforehand.
>> 
>> Regards
>> 
>> Pavel
>>  
>>  
>> Please check below scenarios:
>> 
>> Case1:
>> -- session1:
>> postgres=# create global temporary table gtt2 (c1 integer) on commit 
>> preserve rows;
>> CREATE TABLE
>> postgres=# create index  idx1 on gtt2 (c1);
>> CREATE INDEX
>> postgres=# create index  idx2 on gtt2 (c1) where c1%2 =0;
>> CREATE INDEX
>> postgres=# 
>> postgres=# CLUSTER gtt2 USING idx1;
>> CLUSTER
>> postgres=# CLUSTER gtt2 USING idx2;
>> ERROR:  cannot cluster on partial index "idx2"
>> 
>> Case2:
>> -- Session2:
>> postgres=# CLUSTER gtt2 USING idx1;
>> CLUSTER
>> postgres=# CLUSTER gtt2 USING idx2;
>> CLUSTER
>> 
>> postgres=# insert into gtt2 values(1);
>> INSERT 0 1
>> postgres=# CLUSTER gtt2 USING idx1;
>> CLUSTER
>> postgres=# CLUSTER gtt2 USING idx2;
>> ERROR:  cannot cluster on partial index "idx2"
>> 
>> Case3:
>> -- Session2:

Re: Subplan result caching

2020-04-26 Thread Andy Fan
On Sun, Apr 26, 2020 at 2:11 PM David Rowley 
wrote:

> On 23 May 2018 at 21:31, Heikki Linnakangas  wrote:
> > I've been working on a patch to add a little cache to SubPlans, to speed
> up
> > queries with correlated subqueries, where the same subquery is currently
> > executed multiple times with the same parameters. The idea is to cache
> the
> > result of the subplan, with the correlation vars as the cache key.
>
> Hi,
>
> This seems like an interesting area to make improvements, so I've
> signed up to review the patch.
>
> From looking at the code I see that the caching is being done inside
> nodeSubplan.c. I don't think this is the right approach to the
> problem. The problem exists for any parameterized path, so I think a
> more general approach would be much better.
>

I'm +1 on this suggestion.  Actually I'm working on a query like this:

select  j1o.i, j2_v.sum_5
from j1 j1o
inner join lateral
(select im100, sum(im5) as sum_5
 from j2
 where j1o.im100 = im100
 and j1o.i = 1
 group by im100) j2_v
on true
where j1o.i = 1;

I hope we can have a result cache for j2_v, but this has nothing to do with SubPlan.
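
For contrast, here is a rough correlated-subquery rewrite (my illustration,
using the same j1/j2 tables as above): this is the form the SubPlan cache
proposed in this thread would cover, while the LATERAL join above needs the
more general parameterized-path approach.  (The two differ when j2 has no
matching row: the scalar subquery still returns a row with a NULL sum,
whereas the inner lateral join drops the outer row.)

SELECT j1o.i,
       (SELECT sum(im5)
          FROM j2
         WHERE j2.im100 = j1o.im100) AS sum_5
FROM j1 j1o
WHERE j1o.i = 1;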

If we want to handle this case as well, one of the changes would
be that it needs to cache multiple records for one input parameter, or
return one row at a time but return multiple times for one input parameter.
Tuplestore may be a good option for this case since it has the full
functionality we need, like tuple_puttuple and tuple_gettuple. But if we
implement it with a tuplestore, the next question is how to control the
memory usage for this Node.  We can use a dedicated memory context to know
how much memory this node used in total, but we can't stop the tuplestore
from using more memory.  Or we can force the current tuplestore->state to
TSS_WRITEFILE and set allowedMem to 0 for the following tuplestores, after
we find that too much memory is used. However, this looks like a bit of a hack.

As a summary of my earlier thinking about this topic, I'd note some other,
incomplete options to handle this case.  1) We just use the Material Path to
cache the result for the last parameter only; if the following value is the
same as the last one, we use it, otherwise we rescan the Material path.
2) We can consider ordering the rows by the input params key. That would be
much more cache friendly.

BTW, I see that node->chgParams may be changed without checking whether the
datum changed from the previous one; this may lose some optimization
opportunities (for example, in the agg node rescan case, if the params have
not changed, we could reuse the hash table we built before).  So if we add a
new parameter to node->chgParams only when the datum has really changed,
will it still be semantically correct?
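
Purely as an illustration of that rescan case (a hypothetical data
distribution, not from the thread): if the outer rows arrive in im100 order,
the inner aggregate is rescanned again and again with an unchanged
parameter, which is exactly where skipping the chgParams update would let
its hash table be reused.

SELECT j1o.i, j2_v.sum_5
FROM (SELECT * FROM j1 ORDER BY im100) j1o
INNER JOIN LATERAL
    (SELECT sum(im5) AS sum_5
       FROM j2
      WHERE j2.im100 = j1o.im100) j2_v ON true;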

> It's great to see someone working on this.
>

I'd like to have a try.

Best Regards
Andy Fan