On Wed, Sep 09, 2020 at 04:53:30PM -0300, Alvaro Herrera wrote:
On 2020-Sep-09, Tomas Vondra wrote:

There are some minor optimizations possible - for example I noticed we
call minmax_multi_get_strategy_procinfo often because it happens in a
loop, and we could easily do it just once. But that saves only about 10%
or so, it's not a ground-breaking optimization.

Well, I guess this kind of thing should be fixed regardless while we
still know it's there, just to avoid an obvious inefficiency.


Sure. I was just suggesting it's not something that'd make this very
close to plain minmax opclass.

The main reason for the slowness is that we pass the values one by one
to brin_minmax_multi_add_value - and on each call we need to deserialize
(and then sometimes also serialize) the summary, which may be quite
expensive. The regular minmax does not have this issue, it just swaps
the Datum value and that's it.

Ah, right, that's more interesting.  The original dumb BRIN code
separates BrinMemTuple from BrinTuple so that things can be operated
efficiently in memory.  Maybe something similar can be done in this
case, which also sounds like your second suggestion:

Another option would be to teach add_value to keep the deserialized
summary somewhere, and then force serialization at the end of the BRIN
page range. The end result would be roughly the same, I think.


Well, the patch already has Ranges (memory) and SerializedRanges (disk)
but it's not very clear to me where to stash the in-memory data and
where to make the conversion.


Also, I think you could get a few initial patches pushed soon, since
they look like general improvements rather than specific to multi-range.


Yeah, I agree. I plan to review those once again in a couple days and
then push them.


On a differen train of thought, I wonder if we shouldn't drop the idea
of there being two minmax opclasses; just have one (still called
"minmax") and have the multi-range version be the v2 of it.  We would
still need to keep code to operate on the old one, but if you ever
REINDEX then your index is upgraded to the new one.  I see no reason to
keep the dumb minmax version around, assuming the performance is roughly
similar.


I'm not a huge fan of that. I think it's unlikely we'll ever make this
new set of oplasses just as fast as the plain minmax, and moreover it
does have some additional requirements (e.g. the distance proc, which
may not make sense for some data types).


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply via email to