Re: POC: GROUP BY optimization

Andrei Lepikhov Tue, 26 Dec 2023 20:36:14 -0800

On 27/12/2023 11:15, Alexander Korotkov wrote:

On Wed, Dec 27, 2023 at 5:23 AM Tom Lane <[email protected]> wrote:

Alexander Korotkov <[email protected]> writes:

2) An accurate estimate of the sorting cost is quite a difficult task.


Indeed.

What if we make a simple rule of thumb that sorting integers and
floats is cheaper than sorting numerics and strings with collation C,
in turn, that is cheaper than sorting collation-aware strings
(probably more groups)?  Within the group, we could keep the original
order of items.


I think it's a fool's errand to even try to separate different sort
column orderings by cost.  We simply do not have sufficiently accurate
cost information.  The previous patch in this thread got reverted because
of that (well, also some implementation issues, but mostly that), and
nothing has happened to make me think that another try will fare any
better.

To be clear. In [1], I mentioned we can perform micro-benchmarks and
structure costs of operators. At least for fixed-length operators, it is
relatively easy. So, the main block here is an accurate prediction of
ndistincts for different combinations of columns. Does it make sense to
continue to design the feature in the direction of turning on choosing
between different sort column orderings if we have extended statistics
on the columns?

[1]https://www.postgresql.org/message-id/[email protected]
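
To illustrate why ndistinct matters for the choice of sort column order,
here is a minimal sketch in Python. It is not PostgreSQL code and not the
planner's cost model. The function names, the column names "low" and
"high", and the row counts are all hypothetical. It only counts how many
times each column is compared while sorting the same rows under two
column orderings.

```python
import functools
import random


def count_column_comparisons(rows, column_order):
    """Sort rows by column_order and count comparisons made on each column."""
    counts = {col: 0 for col in column_order}

    def compare(a, b):
        # Compare column by column; a later column is only consulted
        # when all earlier columns are equal.
        for col in column_order:
            counts[col] += 1
            if a[col] != b[col]:
                return -1 if a[col] < b[col] else 1
        return 0

    sorted(rows, key=functools.cmp_to_key(compare))
    return counts


def make_rows(n, seed=0):
    """Build rows with one low-ndistinct and one high-ndistinct column."""
    rng = random.Random(seed)
    return [
        {"low": rng.randrange(2), "high": rng.randrange(n)}
        for _ in range(n)
    ]


rows = make_rows(2000)
low_first = count_column_comparisons(rows, ["low", "high"])
high_first = count_column_comparisons(rows, ["high", "low"])

print("low-ndistinct column first: ", low_first)
print("high-ndistinct column first:", high_first)
```

With the high-ndistinct column first, the second column is rarely
consulted, so the total number of per-column comparisons drops. Deciding
which ordering wins in a real plan still depends on how well ndistinct
can be estimated for each column combination, and that estimate is the
open question above.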


--
regards,
Andrei Lepikhov
Postgres Professional
