Re: DRAFT GIST support for ORDER BY

Matthias van de Meent Mon, 30 Oct 2023 05:31:46 -0700

On Mon, 30 Oct 2023 at 09:04, Michał Kłeczek <[email protected]> wrote:
>
> Hi All,
>
> Attached is a first attempt to implement GIST index (only) scans for ORDER BY 
> column clauses.


Cool!

> The solution is not ideal as it requires registering “<“ and “>” operators as 
> ordering operators in opfamily
> (which in turn makes it possible to issue somewhat meaningless “ORDER BY a < 
> ‘constant’)

I don't quite understand why we need to register new "<" and ">"
operators. Can't we update the current ones?

> The problem is though that right now handling of ORDER BY column clauses is 
> tightly coupled to BTree.
> It would be good to refactor the code so that semantics of ORDER BY column 
> could be more flexible.

The existence of a BTREE operator class for the type is the indicator
that (and how) the type can be ordered - that is where PostgreSQL gets
its methods for ordering most types. Although I agree that it's a
quirk, I don't mind it that much as an indicator of how a type is
ordered.
I do agree, though, that operator classes by themselves should be able
to say "hey, we support full ordered retrieval as well". Right now,
that seems to be limited to btrees, but indeed a GiST index with
btree_gist columns should be able to support the same.

> It would be great if someone could take a look at it.

I've not looked in detail at the patch, but here's some comments:

> --- a/contrib/btree_gist/btree_gist--1.6--1.7.sql
> +++ b/contrib/btree_gist/btree_gist--1.6--1.7.sql

You seem to be modifying an existing migration of a released version
of the btree_bist extension. I suggest you instead add a migration
from 1.7 to a new version 1.8, and update the control file's default
installed version.

> ORDER BY a == ORDER BY a <-> MIN_VALUE
> and
> ORDER BY a DESC == ORDER BY a <-> MAX_VALUE
>
> This allows implementing GIST ordered scans for btree_gist datatypes.
>
> This in turn makes using GIST with partitioning feasible (I have described 
> issues with such usage in my previous e-mails - see below).

Did you take into account that GiST's internal distance function uses
floating point, and is thus only an approximation for values that
require more than 2^54 significant bits in their distance function?
For example, GiST wouldn't be guaranteed to yield correct ordering of
int8/bigint when you use `my_column <-> UINT64_MAX` because as far as
the floating point math is concerned, 0 is about as far away from
INT64_MAX as (say) 20 and -21.


Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

Re: DRAFT GIST support for ORDER BY

Reply via email to