100?

Andrey Borodin Wed, 12 Jun 2019 08:15:10 -0700

> 12 июня 2019 г., в 15:11, Siarhei Siniak <siarheisin...@yahoo.com> написал(а):
> 
> A uniform set of points with a dimension 128 and type cube. That has a size 
> of 50 ** 3. Can be queried for a nearest neighbor at a speed of 10 queries 
> per second with limit varying from 1 to 25.
> It works better than when no index used at all. So gist gives here a speed up.

Then, I think, data is correlated. I believe it is possible to implement 
something better for high dimensional KNN in GiST than cube.


> 
> The documentation of postgresql doesn't mention complexity of such an index. 
> I've got confused as to its speed.
> 
> Does postgresql allow for an approximate nearest neighbor search?
> https://github.com/erikbern/ann-benchmarks has a lot of efficient 
> implementations.

ANN is beyond concepts of SQL standard: database with index must return same 
results as without index.
I can add ANN to github.com/x4m/ags which is a fork of GiST.

But PostgreSQL adds a lot of OLTP overhead:
1. it is crash-safe
2. it supports concurrent operations
2a. in a very unexpected way, for example in serializable mode it guaranties 
you that if you were looking for nearest neighbor there will not appear any new 
closer neighbor until the end of your transaction.
3. it allows extensibility and has abstraction layers
4. it has declarative querying language
etcetc

All this comes at a cost of database that can do anything and be anything. It 
its very hard to compete with specialized indexes that only do ANN.

Yet, as far as I know, no one really pursued the idea of fast high dimensional 
ANN in PG.

Best regards, Andrey Borodin.
Re: GiST limits on contrib/cube with dimension > 100?

Reply via email to