> From: Tom Lane <t...@sss.pgh.pa.us>
> Sent: 25 June 2020 17:43
>  
> Alastair McKinley <a.mckin...@analyticsengines.com> writes:
> > I know that Cube in its current form isn't suitable for nearest-neighbour
> > searching these vectors in their raw form (I have tried recompilation with 
> > higher CUBE_MAX_DIM myself), but conceptually kNN GiST searches using Cubes 
> > can be useful for these applications.  There are other pre-processing 
> > techniques that can be used to improve the speed of the search, but it
> > still ends up with a kNN search in a high-ish dimensional space.
> 
> Is there a way to fix the numerical instability involved?  If we could do
> that, then we'd definitely have a use-case justifying the work to make
> cube toastable.

I am not that familiar with the nature of the numerical instability, but it 
might be worth noting for additional context that for the NN use case:

- The value of each dimension is likely to be between 0 and 1.
- The L1 distance is meaningful for high numbers of dimensions, and it
*possibly* suffers less from the numeric issues than Euclidean distance.
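
To make the two distance measures concrete, here is a minimal Python sketch
(not cube's C implementation) that computes both for two random vectors with
every coordinate in [0, 1]. The 512-dimension size and the function names are
assumptions chosen for illustration. The only point it shows is that L1 needs
no squaring or square root; whether that helps with the instability in
question is not something this sketch demonstrates.

```python
import math
import random


def l1_distance(a, b):
    """Taxicab (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean (L2) distance: square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


random.seed(0)   # fixed seed so repeated runs use the same vectors
dims = 512       # hypothetical embedding size, not taken from the thread

# Coordinates drawn uniformly from [0, 1), matching the assumption above.
a = [random.random() for _ in range(dims)]
b = [random.random() for _ in range(dims)]

print("L1:", l1_distance(a, b))
print("L2:", l2_distance(a, b))
```

With coordinates confined to [0, 1], the L1 distance can never exceed the
number of dimensions, and the L2 distance can never exceed the L1 distance.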

Numerical stability isn't the only issue for high-dimensional kNN. GiST
search performance currently degrades towards sequential-scan performance
with increasing N, although maybe the two issues are related?

>                         regards, tom lane

Best regards, 
Alastair
