> From: Tom Lane <t...@sss.pgh.pa.us>
> Sent: 25 June 2020 17:43
>
> Alastair McKinley <a.mckin...@analyticsengines.com> writes:
> > I know that Cube in its current form isn't suitable for nearest-neighbour
> > searching these vectors in their raw form (I have tried recompilation with
> > higher CUBE_MAX_DIM myself), but conceptually kNN GiST searches using Cubes
> > can be useful for these applications. There are other pre-processing
> > techniques that can be used to improve the speed of the search, but it
> > still ends up with a kNN search in a high-ish dimensional space.
>
> Is there a way to fix the numerical instability involved? If we could do
> that, then we'd definitely have a use-case justifying the work to make
> cube toastable.

I am not that familiar with the nature of the numerical instability, but it
might be worth noting for additional context that for the NN use case:

- The value of each dimension is likely to be between 0 and 1.
- The L1 distance is meaningful for high numbers of dimensions, and it
  *possibly* suffers less from the numeric issues than Euclidean distance.

Numerical stability isn't the only issue for high-dimensional kNN: GiST
search performance currently degrades with increasing N towards sequential
scan performance, although maybe the two are related?

> regards, tom lane

Best regards,
Alastair
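
P.S. For concreteness, here is a minimal brute-force sketch of the kind of L1 kNN search described above, written in Python rather than SQL. It is only an illustration of the result an index-assisted search would need to reproduce. The names `l1_distance`, `l2_distance` and `knn` are hypothetical and not part of cube, and the vectors are assumed to have components between 0 and 1.

```python
import heapq
import random


def l1_distance(a, b):
    """Taxicab (L1) distance: sum of absolute per-dimension differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean (L2) distance, included only for comparison."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def knn(query, vectors, k, distance=l1_distance):
    """Return the indices of the k vectors nearest to query.

    This is a full sequential scan, i.e. the baseline that an index
    would have to beat.
    """
    return heapq.nsmallest(
        k, range(len(vectors)), key=lambda i: distance(query, vectors[i])
    )


if __name__ == "__main__":
    random.seed(0)
    dims = 128  # assumed dimensionality, chosen only for illustration
    data = [[random.random() for _ in range(dims)] for _ in range(1000)]
    query = [random.random() for _ in range(dims)]
    print("L1 neighbours:", knn(query, data, 5))
    print("L2 neighbours:", knn(query, data, 5, distance=l2_distance))
```

As far as I know, the cube module exposes the taxicab metric as the `<#>` operator and the Euclidean metric as `<->`, so the equivalent query would order by one of those with a LIMIT.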