Hello,
I have been pondering this for a while, but never really looked deeply
into the problem.
I have 96 dimensional points and I would like to pose queries such as:
'give me all points that are within such a radius of this one'. The gis
extensions to mysql might support such type of query. The
Geert-Jan Brits wrote:
You're most likely talking about something like consine-similarity on
N-dimensional vectors.
http://en.wikipedia.org/wiki/Cosine_similarity
http://stackoverflow.com/search?q=cosine+similarity
Cool links ! Although it is not why I need it for. I'm really talking
about an
Johan De Meersman wrote:
Well... a point in an n-dimensional space, is a location that has a
defined value for each of it's n dimensions. If you have a value for
each of your 96 dimensions, you have a point.
Well, it's fairly simple. If you have two points with 96 values in each.
I'm not sure why, but it seems that some people, I don't mean to imply
that you are one of them, think there is some magic MySQL can preform to
find points with in a given radius using the GIS extension. There is no
magic. They simply use the well known math required to determine what
points
Perhaps you could give us a (generalized) description of your use-case, so
we can better grasp what you want to achieve, and how you want to use it.
i.e: since I can't imagine/ envison a real 'eucledian distance' over 96
dimensions I bet you're talking a generalized distance function over N
Hello Chris,
The use case I'
m talking about is actually a typical usecase for GIS applications: give
me the x closest points to this one. E.g: give me the 10 points closest
to (1,2,79) or in my case: give me the 100 points closest to
(x1,x96). A query like yours might be possible and might
Geert-Jan Brits wrote:
Perhaps you could give us a (generalized) description of your use-case, so
we can better grasp what you want to achieve, and how you want to use it.
i.e: since I can't imagine/ envison a real 'eucledian distance' over 96
dimensions I bet you're talking a generalized
Here is an idea, I'm not going to code this one:) It's still not an
ideal solution because it has to make assumptions about your data set.
Execute the algorithm I outlined previously with a very small r value,
if you didn't find the number of points you are looking for, increase r
and modify