Fw: Pseudo-Spatial data & MySQL

Rob McDonald Fri, 24 Jun 2005 09:10:44 -0700

> If I remember right, the R-trees are associated with the GIS extensions to
> MySQL. I could be wrong, but that's how I remember it (and I had a hard
> time finding a reference to them in the official online manual). Can anyone
> help?


From what I've been able to dig up, this is correct.  I was hoping that
there was some generalized functionality also available.  As you say,
documentation is scarce on these features.

> I guess as long as your _results_ are points of no more than two
> dimensions, you could try using the spatial extensions to store them.
> However, you make it sound as though you have an N-dimensional input space
> and an M-dimensional output space. I don't think the GIS extensions to
> MySQL cover the cases where M or N are greater than 2.

Yes, I'm looking for N-D input space, where N is unknown until runtime.  I
agree that the GIS stuff won't help me out.

> If I am right, you are asking for help to determine all of the points in
> an X-dimensional space that reside within a certain radius of a particular
> point, or the nearest C points to a particular location in X-dimensional
> space.
>
> One way to minimize your search targets T(0...n) is to define an
> X-dimensional hypercube around your target point P(0) by taking your
> original coordinates and looking +/- some value for each dimension. Then,
> you would compute the distance from your P(0) to each T(x), sort the
> results and pick the "nearest" by limiting to however many you wanted.
>
> In my opinion, you can store the data two different ways: flat or
> normalized. In the flat model, you create one column for each dimension.
> This is sometimes faster to work with but takes up more room. The
> down-side to this arrangement is: should you ever need to increase the
> number of dimensions you are storing, you will have to change the table
> structure to do it. The normalized model creates list-pairs (I think this
> is what you are doing) and can consume as much room as the flat model,
> depending on the density of your high dimensional data.
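
The hypercube approach quoted above can be sketched in a few lines of
Python. This is only an illustration under assumptions: the function name,
the single half_width applied to every dimension, and the sample points are
all made up for the example and do not come from the thread.

```python
import math

def nearest_points(target, candidates, half_width, limit):
    """Return up to `limit` candidates closest to `target`.

    Step 1 keeps only candidates inside the hypercube
    target[i] +/- half_width on every dimension.
    Step 2 sorts the survivors by Euclidean distance.
    """
    in_box = [
        p for p in candidates
        if all(abs(a - b) <= half_width for a, b in zip(p, target))
    ]
    in_box.sort(key=lambda p: math.dist(p, target))
    return in_box[:limit]

# Hypothetical sample data: four points in a 3-dimensional space.
points = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.5, 0.2, 0.1), (5.0, 0.0, 0.0)]
print(nearest_points((0.0, 0.0, 0.0), points, half_width=1.0, limit=2))
```

Two caveats: the corners of the box lie farther away than half_width, so a
true radius search needs a final distance check, and if fewer than `limit`
points fall inside the box it has to be widened and the search repeated.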

I have devised a system that can handle dynamic N input dimensions.  I'm not
sure if it matches your normalized scheme, but it certainly isn't the flat
scheme.
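
For comparison, here is a rough sketch of what a normalized layout with a
box-filtered nearest-neighbour query might look like. It is not the system
mentioned above. It uses Python's built-in sqlite3 module as a stand-in for
MySQL so that it runs on its own, and the table and column names (coords,
parent_id, dim, value, target) are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE coords (
        parent_id INTEGER NOT NULL,  -- which point this row belongs to
        dim       INTEGER NOT NULL,  -- dimension index, 0..N-1
        value     REAL    NOT NULL,  -- coordinate on that dimension
        PRIMARY KEY (parent_id, dim)
    )
""")
conn.execute(
    "CREATE TEMP TABLE target (dim INTEGER PRIMARY KEY, value REAL NOT NULL)"
)

def add_point(parent_id, coords):
    """Store one point as one row per dimension."""
    conn.executemany(
        "INSERT INTO coords VALUES (?, ?, ?)",
        [(parent_id, d, v) for d, v in enumerate(coords)],
    )

def nearest(target, half_width, limit):
    """Return parent_ids inside the box around `target`, nearest first."""
    conn.execute("DELETE FROM target")
    conn.executemany("INSERT INTO target VALUES (?, ?)", list(enumerate(target)))
    rows = conn.execute("""
        SELECT c.parent_id,
               SUM((c.value - t.value) * (c.value - t.value)) AS dist2
        FROM coords AS c
        JOIN target AS t ON t.dim = c.dim
        WHERE c.value BETWEEN t.value - ? AND t.value + ?  -- per-dimension box
        GROUP BY c.parent_id
        HAVING COUNT(*) = ?        -- in range on every target dimension
        ORDER BY dist2             -- squared distance sorts like distance
        LIMIT ?
    """, (half_width, half_width, len(target), limit))
    return [row[0] for row in rows]

# Hypothetical sample data: N = 3 here, but N is only fixed by the input.
add_point(1, (0.0, 0.0, 0.0))
add_point(2, (1.0, 1.0, 1.0))
add_point(3, (0.5, 0.2, 0.1))
add_point(4, (5.0, 0.0, 0.0))
print(nearest((0.0, 0.0, 0.0), half_width=1.0, limit=2))
```

Because the number of dimensions is just the number of rows per parent_id,
nothing in the table structure has to change when N changes at runtime.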

> What I mean is that if you create a "results" table to store up to 25
> dimensions, and most of your data only had 8, you have 17 "blank"
> dimensions being stored for each row of data. This kind of situation would
> be optimized by using the Normalized data method.  However, if you
> frequently store 20 dimensions or more for each data point then the flat
> table is more space efficient because you eliminate the duplication of the
> "parent_id" field for each row.
>
> As I said, it all depends on how flexible you need your results to be;
> that, together with the nature of your data, will determine which method
> is the better fit.
>
> The flat table model has another drawback: there is a limit to how many
> columns you can index on any one table. With the normalized data, you
> shouldn't run into that problem.
>
> Am I on the right track or have I lost my way?

I think you're on the right track, and I appreciate all the input.  My main
question was whether the GIS functionality is limited to 2D, and you have
confirmed that it is.
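
The space trade-off quoted above can be checked with some back-of-envelope
arithmetic. The sketch below assumes MySQL's standard column sizes (INT 4
bytes, TINYINT 1 byte, DOUBLE 8 bytes), assumes that unused columns in the
flat table still occupy their full width, and ignores per-row and index
overhead. The function names are made up for the example.

```python
# Assumed column sizes in bytes: INT, TINYINT, DOUBLE.
INT_BYTES, TINYINT_BYTES, DOUBLE_BYTES = 4, 1, 8

def flat_row_bytes(max_dims):
    """One row per point: an id plus one DOUBLE column per possible dimension."""
    return INT_BYTES + max_dims * DOUBLE_BYTES

def normalized_bytes(used_dims):
    """One row per stored coordinate: parent_id, dimension index, value."""
    return used_dims * (INT_BYTES + TINYINT_BYTES + DOUBLE_BYTES)

# Compare the two cases from the quoted text: 8 and 20 used dimensions
# against a flat table sized for 25.
for used in (8, 20):
    print(used, flat_row_bytes(25), normalized_bytes(used))
```

Under these assumptions a point with 8 dimensions takes 104 bytes
normalized versus 204 bytes flat, while a point with 20 dimensions takes
260 bytes normalized, so the break-even falls at roughly 16 dimensions.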

Thanks,

          Rob


