Re: [sqlite] General R*Tree query

Roman Fleysher Tue, 17 Dec 2013 13:19:51 -0800

Since coordinate system is spherical, how do you tell that RA=23:59 and 
RA=00:01 are next to each other using usual comparisons?


Roman

________________________________________
From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Dan Kennedy [danielk1...@gmail.com]
Sent: Tuesday, December 17, 2013 3:51 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] General R*Tree query

On 12/18/2013 12:49 AM, Brian T. Carcich wrote:
> I'm working on an SQLite solution to get at star catalogs; they are usually
> searched via Right Ascension (RA), Declination (DEC), and magnitude (mag).
>   RA,DEC is a spherical coordinate system to specify a star position on-sky;
> magnitude is related to star brightness.
>
> What I have so far is here:
>
>     https://github.com/drbitboy/Tycho2_SQLite_RTree
>
>
> I started with the Tycho-2 star catalog.  It comprises 2.5 million stars in
> a flat ASCII, fixed-width catalog file (actually two files but ignore that
> for now), and an index file (map) of ~10k small RA-DEC regions, with an
> average of ~250 stars in each region.  The regions do not overlap, and all
> the stars in any one region are in contiguous lines in the catalog file.
>
> The index file does not implement any grouping or sorting by magnitude.
>   Each index region refers to
>
> A) a contiguous region on-sky with defined by a min-max RA pair and a
> min-max DEC pair.
>
> B)  a contiguous range of the lines (stars) in the flat file that are
> within that region.
>
> So the data in the index file are a reasonable starting point for an R*Tree
> in SQLite3.  I put the index file data into the virtual table using the RA
> and DEC limits for each region as the two min-max pairs of columns in the
> table, and the index table, referenced by the primary key of the virtual
> table, contains the starting and ending+1 indices (offsets actually) of the
> stars in the flat catalog file for each region.
>
> So I use the R*Tree module to get a fast lookup into the index table,
> returning index regions that overlap an input RA and DEC min-max pair, then
> step through the catalog lines for each of those regions.
>
> Here's my question:  is there any advantage to skipping the index step and
> putting the star catalog data into the virtual table itself?  One advantage
> is that I could include the magnitude in the rtree table.
>
> The reason I ask is that rtree table uses min-max pairs, but each star is a
> point so the min and max are equal for each star.  Would that break any
> implicit R*Tree rules or negate any efficiencies?

Points are fine. So long as (max>=min) for all dimensions.
R-tree will return SQLITE_CONSTRAINT if you try to insert
a record for which this is not the case.

I guess in theory storing each individual star in the
r-tree might be more efficient. Difficult to say if it
would be significant though. The r-tree and other virtual
table code is not as optimized as the core, so even if
it seems better in theory it might not be in practice.

Adding the magnitude to the r-tree structure as an extra
dimension that is never queried sounds like it might make
the r-tree structure less efficient though. R-tree will
make some effort to place records with similar magnitudes
on the same node, which won't help the query but will
presumably cause some reduction in the localization that
does matter.

Is it working now? How many more stars do you have data for?

Dan.





_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] General R*Tree query

Reply via email to