Re: [sqlite] slowish R*Tree performance

Eric Rubin-Smith Wed, 18 Jun 2014 10:18:59 -0700

Carlos Ferreira wrote: 

> 1 - There a data type named IPV6 Address.  2 - there is a table where 
> this data type must be in.  ( can be a set of fields, one blob, one string 
> ...)  
> 
> You want to: 
> 
> Given a certain IPV6, find in the database the existent IPV6 record with 
> the longest equal prefix to the given IPV6 value.


Not quite.  Perhaps you were confused by my (probably unclear) use of 
the word "prefix".  

The data structure contains a set of IPv6 *network prefixes*.  A prefix 
is the first N bits of an IPv6 address and is denoted as an IP address 
with a suffix of (128-N) bits of zeros, along with the length of the 
prefix: 

feed:beef::/32 

(here N==32).  

An IPv6 address is inside this prefix iff its first 32 bits are equal to 
feed:beef.  

Note that one prefix can be contained within another: 

feed:beef:f00d::/48 is fully contained within feed:beef::/32.  We say 
that feed:beef:f00d::/48 is "more specific" than feed:beef::/32.  

The problem (one that comes up quite often, e.g.  in routers) is this: 
given an IP address, find the longest-length prefix containing the 
address.  I.e.  find the most specific prefix containing the address.  

This is different from the problem you stated.  One key thing to note is 
that the two prefixes feed:beef::/32 and feed:beef::/48 are different, 
even though the bits in the address parts are identical.  

> Again, if the problem is as I understood, the simplest solution is ( 
> again I am discussing it as if it could be implemented or available in 
> SQLite..I do not know..)  
> 
> 1 - encode the IPV6 field as a unique blob 2 - Implement an index to 
> this field so that this particular field can be ordered 3 - Make sure that 
> the ordering compare function is a binary incremental compare counting 
> from the left ( in terms of the data...not sure how you will represent 
> it in the field ) 4 - Each time you want to find the closed and longest 
> prefix, you just need to simulate an insert command.  Try to insert the 
> given value.  Instead of inserting, just return the ordered position where 
> it would be inserted.  Check what is the record actually standing in that 
> ordered position...That would be your result!!  

The problem has many solutions outside of SQL.  The most common data 
structure is a "trie" (pronounced "tree" and short for "reTRIEval 
tree"), though it turns out that in many subsets of this problem space 
other data structures turn out to be more efficient.  

The present motivation, however, is to see if we can leverage all 
the ancillary sexiness of SQLite while still getting reasonably fast 
searches within this problem space.  Turns out we can do pretty darn 
well with SQLite in this regard.  

The key part is coming up with an isomorphism between overlapping 
IPv6 prefixes and the overlapping geometric boxes represented in a 
5-dimensional R*Tree.  Not just any isomorphism, but one with the 
property that for prefixes P1 and P2, P1 contains P2 if and only if 
bbox(P1) fully contains bbox(P2).  (It follows that the volume of 
bbox(P1) is larger than the volume of bbox(P2), so you can sort by the 
volumes of all the covering bboxes to find the most specific prefix.  
Though my SQL had a bug in that regard, so treat it with care if you use 
it.:-) 

Again, the 5 dimensions are only needed because the R*Tree's integers 
are only 32 bits wide.  If they were 128 bits wide, you could just 
represent an IPv6 prefix as an interval on the line segment [0, 
2^128-1], and store those intervals in a 1-dimensional R*Tree (which 
works great for IPv4, btw).

-- 
Eric A. Rubin-Smith

I still maintain the point that designing a monolithic kernel in 
1991 is a fundamental error.  Be thankful you are not my student.  
You would not get a high grade for such a design.
    -- Andrew Tanenbaum, to Linus Torvalds
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] slowish R*Tree performance

Reply via email to