I am exploring a mechanism for finding the longest covering IPv6 prefix for
a given IPv6 address by leveraging SQLite 3.8.5's R*Tree feature.  I'm
getting pretty bad performance and would like to know if I'm doing
something obviously wrong.

A 1-dimensional R*Tree of integers of width 128 bits would be ideal.  But
SQLite R*Trees are only 32 bits wide per dimension.

So I am modeling IPv6 prefixes as a pair of coordinates in 5-dimensional
integer space:

CREATE VIRTUAL TABLE ipIndex USING rtree_i32(
    id,         -- Integer primary key
    minD1, maxD1,
    minD2, maxD2,
    minD3, maxD3,
    minD4, maxD4,
    minD5, maxD5
);

CREATE TABLE routeTarget(
    id INTEGER PRIMARY KEY,
    prefix TEXT NOT NULL,
    target TEXT NOT NULL
);

To do this, I have come up with a mapping from an IPv6 prefix to a pair of
coordinates that has the geometric property that we would like: bounding
box 1 contains bounding box 2 if and only if prefix 1 contains prefix 2.
So if a search for bounding boxes containing a particular address's
coordinate returns rows, then each of those rows corresponds to a covering
prefix -- from there, we must simply find the smallest bounding box to find
the longest-matching prefix.

Functionally, this appears to work like a charm.

Performance, unfortunately, leaves much to be desired.  I have seeded this
database with 100k randomly-generated prefixes, and am only able to push
through 250 searches per second.  I am hoping to increase the database size
to ~50m records and would like to hit 50k searches per second.

This is not an unreasonable request based on my hardware, and SQLite has
easily hit this throughput (at least, this order of magnitude in
throughput) in other applications.  For example, the analogue in IPv4
merely requires a 1-dimensional R*Tree, and with that I was able to hit
10kTPS throughput without breaking much of a sweat.

Here is my search query plan:

sqlite> explain query plan SELECT prefix, target FROM routeTarget WHERE id
= (
   ...>    SELECT id FROM ipIndex
   ...>     WHERE minD1 <= 1220818432 and 1220818432 <= maxD1
   ...>       AND minD2 <= 2120561472 and 2120561472 <= maxD2
   ...>       AND minD3 <= 1685398080 and 1685398080 <= maxD3
   ...>       AND minD4 <= 1685755328 and 1685755328 <= maxD4
   ...>       AND minD5 <= 538331072 and 538331072 <= maxD5
   ...>     ORDER BY ((maxD5-minD5)*(maxD4-minD4)*(maxD3-minD3)*
   ...>               (maxD2-minD2)*(maxD1-minD1)) ASC
   ...>    LIMIT 1);
0|0|0|SEARCH TABLE routeTarget USING INTEGER PRIMARY KEY (rowid=?)
0|0|0|EXECUTE SCALAR SUBQUERY 1
1|0|0|SCAN TABLE ipIndex VIRTUAL TABLE INDEX 2:B0D1B2D3B4D5B6D7B8D9
1|0|0|USE TEMP B-TREE FOR ORDER BY

I created a profiled SQLite build, and here are the top offenders for a run
on 1000 searches:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 40.00      1.58     1.58   300179     0.01     0.01  sqlite3VdbeExec
  6.33      1.83     0.25 36628944     0.00     0.00  sqlite3VdbeMemMove
  6.08      2.07     0.24 18314472     0.00     0.00  rtreeColumn
  4.05      2.23     0.16  1665952     0.00     0.00  rtreeStepToLeaf
  2.41      2.33     0.10 55549722     0.00     0.00  sqlite3VdbeMemRelease
  2.28      2.42     0.09  1965104     0.00     0.00
sqlite3BtreeMovetoUnpacked
  2.03      2.50     0.08 20187085     0.00     0.00
rtreeNodeOfFirstSearchPoint
  1.90      2.57     0.08 19981424     0.00     0.00
sqlite3VtabImportErrmsg
  1.77      2.64     0.07  1663952     0.00     0.00  sqlite3BtreeDelete
  1.52      2.70     0.06  5297026     0.00     0.00  sqlite3VdbeSerialGet
  1.52      2.76     0.06  1665952     0.00     0.00  btreeMoveto
  1.27      2.81     0.05 29969136     0.00     0.00  numericType
  1.27      2.86     0.05 22092688     0.00     0.00  sqlite3DbFree
  1.27      2.91     0.05 16649521     0.00     0.00  sqlite3_result_int
  1.27      2.96     0.05  5295009     0.00     0.00  moveToRoot
  1.27      3.01     0.05  1844945     0.00     0.00  nodeAcquire
  1.27      3.06     0.05  1665952     0.00     0.00  insertCell
  1.27      3.11     0.05  1663952     0.00     0.00  dropCell
  1.01      3.15     0.04 21214814     0.00     0.00  sqlite3_free
  0.89      3.19     0.04  3335904     0.00     0.00  pager_write
  0.76      3.22     0.03  9991712     0.00     0.00  sqlite3VdbeSerialType
  0.76      3.25     0.03  3335904     0.00     0.00  sqlite3PagerWrite
  0.76      3.28     0.03   903468     0.00     0.00  pcache1Fetch


I believe I have avoided the common pitfalls: everything is within a single
transaction, my cache_size pragma is large, etc.

I note from SQLite trace callbacks that a particular explicit search
results in a great many implicit SQL calls into the underlying R*Tree
tables.  Three hundred or so per search.  I assume this is expected?  It
seems pretty large.  They all look like this:

-- SELECT data FROM 'main'.'ipIndex_node' WHERE nodeno = :1
-- SELECT data FROM 'main'.'ipIndex_node' WHERE nodeno = :1
{etc, ~300 times}

I don't believe an R*Tree should be this deep; it makes me worry that
something about the data I am providing is causing the tree to degenerate
to a linear structure.  Or something.

The code reads search keys from stdin, runs the query on them, writes the
results, and repeats:

    if (sqlite3_exec(pDb, "BEGIN;", NULL, NULL, NULL)!=SQLITE_OK)
    {
        fprintf(stderr, "could not initiate transaction.\n");
        return 1;
    }
    const char* zSearch =
        "SELECT prefix, target FROM routeTarget WHERE id = (\n"
        "   SELECT id FROM ipIndex\n"
        "    WHERE minD1 <= $1 and $1 <= maxD1\n"
        "      AND minD2 <= $2 and $2 <= maxD2\n"
        "      AND minD3 <= $3 and $3 <= maxD3\n"
        "      AND minD4 <= $4 and $4 <= maxD4\n"
        "      AND minD5 <= $5 and $5 <= maxD5\n"
        "    ORDER BY ((maxD5-minD5)*(maxD4-minD4)*(maxD3-minD3)*\n"
        "              (maxD2-minD2)*(maxD1-minD1)) ASC \n"
        "   LIMIT 1"
        ");";
    rc = sqlite3_prepare_v2(pDb, zSearch, strlen(zSearch),
            &pSearch, NULL);
    {snip}
    uint32_t coord[5];
    while (fgets(zLine, sizeof(zLine), stdin)!=NULL)
    {
        struct in6_addr addr;
        size_t len = strlen(zLine);
        if (zLine[len-1]=='\n') {zLine[len-1] = 0;}
        if (inet_pton(AF_INET6, zLine, &addr)!=1)
        {
            fprintf(stderr, "Ignoring bad ip address \"%s\"\n", zLine);
            continue;
        }
        addr2coord(addr, coord);
        for (int i=0;i<5;i++)
        {
            rc = sqlite3_bind_int(pSearch, i+1, coord[i]);
            assert(rc==SQLITE_OK);
        }
        rc = sqlite3_step(pSearch);
        while (rc!=SQLITE_DONE)
        {
            switch(rc) {
            case SQLITE_ROW:
                const unsigned char* zPrefix, *zTarget;
                zPrefix = sqlite3_column_text(pSearch, 0);
                zTarget = sqlite3_column_text(pSearch, 1);
                printf("%s --> %s %s\n", zLine, zPrefix, zTarget);
                rc = sqlite3_step(pSearch);
                break;
            case SQLITE_DONE:
                // no rows.
                break;
            case SQLITE_BUSY:
                fprintf(stderr, "couldn't get lock\n");
                return 1;
                break;
            default:
                assert(0);
            }
        }
        sqlite3_reset(pSearch);
        struct timeval now;
        gettimeofday(&now, NULL);
        if (((now.tv_sec*1000+now.tv_usec/1000)-
            (lastTxnTime.tv_sec*1000 + lastTxnTime.tv_usec/1000))>1000)
        {
            if (sqlite3_exec(pDb, "END;", NULL, NULL, NULL)!=SQLITE_OK ||
                sqlite3_exec(pDb, "BEGIN;", NULL, NULL, NULL)!=SQLITE_OK)
            {
                fprintf(stderr, "could not initiate transaction.\n");
                return 1;
            }
            lastTxnTime = now;
        }
    }
{snip}

SQLite build params:

gcc -DPACKAGE_NAME=\"sqlite\" -DPACKAGE_TARNAME=\"sqlite\"
-DPACKAGE_VERSION=\"3.8.5\" -DPACKAGE_STRING=\"sqlite\ 3.8.5\"
-DPACKAGE_BUGREPORT=\"http://www.sqlite.org\"; -DPACKAGE_URL=\"\"
-DPACKAGE=\"sqlite\" -DVERSION=\"3.8.5\" -DSTDC_HEADERS=1
-DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1
-DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1
-DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\"
-DHAVE_FDATASYNC=1 -DHAVE_USLEEP=1 -DHAVE_LOCALTIME_R=1 -DHAVE_GMTIME_R=1
-DHAVE_DECL_STRERROR_R=1 -DHAVE_STRERROR_R=1 -DHAVE_READLINE=1
-DHAVE_POSIX_FALLOCATE=1 -I.    -D_REENTRANT=1 -DSQLITE_THREADSAFE=1
-DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_RTREE -g -O2 -c sqlite3.c

OS:

Linux <anonymized> 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT
2012 x86_64 x86_64 x86_64 GNU/Linux
[root@<anonymized> sqlite-autoconf-3080500]# cat /etc/issue
Red Hat Enterprise Linux Server release 6.3 (Santiago)
Kernel \r on an \m

[root@<anonymized> sqlite-autoconf-3080500]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz
stepping        : 2
cpu MHz         : 2400.016
cache size      : 12288 KB
physical id     : 1
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 32
initial apicid  : 32
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology
nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2
ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt aes lahf_lm ida arat dts
tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4800.03
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
{times 16 -- 4 cores * 2 hyperthreads * 2 chips}


Schema:

sqlite> select * from sqlite_master;
table|ipIndex|ipIndex|0|CREATE VIRTUAL TABLE ipIndex USING rtree_i32(
    id,         -- Integer primary key
    minD1, maxD1,
    minD2, maxD2,
    minD3, maxD3,
    minD4, maxD4,
    minD5, maxD5
)
table|ipIndex_node|ipIndex_node|2|CREATE TABLE "ipIndex_node"(nodeno
INTEGER PRIMARY KEY, data BLOB)
table|ipIndex_rowid|ipIndex_rowid|3|CREATE TABLE "ipIndex_rowid"(rowid
INTEGER PRIMARY KEY, nodeno INTEGER)
table|ipIndex_parent|ipIndex_parent|4|CREATE TABLE "ipIndex_parent"(nodeno
INTEGER PRIMARY KEY, parentnode INTEGER)
table|routeTarget|routeTarget|5|CREATE TABLE routeTarget(
    id INTEGER PRIMARY KEY,
    prefix TEXT NOT NULL,
    target TEXT NOT NULL
)
index|routeTargetIdxPrefix|routeTarget|6|CREATE INDEX routeTargetIdxPrefix
          ON routeTarget(prefix)


Is there anything that I am doing obviously wrong?

Thanks!

Eric
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to