Re: [sqlite] slowish R*Tree performance

Carlos Ferreira Mon, 16 Jun 2014 06:11:30 -0700

Hi,

I am just a user of binary Sqlite. So, I do not know how much of custom
Sqlite tricks can be made with a copy of sqlite source code.


However, and if I understood well the problem is like this:

1 - There a data type named IPV6 Address. 
2 - there is a table where this data type must be in. ( can be a set of
fields, one blob, one string ...)

You want to:

Given a certain IPV6, find in the database the existent IPV6 record with the
longest equal prefix to the given IPV6 value.


Again, if the problem is as I understood, the simplest solution is ( again I
am discussing it as if it could be implemented or available in SQLite..I do
not know..)

1  - encode the IPV6 field as a unique blob
2 - Implement an index to this field so that this particular field can be
ordered
3 - Make sure that the ordering compare function is a binary incremental
compare counting from the left ( in terms of the data...not sure how you
will represent it in the field )
4 - Each time you want to find the closed and longest prefix, you just need
to simulate an insert command.
Try to insert the given value. Instead of inserting, just return the ordered
position where it would be inserted. Check what is the record actually
standing in that ordered position...That would be your result!!


This can also be done outside Sqlite...but not sure if my understanding of
the problem is correct.

Regards,

Carlos

-----Original Message-----
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Eric Rubin-Smith
Sent: domingo, 15 de Junho de 2014 05:25
To: General Discussion of SQLite Database
Subject: [sqlite] slowish R*Tree performance

I am exploring a mechanism for finding the longest covering IPv6 prefix for
a given IPv6 address by leveraging SQLite 3.8.5's R*Tree feature.  I'm
getting pretty bad performance and would like to know if I'm doing something
obviously wrong.

A 1-dimensional R*Tree of integers of width 128 bits would be ideal.  But
SQLite R*Trees are only 32 bits wide per dimension.

So I am modeling IPv6 prefixes as a pair of coordinates in 5-dimensional
integer space:

CREATE VIRTUAL TABLE ipIndex USING rtree_i32(
    id,         -- Integer primary key
    minD1, maxD1,
    minD2, maxD2,
    minD3, maxD3,
    minD4, maxD4,
    minD5, maxD5
);

CREATE TABLE routeTarget(
    id INTEGER PRIMARY KEY,
    prefix TEXT NOT NULL,
    target TEXT NOT NULL
);

To do this, I have come up with a mapping from an IPv6 prefix to a pair of
coordinates that has the geometric property that we would like: bounding box
1 contains bounding box 2 if and only if prefix 1 contains prefix 2.
So if a search for bounding boxes containing a particular address's
coordinate returns rows, then each of those rows corresponds to a covering
prefix -- from there, we must simply find the smallest bounding box to find
the longest-matching prefix.

Functionally, this appears to work like a charm.

Performance, unfortunately, leaves much to be desired.  I have seeded this
database with 100k randomly-generated prefixes, and am only able to push
through 250 searches per second.  I am hoping to increase the database size
to ~50m records and would like to hit 50k searches per second.

This is not an unreasonable request based on my hardware, and SQLite has
easily hit this throughput (at least, this order of magnitude in
throughput) in other applications.  For example, the analogue in IPv4 merely
requires a 1-dimensional R*Tree, and with that I was able to hit 10kTPS
throughput without breaking much of a sweat.

Here is my search query plan:

sqlite> explain query plan SELECT prefix, target FROM routeTarget WHERE 
sqlite> id
= (
   ...>    SELECT id FROM ipIndex
   ...>     WHERE minD1 <= 1220818432 and 1220818432 <= maxD1
   ...>       AND minD2 <= 2120561472 and 2120561472 <= maxD2
   ...>       AND minD3 <= 1685398080 and 1685398080 <= maxD3
   ...>       AND minD4 <= 1685755328 and 1685755328 <= maxD4
   ...>       AND minD5 <= 538331072 and 538331072 <= maxD5
   ...>     ORDER BY ((maxD5-minD5)*(maxD4-minD4)*(maxD3-minD3)*
   ...>               (maxD2-minD2)*(maxD1-minD1)) ASC
   ...>    LIMIT 1);
0|0|0|SEARCH TABLE routeTarget USING INTEGER PRIMARY KEY (rowid=?) 
0|0|0|EXECUTE SCALAR SUBQUERY 1
1|0|0|SCAN TABLE ipIndex VIRTUAL TABLE INDEX 2:B0D1B2D3B4D5B6D7B8D9 USE 
1|0|0|TEMP B-TREE FOR ORDER BY

I created a profiled SQLite build, and here are the top offenders for a run
on 1000 searches:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 40.00      1.58     1.58   300179     0.01     0.01  sqlite3VdbeExec
  6.33      1.83     0.25 36628944     0.00     0.00  sqlite3VdbeMemMove
  6.08      2.07     0.24 18314472     0.00     0.00  rtreeColumn
  4.05      2.23     0.16  1665952     0.00     0.00  rtreeStepToLeaf
  2.41      2.33     0.10 55549722     0.00     0.00  sqlite3VdbeMemRelease
  2.28      2.42     0.09  1965104     0.00     0.00
sqlite3BtreeMovetoUnpacked
  2.03      2.50     0.08 20187085     0.00     0.00
rtreeNodeOfFirstSearchPoint
  1.90      2.57     0.08 19981424     0.00     0.00
sqlite3VtabImportErrmsg
  1.77      2.64     0.07  1663952     0.00     0.00  sqlite3BtreeDelete
  1.52      2.70     0.06  5297026     0.00     0.00  sqlite3VdbeSerialGet
  1.52      2.76     0.06  1665952     0.00     0.00  btreeMoveto
  1.27      2.81     0.05 29969136     0.00     0.00  numericType
  1.27      2.86     0.05 22092688     0.00     0.00  sqlite3DbFree
  1.27      2.91     0.05 16649521     0.00     0.00  sqlite3_result_int
  1.27      2.96     0.05  5295009     0.00     0.00  moveToRoot
  1.27      3.01     0.05  1844945     0.00     0.00  nodeAcquire
  1.27      3.06     0.05  1665952     0.00     0.00  insertCell
  1.27      3.11     0.05  1663952     0.00     0.00  dropCell
  1.01      3.15     0.04 21214814     0.00     0.00  sqlite3_free
  0.89      3.19     0.04  3335904     0.00     0.00  pager_write
  0.76      3.22     0.03  9991712     0.00     0.00  sqlite3VdbeSerialType
  0.76      3.25     0.03  3335904     0.00     0.00  sqlite3PagerWrite
  0.76      3.28     0.03   903468     0.00     0.00  pcache1Fetch


I believe I have avoided the common pitfalls: everything is within a single
transaction, my cache_size pragma is large, etc.

I note from SQLite trace callbacks that a particular explicit search results
in a great many implicit SQL calls into the underlying R*Tree tables.  Three
hundred or so per search.  I assume this is expected?  It seems pretty
large.  They all look like this:

-- SELECT data FROM 'main'.'ipIndex_node' WHERE nodeno = :1
-- SELECT data FROM 'main'.'ipIndex_node' WHERE nodeno = :1 {etc, ~300
times}

I don't believe an R*Tree should be this deep; it makes me worry that
something about the data I am providing is causing the tree to degenerate to
a linear structure.  Or something.

The code reads search keys from stdin, runs the query on them, writes the
results, and repeats:

    if (sqlite3_exec(pDb, "BEGIN;", NULL, NULL, NULL)!=SQLITE_OK)
    {
        fprintf(stderr, "could not initiate transaction.\n");
        return 1;
    }
    const char* zSearch =
        "SELECT prefix, target FROM routeTarget WHERE id = (\n"
        "   SELECT id FROM ipIndex\n"
        "    WHERE minD1 <= $1 and $1 <= maxD1\n"
        "      AND minD2 <= $2 and $2 <= maxD2\n"
        "      AND minD3 <= $3 and $3 <= maxD3\n"
        "      AND minD4 <= $4 and $4 <= maxD4\n"
        "      AND minD5 <= $5 and $5 <= maxD5\n"
        "    ORDER BY ((maxD5-minD5)*(maxD4-minD4)*(maxD3-minD3)*\n"
        "              (maxD2-minD2)*(maxD1-minD1)) ASC \n"
        "   LIMIT 1"
        ");";
    rc = sqlite3_prepare_v2(pDb, zSearch, strlen(zSearch),
            &pSearch, NULL);
    {snip}
    uint32_t coord[5];
    while (fgets(zLine, sizeof(zLine), stdin)!=NULL)
    {
        struct in6_addr addr;
        size_t len = strlen(zLine);
        if (zLine[len-1]=='\n') {zLine[len-1] = 0;}
        if (inet_pton(AF_INET6, zLine, &addr)!=1)
        {
            fprintf(stderr, "Ignoring bad ip address \"%s\"\n", zLine);
            continue;
        }
        addr2coord(addr, coord);
        for (int i=0;i<5;i++)
        {
            rc = sqlite3_bind_int(pSearch, i+1, coord[i]);
            assert(rc==SQLITE_OK);
        }
        rc = sqlite3_step(pSearch);
        while (rc!=SQLITE_DONE)
        {
            switch(rc) {
            case SQLITE_ROW:
                const unsigned char* zPrefix, *zTarget;
                zPrefix = sqlite3_column_text(pSearch, 0);
                zTarget = sqlite3_column_text(pSearch, 1);
                printf("%s --> %s %s\n", zLine, zPrefix, zTarget);
                rc = sqlite3_step(pSearch);
                break;
            case SQLITE_DONE:
                // no rows.
                break;
            case SQLITE_BUSY:
                fprintf(stderr, "couldn't get lock\n");
                return 1;
                break;
            default:
                assert(0);
            }
        }
        sqlite3_reset(pSearch);
        struct timeval now;
        gettimeofday(&now, NULL);
        if (((now.tv_sec*1000+now.tv_usec/1000)-
            (lastTxnTime.tv_sec*1000 + lastTxnTime.tv_usec/1000))>1000)
        {
            if (sqlite3_exec(pDb, "END;", NULL, NULL, NULL)!=SQLITE_OK ||
                sqlite3_exec(pDb, "BEGIN;", NULL, NULL, NULL)!=SQLITE_OK)
            {
                fprintf(stderr, "could not initiate transaction.\n");
                return 1;
            }
            lastTxnTime = now;
        }
    }
{snip}

SQLite build params:

gcc -DPACKAGE_NAME=\"sqlite\" -DPACKAGE_TARNAME=\"sqlite\"
-DPACKAGE_VERSION=\"3.8.5\" -DPACKAGE_STRING=\"sqlite\ 3.8.5\"
-DPACKAGE_BUGREPORT=\"http://www.sqlite.org\"; -DPACKAGE_URL=\"\"
-DPACKAGE=\"sqlite\" -DVERSION=\"3.8.5\" -DSTDC_HEADERS=1
-DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1
-DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1
-DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\"
-DHAVE_FDATASYNC=1 -DHAVE_USLEEP=1 -DHAVE_LOCALTIME_R=1 -DHAVE_GMTIME_R=1
-DHAVE_DECL_STRERROR_R=1 -DHAVE_STRERROR_R=1 -DHAVE_READLINE=1
-DHAVE_POSIX_FALLOCATE=1 -I.    -D_REENTRANT=1 -DSQLITE_THREADSAFE=1
-DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_RTREE -g -O2 -c sqlite3.c

OS:

Linux <anonymized> 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT
2012 x86_64 x86_64 x86_64 GNU/Linux
[root@<anonymized> sqlite-autoconf-3080500]# cat /etc/issue Red Hat
Enterprise Linux Server release 6.3 (Santiago) Kernel \r on an \m

[root@<anonymized> sqlite-autoconf-3080500]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz
stepping        : 2
cpu MHz         : 2400.016
cache size      : 12288 KB
physical id     : 1
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 32
initial apicid  : 32
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc
aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2
ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt aes lahf_lm ida arat dts
tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4800.03
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
{times 16 -- 4 cores * 2 hyperthreads * 2 chips}


Schema:

sqlite> select * from sqlite_master;
table|ipIndex|ipIndex|0|CREATE VIRTUAL TABLE ipIndex USING rtree_i32(
    id,         -- Integer primary key
    minD1, maxD1,
    minD2, maxD2,
    minD3, maxD3,
    minD4, maxD4,
    minD5, maxD5
)
table|ipIndex_node|ipIndex_node|2|CREATE TABLE "ipIndex_node"(nodeno
INTEGER PRIMARY KEY, data BLOB)
table|ipIndex_rowid|ipIndex_rowid|3|CREATE TABLE "ipIndex_rowid"(rowid
INTEGER PRIMARY KEY, nodeno INTEGER)
table|ipIndex_parent|ipIndex_parent|4|CREATE TABLE 
table|"ipIndex_parent"(nodeno
INTEGER PRIMARY KEY, parentnode INTEGER)
table|routeTarget|routeTarget|5|CREATE TABLE routeTarget(
    id INTEGER PRIMARY KEY,
    prefix TEXT NOT NULL,
    target TEXT NOT NULL
)
index|routeTargetIdxPrefix|routeTarget|6|CREATE INDEX 
index|routeTargetIdxPrefix|routeTarget|6|routeTargetIdxPrefix
          ON routeTarget(prefix)


Is there anything that I am doing obviously wrong?

Thanks!

Eric
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] slowish R*Tree performance

Reply via email to