I am exploring a mechanism for finding the longest covering IPv6 prefix for a given IPv6 address by leveraging SQLite 3.8.5's R*Tree feature. I'm getting pretty bad performance and would like to know if I'm doing something obviously wrong.
A 1-dimensional R*Tree of integers of width 128 bits would be ideal. But SQLite R*Trees are only 32 bits wide per dimension. So I am modeling IPv6 prefixes as a pair of coordinates in 5-dimensional integer space: CREATE VIRTUAL TABLE ipIndex USING rtree_i32( id, -- Integer primary key minD1, maxD1, minD2, maxD2, minD3, maxD3, minD4, maxD4, minD5, maxD5 ); CREATE TABLE routeTarget( id INTEGER PRIMARY KEY, prefix TEXT NOT NULL, target TEXT NOT NULL ); To do this, I have come up with a mapping from an IPv6 prefix to a pair of coordinates that has the geometric property that we would like: bounding box 1 contains bounding box 2 if and only if prefix 1 contains prefix 2. So if a search for bounding boxes containing a particular address's coordinate returns rows, then each of those rows corresponds to a covering prefix -- from there, we must simply find the smallest bounding box to find the longest-matching prefix. Functionally, this appears to work like a charm. Performance, unfortunately, leaves much to be desired. I have seeded this database with 100k randomly-generated prefixes, and am only able to push through 250 searches per second. I am hoping to increase the database size to ~50m records and would like to hit 50k searches per second. This is not an unreasonable request based on my hardware, and SQLite has easily hit this throughput (at least, this order of magnitude in throughput) in other applications. For example, the analogue in IPv4 merely requires a 1-dimensional R*Tree, and with that I was able to hit 10kTPS throughput without breaking much of a sweat. Here is my search query plan: sqlite> explain query plan SELECT prefix, target FROM routeTarget WHERE id = ( ...> SELECT id FROM ipIndex ...> WHERE minD1 <= 1220818432 and 1220818432 <= maxD1 ...> AND minD2 <= 2120561472 and 2120561472 <= maxD2 ...> AND minD3 <= 1685398080 and 1685398080 <= maxD3 ...> AND minD4 <= 1685755328 and 1685755328 <= maxD4 ...> AND minD5 <= 538331072 and 538331072 <= maxD5 ...> ORDER BY ((maxD5-minD5)*(maxD4-minD4)*(maxD3-minD3)* ...> (maxD2-minD2)*(maxD1-minD1)) ASC ...> LIMIT 1); 0|0|0|SEARCH TABLE routeTarget USING INTEGER PRIMARY KEY (rowid=?) 0|0|0|EXECUTE SCALAR SUBQUERY 1 1|0|0|SCAN TABLE ipIndex VIRTUAL TABLE INDEX 2:B0D1B2D3B4D5B6D7B8D9 1|0|0|USE TEMP B-TREE FOR ORDER BY I created a profiled SQLite build, and here are the top offenders for a run on 1000 searches: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 40.00 1.58 1.58 300179 0.01 0.01 sqlite3VdbeExec 6.33 1.83 0.25 36628944 0.00 0.00 sqlite3VdbeMemMove 6.08 2.07 0.24 18314472 0.00 0.00 rtreeColumn 4.05 2.23 0.16 1665952 0.00 0.00 rtreeStepToLeaf 2.41 2.33 0.10 55549722 0.00 0.00 sqlite3VdbeMemRelease 2.28 2.42 0.09 1965104 0.00 0.00 sqlite3BtreeMovetoUnpacked 2.03 2.50 0.08 20187085 0.00 0.00 rtreeNodeOfFirstSearchPoint 1.90 2.57 0.08 19981424 0.00 0.00 sqlite3VtabImportErrmsg 1.77 2.64 0.07 1663952 0.00 0.00 sqlite3BtreeDelete 1.52 2.70 0.06 5297026 0.00 0.00 sqlite3VdbeSerialGet 1.52 2.76 0.06 1665952 0.00 0.00 btreeMoveto 1.27 2.81 0.05 29969136 0.00 0.00 numericType 1.27 2.86 0.05 22092688 0.00 0.00 sqlite3DbFree 1.27 2.91 0.05 16649521 0.00 0.00 sqlite3_result_int 1.27 2.96 0.05 5295009 0.00 0.00 moveToRoot 1.27 3.01 0.05 1844945 0.00 0.00 nodeAcquire 1.27 3.06 0.05 1665952 0.00 0.00 insertCell 1.27 3.11 0.05 1663952 0.00 0.00 dropCell 1.01 3.15 0.04 21214814 0.00 0.00 sqlite3_free 0.89 3.19 0.04 3335904 0.00 0.00 pager_write 0.76 3.22 0.03 9991712 0.00 0.00 sqlite3VdbeSerialType 0.76 3.25 0.03 3335904 0.00 0.00 sqlite3PagerWrite 0.76 3.28 0.03 903468 0.00 0.00 pcache1Fetch I believe I have avoided the common pitfalls: everything is within a single transaction, my cache_size pragma is large, etc. I note from SQLite trace callbacks that a particular explicit search results in a great many implicit SQL calls into the underlying R*Tree tables. Three hundred or so per search. I assume this is expected? It seems pretty large. They all look like this: -- SELECT data FROM 'main'.'ipIndex_node' WHERE nodeno = :1 -- SELECT data FROM 'main'.'ipIndex_node' WHERE nodeno = :1 {etc, ~300 times} I don't believe an R*Tree should be this deep; it makes me worry that something about the data I am providing is causing the tree to degenerate to a linear structure. Or something. The code reads search keys from stdin, runs the query on them, writes the results, and repeats: if (sqlite3_exec(pDb, "BEGIN;", NULL, NULL, NULL)!=SQLITE_OK) { fprintf(stderr, "could not initiate transaction.\n"); return 1; } const char* zSearch = "SELECT prefix, target FROM routeTarget WHERE id = (\n" " SELECT id FROM ipIndex\n" " WHERE minD1 <= $1 and $1 <= maxD1\n" " AND minD2 <= $2 and $2 <= maxD2\n" " AND minD3 <= $3 and $3 <= maxD3\n" " AND minD4 <= $4 and $4 <= maxD4\n" " AND minD5 <= $5 and $5 <= maxD5\n" " ORDER BY ((maxD5-minD5)*(maxD4-minD4)*(maxD3-minD3)*\n" " (maxD2-minD2)*(maxD1-minD1)) ASC \n" " LIMIT 1" ");"; rc = sqlite3_prepare_v2(pDb, zSearch, strlen(zSearch), &pSearch, NULL); {snip} uint32_t coord[5]; while (fgets(zLine, sizeof(zLine), stdin)!=NULL) { struct in6_addr addr; size_t len = strlen(zLine); if (zLine[len-1]=='\n') {zLine[len-1] = 0;} if (inet_pton(AF_INET6, zLine, &addr)!=1) { fprintf(stderr, "Ignoring bad ip address \"%s\"\n", zLine); continue; } addr2coord(addr, coord); for (int i=0;i<5;i++) { rc = sqlite3_bind_int(pSearch, i+1, coord[i]); assert(rc==SQLITE_OK); } rc = sqlite3_step(pSearch); while (rc!=SQLITE_DONE) { switch(rc) { case SQLITE_ROW: const unsigned char* zPrefix, *zTarget; zPrefix = sqlite3_column_text(pSearch, 0); zTarget = sqlite3_column_text(pSearch, 1); printf("%s --> %s %s\n", zLine, zPrefix, zTarget); rc = sqlite3_step(pSearch); break; case SQLITE_DONE: // no rows. break; case SQLITE_BUSY: fprintf(stderr, "couldn't get lock\n"); return 1; break; default: assert(0); } } sqlite3_reset(pSearch); struct timeval now; gettimeofday(&now, NULL); if (((now.tv_sec*1000+now.tv_usec/1000)- (lastTxnTime.tv_sec*1000 + lastTxnTime.tv_usec/1000))>1000) { if (sqlite3_exec(pDb, "END;", NULL, NULL, NULL)!=SQLITE_OK || sqlite3_exec(pDb, "BEGIN;", NULL, NULL, NULL)!=SQLITE_OK) { fprintf(stderr, "could not initiate transaction.\n"); return 1; } lastTxnTime = now; } } {snip} SQLite build params: gcc -DPACKAGE_NAME=\"sqlite\" -DPACKAGE_TARNAME=\"sqlite\" -DPACKAGE_VERSION=\"3.8.5\" -DPACKAGE_STRING=\"sqlite\ 3.8.5\" -DPACKAGE_BUGREPORT=\"http://www.sqlite.org\" -DPACKAGE_URL=\"\" -DPACKAGE=\"sqlite\" -DVERSION=\"3.8.5\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_FDATASYNC=1 -DHAVE_USLEEP=1 -DHAVE_LOCALTIME_R=1 -DHAVE_GMTIME_R=1 -DHAVE_DECL_STRERROR_R=1 -DHAVE_STRERROR_R=1 -DHAVE_READLINE=1 -DHAVE_POSIX_FALLOCATE=1 -I. -D_REENTRANT=1 -DSQLITE_THREADSAFE=1 -DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_RTREE -g -O2 -c sqlite3.c OS: Linux <anonymized> 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux [root@<anonymized> sqlite-autoconf-3080500]# cat /etc/issue Red Hat Enterprise Linux Server release 6.3 (Santiago) Kernel \r on an \m [root@<anonymized> sqlite-autoconf-3080500]# cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 44 model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz stepping : 2 cpu MHz : 2400.016 cache size : 12288 KB physical id : 1 siblings : 8 core id : 0 cpu cores : 4 apicid : 32 initial apicid : 32 fpu : yes fpu_exception : yes cpuid level : 11 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt aes lahf_lm ida arat dts tpr_shadow vnmi flexpriority ept vpid bogomips : 4800.03 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: {times 16 -- 4 cores * 2 hyperthreads * 2 chips} Schema: sqlite> select * from sqlite_master; table|ipIndex|ipIndex|0|CREATE VIRTUAL TABLE ipIndex USING rtree_i32( id, -- Integer primary key minD1, maxD1, minD2, maxD2, minD3, maxD3, minD4, maxD4, minD5, maxD5 ) table|ipIndex_node|ipIndex_node|2|CREATE TABLE "ipIndex_node"(nodeno INTEGER PRIMARY KEY, data BLOB) table|ipIndex_rowid|ipIndex_rowid|3|CREATE TABLE "ipIndex_rowid"(rowid INTEGER PRIMARY KEY, nodeno INTEGER) table|ipIndex_parent|ipIndex_parent|4|CREATE TABLE "ipIndex_parent"(nodeno INTEGER PRIMARY KEY, parentnode INTEGER) table|routeTarget|routeTarget|5|CREATE TABLE routeTarget( id INTEGER PRIMARY KEY, prefix TEXT NOT NULL, target TEXT NOT NULL ) index|routeTargetIdxPrefix|routeTarget|6|CREATE INDEX routeTargetIdxPrefix ON routeTarget(prefix) Is there anything that I am doing obviously wrong? Thanks! Eric _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users