First, thanks, your suggestion worked. To my surprise, it was enough to add "limit 1" to the original query.
So: select * from blocks,locations where locations.locid = blocks.locid AND ? >= blocks.startIpNum AND ? <= blocks.endIpNum limit 1 takes about 1.398-005 seconds and select * from blocks,locations where locations.locid = blocks.locid AND ? >= blocks.startIpNum AND ? <= blocks.endIpNum takes about 3 seconds. Igor Tandetnik wrote: > > Dani Valevski <[EMAIL PROTECTED]> wrote: >>> I think I have a performance problem for a simple select with range. >>> >>> My Tables: >>> CREATE TABLE locations( >>> locid INTEGER PRIMARY KEY, >>> country TEXT, >>> region TEXT, >>> city TEXT, >>> postalCode TEXT, >>> latitude REAL, >>> longitude REAL, >>> dmaCode INTEGER, >>> areaCode INTEGER) >>> >>> CREATE TABLE blocks( >>> startIpNum INTEGER, >>> endIpNum INTEGER, >>> locId INTEGER) >>> >>> My Data: >>> http://www.maxmind.com/app/geolitecity >>> Blocks table has 2,776,436 rows >>> Locations table has 159,488 rows >>> >>> After inserting the data I run analyze. >>> >>> My Query: >>> select * from blocks,locations where locations.locid = blocks.locid >>> AND ? >= blocks.startIpNum AND ? <= blocks.endIpNum >>> (replace ? with a number) >>> >>> Performance issues: >>> I use python's sqlite3 module to run the query. >>> With this configuration it takes about 0.6 seconds to complete the >>> query. I >>> think this is too slow. I could write a binary tree myself and have >>> searches >>> like this take, O(log(num_rows)) which is >>> 7*something_which_shouldnt_take_too_much. Am I wrong? > > And what would you use as a key for this binary tree? I bet you would > utilize additional information that the DB engine doesn't have - that > your blocks don't overlap (they don't, right?) Try coming up with a > search strategy without making this assumption. > > Try this: create an index on startIpNum, and run a query like this: > > select * from blocks, locations > where blocks.startIpNum <= ? and blocks.locid = locations.locid > order by blocks.startIpNum desc limit 1; > > This gives you the record with the largest value of startIpNum that is > still smaller than the threshold, and should be very fast. It can > produce a false positive - make the additional check for (? <= > startIpEnd) in your application code. Don't put this check into the > query though, or you will force it back into O(N) behavior in case your > target value doesn't fall within any block after all. > > Igor Tandetnik > > > ----------------------------------------------------------------------------- > To unsubscribe, send email to [EMAIL PROTECTED] > ----------------------------------------------------------------------------- > > > -- View this message in context: http://www.nabble.com/Performance-problem-for-a-simple-select-with-range-tf4711654.html#a13509241 Sent from the SQLite mailing list archive at Nabble.com. ----------------------------------------------------------------------------- To unsubscribe, send email to [EMAIL PROTECTED] -----------------------------------------------------------------------------