[sqlite] Handling the whole select query from an index

Richard Hipp Thu, 17 Sep 2015 08:22:34 -0400

On 9/17/15, John McKown <john.archie.mckown at gmail.com> wrote:
> Well, this may be a heretical answer.


Yes, it is.

There are a number of reasons to still use SQLite even if you are only
doing key/value lookups:

(1) SQLite is the most widely deployed database engine in the world,
so it is more likely to already be on the system you are using than
any other database.

(2) Those PRIMARY KEY lookups that the OP is doing are surprisingly
competitive in performance.  They may even be faster than other
hash-based systems.  Have you measured?  Do you *know* that a hashing
system is faster, or are you just assuming?

(3) Does that alternative key/value database engine offer
transactions?  What happens if you lose power in the middle of a
write?

(4) The next poor soul who has to maintain the program will be better
able to understand the the code if she can see the CREATE TABLE
schema, containing the obvious hints to the meanings of the various
columns inherent in their names.  Data lives longer than code.  It is
good to keep it in an easy-to-reuse and portable format.

(5) If you already have a relational database at hand, the application
will be much easier to enhance when you move to version 2.

(6) Is that hash-based key/value database you want to use
cross-platform?  Can you move the database files to a big-endian
system and have it still work?  You could if it were SQLite.

I now open the floor for additional suggestions from readers for why
it is still better to use SQLite even if you currently only think you
need a key/value store....

> Given what you have said, I wouldn't
> even try to use SQLite. Well, at least not directly. I would use a "hash
> table". SQLite's indexing, if I understand correctly, is a B-Tree. And that
> is the _only_ option. What would likely work better is a "hash index". You
> might get some more concrete answers if you were to post the OS and
> implementation language. Such as, Windows 8.1 using C#. Or Linux 64-bit
> using C++. Also, do you an SQL data base? Perhaps something else would be
> better, if you're not really doing relational queries. But I don't know
> what, given that I don't know your system environment.
>
> On Thu, Sep 17, 2015 at 6:58 AM, Rob Willett <rob.sqlite at robertwillett.com>
> wrote:
>
>> Hi,
>>
>> There was a topic on here a few weeks ago which I cannot remember and
>> cannot find in the mess that is my e-mail system and after spending the
>> last hours search the SQLite archives I still can?t find it so will ask
>> here if anybody can remember or help. I can?t even format the question for
>> Google to search on :(
>>
>> The specific question I have is about trying to provide the fastest
>> response possible to a select query.  I recall that the e-mail talked
>> about
>> using an index to satisfy the query and therefore never having to go out
>> to
>> get the rest of the data from the table, so it was a lot quicker. Is there
>> anything that I need to do specially to make this happen. e.g. if I put
>> all
>> the fields of the table in the index BUT I really only search on the
>> primary key
>>
>> The reason for this I want to look up UK postcodes (Zip codes to our
>> American brethren) and get their longitude and latitude. A UK postcode
>> identifies a number of houses or commercial buildings. Depending on the
>> area it can be just one building (a big one) or if you are in the country
>> it can be quite a big area. If you sent a letter just to a postcode with
>> no
>> other identifier it probably wouldn?t get delivered, but putting a name on
>> it or a building number, there?s a very good chance the post(wo)?man will
>> deliver it.
>>
>> The CSV file looks like this
>>
>> id,postcode,latitude,longitude
>> 1,AB101XG,57.144165160000000,-2.114847768000000
>> 2,AB106RN,57.137879760000000,-2.121486688000000
>> 3,AB107JB,57.124273770000000,-2.127189644000000
>> 4,AB115QN,57.142701090000000,-2.093014619000000
>> 5,AB116UL,57.137546630000000,-2.112695886000000
>> ?.
>> Couple of million more lines
>>
>> The entire database schema looks like this. I know its complicated but
>> bear with me :)
>>
>> CREATE TABLE "postcode" (
>>          "postcode" text NOT NULL,
>>          "long" TEXT NOT NULL,
>>          "lat" TEXT NOT NULL,
>>         PRIMARY KEY("postcode")
>> );
>>
>> The only query that will ever run will be
>>
>> select long,lat from postcode where postcode = ?<some string>?
>>
>> Note I drop off the id field (column 0 in the CSV file) as its of no
>> interest to me. I also store the long and lat as strings as I don?t want
>> any number formatting changes at all. Rounding on a GPS number could cause
>> the wrong location to be used.
>>
>> The database will do nothing but return long and lat based on doing a
>> postcode lookup. There will never be any updates or changes. If there are,
>> the whole database will be regenerated.
>>
>> I need this to be as fast as possible and if necessary I?ll put it all in
>> RAM. The database is currently 120MB so it would easily fit in RAM. As it
>> never changes (perhaps 4 times per year), it could stay there.
>>
>> Is there anything else from the database schema side that would make
>> things quicker? e.g. If I created an index with postcode, long, lat in,
>> would that be quicker? or if i changed the long, lat to real (though I?m
>> reluctant to do so), would that make a lot of difference?
>>
>> Any suggestions gratefully received and apologies for not being able to
>> find it in the archives.
>>
>> Thanks,
>>
>> Rob
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users at mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
>
> --
>
> Schrodinger's backup: The condition of any backup is unknown until a
> restore is attempted.
>
> Yoda of Borg, we are. Futile, resistance is, yes. Assimilated, you will be.
>
> He's about as useful as a wax frying pan.
>
> 10 to the 12th power microphones = 1 Megaphone
>
> Maranatha! <><
> John McKown
> _______________________________________________
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
D. Richard Hipp
drh at sqlite.org

[sqlite] Handling the whole select query from an index

Reply via email to