Hi,
There was a topic on here a few weeks ago which I cannot remember and cannot
find in the mess that is my e-mail system and after spending the last hours
search the SQLite archives I still can?t find it so will ask here if anybody
can remember or help. I can?t even format the question for Google to search on
:(
The specific question I have is about trying to provide the fastest response
possible to a select query. I recall that the e-mail talked about using an
index to satisfy the query and therefore never having to go out to get the rest
of the data from the table, so it was a lot quicker. Is there anything that I
need to do specially to make this happen. e.g. if I put all the fields of the
table in the index BUT I really only search on the primary key
The reason for this I want to look up UK postcodes (Zip codes to our American
brethren) and get their longitude and latitude. A UK postcode identifies a
number of houses or commercial buildings. Depending on the area it can be just
one building (a big one) or if you are in the country it can be quite a big
area. If you sent a letter just to a postcode with no other identifier it
probably wouldn?t get delivered, but putting a name on it or a building number,
there?s a very good chance the post(wo)?man will deliver it.
The CSV file looks like this
id,postcode,latitude,longitude
1,AB101XG,57.144165160000000,-2.114847768000000
2,AB106RN,57.137879760000000,-2.121486688000000
3,AB107JB,57.124273770000000,-2.127189644000000
4,AB115QN,57.142701090000000,-2.093014619000000
5,AB116UL,57.137546630000000,-2.112695886000000
?.
Couple of million more lines
The entire database schema looks like this. I know its complicated but bear
with me :)
CREATE TABLE "postcode" (
"postcode" text NOT NULL,
"long" TEXT NOT NULL,
"lat" TEXT NOT NULL,
PRIMARY KEY("postcode")
);
The only query that will ever run will be
select long,lat from postcode where postcode = ?<some string>?
Note I drop off the id field (column 0 in the CSV file) as its of no interest
to me. I also store the long and lat as strings as I don?t want any number
formatting changes at all. Rounding on a GPS number could cause the wrong
location to be used.
The database will do nothing but return long and lat based on doing a postcode
lookup. There will never be any updates or changes. If there are, the whole
database will be regenerated.
I need this to be as fast as possible and if necessary I?ll put it all in RAM.
The database is currently 120MB so it would easily fit in RAM. As it never
changes (perhaps 4 times per year), it could stay there.
Is there anything else from the database schema side that would make things
quicker? e.g. If I created an index with postcode, long, lat in, would that be
quicker? or if i changed the long, lat to real (though I?m reluctant to do so),
would that make a lot of difference?
Any suggestions gratefully received and apologies for not being able to find it
in the archives.
Thanks,
Rob