Re: [sqlite] How to handle large amount of data?

Dennis Cote Mon, 10 Apr 2006 15:59:53 -0700

André Goliath wrote:

Do you (or anyone else) have any experience with those large DBs in terms of
SELECT execution times?
ms, seconds or minutes?


Since I would mainly use field numbers in the WHERE clause I´d create  an
index on them I guess?

With an index on the field number and the object id, you should be ableto retrieve a particular field for a particular object in a few ms. Tosearch through the text of one particular field for all one millionobjects will still take a while, even with this index. You will bereading in and scanning through about 1 GB of data (1E6 objects x 1field per object x 1E3 bytes per field).

This is where the full text search mechanisms come in handy. Your 1 GBof data may only contain 10K unique words, and each 1 KB field probablyhas only about 200 words. If you create a table of unique words, andthen another table that relates each field to each of its unique words,you can find all the fields that contain a word very quickly. Thesesearches can be combined to find all the fields that contain acombination of words, or any one of a set of words etc.

In your case you might want to keep separate tables to relate the wordsof each field_number to the field_data, or add the field number to therelation table. Either of these will make searching for a combination ofa word in a particular field much faster.

This way you trade off some extra work when you insert the data for morespeed when you read the data.

Once you have a field id, you can get full text of the field and theobject id (so that you can get whatever other info you have for theobject) very quickly.

This all assumes you are trying to match various words in the fielddata, and not the entire text of the field data. The later is much simpler.

The client first wanted to store that data in an XML file,
but I was able to convince him to use a DB.

Good, you probably saved yourself a lot of work, and them a lot ofgrief. :-)


HTH
Dennis Cote

Re: [sqlite] How to handle large amount of data?

Reply via email to