I think that's exactly what I'm going to do. I'll have to get the OK from my customer regarding the longer time of an INSERT/UPDATE, but I think he will be OK with that. Unless you have better suggestions, my DB will look like this ('Takes' are the objects):
CREATE TABLE [takefields] (
  [take] integer NOT NULL,
  [field] integer NOT NULL,
  [content] memo,
  PRIMARY KEY ([take], [field]) -- composite key, so each take can have many fields
);

CREATE TABLE [takes] (
  [ID] integer NOT NULL,
  [inpoint] integer,
  [outpoint] integer,
  [date] integer,
  [screen] varchar (255),
  [comment] varchar (255),
  PRIMARY KEY ([ID])
);

CREATE TABLE [wordlist] (
  [field] integer,
  [take] integer,
  [word] varchar (255),
  PRIMARY KEY ([word], [field], [take]) -- composite key, so a word can occur in many takes/fields
);

IIRC an INDEX is only a kind of copy of the table it was created on, with fewer columns, right? So I don't think an INDEX would improve the performance here, am I right?

The typical user's request would be: "Get me all Takes in which the words 'Foo' and 'Bar' appear, but only in fields 1, 3 or 5."

Thanks for your help!

-----Original Message-----
From: Dennis Cote [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 11, 2006 1:00 AM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] How to handle large amount of data?

André Goliath wrote:
>Do you (or anyone else) have any experience with those large DBs in terms of
>SELECT execution times?
>ms, seconds or minutes?
>
>Since I would mainly use field numbers in the WHERE clause I'd create an
>index on them I guess?
>

With an index on the field number and the object id, you should be able to retrieve a particular field for a particular object in a few ms.

To search through the text of one particular field for all one million objects will still take a while, even with this index. You will be reading in and scanning through about 1 GB of data (1E6 objects x 1 field per object x 1E3 bytes per field). This is where the full text search mechanisms come in handy.

Your 1 GB of data may only contain 10K unique words, and each 1 KB field probably has only about 200 words. If you create a table of unique words, and then another table that relates each field to each of its unique words, you can find all the fields that contain a word very quickly.
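A minimal sketch of this word-table idea, using Python's sqlite3 module against simplified versions of the takefields/wordlist tables above (the tokenizer and the sample rows are invented for illustration, not part of the proposed design):

```python
import re
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE takefields (take INTEGER, field INTEGER, content TEXT);
    CREATE TABLE wordlist  (field INTEGER, take INTEGER, word TEXT);
    CREATE INDEX wordlist_word ON wordlist (word, field);
""")

def insert_field(take, field, content):
    # The extra work at INSERT time: store the text, then index its unique words.
    con.execute("INSERT INTO takefields VALUES (?, ?, ?)", (take, field, content))
    words = set(re.findall(r"\w+", content.lower()))  # crude word split, placeholder
    con.executemany("INSERT INTO wordlist VALUES (?, ?, ?)",
                    [(field, take, w) for w in sorted(words)])

insert_field(7, 1, "Foo said hello")
insert_field(7, 3, "Bar replied to Foo")

# The read side no longer scans field text; it looks the word up in wordlist.
rows = con.execute("SELECT take, field FROM wordlist WHERE word = 'bar'").fetchall()
print(rows)  # [(7, 3)]
```

The index on (word, field) is what turns the lookup into a quick probe instead of a scan of the 1 GB of field text.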
These searches can be combined to find all the fields that contain a combination of words, or any one of a set of words, etc.

In your case you might want to keep separate tables to relate the words of each field_number to the field_data, or add the field number to the relation table. Either of these will make searching for a combination of a word in a particular field much faster. This way you trade off some extra work when you insert the data for more speed when you read the data.

Once you have a field id, you can get the full text of the field and the object id (so that you can get whatever other info you have for the object) very quickly.

This all assumes you are trying to match various words in the field data, and not the entire text of the field data. The latter is much simpler.

>The client first wanted to store that data in an XML file,
>but I was able to convince him to use a DB.
>

Good, you probably saved yourself a lot of work, and them a lot of grief. :-)

HTH
Dennis Cote
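To make the combined search concrete, here is one possible sketch (not necessarily the exact layout Dennis has in mind) of the typical request quoted earlier, against the proposed wordlist table with the field number kept in the relation table; the sample rows are invented:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE wordlist (field INTEGER, take INTEGER, word VARCHAR(255))")

# Invented sample data: take 7 has both words in fields 1 and 3;
# take 8 has 'Bar' only in field 2, which is outside the wanted field set.
con.executemany("INSERT INTO wordlist VALUES (?, ?, ?)",
                [(1, 7, 'Foo'), (3, 7, 'Bar'),
                 (1, 8, 'Foo'), (2, 8, 'Bar')])

# "Get me all Takes in which 'Foo' and 'Bar' appear, but only in fields 1, 3 or 5":
# one cheap word lookup per word, combined with INTERSECT.
query = """
    SELECT take FROM wordlist WHERE word = 'Foo' AND field IN (1, 3, 5)
    INTERSECT
    SELECT take FROM wordlist WHERE word = 'Bar' AND field IN (1, 3, 5)
"""
matches = [row[0] for row in con.execute(query)]
print(matches)  # [7]
```

Swapping INTERSECT for UNION would answer the "any one of a set of words" variant with the same per-word lookups.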