I have a system requirement that feeds thousands of keyed records, each containing a blob of around a megabyte, into an SQLite database. OK so far. But the records come from multiple processes, and I can't afford a commit after each insertion for obvious performance reasons (if you have ever tried that, you understand why). So suppose we batch the commits and, via an external method not discussed here, serialize the order of the commits from the various processes so they don't overlap. An "inserting" process will block until it has "commit rights". Are we still OK, i.e. does this leave the database in a sound condition? (Commits can be batched a few hundred at a time, maybe more.) Contention for commit rights is unfortunate, but at the end of the day, if I can get 20 or so records inserted per second out of this Rube Goldberg arrangement, I'm OK with that, because the committing is certainly the slow part, not the ability of the multiple feeders to supply records to insert.
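For concreteness, here is a minimal sketch of what one feeder's batched commit could look like once it holds commit rights. Python's sqlite3 module, the table name "records", and the batch handling are all illustrative assumptions, not part of the real system:

    import sqlite3

    def flush_batch(db_path, batch):
        """Insert a batch of (key, blob) pairs in a single transaction.

        Assumes the caller already holds exclusive "commit rights" via the
        external serialization mechanism, so no other process is writing.
        """
        # isolation_level=None puts the connection in autocommit mode, so the
        # explicit BEGIN/COMMIT below are the only transaction boundaries.
        conn = sqlite3.connect(db_path, timeout=60, isolation_level=None)
        try:
            conn.execute("BEGIN")
            conn.executemany(
                "INSERT INTO records (key, data) VALUES (?, ?)", batch)
            conn.execute("COMMIT")   # one fsync for a few hundred ~1 MB blobs
        finally:
            conn.close()

    # e.g. flush_batch("big.db", [("key-0001", blob1), ("key-0002", blob2)])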
Here is a bad extra ingredient that may be a deal breaker. The processes are on machines spread over a network, and they have all opened the same database on the target machine where the database file lives. My understanding is that Windows network file systems and their locking are a recipe for SQLite corruption. (Assume the serialization trick I mentioned earlier can be made to work even when the processes invoking it are on the network.) So even if we were OK without the network issue, are we now into an unworkable design? The common table into which the insertions go has no indexes on it, of course; we build those later, after all the processes have finished.

In summary:

Alternative I: multiple SQLite processes on different machines accessing the same database, with commits serialized (by a method not covered here). Assume disaster without this serialized-commit trick.

Alternative II: I could design a system service that runs a single instance of SQLite on the target machine and have the users on the other machines add records through that service, i.e. design my own little client-server subsystem. I have done similar things before, but it is a lot of complexity, and if I don't need it, I don't want to do it. (A rough sketch of such a service appears at the end of this message.)

Alternative III: Maybe I should just use another SQL product (MySQL, SQL Server, ...) and ODBC to do this. I don't like that complexity either.

Alternative IV: If there were a lightning-fast SQLite bulk loader, I could let each process build its own database first and then dump (merge) them all together as a final step. That's an alternative design, but without a fast back-door bulk loader it does not seem too good (all that extra I/O), ugh. (See the ATTACH-based merge sketch at the end of this message.)

Pure insertion speed (pair of 4.8 ms disks in a RAID 0 array): the final size of the database is almost out of the range of SQLite, between 100 and 120 GB. I have tested SQLite on databases of this size for insertion speed, and it was not as bad as I expected from the caveats in the Owens SQLite book: an average of 12 inserts per second over the load of the entire 120 GB file, 16 per second for a 100 GB database, and for reasonable sizes like 1 GB you can do 100 inserts per second (all in one transaction, of course). If I can get that kind of speed with multiple load sources located on different machines, I would have what I need.

Pure retrieval speed: at around 120 GB, 125,000 records with blobs of about 1 MB each, I could get over 600 retrievals per second. So while insertions hurt, retrievals are fine.
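On Alternative II, the service would not have to be elaborate. Here is a minimal sketch, assuming Python, a table named "records", and a simple length-prefixed wire format; all of those are illustrative choices, not anything SQLite prescribes. Since only this one process ever touches the database file, the network file-locking problem goes away:

    import socketserver
    import sqlite3
    import struct

    DB_PATH = "big.db"   # assumed location of the single database on the target machine

    conn = sqlite3.connect(DB_PATH, isolation_level=None)   # autocommit; explicit BEGIN/COMMIT below
    conn.execute("CREATE TABLE IF NOT EXISTS records (key TEXT, data BLOB)")

    class InsertHandler(socketserver.StreamRequestHandler):
        def handle(self):
            # One transaction per client connection: the feeder streams its whole
            # batch of records, and we commit once when it disconnects.
            conn.execute("BEGIN")
            while True:
                header = self.rfile.read(8)          # 4-byte key length + 4-byte blob length
                if len(header) < 8:
                    break
                key_len, blob_len = struct.unpack("!II", header)
                key = self.rfile.read(key_len).decode("utf-8")
                blob = self.rfile.read(blob_len)
                conn.execute("INSERT INTO records (key, data) VALUES (?, ?)", (key, blob))
            conn.execute("COMMIT")

    if __name__ == "__main__":
        # Single-threaded server: feeders are handled one at a time, which gives
        # the same commit serialization as the external locking trick, for free.
        with socketserver.TCPServer(("0.0.0.0", 5000), InsertHandler) as server:
            server.serve_forever()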
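And on Alternative IV: the closest thing SQLite has to a back-door bulk loader is ATTACH. Each per-process database file can be attached to the target database and copied with INSERT ... SELECT, which keeps all the per-row work inside the engine, though it still rewrites every blob into the target file, so the extra-I/O concern stands. A sketch, again assuming Python and a table named "records" in every file:

    import sqlite3

    def merge_databases(target_path, source_paths):
        """Merge per-feeder databases into the target with ATTACH + INSERT...SELECT."""
        conn = sqlite3.connect(target_path, isolation_level=None)
        try:
            for path in source_paths:
                conn.execute("ATTACH DATABASE ? AS src", (path,))
                conn.execute("BEGIN")
                # All rows are copied inside the SQLite engine in one statement.
                conn.execute("INSERT INTO main.records SELECT * FROM src.records")
                conn.execute("COMMIT")
                conn.execute("DETACH DATABASE src")   # must run outside a transaction
        finally:
            conn.close()

    # e.g. merge_databases("big.db", ["feeder1.db", "feeder2.db", "feeder3.db"])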