Dear Kees and Richard, Much appreciated.
I tried the ordered-urls-insert the results were better, but it is still taking progressively longer time as the number of records increases. A fundamental question to be asked here : Shouldn't the time complexity (Big-O) of the insert operation be constant? I even did a third test where the integer primary key is not auto increment; the same problem is observed. Time complexity is clearly constant when there are no indexes at all (except for the implicit auto-increment integer primary key) But otherwise, time taken increases incrementally (if not exponentially) with the number of existing records. Acknowledging my ignorance on the subject; I really don't see a reason why this is happening except being a potential bug or performance improvement opportunity. Regards, - Kefah. On Saturday 15 September 2007 12:51:02 Kees Nuyt wrote: > Hi Kefah, > > On Sat, 15 Sep 2007 04:43:46 +0300, you wrote: > >Thanks Kees, > > > >In fact using integer instead of string gives very similar results. > > > >Dropping the unique index constraint and then creating it again when > > needed sounds interesting, as insertion without the unique index is > > satisfactory and constact over time. > > Thanks for the feedback. > > >I tried this, but got a trivial question : > >When creating the unique index, sqlite gives me : > >SQL error: indexed columns are not unique > > > >What should be done here? > > Apparently the data in the text column is not unique. > That is the disadvantage of building the index after the > insertions: the database can't exercise the contraints on your > data, so you would have to do that yourself, for example by a > sort --unique step. My second suggestion for speeding things was > sorting the input data, so now you have two reasons for a sort. > > Such a data cleaning step will take considerable time, so time > gained in inserting may be lost again in preprocessing. > It might be better to use the database constraints, and live > with the slow insertions. Your benchmarks will tell you what's > best. > > >Thanks again, > >- Kefah. > > Good luck. > > >On Saturday 15 September 2007 00:25:03 Kees Nuyt wrote: > >> On Fri, 14 Sep 2007 23:20:53 +0300, you wrote: > >> >Dear All, > >> > > >> >I have been struggling with the performance of insertion in sqlite. > >> > > >> >Here we have a very simple case : > >> > > >> >A table with an integer autoincrement primary key and a text field that > >> > is unique. > >> > > >> >CREATE TABLE my (id PRIMARY KEY, url); > >> > > >> >CREATE UNIQUE INDEX myurl ON my(url); > >> > > >> > > >> >My application requires inserting up to 10 million records in batches > >> > of 20 thousand records. > >> > > >> >I use : > >> > A. PRAGMA SYNCHRONOUS = OFF > >> > B. Prepared Statement > >> > C. setAutoCommit(false), then to true at the end of the batch. > >> > > >> >Using the above, the insertion starts off at a good speed, then > >> > drastically slows down as more records are inserted. > >> > > >> >It goes like this : > >> > > >> >The first four inserstions (first 4*20K -> 60K records) > >> >0 took 786 > >> >1 took 944 > >> >2 took 1001 > >> >3 took 1303 > >> > > >> >After the first 1 Million records > >> >50 took 2560 > >> >51 took 1921 > >> >55 took 1900 > >> >53 took 3990 > >> > > >> >After the 2nd million records > >> >2 took 22393 > >> >3 took 16789 > >> >4 took 29419 > >> >5 took 13993 > >> > > >> >By the end of the the 3rd million records the insertion time goes up to > >> > 30 seconds per 20K records. > >> > > >> >My app is running from a java code and I am using SqliteJDBC 0.37 (the > >> > latest) on Sun JDK 1.6.0_02 on CentOS 5.0 4GB Ram, two duo core cpu > >> > server with raid10 sata-II harddisk. > >> > > >> > > >> >I know I might be stretching sqlite far beyond its limits, I just want > >> > to verify that there is nothing more that can be done to make a case > >> > for sqlite in this scenario. I am not sure whats the best next thing > >> > to do either. > >> > > >> >Your feedback and input will be highly appreciated, > >> > > >> >- Kefah. > >> > >> Most probably the UNIQUE INDEX on the TEXT column is the > >> culprit. > >> > >> My first try would be to create and fill the table first, and > >> create the UNIQUE INDEX on the TEXT column afterwards. > >> > >> The second suggestion would be to INSERT the rows in sorted > >> order, the sort key being the TEXT column. -- Kefah T. Issa Manager >/. freesoft technologies llc freesoft technologies, LLC. Cell : +962 777 80 90 50 Office : +962 6 55 23 967 Fax : +962 6 55 61 967 Jabber IM (XMPP) : [EMAIL PROTECTED] ----------------------------------------------------------------------------- To unsubscribe, send email to [EMAIL PROTECTED] -----------------------------------------------------------------------------