Re: [sqlite] LINKING AND LOADING FTS - on Linux/Ubuntu
> env | grep LD_LIBRARY_PATH

It will display the path list. Now check whether your lib is present in one of those directories. If not, add the path (where fts2 exists) to LD_LIBRARY_PATH. If you are new to Unix/Linux, I suggest going through the basics first.

Regards

- Original Message -
From: Uma Krishnan <[EMAIL PROTECTED]>
Date: Sunday, September 16, 2007 10:38 am
Subject: [sqlite] LINKING AND LOADING FTS - on Linux/Ubuntu

> Hello,
>
> I'm having trouble loading fts2. I modified the makefile to create the
> fts2 library on Linux/Ubuntu.
>
> When I attempt to load fts2 using the command select
> load_extension('fts2'), I get the error "shared library not found".
> (I noticed that the build had not created the .so file, only the .la file.)
>
> What am I doing wrong?
>
> Thanks in advance
>
> Uma

-
To unsubscribe, send email to [EMAIL PROTECTED]
-
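The path check described above can be sketched in Python as well. This is only an illustration: `find_library` is a made-up helper name, not a SQLite or libc API, and it merely mimics how the dynamic loader walks the colon-separated LD_LIBRARY_PATH list.

```python
import os
import tempfile

def find_library(name, search_path):
    """Return the first hit for `name` in a colon-separated directory list."""
    for directory in search_path.split(":"):
        candidate = os.path.join(directory, name)
        if directory and os.path.exists(candidate):
            return candidate
    return None

# Demo: a temporary directory stands in for the directory holding fts2.
with tempfile.TemporaryDirectory() as libdir:
    open(os.path.join(libdir, "libfts2.so"), "w").close()
    search = os.environ.get("LD_LIBRARY_PATH", "") + ":" + libdir
    print(find_library("libfts2.so", search))
```

If the function returns None for your library name, the directory containing the .so is simply not on the search path yet.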
[sqlite] LINKING AND LOADING FTS - on Linux/Ubuntu
Hello,

I'm having trouble loading fts2. I modified the makefile to create the fts2 library on Linux/Ubuntu.

When I attempt to load fts2 using the command select load_extension('fts2'), I get the error "shared library not found". (I noticed that the build had not created the .so file, only the .la file.)

What am I doing wrong?

Thanks in advance

Uma
Re: [sqlite] Adding additional operators to FTS3
FTS1 and FTS2 don't provide any grouping operator (i.e. parentheses), e.g. 'pumpkin (onions OR cheese)'. Are there plans to provide support for it in FTS3 (or FTS4)?

Klemens
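Until grouping is supported, one workaround is to split the query and merge the result sets yourself. A minimal sketch with Python's standard sqlite3 module, using FTS4 (assuming your SQLite build includes it; the table and column names here are invented for the example), emulating 'pumpkin (onions OR cheese)' with a UNION of two plain AND-queries:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE docs USING fts4(body)")
con.executemany("INSERT INTO docs(body) VALUES (?)", [
    ("pumpkin soup with onions",),
    ("pumpkin pie with cheese",),
    ("pumpkin on its own",),
    ("onions and cheese, no squash",),
])

# 'pumpkin (onions OR cheese)' as two implicit-AND queries merged by docid.
rows = con.execute("""
    SELECT docid FROM docs WHERE docs MATCH 'pumpkin onions'
    UNION
    SELECT docid FROM docs WHERE docs MATCH 'pumpkin cheese'
""").fetchall()
print(sorted(r[0] for r in rows))  # [1, 2]
```

UNION also deduplicates documents that match both branches, which is what a grouped query would return.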
[sqlite] Exclusive Transactions and Attached Databases
Dear group,

According to the docs, it is possible to attach the same database file multiple times. After doing so, I wanted to begin an exclusive transaction. Unfortunately, this fails ("database is locked"), which surprised me, as I did not find any notice of this particular situation and its possible side effects in either the attach or the transaction/locking documentation.

Is this intended behaviour of SQLite, or just a side effect of preventing possible deadlocks? The problem occurred with version 3.3.7.

Regards
Michael
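For readers unfamiliar with SQLite's exclusive locks: the sketch below (standard sqlite3 module) shows the general "database is locked" behaviour with two separate connections to one file. It is not a reproduction of the ATTACH-the-same-file-twice case above, which depends on how 3.3.7 handled per-file locks internally, but it illustrates what an exclusive transaction does to other access.

```python
import os
import sqlite3
import tempfile

# Two connections to the same database file; the writer holds an
# exclusive transaction while the reader tries to get in.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(path, isolation_level=None)  # autocommit; explicit BEGIN
writer.execute("CREATE TABLE t (x)")
writer.execute("BEGIN EXCLUSIVE")          # takes the exclusive lock immediately
writer.execute("INSERT INTO t VALUES (1)")

reader = sqlite3.connect(path, timeout=0)  # fail fast instead of busy-waiting
locked = False
try:
    reader.execute("SELECT x FROM t").fetchall()
except sqlite3.OperationalError:
    locked = True                          # "database is locked"

writer.execute("COMMIT")                   # lock released, reader can proceed
print(locked, reader.execute("SELECT x FROM t").fetchall())  # True [(1,)]
```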
[sqlite] Re: Sqlite insertion performance
Kefah T. Issa <[EMAIL PROTECTED]> wrote:
> I tried the ordered-urls-insert; the results were better, but it is
> still taking progressively longer as the number of records increases.
>
> A fundamental question to be asked here:
>
> Shouldn't the time complexity (big-O) of the insert operation be
> constant?

Of course not. It takes O(log N) to find an appropriate place in the index for every new record (where N is the number of records already inserted). It also generates a lot of disk activity once the index grows too large to fit in the memory cache.

> I even did a third test where the integer primary key is not
> autoincrement; the same problem is observed.

The id is not the problem: the O(log N) is caused by the index on url.

Igor Tandetnik
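Igor's point can be seen in miniature. The sketch below (plain Python, nothing SQLite-specific) counts the comparisons a binary search needs to find the insert position among a million sorted keys: about 20, i.e. log2(N). Cheap per row, but unlike a plain append it grows with N, and in a real index each step can touch a disk page once the tree outgrows the cache.

```python
import bisect

class CountingInt(int):
    """An int that counts how often it is compared with `<`."""
    __slots__ = ()
    comparisons = 0
    def __lt__(self, other):
        CountingInt.comparisons += 1
        return int(self) < int(other)

N = 1_000_000
index = [CountingInt(i) for i in range(N)]   # stand-in for a sorted index

CountingInt.comparisons = 0
pos = bisect.bisect_left(index, 123_456)     # find the insert position
print(pos, CountingInt.comparisons)          # 123456, roughly 20 comparisons
```

Doubling N adds only one more comparison, which is exactly the O(log N) growth described above.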
Re: [sqlite] Sqlite insertion performance
Dear Kees and Richard,

Much appreciated.

I tried the ordered-urls-insert; the results were better, but it is still taking progressively longer as the number of records increases.

A fundamental question to be asked here: shouldn't the time complexity (big-O) of the insert operation be constant?

I even did a third test where the integer primary key is not autoincrement; the same problem is observed.

Time complexity is clearly constant when there are no indexes at all (except for the implicit auto-increment integer primary key). But otherwise, the time taken increases steadily (if not exponentially) with the number of existing records.

Acknowledging my ignorance on the subject, I really don't see a reason why this is happening, except as a potential bug or a performance-improvement opportunity.

Regards,
- Kefah.

On Saturday 15 September 2007 12:51:02 Kees Nuyt wrote:
> Hi Kefah,
>
> On Sat, 15 Sep 2007 04:43:46 +0300, you wrote:
> > Thanks Kees,
> >
> > In fact using integer instead of string gives very similar results.
> >
> > Dropping the unique index constraint and then creating it again when
> > needed sounds interesting, as insertion without the unique index is
> > satisfactory and constant over time.
>
> Thanks for the feedback.
>
> > I tried this, but got a trivial question:
> > When creating the unique index, sqlite gives me:
> > SQL error: indexed columns are not unique
> >
> > What should be done here?
>
> Apparently the data in the text column is not unique. That is the
> disadvantage of building the index after the insertions: the database
> can't enforce the constraints on your data, so you would have to do
> that yourself, for example by a sort --unique step. My second
> suggestion for speeding things up was sorting the input data, so now
> you have two reasons for a sort.
>
> Such a data-cleaning step will take considerable time, so time gained
> in inserting may be lost again in preprocessing. It might be better to
> use the database constraints, and live with the slow insertions. Your
> benchmarks will tell you what's best.
>
> > Thanks again,
> > - Kefah.
>
> Good luck.

--
Kefah T. Issa
Manager
freesoft technologies, LLC.
Cell : +962 777 80 90 50
Office : +962 6 55 23 967
Fax : +962 6 55 61 967
Jabber IM (XMPP) : [EMAIL PROTECTED]
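Kees's fill-first, clean, then-index sequence can be sketched with the standard sqlite3 module. Doing the duplicate removal inside the database (keeping the lowest id per url) is my own substitution for his external sort --unique step, not his exact recipe:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE my (id INTEGER PRIMARY KEY, url TEXT)")

# Bulk-load without any index; duplicates slip in because nothing stops them.
urls = ["http://a", "http://b", "http://a", "http://c", "http://b"]
con.executemany("INSERT INTO my(url) VALUES (?)", [(u,) for u in urls])

# Clean: keep the lowest id per url, then the unique index can be built.
con.execute(
    "DELETE FROM my WHERE id NOT IN (SELECT MIN(id) FROM my GROUP BY url)")
con.execute("CREATE UNIQUE INDEX myurl ON my(url)")

print(con.execute("SELECT COUNT(*) FROM my").fetchone()[0])  # 3
```

Without the DELETE step, the CREATE UNIQUE INDEX here would fail with exactly the "indexed columns are not unique" error quoted above.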
Re: [sqlite] Sqlite insertion performance
> I have been struggling with the performance of insertion in sqlite.
> Here we have a very simple case: a table with an integer autoincrement
> primary key and a text field that is unique.
>
> CREATE TABLE my (id PRIMARY KEY, url);
> CREATE UNIQUE INDEX myurl ON my(url);
>
> My application requires inserting up to 10 million records in batches
> of 20 thousand records.

For each group of 20,000 records, first insert them into a TEMP table. Call the temp table t1. Then transfer the records to the main table as follows:

INSERT OR IGNORE INTO my(url) SELECT url FROM t1 ORDER BY url;

D. Richard Hipp
[EMAIL PROTECTED]
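Dr. Hipp's staging technique translates directly to the standard sqlite3 module. A sketch with a tiny batch size so it runs instantly (the helper name `insert_batch` is mine):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE my (id INTEGER PRIMARY KEY, url TEXT)")
con.execute("CREATE UNIQUE INDEX myurl ON my(url)")
con.execute("CREATE TEMP TABLE t1 (url TEXT)")

def insert_batch(batch):
    """Stage a batch in t1, then move it into `my` sorted, skipping duplicates."""
    con.execute("DELETE FROM t1")
    con.executemany("INSERT INTO t1(url) VALUES (?)", [(u,) for u in batch])
    # The ORDER BY makes the unique-index insertions sequential, not random.
    con.execute("INSERT OR IGNORE INTO my(url) SELECT url FROM t1 ORDER BY url")
    con.commit()

insert_batch(["http://c", "http://a", "http://b"])
insert_batch(["http://b", "http://d"])          # duplicate 'http://b' ignored
print(con.execute("SELECT COUNT(*) FROM my").fetchone()[0])  # 4
```

The sorted transfer is the point: appending keys in order keeps each batch's index writes localized instead of scattering them across the whole B-tree.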
Re: [sqlite] Sqlite insertion performance
Hi Kefah,

On Sat, 15 Sep 2007 04:43:46 +0300, you wrote:
> Thanks Kees,
>
> In fact using integer instead of string gives very similar results.
>
> Dropping the unique index constraint and then creating it again when
> needed sounds interesting, as insertion without the unique index is
> satisfactory and constant over time.

Thanks for the feedback.

> I tried this, but got a trivial question:
> When creating the unique index, sqlite gives me:
> SQL error: indexed columns are not unique
>
> What should be done here?

Apparently the data in the text column is not unique. That is the disadvantage of building the index after the insertions: the database can't enforce the constraints on your data, so you would have to do that yourself, for example by a sort --unique step. My second suggestion for speeding things up was sorting the input data, so now you have two reasons for a sort.

Such a data-cleaning step will take considerable time, so time gained in inserting may be lost again in preprocessing. It might be better to use the database constraints, and live with the slow insertions. Your benchmarks will tell you what's best.

> Thanks again,
> - Kefah.

Good luck.

> On Saturday 15 September 2007 00:25:03 Kees Nuyt wrote:
>> On Fri, 14 Sep 2007 23:20:53 +0300, you wrote:
>> > Dear All,
>> >
>> > I have been struggling with the performance of insertion in sqlite.
>> > Here we have a very simple case: a table with an integer
>> > autoincrement primary key and a text field that is unique.
>> >
>> > CREATE TABLE my (id PRIMARY KEY, url);
>> > CREATE UNIQUE INDEX myurl ON my(url);
>> >
>> > My application requires inserting up to 10 million records in
>> > batches of 20 thousand records.
>> >
>> > I use:
>> > A. PRAGMA SYNCHRONOUS = OFF
>> > B. Prepared statements
>> > C. setAutoCommit(false), then back to true at the end of the batch.
>> >
>> > Using the above, the insertion starts off at a good speed, then
>> > drastically slows down as more records are inserted. It goes like
>> > this:
>> >
>> > The first four insertions (first 4*20K -> 80K records):
>> > 0 took 786
>> > 1 took 944
>> > 2 took 1001
>> > 3 took 1303
>> >
>> > After the first 1 million records:
>> > 50 took 2560
>> > 51 took 1921
>> > 55 took 1900
>> > 53 took 3990
>> >
>> > After the 2nd million records:
>> > 2 took 22393
>> > 3 took 16789
>> > 4 took 29419
>> > 5 took 13993
>> >
>> > By the end of the 3rd million records, the insertion time goes up
>> > to 30 seconds per 20K records.
>> >
>> > My app is running from Java code and I am using SqliteJDBC 0.37
>> > (the latest) on Sun JDK 1.6.0_02 on CentOS 5.0, 4 GB RAM, a two
>> > dual-core CPU server with a RAID10 SATA-II hard disk.
>> >
>> > I know I might be stretching sqlite far beyond its limits; I just
>> > want to verify that there is nothing more that can be done to make
>> > a case for sqlite in this scenario. I am not sure what's the best
>> > next thing to do either.
>> >
>> > Your feedback and input will be highly appreciated,
>> >
>> > - Kefah.
>>
>> Most probably the UNIQUE INDEX on the TEXT column is the culprit.
>>
>> My first try would be to create and fill the table first, and create
>> the UNIQUE INDEX on the TEXT column afterwards.
>>
>> The second suggestion would be to INSERT the rows in sorted order,
>> the sort key being the TEXT column.

--
( Kees Nuyt )
c[_]
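Kefah's Java setup (synchronous off, prepared statements, one transaction per batch) maps onto Python's sqlite3 module roughly as below; the batch size is tiny here just so the sketch runs instantly.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA synchronous = OFF")   # A. no fsync between writes
con.execute("CREATE TABLE my (id INTEGER PRIMARY KEY, url TEXT)")
con.execute("CREATE UNIQUE INDEX myurl ON my(url)")

batches = [[f"http://site-{i}/{j}" for j in range(100)] for i in range(3)]
for batch in batches:
    # B. executemany reuses one prepared statement;
    # C. the `with` block commits the whole batch as one transaction.
    with con:
        con.executemany("INSERT INTO my(url) VALUES (?)",
                        [(u,) for u in batch])

print(con.execute("SELECT COUNT(*) FROM my").fetchone()[0])  # 300
```

Even with all three settings, the unique index still does an O(log N) lookup per row, which is why the thread's other suggestions (sorted input, or building the index afterwards) are what actually change the slope.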