Yes. I agree.
What about second approach.
create table if not exists emp (id integer primary key autoincrement, col1
text, col2 text); //without unique.
I tried with insert or ignore into emp (col1, col2) values ('a', 'b');
I noticed, this is fast.
just to know, what might be the reason?
On Tue, Jan 10, 2012 at 2:36 PM, Simon Slavin <[email protected]> wrote:
>
> On 10 Jan 2012, at 8:37am, Durga D wrote:
>
> > I noticed, with first approach, huge performance hit when database grows.
> > for ex: database has 500,000 records. now insert 25,000 within a
> > transaction. It takes lot of time when compared to insert 25,000 records
> > with empty database.
> >
> > approximately 15 times slower. because of unique(col1, col2).
>
> This makes some sense. Every INSERT command involves SQLite looking
> through the records which are already in the table. So a new insert when
> there are only 100 rows in the table is going to be pretty quick, but an
> insert when there are 500,000 rows in the table will involve lots of work.
>
> SQLite has to look up the data of the new record in the index it makes of
> (col1, col2) so it can find out whether the new row is a duplicate of an
> existing row. So it has to do 25,000 searches of a balanced tree. And if
> the tree has 500,000 records in it that means it's doing around ... hmm,
> it's over 3/4 of 1020, call it 765 ... tests for each search of the tree,
> 765 test for each INSERT command.
>
> So inserting 25,000 new rows when you already have 500,000 rows involves
> more than 25,000 * 765 string comparisons. Call it 19 million string
> comparisons. Plus, of course, all the other work involved in messing with
> the database. Since you know how long your strings are you can get an idea
> of how much data SQLite is having to handle to do this.
>
> > I did with transactions for every 25,000 records.
>
> That is a reasonable way to do it. People may be able to recommend
> PRAGMAs which will speed this up a bit but what you're doing really does
> involve a lot of work for the computer to do.
>
> Simon.
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users