Re: [sqlite] CREATE INDEX time degradation

2008-06-18 Thread Alberto Simões
On Wed, Jun 18, 2008 at 3:15 PM, Jay A. Kreibich <[EMAIL PROTECTED]> wrote:
> On Wed, Jun 18, 2008 at 01:39:34PM +0100, Alberto Sim?es scratched on the 
> wall:
>>

 [...]
>
>  What is the original ordering of the data?  In order to index the
>  data, it needs to be sorted, and the sort can get expensive due to
>  page thrashing.

In fact it should be random for all columns.

>> Can it be caused by a larger db file?
>
>  I suppose, although I'm not sure that's directly related.  Index
>  creation is very sensitive to the number of free pages in the page
>  cache, but for an index of that size, chances are you're always over.

Disk access on that cluster was always a problem. It might be that as well.

>  Increase the cache size (see PRAGMA commands).  Actually, this is
>  good practice anytime you need to create an index.

Thanks
Alberto

-- 
Alberto Simões
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] CREATE INDEX time degradation

2008-06-18 Thread Jay A. Kreibich
On Wed, Jun 18, 2008 at 01:39:34PM +0100, Alberto Sim?es scratched on the wall:
> Hi.
> 
> I have a simple table w1,w2,w3,w3, occs and I need indexes for columns w1 to 
> w4.
> Creating them I got:
> 
> $ time sqlite3 _.db 'CREATE INDEX wi1 ON tet (w1)'
> real3m48.159s
 
> $ time sqlite3 _.db 'CREATE INDEX wi2 ON tet (w2)'
> real9m0.107s
 
> $ time sqlite3 _.db 'CREATE INDEX wi3 ON tet (w3)'
> real15m2.199s

> $ time sqlite3 _.db 'CREATE INDEX wi4 ON tet (w4)'
> real26m57.155s

> As these indexes should be completely independent, why this time degradation?

  What is the original ordering of the data?  In order to index the
  data, it needs to be sorted, and the sort can get expensive due to
  page thrashing.
  
  If w1 is "mostly" in order, the index will be created quite quickly.
  If, on the other hand, w4 is mostly reverse order, it will take some
  time to get all the records properly sorted and indexed.

  Try creating the indexes in the reverse order and see if the times
  are the same.

> Can it be caused by a larger db file?

  I suppose, although I'm not sure that's directly related.  Index
  creation is very sensitive to the number of free pages in the page
  cache, but for an index of that size, chances are you're always over.

> To have an idea, _.db at the end of this process:
> 
> $ du -hs _.db
> 1.8G_.db
> 
> 
> And, if this time degradation is derived from the file size: is there
> any trick to make it faster?

  Increase the cache size (see PRAGMA commands).  Actually, this is
  good practice anytime you need to create an index.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"'People who live in bamboo houses should not throw pandas.' Jesus said that."
   - "The Ninja", www.AskANinja.com, "Special Delivery 10: Pop!Tech 2006"
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] CREATE INDEX time degradation

2008-06-18 Thread Alberto Simões
Hi.

I have a simple table w1,w2,w3,w3, occs and I need indexes for columns w1 to w4.
Creating them I got:

$ time sqlite3 _.db 'CREATE INDEX wi1 ON tet (w1)'
real3m48.159s
user3m34.739s
sys 0m5.590s

$ time sqlite3 _.db 'CREATE INDEX wi2 ON tet (w2)'
real9m0.107s
user3m44.161s
sys 0m27.934s

$ time sqlite3 _.db 'CREATE INDEX wi3 ON tet (w3)'
real15m2.199s
user4m8.193s
sys 1m5.937s

$ time sqlite3 _.db 'CREATE INDEX wi4 ON tet (w4)'
real26m57.155s
user4m30.153s
sys 3m19.778s

As these indexes should be completely independent, why this time degradation?
Can it be caused by a larger db file?

To have an idea, _.db at the end of this process:

$ du -hs _.db
1.8G_.db


And, if this time degradation is derived from the file size: is there
any trick to make it faster?

Cheers
Alberto

-- 
Alberto Simões
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users