What page size do you use in your database? Maybe you should increase it.
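
If you want to try a larger page size on an existing file, something
along these lines should work (a rough sketch using the C API; the
"test.db" path and the 32KB value are only placeholders, and if I
remember right it is the VACUUM that actually rebuilds the file with
the new page size):

#include <stdio.h>
#include <sqlite3.h>

int main(void)
{
  sqlite3 *db;
  char *err = 0;

  /* "test.db" is just a placeholder path for this sketch. */
  if (sqlite3_open("test.db", &db) != SQLITE_OK) {
    fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
    sqlite3_close(db);
    return 1;
  }

  /* Set the desired page size, then VACUUM; the rebuild performed by
     VACUUM is what makes the new page size take effect on a non-empty
     database. 32768 assumes the default SQLITE_MAX_PAGE_SIZE. */
  if (sqlite3_exec(db, "PRAGMA page_size = 32768; VACUUM;",
                   0, 0, &err) != SQLITE_OK) {
    fprintf(stderr, "rebuild failed: %s\n", err);
    sqlite3_free(err);
  }

  sqlite3_close(db);
  return 0;
}

Of course the VACUUM on a 3GB file will take a while, as you already
found out.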

Assuming a fully fragmented table and a page size of 16KB, the I/O
rate should be (if my calculations are correct) in the 2MB/s range
(assuming a hard disk with 50MB/s bandwidth and 7msec seek time).

For a 32KB page size you should get about 4MB/s. The I/O rate is
dominated by the seeks (the rate doubles at 32KB because the hard disk
does roughly half as many seeks), so if SQLite performs additional
seeks (for example, when trying to lock something), everything halves
again; try to avoid any locking if you can.
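
For reference, the arithmetic behind those numbers (one seek per page
for the fully fragmented case, and the same assumed 50MB/s / 7msec
disk as above):

#include <stdio.h>

/* Effective read rate when every page needs its own seek (fully
   fragmented table). Disk model: 50MB/s bandwidth, 7msec per seek. */
static double rate_mb_per_sec(double page_bytes)
{
  double seek_sec = 0.007;               /* one seek per page */
  double xfer_sec = page_bytes / 50e6;   /* transfer at 50 MB/s */
  return page_bytes / (seek_sec + xfer_sec) / 1e6;
}

int main(void)
{
  printf("16KB pages: %.1f MB/s\n", rate_mb_per_sec(16 * 1024));  /* ~2.2 */
  printf("32KB pages: %.1f MB/s\n", rate_mb_per_sec(32 * 1024));  /* ~4.3 */
  return 0;
}

Either way a fully fragmented table stays far below the 20-30 MB/s you
see for the other tables, so the real win is avoiding the
fragmentation (or vacuuming).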

lefteris

Jens Miltner wrote:
> Hi,
> 
> we're seeing terrible performance problems when fetching data from one  
> of our tables:
> The table contains roughly 1.2 Million rows and a plain "SELECT  
> COUNT(*) FROM t" query takes 8 minutes to finish.
> The table contains 10 fields, but the records average about 100
> Bytes of data in total. The largest records are ~1k of data, but
> there are only a few hundred records with more than 256 Bytes of data.
> 
> Another table in the same database contains over 5 Million records  
> (and probably an average of a few hundred Bytes per record). Counting  
> rows in this table only takes about 40 seconds.
> 
> (All times were taken from the first run and are reproducible on
> several machines).
> 
> 
> When looking at a sample taken while sqlite is executing the COUNT(*)  
> query, almost all the time is spent inside read() calls.
> The average I/O rate is about 500-800 kB/s read performance. When
> reading from other databases / other tables, I can easily get
> 20-30 MB/s overall read performance.
> So, I can only conclude that for some reason, there seems to be some  
> terrible I/O behavior for the data in this specific table.
> 
> Running fs_usage shows that there are almost no consecutive pages  
> being read (i.e. the file offsets are not consecutive), so I suspect  
> that for some reason, this table is fragmented all across the 3 GB  
> database file, which may explain the slow overall read performance.
> 
> Now, there are two things that happened to this table, which didn't  
> happen to the other big table:
> 
> (1) we've updated the schema by appending a couple of columns
> (2) when data is updated, it's updated by first deleting a bunch of
> rows and then re-inserting the new data. This happens once or twice a
> day for almost all entries, thus the bulk of the table gets deleted
> and re-inserted every day.
> 
> Would either (1) or (2) lead to heavy fragmentation of this kind?
> 
> Vacuuming the database does resolve the performance issue, but
> vacuuming that specific database took about 75 minutes (MacPro @
> 3GHz), so it's not something we can do frequently (especially since
> the database is used in a server product that has 24/7 uptime) :(
> 
> Is there anything we could do to avoid the table fragmentation in the
> first place (write queries in a different way, avoid
> deletions/re-insertions, etc.)?
> (Note that I'm not talking about reclaiming free space here - that's  
> not really an issue for us. We need to avoid the performance hit we've  
> seen which IMHO is caused by the pages for a single table being  
> scattered all over the database file, thus requiring the paging code  
> to jump a lot in the file).
> 
> 
> Thanks,
> -jens
> 
> 
> BTW: this is SQLite 3.6.3, but at least the COUNT queries are not any
> faster with 3.5.x. Tests were run on Mac OS X.
> 
> 

