Re: [sqlite] EXT :Re: COUNT() extremely slow first time!

2011-02-28 Thread Max Vlasov
Michael,
it's an interesting hypothesis. Maybe you can describe the following
results. Recently I made a test program that just reads a file sequentially
with different block sizes and results are still confusing. This was the
same SSD drive and the functions were basically from api:
CreateFile/ReadFile

128, 256: 5MB/Sec
512: 20MB/Sec
1024: 25MB/Sec.
2048: 41MB/Sec
4096 - 65536: ~50MB/Sec.

Strangely I didn't see such a big difference between slowest and fastest
scenario (if we exclude the sizes below 1024), as you see the difference is
only x2. I have only one explanation looking at the numbers. Although it's
well-known that sqlite reads only full pages, if it sometimes does partial
reading, this 5MB/Sec drop for <256 reading can affect linear speed of 25
MB/Sec to end up as 12MB/Sec. But it's just a guess.

Max


On Mon, Feb 28, 2011 at 4:43 PM, Black, Michael (IS)  wrote:

> Those numbers make sense to me.
>
> Since count is doing a linear walk throughcorrect?
>
> #1 SSDs more than likely don't do "read ahead" like a disk drive.  So what
> you're seeing is what read-ahead does for you.
> #2 Count is doing a linear scan of the table...Probability of 2K containing
> the next page after 1K -- 100% (2X performance improvment)
> #3 Probability of 4K containing the next page after 2K -- 100% (2X
> improvement).
> #4 Probability of 8K containing the next page after 4K -- here the
> probability either drops or we're starting to hit the bandwidth of
> SSD+Sqlite -- I'm tempted to say that it's not the probability that drops.
>
> 
> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on
> behalf of Max Vlasov [max.vla...@gmail.com]
>
> my timing for Select Count for 400-500MB Bases and the table with about
> 1,000,000 records
>
> 1024:
>  40 Seconds, 98% system+user time, 500 MB Read, 12 MB/Sec
> 2048:
>  22 Seconds, 94% system+user time, 500 MB Read, 20 MB/Sec
> 4096 (This NTFS Cluster size)
>  11 Seconds, 96% system+user time, 450 MB Read, 32 MB/Sec
> 8192
>  8 Seconds, 87% system+user time, 420 MB Read, 50 MB/Sec
> 32768
>  8 Seconds, 56% system+user time, 410 MB Read, 50 MB/Sec
>
> Max
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] EXT :Re: COUNT() extremely slow first time!

2011-02-28 Thread Black, Michael (IS)
Those numbers make sense to me.

Since count is doing a linear walk throughcorrect?

#1 SSDs more than likely don't do "read ahead" like a disk drive.  So what 
you're seeing is what read-ahead does for you.
#2 Count is doing a linear scan of the table...Probability of 2K containing the 
next page after 1K -- 100% (2X performance improvment)
#3 Probability of 4K containing the next page after 2K -- 100% (2X improvement).
#4 Probability of 8K containing the next page after 4K -- here the probability 
either drops or we're starting to hit the bandwidth of SSD+Sqlite -- I'm 
tempted to say that it's not the probability that drops.

We've got an SSD on order to test soon too.  


Michael D. Black
Senior Scientist
NG Information Systems
Advanced Analytics Directorate




From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Max Vlasov [max.vla...@gmail.com]
Sent: Sunday, February 27, 2011 3:11 PM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] COUNT() extremely slow first time!

On Sat, Feb 26, 2011 at 1:03 AM, Greg Barker  wrote:

>  harddrive. Can you confirm this?
>
> > Giving this, if the fastest is 3.6 seconds, you have a very fast
> I can confirm this, my tests were run on a machine that uses a solid state
> drive.
>
>
Hmm, yesterday something struck me I can do similar tests on an Asus T91MT
having SSD as the storage. I got similar timings to Greg's. So it seems like
page size is a very sensitive parameter for solid state drives. Looks like
having the page_size lower than the cluster size on NTFS file drive will
have very high price for scanning-intensive operations. What puzzles me is
that CPU time for 1024 'select count' can take about 98% of the time so it's
not the same logic as was for rotating hard drives when a program just waits
for the hard drive to complete the operation. I doubt sqlite does something
special so it's maybe about adaptation of ssd specific by windows, it looks
like it's far from ideal. Can someone point to some tests regarding this?
Since SSD becomes more and more spread, i think it's better to understand
what it going on at least approximately.

my timing for Select Count for 400-500MB Bases and the table with about
1,000,000 records

1024:
  40 Seconds, 98% system+user time, 500 MB Read, 12 MB/Sec
2048:
  22 Seconds, 94% system+user time, 500 MB Read, 20 MB/Sec
4096 (This NTFS Cluster size)
  11 Seconds, 96% system+user time, 450 MB Read, 32 MB/Sec
8192
  8 Seconds, 87% system+user time, 420 MB Read, 50 MB/Sec
32768
  8 Seconds, 56% system+user time, 410 MB Read, 50 MB/Sec

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users