Robert, > > i guess, i can believe this. however its pretty disappointing to get > > 50% improvement on 30 times less dataset :( > > > > but how do you explain this? > > > > sqlite> .schema data > > CREATE TABLE data > > ( > > a INTEGER, > > b INTEGER, > > c CHAR, > > d INTEGER, > > e INTEGER, > > n1 FLOAT, > > n2 FLOAT > > ); > > CREATE INDEX data_by_a ON data (a); > > > > > time sqlite3 db 'select n1 from data where a <= 18234721' > > > /dev/null > > 26.15u 0.59s 0:27.00 99.0% > > > time sqlite3 db 'select n1 from data where a <= 18234721' > > > /dev/null > > 26.04u 0.61s 0:26.91 99.0% > > > > and > > > > > time sqlite3 db 'select e from data where a <= 18234721' > /dev/null > > 12.22u 0.41s 0:12.67 99.6% > > > time sqlite3 db 'select e from data where a <= 18234721' > /dev/null > > 12.01u 0.61s 0:12.64 99.8% > > > > so just by selecting two different rows (both next to each other). i > > get another 50% time difference? sure the column types are different, > > and i can even understand that FLOAT might be 8 bytes and INTEGER 4 > > bytes, but 50% time difference? it just cant be that linear can it? > > > > do you think it is possible to get better results by issuing multiple > > queries each of which will return even less dataset? and, no, i'm not > > a database guy. > > The most glaring fault in your time tests that I see is that you're running > the command-line sqlite3, which is printf'ing all 339,000+ rows to dev/null. > It doesn't show on your screen, but the work is still being done regardless.
well, i do print these rows to /dev/null in perl test too. > If you really want a fair speed test, change those statements to: > > select count(n1) from data where a <= 18234721 > time sqlite3 db 'select count(n1) from data where a <= 18234721' > /dev/null 7.79u 0.70s 0:08.50 99.8% > And > > select count(e) from data where a <= 18234721 > time sqlite3 db 'select count(e) from data where a <= 18234721' > /dev/null 7.90u 0.42s 0:08.31 100.1% > Or even > > select count(*) from data where a <= 18234721 > time sqlite3 db 'select count(*) from data where a <= 18234721' > /dev/null 1.35u 0.16s 0:01.47 102.7% 8 times faster then count(n1) or count(e)? i'm confused. i guess it just used "a" field (on which it had index?) > THEN tell us what the difference in performance is ... fine, if i ask sqlite just to count the rows it wins hands-down, but i really want these rows. even more i 'd like to then "natural join" these rows with a couple of other tables to really do what the perl code currently does. but, it takes 22 seconds to just to create a temp table with the required dataset > time sqlite3 db 'create temp table foo as select * from data where a <= > 18234721' > /dev/null 21.93u 0.89s 0:22.95 99.4% and i do not understand what i'm doing wrong here :( thanks, max