Maksim,

Some things you could try:

1) increase cache memory

You may be causing a lot of cache misses if the size of the query result
is very large compared to the size of the cache. An index-based search
can reload the same page many times, because the index visits rows in
key order rather than storage order, so there is little locality in the
cache. An index-less full scan loads each page exactly once.

As an experiment, run 'PRAGMA cache_size = 300000;' before you run the
query. cache_size is measured in pages, so with the default 1024-byte
pages that is roughly 300 MB of cache, which 1 GB of RAM should be able
to support.
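
Note that cache_size only lasts for the current connection, so it has to
go into the same sqlite3 invocation as the query itself. A sketch, reusing
the database, table, column, and value from your earlier mails (your
timings will of course differ):

  time sqlite3 db 'PRAGMA cache_size = 300000; select n1 from data where a <= 18234721' > /dev/null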

2) use 8192-byte pages

Larger pages seem to improve performance quite a bit, in my experience.

Run 'PRAGMA page_size = 8192;' before you create the database; the page
size is fixed once the first table has been written to the file.
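
Because of that, you would build a fresh database and copy the data into
it, along the lines of your attach experiment. A sketch ('newdb' is just
a placeholder file name):

  sqlite3 newdb
  sqlite> PRAGMA page_size = 8192;
  sqlite> attach 'db' as s;
  sqlite> create table data as select * from s.data;
  sqlite> create index data_by_a on data (a);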

Doing both may cause excessive memory use (even 200000 pages * 8K =
1.6 GB, beyond your 1 GB of RAM). I've never tried that combination.
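
If you do try the combination, it is probably worth shrinking cache_size
so the cache stays in the same byte range: 300000 1K pages is the same
amount of memory as 37500 8K pages. Again a sketch, with the placeholder
'newdb' from above:

  time sqlite3 newdb 'PRAGMA cache_size = 37500; select n1 from data where a <= 18234721' > /dev/null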


Gé



Maksim Yevmenkin wrote:

>Robert,
>
>>[snip]
>>
>>>i said i print these rows to /dev/null too in my perl code. plus the
>>>perl code does some other things such as joining these rows with other
>>>hashes and summing the numbers.
>>
>>That's fine.  I was merely trying to account for the 50% speed difference
>>between the two differing column tests, which has been accomplished.
>>
>>>>As for the temp table ... I haven't tried this, but isn't "temp" a
>>>>reserved word in SQLite?  More importantly, you should be doing this
>>>>statement inside
>>>
>>>yes, it is. i really want to create 'temporary table' in memory. i was
>>>really hoping it would speed things up.
>>
>>I misread the statement, so ignore me on that part.  However, 339,000 rows
>>into a temporary in-memory table ... I tried some experiments locally here
>>and none of them took more than 2 seconds to execute.  Are you sure you're
>>not using up all available memory, which is causing the system to hit the
>>swapfile?  What does this same query look like when you drop the "temp" from
>>the query?
>
>the system has 1G of ram. i was "monitoring" sqlite3 memory usage with
>'top'. the SIZE and RES did not exceed 30M. so i do not think the
>memory is the issue here.
>
>>time sqlite3 db 'create table foo as select * from data where a <= 18234721' > /dev/null
>
>22.06u 1.39s 0:27.75 84.5%
>
>so pretty much the same time without 'temp'.
>
>i'm starting to suspect disk. here is what i did. i created a separate
>database with only one table. this table contains a subset of 333392
>rows from the original data table. it also has the same index on the "a"
>column, i.e. i did
>
>>sqlite3 db1
>
>sqlite> attach db as s;
>sqlite> create table data as select * from s.data where a <= 18234721;
>sqlite> create index data_by_a on data (a);
>
>full scan
>
>>time sqlite3 db1 'select n1 from data' > /dev/null
>
>17.19u 0.55s 0:19.06 93.0%
>
>"bad" index scan, because it is guaranteed that the table only has
>keys that match the "where"
>
>>time sqlite3 db1 'select n1 from data where a <= 18234721' > /dev/null
>
>25.73u 0.59s 0:28.37 92.7%
>
>+10 seconds! is this the overhead of the "indexed" scan? is this what it
>really takes to seek back and forth between the index and the data? what
>am i missing here?
>
>thanks,
>max