> Any threads or processes would still be in contention for the same resource: > access to that file on disk.
Midway through the run, `iostat` is showing me exactly 0 disk accesses, so I'm assuming it's a locking problem and not a physical I/O bottleneck. I'm hoping that you folks could give me an idea of what I'm doing wrong. (It could also be a problem with the Mac OS locking.) > The best indexes to create will depend on whether this is a one-time job or > whether you're going to have to do it repeatedly with similar or completely > different data. For each row, I'm making a query for similar rows. The full DB is 17 million rows. I've tried adding all the indexes I can think of, but the only one that helps this query is the R*tree index. They were all simple indices on one or more columns, so if you have an idea on a more complex index, I'd apply it and test it out. ~Seth On Aug 4, 2011, at 9:54 AM, Simon Slavin wrote: > > On 4 Aug 2011, at 4:44pm, Seth Price wrote: > >> I was hoping that SQLite was doing that with the JOIN statement. It's >> narrowing down the results with the query from the R*table (col*min & >> col*max). A large query will have ~14k rows at this point. Then JOINs with >> the original table (data) and it's narrowing results further with the >> remaining part of the WHERE (col*). The large query will have ~11k rows. >> After this it groups by the 'class' column in 'data' and counts how many >> rows per class. >> >> I'm really skeptical that I can chop this up into many queries to make it >> run faster. I've always assumed that if I can get "final" data back from a >> query, then it's exactly the query I'm looking for. I would think that >> running three queries would use three times the resources. > > Threading is a red herring. Any threads or processes would still be in > contention for the same resource: access to that file on disk. This is what > caused the observation in your 'Subject' header. All the threads are doing > is waiting for another thread to get out the way. > > Speed increases can come from adding one or more indexes. The best indexes > to create will depend on whether this is a one-time job or whether you're > going to have to do it repeatedly with similar or completely different data. > > Simon. > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users