On Mon, Aug 2, 2010 at 1:51 PM, Mike Hays <mi...@songbirdnest.com> wrote:

> Hmm - My original pastebin is gone now (oops), so I reposted with the
> output from the sqlite3_ tracing functions (instead of our own performance
> logger) here: http://mozilla.pastebin.org/441814.  Let me know if you see
> any obvious indexes to add, or other obvious problems with how we're
> accessing the database.
>
>
Most of the slow queries seem to be "SELECT count(*) FROM ....".  Such
queries have to visit every row in the table (in order to count the rows)
and thus get progressively slower as the number of entries in the database
increases.

As an alternative, consider keeping running totals of the things you'd like
to count in a separate table, then add TRIGGERs to the main tables to keep
the running totals up-to-date as you INSERT, DELETE, and UPDATE information
in the main tables.
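The trigger approach can be sketched with Python's built-in sqlite3 module. The table and counter names below are hypothetical, not Songbird's actual schema; the point is that the count becomes a single-row lookup instead of a scan:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE media_items (id INTEGER PRIMARY KEY, guid TEXT);

    -- One-row-per-counter table holding the running totals.
    CREATE TABLE counts (name TEXT PRIMARY KEY, n INTEGER);
    INSERT INTO counts VALUES ('media_items', 0);

    -- Keep the total current on every INSERT and DELETE.
    CREATE TRIGGER media_items_ins AFTER INSERT ON media_items
    BEGIN
        UPDATE counts SET n = n + 1 WHERE name = 'media_items';
    END;
    CREATE TRIGGER media_items_del AFTER DELETE ON media_items
    BEGIN
        UPDATE counts SET n = n - 1 WHERE name = 'media_items';
    END;
""")

conn.executemany("INSERT INTO media_items(guid) VALUES (?)",
                 [("a",), ("b",), ("c",)])
conn.execute("DELETE FROM media_items WHERE guid = 'c'")

# O(1) lookup instead of an O(rows) "SELECT count(*)" table scan:
total = conn.execute(
    "SELECT n FROM counts WHERE name = 'media_items'").fetchone()[0]
print(total)  # 2
```

If rows can also move between logical groups via UPDATE, an AFTER UPDATE trigger adjusting both counters would be needed as well.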


> Thanks again,
> Mike
>
> ----- Original Message -----
> From: "Mike Hays" <mi...@songbirdnest.com>
> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
> Sent: Friday, July 30, 2010 2:25:27 PM
> Subject: Re: [sqlite] Optimizing Songbird
>
> Hi Richard,
>
> Thanks for getting back to me so quickly! :)
>
> You're absolutely right that changing memory/cache usage doesn't seem to
> affect performance much.  I've pastebin'd a performance log here:
> http://mozilla.pastebin.org/434277.  It definitely looks like we have some
> duplicate queries, which I've been looking into.  I think this should cover
> your first suggestion (and maybe the second?), but let me know if you need
> more info about what's going on in the Songbird code.  I'll be looking into
> sqlite3_stmt_status() and sqlite3_profile() over the next few days.
>
> Thanks again,
> Mike
>
> ----- Original Message -----
> From: "Richard Hipp" <d...@sqlite.org>
> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
> Sent: Friday, July 30, 2010 1:11:02 PM
> Subject: Re: [sqlite] Optimizing Songbird
>
> On Fri, Jul 30, 2010 at 3:26 PM, Mike Hays <mi...@songbirdnest.com> wrote:
>
> > Hi sqlite-users,
> >
> > I'm working on improving the SQLite database performance for Songbird, an
> > open-source media player based on Firefox/Mozilla.  We're seeing pretty
> poor
> > performance with large media libraries - ideally we'd like to support
> > 100,000 media items and maintain reasonable performance.  I'd really
> > appreciate any help you can give (and so will our users)!
> >
> > For reference, you can see the majority of our code that interacts with
> the
> > DB here:
> >
> http://src.songbirdnest.com/source/xref/client/components/dbengine/src/DatabaseEngine.cpp
> > And the schema is here:
> >
> http://src.songbirdnest.com/source/xref/client/components/library/localdatabase/content/schema.sql
> > The high level is that we store a GUID for each media item and some
> > top-level properties in one table (media_items), and all the user-visible
> > metadata (track name, artist, album, etc.) in another
> (resource_properties).
> >  We use custom collation for things like sorting numbers correctly (so "2
> > Live Crew" comes before "10,000 Maniacs") and localization.
> >
> > I've updated to 3.7.0 (which seems to have better start-up and shut-down
> > times, yay!) and poked around for a few days, and discovered some areas
> I'd
> > like some help investigating.  The first is memory usage - I'm pretty new
> to
> > this stuff so bear with me ;)
> >
> > My test library has around 85,000 tracks, which ends up using about 300MB
> > of DB.  I have pre-allocated a 16000 page cache, with 16KB pages.  My
> memory
> > usage after using the DB for a while looks like:
> > DumpMemoryStatistics() format   Current Highwater
> > SQLITE_STATUS_MEMORY_USED:      264849264       560992624
> > SQLITE_STATUS_PAGECACHE_USED:   8       15996
> > SQLITE_STATUS_PAGECACHE_OVERFLOW:       264435640       560259280
> > SQLITE_STATUS_SCRATCH_USED:     0       0
> > SQLITE_STATUS_SCRATCH_OVERFLOW: 0       102664
> > SQLITE_STATUS_MALLOC_SIZE       328     135807
> > SQLITE_STATUS_PARSER_STACK      0       0
> > SQLITE_STATUS_PAGECACHE_SIZE    16384   16384
> > SQLITE_STATUS_SCRATCH_SIZE      80536   102664
> >
> > What surprises me is that the Highwater page-cache usage is nearly 100%,
>
>
> The fact that your massive page cache sometimes fills completely is a clue
> that you are doing full table scans rather than indexed queries.  Adding
> the right indices will likely make a huge difference; changing database
> sizes and cache sizes probably won't make much difference at all.
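The effect of a missing index is easy to see with EXPLAIN QUERY PLAN; here is a sketch in Python's sqlite3 module (the table and index names are made up for illustration, and the exact plan wording varies by SQLite version):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resource_properties "
             "(item_id INTEGER, property_id INTEGER, value TEXT)")

query = "SELECT value FROM resource_properties WHERE item_id = ?"

# Without an index on item_id, SQLite has to scan the whole table.
before = conn.execute("EXPLAIN QUERY PLAN " + query, (1,)).fetchone()[-1]
print(before)   # e.g. "SCAN resource_properties"

conn.execute("CREATE INDEX idx_props_item ON resource_properties(item_id)")

# With the index, the same query becomes a direct indexed lookup.
after = conn.execute("EXPLAIN QUERY PLAN " + query, (1,)).fetchone()[-1]
print(after)    # e.g. "SEARCH ... USING INDEX idx_props_item ..."
```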
>
> Here are some things to try:
>
> (1) Use the sqlite3_trace() interface to get a log of all SQL statements
> you
> are running.  This log, together with the schema of your database file, is
> useful to us in helping to determine which indices you need.
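Python's sqlite3 module exposes the same hook as set_trace_callback(), which gives a quick sketch of the kind of statement log meant here (table name is arbitrary):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

log = []
conn.set_trace_callback(log.append)   # invoked once per executed statement

conn.execute("CREATE TABLE t(x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")
conn.execute("SELECT count(*) FROM t")

for stmt in log:                      # the captured SQL log to inspect
    print(stmt)
```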
>
> (2) Consider using sqlite3_profile() to figure out which queries are taking
> time.  Focus on optimizing the slow ones.
>
> (3) Read about the sqlite3_stmt_status() interface.  Just before you
> sqlite3_finalize() or sqlite3_reset() each prepared statement, use
> sqlite3_stmt_status() to see how many full-scan steps and how many sorts it
> had to do.  If the answer to either question is non-zero, you have found a
> potential place to optimize using an index.
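Python's sqlite3 module does not expose sqlite3_stmt_status(), but the same two red flags — full-scan steps and sorts — can be spotted ahead of time with EXPLAIN QUERY PLAN, a rough stand-in for the C-level counters (the helper and table names below are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks(name TEXT, artist TEXT)")

def plan_warnings(sql):
    """Return plan steps that imply a full table scan or a sort."""
    warnings = []
    for *_, detail in conn.execute("EXPLAIN QUERY PLAN " + sql):
        if detail.startswith("SCAN") and "USING INDEX" not in detail:
            warnings.append("full table scan: " + detail)
        if "USE TEMP B-TREE" in detail:
            warnings.append("sort without index: " + detail)
    return warnings

# Unindexed ORDER BY: expect both a scan and a temp b-tree sort.
print(plan_warnings("SELECT * FROM tracks ORDER BY artist"))

# With an index on artist, both warnings go away.
conn.execute("CREATE INDEX idx_tracks_artist ON tracks(artist)")
print(plan_warnings("SELECT * FROM tracks ORDER BY artist"))  # []
```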
>
> An example of (3) can be seen in the source code to Fossil in
>
> http://www.fossil-scm.org/fossil/artifact/c5ac4be0c11b7b0b6da0abe7ccf1680e87698b19 (search
> for the symbol "db_stats").  If compiled with FOSSIL_DEBUG, an extra
> "db_stats()" function exists which runs sqlite3_stmt_status() after each
> SQL statement is evaluated and inserts a warning message into the output
> if any inefficient queries are run.  This is very helpful in finding
> inefficient queries, even for small datasets - too small to notice the
> inefficiencies in casual usage.
>
>
>
>
> > but the Current usage is generally less than 10 pages.  Bumping the page
> > size up to 32KB resulted in about twice as many pages in cache after some
> > usage, but that still results in massive overflow (which I'm assuming is
> > significantly slower).  My initial guess is that the majority of the
> > accesses are just too large to fit in cache... Does this sound
> reasonable?
> >  How would I go about correcting this?
> >
> > I'd also like some help with prepared statements and indexing.  We use
> some
> > seemingly complex queries, and I'm having trouble understanding which
> > indexes and prepared statements are actually useful to improve the
> > performance of these queries.  Being relatively unfamiliar with SQL, some
> > documentation is probably what I'm after at the moment.
> >
> > Lastly, if you have any suggestions on the schema we're using, that would
> > probably be very helpful as well.
> >
> > Big thanks in advance :)
> >
> > -Mike
> >
> > _______________________________________________
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
>
>
>
> --
> ---------------------
> D. Richard Hipp
> d...@sqlite.org
>



-- 
---------------------
D. Richard Hipp
d...@sqlite.org
