On Mon, Feb 7, 2011 at 5:28 PM, Stefan Sperling <s...@elego.de> wrote: > On Mon, Feb 07, 2011 at 05:18:57PM +0100, Stefan Sperling wrote: >> On Mon, Feb 07, 2011 at 10:55:47AM -0500, Mark Phippard wrote: >> > On Mon, Feb 7, 2011 at 10:52 AM, Stefan Sperling <s...@elego.de> wrote: >> > >> > > Where is the temporary table stored? Is it back by a file or memory? >> > > If backed by memory, do we have to worry about memory consumption for >> > > large working copies? >> > >> > The patch says it is backed by a file. >> >> Yes, it does say that in a comment but I didn't see where this is being >> enforced in the code. >> Checking the sqlite docs gave the answer: >> >> "When the name of the database file handed to sqlite3_open() or to ATTACH >> is an empty string, then a new temporary file is created to hold the >> database." >> http://sqlite.org/inmemorydb.html >> >> This is what the patch does: >> >> +-- STMT_ATTACH_TEMPORARY_DATABASE >> +ATTACH DATABASE '' AS temp_query_cache; >> >> And note that the sqlite docs also say: >> >> "Even though a disk file is allocated for each temporary database, in >> practice the temporary database usually resides in the in-memory pager >> cache and hence is very little difference between a pure in-memory >> database created by ":memory:" and a temporary database created by an >> empty filename. The sole difference is that a ":memory:" database must >> remain in memory at all times whereas parts of a temporary database >> might be flushed to disk if database becomes large or if SQLite comes >> under memory pressure." >> >> Neat :) > > Actually, we are forcing all temporary databases to be pure memory > in svn_sqlite__open(): > > /* Store temporary tables in RAM instead of in temporary files, but don't > fail on this if this option is disabled in the sqlite compilation by > setting SQLITE_TEMP_STORE to 0 (always to disk) */ > svn_error_clear(exec_sql(*db, "PRAGMA temp_store = MEMORY;")); > > Is this really a good idea? I think we should set it to FILE by default. > http://sqlite.org/pragma.html#pragma_temp_store
I've been wondering about the question "how about storing/buffering the entire query results in memory?" Would this really be a problem, even for very large working copies? I have a quite large working copy checked out here (trunk of an old backup of our work repository): 68,806 files, 9,868 folders (not counting .svn area of course). Ok, it's not huge, there are certainly much larger ones out there, but it's not small either. For comparison: svn trunk is only 1,818 files, 227 folders. I just note that wc.db of this large wc is only 62 MB (the one from svn trunk is only 1.6 MB). And this is a working copy with mostly .java files, all with 3 properties set (svn:eol-style, svn:keywords and cvs2svn:cvs-rev (yeah, this was an old migration experiment, we dropped this one when we did the final migration)). So I'm thinking that any query results will take a maximum of 62 MB of memory, and usually a lot less (most queries won't be reading the entire db). Or is this too naive? The above solution with SQLite temp tables seems like a good approach. And it would be great if the "file-backed" temp tables would be "almost as fast" as the "in-memory" temp tables (with the PRAGMA setting). But if the file-backed temp tables would be significantly slower, I would consider taking the "in-memory" route. Or, even better: make it configurable for the user (client-side config file, or something wc-specific (?)), so he can make the choice between memory and speed for himself. A lot of wc databases out there will be so small that the user will hardly notice the memory increase. Just my 0.02 €... Cheers, -- Johan