On Mon, Feb 7, 2011 at 5:28 PM, Stefan Sperling <s...@elego.de> wrote:
> On Mon, Feb 07, 2011 at 05:18:57PM +0100, Stefan Sperling wrote:
>> On Mon, Feb 07, 2011 at 10:55:47AM -0500, Mark Phippard wrote:
>> > On Mon, Feb 7, 2011 at 10:52 AM, Stefan Sperling <s...@elego.de> wrote:
>> >
>> > > Where is the temporary table stored? Is it back by a file or memory?
>> > > If backed by memory, do we have to worry about memory consumption for
>> > > large working copies?
>> >
>> > The patch says it is backed by a file.
>>
>> Yes, it does say that in a comment but I didn't see where this is being
>> enforced in the code.
>> Checking the sqlite docs gave the answer:
>>
>>   "When the name of the database file handed to sqlite3_open() or to ATTACH
>>    is an empty string, then a new temporary file is created to hold the
>>    database."
>> http://sqlite.org/inmemorydb.html
>>
>> This is what the patch does:
>>
>> +-- STMT_ATTACH_TEMPORARY_DATABASE
>> +ATTACH DATABASE '' AS temp_query_cache;
>>
>> And note that the sqlite docs also say:
>>
>>   "Even though a disk file is allocated for each temporary database, in
>>    practice the temporary database usually resides in the in-memory pager
>>    cache and hence is very little difference between a pure in-memory
>>    database created by ":memory:" and a temporary database created by an
>>    empty filename. The sole difference is that a ":memory:" database must
>>    remain in memory at all times whereas parts of a temporary database
>>    might be flushed to disk if database becomes large or if SQLite comes
>>    under memory pressure."
>>
>> Neat :)
>
> Actually, we are forcing all temporary databases to be pure memory
> in svn_sqlite__open():
>
>  /* Store temporary tables in RAM instead of in temporary files, but don't
>     fail on this if this option is disabled in the sqlite compilation by
>     setting SQLITE_TEMP_STORE to 0 (always to disk) */
>  svn_error_clear(exec_sql(*db, "PRAGMA temp_store = MEMORY;"));
>
> Is this really a good idea? I think we should set it to FILE by default.
> http://sqlite.org/pragma.html#pragma_temp_store

I've been wondering about the question "how about storing/buffering
the entire query results in memory?" Would this really be a problem,
even for very large working copies?

I have a quite large working copy checked out here (trunk of an old
backup of our work repository): 68,806 files, 9,868 folders (not
counting .svn area of course). Ok, it's not huge, there are certainly
much larger ones out there, but it's not small either. For comparison:
svn trunk is only 1,818 files, 227 folders.

I just note that wc.db of this large wc is only 62 MB (the one from
svn trunk is only 1.6 MB). And this is a working copy with mostly
.java files, all with 3 properties set (svn:eol-style, svn:keywords
and cvs2svn:cvs-rev (yeah, this was an old migration experiment, we
dropped this one when we did the final migration)).

So I'm thinking that any query results will take a maximum of 62 MB of
memory, and usually a lot less (most queries won't be reading the
entire db). Or is this too naive?

The above solution with SQLite temp tables seems like a good approach.
And it would be great if the "file-backed" temp tables would be
"almost as fast" as the "in-memory" temp tables (with the PRAGMA
setting). But if the file-backed temp tables would be significantly
slower, I would consider taking the "in-memory" route.

Or, even better: make it configurable for the user (client-side config
file, or something wc-specific (?)), so he can make the choice between
memory and speed for himself. A lot of wc databases out there will be
so small that the user will hardly notice the memory increase.

Just my 0.02 €...
Cheers,
-- 
Johan

Reply via email to