Re: [sqlite] sqlite not sharing page cache across processes?

2004-03-22 Thread Mark D. Anderson

On Mon, 22 Mar 2004 14:35:23 -0500, "Doug Currie" <[EMAIL PROTECTED]> said:

> It has no way of knowing (save diffing the db) that "there have been
> no write transactions."

well, that should in principle be easy to accomplish, shouldn't it?
For example by having any writer increment a counter held
in the meta information in the first page of the file (PageOne
or whatever).

> > While I'm making suggestions despite near-total ignorance
> > of sqlite :), have you measured what improvement would
> > be obtained by using readonly mmap() for readers, rather
> > read()? It would save some malloc churning as well as
> > a memory copy. Performance could be additionally tweaked
> > with madvise(SEQUENTIAL) when suitable.
> 
> These functions are not in ANSI C.

No, but they are POSIX.
And there are lots of functions already in the sqlite code base
that are neither ANSI nor POSIX (primarily for windows or mac).

There are several operating systems, such as Solaris that
implement their "cp" using mmap rather than read, and
have done so for ages. So this is really not some piece
of exotica.

There probably are some embedded systems that do not
offer mmap. There are circumstances when mmap will not
perform as well as read. For example, the sqlite case of using
a page only once is pessimal for mmap, because
of TLB bookkeeping overhead on setup and unmapping.
Use of mmap would probably make sqlite even more vulnerable to NFS
than it is already.

I would not suggest mmap as the only solution; as with web servers,
I would suggest the strategy as a configurable option.

Also, even I would hesitate over suggesting mmap for writers, without
a lot of experimentation.

On Mon, 22 Mar 2004 14:56:35 -0500, "D. Richard Hipp" <[EMAIL PROTECTED]>
said:
> Mark D. Anderson wrote:
>  > Have you measured what improvement would be obtained by using readonly mmap()
>  > for readers, rather read()?
> 
> SQLite supports files that are larger than 4GB.  I don't think
> you can mmap() a file that big on a machine with a 32-bit
> address space.

You'd have to use mmap64 for that, if it was desirable to use mmap
at all.

-mda

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] sqlite not sharing page cache across processes?

2004-03-22 Thread D. Richard Hipp
Mark D. Anderson wrote:
> Have you measured what improvement would be obtained by using readonly mmap()
> for readers, rather read()?
SQLite supports files that are larger than 4GB.  I don't think
you can mmap() a file that big on a machine with a 32-bit
address space.
--
D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: [sqlite] sqlite not sharing page cache across processes?

2004-03-22 Thread Doug Currie

Monday, March 22, 2004, 1:46:00 PM, Mark wrote:

>> I have run experiments to see how much faster SQLite might
>> work if it didn't invalidate its cache so quickly.  The
>> answer is about 15%.  So it is probably worth looking into
>> someday. Note that such a change will be backwards but
>> not forwards compatible.  Suppose this change happened
>> for version 2.9.0.  Then a 2.9.0 library would be able
>> to read and write a 2.8.13 database file, but a 2.8.13
>> library would not be able to read and write a 2.9.0
>> database file.  Such a change it not to be made lightly
>> and without careful consideration.

> I don't follow. Why would this cause compatibility issues?

Note that presently (<= 2.8.13), the only mechanism used to coordinate
transactions is the file lock.

> Presumably some benefit could be obtained just by
> not throwing away the cache at the SQLITE_READLOCK to
> SQLITE_UNLOCK transition, but waiting until the next
> SQLITE_UNLOCK to SQLITE_READLOCK transition.
> If at that time there have been no write transactions
> since the cache was created, it can continue to use
> it, otherwise it would then throw it away.

It has no way of knowing (save diffing the db) that "there have been
no write transactions."

[...]

> While I'm making suggestions despite near-total ignorance
> of sqlite :), have you measured what improvement would
> be obtained by using readonly mmap() for readers, rather
> read()? It would save some malloc churning as well as
> a memory copy. Performance could be additionally tweaked
> with madvise(SEQUENTIAL) when suitable.

These functions are not in ANSI C.

e


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] sqlite not sharing page cache across processes?

2004-03-22 Thread D. Richard Hipp
Mark D. Anderson wrote:
My reading of the code has the readers flushing their cache often, and
not 
using the journal at all to maintain consistency. My reading has the 
reader start with nothing in the cache, lock the database with a read 
lock, perform as many SQL queries as are in the single text string it's 
currently parsing, and flush its cache around the time it unlocks the 
database.


So if the cache is thrown away after every transaction, does
that mean that if I do "select * from mytable" twice in a row
(in different transactions), that there will be the same number
of read() system calls for the second one?
Even if there are no intervening changes?
That hardly seems optimal, even within the constraints of the
sqlite architecture.
Yes, that is what it means.

You can make it reuse the cache by putting both SELECTs
inside a BEGIN...COMMIT.  But in so doing, you limit yourself
to a single reader.
I have run experiments to see how much faster SQLite might
work if it didn't invalidate its cache so quickly.  The
answer is about 15%.  So it is probably worth looking into
someday. Note that such a change will be backwards but
not forwards compatible.  Suppose this change happened
for version 2.9.0.  Then a 2.9.0 library would be able
to read and write a 2.8.13 database file, but a 2.8.13
library would not be able to read and write a 2.9.0
database file.  Such a change it not to be made lightly
and without careful consideration.
--
D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: [sqlite] sqlite not sharing page cache across processes?

2004-03-21 Thread ben . carlyle
Mark,





"Mark D. Anderson" <[EMAIL PROTECTED]>
22/03/2004 05:15 AM

 
To: [EMAIL PROTECTED]
cc: 
Subject:    [sqlite] sqlite not sharing page cache across processes?


> It appears from looking at pager.c that every process working
> with a database acquires its own page cache by use of read().
> They independently keep their individual cache in sync by
> playing back the (single) journal.
...
> But is my understanding of the tradeoff correct?

My reading of the code has the readers flushing their cache often, and not 
using the journal at all to maintain consistency. My reading has the 
reader start with nothing in the cache, lock the database with a read 
lock, perform as many SQL queries as are in the single text string it's 
currently parsing, and flush its cache around the time it unlocks the 
database. The cache remains valid only as long as the lock is in-place, 
and the operating-system is responsible for any caching between locks.

Replay of the journal only occurs if a writer has terminated (lost its 
lock without committing or rolling-back its transaction). The next reader 
will roll-back the transaction by restoring original pages from the 
journal.

Benjamin


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]