Re: [sqlite] Network-based DB performance for Mozilla

2005-09-15 Thread Christian Smith
Sorry to bring this up yet again...

On Tue, 13 Sep 2005, D. Richard Hipp wrote:

>On Tue, 2005-09-13 at 14:09 -0700, Brett Wilson wrote:
>> I'm still concerned about the warnings on the web page about some
>> networked file systems not supporting locking. There will be multiple
>> DB connections from the same process. They might even be
>> multithreaded. Might we have a problem in this case?
>>
>
>Most network filesystems do fcntl locks incorrectly if
>at all.  An incorrect implementation of fcntl can result
>in database corruption.


Multiple connections from the same process could (should?) have a single
lock manager for all connections. The threads share memory between them,
hence inter-thread locking should be simple. The actual lock on the
file should be the most restrictive lock required by any of the threads
within a process.

Currently, if I'm not mistaken, SQLite has the following relationship
between btree, pager and OsFile:

btree -> pager -> OsFile
btree -> pager -> OsFile
btree -> pager -> OsFile


What if, instead, we had this relationship:
btree -> pager \
btree -> pager -> OsFile
btree -> pager /

Or perhaps even:
btree -> pager \
btree -> pager -> page cache -> OsFile
btree -> pager /

With the latter, we'd get the benefit that multiple connections can share
the page cache for a file, with some sort of per-page shadow for dirty
pages, much as is done at the moment.

Locking on the file is then done once per process, and at any time is set
to the most restrictive level that any connection requires; it is updated
at the point that level changes. Intra-process locking only needs to touch
the file-level lock when the required level changes. So, if thread 1 has a
shared lock, and thread 2 also requires a shared lock, no change to the
lock level on the file is required. If thread 2 then upgrades to an
exclusive lock, the single lock on the single file can be updated from
thread 2.

Plus, any thread can set the lock on the file to any level, even if the
previous level was set by another thread, so the implementation allows
sharing of connections between threads.

Unfortunately, this would require a major rejigging of the pager code, but
could simplify os_unix.c to remove the need for openCnt and lockInfo, as
the OsFile of the database file would only close when all connections are
idle, and only be (un)locked when the pager lock state changes.
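
A minimal sketch of the per-process lock manager idea, written in Python
rather than SQLite's C so it stays short. The names ProcessLockManager,
LockLevel and set_file_lock are hypothetical and not part of SQLite; the
callback stands in for whatever call would actually lock the file. The
sketch only shows "hold the most restrictive level, touch the file only on
change"; it does not arbitrate conflicts between connections inside the
process.

```python
import threading
from enum import IntEnum


class LockLevel(IntEnum):
    # Ordered from least to most restrictive.
    NONE = 0
    SHARED = 1
    RESERVED = 2
    PENDING = 3
    EXCLUSIVE = 4


class ProcessLockManager:
    """Tracks the level each connection wants; touches the file only on change."""

    def __init__(self, set_file_lock):
        self._set_file_lock = set_file_lock  # hypothetical hook that applies the OS lock
        self._wanted = {}                    # connection id -> LockLevel
        self._file_level = LockLevel.NONE
        self._mutex = threading.Lock()       # any thread may call request()

    def request(self, conn_id, level):
        with self._mutex:
            if level == LockLevel.NONE:
                self._wanted.pop(conn_id, None)
            else:
                self._wanted[conn_id] = level
            # The file must hold the most restrictive level anyone needs.
            needed = max(self._wanted.values(), default=LockLevel.NONE)
            if needed != self._file_level:
                self._set_file_lock(needed)
                self._file_level = needed
            return self._file_level


calls = []                                  # records every file-level lock change
mgr = ProcessLockManager(calls.append)
mgr.request("conn1", LockLevel.SHARED)      # file lock: NONE -> SHARED
mgr.request("conn2", LockLevel.SHARED)      # no file-level change
mgr.request("conn2", LockLevel.EXCLUSIVE)   # file lock: SHARED -> EXCLUSIVE
mgr.request("conn2", LockLevel.NONE)        # file lock drops back to SHARED
```

Four requests result in only three file-level lock changes, which is the
saving that matters when every lock call is a network round trip.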

Does this sound reasonable? Or am I way off base?


>
>Apple has contributed patches to SQLite that claim
>to fix this problem.  Those patches may one day find
>their way into the default release.  In the meantime,
>you can find the patches at:
>
>http://www.sqlite.org/cvstrac/tktview?tn=1240
>


Is there any way to generate a diff against 3.1.3? I'm having problems
getting a sandbox of 3.1.3 from CVS. Are releases tagged? Releases should
have a CVS tag. It's difficult to recreate a release after the fact without
tags (or at least I'm having difficulty!)

Christian

-- 
/"\
\ /ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
 X   - AGAINST MS ATTACHMENTS
/ \


Re: [sqlite] Network-based DB performance for Mozilla

2005-09-14 Thread Edward Macnaghten

Please excuse me doing another reply to this, but...

Sorry about this daft question, but you have indexed everything OK and
designed the database to a reasonable BCNF (Boyce-Codd Normal Form)
model, haven't you?


Eddy

Brett Wilson wrote:

> Hi everybody,
>
> I'm working on replacing a lot of Firefox's storage systems to use
> sqlite. It has been going well so far except for one issue.
>
> The database file is stored in the user's Mozilla profile directory.
> In companies and universities using Linux, this directory is often
> hosted over the network. In these cases, sqlite performance can be
> quite poor.
>
> I am aware that writing performance can be bad in these cases, but we
> don't do too many writes. I am mostly concerned about read
> performance, since a lot of this stuff (history, bookmarks) drives the
> UI. The lag, even over a fast network, can be noticeable. I am also
> concerned about file locking, since the documentation does not
> recommend using files over the network.
>
> Can anybody suggest what to do about this problem? This is not
> something that can be avoided, since some people will have this
> configuration and will not have any say about it. Firefox must perform
> reasonably in these cases.
>
> One thing that could work in our favor is that Mozilla already does
> locking on the profile, so access will be restricted to our one
> process. Is there anything that we can do to take advantage of this to
> avoid having to query the file for reads even when the content is
> cached? It looks like we will have multiple database connections from
> this process.
>
> I will work on minimizing the number of queries in the common cases,
> but any little bit of performance will help here.
>
> Thanks,
> Brett


Re: [sqlite] Network-based DB performance for Mozilla

2005-09-14 Thread Edward Macnaghten

To add my 2p worth to all this

I am not fully aware of the Firefox internals, but as far as my
understanding goes only one process can use any profile at any time.  If
two "instances" of Firefox are fired up for the same user (+ profile)
then, as far as I understand, another thread is started in the first
process for the second instance.


The way I would tackle replacing profile data with SQLite is to enforce
a single process per profile with a lock file (I believe this is already
done), and on start of (the first instance of) Firefox to fire up a
special database thread that opens the database exclusively. That ensures
no on-the-fly locking is required, which probably takes care of the
performance issues.  Any access to the database by Firefox is then done
by passing requests to this special thread (using mutexes, waits and
signals and a global area); the thread then retrieves/updates the data
and passes the result back to the "calling" thread.


Although this is slightly more complex than otherwise, it is not much
more.  It should also increase performance (no on-the-fly locking, as
only one connection is made), increase stability (passing multiple
queries through a single connection in an embedded database is really a
no-no) and do the functions required.
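
A rough sketch of the dedicated database thread described above, using
Python's standard sqlite3, threading and queue modules. A queue replaces
the hand-rolled mutexes, signals and global area; the class name
DatabaseThread and the history table are made up for illustration and have
nothing to do with Firefox's actual code.

```python
import queue
import sqlite3
import threading


class DatabaseThread:
    """Owns the only SQLite connection; other threads submit requests via a queue."""

    def __init__(self, path):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, args=(path,), daemon=True)
        self._thread.start()

    def _run(self, path):
        conn = sqlite3.connect(path)      # created and used only on this thread
        while True:
            item = self._requests.get()
            if item is None:              # shutdown sentinel
                break
            sql, params, reply = item
            try:
                rows = conn.execute(sql, params).fetchall()
                conn.commit()
                reply.put((rows, None))
            except sqlite3.Error as exc:
                reply.put((None, exc))
        conn.close()

    def execute(self, sql, params=()):
        reply = queue.Queue(maxsize=1)
        self._requests.put((sql, params, reply))
        rows, error = reply.get()         # block until the database thread answers
        if error is not None:
            raise error
        return rows

    def close(self):
        self._requests.put(None)
        self._thread.join()


db = DatabaseThread(":memory:")
db.execute("CREATE TABLE history (url TEXT)")
db.execute("INSERT INTO history VALUES (?)", ("http://www.sqlite.org/",))
rows = db.execute("SELECT url FROM history")
db.close()
```

Callers on any thread only ever see execute(); all SQLite calls happen on
the one thread that owns the connection.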


Yours

Eddy

RE: [sqlite] Network-based DB performance for Mozilla

2005-09-14 Thread Brandon, Nicholas


>If you can't tolerate the delays accessing the database over the 
>network, can you make a copy of the database in a temp directory on the 
>local machine on startup. If you copy the file after you lock the 
>profile it should be safe to copy down to local storage. Then use the 
>local database while the application runs, and finally copy the database 
>back to the server, if it has been modified, when the application quits. 

Just to add, this is similar to how Roaming Profiles (in corporate
environments) work on Windows.

Regards
Nick







Re: [sqlite] Network-based DB performance for Mozilla

2005-09-13 Thread Will Leshner


On Sep 13, 2005, at 3:19 PM, Brett Wilson wrote:


> The patch says "improve and support locking on the OSX platform (as
> well as others)". I see at least some enums in there for MSDOS NFS,
> etc.


Well, looking closer at the code, I'm beginning to think it might  
very well be a generic solution that isn't specific to Mac OS X.


Re: [sqlite] Network-based DB performance for Mozilla

2005-09-13 Thread Brett Wilson
The patch says "improve and support locking on the OSX platform (as
well as others)". I see at least some enums in there for MSDOS NFS,
etc.

Can anybody clarify whether this works on other platforms as well?

Basically, the question I have about this patch is: if I access the DB
from more than one connection in the same process (regardless of OS or
FS), will I be safe?

Thanks,
Brett


On 9/13/05, Will Leshner <[EMAIL PROTECTED]> wrote:
> 
> On Sep 13, 2005, at 2:19 PM, D. Richard Hipp wrote:
> 
> > Apple has contributed patches to SQLite that claim
> > to fix this problem.  Those patches may one day find
> > their way into the default release.  In the meantime,
> > you can find the patches at:
> 
> 
> I'm not positive, but I think the Apple patches are Mac OS X-specific.
> 
> --
> REALbasic news and tips: http://rbgazette.com
> KidzMail & KidzLog: http://haranbanjo.com
> 
> 
>


Re: [sqlite] Network-based DB performance for Mozilla

2005-09-13 Thread Will Leshner


On Sep 13, 2005, at 2:19 PM, D. Richard Hipp wrote:


> Apple has contributed patches to SQLite that claim
> to fix this problem.  Those patches may one day find
> their way into the default release.  In the meantime,
> you can find the patches at:



I'm not positive, but I think the Apple patches are Mac OS X-specific.

--
REALbasic news and tips: http://rbgazette.com
KidzMail & KidzLog: http://haranbanjo.com




Re: [sqlite] Network-based DB performance for Mozilla

2005-09-13 Thread D. Richard Hipp
On Tue, 2005-09-13 at 14:09 -0700, Brett Wilson wrote:
> I'm still concerned about the warnings on the web page about some
> networked file systems not supporting locking. There will be multiple
> DB connections from the same process. They might even be
> multithreaded. Might we have a problem in this case?
> 

Most network filesystems do fcntl locks incorrectly if
at all.  An incorrect implementation of fcntl can result
in database corruption.

Apple has contributed patches to SQLite that claim
to fix this problem.  Those patches may one day find
their way into the default release.  In the meantime,
you can find the patches at:

http://www.sqlite.org/cvstrac/tktview?tn=1240
-- 
D. Richard Hipp <[EMAIL PROTECTED]>



Re: [sqlite] Network-based DB performance for Mozilla

2005-09-13 Thread Brett Wilson
I'm still concerned about the warnings on the web page about some
networked file systems not supporting locking. There will be multiple
DB connections from the same process. They might even be
multithreaded. Might we have a problem in this case?

Brett

On 9/13/05, Roger Binns <[EMAIL PROTECTED]> wrote:
> > One possibility is if we think we'll be doing a lot of UI,
> > to make an in-memory "read-only" copy of everything we will
> > need.
> 
> Alternatively you can implement your own "platform" backend
> just as there are for Windows, Linux, Mac etc.  It isn't very
> many functions to implement.
> 
> Since you are doing your own locking, you can turn most of the
> locking calls into no-ops.  You could even track when you do
> writes and let locking go through then.
> 
> Roger
>
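
For comparison, current SQLite builds offer a way to get no-op locking
without writing a custom backend: the nolock=1 URI parameter. The sketch
below assumes a recent SQLite and Python's sqlite3 module, neither of which
is what this thread was written against; the file name is hypothetical. It
is only safe when something else, such as the profile lock, guarantees a
single writer.

```python
import pathlib
import shutil
import sqlite3
import tempfile

tmp_dir = tempfile.mkdtemp()
db_path = pathlib.Path(tmp_dir) / "profile.db"   # hypothetical file name

# nolock=1 asks SQLite to skip file locking for this connection.
uri = db_path.as_uri() + "?nolock=1"
conn = sqlite3.connect(uri, uri=True)
conn.execute("CREATE TABLE bookmarks (url TEXT)")
conn.execute("INSERT INTO bookmarks VALUES (?)", ("http://www.sqlite.org/",))
conn.commit()
rows = conn.execute("SELECT url FROM bookmarks").fetchall()
conn.close()
shutil.rmtree(tmp_dir)
```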


Re: [sqlite] Network-based DB performance for Mozilla

2005-09-13 Thread Joel Lucsy
How about reading it all into :memory: and attaching the network database
with a trigger, so that when an update is made to the :memory: database the
change is reflected in the network copy?

-- 
Joel Lucsy
"The dinosaurs became extinct because they didn't have a space program." -- 
Larry Niven


Re: [sqlite] Network-based DB performance for Mozilla

2005-09-13 Thread Jay Sprenkle
> > I don't think it will matter what database you use if you're forced to
> > have it hosted over a network.
> > As far as I know they all rely on the underlying locking mechanism in the
> > OS, which is inherently slow over a network.
> >
> > Can you avoid multiple locking latency hits by just getting a lock at
> > startup and never relinquishing it?
>
> What about copying (importing) the network database to a ":memory:"
> database, and periodically copy (export) it back?

If it's large, like he mentioned, that will take a while. It sounds like it
needs to be profiled to see what the slow parts are.


-- 
---
The Castles of Dereth Calendar: a tour of the art and architecture of 
Asheron's Call
http://www.lulu.com/content/77264


Re: [sqlite] Network-based DB performance for Mozilla

2005-09-13 Thread Dennis Cote

Brett,

If you can't tolerate the delays accessing the database over the
network, can you make a copy of the database in a temp directory on the
local machine on startup? If you copy the file after you lock the
profile, it should be safe to copy down to local storage. Then use the
local database while the application runs, and finally copy the database
back to the server, if it has been modified, when the application quits.
If you want to be more resistant to data loss due to power failures or
program crashes, you could copy the modified database back to the server
after every write. This would be slower, but you say you are not as
concerned about write performance.





Re: [sqlite] Network-based DB performance for Mozilla

2005-09-13 Thread Dennis Jenkins

Jay Sprenkle wrote:

> I'm glad to see someone is working on this since it was a very noticeable
> problem for me.
>
> I don't think it will matter what database you use if you're forced to have
> it hosted over a network.
> As far as I know they all rely on the underlying locking mechanism in the
> OS, which is inherently slow over a network.
>
> Can you avoid multiple locking latency hits by just getting a lock at
> startup and never relinquishing it?
 

What about copying (importing) the network database to a ":memory:" 
database, and periodically copy (export) it back?
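
One way the import/export could be sketched today is with the backup
method of Python's sqlite3 connections (Python 3.7 and later), which is an
assumption on my part and not something available to this thread's SQLite.
The helper names and the history table are hypothetical.

```python
import os
import shutil
import sqlite3
import tempfile


def load_into_memory(path):
    """Import: copy the on-disk database into a fresh in-memory database."""
    disk = sqlite3.connect(path)
    mem = sqlite3.connect(":memory:")
    disk.backup(mem)
    disk.close()
    return mem


def export_to_disk(mem, path):
    """Export: overwrite the on-disk database with the in-memory contents."""
    mem.commit()                      # make sure nothing is left uncommitted
    disk = sqlite3.connect(path)
    mem.backup(disk)
    disk.close()


# Demo: a temporary file plays the role of the networked database.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "history.db")
seed = sqlite3.connect(path)
seed.execute("CREATE TABLE history (url TEXT)")
seed.execute("INSERT INTO history VALUES ('http://www.mozilla.org/')")
seed.commit()
seed.close()

mem = load_into_memory(path)          # all reads now come from memory
mem.execute("INSERT INTO history VALUES ('http://www.sqlite.org/')")
export_to_disk(mem, path)             # would be called periodically
mem.close()

check = sqlite3.connect(path)
urls = [row[0] for row in check.execute("SELECT url FROM history ORDER BY url")]
check.close()
shutil.rmtree(tmp_dir)
```

Both directions copy the whole database, so the cost grows with file size,
and anything written since the last export is lost on a crash.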




Re: [sqlite] Network-based DB performance for Mozilla

2005-09-13 Thread Brett Wilson
This was brought up but seems like a bad idea for several reasons.
First, the database could get pretty big. This would really kill start
up and shut down times, which is very important to FF users. Second,
what would happen if we crash?

One possibility is if we think we'll be doing a lot of UI, to make an
in-memory "read-only" copy of everything we will need. Writes go to
both the networked file and our memory cache. It would be difficult to
figure out when/how this could happen, though, and we would have to
balance latency vs. the time to slurp all the data locally.

Brett



On 9/13/05, Eric Scouten <[EMAIL PROTECTED]> wrote:
> To what extent is the database shared (either intentionally or
> unintentionally)? Or, put another way, do you have an option to cache
> data locally?
> 
> Since you are restricting access to the profile to a single process at a
> time, your best bet is probably to make a local copy of the DB during
> the app session (you do have *some* local temporary storage, right?) and
> copy that DB back to the network storage at the end of the session.
> 
> -Eric
> 
> 
> P.S. Thanks for your work on Firefox. Like SQLite, it's a great piece of
> software and I'm grateful to be able to use it.


Re: [sqlite] Network-based DB performance for Mozilla

2005-09-13 Thread Jay Sprenkle
I'm glad to see someone is working on this since it was a very noticeable
problem for me.

I don't think it will matter what database you use if you're forced to have 
it hosted over a network.
As far as I know they all rely on the underlying locking mechanism in the 
OS, which is inherently slow over a network.

Can you avoid multiple locking latency hits by just getting a lock at 
startup and never relinquishing it?
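
In current SQLite this is what PRAGMA locking_mode=EXCLUSIVE does: the
connection takes its file lock on first access and keeps it until the
connection is closed. A small sketch, assuming a recent SQLite via Python's
sqlite3 module; the file name is hypothetical, and the pragma should not be
assumed to exist in the release discussed in this thread.

```python
import os
import shutil
import sqlite3
import tempfile

tmp_dir = tempfile.mkdtemp()
conn = sqlite3.connect(os.path.join(tmp_dir, "profile.db"))

# The pragma returns the locking mode now in effect.
mode = conn.execute("PRAGMA locking_mode=EXCLUSIVE").fetchone()[0]

# The lock is taken on first access and then held, so later
# statements avoid repeated lock/unlock round trips.
conn.execute("CREATE TABLE IF NOT EXISTS bookmarks (url TEXT)")
conn.execute("INSERT INTO bookmarks VALUES (?)", ("http://www.sqlite.org/",))
conn.commit()
count = conn.execute("SELECT COUNT(*) FROM bookmarks").fetchone()[0]

conn.close()                          # releases the lock
shutil.rmtree(tmp_dir)
```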






-- 
---
The Castles of Dereth Calendar: a tour of the art and architecture of 
Asheron's Call
http://www.lulu.com/content/77264


Re: [sqlite] Network-based DB performance for Mozilla

2005-09-13 Thread Eric Scouten
To what extent is the database shared (either intentionally or 
unintentionally)? Or, put another way, do you have an option to cache 
data locally?


Since you are restricting access to the profile to a single process at a 
time, your best bet is probably to make a local copy of the DB during 
the app session (you do have *some* local temporary storage, right?) and 
copy that DB back to the network storage at the end of the session.


-Eric


P.S. Thanks for your work on Firefox. Like SQLite, it's a great piece of 
software and I'm grateful to be able to use it.



