Re: [sqlite] Network-based DB performance for Mozilla
Sorry to bring this up yet again...

On Tue, 13 Sep 2005, D. Richard Hipp wrote:
>On Tue, 2005-09-13 at 14:09 -0700, Brett Wilson wrote:
>> I'm still concerned about the warnings on the web page about some
>> networked file systems not supporting locking. There will be multiple
>> DB connections from the same process. They might even be
>> multithreaded. Might we have a problem in this case?
>
>Most network filesystems do fcntl locks incorrectly if
>at all. An incorrect implementation of fcntl can result
>in database corruption.

Multiple connections from the same process could (should?) have a single lock manager for all connections. The threads share memory between them, hence locking between threads within the process should be simple. The actual lock on the file should be the most restrictive lock required by any of the threads within the process.

Currently, if I'm not mistaken, SQLite has the following relationship between btree, pager and OsFile:

    btree -> pager -> OsFile
    btree -> pager -> OsFile
    btree -> pager -> OsFile

What if, instead, we had this relationship:

    btree -> pager \
    btree -> pager -> OsFile
    btree -> pager /

Or perhaps even:

    btree -> pager \
    btree -> pager -> page cache -> OsFile
    btree -> pager /

With the latter, we'd get the benefit that multiple connections can share the page cache for a file, with some sort of per-page shadow for dirty pages, much as is done at the moment. Locking on the file is then done once per process, and is set at any time to the greatest locking level required, which can be done at the point the locking level changes.

Intra-process locking only requires changing the file-level lock when there is a change. So, if thread 1 has a shared lock, and thread 2 also requires a shared lock, no change to the lock level on the file is required. If thread 2 then changes to an exclusive lock, the single lock on the single file can be updated from thread 2. Plus, any thread can set the lock on the file to any level, even if the previous level was set by another thread, so the implementation allows sharing of connections between threads.

Unfortunately, this would require a major rejigging of the pager code, but it could simplify os_unix.c by removing the need for openCnt and lockInfo, as the OsFile of the database file would only close when all connections are idle, and only be (un)locked when the pager lock state changes.

Does this sound reasonable? Or am I way off base?

>Apple has contributed patches to SQLite that claim
>to fix this problem. Those patches may one day find
>their way into the default release. In the meantime,
>you can find the patches at:
>
>http://www.sqlite.org/cvstrac/tktview?tn=1240

Any way to generate a diff against 3.1.3? I'm having trouble getting a sandbox of 3.1.3 from CVS. Are releases tagged? Releases should have a CVS tag. It's difficult to recreate a release after the fact without tags (or at least I'm having difficulty!)

Christian

--
    /"\
    \ /    ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
     X                           - AGAINST MS ATTACHMENTS
    / \
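The per-process lock manager proposed above can be illustrated with a small sketch. This is not SQLite code: the class name `ProcessLockManager` and the `set_file_lock` callback are hypothetical, and only the lock level names follow SQLite's documented lock states. The sketch shows the bookkeeping only (the file lock is the maximum of what the connections want and is touched only when that maximum changes); it does not arbitrate conflicts between connections, which a real implementation would have to do.

```python
import threading
from enum import IntEnum


class LockLevel(IntEnum):
    # Ordered so that max() picks the most restrictive level.
    UNLOCKED = 0
    SHARED = 1
    RESERVED = 2
    PENDING = 3
    EXCLUSIVE = 4


class ProcessLockManager:
    """Tracks the level each connection wants; the file gets the maximum."""

    def __init__(self, set_file_lock):
        # set_file_lock is a hypothetical callback that would issue the
        # real fcntl() call; in this sketch it only records requests.
        self._set_file_lock = set_file_lock
        self._wanted = {}  # connection id -> LockLevel
        self._file_level = LockLevel.UNLOCKED
        self._mutex = threading.Lock()

    def request(self, conn_id, level):
        with self._mutex:
            if level == LockLevel.UNLOCKED:
                self._wanted.pop(conn_id, None)
            else:
                self._wanted[conn_id] = level
            needed = max(self._wanted.values(), default=LockLevel.UNLOCKED)
            # Touch the file only when the process-wide level changes.
            if needed != self._file_level:
                self._file_level = needed
                self._set_file_lock(needed)
            return self._file_level


calls = []
mgr = ProcessLockManager(calls.append)
mgr.request("conn1", LockLevel.SHARED)     # file: UNLOCKED -> SHARED
mgr.request("conn2", LockLevel.SHARED)     # no change, no file call
mgr.request("conn2", LockLevel.EXCLUSIVE)  # file: SHARED -> EXCLUSIVE
mgr.request("conn2", LockLevel.UNLOCKED)   # file: EXCLUSIVE -> SHARED
print([level.name for level in calls])
```

Four requests result in three calls on the file, which is the saving the proposal is after when each call is a round trip to a network filesystem.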
Re: [sqlite] Network-based DB performance for Mozilla
Please excuse me for sending another reply to this, and sorry about the daft question, but you have indexed everything OK and designed the database to a reasonable BCNF (Boyce-Codd Normal Form) model, haven't you?

Eddy

Brett Wilson wrote:
> Hi everybody,
>
> I'm working on replacing a lot of Firefox's storage systems to use
> sqlite. It has been going well so far except for one issue.
>
> The database file is stored in the user's Mozilla profile directory.
> In companies and Universities using Linux, this directory is often
> hosted over the network. In these cases, sqlite performance can be
> quite poor.
>
> I am aware that writing performance can be bad in these cases, but we
> don't do too many writes. I am mostly concerned about read
> performance, since a lot of this stuff (history, bookmarks) drives the
> UI. The lag, even over a fast network, can be noticeable. I am also
> concerned about file locking, since the documentation does not
> recommend using files over the network.
>
> Can anybody suggest what to do about this problem? This is not
> something that can be avoided, since some people will have this
> configuration and will not have any say about it. Firefox must perform
> reasonably in these cases.
>
> One thing that could work in our favor is that Mozilla already does
> locking on the profile, so access will be restricted to our one
> process. Is there anything that we can do to take advantage of this to
> avoid having to query the file for reads even when the content is
> cached? It looks like we will have multiple database connections from
> this process.
>
> I will work on minimizing the number of queries in the common cases,
> but any little bit of performance will help here.
>
> Thanks,
> Brett
Re: [sqlite] Network-based DB performance for Mozilla
To add my 2p worth to all this: I am not fully aware of the Firefox internals, but as far as my understanding goes only one process can use any profile at any time. If two "instances" of Firefox are fired up for the same user (and profile), then as far as I understand another thread of the first process is started for the second instance.

The way I would tackle replacing profile data with SQLite is to enforce a single process per profile with a lock file (I believe this is already done) and, on start of the first instance of Firefox, to fire up a special database thread that opens the database exclusively. This ensures no on-the-fly locking is required, which probably takes care of the performance issues. Any access to the database by Firefox is then done by passing requests to this special thread (using mutexes, waits and signals, and a global area); the thread then retrieves or updates the data and passes the result back to the "calling" thread.

Although this is slightly more complex than otherwise, it is not much more. It should also increase performance (no on-the-fly locking, as only one connection is made), increase stability (passing multiple queries through a single connection in an embedded database is really a no-no), and do what is required.

Yours
Eddy
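The dedicated database thread described in this message might look roughly like the following. This is a minimal sketch using Python's `sqlite3` and `queue` modules rather than Mozilla's actual threading primitives; the class name `DatabaseThread`, the `history` table and the use of `:memory:` in the demo are illustrative assumptions.

```python
import queue
import sqlite3
import threading


class DatabaseThread:
    """Owns the only connection; other threads submit requests via a queue."""

    def __init__(self, path):
        self._requests = queue.Queue()
        self._thread = threading.Thread(
            target=self._run, args=(path,), daemon=True
        )
        self._thread.start()

    def _run(self, path):
        # The connection is created and used only on this thread.
        conn = sqlite3.connect(path)
        while True:
            item = self._requests.get()
            if item is None:  # shutdown sentinel
                break
            sql, params, reply = item
            try:
                rows = conn.execute(sql, params).fetchall()
                conn.commit()
                reply.put((rows, None))
            except sqlite3.Error as exc:
                reply.put((None, exc))
        conn.close()

    def execute(self, sql, params=()):
        reply = queue.Queue(maxsize=1)
        self._requests.put((sql, params, reply))
        rows, error = reply.get()  # block until the DB thread answers
        if error is not None:
            raise error
        return rows

    def close(self):
        self._requests.put(None)
        self._thread.join()


db = DatabaseThread(":memory:")
db.execute("CREATE TABLE history (url TEXT)")
db.execute("INSERT INTO history VALUES (?)", ("http://www.sqlite.org/",))
rows = db.execute("SELECT url FROM history")
db.close()
print(rows)
```

Because every request is serialized through one queue, callers never touch the connection directly, which is the property the message relies on for avoiding per-query lock traffic.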
RE: [sqlite] Network-based DB performance for Mozilla
>If you can't tolerate the delays accessing the database over the
>network, can you make a copy of the database in a temp directory on the
>local machine on startup. If you copy the file after you lock the
>profile it should be safe to copy down to local storage. Then use the
>local database while the application runs, and finally copy the database
>back to the server, if it has been modified, when the application quits.

Just to add, this is similar to how Roaming Profiles (corporate environment) work on Windows.

Regards
Nick
Re: [sqlite] Network-based DB performance for Mozilla
On Sep 13, 2005, at 3:19 PM, Brett Wilson wrote:

> The patch says "improve and support locking on the OSX platform (as
> well as others)". I see at least some enums in there for MSDOS NFS, etc.

Well, looking closer at the code, I'm beginning to think it might very well be a generic solution that isn't specific to Mac OS X.
Re: [sqlite] Network-based DB performance for Mozilla
The patch says "improve and support locking on the OSX platform (as well as others)". I see at least some enums in there for MSDOS NFS, etc. Can anybody clarify whether this works on other platforms as well?

Basically, the question I have about this patch is: if I access the DB from more than one connection in the same process (regardless of OS or FS), will I be safe?

Thanks,
Brett

On 9/13/05, Will Leshner <[EMAIL PROTECTED]> wrote:
> On Sep 13, 2005, at 2:19 PM, D. Richard Hipp wrote:
>
> > Apple has contributed patches to SQLite that claim
> > to fix this problem. Those patches may one day find
> > their way into the default release. In the meantime,
> > you can find the patches at:
>
> I'm not positive, but I think the Apple patches are Mac OS X-specific.
>
> --
> REALbasic news and tips: http://rbgazette.com
> KidzMail & KidzLog: http://haranbanjo.com
Re: [sqlite] Network-based DB performance for Mozilla
On Sep 13, 2005, at 2:19 PM, D. Richard Hipp wrote:

> Apple has contributed patches to SQLite that claim
> to fix this problem. Those patches may one day find
> their way into the default release. In the meantime,
> you can find the patches at:

I'm not positive, but I think the Apple patches are Mac OS X-specific.

--
REALbasic news and tips: http://rbgazette.com
KidzMail & KidzLog: http://haranbanjo.com
Re: [sqlite] Network-based DB performance for Mozilla
On Tue, 2005-09-13 at 14:09 -0700, Brett Wilson wrote:
> I'm still concerned about the warnings on the web page about some
> networked file systems not supporting locking. There will be multiple
> DB connections from the same process. They might even be
> multithreaded. Might we have a problem in this case?

Most network filesystems do fcntl locks incorrectly if at all. An incorrect implementation of fcntl can result in database corruption.

Apple has contributed patches to SQLite that claim to fix this problem. Those patches may one day find their way into the default release. In the meantime, you can find the patches at:

http://www.sqlite.org/cvstrac/tktview?tn=1240

--
D. Richard Hipp <[EMAIL PROTECTED]>
Re: [sqlite] Network-based DB performance for Mozilla
I'm still concerned about the warnings on the web page about some networked file systems not supporting locking. There will be multiple DB connections from the same process. They might even be multithreaded. Might we have a problem in this case?

Brett

On 9/13/05, Roger Binns <[EMAIL PROTECTED]> wrote:
> > One possibility is if we think we'll be doing a lot of UI,
> > to make an in-memory "read-only" copy of everything we will
> > need.
>
> Alternatively you can implement your own "platform" backend
> just as there are for Windows, Linux, Mac etc. It isn't very
> many functions to implement.
>
> Since you are doing your own locking, you can turn most of the
> locking calls into no-ops. You could even track when you do
> writes and let locking go through then.
>
> Roger
Re: [sqlite] Network-based DB performance for Mozilla
How about reading it all into :memory: and attaching the network database, with a trigger so that when an update is made to the :memory: database the change is reflected in the network copy?

--
Joel Lucsy
"The dinosaurs became extinct because they didn't have a space program." -- Larry Niven
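One way this trigger idea could be wired up is sketched below. It is an assumption-laden illustration, not something proposed in the thread in this form: the file-backed database is opened as the main database and the in-memory copy is attached as `mem` (the reverse of the wording above), the table names `bookmarks` and `bookmarks_cache` are hypothetical, and a temp directory stands in for the network share. It relies on the documented SQLite behaviour that a TEMP trigger may modify tables in any attached database. Only INSERT is mirrored here; UPDATE and DELETE would need their own triggers.

```python
import os
import sqlite3
import tempfile

# Stand-in for the database on the network share (hypothetical path).
net_path = os.path.join(tempfile.mkdtemp(), "profile.sqlite")

conn = sqlite3.connect(net_path)
conn.execute("CREATE TABLE bookmarks (id INTEGER PRIMARY KEY, url TEXT)")
conn.execute("INSERT INTO bookmarks (url) VALUES ('http://www.mozilla.org/')")
conn.commit()

# Load everything into an attached in-memory database.
conn.execute("ATTACH DATABASE ':memory:' AS mem")
conn.execute(
    "CREATE TABLE mem.bookmarks_cache (id INTEGER PRIMARY KEY, url TEXT)"
)
conn.execute("INSERT INTO mem.bookmarks_cache SELECT id, url FROM bookmarks")

# A TEMP trigger is allowed to write to a table in another database,
# so each insert into the cache is replayed against the file.
conn.execute("""
    CREATE TEMP TRIGGER mirror_insert AFTER INSERT ON mem.bookmarks_cache
    BEGIN
        INSERT INTO bookmarks (id, url) VALUES (NEW.id, NEW.url);
    END
""")
conn.commit()

# The application reads and writes the in-memory table only.
conn.execute(
    "INSERT INTO mem.bookmarks_cache (url) VALUES ('http://www.sqlite.org/')"
)
conn.commit()
cached = conn.execute(
    "SELECT url FROM mem.bookmarks_cache ORDER BY id"
).fetchall()
conn.close()

# Reopen the file to confirm the write reached it.
check = sqlite3.connect(net_path)
on_disk = check.execute("SELECT url FROM bookmarks ORDER BY id").fetchall()
check.close()
print(cached == on_disk)
```

Reads are served from memory, while every write still pays the cost of touching the file, so this helps the read-heavy case described earlier in the thread more than it helps writes.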
Re: [sqlite] Network-based DB performance for Mozilla
> > I don't think it will matter what database you use if you're forced to have
> > it hosted over a network.
> > As far as I know they all rely on the underlying locking mechanism in the
> > OS, which is inherently slow over a network.
> >
> > Can you avoid multiple locking latency hits by just getting a lock at
> > startup and never relinquishing it?
>
> What about copying (importing) the network database to a ":memory:"
> database, and periodically copy (export) it back?

If it's large, like he mentioned, that will take a while. It sounds like it needs to be profiled to see what the slow parts are.

--
---
The Castles of Dereth Calendar: a tour of the art and architecture of Asheron's Call
http://www.lulu.com/content/77264
Re: [sqlite] Network-based DB performance for Mozilla
Brett,

If you can't tolerate the delays accessing the database over the network, can you make a copy of the database in a temp directory on the local machine on startup? If you copy the file after you lock the profile, it should be safe to copy down to local storage. Then use the local database while the application runs, and finally copy the database back to the server, if it has been modified, when the application quits.

If you want to be more resistant to data loss due to power failures or program crashes, you could copy the modified database back to the server after every write. This would be slower, but you say you are not as concerned about write performance.
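The copy-down, copy-back cycle described above could be sketched as follows. The helper names `open_local_copy` and `copy_back`, the file name `profile.sqlite` and the `.new` staging suffix are hypothetical, and a temp directory stands in for the network share. The sketch assumes the profile lock is already held, as the message requires.

```python
import os
import shutil
import sqlite3
import tempfile


def open_local_copy(network_path):
    """Copy the database to local temp storage and open the copy."""
    local_dir = tempfile.mkdtemp(prefix="profile-db-")
    local_path = os.path.join(local_dir, os.path.basename(network_path))
    shutil.copy2(network_path, local_path)
    return sqlite3.connect(local_path), local_path


def copy_back(conn, local_path, network_path):
    """Close the local copy; replace the network file if it was modified."""
    modified = conn.total_changes > 0
    conn.commit()
    conn.close()
    if not modified:
        return False
    # Copy next to the target first, then rename, so a crash mid-copy
    # does not leave a half-written database on the server.
    staging = network_path + ".new"
    shutil.copy2(local_path, staging)
    os.replace(staging, network_path)
    return True


# Demo with a temp directory standing in for the network share.
share = tempfile.mkdtemp(prefix="share-")
network_path = os.path.join(share, "profile.sqlite")
seed = sqlite3.connect(network_path)
seed.execute("CREATE TABLE history (url TEXT)")
seed.commit()
seed.close()

conn, local_path = open_local_copy(network_path)
conn.execute("INSERT INTO history VALUES ('http://www.sqlite.org/')")
copied = copy_back(conn, local_path, network_path)

check = sqlite3.connect(network_path)
count = check.execute("SELECT COUNT(*) FROM history").fetchone()[0]
check.close()
print(copied, count)
```

Calling `copy_back`-style logic after every write instead of only at exit gives the more crash-resistant variant mentioned in the last paragraph, at the cost of one full file copy per write.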
Re: [sqlite] Network-based DB performance for Mozilla
Jay Sprenkle wrote:
> I'm glad to see someone is working on this since it was a very
> noticeable problem for me.
> I don't think it will matter what database you use if you're forced to
> have it hosted over a network.
> As far as I know they all rely on the underlying locking mechanism in
> the OS, which is inherently slow over a network.
>
> Can you avoid multiple locking latency hits by just getting a lock at
> startup and never relinquishing it?

What about copying (importing) the network database to a ":memory:" database, and periodically copying (exporting) it back?
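The import/export cycle suggested here can be tried with the online backup support that Python's `sqlite3` module exposes as `Connection.backup` (Python 3.7 and later). That API comes from SQLite releases newer than the ones discussed in this thread, so this is a sketch of the idea rather than something available at the time; the path and the `history` table are hypothetical.

```python
import os
import sqlite3
import tempfile

# Stand-in for the database on the network share (hypothetical path).
net_path = os.path.join(tempfile.mkdtemp(), "profile.sqlite")
disk = sqlite3.connect(net_path)
disk.execute("CREATE TABLE history (url TEXT)")
disk.commit()

# Import: copy the whole on-disk database into memory.
mem = sqlite3.connect(":memory:")
disk.backup(mem)

# Work against the in-memory copy while the application runs.
mem.execute("INSERT INTO history VALUES ('http://www.sqlite.org/')")
mem.commit()

# Export: copy the in-memory database back over the file.
mem.backup(disk)
exported = disk.execute("SELECT url FROM history").fetchall()
mem.close()
disk.close()
print(exported)
```

Each import or export copies the entire database, so the cost grows with file size, and anything written since the last export is lost if the process crashes.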
Re: [sqlite] Network-based DB performance for Mozilla
This was brought up, but it seems like a bad idea for several reasons. First, the database could get pretty big. Copying it would really hurt startup and shutdown times, which are very important to FF users. Second, what would happen if we crash?

One possibility, if we think we'll be doing a lot of UI work, is to make an in-memory "read-only" copy of everything we will need. Writes would go to both the networked file and our memory cache. It would be difficult to figure out when and how this could happen, though, and we would have to balance latency against the time to slurp all the data locally.

Brett

On 9/13/05, Eric Scouten <[EMAIL PROTECTED]> wrote:
> To what extent is the database shared (either intentionally or
> unintentionally)? Or, put another way, do you have an option to cache
> data locally?
>
> Since you are restricting access to the profile to a single process at a
> time, your best bet is probably to make a local copy of the DB during
> the app session (you do have *some* local temporary storage, right?) and
> copy that DB back to the network storage at the end of the session.
>
> -Eric
>
> P.S. Thanks for your work on Firefox. Like SQLite, it's a great piece of
> software and I'm grateful to be able to use it.
Re: [sqlite] Network-based DB performance for Mozilla
I'm glad to see someone is working on this since it was a very noticeable problem for me.

I don't think it will matter what database you use if you're forced to have it hosted over a network. As far as I know they all rely on the underlying locking mechanism in the OS, which is inherently slow over a network.

Can you avoid multiple locking latency hits by just getting a lock at startup and never relinquishing it?

--
---
The Castles of Dereth Calendar: a tour of the art and architecture of Asheron's Call
http://www.lulu.com/content/77264
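Later SQLite releases added a pragma that behaves much like the "get a lock at startup and never relinquish it" idea raised in this message. The sketch below uses `PRAGMA locking_mode = EXCLUSIVE` from Python's `sqlite3` module; the pragma was not available in the SQLite versions under discussion here, and the path and table are hypothetical.

```python
import os
import sqlite3
import tempfile

# Stand-in for the database on the network share (hypothetical path).
path = os.path.join(tempfile.mkdtemp(), "profile.sqlite")
conn = sqlite3.connect(path)

# Ask SQLite to keep its file lock instead of releasing it after each
# transaction. The pragma returns the mode now in effect.
mode = conn.execute("PRAGMA locking_mode = EXCLUSIVE").fetchone()[0]

conn.execute("CREATE TABLE history (url TEXT)")
conn.execute("INSERT INTO history VALUES ('http://www.sqlite.org/')")
conn.commit()  # the lock taken for this write stays held

# A second connection is now shut out of the file.
other = sqlite3.connect(path, timeout=0)
try:
    other.execute("SELECT COUNT(*) FROM history").fetchone()
    blocked = False
except sqlite3.OperationalError:
    blocked = True
other.close()
conn.close()
print(mode, blocked)
```

The trade-off is that every other connection is locked out, including other connections in the same process, so this fits a single-connection design and only works because the profile lock already restricts access to one process.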
Re: [sqlite] Network-based DB performance for Mozilla
To what extent is the database shared (either intentionally or unintentionally)? Or, put another way, do you have an option to cache data locally?

Since you are restricting access to the profile to a single process at a time, your best bet is probably to make a local copy of the DB during the app session (you do have *some* local temporary storage, right?) and copy that DB back to the network storage at the end of the session.

-Eric

P.S. Thanks for your work on Firefox. Like SQLite, it's a great piece of software and I'm grateful to be able to use it.