Re: Re[10]: [sqlite] Accessing Database on Network

Mrs. Brisby Wed, 10 Aug 2005 19:46:01 -0700

On Wed, 2005-08-10 at 11:30 +0200, djm wrote:
> Hello,
> 
> 
> >> My understanding, after all of your helpful comments,  currently is:
> >> 
> >> Simultaneous reading of a server file by different clients is safe, as
> >> long as the file doesn't get changed at all (at least in a time scale
> >> where the client OS could cache it).
> 
> Mrs.> Remove your parenthesized exception and this is correct.
> 
> My current plan is:
> 
> Store the (readonly) database on the server and include version info
> in a table in the database. Each time the client app starts, it
> accesses the server database and checks the version info. Based on
> this it decides whether it needs to copy the database file locally
> (the server version is newer or there is no local version) or can use
> the current local version (the version of the server database is the
> same as the local version).


So the server database never changes?

How do new versions get published?

[[ This is why I suggested a protocol with well known semantics. ]]


> It's possible that more than one client will be accessing this version
> info simultaneously. Can there be any problem? Using sqlite to read
> this version info via a SELECT should map cleanly to a read-only
> access to the disk file, right? This is the only point where I'm not
> fully sure. According to the above, this should then be fine?

Read-only access isn't always "safe" in the sense that all of ACID can
be satisfied. MS-DFS, for example, is one filesystem where it isn't.

> The server database will need to be updated at some stage. The sys
> admin will have to be responsible for seeing that no client is
> accessing the file while it's being updated (by disconnecting the server
> from the network or doing the job at night when all clients are off,
> or just making sure that none are running my app or whatever). What
> should I advise him is a necessary/reasonable procedure, and what are
> the worst case scenarios for each approach?

I've already told you what's safe. SQLite has a broad range of
requirements. If you cannot lock down your own requirements any better
than "the client knows more about engineering than I do" and "I have no
idea what operating systems, networked filesystems and platforms will
ever be used for this application", you're either absurdly
future-proofing yourself, or you only think you are.

NFS has well known and well defined semantics. So does SMB/CIFS. Their
semantics can be made compatible with SQLite (using -osync and the urls
I posted last). DFS, AFS and Bullet cannot be made safe with SQLite
unless the filename has the version encoded in it and you use purely
atomic file creation (rename).
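
To make the "version in the filename plus rename" idea concrete, here is
a minimal sketch in Python. The app-v<N>.db naming scheme and the
publish/latest_version helpers are made up for illustration; the only
real calls are tempfile.mkstemp, os.fsync and os.replace. It assumes the
rename is atomic on the filesystem that holds the directory, which is
exactly the property you would have to verify for your network
filesystem.

```python
import os
import re
import tempfile

# Hypothetical naming scheme: app-v<version>.db
_NAME = re.compile(r"^app-v(\d+)\.db$")

def publish(db_bytes, directory, version):
    """Publish a new database file under a name that encodes its version."""
    final_path = os.path.join(directory, f"app-v{version}.db")
    # Write to a temporary file in the SAME directory so the final
    # rename never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(db_bytes)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before renaming
        # Readers see either no file or the complete file under final_path.
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path

def latest_version(directory):
    """Return the highest published version number, or None if there is none."""
    versions = []
    for name in os.listdir(directory):
        match = _NAME.match(name)
        if match:
            versions.append(int(match.group(1)))
    return max(versions) if versions else None
```

A client would call latest_version() on the share and copy the matching
file only if it does not already hold that version. Published files are
never modified afterwards, so a stale cached copy of an old file is
still a correct copy of that old version.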

HTTP is probably the simplest.

> As far as I can understand, the only thing that absolutely must be
> ensured is that no app starts after the sys admin begins replacing the
> file, and before he's finished, because then the client would
> possibly get corrupt data. This could be ensured, e.g., by
> temporarily disconnecting the server from the network.

Are you trying to remove race conditions or not?

> Given this, the worst that could happen (which is extremely unlikely, but not
> completely impossible), due to client OS caching, would be that
> upon restarting, the clients then don't know that the server database
> is newer, and don't copy it locally, and instead continue to work with the
> old data (local version).

...

> Does this sound like a good approach?

No.

> >> The same applies to sqlite accesses since it just uses regular files.
> >> And when you're only querying the database (not changing the data), you
> >> are just reading the database file, i.e. sqlite doesn't do anything
> >> which would break the condition that clients only read.
> >> 
> >> Am I right so far?
> 
> Mrs.> Almost.
> 
> Mrs.> You make a distinction between Client and Server and I don't
> Mrs.> think you know who they are.
> 
> What's that supposed to mean?
> 
> Mrs.> It's helpful to understand that the server cannot update files. Ever.
> Mrs.> Only clients can update files. The server coordinates these updates. In
> Mrs.> reality, in many cases, the server coordinates fine. In others, it fails
> Mrs.> miserably.
> 
> You can work with the server just as a normal machine. It happens to
> be functioning as a network server for clients on the network. E.g. the
> system admin can copy a database file from a CD onto the hard disk of the
> server. The server updates the database here, not the client.

No you cannot. That's the point. You only think you can. When you update
a file (say) on CIFS, this could invalidate oplocks and force clients to
redownload. Those notifications can reach clients at a different rate.

Are you trying to remove race conditions? Or are you simply trying to
make them harder to find?

> >> The size of a file depends for example on the cluster size of the
> >> underlying partition. And regardless, it is of little value in comparing
> >> two files; the same file size/name doesn't mean two files are
> >> identical.
> 
> Mrs.> The size of a file does not depend on the cluster size of the underlying
> Mrs.> partition. That's how much SPACE the file takes up.
> 
> Mrs.> Whether or not your filesystem records the size of a file isn't
> Mrs.> important.
> 
> It is. It's all that matters here.

How do you figure this? I do not think you know what you're talking
about.

> Mrs.> The concept of file size exists, and it has nothing to do
> Mrs.> with clusters or blocks or extents... almost all
> Mrs.> filesystems...correct
> 
> One wants to reliably determine this invariant size and not the space
> occupied on disk or some other measure reported which depends on cluster
> size or whatever. "Almost" or "roughly" is not reliable.

Nobody said almost or roughly until you edited my reply.

I do not think you know what you are talking about.

"Whether or not your filesystem records the size of a file isn't
important. The concept of file size exists, and it has nothing to do
with clusters or blocks or extents."

Size has nothing to do with filesystem. CP/M doesn't explicitly record
the file size but files STILL have an exact size. If they're TEXT it's
to the EOF marker. If they're not, then it's always a multiple of the
cluster size.

But there still exists size. If you tried to copy from a machine like
this, your files would increase in size until they DID match. If you're
copying TO a machine like this, then be aware that SQLite's block size
is a multiple of the cluster size (or soon will be!) such that the
problem becomes moot again.

"As a matter of fact, almost all filesystems record the correct file
size."

All this means is that if you call stat(file,&sb) then sb.st_size is
going to be correct. SQLite doesn't run on any platforms where this
isn't true.
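
The same distinction, sketched in Python, where os.stat wraps the stat()
call above. The helper name is made up. st_blocks is only reported on
POSIX-style platforms, so the sketch treats it as optional.

```python
import os

def size_and_allocation(path):
    """Return (exact size in bytes, allocated bytes or None)."""
    sb = os.stat(path)
    # st_size is the exact length of the file's contents.
    size = sb.st_size
    # st_blocks counts 512-byte units actually allocated; this is the
    # number that depends on cluster/block size. Not every platform has it.
    blocks = getattr(sb, "st_blocks", None)
    allocated = blocks * 512 if blocks is not None else None
    return size, allocated
```

A five-byte file reports a size of 5 no matter how large a cluster the
filesystem hands out for it; only the second number varies.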


> Mrs.> The test is not to determine whether or not the files are identical,
> 
> It is.

I do not think you know what you are talking about.

> Mrs.> but if one has been changed. This method will certainly download
> Mrs.> files when they haven't been changed (although it's unlikely).
> Mrs.> This unusual case wastes some bandwidth. In contrast to
> Mrs.> downloading it all the time, where you always waste bandwidth.
> 
> If you can't tell when files are identical, you can't tell when you don't
> need to download, so you need to download always, so you needn't bother
> checking the filesize in the first place.

You don't have to. You only need to find out if you don't know. That's
what caching is all about.
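
A rough sketch of that kind of check in Python. The Signature type and
the helper names are invented for illustration, and using size plus
modification time as the signature is an assumption, not something
SQLite does for you. It also assumes the stat() result you get back for
the remote file is current.

```python
import os
from typing import NamedTuple, Optional

class Signature(NamedTuple):
    size: int
    mtime_ns: int

def signature(path: str) -> Signature:
    """Cheap fingerprint of a file: exact size plus modification time."""
    sb = os.stat(path)
    return Signature(sb.st_size, sb.st_mtime_ns)

def needs_refresh(cached: Optional[Signature], remote_path: str) -> bool:
    """True when there is no cached copy or the remote file looks changed.

    A changed signature does not prove the contents differ; it only means
    we no longer know, so we download again.
    """
    return cached is None or cached != signature(remote_path)
```

The worst case of a false alarm is one unnecessary copy. That is still
far cheaper than copying on every start.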

DNS doesn't know if records are identical before it fetches new
versions. It uses expiration dates.

HTTP doesn't know if records are identical before it fetches new
versions (sans the ETag extension). It uses expiration dates and
If-Modified-Since conditions.
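
For the HTTP case, a conditional fetch might look roughly like this in
Python, using only the standard library. The function names are
placeholders and the URL would be whatever your server exposes. It
assumes the server honours If-Modified-Since and answers 304 Not
Modified when nothing has changed.

```python
import email.utils
import urllib.error
import urllib.request
from typing import Optional

def conditional_request(url: str, cached_mtime: Optional[float]) -> urllib.request.Request:
    """Build a GET that asks for the body only if it changed since cached_mtime."""
    req = urllib.request.Request(url)
    if cached_mtime is not None:
        # HTTP dates are always expressed in GMT.
        stamp = email.utils.formatdate(cached_mtime, usegmt=True)
        req.add_header("If-Modified-Since", stamp)
    return req

def fetch_if_changed(url: str, cached_mtime: Optional[float]) -> Optional[bytes]:
    """Return the new contents, or None if the server says our copy is current."""
    try:
        with urllib.request.urlopen(conditional_request(url, cached_mtime)) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: keep using the local copy
            return None
        raise
```

The client never compares files itself. It tells the server what it
already has and lets the server decide whether to send anything.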

> Mrs.> If you're looking for cross-platform you need to select protocols that
> Mrs.> have well defined semantics. HTTP sounds like a good bet. Full file copy
> Mrs.> sounds like a better one.
> 
> My app will access the files on the network (LAN) as if they're local.

Sounds like you've made your decision.
