On 18Dec2013 14:35, Chris Angelico <ros...@gmail.com> wrote:
> On Wed, Dec 18, 2013 at 1:37 PM, Cameron Simpson <c...@zip.com.au> wrote:
> >> I'd say this is the right thing for a DB to do.  If it comes back
> >> from a commit() call, it better be on that disk, barring a failure
> >> of the physical hardware.  If it comes back from a commit() and data
> >> gets lost because of a power-failure, something is wrong.
> >
> > Depends on your view. People seem to treat dbs as some special form
> > of data storage. I don't; to me they're no different to storing
> > data in any other file. Do you do an fsync() every time you close
> > a file you've written? Of course not, it is a gratuitous performance
> > loss.  IMO, I've handed the data to the filesystem layer; its
> > integrity is now the OS's problem.
> 
> An SQL database *is* a different form of storage. It's storing tabular
> data, not a stream of bytes in a file. You're supposed to be able to
> treat it as an efficient way to locate a particular tuple based on a
> set of rules, not a different way to format a file on the disk.

Shrug. It's all just data to me. I don't _care_ about the particular
internal storage format.

> If you
> want file semantics, use a file. Otherwise, what do you expect
> commit() to do?

I expect commit() to update the db state, as presented to me via
whatever SQL interface I'm using, to reflect the sum of SQL changes
in the transaction versus unwinding those changes.

Whether they have reached a physical disk and been written onto
tiny ferrous oxide particles using magnets: I don't care. I expect
the OS to get the relevant storage to a permanent medium in due
course.

Commit() is a logical operation saying this SQL changeset is now
part of the global state.

Now, a db may involve an fsync() in that commit to beat the hard
drive into claiming to have stored the data, and that may well be
desirable. Almost always.
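
To make that concrete, here is a minimal sketch using SQLite purely as
an example (the thread is about databases in general, and other engines
expose this differently). SQLite's "synchronous" pragma controls whether
a commit waits for a sync or just hands the data to the OS. The function
name, table name and file names below are made up for illustration.

```python
import os
import sqlite3
import tempfile

# Numeric values SQLite reports for each synchronous setting.
SYNC_LEVELS = {"OFF": 0, "NORMAL": 1, "FULL": 2}

def commit_with_sync_mode(db_path, mode):
    """Insert one row and commit with PRAGMA synchronous set to `mode`.

    Returns (synchronous level reported by SQLite, row count after commit).
    """
    if mode not in SYNC_LEVELS:
        raise ValueError("mode must be one of %r" % sorted(SYNC_LEVELS))
    conn = sqlite3.connect(db_path)
    try:
        # Set the pragma before any transaction is open.
        # OFF: commit returns once the data is handed to the OS.
        # FULL: commit asks the OS to sync before returning.
        conn.execute("PRAGMA synchronous = " + mode)
        conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
        conn.execute("INSERT INTO t (x) VALUES (1)")
        conn.commit()  # logically, the row is part of the db state either way
        level = conn.execute("PRAGMA synchronous").fetchone()[0]
        rows = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
        return level, rows
    finally:
        conn.close()

if __name__ == "__main__":
    tmpdir = tempfile.mkdtemp()
    for mode in ("OFF", "FULL"):
        path = os.path.join(tmpdir, mode.lower() + ".db")
        print(mode, commit_with_sync_mode(path, mode))
```

Either way the SQL-level result of commit() is identical; the setting
only changes how hard the engine leans on the OS before returning.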

But really, you should have the same requirements of an arbitrary
file you write as you do for a db, other things being equal.

> Also: the filesystem layer doesn't guarantee integrity. If you don't
> fsync() or fdatasync() or some other equivalent [1], it's not on the
> disk yet, so you can't trust it.

Course I can. There's plenty of scope within the disc physical layer
(buffering, caching, RAID card buffering) for an fsync() to return
_before_ the data are written to ferrous oxide (or whatever) because
the OS DOES NOT KNOW.

All that has happened after an fsync() is that the OS has taken the
SQL changeset that you committed to the OS data abstraction and
pushed it one layer lower into the "disk" abstraction. There's more
going on in there.
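
For an ordinary file, the layers look roughly like this in Python. This
is only a sketch of the hand-offs; the function name is hypothetical,
and nothing in it can tell you what the drive or RAID card does with
the data after fsync() returns.

```python
import os
import tempfile

def write_through_layers(path, data):
    """Write bytes to `path`, pushing them down one layer at a time.

    Returns the file size as reported by the OS afterwards.
    """
    with open(path, "wb") as f:
        f.write(data)         # layer 1: Python's userspace buffer
        f.flush()             # layer 2: handed to the OS (page cache)
        os.fsync(f.fileno())  # layer 3: OS asked to push it to the device
        # Below this point the "disk" abstraction takes over: drive
        # caches and controller buffers are not visible from here.
    return os.path.getsize(path)

if __name__ == "__main__":
    tmpdir = tempfile.mkdtemp()
    target = os.path.join(tmpdir, "example.dat")
    print(write_through_layers(target, b"some data\n"))
```

Dropping the os.fsync() line gives you the "hand it to the filesystem
and let the OS worry" behaviour; keeping it moves the data one
abstraction lower and no further.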

Cheers,
-- 
Cameron Simpson <c...@zip.com.au>

One of the most important things you learn from the internet is that there is
no 'them' out there. It's just an awful lot of 'us'. - Douglas Adams
