On 18Dec2013 14:35, Chris Angelico <ros...@gmail.com> wrote:
> On Wed, Dec 18, 2013 at 1:37 PM, Cameron Simpson <c...@zip.com.au> wrote:
> >> I'd say this is the right thing for a DB to do. If it comes back
> >> from a commit() call, it better be on that disk, barring a failure
> >> of the physical hardware. If it comes back from a commit() and data
> >> gets lost because of a power-failure, something is wrong.
> >
> > Depends on your view. People seem to treat dbs as some special form
> > of data storage. I don't; to me they're no different to storing
> > data in any other file. Do you do an fsync() every time you close
> > a file you've written? Of course not, it is a gratuitous performance
> > loss. IMO, I've handed the data to the filesystem layer; its
> > integrity is now the OS's problem.
>
> An SQL database *is* a different form of storage. It's storing tabular
> data, not a stream of bytes in a file. You're supposed to be able to
> treat it as an efficient way to locate a particular tuple based on a
> set of rules, not a different way to format a file on the disk.

Shrug. It's all just data to me. I don't _care_ about the particular
internal storage format.

> If you
> want file semantics, use a file. Otherwise, what do you expect
> commit() to do?

I expect commit() to update the db state, as presented to me via
whatever SQL interface I'm using, to reflect the sum of SQL changes
in the transaction versus unwinding those changes. Whether they
have reached a physical disk and been written onto tiny ferrous
oxide particles using magnets: I don't care. I expect the OS to get
the relevant storage to a permanent medium in due course.

Commit() is a logical operation saying this SQL changeset is now part
of the global state.

Now, a db may involve an fsync() in that commit to beat the hard
drive into claiming to have stored the data, and that may well be
desirable. Almost always. But really, you should have the same
requirements of an arbitrary file you write as you do for a db,
other things being equal.

> Also: the filesystem layer doesn't guarantee integrity. If you don't
> fsync() or fdatasync() or some other equivalent [1], it's not on the
> disk yet, so you can't trust it.

Course I can. There's plenty of scope within the disc physical layer
(buffering, caching, RAID card buffering) for an fsync() to return
_before_ the data are written to ferrous oxide (or whatever) because
the OS DOES NOT KNOW. All that has happened after an fsync() is that
the OS has taken your SQL changeset that you committed to the OS data
abstraction and pushed it one layer lower into the "disk" abstraction.
There's more going on in there.

Cheers,
--
Cameron Simpson <c...@zip.com.au>

One of the most important things you learn from the internet is that there
is no 'them' out there. It's just an awful lot of 'us'. - Douglas Adams
--
https://mail.python.org/mailman/listinfo/python-list
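
For concreteness, the two positions argued in this thread (handing data to the OS versus asking for it to be forced to the device) can be sketched in Python. This is only an illustration under stated assumptions: write_file() and commit_row() are hypothetical helper names, the table t is made up, and sqlite3 is used merely because it ships with Python; the thread does not say which database is under discussion. As noted in the thread, even a successful fsync() only pushes the data one layer down, and the hardware may still be buffering it.

```python
import os
import sqlite3


def write_file(path, data, durable=False):
    """Write bytes to path (hypothetical helper for illustration).

    With durable=False the data is merely handed to the OS.
    With durable=True we additionally ask the OS to push it to the device.
    """
    with open(path, "wb") as f:
        f.write(data)
        if durable:
            f.flush()             # userspace buffer -> OS cache
            os.fsync(f.fileno())  # OS cache -> "disk" abstraction
    return os.path.getsize(path)


def commit_row(db_path, value, durable=True):
    """Insert one row and commit (hypothetical helper for illustration).

    commit() makes the change part of the logical db state either way;
    the SQLite "synchronous" pragma only controls whether SQLite also
    syncs to the device at critical moments (FULL) or leaves that to
    the OS (OFF).
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("PRAGMA synchronous = " + ("FULL" if durable else "OFF"))
        conn.execute("CREATE TABLE IF NOT EXISTS t (v TEXT)")
        conn.execute("INSERT INTO t (v) VALUES (?)", (value,))
        conn.commit()
        # The committed row is visible regardless of the durability setting.
        return conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    finally:
        conn.close()
```

In both helpers the logical result is the same with or without durable=True; the flag only changes how far down the storage stack the data has been pushed by the time the call returns.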