On Wed, 18 Dec 2013 21:50:00 +1100, Chris Angelico <ros...@gmail.com> wrote:
>On Wed, Dec 18, 2013 at 9:31 PM, Cameron Simpson <c...@zip.com.au> wrote:
>> On 18Dec2013 14:35, Chris Angelico <ros...@gmail.com> wrote:
>>> An SQL database *is* a different form of storage. It's storing tabular
>>> data, not a stream of bytes in a file. You're supposed to be able to
>>> treat it as an efficient way to locate a particular tuple based on a
>>> set of rules, not a different way to format a file on the disk.
>>
>> Shrug. It's all just data to me. I don't _care_ about the particular
>> internal storage format.
>
>Then use a file, because you want file semantics. That's why you have
>both options available.
>
>> Commit() is a logical operation saying this SQL changeset is now
>> part of the global state.
>
>The global state is defined by what's on the disk. Specifically, by
>what would be read if the power failed right at that moment. In the
>case of PostgreSQL, a commit doesn't actually write the table pages -
>it just writes the WAL (Write-Ahead Log), which is used to recreate
>the transaction. If something fails hard, the WAL replay will apply
>the change perfectly. That's the global state. It's not there till the
>WAL's been fsync'd.
>
<snip>
>
>> All that has happened after an fsync() is that the OS has taken your
>> SQL changeset that you committed to the OS data abstraction and
>> pushed it one layer lower into the "disk" abstraction. There's more
>> going on in there.
>
>Not just pushed it one layer lower; the point of fsync is that it's
>been pushed all the way down. See its man page [1]:
>
>"""fsync() transfers ("flushes") all modified in-core data ... to the
>disk ... so that all changed information can be retrieved even after
>the system crashed or was rebooted."""
>
>It's fundamentally about crash recovery, not about "passing it to a
>lower abstraction". Of course, the OS isn't always *able* to guarantee
>things (NFS shares are notoriously hard to pin down), but the
>intention of fsync is that it won't return (and therefore the COMMIT
>operation won't finish) until the data can be read back reliably even
>in the event of a major failure.
>
>Databases protect against that. If you want that protection, use a
>database. If you don't, use a file. There's nothing wrong with either
>option.
>
>ChrisA
>
>[1] on the web here, for those who don't have them handy:
>http://linux.die.net/man/2/fsync

Don't forget that most hard disks have an option to cache write data.
This is a 'feature' that allows the manufacturers to claim better
write performance. You can't be sure when the data is written to the
disk if that option is in play.

Dick
--
https://mail.python.org/mailman/listinfo/python-list
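
For anyone who wants to see the file-side equivalent of the fsync step
discussed above, here is a minimal Python sketch. The helper name
durable_write and the temporary file path are made up for illustration;
they are not from the thread. It only shows the usual flush-then-fsync
sequence, and it assumes nothing about what the drive does afterwards: if
the drive's own write cache is enabled, the data may still be sitting in
that cache when fsync returns.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes to path, then ask the OS to flush them to the device.

    Hypothetical helper for illustration only.
    """
    with open(path, "wb") as f:
        f.write(data)
        # Push Python's userspace buffer down to the OS.
        f.flush()
        # Ask the OS to push its cached pages for this file to the device.
        # Caveat: a drive with write caching enabled may still be holding
        # the data in its own cache at this point.
        os.fsync(f.fileno())


if __name__ == "__main__":
    # Usage example against a throwaway file in a temporary directory.
    with tempfile.TemporaryDirectory() as tmpdir:
        target = os.path.join(tmpdir, "example.dat")
        durable_write(target, b"some record\n")
        with open(target, "rb") as f:
            print(f.read())
```

This is roughly what a database does for you on COMMIT, minus the
write-ahead log and the recovery logic that make it worth using one.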