On Fri, Jun 21, 2002 at 06:28:59PM +0100, Paul Jakma wrote:
> > Now you're getting a little out of hand. A journaling filesystem is
> > a piling of one set of warts ontop of another. Now you've got a
> > situation where even though the filesystem might be 100% consistent
> > even after a catastrophic crash, the database won't be. There's no
> > need to use a journaling filesystem with PostgreSQL
>
> eh? there is great need - this is the only way to guarantee that when
> postgresql does operations (esp on its own application level logs)
> that the operation will either:
>
> - be completely carried out
> or
> - not carried out at all

Journaling filesystems don't provide this guarantee in general, because
the transactional interface is not exposed to userspace. The only thing
the filesystem guarantees is that filesystem operations are carried out
completely or not at all.

If a non-journaling filesystem crashes while rename() is in progress,
the file may be present in two directories or in none (depending on the
implementation). If you create a file, write to it and then crash, the
file may be gone from the directory. _These_ are the problems solved by
journaling filesystems.

Luckily postgresql implements its own system (WAL) to get the same
feature ("atomic" updates) at the database level.

[ There is actually some work underway to export a transactional
  filesystem API to userspace. When this is completed, an application
  could tell the filesystem which operations are part of a transaction,
  and have "atomic" updates even to multiple files :-) ]

> > either full mirroring or full level 5 protection). Indeed there are
> > potentially performance related reasons to avoid journaling
> > filesystems!
>
> if they're any good they should have better synchronous performance
> over normal unix fs's. (and synchronous perf. is what a db is
> interested in).

Synchronous metadata updates (create/rename ++): yes, they should be
faster. But postgresql doesn't do many of those.

Synchronous data updates (write/append): no, because postgresql already
does its synchronous writes to a log, so there are no seeks involved in
them. (The writes to the actual data files happen asynchronously.)

So there is a theoretical improvement, but it's not likely to show up
on a typical SQL benchmark...

-- 
Ragnar Kjørstad
Big Storage
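
To make the WAL point above a bit more concrete, here is a toy sketch in Python of how an application can get all-or-nothing updates from its own log, without any help from a journaling filesystem. This is only an illustration and not how PostgreSQL's WAL is actually laid out: the JSON-lines record format and the names `log_append`, `replay` and `demo` are made up for the example, and a real log would also checksum each record instead of relying on a trailing newline.

```python
import json
import os
import tempfile


def log_append(log_path, record):
    """Append one record to the log and force it to stable storage."""
    line = json.dumps(record) + "\n"
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(line)
        log.flush()
        # The only synchronous write: a sequential append, so no seeks.
        os.fsync(log.fileno())


def replay(log_path):
    """Rebuild state from the log, dropping a torn (incomplete) last record."""
    state = {}
    if not os.path.exists(log_path):
        return state
    with open(log_path, "r", encoding="utf-8") as log:
        for line in log:
            if not line.endswith("\n"):
                break  # crash mid-append: treat the record as never written
            try:
                record = json.loads(line)
            except ValueError:
                break  # unreadable record: stop replaying here
            state[record["key"]] = record["value"]
    return state


def demo():
    """Write two complete records, simulate a crash during a third, recover."""
    with tempfile.TemporaryDirectory() as directory:
        log_path = os.path.join(directory, "toy.wal")  # hypothetical file name
        log_append(log_path, {"key": "a", "value": 1})
        log_append(log_path, {"key": "b", "value": 2})
        # Simulated crash: only part of the third record reached the file.
        with open(log_path, "a", encoding="utf-8") as log:
            log.write('{"key": "a", "val')
        return replay(log_path)
```

Calling `demo()` recovers the two complete records and ignores the half-written third one, which is the "completely carried out or not at all" behaviour at the application level. The data files themselves can then be updated lazily, since anything lost in a crash can be redone from the log.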