On Fri, Jun 21, 2002 at 06:28:59PM +0100, Paul Jakma wrote:
> > Now you're getting a little out of hand.  A journaling filesystem is
> > a piling of one set of warts ontop of another.  Now you've got a
> > situation where even though the filesystem might be 100% consistent
> > even after a catastrophic crash, the database won't be.  There's no
> > need to use a journaling filesystem with PostgreSQL 
> 
> eh? there is great need - this is the only way to guarantee that when 
> postgresql does operations (esp on its own application level logs) 
> that the operation will either:
> 
> - be completely carried out
> or
> - not carried out at all

Journaling filesystems don't provide this guarantee in general,
because the transactional-interface is not provided to userspace. The
only thing the filesystem guarantees is that filesystem-operations are
carried out completely or not at all.

If a non-journaling filesystem crashes while rename() is in progress,
the file may be present in two directories or none (depending on
implementation). If you create a file, write to it and then crash, the
file may be gone from the directory. _These_ are the problems solved by
journaling filesystems.
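
To make the kind of operation concrete, here is a rough sketch (Python,
POSIX assumed) of the usual write-then-rename sequence an application
uses to replace a file. The function name and the ".tmp" suffix are made
up for illustration. The rename step is the metadata operation whose
consistency after a crash is the filesystem's business; the fsync calls
are what the application has to do itself.

```python
import os


def replace_file(path, data):
    """Write data to a temporary file, then rename it over path."""
    tmp = path + ".tmp"  # hypothetical temp-file naming, not a convention
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # force the file contents to disk first
    # os.replace() maps to rename() on POSIX: a single metadata operation.
    os.replace(tmp, path)
    # fsync the containing directory so the rename itself reaches the disk.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Note that this only covers a single file; it says nothing about keeping
several files consistent with each other.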

Luckily postgresql implements its own system (WAL) to get the same
feature ("atomic" updates) on the database-level.
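
As a toy illustration of the write-ahead idea (this is NOT how
postgresql's WAL is actually implemented; the function names and the
JSON-lines record format are my own assumptions), every change is
appended to a log and fsync'ed before it counts as committed, and after
a crash the state is rebuilt from the log, ignoring a half-written last
record:

```python
import json
import os


def log_append(log_path, record):
    """Append one record to the log and force it to disk before returning."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
        log.flush()
        os.fsync(log.fileno())  # the only synchronous write on this path


def replay(log_path):
    """Rebuild the state from the log, stopping at a torn final record."""
    state = {}
    if not os.path.exists(log_path):
        return state
    with open(log_path, "r", encoding="utf-8") as log:
        for line in log:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                # Incomplete record left by a crash: treat it as never
                # committed. A real log would use checksums instead.
                break
            state[record["key"]] = record["value"]
    return state
```

A record is either completely in the log or it is ignored on replay,
which is where the "atomic" behaviour comes from, independent of whether
the filesystem underneath is journaling.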


[ There is actually some work underway to export a transactional
filesystem-API to userspace. When this is completed, an application
could tell the filesystem what operations are part of a transaction,
and have "atomic" updates even to multiple files :-) ]

> > either full mirroring or full level 5 protection).  Indeed there are
> > potentially performance related reasons to avoid journaling
> > filesystems!
> 
> if they're any good they should have better synchronous performance 
> over normal unix fs's. (and synchronous perf. is what a db is 
> interested in).

Synchronous metadata updates (create/rename ++): yes - they should be
faster. But postgresql doesn't do many of those.

Synchronous data-updates (write/append): no - because postgresql
already does the writes to a log, so there are no seeks involved in the
sync writes. (the writes to the actual files happen asynchronously).


So there is a theoretical improvement, but it's not likely to show up on
a typical SQL-benchmark...




-- 
Ragnar Kjørstad
Big Storage
