On Tue, Nov 16, 2010 at 10:25 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Man, the number of misunderstandings in this thread is staggering.
> Let me try to explain what the proposed feature will and will not do.
>
> 1. The system catalog entries for all tables will be wal-logged.
> So schema (DDL) will survive a crash.  There wouldn't be any way
> to make it not do that, because we can't wal-log only some updates
> to a particular table, and that includes the catalogs in particular.

Gotcha.

> 2. What's proposed as the new feature is that specific non-system
> tables can be marked as unlogged, meaning that WAL entries won't
> be made for changes in those tables' contents (nor their indexes'
> contents).  So we can't guarantee that the contents of such tables
> will be correct or consistent after a crash.  The proposed feature
> deals with this by forcibly truncating all such tables after a crash,
> thus ensuring that they're consistent though not populated.  So the
> possible use-cases for such tables are limited to where (a) you can
> repopulate the tables on demand, or (b) you don't really care about
> losing data on a crash.

I would rather be allowed to decide that for myself.

> 3. There's a lot of wishful thinking here about what constitutes a
> crash.  A backend crash *is* a crash, even if the postmaster keeps
> going.  Data that had been in shared buffers doesn't get written out
> in such a scenario (and if we tried, it might be corrupt anyway).  So
> unlogged tables would be corrupt and in need of truncation after such
> an event.  Obviously, the same goes for an OS-level crash or power
> failure.

Right; just let *me* decide, that's all.

> 4. The last bit of discussion on -hackers concerned what to do in
> the case where the server got shut down cleanly.  If it was shut
> down cleanly, then any data for unlogged tables would have been
> written out from shared buffers ... but did the data make it to disk?
> There's no easy way to know that.
> In the event of an OS crash or power failure shortly after server
> shutdown, it's possible that the unlogged tables would be corrupt.
> So Robert's initial proposal includes truncating unlogged tables at
> any database startup, even if the previous shutdown was clean.  Some
> (including me) are arguing that that is unnecessarily strict; but
> you do have to realize that you're taking some risk with data
> validity if it doesn't do that.

It is too strict; it makes the feature barely more usable than a temp
table. As a DBA, I realize the implications of the feature:

*) b0rked indexes
*) b0rked data
*) not knowing what's good and what's bad
*) bad reports
*) bad BI
etc., etc., etc.

Still, I'd rather be allowed to make the decision here. I think that
having the database try to enforce integrity on something I've marked
as 'corruptible' (via the 'unlogged' flag) would be a constant fight
between me and the system. In the end, I'd just not use the feature.

> The bottom line here is that you really can only use the feature
> for data that you're willing to accept losing on no notice.
> Allowing the data to persist across clean shutdowns would probably
> improve usability a bit, but it's not changing that fundamental fact.

Agreed, and that's fine. IMHO, it improves the usability tenfold.
Having it truncated on server restart is useful for only a fraction
of the use-cases for this feature.

--Scott

>                        regards, tom lane
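The use-case Tom outlines in point 2 (a table you can repopulate on
demand) can be sketched in plain DDL. This is a minimal illustration
assuming the UNLOGGED keyword syntax this proposal became in
PostgreSQL 9.1; the table and column names are hypothetical:

```sql
-- Hypothetical example: a cache table we can repopulate on demand
-- (use-case (a) from point 2 above).  Its rows skip WAL entirely,
-- so its contents are lost (truncated) after any crash, but the
-- table definition itself survives, since catalog entries are
-- always wal-logged (point 1).
CREATE UNLOGGED TABLE session_cache (
    session_id text PRIMARY KEY,
    payload    bytea,
    expires_at timestamptz
);

-- Writes avoid WAL overhead; a crash truncates the contents:
INSERT INTO session_cache
VALUES ('abc123', '\x00', now() + interval '1 hour');
```

Note that the primary-key index on session_id is unlogged too, which
is why Tom mentions indexes alongside table contents in point 2.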