> > > > So are whole pages stored in rollback segments or just
> > > > the modified data?
> > >
> > > This is implementation dependent. Storing whole pages is
> > > much easier to do, but obviously it's better to store just
> > > the modified data.
> >
> > I am not sure it is necessarily better. Seems to be a tradeoff
> > > So are whole pages stored in rollback segments or just
> > > the modified data?
> >
> > This is implementation dependent. Storing whole pages is
> > much easier to do, but obviously it's better to store just
> > modified data.
>
> I am not sure it is necessarily better. Seems to be a tradeoff
> > > > > You mean it is restored in session that is running the transaction ?
> > >
> > > Depends on what you mean with restored. It first reads the heap page,
> > > sees that it needs an older version and thus reads it from the "rollback segment".
> >
> > So are whole pages stored in rollback segments or just the modified data?
> > > > You mean it is restored in session that is running the transaction ?
> >
> > Depends on what you mean with restored. It first reads the heap page,
> > sees that it needs an older version and thus reads it from the "rollback segment".
>
> So are whole pages stored in rollback segments or just the modified data?
> > You mean it is restored in session that is running the transaction ?
Depends on what you mean with restored. It first reads the heap page,
sees that it needs an older version and thus reads it from the "rollback segment".
> >
> > I guess that it could be slower than our current way of doing
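The mechanism described above (a reader hits a heap tuple that is too new for its snapshot and follows a back-pointer into the "rollback segment" to find an older version) can be sketched as a toy Python simulation. All names here (`TupleVersion`, `read_tuple`, the `prev` pointer) are illustrative assumptions, not actual PostgreSQL or Oracle structures:

```python
# Toy simulation of Oracle-style version lookup via a "rollback segment".
# Purely illustrative; not real database code.

class TupleVersion:
    def __init__(self, value, xmin, prev=None):
        self.value = value    # tuple data
        self.xmin = xmin      # txn id that created this version
        self.prev = prev      # older version kept in the "rollback segment"

def read_tuple(heap_slot, snapshot_xmax):
    """Walk back through saved versions until one is visible to a
    snapshot that only saw transactions with id < snapshot_xmax."""
    v = heap_slot
    while v is not None and v.xmin >= snapshot_xmax:
        v = v.prev            # follow the back-pointer to an older version
    return v.value if v else None

# txn 10 wrote "a", then txn 20 overwrote it with "b"
old = TupleVersion("a", xmin=10)
cur = TupleVersion("b", xmin=20, prev=old)

print(read_tuple(cur, snapshot_xmax=15))  # snapshot taken before txn 20 -> a
print(read_tuple(cur, snapshot_xmax=25))  # later snapshot sees -> b
```

The tradeoff debated above is what `prev` points at: a whole saved page (simple, but bulky) versus just the modified columns (compact, but more bookkeeping).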
> > >> Impractical ? Oracle does it.
> > >
> > >Oracle has MVCC?
> >
> > With restrictions, yes.
>
> What restrictions? Rollback segments size?
No, that is not the whole story. The problem with their "rollback segment
approach" is that they do not guard against overwriting a tuple version in
> > - A simple typo in psql can currently cause a forced
> > rollback of the entire TX. UNDO should avoid this.
>
> Yes, I forgot to mention this very big advantage, but undo is
> not the only possible way to implement savepoints. Solutions
> using CommandCounter have been discussed.
This would
> - A simple typo in psql can currently cause a forced rollback of the entire
> TX. UNDO should avoid this.
Yes, I forgot to mention this very big advantage, but undo is not the only
possible way to implement savepoints. Solutions using CommandCounter have
been discussed.
Although the pg_log m
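The CommandCounter alternative mentioned above can be sketched as follows: instead of physically undoing changes, rolling back to a savepoint marks every command id issued since the savepoint as cancelled, so those tuples are simply treated as invisible. This is a hypothetical toy model, not the actual PostgreSQL implementation:

```python
# Toy sketch of savepoints via a command counter. All structures are
# illustrative assumptions, not real PostgreSQL internals.

class Tuple:
    def __init__(self, value, cid):
        self.value = value
        self.cid = cid        # command id within the current transaction

class Transaction:
    def __init__(self):
        self.cid = 0
        self.tuples = []
        self.rolled_back = set()   # command ids cancelled by savepoint rollback

    def insert(self, value):
        self.tuples.append(Tuple(value, self.cid))
        self.cid += 1

    def savepoint(self):
        return self.cid

    def rollback_to(self, sp_cid):
        # cancel every command issued since the savepoint; no physical undo
        self.rolled_back.update(range(sp_cid, self.cid))

    def visible(self):
        return [t.value for t in self.tuples if t.cid not in self.rolled_back]

tx = Transaction()
tx.insert("good row")
sp = tx.savepoint()
tx.insert("typo row")          # e.g. the failed psql command
tx.rollback_to(sp)             # only the typo's effects disappear
print(tx.visible())            # -> ['good row']
```

This illustrates why a simple typo need not force a rollback of the entire transaction: only commands after the savepoint are cancelled.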
At 11:25 23/05/01 +0200, Zeugswetter Andreas SB wrote:
>
>> >If community will not like UNDO then I'll probably try to implement
>> >dead space collector which will read log files and so on.
>>
>> I'd vote for UNDO; in terms of usability & friendliness it's a big win.
>
>Could you please try it
> >If community will not like UNDO then I'll probably try to implement
> >dead space collector which will read log files and so on.
>
> I'd vote for UNDO; in terms of usability & friendliness it's a big win.
Could you please be a little more verbose? I am very interested in
the advantage
> > The downside would only be, that long running txn's cannot
> > [easily] rollback to savepoint.
>
> We should implement savepoints for all or none transactions, no?
We should not limit transaction size to online available disk space for WAL.
Imho that is much more important. With guaranteed
> > People also have referred to an overwriting smgr
> > easily. Please tell me how to introduce an overwriting smgr
> > without UNDO.
There is no way. Although undo for an overwriting smgr would involve a
very different approach than with non-overwriting. See Vadim's post about what
info suffic
> If community will not like UNDO then I'll probably try to implement
Imho UNDO would be great under the following circumstances:
1. The undo is only registered for some background work process
and not done in the client's backend (or only if it is a small txn).
2. Th
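Point 1 above (the aborting backend only registers the undo work, and a background process applies it later unless the transaction is small) can be sketched like this. The queue, the heap dictionary, and the size threshold are all hypothetical, for illustration only:

```python
# Toy sketch: deferred undo handled by a background worker.
# Illustrative assumptions throughout; not real PostgreSQL code.

from collections import deque

undo_queue = deque()          # shared work list of aborted transactions
heap = {}                     # page -> value, stands in for table data

def apply_undo(records):
    for page, old_value in records:
        heap[page] = old_value                # restore the before-image

def client_abort(xid, undo_records, small_txn_limit=2):
    """Abort a transaction; do undo inline only if the txn is small."""
    if len(undo_records) <= small_txn_limit:
        apply_undo(undo_records)              # cheap: do it now in the backend
    else:
        undo_queue.append((xid, undo_records))  # expensive: hand off

def background_undo_worker():
    while undo_queue:
        xid, records = undo_queue.popleft()
        apply_undo(records)

heap = {"p1": "new1", "p2": "new2", "p3": "new3"}
client_abort(42, [("p1", "old1"), ("p2", "old2"), ("p3", "old3")])
print(heap["p1"])             # still "new1": undo was deferred
background_undo_worker()
print(heap["p1"])             # now "old1": cleaned up in the background
```

The client's backend returns from the abort immediately; the dead versions linger only until the worker gets to them.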
> > 1. Compact log files after checkpoint (save records of uncommitted
> >transactions and remove/archive others).
>
> On the grounds that undo is not guaranteed anyway (concurrent
> heap access), why not simply forget it,
We can set a flag in ItemData and register a callback function in
buffer
> Todo:
>
> 1. Compact log files after checkpoint (save records of uncommitted
>transactions and remove/archive others).
On the grounds that undo is not guaranteed anyway (concurrent heap access),
why not simply forget it, since the above sounds rather expensive?
The downside would only be, that long running txn's cannot [easily] rollback to savepoint.
> REDO in oracle is done by something known as a 'rollback segment'.
You are not seriously saying that you like the "rollback segments" in Oracle.
They only cause trouble:
1. configuration (for every different workload you need a different config)
2. snapshot too old
3. tx abort because ro
> Correct me if I am wrong, but both cases do present a problem currently
> in 7.1. The WAL log will not remove any WAL files for transactions that
> are still open (even after a checkpoint occurs). Thus if you do a bulk
> insert of gigabyte size you will require a gigabyte sized WAL
> directory.
> As a rule of thumb, online applications that hold open
> transactions during user interaction are considered to be
> Broken By Design (tm). So I'd slap the programmer/design
> team with - let's use the server box since it doesn't contain
> anything useful.
W
> Correct me if I am wrong, but both cases do present a problem
> currently in 7.1. The WAL log will not remove any WAL files
> for transactions that are still open (even after a checkpoint
> occurs). Thus if you do a bulk insert of gigabyte size you will
> require a gigabyte sized WAL directory.
> > Tom: If your ratio of physical pages vs WAL records is so bad, the config
> > should simply be changed to do fewer checkpoints (say every 20 min like a
> > typical Informix setup).
>
> I was using the default configuration. What caused the problem was
> probably not so much the standard 5-
> My point is that we'll need in dynamic cleanup anyway and UNDO is
> what should be implemented for dynamic cleanup of aborted changes.
I do not yet understand why you want to handle aborts differently than outdated
tuples. The ratio in a well tuned system should strongly favor outdated tuples.
If so
Zeugswetter Andreas SB <[EMAIL PROTECTED]> writes:
> Tom: If your ratio of physical pages vs WAL records is so bad, the config
> should simply be changed to do fewer checkpoints (say every 20 min like a
> typical Informix setup).
I was using the default configuration. What caused the problem w
> Would it be possible to split the WAL traffic into two sets of files,
Sure, downside is two fsyncs :-( When I first suggested physical log
I had a separate file in mind, but that is imho only a small issue.
Of course people with more than 3 disks could benefit from a split.
Tom: If your rat
> Really?! Once again: WAL records give you *physical* address of tuples
> (both heap and index ones!) to be removed and size of log to read
> records from is not comparable with size of data files.
So how about a background "vacuum like" process that reads the WAL
and does the cleanup? Seems
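Vadim's point above is that WAL records already carry the physical addresses (tids) of superseded tuple versions, so a background cleaner could work from the log instead of scanning whole data files. A toy sketch, with entirely hypothetical record and snapshot structures:

```python
# Toy sketch of a WAL-driven background cleaner. The record layout and
# xid tracking here are illustrative assumptions, not the real WAL format.

wal = [
    {"op": "update", "old_tid": ("heap", 1), "new_tid": ("heap", 7), "xid": 5},
    {"op": "update", "old_tid": ("heap", 2), "new_tid": ("heap", 8), "xid": 6},
]
committed = {5}               # xid 6 is still running

def wal_cleanup(wal_records, committed_xids):
    """Collect old tuple versions whose replacing txn has committed.
    Each record names the dead version's physical address directly,
    so no scan of the data files is needed."""
    reclaimed = []
    for rec in wal_records:
        if rec["op"] == "update" and rec["xid"] in committed_xids:
            reclaimed.append(rec["old_tid"])
    return reclaimed

print(wal_cleanup(wal, committed))   # -> [('heap', 1)]
```

The version replaced by the still-running xid 6 is left alone; it becomes reclaimable only once that transaction commits.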
> > > Vadim, can you remind me what UNDO is used for?
> > 4. Split pg_log into small files with ability to remove old ones (which
> >do not hold statuses for any running transactions).
and I wrote:
> They are already small (16Mb). Or do you mean even smaller ?
Sorry for the above little confusion.
> > Vadim, can you remind me what UNDO is used for?
> 4. Split pg_log into small files with ability to remove old ones (which
>do not hold statuses for any running transactions).
They are already small (16Mb). Or do you mean even smaller ?
This imposes one huge risk, that is already a pain i
> There was some discussion of doing that, but it fell down on the little
> problem that in normal index-search cases you *don't* know the heap tid
> you are looking for.
I cannot follow here. It does not matter if you don't know a trailing
part of the key when doing a btree search, it only helps
Zeugswetter Andreas SB <[EMAIL PROTECTED]> writes:
> It was my understanding, that the heap xtid is part of the key now,
It is not.
There was some discussion of doing that, but it fell down on the little
problem that in normal index-search cases you *don't* know the heap tid
you are looking for.
> A particular point worth making is that in the common case where you've
> updated the same row N times (without changing its index key), the above
> approach has O(N^2) runtime. The indexscan will find all N index tuples
> matching the key ... only one of which is the one you are looking for o
Zeugswetter Andreas SB <[EMAIL PROTECTED]> writes:
> foreach tuple in heap that can be deleted do:
> foreach index
> call the current "index delete" with constructed key and xtid
See discussion with Hiroshi. This is much more complex than TID-based
delete and would be faster
> > Isn't current implementation "bulk delete" ?
>
> No, the index AM is called separately for each index tuple to be
> deleted; more to the point, the search for deletable index tuples
> should be moved inside the index AM for performance reasons.
Wouldn't a sequential scan on the heap table
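The "bulk delete" idea above (move the search for deletable index tuples inside the index AM, rather than calling it once per dead tuple) amounts to a single sequential pass over the index against a set of dead heap tids. A hedged sketch; `index_bulk_delete` and the entry layout are illustrative, not the actual AM interface:

```python
# Toy sketch of bulk delete inside an index AM: one pass over the index,
# testing each entry's heap tid against the set of deletable tids.
# Illustrative only.

def index_bulk_delete(index_entries, dead_tids):
    """Keep entries whose heap tid is not in the deletable set."""
    dead = set(dead_tids)     # O(1) membership test per index entry
    return [(key, tid) for (key, tid) in index_entries if tid not in dead]

index = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
print(index_bulk_delete(index, [2, 3]))   # -> [('a', 1), ('c', 4)]
```

One scan of the index replaces many key-based searches, which is the performance argument made above.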