Just for the humor of it, as well as to confirm Nick's perspective, years ago on our inhouse developed Burroughs mainframe dbms, we had a process called "garbage collect".

Nicolai Tufar wrote:

I always wandered if VACUUM is the right name for the porcess. Now, when
PostgreSQL
is actively challenging in Enterprise space, it might be a good idea to give
it a more
enterprise-like name. Try to think how it is looking for an outside person
to see
us, database professionals hold lenghty discussions about the ways we
vacuum a database. Why should you need to vacuum a database? Is it
dirty? In my personal opinion, something like "space reclaiming daemon",
"free-list organizer", "tuple recyle job" or "segment coalesce process"
would
sound more business-like .

Regards,
Nick


----- Original Message -----
From: "Bruce Momjian" <[EMAIL PROTECTED]>
To: "Curtis Faith" <[EMAIL PROTECTED]>
Cc: "Tom Lane" <[EMAIL PROTECTED]>; "Ron Johnson" <[EMAIL PROTECTED]>;
"PgSQL Performance ML" <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Tuesday, November 26, 2002 9:09 PM
Subject: Re: [PERFORM] [HACKERS] Realtime VACUUM, was: performance of
insert/delete/update



Good ideas. I think the master solution is to hook the statistics
daemon information into an automatic vacuum that could _know_ which
tables need attention.

--------------------------------------------------------------------------

-

Curtis Faith wrote:

tom lane wrote:

Sure, it's just shuffling the housekeeping work from one place to
another. The thing that I like about Postgres' approach is that we
put the housekeeping in a background task (VACUUM) rather than in the
critical path of foreground transaction commit.

Thinking with my marketing hat on, MVCC would be a much bigger win if

VACUUM

was not required (or was done automagically). The need for periodic

VACUUM

just gives ammunition to the PostgreSQL opponents who can claim we are
deferring work but that it amounts to the same thing.

A fully automatic background VACUUM will significantly reduce but will

not

eliminate this perceived weakness.

However, it always seemed to me there should be some way to reuse the

space

more dynamically and quickly than a background VACUUM thereby reducing

the

percentage of tuples that are expired in heavy update cases. If only a

very

tiny number of tuples on the disk are expired this will reduce the

aggregate

performance/space penalty of MVCC into insignificance for the majority

of

uses.

Couldn't we reuse tuple and index space as soon as there are no

transactions

that depend on the old tuple or index values. I have imagined that this

was

always part of the long-term master plan.

Couldn't we keep a list of dead tuples in shared memory and look in the

list

first when deciding where to place new values for inserts or updates so

we

don't have to rely on VACUUM (even a background one)? If there are

expired

tuple slots in the list these would be used before allocating a new slot

from

the tuple heap.

The only issue is determining the lowest transaction ID for in-process
transactions which seems relatively easy to do (if it's not already done
somewhere).

In the normal shutdown and startup case, a tuple VACUUM could be

performed

automatically. This would normally be very fast since there would not be

many

tuples in the list.

Index slots would be handled differently since these cannot be

substituted

one for another. However, these could be recovered as part of every

index

page update. Pages would be scanned before being written and any expired
slots that had transaction ID's lower than the lowest active slot would

be

removed. This could be done for non-leaf pages as well and would result

in

only reorganizing a page that is already going to be written thereby not
adding much to the overall work.

I don't think that internal pages that contain pointers to values in

nodes

further down the tree that are no longer in the leaf nodes because of

this

partial expired entry elimination will cause a problem since searches

and

scans will still work fine.

Does VACUUM do something that could not be handled in this realtime

manner?

- Curtis



---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster


--
Bruce Momjian | http://candle.pha.pa.us
[EMAIL PROTECTED] | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania

19073

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])





---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Reply via email to