On 6/10/2004 10:37 AM, Shridhar Daithankar wrote:

[EMAIL PROTECTED] wrote:
The session table is a different issue, but has the same problems. You
have an active website, hundreds or thousands of hits a second, and you
want to manage sessions for this site. Sessions are created, updated many
times, and deleted. Performance degrades steadily until a vacuum. Vacuum
has to be run VERY frequently. Prior to lazy vacuum, this was impossible.
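The degradation described above can be sketched with a toy model (illustrative numbers only, not PostgreSQL internals): under MVCC every UPDATE leaves the old row version behind as a dead tuple, so an update-heavy session table bloats steadily between vacuums.

```python
# Hypothetical sketch: dead tuple accumulation in an update-heavy table.
# Each UPDATE leaves one dead row version behind; lazy vacuum reclaims them.

def simulate(updates_per_cycle, cycles_between_vacuums, cycles):
    live = 1000          # assumed count of active sessions (illustrative)
    dead = 0             # dead row versions awaiting vacuum
    peak = 0             # worst-case bloat, in dead versions
    for c in range(cycles):
        dead += updates_per_cycle          # every update creates a dead tuple
        peak = max(peak, dead)
        if (c + 1) % cycles_between_vacuums == 0:
            dead = 0                       # lazy vacuum reclaims dead space
    return peak, live

# Vacuuming 10x more often caps the bloat at 1/10th the size.
print(simulate(500, 100, 1000))  # → (50000, 1000)
print(simulate(500, 10, 1000))   # → (5000, 1000)
```

With 1000 live sessions, the table's dead-to-live ratio reaches 50:1 between infrequent vacuums, which is why vacuum has to be run very frequently on such tables.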

Both session tables and summary tables have another thing in common: they
are not vital data, they hold transient state information. Yea, sure,
data integrity is important, but if you lose these values, you can either
recreate them or they weren't that important to begin with.

Why put that in a database at all? Because, in the case of sessions
especially, you need to access this information for other operations. In
the case of summary tables, OLAP usually needs to join or include this
info.

PostgreSQL's behavior on these cases is poor. I don't think anyone who has
tried to use PG for this sort of thing will disagree, and yes it is
getting better. Does anyone else consider this to be a problem? If so, I'm
open for suggestions on what can be done. I've suggested a number of
things, and admittedly they have all been pretty weak ideas, but they were
potentially workable.

There is another approach that is, as of now, not feasible and hence has been rejected. Vacuum in PostgreSQL is tied to entire relations/objects, since indexes do not have transaction visibility information.


It has been suggested in the past to add such visibility information to the index tuple header, so that indexes and heaps can be cleaned out of order. In that case other background processes, such as the background writer and the soon-to-be integrated autovacuum daemon, could vacuum pages/buffers rather than whole relations. That way the most heavily used things would stay clean, and the cost of cleanup would remain outside the critical transaction processing path.

This is not feasible because at the time you update or delete a row you would have to visit all of its index entries. The performance impact of that would be immense.
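A back-of-the-envelope sketch of that cost (assumed numbers, not PostgreSQL internals): if index tuples carried visibility information, every UPDATE or DELETE would have to locate and rewrite the matching entry in every index on the table, rather than touching only the heap tuple.

```python
# Illustrative cost model: writes required per UPDATE/DELETE, with and
# without visibility information stored in the index tuple headers.

def writes_per_update(num_indexes, visibility_in_indexes):
    heap_writes = 1                        # the heap tuple header is always touched
    if visibility_in_indexes:
        # One index descent plus one entry rewrite per index on the table.
        return heap_writes + num_indexes
    return heap_writes                     # today: indexes untouched on DELETE

# A table with 5 indexes: 6 writes per update instead of 1.
print(writes_per_update(5, True))   # → 6
print(writes_per_update(5, False))  # → 1
```

And each of those extra index writes requires a full tree descent to find the entry, so the real penalty is worse than the raw write count suggests.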


But a per-relation bitmap that tells whether a block is a) free of dead tuples and b) has all its remaining tuples frozen could be used to let vacuum skip such blocks (there can't be anything to do in them). The bit would get reset whenever the block is marked dirty. This would cause vacuum to look mainly at recently touched blocks, which are likely to be found in the buffer cache anyway, and thus dramatically reduce the amount of I/O, thereby making highly frequent vacuuming less expensive.
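A minimal sketch of that bitmap (an assumed shape, not actual PostgreSQL code): one bit per heap block, set once the block is known clean, cleared whenever the block is dirtied, and consulted by vacuum to skip blocks with nothing to do.

```python
# Sketch of a per-relation "all clean" bitmap: vacuum scans only blocks
# whose bit is cleared, i.e. blocks dirtied since the last vacuum.

class Relation:
    def __init__(self, nblocks):
        # One bit per heap block; False means "may contain dead tuples".
        self.all_clean = [False] * nblocks

    def dirty_block(self, blkno):
        # Any write may create dead tuples, so the block's bit is reset.
        self.all_clean[blkno] = False

    def vacuum(self):
        # Visit only blocks that may need work; skip the rest entirely.
        scanned = [b for b, clean in enumerate(self.all_clean) if not clean]
        for b in scanned:
            # Dead tuples removed, remaining tuples frozen: mark clean.
            self.all_clean[b] = True
        return scanned

rel = Relation(8)
rel.vacuum()            # first pass must scan all 8 blocks
rel.dirty_block(2)
rel.dirty_block(5)
print(rel.vacuum())     # second pass scans only the dirtied blocks → [2, 5]
```

Since recently dirtied blocks are the ones likely still in the buffer cache, a vacuum driven by this bitmap would do almost no extra I/O, which is the point of the proposal.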


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #


---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html
