On 04/02/2009 03:32 PM, Tom Lane wrote:
Tony Cebzanov <tony...@andrew.cmu.edu> writes:What I want to do is update the assoc_count field in the dataset table to reflect the count of related records in the assoc field. To do so, I added the following trigger:CREATE OR REPLACE FUNCTION update_assoc_count_insert() RETURNS TRIGGER AS ' BEGIN UPDATE dataset SET assoc_count = assoc_count + 1 WHERE dataset_id = NEW.dataset_id; RETURN NEW; END ' LANGUAGE plpgsql;CREATE TRIGGER assoc_update_assoc_count_insert AFTER INSERT ON assoc FOR EACH ROW EXECUTE PROCEDURE update_assoc_count_insert();There is basically no way that this is going to not suck :-(. In the first place, using an AFTER trigger means that each update queues an AFTER trigger update event that has to be fired at statement or transaction end. In the second place (as Craig correctly noted) this results in a separate update to the count-table row for each inserted row, which tremendously bloats the count table with dead tuples. In the third place, if you have any concurrency of insertions, it disappears because all the inserters need to update the same count row.If you dig in the pgsql-hackers archives, you will find that the original scheme for this was to have each transaction accumulate its total number of insertions minus deletions for a table in local memory, and then insert *one* delta row into the count table just before transaction commit. I don't think it's possible to do that with just user-level triggers (not least because we haven't got ON COMMIT triggers); it would have to be a C-code addition. The various blog entries you cite are non-peer-reviewed oversimplifications of that design. Digging around, the oldest description I can find of this idea is http://archives.postgresql.org/pgsql-hackers/2003-09/msg00387.php although there are more recent threads rehashing the topic. One point I don't recall anyone mentioning is that the stats subsystem now implements a fairly large subset of this work already, namely the initial data accumulation. So you could imagine plugging something into that to send the deltas to a table in addition to the stats collector. regards, tom lane So, basically other than reading from pg_class table about the tuple count, there isn't a good way to optimize the COUNT(*)? Thanks Wei |
- [SQL] Performance problem with row count trigger Tony Cebzanov
- Re: [SQL] Performance problem with row count trig... Craig Ringer
- Re: [SQL] Performance problem with row count ... Tony Cebzanov
- Re: [SQL] Performance problem with row count trig... Tom Lane
- Re: [SQL] Performance problem with row count ... Wei Weng
- Re: [SQL] Performance problem with row count trig... Greg Sabino Mullane
- Re: [SQL] Performance problem with row count ... Tony Cebzanov
- Re: [SQL] Performance problem with row co... Greg Sabino Mullane