Mark Mielke wrote
> This returns to the question of whether count of the whole table is useful,
> or whether
> count of a GROUP BY or WHERE is useful.
> If GROUP BY or WHERE is useful, then trigger on UPDATE becomes necessary.
True... for the example I gave I should have had an update trigger on my table.
I neglected it as I based the example of a more complex multi-table example
from my application, where I know that the value I'm grouping by doesn't change.
> What is the direction here? Is it count of the whole table only? (<-- not
> interesting to me)
> Or count of more practical real life examples, which I completely agree with
> Greg,
> that this gets into the materialized view realm, and becomes very interesting.
> In my current db project, I never count all of the rows in a table. However,
> I do use count(*) with GROUP BY and WHERE.
I'm trying to figure out how close I can come to a useful efficient
materialized view with current production builds of postgresql,
and identifying areas where changes to postgresql could make it easier.
Currently I can see three areas of concern:
1) Turning it on
My query to initially populate the materialized view table takes 6 hours on my
(limited hardware) development system, which highlights the problem of when do
you turn on the triggers.
An outage is one way to guarantee that there are neither missing details nor
double counted details.
Would turning on the triggers and then running my initial population query in
the same transaction work?
2) Table bloat
I'm trying to group together a large number of +1 or -1 deltas into a single
delta per transaction.
This creates as many dead rows as there are updates.
This is also a problem with Pavan's suggestion of maintaining a counter table.
They can all be HOT updates in 8.3, but they still all create new row versions.
Tom says "Collapsing a transaction's changes
into a single row would require keeping some intra-transaction state,
which is do-able at the C-code level but not from plpgsql."
So can I hope that this problem is solvable as a contrib module that will work
with at least 8.3?
I'd still want to write plpgsql trigger function myself, as I know the
aggregation rules, and call contributed functions to integrate with the
collapsing of the transaction's changes into a single row. (Expect I'd need a
small number of rows per transaction, as I need to create delta rows for each
value of my grouping field involved in the transaction.
3) How to incorporate the deltas.
With my technique, if the transaction rolls back the delta record becomes a
dead row, if it commits, the delta is then visible to transaction started after
this time.
I need to have a regular process run to sum and remove the deltas, rewriting
the summary rows.
I'd like to be able to write an after-commit trigger that fires after changes
in dependent tables that I identify, which can add the transaction's deltas to
the summary table. I would want it to effectively be a new, small transaction.
I think that if these three areas are addressed, then before even considering
writing code to automatically convert any given view into a materialized view.
2) and 3) could perhaps be implemented with a per transaction map from
my_custom_key to a prepared statement and a list of parameter values.
Provide access to the values, making them updateable within the transaction.
Have the statement automatically executed on commit.
Pavan also refers to deferred triggers, which has got me thinking about another
possible solution:
Instead of inserting a delta row, that will be updated a lot of times, create
an on commit drop temp table named after the txid and the grouping value (which
is an integer in my case, perhaps hash it if you're grouping by something that
doesn't easily convert to part of a table name),
create an after insert initially deferred constraint to call a function which
will move the 'at commit' values of the rows in the temp table to the summary
table.
The temp table will only have one row inserted, updated many times, then on
commit the trigger is fired once, and the temp table is dropped.
Does anyone think this will or won't work for some reason?
Regards,
Stephen Denne.
Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer
focus, and courage. This email with any attachments is confidential and may be
subject to legal privilege. If it is not intended for you please advise by
reply immediately, destroy it and do not copy, disclose or use it in any way.
__________________________________________________________________
This email has been scanned by the DMZGlobal Business Quality
Electronic Messaging Suite.
Please see http://www.dmzglobal.com/dmzmessaging.htm for details.
__________________________________________________________________
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers