On Sun, Dec 11, 2011 at 1:55 PM, Dimitri Fontaine
<dimi...@2ndquadrant.fr> wrote:
> Andres Freund <and...@anarazel.de> writes:
>> Hm. I just noticed a relatively big hole in the patch: The handling of
>> deletion of dependent objects currently is nonexistant because they don't go
>> through ProcessUtility...
>
> That's the reason why we're talking about “command triggers” rather than
> “DDL triggers”.  We don't intend to fire the triggers at each DDL
> operation happening on the server, but for each command.
>
> This restriction still allows us to provide a very useful feature when
> checked against the main use cases we target here. Those are auditing,
> and replication (the replay will also CASCADEs), and a generic enough
> SUDO facility (because the trigger function can well be SECURITY
> DEFINER).

I haven't yet thought about your specific proposal here in enough to
have a fully-formed opinion, but I am a little nervous that this may
turn out to be one of those cases where the obvious API ends up
working less well than might have been supposed.  For example,
consider the "auditing" use case, and suppose that the user wants to
log a message (to a table, or some other separate logging facility)
every time someone creates an index.  In order to do that, they're
going to need to trap not only CREATE INDEX but also CREATE TABLE and
ALTER CONSTRAINT, and in the latter two cases they're then going to
have to then write some grotty logic to grovel through the statement
or its node tree in order to figure out what indexes are being
created.  Also, if they want to log the name of the new index in cases
where the user doesn't specify it, they're going to have to duplicate
the backend logic which generates the index name, which is going to be
a pain in the ass and a recipe for future bugs (e.g. because we might
change the algorithm at some point, or the trigger might see a
different view of the world than the actual command, due to
intervening DDL).

This is something that has come up a lot in the context of sepgsql:
it's not really enough to know what the toplevel command is in some
cases, because whether or not the operation is allowed depends on
exactly what it's doing, which depends on things like which schema
gets used to create the table, and which tablespaces get used to
create the table and its indexes, and perhaps (given Peter's pending
work) what types are involved.  You can deduce all of these things
from the command text, but it requires duplicating nontrivial amounts
of backend code, which is undesirable from a maintainability point of
view.  I think the same issues are likely to arise in the context of
auditing, as in the example above, and even for replication: if, for
example, the index name that is chosen on the master happens to exist
on the standby with a different definition, replaying the actual
command text might succeed or fail depending on how the command is
phrased.  Maybe it's OK for the slave to just choose a different name
for that index, but now a subsequent DROP INDEX command that gets
replayed across all servers might end up dropping logically unrelated
indexes on different machines.

Now, on the other hand, the conceptual simplicity of this approach is
very appealing, and I can certainly see people who would never
consider writing a ProcessUtility_hook using this type of facility.
However, I'm worried that it will be difficult to use as a foundation
for robust, general-purpose tools.  Is that enough reason to reject
the whole approach?  I'm not sure.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to