On 13-01-28 06:17 AM, Andres Freund wrote:
Hi,

3.  Pass the delete (with no key values) onto the replication client and let
it deal with it (see 1 and 2)
Hm.


While I agree that nicer behaviour would be good I think the real
enforcement should happen on a higher level, e.g. with event triggers or
somesuch. It seems way too late to do anything about it when we're
already decoding. The transaction will already have committed...

Ideally the first line of enforcement would be with event triggers. The thing with user-level mechanisms for enforcing things is that they sometimes can be disabled or by-passed. I don't have a lot of sympathy for people who do this but I like the idea of at least having the option coding defensively to detect the situation and whine to the user.

How do you plan on dealing with sequences?
I don't see my plugin being called on sequence changes and I don't see
XLOG_SEQ_LOG listed in DecodeRecordIntoReorderBuffer.  Is there a reason why
this can't be easily added?
I basically was hoping for Simon's sequence-am to get in before doing
anything real here. That didn't really happen yet.
I am not sure whether there's a real usecase in decoding normal
XLOG_SEQ_LOG records, their content isn't all that easy to interpet
unless youre rather familiar with pg's innards.

So, adding support wouldn't hard from a technical pov but it seems the
semantics are a bit hard to nail down.

Also what do we want to do about TRUNCATE support.  I could always leave a
TRUNCATE trigger in place that logged the truncate to a sl_truncates and
have my replication daemon respond to the insert on a   sl_truncates table
by actually truncating the data on the replica.
I have planned to add some generic "table_rewrite" handling, but I have
to admit I haven't thought too much about it yet. Currently if somebody
rewrites a table, e.g. with an ALTER ... ADD COLUMN .. DEFAULT .. or
ALTER COLUMN ... USING ..., you will see INSERTs into a temporary
table. That basically seems to be a good thing, but the user needs to be
told about that ;)

I've spent some time this weekend updating my prototype plugin that
generates slony 2.2 style COPY output.  I have attached my progress here
(also https://github.com/ssinger/slony1-engine/tree/logical_repl).  I have
not gotten as far as modifying slon to act as a logical log receiver, or
made a version of the slony apply trigger that would process these
changes.
I only gave it a quick look and have a couple of questions and
remarks. The way you used the options it looks like youre thinking of
specifying all the tables as options? I would have thought those would
get stored & queried locally and only something like the 'replication
set' name or such would be set as an option.

The way slony works today is that the list of tables to pull for a SYNC comes from the subscriber because the subscriber might be behind the provider, where a table has been removed from the set in the meantime. The subscriber still needs to receive data from that table until it is caught up to the point where that removal happens.

Having a time-travelled version of a user table (sl_table) might fix that problem but I haven't yet figured out how that needs to work with cascading (since that is a feature of slony today I can't ignore the problem). I'm also not sure how that will work with table renames. Today if the user renames a table inside of an EXECUTE SCRIPT slony will update the name of the table in sl_table. This type of change wouldn't be visible (yet) in the time-travelled catalog. There might be a solution to this yet but I haven't figured out it. Sticking with what slony does today seemed easier as a first step.

Iterating over a list with
        for(i = 0; i < options->length; i= i + 2 )
        {
                DefElem * def_schema = (DefElem*) list_nth(options,i);
is not a good idea btw, thats quadratic in complexity ;)

Thanks I'll rewrite this to walk a list of  ListCell objects with next.


In the REORDER_BUFFER_CHANGE_UPDATE I suggest using
relation->rd_primary, just as in the DELETE case, that should always
give you a consistent candidate key in an efficient manner.

I haven't looked into the details of what is involved in setting up a
subscription with the snapshot exporting.
That hopefully shouldn't be too hard... At least thats the idea :P

I couldn't get the options on the START REPLICATION command to parse so I
just hard coded some list building code in the init method. I do plan on
pasing the list of tables to replicate from the replica to the plugin
(because this list comes from the replica).   Passing what could be a few
thousand table names as a list of arguments is a bit ugly and I admit my
list processing code is rough.  Does this make us want to reconsider the
format of the option_list ?
Yea, something's screwed up there, sorry. Will push a fix later today.

I guess should provide an opinion on if I think that the patch in this CF,
if committed could be used to act as a source for slony instead of the log
trigger.
The biggest missing piece I mentioned in my email yesterday, that we aren't
logging the old primary key on row UPDATEs.  I don't see building a credible
replication system where you don't allow users to update any column of a
row.
Ok, I really thought this wouldn't be that much of an issue in a first
version, but if you think its important, I'll add support for
it. Shouldn't be too hard.

If your using non-surragate /natural primary keys this tends to come up occasionally due to data-entry errors or renames. I'm looking at this from the point of view of what do I need to use this as a source for a production replication system with fewer sharp-edges compared to trigger source slony. My standard is a bit higher than 'first' version because I intent to use it in the version 3.0 of slony not 1.0. If others feel I'm asking for too much they should speak up, maybe I am. Also the way things will fail if someone were to try and update a primary key value is pretty nasty (it will leave them with inconsistent databases). We could install UPDATE triggers to try and detect this type of thing but I'd rather see us just log the old values so we can use them during replay.




The other issues I've raised (DecodeDelete hiding bad deletes, replication
options not parsing for me) look like easy fixes

  no wal decoding support for sequences or truncate are things that I could
work around by doing things much like slony does today.  The SYNC can still
capture the sequence changes in  a table (where the INSERT's would be
logged) and I can have a trigger capture truncates.
Could you explan a bit what's being done there in slony?

Each time the slon connects to the local database to create a SYNC event, which is when slony captures snapshot visiblity information, it also gets also looks at all of the replicated sequences and finds any that have changed since the last sync The values sequence values as of the last SYNC are stored in memory. Any sequences that have changed get there new values written to the table sl_seqlog. When slon applies row updates for a SYNC it also updates (setval) on any sequences that have changed.


For truncates the truncate trigger just logs a single row into sl_log indicating that the table has been truncated. When slon encounters a row of operation 'TRUNCATE' it executes a TRUNCATE ONLY on the table.




If this patch is going to get bumped to 9.4 I really hope that someone with
good knowledge of the internals (ie a committer) can give this patch a good
review sooner rather than later.  If there are issues Andres has overlooked
that are more serious or complicated to fix I would like to see them raised
before the next CF in June.
Absolutely seconded. I *really* would love to see a more technical
review, its hard to see issues after spending that much time in a
certain worldview...

Thanks!

Andres




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to