RE: Parallel Replication of Single Table

Larry Elkins Tue, 14 Jan 2003 17:47:48 -0800

Waleed,

Thanks for chiming in.


Regarding refreshing multiple tables in parallel, well yes, that could be
done on a group level. Instead, though, they choose to have the scheduling
tool kick off multiple single table refreshes. They start many at the same
time, so they do in essence get parallelism of multiple tables at one time.
And this way they get paged with a specific should something fail.

Regarding the other comments, the replication logic seems to handle the
dependency you are talking about. There really isn't any "order" per se, so
the same issues you raise would apply to serial processing as well, right?
All those rows are in the MLOG$ table with a 1/1/4000 date -- in what order
did they occur?

Obviously an insert has to occur before the update to occur (with
xceptions  -- e.g. update, delete, then insert same PK row, etc). And we
don't care how many updates have occurred since we are only interested in
pushing the current image of the row.

Now, what about deletes? Let's say you see deletes and inserts in the MLOG$
table. Well, they will all have the 1/1/4000 date prior to kicking off. How
do we know which occurred first -- did we delete an existing row and then
re-insert? Or did we insert and then delete? Well, that's handled by joining
to the core table itself -- if the row exists, then you are going to push
it, the insert would have happened after the delete. If it doesn't exist,
then you know the delete happened after the insert. So the deletes are
processed first using the existence in the base table test.

With tracing turned on you can see how the replication logic handles this:

Step 1 -- get the != 'I' where it's not in the existing table -- e.g. we
know the last action was a delete, not an insert or update since the row no
longer exists:

SELECT /*+ remote_mapped(link) */ DISTINCT
  LOG$."OBJECT_ID"
FROM
 (SELECT MLOG$."OBJECT_ID" FROM
  "SCHEMA"."MLOG$_MY_PTEST"@link MLOG$ WHERE
  "SNAPTIME$$" > :1 AND ("DMLTYPE$$" != 'I')) LOG$ WHERE (LOG$."OBJECT_ID")
  NOT IN (SELECT MAS_TAB$."OBJECT_ID" FROM
  "SCHEMA"."MY_PTEST"@LINK "MAS_TAB$" WHERE
  LOG$."OBJECT_ID" = MAS_TAB$."OBJECT_ID")

Step 2 -- process those deletes:

DELETE FROM "SCHEMA"."MV_MY_PTEST" SNAP$
WHERE
 "OBJECT_ID" = :1

Step 3 -- Look for the != 'D' where it *exists* in the table. This will be
the I's and U's. And because you handled the true deletes (as in no longer
exists) in the prior step, then you know these go:

SELECT /*+ remote_mapped(link) */ CURRENT$."OWNER",
  CURRENT$."OBJECT_NAME",CURRENT$."SUBOBJECT_NAME",CURRENT$."OBJECT_ID",
  CURRENT$."DATA_OBJECT_ID",CURRENT$."OBJECT_TYPE",CURRENT$."CREATED",
  CURRENT$."LAST_DDL_TIME",CURRENT$."TIMESTAMP",CURRENT$."STATUS",
  CURRENT$."TEMPORARY",CURRENT$."GENERATED",CURRENT$."SECONDARY"
FROM
 (SELECT /*+ */ "MY_PTEST"."OWNER" "OWNER","MY_PTEST"."OBJECT_NAME"
  "OBJECT_NAME","MY_PTEST"."SUBOBJECT_NAME" "SUBOBJECT_NAME",
  "MY_PTEST"."OBJECT_ID" "OBJECT_ID","MY_PTEST"."DATA_OBJECT_ID"
  "DATA_OBJECT_ID","MY_PTEST"."OBJECT_TYPE" "OBJECT_TYPE",
  "MY_PTEST"."CREATED" "CREATED","MY_PTEST"."LAST_DDL_TIME" "LAST_DDL_TIME",
  "MY_PTEST"."TIMESTAMP" "TIMESTAMP","MY_PTEST"."STATUS" "STATUS",
  "MY_PTEST"."TEMPORARY" "TEMPORARY","MY_PTEST"."GENERATED" "GENERATED",
  "MY_PTEST"."SECONDARY" "SECONDARY" FROM  "SCHEMA"."MY_PTEST"@LINK
"MY_PTEST") CURRENT$,
  (SELECT DISTINCT MLOG$."OBJECT_ID" FROM  "SCHEMA"."MLOG$_MY_PTEST"@LINK
MLOG$ WHERE
  "SNAPTIME$$" > :1 AND ("DMLTYPE$$" != 'D')) LOG$ WHERE
CURRENT$."OBJECT_ID"  = LOG$."OBJECT_ID"

Steps 4 and 5 (intermixed) -- process the inserts/updates, if you update
before you insert, no big deal since you want the current image and the
insert will handle it and no row to update will not fail. And if you try to
insert before you update, no big deal since you will still be pushing the
current image. So this goes:

UPDATE "SCHEMA"."MV_MY_PTEST" SET "OWNER" = :1,"OBJECT_NAME" = :2,
  "SUBOBJECT_NAME" = :3,"OBJECT_ID" = :4,"DATA_OBJECT_ID"= :5,"OBJECT_TYPE"
=
   :6,"CREATED" = :7,"LAST_DDL_TIME" = :8,"TIMESTAMP" = :9,"STATUS" = :10,
  "TEMPORARY" = :11,"GENERATED" = :12,"SECONDARY" = :13
WHERE "OBJECT_ID" = :4

INSERT INTO "SCHEMA"."MV_MY_PTEST"
("OWNER","OBJECT_NAME","SUBOBJECT_NAME",
  "OBJECT_ID","DATA_OBJECT_ID","OBJECT_TYPE","CREATED","LAST_DDL_TIME",
  "TIMESTAMP","STATUS","TEMPORARY","GENERATED","SECONDARY")
VALUES
 (:1,:2,:3,:4,:5,:6,:7,:8,:9,:10,:11,:12,:13)

Now I may have missed a few things there, but it sounds like it handles the
dependencies. And what code and action may happen that doesn't manifest
itself as SQL -- you see analyzes computing number of distinct, max, min,
etc. But you don't see clustering and density factor calculations in terms
of SQL. I need to do some more work on this, and the Advanced Replication
guide does talk about dependencies and their impact on whether or not some
things can parallelized, but I've got to dig a bit more into that. But still
focusing on basic replication. Now the fun part is doing the replication in
the *same* DB. I could get parallelism on the SELECT's, or at least the
trace file said so, but they weren't observed, nor were the
delete/updates/inserts observed working in parallel. So I still need to do
some more digging and learning there. Anyway, most of the replication I've
done has been lower volume (with custom code for high volumes). I'd like to
get away from custom code for higher volumes, if possible. You know, things
like transportable tablespaces are an opportunity, but really more
appropriate for large actions -- low volume replication could be easier. Oh
well, no more rambling.

Regards,

Larry G. Elkins
[EMAIL PROTECTED]
214.954.1781

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of Khedr,
> Waleed
> Sent: Tuesday, January 14, 2003 1:34 PM
> To: Multiple recipients of list ORACLE-L
> Subject: RE: Parallel Replication of Single Table
>
>
> Hi Larry,
>
> I do not know really how you want to parallelize this kind of operations!
>
> The final goal is to push the changes in site 1 to site 2 and get the data
> in both tables in sync.
>
> Changes include inserts, updates, and deletes. These changes (dml
> operations)
>  could be cascaded on the same row which means that the final image of the
> data in the table is completely dependent
> on the order and sequence of these dml operations.
>
> This is why I think that a single table refresh requires it to be a serial
> operation.
>
> But on the other hand if a single refresh was requested for many tables,
> then this could be parallelized on the job level not the table level.
>
> Does it make sense or am I missing something?
>
> Regards,
>
> Waleed

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of Khedr,
> Waleed
> Sent: Tuesday, January 14, 2003 1:34 PM
> To: Multiple recipients of list ORACLE-L
> Subject: RE: Parallel Replication of Single Table
>
>
> Hi Larry,
>
> I do not know really how you want to parallelize this kind of operations!
>
> The final goal is to push the changes in site 1 to site 2 and get the data
> in both tables in sync.
>
> Changes include inserts, updates, and deletes. These changes (dml
> operations)
>  could be cascaded on the same row which means that the final image of the
> data in the table is completely dependent
> on the order and sequence of these dml operations.
>
> This is why I think that a single table refresh requires it to be a serial
> operation.
>
> But on the other hand if a single refresh was requested for many tables,
> then this could be parallelized on the job level not the table level.
>
> Does it make sense or am I missing something?
>
> Regards,
>
> Waleed

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: Larry Elkins
  INET: [EMAIL PROTECTED]

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

RE: Parallel Replication of Single Table

Reply via email to