> On Tue, 2004-04-13 at 15:11, [EMAIL PROTECTED] wrote:

>> On Tue, 2004-04-13 at 11:21, [EMAIL PROTECTED] wrote:
[more snip]
>> Somehow, I need to detect the
>> end of my central slaves' replication cycle so that I can trigger the
>> "merge processing".

> Its not clear what you mean by 'replication cycle'.

A replication cycle starts when the slave is notified by the master that
there are binlog entries it needs to process and ends when the slave has
finished processing those entries and returns to waiting for more updates
from the master.

>> I could frequently poll the slave servers using SHOW SLAVE STATUS or
SHOW
>> PROCESS LIST and watch for their status to change.
>>
>> I could start each slave with "--log-slave-updates" and watch the
slaves'
>> binlogs for changes.
>>
>> I could watch for the creation/deletion of the slave's relay logs.

> This seems to indicate that you are afraid of selecting rows on the
> slave that are in the middle of being updated from the master.  A single
> update statement is still atomic, so you don't need to poll log files to
> determine if an update statement has finished.

> On the other hand, if there is some set of multiple updates and inserts
> that constitute a collection of data that you want to merge only when
>this collection is complete, you're better off finding a way to signal
> this through the database.  You could have the master lock the tables in
> question until its finished and then the program quering the slave knows
> that when it gets a read lock, its will see the full set of data.  You
> could also have a status column or a status table that has a flag
> letting the program on the slave side know when the data is ready.

> If this is off the mark, maybe some example statements would help...

[more snipping]

Yes, I AM concerned about getting a partial update to the warehouse. I know
that transactions aren't logged until after they commit. If I use
transactional boundaries to post multitable reports (for instance: an
invoice takes two tables, one for the base information and one for the line
items) into the branch masters then they will arrive intact and I won't
corrupt the central slaves. (That's NOT the issue I am worried about!)

I am worried that if I start processing those new records from the slave
database to the Warehouse before all of the records have been processed
from the Relay Logs (lets say I started trying to merge records when I see
the relay log being created), I could miss some data (like the last few
items on the invoice). That's why I am so worried about not merging until
the END of the cycle. I need to be sure that everything has been committed
to my central slave BEFORE I can merge the latest updates with the
warehouse database.

I can lock a slave database so that it won't replicate in the middle of my
merging so I know that if I can catch a slave when it goes back to "sleep"
("Waiting for master to send event"), I would have a complete set of data.
I could use "--log-slave-updates" to copy the updates to the slave's binlog
and check that to see if I need to merge records( if slave status is
"waiting" and the binlog is not "empty" then merge). Each merge could flush
the binlog. However, there was a post from someone using binlogs for
similar purpose that said that for 4.1.x+ the binlogs vary in size after
flushing so I don't know how reliable that would be as a check. How could I
tell when a binlog is empty?




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to