[jira] Commented: (DERBY-3254) Implement the replication failover functionality

2008-01-22 Thread V.Narayanan (JIRA)

[ 
https://issues.apache.org/jira/browse/DERBY-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12561310#action_12561310
 ] 

V.Narayanan commented on DERBY-3254:


I ran the tests on this patch and had a clean run of the junit all suite. 

 Implement the replication failover functionality
 

 Key: DERBY-3254
 URL: https://issues.apache.org/jira/browse/DERBY-3254
 Project: Derby
  Issue Type: Sub-task
  Components: Replication
Reporter: V.Narayanan
Assignee: V.Narayanan
 Attachments: failover_impl_notforcommit.diff, 
 failover_impl_notforcommit.stat, failover_impl_v1.diff, 
 failover_impl_v1.stat, failover_impl_v2.diff, failover_impl_v2.stat, 
 failover_impl_v3.diff, failover_impl_v3.stat, failover_impl_v4.diff, 
 failover_impl_v4.stat, failover_impl_v5.diff, failover_impl_v5.stat, 
 failover_impl_v6.diff, failover_impl_v6.stat




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (DERBY-3254) Implement the replication failover functionality

2008-01-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/DERBY-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12560983#action_12560983
 ] 

Øystein Grøvlen commented on DERBY-3254:


I do not understand the change to MasterController#startFailover.  It seems 
like handleFailoverFailure will be called in all cases now.  Also, exceptions 
thrown by handleFailoverFailure when it is called from the try block will be 
caught and passed to handleFailoverFailure again by the catch block.  That 
seems unnecessary.  I think the whole handling of the ack, as it was in v4 of 
the patch, should be moved outside the try block.
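
The control-flow concern can be illustrated with a small self-contained sketch. Nothing below is taken from the patch: the class FailoverFlowSketch, the simulated sendFailoverAndWaitForAck call, and the call counter are hypothetical stand-ins. The sketch only contrasts handling the ack inside the try block, where an exception thrown by the failure handler is caught and handed to the handler a second time, with handling it after the try block.

```java
import java.io.IOException;

/** Hypothetical stand-in for the master-side failover logic; not Derby code. */
class FailoverFlowSketch {
    /** Counts how often the failure handler runs. */
    int handlerCalls = 0;

    /** Simulated network call: returns whether the slave acknowledged. */
    boolean sendFailoverAndWaitForAck(boolean ackFromSlave) throws IOException {
        return ackFromSlave;
    }

    /** Failure handler that reports the failure by throwing. */
    void handleFailoverFailure(Exception cause) {
        handlerCalls++;
        throw new IllegalStateException("failover unsuccessful", cause);
    }

    /** Ack handled inside the try block: the handler's own exception is caught again. */
    void startFailoverAckInsideTry(boolean ackFromSlave) {
        try {
            if (!sendFailoverAndWaitForAck(ackFromSlave)) {
                handleFailoverFailure(null);
            }
        } catch (Exception e) {
            handleFailoverFailure(e); // also catches the exception thrown just above
        }
    }

    /** Ack handled after the try block: the handler runs at most once. */
    void startFailoverAckOutsideTry(boolean ackFromSlave) {
        boolean ack = false;
        Exception failure = null;
        try {
            ack = sendFailoverAndWaitForAck(ackFromSlave);
        } catch (IOException e) {
            failure = e; // only communication errors are caught here
        }
        if (!ack) {
            handleFailoverFailure(failure);
        }
    }

    public static void main(String[] args) {
        FailoverFlowSketch inside = new FailoverFlowSketch();
        try { inside.startFailoverAckInsideTry(false); } catch (IllegalStateException expected) { }
        FailoverFlowSketch outside = new FailoverFlowSketch();
        try { outside.startFailoverAckOutsideTry(false); } catch (IllegalStateException expected) { }
        System.out.println("handler calls, ack inside try: " + inside.handlerCalls
                + ", ack outside try: " + outside.handlerCalls);
    }
}
```

In this sketch the failure handler runs twice when the ack is checked inside the try block and once when it is checked outside, which is the redundancy the comment describes.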


 Attachments: failover_impl_notforcommit.diff, 
 failover_impl_notforcommit.stat, failover_impl_v1.diff, 
 failover_impl_v1.stat, failover_impl_v2.diff, failover_impl_v2.stat, 
 failover_impl_v3.diff, failover_impl_v3.stat, failover_impl_v4.diff, 
 failover_impl_v4.stat, failover_impl_v5.diff, failover_impl_v5.stat







[jira] Commented: (DERBY-3254) Implement the replication failover functionality

2008-01-20 Thread V.Narayanan (JIRA)

[ 
https://issues.apache.org/jira/browse/DERBY-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12560787#action_12560787
 ] 

V.Narayanan commented on DERBY-3254:


1) The exception thrown upon successful failover in 
MasterController#startFailover needs to be moved outside the try/catch block.
2) If failover is successful, AsynchronousLogShipper#stopLogShipment needs to 
be called to terminate the log shipper thread.
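
As a rough illustration of point 2, a background shipping thread can be stopped with a flag plus an interrupt. This is only a sketch under assumptions: LogShipperSketch is a hypothetical class, the sleep stands in for the actual shipping work, and the real AsynchronousLogShipper may be structured quite differently.

```java
/** Hypothetical log shipper thread; the real AsynchronousLogShipper is not shown in this thread. */
class LogShipperSketch extends Thread {
    /** Set by stopLogShipment(); volatile so the shipping loop sees the change. */
    private volatile boolean stopShipping = false;

    @Override
    public void run() {
        while (!stopShipping) {
            try {
                // Placeholder for "ship the next chunk of log to the slave".
                Thread.sleep(10);
            } catch (InterruptedException e) {
                // Woken up by stopLogShipment(); the loop condition decides whether to exit.
            }
        }
    }

    /** Asks the shipping loop to exit and wakes the thread if it is sleeping. */
    void stopLogShipment() {
        stopShipping = true;
        interrupt();
    }

    public static void main(String[] args) throws InterruptedException {
        LogShipperSketch shipper = new LogShipperSketch();
        shipper.start();
        shipper.stopLogShipment(); // would be called once failover has succeeded
        shipper.join(1000);
        System.out.println("shipper alive: " + shipper.isAlive());
    }
}
```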

 Attachments: failover_impl_notforcommit.diff, 
 failover_impl_notforcommit.stat, failover_impl_v1.diff, 
 failover_impl_v1.stat, failover_impl_v2.diff, failover_impl_v2.stat, 
 failover_impl_v3.diff, failover_impl_v3.stat, failover_impl_v4.diff, 
 failover_impl_v4.stat







[jira] Commented: (DERBY-3254) Implement the replication failover functionality

2008-01-18 Thread JIRA

[ 
https://issues.apache.org/jira/browse/DERBY-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12560520#action_12560520
 ] 

Øystein Grøvlen commented on DERBY-3254:


With the latest patch, failover_impl_v4.diff, I get an error in the error 
code test. 
It is probably related to the fact that a database severity error has been added.

While fixing this, here are some minor issues that should also be addressed:

 - Update javadoc of MasterFactory/MasterController#startFailover to indicate 
that it will throw an exception also on success.
 - Some unnecessary imports (Property, SQLException)
 - The text of the javadoc for LogToFile#stopReplicationSlaveRole could still 
be improved.

 Attachments: failover_impl_notforcommit.diff, 
 failover_impl_notforcommit.stat, failover_impl_v1.diff, 
 failover_impl_v1.stat, failover_impl_v2.diff, failover_impl_v2.stat, 
 failover_impl_v3.diff, failover_impl_v3.stat, failover_impl_v4.diff, 
 failover_impl_v4.stat







[jira] Commented: (DERBY-3254) Implement the replication failover functionality

2008-01-16 Thread JIRA

[ 
https://issues.apache.org/jira/browse/DERBY-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12559458#action_12559458
 ] 

Øystein Grøvlen commented on DERBY-3254:


Thanks for the patch, Narayanan.  Here are my comments:

1. Instead of using Database#freeze, I think you should use
   RawStoreFactory#freeze since MasterController relates to the store,
   and not the SQL layer.  This also removes the need for importing
   SQLException.

2. Maybe I am wrong, but it seems to me that you are shutting down the
   entire system.  At least, you do not specify which database to shut
   down.  Instead of an explicit shutdown, I think you should consider
   just using database severity for the exception you throw.  I think
   that will make the connection close down the database
   automatically.

3. MasterController#handleFailoverFailure:  Don't you mean to use
   REPLICATION_FAILOVER_UNSUCCESSFUL also for the else part?

4. Some of my comments on the previous version of the patch do not
   seem to have been addressed.
 


 Attachments: failover_impl_notforcommit.diff, 
 failover_impl_notforcommit.stat, failover_impl_v1.diff, 
 failover_impl_v1.stat, failover_impl_v2.diff, failover_impl_v2.stat, 
 failover_impl_v3.diff, failover_impl_v3.stat







[jira] Commented: (DERBY-3254) Implement the replication failover functionality

2008-01-11 Thread V.Narayanan (JIRA)

[ 
https://issues.apache.org/jira/browse/DERBY-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12558198#action_12558198
 ] 

V.Narayanan commented on DERBY-3254:


Before starting on the patch, there are a few things I thought I should 
spell out.

I had earlier decided on the following set of steps to implement failover:

* The failover command is given to the master.
* The master flushes the log buffer.
* The master sends this command to the slave and waits for a response.
* The slave responds with an acknowledgement.
* The master stops replication.

A few refinements to these steps become necessary because of the following 
issues:

1) When the master stops replication, is it necessary for it to shut down the 
database?

I believe the answer is YES, because there is no point in having the master 
serve clients when the slave is doing likewise for the same database. Having 
two databases serving clients would create trouble for the users.

2) In the aforementioned steps there is a window in which the master has 
stopped its master operation (without shutting down the database), has sent 
the failover command to the slave, the failover has not succeeded, and the 
master has to restart its master operation.

Stopping the master flushes the log buffer and stops the log buffer from 
buffering more records.

But this does not stop the clients from being served. So the next time you 
start replication, the master and the slave would be inconsistent. 

Therefore we would need to stop clients in some way before flushing the log 
buffer.

The above two issues lead to the following refinements to the steps mentioned 
earlier:

* The failover command is given to the master.
* We stop the clients upon receiving this command.
* The master flushes the log buffer.
* The master sends the failover command to the slave and waits for
a response.
* The slave responds with an acknowledgement.
* The master stops replication and shuts down the database.

In the event of a failure, the master would resume serving clients.
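
The refined sequence can be sketched as follows. This is a minimal illustration, not the patch: RefinedFailoverSketch is a hypothetical class, each step is only recorded as a string rather than performed, and whether the slave acknowledges is passed in as a flag.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical master-side sketch of the refined failover sequence; not Derby code. */
class RefinedFailoverSketch {
    /** Records each step so the order can be inspected. */
    final List<String> steps = new ArrayList<>();
    /** Simulates whether the slave acknowledges the failover command. */
    private final boolean slaveAcknowledges;

    RefinedFailoverSketch(boolean slaveAcknowledges) {
        this.slaveAcknowledges = slaveAcknowledges;
    }

    /** Returns true if failover completed, false if the master resumed serving clients. */
    boolean failover() {
        steps.add("stop clients");                   // no new log records from here on
        steps.add("flush log buffer");               // slave receives everything written so far
        steps.add("send failover command to slave"); // then wait for the response
        if (slaveAcknowledges) {
            steps.add("stop replication");
            steps.add("shut down database");         // only the former slave serves clients now
            return true;
        }
        steps.add("resume serving clients");         // failure: the master carries on
        return false;
    }

    public static void main(String[] args) {
        RefinedFailoverSketch success = new RefinedFailoverSketch(true);
        System.out.println(success.failover() + " " + success.steps);
        RefinedFailoverSketch failure = new RefinedFailoverSketch(false);
        System.out.println(failure.failover() + " " + failure.steps);
    }
}
```

Stopping the clients before the flush is what closes the window described in issue 2: nothing can be written between the flush and the outcome of the failover attempt.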

 Attachments: failover_impl_notforcommit.diff, 
 failover_impl_notforcommit.stat, failover_impl_v1.diff, failover_impl_v1.stat







[jira] Commented: (DERBY-3254) Implement the replication failover functionality

2008-01-10 Thread V.Narayanan (JIRA)

[ 
https://issues.apache.org/jira/browse/DERBY-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12557915#action_12557915
 ] 

V.Narayanan commented on DERBY-3254:


I will change this patch to do the following:

1) The master initiates failover by sending a failover message to the slave 
   and waits for the acknowledgement from the slave. (The slave will send an 
   acknowledgement if its attempt to fail over succeeds.)
2) If the acknowledgement is received, the master proceeds with failover.
3) Otherwise it continues as master without doing anything.
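
A minimal sketch of the acknowledgement wait, under stated assumptions: AckWaitSketch and its Outcome enum are hypothetical, the slave's reply is modelled as a CompletableFuture, and the timeout is an assumption of this sketch (the comment above does not say how long the master waits).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch of the master waiting for the slave's acknowledgement; not Derby code. */
class AckWaitSketch {
    enum Outcome { FAILED_OVER, STILL_MASTER }

    /**
     * Waits for the slave's reply. Any outcome other than a positive
     * acknowledgement leaves the master in its current role.
     */
    static Outcome initiateFailover(CompletableFuture<Boolean> slaveAck, long timeoutMillis) {
        try {
            boolean acked = slaveAck.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return acked ? Outcome.FAILED_OVER : Outcome.STILL_MASTER;
        } catch (TimeoutException | ExecutionException e) {
            return Outcome.STILL_MASTER; // no reply, or the reply could not be obtained
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return Outcome.STILL_MASTER;
        }
    }

    public static void main(String[] args) {
        // Slave acknowledges: the master proceeds with failover.
        System.out.println(initiateFailover(CompletableFuture.completedFuture(true), 100));
        // Slave never replies: the master continues as master.
        System.out.println(initiateFailover(new CompletableFuture<>(), 100));
    }
}
```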

 Attachments: failover_impl_notforcommit.diff, 
 failover_impl_notforcommit.stat, failover_impl_v1.diff, failover_impl_v1.stat







[jira] Commented: (DERBY-3254) Implement the replication failover functionality

2008-01-10 Thread V.Narayanan (JIRA)

[ 
https://issues.apache.org/jira/browse/DERBY-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12557916#action_12557916
 ] 

V.Narayanan commented on DERBY-3254:


I am going to remove the stopSlave method implementation I had added here and
move it to the stop issue, which needs to be reopened to address the comments
there. The slave issue seemed to me the better context in which to address this.

 Attachments: failover_impl_notforcommit.diff, 
 failover_impl_notforcommit.stat, failover_impl_v1.diff, failover_impl_v1.stat







[jira] Commented: (DERBY-3254) Implement the replication failover functionality

2008-01-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/DERBY-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1221#action_1221
 ] 

Øystein Grøvlen commented on DERBY-3254:


Thanks for the patch, Narayanan.  My main question about this patch is
about failure handling during failover.  If failover is not
successful, I think the current master should continue as master.
Also, I am not sure that just being able to send the failover message
is sufficient to decide that failover was successful.  Maybe some
acknowledgement from the slave is needed?

As it is, the implementation of stop and failover is identical at the
slave.  I guess it is the implementation of stop that is missing
something?

Some minor issues:

  - LogToFile#stopReplicationSlaveRole(): I think the javadoc here is
a bit inaccurate.  AFAIU, setting the inReplicationSlaveMode flag
will make the slave complete recovery and boot the database.

  - There is a double ; in SlaveController#failover.

  - I think a successful failover should also be recorded in derby.log
at the (former) slave.

  - There is a typo in the message text for R011: perfomed
  


 Attachments: failover_impl_notforcommit.diff, 
 failover_impl_notforcommit.stat, failover_impl_v1.diff, failover_impl_v1.stat



