Hi Bogon,
What happens depends on your timeout settings. If the failure lasts
less than the timeout, nothing will happen (the query will just wait for
the network to come back up).
If the timeout expires, the group communication will report that the
other controllers are dead and that you are now the only one in the
world. Note that if the same NIC is used on the controller for all
communications (database backends, clients and group communication),
then connectivity will be lost with all of them and the backends will be
automatically disabled.
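
To make the cases above concrete, here is a minimal Python sketch of the timeout rule. It only models the behaviour described in this message and is not Sequoia code; the function name, the shared_nic flag and the returned strings are all made up for illustration.

```python
def outcome_of_outage(outage_seconds, timeout_seconds, shared_nic):
    """Model what a controller observes after a network outage.

    Hypothetical helper: names and return strings are illustrative only.
    """
    if outage_seconds < timeout_seconds:
        # Failure shorter than the timeout: queries simply wait.
        return "queries wait until the network is back"
    if shared_nic:
        # One NIC carries backend, client and group traffic,
        # so connectivity to all of them is lost at once.
        return "peers reported dead, backends disabled"
    # Only the group communication channel was affected.
    return "peers reported dead, controller believes it is alone"
```

For example, a 30-second glitch with a 120-second timeout falls into the first branch, while a two-minute outage with the same timeout falls into one of the other two.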
If only the group communication failed, then when network connectivity
comes back the controllers get notified of the new controllers in the
group. They will then compare their recovery logs and find out whether
they were involved in a partition. If no writes happened during the
partition, they will recover smoothly; otherwise only the first
controller in the group will survive and all others will shut down.
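
The rule applied when the group re-forms can be sketched as follows. This is a simplified Python model under stated assumptions, not the actual Sequoia implementation: the function name is hypothetical, and "wrote during the partition" stands in for whatever the recovery log comparison really checks.

```python
def survivors_after_partition(controllers, wrote_during_partition):
    """Return the controllers that keep running once the group re-forms.

    controllers: controller names in group order (first member first).
    wrote_during_partition: names of controllers whose recovery log
    shows writes made while the group was split (assumed input).
    """
    if not any(name in wrote_during_partition for name in controllers):
        # No writes anywhere during the split: everyone recovers smoothly.
        return list(controllers)
    # Logs diverged: only the first controller in the group survives.
    return list(controllers[:1])
```

With two controllers c1 and c2, a write logged on either side during the split leaves only c1 running.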
Hope this helps,
Emmanuel
Thank you for your reply.
If you have time, could you tell me what happens in Sequoia when a
controller suffers an intermittent communication failure in a network
device such as a switch or NIC? For example, a NIC does not respond for
a couple of minutes. I am not sure whether this is a realistic failure
scenario, though.
Thank you for the paper list.
On Mon, Dec 8, 2008 at 12:52 PM, Emmanuel Cecchet
<[EMAIL PROTECTED]> wrote:
Hi,
There is no solution to the partitioning problem if you need
consistency. The first impossibility proof appeared in Brian A.
Coan, Brian M. Oki, and Elliot K. Kolodner. "Limitations on
Database Availability when Networks Partition." Proceedings of the
Fifth Annual ACM Symposium on Principles of Distributed Computing
(1986), pp. 187–194.
If your machines are connected to the same network switch, they
will all lose connectivity simultaneously if the switch goes down.
You can then have 2 networks with something like heartbeat (the HA
tool) for failover from one network to the other.
If someone artificially introduces a network partition by
misconfiguring the network (bad firewall or VPN setup) then you
will have to do reconciliation (which cannot be done
automatically). Some people have been thinking about the problem:
Asplund, M. and Nadjm-Tehrani, S. 2006. Post-partition
reconciliation protocols for maintaining consistency. In
/Proceedings of the 2006 ACM Symposium on Applied Computing/
(Dijon, France, April 23 - 27, 2006). SAC '06. ACM, New York, NY,
710-717.
In all cases, you have to try to avoid partitions by carefully
designing your network configuration. You will only be able to
detect a partition when it is too late, and what you want to do for
reconciliation is application specific.
Hope this helps,
Emmanuel
I am running two controllers with the RAIDb-1 scheme. The problem
arises when the network is partitioned because of a switch
failure.
In this scenario, both controllers will keep receiving user
requests independently, which will make the two backends inconsistent.
From Bianca's papers I realized that this kind of network
partition can happen quite often.
Looking through this mailing list, I saw that Emmanuel suggested
unifying the database communication path and the user request path
in the network topology.
But in our case, the databases already communicate over a
gigabit network, separated from the outside communication channel.
I also read "C-JDBC Horizontal Scalability: A controller
replication user guide" and understood that I need to write a JMX
client which listens for JMX notifications from the Sequoia
controllers. I think this client will need to stop one controller
when the two controllers are partitioned, or switch a backend
database to read-only mode.
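
As a rough illustration of the policy such a client might apply, here is a small Python sketch. Everything in it is hypothetical: the function name, the action strings and the idea of a pre-chosen "primary" controller are assumptions made for the example, and nothing here is a real Sequoia or JMX API. It also assumes the client can still reach each controller, which may not hold during a real partition.

```python
def plan_actions(configured, visible_members, primary):
    """Decide what to do with each controller from its reported group view.

    configured: names of all controllers expected in the group.
    visible_members: dict mapping a controller name to the set of
        members it currently sees (as reported by notifications).
    primary: the one controller allowed to keep accepting writes.
    """
    expected = set(configured)
    actions = {}
    for name in configured:
        # A controller with no report is treated as seeing nobody.
        seen = visible_members.get(name, set())
        if seen >= expected:
            actions[name] = "none"  # full view, no partition suspected
        elif name == primary:
            actions[name] = "keep read-write"
        else:
            actions[name] = "set backends read-only"
    return actions
```

With controllers c1 and c2 each seeing only itself and c1 chosen as primary, the plan keeps c1 read-write and switches the backends of c2 to read-only.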
At this point, I have a quick question: is there a better way of
dealing with this sort of network partition in Sequoia?
Thank you for reading.
Best,
------------------------------------------------------------------------
_______________________________________________
Sequoia mailing list
[email protected]
<mailto:[email protected]>
https://forge.continuent.org/mailman/listinfo/sequoia
--
Emmanuel Cecchet
FTO @ Frog Thinker Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: [EMAIL PROTECTED]
Skype: emmanuel_cecchet
--
May the LORD bless you and keep you;
may the LORD make his face shine on you and be gracious to you;
may the LORD turn his face toward you and give you peace.
(Numbers 6:24-26)