Re: [one-users] ONE server redundancy
Hi All,

I believe a failover mechanism for a cloud toolkit such as OpenNebula should be developed within the toolkit itself. Solutions such as DRBD cannot address some requirements, such as synchronization at various levels, they are slow to perform failover, and they require and enforce special setups.

I have some experience developing an active-passive failover component for a similar service, and I would be interested in helping to develop such a component.

Regards,
Mehdi

On Thu, Feb 17, 2011 at 3:29 PM, Manikanta Kattamuri mani.kattam...@hexagrid.com wrote:

Hi,

an alternative is an active-active mode in which the proxy acts as a load balancer and gives an authority token to a selected oned daemon. In this mode oned supports a dry run, i.e. it performs all the operations, such as updating its cache, but the commit (the authority to operate on a synchronized resource) is restricted by a token given by the proxy. This makes sure the cache is in sync across all the oned daemons, and the proxy keeps control of both failover and load balancing.

Regards,
Mani.
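To make the token idea in Mani's message concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `Oned` and `TokenProxy` classes, the dict standing in for the shared database, and the round-robin choice of token holder are all hypothetical and are not part of OpenNebula.

```python
# Hypothetical model of the proposal: every daemon applies each operation to
# its own cache (the "dry run"), but only the daemon holding the proxy's
# token is allowed to commit to the shared store.

class Oned:
    def __init__(self, name):
        self.name = name
        self.cache = {}

    def apply(self, key, value, token, db):
        self.cache[key] = value      # dry run: always update the local cache
        if token == self.name:       # commit only when holding the token
            db[key] = value


class TokenProxy:
    def __init__(self, daemons, db):
        self.daemons = daemons
        self.db = db
        self.turn = 0

    def submit(self, key, value):
        # Round-robin selection stands in for real load balancing.
        holder = self.daemons[self.turn % len(self.daemons)].name
        self.turn += 1
        for daemon in self.daemons:
            daemon.apply(key, value, holder, self.db)
        return holder                # name of the daemon that committed


db = {}                              # stands in for the shared database
daemons = [Oned("oned-1"), Oned("oned-2")]
proxy = TokenProxy(daemons, db)

first = proxy.submit("vm-7", "RUNNING")
second = proxy.submit("vm-8", "PENDING")
print(first, second)
print(all(d.cache == db for d in daemons))
```

Because every daemon sees every operation, the caches stay identical to the store no matter which daemon commits. The sketch ignores what happens when a daemon misses an operation, which a real design would have to handle.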
Re: [one-users] ONE server redundancy
Hi again,

To add to my previous email, it is worth noting that there is another option that would avoid fiddling with the cache. For this, both oned instances have to be active. One way to go could be:

1) Set up oned on two machines
2) The CLI and EC2 tools connect via a proxy that forwards requests to the first daemon
3) If this fails, the proxy should start forwarding to the second. Also, coherence checking for VMs in intermediate states needs to be in place to avoid missed driver callbacks.
4) The scheduler should be on a third, separate machine

Regards,

-Tino

--
Constantino Vázquez Blanco, MSc
OpenNebula Major Contributor / Cloud Researcher
www.OpenNebula.org | @tinova79

On Wed, Feb 16, 2011 at 4:06 PM, Tino Vazquez tin...@opennebula.org wrote:

Hi Steven,

There may be incoherences between the two ONEs. Because of the cache (which can be disabled in ONE, with a performance penalty), two ONEs can hold the same VM record in memory, so if one instance of ONE writes to the DB, the changes won't be reflected in the other ONE until it refreshes its cache or, worse still, the second instance may overwrite the changes.

I am by no means saying this is not achievable, but there are several things (like the one in this email) to consider. We have been thinking of a setup like the one you propose, and we would love to hear how it works in practice, as it is theoretically possible but we haven't got around to trying it out.

Regards,

-Tino

___
Users mailing list
Users@lists.opennebula.org
http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
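Steps 2 and 3 of the proxy setup described above can be sketched as follows. This is a rough illustration, not an OpenNebula component: the `FailoverProxy` class is hypothetical, plain Python callables stand in for the two oned endpoints, and the request string is a placeholder. The coherence check for VMs in intermediate states is not modelled.

```python
# Hypothetical failover proxy: forward each request to the active backend
# and move on to the next one when the call fails.

class FailoverProxy:
    def __init__(self, backends):
        self.backends = list(backends)   # ordered, primary first
        self.active = 0                  # index of the backend in use

    def forward(self, request):
        for offset in range(len(self.backends)):
            index = (self.active + offset) % len(self.backends)
            try:
                reply = self.backends[index](request)
            except ConnectionError:
                continue                 # this backend is down, try the next
            self.active = index          # stick with the one that answered
            return reply
        raise ConnectionError("no oned backend reachable")


def primary(request):
    # Simulates a first oned that has died.
    raise ConnectionError("oned #1 is down")


def secondary(request):
    return f"oned #2 handled {request}"


proxy = FailoverProxy([primary, secondary])
reply = proxy.forward("list-vms")
print(reply)
```

After the first failure the proxy keeps sending requests to the second daemon, which matches the "start forwarding to the second" behaviour in step 3. Failing back to the first daemon is left out on purpose.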
Re: [one-users] ONE server redundancy
Hi,

that would be an active-passive oned configuration. The second oned only jumps in when needed, so it could even be started only once the first oned has failed. HA software could manage that, assuming the oned config directory is shared. The proxy makes sure that there is a single point of access, but this could also be done by the HA management.

What could go wrong if oned dies and another oned on a different machine takes over? Is there any possibility that information is lost because the first oned's cache is gone?

Active-active would probably only make sense if oned had integrated support for a redundancy mode, so that both oneds can exchange heartbeats and negotiate which one is master and takes care of the running configuration.

Regards,
Danny
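The heartbeat-and-master idea in Danny's message could look roughly like the sketch below. It is an assumption-laden toy: `elect_master`, the node names and the five-second timeout are all hypothetical, and the rule "lowest name among the live nodes wins" is just one simple way to negotiate. It does not protect against split brain when the two machines cannot see each other.

```python
# Hypothetical master selection from heartbeat timestamps: a node counts as
# alive if its last heartbeat is recent enough, and the alive node with the
# lowest name becomes master.

def elect_master(last_heartbeat, now, timeout=5.0):
    alive = sorted(
        node for node, seen in last_heartbeat.items()
        if now - seen <= timeout
    )
    return alive[0] if alive else None


# Timestamps are plain numbers (seconds) to keep the example deterministic.
heartbeats = {"oned-1": 100.0, "oned-2": 101.0}
before = elect_master(heartbeats, now=102.0)

# oned-1 stops sending heartbeats, oned-2 keeps going.
heartbeats["oned-2"] = 106.0
after = elect_master(heartbeats, now=106.5)

print(before, after)
```

While both daemons are alive the first one is master. Once its heartbeat is older than the timeout, the second one takes over.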
Re: [one-users] ONE server redundancy
Hi Luis,

That setup is not easily achievable. Operations are not transactional, and ONE also keeps a cache, so the information held by multiple ONEs won't be in sync. It can be achieved, but not out of the box; a fair amount of fiddling is involved.

Regards,

-Tino

--
Constantino Vázquez Blanco, MSc
OpenNebula Major Contributor / Cloud Researcher
www.OpenNebula.org | @tinova79
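The cache problem mentioned above can be shown with a small model. This is a hedged sketch, not OpenNebula code: the `Daemon` class is hypothetical, a dict stands in for the shared MySQL database, and the record fields are made up for the example.

```python
# Hypothetical model: two daemons share one store, but each keeps its own
# in-memory copy of a VM record and writes the whole record back on update.

class Daemon:
    def __init__(self, name, db):
        self.name = name
        self.db = db
        self.cache = {}

    def load(self, vm_id):
        # Read from the store only on a cache miss.
        if vm_id not in self.cache:
            self.cache[vm_id] = dict(self.db[vm_id])
        return self.cache[vm_id]

    def update(self, vm_id, **fields):
        # Modify the cached record, then write the whole record back.
        record = self.load(vm_id)
        record.update(fields)
        self.db[vm_id] = dict(record)


db = {7: {"state": "PENDING", "host": None}}
one_a = Daemon("one-a", db)
one_b = Daemon("one-b", db)

one_a.load(7)
one_b.load(7)                      # both daemons now cache the same record
one_a.update(7, state="RUNNING")   # one-a writes the new state
one_b.update(7, host="node01")     # one-b writes back its stale copy

print(db[7])
```

The final record still says `PENDING`: the second daemon never saw the first daemon's write, so its write-back silently undoes it. Disabling the cache or refreshing it before every write narrows the window, but without transactions it does not close it.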
Re: [one-users] ONE server redundancy
Tino, are you saying that there is state information in oned that is not on disk at any given time? We were thinking of setting up an active-passive failover of our oned via heartbeat and DRBD. Is there any reason why that might not work?

Steve Timm

--
Steven C. Timm, Ph.D (630) 840-8525 t...@fnal.gov http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Group Leader.
Lead of FermiCloud project.
[one-users] ONE server redundancy
Hello,

We have an OpenNebula installation and we want to deploy another ONE server for redundancy, monitoring the same hosts and VMs. Could this be achieved if both ONE installations use the same MySQL database? Are all the operations transactional?

Cheers

--
Luis M. Carril
Project Technician
Galicia Supercomputing Center (CESGA)
Avda. de Vigo s/n
15706 Santiago de Compostela
SPAIN
Tel: 34-981569810 ext 249
lmcar...@cesga.es
www.cesga.es