Re: [Gluster-devel] Default quorum for 2 way replication
On 03/05/2016 05:26 AM, Pranith Kumar Karampuri wrote:
>>> That is the point. There is an illusion of choice between Data integrity and HA. But we are not *really* giving HA, are we? HA will be there only if second brick in the replica pair goes down. In your typical
>>
>> @Pranith, can you elaborate on this? I am not so AFR savvy, so unable to comprehend why HA is available if only when the second brick goes down and is not when the first does. Just helps in understanding the issue at hand.
>
> Because it is client side replication there is a fixed *leader* i.e. 1st brick.

Ah! good to know, thank you.

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
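The "fixed leader" point can be made concrete with a small model. This is a sketch of the client-quorum rule as I understand it from the Gluster administrator guide, not AFR's actual code: with quorum 'auto', writes need more than half of the replica's bricks, and when exactly half are up the first brick breaks the tie. The function name `client_quorum_met` and the list-of-booleans representation are hypothetical, chosen only for illustration; the behaviour shown for 'none' (writes allowed while any brick is reachable) is likewise an assumption.

```python
def client_quorum_met(bricks_up, quorum_type="auto"):
    """Return True if a client may write to this replica group.

    bricks_up: list of booleans in brick order; index 0 is the first brick.
    quorum_type: 'auto' or 'none' (hypothetical stand-ins for the volume option).
    """
    if quorum_type == "none":
        # Assumption: without quorum, any reachable brick is enough.
        return any(bricks_up)

    total = len(bricks_up)
    up = sum(bricks_up)
    if 2 * up > total:
        # More than half of the bricks are up: plain majority.
        return True
    # Exactly half are up: the first brick acts as the tie-breaker.
    return 2 * up == total and bricks_up[0]


# Replica 2: losing the second brick keeps the mount writable,
# losing the first brick does not.
print(client_quorum_met([True, False]))   # second brick down
print(client_quorum_met([False, True]))   # first brick down
```

For a two-brick replica this reduces to "the first brick must be up", which is why 'auto' only gives HA when it is the second brick that fails.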
Re: [Gluster-devel] Default quorum for 2 way replication
On 03/04/2016 08:36 PM, Shyam wrote:
> On 03/04/2016 07:30 AM, Pranith Kumar Karampuri wrote:
>> On 03/04/2016 05:47 PM, Bipin Kunal wrote:
>>> HI Pranith,
>>>
>>> Thanks for starting this mail thread.
>>>
>>> Looking from a user perspective most important is to get a "good copy" of data. I agree that people use replication for HA but having stale data with HA will not have any value. So I will suggest to make auto quorum as default configuration even for 2-way replication.
>>>
>>> If user is willing to lose data at the cost of HA, he always have option disable it. But default preference should be data and its integrity.
>
> I think we need to consider *maintenance* activities on the volume, like replacing a brick in a replica pair, or upgrading one half of the replica and then the other, at which time the replica group would function read-only, if we choose 'auto' in a 2-way replicated state, is this correct?

Yes.

> Having said the above, we already have the option in place, right? I.e admins can already choose 'auto', it is just the default that we are discussing. This could also be tackled via documentation/best practices ("yeah right! who reads those again?" is a valid comment here).

Yes. I just sent a reply to Jeff, where I told it is better to have interactive question at the time of creating 2-way replica volume which gives this information :-).

> I guess we need to be clear (in documentation or otherwise) what they get when they choose one over the other (like the HA point below and also upgrade concerns etc.), irrespective of how this discussion ends (just my 2 c's).

Totally agree. We will give an interactive question above, a link which gives detailed explanation.

>> That is the point. There is an illusion of choice between Data integrity and HA. But we are not *really* giving HA, are we? HA will be there only if second brick in the replica pair goes down. In your typical
>
> @Pranith, can you elaborate on this? I am not so AFR savvy, so unable to comprehend why HA is available if only when the second brick goes down and is not when the first does. Just helps in understanding the issue at hand.

Because it is client side replication there is a fixed *leader* i.e. 1st brick.

As a side note. We recently had a discussion with NSR team (Jeff, avra). We will be using some infra for NSR to implement server side afr as well with leader election etc.

Pranith

>> deployment, we can't really give any guarantees about what brick will go down when. So I am not sure if we can consider it as HA. But I would love to hear what others have to say about this as well. If majority of users say they need it to be auto, you will definitely see a patch :-).
>>
>> Pranith
>>
>>> Thanks,
>>> Bipin Kunal
>>>
>>> On Fri, Mar 4, 2016 at 5:43 PM, Ravishankar N wrote:
>>>> On 03/04/2016 05:26 PM, Pranith Kumar Karampuri wrote:
>>>>> hi,
>>>>> So far default quorum for 2-way replication is 'none' (i.e. files/directories may go into split-brain) and for 3-way replication and arbiter based replication it is 'auto' (files/directories won't go into split-brain). There are requests to make default as 'auto' for 2-way replication as well. The line of reasoning is that people value data integrity (files not going into split-brain) more than HA (operation of mount even when bricks go down). And admins should explicitly change it to 'none' when they are fine with split-brains in 2-way replication. We were wondering if you have any inputs about what is a sane default for 2-way replication.
>>>>>
>>>>> I like the default to be 'none'. Reason: If we have 'auto' as quorum for 2-way replication and first brick dies, there is no HA.
>>>>
>>>> +1. Quorum does not make sense when there are only 2 parties. There is no majority voting. Arbiter volumes are a better option.
>>>> If someone wants some background, please see 'Client quorum' and 'Replica 2 and Replica 3 volumes' section of
>>>> http://gluster.readthedocs.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
>>>>
>>>> -Ravi
>>>>
>>>>> If users are fine with it, it is better to use plain distribute volume rather than replication with quorum as 'auto'. What are your thoughts on the matter? Please guide us in the right direction.
>>>>>
>>>>> Pranith
Re: [Gluster-devel] Default quorum for 2 way replication
On 03/04/2016 09:10 PM, Jeff Darcy wrote:
>> I like the default to be 'none'. Reason: If we have 'auto' as quorum for 2-way replication and first brick dies, there is no HA. If users are fine with it, it is better to use plain distribute volume
>
> "Availability" is a tricky word. Does it mean access to data now, or later despite failure? Taking a volume down due to loss of quorum might be equivalent to having no replication in the first sense, but certainly not in the second. When the possibility (likelihood?) of split brain is considered, enforcing quorum actually does a *better* job of preserving availability in the second sense. I believe this second sense is most often what users care about, and therefore quorum enforcement should be the default.
>
> I think we all agree that quorum is a bit slippery when N=2. That's where there really is a tradeoff between (immediate) availability and (highest levels of) data integrity. That's why arbiters showed up first in the NSR specs, and later in AFR. We should definitely try to push people toward N>=3 as much as we can. However, the ability to "scale down" is one of the things that differentiate us vs. both our Ceph cousins and our true competitors. Many of our users will stop at N=2 no matter what we say. However unwise that might be, we must still do what we can to minimize harm when things go awry.

I always felt 2-way replication, 3-way replication analogy is similar to 2-wheeler(motor-bikes) and 4-wheeler vehicles(cars). You have more fatal accidents with 2-wheelers than 4-wheelers. But it has its place. Arbiter volumes is like a 3-wheeler(auto rickshaw) :-). I feel users should be given the power to choose what they want based on what they are looking for and how much hardware they want to buy (affordability). We should educate them about the risks but the final decision should be theirs. So in that sense I don't like to *push* them to N>=3.

"Many of our users will stop at N=2 no matter what we say". That right there is what I had to realize, some years back. I naively thought that people will rush to replica-3 with client quorum, but it didn't happen. That is the reason for investing time in arbiter volumes as a solution. Because we wanted to reduce the cost. People didn't want to spend so much money for consistency(based on what we are still seeing). Fact of the matter is, even after arbiter volumes I am sure some people will stick with replica-2 with unsplit-brain patch from facebook (For people who don't know: it resolves split-brain based on policies automatically without human intervention, it will be available soon in gluster).

You do have a very good point though. I think it makes sense to make more people aware of what they are getting into with 2-way replication. So may be an interactive question at the time of 2-way replica volume creation about the possibility of split-brains and availability of other options(like arbiter/unsplit-brain in 2-way replication) could be helpful, keeping the default still as 'none'.

I think it would be better if we educate users about value of arbiter volumes, so that users naturally progress towards that and embrace it. We are seeing more and more questions on the IRC and mailing list about arbiter volumes, so there is a +ve trend.

Pranith
Re: [Gluster-devel] Default quorum for 2 way replication
> I like the default to be 'none'. Reason: If we have 'auto' as quorum for
> 2-way replication and first brick dies, there is no HA. If users are
> fine with it, it is better to use plain distribute volume

"Availability" is a tricky word. Does it mean access to data now, or later despite failure? Taking a volume down due to loss of quorum might be equivalent to having no replication in the first sense, but certainly not in the second. When the possibility (likelihood?) of split brain is considered, enforcing quorum actually does a *better* job of preserving availability in the second sense. I believe this second sense is most often what users care about, and therefore quorum enforcement should be the default.

I think we all agree that quorum is a bit slippery when N=2. That's where there really is a tradeoff between (immediate) availability and (highest levels of) data integrity. That's why arbiters showed up first in the NSR specs, and later in AFR. We should definitely try to push people toward N>=3 as much as we can. However, the ability to "scale down" is one of the things that differentiate us vs. both our Ceph cousins and our true competitors. Many of our users will stop at N=2 no matter what we say. However unwise that might be, we must still do what we can to minimize harm when things go awry.
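The two senses of "availability" above can be illustrated with a toy partition. This is a deliberately simplified sketch under stated assumptions, not how AFR tracks pending writes: two clients are cut off from each other, client A reaches only the first brick and client B only the second, and both try to write the same file. The names `write_allowed` and `simulate_partition` are hypothetical, and the 'auto' rule is reduced to its two-brick form (the first brick must be reachable).

```python
def write_allowed(visible, quorum_type):
    """visible: set of brick indices (0 or 1) this client can reach."""
    if quorum_type == "none":
        # Assumption: any reachable brick accepts the write.
        return len(visible) > 0
    # 'auto' on a two-brick replica: both bricks, or at least the first one.
    return 0 in visible


def simulate_partition(quorum_type):
    """Return (brick contents, split_brain) after both clients try to write."""
    contents = ["v0", "v0"]      # both copies start out identical
    accepted_alone = set()       # bricks that took a write the other brick missed
    for client, visible in (("A", {0}), ("B", {1})):
        if write_allowed(visible, quorum_type):
            for brick in visible:
                contents[brick] = "from-" + client
                accepted_alone.add(brick)
    # Split-brain in this model: each brick holds a write the other never saw.
    return contents, len(accepted_alone) == 2


for quorum_type in ("none", "auto"):
    print(quorum_type, simulate_partition(quorum_type))
```

With 'none' both clients keep working during the partition and the copies diverge in conflicting ways. With 'auto' client B gets errors right away, but the second brick is merely stale and can be healed from the first once the partition ends.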
Re: [Gluster-devel] Default quorum for 2 way replication
On 03/04/2016 07:30 AM, Pranith Kumar Karampuri wrote:
> On 03/04/2016 05:47 PM, Bipin Kunal wrote:
>> HI Pranith,
>>
>> Thanks for starting this mail thread.
>>
>> Looking from a user perspective most important is to get a "good copy" of data. I agree that people use replication for HA but having stale data with HA will not have any value. So I will suggest to make auto quorum as default configuration even for 2-way replication.
>>
>> If user is willing to lose data at the cost of HA, he always have option disable it. But default preference should be data and its integrity.

I think we need to consider *maintenance* activities on the volume, like replacing a brick in a replica pair, or upgrading one half of the replica and then the other, at which time the replica group would function read-only, if we choose 'auto' in a 2-way replicated state, is this correct?

Having said the above, we already have the option in place, right? I.e admins can already choose 'auto', it is just the default that we are discussing. This could also be tackled via documentation/best practices ("yeah right! who reads those again?" is a valid comment here).

I guess we need to be clear (in documentation or otherwise) what they get when they choose one over the other (like the HA point below and also upgrade concerns etc.), irrespective of how this discussion ends (just my 2 c's).

> That is the point. There is an illusion of choice between Data integrity and HA. But we are not *really* giving HA, are we? HA will be there only if second brick in the replica pair goes down. In your typical

@Pranith, can you elaborate on this? I am not so AFR savvy, so unable to comprehend why HA is available if only when the second brick goes down and is not when the first does. Just helps in understanding the issue at hand.

> deployment, we can't really give any guarantees about what brick will go down when. So I am not sure if we can consider it as HA. But I would love to hear what others have to say about this as well. If majority of users say they need it to be auto, you will definitely see a patch :-).
>
> Pranith
>
>> Thanks,
>> Bipin Kunal
>>
>> On Fri, Mar 4, 2016 at 5:43 PM, Ravishankar N wrote:
>>> On 03/04/2016 05:26 PM, Pranith Kumar Karampuri wrote:
>>>> hi,
>>>> So far default quorum for 2-way replication is 'none' (i.e. files/directories may go into split-brain) and for 3-way replication and arbiter based replication it is 'auto' (files/directories won't go into split-brain). There are requests to make default as 'auto' for 2-way replication as well. The line of reasoning is that people value data integrity (files not going into split-brain) more than HA (operation of mount even when bricks go down). And admins should explicitly change it to 'none' when they are fine with split-brains in 2-way replication. We were wondering if you have any inputs about what is a sane default for 2-way replication.
>>>>
>>>> I like the default to be 'none'. Reason: If we have 'auto' as quorum for 2-way replication and first brick dies, there is no HA.
>>>
>>> +1. Quorum does not make sense when there are only 2 parties. There is no majority voting. Arbiter volumes are a better option.
>>> If someone wants some background, please see 'Client quorum' and 'Replica 2 and Replica 3 volumes' section of
>>> http://gluster.readthedocs.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
>>>
>>> -Ravi
>>>
>>>> If users are fine with it, it is better to use plain distribute volume rather than replication with quorum as 'auto'. What are your thoughts on the matter? Please guide us in the right direction.
>>>>
>>>> Pranith
Re: [Gluster-devel] Default quorum for 2 way replication
On 03/04/2016 05:47 PM, Bipin Kunal wrote:
> HI Pranith,
>
> Thanks for starting this mail thread.
>
> Looking from a user perspective most important is to get a "good copy" of data. I agree that people use replication for HA but having stale data with HA will not have any value. So I will suggest to make auto quorum as default configuration even for 2-way replication.
>
> If user is willing to lose data at the cost of HA, he always have option disable it. But default preference should be data and its integrity.

That is the point. There is an illusion of choice between Data integrity and HA. But we are not *really* giving HA, are we? HA will be there only if second brick in the replica pair goes down. In your typical deployment, we can't really give any guarantees about what brick will go down when. So I am not sure if we can consider it as HA. But I would love to hear what others have to say about this as well. If majority of users say they need it to be auto, you will definitely see a patch :-).

Pranith

> Thanks,
> Bipin Kunal
>
> On Fri, Mar 4, 2016 at 5:43 PM, Ravishankar N wrote:
>> On 03/04/2016 05:26 PM, Pranith Kumar Karampuri wrote:
>>> hi,
>>> So far default quorum for 2-way replication is 'none' (i.e. files/directories may go into split-brain) and for 3-way replication and arbiter based replication it is 'auto' (files/directories won't go into split-brain). There are requests to make default as 'auto' for 2-way replication as well. The line of reasoning is that people value data integrity (files not going into split-brain) more than HA (operation of mount even when bricks go down). And admins should explicitly change it to 'none' when they are fine with split-brains in 2-way replication. We were wondering if you have any inputs about what is a sane default for 2-way replication.
>>>
>>> I like the default to be 'none'. Reason: If we have 'auto' as quorum for 2-way replication and first brick dies, there is no HA.
>>
>> +1. Quorum does not make sense when there are only 2 parties. There is no majority voting. Arbiter volumes are a better option.
>> If someone wants some background, please see 'Client quorum' and 'Replica 2 and Replica 3 volumes' section of
>> http://gluster.readthedocs.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
>>
>> -Ravi
>>
>>> If users are fine with it, it is better to use plain distribute volume rather than replication with quorum as 'auto'. What are your thoughts on the matter? Please guide us in the right direction.
>>>
>>> Pranith
Re: [Gluster-devel] Default quorum for 2 way replication
HI Pranith,

Thanks for starting this mail thread.

Looking from a user perspective most important is to get a "good copy" of data. I agree that people use replication for HA but having stale data with HA will not have any value. So I will suggest to make auto quorum as default configuration even for 2-way replication.

If user is willing to lose data at the cost of HA, he always have option disable it. But default preference should be data and its integrity.

Thanks,
Bipin Kunal

On Fri, Mar 4, 2016 at 5:43 PM, Ravishankar N wrote:
> On 03/04/2016 05:26 PM, Pranith Kumar Karampuri wrote:
>>
>> hi,
>> So far default quorum for 2-way replication is 'none' (i.e.
>> files/directories may go into split-brain) and for 3-way replication and
>> arbiter based replication it is 'auto' (files/directories won't go into
>> split-brain). There are requests to make default as 'auto' for 2-way
>> replication as well. The line of reasoning is that people value data
>> integrity (files not going into split-brain) more than HA (operation of
>> mount even when bricks go down). And admins should explicitly change it to
>> 'none' when they are fine with split-brains in 2-way replication. We were
>> wondering if you have any inputs about what is a sane default for 2-way
>> replication.
>>
>> I like the default to be 'none'. Reason: If we have 'auto' as quorum for
>> 2-way replication and first brick dies, there is no HA.
>
> +1. Quorum does not make sense when there are only 2 parties. There is no
> majority voting. Arbiter volumes are a better option.
> If someone wants some background, please see 'Client quorum' and 'Replica 2
> and Replica 3 volumes' section of
> http://gluster.readthedocs.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
>
> -Ravi
>
>> If users are fine with it, it is better to use plain distribute volume
>> rather than replication with quorum as 'auto'. What are your thoughts on the
>> matter? Please guide us in the right direction.
>>
>> Pranith
Re: [Gluster-devel] Default quorum for 2 way replication
On 03/04/2016 05:26 PM, Pranith Kumar Karampuri wrote:
> hi,
> So far default quorum for 2-way replication is 'none' (i.e. files/directories may go into split-brain) and for 3-way replication and arbiter based replication it is 'auto' (files/directories won't go into split-brain). There are requests to make default as 'auto' for 2-way replication as well. The line of reasoning is that people value data integrity (files not going into split-brain) more than HA (operation of mount even when bricks go down). And admins should explicitly change it to 'none' when they are fine with split-brains in 2-way replication. We were wondering if you have any inputs about what is a sane default for 2-way replication.
>
> I like the default to be 'none'. Reason: If we have 'auto' as quorum for 2-way replication and first brick dies, there is no HA.

+1. Quorum does not make sense when there are only 2 parties. There is no majority voting. Arbiter volumes are a better option.
If someone wants some background, please see 'Client quorum' and 'Replica 2 and Replica 3 volumes' section of
http://gluster.readthedocs.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/

-Ravi

> If users are fine with it, it is better to use plain distribute volume rather than replication with quorum as 'auto'. What are your thoughts on the matter? Please guide us in the right direction.
>
> Pranith
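The remark that there is "no majority voting" with 2 parties comes down to simple arithmetic, sketched here for illustration. The helper names are hypothetical, and the arbiter is counted as one more voter on the assumption that it takes part in quorum like any other brick.

```python
def strict_majority(voters):
    """Smallest number of voters that is more than half of the group."""
    return voters // 2 + 1


def failures_tolerated(voters):
    """How many voters can be lost while a strict majority survives."""
    return voters - strict_majority(voters)


# 2 = plain replica 2; 3 = replica 3, or replica 2 plus an arbiter
# (assumption: the arbiter counts as a voter).
for voters in (2, 3):
    print(voters, strict_majority(voters), failures_tolerated(voters))
```

With two voters a strict majority needs both of them, so no failure can be tolerated. A third voter is the smallest group that survives the loss of one member.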