On Tue, Dec 3, 2013 at 3:46 PM, Nick Dimiduk <[email protected]> wrote:
> On Tue, Dec 3, 2013 at 11:37 AM, Enis Söztutar <[email protected]> wrote:
>
> > I think we do not want to differentiate between RS's by splitting them
> > between primaries and shadows. This will complicate provisioning,
> > administration, monitoring and load balancing a lot, and will not achieve
> > very cheap secondary region promotions (because you have to move the
> > region still as you described).
>
> The idea of having "primary hosts" and "replica hosts" was brought up in
> initial design discussions over here. I am particularly against this
> approach because of the additional complexity. I need to update myself on
> Enis's doc (I'm a week+ behind), but my opinion is that we treat a
> non-primary region (be it a "read replica" or a "shadow region") as a
> first-class, independent entity. These entities can be assigned to any
> host in the cluster, each with its own individual state machine
> instance.
>
> Of course, the balancer would need to be aware of the relationship between
> the primary and its non-primaries in order to maintain the balancing policy
> requirements. However, I see no reason for there to be specialization at
> the host level, and I agree with Enis's arguments against it.
>
> -n

I think there was a misunderstanding here -- I made a distinction between
the "normal" primary regions, eventually-consistent read-replica/secondary
regions, and shadow memstore regions (for fast consistent read recovery).
All region servers would be able to host normal primary regions,
read-replica regions and shadow memstore regions.

There would be different potential sweet spots if read-replica regions and
shadow memstore regions were co-located at region recovery time, with
trade-offs among fast consistent recovery, the ability to have more recent
values, locality optimizations and load balancing optimizations.

Jon.

--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [email protected]
