On Tue, Dec 3, 2013 at 11:51 AM, Jonathan Hsieh <[email protected]> wrote:
> To keep the discussion focused on the design goals, I'm going to start
> referring to Enis and Devaraj's eventually consistent read replicas as the
> *read replica* design, and the consistent fast read recovery mechanism based
> on shadowing/tailing the WALs as *shadow regions* or *shadow memstores*. Can
> we agree on nomenclature?
>

Makes sense.

>
> On Tue, Dec 3, 2013 at 11:07 AM, Enis Söztutar <[email protected]> wrote:
>
> > Thanks Jon for bringing this to dev@.
> >
> > On Mon, Dec 2, 2013 at 10:01 PM, Jonathan Hsieh <[email protected]> wrote:
> >
> > > Fundamentally, I'd prefer focusing on making HBase "HBasier" instead of
> > > tackling a feature that other systems architecturally can do better
> > > (inconsistent reads). I consider consistent reads/writes to be one of
> > > HBase's defining features. That said, I think read replicas make sense
> > > and are a nice feature to have.
> > >
> >
> > Our design proposal has a specific use case goal, and hopefully we can
> > demonstrate the benefits of having this in HBase so that even more pieces
> > can be built on top of it. Plus I imagine this will be a widely used
> > feature for read-only tables or bulk loaded tables. We are not proposing
> > to rework strong consistency semantics or to make major architectural
> > changes. I think that defining tables with a replication count, together
> > with the proposed client API changes (Consistency definition), plugs into
> > the HBase model rather well.
> >
>
> I do think that without any recent-update mechanism, we are limiting the
> usefulness of this feature to essentially *only* read-only or
> bulk-load-only tables. Recency, if there were any edits/updates, would be
> severely lagging (by default potentially an hour), especially in cases
> where there are only a few edits to a primarily bulk loaded table.
> This limitation is not mentioned in the tradeoffs or requirements (or a
> non-requirements section) and definitely should be listed there.
>

Obviously the amount of lag you would observe depends on whether you are
using "Region snapshots", "WAL-Tailing" or "Async wal replication". I think
there are still use cases where you can live with >1 hour old stale reads, so
"Region snapshots" is not *just* for read-only tables. I'll add these to
the tradeoffs section.

We are proposing to implement "Region snapshots" first and "Async wal
replication" second. As argued, I think WAL-tailing only makes sense with
WALpr, so that work is left until after we have WAL per region.

>
> With the current design it might be best to have a flag on the table which
> marks it read-only or bulk-load only, so that it only gets used by users
> when the table is in that mode? (and maybe an "escape hatch" for power
> users).
>

I think we have a read-only flag already. We might not have a bulk-load-only
flag though. It makes sense to add one if we want to allow bulk loads but
prevent writes.

>
> [snip]
> >
> > > - I think the two goals are both worthy on their own, each with their own
> > > optimal points. We should make sure in the design that we can support
> > > both goals.
> > >
> >
> > I think our proposal is consistent with your doc, and we have considered
> > secondary region promotion in the future section. It would be good if you
> > can review and comment on whether you see any points missing.
> >
>
> I definitely will. At the moment, I think the hybrid for the wals/hlogs I
> suggested in the other thread seems to be an optimal solution considering
> locality. Though feasible, it is obviously more complex than just one
> approach alone.
>
> > > - I want to make sure the proposed design has a path for optimal
> > > fast-consistent read-recovery.
> > >
> >
> > We think that it does, but it is a secondary goal for the initial work.
> > I don't see any reason why secondary promotion cannot be built on top of
> > this, once the branch is in a better state.
> >
>
> Based on the detail in the design doc and this statement it sounds like you
> have a prototype branch already? Is this the case?
>

Indeed. I think that is mentioned in the JIRA description. We have some parts
of the changes for region, region server, HRI, and master. Client changes
are on the way. I think we can post that in a GitHub branch for now to share
the code early and solicit early reviews.

>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // [email protected]
>
