It seems nice to have the ability to control the consistency behavior
on a per-client/app basis instead of it being system-wide.
I think it's a good idea to have a design for eventual consistency in mind
for now and implement it as required post-1.0.

-Sanjit

On Sun, Sep 13, 2009 at 3:54 PM, Luke <[email protected]> wrote:

>
> On Sun, Sep 13, 2009 at 2:13 PM, Doug Judd <[email protected]> wrote:
> > This looks like a nice way to add eventual consistency to Hypertable.  I
> > like the fact that once a write makes it into the proxy log, it is
> > guaranteed to eventually make it into the system.  The only issue I see
> > is that updates for a cell could get written out of order: the client
> > could end up writing a newer version of a cell before the proxy writer
> > gets a chance to write the older version.  The application can solve
> > this problem by writing self-ordering entries using a monotonically
> > increasing sequence number.
>
> Yeah, the client or the proxy (when writing to the proxy log) can fill
> in the revision/timestamp field of the cells.
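>
> Something like this could do it on the client side. A minimal sketch:
> Cell, write_cell and SelfOrderingWriter are illustrative stand-ins,
> not the actual Hypertable client API.
>
> #include <atomic>
> #include <cstdint>
> #include <string>
>
> struct Cell {
>   std::string row, column, value;
>   int64_t revision;  // explicit revision instead of server-assigned
> };
>
> class SelfOrderingWriter {
> public:
>   // Stamp every cell with a monotonically increasing revision so a
>   // replay from the proxy log can never overwrite a newer value.
>   void write(Cell cell) {
>     cell.revision = next_revision_.fetch_add(1) + 1;
>     write_cell(cell);  // hypothetical transport call
>   }
> private:
>   void write_cell(const Cell &) { /* send to range server / proxy log */ }
>   std::atomic<int64_t> next_revision_{0};
> };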
>
> > I do question the need for eventual consistency.  I feel that this
> > "concern" is theoretical.  The problem is that people do not have a
> > solid Bigtable implementation to try out.  I suspect that this
> > perceived problem is much less of an issue than people think.  Amazon
> > developed this concept for their shopping cart.  If one shopping cart
> > update in 1000 caused the system to spin for 30 seconds with a
> > "System busy" message, would you really care?  If 999 times out of
> > 1000 the shopping cart updated instantly, you would perceive the
> > system as highly available.
>
> I'm with you on this one (the shopping cart); I personally would
> suspect my net connection first :) OTOH, if I'm a front-end/application
> programmer who wants to log events directly into Hypertable and doesn't
> really care about consistency (the transactions must be logged, but
> they won't be read until batch processing later), having to make sure
> the call doesn't time out and lose the transaction is very annoying.
> I'd choose a back-end that makes my life easier.
>
> > I think we should wait on this until it is determined to be a real
> > problem, not a theoretical one.  It might also be a worthy exercise
> > to do a back-of-the-envelope calculation based on failure-rate data
> > to determine the real impact of failures on availability.
>
> I think the choice really belongs to the users. I'd suggest that we
> add a "multiple path write proxy" (MPWP) feature (easy to implement,
> details TBD of course) to the slides to assuage people's irrational
> (or not) fear about write latency under recovery :)
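>
> FWIW, here is a quick cut at the back-of-the-envelope math Doug
> suggested. Every number below is a made-up assumption, just to show
> the shape of the calculation:
>
> #include <cstdio>
>
> int main() {
>   const double nodes = 100.0;                     // assumed cluster size
>   const double failures_per_node_per_year = 2.0;  // assumed failure rate
>   const double recovery_seconds = 30.0;           // assumed stall window
>   const double seconds_per_year = 365.25 * 24 * 3600;
>
>   // Approximate fraction of wall-clock time some node is mid-recovery
>   // (assumes failures do not overlap). Prints ~0.019% for these inputs.
>   // Only writes aimed at the recovering node's ranges actually stall,
>   // so the fraction of affected writes is smaller still.
>   double f = nodes * failures_per_node_per_year * recovery_seconds /
>              seconds_per_year;
>   std::printf("time with a node in recovery: ~%.4f%%\n", f * 100.0);
>   return 0;
> }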
>
> __Luke
>
> > - Doug
> >
> > On Sat, Sep 12, 2009 at 1:37 PM, Luke <[email protected]> wrote:
> >>
> >> One of the biggest "concerns" from potential "real-time" users of
> >> Hypertable is the write latency spike when some nodes are down and
> >> being recovered. Read latency/availability issues are usually masked
> >> by the caching layer.
> >>
> >> Cassandra tries to solve the problem with "hinted handoff": when the
> >> destination node is down, the write is sent to an alternative node,
> >> tagged with its intended destination. Of course this mandates
> >> relaxing the consistency guarantee to "eventual", which is a
> >> trade-off many are willing to make.
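> >>
> >> Conceptually the hint is just the mutation tagged with where it was
> >> supposed to go. A sketch of the record, not Cassandra's actual code:
> >>
> >> #include <deque>
> >> #include <string>
> >> #include <vector>
> >>
> >> struct Hint {
> >>   std::string intended_node;   // the down destination server
> >>   std::vector<char> mutation;  // the serialized write itself
> >> };
> >>
> >> // The stand-in node queues hints and replays (then discards) them
> >> // once the intended node comes back up.
> >> std::deque<Hint> hint_queue;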
> >>
> >> I just thought that it's not that hard to implement something similar
> >> in Hypertable and give users a choice between immediate and eventual
> >> consistency:
> >>
> >> When a mutator is created with a BEST_EFFORT/EVENTUAL_OK flag, instead
> >> of the client retrying writes indefinitely when a destination node is
> >> down, it writes to an alternative range server with a special update
> >> flag, which persists the writes to a proxy log. The maintenance
> >> threads on the alternative range server then try to empty the proxy
> >> log by retrying the writes. Alternative range servers can be picked
> >> using either a random scheme (sort the server list by the MD5 of
> >> their IP addresses; the alternatives are the next n servers) or a
> >> location-aware (data center/rack) scheme. Note that this approach
> >> works even if the alternative node dies before its proxy logs are
> >> cleared, since the logs are persistent and can be replayed on
> >> recovery.
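> >>
> >> The random scheme could look roughly like this (a sketch; std::hash
> >> stands in for MD5 to keep it self-contained):
> >>
> >> #include <algorithm>
> >> #include <cstddef>
> >> #include <functional>
> >> #include <string>
> >> #include <vector>
> >>
> >> // Order servers on a deterministic ring by a hash of their IP, then
> >> // take the n servers that follow the failed one on the ring.
> >> std::vector<std::string>
> >> pick_alternatives(std::vector<std::string> servers,
> >>                   const std::string &failed_ip, size_t n) {
> >>   std::hash<std::string> h;
> >>   std::sort(servers.begin(), servers.end(),
> >>             [&](const std::string &a, const std::string &b) {
> >>               return h(a) < h(b);
> >>             });
> >>   auto it = std::find(servers.begin(), servers.end(), failed_ip);
> >>   if (it == servers.end())
> >>     return {};
> >>   std::vector<std::string> alts;
> >>   size_t i = it - servers.begin();
> >>   for (size_t j = 1; j <= n && j < servers.size(); ++j)
> >>     alts.push_back(servers[(i + j) % servers.size()]);
> >>   return alts;
> >> }
> >>
> >> Since each server sits at a different spot on the ring, different
> >> failed servers get different alternatives, which spreads the proxy
> >> load from a failure across the cluster.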
> >>
> >> Thoughts?
> >>
> >> __Luke
> >>
