Sounds reasonable.  File an issue and we can put together a slide.  But
let's get Namespaces (dataspaces) in place first.  The system feels a bit
hokey having one big flat namespace.  It feels like the ancient
non-hierarchical filesystems.

- Doug

On Sun, Sep 13, 2009 at 7:48 PM, Luke <[email protected]> wrote:

>
> On Sun, Sep 13, 2009 at 7:14 PM, Sanjit Jhala <[email protected]> wrote:
> > Seems nice to have the ability to control the consistency behavior on a
> > per-client/app basis instead of it being system wide.
>
> Yeah, the behavior is controllable per mutator, which is already finer
> granularity than per client/app.
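
A minimal sketch of what a per-mutator consistency knob could look like at
the client API level; the flag names and the create_mutator() call below are
hypothetical, not the actual Hypertable API:

  // Sketch only: illustrates choosing consistency per mutator rather than
  // system-wide.  Names are illustrative.
  #include <cstdint>

  enum MutatorFlags : uint32_t {
    MUTATOR_FLAG_DEFAULT     = 0,  // strict: retry against the owning range server
    MUTATOR_FLAG_BEST_EFFORT = 1   // eventual: allow redirect to a proxy log
  };

  // Hypothetical usage:
  //   TableMutatorPtr m = table->create_mutator(MUTATOR_FLAG_BEST_EFFORT);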
>
> > I think it's a good idea to have a design for eventual consistency in mind
> > for now and implement as required post 1.0.
>
> I was just sick of people picking on the write availability issue,
> which was brought up in about every conversation about Hypertable :)
> Eventual consistency is easier to build on top of real consistency,
> not vice versa.
>
> __Luke
>
> > -Sanjit
> >
> > On Sun, Sep 13, 2009 at 3:54 PM, Luke <[email protected]> wrote:
> >>
> >> > On Sun, Sep 13, 2009 at 2:13 PM, Doug Judd <[email protected]> wrote:
> >> > This looks like a nice way to add eventual consistency to Hypertable.  I
> >> > like the fact that once it makes it into the proxy log it guarantees that
> >> > the write will eventually make it into the system.  The only issue I see
> >> > is that updates for a cell could get written out-of-order.  The client
> >> > could end up writing a newer version of a cell before the proxy writer
> >> > gets a chance to write the older version.  The application can just write
> >> > self-ordering entries using a monotonically increasing sequence number to
> >> > solve this problem.
> >>
> >> Yeah, the client or the proxy (when writing to the proxy log) can fill out
> >> the revision/timestamp field of the cells.
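
As an illustration of the self-ordering idea above, here is a minimal sketch
of a client-side (or proxy-side) writer that stamps every cell with a
monotonically increasing revision, so a late replay of an older value can
never shadow a newer one; the Cell struct and field names are hypothetical:

  // Sketch only: the real cell/revision representation will differ.
  #include <atomic>
  #include <cstdint>
  #include <string>

  struct Cell {
    std::string row, column, value;
    int64_t revision;   // highest revision wins when versions collide
  };

  class SelfOrderingWriter {
  public:
    Cell make_cell(std::string row, std::string column, std::string value) {
      // fetch_add gives each cell a unique, strictly increasing revision.
      return Cell{std::move(row), std::move(column), std::move(value),
                  next_revision_.fetch_add(1)};
    }
  private:
    std::atomic<int64_t> next_revision_{1};
  };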
> >>
> >> > I do question the need for eventual consistency.  I feel that this
> >> > "concern" is theoretical.  The problem is that people do not have a
> >> > well-implemented Bigtable to try out.  I suspect that this perceived
> >> > problem is much less of an issue than people think.  Amazon developed
> >> > this concept for their shopping cart.  If once every 1000 shopping cart
> >> > updates the system spun for 30 seconds with a message "System busy",
> >> > would you really care?  If 999 times out of 1000 the shopping cart
> >> > updated instantly, you would perceive the system as highly available.
> >>
> >> I'm with you on this one (shopping cart); I personally would suspect
> >> my net connection first :)  OTOH, if I'm a front-end/application
> >> programmer who wants to log stuff directly into Hypertable and doesn't
> >> really care about consistency (must log the transactions but won't
> >> read them until batch processing later), having to make sure the call
> >> doesn't time out and lose the transaction in the log is very annoying.
> >> I'd choose a back-end that makes my life easier.
> >>
> >> > I think we should wait on this until it is determined to be a real
> >> > problem, not a theoretical one.  It might also be a worthy exercise to
> >> > do a back-of-the-envelope calculation based on failure rate data to
> >> > determine the real impact of failures on availability.
> >>
> >> I think the choice really belongs to the users. I'd suggest that we
> >> add "multiple path write proxy" (MPWP) feature (easy to implement and
> >> TBD of course) to the slides to assuage people's irrational (or not)
> >> fear about write latency under recovery :)
> >>
> >> __Luke
> >>
> >> > - Doug
> >> >
> >> > On Sat, Sep 12, 2009 at 1:37 PM, Luke <[email protected]> wrote:
> >> >>
> >> >> One of the biggest "concerns" from potential "real-time" users of
> >> >> Hypertable is the write latency spike when some nodes are down and
> >> >> being recovered.  Read latency/availability is usually masked by the
> >> >> caching layer.
> >> >>
> >> >> Cassandra tries to solve the problem by using "hinted handoff" (write
> >> >> data tagged with its destination to an alternative node when the
> >> >> destination node is down).  Of course this mandates relaxing the
> >> >> consistency guarantee to "eventual", which is a trade-off many are
> >> >> willing to make.
> >> >>
> >> >> I just thought that it's not that hard to implement something similar
> >> >> in Hypertable and give users a choice between immediate and eventual
> >> >> consistency:
> >> >>
> >> >> When a mutator is created with a BEST_EFFORT/EVENTUAL_OK flag, instead
> >> >> of retrying writes in the client when a destination node is down, it
> >> >> tries to write to an alternative range server with a special update
> >> >> flag, which persists the writes to a proxy log.  The maintenance
> >> >> threads on the alternative range server will try to empty the proxy
> >> >> log by retrying the writes.  Alternative range servers can be picked
> >> >> using either a random scheme (sort the server list by the md5 of each
> >> >> server's IP address; the alternatives are the next n servers) or a
> >> >> location-aware (data center/rack) scheme.  Note that this approach
> >> >> works even if the alternative node dies before its proxy logs are
> >> >> cleared.
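
To make the "next n servers" selection concrete, here is a minimal sketch; it
uses std::hash as a stand-in for the md5-of-IP ordering described above, and
the function and parameter names are hypothetical:

  // Sketch only: order servers by a hash of their address (md5 of the IP in
  // the original suggestion), then take the n servers that follow the
  // unavailable primary in that ordering, wrapping around the ring.
  #include <algorithm>
  #include <cstddef>
  #include <functional>
  #include <string>
  #include <vector>

  std::vector<std::string> pick_alternatives(std::vector<std::string> servers,
                                             const std::string &primary,
                                             std::size_t n) {
    std::sort(servers.begin(), servers.end(),
              [](const std::string &a, const std::string &b) {
                return std::hash<std::string>{}(a) < std::hash<std::string>{}(b);
              });
    std::vector<std::string> alts;
    auto it = std::find(servers.begin(), servers.end(), primary);
    if (it == servers.end())
      return alts;                       // primary not in the list
    std::size_t start = static_cast<std::size_t>(it - servers.begin());
    for (std::size_t i = 1; i <= n && i < servers.size(); ++i)
      alts.push_back(servers[(start + i) % servers.size()]);
    return alts;
  }

A location-aware variant would simply filter or reorder the candidates by
data center/rack before taking the next n.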
> >> >>
> >> >> Thoughts?
> >> >>
> >> >> __Luke
> >> >>
