Yeah... cancel mostly serves to unload the cluster for whatever comes
next.  It can also help keep a node from going postal, but that mostly
affects the next query, not the current one.
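
To make the "cancel" versus "ignore" distinction concrete, here is a rough
sketch of sending one fragment to two workers, taking whichever answers
first, and cancelling the straggler so that it stops using capacity.
Everything in it is hypothetical: run_fragment, hedged_query, the node
names, and the sleep-based delays are stand-ins for real remote calls, not
anything from the paper or an existing code base.

```python
import asyncio


async def run_fragment(worker, fragment, delay):
    # Hypothetical stand-in for a remote worker; the sleep models its latency.
    await asyncio.sleep(delay)
    return f"{fragment} answered by {worker}"


async def hedged_query(fragment, replicas):
    # replicas maps an (assumed) worker name to its simulated latency in seconds.
    tasks = {
        asyncio.create_task(run_fragment(worker, fragment, delay)): worker
        for worker, delay in replicas.items()
    }
    # Return as soon as any copy of the fragment has finished.
    done, pending = await asyncio.wait(
        tasks.keys(), return_when=asyncio.FIRST_COMPLETED
    )
    # The "cancel" step: stop the redundant copies so they free up the
    # cluster for later queries. Skipping this loop is the "ignore" variant.
    cancelled = []
    for task in pending:
        task.cancel()
        cancelled.append(tasks[task])
    # Wait for the cancellations to be delivered before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    winner = next(iter(done)).result()
    return winner, sorted(cancelled)


def demo():
    # node-a is fast, node-b is the straggler.
    return asyncio.run(
        hedged_query("fragment-7", {"node-a": 0.01, "node-b": 0.5})
    )
```

Note that the caller gets its answer at the same time with or without the
cancel loop; the cancel only keeps the slow node from finishing work nobody
wants, which is the point about the next query above.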

On Tue, Sep 11, 2012 at 6:54 PM, Jason Frantz <[email protected]> wrote:

> Definitely agree with many of the points in the link.
>
> The PowerDrill paper also mentions a variant of this where each query
> fragment is sent to two machines, and the results for that fragment are
> used from whatever machine responds first. So in that case it's not so much
> a "cancel" as an "ignore".
>
> On Tue, Sep 11, 2012 at 11:37 AM, Ted Dunning <[email protected]>
> wrote:
>
> > Headed into Thursday's meetup, this paper by Jeff Dean provides a very
> > good description of strategies for getting fast response times with
> > variable quality infrastructure.
> >
> > http://research.google.com/people/jeff/latency.html
> >
> > The key point here is that it is very important to have asynchronous
> > queries with a cancel.  Above that level, there needs to be a simple
> > strategy for pushing second versions of queries out to the workers and
> > canceling defunct or redundant queries.
> >
>
