Scan Server discussion [WAS: Re: 2.1 Release TODO]

Christopher Mon, 04 Apr 2022 09:32:04 -0700

On Mon, Apr 4, 2022 at 11:50 AM Keith Turner <ke...@deenlo.com> wrote:
>
> On Mon, Apr 4, 2022 at 11:17 AM Christopher <ctubb...@apache.org> wrote:
> >
> > However, I'm reluctant to include #2422, because I don't think it's near
> > ready enough, and by the time it is, it will be very last minute, and I
> > don't want to delay 2.1 further for it. Even if it's included as an
> > experimental feature, I think it has huge potential to be disruptive, or to
> > have a lot of churn by the time people actually have a chance to review it
> > thoroughly. Furthermore, I think there are possible alternatives (like a
> > fully client-side implementation, based on offline scanners) that would
> > avoid the tight coupling of a new service to Accumulo's core code. This
>
> There are some advantages to scan servers over direct file access to
> consider.  One is scalability of computation, if a web server is
> serving N client queries with scan servers those can potentially go to
> different scan servers.  With direct file access, all N queries and
> their iterator stacks would have to run in the web server.  Another is
> scalability of caching/memory.  When web servers send queries to scan
> servers using a sticky algorithm for assigning tablets to groups of
> scan servers, it could lead to good cache utilization and sharing that
> may not be possible when running scans directly in the web server. So
> scan servers allow scaling cache and computations for queries
> independently of web servers in a way that may not be possible with
> direct file access.
>
> Another advantage to consider is isolation.  With direct file access
> and queries running directly in a web server, a bad query could bring
> down a web server and lots of unrelated queries.  Having a bad query
> bring down a scan server may be less disruptive.
>

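For anyone unfamiliar with the "sticky algorithm" mentioned in the
quoted text above: one way such an assignment could work is rendezvous
(highest-random-weight) hashing, where each tablet always maps to the
same small group of scan servers as long as those servers stay up. The
following is only a minimal sketch under that assumption, not what
#2422 implements. The class name, the method names, and the use of
CRC32 as the hash are all hypothetical choices made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical helper, not part of Accumulo: picks a stable group of
// scan servers for a tablet using rendezvous hashing.
final class StickyScanServerChooser {

  // Score a (tablet, server) pair. CRC32 is an assumed stand-in; any
  // hash that is stable across client processes would do.
  static long score(String tabletId, String server) {
    CRC32 crc = new CRC32();
    crc.update((tabletId + "|" + server).getBytes(StandardCharsets.UTF_8));
    return crc.getValue();
  }

  // Rank all servers by their score for this tablet (highest first)
  // and keep the top groupSize. Every client computes the same group,
  // so scans of one tablet keep landing on the same servers' caches.
  static List<String> groupFor(String tabletId, List<String> servers, int groupSize) {
    List<String> ranked = new ArrayList<>(servers);
    ranked.sort((a, b) -> {
      int c = Long.compare(score(tabletId, b), score(tabletId, a));
      return c != 0 ? c : a.compareTo(b); // tie-break on name for determinism
    });
    return new ArrayList<>(ranked.subList(0, Math.min(groupSize, ranked.size())));
  }
}
```

A client would call something like groupFor(tabletId, liveServers, 2)
and send the scan to any member of the returned group. With this
scheme, losing a server that is not in a tablet's group does not move
that tablet, which is what keeps the caches warm.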

I've forked this thread into its own discussion with a new subject
line, because, as I suggested in my original reply, my intent was not
to hijack the 2.1 planning thread with a discussion of the ScanServer
implementation details.

I'm fine with all those benefits (even if all the "could" and "may"
were turned into concrete "will"). My objection is not an objection to
the feature. It's an objection to including the feature in 2.1, based
on:

* readiness of the feature branch,
* availability of time to review/test such a big feature without delaying 2.1,
* its tight coupling to the core code in the implementation, and
* the fact that less tightly coupled solutions offering the above
benefits have not yet been explored.

I would be more okay with including it if:

* it is ready,
* it has been tested and reviewed by the wider community, and
* its coupling to the core Accumulo code is loosened, ideally so that
it uses only the API/SPI and could be released as a separate,
optional add-on. This might require improvements to API/SPI to expose
the features needed to help it function. This could also be done by
sub-classing the AccumuloClient. My concern here is the risk of
technical debt and the extra maintenance costs of increased complexity
for optional features that go unmaintained.

We've been hurt before by the premature inclusion of
optional/experimental features that were rushed to release. No matter
how awesome the
feature is... if it's niche and optional, we should consider these
risks and work to mitigate them. Otherwise, we'll be stuck with the
technical debt for years to come. With a little bit of caution, we can
make the feature available, without rushing, to satisfy the use case
while reducing the risks.

Also, one point of clarification: when I say "fully client side", I
only mean relative to Accumulo, not necessarily in the client process.
I'm lacking vocabulary to describe what I mean. As I understand it,
the current client code has been modified to connect to ScanServers
sitting off to the side of TabletServers, and the ScanServers are
basically modified TabletServers with less functionality. What I mean
is that instead of coupling the ScanServer to the TabletServer
implementation, and coupling the ScanServer client to the
AccumuloClient, there could be less coupling. The ScanServer itself
could behave like a client to Accumulo and/or HDFS (and maybe even
share some library code that we make public API, like RFile readers)
and it could have its own client (this is just one very rough outline
of an idea that could be explored). That way, the entire thing could
be removed without any change in Accumulo's code, to make it truly
optional (as in, optional to even have on the class path).
