Thanks, Sean. This was very helpful.

Sean Busbey wrote:
altering the public API (as a standalone change)

I am very conservative about changes to our public API. I have had to write
code that works across Accumulo versions and even when only working with
the public API is it painful on almost every release. I have also seen
plenty of cases where deployments of Accumulo lag behind version changes
because of this same issue.

I fully support that major version changes are there for us to break APIs.
That doesn't mean that such breakage doesn't have a high cost that needs to
be weighed carefully.

In general, for me to be positive on an API change there needs to be a
compelling improvement or a correctness issue. For me, that correctness
issue is the race condition on property updates you mention.

While I acknowledge that a per-table configured load balancer can't be set
prior to the initial tablet assignment without this change, I don't think
that out weighs having to update all table creation code. (this will be the
case even if we go through a deprecation cycle. the cycle just draws it
out.) Granted, I have never had cause to use a per-table load balancer that
was hampered by where the initial tablet was assigned. So I might just not
know the level of pain that fix would address.

  implementing a partial fix for the larger problem

This fix leaves those who need a consistent view of some set of properties
with only options that involve table creation, either in this changed API
or via Keith's clone table work around on an offline table. That means if
you need to alter a constraint or an iterator (that needs to be used
universally), you have the same negative consequence as some of the
proposed solutions while also having to change the underlying table id. More
importantly, I think it leaves people prone to not realizing that their
change to these things won't be uniformly enforced during
the propagation period.

We could mitigate some of the gotchas around when property updates need to
be strongly consistent with better docs, but that only helps when people
are at the point of reading docs. I'd rather stop them from self inflicted
pain directly because people often don't have the time to read docs. I see
stopping this correctness problem as worth the behavior change something
like "only update properties when tables are offline" would necessitate.  But
this vote isn't about the merits of that particular alternative.

I'm willing to consider other options for solving the consistent property
issue and have tried to work through them in the comments around
ACCUMULO-3176. I just don't believe the one meets my standard for
inflicting API pain.

On Tue, Nov 25, 2014 at 1:16 PM, Josh Elser<josh.el...@gmail.com>  wrote:

Can I pick your brain some more? (also, I'm sorry if this is already
addressed in the JIRA comments somewhere. I'm being lazy)

As we know, there is a larger problem of ensuring consistent configuration
across all servers in the cluster. I don't think there is any contention
here. There are ways we can do this, but no one has done it yet. That's the
general problem, and, I believe, what you mean by the "underlying issue".

A subset of that problem is configuration updates for a table which is
newly created. In my experience, this is often how I get bitten by this
race condition -- I create a table, I updated some property (typically set
an iterator/combiner/constraint), and then did some insert/scan before the
tabletserver I communicated with got my property update.

I see this taking what is a technically difficult problem (assumption,
since no one has done it yet) and making the problem partially better. In
practice for me, this also means that how I often encountered this race
condition is addressed.

I also don't see the changes that Jenna wrote as a blocker to implementing
a "proper" blocking configuration update.

Can you clarify your level of concern with this change in: altering the
public API (as a standalone change), implementing a partial fix for the
larger problem, and the combination of the previous two points? It would be
much appreciated.



Sean Busbey wrote:

-1

This change alters our public API while not solving the underlying issue
of
race conditions in property updates.

On Tue, Nov 25, 2014 at 11:14 AM, Christopher<ctubb...@apache.org>
wrote:

  Committers, this is a consensus vote on whether or not to include Jenna's
patch for ACCUMULO-3176 to the 1.7.0-SNAPSHOT (master) branch.

This patch improves the table creation API with the introduction of a
NewTableConfiguration object (similar to the pattern for
BatchWriterConfig), which allows us to be flexible on improving table
creation options in the future without creating many overloaded methods
(as
the table creation API has been plagued by in the past). The main goal of
the patch is to allow table properties to be set on a table at the time
of
creation, before any tablets are assigned, but it also lays the
foundation
for future table creation improvements. Creating initial table properties
was already present in the RPC calls, but not exposed in the API. This
can
support a number of use cases.

Since an objection was raised by Sean Busbey (and a veto expected), I've
initiated this vote in lieu of applying the patch under lazy consensus so
that any veto votes can be justified and addressed here.

Note: there are a few bugs in the Mock implementation of this that I've
fixed, as well as some minor deprecation warnings and javadoc
improvements
I'm adding, please apply your vote under the assumption that those will
be
fixed before it will be applied.

Please vote in accordance with the bylaws for consensus voting.
My vote is +1.

Thanks.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii






Reply via email to