Re: [DISCUSS] API changes to provide resource cleanup

Josh Elser Thu, 02 Jan 2014 11:41:29 -0800

+1 Also agree with Keith and Christopher's ideas.

On 1/2/14, 2:01 PM, Bill Havanki wrote:

I concur with Keith and Christopher. I like the idea of taking time to do
it right in 1.7.0.


Bill


On Thu, Jan 2, 2014 at 1:51 PM, Christopher <[email protected]> wrote:

I agree with Keith. Thanks for summarizing, Sean.

I also favor option #2 for all existing versions, up through 1.6.0.

For 1.7.0, I strongly favor a new client API that addresses lifecycle
management of connection resources directly in the API. Specifically,
I propose moving static connection state to a ConnectionResources
object that is Closeable and can be provided to an instance. For
backwards compatibility, the implementation of the current API can be
made to use a singleton instance of this object, rather than static
state. It follows then, that "The Hammer" solution (#2) would simply
be modified in its implementation to close this singleton instance,
for backwards compatibility with earlier iterations of "The Hammer".

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Thu, Jan 2, 2014 at 1:27 PM, Keith Turner <[email protected]> wrote:

I really like the summary of the discussion, very thorough and concise.

am in favor of #2 for 1.4.5, 1.5.1, and 1.6.0.  Also I would be willing

to

do the revert work for close().

If we go with option #2, what should we do for 1.7.0-SNAPSHOT?  If

someone

really wants to pursue adding close, we could leave things as is in
1.7.0-SNAPSHOT.  If no one is going to pursue it, then we should revert

it

in 1.7.0-SNAP rather than leave something thats partially done.




On Thu, Jan 2, 2014 at 12:47 PM, Sean Busbey <[email protected]
wrote:

Hey Folks!

We need to come to some conclusions on what we're going to do for

resource

clean up. I'll attempt to summarize the situation and various options.

If I

missed something from our myriad of tickets and mailing list threads,
please bring it up.

Brief Background:

The existing client APIs presume that a large amount of global state

will

persist for the duration of a JVM instance. This is at odds with

lifecycle

management in application containers, where a JVM is very long lived and
user provided applications are stood up and torn down. We have reports

of

this causing OOM on JBoss[1] and leaked threads on Tomcat[2].

We have two possible solutions, both of which Jared Winick has kindly
verified solve the problem, as seen on JBoss.

----
= Proposed solution #1: Closeable Instance

The first approach adds a .close method to Instance so that users can

say

when they are done with a given instance. Internally, reference counting
determines when we tear down global resources.

Advantages:
   * States via code where a client should do lifecycle management.
   * Allows shutting down just some of the resources used.
   * Is already in the code base.

Disadvantages:
   * Since lifecycle is getting added post-hoc, we are more likely to

have

maintenance issues as we find other side effects we hadn't considered,

like

the multithreaded issue that already came up[3].
   * Changes Instance from representing static configuration to shared

state

   * Doesn't work with the fluent style some of our APIs encourage.
   * closed semantics probably aren't consistently enforced (e.g. users
trying to use a BatchWriter that came from a now-closed instance should
fail)

To finish, we'd need to
   * Verify multithreaded handling is done without too much of a

performance

impact[3]
   * Finish making our internal use consistent with the lifecycle we're
telling others to use[4]
   * Possibly add tests to verify consistent enforcement of closing on
objects derived from Instance

= Proposed solution #2: Global cleanup utility, aka The Hammer

As a band-aid to allow for "unload resources" without making changes to

the

API we instead provide a utility method that cleans up all global
resources.

Advantages:
   * Doesn't change API or meaning for Instance
   * Can be used on older Accumulo deployments w/o patch/rebuild cycle

Disadvantages:
   * Only allows all-or-nothing cleanup
   * Doesn't address our underlying lack of lifecycle
   * Requires reverts

To finish, we'd need to
   * revert commits from old solution (I haven't checked how many

commits,

but it's 6 tickets :/ )
   * port code from PoC to main codebase (asf grants, etc) [6]
   * add some kind of test (functional/IT?)

-----

We need to decide what we're going to provide as a placeholder for

releases

already frozen on API (i.e. 1.4, 1.5, 1.6*) as well as longer term.

Personally, my position is that we should use the simplest change to

handle

the published versions (solution #2).

Obviously there are outstanding issues with how we deal with global

state

and shared resources in the current client APIs. I'd like to see that
addressed as a part of a more coherent client lifecycle rather than
struggling to make it work while maintaining the current API. Long

term, I

think this means handling things in the updated client API Christopher

has

mentioned a few times.

How close to consensus are we already?

- Sean

[1]: https://issues.apache.org/jira/browse/ACCUMULO-1379
[2]: https://issues.apache.org/jira/browse/ACCUMULO-1697
[3]: https://issues.apache.org/jira/browse/ACCUMULO-2027
[4]: https://issues.apache.org/jira/browse/ACCUMULO-1923
[6]: https://issues.apache.org/jira/browse/ACCUMULO-2113

*: I think 1.6 should be in this list because we are at feature freeze,

and

any proper changes to lifecycle management are likely to constitute an
addition that wouldn't pass that restriction. Mike Drob suggested in

chat

that given the current state of 1.6 a more intrusive fix might be
acceptable.

Re: [DISCUSS] API changes to provide resource cleanup

Reply via email to