Also, if anybody is interested in a live video conversation to discuss this interactively, I intend to be on Slack on Wednesday afternoon (EDT) starting around noon.
On Mon, Sep 14, 2020 at 5:30 PM Christopher <ctubb...@apache.org> wrote:
>
> Hi Accumulo Devs,
>
> Lately, Dave Marion (Apache ID: dlmarion) has been working on
> prototyping some new class loader concepts for Accumulo that he and I
> have discussed, and I wanted to pitch the idea here for consideration
> for the project.
>
> # Background:
>
> Accumulo currently has two classloaders that are instantiated at
> startup, and which can be used to bootstrap Accumulo dependencies (at
> least, those not needed for the classloader code itself). This allows
> us to use the `general.classpaths`[1] and
> `general.dynamic.classpaths`[2] properties, as well as the per-context
> classloaders (`general.vfs.*`[3] and `table.classpath.context`[4]) for
> things like iterator class isolation. Since 2.0.0, we have deprecated
> `general.classpaths` and `general.dynamic.classpaths`, the former
> supplanted by the better use of the `CLASSPATH` environment variable
> (along with much improved scripts in 2.0.0), and the latter being
> replaceable by a user-provided class loader using the built-in Java
> property, `java.system.class.loader`[5], at their discretion.
>
> # The Problem:
>
> The main problem with the current code is: complexity. Accumulo is
> already complex enough without needing to be in the business of
> developing and supporting complex custom class loading features,
> especially when users have viable alternatives that can be better
> supported by independent, dedicated projects. Furthermore, these
> custom class loaders also have a dependency on commons-vfs2, which has
> been the source of numerous problems and bugs that we have needed to
> deal with, and which affect Accumulo, even though they are not
> necessarily bugs in Accumulo itself. This also brings in a lot of
> optional dependencies that aren't needed by users who don't rely on
> these features.
>
> # The Requirements:
>
> In spite of these problems, I believe we still want to enable the use
> cases that our classloaders are currently enabling.
>
> Specifically,
> 1) the ability to have separate contexts for iterator class isolation
> (A/B testing of iterators, updating iterators in a live system, etc.),
> and
> 2) the ability for users to bootstrap their class path from some other
> distributed storage than local disk.
>
> # The Proposal:
>
> 1. Create a new reloading vfs class loader, with functionality similar
> to our two current class loaders that do the reloading and provide vfs
> features, that can be easily used as a system class loader, if the
> user chooses to, and deprecate (for removal in 3.0) the built-in
> implementations. This class loader could not only be used with
> Accumulo, but it could also be used by any other project that chooses
> to use it, because it will not have many, if any, dependencies beyond
> commons-vfs2, and will certainly not depend on Accumulo. Creating this
> separate class loader provides us a path forward to simplify Accumulo
> by removing these features from Accumulo directly (the properties are
> already deprecated), and enabling it to be maintained independently.
> 2. Create a new class loader factory property in Accumulo, with
> corresponding SPI interface, for users to provide their own
> implementation of a class loader factory, that can map a per-table
> "context" to a ClassLoader of the implementation's choosing.
>
> Doing these two things will allow us to support user class loading
> needs more flexibly, without being directly responsible for class
> loading implementations inside Accumulo's core code. All the same
> functionality that is available today will continue to exist, but
> will be configured differently. The resulting code in Accumulo will be
> dramatically simpler, as we would no longer have any complex class
> loading implementations in our code base, and we would no longer have
> any direct dependency on commons-vfs2, which has been problematic.
> Independent implementations may use commons-vfs2, or something else,
> but will be more easily testable and maintainable as independent
> projects that are pluggable in Accumulo.
>
> Dave has already been working on prototyping these proposed changes,
> and it is looking very feasible.
>
> We are now ready to:
> 1. get feedback on the overall proposal, and
> 2. decide on where to maintain the separate class loader.
>
> For where to maintain, the options seem to be: A) try to donate to
> commons-vfs2 OR B) maintain as a new repository,
> accumulo-vfs-classloader.
>
> Note that we have not yet proposed the idea of a user-facing,
> configurable, reloading vfs classloader to the commons-vfs2
> developers. We wanted to get our own community's feedback on this
> first.
>
> Please discuss.
>
> Thanks,
> Christopher (in collaboration with Dave)
>
> [1]:
> https://accumulo.apache.org/docs/2.x/configuration/server-properties#general_classpaths
> [2]:
> https://accumulo.apache.org/docs/2.x/configuration/server-properties#general_dynamic_classpaths
> [3]:
> https://accumulo.apache.org/docs/2.x/configuration/server-properties#general_vfs_context_classpath_prefix
> [4]:
> https://accumulo.apache.org/docs/2.x/configuration/server-properties#table_classpath_context
> [5]:
> https://docs.oracle.com/javase/8/docs/api/java/lang/ClassLoader.html#getSystemClassLoader--
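
To make item 2 of the proposal a little more concrete, here is a rough sketch of what a per-context class loader factory might look like. All names in it (`ContextClassLoaderFactory`, `UrlContextClassLoaderFactory`, `getClassLoader`) are hypothetical and chosen only for illustration; this is not the prototype mentioned above, and it uses the JDK's `URLClassLoader` as a stand-in rather than anything based on commons-vfs2. It also does no reloading.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical SPI: maps a per-table "context" name to a ClassLoader.
// The interface and method names are assumptions made for this sketch.
interface ContextClassLoaderFactory {
  ClassLoader getClassLoader(String contextName);
}

// Hypothetical implementation backed by plain URLClassLoaders.
// Each context gets its own loader, so two contexts can hold different
// versions of the same iterator class without interfering.
final class UrlContextClassLoaderFactory implements ContextClassLoaderFactory {
  private final Map<String, URL[]> contextUrls;
  private final Map<String, ClassLoader> loaders = new ConcurrentHashMap<>();

  UrlContextClassLoaderFactory(Map<String, URL[]> contextUrls) {
    // Defensive copy so later changes to the caller's map are not visible here.
    this.contextUrls = new HashMap<>(contextUrls);
  }

  @Override
  public ClassLoader getClassLoader(String contextName) {
    URL[] urls = contextUrls.get(contextName);
    if (urls == null) {
      throw new IllegalArgumentException("Unknown context: " + contextName);
    }
    // Create the loader on first use and cache it, so repeated lookups for
    // the same context return the same ClassLoader instance.
    return loaders.computeIfAbsent(contextName,
        name -> new URLClassLoader(urls.clone(),
            ContextClassLoaderFactory.class.getClassLoader()));
  }
}
```

Under these assumptions, a table configured with context "a" would always resolve its iterator classes through the loader returned by `getClassLoader("a")`, while a table on context "b" would get a separate loader. An independent project could supply a different implementation (for example, one that reloads or reads from distributed storage) without Accumulo's core code needing to change.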