I don't really know enough about classloaders to contribute much to the 
discussion. But I am definitely in favor of changes that seek to simplify 
the core codebase. If you and Dave feel the update is feasible, I would support 
it.

-----Original Message-----
From: Christopher <ctubb...@apache.org> 
Sent: Monday, September 14, 2020 5:30 PM
To: accumulo-dev <dev@accumulo.apache.org>
Subject: [DISCUSS] Classloader change proposals

Hi Accumulo Devs,

Lately, Dave Marion (Apache ID: dlmarion) has been working on prototyping some 
new class loader concepts for Accumulo that he and I have discussed, and I 
wanted to pitch the idea here for consideration for the project.

# Background:

Accumulo currently has two classloaders that are instantiated at startup, and 
which can be used to bootstrap Accumulo dependencies (at least, those not 
needed for the classloader code itself). This allows us to use the 
`general.classpaths`[1] and `general.dynamic.classpaths`[2] properties, as well 
as the per-context classloaders (`general.vfs.*`[3] and 
`table.classpath.context`[4]) for things like iterator class isolation. Since 
2.0.0, we have deprecated `general.classpaths` and 
`general.dynamic.classpaths`, the former supplanted by the better use of the 
`CLASSPATH` environment variable (along with much improved scripts in 2.0.0), 
and the latter being replaceable by a user-provided class loader using the 
built-in Java property, `java.system.class.loader`[5], at their discretion.
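
To illustrate that last option: a class named by `java.system.class.loader` must define a public constructor taking a single `ClassLoader` parameter, which the JVM uses as the delegation parent. A minimal sketch follows. The class name `ExampleSystemClassLoader` is hypothetical, and a real replacement for `general.dynamic.classpaths` would add its own resource locations and reloading logic.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical example class; not part of Accumulo or of any prototype.
public class ExampleSystemClassLoader extends URLClassLoader {

  // The JVM looks for a public constructor with a single ClassLoader
  // parameter and passes in the default system class loader as the parent.
  public ExampleSystemClassLoader(ClassLoader parent) {
    // No extra URLs here; a real loader would add its own locations.
    super(new URL[0], parent);
  }

  public static void main(String[] args) {
    // Prints the class name of whichever system class loader is active.
    System.out.println(ClassLoader.getSystemClassLoader().getClass().getName());
  }
}
```

It could be enabled with something like `java -Djava.system.class.loader=ExampleSystemClassLoader ExampleSystemClassLoader`, provided the class is on the regular class path.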

# The Problem:

The main problem with the current code is: complexity. Accumulo is already 
complex enough without needing to be in the business of developing and 
supporting complex custom class loading features, especially when users have 
viable alternatives that can be better supported by independent, dedicated 
projects. Furthermore, these custom class loaders also have a dependency on 
commons-vfs2, which has been the source of numerous problems and bugs that we 
have needed to deal with, and which affect Accumulo, even though they are not 
necessarily bugs in Accumulo itself. This also brings in a lot of optional 
dependencies that aren't needed by users who don't rely on these features.

# The Requirements:

In spite of these problems, I believe we still want to enable the use cases 
that our classloaders are currently enabling.

Specifically,
1) the ability to have separate contexts for iterator class isolation (A/B 
testing of iterators, updating iterators in a live system, etc.), and
2) the ability for users to bootstrap their class path from some other 
distributed storage than local disk.

# The Proposal:

1. Create a new reloading vfs class loader, with functionality similar to that 
of our two current class loaders that do the reloading and provide vfs features, 
that can be easily used as a system class loader, if the user chooses to, and 
deprecate (for removal in 3.0) the built-in implementations. This class loader 
could be used not only with Accumulo, but also by any other 
project that chooses to use it, because it will have few, if any, 
dependencies beyond commons-vfs2, and will certainly not depend on Accumulo. 
Creating this separate class loader provides us a path forward to simplify 
Accumulo by removing these features from Accumulo directly (the properties are 
already deprecated), and enabling it to be maintained independently.
2. Create a new class loader factory property in Accumulo, with corresponding 
SPI interface, for users to provide their own implementation of a class loader 
factory, that can map a per-table "context" to a ClassLoader of the 
implementation's choosing.
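
For illustration only, the factory in item 2 might look something like the sketch below. None of these names come from the prototype: `ContextClassLoaderFactory`, `getClassLoader`, and `UrlContextClassLoaderFactory` are hypothetical, and the use of `URLClassLoader` is an assumption. An actual implementation could be backed by commons-vfs2 or anything else.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical SPI shape; the real interface name and method may differ.
interface ContextClassLoaderFactory {
  ClassLoader getClassLoader(String contextName);
}

// Hypothetical implementation: one cached URLClassLoader per context name.
class UrlContextClassLoaderFactory implements ContextClassLoaderFactory {
  private final Map<String, URL[]> contextUrls;
  private final Map<String, ClassLoader> cache = new ConcurrentHashMap<>();

  UrlContextClassLoaderFactory(Map<String, URL[]> contextUrls) {
    this.contextUrls = contextUrls;
  }

  @Override
  public ClassLoader getClassLoader(String contextName) {
    URL[] urls = contextUrls.get(contextName);
    if (urls == null) {
      throw new IllegalArgumentException("Unknown context: " + contextName);
    }
    // Reuse the same loader for repeated lookups of the same context.
    return cache.computeIfAbsent(contextName,
        name -> new URLClassLoader(urls, ClassLoader.getSystemClassLoader()));
  }
}

class ContextDemo {
  public static void main(String[] args) {
    Map<String, URL[]> contexts = new HashMap<>();
    contexts.put("iteratorsA", new URL[0]); // placeholder; real jar URLs go here
    contexts.put("iteratorsB", new URL[0]);
    ContextClassLoaderFactory factory = new UrlContextClassLoaderFactory(contexts);
    ClassLoader a = factory.getClassLoader("iteratorsA");
    ClassLoader b = factory.getClassLoader("iteratorsB");
    System.out.println(a != b); // prints "true": each context has its own loader
  }
}
```

Giving each context its own loader is what would allow two versions of the same iterator class to coexist, as in the A/B testing use case above.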

Doing these two things will allow us to more flexibly support 
user class loading needs, without being directly responsible for class loading 
implementations inside Accumulo's core code. All the same functionality that is 
available today will continue to exist, but will be configured differently. The 
resulting code in Accumulo will be dramatically simpler, as we would no longer 
have any complex class loading implementations in our code base, and we would 
no longer have any direct dependency on commons-vfs2, which has been 
problematic. Independent implementations may use commons-vfs2, or something 
else, but will be more easily testable and maintainable as independent projects 
that are pluggable in Accumulo.

Dave has already been working on prototyping these proposed changes, and it is 
looking very feasible.

We are now ready to:
1. get feedback on the overall proposal, and
2. decide on where to maintain the separate class loader.

For where to maintain, the options seem to be:
A) try to donate to commons-vfs2, OR
B) maintain as a new repository, accumulo-vfs-classloader.

Note that we have not yet proposed the idea of a user-facing, configurable, 
reloading vfs classloader to the commons-vfs2 developers. We wanted to get our 
own community's feedback on this first.

Please discuss.

Thanks,
Christopher (in collaboration with Dave)

[1]: 
https://accumulo.apache.org/docs/2.x/configuration/server-properties#general_classpaths
[2]: 
https://accumulo.apache.org/docs/2.x/configuration/server-properties#general_dynamic_classpaths
[3]: 
https://accumulo.apache.org/docs/2.x/configuration/server-properties#general_vfs_context_classpath_prefix
[4]: 
https://accumulo.apache.org/docs/2.x/configuration/server-properties#table_classpath_context
[5]: 
https://docs.oracle.com/javase/8/docs/api/java/lang/ClassLoader.html#getSystemClassLoader--
