I don't really know enough about classloaders to contribute much to the discussion, but I am definitely in favor of changes that seek to simplify the core codebase. If you and Dave feel the update is feasible, I would support it.
-----Original Message-----
From: Christopher <ctubb...@apache.org>
Sent: Monday, September 14, 2020 5:30 PM
To: accumulo-dev <dev@accumulo.apache.org>
Subject: [DISCUSS] Classloader change proposals

Hi Accumulo Devs,

Lately, Dave Marion (Apache ID: dlmarion) has been working on prototyping some new class loader concepts for Accumulo that he and I have discussed, and I wanted to pitch the idea here for consideration for the project.

# Background:

Accumulo currently has two classloaders that are instantiated at startup, and which can be used to bootstrap Accumulo dependencies (at least, those not needed for the classloader code itself). This allows us to use the `general.classpaths`[1] and `general.dynamic.classpaths`[2] properties, as well as the per-context classloaders (`general.vfs.*`[3] and `table.classpath.context`[4]) for things like iterator class isolation.

Since 2.0.0, we have deprecated `general.classpaths` and `general.dynamic.classpaths`, the former supplanted by the better use of the `CLASSPATH` environment variable (along with much improved scripts in 2.0.0), and the latter being replaceable by a user-provided class loader using the built-in Java property, `java.system.class.loader`[5], at their discretion.

# The Problem:

The main problem with the current code is: complexity. Accumulo is already complex enough without needing to be in the business of developing and supporting complex custom class loading features, especially when users have viable alternatives that can be better supported by independent, dedicated projects.

Furthermore, these custom class loaders also have a dependency on commons-vfs2, which has been the source of numerous problems and bugs that we have needed to deal with, and which affect Accumulo, even though they are not necessarily bugs in Accumulo itself. This also brings in a lot of optional dependencies that aren't needed by users who don't rely on these features.
# The Requirements:

In spite of these problems, I believe we still want to enable the use cases that our classloaders are currently enabling. Specifically, 1) the ability to have separate contexts for iterator class isolation (A/B testing of iterators, updating iterators in a live system, etc.), and 2) the ability for users to bootstrap their class path from some other distributed storage than local disk.

# The Proposal:

1. Create a new reloading vfs class loader, with similar functionality as our current two-classloaders that do the reloading and provide vfs features, that can be easily used as a system class loader, if the user chooses to, and deprecate (for removal in 3.0) the built-in implementations. This class loader could not only be used with Accumulo, but it could also be used by any other project that chooses to use it, because it will not have much, if any, dependencies beyond commons-vfs2, and will certainly not depend on Accumulo. Creating this separate class loader provides us a path forward to simplify Accumulo by removing these features from Accumulo directly (the properties are already deprecated), and enabling it to be maintained independently.

2. Create a new class loader factory property in Accumulo, with corresponding SPI interface, for users to provide their own implementation of a class loader factory, that can map a per-table "context" to a ClassLoader of the implementation's choosing.

The result of doing these two things will allow us to more flexibly support user class loading needs, without being directly responsible for class loading implementations inside Accumulo's core code. All the same functionality that is available today will continue to exist, but will be configured differently. The resulting code in Accumulo will be dramatically simpler, as we would no longer have any complex class loading implementations in our code base, and we would no longer have any direct dependency on commons-vfs2, which has been problematic.
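
To make item 2 of the proposal more concrete, here is a rough sketch of what a pluggable factory mapping a per-table context name to a ClassLoader might look like. Everything in it is an assumption made for illustration: the interface name `ContextClassLoaderFactory`, the method `getClassLoader`, and the `UrlContextFactory` implementation are hypothetical and are not the SPI being prototyped. The implementation uses a plain `java.net.URLClassLoader` rather than commons-vfs2, and it does no reloading.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ContextClassLoaderSketch {

  /** Hypothetical SPI: maps a per-table context name to a ClassLoader. */
  public interface ContextClassLoaderFactory {
    ClassLoader getClassLoader(String contextName);
  }

  /**
   * Illustrative implementation (an assumption, not Accumulo code): each
   * context gets its own URLClassLoader, created lazily and then cached,
   * so classes loaded for one context are isolated from the others.
   */
  public static final class UrlContextFactory implements ContextClassLoaderFactory {
    private final Map<String, URL[]> contextUrls;
    private final Map<String, ClassLoader> cache = new ConcurrentHashMap<>();

    public UrlContextFactory(Map<String, URL[]> contextUrls) {
      // Defensive copy so later changes to the caller's map have no effect.
      this.contextUrls = new HashMap<>(contextUrls);
    }

    @Override
    public ClassLoader getClassLoader(String contextName) {
      URL[] urls = contextUrls.get(contextName);
      if (urls == null) {
        throw new IllegalArgumentException("Unknown context: " + contextName);
      }
      // Parent is the loader of this class, so core classes still resolve
      // through normal parent-first delegation.
      return cache.computeIfAbsent(contextName,
          name -> new URLClassLoader(urls, ContextClassLoaderSketch.class.getClassLoader()));
    }
  }
}
```

Under these assumptions, a caller would look up the loader for a table's configured context, for example `factory.getClassLoader("ctxA")`, and load iterator classes through it. Two contexts pointing at different jar versions would then yield two independent loaders, which is the isolation that the A/B testing use case relies on.
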
Independent implementations may use commons-vfs2, or something else, but will be more easily testable and maintainable as independent projects that are pluggable in Accumulo.

Dave has already been working on prototyping these proposed changes, and it is looking very feasible. We are now ready to:

1. get feedback on the overall proposal, and
2. decide on where to maintain the separate class loader.

For where to maintain, the options seem to be:

A) try to donate to commons-vfs2 OR
B) maintain as a new repository, accumulo-vfs-classloader.

Note that we have not yet proposed the idea of a user-facing, configurable, reloading vfs classloader to the commons-vfs2 developers. We wanted to get our own community's feedback on this first.

Please discuss.

Thanks,
Christopher (in collaboration with Dave)

[1]: https://accumulo.apache.org/docs/2.x/configuration/server-properties#general_classpaths
[2]: https://accumulo.apache.org/docs/2.x/configuration/server-properties#general_dynamic_classpaths
[3]: https://accumulo.apache.org/docs/2.x/configuration/server-properties#general_vfs_context_classpath_prefix
[4]: https://accumulo.apache.org/docs/2.x/configuration/server-properties#table_classpath_context
[5]: https://docs.oracle.com/javase/8/docs/api/java/lang/ClassLoader.html#getSystemClassLoader--