keith-turner commented on code in PR #35:
URL: 
https://github.com/apache/accumulo-classloaders/pull/35#discussion_r2677913231


##########
modules/local-caching-classloader/README.md:
##########
@@ -20,88 +20,206 @@
 -->
 # Local Caching ClassLoader
 
-The LocalCachingContextClassLoaderFactory is an Accumulo 
ContextClassLoaderFactory implementation that creates and maintains a
-LocalCachingContext. The 
`LocalCachingContextClassLoaderFactory.getClassLoader(String)` method expects 
the method
-argument to be a valid `file`, `hdfs`, `http` or `https` URL to a context 
definition file.
-
-The context definition file is a JSON formatted file that contains the name of 
the context, the interval (in seconds) at which
-the context definition file should be monitored, and a list of classpath 
resources. The LocalCachingContextClassLoaderFactory
-creates the LocalCachingContext based on the initial contents of the context 
definition file, and updates the classloader
-as changes are noticed based on the monitoring interval. An example of the 
context definition file is below.
-
-```
+`LocalCachingContextClassLoaderFactory` implements Accumulo's
+`ContextClassLoaderFactory` SPI. Given a context parameter, supplied as a
+String containing a URL to a remote context definition file, it will produce
+and return `ClassLoader` instances to load classes and resources for a Java
+application, based on the list of resource URLs contained in the remote context
+definition. It will also monitor the URL to the context definition file for any
+changes, at the monitoring interval specified in the context definition file.
+
+## Introduction
+
+This factory allows you to create `ClassLoader` instances that point to locally
+cached copies of remote resource files. In this way, this factory allows you to
+place common resources in a remote location for use across many hosts, but
+without many of the problems that can occur when loading resources from a
+remote location.
+
+This factory uses a storage cache in the local filesystem for any files it
+downloads from a remote URL.
+
+To use this factory, you must store your resource files in a location that can
+be specified by a supported URL, and then you must create a JSON-formatted
+context definition file that contains a monitoring interval (in seconds,
+greater than 0), and a list of resource URLs along with a checksum for each
+resource file. You must then store this context definition file somewhere where
+this factory can download it, and use the URL to that context definition file
+as the `context` parameter for this factory's `getClassLoader(String context)`
+method.
+
+This factory can handle context and resource URLs that use the `file`, `hdfs`,
+`http`, or `https` URL scheme.
+
+Here is an example context definition file:
+
+```json
 {
-    "contextName": "myContext",
-    "monitorIntervalSeconds": 5,
-    "resources": [
-        {
-            "location": "file:/home/user/ClassLoaderTestA/TestA.jar",
-            "checksum": "a10883244d70d971ec25cbfa69b6f08f"
-        },
-        {
-            "location": "hdfs://localhost:8020/contextB/TestB.jar",
-            "checksum": "a02a3b7026528156fb782dcdecaaa097"
-        },
-        {
-            "location": "http://localhost:80/TestC.jar";,
-            "checksum": "f464e66f6d07a41c656e8f4679509215"
-        }
-    ]
+  "monitorIntervalSeconds": 5,
+  "resources": [
+    {
+      "location": "file:/home/user/ClassLoaderTestA/TestA.jar",
+      "checksum": "a10883244d70d971ec25cbfa69b6f08f"
+    },
+    {
+      "location": "hdfs://localhost:8020/contextB/TestB.jar",
+      "checksum": "a02a3b7026528156fb782dcdecaaa097"
+    },
+    {
+      "location": "http://localhost:80/TestC.jar";,
+      "checksum": "f464e66f6d07a41c656e8f4679509215"
+    }
+  ]
 }
 ```
 
+## How it Works
+
+When this factory receives a request for a `ClassLoader` for a given URL, it
+downloads a copy of the context definition file and parses it. If it has
+recently acquired that context definition file, based on the monitoring
+interval from a previous retrieval, it can skip this step and use the
+definition from the earlier retrieval, which is kept up-to-date by the
+background monitoring process that started when it was previously retrieved.
+Once it has the context definition, it then returns a `ClassLoader` instance
+containing the resources defined in that context definition file, first
+downloading any missing resources and verifying them using the checksums in the
+context definition file.
+
+`ClassLoader` instances are stored in a de-deduplicating cache in memory with a
+minimum lifetime of 24 hours. So, no two instances will ever exist in a process
+for the same context definition.
+
+If this context definition had not previously been downloaded, a background
+monitoring task is set up to ensure the URL is watched for any changes to the
+context definition. This monitoring continues for as long as there exists
+`ClassLoader` instances in the system that were constructed from the definition
+file at that URL (at least 24 hours, since that is the minimum time they will
+exist in the de-duplicating cache).
+
+## Local Storage Cache
+
+The local storage cache location is configured by the user by setting the
+Accumulo property named `general.custom.classloader.lcc.cache.dir` to a
+directory on the local filesystem. This location may be specified as an
+absolute path or as a URL representing an absolute path with the `file` scheme.
+
+The selected location should be a persistent location with plenty of space to
+store downloaded resources (usually jar files), and should be writable by all
+the processes which use this factory to share the same resources.
+
+Files downloaded to this cache may be used by multiple threads or processes, so
+be very careful when removing old contents to ensure that they are no longer
+needed.
+
+Do **NOT** use a temporary directory for the local storage cache location.
+
+If a resource file is deleted from the local storage cache while a
+`ClassLoader` exists that references it, that `ClassLoader` may, and probably
+will, stop working correctly.
+
+The layout of the local storage cache is two directories: `contexts` and
+`resources`. Context definition files are stored in the `contexts` directory.
+These are for reference only. This factory will not use them once they are
+written. All of the resources used by all contexts are stored in the `contexts`
+directory.
+
+Both context definition files, and resource files use a naming scheme that
+includes the hash of their contents to avoid conflicts with similarly named
+remote resources.
+
+Multiple contexts, multiple threads, and multiple processes, can use the same
+resource files, once they are downloaded. If the resource file already exists,
+it is assumed that it is safe to use. Checksums are only verified on first
+download.
+
+When a resource file is first downloaded, it downloads to a temporary file, and
+then atomically renames it to the final location. **Important**: this factory
+depends on the local filesystem's ability to atomically move files and will
+only work on filesystems with that capability.
+
+This factory will only download one file at a time for each context. While each

Review Comment:
   The documentation updates look good.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to