dlmarion commented on code in PR #55: URL: https://github.com/apache/accumulo-classloaders/pull/55#discussion_r2746089507
########## modules/local-caching-classloader/README.md: ########## @@ -131,6 +131,17 @@ unexpected behavior to classloaders still using the file. * The local storage cache location **MUST** use a filesystem that supports atomic moves. +The Accumulo property `general.custom.classloader.lcc.allowed.urls.pattern` is +another required parameter, and is used to limit the allowed URLs that can be +fetched when downloading context definitions or context resources. Since the +process using this factory will be using its own permissions to fetch +resources, and placing a copy of those resources in a local directory where +others may access them, this property allows a system administrator to mitigate +file disclosure security vulnerabilities by preventing it from accessing URLs +that it should not copy locally (e.g. `file:/path/to/accumulo.properties` or +`hdfs://host/path/to/accumulo/rfile.rf`). An example value might look like: +`https://example.com/path/to/contexts/.*`. Review Comment: There is an Accumulo Configuration section at the bottom of the README that lists the items needed to get this working. If this is required, then we should mention it there. Given that it's a single property that represents a Java regex for all locations, maybe the example should include `file`, `hdfs`, and `http` locations OR'd together. ########## modules/local-caching-classloader/src/main/java/org/apache/accumulo/classloader/lcc/LocalCachingContextClassLoaderFactory.java: ########## @@ -144,6 +154,17 @@ public void init(ContextClassLoaderEnvironment env) { String graceProp = env.getConfiguration().get(UPDATE_FAILURE_GRACE_PERIOD_MINS); long graceMins = graceProp == null ? 0 : Long.parseLong(graceProp); updateFailureGracePeriodMins = Duration.ofMinutes(graceMins); + // limit the frequency at which we check the config and re-compile the pattern + Supplier<Pattern> allowedUrlsPattern = Suppliers.memoizeWithExpiration( + () -> Pattern.compile(requireNonNull(env.getConfiguration().get(ALLOWED_URLS_PATTERN), + "Property " + ALLOWED_URLS_PATTERN + " not set, no URLs are allowed")), + Duration.ofMinutes(1)); + allowedUrlChecker = (locationType, url) -> { + var p = allowedUrlsPattern.get(); + Preconditions.checkArgument(p.matcher(url.toExternalForm()).matches(), + "%s location (%s) not allowed by pattern (%s)", locationType, url.toExternalForm(), + p.pattern()); + }; Review Comment: The non-null checking of the allowed url pattern property is deferred until the supplier is invoked via the BiConsumer, right? If so, we may want to include a check in init() that logs a warning when it is not set that there are no URLs configured and that this context will not really work as intended until the property is set. ########## modules/local-caching-classloader/src/test/java/org/apache/accumulo/classloader/lcc/LocalCachingContextClassLoaderFactoryTest.java: ########## @@ -168,11 +169,53 @@ public static void afterAll() throws Exception { public void beforeEach() throws Exception { baseCacheDir = tempDir.resolve("base"); ConfigurationCopy acuConf = new ConfigurationCopy( - Map.of(CACHE_DIR_PROPERTY, baseCacheDir.toAbsolutePath().toUri().toURL().toExternalForm())); + Map.of(CACHE_DIR_PROPERTY, baseCacheDir.toAbsolutePath().toUri().toURL().toExternalForm(), + UPDATE_FAILURE_GRACE_PERIOD_MINS, "1", ALLOWED_URLS_PATTERN, ".*")); FACTORY = new LocalCachingContextClassLoaderFactory(); FACTORY.init(() -> new ConfigurationImpl(acuConf)); } + @Test + public void testAllowedUrls() throws Exception { + // use a different factory than other tests; only allow file: URLs + ConfigurationCopy acuConf = new ConfigurationCopy( + Map.of(CACHE_DIR_PROPERTY, baseCacheDir.toAbsolutePath().toUri().toURL().toExternalForm(), + ALLOWED_URLS_PATTERN, "file:.*")); + var factory = new LocalCachingContextClassLoaderFactory(); + factory.init(() -> new ConfigurationImpl(acuConf)); + + // case 1: all URLs pass (normal case, covered by other tests) + + // case 2: context definition URL fails to match the pattern + var ex = assertThrows(ContextClassLoaderException.class, + () -> factory.getClassLoader(hdfsAllContext.toExternalForm())); + assertTrue(ex.getCause() instanceof IllegalArgumentException); + assertTrue(ex.getCause().getMessage().contains("Context definition location (hdfs:")); + + // case 3a: context definition URL matches, but resource URL should fail to match the pattern, + // but it works anyway, because the resources were downloaded already by a different instance + // (in this case, by a less restrictive FACTORY instance) and no new connection is made + FACTORY.getClassLoader(hdfsAllContext.toExternalForm()); + factory.getClassLoader(localAllContext.toExternalForm()); // same resources + + // case 3b: context definition URL matches, but resource URL fails to match the pattern + // in this case, we use a new context definition, with a resource that doesn't exist locally + var newResources = new LinkedHashSet<Resource>(); + var badUrl = "http://localhost/some/path"; + newResources.add(new Resource(new URL(badUrl), "MD5", BAD_MD5)); + var context2 = new ContextDefinition(MONITOR_INTERVAL_SECS, newResources); + var disallowedContext = tempDir.resolve("context-with-disallowed-resource-url.json"); + Files.writeString(disallowedContext, context2.toJson(), StandardOpenOption.CREATE, Review Comment: Will `createContextDefinitionFile` not work here? ########## modules/local-caching-classloader/README.md: ########## @@ -131,6 +131,17 @@ unexpected behavior to classloaders still using the file. * The local storage cache location **MUST** use a filesystem that supports atomic moves. +The Accumulo property `general.custom.classloader.lcc.allowed.urls.pattern` is Review Comment: May want to put this in it's own section called `Security` or something. Additionally, it would be useful to point out that this property is used by all contexts created by this classloader and that this property will need to be updated accordingly as contexts are used that reference new locations. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
