Herman Ciechanowiec created SLING-13236:
-------------------------------------------
Summary: MapEntries: NPE in onChange when a resource change is
delivered during construction or disposal
Key: SLING-13236
URL: https://issues.apache.org/jira/browse/SLING-13236
Project: Sling
Issue Type: Bug
Components: ResourceResolver
Affects Versions: Resource Resolver 2.0.2, Resource Resolver 2.0.4
Reporter: Herman Ciechanowiec
h3. Problem
During unit tests based on sling-mock with {{ResourceResolverType.JCR_OAK}},
the following exception is thrown nondeterministically on a background thread:
{noformat}
Exception in thread "Apache Sling Resource Provider Change Notifier"
java.lang.NullPointerException: Cannot invoke
"org.apache.sling.resourceresolver.impl.mapping.VanityPathHandler.isReady()"
because "this.vph" is null
at
org.apache.sling.resourceresolver.impl.mapping.MapEntries.onChange(MapEntries.java:460)
at
org.apache.sling.resourceresolver.impl.observation.BasicObservationReporter.reportChanges(BasicObservationReporter.java:212)
at
org.apache.sling.resourceresolver.impl.observation.BasicObservationReporter.reportChanges(BasicObservationReporter.java:191)
at
org.apache.sling.resourceresolver.impl.providers.ResourceProviderTracker.postResourceProviderChange(ResourceProviderTracker.java:364)
at
org.apache.sling.resourceresolver.impl.providers.ResourceProviderTracker$2.run(ResourceProviderTracker.java:474)
at java.base/java.lang.Thread.run(Thread.java:1474)
{noformat}
The NPE escapes on the notifier thread, so the affected resource change is
silently lost by {{MapEntries}}. The race is not specific to tests — tests
merely make it likely by starting and stopping the
{{ResourceResolverFactoryActivator}} rapidly.
h3. Root Cause
The {{MapEntries}} constructor registers the instance as a
{{ResourceChangeListener}} *before* the {{vph}} ({{VanityPathHandler}}) field
is assigned:
{code:java}
this.registration = registerResourceChangeListener(bundleContext);
this.vph =
new VanityPathHandler(this.factory, this.resolveMapsMap,
this.initializing, this::drainVanityPathQueue);
this.vph.initializeVanityPaths();
{code}
A listener may be called as soon as it is registered. If the asynchronous
"Apache Sling Resource Provider Change Notifier" thread delivers a change in
the window between the registration and the field assignment, {{onChange}}
dereferences the still-null {{vph}}.
There is a symmetric race on the shutdown side: {{dispose()}} disposes and
nulls the {{AliasHandler}} ({{ah}}) *before* unregistering the listener, so a
change delivered between those two steps produces the equivalent NPE on
{{this.ah}} in {{onChange}}.
Affects the released 2.0.2 and 2.0.4 as well as current master.
Related: SLING-12703 fixed an analogous init-after-dispose NPE for the
{{AliasHandler}}, but did not address the listener registration/unregistration
ordering; SLING-12894 introduced the startup event queues that make the fix
below safe.
h3. Proposed Fix
# Assign {{this.vph}} before calling {{registerResourceChangeListener}},
keeping {{initializeVanityPaths()}} after the registration. This should be safe
by design: {{VanityPathHandler.isReady()}} returns {{false}} until
initialization completes, so events delivered in between would be enqueued into
the vanity path startup queue and drained by the initializer — exactly as for
events arriving while initialization is running.
# In {{dispose()}}, unregister the listener first, then dispose/null the
{{AliasHandler}}, so the handlers are never torn down while the listener can
still be called.
Both races can be reproduced deterministically in {{MapEntriesTest}} by
stubbing the mocked {{BundleContext.registerService}} /
{{ServiceRegistration.unregister}} to deliver a change synchronously — such
tests currently fail with the exact NPEs above and could serve as regression
tests.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)