Julian,
Thanks for even the wildest of speculation. I can use all the help I can
get right now. It looks like the slingevent namespace we don't even
need. I inherited this project as a kind of mess. It looks like whoever
was migrating this from Sling 6 to Sling 10 had just included the
provisioning/model.txt and config/*.config files from the sling10
starter war. The war that we use is still made from a maven assembly
definition. This assembly includes those configuration files (instead
of generating them from the models) and then repo init reads the
model.txt. So this is probably one big source of configuration
failures. We are not including the org.apache.sling.event &
org.apache.sling.event.dea artifacts. We don't seem to need them. They
simply remain installed but not active bundles when we include them.
I've been trying to rationalize the hit or miss nature of this problem.
In my mind, I'm looking for a race condition. I'm going to bet that
this is a configuration issue on our end, but I'm not sure how this
would happen or where to look. Perhaps we are have some bundles out of
order. While I had mentioned in a previous email that we had our own
supplier of SlingRepository, I temporarily removed it trying to debug
this issue. I can see that two different instances of
OakSlingRepositoryManager (which implement SlingRepository) are created
on the first server startup. I saw this by setting a break point in
start(BundleContext, Config) of AbstractSlingRepositoryManager.
The first is called in a event dispatcher thread for [CM Event
Dispatcher (Fire ConfigurationEvent :
pid=org.apache.jackrabbit.oak.segment.SegmentNodeStoreService)]
I can see this is when the SegmentNodeStoreService is registering itself.
The second instance of the OakSlingRepositoryManager is called from :
[CM Event Dispatcher (Fire ConfigurationEvent:
pid=org.apache.jackrabbit.oak.spi.security.user.action.DefaultAuthorizableActionProvider)]
This is during the call to maybeRegister() in
SecurityProviderRegistration. I can clearly see that this is a different
object.
If I stop and start the server again. start() is only called once (from
a Felix start level thread) and I never have the reported issue of
"AuthenticationSupport service missing"
This very well may be standard operating procedure. Tomorrow I'm going
to deploy the sling10war and look for the exact same behavior to see if
I'm barking up the wrong tree. But if this seems very peculiar (or very
normal) to someone, please let me know what you think.
Marc
On 2/25/19 9:07 AM, Julian Sedding wrote:
Hi Marc
From your log, I suspect that the "sling" and "slingevent" Namespaces
are (not yet?) registered when the repository is being initialized.
During repository initialization all installed OSGi bundles are
inspected in order to see whether they contribute namespaces or
node-type definitions (via special manifest headers and embedded *.cnd
files).
The "sling" namespace is defined by the bundle
"org.apache.sling.jcr.resource", the "slingevent" namespace by
"org-apache-sling-event". Try making sure that they are installed
before repository initialization happens.
It may help to enable debug logging for "org.apache.sling.jcr.base" to
better understand when during initialization this happens.
Note: this advice is based on speculation, so YMMV.
Regards
Julian
On Sat, Feb 23, 2019 at 1:45 AM <marc.davenp...@oracle.com> wrote:
Robert,
Thanks for thinking about this issue with me.
You asked: do you have any custom code that plugs into the repository
initialisation?
I can not find any custom classes that are implementing RepositoryInitilizer.
We have a custom class which provides the SlingRepository service and a service
ranking of 1000. It wraps the OakSlingRepositoryManager
@Reference(target="(component.name=org.apache.sling.jcr.oak.server.internal.OakSlingRepositoryManager)")
private SlingRepository repository;
We do this so that we can cast the Sessions as our own special Session objects.
This is what I've been able to figure out from walking up to this project. This
may be a defunct pattern since we have moved from Sling6 to 10 recently.
The order of the bundles
CustomWorkspace (14) <- contains the custom impl of
Oak bundles(15)
Sling JCR Oak Server (16)
I've tried moving the order of the bundles around to see if I had something in
common with SLING-4556, but again this grasping with hope.
You asked: "Are you able to inspect the repository after in the scenario where this
fails?"
As far as a can tell the repository is completely unavailable. But I'm open to
debugging suggestions.
You asked: "Can you adjust the configuration of the QuartzScheduler to include 'oak'
in the allowed pool names?"
I'm not sure how to do this. Also, I see this error in both working and non
working instances.
I've been looking at the logs from the working and failed state trying to find
an error or warning that is in one but not the other. It looks like the
OakConstraint error on the missing null primary type actually happens in both
situations. So I think tracking down my first error message suspect is a dead
end.
I lopped off the timestamps so I could diff the logs from a good and a bad
state. Below I've added the log messages I should have focused on. Is there a
proper way to share whole logs on this mailing list? Attaching a 9mb file
didn't seem appropriate.
Thank you for all you time so far,
Marc
22.02.2019 12:52:40.867 *ERROR* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server.internal.OakSlingRepositoryManager
start: Uncaught Throwable trying to access Repository, calling
stopRepository()
java.lang.NullPointerException: null
at
org.apache.sling.jcr.base.AbstractSlingRepositoryManager.initializeAndRegisterRepositoryService(AbstractSlingRepositoryManager.java:478)
at
org.apache.sling.jcr.base.AbstractSlingRepositoryManager.access$300(AbstractSlingRepositoryManager.java:85)
at
org.apache.sling.jcr.base.AbstractSlingRepositoryManager$4.run(AbstractSlingRepositoryManager.java:455)
22.02.2019 12:52:40.867 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server.internal.OakSlingRepositoryManager stop:
Repository still running, forcing shutdown
22.02.2019 12:52:40.868 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [482,
[org.apache.jackrabbit.api.jmx.QueryStatManagerMBean]] ServiceEvent
UNREGISTERING
22.02.2019 12:52:40.869 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [483,
[org.apache.jackrabbit.oak.api.jmx.RepositoryStatsMBean]] ServiceEvent
UNREGISTERING
22.02.2019 12:52:40.870 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [484, [java.lang.Runnable]]
ServiceEvent UNREGISTERING
22.02.2019 12:52:40.871 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [485,
[org.apache.jackrabbit.oak.spi.gc.GCMonitor]] ServiceEvent UNREGISTERING
22.02.2019 12:52:40.874 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [403,
[java.util.concurrent.Executor]] ServiceEvent UNREGISTERING
22.02.2019 12:52:40.875 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [474, [java.lang.Runnable]]
ServiceEvent UNREGISTERING
22.02.2019 12:52:40.896 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [475,
[org.apache.jackrabbit.oak.api.jmx.IndexStatsMBean]] ServiceEvent
UNREGISTERING
22.02.2019 12:52:40.900 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [476,
[org.apache.jackrabbit.oak.plugins.index.property.jmx.PropertyIndexAsyncReindexMBean]]
ServiceEvent UNREGISTERING
22.02.2019 12:52:40.901 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [477,
[org.apache.jackrabbit.oak.plugins.index.counter.jmx.NodeCounterMBean]]
ServiceEvent UNREGISTERING
22.02.2019 12:52:40.902 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [478,
[org.apache.jackrabbit.oak.api.jmx.QueryEngineSettingsMBean]]
ServiceEvent UNREGISTERING
22.02.2019 12:52:40.904 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [479,
[org.apache.jackrabbit.oak.query.stats.QueryStatsMBean]] ServiceEvent
UNREGISTERING
22.02.2019 12:52:40.905 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [480,
[org.apache.jackrabbit.oak.api.jmx.RepositoryManagementMBean]]
ServiceEvent UNREGISTERING
22.02.2019 12:52:40.906 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [481,
[org.apache.jackrabbit.oak.spi.commit.CommitHook]] ServiceEvent
UNREGISTERING
22.02.2019 12:52:40.906 *INFO* [Apache Sling Repository Startup Thread]
org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate [async] Closed
22.02.2019 12:52:49.349 *INFO* [http-nio-8006-exec-1]
org.apache.felix.http.whiteboard Detected extended HttpService. Filters
enabled.
22.02.2019 12:52:49.358 *INFO* [http-nio-8006-exec-1]
org.apache.felix.http.bridge Service [486,
[org.osgi.service.http.HttpService,
org.apache.felix.http.api.ExtHttpService]] ServiceEvent REGISTERED
22.02.2019 12:52:49.358 *INFO* [http-nio-8006-exec-1]
org.apache.felix.http.bridge Service [487,
[org.osgi.service.http.runtime.HttpServiceRuntime]] ServiceEvent REGISTERED
22.02.2019 12:52:49.360 *INFO* [http-nio-8006-exec-1]
org.apache.felix.http.bridge Service [488,
[org.osgi.service.http.context.ServletContextHelper]] ServiceEvent
REGISTERED
22.02.2019 12:52:49.370 *INFO* [http-nio-8006-exec-1]
org.apache.sling.engine.parameters Default Character Encoding: ISO-8859-1
22.02.2019 12:52:49.370 *INFO* [http-nio-8006-exec-1]
org.apache.sling.engine.parameters Parameter Number Limit: 10000
22.02.2019 12:52:49.370 *INFO* [http-nio-8006-exec-1]
org.apache.sling.engine.parameters Maximum Request Size: -1
22.02.2019 12:52:49.370 *INFO* [http-nio-8006-exec-1]
org.apache.sling.engine.parameters Temporary File Location: null
22.02.2019 12:52:49.370 *INFO* [http-nio-8006-exec-1]
org.apache.sling.engine.parameters Maximum File Size: -1
22.02.2019 12:52:49.370 *INFO* [http-nio-8006-exec-1]
org.apache.sling.engine.parameters Tempory File Creation Threshold: 256000
22.02.2019 12:52:49.370 *INFO* [http-nio-8006-exec-1]
org.apache.sling.engine.parameters Check for additional container
parameters: false
22.02.2019 12:52:49.378 *INFO* [http-nio-8006-exec-1]
org.apache.felix.inventory Service [491, [javax.servlet.Servlet]]
ServiceEvent REGISTERED
22.02.2019 12:52:49.378 *INFO* [http-nio-8006-exec-1]
org.apache.felix.inventory Service [490,
[org.apache.felix.inventory.InventoryPrinter]] ServiceEvent REGISTERED
22.02.2019 12:52:49.379 *INFO* [http-nio-8006-exec-1]
org.apache.felix.http.bridge Service [HTTP Service Web Console
Plugin,489, [javax.servlet.Servlet]] ServiceEvent REGISTERED
22.02.2019 12:52:49.383 *ERROR* [http-nio-8006-exec-1]
org.apache.sling.engine.impl.SlingHttpContext handleSecurity:
AuthenticationSupport service missing. Cannot authenticate request.
22.02.2019 12:52:49.383 *ERROR* [http-nio-8006-exec-1]
org.apache.sling.engine.impl.SlingHttpContext handleSecurity: Possible
reason is missing Repository service. Check AuthenticationSupport
dependencies.