Julian,

Thanks for even the wildest of speculation. I can use all the help I can get right now.  It looks like the slingevent namespace we don't even need. I inherited this project as a kind of mess. It looks like whoever was migrating this from Sling 6 to Sling 10 had just included the provisioning/model.txt and config/*.config files from the sling10 starter war. The war that we use is still made from a maven assembly definition.  This assembly includes those configuration files (instead of generating them from the models) and then repo init reads the model.txt.   So this is probably one big source of configuration failures.  We are not including the org.apache.sling.event & org.apache.sling.event.dea artifacts.  We don't seem to need them. They simply remain installed but not active bundles when we include them.

I've been trying to rationalize the hit or miss nature of this problem. In my mind,  I'm looking for a race condition. I'm going to bet that this is a configuration issue on our end, but I'm not sure how this would happen or where to look. Perhaps we are have some bundles out of order.  While I had mentioned in a previous email that we had our own supplier of SlingRepository, I temporarily removed it trying to debug this issue.  I can see that two different instances of OakSlingRepositoryManager (which implement SlingRepository) are created on the first server startup. I saw this by setting a break point in start(BundleContext, Config) of AbstractSlingRepositoryManager.

The first is called in a event dispatcher thread for [CM Event Dispatcher (Fire ConfigurationEvent : pid=org.apache.jackrabbit.oak.segment.SegmentNodeStoreService)]
I can see this is when the SegmentNodeStoreService is registering itself.

The second instance of the OakSlingRepositoryManager is called from : [CM Event Dispatcher (Fire ConfigurationEvent: pid=org.apache.jackrabbit.oak.spi.security.user.action.DefaultAuthorizableActionProvider)] This is during the call to maybeRegister() in SecurityProviderRegistration. I can clearly see that this is a different object.

If I stop and start the server again. start() is only called once (from a Felix start level thread) and I never have the reported issue of "AuthenticationSupport service missing"

This very well may be standard operating procedure. Tomorrow I'm going to deploy the sling10war and look for the exact same behavior to see if I'm barking up the wrong tree. But if this seems very peculiar (or very normal) to someone, please let me know what you think.

Marc

On 2/25/19 9:07 AM, Julian Sedding wrote:
Hi Marc

 From your log, I suspect that the "sling" and "slingevent" Namespaces
are (not yet?) registered when the repository is being initialized.
During repository initialization all installed OSGi bundles are
inspected in order to see whether they contribute namespaces or
node-type definitions (via special manifest headers and embedded *.cnd
files).

The "sling" namespace is defined by the bundle
"org.apache.sling.jcr.resource", the "slingevent" namespace by
"org-apache-sling-event". Try making sure that they are installed
before repository initialization happens.

It may help to enable debug logging for "org.apache.sling.jcr.base" to
better understand when during initialization this happens.

Note: this advice is based on speculation, so YMMV.

Regards
Julian

On Sat, Feb 23, 2019 at 1:45 AM <marc.davenp...@oracle.com> wrote:
Robert,

Thanks for thinking about this issue with me.

You asked: do you have any custom code that plugs into the repository 
initialisation?
I can not find any custom classes that are implementing RepositoryInitilizer.

We have a custom class which provides the SlingRepository service and a service 
ranking of 1000. It wraps the OakSlingRepositoryManager

      
@Reference(target="(component.name=org.apache.sling.jcr.oak.server.internal.OakSlingRepositoryManager)")
      private SlingRepository repository;

We do this so that we can cast the Sessions as our own special Session objects. 
This is what I've been able to figure out from walking up to this project. This 
may be a defunct pattern since we have moved from Sling6 to 10 recently.

The order of the bundles
CustomWorkspace (14) <- contains the custom impl of
Oak bundles(15)
Sling JCR Oak Server (16)

I've tried moving the order of the bundles around to see if I had something in 
common with SLING-4556, but again this grasping with hope.

You asked: "Are you able to inspect the repository after in the scenario where this 
fails?"
As far as a can tell the repository is completely unavailable. But I'm open to 
debugging suggestions.

You asked: "Can you adjust the configuration of the QuartzScheduler to include 'oak' 
in the allowed pool names?"
I'm not sure how to do this. Also, I see this error in both working and non 
working instances.

I've been looking at the logs from the working and failed state trying to find 
an error or warning that is in one but not the other.  It looks like the 
OakConstraint error on the missing null primary type actually happens in both 
situations. So I think tracking down my first error message suspect is a dead 
end.

I lopped off the timestamps so I could diff the logs from a good and a bad 
state. Below I've added the log messages I should have focused on.  Is there a 
proper way to share whole logs on this mailing list? Attaching a 9mb file 
didn't seem appropriate.

Thank you for all you time so far,
Marc

22.02.2019 12:52:40.867 *ERROR* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server.internal.OakSlingRepositoryManager
start: Uncaught Throwable trying to access Repository, calling
stopRepository()
java.lang.NullPointerException: null
      at
org.apache.sling.jcr.base.AbstractSlingRepositoryManager.initializeAndRegisterRepositoryService(AbstractSlingRepositoryManager.java:478)
      at
org.apache.sling.jcr.base.AbstractSlingRepositoryManager.access$300(AbstractSlingRepositoryManager.java:85)
      at
org.apache.sling.jcr.base.AbstractSlingRepositoryManager$4.run(AbstractSlingRepositoryManager.java:455)
22.02.2019 12:52:40.867 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server.internal.OakSlingRepositoryManager stop:
Repository still running, forcing shutdown
22.02.2019 12:52:40.868 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [482,
[org.apache.jackrabbit.api.jmx.QueryStatManagerMBean]] ServiceEvent
UNREGISTERING
22.02.2019 12:52:40.869 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [483,
[org.apache.jackrabbit.oak.api.jmx.RepositoryStatsMBean]] ServiceEvent
UNREGISTERING
22.02.2019 12:52:40.870 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [484, [java.lang.Runnable]]
ServiceEvent UNREGISTERING
22.02.2019 12:52:40.871 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [485,
[org.apache.jackrabbit.oak.spi.gc.GCMonitor]] ServiceEvent UNREGISTERING
22.02.2019 12:52:40.874 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [403,
[java.util.concurrent.Executor]] ServiceEvent UNREGISTERING
22.02.2019 12:52:40.875 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [474, [java.lang.Runnable]]
ServiceEvent UNREGISTERING
22.02.2019 12:52:40.896 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [475,
[org.apache.jackrabbit.oak.api.jmx.IndexStatsMBean]] ServiceEvent
UNREGISTERING
22.02.2019 12:52:40.900 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [476,
[org.apache.jackrabbit.oak.plugins.index.property.jmx.PropertyIndexAsyncReindexMBean]]
ServiceEvent UNREGISTERING
22.02.2019 12:52:40.901 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [477,
[org.apache.jackrabbit.oak.plugins.index.counter.jmx.NodeCounterMBean]]
ServiceEvent UNREGISTERING
22.02.2019 12:52:40.902 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [478,
[org.apache.jackrabbit.oak.api.jmx.QueryEngineSettingsMBean]]
ServiceEvent UNREGISTERING
22.02.2019 12:52:40.904 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [479,
[org.apache.jackrabbit.oak.query.stats.QueryStatsMBean]] ServiceEvent
UNREGISTERING
22.02.2019 12:52:40.905 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [480,
[org.apache.jackrabbit.oak.api.jmx.RepositoryManagementMBean]]
ServiceEvent UNREGISTERING
22.02.2019 12:52:40.906 *INFO* [Apache Sling Repository Startup Thread]
org.apache.sling.jcr.oak.server Service [481,
[org.apache.jackrabbit.oak.spi.commit.CommitHook]] ServiceEvent
UNREGISTERING
22.02.2019 12:52:40.906 *INFO* [Apache Sling Repository Startup Thread]
org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate [async] Closed
22.02.2019 12:52:49.349 *INFO* [http-nio-8006-exec-1]
org.apache.felix.http.whiteboard Detected extended HttpService. Filters
enabled.
22.02.2019 12:52:49.358 *INFO* [http-nio-8006-exec-1]
org.apache.felix.http.bridge Service [486,
[org.osgi.service.http.HttpService,
org.apache.felix.http.api.ExtHttpService]] ServiceEvent REGISTERED
22.02.2019 12:52:49.358 *INFO* [http-nio-8006-exec-1]
org.apache.felix.http.bridge Service [487,
[org.osgi.service.http.runtime.HttpServiceRuntime]] ServiceEvent REGISTERED
22.02.2019 12:52:49.360 *INFO* [http-nio-8006-exec-1]
org.apache.felix.http.bridge Service [488,
[org.osgi.service.http.context.ServletContextHelper]] ServiceEvent
REGISTERED
22.02.2019 12:52:49.370 *INFO* [http-nio-8006-exec-1]
org.apache.sling.engine.parameters Default Character Encoding: ISO-8859-1
22.02.2019 12:52:49.370 *INFO* [http-nio-8006-exec-1]
org.apache.sling.engine.parameters Parameter Number Limit: 10000
22.02.2019 12:52:49.370 *INFO* [http-nio-8006-exec-1]
org.apache.sling.engine.parameters Maximum Request Size: -1
22.02.2019 12:52:49.370 *INFO* [http-nio-8006-exec-1]
org.apache.sling.engine.parameters Temporary File Location: null
22.02.2019 12:52:49.370 *INFO* [http-nio-8006-exec-1]
org.apache.sling.engine.parameters Maximum File Size: -1
22.02.2019 12:52:49.370 *INFO* [http-nio-8006-exec-1]
org.apache.sling.engine.parameters Tempory File Creation Threshold: 256000
22.02.2019 12:52:49.370 *INFO* [http-nio-8006-exec-1]
org.apache.sling.engine.parameters Check for additional container
parameters: false
22.02.2019 12:52:49.378 *INFO* [http-nio-8006-exec-1]
org.apache.felix.inventory Service [491, [javax.servlet.Servlet]]
ServiceEvent REGISTERED
22.02.2019 12:52:49.378 *INFO* [http-nio-8006-exec-1]
org.apache.felix.inventory Service [490,
[org.apache.felix.inventory.InventoryPrinter]] ServiceEvent REGISTERED
22.02.2019 12:52:49.379 *INFO* [http-nio-8006-exec-1]
org.apache.felix.http.bridge Service [HTTP Service Web Console
Plugin,489, [javax.servlet.Servlet]] ServiceEvent REGISTERED
22.02.2019 12:52:49.383 *ERROR* [http-nio-8006-exec-1]
org.apache.sling.engine.impl.SlingHttpContext handleSecurity:
AuthenticationSupport service missing. Cannot authenticate request.
22.02.2019 12:52:49.383 *ERROR* [http-nio-8006-exec-1]
org.apache.sling.engine.impl.SlingHttpContext handleSecurity: Possible
reason is missing Repository service. Check AuthenticationSupport
dependencies.


Reply via email to