[
https://issues.apache.org/jira/browse/SOLR-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200470#comment-15200470
]
Hoss Man commented on SOLR-8862:
--------------------------------
bq. The first call to createEphemeralLiveNode() is not actually called from the
constructor; it's called from the OnReconnect handler much later, if you lose
your ZK session and have to create a new one. At least, that's the theory. Are
you seeing it actually get called early?
Ah ... ok ... I'm probably wrong then -- I thought the "OnReconnect" handler
was also used on the _initial_ connect.
I'll edit my other comment to reduce confusion.
bq. This works reasonably well for things like routing search requests. I can
see how it might fall over if you're depending on live_nodes for doing cluster
level operations.
that's my concern -- CloudSolrClient consults {{/live_nodes}} (via
{{ClusterState.getLiveNodes()}}) to decide which nodes are up for any requests
that aren't explicitly routable updates -- in my particular case i'm getting
burned by collection API calls...
I guess I see your point though ... for any request involving specific
collection(s) clients can use the replica state to see if they are ACTIVE (or
if they are a LEADER for update situations) .. and CloudSolrClient does that
even for searches. So I guess the "practical" impacts of this aren't as
severe as i initially thought ...
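
To make the replica-state check described above concrete, here is a minimal, self-contained sketch. It assumes a simplified data model: {{ReplicaFilter}}, {{ReplicaInfo}}, {{State}} and {{usableNodes}} are hypothetical names invented for illustration, not Solr's actual classes or the logic CloudSolrClient really runs. It only shows the idea of requiring a node to be in {{/live_nodes}} and to host an ACTIVE replica (and a leader, for updates).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; not part of Solr. */
public class ReplicaFilter {

    /** Simplified stand-in for the replica states published in cluster state. */
    public enum State { ACTIVE, RECOVERING, DOWN }

    /** Simplified stand-in for one replica entry of a collection. */
    public static final class ReplicaInfo {
        final String nodeName;
        final State state;
        final boolean leader;

        public ReplicaInfo(String nodeName, State state, boolean leader) {
            this.nodeName = nodeName;
            this.state = state;
            this.leader = leader;
        }
    }

    /**
     * Returns the nodes a client could reasonably send a request to:
     * the node must be listed in liveNodes AND host an ACTIVE replica.
     * When forUpdate is true, only leader replicas qualify.
     */
    public static List<String> usableNodes(List<ReplicaInfo> replicas,
                                           Set<String> liveNodes,
                                           boolean forUpdate) {
        List<String> nodes = new ArrayList<>();
        for (ReplicaInfo r : replicas) {
            boolean live = liveNodes.contains(r.nodeName);
            boolean active = r.state == State.ACTIVE;
            boolean roleOk = !forUpdate || r.leader;
            // Skip duplicates so a node hosting several replicas appears once.
            if (live && active && roleOk && !nodes.contains(r.nodeName)) {
                nodes.add(r.nodeName);
            }
        }
        return nodes;
    }
}
```

Note that a check like this only helps for requests tied to specific collection(s). A collection API call has no replicas to inspect yet, which is why those calls still depend on {{/live_nodes}} alone.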
but I still feel like we need something per-node in ZK that isn't set to
"true" until that node is actually listening on its port.
> /live_nodes is populated too early to be very useful for clients --
> CloudSolrClient (and MiniSolrCloudCluster.createCollection) need some other
> ephemeral zk node to know which servers are "ready"
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-8862
> URL: https://issues.apache.org/jira/browse/SOLR-8862
> Project: Solr
> Issue Type: Bug
> Reporter: Hoss Man
>
> {{/live_nodes}} is populated surprisingly early (and multiple times) in the
> life cycle of a solr node startup, and as a result probably shouldn't be used
> by {{CloudSolrClient}} (or other "smart" clients) for deciding what servers
> are fair game for requests.
> we should either fix {{/live_nodes}} to be created later in the lifecycle, or
> add some new ZK node for this purpose.
> {panel:title=original bug report}
> I haven't been able to make sense of this yet, but what i'm seeing in a new
> SolrCloudTestCase subclass i'm writing is that the code below, which
> (reasonably) attempts to create a collection immediately after configuring
> the MiniSolrCloudCluster gets a "SolrServerException: No live SolrServers
> available to handle this request" -- in spite of the fact that (as far as i
> can tell at first glance) MiniSolrCloudCluster's constructor is supposed to
> block until all the servers are live.
> {code}
> configureCluster(numServers)
>     .addConfig(configName, configDir.toPath())
>     .configure();
> Map<String, String> collectionProperties = ...;
> assertNotNull(cluster.createCollection(COLLECTION_NAME, numShards,
>                                        repFactor,
>                                        configName, null, null,
>                                        collectionProperties));
> {code}
> {panel}