[ https://issues.apache.org/jira/browse/SOLR-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200470#comment-15200470 ]
Hoss Man commented on SOLR-8862:
--------------------------------

bq. The first call to createEphemeralLiveNode() is not actually called from the constructor; it's called from the OnReconnect handler much later, if you lose your ZK session and have to create a new one. At least, that's the theory. Are you seeing it actually get called early?

Ah ... ok ... I'm probably wrong then -- I thought the "OnReconnect" handler was also used on the _initial_ connect. I'll edit my other comment to reduce confusion.

bq. This works reasonably well for things like routing search requests. I can see how it might fall over if you're depending on live_nodes for doing cluster level operations.

That's my concern -- CloudSolrClient consults {{/live_nodes}} (via {{ClusterState.getLiveNodes()}}) to decide which nodes are up for any requests that aren't explicitly routable updates -- in my particular case I'm getting burned by collection API calls...

I guess I see your point though ... for any request involving specific collection(s), clients can use the replica state to see if the replicas are ACTIVE (or if they are a LEADER, for update situations) ... and CloudSolrClient does that even for searches.

So I guess the "practical" impacts of this aren't as severe as I initially thought ... but I still feel like we need something per-node in ZK that isn't set to "true" until that node is actually listening on its port.
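
For illustration only, the replica-state check described above can be sketched roughly as follows. This is a minimal sketch and not the actual CloudSolrClient logic: {{ReadyReplicaFilter}}, {{ReplicaInfo}}, {{State}} and {{usableNodes}} are hypothetical stand-ins rather than SolrJ classes, and the live-node set is assumed to have already been read from {{/live_nodes}}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class ReadyReplicaFilter {

    // Hypothetical stand-in for a replica's published state; not the SolrJ type.
    enum State { ACTIVE, RECOVERING, DOWN }

    // Hypothetical stand-in for one replica entry in a collection's cluster state.
    static final class ReplicaInfo {
        final String nodeName;
        final State state;
        final boolean leader;

        ReplicaInfo(String nodeName, State state, boolean leader) {
            this.nodeName = nodeName;
            this.state = state;
            this.leader = leader;
        }
    }

    /**
     * Returns the node names that look safe to send a request to:
     * the node must be listed as live AND the replica must be ACTIVE.
     * When leadersOnly is true (the update case), non-leaders are skipped.
     */
    static List<String> usableNodes(List<ReplicaInfo> replicas,
                                    Set<String> liveNodes,
                                    boolean leadersOnly) {
        List<String> result = new ArrayList<>();
        for (ReplicaInfo r : replicas) {
            if (!liveNodes.contains(r.nodeName)) {
                continue; // node is not listed as live
            }
            if (r.state != State.ACTIVE) {
                continue; // listed as live, but this replica is not ready
            }
            if (leadersOnly && !r.leader) {
                continue; // updates in this sketch go to leaders only
            }
            result.add(r.nodeName);
        }
        return result;
    }
}
```

A check like this needs replicas to inspect, so it only applies to requests that target existing collection(s). A collection API call such as a create has no replica state to consult, which leaves the live-node list as the only signal.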

> /live_nodes is populated too early to be very useful for clients -- CloudSolrClient (and MiniSolrCloudCluster.createCollection) need some other ephemeral zk node to know which servers are "ready"
> --------------------------------------------------------------------------------
>
>                 Key: SOLR-8862
>                 URL: https://issues.apache.org/jira/browse/SOLR-8862
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>
> {{/live_nodes}} is populated surprisingly early (and multiple times) in the
> life cycle of a Solr node startup, and as a result probably shouldn't be used
> by {{CloudSolrClient}} (or other "smart" clients) for deciding what servers
> are fair game for requests.
>
> We should either fix {{/live_nodes}} to be created later in the lifecycle, or
> add some new ZK node for this purpose.
>
> {panel:title=original bug report}
> I haven't been able to make sense of this yet, but what I'm seeing in a new
> SolrCloudTestCase subclass I'm writing is that the code below, which
> (reasonably) attempts to create a collection immediately after configuring
> the MiniSolrCloudCluster, gets a "SolrServerException: No live SolrServers
> available to handle this request" -- in spite of the fact that (as far as I
> can tell at first glance) MiniSolrCloudCluster's constructor is supposed to
> block until all the servers are live.
> {code}
>     configureCluster(numServers)
>       .addConfig(configName, configDir.toPath())
>       .configure();
>     Map<String, String> collectionProperties = ...;
>     assertNotNull(cluster.createCollection(COLLECTION_NAME, numShards, repFactor,
>                                            configName, null, null,
>                                            collectionProperties));
> {code}
> {panel}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)