[
https://issues.apache.org/jira/browse/SOLR-12258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448816#comment-16448816
]
David Smiley commented on SOLR-12258:
-------------------------------------
To be extra clear, here's a small excerpt from
org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest#test (lines
97-111) that is minimally sufficient to show the problem. _The payload
part is irrelevant; what matters is that it's simply a V2 request_:
{code:java}
CollectionAdminRequest.createCollection(configName, configName, 1, 1).process(solrClient);
// manipulate the config...
checkNoError(solrClient.request(new V2Request.Builder("/collections/" + configName + "/config")
    .withMethod(SolrRequest.METHOD.POST)
    .withPayload("{" +
        " 'set-user-property' : {'update.autoCreateFields':false}," + // no data driven
        " 'add-updateprocessor' : {" +
        " 'name':'tolerant', 'class':'solr.TolerantUpdateProcessorFactory'" +
        " }," +
        " 'add-updateprocessor' : {" + // for testing
        " 'name':'inc', 'class':'" + IncrementURPFactory.class.getName() + "'," +
        " 'fieldName':'" + intField + "'" +
        " }," +
        "}").build()));
{code}
The second call, where we manipulate the config, occasionally fails because
V2HttpCall can't resolve the collection (line 119). Its ZK state simply isn't
up to date (I surmise). In principle, a V1 call could fail as well, but in
practice it's probably rarer because the "retry" aspect of V1 buys it
sufficient extra time. Adding a SolrCloudTestCase.waitForState between the
calls here _may_ help, but again there's no guarantee since waitForState waits
for _the state of the client's state reader_ (not for the state readers of the
Solr nodes).
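To make the "retry" idea concrete, here is a minimal sketch of a "resolve, refresh, resolve once more" pattern. It is not the HttpSolrCall or V2HttpCall code; the class RetryOnceResolver, the method resolveWithRetry, and the lookup/refresh callbacks are hypothetical names used only for illustration.

```java
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical illustration only; not part of Solr. */
public class RetryOnceResolver {

    /**
     * Tries to resolve {@code name} with {@code lookup}. On a miss, runs
     * {@code refresh} exactly once and looks the name up a second time.
     * A second miss is returned as an empty Optional; there is no further retry.
     */
    public static <T> Optional<T> resolveWithRetry(
            String name,
            Function<String, Optional<T>> lookup,
            Runnable refresh) {
        Optional<T> found = lookup.apply(name);
        if (found.isPresent()) {
            return found; // resolved from the current local view
        }
        refresh.run(); // assumed to bring the local view up to date
        return lookup.apply(name);
    }
}
```

In this sketch the refresh callback stands in for whatever brings the node's view up to date (for aliases, the issue description mentions AliasesManager.update()). The caller decides how to report a second miss, for example as "no such collection or alias".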
For aliases, we can call ZooKeeper.sync("/aliases.json",...) -- and in fact I
made sure ZkStateReader now does this in update(). For cases where our code
expects to operate on a collection (thus it had better exist or we have an
error), could we try to do a similar thing for collections? In fact we have
ZkStateReader.forceUpdateCollection(collection), added by [~shalinmangar] in
SOLR-8745, though it doesn't call ZooKeeper.sync()... but shouldn't it?
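The "sync, then read" idea can be sketched as follows. This is only an illustration under assumptions: CoordinationClient is a hypothetical stand-in for a ZooKeeper-like client (the real ZooKeeper.sync is asynchronous and takes a path, a callback, and a context object), and syncThenRead is not an existing ZkStateReader method.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical illustration only; not part of Solr or ZooKeeper. */
public class SyncThenRead {

    /** Hypothetical stand-in for the two client operations this sketch needs. */
    public interface CoordinationClient {
        /** Asynchronous: invokes onDone once the sync for path has completed. */
        void sync(String path, Runnable onDone);

        /** Returns the data at path, or null if the node is not visible. */
        byte[] read(String path) throws Exception;
    }

    /**
     * Blocks until the asynchronous sync completes (or the timeout expires),
     * then reads the path so the read observes the synced state.
     */
    public static byte[] syncThenRead(CoordinationClient client, String path, long timeoutMs)
            throws Exception {
        CountDownLatch done = new CountDownLatch(1);
        client.sync(path, done::countDown);
        if (!done.await(timeoutMs, TimeUnit.MILLISECONDS)) {
            throw new TimeoutException("sync of " + path + " did not complete in " + timeoutMs + " ms");
        }
        return client.read(path);
    }
}
```

The latch turns the asynchronous sync into a blocking step, so the subsequent read cannot run before the sync callback has fired. Whether forceUpdateCollection should do something like this is exactly the open question above.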
> V2 API should "retry" for unresolved collections/aliases (like V1 does)
> -----------------------------------------------------------------------
>
> Key: SOLR-12258
> URL: https://issues.apache.org/jira/browse/SOLR-12258
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrCloud, v2 API
> Reporter: David Smiley
> Priority: Major
>
> When using V1, if the request refers to a possible collection/alias that
> fails to resolve, HttpSolrCall will invoke AliasesManager.update() then retry
> the request as if anew (in collaboration with SolrDispatchFilter). If it
> fails to resolve again we stop there and return an error; it doesn't go on
> forever.
> V2 (V2HttpCall specifically) doesn't have this retry mechanism. It'll return
> "no such collection or alias".
> The retry not only works for an alias; the delay it introduces also
> improves the odds that a newly made collection is known to
> this Solr node. It'd be nice if this was more explicit – i.e. if there was a
> mechanism similar to AliasesManager.update() but for a collection. I'm not
> sure how to do that.
> BTW I discovered this while debugging a Jenkins failure of
> TimeRoutedAliasUpdateProcessorTest.test, which early on simply
> issues a V2 request to change the configuration of a collection that was
> created immediately before it. It's pretty mysterious. I am aware of
> SolrCloudTestCase.waitForState which is maybe something that needs to be
> called? But if that were true then *every* SolrCloud test would need to use
> it; it just seems wrong to me that we ought to use this method commonly.