[ 
https://issues.apache.org/jira/browse/SOLR-12258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448816#comment-16448816
 ] 

David Smiley commented on SOLR-12258:
-------------------------------------

To be extra clear, here's a small excerpt from 
org.apache.solr.update.processor.TimeRoutedAliasUpdateProcessorTest#test (lines 
97-111) that is minimally sufficient to show the problem. _The payload 
part is irrelevant; what matters is that it's simply a V2 request_:
{code:java}
    CollectionAdminRequest.createCollection(configName, configName, 1, 1).process(solrClient);

    // manipulate the config...
    checkNoError(solrClient.request(new V2Request.Builder("/collections/" + configName + "/config")
        .withMethod(SolrRequest.METHOD.POST)
        .withPayload("{" +
            "  'set-user-property' : {'update.autoCreateFields':false}," + // no data driven
            "  'add-updateprocessor' : {" +
            "    'name':'tolerant', 'class':'solr.TolerantUpdateProcessorFactory'" +
            "  }," +
            "  'add-updateprocessor' : {" + // for testing
            "    'name':'inc', 'class':'" + IncrementURPFactory.class.getName() + "'," +
            "    'fieldName':'" + intField + "'" +
            "  }," +
            "}").build()));
{code}
The second call, where we manipulate the config, occasionally fails because 
V2HttpCall can't resolve the collection (line 119). Its ZK state simply isn't 
up to date (I surmise). In principle, a V1 call could fail as well, but in 
practice it's probably rarer because the "retry" aspect of V1 buys it 
sufficient extra time. Adding a SolrCloudTestCase.waitForState in between the 
calls here _may_ help, but again there's no guarantee since waitForState waits 
on _the state of the client's state reader_ (not the state readers of the Solr 
nodes).
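
To make the "retry" idea concrete, here is a rough sketch of the refresh-once-then-retry pattern that V1 follows per the issue description. It is only an illustration, not Solr code: RetryOnceResolver, lookup and refresh are hypothetical names, and in Solr the refresh step would be something like an aliases or collection state update on the node.

```java
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical helper (not a Solr class): resolve a name, refreshing local state once on a miss. */
final class RetryOnceResolver {
    private RetryOnceResolver() {}

    /**
     * @param name    collection or alias name to resolve
     * @param lookup  reads the node's locally cached view (assumed; stands in for the state reader)
     * @param refresh forces the cached view to be re-read (assumed; stands in for an update() call)
     */
    static <T> Optional<T> resolve(String name,
                                   Function<String, Optional<T>> lookup,
                                   Runnable refresh) {
        Optional<T> first = lookup.apply(name);
        if (first.isPresent()) {
            return first;
        }
        // Miss: the cached view may simply be stale, so refresh it exactly once.
        refresh.run();
        // Second and final attempt; an empty result means the caller reports an error.
        return lookup.apply(name);
    }
}
```

The single retry keeps the failure path bounded: a name that still doesn't resolve after the refresh is reported as an error rather than retried forever.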

For aliases, we can call ZooKeeper.sync("/aliases.json",...) -- and in fact I 
made sure ZkStateReader now does this in update(). For cases where our code 
expects to operate on a collection (thus it had better exist or we have an 
error), could we try to do a similar thing for collections? In fact we have 
ZkStateReader.forceUpdateCollection(collection), added by [~shalinmangar] in 
SOLR-8745, though it doesn't call ZooKeeper.sync()... but shouldn't it?

> V2 API should "retry" for unresolved collections/aliases (like V1 does)
> -----------------------------------------------------------------------
>
>                 Key: SOLR-12258
>                 URL: https://issues.apache.org/jira/browse/SOLR-12258
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud, v2 API
>            Reporter: David Smiley
>            Priority: Major
>
> When using V1, if the request refers to a possible collection/alias that 
> fails to resolve, HttpSolrCall will invoke AliasesManager.update() then retry 
> the request as if anew (in collaboration with SolrDispatchFilter).  If it 
> fails to resolve again we stop there and return an error; it doesn't go on 
> forever.
> V2 (V2HttpCall specifically) doesn't have this retry mechanism.  It'll return 
> "no such collection or alias".
> The retry not only works for an alias; the retry is also a delay that 
> at least improves the odds of a newly made collection being known to 
> this Solr node.  It'd be nice if this were more explicit – i.e. if there were a 
> mechanism similar to AliasesManager.update() but for a collection.  I'm not 
> sure how to do that.
> BTW I discovered this while debugging a Jenkins failure of 
> TimeRoutedAliasUpdateProcessorTest.test where it early on simply goes to 
> issue a V2 based request to change the configuration of a collection that was 
> created immediately before it.  It's pretty mysterious.  I am aware of 
> SolrCloudTestCase.waitForState which is maybe something that needs to be 
> called?  But if that were true then *every* SolrCloud test would need to use 
> it; it just seems wrong to me that we ought to use this method commonly.


