On the surface, I’m automatically suspicious of _anything_ that relies on an 
arbitrary wait period for a state to settle down. Would this 300ms sleep be 
adequate on a very fast machine running just one test?
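
For what it's worth, the usual alternative to a fixed sleep is to poll for the condition you actually care about, with a generous upper bound. Here is a minimal sketch of that idea. The `WaitFor` class and its `waitFor` method are hypothetical names made up for illustration, not something I'm claiming exists in the test framework, and the condition you'd pass in for this test is something I'd have to leave to whoever knows what state the sleep is really waiting on.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: wait for a condition instead of sleeping a fixed time. */
public final class WaitFor {
  private WaitFor() {}

  /**
   * Polls the condition every pollMs milliseconds until it returns true.
   * Throws TimeoutException if it is still false once timeoutMs has elapsed.
   */
  public static void waitFor(String what, long timeoutMs, long pollMs, BooleanSupplier condition)
      throws InterruptedException, TimeoutException {
    // nanoTime is monotonic, so wall-clock adjustments cannot shorten the wait.
    final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException("Timed out after " + timeoutMs + "ms waiting for " + what);
      }
      Thread.sleep(pollMs);
    }
  }
}
```

With something like this, a fast machine returns as soon as the condition holds and a slow one gets the full timeout, so the test no longer depends on 300ms happening to be the right number.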

I don’t see the value of that assert anyway. I can’t come up with a use-case for a 
running Solr functioning incorrectly because it failed to update a document 
while ZooKeeper was shutting down.

FWIW
Erick

> On Apr 22, 2019, at 8:42 AM, Gus Heck <gus.h...@gmail.com> wrote:
> 
> BasicZkTest has the following bit of code that I'm tripping on. 
> 
>     zkServer.shutdown();
> 
>     // document indexing shouldn't stop immediately after a ZK disconnect
>     assertU(adoc("id", "201"));
> 
>     Thread.sleep(300);
>     
>     // try a reconnect from disconnect
>     zkServer = new ZkTestServer(zkDir, zkPort);
>     zkServer.run(false);
> 
> It's not entirely clear to me that this should always be true. ZkStateReader 
> has means to cache and watch various bits of information, but if it hasn't 
> done the caching yet it may need to talk to zk before completing the request. 
> I am trying to use Collection Properties as an alternative location for 
> looking up the routed alias for a collection. Current code uses a core 
> property, but this is inconvenient for testing as it can't be altered in the 
> test... or at least I didn't find a way to alter it. Also, future features 
> such as archiving older collections from a TRA, might find it useful to be 
> able to disconnect the older collections from the alias, but right now that 
> would require finding all cores and editing properties for all of them...  
> 
> However BasicZkTest fails on this assert, because the fetching of properties 
> fails, throwing an exception. 
> 
> So is this assert really reasonable? It kind of feels unreasonable but I'd 
> like some background from other folks here... 
> https://issues.apache.org/jira/browse/SOLR-7819 seems to have discussed this 
> some, but the more I think about it, the more I'm convinced that proceeding 
> without zookeeper available seems dangerous. Any update sent to an alias 
> (TRA/CRA or regular) will need to check zookeeper, for example. Also, 
> security.json is in zookeeper, so anyone running with security on probably 
> tries to hit zookeeper on a cache miss too.
> 
> I guess it comes down to the question of whether or not solr cloud should 
> work while zookeeper is down/unavail or not. This is the first I've run into 
> the notion that the answer might be yes. I'd always presumed that if Zk went 
> away all bets were off, because ZK is what makes a cloud out of us.
> 
> What I don't know is what existing use cases/installs might find this assert 
> critical (most of the above bug talked about LIR, and the comment on the 
> commit mentions leader election)
> 
> Thoughts?
> 
> -Gus


