On the surface, I’m automatically suspicious of _anything_ that relies on an arbitrary wait period for a state to settle down. Would this 300ms sleep be adequate on a very fast machine running just one test?
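For illustration, the usual alternative to a fixed sleep is to poll for the state you actually care about, with a generous upper bound. What follows is only a rough sketch: WaitFor and until() are hypothetical names, not something in Solr's test framework, and the condition the test should poll for is left to the caller, since what "settled" means here is exactly the open question.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical test helper (an assumption for illustration, not an existing Solr class). */
final class WaitFor {
  private WaitFor() {}

  /**
   * Polls the condition until it returns true or the timeout elapses.
   * The condition is always checked at least once.
   *
   * @return true if the condition was met, false if the timeout expired first
   */
  static boolean until(BooleanSupplier condition, long timeoutMs, long pollMs)
      throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      long remainingNanos = deadline - System.nanoTime();
      if (remainingNanos <= 0) {
        return false;
      }
      // Sleep for the poll interval, but never past the deadline.
      long remainingMs = Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
      Thread.sleep(Math.min(pollMs, remainingMs));
    }
  }
}
```

A test would then fail with a clear timeout on a slow machine and return as soon as the condition holds on a fast one, rather than betting on 300ms being right for both.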
I don’t see the value of that assert anyway. I can’t come up with a use-case for a running Solr functioning incorrectly because it failed to update a document while ZooKeeper was shutting down.

FWIW
Erick

> On Apr 22, 2019, at 8:42 AM, Gus Heck <gus.h...@gmail.com> wrote:
>
> BasicZkTest has the following bit of code, that I'm tripping on.
>
> zkServer.shutdown();
>
> // document indexing shouldn't stop immediately after a ZK disconnect
> assertU(adoc("id", "201"));
>
> Thread.sleep(300);
>
> // try a reconnect from disconnect
> zkServer = new ZkTestServer(zkDir, zkPort);
> zkServer.run(false);
>
> It's not entirely clear to me that this should always be true. ZkStateReader
> has means to cache and watch various bits of information, but if it hasn't
> done the caching yet it may need to talk to zk before completing the request.
> I am trying to use Collection Properties as an alternative location for
> looking up the routed alias for a collection. Current code uses a core
> property, but this is inconvenient for testing as it can't be altered in the
> test... or at least I didn't find a way to alter it. Also, future features
> such as archiving older collections from a TRA, might find it useful to be
> able to disconnect the older collections from the alias, but right now that
> would require finding all cores and editing properties for all of them...
>
> However BasicZkTest fails on this assert, because the fetching of properties
> fails, throwing an exception.
>
> So is this assert really reasonable? It kind of feels unreasonable but I'd
> like some background from other folks here...
> https://issues.apache.org/jira/browse/SOLR-7819 seems to have discussed this
> some but The more I think about it, the more I'm convinced that proceeding
> without zookeeper available seems dangerous. Any update sent to an alias
> (TRA/CRA or regular) will need to check zookeeper for example....
> Also security.json is in zookeeper, so anyone running with security on
> probably tries to hit zookeeper on a cache miss too
>
> I guess it comes down to the question of whether or not solr cloud should
> work while zookeeper is down/unavail or not. This is the first I've run into
> the notion that the answer might be yes. I'd always presumed that if Zk went
> away all bets were off, because ZK is what makes a cloud out of us.
>
> What I don't know is what existing use cases/installs might find this assert
> critical (most of the above bug talked about LIR, and the comment on the
> commit mentions leader election)
>
> Thoughts?
>
> -Gus