I agree with Erick's response, and thus the test/assertion seems
unreasonable.
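
On the fixed 300ms sleep that Erick flags below: one common alternative is to poll for the condition with an upper bound instead of sleeping for an arbitrary period. A minimal sketch, assuming a hypothetical helper named waitFor (this is not an existing Solr test utility, and the condition passed in is whatever the test actually needs to observe):

```java
import java.util.function.BooleanSupplier;

public class WaitFor {
    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true in time, false otherwise.
     * Hypothetical helper for illustration only.
     */
    public static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Overflow-safe comparison against the deadline.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMs);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true roughly 50 ms after start.
        boolean settled = waitFor(() -> System.currentTimeMillis() - start >= 50, 2000, 10);
        System.out.println("settled: " + settled);
    }
}
```

A fast machine returns as soon as the condition holds, and a slow one gets the full timeout before the test gives up, so the result does not depend on guessing the right sleep length.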

If ZK is down, all bets are off on indexing proceeding.  In practice,
people expect searches to continue for some time at least.
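
To illustrate the cache-miss point Gus raises below, here is a toy sketch of a read-through cache over a remote store. The class name CachedProps and the loader function are hypothetical stand-ins, not ZkStateReader's actual API. Reads that were cached before the store went away keep working, while a first-time read has to reach the store and fails:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;

public class CachedProps {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Stand-in for a remote read (e.g. from ZK); assumed to return non-null
    // and to throw when the store is unreachable.
    private final Function<String, String> remoteLoader;

    public CachedProps(Function<String, String> remoteLoader) {
        this.remoteLoader = remoteLoader;
    }

    /** Returns the cached value, or loads it from the remote store on a miss. */
    public String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        String loaded = remoteLoader.apply(key); // propagates the failure if the store is down
        cache.put(key, loaded);
        return loaded;
    }

    public static void main(String[] args) {
        AtomicBoolean storeUp = new AtomicBoolean(true);
        CachedProps props = new CachedProps(key -> {
            if (!storeUp.get()) {
                throw new IllegalStateException("store unavailable");
            }
            return "value-of-" + key;
        });

        props.get("collA");   // warm the cache while the store is up
        storeUp.set(false);   // simulate the store going away

        System.out.println(props.get("collA")); // served from cache
        try {
            props.get("collB");                 // cache miss, needs the store
        } catch (IllegalStateException e) {
            System.out.println("miss failed: " + e.getMessage());
        }
    }
}
```

Under that assumption, whether an update succeeds right after a disconnect depends on what happened to be cached beforehand, which is why an unconditional assert looks fragile.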

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Apr 22, 2019 at 1:54 PM Erick Erickson <erickerick...@gmail.com>
wrote:

> On the surface, I’m automatically suspicious of _anything_ that relies on
> an arbitrary wait period for a state to settle down. Would this 300ms sleep
> be adequate on a very fast machine running just one test?
>
> I don’t see the value of that assert anyway. I can’t come up with a use-case
> for a running Solr functioning incorrectly because it failed to update a
> document while ZooKeeper was shutting down.
>
> FWIW
> Erick
>
> > On Apr 22, 2019, at 8:42 AM, Gus Heck <gus.h...@gmail.com> wrote:
> >
> > BasicZkTest has the following bit of code, that I'm tripping on.
> >
> >     zkServer.shutdown();
> >
> >     // document indexing shouldn't stop immediately after a ZK disconnect
> >     assertU(adoc("id", "201"));
> >
> >     Thread.sleep(300);
> >
> >     // try a reconnect from disconnect
> >     zkServer = new ZkTestServer(zkDir, zkPort);
> >     zkServer.run(false);
> >
> > It's not entirely clear to me that this should always be true.
> > ZkStateReader has means to cache and watch various bits of information,
> > but if it hasn't done the caching yet it may need to talk to zk before
> > completing the request. I am trying to use Collection Properties as an
> > alternative location for looking up the routed alias for a collection.
> > Current code uses a core property, but this is inconvenient for testing
> > as it can't be altered in the test... or at least I didn't find a way to
> > alter it. Also, future features, such as archiving older collections from
> > a TRA, might find it useful to be able to disconnect the older collections
> > from the alias, but right now that would require finding all cores and
> > editing properties for all of them...
> >
> > However BasicZkTest fails on this assert, because the fetching of
> > properties fails, throwing an exception.
> >
> > So is this assert really reasonable? It kind of feels unreasonable but
> > I'd like some background from other folks here...
> > https://issues.apache.org/jira/browse/SOLR-7819 seems to have discussed
> > this some, but the more I think about it, the more I'm convinced that
> > proceeding without zookeeper available seems dangerous. Any update sent to
> > an alias (TRA/CRA or regular) will need to check zookeeper, for example...
> > Also security.json is in zookeeper, so anyone running with security on
> > probably tries to hit zookeeper on a cache miss too.
> >
> > I guess it comes down to the question of whether or not solr cloud
> > should work while zookeeper is down/unavailable. This is the first I've
> > run into the notion that the answer might be yes. I'd always presumed that
> > if Zk went away all bets were off, because ZK is what makes a cloud out of
> > us.
> >
> > What I don't know is what existing use cases/installs might find this
> > assert critical (most of the above bug talked about LIR, and the comment on
> > the commit mentions leader election).
> >
> > Thoughts?
> >
> > -Gus