Re: SolrCloud Alone: Deprecate Standalone Mode

Mike Drob Thu, 12 Aug 2021 05:37:18 -0700

There are a few additional improvements to embedded ZK available in 3.7 and
further in 3.7.1/2 that would make it much easier and safer for us as well.
I plan to pick those up when I’m done with the cache work I’m doing.


On Thu, Aug 12, 2021 at 7:31 AM David Smiley <[email protected]> wrote:

> I think Uwe is basically agreeing with my point -- we should not scare
> people away from embedded ZK.  We needn't wait for ZK v3.7; this is a
> matter of documentation and maybe warnings we emit.  Done.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Thu, Aug 12, 2021 at 7:09 AM Uwe Schindler <[email protected]> wrote:
>
>> I am fully happy with something that works out of box.
>>
>>
>>
>> The main problem I see with many customers is not only the complexity of
>> setup, but also that you need to install a separate Zookeeper ensemble.
>> When you tell them: Come on, use the one provided by a solr node and you
>> are fine: “no this is not allowed, see doc xy”.
>>
>>
>>
>> So let us please simplify the recommendations: If you have one, two, or
>> three nodes in standalone mode, it is perfectly fine to use embedded
>> zookeeper. We should not overreact here. A user who used Master/Slave
>> replication is also not fully fault tolerant.
>>
>>
>>
>> I’d change the documentation to say something like: “If you want to scale, use
>> a separate zookeeper ensemble with a minimum of three nodes. But for simple
>> setups just relying on the good old master/slave replication (not the
>> default solr one that distributes indexing), it is perfectly fine to use
>> embedded zookeeper (on the “master” node that holds the main index). This
>> setup is then not really different from classical master/slave replication.”
>>
>>
>>
>> As said before, I am not against Solr cloud, but let’s keep it simple for
>> people that want to keep it simple. I am also fine to start a single node
>> cluster with zookeeper, but this should be the embedded one (just as
>> datastore for the fake cluster). And no warnings should be printed. Maybe
>> as soon as you add too many nodes, print some warning “now it is time to
>> set up a separate zookeeper ensemble”. But, please not for 2 nodes
>> (master/slave).
>>
>>
>>
>> Also where is the problem in spawning an embedded zookeeper in every node
>> by default? Why does it need to be separated?
>>
>>
>>
>> Uwe
>>
>>
>>
>> -----
>>
>> Uwe Schindler
>>
>> Achterdiek 19, D-28357 Bremen
>>
>> https://www.thetaphi.de
>>
>> eMail: [email protected]
>>
>>
>>
>> *From:* Jan Høydahl <[email protected]>
>> *Sent:* Wednesday, August 11, 2021 4:27 PM
>> *To:* [email protected]
>> *Subject:* Re: SolrCloud Alone: Deprecate Standalone Mode
>>
>>
>>
>> However, we tell people not to use the embedded ZK in production, so I’m
>> curious if that’s only because it’s a single-node ZK or if there is
>> something else about the way we’ve embedded it that we would need to change?
>>
>>
>>
>> As I recall there are several reasons. First, our embedded ZK was kind of
>> a hack with some forked code etc. Second, it is not designed to be fault
>> tolerant: even if you start three solr nodes this way, we cannot form a
>> quorum. And perhaps third, ZK has not been officially supported on
>> Windows. However, I believe this is all solvable if we want to. Not
>> saying it is easy though :)
>>
>>
>>
>> Jan
>>
>>
>>
>> 11. aug. 2021 kl. 16:17 skrev Cassandra Targett <[email protected]>:
>>
>>
>>
>> So basically the proposal would be that we use the embedded ZK to
>> automatically create a quorum via multiple nodes. That’s an interesting
>> idea.
>>
>> However, we tell people not to use the embedded ZK in production, so I’m
>> curious if that’s only because it’s a single-node ZK or if there is
>> something else about the way we’ve embedded it that we would need to change?
>>
>> I was also under the impression that beyond the complexities of ZK there
>> are still use cases that SolrCloud does not adequately support, even with
>> the addition of TLOG and PULL replicas. Does anyone have any examples of
>> those?
>>
>> I’d also like to remind folks to please not use the terminology
>> “master/slave”, we removed it from the code and documentation because it’s
>> not inclusive for our community.
>>
>> Similarly, “standalone” has always been rather imprecise - it’s not
>> “standalone”, it’s a cluster but without ZK and other automation sugar. In
>> the Ref Guide we’ve settled on “user-managed”. It sounds pedantic but it
>> matters because we should be really clear about what we’re talking about -
>> deprecating and removing the ability for a single-node Solr installation?
>> Only the mode of a non-ZK cluster? Both?
>>
>> On Aug 11, 2021, 6:39 AM -0500, Eric Pugh <
>> [email protected]>, wrote:
>>
>> For small setups I’ve used a single ZK and a single Solr node very
>> successfully; the operational benefits of all the SolrCloud APIs have been
>> fantastic.
>>
>>
>>
>> I’ve always thought that us having ZooKeeper as this “front and center”
>> requirement for SolrCloud was always a weird decision that would put off a
>> lot of folks.   We don’t beat our potential users over the head with the
>> fact we use Jetty for example.   It’s just part of the stack.
>>
>>
>>
>> The flow that Gus proposed should have been added to SolrCloud a long
>> time ago; how much easier it would have made all our lives!   The entire
>> existence of ZooKeeper should be behind APIs and be an abstraction.  We
>> should do this regardless of whether we deprecate standalone!
>>
>>
>>
>> Uwe, if we had what Gus proposed, but eliminated zk, would that map much
>> more to what you wanted?  Here is my attempt at retelling the story that
>> Gus told, but to meet the goals of folks who might want to move to ES for
>> ease:
>>
>>
>>
>> A) Start Node 1.
>> B) Start Node 2 telling it that Node 1 exists. node 2 comes up, joins
>> network and messages “at risk for split brain”.
>> C) Start Node 3 telling it that node 1 exists. node 1, node 2, node 3 all
>> under the covers are sharing state via ZK and messages “no risk for split
>> brain”.
>> D) Node 4 - like node 2 but since we have optimum quorum doesn’t add to
>> ZK (under covers, hidden from user).
>> E) Node 5 - like node 3, but since we have optimum quorum doesn’t add to
>> ZK (under covers, hidden from user).
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Aug 11, 2021, at 7:15 AM, Uwe Schindler <[email protected]> wrote:
>>
>>
>>
>> Hi,
>>
>>
>>
>> most of my customers prefer standalone mode and manual replication. A lot
>> of setups, especially in Germany, are very
>>
>>
>>
>> Solr Cloud is only interesting to large customers that want to scale
>> hugely. But from what I have seen, most of those have moved to
>> Elasticsearch or Opensearch (see below). The biggest issue is always the
>> stupidness of having to maintain a separate Zookeeper cloud, which adds
>> more hardware/VMs to the game and makes the thing more complex. If you want
>> to maintain up to 4 or 6 Solr nodes with one index and a few shards, the
>> overhead of Zookeeper (you need 3 of them) is – sorry to say –
>> unmaintainable. With Elasticsearch it’s easy to set up. No dedicated
>> cloud/standalone mode. You just start a single node and test it. If it
>> works fine, you start additional nodes to form a cloud. Plain and simple.
>> Config files are easy to handle, you need no IP addresses hardcoded into
>> Zookeeper nodes, it just works. If you don’t want to make people move to
>> Elasticsearch/Opensearch, make them happy with their fully controllable
>> local master/slave mode.
>>
>>
>>
>> So my strong -1 to make cloud mode the default and deprecate standalone
>> mode. Unless both are the same and work without a separate zookeeper
>> cluster, I won’t change my vote.
>>
>>
>>
>> Uwe
>>
>>
>>
>> -----
>>
>> Uwe Schindler
>>
>> Achterdiek 19, D-28357 Bremen
>>
>> https://www.thetaphi.de
>>
>> eMail: [email protected]
>>
>>
>>
>> *From:* Gus Heck <[email protected]>
>> *Sent:* Tuesday, August 10, 2021 8:34 PM
>> *To:* [email protected]
>> *Subject:* Re: SolrCloud Alone: Deprecate Standalone Mode
>>
>>
>>
>> Or to keep things fast without retaining all the checks, one could
>> provide slow/fast modes for test, fast requiring a local zookeeper external
>> to the tests, with the tests properly namespacing themselves... that does
>> imply reworking some tests.
>>
>>
>>
>> Now that I say the above, it would be interesting if some of the
>> tests could (also optionally) properly isolate themselves within an
>> externally running solr (probably started via cloud.sh with the latest
>> edits)... develop, cloud.sh, test manually, run tests against same. I
>> expect that there are still tests for which that makes no sense of course.
>> This is probably a crazier idea than using an external zookeeper however,
>> where zkChroot should be sufficient to isolate things I think...
>>
>>
>>
>> -Gus
>>
>>
>>
>> On Tue, Aug 10, 2021 at 2:22 PM David Smiley <[email protected]> wrote:
>>
>> Good call-out on perceived complexity due to running 3 ZK nodes.  For
>> many small installations, honestly Solr's embedded ZK is fine.  Also, again
>> for small installations, running ZK alongside Solr (same hardware) is
>> fine.  We shouldn't needlessly shame users away from doing these things as
>> if it's irresponsible.  There's a spectrum of demands on Solr from low to
>> high.  Anyway, I suspect it's increasingly moot with more Docker &
>> Kubernetes being used to reduce the hassles of deploying any service (be it
>> Solr or whatever).  This will only increase going forward.
>>
>>
>>
>> Even if ZK becomes the only mode, I expect many checks in our codebase
>> that conditionally check for ZK to remain.  We want tests that don't care
>> about SolrCloud mode to be fast, and that means not running unnecessary
>> things like ZooKeeper.
>>
>>
>> ~ David Smiley
>>
>> Apache Lucene/Solr Search Developer
>>
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>>
>>
>>
>> On Tue, Aug 10, 2021 at 12:23 PM Gus Heck <[email protected]> wrote:
>>
>> I've met several clients who really didn't want to manage zookeeper as an
>> additional service (I've talked some into it anyway, but it was clearly a
>> key reason they hadn't started/gone cloud). I think it would be far more
>> palatable if it's all "part of solr", doesn't require plumbing the docs of
>> some other project entirely, and requires neither requisitioning additional
>> hardware nor service scripts, monitoring, support that isn't "solr"
>> support... etc... then I think that alleviates some of the pain that folks
>> in small sub-sections of moderate to large orgs feel at the idea of using
>> cloud. These folks face long procurement cycles and disaster/recovery plans
>> etc, despite often having team sizes under 20... or face having to educate
>> large IT departments into handling deployments when they themselves are new
>> (of course that's how some of them wind up hiring folks like me... but
>> that's a barrier too since that has to be approved too).  Also I've met
>> folks who didn't understand that it was possible to have a 1 node "cluster"
>> with zk on the same machine, and had the impression that 5 boxes (2 solr
>> and 3 zk) were absolutely required to run cloud. Which it is of course for
>> high availability with no SPOF, but it is not required if you don't need
>> high availability.
>>
>>
>>
>> I think to sunset "user managed" we need to figure out how to self manage
>> embedded zookeepers, most particularly setup for smaller orgs or lower
>> traffic installs should look like:
>>
>>
>>
>> A) Start Node 1 with zk embedded ... if you only need one node, don't
>> want high availability etc, done.
>>
>> B) Start Node 2 telling it the zk url for node 1. node 2 comes up, offers
>> to participate in zk, but does not because that would make an even number
>>
>> C) Start Node 3 telling it the zk url for node 1 (node 2 hasn't
>> started zk). Node 3 offers to participate in zk, and now with 2 offers
>> pending, both 2 and 3 get up to date on the current state and then join; now
>> the embedded zk cluster is 3 nodes, not one, and no SPOF... as they grow...
>>
>> D) Node 4 - like node 2 but can use zk url of any of 1,2,3
>>
>> E) Node 5 - like node 3, but can use zk url of any 1,2,3
>>
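The join flow in steps A) to E) can be sketched as a toy simulation. This is only an illustration of the proposed rule (offers to join ZK are admitted in pairs so the ensemble size stays odd); it is not Solr or ZooKeeper code. The names `EmbeddedEnsemble`, `start_node`, and `max_size` are hypothetical, and modelling the size cap as a single number is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class EmbeddedEnsemble:
    """Toy model of the proposed join rule; not real Solr/ZooKeeper code."""

    max_size: int = 5  # hypothetical cap on the number of ZK participants
    members: list = field(default_factory=list)  # nodes running embedded ZK
    pending: list = field(default_factory=list)  # nodes that offered to join

    def start_node(self, name: str) -> None:
        # Step A: the first node always runs the embedded ZK by itself.
        if not self.members:
            self.members.append(name)
            return
        # Steps B and D: a later node only offers to participate.
        self.pending.append(name)
        # Steps C and E: with two offers pending, admit both at once so the
        # ensemble size stays odd, as long as the cap is not exceeded.
        while len(self.pending) >= 2 and len(self.members) + 2 <= self.max_size:
            self.members.extend(self.pending[:2])
            del self.pending[:2]


ensemble = EmbeddedEnsemble(max_size=5)
for i in range(1, 6):
    ensemble.start_node(f"node{i}")
    # ZK ensemble size after each start: 1, 1, 3, 3, 5
    print(f"node{i} started, ZK ensemble size = {len(ensemble.members)}")
```

Nodes started after the cap is reached stay in `pending` and would act as plain Solr nodes in this model.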
>>
>>
>> Obviously, we'd want features for users to set a cap on the size of zk
>> clusters (don't need 49 nodes on 50 servers...), ensure they put their data
>> in a convenient place that is well documented, document how to secure the
>> inter-node connections, clarity in the admin UI of what nodes have zk etc.
>>
>>
>>
>> For this embedded zk use case we should document whatever the user needs
>> to know so they don't have to sort through docs at an entirely different
>> project not necessarily focused on the things solr users need.
>>
>>
>>
>> Certainly we would still advocate for a separate zk cluster for better
>> performance/stability. In essence a supported mode with known
>> limitations... True we have to support all THAT code instead, but the
>> available feature set becomes consistent and a bazillion checks to see if
>> we have zkStateReader (or some other sentinel for cloud mode) can
>> disappear, so probably a net gain etc.
>>
>>
>>
>> On the flip side I've also had the thought that cluster state management
>> should be pluggable such that if a better tool than zk, or merely an
>> "already installed" tool is available solr could use it. Without careful
>> thought everything I just said could take us in the opposite direction.
>>
>>
>>
>> Maybe running zk embedded is "Solr Fog" mode :)
>>
>>
>>
>> On Mon, Aug 9, 2021 at 2:55 PM Houston Putman <[email protected]>
>> wrote:
>>
>> I agree with David that the first step would be to make SolrCloud the
>> default mode.
>>
>> I made a dev list thread about this a few months ago, but I think I
>> failed to respond at some point.
>>
>> I will get back on that and address the
>>
>>
>>
>> I also really like Mike's idea that we enable very similar use cases with
>> embedded ZooKeepers,
>>
>> if at all possible, to make the transition easy for users who want to
>> stay on the user-managed mode.
>>
>>
>>
>> Marcus, I think it would be a great idea to fix up the documentation to
>> make SolrCloud the first and most prominent mode advertised.
>>
>> Never saw your original PR, but would love to give it a look if you
>> resuscitate it at some point.
>>
>>
>>
>> - Houston
>>
>>
>>
>> On Mon, Aug 9, 2021 at 2:48 PM David Smiley <[email protected]> wrote:
>>
>> Given that SolrCloud is not even the default mode, I think it is
>> premature to deprecate standalone mode.  Let's do this first and maybe
>> consider deprecating standalone after some time?
>>
>>
>> ~ David Smiley
>>
>> Apache Lucene/Solr Search Developer
>>
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>>
>>
>>
>> On Mon, Aug 9, 2021 at 1:58 PM Mike Drob <[email protected]> wrote:
>>
>> Could we simulate user managed replication with an embedded zookeeper
>> on the primary and pull replicas on the followers?
>>
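One way to picture the layout in the question above is a single-shard collection with one leader-eligible replica and the rest PULL replicas, created through the Collections API `CREATE` action. The sketch below only builds the request URL. The host, port, and collection name are hypothetical, and using one TLOG replica for the "primary" is an assumption, not something stated in the thread.

```python
from urllib.parse import urlencode


def create_collection_url(base_url: str, name: str, pull_replicas: int) -> str:
    """Build a Collections API CREATE URL for a primary/follower-style layout."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": 1,                 # a single index, as in user-managed setups
        "tlogReplicas": 1,              # assumption: one TLOG replica as the "primary"
        "pullReplicas": pull_replicas,  # followers that only pull index segments
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical node address and collection name, for illustration only.
url = create_collection_url("http://localhost:8983/solr", "products", 2)
print(url)
```

Whether this arrangement fully reproduces user-managed replication is exactly the open question raised here; the sketch shows only how such a collection could be requested.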
>> On Mon, Aug 9, 2021 at 12:56 PM Jason Gerlowski <[email protected]>
>> wrote:
>> >
>> > Hey Marcus,
>> >
>> > The places I've worked in the past have all used SolrCloud primarily
>> > so I can't speak to any specifics, but my impression from reading
>> > user-list traffic is that a sizable chunk of Solr's user base prefers
>> > "User-Managed" mode (formerly called "standalone").  Some because they
>> > don't want to manage a ZooKeeper cluster.  Some because the
>> > replication model in 'user-managed' fits their needs better.  Some I
>> > imagine just haven't bothered to update in many years.
>> >
>> > I'm absolutely sympathetic to efforts to streamline development and
>> > reduce collective debt, but it might be tough to displace such a big
>> > chunk of users.  I'm curious what others think though.  Maybe the
>> > proportion of 'user-managed' users out there is smaller than I
>> > imagine.
>> >
>> > Jason
>> >
>> > On Fri, Aug 6, 2021 at 11:59 PM Marcus Eagan <[email protected]> wrote:
>> > >
>> > > Hello again,
>> > >
>> > > Has the time come for us to reduce scope to move faster and with more
>> > > focus? Even for those not in the cloud, SolrCloud has been the undisputed
>> > > performance and usability champ since version 8.0. In version 9.0, I'd like
>> > > to propose that the deciders in the community deprecate standalone mode in
>> > > favor of SolrCloud.
>> > >
>> > > There are a few drivers:
>> > >
>> > > We only need to support changes that impact SolrCloud going forward.
>> > > I know that this is hard to stomach. But by the time Solr reaches version
>> > > 10.0, everyone should have migrated to SolrCloud as there is little reason
>> > > to continue to run standalone.
>> > > The new features keep coming to SolrCloud, but not to standalone. You
>> > > can see in a few ways how I embarrassingly discovered this late one night
>> > > while trying out a PR. If not careful, users can accidentally start Solr in
>> > > standalone mode. Think of all the features that they will see documented
>> > > but not in their environment. What a confusing user experience!
>> > > Last but certainly not least, the number of contributors to the
>> > > project, and the velocity of those contributions, have dropped. It does not
>> > > have to be that way, though. Two ways are for the community to observe our
>> > > push for modernization and improved user experience. Simply eliminating the
>> > > need to include the -c flag in the start command would be a huge win for
>> > > many engineers. We should make life easier for our users as much as the
>> > > maintainers here. We can strive to make the upgrade process from 9 to 10
>> > > very simple.
>> > >
>> > > I tried to make one step in this direction last year by re-ordering
>> > > the README to show the Solr Cloud command before the standalone command. I
>> > > believe that patch died on the vine, but I would be excited to revive it to
>> > > document this effort when the time is appropriate.
>> > >
>> > > Reasons not to do it:
>> > >
>> > > Some large company out there might view this move as introducing
>> > > risk. I view the risk here as negligible but I welcome any perspective
>> > > there.
>> > > Some things I inevitably don't know.
>> > >
>> > > What do you all think?
>> > >
>> > > Thank you all for your voluntary contributions,
>> > > --
>> > > Marcus Eagan
>> > >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: [email protected]
>> > For additional commands, e-mail: [email protected]
>> >
>>
>>
>>
>>
>>
>> --
>>
>> http://www.needhamsoftware.com (work)
>>
>> http://www.the111shift.com (play)
>>
>>
>>
>>
>>
>>
>>
>> _______________________
>>
>> *Eric Pugh* | Founder & CEO | OpenSource Connections, LLC |
>> 434.466.1467 | http://www.opensourceconnections.com | My Free/Busy
>> <http://tinyurl.com/eric-cal>
>>
>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
>> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>>
>>
>>
>>
>>
>>
>
