There are a few additional improvements to embedded ZK available in 3.7 and further in 3.7.1/2 that would make it much easier and safer for us as well. I plan to pick those up when I’m done with the cache work I’m doing.
On Thu, Aug 12, 2021 at 7:31 AM David Smiley <dsmi...@apache.org> wrote:

> I think Uwe is basically agreeing with my point -- we should not scare people away from embedded ZK. We needn't wait for ZK v3.7; this is a matter of documentation and maybe warnings we emit. Done.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
> On Thu, Aug 12, 2021 at 7:09 AM Uwe Schindler <u...@thetaphi.de> wrote:
>
>> I am fully happy with something that works out of the box.
>>
>> The main problem I see with many customers is not only the complexity of setup, but also that you need to install a separate Zookeeper ensemble. When you tell them, “Come on, use the one provided by a Solr node and you are fine,” they answer: “No, this is not allowed, see doc xy”.
>>
>> So let us please simplify the recommendations: If you have one, two, or three nodes in standalone mode, it is perfectly fine to use embedded zookeeper. We should not overreact here. A user who used Master/Slave replication is also not fully fault tolerant.
>>
>> I’d change the documentation to say something like: “If you want to scale, use a separate zookeeper ensemble with a minimum of three nodes. But for simple setups just relying on the good old master/slave replication (not the default solr one that distributes indexing), it is perfectly fine to use embedded zookeeper (on the “master” node that holds the main index). This setup is then not really different from classical master/slave replication.”
>>
>> As said before, I am not against Solr cloud, but let’s keep it simple for people that want to keep it simple. I am also fine with starting a single-node cluster with zookeeper, but this should be the embedded one (just as datastore for the fake cluster). And no warnings should be printed. Maybe as soon as you add too many nodes, print some warning: “now it is time to set up a separate zookeeper ensemble”.
>> But, please not for 2 nodes (master/slave).
>>
>> Also where is the problem in spawning an embedded zookeeper in every node by default? Why does it need to be separated?
>>
>> Uwe
>>
>> -----
>> Uwe Schindler
>> Achterdiek 19, D-28357 Bremen
>> https://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>> *From:* Jan Høydahl <jan....@cominvent.com>
>> *Sent:* Wednesday, August 11, 2021 4:27 PM
>> *To:* dev@solr.apache.org
>> *Subject:* Re: SolrCloud Alone: Deprecate Standalone Mode
>>
>> However, we tell people not to use the embedded ZK in production, so I’m curious if that’s only because it’s a single-node ZK or if there is something else about the way we’ve embedded it that we would need to change?
>>
>> As I recall there are several reasons. First, our embedded ZK was kind of a hack with some forked code etc. Second, it is not designed to be fault tolerant: even if you start three solr nodes this way, we cannot form a quorum. And perhaps third, ZK has not been officially supported on Windows. However, I believe this is all solvable if we want to. Not saying it is easy though :)
>>
>> Jan
>>
>> On 11 Aug 2021, at 16:17, Cassandra Targett <casstarg...@gmail.com> wrote:
>>
>> So basically the proposal would be that we use the embedded ZK to automatically create a quorum via multiple nodes. That’s an interesting idea.
>>
>> However, we tell people not to use the embedded ZK in production, so I’m curious if that’s only because it’s a single-node ZK or if there is something else about the way we’ve embedded it that we would need to change?
>>
>> I was also under the impression that beyond the complexities of ZK there are still use cases that SolrCloud does not adequately support, even with the addition of TLOG and PULL replicas. Does anyone have any examples of those?
>>
>> I’d also like to remind folks to please not use the terminology “master/slave”; we removed it from the code and documentation because it’s not inclusive for our community.
>>
>> Similarly, “standalone” has always been rather imprecise - it’s not “standalone”, it’s a cluster but without ZK and other automation sugar. In the Ref Guide we’ve settled on “user-managed”. It sounds pedantic but it matters because we should be really clear about what we’re talking about - deprecating and removing the ability for a single-node Solr installation? Only the mode of a non-ZK cluster? Both?
>>
>> On Aug 11, 2021, 6:39 AM -0500, Eric Pugh <ep...@opensourceconnections.com>, wrote:
>>
>> For small setups I’ve used a single ZK and a single Solr node very successfully; the operational benefits of all the SolrCloud APIs have been fantastic.
>>
>> I’ve always thought that us having ZooKeeper as this “front and center” requirement for SolrCloud was a weird decision that would put off a lot of folks. We don’t beat our potential users over the head with the fact we use Jetty for example. It’s just part of the stack.
>>
>> The flow that Gus proposed should have been added to SolrCloud a long time ago; how much easier it would have made all our lives! The entire existence of ZooKeeper should be behind APIs and be an abstraction. We should do this regardless of whether we deprecate standalone!
>>
>> Uwe, if we had what Gus proposed, but eliminated zk, would that map much more to what you wanted? Here is my attempt at retelling the story that Gus told, but to meet the goals of folks who might want to move to ES for ease:
>>
>> A) Start Node 1.
>> B) Start Node 2 telling it that Node 1 exists. Node 2 comes up, joins network and messages “at risk for split brain”.
>> C) Start Node 3 telling it that node 1 exists.
>> Node 1, node 2, and node 3 are all sharing state via ZK under the covers, and message “no risk for split brain”.
>> D) Node 4 - like node 2 but since we have optimum quorum doesn’t add to ZK (under covers, hidden from user).
>> E) Node 5 - like node 3, but since we have optimum quorum doesn’t add to ZK (under covers, hidden from user).
>>
>> On Aug 11, 2021, at 7:15 AM, Uwe Schindler <u...@thetaphi.de> wrote:
>>
>> Hi,
>>
>> most of my customers prefer standalone mode and manual replication. A lot of setups, especially in Germany, are very
>>
>> Solr Cloud is only interesting to large customers that want to scale hugely. But from what I have seen, most of those have moved to Elasticsearch or Opensearch (see below). The biggest issue is always the stupidness of having to maintain a separate Zookeeper cloud, which adds more hardware/VMs to the game and makes the thing more complex. If you want to maintain up to 4 or 6 Solr nodes with one index and a few shards, the overhead of Zookeeper (you need 3 of them) is – sorry to say – unmaintainable. With Elasticsearch it’s easy to set up. No dedicated cloud/standalone mode. You just start a single node and test it. If it works fine, you start additional nodes to form a cloud. Plain simple. Config files are easy to handle, you need no ip addresses hardcoded into Zookeeper nodes, it just works. If you don’t want to make people move to Elasticsearch/Opensearch, make them happy with their fully controllable local master/slave mode.
>>
>> So my strong -1 to make cloud mode the default and deprecate standalone mode. Unless both are the same and work without a separate zookeeper cluster, I won’t change my vote.
>>
>> Uwe
>>
>> -----
>> Uwe Schindler
>> Achterdiek 19, D-28357 Bremen
>> https://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>> *From:* Gus Heck <gus.h...@gmail.com>
>> *Sent:* Tuesday, August 10, 2021 8:34 PM
>> *To:* dev@solr.apache.org
>> *Subject:* Re: SolrCloud Alone: Deprecate Standalone Mode
>>
>> Or to keep things fast without retaining all the checks, one could provide slow/fast modes for tests, fast requiring a local zookeeper external to the tests, with the tests properly namespacing themselves... that does imply reworking some tests.
>>
>> Now that I say the above, it would be interesting if some of the tests could (also optionally) properly isolate themselves within an externally running solr (probably started via cloud.sh with the latest edits). ... develop, cloud.sh, test manually, run tests against same. I expect that there are still tests for which that makes no sense of course. This is probably a crazier idea than using an external zookeeper however, where zkChroot should be sufficient to isolate things I think...
>>
>> -Gus
>>
>> On Tue, Aug 10, 2021 at 2:22 PM David Smiley <dsmi...@apache.org> wrote:
>>
>> Good call-out on perceived complexity due to running 3 ZK nodes. For many small installations, honestly Solr's embedded ZK is fine. Also, again for small installations, running ZK alongside Solr (same hardware) is fine. We shouldn't needlessly shame users away from doing these things as if it's irresponsible. There's a spectrum of demands on Solr from low to high. Anyway, I suspect it's increasingly moot with more Docker & Kubernetes being used to reduce the hassles of deploying any service (be it Solr or whatever). This will only increase going forward.
>>
>> Even if ZK becomes the only mode, I expect many checks in our codebase that conditionally check for ZK to remain. We want tests that don't care about SolrCloud mode to be fast, and that means not running unnecessary things like ZooKeeper.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>> On Tue, Aug 10, 2021 at 12:23 PM Gus Heck <gus.h...@gmail.com> wrote:
>>
>> I've met several clients who really didn't want to manage zookeeper as an additional service (I've talked some into it anyway, but it was clearly a key reason they hadn't started/gone cloud). I think it would be far more palatable if it's all "part of solr", doesn't require plumbing the docs of some other project entirely, and requires neither requisitioning additional hardware nor service scripts, monitoring, support that isn't "solr" support... etc... then I think that alleviates some of the pain that folks in small sub-sections of moderate to large orgs feel at the idea of using cloud. These folks face long procurement cycles and disaster/recovery plans etc, despite often having team sizes under 20... or face having to educate large IT departments into handling deployments when they themselves are new (of course that's how some of them wind up hiring folks like me... but that's a barrier too since that has to be approved too). Also I've met folks who didn't understand that it was possible to have a 1 node "cluster" with zk on the same machine, and had the impression that 5 boxes (2 solr and 3 zk) were absolutely required to run cloud. Which it is of course for high availability with no SPOF, but it is not required if you don't need high availability.
>>
>> I think to sunset "user managed" we need to figure out how to self manage embedded zookeepers. Most particularly, setup for smaller orgs or lower traffic installs should look like:
>>
>> A) Start Node 1 with zk embedded ... if you only need one node, don't want high availability etc, done.
>>
>> B) Start Node 2 telling it the zk url for node 1. Node 2 comes up, offers to participate in zk, but does not because that would make an even number.
>>
>> C) Start Node 3 telling it the zk url for node 1 (node 2 hasn't started zk). Node 3 offers to participate in zk, and now with 2 offers pending, both 2 and 3 get up to date on the current state and then join. Now the embedded zk cluster is 3 nodes, not one, and there is no SPOF... as they grow...
>>
>> D) Node 4 - like node 2 but can use zk url of any of 1,2,3
>>
>> E) Node 5 - like node 3, but can use zk url of any of 1,2,3
>>
>> Obviously, we would need features for users to cap the size of zk clusters (you don't need 49 nodes on 50 servers...), ensure they put their data in a convenient place that is well documented, document how to secure the inter-node connections, clarity in the admin UI of what nodes have zk etc.
>>
>> For this embedded zk use case we should document whatever the user needs to know so they don't have to sort through docs at an entirely different project not necessarily focused on the things solr users need.
>>
>> Certainly we would still advocate for a separate zk cluster for better performance/stability. In essence a supported mode with known limitations... True we have to support all THAT code instead, but the available feature set becomes consistent and a bazillion checks to see if we have zkStateReader (or some other sentinel for cloud mode) can disappear, so probably a net gain etc.
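
The join flow in steps A-E above can be modeled with a small simulation. This is only a sketch under stated assumptions: `EmbeddedEnsemble`, `start_node`, and `max_voters` are hypothetical names invented for illustration, not Solr or ZooKeeper APIs, and the real work of reconfiguring a ZooKeeper ensemble is not modeled at all. It only shows the bookkeeping rule of admitting pending offers two at a time so the voter count stays odd, plus a user-set cap on ensemble size.

```python
from dataclasses import dataclass, field


@dataclass
class EmbeddedEnsemble:
    """Toy model of the proposed join flow; keeps the voter count odd."""

    max_voters: int = 5  # hypothetical user-set cap on embedded zk size
    voters: list = field(default_factory=list)
    pending: list = field(default_factory=list)

    def start_node(self, name: str) -> str:
        # Step A: the first node runs embedded zk on its own.
        if not self.voters:
            self.voters.append(name)
            return f"{name}: started with embedded zk (1 voter)"

        # Steps B-E: every later node only offers to participate.
        self.pending.append(name)

        # Admit offers in pairs so the ensemble size stays odd,
        # and never grow past the configured cap.
        if len(self.pending) >= 2 and len(self.voters) + 2 <= self.max_voters:
            joined = [self.pending.pop(0), self.pending.pop(0)]
            self.voters.extend(joined)
            return f"{name}: {joined} joined zk ({len(self.voters)} voters)"

        return f"{name}: offer pending ({len(self.voters)} voters)"


if __name__ == "__main__":
    ensemble = EmbeddedEnsemble()
    for node in ["node1", "node2", "node3", "node4", "node5"]:
        print(ensemble.start_node(node))
```

Setting `max_voters=3` gives the variant where nodes 4 and 5 never join the ensemble, which is closer to the "optimum quorum" retelling elsewhere in this thread.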
>>
>> On the flip side I've also had the thought that cluster state management should be pluggable such that if a better tool than zk, or merely an "already installed" tool is available solr could use it. Without careful thought everything I just said could take us in the opposite direction.
>>
>> Maybe running zk embedded is "Solr Fog" mode :)
>>
>> On Mon, Aug 9, 2021 at 2:55 PM Houston Putman <houstonput...@gmail.com> wrote:
>>
>> I agree with David that the first step would be to make SolrCloud the default mode.
>> I made a dev list thread about this a few months ago, but I think I failed to respond at some point.
>> I will get back on that and address the
>>
>> I also really like Mike's idea that we enable very similar use cases with embedded ZooKeepers, if at all possible, to make the transition easy for users who want to stay on the user-managed mode.
>>
>> Marcus, I think it would be a great idea to fix up the documentation to make SolrCloud the first and most prominent mode advertised.
>> Never saw your original PR, but would love to give it a look if you resuscitate it at some point.
>>
>> - Houston
>>
>> On Mon, Aug 9, 2021 at 2:48 PM David Smiley <dsmi...@apache.org> wrote:
>>
>> Given that SolrCloud is not even the default mode, I think it is premature to deprecate standalone mode. Let's do this first and maybe consider deprecating standalone after some time?
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>> On Mon, Aug 9, 2021 at 1:58 PM Mike Drob <md...@mdrob.com> wrote:
>>
>> Could we simulate user managed replication with an embedded zookeeper on the primary and pull replicas on the followers?
>>
>> On Mon, Aug 9, 2021 at 12:56 PM Jason Gerlowski <gerlowsk...@gmail.com> wrote:
>> >
>> > Hey Marcus,
>> >
>> > The places I've worked in the past have all used SolrCloud primarily so I can't speak to any specifics, but my impression from reading user-list traffic is that a sizable chunk of Solr's user base prefers "User-Managed" mode (formerly called "standalone"). Some because they don't want to manage a ZooKeeper cluster. Some because the replication model in 'user-managed' fits their needs better. Some I imagine just haven't bothered to update in many years.
>> >
>> > I'm absolutely sympathetic to efforts to streamline development and reduce collective debt, but it might be tough to displace such a big chunk of users. I'm curious what others think though. Maybe the proportion of 'user-managed' users out there is smaller than I imagine.
>> >
>> > Jason
>> >
>> > On Fri, Aug 6, 2021 at 11:59 PM Marcus Eagan <marcusea...@gmail.com> wrote:
>> > >
>> > > Hello again,
>> > >
>> > > Has the time come for us to reduce scope to move faster and with more focus? Even for those not in the cloud, SolrCloud has been the undisputed performance and usability champ since version 8.0. In version 9.0, I'd like to propose that the deciders in the community deprecate standalone mode in favor of SolrCloud.
>> > >
>> > > There are a few drivers:
>> > >
>> > > We only need to support changes that impact SolrCloud going forward. I know that this is hard to stomach. But by the time Solr reaches version 10.0, everyone should have migrated to SolrCloud as there is little reason to continue to run standalone.
>> > > The new features keep coming to SolrCloud, but not to standalone. You can see in a few ways how I embarrassingly discovered this late one night while trying out a PR. If not careful, users can accidentally start Solr in standalone mode.
>> > > Think of all the features that they will see documented but not in their environment. What a confusing user experience!
>> > > Last but certainly not least, the number of contributors to the project, and the velocity of those contributions, has dropped. It does not have to be that way, though. Two ways to change that are for the community to see us push for modernization and an improved user experience. Simply eliminating the need to include the -c flag in the start command would be a huge win for many engineers. We should make life easier for our users as much as the maintainers here. We can strive to make the upgrade process from 9 to 10 very simple.
>> > >
>> > > I tried to make one step in this direction last year by re-ordering the README to show the Solr Cloud command before the standalone command. I believe that patch died on the vine, but I would be excited to revive it to document this effort when the time is appropriate.
>> > >
>> > > Reasons not to do it:
>> > >
>> > > Some large company out there might view this move as introducing risk. I view the risk here as negligible but I welcome any perspective there.
>> > > Some things I inevitably don't know.
>> > >
>> > > What do you all think?
>> > >
>> > > Thank you all for your voluntary contributions,
>> > > --
>> > > Marcus Eagan
>> > >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>> > For additional commands, e-mail: dev-h...@solr.apache.org
>> >
>>
>> --
>> http://www.needhamsoftware.com (work)
>> http://www.the111shift.com (play)
>>
>> _______________________
>> *Eric Pugh* | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com | My Free/Busy <http://tinyurl.com/eric-cal>
>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>> This e-mail and all contents, including attachments, is considered to be Company Confidential unless explicitly stated otherwise, regardless of whether attachments are marked as such.