I think Uwe is basically agreeing with my point -- we should not scare people away from embedded ZK. We needn't wait for ZK v3.7; this is a matter of documentation and maybe warnings we emit. Done.
~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

On Thu, Aug 12, 2021 at 7:09 AM Uwe Schindler <u...@thetaphi.de> wrote:

> I am fully happy with something that works out of the box.
>
> The main problem I see with many customers is not only the complexity of setup, but also that you need to install a separate ZooKeeper ensemble. When you tell them "Come on, use the one provided by a Solr node and you are fine", they reply: “No, this is not allowed, see doc xy”.
>
> So let us please simplify the recommendations: if you have one, two, or three nodes in standalone mode, it is perfectly fine to use embedded ZooKeeper. We should not overreact here. A user who used Master/Slave replication is also not fully fault tolerant.
>
> I’d change the documentation to say something like: “If you want to scale, use a separate ZooKeeper ensemble with a minimum of three nodes. But for simple setups just relying on the good old master/slave replication (not the default Solr one that distributes indexing), it is perfectly fine to use embedded ZooKeeper (on the “master” node that holds the main index). This setup is then not really different from classical master/slave replication.”
>
> As said before, I am not against Solr Cloud, but let's keep it simple for people that want to keep it simple. I am also fine with starting a single-node cluster with ZooKeeper, but this should be the embedded one (just as a datastore for the fake cluster). And no warnings should be printed. Maybe as soon as you add too many nodes, print some warning: “now it is time to set up a separate ZooKeeper ensemble”. But please, not for 2 nodes (master/slave).
>
> Also, where is the problem in spawning an embedded ZooKeeper in every node by default? Why does it need to be separated?
>
> Uwe
>
> -----
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> *From:* Jan Høydahl <jan....@cominvent.com>
> *Sent:* Wednesday, August 11, 2021 4:27 PM
> *To:* dev@solr.apache.org
> *Subject:* Re: SolrCloud Alone: Deprecate Standalone Mode
>
> > However, we tell people not to use the embedded ZK in production, so I’m curious if that’s only because it’s a single-node ZK or if there is something else about the way we’ve embedded it that we would need to change?
>
> As I recall there are several reasons. First, our embedded ZK was kind of a hack with some forked code etc. Second, it is not designed to be fault tolerant: even if you start three Solr nodes this way, we cannot form a quorum. And perhaps third, ZK has not been officially supported on Windows. However, I believe this is all solvable if we want to. Not saying it is easy though :)
>
> Jan
>
> On Aug 11, 2021, at 16:17, Cassandra Targett <casstarg...@gmail.com> wrote:
>
> So basically the proposal would be that we use the embedded ZK to automatically create a quorum via multiple nodes. That’s an interesting idea.
>
> However, we tell people not to use the embedded ZK in production, so I’m curious if that’s only because it’s a single-node ZK or if there is something else about the way we’ve embedded it that we would need to change?
>
> I was also under the impression that beyond the complexities of ZK there are still use cases that SolrCloud does not adequately support, even with the addition of TLOG and PULL replicas. Does anyone have any examples of those?
>
> I’d also like to remind folks to please not use the terminology “master/slave”; we removed it from the code and documentation because it’s not inclusive for our community.
>
> Similarly, “standalone” has always been rather imprecise - it’s not “standalone”, it’s a cluster but without ZK and other automation sugar.
> In the Ref Guide we’ve settled on “user-managed”. It sounds pedantic but it matters because we should be really clear about what we’re talking about - deprecating and removing the ability for a single-node Solr installation? Only the mode of a non-ZK cluster? Both?
>
> On Aug 11, 2021, 6:39 AM -0500, Eric Pugh <ep...@opensourceconnections.com>, wrote:
>
> For small setups I’ve used a single ZK and a single Solr node very successfully, and the operational benefits of all the SolrCloud APIs have been fantastic.
>
> I’ve always thought that having ZooKeeper as this “front and center” requirement for SolrCloud was a weird decision that would put off a lot of folks. We don’t beat our potential users over the head with the fact that we use Jetty, for example. It’s just part of the stack.
>
> The flow that Gus proposed should have been added to SolrCloud a long time ago; how much easier would it have made all our lives! The entire existence of ZooKeeper should be behind APIs and be an abstraction. We should do this regardless of whether we deprecate standalone!
>
> Uwe, if we had what Gus proposed, but eliminated ZK, would that map much more to what you wanted? Here is my attempt at retelling the story that Gus told, but to meet the goals of folks who might want to move to ES for ease:
>
> A) Start Node 1.
> B) Start Node 2, telling it that Node 1 exists. Node 2 comes up, joins the network, and messages “at risk for split brain”.
> C) Start Node 3, telling it that Node 1 exists. Node 1, Node 2, and Node 3 are all sharing state via ZK under the covers, and message “no risk for split brain”.
> D) Node 4 - like Node 2, but since we have optimum quorum it doesn’t add to ZK (under the covers, hidden from the user).
> E) Node 5 - like Node 3, but since we have optimum quorum it doesn’t add to ZK (under the covers, hidden from the user).
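[Editor's sketch] The “at risk” versus “no risk” messages in Eric's steps B) and C) follow from majority-quorum arithmetic. The snippet below is only an illustration, assuming ZooKeeper's default majority quorum; the function names are hypothetical and nothing here is Solr or ZooKeeper code.

```python
def majority(ensemble_size: int) -> int:
    """Votes needed for a strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Voters that can be lost while a majority is still reachable."""
    return ensemble_size - majority(ensemble_size)


if __name__ == "__main__":
    # Print the trade-off for the ensemble sizes discussed in the thread.
    for size in range(1, 6):
        print(f"{size} voter(s): majority={majority(size)}, "
              f"tolerates {tolerated_failures(size)} failure(s)")
```

Under that assumption, two voters tolerate no failures while three tolerate one, and a fourth voter adds nothing over three, which is consistent with the thread's preference for odd-sized ensembles.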
>
> On Aug 11, 2021, at 7:15 AM, Uwe Schindler <u...@thetaphi.de> wrote:
>
> Hi,
>
> most of my customers prefer standalone mode and manual replication. A lot of setups, especially in Germany, are very
>
> Solr Cloud is only interesting to large customers that want to scale hugely. But from what I have seen, most of those have moved to Elasticsearch or Opensearch (see below). The biggest issue is always the stupidness of having to maintain a separate ZooKeeper cloud, which adds more hardware/VMs to the game and makes the thing more complex. If you want to maintain up to 4 or 6 Solr nodes with one index and a few shards, the overhead of ZooKeeper (you need 3 of them) is, sorry to say, unmaintainable. With Elasticsearch it’s easy to set up. No dedicated cloud/standalone mode. You just start a single node and test it. If it works fine, you start additional nodes to form a cloud. Plain simple. Config files are easy to handle, you need no IP addresses hardcoded into ZooKeeper nodes, it just works. If you don’t want to make people move to Elasticsearch/Opensearch, make them happy with their fully controllable local master/slave mode.
>
> So my strong -1 to making cloud mode the default and deprecating standalone mode. Unless both are the same and work without a separate ZooKeeper cluster, I won’t change my vote.
>
> Uwe
>
> -----
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> *From:* Gus Heck <gus.h...@gmail.com>
> *Sent:* Tuesday, August 10, 2021 8:34 PM
> *To:* dev@solr.apache.org
> *Subject:* Re: SolrCloud Alone: Deprecate Standalone Mode
>
> Or to keep things fast without retaining all the checks, one could provide slow/fast modes for tests, with fast requiring a local ZooKeeper external to the tests and the tests properly namespacing themselves... that does imply reworking some tests.
>
> Now that I say the above, it would be interesting if some of the tests could (also optionally) properly isolate themselves within an externally running Solr (probably started via cloud.sh with the latest edits): develop, cloud.sh, test manually, run tests against the same. I expect that there are still tests for which that makes no sense, of course. This is probably a crazier idea than using an external ZooKeeper, however, where zkChroot should be sufficient to isolate things, I think...
>
> -Gus
>
> On Tue, Aug 10, 2021 at 2:22 PM David Smiley <dsmi...@apache.org> wrote:
>
> Good call-out on perceived complexity due to running 3 ZK nodes. For many small installations, honestly Solr's embedded ZK is fine. Also, again for small installations, running ZK alongside Solr (same hardware) is fine. We shouldn't needlessly shame users away from doing these things as if it's irresponsible. There's a spectrum of demands on Solr from low to high. Anyway, I suspect it's increasingly moot with more Docker & Kubernetes being used to reduce the hassles of deploying any service (be it Solr or whatever). This will only increase going forward.
>
> Even if ZK becomes the only mode, I expect many checks in our codebase that conditionally check for ZK to remain. We want tests that don't care about SolrCloud mode to be fast, and that means not running unnecessary things like ZooKeeper.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
> On Tue, Aug 10, 2021 at 12:23 PM Gus Heck <gus.h...@gmail.com> wrote:
>
> I've met several clients who really didn't want to manage ZooKeeper as an additional service (I've talked some into it anyway, but it was clearly a key reason they hadn't started/gone cloud).
> I think it would be far more palatable if it's all "part of Solr", doesn't require plumbing the docs of some other project entirely, and requires neither requisitioning additional hardware nor service scripts, monitoring, or support that isn't "Solr" support, etc. I think that alleviates some of the pain that folks in small sub-sections of moderate to large orgs feel at the idea of using cloud. These folks face long procurement cycles and disaster/recovery plans etc., despite often having team sizes under 20... or face having to educate large IT departments into handling deployments when they themselves are new (of course that's how some of them wind up hiring folks like me... but that's a barrier too, since that has to be approved too). Also, I've met folks who didn't understand that it was possible to have a 1-node "cluster" with ZK on the same machine, and had the impression that 5 boxes (2 Solr and 3 ZK) were absolutely required to run cloud. Which it is, of course, for high availability with no SPOF, but it is not required if you don't need high availability.
>
> I think to sunset "user managed" we need to figure out how to self-manage embedded ZooKeepers. In particular, setup for smaller orgs or lower-traffic installs should look like:
>
> A) Start Node 1 with ZK embedded ... if you only need one node, don't want high availability etc., done.
> B) Start Node 2, telling it the ZK URL for Node 1. Node 2 comes up and offers to participate in ZK, but does not, because that would make an even number.
> C) Start Node 3, telling it the ZK URL for Node 1 (Node 2 hasn't started ZK). Node 3 offers to participate in ZK, and now with 2 offers pending, both 2 and 3 get up to date on the current state and then join. Now the embedded ZK cluster is 3 nodes, not one, with no SPOF. As they grow...
> D) Node 4 - like Node 2, but can use the ZK URL of any of 1, 2, 3.
> E) Node 5 - like Node 3, but can use the ZK URL of any of 1, 2, 3.
>
> Obviously we would want features for users to set a cap on the size of ZK clusters (you don't need 49 ZK nodes on 50 servers), to ensure they put their data in a convenient place that is well documented, documentation on how to secure the inter-node connections, clarity in the admin UI about which nodes have ZK, etc.
>
> For this embedded ZK use case we should document whatever the user needs to know so they don't have to sort through docs at an entirely different project not necessarily focused on the things Solr users need.
>
> Certainly we would still advocate for a separate ZK cluster for better performance/stability. In essence, a supported mode with known limitations... True, we have to support all THAT code instead, but the available feature set becomes consistent and a bazillion checks to see if we have zkStateReader (or some other sentinel for cloud mode) can disappear, so probably a net gain etc.
>
> On the flip side, I've also had the thought that cluster state management should be pluggable, such that if a better tool than ZK, or merely an "already installed" tool, is available, Solr could use it. Without careful thought, everything I just said could take us in the opposite direction.
>
> Maybe running ZK embedded is "Solr Fog" mode :)
>
> On Mon, Aug 9, 2021 at 2:55 PM Houston Putman <houstonput...@gmail.com> wrote:
>
> I agree with David that the first step would be to make SolrCloud the default mode.
> I made a dev list thread about this a few months ago, but I think I failed to respond at some point.
> I will get back on that and address the
>
> I also really like Mike's idea that we enable very similar use cases with embedded ZooKeepers, if at all possible, to make the transition easy for users who want to stay on the user-managed mode.
>
> Marcus, I think it would be a great idea to fix up the documentation to make SolrCloud the first and most prominent mode advertised.
> Never saw your original PR, but would love to give it a look if you resuscitate it at some point.
>
> - Houston
>
> On Mon, Aug 9, 2021 at 2:48 PM David Smiley <dsmi...@apache.org> wrote:
>
> Given that SolrCloud is not even the default mode, I think it is premature to deprecate standalone mode. Let's do this first and maybe consider deprecating standalone after some time?
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
> On Mon, Aug 9, 2021 at 1:58 PM Mike Drob <md...@mdrob.com> wrote:
>
> Could we simulate user managed replication with an embedded zookeeper on the primary and pull replicas on the followers?
>
> On Mon, Aug 9, 2021 at 12:56 PM Jason Gerlowski <gerlowsk...@gmail.com> wrote:
> >
> > Hey Marcus,
> >
> > The places I've worked in the past have all used SolrCloud primarily so I can't speak to any specifics, but my impression from reading user-list traffic is that a sizable chunk of Solr's user base prefers "User-Managed" mode (formerly called "standalone"). Some because they don't want to manage a ZooKeeper cluster. Some because the replication model in 'user-managed' fits their needs better. Some I imagine just haven't bothered to update in many years.
> >
> > I'm absolutely sympathetic to efforts to streamline development and reduce collective debt, but it might be tough to displace such a big chunk of users. I'm curious what others think though. Maybe the proportion of 'user-managed' users out there is smaller than I imagine.
> >
> > Jason
> >
> > On Fri, Aug 6, 2021 at 11:59 PM Marcus Eagan <marcusea...@gmail.com> wrote:
> > >
> > > Hello again,
> > >
> > > Has the time come for us to reduce scope to move faster and with more focus?
> > > Even for those not in the cloud, SolrCloud has been the undisputed performance and usability champ since version 8.0. In version 9.0, I'd like to propose that the deciders in the community deprecate standalone mode in favor of SolrCloud.
> > >
> > > There are a few drivers:
> > >
> > > We only need to support changes that impact SolrCloud going forward. I know that this is hard to stomach. But by the time Solr reaches version 10.0, everyone should have migrated to SolrCloud as there is little reason to continue to run standalone.
> > >
> > > The new features keep coming to SolrCloud, but not to standalone. You can see in a few ways how I embarrassingly discovered this late one night while trying out a PR. If not careful, users can accidentally start Solr in standalone mode. Think of all the features that they will see documented but not in their environment. What a confusing user experience!
> > >
> > > Last but certainly not least, the number of contributors to the project, and the velocity of those contributions, has dropped. It does not have to be that way, though. Two ways are for the community to observe our push for modernization and improved user experience. Simply eliminating the need to include the -c flag in the start command would be a huge win for many engineers. We should make life easier for our users as much as the maintainers here. We can strive to make the upgrade process from 9 to 10 very simple.
> > >
> > > I tried to make one step in this direction last year by re-ordering the README to show the Solr Cloud command before the standalone command. I believe that patch died on the vine, but I would be excited to revive it to document this effort when the time is appropriate.
> > >
> > > Reasons not to do it:
> > >
> > > Some large company out there might view this move as introducing risk. I view the risk here as negligible but I welcome any perspective there.
> > >
> > > Some things I inevitably don't know.
> > >
> > > What do you all think?
> > >
> > > Thank you all for your voluntary contributions,
> > > --
> > > Marcus Eagan
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> > For additional commands, e-mail: dev-h...@solr.apache.org
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>
> _______________________
> *Eric Pugh* | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com | My Free/Busy <http://tinyurl.com/eric-cal>
> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
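
[Editor's sketch] The A) through E) join flow that Gus describes earlier in this thread can be modeled as a toy simulation: the first node starts embedded ZK alone, later nodes queue an offer, and the ensemble only grows when two offers are pending, so the voter count stays odd. This is a sketch of the idea as stated in the thread, not Solr or ZooKeeper code; the class name, the `max_voters` cap, and the status strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EmbeddedZkCluster:
    """Toy model of the proposed join flow (hypothetical, not a Solr API)."""
    max_voters: int = 5  # hypothetical user-set cap on ZK ensemble size
    voters: List[str] = field(default_factory=list)
    pending: List[str] = field(default_factory=list)

    def start_node(self, name: str) -> str:
        # Step A: the very first node runs embedded ZK by itself.
        if not self.voters:
            self.voters.append(name)
            return f"{name}: started embedded zk (ensemble of 1)"
        # Growing by two would exceed the cap: run Solr only on this node.
        if len(self.voters) + 2 > self.max_voters:
            return f"{name}: joined without zk (cap reached)"
        # Steps B and D: a single offer would make an even ensemble, so wait.
        self.pending.append(name)
        if len(self.pending) < 2:
            return f"{name}: offer pending (would make an even ensemble)"
        # Steps C and E: two offers pending, so both join together.
        self.voters.extend(self.pending)
        self.pending.clear()
        return f"{name}: zk ensemble grew to {len(self.voters)}"


if __name__ == "__main__":
    cluster = EmbeddedZkCluster()
    for i in range(1, 7):
        print(cluster.start_node(f"node{i}"))
```

Adding voters in pairs is one simple way to keep the ensemble odd-sized; a real implementation would also have to handle node loss, state catch-up, and reconfiguration, which this model ignores.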