I've met several clients who really didn't want to manage zookeeper as an
additional service (I've talked some into it anyway, but it was clearly a
key reason they hadn't started/gone cloud). I think cloud would be far more
palatable if it were all "part of solr": no plumbing the docs of some other
project entirely, no requisitioning additional hardware, and no service
scripts, monitoring, or support that isn't "solr" support. That would
alleviate some of the pain that folks in small sub-sections of moderate to
large orgs feel at the idea of using cloud. These folks face long
procurement cycles, disaster/recovery plans etc., despite often having team
sizes under 20, or face having to educate large IT departments into
handling deployments when they themselves are new (of course that's how
some of them wind up hiring folks like me... but that's a barrier too,
since that has to be approved too). Also, I've met folks who didn't
understand that it was possible to have a 1 node "cluster" with zk on the
same machine, and had the impression that 5 boxes (2 solr and 3 zk) were
absolutely required to run cloud. They are, of course, for high
availability with no SPOF, but they are not required if you don't need
high availability.

I think to sunset "user managed" we need to figure out how to self-manage
embedded zookeepers. In particular, setup for smaller orgs or lower
traffic installs should look like:

A) Start Node 1 with zk embedded... if you only need one node and don't
want high availability etc., done.
B) Start Node 2, telling it the zk url for node 1. Node 2 comes up and
offers to participate in zk, but does not join because that would make an
even number.
C) Start Node 3, telling it the zk url for node 1 (node 2 hasn't started
zk). Node 3 offers to participate in zk, and now with 2 offers pending,
both 2 and 3 get up to date on the current state and then join. Now the
embedded zk cluster is 3 nodes, not one, and no SPOF... as they grow...
D) Node 4 - like node 2, but can use the zk url of any of 1,2,3
E) Node 5 - like node 3, but can use the zk url of any of 1,2,3

Obviously, we'd also need features for users to set a cap on the size of
zk clusters (no need for 49 nodes on 50 servers...), to ensure they put
their data in a convenient place that is well documented, docs on how to
secure the inter-node connections, clarity in the admin UI about which
nodes have zk, etc.
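
To make the join rule in A-E concrete, here is a minimal sketch of the idea
as a toy simulation. It is not Solr or ZooKeeper code: the class name
EmbeddedEnsemble, the method names, and the max_size cap are all made up for
illustration, and it ignores everything hard (state sync, reconfiguration,
failures). It only shows that admitting pending offers two at a time keeps
the ensemble size odd and under a cap.

```python
from dataclasses import dataclass, field


@dataclass
class EmbeddedEnsemble:
    """Toy model of the join rule in steps A-E (hypothetical, not a Solr API)."""

    max_size: int = 5  # hypothetical user-configured cap on zk members
    members: list = field(default_factory=list)
    pending: list = field(default_factory=list)

    def start_node(self, name):
        """A node starts and offers to run an embedded zk."""
        if not self.members:
            # Step A: the first node forms a one-member ensemble.
            self.members.append(name)
            return
        # Steps B and D: a lone offer waits, since joining would make
        # the ensemble size even.
        self.pending.append(name)
        # Steps C and E: with two offers pending, both join together,
        # as long as that does not exceed the cap.
        while len(self.pending) >= 2 and len(self.members) + 2 <= self.max_size:
            self.members.append(self.pending.pop(0))
            self.members.append(self.pending.pop(0))

    def tolerated_failures(self):
        """A majority quorum of n members survives (n - 1) // 2 failures."""
        return max(len(self.members) - 1, 0) // 2


ensemble = EmbeddedEnsemble(max_size=5)
sizes = []
for n in range(1, 8):
    ensemble.start_node(f"node{n}")
    sizes.append(len(ensemble.members))
print(sizes)
```

With a cap of 5, nodes 6 and 7 stay pending and never join zk, which is the
"don't need 49 nodes on 50 servers" behavior.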

For this embedded zk use case we should document whatever the user needs to
know so they don't have to sort through docs at an entirely different
project not necessarily focused on the things solr users need.

Certainly we would still advocate for a separate zk cluster for better
performance/stability. In essence, embedded zk would be a supported mode
with known limitations... True, we have to support all THAT code instead,
but the available feature set becomes consistent and a bazillion checks to
see if we have zkStateReader (or some other sentinel for cloud mode) can
disappear, so probably a net gain.

On the flip side, I've also had the thought that cluster state management
should be pluggable, such that if a better tool than zk, or merely an
"already installed" tool, is available, solr could use it. Without careful
thought, everything I just said could take us in the opposite direction.

Maybe running zk embedded is "Solr Fog" mode :)

On Mon, Aug 9, 2021 at 2:55 PM Houston Putman <houstonput...@gmail.com>
wrote:

> I agree with David that the first step would be to make SolrCloud the
> default mode.
> I made a dev list thread about this a few months ago, but I think I failed
> to respond at some point.
> I will get back on that and address the
>
> I also really like Mike's idea that we enable very similar use cases with
> embedded Zookeeper's,
> if at all possible, to make the transition easy for users who want to stay
> on the user-managed mode.
>
> Marcus, I think it would be a great idea to fix up the documentation to
> make SolrCloud the first and most prominent mode advertised.
> Never saw your original PR, but would love to give it a look if you
> resuscitate it at some point.
>
> - Houston
>
> On Mon, Aug 9, 2021 at 2:48 PM David Smiley <dsmi...@apache.org> wrote:
>
>> Given that SolrCloud is not even the default mode, I think it is
>> premature to deprecate standalone mode.  Let's do this first and maybe
>> consider deprecating standalone after some time?
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Mon, Aug 9, 2021 at 1:58 PM Mike Drob <md...@mdrob.com> wrote:
>>
>>> Could we simulate user managed replication with an embedded zookeeper
>>> on the primary and pull replicas on the followers?
>>>
>>> On Mon, Aug 9, 2021 at 12:56 PM Jason Gerlowski <gerlowsk...@gmail.com>
>>> wrote:
>>> >
>>> > Hey Marcus,
>>> >
>>> > The places I've worked in the past have all used SolrCloud primarily
>>> > so I can't speak to any specifics, but my impression from reading
>>> > user-list traffic is that a sizable chunk of Solr's user base prefers
>>> > "User-Managed" mode (formerly called "standalone").  Some because they
>>> > don't want to manage a ZooKeeper cluster.  Some because the
>>> > replication model in 'user-managed' fits their needs better.  Some I
>>> > imagine just haven't bothered to update in many years.
>>> >
>>> > I'm absolutely sympathetic to efforts to streamline development and
>>> > reduce collective debt, but it might be tough to displace such a big
>>> > chunk of users.  I'm curious what others think though.  Maybe the
>>> > proportion of 'user-managed' users out there is smaller than I
>>> > imagine.
>>> >
>>> > Jason
>>> >
>>> > On Fri, Aug 6, 2021 at 11:59 PM Marcus Eagan <marcusea...@gmail.com>
>>> wrote:
>>> > >
>>> > > Hello again,
>>> > >
>>> > > Has the time come for us to reduce scope to move faster and with
>>> more focus? Even for those not in the cloud, SolrCloud has been the
>>> undisputed performance and usability champ since version 8.0. In version
>>> 9.0, I'd like to propose that the deciders in the community deprecate
>>> standalone mode in favor of SolrCloud.
>>> > >
>>> > > There are a few drivers:
>>> > >
>>> > > We only need to support changes that impact SolrCloud going forward.
>>> I know that this is hard to stomach. But by the time Solr reaches version
>>> 10.0, everyone should have migrated to SolrCloud as there is little reason
>>> to continue to run standalone.
>>> > > The new features keep coming to SolrCloud, but not to standalone.
>>> You can see in a few ways how I embarrassingly discovered this late one
>>> night while trying out a PR. If not careful, users can accidentally start
>>> Solr in standalone mode. Think of all the features that they will see
>>> documented but not in their environment. What a confusing user experience?
>>> > > Last but certainly not least, the number of contributors to the
project, and the velocity of those contributions has dropped. It does not
>>> have to be that way, though. Two ways are for the community to observe our
>>> push for modernization and improved user experience. Simply eliminating the
>>> need to include the -c flag in the start command would be a huge win for
many engineers. We should make life easier for our users as much as the
>>> maintainers here. We can strive to make the upgrade process from 9 to 10
>>> very simple.
>>> > >
>>> > > I tried to make one step in this direction last year by re-ordering
>>> the README to show the Solr Cloud command before the standalone command. I
>>> believe that patch died on the vine, but I would be excited to revive it to
>>> document this effort when the time is appropriate.
>>> > >
>>> > > Reason not to do it:
>>> > >
>>> > >  Some large company out there might view this move as introducing
>>> risk. I view the risk here as negligible but I welcome any perspective
>>> there.
>>> > > Some things I inevitably don't know.
>>> > >
>>> > > What do you all think?
>>> > >
>>> > > Thank you all for your voluntary contributions,
>>> > > --
>>> > > Marcus Eagan
>>> > >
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>>> > For additional commands, e-mail: dev-h...@solr.apache.org
>>> >
>>>

-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)
