On Fri, 4 Sep, 2020, 12:05 am Erick Erickson, <erickerick...@gmail.com>
wrote:

>
>
> I wish everyone would just use Solr the way I think about it ;)
>

https://twitter.com/ichattopadhyaya/status/1210868171814473728


> > On Sep 3, 2020, at 2:11 PM, Tomás Fernández Löbbe <tomasflo...@gmail.com>
> wrote:
> >
> > I can see that some of these configurations should be moved to
> clusterprops.json, but I don’t believe that is the case for all of them.
> Some are configurations targeting the local node (e.g. the sharedLib
> path), and some are needed before connecting to ZooKeeper (the zk config).
> As for the configuration of global handlers and components: while in
> general you do want the same conf across all nodes, you may not want
> changes to take effect atomically; instead you may rely on a phased
> upgrade (rolling, blue/green, etc.), where the conf goes together with the
> binaries being deployed. I also fear that making the configuration of some
> of these components dynamic means we have to make the code handle them
> dynamically (e.g. recreate the CollectionsHandler based on a callback from
> ZooKeeper). That would rarely be exercised in reality, yet all our code
> would need to be restructured to handle it; I fear this would complicate
> the code needlessly and may introduce leaks and races of all kinds. If
> those components have configuration that should be dynamic (some toggle,
> threshold, etc.), I’d love to see those as clusterprops, key-value mostly.
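> >
> > To sketch what I mean by key-value clusterprops (urlScheme is a real
> > cluster property today; the healthcheck timeout is a made-up example of
> > the kind of toggle/threshold I'd be fine putting there):
> >
> > {
> >   "urlScheme": "https",
> >   "healthcheck.timeoutSeconds": 5
> > }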
> >
> > If we were to put this configuration in clusterprops, would that mean
> I’m only able to make config changes via the API? On a new cluster, do I
> need to start Solr and then make a Collections API call to change the
> collections handler? Or am I supposed to manually edit the clusterprops
> file before starting Solr and push it to ZooKeeper (having a file intended
> for both manual edits and API edits is bad IMO)? Maybe via the CLI, but
> I’d still need to do this for every cluster I create (vs. having solr.xml
> in my source repository and Docker image, for example). Would I also lose
> the ability to keep this configuration in my git repo?
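> >
> > To make the comparison concrete, the two workflows I'm weighing look
> > roughly like this (commands from memory, so double-check the exact
> > syntax against the ref guide):
> >
> > # Option A: mutate cluster config after startup via the Collections API
> > curl 'http://localhost:8983/solr/admin/collections?action=CLUSTERPROP&name=urlScheme&val=https'
> >
> > # Option B: hand-edit a file and push it to ZooKeeper before startup
> > bin/solr zk cp file:/path/to/clusterprops.json zk:/clusterprops.json -z zk1:2181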
> >
> > I'm +1 to keeping node configuration local to the node in the
> filesystem. Currently, that's solr.xml. I've seen comments about XML being
> difficult to read/write; I think that's personal preference, and while I
> don't see it that way, I understand lots of people do and that things have
> been moving toward other formats, so I'm open to discussing that as a
> change.
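> >
> > For reference, the kind of per-node file I have in mind is small;
> > roughly the stock solr.xml (abbreviated, from memory):
> >
> > <solr>
> >   <str name="sharedLib">${sharedLib:}</str>
> >   <solrcloud>
> >     <str name="host">${host:}</str>
> >     <int name="hostPort">${jetty.port:8983}</int>
> >     <str name="zkHost">${zkHost:}</str>
> >     <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
> >   </solrcloud>
> > </solr>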
> >
> > > However, 1, 2, and 3 are not trivial for a large number of Solr nodes,
> and if they aren’t right, diagnosing them can be “challenging”…
> > In my mind, solr.xml goes with your code. Having it up to date means
> having all your nodes run the same version of your code. As I said, this
> is the "desired state" of the cluster, but it may not be the case all the
> time (e.g. during deployments), and that's fine. Depending on how you
> manage the cluster, you may want to live with different versions for some
> time (you may have canaries or be doing a blue/green deployment, etc.).
> Realistically speaking, if you have a 500+ node cluster, you must have a
> system in place to manage configuration and versions; let's not bend over
> backwards for a situation that isn't realistic.
> >
> > Let me give an example of what I fear with making these changes atomic.
> Let's say I want to start using a new, custom HealthCheckHandler
> implementation that I've put in a jar (and let's assume the jar is already
> on all nodes). If I use solr.xml (where one can currently configure this
> implementation), I can do a phased deployment (yes, this is a restart of
> all nodes). If the health check handler is buggy and fails requests, the
> nodes with the new code will never show as healthy, so the deployment will
> likely stop (e.g. if you are using Kubernetes with probes, those instances
> will keep restarting; with an ASG in AWS you can do the same thing). If
> you make it an atomic change, bye-bye cluster: all nodes will start
> reporting unhealthy, and Kubernetes or the ASG will kill all of them. Good
> luck making API calls to revert now; there is no node left to respond to
> those requests. Hopefully you were using some sort of stable storage,
> because everything ephemeral is gone. Bringing that cluster back is going
> to be a PITA. I have seen similar things happen.
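> >
> > Concretely, the setup I'm describing is roughly this; the handler class
> > and probe numbers are hypothetical, and the solr.xml attribute and the
> > /solr/admin/info/health endpoint are as I remember them, so verify
> > before relying on it:
> >
> > <!-- solr.xml: point the node at a custom health check implementation -->
> > <solr>
> >   <str name="healthCheckHandler">com.example.MyHealthCheckHandler</str>
> > </solr>
> >
> > # Kubernetes readiness probe that gates the rolling restart on that handler
> > readinessProbe:
> >   httpGet:
> >     path: /solr/admin/info/health
> >     port: 8983
> >   periodSeconds: 10
> >   failureThreshold: 3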
> >
> >
> > On Thu, Sep 3, 2020 at 9:40 AM Erick Erickson <erickerick...@gmail.com>
> wrote:
> > bq. Isn’t solr.xml a way to hardcode config in a more flexible way than
> a Java class?
> >
> > Yes, and the problem word here is “flexible”. For a single-node system
> that flexibility is desirable, but flexibility comes at the cost of
> complexity, especially in the SolrCloud case. Here it's not so much Solr
> code complexity as operations complexity.
> >
> > For me this isn’t so much a question of functionality as
> administration/troubleshooting/barrier to entry.
> >
> > If:
> > 1. you can guarantee that every solr.xml file on every node in your
> entire 500-node cluster is up to date,
> > 2. or you can guarantee that the solr.xml stored on ZooKeeper is the one
> you intended,
> > 3. and you can guarantee that clusterprops.json in cloud mode is
> interacting properly with whichever solr.xml is read,
> > 4. then I’d have no problem with solr.xml.
> >
> > However, 1, 2, and 3 are not trivial for a large number of Solr nodes,
> and if they aren’t right, diagnosing them can be “challenging”…
> >
> > Imagine all the ways that “somehow” the solr.xml file on one or more
> nodes of a 500-node cluster didn’t get updated, and you’re trying to track
> down why query X isn’t working as you expect. Some of the time. Only when
> you happen to hit conditions X, Y, and Z on a subrequest that goes to the
> node in question (which won’t be all of the time, or possibly even a
> significant fraction of the time). Do containers matter here? Some glitch
> in Puppet or similar? Somebody didn’t follow every step of the process in
> the playbook? It doesn’t matter how you got into this situation; tracking
> it down would be a nightmare.
> >
> > Or, for that matter, suppose you’ve solved all the distribution concerns
> and _can_ guarantee 1 and 3. Then somebody pushes a solr.xml to ZK, either
> intentionally or by mistake (“Oh, I thought I was on the QA system,
> oops”). Now I get to spend a week tracking down why the guarantee of 1 is
> still true but just not relevant any more.
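> >
> > And it really is a one-liner; something like this (path and syntax from
> > memory) makes every node prefer the ZK copy over its local file on the
> > next restart:
> >
> > bin/solr zk cp file:solr.xml zk:/solr.xml -z prod-zk:2181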
> >
> > To me, it’s the same problem that is solved by the blob store for jar
> files, or having configsets in ZK. When I want something available to all
> my Solr instances, I do not want to have to run around to every node and
> determine that the object I copied there is the right one, especially if
> I’m trying to track down a problem.
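> >
> > That workflow is a single upload that every node then reads from the
> > same place, e.g. (config name and path here are just placeholders):
> >
> > bin/solr zk upconfig -n my_configs -d /path/to/configset/conf -z zk1:2181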
> >
> > Sure, all my concerns can be solved, but why make it harder than it
> needs to be? Distributed systems are hard enough already…
> >
> > FWIW,
> > Erick
> >
> >
> >
> >
> > > On Sep 3, 2020, at 11:00 AM, Ilan Ginzburg <ilans...@gmail.com> wrote:
> > >
> > >  Isn’t solr.xml a way to hardcode config in a more flexible way than
> a Java class?
> >
> >