There are several parameters to tweak on the lb that have an impact on how
quickly it detects that a node is down.

1.  How frequently the health check URL is polled.  Polling every 15s vs 60s,
for example.
2.  How many consecutive failures it takes to remove a node from the pool.  5
failures at 60s intervals means that node won't be removed from the pool for 5
minutes!  (See the rough calculation below.)
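
The worst case is simply check interval times failure threshold. A tiny
Python sketch of that arithmetic; the numbers are only illustrative, not a
recommendation:

# Worst-case time before the LB stops sending traffic to a dead node.
check_interval_s = 60      # how often the LB polls the health check URL
unhealthy_threshold = 5    # consecutive failures before removal from the pool

worst_case_s = check_interval_s * unhealthy_threshold
print(f"~{worst_case_s}s ({worst_case_s / 60:.0f} min) of failed requests in the worst case")

# Tightening both parameters shrinks the window at the cost of more probes:
print(f"15s interval x 3 failures -> ~{15 * 3}s")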

So if you are doing maintenance and relying on failure detection to manage
the pool, then you're going to see requests fail during the detection window
created by these parameters.

If planned restarts are the case you are trying to solve, removing the node
from the lb pool before bringing it down is much safer.
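
One way to do that without touching the balancer API is to let the health
check itself take the node out of rotation: if the LB health check points at
Solr's ping handler and the handler is configured with a healthcheckFile in
solrconfig.xml, you can disable the ping before the restart and re-enable it
afterwards. A rough sketch only; the host, collection name and sleep time
below are made up for the example:

import time
import urllib.request

# Hypothetical node and collection; adjust to your own setup.
PING = "http://solr-node-1:8983/solr/mycollection/admin/ping"

def ping_action(action):
    # action=disable removes the healthcheck file so /admin/ping starts
    # failing; action=enable recreates it so the node rejoins the pool.
    with urllib.request.urlopen(f"{PING}?action={action}") as resp:
        return resp.status

ping_action("disable")   # 1. make the LB health check fail for this node

# 2. wait at least check interval * failure threshold, plus some margin
#    for in-flight requests to drain
time.sleep(60)

# 3. ... restart Solr here (heap, GC settings, upgrades, etc.) ...

ping_action("enable")    # 4. health check passes again, node rejoins the pool

Mikhail's suggestion of excluding the node via the balancer API gets you the
same result from the other side; either way the point is to drain the node
deliberately instead of waiting for failure detection to kick in.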

On Mon, Jun 19, 2023, 7:01 AM Saksham Gupta
<[email protected]> wrote:

> Thanks Mikhail, will try these approaches.
>
> On Thu, Jun 15, 2023 at 5:40 PM Mikhail Khludnev <[email protected]> wrote:
>
> > From the other POV, a node can be excluded from LB pool via balancer API
> > before restart and brought back then.
> >
> > On Wed, Jun 14, 2023 at 6:09 PM Saksham Gupta
> > <[email protected]> wrote:
> >
> > > Have you configured an URL for health check? Which one?
> > > The load balancer checks if the solr port is in use or not. If yes, then
> > > it continues sending the search requests.
> > >
> > > Or you have something like rolling restart/recycle scenarios executed?
> > > No, we don't have anything like that configured.
> > > @ufuk We may have various scenarios where we have to restart a solr node,
> > > like changing heap, gc settings, etc. How can I avoid getting 5xx in
> > > these scenarios?
> > >
> > > On Wed, Jun 14, 2023 at 1:02 PM Mikhail Khludnev <[email protected]>
> > > wrote:
> > >
> > > > Saksham, can you comment on
> > > > > if a certain port is up or not and based on that send the request to
> > > > > that node.
> > > >
> > > > Have you configured an URL for health check? Which one?
> > > >
> > > > > the coordinator node goes down after a request is sent from lb.
> > > > Do you mean nodes are failing more often than healthcheck occur?
> > > >
> > > > Or you have something like rolling restart/recycle scenarios executed?
> > > >
> > > > On Wed, Jun 14, 2023 at 8:52 AM Saksham Gupta
> > > > <[email protected]> wrote:
> > > >
> > > > > @Ufuk We are using a load balancer to avoid a single point of failure
> > > > > i.e. if all the requests have a single coordinator node then it would
> > > > > be a major issue if this solr node goes down.
> > > > >
> > > > > @Mikhail Khludnev We already have a health check configured on load
> > > > > balancer, but the requests will fail if the coordinator node goes down
> > > > > after request is sent from lb.
> > > > > Further explaining, the load balancer will check if a certain port is
> > > > > up or not and based on that send the request to that node. The issue
> > > > > is observed for cases where the coordinator node goes down after a
> > > > > request is sent from lb.
> > > > >
> > > > > Please let me know if I am missing something here.  Any other
> > > > > suggestions?
> > > > >
> > > > > On Tue, Jun 13, 2023 at 7:12 PM Mikhail Khludnev <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > Well, probably it's what Solr Operator can provide on Kubernetes.
> > > > > >
> > > > > > On Tue, Jun 13, 2023 at 10:47 AM ufuk yılmaz
> > > > > > <[email protected]> wrote:
> > > > > >
> > > > > > > Just wondered, solr cloud itself can handle node failings and
> > > > > > > load balancing. Why use an external cloud load balancer?
> > > > > > >
> > > > > > > —ufuk yilmaz
> > > > > > >
> > > > > > > —
> > > > > > >
> > > > > > > > On 13 Jun 2023, at 10:28, Mikhail Khludnev <[email protected]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > Hello
> > > > > > > > You can configure healthcheck
> > > > > > > > https://cloud.google.com/load-balancing/docs/health-check-concepts#criteria-protocol-http
> > > > > > > > with Solr's ping request handler
> > > > > > > > https://solr.apache.org/guide/solr/latest/deployment-guide/ping.html .
> > > > > > > > Also, Google cloud has sophisticated Traffic Director, which can
> > > > > > > > also suit for node failover.
> > > > > > > >
> > > > > > > > On Tue, Jun 13, 2023 at 9:13 AM Saksham Gupta
> > > > > > > > <[email protected]> wrote:
> > > > > > > >
> > > > > > > >> Hi team,
> > > > > > > >> We need help with the strategy used to request data from solr
> > > > > > > >> cloud.
> > > > > > > >>
> > > > > > > >> *Current Searching Strategy:*
> > > > > > > >> We are using solr cloud 8.10 having 8 nodes with data sharded on
> > > > > > > >> the basis of an implicit route parameter. We send a search http
> > > > > > > >> request on google's network load balancer which divides requests
> > > > > > > >> amongst the 8 solr nodes.
> > > > > > > >>
> > > > > > > >> *Problem with this strategy:*
> > > > > > > >> If solr on any one of the nodes is down, the requests that come
> > > > > > > >> to this node give 5xx.
> > > > > > > >>
> > > > > > > >> We are thinking of other strategies like
> > > > > > > >> 1. adding 2 vanilla nodes to this cluster(which will contain no
> > > > > > > >> data) which will be used for aggregating and serving requests
> > > > > > > >> i.e. instead of sending requests from lb to the 8 nodes, we will
> > > > > > > >> be sending the requests to the new nodes which will send internal
> > > > > > > >> requests on other nodes and fetch required data.
> > > > > > > >> 2. Instead of dividing requests using a load balancer, we can use
> > > > > > > >> zookeeper to connect with solr cloud.
> > > > > > > >>
> > > > > > > >> Would these strategies work? Is there a more optimized way using
> > > > > > > >> which we can request on solr?
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Sincerely yours
> > > > > > > > Mikhail Khludnev
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > Sincerely yours
> > > > > > Mikhail Khludnev
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Sincerely yours
> > > > Mikhail Khludnev
> > > >
> > >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> >
>
