Re: Change suggestion: more efficient replica state tracking

2023-10-03 Thread Ilan Ginzburg
> I wonder if ZK session expiration and re-establishment works nicely for others? The code handling this is in ZkController.onReconnect(). Answering my own question: the issue was specific to our fork, so I assume ZK session expiration and re-establishment does work nicely in general. On Mon,

Re: Change suggestion: more efficient replica state tracking

2023-10-02 Thread Mark Miller
Oh I’m not referring to your proposal, just what happens with the current system and what I had done around DOWN state to address it. The problem there is the missing replica state when you come up, as the live node can’t cover for it. How you cover for that state could be done in a lot of ways, it’

Re: Change suggestion: more efficient replica state tracking

2023-10-02 Thread Ishan Chattopadhyaya
Thanks for bringing up this topic, Ilan. > We are also running a fork of Solr and in our fork we have made some optimizations to avoid processing DOWNNODE messages for nodes that only host PRS collections. Those optimizations have not made it upstream at this point. I can take a look at up

Re: Change suggestion: more efficient replica state tracking

2023-10-02 Thread Ilan Ginzburg
Not sure I totally follow what you mean, Mark. We thought of making the effective replica state = published replica state AND node state, which would set effective replica states to down when an ephemeral ZooKeeper node for a SolrCloud node disappears. This works nicely for the going-down part, but still requ
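A minimal sketch of that combination (effective state = published state gated by node liveness); the names below are illustrative, not Solr's actual classes:

    import java.util.Set;

    // Illustrative only: the effective replica state is the published state
    // gated by whether the node's ephemeral live node still exists.
    enum ReplicaState { ACTIVE, RECOVERING, DOWN }

    class EffectiveStateSketch {
        // liveNodes mirrors the ephemeral /live_nodes children in ZooKeeper.
        static ReplicaState effective(ReplicaState published, String nodeName, Set<String> liveNodes) {
            // If the live node is gone, the replica is down regardless of
            // what was last published for it.
            return liveNodes.contains(nodeName) ? published : ReplicaState.DOWN;
        }
    }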

Re: Change suggestion: more efficient replica state tracking

2023-10-02 Thread Mark Miller
Actually, I think what I did was move the DOWN state to startup. Since you can’t count on it on shutdown (crash, killed process, state doesn’t get published for a variety of reasons), it doesn’t do anything solid for the holes where you are indexing and a node cycles. So it can come up in any state

Re: Change suggestion: more efficient replica state tracking

2023-09-28 Thread Mark Miller
Yeah, I think a Jira issue or two was filed for it, but I didn't see anything user-facing go in. You can do it for queries by asking the overseer to publish a DOWN state though. It won't drop indexing leadership until you close the core, but it will prevent the temporary slow/hotspot you get if you

Re: Change suggestion: more efficient replica state tracking

2023-09-28 Thread Houston Putman
Yeah I’ve mentioned it a number of times, but it’s absolutely something we should have. Give up leadership, don’t accept new replicas, don’t accept new requests. Maybe remove live_node! - Houston On Thu, Sep 28, 2023 at 6:45 PM David Smiley wrote: > Somewhat related to this, we don't yet have a wa

Re: Change suggestion: more efficient replica state tracking

2023-09-28 Thread David Smiley
Somewhat related to this, we don't yet have a way to signal to the cluster that a node will soon be shut down, but has not shut down yet. With such information, we'd prefer to not send queries there, and perhaps could even begin shard/overseer leadership changes. This was mentioned somewhere in c
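One possible shape for such a signal, sketched with hypothetical names (a "draining" marker that request routing consults); this is not an existing Solr feature:

    import java.util.List;
    import java.util.Set;
    import java.util.stream.Collectors;

    // Sketch only: skip nodes that have announced an impending shutdown
    // when choosing where to send a query. drainingNodes would come from
    // some hypothetical marker in ZooKeeper.
    class DrainingAwareRoutingSketch {
        static List<String> routableNodes(List<String> candidates, Set<String> drainingNodes) {
            return candidates.stream()
                    .filter(node -> !drainingNodes.contains(node))
                    .collect(Collectors.toList());
        }
    }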

Re: Change suggestion: more efficient replica state tracking

2023-09-28 Thread Mark Miller
That did require some changes around live node handling, which is why a different approach, as you suggest, would also be reasonable. You still do want to solve for the original motivation of DOWN - stopping search traffic to the node before things start closing.

Re: Change suggestion: more efficient replica state tracking

2023-09-28 Thread Mark Miller
Yeah, I took the DOWN state out altogether on shutdown as it’s problematic and effectively sugar for the user’s view of the cluster state - as far as the system goes, if the ephemeral live node is gone, that node is down, regardless of the replica state. There is some value in being able to remove a

Re: Change suggestion: more efficient replica state tracking

2023-09-28 Thread Justin Sweeney
You are right, sorry. We are also running a fork of Solr and in our fork we have made some optimizations to avoid processing DOWNNODE messages for nodes that only host PRS collections. Those optimizations have not made it upstream at this point. I can take a look at upstreaming those changes or som

Re: Change suggestion: more efficient replica state tracking

2023-09-27 Thread Ilan Ginzburg
Justin, Thanks for your reply. My understanding is that with PRS, DOWNNODE on Overseer still iterates over all collections and marks the relevant replicas down. It may be faster, but it's not a no-op. Did I miss something? We do not use PRS in our current setup. I agree with the risk related to Z
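For context, a conceptual sketch of the per-message cost Ilan is describing (every collection and every replica is visited for a single DOWNNODE); the types are hypothetical, not the actual Overseer code:

    import java.util.List;
    import java.util.Map;

    // Hypothetical types, for illustration only.
    record ReplicaInfo(String name, String nodeName) {}
    record CollectionInfo(String name, List<ReplicaInfo> replicas) {}

    class DownNodeCostSketch {
        // One DOWNNODE message touches every collection and every replica,
        // which is why processing time grows with cluster size.
        static int markReplicasDown(String downedNode, Map<String, CollectionInfo> collections) {
            int marked = 0;
            for (CollectionInfo coll : collections.values()) {
                for (ReplicaInfo replica : coll.replicas()) {
                    if (replica.nodeName().equals(downedNode)) {
                        marked++; // in Solr this would be a state update persisted to ZooKeeper
                    }
                }
            }
            return marked;
        }
    }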

Re: Change suggestion: more efficient replica state tracking

2023-09-26 Thread Justin Sweeney
Hey Ilan, curious whether you have tried PRS in your implementation at this point, and if so, what your experience has been? I believe PRS currently publishes DOWNNODE messages to the overseer, but they are essentially a no-op for the overseer, so they have very little impact. We are runni

Change suggestion: more efficient replica state tracking

2023-09-26 Thread Ilan Ginzburg
*TL;DR: a way to track replica state using EPHEMERAL nodes that disappear automatically when a node goes down.* Hi, When running a cluster with many collections and replicas per node, processing of DOWNNODE messages takes more time. In a public cloud setup, the node that went down can come back q
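A minimal sketch of the EPHEMERAL-node idea using the plain ZooKeeper client; the path layout is an assumption for illustration, not a proposed Solr API, and the parent znodes are assumed to already exist as persistent nodes:

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    // Sketch: each replica registers an EPHEMERAL znode tied to its node's
    // ZooKeeper session, so the "up" marker disappears automatically when
    // the node dies; no DOWNNODE processing is needed to mark it down.
    class EphemeralReplicaMarkerSketch {
        static void registerUp(ZooKeeper zk, String collection, String replicaName) throws Exception {
            String path = "/replica_states_up/" + collection + "/" + replicaName; // hypothetical layout
            zk.create(path, new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
        }
    }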