Hi All,

As suggested by the group, I tried the API call /solr/admin/zookeeper/status to get the zk status. When I call it from my browser, I sometimes get status 0 along with the zk ensemble details. A while later the same call returns status: 500 with this error:

msg: "java.net.SocketException: Connection reset"
trace: java.io.UncheckedIOException: java.net.SocketException: Connection reset

Can I ignore the socket exception? If I try again immediately afterwards, the status is OK with no errors. Kindly advise.

Also, in the Solr admin UI I can see the values below for all the zookeepers. Is this normal? What is zk_node_count?

zk_node_count 1852
zk_approximate_data_size 7853679

*Thanks,*
*Reej*

On Thu, Jan 27, 2022 at 4:22 PM Reej Nayagam <reej...@gmail.com> wrote:

> Hi Vinay,
>
> We are connecting using CloudSolrClient, passing the zk host, so if zk is
> down, the connection to solr also won't happen.
>
> *Thanks,*
> *Reej*
>
>
> On Thu, Jan 27, 2022 at 12:35 PM Vinay Rajput <vinayrajput4...@gmail.com>
> wrote:
>
>> It also looks like from your requirement that you want to disable solr
>> search and activate DB search in case of zookeeper cluster failure.
>>
>> That is NOT needed. Solr search is not impacted when the zk cluster is
>> down, only indexing is impacted. We have had a situation when all our zk
>> nodes were down for a few minutes and still there was no impact on search.
>>
>> Thanks,
>> Vinay
>>
>> On Wed, 26 Jan 2022 at 9:12 PM, Walter Underwood <wun...@wunderwood.org>
>> wrote:
>>
>> > You can check the status of each Zookeeper node with the “ruok” command.
>> > This is one of the “four letter words” admin commands.
>> >
>> > https://zookeeper.apache.org/doc/r3.4.8/zookeeperAdmin.html#sc_zkCommands
>> >
>> > This is how it works from a command line.
>> >
>> > $ echo ruok | nc zoo-shared-1.test.search.cheggnet.com 2181
>> > imok
>> >
>> > wunder
>> > Walter Underwood
>> > wun...@wunderwood.org
>> > http://observer.wunderwood.org/ (my blog)
>> >
>> > > On Jan 26, 2022, at 5:53 AM, Reej Nayagam <reej...@gmail.com> wrote:
>> > >
>> > > The scenario is that the solr servers are up but the majority of the
>> > > zk nodes are down, so we need to tell that the issue is with the
>> > > zookeeper. I don’t find a way to identify the zookeeper status
>> > > without waiting for the timeout to happen after 30 seconds.
>> > >
>> > > On Wed, 26 Jan 2022 at 9:39 PM, matthew sporleder <msporle...@gmail.com>
>> > > wrote:
>> > >
>> > >> I don't understand your approach --
>> > >>
>> > >> For checking solr health I would probably use the ping endpoint or a
>> > >> very fast query with a low timeout (q=*:*&timeAllowed=100&rows=0).
>> > >>
>> > >> IIRC zookeeper health (as seen by solr) is in the CLUSTERSTATUS admin
>> > >> api command? It's somewhere near there if not in CLUSTERSTATUS.
>> > >>
>> > >> For interacting with zookeeper itself I would probably just use zk
>> > >> clients directly.
>> > >>
>> > >>
>> > >>
>> > >> On Wed, Jan 26, 2022 at 7:41 AM Reej Nayagam <reej...@gmail.com> wrote:
>> > >>>
>> > >>> Hi All,
>> > >>>
>> > >>> I need to handle zk failure, so we are monitoring the zk ensemble,
>> > >>> and if the majority of the zk nodes fail we'll activate the HA to
>> > >>> point to a DB search.
>> > >>>
>> > >>> To check whether each zk is alive, we are connecting as below:
>> > >>>
>> > >>> SolrZkClient zkClient = new SolrZkClient(zkAddress, 10000);
>> > >>> return zkClient.getSolrZooKeeper().getState().isAlive();
>> > >>>
>> > >>> But I noticed it still takes the default 30,000 ms timeout instead
>> > >>> of the 10,000 ms passed in.
>> > >>>
>> > >>> Is there a way we can override the zookeeper timeout? We have 3 zk
>> > >>> nodes, and if all 3 are down we need to wait 30 seconds for each to
>> > >>> get its status.
>> > >>>
>> > >>> Kindly advise if any of you have handled this. Thank you!
>> > >>>
>> > >>> *Thanks,*
>> > >>> *Reej*
>> > >>
>> > > --
>> > > *Thanks,*
>> > > *Reej*
>> >
>> >
>>
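
For reference, the “ruok” check quoted above can also be sent from Java over a plain socket with an explicit timeout, which avoids waiting for the 30-second client timeout on each node. The following is only a minimal sketch under some assumptions: the class and method names (ZkRuok, isOk, majorityResponding) are hypothetical and not part of Solr or ZooKeeper, the nodes are given as "host:port" strings, and on ZooKeeper 3.5.3 and later the ruok command has to be allowed through 4lw.commands.whitelist. Note that “imok” only says the server process is running; it does not prove the node has joined the quorum.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper, not a Solr or ZooKeeper class.
final class ZkRuok {

    // Sends "ruok" to one node; true only if it answers "imok".
    // timeoutMs bounds the connect and each read separately.
    static boolean isOk(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);

            OutputStream out = socket.getOutputStream();
            out.write("ruok".getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // Read up to 4 bytes; stop early if the server closes the stream.
            InputStream in = socket.getInputStream();
            byte[] reply = new byte[4];
            int total = 0;
            while (total < reply.length) {
                int n = in.read(reply, total, reply.length - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
            return total == 4
                    && "imok".equals(new String(reply, StandardCharsets.US_ASCII));
        } catch (IOException e) {
            // Refused, timed out, reset, or unknown host: treat as not OK.
            return false;
        }
    }

    // True if more than half of the given "host:port" nodes answer "imok".
    // Nodes are checked one after another, so the worst case grows with
    // the number of nodes times the timeout.
    static boolean majorityResponding(List<String> hostPorts, int timeoutMs) {
        int alive = 0;
        for (String hostPort : hostPorts) {
            int colon = hostPort.lastIndexOf(':');
            String host = hostPort.substring(0, colon);
            int port = Integer.parseInt(hostPort.substring(colon + 1));
            if (isOk(host, port, timeoutMs)) {
                alive++;
            }
        }
        return alive > hostPorts.size() / 2;
    }
}
```

With three nodes this would be called as ZkRuok.majorityResponding(Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181"), 2000), where the host names are placeholders. A failed check here is treated the same way as the occasional "Connection reset" from the status API: one failure alone may be transient, so retrying once before switching over is probably reasonable.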