Hi Matteo,

FYI, the images have been removed from your email. The mailing list ate them. You'll need to give us text, not images.
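If it helps, here is a minimal sketch of one way to capture the responses as plain text that can be pasted into a reply. The helper names `build_replica_query` and `fetch_as_text` are hypothetical, port 8080 and the `shards` and `debug` parameters are taken from the URLs in your message, and the 10-second timeout is an assumption to adjust as needed.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_replica_query(front_node, collection, query, replica_url, port=8080):
    """Build a /select URL that asks front_node to query one specific replica.

    Hypothetical helper: the port and path layout mirror the URLs quoted
    in the original message and may differ on other installations.
    """
    params = urllib.parse.urlencode(
        {"q": query, "debug": "true", "shards": replica_url}
    )
    return f"http://{front_node}:{port}/solr/{collection}/select?{params}"


def fetch_as_text(url, timeout=10.0):
    """Return the response, or a description of the failure, as plain text."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return f"HTTP {resp.status}\n{body}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return f"HTTP {exc.code} {exc.reason}"
    except OSError as exc:
        # Covers URLError, socket timeouts and connection resets.
        return f"FAILED (timeout was {timeout}s): {exc!r}"
```

Calling `fetch_as_text(build_replica_query(...))` once with `solrquery-01` and once with `solrnode-01` as `front_node`, with the same replica URL, would give two text outputs that can be compared side by side.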
On Thu, 1 Sep 2022 at 16:35, Matteo Diarena <[email protected]> wrote:

> Dear all,
>
> I’m experiencing a strange behaviour with a SolrCloud cluster.
>
> *Cluster description*
>
> I have a cluster with a total of 38 nodes. All nodes are installed with
> the following features:
>
> - *OS*: Debian GNU/Linux 9.13 (stretch)
> - JRE: openjdk version "11.0.6" 2020-01-14
> - Apache Solr: Apache Solr 8.11.2
>
> The cluster nodes are divided as follows:
>
> *Nodes used for indexing*
> solrindex-01
> solrindex-02
>
> *Nodes used for queries*
> solrquery-01
> solrquery-02
>
> *Cluster nodes with collections*
> solrnode-01
> …
> solrnode-34
>
> *Configuration of the collection*
>
> In the cluster I have a collection (i.e testcollection) divided on the
> various nodes through different shards (one shard for each month, i.e.
> shard_202201, shard_202202, ...)
>
> *Problem*
>
> From time to time the solrquery-01 node is no longer able to query the
> entire collection and in particular it is unable to contact some replicas
> of the collection present on the other nodes of the cluster. The problem
> does not resolve itself but it is necessary to restart the Apache Solr
> service on the solrquery-01 node.
>
> In particular:
>
> If I try to query a specific replica from the solrquery-01 node, the
> request remains pending until it times out
>
> Query
>
> http://solrquery-01:8080/solr/volocomapi_search/select?q=UniqueReference:DOC_EBF3D4C11F1239852490280F583D052FC214A10D6E716BD98C19CBC599E5EFED&debug=true&shards=http://solrnode-24.volo.local:8080/solr/volocomapi_search_shard_201501_replica_n575/
>
> Response
>
> By executing the same query from another node (eg: solrnode-01) the query
> is successful.
>
> Query
>
> http://solrnode-01:8080/solr/volocomapi_search/select?q=UniqueReference:DOC_EBF3D4C11F1239852490280F583D052FC214A10D6E716BD98C19CBC599E5EFED&debug=true&shards=http://solrnode-24.volo.local:8080/solr/volocomapi_search_shard_201501_replica_n575/
>
> Response:
>
> The same happens if I try to run the query to a different replica
>
> Query
>
> http://solrquery-01:8080/solr/volocomapi_search/select?q=UniqueReference:DOC_EBF3D4C11F1239852490280F583D052FC214A10D6E716BD98C19CBC599E5EFED&debug=true&shards=http://solrnode-23.volo.local:8080/solr/volocomapi_search_shard_201501_replica_n573/
>
> Response
>
> Checking the network traffic with tcpdump on the solrquery-01 machine does
> not show any connection as it does on the solrnode-01 machine
>
> *tcpdump from the solrquery-01 machine*
>
> *tcpdump on the solrnode-01 machine*
>
> *Question*
>
> Do you have any suggestions on how to investigate this issue further?
> Suggestions on possible solutions?
>
> Thank you in advance,
>
> Matteo

--
Vincenzo D'Amore
