[ 
https://issues.apache.org/jira/browse/SOLR-16622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-16622:
----------------------------------------
    Description: 
While benchmarking for performance, we saw a sharp change in the graphs:
https://issues.apache.org/jira/browse/SOLR-16525?focusedCommentId=17676725&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17676725

It turns out that a commit (SOLR-16414) escaped all testing and caused a 
regression: after a node restart, its replicas don't come up as active.

This affects the 9.1 release, so I'm opening a new JIRA issue to track it.


Here's how to reproduce it:

{code}
git clone https://github.com/fullstorydev/solr-bench
cd solr-bench

# prerequisites on ubuntu:
sudo apt install openjdk-11-jdk
sudo apt install wget unzip zip ant ivy lsof git netcat make maven jq

# this is a patch to comment out the cleanup/final shutdown
wget https://termbin.com/yuu95
git apply yuu95

mvn clean compile assembly:single
./cleanup.sh && ./stress.sh -c aa4f3d98ab19c201e7f3c74cd14c99174148616d \
  suites/stress-facets-local.json
{code}

If the 95th percentile is <10 or so, we have a problem; it should be >300 or 
so. Since we disabled cleanup, we can hit http://localhost:50000/solr/ to open 
the Solr UI. In this case, I see that querying the ecommerce-events collection 
shows shard2 is down.
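
As an alternative to clicking through the UI, the replica states can probably 
be read from the Collections API (action=CLUSTERSTATUS). The following is only 
a rough sketch: it assumes the benchmark's Solr is still listening on 
localhost:50000 and that jq is installed (it is in the prerequisites above). 
The function name replica_states is made up for illustration.

```shell
# Print "shard<TAB>node<TAB>state" for every replica of one collection.
# Reads a CLUSTERSTATUS JSON response on stdin; $1 is the collection name.
replica_states() {
  jq -r --arg c "$1" '
    .cluster.collections[$c].shards
    | to_entries[]
    | .key as $shard
    | .value.replicas[]
    | "\($shard)\t\(.node_name)\t\(.state)"'
}

# Usage (assumes Solr is reachable on localhost:50000):
#   curl -s "http://localhost:50000/solr/admin/collections?action=CLUSTERSTATUS&collection=ecommerce-events" \
#     | replica_states ecommerce-events
```

Any replica whose state is not "active" after the restart would point to the 
regression described here.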


  was:
While benchmarking for performance, we saw a sharp change in the graphs:
https://issues.apache.org/jira/browse/SOLR-16525?focusedCommentId=17676725&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17676725

It turns out that a commit (SOLR-16414) escaped all testing and caused a 
regression: after a node restart, its replicas don't come up as active.

This affects the 9.1 release, so I'm opening a new JIRA issue to track it.


> Replicas don't come up active after node restart
> ------------------------------------------------
>
>                 Key: SOLR-16622
>                 URL: https://issues.apache.org/jira/browse/SOLR-16622
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Ishan Chattopadhyaya
>            Priority: Major
>             Fix For: 9.1.1
>
>         Attachments: Screenshot from 2023-01-17 15-03-05.png
>
>
> While benchmarking for performance, we saw a sharp change in the graphs:
> https://issues.apache.org/jira/browse/SOLR-16525?focusedCommentId=17676725&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17676725
> It turns out that a commit (SOLR-16414) escaped all testing and caused a 
> regression: after a node restart, its replicas don't come up as active.
> This affects the 9.1 release, so I'm opening a new JIRA issue to track it.
> Here's how to reproduce it:
> {code}
> git clone https://github.com/fullstorydev/solr-bench
> cd solr-bench
> # prerequisites on ubuntu:
> sudo apt install openjdk-11-jdk
> sudo apt install wget unzip zip ant ivy lsof git netcat make maven jq
> # this is a patch to comment out the cleanup/final shutdown
> wget https://termbin.com/yuu95
> git apply yuu95
> mvn clean compile assembly:single
> ./cleanup.sh && ./stress.sh -c aa4f3d98ab19c201e7f3c74cd14c99174148616d \
>   suites/stress-facets-local.json
> {code}
> If the 95th percentile is <10 or so, we have a problem; it should be >300 or 
> so. Since we disabled cleanup, we can hit http://localhost:50000/solr/ to 
> open the Solr UI. In this case, I see that querying the ecommerce-events 
> collection shows shard2 is down.



