Hi Jan,

> Can you explain what "restart performance" means, and what the y axis
> number is?

Yes, that refers to the time it took to restart 7 nodes, 2 at a time, while
waiting for all replicas on those nodes to be active before proceeding to
the next batch (of 2).
https://github.com/fullstorydev/solr-bench/blob/ishan/repeatable-jenkins/suites/cluster-test.json#L33-L47
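
To make the batching concrete, here is a rough Python sketch of that kind of rolling restart. It is not the solr-bench implementation; `restart` and `wait_until_active` are hypothetical callables standing in for whatever restarts a node and blocks until its replicas are active.

```python
def rolling_restart(nodes, restart, wait_until_active, batch_size=2):
    """Restart nodes in batches of batch_size.

    restart and wait_until_active are hypothetical callables supplied by
    the caller; they are assumptions, not part of any Solr or solr-bench API.
    Returns the list of batches in the order they were processed.
    """
    batches = []
    for i in range(0, len(nodes), batch_size):
        batch = list(nodes[i:i + batch_size])
        # Kick off the restart for every node in the current batch.
        for node in batch:
            restart(node)
        # Block until the replicas on each node of the batch are active
        # before moving on to the next batch.
        for node in batch:
            wait_until_active(node)
        batches.append(batch)
    return batches
```

With 7 nodes and a batch size of 2, this gives three batches of 2 and a final batch of 1.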

The number on the y-axis is the total time it took for the entire operation
(of restarting those 7 nodes).
https://github.com/fullstorydev/solr-bench/blob/ishan/repeatable-jenkins/createGraph.py#L25-L31
Here's a sample results file from a run:
{task1=[{start-time=0.175, total-time=238.334, end-time=238.509}],
task2=[{start-time=238.523, total-time=109.801, end-time=348.324},
{start-time=238.523, total-time=113.048, end-time=351.571},
{start-time=348.324, total-time=122.317, end-time=470.641},
{start-time=351.572, total-time=130.111, end-time=481.683},
{start-time=470.642, total-time=106.068, end-time=576.71},
{start-time=481.683, total-time=107.827, end-time=589.51},
{start-time=576.711, total-time=89.179, end-time=665.89}]}
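
As a rough illustration of how a single number can be derived from those entries, the sketch below takes the wall-clock span of the task2 entries (latest end-time minus earliest start-time). Two assumptions here: that task2 is the restart task, and that the span is the quantity of interest. createGraph.py may compute the plotted value differently.

```python
# task2 entries copied from the sample results above.
task2 = [
    {"start-time": 238.523, "total-time": 109.801, "end-time": 348.324},
    {"start-time": 238.523, "total-time": 113.048, "end-time": 351.571},
    {"start-time": 348.324, "total-time": 122.317, "end-time": 470.641},
    {"start-time": 351.572, "total-time": 130.111, "end-time": 481.683},
    {"start-time": 470.642, "total-time": 106.068, "end-time": 576.71},
    {"start-time": 481.683, "total-time": 107.827, "end-time": 589.51},
    {"start-time": 576.711, "total-time": 89.179, "end-time": 665.89},
]


def wall_clock_span(entries):
    """Latest end-time minus earliest start-time across all entries."""
    first_start = min(e["start-time"] for e in entries)
    last_end = max(e["end-time"] for e in entries)
    return last_end - first_start


print(round(wall_clock_span(task2), 3))  # 427.367
```

Note that each entry's total-time is simply its end-time minus its start-time, so summing the total-time values would overcount whenever two restarts overlap.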

Thanks,
Ishan

On Wed, Nov 9, 2022 at 6:27 PM Jan Høydahl <[email protected]> wrote:

> Thanks for putting this together Ishan,
>
> Can you explain what "restart performance" means, and what the y axis
> number is?
>
> Jan
>
> > On 9 Nov 2022, at 07:39, Ishan Chattopadhyaya <[email protected]> wrote:
> >
> > I'm working on automating performance testing, details in
> > https://issues.apache.org/jira/browse/SOLR-16525.
> >
> > Even before I could complete the automation, I observed a massive slowdown
> > in restart performance, now attributable to
> > https://issues.apache.org/jira/browse/SOLR-16414. This affected 9.1 release
> > candidate RC1, but is now fixed in 9.1 and 9x branches.
> >
> > However, while performance was back to original levels on 9.1 branch, there
> > was an 80-100% slowdown on the 9x branch even after this fix.
> > Please see: http://mostly.cool/cluster-test.json.html
> > The test is here:
> > https://github.com/fullstorydev/solr-bench/blob/ishan/repeatable-jenkins/suites/cluster-test.json
> >
> > In order to investigate the slowdown, I retroactively applied the patch
> > that fixed the performance problem in SOLR-16414 (removing use of
> > parallelStream) to the intermediate commits and plotted the graph:
> > http://mostly.cool/cluster-test-with-patch.html
> >
> > And now, two more commits with potential slowdowns are observed. Here are
> > the JIRA issues I've opened for both:
> > https://issues.apache.org/jira/browse/SOLR-16530
> > https://issues.apache.org/jira/browse/SOLR-16531
> >
> > In a week of working on this automation, I was able to catch 3 slowdowns on
> > the first thing I automated. It might be good to keep this running and test
> > other aspects. Going forward, I'll be automating more performance suites
> > and opening blocker JIRA issues on significant performance degradation,
> > whenever observed. I'll make it easy for all of us to add suites to the
> > framework and have their personal branches/PRs tested through this.
> >
> > Please let me know about any thoughts / concerns / suggestions.
> >
> > Thanks,
> > Ishan
