[ 
https://issues.apache.org/jira/browse/LUCENE-9817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301365#comment-17301365
 ] 

Robert Muir commented on LUCENE-9817:
-------------------------------------

I updated the patch, but there is more work to do, based on testing with a Mac:
* Running without tmpfs uncovers more issues. I will attack the worst of
those, since most people aren't using tmpfs or can't easily use it (e.g. on
Macs).
* Test load balancing is really bad on machines with lots of cores.
TestBackwardsCompatibility takes 93s on a fast Mac, and Lucene 9.0 isn't even
released yet!

I just need to try to address these two items, because on the Mac the run is
unnecessarily slow:
{noformat}
The slowest tests (exceeding 500 ms) during this run:
  16.83s TestBackwardsCompatibility.testCommandLineArgs (:lucene:backward-codecs)
  14.25s TestBestCompressionLucene90DocValuesFormat.testSparseDocValuesVsStoredFields (:lucene:core)
  10.24s TestBackwardsCompatibility.testUnsupportedOldIndexes (:lucene:backward-codecs)
  10.20s TestBackwardsCompatibility.testSearchOldIndex (:lucene:backward-codecs)
   9.82s TestLucene70DocValuesFormat.testSparseDocValuesVsStoredFields (:lucene:backward-codecs)
   9.22s TestBestSpeedLucene90DocValuesFormat.testSparseDocValuesVsStoredFields (:lucene:core)
   8.95s TestOfflineSorter.testSmallRandom (:lucene:core)
   8.53s TestBackwardsCompatibility.testIndexOldIndex (:lucene:backward-codecs)
   7.17s TestOfflineSorter.testIntermediateMerges (:lucene:core)
   7.08s TestAtomicUpdate.testAtomicUpdates (:lucene:core)
The slowest suites (exceeding 1s) during this run:
  93.25s TestBackwardsCompatibility (:lucene:backward-codecs)
  49.74s TestBestCompressionLucene90DocValuesFormat (:lucene:core)
  39.59s TestLucene70DocValuesFormat (:lucene:backward-codecs)
  37.45s TestBestCompressionLucene80DocValuesFormat (:lucene:backward-codecs)
  36.32s TestBestSpeedLucene90DocValuesFormat (:lucene:core)
  35.97s TestBestSpeedLucene80DocValuesFormat (:lucene:backward-codecs)
  29.34s TestAssertingDocValuesFormat (:lucene:test-framework)
  29.26s TestPerFieldDocValuesFormat (:lucene:core)
  26.43s TestBoolean2 (:lucene:core)
  22.86s TestSimpleTextDocValuesFormat (:lucene:codecs)
{noformat}

> pathological test fixes
> -----------------------
>
>                 Key: LUCENE-9817
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9817
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Priority: Major
>         Attachments: LUCENE-9817.patch, LUCENE-9817.patch, LUCENE-9817.patch
>
>
> There are now 13,000+ tests in Lucene, and if you don't have dozens of cores
> the situation is slow (around 7 minutes here, with everything tuned as fast
> as I can get it, running on tmpfs).
> It is tricky to keep the situation sustainable: so many tests usually take
> just a few seconds each, but they all add up. To put it in perspective,
> imagine if all 13,000 tests took only 1s each: that's about 3.6 hours of CPU
> time.
> From my inspection, there are a few cases of inefficiency:
> * tests with bad random parameters: they might normally be semi-well-behaved, 
> but "rarely" take 30 seconds. That's maybe like a 1% chance but keep in mind 
> 1% equates to 130 wild-west tests every run.
> * tests spinning up too many threads and indexing too many docs 
> unnecessarily: there might literally be thousands of these, so that's a hard 
> problem to fix... and developers love to use lots of threads and docs in 
> tests.
> * tests just being inefficient: stuff like creating indexes in setup/teardown 
> when they have many methods that may not even use them (hey, why did 
> testEqualsHashcode take 30 seconds, what is it doing?)
> I only worked on the first case here; if I fixed anything involving the other
> two, it was just because I noticed them while I was there. I temporarily
> overrode methods like LuceneTestCase.rarely(), atLeast(), and so on to
> present more pathological/worst-case conditions and tried to address them all.
> So here's a patch to give ~80 seconds of CPU time in tests back. YMMV, maybe
> it helps you more if you are actually using hard disks and stuff!
> Fixing the other issues here will require some more creativity/work; I will
> follow up.
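
To make the numbers in the description concrete, here is a minimal sketch of the arithmetic and of what "bad random parameters" can look like. It is not taken from the patch: the class TestBudgetSketch and all of its methods are hypothetical, it uses plain java.util.Random rather than LuceneTestCase, and the 1% figure and document counts are only illustrative. 13,000 tests at 1s each is 13,000 CPU-seconds, roughly 3.6 hours, and a 1% slow path is expected to fire about 130 times per run, so the rare branch needs a hard cap.

```java
import java.util.Random;

/** Hypothetical helper, not part of Lucene: rough numbers for a large test suite. */
final class TestBudgetSketch {

  /** Total CPU time in hours if every test took the given number of seconds. */
  static double cpuHours(int testCount, double secondsPerTest) {
    return testCount * secondsPerTest / 3600.0;
  }

  /** Expected number of tests per run that take a slow path with the given probability. */
  static double expectedSlowTests(int testCount, double slowProbability) {
    return testCount * slowProbability;
  }

  /**
   * Picks a document count for a randomized test: usually typicalDocs, and with
   * roughly 1% probability 100x larger, but never above maxDocs. The cap is what
   * keeps the rare branch from turning into a 30-second test.
   */
  static int boundedDocCount(Random random, int typicalDocs, int maxDocs) {
    boolean rare = random.nextInt(100) == 0; // stand-in for a "rarely"-style check
    int docs = rare ? typicalDocs * 100 : typicalDocs;
    return Math.min(docs, maxDocs);
  }

  public static void main(String[] args) {
    System.out.println("cpu hours: " + cpuHours(13_000, 1.0));
    System.out.println("expected slow tests: " + expectedSlowTests(13_000, 0.01));

    // Simulate one run of 13,000 tests and report the largest size any test picked.
    Random random = new Random(42L);
    int worst = 0;
    for (int i = 0; i < 13_000; i++) {
      worst = Math.max(worst, boundedDocCount(random, 50, 500));
    }
    System.out.println("largest doc count seen: " + worst);
  }
}
```

The design choice in this sketch is to clamp the rare case rather than remove it, so the randomized coverage stays while the worst-case runtime is bounded.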



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
