[ 
https://issues.apache.org/jira/browse/LUCENE-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025307#comment-17025307
 ] 

ASF subversion and git services commented on LUCENE-9189:
---------------------------------------------------------

Commit 4773574578f089802fe3f36bff6951c4a29a3628 in lucene-solr's branch 
refs/heads/master from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4773574 ]

LUCENE-9189: TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes

The issue is that MockDirectoryWrapper's disk full check is horribly
inefficient. On every writeByte/etc, it totally recomputes disk space
across all files. This means it calls listAll() on the underlying
Directory (which sorts all the underlying files), then sums up fileLength()
for each of those files.

This leads to many pathological cases in the disk full tests... but the
number of tests impacted by this is minimal, and the logic is scary.
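
To make the cost described above concrete, here is a minimal sketch in plain Java. It is not Lucene code and not necessarily what this commit changes: DiskUsageSketch and all of its methods are hypothetical names used only for illustration. It contrasts recomputing the total size from a sorted listing on every call with keeping a running total that is updated as bytes are written.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a directory wrapper; none of these names are Lucene APIs.
class DiskUsageSketch {
    private final Map<String, Long> fileLengths = new HashMap<>();
    private long trackedBytes = 0; // running total, maintained on every write/delete

    // Pattern described in the commit message: list (and sort) every file name,
    // then sum the lengths. The cost of each call grows with the number of files.
    long recomputeSize() {
        String[] names = fileLengths.keySet().stream().sorted().toArray(String[]::new);
        long total = 0;
        for (String name : names) {
            total += fileLengths.get(name);
        }
        return total;
    }

    // Assumed alternative: return the running total in constant time.
    long trackedSize() {
        return trackedBytes;
    }

    // Record numBytes appended to the named file.
    void write(String name, long numBytes) {
        fileLengths.merge(name, numBytes, Long::sum);
        trackedBytes += numBytes;
    }

    // Forget a file and subtract its length from the running total.
    void delete(String name) {
        Long removed = fileLengths.remove(name);
        if (removed != null) {
            trackedBytes -= removed;
        }
    }

    public static void main(String[] args) {
        DiskUsageSketch dir = new DiskUsageSketch();
        for (int i = 0; i < 1000; i++) {
            dir.write("_" + i + ".tmp", 16);
        }
        // Both approaches agree on the answer; only the cost per call differs.
        System.out.println(dir.recomputeSize() + " " + dir.trackedSize());
    }
}
```

The sketch is single-threaded. A real directory wrapper would also need to handle concurrent writers and files changed behind its back, which may be part of why the existing logic is hard to touch.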


> TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes
> ---------------------------------------------------------------
>
>                 Key: LUCENE-9189
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9189
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Robert Muir
>            Priority: Major
>
> I thought it was just testUpdatesOnDiskFull, but it looks like this one 
> needs to be nightly too.
> I should look more into the test, but I know something causes it to create 
> such an insane number of files that sorting them becomes a bottleneck.
> Related: it would be great if MockDirectoryWrapper's disk full check didn't 
> trigger a sort of the files (via listAll). It does this check on nearly 
> every I/O, and it would be nice for it to be less absurd. Maybe the test 
> could check for disk full not on every I/O but on some random sample of 
> them?
> Temporarily, let's make it nightly...
> {noformat}
> PROFILE SUMMARY from 182501 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT  SAMPLES  STACK
> 15.89%   28995    java.lang.StringLatin1#compareTo()
> 6.61%    12069    java.util.TimSort#mergeHi()
> 5.96%    10878    java.util.TimSort#binarySort()
> 3.41%    6231     java.util.concurrent.ConcurrentHashMap#tabAt()
> 2.98%    5433     java.util.Comparators$NaturalOrderComparator#compare()
> 2.12%    3876     org.apache.lucene.store.DataOutput#copyBytes()
> 2.03%    3712     java.lang.String#compareTo()
> 1.84%    3350     java.util.concurrent.ConcurrentHashMap#get()
> 1.83%    3337     java.util.TimSort#mergeLo()
> 1.67%    3047     java.util.ArrayList#add()
> {noformat}
> All the file sorting is called from stacks like this, so it's literally 
> happening on every writeByte() and so on:
> {noformat}
> 0.73% 1329    java.util.TimSort#binarySort()
>                         at java.util.TimSort#sort()
>                         at java.util.Arrays#sort()
>                         at java.util.ArrayList#sort()
>                         at java.util.stream.SortedOps$RefSortingSink#end()
>                         at java.util.stream.AbstractPipeline#copyInto()
>                         at java.util.stream.AbstractPipeline#wrapAndCopyInto()
>                         at java.util.stream.AbstractPipeline#evaluate()
>                         at java.util.stream.AbstractPipeline#evaluateToArrayNode()
>                         at java.util.stream.ReferencePipeline#toArray()
>                         at org.apache.lucene.store.ByteBuffersDirectory#listAll()
>                         at org.apache.lucene.store.MockDirectoryWrapper#sizeInBytes()
>                         at org.apache.lucene.store.MockIndexOutputWrapper#checkDiskFull()
>                         at org.apache.lucene.store.MockIndexOutputWrapper#writeBytes()
>                         at org.apache.lucene.store.MockIndexOutputWrapper#writeByte()
>                         at org.apache.lucene.store.DataOutput#writeInt()
>                         at org.apache.lucene.codecs.CodecUtil#writeFooter()
>                         at org.apache.lucene.codecs.lucene50.Lucene50LiveDocsFormat#writeLiveDocs()
>                         at org.apache.lucene.codecs.asserting.AssertingLiveDocsFormat#writeLiveDocs()
>                         at org.apache.lucene.index.PendingDeletes#writeLiveDocs()
> {noformat}
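
The sampling idea floated in the issue description (check for disk full on a random sample of I/O calls rather than on every one) could look roughly like the sketch below. This is an illustration under stated assumptions, not the behavior of MockDirectoryWrapper or MockIndexOutputWrapper: SampledDiskFullCheck, maybeCheckDiskFull, sampleInterval and the LongSupplier standing in for the expensive size computation are all hypothetical.

```java
import java.util.Random;
import java.util.function.LongSupplier;

// Hypothetical helper; not part of Lucene's test framework.
class SampledDiskFullCheck {
    private final LongSupplier expensiveSizeInBytes; // stands in for a full size recomputation
    private final long maxSizeInBytes;               // simulated disk capacity
    private final int sampleInterval;                // on average, check once per this many calls
    private final Random random;
    int checksPerformed = 0;                         // exposed so the sampling rate can be observed

    SampledDiskFullCheck(LongSupplier expensiveSizeInBytes, long maxSizeInBytes,
                         int sampleInterval, long seed) {
        if (sampleInterval < 1) {
            throw new IllegalArgumentException("sampleInterval must be >= 1");
        }
        this.expensiveSizeInBytes = expensiveSizeInBytes;
        this.maxSizeInBytes = maxSizeInBytes;
        this.sampleInterval = sampleInterval;
        this.random = new Random(seed);
    }

    // Call on every write. Returns true only when a sampled check ran and found
    // the limit reached; skipped calls return false without computing the size.
    boolean maybeCheckDiskFull() {
        if (random.nextInt(sampleInterval) != 0) {
            return false; // not sampled this time
        }
        checksPerformed++;
        return expensiveSizeInBytes.getAsLong() >= maxSizeInBytes;
    }

    public static void main(String[] args) {
        SampledDiskFullCheck check = new SampledDiskFullCheck(() -> 10L, 50L, 1000, 42L);
        for (int i = 0; i < 10_000; i++) {
            check.maybeCheckDiskFull();
        }
        System.out.println("expensive checks run: " + check.checksPerformed);
    }
}
```

The trade-off is that a simulated disk-full condition is detected late: writes between two sampled checks can overshoot the limit, so a test relying on an exact cutoff would need a different approach.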



