There are, as you would expect, a lot of factors that affect how much fragmentation occurs: commit rate, mergeFactor, the ratio of updates/deletes to 'new' data, etc.
Having run reasonably large indexes on NTFS (>25GB), we've not found fragmentation to be much of a hindrance. I don't have any definitive benchmark numbers, sorry, but as an index grows to large sizes, other factors overshadow any fragmentation hit, e.g. sharding, replication, cache warming, etc.

If you're really worried that fragmentation is affecting performance, you can move to SSDs, which are largely unaffected by fragmentation because they have no seek penalty (in fact, they should not be defragmented, since that only adds write wear), and of course they absolutely fly.

Peter

On Wed, Dec 8, 2010 at 5:59 PM, Will Milspec <will.mils...@gmail.com> wrote:
> Hi all,
>
> Pardon if this isn't the best place to post this email...maybe it belongs on
> the lucene-user list. Also, it's basically Windows-specific, so not of use
> to everyone...
>
> The question: does NTFS fragmentation affect search performance "a little
> bit" or "a lot"? It's obvious that "fragmentation will slow things down",
> but is it a factor of 0.1, 10, or 100 (i.e. what order of magnitude)?
>
> As a follow-up: should Solr/Lucene users periodically remind Windows
> sysadmins to defrag their drives?
>
> On a production system, I ran the Windows defrag "analyzer" and found heavy
> fragmentation on the Lucene index.
>
> Fragments  Size    File
> 11,839     492 MB  \data\index\search\_6io5.cfs
> 7,153      433 MB  \data\index\search\_5ld6.cfs
> 6,953      661 MB  \data\index\search\_8jvj.cfs
> 5,824      74 MB   \data\index\search\_5ld7.frq
> 5,691      356 MB  \data\index\search\_9eev.fdt
> 5,638      352 MB  \data\index\search\_8mqi.fdt
> 5,629      352 MB  \data\index\search\_8jvj.fdt
> 5,609      351 MB  \data\index\search\_88z8.fdt
> 5,590      355 MB  \data\index\search\_96l5.fdt
> 5,568      354 MB  \data\index\search\_8zjn.fdt
> 5,471      342 MB  \data\index\search\_5wgo.fdt
> 5,466      342 MB  \data\index\search\_5uo1.fdt
> 5,450      340 MB  \data\index\search\_5hrn.fdt
> 5,429      345 MB  \data\index\search\_6nyy.fdt
> 5,371      353 MB  \data\index\search\_8sob.fdt
>
> Incidentally, we periodically experience some *very* slow searches. Out of
> curiosity, I checked for file fragmentation (using the 'analyze' mode of the
> NTFS defragger).
>
> Nota bene: Windows Sysinternals has a utility, "Contig.exe", which allows you
> to defragment individual files or directories. We'll use that to defragment
> the index directories.
>
> will
>
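
One rough way to read the analyzer report in this thread is to convert each fragment count into an average contiguous run size per file. The following is a minimal sketch, assuming each row has the form "fragments, size in MB, path" as in the quoted report. The names `parse_report` and `avg_fragment_kb` are hypothetical helpers for illustration, not part of Lucene, Solr, or any Windows tool, and only three sample rows are included.

```python
import re

# Sample rows copied from the defrag analyzer report quoted in this thread:
# fragment count, file size, path.
REPORT = r"""
11,839 492 MB \data\index\search\_6io5.cfs
5,824 74 MB \data\index\search\_5ld7.frq
5,691 356 MB \data\index\search\_9eev.fdt
"""

# Assumed row layout: "<fragments> <size> MB <path>", with optional
# thousands separators in the numbers.
ROW = re.compile(r"^\s*([\d,]+)\s+([\d,]+)\s+MB\s+(\S+)\s*$")


def parse_report(text):
    """Return a list of (path, fragments, size_mb) tuples from analyzer rows."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            fragments = int(m.group(1).replace(",", ""))
            size_mb = int(m.group(2).replace(",", ""))
            rows.append((m.group(3), fragments, size_mb))
    return rows


def avg_fragment_kb(fragments, size_mb):
    """Average contiguous run length in KB, assuming 1 MB = 1024 KB."""
    return size_mb * 1024 / fragments


if __name__ == "__main__":
    # List the files with the smallest average run (most scattered) first.
    ranked = sorted(parse_report(REPORT), key=lambda r: avg_fragment_kb(r[1], r[2]))
    for path, fragments, size_mb in ranked:
        kb = avg_fragment_kb(fragments, size_mb)
        print(f"{path}: {fragments} fragments, ~{kb:.0f} KB per fragment")
```

On these sample rows, the 74 MB .frq file averages roughly 13 KB per fragment, the smallest of the three. This says nothing by itself about search latency; it only makes the raw counts easier to compare across files of different sizes.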