There are, as you would expect, a lot of factors that affect the
amount of fragmentation that occurs:
commit rate, mergeFactor, updates/deletes vs. 'new' data, etc.

Having run reasonably large indexes on NTFS (>25GB), we've not found
fragmentation to be much of a hindrance.
I don't have any definitive benchmark numbers, sorry, but as an index
grows to large sizes, other factors overshadow
any fragmentation hit - e.g. sharding, replication, cache warming etc.

If you're really worried that fragmentation is affecting performance,
you can move to SSDs, which don't pay a seek penalty for
fragmented files (in fact, they should not be defragmented), and of
course they absolutely fly.

Peter


On Wed, Dec 8, 2010 at 5:59 PM, Will Milspec <will.mils...@gmail.com> wrote:
> Hi all,
>
> Pardon if this isn't the best place to post this email...maybe it belongs on
> the lucene-user list. Also, it's basically Windows-specific, so not of use
> to everyone...
>
> The question: does NTFS fragmentation affect search performance "a little
> bit" or "a lot"? It's obvious that "fragmentation will slow things down",
> but is it a factor of .1, 10, or 100 (i.e. what order of magnitude)?
>
> As a follow-up: should Solr/Lucene users periodically remind Windows
> sysadmins to defrag their drives?
>
> On a production system, I ran the Windows defrag "analyzer" and found heavy
> fragmentation on the Lucene index (columns: fragments, file size, file):
>
> 11,839          492 MB          \data\index\search\_6io5.cfs
> 7,153           433 MB          \data\index\search\_5ld6.cfs
> 6,953           661 MB          \data\index\search\_8jvj.cfs
> 5,824           74 MB           \data\index\search\_5ld7.frq
> 5,691           356 MB          \data\index\search\_9eev.fdt
> 5,638           352 MB          \data\index\search\_8mqi.fdt
> 5,629           352 MB          \data\index\search\_8jvj.fdt
> 5,609           351 MB          \data\index\search\_88z8.fdt
> 5,590           355 MB          \data\index\search\_96l5.fdt
> 5,568           354 MB          \data\index\search\_8zjn.fdt
> 5,471           342 MB          \data\index\search\_5wgo.fdt
> 5,466           342 MB          \data\index\search\_5uo1.fdt
> 5,450           340 MB          \data\index\search\_5hrn.fdt
> 5,429           345 MB          \data\index\search\_6nyy.fdt
> 5,371           353 MB          \data\index\search\_8sob.fdt
>
> Incidentally, we periodically experience some *very* slow searches. Out of
> curiosity, I checked for file fragmentation (using the 'analyze' mode of the
> NTFS defragger).
>
> Nota bene: Windows Sysinternals has a utility, "Contig.exe", which allows you
> to defragment individual files/directories. We'll use that to defragment
> the index directories.
>
> will
>
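
To put the analyzer numbers quoted above on a scale, here is a rough
back-of-the-envelope sketch in Python. It takes a few rows from the listing and
works out the average fragment size, plus a crude upper bound on the seek time
needed to read a whole file end to end. The 10 ms seek time is purely an
assumption for a spinning disk, not a measurement from this system, and the
helper names are made up for illustration. It also assumes 1 MB = 1024 KB and
that every fragment costs one full seek, which is pessimistic.

```python
# Rows copied from the defrag analyzer listing quoted above:
# (fragments, size in MB, file name). Only the first few rows are included.
ROWS = [
    (11839, 492, "_6io5.cfs"),
    (7153, 433, "_5ld6.cfs"),
    (6953, 661, "_8jvj.cfs"),
    (5824, 74, "_5ld7.frq"),
    (5691, 356, "_9eev.fdt"),
]

# ASSUMPTION: a nominal seek time for a spinning disk, in milliseconds.
# This is an illustrative figure, not something measured on the system above.
ASSUMED_SEEK_MS = 10.0


def avg_fragment_kb(fragments, size_mb):
    """Average size of one fragment in KB (assuming 1 MB = 1024 KB)."""
    return size_mb * 1024 / fragments


def worst_case_seek_s(fragments, seek_ms=ASSUMED_SEEK_MS):
    """Crude upper bound: one full seek per fragment, in seconds."""
    return fragments * seek_ms / 1000.0


def summarize(rows):
    """Return (name, average fragment KB, worst-case seek seconds) per row."""
    return [
        (name, avg_fragment_kb(frags, mb), worst_case_seek_s(frags))
        for frags, mb, name in rows
    ]


if __name__ == "__main__":
    for name, kb, secs in summarize(ROWS):
        print(f"{name}: ~{kb:.0f} KB per fragment, "
              f"up to ~{secs:.0f} s of seeks for a full read")
```

Under those assumptions the most fragmented file averages roughly 43 KB per
fragment. A typical search touches only a small part of each file, so the
full-read figure is a ceiling, not an estimate of query latency; only timing
the same queries before and after defragmenting would settle the question.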
