[ 
https://issues.apache.org/jira/browse/LUCENE-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499323
 ] 

Michael Busch commented on LUCENE-893:
--------------------------------------


> how many documents were in index? 

 ~1.1 million

> I assume that where postings are longer, a bigger buffer brings
> more benefit, especially for queries on dense terms.

Yes and no. Of course bigger buffers don't speed things up for very
short posting lists. On long posting lists, on the other hand, more
skips are likely to be performed, so it is questionable whether
increasing the buffer size always helps there either.
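
To make the skip argument concrete, here is a rough sketch under
simplifying assumptions. It is a toy model, not Lucene's
BufferedIndexInput: it assumes every access outside the current buffer
triggers a refill starting at the requested position. The class name
BufferRefillSketch and its simulate method are hypothetical.

```java
// Toy model of a buffered reader; NOT Lucene's BufferedIndexInput.
// Assumption: an access outside the current buffer causes one refill
// that starts at the requested position.
final class BufferRefillSketch {

    /** Returns {refills, bytesFetched} for touching each position in order. */
    static long[] simulate(long[] positions, int bufferSize, long fileLength) {
        long bufStart = 0;
        long bufEnd = 0;      // buffer covers [bufStart, bufEnd); empty at first
        long refills = 0;
        long bytesFetched = 0;
        for (long pos : positions) {
            if (pos < bufStart || pos >= bufEnd) {
                // Miss: refill from pos, but never read past the end of the file.
                bufStart = pos;
                bufEnd = Math.min(pos + bufferSize, fileLength);
                refills++;
                bytesFetched += bufEnd - bufStart;
            }
        }
        return new long[] {refills, bytesFetched};
    }

    public static void main(String[] args) {
        // Made-up access pattern: four skips of 5000 bytes in a 20000-byte file.
        long[] skips = {0, 5000, 10000, 15000};
        for (int size : new int[] {1024, 4096}) {
            long[] r = simulate(skips, size, 20000);
            System.out.println(size + " byte buffer: " + r[0]
                    + " refills, " + r[1] + " bytes fetched");
        }
    }
}
```

In this model, once the skips are longer than the buffer, a larger
buffer needs just as many refills but fetches more bytes that are never
used. With purely sequential access the larger buffer needs fewer
refills instead.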

> I am bringing this up as optimizing for "general case" is 
> hard/impossible due to too many scenarios, influences. 

Yes, I definitely agree. I know that my tests are very specific
to my hardware, OS, queries, documents... Here 2 KB is the magic
number; on other systems it may be different. It would be good if
others could run some tests on other systems too.

> 1. Maybe we constrain things a bit, e.g. number of docs in index
> set to 10 million as a "golden standard" target for optimization

Having constraints is desirable, but it might discourage others
from running tests. It takes time to get 10M docs, build indexes
with different settings, and come up with good queries...

> 2. Making it configurable, so everybody could tweak it, you never 
> know what HW or file system one uses...

Yes, we could think about adding static getters/setters for 
readBufferSize and writeBufferSize.
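
A minimal sketch of what such static accessors could look like.
BufferSizeConfig is a hypothetical holder class, not an existing Lucene
API. The 2 KB read default mirrors the value reported above, and the
write default is an arbitrary placeholder.

```java
// Hypothetical sketch; not part of Lucene's actual API.
final class BufferSizeConfig {
    // Assumed defaults: 2 KB for reads (the value reported above);
    // the write default is an arbitrary placeholder.
    private static volatile int readBufferSize = 2048;
    private static volatile int writeBufferSize = 16384;

    private BufferSizeConfig() {}  // static holder, never instantiated

    static int getReadBufferSize() {
        return readBufferSize;
    }

    static void setReadBufferSize(int bytes) {
        readBufferSize = checkPositive(bytes);
    }

    static int getWriteBufferSize() {
        return writeBufferSize;
    }

    static void setWriteBufferSize(int bytes) {
        writeBufferSize = checkPositive(bytes);
    }

    // Reject sizes that could never hold any data.
    private static int checkPositive(int bytes) {
        if (bytes <= 0) {
            throw new IllegalArgumentException(
                    "buffer size must be positive: " + bytes);
        }
        return bytes;
    }

    public static void main(String[] args) {
        setReadBufferSize(4096);
        System.out.println("read=" + getReadBufferSize()
                + " write=" + getWriteBufferSize());
    }
}
```

In a design like this, a changed value would presumably only apply to
inputs and outputs opened after the setter call, since buffers already
allocated keep their size.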

> Increase buffer sizes used during searching
> -------------------------------------------
>
>                 Key: LUCENE-893
>                 URL: https://issues.apache.org/jira/browse/LUCENE-893
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>    Affects Versions: 2.1
>            Reporter: Michael McCandless
>
> Spinoff of LUCENE-888.
> In LUCENE-888 we increased buffer sizes that impact indexing and found
> substantial (10-18%) overall performance gains.
> It's very likely that we can also gain some performance for searching
> by increasing the read buffers in BufferedIndexInput used by
> searching.
> We need to test performance impact to verify and then pick a good
> overall default buffer size, also being careful not to add too much
> overall HEAP RAM usage because a potentially very large number of
> BufferedIndexInput instances are created during searching
> (# segments X # index files per segment).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

