[ 
https://issues.apache.org/jira/browse/LUCENE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241386#comment-13241386
 ] 

Michael McCandless commented on LUCENE-3932:
--------------------------------------------

I agree net/net that change is good; we know the in-RAM image will be at least 
as large as the tii file so we should make a better guess up front.

3.x is currently in code freeze (for the 3.6.0 release), but I'll commit to 
trunk's preflex codec.

Can you describe more about your index...?  If your tii file is 66 MB, how many 
terms do you have...?  5 seconds is also a long startup time... what's the IO 
system like?
                
> Improve load time of .tii files
> -------------------------------
>
>                 Key: LUCENE-3932
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3932
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 3.5
>         Environment: Linux
>            Reporter: Sean Bridges
>
> We have a large 50 GB index which is optimized as one segment, with a 66 MB 
> .tii file.  This index has no norms and no field cache.
> It takes about 5 seconds to load this index; profiling reveals that 60% of 
> the time is spent in GrowableWriter.set(index, value), and most of the time in 
> set(...) is spent resizing PackedInts.Mutable current.
> In the constructor for TermInfosReaderIndex, you initialize the writer with 
> the line,
> {quote}GrowableWriter indexToTerms = new GrowableWriter(4, indexSize, 
> false);{quote}
> For our index using four as the bit estimate results in 27 resizes.
> The last value in indexToTerms is going to be ~ tiiFileLength, and if instead 
> you use,
> {quote}int bitEstimate = (int) Math.ceil(Math.log10(tiiFileLength) / 
> Math.log10(2));
> GrowableWriter indexToTerms = new GrowableWriter(bitEstimate, indexSize, 
> false);{quote}
> Load time improves to ~ 2 seconds.
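
For illustration only, here is a minimal, Lucene-free sketch of the bit estimate
proposed in the issue description above. The class name BitEstimate and both
method names are hypothetical, and the 66 MB file length is assumed to mean
66 * 2^20 bytes. The integer-only variant is an alternative I am suggesting,
not something taken from the patch: it avoids floating-point rounding and also
handles exact powers of two, where ceil(log2(x)) is one bit short of what is
needed to store x itself.

```java
final class BitEstimate {

    /** The estimate from the issue description, computed with floating-point logs. */
    public static int logEstimate(long tiiFileLength) {
        return (int) Math.ceil(Math.log10(tiiFileLength) / Math.log10(2));
    }

    /** Integer-only alternative: bits needed to store maxValue (at least 1). */
    public static int bitsRequired(long maxValue) {
        if (maxValue < 0) {
            throw new IllegalArgumentException("maxValue must be >= 0: " + maxValue);
        }
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    public static void main(String[] args) {
        // Assumption: "66 MB" is taken as 66 * 2^20 bytes.
        long tiiFileLength = 66L * 1024 * 1024;
        System.out.println("log-based estimate: " + logEstimate(tiiFileLength));
        System.out.println("integer estimate:   " + bitsRequired(tiiFileLength));
    }
}
```

Either value could be passed as the first argument of the GrowableWriter
constructor quoted above, so that the writer starts near its final width
instead of growing from 4 bits.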


        
