Also, if you are really set on the mmap strategy, why not use a single file 
with fixed-length pages, using the header I proposed (and key compression)? You 
don't need any fancy partial-page handling; just waste a small amount of space 
at the end of each page.

I think this is going to be far faster than a file of fixed-length offsets (I 
assume you would also put the entry data length in file #1) plus a file of 
data (file #2), mainly because the final page(s) can be searched more 
efficiently, and, since you can use compression (because you have pages), the 
files are going to be significantly smaller (improving the write time and the 
cache efficiency).
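
A minimal sketch of the single-file idea, to make the comparison concrete. 
Everything specific here is an assumption for illustration, not the header 
proposed earlier: the class name PagedTerms, the 64-byte page size, the 2-byte 
entry-count header, the 1-byte prefix/suffix lengths, and the long value 
stored per key. A heap ByteBuffer stands in for the MappedByteBuffer you would 
get from FileChannel.map. Keys must be supplied in sorted byte order; lookup 
binary-searches the pages by their first (uncompressed) key and then scans one 
page, returning -1 when the term is absent.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class PagedTerms {
    // Hypothetical page size; tiny so the demo spans several pages.
    static final int PAGE_SIZE = 64;

    // Assumed page layout: [short entryCount] then entries of
    // [byte sharedPrefixLen][byte suffixLen][suffix bytes][long value].
    // The first entry of a page always has sharedPrefixLen == 0.
    static ByteBuffer build(List<String> sortedKeys, List<Long> values) {
        // Worst case is one entry per page.
        ByteBuffer buf = ByteBuffer.allocate(PAGE_SIZE * Math.max(1, sortedKeys.size()));
        int pageStart = 0, pos = 2, count = 0;
        byte[] prev = new byte[0];
        for (int i = 0; i < sortedKeys.size(); i++) {
            byte[] key = sortedKeys.get(i).getBytes(StandardCharsets.UTF_8);
            int shared = (count == 0) ? 0 : sharedPrefix(prev, key);
            int need = 2 + (key.length - shared) + 8;
            if (count > 0 && pos + need > PAGE_SIZE) {
                // Close the page; the unused tail is simply wasted space.
                buf.putShort(pageStart, (short) count);
                pageStart += PAGE_SIZE;
                pos = 2;
                count = 0;
                shared = 0;
                need = 2 + key.length + 8;
            }
            if (pos + need > PAGE_SIZE) {
                throw new IllegalArgumentException("key too long for one page");
            }
            int at = pageStart + pos;
            int suffix = key.length - shared;
            buf.put(at, (byte) shared);
            buf.put(at + 1, (byte) suffix);
            for (int j = 0; j < suffix; j++) buf.put(at + 2 + j, key[shared + j]);
            buf.putLong(at + 2 + suffix, values.get(i));
            pos += need;
            count++;
            prev = key;
        }
        buf.putShort(pageStart, (short) count);
        buf.limit(pageStart + PAGE_SIZE);
        return buf;
    }

    static long lookup(ByteBuffer buf, String term) {
        byte[] target = term.getBytes(StandardCharsets.UTF_8);
        int lo = 0, hi = buf.limit() / PAGE_SIZE - 1, page = -1;
        // Binary search for the last page whose first key is <= target.
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (compare(firstKey(buf, mid), target) <= 0) { page = mid; lo = mid + 1; }
            else hi = mid - 1;
        }
        if (page < 0) return -1;
        int base = page * PAGE_SIZE, count = buf.getShort(base), pos = base + 2;
        byte[] key = new byte[0];
        // Linear scan inside the page, rebuilding each key from its prefix.
        for (int i = 0; i < count; i++) {
            int shared = buf.get(pos) & 0xFF, suffix = buf.get(pos + 1) & 0xFF;
            byte[] next = Arrays.copyOf(key, shared + suffix);
            for (int j = 0; j < suffix; j++) next[shared + j] = buf.get(pos + 2 + j);
            key = next;
            int c = compare(key, target);
            if (c == 0) return buf.getLong(pos + 2 + suffix);
            if (c > 0) return -1;
            pos += 2 + suffix + 8;
        }
        return -1;
    }

    static byte[] firstKey(ByteBuffer buf, int page) {
        int base = page * PAGE_SIZE + 2;
        byte[] key = new byte[buf.get(base + 1) & 0xFF];
        for (int j = 0; j < key.length; j++) key[j] = buf.get(base + 2 + j);
        return key;
    }

    static int sharedPrefix(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length), i = 0;
        while (i < n && a[i] == b[i]) i++;
        return i;
    }

    // Unsigned lexicographic byte comparison.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        ByteBuffer buf = build(Arrays.asList("apple", "applesauce", "banana"),
                               Arrays.asList(10L, 20L, 30L));
        System.out.println(lookup(buf, "applesauce"));
    }
}
```

Because every page starts at a multiple of PAGE_SIZE and begins with a full 
key, the search touches one first key per probe plus a single page scan, 
rather than seeking in an offsets file and then again in a data file.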


-----Original Message-----
>From: Robert Engels <reng...@ix.netcom.com>
>Sent: Dec 26, 2008 11:30 AM
>To: java-dev@lucene.apache.org, java-dev@lucene.apache.org
>Subject: Re: Realtime Search
>
>That could very well be, but I was referencing your statement:
>
>"1) Design index formats that can be memory mapped rather than slurped,
>     bringing the cost of opening/reopening an IndexReader down to a
>     negligible level."
>
>The only reason to do this (or have it happen) is if you perform a binary 
>search on the term index.
>
>Using a 2 file system is going to be WAY slower - I'll bet lunch. It might be 
>workable if the files were on a striped drive, or put each file on a different 
>drive/controller, but requiring such specially configured hardware is not a 
>good idea. In the common case (single drive), you are going to be seeking all 
>over the place.
>
>Saving the memory structure from the write of the segment is going to offer 
>far superior performance - you can binary seek on the memory structure, not 
>the mmap file. The only problem with this is that there is going to be a 
>minimum memory requirement.
>
>Also, the mmap is only suitable for 64 bit platforms, since there is no way in 
>Java to unmap, you are going to run out of address space as segments are 
>rewritten.
>
>-----Original Message-----
>>From: Marvin Humphrey <mar...@rectangular.com>
>>Sent: Dec 24, 2008 1:31 PM
>>To: java-dev@lucene.apache.org
>>Subject: Re: Realtime Search
>>
>>On Wed, Dec 24, 2008 at 12:02:24PM -0600, robert engels wrote:
>>> As I understood this discussion though, it was an attempt to remove  
>>> the in memory 'skip to' index, to avoid the reading of this during  
>>> index open/reopen.
>>
>>No.  That idea was entertained briefly and quickly discarded.  There seems to
>>be an awful lot of irrelevant noise in the current thread arising due to lack
>>of familiarity with the ongoing discussions in JIRA.
>>
>>Marvin Humphrey
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
>>For additional commands, e-mail: java-dev-h...@lucene.apache.org
>>
>
