[ 
https://issues.apache.org/jira/browse/LUCENE-3659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173247#comment-13173247
 ] 

Erick Erickson commented on LUCENE-3659:
----------------------------------------

From the dev list; I didn't want to lose this background (or make Uwe type it 
again <G>):

The idea was to maybe replace RAMDirectory with a “clone” of MMapDirectory that 
uses large DirectByteBuffers outside the JVM heap. The current RAMDirectory is 
very limited: the buffer size is hardcoded to 8 KB, and if you have a 50 
Gigabyte index in this RAMDirectory, your GC simply goes crazy. We investigated 
this several times for customers, and RAMDirectory was in fact several times 
slower than a simple disk-based MMapDir. Also, the locking on the RAMFile class 
is horrible: for large indexes you have to switch buffers several times when 
seeking/reading/…, which involves heavy locking. In contrast, MMapDir is 
completely lock-free!
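
To give a feel for why an 8 KB buffer size hurts at this scale, here is a 
back-of-the-envelope sketch in Java. It is not RAMDirectory code: the class and 
method names are made up for illustration, and it assumes "50 Gigabyte" means 
50 GiB treated as one contiguous range of bytes.

```java
public class RamBufferCount {
    // 8 KB, the hardcoded buffer size mentioned above.
    static final long BUFFER_SIZE = 8L * 1024;

    /** Number of fixed-size buffers needed to hold the given number of bytes. */
    public static long buffersNeeded(long byteCount) {
        return (byteCount + BUFFER_SIZE - 1) / BUFFER_SIZE; // ceiling division
    }

    /** Index of the buffer that holds the given absolute position. */
    public static long bufferIndex(long position) {
        return position / BUFFER_SIZE;
    }

    public static void main(String[] args) {
        long fiftyGiB = 50L * 1024 * 1024 * 1024; // assumption: GiB, not GB
        System.out.println(buffersNeeded(fiftyGiB)); // prints 6553600
    }
}
```

Under these assumptions the heap holds several million small arrays, and a 
sequential read crosses a buffer boundary every 8192 bytes.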
 
Until there is a replacement we will not remove it, but the current 
RAMDirectory is not usable for large indexes. That’s a limitation, and the 
design of this class does not support anything else. It’s currently unfixable, 
and instead of putting work into fixing it, the time should be spent on a new 
ByteBuffer-based RAMDir with larger blocks, blocks that merge, or IOContext 
helping to calculate the file size before writing it (e.g. when triggering a 
merge you know the approximate size of the file beforehand, so you can allocate 
a buffer that’s better than 8 Kilobytes). Also, a direct ByteBuffer helps to 
keep the GC happy, as the RAMDir is then outside the JVM heap.
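
A minimal sketch of the pre-sizing idea, using only the JDK: if the approximate 
file size is known up front (for example from a merge), one off-heap buffer of 
that size can be allocated instead of many 8 KB blocks. The class and method 
names are hypothetical, and how the estimate would actually be passed in (e.g. 
via IOContext) is not shown here.

```java
import java.nio.ByteBuffer;

public class PreSizedBuffer {
    /**
     * Allocates one direct (off-heap) buffer sized from an estimate.
     * A single ByteBuffer cannot exceed Integer.MAX_VALUE bytes, so a real
     * implementation would need several buffers for larger files.
     */
    public static ByteBuffer allocateFor(long estimatedBytes) {
        if (estimatedBytes < 0 || estimatedBytes > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("estimate out of range: " + estimatedBytes);
        }
        return ByteBuffer.allocateDirect((int) estimatedBytes);
    }

    public static void main(String[] args) {
        ByteBuffer buf = allocateFor(64L * 1024 * 1024); // hypothetical 64 MiB estimate
        System.out.println(buf.isDirect() + " " + buf.capacity());
    }
}
```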

RAMDir spends more time switching buffers than reading the data. The problem 
is that MMapDir does not support *writing*, and that is why we plan to improve 
this. Have you tried MMapDir for read access in comparison to RAMDirectory for 
a larger index? It outperforms RAMDirectory several times over (depending on 
the OS and whether the file data is already in the FS cache). The new directory 
will simply mimic MMapIndexInput and add an MMapIndexOutput, based not on a 
mmapped buffer but on an in-memory (Direct)ByteBuffer (outside or inside the 
JVM heap; both will be supported). This simplifies the code a lot.
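
As a rough illustration of that design (and not the planned implementation), the 
following JDK-only sketch writes into a single ByteBuffer and hands out 
independent read-only views for reading. All names are hypothetical; the real 
directory would implement Lucene's IndexInput/IndexOutput and handle files 
larger than one buffer.

```java
import java.nio.ByteBuffer;

public class ByteBufferFileSketch {
    private final ByteBuffer buffer;

    /** direct = true allocates outside the JVM heap, false inside it. */
    public ByteBufferFileSketch(int capacity, boolean direct) {
        buffer = direct ? ByteBuffer.allocateDirect(capacity) : ByteBuffer.allocate(capacity);
    }

    /** Output side: append bytes at the current write position. */
    public void write(byte[] src) {
        buffer.put(src);
    }

    /** Input side: a view over the bytes written so far, with its own position. */
    public Reader openReader() {
        ByteBuffer view = buffer.duplicate(); // shares content, not position/limit
        view.flip();                          // limit = bytes written, position = 0
        return new Reader(view.asReadOnlyBuffer());
    }

    public static final class Reader {
        private final ByteBuffer view;

        Reader(ByteBuffer view) {
            this.view = view;
        }

        public void seek(int pos) {
            view.position(pos);
        }

        public byte readByte() {
            return view.get();
        }

        public int length() {
            return view.limit();
        }
    }
}
```

Because every reader owns its view, seeking in one reader does not touch shared 
state, so reads in this sketch need no locking.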
 
The limitations of the crappy RAMDirectory were discussed at conferences, 
sorry. We did *not* decide to remove it (without a patch/replacement). The 
whole “message” on the issue was that RAMDirectory is a bad idea. The 
recommended approach at the moment for handling large in-RAM directories is to 
use a tmpfs on Linux/Solaris with MMapDir on top (for larger indexes). The mmap 
would then directly map the RAM of the underlying tmpfs.
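
The tmpfs approach can be illustrated with plain JDK calls. This sketch does not 
use MMapDirectory itself; it only shows the underlying operation, memory-mapping 
a file that lives on a RAM-backed file system. The class and method names are 
hypothetical, and /dev/shm being a tmpfs mount is an assumption that holds on 
many Linux systems but should be checked.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class TmpfsMapSketch {
    /** Maps the whole file read-only (a single mapping is limited to 2 GB). */
    public static MappedByteBuffer mapReadOnly(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes data to a temp file in dir, reads it back through a mapping, then deletes it. */
    public static byte[] writeThenReadMapped(Path dir, byte[] data) {
        try {
            Path file = Files.createTempFile(dir, "segment", ".bin");
            try {
                Files.write(file, data);
                MappedByteBuffer mapped = mapReadOnly(file);
                byte[] out = new byte[mapped.remaining()];
                mapped.get(out);
                return out;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: /dev/shm is tmpfs; pass another directory as the first argument otherwise.
        Path dir = Paths.get(args.length > 0 ? args[0] : "/dev/shm");
        byte[] back = writeThenReadMapped(dir, new byte[] {1, 2, 3, 4});
        System.out.println(back.length + " bytes read through the mapping");
    }
}
```

With Lucene, the equivalent step would be to open an MMapDirectory on an index 
directory stored under the tmpfs mount.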

                
> Improve Javadocs of RAMDirectory to document its limitations
> ------------------------------------------------------------
>
>                 Key: LUCENE-3659
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3659
>             Project: Lucene - Java
>          Issue Type: Task
>    Affects Versions: 3.5, 4.0
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 3.6, 4.0
>
>
> Spinoff from several dev@lao issues:
> - [http://mail-archives.apache.org/mod_mbox/lucene-dev/201112.mbox/%3C001001ccbf1c%2471845830%24548d0890%24%40thetaphi.de%3E]
> - issue LUCENE-3653
> The use cases for RAMDirectory are very limited and to prevent users from 
> using it for e.g. loading a 50 Gigabyte index from a file on disk, we should 
> improve the javadocs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



