Done: https://issues.apache.org/jira/browse/LUCENE-3659
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

> -----Original Message-----
> From: DM Smith [mailto:[email protected]]
> Sent: Tuesday, December 20, 2011 4:08 PM
> To: [email protected]
> Subject: Re: Plans to remove RAMDirectory?
>
> How about an issue to track this? I'd be glad to do it, but I'm not
> really the "reporter" for it.
>
> -- DM
>
> On 12/20/2011 09:51 AM, Shai Erera wrote:
> > Thanks for the clarification, Uwe. If the whole idea is a new
> > RAMDirectory implementation that is more efficient, then it's OK. I
> > think that the ideas you wrote about are interesting.
> >
> >> Have you tried MMapDir for read access in comparison to
> >> RAMDirectory for a larger index
> >
> > I have, and I support the decision not to use RAMDirectory for such
> > cases. BUT, MMapDir is not recommended for use on all platforms /
> > JDKs. Second, it cannot be used on e.g. HDFS. So sometimes
> > RAMDirectory is the best you can do.
> >
> > Again, if the whole idea is improving RAMDirectory's implementation,
> > then I totally agree with that and it makes sense. My point was that
> > we should not lose the ability to load indexes into RAM.
> >
> > Shai
> >
> > On Tue, Dec 20, 2011 at 3:36 PM, Uwe Schindler <[email protected]> wrote:
> >
> >> Hi,
> >>
> >> You misunderstood the whole thing. The idea was to maybe replace
> >> RAMDirectory with a clone of MMapDirectory that uses large
> >> DirectByteBuffers outside the JVM heap. The current RAMDirectory is
> >> very limited: the buffer size is hardcoded to 8 KB, and if you have
> >> a 50-gigabyte index in this RAMDirectory, your GC simply goes
> >> crazy. We investigated this several times for customers, and
> >> RAMDirectory was in fact several times slower than a simple
> >> disk-based MMapDir. Also, the locking on the RAMFile class is
> >> horrible: for large indexes you have to switch buffers several
> >> times when seeking/reading, which locks heavily. In contrast,
> >> MMapDir is completely lock-free!
> >>
> >> Until there is a replacement we will not remove it, but the current
> >> RAMDirectory is not usable for large indexes. That's a limitation,
> >> and the design of this class does not support anything else. It's
> >> currently unfixable, and instead of putting work into fixing it,
> >> the time should be spent working on a new ByteBuffer-based RAMDir
> >> with larger blocks / blocks that merge, or with IOContext helping
> >> to calculate the file size before writing it (e.g. when triggering
> >> a merge you know the approximate size of the file beforehand, so
> >> you can allocate a buffer that's better than 8 kilobytes). Also, a
> >> direct ByteBuffer helps to keep the GC happy, as the RAMDir is then
> >> outside the JVM heap.
> >>
> >> > Also, RAMDirectory is still more efficient than MMapDirectory, if
> >> > you want to index (and then search) on a small (sometimes even
> >> > transient) amount of data
> >>
> >> That's not true, as RAMDir spends more time switching buffers than
> >> reading the data. The problem is that MMapDir does not support
> >> *writing*, and that is why we plan to improve this. Have you tried
> >> MMapDir for read access in comparison to RAMDirectory for a larger
> >> index? It is several times faster (depending on the OS and whether
> >> the file data is already in the FS cache). The new directory will
> >> simply mimic MMapIndexInput and add an MMapIndexOutput, based not
> >> on an mmapped buffer but on an in-memory (Direct)ByteBuffer (both
> >> outside and inside the JVM heap will be supported). This simplifies
> >> the code a lot.
> >>
> >> The limitations of the crappy RAMDirectory were discussed at
> >> conferences, sorry. We did *not* decide to remove it (without a
> >> patch/replacement). The whole message on the issue was that
> >> RAMDirectory is a bad idea. The recommended approach at the moment
> >> for handling large in-RAM directories would be to use a tmpfs on
> >> Linux/Solaris and use MMapDir on top (for larger indexes). The mmap
> >> would then directly map the RAM of the underlying tmpfs.
> >>
> >> Uwe
> >>
> >> -----
> >> Uwe Schindler
> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> http://www.thetaphi.de
> >> eMail: [email protected]
> >>
> >> From: Shai Erera [mailto:[email protected]]
> >> Sent: Tuesday, December 20, 2011 2:13 PM
> >> To: [email protected]
> >> Subject: Plans to remove RAMDirectory?
> >>
> >> Hi
> >>
> >> Uwe mentioned on LUCENE-3653 that there are plans to remove
> >> RAMDirectory from Trunk and move to tests only: "RAMDirectory is
> >> written for tests, not for production use. There are already plans
> >> to remove it from Lucene trunk and move to tests only." (see full
> >> comment:
> >> https://issues.apache.org/jira/browse/LUCENE-3653?focusedCommentId=13172338&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13172338
> >> )
> >>
> >> I wasn't aware of such plans - were there emails about it, or has
> >> it been discussed on IRC?
> >>
> >> I disagree that RAMDirectory is useful only for tests. For example,
> >> when someone wants to index on Hadoop, RAMDirectory can be very
> >> useful (even though it's not the only solution). Also, RAMDirectory
> >> is still more efficient than MMapDirectory, if you want to index
> >> (and then search) on a small (sometimes even transient) amount of
> >> data. We use it in several cases for such purposes.
> >>
> >> If RAMDirectory needs to improve (for instance, allocate bigger
> >> byte[] chunks), then IMO we should do that, rather than drop it
> >> entirely from core. I think it's a very valuable Directory
> >> implementation that Lucene offers, and I'd hate to see it
> >> disappear.
> >>
> >> Shai
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
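
For readers following the thread, the ByteBuffer-based RAMDir idea that Uwe describes (larger blocks, allocated outside the JVM heap) could look roughly like the sketch below. This is not Lucene code and not the design tracked in LUCENE-3659: the class name `OffHeapChunkedBuffer`, its methods, and the chunk sizes are hypothetical and chosen only for illustration, and the sketch uses nothing beyond `java.nio.ByteBuffer`. It makes no attempt at thread safety for concurrent writes.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this class does not exist in Lucene.
// It stores an append-only "file" in direct (off-heap) ByteBuffers of a
// configurable chunk size instead of fixed 8 KB heap arrays.
final class OffHeapChunkedBuffer {
    private final int chunkSize;
    private final List<ByteBuffer> chunks = new ArrayList<>();
    private long length = 0;

    OffHeapChunkedBuffer(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    // Append bytes, allocating a new direct chunk whenever the last one is full.
    void write(byte[] src) {
        int offset = 0;
        while (offset < src.length) {
            int posInChunk = (int) (length % chunkSize);
            if (posInChunk == 0) {
                chunks.add(ByteBuffer.allocateDirect(chunkSize));
            }
            ByteBuffer chunk = chunks.get(chunks.size() - 1);
            int n = Math.min(src.length - offset, chunkSize - posInChunk);
            chunk.put(src, offset, n); // relative put; position advances by n
            offset += n;
            length += n;
        }
    }

    // Fill dst with bytes starting at absolute position pos. Each chunk is
    // read through duplicate(), so reads never change the position of the
    // stored buffers.
    void read(long pos, byte[] dst) {
        if (pos < 0 || pos + dst.length > length) {
            throw new IndexOutOfBoundsException("read past end of buffer");
        }
        int offset = 0;
        while (offset < dst.length) {
            ByteBuffer view = chunks.get((int) (pos / chunkSize)).duplicate();
            int posInChunk = (int) (pos % chunkSize);
            int n = Math.min(dst.length - offset, chunkSize - posInChunk);
            view.position(posInChunk);
            view.get(dst, offset, n);
            offset += n;
            pos += n;
        }
    }

    int chunkCount() {
        return chunks.size();
    }

    long length() {
        return length;
    }

    public static void main(String[] args) {
        byte[] data = new byte[100_000];
        OffHeapChunkedBuffer small = new OffHeapChunkedBuffer(8 * 1024);
        OffHeapChunkedBuffer large = new OffHeapChunkedBuffer(1 << 20);
        small.write(data);
        large.write(data);
        System.out.println("8 KB chunks: " + small.chunkCount());
        System.out.println("1 MB chunks: " + large.chunkCount());
    }
}
```

For 100,000 bytes, the 8 KB chunk size needs 13 chunks while the 1 MB chunk size needs one, so a sequential read crosses far fewer chunk boundaries. That is the buffer-switching cost the thread is arguing about; what it actually costs inside Lucene would have to be measured there.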
