I think we had a similar bug with NIO in NIOFSDir/SimpleFSDir? We have a chunk size there because of that!
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

From: Shai Erera [mailto:[email protected]]
Sent: Thursday, June 24, 2010 11:08 PM
To: [email protected]
Subject: FSDirectory.copy() impl might be dangerous

Hi

Today I ran into a weird exception from FSDir.copy(), and while investigating it, I spotted a potential bug as well. So bug first: FileChannel.transferFrom documents that it may not copy the number of bytes requested, however we don't check the return value. So need to fix the code to read in a loop until all bytes were copied. That's an easy fix.

Now for the dangerous part - I wanted to measure segment merging performance, so I created two indexes: 10K docs and 100K docs, both are optimized. I then use IndexWriter.addIndexes(Directory...) method to merge 100 copies of the first into a new directory, and 10 copies of the second into a new directory (to create an index of 1M docs, but different number of segments). I then call optimize().

Surprisingly, when calling addIndexes() w/ the 100K-docs segments, I ran into this exception (Java 1.6 -- Java 1.5's exception was cryptic):

Exception in thread "main" java.io.IOException: Map failed
    at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:770)
    at sun.nio.ch.FileChannelImpl.transferToTrustedChannel(FileChannelImpl.java:450)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:523)
    at org.apache.lucene.store.FSDirectory.copy(FSDirectory.java:450)
    at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:3019)
Caused by: java.lang.OutOfMemoryError: Map failed
    at sun.nio.ch.FileChannelImpl.map0(Native Method)
    at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:767)
    ... 7 more

I run this on my laptop w/ 4GB RAM. So it's entirely possible there are memory issues here. BUT - the segment size is only 300 MB, which is still much much less than my machine's RAM.
What worries me is not that particular test - but what will happen if someone tries to addIndexes() segments that are 10GB or 100GB or even more ... then it really won't matter how much RAM you have. So let's take the RAM availability out of the picture.

This API is dangerous, because if someone tries to merge not so large segments, on a machine w/ not so much RAM, he'll hit an exception - and it didn't happen before 'cause we used byte[] copies (which is slower).

I changed FSDir.copy() code to copy in chunks of 64MB:

    long numWritten = 0;
    long numToWrite = input.size();
    long bufSize = 1 << 26;
    while (numWritten < numToWrite) {
      numWritten += output.transferFrom(input, numWritten, bufSize);
    }

And the process completed successfully. Obviously, 64MB may be too high for other systems, so I'm thinking we should make it configurable, but still - chunking, using the same API, succeeds.

I guess it's just a "not so friendly impl" of Java's FileChannelImpl, but I don't know if we can work around it. Maybe we can perf-test and use a smaller chunk size that is safer for all cases (and yields the same performance as larger ones) ...

BTW, I don't have FileChannelImpl's source, but Mike found it here: http://www.docjar.com/html/api/sun/nio/ch/FileChannelImpl.java.html
It doesn't look like the impl chunks anything ...

What do you think?

Shai
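
For illustration, here is a minimal standalone sketch of the chunked copy described above. It is not the actual FSDirectory.copy() code: the class name ChunkedCopy, the copy(Path, Path, long) signature, and the use of java.nio.file (Java 7 and later) are assumptions made for the example. It loops on FileChannel.transferFrom until all bytes are copied, caps each request at a configurable chunk size, and fails if a call makes no progress, so the loop cannot spin forever.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of Lucene.
final class ChunkedCopy {

    // 64MB, the chunk size mentioned in the message (1 << 26 bytes).
    static final long DEFAULT_CHUNK = 1L << 26;

    // Copies src to dst in requests of at most chunkSize bytes and
    // returns the total number of bytes written.
    static long copy(Path src, Path dst, long chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long written = 0;
            while (written < size) {
                // transferFrom may copy fewer bytes than requested,
                // so the return value drives the loop.
                long count = Math.min(chunkSize, size - written);
                long n = out.transferFrom(in, written, count);
                if (n <= 0) {
                    // E.g. the source was truncated while copying.
                    throw new IOException("no progress at offset "
                            + written + " of " + size);
                }
                written += n;
            }
            return written;
        }
    }
}
```

Calling ChunkedCopy.copy(src, dst, ChunkedCopy.DEFAULT_CHUNK) uses the 64MB chunk from the message; which chunk size is actually safe on a given system would still have to be measured.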
