Thanks Uwe!
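
For anyone finding this thread in the archive, here is a minimal sketch of the two read paths described in the quoted reply below. It uses only plain java.nio, not Lucene's own classes: `ReadPathSketch`, `BufferedInts` and `compare` are hypothetical names made up for illustration, and the 16 KiB buffer size is simply the figure quoted in the thread. The buffered path refills an on-heap buffer with a positional `FileChannel.read` whenever the requested 4 bytes are not already in it; the mapped path reads straight from a `MappedByteBuffer`.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ReadPathSketch {
    // Buffer size taken from the figure quoted in the thread (assumption).
    static final int BUFFER_SIZE = 16 * 1024;

    /** Hypothetical stand-in for a buffered reader: an on-heap buffer refilled on every miss. */
    static final class BufferedInts {
        private final FileChannel channel;
        private final ByteBuffer heap = ByteBuffer.allocate(BUFFER_SIZE);
        private long start = -1;   // file offset of heap[0]; -1 means "nothing buffered yet"
        private int length = 0;    // number of valid bytes in the buffer
        long refills = 0;          // how many times we had to go back to the OS

        BufferedInts(FileChannel channel) {
            this.channel = channel;
        }

        int readInt(long pos) throws IOException {
            boolean miss = start < 0 || pos < start || pos + 4 > start + length;
            if (miss) {
                // Copy up to BUFFER_SIZE bytes from the file into the Java heap.
                heap.clear();
                while (heap.hasRemaining()) {
                    if (channel.read(heap, pos + heap.position()) < 0) {
                        break; // end of file
                    }
                }
                start = pos;
                length = heap.position();
                refills++;
                if (length < 4) {
                    throw new EOFException("read past end of file at " + pos);
                }
            }
            return heap.getInt((int) (pos - start));
        }
    }

    /** Returns {bufferedSum, mappedSum, refills} for scattered 4-byte reads. */
    static long[] compare(int intCount, int reads) {
        try {
            Path file = Files.createTempFile("read-path-sketch", ".bin");
            file.toFile().deleteOnExit();

            // Write intCount big-endian ints; the int at index i has value i.
            ByteBuffer data = ByteBuffer.allocate(intCount * 4);
            for (int i = 0; i < intCount; i++) {
                data.putInt(i);
            }
            Files.write(file, data.array());

            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                BufferedInts buffered = new BufferedInts(channel);
                MappedByteBuffer mapped =
                        channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
                long bufferedSum = 0;
                long mappedSum = 0;
                for (int i = 0; i < reads; i++) {
                    // Scattered positions, roughly like sort-value lookups.
                    int index = (int) ((i * 7919L) % intCount);
                    bufferedSum += buffered.readInt(index * 4L);
                    mappedSum += mapped.getInt(index * 4);
                }
                return new long[] {bufferedSum, mappedSum, buffered.refills};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = compare(1 << 16, 1000);
        System.out.println("buffered sum=" + r[0] + " mapped sum=" + r[1] + " refills=" + r[2]);
    }
}
```

Both paths return the same values; the `refills` counter shows how often the buffered path has to copy a fresh block into the heap when the reads are scattered. In an actual Lucene application the choice is made by which `Directory` implementation you open, not by code like this.
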
On Thu, Nov 24, 2016 at 9:41 AM, Uwe Schindler <u...@thetaphi.de> wrote:
> Hi Kumaran, hi Erick,
>
>> Not really, as I don't know that code well, Uwe and company
>> are the masters of that realm ;)....
>>
>> Sorry I can't be more help there....
>
> I can help!
>
>> On Thu, Nov 24, 2016 at 7:29 AM, Kumaran Ramasubramanian
>> <kums....@gmail.com> wrote:
>> > Erick, Thanks a lot for sharing an excellent post...
>> >
>> > Btw, am using NIOFSDirectory, could you please elaborate on the
>> > below-mentioned lines? or any further pointers?
>> >
>> >> NIOFSDirectory or SimpleFSDirectory, we have to pay another price:
>> >> Our code has to do a lot of syscalls to the O/S kernel to copy
>> >> blocks of data between the disk or filesystem cache and our buffers
>> >> residing in Java heap. This needs to be done on every search
>> >> request, over and over again.
>
> The blog post says it simply: you should use MMapDirectory and avoid
> SimpleFSDir or NIOFSDir! The blog post explains why: SimpleFSDir and
> NIOFSDir extend BufferedIndexInput. This class uses an on-heap buffer
> (16 KiB) for reading index files. For some parts of the index (like doc
> values), this is not ideal. E.g. if you sort against a doc values field
> and it needs to access a sort value (e.g. a short, integer or byte,
> which is very small), it will ask the buffer for something like 4 bytes.
> In most cases when sorting, the buffer will not contain those bytes, as
> sorting requires random access over a huge file (so it is unlikely that
> the buffer will help). Then BufferedIndexInput will seek the NIO/Simple
> file pointer and read 16 KiB into the buffer. This requires a syscall to
> the OS kernel, which is expensive. While sorting search results this can
> happen millions or billions of times. In addition, it will copy chunks
> of memory between the Java heap and the operating system cache over and
> over.
>
> With MMapDirectory no buffering is done; the Lucene code directly
> accesses the file system cache, and this is much more optimized.
>
> So for fast index access:
> - avoid SimpleFSDir or NIOFSDir (those are only there for legacy 32 bit
>   operating systems and JVMs)
> - configure your operating system kernel as described in the blog post
>   and use MMapDirectory
> - tell the sysadmin to get familiar with the output of Linux commands
>   free/top/... (or the Windows equivalents).
>
> Uwe
>
>> > --
>> > Kumaran R
>> >
>> >
>> >
>> > On Wed, Nov 23, 2016 at 9:17 PM, Erick Erickson
>> > <erickerick...@gmail.com> wrote:
>> >
>> >> see Uwe's blog:
>> >> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>> >>
>> >> Short form: files are read into the OS's memory as needed. The whole
>> >> file isn't read at once.
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Wed, Nov 23, 2016 at 12:04 AM, Kumaran Ramasubramanian
>> >> <kums....@gmail.com> wrote:
>> >> > Hi All,
>> >> >
>> >> > how does lucene read large index files?
>> >> > for example, if one file (for eg: .dat file) is 4GB.
>> >> > does lucene read only part of the file into RAM? or
>> >> > is it a different approach for different lucene file formats?
>> >> >
>> >> >
>> >> > Related Link:
>> >> > How do applications (and OS) handle very big files?
>> >> > http://superuser.com/a/361201
>> >> >
>> >> >
>> >> > --
>> >> > Kumaran R
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> >> For additional commands, e-mail: java-user-h...@lucene.apache.org