Thanks Uwe!



On Thu, Nov 24, 2016 at 9:41 AM, Uwe Schindler <u...@thetaphi.de> wrote:
> Hi Kumaran, hi Erick,
>
>> Not really, as I don't know that code well, Uwe and company
>> are the masters of that realm ;)....
>>
>> Sorry I can't be more help there....
>
> I can help!
>
>> On Thu, Nov 24, 2016 at 7:29 AM, Kumaran Ramasubramanian
>> <kums....@gmail.com> wrote:
>> > Erick, Thanks a lot for sharing an excellent post...
>> >
>> > Btw, am using NIOFSDirectory, could you please elaborate on below mentioned
>> > lines? or any further pointers?
>> >> NIOFSDirectory or SimpleFSDirectory, we have to pay another price: Our code
>> >> has to do a lot of syscalls to the O/S kernel to copy blocks of data
>> >> between the disk or filesystem cache and our buffers residing in Java heap.
>> >> This needs to be done on every search request, over and over again.
>
> The blog post says it simply: you should use MMapDirectory and avoid
> SimpleFSDir or NIOFSDir! It also explains why: SimpleFSDir and NIOFSDir
> use index inputs that extend BufferedIndexInput. This class uses an on-heap
> buffer (16 KiB) for reading index files. For some parts of the index (like
> doc values), this is not ideal. E.g. if you sort on a doc values field and
> the sort needs one value (a short, integer or byte, which is very small),
> it asks the buffer for something like 4 bytes. In most cases the buffer
> will not contain those bytes, because sorting requires random access over
> a huge file (so it is unlikely that the buffer will help). BufferedIndexInput
> then has to seek the NIO/Simple file pointer and read 16 KiB into the
> buffer. This requires a syscall to the OS kernel, which is expensive.
> While sorting search results this can happen millions or billions of
> times. In addition, chunks of memory are copied between the operating
> system cache and the Java heap over and over.
>
> With MMapDirectory there is no on-heap buffering: the Lucene code accesses
> the file system cache directly, which avoids both the syscall and the copy.
>
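
For anyone reading along, here is a minimal sketch of the two access patterns
described above. It is plain java.nio, not Lucene's actual code: the names
ReadPatterns, HeapBufferedInput and writeInts are made up for illustration,
BUFFER_SIZE simply reuses the figure quoted above, and the test file is
synthetic. A single MappedByteBuffer is int-indexed, so this sketch only
works for files below 2 GiB.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Random;

public class ReadPatterns {
    // Buffer size as quoted in the thread; an assumption for this sketch.
    static final int BUFFER_SIZE = 16 * 1024;

    /** Hypothetical stand-in for a heap-buffered input (not a Lucene class). */
    static final class HeapBufferedInput {
        final FileChannel channel;
        final ByteBuffer buffer = ByteBuffer.allocate(BUFFER_SIZE); // lives on the Java heap
        long bufferStart = -1;
        long refills = 0; // each refill is at least one positional read (a syscall)

        HeapBufferedInput(FileChannel channel) {
            this.channel = channel;
        }

        int readInt(long pos) throws IOException {
            boolean miss = bufferStart < 0
                    || pos < bufferStart
                    || pos + 4 > bufferStart + buffer.limit();
            if (miss) {
                refill(pos);
            }
            return buffer.getInt((int) (pos - bufferStart));
        }

        private void refill(long pos) throws IOException {
            buffer.clear();
            // Copy up to BUFFER_SIZE bytes from the OS cache into the heap buffer.
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, pos + buffer.position());
                if (n < 0) {
                    break; // end of file
                }
            }
            buffer.flip();
            bufferStart = pos;
            refills++;
        }
    }

    /** Writes count big-endian ints (value i * 7) to the given file. */
    static void writeInts(Path file, int count) throws IOException {
        ByteBuffer data = ByteBuffer.allocate(count * 4);
        for (int i = 0; i < count; i++) {
            data.putInt(i * 7);
        }
        Files.write(file, data.array());
    }

    public static void main(String[] args) throws IOException {
        int count = 1_000_000;
        Path file = Files.createTempFile("readpatterns", ".bin");
        file.toFile().deleteOnExit();
        writeInts(file, count);

        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            HeapBufferedInput buffered = new HeapBufferedInput(ch);
            // Memory-mapped access: reads go straight to the mapped pages.
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());

            Random rnd = new Random(42);
            for (int i = 0; i < 100_000; i++) {
                long pos = 4L * rnd.nextInt(count);
                if (buffered.readInt(pos) != mapped.getInt((int) pos)) {
                    throw new AssertionError("mismatch at " + pos);
                }
            }
            System.out.println("buffer refills (positional reads): " + buffered.refills);
        }
    }
}
```

With random positions spread over a file much larger than the buffer, most
readInt calls on the buffered input miss and trigger a refill, while the
mapped buffer needs no explicit read call at all.
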
> So for fast index access:
> - avoid SimpleFSDir or NIOFSDir (those are only there for legacy 32 bit 
> operating systems and JVMs)
> - configure your operating system kernel as described in the blog post and 
> use MMapDirectory
> - ask your sysadmin to learn how to read the output of the Linux commands
> free/top/... (or their Windows equivalents).
>
> Uwe
>
>> > --
>> > Kumaran R
>> >
>> >
>> >
>> > On Wed, Nov 23, 2016 at 9:17 PM, Erick Erickson <erickerick...@gmail.com>
>> > wrote:
>> >
>> >> see Uwe's blog:
>> >> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>> >>
>> >> Short form: files are read into the OS's memory as needed. The whole
>> >> file isn't read at once.
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Wed, Nov 23, 2016 at 12:04 AM, Kumaran Ramasubramanian
>> >> <kums....@gmail.com> wrote:
>> >> > Hi All,
>> >> >
>> >> > How does Lucene read large index files?
>> >> > For example, if one file (e.g. a .dat file) is 4 GB,
>> >> > does Lucene read only part of the file into RAM? Or
>> >> > does the approach differ between Lucene file formats?
>> >> >
>> >> >
>> >> > Related Link:
>> >> > How do applications (and OS) handle very big files?
>> >> > http://superuser.com/a/361201
>> >> >
>> >> >
>> >> > --
>> >> > Kumaran R
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> >> For additional commands, e-mail: java-user-h...@lucene.apache.org
>> >>
>> >>
>>
>
>
>

