Hi,
1)
ByteBuffer bb=ByteBuffer.wrap(new byte[len]); //bb is HeapByteBuffer
channel.read(bb);
this indeed takes 2 GB of heap, because you allocate 2 gigabytes on the Java heap.
The additional direct memory comes from JVM internals. This depends on the JVM
you are using: some use mmapping, some use direct buffers behind the scenes, some
directly access the byte buffer. The problem that needs to be taken care of (and
the reason DirectBuffers are better for copying *large* data): a DirectBuffer can
be locked into memory and will not be touched by GC, so the operating system can
use the underlying POSIX read() call from the kernel and pass the direct
memory to it. With heap buffers this is more complicated. Because the JVM heap
manager and GC sometimes defragment the Java heap (out of order, in a separate
thread), the read method cannot use the byte[] directly for the POSIX call. So
indeed you might (depending on the JVM) copy the data 2 times: once from FS cache
to a direct buffer and then from the direct buffer to the heap. So indeed for
Lucene it could be better to use direct buffers, BUT:
Lucene only loads a few bytes from the file at a time; it mostly depends on
seeking, not on copying data. In that case the overhead of direct buffers
outweighs the savings from avoiding multiple copies in memory. The double memory
allocation does not matter to Lucene, as its buffers are small.
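For illustration, here is a minimal sketch (the class name and temp file are made up for the demo) that reads the same file once into a heap buffer and once into a direct buffer. Both deliver identical bytes; only the internal copy path described above differs:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class HeapVsDirectRead {
    public static void main(String[] args) throws IOException {
        // Create a small temp file to read back.
        Path file = Files.createTempFile("bufdemo", ".bin");
        byte[] data = new byte[8192];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        Files.write(file, data);

        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Heap buffer: the JVM may copy through an internal direct buffer.
            ByteBuffer heap = ByteBuffer.allocate(data.length);
            while (heap.hasRemaining()) ch.read(heap);
            heap.flip();

            // Direct buffer: the kernel read() can target this memory directly.
            ByteBuffer direct = ByteBuffer.allocateDirect(data.length);
            ch.position(0);
            while (direct.hasRemaining()) ch.read(direct);
            direct.flip();

            // Both paths deliver the same bytes; only the copy path differs.
            System.out.println("equal=" + heap.equals(direct));
        } finally {
            Files.delete(file);
        }
    }
}
```

This prints `equal=true`; the difference between the two buffers is where the memory lives and how many copies happen on the way, not the data itself.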
2) is the same as 1)
3) This copies the data one (!) time (from FS cache to the direct buffer). But
because Lucene uses small buffers with random access, the overhead is larger;
similar to MMap, with the additional cost of the 2-times mapping.
4) This is an incorrect example: the “trick” with MMapDirectory is to directly
use the whole file as virtual memory. Data is never copied, but access is
slightly slower. It uses zero additional memory, only FS cache. In your example
you copy the whole mapped ByteBuffer into a heap buffer, and that is not a good
idea. Lucene does not do this; it uses the mapped ByteBuffer with random access,
so it's just like “memory” access.
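To illustrate that random-access style, here is a small sketch (class name, file and values are made up for the demo). Absolute gets on the mapped buffer read through to the FS cache without any bulk copy into a heap byte[]:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.*;

public class MmapRandomAccess {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mmapdemo", ".bin");
        // Write 1024 ints (4 KiB) so we can seek into the file.
        ByteBuffer out = ByteBuffer.allocate(1024 * 4);
        for (int i = 0; i < 1024; i++) out.putInt(i * 3);
        Files.write(file, out.array());

        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            // Absolute get at a file offset: no bulk get(byte[]) copy to heap,
            // the access reads through to the FS cache like "memory" access.
            int v = map.getInt(500 * 4);
            System.out.println("value=" + v);
        } finally {
            Files.delete(file);
        }
    }
}
```

This prints `value=1500` (the int written at slot 500). The whole-file `bb.get(new byte[len])` from example 4 is exactly the copy this style avoids.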
RES is hard to explain in most cases, as it is highly dependent on the JVM
internals and the internals of your O/S. It's different on Linux than on Solaris
or Windows. It's just a “number”; the definition of “RES” is vague, and every O/S
interprets it differently. In reality: the MappedByteBuffer in Lucene is mapped
directly onto the FS cache, and access to it reads through to the FS cache; the
data is never copied and no additional memory is used (because it's a mapping).
Finally: if you have a 64-bit JVM, always use MMapDirectory; that's all
explained in the blog post. It uses less physical resources than any other
approach and is fastest!
It might be worth a try to make NIOFSDir use direct buffers. We can only try it
out and see how it behaves, and maybe make it configurable.
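A rough sketch of why that change is not free (this is not Lucene code; the class and method are hypothetical): Lucene's buffered read path ultimately fills a byte[] on the heap, so a direct buffer as the read target forces one extra copy back onto the heap:

```java
import java.nio.ByteBuffer;

public class DirectBufferVariant {
    // Hypothetical sketch of a direct-buffer read path. Because the caller
    // expects a heap byte[], the direct buffer's contents must be copied
    // back: direct memory -> heap, one extra copy compared to a heap buffer.
    static byte[] readViaDirect(ByteBuffer directSource, int len) {
        byte[] heapCopy = new byte[len];
        directSource.get(heapCopy); // the extra off-heap-to-heap copy
        return heapCopy;
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        for (int i = 0; i < 16; i++) direct.put((byte) (i * 2));
        direct.flip();
        byte[] result = readViaDirect(direct, 16);
        System.out.println("first=" + result[0] + " last=" + result[15]);
    }
}
```

Whether the faster kernel-side read outweighs this extra copy for Lucene's small 1024-byte buffers is exactly what would need to be measured.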
Uwe
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]
From: [email protected] [mailto:[email protected]]
Sent: Wednesday, August 07, 2013 11:36 AM
To: java-user
Subject: Re: RE: why lucene not use DirectByteBuffer in NIOFSDirectory
Hi Uwe:
Thank you for your detailed explanation; I learned a lot from your message.
First, the direct buffer and the FS cache do not share the same memory areas.
Second, accessing direct memory is slower than accessing heap memory from
Java code.
In addition, I tested the different ways to use ByteBuffer in Java NIO and
watched the memory. The results are below.
This is the initial memory status of the Linux server; after each test, I
clear the system cache.
RandomAccessFile raf=new RandomAccessFile(file,"r"); // file size is
about 2G
FileChannel channel=raf.getChannel();
int len=(int)channel.size();
1. ByteBuffer bb=ByteBuffer.wrap(new byte[len]); //bb is HeapByteBuffer
channel.read(bb);
It consumes 4G of physical memory besides the FS cache: 2G of heap memory
consumed by the new byte[], and the other 2G is direct memory outside the heap.
It confuses me why it would use a DirectByteBuffer internally, or did I make
some mistake? If it uses a DirectByteBuffer, then the data is copied from the
FS cache to the direct buffer, and from the direct buffer to the byte array in
heap memory: two copies?
2. ByteBuffer bb=ByteBuffer.allocate(len); //bb is HeapByteBuffer
channel.read(bb);
The memory usage is the same as above.
3. ByteBuffer bb=ByteBuffer.allocateDirect(len); //bb is DirectByteBuffer
channel.read(bb);
It consumes 2G of physical memory besides the FS cache, and all of it is
direct memory outside the heap. The heap memory used is less than 2M.
4. MappedByteBuffer bb=channel.map(FileChannel.MapMode.READ_ONLY,0,len);
bb.get(new byte[len]);
It consumes 4G of memory according to top RES, and the heap memory is 2G
consumed by the new byte[]. I am confused about what the other 2G is. From
buffers/cache used, the code should consume 2326; why is it not the same as
top RES?
None of the above tests compare the performance of the different ways. In
Lucene's NIO the buffer size is 1024, not the whole file size as in my tests.
And mmap in Lucene uses ByteBuffer.get() and getInt() to fetch the data, so it
does not need to copy data to a new byte array in heap memory as I did.
I wish somebody could give me some explanation of the two things that confuse
me in the above tests.
Thank you again!
------------------------------------------------------------------
From: Uwe Schindler <[email protected]>
Sent: July 31, 2013 18:18
To: [email protected]; [email protected]
Subject: RE: why lucene not use DirectByteBuffer in NIOFSDirectory
Hi,
There is a misunderstanding: just by allocating a direct buffer, there is still
no difference to a heap buffer in the workflow!
NIO will read the data from the file, copy it to the FS cache, and then the
positional read() system call (used by NIO) copies the FS cache contents to the
direct buffer; no real difference to a heap buffer. So it is still the same:
data needs to be copied. Please note: direct buffers have nothing to do with the
file system cache; they don't share the same memory areas! Hotspot allocates
direct buffers using malloc() outside of the Java heap, so they are not really
useful here.
The downside of using a non-heap buffer as the target of the copy operation is
the fact that direct buffers are approx. 2 times slower when accessed from Java
code (because they are outside the Java heap, the VM has to do extra work to
prevent illegal accesses). So you have the same time for the copy but slower
access from Java. The buffers allocated by NIO are small, so it does not matter
for performance where they live; heap is better. MappedByteBuffers are also
direct buffers (they have the same base class), so there is still the overhead
when accessing them from Java code, but to get rid of the additional copy, use
MMapDirectory.
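That last point can be checked from plain Java (the demo class name is made up): a MappedByteBuffer reports itself as direct, unlike a heap buffer:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.*;

public class BufferKinds {
    public static void main(String[] args) throws IOException {
        // Heap buffer vs. malloc()-backed direct buffer.
        System.out.println("heap direct? " + ByteBuffer.allocate(16).isDirect());
        System.out.println("alloc direct? " + ByteBuffer.allocateDirect(16).isDirect());

        Path file = Files.createTempFile("kind", ".bin");
        Files.write(file, new byte[16]);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, 16);
            // MappedByteBuffer extends ByteBuffer and is reported as direct,
            // so it carries the same off-heap access overhead from Java code.
            System.out.println("mapped direct? " + map.isDirect());
        } finally {
            Files.delete(file);
        }
    }
}
```

It prints `false`, `true`, `true`: the mapped buffer shares the direct buffers' access overhead, but avoids the copy entirely.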
To conclude:
- MMapDirectory: no copying of the data from FS cache to heap or direct buffers
is needed, which is what wastes most of the time. Access to the MappedByteBuffer
from Java code is slower, but the spared data copy makes it much better for
large files as used by Lucene.
- NIOFSDirectory with direct buffers: needs to copy data from FS cache to
direct buffer memory (outside the heap). Access from Java is slower for direct
buffers than for heap buffers -> 2 times bad.
- NIOFSDirectory with heap buffers: needs to copy data from FS cache to heap.
Access time from Java code is very good!
Uwe
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]
> Sent: Wednesday, July 31, 2013 11:59 AM
> To: [email protected]
> Subject: why lucene not use DirectByteBuffer in NIOFSDirectory
>
> I read the article "Use Lucene's MMapDirectory on 64bit platforms, please!"
> and it said the MMapDirectory is better than the other Directory implementations
> because it avoids copying data between the file system cache and the Java heap.
>
> I checked the source code of NIOFSDirectory, and in the newBuffer method it
> calls "ByteBuffer.wrap(newBuffer)", so the generated ByteBuffer is a
> HeapByteBuffer, and it will indeed copy data between the file system cache and
> the Java heap. Why not use ByteBuffer.allocateDirect to generate a
> DirectByteBuffer, which would store the data directly in the file system cache,
> not the Java heap? In that case, what would be the performance difference
> between NIO and MMap? Or is allocating direct memory still slower than MMap?
>
> Maybe I have misunderstood the Lucene code; thank you for any
> suggestion in advance.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]