Re: Direct I/O

Michael McCandless Wed, 18 Sep 2019 06:22:01 -0700

Dawid, it's confusing: direct IO is different from a direct ByteBuffer!

Direct IO means you bypass all kernel "smarts", so the Linux buffer cache
is not used, no IO scheduling, no write cache that the pdflush daemon must
periodically move to disk, etc.  This is normally a bad idea, and it is better
to use fadvise/madvise to give the kernel hints about what you are doing, and use
the buffer cache for what it's good at.  Linus hates that direct IO is even
an option for us ...

Back when I wrote NativeUnixDirectory, the idea was to prevent ongoing
merges from so heavily impacting ongoing searches, when you are doing
indexing and searching on one node.  We open the newly merged segment
files using direct IO, and do our own buffering, and then all writes go
straight to disk instead of using up precious hot pages that are in use for
searching.  I think I ran some simple performance tests back then but I
don't remember the results ... more testing is needed to see if it really
helps.
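
For what it's worth, here is a minimal sketch of what "open with direct IO
and do our own buffering" could look like using the JDK's
ExtendedOpenOption.DIRECT instead of JNI.  It is not the NativeUnixDirectory
code: the names (DirectWriteSketch, roundUp, alignedBuffer, writeDirect) are
made up for illustration, it holds the whole file in memory for brevity, and
it assumes the file system accepts O_DIRECT (some, such as tmpfs, may reject
it):

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, for illustration only.
final class DirectWriteSketch {

    // Round n up to the next multiple of blockSize.
    static long roundUp(long n, int blockSize) {
        return (n + blockSize - 1) / blockSize * blockSize;
    }

    // Direct buffer whose start address is block-aligned. capacity must be a
    // multiple of blockSize, and blockSize must be a power of two.
    static ByteBuffer alignedBuffer(int capacity, int blockSize) {
        return ByteBuffer.allocateDirect(capacity + blockSize - 1)
                         .alignedSlice(blockSize);
    }

    // Write data to path, bypassing the OS buffer cache.
    static void writeDirect(Path path, byte[] data) throws IOException {
        Path abs = path.toAbsolutePath();
        int blockSize = Math.toIntExact(
            Files.getFileStore(abs.getParent()).getBlockSize());
        int padded = Math.toIntExact(roundUp(data.length, blockSize));

        ByteBuffer buf = alignedBuffer(padded, blockSize);
        buf.put(data);
        buf.clear(); // position 0, limit = padded; the tail is zero padding

        try (FileChannel fc = FileChannel.open(abs,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                ExtendedOpenOption.DIRECT)) {
            while (buf.hasRemaining()) {
                fc.write(buf);
            }
            // Drop the padding so the file has its real length.
            fc.truncate(data.length);
        }
    }
}
```

The padding and the final truncate are there because direct IO transfers
have to be whole blocks, from a block-aligned buffer, at a block-aligned
file position.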

At Amazon, we are using segment-based replication every 60 seconds to copy
newly indexed segments out to all searchers, so we never have nodes doing
both indexing and searching, it's either or ... but, copying out max sized
newly merged segments to the searchers is causing some thrashing so we are
exploring using direct IO for those writes, and then separately warming the
new segments after the copy.
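
One simple way to do that warming, as a sketch only (WarmSketch and warm are
hypothetical names, and this is not necessarily how we do it), is to read the
copied file once through the normal buffered path so the kernel pulls its
pages into the buffer cache:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, for illustration only.
final class WarmSketch {

    // Read the file front to back with ordinary (non-direct) IO and
    // return the number of bytes read.
    static long warm(Path file) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(1 << 20); // 1 MiB chunks, arbitrary
        long total = 0;
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            int n;
            while ((n = fc.read(buf)) != -1) {
                total += n;
                buf.clear(); // contents are discarded; only the read matters
            }
        }
        return total;
    }
}
```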

Mike McCandless

http://blog.mikemccandless.com

On Tue, Sep 17, 2019 at 1:16 PM Uwe Schindler <u...@thetaphi.de> wrote:

> We discussed this already at Berlin Buzzwords (Mike and Michael). Yes, it's
> possible and may work for merges where block IO is possible. But most of us
> said: it's fine to not use the IO cache for merging, but it won't make pages
> hot. Merges are then invisible to the OS, so you have to warm merged segments
> if you write directly. If you read directly on merging, you won't pollute the
> cache with one-time reads, but it also won't use the cache if already cached.
> We had better make a proposal for f/madvise. The JDK people are open
> for that, and I am a JDK committer now, so I can make a prototype.
>
> Uwe
>
> On September 17, 2019 4:48:26 PM UTC, Dawid Weiss
> <dawid.we...@gmail.com> wrote:
>>
>> Isn't that restricted to aligned block-only access though? I can
>> imagine this would complicate the implementation if somebody wanted to
>> use it directly.
>>
>> Dawid
>>
>> On Tue, Sep 17, 2019 at 5:37 PM Michael McCandless
>> <luc...@mikemccandless.com> wrote:
>>
>>>
>>>  Whoa!  That would be awesome -- no more JNI to use Direct I/O?
>>>  Looks like you use it like this:
>>>
>>>  FileChannel fc = FileChannel.open(p, StandardOpenOption.WRITE,
>>>                                    ExtendedOpenOption.DIRECT);
>>>
>>>  But it looks like you need to enable the jdk.unsupported module, added
>>>  with http://openjdk.java.net/jeps/260
>>>
>>>  Mike McCandless
>>>
>>>  http://blog.mikemccandless.com
>>>
>>>
>>>  On Mon, Sep 16, 2019 at 11:55 AM Michael Sokolov <msoko...@gmail.com> wrote:
>>>
>>>>
>>>>  https://bugs.openjdk.java.net/browse/JDK-8189192 makes it appear that
>>>>  Direct I/O is (or may be?) available now in JDKs since JDK 10. Should
>>>>  we try using that API in NativeUnixDirectory in order to avoid JNI
>>>>  calls?
>>
>>
> --
> Uwe Schindler
> Achterdiek 19, 28357 Bremen
> https://www.thetaphi.de
>
