Re: Direct I/O

Michael McCandless Wed, 18 Sep 2019 08:55:13 -0700

Ahh yes, sorry, you are right, Dawid and Uwe.

Mike McCandless


http://blog.mikemccandless.com


On Wed, Sep 18, 2019 at 10:11 AM Uwe Schindler <u...@thetaphi.de> wrote:

> Direct IO does in fact have the problem Dawid raised; it was just named
> imprecisely: block alignment is required on disk, not in memory. In short:
> you can't read or write a single byte at an arbitrary position in the file;
> you need a buffering layer in between that takes care of alignment.
> NativeUnixDirectory does this.
>
> Uwe
>
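
To make the "buffering layer" concrete, here is a minimal sketch of the idea. It is not the NativeUnixDirectory code; the class name AlignedBlockWriter and its methods are made up for illustration. It accepts writes of any size but only hands whole blocks to the underlying channel, and zero-pads the final partial block on close.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

/** Hypothetical helper: accepts writes of any size, emits only whole blocks. */
final class AlignedBlockWriter implements AutoCloseable {
    private final WritableByteChannel out;
    private final ByteBuffer block;      // holds at most one block of pending bytes
    private long logicalLength = 0;      // number of bytes the caller actually wrote

    AlignedBlockWriter(WritableByteChannel out, int blockSize) {
        this.out = out;
        this.block = ByteBuffer.allocate(blockSize);
    }

    void write(byte[] src, int off, int len) throws IOException {
        logicalLength += len;
        while (len > 0) {
            int n = Math.min(len, block.remaining());
            block.put(src, off, n);
            off += n;
            len -= n;
            if (!block.hasRemaining()) {
                flushBlock();            // the buffer holds exactly one full block
            }
        }
    }

    long logicalLength() {
        return logicalLength;
    }

    private void flushBlock() throws IOException {
        block.flip();
        while (block.hasRemaining()) {
            out.write(block);
        }
        block.clear();
    }

    @Override
    public void close() throws IOException {
        if (block.position() > 0) {
            // Zero-pad the tail so the last write is still a whole block.
            while (block.hasRemaining()) {
                block.put((byte) 0);
            }
            flushBlock();
        }
        out.close();
    }
}
```

With a real direct-IO channel the buffer would also have to be a direct, block-aligned one, and the padding would have to be dealt with afterwards (for example by recording logicalLength() and trimming the file through a normal channel). Both are left out of the sketch.
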
> On September 18, 2019 1:54:35 PM UTC, Dawid Weiss <
> dawid.we...@gmail.com> wrote:
>>
>> Thanks for the explanation, Mike!
>>
>> D.
>>
>> On Wed, Sep 18, 2019 at 3:21 PM Michael McCandless
>> <luc...@mikemccandless.com> wrote:
>>
>>>
>>>  Dawid, it's confusing: direct IO is different from a direct ByteBuffer!
>>>
>>>  Direct IO means you bypass all kernel "smarts", so the Linux buffer cache 
>>> is not used, no IO scheduling, no write cache that the pdflush daemon must 
>>> periodically move to disk, etc.  This is normally a bad idea, and better to 
>>> use fadvise/madvise to give kernel hints about what you are doing, and use 
>>> the buffer cache for what it's good at.  Linus hates that direct IO is even 
>>> an option for us ...
>>>
>>>  Back when I wrote NativeUnixDirectory, the idea was to prevent ongoing 
>>> merges from so heavily impacting ongoing searches, when you are doing 
>>> indexing and searching on one node.  We open the newly merged segments 
>>> files using direct IO, and do our own buffering, and then all writes go 
>>> straight to disk instead of using up precious hot pages that are in use for 
>>> searching.  I think I ran some simple performance tests back then but I 
>>> don't remember the results ... more testing is needed to see if it really 
>>> helps.
>>>
>>>  At Amazon, we are using segment based replication every 60 seconds to copy 
>>> newly indexed segments out to all searchers, so we never have nodes doing 
>>> both indexing and searching, it's either/or ... but, copying out max sized 
>>> newly merged segments to the searchers is causing some thrashing so we are 
>>> exploring using direct IO for those writes, and then separately warming the 
>>> new segments after the copy.
>>>
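
The thread does not say how the warming is done. One naive way, shown here only as an assumption, is to read the copied file once through a normal (non-direct) channel so that its pages land in the buffer cache. SegmentWarmer and warm are hypothetical names.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: touches every byte of a file via buffered (non-direct) reads. */
final class SegmentWarmer {
    /** Reads the whole file sequentially, discards the data, and returns the byte count. */
    static long warm(Path file) throws IOException {
        ByteBuffer scratch = ByteBuffer.allocate(1 << 16);
        long total = 0;
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            int n;
            while ((n = fc.read(scratch)) != -1) {
                total += n;
                scratch.clear();   // contents are irrelevant, only the read matters
            }
        }
        return total;
    }
}
```

Whether a plain sequential read is a good enough warm-up for real search traffic is a separate question that this sketch does not answer.
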
>>>  Mike McCandless
>>>
>>>  http://blog.mikemccandless.com
>>>
>>>
>>>  On Tue, Sep 17, 2019 at 1:16 PM Uwe Schindler <u...@thetaphi.de> wrote:
>>>
>>>>
>>>>  We already discussed this at Berlin Buzzwords (Mike and Michael). Yes, it's 
>>>> possible and may work for merges, where block IO is possible. But most of 
>>>> us said: it's fine not to use the IO cache for merging, but then merging 
>>>> won't make pages hot. Merges are invisible to the OS, so you have to warm 
>>>> merged segments if you write directly. If you read directly while merging, 
>>>> you won't pollute the cache with one-time reads, but you also won't benefit 
>>>> from the cache if the data is already cached.
>>>>  It would be better to make a proposal for f/madvise. The JDK people are 
>>>> open to that, and I am a JDK committer now, so I can make a prototype.
>>>>
>>>>  Uwe
>>>>
>>>>  On September 17, 2019 4:48:26 PM UTC, Dawid Weiss 
>>>> <dawid.we...@gmail.com> wrote:
>>>>
>>>>>
>>>>>  Isn't that restricted to aligned block-only access though? I can
>>>>>  imagine this would complicate the implementation if somebody wanted to
>>>>>  use it directly.
>>>>>
>>>>>  Dawid
>>>>>
>>>>>  On Tue, Sep 17, 2019 at 5:37 PM Michael McCandless
>>>>>  <luc...@mikemccandless.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>   Whoa!  That would be awesome -- no more JNI to use Direct I/O?
>>>>>>   Looks like you use it like this:
>>>>>>
>>>>>>   FileChannel fc = FileChannel.open(p, StandardOpenOption.WRITE,
>>>>>>                                     ExtendedOpenOption.DIRECT);
>>>>>>
>>>>>>   But it looks like you need to enable the jdk.unsupported module, added 
>>>>>> with http://openjdk.java.net/jeps/260
>>>>>>
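
A slightly fuller sketch of the same call, for JDK 10 or later. It assumes the JDK's documented requirements for ExtendedOpenOption.DIRECT as I understand them: file position, transfer size and buffer address all need to be multiples of the block size reported by FileStore.getBlockSize(). DirectWriteSketch, alignedBuffer and writePadded are made-up names, and writePadded will fail on filesystems that do not support direct IO.

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class DirectWriteSketch {
    /**
     * Returns a direct buffer with exactly {@code size} bytes remaining whose
     * start address is a multiple of {@code blockSize} (must be a power of 2).
     */
    static ByteBuffer alignedBuffer(int size, int blockSize) {
        if (size % blockSize != 0) {
            throw new IllegalArgumentException("size must be a multiple of blockSize");
        }
        // Over-allocate, then let the JDK pick the aligned sub-range.
        ByteBuffer raw = ByteBuffer.allocateDirect(size + blockSize - 1);
        ByteBuffer aligned = raw.alignedSlice(blockSize);
        aligned.limit(size);
        return aligned;
    }

    /** Writes data at position 0, zero-padded up to a whole number of blocks. */
    static int writePadded(Path p, byte[] data) throws IOException {
        Path dir = p.toAbsolutePath().getParent();
        // getBlockSize() may be unsupported on some file stores.
        int blockSize = Math.toIntExact(Files.getFileStore(dir).getBlockSize());
        int padded = ((data.length + blockSize - 1) / blockSize) * blockSize;
        ByteBuffer buf = alignedBuffer(padded, blockSize);
        buf.put(data);
        buf.rewind();   // limit stays at `padded`; the tail is already zero-filled
        try (FileChannel fc = FileChannel.open(p,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                ExtendedOpenOption.DIRECT)) {
            return fc.write(buf);
        }
    }
}
```

Code on the classpath should see com.sun.nio.file without extra flags; a modular application would need "requires jdk.unsupported" in its module descriptor.
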
>>>>>>   Mike McCandless
>>>>>>
>>>>>>   http://blog.mikemccandless.com
>>>>>>
>>>>>>
>>>>>>   On Mon, Sep 16, 2019 at 11:55 AM Michael Sokolov <msoko...@gmail.com> 
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>   https://bugs.openjdk.java.net/browse/JDK-8189192 makes it appear that
>>>>>>>   Direct I/O is (or may be?) available now in JDKs since JDK 10. Should
>>>>>>>   we try using that API in NativeUnixDirectory in order to avoid JNI
>>>>>>>   calls?
>>>>>>> ------------------------------
>>>>>>>   To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>>>>>   For additional commands, e-mail: dev-h...@lucene.apache.org
>>>>>>>
>>>>>
>>>>>
>>>>  --
>>>>  Uwe Schindler
>>>>  Achterdiek 19, 28357 Bremen
>>>>  https://www.thetaphi.de
>>>>
>>
>>
> --
> Uwe Schindler
> Achterdiek 19, 28357 Bremen
> https://www.thetaphi.de
>
