[jira] [Commented] (LUCENE-8438) RAMDirectory speed improvements and cleanup

Dawid Weiss (JIRA) Mon, 06 Aug 2018 05:42:17 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-8438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570159#comment-16570159
 ]


Dawid Weiss commented on LUCENE-8438:
-------------------------------------

I've done some work on this and have some thoughts to share, so looking for 
feedback. The branch and PR diff is here, it's not merge'able as-is, but it 
provides grounds for discussion.

[https://github.com/apache/lucene-solr/pull/432]

Here are some notes introducing you to this patch and things and decisions I 
consider somewhat important.

*RAMDirectory and ByteBuffersDirectory (and associated classes).*

RAMDirectory and its bits and pieces (RAMFile, RAMOutputStream) are used in 
many, many places in the code. In many of those places the actual use case is a 
writeable (and reusable) abstraction of a buffer where vints and other data 
structures can be accumulated, then flushed to another buffer (or sink). The 
fact that RAMOutputStream implements full IndexOutput and not just DataOutput 
is contradicting those use cases (a flag on optional CRC computation, reset 
method).

While it is possible to change those classes in-place, I initially opted for 
writing a "second set" of ram-based classes. This allows more progressive 
transition from RAM* classes and, perhaps more importantly, provides some room 
for a different structuring of relationships among those new classes (not to 
mention it makes baby steps of re-running full test suite easier).

I opted to use ByteBuffers (heap-based) as the underlying storage in memory, so 
all the classes are named ByteBuffers*. And so we have:
 * ByteBuffersDirectory - a replacement/equivalent of RAMDirectory, 
essentially. 
 * ByteBuffersDataOutput - a growing, paged buffer that implements DataOutput 
and adds additional utility methods like current size, returning the content as 
a contiguous array or a set of underlying ByteBuffers. The implementation is 
paged, but does not use a fixed page size. Once the content grows past 100 
"blocks" the blocks are doubled and existing content is copied to those new 
blocks. This solution is nicer for heap memory than the fixed 1024-byte blocks 
in RAMOutputStream. Sure, some memory copying overhead does occur, but I 
haven't noticed any practical impact from this. 
 * ByteBuffersDataInput - a DataInput implementation on top of a collection of 
ByteBuffers. There requirements for the input byte buffers are that for N 
buffers, the capacity of buffers 1..N-1 is constant and power-of-two (which is 
what ByteBuffersDataOutput emits) to allow efficient page/buffer seeks. The 
initial buffer can also have the initial read position shifted from 0 (for easy 
slicing). 
 * ByteBuffersIndexInput - An IndexInput implementation that reads from the 
delegate ByteBuffersDataInput
 * ByteBuffersIndexOutput - An IndexOutput implementation that writes to a 
delegate ByteBuffersDataOutput and computes the CRC of the written data. This 
is done lazily (so no special switches are required for temporary index outputs 
where previously we explicitly passed crc=false).

*Why not byte[] instead of ByteBuffers?*

Mostly in hope that byte buffers provide JVM intrinsics that would be faster 
than plain Java code (range checking, int or long access over the underlying 
byte arrays). They also make certain things more problematic or complex 
(especially to those not familiar with Buffers), but overall I think it came 
out quite nicely and integrates well with filesystem buffers we use anyway.

*Why not reuse exiting ByteBufferIndexInput abstractions?*

My design decision was to avoid the exception-handler logic control currently 
used in Lucene's memory mapped buffer IndexInput implementation. This is 
subjective taste, really, but also driven by the fact that byte buffers here 
can and will be much smaller than mapped buffer slices (so in-between buffer 
region reads will happen more frequently). So I left the ByteBuffersDataInput 
class in the code, although ByteBuffersDirectory provides a mapper that maps 
each written file's ByteBuffer[] into an IndexInput and one of the provided 
defaults can use Lucene's "native" byte buffer wrappers. Finally redundancies 
can be stripped from the patch entirely (or left in, if we decide they're 
useful). See notes about performance below too.

*What to do with RAMDirectory and other Ram*Stream classes?*

There is an obvious question what to do with RAMDirectory classes now. Should 
this patch substitute the internals (a rename of ByteBuffers*, effectively), 
should we deprecate RAMDirectory and then drop it entirely? My gut feeling is 
that many people use RAMDirectory, but I'd rather deprecate it and remove it 
eventually to give a clear signal that (a) it's not efficient, (b) it's going 
away. I am really open to discussion here though.

*What are the performance implications?*

I ran a few quick experiments creating and reading from indexes and 
ByteBuffersDirectory performs quite nicely. I'll attach a chart from AWS 32-cpu 
instance for a taste. In short, the performance is comparable with FSDirectory, 
especially when the output buffer is consolidated.

This said, I also replaced a number of RAMInput/OutputStream used as buffers in 
internal codec classes, etc., and the implications there may have escaped my 
attention. Please report if you see anything odd.

*What's next?*

This depends mostly on what we decide to do with RAMDirectory. It's still used 
heavily in tests (and mocks), so there's still some work to be done there, but 
I wanted to circulate this for feedback so that I don't spend my time on 
something that won't be accepted.

> RAMDirectory speed improvements and cleanup
> -------------------------------------------
>
>                 Key: LUCENE-8438
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8438
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>         Attachments: capture-4.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> RAMDirectory screams for a cleanup. It is used and abused in many places and 
> even if we discourage its use in favor of native (mmapped) buffers, there 
> seem to be benefits of keeping RAMDirectory available (quick throw-away 
> indexes without the need to setup external tmpfs, for example).
> Currently RAMDirectory performs very poorly under concurrent loads. The 
> implementation is also open for all sorts of abuses – the streams can be 
> reset and are used all around the place as temporary buffers, even without 
> the presence of RAMDirectory itself. This complicates the implementation and 
> is pretty confusing.
> An example of how dramatically slow RAMDirectory is under concurrent load, 
> consider this PoC pseudo-benchmark. It creates a single monolithic segment 
> with 500K very short documents (single field, with norms). The index is ~60MB 
> once created. We then run semi-complex Boolean queries on top of that index 
> from N concurrent threads. The attached capture-4 shows the result (queries 
> per second over 5-second spans) for a varying number of concurrent threads on 
> an AWS machine with 32 CPUs available (of which it seems 16 seem to be real, 
> 16 hyper-threaded). That red line at the bottom (which drops compared to a 
> single-threaded performance) is the current RAMDirectory. RAMDirectory2 is an 
> alternative implementation I wrote that uses ByteBuffers. Yes, it's slower 
> than the native mmapped implementation, but a *lot* faster then the current 
> RAMDirectory (and more GC-friendly because it uses dynamic progressive block 
> scaling internally).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (LUCENE-8438) RAMDirectory speed improvements and cleanup

Reply via email to