On Wed, Apr 15, 2015 at 11:06 AM, Adam Fuchs <afu...@apache.org> wrote:
> On Wed, Apr 15, 2015 at 10:20 AM, Keith Turner <ke...@deenlo.com> wrote:
>
>> Random thought on revamp. Immutable key values with enough primitives to
>> make most operations efficient (avoid constant alloc/copy) might be
>> something to consider for the iterator API
>>
>
> So, is this a tradeoff in the performance vs. inter-iterator isolation
> space? From a performance perspective we would do best if we just passed
> around pointers to an underlying byte array (e.g. ByteBuffer-style), but
> maximum
>

There are performance implications to consider when key/vals are not
immutable. Currently, if any iterator wants to keep a key/val to compare it
with later key/vals, then it has to copy it. I think some iterators do this
frequently. I am not asserting that immutable would perform better; I don't
know.

> isolation would require never reusing anything returned from an iterator's
> getTopX methods. From a security perspective we need to be careful with how
> we reuse data objects (hence the need for the SynchronizedIterator at the
> top of the "system" iterators), but I would say we can probably relax other
> isolation concerns in the iterators in favor of performance.
>
> I think there's probably a bigger project here around minimizing the
> object creation, data copying, serialization, and deserialization of keys.
> We did some work that Chris McCubbin will be presenting at the upcoming
> accumulo summit around pushing key comparisons down to a serialized form of
> the key, and that made a huge impact on load performance. I think we could
> probably achieve an order of magnitude more throughput in the iterator tree
> with a major refactoring. Any thoughts on when we might have the appetite
> for such a change? If we're thinking about making key/values immutable then
> we might piggyback a bigger redesign on that already breaking change.
>

If we were to introduce an improved iterator API, I would hope we could
deprecate and still support the old API.

>
> Adam
>
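
To make the copy-versus-immutable point concrete, here is a minimal sketch in Java. It is not the Accumulo API: `ImmutableKey`, `copyOf`, `byteAt`, and `countDistinct` are hypothetical names invented for illustration, and the key is modeled as a single serialized byte array. The idea is that an immutable key is copied once when it is created, after which any iterator can hold a reference to it and compare it with later keys without another copy. The comparison runs directly on the serialized bytes. Whether this would actually perform better is exactly the open question in this thread.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical immutable key (illustration only, not an Accumulo class). */
final class ImmutableKey implements Comparable<ImmutableKey> {
    private final byte[] bytes; // never exposed, never modified after construction

    private ImmutableKey(byte[] bytes) {
        this.bytes = bytes;
    }

    /** Copies once at construction so later writes to the caller's array cannot leak in. */
    static ImmutableKey copyOf(byte[] src) {
        return new ImmutableKey(Arrays.copyOf(src, src.length));
    }

    int length() {
        return bytes.length;
    }

    byte byteAt(int i) {
        return bytes[i];
    }

    /** Unsigned lexicographic comparison directly on the serialized bytes. */
    @Override
    public int compareTo(ImmutableKey other) {
        int n = Math.min(bytes.length, other.bytes.length);
        for (int i = 0; i < n; i++) {
            int a = bytes[i] & 0xff;
            int b = other.bytes[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return bytes.length - other.bytes.length;
    }
}

final class ImmutableKeyDemo {
    /**
     * Counts distinct keys in a sorted stream. The "previous key" is held by
     * reference; no defensive copy is needed because the key cannot change.
     */
    static int countDistinct(List<ImmutableKey> sortedKeys) {
        int distinct = 0;
        ImmutableKey previous = null;
        for (ImmutableKey current : sortedKeys) {
            if (previous == null || previous.compareTo(current) != 0) {
                distinct++;
            }
            previous = current; // safe to retain as-is
        }
        return distinct;
    }

    public static void main(String[] args) {
        List<ImmutableKey> keys = Arrays.asList(
            ImmutableKey.copyOf(new byte[] {1, 2}),
            ImmutableKey.copyOf(new byte[] {1, 2}),
            ImmutableKey.copyOf(new byte[] {1, 3}));
        System.out.println(countDistinct(keys));
    }
}
```

With mutable keys that a source iterator may reuse, the `previous = current` line would instead need a copy on every step, which is the cost described above.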