[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659927#comment-14659927 ] Branimir Lambov commented on CASSANDRA-9471: +1, code looks good. > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651724#comment-14651724 ] Benedict commented on CASSANDRA-9471: - I've rebased this and introduced unit tests to cover the majority of functionality. All of which was already present but not covered. > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648337#comment-14648337 ] Benedict commented on CASSANDRA-9471: - [~aweisberg] said: {quote} It would be nice to see a microbenchmark and benchmark demonstrating that the use of BTree is a win. I am wondering how many additional instructions and branches are hidden inside BTree that you pay for in the common case of small Columns set. A BTree is more instructions for more cache locality, but in the common case will there be enough columns to benefit? {quote} If we want to go down that avenue, we should IMO simply look into specialising the btree iterator for the situation where we have just one leaf, and this is something I've considered. That's pretty much the only likely win. We don't really have time to do the kind of microbenchmarking you're talking about though, and it is of limited use anyway, since one of the main benefits _performancewise_ (in the common case) is improved icache\* occupancy, which cannot be accounted for in a microbenchmark. In the uncommon case this also gives us better lower bounds on behaviour, and access to a more powerfully expressive toolkit. For instance, it permits us to trivially deliver a more efficient multi-way merge (which we do very often), and that can be improved further still with less trivial enhancements. Mostly, though, this patch simply guarantees us a good lower bound on performance, with improved code clarity. Right now I would very much expect performance in the common case to be a wash, especially on any of our existing workloads. > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646593#comment-14646593 ] Benedict commented on CASSANDRA-9471: - Patch available [here|https://github.com/belliottsmith/cassandra/tree/9471] > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613137#comment-14613137 ] Sylvain Lebresne commented on CASSANDRA-9471: - I'm sorry for the back and forth, but if you haven't started working on that rebase, actually disregard my previous comment (might be safer to leave Columns alone until CASSANDRA-9705). > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613122#comment-14613122 ] Sylvain Lebresne commented on CASSANDRA-9471: - Side note: if the changes to {{Columns}} are not hard to rebase, I'd personally be fine with just rebasing that ticket as is (without bothering splitting it in 2 tickets) for the sake of saving you some time. At least for CASSANDRA-9705, I don't plan on having much of {{Columns}} going obsolete (the indexability will be most likely much less used but will still be handy, and we'll still rely heavily-ish on {{contains}} which is currently not terribly efficient). And of course, that still doesn't precludes from consider other implementation of {{Columns}} later. Anyway, fine with whatever way you prefer, but just to say that if splitting into 2 tickets takes you the same time than just rebasing the whole patch, I'd personally just go with the second option. > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613103#comment-14613103 ] Benedict commented on CASSANDRA-9471: - bq. but ending up doing something less efficient just because it's not there You're right, this can happen frustratingly often. OK. I'm convinced :) I'll split out the btree-only stuff into a separate ticket. > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613093#comment-14613093 ] Sylvain Lebresne commented on CASSANDRA-9471: - bq. for the normal worries of code atrophy Yes, that is something to take into account. However, it's also a utility class, one that is meant to be used a lot in the codebase. And the indexability code both already written. So if it "doesn't introduce significant complexity", it does feels like a relatively good deal. Basically, I would hate to spend more time pulling the already written functionality out to maybe end up someday having a good use of this, but ending up doing something less efficient just because it's not there. Besides, it's totally possible it will be used by {{Columns}} in the end :) Anyway, I don't want to sound insistent, it's not that I absolutely want it. Just offering that maybe simply rebasing that ticket now would avoid pushing that to when we might be even shorter on resources than we are, doesn't precludes considering better alternative for {{Columns}} later, and won't waste all that much work if we do end up changing {{Columns}} but keep the indexability as "generally useful". > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613081#comment-14613081 ] Benedict commented on CASSANDRA-9471: - Well, performance-wise the difference is negligible. There is an extra lg(N/32) cost for the current implementation, which amortizes to imperceptible (and literally zero for small sets). The fact we don't use higher/lower/ceil/floor very commonly means I'm confident this extra cost is better to incur for the simplicity of implementation. The reason I say "better" is exclusively because there is a more direct implementation for the inequality lookups. If we don't have _another_ reason for indexing it seems better practice to implement that directly, and leave out the indexing feature. The indexability is actually surprisingly simple, and doesn't introduce significant complexity IMO. I'm just a little wary of introducing features we don't use _directly_ (even if I have an attachment to it), for the normal worries of code atrophy. I certainly won't argue against its inclusion, though, as I agree it seems like it _should_ be more generally useful. I'm just not yet aware of another place for it. > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613065#comment-14613065 ] Sylvain Lebresne commented on CASSANDRA-9471: - bq. If we choose not to include this feature, it would be better to implement these directly How much better? Thinking out loud here, but we're a database, we're dealing with sorted stuff all the time. So even outside of its use (or not) by {{Columns}}, having a more capable {{BtreeSet}} implementation, one that can act more like an efficient sorted list, feels to me like something that would be useful to have in our tool belt. Meaning by that it sounds from you comments that the indexability does add much complexity to the implementation(disclaimer: I haven't looked at the patch) , so if its cost is really small, maybe it's worth getting the flexibility? > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613058#comment-14613058 ] Benedict commented on CASSANDRA-9471: - Well, the decision does ultimately affect how certain features within the btree are implemented - or at least the cost/benefit analysis (for the reviewer as much as myself). Right now I've used the indexability feature to make a trivial implementation of lower/higher/floor/ceil, because it permits you to treat the whole btree as though it were an array for indexing, using binarySearch semantics and positional access. If we choose not to include this feature, it would be better to implement these directly - not onerous, of course, but I want to avoid burdening branimir with unnecessary review. There's also some intertwining on testing (using higher features to help test lower ones). However you make a good point, and I will see what minimal set of changes I can extract to get the ball rolling. It's probably still pretty significant and helpful. > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613046#comment-14613046 ] Sylvain Lebresne commented on CASSANDRA-9471: - bq. at the very least, the improved iterator, improved tests, and wider deployment of the btree are all worth incorporating. What about moving those changes to a separate ticket (i.e. one that is not concerned by {{Columns}})? It's useful to trunk anyway as you says, and the less stuff we delay, the better. Splitting the changes related to {{Columns}} from the other is also more incremental in a way :) > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612250#comment-14612250 ] Benedict commented on CASSANDRA-9471: - So, on further consideration, I think I will postpone this patch until the in-progress refactoring is done. The reason being that it may be there is a better, slightly wider-viewed, approach to improving this. If so, these improvements to the BTree won't go completely to waste: at the very least, the improved iterator, improved tests, and wider deployment of the btree are all worth incorporating. However the indexability - whilst very nifty, and relatively simple - is only really helpful while there is a distinction between Columns and RowDataBlock, which may not be necessary to maintain. I have a feeling we can both create a more compact and efficient Columns representation. I will consider it all a bit more once the lay of the land has settled down. > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612146#comment-14612146 ] Benedict commented on CASSANDRA-9471: - [~blambov]: just letting you know that I'm rebasing this, which may take a little while. The btree-only changes (i.e. the majority of the work) are still good for review though. > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array
[ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585005#comment-14585005 ] Benedict commented on CASSANDRA-9471: - I've pushed a branch [here|https://github.com/belliottsmith/cassandra/tree/9471] This patch does quite a few things, since I thought we had better get them out of the way sooner than later: # Introduces "indexing" to the BTree class, permitting lookup by positional index, and binarySearch semantics (so inequality lookups yield the position a binarySearch of the equivalent array would) #* The size modulo remainder of a Leaf/Branch are flipped, so that a Branch has a spare slot at the end for an int[] index: #** if missing, this occupies at most 1 bit per entry in the set (often 1/32), if present, it occupies at most 2.5 bits, and typically around 1 bit. Sets with <= 32 items incur no cost. #* This index just contains the cumulative size of the tree preceding each branch key, rooted at the node in question # Reintroduces BTreeSet, with extra methods for accessing by index\*, and simple implementations of missing NavigableMap methods using this new functionality # Introduces a simple BTreeSet.Builder, and replaces all suitable uses of TreeMap in the codebase with a BTreeSet.Builder -> BTreeSet # Rewrites btree.Cursor to make it far easier to understand, and to introduced IndexedSearchIterator, exposing the new indexing # Reintroduces LongBTreeTest, with cleaned up more exhaustive coverage of iterator functionality, and new NavigableMap methods It's more code than I expected, as I didn't plan on rewriting btree.Cursor, but it was frankly a bit of a mess. 8099 also extends it to cover descending ranges, but with no test coverage, and nor did I want to reason about (my own!) code when inverted in this way to corroborate its behaviour. The new implementation is far more declarative, and I hope should be easy for just about anybody to read and review, and crucially to extend/modify in future. The Cursor rewrite was also sparked by the introduction of IndexedSearchIterator, which was designed to drop into SimpleRowDataBlock.CellWriter. Unfortunately, the comments here _suggesting_ columns would be provided in-order was optimistic. It seems that the call-sites would need to at least specify if this is the case. This means that the Cursor rewrite is not helpful for this _yet_, but hopefully will be soon, and I think it was warranted by our dependence on new functionality and by its existing ugliness. It is possible we could split this ticket into (1, 5.1); (2, 5.2); 3; (4, 5.3), but this all seemed naturally grouped together, and given it's a follow on to 8099... I was hoping [~blambov] could review the changes to BTree, BTreeSet and Cursor in particular? If [~slebresne] or [~blerer] could review the more modest places it touches the 8099 code that would be great. \*A simple follow up to this ticket would be to make BTreeSet implement both {{Set}} _and_ {{List}} > Columns should be backed by a BTree, not an array > - > > Key: CASSANDRA-9471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9471 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0 beta 1 > > > Follow up to 8099. > We have pretty terrible lookup performance as the number of columns grows > (linear). In at least one location, this results in quadratic performance. > We don't however want this structure to be either any more expensive to > build, nor to store. Some small modifications to BTree will permit it to > serve here, by permitting efficient lookup by index, and calculation _of_ > index for a given key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)