[ https://issues.apache.org/jira/browse/CASSANDRA-9989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591004#comment-16591004 ]
Benedict commented on CASSANDRA-9989: ------------------------------------- {quote}Just to make the initial calculation easier {quote} I think it might be nicer in this case to make the {{TREE_SIZE}} logically more obvious, so that its accessors (which are rather more complicated) are simplified, rather than its calculation? I don't think this is very tricky anyway - just set {{TREE_SIZE[0] = FAN_FACTOR}}, and leave the loop as it is, I think? {quote}It's used to update indexOffsets[I]. {quote} I think this is also more easily done by an incrementing counter, rather than decrementing? {quote}I updated the code and added an unittest to make sure the BTree is exactly the same as before (with NodeBuilder()). {quote} I didn't mean to suggest that we exactly duplicate the old logic. In fact, arguably, the {{NodeBuilder}} code is not only very ugly, it is also "incorrect" - it leaves a space vacant at the end of a full {{Leaf}} node. I think this was a conscious tradeoff when implementing the {{indexOffsets}} feature, but still (note, please maintain this in your patch, as the old code is not able to handle a "full" Leaf). This is not something easily fixed, and I think it should be addressed by rewriting the {{NodeBuilder}} entirely. Obviously, not in this patch, but I would not want to bind all future implementations by this older not-so-great one. I was musing that conceptually, splitting the final two nodes (if not identically) is consistent, so easily justified. I just wonder if a more even splitting of the contents might be more useful - for instance, it might provide more opportunities for cheap inserts into the tree. I don't think this is terribly important, however. {quote}Fixed. {quote} I may have missed it, but didn't see a response to point (1), around the {{childrenNum}} calculation? For instance, my algebra came up with this calculation: {code:java} /** * (size - (childNum - 1)) / childSize <= childNum * ==> size / childSize <= childNum + ((childNum - 1) / childSize) * ==> size <= childSize * childNum + childNum - 1 * ==> size + 1 <= (childSize + 1) * childNum * ==> ceil((size + 1) / (childSize + 1)) == childNum * ==> (size + childSize + 1) / (childSize + 1) == childNum */ int childrenNum = (size + childSize + 1) / (childSize + 1); {code} Of course, this is very close to {{size / (childSize + 1) + 1}}, so perhaps it is easy to demonstrate that it is always safe to increment by one. I haven't attempted, but it would be nice to do so, to be sure this value is always a safe {{childrenNum}} calculation > Optimise BTree.Buider > --------------------- > > Key: CASSANDRA-9989 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9989 > Project: Cassandra > Issue Type: Sub-task > Reporter: Benedict > Assignee: Jay Zhuang > Priority: Minor > Fix For: 4.x > > Attachments: 9989-trunk.txt > > > BTree.Builder could reduce its copying, and exploit toArray more efficiently, > with some work. It's not very important right now because we don't make as > much use of its bulk-add methods as we otherwise might, however over time > this work will become more useful. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org