[ https://issues.apache.org/jira/browse/CASSANDRA-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898182#comment-16898182 ]
Benedict commented on CASSANDRA-15202:
--------------------------------------

One micro-nit: It would be nice to statically import the new {{Difference}} enum values, so they can be used without the qualifying {{Difference.}} prefix.

> Deserialize merkle trees off-heap
> ---------------------------------
>
> Key: CASSANDRA-15202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15202
> Project: Cassandra
> Issue Type: Improvement
> Components: Consistency/Repair
> Reporter: Jeff Jirsa
> Assignee: Aleksey Yeschenko
> Priority: Normal
> Fix For: 4.0
>
> Attachments: offheap-mts-gc.png
>
>
> CASSANDRA-14096 made the first step to address the heavy on-heap footprint of
> merkle trees on repair coordinators - by reducing the time frame over which
> they are referenced, and by more intelligently limiting depth of the trees
> based on available heap size.
> That alone improves GC profile and prevents OOMs, but doesn’t address the
> issue entirely. The coordinator still must hold all the trees on heap at once
> until it’s done diffing them with each other, which has a negative effect,
> and, by reducing depth, we lose precision and thus cause more overstreaming
> than before.
> One way to improve the situation further is to build on CASSANDRA-14096 and
> move the trees entirely off-heap. This is a trivial endeavor, given that we
> are dealing with what should be full binary trees (though in practice aren’t
> quite, yet). This JIRA takes the first step in that direction - by moving just
> deserialisation off-heap, leaving construction on the replicas on-heap still.
> Additionally, the proposed patch fixes the issue of replica coordinators
> sending merkle trees to themselves over loopback, costing us a ser/deser loop
> per tree.
> Please note that there is more room for improvement here, and depending on
> the 4.0 timeline those improvements may or may not land in time.
> To name a few:
> - with some minor modifications to init(), we can make sure that no matter
> the range, the tree is *always* perfectly full; this would allow us to get
> rid of child pointers in inner nodes, as child node addresses will be
> trivially calculable given the fixed size of nodes
> - the trees can be easily constructed off-heap so long as you run init() to
> pre-size the tree to find out how large a buffer you need
> - on-wire format doesn’t need to stream inner nodes, only leaves, and,
> really, only the hashes of the leaves
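
For illustration only, here is a minimal sketch of the idea in the first point of the quoted list: if the tree is perfectly full and every node has the same size, child addresses can be computed from the node index, so no child pointers need to be stored. This is not the code from the patch or Cassandra's MerkleTree. The class name ImplicitTreeSketch, the 32-byte node size, the breadth-first layout, and the XOR combination of child hashes are all assumptions made for the example.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: a perfectly full binary tree in one off-heap buffer. */
final class ImplicitTreeSketch {
    // Assumed fixed node size: each node stores only a 32-byte hash.
    static final int HASH_SIZE = 32;

    private final int depth;          // root is at depth 0, leaves at 'depth'
    private final ByteBuffer buffer;  // direct (off-heap) buffer, breadth-first layout

    ImplicitTreeSketch(int depth) {
        if (depth < 0 || depth > 20) // keep the sketch well within int range
            throw new IllegalArgumentException("depth out of range: " + depth);
        this.depth = depth;
        // Pre-size the buffer: the node count is known up front for a full tree.
        this.buffer = ByteBuffer.allocateDirect(nodeCount(depth) * HASH_SIZE);
    }

    static int nodeCount(int depth)  { return (1 << (depth + 1)) - 1; }
    static int firstLeaf(int depth)  { return (1 << depth) - 1; }
    static int leftChild(int node)   { return 2 * node + 1; }
    static int rightChild(int node)  { return 2 * node + 2; }
    static int offsetOf(int node)    { return node * HASH_SIZE; }

    /** Writes the hash of the leaf with the given 0-based position. */
    void setLeaf(int leaf, byte[] hash) {
        if (hash.length != HASH_SIZE)
            throw new IllegalArgumentException("hash must be " + HASH_SIZE + " bytes");
        int base = offsetOf(firstLeaf(depth) + leaf);
        for (int b = 0; b < HASH_SIZE; b++)
            buffer.put(base + b, hash[b]);
    }

    /** Fills inner nodes bottom-up; XOR of the children is an assumed combiner. */
    void computeInnerHashes() {
        for (int node = firstLeaf(depth) - 1; node >= 0; node--) {
            int left = offsetOf(leftChild(node));
            int right = offsetOf(rightChild(node));
            int parent = offsetOf(node);
            for (int b = 0; b < HASH_SIZE; b++)
                buffer.put(parent + b, (byte) (buffer.get(left + b) ^ buffer.get(right + b)));
        }
    }

    /** Copies the hash of any node (inner or leaf) back on-heap. */
    byte[] hashAt(int node) {
        byte[] out = new byte[HASH_SIZE];
        int base = offsetOf(node);
        for (int b = 0; b < HASH_SIZE; b++)
            out[b] = buffer.get(base + b);
        return out;
    }

    public static void main(String[] args) {
        ImplicitTreeSketch tree = new ImplicitTreeSketch(2); // 4 leaves, 7 nodes
        for (int leaf = 0; leaf < 4; leaf++) {
            byte[] hash = new byte[HASH_SIZE];
            java.util.Arrays.fill(hash, (byte) (leaf + 1));
            tree.setLeaf(leaf, hash);
        }
        tree.computeInnerHashes();
        System.out.println("root[0] = " + tree.hashAt(0)[0]);
    }
}
```

Because only the leaf hashes are written from outside and the inner nodes are derived from them, the same layout also illustrates the third point: a receiver of this shape could rebuild the inner nodes locally from the leaf hashes alone.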