[ https://issues.apache.org/jira/browse/CASSANDRA-16274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17231378#comment-17231378 ]
Marcus Eriksson edited comment on CASSANDRA-16274 at 11/13/20, 11:02 AM:
-------------------------------------------------------------------------

patch: https://github.com/krummas/cassandra/commits/marcuse/3200opt
cci: https://app.circleci.com/pipelines/github/krummas/cassandra?branch=marcuse%2F3200opt

This branch contains a few commits. The basic idea behind the optimisations is to iterate only over the ranges that can overlap, instead of over all differing ranges. When picking endpoints to stream from, we always pick the next node sorted by IP address. It does not matter which node we pick, as long as we always pick the same one.

This branch also contains a fix for CASSANDRA-15957.

was (Author: krummas):
patch: https://github.com/krummas/cassandra/commits/marcuse/3200opt
cci: https://app.circleci.com/pipelines/github/krummas/cassandra?branch=marcuse%2F3200opt

> Improve performance when calculating StreamTasks with optimised streaming
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16274
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16274
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Consistency/Repair
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>             Fix For: 4.0-beta4
>
>
> The way stream tasks are calculated currently is quite inefficient, improve
> that.
> Also, we currently try to distribute the streaming nodes evenly, this creates
> many more sstables than necessary - instead we should try to stream
> everything from a single peer, this should reduce the number of sstables
> created on the out-of-sync node.
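
The two ideas in the comment above (visit only the ranges that can overlap, and pick the streaming endpoint deterministically) can be illustrated with a small Python sketch. This is not the code in the patch, which is Java and lives in the linked branch. The names RangeIndex and pick_stream_source are hypothetical, ranges are simplified to non-wrapping half-open integer intervals [start, end) that do not overlap each other, and "next node sorted by IP address" is read here as "lowest address in numeric order", which is only one possible interpretation.

```python
from bisect import bisect_right
from ipaddress import ip_address


class RangeIndex:
    """Hypothetical helper: disjoint [start, end) ranges kept sorted by start.

    Assumption for this sketch: the ranges do not overlap each other and do
    not wrap around, so their end points are sorted as well.
    """

    def __init__(self, ranges):
        self.ranges = sorted(ranges)
        # Precompute the end points once so each query can binary-search them.
        self.ends = [end for _, end in self.ranges]

    def overlapping(self, query):
        """Return only the stored ranges that intersect query = (start, end)."""
        q_start, q_end = query
        # Skip every range that ends at or before the query start.
        i = bisect_right(self.ends, q_start)
        result = []
        # Stop at the first range that starts at or after the query end.
        while i < len(self.ranges) and self.ranges[i][0] < q_end:
            result.append(self.ranges[i])
            i += 1
        return result


def pick_stream_source(candidates):
    """Pick one endpoint deterministically: the numerically lowest address.

    Assumption: all candidates are in the same address family (all IPv4 or
    all IPv6). Which node wins is not important; what matters is that every
    caller given the same candidates gets the same answer.
    """
    return min(candidates, key=ip_address)
```

Comparing parsed addresses rather than strings matters for determinism across representations: as plain strings "10.0.0.10" sorts before "10.0.0.2", while numerically it sorts after it. Because the choice depends only on the candidate set, every differing range that shares the same candidates would be streamed from the same peer, which matches the stated goal of creating fewer sstables on the out-of-sync node.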