[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandrasekhar Thumuluru updated CASSANDRA-15397: Attachment: 90p_1million_sstables_with_1000_searches.png Mean_1million_sstables_with_1000_searches.png 95p_1million_sstables_with_1000_searches.png 99p_1million_sstables_with_1000_searches.png 90p_750k_sstables_with_1000_searches.png 95p_750k_sstables_with_1000_searches.png 99p_750k_sstables_with_1000_searches.png Mean_750k_sstables_with_1000_searches.png 90p_500k_sstables_with_1000_searches.png 95p_500k_sstables_with_1000_searches.png 99p_500k_sstables_with_1000_searches.png Mean_500k_sstables_with_1000_searches.png 90p_250k_sstables_with_1000_searches.png 95p_250k_sstables_with_1000_searches.png 99p_250k_sstables_with_1000_searches.png Mean_250k_sstables_with_1000_searches.png 90p_100k_sstables_with_1000_searches.png 95p_100k_sstables_with_1000_searches.png 99p_100k_sstables_with_1000_searches.png Mean_100k_sstables_with_1000_searches.png > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement > Components: Local/SSTable >Reporter: Chandrasekhar Thumuluru >Assignee: Chandrasekhar Thumuluru >Priority: Low > Labels: pull-request-available > Attachments: 90p_100k_sstables_with_1000_searches.png, > 90p_1million_sstables_with_1000_searches.png, > 90p_250k_sstables_with_1000_searches.png, > 90p_500k_sstables_with_1000_searches.png, > 90p_750k_sstables_with_1000_searches.png, > 95p_1_SSTable_with_5000_Searches.png, > 95p_100k_sstables_with_1000_searches.png, > 95p_15000_SSTable_with_5000_Searches.png, > 95p_1million_sstables_with_1000_searches.png, > 95p_2_SSTable_with_5000_Searches.png, > 95p_25000_SSTable_with_5000_Searches.png, > 95p_250k_sstables_with_1000_searches.png, > 95p_3_SSTable_with_5000_Searches.png, > 95p_5000_SSTable_with_5000_Searches.png, > 95p_500k_sstables_with_1000_searches.png, > 95p_750k_sstables_with_1000_searches.png, > 99p_1_SSTable_with_5000_Searches.png, > 99p_100k_sstables_with_1000_searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_1million_sstables_with_1000_searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_250k_sstables_with_1000_searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, > 99p_500k_sstables_with_1000_searches.png, > 99p_750k_sstables_with_1000_searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java, > Mean_1_SSTable_with_5000_Searches.png, > Mean_100k_sstables_with_1000_searches.png, > Mean_15000_SSTable_with_5000_Searches.png, > Mean_1million_sstables_with_1000_searches.png, > Mean_2_SSTable_with_5000_Searches.png, > Mean_25000_SSTable_with_5000_Searches.png, > Mean_250k_sstables_with_1000_searches.png, > Mean_3_SSTable_with_5000_Searches.png, > Mean_5000_SSTable_with_5000_Searches.png, > Mean_500k_sstables_with_1000_searches.png, > Mean_750k_sstables_with_1000_searches.png, TESTS-TestSuites.xml.lz4, > replace_intervaltree_with_intervallist.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree or out performs IntervalTree based > search. The cost of IntervalTree construction is also substantial and > produces lot of garbage during repairs. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. The x-axis in the graphs is the search interval > coverage.
[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated CASSANDRA-15397: --- Labels: pull-request-available (was: ) > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement > Components: Local/SSTable >Reporter: Chandrasekhar Thumuluru >Assignee: Chandrasekhar Thumuluru >Priority: Low > Labels: pull-request-available > Attachments: 95p_1_SSTable_with_5000_Searches.png, > 95p_15000_SSTable_with_5000_Searches.png, > 95p_2_SSTable_with_5000_Searches.png, > 95p_25000_SSTable_with_5000_Searches.png, > 95p_3_SSTable_with_5000_Searches.png, > 95p_5000_SSTable_with_5000_Searches.png, > 99p_1_SSTable_with_5000_Searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java, > Mean_1_SSTable_with_5000_Searches.png, > Mean_15000_SSTable_with_5000_Searches.png, > Mean_2_SSTable_with_5000_Searches.png, > Mean_25000_SSTable_with_5000_Searches.png, > Mean_3_SSTable_with_5000_Searches.png, > Mean_5000_SSTable_with_5000_Searches.png, TESTS-TestSuites.xml.lz4, > replace_intervaltree_with_intervallist.patch > > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree or out performs IntervalTree based > search. The cost of IntervalTree construction is also substantial and > produces lot of garbage during repairs. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. The x-axis in the graphs is the search interval > coverage. 10p means the search interval covered 10% of the intervals. The > y-axis is the time the search took in nanos. > PS: > # For the purpose of test, I simplified the IntervalTree by removing the data > portion of the interval. Modified the template version (Java generics) to a > specialized version. > # I used the code from Cassandra version _3.11_. > # Time in the graph is in nanos. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandrasekhar Thumuluru updated CASSANDRA-15397: Attachment: TESTS-TestSuites.xml.lz4 > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement > Components: Local/SSTable >Reporter: Chandrasekhar Thumuluru >Assignee: Chandrasekhar Thumuluru >Priority: Low > Attachments: 95p_1_SSTable_with_5000_Searches.png, > 95p_15000_SSTable_with_5000_Searches.png, > 95p_2_SSTable_with_5000_Searches.png, > 95p_25000_SSTable_with_5000_Searches.png, > 95p_3_SSTable_with_5000_Searches.png, > 95p_5000_SSTable_with_5000_Searches.png, > 99p_1_SSTable_with_5000_Searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java, > Mean_1_SSTable_with_5000_Searches.png, > Mean_15000_SSTable_with_5000_Searches.png, > Mean_2_SSTable_with_5000_Searches.png, > Mean_25000_SSTable_with_5000_Searches.png, > Mean_3_SSTable_with_5000_Searches.png, > Mean_5000_SSTable_with_5000_Searches.png, TESTS-TestSuites.xml.lz4, > replace_intervaltree_with_intervallist.patch > > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree or out performs IntervalTree based > search. The cost of IntervalTree construction is also substantial and > produces lot of garbage during repairs. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. The x-axis in the graphs is the search interval > coverage. 10p means the search interval covered 10% of the intervals. The > y-axis is the time the search took in nanos. > PS: > # For the purpose of test, I simplified the IntervalTree by removing the data > portion of the interval. Modified the template version (Java generics) to a > specialized version. > # I used the code from Cassandra version _3.11_. > # Time in the graph is in nanos. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandrasekhar Thumuluru updated CASSANDRA-15397: Attachment: replace_intervaltree_with_intervallist.patch > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement > Components: Local/SSTable >Reporter: Chandrasekhar Thumuluru >Assignee: Chandrasekhar Thumuluru >Priority: Low > Attachments: 95p_1_SSTable_with_5000_Searches.png, > 95p_15000_SSTable_with_5000_Searches.png, > 95p_2_SSTable_with_5000_Searches.png, > 95p_25000_SSTable_with_5000_Searches.png, > 95p_3_SSTable_with_5000_Searches.png, > 95p_5000_SSTable_with_5000_Searches.png, > 99p_1_SSTable_with_5000_Searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java, > Mean_1_SSTable_with_5000_Searches.png, > Mean_15000_SSTable_with_5000_Searches.png, > Mean_2_SSTable_with_5000_Searches.png, > Mean_25000_SSTable_with_5000_Searches.png, > Mean_3_SSTable_with_5000_Searches.png, > Mean_5000_SSTable_with_5000_Searches.png, > replace_intervaltree_with_intervallist.patch > > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree or out performs IntervalTree based > search. The cost of IntervalTree construction is also substantial and > produces lot of garbage during repairs. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. The x-axis in the graphs is the search interval > coverage. 10p means the search interval covered 10% of the intervals. The > y-axis is the time the search took in nanos. > PS: > # For the purpose of test, I simplified the IntervalTree by removing the data > portion of the interval. Modified the template version (Java generics) to a > specialized version. > # I used the code from Cassandra version _3.11_. > # Time in the graph is in nanos. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict Elliott Smith updated CASSANDRA-15397: --- Change Category: Performance Complexity: Normal Component/s: Local/SSTable Reviewers: Benedict Elliott Smith Priority: Low (was: Normal) Status: Open (was: Triage Needed) > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement > Components: Local/SSTable >Reporter: Chandrasekhar Thumuluru >Assignee: Chandrasekhar Thumuluru >Priority: Low > Attachments: 95p_1_SSTable_with_5000_Searches.png, > 95p_15000_SSTable_with_5000_Searches.png, > 95p_2_SSTable_with_5000_Searches.png, > 95p_25000_SSTable_with_5000_Searches.png, > 95p_3_SSTable_with_5000_Searches.png, > 95p_5000_SSTable_with_5000_Searches.png, > 99p_1_SSTable_with_5000_Searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java, > Mean_1_SSTable_with_5000_Searches.png, > Mean_15000_SSTable_with_5000_Searches.png, > Mean_2_SSTable_with_5000_Searches.png, > Mean_25000_SSTable_with_5000_Searches.png, > Mean_3_SSTable_with_5000_Searches.png, > Mean_5000_SSTable_with_5000_Searches.png > > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree or out performs IntervalTree based > search. The cost of IntervalTree construction is also substantial and > produces lot of garbage during repairs. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. The x-axis in the graphs is the search interval > coverage. 10p means the search interval covered 10% of the intervals. The > y-axis is the time the search took in nanos. > PS: > # For the purpose of test, I simplified the IntervalTree by removing the data > portion of the interval. Modified the template version (Java generics) to a > specialized version. > # I used the code from Cassandra version _3.11_. > # Time in the graph is in nanos. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandrasekhar Thumuluru updated CASSANDRA-15397: Attachment: Mean_3_SSTable_with_5000_Searches.png Mean_25000_SSTable_with_5000_Searches.png Mean_2_SSTable_with_5000_Searches.png Mean_15000_SSTable_with_5000_Searches.png Mean_1_SSTable_with_5000_Searches.png Mean_5000_SSTable_with_5000_Searches.png > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement >Reporter: Chandrasekhar Thumuluru >Priority: Normal > Attachments: 95p_1_SSTable_with_5000_Searches.png, > 95p_15000_SSTable_with_5000_Searches.png, > 95p_2_SSTable_with_5000_Searches.png, > 95p_25000_SSTable_with_5000_Searches.png, > 95p_3_SSTable_with_5000_Searches.png, > 95p_5000_SSTable_with_5000_Searches.png, > 99p_1_SSTable_with_5000_Searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java, > Mean_1_SSTable_with_5000_Searches.png, > Mean_15000_SSTable_with_5000_Searches.png, > Mean_2_SSTable_with_5000_Searches.png, > Mean_25000_SSTable_with_5000_Searches.png, > Mean_3_SSTable_with_5000_Searches.png, > Mean_5000_SSTable_with_5000_Searches.png > > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree or out performs IntervalTree based > search. The cost of IntervalTree construction is also substantial and > produces lot of garbage during repairs. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. The x-axis in the graphs is the search interval > coverage. 10p means the search interval covered 10% of the intervals. The > y-axis is the time the search took in nanos. > PS: > # For the purpose of test, I simplified the IntervalTree by removing the data > portion of the interval. Modified the template version (Java generics) to a > specialized version. > # I used the code from Cassandra version _3.11_. > # Time in the graph is in nanos. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandrasekhar Thumuluru updated CASSANDRA-15397: Attachment: 95p_5000_SSTable_with_5000_Searches.png 95p_1_SSTable_with_5000_Searches.png 95p_15000_SSTable_with_5000_Searches.png 95p_2_SSTable_with_5000_Searches.png 95p_25000_SSTable_with_5000_Searches.png 95p_3_SSTable_with_5000_Searches.png > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement >Reporter: Chandrasekhar Thumuluru >Priority: Normal > Attachments: 95p_1_SSTable_with_5000_Searches.png, > 95p_15000_SSTable_with_5000_Searches.png, > 95p_2_SSTable_with_5000_Searches.png, > 95p_25000_SSTable_with_5000_Searches.png, > 95p_3_SSTable_with_5000_Searches.png, > 95p_5000_SSTable_with_5000_Searches.png, > 99p_1_SSTable_with_5000_Searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java > > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree or out performs IntervalTree based > search. The cost of IntervalTree construction is also substantial and > produces lot of garbage during repairs. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. The x-axis in the graphs is the search interval > coverage. 10p means the search interval covered 10% of the intervals. The > y-axis is the time the search took in nanos. > PS: > # For the purpose of test, I simplified the IntervalTree by removing the data > portion of the interval. Modified the template version (Java generics) to a > specialized version. > # I used the code from Cassandra version _3.11_. > # Time in the graph is in nanos. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandrasekhar Thumuluru updated CASSANDRA-15397: Description: Cassandra uses IntervalTrees to identify the SSTables that overlap with search interval. In Cassandra, IntervalTrees are not mutated. They are recreated each time a mutation is required. This can be an issue during repairs. In fact we noticed such issues during repair. Since lists are cache friendly compared to linked lists and trees, I decided to compare the search performance with: * Linear Walk. * Elimination using Binary Search (idea is to eliminate intervals using start and end points of search interval). Based on the tests I ran, I noticed Binary Search based elimination almost always performs similar to IntervalTree or out performs IntervalTree based search. The cost of IntervalTree construction is also substantial and produces lot of garbage during repairs. I ran the tests using random intervals to build the tree/lists and another randomly generated search interval with 5000 iterations. I'm attaching all the relevant graphs. The x-axis in the graphs is the search interval coverage. 10p means the search interval covered 10% of the intervals. The y-axis is the time the search took in nanos. PS: # For the purpose of test, I simplified the IntervalTree by removing the data portion of the interval. Modified the template version (Java generics) to a specialized version. # I used the code from Cassandra version _3.11_. # Time in the graph is in nanos. was: Cassandra uses IntervalTrees to identify the SSTables that overlap with search interval. In Cassandra, IntervalTrees are not mutated. They are recreated each time a mutation is required. This can be an issue during repairs. In fact we noticed such issues during repair. Since lists are cache friendly compared to linked lists and trees, I decided to compare the search performance with: * Linear Walk. * Elimination using Binary Search (idea is to eliminate intervals using start and end points of search interval). Based on the tests I ran, I noticed Binary Search based elimination almost always performs similar to IntervalTree or out performs IntervalTree based search. The cost of IntervalTree construction is also substantial and produces lot of garbage during repairs. I ran the tests using random intervals to build the tree/lists and another randomly generated search interval with 5000 iterations. I'm attaching all the relevant graphs. The x-axis in the graphs is the search interval coverage. 10p means the search interval covered 10% of the intervals. The y-axis is the time the search took in nanos. PS: # For the purpose of test, I simplified the IntervalTree code by making it non-generic and removing the data portion of the interval. # I used the code from Cassandra version _3.11_. # Time in the graph is in nanos. > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement >Reporter: Chandrasekhar Thumuluru >Priority: Normal > Attachments: 99p_1_SSTable_with_5000_Searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java > > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree or out performs IntervalTree based > search. The cost of IntervalTree construction is also substantial and > produces lot of garbage during repairs. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. The x-axis in the graphs is the search interval > coverage. 10p means the search interval covered 10% of the intervals. The > y-axis is the time the search
[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandrasekhar Thumuluru updated CASSANDRA-15397: Description: Cassandra uses IntervalTrees to identify the SSTables that overlap with search interval. In Cassandra, IntervalTrees are not mutated. They are recreated each time a mutation is required. This can be an issue during repairs. In fact we noticed such issues during repair. Since lists are cache friendly compared to linked lists and trees, I decided to compare the search performance with: * Linear Walk. * Elimination using Binary Search (idea is to eliminate intervals using start and end points of search interval). Based on the tests I ran, I noticed Binary Search based elimination almost always performs similar to IntervalTree or out performs IntervalTree based search. The cost of IntervalTree construction is also substantial and produces lot of garbage during repairs. I ran the tests using random intervals to build the tree/lists and another randomly generated search interval with 5000 iterations. I'm attaching all the relevant graphs. The x-axis in the graphs is the search interval coverage. 10p means the search interval covered 10% of the intervals. The y-axis is the time the search took in nanos. PS: # For the purpose of test, I simplified the IntervalTree code by making it non-generic and removing the data portion of the interval. # I used the code from Cassandra version _3.11_. # Time in the graph is in nanos. was: Cassandra uses IntervalTrees to identify the SSTables that overlap with search interval. In Cassandra, IntervalTrees are not mutated. They are recreated each time a mutation is required. This can be an issue during repairs. In fact we noticed such issues during repair. Since lists are cache friendly compared to linked lists and trees, I decided to compare the search performance with: * Linear Walk. * Elimination using Binary Search (idea is to eliminate intervals using start and end points of search interval). Based on the tests I ran, I noticed Binary Search based elimination almost always performs similar to IntervalTree performance or out performs IntervalTree based search. I ran the tests using random intervals to build the tree/lists and another randomly generated search interval with 5000 iterations. I'm attaching all the relevant graphs. PS: # For the purpose of test, I simplified the IntervalTree code by making it non-generic and removing the data portion of the interval. # I used the code from Cassandra version _3.11_. # Time in the graph is in nanos. > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement >Reporter: Chandrasekhar Thumuluru >Priority: Normal > Attachments: 99p_1_SSTable_with_5000_Searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java > > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree or out performs IntervalTree based > search. The cost of IntervalTree construction is also substantial and > produces lot of garbage during repairs. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. The x-axis in the graphs is the search interval > coverage. 10p means the search interval covered 10% of the intervals. The > y-axis is the time the search took in nanos. > PS: > # For the purpose of test, I simplified the IntervalTree code by making it > non-generic and removing the data portion of the interval. > # I used the code from Cassandra version _3.11_. > # Time in the graph is in nanos. -- This message was sent by Atlassian Jira
[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandrasekhar Thumuluru updated CASSANDRA-15397: Description: Cassandra uses IntervalTrees to identify the SSTables that overlap with search interval. In Cassandra, IntervalTrees are not mutated. They are recreated each time a mutation is required. This can be an issue during repairs. In fact we noticed such issues during repair. Since lists are cache friendly compared to linked lists and trees, I decided to compare the search performance with: * Linear Walk. * Elimination using Binary Search (idea is to eliminate intervals using start and end points of search interval). Based on the tests I ran, I noticed Binary Search based elimination almost always performs similar to IntervalTree performance or out performs IntervalTree based search. I ran the tests using random intervals to build the tree/lists and another randomly generated search interval with 5000 iterations. I'm attaching all the relevant graphs. PS: # For the purpose of test, I simplified the IntervalTree code by making it non-generic and removing the data portion of the interval. # I used the code from Cassandra version _3.11_. # Time in the graph is in nanos. was: Cassandra uses IntervalTrees to identify the SSTables that overlap with search interval. In Cassandra, IntervalTrees are not mutated. They are recreated each time a mutation is required. This can be an issue during repairs. In fact we noticed such issues during repair. Since lists are cache friendly compared to linked lists and trees, I decided to compare the search performance with: * Linear Walk. * Elimination using Binary Search (idea is to eliminate intervals using start and end points of search interval). Based on the tests I ran, I noticed Binary Search based elimination almost always performs similar to IntervalTree performance or out performs IntervalTree based search. I ran the tests using random intervals to build the tree/lists and another randomly generated search interval with 5000 iterations. I'm attaching all the relevant graphs. PS: # For the purpose of test, I simplified the IntervalTree code by making it non-generic and removing the data portion of the interval. # I used the code from Cassandra version _3.11_. # Time in the graph is in nanos. > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement >Reporter: Chandrasekhar Thumuluru >Priority: Normal > Attachments: 99p_1_SSTable_with_5000_Searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java > > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree performance or out performs > IntervalTree based search. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. > PS: > # For the purpose of test, I simplified the IntervalTree code by making it > non-generic and removing the data portion of the interval. > # I used the code from Cassandra version _3.11_. > # Time in the graph is in nanos. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandrasekhar Thumuluru updated CASSANDRA-15397: Description: Cassandra uses IntervalTrees to identify the SSTables that overlap with search interval. In Cassandra, IntervalTrees are not mutated. They are recreated each time a mutation is required. This can be an issue during repairs. In fact we noticed such issues during repair. Since lists are cache friendly compared to linked lists and trees, I decided to compare the search performance with: * Linear Walk. * Elimination using Binary Search (idea is to eliminate intervals using start and end points of search interval). Based on the tests I ran, I noticed Binary Search based elimination almost always performs similar to IntervalTree performance or out performs IntervalTree based search. I ran the tests using random intervals to build the tree/lists and another randomly generated search interval with 5000 iterations. I'm attaching all the relevant graphs. PS: # For the purpose of test, I simplified the IntervalTree code by making it non-generic and removing the data portion of the interval. # I used the code from Cassandra version _3.11_. # Time in the graph is in nanos. was: Cassandra uses IntervalTrees to identify the SSTables that overlap with search interval. In Cassandra, IntervalTrees are not mutated. They are recreated each time a mutation is required. This can be an issue during repairs. In fact we noticed such issues during repair. Since lists are cache friendly compared to linked lists and trees, I decided to compare the search performance with: * Linear Walk. * Elimination using Binary Search (idea is to eliminate intervals using start and end points of search interval). Based on the tests I ran, I noticed Binary Search based elimination almost always performs similar to IntervalTree performance or out performs IntervalTree based search. I ran the tests using random intervals to build the tree/lists and another randomly generated search interval with 5000 iterations. I'm attaching all the relevant graphs. PS: For the purpose of test, I simplified the IntervalTree code by making it non-generic and removing the data portion of the interval. I used the code from version _3.11_. > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement >Reporter: Chandrasekhar Thumuluru >Priority: Normal > Attachments: 99p_1_SSTable_with_5000_Searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java > > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree performance or out performs > IntervalTree based search. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. > PS: > # For the purpose of test, I simplified the IntervalTree code by making it > non-generic and removing the data portion of the interval. > # I used the code from Cassandra version _3.11_. > # Time in the graph is in nanos. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandrasekhar Thumuluru updated CASSANDRA-15397: Attachment: IntervalTreeSimplified.java > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement >Reporter: Chandrasekhar Thumuluru >Priority: Normal > Attachments: 99p_1_SSTable_with_5000_Searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java > > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree performance or out performs > IntervalTree based search. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. > PS: For the purpose of test, I simplified the IntervalTree code by making it > non-generic and removing the data portion of the interval. I used the code > from version _3.11_. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandrasekhar Thumuluru updated CASSANDRA-15397: Attachment: IntervalList.java > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement >Reporter: Chandrasekhar Thumuluru >Priority: Normal > Attachments: 99p_1_SSTable_with_5000_Searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java > > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree performance or out performs > IntervalTree based search. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. > PS: For the purpose of test, I simplified the IntervalTree code by making it > non-generic and removing the data portion of the interval. I used the code > from version _3.11_. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandrasekhar Thumuluru updated CASSANDRA-15397: Attachment: IntervalListWithElimination.java > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement >Reporter: Chandrasekhar Thumuluru >Priority: Normal > Attachments: 99p_1_SSTable_with_5000_Searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java > > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree performance or out performs > IntervalTree based search. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. > PS: For the purpose of test, I simplified the IntervalTree code by making it > non-generic and removing the data portion of the interval. I used the code > from version _3.11_. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.
[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandrasekhar Thumuluru updated CASSANDRA-15397: Description: Cassandra uses IntervalTrees to identify the SSTables that overlap with search interval. In Cassandra, IntervalTrees are not mutated. They are recreated each time a mutation is required. This can be an issue during repairs. In fact we noticed such issues during repair. Since lists are cache friendly compared to linked lists and trees, I decided to compare the search performance with: * Linear Walk. * Elimination using Binary Search (idea is to eliminate intervals using start and end points of search interval). Based on the tests I ran, I noticed Binary Search based elimination almost always performs similar to IntervalTree performance or out performs IntervalTree based search. I ran the tests using random intervals to build the tree/lists and another randomly generated search interval with 5000 iterations. I'm attaching all the relevant graphs. PS: For the purpose of test, I simplified the IntervalTree code by making it non-generic and removing the data portion of the interval. I used the code from version _3.11_. was: Cassandra uses IntervalTrees to identify the SSTables that overlap with search interval. In Cassandra, IntervalTrees are not mutated. They are recreated each time a mutation is required. This can be an issue during repairs. In fact we noticed such issues during repair. Since lists are cache friendly compared to linked lists and trees, I decided to compare the search performance with: * Linear Walk. * Elimination using Binary Search (idea is to eliminate intervals using start and end points of search interval). Based on the tests I ran, I noticed Binary Search based elimination almost always performs similar to IntervalTree performance or out performs IntervalTree based search. I ran the tests using random intervals to build the tree/lists and another randomly generated search interval with 5000 iterations. I'm attaching all the relevant graphs. PS: For the purpose of test, I simplified the IntervalTree code by making it non-generic and removing the data portion of the interval. > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > -- > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement >Reporter: Chandrasekhar Thumuluru >Priority: Normal > Attachments: 99p_1_SSTable_with_5000_Searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_2_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_3_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png > > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree performance or out performs > IntervalTree based search. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. > PS: For the purpose of test, I simplified the IntervalTree code by making it > non-generic and removing the data portion of the interval. I used the code > from version _3.11_. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org