[ https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chandrasekhar Thumuluru updated CASSANDRA-15397: ------------------------------------------------ Attachment: 90p_1million_sstables_with_1000_searches.png Mean_1million_sstables_with_1000_searches.png 95p_1million_sstables_with_1000_searches.png 99p_1million_sstables_with_1000_searches.png 90p_750k_sstables_with_1000_searches.png 95p_750k_sstables_with_1000_searches.png 99p_750k_sstables_with_1000_searches.png Mean_750k_sstables_with_1000_searches.png 90p_500k_sstables_with_1000_searches.png 95p_500k_sstables_with_1000_searches.png 99p_500k_sstables_with_1000_searches.png Mean_500k_sstables_with_1000_searches.png 90p_250k_sstables_with_1000_searches.png 95p_250k_sstables_with_1000_searches.png 99p_250k_sstables_with_1000_searches.png Mean_250k_sstables_with_1000_searches.png 90p_100k_sstables_with_1000_searches.png 95p_100k_sstables_with_1000_searches.png 99p_100k_sstables_with_1000_searches.png Mean_100k_sstables_with_1000_searches.png > IntervalTree performance comparison with Linear Walk and Binary Search based > Elimination. > ------------------------------------------------------------------------------------------ > > Key: CASSANDRA-15397 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15397 > Project: Cassandra > Issue Type: Improvement > Components: Local/SSTable > Reporter: Chandrasekhar Thumuluru > Assignee: Chandrasekhar Thumuluru > Priority: Low > Labels: pull-request-available > Attachments: 90p_100k_sstables_with_1000_searches.png, > 90p_1million_sstables_with_1000_searches.png, > 90p_250k_sstables_with_1000_searches.png, > 90p_500k_sstables_with_1000_searches.png, > 90p_750k_sstables_with_1000_searches.png, > 95p_10000_SSTable_with_5000_Searches.png, > 95p_100k_sstables_with_1000_searches.png, > 95p_15000_SSTable_with_5000_Searches.png, > 95p_1million_sstables_with_1000_searches.png, > 95p_20000_SSTable_with_5000_Searches.png, > 95p_25000_SSTable_with_5000_Searches.png, > 95p_250k_sstables_with_1000_searches.png, > 95p_30000_SSTable_with_5000_Searches.png, > 95p_5000_SSTable_with_5000_Searches.png, > 95p_500k_sstables_with_1000_searches.png, > 95p_750k_sstables_with_1000_searches.png, > 99p_10000_SSTable_with_5000_Searches.png, > 99p_100k_sstables_with_1000_searches.png, > 99p_15000_SSTable_with_5000_Searches.png, > 99p_1million_sstables_with_1000_searches.png, > 99p_20000_SSTable_with_5000_Searches.png, > 99p_25000_SSTable_with_5000_Searches.png, > 99p_250k_sstables_with_1000_searches.png, > 99p_30000_SSTable_with_5000_Searches.png, > 99p_5000_SSTable_with_5000_Searches.png, > 99p_500k_sstables_with_1000_searches.png, > 99p_750k_sstables_with_1000_searches.png, IntervalList.java, > IntervalListWithElimination.java, IntervalTreeSimplified.java, > Mean_10000_SSTable_with_5000_Searches.png, > Mean_100k_sstables_with_1000_searches.png, > Mean_15000_SSTable_with_5000_Searches.png, > Mean_1million_sstables_with_1000_searches.png, > Mean_20000_SSTable_with_5000_Searches.png, > Mean_25000_SSTable_with_5000_Searches.png, > Mean_250k_sstables_with_1000_searches.png, > Mean_30000_SSTable_with_5000_Searches.png, > Mean_5000_SSTable_with_5000_Searches.png, > Mean_500k_sstables_with_1000_searches.png, > Mean_750k_sstables_with_1000_searches.png, TESTS-TestSuites.xml.lz4, > replace_intervaltree_with_intervallist.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Cassandra uses IntervalTrees to identify the SSTables that overlap with > search interval. In Cassandra, IntervalTrees are not mutated. They are > recreated each time a mutation is required. This can be an issue during > repairs. In fact we noticed such issues during repair. > Since lists are cache friendly compared to linked lists and trees, I decided > to compare the search performance with: > * Linear Walk. > * Elimination using Binary Search (idea is to eliminate intervals using start > and end points of search interval). > Based on the tests I ran, I noticed Binary Search based elimination almost > always performs similar to IntervalTree or out performs IntervalTree based > search. The cost of IntervalTree construction is also substantial and > produces lot of garbage during repairs. > I ran the tests using random intervals to build the tree/lists and another > randomly generated search interval with 5000 iterations. I'm attaching all > the relevant graphs. The x-axis in the graphs is the search interval > coverage. 10p means the search interval covered 10% of the intervals. The > y-axis is the time the search took in nanos. > PS: > # For the purpose of test, I simplified the IntervalTree by removing the data > portion of the interval. Modified the template version (Java generics) to a > specialized version. > # I used the code from Cassandra version _3.11_. > # Time in the graph is in nanos. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org