[ https://issues.apache.org/jira/browse/ASTERIXDB-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chen Luo updated ASTERIXDB-2453: -------------------------------- Description: The current constant merge policy has a very high merge cost (O(n*n), where n is the number of records), and is thus seldom used in practice. However, it still has a desirable property that read cost is always bounded. From the user's perspective, this policy is also easy to tune - only a single parameter of the number of components. To improve the write cost of the constant merge policy, we will adopt the idea of Binomial policy proposed by https://arxiv.org/abs/1407.3008. This policy significantly improves the merge cost to O(K*n^(1+1/K)), where K is the maximum number of components, and n is the total number of records (or flushes). Another desirable property is that this policy only has write cost O(n log n) (similar to the current prefix policy) when n is relatively small (the number of flushes < 4^K). was: The current constant merge policy has a very high merge cost (O(n*n), where n is the number of records), and is thus seldom used in practice. However, it still has a desirable property that read cost is always bounded. From the user's perspective, this policy is also easy to tune - only a single parameter of the number of components. To improve the write cost of the constant merge policy, we will adopt the idea of Binomial policy proposed by https://arxiv.org/abs/1407.3008. This policy significantly improves the merge cost to O(K*n^{1+1/K}), where K is the bound of number of components, and n is the total number of records. Another desirable property is that this policy only has write cost O(n log n) (similar to the current prefix policy) when n is relatively small (the number of flushes < 4^K). > Improve the Constant Merge Policy > --------------------------------- > > Key: ASTERIXDB-2453 > URL: https://issues.apache.org/jira/browse/ASTERIXDB-2453 > Project: Apache AsterixDB > Issue Type: Bug > Components: STO - Storage > Reporter: Chen Luo > Assignee: Chen Luo > Priority: Major > > The current constant merge policy has a very high merge cost (O(n*n), where n > is the number of records), and is thus seldom used in practice. However, it > still has a desirable property that read cost is always bounded. From the > user's perspective, this policy is also easy to tune - only a single > parameter of the number of components. > To improve the write cost of the constant merge policy, we will adopt the > idea of Binomial policy proposed by https://arxiv.org/abs/1407.3008. This > policy significantly improves the merge cost to O(K*n^(1+1/K)), where K is > the maximum number of components, and n is the total number of records (or > flushes). Another desirable property is that this policy only has write cost > O(n log n) (similar to the current prefix policy) when n is relatively small > (the number of flushes < 4^K). -- This message was sent by Atlassian JIRA (v7.6.3#76005)