[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Luo updated ASTERIXDB-2453:
--------------------------------
    Description: 
The current constant merge policy has a very high merge cost (O(n*n), where n 
is the number of records), and is thus seldom used in practice. However, it 
still has a desirable property that read cost is always bounded. From the 
user's perspective, this policy is also easy to tune - only a single parameter 
of the number of components.

To improve the write cost of the constant merge policy, we will adopt the idea 
of Binomial policy proposed by https://arxiv.org/abs/1407.3008. This policy 
significantly improves the merge cost to O(K*n^(1+1/K)), where K is the maximum 
number of components, and n is the total number of records (or flushes). 
Another desirable property is that this policy only has write cost O(n log n) 
(similar to the current prefix policy) when n is relatively small (the number 
of flushes < 4^K). 

  was:
The current constant merge policy has a very high merge cost (O(n*n), where n 
is the number of records), and is thus seldom used in practice. However, it 
still has a desirable property that read cost is always bounded. From the 
user's perspective, this policy is also easy to tune - only a single parameter 
of the number of components.

To improve the write cost of the constant merge policy, we will adopt the idea 
of Binomial policy proposed by https://arxiv.org/abs/1407.3008. This policy 
significantly improves the merge cost to O(K*n^{1+1/K}), where K is the bound 
of number of components, and n is the total number of records. Another 
desirable property is that this policy only has write cost O(n log n) (similar 
to the current prefix policy) when n is relatively small (the number of flushes 
< 4^K). 


> Improve the Constant Merge Policy
> ---------------------------------
>
>                 Key: ASTERIXDB-2453
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2453
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: STO - Storage
>            Reporter: Chen Luo
>            Assignee: Chen Luo
>            Priority: Major
>
> The current constant merge policy has a very high merge cost (O(n*n), where n 
> is the number of records), and is thus seldom used in practice. However, it 
> still has a desirable property that read cost is always bounded. From the 
> user's perspective, this policy is also easy to tune - only a single 
> parameter of the number of components.
> To improve the write cost of the constant merge policy, we will adopt the 
> idea of Binomial policy proposed by https://arxiv.org/abs/1407.3008. This 
> policy significantly improves the merge cost to O(K*n^(1+1/K)), where K is 
> the maximum number of components, and n is the total number of records (or 
> flushes). Another desirable property is that this policy only has write cost 
> O(n log n) (similar to the current prefix policy) when n is relatively small 
> (the number of flushes < 4^K). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to