[jira] [Updated] (PHOENIX-4925) Use Segment tree to organize Guide Post Info

Bin Shi (JIRA) Tue, 19 Feb 2019 09:41:11 -0800


     [ 
https://issues.apache.org/jira/browse/PHOENIX-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Bin Shi updated PHOENIX-4925:
-----------------------------
    Description: 
As reported, Query compilation (for the sample queries showed below), 
especially deriving estimation and generating parallel scans from guide posts, 
becomes much slower after we introduced Phoenix Stats. 
 a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
 b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c
 c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY 
pk1__c,pk2__c
 d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c 
ORDER BY pk1__c,pk2__c
 e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' 
OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column to 
make the primary key.
  
 By using prefix encoding for guide post info, we have to decode and traverse 
guide posts sequentially, which causes time complexity in 
BaseResultIterators.getParallelScan(...) to be O(n) , where n is the total 
count of guide posts.

According to PHOENIX-2417, to reduce footprint in client cache and over 
transmition, the prefix encoding is used as in-memory and over-the-wire 
encoding for guide post info.

We can use Segment Tree to address both memory and performance concerns. The 
guide posts are partitioned to k chunks (k=1024?), each chunk is encoded by 
prefix encoding and the encoded data is a leaf node of the tree. The inner node 
contains summary info (the count of rows, the data size) of the sub tree rooted 
at the inner node.

With this tree like data structure, compared to the current data structure, the 
increased size (mainly coming from the n/k-1 inner nodes) is ignorable. The 
time complexity for queries a, b, c can be reduced to O(m) where m is the total 
count of regions; the time complexity for "EXPLAN" queries a, b, c can be 
reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even 
be reduced to O(1). For queries d and e, the time complexity to find the start 
of target scan ranges can be reduced to O(log(n/k)).

The tree can also integrate AVL and B+ characteristics to support partial 
load/unload when interacting with stats client cache.

 

  was:
As reported, Query compilation (for the sample queries showed below), 
especially deriving estimation and generating parallel scans from guide posts, 
becomes much slower after we introduced Phoenix Stats. 
 a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
 b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c
 c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY 
pk1__c,pk2__c
 d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c 
ORDER BY pk1__c,pk2__c
 e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' 
OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column to 
make the primary key.
  
 By using prefix encoding for guide post info, we have to decode and traverse 
guide posts sequentially, which causes time complexity in 
BaseResultIterators.getParallelScan(...) to be O(n) , where n is the total 
count of guide posts.

According to PHOENIX-2417, to reduce footprint in client cache and over 
transmition, the prefix encoding is used as in-memory and over-the-wire 
encoding for guide post info.

We can use something like Sum Tree (even Binary Indexed Tree) to address both 
memory and performance concerns. The guide posts are partitioned to k chunks 
(k=1024?), each chunk is encoded by prefix encoding and the encoded data is a 
leaf node of the tree. The inner node contains summary info (the count of rows, 
the data size) of the sub tree rooted at the inner node.

With this tree like data structure, compared to the current data structure, the 
increased size (mainly coming from the n/k-1 inner nodes) is ignorable. The 
time complexity for queries a, b, c can be reduced to O(m) where m is the total 
count of regions; the time complexity for "EXPLAN" queries a, b, c can be 
reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can even 
be reduced to O(1). For queries d and e, the time complexity to find the start 
of target scan ranges can be reduced to O(log(n/k)).

The tree can also integrate AVL and B+ characteristics to support partial 
load/unload when interacting with stats client cache.

 


> Use Segment tree to organize Guide Post Info
> --------------------------------------------
>
>                 Key: PHOENIX-4925
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4925
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Bin Shi
>            Assignee: Bin Shi
>            Priority: Major
>
> As reported, Query compilation (for the sample queries showed below), 
> especially deriving estimation and generating parallel scans from guide 
> posts, becomes much slower after we introduced Phoenix Stats. 
>  a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
>  b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY 
> Pk1__c
>  c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY 
> pk1__c,pk2__c
>  d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c 
> ORDER BY pk1__c,pk2__c
>  e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 
> 'm' OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only 
> column to make the primary key.
>   
>  By using prefix encoding for guide post info, we have to decode and traverse 
> guide posts sequentially, which causes time complexity in 
> BaseResultIterators.getParallelScan(...) to be O(n) , where n is the total 
> count of guide posts.
> According to PHOENIX-2417, to reduce footprint in client cache and over 
> transmition, the prefix encoding is used as in-memory and over-the-wire 
> encoding for guide post info.
> We can use Segment Tree to address both memory and performance concerns. The 
> guide posts are partitioned to k chunks (k=1024?), each chunk is encoded by 
> prefix encoding and the encoded data is a leaf node of the tree. The inner 
> node contains summary info (the count of rows, the data size) of the sub tree 
> rooted at the inner node.
> With this tree like data structure, compared to the current data structure, 
> the increased size (mainly coming from the n/k-1 inner nodes) is ignorable. 
> The time complexity for queries a, b, c can be reduced to O(m) where m is the 
> total count of regions; the time complexity for "EXPLAN" queries a, b, c can 
> be reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE ONLY)", it can 
> even be reduced to O(1). For queries d and e, the time complexity to find the 
> start of target scan ranges can be reduced to O(log(n/k)).
> The tree can also integrate AVL and B+ characteristics to support partial 
> load/unload when interacting with stats client cache.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (PHOENIX-4925) Use Segment tree to organize Guide Post Info

Reply via email to