James Taylor created PHOENIX-111:
------------------------------------

             Summary: Improve intra-region parallelization
                 Key: PHOENIX-111
                 URL: https://issues.apache.org/jira/browse/PHOENIX-111
             Project: Phoenix
          Issue Type: Bug
            Reporter: James Taylor


The manner in which Phoenix parallelizes queries is explained in some detail in 
the  Parallelization section here: 
http://phoenix-hbase.blogspot.com/2013/02/phoenix-knobs-dials.html

It's actually not that important to understand all the details. In the case 
where we try to parallelize within a region, we rely on the HBase Bytes.split() 
method (in DefaultParallelIteratorRegionSplitter) to split, based on the start 
and end key of the region. We basically use that method to come up with the 
start row and stop row of scans that will all run in parallel across that 
region.

The problem is, we haven't really tested this method, and I have my doubts 
about it, especially when two keys are of different length. The first thing 
that should be done is to write a few simple, independent tests using 
Bytes.split() directly to confirm whether or not there's a problem:

1. Write some simple tests to see if Bytes.split() works as expected. Does it 
work for two keys that are of different lengths? If not, we can likely take two 
keys and make them the same length through padding b/c we know the structure of 
the row key. The better we choose the split points to get even distribution, 
the better our parallelization will be.
2. One case that I know will be problematic is when a table is salted. In that 
case, we pre-split the table into N regions, where N is the SALT_BUCKETS=<N> 
value. The problem in this case is that the Bytes.split() points are going to 
be terrible, because it's not taking into account the possible values of the 
row key. For example, imagine you have a table like this:
{code}
CREATE TABLE foo(k VARCHAR PRIMARY KEY) SALT_BUCKETS=4
{code}
In this case, we'll pre-split the table and have the following region 
boundaries: 0-1, 1-2, 2-3, 3-4

What will be the Bytes.split() for these region boundaries? It would chunk it 
up into even byte boundaries which is not ideal, because the VARCHAR value 
would most likely be ascii characters in a range of 'A' to 'z'. We'd be much 
better off if we took into account the data types of the row key when we 
calculate these split points.

So the second thing to do is make some simple improvements to the start/stop 
key we pass Bytes.split() that take into account the data type of each column 
that makes up the primary key.

For Phoenix 5.0, we'll collect stats and drive this off of those, but for now, 
there's likely a few simple things we could do to make a big improvement.




--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to