[ 
https://issues.apache.org/jira/browse/HBASE-12853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329574#comment-14329574
 ] 

Michael Segel  commented on HBASE-12853:
----------------------------------------

The design seems straightforward, at least as a starting point. (YMMV) 

The client will create a reference to a table and then instantiate a scanner 
object along with any associated filters. 
The client then passes this object to the server expecting a result set to be 
returned. 

On the server side, it seems that the HBase Master (active) gets the scan 
request and then starts to do the heavy lifting. 

By providing more intelligence to this process, it's possible to do more than 
just let a bucketed table abstract its buckets and act as if it's a 
regular table. 
The key question is how to best redesign this initial entry point to allow for 
such extensibility.
 

> distributed write pattern to replace ad hoc 'salting'
> -----------------------------------------------------
>
>                 Key: HBASE-12853
>                 URL: https://issues.apache.org/jira/browse/HBASE-12853
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Michael Segel 
>            Priority: Minor
>
> In reviewing HBASE-11682 (Description of Hot Spotting), one of the issues is 
> that while 'salting' alleviated  regional hot spotting, it increased the 
> complexity required to utilize the data.  
> Through the use of coprocessors, it should be possible to offer a method 
> that distributes the data on write across the cluster and then manages 
> reading the data, returning a sort-ordered result set and abstracting the 
> underlying process. 
> On table creation, a flag is set to indicate that this is a parallel table. 
> On insert into the table, if the flag is set to true then a prefix is added 
> to the key, e.g. <region server #>- or <region server #>||, where the region 
> server # is an integer between 1 and the number of region servers defined. 
> On read (scan), a separate scan is created for each region server defined, 
> adding the prefix. Since each scan will be in sort order, it's possible to 
> strip the prefix and return the lowest-valued key across the subsets. 
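
The write and read paths described in the issue can be sketched roughly as
follows. This is a minimal in-memory illustration, not HBase code and not a
coprocessor: the class SaltedTable, the helpers bucket_for and salted_key, and
the byte-sum bucketing rule are all hypothetical names and choices made for
the example. The issue only says the prefix is an integer between 1 and the
number of region servers; it does not say how that integer is chosen.

```python
import heapq
from typing import Dict, Iterator, Tuple

NUM_BUCKETS = 4  # assumption: stands in for "the number of region servers defined"


def bucket_for(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Pick a bucket in 1..num_buckets deterministically.

    Hypothetical rule (sum of the key's bytes modulo the bucket count);
    the issue does not specify how the prefix is chosen.
    """
    return sum(key.encode("utf-8")) % num_buckets + 1


def salted_key(key: str, num_buckets: int = NUM_BUCKETS) -> str:
    """Build the stored row key in the '<bucket>-<key>' form."""
    return f"{bucket_for(key, num_buckets)}-{key}"


class SaltedTable:
    """Toy stand-in for a 'parallel' table: salts on write, merges on read."""

    def __init__(self, num_buckets: int = NUM_BUCKETS) -> None:
        self.num_buckets = num_buckets
        self._rows: Dict[str, str] = {}  # stored row key -> value

    def put(self, key: str, value: str) -> None:
        # Write path: the caller supplies the plain key, the prefix is added here.
        self._rows[salted_key(key, self.num_buckets)] = value

    def _prefix_scan(self, prefix: str) -> Iterator[Tuple[str, str]]:
        # One scan per bucket: rows come back in sort order, prefix stripped.
        for row_key in sorted(self._rows):
            if row_key.startswith(prefix):
                yield row_key[len(prefix):], self._rows[row_key]

    def scan(self) -> Iterator[Tuple[str, str]]:
        # Read path: each per-bucket scan is already sorted, so a k-way merge
        # that always takes the lowest remaining key gives one sorted result.
        per_bucket = [
            self._prefix_scan(f"{bucket}-")
            for bucket in range(1, self.num_buckets + 1)
        ]
        return heapq.merge(*per_bucket)
```

A caller would only ever see plain keys: put("row-07", ...) stores the row
under a prefixed key, and scan() yields (key, value) pairs in key order with
the prefix removed. The merge holds one pending row per bucket at a time, so
it does not need to buffer whole result sets before returning rows.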



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
