Re: Query on rowkey distribution || Does RS and number of Region related with each other

Josh Elser Thu, 30 Aug 2018 08:42:02 -0700

As I've been trying to explain in Slack:

1. Are you including the salt in the data that you are writing, such that you are spreading the data across all Regions per their boundaries? Or, as I think you are, just creating split points with this arbitrary "salt" and not including it when you write data?

If, as I am assuming, you are not, all of your data will go into the first or last region. If you are still not getting my point, I'd suggest that you share the exact split points and one rowkey that you are writing to HBase. That will make it quite clear if my guess is correct or not.

2. The number of Regions controls the number of RegionServers that will be involved with reads/writes against that table. This is a calculation that you need to figure out based on your cluster configuration and the magnitude of your workload.
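
To make both points concrete, here is a minimal Python sketch. It is not HBase client code; it only mimics how a rowkey is matched against region boundaries by byte-wise comparison. The split points, the CRC32-based salt scheme, and the helper names (`region_index`, `salted_key`, `max_servers_involved`) are assumptions made up for illustration, not taken from the cluster discussed in this thread.

```python
import zlib
from bisect import bisect_right
from collections import Counter

# Hypothetical split points: a table pre-split into 4 regions on a
# two-digit salt prefix. Illustrative only, not taken from the thread.
SPLIT_POINTS = [b"25", b"50", b"75"]


def region_index(rowkey, split_points=SPLIT_POINTS):
    """Index of the region whose [start, end) key range contains rowkey.

    Keys are compared as raw bytes, lexicographically, which is how
    HBase orders rowkeys.
    """
    return bisect_right(split_points, rowkey)


def salted_key(key, buckets=100):
    """Prepend a deterministic two-digit salt (assumed scheme: CRC32 mod buckets)."""
    return b"%02d_" % (zlib.crc32(key) % buckets) + key


def max_servers_involved(num_regions, num_region_servers):
    """Upper bound on RegionServers serving a table: each region is
    hosted by exactly one RegionServer at a time."""
    return min(num_regions, num_region_servers)


# Rowkeys shaped like the example in the question, written WITHOUT a salt.
unsalted = [b"_%d_1516838400_1516924800_1516865160" % i for i in range(200)]

# Every key starts with "_" (0x5F), which sorts after "75", so all of
# them fall into the last region.
print(Counter(region_index(k) for k in unsalted))

# With the salt actually written as the key prefix, keys spread out.
print(Counter(region_index(salted_key(k)) for k in unsalted))

# 4 regions on a 10-server cluster can engage at most 4 servers.
print(max_servers_involved(4, 10))
```

The first counter shows every unsalted key in a single region, which is the symptom described in point 1. The last line only gives an upper bound for point 2; how many regions you actually want per server still depends on your cluster and workload.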


On 8/30/18 1:11 AM, Manjeet Singh wrote:

Hi All,



I have two questions.

*Question 1:*

I want to understand how rowkey distribution happens if I create my table
without applying any policy but opting for prefix salting.

For example, I have a rowkey like

SALT_ID_DayStartTimestamp_DayEndTimeStamp_IDTimeStamp

So it will look like this:

*_99_1516838400_1516924800_1516865160

The question is: I do not see my data getting distributed just because of
the salt.

So is pre-splitting my only choice, or do I have any other option?

I have seen two more approaches

i.e.

hbase org.apache.hadoop.hbase.util.RegionSplitter test_table HexStringSplit
-c 10 -f f1

I guess its scope is limited, as the number of regions is set at table
creation time and is then fixed? Not sure.

and

*UniformSplit
<https://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/util/RegionSplitter.UniformSplit.html>*



*Question 2: Is the number of split points related in any way to the number
of RS in the cluster? If yes, what is the calculation?*
