Balazs Meszaros created HBASE-30160:
---------------------------------------
Summary: Prevent region creation if the encoded region names are
the same
Key: HBASE-30160
URL: https://issues.apache.org/jira/browse/HBASE-30160
Project: HBase
Issue Type: Sub-task
Reporter: Balazs Meszaros
HBase encoded region names are hashes computed like this:
MD5(tableName,startKey,...). With a specially crafted startKey, collisions
are easy to create, for example:
{noformat}
hbase:001:0> create 'table1', 'f', SPLITS =>
["\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00^B\xb9\x99\xdb\xb7\x98W\xfa\xa1\xe0\xf1\xbc\x09h]1S[&u*\x93\xa1&RzF\x87\x9e\x970\x84\xe5\xb9\xe3ln*l\x07\x0c\xef\x03\x96Q\xbdC!\xb1\xdec-\xfb+\x11\x83h\xc1\xbe$\x1f\xae\x95\xaf\xd3W\x07\x8a\x01\xfa\xf1\xba\x83\x8c}\xa5A1\x83\xae\xae\xf8\xe6\xf9\xe5F\xa7\xc9\x1a\xfeM\xec\x07\xdem\x0em\x9e\x97\xf4\x16\x08\x94\xa8\x8a87\x07\xb5v\xac\xe7\x07\x10\x22\xfc\xb9\x1fm\xbd\x13V\xa9\xedX\xf0\xb1",
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00^B\xb9\x99\xdb\xb7\x98W\xfa\xa1\xe0\xf1\xbc\x09h]1S[\xa6u*\x93\xa1&RzF\x87\x9e\x970\x84\xe5\xb9\xe3ln*l\x07\x0c\xef\x03\x96\xd1\xbcC!\xb1\xdec-\xfb+\x11\x83h\xc1>$\x1f\xae\x95\xaf\xd3W\x07\x8a\x01\xfa\xf1\xba\x83\x8c}\xa5A1\x83\xae\xae\xf8f\xf9\xe5F\xa7\xc9\x1a\xfeM\xec\x07\xdem\x0em\x9e\x97\xf4\x16\x08\x94\xa8\x8a87\x075w\xac\xe7\x07\x10\x22\xfc\xb9\x1fm\xbd\x13V)\xedX\xf0\xb1"]
ERROR: The procedure 9 is still running
For usage try 'help "create"'
Took 608.8101 seconds
{noformat}
The table creation fails because both regions get the same encoded name:
{noformat}
2026-05-13 09:34:23,762 INFO org.apache.hadoop.hbase.regionserver.HRegion:
[RegionOpenAndInit-table1-pool-2]: creating {ENCODED =>
647314dfe2b7e604e08fd7fd3fec44fc, NAME => 'table1,...
2026-05-13 09:34:23,764 INFO org.apache.hadoop.hbase.regionserver.HRegion:
[RegionOpenAndInit-table1-pool-1]: creating {ENCODED =>
647314dfe2b7e604e08fd7fd3fec44fc, NAME => 'table1,...
2026-05-13 09:34:23,772 WARN org.apache.hadoop.hdfs.DataStreamer:
[Thread-140]: DataStreamer Exception
java.io.FileNotFoundException: File does not exist:
/hbase/data/default/table1/647314dfe2b7e604e08fd7fd3fec44fc/.regioninfo (inode
16653) [Lease. Holder: DFSClient_NONMAPREDUCE_1353520776_1, pending creates: 3]
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3194)
at
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:609)
...
{noformat}
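As a rough illustration of why both regions end up in the same directory,
here is a minimal Python sketch. It assumes a simplified model in which the
encoded name is the hex MD5 of "tableName,startKey,regionId"; the function
names are hypothetical and this is not the actual HBase code path. The
directory layout mirrors the path seen in the log above.

```python
import hashlib


def encoded_region_name(table: str, start_key: bytes, region_id: int) -> str:
    """Simplified model (assumption): hex MD5 over 'table,startKey,regionId'."""
    full_name = (
        table.encode("utf-8") + b"," + start_key + b"," + str(region_id).encode("ascii")
    )
    return hashlib.md5(full_name).hexdigest()


def region_dir(table: str, encoded: str, namespace: str = "default") -> str:
    """Region directory as it appears in the log: one directory per encoded name."""
    return f"/hbase/data/{namespace}/{table}/{encoded}"


if __name__ == "__main__":
    # Two ordinary start keys give two different directories. Two start keys
    # whose full names collide under MD5 would give the same directory, so
    # both regions would try to write the same .regioninfo file.
    for key in (b"row-a", b"row-b"):
        enc = encoded_region_name("table1", key, 1)
        print(region_dir("table1", enc))
```

Because the directory name depends only on the hash, two distinct regions
with colliding names cannot be told apart on the filesystem.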
The procedure never finishes and blocks any further attempt to create
{{table1}}. The same issue should also be triggerable by splitting the table
twice:
{noformat}
split 'table1', 'malicious-key1'
split 'table1', 'malicious-key2'
{noformat}
It would be hard to replace MD5 with something else, but we should handle
these collisions better: check whether the encoded region names are the same
and fail immediately. Under normal circumstances, the chance of a collision
with automatic splitting is very low.
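
The proposed check could look roughly like the following Python sketch. It is
only an illustration under stated assumptions: the encoded name is modeled as
the hex MD5 of "tableName,startKey,regionId", and all names here
(EncodedNameCollision, check_unique_encoded_names, the encode parameter) are
hypothetical, not existing HBase APIs.

```python
import hashlib
from typing import Callable, Dict, Iterable


class EncodedNameCollision(Exception):
    """Raised when two distinct start keys map to the same encoded name."""


def md5_encoded_name(table: str, start_key: bytes, region_id: int) -> str:
    # Simplified model (assumption): hex MD5 over 'table,startKey,regionId'.
    full_name = (
        table.encode("utf-8") + b"," + start_key + b"," + str(region_id).encode("ascii")
    )
    return hashlib.md5(full_name).hexdigest()


def check_unique_encoded_names(
    table: str,
    start_keys: Iterable[bytes],
    region_id: int,
    encode: Callable[[str, bytes, int], str] = md5_encoded_name,
) -> Dict[str, bytes]:
    """Fail fast, before any region is created, if encoded names clash.

    Returns a mapping from encoded name to start key when all names are unique.
    """
    seen: Dict[str, bytes] = {}
    for key in start_keys:
        enc = encode(table, key, region_id)
        # Only distinct keys that share an encoded name count as a collision.
        if enc in seen and seen[enc] != key:
            raise EncodedNameCollision(
                f"start keys {seen[enc]!r} and {key!r} share encoded name {enc}"
            )
        seen[enc] = key
    return seen
```

The encode parameter exists only so the check can be exercised with a
deliberately weak hash in tests, without needing real MD5 collision data.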
--
This message was sent by Atlassian Jira
(v8.20.10#820010)