Bryan Beaudreault created HBASE-26547:
-----------------------------------------
Summary: Passing an invalid DURABILITY when creating a table
enters an endless loop of retries
Key: HBASE-26547
URL: https://issues.apache.org/jira/browse/HBASE-26547
Project: HBase
Issue Type: Bug
Reporter: Bryan Beaudreault
As part of our HBase2 upgrade, our automation copies the HTableDescriptor from
a CDH5 cluster into the HBase2 cluster, then kicks off replication. During our
testing we encountered a misconfigured table, which had DURABILITY =>
'DEFAULT' where the correct value is 'USE_DEFAULT'.
In HBase 1.x, the IllegalArgumentException thrown by Durability.valueOf for an
invalid value is caught, and the default value of USE_DEFAULT is used instead.
So this misconfiguration caused no pain in CDH5.
In HBase 2.x+, the IllegalArgumentException from Durability.valueOf is no
longer caught. This is probably a good thing, but unfortunately it caused the
CreateTableProcedure to fail in a way that resulted in an endless loop of
retries, with no backoff.
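
To make the difference between the two behaviours concrete, here is a minimal
sketch. It does not use HBase classes: the nested Durability enum is a local
stand-in carrying the same constant names as HBase's Durability, and
parseLenient / parseStrict are hypothetical names for illustration only, not
the actual HBase code paths.

```java
final class DurabilityParsing {
    // Local stand-in with the same constant names as HBase's Durability enum.
    enum Durability { USE_DEFAULT, SKIP_WAL, ASYNC_WAL, SYNC_WAL, FSYNC_WAL }

    // 1.x-style handling (hypothetical helper): an unknown name is swallowed
    // and silently replaced with USE_DEFAULT.
    static Durability parseLenient(String value) {
        try {
            return Durability.valueOf(value);
        } catch (IllegalArgumentException e) {
            return Durability.USE_DEFAULT;
        }
    }

    // 2.x-style handling (hypothetical helper): the IllegalArgumentException
    // from Enum.valueOf propagates to the caller.
    static Durability parseStrict(String value) {
        return Durability.valueOf(value);
    }
}
```

With the misconfigured value from this report, parseLenient("DEFAULT") returns
USE_DEFAULT, while parseStrict("DEFAULT") throws IllegalArgumentException.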
This may be a general issue with CreateTableProcedure – there should probably
be a pre-step which validates the HTableDescriptor and terminally fails if
invalid.
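
A rough sketch of what such a pre-step could look like follows. Everything
here is an assumption for illustration: the descriptor is modelled as a plain
Map of attribute names to raw string values, and DescriptorPrecheck,
InvalidDescriptorException and validate are hypothetical names, not existing
HBase APIs. The point is only that the check runs once, before any filesystem
work, and fails with an error the caller treats as non-retryable.

```java
import java.util.Map;

final class DescriptorPrecheck {
    // Local stand-in with the same constant names as HBase's Durability enum.
    enum Durability { USE_DEFAULT, SKIP_WAL, ASYNC_WAL, SYNC_WAL, FSYNC_WAL }

    // Hypothetical marker for "do not retry": the caller should fail the
    // operation terminally when it sees this exception.
    static final class InvalidDescriptorException extends RuntimeException {
        InvalidDescriptorException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Validates the raw attributes before any region or file is created.
    // An absent DURABILITY is accepted; an unknown value is rejected.
    static void validate(Map<String, String> attributes) {
        String raw = attributes.get("DURABILITY");
        if (raw == null) {
            return;
        }
        try {
            Durability.valueOf(raw);
        } catch (IllegalArgumentException e) {
            throw new InvalidDescriptorException(
                "Invalid DURABILITY '" + raw + "'", e);
        }
    }
}
```

Calling validate(Map.of("DURABILITY", "DEFAULT")) throws
InvalidDescriptorException, whereas Map.of("DURABILITY", "USE_DEFAULT") and an
empty map both pass.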
Additionally, does it make sense to have a backoff on the retry of procedures?
The very rapid retry of this procedure actually caused HDFS issues: it created
many thousands of .regioninfo files in rapid succession, enough to lag
replication and cause DataNodes to be considered bad, which in turn caused
RegionServers to abort due to failed WAL writes.
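
For reference, one common shape for such a backoff is a capped exponential
delay. The sketch below is generic and not taken from HBase; RetryBackoff,
delayMillis and the parameter names are hypothetical, and the base and cap
values in the usage note are arbitrary.

```java
final class RetryBackoff {
    // Returns the delay before retry number `attempt` (0-based): the base
    // delay doubled once per previous attempt, never exceeding maxMillis.
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        if (attempt < 0 || baseMillis <= 0 || maxMillis < baseMillis) {
            throw new IllegalArgumentException("invalid backoff parameters");
        }
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            // Jump straight to the cap when doubling would overshoot it;
            // this also keeps the multiplication from overflowing.
            delay = (delay > maxMillis / 2) ? maxMillis : delay * 2;
        }
        return delay;
    }
}
```

With a base of 100 ms and a cap of 10000 ms, the first retries wait 100, 200,
400 and 800 ms, and every retry from the eighth onward waits the full
10000 ms, instead of being reissued immediately.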