Ke Han created HBASE-28159:
------------------------------
Summary: Unable to get table state error when table is being
initialized
Key: HBASE-28159
URL: https://issues.apache.org/jira/browse/HBASE-28159
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 2.4.17
Reporter: Ke Han
Attachments: hbase--master-37bbb9b6f05a.log, persistent.tar.gz
When executing commands to create a table, I noticed the following ERROR in
the HMaster log:
{code:java}
2023-10-17 06:41:47,118 ERROR [master/hmaster:16000.Chore.1] master.TableStateManager: Unable to get table uuidf68fb89ec7f4435597d69fb7b099d8e7 state
org.apache.hadoop.hbase.TableNotFoundException: No state found for uuidf68fb89ec7f4435597d69fb7b099d8e7
    at org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:155)
    at org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:92)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.isTableDisabled(AssignmentManager.java:419)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.getRegionStatesCount(AssignmentManager.java:2341)
    at org.apache.hadoop.hbase.master.HMaster.getClusterMetricsWithoutCoprocessor(HMaster.java:2616)
    at org.apache.hadoop.hbase.master.HMaster.getClusterMetricsWithoutCoprocessor(HMaster.java:2537)
    at org.apache.hadoop.hbase.master.balancer.ClusterStatusChore.chore(ClusterStatusChore.java:47)
    at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:158)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:107)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
{code}
h1. Reproduce
Because the error depends on thread interleaving, the following command
sequence may need to be run multiple times to reproduce it.
Setup: 1 HM, 2 RS, HDFS-2.10.2
{code:java}
create 'uuid49bb410e0a0c40ffb070d17787b4cad7', {NAME =>
'uuid66e57e5195e04956a78f789b2a25ec01', VERSIONS => 1, COMPRESSION => 'GZ',
BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME =>
'uuid119181eed72a43ccb66fabe37f84d2c0', VERSIONS => 1, COMPRESSION => 'GZ',
BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME =>
'uuidc2d4931eaf4c429db0e55514fb12e767', VERSIONS => 3, COMPRESSION => 'NONE',
BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME =>
'uuidc9802bbfbe434411ae68bb8388d499b6', VERSIONS => 3, COMPRESSION => 'NONE',
BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME =>
'uuidc85e117d0ca144719fc53d30b189a343', VERSIONS => 3, COMPRESSION => 'NONE',
BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}
create 'uuid094dd5bf47eb47d69148b63e73ce0e7c', {NAME =>
'uuid76ccbd96fbdc418b95ed9971ff423b2d', VERSIONS => 1, COMPRESSION => 'GZ',
BLOOMFILTER => 'ROW', IN_MEMORY => 'true'}, {NAME =>
'uuid36835d3faff04838bd02d6226557d7c8', VERSIONS => 1, COMPRESSION => 'GZ',
BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME =>
'uuid37752598d1bb405eb39a3e17c04d7e60', VERSIONS => 1, COMPRESSION => 'NONE',
BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}
create 'uuidf68fb89ec7f4435597d69fb7b099d8e7', {NAME =>
'uuidb235288b1d304fe1a62adb63968d9eee', VERSIONS => 1, COMPRESSION => 'NONE',
BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}, {NAME =>
'uuidf348f8849e724b3fa231fc2bb459be2d', VERSIONS => 1, COMPRESSION => 'NONE',
BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME =>
'uuid81341a87083e49d7a0d8aff7b1ccf16a', VERSIONS => 3, COMPRESSION => 'GZ',
BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME =>
'uuid24db0d3c67c347d3a4c18af90facec2d', VERSIONS => 1, COMPRESSION => 'NONE',
BLOOMFILTER => 'ROW', IN_MEMORY => 'true'}, {NAME =>
'uuid7ecf10315f444cfd9c5698695f9054d9', VERSIONS => 1, COMPRESSION => 'NONE',
BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}
enable 'uuid094dd5bf47eb47d69148b63e73ce0e7c'
create_namespace 'uuidc1066f82d7834f698d335dd04fa7ad3e'
alter 'uuid094dd5bf47eb47d69148b63e73ce0e7c', {NAME => 'enaJvIGYBk',
BLOOMFILTER => 'ROWCOL', IN_MEMORY => false}
disable 'uuidf68fb89ec7f4435597d69fb7b099d8e7'
{code}
I have attached the full logs.
h1. Root Cause
The ERROR is logged because of a thread interleaving between (1) T1, the thread
creating the table, and (2) T2, the Chore thread calculating TABLE_TO_REGIONS_COUNT.
Here is how it happens in detail:
# The user issues a create table request, which puts the table name into
tableDescriptors.
# The Chore thread calculates TABLE_TO_REGIONS_COUNT by iterating over all
tables from {*}getTableDescriptors().getAll(){*}. This includes the table
that is being created, whose table state has not been created yet.
# The Chore thread tries to fetch that table's state, gets a
TableNotFoundException, and logs it at ERROR level.
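To make the interleaving concrete, here is a minimal, self-contained Java sketch. It is not HBase code: the class TableStateRaceSketch, its two maps, and all of its method names are hypothetical stand-ins for tableDescriptors and the table state store. It replays the interleaving deterministically on one thread, and shows one possible way (an assumption, not the actual fix) for the chore side to skip a table whose state does not exist yet instead of treating it as an error.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the master's bookkeeping; none of these names exist in HBase.
final class TableStateRaceSketch {
    enum State { ENABLED, DISABLED }

    // Stand-in for tableDescriptors: written first during table creation.
    private final Map<String, String> descriptors = new ConcurrentHashMap<>();
    // Stand-in for the table state store: written later in the same creation.
    private final Map<String, State> states = new ConcurrentHashMap<>();

    // T1, first half of create: the descriptor becomes visible.
    void addDescriptor(String table) {
        descriptors.put(table, "descriptor");
    }

    // T1, second half of create: the state becomes visible.
    void setState(String table, State state) {
        states.put(table, state);
    }

    // Returns an empty Optional instead of throwing when no state exists yet.
    Optional<State> findState(String table) {
        return Optional.ofNullable(states.get(table));
    }

    // T2 (chore): tables that have both a descriptor and a state.
    List<String> tablesWithState() {
        List<String> result = new ArrayList<>();
        for (String table : descriptors.keySet()) {
            if (findState(table).isPresent()) {
                result.add(table);
            }
        }
        return result;
    }

    // T2 (chore): tables that have a descriptor but no state yet,
    // i.e. tables caught in the middle of creation.
    List<String> tablesMissingState() {
        List<String> result = new ArrayList<>();
        for (String table : descriptors.keySet()) {
            if (!findState(table).isPresent()) {
                result.add(table);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TableStateRaceSketch master = new TableStateRaceSketch();
        master.addDescriptor("t1");
        master.setState("t1", State.ENABLED);

        // T1 is midway through creating t2: descriptor present, state absent.
        master.addDescriptor("t2");

        // T2 runs at exactly this point.
        System.out.println("counted: " + master.tablesWithState());
        System.out.println("skipped: " + master.tablesMissingState());

        // T1 finishes; the next chore run sees a consistent view.
        master.setState("t2", State.ENABLED);
        System.out.println("skipped after create completes: " + master.tablesMissingState());
    }
}
```

In this model the half-created table is simply left out of one chore run and picked up by the next one, which is the kind of handling the window between the two writes seems to call for.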
IMO, this is a normal and correct sequence of events, so it should not produce
an ERROR-level message. It could be avoided by properly handling the
interleaving between table updates and the chore thread.
I am trying to fix it. Any help would be appreciated!