[ https://issues.apache.org/jira/browse/HBASE-28068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rahul Kumar reassigned HBASE-28068:
-----------------------------------

    Assignee: Rahul Kumar

> Normalizer should batch merging 0 sized/empty regions
> -----------------------------------------------------
>
>                 Key: HBASE-28068
>                 URL: https://issues.apache.org/jira/browse/HBASE-28068
>             Project: HBase
>          Issue Type: Improvement
>          Components: Normalizer
>    Affects Versions: 2.5.5
>            Reporter: Ravi Kishore Valeti
>            Assignee: Rahul Kumar
>            Priority: Minor
>             Fix For: 2.6.0, 3.0.0
>
>
> In our production environment, while investigating an issue, we observed that
> the Normalizer had scheduled one single merge procedure on an RS, handing it
> 27K+ empty regions of a table to merge (the empty regions were left behind by
> a failed copy table job).
> This caused the procedure to get stuck, and the procedure framework
> eventually bailed out after ~40 mins. This happened on every normalizer run
> until we deleted the table manually.
>
> Logs
>
> Normalizer triggers a merge procedure:
> normalizer.RegionNormalizerWorker - NormalizationTarget[regionInfo={ENCODED
> => 6e8606335a62f6bafceb017dc7edfdf5, NAME => 'TEST.TEST_TABLE,XXXX.',
> STARTKEY => 'XXXX', ENDKEY => 'YYYY'},*regionSizeMb=0*],
> NormalizationTarget[regionInfo={ENCODED => 79607df308d7618e632abe8a12c1bf6b,
> NAME => 'TEST.TEST_TABLE,XXXX', STARTKEY => 'XXYY', ENDKEY =>
> 'YYZZ'},*regionSizeMb=0*]]] resulting in *pid 21968356*
>
> Procedure immediately gets stuck:
> procedure2.ProcedureExecutor - Worker *stuck* PEWorker-56(pid=21968356), run
> time 12.4850 sec
>
> Finally fails after ~40 mins:
> procedure2.ProcedureExecutor - Worker *stuck* PEWorker-56(pid=21968356), run
> time *40 mins, 58.055 sec*
>
> Bails out with RuntimeException:
> procedure2.ProcedureExecutor - force=false
> java.lang.UnsupportedOperationException: pid=21968356,
> state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, locked=true,
> exception=java.lang.*RuntimeException via CODE-BUG: Uncaught runtime
> exception*: pid=21968356, state=RUNNABLE:MERGE_TABLE_REGIONS_UPDATE_META,
> locked=true; MergeTableRegionsProcedure table=TEST.TEST_TABLEXXXX,
> *regions=[269a1b168af497cce9ba6d3d581568f2*
> .
> .
> .
> .
> *27K+ regions printed here]*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
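
For illustration only, the batching requested in the issue title could look roughly like the sketch below: a long list of empty regions is split into bounded batches so that no single merge procedure receives all 27K+ regions at once. The class MergeBatchSketch, the method partition, and the batch size of 100 are hypothetical names and values chosen for this example; they are not HBase APIs or settings, and this is not necessarily how the issue was resolved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: splits a long list of empty-region
// names into bounded batches so that one merge request never carries them all.
final class MergeBatchSketch {

  // Returns consecutive sublists of at most maxBatchSize elements each.
  static <T> List<List<T>> partition(List<T> items, int maxBatchSize) {
    if (maxBatchSize < 2) {
      // A merge needs at least two regions, so smaller batches make no sense.
      throw new IllegalArgumentException("maxBatchSize must be >= 2");
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < items.size(); start += maxBatchSize) {
      int end = Math.min(start + maxBatchSize, items.size());
      // Copy the view so each batch is independent of the source list.
      batches.add(new ArrayList<>(items.subList(start, end)));
    }
    return batches;
  }

  public static void main(String[] args) {
    List<String> emptyRegions = new ArrayList<>();
    for (int i = 0; i < 27_000; i++) {
      emptyRegions.add("region-" + i); // placeholder names, not real regions
    }
    // Assumed batch size; a real limit would need to be configurable.
    List<List<String>> batches = partition(emptyRegions, 100);
    System.out.println(batches.size() + " batches");
  }
}
```

Note that the final batch may contain a single region, which cannot be merged on its own; a real implementation would have to fold it into the previous batch or leave it for the next normalizer run.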