[
https://issues.apache.org/jira/browse/HBASE-28812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Duo Zhang resolved HBASE-28812.
-------------------------------
Fix Version/s: 3.0.0-beta-2
Hadoop Flags: Reviewed
Resolution: Fixed
Pushed to master and branch-3.
Thanks [~kehan5800] for reporting and verifying.
Thanks [~ndimiduk] for reviewing!
> Upgrade from 2.6.0 to 3.0.0 crashed
> -----------------------------------
>
> Key: HBASE-28812
> URL: https://issues.apache.org/jira/browse/HBASE-28812
> Project: HBase
> Issue Type: Bug
> Components: compatibility
> Affects Versions: 3.0.0
> Reporter: Ke Han
> Assignee: Duo Zhang
> Priority: Major
> Labels: pull-request-available, upgrade
> Fix For: 3.0.0-beta-2
>
> Attachments: hbase--master-2d6e4fad2af5.log,
> hbase--master-440ed844e077.log
>
>
> I am trying to upgrade from 2.6.0 (stable release) to 3.0.0. I built 3.0.0
> using the following commit (a030e8099840e640684a68b6e4a79e7c1d5a6823)
> {code:java}
> commit a030e8099840e640684a68b6e4a79e7c1d5a6823 (HEAD -> branch-3,
> upstream/branch-3)
> Author: Ray Mattingly <[email protected]>
> Date: Mon Sep 2 04:38:29 2024 -0400 HBASE-28697 Don't clean bulk load
> system entries until backup is complete (#6089)
>
> Co-authored-by: Ray Mattingly <[email protected]>
> {code}
> However, the HMaster would crash during the upgrade process.
> h1. Reproduce
> Step1: Start up 2.6.0 cluster (1 HDFS, 1 HM, 1 RS)
> Step2: Stop the entire cluster
> Step3: Upgrade to 3.0.0 cluster.
> HMaster will crash with the following error message
> {code:java}
> 2024-09-04T04:29:18,917 WARN [master/hmaster:16000:becomeActiveMaster]
> regionserver.HRegion: Failed initialize of region=
> master:store,,1.1595e783b53d99cd5eef43b6debb2682., starting to roll back
> memstore
> java.io.IOException: java.io.IOException:
> org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile
> Trailer from file
> hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75
> at
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1215)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1158)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1030)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:974)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7794)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7749)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:277)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:432)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:135)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1003)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155)
> ~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at java.lang.Thread.run(Thread.java:833) ~[?:?]
> Caused by: java.io.IOException:
> org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile
> Trailer from file
> hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75
> at
> org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:289)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:339)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:301)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:6924)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1181)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1178)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> ~[?:?]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> ~[?:?]
> ... 1 more
> Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem
> reading HFile Trailer from file
> hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75
> at
> org.apache.hadoop.hbase.io.hfile.HFileInfo.initTrailerAndContext(HFileInfo.java:359)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.io.hfile.HFileInfo.<init>(HFileInfo.java:132)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.StoreFileInfo.initHFileInfo(StoreFileInfo.java:763)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:395)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:524)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.StoreEngine.createStoreFileAndReader(StoreEngine.java:226)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.StoreEngine.lambda$openStoreFiles$0(StoreEngine.java:267)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> ~[?:?]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> ~[?:?]
> ... 1 more
> Caused by: java.io.IOException: java.lang.ClassNotFoundException:
> org.apache.hadoop.hbase.KeyValue$KVComparator
> at
> org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.getComparatorClass(FixedFileTrailer.java:578)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.deserializeFromPB(FixedFileTrailer.java:304)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.deserialize(FixedFileTrailer.java:250)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:407)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.io.hfile.HFileInfo.initTrailerAndContext(HFileInfo.java:349)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.io.hfile.HFileInfo.<init>(HFileInfo.java:132)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.StoreFileInfo.initHFileInfo(StoreFileInfo.java:763)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:395)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:524)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.StoreEngine.createStoreFileAndReader(StoreEngine.java:226)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.StoreEngine.lambda$openStoreFiles$0(StoreEngine.java:267)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> ~[?:?]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> ~[?:?]
> ... 1 more
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.hbase.KeyValue$KVComparator
> at
> jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
> ~[?:?]
> at
> jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
> ~[?:?]
> at java.lang.ClassLoader.loadClass(ClassLoader.java:520) ~[?:?]
> at java.lang.Class.forName0(Native Method) ~[?:?]
> at java.lang.Class.forName(Class.java:375) ~[?:?]
> at
> org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.getComparatorClass(FixedFileTrailer.java:576)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.deserializeFromPB(FixedFileTrailer.java:304)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.deserialize(FixedFileTrailer.java:250)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:407)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.io.hfile.HFileInfo.initTrailerAndContext(HFileInfo.java:349)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.io.hfile.HFileInfo.<init>(HFileInfo.java:132)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.StoreFileInfo.initHFileInfo(StoreFileInfo.java:763)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:395)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:524)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.StoreEngine.createStoreFileAndReader(StoreEngine.java:226)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at
> org.apache.hadoop.hbase.regionserver.StoreEngine.lambda$openStoreFiles$0(StoreEngine.java:267)
> ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> ~[?:?]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> ~[?:?]
> ... 1 more {code}
> This problem seems to be introduced recently, and I can still upgrade from
> 2.6.0 to 3.0.0 using the previous commits (E.g. commit from May 24:
> 516c89e8597fb6ed391f9e85e594f8b7e5b56e38).
> I have attached the hmaster log.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)