[ https://issues.apache.org/jira/browse/HBASE-28583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ke Han updated HBASE-28583:
---------------------------
    Description:

When migrating data from a 2.5.8 cluster (1 HM, 2 RS, 1 HDFS) to 3.0.0 (1 HM, 2 RS, 2 HDFS), I hit the following exception and the upgrade failed.

{code:java}
2024-05-10T00:54:45,936 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: Failed to become active master
org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: old_table_schema
    at org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:666) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1860) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1019) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
2024-05-10T00:54:45,937 ERROR [master/hmaster:16000:becomeActiveMaster] master.HMaster: ***** ABORTING master hmaster,16000,1715302475720: Unhandled exception. Starting shutdown. *****
org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: old_table_schema
    at org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
    at org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:666) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1860) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1019) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
    at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
{code}

h1. Reproduce

This bug can be reproduced deterministically with the following steps:

Start up an HBase 2.5.8 cluster (1 HM, 2 RS, 1 HDFS: hadoop 2.10.2).

Execute the following commands in the HBase shell:

{code:java}
create 'tb1', {NAME => 'c0', VERSIONS => 1}
snapshot 'tb1', 's1'
disable 'tb1'
restore_snapshot 's1'
{code}

Stop the 2.5.8 cluster, then start up the 3.0.0 cluster (commit: 516c89e8597fb6).

The upgrade fails with the exception above.

h1. Root Cause

This incompatibility between 2.5.8 and 3.0.0 is related to a newly added *required* field in the proto file: _{*}old_table_schema{*}._

2.5.8

{code:java}
hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto

message RestoreSnapshotStateData {
  required UserInformation user_info = 1;
  required SnapshotDescription snapshot = 2;
  required TableSchema modified_table_schema = 3;
  repeated RegionInfo region_info_for_restore = 4;
  repeated RegionInfo region_info_for_remove = 5;
  repeated RegionInfo region_info_for_add = 6;
  repeated RestoreParentToChildRegionsPair parent_to_child_regions_pair_list = 7;
  optional bool restore_acl = 8;
}
{code}

3.0.0 (516c89e8597fb6)

{code:java}
message RestoreSnapshotStateData {
  required UserInformation user_info = 1;
  required SnapshotDescription snapshot = 2;
  required TableSchema modified_table_schema = 3;
  repeated RegionInfo region_info_for_restore = 4;
  repeated RegionInfo region_info_for_remove = 5;
  repeated RegionInfo region_info_for_add = 6;
  repeated RestoreParentToChildRegionsPair parent_to_child_regions_pair_list = 7;
  optional bool restore_acl = 8;
  required TableSchema old_table_schema = 9;
}
{code}

In certain scenarios, such as state data written by 2.5.8 (whose definition has no field 9), the proto message does not contain the old_table_schema field, and the 3.0.0 parser rejects it.

I am wondering whether the *_old_table_schema_* field must be marked as required.

I attached (1) the master log file and (2) all log files in persistent.tar.gz.

I am trying to find out the root cause. I appreciate any suggestions. Thank you!
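To make the suspected failure mode concrete, here is a minimal sketch in plain Java. It is a toy model of a proto2-style required-field check, not the protobuf library and not HBase code: the class RequiredFieldSketch, its methods, and the placeholder field values are all hypothetical. It only assumes that state data written by 2.5.8 carries fields 1 to 8 and never field 9.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a proto2 "required" check. Hypothetical names; NOT the protobuf API. */
final class RequiredFieldSketch {

    /** Returns the names of required fields that are absent from the decoded message. */
    public static List<String> missingRequired(Map<String, Boolean> schema,
                                               Map<String, Object> message) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, Boolean> field : schema.entrySet()) {
            boolean required = field.getValue();
            if (required && !message.containsKey(field.getKey())) {
                missing.add(field.getKey());
            }
        }
        return missing;
    }

    /** Field name -> "is required", mirroring RestoreSnapshotStateData as defined in 3.0.0. */
    public static Map<String, Boolean> schema(boolean oldTableSchemaRequired) {
        Map<String, Boolean> s = new LinkedHashMap<>();
        s.put("user_info", true);
        s.put("snapshot", true);
        s.put("modified_table_schema", true);
        s.put("region_info_for_restore", false);            // repeated
        s.put("region_info_for_remove", false);             // repeated
        s.put("region_info_for_add", false);                // repeated
        s.put("parent_to_child_regions_pair_list", false);  // repeated
        s.put("restore_acl", false);                         // optional
        s.put("old_table_schema", oldTableSchemaRequired);   // field 9, new in 3.0.0
        return s;
    }

    /** Assumption: what a 2.5.8 master persists; field 9 does not exist there. */
    public static Map<String, Object> stateWrittenBy258() {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("user_info", "placeholder");
        m.put("snapshot", "placeholder");
        m.put("modified_table_schema", "placeholder");
        m.put("restore_acl", Boolean.FALSE);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Object> old = stateWrittenBy258();
        // Field 9 declared required, as in 3.0.0 (516c89e8597fb6).
        System.out.println("required: missing = " + missingRequired(schema(true), old));
        // Field 9 declared optional, the alternative being asked about.
        System.out.println("optional: missing = " + missingRequired(schema(false), old));
    }
}
```

In this model the first check reports old_table_schema as missing, which matches the "Message missing required fields" error above, while the second check reports nothing missing. Whether RestoreSnapshotProcedure can work correctly without an old schema on the 3.0.0 side is a separate question that the sketch does not address.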
> Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-28583
>                 URL: https://issues.apache.org/jira/browse/HBASE-28583
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 3.0.0, 2.5.8
>            Reporter: Ke Han
>            Priority: Major
>         Attachments: hbase--master-033a47be7d1d.log, persistent.tar.gz

--
This message was sent by Atlassian Jira
(v8.20.10#820010)