[ https://issues.apache.org/jira/browse/HBASE-19343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16431637#comment-16431637 ]
Pankaj Kumar commented on HBASE-19343:
--------------------------------------

Hi [~yuzhih...@gmail.com], please commit this fix to branch-1.2 as well; I have already attached the patch for it.

> Restore snapshot makes split parent region online
> --------------------------------------------------
>
>                 Key: HBASE-19343
>                 URL: https://issues.apache.org/jira/browse/HBASE-19343
>             Project: HBase
>          Issue Type: Bug
>          Components: snapshots
>            Reporter: Pankaj Kumar
>            Assignee: Pankaj Kumar
>            Priority: Major
>             Fix For: 1.5.0, 1.3.3, 1.4.4
>
>         Attachments: 19343.tst, HBASE-19343-branch-1-v2.patch, HBASE-19343-branch-1.2.patch, HBASE-19343-branch-1.3.patch, HBASE-19343-branch-1.patch, Snapshot.jpg
>
>
> Restoring a snapshot brings the split parent region online, as shown in the attached Snapshot.jpg.
> Steps to reproduce
> =====================
> 1. Create a table
> 2. Insert a few records into the table
> 3. Flush the table
> 4. Split the table
> 5. Create a snapshot before the catalog janitor clears the parent region entry from meta.
> 6. Restore the snapshot
> The problem is visible in the meta entries.
> Meta content before restore snapshot:
> {noformat}
> t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
>   column=info:regioninfo, timestamp=1511537565964, value={ENCODED => 077a12b0b3c91b053fa95223635f9543, NAME => 't1,,1511537529449.077a12b0b3c91b053fa95223635f9543.', STARTKEY => '', ENDKEY => '', OFFLINE => true, SPLIT => true}
> t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
>   column=info:seqnumDuringOpen, timestamp=1511537530107, value=\x00\x00\x00\x00\x00\x00\x00\x02
> t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
>   column=info:server, timestamp=1511537530107, value=host-xx:16020
> t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
>   column=info:serverstartcode, timestamp=1511537530107, value=1511537511523
> t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
>   column=info:splitA, timestamp=1511537565964, value={ENCODED => 3c7c866d4df370c586131a4cbe0ef6a8, NAME => 't1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.', STARTKEY => '', ENDKEY => 'm'}
> t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
>   column=info:splitB, timestamp=1511537565964, value={ENCODED => dc7facd824c85b94e5bf6a2e6b5f5efc, NAME => 't1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.', STARTKEY => 'm', ENDKEY => ''}
> t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
>   column=info:regioninfo, timestamp=1511537566075, value={ENCODED => 3c7c866d4df370c586131a4cbe0ef6a8, NAME => 't1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.', STARTKEY => '', ENDKEY => 'm'}
> t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
>   column=info:seqnumDuringOpen, timestamp=1511537566075, value=\x00\x00\x00\x00\x00\x00\x00\x02
> t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
>   column=info:server, timestamp=1511537566075, value=host-xx:16020
> t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
>   column=info:serverstartcode, timestamp=1511537566075, value=1511537511523
> t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
>   column=info:regioninfo, timestamp=1511537566069, value={ENCODED => dc7facd824c85b94e5bf6a2e6b5f5efc, NAME => 't1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.', STARTKEY => 'm', ENDKEY => ''}
> t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
>   column=info:seqnumDuringOpen, timestamp=1511537566069, value=\x00\x00\x00\x00\x00\x00\x00\x08
> t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
>   column=info:server, timestamp=1511537566069, value=host-xx:16020
> t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
>   column=info:serverstartcode, timestamp=1511537566069, value=1511537511523
> {noformat}
> Meta content after restore snapshot:
> {noformat}
> t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
>   column=info:regioninfo, timestamp=1511537667635, value={ENCODED => 077a12b0b3c91b053fa95223635f9543, NAME => 't1,,1511537529449.077a12b0b3c91b053fa95223635f9543.', STARTKEY => '', ENDKEY => ''}
> t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
>   column=info:seqnumDuringOpen, timestamp=1511537667635, value=\x00\x00\x00\x00\x00\x00\x00\x0A
> t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
>   column=info:server, timestamp=1511537667635, value=host-xx:16020
> t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
>   column=info:serverstartcode, timestamp=1511537667635, value=1511537511523
> t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
>   column=info:regioninfo, timestamp=1511537667598, value={ENCODED => 3c7c866d4df370c586131a4cbe0ef6a8, NAME => 't1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.', STARTKEY => '', ENDKEY => 'm'}
> t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
>   column=info:seqnumDuringOpen, timestamp=1511537667598, value=\x00\x00\x00\x00\x00\x00\x00\x0B
> t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
>   column=info:server, timestamp=1511537667598, value=host-xx:16020
> t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
>   column=info:serverstartcode, timestamp=1511537667598, value=1511537511523
> t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
>   column=info:regioninfo, timestamp=1511537667621, value={ENCODED => dc7facd824c85b94e5bf6a2e6b5f5efc, NAME => 't1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.', STARTKEY => 'm', ENDKEY => ''}
> t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
>   column=info:seqnumDuringOpen, timestamp=1511537667621, value=\x00\x00\x00\x00\x00\x00\x00\x0D
> t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
>   column=info:server, timestamp=1511537667621, value=host-xx:16020
> t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
>   column=info:serverstartcode, timestamp=1511537667621, value=1511537511523
> {noformat}
> Root Cause:
> We don't update the region split information in the .regioninfo file in HDFS, but
> while restoring the snapshot we set regioninfo based on the .regioninfo
> entries:
> {code}
> // Identify which region are still available and which not.
> // NOTE: we rely upon the region name as: "table name, start key, end key"
> List<HRegionInfo> tableRegions = getTableRegions();
> if (tableRegions != null) {
>   monitor.rethrowException();
>   for (HRegionInfo regionInfo: tableRegions) {
>     String regionName = regionInfo.getEncodedName();
>     if (regionNames.contains(regionName)) {
>       LOG.info("region to restore: " + regionName);
>       regionNames.remove(regionName);
>       metaChanges.addRegionToRestore(regionInfo);
>     } else {
>       LOG.info("region to remove: " + regionName);
>       metaChanges.addRegionToRemove(regionInfo);
>     }
>   }
> {code}
> Here the list returned by getTableRegions() is read from HDFS, so the parent
> region comes back without its OFFLINE and SPLIT flags.
> There can be two solutions:
> 1. Set the regioninfo based on the snapshot-manifest details.
> 2. Update the .regioninfo after region split.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
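
Option 1 from the quoted root-cause analysis could be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached patch: `RestoreSketch`, `RegionState`, and `regionsToRestore` are hypothetical names, `RegionState` is a simplified stand-in for `HRegionInfo`, and the sketch assumes the snapshot manifest still carries the split/offline flags that the on-disk .regioninfo lacks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RestoreSketch {
    /** Hypothetical, simplified stand-in for HRegionInfo; not the HBase class. */
    static final class RegionState {
        final String encodedName;
        final boolean split;   // true for a split parent
        final boolean offline; // true when the region should stay unassigned

        RegionState(String encodedName, boolean split, boolean offline) {
            this.encodedName = encodedName;
            this.split = split;
            this.offline = offline;
        }
    }

    /**
     * For each region found on disk, returns the state taken from the snapshot
     * manifest (assumed to keep the split/offline flags) instead of the on-disk
     * state. Regions that are not in the manifest are appended to toRemove.
     */
    static List<RegionState> regionsToRestore(List<RegionState> onDisk,
                                              Map<String, RegionState> manifest,
                                              List<RegionState> toRemove) {
        List<RegionState> toRestore = new ArrayList<>();
        for (RegionState disk : onDisk) {
            RegionState fromManifest = manifest.get(disk.encodedName);
            if (fromManifest != null) {
                toRestore.add(fromManifest); // flags come from the manifest
            } else {
                toRemove.add(disk);
            }
        }
        return toRestore;
    }

    public static void main(String[] args) {
        String parent = "077a12b0b3c91b053fa95223635f9543";
        String daughterA = "3c7c866d4df370c586131a4cbe0ef6a8";
        String daughterB = "dc7facd824c85b94e5bf6a2e6b5f5efc";

        // On-disk view: the parent's split flags were never persisted.
        List<RegionState> onDisk = Arrays.asList(
            new RegionState(parent, false, false),
            new RegionState(daughterA, false, false),
            new RegionState(daughterB, false, false));

        // Manifest view (assumption): the parent is recorded as split and offline.
        Map<String, RegionState> manifest = new HashMap<>();
        manifest.put(parent, new RegionState(parent, true, true));
        manifest.put(daughterA, new RegionState(daughterA, false, false));
        manifest.put(daughterB, new RegionState(daughterB, false, false));

        List<RegionState> toRemove = new ArrayList<>();
        for (RegionState r : regionsToRestore(onDisk, manifest, toRemove)) {
            System.out.println(r.encodedName + " split=" + r.split + " offline=" + r.offline);
        }
    }
}
```

With the state taken from the manifest, the parent region would keep its split and offline markers in the restored meta entries instead of being treated as an ordinary online region.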