[jira] [Comment Edited] (HBASE-22567) HBCK2 addMissingRegionsToMeta
    [ https://issues.apache.org/jira/browse/HBASE-22567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917388#comment-16917388 ]

stack edited comment on HBASE-22567 at 8/28/19 4:37 AM:

What you want to do w/ this [~wchevreuil]? I think we should finish this up, get it in, and then in the fixMeta that is also coming in, suggest that this action be run first since it can connect holes in meta to dirs in filesystem (the fixMeta in HBASE-22771 only papers over the holes making no effort at trying to figure if info in fs).

was (Author: stack):
    What you want to do w/ this [~wchevreuil]?

> HBCK2 addMissingRegionsToMeta
> -----------------------------
>
>                 Key: HBASE-22567
>                 URL: https://issues.apache.org/jira/browse/HBASE-22567
>             Project: HBase
>          Issue Type: New Feature
>          Components: hbck2
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major
>
> Following latest discussion on HBASE-21745, this proposes an hbck2 command
> that allows for inserting back regions missing in META that still have
> *regioninfo* available in HDFS. Although this is still an interactive and
> simpler version than the old _OfflineMetaRepair_, it still relies on hdfs
> state as the source of truth, and performs META updates mostly independently
> from Master (apart from requiring Meta table been online).
> For a more detailed explanation on this command behaviour, pasting _command
> usage_ text:
> {noformat}
> To be used for scenarios where some regions may be missing in META,
> but there's still a valid 'regioninfo' metadata file on HDFS.
> This is a lighter version of 'OfflineMetaRepair' tool commonly used for
> similar issues on 1.x release line.
> This command needs META to be online. For each table name passed as
> parameter, it performs a diff between regions available in META,
> against existing regions dirs on HDFS. Then, for region dirs with
> no matches in META, it reads regioninfo metadata file and
> re-creates given region in META. Regions are re-created in 'CLOSED'
> state at META table only, but not in Masters' cache, and are not
> assigned either. A rolling Masters restart, followed by a
> hbck2 'assigns' command with all re-inserted regions is required.
> This hbck2 'assigns' command is printed for user convenience.
> WARNING: To avoid potential region overlapping problems due to ongoing
> splits, this command disables given tables while re-inserting regions.
> An example adding missing regions for tables 'table_1' and 'table_2':
> $ HBCK2 addMissingRegionsInMeta table_1 table_2
> Returns hbck2 'assigns' command with all re-inserted regions.{noformat}

--
This message was sent by Atlassian Jira (v8.3.2#803003)
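
For readers following the usage text quoted above, the core of the command is a set difference: region dirs found on HDFS minus regions already listed in META. The sketch below shows only that diff plus the building of a follow-up 'assigns' line. It is not the code from the PR. The class name, the method names and the exact shape of the printed command are assumptions made for illustration, and plain strings stand in for encoded region names. It does not talk to HBase or HDFS at all.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names come from HBCK2.
class MissingRegionsSketch {

    /** Returns the region dir names present on the filesystem but absent from META, sorted. */
    static List<String> missingInMeta(Set<String> regionsInMeta, Set<String> regionDirsOnFs) {
        Set<String> missing = new HashSet<>(regionDirsOnFs);
        missing.removeAll(regionsInMeta); // keep only dirs with no match in META
        List<String> sorted = new ArrayList<>(missing);
        Collections.sort(sorted);
        return sorted;
    }

    /** Builds an 'assigns' argument line for the re-inserted regions (format is assumed). */
    static String assignsCommand(List<String> reinsertedRegions) {
        if (reinsertedRegions.isEmpty()) {
            return "";
        }
        return "assigns " + String.join(" ", reinsertedRegions);
    }
}
```

In the real command, each missing region would also need its regioninfo file read from the region dir and a row written to META in 'CLOSED' state; those steps depend on HBase APIs and are deliberately left out here.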
[jira] [Comment Edited] (HBASE-22567) HBCK2 addMissingRegionsToMeta
    [ https://issues.apache.org/jira/browse/HBASE-22567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896609#comment-16896609 ]

stack edited comment on HBASE-22567 at 7/31/19 9:35 PM:

-[~wchevreuil] One thing I've noticed working on fixing holes is that after scanning to find SINGLE REGION WIDE holes and then using the hole edges to create the regioninfo that covers the hole, if I use the new RegionInfo to try and create the corresponding directory in the filesystem, it fails because the directory already exists (with any data it may have had in it). I've made it so that if the directory exists we just use it rather than create a new one. This works for the case of a single region missing from table but it won't work for the case covered here where a bunch of contiguous regions are missing from hbase:meta but their data is in the fs; in this latter case, the tool added here will restore data and fill holes. Just something I noticed...-

Above didn't seem right and it isn't. I had a bug in my fill holes code. Ignore the above assertion.

was (Author: stack):
    [~wchevreuil] One thing I've noticed working on fixing holes is that after scanning to find SINGLE REGION WIDE holes and then using the hole edges to create the regioninfo that covers the hole, if I use the new RegionInfo to try and create the corresponding directory in the filesystem, it fails because the directory already exists (with any data it may have had in it). I've made it so that if the directory exists we just use it rather than create a new one. This works for the case of a single region missing from table but it won't work for the case covered here where a bunch of contiguous regions are missing from hbase:meta but their data is in the fs; in this latter case, the tool added here will restore data and fill holes. Just something I noticed...
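
As a rough picture of what "scanning to find holes" and "using the hole edges" can mean, here is a minimal sketch that walks regions sorted by start key and reports the uncovered key ranges. It is not the fill-holes code discussed in the comment. The Range type and findHoles method are hypothetical, and String keys stand in for HBase's byte[] row keys, with an empty string meaning table start or table end.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not taken from hbck1 or hbck2.
class RegionHolesSketch {

    /** A key range [start, end); empty start = table start, empty end = table end. */
    static final class Range {
        final String start;
        final String end;

        Range(String start, String end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /** Returns the key ranges not covered by any of the given regions. */
    static List<Range> findHoles(List<Range> regions) {
        List<Range> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparing((Range r) -> r.start));
        List<Range> holes = new ArrayList<>();
        String covered = "";        // every key below this one is covered so far
        boolean reachedEnd = false; // set once a region with an empty end key is seen
        for (Range r : sorted) {
            if (reachedEnd) {
                break;
            }
            if (r.start.compareTo(covered) > 0) {
                // The hole edges are the previous end key and this start key.
                holes.add(new Range(covered, r.start));
            }
            if (r.end.isEmpty()) {
                reachedEnd = true;
            } else if (r.end.compareTo(covered) > 0) {
                covered = r.end;
            }
        }
        if (!reachedEnd) {
            holes.add(new Range(covered, "")); // nothing covers the tail of the table
        }
        return holes;
    }
}
```

Each returned range carries the start and end key a covering region would need. Whether a matching directory already exists in the filesystem is a separate check that this sketch does not attempt.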
[jira] [Comment Edited] (HBASE-22567) HBCK2 addMissingRegionsToMeta
    [ https://issues.apache.org/jira/browse/HBASE-22567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883895#comment-16883895 ]

stack edited comment on HBASE-22567 at 7/12/19 5:23 PM:

I defer to your support experience as to how best to progress since you know better what is needed.

I see the main differences as:

* online vs offline repair
* total replace of meta vs targeted surgery

I'd think that an operator would try the tool here first. It would ideally work for 99% of cases. For the remainder, there would be total rebuild of meta? This tool could grow to obsolete OMR. That'd be good all around.

I went back to hbck1 classes after poking around trying to figure how we'd implement fixHoles, fixOverlaps as well as rebuild, and general reporting. It seemed to me that a bunch of basic infrastructure had to be built up first before we do fixHoles, reporting, etc. Rather than try and write from scratch, I figured I would try and exploit what already exists. hbck1 has those data structures that hold a bunch of state -- who has what, where the info was gleaned from (hdfs or meta) and so on -- which seemed useful as building blocks for fixup. hbck1 is in violation of hbase2/hbck2 principles in many regards but at least when it comes to rebuild of meta, it was adaptable. Was going to look at trying to make use of other facility in hbck1 in follow-on issues if possible.

Yeah, OfflineMetaRepair from hbck1 kills the meta as far as hbase2 is concerned. Let me say so on [~brfrn169]'s discussion thread.

I'm game for chatting offline anytime if that would help here. Meantime let me go back and finish review of the PR.

was (Author: stack):
    I defer to your support experience as to how best to progress since you know better what is needed.

I see the main differences as:

* online vs offline repair
* total replace of meta vs targeted surgery

I went back to hbck1 classes after poking around trying to figure how we'd implement fixHoles, fixOverlaps as well as rebuild, and general reporting. It seemed to me that a bunch of basic infrastructure had to be built up first before we do fixHoles, reporting, etc. Rather than try and write from scratch, I figured I would try and exploit what already exists. hbck1 has those data structures that hold a bunch of state -- who has what, where the info was gleaned from (hdfs or meta) and so on -- which seemed useful as building blocks for fixup. hbck1 is in violation of hbase2/hbck2 principles in many regards but at least when it comes to rebuild of meta, it was adaptable. Was going to look at trying to make use of other facility in hbck1 in follow-on issues if possible.

Yeah, OfflineMetaRepair from hbck1 kills the meta as far as hbase2 is concerned. Let me say so on [~brfrn169]'s discussion thread.

I'm game for chatting offline anytime if that would help here. Meantime let me go back and finish review of the PR.
[jira] [Comment Edited] (HBASE-22567) HBCK2 addMissingRegionsToMeta
    [ https://issues.apache.org/jira/browse/HBASE-22567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883529#comment-16883529 ]

stack edited comment on HBASE-22567 at 7/12/19 5:10 AM:

[~wchevreuil] I added an OfflineMetaRepair to hbck over in HBASE-22680. It is based on hbck1 implementation trying to exploit already-written code being selective about what is revealed from hbck1 in hbck2. I see the extant hbck1 tooling as useful implementing not only OfflineMetaRepair but also other items listed up in HBASE-21745: e.g. 'enumerate store files to determine file level corruption and sideline corrupt files' and `fix hfile link problems (dangling / broken)`.

[~busbey] asks in review of HBASE-22680 if there is overlap with the work here. Let me review your latest but I don't think so (correct me if I'm wrong). The patch here works against an online meta selectively making fixes, posing as "...a lighter version of 'OfflineMetaRepair'". HBASE-22680 does a brute, wholesale rewrite of meta w/ the cluster offline. When I asked above, you said this patch could perhaps evolve to subsume OfflineMetaRepair. When that is the case, we could remove HBASE-22680 -- or keep it for when a radical cure is wanted. What you reckon?

On [~daisuke.kobayashi]'s note, that table state is migrated from zk on startup, my comments above were backed by a misunderstanding on my part (only table 'state' is repopulated; I thought it more than this). My comments above probably confuse because of this. Pardon me (D and W). Having been in the code recently, I was reminded that the mirroring of table state to zk was supposed to be turned off; the zk table state mirroring is so hbase1 clients can work against hbase2. So, yeah, as you say above, you probably want to assume control over what state the table is in and make sure the state is populated.

was (Author: stack):
    [~wchevreuil] I added an OfflineMetaRepair to hbck over in HBASE-22680. It is based on hbck1 implementation trying to exploit already-written code being selective about what is revealed from hbck1 in hbck2. I see the extant hbck1 tooling as useful implementing not only OfflineMetaRepair but also other items listed up in HBASE-21745: e.g. 'enumerate store files to determine file level corruption and sideline corrupt files' and `fix hfile link problems (dangling / broken)`.

[~busbey] asks in review of HBASE-22680 if there is overlap with the work here. Let me review your latest but I don't think so (correct me if I'm wrong). The patch here works against an online meta selectively posing as "a lighter version of 'OfflineMetaRepair'". HBASE-22680 does a wholesale rewrite of meta w/ the cluster offline. When I asked above, you said this patch could perhaps evolve to subsume OfflineMetaRepair. When that is the case, we could remove HBASE-22680. What you reckon?

On [~daisuke.kobayashi]'s note, that table state is migrated from zk on startup, my comments above were backed by a misunderstanding on my part (only table 'state' is repopulated; I thought it more than this). My comments above probably confuse because of this. Having been in the code recently, I was reminded that the mirroring of table state to zk was supposed to be turned off; the zk table state mirroring is so hbase1 clients can work against hbase2.
[jira] [Comment Edited] (HBASE-22567) HBCK2 addMissingRegionsToMeta
    [ https://issues.apache.org/jira/browse/HBASE-22567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880125#comment-16880125 ]

Wellington Chevreuil edited comment on HBASE-22567 at 7/8/19 8:40 AM:

{quote}Maybe we doc the Daisuke Kobayashi finding over on the hbck page? Suggest restart of master as way to rebuild meta if issue? Perhaps then we'd add the 'reader' part of your patch?{quote}
Sounds all good for me! Given this jira was specific for the new command, and the given PR is already a bit large, maybe worth doing this extra doc work on a separate jira?

was (Author: wchevreuil):
    {quote}Maybe we doc the Daisuke Kobayashi finding over on the hbck page? Suggest restart of master as way to rebuild meta if issue? Perhaps then we'd add the 'reader' part of your patch?{quote}
Sounds all good for me! Given this jira was specific for the new command, and the given PR is already a bit large, maybe worth doing it on a separate jira?