[jira] [Comment Edited] (HBASE-22567) HBCK2 addMissingRegionsToMeta

2019-08-27 Thread stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917388#comment-16917388
 ] 

stack edited comment on HBASE-22567 at 8/28/19 4:37 AM:


What do you want to do w/ this, [~wchevreuil]?

I think we should finish this up, get it in, and then, in the fixMeta that is 
also coming in, suggest that this action be run first since it can connect 
holes in meta to dirs in the filesystem (the fixMeta in HBASE-22771 only papers 
over the holes, making no effort to figure out whether the info is in the fs).



> HBCK2 addMissingRegionsToMeta
> -
>
> Key: HBASE-22567
> URL: https://issues.apache.org/jira/browse/HBASE-22567
> Project: HBase
>  Issue Type: New Feature
>  Components: hbck2
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>
> Following the latest discussion on HBASE-21745, this proposes an hbck2 command 
> that allows for inserting back regions missing in META that still have 
> *regioninfo* available in HDFS. Although this is an interactive and 
> simpler version of the old _OfflineMetaRepair_, it still relies on hdfs 
> state as the source of truth, and performs META updates mostly independently 
> from Master (apart from requiring the Meta table to be online).
> For a more detailed explanation of this command's behaviour, here is the _command 
> usage_ text:
> {noformat}
> To be used for scenarios where some regions may be missing in META,
> but there's still a valid 'regioninfo' metadata file on HDFS.
> This is a lighter version of 'OfflineMetaRepair' tool commonly used for
> similar issues on 1.x release line.
> This command needs META to be online. For each table name passed as
> parameter, it performs a diff between regions available in META,
> against existing regions dirs on HDFS. Then, for region dirs with
> no matches in META, it reads regioninfo metadata file and
> re-creates given region in META. Regions are re-created in 'CLOSED'
> state at META table only, but not in Masters' cache, and are not
> assigned either. A rolling Masters restart, followed by a
> hbck2 'assigns' command with all re-inserted regions is required.
> This hbck2 'assigns' command is printed for user convenience.
> WARNING: To avoid potential region overlapping problems due to ongoing
> splits, this command disables given tables while re-inserting regions.
> An example adding missing regions for tables 'table_1' and 'table_2':
> $ HBCK2 addMissingRegionsInMeta table_1 table_2
> Returns hbck2 'assigns' command with all re-inserted regions.{noformat}
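
The diff step described in the usage text above can be pictured as a set difference between region directories found on the filesystem and regions listed in META. The following is only a minimal sketch of that idea, not the HBCK2 implementation: the function name and the sample encoded region names are hypothetical, and the real tool reads this state from HDFS and hbase:meta rather than from in-memory lists.

```python
def regions_missing_in_meta(fs_region_dirs, meta_regions):
    """Return encoded region names that have a directory on the
    filesystem but no matching entry in META, in sorted order."""
    return sorted(set(fs_region_dirs) - set(meta_regions))


# Hypothetical sample data: encoded region names seen in each place.
fs_dirs = ["1a2b", "3c4d", "5e6f"]
in_meta = ["1a2b", "5e6f"]

missing = regions_missing_in_meta(fs_dirs, in_meta)

# Illustrative only: the usage text says the tool prints an hbck2
# 'assigns' command listing the re-inserted regions.
print("assigns " + " ".join(missing))
```

Only regions with a directory but no META entry are reported; regions present in META without a directory are outside the scope of this sketch.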



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Comment Edited] (HBASE-22567) HBCK2 addMissingRegionsToMeta

2019-07-31 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896609#comment-16896609
 ] 

stack edited comment on HBASE-22567 at 7/31/19 9:35 PM:


-[~wchevreuil] One thing I've noticed working on fixing holes is that, 
after scanning to find SINGLE REGION WIDE holes and then using the hole edges 
to create the regioninfo that covers the hole, if I use the new RegionInfo to 
try and create the corresponding directory in the filesystem, it fails because 
the directory already exists (with any data it may have had in it). I've made 
it so that if the directory exists we just use it rather than create a new one. This 
works for the case of a single region missing from a table but it won't work for 
the case covered here where a bunch of contiguous regions are missing from 
hbase:meta but their data is in the fs; in this latter case, the tool added 
here will restore data and fill holes. Just something I noticed...-

Above didn't seem right and it isn't. I had a bug in my fill-holes code. Ignore 
the above assertion.
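
For context, scanning for holes as mentioned in the struck-through comment amounts to walking a table's regions in start-key order and reporting any gap between one region's end key and the next region's start key. Below is a rough sketch under stated assumptions: the function name is hypothetical, keys are plain Python strings compared lexicographically (HBase compares row keys as bytes), the empty string stands for the table's first start key and last end key, and overlaps are ignored.

```python
def find_holes(regions):
    """Given (start_key, end_key) pairs for one table, return the
    (start_key, end_key) ranges that no region covers.
    An empty string means 'start of table' or 'end of table'."""
    holes = []
    prev_end = ""        # the table begins at the empty key
    reached_end = False  # True once a region with an empty end key is seen
    for start, end in sorted(regions):
        if reached_end:
            break
        if start > prev_end:
            # Gap between the previous region's end and this region's start.
            holes.append((prev_end, start))
        if end == "":
            reached_end = True
        elif end > prev_end:
            prev_end = end
    if not reached_end:
        # Nothing covers the tail of the table.
        holes.append((prev_end, ""))
    return holes


# Hypothetical table whose region covering [d, f) is missing.
print(find_holes([("", "b"), ("b", "d"), ("f", "")]))
```

The hole edges returned here are the values that would be used as the start and end keys of a region covering the hole.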





[jira] [Comment Edited] (HBASE-22567) HBCK2 addMissingRegionsToMeta

2019-07-12 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883895#comment-16883895
 ] 

stack edited comment on HBASE-22567 at 7/12/19 5:23 PM:


I defer to your support experience as to how best to progress since you know 
better what is needed.

I see the main differences as:

 * online vs offline repair
 * total replace of meta vs targeted surgery

I'd think that an operator would try the tool here first. It would ideally work 
for 99% of cases. For the remainder, there would be a total rebuild of meta? This 
tool could grow to obsolete OMR. That'd be good all around.

I went back to hbck1 classes after poking around trying to figure out how we'd 
implement fixHoles, fixOverlaps as well as rebuild, and general reporting. It 
seemed to me that a bunch of basic infrastructure had to be built up first 
before we do fixHoles, reporting, etc. Rather than try and write from scratch, 
I figured I would try and exploit what already exists. hbck1 has those 
data structures that hold a bunch of state -- who has what, where the info was 
gleaned from (hdfs or meta) and so on -- which seemed useful as building blocks 
for fixup. hbck1 is in violation of hbase2/hbck2 principles in many regards but 
at least when it comes to rebuild of meta, it was adaptable. Was going to look 
at trying to make use of other facilities in hbck1 in follow-on issues if 
possible.

Yeah, OfflineMetaRepair from hbck1 kills the meta as far as hbase2 is 
concerned. Let me say so on [~brfrn169]'s discussion thread. I'm game for 
chatting offline anytime if that would help here. Meantime let me go back and 
finish review of the PR.




[jira] [Comment Edited] (HBASE-22567) HBCK2 addMissingRegionsToMeta

2019-07-11 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883529#comment-16883529
 ] 

stack edited comment on HBASE-22567 at 7/12/19 5:10 AM:


[~wchevreuil] I added an OfflineMetaRepair to hbck over in HBASE-22680. It is 
based on the hbck1 implementation, trying to exploit already-written code while 
being selective about what is revealed from hbck1 in hbck2. I see the extant hbck1 
tooling as useful for implementing not only OfflineMetaRepair but also other items 
listed up in HBASE-21745: e.g. 'enumerate store files to determine file level 
corruption and sideline corrupt files' and `fix hfile link problems (dangling / 
broken)`.

[~busbey] asks in review of HBASE-22680 whether it overlaps with the work here. Let me 
review your latest but I don't think so (correct me if I'm wrong). The patch 
here works against an online meta, selectively making fixes, posing as "a 
lighter version of 'OfflineMetaRepair'". HBASE-22680 does a brute, wholesale 
rewrite of meta w/ the cluster offline.

When I asked above, you said this patch could perhaps evolve to subsume 
OfflineMetaRepair. When that is the case, we could remove HBASE-22680 -- or 
keep it for when a radical cure is wanted. What do you reckon?

On [~daisuke.kobayashi]'s note that table state is migrated from zk on 
startup: my comments above were backed by a misunderstanding on my part (only 
table 'state' is repopulated; I thought it was more than this). My comments above 
probably confuse because of this. Pardon me (D and W). Having been in the code 
recently, I was reminded that the mirroring of table state to zk was supposed 
to be turned off; the zk table state mirroring is so hbase1 clients can work 
against hbase2. So, yeah, as you say above, you probably want to assume 
control over what state the table is in and make sure the state is populated.


was (Author: stack):
[~wchevreuil] I added an OfflineMetaRepair to hbck over in HBASE-22680. It is 
based on the hbck1 implementation, trying to exploit already-written code while 
being selective about what is revealed from hbck1 in hbck2. I see the extant hbck1 
tooling as useful for implementing not only OfflineMetaRepair but also other items 
listed up in HBASE-21745: e.g. 'enumerate store files to determine file level 
corruption and sideline corrupt files' and `fix hfile link problems (dangling / 
broken)`.

[~busbey] asks in review of HBASE-22680 whether it overlaps with the work here. Let me 
review your latest but I don't think so (correct me if I'm wrong). The patch 
here works against an online meta, selectively posing as "a lighter version of 
'OfflineMetaRepair'". HBASE-22680 does a wholesale rewrite of meta w/ the 
cluster offline.

When I asked above, you said this patch could perhaps evolve to subsume 
OfflineMetaRepair. When that is the case, we could remove HBASE-22680. What do you 
reckon?

On [~daisuke.kobayashi]'s note that table state is migrated from zk on 
startup: my comments above were backed by a misunderstanding on my part (only 
table 'state' is repopulated; I thought it was more than this). My comments above 
probably confuse because of this. Having been in the code recently, I was 
reminded that the mirroring of table state to zk was supposed to be turned off; 
the zk table state mirroring is so hbase1 clients can work against hbase2.


[jira] [Comment Edited] (HBASE-22567) HBCK2 addMissingRegionsToMeta

2019-07-08 Thread Wellington Chevreuil (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880125#comment-16880125
 ] 

Wellington Chevreuil edited comment on HBASE-22567 at 7/8/19 8:40 AM:
--

{quote}Maybe we doc the Daisuke Kobayashi finding over on the hbck page? 
Suggest restart of master as way to rebuild meta if issue? Perhaps then we'd 
add the 'reader' part of your patch?{quote}
Sounds all good to me! Given this jira was specific to the new command, and 
the given PR is already a bit large, maybe it's worth doing this extra doc work in a 
separate jira?


was (Author: wchevreuil):
{quote}Maybe we doc the Daisuke Kobayashi finding over on the hbck page? 
Suggest restart of master as way to rebuild meta if issue? Perhaps then we'd 
add the 'reader' part of your patch?{quote}
Sounds all good to me! Given this jira was specific to the new command, and 
the given PR is already a bit large, maybe it's worth doing it in a separate jira?
