[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-11-07 Thread Nicholas Jiang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677906#comment-16677906
 ] 

Nicholas Jiang commented on HBASE-19121:


[~stack] I have a question about holes, based on the HBCK2 tool 
[https://github.com/apache/hbase-operator-tools/tree/master/hbase-hbck2]; see 
https://issues.apache.org/jira/browse/HBASE-21447

> HBCK for AMv2 (A.K.A HBCK2)
> ---
>
> Key: HBASE-19121
> URL: https://issues.apache.org/jira/browse/HBASE-19121
> Project: HBase
>  Issue Type: Umbrella
>  Components: hbck, hbck2
>Reporter: stack
>Assignee: Umesh Agashe
>Priority: Major
> Fix For: hbck2-1.0.0
>
> Attachments: hbase-19121.master.001.patch
>
>
> We don't have an hbck for the new AM. Old hbck may actually do damage going 
> against AMv2.
> Fix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-11-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677907#comment-16677907
 ] 

Hadoop QA commented on HBASE-19121:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} | {color:red} HBASE-19121 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.8.0/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-19121 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927372/hbase-19121.master.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14979/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.





[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-29 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667472#comment-16667472
 ] 

stack commented on HBASE-19121:
---

Thank you [~tianjingyun]



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-28 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1724#comment-1724
 ] 

Jingyun Tian commented on HBASE-19121:
--

Sounds like we would need to fetch the regions in every problematic state, for 
all tables, to get a full list? I think adding a tab to the navigation bar that 
dumps the RIT as a table, viewable as plain text, could be easier to use.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1702#comment-1702
 ] 

stack commented on HBASE-19121:
---

It's as you state: if it is in the UI, it is always available to the operator, 
but yeah, if the UI is not up, then the operator is stuck. Perhaps we work on 
making sure the UI is always available?

I was thinking that the operator could click on the OPENING count in the tables 
panel of the UI and get a page that lists all the regions in OPENING. Then the 
same for OPEN, CLOSED, CLOSING? I made a start a while back but didn't get far. 
It would be useful for the operator, who could use curl or wget or lynx to get 
the list.




[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-28 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1689#comment-1689
 ] 

Jingyun Tian commented on HBASE-19121:
--

[~stack] I'm planning to build the two tools your doc already mentions:
# dump a list of stuck procedures as txt.
# dump a list of RIT as txt.

Should we build these tools into the Master UI or into the Canary tool? 
Building them into the Master UI makes them easier for the operator to use, but 
they are unavailable if the Master UI is not up (this situation should be 
rare?). 
Please let me know your thoughts.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-25 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664571#comment-16664571
 ] 

Jingyun Tian commented on HBASE-19121:
--

Yes. Maybe we can hide these in one single method.

Sounds good. I'll try to help:D.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-25 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16663952#comment-16663952
 ] 

stack commented on HBASE-19121:
---

OK on your 1-3. Try to hide how decisions are made inside a single method?

Yes, I was thinking of adding the facility to the Canary. Let me know if you 
think otherwise.




[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-25 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16663338#comment-16663338
 ] 

Jingyun Tian commented on HBASE-19121:
--

{quote}Queu'ing an SCP is not enough (IIRC) because we don't have the list of 
what regions were on that old dead server so when the SCP goes to do assigns, 
it'll have an empty queue.
{quote}
Yes. That's why I proposed porting onlineConsistencyRepair. I think the steps 
to fix these inconsistencies could be as follows:
# Use tools to find the problematic regions (inconsistencies between META and 
the regionservers).
# Check whether the WAL directories of the problematic regionservers have the 
-splitting suffix. If so, an SCP needs to be scheduled to split the logs.
# After the SCP is done, reassign the problematic regions found in step 1.
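
Step 2 of the list above can be sketched as a pure function. This is a minimal 
illustration and not hbck2 code: the class and method names are hypothetical, 
the server names are made up, and it assumes the WAL directory names have 
already been listed from the filesystem and that a directory being split is 
named after its server plus the -splitting suffix.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only; not part of HBase or hbck2. */
final class SplittingWalCheck {
  // Suffix mentioned in step 2 for WAL directories that are being split.
  static final String SPLITTING_SUFFIX = "-splitting";

  /**
   * Returns the suspect servers whose WAL directory name ends in -splitting,
   * i.e. the servers for which an SCP would still have to split logs.
   */
  static Set<String> serversNeedingScp(Set<String> suspectServers, List<String> walDirNames) {
    Set<String> result = new TreeSet<>();
    for (String dir : walDirNames) {
      if (!dir.endsWith(SPLITTING_SUFFIX)) {
        continue; // this WAL directory is not mid-split
      }
      String server = dir.substring(0, dir.length() - SPLITTING_SUFFIX.length());
      if (suspectServers.contains(server)) {
        result.add(server);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Made-up server names in host,port,startcode form.
    Set<String> suspects = Set.of("rs1,16020,1540000000001", "rs2,16020,1540000000002");
    List<String> walDirs = List.of(
        "rs1,16020,1540000000001-splitting",
        "rs2,16020,1540000000002",
        "rs3,16020,1540000000003-splitting");
    System.out.println(serversNeedingScp(suspects, walDirs));
  }
}
```

Servers returned here would get an SCP first; the regions from step 1 would be 
reassigned only after that SCP completes.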

bq. Adding some functionality to the Canary where it recognizes that the server 
is not online, is not in dead servers, and perhaps has no WALs on fs, might be 
the way to go? You'd add a flag for it to actually act on any Regions it found 
that were in the 'wrong' state? Its sort of built to do this sort of review of 
the cluster?

Do you mean letting the Canary tool check whether a region is OPEN in META but 
its regionserver cannot be found? Then the Canary gathers all this information 
and we operators fix the problems based on it?

Thanks.




[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-24 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16663223#comment-16663223
 ] 

stack commented on HBASE-19121:
---

bq. These regions are recorded as OPEN on that crashed regionserver in META, 
that's why I need a tool to help find all these regions that are OPEN in META 
but actually not alive anymore. 

Smile. It's the first item in my list of things we need here: 
https://docs.google.com/document/d/1Y0HIo5yRGXi7nl-JWc69JtxB87fYE-jXe8nBe7HWKe0/edit#heading=h.awq9l5odz77e

I was using the Canary to find these. I'd then unassign the Region -- which 
triggers an SCP for this deadserver -- and then after, I'd do a re-assign. 
Usually this runs smoothly unless something else holds a lock on the Region 
entity, whether directly or on the containing Table.

A tool to scan hbase:meta looking for servers that are not online and are not 
deadservers might be good. What would it do w/ the info? Queuing an SCP is 
not enough (IIRC) because we don't have the list of what regions were on that 
old dead server, so when the SCP goes to do assigns, it'll have an empty queue. 
Doing an unassign on each of these Regions will trigger a sort of useless SCP 
-- unless we determine it to be a long dead and gone server -- though if there 
are WALs to be split, it'll split them. Otherwise, these SCPs will mostly be 
noops. I'd be interested in any thoughts you have here [~tianjingyun].
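
The scan described above might look roughly like the following. It is only a 
sketch under stated assumptions: the region-to-server map stands in for what a 
scan of hbase:meta would return, the online and dead server sets stand in for 
what the Master knows, and every name in the code is hypothetical. Grouping by 
server also yields the per-server region list that a bare SCP would be missing.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Illustrative sketch only; not part of HBase or hbck2. */
final class UnknownServerScan {
  /**
   * Groups regions by the server that meta names for them, keeping only
   * servers that are neither online nor in the dead-server list.
   */
  static Map<String, Set<String>> regionsOnUnknownServers(
      Map<String, String> regionToServerInMeta,
      Set<String> onlineServers,
      Set<String> deadServers) {
    Map<String, Set<String>> byServer = new TreeMap<>();
    for (Map.Entry<String, String> entry : regionToServerInMeta.entrySet()) {
      String server = entry.getValue();
      if (onlineServers.contains(server) || deadServers.contains(server)) {
        continue; // the Master already knows about this server
      }
      byServer.computeIfAbsent(server, s -> new TreeSet<>()).add(entry.getKey());
    }
    return byServer;
  }

  public static void main(String[] args) {
    // Made-up region and server names.
    Map<String, String> meta = Map.of(
        "region-a", "rs1", "region-b", "rs2", "region-c", "rs9", "region-d", "rs9");
    System.out.println(regionsOnUnknownServers(meta, Set.of("rs1"), Set.of("rs2")));
  }
}
```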

bq. Canary tool can help solve this problem. But it's a little bit slow since 
it needs to read a row from all these regions.

I don't mind it being 'slow'. It actually does this in parallel so can be 
pretty fast.

Adding some functionality to the Canary where it recognizes that the server is 
not online, is not in dead servers, and perhaps has no WALs on fs, might be the 
way to go? You'd add a flag for it to actually act on any Regions it found that 
were in the 'wrong' state? It's sort of built to do this sort of review of the 
cluster?

bq. Besides, do we still get a chance to met the problem that region OPEN on 
more than one regionserver?

We don't seem to have this problem any more. I believe it's because the Master 
kills RegionServers that are in disagreement with what it thinks the state of 
affairs is. RegionServers report the Regions they are hosting on each 
heartbeat. I would have to check.

Thanks.




[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-24 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16663145#comment-16663145
 ] 

Jingyun Tian commented on HBASE-19121:
--

[~stack]

I checked this doc 
[https://github.com/apache/hbase-operator-tools/tree/master/hbase-hbck2], it's 
very helpful.

I mean onlineConsistencyRepair. The problem I hit was that one server crashed 
and its SCP never finished because I had deleted the procedure logs, which 
produced the inconsistency. These regions are recorded in META as OPEN on that 
crashed regionserver; that's why I need a tool to help find all the regions 
that are OPEN in META but not actually alive anymore. 

The Canary tool can help solve this problem, but it's a little slow since it 
needs to read a key from every one of these regions. With the 
onlineConsistencyRepair implementation, we only need to compare the online 
regions in META against the regions reported by all regionservers. So I'm not 
sure whether we still need to port it, since the Canary tool can solve this.

Besides, can we still hit the problem of a region being OPEN on more than one 
regionserver? I've never seen it, but it seems no tool can help find this 
problem?
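
The comparison described above, and the multiple-assignment question, can be 
sketched without reading any region data. This is not the 
onlineConsistencyRepair code from hbck-1; it is a minimal illustration in which 
both input maps and all names are hypothetical stand-ins for what META and the 
regionserver reports would provide.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Illustrative sketch only; not the hbck-1 implementation. */
final class MetaVsReported {
  /** Regions that META records as OPEN on a server that does not report hosting them. */
  static Set<String> openInMetaButNotReported(
      Map<String, String> openRegionToServerInMeta,
      Map<String, Set<String>> regionsReportedByServer) {
    Set<String> missing = new TreeSet<>();
    for (Map.Entry<String, String> entry : openRegionToServerInMeta.entrySet()) {
      Set<String> hosted = regionsReportedByServer.getOrDefault(entry.getValue(), Set.of());
      if (!hosted.contains(entry.getKey())) {
        missing.add(entry.getKey());
      }
    }
    return missing;
  }

  /** Regions that more than one regionserver reports hosting, with the servers involved. */
  static Map<String, Set<String>> reportedByMultipleServers(
      Map<String, Set<String>> regionsReportedByServer) {
    Map<String, Set<String>> serversByRegion = new TreeMap<>();
    for (Map.Entry<String, Set<String>> entry : regionsReportedByServer.entrySet()) {
      for (String region : entry.getValue()) {
        serversByRegion.computeIfAbsent(region, r -> new TreeSet<>()).add(entry.getKey());
      }
    }
    // Keep only the regions that at least two servers claim.
    serversByRegion.values().removeIf(servers -> servers.size() < 2);
    return serversByRegion;
  }

  public static void main(String[] args) {
    // Made-up region and server names.
    Map<String, String> meta = Map.of("region-a", "rs1", "region-b", "rs2", "region-c", "rs9");
    Map<String, Set<String>> reported =
        Map.of("rs1", Set.of("region-a", "region-b"), "rs2", Set.of("region-b"));
    System.out.println(openInMetaButNotReported(meta, reported));
    System.out.println(reportedByMultipleServers(reported));
  }
}
```

Both methods only report; deciding how to repair what they find is left to the 
operator.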

I think the biggest benefit of onlineConsistencyRepair is that it shows all 
these inconsistencies, so we operators can decide how to fix them.
{quote}That said, we could do with more operator help. I think tools to fix 
hbase.version file and research to see if possible to rebuild a meta table if 
for some reason meta were erased would be good to have.
{quote}
Yes, this could be put on the schedule in case some extreme situation happens.

 



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-24 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662719#comment-16662719
 ] 

stack commented on HBASE-19121:
---

bq. I plan to migrate the function of checking consistency among meta, 
regionserver and hdfs from hbck-1

Which parts are you thinking of, [~tianjingyun]?

I did some writeup here on repair: 
https://github.com/apache/hbase-operator-tools/tree/master/hbase-hbck2

I was trying not to reproduce in hbck2 tooling that we already have elsewhere.

That said, we could do with more operator help. I think tools to fix the 
hbase.version file, and research into whether it is possible to rebuild the 
meta table if for some reason meta were erased, would be good to have.

Tooling that checked that all regions match the table state would be helpful. 
In the above notes, we point folks at the tables section of the UI, where you 
can see the table state and whether all regions are OPEN/CLOSED. Canary is good 
for confirming regions are where they claim to be.

What are you thinking?

Thanks.




[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-24 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662006#comment-16662006
 ] 

Jingyun Tian commented on HBASE-19121:
--

[~stack] sir, I plan to migrate the function of checking consistency among 
meta, regionserver and hdfs from hbck-1. I think HBase administrators need it 
to find out whether the cluster has any problems. What do you think?



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-19 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657719#comment-16657719
 ] 

stack commented on HBASE-19121:
---

Thanks. Made an hbck2-1.0.0 version (Not committing to a version for all 
hbase-operator-tools just yet... ).



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-19 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657714#comment-16657714
 ] 

Sean Busbey commented on HBASE-19121:
-

I guess "no way" isn't accurate. But it'll be confusing. You can't do jira 
release notes for the combination of a version and a component, for example. 
There is also currently no way to do that with the Yetus Release Doc Maker.

> HBCK for AMv2 (A.K.A HBCK2)
> ---
>
> Key: HBASE-19121
> URL: https://issues.apache.org/jira/browse/HBASE-19121
> Project: HBase
>  Issue Type: Umbrella
>  Components: hbck, hbck2
>Reporter: stack
>Assignee: Umesh Agashe
>Priority: Major
> Fix For: 1.0.0
>
> Attachments: hbase-19121.master.001.patch
>
>
> We don't have an hbck for the new AM. Old hbck may actually do damage going 
> against AMv2.
> Fix.





[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-19 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657712#comment-16657712
 ] 

Sean Busbey commented on HBASE-19121:
-

Definitely make an hbck2-specific version, like we have for hbase-thirdparty. 
Actually, I think that means it'd be better to make it for 
hbase-operator-tools.

If you just do 1.0.0 there'll be no way to distinguish between JIRAs for 
HBase's 1.0.0 release and those for hbck2 / operator-tools.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-19 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657709#comment-16657709
 ] 

stack commented on HBASE-19121:
---

Link to a doc on what we need tooling-wise for debugging assigns, as well as 
nice-to-haves for hbck2.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-19 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657440#comment-16657440
 ] 

stack commented on HBASE-19121:
---

bq. Presumably the FixVersion should be set for "hbck2-1.0.0" when things land?

Hmm... I set it to 1.0.0. What do you think? I'd search on the hbck2 component 
plus 1.0.0? Or should I just make the version hbck2-1.0.0 as you suggest? That 
might be more selective?



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-19 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657438#comment-16657438
 ] 

Sean Busbey commented on HBASE-19121:
-

Yes, thank you, that does. Presumably the FixVersion should be set for 
"hbck2-1.0.0" when things land?

> HBCK for AMv2 (A.K.A HBCK2)
> ---
>
> Key: HBASE-19121
> URL: https://issues.apache.org/jira/browse/HBASE-19121
> Project: HBase
>  Issue Type: Umbrella
>  Components: hbck
>Reporter: stack
>Assignee: Umesh Agashe
>Priority: Major
> Attachments: hbase-19121.master.001.patch
>
>
> We don't have an hbck for the new AM. Old hbck may actually do damage going 
> against AMv2.
> Fix.





[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-19 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657419#comment-16657419
 ] 

stack commented on HBASE-19121:
---

I do use a jira component for hbck2.

I meant to put a limit on this umbrella, the release of hbck2-1.0.0. Hopefully 
that assuages your never-ending issue concern.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-19 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657402#comment-16657402
 ] 

Sean Busbey commented on HBASE-19121:
-

bq. Made this an umbrella issue. Removed 2.1.1 and 2.0.3 as fix versions. Let's 
use this to hang all hbck2 stuff on, whatever the version.

Please use a jira component for this. Umbrella jira issues that can never close 
are messy when we start trying to reason by looking across jira issues.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-19 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657343#comment-16657343
 ] 

stack commented on HBASE-19121:
---

Made this an umbrella issue. Removed 2.1.1 and 2.0.3 as fix versions. Let's use 
this to hang all hbck2 stuff on, whatever the version.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629649#comment-16629649
 ] 

stack commented on HBASE-19121:
---

[~Apache9] I was thinking something like that. We'll fail fast if remote side 
does not support a particular operation. It might be good to add a dump of 
operations the remote side supports to help the operator. Need to add a 
'--version' to hbck2 too.

> HBCK for AMv2 (A.K.A HBCK2)
> ---
>
> Key: HBASE-19121
> URL: https://issues.apache.org/jira/browse/HBASE-19121
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Reporter: stack
>Assignee: Umesh Agashe
>Priority: Major
> Attachments: hbase-19121.master.001.patch
>
>
> We don't have an hbck for the new AM. Old hbck may actually do damage going 
> against AMv2.
> Fix.





[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-26 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629639#comment-16629639
 ] 

Duo Zhang commented on HBASE-19121:
---

So HBCK2 will have its own release cycle, and we will maintain a support matrix 
in HBCK2? The two dimensions are the HBase version and the HBCK2 operations. 
When HBCK2 starts, it will connect to the HMaster to get the version of the 
cluster and then print out which operations are supported?
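
A support matrix of the kind asked about here could be as small as a nested 
map. The sketch below is hypothetical throughout: the operation names and the 
patch numbers are placeholders rather than the real hbck2 matrix, the class is 
not part of hbck2, and it assumes the cluster version arrives as a plain 
major.minor.patch string.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only; the matrix contents are placeholders. */
final class Hbck2SupportMatrix {
  // operation -> (release line "major.minor" -> first patch release supporting it)
  static final Map<String, Map<String, Integer>> MATRIX = Map.of(
      "exampleOpA", Map.of("2.0", 3, "2.1", 1),
      "exampleOpB", Map.of("2.0", 4, "2.1", 2));

  /** Returns the operations the matrix lists as supported for the given cluster version. */
  static Set<String> supportedOperations(String clusterVersion) {
    String[] parts = clusterVersion.split("\\.");
    if (parts.length < 3) {
      throw new IllegalArgumentException("expected major.minor.patch: " + clusterVersion);
    }
    String releaseLine = parts[0] + "." + parts[1];
    int patch = Integer.parseInt(parts[2]);
    Set<String> supported = new TreeSet<>();
    for (Map.Entry<String, Map<String, Integer>> entry : MATRIX.entrySet()) {
      Integer firstPatch = entry.getValue().get(releaseLine);
      if (firstPatch != null && patch >= firstPatch) {
        supported.add(entry.getKey());
      }
    }
    return supported;
  }

  public static void main(String[] args) {
    // In a real tool the version would come from the Master; here it is hard-coded.
    System.out.println("Supported on 2.1.1: " + supportedOperations("2.1.1"));
  }
}
```

Keying on the release line rather than on one global minimum version matters 
because a patch release on an older line can carry an operation that the first 
release of a newer line lacks.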



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628758#comment-16628758
 ] 

stack commented on HBASE-19121:
---

bq. We will keep adding new features, so the new HBCK2 can only work with newer 
version of HBase if you want to use the new feature. It will be confusing to 
users, that HBCK2 can not work with 2.1.0 but 2.1.1, and some features can not 
work with 2.1.1 but 2.1.2...

Yes. My thought currently is that I check the version of the remote cluster 
before running an operation. I need to check for <= 2.0.2 and 2.1.0 anyways 
since these have no hbck2.
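
The version check mentioned above could be sketched as follows. This is an 
illustration, not the hbck2 implementation: the class and method names are 
hypothetical, "<= 2.0.2" is read as covering every release up to and including 
2.0.2, and releases after 2.1.0 are assumed to carry hbck2 support.

```java
/** Illustrative sketch only; not part of hbck2. */
final class Hbck2VersionGate {
  /** Parses "major.minor.patch", ignoring any suffix such as "-SNAPSHOT". */
  static int[] parse(String version) {
    String[] parts = version.split("-", 2)[0].split("\\.");
    int[] numbers = new int[3];
    for (int i = 0; i < numbers.length && i < parts.length; i++) {
      numbers[i] = Integer.parseInt(parts[i]);
    }
    return numbers;
  }

  /** False for versions <= 2.0.2 and for 2.1.0; true otherwise (an assumption). */
  static boolean hasHbck2Support(String version) {
    int[] v = parse(version);
    int major = v[0];
    int minor = v[1];
    int patch = v[2];
    if (major < 2) {
      return false;
    }
    if (major == 2 && minor == 0) {
      return patch > 2;
    }
    if (major == 2 && minor == 1) {
      return patch > 0;
    }
    return true;
  }

  /** Fails fast before any operation is sent to a cluster without hbck2 support. */
  static void requireHbck2Support(String version) {
    if (!hasHbck2Support(version)) {
      throw new UnsupportedOperationException("Cluster version " + version + " has no hbck2 support");
    }
  }

  public static void main(String[] args) {
    for (String version : new String[] {"2.0.2", "2.0.3", "2.1.0", "2.1.1"}) {
      System.out.println(version + " -> " + hasHbck2Support(version));
    }
  }
}
```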



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-26 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628722#comment-16628722
 ] 

Duo Zhang commented on HBASE-19121:
---

{quote}
there is more to add I think.
{quote}

This is exactly my concern... We will keep adding new features, so a new HBCK2 
will only work with a newer version of HBase if you want to use the new 
feature. It will be confusing to users that HBCK2 does not work with 2.1.0 but 
does with 2.1.1, and that some features do not work with 2.1.1 but do with 
2.1.2...



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628712#comment-16628712
 ] 

stack commented on HBASE-19121:
---

bq. ...so I can forward port the HBCK2 changes to 2.2(or 2.3?).

Sounds good. On when to port, I'm still hacking on hbck2; there is more to add 
I think.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628706#comment-16628706
 ] 

stack commented on HBASE-19121:
---

I pushed a DISCUSSION note on dev list on hbck2 showing up in a point release 
referencing the discussion here.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-26 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628702#comment-16628702
 ] 

Duo Zhang commented on HBASE-19121:
---

So [~allan163] will take care of the HBCK2 for branch-2.0. And I have already 
ported the TRSP related changes to our internal 2.x branch, so I can forward 
port the HBCK2 changes to 2.2(or 2.3?). What do you think sir [~stack]?



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-26 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628442#comment-16628442
 ] 

Allan Yang commented on HBASE-19121:


{quote}
And I do not think it is a good idea to leave branch-2.0 and branch-2.1 there 
without HBCK2 support.
{quote}
Totally agree! We should be responsible for our releases. We may not add new
features in these branches, but at least we should keep them stable. For now,
HBCK2 is the only tool users can depend on to fix the cluster.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-26 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628323#comment-16628323
 ] 

Duo Zhang commented on HBASE-19121:
---

{quote}
At least for some basic operations, higher hbase version should be compatible 
with lower HBCK2 version.
{quote}

[~allan163], the problem here is not a higher HBase version with a lower HBCK2
version, but a higher HBCK2 version with a lower HBase version. As the recovery
code is mainly placed in HBase, when we introduce new features in HBCK2, it
usually means that an old version of HBase cannot support the feature, since
it does not have the new recovery code.
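
To make the direction of the incompatibility concrete, here is a minimal sketch of a client-side gate that refuses a command when the server is too old to carry the matching recovery code. Everything in it is an assumption for illustration: the command names, the version table, and the helper functions are hypothetical and are not part of HBase or HBCK2. The version numbers only mirror the 2.1.0 / 2.1.1 / 2.1.2 example used in this thread.

```python
from typing import Dict, Tuple

Version = Tuple[int, int, int]

# Hypothetical table: the lowest HBase version assumed to ship the
# server-side recovery code that each (made-up) HBCK2 command relies on.
MIN_SERVER_VERSION: Dict[str, Version] = {
    "basic_command": (2, 1, 1),
    "newer_command": (2, 1, 2),
}


def parse_version(text: str) -> Version:
    """Turn '2.1.1' or '2.1.1-SNAPSHOT' into a comparable tuple (2, 1, 1)."""
    core = text.split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return (major, minor, patch)


def is_supported(command: str, server_version: str) -> bool:
    """True when the server is new enough to support the given command."""
    return parse_version(server_version) >= MIN_SERVER_VERSION[command]
```

With a table like this, one HBCK2 build could run against several HBase versions and simply reject the commands an older Master cannot serve, rather than failing in a confusing way.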

And I do not think it is a good idea to leave branch-2.0 and branch-2.1 there
without HBCK2 support. The users who are currently using these two minor
releases are good seed users: new major releases are usually not very
stable, and these users help test them and make them stable. The decision to
support HBCK2 only on 2.2+ is not good news for them; it seems that we are just
throwing them away...

And as I described above, what if later we add new methods to HbckService? We 
immediately release 2.3 and drop 2.2? It does not make sense... The rule for a 
patch release is not everything, at least for HBCK2 we should have something 
different.

Thanks.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-25 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628245#comment-16628245
 ] 

stack commented on HBASE-19121:
---

bq. A reminder, this way will break the `Fix Version/s` tag in JIRA IMO, 
because we have some patches pushed to branch-2 only and marked them fixed in 
2.2.0.

That's fixable [~reidchan]... (Move current 2.2.0 to 2.3.0... And 2.1.1 to
2.2.0).

Let me start conversation on dev list... so we can all muck in. Thanks for the 
input.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-25 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628243#comment-16628243
 ] 

Reid Chan commented on HBASE-19121:
---

bq. Branching 2.2 from 2.1 might be the 'safest'.
A reminder, this way will break the `Fix Version/s` tag in JIRA IMO, because we 
have some patches pushed to branch-2 only and marked them fixed in 2.2.0.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-25 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628219#comment-16628219
 ] 

Allan Yang commented on HBASE-19121:


{quote}
A possible way to deal with this problem is that, we align the version of hbase 
and hbck2, i.e, release a new version of hbck2 every time when we release a new 
hbase version.
{quote}
IMHO, HBCK2 should be able to run against every hbase2 version. We don't want
users to download different versions for different hbase clusters. At least for
some basic operations, a higher hbase version should be compatible with a lower
HBCK2 version.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-25 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628200#comment-16628200
 ] 

stack commented on HBASE-19121:
---

hbck2 for all branches is tough given we then add a new HbckService on a point 
release. We were trying to do bug fixes only on point releases.

2.1.1 has 126 fixes in it so far too... enough to make a minor release?

If folks think hbck2 is an exception and that we should allow it in on a point
release, that's fine. I can take it to the dev list for discussion.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-25 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628181#comment-16628181
 ] 

Duo Zhang commented on HBASE-19121:
---

I think we should have hbck2 for all branches? IIRC on another issue,
[~allan163] said that he will backport the hbck2 stuff to branch-2.0.

And I do not think branching 2.2 from 2.1 can solve the problem permanently.
Hbck2 is in a separate repo, but lots of the recovery code is in the hbase
repo, and if we rely on the new recovery code then hbck2 will not be compatible
with the old versions of hbase...

A possible way to deal with this problem is to align the versions of hbase
and hbck2, i.e., release a new version of hbck2 every time we release a new
hbase version.

Thanks.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-25 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627531#comment-16627531
 ] 

stack commented on HBASE-19121:
---

Ok. Let me let this stew a bit to see what fellows from China have to say 
[~elserj]. Will surface on dev list tomorrow to see if more input.

Branching 2.2 from 2.1 might be the 'safest'.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-25 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627528#comment-16627528
 ] 

stack commented on HBASE-19121:
---

Thanks [~elserj] for input.

HBASE-20881 adds new Procedure type. HBASE-21075 is about how we have to drain 
the old ones first before we can start the new Master. Need to make it 'smooth'.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-25 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627513#comment-16627513
 ] 

Josh Elser commented on HBASE-19121:


{quote}Or crazy-pants stuff like cutting branch-2.2 from branch-2.1?
{quote}
IMO, that's fine for me too. Until we've released a new version, I see those
versions as something we fully control. branch-2.1 becoming "2.2.0" and 
branch-2.2's HEAD being pushed to 2.3 would give the user-facing semantics we 
want.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-25 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627473#comment-16627473
 ] 

stack commented on HBASE-19121:
---

Or crazy-pants stuff like cutting branch-2.2 from branch-2.1?



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-25 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627457#comment-16627457
 ] 

Josh Elser commented on HBASE-19121:


{quote}hbck2 is showing up on a point release -- 2.1.1 -- rather than on a minor
(2.2.x) because I'm thinking it's ok adding in this new stuff because it is on a
new Service and it won't break what was there previously (To be confirmed).
{quote}
Seems OK. A little wonky for it to work on 2.1.1 and not 2.1.0, but that's not 
the end of the world.
{quote}Also avoiding waiting on 2.2.0 because it has an awkward upgrade
{quote}
You have more info I can read up on regarding this? Sounds like something we'd 
want to try to make better to avoid us getting stuck on 2.0 and 2.1 releases 
(not like that even happened in HBase 1.x releases ;))



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-25 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627438#comment-16627438
 ] 

stack commented on HBASE-19121:
---

TODO: hbck2 is showing up on a point release -- 2.1.1 -- rather than on a minor
(2.2.x) because I'm thinking it's ok adding in this new stuff because it is on a
new Service and it won't break what was there previously (To be confirmed). Also
avoiding waiting on 2.2.0 because it has an awkward upgrade. That's how I'm
thinking. Happy to hear opinions otherwise.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-19 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621044#comment-16621044
 ] 

Sean Busbey commented on HBASE-19121:
-

Sweet. Added some comments around the CLI invocation.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-19 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621032#comment-16621032
 ] 

stack commented on HBASE-19121:
---

Anyone can comment now.

Would love feedback on tooling [~busbey]



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-19 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621029#comment-16621029
 ] 

Sean Busbey commented on HBASE-19121:
-

The google doc only has view access. Can you enable commenting access? Or would
you prefer feedback here?

I'd like to talk about improving this (from the 
[hbase-operator-tools/hbck2/README|https://github.com/apache/hbase-operator-tools/tree/master/hbase-hbck2#running-hbck2]):

{quote}
{{org.apache.hbase.HBCK2}} is the name of the main class. Running the below 
will dump out the HBCK2 usage:
{code}
 $ HBASE_CLASSPATH_PREFIX=/tmp/hbase-hbck2-1.0.0-SNAPSHOT.jar ./bin/hbase org.apache.hbase.HBCK2
{code}
{quote}



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620087#comment-16620087
 ] 

stack commented on HBASE-19121:
---

Hoisted up Design for HBCK2 from subissue.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-15 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16616458#comment-16616458
 ] 

stack commented on HBASE-19121:
---

Thanks for the write-up [~Apache9].

I like #1, just reading the WALs and playing them -- no split. Will be back to
play with RecoverMetaProcedureV2 after I get some basic hbck2 functionality in
place.





[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-15 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16616214#comment-16616214
 ] 

Duo Zhang commented on HBASE-19121:
---

In HBASE-21035, [~allan163], [~stack] and I have discussed a lot on how to get
the cluster back when the procedure wals are broken, or, simply put, how to make
meta online, and then process the dead servers and assign regions, after we
remove all the procedure wals. We finally (maybe) reached an agreement that
this recovery work should be done by the HBCK2 framework. So I think we need
to do this:

1. Implement a RecoverMetaProcedure (not the old one), where we find the
location of meta on zk, and check if it is still alive, by checking the
ephemeral node on zk or something else. If not, split the log inline (do we
really need to split? It only contains wals from meta...), and then assign meta
to a live RS.
2. If meta is online, then we are able to execute SCPs. We can scan the wal
directories, find the ones that end with the 'splitting' suffix, and schedule
SCPs for them. We have procedure locks, so usually it should not be a big
problem to schedule duplicated SCPs for one RS (need to confirm that the
procedure scheduler can work fine).
3. After all SCPs have finished, we could still have some regions in an
intermediate state. This is because we may also have removed some TRSPs. We
should have the ability to find these regions and schedule TRSPs to bring them
online.
4. How to deal with split and merge in the middle? No ideas for now...
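
Step 2 above can be sketched roughly as follows. This is only an illustration under stated assumptions: it takes a plain list of WAL directory names (listing the filesystem is left out), it assumes the suffix is spelled `-splitting` and that the rest of the directory name is the server name, and `servers_needing_scp` is a hypothetical helper, not an HBase or HBCK2 API. Scheduling the SCPs themselves is out of scope here.

```python
from typing import Iterable, List

# Assumed spelling of the suffix mentioned in step 2.
SPLITTING_SUFFIX = "-splitting"


def servers_needing_scp(wal_dir_names: Iterable[str]) -> List[str]:
    """Return server names whose WAL directory carries the splitting suffix.

    Each server is listed once, in first-seen order, so that at most one
    SCP would be scheduled per server.
    """
    seen = set()
    servers = []
    for name in wal_dir_names:
        if not name.endswith(SPLITTING_SUFFIX):
            continue  # WALs of a server not marked as being split
        server = name[: -len(SPLITTING_SUFFIX)]
        if server and server not in seen:
            seen.add(server)
            servers.append(server)
    return servers
```

Deduplicating up front keeps the sketch independent of whether the procedure scheduler tolerates duplicated SCPs for one RS, which the step above flags as still to be confirmed.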

Notice that this is very important for bringing hbase2 into the production
environment, as no one can guarantee that there is no bug in a project. In the
long run, we will always hit some critical bugs, or operational accidents,
which cause the cluster to get trapped in a state which cannot be recovered
automatically. HBCK2 is our last line of defense.





[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-09-07 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607502#comment-16607502
 ] 

stack commented on HBASE-19121:
---

h2. Horror Story

Big cluster. Lots of regions. A couple of STUCK procedures that prevent
clean-up of old WALs. A backlog builds. Master crashes (for some unrelated
reason). New Master tries to become active Master. It reads outstanding
MasterProcWAL logs to reconstruct assignment. If there is a large backlog, this
can take hours.

HBASE-21165 describes an instance with 700 servers and 420k regions. The Master
is taking hours to put together assignment again from backed-up logs (~300 and
I think a few million procedures). HBASE-21165 is adding emitting state because
otherwise it looks like we are hung.

Need to support removal of all MasterProcWAL and come up anyways as per notes
above.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-08-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591919#comment-16591919
 ] 

Hadoop QA commented on HBASE-19121:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} | {color:red} HBASE-19121 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.7.0/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-19121 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927372/hbase-19121.master.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14193/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.





[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-08-24 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591907#comment-16591907
 ] 

stack commented on HBASE-19121:
---

Added link to HBASE-21083, the "bypass" broken Procedures issue.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-08-19 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585114#comment-16585114
 ] 

stack commented on HBASE-19121:
---

Chatting with [~allan163] and [~Apache9], major concern is loss of master proc
wals. If gone, mis-deleted, or damaged, then the cluster is hosed. Can't have
this. Redundancy? How to have redundant master proc WAL? Or can we leave
breadcrumbs as we used to try in hbck1 days that allow us to rebuild if all is
trashed? How? We have some file-based droppings. Will use for now though we
would like to move away from depending on particularities of our fs persist.
For hbase2, minimally:

* A rebuild procedure that can put cluster back together after catastrophe.
Rebuild procedure might be composed of multiple fix-it procedures that an
operator would run via hbck2. hbck2 would require at least a minimal Master
running ("maintenance mode"). Best if no dependency on RSs.
* But only ever one master at a time! Even if a minimal one.
* One procedure would repair meta. It would work through the minimal master. It
would look for meta WAL logs for recovery. It'd run splitting inline rather
than try to farm it out to the cluster, to minimize dependency on RSs being up.
It'd dump the recovered.edits into place. It might then open the meta region
for hbck2 to read.
* hbck2 would make a report of the troublesome RITs. Or unfinished split or
merge.
* A procedure to look for -SPLITTING RS dirs for queuing new SCPs.

Other hbck2 features:

* Move aside the master proc wals.
* Force complete of a procedure. Can't kill Procedures. Rollback doesn't always
work. Procedures may be subprocedures. Need to have them complete so parent can
complete. Then operator does fixup. When force complete, need to release locks
too... else operator or new procedures to fix cannot make progress.
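
As a rough idea of what the "report of the troublesome RITs" bullet could look like, here is a minimal sketch that pulls stuck regions out of Master log lines. It assumes the `STUCK Region-In-Transition rit=..., location=..., table=..., region=...` message format seen in the Master log excerpt quoted elsewhere in this thread, with each log entry on a single line. `STUCK_RIT` and `stuck_regions` are hypothetical names for illustration; a real hbck2 report would more likely ask the Master directly than scrape its log.

```python
import re
from typing import Dict, Iterable

# Pattern for the assumed single-line STUCK message format.
STUCK_RIT = re.compile(
    r"STUCK Region-In-Transition rit=(?P<state>\w+), "
    r"location=(?P<location>\S+), "
    r"table=(?P<table>[^,]+), "
    r"region=(?P<region>[0-9a-f]+)"
)


def stuck_regions(log_lines: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Map encoded region name -> details of its most recent STUCK entry."""
    report: Dict[str, Dict[str, str]] = {}
    for line in log_lines:
        match = STUCK_RIT.search(line)
        if match:
            # Later entries for the same region overwrite earlier ones.
            report[match.group("region")] = match.groupdict()
    return report
```

Keying the report by region means a region that is logged as STUCK on every timeout tick shows up once, with its latest state.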





[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-07-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560863#comment-16560863
 ] 

stack commented on HBASE-19121:
---

I think I've asked for this above but here is more detail.

A corrupt Master proc WAL file was responsible for two regions being stuck in 
OPENING. It looks like this in Master log:

{code}
2018-07-28 12:33:49,724 WARN  [ProcExecTimeout] assignment.AssignmentManager: STUCK Region-In-Transition rit=OPENING, location=ve0530.halxg.cloudera.com,16020,1532716446468, table=IntegrationTestBigLinkedList, region=8198218a4532a0ee544cb069970f9a77
2018-07-28 12:33:49,724 WARN  [ProcExecTimeout] assignment.AssignmentManager: STUCK Region-In-Transition rit=OPENING, location=ve0530.halxg.cloudera.com,16020,1532716446468, table=IntegrationTestBigLinkedList, region=4459746bcff48c116337e732ac4df705
2018-07-28 12:34:17,532 WARN  [PEWorker-2] assignment.RegionTransitionProcedure: Failed transition, suspend 3600secs pid=14482, ppid=14168, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=IntegrationTestBigLinkedList, region=4459746bcff48c116337e732ac4df705, server=ve0530.halxg.cloudera.com,16020,1532716446468; rit=OPENING, location=ve0530.halxg.cloudera.com,16020,1532716446468; waiting on rectified condition fixed by other Procedure or operator intervention
org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but current state=OPENING
  at org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:164)
  at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1542)
  at org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:204)
  at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:345)
  at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:95)
  at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:850)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1474)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1249)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:76)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1763)
2018-07-28 12:34:17,533 INFO  [PEWorker-2] procedure2.TimeoutExecutorThread: ADDED pid=14482, ppid=14168, state=WAITING_TIMEOUT:REGION_TRANSITION_DISPATCH; UnassignProcedure table=IntegrationTestBigLinkedList, region=4459746bcff48c116337e732ac4df705, server=ve0530.halxg.cloudera.com,16020,1532716446468; timeout=360, timestamp=1532810057533
2018-07-28 12:34:19,078 WARN  [PEWorker-12] assignment.RegionTransitionProcedure: Failed transition, suspend 3600secs pid=14373, ppid=14168, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=IntegrationTestBigLinkedList, region=8198218a4532a0ee544cb069970f9a77, server=ve0530.halxg.cloudera.com,16020,1532716446468; rit=OPENING, location=ve0530.halxg.cloudera.com,16020,1532716446468; waiting on rectified condition fixed by other Procedure or operator intervention
org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but current state=OPENING
  at org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:164)
  at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1542)
  at org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:204)
  at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:345)
  at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:95)
  at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:850)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1474)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1249)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:76)
  at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1763)
{code}

At least the log is clear on what has to be done. We are seeing the STUCK 
messages, and then the prescription comes out periodically.
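
As a rough aid for reading logs like the one above, here is a minimal sketch that pulls the stuck regions out of the master log. It assumes the "STUCK Region-In-Transition" line layout shown in the excerpt; the function name and the returned tuple layout are made up for illustration and are not part of HBase.

```python
import re

# Matches the "STUCK Region-In-Transition" WARN lines as they appear in the
# excerpt above. \s+ between fields tolerates the line wrapping added by mail.
STUCK_RE = re.compile(
    r"STUCK Region-In-Transition\s+rit=(?P<state>\w+),\s+"
    r"location=(?P<location>\S+),\s+"
    r"table=(?P<table>[^,\s]+),\s+"
    r"region=(?P<region>[0-9a-f]{32})"
)


def stuck_regions(log_text):
    """Map encoded region name -> (rit state, table, location).

    Hypothetical helper: a region reported several times is kept once,
    with the values from its last report.
    """
    found = {}
    for match in STUCK_RE.finditer(log_text):
        found[match.group("region")] = (
            match.group("state"),
            match.group("table"),
            match.group("location"),
        )
    return found
```

Keying on the encoded region name collapses the periodic repeats, so the result is the set of regions an operator would need to look at.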

The Locks UI shows that there is an exclusive lock on the two 
regions making it so no other Procedure can run to do fixup:

{code}
Locks

[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-06-14 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512757#comment-16512757
 ] 

Umesh Agashe commented on HBASE-19121:
--

Current usage:
{code:java}
usage: hbase org.apache.hadoop.hbase.util.HBaseFsck2 [OPTIONS] [ACTIONS]
Options:
-l,--timelag  Restrict actions to regions that are not updated in last  seconds.
-e,--noExclusive Run even if another instance of hbck is running.
-t,--tables  Restrict actions to specified comma seperated list of tables.
-r,--regions  Restrict actions to specified comma seperated list of regions.
-s,--regionServers  Restrict actions to specified comma seperated list of region servers.
-d,--details Report details.
-v,--verbose Verbose output.
Actions:
FixAssignments Try fixing assignments of regions stuck in transition by submitting assign/ unassign procedures.{code}
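
To make the option semantics concrete, here is a minimal sketch of how the -l/--timelag and -t/--tables restrictions could compose. It is not code from the patch: the function name, the tuple layout and the use of epoch seconds are assumptions for illustration only.

```python
import time


def select_regions(regions, timelag_secs=0, tables=None, now=None):
    """Pick the regions an action would be restricted to.

    regions: iterable of (table, encoded_region_name, last_update_epoch_secs).
    timelag_secs: keep only regions NOT updated in the last N seconds.
    tables: optional collection of table names; None means all tables.
    All names here are hypothetical, not taken from HBaseFsck2.
    """
    now = time.time() if now is None else now
    wanted = set(tables) if tables else None
    return [
        name
        for table, name, updated in regions
        if (wanted is None or table in wanted) and now - updated >= timelag_secs
    ]
```

Passing `now` explicitly keeps the filter deterministic, which makes it easy to unit test.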



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-06-11 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509198#comment-16509198
 ] 

stack commented on HBASE-19121:
---

I just noticed that we reference hbck in some of our shell help doc:

{code}
hbase(main):022:0> help 'unassign'
Unassign a region. Unassign will close region in current location and then
reopen it again.  Pass 'true' to force the unassignment ('force' will clear
all in-memory state in master before the reassign. If results in
double assignment use hbck -fix to resolve. To be used by experts).
Use with caution.  For expert use only.  Examples:

  hbase> unassign 'REGIONNAME'
  hbase> unassign 'REGIONNAME', true
  hbase> unassign 'ENCODED_REGIONNAME'
  hbase> unassign 'ENCODED_REGIONNAME', true
{code}

We should clean these up too as part of the move to hbck2.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-06-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508976#comment-16508976
 ] 

Hadoop QA commented on HBASE-19121:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m  4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m  0s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 31s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 53s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 16s{color} | {color:red} hbase-server: The patch generated 55 new + 202 unchanged - 7 fixed = 257 total (was 209) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red}  3m 40s{color} | {color:red} patch has 10 errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 11s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 26s{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}125m 36s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 20s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 19s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hbase-server |
|  |  Boxing/unboxing to parse a primitive org.apache.hadoop.hbase.util.HBaseFsck2.processOptions(CommandLine)  At HBaseFsck2.java:org.apache.hadoop.hbase.util.HBaseFsck2.processOptions(CommandLine)  At HBaseFsck2.java:[line 332] |
| Failed junit tests | hadoop.hbase.util.TestHBaseFsck2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-19121 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927372/hbase-19121.master.001.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux dfbac7674ad7 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 40f0a43462 |
| maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_171 |
| findbugs | 

[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-06-11 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508721#comment-16508721
 ] 

Umesh Agashe commented on HBASE-19121:
--

HBCK2 will evolve. The first version, with basic command-line options and 
parsing, is in the 001 patch. It also has a FixAssignments action.



[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-04-27 Thread Umesh Agashe (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456901#comment-16456901
 ] 

Umesh Agashe commented on HBASE-19121:
--

For region states:
{code:java}
scan 'hbase:meta', { ROWPREFIXFILTER => 't1,', COLUMNS => 'info:state'}{code}

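The output of a scan like the one above can be tallied into a per-state summary. Below is a minimal sketch, assuming the usual shell scan layout where each cell is printed as "column=info:state, timestamp=..., value=STATE"; the function names are hypothetical.

```python
import re
from collections import Counter

# One match per printed cell of the info:state column.
STATE_RE = re.compile(r"column=info:state,.*\bvalue=(\w+)")


def count_region_states(scan_output):
    """Count regions per state from captured shell scan output."""
    return Counter(m.group(1) for m in STATE_RE.finditer(scan_output))


def summarize(scan_output):
    """Render the counts as one line, most common state first."""
    counts = count_region_states(scan_output)
    return ", ".join(f"{n:,} {state}" for state, n in counts.most_common())
```

The one-line form mirrors the kind of summary an operator wants at a glance, with thousands separators in the counts.
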


[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-04-27 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456597#comment-16456597
 ] 

stack commented on HBASE-19121:
---

TODOs from a debug session w/ [~uagashe] yesterday:

 *  "scan 'hbase:meta', { ROWPREFIXFILTER => 't1,', COLUMNS => 'table:state'}" 
... to list out state of all regions in table 't1'.
 * Fix table state so it shows a string for the enum rather than binary. In the 
meantime, "put 'hbase:meta', 't', 'table:state', "\x8\x1"" is how you set a 
table to disabled; use "\x8\x0" to set it to enabled.
 * A little verb in the shell that printed out region states -- e.g. 1,102 
OPEN, 2 OPENING, 4 CLOSED -- to summarize state would help.
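
On the binary table state: the two bytes look like a serialized protobuf message with a single enum field, i.e. tag byte 0x08 (field 1, varint) followed by the enum value. Here is a minimal sketch of turning that into a string. Only the 0 = ENABLED and 1 = DISABLED mappings come from the put commands above; the DISABLING and ENABLING values and the function name are assumptions.

```python
# Enum values for the table state. 0 and 1 are taken from the put commands
# above; 2 and 3 are assumed, not confirmed by this thread.
TABLE_STATES = {0: "ENABLED", 1: "DISABLED", 2: "DISABLING", 3: "ENABLING"}


def decode_table_state(raw):
    """Decode the two-byte table:state cell value into a readable name.

    Expects the tag byte 0x08 (protobuf field 1, wire type varint)
    followed by one byte holding the enum value.
    """
    if len(raw) != 2 or raw[0] != 0x08:
        raise ValueError(f"unexpected table:state encoding: {raw!r}")
    return TABLE_STATES.get(raw[1], f"UNKNOWN({raw[1]})")
```

For example, `decode_table_state(b"\x08\x01")` yields the name for the disabled state that the put above writes.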





[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-04-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444703#comment-16444703
 ] 

stack commented on HBASE-19121:
---

A few of us looking at a hosed cluster came up with a scenario that would help 
with the dev of an hbck2: run a cluster with some load, kill the master, 
remove its procedure WALs, and then restart the master. HBCK2 should be able to 
restore the cluster to a working state. Manufacturing this breakage is also good 
for figuring the scope of what hbck2 needs to provide. In chatting, a few of us 
noted that the Master UI and shell will not be usable unless the Master 
completes initialization -- which it will not be able to do if the removed 
procedure WAL files held part-done procedures (e.g. a Move of the namespace 
region where the master was killed and the procedure WAL files were removed 
after the move's unassign but before its assign completed).
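
The scenario above can be written down as a dry-run plan for a throwaway test cluster. This is only a sketch: it returns the commands rather than running them, and it assumes the HBase root directory is /hbase on HDFS, that the procedure WALs live under MasterProcWALs, and that the master is started with hbase-daemon.sh. The function name is hypothetical.

```python
def wal_loss_scenario(master_pid, hbase_root="/hbase"):
    """Return the ordered shell commands for the scenario; nothing is executed.

    Assumptions: procedure WALs are under <hbase_root>/MasterProcWALs and the
    master is managed with hbase-daemon.sh. The cluster is expected to be
    under load before step 1.
    """
    proc_wals = f"{hbase_root}/MasterProcWALs"
    return [
        f"kill -9 {master_pid}",                      # 1. kill the master hard
        f"hdfs dfs -rm -r -skipTrash {proc_wals}",    # 2. remove its procedure WALs
        "hbase-daemon.sh start master",               # 3. restart the master
    ]
```

Keeping it as a list of strings lets the steps be reviewed, logged or fed to a test harness before anything destructive runs.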
