date:20181025

[jira] [Commented] (HBASE-21322) Add a scheduleServerCrashProcedure() API to HbckService

2018-10-25 Thread Jingyun Tian (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664729#comment-16664729
 ] 

Jingyun Tian commented on HBASE-21322:
--

{quote}One item to consider is that you may be duplicating what 
getServerNameFromWALDirectoryName#getServerNameFromWALDirectoryName does? Is 
that possible?
{quote}
Sorry, I don't follow. The method getServerNameFromWALDirectoryName extracts 
the serverName from a WAL directory name, whereas the serverName here comes 
from the user's input. They seem unrelated?
{quote}If making a new patch, you might check if the Procedure is finished 
before claiming one is running (it may not be running):

if (serverName.compareTo(((ServerCrashProcedure) procedure).getServerName()) == 0) {

{quote}
Sure. I'll check if the procedure is still running.
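
A minimal sketch of that check, assuming hypothetical stand-in types (Proc and SimpleProc are illustration only, not the HBase Procedure classes): a procedure counts as running only if it targets the same server and has not finished.

```java
import java.util.List;

class ScpCheckSketch {
  /** Hypothetical stand-in for a procedure; not the HBase Procedure class. */
  interface Proc {
    String serverName();
    boolean isFinished();
  }

  /** Hypothetical immutable implementation used only for this example. */
  static final class SimpleProc implements Proc {
    private final String serverName;
    private final boolean finished;

    SimpleProc(String serverName, boolean finished) {
      this.serverName = serverName;
      this.finished = finished;
    }

    @Override public String serverName() { return serverName; }
    @Override public boolean isFinished() { return finished; }
  }

  /** Returns true only if an unfinished procedure exists for the given server. */
  static boolean isScpRunning(List<? extends Proc> procs, String serverName) {
    for (Proc p : procs) {
      // Same server name AND not yet finished; a finished procedure is ignored.
      if (serverName.compareTo(p.serverName()) == 0 && !p.isFinished()) {
        return true;
      }
    }
    return false;
  }
}
```

With this shape, a finished procedure for the same server no longer blocks scheduling a new one.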

> Add a scheduleServerCrashProcedure() API to HbckService
> ---
>
> Key: HBASE-21322
> URL: https://issues.apache.org/jira/browse/HBASE-21322
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-21322.master.001.patch, 
> HBASE-21322.master.002.patch, HBASE-21322.master.003.patch, Screenshot from 
> 2018-10-17 13-35-58.png, Screenshot from 2018-10-17 13-38-41.png, Screenshot 
> from 2018-10-17 13-47-06.png
>
>
> In my testing, if a RegionServer goes down and all procedure logs are then 
> deleted, no ServerCrashProcedure is scheduled, and restarting the master 
> does not help. We therefore need a way to schedule a ServerCrashProcedure 
> manually. I plan to add a scheduleServerCrashProcedure() API to 
> HbckService, then expose this API in HBCK2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (HBASE-21379) RegionServer Stop by ArrayIndexOutOfBoundsException of WAL when replication enabled

2018-10-25 Thread Zheng Hu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu reassigned HBASE-21379:


Assignee: Zheng Hu

> RegionServer Stop by ArrayIndexOutOfBoundsException of WAL when replication 
> enabled
> ---
>
> Key: HBASE-21379
> URL: https://issues.apache.org/jira/browse/HBASE-21379
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: justice
>Assignee: Zheng Hu
>Priority: Major
> Attachments: hbase-wal-p.tgz, log.tgz, wal-20181026.tgz, wal.tgz
>
>
> Log as follows:
> {code:java}
> // code placeholder
> 2018-10-24 09:22:42,381 INFO  [regionserver/11-3-19-10:16020] wal.AbstractFSWAL: New WAL /hbase/WALs/11-3-19-10.jd.local,16020,1540344155469/11-3-19-10.jd.local%2C16020%2C1540344155469.1540344162124
> 2018-10-24 09:23:05,151 ERROR [regionserver/11-3-19-10:16020.replicationSource.11-3-19-10.jd.local%2C16020%2C1540344155469,2.replicationSource.wal-reader.11-3-19-10.jd.local%2C16020%2C1540344155469,2] regionserver.ReplicationSource: Unexpected exception in regionserver/11-3-19-10:16020.replicationSource.11-3-19-10.jd.local%2C16020%2C1540344155469,2.replicationSource.wal-reader.11-3-19-10.jd.local%2C16020%2C1540344155469,2 currentPath=hdfs://11-3-18-67.JD.LOCAL:9000/hbase/WALs/11-3-19-10.jd.local,16020,1540344155469/11-3-19-10.jd.local%2C16020%2C1540344155469.1540344162124
> java.lang.ArrayIndexOutOfBoundsException: 8830
> at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1365)
> at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1358)
> at org.apache.hadoop.hbase.CellUtil.cloneFamily(CellUtil.java:114)
> at org.apache.hadoop.hbase.replication.ScopeWALEntryFilter.filterCell(ScopeWALEntryFilter.java:54)
> at org.apache.hadoop.hbase.replication.ChainWALEntryFilter.filterCells(ChainWALEntryFilter.java:90)
> at org.apache.hadoop.hbase.replication.ChainWALEntryFilter.filter(ChainWALEntryFilter.java:77)
> at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.filterEntry(ReplicationSourceWALReader.java:234)
> at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:170)
> at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:133)
> 2018-10-24 09:23:05,153 INFO [regionserver/11-3-19-10:16020.replicationSource.11-3-19-10.jd.local%2C16020%2C1540344155469,2.replicationSource.wal-reader.11-3-19-10.jd.local%2C16020%2C1540344155469,2] regionserver.HRegionServer: * STOPPING region server '11-3-19-10.jd.local,16020,1540344155469' *
> {code}
> hbase wal -p output:
> {code:java}
> // code placeholder
> writer Classes: ProtobufLogWriter AsyncProtobufLogWriter
> Cell Codec Class: org.apache.hadoop.hbase.regionserver.wal.WALCellCodec
> Sequence=15 , region=fee7a9465ced6ce9e319d37e9d71c63c at write timestamp=Wed Oct 24 09:22:49 CST 2018
> row=8000, column=METAFAMILY:HBASE::REGION_EVENT
> value: \x08\x00\x12\x1Cmlaas:ump_host_second_181029\x1A fee7a9465ced6ce9e319d37e9d71c63c \x0E*\x06\x0A\x01f\x12\x01f2\x1F\x0A\x1311-3-19-10.JD.LOCAL\x10\x94}\x18\xCD\x9A\xAA\x9D\xEA,:Umlaas:ump_host_second_181029,8000,1540271129253.fee7a9465ced6ce9e319d37e9d71c63c.
> Sequence=9 , region=ba6684888d826328a6373435124dc1cd at write timestamp=Wed Oct 24 09:22:49 CST 2018
> row=9100, column=METAFAMILY:HBASE::REGION_EVENT
> ...
> row=34975#00, column=f:\x09,
> value: {"tp50":1,"avg":2,"min":0,"tp90":1,"max":3,"count":13,"tp99":2,"tp999":2,"error":0}
> row=349824#00, column=f:\x08\xFA
> value: {"tp50":2,"avg":2,"min":0,"tp90":2,"max":98,"count":957,"tp99":3,"tp999":34,"error":0}
> row=349824#00, column=f:\x08\xD2
> value: {"tp50":2,"avg":2,"min":0,"tp90":2,"max":43,"count":1842,"tp99":2,"tp999":31,"error":0}
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 8830
> at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1365)
> at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1358)
> at org.apache.hadoop.hbase.wal.WALPrettyPrinter.toStringMap(WALPrettyPrinter.java:336)
> at org.apache.hadoop.hbase.wal.WALPrettyPrinter.processFile(WALPrettyPrinter.java:290)
> at org.apache.hadoop.hbase.wal.WALPrettyPrinter.run(WALPrettyPrinter.java:421)
> at org.apache.hadoop.hbase.wal.WALPrettyPrinter.main(WALPrettyPrinter.java:356)
> {code}
>  




[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-10-25 Thread Hudson (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664727#comment-16664727
 ] 

Hudson commented on HBASE-20952:


Results for branch HBASE-20952
[build #29 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/29/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/29//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/29//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/29//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup&restore. Replication has the use-case for "tail"'ing the WAL which we 
> should provide via our new API. B&R doesn't do anything fancy (IIRC). We 
> should make sure all consumers are generally going to be OK with the API we 
> create.
> The API may be "OK" (or OK in a part). We need to also consider other methods 
> which were "bolted" on such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}}) should also be looked at to use WAL-APIs only.
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.




[jira] [Updated] (HBASE-21391) RefreshPeerProcedure should also wait master initialized before executing

2018-10-25 Thread Duo Zhang (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21391:
--
Assignee: Duo Zhang
  Status: Patch Available  (was: Open)

> RefreshPeerProcedure should also wait master initialized before executing
> -
>
> Key: HBASE-21391
> URL: https://issues.apache.org/jira/browse/HBASE-21391
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.2
>
> Attachments: HBASE-21391.patch
>
>
> Missed this one when introducing the waitInitialized method in Procedure, and 
> found when implementing HBASE-21389.




[jira] [Updated] (HBASE-21391) RefreshPeerProcedure should also wait master initialized before executing

2018-10-25 Thread Duo Zhang (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21391:
--
Attachment: HBASE-21391.patch

> RefreshPeerProcedure should also wait master initialized before executing
> -
>
> Key: HBASE-21391
> URL: https://issues.apache.org/jira/browse/HBASE-21391
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.2
>
> Attachments: HBASE-21391.patch
>
>
> Missed this one when introducing the waitInitialized method in Procedure, and 
> found when implementing HBASE-21389.




[jira] [Commented] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664720#comment-16664720
 ] 

Hadoop QA commented on HBASE-21380:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} | {color:red} HBASE-21380 does not apply to branch-2.1. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.8.0/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-21380 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945699/HBASE-21380.branch-2.1.004.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14869/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> TestRSGroups failing
> 
>
> Key: HBASE-21380
> URL: https://issues.apache.org/jira/browse/HBASE-21380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21380.branch-2.1.001.patch, 
> HBASE-21380.branch-2.1.002.patch, HBASE-21380.branch-2.1.003.patch, 
> HBASE-21380.branch-2.1.004.patch, HBASE-21380.master.001.patch, 
> HBASE-21380.master.002.patch
>
>
> only failing on branch-2.1




[jira] [Created] (HBASE-21391) RefreshPeerProcedure should also wait master initialized before executing

2018-10-25 Thread Duo Zhang (JIRA)

Duo Zhang created HBASE-21391:
-

 Summary: RefreshPeerProcedure should also wait master initialized 
before executing
 Key: HBASE-21391
 URL: https://issues.apache.org/jira/browse/HBASE-21391
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Duo Zhang
 Fix For: 3.0.0, 2.2.0, 2.1.2


Missed this one when introducing the waitInitialized method in Procedure, and 
found when implementing HBASE-21389.




[jira] [Commented] (HBASE-21328) why HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP set to false by default?

2018-10-25 Thread Nick.han (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664708#comment-16664708
 ] 

Nick.han commented on HBASE-21328:
--

[~busbey]

I regenerated the patch with the git format-patch command.

> why HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP set to false by default?
> --
>
> Key: HBASE-21328
> URL: https://issues.apache.org/jira/browse/HBASE-21328
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation, Operability
>Reporter: Nick.han
>Assignee: Nick.han
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: HBASE-21328.master.001.patch, 
> HBASE-21328.master.002.patch
>
>
> Hi all,
> I ran into a problem while building an HBase cluster with hbase 
> 3.0.0-snapshot and hadoop 2.7.5: HBase uses javax.servlet-api-3.1.0.jar, 
> which conflicts with the servlet-api-2.5.jar in the Hadoop lib path. 
> Looking into the hbase file, I found that 
> HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP is set to false by default. This 
> setting decides whether the Hadoop libs are included in the HBase class 
> path. So the question is: why do we set it to false? Can we set it to true 
> and exclude the Hadoop libs by default?




[jira] [Commented] (HBASE-21379) RegionServer Stop by ArrayIndexOutOfBoundsException of WAL when replication enabled

2018-10-25 Thread Zheng Hu (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664707#comment-16664707
 ] 

Zheng Hu commented on HBASE-21379:
--

Not sure yet. Will continue to work on this.

> RegionServer Stop by ArrayIndexOutOfBoundsException of WAL when replication 
> enabled
> ---
>
> Key: HBASE-21379
> URL: https://issues.apache.org/jira/browse/HBASE-21379
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: justice
>Priority: Major
> Attachments: hbase-wal-p.tgz, log.tgz, wal-20181026.tgz, wal.tgz
>
>




[jira] [Updated] (HBASE-21328) why HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP set to false by default?

2018-10-25 Thread Nick.han (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick.han updated HBASE-21328:
-
Attachment: HBASE-21328.master.002.patch

> why HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP set to false by default?
> --
>
> Key: HBASE-21328
> URL: https://issues.apache.org/jira/browse/HBASE-21328
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation, Operability
>Reporter: Nick.han
>Assignee: Nick.han
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: HBASE-21328.master.001.patch, 
> HBASE-21328.master.002.patch
>
>
> Hi all,
> I ran into a problem while building an HBase cluster with hbase 
> 3.0.0-snapshot and hadoop 2.7.5: HBase uses javax.servlet-api-3.1.0.jar, 
> which conflicts with the servlet-api-2.5.jar in the Hadoop lib path. 
> Looking into the hbase file, I found that 
> HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP is set to false by default. This 
> setting decides whether the Hadoop libs are included in the HBase class 
> path. So the question is: why do we set it to false? Can we set it to true 
> and exclude the Hadoop libs by default?




[jira] [Commented] (HBASE-21390) [test] TestRpcAccessChecks is buggy

2018-10-25 Thread Reid Chan (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664649#comment-16664649
 ] 

Reid Chan commented on HBASE-21390:
---

I found this bug while working on HBASE-21255 and struggling to fix failing 
UTs.

The root cause is in TableAuthManager, which is buggy as well; the fix will 
be included in HBASE-21255. 

I filed this issue just to raise awareness.

> [test] TestRpcAccessChecks is buggy
> ---
>
> Key: HBASE-21390
> URL: https://issues.apache.org/jira/browse/HBASE-21390
> Project: HBase
>  Issue Type: Improvement
>Reporter: Reid Chan
>Assignee: Reid Chan
>Priority: Major
>
> TestRpcAccessChecks is buggy.
> From setup() we know that USER_ADMIN is granted only the ADMIN action, but 
> testTableFlush() and testTableFlushAndSnapshot() require the CREATE action, 
> which USER_ADMIN doesn't have.
> Both tests should fail.




[jira] [Created] (HBASE-21390) [test] TestRpcAccessChecks is buggy

2018-10-25 Thread Reid Chan (JIRA)

Reid Chan created HBASE-21390:
-

 Summary: [test] TestRpcAccessChecks is buggy
 Key: HBASE-21390
 URL: https://issues.apache.org/jira/browse/HBASE-21390
 Project: HBase
  Issue Type: Improvement
Reporter: Reid Chan
Assignee: Reid Chan


TestRpcAccessChecks is buggy.
From setup() we know that USER_ADMIN is granted only the ADMIN action, but 
testTableFlush() and testTableFlushAndSnapshot() require the CREATE action, 
which USER_ADMIN doesn't have.
Both tests should fail.
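
The expectation above can be sketched with a toy permission model. This is an illustration only: the Action enum and isAllowed helper are hypothetical, not the HBase access-control API, and the sketch assumes that no action implies another.

```java
import java.util.EnumSet;
import java.util.Set;

class AccessCheckSketch {
  /** Hypothetical action set; only ADMIN and CREATE are named in the report. */
  enum Action { ADMIN, CREATE }

  /** A user passes a check only if the required action is among those granted. */
  static boolean isAllowed(Set<Action> granted, Action required) {
    return granted.contains(required);
  }

  public static void main(String[] args) {
    // USER_ADMIN as described for setup(): granted ADMIN only.
    Set<Action> userAdmin = EnumSet.of(Action.ADMIN);
    System.out.println("ADMIN allowed: " + isAllowed(userAdmin, Action.ADMIN));
    System.out.println("CREATE allowed: " + isAllowed(userAdmin, Action.CREATE));
  }
}
```

Under that assumption, any operation requiring CREATE is denied to a user granted only ADMIN, which is why both tests would be expected to fail.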




[jira] [Commented] (HBASE-21379) RegionServer Stop by ArrayIndexOutOfBoundsException of WAL when replication enabled

2018-10-25 Thread stack (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664642#comment-16664642
 ] 

stack commented on HBASE-21379:
---

Nice work [~openinx]. How would we mess up the decoding?
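
For context on why a decoding problem would surface in getFamilyLength, here is a self-contained sketch. It is not the HBase code and not a diagnosis of this issue; it assumes the serialized cell layout of 4-byte key length, 4-byte value length, 2-byte row length, row bytes, then a 1-byte family length. The class and method names are hypothetical. The point is that the family-length position is computed from the row length, so a bogus row length sends the read past the end of the array.

```java
import java.nio.ByteBuffer;

class KeyValueLayoutSketch {
  // Two 4-byte ints (key length, value length) precede the key.
  static final int ROW_OFFSET = 8;

  /** Reads the family length by first trusting the 2-byte row length. */
  static byte familyLength(byte[] buf, int offset) {
    short rowLength = ByteBuffer.wrap(buf, offset + ROW_OFFSET, 2).getShort();
    int familyLengthPos = offset + ROW_OFFSET + 2 + rowLength;
    // Throws ArrayIndexOutOfBoundsException if rowLength points past the array.
    return buf[familyLengthPos];
  }

  /** Builds a minimal cell with an empty value, using the assumed layout. */
  static byte[] encode(byte[] row, byte[] family, byte[] qualifier) {
    int keyLength = 2 + row.length + 1 + family.length + qualifier.length + 8 + 1;
    ByteBuffer b = ByteBuffer.allocate(8 + keyLength);
    b.putInt(keyLength).putInt(0);            // key length, value length
    b.putShort((short) row.length).put(row);  // row length, row
    b.put((byte) family.length).put(family);  // family length, family
    b.put(qualifier);
    b.putLong(0L).put((byte) 0);              // timestamp, type byte (arbitrary here)
    return b.array();
  }
}
```

Overwriting the high byte of the row length in an otherwise valid buffer is enough to make familyLength throw, which matches the shape of the failure (though not necessarily its cause) in the traces.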

> RegionServer Stop by ArrayIndexOutOfBoundsException of WAL when replication 
> enabled
> ---
>
> Key: HBASE-21379
> URL: https://issues.apache.org/jira/browse/HBASE-21379
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: justice
>Priority: Major
> Attachments: hbase-wal-p.tgz, log.tgz, wal-20181026.tgz, wal.tgz
>
>




[jira] [Commented] (HBASE-21322) Add a scheduleServerCrashProcedure() API to HbckService

2018-10-25 Thread stack (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664634#comment-16664634
 ] 

stack commented on HBASE-21322:
---

Thanks for the patch [~tianjingyun]. Looks better sir. One item to consider is 
that you may be duplicating what 
getServerNameFromWALDirectoryName#getServerNameFromWALDirectoryName does? Is 
that possible? 

If making a new patch, you might check if the Procedure is finished before 
claiming one is running (it may not be running):

if (serverName.compareTo(((ServerCrashProcedure) procedure).getServerName()) == 0) {

> Add a scheduleServerCrashProcedure() API to HbckService
> ---
>
> Key: HBASE-21322
> URL: https://issues.apache.org/jira/browse/HBASE-21322
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-21322.master.001.patch, 
> HBASE-21322.master.002.patch, HBASE-21322.master.003.patch, Screenshot from 
> 2018-10-17 13-35-58.png, Screenshot from 2018-10-17 13-38-41.png, Screenshot 
> from 2018-10-17 13-47-06.png
>
>
> In my testing, if a RegionServer goes down and all procedure logs are then 
> deleted, no ServerCrashProcedure is scheduled, and restarting the master 
> does not help. We therefore need a way to schedule a ServerCrashProcedure 
> manually. I plan to add a scheduleServerCrashProcedure() API to 
> HbckService, then expose this API in HBCK2.




[jira] [Updated] (HBASE-21383) Change refguide to point at hbck2 instead of hbck1

2018-10-25 Thread stack (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21383:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to master branch (will pull back to other branches at release time). 
Thanks for review [~Apache9]

> Change refguide to point at hbck2 instead of hbck1
> --
>
> Key: HBASE-21383
> URL: https://issues.apache.org/jira/browse/HBASE-21383
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21383.branch-2.1.001.patch
>
>
> Update the refguide. I




[jira] [Updated] (HBASE-21379) RegionServer Stop by ArrayIndexOutOfBoundsException of WAL when replication enabled

2018-10-25 Thread Zheng Hu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated HBASE-21379:
-
Attachment: wal-20181026.tgz

> RegionServer Stop by ArrayIndexOutOfBoundsException of WAL when replication 
> enabled
> ---
>
> Key: HBASE-21379
> URL: https://issues.apache.org/jira/browse/HBASE-21379
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0
>Reporter: justice
>Priority: Major
> Attachments: hbase-wal-p.tgz, log.tgz, wal-20181026.tgz, wal.tgz
>
>
> Log as follows:
> {code:java}
> // code placeholder
> 2018-10-24 09:22:42,381 INFO  [regionserver/11-3-19-10:16020] wal.AbstractFSWAL: New WAL /hbase/WALs/11-3-19-10.jd.local,16020,1540344155469/11-3-19-10.jd.local%2C16020%2C1540344155469.1540344162124
> 2018-10-24 09:23:05,151 ERROR [regionserver/11-3-19-10:16020.replicationSource.11-3-19-10.jd.local%2C16020%2C1540344155469,2.replicationSource.wal-reader.11-3-19-10.jd.local%2C16020%2C1540344155469,2] regionserver.ReplicationSource: Unexpected exception in regionserver/11-3-19-10:16020.replicationSource.11-3-19-10.jd.local%2C16020%2C1540344155469,2.replicationSource.wal-reader.11-3-19-10.jd.local%2C16020%2C1540344155469,2 currentPath=hdfs://11-3-18-67.JD.LOCAL:9000/hbase/WALs/11-3-19-10.jd.local,16020,1540344155469/11-3-19-10.jd.local%2C16020%2C1540344155469.1540344162124
> java.lang.ArrayIndexOutOfBoundsException: 8830
> at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1365)
> at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1358)
> at org.apache.hadoop.hbase.CellUtil.cloneFamily(CellUtil.java:114)
> at org.apache.hadoop.hbase.replication.ScopeWALEntryFilter.filterCell(ScopeWALEntryFilter.java:54)
> at org.apache.hadoop.hbase.replication.ChainWALEntryFilter.filterCells(ChainWALEntryFilter.java:90)
> at org.apache.hadoop.hbase.replication.ChainWALEntryFilter.filter(ChainWALEntryFilter.java:77)
> at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.filterEntry(ReplicationSourceWALReader.java:234)
> at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:170)
> at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:133)
> 2018-10-24 09:23:05,153 INFO [regionserver/11-3-19-10:16020.replicationSource.11-3-19-10.jd.local%2C16020%2C1540344155469,2.replicationSource.wal-reader.11-3-19-10.jd.local%2C16020%2C1540344155469,2] regionserver.HRegionServer: * STOPPING region server '11-3-19-10.jd.local,16020,1540344155469' *
> {code}
> hbase wal -p output
> {code:java}
> // code placeholder
> writer Classes: ProtobufLogWriter AsyncProtobufLogWriter
> Cell Codec Class: org.apache.hadoop.hbase.regionserver.wal.WALCellCodec
> Sequence=15 , region=fee7a9465ced6ce9e319d37e9d71c63c at write timestamp=Wed 
> Oct 24 09:22:49 CST 2018
> row=8000, column=METAFAMILY:HBASE::REGION_EVENT
> value: \x08\x00\x12\x1Cmlaas:ump_host_second_181029\x1A 
> fee7a9465ced6ce9e319d37e9d71c63c 
> \x0E*\x06\x0A\x01f\x12\x01f2\x1F\x0A\x1311-3-19-10.JD.LOCAL\x10\x94}\x18\xCD\x9A\xAA\x9D\xEA,:Umlaas:ump_host_second_181029,8000,1540271129253.fee7a9465ced6ce9e319d37e9d71c63c.
> Sequence=9 , region=ba6684888d826328a6373435124dc1cd at write timestamp=Wed 
> Oct 24 09:22:49 CST 2018
> row=9100, column=METAFAMILY:HBASE::REGION_EVENT
> ...
> row=34975#00, column=f:\x09,
> value: 
> {"tp50":1,"avg":2,"min":0,"tp90":1,"max":3,"count":13,"tp99":2,"tp999":2,"error":0}
> row=349824#00, column=f:\x08\xFA
> value: 
> {"tp50":2,"avg":2,"min":0,"tp90":2,"max":98,"count":957,"tp99":3,"tp999":34,"error":0}
> row=349824#00, column=f:\x08\xD2
> value: 
> {"tp50":2,"avg":2,"min":0,"tp90":2,"max":43,"count":1842,"tp99":2,"tp999":31,"error":0}
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 8830
> at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1365)
> at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1358)
> at 
> org.apache.hadoop.hbase.wal.WALPrettyPrinter.toStringMap(WALPrettyPrinter.java:336)
> at 
> org.apache.hadoop.hbase.wal.WALPrettyPrinter.processFile(WALPrettyPrinter.java:290)
> at org.apache.hadoop.hbase.wal.WALPrettyPrinter.run(WALPrettyPrinter.java:421)
> at 
> org.apache.hadoop.hbase.wal.WALPrettyPrinter.main(WALPrettyPrinter.java:356)
> {code}
>  
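
Both traces above fail in KeyValue.getFamilyLength with index 8830. As a reading aid, here is a minimal sketch of how the position of the family-length byte depends on the row-length field, assuming the usual serialized KeyValue layout (4-byte key length, 4-byte value length, 2-byte row length, row bytes, 1-byte family length). FamilyLengthProbe and its methods are hypothetical names, not HBase classes, and the corrupted row length of 8820 is chosen only so that the computed index matches the number in the trace; it is an illustration, not a diagnosis of this WAL.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not an HBase class): finds the family-length byte of a serialized KeyValue. */
final class FamilyLengthProbe {
  // Assumed layout: keyLength(4) | valueLength(4) | rowLength(2) | row | familyLength(1) | ...
  static final int LENGTHS_SIZE = 8;     // key length + value length
  static final int ROW_LENGTH_SIZE = 2;

  /** Index of the family-length byte for a KeyValue that starts at offset. */
  static int familyLengthIndex(byte[] buf, int offset) {
    int rowLengthPos = offset + LENGTHS_SIZE;
    short rowLength = ByteBuffer.wrap(buf, rowLengthPos, ROW_LENGTH_SIZE).getShort();
    return rowLengthPos + ROW_LENGTH_SIZE + rowLength;
  }

  /** Family length, or -1 when the header or the computed index is out of range. */
  static int safeFamilyLength(byte[] buf, int offset) {
    if (offset < 0 || offset + LENGTHS_SIZE + ROW_LENGTH_SIZE > buf.length) {
      return -1;
    }
    int idx = familyLengthIndex(buf, offset);
    return (idx < 0 || idx >= buf.length) ? -1 : buf[idx];
  }

  /** A tiny well-formed key prefix: row "r1", family "f". */
  static byte[] sample() {
    byte[] row = "r1".getBytes(StandardCharsets.UTF_8);
    ByteBuffer bb = ByteBuffer.allocate(LENGTHS_SIZE + ROW_LENGTH_SIZE + row.length + 2);
    bb.putInt(0).putInt(0);                   // key/value lengths are not used by this probe
    bb.putShort((short) row.length).put(row); // row length + row
    bb.put((byte) 1).put((byte) 'f');         // family length + family
    return bb.array();
  }

  public static void main(String[] args) {
    byte[] good = sample();
    System.out.println(safeFamilyLength(good, 0));  // family length of the intact sample

    byte[] bad = good.clone();
    bad[8] = 0x22;  // overwrite the row length with 0x2274 = 8820 (illustrative value)
    bad[9] = 0x74;
    System.out.println(familyLengthIndex(bad, 0));  // 8 + 2 + 8820
    System.out.println(safeFamilyLength(bad, 0));   // out of range for a 14-byte buffer
  }
}
```

The point of the sketch is that an unchecked read trusts the row-length field, so a single damaged field sends the lookup far outside the buffer. A bounds check such as safeFamilyLength turns that into a detectable error.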



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HBASE-21383) Change refguide to point at hbck2 instead of hbck1

2018-10-25 Thread stack (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21383:
--
Fix Version/s: (was: 2.1.1)
   (was: 2.2.0)

> Change refguide to point at hbck2 instead of hbck1
> --
>
> Key: HBASE-21383
> URL: https://issues.apache.org/jira/browse/HBASE-21383
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21383.branch-2.1.001.patch
>
>
> Update the refguide. I




[jira] [Commented] (HBASE-21379) RegionServer Stop by ArrayIndexOutOfBoundsException of WAL when replication enabled

2018-10-25 Thread Zheng Hu (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664625#comment-16664625
 ] 

Zheng Hu commented on HBASE-21379:
--

Decoded the hlog, and read the byte hex carefully. I found something... 
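
A note for reading the dump that follows: each intact cell appears to start with three 4-byte big-endian integers (total length, key length, value length), and in the intact cells the total equals key length + value length + 8 (for example 'z' = 122, ' ' = 32, 'R' = 82). The sketch below checks that relationship on the first 12 bytes of a cell. CellHeaderCheck is a hypothetical helper written for this note, not part of HBase or of the tool that produced the dump.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: sanity-checks the three length fields at the start of a dumped cell. */
final class CellHeaderCheck {
  /** True when total == keyLength + valueLength + 8 and both lengths are non-negative. */
  static boolean headerConsistent(byte[] cell) {
    if (cell.length < 12) {
      return false;
    }
    ByteBuffer bb = ByteBuffer.wrap(cell);  // big-endian by default
    int total = bb.getInt();
    int keyLength = bb.getInt();
    int valueLength = bb.getInt();
    return keyLength >= 0 && valueLength >= 0
        && (long) keyLength + valueLength + 8 == total;
  }

  /** Converts dump text to bytes one char per byte, so "\0" and ASCII map directly. */
  static byte[] latin1(String s) {
    return s.getBytes(StandardCharsets.ISO_8859_1);
  }

  public static void main(String[] args) {
    // First 12 bytes of the first cell: \x00\x00\x00z \x00\x00\x00<space> \x00\x00\x00R
    byte[] intact = latin1("\0\0\0z\0\0\0 \0\0\0R");
    // First 12 bytes of a damaged cell: \x00\x00\x00} followed by JSON text 99":2,"e
    byte[] damaged = latin1("\0\0\0}99\":2,\"e");
    System.out.println(headerConsistent(intact));   // true: 122 == 32 + 82 + 8
    System.out.println(headerConsistent(damaged));  // false: JSON bytes read as lengths
  }
}
```

Cells whose header fails this check are the ones where the length prefix is followed by the tail of a JSON value instead of a key length.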
{code}
Writer Classes: ProtobufLogWriter AsyncProtobufLogWriter
Cell Codec Class: org.apache.hadoop.hbase.regionserver.wal.WALCellCodec
==> expectedCells: 45
==> KeyValue Cell Hex: \x00\x00\x00z\x00\x00\x00 
\x00\x00\x00R\x00\x118242341#02#438917\x01f\x0A^\x00\x00\x01f\xAC\x8A\xBC\xE9\x04{"tp50":5,"avg":5,"min":5,"tp90":5,"max":5,"count":1,"tp99":5,"tp999":5,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00|\x00\x00\x00"\x00\x00\x00R\x00\x138242741#02#12029960\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBDY\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":1,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00}\x00\x00\x00"\x00\x00\x00S\x00\x138242741#02#12029961\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBDY\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":89,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00|\x00\x00\x00"\x00\x00\x00R\x00\x138242741#02#12029962\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBDY\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":1,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00}\x00\x00\x00"\x00\x00\x00S\x00\x138242741#02#12029963\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBDY\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":94,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00}\x00\x00\x00"\x00\x00\x00S\x00\x138242741#02#12029996\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBDC\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":93,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00}\x00\x00\x00"\x00\x00\x00S\x00\x138242741#02#12029997\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBDC\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":85,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00|\x00\x00\x00"\x00\x00\x00R\x00\x138242741#02#12030005\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBDC\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":3,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00}\x00\x00\x00"\x00\x00\x00S\x00\x138242741#02#12030058\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBD}\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":75,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00}\x00\x00\x00"\x00\x00\x00S\x00\x138242741#02#12030061\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBD}\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":92,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00|\x00\x00\x00"\x00\x00\x00R\x00\x138242741#02#12030066\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBDZ\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":1,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00~\x00\x00\x00"\x00\x00\x00T\x00\x138242741#02#12030068\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBDZ\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":103,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00|\x00\x00\x00"\x00\x00\x00R\x00\x138242741#02#12030069\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBDZ\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":1,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00}\x00\x00\x00"\x00\x00\x00S\x00\x138242741#02#12030070\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBDZ\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":92,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00|\x00\x00\x00"\x00\x00\x00R\x00\x138242741#02#12030071\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBDY\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":1,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00}\x00\x00\x00"\x00\x00\x00S\x00\x138242741#02#12030098\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBC\xFE\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":91,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00}99":2,"error":0}\x00\x00\x00{\x00\x00\x00\x1F\x00\x00\x00T\x00\x108484941#02#39103\x01f\x0A_\x00\x00\x01f\xAC\x8A\xBDr\x04{"tp50":0,"avg":1,"min":0,"tp90":0,"max":2,"count":103,"tp99":2,"t
==> KeyValue Cell Hex: 
\x00\x00\x00|\x00\x00\x00"\x00\x00\x00R\x00\x138242741#02#12030139\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBC\xDA\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":1,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00}\x00\x00\x00"\x00\x00\x00S\x00\x138242741#02#12030141\x01f\x0Ab\x00\x00\x01f\xAC\x8A\xBC\xDA\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":84,"tp99":0,"tp999":0,"error":0}
==> KeyValue Cell Hex: 
\x00\x00\x00|or":0}\x00\x00\x00y\x00\x00\x00\x1F\x00\x00\x00R\x00\x10894737#02#225822\x01f\x0A^\x00\x00\x01f\xAC\x8A\xBC\x9F\x04{"tp50":1,"avg":1,"min":1,"tp90":1,"max":1,"count":1,"tp99":1,"tp999":1,"er
==> KeyValue Cell Hex: 
\x00\x00\x00}"error":0}\x00\x00\x00y\x00\x00\x00\x1F\x00\x00\x00R\x00\x10894737#02#419456\x01f\x0A^\x00\x00\x01f\xAC\x8A\xBCi\x04{"tp50":0,"avg":0,"min":0,"tp90":0,"max":0,"count":1,"tp99

[jira] [Commented] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread stack (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664604#comment-16664604
 ] 

stack commented on HBASE-21380:
---

I pushed .003 patch on branch-2.1 and started up some nightlies to see how it 
does overnight. Will push elsewhere if goes well.

> TestRSGroups failing
> 
>
> Key: HBASE-21380
> URL: https://issues.apache.org/jira/browse/HBASE-21380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21380.branch-2.1.001.patch, 
> HBASE-21380.branch-2.1.002.patch, HBASE-21380.branch-2.1.003.patch, 
> HBASE-21380.branch-2.1.004.patch, HBASE-21380.master.001.patch, 
> HBASE-21380.master.002.patch
>
>
> only failing on branch-2.1




[jira] [Commented] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread stack (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664602#comment-16664602
 ] 

stack commented on HBASE-21380:
---

Pushed .001 but reverted it after [~mdrob] put up better fix on .003.

> TestRSGroups failing
> 
>
> Key: HBASE-21380
> URL: https://issues.apache.org/jira/browse/HBASE-21380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21380.branch-2.1.001.patch, 
> HBASE-21380.branch-2.1.002.patch, HBASE-21380.branch-2.1.003.patch, 
> HBASE-21380.branch-2.1.004.patch, HBASE-21380.master.001.patch, 
> HBASE-21380.master.002.patch
>
>
> only failing on branch-2.1




[jira] [Updated] (HBASE-21389) Revisit the procedure lock for sync replication

2018-10-25 Thread Duo Zhang (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21389:
--
Attachment: HBASE-21389.patch

> Revisit the procedure lock for sync replication
> ---
>
> Key: HBASE-21389
> URL: https://issues.apache.org/jira/browse/HBASE-21389
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2, Replication
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21389.patch
>
>





[jira] [Updated] (HBASE-21389) Revisit the procedure lock for sync replication

2018-10-25 Thread Duo Zhang (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21389:
--
Assignee: Duo Zhang
  Status: Patch Available  (was: Open)

> Revisit the procedure lock for sync replication
> ---
>
> Key: HBASE-21389
> URL: https://issues.apache.org/jira/browse/HBASE-21389
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2, Replication
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21389.patch
>
>





[jira] [Commented] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-25 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664596#comment-16664596
 ] 

Hadoop QA commented on HBASE-21175:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
35s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
19s{color} | {color:red} hbase-server: The patch generated 8 new + 8 unchanged 
- 2 fixed = 16 total (was 10) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
28s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m  5s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}129m 
31s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}174m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21175 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945660/HBASE-21175.v04.patch 
|
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 5be57ca5838d 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / cd943419b6 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14865/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14865/testReport/ |
| Max. process+thread count | 4742 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output |

[jira] [Updated] (HBASE-16040) Remove configuration "hbase.replication"

2018-10-25 Thread Yechao Chen (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-16040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yechao Chen updated HBASE-16040:

Description: 
This configuration was introduced to reduce the overhead of replication. 
Now the overhead of replication is negligible. Besides that, this config is 
not in hbase-default.xml, so users have to read the code to learn about it and 
its default value, which is unfriendly.

So let's remove it. Suggestions?

  was:
+underlined text+This configuration was introduced to reduce the overhead of 
replication. 
Now the overhead of replication is negligible. Besides that, this config is not 
in hbase-default.xml, so users have to read the code to learn about it and its 
default value, which is unfriendly.

So let's remove it. Suggestions?


> Remove configuration "hbase.replication"
> 
>
> Key: HBASE-16040
> URL: https://issues.apache.org/jira/browse/HBASE-16040
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Major
> Fix For: 2.0.0
>
> Attachments: HBASE-16040.patch, HBASE-16040_v1.patch
>
>
> This configuration was introduced to reduce the overhead of replication. 
> Now the overhead of replication is negligible. Besides that, this config is 
> not in hbase-default.xml, so users have to read the code to learn about it and 
> its default value, which is unfriendly.
> So let's remove it. Suggestions?
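
To illustrate the complaint in the description, here is a minimal sketch of a setting whose default lives only in code. The lookup helper mimics the getBoolean(key, default) style of configuration access using a plain Map; HiddenDefaultSketch and its members are hypothetical, and the default of true is an assumption made for the example rather than something stated in this thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a configuration lookup with a code-side default. */
final class HiddenDefaultSketch {
  static final String REPLICATION_KEY = "hbase.replication";
  // Assumed default for illustration; it appears only here, not in any defaults file.
  static final boolean REPLICATION_DEFAULT = true;

  /** Returns the configured value, or the supplied default when the key is absent. */
  static boolean getBoolean(Map<String, String> conf, String key, boolean defaultValue) {
    String raw = conf.get(key);
    return raw == null ? defaultValue : Boolean.parseBoolean(raw.trim());
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // Key absent: the value comes from the constant above, invisible to anyone reading config files.
    System.out.println(getBoolean(conf, REPLICATION_KEY, REPLICATION_DEFAULT));
    conf.put(REPLICATION_KEY, "false");
    System.out.println(getBoolean(conf, REPLICATION_KEY, REPLICATION_DEFAULT));
  }
}
```

With the key absent from every file, an operator has no way to discover the setting or its default without reading the source. That is the usability problem the issue raises.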




[jira] [Updated] (HBASE-16040) Remove configuration "hbase.replication"

2018-10-25 Thread Yechao Chen (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-16040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yechao Chen updated HBASE-16040:

Description: 
+underlined text+This configuration was introduced to reduce the overhead of 
replication. 
Now the overhead of replication is negligible. Besides that, this config is not 
in hbase-default.xml, so users have to read the code to learn about it and its 
default value, which is unfriendly.

So let's remove it. Suggestions?

  was:
This configuration was introduced to reduce the overhead of replication. 
Now the overhead of replication is negligible. Besides that, this config is not 
in hbase-default.xml, so users have to read the code to learn about it and its 
default value, which is unfriendly.

So let's remove it. Suggestions?


> Remove configuration "hbase.replication"
> 
>
> Key: HBASE-16040
> URL: https://issues.apache.org/jira/browse/HBASE-16040
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Major
> Fix For: 2.0.0
>
> Attachments: HBASE-16040.patch, HBASE-16040_v1.patch
>
>
> +underlined text+This configuration was introduced to reduce the overhead of 
> replication. 
> Now the overhead of replication is negligible. Besides that, this config is 
> not in hbase-default.xml, so users have to read the code to learn about it and 
> its default value, which is unfriendly.
> So let's remove it. Suggestions?




[jira] [Commented] (HBASE-21383) Change refguide to point at hbck2 instead of hbck1

2018-10-25 Thread Duo Zhang (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664589#comment-16664589
 ] 

Duo Zhang commented on HBASE-21383:
---

+1.

> Change refguide to point at hbck2 instead of hbck1
> --
>
> Key: HBASE-21383
> URL: https://issues.apache.org/jira/browse/HBASE-21383
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-21383.branch-2.1.001.patch
>
>
> Update the refguide. I




[jira] [Updated] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread Mike Drob (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-21380:
--
Attachment: HBASE-21380.branch-2.1.004.patch

> TestRSGroups failing
> 
>
> Key: HBASE-21380
> URL: https://issues.apache.org/jira/browse/HBASE-21380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21380.branch-2.1.001.patch, 
> HBASE-21380.branch-2.1.002.patch, HBASE-21380.branch-2.1.003.patch, 
> HBASE-21380.branch-2.1.004.patch, HBASE-21380.master.001.patch, 
> HBASE-21380.master.002.patch
>
>
> only failing on branch-2.1




[jira] [Commented] (HBASE-21387) Race condition in snapshot cache refreshing leads to loss of snapshot files

2018-10-25 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664588#comment-16664588
 ] 

Hadoop QA commented on HBASE-21387:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  
2s{color} | {color:blue} The patch file was not named according to hbase's 
naming conventions. Please see 
https://yetus.apache.org/documentation/0.8.0/precommit-patchnames for 
instructions. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
58s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} hbase-server: The patch generated 0 new + 1 
unchanged - 1 fixed = 1 total (was 2) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
13s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 19s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}132m 
38s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}179m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21387 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945664/21387.v1.txt |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 9cfe574aa0cb 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / cd943419b6 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:

[jira] [Updated] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread Mike Drob (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-21380:
--
Attachment: HBASE-21380.branch-2.1.003.patch

> TestRSGroups failing
> 
>
> Key: HBASE-21380
> URL: https://issues.apache.org/jira/browse/HBASE-21380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21380.branch-2.1.001.patch, 
> HBASE-21380.branch-2.1.002.patch, HBASE-21380.branch-2.1.003.patch, 
> HBASE-21380.master.001.patch, HBASE-21380.master.002.patch
>
>
> only failing on branch-2.1




[jira] [Created] (HBASE-21389) Revisit the procedure lock for sync replication

2018-10-25 Thread Duo Zhang (JIRA)

Duo Zhang created HBASE-21389:
-

 Summary: Revisit the procedure lock for sync replication
 Key: HBASE-21389
 URL: https://issues.apache.org/jira/browse/HBASE-21389
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2, Replication
Reporter: Duo Zhang
 Fix For: 3.0.0







[jira] [Updated] (HBASE-21375) Revisit the lock and queue implementation in MasterProcedureScheduler

2018-10-25 Thread Duo Zhang (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21375:
--
Attachment: HBASE-21375-v1.patch

> Revisit the lock and queue implementation in MasterProcedureScheduler
> -
>
> Key: HBASE-21375
> URL: https://issues.apache.org/jira/browse/HBASE-21375
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21375-UT.patch, HBASE-21375-UT2.patch, 
> HBASE-21375-v1.patch, HBASE-21375.patch
>
>





[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-25 Thread Jingyun Tian (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664571#comment-16664571
 ] 

Jingyun Tian commented on HBASE-19121:
--

Yes. Maybe we can hide these in one single method.

Sounds good. I'll try to help:D.

> HBCK for AMv2 (A.K.A HBCK2)
> ---
>
> Key: HBASE-19121
> URL: https://issues.apache.org/jira/browse/HBASE-19121
> Project: HBase
>  Issue Type: Umbrella
>  Components: hbck, hbck2
>Reporter: stack
>Assignee: Umesh Agashe
>Priority: Major
> Fix For: hbck2-1.0.0
>
> Attachments: hbase-19121.master.001.patch
>
>
> We don't have an hbck for the new AM. Old hbck may actually do damage going 
> against AMv2.
> Fix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HBASE-21322) Add a scheduleServerCrashProcedure() API to HbckService

2018-10-25 Thread Jingyun Tian (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21322:
-
Attachment: HBASE-21322.master.003.patch

> Add a scheduleServerCrashProcedure() API to HbckService
> ---
>
> Key: HBASE-21322
> URL: https://issues.apache.org/jira/browse/HBASE-21322
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-21322.master.001.patch, 
> HBASE-21322.master.002.patch, HBASE-21322.master.003.patch, Screenshot from 
> 2018-10-17 13-35-58.png, Screenshot from 2018-10-17 13-38-41.png, Screenshot 
> from 2018-10-17 13-47-06.png
>
>
> In my tests, if a RegionServer goes down and all procedure logs are then 
> deleted, no ServerCrashProcedure is scheduled, and restarting the master does 
> not help. We therefore need a way to schedule a ServerCrashProcedure 
> manually. I plan to add a scheduleServerCrashProcedure() API to HbckService, 
> then expose this API in HBCK2.
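
As a rough illustration of the behaviour discussed on this issue (schedule a ServerCrashProcedure for a named server, but skip it when an unfinished one already exists for that server), here is a self-contained sketch. Every type and method in it is a hypothetical stand-in; it does not reproduce the patch or the real HbckService signature, and server names are modelled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical model of the scheduling check; none of these types are HBase classes. */
final class ScpSchedulerSketch {
  /** Stand-in for a ServerCrashProcedure entry in the procedure list. */
  static final class Scp {
    final long procId;
    final String serverName;
    boolean finished;

    Scp(long procId, String serverName) {
      this.procId = procId;
      this.serverName = serverName;
    }
  }

  private final List<Scp> procedures = new ArrayList<>();
  private long nextProcId = 1;

  /** Returns the new procedure id, or empty when an unfinished SCP already covers the server. */
  Optional<Long> scheduleServerCrashProcedure(String serverName) {
    for (Scp p : procedures) {
      // Only an unfinished procedure counts as "already running" for this server.
      if (!p.finished && p.serverName.equals(serverName)) {
        return Optional.empty();
      }
    }
    Scp scp = new Scp(nextProcId++, serverName);
    procedures.add(scp);
    return Optional.of(scp.procId);
  }

  /** Marks the procedure with the given id as finished. */
  void markFinished(long procId) {
    for (Scp p : procedures) {
      if (p.procId == procId) {
        p.finished = true;
      }
    }
  }

  public static void main(String[] args) {
    ScpSchedulerSketch scheduler = new ScpSchedulerSketch();
    String rs = "rs1.example.org,16020,1540344155469";  // made-up server name
    System.out.println(scheduler.scheduleServerCrashProcedure(rs).isPresent()); // scheduled
    System.out.println(scheduler.scheduleServerCrashProcedure(rs).isPresent()); // skipped
    scheduler.markFinished(1);
    System.out.println(scheduler.scheduleServerCrashProcedure(rs).isPresent()); // scheduled again
  }
}
```

The finished flag is what separates "a procedure for this server exists" from "a procedure for this server is still running". Without it, a completed procedure would block every later manual request for the same server.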




[jira] [Commented] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread stack (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664550#comment-16664550
 ] 

stack commented on HBASE-21380:
---

Who put that errorprone crap in our build! (smile).

It found a good one. Going w/ 001 for now so can put up a 2.1.1 since not an 
easy fix and I want to push the RC.



> TestRSGroups failing
> 
>
> Key: HBASE-21380
> URL: https://issues.apache.org/jira/browse/HBASE-21380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21380.branch-2.1.001.patch, 
> HBASE-21380.branch-2.1.002.patch, HBASE-21380.master.001.patch, 
> HBASE-21380.master.002.patch
>
>
> only failing on branch-2.1




[jira] [Created] (HBASE-21388) No need to instantiate MemStore for master which not carry table

2018-10-25 Thread Guanghao Zhang (JIRA)

Guanghao Zhang created HBASE-21388:
--

 Summary: No need to instantiate MemStore for master which not 
carry table
 Key: HBASE-21388
 URL: https://issues.apache.org/jira/browse/HBASE-21388
 Project: HBase
  Issue Type: Improvement
Reporter: Guanghao Zhang


We found this log in our master.

2018-10-26,10:00:00,449 INFO 
[master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] 
org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data 
MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0
2018-10-26,10:00:00,452 INFO 
[master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] 
org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index 
MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0

 

As with HBASE-21290, we don't need to instantiate MemStore for a master that 
does not carry tables.

 




[jira] [Commented] (HBASE-20671) Merged region brought back to life causing RS to be killed by Master

2018-10-25 Thread Allan Yang (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-20671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664537#comment-16664537
 ] 

Allan Yang commented on HBASE-20671:


This one is still valid. I have seen a similar issue where a split parent was 
brought online, causing the referenced file to be compacted. Will come back 
later.

> Merged region brought back to life causing RS to be killed by Master
> 
>
> Key: HBASE-20671
> URL: https://issues.apache.org/jira/browse/HBASE-20671
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.0.0
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Attachments: 0001-Test-for-HBASE-20671.patch, 
> hbase-hbase-master-ctr-e138-1518143905142-336066-01-03.hwx.site.log.zip, 
> hbase-hbase-regionserver-ctr-e138-1518143905142-336066-01-02.hwx.site.log.zip,
>  workaround.txt
>
>
> Another bug coming out of a master restart and replay of the pv2 logs.
> The master merged two regions into one successfully, was restarted, but then 
> ended up assigning the children region back out to the cluster. There is a 
> log message which appears to indicate that RegionStates acknowledges that it 
> doesn't know what this region is as it's replaying the pv2 WAL; however, it 
> incorrectly assumes that the region is just OFFLINE and needs to be assigned.
> {noformat}
> 2018-05-30 04:26:00,055 INFO  
> [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] master.HMaster: 
> Client=hrt_qa//172.27.85.11 Merge regions a7dd6606dcacc9daf085fc9fa2aecc0c 
> and 4017a3c778551d4d258c785d455f9c0b
> 2018-05-30 04:28:27,525 DEBUG 
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> procedure2.ProcedureExecutor: Completed pid=4368, state=SUCCESS; 
> MergeTableRegionsProcedure table=tabletwo_merge, 
> regions=[a7dd6606dcacc9daf085fc9fa2aecc0c, 4017a3c778551d4d258c785d455f9c0b], 
> forcibly=false
> {noformat}
> {noformat}
> 2018-05-30 04:29:20,263 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.AssignmentManager: a7dd6606dcacc9daf085fc9fa2aecc0c 
> regionState=null; presuming OFFLINE
> 2018-05-30 04:29:20,263 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!! 
> rit=OFFLINE, location=null, table=tabletwo_merge, 
> region=a7dd6606dcacc9daf085fc9fa2aecc0c
> 2018-05-30 04:29:20,266 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.AssignmentManager: 4017a3c778551d4d258c785d455f9c0b 
> regionState=null; presuming OFFLINE
> 2018-05-30 04:29:20,266 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!! 
> rit=OFFLINE, location=null, table=tabletwo_merge, 
> region=4017a3c778551d4d258c785d455f9c0b
> {noformat}
> Eventually, the RS reports in its online regions, and the master tells it to 
> kill itself:
> {noformat}
> 2018-05-30 04:29:24,272 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=26,queue=2,port=2] 
> assignment.AssignmentManager: Killing 
> ctr-e138-1518143905142-336066-01-02.hwx.site,16020,1527654546619: Not 
> online: tabletwo_merge,,1527652130538.a7dd6606dcacc9daf085fc9fa2aecc0c.
> {noformat}




[jira] [Commented] (HBASE-21297) ModifyTableProcedure can throw TNDE instead of IOE in case of REGION_REPLICATION change

2018-10-25 Thread Allan Yang (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664536#comment-16664536
 ] 

Allan Yang commented on HBASE-21297:


+1

> ModifyTableProcedure can throw TNDE instead of IOE in case of 
> REGION_REPLICATION change
> ---
>
> Key: HBASE-21297
> URL: https://issues.apache.org/jira/browse/HBASE-21297
> Project: HBase
>  Issue Type: Improvement
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21297.master.001.patch
>
>
> Currently {{ModifyTableProcedure}} throws an {{IOException}} (See 
> [ModifyTableProcedure.java#L252|https://github.com/apache/hbase/blob/924d183ba0e67b975e998f6006c993f457e03c20/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyTableProcedure.java#L252])
>  when a user tries to modify REGION_REPLICATION for an enabled table. 
> Instead, it can throw a more specific {{TableNotDisabledException}}.
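
A minimal sketch of the proposed behavior, assuming a locally defined `TableNotDisabledException` that stands in for the HBase class of the same name. The method name and its parameters are hypothetical and do not mirror the real `ModifyTableProcedure` code.

```java
import java.io.IOException;

// Hypothetical pre-check for a REGION_REPLICATION change on a table.
final class ModifyTableCheckSketch {
    /** Local stand-in for the more specific exception; it is still an IOException. */
    static class TableNotDisabledException extends IOException {
        TableNotDisabledException(String tableName) {
            super("Table " + tableName + " must be disabled to change REGION_REPLICATION");
        }
    }

    /** Rejects a region replication change while the table is still enabled. */
    static void checkRegionReplicationChange(String tableName, boolean tableEnabled,
                                             int oldReplication, int newReplication)
            throws IOException {
        if (tableEnabled && oldReplication != newReplication) {
            throw new TableNotDisabledException(tableName);
        }
    }
}
```

Because the specific exception remains a subtype of `IOException` in this sketch, callers that already catch `IOException` keep working, while new callers can catch the narrower type.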




[jira] [Commented] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664501#comment-16664501
 ] 

Hadoop QA commented on HBASE-21380:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
28s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
51s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
18s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
37s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
8s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} branch-2.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
41s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 41s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
13s{color} | {color:red} hbase-server: The patch generated 1 new + 178 
unchanged - 0 fixed = 179 total (was 178) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
37s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 51s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}208m 41s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
48s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}245m 35s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAsyncTableGetMultiThreaded |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-21380 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945667/HBASE-21380.branch-2.1.002.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 1998553b7cd2 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.1 / e71c05707e |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| compile | 
https://builds.apache.org/job/PreCommi

[jira] [Commented] (HBASE-21365) Throw exception when user put data with skip wal to a table which may be replicated

2018-10-25 Thread Guanghao Zhang (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664493#comment-16664493
 ] 

Guanghao Zhang commented on HBASE-21365:


Added the 003 patch, which is rebased.

> Throw exception when user put data with skip wal to a table which may be 
> replicated
> ---
>
> Key: HBASE-21365
> URL: https://issues.apache.org/jira/browse/HBASE-21365
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 3.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21365.master.001.patch, 
> HBASE-21365.master.002.patch, HBASE-21365.master.003.patch
>
>
> A real problem in our production cluster. A user pointed out that his table's 
> data couldn't be replicated to the peer cluster. We then started to debug 
> the reason. We checked the replication scope, the replication wal entry 
> filter, and the namespace and tablecfs config, but didn't find any problem. 
> We enabled the RS's debug log to find the reason. Finally, we found that the 
> user used put with skip wal to write data, but it took a long time... Our 
> replication uses the wal to replicate data, so the data can't be replicated 
> to the peer cluster. I think throwing an exception may be better for the user 
> if the table's replication scope is not 0 (as 0 means not replicated).
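
A minimal sketch of the check described above, assuming a local `Durability` enum and an integer replication scope as stand-ins for the HBase types. The method name and the choice of `IllegalArgumentException` are assumptions for illustration, not what the attached patches do.

```java
// Hypothetical validation of a put before it is written.
final class SkipWalCheckSketch {
    /** Simplified durability levels; only SKIP_WAL matters for this check. */
    enum Durability { USE_DEFAULT, SKIP_WAL, SYNC_WAL }

    /** Scope 0 means the column family is not replicated. */
    static final int REPLICATION_SCOPE_LOCAL = 0;

    /** Rejects a put that skips the WAL on a column family that may be replicated. */
    static void checkPut(Durability durability, int replicationScope) {
        if (durability == Durability.SKIP_WAL && replicationScope != REPLICATION_SCOPE_LOCAL) {
            throw new IllegalArgumentException(
                "Put with SKIP_WAL cannot be replicated; replication scope is " + replicationScope);
        }
    }
}
```

Failing fast here surfaces the problem at write time, instead of leaving the user to discover missing data on the peer cluster.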




[jira] [Updated] (HBASE-21365) Throw exception when user put data with skip wal to a table which may be replicated

2018-10-25 Thread Guanghao Zhang (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-21365:
---
Attachment: HBASE-21365.master.003.patch

> Throw exception when user put data with skip wal to a table which may be 
> replicated
> ---
>
> Key: HBASE-21365
> URL: https://issues.apache.org/jira/browse/HBASE-21365
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 3.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21365.master.001.patch, 
> HBASE-21365.master.002.patch, HBASE-21365.master.003.patch
>
>
> A real problem in our production cluster. A user pointed out that his table's 
> data couldn't be replicated to the peer cluster. We then started to debug 
> the reason. We checked the replication scope, the replication wal entry 
> filter, and the namespace and tablecfs config, but didn't find any problem. 
> We enabled the RS's debug log to find the reason. Finally, we found that the 
> user used put with skip wal to write data, but it took a long time... Our 
> replication uses the wal to replicate data, so the data can't be replicated 
> to the peer cluster. I think throwing an exception may be better for the user 
> if the table's replication scope is not 0 (as 0 means not replicated).




[jira] [Commented] (HBASE-21385) HTable.delete request use rpc call directly instead of AsyncProcess

2018-10-25 Thread Guanghao Zhang (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664490#comment-16664490
 ] 

Guanghao Zhang commented on HBASE-21385:


Thanks [~stack].

> HTable.delete request use rpc call directly instead of AsyncProcess
> ---
>
> Key: HBASE-21385
> URL: https://issues.apache.org/jira/browse/HBASE-21385
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.1.0, 2.2.0, 2.0.2
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21385.master.001.patch, 
> HBASE-21385.master.002.patch
>
>
> HBASE-16592 unified the delete request to use AsyncProcess, but the job was 
> not finished: we still use a direct rpc call for get, put, append, and 
> increment, and only use AsyncProcess for batch requests. I also found one 
> problem in HBASE-21365: the rpc call throws a DoNotRetryIOException, but 
> AsyncProcess wraps it in a new RetriesExhaustedWithDetailsException, which 
> is not right. So I think HTable.delete should use the rpc call directly, the 
> same as the get, put, append, and increment requests.
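
A minimal sketch of the difference between the two paths, assuming locally defined exception classes that only borrow the names of the HBase ones. `rpcDelete`, `deleteDirect`, and `deleteViaBatch` are hypothetical helpers, not HTable or AsyncProcess methods.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.List;

// Illustrates how a batch-style wrapper hides the original exception type.
final class DeletePathSketch {
    /** Local stand-in for a non-retriable rejection from the server. */
    static class DoNotRetryIOException extends IOException {
        DoNotRetryIOException(String msg) {
            super(msg);
        }
    }

    /** Local stand-in for the aggregate exception reported by a batch path. */
    static class RetriesExhaustedWithDetailsException extends IOException {
        final List<Throwable> causes;

        RetriesExhaustedWithDetailsException(List<Throwable> causes) {
            super("Failed " + causes.size() + " action(s)");
            this.causes = causes;
        }
    }

    /** Hypothetical single rpc that the server always rejects. */
    static void rpcDelete(String row) throws IOException {
        throw new DoNotRetryIOException("rejected delete of row " + row);
    }

    /** Direct path: the caller sees the original exception type. */
    static void deleteDirect(String row) throws IOException {
        rpcDelete(row);
    }

    /** Batch-style path: the original exception is wrapped in an aggregate. */
    static void deleteViaBatch(String row) throws IOException {
        try {
            rpcDelete(row);
        } catch (IOException e) {
            throw new RetriesExhaustedWithDetailsException(
                Collections.<Throwable>singletonList(e));
        }
    }
}
```

With the direct path, a caller's `catch (DoNotRetryIOException e)` clause matches; with the batch-style path it does not, and the caller has to dig through the list of causes.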




[jira] [Updated] (HBASE-21385) HTable.delete request use rpc call directly instead of AsyncProcess

2018-10-25 Thread stack (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21385:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

+1 Pushed to branch-2.0+. Did it myself so I could include it in 2.1.1. Nice 
test BTW [~zghaobac].

> HTable.delete request use rpc call directly instead of AsyncProcess
> ---
>
> Key: HBASE-21385
> URL: https://issues.apache.org/jira/browse/HBASE-21385
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.1.0, 2.2.0, 2.0.2
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21385.master.001.patch, 
> HBASE-21385.master.002.patch
>
>
> HBASE-16592 unified the delete request to use AsyncProcess, but the job was 
> not finished: we still use a direct rpc call for get, put, append, and 
> increment, and only use AsyncProcess for batch requests. I also found one 
> problem in HBASE-21365: the rpc call throws a DoNotRetryIOException, but 
> AsyncProcess wraps it in a new RetriesExhaustedWithDetailsException, which 
> is not right. So I think HTable.delete should use the rpc call directly, the 
> same as the get, put, append, and increment requests.




[jira] [Commented] (HBASE-21322) Add a scheduleServerCrashProcedure() API to HbckService

2018-10-25 Thread stack (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664407#comment-16664407
 ] 

stack commented on HBASE-21322:
---

Took a look at the patch. If an operator wants to schedule a SCP, why not just 
do it? You have all this checking for -splitting. Maybe there is some aspect of 
SCP that they need that does not involve WALs... some cleanup of master state 
or something? Just let the SCP go through? IIRC, it is able to handle case of 
many SCPs on the one server.

Otherwise, the patch looks good. Thanks [~tianjingyun]

> Add a scheduleServerCrashProcedure() API to HbckService
> ---
>
> Key: HBASE-21322
> URL: https://issues.apache.org/jira/browse/HBASE-21322
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-21322.master.001.patch, 
> HBASE-21322.master.002.patch, Screenshot from 2018-10-17 13-35-58.png, 
> Screenshot from 2018-10-17 13-38-41.png, Screenshot from 2018-10-17 
> 13-47-06.png
>
>
> According to my test, if one RS is down and all procedure logs are then 
> deleted, no ServerCrashProcedure is scheduled, and restarting the master 
> does not help. Thus we need to schedule a ServerCrashProcedure manually 
> to solve the problem. I plan to add a scheduleServerCrashProcedure() API to 
> HbckService, then add this API to HBCK2.




[jira] [Commented] (HBASE-21297) ModifyTableProcedure can throw TNDE instead of IOE in case of REGION_REPLICATION change

2018-10-25 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664390#comment-16664390
 ] 

Hadoop QA commented on HBASE-21297:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
51s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
27s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 13s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}203m 
34s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}248m 47s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21297 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945620/HBASE-21297.master.001.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 96712abd3af9 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 66469733ec |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14862/testReport/ |
| Max. process+thread count | 4793 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14862/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Modi

[jira] [Resolved] (HBASE-9559) getRowKeyAtOrBefore may be incorrect for some cases

2018-10-25 Thread stack (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-9559.
--
Resolution: Won't Fix

We don't use getRowAtOrBefore anymore. Removed in hbase-2.

> getRowKeyAtOrBefore may be incorrect for some cases
> ---
>
> Key: HBASE-9559
> URL: https://issues.apache.org/jira/browse/HBASE-9559
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Minor
>
> See also HBASE-9503. Unless I'm missing something, getRowKeyAtOrBefore does 
> not handle cross-file deletes correctly. It also doesn't handle timestamps 
> between two candidates of the same row if they are in different files (the 
> latest by ts is going to be returned).
> It is only used for meta, so it might be working due to low update rate, lack 
> of anomalies and the fact that row values in meta are reasonably persistent, 
> new ones are only added in split.




[jira] [Resolved] (HBASE-7044) verifyRegionLocation in CatalogTracker.java didn't check if regionserver is in the cluster

2018-10-25 Thread stack (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-7044.
--
Resolution: Won't Fix

Resolving old issue w/ no progress.

> verifyRegionLocation in CatalogTracker.java didn't check if  regionserver is 
> in the cluster
> ---
>
> Key: HBASE-7044
> URL: https://issues.apache.org/jira/browse/HBASE-7044
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.0
>Reporter: wonderyl
>Priority: Major
>
> At the beginning there was one whole hbase cluster. Then I decided to split 
> it into 2 clusters, one for offline mining and one for online service. The 
> online one was stripped out, and the offline one contains the original master.
> Unfortunately, the META of the original cluster was assigned to one of the 
> stripped machines, and as there is a cache policy for META, the offline 
> cluster was still accessing the META on the stripped one.
> After inspecting the code, I found that verifyRegionLocation in 
> CatalogTracker.java checks whether the region server still contains the 
> region, but it doesn't check whether the region server is still in the 
> cluster, which is very easy: just check whether it is registered in zk.
> All in all, I had to shut down the online cluster and restart the offline 
> one; then the META was re-assigned and everything was back to normal.
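
A minimal sketch of the extra membership check suggested above, assuming an in-memory set stands in for the region servers registered in zk and a map stands in for asking a server which regions it hosts. All names are hypothetical and do not mirror the real `CatalogTracker` code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical location check that also requires cluster membership.
final class RegionLocationCheckSketch {
    /** Servers currently registered with this cluster (stand-in for zk registrations). */
    private final Set<String> registeredServers = new HashSet<>();
    /** Which server reports hosting each region (stand-in for asking the server). */
    private final Map<String, String> regionHost = new HashMap<>();

    void register(String server) {
        registeredServers.add(server);
    }

    void unregister(String server) {
        registeredServers.remove(server);
    }

    void host(String region, String server) {
        regionHost.put(region, server);
    }

    /** A cached location is valid only if the server is in the cluster AND hosts the region. */
    boolean verifyRegionLocation(String region, String server) {
        if (!registeredServers.contains(server)) {
            return false;
        }
        return server.equals(regionHost.get(region));
    }
}
```

In the scenario from the report, the stripped machine still hosts META but is no longer registered, so this check would reject the cached location.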




[jira] [Resolved] (HBASE-7959) HBCK skips regions that have been recently modified, which then leads to it to report holes in region chain

2018-10-25 Thread stack (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-7959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-7959.
--
Resolution: Won't Fix

Resolving old issue that has had no work done.

> HBCK skips regions that have been recently modified, which then leads to it 
> to report holes in region chain
> ---
>
> Key: HBASE-7959
> URL: https://issues.apache.org/jira/browse/HBASE-7959
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Reporter: Enis Soztutar
>Priority: Major
>
> While lots of region splits are going on, HBCK incorrectly reports 
> inconsistencies, since it skips recently modified regions but does not take 
> them into account when computing the region chain. 
> {code}
> 13/02/28 03:33:16 WARN util.HBaseFsck: Region { meta => 
> cluster_test,,1362021481742.69639761fdf693ab1e2bf33f523cd1ae., hdfs => 
> NN:8020/apps/hbase-trunk/data/cluster_test/69639761fdf693ab1e2bf33f523cd1ae, 
> deployed =>  } was recently modified -- skipping
> 13/02/28 03:33:16 DEBUG util.HBaseFsck: There are 23 region info entries
> ERROR: (region 
> cluster_test,0ccc,1362021481742.ec3ba583b4ea01393591572bf1f31e07.) First 
> region should start with an empty key.  You need to  create a new region and 
> regioninfo in HDFS to plug the hole.
> ERROR: Found inconsistency in table cluster_test
> Summary:
>   -ROOT- is okay.
> Number of regions: 1
> Deployed on:  RSs
>   .META. is okay.
> Number of regions: 1
> Deployed on:  RSs
> Table cluster_test is inconsistent.
> Number of regions: 19
> Deployed on:  RSs
> 1 inconsistencies detected.
> Status: INCONSISTENT
> {code}
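
A minimal sketch of why skipping a region produces a false hole, assuming regions are reduced to start and end keys with an empty string marking the table boundary. The class and method names are hypothetical and do not mirror `HBaseFsck`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical region-chain check over [startKey, endKey) ranges.
final class RegionChainSketch {
    static final class Region {
        final String startKey;
        final String endKey;
        final boolean recentlyModified;

        Region(String startKey, String endKey, boolean recentlyModified) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.recentlyModified = recentlyModified;
        }
    }

    /**
     * Returns the keys at which the chain is broken. The empty string is the
     * table boundary, so a healthy chain starts and ends with "".
     */
    static List<String> findHoles(List<Region> regions, boolean includeRecentlyModified) {
        List<Region> chain = new ArrayList<>();
        for (Region r : regions) {
            if (includeRecentlyModified || !r.recentlyModified) {
                chain.add(r);
            }
        }
        chain.sort(Comparator.comparing((Region r) -> r.startKey));

        List<String> holes = new ArrayList<>();
        String expectedStart = "";
        for (Region r : chain) {
            if (!r.startKey.equals(expectedStart)) {
                holes.add(expectedStart);
            }
            expectedStart = r.endKey;
        }
        if (!expectedStart.isEmpty()) {
            holes.add(expectedStart);
        }
        return holes;
    }
}
```

With a recently modified first region covering `""` to `"0ccc"` and a second region from `"0ccc"` to `""`, leaving the first region out of the chain reports a hole at the empty start key, which matches the "First region should start with an empty key" error; including it reports none. The sketch ignores overlaps and empty tables.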




[jira] [Resolved] (HBASE-13535) Regions go unassigned when meta-carrying RS is killed

2018-10-25 Thread stack (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-13535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-13535.
---
Resolution: Invalid

No DLR anymore. Assignment is different now.

> Regions go unassigned when meta-carrying RS is killed
> -
>
> Key: HBASE-13535
> URL: https://issues.apache.org/jira/browse/HBASE-13535
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
>
> hbase-1.1 will be the first release with DLR on by default. I've been running 
>  ITBLLs on a cluster trying to find issues with DLR. My first few runs ran 
> nicely... but the current run failed complaining regions are not online and 
> indeed recovery is stuck making no progress.
> Upon examination, it looks to be an assignment rather than DLR issue. A 
> server carrying meta has its meta log replayed first but we are seemingly 
> failing to assign regions after meta is back online.
> Meantime, my regionserver logs are filling with spewing complaint that 
> regions are not online (we should dampen our logging of region not being 
> online... ) and then the split log workers are stuck:
> {code}
> Thread 13206 (RS_LOG_REPLAY_OPS-c2021:16020-1-Writer-2):
>   State: TIMED_WAITING
>   Blocked count: 45
>   Waited count: 59
>   Stack:
> java.lang.Thread.sleep(Native Method)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$LogReplayOutputSink.waitUntilRegionOnline(WALSplitter.java:1959)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$LogReplayOutputSink.locateRegionAndRefreshLastFlushedSequenceId(WALSplitter.java:1857)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$LogReplayOutputSink.groupEditsByServer(WALSplitter.java:1761)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$LogReplayOutputSink.append(WALSplitter.java:1674)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.writeBuffer(WALSplitter.java:1104)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.doRun(WALSplitter.java:1096)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.run(WALSplitter.java:1066)
> Thread 13205 (RS_LOG_REPLAY_OPS-c2021:16020-1-Writer-1):
>   State: TIMED_WAITING
>   Blocked count: 45
>   Waited count: 59
>   Stack:
> java.lang.Thread.sleep(Native Method)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$LogReplayOutputSink.waitUntilRegionOnline(WALSplitter.java:1959)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$LogReplayOutputSink.locateRegionAndRefreshLastFlushedSequenceId(WALSplitter.java:1857)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$LogReplayOutputSink.groupEditsByServer(WALSplitter.java:1761)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$LogReplayOutputSink.append(WALSplitter.java:1674)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.writeBuffer(WALSplitter.java:1104)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.doRun(WALSplitter.java:1096)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.run(WALSplitter.java:1066)
> Thread 13204 (RS_LOG_REPLAY_OPS-c2021:16020-1-Writer-0):
>   State: TIMED_WAITING
>   Blocked count: 50
>   Waited count: 63
>   Stack:
> java.lang.Thread.sleep(Native Method)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$LogReplayOutputSink.waitUntilRegionOnline(WALSplitter.java:1959)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$LogReplayOutputSink.locateRegionAndRefreshLastFlushedSequenceId(WALSplitter.java:1857)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$LogReplayOutputSink.groupEditsByServer(WALSplitter.java:1761)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$LogReplayOutputSink.append(WALSplitter.java:1674)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.writeBuffer(WALSplitter.java:1104)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.doRun(WALSplitter.java:1096)
> 
> org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.run(WALSplitter.java:1066)
> {code}
> ...complaining that:
> 2015-04-22 21:28:02,746 DEBUG [RS_LOG_REPLAY_OPS-c2021:16020-1] 
> wal.WALSplitter: Used 134248328 bytes of buffered edits, waiting for IO 
> threads...
> The accounting seems off around here in SSH where it is moving regions that 
> were on dead server to OFFLINE but is reporting no regions to assign:
> {code}
> 143320 2015-04-21 17:05:07,571 INFO  [MASTER_SERVER_OPERATIONS-c2020:16000-0] 
> handler.ServerShutdownHandler: Mark regions in recovery for crashed server 
> c2024.halxg.cloudera.com,16020,1429660802192 before assignment; regions=[]
> 143321 2015-04-21 17:05:07,572 DEBUG [MASTER_SERVER_OPERATIONS-c2020:16000-0] 
> master.RegionStates: Adding to processed servers 
> c2024.halxg.cloudera.com,16020,1429660802192
> 143322 2015-04-21 17:05:07,575 INFO  [MASTER_SERVER_OPERATIONS-c2020:16000-0] 
> master.RegionStates: Transition {8d63312bc39a39727afea627bb20fee4

[jira] [Resolved] (HBASE-17801) Assigning dead region causing FAILED_OPEN permanent RIT that needs manual resolve

2018-10-25 Thread stack (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-17801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-17801.
---
Resolution: Cannot Reproduce

Resolving as 'cannot reproduce'

> Assigning dead region causing FAILED_OPEN permanent RIT that needs manual 
> resolve 
> --
>
> Key: HBASE-17801
> URL: https://issues.apache.org/jira/browse/HBASE-17801
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 1.1.2
>Reporter: Stephen Yuan Jiang
>Assignee: Stephen Yuan Jiang
>Priority: Critical
>
> In Apache HBase 1.x, there is an Assignment Manager bug when SSH and drop table 
> happen at the same time.  Here is the sequence:
> (1). The Region Server hosting the target region is dead; SSH (server 
> shutdown handler) offlines all regions hosted by the RS: 
> {noformat}
> 2017-02-20 20:39:25,022 ERROR 
> org.apache.hadoop.hbase.master.MasterRpcServices: Region server 
> rs01.foo.com,60020,1486760911253 reported a fatal error:
> ABORTING region server rs01.foo.com,60020,1486760911253: 
> regionserver:60020-0x55a076071923f5f, 
> quorum=zk01.foo.com:2181,zk02.foo.com:2181,zk3.foo.com:2181, baseZNode=/hbase 
> regionserver:60020-0x1234567890abcdf received expired from ZooKeeper, aborting
> Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired
>   at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:613)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:524)
>   at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:534)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
> 2017-02-20 20:42:43,775 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs 
> for rs01.foo.com,60020,1486760911253 before assignment; region count=999
> 2017-02-20 20:43:31,784 INFO org.apache.hadoop.hbase.master.RegionStates: 
> Transition {783a4814b862a6e23a3265a874c3048b state=OPEN, ts=1487568368296, 
> server=rs01.foo.com,60020,1486760911253} to {783a4814b862a6e23a3265a874c3048b 
> state=OFFLINE, ts=1487648611784, server=rs01.foo.com,60020,1486760911253}
> {noformat}
> (2). Now SSH goes through each region and checks whether it should be 
> re-assigned (at this time, SSH does check whether a table is disabled/deleted). 
>  If a region needs to be re-assigned, it is put into a list.  Since at 
> this time the troubled region still belongs to an enabled table, it will 
> be in the list.
> {noformat}
> 2017-02-20 20:43:31,795 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Reassigning 999 
> region(s) that rs01.foo.com,60020,1486760911253 was carrying (and 0 
> regions(s) that were opening on this server)
> {noformat}
> (3). Now, disable and delete table come in and also try to offline the 
> region; since the region is already offlined, delete table just removes 
> the region from meta and from memory.
> {noformat}
> 2017-02-20 20:43:32,429 INFO org.apache.hadoop.hbase.master.HMaster: 
> Client=b_kylin/null disable t1
> 2017-02-20 20:43:34,275 INFO 
> org.apache.hadoop.hbase.zookeeper.ZKTableStateManager: Moving table t1 state 
> from DISABLING to DISABLED
> 2017-02-20 20:43:34,276 INFO 
> org.apache.hadoop.hbase.master.procedure.DisableTableProcedure: Disabled 
> table, t1, is completed.
> 2017-02-20 20:43:35,624 INFO org.apache.hadoop.hbase.master.HMaster: 
> Client=b_kylin/null delete t1
> 2017-02-20 20:43:36,011 INFO org.apache.hadoop.hbase.MetaTableAccessor: 
> Deleted [{ENCODED => fbf9fda1381636aa5b3cd6e3fe0f6c1e, NAME => 
> 't1,,1487568367030.fbf9fda1381636aa5b3cd6e3fe0f6c1e.', STARTKEY => '', ENDKEY 
> => '\x00\x01'}, {ENCODED => 783a4814b862a6e23a3265a874c3048b, NAME => 
> 't1,\x00\x01,1487568367030.783a4814b862a6e23a3265a874c3048b.', STARTKEY => 
> '\x00\x01', ENDKEY => ''}]
> {noformat}
> (4). However, SSH calls Assignment Manager to reassign the dead region (note 
> that the dead region is in the re-assign list SSH collected and we don't 
> re-check it).
> {noformat}
> 2017-02-20 20:43:52,725 WARN 
> org.apache.hadoop.hbase.master.AssignmentManager: Assigning but not in region 
> states: {ENCODED => 783a4814b862a6e23a3265a874c3048b, NAME => 
> 't1,\x00\x01,1487568367030.783a4814b862a6e23a3265a874c3048b.', STARTKEY => 
> '\x00\x01', ENDKEY => ''}
> {noformat}
> (5).  On the region server where the dead region tries to land, because the 
> table is dropped, we cannot open the region, and the dead region is now in 
> FAILED_OPEN, which is a permanent RIT state. 
> {noformat}
> 2017-02-20 20:43:52,861 INFO 
> org.apache.hadoop.hbase.regionserver.RSRpcServices: Open 
> t1,\x00\x01,1487568367030.783a4814b86

[jira] [Updated] (HBASE-14090) Redo FS layout; let go of tables/regions/stores directory hierarchy in DFS

2018-10-25 Thread stack (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14090:
--
Component/s: fsredo

> Redo FS layout; let go of tables/regions/stores directory hierarchy in DFS
> --
>
> Key: HBASE-14090
> URL: https://issues.apache.org/jira/browse/HBASE-14090
> Project: HBase
>  Issue Type: Sub-task
>  Components: fsredo
>Reporter: stack
>Assignee: Sean Busbey
>Priority: Major
>
> Our layout as is won't work with 1M regions; e.g. HDFS will fall over if 
> directories hold hundreds of thousands of files. HBASE-13991 (Humongous Tables) 
> would address this specific directory problem only, by adding subdirs under the 
> table dir, but there are other issues with our current layout:
>  * Our table/regions/column family 'facade' has to be maintained in two 
> locations -- in master memory and in the hdfs directory layout -- and the 
> farce needs to be kept synced or worse, the model management is split between 
> master memory and DFS layout. 'Syncing' in HDFS has us dropping constructs 
> such as 'Reference' and 'HalfHFiles' on split, 'HFileLinks' when archiving, 
> and so on. This 'tie' makes it hard to make changes.
>  * While HDFS has atomic rename, useful for fencing and for having files 
> added atomically, if the model were solely owned by hbase, there are hbase 
> primitives we could make use of -- changes in a row are atomic and 
> coprocessors -- to simplify table transactions and provide more consistent 
> views of our model to clients; file 'moves' could be a memory operation only 
> rather than an HDFS call; sharing files between tables/snapshots and when it 
> is safe to remove them would be simplified if one owner only; and so on.
> This is an umbrella blue-sky issue to discuss what a new layout would look 
> like and how we might get there. I'll follow up with some sketches of what a 
> new layout could look like that came out of some chats a few of us have been 
> having. We are also under the 'delusion' that a move to a new layout could be 
> done as part of a rolling upgrade and that the amount of work involved is not 
> gargantuan.




[jira] [Updated] (HBASE-7806) Isolate the FileSystem calls

2018-10-25 Thread stack (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-7806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7806:
-
Component/s: fsredo

> Isolate the FileSystem calls
> 
>
> Key: HBASE-7806
> URL: https://issues.apache.org/jira/browse/HBASE-7806
> Project: HBase
>  Issue Type: Task
>  Components: fsredo, regionserver
>Affects Versions: 0.95.2
>Reporter: Matteo Bertozzi
>Priority: Minor
> Attachments: HBASE-7806.pdf
>
>
> Motivations:
> * No way to change the fs layout without touching all the code (and breaking 
> stuff)
> * Each test creates its own mocked fs layout: mkdir(region), mkdir(new 
> Path(region, family))
> * new Path(table, region, family, hfile) is everywhere
> * DIR_NAME constants are not in a single place
> * lots of code repeats the same for (region.listStatus()) for 
> (family.listStatus()) ... loops
> Goals:
>  * Central point for all the fs operations
>  ** Make file creation easier
>  ** Make store file retrieval easier (with proper reference, link 
> handling)
>  * Removing all the new Path() from the code
>  * Removing all the fs.listStatus() from the code
>  * Clean up the tests to use the new classes for creating mock objects
>  * Reduce the code (in theory there should be fewer lines than before once this 
> refactor is complete)
> Since the fs operations are all over the code, the refactor must be gradual, 
> limiting the internal fs layout visibility step by step.
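The "central point for all the fs operations" goal could look roughly like the sketch below. This is only an illustration, not the design in the attached PDF: the class name FsLayoutSketch, the DATA_DIR constant and the method names are hypothetical, and java.nio.file.Path stands in for org.apache.hadoop.fs.Path so the example compiles without Hadoop on the classpath.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Hypothetical single owner of the table/region/family/hfile layout.
 * Callers ask this class for locations instead of building paths themselves.
 */
final class FsLayoutSketch {
  // Directory-name constants live in one place (name is an assumption).
  static final String DATA_DIR = "data";

  private final Path root;

  FsLayoutSketch(Path root) {
    this.root = root;
  }

  Path tableDir(String table) {
    return root.resolve(DATA_DIR).resolve(table);
  }

  Path regionDir(String table, String region) {
    return tableDir(table).resolve(region);
  }

  Path familyDir(String table, String region, String family) {
    return regionDir(table, region).resolve(family);
  }

  // Replaces ad hoc "new Path(table, region, family, hfile)" call sites.
  Path storeFile(String table, String region, String family, String hfile) {
    return familyDir(table, region, family).resolve(hfile);
  }

  public static void main(String[] args) {
    FsLayoutSketch layout = new FsLayoutSketch(Paths.get("/hbase"));
    System.out.println(layout.storeFile("t1", "r1", "cf", "hfile1"));
  }
}
```

With something of this shape, a layout change touches one class, and tests can build their mock layout through the same methods rather than through hand-written mkdir calls.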




[jira] [Updated] (HBASE-8109) HBase can manage blocks instead of (or inside) files in HDFS

2018-10-25 Thread stack (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-8109:
-
Component/s: fsredo

> HBase can manage blocks instead of (or inside) files in HDFS
> 
>
> Key: HBASE-8109
> URL: https://issues.apache.org/jira/browse/HBASE-8109
> Project: HBase
>  Issue Type: Brainstorming
>  Components: fsredo
>Reporter: Sergey Shelukhin
>Priority: Major
>
> Prompted by previous non-Hadoop experience and some dev list discussions, and 
> after talking to some HDFS people about blocks.
> HBase could improve a lot by managing HDFS blocks instead of files, and 
> reusing the blocks among other things. Some areas that could improve are 
> splits, compactions, management of large blobs, locality enforcement.
> I was told that block APIs in Hadoop 2 are well-isolated, but not exposed 
> yet. They can easily be exposed, and as one of the first potential users we 
> could get to help shape them. Two areas that, from my limited understanding, 
> are currently fuzzy are namespaces for blocks and ref-counting.
> We should come up with a list of initial scenarios to figure out what we need 
> from the block API (locality, detecting/enforcing block boundary/variable size 
> blocks, reusing one block, ...).




[jira] [Resolved] (HBASE-6184) HRegionInfo was null or empty in Meta

2018-10-25 Thread stack (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-6184.
--
Resolution: Cannot Reproduce

Resolving old issue that we don't see anymore as 'cannot reproduce'

> HRegionInfo was null or empty in Meta 
> --
>
> Key: HBASE-6184
> URL: https://issues.apache.org/jira/browse/HBASE-6184
> Project: HBase
>  Issue Type: Bug
>  Components: Client, io
>Affects Versions: 0.94.0
>Reporter: jiafeng.zhang
>Priority: Major
> Attachments: HBASE-6184.patch
>
>
> insert data
> hadoop-0.23.2 + hbase-0.94.0
> 2012-06-07 13:09:38,573 WARN  
> [org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation] 
> Encountered problems when prefetch META table: 
> java.io.IOException: HRegionInfo was null or empty in Meta for hbase_one_col, 
> row=hbase_one_col,09115303780247449149,99
> at 
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:160)
> at 
> org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:48)
> at 
> org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:126)
> at 
> org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:123)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:359)
> at 
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:123)
> at 
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:99)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:894)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:948)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:836)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1482)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367)
> at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:945)
> at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:801)
> at org.apache.hadoop.hbase.client.HTable.put(HTable.java:776)
> at 
> org.apache.hadoop.hbase.client.HTablePool$PooledHTable.put(HTablePool.java:397)
> at com.dinglicom.hbase.HbaseImport.insertData(HbaseImport.java:177)
> at com.dinglicom.hbase.HbaseImport.run(HbaseImport.java:210)
> at java.lang.Thread.run(Thread.java:662)




[jira] [Resolved] (HBASE-15355) region.jsp can not be found on info server of master

2018-10-25 Thread stack (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-15355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-15355.
---
Resolution: Won't Fix

This issue became invalid after we ruled out master hosting Regions (See above 
for agreement by Jianwei).

> region.jsp can not be found on info server of master
> 
>
> Key: HBASE-15355
> URL: https://issues.apache.org/jira/browse/HBASE-15355
> Project: HBase
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 2.0.0
>Reporter: Jianwei Cui
>Priority: Minor
>
> After [HBASE-10569|https://issues.apache.org/jira/browse/HBASE-10569], master 
> is also a regionserver and it will serve regions of system tables. The meta 
> region info can be viewed on master at an address such as: 
> http://localhost:16010/region.jsp?name=1588230740. The real path of 
> region.jsp for the request will be hbase-webapps/master/region.jsp on master; 
> however, region.jsp is under the directory hbase-webapps/regionserver, so 
> it cannot be found on master.




[jira] [Updated] (HBASE-20671) Merged region brought back to life causing RS to be killed by Master

2018-10-25 Thread Josh Elser (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-20671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-20671:
---
Resolution: Cannot Reproduce
Status: Resolved  (was: Patch Available)

Closing this as we probably need to start fresh. Too much has changed since the 
original info.

> Merged region brought back to life causing RS to be killed by Master
> 
>
> Key: HBASE-20671
> URL: https://issues.apache.org/jira/browse/HBASE-20671
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.0.0
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Attachments: 0001-Test-for-HBASE-20671.patch, 
> hbase-hbase-master-ctr-e138-1518143905142-336066-01-03.hwx.site.log.zip, 
> hbase-hbase-regionserver-ctr-e138-1518143905142-336066-01-02.hwx.site.log.zip,
>  workaround.txt
>
>
> Another bug coming out of a master restart and replay of the pv2 logs.
> The master merged two regions into one successfully, was restarted, but then 
> ended up assigning the children region back out to the cluster. There is a 
> log message which appears to indicate that RegionStates acknowledges that it 
> doesn't know what this region is as it's replaying the pv2 WAL; however, it 
> incorrectly assumes that the region is just OFFLINE and needs to be assigned.
> {noformat}
> 2018-05-30 04:26:00,055 INFO  
> [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] master.HMaster: 
> Client=hrt_qa//172.27.85.11 Merge regions a7dd6606dcacc9daf085fc9fa2aecc0c 
> and 4017a3c778551d4d258c785d455f9c0b
> 2018-05-30 04:28:27,525 DEBUG 
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> procedure2.ProcedureExecutor: Completed pid=4368, state=SUCCESS; 
> MergeTableRegionsProcedure table=tabletwo_merge, 
> regions=[a7dd6606dcacc9daf085fc9fa2aecc0c, 4017a3c778551d4d258c785d455f9c0b], 
> forcibly=false
> {noformat}
> {noformat}
> 2018-05-30 04:29:20,263 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.AssignmentManager: a7dd6606dcacc9daf085fc9fa2aecc0c 
> regionState=null; presuming OFFLINE
> 2018-05-30 04:29:20,263 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!! 
> rit=OFFLINE, location=null, table=tabletwo_merge, 
> region=a7dd6606dcacc9daf085fc9fa2aecc0c
> 2018-05-30 04:29:20,266 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.AssignmentManager: 4017a3c778551d4d258c785d455f9c0b 
> regionState=null; presuming OFFLINE
> 2018-05-30 04:29:20,266 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!! 
> rit=OFFLINE, location=null, table=tabletwo_merge, 
> region=4017a3c778551d4d258c785d455f9c0b
> {noformat}
> Eventually, the RS reports in its online regions, and the master tells it to 
> kill itself:
> {noformat}
> 2018-05-30 04:29:24,272 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=26,queue=2,port=2] 
> assignment.AssignmentManager: Killing 
> ctr-e138-1518143905142-336066-01-02.hwx.site,16020,1527654546619: Not 
> online: tabletwo_merge,,1527652130538.a7dd6606dcacc9daf085fc9fa2aecc0c.
> {noformat}
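The "regionState=null; presuming OFFLINE" step in the logs above can be illustrated with a toy model. This is not the AssignmentManager code and not a proposed patch: RegionReplaySketch, its State enum and the mergedParents set are all hypothetical, and the sketch only contrasts presuming OFFLINE for any unknown region with first checking whether the region was consumed by a completed merge.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of region state lookup during replay; all names are hypothetical. */
final class RegionReplaySketch {
  enum State { OFFLINE, OPEN, MERGED }

  private final Map<String, State> states = new HashMap<>();
  private final Set<String> mergedParents = new HashSet<>();

  /** Record a completed merge: the two inputs disappear, the result is open. */
  void recordMerge(String parentA, String parentB, String merged) {
    mergedParents.add(parentA);
    mergedParents.add(parentB);
    states.remove(parentA);
    states.remove(parentB);
    states.put(merged, State.OPEN);
  }

  /** Behaviour matching the log: an unknown region is presumed OFFLINE. */
  State resolveNaive(String region) {
    return states.getOrDefault(region, State.OFFLINE);
  }

  /** Guarded variant: an unknown region that was merged away is not assignable. */
  State resolveGuarded(String region) {
    State known = states.get(region);
    if (known != null) {
      return known;
    }
    return mergedParents.contains(region) ? State.MERGED : State.OFFLINE;
  }

  public static void main(String[] args) {
    RegionReplaySketch sketch = new RegionReplaySketch();
    sketch.recordMerge("a7dd6606", "4017a3c7", "mergedRegion");
    System.out.println("naive:   " + sketch.resolveNaive("a7dd6606"));
    System.out.println("guarded: " + sketch.resolveGuarded("a7dd6606"));
  }
}
```

In the naive lookup the merged-away region comes back as OFFLINE and would be handed to assignment, which is the symptom reported here; the guarded lookup keeps it out of the assignable set.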




[jira] [Commented] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread stack (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664338#comment-16664338
 ] 

stack commented on HBASE-21380:
---

At first I had a violent reaction to ServerManager having to now know about 
ServerCrashProcedure, but being able to ask isFinished at the point of adding to 
DeadServers is nice; it addresses a concern I had about how a 'processing' server 
gets re-added to the processing list.

+1 from me.

> TestRSGroups failing
> 
>
> Key: HBASE-21380
> URL: https://issues.apache.org/jira/browse/HBASE-21380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21380.branch-2.1.001.patch, 
> HBASE-21380.branch-2.1.002.patch, HBASE-21380.master.001.patch, 
> HBASE-21380.master.002.patch
>
>
> only failing on branch-2.1




[jira] [Commented] (HBASE-21344) hbase:meta location in ZooKeeper set to OPENING by the procedure which eventually failed but precludes Master from assigning it forever

2018-10-25 Thread Josh Elser (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664328#comment-16664328
 ] 

Josh Elser commented on HBASE-21344:


Thanks, Stack. Doing so now.

> hbase:meta location in ZooKeeper set to OPENING by the procedure which 
> eventually failed but precludes Master from assigning it forever
> ---
>
> Key: HBASE-21344
> URL: https://issues.apache.org/jira/browse/HBASE-21344
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
> Fix For: 2.0.3
>
> Attachments: HBASE-21344-branch-2.0.patch, 
> HBASE-21344-branch-2.0_v2.patch, HBASE-21344-branch-2.0_v3.patch, 
> HBASE-21344.branch-2.0.001.patch, HBASE-21344.branch-2.0.003-addendum.patch, 
> HBASE-21344.branch-2.0.003.patch
>
>
> [~elserj] has already summarized it well.
> 1. hbase:meta was on RS8
> 2. RS8 crashed, SCP was queued for it, meta first
> 3. meta was marked OFFLINE
> 4. meta marked as OPENING on RS3
> 5. Can't actually send the openRegion RPC to RS3 due to the krb ticket issue
> 6. We attempt the openRegion/assignment 10 times, failing each time
> 7. We start rolling back the procedure:
> {code:java}
> 2018-10-08 06:51:24,440 WARN  [PEWorker-9] procedure2.ProcedureExecutor: 
> Usually this should not happen, we will release the lock before if the 
> procedure is finished, even if the holdLock is true, arrive here means we 
> have some holes where we do not release the lock. And the releaseLock below 
> may fail since the procedure may have already been deleted from the procedure 
> store.
> 2018-10-08 06:51:24,543 INFO  [PEWorker-9] 
> procedure.MasterProcedureScheduler: pid=48, ppid=47, 
> state=FAILED:REGION_TRANSITION_QUEUE, 
> exception=org.apache.hadoop.hbase.client.RetriesExhaustedException via 
> AssignProcedure:org.apache.hadoop.hbase.client.RetriesExhaustedException: Max 
> attempts exceeded; AssignProcedure table=hbase:meta, region=1588230740 
> checking lock on 1588230740
> {code}
> {code:java}
> 2018-10-08 06:51:30,957 ERROR [PEWorker-9] procedure2.ProcedureExecutor: 
> CODE-BUG: Uncaught runtime exception for pid=47, 
> state=FAILED:SERVER_CRASH_ASSIGN_META, locked=true, 
> exception=org.apache.hadoop.hbase.client.RetriesExhaustedException via 
> AssignProcedure:org.apache.hadoop.hbase.client.RetriesExhaustedException: Max 
> attempts exceeded; ServerCrashProcedure 
> server=,16020,1538974612843, splitWal=true, meta=true
> java.lang.UnsupportedOperationException: unhandled 
> state=SERVER_CRASH_GET_REGIONS
>   at 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure.rollbackState(ServerCrashProcedure.java:254)
>   at 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure.rollbackState(ServerCrashProcedure.java:58)
>   at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
>   at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:960)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1577)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1539)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1418)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1981)
> {code}
> {code:java}
> { DEBUG [PEWorker-2] client.RpcRetryingCallerImpl: Call exception, tries=7, 
> retries=7, started=8168 ms ago, cancelled=false, msg=Meta region is in state 
> OPENING, details=row 'backup:system' on table 'hbase:meta' at 
> region=hbase:meta,,1.1588230740, hostname=, seqNum=-1, 
> exception=java.io.IOException: Meta region is in state OPENING
> at 
> org.apache.hadoop.hbase.client.ZKAsyncRegistry.lambda$null$1(ZKAsyncRegistry.java:154)
> at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
> at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
> at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at 
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
> at 
> org.apache.hadoop.hbase.client.ZKAsyncRegistry.lambda$getAndConvert$0(ZKAsyncRegistry.java:77)
> at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
> at 
> java.util.concurrent.CompletableFutu

[jira] [Commented] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-25 Thread Ted Yu (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664301#comment-16664301
 ] 

Ted Yu commented on HBASE-21175:


lgtm

Pending QA

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch
>
>
> TestHFileArchiving#testCleaningRace creates HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, NPE comes out of this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.
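The first option (checking for a null master) might look like the sketch below. It is a stand-alone illustration under stated assumptions, not the attached patch: SnapshotCleanerSketch, the MasterStub interface and the "master" params key are hypothetical stand-ins for the real cleaner, master services and parameter map, and returning an empty list when the master is missing is one conservative choice (delete nothing) rather than a confirmed behaviour.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Stand-alone sketch of a cleaner that tolerates a missing master. */
final class SnapshotCleanerSketch {
  /** Hypothetical stand-in for the master-side snapshot lookup. */
  interface MasterStub {
    List<String> unreferenced(List<String> files);
  }

  // Hypothetical params key; the real key name is not shown in this issue.
  static final String MASTER_KEY = "master";

  private MasterStub master;

  void init(Map<String, Object> params) {
    // instanceof is false for null, so a missing entry leaves master unset.
    if (params != null && params.get(MASTER_KEY) instanceof MasterStub) {
      master = (MasterStub) params.get(MASTER_KEY);
    }
  }

  List<String> getDeletableFiles(List<String> files) {
    if (master == null) {
      // Partially initialized: without snapshot info, treat nothing as deletable.
      return Collections.emptyList();
    }
    return master.unreferenced(files);
  }

  public static void main(String[] args) {
    SnapshotCleanerSketch cleaner = new SnapshotCleanerSketch();
    cleaner.init(Collections.emptyMap());
    System.out.println(cleaner.getDeletableFiles(List.of("hfile1")));
  }
}
```

The second option from the description, passing the master instance when the test constructs the cleaner, avoids the guard entirely but requires every caller to supply it.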




[jira] [Commented] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread Mike Drob (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664296#comment-16664296
 ] 

Mike Drob commented on HBASE-21380:
---

[~stack] - found a not _too_ hacky way to propagate the information through. 
WDYT?

> TestRSGroups failing
> 
>
> Key: HBASE-21380
> URL: https://issues.apache.org/jira/browse/HBASE-21380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21380.branch-2.1.001.patch, 
> HBASE-21380.branch-2.1.002.patch, HBASE-21380.master.001.patch, 
> HBASE-21380.master.002.patch
>
>
> only failing on branch-2.1




[jira] [Updated] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread Mike Drob (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-21380:
--
Attachment: HBASE-21380.branch-2.1.002.patch

> TestRSGroups failing
> 
>
> Key: HBASE-21380
> URL: https://issues.apache.org/jira/browse/HBASE-21380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21380.branch-2.1.001.patch, 
> HBASE-21380.branch-2.1.002.patch, HBASE-21380.master.001.patch, 
> HBASE-21380.master.002.patch
>
>
> only failing on branch-2.1




[jira] [Commented] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664291#comment-16664291
 ] 

Hadoop QA commented on HBASE-21380:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
19s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
23s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 55s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}133m 
37s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}176m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21380 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945625/HBASE-21380.master.001.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux aa38e8cd26e8 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 66469733ec |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14861/testReport/ |
| Max. process+thread count | 5308 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBA

[jira] [Updated] (HBASE-21387) Race condition in snapshot cache refreshing leads to loss of snapshot files

2018-10-25 Thread Ted Yu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21387:
---
Attachment: 21387.v1.txt

> Race condition in snapshot cache refreshing leads to loss of snapshot files
> ---
>
> Key: HBASE-21387
> URL: https://issues.apache.org/jira/browse/HBASE-21387
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 21387.v1.txt
>
>
> In a recent report from a customer, ExportSnapshot failed with:
> {code}
> 2018-10-09 18:54:32,559 ERROR [VerifySnapshot-pool1-t2] 
> snapshot.SnapshotReferenceUtil: Can't find hfile: 
> 44f6c3c646e84de6a63fe30da4fcb3aa in the real 
> (hdfs://in.com:8020/apps/hbase/data/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  or archive 
> (hdfs://in.com:8020/apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  directory for the primary table. 
> {code}
> We found the following in the log:
> {code}
> 2018-10-09 18:54:23,675 DEBUG 
> [00:16000.activeMasterManager-HFileCleaner.large-1539035367427] 
> cleaner.HFileCleaner: Removing: 
> hdfs:///apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa 
> from archive
> {code}
> The root cause is a race condition surrounding SnapshotFileCache#refreshCache().
> There are two callers of refreshCache: one from RefreshCacheTask#run and the 
> other from SnapshotHFileCleaner.
> Let's look at the code of refreshCache:
> {code}
> // if the snapshot directory wasn't modified since we last check, we are done
> if (dirStatus.getModificationTime() <= this.lastModifiedTime) return;
> // 1. update the modified time
> this.lastModifiedTime = dirStatus.getModificationTime();
> // 2.clear the cache
> this.cache.clear();
> {code}
> Suppose the RefreshCacheTask runs past the if check and sets 
> this.lastModifiedTime.
> The cleaner then executes refreshCache and returns immediately, since 
> this.lastModifiedTime already matches the modification time of the directory.
> Now RefreshCacheTask clears the cache. By the time the cleaner performs its cache 
> lookup, the cache is empty.
> Therefore the cleaner puts the file into unReferencedFiles, leading to data loss.




[jira] [Updated] (HBASE-21387) Race condition in snapshot cache refreshing leads to loss of snapshot files

2018-10-25 Thread Ted Yu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21387:
---
Attachment: (was: 21387.v1.txt)

> Race condition in snapshot cache refreshing leads to loss of snapshot files
> ---
>
> Key: HBASE-21387
> URL: https://issues.apache.org/jira/browse/HBASE-21387
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
>
> In a recent report from a customer, ExportSnapshot failed with:
> {code}
> 2018-10-09 18:54:32,559 ERROR [VerifySnapshot-pool1-t2] 
> snapshot.SnapshotReferenceUtil: Can't find hfile: 
> 44f6c3c646e84de6a63fe30da4fcb3aa in the real 
> (hdfs://in.com:8020/apps/hbase/data/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  or archive 
> (hdfs://in.com:8020/apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  directory for the primary table. 
> {code}
> We found the following in the log:
> {code}
> 2018-10-09 18:54:23,675 DEBUG 
> [00:16000.activeMasterManager-HFileCleaner.large-1539035367427] 
> cleaner.HFileCleaner: Removing: 
> hdfs:///apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa 
> from archive
> {code}
> The root cause is a race condition surrounding SnapshotFileCache#refreshCache().
> There are two callers of refreshCache: one from RefreshCacheTask#run and the 
> other from SnapshotHFileCleaner.
> Let's look at the code of refreshCache:
> {code}
> // if the snapshot directory wasn't modified since we last check, we are done
> if (dirStatus.getModificationTime() <= this.lastModifiedTime) return;
> // 1. update the modified time
> this.lastModifiedTime = dirStatus.getModificationTime();
> // 2.clear the cache
> this.cache.clear();
> {code}
> Suppose the RefreshCacheTask runs past the if check and sets 
> this.lastModifiedTime.
> The cleaner then executes refreshCache and returns immediately, since 
> this.lastModifiedTime already matches the modification time of the directory.
> Now RefreshCacheTask clears the cache. By the time the cleaner performs its cache 
> lookup, the cache is empty.
> Therefore the cleaner puts the file into unReferencedFiles, leading to data loss.




[jira] [Updated] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-25 Thread Artem Ervits (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-21175:
-
Attachment: HBASE-21175.v04.patch

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch
>
>
> TestHFileArchiving#testCleaningRace creates an HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, an NPE is thrown from this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.
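
The first option (checking for a null master) could look roughly like the sketch below. This is a simplified illustration, not the attached patch: the class NullSafeCleanerSketch, the MasterLike interface and its getReferencedFiles() method are hypothetical stand-ins for the real cleaner and master types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified cleaner; it only illustrates the null-master guard. */
final class NullSafeCleanerSketch {

  /** Hypothetical stand-in for whatever the master would supply to the cleaner. */
  interface MasterLike {
    Set<String> getReferencedFiles();
  }

  // May be null when the cleaner is initialized without a master in its params.
  private final MasterLike master;

  NullSafeCleanerSketch(MasterLike master) {
    this.master = master;
  }

  /** Returns the files that are safe to delete; none when no master is available. */
  List<String> getDeletableFiles(List<String> files) {
    if (master == null) {
      // Without a master we cannot tell which files snapshots still reference,
      // so the conservative answer is to delete nothing.
      return Collections.emptyList();
    }
    Set<String> referenced = new HashSet<>(master.getReferencedFiles());
    List<String> deletable = new ArrayList<>();
    for (String file : files) {
      if (!referenced.contains(file)) {
        deletable.add(file);
      }
    }
    return deletable;
  }

  public static void main(String[] args) {
    List<String> files = Arrays.asList("f1", "f2");
    // No master: nothing is reported as deletable.
    System.out.println(new NullSafeCleanerSketch(null).getDeletableFiles(files));
    // Master present: only files that are not referenced are deletable.
    MasterLike master = () -> Collections.singleton("f1");
    System.out.println(new NullSafeCleanerSketch(master).getDeletableFiles(files));
  }
}
```

Returning an empty list when the master is missing is a deliberately conservative choice for the sketch; whether the test should instead pass a real master instance is the open question in the description.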




[jira] [Issue Comment Deleted] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-25 Thread Artem Ervits (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-21175:
-
Comment: was deleted

(was: [~yuzhih...@gmail.com] please review)

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch
>
>
> TestHFileArchiving#testCleaningRace creates an HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, an NPE is thrown from this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.




[jira] [Updated] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-25 Thread Artem Ervits (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-21175:
-
Attachment: (was: HBASE-21175.v03.patch)

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch
>
>
> TestHFileArchiving#testCleaningRace creates an HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, an NPE is thrown from this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.




[jira] [Updated] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-25 Thread Artem Ervits (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-21175:
-
Status: Patch Available  (was: Open)

[~yuzhih...@gmail.com] please review

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch
>
>
> TestHFileArchiving#testCleaningRace creates an HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, an NPE is thrown from this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.




[jira] [Updated] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-25 Thread Artem Ervits (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-21175:
-
Status: Open  (was: Patch Available)

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v03.patch
>
>
> TestHFileArchiving#testCleaningRace creates an HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, an NPE is thrown from this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.




[jira] [Updated] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-25 Thread Artem Ervits (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-21175:
-
Status: Patch Available  (was: Open)

[~yuzhih...@gmail.com] please review

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v03.patch
>
>
> TestHFileArchiving#testCleaningRace creates an HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, an NPE is thrown from this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.




[jira] [Updated] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-25 Thread Artem Ervits (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-21175:
-
Attachment: HBASE-21175.v03.patch

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v03.patch
>
>
> TestHFileArchiving#testCleaningRace creates an HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, an NPE is thrown from this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.




[jira] [Updated] (HBASE-21216) TestSnapshotFromMaster#testSnapshotHFileArchiving is flaky

2018-10-25 Thread Ted Yu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21216:
---
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Dup of HBASE-21387

> TestSnapshotFromMaster#testSnapshotHFileArchiving is flaky
> --
>
> Key: HBASE-21216
> URL: https://issues.apache.org/jira/browse/HBASE-21216
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21216.v1.txt
>
>
> From 
> https://builds.apache.org/job/HBase-Flaky-Tests/job/branch-2/794/testReport/junit/org.apache.hadoop.hbase.master.cleaner/TestSnapshotFromMaster/testSnapshotHFileArchiving/
>  :
> {code}
> java.lang.AssertionError: Archived hfiles [] and table hfiles 
> [9ca09392705f425f9c916beedc10d63c] is missing snapshot 
> file:6739a09747e54189a4112a6d8f37e894
>   at 
> org.apache.hadoop.hbase.master.cleaner.TestSnapshotFromMaster.testSnapshotHFileArchiving(TestSnapshotFromMaster.java:370)
> {code}
> The file appeared in archive dir before hfile cleaners were run:
> {code}
> 2018-09-20 10:38:53,187 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |-archive/
> 2018-09-20 10:38:53,188 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |data/
> 2018-09-20 10:38:53,189 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |---default/
> 2018-09-20 10:38:53,190 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |--test/
> 2018-09-20 10:38:53,191 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |-1237d57b63a7bdf067a930441a02514a/
> 2018-09-20 10:38:53,192 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |recovered.edits/
> 2018-09-20 10:38:53,193 DEBUG [Time-limited test] util.CommonFSUtils(774): 
> |---4.seqid
> 2018-09-20 10:38:53,193 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |-29e1700e09b51223ad2f5811105a4d51/
> 2018-09-20 10:38:53,194 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |fam/
> 2018-09-20 10:38:53,195 DEBUG [Time-limited test] util.CommonFSUtils(774): 
> |---2c66a18f6c1a4074b84ffbb3245268c4
> 2018-09-20 10:38:53,196 DEBUG [Time-limited test] util.CommonFSUtils(774): 
> |---45bb396c6a5e49629e45a4d56f1e9b14
> 2018-09-20 10:38:53,196 DEBUG [Time-limited test] util.CommonFSUtils(774): 
> |---6739a09747e54189a4112a6d8f37e894
> {code}
> However, the archive dir became empty after hfile cleaners were run:
> {code}
> 2018-09-20 10:38:53,312 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |-archive/
> 2018-09-20 10:38:53,313 DEBUG [Time-limited test] util.CommonFSUtils(771): 
> |-corrupt/
> {code}
> Leading to the assertion failure.
> This test is one of the top flaky tests.




[jira] [Updated] (HBASE-21387) Race condition in snapshot cache refreshing leads to loss of snapshot files

2018-10-25 Thread Ted Yu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21387:
---
Status: Patch Available  (was: Open)

> Race condition in snapshot cache refreshing leads to loss of snapshot files
> ---
>
> Key: HBASE-21387
> URL: https://issues.apache.org/jira/browse/HBASE-21387
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 21387.v1.txt
>
>
> In a recent report from a customer, ExportSnapshot failed with:
> {code}
> 2018-10-09 18:54:32,559 ERROR [VerifySnapshot-pool1-t2] 
> snapshot.SnapshotReferenceUtil: Can't find hfile: 
> 44f6c3c646e84de6a63fe30da4fcb3aa in the real 
> (hdfs://in.com:8020/apps/hbase/data/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  or archive 
> (hdfs://in.com:8020/apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  directory for the primary table. 
> {code}
> We found the following in the log:
> {code}
> 2018-10-09 18:54:23,675 DEBUG 
> [00:16000.activeMasterManager-HFileCleaner.large-1539035367427] 
> cleaner.HFileCleaner: Removing: 
> hdfs:///apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa 
> from archive
> {code}
> The root cause is a race condition surrounding SnapshotFileCache#refreshCache().
> There are two callers of refreshCache: one from RefreshCacheTask#run and the 
> other from SnapshotHFileCleaner.
> Let's look at the code of refreshCache:
> {code}
> // if the snapshot directory wasn't modified since we last check, we are done
> if (dirStatus.getModificationTime() <= this.lastModifiedTime) return;
> // 1. update the modified time
> this.lastModifiedTime = dirStatus.getModificationTime();
> // 2.clear the cache
> this.cache.clear();
> {code}
> Suppose the RefreshCacheTask runs past the if check and sets 
> this.lastModifiedTime.
> The cleaner then executes refreshCache and returns immediately, since 
> this.lastModifiedTime already matches the modification time of the directory.
> Now RefreshCacheTask clears the cache. By the time the cleaner performs its cache 
> lookup, the cache is empty.
> Therefore the cleaner puts the file into unReferencedFiles, leading to data loss.




[jira] [Updated] (HBASE-21387) Race condition in snapshot cache refreshing leads to loss of snapshot files

2018-10-25 Thread Ted Yu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21387:
---
Attachment: 21387.v1.txt

> Race condition in snapshot cache refreshing leads to loss of snapshot files
> ---
>
> Key: HBASE-21387
> URL: https://issues.apache.org/jira/browse/HBASE-21387
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 21387.v1.txt
>
>
> In a recent report from a customer, ExportSnapshot failed with:
> {code}
> 2018-10-09 18:54:32,559 ERROR [VerifySnapshot-pool1-t2] 
> snapshot.SnapshotReferenceUtil: Can't find hfile: 
> 44f6c3c646e84de6a63fe30da4fcb3aa in the real 
> (hdfs://in.com:8020/apps/hbase/data/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  or archive 
> (hdfs://in.com:8020/apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
>  directory for the primary table. 
> {code}
> We found the following in the log:
> {code}
> 2018-10-09 18:54:23,675 DEBUG 
> [00:16000.activeMasterManager-HFileCleaner.large-1539035367427] 
> cleaner.HFileCleaner: Removing: 
> hdfs:///apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa 
> from archive
> {code}
> The root cause is a race condition surrounding SnapshotFileCache#refreshCache().
> There are two callers of refreshCache: one from RefreshCacheTask#run and the 
> other from SnapshotHFileCleaner.
> Let's look at the code of refreshCache:
> {code}
> // if the snapshot directory wasn't modified since we last check, we are done
> if (dirStatus.getModificationTime() <= this.lastModifiedTime) return;
> // 1. update the modified time
> this.lastModifiedTime = dirStatus.getModificationTime();
> // 2.clear the cache
> this.cache.clear();
> {code}
> Suppose the RefreshCacheTask runs past the if check and sets 
> this.lastModifiedTime.
> The cleaner then executes refreshCache and returns immediately, since 
> this.lastModifiedTime already matches the modification time of the directory.
> Now RefreshCacheTask clears the cache. By the time the cleaner performs its cache 
> lookup, the cache is empty.
> Therefore the cleaner puts the file into unReferencedFiles, leading to data loss.




[jira] [Created] (HBASE-21387) Race condition in snapshot cache refreshing leads to loss of snapshot files

2018-10-25 Thread Ted Yu (JIRA)

Ted Yu created HBASE-21387:
--

 Summary: Race condition in snapshot cache refreshing leads to loss 
of snapshot files
 Key: HBASE-21387
 URL: https://issues.apache.org/jira/browse/HBASE-21387
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


In a recent report from a customer, ExportSnapshot failed with:
{code}
2018-10-09 18:54:32,559 ERROR [VerifySnapshot-pool1-t2] 
snapshot.SnapshotReferenceUtil: Can't find hfile: 
44f6c3c646e84de6a63fe30da4fcb3aa in the real 
(hdfs://in.com:8020/apps/hbase/data/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
 or archive 
(hdfs://in.com:8020/apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa)
 directory for the primary table. 
{code}
We found the following in the log:
{code}
2018-10-09 18:54:23,675 DEBUG 
[00:16000.activeMasterManager-HFileCleaner.large-1539035367427] 
cleaner.HFileCleaner: Removing: 
hdfs:///apps/hbase/data/archive/data/.../a/44f6c3c646e84de6a63fe30da4fcb3aa 
from archive
{code}
The root cause is a race condition surrounding SnapshotFileCache#refreshCache().
There are two callers of refreshCache: one from RefreshCacheTask#run and the 
other from SnapshotHFileCleaner.
Let's look at the code of refreshCache:
{code}
// if the snapshot directory wasn't modified since we last check, we are done
if (dirStatus.getModificationTime() <= this.lastModifiedTime) return;

// 1. update the modified time
this.lastModifiedTime = dirStatus.getModificationTime();

// 2.clear the cache
this.cache.clear();
{code}
Suppose the RefreshCacheTask runs past the if check and sets 
this.lastModifiedTime.
The cleaner then executes refreshCache and returns immediately, since 
this.lastModifiedTime already matches the modification time of the directory.
Now RefreshCacheTask clears the cache. By the time the cleaner performs its cache 
lookup, the cache is empty.
Therefore the cleaner puts the file into unReferencedFiles, leading to data loss.
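
One way to close this window, sketched here under stated assumptions and not taken from the attached patch, is to make the timestamp check, the timestamp update and the cache reload a single atomic step, and to perform lookups under the same lock. The class SnapshotCacheSketch and its method signatures are hypothetical; the directory modification time and file listing are passed in as plain parameters so the sketch stays self-contained.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a snapshot file cache; not the HBase SnapshotFileCache. */
final class SnapshotCacheSketch {
  private final Set<String> cache = new HashSet<>();
  private long lastModifiedTime = Long.MIN_VALUE;

  /**
   * Reloads the cache if the directory changed. The check, the timestamp update
   * and the reload all happen while holding the object's monitor, so a second
   * caller cannot return early while the first is between "update" and "reload".
   */
  synchronized void refreshCache(long dirModificationTime, Collection<String> filesOnDisk) {
    if (dirModificationTime <= lastModifiedTime) {
      return; // directory unchanged since the last completed refresh
    }
    lastModifiedTime = dirModificationTime;
    cache.clear();
    cache.addAll(filesOnDisk);
  }

  /** Lookups take the same lock, so they never see a cleared but not yet reloaded cache. */
  synchronized boolean contains(String fileName) {
    return cache.contains(fileName);
  }

  public static void main(String[] args) {
    SnapshotCacheSketch sketch = new SnapshotCacheSketch();
    sketch.refreshCache(10L, Arrays.asList("hfile-a", "hfile-b"));
    System.out.println(sketch.contains("hfile-a"));
    // Same modification time: the call is a no-op and the cache stays populated.
    sketch.refreshCache(10L, Arrays.asList("hfile-c"));
    System.out.println(sketch.contains("hfile-c"));
  }
}
```

Coarse locking like this trades some concurrency for simplicity; a finer-grained fix is possible, but the essential point is that no caller may observe the updated lastModifiedTime before the cache contents that go with it.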




[jira] [Commented] (HBASE-15557) document SyncTable in ref guide

2018-10-25 Thread Wellington Chevreuil (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664224#comment-16664224
 ] 

Wellington Chevreuil commented on HBASE-15557:
--

Attached a patch with my proposed description of HashTable/SyncTable for the ref 
guide. Please review and let me know of any suggestions.

> document SyncTable in ref guide
> ---
>
> Key: HBASE-15557
> URL: https://issues.apache.org/jira/browse/HBASE-15557
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 1.2.0
>Reporter: Sean Busbey
>Assignee: Wellington Chevreuil
>Priority: Critical
> Attachments: HBASE-15557.master.001.patch
>
>
> The docs for SyncTable are insufficient. A brief description from 
> [~davelatham]'s HBASE-13639 comment:
> {quote}
> Sorry for the lack of better documentation, Abhishek Soni. Thanks for 
> bringing it up. I'll try to provide a better explanation. You may have 
> already seen it, but if not, the design doc linked in the description above 
> may also give you some better clues as to how it should be used.
> Briefly, the feature is intended to start with a pair of tables in remote 
> clusters that are already substantially similar and make them identical by 
> comparing hashes of the data and copying only the diffs instead of having to 
> copy the entire table. So it is targeted at a very specific use case (with 
> some work it could generalize to cover things like CopyTable and 
> VerifyReplication but it's not there yet). To use it, you choose one table to 
> be the "source", and the other table is the "target". After the process is 
> complete the target table should end up being identical to the source table.
> In the source table's cluster, run 
> org.apache.hadoop.hbase.mapreduce.HashTable and pass it the name of the 
> source table and an output directory in HDFS. HashTable will scan the source 
> table, break the data up into row key ranges (default of 8kB per range) and 
> produce a hash of the data for each range.
> Make the hashes available to the target cluster - I'd recommend using DistCp 
> to copy it across.
> In the target table's cluster, run 
> org.apache.hadoop.hbase.mapreduce.SyncTable and pass it the directory where 
> you put the hashes, and the names of the source and destination tables. You 
> will likely also need to specify the source table's ZK quorum via the 
> --sourcezkcluster option. SyncTable will then read the hash information, and 
> compute the hashes of the same row ranges for the target table. For any row 
> range where the hash fails to match, it will open a remote scanner to the 
> source table, read the data for that range, and do Puts and Deletes to the 
> target table to update it to match the source.
> I hope that clarifies it a bit. Let me know if you need a hand. If anyone 
> wants to work on getting some documentation into the book, I can try to write 
> some more but would love a hand on turning it into an actual book patch.
> {quote}
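
The compare-by-hash idea described in the quote can be illustrated with a small in-memory sketch. It is not the HashTable/SyncTable MapReduce code: the class RangeHashDiffSketch and its methods are hypothetical, tables are modeled as sorted maps of row key to value, range boundaries are passed in explicitly instead of being derived from a per-range size target, and SHA-256 is an arbitrary choice of hash for the illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical in-memory illustration of comparing two tables by per-range hashes. */
final class RangeHashDiffSketch {

  /** Hashes every row with startKey <= key < endKey; a null endKey means "to the end". */
  static byte[] hashRange(SortedMap<String, String> table, String startKey, String endKey) {
    SortedMap<String, String> slice =
        endKey == null ? table.tailMap(startKey) : table.subMap(startKey, endKey);
    MessageDigest digest;
    try {
      digest = MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
    for (Map.Entry<String, String> row : slice.entrySet()) {
      digest.update(row.getKey().getBytes(StandardCharsets.UTF_8));
      digest.update((byte) 0); // separator between key and value
      digest.update(row.getValue().getBytes(StandardCharsets.UTF_8));
      digest.update((byte) 0); // separator between rows
    }
    return digest.digest();
  }

  /**
   * Returns the start key of every range whose hash differs between source and target.
   * rangeStarts must be sorted ascending; range i ends where range i + 1 begins.
   */
  static List<String> mismatchedRanges(SortedMap<String, String> source,
      SortedMap<String, String> target, List<String> rangeStarts) {
    List<String> mismatched = new ArrayList<>();
    for (int i = 0; i < rangeStarts.size(); i++) {
      String start = rangeStarts.get(i);
      String end = i + 1 < rangeStarts.size() ? rangeStarts.get(i + 1) : null;
      if (!Arrays.equals(hashRange(source, start, end), hashRange(target, start, end))) {
        mismatched.add(start);
      }
    }
    return mismatched;
  }

  public static void main(String[] args) {
    SortedMap<String, String> source = new TreeMap<>();
    source.put("a1", "x");
    source.put("b1", "y");
    source.put("c1", "z");
    SortedMap<String, String> target = new TreeMap<>(source);
    target.put("b1", "stale"); // only the range starting at "b" now differs
    System.out.println(mismatchedRanges(source, target, Arrays.asList("", "b", "c")));
  }
}
```

Only the ranges reported as mismatched would need to be re-read from the source and rewritten on the target, which is what makes the approach cheaper than copying the whole table when the two tables are already mostly identical.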




[jira] [Updated] (HBASE-15557) document SyncTable in ref guide

2018-10-25 Thread Wellington Chevreuil (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-15557:
-
Attachment: HBASE-15557.master.001.patch

> document SyncTable in ref guide
> ---
>
> Key: HBASE-15557
> URL: https://issues.apache.org/jira/browse/HBASE-15557
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 1.2.0
>Reporter: Sean Busbey
>Assignee: Wellington Chevreuil
>Priority: Critical
> Attachments: HBASE-15557.master.001.patch
>
>
> The docs for SyncTable are insufficient. A brief description from 
> [~davelatham]'s HBASE-13639 comment:
> {quote}
> Sorry for the lack of better documentation, Abhishek Soni. Thanks for 
> bringing it up. I'll try to provide a better explanation. You may have 
> already seen it, but if not, the design doc linked in the description above 
> may also give you some better clues as to how it should be used.
> Briefly, the feature is intended to start with a pair of tables in remote 
> clusters that are already substantially similar and make them identical by 
> comparing hashes of the data and copying only the diffs instead of having to 
> copy the entire table. So it is targeted at a very specific use case (with 
> some work it could generalize to cover things like CopyTable and 
> VerifyReplication but it's not there yet). To use it, you choose one table to 
> be the "source", and the other table is the "target". After the process is 
> complete the target table should end up being identical to the source table.
> In the source table's cluster, run 
> org.apache.hadoop.hbase.mapreduce.HashTable and pass it the name of the 
> source table and an output directory in HDFS. HashTable will scan the source 
> table, break the data up into row key ranges (default of 8kB per range) and 
> produce a hash of the data for each range.
> Make the hashes available to the target cluster - I'd recommend using DistCp 
> to copy it across.
> In the target table's cluster, run 
> org.apache.hadoop.hbase.mapreduce.SyncTable and pass it the directory where 
> you put the hashes, and the names of the source and destination tables. You 
> will likely also need to specify the source table's ZK quorum via the 
> --sourcezkcluster option. SyncTable will then read the hash information, and 
> compute the hashes of the same row ranges for the target table. For any row 
> range where the hash fails to match, it will open a remote scanner to the 
> source table, read the data for that range, and do Puts and Deletes to the 
> target table to update it to match the source.
> I hope that clarifies it a bit. Let me know if you need a hand. If anyone 
> wants to work on getting some documentation into the book, I can try to write 
> some more but would love a hand on turning it into an actual book patch.
> {quote}




[jira] [Commented] (HBASE-21344) hbase:meta location in ZooKeeper set to OPENING by the procedure which eventually failed but precludes Master from assigning it forever

2018-10-25 Thread stack (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664192#comment-16664192
 ] 

stack commented on HBASE-21344:
---

+1 on push [~elserj] and [~an...@apache.org] Thanks.

> hbase:meta location in ZooKeeper set to OPENING by the procedure which 
> eventually failed but precludes Master from assigning it forever
> ---
>
> Key: HBASE-21344
> URL: https://issues.apache.org/jira/browse/HBASE-21344
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
> Fix For: 2.0.3
>
> Attachments: HBASE-21344-branch-2.0.patch, 
> HBASE-21344-branch-2.0_v2.patch, HBASE-21344-branch-2.0_v3.patch, 
> HBASE-21344.branch-2.0.001.patch, HBASE-21344.branch-2.0.003-addendum.patch, 
> HBASE-21344.branch-2.0.003.patch
>
>
> [~elserj] has already summarized it well.
> 1. hbase:meta was on RS8
> 2. RS8 crashed, SCP was queued for it, meta first
> 3. meta was marked OFFLINE
> 4. meta marked as OPENING on RS3
> 5. Can't actually send the openRegion RPC to RS3 due to the krb ticket issue
> 6. We attempt the openRegion/assignment 10 times, failing each time
> 7. We start rolling back the procedure:
> {code:java}
> 2018-10-08 06:51:24,440 WARN  [PEWorker-9] procedure2.ProcedureExecutor: 
> Usually this should not happen, we will release the lock before if the 
> procedure is finished, even if the holdLock is true, arrive here means we 
> have some holes where we do not release the lock. And the releaseLock below 
> may fail since the procedure may have already been deleted from the procedure 
> store.
> 2018-10-08 06:51:24,543 INFO  [PEWorker-9] 
> procedure.MasterProcedureScheduler: pid=48, ppid=47, 
> state=FAILED:REGION_TRANSITION_QUEUE, 
> exception=org.apache.hadoop.hbase.client.RetriesExhaustedException via 
> AssignProcedure:org.apache.hadoop.hbase.client.RetriesExhaustedException: Max 
> attempts exceeded; AssignProcedure table=hbase:meta, region=1588230740 
> checking lock on 1588230740
> {code}
> {code:java}
> 2018-10-08 06:51:30,957 ERROR [PEWorker-9] procedure2.ProcedureExecutor: 
> CODE-BUG: Uncaught runtime exception for pid=47, 
> state=FAILED:SERVER_CRASH_ASSIGN_META, locked=true, 
> exception=org.apache.hadoop.hbase.client.RetriesExhaustedException via 
> AssignProcedure:org.apache.hadoop.hbase.client.RetriesExhaustedException: Max 
> attempts exceeded; ServerCrashProcedure 
> server=,16020,1538974612843, splitWal=true, meta=true
> java.lang.UnsupportedOperationException: unhandled 
> state=SERVER_CRASH_GET_REGIONS
>   at 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure.rollbackState(ServerCrashProcedure.java:254)
>   at 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure.rollbackState(ServerCrashProcedure.java:58)
>   at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
>   at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:960)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1577)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1539)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1418)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1981)
> {code}
> {code:java}
> { DEBUG [PEWorker-2] client.RpcRetryingCallerImpl: Call exception, tries=7, 
> retries=7, started=8168 ms ago, cancelled=false, msg=Meta region is in state 
> OPENING, details=row 'backup:system' on table 'hbase:meta' at 
> region=hbase:meta,,1.1588230740, hostname=, seqNum=-1, 
> exception=java.io.IOException: Meta region is in state OPENING
> at 
> org.apache.hadoop.hbase.client.ZKAsyncRegistry.lambda$null$1(ZKAsyncRegistry.java:154)
> at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
> at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
> at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at 
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
> at 
> org.apache.hadoop.hbase.client.ZKAsyncRegistry.lambda$getAndConvert$0(ZKAsyncRegistry.java:77)
> at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
> at 
> java.util.concurrent.C

[jira] [Commented] (HBASE-21344) hbase:meta location in ZooKeeper set to OPENING by the procedure which eventually failed but precludes Master from assigning it forever

2018-10-25 Thread Josh Elser (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664159#comment-16664159
 ] 

Josh Elser commented on HBASE-21344:


Me being the silent instigator :)

Tried to run the test and I'm getting timeouts when I run the whole test class. 
Seems like when both methods run in the same JVM, testMetaAssignmentFailure 
fails for me. I don't know why QA didn't fail though (but it happens every time 
for me).


[jira] [Commented] (HBASE-21344) hbase:meta location in ZooKeeper set to OPENING by the procedure which eventually failed but precludes Master from assigning it forever

2018-10-25 Thread stack (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664155#comment-16664155
 ] 

stack commented on HBASE-21344:
---

[~an...@apache.org] Why the addendum sir?


[jira] [Updated] (HBASE-21344) hbase:meta location in ZooKeeper set to OPENING by the procedure which eventually failed but precludes Master from assigning it forever

2018-10-25 Thread Ankit Singhal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-21344:
--
Attachment: HBASE-21344.branch-2.0.003-addendum.patch


[jira] [Updated] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread Mike Drob (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-21380:
--
Attachment: HBASE-21380.master.002.patch

> TestRSGroups failing
> 
>
> Key: HBASE-21380
> URL: https://issues.apache.org/jira/browse/HBASE-21380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21380.branch-2.1.001.patch, 
> HBASE-21380.master.001.patch, HBASE-21380.master.002.patch
>
>
> only failing on branch-2.1




[jira] [Commented] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread Mike Drob (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664131#comment-16664131
 ] 

Mike Drob commented on HBASE-21380:
---

One concern here is that in addition to no longer adding servers to the 
{{processing}} list, we also don't add them to the {{deadServers}} set if 
they've already been through SCP. Which means that if that server comes back 
later... I don't actually know what it means. Maybe need to special case this, 
though?

> TestRSGroups failing
> 
>
> Key: HBASE-21380
> URL: https://issues.apache.org/jira/browse/HBASE-21380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21380.branch-2.1.001.patch, 
> HBASE-21380.master.001.patch
>
>
> only failing on branch-2.1




[jira] [Updated] (HBASE-21386) Disable TestRSGroups#testRSGroupsWithHBaseQuota; causes TestRSGroups to fail in branch-2.1.

2018-10-25 Thread Mike Drob (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-21386:
--
   Resolution: Invalid
Fix Version/s: (was: 2.1.1)
   Status: Resolved  (was: Patch Available)

> Disable TestRSGroups#testRSGroupsWithHBaseQuota; causes TestRSGroups to fail 
> in branch-2.1.
> ---
>
> Key: HBASE-21386
> URL: https://issues.apache.org/jira/browse/HBASE-21386
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: HBASE-21386.branch-2.1.001.patch
>
>
> Disable testRSGroupsWithHBaseQuota in TestRSGroups. It is a test added after 
> the original set in TestRSGroups that is not like the others. After fix of 
> the balancer in HBASE-21266, its manufacture causes a bunch of failures. 
> Parent issue is about making a real fix. Pushing this to branch-2.1 in 
> meantime so can cut an RC.




[jira] [Commented] (HBASE-21386) Disable TestRSGroups#testRSGroupsWithHBaseQuota; causes TestRSGroups to fail in branch-2.1.

2018-10-25 Thread Mike Drob (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664121#comment-16664121
 ] 

Mike Drob commented on HBASE-21386:
---

I think I have a fix for this, let's not ignore the test and instead defer to 
the parent.


[jira] [Commented] (HBASE-21328) why HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP set to false by default?

2018-10-25 Thread Sean Busbey (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664077#comment-16664077
 ] 

Sean Busbey commented on HBASE-21328:
-

Can you generate the patch using {{git format-patch}}?
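
For reference, a minimal sketch of producing a patch with {{git format-patch}}. It runs in a throwaway repository so it can be tried anywhere; the file contents, the commit message, and the patch file name HBASE-21328.master.002.patch are assumptions made for illustration and are not taken from the attached patch.

```shell
#!/bin/sh
# Sketch only: builds a scratch repository so the commands can run anywhere.
set -e
workdir=$(mktemp -d)
cd "$workdir"
git init -q .
# Identity is set locally so the example does not depend on global git config.
git config user.name "Example Contributor"
git config user.email "contributor@example.org"

# Stand-in for the file being changed (contents are placeholders).
mkdir conf
printf '# placeholder for hbase-env.sh\n' > conf/hbase-env.sh
git add conf/hbase-env.sh
git commit -q -m "initial import"

# The change under review, committed with the issue key in the subject line.
printf '# export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"\n' >> conf/hbase-env.sh
git commit -q -a -m "HBASE-21328 call out HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP in hbase-env"

# Write the most recent commit as a mail-formatted patch file.
git format-patch -1 --stdout > HBASE-21328.master.002.patch
head -n 4 HBASE-21328.master.002.patch
```

Unlike a plain diff, the resulting file carries the author, date, and commit message along with the change.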

> why HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP set to false by default?
> --
>
> Key: HBASE-21328
> URL: https://issues.apache.org/jira/browse/HBASE-21328
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation, Operability
>Reporter: Nick.han
>Assignee: Nick.han
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: HBASE-21328.master.001.patch
>
>
> Hi all,
>       I ran into a problem while using hbase3.0.0-snapshot and hadoop 2.7.5 to 
> build an HBase cluster. HBase uses javax.servlet-api-3.1.0-jar, which 
> conflicts with the servlet-api-2.5.jar in the 
> Hadoop lib path. I looked into the hbase file and found that the config 
> HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP is set to false by default. This config 
> decides whether or not the Hadoop lib is included in the HBase classpath. So the 
> question is: why do we set this config to false? Can we set it to true and 
> exclude the Hadoop lib by default?
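
The opt-out semantics described in the issue text above can be sketched as follows. This is an illustration only, not the actual hbase launcher code: the function name hadoop_lookup_enabled is hypothetical, and only the variable name and its default of false come from the report.

```shell
# Hypothetical helper: succeeds when the Hadoop lib should be looked up.
hadoop_lookup_enabled() {
  # Unset, or any value other than "true", keeps the existing behavior.
  [ "${HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP:-false}" != "true" ]
}

if hadoop_lookup_enabled; then
  echo "Hadoop lib would be added to the HBase classpath"
else
  echo "Hadoop lib lookup skipped"
fi
```

Because the check is phrased as a "disable" flag, leaving the variable unset keeps the Hadoop lib on the classpath.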




[jira] [Updated] (HBASE-21328) why HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP set to false by default?

2018-10-25 Thread Sean Busbey (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-21328:

Fix Version/s: 3.0.0


[jira] [Assigned] (HBASE-21328) why HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP set to false by default?

2018-10-25 Thread Sean Busbey (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey reassigned HBASE-21328:
---

Assignee: Nick.han


[jira] [Updated] (HBASE-21328) why HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP set to false by default?

2018-10-25 Thread Sean Busbey (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-21328:

Component/s: Operability
 documentation


[jira] [Updated] (HBASE-21328) why HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP set to false by default?

2018-10-25 Thread Sean Busbey (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-21328:

Fix Version/s: 2.2.0
   1.5.0


[jira] [Commented] (HBASE-21328) why HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP set to false by default?

2018-10-25 Thread Sean Busbey (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664070#comment-16664070
 ] 

Sean Busbey commented on HBASE-21328:
-

It's set to false by default to maintain existing behavior. Calling it out in 
hbase-env seems like a reasonable incremental improvement.
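
A minimal sketch of what calling the setting out in hbase-env might look like. The file location conf/hbase-env.sh and the commented-out export form are assumptions made for illustration; neither is confirmed in this thread.

```shell
# Assumed location: conf/hbase-env.sh
# The default (false) keeps the existing behavior: the Hadoop lib is
# included in the HBase classpath.
# Uncomment the next line to leave the Hadoop lib out.
# export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"
```

Keeping the line commented out leaves the default untouched while making the option discoverable.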

> why HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP set to false by default?
> --
>
> Key: HBASE-21328
> URL: https://issues.apache.org/jira/browse/HBASE-21328
> Project: HBase
>  Issue Type: Improvement
>Reporter: Nick.han
>Priority: Minor
> Attachments: HBASE-21328.master.001.patch
>
>
> Hi all,
>       I ran into a problem while using hbase3.0.0-snapshot and hadoop 2.7.5 to 
> build an HBase cluster. HBase uses javax.servlet-api-3.1.0-jar, which 
> conflicts with the servlet-api-2.5.jar in the 
> Hadoop lib path. I looked into the hbase file and found that the config 
> HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP is set to false by default. This config 
> decides whether or not the Hadoop lib is included in the HBase classpath. So the 
> question is: why do we set this config to false? Can we set it to true and 
> exclude the Hadoop lib by default?




[jira] [Updated] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread Mike Drob (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-21380:
--
Attachment: HBASE-21380.master.001.patch


[jira] [Commented] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread Mike Drob (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664063#comment-16664063
 ] 

Mike Drob commented on HBASE-21380:
---

Also posting a forward port to master


[jira] [Commented] (HBASE-21375) Revisit the lock and queue implementation in MasterProcedureScheduler

2018-10-25 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664055#comment-16664055
 ] 

Hadoop QA commented on HBASE-21375:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 12s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 16s{color} | {color:green} The patch passed checkstyle in hbase-procedure {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
21s{color} | {color:green} hbase-server: The patch generated 0 new + 10 
unchanged - 1 fixed = 10 total (was 11) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
51s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
13m 48s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
27s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 26m 49s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 87m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.procedure.TestMasterProcedureScheduler |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21375 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945599/HBASE-21375.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 800ae3416d1f 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 
17:03:53 UTC 201

[jira] [Commented] (HBASE-21386) Disable TestRSGroups#testRSGroupsWithHBaseQuota; causes TestRSGroups to fail in branch-2.1.

2018-10-25 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664056#comment-16664056
 ] 

Hadoop QA commented on HBASE-21386:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
53s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 6s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
41s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} branch-2.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
59s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 19s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
20s{color} | {color:green} hbase-rsgroup in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 9s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 35m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-21386 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945603/HBASE-21386.branch-2.1.001.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux b5ddb6e39411 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.1 / e71c05707e |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14860/testReport/ |
| Max. process+thread count | 2753 (vs. ulimit of 1) |
| modules | C: hbase-rsgroup U: hbase-rsgroup |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14860/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |



[jira] [Resolved] (HBASE-18152) [AMv2] Corrupt Procedure WAL file; procedure data stored out of order

2018-10-25 Thread stack (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-18152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-18152.
---
Resolution: Cannot Reproduce

Closing as 'Cannot Reproduce'. Haven't seen this in months. It looks like it's 
fixed to me.

> [AMv2] Corrupt Procedure WAL file; procedure data stored out of order
> -
>
> Key: HBASE-18152
> URL: https://issues.apache.org/jira/browse/HBASE-18152
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 2.0.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: 
> 0001-TestWALProcedureExecutore-order-checking-test-that-d.patch, 
> HBASE-18152.master.001.patch, 
> hbase-hbase-master-ctr-e138-1518143905142-221855-01-02.hwx.site.log.gz, 
> pv2-0036.log, pv2-0047.log, 
> reading_bad_wal.patch
>
>
> I've seen corruption from time to time in testing. It's rare enough. Often we 
> can get over it but sometimes we can't. It took me a while to capture an 
> instance of corruption. Turns out we are writing to the WAL out of order, which 
> undoes a basic tenet: that WAL content is ordered in line w/ execution.
> Below I'll post a corrupt WAL.
> Looking at the write-side, there is a lot going on. I'm not clear on how we 
> could write out of order. Will try and get more insight. Meantime parking 
> this issue here to fill data into.




[jira] [Updated] (HBASE-21297) ModifyTableProcedure can throw TNDE instead of IOE in case of REGION_REPLICATION change

2018-10-25 Thread Nihal Jain (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-21297:
---
Fix Version/s: 3.0.0
   Attachment: HBASE-21297.master.001.patch
   Status: Patch Available  (was: In Progress)

> ModifyTableProcedure can throw TNDE instead of IOE in case of 
> REGION_REPLICATION change
> ---
>
> Key: HBASE-21297
> URL: https://issues.apache.org/jira/browse/HBASE-21297
> Project: HBase
>  Issue Type: Improvement
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21297.master.001.patch
>
>
> Currently {{ModifyTableProcedure}} throws an {{IOException}} (See 
> [ModifyTableProcedure.java#L252|https://github.com/apache/hbase/blob/924d183ba0e67b975e998f6006c993f457e03c20/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyTableProcedure.java#L252])
>  when a user tries to modify REGION_REPLICATION for an enabled table. 
> Instead, it can throw a more specific {{TableNotDisabledException}}.
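
The change proposed in the description can be sketched as below. This is a minimal, self-contained illustration and not the attached patch: the class `ModifyReplicationSketch`, the method `checkRegionReplicationChange` and its parameters are hypothetical, and `TableNotDisabledException` is a local stand-in for the HBase class of the same name so that the example compiles without HBase on the classpath.

```java
import java.io.IOException;

public class ModifyReplicationSketch {

  /** Local stand-in (assumption) for HBase's TableNotDisabledException. */
  static class TableNotDisabledException extends IOException {
    TableNotDisabledException(String tableName) {
      super("Table " + tableName + " must be disabled before changing REGION_REPLICATION");
    }
  }

  /**
   * Hypothetical pre-check: reject a REGION_REPLICATION change on an enabled
   * table with the specific exception rather than a bare IOException.
   */
  static void checkRegionReplicationChange(String tableName, boolean tableEnabled,
      int oldReplication, int newReplication) throws IOException {
    if (oldReplication != newReplication && tableEnabled) {
      throw new TableNotDisabledException(tableName);
    }
  }

  public static void main(String[] args) {
    try {
      checkRegionReplicationChange("t1", true, 1, 3);
    } catch (TableNotDisabledException e) {
      // Callers can now tell this case apart from other I/O failures.
      System.out.println("specific: " + e.getMessage());
    } catch (IOException e) {
      System.out.println("generic: " + e.getMessage());
    }
  }
}
```

Because the stand-in still extends `IOException`, callers that only catch `IOException` would keep working in this sketch, while callers that care can catch the narrower type.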




[jira] [Work started] (HBASE-21297) ModifyTableProcedure can throw TNDE instead of IOE in case of REGION_REPLICATION change

2018-10-25 Thread Nihal Jain (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-21297 started by Nihal Jain.
--
> ModifyTableProcedure can throw TNDE instead of IOE in case of 
> REGION_REPLICATION change
> ---
>
> Key: HBASE-21297
> URL: https://issues.apache.org/jira/browse/HBASE-21297
> Project: HBase
>  Issue Type: Improvement
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Minor
>
> Currently {{ModifyTableProcedure}} throws an {{IOException}} (See 
> [ModifyTableProcedure.java#L252|https://github.com/apache/hbase/blob/924d183ba0e67b975e998f6006c993f457e03c20/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyTableProcedure.java#L252])
>  when a user tries to modify REGION_REPLICATION for an enabled table. 
> Instead, it can throw a more specific {{TableNotDisabledException}}.




[jira] [Commented] (HBASE-21374) Backport HBASE-21342 to branch-1

2018-10-25 Thread Mike Drob (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16663991#comment-16663991
 ] 

Mike Drob commented on HBASE-21374:
---

The issue is that SecureBulkLoadManager doesn't exist in branch-1; that branch 
has the old SecureBulkLoadEndpoint instead.

[~mazhenlin] - are you interested in putting up a branch-1 patch, or would you 
prefer I try to handle it?

> Backport HBASE-21342 to branch-1
> 
>
> Key: HBASE-21374
> URL: https://issues.apache.org/jira/browse/HBASE-21374
> Project: HBase
>  Issue Type: Task
>Reporter: Mike Drob
>Priority: Major
>





[jira] [Commented] (HBASE-21322) Add a scheduleServerCrashProcedure() API to HbckService

2018-10-25 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HBASE-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16663976#comment-16663976
 ] 

Hadoop QA commented on HBASE-21322:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
21s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  3m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
25s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
13s{color} | {color:red} hbase-server: The patch generated 3 new + 23 unchanged 
- 0 fixed = 26 total (was 23) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
16s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 42s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
34s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
59s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}205m  2s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}268m  1s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21322 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945576/HBASE-21322.master.002.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars

[jira] [Updated] (HBASE-21380) TestRSGroups failing

2018-10-25 Thread Mike Drob (JIRA)



 [ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-21380:
--
Status: Patch Available  (was: Open)

> TestRSGroups failing
> 
>
> Key: HBASE-21380
> URL: https://issues.apache.org/jira/browse/HBASE-21380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-21380.branch-2.1.001.patch
>
>
> only failing on branch-2.1



