[jira] [Commented] (HBASE-21262) [hbck2] AMv2 Lock Picker

2018-10-02 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636476#comment-16636476
 ] 

stack commented on HBASE-21262:
---

Lock picker would be useful for cases like this.

The list locks says there is a lock:

{code}
TABLE(IntegrationTestBigLinkedList_20180803113809)
Lock type: SHARED, count: 1

REGION(5ad3301b300e8e95fbc364ff05bbcbb2)
Lock type: EXCLUSIVE, procedure: 
{"className"=>"org.apache.hadoop.hbase.master.assignment.AssignProcedure", 
"parentId"=>"252406", "procId"=>"259414", "submittedTime"=>"1538543021322", 
"owner"=>"hbase", "state"=>"SUCCESS", "stackId"=>[78557, 78633], 
"lastUpdate"=>"1538544091939", 
"stateMessage"=>[{"transitionState"=>"REGION_TRANSITION_FINISH", 
"regionInfo"=>{"regionId"=>"1533324320765", 
"tableName"=>{"namespace"=>"ZGVmYXVsdA==", 
"qualifier"=>"SW50ZWdyYXRpb25UZXN0QmlnTGlua2VkTGlzdF8yMDE4MDgwMzExMzgwOQ=="}, 
"startKey"=>"NgemLyDwwWU=", "endKey"=>"NgmfhZ0N3yA=", "offline"=>false, 
"split"=>false, "replicaId"=>0}}], "locked"=>true}
{code}

... but if I try to bypass the lock owner, I get this:

{code}
2018-10-02 22:22:01,486 DEBUG 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Procedure pid=259414 does 
not exist, skipping bypass
{code}

The Procedure has somehow rolled away. Meanwhile, the lock is still held.
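
As an aside for reading the listing above: the "namespace" and "qualifier" values in the procedure dump look like Base64-encoded bytes. A minimal sketch that decodes them follows; LockListingDecoder is a made-up helper name for illustration and is not part of hbck2 or HBase.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Illustrative helper (hypothetical name): decodes Base64 fields from the lock listing. */
final class LockListingDecoder {
  private LockListingDecoder() {}

  /** Decodes one Base64 value into a UTF-8 string. */
  static String decode(String base64) {
    return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Values copied from the "tableName" entry of the listing above.
    String namespace = decode("ZGVmYXVsdA==");
    String qualifier =
        decode("SW50ZWdyYXRpb25UZXN0QmlnTGlua2VkTGlzdF8yMDE4MDgwMzExMzgwOQ==");
    System.out.println(namespace + ":" + qualifier);
  }
}
```

The decoded qualifier matches the table name shown in the TABLE(...) line of the same listing, so the region lock belongs to that table. The start and end keys are raw bytes and will generally not decode to readable text.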

> [hbck2] AMv2 Lock Picker
> 
>
> Key: HBASE-21262
> URL: https://issues.apache.org/jira/browse/HBASE-21262
> Project: HBase
>  Issue Type: Sub-task
>  Components: hbck2, Operability
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
>
> This issue is about adding a lock picker to the HbckService.
> Over the weekend I had an interesting case where an enable failed -- a 
> subprocedure ran into an exclusive lock (I think) -- and then the parent 
> enable table procedure tried to roll back. The rollback threw CODE-BUG 
> because some subprocedures were in unrollbackable states, so we ended up 
> skipping out of the enable table procedure. The enable table procedure was 
> marked ROLLBACKED... so it got GC'd. But the exclusive lock it had on the 
> table stayed in place.
> The above has to be fixed, but for the future we need a way to kill locks; 
> otherwise the only alternative is removing the master proc WAL files -- which 
> makes restoring good state a bigger pain.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation

2018-10-02 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636469#comment-16636469
 ] 

Nihal Jain commented on HBASE-21196:


Thanks [~yuzhih...@gmail.com] and [~apurtell]. Do we want this for branch-1? I 
think it should go there too.

> HTableMultiplexer clears the meta cache after every put operation
> -
>
> Key: HBASE-21196
> URL: https://issues.apache.org/jira/browse/HBASE-21196
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Affects Versions: 3.0.0, 1.3.3, 2.2.0
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21196.master.001.patch, 
> HBASE-21196.master.001.patch, HBASE-21196.master.002.patch, 
> HTableMultiplexer1000Puts.UT.txt
>
>
> *Problem:* Operations which use the 
> {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, 
> MultiResponse, int)}} API with tablename set to null reset the meta cache of 
> the corresponding server after each call. One such operation is the put 
> operation of HTableMultiplexer (it might not be the only one). This may 
> severely impact the performance of the system, as all new ops directed to 
> that server will have to go to ZK first to get the meta table address and 
> then look up the location of the table region, because the cache will be 
> empty after every HTableMultiplexer put.
> From the logs below, one can see that after every other put the cached region 
> locations are cleared. As a side effect of this, before every put the server 
> needs to contact ZK to get the meta table location and read meta to get the 
> region locations of the table.
> {noformat}
> 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): 
> Removed all cached region locations that map to 
> root1-thinkpad-t440p,35811,1536857446588
> 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] 
> client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for 
> root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] 
> ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" 
> request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 executing as root1
> 2018-09-13 22:21:15,515 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: 
> 137 connection: 127.0.0.1:42338 param: region= 
> testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., 
> row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { 
> associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 
> totalTime: 0
> 2018-09-13 22:21:15,516 TRACE 
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] 
> io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, 
> count=0, allocations=1
> 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, 
> callTime: 2ms
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan 
> table=hbase:meta, 
> startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): 
> Advancing internal small scanner to startKey at 
> 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99'
> 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up 
> meta region location in ZK, 
> connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
> {noformat}
> From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see 
> that the strings "Removed all cached region locations that map" and "Looking 
> up meta region location in ZK" are present for every put.
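
One way to check the observation above against a log file is to count the two marker strings and compare the counts with the number of puts. The sketch below is only an illustration: MetaCacheLogCheck is a made-up name, and it assumes each log record sits on a single line (unlike the wrapped excerpt in this mail).

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: counts the two log markers named in the description. */
final class MetaCacheLogCheck {
  static final String CACHE_CLEARED = "Removed all cached region locations that map";
  static final String ZK_LOOKUP = "Looking up meta region location in ZK";

  private MetaCacheLogCheck() {}

  /** Counts the log lines that contain the given marker. */
  static long countLines(List<String> logLines, String marker) {
    return logLines.stream().filter(line -> line.contains(marker)).count();
  }

  public static void main(String[] args) {
    // Shortened sample lines; a real check would read the attached log file.
    List<String> sample = Arrays.asList(
        "TRACE client.MetaCache(283): " + CACHE_CLEARED + " to host,35811,1",
        "DEBUG client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests",
        "TRACE client.ZooKeeperRegistry(59): " + ZK_LOOKUP + ", connection=...");
    System.out.println("cache clears: " + countLines(sample, CACHE_CLEARED));
    System.out.println("zk lookups:   " + countLines(sample, ZK_LOOKUP));
  }
}
```

If both counts track the number of processed puts, the cache is being cleared per put as described.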
> *Analysis:*
>  The problem occurs because the {{cleanServerCache}} method always clears 
> the server cache when tablename is null and the exception is null. See 
> [AsyncRequestFutureImpl.java#L918|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncRequestFutureImpl.java#L918]
> {code:java}
> private void cleanServerCache(ServerName server, Throwable regionException) {
> if (tableName == null && 
> ClientExceptionsUtil.isMetaClearingException(regionException)) {
>   // For multi-actions, we don't have a table name, but we want to make 
> sure to clear the
>   // cache in case there were location-related exceptions. We 

[jira] [Commented] (HBASE-21256) Improve IntegrationTestBigLinkedList for testing huge data

2018-10-02 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636403#comment-16636403
 ] 

stack commented on HBASE-21256:
---

Trying to follow along, I don't see a listing of the likelihood of collision 
on the wiki page?

I left some notes on rb.

> Improve IntegrationTestBigLinkedList for testing huge data
> --
>
> Key: HBASE-21256
> URL: https://issues.apache.org/jira/browse/HBASE-21256
> Project: HBase
>  Issue Type: Improvement
>  Components: integration tests
>Affects Versions: 3.0.0
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21256-v1.patch, HBASE-21256-v2.patch, 
> HBASE-21256-v3.patch, HBASE-21256-v4.patch, ITBLL-1.png, ITBLL-2.png
>
>
> Recently, I used ITBLL to test some features in our company. I 
> encountered the following problems:
>   
>  1. The Generator is too slow at the generating stage; the root cause is 
> SecureRandom. There is a global lock in SecureRandom (see the following 
> picture). I use Random instead of SecureRandom, which speeds up this 
> stage (500% up with 20 mappers). SecureRandom was brought in by HBASE-13382, 
> but for generating random bytes, in my opinion, it is the same as Random.
> !ITBLL-1.png!
> 2. VerifyReducer has a CPU cost of 14% in the format method. This is caused 
> by creating the keyString variable. However, keyString may never be used if 
> the test result is correct (which is the case most of the time). Just 
> delaying the creation of keyString can yield a huge performance boost in the 
> verifying stage.
> !ITBLL-2.png!
> 3. An arguments check is needed, because there is a constraint between 
> arguments. If we break this constraint, we cannot get a correct circular 
> list.  
>   
>  4. Make the big family value size configurable.
>  
> 5. Avoid starting an RS on the backup master.
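
Item 2 above (delay creating keyString) can be illustrated with a small sketch. This is not the VerifyReducer code, which is not shown here; LazyKeyString, format and verify are hypothetical names, and the hex formatting merely stands in for whatever the real format method does.

```java
import java.util.function.Supplier;

/** Hypothetical illustration of building a diagnostic string only when it is needed. */
final class LazyKeyString {
  /** Counts how many times the (assumed expensive) formatting actually ran. */
  static int formatCalls = 0;

  private LazyKeyString() {}

  /** Stand-in for the costly format step: renders the key bytes as hex. */
  static String format(byte[] key) {
    formatCalls++;
    StringBuilder sb = new StringBuilder();
    for (byte b : key) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }

  /** Returns null when the check passed; formats the key only on failure. */
  static String verify(boolean ok, Supplier<String> keyString) {
    return ok ? null : "bad key: " + keyString.get();
  }

  public static void main(String[] args) {
    byte[] key = {0x0a, (byte) 0xff};
    verify(true, () -> format(key));                      // common case: no formatting
    System.out.println("format calls after success: " + formatCalls);
    System.out.println(verify(false, () -> format(key))); // failure: formatted once
  }
}
```

The point of passing a Supplier is that the formatting cost is paid only on the (rare) failure path, which is the saving the description attributes to delaying keyString.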





[jira] [Commented] (HBASE-21225) Having RPC & Space quota on a table/Namespace doesn't allow space quota to be removed using 'NONE'

2018-10-02 Thread Sakthi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636364#comment-16636364
 ] 

Sakthi commented on HBASE-21225:


[~elserj] thanks for your comments. 
{quote}1. Why all of the modification/removal of existing tests?
{quote}
Actually I just added two test cases (to specifically test tables/namespaces 
with both quotas) and renamed the previous ones to clearly specify that 
those tests are for drops with a single quota set on the entity. It looks like 
the git diff shows the previous test cases as modified too (which is not the 
case).
{quote}2. The purpose of the REMOVE => true entry is to denote that HBase 
should remove the SpaceQuota when it writes that out. With this change, it 
seems like you're leaving the REMOVE attribute in the protobuf message, but 
completely ignoring it which is confusing.
{quote}
Yes, you are right.
{quote}IMO, the bug is that GlobalSettingsQuotaImpl is not correctly removing 
the SpaceQuota when REMOVE is set to true.
{quote}
As of now, the bug seems to be in the following piece of code:

 
{code:java}
if (throttleBuilder == null
    && (spaceBuilder == null || (spaceBuilder.hasRemove() && spaceBuilder.getRemove()))
    && bypassGlobals == null) {
  return null;
}
{code}
When throttleBuilder is not set and spaceBuilder is set to remove, null is 
returned as the "merged" quota. But when throttleBuilder is set and 
spaceBuilder is set to remove, the code does not enter this condition and 
instead falls through to this return statement:
{code:java}
return new GlobalQuotaSettingsImpl(
    getUserName(), getTableName(), getNamespace(),
    (throttleBuilder == null ? null : throttleBuilder.build()),
    bypassGlobals,
    (spaceBuilder == null ? null : spaceBuilder.build()));
{code}
which only checks whether spaceBuilder is null and creates a new 
GlobalQuotaSettingsImpl object, which makes the "remove=>true" entry stay in 
the table. To fix how this works, instead of just checking that 
spaceBuilder == null, we can also check whether remove is set in this return 
statement:
{code:java}
return new GlobalQuotaSettingsImpl(
    getUserName(), getTableName(), getNamespace(),
    (throttleBuilder == null ? null : throttleBuilder.build()),
    bypassGlobals,
    (spaceBuilder == null || (spaceBuilder.hasRemove() && spaceBuilder.getRemove())
        ? null : spaceBuilder.build()));
{code}
This will produce the desired results.
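
To make the difference concrete, here is a self-contained toy model of just the space-quota argument. It does not use the HBase classes: SpaceQuotaMergeSketch and its nested SpaceQuota type are hypothetical stand-ins that only track the REMOVE flag.

```java
/** Toy model (hypothetical types) of the space-quota argument in the return statement. */
final class SpaceQuotaMergeSketch {
  /** Stand-in for the space quota builder: only tracks the REMOVE flag. */
  static final class SpaceQuota {
    final boolean remove;

    SpaceQuota(boolean remove) {
      this.remove = remove;
    }
  }

  private SpaceQuotaMergeSketch() {}

  /** Current behaviour as described: only the null check is applied. */
  static SpaceQuota mergedSpaceBefore(SpaceQuota space) {
    return space == null ? null : space;
  }

  /** Proposed behaviour: REMOVE => true also yields "no space quota". */
  static SpaceQuota mergedSpaceAfter(SpaceQuota space) {
    return (space == null || space.remove) ? null : space;
  }

  public static void main(String[] args) {
    SpaceQuota removing = new SpaceQuota(true);
    // Before: the entry carrying remove=true survives the merge.
    System.out.println("before keeps entry: " + (mergedSpaceBefore(removing) != null));
    // After: the entry is dropped, as if no space quota were set.
    System.out.println("after keeps entry:  " + (mergedSpaceAfter(removing) != null));
  }
}
```

A space quota without the REMOVE flag passes through unchanged in both variants, so the change would only affect the removal case discussed here.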

{quote}Seems the logic I had initially written around an RPC and Space quota on 
the same table/namespace is lacking. Does that make sense?{quote}
I will have to dig deeper. Will loop back with ideas, Josh.

> Having RPC & Space quota on a table/Namespace doesn't allow space quota to be 
> removed using 'NONE'
> --
>
> Key: HBASE-21225
> URL: https://issues.apache.org/jira/browse/HBASE-21225
> Project: HBase
>  Issue Type: Bug
>Reporter: Sakthi
>Assignee: Sakthi
>Priority: Major
> Attachments: hbase-21225.master.001.patch
>
>
> A part of HBASE-20705 is still unresolved. In that Jira it was assumed that 
> the problem is: when a table having both rpc & space quotas is dropped (with 
> hbase.quota.remove.on.table.delete set to true), the rpc quota is not set to 
> be dropped along with table drops, and the space quota could not be 
> removed completely because of the "EMPTY" row that the rpc quota left even 
> after removal. 
> The proposed solution for that was to make sure that the rpc quota didn't 
> leave empty rows after removal of the quota, and to set up automatic removal 
> of the rpc quota with table drops. That made sure that space quotas can be 
> recreated/removed.
> But all this was under the assumption that hbase.quota.remove.on.table.delete 
> is set to true. When it is set to false, the same issue can be reproduced. 
> Also, the steps shown below can be used to reproduce the issue without table 
> drops.
> {noformat}
> hbase(main):005:0> create 't2','cf'
> Created table t2
> Took 0.7619 seconds
> => Hbase::Table - t2
> hbase(main):006:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => 
> '10M/sec'
> Took 0.0514 seconds
> hbase(main):007:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', 
> POLICY => NO_WRITES
> Took 0.0162 seconds
> hbase(main):008:0> list_quotas
> OWNER  QUOTAS
>  TABLE => t2   TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, 
> LIMIT => 10M/sec, SCOPE =>
>MACHINE
>  TABLE => t2   TYPE => SPACE, TABLE => t2, LIMIT => 1073741824, 
> VIOLATION_POLICY => NO_WRIT
>ES
> 2 row(s)
> Took 0.0716 seconds
> hbase(main):009:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => NONE
> Took 

[jira] [Assigned] (HBASE-20913) List memstore direct memory/heap memory usage

2018-10-02 Thread huaxiang sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huaxiang sun reassigned HBASE-20913:


Assignee: (was: huaxiang sun)

> List memstore direct memory/heap memory usage
> -
>
> Key: HBASE-20913
> URL: https://issues.apache.org/jira/browse/HBASE-20913
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 2.0.0
>Reporter: huaxiang sun
>Priority: Minor
> Attachments: Screen Shot 2018-07-13 at 4.44.07 PM.png
>
>
> With offheap write path support, mslab can be allocated in offheap memory. 
> Currently, the RS Server Metrics only show Memstore Size and do not list 
> the DM usage for the memstore. It would be good to show the memstore's 
> offheap memory usage along with its heap memory usage on the web page, so we 
> know how much DM is used for the memstore.





[jira] [Updated] (HBASE-17392) Load DefaultStoreEngine when user misconfigures 'hbase.hstore.engine.class'

2018-10-02 Thread huaxiang sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huaxiang sun updated HBASE-17392:
-
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Resolved as Won't Fix as there are concerns.

> Load DefaultStoreEngine when user misconfigures 'hbase.hstore.engine.class'
> ---
>
> Key: HBASE-17392
> URL: https://issues.apache.org/jira/browse/HBASE-17392
> Project: HBase
>  Issue Type: Improvement
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Minor
> Attachments: HBASE-17392-master-001.patch
>
>
> When a user misconfigures 'hbase.hstore.engine.class', the region server 
> complains "Class not found" and gives up. In this case, we need to load the 
> DefaultStoreEngine to avoid that. A sanity check also needs to be done to 
> prevent the user from misconfiguring it.
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreEngine.java#L121
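
The fallback the description asks for can be sketched generically as below. This only illustrates the proposed behaviour (the issue was resolved as Won't Fix, so HBase does not necessarily do this); EngineClassFallback and loadOrDefault are hypothetical names, and the sketch uses plain class loading rather than the StoreEngine code.

```java
/** Hypothetical sketch: load a configured class name, falling back to a default. */
final class EngineClassFallback {
  private EngineClassFallback() {}

  /** Loads the configured class, or returns the default when it cannot be found. */
  static Class<?> loadOrDefault(String configuredName, Class<?> defaultClass) {
    try {
      return Class.forName(configuredName);
    } catch (ClassNotFoundException e) {
      // Report the misconfiguration instead of giving up.
      System.err.println("Class not found: " + configuredName
          + ", falling back to " + defaultClass.getName());
      return defaultClass;
    }
  }

  public static void main(String[] args) {
    // "no.such.StoreEngine" is a deliberately bogus name for the demonstration.
    Class<?> loaded = loadOrDefault("no.such.StoreEngine", Object.class);
    System.out.println("using " + loaded.getName());
  }
}
```

A silent fallback like this can hide a typo in the configuration, which may be the kind of concern that a separate sanity check is meant to address.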





[jira] [Commented] (HBASE-20690) Moving table to target rsgroup needs to handle TableStateNotFoundException

2018-10-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636323#comment-16636323
 ] 

Hadoop QA commented on HBASE-20690:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  0m  0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 21s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 24s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m  8s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  4m 25s{color} | {color:red} hbase-rsgroup in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 10s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 39m 27s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.rsgroup.TestRSGroupsWithACL |
|   | hadoop.hbase.rsgroup.TestRSGroupsOfflineMode |
|   | hadoop.hbase.rsgroup.TestEnableRSGroup |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20690 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941145/HBASE-20690.master.002.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 68ce2db3955d 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 42aa3dd463 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| unit | 

[jira] [Commented] (HBASE-19320) document the mysterious direct memory leak in hbase

2018-10-02 Thread huaxiang sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636322#comment-16636322
 ] 

huaxiang sun commented on HBASE-19320:
--

Let me update the doc a bit. With hbase 2.0, it is interesting as Netty has its 
own way to allocate and buffer DM.

> document the mysterious direct memory leak in hbase 
> 
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.6, 2.0.0
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Major
> Attachments: HBASE-19320-master-v001.patch, Screen Shot 2017-11-21 at 
> 4.43.36 PM.png, Screen Shot 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case, which took some time to 
> trace and debug. After discussing internally with our [~saint@gmail.com], we 
> think we have some findings and want to share them with the community.
> Basically, it is the issue described in 
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of 
> our hbase clusters.
> Creating the jira first; will fill in more details later.





[jira] [Assigned] (HBASE-16908) investigate flakey TestQuotaThrottle

2018-10-02 Thread huaxiang sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huaxiang sun reassigned HBASE-16908:


Assignee: (was: huaxiang sun)

> investigate flakey TestQuotaThrottle 
> -
>
> Key: HBASE-16908
> URL: https://issues.apache.org/jira/browse/HBASE-16908
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: huaxiang sun
>Priority: Minor
>
> Find out the root cause of the TestQuotaThrottle failures.
>  
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html





[jira] [Resolved] (HBASE-21037) Hbck needs to call Master.offlineRegion() to clean in-memory state after issuing a RS.closeRegion()

2018-10-02 Thread huaxiang sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huaxiang sun resolved HBASE-21037.
--
Resolution: Cannot Reproduce

> Hbck needs to call Master.offlineRegion() to clean in-memory state after 
> issuing a RS.closeRegion()
> ---
>
> Key: HBASE-21037
> URL: https://issues.apache.org/jira/browse/HBASE-21037
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Minor
>
> In certain cases, hbck issues a RS.closeRegion() to close a region on the RS. 
> It does not clean up the master's in-memory state for the offlined region, 
> and the balancer will bring back the closed region, causing region 
> inconsistency. Certain code needs to be re-examined to see whether a 
> Master.offlineRegion() call is needed.





[jira] [Commented] (HBASE-21037) Hbck needs to call Master.offlineRegion() to clean in-memory state after issuing a RS.closeRegion()

2018-10-02 Thread huaxiang sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636320#comment-16636320
 ] 

huaxiang sun commented on HBASE-21037:
--

I checked the code and did not find any case in which closeRegion() is not 
being called. Resolving this for now, until it pops up again. Thanks.

> Hbck needs to call Master.offlineRegion() to clean in-memory state after 
> issuing a RS.closeRegion()
> ---
>
> Key: HBASE-21037
> URL: https://issues.apache.org/jira/browse/HBASE-21037
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 1.2.6
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Minor
>
> In certain cases, hbck issues a RS.closeRegion() to close a region on the RS. 
> It does not clean up the master's in-memory state for the offlined region, 
> and the balancer will bring back the closed region, causing region 
> inconsistency. Certain code needs to be re-examined to see whether a 
> Master.offlineRegion() call is needed.





[jira] [Updated] (HBASE-21220) Port HBASE-20636 (Introduce two bloom filter type : ROWPREFIX and ROWPREFIX_DELIMITED) to branch-1

2018-10-02 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21220:
---
Status: Open  (was: Patch Available)

> Port HBASE-20636 (Introduce two bloom filter type : ROWPREFIX and 
> ROWPREFIX_DELIMITED) to branch-1
> --
>
> Key: HBASE-21220
> URL: https://issues.apache.org/jira/browse/HBASE-21220
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0
>
> Attachments: HBASE-21220-branch-1.patch
>
>






[jira] [Updated] (HBASE-21220) Port HBASE-20636 (Introduce two bloom filter type : ROWPREFIX and ROWPREFIX_DELIMITED) to branch-1

2018-10-02 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21220:
---
Status: Patch Available  (was: Open)

Resubmit patch

> Port HBASE-20636 (Introduce two bloom filter type : ROWPREFIX and 
> ROWPREFIX_DELIMITED) to branch-1
> --
>
> Key: HBASE-21220
> URL: https://issues.apache.org/jira/browse/HBASE-21220
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0
>
> Attachments: HBASE-21220-branch-1.patch
>
>






[jira] [Commented] (HBASE-21220) Port HBASE-20636 (Introduce two bloom filter type : ROWPREFIX and ROWPREFIX_DELIMITED) to branch-1

2018-10-02 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636308#comment-16636308
 ] 

Andrew Purtell commented on HBASE-21220:


The QA bot won't trigger here for some reason.

> Port HBASE-20636 (Introduce two bloom filter type : ROWPREFIX and 
> ROWPREFIX_DELIMITED) to branch-1
> --
>
> Key: HBASE-21220
> URL: https://issues.apache.org/jira/browse/HBASE-21220
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0
>
> Attachments: HBASE-21220-branch-1.patch
>
>






[jira] [Commented] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-10-02 Thread Colin Garcia (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636299#comment-16636299
 ] 

Colin Garcia commented on HBASE-20306:
--

My output from the latest run, for reference:

 
{code:java}
~/hbase/bin(master*) » ./hbase ltt -read 100:10 -write 10:100:10 -num_keys 
100
{code}
 
{code:java}
2018-10-02 17:42:15,369 INFO  
[MultiThreadedAction-ProgressReporter-1538525614665] util.MultiThreadedAction: 
[W:11] Keys=993612, cols=10.9 M, time=00:28:40 Overall: [keys/s= 577, 
latency=17.27 ms] Current: [keys/s=566, latency=17.62 ms], wroteUpTo=993597, 
wroteQSize=4

2018-10-02 17:42:15,423 INFO  
[MultiThreadedAction-ProgressReporter-1538525614699] util.MultiThreadedAction: 
[R:10] Keys=9144453, cols=155.6 M, time=00:28:40 Overall: [keys/s= 5314, 
latency=0.90 ms] Current: [keys/s=5187, latency=0.95 ms], verified=9144453

2018-10-02 17:42:20,369 INFO  
[MultiThreadedAction-ProgressReporter-1538525614665] util.MultiThreadedAction: 
[W:11] Keys=996682, cols=10.9 M, time=00:28:45 Overall: [keys/s= 577, 
latency=17.26 ms] Current: [keys/s=614, latency=16.24 ms], wroteUpTo=996669, 
wroteQSize=2

2018-10-02 17:42:20,425 INFO  
[MultiThreadedAction-ProgressReporter-1538525614699] util.MultiThreadedAction: 
[R:10] Keys=9171711, cols=156.1 M, time=00:28:45 Overall: [keys/s= 5314, 
latency=0.90 ms] Current: [keys/s=5451, latency=0.84 ms], verified=9171711

2018-10-02 17:42:25,369 INFO  
[MultiThreadedAction-ProgressReporter-1538525614665] util.MultiThreadedAction: 
[W:11] Keys=999798, cols=11.0 M, time=00:28:50 Overall: [keys/s= 577, 
latency=17.26 ms] Current: [keys/s=623, latency=16.01 ms], wroteUpTo=999780, 
wroteQSize=7

2018-10-02 17:42:25,429 INFO  
[MultiThreadedAction-ProgressReporter-1538525614699] util.MultiThreadedAction: 
[R:10] Keys=9199696, cols=156.5 M, time=00:28:50 Overall: [keys/s= 5315, 
latency=0.90 ms] Current: [keys/s=5597, latency=0.81 ms], verified=9199699

2018-10-02 17:42:30,406 INFO  
[MultiThreadedAction-ProgressReporter-1538525614665] util.MultiThreadedAction: 
RUN SUMMARY

KEYS PER SECOND [W]: mean=577.53, min=322.00, max=729.00, stdDev=65.36, 
50th=586.00, 75th=622.00, 95th=663.00, 99th=711.30, 99.9th=729.00, 
99.99th=729.00, 99.999th=729.00

LATENCY [W]: mean=17.05, min=13.00, max=30.00, stdDev=2.42, 50th=16.00, 
75th=18.00, 95th=22.00, 99th=27.00, 99.9th=30.00, 99.99th=30.00, 99.999th=30.00

2018-10-02 17:42:30,434 INFO  
[MultiThreadedAction-ProgressReporter-1538525614699] util.MultiThreadedAction: 
RUN SUMMARY

KEYS PER SECOND [R]: mean=5317.31, min=2953.00, max=6900.00, stdDev=639.58, 
50th=5366.00, 75th=5727.00, 95th=6248.65, 99th=6598.89, 99.9th=6900.00, 
99.99th=6900.00, 99.999th=6900.00

LATENCY [R]: mean=0.28, min=0.00, max=2.00, stdDev=0.46, 50th=0.00, 75th=1.00, 
95th=1.00, 99th=1.00, 99.9th=2.00, 99.99th=2.00, 99.999th=2.00

2018-10-02 17:42:30,783 INFO  [main] zookeeper.ReadOnlyZKClient: Close 
zookeeper connection 0x31aa3ca5 to localhost:2181

Failed to write keys: 0

2018-10-02 17:42:30,794 INFO  [main] zookeeper.ReadOnlyZKClient: Close 
zookeeper connection 0x6b44435b to localhost:2181{code}
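
For readers unfamiliar with the fields in the RUN SUMMARY lines, the sketch below shows how such statistics can be computed from a series of samples. It is not the code from the patch: RunSummarySketch is a made-up name, the standard deviation is the population form, and the percentile uses the nearest-rank definition. Fractional values in the output above (for example 95th=6248.65) suggest the real tool interpolates, so numbers from this sketch would not match it exactly.

```java
import java.util.Arrays;

/** Hypothetical sketch of the summary statistics; not the LoadTestTool implementation. */
final class RunSummarySketch {
  private RunSummarySketch() {}

  static double mean(double[] v) {
    return Arrays.stream(v).average().orElse(Double.NaN);
  }

  static double min(double[] v) {
    return Arrays.stream(v).min().orElse(Double.NaN);
  }

  static double max(double[] v) {
    return Arrays.stream(v).max().orElse(Double.NaN);
  }

  /** Population standard deviation (divides by n). */
  static double stdDev(double[] v) {
    double m = mean(v);
    double sumSq = 0;
    for (double x : v) {
      sumSq += (x - m) * (x - m);
    }
    return Math.sqrt(sumSq / v.length);
  }

  /** Nearest-rank percentile for p in (0, 100]; expects a non-empty array. */
  static double percentile(double[] v, double p) {
    double[] sorted = v.clone();
    Arrays.sort(sorted);
    int rank = (int) Math.ceil(p / 100.0 * sorted.length);
    return sorted[Math.max(rank, 1) - 1];
  }

  public static void main(String[] args) {
    // Made-up per-interval keys/s samples, not taken from the run above.
    double[] keysPerSecond = {566, 614, 623, 577, 590};
    System.out.printf("mean=%.2f, min=%.2f, max=%.2f, stdDev=%.2f, 95th=%.2f%n",
        mean(keysPerSecond), min(keysPerSecond), max(keysPerSecond),
        stdDev(keysPerSecond), percentile(keysPerSecond, 95));
  }
}
```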

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306-branch-1.000.patch, HBASE-20306.000.patch, 
> HBASE-20306.001.patch, HBASE-20306.002.patch, HBASE-20306.003.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.
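The RUN SUMMARY output in this thread reports mean, min, max, stdDev, and percentiles. A minimal sketch of how such figures can be computed is below. `SummaryStatsSketch` is a hypothetical name, not a class from the patch, and the nearest-rank percentile used here is an assumption; the logged values (for example 95th=6248.65) suggest the tool interpolates, so results will differ slightly.

```java
import java.util.Arrays;

/** Hypothetical helper; not the LoadTestTool implementation. */
final class SummaryStatsSketch {
  private static void requireNonEmpty(long[] samples) {
    if (samples == null || samples.length == 0) {
      throw new IllegalArgumentException("need at least one sample");
    }
  }

  /** Arithmetic mean of the samples. */
  static double mean(long[] samples) {
    requireNonEmpty(samples);
    double sum = 0;
    for (long s : samples) {
      sum += s;
    }
    return sum / samples.length;
  }

  /** Population standard deviation (divides by n, not n - 1). */
  static double stdDev(long[] samples) {
    double mean = mean(samples);
    double squares = 0;
    for (long s : samples) {
      squares += (s - mean) * (s - mean);
    }
    return Math.sqrt(squares / samples.length);
  }

  /** Nearest-rank percentile for p in [0, 100]; assumption, no interpolation. */
  static long percentile(long[] samples, double p) {
    requireNonEmpty(samples);
    if (p < 0 || p > 100) {
      throw new IllegalArgumentException("p must be in [0, 100]");
    }
    long[] sorted = samples.clone(); // leave the caller's array untouched
    Arrays.sort(sorted);
    int rank = (int) Math.ceil(p / 100.0 * sorted.length);
    return sorted[Math.max(rank, 1) - 1];
  }
}
```

Each reporter thread could collect one keys-per-second sample per status line and pass the array to these methods when it exits.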



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20690) Moving table to target rsgroup needs to handle TableStateNotFoundException

2018-10-02 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636295#comment-16636295
 ] 

Ted Yu commented on HBASE-20690:


Triggered QA run #14566

> Moving table to target rsgroup needs to handle TableStateNotFoundException
> --
>
> Key: HBASE-20690
> URL: https://issues.apache.org/jira/browse/HBASE-20690
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Guangxu Cheng
>Priority: Major
> Attachments: HBASE-20690.master.001.patch, 
> HBASE-20690.master.002.patch
>
>
> This is related code:
> {code}
>   if (targetGroup != null) {
>     for (TableName table: tables) {
>       if (master.getAssignmentManager().isTableDisabled(table)) {
>         LOG.debug("Skipping move regions because the table" + table + " is disabled.");
>         continue;
>       }
> {code}
> In a stack trace [~rmani] showed me:
> {code}
> 2018-06-06 07:10:44,893 ERROR 
> [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] 
> master.TableStateManager: Unable to get table demo:tbl1 state
> org.apache.hadoop.hbase.master.TableStateManager$TableStateNotFoundException: 
> demo:tbl1
> at 
> org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:193)
> at 
> org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.isTableDisabled(AssignmentManager.java:346)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.moveTables(RSGroupAdminServer.java:407)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.assignTableToGroup(RSGroupAdminEndpoint.java:447)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.postCreateTable(RSGroupAdminEndpoint.java:470)
> at 
> org.apache.hadoop.hbase.master.MasterCoprocessorHost$12.call(MasterCoprocessorHost.java:334)
> at 
> org.apache.hadoop.hbase.master.MasterCoprocessorHost$12.call(MasterCoprocessorHost.java:331)
> at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
> at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
> at 
> org.apache.hadoop.hbase.master.MasterCoprocessorHost.postCreateTable(MasterCoprocessorHost.java:331)
> at org.apache.hadoop.hbase.master.HMaster$3.run(HMaster.java:1768)
> at 
> org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131)
> at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1750)
> at 
> org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:593)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {code}
> The logic should take potential TableStateNotFoundException into account.
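One way to read the last sentence of the description, as a minimal standalone sketch: `TableStateLookup`, `MoveTablesSketch`, and the local `TableStateNotFoundException` are hypothetical stand-ins, not HBase classes, and skipping a table that has no recorded state is an assumed policy rather than the committed fix.

```java
import java.util.ArrayList;
import java.util.List;

/** Local stand-in for the exception named in the stack trace (hypothetical copy). */
class TableStateNotFoundException extends Exception {
  TableStateNotFoundException(String tableName) {
    super(tableName);
  }
}

/** Hypothetical lookup standing in for the isTableDisabled(table) check. */
interface TableStateLookup {
  boolean isTableDisabled(String table) throws TableStateNotFoundException;
}

final class MoveTablesSketch {
  /** Returns the tables whose regions should be moved to the target group. */
  static List<String> tablesToMove(List<String> tables, TableStateLookup lookup) {
    List<String> toMove = new ArrayList<>();
    for (String table : tables) {
      try {
        if (lookup.isTableDisabled(table)) {
          continue; // disabled tables have no regions to move
        }
      } catch (TableStateNotFoundException e) {
        // Assumption: a table with no recorded state yet (for example, one that
        // is still being created) is skipped instead of failing the request.
        continue;
      }
      toMove.add(table);
    }
    return toMove;
  }
}
```

The point of the sketch is only that the missing-state case gets an explicit branch; whether to skip, retry, or move the table anyway is a decision for the actual patch.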





[jira] [Commented] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636278#comment-16636278
 ] 

Hadoop QA commented on HBASE-21259:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  0m  0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 15s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 46s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 15s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 32s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 57s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 32s{color} | {color:green} branch-2.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  2m 23s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 20s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 20s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red}  2m 24s{color} | {color:red} patch has 14 errors when building our shaded downstream artifacts. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  1m 23s{color} | {color:red} The patch causes 14 errors with Hadoop v2.7.4. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  3m  2s{color} | {color:red} The patch causes 14 errors with Hadoop v3.0.0. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 22s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 18s{color} | {color:red} hbase-server in the patch failed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 22s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 10s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 44s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-21259 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12942211/HBASE-21259.branch-2.1.001.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 3ed150ee22d2 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2.1 / 3df8b6f7bb |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| mvninstall | 

[jira] [Commented] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-10-02 Thread Colin Garcia (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636274#comment-16636274
 ] 

Colin Garcia commented on HBASE-20306:
--

That's fine, I agree with this assessment. I'm just spitballing to determine 
what the cause could be. Will keep investigating.

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306-branch-1.000.patch, HBASE-20306.000.patch, 
> HBASE-20306.001.patch, HBASE-20306.002.patch, HBASE-20306.003.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.





[jira] [Commented] (HBASE-21220) Port HBASE-20636 (Introduce two bloom filter type : ROWPREFIX and ROWPREFIX_DELIMITED) to branch-1

2018-10-02 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636273#comment-16636273
 ] 

Tak Lon (Stephen) Wu commented on HBASE-21220:
--

[non-binding] +1. Reviewed the patch and ran the unit tests on top of branch-1; 
they passed. Is the QA bot not working?

> Port HBASE-20636 (Introduce two bloom filter type : ROWPREFIX and 
> ROWPREFIX_DELIMITED) to branch-1
> --
>
> Key: HBASE-21220
> URL: https://issues.apache.org/jira/browse/HBASE-21220
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0
>
> Attachments: HBASE-21220-branch-1.patch
>
>






[jira] [Commented] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-10-02 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636270#comment-16636270
 ] 

Andrew Purtell commented on HBASE-20306:


LTT shuts down without the patch applied but fails to terminate with the patch 
applied. I'm sorry, I can't commit under these circumstances. If you can't 
catch it, I can try again later this week and drop a stack trace if it repros.

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306-branch-1.000.patch, HBASE-20306.000.patch, 
> HBASE-20306.001.patch, HBASE-20306.002.patch, HBASE-20306.003.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.





[jira] [Comment Edited] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-10-02 Thread Colin Garcia (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636264#comment-16636264
 ] 

Colin Garcia edited comment on HBASE-20306 at 10/3/18 12:16 AM:


I see. But I don't think the old if/else is right then either. If the reader 
progress was printing the same status, that would have to mean that there were 
still worker threads running (as they were formerly the same check), so the old 
if/else wouldn't have terminated either. This problem might be elsewhere, 
unrelated to the change but having something to do with how these worker 
threads are exiting. 

I'm rerunning to see if I can mimic the behavior you describe. 


was (Author: colingarcia):
I see. But I don't think the old if/else is right then either. If the reader 
progress was printing the same status, that would have to mean that there were 
still worker threads running (as they were formerly the same check), so the old 
if/else wouldn't have terminated either. This problem might be elsewhere, 
unrelated to the change but having something to do with how these worker 
threads are exiting. 

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306-branch-1.000.patch, HBASE-20306.000.patch, 
> HBASE-20306.001.patch, HBASE-20306.002.patch, HBASE-20306.003.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.





[jira] [Commented] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-10-02 Thread Colin Garcia (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636264#comment-16636264
 ] 

Colin Garcia commented on HBASE-20306:
--

I see. But I don't think the old if/else is right then either. If the reader 
progress was printing the same status, that would have to mean that there were 
still worker threads running (as they were formerly the same check), so the old 
if/else wouldn't have terminated either. This problem might be elsewhere, 
unrelated to the change but having something to do with how these worker 
threads are exiting. 
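The argument above ties the reporter's exit to the number of running worker threads. A minimal sketch of a reporter loop with that exit condition follows; `ProgressReporterSketch` and `reportUntilDone` are hypothetical names, and this is a model of the idea, not the LoadTestTool code or the patch.

```java
import java.util.function.IntSupplier;

/** Hypothetical model of a progress reporter; not the LoadTestTool code. */
final class ProgressReporterSketch {
  /**
   * Prints a status line per poll while workers remain, then a summary.
   * Returns the number of status lines printed.
   */
  static int reportUntilDone(IntSupplier activeWorkers, long pollMillis) {
    int statusLines = 0;
    while (true) {
      // Read the count once per iteration so the check and the print agree.
      int active = activeWorkers.getAsInt();
      if (active <= 0) {
        break; // exit depends on the worker count, not on the status text
      }
      System.out.println("status: active workers=" + active);
      statusLines++;
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt and stop
        break;
      }
    }
    System.out.println("RUN SUMMARY");
    return statusLines;
  }
}
```

With a loop of this shape, a reporter that keeps printing an unchanged status line implies that the supplier still reports at least one live worker, which is why the comment looks at how the worker threads exit.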

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306-branch-1.000.patch, HBASE-20306.000.patch, 
> HBASE-20306.001.patch, HBASE-20306.002.patch, HBASE-20306.003.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.





[jira] [Updated] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-10-02 Thread Colin Garcia (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Garcia updated HBASE-20306:
-
Attachment: HBASE-20306.003.patch

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306-branch-1.000.patch, HBASE-20306.000.patch, 
> HBASE-20306.001.patch, HBASE-20306.002.patch, HBASE-20306.003.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.





[jira] [Comment Edited] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-02 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636257#comment-16636257
 ] 

stack edited comment on HBASE-21259 at 10/3/18 12:03 AM:
-

.001 seems to work mostly (it has worked 99% of the time... trying to figure 
out where the holes are). It is simply an undo of all the places we auto-create 
ServerStateNodes so that we don't create one long after a server has been dead 
and gone (messing up UP#remoteCallFailed processing).

Let me figure out where the holes are and see if I can do a test too.


was (Author: stack):
.001 seems to work mostly (it has worked 99% of the time... trying to figure 
out where the holes are). It is simply an undo of all the places we auto-create 
ServerStateNodes so that we don't create one long after a server has been dead 
and gone (messing up UP#remoteCallFailed processing).

> [amv2] Revived deadservers; recreated serverstatenode
> -
>
> Key: HBASE-21259
> URL: https://issues.apache.org/jira/browse/HBASE-21259
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21259.branch-2.1.001.patch
>
>
> On startup, I see servers being revived; i.e. their serverstatenode is 
> getting marked online even though it's just been processed by 
> ServerCrashProcedure. It looks like this (in a patched server that reports 
> whenever a serverstatenode is created):
> {code}
> 2018-09-29 03:45:40,963 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, 
> state=SUCCESS; ServerCrashProcedure 
> server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, 
> meta=false in 1.0130sec
> ...
> 2018-09-29 03:45:43,733 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! 
> vb1442.halxg.cloudera.com,22101,1536675314426
> java.lang.RuntimeException: WHERE AM I?
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
> at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)
> {code}
> See how we've just finished an SCP, which will have removed the 
> serverstatenode... but then we come across an unassign that references the 
> server that was just processed. The unassign will attempt to update the 
> serverstatenode and, in doing so, we create one if one is not present. We 
> shouldn't be creating one.
> I think I see this a lot because I am scheduling unassigns with hbck2. The 
> servers crash and then come up with SCPs doing cleanup of the old server while 
> unassign procedures are still in the procedure executor queue waiting to be 
> processed, but this could happen at any time on a cluster should an unassign 
> happen to get scheduled near an SCP.





[jira] [Commented] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-02 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636257#comment-16636257
 ] 

stack commented on HBASE-21259:
---

.001 seems to work mostly (it has worked 99% of the time... trying to figure 
out where the holes are). It is simply an undo of all the places we auto-create 
ServerStateNodes so that we don't create one long after a server has been dead 
and gone (messing up UP#remoteCallFailed processing).
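The difference between an auto-creating lookup and one that never creates can be sketched as below. `ServerNodeRegistrySketch` and its methods are hypothetical, with a plain string standing in for a ServerStateNode; this models the idea in the comment and is not the patch itself.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry; a String state stands in for a ServerStateNode. */
final class ServerNodeRegistrySketch {
  private final Map<String, String> nodes = new ConcurrentHashMap<>();

  /** Called only when a server is known to be live, e.g. when it checks in. */
  void register(String serverName) {
    nodes.putIfAbsent(serverName, "ONLINE");
  }

  /** Called at the end of crash processing for a server. */
  void remove(String serverName) {
    nodes.remove(serverName);
  }

  /** Lookup that never creates; callers must handle an absent node. */
  Optional<String> find(String serverName) {
    return Optional.ofNullable(nodes.get(serverName));
  }

  /** Auto-creating lookup: silently revives a node that was removed. */
  String getOrCreate(String serverName) {
    return nodes.computeIfAbsent(serverName, name -> "ONLINE");
  }
}
```

A late procedure that calls `find` after crash processing sees an empty result and can fail cleanly, whereas `getOrCreate` would bring the entry back as if the server were online.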

> [amv2] Revived deadservers; recreated serverstatenode
> -
>
> Key: HBASE-21259
> URL: https://issues.apache.org/jira/browse/HBASE-21259
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21259.branch-2.1.001.patch
>
>
> On startup, I see servers being revived; i.e. their serverstatenode is 
> getting marked online even though it's just been processed by 
> ServerCrashProcedure. It looks like this (in a patched server that reports 
> whenever a serverstatenode is created):
> {code}
> 2018-09-29 03:45:40,963 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, 
> state=SUCCESS; ServerCrashProcedure 
> server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, 
> meta=false in 1.0130sec
> ...
> 2018-09-29 03:45:43,733 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! 
> vb1442.halxg.cloudera.com,22101,1536675314426
> java.lang.RuntimeException: WHERE AM I?
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
> at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)
> {code}
> See how we've just finished an SCP, which will have removed the 
> serverstatenode... but then we come across an unassign that references the 
> server that was just processed. The unassign will attempt to update the 
> serverstatenode and, in doing so, we create one if one is not present. We 
> shouldn't be creating one.
> I think I see this a lot because I am scheduling unassigns with hbck2. The 
> servers crash and then come up with SCPs doing cleanup of the old server while 
> unassign procedures are still in the procedure executor queue waiting to be 
> processed, but this could happen at any time on a cluster should an unassign 
> happen to get scheduled near an SCP.





[jira] [Updated] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-02 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21259:
--
Status: Patch Available  (was: Open)

> [amv2] Revived deadservers; recreated serverstatenode
> -
>
> Key: HBASE-21259
> URL: https://issues.apache.org/jira/browse/HBASE-21259
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21259.branch-2.1.001.patch
>
>
> On startup, I see servers being revived; i.e. their serverstatenode is 
> getting marked online even though it's just been processed by 
> ServerCrashProcedure. It looks like this (in a patched server that reports 
> whenever a serverstatenode is created):
> {code}
> 2018-09-29 03:45:40,963 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, 
> state=SUCCESS; ServerCrashProcedure 
> server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, 
> meta=false in 1.0130sec
> ...
> 2018-09-29 03:45:43,733 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! 
> vb1442.halxg.cloudera.com,22101,1536675314426
> java.lang.RuntimeException: WHERE AM I?
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
> at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)
> {code}
> See how we've just finished an SCP, which will have removed the 
> serverstatenode... but then we come across an unassign that references the 
> server that was just processed. The unassign will attempt to update the 
> serverstatenode and, in doing so, we create one if one is not present. We 
> shouldn't be creating one.
> I think I see this a lot because I am scheduling unassigns with hbck2. The 
> servers crash and then come up with SCPs doing cleanup of the old server while 
> unassign procedures are still in the procedure executor queue waiting to be 
> processed, but this could happen at any time on a cluster should an unassign 
> happen to get scheduled near an SCP.





[jira] [Comment Edited] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-02 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635924#comment-16635924
 ] 

stack edited comment on HBASE-21259 at 10/3/18 12:00 AM:
-

[~allan163]

 * meta has a region in CLOSING state against a server that has no mention in 
fs, is not online, and has no znode, so it is 'unknown' to the system.
 * I try to move the region 'manually' via hbck2 from CLOSING to CLOSED -- i.e. 
unassign -- so I can assign it elsewhere. The CLOSING dispatch fails because 
there is no such server, and the UP expires the server, which queues an SCP.
 * The SCP runs. Finds no logs to split. Finds the stuck UP and calls its 
handleFailure. The UP then moves to CLOSED and all is good.
 * Except, the SCP has now resulted in there being a deadserver element.
 * So, when the next region that references the 'unknown' server comes along, 
it goes to unassign, fails, and tries to queue a server expiration.
 * But the attempt at expiration is rejected because 'there is one in progress 
already' (because the server has an entry in dead servers -- See 
ServerManager#expireServer) so we skip out without queuing a new SCP.
 * This second UP and all subsequent regions that were pointing to the 
'unknown' server end up 'hung', suspended waiting for someone to wake them.

I have to call 'bypass' on each to get them out of suspend. I cannot unassign 
the regions, not in bulk at least.

If a server is dead we should not be reviving it. It causes more headaches than 
it solves.

My first patch stopped us reviving a server if it is unknown, but it messed 
up startup. Let me try and be more clinical.
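The rejection step in the list above can be modelled in a few lines. `ExpireServerSketch` is a hypothetical class written for illustration; it is not the code of ServerManager#expireServer, and it only assumes what the comment states, namely that an existing dead-server entry causes the expiration to be skipped.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

/** Hypothetical model of the expiration check described in the comment. */
final class ExpireServerSketch {
  private final Set<String> deadServers = new HashSet<>();
  private final Queue<String> crashProcedures = new ArrayDeque<>();

  /** Returns true if a crash procedure was queued, false if rejected. */
  boolean expireServer(String serverName) {
    // Set.add returns false when the server already has a dead-server entry,
    // which this model treats as "there is one in progress already".
    if (!deadServers.add(serverName)) {
      return false;
    }
    crashProcedures.add(serverName);
    return true;
  }

  /** Number of crash procedures queued so far. */
  int queuedCrashProcedures() {
    return crashProcedures.size();
  }
}
```

In this model the first unassign against the unknown server queues a crash procedure, and every later one is turned away, which matches the 'hung' procedures the list ends with: nothing is left to wake them.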


was (Author: stack):
[~allan163]

 * meta has a region in CLOSING state against a server that has no mention in 
fs, is not online, and has no znode, so it is 'unknown' to the system.
 * I try to move the region from CLOSING to CLOSED so I can assign it 
elsewhere. The CLOSING dispatch fails because there is no such server, and the 
UP queues an SCP.
 * The SCP runs. Finds no logs to split. Finds the stuck UP and calls its 
handleFailure. The UP then moves to CLOSED and all is good.
 * Except, the SCP has now resulted in there being a deadserver element.
 * So, when the new region that references the 'unknown' server comes along, it 
goes to unassign, fails, and tries to queue an expiration.
 * But the attempt at expiration is rejected because 'there is one in progress 
already' (because the server has an entry in dead servers) so we skip out 
without queuing a new SCP.
 * The second UP and all subsequent regions that were pointing to the 'unknown' 
server end up 'hung', suspended waiting for someone to wake them.

If a server is dead we should not be reviving it. It causes more headaches than 
it solves.

My first patch stopped us reviving a server if it is unknown, but it messed 
up startup. Let me try and be more clinical.

> [amv2] Revived deadservers; recreated serverstatenode
> -
>
> Key: HBASE-21259
> URL: https://issues.apache.org/jira/browse/HBASE-21259
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21259.branch-2.1.001.patch
>
>
> On startup, I see servers being revived; i.e. their serverstatenode is 
> getting marked online even though it's just been processed by 
> ServerCrashProcedure. It looks like this (in a patched server that reports 
> whenever a serverstatenode is created):
> {code}
> 2018-09-29 03:45:40,963 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, 
> state=SUCCESS; ServerCrashProcedure 
> server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, 
> meta=false in 1.0130sec
> ...
> 2018-09-29 03:45:43,733 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! 
> vb1442.halxg.cloudera.com,22101,1536675314426
> java.lang.RuntimeException: WHERE AM I?
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
> at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at 
> 

[jira] [Commented] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-10-02 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636256#comment-16636256
 ] 

Andrew Purtell commented on HBASE-20306:


I don't have the output. Problem was as described. Writer progress thread 
prints summary and terminates. Reader progress thread continues forever, 
printing the same (unchanging) status over and over, requiring ^C. 

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306-branch-1.000.patch, HBASE-20306.000.patch, 
> HBASE-20306.001.patch, HBASE-20306.002.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-02 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21259:
--
Attachment: HBASE-21259.branch-2.1.001.patch

> [amv2] Revived deadservers; recreated serverstatenode
> -
>
> Key: HBASE-21259
> URL: https://issues.apache.org/jira/browse/HBASE-21259
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21259.branch-2.1.001.patch
>
>
> On startup, I see servers being revived; i.e. their serverstatenode is 
> getting marked online even though it's just been processed by 
> ServerCrashProcedure. It looks like this (in a patched server that reports on 
> whenever a serverstatenode is created):
> {code}
> 2018-09-29 03:45:40,963 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, 
> state=SUCCESS; ServerCrashProcedure 
> server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, 
> meta=false in 1.0130sec
> ...
> 2018-09-29 03:45:43,733 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! 
> vb1442.halxg.cloudera.com,22101,1536675314426
> java.lang.RuntimeException: WHERE AM I?
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
> at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)
> {code}
> See how we've just finished an SCP, which will have removed the 
> serverstatenode... but then we come across an unassign that references the 
> server that was just processed. The unassign will attempt to update the 
> serverstatenode, and in doing so we create one if one is not present. We 
> shouldn't be creating one.
> I think I see this a lot because I am scheduling unassigns with hbck2. The 
> servers crash and then come up with SCPs doing cleanup of the old server while 
> unassign procedures are still in the procedure executor queue waiting to be 
> processed, but this could happen at any time on a cluster should an unassign 
> happen to get scheduled near an SCP.





[jira] [Commented] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-10-02 Thread Colin Garcia (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636241#comment-16636241
 ] 

Colin Garcia commented on HBASE-20306:
--

[~apurtell] Thanks! Strangely, I ran with the same input and both threads 
finished (although it took some time). Could there be a race condition here? I 
don't see how one could run indefinitely as at some point the worker threads 
will finish, setting isProgressReporter to false. Can you paste what your 
output was? 

Fixing the other pieces you suggested now.

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306-branch-1.000.patch, HBASE-20306.000.patch, 
> HBASE-20306.001.patch, HBASE-20306.002.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.





[jira] [Updated] (HBASE-21263) Mention compression algorithm along with other storefile details

2018-10-02 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21263:
---
Description: 
Where we log storefile details we should also log the compression algorithm 
used to compress blocks on disk, if any. 

For example, here's a log line out of compaction:

2018-10-02 21:59:47,594 DEBUG 
[regionserver/host/1.1.1.1:8120-longCompactions-1538517461152] 
compactions.Compactor: Compacting 
hdfs://namenode:8020/hbase/data/default/TestTable/86037c19117a46b5b8148439ea55753b/i/3d04a7c28d6343ceb773737dbb192533,
 keycount=3335873, bloomtype=ROW, size=107.5 M, encoding=ROW_INDEX_V1, 
seqNum=154199, earliestPutTs=1538516084915

Aside from bloom type, block encoding, and filename, it would be good to know 
compression type in this type of DEBUG or INFO level logging. A minor omission 
of information that could be helpful during debugging. 

  was:
Where we log storefile details we should also log the compression algorithm 
used to compress blocks on disk, if any. 

For example, here's a log line out of compaction:

2018-10-02 21:59:47,594 DEBUG 
[regionserver/host/1.1.1.1:8120-longCompactions-1538517461152] 
compactions.Compactor: Compacting 
hdfs://namenode:8020/hbase/data/default/TestTable/86037c19117a46b5b8148439ea55753b/tiny/3d04a7c28d6343ceb773737dbb192533,
 keycount=3335873, bloomtype=ROW, size=107.5 M, encoding=ROW_INDEX_V1, 
seqNum=154199, earliestPutTs=1538516084915

Aside from bloom type, block encoding, and filename, it would be good to know 
compression type in this type of DEBUG or INFO level logging. A minor omission 
of information that could be helpful during debugging. 


> Mention compression algorithm along with other storefile details
> 
>
> Key: HBASE-21263
> URL: https://issues.apache.org/jira/browse/HBASE-21263
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
>
> Where we log storefile details we should also log the compression algorithm 
> used to compress blocks on disk, if any. 
> For example, here's a log line out of compaction:
> 2018-10-02 21:59:47,594 DEBUG 
> [regionserver/host/1.1.1.1:8120-longCompactions-1538517461152] 
> compactions.Compactor: Compacting 
> hdfs://namenode:8020/hbase/data/default/TestTable/86037c19117a46b5b8148439ea55753b/i/3d04a7c28d6343ceb773737dbb192533,
>  keycount=3335873, bloomtype=ROW, size=107.5 M, encoding=ROW_INDEX_V1, 
> seqNum=154199, earliestPutTs=1538516084915
> Aside from bloom type, block encoding, and filename, it would be good to know 
> compression type in this type of DEBUG or INFO level logging. A minor 
> omission of information that could be helpful during debugging. 





[jira] [Updated] (HBASE-21263) Mention compression algorithm along with other storefile details

2018-10-02 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21263:
---
Labels: beginner beginners  (was: )

> Mention compression algorithm along with other storefile details
> 
>
> Key: HBASE-21263
> URL: https://issues.apache.org/jira/browse/HBASE-21263
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
>
> Where we log storefile details we should also log the compression algorithm 
> used to compress blocks on disk, if any. 
> For example, here's a log line out of compaction:
> 2018-10-02 21:59:47,594 DEBUG 
> [regionserver/host/1.1.1.1:8120-longCompactions-1538517461152] 
> compactions.Compactor: Compacting 
> hdfs://namenode:8020/hbase/data/default/TestTable/86037c19117a46b5b8148439ea55753b/tiny/3d04a7c28d6343ceb773737dbb192533,
>  keycount=3335873, bloomtype=ROW, size=107.5 M, encoding=ROW_INDEX_V1, 
> seqNum=154199, earliestPutTs=1538516084915
> Aside from bloom type, block encoding, and filename, it would be good to know 
> compression type in this type of DEBUG or INFO level logging. A minor 
> omission of information that could be helpful during debugging. 





[jira] [Created] (HBASE-21263) Mention compression algorithm along with other storefile details

2018-10-02 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-21263:
--

 Summary: Mention compression algorithm along with other storefile 
details
 Key: HBASE-21263
 URL: https://issues.apache.org/jira/browse/HBASE-21263
 Project: HBase
  Issue Type: Improvement
Reporter: Andrew Purtell


Where we log storefile details we should also log the compression algorithm 
used to compress blocks on disk, if any. 

For example, here's a log line out of compaction:

2018-10-02 21:59:47,594 DEBUG 
[regionserver/host/1.1.1.1:8120-longCompactions-1538517461152] 
compactions.Compactor: Compacting 
hdfs://namenode:8020/hbase/data/default/TestTable/86037c19117a46b5b8148439ea55753b/tiny/3d04a7c28d6343ceb773737dbb192533,
 keycount=3335873, bloomtype=ROW, size=107.5 M, encoding=ROW_INDEX_V1, 
seqNum=154199, earliestPutTs=1538516084915

Aside from bloom type, block encoding, and filename, it would be good to know 
compression type in this type of DEBUG or INFO level logging. A minor omission 
of information that could be helpful during debugging. 
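
To make the request concrete, here is a minimal sketch of a storefile detail line that also carries the compression type. The class name, the method name, and the `compression=` field label are all hypothetical; this is not the Compactor code, only an illustration of where the extra field could sit among the fields shown in the log line above.

```java
/** Hypothetical sketch; not the actual HBase storefile logging code. */
class StoreFileDetailsSketch {

  /**
   * Builds a detail string shaped like the compaction log line quoted above,
   * with one extra field (assumed label: "compression") after the encoding.
   */
  static String describe(String path, long keyCount, String bloomType, String size,
      String encoding, String compression, long seqNum) {
    return path
        + ", keycount=" + keyCount
        + ", bloomtype=" + bloomType
        + ", size=" + size
        + ", encoding=" + encoding
        + ", compression=" + compression // the field this issue asks for
        + ", seqNum=" + seqNum;
  }

  public static void main(String[] args) {
    // Values below are illustrative only.
    System.out.println(describe("hdfs://namenode:8020/hbase/data/default/TestTable/example",
        3335873L, "ROW", "107.5 M", "ROW_INDEX_V1", "NONE", 154199L));
  }
}
```

Placing the compression type next to the encoding keeps the two on-disk format details together, which is where someone debugging block layout would likely look first.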





[jira] [Commented] (HBASE-21243) Correct java-doc for the method RpcServer.getRemoteAddress()

2018-10-02 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635974#comment-16635974
 ] 

Josh Elser commented on HBASE-21243:


[~nihaljain.cs], how about a patch to fix this? :)

> Correct java-doc for the method RpcServer.getRemoteAddress()
> 
>
> Key: HBASE-21243
> URL: https://issues.apache.org/jira/browse/HBASE-21243
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.0.0
>Reporter: Nihal Jain
>Priority: Trivial
>  Labels: beginner, beginners, documentaion
>
> Correct the java-doc for the method {{RpcServer.getRemoteAddress()}}.
>  Currently it looks like this:
> {code:java}
>   /**
>    * @return Address of remote client if a request is ongoing, else null
>    */
>   public static Optional<InetAddress> getRemoteAddress() {
>     return getCurrentCall().map(RpcCall::getRemoteAddress);
>   }
> {code}
> Contrary to the doc the method will never return null. Rather it may return 
> an empty Optional.
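
A minimal sketch of the point being made, assuming a hypothetical stand-in class rather than the real RpcServer: a method built on `Optional` reports "no request in flight" as an empty Optional, so the Javadoc should say that instead of "else null". The thread-local and the helper methods below are invented for illustration.

```java
import java.net.InetAddress;
import java.util.Optional;

/** Hypothetical stand-in for the RPC layer; not the real RpcServer. */
class RemoteAddressSketch {
  // Stand-in for the "current call" holder; unset means no request is in flight.
  private static final ThreadLocal<InetAddress> CURRENT = new ThreadLocal<>();

  static void setCurrentCall(InetAddress addr) {
    CURRENT.set(addr);
  }

  static void clearCurrentCall() {
    CURRENT.remove();
  }

  /**
   * @return Address of remote client if a request is ongoing, else an empty Optional
   */
  static Optional<InetAddress> getRemoteAddress() {
    return Optional.ofNullable(CURRENT.get());
  }

  public static void main(String[] args) {
    // No call in flight: the result is an empty Optional, never null.
    System.out.println(getRemoteAddress().isPresent());
    setCurrentCall(InetAddress.getLoopbackAddress());
    System.out.println(getRemoteAddress().map(InetAddress::getHostAddress).orElse("none"));
    clearCurrentCall();
  }
}
```

Callers written against the corrected doc would use `isPresent()`, `map`, or `orElse` rather than a null check.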





[jira] [Commented] (HBASE-21200) Memstore flush doesn't finish because of seekToPreviousRow() in memstore scanner.

2018-10-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635939#comment-16635939
 ] 

Hadoop QA commented on HBASE-21200:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
11s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
13s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 55s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}124m 
46s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}166m 28s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21200 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12942119/HBASE-21200.master.001.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 88635976dfdf 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 42aa3dd463 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14564/testReport/ |
| Max. process+thread count | 5083 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14564/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Memstore flush 

[jira] [Commented] (HBASE-21185) WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes

2018-10-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635930#comment-16635930
 ] 

Hadoop QA commented on HBASE-21185:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
14s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
16s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 32s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}120m  
8s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}161m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21185 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12942118/HBASE-21185.master.003.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 519e9f4b7105 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 42aa3dd463 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14563/testReport/ |
| Max. process+thread count | 4451 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 

[jira] [Commented] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode

2018-10-02 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635924#comment-16635924
 ] 

stack commented on HBASE-21259:
---

[~allan163]

 * meta has a region in CLOSING state against a server that has no mention in 
the fs, is not online, and has no znode, so it is 'unknown' to the system.
 * I try to move the region from CLOSING to CLOSED so I can assign it 
elsewhere. The CLOSING dispatch fails because there is no such server, and the 
UP queues an SCP.
 * The SCP runs. It finds no logs to split. It finds the stuck UP and calls its 
handleFailure. The UP then moves to CLOSED and all is good.
 * Except, the SCP has now resulted in there being a deadserver element.
 * So, when a new region that references the 'unknown' server comes along, it 
goes to unassign, fails, and tries to queue an expiration.
 * But the attempt at expiration is rejected because 'there is one in progress 
already' (because the server has an entry in dead servers), so we skip out 
without queuing a new SCP.
 * The second UP and all subsequent regions that were pointing to the 'unknown' 
server end up 'hung', suspended waiting for someone to wake them.

If a server is dead we should not be reviving it. It causes more headaches than 
it solves.

My first patch stopped us reviving a server if it is unknown, but it messed 
up startup. Let me try and be more clinical.
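
The difference between "get or create" and "lookup only" can be sketched as below. Everything here is hypothetical (class, method names, and the string-keyed maps); it is not RegionStates and not the attached patch. As noted above, a blanket refusal to create nodes got in the way of startup, so this only illustrates the distinction and is not a proposed fix.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; not the real RegionStates. */
class ServerNodeSketch {
  // server name -> regions tracked against that server
  private final Map<String, Set<String>> serverNodes = new ConcurrentHashMap<>();
  private final Set<String> deadServers = ConcurrentHashMap.newKeySet();

  /** A server checks in: this is the one place a node is meant to be created. */
  void serverOnline(String server) {
    serverNodes.computeIfAbsent(server, s -> ConcurrentHashMap.newKeySet());
  }

  /** Stand-in for the end of crash processing: node removed, server marked dead. */
  void crashProcessed(String server) {
    deadServers.add(server);
    serverNodes.remove(server);
  }

  /** Behaviour described in the issue: a missing node is silently recreated. */
  Set<String> getOrCreateServer(String server) {
    return serverNodes.computeIfAbsent(server, s -> ConcurrentHashMap.newKeySet());
  }

  /** Lookup only: never creates a node. */
  Optional<Set<String>> getServer(String server) {
    return Optional.ofNullable(serverNodes.get(server));
  }

  /** Returns false instead of reviving a server whose node is gone. */
  boolean addRegionToServer(String server, String region) {
    return getServer(server).map(regions -> regions.add(region)).orElse(false);
  }

  boolean hasNode(String server) {
    return serverNodes.containsKey(server);
  }

  boolean isDead(String server) {
    return deadServers.contains(server);
  }

  public static void main(String[] args) {
    ServerNodeSketch states = new ServerNodeSketch();
    states.serverOnline("rs1");
    states.crashProcessed("rs1");
    System.out.println(states.addRegionToServer("rs1", "regionA")); // node stays absent
    states.getOrCreateServer("rs1"); // a late unassign on the old path revives the node
    System.out.println(states.hasNode("rs1") && states.isDead("rs1"));
  }
}
```

With the lookup-only path, a late unassign gets an explicit "no such server" answer it can act on, rather than a fresh node for a server that has already been processed.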

> [amv2] Revived deadservers; recreated serverstatenode
> -
>
> Key: HBASE-21259
> URL: https://issues.apache.org/jira/browse/HBASE-21259
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.1.0
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
>
> On startup, I see servers being revived; i.e. their serverstatenode is 
> getting marked online even though it's just been processed by 
> ServerCrashProcedure. It looks like this (in a patched server that reports on 
> whenever a serverstatenode is created):
> {code}
> 2018-09-29 03:45:40,963 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, 
> state=SUCCESS; ServerCrashProcedure 
> server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, 
> meta=false in 1.0130sec
> ...
> 2018-09-29 03:45:43,733 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! 
> vb1442.halxg.cloudera.com,22101,1536675314426
> java.lang.RuntimeException: WHERE AM I?
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
> at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)
> {code}
> See how we've just finished an SCP, which will have removed the 
> serverstatenode... but then we come across an unassign that references the 
> server that was just processed. The unassign will attempt to update the 
> serverstatenode, and in doing so we create one if one is not present. We 
> shouldn't be creating one.
> I think I see this a lot because I am scheduling unassigns with hbck2. The 
> servers crash and then come up with SCPs doing cleanup of the old server while 
> unassign procedures are still in the procedure executor queue waiting to be 
> processed, but this could happen at any time on a cluster should an unassign 
> happen to get scheduled near an SCP.





[jira] [Updated] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-10-02 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20306:
---
Status: Open  (was: Patch Available)

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306-branch-1.000.patch, HBASE-20306.000.patch, 
> HBASE-20306.001.patch, HBASE-20306.002.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.





[jira] [Commented] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-10-02 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635917#comment-16635917
 ] 

Andrew Purtell commented on HBASE-20306:


While updating the patch, I think the summary should also print the action mode 
indicator. See how the progress reporter prints one of R, W, or A. It comes 
from the task-impl-specific progressInfo() method, I think. The summary line 
should also print R, W, or A to make it easier to tell which action the 
summary refers to.

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306-branch-1.000.patch, HBASE-20306.000.patch, 
> HBASE-20306.001.patch, HBASE-20306.002.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.





[jira] [Updated] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-10-02 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20306:
---
Attachment: HBASE-20306-branch-1.000.patch

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306-branch-1.000.patch, HBASE-20306.000.patch, 
> HBASE-20306.001.patch, HBASE-20306.002.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.





[jira] [Commented] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-10-02 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635903#comment-16635903
 ] 

Andrew Purtell commented on HBASE-20306:


Patch is not quite ready. When I test with {{./bin/hbase ltt -read 100:10 
-write 10:100:10 -num_keys 100}} only one multithreadedaction progress 
reporter shuts down. The other continues to run indefinitely, holding up the 
process. I think the problem is here:
{code}
@@ -261,7 +285,7 @@ public abstract class MultiThreadedAction {
   }
 
   public void waitForFinish() {
-while (numThreadsWorking.get() != 0) {
+while (isProgressReporterRunning) {
   Threads.sleepWithoutInterrupt(1000);
 }
 close();
{code}

I also attached a branch-1 port. 
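
One way to avoid a wait that hangs on a reporter flag is sketched below, as an assumption about a possible arrangement and not the actual patch or the real MultiThreadedAction. In this hypothetical class the reporter's exit condition depends only on the worker count, and waitForFinish() waits on that same count and then joins the reporter, so the summary is written before close would run.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; not the real MultiThreadedAction. */
class WaitForFinishSketch {
  private final AtomicInteger numThreadsWorking = new AtomicInteger();
  private final AtomicInteger keysDone = new AtomicInteger();
  private volatile String summary = "";
  private Thread reporter;

  void start(int workers, int keysPerWorker) {
    numThreadsWorking.set(workers);
    for (int i = 0; i < workers; i++) {
      new Thread(() -> {
        try {
          for (int k = 0; k < keysPerWorker; k++) {
            keysDone.incrementAndGet(); // stand-in for one read or write
          }
        } finally {
          numThreadsWorking.decrementAndGet(); // counted down even if the work fails
        }
      }).start();
    }
    // The reporter exits as soon as no workers remain, then records the summary.
    reporter = new Thread(() -> {
      while (numThreadsWorking.get() != 0) {
        pause(10);
      }
      summary = "keys=" + keysDone.get();
    });
    reporter.setDaemon(true); // a stuck reporter cannot hold up JVM exit
    reporter.start();
  }

  void waitForFinish() {
    while (numThreadsWorking.get() != 0) {
      pause(10);
    }
    try {
      reporter.join(); // the summary has been written once this returns
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  String summary() {
    return summary;
  }

  private static void pause(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    WaitForFinishSketch action = new WaitForFinishSketch();
    action.start(4, 100);
    action.waitForFinish();
    System.out.println(action.summary());
  }
}
```

Marking the reporter as a daemon thread is a belt-and-braces choice: even if its loop condition were wrong, it could not keep the process alive on its own.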

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306.000.patch, HBASE-20306.001.patch, 
> HBASE-20306.002.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.





[jira] [Commented] (HBASE-20306) LoadTestTool does not print summary at end of run

2018-10-02 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635823#comment-16635823
 ] 

Andrew Purtell commented on HBASE-20306:


Sorry for dropping this. I think the current patch looks good. There are two 
line length checkstyle warnings but we don't need to do another round of 
review, I can fix those at commit. Let me see about getting this in today.

> LoadTestTool does not print summary at end of run
> -
>
> Key: HBASE-20306
> URL: https://issues.apache.org/jira/browse/HBASE-20306
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Reporter: Mike Drob
>Assignee: Colin Garcia
>Priority: Major
>  Labels: beginner
> Attachments: HBASE-20306.000.patch, HBASE-20306.001.patch, 
> HBASE-20306.002.patch
>
>
> ltt currently prints status as it goes, but doesn't give a nice summary of 
> what happened so users have to infer it from the last status line printed.
> Would be nice to print a real summary with statistics about what was run.





[jira] [Updated] (HBASE-21105) TestHBaseFsck failing in branch-1, branch-1.4, branch-1.3 with NPE

2018-10-02 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21105:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> TestHBaseFsck failing in branch-1, branch-1.4, branch-1.3 with NPE
> --
>
> Key: HBASE-21105
> URL: https://issues.apache.org/jira/browse/HBASE-21105
> Project: HBase
>  Issue Type: Bug
>  Components: hbck, test
>Affects Versions: 1.5.0, 1.3.3, 1.4.7
>Reporter: Sean Busbey
>Assignee: Vishal Khandelwal
>Priority: Major
> Attachments: HBASE-21105.branch-1.v1.patch
>
>
> TestHBaseFsck in the mentioned branches has two tests that rely on 
> TestEndToEndSplitTransaction for blocking in the same way TestTableResource 
> used to before HBASE-21076.
> Both tests appear to specifically be testing that something happens after a 
> split, so we'll need a solution that removes the cross-test dependency but 
> still allows for "wait until this split has finished".
> Example failures from branch-1:
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.util.TestHBaseFsck.testSplitDaughtersNotInMeta(TestHBaseFsck.java:1985)
> {code}
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.util.TestHBaseFsck.testValidLingeringSplitParent(TestHBaseFsck.java:1934)
> {code}





[jira] [Commented] (HBASE-20940) HStore.cansplit should not allow split to happen if it has references

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635700#comment-16635700
 ] 

Hudson commented on HBASE-20940:


FAILURE: Integrated in Jenkins build Phoenix-omid2 #111 (See 
[https://builds.apache.org/job/Phoenix-omid2/111/])
After HBASE-20940 any local index query will open all HFiles of every (larsh: 
rev ed4366063b983270767757d12daf3a8f4b126897)
* (edit) 
phoenix-core/src/main/java/org/apache/phoenix/iterate/RegionScannerFactory.java


> HStore.cansplit should not allow split to happen if it has references
> -
>
> Key: HBASE-20940
> URL: https://issues.apache.org/jira/browse/HBASE-20940
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vishal Khandelwal
>Assignee: Vishal Khandelwal
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 2.1.1, 1.4.7, 2.0.3
>
> Attachments: HBASE-20940-branch-1-addendum.patch, 
> HBASE-20940.branch-1.3.v1.patch, HBASE-20940.branch-1.3.v2.patch, 
> HBASE-20940.branch-1.v1.patch, HBASE-20940.branch-1.v2.patch, 
> HBASE-20940.branch-1.v3.patch, HBASE-20940.branch-1.v5.patch, 
> HBASE-20940.v1.patch, HBASE-20940.v2.patch, HBASE-20940.v3.patch, 
> HBASE-20940.v4.patch, result_HBASE-20940.branch-1.v2.log
>
>
> When a split happens and another split follows immediately, it may result in 
> a split of a region that still has references to its parent. More details 
> about the scenario can be found in HBASE-20933.
> HStore.hasReferences should check from fs.storefile rather than from in-memory 
> objects.





[jira] [Commented] (HBASE-21200) Memstore flush doesn't finish because of seekToPreviousRow() in memstore scanner.

2018-10-02 Thread Toshihiro Suzuki (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635661#comment-16635661
 ] 

Toshihiro Suzuki commented on HBASE-21200:
--

I just attached the v1 patch. I think in SegmentScanner#seekToPreviousRow(), we can 
skip seeking when the sequenceId of the cells whose row is equal to 
firstKeyOnPreviousRow is greater than readPoint. I added this optimization to 
the patch.
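
As an illustration of the idea only (this is not the attached patch, and it does not use HBase classes): cells whose sequenceId is greater than the scanner's readPoint are not visible to that scanner, so a reverse scan gains nothing by seeking into a row that holds only such cells. The sketch below models a memstore as a hypothetical sorted map from row key to the sequenceIds of its cells; ReverseSeekSketch and previousVisibleRow are made-up names.

```java
import java.util.List;
import java.util.TreeMap;

public class ReverseSeekSketch {
  /**
   * Returns the closest row before currentRow that has at least one cell
   * visible at readPoint (sequenceId <= readPoint), or null if there is none.
   * Rows whose cells are all newer than readPoint are stepped over without
   * any per-row seek.
   */
  public static String previousVisibleRow(
      TreeMap<String, List<Long>> rows, String currentRow, long readPoint) {
    String row = rows.lowerKey(currentRow);
    while (row != null) {
      boolean visible = false;
      for (long sequenceId : rows.get(row)) {
        if (sequenceId <= readPoint) {
          visible = true;
          break;
        }
      }
      if (visible) {
        return row;
      }
      row = rows.lowerKey(row); // skip: nothing readable in this row
    }
    return null;
  }
}
```

For example, with rows a, b, c, d where only a and d contain a cell at or below the read point, a call starting from d passes over c and b and returns a.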

> Memstore flush doesn't finish because of seekToPreviousRow() in memstore 
> scanner.
> -
>
> Key: HBASE-21200
> URL: https://issues.apache.org/jira/browse/HBASE-21200
> Project: HBase
>  Issue Type: Bug
>  Components: Scanners
>Reporter: dongjin2193.jeon
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-21200-UT.patch, HBASE-21200.master.001.patch, 
> RegionServerJstack.log
>
>
> The issue of delayed memstore flush still occurs after backporting HBASE-15871.
> A reverse scan takes a long time to seek to the previous row when the memstore is full of 
> deleted cells.
>  
> jstack :
> "MemStoreFlusher.0" #114 prio=5 os_prio=0 tid=0x7fa3d0729000 nid=0x486a 
> waiting on condition [0x7fa3b9b6b000]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0xa465fe60> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> org.apache.hadoop.hbase.regionserver.*StoreScanner.updateReaders(StoreScanner.java:695)*
>     at 
> org.apache.hadoop.hbase.regionserver.HStore.notifyChangedReadersObservers(HStore.java:1127)
>     at 
> org.apache.hadoop.hbase.regionserver.HStore.updateStorefiles(HStore.java:1106)
>     at 
> org.apache.hadoop.hbase.regionserver.HStore.access$600(HStore.java:130)
>     at 
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.commit(HStore.java:2455)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2519)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2256)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2218)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2110)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:2036)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:501)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
>     at java.lang.Thread.run(Thread.java:748)
>  
> "RpcServer.FifoWFPBQ.default.handler=27,queue=0,port=16020" #65 daemon prio=5 
> os_prio=0 tid=0x7fa3e628 nid=0x4801 runnable [0x7fa3bd29a000]
>    java.lang.Thread.State: RUNNABLE
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.getNext(DefaultMemStore.java:780)
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.seekInSubLists(DefaultMemStore.java:826)
>     - locked <0xb45aa5b8> (a 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner)
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.seek(DefaultMemStore.java:818)
>     - locked <0xb45aa5b8> (a 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner)
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.seekToPreviousRow(DefaultMemStore.java:1000)
>     - locked <0xb45aa5b8> (a 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner)
>     at 
> org.apache.hadoop.hbase.regionserver.ReversedKeyValueHeap.next(ReversedKeyValueHeap.java:136)
>     at 
> org.apache.hadoop.hbase.regionserver.*StoreScanner.next(StoreScanner.java:629)*
>     at 
> 

[jira] [Updated] (HBASE-21200) Memstore flush doesn't finish because of seekToPreviousRow() in memstore scanner.

2018-10-02 Thread Toshihiro Suzuki (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toshihiro Suzuki updated HBASE-21200:
-
Assignee: Toshihiro Suzuki
  Status: Patch Available  (was: Open)

> Memstore flush doesn't finish because of seekToPreviousRow() in memstore 
> scanner.
> -
>
> Key: HBASE-21200
> URL: https://issues.apache.org/jira/browse/HBASE-21200
> Project: HBase
>  Issue Type: Bug
>  Components: Scanners
>Reporter: dongjin2193.jeon
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HBASE-21200-UT.patch, HBASE-21200.master.001.patch, 
> RegionServerJstack.log
>
>
> The issue of delayed memstore flush still occurs after backporting HBASE-15871.
> A reverse scan takes a long time to seek to the previous row when the memstore is full of 
> deleted cells.
>  
> jstack :
> "MemStoreFlusher.0" #114 prio=5 os_prio=0 tid=0x7fa3d0729000 nid=0x486a 
> waiting on condition [0x7fa3b9b6b000]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0xa465fe60> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> org.apache.hadoop.hbase.regionserver.*StoreScanner.updateReaders(StoreScanner.java:695)*
>     at 
> org.apache.hadoop.hbase.regionserver.HStore.notifyChangedReadersObservers(HStore.java:1127)
>     at 
> org.apache.hadoop.hbase.regionserver.HStore.updateStorefiles(HStore.java:1106)
>     at 
> org.apache.hadoop.hbase.regionserver.HStore.access$600(HStore.java:130)
>     at 
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.commit(HStore.java:2455)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2519)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2256)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2218)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2110)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:2036)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:501)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
>     at java.lang.Thread.run(Thread.java:748)
>  
> "RpcServer.FifoWFPBQ.default.handler=27,queue=0,port=16020" #65 daemon prio=5 
> os_prio=0 tid=0x7fa3e628 nid=0x4801 runnable [0x7fa3bd29a000]
>    java.lang.Thread.State: RUNNABLE
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.getNext(DefaultMemStore.java:780)
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.seekInSubLists(DefaultMemStore.java:826)
>     - locked <0xb45aa5b8> (a 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner)
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.seek(DefaultMemStore.java:818)
>     - locked <0xb45aa5b8> (a 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner)
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.seekToPreviousRow(DefaultMemStore.java:1000)
>     - locked <0xb45aa5b8> (a 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner)
>     at 
> org.apache.hadoop.hbase.regionserver.ReversedKeyValueHeap.next(ReversedKeyValueHeap.java:136)
>     at 
> org.apache.hadoop.hbase.regionserver.*StoreScanner.next(StoreScanner.java:629)*
>     at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5876)
>     at 
> 

[jira] [Updated] (HBASE-21200) Memstore flush doesn't finish because of seekToPreviousRow() in memstore scanner.

2018-10-02 Thread Toshihiro Suzuki (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toshihiro Suzuki updated HBASE-21200:
-
Attachment: HBASE-21200.master.001.patch

> Memstore flush doesn't finish because of seekToPreviousRow() in memstore 
> scanner.
> -
>
> Key: HBASE-21200
> URL: https://issues.apache.org/jira/browse/HBASE-21200
> Project: HBase
>  Issue Type: Bug
>  Components: Scanners
>Reporter: dongjin2193.jeon
>Priority: Major
> Attachments: HBASE-21200-UT.patch, HBASE-21200.master.001.patch, 
> RegionServerJstack.log
>
>
> The issue of delayed memstore flush still occurs after backporting HBASE-15871.
> A reverse scan takes a long time to seek to the previous row when the memstore is full of 
> deleted cells.
>  
> jstack :
> "MemStoreFlusher.0" #114 prio=5 os_prio=0 tid=0x7fa3d0729000 nid=0x486a 
> waiting on condition [0x7fa3b9b6b000]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0xa465fe60> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> org.apache.hadoop.hbase.regionserver.*StoreScanner.updateReaders(StoreScanner.java:695)*
>     at 
> org.apache.hadoop.hbase.regionserver.HStore.notifyChangedReadersObservers(HStore.java:1127)
>     at 
> org.apache.hadoop.hbase.regionserver.HStore.updateStorefiles(HStore.java:1106)
>     at 
> org.apache.hadoop.hbase.regionserver.HStore.access$600(HStore.java:130)
>     at 
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.commit(HStore.java:2455)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2519)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2256)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2218)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2110)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:2036)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:501)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
>     at java.lang.Thread.run(Thread.java:748)
>  
> "RpcServer.FifoWFPBQ.default.handler=27,queue=0,port=16020" #65 daemon prio=5 
> os_prio=0 tid=0x7fa3e628 nid=0x4801 runnable [0x7fa3bd29a000]
>    java.lang.Thread.State: RUNNABLE
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.getNext(DefaultMemStore.java:780)
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.seekInSubLists(DefaultMemStore.java:826)
>     - locked <0xb45aa5b8> (a 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner)
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.seek(DefaultMemStore.java:818)
>     - locked <0xb45aa5b8> (a 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner)
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.seekToPreviousRow(DefaultMemStore.java:1000)
>     - locked <0xb45aa5b8> (a 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner)
>     at 
> org.apache.hadoop.hbase.regionserver.ReversedKeyValueHeap.next(ReversedKeyValueHeap.java:136)
>     at 
> org.apache.hadoop.hbase.regionserver.*StoreScanner.next(StoreScanner.java:629)*
>     at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5876)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6027)
>     at 
> 

[jira] [Commented] (HBASE-21185) WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes

2018-10-02 Thread Wellington Chevreuil (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635636#comment-16635636
 ] 

Wellington Chevreuil commented on HBASE-21185:
--

Submitted a new patch fixing checkstyle issues from previous one.

> WALPrettyPrinter: Additional useful info to be printed by wal printer tool, 
> for debugability purposes
> -
>
> Key: HBASE-21185
> URL: https://issues.apache.org/jira/browse/HBASE-21185
> Project: HBase
>  Issue Type: Improvement
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Trivial
> Attachments: HBASE-21185.master.001.patch, 
> HBASE-21185.master.002.patch, HBASE-21185.master.003.patch
>
>
> *WALPrettyPrinter* is very useful for troubleshooting WAL issues, such as 
> faulty replication sinks. One useful piece of information to track is 
> the size of a single WAL entry edit, as well as the size of each edit cell. I am 
> proposing a patch that adds calculations for these two, as well as an option to 
> seek straight to a given position in the WAL file being analysed.
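
The two calculations described in the issue can be pictured with a small sketch. This is an assumption-laden illustration, not the patch and not the WALPrettyPrinter API: WalEditSizeSketch and its nested Cell class are hypothetical stand-ins, and the size counted here is only the sum of the row, family, qualifier, and value bytes. A real HBase cell also carries a timestamp, type, and length fields, which this sketch ignores.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

public class WalEditSizeSketch {
  /** Hypothetical stand-in for a WAL cell: only the byte arrays counted below. */
  public static final class Cell {
    private final byte[] row;
    private final byte[] family;
    private final byte[] qualifier;
    private final byte[] value;

    public Cell(String row, String family, String qualifier, String value) {
      this.row = row.getBytes(StandardCharsets.UTF_8);
      this.family = family.getBytes(StandardCharsets.UTF_8);
      this.qualifier = qualifier.getBytes(StandardCharsets.UTF_8);
      this.value = value.getBytes(StandardCharsets.UTF_8);
    }

    /** Size of this single cell, in bytes, under the simplification above. */
    public long size() {
      return row.length + family.length + qualifier.length + value.length;
    }
  }

  /** Size of one edit: the sum of the sizes of its cells. */
  public static long editSize(List<Cell> cells) {
    long total = 0;
    for (Cell cell : cells) {
      total += cell.size();
    }
    return total;
  }
}
```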





[jira] [Updated] (HBASE-21185) WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes

2018-10-02 Thread Wellington Chevreuil (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-21185:
-
Attachment: HBASE-21185.master.003.patch

> WALPrettyPrinter: Additional useful info to be printed by wal printer tool, 
> for debugability purposes
> -
>
> Key: HBASE-21185
> URL: https://issues.apache.org/jira/browse/HBASE-21185
> Project: HBase
>  Issue Type: Improvement
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Trivial
> Attachments: HBASE-21185.master.001.patch, 
> HBASE-21185.master.002.patch, HBASE-21185.master.003.patch
>
>
> *WALPrettyPrinter* is very useful for troubleshooting WAL issues, such as 
> faulty replication sinks. One useful piece of information to track is 
> the size of a single WAL entry edit, as well as the size of each edit cell. I am 
> proposing a patch that adds calculations for these two, as well as an option to 
> seek straight to a given position in the WAL file being analysed.





[jira] [Commented] (HBASE-21231) Add documentation for MajorCompactor

2018-10-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635504#comment-16635504
 ] 

Hadoop QA commented on HBASE-21231:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue}  0m  1s{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 23s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  5m 35s{color} | {color:blue} branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m  1s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  5m  5s{color} | {color:blue} patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 52s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21231 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12942108/HBASE-21231.master.002.patch |
| Optional Tests |  dupname  asflicense  shellcheck  shelldocs  refguide  |
| uname | Linux 2b91897703a4 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 42aa3dd463 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| shellcheck | v0.4.4 |
| refguide | https://builds.apache.org/job/PreCommit-HBASE-Build/14562/artifact/patchprocess/branch-site/book.html |
| refguide | https://builds.apache.org/job/PreCommit-HBASE-Build/14562/artifact/patchprocess/patch-site/book.html |
| Max. process+thread count | 87 (vs. ulimit of 1) |
| modules | C: . U: . |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14562/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Add documentation for MajorCompactor
> 
>
> Key: HBASE-21231
> URL: https://issues.apache.org/jira/browse/HBASE-21231
> Project: HBase
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 3.0.0
>Reporter: Balazs Meszaros
>Assignee: Balazs Meszaros
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21231.master.001.patch, 
> HBASE-21231.master.002.patch
>
>
> HBASE-19528 added a new MajorCompactor tool, but it lacks documentation. 
> Let's document it.





[jira] [Commented] (HBASE-21261) Add log4j.properties for hbase-rsgroup tests

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635465#comment-16635465
 ] 

Hudson commented on HBASE-21261:


Results for branch master
[build #522 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/522/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/522//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/522//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/522//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Add log4j.properties for hbase-rsgroup tests
> 
>
> Key: HBASE-21261
> URL: https://issues.apache.org/jira/browse/HBASE-21261
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
>
> When I tried to debug TestRSGroups, at first I couldn't find any DEBUG log.
> It turns out that under hbase-rsgroup/src/test/resources there is no 
> log4j.properties.
> This issue adds log4j.properties for hbase-rsgroup tests.
> This would be useful when finding root cause for hbase-rsgroup test 
> failure(s).





[jira] [Commented] (HBASE-21258) Add resetting of flags for RS Group pre/post hooks in TestRSGroups

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635463#comment-16635463
 ] 

Hudson commented on HBASE-21258:


Results for branch master
[build #522 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/522/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/522//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/522//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/522//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Add resetting of flags for RS Group pre/post hooks in TestRSGroups
> --
>
> Key: HBASE-21258
> URL: https://issues.apache.org/jira/browse/HBASE-21258
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8
>
> Attachments: 21258.branch-1.04.txt, 21258.branch-1.05.txt, 
> 21258.branch-2.v1.patch, 21258.v1.txt
>
>
> Over HBASE-20627, [~xucang] reminded me that the resetting of flags for RS 
> Group pre/post hooks in TestRSGroups was absent.
> This issue is to add the resetting of these flags before each subtest starts.





[jira] [Commented] (HBASE-18549) Unclaimed replication queues can go undetected

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635466#comment-16635466
 ] 

Hudson commented on HBASE-18549:


Results for branch master
[build #522 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/522/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/522//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/522//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/522//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Unclaimed replication queues can go undetected
> --
>
> Key: HBASE-18549
> URL: https://issues.apache.org/jira/browse/HBASE-18549
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Ashu Pachauri
>Assignee: Xu Cang
>Priority: Critical
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1
>
> Attachments: HBASE-18549-.master.001.patch, 
> HBASE-18549-.master.002.patch, HBASE-18549-.master.003.patch, 
> HBASE-18549-.master.004.patch, HBASE-18549.branch-1.001.patch, 
> HBASE-18549.branch-1.001.patch
>
>
> We have come across this situation multiple times, where a ZooKeeper issue 
> can cause NodeFailoverWorker to silently fail to pick up the replication queue 
> for a dead region server. One example is when the znode size for a particular 
> queue exceeds the jute.maxBuffer value.
> There can be other situations that lead to this and go undetected. 
> We need a metric for the number of unclaimed replication queues. This 
> will help mitigate the problem through alerting on the metric and 
> identifying underlying issues.





[jira] [Commented] (HBASE-19275) TestSnapshotFileCache never worked properly

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635464#comment-16635464
 ] 

Hudson commented on HBASE-19275:


Results for branch master
[build #522 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/522/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/522//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/522//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/522//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> TestSnapshotFileCache never worked properly
> ---
>
> Key: HBASE-19275
> URL: https://issues.apache.org/jira/browse/HBASE-19275
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 3.0.0, 1.4.0, 1.5.0, 2.0.0
> Reporter: Andrew Purtell
> Assignee: Xu Cang
> Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-19275-branch-1.patch, 
> HBASE-19275-master.001.patch, HBASE-19275-master.001.patch
>
>
> Error-prone noticed we were asking Iterables.contains() questions with the 
> wrong type in TestSnapshotFileCache. I've attached a fixed version of the 
> test. The results suggest the cache is not evicting entries properly. 
> {noformat}
> java.lang.AssertionError: Cache found 'hdfs://localhost:52867/user/apurtell/test-data/8ce04c85-ce4b-4844-b454-5303482ade95/data/default/snapshot1/9e49edd0ab41657fb0c6ebb4d9dfad15/cf/f132e5b06f66443f8003363ed1535aac', but it shouldn't have.
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.createAndTestSnapshot(TestSnapshotFileCache.java:260)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.createAndTestSnapshotV1(TestSnapshotFileCache.java:206)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.testReloadModifiedDirectory(TestSnapshotFileCache.java:102)
> {noformat}
> {noformat}
> java.lang.AssertionError: Cache found 'hdfs://localhost:52867/user/apurtell/test-data/8ce04c85-ce4b-4844-b454-5303482ade95/data/default/snapshot1a/2e81adb9212c98cff970eafa006fc40b/cf/a2ec478d850e4e348359699c53b732c4', but it shouldn't have.
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.createAndTestSnapshot(TestSnapshotFileCache.java:260)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.createAndTestSnapshotV1(TestSnapshotFileCache.java:206)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.testLoadAndDelete(TestSnapshotFileCache.java:88)
> {noformat}
> These changes are part of HBASE-19239.
> I've disabled the offending test cases with @Ignore in that patch, but they 
> should be re-enabled and fixed. 
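
The wrong-type `contains()` pitfall described above can be reproduced with the standard library alone. The sketch below is illustrative and is not the test code from the patch: it uses `java.nio.file.Path` in place of Hadoop's `Path`, and the class name, method names, and file names are hypothetical. Because `contains()` accepts any `Object`, a `String` argument compiles but can never equal a `Path` element, so an assertion that the element is absent always passes.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/** Hypothetical demo of asking a collection a contains() question with the wrong type. */
final class WrongTypeContains {
  /** Buggy check: a String is never equal to a Path, so this always returns false. */
  static boolean containsByString(List<Path> files, String name) {
    return files.contains(name); // compiles, because contains() takes Object
  }

  /** Fixed check: converts the argument to the element type before asking. */
  static boolean containsByPath(List<Path> files, String name) {
    return files.contains(Paths.get(name));
  }

  public static void main(String[] args) {
    List<Path> files = Arrays.asList(Paths.get("cf/hfile-a"), Paths.get("cf/hfile-b"));
    System.out.println(containsByString(files, "cf/hfile-a")); // false
    System.out.println(containsByPath(files, "cf/hfile-a"));   // true
  }
}
```

A test written as `assertFalse(containsByString(...))` would pass no matter what the cache holds, which is consistent with the report that the original test never exercised eviction.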





[jira] [Updated] (HBASE-21231) Add documentation for MajorCompactor

2018-10-02 Thread Balazs Meszaros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balazs Meszaros updated HBASE-21231:

Status: Patch Available  (was: Open)

> Add documentation for MajorCompactor
> 
>
> Key: HBASE-21231
> URL: https://issues.apache.org/jira/browse/HBASE-21231
> Project: HBase
> Issue Type: Task
> Components: documentation
> Affects Versions: 3.0.0
> Reporter: Balazs Meszaros
> Assignee: Balazs Meszaros
> Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21231.master.001.patch, 
> HBASE-21231.master.002.patch
>
>
> HBASE-19528 added a new MajorCompactor tool, but it lacks documentation. 
> Let's document it.





[jira] [Updated] (HBASE-21231) Add documentation for MajorCompactor

2018-10-02 Thread Balazs Meszaros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balazs Meszaros updated HBASE-21231:

Attachment: HBASE-21231.master.002.patch






[jira] [Commented] (HBASE-18549) Unclaimed replication queues can go undetected

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635365#comment-16635365
 ] 

Hudson commented on HBASE-18549:


Results for branch branch-2.1
[build #407 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/407/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/407//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/407//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/407//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}







[jira] [Commented] (HBASE-19275) TestSnapshotFileCache never worked properly

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635363#comment-16635363
 ] 

Hudson commented on HBASE-19275:


Results for branch branch-2.1
[build #407 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/407/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/407//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/407//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/407//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}







[jira] [Commented] (HBASE-21261) Add log4j.properties for hbase-rsgroup tests

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635364#comment-16635364
 ] 

Hudson commented on HBASE-21261:


Results for branch branch-2.1
[build #407 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/407/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/407//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/407//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/407//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Add log4j.properties for hbase-rsgroup tests
> 
>
> Key: HBASE-21261
> URL: https://issues.apache.org/jira/browse/HBASE-21261
> Project: HBase
> Issue Type: Test
> Reporter: Ted Yu
> Assignee: Andrew Purtell
> Priority: Trivial
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
>
> When I tried to debug TestRSGroups, at first I couldn't find any DEBUG log.
> It turns out that there is no log4j.properties under 
> hbase-rsgroup/src/test/resources.
> This issue adds log4j.properties for hbase-rsgroup tests.
> This will be useful when finding the root cause of hbase-rsgroup test 
> failure(s).





[jira] [Commented] (HBASE-19275) TestSnapshotFileCache never worked properly

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635334#comment-16635334
 ] 

Hudson commented on HBASE-19275:


Results for branch branch-2
[build #1331 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1331/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1331//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1331//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1331//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}







[jira] [Commented] (HBASE-18549) Unclaimed replication queues can go undetected

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635336#comment-16635336
 ] 

Hudson commented on HBASE-18549:


Results for branch branch-2
[build #1331 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1331/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1331//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1331//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1331//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}







[jira] [Commented] (HBASE-21261) Add log4j.properties for hbase-rsgroup tests

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635335#comment-16635335
 ] 

Hudson commented on HBASE-21261:


Results for branch branch-2
[build #1331 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1331/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1331//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1331//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1331//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}







[jira] [Commented] (HBASE-19275) TestSnapshotFileCache never worked properly

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635319#comment-16635319
 ] 

Hudson commented on HBASE-19275:


Results for branch branch-1.4
[build #489 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.







[jira] [Commented] (HBASE-21261) Add log4j.properties for hbase-rsgroup tests

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635320#comment-16635320
 ] 

Hudson commented on HBASE-21261:


Results for branch branch-1.4
[build #489 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.







[jira] [Commented] (HBASE-21117) Backport HBASE-18350 (fix RSGroups) to branch-1 (Only port the part fixing table locking issue.)

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635317#comment-16635317
 ] 

Hudson commented on HBASE-21117:


Results for branch branch-1.4
[build #489 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Backport HBASE-18350  (fix RSGroups)  to branch-1 (Only port the part fixing 
> table locking issue.)
> --
>
> Key: HBASE-21117
> URL: https://issues.apache.org/jira/browse/HBASE-21117
> Project: HBase
> Issue Type: Bug
> Components: backport, rsgroup, shell
> Affects Versions: 1.3.2
> Reporter: Xu Cang
> Assignee: Xu Cang
> Priority: Major
> Labels: backport
> Fix For: 1.5.0, 1.4.8
>
> Attachments: HBASE-21117-branch-1.001.patch, 
> HBASE-21117-branch-1.002.patch
>
>
> While working on HBASE-20666, I found that HBASE-18350 was never ported to 
> branch-1, which sometimes causes a procedure to hang when #moveTables is 
> called. After looking into the HBASE-18350 patch, it seems important since it 
> fixes 4 issues. This Jira is an attempt to backport it to branch-1.
>  
>  
> Edited: Aug 26.
> After reviewing the HBASE-18350 patch, I decided to port only part 2 of the 
> patch, because parts 1 and 3 are AMv2-related. I won't touch them since AMv2 
> is only for branch-2.
>  
> {quote} 
> Subject: [PATCH] HBASE-18350 RSGroups are broken under AMv2
> - Table moving to RSG was buggy, because it left the table unassigned.
>   Now it is fixed we immediately assign to an appropriate RS
>   (MoveRegionProcedure).
> *- Table was locked while moving, but unassign operation hung, because*
>   *locked table queues are not scheduled while locked. Fixed.    port 
> this one.*
> - ProcedureSyncWait was buggy, because it searched the procId in
>   executor, but executor does not store the return values of internal
>   operations (they are stored, but immediately removed by the cleaner).
> - list_rsgroups in the shell show also the assigned tables and servers.
> {quote}





[jira] [Commented] (HBASE-21258) Add resetting of flags for RS Group pre/post hooks in TestRSGroups

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635316#comment-16635316
 ] 

Hudson commented on HBASE-21258:


Results for branch branch-1.4
[build #489 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Add resetting of flags for RS Group pre/post hooks in TestRSGroups
> --
>
> Key: HBASE-21258
> URL: https://issues.apache.org/jira/browse/HBASE-21258
> Project: HBase
> Issue Type: Test
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8
>
> Attachments: 21258.branch-1.04.txt, 21258.branch-1.05.txt, 
> 21258.branch-2.v1.patch, 21258.v1.txt
>
>
> In HBASE-20627, [~xucang] reminded me that TestRSGroups does not reset the 
> flags for the RS Group pre/post hooks.
> This issue adds resetting of these flags before each subtest starts.
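
As a rough illustration of why the flags need resetting, consider the sketch below. It is not the TestRSGroups code: the class `HookFlags`, its field names, and `runMoveTables` are hypothetical stand-ins for an observer that records which hooks fired. In a JUnit test, the `reset()` call would typically live in a method annotated with `@Before`.

```java
/** Hypothetical stand-in for a test observer that records which hooks fired. */
final class HookFlags {
  boolean preMoveTablesCalled;
  boolean postMoveTablesCalled;

  /** Clears every flag so that one subtest cannot see values left by another. */
  void reset() {
    preMoveTablesCalled = false;
    postMoveTablesCalled = false;
  }

  /** Simulates a subtest action that triggers both hooks. */
  void runMoveTables() {
    preMoveTablesCalled = true;
    postMoveTablesCalled = true;
  }

  public static void main(String[] args) {
    HookFlags flags = new HookFlags();
    flags.runMoveTables(); // first subtest sets the flags
    flags.reset();         // without this, the next subtest would start with stale 'true' values
    System.out.println(flags.preMoveTablesCalled); // false
  }
}
```

Without the reset, a later subtest that asserts a hook was called could pass even if the hook never ran in that subtest.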





[jira] [Commented] (HBASE-20666) Unsuccessful table creation leaves entry in hbase:rsgroup table

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635318#comment-16635318
 ] 

Hudson commented on HBASE-20666:


Results for branch branch-1.4
[build #489 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Unsuccessful table creation leaves entry in hbase:rsgroup table
> ---
>
> Key: HBASE-20666
> URL: https://issues.apache.org/jira/browse/HBASE-20666
> Project: HBase
>  Issue Type: Bug
>Reporter: Biju Nair
>Assignee: Xu Cang
>Priority: Minor
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-20666.master.001.patch, 
> HBASE-20666.master.002.patch, HBASE-20666.master.004.patch
>
>
> If table creation fails in a cluster with the {{rsgroup}} feature enabled, the 
> table is still listed as part of the {{default}} rsgroup.
> To recreate the scenario:
> - Create a namespace (NS) with a limit on the number of regions
> - Create a table in the NS that satisfies the region limit by pre-splitting
> - Create a new table in the NS, which will fail
> - {{list_rsgroup}} will show the table as part of the {{default}} rsgroup, and 
> data can be found in the {{hbase:rsgroup}} table
> It would be good to revert the entry when table creation fails, or to provide a script 
> to clean up the metadata.
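
The "revert the entry when table creation fails" idea might look something like the sketch below. It is only an illustration under stated assumptions: the map `TABLE_TO_GROUP` is an in-memory stand-in for the {{hbase:rsgroup}} entries, and `TableCreator` and `createTable` are hypothetical names, not HBase APIs or the contents of the attached patches.

```java
import java.util.HashMap;
import java.util.Map;

public class RsGroupCleanupSketch {
    // Hypothetical in-memory stand-in for the table-to-rsgroup entries.
    public static final Map<String, String> TABLE_TO_GROUP = new HashMap<>();

    // Hypothetical creation step that may fail, e.g. on a namespace region limit.
    public interface TableCreator {
        void create(String table) throws Exception;
    }

    // Registers the table in the default group, then removes the entry again
    // if creation throws, so no stale mapping is left behind.
    public static boolean createTable(String table, TableCreator creator) {
        TABLE_TO_GROUP.put(table, "default");
        try {
            creator.create(table);
            return true;
        } catch (Exception e) {
            TABLE_TO_GROUP.remove(table);
            return false;
        }
    }

    public static void main(String[] args) {
        boolean created = createTable("ns:t2", t -> {
            throw new Exception("region limit exceeded");
        });
        // prints "false false": creation failed and the entry was reverted
        System.out.println(created + " " + TABLE_TO_GROUP.containsKey("ns:t2"));
    }
}
```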





[jira] [Commented] (HBASE-18549) Unclaimed replication queues can go undetected

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635321#comment-16635321
 ] 

Hudson commented on HBASE-18549:


Results for branch branch-1.4
[build #489 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/489//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Unclaimed replication queues can go undetected
> --
>
> Key: HBASE-18549
> URL: https://issues.apache.org/jira/browse/HBASE-18549
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Ashu Pachauri
>Assignee: Xu Cang
>Priority: Critical
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1
>
> Attachments: HBASE-18549-.master.001.patch, 
> HBASE-18549-.master.002.patch, HBASE-18549-.master.003.patch, 
> HBASE-18549-.master.004.patch, HBASE-18549.branch-1.001.patch, 
> HBASE-18549.branch-1.001.patch
>
>
> We have come across this situation multiple times where a ZooKeeper issue 
> can cause NodeFailoverWorker to silently fail to pick up the replication queue for a dead 
> region server. One example is when the znode size for a particular 
> queue exceeds the jute.maxBuffer value.
> Other situations may lead to this and likewise go undetected. 
> We need a metric for the number of unclaimed replication queues. This 
> will help mitigate the problem by alerting on the metric and 
> identifying underlying issues.
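
One way to picture such a metric is a counter that is bumped whenever a claim attempt fails, so the failure is no longer silent. The sketch below is an assumption-laden illustration only: `UnclaimedQueueMetricSketch`, `QueueClaimer`, and `tryClaim` are hypothetical names and do not reflect NodeFailoverWorker's real interface or the metric added by the attached patches.

```java
import java.util.concurrent.atomic.AtomicLong;

public class UnclaimedQueueMetricSketch {
    // Counts queues whose claim attempt failed; an alert could watch this value.
    private final AtomicLong unclaimedQueues = new AtomicLong();

    // Hypothetical claim step that may throw, e.g. when a znode is too large to read.
    public interface QueueClaimer {
        void claim(String queueId) throws Exception;
    }

    // Attempts the claim and records a failure in the metric instead of dropping it.
    public boolean tryClaim(String queueId, QueueClaimer claimer) {
        try {
            claimer.claim(queueId);
            return true;
        } catch (Exception e) {
            unclaimedQueues.incrementAndGet();
            return false;
        }
    }

    public long getUnclaimedQueues() {
        return unclaimedQueues.get();
    }

    public static void main(String[] args) {
        UnclaimedQueueMetricSketch metrics = new UnclaimedQueueMetricSketch();
        metrics.tryClaim("queue-1", q -> { });
        metrics.tryClaim("queue-2", q -> {
            throw new Exception("znode larger than jute.maxBuffer");
        });
        System.out.println(metrics.getUnclaimedQueues()); // prints "1"
    }
}
```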





[jira] [Commented] (HBASE-20666) Unsuccessful table creation leaves entry in hbase:rsgroup table

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635290#comment-16635290
 ] 

Hudson commented on HBASE-20666:


Results for branch branch-1
[build #487 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Unsuccessful table creation leaves entry in hbase:rsgroup table
> ---
>
> Key: HBASE-20666
> URL: https://issues.apache.org/jira/browse/HBASE-20666
> Project: HBase
>  Issue Type: Bug
>Reporter: Biju Nair
>Assignee: Xu Cang
>Priority: Minor
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-20666.master.001.patch, 
> HBASE-20666.master.002.patch, HBASE-20666.master.004.patch
>
>
> If table creation fails in a cluster with the {{rsgroup}} feature enabled, the 
> table is still listed as part of the {{default}} rsgroup.
> To recreate the scenario:
> - Create a namespace (NS) with a limit on the number of regions
> - Create a table in the NS that satisfies the region limit by pre-splitting
> - Create a new table in the NS, which will fail
> - {{list_rsgroup}} will show the table as part of the {{default}} rsgroup, and 
> data can be found in the {{hbase:rsgroup}} table
> It would be good to revert the entry when table creation fails, or to provide a script 
> to clean up the metadata.





[jira] [Commented] (HBASE-19275) TestSnapshotFileCache never worked properly

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635291#comment-16635291
 ] 

Hudson commented on HBASE-19275:


Results for branch branch-1
[build #487 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> TestSnapshotFileCache never worked properly
> ---
>
> Key: HBASE-19275
> URL: https://issues.apache.org/jira/browse/HBASE-19275
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Andrew Purtell
>Assignee: Xu Cang
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-19275-branch-1.patch, 
> HBASE-19275-master.001.patch, HBASE-19275-master.001.patch
>
>
> Error-prone noticed we were asking Iterables.contains() questions with the 
> wrong type in TestSnapshotFileCache. I've attached a fixed version of the 
> test. The results suggest the cache is not evicting entries properly. 
> {noformat}
> java.lang.AssertionError: Cache found 'hdfs://localhost:52867/user/apurtell/test-data/8ce04c85-ce4b-4844-b454-5303482ade95/data/default/snapshot1/9e49edd0ab41657fb0c6ebb4d9dfad15/cf/f132e5b06f66443f8003363ed1535aac', but it shouldn't have.
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.createAndTestSnapshot(TestSnapshotFileCache.java:260)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.createAndTestSnapshotV1(TestSnapshotFileCache.java:206)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.testReloadModifiedDirectory(TestSnapshotFileCache.java:102)
> {noformat}
> {noformat}
> java.lang.AssertionError: Cache found 'hdfs://localhost:52867/user/apurtell/test-data/8ce04c85-ce4b-4844-b454-5303482ade95/data/default/snapshot1a/2e81adb9212c98cff970eafa006fc40b/cf/a2ec478d850e4e348359699c53b732c4', but it shouldn't have.
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.createAndTestSnapshot(TestSnapshotFileCache.java:260)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.createAndTestSnapshotV1(TestSnapshotFileCache.java:206)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.testLoadAndDelete(TestSnapshotFileCache.java:88)
> {noformat}
> These changes are part of HBASE-19239.
> I've disabled the offending test cases with @Ignore in that patch, but they 
> should be re-enabled and fixed. 
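
The wrong-type `contains` problem mentioned above can be illustrated with plain java.util collections. This is a hedged sketch, not the test's actual code: it uses `List.contains` rather than the `Iterables.contains()` call named in the description to stay dependency-free, and the class, method names, and file name are hypothetical. The point is that a membership check with a mismatched element type compiles but always returns false, so an `assertFalse` wrapped around it can never fail.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class ContainsWrongTypeSketch {
    // Hypothetical cache contents, stored as file names (Strings).
    static final List<String> CACHED_NAMES = Arrays.asList("f132e5b0", "a2ec478d");

    // contains(Object) accepts any argument, so passing a Path compiles,
    // but a Path is never equal to a String and the lookup always misses.
    public static boolean containsWithWrongType(Path file) {
        return CACHED_NAMES.contains(file);
    }

    // Converting to the element type first gives a meaningful answer.
    public static boolean containsWithRightType(Path file) {
        return CACHED_NAMES.contains(file.toString());
    }

    public static void main(String[] args) {
        Path file = Paths.get("f132e5b0");
        System.out.println(containsWithWrongType(file)); // prints "false"
        System.out.println(containsWithRightType(file)); // prints "true"
    }
}
```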





[jira] [Commented] (HBASE-18549) Unclaimed replication queues can go undetected

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635293#comment-16635293
 ] 

Hudson commented on HBASE-18549:


Results for branch branch-1
[build #487 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Unclaimed replication queues can go undetected
> --
>
> Key: HBASE-18549
> URL: https://issues.apache.org/jira/browse/HBASE-18549
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Ashu Pachauri
>Assignee: Xu Cang
>Priority: Critical
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1
>
> Attachments: HBASE-18549-.master.001.patch, 
> HBASE-18549-.master.002.patch, HBASE-18549-.master.003.patch, 
> HBASE-18549-.master.004.patch, HBASE-18549.branch-1.001.patch, 
> HBASE-18549.branch-1.001.patch
>
>
> We have come across this situation multiple times where a ZooKeeper issue 
> can cause NodeFailoverWorker to silently fail to pick up the replication queue for a dead 
> region server. One example is when the znode size for a particular 
> queue exceeds the jute.maxBuffer value.
> Other situations may lead to this and likewise go undetected. 
> We need a metric for the number of unclaimed replication queues. This 
> will help mitigate the problem by alerting on the metric and 
> identifying underlying issues.





[jira] [Commented] (HBASE-21117) Backport HBASE-18350 (fix RSGroups) to branch-1 (Only port the part fixing table locking issue.)

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635289#comment-16635289
 ] 

Hudson commented on HBASE-21117:


Results for branch branch-1
[build #487 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Backport HBASE-18350  (fix RSGroups)  to branch-1 (Only port the part fixing 
> table locking issue.)
> --
>
> Key: HBASE-21117
> URL: https://issues.apache.org/jira/browse/HBASE-21117
> Project: HBase
>  Issue Type: Bug
>  Components: backport, rsgroup, shell
>Affects Versions: 1.3.2
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
>  Labels: backport
> Fix For: 1.5.0, 1.4.8
>
> Attachments: HBASE-21117-branch-1.001.patch, 
> HBASE-21117-branch-1.002.patch
>
>
> While working on HBASE-20666, I found that HBASE-18350 was never ported to 
> branch-1, which sometimes causes a procedure to hang when #moveTables is called. 
> After looking into the HBASE-18350 patch, it seems important since it fixes 4 
> issues. This Jira is an attempt to backport it to branch-1.
>  
>  
> Edited: Aug26.
> After reviewing the HBASE-18350 patch, I decided to port only part 2 of the 
> patch, because part 1 and part 3 are AMv2 related. I won't touch them since AMv2 is only 
> for branch-2.
>  
> {quote} 
> Subject: [PATCH] HBASE-18350 RSGroups are broken under AMv2
> - Table moving to RSG was buggy, because it left the table unassigned.
>   Now it is fixed we immediately assign to an appropriate RS
>   (MoveRegionProcedure).
> *- Table was locked while moving, but unassign operation hung, because*
>   *locked table queues are not scheduled while locked. Fixed.    port 
> this one.*
> - ProcedureSyncWait was buggy, because it searched the procId in
>   executor, but executor does not store the return values of internal
>   operations (they are stored, but immediately removed by the cleaner).
> - list_rsgroups in the shell show also the assigned tables and servers.
> {quote}





[jira] [Commented] (HBASE-21261) Add log4j.properties for hbase-rsgroup tests

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635292#comment-16635292
 ] 

Hudson commented on HBASE-21261:


Results for branch branch-1
[build #487 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Add log4j.properties for hbase-rsgroup tests
> 
>
> Key: HBASE-21261
> URL: https://issues.apache.org/jira/browse/HBASE-21261
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
>
> When I tried to debug TestRSGroups, at first I couldn't find any DEBUG log.
> It turns out that there is no log4j.properties under 
> hbase-rsgroup/src/test/resources.
> This issue adds log4j.properties for hbase-rsgroup tests.
> This will be useful when finding the root cause of hbase-rsgroup test 
> failure(s).





[jira] [Commented] (HBASE-21258) Add resetting of flags for RS Group pre/post hooks in TestRSGroups

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635288#comment-16635288
 ] 

Hudson commented on HBASE-21258:


Results for branch branch-1
[build #487 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/487//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Add resetting of flags for RS Group pre/post hooks in TestRSGroups
> --
>
> Key: HBASE-21258
> URL: https://issues.apache.org/jira/browse/HBASE-21258
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8
>
> Attachments: 21258.branch-1.04.txt, 21258.branch-1.05.txt, 
> 21258.branch-2.v1.patch, 21258.v1.txt
>
>
> Over HBASE-20627, [~xucang] reminded me that the resetting of flags for RS 
> Group pre/post hooks in TestRSGroups was absent.
> This issue is to add the resetting of these flags before each subtest starts.





[jira] [Commented] (HBASE-21261) Add log4j.properties for hbase-rsgroup tests

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635255#comment-16635255
 ] 

Hudson commented on HBASE-21261:


Results for branch branch-2.0
[build #893 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/893/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/893//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/893//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/893//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Add log4j.properties for hbase-rsgroup tests
> 
>
> Key: HBASE-21261
> URL: https://issues.apache.org/jira/browse/HBASE-21261
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
>
> When I tried to debug TestRSGroups, at first I couldn't find any DEBUG log.
> It turns out that there is no log4j.properties under 
> hbase-rsgroup/src/test/resources.
> This issue adds log4j.properties for hbase-rsgroup tests.
> This will be useful when finding the root cause of hbase-rsgroup test 
> failure(s).





[jira] [Commented] (HBASE-19275) TestSnapshotFileCache never worked properly

2018-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635254#comment-16635254
 ] 

Hudson commented on HBASE-19275:


Results for branch branch-2.0
[build #893 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/893/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/893//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/893//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/893//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> TestSnapshotFileCache never worked properly
> ---
>
> Key: HBASE-19275
> URL: https://issues.apache.org/jira/browse/HBASE-19275
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Andrew Purtell
>Assignee: Xu Cang
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-19275-branch-1.patch, 
> HBASE-19275-master.001.patch, HBASE-19275-master.001.patch
>
>
> Error-prone noticed we were asking Iterables.contains() questions with the 
> wrong type in TestSnapshotFileCache. I've attached a fixed version of the 
> test. The results suggest the cache is not evicting entries properly. 
> {noformat}
> java.lang.AssertionError: Cache found 'hdfs://localhost:52867/user/apurtell/test-data/8ce04c85-ce4b-4844-b454-5303482ade95/data/default/snapshot1/9e49edd0ab41657fb0c6ebb4d9dfad15/cf/f132e5b06f66443f8003363ed1535aac', but it shouldn't have.
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.createAndTestSnapshot(TestSnapshotFileCache.java:260)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.createAndTestSnapshotV1(TestSnapshotFileCache.java:206)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.testReloadModifiedDirectory(TestSnapshotFileCache.java:102)
> {noformat}
> {noformat}
> java.lang.AssertionError: Cache found 'hdfs://localhost:52867/user/apurtell/test-data/8ce04c85-ce4b-4844-b454-5303482ade95/data/default/snapshot1a/2e81adb9212c98cff970eafa006fc40b/cf/a2ec478d850e4e348359699c53b732c4', but it shouldn't have.
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.createAndTestSnapshot(TestSnapshotFileCache.java:260)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.createAndTestSnapshotV1(TestSnapshotFileCache.java:206)
>   at org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache.testLoadAndDelete(TestSnapshotFileCache.java:88)
> {noformat}
> These changes are part of HBASE-19239.
> I've disabled the offending test cases with @Ignore in that patch, but they 
> should be re-enabled and fixed. 


