[jira] [Commented] (HBASE-14636) Clear HFileScannerImpl#prevBlocks in btw Compaction flow

2015-10-17 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962194#comment-14962194
 ] 

ramkrishna.s.vasudevan commented on HBASE-14636:


With the above approach I was able to avoid OOME even with a 2GB heap space 
while doing a compaction. Without this change a 2GB compaction was OOMEing 
without any other operations being performed. 
So I suggest we do the clear after every next() in the compaction flow.
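
To illustrate why clearing after every next() bounds memory, here is a minimal sketch in Python. It is not the HBase implementation: `Block`, `Scanner`, `clear_prev_blocks` and `compact` are hypothetical stand-ins for the ref-counted blocks and HFileScannerImpl#prevBlocks discussed in this issue. The assumption is that a scanner keeps every block it has moved past referenced until someone clears the list.

```python
class Block:
    """Hypothetical ref-counted block standing in for a cached HFile block."""

    def __init__(self, size):
        self.size = size
        self.released = False

    def release(self):
        self.released = True


class Scanner:
    """Hypothetical scanner that parks finished blocks in prev_blocks."""

    def __init__(self, blocks):
        self._blocks = list(blocks)
        self._pos = -1
        self.prev_blocks = []

    def next(self):
        # The block we move away from stays referenced until it is cleared.
        if self._pos >= 0:
            self.prev_blocks.append(self._blocks[self._pos])
        self._pos += 1
        return self._pos < len(self._blocks)

    def clear_prev_blocks(self):
        for block in self.prev_blocks:
            block.release()
        self.prev_blocks.clear()

    def retained_bytes(self):
        return sum(block.size for block in self.prev_blocks)


def compact(scanner, clear_each_step):
    """Walk the scanner once and return the peak bytes held in prev_blocks."""
    peak = 0
    while scanner.next():
        peak = max(peak, scanner.retained_bytes())
        if clear_each_step:
            scanner.clear_prev_blocks()
    scanner.clear_prev_blocks()  # final cleanup, as on scanner close
    return peak
```

In this toy model the retained size grows with the number of blocks read when nothing is cleared, and stays at one block when the clear happens on every step.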

> Clear HFileScannerImpl#prevBlocks in btw Compaction flow
> 
>
> Key: HBASE-14636
> URL: https://issues.apache.org/jira/browse/HBASE-14636
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14636) Clear HFileScannerImpl#prevBlocks in btw Compaction flow

2015-10-17 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962183#comment-14962183
 ] 

ramkrishna.s.vasudevan commented on HBASE-14636:


bq.Is btw By-The-Way?
Here it is Between.
bq.The YCSB suite worked fine when all was onheap with 16G. If I offheap, even 
though I leave heap at 16, it OOMEs pretty quickly.
I really doubt this. Onheap or offheap we should have the same implications 
except that we don't do the final ref counting.  In the logs that you 
attached I can see frequent compactions, with every compaction creating 
11 to 13 GB files. 


> Clear HFileScannerImpl#prevBlocks in btw Compaction flow
> 
>
> Key: HBASE-14636
> URL: https://issues.apache.org/jira/browse/HBASE-14636
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 2.0.0
>
>






[jira] [Commented] (HBASE-14420) Zombie Stomping Session

2015-10-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962163#comment-14962163
 ] 

Hadoop QA commented on HBASE-14420:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12767232/none_fix.txt
  against master branch at commit 71b38d60bbd26ff80ac23c53149d8da85976f39b.
  ATTACHMENT ID: 12767232

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation, build,
or dev-support patch that doesn't require tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16080//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16080//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16080//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16080//console

This message is automatically generated.

> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: hangers.txt, none_fix (1).txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt
>
>
> Patch builds are now failing most of the time because we are dropping zombies. 
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of this one. Am running builds back-to-back 
> on little cluster to turn out the monsters.





[jira] [Commented] (HBASE-14628) Save object creation for scanning with block encodings

2015-10-17 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962162#comment-14962162
 ] 

Lars Hofhansl commented on HBASE-14628:
---

Yep, only useful in 0.98. Worth committing? 1.0 and later do a shallow copy of 
the seeker state... I could try to backport that, but 0.98 might not be ready 
for this in other aspects.

> Save object creation for scanning with block encodings
> --
>
> Key: HBASE-14628
> URL: https://issues.apache.org/jira/browse/HBASE-14628
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 14628-0.98.txt
>
>
> I noticed that (at least in 0.98 - master is entirely different) we create 
> ByteBuffer just to create a byte[], which is then used to create a KeyValue.
> We can save the creation of the ByteBuffer and hence save allocating an extra 
> object for each KV we find by creating the byte[] directly.
> In a Phoenix count\(*) query that saved from 10% of runtime.





[jira] [Commented] (HBASE-14637) Loosen TestChoreService assert AND have TestDataBlockEncoders do less work (and add timeouts)

2015-10-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962159#comment-14962159
 ] 

Hudson commented on HBASE-14637:


SUCCESS: Integrated in HBase-TRUNK #6924 (See 
[https://builds.apache.org/job/HBase-TRUNK/6924/])
HBASE-14637 Loosen TestChoreService assert AND have (stack: rev 
71b38d60bbd26ff80ac23c53149d8da85976f39b)
* hbase-common/src/test/java/org/apache/hadoop/hbase/TestChoreService.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java


> Loosen TestChoreService assert AND have TestDataBlockEncoders do less work 
> (and add timeouts)
> -
>
> Key: HBASE-14637
> URL: https://issues.apache.org/jira/browse/HBASE-14637
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14637.txt
>
>
> Small patch to loosen assert in TestChoreService that fails on occasion -- it 
> is timing dependent -- and then TestDataBlockEncoders can hang so just have 
> it do less work and add logging so get clue why hanging (not apparent at a 
> glance)





[jira] [Commented] (HBASE-14637) Loosen TestChoreService assert AND have TestDataBlockEncoders do less work (and add timeouts)

2015-10-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962157#comment-14962157
 ] 

Hudson commented on HBASE-14637:


FAILURE: Integrated in HBase-1.3 #278 (See 
[https://builds.apache.org/job/HBase-1.3/278/])
HBASE-14637 Loosen TestChoreService assert AND have (stack: rev 
836afcc901e8588ee6a21f4eb9bb9c9d0675cef9)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
* hbase-common/src/test/java/org/apache/hadoop/hbase/TestChoreService.java


> Loosen TestChoreService assert AND have TestDataBlockEncoders do less work 
> (and add timeouts)
> -
>
> Key: HBASE-14637
> URL: https://issues.apache.org/jira/browse/HBASE-14637
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14637.txt
>
>
> Small patch to loosen assert in TestChoreService that fails on occasion -- it 
> is timing dependent -- and then TestDataBlockEncoders can hang so just have 
> it do less work and add logging so get clue why hanging (not apparent at a 
> glance)





[jira] [Commented] (HBASE-14637) Loosen TestChoreService assert AND have TestDataBlockEncoders do less work (and add timeouts)

2015-10-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962148#comment-14962148
 ] 

Hudson commented on HBASE-14637:


FAILURE: Integrated in HBase-1.2 #270 (See 
[https://builds.apache.org/job/HBase-1.2/270/])
HBASE-14637 Loosen TestChoreService assert AND have (stack: rev 
005a74c44b2eb47d8cf51d9d4cbc26c8f90521ac)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
* hbase-common/src/test/java/org/apache/hadoop/hbase/TestChoreService.java


> Loosen TestChoreService assert AND have TestDataBlockEncoders do less work 
> (and add timeouts)
> -
>
> Key: HBASE-14637
> URL: https://issues.apache.org/jira/browse/HBASE-14637
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14637.txt
>
>
> Small patch to loosen assert in TestChoreService that fails on occasion -- it 
> is timing dependent -- and then TestDataBlockEncoders can hang so just have 
> it do less work and add logging so get clue why hanging (not apparent at a 
> glance)





[jira] [Updated] (HBASE-14631) Region merge request should be audited with request user through proper scope of doAs() calls to region observer notifications

2015-10-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14631:
---
Summary: Region merge request should be audited with request user through 
proper scope of doAs() calls to region observer notifications  (was: Region 
merge request should be audited with request user through proper the scope of 
doAs() calls to region observer notifications)

> Region merge request should be audited with request user through proper scope 
> of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case for region merge.
> This JIRA is to implement similar scope narrowing technique for region 
> merging.
> The majority of the change would be in RegionMergeTransactionImpl class.





[jira] [Updated] (HBASE-14631) Region merge request should be audited with request user through proper the scope of doAs() calls to region observer notifications

2015-10-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14631:
---
Summary: Region merge request should be audited with request user through 
proper the scope of doAs() calls to region observer notifications  (was: Narrow 
the scope of doAs() calls to region observer notifications for region merge)

> Region merge request should be audited with request user through proper the 
> scope of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case for region merge.
> This JIRA is to implement similar scope narrowing technique for region 
> merging.
> The majority of the change would be in RegionMergeTransactionImpl class.





[jira] [Commented] (HBASE-14631) Narrow the scope of doAs() calls to region observer notifications for region merge

2015-10-17 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962141#comment-14962141
 ] 

Jerry He commented on HBASE-14631:
--

Hi, Ted

This is to fix the security and audit loophole for region merge so that the 
doAs() is added.
Would you mind updating the JIRA subject to be accurate?
The patch looks good.

> Narrow the scope of doAs() calls to region observer notifications for region 
> merge
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case for region merge.
> This JIRA is to implement similar scope narrowing technique for region 
> merging.
> The majority of the change would be in RegionMergeTransactionImpl class.





[jira] [Commented] (HBASE-14637) Loosen TestChoreService assert AND have TestDataBlockEncoders do less work (and add timeouts)

2015-10-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962139#comment-14962139
 ] 

Hudson commented on HBASE-14637:


SUCCESS: Integrated in HBase-1.3-IT #248 (See 
[https://builds.apache.org/job/HBase-1.3-IT/248/])
HBASE-14637 Loosen TestChoreService assert AND have (stack: rev 
836afcc901e8588ee6a21f4eb9bb9c9d0675cef9)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
* hbase-common/src/test/java/org/apache/hadoop/hbase/TestChoreService.java


> Loosen TestChoreService assert AND have TestDataBlockEncoders do less work 
> (and add timeouts)
> -
>
> Key: HBASE-14637
> URL: https://issues.apache.org/jira/browse/HBASE-14637
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14637.txt
>
>
> Small patch to loosen assert in TestChoreService that fails on occasion -- it 
> is timing dependent -- and then TestDataBlockEncoders can hang so just have 
> it do less work and add logging so get clue why hanging (not apparent at a 
> glance)





[jira] [Commented] (HBASE-14637) Loosen TestChoreService assert AND have TestDataBlockEncoders do less work (and add timeouts)

2015-10-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962137#comment-14962137
 ] 

Hudson commented on HBASE-14637:


SUCCESS: Integrated in HBase-1.2-IT #221 (See 
[https://builds.apache.org/job/HBase-1.2-IT/221/])
HBASE-14637 Loosen TestChoreService assert AND have (stack: rev 
005a74c44b2eb47d8cf51d9d4cbc26c8f90521ac)
* hbase-common/src/test/java/org/apache/hadoop/hbase/TestChoreService.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java


> Loosen TestChoreService assert AND have TestDataBlockEncoders do less work 
> (and add timeouts)
> -
>
> Key: HBASE-14637
> URL: https://issues.apache.org/jira/browse/HBASE-14637
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14637.txt
>
>
> Small patch to loosen assert in TestChoreService that fails on occasion -- it 
> is timing dependent -- and then TestDataBlockEncoders can hang so just have 
> it do less work and add logging so get clue why hanging (not apparent at a 
> glance)





[jira] [Updated] (HBASE-14420) Zombie Stomping Session

2015-10-17 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14420:
--
Attachment: none_fix.txt

> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: hangers.txt, none_fix (1).txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt
>
>
> Patch builds are now failing most of the time because we are dropping zombies. 
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of this one. Am running builds back-to-back 
> on little cluster to turn out the monsters.





[jira] [Resolved] (HBASE-14637) Loosen TestChoreService assert AND have TestDataBlockEncoders do less work (and add timeouts)

2015-10-17 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-14637.
---
   Resolution: Fixed
Fix Version/s: 1.3.0
   1.2.0
   2.0.0

> Loosen TestChoreService assert AND have TestDataBlockEncoders do less work 
> (and add timeouts)
> -
>
> Key: HBASE-14637
> URL: https://issues.apache.org/jira/browse/HBASE-14637
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14637.txt
>
>
> Small patch to loosen assert in TestChoreService that fails on occasion -- it 
> is timing dependent -- and then TestDataBlockEncoders can hang so just have 
> it do less work and add logging so get clue why hanging (not apparent at a 
> glance)





[jira] [Updated] (HBASE-14637) Loosen TestChoreService assert AND have TestDataBlockEncoders do less work (and add timeouts)

2015-10-17 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14637:
--
Attachment: 14637.txt

What I pushed to branch-1.2+

> Loosen TestChoreService assert AND have TestDataBlockEncoders do less work 
> (and add timeouts)
> -
>
> Key: HBASE-14637
> URL: https://issues.apache.org/jira/browse/HBASE-14637
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14637.txt
>
>
> Small patch to loosen assert in TestChoreService that fails on occasion -- it 
> is timing dependent -- and then TestDataBlockEncoders can hang so just have 
> it do less work and add logging so get clue why hanging (not apparent at a 
> glance)





[jira] [Created] (HBASE-14637) Loosen TestChoreService assert AND have TestDataBlockEncoders do less work (and add timeouts)

2015-10-17 Thread stack (JIRA)
stack created HBASE-14637:
-

 Summary: Loosen TestChoreService assert AND have 
TestDataBlockEncoders do less work (and add timeouts)
 Key: HBASE-14637
 URL: https://issues.apache.org/jira/browse/HBASE-14637
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack
Assignee: stack


Small patch to loosen assert in TestChoreService that fails on occasion -- it 
is timing dependent -- and then TestDataBlockEncoders can hang so just have it 
do less work and add logging so get clue why hanging (not apparent at a glance)
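
As a rough illustration of what loosening a timing-dependent assert can mean, here is a Python sketch. It is not the patch attached to this issue: `expected_runs`, `runs_within_tolerance` and the `slack` parameter are hypothetical names. The idea is to accept a small deviation in how many times a periodic chore ran, instead of requiring an exact count that depends on scheduler timing.

```python
def expected_runs(elapsed_ms, period_ms):
    """Number of runs a perfectly punctual chore would complete."""
    return elapsed_ms // period_ms


def runs_within_tolerance(observed, elapsed_ms, period_ms, slack=1):
    """True if the observed run count is within `slack` of the ideal count."""
    return abs(observed - expected_runs(elapsed_ms, period_ms)) <= slack
```

A test would then assert `runs_within_tolerance(observed, elapsed_ms, period_ms)` rather than comparing `observed` to a fixed number.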





[jira] [Commented] (HBASE-14633) Try fluid width UI

2015-10-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962061#comment-14962061
 ] 

Hadoop QA commented on HBASE-14633:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12767220/HBASE-14633-v1.patch
  against master branch at commit d9ee19131881d3730f1fe5dada6160560d953284.
  ATTACHMENT ID: 12767220

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16079//console

This message is automatically generated.

> Try fluid width UI
> --
>
> Key: HBASE-14633
> URL: https://issues.apache.org/jira/browse/HBASE-14633
> Project: HBase
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 2.0.0, 1.2.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HBASE-14633-v1.patch, HBASE-14633.patch
>
>
> Our UI is often too long. Let's give it more room if available.





[jira] [Updated] (HBASE-14633) Try fluid width UI

2015-10-17 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-14633:
--
Attachment: HBASE-14633-v1.patch

> Try fluid width UI
> --
>
> Key: HBASE-14633
> URL: https://issues.apache.org/jira/browse/HBASE-14633
> Project: HBase
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 2.0.0, 1.2.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HBASE-14633-v1.patch, HBASE-14633.patch
>
>
> Our UI is often too long. Let's give it more room if available.





[jira] [Commented] (HBASE-14633) Try fluid width UI

2015-10-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962026#comment-14962026
 ] 

Hadoop QA commented on HBASE-14633:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12767218/HBASE-14633.patch
  against master branch at commit d9ee19131881d3730f1fe5dada6160560d953284.
  ATTACHMENT ID: 12767218

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16078//console

This message is automatically generated.

> Try fluid width UI
> --
>
> Key: HBASE-14633
> URL: https://issues.apache.org/jira/browse/HBASE-14633
> Project: HBase
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 2.0.0, 1.2.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HBASE-14633.patch
>
>
> Our UI is often too long. Let's give it more room if available.





[jira] [Updated] (HBASE-14633) Try fluid width UI

2015-10-17 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-14633:
--
Status: Patch Available  (was: Open)

> Try fluid width UI
> --
>
> Key: HBASE-14633
> URL: https://issues.apache.org/jira/browse/HBASE-14633
> Project: HBase
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 2.0.0, 1.2.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HBASE-14633.patch
>
>
> Our UI is often too long. Let's give it more room if available.





[jira] [Updated] (HBASE-14633) Try fluid width UI

2015-10-17 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-14633:
--
Attachment: HBASE-14633.patch

> Try fluid width UI
> --
>
> Key: HBASE-14633
> URL: https://issues.apache.org/jira/browse/HBASE-14633
> Project: HBase
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 2.0.0, 1.2.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HBASE-14633.patch
>
>
> Our UI is often too long. Let's give it more room if available.





[jira] [Commented] (HBASE-14420) Zombie Stomping Session

2015-10-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961987#comment-14961987
 ] 

stack commented on HBASE-14420:
---

Says:

{code}
Flaked tests: 
org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence.testOnlineSnapshotDeleteIndependent(org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence)
  Run 1: 
TestSnapshotCloneIndependence.testOnlineSnapshotDeleteIndependent:191->runTestSnapshotDeleteIndependent:459
 expected:<17576> but was:<14046>
  Run 2: 
TestSnapshotCloneIndependence.testOnlineSnapshotDeleteIndependent:191->runTestSnapshotDeleteIndependent:459
 expected:<17576> but was:<14046>
  Run 3: PASS

org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer.testRegionReplicationOnMidClusterSameHosts(org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer)
  Run 1: 
TestStochasticLoadBalancer.testRegionReplicationOnMidClusterSameHosts:454->BalancerTestBase.testWithCluster:444->BalancerTestBase.assertClusterAsBalanced:203
 null
  Run 2: PASS

org.apache.hadoop.hbase.regionserver.TestWALLockup.testLockupWhenSyncInMiddleOfZigZagSetup(org.apache.hadoop.hbase.regionserver.TestWALLockup)
  Run 1: TestWALLockup.testLockupWhenSyncInMiddleOfZigZagSetup:245 » 
TestTimedOut test ...
  Run 2: PASS
{code}

TestSnapshotCloneIndependence#testOnlineSnapshotDeleteIndependent was disabled 
last night.

Load balancer is showing up from time to time still.

I see this: [ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test 
(secondPartTestsExecution) on project hbase-server: There was a timeout or 
other error in the fork -> [Help 1]

org.apache.hadoop.hbase.TestChoreService does not show up in list above. The 
test that failed is all timer based... let me just disable it.

Let me put timeout on the encoding failure.

> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: hangers.txt, none_fix (1).txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt
>
>
> Patch builds are now failing most of the time because we are dropping zombies. 
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of this one. Am running builds back-to-back 
> on little cluster to turn out the monsters.





[jira] [Commented] (HBASE-14420) Zombie Stomping Session

2015-10-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961989#comment-14961989
 ] 

stack commented on HBASE-14420:
---

TestDataBlockEncoders was doing 10k random seeks... cut it down to 1k.

> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: hangers.txt, none_fix (1).txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt
>
>
> Patch builds are now failing most of the time because we are dropping zombies. 
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of this one. Am running builds back-to-back 
> on little cluster to turn out the monsters.





[jira] [Commented] (HBASE-14636) Clear HFileScannerImpl#prevBlocks in btw Compaction flow

2015-10-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961972#comment-14961972
 ] 

stack commented on HBASE-14636:
---

This is just straight YCSB with no filters or other machinations in between. 
The first OOME I saw was during workloadc, so I doubt there was any compaction.  
The one I sent over was a load, then workloada, b, etc. The YCSB suite 
worked fine when all was onheap with 16G. If I go offheap, even though I leave 
the heap at 16G, it OOMEs pretty quickly.

Is btw By-The-Way?


> Clear HFileScannerImpl#prevBlocks in btw Compaction flow
> 
>
> Key: HBASE-14636
> URL: https://issues.apache.org/jira/browse/HBASE-14636
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 2.0.0
>
>






[jira] [Commented] (HBASE-14630) Cells still show up in scan after cell-level TTL has expired

2015-10-17 Thread Emre Colak (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961971#comment-14961971
 ] 

Emre Colak commented on HBASE-14630:


Thank you, it all makes sense now. For some reason I thought I could update the 
TTL of an existing cell. 

> Cells still show up in scan after cell-level TTL has expired
> 
>
> Key: HBASE-14630
> URL: https://issues.apache.org/jira/browse/HBASE-14630
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.0.2, 1.1.2
>Reporter: Emre Colak
>
> I have an HBase table with the following description:
> {NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', 
> REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', 
> MIN_VERSIONS => '0' , TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', 
> BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
> I put some values in it and then set TTL (30s) on those values with another
> put operation. First thing I notice is that the timestamps of the cells get
> updated after the 2nd put. And 30 seconds later, when I do a scan on the
> table, I still see those cells in the table, however this time with their
> timestamps reverted to the original timestamps.
> I understand that these cells won't necessarily be deleted until a
> compaction, but they still come up in my scan even though the TTL
> that I set on them has expired.
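
One plausible reading of the report above, given the follow-up comment that the TTL of an existing cell cannot be updated: the second put does not modify the stored cell but adds a newer version that carries the TTL, and once that version expires the original, TTL-free version becomes visible again. The sketch below is a toy model of that reading only. CellTtlSketch, Cell, put and read are hypothetical names invented for illustration; this is not HBase client or server code.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a single column: each put appends a new cell version; nothing is updated in place. */
final class CellTtlSketch {

    /** One stored version. ttlMillis == null means the cell has no cell-level TTL. */
    static final class Cell {
        final long timestamp;
        final String value;
        final Long ttlMillis;

        Cell(long timestamp, String value, Long ttlMillis) {
            this.timestamp = timestamp;
            this.value = value;
            this.ttlMillis = ttlMillis;
        }

        boolean expiredAt(long now) {
            return ttlMillis != null && timestamp + ttlMillis <= now;
        }
    }

    private final List<Cell> versions = new ArrayList<>();

    void put(long timestamp, String value, Long ttlMillis) {
        versions.add(new Cell(timestamp, value, ttlMillis));
    }

    /** Newest version that has not expired at time now, or null if none is visible. */
    Cell read(long now) {
        Cell newest = null;
        for (Cell c : versions) {
            if (c.expiredAt(now)) {
                continue; // expired versions are hidden from reads
            }
            if (newest == null || c.timestamp > newest.timestamp) {
                newest = c;
            }
        }
        return newest;
    }

    public static void main(String[] args) {
        CellTtlSketch column = new CellTtlSketch();
        column.put(1_000L, "v", null);     // first put, no TTL
        column.put(5_000L, "v", 30_000L);  // second put, same value, 30 s TTL

        // Shortly after the second put, the newer TTL-bearing version is the one returned.
        System.out.println(column.read(6_000L).timestamp);
        // After the TTL has elapsed, that version is hidden and the original one is returned.
        System.out.println(column.read(40_000L).timestamp);
    }
}
```

In this model the cell never disappears from a read, because the first version has no TTL of its own; only the timestamp seen by the reader changes, which matches the symptom described.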





[jira] [Commented] (HBASE-14536) Balancer & SSH interfering with each other leading to unavailability

2015-10-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961820#comment-14961820
 ] 

Hudson commented on HBASE-14536:


FAILURE: Integrated in HBase-1.3 #277 (See 
[https://builds.apache.org/job/HBase-1.3/277/])
HBASE-14536 Balancer & SSH interfering with each other leading to 
(syuanjiangdev: rev 9bdb88a572ac30fb51fcc44284f51543d2b4568f)
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestServerCrashProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java


> Balancer & SSH interfering with each other leading to unavailability
> 
>
> Key: HBASE-14536
> URL: https://issues.apache.org/jira/browse/HBASE-14536
> Project: HBase
>  Issue Type: Bug
>  Components: master, Region Assignment
>Affects Versions: 1.1.2
>Reporter: Devaraj Das
>Assignee: Stephen Yuan Jiang
> Fix For: 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14536.v1-branch-1.1.patch, 
> HBASE-14536.v1-branch-1.patch, HBASE-14536.v2-branch-1.1.patch, 
> HBASE-14536.v3-branch-1.1.patch, master-log.tgz
>
>
> Came across this in our cluster:
> 1. The meta was assigned to a server 10.0.0.149,16020,1443507203340
> {noformat}
> 2015-09-29 06:16:22,472 DEBUG [AM.ZK.Worker-pool2-t56] master.RegionStates: Onlined 1588230740 on 10.0.0.149,16020,1443507203340 {ENCODED => 1588230740, NAME => 'hbase:meta,,1', STARTKEY => '', ENDKEY => ''}
> {noformat}
> 2. The server dies at some point:
> {noformat}
> 2015-09-29 06:18:25,952 INFO  [main-EventThread] zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [10.0.0.149,16020,1443507203340]
> 2015-09-29 06:18:25,955 DEBUG [main-EventThread] master.AssignmentManager: based on AM, current region=hbase:meta,,1.1588230740 is on server=10.0.0.149,16020,1443507203340 server being checked: 10.0.0.149,16020,1443507203340
> {noformat}
> 3. The balancer had computed a plan that contained a move for the meta:
> {noformat}
> 2015-09-29 06:18:26,833 INFO  [B.defaultRpcServer.handler=12,queue=0,port=16000] master.HMaster: balance hri=hbase:meta,,1.1588230740, src=10.0.0.149,16020,1443507203340, dest=10.0.0.205,16020,1443507257905
> {noformat}
> 4. The following ensues after this, leading to the meta remaining unassigned:
> {noformat}
> 2015-09-29 06:18:26,859 DEBUG [B.defaultRpcServer.handler=12,queue=0,port=16000] master.AssignmentManager: Offline hbase:meta,,1.1588230740, no need to unassign since it's on a dead server: 10.0.0.149,16020,1443507203340
> ..
> 2015-09-29 06:18:26,899 INFO  [B.defaultRpcServer.handler=12,queue=0,port=16000] master.RegionStates: Offlined 1588230740 from 10.0.0.149,16020,1443507203340
> .
> 2015-09-29 06:18:26,914 INFO  [B.defaultRpcServer.handler=12,queue=0,port=16000] master.AssignmentManager: Skip assigning hbase:meta,,1.1588230740, it is on a dead but not processed yet server: 10.0.0.149,16020,1443507203340
> 
> 2015-09-29 06:18:26,915 DEBUG [AM.ZK.Worker-pool2-t58] master.AssignmentManager: Znode hbase:meta,,1.1588230740 deleted, state: {1588230740 state=OFFLINE, ts=1443507506914, server=10.0.0.149,16020,1443507203340}
> 
> 2015-09-29 06:18:29,447 DEBUG [MASTER_META_SERVER_OPERATIONS-10.0.0.148:16000-2] master.AssignmentManager: based on AM, current region=hbase:meta,,1.1588230740 is on server=null server being checked: 10.0.0.149,16020,1443507203340
> 2015-09-29 06:18:29,451 INFO  [MASTER_META_SERVER_OPERATIONS-10.0.0.148:16000-2] handler.MetaServerShutdownHandler: META has been assigned to otherwhere, skip assigning.
> 2015-09-29 06:18:29,452 DEBUG [MASTER_META_SERVER_OPERATIONS-10.0.0.148:16000-2] master.DeadServer: Finished processing 10.0.0.149,16020,1443507203340
> {noformat}
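
The four steps in the description can be read as an ordering problem between the balancer and the dead-server handling. The sketch below is a toy model of that reading, under the assumption that the balancer clears the region's location before discovering it must skip the assign, and that the shutdown handling only reassigns meta if the master still places it on the dead server. MetaAssignmentRaceSketch and all of its members are hypothetical names; this is not the AssignmentManager or ServerCrashProcedure code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the interleaving in the log excerpt; not the real master code. */
final class MetaAssignmentRaceSketch {
    static final String META = "hbase:meta,,1.1588230740";

    // Region -> server as the master believes it; an absent key means "on no server".
    final Map<String, String> regionLocation = new HashMap<>();
    // Servers known to be dead whose shutdown handling has not finished yet.
    final Set<String> deadNotProcessed = new HashSet<>();

    void serverDied(String server) {
        deadNotProcessed.add(server);
    }

    /** Balancer move: offline the region on the source, then assign it to the destination. */
    void balancerMove(String region, String src, String dest) {
        regionLocation.remove(region);          // step 4: "Offlined ... from" the source
        if (deadNotProcessed.contains(src)) {
            return;                             // step 4: "Skip assigning ... dead but not processed yet server"
        }
        regionLocation.put(region, dest);
    }

    /** Shutdown handling: reassign meta only if it is still recorded on the dead server. */
    void processDeadServer(String server, String fallback) {
        if (server.equals(regionLocation.get(META))) {
            regionLocation.put(META, fallback);
        }
        // Otherwise the handler concludes meta lives elsewhere and skips the assign.
        deadNotProcessed.remove(server);
    }

    boolean isAssigned(String region) {
        return regionLocation.containsKey(region);
    }

    public static void main(String[] args) {
        MetaAssignmentRaceSketch master = new MetaAssignmentRaceSketch();
        master.regionLocation.put(META, "serverA");     // step 1: meta is online on serverA
        master.serverDied("serverA");                   // step 2: serverA dies
        master.balancerMove(META, "serverA", "serverB"); // step 3: stale balancer plan runs
        master.processDeadServer("serverA", "serverB"); // shutdown handling sees server=null
        System.out.println("meta assigned: " + master.isAssigned(META));
    }
}
```

In this model each side defers to the other: the balancer leaves the assign to the shutdown handling, and the shutdown handling assumes someone else already assigned meta, so neither does it. If processDeadServer runs before balancerMove, meta ends up assigned.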





[jira] [Commented] (HBASE-14536) Balancer & SSH interfering with each other leading to unavailability

2015-10-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961819#comment-14961819
 ] 

Hudson commented on HBASE-14536:


SUCCESS: Integrated in HBase-1.2 #269 (See 
[https://builds.apache.org/job/HBase-1.2/269/])
HBASE-14536 Balancer & SSH interfering with each other leading to 
(syuanjiangdev: rev c6c5c95f01ffd582392fcc3f025602190219a213)
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestServerCrashProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java




[jira] [Commented] (HBASE-14536) Balancer & SSH interfering with each other leading to unavailability

2015-10-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961816#comment-14961816
 ] 

Hudson commented on HBASE-14536:


SUCCESS: Integrated in HBase-1.3-IT #247 (See 
[https://builds.apache.org/job/HBase-1.3-IT/247/])
HBASE-14536 Balancer & SSH interfering with each other leading to 
(syuanjiangdev: rev 9bdb88a572ac30fb51fcc44284f51543d2b4568f)
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestServerCrashProcedure.java




[jira] [Commented] (HBASE-14536) Balancer & SSH interfering with each other leading to unavailability

2015-10-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961809#comment-14961809
 ] 

Hudson commented on HBASE-14536:


SUCCESS: Integrated in HBase-1.2-IT #220 (See 
[https://builds.apache.org/job/HBase-1.2-IT/220/])
HBASE-14536 Balancer & SSH interfering with each other leading to 
(syuanjiangdev: rev c6c5c95f01ffd582392fcc3f025602190219a213)
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestServerCrashProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java




[jira] [Updated] (HBASE-14636) Clear HFileScannerImpl#prevBlocks in btw Compaction flow

2015-10-17 Thread kevin.chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kevin.chen updated HBASE-14636:
---
Description: (was: c)

> Clear HFileScannerImpl#prevBlocks in btw Compaction flow
> 
>
> Key: HBASE-14636
> URL: https://issues.apache.org/jira/browse/HBASE-14636
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 2.0.0
>
>






[jira] [Updated] (HBASE-14636) Clear HFileScannerImpl#prevBlocks in btw Compaction flow

2015-10-17 Thread kevin.chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kevin.chen updated HBASE-14636:
---
Description: c

> Clear HFileScannerImpl#prevBlocks in btw Compaction flow
> 
>
> Key: HBASE-14636
> URL: https://issues.apache.org/jira/browse/HBASE-14636
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 2.0.0
>
>
> c





[jira] [Commented] (HBASE-14420) Zombie Stomping Session

2015-10-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961804#comment-14961804
 ] 

Hadoop QA commented on HBASE-14420:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12767188/none_fix.txt
  against master branch at commit 6774f223a4a1cfd3cd81a17861ac94faad3b2916.
  ATTACHMENT ID: 12767188

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation, build,
or dev-support patch that doesn't require tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16076//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16076//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16076//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16076//console

This message is automatically generated.



[jira] [Commented] (HBASE-14505) Reenable tests disabled by HBASE-14430 in TestHttpServerLifecycle

2015-10-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961795#comment-14961795
 ] 

Hadoop QA commented on HBASE-14505:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12767191/HBASE-14505.3%20%281%29.patch
  against master branch at commit d9ee19131881d3730f1fe5dada6160560d953284.
  ATTACHMENT ID: 12767191

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16077//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16077//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16077//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16077//console

This message is automatically generated.

> Reenable tests disabled by HBASE-14430 in TestHttpServerLifecycle
> -
>
> Key: HBASE-14505
> URL: https://issues.apache.org/jira/browse/HBASE-14505
> Project: HBase
>  Issue Type: Task
>  Components: test
>Reporter: stack
>  Labels: beginner
> Attachments: HBASE-14505.1.patch, HBASE-14505.2.patch, HBASE-14505.3 
> (1).patch, HBASE-14505.3.patch, HBASE-14505.3.patch, HBASE-14505.3.patch
>
>
> Probably needs newer version of jetty or some cryptic JVM version.
> See HBASE-14430 for litany on how hard this is to reproduce, not only in 
> hbase, but back up in the Jetty where the issue was also reported.


