[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels
[ https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192978#comment-14192978 ]

Hadoop QA commented on HBASE-12346:
-----------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12678663/HBASE-12346-master-v3.patch
against trunk revision .

ATTACHMENT ID: 12678663

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 5 new or modified tests.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11552//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11552//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11552//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11552//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11552//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11552//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11552//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11552//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11552//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11552//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11552//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11552//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11552//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11552//console

This message is automatically generated.
Scan's default auths behavior under Visibility labels
-----------------------------------------------------
Key: HBASE-12346
URL: https://issues.apache.org/jira/browse/HBASE-12346
Project: HBase
Issue Type: Bug
Components: API, security
Affects Versions: 0.98.7, 0.99.1
Reporter: Jerry He
Fix For: 0.98.8, 0.99.2
Attachments: HBASE-12346-master-v2.patch, HBASE-12346-master-v3.patch, HBASE-12346-master.patch

In Visibility Labels security, a set of labels (auths) is administered and associated with a user. During a scan, a user can normally only see cells whose labels are part of the user's label set (auths). A Scan uses setAuthorizations to indicate which auths it wants to use to access the cells. Similarly in the shell:
{code}
scan 'table1', AUTHORIZATIONS => ['private']
{code}
But it is a surprise to find that setAuthorizations seems to be 'mandatory' in the default visibility label security setting. Every scan needs to call setAuthorizations before it can get any cells, even when the cells are under labels the requesting user holds. The following steps illustrate the issue. Run as superuser:
{code}
1. create a visibility label called 'private'
2. create 'table1'
3. put into 'table1' data and label the data as 'private'
4. set_auths 'user1', 'private'
5. grant 'user1', 'RW', 'table1'
{code}
Run as
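The surprising default described above can be sketched as a small illustrative model (plain Python, not HBase code; the function and data names are hypothetical): a scan with no explicit authorizations behaves as if it had an empty auth set, so labeled cells are filtered out even when the user's administered auths would cover them.

```python
def visible_cells(cells, scan_auths):
    """Toy model of visibility filtering.
    cells: list of (value, label) pairs; label is None for unlabeled cells.
    scan_auths: labels the scan explicitly requested, or None if unset."""
    auths = set(scan_auths or [])  # unset behaves like the empty set
    return [v for v, label in cells if label is None or label in auths]

table = [("row1-data", "private"), ("row2-data", None)]

# Without setAuthorizations, the 'private' cell is filtered out...
assert visible_cells(table, None) == ["row2-data"]
# ...even though 'private' is in user1's auth set; it must be repeated on the scan:
assert visible_cells(table, ["private"]) == ["row1-data", "row2-data"]
```

This is only a model of the observed behavior, not of HBase's VisibilityController implementation.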
[jira] [Commented] (HBASE-12406) Bulk load fails in 0.98 against hadoop-1 due to unmatched family name
[ https://issues.apache.org/jira/browse/HBASE-12406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192983#comment-14192983 ]

Hadoop QA commented on HBASE-12406:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12678664/12406-0.98-v1.txt
against trunk revision .

ATTACHMENT ID: 12678664

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11553//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11553//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11553//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11553//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11553//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11553//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11553//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11553//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11553//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11553//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11553//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11553//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11553//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11553//console

This message is automatically generated.
Bulk load fails in 0.98 against hadoop-1 due to unmatched family name
---------------------------------------------------------------------
Key: HBASE-12406
URL: https://issues.apache.org/jira/browse/HBASE-12406
Project: HBase
Issue Type: Bug
Reporter: Ted Yu
Fix For: 0.98.8
Attachments: 12406-0.98-v1.txt

From https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/614/testReport/org.apache.hadoop.hbase.mapreduce/TestCopyTable/testCopyTableWithBulkload/ :
{code}
java.io.IOException: Unmatched family names found: unmatched family names in HFiles to be bulkloaded: [_logs]; valid family names of table testCopyTable2 are: [family]
  at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:268)
  at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.run(LoadIncrementalHFiles.java:907)
  at org.apache.hadoop.hbase.mapreduce.CopyTable.run(CopyTable.java:344)
{code}
The above failure was due to the presence of a history directory under the _logs directory, e.g.
{code}
hdfs://nn:59313/user/tyu/copytable/4282249372082687850/_logs/history
{code}
HBASE-12375 removed the check for directory names which start with
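The failure above happens because the bulk loader treats every subdirectory of the output (here the MapReduce bookkeeping directory `_logs`) as a column family and validates it against the table schema. A minimal sketch of the kind of filter involved, assuming (from the `[_logs]` error and the truncated reference to HBASE-12375) that the removed check skipped underscore-prefixed internal directories; the function names are illustrative, not HBase's actual API:

```python
def candidate_families(subdirs):
    """Drop MapReduce bookkeeping dirs (e.g. _logs, _SUCCESS) so that only
    real column-family directories are validated against the table schema."""
    return [d for d in subdirs if not d.startswith("_")]

def unmatched_families(subdirs, table_families):
    """Families present in the HFile output but absent from the table."""
    return sorted(set(candidate_families(subdirs)) - set(table_families))

# With the underscore filter, _logs no longer trips the
# "Unmatched family names" check seen in the stack trace above:
assert unmatched_families(["family", "_logs"], ["family"]) == []
```

Without the filter, `_logs` would appear in the unmatched set and the bulk load would abort exactly as in the reported IOException.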
[jira] [Updated] (HBASE-12363) KEEP_DELETED_CELLS considered harmful?
[ https://issues.apache.org/jira/browse/HBASE-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-12363:
----------------------------------
Attachment: 12363-master.txt

Here's a patch.
* Adds a new TTL option to KEEP_DELETED_CELLS
* 100% backwards compatible in HColumnDescriptor (can parse the old 'true', 'false' strings)
* 100% compatible in the shell (arg.to_s.upcase, so booleans and strings will work exactly as before)
* The only difference is that a newly created table will show 'TRUE' instead of 'true'; even that is forward compatible for the old case, as the old code will try to parse it as a Boolean
* Added tests

Now, ScanQueryMatcher doesn't exactly look nicer now. If somebody suggests some easy simplifications here I'm happy to incorporate them. I think it's time to refactor it... for another jira.

TL;DR: with KEEP_DELETED_CELLS=TTL, deleted cells *and* their delete markers are removed when the TTL expires (regardless of the MIN_VERSIONS setting). I.e. one can keep TTL + MIN_VERSIONS and still get rid of old deleted rows. We could even add another enum value, MARKERS_ONLY, and remove the hbase.hstore.time.to.purge.deletes config option, but that's also another jira.

KEEP_DELETED_CELLS considered harmful?
--------------------------------------
Key: HBASE-12363
URL: https://issues.apache.org/jira/browse/HBASE-12363
Project: HBase
Issue Type: Sub-task
Components: regionserver
Reporter: Lars Hofhansl
Labels: Phoenix
Attachments: 12363-master.txt, 12363-test.txt

Brainstorming... This morning in the train (of all places) I realized a fundamental issue in how KEEP_DELETED_CELLS is implemented. The problem is around knowing when it is safe to remove a delete marker (we cannot remove it unless all cells affected by it are removed otherwise). This was particularly hard for family markers, since they sort before all cells of a row, and hence scanning forward through an HFile you cannot know whether the family markers are still needed until at least the entire row is scanned.

My solution was to keep the TS of the oldest put in any given HFile, and only remove delete markers older than that TS. That sounds good on the face of it... But now imagine you wrote a version of ROW 1 and then never update it again. Then later you write a billion other rows and delete them all. Since the TS of the cells in ROW 1 is older than all the delete markers for the other billion rows, these will never be collected... At least not for the region that hosts ROW 1 after a major compaction. Note, in a sense that is what HBase is supposed to do when keeping deleted cells: keep them until they would be removed by some other means (for example TTL, or MAX_VERSIONS when new versions are inserted). The specific problem here is that even when all KVs affected by a delete marker have expired this way, the marker would not be removed if there is just one older KV in the HStore. I don't see a good way out of this. In the parent I outlined these four options:
# Only allow the new flag to be set on CFs with TTL set. MIN_VERSIONS would not apply to deleted rows or delete marker rows (wouldn't know how long to keep family deletes in that case). (MAX)VERSIONS would still be enforced on all row types except for family delete markers.
# Translate family delete markers to column delete markers at (major) compaction time.
# Change HFileWriterV* to keep track of the earliest put TS in a store and write it to the file metadata. Use that to expire delete markers that are older and hence can't affect any puts in the file.
# Have Store.java keep track of the earliest put in internalFlushCache and compactStore and then append it to the file metadata. That way HFileWriterV* would not need to know about KVs.
And I implemented #4. I'd love to get input on ideas.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
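The earliest-put-TS approach (options #3/#4) and its pathology can be sketched as a toy model (plain Python, not HBase code; names are illustrative): a delete marker is only safe to purge if it is strictly older than the earliest put timestamp in the store, since a younger marker could still mask some put.

```python
def purgeable_markers(put_timestamps, marker_timestamps):
    """Toy model of the HFile-metadata rule: only delete markers older than
    the earliest put TS in the store can be purged, because a marker that is
    newer than some put might still be masking that put."""
    earliest_put = min(put_timestamps)
    return [ts for ts in marker_timestamps if ts < earliest_put]

# The pathology described above: one ancient put in ROW 1 pins every
# younger delete marker in the store forever.
puts = [100]               # ROW 1, written once, never updated
markers = [500, 600, 700]  # delete markers for the other (billion) rows
assert purgeable_markers(puts, markers) == []  # nothing is ever collected
```

If the ancient put is absent (say all puts are newer than some markers), those older markers become collectible, which is exactly why the rule is correct but can be arbitrarily pessimistic.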
[jira] [Comment Edited] (HBASE-12363) KEEP_DELETED_CELLS considered harmful?
[ https://issues.apache.org/jira/browse/HBASE-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192990#comment-14192990 ]

Lars Hofhansl edited comment on HBASE-12363 at 11/1/14 6:38 AM:
----------------------------------------------------------------

Here's a patch.
* Adds a new TTL option to KEEP_DELETED_CELLS
* 100% backwards compatible in HColumnDescriptor (can parse the old 'true', 'false' strings)
* 100% compatible in the shell (arg.to_s.upcase, so booleans and strings will work exactly as before)
* The only difference is that a newly created table will show 'TRUE' instead of 'true'; even that is forward compatible for the old case, as the old code will try to parse it as a Boolean
* Added tests

Now, ScanQueryMatcher doesn't exactly look nicer now. If somebody suggests some easy simplifications here I'm happy to incorporate them. I think it's time to refactor it... for another jira.

TL;DR: with KEEP_DELETED_CELLS=TTL, deleted cells *and* their delete markers are removed when the TTL expires (regardless of the MIN_VERSIONS setting). I.e. one can keep TTL + MIN_VERSIONS and still get rid of old deleted rows. We could even add another enum value, MARKERS_ONLY, and remove the hbase.hstore.time.to.purge.deletes config option, but that's also another jira.

was (Author: lhofhansl):
Here's a patch.
* Adds new TTL option to KEEP_DELETED_CELLS
* 100% backwards compatible in HColumnDescriptor (can parse the old 'true', 'false' string)
* 100% compatible in shell (arg.to_s.upcase to boolean and strings will work exactly as before)
* the only difference is that a newly created table will show 'TRUE' instead 'true', even that is compatible forward compatible for old case, as the old code will try to parse it as Boolean
* added tests
Now, ScanQueryMatcher doesn't exactly look nicer now. If somebody suggests some easy simplifications here I'm happy to incorporate them. It's think it's time to refactor it... For another jira.
TL;DR: with KEEP_DELETED_CELLS=TTL deleted cells *and* their delete markers are removed when the TTL expired (regardless of MIN_VERSION setting). I.e. one can keep TTL + MIN_VERSIONS and still get rid of old deleted rows. We could even add another enum: MAKERS_ONLY and remove the hbase.hstore.time.to.purge.deletes config option, but that's also another jira.
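The backward-compatible KEEP_DELETED_CELLS parsing described in the patch notes above can be sketched as a small model (plain Python, not HBase's HColumnDescriptor code; the enum and function names are illustrative, mirroring the TRUE/FALSE/TTL values the patch introduces): uppercasing the stored string first, as the shell's arg.to_s.upcase does, lets the old boolean strings keep working while adding the new TTL mode.

```python
from enum import Enum

class KeepDeletedCells(Enum):
    """Model of the three-valued setting the patch adds."""
    FALSE = "FALSE"
    TRUE = "TRUE"
    TTL = "TTL"

def parse_keep_deleted_cells(value):
    # Uppercase first (like arg.to_s.upcase in the shell), so booleans and
    # the old lowercase 'true'/'false' strings map onto the enum unchanged.
    return KeepDeletedCells[str(value).upper()]

assert parse_keep_deleted_cells("true") is KeepDeletedCells.TRUE   # old style
assert parse_keep_deleted_cells("TTL") is KeepDeletedCells.TTL     # new mode
```

A table created with the new code then serializes the value as 'TRUE', which old code can still parse as a Boolean, matching the forward-compatibility claim in the comment.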
[jira] [Assigned] (HBASE-12363) KEEP_DELETED_CELLS considered harmful?
[ https://issues.apache.org/jira/browse/HBASE-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reassigned HBASE-12363:
-------------------------------------
Assignee: Lars Hofhansl
[jira] [Updated] (HBASE-12363) KEEP_DELETED_CELLS considered harmful?
[ https://issues.apache.org/jira/browse/HBASE-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-12363:
----------------------------------
Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-12363) KEEP_DELETED_CELLS considered harmful?
[ https://issues.apache.org/jira/browse/HBASE-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192991#comment-14192991 ]

Hadoop QA commented on HBASE-12363:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12678669/12363-master.txt
against trunk revision .

ATTACHMENT ID: 12678669

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 27 new or modified tests.
{color:red}-1 javac{color}. The patch appears to cause the mvn compile goal to fail. Compilation errors resume:
{code}
[ERROR] COMPILATION ERROR :
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[162,23] cannot find symbol
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[429,36] cannot find symbol
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[794,10] cannot find symbol
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[819,48] cannot find symbol
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[162,63] cannot find symbol
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[798,14] cannot find symbol
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[811,61] cannot find symbol
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[811,85] cannot find symbol
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.2:compile (default-compile) on project hbase-client: Compilation failure: Compilation failure:
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[162,23] cannot find symbol
[ERROR] symbol: class KeepDeletedCells
[ERROR] location: class org.apache.hadoop.hbase.HColumnDescriptor
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[429,36] cannot find symbol
[ERROR] symbol: class KeepDeletedCells
[ERROR] location: class org.apache.hadoop.hbase.HColumnDescriptor
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[794,10] cannot find symbol
[ERROR] symbol: class KeepDeletedCells
[ERROR] location: class org.apache.hadoop.hbase.HColumnDescriptor
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[819,48] cannot find symbol
[ERROR] symbol: class KeepDeletedCells
[ERROR] location: class org.apache.hadoop.hbase.HColumnDescriptor
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[162,63] cannot find symbol
[ERROR] symbol: variable KeepDeletedCells
[ERROR] location: class org.apache.hadoop.hbase.HColumnDescriptor
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[798,14] cannot find symbol
[ERROR] symbol: variable KeepDeletedCells
[ERROR] location: class org.apache.hadoop.hbase.HColumnDescriptor
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[811,61] cannot find symbol
[ERROR] symbol: variable KeepDeletedCells
[ERROR] location: class org.apache.hadoop.hbase.HColumnDescriptor
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:[811,85] cannot find symbol
[ERROR] symbol: variable KeepDeletedCells
[ERROR] location: class org.apache.hadoop.hbase.HColumnDescriptor
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
{code}
[jira] [Updated] (HBASE-12363) KEEP_DELETED_CELLS considered harmful?
[ https://issues.apache.org/jira/browse/HBASE-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-12363:
----------------------------------
Attachment: (was: 12363-master.txt)
[jira] [Updated] (HBASE-12363) KEEP_DELETED_CELLS considered harmful?
[ https://issues.apache.org/jira/browse/HBASE-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-12363: -- Attachment: 12363-master.txt Whoops... Correct version this time. KEEP_DELETED_CELLS considered harmful? -- Key: HBASE-12363 URL: https://issues.apache.org/jira/browse/HBASE-12363 Project: HBase Issue Type: Sub-task Components: regionserver Reporter: Lars Hofhansl Assignee: Lars Hofhansl Labels: Phoenix Attachments: 12363-master.txt, 12363-test.txt Brainstorming... This morning in the train (of all places) I realized a fundamental issue in how KEEP_DELETED_CELLS is implemented. The problem is around knowing when it is safe to remove a delete marker (we cannot remove it unless all cells affected by it are remove otherwise). This was particularly hard for family marker, since they sort before all cells of a row, and hence scanning forward through an HFile you cannot know whether the family markers are still needed until at least the entire row is scanned. My solution was to keep the TS of the oldest put in any given HFile, and only remove delete markers older than that TS. That sounds good on the face of it... But now imagine you wrote a version of ROW 1 and then never update it again. Then later you write a billion other rows and delete them all. Since the TS of the cells in ROW 1 is older than all the delete markers for the other billion rows, these will never be collected... At least for the region that hosts ROW 1 after a major compaction. Note, in a sense that is what HBase is supposed to do when keeping deleted cells: Keep them until they would be removed by some other means (for example TTL, or MAX_VERSION when new versions are inserted). The specific problem here is that even as all KVs affected by a delete marker are expired this way the marker would not be removed if there just one older KV in the HStore. I don't see a good way out of this. 
In parent I outlined these four solutions: So there are four options, I think: # Only allow the new flag to be set on CFs with TTL set. MIN_VERSIONS would not apply to deleted rows or delete marker rows (we wouldn't know how long to keep family deletes in that case). (MAX)VERSIONS would still be enforced on all row types except for family delete markers. # Translate family delete markers to column delete markers at (major) compaction time. # Change HFileWriterV* to keep track of the earliest put TS in a store and write it to the file metadata. Use that to expire delete markers that are older and hence can't affect any puts in the file. # Have Store.java keep track of the earliest put in internalFlushCache and compactStore and then append it to the file metadata. That way HFileWriterV* would not need to know about KVs. And I implemented #4. I'd love to get input on ideas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
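Option #4 above can be sketched roughly as follows. This is a minimal, hypothetical illustration of the idea (tracking the earliest put timestamp while writing a store file, then using it to decide marker removal), not the actual Store.java or HFileWriterV* code:

```java
// Sketch of option #4 above: track the earliest put timestamp while writing
// a store file, then use it to decide whether a delete marker can be dropped.
// Class and method names are illustrative, not the actual Store.java code.
public class EarliestPutTracker {
    private long earliestPutTs = Long.MAX_VALUE;

    // Called for every put cell appended to the file being written
    // (e.g. from internalFlushCache or compactStore).
    public void trackPut(long ts) {
        if (ts < earliestPutTs) {
            earliestPutTs = ts;
        }
    }

    // This value would be appended to the store file's metadata at close time.
    public long getEarliestPutTs() {
        return earliestPutTs;
    }

    // A delete marker strictly older than every put in the store cannot mask
    // anything in it, so a major compaction may drop the marker.
    public boolean canDropDeleteMarker(long deleteMarkerTs) {
        return deleteMarkerTs < earliestPutTs;
    }

    public static void main(String[] args) {
        EarliestPutTracker tracker = new EarliestPutTracker();
        tracker.trackPut(100L);
        tracker.trackPut(50L);
        // The ROW 1 scenario from the description: one very old put (ts=50)
        // keeps every newer delete marker (e.g. ts=60) alive forever.
        System.out.println(tracker.canDropDeleteMarker(40L)); // true
        System.out.println(tracker.canDropDeleteMarker(60L)); // false
    }
}
```

Note how the sketch also exhibits the pathology the description complains about: a single old put pins all newer delete markers in the store.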
[jira] [Commented] (HBASE-12363) KEEP_DELETED_CELLS considered harmful?
[ https://issues.apache.org/jira/browse/HBASE-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193026#comment-14193026 ] Hadoop QA commented on HBASE-12363: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678670/12363-master.txt against trunk revision . ATTACHMENT ID: 12678670 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 27 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 3784 checkstyle errors (more than the trunk's current 3781 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings). {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: +return setValue(KEEP_DELETED_CELLS, (keepDeletedCells ? KeepDeletedCells.TRUE : KeepDeletedCells.FALSE).toString()); +this.keepDeletedCells = scan.isRaw() ? KeepDeletedCells.TRUE : isUserScan ? 
KeepDeletedCells.FALSE : scanInfo.getKeepDeletedCells(); +this.seePastDeleteMarkers = scanInfo.getKeepDeletedCells() != KeepDeletedCells.FALSE isUserScan; +ScanInfo scanInfo = new ScanInfo(null, 0, 1, HConstants.LATEST_TIMESTAMP, KeepDeletedCells.FALSE, + family.setKeepDeletedCells(org.apache.hadoop.hbase.KeepDeletedCells.valueOf(arg.delete(org.apache.hadoop.hbase.HColumnDescriptor::KEEP_DELETED_CELLS).to_s.upcase)) if arg.include?(org.apache.hadoop.hbase.HColumnDescriptor::KEEP_DELETED_CELLS) {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11555//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/patchReleaseAuditWarnings.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11555//console This message is automatically generated. KEEP_DELETED_CELLS considered harmful? -- Key: HBASE-12363 URL: https://issues.apache.org/jira/browse/HBASE-12363 Project: HBase Issue Type: Sub-task Components: regionserver Reporter: Lars Hofhansl Assignee: Lars Hofhansl Labels: Phoenix Attachments: 12363-master.txt, 12363-test.txt
[jira] [Reopened] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091
[ https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dima Spivak reopened HBASE-12285: - Lots of failing builds recently with {{Stream Closed}} being replaced with {code} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18-SNAPSHOT:test (secondPartTestsExecution) on project hbase-server: There was a timeout or other error in the fork - [Help 1] {code} since we switched to Surefire 2.18-SNAPSHOT. I'm also still bothered by not being able to answer [~stack]'s question of why this was only hitting branch-1 (even when using the known-faulty 2.17 version), so I'm reopening this. Builds are failing, possibly because of SUREFIRE-1091 - Key: HBASE-12285 URL: https://issues.apache.org/jira/browse/HBASE-12285 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dima Spivak Assignee: Dima Spivak Priority: Blocker Fix For: 2.0.0, 0.99.2 Attachments: HBASE-12285_branch-1_v1.patch, HBASE-12285_branch-1_v1.patch Our branch-1 builds on builds.apache.org have been failing in recent days after we switched over to an official version of Surefire a few days back (HBASE-4955). The version we're using, 2.17, is hit by a bug ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results in an IOException, which looks like what we're seeing on Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12405) WAL accounting by Store
[ https://issues.apache.org/jira/browse/HBASE-12405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangduo updated HBASE-12405: - Description: HBASE-10201 has made flush decisions per Store, but has not done enough work on HLog, so there are two problems: 1. We record minSeqId both in HRegion and FSHLog, which is a duplication. 2. There may be holes in WAL accounting. For example, assume family A with sequence ids 1 and 3, and family B with seqId 2. If we flush family A, we can only record that the WAL before sequence id 1 can be removed safely. If we do a replay at this point, sequence id 3 will also be replayed, which is unnecessary. was: HBASE-10201 has made flush decisions per Store, but has not done enough work on HLog, so there are two problems: 1. We record minSeqId both in HRegion and FSHLog, which is a duplication. 2. There may be holes in WAL accounting. For example, assume family A with sequence ids 1 and 3, and family B with seqId 2. If we flush family A, we can only record that the WAL before sequence id 1 can be removed safely. If we do a replay at this point, sequence id 4 will also be replayed, which is unnecessary. WAL accounting by Store --- Key: HBASE-12405 URL: https://issues.apache.org/jira/browse/HBASE-12405 Project: HBase Issue Type: Improvement Components: wal Reporter: zhangduo Assignee: zhangduo HBASE-10201 has made flush decisions per Store, but has not done enough work on HLog, so there are two problems: 1. We record minSeqId both in HRegion and FSHLog, which is a duplication. 2. There may be holes in WAL accounting. For example, assume family A with sequence ids 1 and 3, and family B with seqId 2. If we flush family A, we can only record that the WAL before sequence id 1 can be removed safely. If we do a replay at this point, sequence id 3 will also be replayed, which is unnecessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
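The per-Store accounting the issue asks for can be sketched as follows. This is an illustrative shape under assumed names (PerStoreWalAccounting and its methods are hypothetical, not the actual HRegion/FSHLog code): track the lowest unflushed sequence id per family and derive the safe WAL truncation point from the minimum across families.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch, not the actual HRegion/FSHLog code: track the lowest
// unflushed sequence id per family and derive the WAL truncation point.
public class PerStoreWalAccounting {
    private final Map<String, Long> lowestUnflushedSeqId = new HashMap<>();

    // Record a WAL append for a family; keep the lowest unflushed seqId.
    public void append(String family, long seqId) {
        lowestUnflushedSeqId.merge(family, seqId, Math::min);
    }

    // A flush persists all of the family's edits; its entry goes away.
    public void flushed(String family) {
        lowestUnflushedSeqId.remove(family);
    }

    // WAL entries strictly below this sequence id can be archived safely.
    public long safeTruncationPoint(long nextSeqId) {
        return lowestUnflushedSeqId.values().stream()
                .mapToLong(Long::longValue).min().orElse(nextSeqId);
    }

    public static void main(String[] args) {
        PerStoreWalAccounting acc = new PerStoreWalAccounting();
        acc.append("A", 1L);
        acc.append("B", 2L);
        acc.append("A", 3L);
        acc.flushed("A");
        // B still holds seqId 2, so only entries below 2 are removable.
        // A crash replay starting at 2 would also replay A's seqId-3 edit,
        // which was already flushed -- the accounting hole described above.
        System.out.println(acc.safeTruncationPoint(4L)); // 2
    }
}
```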
[jira] [Updated] (HBASE-12393) The regionserver web will throw exception if we disable block cache
[ https://issues.apache.org/jira/browse/HBASE-12393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ChiaPing Tsai updated HBASE-12393: -- Labels: patch (was: ) Status: Patch Available (was: Open) To avoid invoking a disabled block cache's methods, we use an additional statement (else if) to evaluate the value of the block cache. If the block cache is null, the block cache stats web page will display "Block Cache is disabled". The regionserver web will throw exception if we disable block cache --- Key: HBASE-12393 URL: https://issues.apache.org/jira/browse/HBASE-12393 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.7 Environment: ubuntu 12.04 64bits, hadoop-2.2.0, hbase-0.98.7-hadoop2 Reporter: ChiaPing Tsai Priority: Minor Labels: patch Attachments: HBASE-12393.patch CacheConfig.getBlockCache() will return null when we set hfile.block.cache.size to zero. This causes BlockCacheTmplImpl.java:123 to throw a NullPointerException. {code} org.jamon.escaping.Escaping.HTML.write(org.jamon.emit.StandardEmitter.valueOf(StringUtils.humanReadableInt(cacheConfig.getBlockCache().size())), jamonWriter); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
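The guard described above amounts to a null check before any BlockCache method is called. A minimal sketch of that shape (class and method names are hypothetical; the real fix lives in the BlockCacheTmpl Jamon template, not a plain class like this):

```java
// Minimal sketch of the null guard described above. Names are hypothetical;
// the real fix lives in the auto-generated BlockCacheTmplImpl Jamon code.
public class BlockCacheStatsRenderer {
    public static String render(Object blockCache) {
        if (blockCache == null) {
            // hfile.block.cache.size=0 disables the cache, so there are no
            // stats to render -- show a message instead of throwing an NPE.
            return "Block Cache is disabled";
        }
        return "Block Cache stats: " + blockCache;
    }

    public static void main(String[] args) {
        System.out.println(render(null)); // Block Cache is disabled
    }
}
```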
[jira] [Commented] (HBASE-12393) The regionserver web will throw exception if we disable block cache
[ https://issues.apache.org/jira/browse/HBASE-12393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193136#comment-14193136 ] Hadoop QA commented on HBASE-12393: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678584/HBASE-12393.patch against trunk revision . ATTACHMENT ID: 12678584 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 3782 checkstyle errors (more than the trunk's current 3781 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11556//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11556//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11556//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11556//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11556//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11556//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11556//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11556//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11556//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11556//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11556//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11556//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11556//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11556//console This message is automatically generated. 
The regionserver web will throw exception if we disable block cache --- Key: HBASE-12393 URL: https://issues.apache.org/jira/browse/HBASE-12393 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.7 Environment: ubuntu 12.04 64bits, hadoop-2.2.0, hbase-0.98.7-hadoop2 Reporter: ChiaPing Tsai Priority: Minor Labels: patch Attachments: HBASE-12393.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12406) Bulk load fails in 0.98 against hadoop-1 due to unmatched family name
[ https://issues.apache.org/jira/browse/HBASE-12406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193181#comment-14193181 ] Anoop Sam John commented on HBASE-12406: Any other such 'to be excluded' dirs? Ping [~ashish singhi] Bulk load fails in 0.98 against hadoop-1 due to unmatched family name - Key: HBASE-12406 URL: https://issues.apache.org/jira/browse/HBASE-12406 Project: HBase Issue Type: Bug Reporter: Ted Yu Fix For: 0.98.8 Attachments: 12406-0.98-v1.txt From https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/614/testReport/org.apache.hadoop.hbase.mapreduce/TestCopyTable/testCopyTableWithBulkload/ : {code} java.io.IOException: Unmatched family names found: unmatched family names in HFiles to be bulkloaded: [_logs]; valid family names of table testCopyTable2 are: [family] at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:268) at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.run(LoadIncrementalHFiles.java:907) at org.apache.hadoop.hbase.mapreduce.CopyTable.run(CopyTable.java:344) {code} The above failure was due to the presence of history directory under _logs directory. e.g. {code} hdfs://nn:59313/user/tyu/copytable/4282249372082687850/_logs/history {code} HBASE-12375 removed check for directory name which starts with underscore -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12393) The regionserver web will throw exception if we disable block cache
[ https://issues.apache.org/jira/browse/HBASE-12393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193208#comment-14193208 ] ChiaPing Tsai commented on HBASE-12393: --- {quote} -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {quote} No new UT was added since this is only a message change in the UI. The manual steps are shown below: # Set hfile.block.cache.size to zero. # Open the RegionServer UI; there is no NullPointerException anymore. # Click on Stats of the Block Cache; the message "Block Cache is disabled" will appear. {quote} -1 checkstyle. The applied patch generated 3782 checkstyle errors (more than the trunk's current 3781 errors). {quote} BlockCacheTmplImpl.java is the auto-generated Jamon implementation. The whitespace errors are due to the code style of Jamon. The regionserver web will throw exception if we disable block cache --- Key: HBASE-12393 URL: https://issues.apache.org/jira/browse/HBASE-12393 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.7 Environment: ubuntu 12.04 64bits, hadoop-2.2.0, hbase-0.98.7-hadoop2 Reporter: ChiaPing Tsai Priority: Minor Labels: patch Attachments: HBASE-12393.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-12406) Bulk load fails in 0.98 against hadoop-1 due to unmatched family name
[ https://issues.apache.org/jira/browse/HBASE-12406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-12406: -- Assignee: Ted Yu Bulk load fails in 0.98 against hadoop-1 due to unmatched family name - Key: HBASE-12406 URL: https://issues.apache.org/jira/browse/HBASE-12406 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.8 Attachments: 12406-0.98-v1.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12363) KEEP_DELETED_CELLS considered harmful?
[ https://issues.apache.org/jira/browse/HBASE-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193213#comment-14193213 ] Ted Yu commented on HBASE-12363: What if a table with KEEP_DELETED_CELLS set to TTL is exported to a cluster which is running an older release? Would the exported table be parsed correctly? KEEP_DELETED_CELLS considered harmful? -- Key: HBASE-12363 URL: https://issues.apache.org/jira/browse/HBASE-12363 Project: HBase Issue Type: Sub-task Components: regionserver Reporter: Lars Hofhansl Assignee: Lars Hofhansl Labels: Phoenix Attachments: 12363-master.txt, 12363-test.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12403) IntegrationTestMTTR flaky due to aggressive RS restart timeout
[ https://issues.apache.org/jira/browse/HBASE-12403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-12403: - Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to 0.98+. Thanks folks. IntegrationTestMTTR flaky due to aggressive RS restart timeout -- Key: HBASE-12403 URL: https://issues.apache.org/jira/browse/HBASE-12403 Project: HBase Issue Type: Test Components: integration tests Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12403.00.patch TL;DR: the CM RestartRS action timeout is only 60 seconds. Considering the RS must connect to the Master before it can be online, this is not enough time in an environment where the Master can also be killed. The failure from the console says the test failed because a RestartRsHoldingMetaAction timed out. {noformat} Caused by: java.io.IOException: did timeout waiting for region server to start:ip-172-31-42-248.ec2.internal at org.apache.hadoop.hbase.HBaseCluster.waitForRegionServerToStart(HBaseCluster.java:153) at org.apache.hadoop.hbase.chaos.actions.Action.startRs(Action.java:93) at org.apache.hadoop.hbase.chaos.actions.RestartActionBaseAction.restartRs(RestartActionBaseAction.java:52) at org.apache.hadoop.hbase.chaos.actions.RestartRsHoldingMetaAction.perform(RestartRsHoldingMetaAction.java:38) at org.apache.hadoop.hbase.mttr.IntegrationTestMTTR$ActionCallable.call(IntegrationTestMTTR.java:559) at org.apache.hadoop.hbase.mttr.IntegrationTestMTTR$ActionCallable.call(IntegrationTestMTTR.java:550) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} This is only reported at the end of the test run. There's no indication as to when during the test run this failure happened. 
The timeout on the start RS operation is 60 seconds. Hacking out the start/stop messages from the logs during the time window when this test ran, it appears that at one point the RS took 2min 12s between when it was launched and when it reported for duty: {noformat} Fri Oct 31 14:53:17 UTC 2014 Starting regionserver on ip-172-31-42-248 2014-10-31 14:55:29,049 INFO [regionserver60020] regionserver.HRegionServer: Serving as ip-172-31-42-248.ec2.internal,60020,1414767238992, RpcServer on ip-172-31-42-248.ec2.internal/172.31.42.248:60020, sessionid=0x249661c2b7b0118 {noformat} The RS came up without incident. It spent 1min 4s of that time waiting on the master to start, and attempted to report for duty from 14:54:28 to 14:55:24. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12394) Support multiple regions as input to each mapper in map/reduce jobs
[ https://issues.apache.org/jira/browse/HBASE-12394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193306#comment-14193306 ] Ted Yu commented on HBASE-12394: Mind putting the patch on reviewboard? hbase.mapreduce.scan.regionspermapper controls how many mappers would be used. Have you considered specifying the number of mappers for this feature? Thanks Support multiple regions as input to each mapper in map/reduce jobs --- Key: HBASE-12394 URL: https://issues.apache.org/jira/browse/HBASE-12394 Project: HBase Issue Type: Improvement Components: mapreduce Affects Versions: 2.0.0, 0.98.6.1 Reporter: Weichen Ye Attachments: HBASE-12394.patch For a Hadoop cluster, a job with a large HBase table as input always consumes a large amount of computing resources. For example, we need to create a job with 1000 mappers to scan a table with 1000 regions. This patch is to support one mapper using multiple regions as input. The following new files are included in this patch: TableMultiRegionInputFormat.java TableMultiRegionInputFormatBase.java TableMultiRegionMapReduceUtil.java *TestTableMultiRegionInputFormatScan1.java *TestTableMultiRegionInputFormatScan2.java *TestTableMultiRegionInputFormatScanBase.java *TestTableMultiRegionMapReduceUtil.java The files starting with * are tests. In order to support multiple regions for one mapper, we need a new configuration property: hbase.mapreduce.scan.regionspermapper. This is an example, which means each mapper has 3 regions as input: {code} <property> <name>hbase.mapreduce.scan.regionspermapper</name> <value>3</value> </property> {code} This is an example for Java code: TableMultiRegionMapReduceUtil.initTableMapperJob(tablename, scan, Map.class, Text.class, Text.class, job); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
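The grouping the patch describes amounts to packing hbase.mapreduce.scan.regionspermapper consecutive regions into one input split. A minimal sketch of that packing logic (hypothetical class and method names, not the actual TableMultiRegionInputFormat code):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the region-to-split packing described above. Names are
// hypothetical, not the actual TableMultiRegionInputFormatBase code.
public class RegionGrouper {
    public static List<List<String>> group(List<String> regions, int regionsPerMapper) {
        List<List<String>> splits = new ArrayList<>();
        for (int i = 0; i < regions.size(); i += regionsPerMapper) {
            int end = Math.min(i + regionsPerMapper, regions.size());
            // Each split (one mapper) scans a run of consecutive regions.
            splits.add(new ArrayList<>(regions.subList(i, end)));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> regions = List.of("r1", "r2", "r3", "r4", "r5", "r6", "r7");
        // With regionspermapper=3, seven regions collapse into three mappers.
        System.out.println(group(regions, 3).size()); // 3
    }
}
```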
[jira] [Updated] (HBASE-12399) Master startup race between metrics and RpcServer
[ https://issues.apache.org/jira/browse/HBASE-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-12399: - Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to 0.98+ Master startup race between metrics and RpcServer - Key: HBASE-12399 URL: https://issues.apache.org/jira/browse/HBASE-12399 Project: HBase Issue Type: Bug Components: master Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: 12399.patch, HBASE-12399.00.patch Seeing this on CM tests with frequent master thrashing {noformat} 2014-10-31 12:01:59,196 ERROR [Timer for 'HBase' metrics system] impl.MetricsSourceAdapter: Error getting metrics from source IPC,sub=IPC java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.FifoRpcScheduler.getGeneralQueueLength(FifoRpcScheduler.java:81) at org.apache.hadoop.hbase.ipc.MetricsHBaseServerWrapperImpl.getGeneralQueueLength(MetricsHBaseServerWrapperImpl.java:43) at org.apache.hadoop.hbase.ipc.MetricsHBaseServerSourceImpl.getMetrics(MetricsHBaseServerSourceImpl.java:117) at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:195) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.snapshotMetrics(MetricsSystemImpl.java:419) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.sampleMetrics(MetricsSystemImpl.java:406) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.onTimerEvent(MetricsSystemImpl.java:382) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$4.run(MetricsSystemImpl.java:369) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
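The NPE above fires because the metrics timer can run before the Master's RpcServer (and its scheduler) exist. The usual remedy is a null-tolerant wrapper that returns a default until startup completes; a sketch of that shape (illustrative names only, not the committed MetricsHBaseServerWrapperImpl patch):

```java
// Sketch of a null-tolerant metrics wrapper: if the metrics timer fires
// before the RpcServer's scheduler exists, return a default instead of
// dereferencing null. Illustrative shape, not the committed patch.
public class NullSafeMetricsWrapper {
    interface Scheduler {
        int getGeneralQueueLength();
    }

    private volatile Scheduler scheduler; // set once the RpcServer is up

    public void setScheduler(Scheduler s) {
        this.scheduler = s;
    }

    public int getGeneralQueueLength() {
        Scheduler s = scheduler; // read the field once; may be null at startup
        return s == null ? 0 : s.getGeneralQueueLength();
    }

    public static void main(String[] args) {
        NullSafeMetricsWrapper wrapper = new NullSafeMetricsWrapper();
        System.out.println(wrapper.getGeneralQueueLength()); // 0 before startup, no NPE
        wrapper.setScheduler(() -> 5);
        System.out.println(wrapper.getGeneralQueueLength()); // 5
    }
}
```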
[jira] [Commented] (HBASE-12402) ZKPermissionWatcher race condition in refreshing the cache leaving stale ACLs and causing AccessDenied
[ https://issues.apache.org/jira/browse/HBASE-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193372#comment-14193372 ] Enis Soztutar commented on HBASE-12402: --- I have run IntegrationTestIngest with CM on a cluster of 4 nodes 10 times to test the change. It seems good to go. ZKPermissionWatcher race condition in refreshing the cache leaving stale ACLs and causing AccessDenied -- Key: HBASE-12402 URL: https://issues.apache.org/jira/browse/HBASE-12402 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: hbase-12402_v1.patch In testing, we have seen an issue where a region in a newly created table will throw AccessDeniedException. There seems to be a race condition in the ZKPermissionWatcher when it is just starting up and a new table is created around the same time. The master has just created the table, and adds permissions to the acl table: {code} 2014-10-30 19:21:26,494 DEBUG [MASTER_TABLE_OPERATIONS-ip-172-31-32-87:6-0] access.AccessControlLists: Writing permission with rowKey loadtest_d1 hrt_qa: RWXCA {code} One of the region servers is just starting: {code} Thu Oct 30 19:21:11 UTC 2014 Starting regionserver on ip-172-31-32-90 2014-10-30 19:21:13,915 INFO [main] util.VersionInfo: HBase 0.98.4.2.2.0.0-1194-hadoop2 {code} The node creation event is received: {code} 2014-10-30 19:21:26,764 DEBUG [regionserver60020-EventThread] access.ZKPermissionWatcher: Updating permissions cache from node loadtest_d1 with data: PBUF\x0A0\x0A\x06hrt_qa\x12\x08\x03\x0A\x16\x0A\x07default\x12\x0Bloadtest_d1 \x00 \x01 \x02 \x03 \x04 {code} which puts the right data into the cache, only to be invalidated shortly after: {code} ... 
2014-10-30 19:21:26,855 DEBUG [RS_OPEN_REGION-ip-172-31-32-90:60020-1] access.ZKPermissionWatcher: Updating permissions cache from node tabletwo_copytable_cell_versions_two with data: PBUF\x0AI\x0A\x06hrt_qa\x12?\x08\x03;\x0A/\x0A\x07default\x12$tabletwo_copytable_cell_versions_two \x00 \x01 \x02 \x03 \x04 2014-10-30 19:21:26,856 DEBUG [RS_OPEN_REGION-ip-172-31-32-90:60020-1] access.ZKPermissionWatcher: Updating permissions cache from node loadtest_d1 with data: PBUF 2014-10-30 19:21:26,856 DEBUG [RS_OPEN_REGION-ip-172-31-32-90:60020-1] access.ZKPermissionWatcher: Updating permissions cache from node tablefour_cell_version_snapshots_copy with data: PBUF ... {code} Notice that the threads are different: the first one is the ZK event notification thread, while the other is the thread from the OpenRegionHandler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
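One way to see the stale-overwrite hazard above is to tag each cached entry with the znode version and refuse to apply an older update over a newer one. This is an illustrative sketch only (hypothetical class; the actual ZKPermissionWatcher fix serializes refreshes differently):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of one way to avoid the stale-overwrite race described above: tag
// each cached entry with the znode version and ignore older updates.
// Illustrative only, not the actual ZKPermissionWatcher fix.
public class VersionedPermissionCache {
    private static final class Entry {
        final int version;
        final String perms;
        Entry(int version, String perms) {
            this.version = version;
            this.perms = perms;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    // Returns true if applied, false if the update was stale and ignored.
    public boolean update(String table, String perms, int znodeVersion) {
        Entry applied = cache.merge(table, new Entry(znodeVersion, perms),
                (oldE, newE) -> oldE.version >= newE.version ? oldE : newE);
        return applied.version == znodeVersion;
    }

    public String get(String table) {
        Entry e = cache.get(table);
        return e == null ? null : e.perms;
    }

    public static void main(String[] args) {
        VersionedPermissionCache cache = new VersionedPermissionCache();
        cache.update("loadtest_d1", "hrt_qa: RWXCA", 2); // ZK event thread, newer
        cache.update("loadtest_d1", "", 1);              // open-region thread, stale
        System.out.println(cache.get("loadtest_d1"));    // hrt_qa: RWXCA
    }
}
```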
[jira] [Commented] (HBASE-12399) Master startup race between metrics and RpcServer
[ https://issues.apache.org/jira/browse/HBASE-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193376#comment-14193376 ] Hudson commented on HBASE-12399: SUCCESS: Integrated in HBase-TRUNK #5735 (See [https://builds.apache.org/job/HBase-TRUNK/5735/]) HBASE-12399 Master startup race between metrics and RpcServer (ndimiduk: rev b5764a8e74179bfc0c09a416d51271116b903c2c) * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/MetricsHBaseServerWrapperImpl.java Master startup race between metrics and RpcServer - Key: HBASE-12399 URL: https://issues.apache.org/jira/browse/HBASE-12399 Project: HBase Issue Type: Bug Components: master Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: 12399.patch, HBASE-12399.00.patch Seeing this on CM tests with frequent master thrashing {noformat} 2014-10-31 12:01:59,196 ERROR [Timer for 'HBase' metrics system] impl.MetricsSourceAdapter: Error getting metrics from source IPC,sub=IPC java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.FifoRpcScheduler.getGeneralQueueLength(FifoRpcScheduler.java:81) at org.apache.hadoop.hbase.ipc.MetricsHBaseServerWrapperImpl.getGeneralQueueLength(MetricsHBaseServerWrapperImpl.java:43) at org.apache.hadoop.hbase.ipc.MetricsHBaseServerSourceImpl.getMetrics(MetricsHBaseServerSourceImpl.java:117) at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:195) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.snapshotMetrics(MetricsSystemImpl.java:419) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.sampleMetrics(MetricsSystemImpl.java:406) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.onTimerEvent(MetricsSystemImpl.java:382) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$4.run(MetricsSystemImpl.java:369) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) {noformat} -- This message was sent by Atlassian 
JIRA (v6.3.4#6332)
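The NPE arises because the metrics timer can poll the wrapper before the RpcServer has assigned its scheduler. A sketch of the defensive pattern (illustrative stand-in classes, not the actual MetricsHBaseServerWrapperImpl code) is to null-check the partially constructed server and report a neutral value instead of throwing:

```java
public class SafeMetricsWrapper {
    static class RpcScheduler {
        int getGeneralQueueLength() { return 7; }  // stand-in metric value
    }

    static class RpcServer {
        volatile RpcScheduler scheduler;  // may still be null during startup
    }

    private final RpcServer server;

    SafeMetricsWrapper(RpcServer server) {
        this.server = server;
    }

    int getGeneralQueueLength() {
        // A metrics timer can fire in the window between server construction
        // and scheduler assignment, so guard instead of dereferencing blindly.
        if (server == null || server.scheduler == null) {
            return 0;  // neutral value rather than an NPE in the metrics thread
        }
        return server.scheduler.getGeneralQueueLength();
    }
}
```

Once the scheduler is assigned, the getter transparently starts reporting real values; no synchronization is needed beyond the volatile field.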
[jira] [Commented] (HBASE-12403) IntegrationTestMTTR flaky due to aggressive RS restart timeout
[ https://issues.apache.org/jira/browse/HBASE-12403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193375#comment-14193375 ] Hudson commented on HBASE-12403: SUCCESS: Integrated in HBase-TRUNK #5735 (See [https://builds.apache.org/job/HBase-TRUNK/5735/]) HBASE-12403 IntegrationTestMTTR flaky due to aggressive RS restart timeout (ndimiduk: rev 3c06b48181e22eb4ce91d6d8a455a1617f13d85f) * hbase-it/src/test/java/org/apache/hadoop/hbase/mttr/IntegrationTestMTTR.java * hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/Action.java IntegrationTestMTTR flaky due to aggressive RS restart timeout -- Key: HBASE-12403 URL: https://issues.apache.org/jira/browse/HBASE-12403 Project: HBase Issue Type: Test Components: integration tests Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12403.00.patch TL;DR: the CM RestartRS action timeout is only 60 seconds. Considering the RS must connect to the Master before it can be online, this is not enough time in an environment where the Master can also be killed. Failure from the console says the test failed because a RestartRsHoldingMetaAction timed out. 
{noformat} Caused by: java.io.IOException: did timeout waiting for region server to start:ip-172-31-42-248.ec2.internal at org.apache.hadoop.hbase.HBaseCluster.waitForRegionServerToStart(HBaseCluster.java:153) at org.apache.hadoop.hbase.chaos.actions.Action.startRs(Action.java:93) at org.apache.hadoop.hbase.chaos.actions.RestartActionBaseAction.restartRs(RestartActionBaseAction.java:52) at org.apache.hadoop.hbase.chaos.actions.RestartRsHoldingMetaAction.perform(RestartRsHoldingMetaAction.java:38) at org.apache.hadoop.hbase.mttr.IntegrationTestMTTR$ActionCallable.call(IntegrationTestMTTR.java:559) at org.apache.hadoop.hbase.mttr.IntegrationTestMTTR$ActionCallable.call(IntegrationTestMTTR.java:550) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} This is only reported at the end of the test run; there's no indication as to when during the run the failure happened. The timeout on the start-RS operation is 60 seconds. Hacking out the start/stop messages from the logs during the time window when this test ran, it appears that at one point the RS took 2min 12s between when it was launched and when it reported for duty: {noformat} Fri Oct 31 14:53:17 UTC 2014 Starting regionserver on ip-172-31-42-248 2014-10-31 14:55:29,049 INFO [regionserver60020] regionserver.HRegionServer: Serving as ip-172-31-42-248.ec2.internal,60020,1414767238992, RpcServer on ip-172-31-42-248.ec2.internal/172.31.42.248:60020, sessionid=0x249661c2b7b0118 {noformat} The RS came up without incident. It spent 1min 4s of that time waiting on the master to start, then attempted to report for duty from 14:54:28 to 14:55:24. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
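The timeout logic being tuned here boils down to a poll-until-deadline loop. A minimal sketch in the spirit of HBaseCluster.waitForRegionServerToStart (the names and signature are assumptions, not the real API) shows where the 60-second knob lives:

```java
import java.util.function.BooleanSupplier;

public class WaitForStartup {
    // Polls the liveness check until it succeeds or the deadline passes.
    static boolean waitFor(BooleanSupplier isOnline, long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (isOnline.getAsBoolean()) {
                return true;
            }
            Thread.sleep(pollMs);
        }
        // The caller turns false into an IOException; raising timeoutMs is
        // the adjustment this issue makes, since a restarting RS may
        // legitimately block waiting on the Master before going online.
        return false;
    }
}
```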
[jira] [Commented] (HBASE-12399) Master startup race between metrics and RpcServer
[ https://issues.apache.org/jira/browse/HBASE-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193408#comment-14193408 ] Hudson commented on HBASE-12399: FAILURE: Integrated in HBase-1.0 #405 (See [https://builds.apache.org/job/HBase-1.0/405/]) HBASE-12399 Master startup race between metrics and RpcServer (ndimiduk: rev c3a7f2f3bbb2a12bfffeff6d181e619a1545c41a) * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/MetricsHBaseServerWrapperImpl.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12403) IntegrationTestMTTR flaky due to aggressive RS restart timeout
[ https://issues.apache.org/jira/browse/HBASE-12403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193407#comment-14193407 ] Hudson commented on HBASE-12403: FAILURE: Integrated in HBase-1.0 #405 (See [https://builds.apache.org/job/HBase-1.0/405/]) HBASE-12403 IntegrationTestMTTR flaky due to aggressive RS restart timeout (ndimiduk: rev 687710eb2869817952461796d04e35de29a98fdb) * hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/Action.java * hbase-it/src/test/java/org/apache/hadoop/hbase/mttr/IntegrationTestMTTR.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12399) Master startup race between metrics and RpcServer
[ https://issues.apache.org/jira/browse/HBASE-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193437#comment-14193437 ] Hudson commented on HBASE-12399: FAILURE: Integrated in HBase-0.98 #647 (See [https://builds.apache.org/job/HBase-0.98/647/]) HBASE-12399 Master startup race between metrics and RpcServer (ndimiduk: rev da145ae2da11d0b59f47ca78bb26c166a84bf386) * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/MetricsHBaseServerWrapperImpl.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12403) IntegrationTestMTTR flaky due to aggressive RS restart timeout
[ https://issues.apache.org/jira/browse/HBASE-12403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193436#comment-14193436 ] Hudson commented on HBASE-12403: FAILURE: Integrated in HBase-0.98 #647 (See [https://builds.apache.org/job/HBase-0.98/647/]) HBASE-12403 IntegrationTestMTTR flaky due to aggressive RS restart timeout (ndimiduk: rev 414bed7197097db4e2ce638f46d9996fdfb305b1) * hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/Action.java * hbase-it/src/test/java/org/apache/hadoop/hbase/mttr/IntegrationTestMTTR.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12403) IntegrationTestMTTR flaky due to aggressive RS restart timeout
[ https://issues.apache.org/jira/browse/HBASE-12403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193450#comment-14193450 ] Hudson commented on HBASE-12403: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #615 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/615/]) HBASE-12403 IntegrationTestMTTR flaky due to aggressive RS restart timeout (ndimiduk: rev 414bed7197097db4e2ce638f46d9996fdfb305b1) * hbase-it/src/test/java/org/apache/hadoop/hbase/mttr/IntegrationTestMTTR.java * hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/Action.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12399) Master startup race between metrics and RpcServer
[ https://issues.apache.org/jira/browse/HBASE-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193451#comment-14193451 ] Hudson commented on HBASE-12399: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #615 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/615/]) HBASE-12399 Master startup race between metrics and RpcServer (ndimiduk: rev da145ae2da11d0b59f47ca78bb26c166a84bf386) * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/MetricsHBaseServerWrapperImpl.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12363) KEEP_DELETED_CELLS considered harmful?
[ https://issues.apache.org/jira/browse/HBASE-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193510#comment-14193510 ] Lars Hofhansl commented on HBASE-12363: --- Obviously that is not going to work. The old code would interpret that as not true (i.e. false) and have KEEP_DELETED_CELLS disabled. One would have to be aware of that before enabling the new feature. I also need to fix the long lines and put an interface annotation/comment/license into the KeepDeletedCells enum. KEEP_DELETED_CELLS considered harmful? -- Key: HBASE-12363 URL: https://issues.apache.org/jira/browse/HBASE-12363 Project: HBase Issue Type: Sub-task Components: regionserver Reporter: Lars Hofhansl Assignee: Lars Hofhansl Labels: Phoenix Attachments: 12363-master.txt, 12363-test.txt Brainstorming... This morning in the train (of all places) I realized a fundamental issue in how KEEP_DELETED_CELLS is implemented. The problem is around knowing when it is safe to remove a delete marker (we cannot remove it until all cells affected by it are removed). This was particularly hard for family markers, since they sort before all cells of a row, and hence, scanning forward through an HFile, you cannot know whether the family markers are still needed until at least the entire row is scanned. My solution was to keep the TS of the oldest put in any given HFile, and only remove delete markers older than that TS. That sounds good on the face of it... But now imagine you wrote a version of ROW 1 and then never updated it again. Then later you write a billion other rows and delete them all. Since the TS of the cells in ROW 1 is older than all the delete markers for the other billion rows, these will never be collected... At least for the region that hosts ROW 1, after a major compaction. 
Note, in a sense that is what HBase is supposed to do when keeping deleted cells: keep them until they would be removed by some other means (for example TTL, or MAX_VERSIONS when new versions are inserted). The specific problem here is that even when all KVs affected by a delete marker are expired this way, the marker would not be removed if there is just one older KV in the HStore. I don't see a good way out of this. In the parent issue I outlined these four options: # Only allow the new flag to be set on CFs with TTL set. MIN_VERSIONS would not apply to deleted rows or delete marker rows (we wouldn't know how long to keep family deletes in that case). (MAX)VERSIONS would still be enforced on all row types except for family delete markers. # Translate family delete markers to column delete markers at (major) compaction time. # Change HFileWriterV* to keep track of the earliest put TS in a store and write it to the file metadata. Use that to expire delete markers that are older and hence can't affect any puts in the file. # Have Store.java keep track of the earliest put in internalFlushCache and compactStore and then append it to the file metadata. That way HFileWriterV* would not need to know about KVs. And I implemented #4. I'd love to get input on ideas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
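Option #4 above can be sketched in a few lines: track the minimum put timestamp while writing a store file, and at compaction time purge only delete markers strictly older than it, since such a marker cannot mask any put in the file. This is an illustrative sketch under those assumptions, not the actual Store/HFileWriter code:

```java
public class EarliestPutTracker {
    private long earliestPutTs = Long.MAX_VALUE;

    // Called for every put written to the file (flush or compaction output).
    void onPut(long ts) {
        earliestPutTs = Math.min(earliestPutTs, ts);
    }

    // A delete marker affects only cells with a timestamp <= its own, so a
    // marker strictly older than every put in the file masks nothing and can
    // be purged even with KEEP_DELETED_CELLS enabled.
    boolean canPurgeDeleteMarker(long markerTs) {
        return markerTs < earliestPutTs;
    }
}
```

The pathological case in the description follows directly: one ancient put pins earliestPutTs low forever, so no later delete marker ever satisfies the purge condition.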
[jira] [Reopened] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors
[ https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack reopened HBASE-12219: --- Reverted branch-1 patch and addendum. Builds are unstable starting w/ this patch going in. I'm reverting till the build is back to stable again, then will put stuff back. Cache more efficiently getAll() and get() in FSTableDescriptors --- Key: HBASE-12219 URL: https://issues.apache.org/jira/browse/HBASE-12219 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.24, 0.99.1, 0.98.6.1 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Labels: scalability Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12219-0.98.patch, HBASE-12219-0.98.v1.patch, HBASE-12219-0.99.addendum.patch, HBASE-12219-0.99.patch, HBASE-12219-v1.patch, HBASE-12219-v1.patch, HBASE-12219.v0.txt, HBASE-12219.v2.patch, HBASE-12219.v3.patch, list.png Currently table descriptors and tables are cached once they are accessed for the first time. Subsequent calls to the master only require a trip to HDFS to look up the modified time in order to reload the table descriptors if they were modified. However, in clusters with a large number of tables or concurrent clients this can be too aggressive toward HDFS and the master, causing contention with other requests. A simple solution is a TTL-based cache for FSTableDescriptors#getAll() and FSTableDescriptors#TableDescriptorAndModtime() that allows the master to process those calls faster without causing contention and without a trip to HDFS for every call to listTables() or getTableDescriptor(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
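The TTL-based cache described above can be sketched generically: within the TTL window the cached value is returned directly, and the loader (standing in for the HDFS modtime lookup) runs only on a miss or after expiry. A minimal sketch with illustrative names, not the patch itself:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAt;
        Entry(V value, long loadedAt) { this.value = value; this.loadedAt = loadedAt; }
    }

    private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final long ttlMs;
    private final Function<K, V> loader;  // stands in for the HDFS lookup

    public TtlCache(long ttlMs, Function<K, V> loader) {
        this.ttlMs = ttlMs;
        this.loader = loader;
    }

    public V get(K key) {
        long now = System.currentTimeMillis();
        Entry<V> e = cache.get(key);
        if (e == null || now - e.loadedAt > ttlMs) {
            // Miss or expired: reload once and remember when we did.
            e = new Entry<>(loader.apply(key), now);
            cache.put(key, e);
        }
        return e.value;
    }
}
```

The trade-off is bounded staleness: a descriptor change can go unnoticed for up to the TTL, in exchange for collapsing a burst of lookups into one HDFS trip.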
[jira] [Created] (HBASE-12407) HConnectionKey doesn't contain CUSTOM_CONTROLLER_CONF_KEY in CONNECTION_PROPERTIES
Jeffrey Zhong created HBASE-12407: - Summary: HConnectionKey doesn't contain CUSTOM_CONTROLLER_CONF_KEY in CONNECTION_PROPERTIES Key: HBASE-12407 URL: https://issues.apache.org/jira/browse/HBASE-12407 Project: HBase Issue Type: Bug Affects Versions: 0.99.1, 0.98.7, 2.0.0 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong This causes an HTable instance created with a custom RpcControllerFactory.CUSTOM_CONTROLLER_CONF_KEY conf setting to internally use a cached connection that lacks this custom setting, because CUSTOM_CONTROLLER_CONF_KEY isn't part of HConnectionKey.CONNECTION_PROPERTIES. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
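Why the missing property matters can be shown with a simplified model of how HConnectionKey works: the key is derived only from a whitelist of properties, so two configurations that differ only in a property outside that list produce equal keys and share one cached connection. This is an illustrative sketch (the property names are assumptions for the example), not the real HConnectionKey code:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ConnKeySketch {
    // Hypothetical property list; the bug is that the controller-factory
    // property is absent, so it never influences the connection key.
    static final List<String> CONNECTION_PROPERTIES = Arrays.asList(
        "hbase.zookeeper.quorum"
        // the RPC-controller-factory property would need to be listed here
    );

    // Builds the cache key from only the whitelisted properties.
    static Map<String, String> keyOf(Map<String, String> conf) {
        Map<String, String> key = new TreeMap<>();
        for (String p : CONNECTION_PROPERTIES) {
            if (conf.containsKey(p)) {
                key.put(p, conf.get(p));
            }
        }
        return key;
    }
}
```

The fix is simply adding the missing property to the whitelist, so differing controller-factory settings hash to distinct keys and get distinct connections.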
[jira] [Updated] (HBASE-12407) HConnectionKey doesn't contain CUSTOM_CONTROLLER_CONF_KEY in CONNECTION_PROPERTIES
[ https://issues.apache.org/jira/browse/HBASE-12407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-12407: -- Attachment: HBASE-12407.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12407) HConnectionKey doesn't contain CUSTOM_CONTROLLER_CONF_KEY in CONNECTION_PROPERTIES
[ https://issues.apache.org/jira/browse/HBASE-12407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-12407: -- Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12407) HConnectionKey doesn't contain CUSTOM_CONTROLLER_CONF_KEY in CONNECTION_PROPERTIES
[ https://issues.apache.org/jira/browse/HBASE-12407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193593#comment-14193593 ] Ted Yu commented on HBASE-12407: +1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12407) HConnectionKey doesn't contain CUSTOM_CONTROLLER_CONF_KEY in CONNECTION_PROPERTIES
[ https://issues.apache.org/jira/browse/HBASE-12407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193597#comment-14193597 ] Enis Soztutar commented on HBASE-12407: --- This looks good. Remember that cached/managed connections are going away, so we should switch to using the new style of connections in Phoenix in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors
[ https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193614#comment-14193614 ] Hudson commented on HBASE-12219: SUCCESS: Integrated in HBase-1.0 #406 (See [https://builds.apache.org/job/HBase-1.0/406/]) HBASE-12219 Cache more efficiently getAll() and get() in FSTableDescriptors; REVERTgit log! branch-1 patch AND addendum (stack: rev 0aca51e89cd0fe69d9cd57648949df5c5b506c53) * hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java * hbase-server/src/main/java/org/apache/hadoop/hbase/TableDescriptors.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java Cache more efficiently getAll() and get() in FSTableDescriptors --- Key: HBASE-12219 URL: https://issues.apache.org/jira/browse/HBASE-12219 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.24, 0.99.1, 0.98.6.1 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Labels: scalability Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12219-0.98.patch, HBASE-12219-0.98.v1.patch, HBASE-12219-0.99.addendum.patch, HBASE-12219-0.99.patch, HBASE-12219-v1.patch, HBASE-12219-v1.patch, HBASE-12219.v0.txt, HBASE-12219.v2.patch, HBASE-12219.v3.patch, list.png Currently table descriptors and tables are cached once they are accessed for the first time. Next calls to the master only require a trip to HDFS to lookup the modified time in order to reload the table descriptors if modified. 
However, in clusters with a large number of tables or many concurrent clients, this can be too aggressive toward HDFS and the master, causing contention that slows the processing of other requests. A simple solution is a TTL-based cache for FSTableDescriptors#getAll() and FSTableDescriptors#TableDescriptorAndModtime() that lets the master process those calls faster, without causing contention and without a trip to HDFS for every call to listtables() or getTableDescriptor(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
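The TTL-based cache proposed in the description can be sketched as follows. This is a minimal, standalone Java sketch under assumed names (TtlCache, Loader), not HBase's actual FSTableDescriptors code; the loader stands in for the expensive HDFS round trip, which only stale or missing entries pay:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TtlCache<K, V> {
    /** Stands in for the expensive lookup (here: the HDFS round trip). */
    public interface Loader<K, V> {
        V load(K key);
    }

    private static class Entry<V> {
        final V value;
        final long loadedAt;
        Entry(V value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final Loader<K, V> loader;

    public TtlCache(long ttlMillis, Loader<K, V> loader) {
        this.ttlMillis = ttlMillis;
        this.loader = loader;
    }

    public V get(K key) {
        long now = System.currentTimeMillis();
        Entry<V> e = cache.get(key);
        // Reload only when the entry is missing or older than the TTL;
        // within the TTL window every call is served from memory.
        if (e == null || now - e.loadedAt > ttlMillis) {
            e = new Entry<>(loader.load(key), now);
            cache.put(key, e);
        }
        return e.value;
    }
}
```

The trade-off is staleness bounded by the TTL: a descriptor modified on HDFS is not observed until the cached entry expires, which is acceptable for read-heavy calls like getAll().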
[jira] [Commented] (HBASE-12407) HConnectionKey doesn't contain CUSTOM_CONTROLLER_CONF_KEY in CONNECTION_PROPERTIES
[ https://issues.apache.org/jira/browse/HBASE-12407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193638#comment-14193638 ] Hadoop QA commented on HBASE-12407: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678737/HBASE-12407.patch against trunk revision . ATTACHMENT ID: 12678737 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.client.TestFastFail Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11557//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11557//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11557//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11557//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11557//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11557//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11557//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11557//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11557//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11557//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11557//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11557//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11557//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11557//console This message is automatically 
generated. HConnectionKey doesn't contain CUSTOM_CONTROLLER_CONF_KEY in CONNECTION_PROPERTIES --- Key: HBASE-12407 URL: https://issues.apache.org/jira/browse/HBASE-12407 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 0.98.7, 0.99.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Attachments: HBASE-12407.patch An HTable instance created with a custom RpcControllerFactory.CUSTOM_CONTROLLER_CONF_KEY conf setting may internally use a cached connection that lacks this custom setting, because CUSTOM_CONTROLLER_CONF_KEY isn't part of HConnectionKey.CONNECTION_PROPERTIES -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12398) Region isn't assigned in an extreme race condition
[ https://issues.apache.org/jira/browse/HBASE-12398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193657#comment-14193657 ] Jimmy Xiang commented on HBASE-12398: - The master branch should not have such a problem because only master updates the region states (step b won't happen). So I think we don't need a patch for master. Region isn't assigned in an extreme race condition -- Key: HBASE-12398 URL: https://issues.apache.org/jira/browse/HBASE-12398 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 0.98.7 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Attachments: HBASE-12398.patch In a test, [~enis] has seen a condition which made one of the regions unassigned. The client failed since the region is not online anywhere: {code} 2014-10-29 01:51:40,731 WARN [HBaseReaderThread_13] util.MultiThreadedReader: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions: Wed Oct 29 01:39:51 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@cc21330, org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region IntegrationTestRegionReplicaReplication,0666,1414545619766_0001.689b77e1bad7e951b0d9ef4663b217e9. 
is not online on hor8n08.gq1.ygridcore.net,60020,1414546670414 at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2774) at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4257) at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2906) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29990) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94) at java.lang.Thread.run(Thread.java:722) {code} The root cause of the issue is an extreme race condition: a) a region is about to open and receives a closeRpc request triggered by a second re-assignment; b) the second re-assignment updates the region state to offline, which is immediately overwritten to OPEN by the ZK opened notification from the previous region open; c) when the region is reopened on the same RS by the second assignment, the AM forces the region to close since its region state isn't in the PendingOpenOrOpening state; d) the region ends up offline and can't serve any request. Region Server Side: 1) The RS (hor8n10) has almost finished opening region 689b77e1bad7e951b0d9ef4663b217e9 when it receives a closeRegion request. {noformat} 2014-10-29 01:39:43,153 INFO [PriorityRpcServer.handler=2,queue=0,port=60020] regionserver.HRegionServer: Received CLOSE for the region:689b77e1bad7e951b0d9ef4663b217e9 , which we are already trying to OPEN. Cancelling OPENING. {noformat} 2) Since region 689b77e1bad7e951b0d9ef4663b217e9 was already opened apart from some final steps, the RS logs the following message and closes 689b77e1bad7e951b0d9ef4663b217e9 immediately after updating the ZK node state to 'OPENED'.
{noformat} 2014-10-29 01:39:43,198 ERROR [RS_OPEN_REGION-hor8n10:60020-0] handler.OpenRegionHandler: Race condition: we've finished to open a region, while a close was requested on region=IntegrationTestRegionReplicaReplication,0666,1414545619766_0001.689b77e1bad7e951b0d9ef4663b217e9.. It can be a critical error, as a region that should be closed is now opened. Closing it now {noformat} In Master Server Side: {noformat} 2014-10-29 01:39:43,177 DEBUG [AM.ZK.Worker-pool2-t55] master.AssignmentManager: Handling RS_ZK_REGION_OPENED, server=hor8n10.gq1.ygridcore.net,60020,1414546531945, region=689b77e1bad7e951b0d9ef4663b217e9, current_state={689b77e1bad7e951b0d9ef4663b217e9 state=OPENING, ts=1414546783152, server=hor8n10.gq1.ygridcore.net,60020,1414546531945} 2014-10-29 01:39:43,255 DEBUG [AM.-pool1-t16] master.AssignmentManager: Offline IntegrationTestRegionReplicaReplication,0666,1414545619766_0001.689b77e1bad7e951b0d9ef4663b217e9., it's not any more on hor8n10.gq1.ygridcore.net,60020,1414546531945 2014-10-29 01:39:43,942 DEBUG [AM.ZK.Worker-pool2-t58] master.AssignmentManager: Handling RS_ZK_REGION_OPENED, server=hor8n10.gq1.ygridcore.net,60020,1414546531945, region=689b77e1bad7e951b0d9ef4663b217e9, current_state={689b77e1bad7e951b0d9ef4663b217e9 state=OPEN, ts=1414546783387, server=hor8n10.gq1.ygridcore.net,60020,1414546531945}
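The overwrite in step b) — a stale "OPENED" notification clobbering a newer OFFLINE state — could in principle be prevented by making each state transition conditional on the state the event was generated against. The following is a hypothetical, self-contained Java sketch of that guard, not HBase's actual AssignmentManager (the RegionStateGuard class and its State enum are illustrative):

```java
import java.util.concurrent.atomic.AtomicReference;

public class RegionStateGuard {
    public enum State { OFFLINE, PENDING_OPEN, OPENING, OPEN }

    private final AtomicReference<State> state =
        new AtomicReference<>(State.OFFLINE);

    // Transition only if the region is still in the state the event
    // expected; a stale notification simply loses the compare-and-set
    // race instead of overwriting a newer state.
    public boolean transition(State expected, State next) {
        return state.compareAndSet(expected, next);
    }

    public State current() {
        return state.get();
    }
}
```

With this guard, the sequence in the logs plays out safely: the second re-assignment moves OPENING to OFFLINE, and the late ZK opened notification (which expected OPENING) is rejected rather than flipping the region back to OPEN.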
[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors
[ https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193669#comment-14193669 ] stack commented on HBASE-12219: --- Builds on branch-1 are blue again after backing this out. I think this is the zombie maker. Leaving open till we figure why. Cache more efficiently getAll() and get() in FSTableDescriptors --- Key: HBASE-12219 URL: https://issues.apache.org/jira/browse/HBASE-12219 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.24, 0.99.1, 0.98.6.1 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Labels: scalability Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12219-0.98.patch, HBASE-12219-0.98.v1.patch, HBASE-12219-0.99.addendum.patch, HBASE-12219-0.99.patch, HBASE-12219-v1.patch, HBASE-12219-v1.patch, HBASE-12219.v0.txt, HBASE-12219.v2.patch, HBASE-12219.v3.patch, list.png Currently table descriptors and tables are cached once they are accessed for the first time. Subsequent calls to the master only require a trip to HDFS to look up the modified time in order to reload the table descriptors if modified. However, in clusters with a large number of tables or many concurrent clients, this can be too aggressive toward HDFS and the master, causing contention that slows the processing of other requests. A simple solution is a TTL-based cache for FSTableDescriptors#getAll() and FSTableDescriptors#TableDescriptorAndModtime() that lets the master process those calls faster, without causing contention and without a trip to HDFS for every call to listtables() or getTableDescriptor(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091
[ https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193670#comment-14193670 ] stack commented on HBASE-12285: --- [~dimaspivak] Open a new issue instead? The surefire snapshot and the culling of the logs put us in a better place for sure. We have mostly blues now when we build. We've been failing since #400 because of HBASE-12219. Was this causing the "There was a timeout or other error in the fork" failures? Good on you, Dima. Builds are failing, possibly because of SUREFIRE-1091 - Key: HBASE-12285 URL: https://issues.apache.org/jira/browse/HBASE-12285 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dima Spivak Assignee: Dima Spivak Priority: Blocker Fix For: 2.0.0, 0.99.2 Attachments: HBASE-12285_branch-1_v1.patch, HBASE-12285_branch-1_v1.patch Our branch-1 builds on builds.apache.org have been failing in recent days after we switched over to an official version of Surefire a few days back (HBASE-4955). The version we're using, 2.17, is hit by a bug ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results in an IOException, which looks like what we're seeing on Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors
[ https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193693#comment-14193693 ] Dima Spivak commented on HBASE-12219: - Using [~manukranthk]'s awesome findHangingTests script, it looks like the set of runs that were red all had org.apache.hadoop.hbase.client.TestAdmin hang, which caused the Surefire-forked process to time out after 15 minutes and fail the Maven build. Cache more efficiently getAll() and get() in FSTableDescriptors --- Key: HBASE-12219 URL: https://issues.apache.org/jira/browse/HBASE-12219 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.24, 0.99.1, 0.98.6.1 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Labels: scalability Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12219-0.98.patch, HBASE-12219-0.98.v1.patch, HBASE-12219-0.99.addendum.patch, HBASE-12219-0.99.patch, HBASE-12219-v1.patch, HBASE-12219-v1.patch, HBASE-12219.v0.txt, HBASE-12219.v2.patch, HBASE-12219.v3.patch, list.png Currently table descriptors and tables are cached once they are accessed for the first time. Subsequent calls to the master only require a trip to HDFS to look up the modified time in order to reload the table descriptors if modified. However, in clusters with a large number of tables or many concurrent clients, this can be too aggressive toward HDFS and the master, causing contention that slows the processing of other requests. A simple solution is a TTL-based cache for FSTableDescriptors#getAll() and FSTableDescriptors#TableDescriptorAndModtime() that lets the master process those calls faster, without causing contention and without a trip to HDFS for every call to listtables() or getTableDescriptor(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091
[ https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dima Spivak resolved HBASE-12285. - Resolution: Fixed You're right, [~stack]. Sorry for being quick to reopen, was just paranoid. But yay for CI actually helping us track down faulty commits! :) Builds are failing, possibly because of SUREFIRE-1091 - Key: HBASE-12285 URL: https://issues.apache.org/jira/browse/HBASE-12285 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dima Spivak Assignee: Dima Spivak Priority: Blocker Fix For: 2.0.0, 0.99.2 Attachments: HBASE-12285_branch-1_v1.patch, HBASE-12285_branch-1_v1.patch Our branch-1 builds on builds.apache.org have been failing in recent days after we switched over to an official version of Surefire a few days back (HBASE-4955). The version we're using, 2.17, is hit by a bug ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results in an IOException, which looks like what we're seeing on Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)