[jira] [Commented] (HBASE-9778) Avoid seeking to next column in ExplicitColumnTracker when possible
[ https://issues.apache.org/jira/browse/HBASE-9778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815743#comment-13815743 ] Lars Hofhansl commented on HBASE-9778: -- Some more numbers with other hardcoded improvements indicate that some Phoenix queries can run over 3x as fast (8.8s instead of 27s). The challenge is now to keep the improvements from HBASE-4433 while also improving other scenarios. A new config option is probably unavoidable. > Avoid seeking to next column in ExplicitColumnTracker when possible > --- > > Key: HBASE-9778 > URL: https://issues.apache.org/jira/browse/HBASE-9778 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl > Fix For: 0.98.0, 0.96.1, 0.94.14 > > Attachments: 9778-0.94-v2.txt, 9778-0.94-v3.txt, 9778-0.94-v4.txt, > 9778-0.94.txt, 9778-trunk-v2.txt, 9778-trunk-v3.txt, 9778-trunk.txt > > > The issue of slow seeking in ExplicitColumnTracker was brought up by > [~vrodionov] on the dev list. > My idea here is to avoid the seeking if we know that there aren't many > versions to skip. > How do we know? We'll use the column family's VERSIONS setting as a hint. If > VERSIONS is set to 1 (or maybe some value < 10) we'll avoid the seek and call > SKIP repeatedly. > HBASE-9769 has some initial numbers for this approach: > Interestingly, it depends on which column(s) are selected. > Some numbers: 4m rows, 5 cols each, 1 cf, 10 bytes values, VERSIONS=1, > everything filtered at the server with a ValueFilter. Everything measured in > seconds. > Without patch: > ||Wildcard||Col 1||Col 2||Col 4||Col 5||Col 2+4|| > |6.4|8.5|14.3|14.6|11.1|20.3| > With patch: > ||Wildcard||Col 1||Col 2||Col 4||Col 5||Col 2+4|| > |6.4|8.4|8.9|9.9|6.4|10.0| > Variation here was +- 0.2s. > So with this patch scanning is 2x faster than without in some cases, and > never slower. No special hint needed, beyond declaring VERSIONS correctly. -- This message was sent by Atlassian JIRA (v6.1#6144)
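The skip-vs-seek trade-off described above can be sketched as follows. This is a simplified model, not the actual ExplicitColumnTracker code; the method name and the threshold constant are invented for illustration (the comment only commits to VERSIONS=1 "or maybe some value < 10").

```java
// Sketch of the heuristic: use the column family's declared VERSIONS as a
// hint for whether stepping cell-by-cell (SKIP) beats re-seeking (SEEK).
public class ColumnSeekHeuristic {
    public enum MatchCode { SKIP, SEEK_NEXT_COL }

    // Hypothetical cutoff, per the comment's "1 (or maybe some value < 10)".
    static final int SEEK_THRESHOLD = 10;

    /** Decide what to do once the requested column's versions are exhausted. */
    public static MatchCode onColumnDone(int declaredMaxVersions) {
        // Few declared versions => at most a handful of cells to step over,
        // so issuing cheap SKIPs avoids the cost of a seek.
        return declaredMaxVersions < SEEK_THRESHOLD
            ? MatchCode.SKIP : MatchCode.SEEK_NEXT_COL;
    }

    public static void main(String[] args) {
        System.out.println(onColumnDone(1));    // VERSIONS=1: step with SKIP
        System.out.println(onColumnDone(1000)); // many versions: seek past them
    }
}
```

Roughly, a SKIP costs one cell comparison while a seek consults the file indexes, so for small declared VERSIONS the repeated SKIPs win, which is consistent with the 2x numbers in the table above.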
[jira] [Assigned] (HBASE-9893) Incorrect assert condition in OrderedBytes decoding
[ https://issues.apache.org/jira/browse/HBASE-9893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Liangliang reassigned HBASE-9893: Assignee: Nick Dimiduk (was: He Liangliang) > Incorrect assert condition in OrderedBytes decoding > --- > > Key: HBASE-9893 > URL: https://issues.apache.org/jira/browse/HBASE-9893 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.96.0 >Reporter: He Liangliang >Assignee: Nick Dimiduk >Priority: Minor > Attachments: HBASE-9893.patch > > > The following assert condition is incorrect when decoding a blob var byte array. > {code} > assert t == 0 : "Unexpected bits remaining after decoding blob."; > {code} > When the number of bytes to decode is a multiple of 8 (i.e. the original number > of bytes is a multiple of 7), this assert may fail. -- This message was sent by Atlassian JIRA (v6.1#6144)
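For intuition on the boundary case, the byte-count arithmetic of 7-bit ("blob var") packing can be checked directly. This models only the lengths, not the actual OrderedBytes bit manipulation, and the helper names are invented.

```java
// Each encoded byte carries 7 payload bits, so n raw bytes become
// ceil(8n / 7) encoded bytes. The packing comes out even exactly when n is a
// multiple of 7 (encoded length a multiple of 8) -- the boundary the report
// says trips the decoder's assertion.
public class BlobVarLengths {
    public static int encodedLength(int rawBytes) {
        return (8 * rawBytes + 6) / 7; // ceil(8n/7)
    }

    public static int trailingBits(int rawBytes) {
        // Payload bits that spill into a final, partially filled 7-bit slot.
        return (8 * rawBytes) % 7;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 14; n++) {
            System.out.printf("raw=%d encoded=%d leftover=%d%n",
                n, encodedLength(n), trailingBits(n));
        }
    }
}
```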
[jira] [Commented] (HBASE-7403) Online Merge
[ https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815714#comment-13815714 ] stack commented on HBASE-7403: -- [~asafm] yes > Online Merge > > > Key: HBASE-7403 > URL: https://issues.apache.org/jira/browse/HBASE-7403 > Project: HBase > Issue Type: New Feature >Affects Versions: 0.95.0 >Reporter: chunhui shen >Assignee: chunhui shen >Priority: Critical > Fix For: 0.98.0, 0.95.0 > > Attachments: 7403-trunkv5.patch, 7403-trunkv6.patch, 7403-v5.txt, > 7403v5.diff, 7403v5.txt, hbase-7403-0.95.patch, hbase-7403-94v1.patch, > hbase-7403-trunkv1.patch, hbase-7403-trunkv10.patch, > hbase-7403-trunkv11.patch, hbase-7403-trunkv12.patch, > hbase-7403-trunkv13.patch, hbase-7403-trunkv14.patch, > hbase-7403-trunkv15.patch, hbase-7403-trunkv16.patch, > hbase-7403-trunkv19.patch, hbase-7403-trunkv20.patch, > hbase-7403-trunkv22.patch, hbase-7403-trunkv23.patch, > hbase-7403-trunkv24.patch, hbase-7403-trunkv26.patch, > hbase-7403-trunkv28.patch, hbase-7403-trunkv29.patch, > hbase-7403-trunkv30.patch, hbase-7403-trunkv31.patch, > hbase-7403-trunkv32.patch, hbase-7403-trunkv33.patch, > hbase-7403-trunkv5.patch, hbase-7403-trunkv6.patch, hbase-7403-trunkv7.patch, > hbase-7403-trunkv8.patch, hbase-7403-trunkv9.patch, merge region.pdf > > > Support executing region merge transaction on Regionserver, similar with > split transaction > Process of merging two regions: > a.client sends RPC (dispatch merging regions) to master > b.master moves the regions together (on the same regionserver where the more > heavily loaded region resided) > c.master sends RPC (merge regions) to this regionserver > d.Regionserver executes the region merge transaction in the thread pool > e.the above b,c,d run asynchronously > Process of region merge transaction: > a.Construct a new region merge transaction. > b.prepare for the merge transaction, the transaction will be canceled if it > is unavailable, > e.g. 
two regions don't belong to same table; two regions are not adjacent in a non-compulsory merge; region is closed or has reference
> c.execute the transaction as the following:
> /** Set region as in transition, set it into MERGING state. */
> SET_MERGING_IN_ZK,
> /** We created the temporary merge data directory. */
> CREATED_MERGE_DIR,
> /** Closed the merging region A. */
> CLOSED_REGION_A,
> /** The merging region A has been taken out of the server's online regions list. */
> OFFLINED_REGION_A,
> /** Closed the merging region B. */
> CLOSED_REGION_B,
> /** The merging region B has been taken out of the server's online regions list. */
> OFFLINED_REGION_B,
> /** Started in on creation of the merged region. */
> STARTED_MERGED_REGION_CREATION,
> /** Point of no return. If we got here, then transaction is not recoverable other than by crashing out the regionserver. */
> PONR
> d.roll back if step c throws exception
> Usage: HBaseAdmin#mergeRegions
> See more details from the patch
-- This message was sent by Atlassian JIRA (v6.1#6144)
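The journal above, together with the rollback rule from step d, can be sketched as a small state machine. An illustration only, not the real RegionMergeTransaction, which tracks far more state.

```java
// Journal-based rollback: completed phases are undone in reverse order,
// except that once PONR is reached the transaction is unrecoverable.
import java.util.ArrayDeque;
import java.util.Deque;

public class MergeTransactionSketch {
    public enum Phase {
        SET_MERGING_IN_ZK,             // region marked MERGING in ZK
        CREATED_MERGE_DIR,             // temporary merge data directory created
        CLOSED_REGION_A, OFFLINED_REGION_A,
        CLOSED_REGION_B, OFFLINED_REGION_B,
        STARTED_MERGED_REGION_CREATION,
        PONR                           // point of no return
    }

    private final Deque<Phase> journal = new ArrayDeque<>();

    public void advance(Phase p) { journal.push(p); }

    /** Step d: undo in reverse order; refuse once past the point of no return. */
    public boolean rollback() {
        if (journal.contains(Phase.PONR)) {
            return false; // only crashing out the regionserver recovers from here
        }
        while (!journal.isEmpty()) {
            Phase undone = journal.pop();
            // per-phase undo would go here: reopen regions, remove the merge
            // dir, clear the MERGING znode, ...
        }
        return true;
    }

    public static void main(String[] args) {
        MergeTransactionSketch tx = new MergeTransactionSketch();
        tx.advance(Phase.SET_MERGING_IN_ZK);
        tx.advance(Phase.CREATED_MERGE_DIR);
        System.out.println("rollback possible: " + tx.rollback());
    }
}
```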
[jira] [Commented] (HBASE-9912) Need to delete a row based on partial rowkey in hbase ... Pls provide query for that
[ https://issues.apache.org/jira/browse/HBASE-9912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815709#comment-13815709 ] Lars Hofhansl commented on HBASE-9912: -- Also, this is not really possible. I tried this a while ago and failed. I even blogged about that failure: http://hadoop-hbase.blogspot.com/2012/01/scanning-in-hbase.html (In a nutshell, we'd break seeking: HBase would have no way of knowing how many KVs before the seek key it would have to examine in order to determine whether the KVs following the seek key are marked for deletion.) > Need to delete a row based on partial rowkey in hbase ... Pls provide query > for that > - > > Key: HBASE-9912 > URL: https://issues.apache.org/jira/browse/HBASE-9912 > Project: HBase > Issue Type: Bug >Reporter: ranjini >Priority: Critical > -- This message was sent by Atlassian JIRA (v6.1#6144)
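Since the server side cannot support prefix deletes, the usual answer to the question is a client-side scan-and-delete. Sketched here against a plain sorted map standing in for a table; with the real client one would scan with a PrefixFilter (or start/stop rows) and call HTable#delete for each returned row.

```java
// "Delete by rowkey prefix", done client-side: scan the prefix range, then
// issue a delete per row found. A TreeMap models the lexicographically
// sorted rowkeys of a table.
import java.util.SortedMap;
import java.util.TreeMap;

public class PrefixDeleteSketch {
    public static int deleteByPrefix(TreeMap<String, String> table, String prefix) {
        // The half-open range [prefix, prefix + maxChar) covers exactly the
        // rowkeys starting with the prefix, like a start/stop-row scan.
        SortedMap<String, String> range =
            table.subMap(prefix, prefix + Character.MAX_VALUE);
        int n = range.size();
        range.clear(); // one Delete per row in the real client
        return n;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put("user1-a", "v");
        table.put("user1-b", "v");
        table.put("user2-a", "v");
        int deleted = deleteByPrefix(table, "user1-");
        System.out.println(deleted + " rows deleted, " + table.size() + " left");
    }
}
```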
[jira] [Updated] (HBASE-9902) Region Server is starting normally even if clock skew is more than default 30 seconds(or any configured). -> Regionserver node time is greater than master node time
[ https://issues.apache.org/jira/browse/HBASE-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kashif J S updated HBASE-9902: -- Attachment: HBASE-9902.patch Patch for absolute value for clock skew detection. For 0.98.0 and 0.96.0 versions > Region Server is starting normally even if clock skew is more than default 30 > seconds(or any configured). -> Regionserver node time is greater than master > node time > > > Key: HBASE-9902 > URL: https://issues.apache.org/jira/browse/HBASE-9902 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.94.11 >Reporter: Kashif J S > Fix For: 0.98.0, 0.96.0 > > Attachments: HBASE-9902.patch > > > When the Region server's time is ahead of the Master's time and the difference is > more than the hbase.master.maxclockskew value, region server startup is not > failing with ClockOutOfSyncException. > This causes some abnormal behavior as detected by our tests. > ServerManager.java#checkClockSkew > long skew = System.currentTimeMillis() - serverCurrentTime; > if (skew > maxSkew) { > String message = "Server " + serverName + " has been " + > "rejected; Reported time is too far out of sync with master. " + > "Time difference of " + skew + "ms > max allowed of " + maxSkew + > "ms"; > LOG.warn(message); > throw new ClockOutOfSyncException(message); > } > The above line yields a negative value when the Master's time is less than the > region server's time, and the " if (skew > maxSkew) " check then fails to detect the skew. > Please note: this was tested on HBase 0.94.11, and trunk currently has > the same logic. > The fix would be to take the absolute value of the skew first, as below: > long skew = System.currentTimeMillis() - serverCurrentTime; > skew = (skew < 0 ? -skew : skew); > if (skew > maxSkew) {. -- This message was sent by Atlassian JIRA (v6.1#6144)
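The proposed fix is simply to compare the absolute time difference, so a regionserver running ahead of the master is rejected too. A standalone rendering of the checkClockSkew logic from the report; it throws IllegalStateException instead of ClockOutOfSyncException to stay self-contained.

```java
// Clock skew check that rejects a server whose clock is too far off in
// EITHER direction, unlike the original signed comparison.
public class ClockSkewCheck {
    public static void checkClockSkew(long masterTimeMs, long serverTimeMs,
                                      long maxSkewMs) {
        long skew = Math.abs(masterTimeMs - serverTimeMs);
        if (skew > maxSkewMs) {
            // The real code throws ClockOutOfSyncException here.
            throw new IllegalStateException("Time difference of " + skew
                + "ms > max allowed of " + maxSkewMs + "ms");
        }
    }

    public static void main(String[] args) {
        checkClockSkew(0L, 10_000L, 30_000L); // within bounds: no exception
        System.out.println("10s skew accepted");
    }
}
```

With the signed version, a server 40s ahead of the master produces skew = -40000, which passes `skew > maxSkew` and slips through; the absolute value closes that hole.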
[jira] [Updated] (HBASE-9902) Region Server is starting normally even if clock skew is more than default 30 seconds(or any configured). -> Regionserver node time is greater than master node time
[ https://issues.apache.org/jira/browse/HBASE-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kashif J S updated HBASE-9902: -- Fix Version/s: 0.96.0 0.98.0 > Region Server is starting normally even if clock skew is more than default 30 > seconds(or any configured). -> Regionserver node time is greater than master > node time > > > Key: HBASE-9902 > URL: https://issues.apache.org/jira/browse/HBASE-9902 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.94.11 >Reporter: Kashif J S > Fix For: 0.98.0, 0.96.0 > > > When Region server's time is ahead of Master's time and the difference is > more than hbase.master.maxclockskew value, region server startup is not > failing with ClockOutOfSyncException. > This causes some abnormal behavior as detected by our Tests. > ServerManager.java#checkClockSkew > long skew = System.currentTimeMillis() - serverCurrentTime; > if (skew > maxSkew) { > String message = "Server " + serverName + " has been " + > "rejected; Reported time is too far out of sync with master. " + > "Time difference of " + skew + "ms > max allowed of " + maxSkew + > "ms"; > LOG.warn(message); > throw new ClockOutOfSyncException(message); > } > Above line results in negative value when Master's time is lesser than > region server time and " if (skew > maxSkew) " check fails to find the skew > in this case. > Please Note: This was tested in hbase 0.94.11 version and the trunk also > currently has the same logic. > The fix for the same would be to make the skew positive value first as below: > long skew = System.currentTimeMillis() - serverCurrentTime; > skew = (skew < 0 ? -skew : skew); > if (skew > maxSkew) {. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-8741) Scope sequenceid to the region rather than regionserver (WAS: Mutations on Regions in recovery mode might have same sequenceIDs)
[ https://issues.apache.org/jira/browse/HBASE-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815707#comment-13815707 ] Hadoop QA commented on HBASE-8741: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612521/HBASE-8741-trunk-v6.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 36 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7772//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7772//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7772//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7772//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7772//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7772//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7772//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7772//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7772//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7772//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7772//console This message is automatically generated. 
> Scope sequenceid to the region rather than regionserver (WAS: Mutations on > Regions in recovery mode might have same sequenceIDs) > > > Key: HBASE-8741 > URL: https://issues.apache.org/jira/browse/HBASE-8741 > Project: HBase > Issue Type: Bug > Components: MTTR >Affects Versions: 0.95.1 >Reporter: Himanshu Vashishtha >Assignee: Himanshu Vashishtha > Fix For: 0.98.0 > > Attachments: HBASE-8741-trunk-v6.1-rebased.patch, > HBASE-8741-trunk-v6.2.1.patch, HBASE-8741-trunk-v6.2.2.patch, > HBASE-8741-trunk-v6.2.2.patch, HBASE-8741-trunk-v6.3.patch, > HBASE-8741-trunk-v6.4.patch, HBASE-8741-trunk-v6.patch, HBASE-8741-v0.patch, > HBASE-8741-v2.patch, HBASE-8741-v3.patch, HBASE-8741-v4-again.patch, > HBASE-8741-v4-again.patch, HBASE-8741-v4.patch, HBASE-8741-v5-again.patch, > HBASE-8741-v5.patch > > > Currently, when opening a region, we find the maximum sequence ID from all > its HFiles and then set the LogSequenceId of the log (in case the latter is at > a small value). This works well in the recovered.edits case, as we are not writing > to the region until we have replayed all of its previous edits. > With distributed log replay, if we want to enable writes while a region is > under recovery, we need to make sure that the logSequenceId > maximum > logSequenceId of the old regionserver. Otherwise, we might have a situation > where new edits have same (or smaller) sequenceIds. > If we store region-level information in the WALTrailer, then this scenario > could be avoided by: > a) reading the trailer of the "last completed" file, i.e., last wal file > which has a trailer and, > b) completely reading the last wal file (this file would not
[jira] [Commented] (HBASE-7403) Online Merge
[ https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815706#comment-13815706 ] Asaf Mesika commented on HBASE-7403: 0.96.0 has this too? > Online Merge > > > Key: HBASE-7403 > URL: https://issues.apache.org/jira/browse/HBASE-7403 > Project: HBase > Issue Type: New Feature >Affects Versions: 0.95.0 >Reporter: chunhui shen >Assignee: chunhui shen >Priority: Critical > Fix For: 0.98.0, 0.95.0 > > Attachments: 7403-trunkv5.patch, 7403-trunkv6.patch, 7403-v5.txt, > 7403v5.diff, 7403v5.txt, hbase-7403-0.95.patch, hbase-7403-94v1.patch, > hbase-7403-trunkv1.patch, hbase-7403-trunkv10.patch, > hbase-7403-trunkv11.patch, hbase-7403-trunkv12.patch, > hbase-7403-trunkv13.patch, hbase-7403-trunkv14.patch, > hbase-7403-trunkv15.patch, hbase-7403-trunkv16.patch, > hbase-7403-trunkv19.patch, hbase-7403-trunkv20.patch, > hbase-7403-trunkv22.patch, hbase-7403-trunkv23.patch, > hbase-7403-trunkv24.patch, hbase-7403-trunkv26.patch, > hbase-7403-trunkv28.patch, hbase-7403-trunkv29.patch, > hbase-7403-trunkv30.patch, hbase-7403-trunkv31.patch, > hbase-7403-trunkv32.patch, hbase-7403-trunkv33.patch, > hbase-7403-trunkv5.patch, hbase-7403-trunkv6.patch, hbase-7403-trunkv7.patch, > hbase-7403-trunkv8.patch, hbase-7403-trunkv9.patch, merge region.pdf > > > Support executing region merge transaction on Regionserver, similar with > split transaction > Process of merging two regions: > a.client sends RPC (dispatch merging regions) to master > b.master moves the regions together (on the same regionserver where the more > heavily loaded region resided) > c.master sends RPC (merge regions) to this regionserver > d.Regionserver executes the region merge transaction in the thread pool > e.the above b,c,d run asynchronously > Process of region merge transaction: > a.Construct a new region merge transaction. > b.prepare for the merge transaction, the transaction will be canceled if it > is unavailable, > e.g. 
two regions don't belong to same table; two regions are not adjacent in a non-compulsory merge; region is closed or has reference
> c.execute the transaction as the following:
> /** Set region as in transition, set it into MERGING state. */
> SET_MERGING_IN_ZK,
> /** We created the temporary merge data directory. */
> CREATED_MERGE_DIR,
> /** Closed the merging region A. */
> CLOSED_REGION_A,
> /** The merging region A has been taken out of the server's online regions list. */
> OFFLINED_REGION_A,
> /** Closed the merging region B. */
> CLOSED_REGION_B,
> /** The merging region B has been taken out of the server's online regions list. */
> OFFLINED_REGION_B,
> /** Started in on creation of the merged region. */
> STARTED_MERGED_REGION_CREATION,
> /** Point of no return. If we got here, then transaction is not recoverable other than by crashing out the regionserver. */
> PONR
> d.roll back if step c throws exception
> Usage: HBaseAdmin#mergeRegions
> See more details from the patch
-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9895) 0.96 Import utility can't import an exported file from 0.94
[ https://issues.apache.org/jira/browse/HBASE-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815697#comment-13815697 ] Hadoop QA commented on HBASE-9895: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612519/hbase-9895.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): at org.apache.hadoop.hbase.TestZooKeeper.testRegionAssignmentAfterMasterRecoveryDueToZKExpiry(TestZooKeeper.java:488) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7771//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7771//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7771//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7771//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7771//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7771//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7771//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7771//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7771//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7771//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7771//console This message is automatically generated. 
> 0.96 Import utility can't import an exported file from 0.94 > --- > > Key: HBASE-9895 > URL: https://issues.apache.org/jira/browse/HBASE-9895 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.96.0 >Reporter: Jeffrey Zhong >Assignee: Jeffrey Zhong > Attachments: hbase-9895.patch > > > Basically we PBed org.apache.hadoop.hbase.client.Result so a 0.96 cluster > cannot import 0.94 exported files. This issue is annoying because a user > can't import his old archive files after upgrade or archives from others who > are using 0.94. > The ideal way is to catch deserialization error and then fall back to 0.94 > format for importing. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HBASE-9912) Need to delete a row based on partial rowkey in hbase ... Pls provide query for that
[ https://issues.apache.org/jira/browse/HBASE-9912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling resolved HBASE-9912. -- Resolution: Invalid This is a question, not a bug. Please email u...@hbase.apache.org with questions. JIRA is for actual bug reports, improvements, etc. See http://hbase.apache.org/mail-lists.html > Need to delete a row based on partial rowkey in hbase ... Pls provide query > for that > - > > Key: HBASE-9912 > URL: https://issues.apache.org/jira/browse/HBASE-9912 > Project: HBase > Issue Type: Bug >Reporter: ranjini >Priority: Critical > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9912) Need to delete a row based on partial rowkey in hbase ... Pls provide query for that
ranjini created HBASE-9912: -- Summary: Need to delete a row based on partial rowkey in hbase ... Pls provide query for that Key: HBASE-9912 URL: https://issues.apache.org/jira/browse/HBASE-9912 Project: HBase Issue Type: Bug Reporter: ranjini Priority: Critical -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9907) Rig to fake a cluster so can profile client behaviors
[ https://issues.apache.org/jira/browse/HBASE-9907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-9907: - Attachment: 9907.txt > Rig to fake a cluster so can profile client behaviors > - > > Key: HBASE-9907 > URL: https://issues.apache.org/jira/browse/HBASE-9907 > Project: HBase > Issue Type: Sub-task > Components: Client >Affects Versions: 0.96.0 >Reporter: stack >Assignee: stack > Fix For: 0.98.0, 0.96.1 > > Attachments: 9907.txt > > > Patch carried over from HBASE-9775 parent issue. Adds to the > TestClientNoCluster#main a rig that allows faking many clients against a few > servers and the opposite. Useful for studying client operation. > Includes a few changes to pb makings to try and save on a few creations. > Also has an edit of javadoc on how to create an HConnection and HTable trying > to be more forceful about pointing you in right direction ([~lhofhansl] -- > mind reviewing these javadoc changes?) > I have a +1 already on this patch up in parent issue. Will run by hadoopqa > to make sure all good before commit. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9885) Avoid some Result creation in protobuf conversions
[ https://issues.apache.org/jira/browse/HBASE-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815671#comment-13815671 ] Hudson commented on HBASE-9885: --- FAILURE: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #829 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/829/]) HBASE-9885 Avoid some Result creation in protobuf conversions - REVERT to check the cause of precommit flakiness (nkeywal: rev 1539492) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java HBASE-9885 Avoid some Result creation in protobuf conversions (nkeywal: rev 1539429) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java > Avoid some Result creation in protobuf conversions > -- > > Key: HBASE-9885 > URL: https://issues.apache.org/jira/browse/HBASE-9885 > Project: HBase > Issue Type: Bug > Components: Client, Protobufs, regionserver >Affects Versions: 0.98.0, 0.96.0 >Reporter: Nicolas Liochon >Assignee: Nicolas Liochon > Fix For: 0.98.0, 0.96.1 > > Attachments: 9885.v1.patch, 9885.v2, 9885.v2.patch, 9885.v3.patch, > 9885.v3.patch > > > We creates a lot of Result that we could avoid, as they contain nothing else > than a boolean value. We create sometimes a protobuf builder as well on this > path, this can be avoided. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9001) TestThriftServerCmdLine.testRunThriftServer[0] failed
[ https://issues.apache.org/jira/browse/HBASE-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815672#comment-13815672 ] Hudson commented on HBASE-9001: --- FAILURE: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #829 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/829/]) HBASE-9001 Add a toString in HTable, fix a log in AssignmentManager (nkeywal: rev 1539425) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java > TestThriftServerCmdLine.testRunThriftServer[0] failed > - > > Key: HBASE-9001 > URL: https://issues.apache.org/jira/browse/HBASE-9001 > Project: HBase > Issue Type: Bug > Components: test >Reporter: stack >Assignee: stack > Fix For: 0.95.2 > > Attachments: 9001.txt > > > https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/624/testReport/junit/org.apache.hadoop.hbase.thrift/TestThriftServerCmdLine/testRunThriftServer_0_/ > It seems stuck here: > {code} > 2013-07-19 03:52:03,158 INFO [Thread-131] > thrift.TestThriftServerCmdLine(132): Starting HBase Thrift server with > command line: -hsha -port 56708 start > 2013-07-19 03:52:03,174 INFO [ThriftServer-cmdline] > thrift.ThriftServerRunner$ImplType(208): Using thrift server type hsha > 2013-07-19 03:52:03,205 WARN [ThriftServer-cmdline] conf.Configuration(817): > fs.default.name is deprecated. Instead, use fs.defaultFS > 2013-07-19 03:52:03,206 WARN [ThriftServer-cmdline] conf.Configuration(817): > mapreduce.job.counters.limit is deprecated. Instead, use > mapreduce.job.counters.max > 2013-07-19 03:52:03,207 WARN [ThriftServer-cmdline] conf.Configuration(817): > io.bytes.per.checksum is deprecated. 
Instead, use dfs.bytes-per-checksum > 2013-07-19 03:54:03,156 INFO [pool-1-thread-1] hbase.ResourceChecker(171): > after: thrift.TestThriftServerCmdLine#testRunThriftServer[0] Thread=146 (was > 155), OpenFileDescriptor=295 (was 311), MaxFileDescriptor=4096 (was 4096), > SystemLoadAverage=293 (was 240) - SystemLoadAverage LEAK? -, ProcessCount=145 > (was 143) - ProcessCount LEAK? -, AvailableMemoryMB=779 (was 1263), > ConnectionCount=4 (was 4) > 2013-07-19 03:54:03,157 DEBUG [pool-1-thread-1] > thrift.TestThriftServerCmdLine(107): implType=-hsha, specifyFramed=false, > specifyBindIP=false, specifyCompact=true > {code} > My guess is that we didn't get scheduled because load was almost 300 on this > box at the time? > Let me up the timeout of two minutes. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9902) Region Server is starting normally even if clock skew is more than default 30 seconds(or any configured). -> Regionserver node time is greater than master node time
[ https://issues.apache.org/jira/browse/HBASE-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815661#comment-13815661 ] Jyothi Mandava commented on HBASE-9902: --- Kashif will submit the patch soon for 0.94, 0.96 and trunk versions > Region Server is starting normally even if clock skew is more than default 30 > seconds(or any configured). -> Regionserver node time is greater than master > node time > > > Key: HBASE-9902 > URL: https://issues.apache.org/jira/browse/HBASE-9902 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.94.11 >Reporter: Kashif J S > > When Region server's time is ahead of Master's time and the difference is > more than hbase.master.maxclockskew value, region server startup is not > failing with ClockOutOfSyncException. > This causes some abnormal behavior as detected by our Tests. > ServerManager.java#checkClockSkew > long skew = System.currentTimeMillis() - serverCurrentTime; > if (skew > maxSkew) { > String message = "Server " + serverName + " has been " + > "rejected; Reported time is too far out of sync with master. " + > "Time difference of " + skew + "ms > max allowed of " + maxSkew + > "ms"; > LOG.warn(message); > throw new ClockOutOfSyncException(message); > } > Above line results in negative value when Master's time is lesser than > region server time and " if (skew > maxSkew) " check fails to find the skew > in this case. > Please Note: This was tested in hbase 0.94.11 version and the trunk also > currently has the same logic. > The fix for the same would be to make the skew positive value first as below: > long skew = System.currentTimeMillis() - serverCurrentTime; > skew = (skew < 0 ? -skew : skew); > if (skew > maxSkew) {. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9903) Remove the jamon generated classes from the findbugs analysis
[ https://issues.apache.org/jira/browse/HBASE-9903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815655#comment-13815655 ] Hadoop QA commented on HBASE-9903: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612442/9903.v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7769//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7769//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7769//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7769//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7769//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7769//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7769//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7769//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7769//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7769//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7769//console This message is automatically generated. > Remove the jamon generated classes from the findbugs analysis > - > > Key: HBASE-9903 > URL: https://issues.apache.org/jira/browse/HBASE-9903 > Project: HBase > Issue Type: Bug > Components: build >Affects Versions: 0.98.0, 0.96.0 >Reporter: Nicolas Liochon >Assignee: Nicolas Liochon > Fix For: 0.98.0 > > Attachments: 9903.v1.patch, 9903.v2.patch, 9903.v2.patch > > > The current filter does not work. 
-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9885) Avoid some Result creation in protobuf conversions
[ https://issues.apache.org/jira/browse/HBASE-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815645#comment-13815645 ] Ted Yu commented on HBASE-9885: --- {code} List values = proto.getCellList(); -if (cells == null) cells = new ArrayList(values.size()); -for (CellProtos.Cell c: values) { - cells.add(toCell(c)); +if (cells == null) { + if (values.isEmpty()) { +return EMPTY_RESULT; + } else { +cells = new ArrayList(values.size()); +for (CellProtos.Cell c : values) { + cells.add(toCell(c)); +} + } {code} Looks like the scope of the cells == null condition is too wide: the for loop should be outside the 'if (cells == null)' check. > Avoid some Result creation in protobuf conversions > -- > > Key: HBASE-9885 > URL: https://issues.apache.org/jira/browse/HBASE-9885 > Project: HBase > Issue Type: Bug > Components: Client, Protobufs, regionserver >Affects Versions: 0.98.0, 0.96.0 >Reporter: Nicolas Liochon >Assignee: Nicolas Liochon > Fix For: 0.98.0, 0.96.1 > > Attachments: 9885.v1.patch, 9885.v2, 9885.v2.patch, 9885.v3.patch, > 9885.v3.patch > > > We create a lot of Result objects that we could avoid, as they contain nothing other > than a boolean value. We sometimes create a protobuf builder as well on this > path; this can be avoided. -- This message was sent by Atlassian JIRA (v6.1#6144)
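Ted's suggested restructuring — keep the empty-list short-circuit inside the null check, but move the conversion loop outside of it — can be sketched with simplified stand-in types. Here plain strings play the role of CellProtos.Cell and Cell, and EMPTY_RESULT, toResult, and toCell are hypothetical stand-ins rather than the actual ProtobufUtil signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the corrected control flow: the allocation short-circuit stays
// inside 'cells == null', but the conversion loop runs unconditionally so a
// caller-supplied list is also populated.
class ResultConversionSketch {
    static final List<String> EMPTY_RESULT = new ArrayList<>();

    static List<String> toResult(List<String> values, List<String> cells) {
        if (cells == null) {
            if (values.isEmpty()) {
                return EMPTY_RESULT; // avoid allocating anything for empty results
            }
            cells = new ArrayList<>(values.size());
        }
        // Loop moved outside the 'cells == null' check, as suggested.
        for (String c : values) {
            cells.add(toCell(c));
        }
        return cells;
    }

    static String toCell(String c) { // stand-in for ProtobufUtil.toCell
        return c;
    }
}
```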
[jira] [Commented] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815646#comment-13815646 ] Hadoop QA commented on HBASE-9906: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612509/hbase-9906-0.94_v1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7770//console This message is automatically generated. > Restore snapshot fails to restore the meta edits sporadically > --- > > Key: HBASE-9906 > URL: https://issues.apache.org/jira/browse/HBASE-9906 > Project: HBase > Issue Type: New Feature > Components: snapshots >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1, 0.94.14 > > Attachments: hbase-9906-0.94_v1.patch, hbase-9906_v1.patch > > > After snaphot restore, we see failures to find the table in meta: > {code} > > disable 'tablefour' > > restore_snapshot 'snapshot_tablefour' > > enable 'tablefour' > ERROR: Table tablefour does not exist.' > {code} > This is quite subtle. From the looks of it, we successfully restore the > snapshot, do the meta updates, return to the client about the status. The > client then tries to do an operation for the table (like enable table, or > scan in the test outputs) which fails because the meta entry for the region > seems to be gone (in case of single region, the table will be reported > missing). 
Subsequent attempts for creating the table will also fail because > the table directories will be there, but not the meta entries. > For restoring meta entries, we are doing a delete then a put to the same > region: > {code} > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: > 76d0e2b7ec3291afcaa82e18a56ccc30 > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: > fa41edf43fe3ee131db4a34b848ff432 > ... > 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => > 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY > => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => > 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE > 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Added 1 > {code} > The root cause of this sporadic failure is that the delete and subsequent > put will have the same timestamp if they execute in the same ms. The delete > will override the put at the same ts, even though the put was applied later. > See: HBASE-9905, HBASE-8770 > Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
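The root cause can be modeled with a toy sketch. This is a simplified model of HBase's per-timestamp ordering, not actual HBase API calls: among cells carrying the same timestamp, a delete marker sorts ahead of a put and therefore masks it, regardless of which operation was issued last.

```java
import java.util.List;

// Toy model of the HBASE-9906 race: a put issued AFTER a delete but stamped
// with the SAME millisecond is masked, because delete markers win timestamp
// ties in HBase's cell ordering.
class SameTsDeleteSketch {
    static class Cell {
        final long ts; final boolean isDelete; final String value;
        Cell(long ts, boolean isDelete, String value) {
            this.ts = ts; this.isDelete = isDelete; this.value = value;
        }
    }

    // Returns the visible value, or null if the newest cell is a delete marker.
    static String read(List<Cell> cells) {
        Cell best = null;
        for (Cell c : cells) {
            if (best == null
                || c.ts > best.ts
                || (c.ts == best.ts && c.isDelete)) { // delete wins ties
                best = c;
            }
        }
        return (best == null || best.isDelete) ? null : best.value;
    }
}
```

In the restore scenario above, the "put" is the re-added meta row and the "delete" is the prior removal; when both land in the same millisecond the row vanishes, which matches the sporadic "table does not exist" symptom.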
[jira] [Commented] (HBASE-9001) TestThriftServerCmdLine.testRunThriftServer[0] failed
[ https://issues.apache.org/jira/browse/HBASE-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815640#comment-13815640 ] Hudson commented on HBASE-9001: --- SUCCESS: Integrated in HBase-TRUNK #4671 (See [https://builds.apache.org/job/HBase-TRUNK/4671/]) HBASE-9001 Add a toString in HTable, fix a log in AssignmentManager (nkeywal: rev 1539425) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java > TestThriftServerCmdLine.testRunThriftServer[0] failed > - > > Key: HBASE-9001 > URL: https://issues.apache.org/jira/browse/HBASE-9001 > Project: HBase > Issue Type: Bug > Components: test >Reporter: stack >Assignee: stack > Fix For: 0.95.2 > > Attachments: 9001.txt > > > https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/624/testReport/junit/org.apache.hadoop.hbase.thrift/TestThriftServerCmdLine/testRunThriftServer_0_/ > It seems stuck here: > {code} > 2013-07-19 03:52:03,158 INFO [Thread-131] > thrift.TestThriftServerCmdLine(132): Starting HBase Thrift server with > command line: -hsha -port 56708 start > 2013-07-19 03:52:03,174 INFO [ThriftServer-cmdline] > thrift.ThriftServerRunner$ImplType(208): Using thrift server type hsha > 2013-07-19 03:52:03,205 WARN [ThriftServer-cmdline] conf.Configuration(817): > fs.default.name is deprecated. Instead, use fs.defaultFS > 2013-07-19 03:52:03,206 WARN [ThriftServer-cmdline] conf.Configuration(817): > mapreduce.job.counters.limit is deprecated. Instead, use > mapreduce.job.counters.max > 2013-07-19 03:52:03,207 WARN [ThriftServer-cmdline] conf.Configuration(817): > io.bytes.per.checksum is deprecated. 
Instead, use dfs.bytes-per-checksum > 2013-07-19 03:54:03,156 INFO [pool-1-thread-1] hbase.ResourceChecker(171): > after: thrift.TestThriftServerCmdLine#testRunThriftServer[0] Thread=146 (was > 155), OpenFileDescriptor=295 (was 311), MaxFileDescriptor=4096 (was 4096), > SystemLoadAverage=293 (was 240) - SystemLoadAverage LEAK? -, ProcessCount=145 > (was 143) - ProcessCount LEAK? -, AvailableMemoryMB=779 (was 1263), > ConnectionCount=4 (was 4) > 2013-07-19 03:54:03,157 DEBUG [pool-1-thread-1] > thrift.TestThriftServerCmdLine(107): implType=-hsha, specifyFramed=false, > specifyBindIP=false, specifyCompact=true > {code} > My guess is that we didn't get scheduled because load was almost 300 on this > box at the time? > Let me up the timeout of two minutes. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9885) Avoid some Result creation in protobuf conversions
[ https://issues.apache.org/jira/browse/HBASE-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815639#comment-13815639 ] Hudson commented on HBASE-9885: --- SUCCESS: Integrated in HBase-TRUNK #4671 (See [https://builds.apache.org/job/HBase-TRUNK/4671/]) HBASE-9885 Avoid some Result creation in protobuf conversions - REVERT to check the cause of precommit flakiness (nkeywal: rev 1539492) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java HBASE-9885 Avoid some Result creation in protobuf conversions (nkeywal: rev 1539429) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java > Avoid some Result creation in protobuf conversions > -- > > Key: HBASE-9885 > URL: https://issues.apache.org/jira/browse/HBASE-9885 > Project: HBase > Issue Type: Bug > Components: Client, Protobufs, regionserver >Affects Versions: 0.98.0, 0.96.0 >Reporter: Nicolas Liochon >Assignee: Nicolas Liochon > Fix For: 0.98.0, 0.96.1 > > Attachments: 9885.v1.patch, 9885.v2, 9885.v2.patch, 9885.v3.patch, > 9885.v3.patch > > > We create a lot of Result objects that we could avoid, as they contain nothing other > than a boolean value. We sometimes create a protobuf builder as well on this > path; this can be avoided. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9909) TestHFilePerformance should not be a unit test, but a tool
[ https://issues.apache.org/jira/browse/HBASE-9909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815634#comment-13815634 ] Hadoop QA commented on HBASE-9909: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612502/hbase-9909_v1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7768//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7768//console This message is automatically generated. > TestHFilePerformance should not be a unit test, but a tool > -- > > Key: HBASE-9909 > URL: https://issues.apache.org/jira/browse/HBASE-9909 > Project: HBase > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1 > > Attachments: hbase-9909_v1.patch > > > TestHFilePerformance is a very old test, which does not test anything, but a > perf evaluation tool. 
It is not clear to me whether there is any utility for > keeping it, but that should at least be converted to be a tool. > Note that TestHFile already covers the unit test cases (writing hfile with > none and gz compression). We do not need to test SequenceFile. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9818) NPE in HFileBlock#AbstractFSReader#readAtOffset
[ https://issues.apache.org/jira/browse/HBASE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-9818: -- Attachment: 9818-trial.txt I tried to use 9818-trial.txt for detecting where the close() came from. The first attempt ended with the device full. Rerunning the two tests now. > NPE in HFileBlock#AbstractFSReader#readAtOffset > --- > > Key: HBASE-9818 > URL: https://issues.apache.org/jira/browse/HBASE-9818 > Project: HBase > Issue Type: Bug >Reporter: Jimmy Xiang >Assignee: Ted Yu > Attachments: 9818-trial.txt, 9818-v2.txt, 9818-v3.txt, 9818-v4.txt, > 9818-v5.txt > > > HFileBlock#istream seems to be null. I was wondering should we hide > FSDataInputStreamWrapper#useHBaseChecksum. > By the way, this happened when online schema change is enabled (encoding) > {noformat} > 2013-10-22 10:58:43,321 ERROR [RpcServer.handler=28,port=36020] > regionserver.HRegionServer: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1200) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1436) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1318) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:359) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:503) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:553) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:245) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:166) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:361) > at > 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:336) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:293) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:258) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:603) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:476) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:129) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3546) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3616) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3494) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3485) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3079) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) > at java.lang.Thread.run(Thread.java:724) > 2013-10-22 10:58:43,665 ERROR [RpcServer.handler=23,port=36020] > regionserver.HRegionServer: > org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected > nextCallSeq: 53438 But the nextCallSeq got from client: 53437; > request=scanner_id: 1252577470624375060 number_of_rows: 100 close_scanner: > false next_call_seq: 53437 > at > 
org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3030) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) > at > org.apache.hadoop.hbase.ipc.SimpleRpcSched
[jira] [Updated] (HBASE-9818) NPE in HFileBlock#AbstractFSReader#readAtOffset
[ https://issues.apache.org/jira/browse/HBASE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-9818: -- Status: Open (was: Patch Available) > NPE in HFileBlock#AbstractFSReader#readAtOffset > --- > > Key: HBASE-9818 > URL: https://issues.apache.org/jira/browse/HBASE-9818 > Project: HBase > Issue Type: Bug >Reporter: Jimmy Xiang >Assignee: Ted Yu > Attachments: 9818-v2.txt, 9818-v3.txt, 9818-v4.txt, 9818-v5.txt > > > HFileBlock#istream seems to be null. I was wondering should we hide > FSDataInputStreamWrapper#useHBaseChecksum. > By the way, this happened when online schema change is enabled (encoding) > {noformat} > 2013-10-22 10:58:43,321 ERROR [RpcServer.handler=28,port=36020] > regionserver.HRegionServer: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1200) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1436) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1318) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:359) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:503) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:553) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:245) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:166) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:361) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:336) > at > 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:293) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:258) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:603) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:476) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:129) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3546) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3616) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3494) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3485) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3079) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) > at java.lang.Thread.run(Thread.java:724) > 2013-10-22 10:58:43,665 ERROR [RpcServer.handler=23,port=36020] > regionserver.HRegionServer: > org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected > nextCallSeq: 53438 But the nextCallSeq got from client: 53437; > request=scanner_id: 1252577470624375060 number_of_rows: 100 close_scanner: > false next_call_seq: 53437 > at > org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3030) > at > 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) > at java.lang.Thread.run(Thread.java:724) > {noformat} -- This message was sent by Atlassian JIRA (v6.
[jira] [Commented] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815632#comment-13815632 ] Ted Yu commented on HBASE-9906: --- Minor comment: {code} - if (metaChanges.hasRegionsToRestore()) hrisToRemove.addAll(metaChanges.getRegionsToRestore()); MetaEditor.deleteRegions(catalogTracker, hrisToRemove); {code} Can the 20 ms sleep start counting from the call to MetaEditor.deleteRegions() ? Would 17ms sleep be good enough ? > Restore snapshot fails to restore the meta edits sporadically > --- > > Key: HBASE-9906 > URL: https://issues.apache.org/jira/browse/HBASE-9906 > Project: HBase > Issue Type: New Feature > Components: snapshots >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1, 0.94.14 > > Attachments: hbase-9906-0.94_v1.patch, hbase-9906_v1.patch > > > After snaphot restore, we see failures to find the table in meta: > {code} > > disable 'tablefour' > > restore_snapshot 'snapshot_tablefour' > > enable 'tablefour' > ERROR: Table tablefour does not exist.' > {code} > This is quite subtle. From the looks of it, we successfully restore the > snapshot, do the meta updates, return to the client about the status. The > client then tries to do an operation for the table (like enable table, or > scan in the test outputs) which fails because the meta entry for the region > seems to be gone (in case of single region, the table will be reported > missing). Subsequent attempts for creating the table will also fail because > the table directories will be there, but not the meta entries. > For restoring meta entries, we are doing a delete then a put to the same > region: > {code} > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: > 76d0e2b7ec3291afcaa82e18a56ccc30 > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: > fa41edf43fe3ee131db4a34b848ff432 > ... 
> 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => > 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY > => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => > 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE > 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Added 1 > {code} > The root cause for this sporadic failure is that, the delete and subsequent > put will have the same timestamp if they execute in the same ms. The delete > will override the put in the same ts, even though the put have a larger ts. > See: HBASE-9905, HBASE-8770 > Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9808) org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-9808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gustavo Anatoly updated HBASE-9808: --- Attachment: HBASE-9808-v2.patch Nick, could you please review again? Thanks. > org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with > org.apache.hadoop.hbase.PerformanceEvaluation > > > Key: HBASE-9808 > URL: https://issues.apache.org/jira/browse/HBASE-9808 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu >Assignee: Gustavo Anatoly > Attachments: HBASE-9808-v1.patch, HBASE-9808-v2.patch, > HBASE-9808.patch > > > Here is list of JIRAs whose fixes might have gone into > rest.PerformanceEvaluation : > {code} > > r1527817 | mbertozzi | 2013-09-30 15:57:44 -0700 (Mon, 30 Sep 2013) | 1 line > HBASE-9663 PerformanceEvaluation does not properly honor specified table name > parameter > > r1526452 | mbertozzi | 2013-09-26 04:58:50 -0700 (Thu, 26 Sep 2013) | 1 line > HBASE-9662 PerformanceEvaluation input do not handle tags properties > > r1525269 | ramkrishna | 2013-09-21 11:01:32 -0700 (Sat, 21 Sep 2013) | 3 lines > HBASE-8496 - Implement tags and the internals of how a tag should look like > (Ram) > > r1524985 | nkeywal | 2013-09-20 06:02:54 -0700 (Fri, 20 Sep 2013) | 1 line > HBASE-9558 PerformanceEvaluation is in hbase-server, and creates a > dependency to MiniDFSCluster > > r1523782 | nkeywal | 2013-09-16 13:07:13 -0700 (Mon, 16 Sep 2013) | 1 line > HBASE-9521 clean clearBufferOnFail behavior and deprecate it > > r1518341 | jdcryans | 2013-08-28 12:46:55 -0700 (Wed, 28 Aug 2013) | 2 lines > HBASE-9330 Refactor PE to create HTable the correct way > {code} > Long term, we may consider consolidating the two PerformanceEvaluation > classes so that such maintenance work can be reduced. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9892) Add info port to ServerName to support multi instances in a node
[ https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815619#comment-13815619 ] Liu Shaohui commented on HBASE-9892: [~enis] {quote} What about backporting HBASE-7027 to 0.94 and fixing the issue in table.jsp? It makes sense, but the challenge is that we cannot easily backport HBASE-7027 without breaking BC. HServerLoad does not have extra fields we can use, I fear. {quote} Since HServerLoad has a version field, I think we can add an info port field and keep compatibility. [~stack] [~enis] Could you give a suggestion? Which method is acceptable: this patch, or backporting 7027 and fixing the small issues left? > Add info port to ServerName to support multi instances in a node > > > Key: HBASE-9892 > URL: https://issues.apache.org/jira/browse/HBASE-9892 > Project: HBase > Issue Type: Improvement >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, > HBASE-9892-0.94-v3.diff > > > The full GC time of a regionserver with a big heap (> 30G) usually cannot be > controlled within 30s. At the same time, the servers with 64G memory are normal. > So we try to deploy multiple RS instances (2-3) on a single node, and the heap of > each RS is about 20G ~ 24G. > Most things work fine, except the hbase web ui. The master gets the RS > info port from the conf, which is not suitable for this situation of multiple RS > instances on a node. So we add the info port to ServerName. > a. At startup, the RS reports its info port to HMaster. > b. For the root region, the RS writes the servername with info port to the zookeeper > root-region-server node. > c. For meta regions, the RS writes the servername with info port to the root region. > d. For user regions, the RS writes the servername with info port to the meta regions. > So HMaster and clients can get the info port from the servername. 
> To test this feature, I change the rs num from 1 to 3 in standalone mode, so > we can test it in standalone mode, > I think Hoya(hbase on yarn) will encounter the same problem. Anyone knows > how Hoya handle this problem? > PS: There are different formats for servername in zk node and meta table, i > think we need to unify it and refactor the code. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9885) Avoid some Result creation in protobuf conversions
[ https://issues.apache.org/jira/browse/HBASE-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815611#comment-13815611 ] Hudson commented on HBASE-9885: --- FAILURE: Integrated in hbase-0.96 #181 (See [https://builds.apache.org/job/hbase-0.96/181/]) HBASE-9885 Avoid some Result creation in protobuf conversions - REVERT to check the cause of precommit flakiness (nkeywal: rev 1539493) * /hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java HBASE-9885 Avoid some Result creation in protobuf conversions (nkeywal: rev 1539427) * /hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java > Avoid some Result creation in protobuf conversions > -- > > Key: HBASE-9885 > URL: https://issues.apache.org/jira/browse/HBASE-9885 > Project: HBase > Issue Type: Bug > Components: Client, Protobufs, regionserver >Affects Versions: 0.98.0, 0.96.0 >Reporter: Nicolas Liochon >Assignee: Nicolas Liochon > Fix For: 0.98.0, 0.96.1 > > Attachments: 9885.v1.patch, 9885.v2, 9885.v2.patch, 9885.v3.patch, > 9885.v3.patch > > > We creates a lot of Result that we could avoid, as they contain nothing else > than a boolean value. We create sometimes a protobuf builder as well on this > path, this can be avoided. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9001) TestThriftServerCmdLine.testRunThriftServer[0] failed
[ https://issues.apache.org/jira/browse/HBASE-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815612#comment-13815612 ] Hudson commented on HBASE-9001: --- FAILURE: Integrated in hbase-0.96 #181 (See [https://builds.apache.org/job/hbase-0.96/181/]) HBASE-9001 Add a toString in HTable, fix a log in AssignmentManager (nkeywal: rev 1539426) * /hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java > TestThriftServerCmdLine.testRunThriftServer[0] failed > - > > Key: HBASE-9001 > URL: https://issues.apache.org/jira/browse/HBASE-9001 > Project: HBase > Issue Type: Bug > Components: test >Reporter: stack >Assignee: stack > Fix For: 0.95.2 > > Attachments: 9001.txt > > > https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/624/testReport/junit/org.apache.hadoop.hbase.thrift/TestThriftServerCmdLine/testRunThriftServer_0_/ > It seems stuck here: > {code} > 2013-07-19 03:52:03,158 INFO [Thread-131] > thrift.TestThriftServerCmdLine(132): Starting HBase Thrift server with > command line: -hsha -port 56708 start > 2013-07-19 03:52:03,174 INFO [ThriftServer-cmdline] > thrift.ThriftServerRunner$ImplType(208): Using thrift server type hsha > 2013-07-19 03:52:03,205 WARN [ThriftServer-cmdline] conf.Configuration(817): > fs.default.name is deprecated. Instead, use fs.defaultFS > 2013-07-19 03:52:03,206 WARN [ThriftServer-cmdline] conf.Configuration(817): > mapreduce.job.counters.limit is deprecated. Instead, use > mapreduce.job.counters.max > 2013-07-19 03:52:03,207 WARN [ThriftServer-cmdline] conf.Configuration(817): > io.bytes.per.checksum is deprecated. 
Instead, use dfs.bytes-per-checksum > 2013-07-19 03:54:03,156 INFO [pool-1-thread-1] hbase.ResourceChecker(171): > after: thrift.TestThriftServerCmdLine#testRunThriftServer[0] Thread=146 (was > 155), OpenFileDescriptor=295 (was 311), MaxFileDescriptor=4096 (was 4096), > SystemLoadAverage=293 (was 240) - SystemLoadAverage LEAK? -, ProcessCount=145 > (was 143) - ProcessCount LEAK? -, AvailableMemoryMB=779 (was 1263), > ConnectionCount=4 (was 4) > 2013-07-19 03:54:03,157 DEBUG [pool-1-thread-1] > thrift.TestThriftServerCmdLine(107): implType=-hsha, specifyFramed=false, > specifyBindIP=false, specifyCompact=true > {code} > My guess is that we didn't get scheduled because load was almost 300 on this > box at the time? > Let me up the timeout of two minutes. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9890) MR jobs are not working if started by a delegated user
[ https://issues.apache.org/jira/browse/HBASE-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815607#comment-13815607 ] Hadoop QA commented on HBASE-9890: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612491/HBASE-9890-v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7767//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7767//console This message is automatically generated. 
> MR jobs are not working if started by a delegated user > -- > > Key: HBASE-9890 > URL: https://issues.apache.org/jira/browse/HBASE-9890 > Project: HBase > Issue Type: Bug > Components: mapreduce, security >Affects Versions: 0.98.0, 0.94.12, 0.96.0 >Reporter: Matteo Bertozzi >Assignee: Matteo Bertozzi > Fix For: 0.98.0, 0.94.13, 0.96.1 > > Attachments: HBASE-9890-94-v0.patch, HBASE-9890-94-v1.patch, > HBASE-9890-v0.patch, HBASE-9890-v1.patch, HBASE-9890-v2.patch > > > If Map-Reduce jobs are started with by a proxy user that has already the > delegation tokens, we get an exception on "obtain token" since the proxy user > doesn't have the kerberos auth. > For example: > * If we use oozie to execute RowCounter - oozie will get the tokens required > (HBASE_AUTH_TOKEN) and it will start the RowCounter. Once the RowCounter > tries to obtain the token, it will get an exception. > * If we use oozie to execute LoadIncrementalHFiles - oozie will get the > tokens required (HDFS_DELEGATION_TOKEN) and it will start the > LoadIncrementalHFiles. Once the LoadIncrementalHFiles tries to obtain the > token, it will get an exception. > {code} > org.apache.hadoop.hbase.security.AccessDeniedException: Token generation > only allowed for Kerberos authenticated clients > at > org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87) > {code} > {code} > org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token > can be issued only with kerberos or web authentication > at > org.apach
[jira] [Updated] (HBASE-8741) Scope sequenceid to the region rather than regionserver (WAS: Mutations on Regions in recovery mode might have same sequenceIDs)
[ https://issues.apache.org/jira/browse/HBASE-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Himanshu Vashishtha updated HBASE-8741: --- Attachment: HBASE-8741-trunk-v6.4.patch Uploading the latest on rb here. Thanks. > Scope sequenceid to the region rather than regionserver (WAS: Mutations on > Regions in recovery mode might have same sequenceIDs) > > > Key: HBASE-8741 > URL: https://issues.apache.org/jira/browse/HBASE-8741 > Project: HBase > Issue Type: Bug > Components: MTTR >Affects Versions: 0.95.1 >Reporter: Himanshu Vashishtha >Assignee: Himanshu Vashishtha > Fix For: 0.98.0 > > Attachments: HBASE-8741-trunk-v6.1-rebased.patch, > HBASE-8741-trunk-v6.2.1.patch, HBASE-8741-trunk-v6.2.2.patch, > HBASE-8741-trunk-v6.2.2.patch, HBASE-8741-trunk-v6.3.patch, > HBASE-8741-trunk-v6.4.patch, HBASE-8741-trunk-v6.patch, HBASE-8741-v0.patch, > HBASE-8741-v2.patch, HBASE-8741-v3.patch, HBASE-8741-v4-again.patch, > HBASE-8741-v4-again.patch, HBASE-8741-v4.patch, HBASE-8741-v5-again.patch, > HBASE-8741-v5.patch > > > Currently, when opening a region, we find the maximum sequence ID from all > its HFiles and then set the LogSequenceId of the log (in case the latter is at > a smaller value). This works well in the recovered.edits case, as we are not writing > to the region until we have replayed all of its previous edits. > With distributed log replay, if we want to enable writes while a region is > under recovery, we need to make sure that the logSequenceId > maximum > logSequenceId of the old regionserver. Otherwise, we might have a situation > where new edits have the same (or smaller) sequenceIds. > If we store region level information in the WALTrailer, then this scenario > could be avoided by: > a) reading the trailer of the "last completed" file, i.e., the last wal file > which has a trailer, and > b) completely reading the last wal file (this file would not have the > trailer, so it needs to be read completely).
> In future, if we switch to multiple wal files, we could read the trailer for all > completed WAL files, and read the remaining incomplete files completely. -- This message was sent by Atlassian JIRA (v6.1#6144)
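A toy sketch of the recovery scheme above (the types and names are illustrative stand-ins, not actual HBase WAL classes): cleanly closed WAL files carry a trailer holding their max sequence id, while the last, unfinished file has no trailer and must be read entry by entry.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

// Toy model of a WAL file: a cleanly closed file has a trailer holding its
// max sequence id; the last, still-open file has no trailer. Illustrative
// stand-ins only, not actual HBase WAL classes.
public class MaxSeqIdSketch {

    static class WalFile {
        final OptionalLong trailerMaxSeqId;
        final long[] entrySeqIds;
        WalFile(OptionalLong trailerMaxSeqId, long[] entrySeqIds) {
            this.trailerMaxSeqId = trailerMaxSeqId;
            this.entrySeqIds = entrySeqIds;
        }
    }

    // Find the max sequence id across WAL files: use the trailer when present,
    // otherwise fall back to reading every entry of the file.
    static long maxSeqId(List<WalFile> wals) {
        long max = 0;
        for (WalFile wal : wals) {
            if (wal.trailerMaxSeqId.isPresent()) {
                max = Math.max(max, wal.trailerMaxSeqId.getAsLong()); // cheap: trailer only
            } else {
                max = Math.max(max, Arrays.stream(wal.entrySeqIds).max().orElse(0)); // full read
            }
        }
        return max;
    }

    public static void main(String[] args) {
        List<WalFile> wals = List.of(
            new WalFile(OptionalLong.of(42), new long[] {40, 41, 42}), // completed, has trailer
            new WalFile(OptionalLong.empty(), new long[] {43, 44}));   // still open, no trailer
        System.out.println(maxSeqId(wals)); // 44
    }
}
```

With a trailer on every completed file, only the one unfinished file ever needs a full scan, which is the multi-wal generalization mentioned at the end of the comment.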
[jira] [Updated] (HBASE-9895) 0.96 Import utility can't import an exported file from 0.94
[ https://issues.apache.org/jira/browse/HBASE-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-9895: - Status: Patch Available (was: Open) > 0.96 Import utility can't import an exported file from 0.94 > --- > > Key: HBASE-9895 > URL: https://issues.apache.org/jira/browse/HBASE-9895 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.96.0 >Reporter: Jeffrey Zhong >Assignee: Jeffrey Zhong > Attachments: hbase-9895.patch > > > Basically we PBed org.apache.hadoop.hbase.client.Result so a 0.96 cluster > cannot import 0.94 exported files. This issue is annoying because a user > can't import his old archive files after upgrade or archives from others who > are using 0.94. > The ideal way is to catch deserialization error and then fall back to 0.94 > format for importing. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9895) 0.96 Import utility can't import an exported file from 0.94
[ https://issues.apache.org/jira/browse/HBASE-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-9895: - Attachment: hbase-9895.patch There is no good way to dynamically determine the input file format, so I'm introducing a system property such as the following so that Import can load a file using the 0.94 deserializer. {code} ./bin/hbase -Dhbase.input.version=0.94 org.apache.hadoop.hbase.mapreduce.Import {code} > 0.96 Import utility can't import an exported file from 0.94 > --- > > Key: HBASE-9895 > URL: https://issues.apache.org/jira/browse/HBASE-9895 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.96.0 >Reporter: Jeffrey Zhong >Assignee: Jeffrey Zhong > Attachments: hbase-9895.patch > > > Basically we PBed org.apache.hadoop.hbase.client.Result, so a 0.96 cluster > cannot import 0.94 exported files. This issue is annoying because a user > can't import his old archive files after an upgrade, or archives from others who > are using 0.94. > The ideal way is to catch the deserialization error and then fall back to the 0.94 > format for importing. -- This message was sent by Atlassian JIRA (v6.1#6144)
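The patch's -Dhbase.input.version=0.94 override and the description's "catch and fall back" idea could be combined roughly as below. All class and method names here are hypothetical stand-ins, not the actual Import or Result deserialization code; the leading magic byte merely simulates a format probe.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the fallback idea discussed above: try the new
// (0.96, protobuf-based) deserializer first, and on failure fall back to the
// old (0.94, Writable-based) format. Names are illustrative, not HBase APIs.
public class VersionedImportSketch {

    // Stand-in for the 0.96 deserializer: expects a leading magic byte.
    static String parseNewFormat(byte[] data) {
        if (data.length == 0 || data[0] != 'P') {
            throw new IllegalArgumentException("not a 0.96-format record");
        }
        return new String(data, 1, data.length - 1, StandardCharsets.UTF_8);
    }

    // Stand-in for the 0.94 deserializer: raw bytes.
    static String parseOldFormat(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // Mirrors the -Dhbase.input.version=0.94 override: skip straight to the
    // old parser when the property is set, otherwise try new then fall back.
    static String parseRecord(byte[] data) {
        if ("0.94".equals(System.getProperty("hbase.input.version"))) {
            return parseOldFormat(data);
        }
        try {
            return parseNewFormat(data);
        } catch (RuntimeException e) {
            return parseOldFormat(data);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseRecord("Pabc".getBytes(StandardCharsets.UTF_8))); // abc
        System.out.println(parseRecord("abc".getBytes(StandardCharsets.UTF_8)));  // abc
    }
}
```

The explicit property avoids relying on the new parser failing cleanly on every possible 0.94 file, which is presumably why the patch does not attempt automatic detection.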
[jira] [Assigned] (HBASE-9895) 0.96 Import utility can't import an exported file from 0.94
[ https://issues.apache.org/jira/browse/HBASE-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong reassigned HBASE-9895: Assignee: Jeffrey Zhong > 0.96 Import utility can't import an exported file from 0.94 > --- > > Key: HBASE-9895 > URL: https://issues.apache.org/jira/browse/HBASE-9895 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.96.0 >Reporter: Jeffrey Zhong >Assignee: Jeffrey Zhong > > Basically we PBed org.apache.hadoop.hbase.client.Result so a 0.96 cluster > cannot import 0.94 exported files. This issue is annoying because a user > can't import his old archive files after upgrade or archives from others who > are using 0.94. > The ideal way is to catch deserialization error and then fall back to 0.94 > format for importing. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9892) Add info port to ServerName to support multi instances in a node
[ https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815594#comment-13815594 ] stack commented on HBASE-9892: -- HBASE-7027 is an admitted hack; would be good to do it better. > Add info port to ServerName to support multi instances in a node > > > Key: HBASE-9892 > URL: https://issues.apache.org/jira/browse/HBASE-9892 > Project: HBase > Issue Type: Improvement >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, > HBASE-9892-0.94-v3.diff > > > The full GC time of a regionserver with a big heap (> 30G) usually cannot be > kept under 30s. At the same time, servers with 64G memory are now common. > So we try to deploy multiple rs instances (2-3) on a single node, with the heap of > each rs at about 20G ~ 24G. > Most things work fine, except the hbase web ui. The master gets the RS > info port from conf, which is not suitable for this situation of multiple rs > instances on a node. So we add the info port to ServerName. > a. At startup, the rs reports its info port to HMaster. > b. For the root region, the rs writes the servername with info port to the zookeeper > root-region-server node. > c. For meta regions, the rs writes the servername with info port to the root region. > d. For user regions, the rs writes the servername with info port to meta regions. > So hmaster and client can get the info port from the servername. > To test this feature, I changed the rs num from 1 to 3 in standalone mode, so > we can test it in standalone mode. > I think Hoya (hbase on yarn) will encounter the same problem. Does anyone know > how Hoya handles this problem? > PS: There are different formats for the servername in the zk node and the meta table; I > think we need to unify them and refactor the code. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9873) Some improvements in hlog and hlog split
[ https://issues.apache.org/jira/browse/HBASE-9873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815593#comment-13815593 ] Liu Shaohui commented on HBASE-9873: [~nkeywal] {quote} Actually, we want to introduce a speculative scheduler for hlog tasks, like the speculative scheduler for map/reduce tasks in mapreduce. Note that there is a new algo implemented in HBASE-7006, which allows writes during the recovery. This algo is not really suitable for speculative execution, because the writes are always executed on the same machines, so adding executions would likely slow down the process. Ok that's not for 0.94 {quote} I will take a deep look at HBASE-7006 first. Thanks. {quote} Rely on the smallest of all biggest hfile's seqId of previous served regions to ignore some entries. Facebook have implemented this in HBASE-6508 and we backported it to hbase 0.94 in HBASE-9568. Yep, this would be useful for sure (my understanding is that 0.96+ has it) {quote} Thanks. HBASE-8573 has done it in 0.96. Sorry for not noticing it. As many companies still use 0.94, I think backporting is needed. > Some improvements in hlog and hlog split > > > Key: HBASE-9873 > URL: https://issues.apache.org/jira/browse/HBASE-9873 > Project: HBase > Issue Type: Improvement > Components: MTTR, wal >Reporter: Liu Shaohui >Priority: Critical > Labels: failover, hlog > > Some improvements in hlog and hlog split > 1) Try to clean old hlogs after each memstore flush to avoid unnecessary hlog splits > in failover. Currently hlog cleaning is only run when rolling the hlog writer. > 2) Add a background hlog compaction thread to compact the hlog: remove the > hlog entries whose data have been flushed to hfiles. The scenario is that in a > shared cluster, the write requests of a table may be very few and periodic, so a > lot of hlogs cannot be cleaned because of entries of that table in those hlogs.
> 3) Rely on the smallest of all biggest hfile's seqId of previously served > regions to ignore some entries. Facebook have implemented this in HBASE-6508 > and we backported it to hbase 0.94 in HBASE-9568. > 4) Support running multiple hlog splitters on a single RS and on the > master (the latter can boost split efficiency for a tiny cluster). > 5) Enable multiple splitters on a 'big' hlog file by splitting the hlog (logically) > into slices (of configurable size, e.g. the hdfs chunk size, 64M), and > support multiple concurrent split tasks on a single hlog file slice. > 6) Do not cancel a timed-out split task until another task reports success > (this avoids the scenario where the split for a hlog file fails because no single task can > succeed within the timeout period), and reschedule the same split task to > reduce split time (to avoid stragglers in hlog split). > 7) Consider hlog data locality when scheduling a hlog split task: > schedule the hlog to a splitter which is near the hlog data. > 8) Support multiple hlog writers, switching to another hlog writer when > write latency to the current hlog is long due to a possible temporary network spike? > This is a draft listing the hlog improvements we plan to implement > in the near future. Comments and discussion are welcome. -- This message was sent by Atlassian JIRA (v6.1#6144)
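Item 5's logical slicing could be outlined as follows (hypothetical helper assuming only a file length and a configurable slice size in bytes; not actual HBase splitter code):

```java
// Hypothetical sketch of item 5 above: split an hlog of a given length into
// logical [start, end) byte-offset slices of a configurable size, so each
// slice can be handed to a separate split task. Not actual HBase code.
public class HlogSliceSketch {

    static long[][] slices(long fileLength, long sliceSize) {
        int n = (int) ((fileLength + sliceSize - 1) / sliceSize); // ceiling division
        long[][] out = new long[n][2];
        for (int i = 0; i < n; i++) {
            out[i][0] = i * sliceSize;                             // slice start offset
            out[i][1] = Math.min(fileLength, (i + 1) * sliceSize); // slice end offset
        }
        return out;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        long[][] s = slices(150 * mb, 64 * mb); // e.g. 64M slices as suggested above
        System.out.println(s.length); // 3 slices: 0-64M, 64-128M, 128-150M
    }
}
```

In a real splitter each slice boundary would additionally have to be aligned to the next whole WAL entry, which this byte-offset sketch deliberately ignores.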
[jira] [Commented] (HBASE-9775) Client write path perf issues
[ https://issues.apache.org/jira/browse/HBASE-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815591#comment-13815591 ] stack commented on HBASE-9775: -- Thanks [~jmspaggi]. Don't mind the patch in here. I think Nicolas had a prescription above for you comparing 0.94 and 0.96? > Client write path perf issues > - > > Key: HBASE-9775 > URL: https://issues.apache.org/jira/browse/HBASE-9775 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Priority: Critical > Attachments: 9775.rig.txt, 9775.rig.v2.patch, 9775.rig.v3.patch, > Charts Search Cloudera Manager - ITBLL.png, Charts Search Cloudera > Manager.png, hbase-9775.patch, job_run.log, short_ycsb.png, ycsb.png, > ycsb_insert_94_vs_96.png > > > Testing on larger clusters has not had the desired throughput increases. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9889) Make sure we clean up scannerReadPoints upon any exceptions
[ https://issues.apache.org/jira/browse/HBASE-9889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815590#comment-13815590 ] Amitanand Aiyer commented on HBASE-9889: I think we can move it to the end of the constructor. We just need to make sure that we grab the read points before we open the scanners (we have seen get latency go up if we move the entire synchronized() block to the end). For the removal, I think it is okay; scannerReadPoints is supposed to be a ConcurrentHashMap. > Make sure we clean up scannerReadPoints upon any exceptions > --- > > Key: HBASE-9889 > URL: https://issues.apache.org/jira/browse/HBASE-9889 > Project: HBase > Issue Type: Sub-task >Affects Versions: 0.89-fb, 0.94.12, 0.96.0 >Reporter: Amitanand Aiyer >Assignee: Amitanand Aiyer >Priority: Minor > Fix For: 0.96.1 > > Attachments: hbase-9889.diff > > > If there is an exception in the creation of a RegionScanner (for example, an > exception while opening store files), the scanner read point is not cleaned > up. > Having an unused old entry in scannerReadPoints means that flushes and > compactions cannot garbage-collect older versions. -- This message was sent by Atlassian JIRA (v6.1#6144)
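The cleanup under discussion can be sketched like this (illustrative names, not the actual RegionScanner constructor): register the read point first, then remove it again if opening the scanner fails, so a failed open cannot pin old versions. Because the map is a ConcurrentHashMap, the removal needs no extra locking.

```java
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of the cleanup discussed above: if scanner construction
// fails after the read point has been registered, remove the entry so flushes
// and compactions are not blocked from collecting older versions.
// Names are illustrative, not the actual HBase RegionScanner internals.
public class ScannerReadPointsSketch {

    static final ConcurrentHashMap<Long, Long> scannerReadPoints = new ConcurrentHashMap<>();

    static long openScanner(long scannerId, long readPoint, boolean failOpen) {
        scannerReadPoints.put(scannerId, readPoint); // grab the read point first
        try {
            if (failOpen) { // stands in for an exception while opening store files
                throw new RuntimeException("failed to open store files");
            }
            return readPoint;
        } catch (RuntimeException e) {
            // ConcurrentHashMap, so removal is safe without extra locking
            scannerReadPoints.remove(scannerId);
            throw e;
        }
    }

    public static void main(String[] args) {
        openScanner(1L, 100L, false);
        try {
            openScanner(2L, 101L, true);
        } catch (RuntimeException expected) {
            // scanner 2's read point must have been cleaned up
        }
        System.out.println(scannerReadPoints.keySet()); // only scanner 1 remains
    }
}
```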
[jira] [Commented] (HBASE-9908) [WINDOWS] Fix filesystem / classloader related unit tests
[ https://issues.apache.org/jira/browse/HBASE-9908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815587#comment-13815587 ] Hadoop QA commented on HBASE-9908: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612498/hbase-9908_v1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 32 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7766//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7766//console This message is automatically generated. > [WINDOWS] Fix filesystem / classloader related unit tests > - > > Key: HBASE-9908 > URL: https://issues.apache.org/jira/browse/HBASE-9908 > Project: HBase > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1 > > Attachments: hbase-9908_v1.patch > > > Some of the unit tests related to classloasing and filesystem are failing on > windows. 
> {code} > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testHBase3810 > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromLocalFS > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testPrivateClassLoader > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromRelativeLibDirInJar > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromLibDirInJar > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromHDFS > org.apache.hadoop.hbase.backup.TestHFileArchiving.testCleaningRace > org.apache.hadoop.hbase.regionserver.wal.TestDurability.testDurability > org.apache.hadoop.hbase.regionserver.wal.TestHLog.testMaintainOrderWithConcurrentWrites > org.apache.hadoop.hbase.security.access.TestAccessController.testBulkLoad > org.apache.hadoop.hbase.regionserver.TestHRegion.testRecoveredEditsReplayCompaction > org.apache.hadoop.hbase.regionserver.TestHRegionBusyWait.testRecoveredEditsReplayCompaction > org.apache.hadoop.hbase.util.TestFSUtils.testRenameAndSetModifyTime > {code} > The root causes are: > - Using local file name for referring to hdfs paths (HBASE-6830) > - Classloader using the wrong file system > - StoreFile readers not being closed (for unfinished compaction) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9909) TestHFilePerformance should not be a unit test, but a tool
[ https://issues.apache.org/jira/browse/HBASE-9909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815584#comment-13815584 ] Jean-Marc Spaggiari commented on HBASE-9909: bq. Looked at those, it seems these are slightly different. TestHFilePerformance is more focussed on perf for seq writes and reads between hfile and seq file. Should they be merged then? I guess you will say yes, but on a separate JIRA? ;) Opened HBASE-9910 and HBASE-9911. > TestHFilePerformance should not be a unit test, but a tool > -- > > Key: HBASE-9909 > URL: https://issues.apache.org/jira/browse/HBASE-9909 > Project: HBase > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1 > > Attachments: hbase-9909_v1.patch > > > TestHFilePerformance is a very old test, which does not test anything, but a > perf evaluation tool. It is not clear to me whether there is any utility for > keeping it, but that should at least be converted to be a tool. > Note that TestHFile already covers the unit test cases (writing hfile with > none and gz compression). We do not need to test SequenceFile. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9911) PerformanceEvaluation should be used as a proxy class for TestHFilePerformance and HFilePerformanceEvaluation
Jean-Marc Spaggiari created HBASE-9911: -- Summary: PerformanceEvaluation should be used as a proxy class for TestHFilePerformance and HFilePerformanceEvaluation Key: HBASE-9911 URL: https://issues.apache.org/jira/browse/HBASE-9911 Project: HBase Issue Type: Bug Reporter: Jean-Marc Spaggiari All the performance test classes should be called from PerformanceEvaluation as a proxy. This will allow us to have a clear view of all the performance tests available. TestHFilePerformance and HFilePerformanceEvaluation should do the same. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9910) TestHFilePerformance and HFilePerformanceEvaluation should be merged in a single HFile performance test class.
Jean-Marc Spaggiari created HBASE-9910: -- Summary: TestHFilePerformance and HFilePerformanceEvaluation should be merged in a single HFile performance test class. Key: HBASE-9910 URL: https://issues.apache.org/jira/browse/HBASE-9910 Project: HBase Issue Type: Bug Components: Performance Reporter: Jean-Marc Spaggiari Today TestHFilePerformance and HFilePerformanceEvaluation are doing slightly different kinds of performance tests, both for the HFile. We should consider merging those 2 tests into a single class. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9892) Add info port to ServerName to support multi instances in a node
[ https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815583#comment-13815583 ] Enis Soztutar commented on HBASE-9892: -- bq. What about backporting HBASE-7027 to 0.94 and fixing the issue in table.jsp? It makes sense, but the challenge is that we cannot easily backport HBASE-7027 without breaking BC. HServerLoad does not have extra fields we can use, I fear. Your patch at RB is actually better than 7027; if we do this for trunk, we should undo 7027. > Add info port to ServerName to support multi instances in a node > > > Key: HBASE-9892 > URL: https://issues.apache.org/jira/browse/HBASE-9892 > Project: HBase > Issue Type: Improvement >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, > HBASE-9892-0.94-v3.diff > > > The full GC time of a regionserver with a big heap (> 30G) usually cannot be > kept under 30s. At the same time, servers with 64G memory are now common. > So we try to deploy multiple rs instances (2-3) on a single node, with the heap of > each rs at about 20G ~ 24G. > Most things work fine, except the hbase web ui. The master gets the RS > info port from conf, which is not suitable for this situation of multiple rs > instances on a node. So we add the info port to ServerName. > a. At startup, the rs reports its info port to HMaster. > b. For the root region, the rs writes the servername with info port to the zookeeper > root-region-server node. > c. For meta regions, the rs writes the servername with info port to the root region. > d. For user regions, the rs writes the servername with info port to meta regions. > So hmaster and client can get the info port from the servername. > To test this feature, I changed the rs num from 1 to 3 in standalone mode, so > we can test it in standalone mode. > I think Hoya (hbase on yarn) will encounter the same problem. Does anyone know > how Hoya handles this problem?
> PS: There are different formats for the servername in the zk node and the meta table; I > think we need to unify them and refactor the code. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9909) TestHFilePerformance should not be a unit test, but a tool
[ https://issues.apache.org/jira/browse/HBASE-9909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815577#comment-13815577 ] Enis Soztutar commented on HBASE-9909: -- bq. I would love to see a "proxy" for all those performance testing files... Can we also modify PE to have an option to test the HFilePerf the same way we have randomWrite, etc.? Sure, not in this issue though. bq. TestHFilePerformance and HFilePerformanceEvaluation ? Looked at those; they seem to be slightly different. TestHFilePerformance is more focused on perf of sequential writes and reads between hfile and seq file. > TestHFilePerformance should not be a unit test, but a tool > -- > > Key: HBASE-9909 > URL: https://issues.apache.org/jira/browse/HBASE-9909 > Project: HBase > Issue Type: Bug > Reporter: Enis Soztutar > Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1 > > Attachments: hbase-9909_v1.patch > > > TestHFilePerformance is a very old test which does not test anything; it is really a perf evaluation tool. It is not clear to me whether there is any utility in keeping it, but it should at least be converted into a tool. > Note that TestHFile already covers the unit test cases (writing hfiles with none and gz compression). We do not need to test SequenceFile. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9892) Add info port to ServerName to support multi instances in a node
[ https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815575#comment-13815575 ] Liu Shaohui commented on HBASE-9892: Thanks [~ndimiduk][~enis]. HBASE-7027 partly fixed the problem by reporting the info port to the HMaster via ServerLoad instead. But the infoPort in table.jsp is still read from the config: {code} // HARDCODED FOR NOW TODO: FIX GET FROM ZK // This port might be wrong if RS actually ended up using something else. int infoPort = conf.getInt("hbase.regionserver.info.port", 60030); {code} What about backporting HBASE-7027 to 0.94 and fixing the issue in table.jsp? > Add info port to ServerName to support multi instances in a node > > > Key: HBASE-9892 > URL: https://issues.apache.org/jira/browse/HBASE-9892 > Project: HBase > Issue Type: Improvement > Reporter: Liu Shaohui > Assignee: Liu Shaohui > Priority: Minor > Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, HBASE-9892-0.94-v3.diff > > > The full GC time of a regionserver with a big heap (> 30G) usually cannot be kept under 30s, while servers with 64G of memory are common. So we try to deploy multiple RS instances (2-3) on a single node, with the heap of each RS at about 20G ~ 24G. > Most things work fine, except the HBase web UI. The master gets the RS info port from the conf, which is not suitable for this situation of multiple RS instances on a node. So we add the info port to ServerName: > a. At startup, the RS reports its info port to the HMaster. > b. For the root region, the RS writes the servername with the info port to the zookeeper root-region-server node. > c. For meta regions, the RS writes the servername with the info port to the root region. > d. For user regions, the RS writes the servername with the info port to the meta regions. > So the HMaster and clients can get the info port from the servername.
> To test this feature, I changed the RS count from 1 to 3 in standalone mode, so we can test it in standalone mode. > I think Hoya (HBase on YARN) will encounter the same problem. Does anyone know how Hoya handles this problem? > PS: There are different formats for the servername in the zk node and the meta table; I think we need to unify them and refactor the code. -- This message was sent by Atlassian JIRA (v6.1#6144)
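The table.jsp snippet quoted in the comment above always falls back to the cluster-wide config value. A minimal, self-contained sketch of the difference between that conf-wide lookup and a per-server lookup (as enabled by reporting the info port via ServerLoad/ServerName) — all names here are illustrative, this is not HBase code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative model, NOT HBase code: why a single configured info port
// breaks when several region servers share a node, and how preferring the
// port each server reported fixes it.
class InfoPortLookup {
    static final int DEFAULT_INFO_PORT = 60030; // default of hbase.regionserver.info.port

    // Old behaviour (as in the quoted table.jsp code): one port for every server.
    static int fromConf(Map<String, String> conf) {
        return Integer.parseInt(conf.getOrDefault(
                "hbase.regionserver.info.port", String.valueOf(DEFAULT_INFO_PORT)));
    }

    // Proposed behaviour: use the port this particular server reported,
    // falling back to the configured value only when nothing was reported.
    static int forServer(Map<String, Integer> reportedPorts, String serverName,
                         Map<String, String> conf) {
        return Optional.ofNullable(reportedPorts.get(serverName))
                       .orElse(fromConf(conf));
    }
}
```

With three RS instances on one host listening on 60030/60031/60032, `fromConf` answers 60030 for all of them, while `forServer` resolves each instance to its own UI port.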
[jira] [Commented] (HBASE-9909) TestHFilePerformance should not be a unit test, but a tool
[ https://issues.apache.org/jira/browse/HBASE-9909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815569#comment-13815569 ] Jean-Marc Spaggiari commented on HBASE-9909: Also, is there any duplication between TestHFilePerformance and HFilePerformanceEvaluation? > TestHFilePerformance should not be a unit test, but a tool > -- > > Key: HBASE-9909 > URL: https://issues.apache.org/jira/browse/HBASE-9909 > Project: HBase > Issue Type: Bug > Reporter: Enis Soztutar > Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1 > > Attachments: hbase-9909_v1.patch > > > TestHFilePerformance is a very old test which does not test anything; it is really a perf evaluation tool. It is not clear to me whether there is any utility in keeping it, but it should at least be converted into a tool. > Note that TestHFile already covers the unit test cases (writing hfiles with none and gz compression). We do not need to test SequenceFile. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9909) TestHFilePerformance should not be a unit test, but a tool
[ https://issues.apache.org/jira/browse/HBASE-9909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815568#comment-13815568 ] Jean-Marc Spaggiari commented on HBASE-9909: I would love to see a "proxy" for all those performance testing files... Can we also modify PE to have an option to test the HFilePerf the same way we have randomWrite, etc.? > TestHFilePerformance should not be a unit test, but a tool > -- > > Key: HBASE-9909 > URL: https://issues.apache.org/jira/browse/HBASE-9909 > Project: HBase > Issue Type: Bug > Reporter: Enis Soztutar > Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1 > > Attachments: hbase-9909_v1.patch > > > TestHFilePerformance is a very old test which does not test anything; it is really a perf evaluation tool. It is not clear to me whether there is any utility in keeping it, but it should at least be converted into a tool. > Note that TestHFile already covers the unit test cases (writing hfiles with none and gz compression). We do not need to test SequenceFile. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-9906: - Attachment: hbase-9906-0.94_v1.patch 0.94 version of the patch. > Restore snapshot fails to restore the meta edits sporadically > --- > > Key: HBASE-9906 > URL: https://issues.apache.org/jira/browse/HBASE-9906 > Project: HBase > Issue Type: New Feature > Components: snapshots > Reporter: Enis Soztutar > Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1, 0.94.14 > > Attachments: hbase-9906-0.94_v1.patch, hbase-9906_v1.patch > > > After snapshot restore, we see failures to find the table in meta: > {code} > > disable 'tablefour' > > restore_snapshot 'snapshot_tablefour' > > enable 'tablefour' > ERROR: Table tablefour does not exist. > {code} > This is quite subtle. From the looks of it, we successfully restore the snapshot, do the meta updates, and return the status to the client. The client then tries to do an operation on the table (like enable table, or scan in the test outputs) which fails because the meta entry for the region seems to be gone (in the case of a single region, the table will be reported missing). Subsequent attempts to create the table will also fail because the table directories will be there, but not the meta entries. > For restoring meta entries, we are doing a delete and then a put to the same region: > {code} > 2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: 76d0e2b7ec3291afcaa82e18a56ccc30 > 2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: fa41edf43fe3ee131db4a34b848ff432 > ...
> 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE > 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added 1 > {code} > The root cause of this sporadic failure is that the delete and the subsequent put will have the same timestamp if they execute in the same ms, and a delete masks a put with the same ts even though the put was written later. > See: HBASE-9905, HBASE-8770 > Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
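The delete-then-put race described in the issue above can be reduced to a few lines. The following is a minimal model of the masking semantics (it is NOT HBase code — just a sketch of the rule that a delete marker hides every cell whose timestamp is less than or equal to the marker's, regardless of the order in which the mutations were applied):

```java
import java.util.TreeMap;

// Illustrative model of delete-marker masking, NOT HBase code.
class DeleteMaskModel {
    private long deleteMarkerTs = Long.MIN_VALUE;
    private final TreeMap<Long, String> puts = new TreeMap<Long, String>();

    void delete(long ts) { deleteMarkerTs = Math.max(deleteMarkerTs, ts); }

    void put(long ts, String value) { puts.put(ts, value); }

    // Newest value not masked by the delete marker, or null if everything
    // is hidden. Note: arrival order of mutations plays no role here.
    String get() {
        for (Long ts : puts.descendingKeySet()) {
            if (ts > deleteMarkerTs) { return puts.get(ts); }
        }
        return null;
    }
}
```

When the restore's delete and put land in the same millisecond, the put gets the same timestamp as the marker and `get()` returns null — the meta entry "disappears" exactly as in the sporadic test failure; with a strictly newer timestamp the put is visible again.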
[jira] [Commented] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815560#comment-13815560 ] Enis Soztutar commented on HBASE-9906: -- Thanks Matteo; the test failure seems unrelated. > Restore snapshot fails to restore the meta edits sporadically > --- > > Key: HBASE-9906 > URL: https://issues.apache.org/jira/browse/HBASE-9906 > Project: HBase > Issue Type: New Feature > Components: snapshots > Reporter: Enis Soztutar > Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1, 0.94.14 > > Attachments: hbase-9906_v1.patch > > > After snapshot restore, we see failures to find the table in meta: > {code} > > disable 'tablefour' > > restore_snapshot 'snapshot_tablefour' > > enable 'tablefour' > ERROR: Table tablefour does not exist. > {code} > This is quite subtle. From the looks of it, we successfully restore the snapshot, do the meta updates, and return the status to the client. The client then tries to do an operation on the table (like enable table, or scan in the test outputs) which fails because the meta entry for the region seems to be gone (in the case of a single region, the table will be reported missing). Subsequent attempts to create the table will also fail because the table directories will be there, but not the meta entries. > For restoring meta entries, we are doing a delete and then a put to the same region: > {code} > 2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: 76d0e2b7ec3291afcaa82e18a56ccc30 > 2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: fa41edf43fe3ee131db4a34b848ff432 > ...
> 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE > 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added 1 > {code} > The root cause of this sporadic failure is that the delete and the subsequent put will have the same timestamp if they execute in the same ms, and a delete masks a put with the same ts even though the put was written later. > See: HBASE-9905, HBASE-8770 > Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9890) MR jobs are not working if started by a delegated user
[ https://issues.apache.org/jira/browse/HBASE-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815556#comment-13815556 ] Gary Helmling commented on HBASE-9890: -- v2 looks good. A couple of comments: * Instead of ZKClusterId.getUUIDForCluster() and converting back to String, you can just use ZKClusterId.readClusterIdZNode(). * I think we need the same changes in mapred.TableMapReduceUtil. Although that one doesn't have the 2 ZK quorum support, if more than one HBASE_AUTH_TOKEN is present for the UGI, you could still wind up returning the wrong one and adding it to the job. > MR jobs are not working if started by a delegated user > -- > > Key: HBASE-9890 > URL: https://issues.apache.org/jira/browse/HBASE-9890 > Project: HBase > Issue Type: Bug > Components: mapreduce, security > Affects Versions: 0.98.0, 0.94.12, 0.96.0 > Reporter: Matteo Bertozzi > Assignee: Matteo Bertozzi > Fix For: 0.98.0, 0.94.13, 0.96.1 > > Attachments: HBASE-9890-94-v0.patch, HBASE-9890-94-v1.patch, HBASE-9890-v0.patch, HBASE-9890-v1.patch, HBASE-9890-v2.patch > > > If Map-Reduce jobs are started by a proxy user that already has the delegation tokens, we get an exception on "obtain token" since the proxy user doesn't have Kerberos auth. > For example: > * If we use Oozie to execute RowCounter, Oozie will get the required tokens (HBASE_AUTH_TOKEN) and start the RowCounter. Once the RowCounter tries to obtain the token, it will get an exception. > * If we use Oozie to execute LoadIncrementalHFiles, Oozie will get the required tokens (HDFS_DELEGATION_TOKEN) and start the LoadIncrementalHFiles. Once the LoadIncrementalHFiles tries to obtain the token, it will get an exception.
> {code} > org.apache.hadoop.hbase.security.AccessDeniedException: Token generation > only allowed for Kerberos authenticated clients > at > org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87) > {code} > {code} > org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token > can be issued only with kerberos or web authentication > at > org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:783) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:868) > at > org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:509) > at > org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:487) > at > org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:130) > at > org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:111) > at > org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:85) > at > org.apache.hadoop.filecache.TrackerDistributedCacheManager.getDelegationTokens(TrackerDistributedCacheManager.java:949) > at > org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:854) > at > org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743) > at > org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) > at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596) > at > org.apache.hadoop.hbase.mapreduce.RowCounter.main(RowCounter.java:173) > {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
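Gary's second point — that matching on token kind alone can return the wrong HBASE_AUTH_TOKEN when the UGI carries tokens for more than one cluster — can be sketched as follows. The types are illustrative only (the real code works with Hadoop's Token/UserGroupInformation classes); the point is that selection must match the service (cluster id), not just the kind:

```java
import java.util.List;

// Illustrative model, not Hadoop's API: selecting the auth token for a
// specific cluster from the set of tokens a UGI may carry.
class TokenPick {
    static class Token {
        final String kind;     // e.g. "HBASE_AUTH_TOKEN"
        final String service;  // e.g. the target cluster id
        Token(String kind, String service) { this.kind = kind; this.service = service; }
    }

    // Returns the auth token for the target cluster, or null if absent.
    // Filtering on kind alone could return a token for some other cluster.
    static Token selectToken(List<Token> ugiTokens, String clusterId) {
        for (Token t : ugiTokens) {
            if ("HBASE_AUTH_TOKEN".equals(t.kind) && clusterId.equals(t.service)) {
                return t;
            }
        }
        return null;
    }
}
```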
[jira] [Updated] (HBASE-9909) TestHFilePerformance should not be a unit test, but a tool
[ https://issues.apache.org/jira/browse/HBASE-9909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-9909: - Status: Patch Available (was: Open) > TestHFilePerformance should not be a unit test, but a tool > -- > > Key: HBASE-9909 > URL: https://issues.apache.org/jira/browse/HBASE-9909 > Project: HBase > Issue Type: Bug > Reporter: Enis Soztutar > Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1 > > Attachments: hbase-9909_v1.patch > > > TestHFilePerformance is a very old test which does not test anything; it is really a perf evaluation tool. It is not clear to me whether there is any utility in keeping it, but it should at least be converted into a tool. > Note that TestHFile already covers the unit test cases (writing hfiles with none and gz compression). We do not need to test SequenceFile. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9909) TestHFilePerformance should not be a unit test, but a tool
[ https://issues.apache.org/jira/browse/HBASE-9909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-9909: - Attachment: hbase-9909_v1.patch Attaching patch. > TestHFilePerformance should not be a unit test, but a tool > -- > > Key: HBASE-9909 > URL: https://issues.apache.org/jira/browse/HBASE-9909 > Project: HBase > Issue Type: Bug > Reporter: Enis Soztutar > Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1 > > Attachments: hbase-9909_v1.patch > > > TestHFilePerformance is a very old test which does not test anything; it is really a perf evaluation tool. It is not clear to me whether there is any utility in keeping it, but it should at least be converted into a tool. > Note that TestHFile already covers the unit test cases (writing hfiles with none and gz compression). We do not need to test SequenceFile. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9775) Client write path perf issues
[ https://issues.apache.org/jira/browse/HBASE-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815545#comment-13815545 ] Jean-Marc Spaggiari commented on HBASE-9775: I will try to start the tests this evening; otherwise it will be tomorrow morning. They might take about 24h. I will run the latest 0.96 and the latest 0.96+9775 and compare. I will run in standalone mode, but I can also run in pseudo-distributed mode if you want (6 disks). > Client write path perf issues > - > > Key: HBASE-9775 > URL: https://issues.apache.org/jira/browse/HBASE-9775 > Project: HBase > Issue Type: Bug > Components: Client > Affects Versions: 0.96.0 > Reporter: Elliott Clark > Priority: Critical > Attachments: 9775.rig.txt, 9775.rig.v2.patch, 9775.rig.v3.patch, Charts Search Cloudera Manager - ITBLL.png, Charts Search Cloudera Manager.png, hbase-9775.patch, job_run.log, short_ycsb.png, ycsb.png, ycsb_insert_94_vs_96.png > > > Testing on larger clusters has not shown the desired throughput increases. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9909) TestHFilePerformance should not be a unit test, but a tool
Enis Soztutar created HBASE-9909: Summary: TestHFilePerformance should not be a unit test, but a tool Key: HBASE-9909 URL: https://issues.apache.org/jira/browse/HBASE-9909 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.0, 0.96.1 TestHFilePerformance is a very old test which does not test anything; it is really a perf evaluation tool. It is not clear to me whether there is any utility in keeping it, but it should at least be converted into a tool. Note that TestHFile already covers the unit test cases (writing hfiles with none and gz compression). We do not need to test SequenceFile. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815544#comment-13815544 ] Hadoop QA commented on HBASE-9906: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612481/hbase-9906_v1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7765//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7765//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7765//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7765//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7765//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7765//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7765//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7765//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7765//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7765//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7765//console This message is automatically generated. 
> Restore snapshot fails to restore the meta edits sporadically > --- > > Key: HBASE-9906 > URL: https://issues.apache.org/jira/browse/HBASE-9906 > Project: HBase > Issue Type: New Feature > Components: snapshots > Reporter: Enis Soztutar > Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1, 0.94.14 > > Attachments: hbase-9906_v1.patch > > > After snapshot restore, we see failures to find the table in meta: > {code} > > disable 'tablefour' > > restore_snapshot 'snapshot_tablefour' > > enable 'tablefour' > ERROR: Table tablefour does not exist. > {code} > This is quite subtle. From the looks of it, we successfully restore the snapshot, do the meta updates, and return the status to the client. The client then tries to do an operation on the table (like enable table, or scan in the test outputs) which fails because the meta entry for the region seems to be gone (in the case of a single region, the table will be reported missing). Subsequent attempts to create the table will also fail because the table directories will be there, but not the meta entries. > For restoring meta entries, we are doing a delete and then a put to the same region: > {code} > 2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: 76d0e2b7ec3291afcaa82e18a56ccc30 > 2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: fa41edf43fe3ee131db4a34b848ff432 > ... > 2013-11-0
[jira] [Commented] (HBASE-9892) Add info port to ServerName to support multi instances in a node
[ https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815541#comment-13815541 ] Enis Soztutar commented on HBASE-9892: -- If we do not have the problem in 0.96, I would be -0 for fixing it in 0.94. > Add info port to ServerName to support multi instances in a node > > > Key: HBASE-9892 > URL: https://issues.apache.org/jira/browse/HBASE-9892 > Project: HBase > Issue Type: Improvement > Reporter: Liu Shaohui > Assignee: Liu Shaohui > Priority: Minor > Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, HBASE-9892-0.94-v3.diff > > > The full GC time of a regionserver with a big heap (> 30G) usually cannot be kept under 30s, while servers with 64G of memory are common. So we try to deploy multiple RS instances (2-3) on a single node, with the heap of each RS at about 20G ~ 24G. > Most things work fine, except the HBase web UI. The master gets the RS info port from the conf, which is not suitable for this situation of multiple RS instances on a node. So we add the info port to ServerName: > a. At startup, the RS reports its info port to the HMaster. > b. For the root region, the RS writes the servername with the info port to the zookeeper root-region-server node. > c. For meta regions, the RS writes the servername with the info port to the root region. > d. For user regions, the RS writes the servername with the info port to the meta regions. > So the HMaster and clients can get the info port from the servername. > To test this feature, I changed the RS count from 1 to 3 in standalone mode, so we can test it in standalone mode. > I think Hoya (HBase on YARN) will encounter the same problem. Does anyone know how Hoya handles this problem? > PS: There are different formats for the servername in the zk node and the meta table; I think we need to unify them and refactor the code. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9908) [WINDOWS] Fix filesystem / classloader related unit tests
[ https://issues.apache.org/jira/browse/HBASE-9908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-9908: - Attachment: hbase-9908_v1.patch Attaching simple patch. > [WINDOWS] Fix filesystem / classloader related unit tests > - > > Key: HBASE-9908 > URL: https://issues.apache.org/jira/browse/HBASE-9908 > Project: HBase > Issue Type: Bug > Reporter: Enis Soztutar > Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1 > > Attachments: hbase-9908_v1.patch > > > Some of the unit tests related to classloading and the filesystem are failing on Windows. > {code} > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testHBase3810 > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromLocalFS > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testPrivateClassLoader > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromRelativeLibDirInJar > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromLibDirInJar > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromHDFS > org.apache.hadoop.hbase.backup.TestHFileArchiving.testCleaningRace > org.apache.hadoop.hbase.regionserver.wal.TestDurability.testDurability > org.apache.hadoop.hbase.regionserver.wal.TestHLog.testMaintainOrderWithConcurrentWrites > org.apache.hadoop.hbase.security.access.TestAccessController.testBulkLoad > org.apache.hadoop.hbase.regionserver.TestHRegion.testRecoveredEditsReplayCompaction > org.apache.hadoop.hbase.regionserver.TestHRegionBusyWait.testRecoveredEditsReplayCompaction > org.apache.hadoop.hbase.util.TestFSUtils.testRenameAndSetModifyTime > {code} > The root causes are: > - Using local file names to refer to hdfs paths (HBASE-6830) > - The classloader using the wrong file system > - StoreFile readers not being closed (for unfinished compaction) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9908) [WINDOWS] Fix filesystem / classloader related unit tests
[ https://issues.apache.org/jira/browse/HBASE-9908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-9908: - Status: Patch Available (was: Open) > [WINDOWS] Fix filesystem / classloader related unit tests > - > > Key: HBASE-9908 > URL: https://issues.apache.org/jira/browse/HBASE-9908 > Project: HBase > Issue Type: Bug > Reporter: Enis Soztutar > Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1 > > Attachments: hbase-9908_v1.patch > > > Some of the unit tests related to classloading and the filesystem are failing on Windows. > {code} > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testHBase3810 > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromLocalFS > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testPrivateClassLoader > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromRelativeLibDirInJar > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromLibDirInJar > org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromHDFS > org.apache.hadoop.hbase.backup.TestHFileArchiving.testCleaningRace > org.apache.hadoop.hbase.regionserver.wal.TestDurability.testDurability > org.apache.hadoop.hbase.regionserver.wal.TestHLog.testMaintainOrderWithConcurrentWrites > org.apache.hadoop.hbase.security.access.TestAccessController.testBulkLoad > org.apache.hadoop.hbase.regionserver.TestHRegion.testRecoveredEditsReplayCompaction > org.apache.hadoop.hbase.regionserver.TestHRegionBusyWait.testRecoveredEditsReplayCompaction > org.apache.hadoop.hbase.util.TestFSUtils.testRenameAndSetModifyTime > {code} > The root causes are: > - Using local file names to refer to hdfs paths (HBASE-6830) > - The classloader using the wrong file system > - StoreFile readers not being closed (for unfinished compaction) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9908) [WINDOWS] Fix filesystem / classloader related unit tests
Enis Soztutar created HBASE-9908: Summary: [WINDOWS] Fix filesystem / classloader related unit tests Key: HBASE-9908 URL: https://issues.apache.org/jira/browse/HBASE-9908 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.0, 0.96.1 Some of the unit tests related to classloading and the filesystem are failing on Windows. {code} org.apache.hadoop.hbase.coprocessor.TestClassLoading.testHBase3810 org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromLocalFS org.apache.hadoop.hbase.coprocessor.TestClassLoading.testPrivateClassLoader org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromRelativeLibDirInJar org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromLibDirInJar org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromHDFS org.apache.hadoop.hbase.backup.TestHFileArchiving.testCleaningRace org.apache.hadoop.hbase.regionserver.wal.TestDurability.testDurability org.apache.hadoop.hbase.regionserver.wal.TestHLog.testMaintainOrderWithConcurrentWrites org.apache.hadoop.hbase.security.access.TestAccessController.testBulkLoad org.apache.hadoop.hbase.regionserver.TestHRegion.testRecoveredEditsReplayCompaction org.apache.hadoop.hbase.regionserver.TestHRegionBusyWait.testRecoveredEditsReplayCompaction org.apache.hadoop.hbase.util.TestFSUtils.testRenameAndSetModifyTime {code} The root causes are: - Using local file names to refer to hdfs paths (HBASE-6830) - The classloader using the wrong file system - StoreFile readers not being closed (for unfinished compaction) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9818) NPE in HFileBlock#AbstractFSReader#readAtOffset
[ https://issues.apache.org/jira/browse/HBASE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815520#comment-13815520 ] Hadoop QA commented on HBASE-9818: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612473/9818-v5.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): at org.apache.hadoop.hbase.TestZooKeeper.testRegionAssignmentAfterMasterRecoveryDueToZKExpiry(TestZooKeeper.java:488) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7764//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7764//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7764//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7764//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7764//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7764//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7764//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7764//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7764//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7764//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7764//console This message is automatically generated. 
> NPE in HFileBlock#AbstractFSReader#readAtOffset > --- > > Key: HBASE-9818 > URL: https://issues.apache.org/jira/browse/HBASE-9818 > Project: HBase > Issue Type: Bug >Reporter: Jimmy Xiang >Assignee: Ted Yu > Attachments: 9818-v2.txt, 9818-v3.txt, 9818-v4.txt, 9818-v5.txt > > > HFileBlock#istream seems to be null. I was wondering should we hide > FSDataInputStreamWrapper#useHBaseChecksum. > By the way, this happened when online schema change is enabled (encoding) > {noformat} > 2013-10-22 10:58:43,321 ERROR [RpcServer.handler=28,port=36020] > regionserver.HRegionServer: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1200) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1436) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1318) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:359) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:503) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:55
[jira] [Commented] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815518#comment-13815518 ] Matteo Bertozzi commented on HBASE-9906: if we don't have the ts fix, the sleep sounds ok to me > Restore snapshot fails to restore the meta edits sporadically > --- > > Key: HBASE-9906 > URL: https://issues.apache.org/jira/browse/HBASE-9906 > Project: HBase > Issue Type: New Feature > Components: snapshots >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1, 0.94.14 > > Attachments: hbase-9906_v1.patch > > > After snaphot restore, we see failures to find the table in meta: > {code} > > disable 'tablefour' > > restore_snapshot 'snapshot_tablefour' > > enable 'tablefour' > ERROR: Table tablefour does not exist.' > {code} > This is quite subtle. From the looks of it, we successfully restore the > snapshot, do the meta updates, return to the client about the status. The > client then tries to do an operation for the table (like enable table, or > scan in the test outputs) which fails because the meta entry for the region > seems to be gone (in case of single region, the table will be reported > missing). Subsequent attempts for creating the table will also fail because > the table directories will be there, but not the meta entries. > For restoring meta entries, we are doing a delete then a put to the same > region: > {code} > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: > 76d0e2b7ec3291afcaa82e18a56ccc30 > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: > fa41edf43fe3ee131db4a34b848ff432 > ... 
> 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => > 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY > => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => > 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE > 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Added 1 > {code} > The root cause for this sporadic failure is that, the delete and subsequent > put will have the same timestamp if they execute in the same ms. The delete > will override the put in the same ts, even though the put have a larger ts. > See: HBASE-9905, HBASE-8770 > Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
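The delete-masks-put behavior described above can be sketched with a toy version-resolution model (illustrative names only, not the actual HBase read path): a delete marker hides any put whose timestamp is less than or equal to the marker's, regardless of the order in which the mutations were applied.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of version resolution for a single cell. A DELETE marker
// masks any PUT with ts <= the marker's ts, even if the PUT was
// applied after the DELETE.
public class SameTsMasking {
    record Mutation(long ts, boolean isDelete, String value) {}

    static String read(List<Mutation> mutations) {
        long deleteTs = Long.MIN_VALUE;
        for (Mutation m : mutations) {
            if (m.isDelete()) deleteTs = Math.max(deleteTs, m.ts());
        }
        Mutation newest = null;
        for (Mutation m : mutations) {
            if (!m.isDelete() && m.ts() > deleteTs
                    && (newest == null || m.ts() > newest.ts())) {
                newest = m;
            }
        }
        return newest == null ? null : newest.value();
    }

    public static void main(String[] args) {
        List<Mutation> sameMs = new ArrayList<>();
        sameMs.add(new Mutation(100, true, null));       // delete at ts=100
        sameMs.add(new Mutation(100, false, "region"));  // put in the same ms
        System.out.println(read(sameMs));                // null: put is masked

        List<Mutation> laterMs = new ArrayList<>();
        laterMs.add(new Mutation(100, true, null));
        laterMs.add(new Mutation(101, false, "region")); // clock ticked first
        System.out.println(read(laterMs));               // region
    }
}
```

The sporadic failure corresponds to the first case: when the restore's delete and put land in the same millisecond, the meta row appears to vanish.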
[jira] [Commented] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815510#comment-13815510 ] Enis Soztutar commented on HBASE-9906: -- Agreed that sleep is stupid, but without major surgery (uniqueTs, etc), and fixes to HBASE-9770, this seems to be the best option. [~mbertozzi], [~jmhsieh] mind taking a look? Thanks. > Restore snapshot fails to restore the meta edits sporadically > --- > > Key: HBASE-9906 > URL: https://issues.apache.org/jira/browse/HBASE-9906 > Project: HBase > Issue Type: New Feature > Components: snapshots >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1, 0.94.14 > > Attachments: hbase-9906_v1.patch > > > After snaphot restore, we see failures to find the table in meta: > {code} > > disable 'tablefour' > > restore_snapshot 'snapshot_tablefour' > > enable 'tablefour' > ERROR: Table tablefour does not exist.' > {code} > This is quite subtle. From the looks of it, we successfully restore the > snapshot, do the meta updates, return to the client about the status. The > client then tries to do an operation for the table (like enable table, or > scan in the test outputs) which fails because the meta entry for the region > seems to be gone (in case of single region, the table will be reported > missing). Subsequent attempts for creating the table will also fail because > the table directories will be there, but not the meta entries. > For restoring meta entries, we are doing a delete then a put to the same > region: > {code} > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: > 76d0e2b7ec3291afcaa82e18a56ccc30 > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: > fa41edf43fe3ee131db4a34b848ff432 > ... 
> 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => > 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY > => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => > 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE > 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Added 1 > {code} > The root cause for this sporadic failure is that, the delete and subsequent > put will have the same timestamp if they execute in the same ms. The delete > will override the put in the same ts, even though the put have a larger ts. > See: HBASE-9905, HBASE-8770 > Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9890) MR jobs are not working if started by a delegated user
[ https://issues.apache.org/jira/browse/HBASE-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-9890: --- Attachment: HBASE-9890-v2.patch what about something like v2... * fetch the ClusterId * use AuthenticationTokenSelector to select the token based on the clusterId * if the token is not present ask for a new one Is there a way to get the ClusterId without connecting to zookeeper? Should I try to get the token without connecting to zookeeper if we have only one token and there is no quorum address specified? > MR jobs are not working if started by a delegated user > -- > > Key: HBASE-9890 > URL: https://issues.apache.org/jira/browse/HBASE-9890 > Project: HBase > Issue Type: Bug > Components: mapreduce, security >Affects Versions: 0.98.0, 0.94.12, 0.96.0 >Reporter: Matteo Bertozzi >Assignee: Matteo Bertozzi > Fix For: 0.98.0, 0.94.13, 0.96.1 > > Attachments: HBASE-9890-94-v0.patch, HBASE-9890-94-v1.patch, > HBASE-9890-v0.patch, HBASE-9890-v1.patch, HBASE-9890-v2.patch > > > If Map-Reduce jobs are started by a proxy user that already has the > delegation tokens, we get an exception on "obtain token" since the proxy user > doesn't have the kerberos auth. > For example: > * If we use oozie to execute RowCounter - oozie will get the tokens required > (HBASE_AUTH_TOKEN) and it will start the RowCounter. Once the RowCounter > tries to obtain the token, it will get an exception. > * If we use oozie to execute LoadIncrementalHFiles - oozie will get the > tokens required (HDFS_DELEGATION_TOKEN) and it will start the > LoadIncrementalHFiles. Once the LoadIncrementalHFiles tries to obtain the > token, it will get an exception. 
> {code} > org.apache.hadoop.hbase.security.AccessDeniedException: Token generation > only allowed for Kerberos authenticated clients > at > org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87) > {code} > {code} > org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token > can be issued only with kerberos or web authentication > at > org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:783) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:868) > at > org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:509) > at > org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:487) > at > org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:130) > at > org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:111) > at > org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:85) > at > org.apache.hadoop.filecache.TrackerDistributedCacheManager.getDelegationTokens(TrackerDistributedCacheManager.java:949) > at > org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:854) > at > org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743) > at > org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) > at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596) > at > org.apache.hadoop.hbase.mapreduce.RowCounter.main(RowCounter.java:173) > {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
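Matteo's proposed v2 flow above (select an existing token by clusterId, and only request a fresh one when none is present) can be sketched as follows. All names here are hypothetical stand-ins; the actual patch works against UserGroupInformation credentials and AuthenticationTokenSelector, not plain maps.

```java
import java.util.Map;
import java.util.Optional;

// Sketch of the v2 token-selection flow: prefer a token that was already
// delegated to us (e.g. by oozie), and only fall back to obtaining a new
// one, which is what requires Kerberos credentials the proxy user lacks.
public class TokenSelection {
    static Optional<String> selectToken(Map<String, String> userTokens, String clusterId) {
        // 1. Look for a token already present for this cluster.
        return Optional.ofNullable(userTokens.get(clusterId));
    }

    static String tokenFor(Map<String, String> userTokens, String clusterId) {
        // 2. Only request a fresh token when no delegated one exists;
        //    that path needs Kerberos authentication.
        return selectToken(userTokens, clusterId)
                .orElseThrow(() -> new IllegalStateException(
                        "no delegated token; Kerberos auth required to obtain one"));
    }

    public static void main(String[] args) {
        Map<String, String> oozieSupplied = Map.of("cluster-a", "HBASE_AUTH_TOKEN");
        System.out.println(tokenFor(oozieSupplied, "cluster-a")); // HBASE_AUTH_TOKEN
    }
}
```

With this ordering, the oozie-launched RowCounter case above never reaches the "obtain token" call that throws AccessDeniedException.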
[jira] [Commented] (HBASE-9892) Add info port to ServerName to support multi instances in a node
[ https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815502#comment-13815502 ] Nick Dimiduk commented on HBASE-9892: - Hi [~liushaohui] I cannot reproduce the problem you describe on 0.96 or trunk. There it appears the info port comes from ServerName or ServerLoad objects for the live server link or dead server list, respectively. Could we not backport an existing fix rather than add all this new plumbing? For context, I have multiple RS processes deployed on each RS host. Their RPC and info ports are set explicitly for each process. Perhaps your configuration is different? > Add info port to ServerName to support multi instances in a node > > > Key: HBASE-9892 > URL: https://issues.apache.org/jira/browse/HBASE-9892 > Project: HBase > Issue Type: Improvement >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, > HBASE-9892-0.94-v3.diff > > > The full GC time of a regionserver with a big heap (> 30G) usually cannot be > controlled in 30s. At the same time, the servers with 64G memory are normal. > So we try to deploy multi rs instances (2-3) in a single node, and the heap of > each rs is about 20G ~ 24G. > Most things work fine, except the hbase web ui. The master gets the RS > info port from conf, which is not suitable for this situation of multi rs > instances in a node. So we add the info port to ServerName. > a. At startup, the rs reports its info port to HMaster. > b. For the root region, the rs writes the servername with info port to the zookeeper > root-region-server node. > c. For meta regions, the rs writes the servername with info port to the root region. > d. For user regions, the rs writes the servername with info port to meta regions. > So HMaster and clients can get the info port from the servername. 
> To test this feature, I changed the rs count from 1 to 3 in standalone mode, so > the feature can be tested in standalone mode. > I think Hoya (HBase on YARN) will encounter the same problem. Does anyone know > how Hoya handles this problem? > PS: There are different formats for the servername in the zk node and the meta table; I > think we need to unify them and refactor the code. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815492#comment-13815492 ] Sergey Shelukhin commented on HBASE-9906: - Btw another option is uniqueTs. I am -0 on sleep... > Restore snapshot fails to restore the meta edits sporadically > --- > > Key: HBASE-9906 > URL: https://issues.apache.org/jira/browse/HBASE-9906 > Project: HBase > Issue Type: New Feature > Components: snapshots >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1, 0.94.14 > > Attachments: hbase-9906_v1.patch > > > After snaphot restore, we see failures to find the table in meta: > {code} > > disable 'tablefour' > > restore_snapshot 'snapshot_tablefour' > > enable 'tablefour' > ERROR: Table tablefour does not exist.' > {code} > This is quite subtle. From the looks of it, we successfully restore the > snapshot, do the meta updates, return to the client about the status. The > client then tries to do an operation for the table (like enable table, or > scan in the test outputs) which fails because the meta entry for the region > seems to be gone (in case of single region, the table will be reported > missing). Subsequent attempts for creating the table will also fail because > the table directories will be there, but not the meta entries. > For restoring meta entries, we are doing a delete then a put to the same > region: > {code} > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: > 76d0e2b7ec3291afcaa82e18a56ccc30 > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: > fa41edf43fe3ee131db4a34b848ff432 > ... 
> 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => > 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY > => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => > 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE > 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Added 1 > {code} > The root cause for this sporadic failure is that, the delete and subsequent > put will have the same timestamp if they execute in the same ms. The delete > will override the put in the same ts, even though the put have a larger ts. > See: HBASE-9905, HBASE-8770 > Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9775) Client write path perf issues
[ https://issues.apache.org/jira/browse/HBASE-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815493#comment-13815493 ] stack commented on HBASE-9775: -- Back to the root discussion on this issue: bq. with a max.total.tasks of 100 and max.perserver.tasks of 5, the client might not use all the server. May be a default of 2 for max.perserver.tasks would be better That'll work if there are many servers, but it will be a constraint if there are only a few servers and a few clients. In that case we will only schedule two tasks at most to each server when it could take much more. Ideally we want something like what you had before -- 5 or 1/2 the CPUs on the local server as a guesstimate of how many CPUs the server has, whichever is greater -- and then as soon as we get indications that a server is struggling, go down from this max per server and slowly ramp back up as we have successful ops against said server (how drastic the drop in tasks-per-server should be would depend on the exception we'd gotten from the server). bq. the server reject the client when it's busy (HBASE-9467). That increases the number of retries to do, and, on an heavy load, can lead us to fail on something that would have worked before. We only reject as 'busy' when we can't obtain the lock after an amount of time and if we are trying to flush because we are up against the global mem limit. Regarding retries, if we get one of these RegionTooBusyExceptions, rather than back off for 100ms or so, should we back off more (an Elliott suggestion)? And drop the number of tasks to throw at this server at any one time. It'd be hard to do as things are now, given backoff is calculated based off the retry count only. Given the two items above, should we keep more stats per server than just the count of tasks? We should keep a history of success/error and do backoffs -- both the amount of time and how many tasks to send the server -- based on this? bq. 
For example, the new settings will make the client to send 4 queries in 1 second Yeah, that is not going to help anyone. bq. If we want to compare 0.94 and 0.96, may be we should use the same settings, i.e. pause: 1000ms backoff: { 1, 1, 1, 2, 2, 4, 4, 8, 16, 32, 64 } hbase.client.max.perserver.tasks: 1 Seems like a good idea. [~nkeywal] What do you think of the [~jeffreyz] patch? [~jmspaggi] Any luck running the perf test? We got our big cluster back so we'll start in on this one again. With a single client, if there are many regions, I see the client threads blocked waiting to do locateRegionInMeta (I don't understand this regionLockObject... it locks everyone out while a lookup is going on rather than threads contending on the same region location). If there are few regions, we are doing softvaluemap operations all the time. > Client write path perf issues > - > > Key: HBASE-9775 > URL: https://issues.apache.org/jira/browse/HBASE-9775 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Priority: Critical > Attachments: 9775.rig.txt, 9775.rig.v2.patch, 9775.rig.v3.patch, > Charts Search Cloudera Manager - ITBLL.png, Charts Search Cloudera > Manager.png, hbase-9775.patch, job_run.log, short_ycsb.png, ycsb.png, > ycsb_insert_94_vs_96.png > > > Testing on larger clusters has not had the desired throughput increases. -- This message was sent by Atlassian JIRA (v6.1#6144)
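The pause/backoff settings quoted above translate into a simple retry-indexed sleep computation. A sketch under the assumption that the sleep is the pause times the multiplier for the current retry, capped at the last table entry (method names are illustrative, not the actual HBase client API):

```java
public class RetryBackoff {
    // Multiplier table quoted in the discussion above, with the
    // suggested 0.94-style pause of 1000 ms.
    static final int[] BACKOFF = {1, 1, 1, 2, 2, 4, 4, 8, 16, 32, 64};

    static long pauseTimeMs(long pauseMs, int retries) {
        // Cap at the last multiplier for high retry counts.
        int idx = Math.min(retries, BACKOFF.length - 1);
        return pauseMs * BACKOFF[idx];
    }

    public static void main(String[] args) {
        System.out.println(pauseTimeMs(1000, 0));  // 1000
        System.out.println(pauseTimeMs(1000, 3));  // 2000
        System.out.println(pauseTimeMs(1000, 20)); // 64000 (capped)
    }
}
```

This also makes stack's point above concrete: the sleep depends only on the retry count, so reacting to a struggling server (e.g. a RegionTooBusyException) would need per-server state outside this function.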
[jira] [Commented] (HBASE-9047) Tool to handle finishing replication when the cluster is offline
[ https://issues.apache.org/jira/browse/HBASE-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815491#comment-13815491 ] Hadoop QA commented on HBASE-9047: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612459/HBASE-9047-trunk-v4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): at org.apache.hadoop.hbase.TestZooKeeper.testRegionAssignmentAfterMasterRecoveryDueToZKExpiry(TestZooKeeper.java:488) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7762//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7762//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7762//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7762//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7762//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7762//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7762//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7762//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7762//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7762//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7762//console This message is automatically generated. 
> Tool to handle finishing replication when the cluster is offline > > > Key: HBASE-9047 > URL: https://issues.apache.org/jira/browse/HBASE-9047 > Project: HBase > Issue Type: New Feature >Affects Versions: 0.96.0 >Reporter: Jean-Daniel Cryans >Assignee: Demai Ni > Fix For: 0.98.0 > > Attachments: HBASE-9047-0.94.9-v0.PATCH, HBASE-9047-trunk-v0.patch, > HBASE-9047-trunk-v1.patch, HBASE-9047-trunk-v2.patch, > HBASE-9047-trunk-v3.patch, HBASE-9047-trunk-v4.patch > > > We're having a discussion on the mailing list about replicating the data on a > cluster that was shut down in an offline fashion. The motivation could be > that you don't want to bring HBase back up but still need that data on the > slave. > So I have this idea of a tool that would be running on the master cluster > while it is down, although it could also run at any time. Basically it would > be able to read the replication state of each master region server, finish > replicating what's missing to all the slave, and then clear that state in > zookeeper. > The code that handles replication does most of that already, see > ReplicationSourceManager and ReplicationSource. Basically when > ReplicationSourceManager.init() is called, it will check all the queues in ZK > and try to grab those that aren't attached to a region server. If the whole > cluster is down, it will grab all of them. > The beautiful thing here is that you could start that tool on all your > machines and the load will be spread out, but that might not be a big concern > i
[jira] [Commented] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815475#comment-13815475 ] Sergey Shelukhin commented on HBASE-9906: - You can use the power of out-of-order ts by doing puts first, getting the ts, and then doing deletes at that ts minus 1 :) although iirc meta might break because of that, because the key-before code optimizes by assuming no out-of-order ts across files. > Restore snapshot fails to restore the meta edits sporadically > --- > > Key: HBASE-9906 > URL: https://issues.apache.org/jira/browse/HBASE-9906 > Project: HBase > Issue Type: New Feature > Components: snapshots >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1, 0.94.14 > > Attachments: hbase-9906_v1.patch > > > After snaphot restore, we see failures to find the table in meta: > {code} > > disable 'tablefour' > > restore_snapshot 'snapshot_tablefour' > > enable 'tablefour' > ERROR: Table tablefour does not exist.' > {code} > This is quite subtle. From the looks of it, we successfully restore the > snapshot, do the meta updates, return to the client about the status. The > client then tries to do an operation for the table (like enable table, or > scan in the test outputs) which fails because the meta entry for the region > seems to be gone (in case of single region, the table will be reported > missing). Subsequent attempts for creating the table will also fail because > the table directories will be there, but not the meta entries. > For restoring meta entries, we are doing a delete then a put to the same > region: > {code} > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: > 76d0e2b7ec3291afcaa82e18a56ccc30 > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: > fa41edf43fe3ee131db4a34b848ff432 > ... 
> 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => > 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY > => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => > 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE > 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Added 1 > {code} > The root cause for this sporadic failure is that, the delete and subsequent > put will have the same timestamp if they execute in the same ms. The delete > will override the put in the same ts, even though the put have a larger ts. > See: HBASE-9905, HBASE-8770 > Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-9906: - Attachment: hbase-9906_v1.patch Patch for option (2). > Restore snapshot fails to restore the meta edits sporadically > --- > > Key: HBASE-9906 > URL: https://issues.apache.org/jira/browse/HBASE-9906 > Project: HBase > Issue Type: New Feature > Components: snapshots >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1, 0.94.14 > > Attachments: hbase-9906_v1.patch > > > After snaphot restore, we see failures to find the table in meta: > {code} > > disable 'tablefour' > > restore_snapshot 'snapshot_tablefour' > > enable 'tablefour' > ERROR: Table tablefour does not exist.' > {code} > This is quite subtle. From the looks of it, we successfully restore the > snapshot, do the meta updates, return to the client about the status. The > client then tries to do an operation for the table (like enable table, or > scan in the test outputs) which fails because the meta entry for the region > seems to be gone (in case of single region, the table will be reported > missing). Subsequent attempts for creating the table will also fail because > the table directories will be there, but not the meta entries. > For restoring meta entries, we are doing a delete then a put to the same > region: > {code} > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: > 76d0e2b7ec3291afcaa82e18a56ccc30 > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: > fa41edf43fe3ee131db4a34b848ff432 > ... 
> 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => > 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY > => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => > 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE > 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Added 1 > {code} > The root cause for this sporadic failure is that, the delete and subsequent > put will have the same timestamp if they execute in the same ms. The delete > will override the put in the same ts, even though the put have a larger ts. > See: HBASE-9905, HBASE-8770 > Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-9906: - Status: Patch Available (was: Open) > Restore snapshot fails to restore the meta edits sporadically > --- > > Key: HBASE-9906 > URL: https://issues.apache.org/jira/browse/HBASE-9906 > Project: HBase > Issue Type: New Feature > Components: snapshots >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1, 0.94.14 > > Attachments: hbase-9906_v1.patch > > > After snaphot restore, we see failures to find the table in meta: > {code} > > disable 'tablefour' > > restore_snapshot 'snapshot_tablefour' > > enable 'tablefour' > ERROR: Table tablefour does not exist.' > {code} > This is quite subtle. From the looks of it, we successfully restore the > snapshot, do the meta updates, return to the client about the status. The > client then tries to do an operation for the table (like enable table, or > scan in the test outputs) which fails because the meta entry for the region > seems to be gone (in case of single region, the table will be reported > missing). Subsequent attempts for creating the table will also fail because > the table directories will be there, but not the meta entries. > For restoring meta entries, we are doing a delete then a put to the same > region: > {code} > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: > 76d0e2b7ec3291afcaa82e18a56ccc30 > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: > fa41edf43fe3ee131db4a34b848ff432 > ... 
> 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => > 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY > => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => > 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE > 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Added 1 > {code} > The root cause for this sporadic failure is that the delete and subsequent > put will have the same timestamp if they execute in the same ms. The delete > will override the put at the same ts, even though the put is the later edit. > See: HBASE-9905, HBASE-8770 > Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815463#comment-13815463 ] Enis Soztutar commented on HBASE-9906: -- Out of the above options, (1) will take some time to fix. (3) has another problem because we would be intermixing client-supplied timestamps and server-supplied timestamps, which might cause further problems in meta if clocks are out of sync. (4) is not ideal either, since we want to delete the whole row except for column info:regioninfo; for this we would have to do a get to obtain the columns for each row, and then send deletes for each row. So that leaves us with option (2), which is embarrassing, but given that restore is very infrequent, we can justify sleeping an extra 20ms. > Restore snapshot fails to restore the meta edits sporadically > --- > > Key: HBASE-9906 > URL: https://issues.apache.org/jira/browse/HBASE-9906 > Project: HBase > Issue Type: New Feature > Components: snapshots >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1, 0.94.14 > > > After snapshot restore, we see failures to find the table in meta: > {code} > > disable 'tablefour' > > restore_snapshot 'snapshot_tablefour' > > enable 'tablefour' > ERROR: Table tablefour does not exist.' > {code} > This is quite subtle. From the looks of it, we successfully restore the > snapshot, do the meta updates, and return the status to the client. The > client then tries to do an operation on the table (like enable table, or > scan in the test outputs) which fails because the meta entry for the region > seems to be gone (in the case of a single region, the table will be reported > missing). Subsequent attempts to create the table will also fail because > the table directories will be there, but not the meta entries. 
> For restoring meta entries, we are doing a delete then a put to the same > region: > {code} > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: > 76d0e2b7ec3291afcaa82e18a56ccc30 > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: > fa41edf43fe3ee131db4a34b848ff432 > ... > 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => > 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY > => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => > 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE > 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Added 1 > {code} > The root cause for this sporadic failure is that the delete and subsequent > put will have the same timestamp if they execute in the same ms. The delete > will override the put at the same ts, even though the put is the later edit. > See: HBASE-9905, HBASE-8770 > Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
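The same-millisecond masking described in this thread can be sketched with a toy model. This is plain Java, not HBase internals; the class and method names are illustrative. It encodes the one rule that matters here: a delete marker at timestamp T masks every put with timestamp <= T, regardless of the order in which the mutations arrived.

```java
import java.util.ArrayList;
import java.util.List;

public class MetaTimestampModel {
    enum Type { PUT, DELETE }

    static final class Cell {
        final long ts; final Type type; final String value;
        Cell(long ts, Type type, String value) { this.ts = ts; this.type = type; this.value = value; }
    }

    /** Newest put not masked by any delete marker, or null if the column looks empty. */
    static String visibleValue(List<Cell> cells) {
        long maxDeleteTs = Long.MIN_VALUE;
        for (Cell c : cells) {
            if (c.type == Type.DELETE) maxDeleteTs = Math.max(maxDeleteTs, c.ts);
        }
        Cell newest = null;
        for (Cell c : cells) {
            if (c.type == Type.PUT && c.ts > maxDeleteTs
                && (newest == null || c.ts > newest.ts)) {
                newest = c;
            }
        }
        return newest == null ? null : newest.value;
    }

    /** Restore's delete and put land in the same millisecond: the put is masked. */
    static String sameMillisecond() {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell(100, Type.DELETE, null));
        cells.add(new Cell(100, Type.PUT, "regioninfo"));
        return visibleValue(cells);
    }

    /** Put lands one ms later (e.g. after a short sleep): it becomes visible. */
    static String nextMillisecond() {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell(100, Type.DELETE, null));
        cells.add(new Cell(101, Type.PUT, "regioninfo"));
        return visibleValue(cells);
    }

    public static void main(String[] args) {
        if (sameMillisecond() != null) throw new AssertionError("same-ms put should be masked");
        if (!"regioninfo".equals(nextMillisecond())) throw new AssertionError("later put should be visible");
    }
}
```

The two helper methods show why option (2) above works at all: pushing the put even one millisecond past the delete marker is enough to make the row visible again.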
[jira] [Commented] (HBASE-9885) Avoid some Result creation in protobuf conversions
[ https://issues.apache.org/jira/browse/HBASE-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815446#comment-13815446 ] Nicolas Liochon commented on HBASE-9885: Since the commit, the precommit env became flaky and it seems that Surefire cannot parse the test results. Let's see if there is a relation by reverting. > Avoid some Result creation in protobuf conversions > -- > > Key: HBASE-9885 > URL: https://issues.apache.org/jira/browse/HBASE-9885 > Project: HBase > Issue Type: Bug > Components: Client, Protobufs, regionserver >Affects Versions: 0.98.0, 0.96.0 >Reporter: Nicolas Liochon >Assignee: Nicolas Liochon > Fix For: 0.98.0, 0.96.1 > > Attachments: 9885.v1.patch, 9885.v2, 9885.v2.patch, 9885.v3.patch, > 9885.v3.patch > > > We create a lot of Result objects that we could avoid, as they contain nothing > other than a boolean value. We also sometimes create a protobuf builder on > this path; this can be avoided. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9890) MR jobs are not working if started by a delegated user
[ https://issues.apache.org/jira/browse/HBASE-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815447#comment-13815447 ] Gary Helmling commented on HBASE-9890: -- In the case that Francis points out, whether using CopyTable or something custom, you would actually have more than one token of type HBASE_AUTH_TOKEN. Does Oozie support running CopyTable between two clusters? If so, it needs to fetch the delegation token for each, but this patch wouldn't pass along both, only the first that it sees. Obtaining the token from UGI by type alone does not guarantee it is associated with the given cluster. That needs to match the token service against the cluster ID. In fact, I think the change as it is will cause CopyTable between 2 secure HBase clusters to fail. The change in this section of o.a.h.h.mapreduce.TableMapReduce.initCredentials() is the problem: {code} try { // init credentials for remote cluster String quorumAddress = job.getConfiguration().get(TableOutputFormat.QUORUM_ADDRESS); if (quorumAddress != null) { Configuration peerConf = HBaseConfiguration.create(job.getConfiguration()); ZKUtil.applyClusterKeyToConf(peerConf, quorumAddress); - userProvider.getCurrent().obtainAuthTokenForJob(peerConf, job); + user.obtainAuthTokenForJob(peerConf, job); } - userProvider.getCurrent().obtainAuthTokenForJob(job.getConfiguration(), job); + +Token authToken = user.getToken(AuthenticationTokenIdentifier.AUTH_TOKEN_TYPE.toString()); +if (authToken == null) { {code} When running between 2 secure clusters, we'll obtain a token against one cluster (using the config value of TableOutputFormat.QUORUM_ADDRESS), then the following call to user.getToken("HBASE_AUTH_TOKEN") will return the token just obtained, so we never fetch the second token. You can use AuthenticationTokenSelector.selectToken() to pull out the correct token for a given cluster. But first you will need the cluster ID for the cluster you're connecting to. 
> MR jobs are not working if started by a delegated user > -- > > Key: HBASE-9890 > URL: https://issues.apache.org/jira/browse/HBASE-9890 > Project: HBase > Issue Type: Bug > Components: mapreduce, security >Affects Versions: 0.98.0, 0.94.12, 0.96.0 >Reporter: Matteo Bertozzi >Assignee: Matteo Bertozzi > Fix For: 0.98.0, 0.94.13, 0.96.1 > > Attachments: HBASE-9890-94-v0.patch, HBASE-9890-94-v1.patch, > HBASE-9890-v0.patch, HBASE-9890-v1.patch > > > If Map-Reduce jobs are started by a proxy user that already has the > delegation tokens, we get an exception on "obtain token" since the proxy user > doesn't have the kerberos auth. > For example: > * If we use oozie to execute RowCounter - oozie will get the tokens required > (HBASE_AUTH_TOKEN) and it will start the RowCounter. Once the RowCounter > tries to obtain the token, it will get an exception. > * If we use oozie to execute LoadIncrementalHFiles - oozie will get the > tokens required (HDFS_DELEGATION_TOKEN) and it will start the > LoadIncrementalHFiles. Once the LoadIncrementalHFiles tries to obtain the > token, it will get an exception. 
> {code} > org.apache.hadoop.hbase.security.AccessDeniedException: Token generation > only allowed for Kerberos authenticated clients > at > org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:87) > {code} > {code} > org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token > can be issued only with kerberos or web authentication > at > org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:783) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:868) > at > org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:509) > at > org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:487) > at > org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:130) > at > org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:111) > at > org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:85) > at > org.apache.hadoop.filecache.TrackerDistributedCacheManager.getDelegationTokens(TrackerDistributedCacheManager.java:949) > at > org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:854) > at > org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743) > at > org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) > at org.apache.hadoop.mapreduce.Job
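Gary's point above about selecting tokens by service rather than by kind can be sketched with a toy model. Plain Java, not the Hadoop security API; the class names are illustrative (the real selection logic he refers to lives in AuthenticationTokenSelector). With two secure clusters there are two credentials of the same kind, so a kind-only lookup returns whichever was obtained first.

```java
import java.util.Arrays;
import java.util.List;

public class TokenSelectionModel {
    static final String KIND = "HBASE_AUTH_TOKEN";

    static final class Token {
        final String kind; final String service;
        Token(String kind, String service) { this.kind = kind; this.service = service; }
    }

    /** Kind-only lookup: first match wins -- wrong when two clusters are involved. */
    static Token selectByKind(List<Token> tokens, String kind) {
        for (Token t : tokens) if (t.kind.equals(kind)) return t;
        return null;
    }

    /** Service-aware lookup: match both the kind and the target cluster id. */
    static Token selectByService(List<Token> tokens, String kind, String clusterId) {
        for (Token t : tokens) if (t.kind.equals(kind) && t.service.equals(clusterId)) return t;
        return null;
    }

    public static void main(String[] args) {
        List<Token> creds = Arrays.asList(
            new Token(KIND, "cluster-source"),
            new Token(KIND, "cluster-peer"));

        // Kind-only selection always sees the source token, even when we
        // are about to talk to the peer cluster.
        if (!"cluster-source".equals(selectByKind(creds, KIND).service)) throw new AssertionError();

        // Service-aware selection picks the right credential per cluster.
        if (!"cluster-peer".equals(selectByService(creds, KIND, "cluster-peer").service)) throw new AssertionError();
    }
}
```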
[jira] [Reopened] (HBASE-9885) Avoid some Result creation in protobuf conversions
[ https://issues.apache.org/jira/browse/HBASE-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon reopened HBASE-9885: > Avoid some Result creation in protobuf conversions > -- > > Key: HBASE-9885 > URL: https://issues.apache.org/jira/browse/HBASE-9885 > Project: HBase > Issue Type: Bug > Components: Client, Protobufs, regionserver >Affects Versions: 0.98.0, 0.96.0 >Reporter: Nicolas Liochon >Assignee: Nicolas Liochon > Fix For: 0.98.0, 0.96.1 > > Attachments: 9885.v1.patch, 9885.v2, 9885.v2.patch, 9885.v3.patch, > 9885.v3.patch > > > We create a lot of Result objects that we could avoid, as they contain nothing > other than a boolean value. We also sometimes create a protobuf builder on > this path; this can be avoided. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9818) NPE in HFileBlock#AbstractFSReader#readAtOffset
[ https://issues.apache.org/jira/browse/HBASE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-9818: -- Attachment: 9818-v5.txt > NPE in HFileBlock#AbstractFSReader#readAtOffset > --- > > Key: HBASE-9818 > URL: https://issues.apache.org/jira/browse/HBASE-9818 > Project: HBase > Issue Type: Bug >Reporter: Jimmy Xiang >Assignee: Ted Yu > Attachments: 9818-v2.txt, 9818-v3.txt, 9818-v4.txt, 9818-v5.txt > > > HFileBlock#istream seems to be null. I was wondering should we hide > FSDataInputStreamWrapper#useHBaseChecksum. > By the way, this happened when online schema change is enabled (encoding) > {noformat} > 2013-10-22 10:58:43,321 ERROR [RpcServer.handler=28,port=36020] > regionserver.HRegionServer: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1200) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1436) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1318) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:359) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:503) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:553) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:245) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:166) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:361) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:336) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:293) > 
at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:258) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:603) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:476) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:129) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3546) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3616) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3494) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3485) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3079) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) > at java.lang.Thread.run(Thread.java:724) > 2013-10-22 10:58:43,665 ERROR [RpcServer.handler=23,port=36020] > regionserver.HRegionServer: > org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected > nextCallSeq: 53438 But the nextCallSeq got from client: 53437; > request=scanner_id: 1252577470624375060 number_of_rows: 100 close_scanner: > false next_call_seq: 53437 > at > org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3030) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) > at 
org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) > at java.lang.Thread.run(Thread.java:724) > {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9818) NPE in HFileBlock#AbstractFSReader#readAtOffset
[ https://issues.apache.org/jira/browse/HBASE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-9818: -- Attachment: (was: 9818-v5.txt) > NPE in HFileBlock#AbstractFSReader#readAtOffset > --- > > Key: HBASE-9818 > URL: https://issues.apache.org/jira/browse/HBASE-9818 > Project: HBase > Issue Type: Bug >Reporter: Jimmy Xiang >Assignee: Ted Yu > Attachments: 9818-v2.txt, 9818-v3.txt, 9818-v4.txt, 9818-v5.txt > > > HFileBlock#istream seems to be null. I was wondering should we hide > FSDataInputStreamWrapper#useHBaseChecksum. > By the way, this happened when online schema change is enabled (encoding) > {noformat} > 2013-10-22 10:58:43,321 ERROR [RpcServer.handler=28,port=36020] > regionserver.HRegionServer: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1200) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1436) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1318) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:359) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:503) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:553) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:245) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:166) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:361) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:336) > at > 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:293) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:258) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:603) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:476) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:129) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3546) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3616) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3494) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3485) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3079) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) > at java.lang.Thread.run(Thread.java:724) > 2013-10-22 10:58:43,665 ERROR [RpcServer.handler=23,port=36020] > regionserver.HRegionServer: > org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected > nextCallSeq: 53438 But the nextCallSeq got from client: 53437; > request=scanner_id: 1252577470624375060 number_of_rows: 100 close_scanner: > false next_call_seq: 53437 > at > org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3030) > at > 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:27022) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1979) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110) > at java.lang.Thread.run(Thread.java:724) > {noformat} -- This message was sent by Atlassian JIRA (v6.1#
[jira] [Commented] (HBASE-9879) Can't undelete a KeyValue
[ https://issues.apache.org/jira/browse/HBASE-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815432#comment-13815432 ] Enis Soztutar commented on HBASE-9879: -- bq. There was support in the recent PMC meeting for deprecating client set timestamps. Existing tables would grandfather a setting that allows user set timestamps but new tables would not allow them. Allowing clients to (ab)use cell timestamps leads to several problems not just this known issue. Opened HBASE-9905 for discussing that. > Can't undelete a KeyValue > - > > Key: HBASE-9879 > URL: https://issues.apache.org/jira/browse/HBASE-9879 > Project: HBase > Issue Type: Bug >Affects Versions: 0.96.0 >Reporter: Benoit Sigoure > > Test scenario: > put(KV, timestamp=100) > put(KV, timestamp=200) > delete(KV, timestamp=200, with MutationProto.DeleteType.DELETE_ONE_VERSION) > get(KV) => returns value at timestamp=100 (OK) > put(KV, timestamp=200) > get(KV) => returns value at timestamp=100 (but not the one at timestamp=200 > that was "reborn" by the previous put) > Is that normal? > I ran into this bug while running the integration tests at > https://github.com/OpenTSDB/asynchbase/pull/60 – the first time you run it, > it passes, but after that, it keeps failing. Sorry I don't have the > corresponding HTable-based code but that should be fairly easy to write. > I only tested this with 0.96.0, dunno yet how this behaved in prior releases. > My hunch is that the tombstone added by the DELETE_ONE_VERSION keeps > shadowing the value even after it's reborn. -- This message was sent by Atlassian JIRA (v6.1#6144)
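Benoit's scenario and Enis's hunch above can be sketched with a toy model: a DELETE_ONE_VERSION tombstone at timestamp T keeps masking the version at exactly T, even if that version is written again afterwards, until a compaction would purge the tombstone. Plain Java, not HBase internals; names are illustrative.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeMap;

public class OneVersionDeleteModel {
    // versions: timestamp -> value; tombstones: timestamps masked by DELETE_ONE_VERSION
    private final TreeMap<Long, String> versions = new TreeMap<>();
    private final Set<Long> versionTombstones = new HashSet<>();

    void put(long ts, String value) { versions.put(ts, value); }
    void deleteOneVersion(long ts) { versionTombstones.add(ts); }

    /** Newest version whose exact timestamp is not tombstoned. */
    String get() {
        for (Long ts : versions.descendingKeySet()) {
            if (!versionTombstones.contains(ts)) return versions.get(ts);
        }
        return null;
    }

    public static void main(String[] args) {
        OneVersionDeleteModel kv = new OneVersionDeleteModel();
        kv.put(100, "v100");
        kv.put(200, "v200");
        kv.deleteOneVersion(200);
        if (!"v100".equals(kv.get())) throw new AssertionError(); // ts=200 masked, as in the test

        kv.put(200, "v200-reborn"); // re-put at the same timestamp...
        if (!"v100".equals(kv.get())) throw new AssertionError(); // ...still shadowed by the tombstone
    }
}
```

Under this reading, the "reborn" put at ts=200 is invisible not because it was lost, but because the tombstone matches its exact timestamp.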
[jira] [Commented] (HBASE-5583) Master restart on create table with splitkeys does not recreate table with all the splitkey regions
[ https://issues.apache.org/jira/browse/HBASE-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815420#comment-13815420 ] Hadoop QA commented on HBASE-5583: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12576723/HBASE-5583_new_1_review.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7763//console This message is automatically generated. > Master restart on create table with splitkeys does not recreate table with > all the splitkey regions > --- > > Key: HBASE-5583 > URL: https://issues.apache.org/jira/browse/HBASE-5583 > Project: HBase > Issue Type: Bug >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 0.96.1 > > Attachments: HBASE-5583_new_1.patch, HBASE-5583_new_1_review.patch, > HBASE-5583_new_2.patch, HBASE-5583_new_4_WIP.patch, > HBASE-5583_new_5_WIP_using_tableznode.patch > > > -> Create table using splitkeys > -> Master goes down before all regions are added to meta > -> On master restart the table is again enabled but with fewer regions than > specified in the splitkeys > Anyway, the client will get an exception if it had called sync create table. But > a table-exists check will say the table exists. > Is this scenario to be handled by the client only, or can we have some mechanism > on the master side for this? Pls suggest. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
[ https://issues.apache.org/jira/browse/HBASE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815418#comment-13815418 ] Enis Soztutar commented on HBASE-9906: -- We can fix this issue by: - Fixing either HBASE-9905 or HBASE-8770 or HBASE-9879 - Adding a sleep(20) between the meta delete and update - Obtaining a ts from the client, and doing the delete with that ts and the puts with ts+1 - Changing the meta delete to only delete the columns not needed. The subsequent put will override the column values anyway. > Restore snapshot fails to restore the meta edits sporadically > --- > > Key: HBASE-9906 > URL: https://issues.apache.org/jira/browse/HBASE-9906 > Project: HBase > Issue Type: New Feature > Components: snapshots >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 0.98.0, 0.96.1, 0.94.14 > > > After snapshot restore, we see failures to find the table in meta: > {code} > > disable 'tablefour' > > restore_snapshot 'snapshot_tablefour' > > enable 'tablefour' > ERROR: Table tablefour does not exist.' > {code} > This is quite subtle. From the looks of it, we successfully restore the > snapshot, do the meta updates, and return the status to the client. The > client then tries to do an operation on the table (like enable table, or > scan in the test outputs) which fails because the meta entry for the region > seems to be gone (in the case of a single region, the table will be reported > missing). Subsequent attempts to create the table will also fail because > the table directories will be there, but not the meta entries. > For restoring meta entries, we are doing a delete then a put to the same > region: > {code} > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: > 76d0e2b7ec3291afcaa82e18a56ccc30 > 2013-11-04 10:39:51,582 INFO > org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: > fa41edf43fe3ee131db4a34b848ff432 > ... 
> 2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => > 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY > => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => > 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE > 2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: > Added 1 > {code} > The root cause for this sporadic failure is that the delete and subsequent > put will have the same timestamp if they execute in the same ms. The delete > will override the put at the same ts, even though the put is the later edit. > See: HBASE-9905, HBASE-8770 > Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9905) Enable using seqId as timestamp
[ https://issues.apache.org/jira/browse/HBASE-9905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815414#comment-13815414 ] Enis Soztutar commented on HBASE-9905: -- From some offline discussion with Sergey, we probably need a couple of "modes" per table for "timestamp mode" : - mode_seqid : The server supplies seqId to the cells. If ts is set for puts, the server will throw IllegalArgumentException. - mode_server_ts : The server supplies ts to the cells from the wall clock. If ts is set for puts, the server will throw IllegalArgumentException. - mode_client_ts : The client always supplies the timestamps (from a clock or from a ts oracle). The server throws an exception if the cell does not have a timestamp. - mode_mixed : Will operate similarly to current semantics. Will be deprecated. mode_server_ts is a special case of mode_mixed, and may not be needed. > Enable using seqId as timestamp > > > Key: HBASE-9905 > URL: https://issues.apache.org/jira/browse/HBASE-9905 > Project: HBase > Issue Type: New Feature >Reporter: Enis Soztutar > Fix For: 0.98.0 > > > This has been discussed previously, and Lars H. was mentioning an idea of > having the client declare explicitly whether timestamps are used. > The problem is that, for data models not using timestamps, we are still > relying on clocks to order the updates. Clock skew, same-millisecond puts > after deletes, etc. can cause unexpected behavior and data not being visible. > We should have a table descriptor / family property, which would declare that > the data model does not use timestamps. Then we can populate this dimension > with the seqId, so that the global ordering of edits is not affected by the wall > clock. > For example, META will use this. > Once we have something like this, we can think of making it the default for new > tables, so that the unknowing user will not shoot herself in the foot. -- This message was sent by Atlassian JIRA (v6.1#6144)
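The proposed per-table timestamp modes above can be sketched as a resolution rule. This is a reading of the comment, not an actual HBase API; the enum and method names are illustrative.

```java
public class TimestampModeSketch {
    enum Mode { SEQID, SERVER_TS, CLIENT_TS, MIXED }

    /** Decide the cell timestamp for a put under a given mode (clientTs == null means "not set"). */
    static long resolve(Mode mode, Long clientTs, long seqId, long serverClockMs) {
        switch (mode) {
            case SEQID:
                if (clientTs != null) throw new IllegalArgumentException("client ts not allowed");
                return seqId;                       // monotonic, immune to clock skew
            case SERVER_TS:
                if (clientTs != null) throw new IllegalArgumentException("client ts not allowed");
                return serverClockMs;               // server wall clock
            case CLIENT_TS:
                if (clientTs == null) throw new IllegalArgumentException("client ts required");
                return clientTs;                    // client clock or ts oracle
            default:                                // MIXED: today's semantics
                return clientTs != null ? clientTs : serverClockMs;
        }
    }

    public static void main(String[] args) {
        if (resolve(Mode.SEQID, null, 42L, 1000L) != 42L) throw new AssertionError();
        if (resolve(Mode.MIXED, 7L, 42L, 1000L) != 7L) throw new AssertionError();
        boolean threw = false;
        try { resolve(Mode.SEQID, 7L, 42L, 1000L); } catch (IllegalArgumentException e) { threw = true; }
        if (!threw) throw new AssertionError("SEQID mode should reject client timestamps");
    }
}
```

Note how mode_seqid sidesteps the restore-snapshot bug in this digest entirely: the delete and the subsequent put can never collide on a timestamp, because seqIds are strictly increasing.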
[jira] [Updated] (HBASE-9907) Rig to fake a cluster so can profile client behaviors
[ https://issues.apache.org/jira/browse/HBASE-9907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-9907: - Affects Version/s: 0.96.0 Status: Patch Available (was: Open) > Rig to fake a cluster so can profile client behaviors > - > > Key: HBASE-9907 > URL: https://issues.apache.org/jira/browse/HBASE-9907 > Project: HBase > Issue Type: Sub-task > Components: Client >Affects Versions: 0.96.0 >Reporter: stack >Assignee: stack > Fix For: 0.98.0, 0.96.1 > > > Patch carried over from HBASE-9775 parent issue. Adds to the > TestClientNoCluster#main a rig that allows faking many clients against a few > servers and the opposite. Useful for studying client operation. > Includes a few changes to pb makings to try and save on a few creations. > Also has an edit of javadoc on how to create an HConnection and HTable trying > to be more forceful about pointing you in right direction ([~lhofhansl] -- > mind reviewing these javadoc changes?) > I have a +1 already on this patch up in parent issue. Will run by hadoopqa > to make sure all good before commit. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9907) Rig to fake a cluster so can profile client behaviors
stack created HBASE-9907: Summary: Rig to fake a cluster so can profile client behaviors Key: HBASE-9907 URL: https://issues.apache.org/jira/browse/HBASE-9907 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: stack Fix For: 0.98.0, 0.96.1 Patch carried over from HBASE-9775 parent issue. Adds to the TestClientNoCluster#main a rig that allows faking many clients against a few servers and the opposite. Useful for studying client operation. Includes a few changes to pb makings to try and save on a few creations. Also has an edit of javadoc on how to create an HConnection and HTable trying to be more forceful about pointing you in right direction ([~lhofhansl] -- mind reviewing these javadoc changes?) I have a +1 already on this patch up in parent issue. Will run by hadoopqa to make sure all good before commit. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9903) Remove the jamon generated classes from the findbugs analysis
[ https://issues.apache.org/jira/browse/HBASE-9903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815406#comment-13815406 ] Hadoop QA commented on HBASE-9903: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612442/9903.v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. 
The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7761//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7761//console
This message is automatically generated.
> Remove the jamon generated classes from the findbugs analysis > - > > Key: HBASE-9903 > URL: https://issues.apache.org/jira/browse/HBASE-9903 > Project: HBase > Issue Type: Bug > Components: build >Affects Versions: 0.98.0, 0.96.0 >Reporter: Nicolas Liochon >Assignee: Nicolas Liochon > Fix For: 0.98.0 > > Attachments: 9903.v1.patch, 9903.v2.patch, 9903.v2.patch > > > The current filter does not work. 
[jira] [Commented] (HBASE-9818) NPE in HFileBlock#AbstractFSReader#readAtOffset
[ https://issues.apache.org/jira/browse/HBASE-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815407#comment-13815407 ] Hadoop QA commented on HBASE-9818: --
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612439/9818-v5.txt against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:red}-1 site{color}. The patch appears to cause mvn site goal to fail.
{color:red}-1 core tests{color}. The patch failed these unit tests:
{color:red}-1 core zombie tests{color}.
There are 1 zombie test(s): at org.apache.hadoop.hbase.TestZooKeeper.testRegionAssignmentAfterMasterRecoveryDueToZKExpiry(TestZooKeeper.java:488)
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7760//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7760//console
This message is automatically generated. 
> NPE in HFileBlock#AbstractFSReader#readAtOffset > --- > > Key: HBASE-9818 > URL: https://issues.apache.org/jira/browse/HBASE-9818 > Project: HBase > Issue Type: Bug >Reporter: Jimmy Xiang >Assignee: Ted Yu > Attachments: 9818-v2.txt, 9818-v3.txt, 9818-v4.txt, 9818-v5.txt > > > HFileBlock#istream seems to be null. I was wondering should we hide > FSDataInputStreamWrapper#useHBaseChecksum. > By the way, this happened when online schema change is enabled (encoding) > {noformat} > 2013-10-22 10:58:43,321 ERROR [RpcServer.handler=28,port=36020] > regionserver.HRegionServer: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1200) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1436) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1318) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:359) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:503) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:55
[jira] [Updated] (HBASE-9047) Tool to handle finishing replication when the cluster is offline
[ https://issues.apache.org/jira/browse/HBASE-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Demai Ni updated HBASE-9047: Attachment: HBASE-9047-trunk-v4.patch New patch removes the 30-sec timeout at the end, because the resourcemanager.oldsource check is good enough to indicate there are no edits in the queue. > Tool to handle finishing replication when the cluster is offline > > > Key: HBASE-9047 > URL: https://issues.apache.org/jira/browse/HBASE-9047 > Project: HBase > Issue Type: New Feature >Affects Versions: 0.96.0 >Reporter: Jean-Daniel Cryans >Assignee: Demai Ni > Fix For: 0.98.0 > > Attachments: HBASE-9047-0.94.9-v0.PATCH, HBASE-9047-trunk-v0.patch, > HBASE-9047-trunk-v1.patch, HBASE-9047-trunk-v2.patch, > HBASE-9047-trunk-v3.patch, HBASE-9047-trunk-v4.patch > > > We're having a discussion on the mailing list about replicating the data on a > cluster that was shut down in an offline fashion. The motivation could be > that you don't want to bring HBase back up but still need that data on the > slave. > So I have this idea of a tool that would be running on the master cluster > while it is down, although it could also run at any time. Basically it would > be able to read the replication state of each master region server, finish > replicating what's missing to all the slaves, and then clear that state in > zookeeper. > The code that handles replication does most of that already, see > ReplicationSourceManager and ReplicationSource. Basically when > ReplicationSourceManager.init() is called, it will check all the queues in ZK > and try to grab those that aren't attached to a region server. If the whole > cluster is down, it will grab all of them. > The beautiful thing here is that you could start that tool on all your > machines and the load will be spread out, but that might not be a big concern > if replication wasn't lagging since it would take a few seconds to finish > replicating the missing data for each region server. 
> I'm guessing when starting ReplicationSourceManager you'd give it a fake > region server ID, and you'd tell it not to start its own source. > FWIW the main difference in how replication is handled between Apache's HBase > and Facebook's is that the latter is always done separately of HBase itself. > This jira isn't about doing that. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9906) Restore snapshot fails to restore the meta edits sporadically
Enis Soztutar created HBASE-9906: Summary: Restore snapshot fails to restore the meta edits sporadically Key: HBASE-9906 URL: https://issues.apache.org/jira/browse/HBASE-9906 Project: HBase Issue Type: New Feature Components: snapshots Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.0, 0.96.1, 0.94.14 After snapshot restore, we see failures to find the table in meta:
{code}
> disable 'tablefour'
> restore_snapshot 'snapshot_tablefour'
> enable 'tablefour'
ERROR: Table tablefour does not exist.'
{code}
This is quite subtle. From the looks of it, we successfully restore the snapshot, do the meta updates, and return the status to the client. The client then tries to do an operation on the table (like enable table, or scan in the test outputs), which fails because the meta entry for the region seems to be gone (in the case of a single region, the table will be reported missing). Subsequent attempts to create the table will also fail because the table directories will be there, but not the meta entries. For restoring meta entries, we are doing a delete then a put to the same region:
{code}
2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to restore: 76d0e2b7ec3291afcaa82e18a56ccc30
2013-11-04 10:39:51,582 INFO org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper: region to remove: fa41edf43fe3ee131db4a34b848ff432
...
2013-11-04 10:39:52,102 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted [{ENCODED => fa41edf43fe3ee131db4a34b848ff432, NAME => 'tablethree_mod,,1383559723345.fa41edf43fe3ee131db4a34b848ff432.', STARTKEY => '', ENDKEY => ''}, {ENCODED => 76d0e2b7ec3291afcaa82e18a56ccc30, NAME => 'tablethree_mod,,1383561123097.76d0e2b7ec3291afcaa82e18a56ccc30.', STARTKE
2013-11-04 10:39:52,111 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added 1
{code}
The root cause of this sporadic failure is that the delete and the subsequent put will have the same timestamp if they execute in the same ms. 
The delete will mask the put at the same ts, even though the put happened later. See: HBASE-9905, HBASE-8770. Credit goes to [~huned] for reporting this bug. -- This message was sent by Atlassian JIRA (v6.1#6144)
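The masking behavior described above can be illustrated with a small standalone sketch. This is plain Java, not HBase code: the type codes and comparator below are simplified from HBase's KeyValue ordering, where cells for the same row/column sort by timestamp descending and then by type code descending, with delete markers carrying a higher type code (8) than puts (4). So at an equal timestamp the delete always sorts first and masks the put, regardless of which edit was written later.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Standalone sketch (not HBase code) of the same-timestamp masking problem.
public class SameTsMasking {
    // Type codes follow HBase's KeyValue convention: Put = 4, Delete = 8.
    static final int PUT = 4, DELETE = 8;

    static final class Cell {
        final long ts; final int type; final String value;
        Cell(long ts, int type, String value) { this.ts = ts; this.type = type; this.value = value; }
    }

    // Sort cells the way a scan sees them: timestamp descending, then type
    // code descending. The first cell wins; a delete marker on top means
    // nothing is visible, no matter which edit was physically written last.
    static String visibleValue(List<Cell> cells) {
        List<Cell> sorted = new ArrayList<>(cells);
        sorted.sort(Comparator.comparingLong((Cell c) -> c.ts).reversed()
                              .thenComparing((Cell c) -> -c.type));
        if (sorted.isEmpty() || sorted.get(0).type == DELETE) return null;
        return sorted.get(0).value;
    }
}
```

A put at a strictly larger timestamp survives the delete; only equal timestamps collide, which is why HBASE-8770 (resolve by seqNum) and HBASE-9905 (use seqId as the timestamp) are linked as fixes.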
[jira] [Commented] (HBASE-8770) deletes and puts with the same ts should be resolved according to mvcc/seqNum
[ https://issues.apache.org/jira/browse/HBASE-8770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815380#comment-13815380 ] Sergey Shelukhin commented on HBASE-8770: - There was another issue today where a user does use equal TS (put ts 100m, put ts 200, del-version ts 200, then later put ts 200). This would solve both problems... I think we could also do HBASE-9905. > deletes and puts with the same ts should be resolved according to mvcc/seqNum > - > > Key: HBASE-8770 > URL: https://issues.apache.org/jira/browse/HBASE-8770 > Project: HBase > Issue Type: Brainstorming >Reporter: Sergey Shelukhin > > This came up during HBASE-8721. Puts with the same ts are resolved by seqNum. > It's not clear why deletes with the same ts as a put should always mask the > put, rather than also being resolve by seqNum. > What do you think? -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-8770) deletes and puts with the same ts should be resolved according to mvcc/seqNum
[ https://issues.apache.org/jira/browse/HBASE-8770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815369#comment-13815369 ] Enis Soztutar commented on HBASE-8770: -- Linking HBASE-9905. We might as well do that instead of this. > deletes and puts with the same ts should be resolved according to mvcc/seqNum > - > > Key: HBASE-8770 > URL: https://issues.apache.org/jira/browse/HBASE-8770 > Project: HBase > Issue Type: Brainstorming >Reporter: Sergey Shelukhin > > This came up during HBASE-8721. Puts with the same ts are resolved by seqNum. > It's not clear why deletes with the same ts as a put should always mask the > put, rather than also being resolve by seqNum. > What do you think? -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9905) Enable using seqId as timestamp
Enis Soztutar created HBASE-9905: Summary: Enable using seqId as timestamp Key: HBASE-9905 URL: https://issues.apache.org/jira/browse/HBASE-9905 Project: HBase Issue Type: New Feature Reporter: Enis Soztutar Fix For: 0.98.0 This has been discussed previously, and Lars H. mentioned an idea of letting the client explicitly declare whether timestamps are used or not. The problem is that, for data models not using timestamps, we are still relying on clocks to order the updates. Clock skew, same-millisecond puts after deletes, etc. can cause unexpected behavior and data not being visible. We should have a table descriptor / family property which declares that the data model does not use timestamps. Then we can populate this dimension with the seqId, so that the global ordering of edits is not affected by the wall clock. For example, META will use this. Once we have something like this, we can think of making it the default for new tables, so that the unknowing user will not shoot herself in the foot. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9775) Client write path perf issues
[ https://issues.apache.org/jira/browse/HBASE-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815341#comment-13815341 ] stack commented on HBASE-9775: -- bq. I've tried it myself (exactly the same approach), but I didn't see a real difference. Do you see one in your tests? Minor (certain allocation hotspots went from 3% to 2.4% in my extreme allocation test, which probably means close to zero diff). I left it in since on the face of it there are fewer allocations. I'll commit this since the rig can be useful. Want to do some comment/javadoc first though. > Client write path perf issues > - > > Key: HBASE-9775 > URL: https://issues.apache.org/jira/browse/HBASE-9775 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Priority: Critical > Attachments: 9775.rig.txt, 9775.rig.v2.patch, 9775.rig.v3.patch, > Charts Search Cloudera Manager - ITBLL.png, Charts Search Cloudera > Manager.png, hbase-9775.patch, job_run.log, short_ycsb.png, ycsb.png, > ycsb_insert_94_vs_96.png > > > Testing on larger clusters has not had the desired throughput increases. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HBASE-4876) TestDistributedLogSplitting#testWorkerAbort occasionally fails
[ https://issues.apache.org/jira/browse/HBASE-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-4876. --- Resolution: Cannot Reproduce > TestDistributedLogSplitting#testWorkerAbort occasionally fails > -- > > Key: HBASE-4876 > URL: https://issues.apache.org/jira/browse/HBASE-4876 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu > > From > https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2486/testReport/org.apache.hadoop.hbase.master/TestDistributedLogSplitting/testWorkerAbort/: > {code} > 2011-11-26 18:10:25,075 DEBUG > [SplitLogWorker-janus.apache.org,42484,1322330994864] wal.HLogSplitter(460): > Closed > hdfs://localhost:47236/user/jenkins/splitlog/janus.apache.org,42484,1322330994864_hdfs%3A%2F%2Flocalhost%3A47236%2Fuser%2Fjenkins%2F.logs%2Fjanus.apache.org%2C42484%2C1322330994864%2Fjanus.apache.org%252C42484%252C1322330994864.1322330997838/table/be67e8c1df1e77e93181ff7300e77639/recovered.edits/152 > 2011-11-26 18:10:25,075 DEBUG > [SplitLogWorker-janus.apache.org,42484,1322330994864] wal.HLogSplitter(460): > Closed > hdfs://localhost:47236/user/jenkins/splitlog/janus.apache.org,42484,1322330994864_hdfs%3A%2F%2Flocalhost%3A47236%2Fuser%2Fjenkins%2F.logs%2Fjanus.apache.org%2C42484%2C1322330994864%2Fjanus.apache.org%252C42484%252C1322330994864.1322330997838/table/bf112e57fbaa65c12accfafaaa4dc2b0/recovered.edits/167 > 2011-11-26 18:10:25,075 DEBUG > [SplitLogWorker-janus.apache.org,42484,1322330994864] wal.HLogSplitter(460): > Closed > hdfs://localhost:47236/user/jenkins/splitlog/janus.apache.org,42484,1322330994864_hdfs%3A%2F%2Flocalhost%3A47236%2Fuser%2Fjenkins%2F.logs%2Fjanus.apache.org%2C42484%2C1322330994864%2Fjanus.apache.org%252C42484%252C1322330994864.1322330997838/table/bfb6983046589215ed8e6cb0e60dd803/recovered.edits/146 > 2011-11-26 18:10:25,488 INFO > [SplitLogWorker-janus.apache.org,42484,1322330994864] > regionserver.SplitLogWorker(308): worker janus.apache.org,42484,1322330994864 > done 
with task > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%3A47236%2Fuser%2Fjenkins%2F.logs%2Fjanus.apache.org%2C42484%2C1322330994864%2Fjanus.apache.org%252C42484%252C1322330994864.1322330997838 > in 13379ms > 2011-11-26 18:10:25,488 ERROR > [SplitLogWorker-janus.apache.org,42484,1322330994864] > regionserver.SplitLogWorker(169): unexpected error > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeThreads(DFSClient.java:3648) > at > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3691) > at > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3626) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61) > at > org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86) > at org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:966) > at > org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.close(SequenceFileLogWriter.java:214) > at > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFileToTemp(HLogSplitter.java:459) > at > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFileToTemp(HLogSplitter.java:352) > at > org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:113) > at > org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:266) > at > org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:197) > at > org.apache.hadoop.hbase.regionserver.SplitLogWorker.run(SplitLogWorker.java:165) > at java.lang.Thread.run(Thread.java:662) > 2011-11-26 18:10:25,488 INFO > [SplitLogWorker-janus.apache.org,42484,1322330994864] > regionserver.SplitLogWorker(171): SplitLogWorker > janus.apache.org,42484,1322330994864 exiting > {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9888) HBase replicates edits written before the replication peer is created
[ https://issues.apache.org/jira/browse/HBASE-9888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815337#comment-13815337 ] Dave Latham commented on HBASE-9888: {quote} > Would it work to just do it in each RS when the ReplicationSource on that RS > is created (in the mode for add_peer)? That's what I was proposing, sorry if not clear. {quote} +1 for the proposal. > HBase replicates edits written before the replication peer is created > - > > Key: HBASE-9888 > URL: https://issues.apache.org/jira/browse/HBASE-9888 > Project: HBase > Issue Type: Bug >Reporter: Dave Latham > > When creating a new replication peer the ReplicationSourceManager enqueues > the currently open HLog to the ReplicationSource to ship to the destination > cluster. The ReplicationSource starts at the beginning of the HLog and ships > over any pre-existing writes. > A workaround is to roll all the HLogs before enabling replication. > A little background for how it affected us - we were migrating one cluster in > a master-master pair. I.e. transitioning from A <\-> B to B <-> C. After > shutting down writes from A -> B we enabled writes from C -> B. However, > this replicated some earlier writes that were in C's HLogs that had > originated in A. Since we were running a version of HBase before HBASE-7709 > those writes then got caught in a infinite replication cycle and bringing > down region servers OOM because of HBASE-9865. > However, in general, if one wants to manage what data gets replicated, one > wouldn't expect that potentially very old writes would be included when > setting up a new replication link. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HBASE-6731) Port HBASE-6537 'Race between balancer and disable table can lead to inconsistent cluster' to 0.92
[ https://issues.apache.org/jira/browse/HBASE-6731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-6731. --- Resolution: Later Fix Version/s: (was: 0.92.3) 0.92 is not active. > Port HBASE-6537 'Race between balancer and disable table can lead to > inconsistent cluster' to 0.92 > -- > > Key: HBASE-6731 > URL: https://issues.apache.org/jira/browse/HBASE-6731 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu >Assignee: rajeshbabu > Attachments: HBASE-6731.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HBASE-4839) Re-enable TestInstantSchemaChangeFailover#testInstantSchemaOperationsInZKForMasterFailover
[ https://issues.apache.org/jira/browse/HBASE-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-4839. --- Resolution: Won't Fix The test no longer exists > Re-enable > TestInstantSchemaChangeFailover#testInstantSchemaOperationsInZKForMasterFailover > -- > > Key: HBASE-4839 > URL: https://issues.apache.org/jira/browse/HBASE-4839 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Assignee: Subbu M Iyer > > TestInstantSchemaChangeFailover#testInstantSchemaOperationsInZKForMasterFailover > was disabled for instant schema change (HBASE-4213) after it failed on > Jenkins. > We should enable it and make it pass on Jenkins and dev environments. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9888) HBase replicates edits written before the replication peer is created
[ https://issues.apache.org/jira/browse/HBASE-9888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815322#comment-13815322 ] Jean-Daniel Cryans commented on HBASE-9888: --- bq. So,are you suggesting implementing a custom ReplicationSource to seek into the current WAL to find the edit with writeTime > sourceCreationTimestamp? It wouldn't be custom, it'd be the default behavior. > HBase replicates edits written before the replication peer is created > - > > Key: HBASE-9888 > URL: https://issues.apache.org/jira/browse/HBASE-9888 > Project: HBase > Issue Type: Bug >Reporter: Dave Latham > > When creating a new replication peer the ReplicationSourceManager enqueues > the currently open HLog to the ReplicationSource to ship to the destination > cluster. The ReplicationSource starts at the beginning of the HLog and ships > over any pre-existing writes. > A workaround is to roll all the HLogs before enabling replication. > A little background for how it affected us - we were migrating one cluster in > a master-master pair. I.e. transitioning from A <\-> B to B <-> C. After > shutting down writes from A -> B we enabled writes from C -> B. However, > this replicated some earlier writes that were in C's HLogs that had > originated in A. Since we were running a version of HBase before HBASE-7709 > those writes then got caught in a infinite replication cycle and bringing > down region servers OOM because of HBASE-9865. > However, in general, if one wants to manage what data gets replicated, one > wouldn't expect that potentially very old writes would be included when > setting up a new replication link. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9888) HBase replicates edits written before the replication peer is created
[ https://issues.apache.org/jira/browse/HBASE-9888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815314#comment-13815314 ] Jean-Daniel Cryans commented on HBASE-9888: --- bq. That sounds great. Is that 0.94 only or do the newer versions also have it? It's in trunk too. bq. Do you have an idea where the minimum timestamp would be generated? Once we get the zk event? Not sure. bq. Would it work to just do it in each RS when the ReplicationSource on that RS is created (in the mode for add_peer)? That's what I was proposing, sorry if not clear. bq. Alternatively, should each RS roll its HLog when creating a new peer? That could work but I'd rather not roll logs for this. > HBase replicates edits written before the replication peer is created > - > > Key: HBASE-9888 > URL: https://issues.apache.org/jira/browse/HBASE-9888 > Project: HBase > Issue Type: Bug >Reporter: Dave Latham > > When creating a new replication peer the ReplicationSourceManager enqueues > the currently open HLog to the ReplicationSource to ship to the destination > cluster. The ReplicationSource starts at the beginning of the HLog and ships > over any pre-existing writes. > A workaround is to roll all the HLogs before enabling replication. > A little background for how it affected us - we were migrating one cluster in > a master-master pair. I.e. transitioning from A <\-> B to B <-> C. After > shutting down writes from A -> B we enabled writes from C -> B. However, > this replicated some earlier writes that were in C's HLogs that had > originated in A. Since we were running a version of HBase before HBASE-7709 > those writes then got caught in a infinite replication cycle and bringing > down region servers OOM because of HBASE-9865. > However, in general, if one wants to manage what data gets replicated, one > wouldn't expect that potentially very old writes would be included when > setting up a new replication link. 
[jira] [Commented] (HBASE-9888) HBase replicates edits written before the replication peer is created
[ https://issues.apache.org/jira/browse/HBASE-9888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815313#comment-13815313 ] santosh banerjee commented on HBASE-9888: - {quote} In 0.94, HLogKey has a writeTime and we could seek in the current WAL until we find an edit that's been written after the source was created.{quote} This sounds interesting. So, are you suggesting implementing a custom ReplicationSource to seek into the current WAL to find the edit with writeTime > sourceCreationTimestamp? > HBase replicates edits written before the replication peer is created > - > > Key: HBASE-9888 > URL: https://issues.apache.org/jira/browse/HBASE-9888 > Project: HBase > Issue Type: Bug >Reporter: Dave Latham > > When creating a new replication peer the ReplicationSourceManager enqueues > the currently open HLog to the ReplicationSource to ship to the destination > cluster. The ReplicationSource starts at the beginning of the HLog and ships > over any pre-existing writes. > A workaround is to roll all the HLogs before enabling replication. > A little background for how it affected us - we were migrating one cluster in > a master-master pair. I.e. transitioning from A <\-> B to B <-> C. After > shutting down writes from A -> B we enabled writes from C -> B. However, > this replicated some earlier writes that were in C's HLogs that had > originated in A. Since we were running a version of HBase before HBASE-7709 > those writes then got caught in a infinite replication cycle and bringing > down region servers OOM because of HBASE-9865. > However, in general, if one wants to manage what data gets replicated, one > wouldn't expect that potentially very old writes would be included when > setting up a new replication link. -- This message was sent by Atlassian JIRA (v6.1#6144)
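The writeTime-based seek being discussed can be sketched as a simple filter. The names here are illustrative, not HBase API: the idea is that when a source starts replicating the already-open WAL for a newly added peer, it skips entries whose writeTime is not later than the peer's creation time, so pre-existing edits are never shipped.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hedged sketch of the proposed default behavior (illustrative names, not
// HBase API): ship only WAL entries written after the peer was created.
public class WalSeekSketch {
    static final class WalEntry {
        final long writeTime; final String edit;
        WalEntry(long writeTime, String edit) { this.writeTime = writeTime; this.edit = edit; }
    }

    // Keep only entries written strictly after the replication peer's
    // creation time; everything earlier is treated as pre-existing data.
    static List<WalEntry> entriesToShip(List<WalEntry> wal, long peerCreationTime) {
        return wal.stream()
                  .filter(e -> e.writeTime > peerCreationTime)
                  .collect(Collectors.toList());
    }
}
```

This achieves the same effect as the log-roll workaround (no old edits shipped) without forcing a roll on every region server.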
[jira] [Updated] (HBASE-9903) Remove the jamon generated classes from the findbugs analysis
[ https://issues.apache.org/jira/browse/HBASE-9903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-9903: --- Status: Patch Available (was: Open) > Remove the jamon generated classes from the findbugs analysis > - > > Key: HBASE-9903 > URL: https://issues.apache.org/jira/browse/HBASE-9903 > Project: HBase > Issue Type: Bug > Components: build >Affects Versions: 0.96.0, 0.98.0 >Reporter: Nicolas Liochon >Assignee: Nicolas Liochon > Fix For: 0.98.0 > > Attachments: 9903.v1.patch, 9903.v2.patch, 9903.v2.patch > > > The current filter does not work. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-7025) Metric for how many WAL files a regionserver is carrying
[ https://issues.apache.org/jira/browse/HBASE-7025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815299#comment-13815299 ] Asaf Mesika commented on HBASE-7025: I'm running into "too many hlogs" warnings, which in turn (eventually) causes the region server to crash. I'm in the middle of analyzing it through the debug log files and Graphite, and could really use this metric to understand, over time, when we started having a lot of WAL files in the queue. > Metric for how many WAL files a regionserver is carrying > > > Key: HBASE-7025 > URL: https://issues.apache.org/jira/browse/HBASE-7025 > Project: HBase > Issue Type: Improvement > Components: metrics >Reporter: stack > > A metric that shows how many WAL files a regionserver is carrying at any one > time would be useful for fingering those servers that are always over the > upper bounds and in need of attention -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9903) Remove the jamon generated classes from the findbugs analysis
[ https://issues.apache.org/jira/browse/HBASE-9903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-9903: --- Status: Open (was: Patch Available) > Remove the jamon generated classes from the findbugs analysis > - > > Key: HBASE-9903 > URL: https://issues.apache.org/jira/browse/HBASE-9903 > Project: HBase > Issue Type: Bug > Components: build >Affects Versions: 0.96.0, 0.98.0 >Reporter: Nicolas Liochon >Assignee: Nicolas Liochon > Fix For: 0.98.0 > > Attachments: 9903.v1.patch, 9903.v2.patch, 9903.v2.patch > > > The current filter does not work. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9903) Remove the jamon generated classes from the findbugs analysis
[ https://issues.apache.org/jira/browse/HBASE-9903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-9903: --- Attachment: 9903.v2.patch > Remove the jamon generated classes from the findbugs analysis > - > > Key: HBASE-9903 > URL: https://issues.apache.org/jira/browse/HBASE-9903 > Project: HBase > Issue Type: Bug > Components: build >Affects Versions: 0.98.0, 0.96.0 >Reporter: Nicolas Liochon >Assignee: Nicolas Liochon > Fix For: 0.98.0 > > Attachments: 9903.v1.patch, 9903.v2.patch, 9903.v2.patch > > > The current filter does not work. -- This message was sent by Atlassian JIRA (v6.1#6144)