[jira] [Commented] (HBASE-10486) ProtobufUtil Append Increment deserialization lost cell level timestamp
[ https://issues.apache.org/jira/browse/HBASE-10486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13896301#comment-13896301 ] Hudson commented on HBASE-10486: FAILURE: Integrated in HBase-TRUNK #4903 (See [https://builds.apache.org/job/HBase-TRUNK/4903/]) HBASE-10486: ProtobufUtil Append Increment deserialization lost cell level timestamp (jeffreyz: rev 1566505) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/protobuf/TestProtobufUtil.java ProtobufUtil Append Increment deserialization lost cell level timestamp - Key: HBASE-10486 URL: https://issues.apache.org/jira/browse/HBASE-10486 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 0.96.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.1 Attachments: hbase-10486-v2.patch, hbase-10486.patch When we deserialize Append/Increment, we use the wrong timestamp value during deserialization in the trunk/0.98 code and discard the value in the 0.96 code base. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
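The bug class described above can be sketched in isolation. This is a hypothetical model, not the actual ProtobufUtil code; the class and method names are invented, and the only fact taken from HBase is that HConstants.LATEST_TIMESTAMP is Long.MAX_VALUE, the sentinel meaning "no cell-level timestamp was set":

```java
// Hypothetical sketch of the fix's intent (not the real ProtobufUtil code):
// during deserialization, a cell that carries its own timestamp must keep it;
// only cells without one should fall back to the mutation-level timestamp.
class CellTimestampSketch {
    // Same sentinel value HBase uses for "no timestamp set".
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    /** Pick the timestamp for a deserialized cell: keep the cell-level value
     *  when one was serialized; otherwise use the mutation-level timestamp. */
    static long chooseTimestamp(long cellTs, long mutationTs) {
        return cellTs != LATEST_TIMESTAMP ? cellTs : mutationTs;
    }

    public static void main(String[] args) {
        // Cell carried its own timestamp: it must survive deserialization.
        System.out.println(chooseTimestamp(1234L, 9999L)); // prints 1234
    }
}
```

Overwriting unconditionally with the mutation timestamp (the trunk/0.98 behavior) or dropping the cell value (the 0.96 behavior) both lose the first branch of this choice.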
[jira] [Commented] (HBASE-10479) HConnection interface is public but is used internally, and contains a bunch of methods
[ https://issues.apache.org/jira/browse/HBASE-10479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13896302#comment-13896302 ] Hudson commented on HBASE-10479: FAILURE: Integrated in HBase-TRUNK #4903 (See [https://builds.apache.org/job/HBase-TRUNK/4903/]) HBASE-10479 HConnection interface is public but is used internally, and contains a bunch of methods (sershe: rev 1566501) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClusterConnection.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionManager.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionUtils.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnection.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionKey.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ZooKeeperKeepAliveConnection.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ZooKeeperRegistry.java * /hbase/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncProcess.java * /hbase/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientNoCluster.java * /hbase/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestSnapshotFromAdmin.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/CoprocessorHConnection.java * 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTableWrapper.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/HConnectionTestingUtility.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestClientTimeouts.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestMultiParallel.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestHTableWrapper.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java HConnection interface is public but is used internally, and contains a bunch of methods --- Key: HBASE-10479 URL: https://issues.apache.org/jira/browse/HBASE-10479 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: 45751591.jpg, HBASE-10479.01.patch, HBASE-10479.02.patch, HBASE-10479.03.patch, HBASE-10479.04.patch, HBASE-10479.final.patch, HBASE-10479.patch HConnection has too many methods for a public interface, and some of them should not be public. It is used extensively for internal purposes, so we keep adding methods to it that may not make sense for a public interface. The idea is to create a separate internal interface inheriting HConnection, copy some methods to it, and deprecate them on HConnection. New methods for internal use would be added to the new interface; the deprecated methods would eventually be removed from the public interface. 
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10479) HConnection interface is public but is used internally, and contains a bunch of methods
[ https://issues.apache.org/jira/browse/HBASE-10479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-10479: - Resolution: Fixed Fix Version/s: hbase-10070 0.99.0 Status: Resolved (was: Patch Available) Committed to trunk and hbase-10070. HConnection interface is public but is used internally, and contains a bunch of methods --- Key: HBASE-10479 URL: https://issues.apache.org/jira/browse/HBASE-10479 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.99.0, hbase-10070 Attachments: 45751591.jpg, HBASE-10479.01.patch, HBASE-10479.02.patch, HBASE-10479.03.patch, HBASE-10479.04.patch, HBASE-10479.final.patch, HBASE-10479.patch HConnection has too many methods for a public interface, and some of them should not be public. It is used extensively for internal purposes, so we keep adding methods to it that may not make sense for a public interface. The idea is to create a separate internal interface inheriting HConnection, copy some methods to it, and deprecate them on HConnection. New methods for internal use would be added to the new interface; the deprecated methods would eventually be removed from the public interface. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
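The refactoring described above follows a common interface-split pattern. Here is a minimal sketch with made-up names (the real interfaces are HConnection and the new ClusterConnection; locateRegion/clearRegionCache/cacheLocation are invented stand-ins, not the actual HBase signatures):

```java
// Interface-split pattern: the public interface keeps its old surface, with
// internal-only methods deprecated; an internal sub-interface carries the
// methods clients should not see, and implementations provide both.
interface PublicConnection {            // stands in for HConnection
    String locateRegion(String row);
    /** @deprecated internal use only; moved to InternalConnection */
    @Deprecated
    void clearRegionCache();
}

interface InternalConnection extends PublicConnection {  // stands in for ClusterConnection
    void cacheLocation(String row, String location);     // internal-only API
}

class ConnectionImpl implements InternalConnection {
    private final java.util.Map<String, String> cache = new java.util.HashMap<>();
    public String locateRegion(String row) { return cache.getOrDefault(row, "unknown"); }
    public void clearRegionCache() { cache.clear(); }
    public void cacheLocation(String row, String location) { cache.put(row, location); }
}
```

Internal code then types its references as InternalConnection, so new internal methods never widen the public interface, and the deprecated members can be deleted from PublicConnection in a later release without touching implementations.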
[jira] [Commented] (HBASE-10486) ProtobufUtil Append Increment deserialization lost cell level timestamp
[ https://issues.apache.org/jira/browse/HBASE-10486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13896341#comment-13896341 ] Hudson commented on HBASE-10486: SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #132 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/132/]) HBASE-10486: ProtobufUtil Append Increment deserialization lost cell level timestamp (jeffreyz: rev 1566507) * /hbase/branches/0.98/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/protobuf/TestProtobufUtil.java ProtobufUtil Append Increment deserialization lost cell level timestamp - Key: HBASE-10486 URL: https://issues.apache.org/jira/browse/HBASE-10486 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 0.96.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.1 Attachments: hbase-10486-v2.patch, hbase-10486.patch When we deserialize Append/Increment, we use the wrong timestamp value during deserialization in the trunk/0.98 code and discard the value in the 0.96 code base. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10413) Tablesplit.getLength returns 0
[ https://issues.apache.org/jira/browse/HBASE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lukas Nalezenec updated HBASE-10413: Attachment: HBASE-10413-6.patch Tablesplit.getLength returns 0 -- Key: HBASE-10413 URL: https://issues.apache.org/jira/browse/HBASE-10413 Project: HBase Issue Type: Bug Components: Client, mapreduce Affects Versions: 0.96.1.1 Reporter: Lukas Nalezenec Assignee: Lukas Nalezenec Attachments: HBASE-10413-2.patch, HBASE-10413-3.patch, HBASE-10413-4.patch, HBASE-10413-5.patch, HBASE-10413-6.patch, HBASE-10413.patch InputSplits should be sorted by length, but TableSplit does not contain a real getLength() implementation:
@Override
public long getLength() {
  // Not clear how to obtain this... seems to be used only for sorting splits
  return 0;
}
This is causing us problems with scheduling: we have jobs that are supposed to finish in a limited time, but they often get stuck in the last mapper working on a large region. Can we implement this method? What is the best way? We were thinking about estimating the size from the size of the files on HDFS. We would like to get a Scanner from the TableSplit, use startRow, stopRow and the column families to get the corresponding region, then compute the HDFS size for the given region and column family. Update: This ticket was about a production issue - I talked with the guy who worked on this and he said our production issue was probably not directly caused by getLength() returning 0. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
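The fix direction discussed in the ticket can be sketched as follows. This is a hypothetical model, not the actual TableSplit patch: the class and field names are invented, and the length would in practice come from the estimated HDFS size of the region's store files at split-creation time:

```java
// Sketch: compute an estimated byte size per split when the splits are
// created, carry it on the split, and return it from getLength() so the
// framework can order splits by size instead of seeing a constant 0.
class SizedTableSplit implements Comparable<SizedTableSplit> {
    final String startRow;
    final String stopRow;
    private final long length;  // estimated bytes for this region's data

    SizedTableSplit(String startRow, String stopRow, long estimatedBytes) {
        this.startRow = startRow;
        this.stopRow = stopRow;
        this.length = estimatedBytes;
    }

    public long getLength() { return length; }  // no longer hard-coded to 0

    // Largest splits first, so the biggest region is not scheduled last.
    @Override
    public int compareTo(SizedTableSplit o) { return Long.compare(o.length, length); }
}
```

With a real length available, the scheduler can start the largest region early rather than leaving it to a straggling last mapper, which is the scheduling symptom the reporter describes.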
[jira] [Updated] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ding Yuan updated HBASE-10452: -- Attachment: HBase-10452-trunk-v3.patch Potential bugs in exception handlers Key: HBASE-10452 URL: https://issues.apache.org/jira/browse/HBASE-10452 Project: HBase Issue Type: Bug Components: Client, master, regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Attachments: HBase-10452-trunk-v2.patch, HBase-10452-trunk-v3.patch, HBase-10452-trunk.patch Hi HBase developers, We are a group of researchers on software reliability. Recently we did a study and found that the majority of the most severe failures in HBase are caused by bugs in exception handling logic -- it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated! Ding
= Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java
{noformat}
protected void releaseTableLock() {
  if (this.tableLock != null) {
    try {
      this.tableLock.release();
    } catch (IOException ex) {
      LOG.warn("Could not release the table lock", ex);
      //TODO: if we get here, and not abort RS, this lock will never be released
    }
  }
}
{noformat}
The lock is not released if the exception occurs, causing potential deadlock or starvation. A similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java
==
= Case 2: Line: 252, File: org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java
{noformat}
try {
  Field fEnd = SequenceFile.Reader.class.getDeclaredField("end");
  fEnd.setAccessible(true);
  end = fEnd.getLong(this.reader);
} catch (Exception e) { /* reflection fail. keep going */ }
{noformat}
The caught Exception seems too general. While reflection-related errors might be harmless, the try block can throw other exceptions, including SecurityException, IllegalAccessException, etc. Currently all those exceptions are ignored. Maybe the safe way is to ignore the specific reflection-related errors while logging and handling other types of unexpected exceptions.
==
= Case 3: Line: 148, File: org/apache/hadoop/hbase/HBaseConfiguration.java
{noformat}
try {
  if (Class.forName("org.apache.hadoop.conf.ConfServlet") != null) {
    isShowConf = true;
  }
} catch (Exception e) {
}
{noformat}
Similar to the previous case, the exception handling is too general. While ClassNotFoundException might be the normal case and ignored, Class.forName can also throw other throwables (e.g., LinkageError) under some unexpected and rare error cases. If that happens, the error will be lost. So maybe change it to the following (note that ExceptionInInitializerError must be caught before its superclass LinkageError):
{noformat}
try {
  if (Class.forName("org.apache.hadoop.conf.ConfServlet") != null) {
    isShowConf = true;
  }
} catch (ExceptionInInitializerError e) {
  LOG.warn(..); // handle initializer error
} catch (LinkageError e) {
  LOG.warn(..); // handle linkage error
} catch (ClassNotFoundException e) {
  LOG.debug(..); // ignore
}
{noformat}
==
= Case 4: Line: 163, File: org/apache/hadoop/hbase/client/Get.java
{noformat}
public Get setTimeStamp(long timestamp) {
  try {
    tr = new TimeRange(timestamp, timestamp+1);
  } catch (IOException e) {
    // Will never happen
  }
  return this;
}
{noformat}
Even if the IOException never happens right now, is it possible to happen in the future due to code change? At least there should be a log message. The current behavior is dangerous since if the exception ever happens in any unexpected scenario, it will be silently swallowed. A similar code pattern can be found at: Line: 300, File: org/apache/hadoop/hbase/client/Scan.java
==
= Case 5: Line: 207, File: org/apache/hadoop/hbase/util/JVM.java
{noformat}
if (input != null) {
  try {
    input.close();
  } catch (IOException ignored) {
  }
}
{noformat}
Any exception encountered in close is completely ignored, not even logged. In particular, the same exception scenario was handled differently in other methods in the same file (Line: 154, same file).
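The remediation the report suggests for Cases 4 and 5 — never swallow an "impossible" or cleanup-time exception without a trace — can be sketched in a few lines. The names here are hypothetical (not HBase code), and a counter stands in for a real LOG.warn call so the behavior is observable:

```java
// Cleanup helper that ignores close() failures for control flow but still
// records them, instead of dropping the exception silently.
class QuietCloser {
    static int warnings = 0;  // stand-in for counting LOG.warn(...) calls

    static void closeQuietly(java.io.Closeable c) {
        if (c == null) return;
        try {
            c.close();
        } catch (java.io.IOException e) {
            warnings++;  // real code would call LOG.warn("close failed", e)
        }
    }
}
```

The same shape applies to the "will never happen" handler in Get.setTimeStamp: keep the catch if the API contract demands it, but emit a log line so an unexpected occurrence is diagnosable.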
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13896424#comment-13896424 ] Ding Yuan commented on HBASE-10452: --- Thanks for the comments! Attached a new patch to address them. As for the possible integer overflow error from TimeRange: an IOException instead of a RuntimeException is now thrown, so the upper levels will deal with it. Let me know if this is fine. Potential bugs in exception handlers Key: HBASE-10452 URL: https://issues.apache.org/jira/browse/HBASE-10452 Project: HBase Issue Type: Bug Components: Client, master, regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Attachments: HBase-10452-trunk-v2.patch, HBase-10452-trunk-v3.patch, HBase-10452-trunk.patch Hi HBase developers, We are a group of researchers on software reliability. Recently we did a study and found that the majority of the most severe failures in HBase are caused by bugs in exception handling logic -- it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated! Ding
= Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java
{noformat}
protected void releaseTableLock() {
  if (this.tableLock != null) {
    try {
      this.tableLock.release();
    } catch (IOException ex) {
      LOG.warn("Could not release the table lock", ex);
      //TODO: if we get here, and not abort RS, this lock will never be released
    }
  }
}
{noformat}
The lock is not released if the exception occurs, causing potential deadlock or starvation. A similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java
==
= Case 2: Line: 252, File: org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java
{noformat}
try {
  Field fEnd = SequenceFile.Reader.class.getDeclaredField("end");
  fEnd.setAccessible(true);
  end = fEnd.getLong(this.reader);
} catch (Exception e) { /* reflection fail. keep going */ }
{noformat}
The caught Exception seems too general. While reflection-related errors might be harmless, the try block can throw other exceptions, including SecurityException, IllegalAccessException, etc. Currently all those exceptions are ignored. Maybe the safe way is to ignore the specific reflection-related errors while logging and handling other types of unexpected exceptions.
==
= Case 3: Line: 148, File: org/apache/hadoop/hbase/HBaseConfiguration.java
{noformat}
try {
  if (Class.forName("org.apache.hadoop.conf.ConfServlet") != null) {
    isShowConf = true;
  }
} catch (Exception e) {
}
{noformat}
Similar to the previous case, the exception handling is too general. While ClassNotFoundException might be the normal case and ignored, Class.forName can also throw other throwables (e.g., LinkageError) under some unexpected and rare error cases. If that happens, the error will be lost. So maybe change it to the following (note that ExceptionInInitializerError must be caught before its superclass LinkageError):
{noformat}
try {
  if (Class.forName("org.apache.hadoop.conf.ConfServlet") != null) {
    isShowConf = true;
  }
} catch (ExceptionInInitializerError e) {
  LOG.warn(..); // handle initializer error
} catch (LinkageError e) {
  LOG.warn(..); // handle linkage error
} catch (ClassNotFoundException e) {
  LOG.debug(..); // ignore
}
{noformat}
==
= Case 4: Line: 163, File: org/apache/hadoop/hbase/client/Get.java
{noformat}
public Get setTimeStamp(long timestamp) {
  try {
    tr = new TimeRange(timestamp, timestamp+1);
  } catch (IOException e) {
    // Will never happen
  }
  return this;
}
{noformat}
Even if the IOException never happens right now, is it possible to happen in the future due to code change? At least there should be a log message. The current behavior is dangerous since if the exception ever happens in any unexpected scenario, it will be silently swallowed. A similar code pattern can be found at: Line: 300, File: org/apache/hadoop/hbase/client/Scan.java
==
= Case 5: Line: 207, File: org/apache/hadoop/hbase/util/JVM.java
{noformat}
if (input != null) {
  try {
    input.close();
  } catch (IOException ignored) {
  }
}
{noformat}
[jira] [Commented] (HBASE-10413) Tablesplit.getLength returns 0
[ https://issues.apache.org/jira/browse/HBASE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13896437#comment-13896437 ] Hadoop QA commented on HBASE-10413: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12627957/HBASE-10413-6.patch against trunk revision . ATTACHMENT ID: 12627957 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8649//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8649//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8649//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8649//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8649//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8649//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8649//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8649//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8649//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8649//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8649//console This message is automatically generated. 
Tablesplit.getLength returns 0 -- Key: HBASE-10413 URL: https://issues.apache.org/jira/browse/HBASE-10413 Project: HBase Issue Type: Bug Components: Client, mapreduce Affects Versions: 0.96.1.1 Reporter: Lukas Nalezenec Assignee: Lukas Nalezenec Attachments: HBASE-10413-2.patch, HBASE-10413-3.patch, HBASE-10413-4.patch, HBASE-10413-5.patch, HBASE-10413-6.patch, HBASE-10413.patch InputSplits should be sorted by length, but TableSplit does not contain a real getLength() implementation:
@Override
public long getLength() {
  // Not clear how to obtain this... seems to be used only for sorting splits
  return 0;
}
This is causing us problems with scheduling: we have jobs that are supposed to finish in a limited time, but they often get stuck in the last mapper working on a large region. Can we implement this method? What is the best way? We were thinking about estimating the size from the size of the files on HDFS. We would like to get a Scanner from the TableSplit, use startRow, stopRow and the column families to get the corresponding region, then compute the HDFS size for the given region and column family. Update: This ticket was about a production issue - I talked with the guy who worked on this and he said our production issue was probably not directly caused by getLength() returning 0. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10413) Tablesplit.getLength returns 0
[ https://issues.apache.org/jira/browse/HBASE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13896438#comment-13896438 ] Lukas Nalezenec commented on HBASE-10413: - I have removed setLength() from TableSplit. Unit tests are green; I would like to resolve this ticket. Tablesplit.getLength returns 0 -- Key: HBASE-10413 URL: https://issues.apache.org/jira/browse/HBASE-10413 Project: HBase Issue Type: Bug Components: Client, mapreduce Affects Versions: 0.96.1.1 Reporter: Lukas Nalezenec Assignee: Lukas Nalezenec Attachments: HBASE-10413-2.patch, HBASE-10413-3.patch, HBASE-10413-4.patch, HBASE-10413-5.patch, HBASE-10413-6.patch, HBASE-10413.patch InputSplits should be sorted by length, but TableSplit does not contain a real getLength() implementation:
@Override
public long getLength() {
  // Not clear how to obtain this... seems to be used only for sorting splits
  return 0;
}
This is causing us problems with scheduling: we have jobs that are supposed to finish in a limited time, but they often get stuck in the last mapper working on a large region. Can we implement this method? What is the best way? We were thinking about estimating the size from the size of the files on HDFS. We would like to get a Scanner from the TableSplit, use startRow, stopRow and the column families to get the corresponding region, then compute the HDFS size for the given region and column family. Update: This ticket was about a production issue - I talked with the guy who worked on this and he said our production issue was probably not directly caused by getLength() returning 0. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13896489#comment-13896489 ] Hadoop QA commented on HBASE-10452: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12627964/HBase-10452-trunk-v3.patch against trunk revision . ATTACHMENT ID: 12627964 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//console This message is automatically generated. Potential bugs in exception handlers Key: HBASE-10452 URL: https://issues.apache.org/jira/browse/HBASE-10452 Project: HBase Issue Type: Bug Components: Client, master, regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Attachments: HBase-10452-trunk-v2.patch, HBase-10452-trunk-v3.patch, HBase-10452-trunk.patch Hi HBase developers, We are a group of researchers on software reliability. 
Recently we did a study and found that the majority of the most severe failures in HBase are caused by bugs in exception handling logic -- it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated! Ding
= Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java
{noformat}
protected void releaseTableLock() {
  if (this.tableLock != null) {
    try {
      this.tableLock.release();
    } catch (IOException ex) {
      LOG.warn("Could not release the table lock", ex);
      //TODO: if we get here, and not abort RS, this lock will never be released
    }
  }
}
{noformat}
The lock is not released if the exception occurs, causing potential deadlock or starvation. A similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java ==
[jira] [Commented] (HBASE-10489) TestImportExport fails in 0.94 with Hadoop2
[ https://issues.apache.org/jira/browse/HBASE-10489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13896753#comment-13896753 ] Lars Hofhansl commented on HBASE-10489: --- Passed in latest run. It seems either TestImportExport or TestImportTsv passes now. There are some interactions between the tests. TestImportExport fails in 0.94 with Hadoop2 --- Key: HBASE-10489 URL: https://issues.apache.org/jira/browse/HBASE-10489 Project: HBase Issue Type: Bug Components: test Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.17 Attachments: 10489.txt With HBASE-10363 fixed, we're now seeing other M/R tests failing. TestImportExport is one of them. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10485: --- Fix Version/s: (was: 0.98.0) 0.98.1 PrefixFilter#filterKeyValue() should perform filtering on row key - Key: HBASE-10485 URL: https://issues.apache.org/jira/browse/HBASE-10485 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.1, 0.99.0 Attachments: 10485-0.94.txt, 10485-trunk.addendum, 10485-v1.txt Niels reported an issue under the thread 'Trouble writing custom filter for use in FilterList' where his custom filter, used in a FilterList along with PrefixFilter, produced unexpected results. His test can be found here: https://github.com/nielsbasjes/HBase-filter-problem This is due to PrefixFilter#filterKeyValue() using FilterBase#filterKeyValue(), which returns ReturnCode.INCLUDE. When FilterList.Operator.MUST_PASS_ONE is specified, FilterList#filterKeyValue() would return ReturnCode.INCLUDE even when the row key prefix doesn't match, while the other filter's filterKeyValue() returns ReturnCode.NEXT_COL. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
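The interaction described above is easy to see in a tiny model of the OR semantics. This is not the real HBase FilterList code, just a sketch of MUST_PASS_ONE with a two-value subset of ReturnCode:

```java
// Model of FilterList.MUST_PASS_ONE: the cell is included as soon as ANY
// filter votes INCLUDE. A filter that unconditionally returns INCLUDE (as
// PrefixFilter did via FilterBase#filterKeyValue) therefore masks every
// other filter's attempt to exclude the cell.
class FilterModel {
    enum ReturnCode { INCLUDE, NEXT_COL }

    static ReturnCode mustPassOne(ReturnCode... votes) {
        for (ReturnCode v : votes) {
            if (v == ReturnCode.INCLUDE) {
                return ReturnCode.INCLUDE;
            }
        }
        return ReturnCode.NEXT_COL;
    }
}
```

This is why the fix makes PrefixFilter#filterKeyValue() vote on the row key itself: under OR semantics a filter that never votes against a cell effectively disables the whole list.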
[jira] [Commented] (HBASE-10413) Tablesplit.getLength returns 0
[ https://issues.apache.org/jira/browse/HBASE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896763#comment-13896763 ] Ted Yu commented on HBASE-10413: [~apurtell]: Do you want this in 0.98 ? Tablesplit.getLength returns 0 -- Key: HBASE-10413 URL: https://issues.apache.org/jira/browse/HBASE-10413 Project: HBase Issue Type: Bug Components: Client, mapreduce Affects Versions: 0.96.1.1 Reporter: Lukas Nalezenec Assignee: Lukas Nalezenec Attachments: HBASE-10413-2.patch, HBASE-10413-3.patch, HBASE-10413-4.patch, HBASE-10413-5.patch, HBASE-10413-6.patch, HBASE-10413.patch InputSplits should be sorted by length but TableSplit does not contain real getLength implementation: @Override public long getLength() { // Not clear how to obtain this... seems to be used only for sorting splits return 0; } This is causing us problem with scheduling - we have got jobs that are supposed to finish in limited time but they get often stuck in last mapper working on large region. Can we implement this method ? What is the best way ? We were thinking about estimating size by size of files on HDFS. We would like to get Scanner from TableSplit, use startRow, stopRow and column families to get corresponding region than computing size of HDFS for given region and column family. Update: This ticket was about production issue - I talked with guy who worked on this and he said our production issue was probably not directly caused by getLength() returning 0. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10490) Simplify RpcClient code
[ https://issues.apache.org/jira/browse/HBASE-10490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10490: Attachment: 10490.v1.patch Simplify RpcClient code --- Key: HBASE-10490 URL: https://issues.apache.org/jira/browse/HBASE-10490 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.99.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.99.0 Attachments: 10490.v1.patch The code is complex. Here is a set of proposed changes, for trunk:
1) remove PingInputStream: if rpcTimeout > 0 it just rethrows the exception. I expect that we always have an rpcTimeout, so we can remove the code.
2) remove sendPing: just close the connection if it's not used for a while, instead of trying to ping the server.
3) remove the maxIddle time: to avoid confusion if someone has overwritten the conf.
4) remove shouldCloseConnection: it was more or less synchronized with closeException. Having a single variable instead of two avoids the synchronization.
5) remove lastActivity: instead of trying to have an exact timeout, just kill the connection after some time. lastActivity could be set to wrong values if the server was slow to answer.
6) hopefully, better management of the exceptions: we don't use someone else's close exception as an input for another one. Same goes for interruption.
I may have something wrong in the code. I will review it myself again. Feedback welcome, especially on the ping removal: I hope I got all the use cases.
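Item 2 of the proposal (close idle connections rather than ping) can be sketched as a last-use timestamp check. IdleConnection, maxIdleMillis and closeIfIdle are illustrative names, not the actual RpcClient fields:

```java
public class IdleCloseSketch {
    // Illustrative connection wrapper: instead of sending keep-alive pings,
    // record the last time the connection carried a call and close it once
    // it has been unused longer than the idle budget.
    static class IdleConnection {
        private final long maxIdleMillis;
        private long lastUsedMillis;
        private boolean closed;

        IdleConnection(long maxIdleMillis, long nowMillis) {
            this.maxIdleMillis = maxIdleMillis;
            this.lastUsedMillis = nowMillis;
        }

        void markUsed(long nowMillis) { lastUsedMillis = nowMillis; }

        // Called periodically by a cleanup chore; returns true if closed.
        boolean closeIfIdle(long nowMillis) {
            if (!closed && nowMillis - lastUsedMillis > maxIdleMillis) {
                closed = true;   // a real client would also close the socket
            }
            return closed;
        }

        boolean isClosed() { return closed; }
    }

    public static void main(String[] args) {
        IdleConnection conn = new IdleConnection(10_000, 0);
        conn.markUsed(5_000);
        System.out.println(conn.closeIfIdle(9_000));   // false: used 4s ago
        System.out.println(conn.closeIfIdle(20_000));  // true: idle for 15s
    }
}
```

The trade-off versus pinging: an idle connection costs a reconnect on next use, but no background traffic and no timeout-interpreting wrapper stream are needed.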
[jira] [Updated] (HBASE-10490) Simplify RpcClient code
[ https://issues.apache.org/jira/browse/HBASE-10490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10490: Status: Patch Available (was: Open)
[jira] [Created] (HBASE-10490) Simplify RpcClient code
Nicolas Liochon created HBASE-10490: --- Summary: Simplify RpcClient code Key: HBASE-10490 URL: https://issues.apache.org/jira/browse/HBASE-10490 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.99.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.99.0
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896787#comment-13896787 ] Hudson commented on HBASE-10485: SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #133 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/133/]) HBASE-10485 Addendum (tedyu: rev 1566651) * /hbase/branches/0.98/hbase-client/src/main/java/org/apache/hadoop/hbase/filter/PrefixFilter.java
[jira] [Updated] (HBASE-10481) API Compatibility JDiff script does not properly handle arguments in reverse order
[ https://issues.apache.org/jira/browse/HBASE-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10481: -- Fix Version/s: (was: 0.94.16) 0.94.17 API Compatibility JDiff script does not properly handle arguments in reverse order -- Key: HBASE-10481 URL: https://issues.apache.org/jira/browse/HBASE-10481 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.98.0, 0.94.16, 0.99.0, 0.96.1.1 Reporter: Aleksandr Shulman Assignee: Aleksandr Shulman Priority: Minor Fix For: 0.98.1, 0.99.0, 0.96.1.1, 0.94.17 Attachments: HBASE-10481-v1.patch [~jmhsieh] found an issue when doing a diff between a pre-0.96 branch and a post-0.96 branch. Typically, if the pre-0.96 branch is specified first and the post-0.96 branch second, the existing logic handles it. When the arguments are in the reverse order, the logic does not handle them properly. The fix should address this.
[jira] [Commented] (HBASE-10490) Simplify RpcClient code
[ https://issues.apache.org/jira/browse/HBASE-10490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896796#comment-13896796 ] stack commented on HBASE-10490: --- +1 on removing the ping code. Spelling: minIddleTimeBeforeClose. I know this is not you, but this mechanism seems fragile: +protected final AtomicReference<IOException> closeReason = new AtomicReference<IOException>(); i.e. keeping around the exception. Why do you think the code used to try to contain exceptions? Now you let them out. Good stuff [~nkeywal]
[jira] [Commented] (HBASE-10490) Simplify RpcClient code
[ https://issues.apache.org/jira/browse/HBASE-10490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896802#comment-13896802 ] Nicolas Liochon commented on HBASE-10490: - bq. Spelling minIddleTimeBeforeClose Sure. bq. i.e. keeping around the exception We're clean now: we use it only in logs and as the initCause of the exceptions we throw. So we could as well fully remove it (basically, between the boolean and the exception, I kept the exception, but keeping only the boolean should be fine as well).
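The single-variable close pattern discussed in these two comments (one AtomicReference<IOException> carrying both "are we closed?" and "why?", instead of a shouldCloseConnection boolean plus a separate closeException) can be sketched like this; CloseOnce is a hypothetical name, not the RpcClient class:

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

public class CloseOnceSketch {
    // One atomic field carries both facts: non-null means "closed", and the
    // value itself is the reason. compareAndSet guarantees only the first
    // caller's reason is kept, with no extra synchronization block.
    static class CloseOnce {
        private final AtomicReference<IOException> closeReason = new AtomicReference<>();

        // Returns true if this call performed the close.
        boolean close(IOException reason) {
            return closeReason.compareAndSet(null, reason);
        }

        boolean isClosed() { return closeReason.get() != null; }

        IOException reason() { return closeReason.get(); }
    }

    public static void main(String[] args) {
        CloseOnce conn = new CloseOnce();
        System.out.println(conn.isClosed());                      // false
        System.out.println(conn.close(new IOException("idle")));  // true: first close wins
        System.out.println(conn.close(new IOException("again"))); // false: already closed
        System.out.println(conn.reason().getMessage());           // idle
    }
}
```

As the comment notes, the stored exception is then used only for logging and as an initCause, never as the input for constructing another connection's failure.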
[jira] [Commented] (HBASE-10481) API Compatibility JDiff script does not properly handle arguments in reverse order
[ https://issues.apache.org/jira/browse/HBASE-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896801#comment-13896801 ] stack commented on HBASE-10481: --- lgtm I will apply soon unless objection (I can fix 100 lines on commit)
[jira] [Commented] (HBASE-10352) Region and RegionServer changes for opening region replicas, and refreshing store files
[ https://issues.apache.org/jira/browse/HBASE-10352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896815#comment-13896815 ] Devaraj Das commented on HBASE-10352: - LGTM overall. One nit: given the naming considerations in HBASE-10347, should we make this patch adhere to them - primary to default and secondary to replica... Region and RegionServer changes for opening region replicas, and refreshing store files --- Key: HBASE-10352 URL: https://issues.apache.org/jira/browse/HBASE-10352 Project: HBase Issue Type: Sub-task Components: Region Assignment, regionserver Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.99.0 Attachments: hbase-10352_v2.patch Region replicas should be opened in read-only, replica mode so that they serve queries from the primary regions' files. This jira will also capture periodic refreshing of the store files by the secondary regions so that they can pick up flushed and compacted files, as described in the region snapshots section of the design doc for the parent jira.
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896855#comment-13896855 ] Hudson commented on HBASE-10485: SUCCESS: Integrated in HBase-TRUNK #4904 (See [https://builds.apache.org/job/HBase-TRUNK/4904/]) HBASE-10485 Addendum (tedyu: rev 1566653) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/filter/PrefixFilter.java
[jira] [Commented] (HBASE-10413) Tablesplit.getLength returns 0
[ https://issues.apache.org/jira/browse/HBASE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896865#comment-13896865 ] Lukas Nalezenec commented on HBASE-10413: - It would be great.
[jira] [Commented] (HBASE-10490) Simplify RpcClient code
[ https://issues.apache.org/jira/browse/HBASE-10490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896872#comment-13896872 ] Hadoop QA commented on HBASE-10490: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12628010/10490.v1.patch against trunk revision . ATTACHMENT ID: 12628010 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.client.TestMultiParallel org.apache.hadoop.hbase.replication.TestReplicationKillSlaveRS Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8651//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8651//console This message is automatically generated.
[jira] [Created] (HBASE-10491) RegionLocations::getRegionLocation can return unexpected replica
Sergey Shelukhin created HBASE-10491: Summary: RegionLocations::getRegionLocation can return unexpected replica Key: HBASE-10491 URL: https://issues.apache.org/jira/browse/HBASE-10491 Project: HBase Issue Type: Bug Affects Versions: hbase-10070 Reporter: Sergey Shelukhin The method returns the first non-null replica. If the first replica is assumed to always be non-null (discussed with Enis), then this code is not necessary; it should return the 0th one, and maybe assert it's not null. If that is not the case, then the code may be incorrect and may return a non-primary to some code (the locateRegion overload) that doesn't expect it. Perhaps the method should be called getAnyRegionReplica or something like that, and get(Primary?)RegionLocation should return the first.
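The two candidate semantics can be contrasted in a small sketch; a plain String array stands in for the HRegionLocation array, and both method names are illustrative:

```java
public class ReplicaLookupSketch {
    // Current behavior described in the issue: scan for the first
    // non-null replica, which may silently hand back a secondary.
    static String firstNonNullReplica(String[] locations) {
        for (String loc : locations) {
            if (loc != null) return loc;
        }
        return null;
    }

    // Proposed behavior: the primary lives at index 0; callers that need
    // the primary get exactly that (or null), never a secondary.
    static String primaryReplica(String[] locations) {
        assert locations.length > 0 : "replica array must not be empty";
        return locations[0];
    }

    public static void main(String[] args) {
        // Primary (index 0) not cached yet, secondary (index 1) cached:
        String[] locations = { null, "server-b,60020" };
        System.out.println(firstNonNullReplica(locations)); // server-b,60020 - a secondary!
        System.out.println(primaryReplica(locations));      // null - caller must re-locate
    }
}
```

The difference matters exactly in the case the issue describes: code such as a locateRegion overload that assumes it got the primary would silently operate on a secondary under the first-non-null semantics.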
[jira] [Updated] (HBASE-10491) RegionLocations::getRegionLocation can return unexpected replica
[ https://issues.apache.org/jira/browse/HBASE-10491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-10491: - Issue Type: Sub-task (was: Bug) Parent: HBASE-10070
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896899#comment-13896899 ] Lars Hofhansl commented on HBASE-10485: --- [~stack], I assume you want that in 0.96. I'll commit to 0.94 and 0.96 unless I hear an objection from you.
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896898#comment-13896898 ] Lars Hofhansl commented on HBASE-10485: --- +1 on addendum. Good catch [~ram_krish]. RowFilter could have served as an example. We should add some documentation to Filter.filterKeyValue that it must be consistent with filterRow and filterRowKey (just like a Comparable must be consistent with equals in collections). Otherwise the result is undefined.
[jira] [Commented] (HBASE-10413) Tablesplit.getLength returns 0
[ https://issues.apache.org/jira/browse/HBASE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896900#comment-13896900 ] Nick Dimiduk commented on HBASE-10413: -- Overall looks good. IMHO, better to have constructors with more arguments and support RAII than setters folks forget to call. Just some nits now: TableInputFormatBase: - HBase code doesn't use format strings (for whatever reason). Please keep it consistent and use string concatenation. - extra whitespace TableSplit: - the comment, as with the previous version, has no context; just omit it. Nice test in TestRegionSizeCalculator :) [~enis] anything else here?
[jira] [Updated] (HBASE-10486) ProtobufUtil Append Increment deserialization lost cell level timestamp
[ https://issues.apache.org/jira/browse/HBASE-10486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-10486: -- Fix Version/s: 0.99.0 ProtobufUtil Append Increment deserialization lost cell level timestamp - Key: HBASE-10486 URL: https://issues.apache.org/jira/browse/HBASE-10486 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 0.96.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.1, 0.99.0 Attachments: hbase-10486-v2.patch, hbase-10486.patch When we deserialize Append/Increment, we use the wrong timestamp value during deserialization in the trunk/0.98 code and discard the value in the 0.96 code base.
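The bug class is easy to model: during deserialization, an operation-level timestamp can clobber a cell-level one that the client set explicitly. The toy Cell and the two deserialize helpers below illustrate the before/after behavior; they are not the actual ProtobufUtil code:

```java
public class TimestampDeserSketch {
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE; // HBase's "unset" marker

    // Toy cell: qualifier plus timestamp (LATEST_TIMESTAMP means "not set").
    static class Cell {
        final String qualifier;
        final long timestamp;
        Cell(String qualifier, long timestamp) { this.qualifier = qualifier; this.timestamp = timestamp; }
    }

    // The bug class described above: the op-level timestamp
    // overwrites whatever the cell carried on the wire.
    static Cell deserializeLossy(Cell wire, long opTimestamp) {
        return new Cell(wire.qualifier, opTimestamp);
    }

    // The fix: keep the cell-level timestamp when one was serialized,
    // falling back to the op-level value only when the cell has none.
    static Cell deserializePreserving(Cell wire, long opTimestamp) {
        long ts = wire.timestamp != LATEST_TIMESTAMP ? wire.timestamp : opTimestamp;
        return new Cell(wire.qualifier, ts);
    }

    public static void main(String[] args) {
        Cell wire = new Cell("q1", 1234L);  // client set an explicit cell timestamp
        System.out.println(deserializeLossy(wire, 9999L).timestamp);      // 9999 - lost
        System.out.println(deserializePreserving(wire, 9999L).timestamp); // 1234 - kept
    }
}
```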
[jira] [Created] (HBASE-10492) open daughter regions can unpredictably take long time
Jerry He created HBASE-10492: Summary: open daughter regions can unpredictably take long time Key: HBASE-10492 URL: https://issues.apache.org/jira/browse/HBASE-10492 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.0 Reporter: Jerry He During stress testing I have seen the client getting RetriesExhaustedWithDetailsException: Failed 748 actions: NotServingRegionException. In the master log, 2014-02-08 20:43 is the timestamp of the transition from OFFLINE to SPLITTING_NEW, and 2014-02-08 21:41 is the timestamp of the transition from SPLITTING_NEW to OPEN. The corresponding time period in the region server log is:
{code}
2014-02-08 20:44:12,662 WARN org.apache.hadoop.hbase.regionserver.HRegionFileSystem: .regioninfo file not found for region: 010c1981882d1a59201af5e2dc589d44
2014-02-08 20:44:12,666 WARN org.apache.hadoop.hbase.regionserver.HRegionFileSystem: .regioninfo file not found for region: c2eb9b7971ca7f3fed3da86df5b788e7
{code}
There were no INFO entries related to these two regions until the following (note at the end: Split took 57mins, 16sec):
{code}
2014-02-08 21:41:14,029 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined c2eb9b7971ca7f3fed3da86df5b788e7; next sequenceid=213355
2014-02-08 21:41:14,031 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined 010c1981882d1a59201af5e2dc589d44; next sequenceid=213354
2014-02-08 21:41:14,032 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks for region=tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
2014-02-08 21:41:14,054 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
with server=hdtest208.svl.ibm.com,60020,1391887547473 2014-02-08 21:41:14,054 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Finished post open deploy task for tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7. 2014-02-08 21:41:14,054 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks for region=tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44. 2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HStore: Completed compaction of 10 file(s) in cf of tpch_hb_1000_2.lineitem,^\x01\x8B\xE7(\x80\x01\x80\x93\xFD\x01\x01\x80\x00\x00\x00\xB5\x0E\xCC'\x01\x80\x00\x00\x03,1391918508561.1fbcfc0a792435dfd73ec5b0ef5c953c. into 451be6df8c604993ae540b808d9cfa08(size=72.8 M), total size for store is 2.4 G. This selection was in queue for 0sec, and took 1mins, 40sec to execute. 2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Completed compaction: Request = regionName=tpch_hb_1000_2.lineitem,^\x01\x8B\xE7(\x80\x01\x80\x93\xFD\x01\x01\x80\x00\x00\x00\xB5\x0E\xCC'\x01\x80\x00\x00\x03,1391918508561.1fbcfc0a792435dfd73ec5b0ef5c953c., storeName=cf, fileCount=10, fileSize=94.1 M, priority=9883, time=1391924373278861000; duration=1mins, 40sec 2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on cf in region tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7. 2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HStore: Starting compaction of 10 file(s) in cf of tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7. 
into tmpdir=gpfs:/hbase/data/default/tpch_hb_1000_2.lineitem/c2eb9b7971ca7f3fed3da86df5b788e7/.tmp, totalSize=709.7 M 2014-02-08 21:41:14,066 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44. with server=hdtest208.svl.ibm.com,60020,1391887547473 2014-02-08 21:41:14,066 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Finished post open deploy task for tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44. 2014-02-08 21:41:14,190 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Region split, hbase:meta updated, and report to master. Parent=tpch_hb_1000_2.lineitem,,1391918508561.b576e8db65d56ec08db5ca900587c28d., new regions: tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44., tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.. Split took 57mins, 16sec {code} -- This message
[jira] [Commented] (HBASE-10492) open daughter regions can unpredictably take long time
[ https://issues.apache.org/jira/browse/HBASE-10492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896937#comment-13896937 ] Jerry He commented on HBASE-10492: -- The problem is probably caused by this part of the code in SplitTransaction.openDaughters(): {code} // Open daughters in parallel. DaughterOpener aOpener = new DaughterOpener(server, a); DaughterOpener bOpener = new DaughterOpener(server, b); aOpener.start(); bOpener.start(); try { aOpener.join(); bOpener.join(); } {code} We open the daughter regions in separate new threads. It is possible, although rare, that due to issues such as thread scheduling the daughter regions are not opened until a long time later. open daughter regions can unpredictably take long time -- Key: HBASE-10492 URL: https://issues.apache.org/jira/browse/HBASE-10492 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.0 Reporter: Jerry He During stress testing, I have seen the client getting RetriesExhaustedWithDetailsException: Failed 748 actions: NotServingRegionException. On the master log, 2014-02-08 20:43 is the timestamp from OFFLINE to SPLITTING_NEW, and 2014-02-08 21:41 is the timestamp from SPLITTING_NEW to OPEN.
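The openDaughters() pattern quoted in the comment above can be modeled with plain threads. This is a hypothetical, simplified sketch with no HBase dependencies (the DaughterOpener and region names here are stand-ins, not the real classes); it shows the same structure, and how a bounded wait, unlike the unbounded join() in the quoted snippet, would at least let the caller notice a daughter that is taking unpredictably long to open:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical model of SplitTransaction.openDaughters(): both daughters are
// opened in parallel threads and the split blocks until both have finished.
class DaughterOpenSketch {
    static final class DaughterOpener extends Thread {
        private final String region;
        private final CountDownLatch opened;
        DaughterOpener(String region, CountDownLatch opened) {
            this.region = region;
            this.opened = opened;
        }
        @Override public void run() {
            // Stand-in for the real region open; here we only mark completion.
            opened.countDown();
        }
    }

    /** Returns true when both daughters opened within the timeout. */
    static boolean openDaughters(long timeoutMs) throws InterruptedException {
        CountDownLatch opened = new CountDownLatch(2);
        new DaughterOpener("daughterA", opened).start();
        new DaughterOpener("daughterB", opened).start();
        // A bounded await (instead of plain join()) makes a stalled opener
        // observable rather than silently delaying the whole split.
        return opened.await(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(openDaughters(5_000) ? "both daughters opened" : "timed out");
    }
}
```

If a daughter thread is never scheduled, the real code's join() blocks indefinitely, which matches the reported symptom of the split state staying in SPLITTING_NEW for almost an hour.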
[jira] [Commented] (HBASE-10492) open daughter regions can unpredictably take long time
[ https://issues.apache.org/jira/browse/HBASE-10492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896978#comment-13896978 ] stack commented on HBASE-10492: --- Can you reproduce, [~jerryhe]? It'd be crazy for a thread not to be scheduled for an hour. Any details on your setup? The load type, the OS? Thanks. open daughter regions can unpredictably take long time -- Key: HBASE-10492 URL: https://issues.apache.org/jira/browse/HBASE-10492 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.0 Reporter: Jerry He During stress testing, I have seen the client getting RetriesExhaustedWithDetailsException: Failed 748 actions: NotServingRegionException. On the master log, 2014-02-08 20:43 is the timestamp from OFFLINE to SPLITTING_NEW, and 2014-02-08 21:41 is the timestamp from SPLITTING_NEW to OPEN.
[jira] [Updated] (HBASE-8332) Add truncate as HMaster method
[ https://issues.apache.org/jira/browse/HBASE-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-8332: --- Attachment: HBASE-8332-v2.patch Add truncate as HMaster method -- Key: HBASE-8332 URL: https://issues.apache.org/jira/browse/HBASE-8332 Project: HBase Issue Type: Improvement Components: master Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Attachments: HBASE-8332-v0.patch, HBASE-8332-v2.patch, HBASE-8332.draft.patch Currently truncate and truncate_preserve are shell-only functions, implemented as deleteTable() + createTable(). With ACLs, the user running truncate must have the right to create a table, and only globally granted users can create tables. Add truncate() and truncatePreserve() to HBaseAdmin/HMaster with their own ACL check. https://reviews.apache.org/r/15835/ -- This message was sent by Atlassian JIRA (v6.1.5#6160)
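The difference between the two shell functions described above can be sketched with plain data structures. This is an illustrative model, not the real HMaster/HBaseAdmin API: both operations are delete + recreate, but truncate_preserve re-creates the table with its existing split keys, while plain truncate starts over with a single region:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of a table: a name, split keys, and some rows. Names and fields
// are hypothetical stand-ins for HBase's table descriptor and regions.
class TruncateSketch {
    static final class TableModel {
        final String name;
        final List<String> splitKeys;
        final Map<String, String> rows = new TreeMap<>();
        TableModel(String name, List<String> splitKeys) {
            this.name = name;
            this.splitKeys = new ArrayList<>(splitKeys);
        }
    }

    /** Delete + recreate: rows always go away; splits survive only if asked. */
    static TableModel truncate(TableModel table, boolean preserveSplits) {
        List<String> splits = preserveSplits ? table.splitKeys : List.of();
        return new TableModel(table.name, splits); // fresh table, no rows
    }
}
```

Moving this into HMaster, as the issue proposes, lets the operation carry its own ACL check instead of requiring table-creation rights on the caller.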
[jira] [Commented] (HBASE-10413) Tablesplit.getLength returns 0
[ https://issues.apache.org/jira/browse/HBASE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897049#comment-13897049 ] Enis Soztutar commented on HBASE-10413: --- bq. Enis Soztutar anything else here? Looks good to me. We can commit after your comments are addressed. Tablesplit.getLength returns 0 -- Key: HBASE-10413 URL: https://issues.apache.org/jira/browse/HBASE-10413 Project: HBase Issue Type: Bug Components: Client, mapreduce Affects Versions: 0.96.1.1 Reporter: Lukas Nalezenec Assignee: Lukas Nalezenec Attachments: HBASE-10413-2.patch, HBASE-10413-3.patch, HBASE-10413-4.patch, HBASE-10413-5.patch, HBASE-10413-6.patch, HBASE-10413.patch InputSplits should be sorted by length, but TableSplit does not contain a real getLength() implementation: {code} @Override public long getLength() { // Not clear how to obtain this... seems to be used only for sorting splits return 0; } {code} This is causing us problems with scheduling - we have jobs that are supposed to finish in a limited time, but they often get stuck in the last mapper working on a large region. Can we implement this method? What is the best way? We were thinking about estimating the size from the size of the files on HDFS: get the Scanner from the TableSplit, use startRow, stopRow and column families to find the corresponding region, then compute the HDFS size for that region and column family. Update: This ticket was about a production issue - I talked with the guy who worked on this and he said our production issue was probably not directly caused by getLength() returning 0. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
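The estimation idea in the description, sizing a split by the bytes its region occupies on disk, can be sketched without any Hadoop dependency. This is a hypothetical illustration using the local filesystem in place of HDFS; the directory layout and method names are assumptions, not HBase's real API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: estimate a TableSplit's getLength() as the total size of the
// region's store files, instead of returning 0. Even a rough byte count is
// enough for the framework to sort splits largest-first.
class SplitLengthEstimator {
    /** Sums the sizes of all regular files under the given region directory. */
    static long estimateLength(Path regionDir) throws IOException {
        try (var files = Files.walk(regionDir)) {
            return files.filter(Files::isRegularFile).mapToLong(p -> {
                try {
                    return Files.size(p);
                } catch (IOException e) {
                    return 0L; // skip unreadable files; an estimate still helps sorting
                }
            }).sum();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("region");
        Files.write(dir.resolve("storefile1"), new byte[100]);
        Files.write(dir.resolve("storefile2"), new byte[150]);
        System.out.println(estimateLength(dir)); // prints 250
    }
}
```

The value only needs to be comparable across splits, not exact, since the MapReduce framework uses getLength() purely to order splits.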
[jira] [Updated] (HBASE-10389) Add namespace help info in table related shell commands
[ https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-10389: -- Fix Version/s: 0.99.0 0.98.1 Add namespace help info in table related shell commands --- Key: HBASE-10389 URL: https://issues.apache.org/jira/browse/HBASE-10389 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 0.96.0, 0.96.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.98.1, 0.99.0 Attachments: HBASE-10389-trunk.patch Currently, in the help info of the table-related shell commands, we don't mention or give the namespace as part of the table name. For example, to create a table: {code} hbase(main):001:0> help 'create' Creates a table. Pass a table name, and a set of column family specifications (at least one), and, optionally, table configuration. Column specification can be a simple string (name), or a dictionary (dictionaries are described below in main help output), necessarily including NAME attribute. Examples: hbase> create 't1', {NAME => 'f1', VERSIONS => 5} hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'} hbase> # The above in shorthand would be the following: hbase> create 't1', 'f1', 'f2', 'f3' hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true} hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}} Table configuration options can be put at the end. 
Examples: hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40'] hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe' hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' } hbase> # Optionally pre-split the table into NUMREGIONS, using hbase> # SPLITALGO (HexStringSplit, UniformSplit or classname) hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'} hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}} You can also keep around a reference to the created table: hbase> t1 = create 't1', 'f1' Which gives you a reference to the table named 't1', on which you can then call methods. {code} We should document the usage of namespaces in these commands. For example: #namespace=foo and table qualifier=bar create 'foo:bar', 'fam' #namespace=default and table qualifier=bar create 'bar', 'fam' -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10389) Add namespace help info in table related shell commands
[ https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-10389: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I committed this to trunk and 0.98. Thanks Jerry for the patch. The patch does not apply cleanly to 0.96; if you supply a patch, I can commit it there as well. Add namespace help info in table related shell commands --- Key: HBASE-10389 URL: https://issues.apache.org/jira/browse/HBASE-10389 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 0.96.0, 0.96.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.98.1, 0.99.0 Attachments: HBASE-10389-trunk.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10350) Master/AM/RegionStates changes to create and assign region replicas
[ https://issues.apache.org/jira/browse/HBASE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897058#comment-13897058 ] Enis Soztutar commented on HBASE-10350: --- Can you also put an RB. Thanks. Master/AM/RegionStates changes to create and assign region replicas --- Key: HBASE-10350 URL: https://issues.apache.org/jira/browse/HBASE-10350 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Devaraj Das Fix For: 0.99.0 Attachments: 10350-1.txt As per design in the parent jira, this jira will capture the changes in the master side (especially AM / RegionStates) for creating tables with region replicas, and making sure the regions are assigned on create and failover. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10389) Add namespace help info in table related shell commands
[ https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-10389: -- Fix Version/s: 0.96.2 Committed to 0.96 too. Thanks [~jerryhe] Add namespace help info in table related shell commands --- Key: HBASE-10389 URL: https://issues.apache.org/jira/browse/HBASE-10389 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 0.96.0, 0.96.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.96.2, 0.98.1, 0.99.0 Attachments: 10389.096.txt, HBASE-10389-trunk.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10389) Add namespace help info in table related shell commands
[ https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-10389: -- Attachment: 10389.096.txt A little fixup to make it go against 0.96. Add namespace help info in table related shell commands --- Key: HBASE-10389 URL: https://issues.apache.org/jira/browse/HBASE-10389 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 0.96.0, 0.96.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.96.2, 0.98.1, 0.99.0 Attachments: 10389.096.txt, HBASE-10389-trunk.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10413) Tablesplit.getLength returns 0
[ https://issues.apache.org/jira/browse/HBASE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897102#comment-13897102 ] Andrew Purtell commented on HBASE-10413: Sure +1 for 0.98 after the remaining review comments are addressed. Tablesplit.getLength returns 0 -- Key: HBASE-10413 URL: https://issues.apache.org/jira/browse/HBASE-10413 Project: HBase Issue Type: Bug Components: Client, mapreduce Affects Versions: 0.96.1.1 Reporter: Lukas Nalezenec Assignee: Lukas Nalezenec Attachments: HBASE-10413-2.patch, HBASE-10413-3.patch, HBASE-10413-4.patch, HBASE-10413-5.patch, HBASE-10413-6.patch, HBASE-10413.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10350) Master/AM/RegionStates changes to create and assign region replicas
[ https://issues.apache.org/jira/browse/HBASE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897105#comment-13897105 ] Devaraj Das commented on HBASE-10350: - https://reviews.apache.org/r/17865/ (had forgotten to paste the link here earlier). Master/AM/RegionStates changes to create and assign region replicas --- Key: HBASE-10350 URL: https://issues.apache.org/jira/browse/HBASE-10350 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Devaraj Das Fix For: 0.99.0 Attachments: 10350-1.txt As per design in the parent jira, this jira will capture the changes in the master side (especially AM / RegionStates) for creating tables with region replicas, and making sure the regions are assigned on create and failover. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-8751) Enable peer cluster to choose/change the ColumnFamilies/Tables it really want to replicate from a source cluster
[ https://issues.apache.org/jira/browse/HBASE-8751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897109#comment-13897109 ] Andrew Purtell commented on HBASE-8751: --- +1 for 0.98 Enable peer cluster to choose/change the ColumnFamilies/Tables it really want to replicate from a source cluster Key: HBASE-8751 URL: https://issues.apache.org/jira/browse/HBASE-8751 Project: HBase Issue Type: New Feature Components: Replication Reporter: Feng Honghua Assignee: Feng Honghua Attachments: HBASE-8751-0.94-V0.patch, HBASE-8751-0.94-v1.patch, HBASE-8751-trunk_v0.patch, HBASE-8751-trunk_v1.patch, HBASE-8751-trunk_v2.patch, HBASE-8751-trunk_v3.patch Consider these scenarios (all cf are with replication-scope=1): 1) cluster S has 3 tables; table A has cfA,cfB, table B has cfX,cfY, table C has cf1,cf2. 2) cluster X wants to replicate table A : cfA, table B : cfX and table C from cluster S. 3) cluster Y wants to replicate table B : cfY, table C : cf2 from cluster S. The current replication implementation can't achieve this, since it pushes the data of all the replicatable column families from cluster S to all its peers, X/Y in this scenario. This improvement provides a fine-grained replication scheme which enables a peer cluster to choose the column families/tables it really wants from the source cluster: A). Set the table:cf-list for a peer when addPeer: hbase-shell add_peer '3', zk:1100:/hbase, table1; table2:cf1,cf2; table3:cf2 B). View the table:cf-list config for a peer using show_peer_tableCFs: hbase-shell show_peer_tableCFs 1 C). Change/set the table:cf-list for a peer using set_peer_tableCFs: hbase-shell set_peer_tableCFs '2', table1:cfX; table2:cf1; table3:cf1,cf2 In this scheme, replication-scope=1 only means a column family CAN be replicated to other clusters, but the 'table:cf-list' alone determines WHICH cf/table will actually be replicated to a specific peer. 
To provide backward compatibility, an empty 'table:cf-list' will replicate all replicatable cf/tables. (This means we don't allow a peer which replicates nothing from a source cluster; we think that's reasonable: if replicating nothing, why bother adding a peer?) This improvement addresses the exact problem raised by the first FAQ in http://hbase.apache.org/replication.html: GLOBAL means replicate? Any provision to replicate only to cluster X and not to cluster Y? or is that for later? Yes, this is for much later. I also noticed somebody mentioned making replication-scope an integer rather than a boolean for such fine-grained replication purposes, but I think extending replication-scope can't achieve the same replication granularity and flexibility as the per-peer replication configuration above. This improvement has been running smoothly in our production clusters (Xiaomi) for several months. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
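The per-peer table:cf filtering described above can be sketched with plain collections. This is a minimal, hypothetical model (parsing and method names are assumptions, not HBase's replication source code) of how a spec like "table1; table2:cf1,cf2; table3:cf2" decides which edits ship to a peer, including the backward-compatible "empty list means everything" rule:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of per-peer table:cf replication filtering.
class TableCfFilterSketch {
    /** Parses "t1; t2:cf1,cf2" into table -> allowed cfs (null = all cfs). */
    static Map<String, Set<String>> parse(String spec) {
        Map<String, Set<String>> map = new HashMap<>();
        for (String part : spec.split(";")) {
            String[] tableAndCfs = part.trim().split(":");
            if (tableAndCfs.length == 1) {
                map.put(tableAndCfs[0], null); // bare table: replicate all its cfs
            } else {
                map.put(tableAndCfs[0],
                        new HashSet<>(Arrays.asList(tableAndCfs[1].split(","))));
            }
        }
        return map;
    }

    /** True if an edit for table/cf should be shipped to this peer. */
    static boolean shouldReplicate(Map<String, Set<String>> peerCfs, String table, String cf) {
        if (peerCfs.isEmpty()) return true; // empty config: back-compat, ship everything
        if (!peerCfs.containsKey(table)) return false;
        Set<String> cfs = peerCfs.get(table);
        return cfs == null || cfs.contains(cf);
    }
}
```

replication-scope=1 still gates what is replicatable at all; this filter only narrows what each individual peer receives.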
[jira] [Updated] (HBASE-10413) Tablesplit.getLength returns 0
[ https://issues.apache.org/jira/browse/HBASE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10413: --- Fix Version/s: 0.99.0 0.98.1 Hadoop Flags: Reviewed Tablesplit.getLength returns 0 -- Key: HBASE-10413 URL: https://issues.apache.org/jira/browse/HBASE-10413 Project: HBase Issue Type: Bug Components: Client, mapreduce Affects Versions: 0.96.1.1 Reporter: Lukas Nalezenec Assignee: Lukas Nalezenec Fix For: 0.98.1, 0.99.0 Attachments: HBASE-10413-2.patch, HBASE-10413-3.patch, HBASE-10413-4.patch, HBASE-10413-5.patch, HBASE-10413-6.patch, HBASE-10413.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10413) Tablesplit.getLength returns 0
[ https://issues.apache.org/jira/browse/HBASE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10413: --- Resolution: Fixed Status: Resolved (was: Patch Available) Integrated to 0.98 and trunk. Thanks for the patch, Lukas. Thanks for the reviews. Tablesplit.getLength returns 0 -- Key: HBASE-10413 URL: https://issues.apache.org/jira/browse/HBASE-10413 Project: HBase Issue Type: Bug Components: Client, mapreduce Affects Versions: 0.96.1.1 Reporter: Lukas Nalezenec Assignee: Lukas Nalezenec Fix For: 0.98.1, 0.99.0 Attachments: 10413-7.patch, HBASE-10413-2.patch, HBASE-10413-3.patch, HBASE-10413-4.patch, HBASE-10413-5.patch, HBASE-10413-6.patch, HBASE-10413.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10413) Tablesplit.getLength returns 0
[ https://issues.apache.org/jira/browse/HBASE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10413: --- Attachment: 10413-7.patch Patch v7 addresses Nick's comments
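The fix direction discussed in this ticket - giving each split an estimated byte length so the framework can sort splits largest-first - can be sketched with self-contained stand-ins. These are hypothetical names for illustration, not the actual HBASE-10413 patch (which estimates sizes from the region's files on HDFS):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for TableSplit: it carries an estimated byte length
// instead of hard-coding getLength() to 0, so sorting splits by size works.
class SizedSplit {
    final String startRow;
    private final long length; // estimated bytes of the underlying region

    SizedSplit(String startRow, long length) {
        this.startRow = startRow;
        this.length = length;
    }

    public long getLength() {
        return length;
    }
}

public class SplitSortDemo {
    // MapReduce schedules the largest splits first; when getLength() is 0
    // for every split, this ordering is arbitrary and a huge region can be
    // left for the last mapper - the scheduling problem the reporter saw.
    static List<SizedSplit> sortBySize(List<SizedSplit> splits) {
        List<SizedSplit> sorted = new ArrayList<>(splits);
        sorted.sort(Comparator.comparingLong(SizedSplit::getLength).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<SizedSplit> splits = Arrays.asList(
                new SizedSplit("a", 512L),
                new SizedSplit("m", 4096L),
                new SizedSplit("t", 1024L));
        // The split backed by the largest region comes first.
        System.out.println(sortBySize(splits).get(0).startRow); // prints "m"
    }
}
```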
[jira] [Created] (HBASE-10493) InclusiveStopFilter#filterKeyValue() should perform filtering on row key
Ted Yu created HBASE-10493: -- Summary: InclusiveStopFilter#filterKeyValue() should perform filtering on row key Key: HBASE-10493 URL: https://issues.apache.org/jira/browse/HBASE-10493 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor InclusiveStopFilter inherits filterKeyValue() from FilterBase, which always returns ReturnCode.INCLUDE. InclusiveStopFilter#filterKeyValue() should be consistent with its filtering on the row key. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10493) InclusiveStopFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10493: --- Attachment: 10493-v1.txt
[jira] [Updated] (HBASE-10493) InclusiveStopFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10493: --- Status: Patch Available (was: Open)
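The inconsistency this ticket describes - filterRowKey() rejecting a row while the inherited filterKeyValue() still says INCLUDE - can be illustrated with minimal stand-ins. This is a sketch under simplified types (String row keys, a two-value ReturnCode), not the real org.apache.hadoop.hbase.filter API:

```java
// Simplified, hypothetical stand-ins for the HBase filter contract.
// FilterBase's default filterKeyValue() includes every cell, so a subclass
// that only implements filterRowKey() never rejects cells on that path.
enum ReturnCode { INCLUDE, NEXT_ROW }

abstract class BaseFilter {
    // FilterBase-style default: include everything.
    ReturnCode filterKeyValue(String rowKey) {
        return ReturnCode.INCLUDE;
    }

    abstract boolean filterRowKey(String rowKey);
}

class InclusiveStopDemoFilter extends BaseFilter {
    private final String stopRow;

    InclusiveStopDemoFilter(String stopRow) {
        this.stopRow = stopRow;
    }

    // Rows strictly after the stop row are filtered out (stop row inclusive).
    @Override
    boolean filterRowKey(String rowKey) {
        return rowKey.compareTo(stopRow) > 0;
    }

    // The proposed fix: make filterKeyValue() consistent with filterRowKey().
    @Override
    ReturnCode filterKeyValue(String rowKey) {
        return filterRowKey(rowKey) ? ReturnCode.NEXT_ROW : ReturnCode.INCLUDE;
    }
}

public class StopFilterDemo {
    public static void main(String[] args) {
        BaseFilter f = new InclusiveStopDemoFilter("row-5");
        System.out.println(f.filterKeyValue("row-7")); // past the stop row
    }
}
```

With the override in place, a composing filter (such as a FilterList) sees the restrictive verdict on the filterKeyValue() path instead of the base-class INCLUDE.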
[jira] [Commented] (HBASE-8332) Add truncate as HMaster method
[ https://issues.apache.org/jira/browse/HBASE-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897161#comment-13897161 ] Hadoop QA commented on HBASE-8332: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12628055/HBASE-8332-v2.patch against trunk revision . ATTACHMENT ID: 12628055 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 11 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. 
The patch introduces the following lines longer than 100: +private TruncateTableRequest(boolean noInit) { this.unknownFields = com.google.protobuf.UnknownFieldSet.getDefaultInstance(); } +private TruncateTableResponse(boolean noInit) { this.unknownFields = com.google.protobuf.UnknownFieldSet.getDefaultInstance(); } +private EnableTableRequest(boolean noInit) { this.unknownFields = com.google.protobuf.UnknownFieldSet.getDefaultInstance(); } +private EnableTableResponse(boolean noInit) { this.unknownFields = com.google.protobuf.UnknownFieldSet.getDefaultInstance(); } +private DisableTableRequest(boolean noInit) { this.unknownFields = com.google.protobuf.UnknownFieldSet.getDefaultInstance(); } +private DisableTableResponse(boolean noInit) { this.unknownFields = com.google.protobuf.UnknownFieldSet.getDefaultInstance(); } +private ModifyTableRequest(boolean noInit) { this.unknownFields = com.google.protobuf.UnknownFieldSet.getDefaultInstance(); } + private void testTruncateTable(final TableName tableName, boolean preserveSplits) throws IOException { + raise(ArgumentError, BloomFilter type #{bloomtype} is not supported. Use one of + org.apache.hadoop.hbase.regionserver.StoreFile::BloomType.constants.join( )) {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.coprocessor.TestMasterObserver Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8652//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8652//console This message is automatically generated. 
Add truncate as HMaster method -- Key: HBASE-8332 URL: https://issues.apache.org/jira/browse/HBASE-8332 Project: HBase Issue Type: Improvement Components: master Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Attachments: HBASE-8332-v0.patch, HBASE-8332-v2.patch, HBASE-8332.draft.patch Currently truncate and truncate_preserve are only shell functions, and implemented as deleteTable() + createTable(). Using ACLs
[jira] [Updated] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10485: -- Fix Version/s: 0.94.17 0.96.2 PrefixFilter#filterKeyValue() should perform filtering on row key - Key: HBASE-10485 URL: https://issues.apache.org/jira/browse/HBASE-10485 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.17 Attachments: 10485-0.94.txt, 10485-trunk.addendum, 10485-v1.txt Niels reported an issue under the thread 'Trouble writing custom filter for use in FilterList' where his custom filter, used in a FilterList along with PrefixFilter, produced unexpected results. His test can be found here: https://github.com/nielsbasjes/HBase-filter-problem This is due to PrefixFilter#filterKeyValue() using FilterBase#filterKeyValue(), which returns ReturnCode.INCLUDE. When FilterList.Operator.MUST_PASS_ONE is specified, FilterList#filterKeyValue() returns ReturnCode.INCLUDE even when the row key prefix doesn't match, while the other filter's filterKeyValue() returns ReturnCode.NEXT_COL. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897187#comment-13897187 ] Lars Hofhansl commented on HBASE-10485: --- Also, should filterRowKey return NEXT_ROW instead of SKIP when the row is filtered? ([~te...@apache.org], [~ram_krish])?
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897267#comment-13897267 ] Lars Hofhansl commented on HBASE-10485: --- And lastly the test does not need to start a minicluster. Seems overkill. See what TestFilterList does.
[jira] [Commented] (HBASE-10492) open daughter regions can unpredictably take long time
[ https://issues.apache.org/jira/browse/HBASE-10492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897276#comment-13897276 ] Andrew Purtell commented on HBASE-10492: JVM? open daughter regions can unpredictably take long time -- Key: HBASE-10492 URL: https://issues.apache.org/jira/browse/HBASE-10492 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.0 Reporter: Jerry He During stress testing, I have seen a client getting RetriesExhaustedWithDetailsException: Failed 748 actions: NotServingRegionException. In the master log, 2014-02-08 20:43 is the timestamp of the OFFLINE to SPLITTING_NEW transition, and 2014-02-08 21:41 is the timestamp of SPLITTING_NEW to OPEN. The corresponding time period in the region server log is: {code} 2014-02-08 20:44:12,662 WARN org.apache.hadoop.hbase.regionserver.HRegionFileSystem: .regioninfo file not found for region: 010c1981882d1a59201af5e2dc589d44 2014-02-08 20:44:12,666 WARN org.apache.hadoop.hbase.regionserver.HRegionFileSystem: .regioninfo file not found for region: c2eb9b7971ca7f3fed3da86df5b788e7 {code} There were no INFO messages related to these two regions until the following (note at the end: Split took 57mins, 16sec): {code} 2014-02-08 21:41:14,029 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined c2eb9b7971ca7f3fed3da86df5b788e7; next sequenceid=213355 2014-02-08 21:41:14,031 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined 010c1981882d1a59201af5e2dc589d44; next sequenceid=213354 2014-02-08 21:41:14,032 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks for region=tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7. 
2014-02-08 21:41:14,054 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7. with server=hdtest208.svl.ibm.com,60020,1391887547473 2014-02-08 21:41:14,054 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Finished post open deploy task for tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7. 2014-02-08 21:41:14,054 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks for region=tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44. 2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HStore: Completed compaction of 10 file(s) in cf of tpch_hb_1000_2.lineitem,^\x01\x8B\xE7(\x80\x01\x80\x93\xFD\x01\x01\x80\x00\x00\x00\xB5\x0E\xCC'\x01\x80\x00\x00\x03,1391918508561.1fbcfc0a792435dfd73ec5b0ef5c953c. into 451be6df8c604993ae540b808d9cfa08(size=72.8 M), total size for store is 2.4 G. This selection was in queue for 0sec, and took 1mins, 40sec to execute. 2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Completed compaction: Request = regionName=tpch_hb_1000_2.lineitem,^\x01\x8B\xE7(\x80\x01\x80\x93\xFD\x01\x01\x80\x00\x00\x00\xB5\x0E\xCC'\x01\x80\x00\x00\x03,1391918508561.1fbcfc0a792435dfd73ec5b0ef5c953c., storeName=cf, fileCount=10, fileSize=94.1 M, priority=9883, time=1391924373278861000; duration=1mins, 40sec 2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on cf in region tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7. 
2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HStore: Starting compaction of 10 file(s) in cf of tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7. into tmpdir=gpfs:/hbase/data/default/tpch_hb_1000_2.lineitem/c2eb9b7971ca7f3fed3da86df5b788e7/.tmp, totalSize=709.7 M 2014-02-08 21:41:14,066 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44. with server=hdtest208.svl.ibm.com,60020,1391887547473 2014-02-08 21:41:14,066 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Finished post open deploy task for tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44. 2014-02-08 21:41:14,190 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Region split, hbase:meta updated, and report to master. Parent=tpch_hb_1000_2.lineitem,,1391918508561.b576e8db65d56ec08db5ca900587c28d., new
[jira] [Commented] (HBASE-10482) ReplicationSyncUp doesn't clean up its ZK, needed for tests
[ https://issues.apache.org/jira/browse/HBASE-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897285#comment-13897285 ] Jean-Daniel Cryans commented on HBASE-10482: [~lhofhansl] I just pushed it to 0.94. bq. Or are you just debugging in 0.96+ for now? Nothing like that, I think Stack just committed it to 0.96 and above since the patch I posted could directly be applied to those branches. ReplicationSyncUp doesn't clean up its ZK, needed for tests --- Key: HBASE-10482 URL: https://issues.apache.org/jira/browse/HBASE-10482 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.96.1, 0.94.16 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.98.1, 0.99.0, 0.94.17 Attachments: HBASE-10249.patch TestReplicationSyncUpTool failed again: https://builds.apache.org/job/HBase-TRUNK/4895/testReport/junit/org.apache.hadoop.hbase.replication/TestReplicationSyncUpTool/testSyncUpTool/ It's not super obvious why only one of the two tables is replicated, the test could use some more logging, but I understand it this way: The first ReplicationSyncUp gets started and for some reason it cannot replicate the data: {noformat} 2014-02-06 21:32:19,811 INFO [Thread-1372] regionserver.ReplicationSourceManager(203): Current list of replicators: [1391722339091.SyncUpTool.replication.org,1234,1, quirinus.apache.org,37045,1391722237951, quirinus.apache.org,33502,1391722238125] other RSs: [] 2014-02-06 21:32:19,811 INFO [Thread-1372.replicationSource,1] regionserver.ReplicationSource(231): Replicating db42e7fc-7f29-4038-9292-d85ea8b9994b - 783c0ab2-4ff9-4dc0-bb38-86bf31d1d817 2014-02-06 21:32:19,892 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 1 2014-02-06 21:32:19,911 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 1 2014-02-06 21:32:20,094 TRACE 
[Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 2 ... 2014-02-06 21:32:23,414 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 8 2014-02-06 21:32:23,673 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(169): Moving quirinus.apache.org,37045,1391722237951's hlogs to my queue 2014-02-06 21:32:23,768 DEBUG [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(396): Creating quirinus.apache.org%2C37045%2C1391722237951.1391722243779 with data 10803 2014-02-06 21:32:23,842 DEBUG [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(396): Creating quirinus.apache.org%2C37045%2C1391722237951.1391722243779 with data 10803 2014-02-06 21:32:24,297 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 9 2014-02-06 21:32:24,314 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 9 {noformat} Finally it gives up: {noformat} 2014-02-06 21:32:30,873 DEBUG [Thread-1372] replication.TestReplicationSyncUpTool(323): SyncUpAfterDelete failed at retry = 0, with rowCount_ht1TargetPeer1 =100 and rowCount_ht2TargetAtPeer1 =200 {noformat} The syncUp tool has an ID you can follow, grep for syncupReplication1391722338885 or just the timestamp, and you can see it doing things after that. The reason is that the tool closes the ReplicationSourceManager but not the ZK connection, so events _still_ come in and NodeFailoverWorker _still_ tries to recover queues but then there's nothing to process them. 
Later in the logs you can see: {noformat} 2014-02-06 21:32:37,381 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(169): Moving quirinus.apache.org,33502,1391722238125's hlogs to my queue 2014-02-06 21:32:37,567 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(239): Won't transfer the queue, another RS took care of it because of: KeeperErrorCode = NoNode for /1/replication/rs/quirinus.apache.org,33502,1391722238125/lock {noformat} There shouldn't be any racing, but someone had already moved quirinus.apache.org,33502,1391722238125 away. FWIW I can't even make the test fail on my machine, so I'm not 100% sure closing the ZK connection fixes the issue, but at least it's the right thing to do. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
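The fix described above - closing the ZK connection along with the ReplicationSourceManager so no watcher events arrive after the workers that would handle them are gone - is essentially a shutdown-ordering rule. A self-contained sketch with hypothetical names (not the actual ReplicationSyncUp code):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the sync-up tool must tear down BOTH the source
// manager and the ZooKeeper connection. Leaving ZK open lets node-failover
// events keep arriving with nothing left to process them.
public class SyncUpShutdownDemo {
    static List<String> closed = new ArrayList<>();

    static void closeSourceManager() {
        closed.add("sourceManager");
    }

    static void closeZkConnection() {
        closed.add("zk"); // the step the original tool skipped
    }

    static void shutdown() {
        try {
            closeSourceManager();
        } finally {
            // Always close ZK, even if the manager shutdown throws.
            closeZkConnection();
        }
    }

    public static void main(String[] args) {
        shutdown();
        System.out.println(closed); // both resources closed, in order
    }
}
```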
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897286#comment-13897286 ] Andrew Purtell commented on HBASE-10485: Should we revert the initial commit and addendum and then apply a new patch with all of the above feedback consolidated together? Two addendums is pushing it IMHO.
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897290#comment-13897290 ] Ted Yu commented on HBASE-10485: bq. should filterRowKey return NEXT_ROW instead of SKIP when the row is filtered? Here is the method signature for filterRowKey() - return type is boolean. {code} abstract public boolean filterRowKey(byte[] buffer, int offset, int length) throws IOException; {code} The test refactoring can be done in another JIRA.
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897292#comment-13897292 ] Lars Hofhansl commented on HBASE-10485: --- If we want to filter the row then the filterKeyValue should return NEXT_ROW (also see RowFilter). I agree with Andy. Let's revert to start from scratch.
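The FilterList interaction being debated here can be shown with a minimal stand-in for MUST_PASS_ONE combination. This is hypothetical code, a simplification of the real FilterList merge logic: under "pass if any member includes", a filter that falls back to the FilterBase INCLUDE default masks another member's NEXT_COL verdict, which is why PrefixFilter must itself return a restrictive code (such as NEXT_ROW) for a non-matching prefix.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of FilterList.Operator.MUST_PASS_ONE semantics: a cell
// is included as soon as ANY member filter includes it. Codes are ordered
// roughly from least to most restrictive.
enum Code { INCLUDE, NEXT_COL, NEXT_ROW }

public class MustPassOneDemo {
    // INCLUDE wins under MUST_PASS_ONE; otherwise keep the least restrictive
    // of the members' verdicts (a simplification of the real merge logic).
    static Code mustPassOne(List<Code> verdicts) {
        Code result = Code.NEXT_ROW;
        for (Code c : verdicts) {
            if (c == Code.INCLUDE) {
                return Code.INCLUDE;
            }
            if (c.ordinal() < result.ordinal()) {
                result = c;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Buggy PrefixFilter: inherits INCLUDE even for a non-matching
        // prefix, so the custom filter's NEXT_COL verdict is ignored.
        System.out.println(mustPassOne(Arrays.asList(Code.INCLUDE, Code.NEXT_COL)));
        // Fixed PrefixFilter: returns NEXT_ROW for a non-matching prefix,
        // so the combined verdict is restrictive and the cell is filtered.
        System.out.println(mustPassOne(Arrays.asList(Code.NEXT_ROW, Code.NEXT_COL)));
    }
}
```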
[jira] [Updated] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10485: -- Attachment: 10485-0.94-v2.txt Here's what I have for 0.94. See the vastly simplified test. If we want we can keep the end-to-end test as well, although I do not think that is necessary.
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897299#comment-13897299 ] Lars Hofhansl commented on HBASE-10485: --- I do apologize for my sloppy initial review.
[jira] [Commented] (HBASE-10493) InclusiveStopFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897298#comment-13897298 ] Lars Hofhansl commented on HBASE-10493: --- Can we hint NEXT_ROW? Also I'd go with a simpler test as suggested in HBASE-10485.
[jira] [Commented] (HBASE-10482) ReplicationSyncUp doesn't clean up its ZK, needed for tests
[ https://issues.apache.org/jira/browse/HBASE-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897300#comment-13897300 ] Lars Hofhansl commented on HBASE-10482: --- Cool. Thanks [~jdcryans]. ReplicationSyncUp doesn't clean up its ZK, needed for tests --- Key: HBASE-10482 URL: https://issues.apache.org/jira/browse/HBASE-10482 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.96.1, 0.94.16 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.98.1, 0.99.0, 0.94.17 Attachments: HBASE-10249.patch TestReplicationSyncUpTool failed again: https://builds.apache.org/job/HBase-TRUNK/4895/testReport/junit/org.apache.hadoop.hbase.replication/TestReplicationSyncUpTool/testSyncUpTool/ It's not super obvious why only one of the two tables is replicated, the test could use some more logging, but I understand it this way: The first ReplicationSyncUp gets started and for some reason it cannot replicate the data: {noformat} 2014-02-06 21:32:19,811 INFO [Thread-1372] regionserver.ReplicationSourceManager(203): Current list of replicators: [1391722339091.SyncUpTool.replication.org,1234,1, quirinus.apache.org,37045,1391722237951, quirinus.apache.org,33502,1391722238125] other RSs: [] 2014-02-06 21:32:19,811 INFO [Thread-1372.replicationSource,1] regionserver.ReplicationSource(231): Replicating db42e7fc-7f29-4038-9292-d85ea8b9994b - 783c0ab2-4ff9-4dc0-bb38-86bf31d1d817 2014-02-06 21:32:19,892 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 1 2014-02-06 21:32:19,911 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 1 2014-02-06 21:32:20,094 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 2 ... 
2014-02-06 21:32:23,414 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 8 2014-02-06 21:32:23,673 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(169): Moving quirinus.apache.org,37045,1391722237951's hlogs to my queue 2014-02-06 21:32:23,768 DEBUG [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(396): Creating quirinus.apache.org%2C37045%2C1391722237951.1391722243779 with data 10803 2014-02-06 21:32:23,842 DEBUG [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(396): Creating quirinus.apache.org%2C37045%2C1391722237951.1391722243779 with data 10803 2014-02-06 21:32:24,297 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 9 2014-02-06 21:32:24,314 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 9 {noformat} Finally it gives up: {noformat} 2014-02-06 21:32:30,873 DEBUG [Thread-1372] replication.TestReplicationSyncUpTool(323): SyncUpAfterDelete failed at retry = 0, with rowCount_ht1TargetPeer1 =100 and rowCount_ht2TargetAtPeer1 =200 {noformat} The syncUp tool has an ID you can follow, grep for syncupReplication1391722338885 or just the timestamp, and you can see it doing things after that. The reason is that the tool closes the ReplicationSourceManager but not the ZK connection, so events _still_ come in and NodeFailoverWorker _still_ tries to recover queues but then there's nothing to process them. 
Later in the logs you can see: {noformat} 2014-02-06 21:32:37,381 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(169): Moving quirinus.apache.org,33502,1391722238125's hlogs to my queue 2014-02-06 21:32:37,567 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(239): Won't transfer the queue, another RS took care of it because of: KeeperErrorCode = NoNode for /1/replication/rs/quirinus.apache.org,33502,1391722238125/lock {noformat} There shouldn't be any racing, but now someone already moved quirinus.apache.org,33502,1391722238125 away. FWIW I can't even make the test fail on my machine so I'm not 100% sure closing the ZK connection fixes the issue, but at least it's the right thing to do. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
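The cleanup ordering the comment describes can be sketched as follows. This is a stand-in illustration, not the committed HBASE-10482 code: the point is that the tool must tear down the ZK connection alongside the source manager, otherwise ZK events keep arriving with nobody left to process the recovered queues.

```java
// Hedged sketch; FakeZkConnection and FakeSourceManager are mock stand-ins
// for the real ZooKeeperWatcher and ReplicationSourceManager.
public class SyncUpCleanupSketch {
    static class FakeZkConnection {
        boolean closed = false;
        void close() { closed = true; }
    }

    static class FakeSourceManager {
        boolean stopped = false;
        void stop() { stopped = true; }
    }

    // Run the sync-up work, guaranteeing that BOTH resources are released
    // even if the replication step throws.
    static void runSyncUp(FakeSourceManager manager, FakeZkConnection zk, Runnable work) {
        try {
            work.run();
        } finally {
            manager.stop();
            zk.close(); // the step the original tool was missing
        }
    }
}
```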
[jira] [Commented] (HBASE-10493) InclusiveStopFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897308#comment-13897308 ] Hadoop QA commented on HBASE-10493: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12628101/10493-v1.txt against trunk revision . ATTACHMENT ID: 12628101 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8654//console This message is automatically generated. InclusiveStopFilter#filterKeyValue() should perform filtering on row key Key: HBASE-10493 URL: https://issues.apache.org/jira/browse/HBASE-10493 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: 10493-v1.txt InclusiveStopFilter inherits filterKeyValue() from FilterBase which always returns ReturnCode.INCLUDE InclusiveStopFilter#filterKeyValue() should be consistent with filtering on row key. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10485: --- Attachment: 10485-trunk-v2.txt Previous patches reverted. Patch v2 addresses Lars' comments. PrefixFilter#filterKeyValue() should perform filtering on row key - Key: HBASE-10485 URL: https://issues.apache.org/jira/browse/HBASE-10485 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.17 Attachments: 10485-0.94-v2.txt, 10485-0.94.txt, 10485-trunk-v2.txt, 10485-trunk.addendum, 10485-v1.txt Niels reported an issue under the thread 'Trouble writing custom filter for use in FilterList' where his custom filter used in FilterList along with PrefixFilter produced unexpected results. His test can be found here: https://github.com/nielsbasjes/HBase-filter-problem This is due to PrefixFilter#filterKeyValue() using FilterBase#filterKeyValue() which returns ReturnCode.INCLUDE When FilterList.Operator.MUST_PASS_ONE is specified, FilterList#filterKeyValue() would return ReturnCode.INCLUDE even when the row key prefix doesn't match, while the other filter's filterKeyValue() returns ReturnCode.NEXT_COL -- This message was sent by Atlassian JIRA (v6.1.5#6160)
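The MUST_PASS_ONE interaction described above can be shown with a self-contained illustration (not HBase source; the combine logic below is a simplified stand-in for FilterList's voting): the list keeps a cell if ANY member returns INCLUDE, so a PrefixFilter that never overrides filterKeyValue(), and therefore always answers INCLUDE, masks a sibling filter's NEXT_COL.

```java
import java.util.List;

// Simplified stand-in for FilterList's MUST_PASS_ONE voting.
public class MustPassOneSketch {
    enum ReturnCode { INCLUDE, NEXT_COL }

    // A single INCLUDE vote wins under MUST_PASS_ONE (logical OR).
    static ReturnCode combine(List<ReturnCode> memberAnswers) {
        for (ReturnCode rc : memberAnswers) {
            if (rc == ReturnCode.INCLUDE) return ReturnCode.INCLUDE;
        }
        return ReturnCode.NEXT_COL;
    }
}
```

This is why the fix makes PrefixFilter itself answer honestly per cell: a filter that always says INCLUDE cannot be safely OR-combined with anything.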
[jira] [Commented] (HBASE-10493) InclusiveStopFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897316#comment-13897316 ] Ted Yu commented on HBASE-10493: Will post new patch once HBASE-10485 is resolved. InclusiveStopFilter#filterKeyValue() should perform filtering on row key Key: HBASE-10493 URL: https://issues.apache.org/jira/browse/HBASE-10493 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: 10493-v1.txt InclusiveStopFilter inherits filterKeyValue() from FilterBase which always returns ReturnCode.INCLUDE InclusiveStopFilter#filterKeyValue() should be consistent with filtering on row key. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10389) Add namespace help info in table related shell commands
[ https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897319#comment-13897319 ] Hudson commented on HBASE-10389: SUCCESS: Integrated in HBase-0.98 #145 (See [https://builds.apache.org/job/HBase-0.98/145/]) HBASE-10389 Add namespace help info in table related shell commands (Jerry He) (enis: rev 1566756) * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/alter.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/alter_async.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/alter_status.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/clone_snapshot.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/close_region.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/compact.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/count.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/create.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/delete.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/deleteall.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/describe.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/disable.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/disable_all.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/drop.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/drop_all.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/enable.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/enable_all.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/exists.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/get.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/get_counter.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/get_table.rb * 
/hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/grant.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/incr.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/is_disabled.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/is_enabled.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/list.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/major_compact.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/put.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/revoke.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/scan.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/snapshot.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/split.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/user_permission.rb Add namespace help info in table related shell commands --- Key: HBASE-10389 URL: https://issues.apache.org/jira/browse/HBASE-10389 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 0.96.0, 0.96.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.96.2, 0.98.1, 0.99.0 Attachments: 10389.096.txt, HBASE-10389-trunk.patch Currently in the help info of table-related shell commands, we don't mention or give namespace as part of the table name. For example, to create a table: {code} hbase(main):001:0> help 'create' Creates a table. Pass a table name, and a set of column family specifications (at least one), and, optionally, table configuration. Column specification can be a simple string (name), or a dictionary (dictionaries are described below in main help output), necessarily including NAME attribute.
Examples: hbase> create 't1', {NAME => 'f1', VERSIONS => 5} hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'} hbase> # The above in shorthand would be the following: hbase> create 't1', 'f1', 'f2', 'f3' hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true} hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}} Table configuration options can be put at the end. Examples: hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40'] hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe' hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' } hbase> # Optionally pre-split the table into NUMREGIONS, using hbase> # SPLITALGO (HexStringSplit, UniformSplit or classname) hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'} hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', CONFIGURATION =>
[jira] [Commented] (HBASE-10486) ProtobufUtil Append Increment deserialization lost cell level timestamp
[ https://issues.apache.org/jira/browse/HBASE-10486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897320#comment-13897320 ] Hudson commented on HBASE-10486: SUCCESS: Integrated in HBase-0.98 #145 (See [https://builds.apache.org/job/HBase-0.98/145/]) HBASE-10486: ProtobufUtil Append Increment deserialization lost cell level timestamp (jeffreyz: rev 1566507) * /hbase/branches/0.98/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/protobuf/TestProtobufUtil.java ProtobufUtil Append Increment deserialization lost cell level timestamp - Key: HBASE-10486 URL: https://issues.apache.org/jira/browse/HBASE-10486 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 0.96.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.1, 0.99.0 Attachments: hbase-10486-v2.patch, hbase-10486.patch When we deserialize an Append or Increment, we use the wrong timestamp value during deserialization in trunk/0.98 code and discard the value in the 0.96 code base. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
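The rule the fix enforces can be sketched in isolation. The names below are assumptions, not the real ProtobufUtil code: a deserialized Append/Increment cell should keep the timestamp it carried on the wire, falling back to the mutation-level timestamp only when the cell had none.

```java
// Hedged sketch; effectiveTimestamp is an illustrative stand-in, not an
// HBase method. HBase marks "no explicit timestamp" with Long.MAX_VALUE
// (HConstants.LATEST_TIMESTAMP).
public class CellTimestampSketch {
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    // Prefer the cell-level timestamp when one was serialized; otherwise
    // fall back to the enclosing mutation's timestamp.
    static long effectiveTimestamp(long cellTimestamp, long mutationTimestamp) {
        return cellTimestamp != LATEST_TIMESTAMP ? cellTimestamp : mutationTimestamp;
    }
}
```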
[jira] [Commented] (HBASE-10480) TestLogRollPeriod#testWithEdits may fail due to insufficient waiting
[ https://issues.apache.org/jira/browse/HBASE-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897322#comment-13897322 ] Hudson commented on HBASE-10480: SUCCESS: Integrated in HBase-0.98 #145 (See [https://builds.apache.org/job/HBase-0.98/145/]) HBASE-10480 TestLogRollPeriod#testWithEdits may fail due to insufficient waiting (mbertozzi: rev 1565771) * /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRollPeriod.java TestLogRollPeriod#testWithEdits may fail due to insufficient waiting Key: HBASE-10480 URL: https://issues.apache.org/jira/browse/HBASE-10480 Project: HBase Issue Type: Test Components: test Reporter: Ted Yu Assignee: Matteo Bertozzi Priority: Minor Fix For: 0.98.0, 0.96.2, 0.99.0, 0.94.17 Attachments: HBASE-10480-v0.patch The test waits for minRolls rolls by sleeping: {code} Thread.sleep((minRolls + 1) * LOG_ROLL_PERIOD); {code} However, the above wait period may not be sufficient. See https://builds.apache.org/job/HBase-TRUNK/4895/testReport/junit/org.apache.hadoop.hbase.regionserver.wal/TestLogRollPeriod/testWithEdits/ : {code} 2014-02-06 23:02:25,710 DEBUG [RS:0;quirinus:56476.logRoller] regionserver.LogRoller(87): Hlog roll period 4000ms elapsed ... 
2014-02-06 23:02:30,275 DEBUG [RS:0;quirinus:56476.logRoller] regionserver.LogRoller(87): Hlog roll period 4000ms elapsed {code} The interval between two successive periodic rolls was ~1.5s longer than LOG_ROLL_PERIOD (4s): 1.5s * 4 (minRolls-1) > 4s (LOG_ROLL_PERIOD). This led to the test failure: {code} java.lang.AssertionError at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertFalse(Assert.java:64) at org.junit.Assert.assertFalse(Assert.java:74) at org.apache.hadoop.hbase.regionserver.wal.TestLogRollPeriod.checkMinLogRolls(TestLogRollPeriod.java:168) at org.apache.hadoop.hbase.regionserver.wal.TestLogRollPeriod.testWithEdits(TestLogRollPeriod.java:130) {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
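The usual cure for this kind of flakiness is to poll for the expected roll count with a deadline instead of a single fixed Thread.sleep((minRolls + 1) * LOG_ROLL_PERIOD). The helper below is a generic sketch under that assumption, not the committed HBASE-10480 patch.

```java
import java.util.function.BooleanSupplier;

// Hedged sketch of a deadline-based wait; WaitForRolls is an illustrative
// stand-in, not an HBase test utility.
public class WaitForRolls {
    // Poll `condition` every pollMs until it holds or timeoutMs elapses.
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            if (condition.getAsBoolean()) return true;
            if (System.currentTimeMillis() >= deadline) return false;
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false; // treat interruption as a timeout
            }
        }
    }
}
```

A test would then call something like waitFor(() -> rollCount.get() >= minRolls, timeout, poll) and only fail if the rolls genuinely never happen, regardless of how long each individual roll period stretches.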
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897321#comment-13897321 ] Hudson commented on HBASE-10485: SUCCESS: Integrated in HBase-0.98 #145 (See [https://builds.apache.org/job/HBase-0.98/145/]) HBASE-10485 Addendum (tedyu: rev 1566651) * /hbase/branches/0.98/hbase-client/src/main/java/org/apache/hadoop/hbase/filter/PrefixFilter.java HBASE-10485 PrefixFilter#filterKeyValue() should perform filtering on row key (tedyu: rev 1566386) * /hbase/branches/0.98/hbase-client/src/main/java/org/apache/hadoop/hbase/filter/PrefixFilter.java * /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFilterListAdditional.java PrefixFilter#filterKeyValue() should perform filtering on row key - Key: HBASE-10485 URL: https://issues.apache.org/jira/browse/HBASE-10485 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.17 Attachments: 10485-0.94-v2.txt, 10485-0.94.txt, 10485-trunk-v2.txt, 10485-trunk.addendum, 10485-v1.txt Niels reported an issue under the thread 'Trouble writing custom filter for use in FilterList' where his custom filter used in FilterList along with PrefixFilter produced unexpected results. His test can be found here: https://github.com/nielsbasjes/HBase-filter-problem This is due to PrefixFilter#filterKeyValue() using FilterBase#filterKeyValue() which returns ReturnCode.INCLUDE When FilterList.Operator.MUST_PASS_ONE is specified, FilterList#filterKeyValue() would return ReturnCode.INCLUDE even when the row key prefix doesn't match, while the other filter's filterKeyValue() returns ReturnCode.NEXT_COL -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10482) ReplicationSyncUp doesn't clean up its ZK, needed for tests
[ https://issues.apache.org/jira/browse/HBASE-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897323#comment-13897323 ] Hudson commented on HBASE-10482: SUCCESS: Integrated in HBase-0.98 #145 (See [https://builds.apache.org/job/HBase-0.98/145/]) HBASE-10482 ReplicationSyncUp doesn't clean up its ZK, needed for tests (stack: rev 1565836) * /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSyncUp.java * /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationSyncUpTool.java ReplicationSyncUp doesn't clean up its ZK, needed for tests --- Key: HBASE-10482 URL: https://issues.apache.org/jira/browse/HBASE-10482 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.96.1, 0.94.16 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.98.1, 0.99.0, 0.94.17 Attachments: HBASE-10249.patch TestReplicationSyncUpTool failed again: https://builds.apache.org/job/HBase-TRUNK/4895/testReport/junit/org.apache.hadoop.hbase.replication/TestReplicationSyncUpTool/testSyncUpTool/ It's not super obvious why only one of the two tables is replicated, the test could use some more logging, but I understand it this way: The first ReplicationSyncUp gets started and for some reason it cannot replicate the data: {noformat} 2014-02-06 21:32:19,811 INFO [Thread-1372] regionserver.ReplicationSourceManager(203): Current list of replicators: [1391722339091.SyncUpTool.replication.org,1234,1, quirinus.apache.org,37045,1391722237951, quirinus.apache.org,33502,1391722238125] other RSs: [] 2014-02-06 21:32:19,811 INFO [Thread-1372.replicationSource,1] regionserver.ReplicationSource(231): Replicating db42e7fc-7f29-4038-9292-d85ea8b9994b - 783c0ab2-4ff9-4dc0-bb38-86bf31d1d817 2014-02-06 21:32:19,892 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 1 2014-02-06 
21:32:19,911 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 1 2014-02-06 21:32:20,094 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 2 ... 2014-02-06 21:32:23,414 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 8 2014-02-06 21:32:23,673 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(169): Moving quirinus.apache.org,37045,1391722237951's hlogs to my queue 2014-02-06 21:32:23,768 DEBUG [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(396): Creating quirinus.apache.org%2C37045%2C1391722237951.1391722243779 with data 10803 2014-02-06 21:32:23,842 DEBUG [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(396): Creating quirinus.apache.org%2C37045%2C1391722237951.1391722243779 with data 10803 2014-02-06 21:32:24,297 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 9 2014-02-06 21:32:24,314 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 9 {noformat} Finally it gives up: {noformat} 2014-02-06 21:32:30,873 DEBUG [Thread-1372] replication.TestReplicationSyncUpTool(323): SyncUpAfterDelete failed at retry = 0, with rowCount_ht1TargetPeer1 =100 and rowCount_ht2TargetAtPeer1 =200 {noformat} The syncUp tool has an ID you can follow, grep for syncupReplication1391722338885 or just the timestamp, and you can see it doing things after that. The reason is that the tool closes the ReplicationSourceManager but not the ZK connection, so events _still_ come in and NodeFailoverWorker _still_ tries to recover queues but then there's nothing to process them. 
Later in the logs you can see: {noformat} 2014-02-06 21:32:37,381 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(169): Moving quirinus.apache.org,33502,1391722238125's hlogs to my queue 2014-02-06 21:32:37,567 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(239): Won't transfer the queue, another RS took care of it because of: KeeperErrorCode = NoNode for /1/replication/rs/quirinus.apache.org,33502,1391722238125/lock {noformat} There shouldn't be any racing, but now someone already moved quirinus.apache.org,33502,1391722238125 away. FWIW I can't even make the test fail on my machine so I'm not 100% sure closing the ZK connection fixes the issue, but at least it's the right thing to do. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10389) Add namespace help info in table related shell commands
[ https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897333#comment-13897333 ] Hudson commented on HBASE-10389: SUCCESS: Integrated in HBase-TRUNK #4905 (See [https://builds.apache.org/job/HBase-TRUNK/4905/]) HBASE-10389 Add namespace help info in table related shell commands (Jerry He) (enis: rev 1566755) * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/alter.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/alter_async.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/alter_status.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/clone_snapshot.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/close_region.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/compact.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/count.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/create.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/delete.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/deleteall.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/describe.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/disable.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/disable_all.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/drop.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/drop_all.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/enable.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/enable_all.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/exists.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/get.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/get_counter.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/get_table.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/grant.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/incr.rb * 
/hbase/trunk/hbase-shell/src/main/ruby/shell/commands/is_disabled.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/is_enabled.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/list.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/major_compact.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/put.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/revoke.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/scan.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/snapshot.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/split.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/user_permission.rb Add namespace help info in table related shell commands --- Key: HBASE-10389 URL: https://issues.apache.org/jira/browse/HBASE-10389 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 0.96.0, 0.96.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.96.2, 0.98.1, 0.99.0 Attachments: 10389.096.txt, HBASE-10389-trunk.patch Currently in the help info of table-related shell commands, we don't mention or give namespace as part of the table name. For example, to create a table: {code} hbase(main):001:0> help 'create' Creates a table. Pass a table name, and a set of column family specifications (at least one), and, optionally, table configuration. Column specification can be a simple string (name), or a dictionary (dictionaries are described below in main help output), necessarily including NAME attribute. Examples: hbase> create 't1', {NAME => 'f1', VERSIONS => 5} hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'} hbase> # The above in shorthand would be the following: hbase> create 't1', 'f1', 'f2', 'f3' hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true} hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}} Table configuration options can be put at the end.
Examples: hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40'] hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe' hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' } hbase> # Optionally pre-split the table into NUMREGIONS, using hbase> # SPLITALGO (HexStringSplit, UniformSplit or classname) hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'} hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}} You can also keep around a reference to the created table: hbase> t1 = create 't1', 'f1' Which gives you a reference to the table named 't1', on which you can then call methods. {code} We should document the usage of namespace in these
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897345#comment-13897345 ] Lars Hofhansl commented on HBASE-10485: --- Thanks Ted. +1 on v2 PrefixFilter#filterKeyValue() should perform filtering on row key - Key: HBASE-10485 URL: https://issues.apache.org/jira/browse/HBASE-10485 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.17 Attachments: 10485-0.94-v2.txt, 10485-0.94.txt, 10485-trunk-v2.txt, 10485-trunk.addendum, 10485-v1.txt Niels reported an issue under the thread 'Trouble writing custom filter for use in FilterList' where his custom filter used in FilterList along with PrefixFilter produced unexpected results. His test can be found here: https://github.com/nielsbasjes/HBase-filter-problem This is due to PrefixFilter#filterKeyValue() using FilterBase#filterKeyValue() which returns ReturnCode.INCLUDE When FilterList.Operator.MUST_PASS_ONE is specified, FilterList#filterKeyValue() would return ReturnCode.INCLUDE even when the row key prefix doesn't match, while the other filter's filterKeyValue() returns ReturnCode.NEXT_COL -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897353#comment-13897353 ] Ted Yu commented on HBASE-10485: Integrated patch v2 to 0.98 and trunk. PrefixFilter#filterKeyValue() should perform filtering on row key - Key: HBASE-10485 URL: https://issues.apache.org/jira/browse/HBASE-10485 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.17 Attachments: 10485-0.94-v2.txt, 10485-0.94.txt, 10485-trunk-v2.txt, 10485-trunk.addendum, 10485-v1.txt Niels reported an issue under the thread 'Trouble writing custom filter for use in FilterList' where his custom filter used in FilterList along with PrefixFilter produced unexpected results. His test can be found here: https://github.com/nielsbasjes/HBase-filter-problem This is due to PrefixFilter#filterKeyValue() using FilterBase#filterKeyValue() which returns ReturnCode.INCLUDE When FilterList.Operator.MUST_PASS_ONE is specified, FilterList#filterKeyValue() would return ReturnCode.INCLUDE even when the row key prefix doesn't match, while the other filter's filterKeyValue() returns ReturnCode.NEXT_COL -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-9501) Provide throttling for replication
[ https://issues.apache.org/jira/browse/HBASE-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-9501: -- Attachment: HBASE-9501-trunk_v4.patch Attaching the patch I committed, just a few cosmetic changes. Provide throttling for replication -- Key: HBASE-9501 URL: https://issues.apache.org/jira/browse/HBASE-9501 Project: HBase Issue Type: Improvement Components: Replication Reporter: Feng Honghua Assignee: Feng Honghua Attachments: HBASE-9501-trunk_v0.patch, HBASE-9501-trunk_v1.patch, HBASE-9501-trunk_v2.patch, HBASE-9501-trunk_v3.patch, HBASE-9501-trunk_v4.patch When we disable a peer for a period of time, and then enable it, the ReplicationSource in the master cluster will push the hlog entries accumulated during the disabled interval to the re-enabled peer cluster at full speed. If the bandwidth of the two clusters is shared by different applications, the full-speed push for replication can use all the bandwidth and severely influence other applications. There are two configs, replication.source.size.capacity and replication.source.nb.capacity, to tweak the batch size each push delivers, but decreasing them only increases the number of pushes, and all these pushes proceed continuously without pause, so they are no obvious help for bandwidth throttling. From a bandwidth-sharing and push-speed perspective, it's more reasonable to provide a bandwidth upper limit for each peer push channel; within that limit, a peer can choose a big batch size for each push for bandwidth efficiency. Any opinion? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10493) InclusiveStopFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10493: --- Fix Version/s: 0.99.0 0.98.1 InclusiveStopFilter#filterKeyValue() should perform filtering on row key Key: HBASE-10493 URL: https://issues.apache.org/jira/browse/HBASE-10493 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 0.98.1, 0.99.0 Attachments: 10493-v1.txt, 10493-v2.txt InclusiveStopFilter inherits filterKeyValue() from FilterBase which always returns ReturnCode.INCLUDE InclusiveStopFilter#filterKeyValue() should be consistent with filtering on row key. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10493) InclusiveStopFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10493: --- Attachment: 10493-v2.txt InclusiveStopFilter#filterKeyValue() should perform filtering on row key Key: HBASE-10493 URL: https://issues.apache.org/jira/browse/HBASE-10493 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 0.98.1, 0.99.0 Attachments: 10493-v1.txt, 10493-v2.txt InclusiveStopFilter inherits filterKeyValue() from FilterBase which always returns ReturnCode.INCLUDE InclusiveStopFilter#filterKeyValue() should be consistent with filtering on row key. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10493) InclusiveStopFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10493: --- Attachment: (was: 10493-v2.txt) InclusiveStopFilter#filterKeyValue() should perform filtering on row key Key: HBASE-10493 URL: https://issues.apache.org/jira/browse/HBASE-10493 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 0.98.1, 0.99.0 Attachments: 10493-v1.txt InclusiveStopFilter inherits filterKeyValue() from FilterBase which always returns ReturnCode.INCLUDE InclusiveStopFilter#filterKeyValue() should be consistent with filtering on row key. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10493) InclusiveStopFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10493: --- Attachment: 10493-v2.txt InclusiveStopFilter#filterKeyValue() should perform filtering on row key Key: HBASE-10493 URL: https://issues.apache.org/jira/browse/HBASE-10493 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 0.98.1, 0.99.0 Attachments: 10493-v1.txt, 10493-v2.txt InclusiveStopFilter inherits filterKeyValue() from FilterBase, which always returns ReturnCode.INCLUDE. InclusiveStopFilter#filterKeyValue() should be consistent with filtering on row key. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-9501) Provide throttling for replication
[ https://issues.apache.org/jira/browse/HBASE-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-9501: -- Resolution: Fixed Fix Version/s: 0.99.0 0.98.1 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to 0.98.1 (or .0 if the current RC is sunk), and trunk. Thanks Honghua! Provide throttling for replication -- Key: HBASE-9501 URL: https://issues.apache.org/jira/browse/HBASE-9501 Project: HBase Issue Type: Improvement Components: Replication Reporter: Feng Honghua Assignee: Feng Honghua Fix For: 0.98.1, 0.99.0 Attachments: HBASE-9501-trunk_v0.patch, HBASE-9501-trunk_v1.patch, HBASE-9501-trunk_v2.patch, HBASE-9501-trunk_v3.patch, HBASE-9501-trunk_v4.patch When we disable a peer for a period of time and then re-enable it, the ReplicationSource in the master cluster pushes the hlog entries accumulated during the disabled interval to the re-enabled peer cluster at full speed. If the bandwidth between the two clusters is shared by different applications, a full-speed replication push can consume all of it and severely affect the other applications. There are two configs, replication.source.size.capacity and replication.source.nb.capacity, that tweak the batch size of each push, but decreasing them only increases the number of pushes, which still proceed continuously without pause, so they are no real help for bandwidth throttling. From a bandwidth-sharing and push-speed perspective, it's more reasonable to set a bandwidth upper limit for each peer's push channel; within that limit, the peer can choose a large batch size per push for bandwidth efficiency. Any opinion? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
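The per-peer bandwidth cap proposed above can be sketched as a simple cycle-based throttle: account the bytes shipped within each one-second cycle, and once the cap is exceeded, sleep out the rest of the cycle. This is a minimal illustration (class and method names are invented, and the clock is injected for testability), not the patch committed for HBASE-9501:

```java
// Minimal per-peer bandwidth cap sketch: within each 1-second accounting
// cycle, once pushed bytes exceed the cap, the push thread waits out the
// remainder of the cycle before shipping more.
public class BandwidthThrottler {
    private final long bytesPerSecondCap; // upper limit for this peer's push channel
    private long cycleStartMs;            // start of the current 1s accounting cycle
    private long bytesInCycle;            // bytes shipped so far in this cycle

    public BandwidthThrottler(long bytesPerSecondCap, long startMs) {
        this.bytesPerSecondCap = bytesPerSecondCap;
        this.cycleStartMs = startMs;
    }

    // How long the push thread should sleep before shipping a batch of
    // `batchBytes`, given the current wall clock `nowMs`.
    public long sleepMillisFor(long batchBytes, long nowMs) {
        if (nowMs - cycleStartMs >= 1000) { // a new cycle has begun: reset counters
            cycleStartMs = nowMs;
            bytesInCycle = 0;
        }
        bytesInCycle += batchBytes;
        if (bytesInCycle <= bytesPerSecondCap) {
            return 0; // still under the cap: push immediately
        }
        return 1000 - (nowMs - cycleStartMs); // wait out the rest of the cycle
    }
}
```

Within the cap, the pusher can still use a large batch size per push, which is the bandwidth-efficiency point made in the description.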
[jira] [Commented] (HBASE-10389) Add namespace help info in table related shell commands
[ https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897394#comment-13897394 ] Hudson commented on HBASE-10389: SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #134 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/134/]) HBASE-10389 Add namespace help info in table related shell commands (Jerry He) (enis: rev 1566756) * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/alter.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/alter_async.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/alter_status.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/clone_snapshot.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/close_region.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/compact.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/count.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/create.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/delete.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/deleteall.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/describe.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/disable.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/disable_all.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/drop.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/drop_all.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/enable.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/enable_all.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/exists.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/get.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/get_counter.rb * 
/hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/get_table.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/grant.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/incr.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/is_disabled.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/is_enabled.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/list.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/major_compact.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/put.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/revoke.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/scan.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/snapshot.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/split.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/user_permission.rb Add namespace help info in table related shell commands --- Key: HBASE-10389 URL: https://issues.apache.org/jira/browse/HBASE-10389 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 0.96.0, 0.96.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.96.2, 0.98.1, 0.99.0 Attachments: 10389.096.txt, HBASE-10389-trunk.patch Currently in the help info of table related shell commands, we don't mention or give namespace as part of the table name. For example, to create a table: {code} hbase(main):001:0> help 'create' Creates a table. Pass a table name, and a set of column family specifications (at least one), and, optionally, table configuration. Column specification can be a simple string (name), or a dictionary (dictionaries are described below in main help output), necessarily including NAME attribute. 
Examples: hbase> create 't1', {NAME => 'f1', VERSIONS => 5} hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'} hbase> # The above in shorthand would be the following: hbase> create 't1', 'f1', 'f2', 'f3' hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true} hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}} Table configuration options can be put at the end. Examples: hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40'] hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe' hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' } hbase> # Optionally pre-split the table into NUMREGIONS, using hbase> # SPLITALGO (HexStringSplit, UniformSplit or classname) hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'} hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', CONFIGURATION =>
[jira] [Commented] (HBASE-10389) Add namespace help info in table related shell commands
[ https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897396#comment-13897396 ] Hudson commented on HBASE-10389: FAILURE: Integrated in hbase-0.96-hadoop2 #198 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/198/]) HBASE-10389 Add namespace help info in table related shell commands (stack: rev 1566763) * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/alter.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/alter_async.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/alter_status.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/clone_snapshot.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/close_region.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/compact.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/count.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/create.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/delete.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/deleteall.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/describe.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/disable.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/disable_all.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/drop.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/drop_all.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/enable.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/enable_all.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/exists.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/get.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/get_counter.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/get_table.rb * 
/hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/grant.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/incr.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/is_disabled.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/is_enabled.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/list.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/major_compact.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/put.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/revoke.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/scan.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/snapshot.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/split.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/user_permission.rb Add namespace help info in table related shell commands --- Key: HBASE-10389 URL: https://issues.apache.org/jira/browse/HBASE-10389 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 0.96.0, 0.96.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.96.2, 0.98.1, 0.99.0 Attachments: 10389.096.txt, HBASE-10389-trunk.patch Currently in the help info of table related shell commands, we don't mention or give namespace as part of the table name. For example, to create a table: {code} hbase(main):001:0> help 'create' Creates a table. Pass a table name, and a set of column family specifications (at least one), and, optionally, table configuration. Column specification can be a simple string (name), or a dictionary (dictionaries are described below in main help output), necessarily including NAME attribute. 
Examples: hbase> create 't1', {NAME => 'f1', VERSIONS => 5} hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'} hbase> # The above in shorthand would be the following: hbase> create 't1', 'f1', 'f2', 'f3' hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true} hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}} Table configuration options can be put at the end. Examples: hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40'] hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe' hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' } hbase> # Optionally pre-split the table into NUMREGIONS, using hbase> # SPLITALGO (HexStringSplit, UniformSplit or classname) hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'} hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', CONFIGURATION =>
[jira] [Commented] (HBASE-10482) ReplicationSyncUp doesn't clean up its ZK, needed for tests
[ https://issues.apache.org/jira/browse/HBASE-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897401#comment-13897401 ] Hudson commented on HBASE-10482: FAILURE: Integrated in HBase-0.94-security #406 (See [https://builds.apache.org/job/HBase-0.94-security/406/]) HBASE-10482 ReplicationSyncUp doesn't clean up its ZK, needed for tests (jdcryans: rev 1566855) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSyncUp.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationSyncUpTool.java ReplicationSyncUp doesn't clean up its ZK, needed for tests --- Key: HBASE-10482 URL: https://issues.apache.org/jira/browse/HBASE-10482 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.96.1, 0.94.16 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.98.1, 0.99.0, 0.94.17 Attachments: HBASE-10249.patch TestReplicationSyncUpTool failed again: https://builds.apache.org/job/HBase-TRUNK/4895/testReport/junit/org.apache.hadoop.hbase.replication/TestReplicationSyncUpTool/testSyncUpTool/ It's not super obvious why only one of the two tables is replicated, the test could use some more logging, but I understand it this way: The first ReplicationSyncUp gets started and for some reason it cannot replicate the data: {noformat} 2014-02-06 21:32:19,811 INFO [Thread-1372] regionserver.ReplicationSourceManager(203): Current list of replicators: [1391722339091.SyncUpTool.replication.org,1234,1, quirinus.apache.org,37045,1391722237951, quirinus.apache.org,33502,1391722238125] other RSs: [] 2014-02-06 21:32:19,811 INFO [Thread-1372.replicationSource,1] regionserver.ReplicationSource(231): Replicating db42e7fc-7f29-4038-9292-d85ea8b9994b - 783c0ab2-4ff9-4dc0-bb38-86bf31d1d817 2014-02-06 21:32:19,892 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 1 2014-02-06 
21:32:19,911 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 1 2014-02-06 21:32:20,094 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 2 ... 2014-02-06 21:32:23,414 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 8 2014-02-06 21:32:23,673 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(169): Moving quirinus.apache.org,37045,1391722237951's hlogs to my queue 2014-02-06 21:32:23,768 DEBUG [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(396): Creating quirinus.apache.org%2C37045%2C1391722237951.1391722243779 with data 10803 2014-02-06 21:32:23,842 DEBUG [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(396): Creating quirinus.apache.org%2C37045%2C1391722237951.1391722243779 with data 10803 2014-02-06 21:32:24,297 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 9 2014-02-06 21:32:24,314 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 9 {noformat} Finally it gives up: {noformat} 2014-02-06 21:32:30,873 DEBUG [Thread-1372] replication.TestReplicationSyncUpTool(323): SyncUpAfterDelete failed at retry = 0, with rowCount_ht1TargetPeer1 =100 and rowCount_ht2TargetAtPeer1 =200 {noformat} The syncUp tool has an ID you can follow, grep for syncupReplication1391722338885 or just the timestamp, and you can see it doing things after that. The reason is that the tool closes the ReplicationSourceManager but not the ZK connection, so events _still_ come in and NodeFailoverWorker _still_ tries to recover queues but then there's nothing to process them. 
Later in the logs you can see: {noformat} 2014-02-06 21:32:37,381 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(169): Moving quirinus.apache.org,33502,1391722238125's hlogs to my queue 2014-02-06 21:32:37,567 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(239): Won't transfer the queue, another RS took care of it because of: KeeperErrorCode = NoNode for /1/replication/rs/quirinus.apache.org,33502,1391722238125/lock {noformat} There shouldn't be any racing, but now someone already moved quirinus.apache.org,33502,1391722238125 away. FWIW I can't even make the test fail on my machine so I'm not 100% sure closing the ZK connection fixes the issue, but at least it's the right thing to do. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
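The cleanup the reporter argues for — close everything the tool opened, the ZK connection included, so no watcher events arrive after the source manager is gone — comes down to closing resources in reverse acquisition order without skipping any. A generic sketch (the names here are illustrative, not HBase API):

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Deque;

// Generic shutdown helper: close resources in reverse order of acquisition,
// continuing past failures so a throwing close() can't leave later resources
// (e.g. a ZK connection) open and still delivering events.
public class ShutdownOrder {
    static void closeAll(Deque<Closeable> acquired) {
        while (!acquired.isEmpty()) {
            try {
                acquired.pop().close(); // last acquired, first closed
            } catch (IOException e) {
                // log and keep closing the rest
            }
        }
    }
}
```

In the ReplicationSyncUp case, the manager would be closed first and the ZK connection last, instead of being left open.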
[jira] [Updated] (HBASE-8751) Enable peer cluster to choose/change the ColumnFamilies/Tables it really want to replicate from a source cluster
[ https://issues.apache.org/jira/browse/HBASE-8751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-8751: -- Resolution: Fixed Fix Version/s: 0.99.0 0.98.1 Release Note: From the shell's doc: # set table / table-cf to be replicable for a peer, for a table without # an explicit column-family list, all replicable column-families (with # replication_scope == 1) will be replicated hbase> set_peer_tableCFs '2', table1; table2:cf1,cf2; table3:cfA,cfB Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to 0.98.1 and trunk, thanks Honghua! Enable peer cluster to choose/change the ColumnFamilies/Tables it really want to replicate from a source cluster Key: HBASE-8751 URL: https://issues.apache.org/jira/browse/HBASE-8751 Project: HBase Issue Type: New Feature Components: Replication Reporter: Feng Honghua Assignee: Feng Honghua Fix For: 0.98.1, 0.99.0 Attachments: HBASE-8751-0.94-V0.patch, HBASE-8751-0.94-v1.patch, HBASE-8751-trunk_v0.patch, HBASE-8751-trunk_v1.patch, HBASE-8751-trunk_v2.patch, HBASE-8751-trunk_v3.patch Consider these scenarios (all cf are with replication-scope=1): 1) cluster S has 3 tables: table A has cfA,cfB, table B has cfX,cfY, table C has cf1,cf2. 2) cluster X wants to replicate table A : cfA, table B : cfX and table C from cluster S. 3) cluster Y wants to replicate table B : cfY, table C : cf2 from cluster S. The current replication implementation can't achieve this since it pushes the data of all the replicatable column-families from cluster S to all its peers, X/Y in this scenario. This improvement provides a fine-grained replication scheme which enables a peer cluster to choose the column-families/tables it really wants from the source cluster: A). Set the table:cf-list for a peer when addPeer: hbase-shell> add_peer '3', zk:1100:/hbase, table1; table2:cf1,cf2; table3:cf2 B). View the table:cf-list config for a peer using show_peer_tableCFs: hbase-shell> show_peer_tableCFs 1 C). 
Change/set the table:cf-list for a peer using set_peer_tableCFs: hbase-shell> set_peer_tableCFs '2', table1:cfX; table2:cf1; table3:cf1,cf2 In this scheme, replication-scope=1 only means a column-family CAN be replicated to other clusters, but only the 'table:cf-list list' determines WHICH cf/table will actually be replicated to a specific peer. To provide backward compatibility, an empty 'table:cf-list list' will replicate all replicatable cf/table. (This means we don't allow a peer which replicates nothing from a source cluster; we think that's reasonable: if replicating nothing, why bother adding a peer?) This improvement addresses the exact problem raised by the first FAQ in http://hbase.apache.org/replication.html: GLOBAL means replicate? Any provision to replicate only to cluster X and not to cluster Y? or is that for later? Yes, this is for much later. I also noticed somebody mentioned making replication-scope an integer rather than a boolean for such fine-grained replication purposes, but I think extending replication-scope can't achieve the same replication granularity flexibility as the per-peer replication configurations above. This improvement has been running smoothly in our production clusters (Xiaomi) for several months. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
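The "table1; table2:cf1,cf2; table3:cfA,cfB" per-peer spec shown in the shell examples above has a simple shape: semicolon-separated tables, each optionally followed by a colon and a comma-separated cf list, where an empty cf list means "all replicable column families of that table". An illustrative parser for that format (not the actual HBase parsing code):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative parser for the per-peer "table[:cf,cf,...]; ..." spec format.
// An empty cf list for a table means: replicate all its replicable families.
public class TableCfsParser {
    static Map<String, List<String>> parse(String spec) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String entry : spec.split(";")) {
            entry = entry.trim();
            if (entry.isEmpty()) continue;
            String[] parts = entry.split(":", 2); // "table" or "table:cf1,cf2"
            List<String> cfs = new ArrayList<>();
            if (parts.length == 2 && !parts[1].trim().isEmpty()) {
                for (String cf : parts[1].split(",")) {
                    cfs.add(cf.trim());
                }
            }
            result.put(parts[0].trim(), cfs);
        }
        return result;
    }
}
```

For "table1; table2:cf1,cf2; table3:cfA,cfB", this yields table1 with no explicit families (replicate all) and explicit family lists for table2 and table3.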
[jira] [Commented] (HBASE-10482) ReplicationSyncUp doesn't clean up its ZK, needed for tests
[ https://issues.apache.org/jira/browse/HBASE-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897406#comment-13897406 ] Hudson commented on HBASE-10482: FAILURE: Integrated in HBase-0.94-on-Hadoop-2 #16 (See [https://builds.apache.org/job/HBase-0.94-on-Hadoop-2/16/]) HBASE-10482 ReplicationSyncUp doesn't clean up its ZK, needed for tests (jdcryans: rev 1566855) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSyncUp.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationSyncUpTool.java ReplicationSyncUp doesn't clean up its ZK, needed for tests --- Key: HBASE-10482 URL: https://issues.apache.org/jira/browse/HBASE-10482 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.96.1, 0.94.16 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.98.1, 0.99.0, 0.94.17 Attachments: HBASE-10249.patch TestReplicationSyncUpTool failed again: https://builds.apache.org/job/HBase-TRUNK/4895/testReport/junit/org.apache.hadoop.hbase.replication/TestReplicationSyncUpTool/testSyncUpTool/ It's not super obvious why only one of the two tables is replicated, the test could use some more logging, but I understand it this way: The first ReplicationSyncUp gets started and for some reason it cannot replicate the data: {noformat} 2014-02-06 21:32:19,811 INFO [Thread-1372] regionserver.ReplicationSourceManager(203): Current list of replicators: [1391722339091.SyncUpTool.replication.org,1234,1, quirinus.apache.org,37045,1391722237951, quirinus.apache.org,33502,1391722238125] other RSs: [] 2014-02-06 21:32:19,811 INFO [Thread-1372.replicationSource,1] regionserver.ReplicationSource(231): Replicating db42e7fc-7f29-4038-9292-d85ea8b9994b - 783c0ab2-4ff9-4dc0-bb38-86bf31d1d817 2014-02-06 21:32:19,892 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 1 2014-02-06 
21:32:19,911 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 1 2014-02-06 21:32:20,094 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 2 ... 2014-02-06 21:32:23,414 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 8 2014-02-06 21:32:23,673 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(169): Moving quirinus.apache.org,37045,1391722237951's hlogs to my queue 2014-02-06 21:32:23,768 DEBUG [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(396): Creating quirinus.apache.org%2C37045%2C1391722237951.1391722243779 with data 10803 2014-02-06 21:32:23,842 DEBUG [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(396): Creating quirinus.apache.org%2C37045%2C1391722237951.1391722243779 with data 10803 2014-02-06 21:32:24,297 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 9 2014-02-06 21:32:24,314 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 9 {noformat} Finally it gives up: {noformat} 2014-02-06 21:32:30,873 DEBUG [Thread-1372] replication.TestReplicationSyncUpTool(323): SyncUpAfterDelete failed at retry = 0, with rowCount_ht1TargetPeer1 =100 and rowCount_ht2TargetAtPeer1 =200 {noformat} The syncUp tool has an ID you can follow, grep for syncupReplication1391722338885 or just the timestamp, and you can see it doing things after that. The reason is that the tool closes the ReplicationSourceManager but not the ZK connection, so events _still_ come in and NodeFailoverWorker _still_ tries to recover queues but then there's nothing to process them. 
Later in the logs you can see: {noformat} 2014-02-06 21:32:37,381 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(169): Moving quirinus.apache.org,33502,1391722238125's hlogs to my queue 2014-02-06 21:32:37,567 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(239): Won't transfer the queue, another RS took care of it because of: KeeperErrorCode = NoNode for /1/replication/rs/quirinus.apache.org,33502,1391722238125/lock {noformat} There shouldn't be any racing, but now someone already moved quirinus.apache.org,33502,1391722238125 away. FWIW I can't even make the test fail on my machine so I'm not 100% sure closing the ZK connection fixes the issue, but at least it's the right thing to do. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10389) Add namespace help info in table related shell commands
[ https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897408#comment-13897408 ] Hudson commented on HBASE-10389: SUCCESS: Integrated in hbase-0.96 #287 (See [https://builds.apache.org/job/hbase-0.96/287/]) HBASE-10389 Add namespace help info in table related shell commands (stack: rev 1566763) * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/alter.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/alter_async.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/alter_status.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/clone_snapshot.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/close_region.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/compact.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/count.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/create.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/delete.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/deleteall.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/describe.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/disable.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/disable_all.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/drop.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/drop_all.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/enable.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/enable_all.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/exists.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/get.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/get_counter.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/get_table.rb * 
/hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/grant.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/incr.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/is_disabled.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/is_enabled.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/list.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/major_compact.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/put.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/revoke.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/scan.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/snapshot.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/split.rb * /hbase/branches/0.96/hbase-shell/src/main/ruby/shell/commands/user_permission.rb Add namespace help info in table related shell commands --- Key: HBASE-10389 URL: https://issues.apache.org/jira/browse/HBASE-10389 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 0.96.0, 0.96.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.96.2, 0.98.1, 0.99.0 Attachments: 10389.096.txt, HBASE-10389-trunk.patch Currently in the help info of table related shell commands, we don't mention or give namespace as part of the table name. For example, to create a table: {code} hbase(main):001:0> help 'create' Creates a table. Pass a table name, and a set of column family specifications (at least one), and, optionally, table configuration. Column specification can be a simple string (name), or a dictionary (dictionaries are described below in main help output), necessarily including NAME attribute. 
Examples: hbase> create 't1', {NAME => 'f1', VERSIONS => 5} hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'} hbase> # The above in shorthand would be the following: hbase> create 't1', 'f1', 'f2', 'f3' hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true} hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}} Table configuration options can be put at the end. Examples: hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40'] hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe' hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' } hbase> # Optionally pre-split the table into NUMREGIONS, using hbase> # SPLITALGO (HexStringSplit, UniformSplit or classname) hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'} hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' =>
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897412#comment-13897412 ] Hudson commented on HBASE-10485: FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #86 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/86/]) HBASE-10485 Addendum (tedyu: rev 1566653) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/filter/PrefixFilter.java PrefixFilter#filterKeyValue() should perform filtering on row key - Key: HBASE-10485 URL: https://issues.apache.org/jira/browse/HBASE-10485 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.17 Attachments: 10485-0.94-v2.txt, 10485-0.94.txt, 10485-trunk-v2.txt, 10485-trunk.addendum, 10485-v1.txt Niels reported an issue under the thread 'Trouble writing custom filter for use in FilterList' where his custom filter used in FilterList along with PrefixFilter produced unexpected results. His test can be found here: https://github.com/nielsbasjes/HBase-filter-problem This is due to PrefixFilter#filterKeyValue() using FilterBase#filterKeyValue(), which returns ReturnCode.INCLUDE. When FilterList.Operator.MUST_PASS_ONE is specified, FilterList#filterKeyValue() would return ReturnCode.INCLUDE even when the row key prefix doesn't match, while the other filter's filterKeyValue() returns ReturnCode.NEXT_COL. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
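The failure mode described above — one filter's unconditional INCLUDE drowning out the other filter's NEXT_COL under MUST_PASS_ONE — can be shown with a toy model of the OR-combination. This is a simplified illustration (the enum and combine logic are invented here), not the real FilterList code:

```java
import java.util.List;

// Toy model of FilterList MUST_PASS_ONE (logical OR) semantics: a cell is kept
// if ANY filter answers INCLUDE. A filter that inherits a filterKeyValue()
// always returning INCLUDE therefore masks every other filter's NEXT_COL,
// which is exactly the bug with the pre-fix PrefixFilter.
public class MustPassOneSketch {
    enum ReturnCode { INCLUDE, NEXT_COL }

    static ReturnCode combine(List<ReturnCode> verdicts) {
        for (ReturnCode rc : verdicts) {
            if (rc == ReturnCode.INCLUDE) {
                return ReturnCode.INCLUDE; // one INCLUDE is enough under OR
            }
        }
        return ReturnCode.NEXT_COL; // no filter wanted the cell
    }
}
```

Once PrefixFilter answers something other than INCLUDE for non-matching rows, the OR-combination can finally honor the other filter's NEXT_COL.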
[jira] [Commented] (HBASE-10389) Add namespace help info in table related shell commands
[ https://issues.apache.org/jira/browse/HBASE-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897410#comment-13897410 ] Hudson commented on HBASE-10389: FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #86 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/86/]) HBASE-10389 Add namespace help info in table related shell commands (Jerry He) (enis: rev 1566755) * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/alter.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/alter_async.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/alter_status.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/clone_snapshot.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/close_region.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/compact.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/count.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/create.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/delete.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/deleteall.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/describe.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/disable.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/disable_all.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/drop.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/drop_all.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/enable.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/enable_all.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/exists.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/get.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/get_counter.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/get_table.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/grant.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/incr.rb * 
/hbase/trunk/hbase-shell/src/main/ruby/shell/commands/is_disabled.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/is_enabled.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/list.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/major_compact.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/put.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/revoke.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/scan.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/snapshot.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/split.rb * /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/user_permission.rb Add namespace help info in table related shell commands --- Key: HBASE-10389 URL: https://issues.apache.org/jira/browse/HBASE-10389 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 0.96.0, 0.96.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.96.2, 0.98.1, 0.99.0 Attachments: 10389.096.txt, HBASE-10389-trunk.patch Currently the help info of table-related shell commands does not mention or show a namespace as part of the table name. For example, to create a table: {code} hbase(main):001:0> help 'create' Creates a table. Pass a table name, and a set of column family specifications (at least one), and, optionally, table configuration. Column specification can be a simple string (name), or a dictionary (dictionaries are described below in main help output), necessarily including NAME attribute. Examples: hbase> create 't1', {NAME => 'f1', VERSIONS => 5} hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'} hbase> # The above in shorthand would be the following: hbase> create 't1', 'f1', 'f2', 'f3' hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true} hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}} Table configuration options can be put at the end. 
Examples: hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40'] hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe' hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' } hbase> # Optionally pre-split the table into NUMREGIONS, using hbase> # SPLITALGO (HexStringSplit, UniformSplit or classname) hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'} hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}} You can also keep around a reference to the created table: hbase> t1 = create 't1', 'f1' Which gives you a reference to the table named 't1', on which you can then call methods. {code} We should document the usage of
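The improvement requested by this issue is to show namespace-qualified table names in the help text. HBase names tables as 'namespace:table'; a sketch of the kind of example the help could add (the 'my_ns' namespace name here is illustrative):

{code}
hbase> create_namespace 'my_ns'
hbase> create 'my_ns:t1', {NAME => 'f1', VERSIONS => 5}
hbase> # An unqualified table name goes to the 'default' namespace:
hbase> create 't1', 'f1'
{code}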
[jira] [Commented] (HBASE-10486) ProtobufUtil Append Increment deserialization lost cell level timestamp
[ https://issues.apache.org/jira/browse/HBASE-10486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897411#comment-13897411 ] Hudson commented on HBASE-10486: FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #86 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/86/]) HBASE-10486: ProtobufUtil Append Increment deserialization lost cell level timestamp (jeffreyz: rev 1566505) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/protobuf/TestProtobufUtil.java ProtobufUtil Append Increment deserialization lost cell level timestamp - Key: HBASE-10486 URL: https://issues.apache.org/jira/browse/HBASE-10486 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 0.96.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.1, 0.99.0 Attachments: hbase-10486-v2.patch, hbase-10486.patch When we deserialize Append/Increment, we use the wrong timestamp value during deserialization in the trunk/0.98 code and discard the value in the 0.96 code base. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
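The class of bug described here can be modeled without the real ProtobufUtil code: a deserializer that stamps every cell with the operation-level timestamp loses any cell-level timestamp the client set. A minimal sketch, with all names hypothetical (this is not the actual HBase API):

```java
// Hypothetical model of HBASE-10486's bug class: deserialization that
// overwrites cell-level timestamps with the mutation-level one.
import java.util.ArrayList;
import java.util.List;

public class CellTimestampSketch {
    // Marker meaning "no explicit timestamp was set on this cell".
    public static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    public static final class Cell {
        public final long timestamp;
        public Cell(long timestamp) { this.timestamp = timestamp; }
    }

    // Buggy shape: every cell silently gets the mutation's timestamp.
    public static List<Cell> deserializeLossy(List<Cell> wire, long mutationTs) {
        List<Cell> out = new ArrayList<>();
        for (Cell c : wire) out.add(new Cell(mutationTs));
        return out;
    }

    // Fixed shape: keep a cell-level timestamp when one was set,
    // fall back to the mutation timestamp only for unset cells.
    public static List<Cell> deserializePreserving(List<Cell> wire, long mutationTs) {
        List<Cell> out = new ArrayList<>();
        for (Cell c : wire) {
            out.add(new Cell(c.timestamp != LATEST_TIMESTAMP ? c.timestamp : mutationTs));
        }
        return out;
    }
}
```

The key design point is the explicit "unset" sentinel: only cells carrying the sentinel should inherit the operation-level timestamp.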
[jira] [Commented] (HBASE-10479) HConnection interface is public but is used internally, and contains a bunch of methods
[ https://issues.apache.org/jira/browse/HBASE-10479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897413#comment-13897413 ] Hudson commented on HBASE-10479: FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #86 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/86/]) HBASE-10479 HConnection interface is public but is used internally, and contains a bunch of methods (sershe: rev 1566501) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClusterConnection.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionManager.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionUtils.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnection.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionKey.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ZooKeeperKeepAliveConnection.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ZooKeeperRegistry.java * /hbase/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncProcess.java * /hbase/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientNoCluster.java * /hbase/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestSnapshotFromAdmin.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/CoprocessorHConnection.java * 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTableWrapper.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/HConnectionTestingUtility.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestClientTimeouts.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestMultiParallel.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestHTableWrapper.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java HConnection interface is public but is used internally, and contains a bunch of methods --- Key: HBASE-10479 URL: https://issues.apache.org/jira/browse/HBASE-10479 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.99.0, hbase-10070 Attachments: 45751591.jpg, HBASE-10479.01.patch, HBASE-10479.02.patch, HBASE-10479.03.patch, HBASE-10479.04.patch, HBASE-10479.final.patch, HBASE-10479.patch HConnection has too many methods for a public interface, and some of these should not be public. It is used extensively for internal purposes, so we keep adding methods to it that may not make sense for public interface. The idea is to create a separate internal interface inheriting HConnection, copy some methods to it and deprecate them on HConnection. New methods for internal use would be added to new interface; the deprecated methods would eventually be removed from public interface. 
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10485) PrefixFilter#filterKeyValue() should perform filtering on row key
[ https://issues.apache.org/jira/browse/HBASE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897418#comment-13897418 ] Hadoop QA commented on HBASE-10485: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12628114/10485-trunk-v2.txt against trunk revision . ATTACHMENT ID: 12628114 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8655//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8655//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8655//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8655//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8655//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8655//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8655//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8655//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8655//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8655//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8655//console This message is automatically generated. 
PrefixFilter#filterKeyValue() should perform filtering on row key - Key: HBASE-10485 URL: https://issues.apache.org/jira/browse/HBASE-10485 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.17 Attachments: 10485-0.94-v2.txt, 10485-0.94.txt, 10485-trunk-v2.txt, 10485-trunk.addendum, 10485-v1.txt Niels reported an issue under the thread 'Trouble writing custom filter for use in FilterList' where his custom filter, used in a FilterList along with PrefixFilter, produced unexpected results. His test can be found here: https://github.com/nielsbasjes/HBase-filter-problem This is due to PrefixFilter#filterKeyValue() using FilterBase#filterKeyValue(), which returns ReturnCode.INCLUDE. When FilterList.Operator.MUST_PASS_ONE is specified, FilterList#filterKeyValue() returns ReturnCode.INCLUDE even when the row key prefix doesn't match, while the other filter's filterKeyValue() returns ReturnCode.NEXT_COL. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
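The interaction can be shown with a toy model (hypothetical names, not the real HBase Filter API): under MUST_PASS_ONE, the filters are OR-ed, so a filter that unconditionally returns INCLUDE makes every row pass, masking the other filters entirely.

```java
// Toy model of FilterList's MUST_PASS_ONE (logical OR) combination,
// illustrating why PrefixFilter inheriting FilterBase's unconditional
// INCLUDE breaks the OR. Names are illustrative, not the HBase API.
import java.util.List;

public class FilterListSketch {
    public enum ReturnCode { INCLUDE, NEXT_COL }

    public interface Filter { ReturnCode filterKeyValue(String rowKey); }

    // Before the fix: inherited FilterBase behavior, always INCLUDE.
    public static final Filter BROKEN_PREFIX = rowKey -> ReturnCode.INCLUDE;

    // After the fix: actually examine the row key.
    public static Filter fixedPrefix(String prefix) {
        return rowKey -> rowKey.startsWith(prefix)
            ? ReturnCode.INCLUDE : ReturnCode.NEXT_COL;
    }

    // MUST_PASS_ONE: include the cell if ANY filter includes it.
    public static ReturnCode mustPassOne(List<Filter> filters, String rowKey) {
        for (Filter f : filters) {
            if (f.filterKeyValue(rowKey) == ReturnCode.INCLUDE) {
                return ReturnCode.INCLUDE;
            }
        }
        return ReturnCode.NEXT_COL;
    }
}
```

With the broken prefix filter in the list, the OR returns INCLUDE even for rows no filter actually matches; with the fixed one, the other filter's NEXT_COL can take effect.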
[jira] [Updated] (HBASE-10351) LoadBalancer changes for supporting region replicas
[ https://issues.apache.org/jira/browse/HBASE-10351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-10351: -- Attachment: hbase-10351_v0.patch Review board is down again. Attaching v0 patch here instead. This patch builds on top of HBASE-10350, and includes changes from [~devaraj] and myself for the load balancer to enforce region replica placement. In short, HBASE-10350 enables table creation with region replicas, but won't have any region placement enforcement. This patch adds co-location constraints to the LB, so that it does a best-effort job of not placing replicas of the same region on the same host / rack. An overview of the changes: - BaseLoadBalancer.Cluster is aware of hosts and racks for servers - BaseLoadBalancer.Cluster is aware of regions per host / rack - BaseLoadBalancer.Cluster keeps track of region replicas for co-location enforcement - BaseLoadBalancer.Cluster can be constructed with unassigned regions as well. Some refactoring (Action, etc) for better abstractions etc. - BaseLoadBalancer.retainAssignments(), etc now construct a Cluster object and use the methods there to ensure the co-location constraint. This is a first step in unifying the way we do balance() and table creation, retainAssignment() etc. We can continue with this in other jiras (not part of HBASE-10070). - StochasticLoadBalancer has (high) costs for host/rack replica co-locations and a candidate generator for ensuring optimum plan generation. By way of these cost functions, the LB can ensure that replicas are not co-located as long as there are enough servers / racks left. 
If not, it will still do assignment (soft constraint) - Bunch of tests to ensure region placement (and speed of balance() run) LoadBalancer changes for supporting region replicas --- Key: HBASE-10351 URL: https://issues.apache.org/jira/browse/HBASE-10351 Project: HBase Issue Type: Sub-task Components: master Affects Versions: 0.99.0 Reporter: Enis Soztutar Assignee: Enis Soztutar Attachments: hbase-10351_v0.patch LoadBalancer has to be aware of and enforce placement of region replicas so that the replicas are not co-hosted in the same server, host or rack. This will ensure that the region is highly available during process / host / rack failover. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
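The cost-function approach described above can be sketched in miniature: a term that is 0.0 when no replicas of the same region share a host and grows toward 1.0 as co-location increases, which a stochastic optimizer can then weight heavily. This is a hypothetical illustration, not the StochasticLoadBalancer API.

```java
// Hypothetical sketch of a replica co-location cost term: 0.0 means no
// replicas of the same region share a host (soft constraint satisfied).
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReplicaCostSketch {
    // hostRegions: host name -> region ids on that host; replicas of the
    // same region share a region id.
    public static double replicaHostCost(Map<String, List<Integer>> hostRegions) {
        long colocated = 0, total = 0;
        for (List<Integer> regions : hostRegions.values()) {
            Map<Integer, Integer> counts = new HashMap<>();
            for (int r : regions) counts.merge(r, 1, Integer::sum);
            for (int c : counts.values()) {
                colocated += c - 1;  // each extra replica on a host is a violation
                total += c;
            }
        }
        return total == 0 ? 0.0 : (double) colocated / total;
    }
}
```

The same shape applies at the rack level by keying on rack instead of host; because the cost is continuous rather than a hard rejection, assignment still succeeds (soft constraint) when there are too few hosts or racks.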
[jira] [Commented] (HBASE-10356) Failover RPC's for multi-get
[ https://issues.apache.org/jira/browse/HBASE-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897426#comment-13897426 ] Sergey Shelukhin commented on HBASE-10356: -- bq. Can we fix the naming? An Action instance with a method named getAction is confusing (not you I know...) Will do separate patch on trunk probably. bq. Do I have the model wrong then? This Action is for accounting on how this 'Action' did in the Multi request, right? My thinking is that server and region accounting is done outside of the Action usually. I'd think which replica would be done there w/o needing to mark this index on the Multi call. I don't think Action has any particular model, it's just a context between request and response time. bq. The new Runnables need explanatory comments and better names . Should the two types be related? Subclass? Do we need two Callable types? Two types are not related. Will rename and add comments. Failover RPC's for multi-get - Key: HBASE-10356 URL: https://issues.apache.org/jira/browse/HBASE-10356 Project: HBase Issue Type: Sub-task Components: Client Reporter: Enis Soztutar Assignee: Sergey Shelukhin Fix For: 0.99.0 Attachments: HBASE-10356.reference.patch, HBASE-10356.reference.patch This is extension of HBASE-10355 to add failover support for multi-gets. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10356) Failover RPC's for multi-get
[ https://issues.apache.org/jira/browse/HBASE-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-10356: - Attachment: HBASE-10356.patch Rebased onto hbase-10070 with recent changes, changed the patch accordingly to use the new location interfaces to get replicas, added extensive unit tests, added comments and renames for the runnables. I ran TestAsyncProcess for now, will run other tests too. Failover RPC's for multi-get - Key: HBASE-10356 URL: https://issues.apache.org/jira/browse/HBASE-10356 Project: HBase Issue Type: Sub-task Components: Client Reporter: Enis Soztutar Assignee: Sergey Shelukhin Fix For: 0.99.0 Attachments: HBASE-10356.patch, HBASE-10356.reference.patch, HBASE-10356.reference.patch This is extension of HBASE-10355 to add failover support for multi-gets. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10482) ReplicationSyncUp doesn't clean up its ZK, needed for tests
[ https://issues.apache.org/jira/browse/HBASE-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897436#comment-13897436 ] Hudson commented on HBASE-10482: FAILURE: Integrated in HBase-0.94 #1281 (See [https://builds.apache.org/job/HBase-0.94/1281/]) HBASE-10482 ReplicationSyncUp doesn't clean up its ZK, needed for tests (jdcryans: rev 1566855) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSyncUp.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationSyncUpTool.java ReplicationSyncUp doesn't clean up its ZK, needed for tests --- Key: HBASE-10482 URL: https://issues.apache.org/jira/browse/HBASE-10482 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.96.1, 0.94.16 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.98.1, 0.99.0, 0.94.17 Attachments: HBASE-10249.patch TestReplicationSyncUpTool failed again: https://builds.apache.org/job/HBase-TRUNK/4895/testReport/junit/org.apache.hadoop.hbase.replication/TestReplicationSyncUpTool/testSyncUpTool/ It's not super obvious why only one of the two tables is replicated, the test could use some more logging, but I understand it this way: The first ReplicationSyncUp gets started and for some reason it cannot replicate the data: {noformat} 2014-02-06 21:32:19,811 INFO [Thread-1372] regionserver.ReplicationSourceManager(203): Current list of replicators: [1391722339091.SyncUpTool.replication.org,1234,1, quirinus.apache.org,37045,1391722237951, quirinus.apache.org,33502,1391722238125] other RSs: [] 2014-02-06 21:32:19,811 INFO [Thread-1372.replicationSource,1] regionserver.ReplicationSource(231): Replicating db42e7fc-7f29-4038-9292-d85ea8b9994b - 783c0ab2-4ff9-4dc0-bb38-86bf31d1d817 2014-02-06 21:32:19,892 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 1 2014-02-06 21:32:19,911 TRACE 
[Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 1 2014-02-06 21:32:20,094 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 2 ... 2014-02-06 21:32:23,414 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 8 2014-02-06 21:32:23,673 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(169): Moving quirinus.apache.org,37045,1391722237951's hlogs to my queue 2014-02-06 21:32:23,768 DEBUG [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(396): Creating quirinus.apache.org%2C37045%2C1391722237951.1391722243779 with data 10803 2014-02-06 21:32:23,842 DEBUG [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(396): Creating quirinus.apache.org%2C37045%2C1391722237951.1391722243779 with data 10803 2014-02-06 21:32:24,297 TRACE [Thread-1372.replicationSource,2] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 9 2014-02-06 21:32:24,314 TRACE [Thread-1372.replicationSource,1] regionserver.ReplicationSource(596): No log to process, sleeping 100 times 9 {noformat} Finally it gives up: {noformat} 2014-02-06 21:32:30,873 DEBUG [Thread-1372] replication.TestReplicationSyncUpTool(323): SyncUpAfterDelete failed at retry = 0, with rowCount_ht1TargetPeer1 =100 and rowCount_ht2TargetAtPeer1 =200 {noformat} The syncUp tool has an ID you can follow, grep for syncupReplication1391722338885 or just the timestamp, and you can see it doing things after that. The reason is that the tool closes the ReplicationSourceManager but not the ZK connection, so events _still_ come in and NodeFailoverWorker _still_ tries to recover queues but then there's nothing to process them. 
Later in the logs you can see: {noformat} 2014-02-06 21:32:37,381 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(169): Moving quirinus.apache.org,33502,1391722238125's hlogs to my queue 2014-02-06 21:32:37,567 INFO [ReplicationExecutor-0] replication.ReplicationQueuesZKImpl(239): Won't transfer the queue, another RS took care of it because of: KeeperErrorCode = NoNode for /1/replication/rs/quirinus.apache.org,33502,1391722238125/lock {noformat} There shouldn't be any racing, but by now someone has already moved quirinus.apache.org,33502,1391722238125 away. FWIW I can't even make the test fail on my machine, so I'm not 100% sure closing the ZK connection fixes the issue, but at least it's the right thing to do. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
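The shape of the fix described here is simple: shut down the replication machinery AND close the ZooKeeper connection, so no further node events can trigger NodeFailoverWorker after the tool is done. A minimal sketch with hypothetical names (not the actual ReplicationSyncUp code):

```java
// Hypothetical sketch of the HBASE-10482 fix: close the ZK connection,
// not just the ReplicationSourceManager, when the sync-up tool exits.
public class SyncUpShutdownSketch {
    public interface Stoppable { void close(); }

    public static void shutdown(Stoppable replicationManager, Stoppable zkConnection) {
        try {
            replicationManager.close();  // previously the only thing closed
        } finally {
            zkConnection.close();        // the missing step: stops event delivery,
                                         // so no more queue-recovery attempts fire
        }
    }
}
```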
[jira] [Updated] (HBASE-10356) Failover RPC's for multi-get
[ https://issues.apache.org/jira/browse/HBASE-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-10356: - Status: Patch Available (was: In Progress) Failover RPC's for multi-get - Key: HBASE-10356 URL: https://issues.apache.org/jira/browse/HBASE-10356 Project: HBase Issue Type: Sub-task Components: Client Reporter: Enis Soztutar Assignee: Sergey Shelukhin Fix For: 0.99.0 Attachments: HBASE-10356.patch, HBASE-10356.reference.patch, HBASE-10356.reference.patch This is extension of HBASE-10355 to add failover support for multi-gets. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10491) RegionLocations::getRegionLocation can return unexpected replica
[ https://issues.apache.org/jira/browse/HBASE-10491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897438#comment-13897438 ] Sergey Shelukhin commented on HBASE-10491: -- [~enis] fyi RegionLocations::getRegionLocation can return unexpected replica Key: HBASE-10491 URL: https://issues.apache.org/jira/browse/HBASE-10491 Project: HBase Issue Type: Sub-task Affects Versions: hbase-10070 Reporter: Sergey Shelukhin The method returns the first non-null replica. If the first replica is assumed to always be non-null (discussed with Enis), then this code is not necessary; it should return the 0th one, and maybe assert it's not null. If that is not the case, then the code may be incorrect and may return a non-primary to some code (the locateRegion overload) that doesn't expect it. Perhaps the method should be called getAnyRegionReplica or something like that, and get(Primary?)RegionLocation should return the first. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
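The two semantics being contrasted can be sketched side by side (hypothetical names and simplified types, not the actual RegionLocations API): a "first non-null" accessor can silently hand a secondary replica to code that expects the primary.

```java
// Hypothetical sketch of the two accessors discussed in HBASE-10491.
public class RegionLocationsSketch {
    // Behavior described in the issue: scan for the first non-null entry,
    // which may be a secondary replica.
    public static String getAnyRegionReplica(String[] locations) {
        for (String loc : locations) {
            if (loc != null) return loc;
        }
        return null;
    }

    // Proposed primary accessor: index 0 only, possibly null; callers that
    // need the primary see "unknown" rather than a surprise secondary.
    public static String getPrimaryRegionLocation(String[] locations) {
        return locations.length > 0 ? locations[0] : null;
    }
}
```

When the primary slot is null, the first variant returns a secondary while the second returns null, making the "unexpected replica" case explicit to the caller.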
[jira] [Commented] (HBASE-10356) Failover RPC's for multi-get
[ https://issues.apache.org/jira/browse/HBASE-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897443#comment-13897443 ] Sergey Shelukhin commented on HBASE-10356: -- [~enis] [~devaraj] [~nkeywal] fyi Failover RPC's for multi-get - Key: HBASE-10356 URL: https://issues.apache.org/jira/browse/HBASE-10356 Project: HBase Issue Type: Sub-task Components: Client Reporter: Enis Soztutar Assignee: Sergey Shelukhin Fix For: 0.99.0 Attachments: HBASE-10356.patch, HBASE-10356.reference.patch, HBASE-10356.reference.patch This is extension of HBASE-10355 to add failover support for multi-gets. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10356) Failover RPC's for multi-get
[ https://issues.apache.org/jira/browse/HBASE-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897442#comment-13897442 ] Sergey Shelukhin commented on HBASE-10356: -- rb at https://reviews.apache.org/r/17937/; RB is acting as usual for me, so publish button is stuck. This review will eventually become available Failover RPC's for multi-get - Key: HBASE-10356 URL: https://issues.apache.org/jira/browse/HBASE-10356 Project: HBase Issue Type: Sub-task Components: Client Reporter: Enis Soztutar Assignee: Sergey Shelukhin Fix For: 0.99.0 Attachments: HBASE-10356.patch, HBASE-10356.reference.patch, HBASE-10356.reference.patch This is extension of HBASE-10355 to add failover support for multi-gets. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-9203) Secondary index support through coprocessors
[ https://issues.apache.org/jira/browse/HBASE-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897450#comment-13897450 ] James Taylor commented on HBASE-9203: - Is it possible to break this JIRA down into smaller pieces? The only part that Phoenix would need is the custom load balancer that keeps two regions co-located. If that functionality was available, I believe that Phoenix could support a local index option to complement its global index feature. Would it be possible for that to be a separate JIRA that is tackled first? Secondary index support through coprocessors Key: HBASE-9203 URL: https://issues.apache.org/jira/browse/HBASE-9203 Project: HBase Issue Type: New Feature Reporter: rajeshbabu Assignee: rajeshbabu Attachments: SecondaryIndex Design.pdf, SecondaryIndex Design_Updated.pdf, SecondaryIndex Design_Updated_2.pdf We have been working on implementing secondary index in HBase and open sourced it on the hbase 0.94.8 version. The project is available on github: https://github.com/Huawei-Hadoop/hindex This Jira is to support secondary index on trunk (0.98). The following features will be supported: - multiple indexes on a table, - multi-column index, - index based on part of a column value, - equals and range condition scans using index, and - bulk loading data to indexed tables (indexing done with bulk load) Most of the kernel changes needed for secondary index support are available in trunk; very minimal changes are needed for it. -- This message was sent by Atlassian JIRA (v6.1.5#6160)