[jira] [Commented] (HBASE-5506) Add unit test for ThriftServerRunner.HbaseHandler.getRegionInfo()
[ https://issues.apache.org/jira/browse/HBASE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220761#comment-13220761 ]

Hadoop QA commented on HBASE-5506:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12516791/HBASE-5506.D2031.3.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 6 new or modified tests.
    -1 javadoc. The javadoc tool appears to have generated -129 warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    -1 findbugs. The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1078//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1078//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1078//console

This message is automatically generated.

Add unit test for ThriftServerRunner.HbaseHandler.getRegionInfo()
-----------------------------------------------------------------

    Key: HBASE-5506
    URL: https://issues.apache.org/jira/browse/HBASE-5506
    Project: HBase
    Issue Type: Test
    Reporter: Scott Chen
    Assignee: Scott Chen
    Priority: Minor
    Attachments: HBASE-5506.D2031.1.patch, HBASE-5506.D2031.2.patch, HBASE-5506.D2031.3.patch

We observed that with the framed transport option, the Thrift call ThriftServerRunner.HbaseHandler.getRegionInfo() receives a corrupted parameter (a garbage string attached to the beginning). This may be a Thrift bug that requires further investigation.

Add a unit test to reproduce the problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
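The symptom reported above (a parameter that arrives with extra bytes in front of it, only under framed transport) matches one well-known Java pitfall, although nothing in this message confirms that it is the cause here: a ByteBuffer that is only a window into a larger frame buffer is converted with array(), which returns the whole backing array and ignores position and limit. The sketch below uses only the JDK, not Thrift. The 4-byte length prefix mirrors how framed transports delimit messages; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative only: this is not HBase or Thrift code. */
final class FramedParamSketch {

    /** Builds a fake frame: a 4-byte length prefix followed by the parameter bytes. */
    static byte[] frame(byte[] param) {
        return ByteBuffer.allocate(4 + param.length)
                .putInt(param.length)
                .put(param)
                .array();
    }

    /** Returns a buffer that shares the frame's backing array but covers only the parameter. */
    static ByteBuffer paramView(byte[] frame) {
        return ByteBuffer.wrap(frame, 4, frame.length - 4);
    }

    /** Pitfall: array() hands back the whole backing array, prefix included. */
    static byte[] naiveBytes(ByteBuffer buf) {
        return buf.array();
    }

    /** Copies only the bytes between position and limit; the caller's position is untouched. */
    static byte[] exactBytes(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] param = "regionA".getBytes(StandardCharsets.UTF_8);
        ByteBuffer view = paramView(frame(param));
        System.out.println("naive length: " + naiveBytes(view).length);
        System.out.println("exact value:  " + new String(exactBytes(view), StandardCharsets.UTF_8));
    }
}
```

A reproducing unit test along these lines would pass a known name through the framed path and assert that the handler sees exactly those bytes and nothing before them.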
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220774#comment-13220774 ]

Hadoop QA commented on HBASE-5074:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12516798/D1521.11.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 55 new or modified tests.
    -1 javadoc. The javadoc tool appears to have generated -125 warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    -1 findbugs. The patch appears to introduce 159 new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.io.hfile.TestFixedFileTrailer
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1079//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1079//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1079//console

This message is automatically generated.

support checksums in HBase block cache
--------------------------------------

    Key: HBASE-5074
    URL: https://issues.apache.org/jira/browse/HBASE-5074
    Project: HBase
    Issue Type: Improvement
    Components: regionserver
    Reporter: dhruba borthakur
    Assignee: dhruba borthakur
    Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch

The current implementation of HDFS stores the data in one block file and the metadata (checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage hardware offers.
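To make the I/O argument concrete, here is a minimal sketch of the general idea of keeping checksums in the same buffer as the data, so that a single read returns both and verification needs no second file. It is not the layout used by the attached patches: the chunk size, the trailer layout, and the names InlineChecksumSketch, seal, and verify are assumptions made for illustration. It relies only on java.util.zip.CRC32 from the JDK.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Illustrative only: not the block format implemented by the patch. */
final class InlineChecksumSketch {
    /** Hypothetical chunk size, kept tiny so the example is easy to follow. */
    static final int BYTES_PER_CHECKSUM = 16;

    static int chunkCount(int dataLen) {
        return (dataLen + BYTES_PER_CHECKSUM - 1) / BYTES_PER_CHECKSUM;
    }

    static int crcOf(byte[] buf, int off, int len) {
        CRC32 crc = new CRC32();
        crc.update(buf, off, len);
        return (int) crc.getValue();
    }

    /** Layout: [4-byte data length][data][one 4-byte CRC32 per chunk of data]. */
    static byte[] seal(byte[] data) {
        int chunks = chunkCount(data.length);
        ByteBuffer out = ByteBuffer.allocate(4 + data.length + 4 * chunks);
        out.putInt(data.length).put(data);
        for (int i = 0; i < chunks; i++) {
            int off = i * BYTES_PER_CHECKSUM;
            int len = Math.min(BYTES_PER_CHECKSUM, data.length - off);
            out.putInt(crcOf(data, off, len));
        }
        return out.array();
    }

    /** Verifies every chunk using nothing but the bytes of the sealed block. */
    static boolean verify(byte[] block) {
        if (block.length < 4) return false;
        ByteBuffer in = ByteBuffer.wrap(block);
        int dataLen = in.getInt();
        if (dataLen < 0 || dataLen > block.length - 4) return false;
        int chunks = chunkCount(dataLen);
        if (block.length != 4 + dataLen + 4 * chunks) return false;
        for (int i = 0; i < chunks; i++) {
            int off = i * BYTES_PER_CHECKSUM;
            int len = Math.min(BYTES_PER_CHECKSUM, dataLen - off);
            int stored = in.getInt(4 + dataLen + 4 * i); // absolute read from the trailer
            if (stored != crcOf(block, 4 + off, len)) return false;
        }
        return true;
    }
}
```

With a layout like this, the reader pays one disk operation per block and can still detect corruption; whether the filesystem-level checksum can then be skipped is a separate decision that the sketch does not address.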
[jira] [Updated] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HBASE-5074:
-------------------------------

    Attachment: D1521.12.patch

dhruba updated the revision [jira] [HBASE-5074] Support checksums in HBase block cache.
Reviewers: mbautin

  Fixed failed unit test TestFixedFileTrailer

REVISION DETAIL
  https://reviews.facebook.net/D1521

AFFECTED FILES
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
  src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
  src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
  src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/fs
  src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
  src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
  src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
  src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220810#comment-13220810 ]

Hadoop QA commented on HBASE-5074:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12516807/D1521.12.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 55 new or modified tests.
    -1 javadoc. The javadoc tool appears to have generated -125 warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    -1 findbugs. The patch appears to introduce 159 new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.TestDrainingServer

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1080//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1080//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1080//console

This message is automatically generated.
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220823#comment-13220823 ]

jirapos...@reviews.apache.org commented on HBASE-5443:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4054/#review5552
-----------------------------------------------------------

This seems to be close to a one-to-one mapping with the current interface. I don't know if this is the intent or whether you're willing to completely redesign the look of the API too. Maybe it's to ease the transition?

I'd like to see a request type to do one-shot scans, something where you don't even get a scanner ID. You pass the same parameters as when opening a scanner, you say up to how many rows or bytes you want to retrieve, and you get just that in one shot.

When opening an actual scanner, we also need to be able to get the first batch of scan results at the same time we open the scanner. This is a must-have IMO. And we need to be able to request to close the scanner while fetching a batch of results.

It would be nice to have a keep-alive request for existing scanners: something to tell the server "I'm not fetching anything from this scanner right now, but please keep it open by resetting its TTL; don't close it just because I haven't used it for a while."

Please, please, please, consider shortening the names of all these protobufs and dropping the Proto suffix. The current names are unnecessarily long, aren't intuitive (e.g. columnFamily for something that describes the multiple things you're trying to get out of a row), or are too redundant (e.g. KeyType keyType).

Regarding the lack of a multi RPC, I think this is a good thing. multi is a big mess that was only marginally better than its horrible multiPut predecessor. This proposal already supports multi-everything; it just doesn't support batching different kinds of operations in the same RPC, which isn't a big deal IMO.

pom.xml
https://reviews.apache.org/r/4054/#comment11997

    Do this instead:
      if which cygpath >/dev/null 2>/dev/null; then
        # Windows
      else
        # Not Windows
      fi

pom.xml
https://reviews.apache.org/r/4054/#comment11998

    Simply do:
      if $IS_WIN; then

pom.xml
https://reviews.apache.org/r/4054/#comment12016

    Actually you can just remove the whole $IS_WIN business and everything. Simply fix PROTO_DIR and JAVA_DIR when on Windows before calling protoc.

src/main/proto/HRegionProtocol.proto
https://reviews.apache.org/r/4054/#comment11999

    I find the Proto suffix unnecessary and long. If you truly want a suffix, PB would be shorter, but no suffix would be better IMO.

src/main/proto/HRegionProtocol.proto
https://reviews.apache.org/r/4054/#comment12004

    Use option optimize_for = SPEED; it makes a big difference.

src/main/proto/HRegionProtocol.proto
https://reviews.apache.org/r/4054/#comment12006

    I'd call this just Columns.

src/main/proto/HRegionProtocol.proto
https://reviews.apache.org/r/4054/#comment12005

    I would recommend pluralizing all repeated fields. This will make for nicer code where you'll be able to write something along the lines of:
      for (byte[] qualifier : pb.qualifiers())

src/main/proto/HRegionProtocol.proto
https://reviews.apache.org/r/4054/#comment12010

    The thing that append() and increment() have in common is that they're atomic operations that don't require a read-modify-write from the client. So maybe AtomicOp would be better?

src/main/proto/HRegionProtocol.proto
https://reviews.apache.org/r/4054/#comment12012

    Just call this Columns.

src/main/proto/HRegionProtocol.proto
https://reviews.apache.org/r/4054/#comment12019

    What's the meaning of this? How do we know what has been processed and what hasn't?

src/main/proto/HRegionProtocol.proto
https://reviews.apache.org/r/4054/#comment12000

    I'd vote for adding this right now. It's easy to add directly and would be a huge improvement for short scans (which are super common).

src/main/proto/HRegionProtocol.proto
https://reviews.apache.org/r/4054/#comment12007

    We need to have a way to get feedback from the server about the TTL of the scanner: how long can the client hold on to the scanner before the server will kill it? Add a field here so that the server can communicate the TTL to the client.

src/main/proto/HRegionProtocol.proto
https://reviews.apache.org/r/4054/#comment12001

    Please add an optional boolean close to request that the scanner be closed after returning this batch of results. This can help clients eliminate the CloseScannerRequestProto when they know they're going to close the scanner after this batch.

src/main/proto/HRegionProtocol.proto
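The scanner requests asked for in the review above (one-shot scan, first batch returned on open, close-with-this-batch, keep-alive, and a TTL reported back to the client) can be modelled in a few lines. The following is an in-memory sketch, not the proposed protobuf API: every name in it (ScannerLeaseSketch, Batch, scanOnce, keepAlive, and so on) is hypothetical, rows are plain strings, and the clock is a field the caller advances by hand so that expiry is deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a toy server-side model of scanner leases. */
final class ScannerLeaseSketch {

    /** Reply to a scan request: rows, the scanner id (-1 if none stays open), and the lease TTL. */
    static final class Batch {
        final long scannerId;
        final List<String> rows;
        final long ttlMillis;

        Batch(long scannerId, List<String> rows, long ttlMillis) {
            this.scannerId = scannerId;
            this.rows = rows;
            this.ttlMillis = ttlMillis;
        }
    }

    private static final class Scanner {
        int next;        // index of the next row to return
        long expiresAt;  // fake-clock time at which the lease lapses
    }

    private final List<String> table;
    private final long ttlMillis;
    private final Map<Long, Scanner> scanners = new HashMap<>();
    private long nextId = 1;

    /** Fake clock in milliseconds, advanced by the caller. */
    long now = 0;

    ScannerLeaseSketch(List<String> table, long ttlMillis) {
        this.table = table;
        this.ttlMillis = ttlMillis;
    }

    /** One-shot scan: no scanner id is ever allocated. */
    Batch scanOnce(int maxRows) {
        return new Batch(-1, slice(0, maxRows), 0);
    }

    /** Opens a scanner and returns the first batch in the same reply. */
    Batch open(int maxRows) {
        long id = nextId++;
        Scanner s = new Scanner();
        scanners.put(id, s);
        return fetch(id, s, maxRows, false);
    }

    /** Fetches the next batch; close=true drops the scanner after this batch. */
    Batch next(long id, int maxRows, boolean close) {
        return fetch(id, live(id), maxRows, close);
    }

    /** Resets the lease without fetching anything and reports the TTL. */
    long keepAlive(long id) {
        live(id).expiresAt = now + ttlMillis;
        return ttlMillis;
    }

    private Scanner live(long id) {
        Scanner s = scanners.get(id);
        if (s == null || s.expiresAt <= now) {
            scanners.remove(id);
            throw new IllegalStateException("unknown or expired scanner " + id);
        }
        return s;
    }

    private Batch fetch(long id, Scanner s, int maxRows, boolean close) {
        List<String> rows = slice(s.next, maxRows);
        s.next += rows.size();
        s.expiresAt = now + ttlMillis;
        if (close) {
            scanners.remove(id);
            return new Batch(-1, rows, 0);
        }
        return new Batch(id, rows, ttlMillis);
    }

    private List<String> slice(int from, int maxRows) {
        int to = Math.min(table.size(), from + Math.max(0, maxRows));
        return new ArrayList<>(table.subList(from, to));
    }
}
```

In this model a short scan costs one round trip with scanOnce, and a longer one costs no extra trips for opening or closing, which is the saving the review argues for.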
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-5399:
---------------------------

    Status: Open (was: Patch Available)

Cut the link between the client and the zookeeper ensemble
----------------------------------------------------------

    Key: HBASE-5399
    URL: https://issues.apache.org/jira/browse/HBASE-5399
    Project: HBase
    Issue Type: Improvement
    Components: client
    Affects Versions: 0.94.0
    Environment: all
    Reporter: nkeywal
    Assignee: nkeywal
    Priority: Minor
    Attachments: 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v3.patch, 5399_inprogress.v9.patch

The link is often considered an issue, for various reasons. One of them is that there is a limit on the number of connections that ZK can manage. Stack also suggested removing the link to the master from HConnection.

There are choices to be made considering the existing API (which we don't want to break).

The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken.

ZooKeeper is used for:
 - public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection => we have to deprecate this but keep it.
 - read get master address to create a master => now done with a temporary zookeeper connection
 - read root location => now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked.
 - read cluster id => now done once with a temporary zookeeper connection.
 - check if base node is available => now done once with a zookeeper connection given as a parameter
 - isTableDisabled/isTableAvailable => public functions, now done with a temporary zookeeper connection.
     - Called internally from HBaseAdmin and HTable
 - getCurrentNrHRS(): public function to get the number of region servers and create a pool of threads => now done with a temporary zookeeper connection

Master is used for:
 - getMaster public getter, as for ZooKeeper => we have to deprecate this but keep it.
 - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
 - getHTableDescriptor*: public functions offering access to the master. => we could make them use a temporary master connection as well.

Main points are:
 - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, a non-connected client will always be much slower, because it's a TCP connection, and establishing a TCP connection is slow.
 - having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client.
 - if we move the table descriptor part away from the client, we need to find a new place for it.
 - we will have the same issue with HBaseAdmin (for both ZK & Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic, however.
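The "temporary connection" approach that recurs in the list above amounts to: open a connection, perform one read, and always close it before returning. The sketch below shows that shape with try-with-resources. It deliberately avoids the real ZooKeeper and HBase classes; Connection, FakeConnection, withTemporaryConnection, and the path used in main are all hypothetical stand-ins.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

/** Illustrative only: not the HConnection or ZooKeeperWatcher API. */
final class TemporaryConnectionSketch {

    /** Hypothetical stand-in for a ZooKeeper (or master) connection. */
    interface Connection extends AutoCloseable {
        String read(String path);

        @Override
        void close(); // narrowed so that callers need no checked-exception handling
    }

    /** Opens a connection, runs one action against it, and closes it even if the action throws. */
    static <T> T withTemporaryConnection(Supplier<Connection> factory,
                                         Function<Connection, T> action) {
        try (Connection c = factory.get()) {
            return action.apply(c);
        }
    }

    /** In-memory fake that counts how many connections are currently open. */
    static final class FakeConnection implements Connection {
        static final AtomicInteger OPEN = new AtomicInteger();

        FakeConnection() {
            OPEN.incrementAndGet();
        }

        @Override
        public String read(String path) {
            return "value-of:" + path;
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    public static void main(String[] args) {
        // "/example/cluster-id" is a made-up path used only for this demonstration.
        String id = withTemporaryConnection(FakeConnection::new, c -> c.read("/example/cluster-id"));
        System.out.println(id + ", open connections: " + FakeConnection.OPEN.get());
    }
}
```

The trade-off is the one named in the main points: nothing stays connected between calls, so the server-side connection count drops, but each call pays the cost of establishing a new TCP connection.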
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-5399:
---------------------------

    Attachment: 5399_inprogress.v23.patch
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220964#comment-13220964 ]

nkeywal commented on HBASE-5399:
--------------------------------

@stack. A lot of variance, but not that much. So I know I broke something somewhere. I fixed a synchronization issue in v23 (plus the points mentioned in your review).
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13220965#comment-13220965 ] Hadoop QA commented on HBASE-5399: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12516840/5399_inprogress.v23.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 18 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1081//console This message is automatically generated.
[jira] [Commented] (HBASE-2038) Coprocessors: Region level indexing
[ https://issues.apache.org/jira/browse/HBASE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13220981#comment-13220981 ]

Anoop Sam John commented on HBASE-2038:

Hi Lars,
{quote}It might be possible to provide a custom filter to do that.{quote}
- What we want from the filter is to include a row and then seek to the next row we are interested in. I can't see such a facility in our Filter right now. Correct me if I am wrong.

Suppose we have already seeked to one row and it needs to be included in the result; the Filter should then return INCLUDE. Only when the next next() call happens can we return a SEEK_USING_HINT, so one extra row read is needed. This might even cause one unwanted HFileBlock fetch (who knows).

Can we add reseek() at a higher level? If you have a suggestion, please let me know.

Thanks
Anoop

Coprocessors: Region level indexing

Key: HBASE-2038
URL: https://issues.apache.org/jira/browse/HBASE-2038
Project: HBase
Issue Type: New Feature
Components: coprocessors
Reporter: Andrew Purtell
Priority: Minor

HBASE-2037 is a good candidate to be done as a coprocessor. It also serves as a good goalpost for coprocessor environment design -- there should be enough of it so region level indexing can be reimplemented as a coprocessor without any loss of functionality.
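The extra row read Anoop describes can be illustrated with a toy model. The sketch below does not use the HBase Filter API: rows are plain strings in a sorted set, and both scan methods are hypothetical simplifications. It only counts how many rows a scanner would touch when a seek can be requested only on the row after an included one, compared with an assumed "include and seek" facility.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy model only: rows are strings in a sorted set, not HBase KeyValues.
class SeekCostSketch {
    /** Rows returned by a scan and the number of rows the scanner had to read. */
    static final class ScanResult {
        final List<String> included = new ArrayList<>();
        int rowsRead = 0;
    }

    /**
     * Filter-style scan: a wanted row is included, then the scanner steps to the
     * adjacent row, which must be read before a seek hint can be returned.
     */
    static ScanResult scanWithFilter(TreeSet<String> table, TreeSet<String> wanted) {
        ScanResult r = new ScanResult();
        String row = wanted.isEmpty() ? null : table.ceiling(wanted.first());
        while (row != null) {
            r.rowsRead++;
            if (wanted.contains(row)) {
                r.included.add(row);
                row = table.higher(row);            // plain next(): adjacent row
            } else {
                String hint = wanted.higher(row);   // next row we care about
                row = (hint == null) ? null : table.ceiling(hint); // seek to hint
            }
        }
        return r;
    }

    /** Hypothetical include-and-seek: jump to the next wanted row right after an include. */
    static ScanResult scanWithIncludeAndSeek(TreeSet<String> table, TreeSet<String> wanted) {
        ScanResult r = new ScanResult();
        String target = wanted.isEmpty() ? null : wanted.first();
        while (target != null) {
            String row = table.ceiling(target);     // seek directly to the target
            if (row == null) {
                break;
            }
            r.rowsRead++;
            if (wanted.contains(row)) {
                r.included.add(row);
            }
            target = wanted.higher(row);
        }
        return r;
    }
}
```

With ten rows and two wanted rows, the filter-style scan reads one unwanted neighbour after each included row, while the include-and-seek variant reads only the wanted rows.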
[jira] [Created] (HBASE-5510) Change in LB.randomAssignment(List<ServerName> servers) API
Change in LB.randomAssignment(List<ServerName> servers) API

Key: HBASE-5510
URL: https://issues.apache.org/jira/browse/HBASE-5510
Project: HBase
Issue Type: Improvement
Affects Versions: 0.92.0
Reporter: Anoop Sam John

In LB there is a randomAssignment(List<ServerName> servers) API which is used by AM to assign a region from a down RS. [It is also used in other cases, such as a call to the assign() API from a client.] I feel it would be better to pass the HRegionInfo into this method as well. When the LB is choosing a server for a region assignment because one RS is down, it would be nice for the LB to know which region it is making this selection for.

+Scenario+
When one RS goes down, we want its regions to move to other RSs, but with a given set of regions staying together. We have a custom load balancer, but with the current LB interface this is not possible. An alternative is to allow random assignment of the regions when the RS goes down and later run a cluster balance to place the regions as needed. But this may assign regions first to one RS and then move them again to another, and for some period of time my business use case cannot be satisfied.

I have also seen an issue in JIRA about making sure that the ROOT and META regions always sit on specific RSs. With the current LB API this won't be possible in future.
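To make the request concrete, here is a rough sketch of why the region has to be visible to the balancer. It does not use HBase types: servers and regions are plain strings, and groupOf and assignment are hypothetical helpers invented for illustration. The grouping rule (text before the first comma of the region name) is an assumption, not something stated in the issue.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class RegionAwareAssignmentSketch {
    /** Shape of the current call as described above: only candidate servers are known. */
    static String randomAssignment(List<String> servers, Random rng) {
        return servers.get(rng.nextInt(servers.size()));
    }

    /** Hypothetical group key: the text before the first comma of a region name. */
    static String groupOf(String regionName) {
        int comma = regionName.indexOf(',');
        return comma < 0 ? regionName : regionName.substring(0, comma);
    }

    /** Proposed shape: the region is passed too, so one group always maps to one server. */
    static String assignment(String regionName, List<String> servers) {
        int idx = Math.floorMod(groupOf(regionName).hashCode(), servers.size());
        return servers.get(idx);
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("rs1", "rs2", "rs3");
        System.out.println(assignment("tableA,row1,1", servers));
        System.out.println(assignment("tableA,row9,2", servers)); // same server as above
    }
}
```

Without the region argument, a custom balancer has nothing to key the grouping on, which is the limitation the issue describes.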
[jira] [Commented] (HBASE-5504) Online Merge
[ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221023#comment-13221023 ]

stack commented on HBASE-5504:

bq. I meant that data for the neighbor region we choose should be copied. The neighbor region would have new delimiting key.

Sorry, I'm not following, Ted. You need to bring me along. Thanks.

Online Merge

Key: HBASE-5504
URL: https://issues.apache.org/jira/browse/HBASE-5504
Project: HBase
Issue Type: Brainstorming
Components: client, master, shell, zookeeper
Affects Versions: 0.94.0
Reporter: Mubarak Seyed
Fix For: 0.96.0

As discussed, please refer to the discussion at [HBASE-4991|https://issues.apache.org/jira/browse/HBASE-4991]

Design suggestion from Stack:
{quote}
I suggest a design below. It has some prerequisites, some general function that this feature could use (and others). The prereqs, if you think them good, could be done outside of this JIRA.

Here's a suggested rough outline of how I think this feature should run. The feature I'm describing below is merge and deleteRegion, for I see them as in essence the same thing.

(C) Client, (M) Master, RS (Region server), ZK (ZooKeeper)

1. Client calls merge or deleteRegion API. API is a range of rows. (C)
2. Master gets call. (M)
3. Master obtains a write lock on table so it can't be disabled from under us. The write lock will also disable splitting. This is one of the prereqs I think. It's HBASE-5494 (Or we could just do something simpler where we have a flag up in zk that splitRegion checks, but that's less useful I think; OR we do the dynamic configs issue and set splits to off via a config. change). There'd be a timer for how long we wait on the table lock. (M - ZK)
4. If we get the lock, write intent to merge a range up into zk. It also hoists into zk if it's a pure merge or a merge that drops the region data (a deleteRegion call) (M)
5. Return to the client either our failed attempt at locking the table or an id of some sort used identifying this running operation; can use it querying status. (M - C)
6. Turn off balancer. TODO/prereq: Do it in a way that is persisted. Balancer switch currently in memory only, so if master crashes, new master will come up in balancing mode (If we had dynamic config. could hoist up to zk a config. that disables the balancer rather than have a balancer-specific flag/znode OR if a write lock outstanding on a table, then the balancer does not balance regions in the locked table - this latter might be the easiest to do) (M)
7. Write into zk that just turned off the balancer (If it was on) (M - ZK)
8. Get regions that are involved in the span (M)
9. Hoist the list up into zk. (M - ZK)
10. Create region to span the range. (M)
11. Write that we did this up into zk. (M - ZK)
12. Close regions in parallel. Confirm close in parallel. (M - RS)
13. Write up into zk regions closed (This might not be necessary since can ask if region is open). (M - ZK)
14. If a merge and not a delete region, move files under new region. Might multithread this (moves should go pretty fast). If a deleteregion, we skip this step. (M)
15. On completion mark zk (though may not be necessary since it's easy to look in fs to see state of move). (M - ZK)
16. Edit .META. (M)
17. Confirm edits went in. (M)
18. Move old regions to hbase trash folder. TODO: There is no trash folder under /hbase currently. We should add one. (M)
19. Enable balancer (if it was off) (M)
20. Unlock table (M)
{quote}
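The outline above repeatedly records progress in zk so that the operation's state survives a master crash. A minimal sketch of that idea, assuming the steps are collapsed into a coarse ordered list and that only the last completed step is journaled: the Step names and the remaining helper are hypothetical, and the design in the issue does not prescribe this representation.

```java
import java.util.ArrayList;
import java.util.List;

class MergeJournalSketch {
    /** Coarse, hypothetical grouping of the numbered steps in the outline. */
    enum Step {
        LOCK_TABLE, RECORD_INTENT, DISABLE_BALANCER, LIST_REGIONS,
        CREATE_SPANNING_REGION, CLOSE_REGIONS, MOVE_FILES, EDIT_META,
        TRASH_OLD_REGIONS, ENABLE_BALANCER, UNLOCK_TABLE
    }

    /**
     * Steps still to run, given the last step recorded in the journal
     * (null means nothing has been recorded yet).
     */
    static List<Step> remaining(Step lastCompleted, boolean deleteRegion) {
        List<Step> out = new ArrayList<>();
        for (Step s : Step.values()) {
            if (lastCompleted != null && s.ordinal() <= lastCompleted.ordinal()) {
                continue; // already done according to the journal
            }
            if (deleteRegion && s == Step.MOVE_FILES) {
                continue; // a deleteRegion call skips the file move
            }
            out.add(s);
        }
        return out;
    }

    public static void main(String[] args) {
        // A new master that finds CLOSE_REGIONS journaled resumes from here.
        System.out.println(remaining(Step.CLOSE_REGIONS, false));
    }
}
```

In this reading, a restarted master would read the journal entry and continue with the returned steps rather than starting over.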
[jira] [Assigned] (HBASE-5510) Change in LB.randomAssignment(List<ServerName> servers) API
[ https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan reassigned HBASE-5510: - Assignee: ramkrishna.s.vasudevan
[jira] [Commented] (HBASE-5510) Change in LB.randomAssignment(List<ServerName> servers) API
[ https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221027#comment-13221027 ] ramkrishna.s.vasudevan commented on HBASE-5510: --- @Ted If interface change is ok, I can make a patch for it.
[jira] [Commented] (HBASE-5510) Change in LB.randomAssignment(List<ServerName> servers) API
[ https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221045#comment-13221045 ]

Zhihong Yu commented on HBASE-5510:

For getRegionPlan(), we already provide server to exclude:
{code}
RegionPlan getRegionPlan(final RegionState state,
    final ServerName serverToExclude, final boolean forceNewPlan) {
{code}
Does the above not serve the purpose of migrating region away ?

bq. but a set of regions stay together

Can I get some explanation for the rationale here ? Load balancer should honor block locality when moving regions. Why should these regions stay together ?

Changing interface is fine for TRUNK. But I want to understand the use case more.
[jira] [Updated] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5074: -- Comment: was deleted (was: dhruba updated the revision [jira] [HBASE-5074] Support checksums in HBase block cache. Reviewers: mbautin Fixed failed unit test TestFixedFileTrailer REVISION DETAIL https://reviews.facebook.net/D1521 AFFECTED FILES src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java src/main/java/org/apache/hadoop/hbase/HConstants.java src/main/java/org/apache/hadoop/hbase/fs src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java 
src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java src/main/java/org/apache/hadoop/hbase/regionserver/Store.java src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java ) support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, 
D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators:
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221059#comment-13221059 ] Zhihong Yu commented on HBASE-5074: --- Adding CRC32C in another JIRA is fine. Hadoop 2.0 isn't released. It would be nice to give users CRC32C early. The current formation w.r.t. minor version means that HFileV3 would start with minor version of 1.
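The HBASE-5074 description turns on keeping the checksum next to the data so that one read fetches both. A minimal sketch of that idea, assuming a single CRC32 appended to the end of a block: this is not the HFile block layout from the patch, and writeBlock and readBlock are hypothetical names used only for illustration. The JDK's java.util.zip.CRC32 is used here; CRC32C, mentioned in the comment above, would be a different polynomial.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

class InlineChecksumSketch {
    /** Returns the payload followed by a 4-byte CRC32 of the payload. */
    static byte[] writeBlock(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return ByteBuffer.allocate(data.length + 4)
                .put(data)
                .putInt((int) crc.getValue())
                .array();
    }

    /** Verifies the trailing checksum and returns the payload, or throws if it does not match. */
    static byte[] readBlock(byte[] block) {
        if (block.length < 4) {
            throw new IllegalArgumentException("block too short to hold a checksum");
        }
        int dataLen = block.length - 4;
        CRC32 crc = new CRC32();
        crc.update(block, 0, dataLen);
        int stored = ByteBuffer.wrap(block, dataLen, 4).getInt();
        if (stored != (int) crc.getValue()) {
            throw new IllegalStateException("checksum mismatch");
        }
        return Arrays.copyOf(block, dataLen);
    }
}
```

Because the checksum travels inside the block, verifying it needs no second file and therefore no second disk seek, which is the saving the issue is after.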
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221060#comment-13221060 ]

Zhihong Yu commented on HBASE-3134:

@J-D: Can you outline the tests you were planning ?

[replication] Add the ability to enable/disable streams

Key: HBASE-3134
URL: https://issues.apache.org/jira/browse/HBASE-3134
Project: HBase
Issue Type: New Feature
Components: replication
Reporter: Jean-Daniel Cryans
Assignee: Teruyoshi Zenmyo
Priority: Minor
Labels: replication
Fix For: 0.94.0
Attachments: 3134-v2.txt, 3134-v3.txt, 3134.txt, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch

This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and we want to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion.
[jira] [Created] (HBASE-5511) More doc on maven release process
More doc on maven release process - Key: HBASE-5511 URL: https://issues.apache.org/jira/browse/HBASE-5511 Project: HBase Issue Type: Task Reporter: stack
[jira] [Updated] (HBASE-5511) More doc on maven release process
[ https://issues.apache.org/jira/browse/HBASE-5511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5511: - Attachment: doc.txt Doc and edit to pom that turns off running tests when doing mvn release (Might be responsible for our not publishing an hbase-test.jar... TBD).
[jira] [Resolved] (HBASE-5511) More doc on maven release process
[ https://issues.apache.org/jira/browse/HBASE-5511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5511. -- Resolution: Fixed Fix Version/s: 0.94.0 0.92.1 Assignee: stack Committed to trunk and then I committed the pom patch that disables the running of tests on mvn release to 0.92 and 0.94. This addition may be responsible for our not uploading a hbase-test.jar; tbd.
[jira] [Commented] (HBASE-5506) Add unit test for ThriftServerRunner.HbaseHandler.getRegionInfo()
[ https://issues.apache.org/jira/browse/HBASE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221077#comment-13221077 ]

Zhihong Yu commented on HBASE-5506:

If I comment out the exclusion Scott added, I see the following test failure:
{code}
Tests in error:
testRunThriftServer[0](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): IOError(message:Cannot find row in .META., row=\x80\x01\x00\x01\x00\x00\x00\x0DgetRegionInfo\x00\x00\x00\x12\x0B\x00\x01\x00\x00\x00\x1AtableA,rowA,99\x00)
testRunThriftServer[2](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): IOError(message:Cannot find row in .META., row=\x80\x01\x00\x01\x00\x00\x00\x0DgetRegionInfo\x00\x00\x00\x12\x0B\x00\x01\x00\x00\x00\x1AtableA,rowA,99\x00)
testRunThriftServer[4](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): IOError(message:Cannot find row in .META., row=\x80\x01\x00\x01\x00\x00\x00\x0DgetRegionInfo\x00\x00\x00\x12\x0B\x00\x01\x00\x00\x00\x1AtableA,rowA,99\x00)
testRunThriftServer[6](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): IOError(message:Cannot find row in .META., row=\x80\x01\x00\x01\x00\x00\x00\x0DgetRegionInfo\x00\x00\x00\x12\x0B\x00\x01\x00\x00\x00\x1AtableA,rowA,99\x00)
testRunThriftServer[12](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): IOError(message:Cannot find row in .META., row=\x80\x01\x00\x01\x00\x00\x00\x0DgetRegionInfo\x00\x00\x00\x12\x0B\x00\x01\x00\x00\x00\x1AtableA,rowA,99\x00)
testRunThriftServer[14](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): IOError(message:Cannot find row in .META., row=\x80\x01\x00\x01\x00\x00\x00\x0DgetRegionInfo\x00\x00\x00\x12\x0B\x00\x01\x00\x00\x00\x1AtableA,rowA,99\x00)
testRunThriftServer[16](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): IOError(message:Cannot find row in .META., row=\x80\x01\x00\x01\x00\x00\x00\x0DgetRegionInfo\x00\x00\x00\x12\x0B\x00\x01\x00\x00\x00\x1AtableA,rowA,99\x00)
testRunThriftServer[18](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): IOError(message:Cannot find row in .META., row=\x80\x01\x00\x01\x00\x00\x00\x0DgetRegionInfo\x00\x00\x00\x12\x0B\x00\x01\x00\x00\x00\x1AtableA,rowA,99\x00)
{code}
With the exclusion condition, the test passed:
{code}
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 4:21.622s
{code}
I will integrate patch v3 later today if there is no objection.

Add unit test for ThriftServerRunner.HbaseHandler.getRegionInfo()

Key: HBASE-5506
URL: https://issues.apache.org/jira/browse/HBASE-5506
Project: HBase
Issue Type: Test
Reporter: Scott Chen
Assignee: Scott Chen
Priority: Minor
Attachments: HBASE-5506.D2031.1.patch, HBASE-5506.D2031.2.patch, HBASE-5506.D2031.3.patch

We observed that, with the framed transport option, the thrift call ThriftServerRunner.HbaseHandler.getRegionInfo() receives a corrupted parameter (some garbage string attached to the beginning). This may be a thrift bug that requires further investigation. Add a unit test to reproduce the problem.
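In the failing rows above, the method name getRegionInfo and other bytes appear ahead of the intended argument tableA,rowA,99, which suggests the handler saw bytes from the start of the request buffer. The thread does not identify the cause. One mechanism that produces exactly this kind of prefix in Java, offered here purely as an assumption, is reading a ByteBuffer's whole backing array instead of the bytes between its position and limit. The sketch below uses only java.nio; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ByteBufferSliceSketch {
    /** Risky pattern: returns the whole backing array, including bytes before position(). */
    static byte[] wholeBackingArray(ByteBuffer buf) {
        return buf.array();
    }

    /** Safer pattern: copies only the bytes between position() and limit(). */
    static byte[] remainingBytes(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out); // duplicate() leaves the caller's position untouched
        return out;
    }

    public static void main(String[] args) {
        // A frame holding a header followed by the real argument.
        byte[] frame = "HEADERtableA,rowA,99".getBytes(StandardCharsets.UTF_8);
        ByteBuffer arg = ByteBuffer.wrap(frame, 6, frame.length - 6);
        System.out.println(new String(wholeBackingArray(arg), StandardCharsets.UTF_8));
        System.out.println(new String(remainingBytes(arg), StandardCharsets.UTF_8));
    }
}
```

Whether this is what happens with the framed transport would need to be checked against the Thrift and handler code; the sketch only shows that the symptom is easy to reproduce this way.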
[jira] [Commented] (HBASE-5508) Add an option to allow test output to show on the terminal
[ https://issues.apache.org/jira/browse/HBASE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221081#comment-13221081 ]

Zhihong Yu commented on HBASE-5508:

I will integrate the patch later today if there is no objection.

Add an option to allow test output to show on the terminal

Key: HBASE-5508
URL: https://issues.apache.org/jira/browse/HBASE-5508
Project: HBase
Issue Type: Improvement
Components: test
Reporter: Scott Chen
Assignee: Scott Chen
Priority: Minor
Attachments: HBASE-5508.D2037.1.patch

Sometimes it is useful to directly see the test results on the terminal. We can add a property to achieve that.

mvn test -Dtest.output.tofile=false
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221085#comment-13221085 ] jirapos...@reviews.apache.org commented on HBASE-5443: -- bq. On 2012-03-02 10:30:06, Benoit Sigoure wrote: bq. This seems to be close to a one-to-one mapping with the current interface today. I don't know if this is the intent or whether you're willing to completely redesign the look of the API too. Maybe it's to ease the transition? bq. bq. I'd like to see a request type to do one-shot scans. Something where you don't even get a scanner ID. You pass parameters like to open a scanner, you say up to how many rows or bytes you want to retrieve, and you get just that in one shot. bq. When opening a actual scanner, we also need to be able to get the first batch of scan results at the same time we open the scanner. This is a must-have IMO. And we need to be able to request to close the scanner while fetching a batch of results. bq. bq. It would be nice to have a keep-alive request for existing scanners. Something to tell the server I'm not fetching anything from this scanner right now, but please keep it open by reseting its TTL, don't close it just because I haven't used it for a while. bq. bq. Please, please, please, consider shortening the name of all these protobufs and dropping the Proto suffix. The current names are unnecessarily long or aren't intuitive (e.g. columnFamily for something that describes the multiple things you're trying to get out of a row) or are too redundant (e.g. KeyType keyType). bq. bq. Regarding the lack of multi RPC, I think this is a good thing. multi is a big mess that was only marginally better than its horrible multiPut predecessor. This proposal already supports multi-everything, it just doesn't support batching different kind of operations in the same RPC, which isn't a big deal IMO. We should implement what Benoît is asking for, probably not all as part of this issue. 
That said, if possible, can we try to accommodate what he's asking for down here at the RPC level? I suppose once everything is pb, it should be easy enough to add new stuff, but it would be good to keep in mind what he's asking for while redoing this layer. In a later issue we can add the overloads that exploit the additions or add the new methods B wants (what B is asking for are long-outstanding fixups needed in HBase). For example, can the pb response on open of a scanner be more than just the scanner id; could it include an optional result item? Or I suppose, once up on pb, we can do this easily enough later? - Michael --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4054/#review5552 --- On 2012-02-27 18:54:31, Jimmy Xiang wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4054/ bq. --- bq. bq. (Updated 2012-02-27 18:54:31) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. This is the first draft of the Protobuf HRegionProtocol. The corresponding java vs pb method mapping is attached to the jira: https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. Please review. I'd like to move ahead after we get to some agreement. bq. bq. bq. This addresses bug HBASE-5443. bq. https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. bq. Diffs bq. - bq. bq.pom.xml 066c027 bq.src/main/proto/HRegionProtocol.proto PRE-CREATION bq.src/main/proto/hbase.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4054/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Jimmy bq. bq. Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf
[jira] [Commented] (HBASE-5510) Change in LB.randomAssignment(List<ServerName> servers) API
[ https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221094#comment-13221094 ] ramkrishna.s.vasudevan commented on HBASE-5510: --- @Ted The rationale is this: I have two sets of regions, S1...Sn and R1...Rn, and I expect the pairs S1-R1, S2-R2, ... to be colocated on the same RS. S1...Sn are balanced by one LB. A custom LB balances R1...Rn. This custom LB should take the assignment made by the first LB and assign R1...Rn based on it, which ensures the colocation. Am I clear, Ted? Change in LB.randomAssignment(List<ServerName> servers) API --- Key: HBASE-5510 URL: https://issues.apache.org/jira/browse/HBASE-5510 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Anoop Sam John Assignee: ramkrishna.s.vasudevan In LB there is a randomAssignment(List<ServerName> servers) API, which is used by the AM to assign a region from a down RS. [It is also used in other cases, such as a call to the assign() API from a client.] I feel it would be better to also pass the HRegionInfo into this method. When the LB chooses a server for a region assignment because an RS is down, it would be good for the LB to know which region it is selecting a server for. +Scenario+ When an RS goes down, we want its regions moved to other RSs, with a given set of regions staying together. We have a custom load balancer, but with the current LB interface this is not possible. An alternative is to allow a random assignment of the regions when the RS goes down and later rebalance the cluster as needed. But this may assign regions first to one RS and then move them again to another, and for some period my business use case cannot be satisfied. I have also seen an issue in JIRA about making sure that the ROOT and META regions always sit on specific RSs. With the current LB API this won't be possible in the future.
[jira] [Commented] (HBASE-5510) Change in LB.randomAssignment(List<ServerName> servers) API
[ https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221105#comment-13221105 ] Zhihong Yu commented on HBASE-5510: --- Should colocation be satisfied by randomAssignment()? By the time regions R1 to Rn are assigned, one or more of S1 to Sn may have been moved to a new location (manual operation, server downtime, etc.). How do we address this scenario? The HRegionInfo Anoop suggested passing to randomAssignment() is that of the region to be assigned; how do we retrieve its buddy region's information? Thanks Change in LB.randomAssignment(List<ServerName> servers) API --- Key: HBASE-5510 URL: https://issues.apache.org/jira/browse/HBASE-5510 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Anoop Sam John Assignee: ramkrishna.s.vasudevan In LB there is a randomAssignment(List<ServerName> servers) API, which is used by the AM to assign a region from a down RS. [It is also used in other cases, such as a call to the assign() API from a client.] I feel it would be better to also pass the HRegionInfo into this method. When the LB chooses a server for a region assignment because an RS is down, it would be good for the LB to know which region it is selecting a server for. +Scenario+ When an RS goes down, we want its regions moved to other RSs, with a given set of regions staying together. We have a custom load balancer, but with the current LB interface this is not possible. An alternative is to allow a random assignment of the regions when the RS goes down and later rebalance the cluster as needed. But this may assign regions first to one RS and then move them again to another, and for some period my business use case cannot be satisfied. I have also seen an issue in JIRA about making sure that the ROOT and META regions always sit on specific RSs. With the current LB API this won't be possible in the future.
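The change proposed in HBASE-5510 can be sketched as follows. This is only an illustration under stated assumptions, not the actual HBase LoadBalancer interface: RegionAwareBalancer and ColocatingBalancer are hypothetical names, and plain strings stand in for HRegionInfo and ServerName. The sketch also sidesteps Zhihong Yu's question about how the buddy region's location is retrieved by assuming a map that is kept up to date elsewhere.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical balancer contract: the region being assigned is passed in
// alongside the candidate servers (strings replace HRegionInfo/ServerName).
interface RegionAwareBalancer {
    String randomAssignment(String region, List<String> servers);
}

class ColocatingBalancer implements RegionAwareBalancer {
    // region -> its buddy region (e.g. R1 -> S1); assumed to be configured elsewhere
    private final Map<String, String> buddyOf = new HashMap<>();
    // region -> server currently hosting it; assumed to be kept current elsewhere
    private final Map<String, String> locationOf = new HashMap<>();
    private final Random random = new Random();

    void setBuddy(String region, String buddy) { buddyOf.put(region, buddy); }

    void setLocation(String region, String server) { locationOf.put(region, server); }

    @Override
    public String randomAssignment(String region, List<String> servers) {
        String buddy = buddyOf.get(region);
        String preferred = (buddy == null) ? null : locationOf.get(buddy);
        // Colocate with the buddy when its server is a live candidate.
        if (preferred != null && servers.contains(preferred)) {
            return preferred;
        }
        // Otherwise fall back to a plain random choice.
        return servers.get(random.nextInt(servers.size()));
    }
}
```

Without the region argument, the balancer has nothing to look a buddy up by, which is the limitation the issue describes. The fallback branch shows the case raised in the thread where the buddy has moved or its server is down.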
[jira] [Updated] (HBASE-5430) Fix licenses in 0.92.1 -- RAT plugin won't pass
[ https://issues.apache.org/jira/browse/HBASE-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5430: - Attachment: 5430.txt Excluded two files: our test tgz file and our test sample hfile. Fix licenses in 0.92.1 -- RAT plugin won't pass --- Key: HBASE-5430 URL: https://issues.apache.org/jira/browse/HBASE-5430 Project: HBase Issue Type: Bug Reporter: stack Priority: Blocker Fix For: 0.92.1 Attachments: 5430.txt Use the -Drelease profile to see that we are missing 30 or so licenses. Fix.
[jira] [Resolved] (HBASE-5430) Fix licenses in 0.92.1 -- RAT plugin won't pass
[ https://issues.apache.org/jira/browse/HBASE-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5430. -- Resolution: Fixed Assignee: stack Applied to 0.92, 0.94 and trunk. Fix licenses in 0.92.1 -- RAT plugin won't pass --- Key: HBASE-5430 URL: https://issues.apache.org/jira/browse/HBASE-5430 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.92.1 Attachments: 5430.txt Use the -Drelease profile to see that we are missing 30 or so licenses. Fix.
[jira] [Commented] (HBASE-4890) fix possible NPE in HConnectionManager
[ https://issues.apache.org/jira/browse/HBASE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221112#comment-13221112 ] stack commented on HBASE-4890: -- Any more luck w/ this one J-D (or you got distracted?) fix possible NPE in HConnectionManager -- Key: HBASE-4890 URL: https://issues.apache.org/jira/browse/HBASE-4890 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.92.1 I was running YCSB against a 0.92 branch and encountered this error message: {code} 11/11/29 08:47:16 WARN client.HConnectionManager$HConnectionImplementation: Failed all from region=usertable,user3917479014967760871,1322555655231.f78d161e5724495a9723bcd972f97f41., hostname=c0316.hal.cloudera.com, port=57020 java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.NullPointerException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1501) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1353) at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:898) at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:775) at org.apache.hadoop.hbase.client.HTable.put(HTable.java:750) at com.yahoo.ycsb.db.HBaseClient.update(Unknown Source) at com.yahoo.ycsb.DBWrapper.update(Unknown Source) at com.yahoo.ycsb.workloads.CoreWorkload.doTransactionUpdate(Unknown Source) at com.yahoo.ycsb.workloads.CoreWorkload.doTransaction(Unknown Source) at com.yahoo.ycsb.ClientThread.run(Unknown Source) Caused by: java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithoutRetries(HConnectionManager.java:1315) at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1327) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1325) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:158) at $Proxy4.multi(Unknown Source) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1330) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1328) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithoutRetries(HConnectionManager.java:1309) ... 7 more {code} It looks like the NPE is caused by server being null in the MultiResponse call() method. {code} public MultiResponse call() throws IOException { return getRegionServerWithoutRetries( new ServerCallable<MultiResponse>(connection, tableName, null) { public MultiResponse call() throws IOException { return server.multi(multi); } @Override public void connect(boolean reload) throws IOException { server = connection.getHRegionConnection(loc.getHostname(), loc.getPort()); } } ); {code}
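The failure mode described in HBASE-4890 (a field that is only populated by connect() being dereferenced in call()) can be reproduced in isolation. The sketch below uses hypothetical stand-ins (RegionServer, LazyCallable, GuardedMultiCall), not the HBase classes, and the explicit null check is only one possible guard; the thread does not say how the issue was eventually fixed.

```java
import java.io.IOException;

// Hypothetical stand-in for the remote interface used by the callable.
interface RegionServer {
    String multi(String request) throws IOException;
}

// Hypothetical stand-in for a callable whose server is set lazily.
abstract class LazyCallable {
    protected RegionServer server; // stays null until connect() has run

    abstract void connect() throws IOException;

    abstract String call() throws IOException;
}

class GuardedMultiCall extends LazyCallable {
    private final String request;

    GuardedMultiCall(String request) {
        this.request = request;
    }

    @Override
    void connect() {
        // A real client would look up a connection here; this fake echoes the request.
        server = req -> "ok:" + req;
    }

    @Override
    String call() throws IOException {
        // Without this guard, calling before connect() fails with a bare NPE
        // that surfaces far from its cause, as in the stack trace above.
        if (server == null) {
            throw new IOException("not connected: connect() must run before call()");
        }
        return server.multi(request);
    }
}
```

Turning the null into an IOException with a message keeps the error inside the exception type the retry logic already expects, at the cost of hiding whichever code path skipped connect().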
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221119#comment-13221119 ] jirapos...@reviews.apache.org commented on HBASE-5443: -- bq. On 2012-03-02 10:30:06, Benoit Sigoure wrote: bq. This seems to be close to a one-to-one mapping with the current interface today. I don't know if this is the intent or whether you're willing to completely redesign the look of the API too. Maybe it's to ease the transition? bq. bq. I'd like to see a request type to do one-shot scans. Something where you don't even get a scanner ID. You pass parameters like to open a scanner, you say up to how many rows or bytes you want to retrieve, and you get just that in one shot. bq. When opening a actual scanner, we also need to be able to get the first batch of scan results at the same time we open the scanner. This is a must-have IMO. And we need to be able to request to close the scanner while fetching a batch of results. bq. bq. It would be nice to have a keep-alive request for existing scanners. Something to tell the server I'm not fetching anything from this scanner right now, but please keep it open by reseting its TTL, don't close it just because I haven't used it for a while. bq. bq. Please, please, please, consider shortening the name of all these protobufs and dropping the Proto suffix. The current names are unnecessarily long or aren't intuitive (e.g. columnFamily for something that describes the multiple things you're trying to get out of a row) or are too redundant (e.g. KeyType keyType). bq. bq. Regarding the lack of multi RPC, I think this is a good thing. multi is a big mess that was only marginally better than its horrible multiPut predecessor. This proposal already supports multi-everything, it just doesn't support batching different kind of operations in the same RPC, which isn't a big deal IMO. bq. bq. Michael Stack wrote: bq. 
We should implement what Benoît is asking for, probably not all as part of this issue. That said, if possible can we try and accomodate what he's asking for down here at the rpc level? I suppose once all is pb, it should be easy enough adding new stuff but it would be good to keep in mind what he's asking while redoing this layer. In a later issue we can add the overloads that exploit the additions or add the new methods B wants (What B is asking for are long-time outstanding fixups needed in hbase). For example, can the pb response on open of a scanner be more than just the scanner id; could it include an optional result item? Or I suppose, once up on pb, we can do this easily enough later? bq. bq. The idea is not to break the existing client application code. So the new interface should be able to do the same thing and more. By the way, I have changed the interfaces a lot after several reviews so I closed this review. I will post a new review later. As to scanner, we cannot retrieve everything in one shot. So in the RPC layer, there must be multiple trips. As to the function you mentioned, it can be built in the client side, right? I will add an option to return results in opening a scanner, and an option to close the scanner in fetching from the scanner. Ok, I will try to shorten the names. As to multi, I am not sure. This proposal doesn't support mix different kind of operations in different order in the same RPC. I may need to add a similar one if we don't want to break the existing function. bq. On 2012-03-02 10:30:06, Benoit Sigoure wrote: bq. src/main/proto/HRegionProtocol.proto, line 22 bq. https://reviews.apache.org/r/4054/diff/1/?file=86003#file86003line22 bq. bq. I find the Proto suffix unnecessary and long. If you truly want a suffix, PB would be shorter, but no suffix would be better IMO. Ok, I will remove the Proto suffix. bq. On 2012-03-02 10:30:06, Benoit Sigoure wrote: bq. src/main/proto/HRegionProtocol.proto, line 25 bq. 
https://reviews.apache.org/r/4054/diff/1/?file=86003#file86003line25 bq. bq. Use option optimize_for = SPEED, it makes a big difference. Cool. I will add it. bq. On 2012-03-02 10:30:06, Benoit Sigoure wrote: bq. src/main/proto/HRegionProtocol.proto, line 28 bq. https://reviews.apache.org/r/4054/diff/1/?file=86003#file86003line28 bq. bq. I'd call this just Columns. I changed it to ColumnProto. bq. On 2012-03-02 10:30:06, Benoit Sigoure wrote: bq. src/main/proto/HRegionProtocol.proto, line 30 bq. https://reviews.apache.org/r/4054/diff/1/?file=86003#file86003line30 bq. bq. I would recommend to pluralize all repeated fields. This will make for nicer code where you'll be able to write something along the lines of: bq. bq. for (byte[] qualifier : pb.qualifiers()) For the repeated fields, the generated code will have method
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221122#comment-13221122 ] Devaraj Das commented on HBASE-5451: bq. Can we avoid the copy in the interim by having a convention that, if the request is a protobuf, then we send it following the call envelope rather than inside it? (does that make sense?) I explored this route but seems like it's not straightforward to do this (due to the fact that there are assumptions made on the order of data-length and data on the server, and I'd have to make changes to that to accommodate sending another set of bytes after the call envelope .. messy). I propose we leave the copy around and fix it by introducing something similar to ProtobufRpcEngine (of Hadoop) that would use native PBs everywhere. Of course we have to complete moving all protocols to PB. If people agree with me, I can submit a patch with only the path for the generated classes changed to what Jimmy suggested. Thoughts? Switch RPC call envelope/headers to PBs --- Key: HBASE-5451 URL: https://issues.apache.org/jira/browse/HBASE-5451 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Affects Versions: 0.94.0 Reporter: Todd Lipcon Assignee: Devaraj Das Fix For: 0.96.0 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221124#comment-13221124 ] Todd Lipcon commented on HBASE-5074: There's no benefit to CRC32C over CRC32 unless you can use the JNI code. I don't think copy-pasting all of the JNI stuff into HBase is a good idea. And, besides, this patch is not yet equipped to do the JNI-based checksumming (which requires direct buffers, etc) support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
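The idea behind HBASE-5074, keeping the checksum next to the data so that one read fetches both, can be illustrated with the JDK's pure-Java CRC32. This is a minimal sketch, not the patch's format: the layout (payload followed by a 4-byte CRC32) and the class name InlineChecksumBlock are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

class InlineChecksumBlock {
    // Assumed layout for this sketch: [payload bytes][4-byte CRC32 of payload]
    static byte[] encode(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + 4)
                .put(payload)
                .putInt((int) crc.getValue())
                .array();
    }

    // Verifies the trailing checksum and returns the payload.
    static byte[] decode(byte[] block) {
        int payloadLen = block.length - 4;
        if (payloadLen < 0) {
            throw new IllegalArgumentException("block too short");
        }
        CRC32 crc = new CRC32();
        crc.update(block, 0, payloadLen);
        int stored = ByteBuffer.wrap(block, payloadLen, 4).getInt();
        if (stored != (int) crc.getValue()) {
            throw new IllegalStateException("checksum mismatch");
        }
        return Arrays.copyOf(block, payloadLen);
    }
}
```

Because data and checksum travel in the same byte range, verification needs no second file read, which is the two-iops problem the issue description names. The example uses plain CRC32, in line with Todd Lipcon's point that CRC32C brings no benefit without the JNI code.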
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221127#comment-13221127 ] Todd Lipcon commented on HBASE-5451: Sounds reasonable, thanks Devaraj. Switch RPC call envelope/headers to PBs --- Key: HBASE-5451 URL: https://issues.apache.org/jira/browse/HBASE-5451 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Affects Versions: 0.94.0 Reporter: Todd Lipcon Assignee: Devaraj Das Fix For: 0.96.0 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5341) Push the security 0.92 profile to maven repo
[ https://issues.apache.org/jira/browse/HBASE-5341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5341: - Priority: Blocker (was: Major) Making blocker on 0.92.1 Push the security 0.92 profile to maven repo Key: HBASE-5341 URL: https://issues.apache.org/jira/browse/HBASE-5341 Project: HBase Issue Type: Improvement Components: build, security Affects Versions: 0.92.1, 0.94.0 Reporter: Enis Soztutar Assignee: Enis Soztutar Priority: Blocker Fix For: 0.92.1 Hbase 0.92.0 was released with two artifacts, plain and security. The security code is built with -Psecurity. There are two tarballs, but only the plain jar in maven repo at repository.a.o. I see no reason to do a separate artifact for the security related code, since 0.92 already depends on secure Hadoop 1.0.0, and all of the security related code is not loaded by default. In this issue, I propose, we merge the code under /security to src/ and remove the maven profile. Edit: after some discussion, and the plans for modularizing the build to include a security module, we changed the issue description to push the security jars in 0.92.1 to maven repo. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5506) Add unit test for ThriftServerRunner.HbaseHandler.getRegionInfo()
[ https://issues.apache.org/jira/browse/HBASE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5506: -- Fix Version/s: 0.96.0 Integrated to TRUNK. Thanks for the patch, Scott. Thanks for the review, Stack and Dhruba. Add unit test for ThriftServerRunner.HbaseHandler.getRegionInfo() - Key: HBASE-5506 URL: https://issues.apache.org/jira/browse/HBASE-5506 Project: HBase Issue Type: Test Reporter: Scott Chen Assignee: Scott Chen Priority: Minor Fix For: 0.96.0 Attachments: HBASE-5506.D2031.1.patch, HBASE-5506.D2031.2.patch, HBASE-5506.D2031.3.patch We observed that with the framed transport option, the Thrift call ThriftServerRunner.HbaseHandler.getRegionInfo() receives a corrupted parameter (a garbage string prepended to it). This may be a Thrift bug that requires further investigation. Add a unit test to reproduce the problem.
[jira] [Updated] (HBASE-5508) Add an option to allow test output to show on the terminal
[ https://issues.apache.org/jira/browse/HBASE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5508: -- Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Integrated to TRUNK. Thanks for the patch Scott. Thanks for the review Stack and Lars. Add an option to allow test output to show on the terminal -- Key: HBASE-5508 URL: https://issues.apache.org/jira/browse/HBASE-5508 Project: HBase Issue Type: Improvement Components: test Reporter: Scott Chen Assignee: Scott Chen Priority: Minor Fix For: 0.96.0 Attachments: HBASE-5508.D2037.1.patch Sometimes it is useful to directly see the test results on the terminal. We can add a property to achieve that. mvn test -Dtest.output.tofile=false
[jira] [Commented] (HBASE-5508) Add an option to allow test output to show on the terminal
[ https://issues.apache.org/jira/browse/HBASE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221139#comment-13221139 ] Lars Hofhansl commented on HBASE-5508: -- Seems like a simple enough change for 0.94 too. Add an option to allow test output to show on the terminal -- Key: HBASE-5508 URL: https://issues.apache.org/jira/browse/HBASE-5508 Project: HBase Issue Type: Improvement Components: test Reporter: Scott Chen Assignee: Scott Chen Priority: Minor Fix For: 0.96.0 Attachments: HBASE-5508.D2037.1.patch Sometimes it is useful to directly see the test results on the terminal. We can add a property to achieve that. mvn test -Dtest.output.tofile=false
[jira] [Commented] (HBASE-5506) Add unit test for ThriftServerRunner.HbaseHandler.getRegionInfo()
[ https://issues.apache.org/jira/browse/HBASE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221150#comment-13221150 ] Scott Chen commented on HBASE-5506: --- Ted: Yes, that's the bug. The parameter should be {code} tableA,rowA,99\x0 {code} But we are seeing a string prepended to it, and this is causing the failure: {code} \x80\x01\x00\x01\x00\x00\x00\x0DgetRegionInfo\x00\x00\x00\x12\x0B\x00\x01\x00\x00\x00\x1AtableA,rowA,99\x0 {code} Thanks for the review, Ted, Stack and Dhruba! Add unit test for ThriftServerRunner.HbaseHandler.getRegionInfo() - Key: HBASE-5506 URL: https://issues.apache.org/jira/browse/HBASE-5506 Project: HBase Issue Type: Test Reporter: Scott Chen Assignee: Scott Chen Priority: Minor Fix For: 0.96.0 Attachments: HBASE-5506.D2031.1.patch, HBASE-5506.D2031.2.patch, HBASE-5506.D2031.3.patch We observed that with the framed transport option, the Thrift call ThriftServerRunner.HbaseHandler.getRegionInfo() receives a corrupted parameter (a garbage string prepended to it). This may be a Thrift bug that requires further investigation. Add a unit test to reproduce the problem.
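The prepended bytes in the HBASE-5506 report resemble a Thrift binary-protocol message header: a version and message-type word, the method name getRegionInfo with its length, a sequence id, and a string field header. One plausible explanation, which this thread does not confirm, is that the argument arrives as a ByteBuffer view into the whole received frame and the handler reads the buffer's entire backing array instead of only the bytes between position and limit. The sketch below reproduces that pitfall with the JDK alone; ByteBufferSlice and the "HEADER:" prefix are made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ByteBufferSlice {
    // Pitfall: array() returns the entire backing array and ignores
    // the buffer's position and limit.
    static byte[] wholeBackingArray(ByteBuffer buf) {
        return buf.array();
    }

    // Copies only the bytes between position and limit; reading through a
    // duplicate leaves the caller's buffer position untouched.
    static byte[] remainingBytes(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        // A fake "frame": 7 header bytes followed by the real argument.
        byte[] frame = "HEADER:tableA,rowA,99".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer arg = ByteBuffer.wrap(frame, 7, frame.length - 7);

        System.out.println(new String(wholeBackingArray(arg), StandardCharsets.US_ASCII));
        System.out.println(new String(remainingBytes(arg), StandardCharsets.US_ASCII));
    }
}
```

If this is what happens, it would also fit the observation that the corruption appears only with framed transport, since a transport that buffers a full frame is the case where an argument is likely to share its backing array with the rest of the message.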
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221151#comment-13221151 ] jirapos...@reviews.apache.org commented on HBASE-5443: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4054/ --- (Updated 2012-03-02 18:54:29.710858) Review request for hbase. Summary --- This is the first draft of the ProtoBuff HRegionProtocol. The corresponding java vs pb method mapping is attached to the jira: https://issues.apache.org/jira/browse/HBASE-5443 Please review. I'd like to move ahead after we get to some agreement. This addresses bug HBASE-5443. https://issues.apache.org/jira/browse/HBASE-5443 Diffs - pom.xml bb518b1 src/main/proto/RegionAdmin.proto PRE-CREATION src/main/proto/RegionClient.proto PRE-CREATION src/main/proto/hbase.proto PRE-CREATION Diff: https://reviews.apache.org/r/4054/diff Testing --- Thanks, Jimmy Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2038) Coprocessors: Region level indexing
[ https://issues.apache.org/jira/browse/HBASE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221153#comment-13221153 ] Lars Hofhansl commented on HBASE-2038: -- @Alex: Looks like preScannerOpen could actually change the passed Scan object and add a filter. The API is a bit strange. The Scan parameter is marked final, but final only fixes the reference, so it is still possible (and perfectly OK) to modify the object here. postScannerOpen also gets the Scan object, but modifying it there is pointless. @Anoop: Yep, for that we'd need to add INCLUDE_AND_SEEK_USING_HINT (similar to the INCLUDE_AND_SEEK_NEXT_ROW that we already have). Shouldn't be hard to add; I'm happy to do that if that's the route we want to go with this. Coprocessors: Region level indexing --- Key: HBASE-2038 URL: https://issues.apache.org/jira/browse/HBASE-2038 Project: HBase Issue Type: New Feature Components: coprocessors Reporter: Andrew Purtell Priority: Minor HBASE-2037 is a good candidate to be done as a coprocessor. It also serves as a good goalpost for coprocessor environment design -- there should be enough of it so region level indexing can be reimplemented as a coprocessor without any loss of functionality.
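To illustrate the point about final in the comment above: in Java a final parameter cannot be reassigned, but the object it refers to can still be modified, and the caller sees the change. The sketch below uses a hypothetical ScanSpec class as a stand-in; it is not the HBase Scan class or the coprocessor API.

```java
import java.util.ArrayList;
import java.util.List;

final class FinalParamDemo {
    /** Hypothetical stand-in for a scan specification; not the HBase Scan class. */
    static final class ScanSpec {
        final List<String> filters = new ArrayList<>();
    }

    /** Mimics a pre-open hook: the parameter is final, yet the object stays mutable. */
    static void preOpen(final ScanSpec scan) {
        // scan = new ScanSpec();          // would not compile: final parameter
        scan.filters.add("index-filter");  // allowed: mutates the caller's object
    }

    public static void main(String[] args) {
        ScanSpec scan = new ScanSpec();
        preOpen(scan);
        System.out.println(scan.filters); // the caller sees the added filter
    }
}
```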
[jira] [Commented] (HBASE-5508) Add an option to allow test output to show on the terminal
[ https://issues.apache.org/jira/browse/HBASE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221154#comment-13221154 ] Scott Chen commented on HBASE-5508: --- Lars and Stack: Thanks for the review. Ted: Thanks for integrating this. Add an option to allow test output to show on the terminal -- Key: HBASE-5508 URL: https://issues.apache.org/jira/browse/HBASE-5508 Project: HBase Issue Type: Improvement Components: test Reporter: Scott Chen Assignee: Scott Chen Priority: Minor Fix For: 0.96.0 Attachments: HBASE-5508.D2037.1.patch Sometimes it is useful to directly see the test results on the terminal. We can add a property to achieve that. mvn test -Dtest.output.tofile=false -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5509) MR based copier for copying HFiles (trunk version)
[ https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221152#comment-13221152 ] Zhihong Yu commented on HBASE-5509: --- The following already exists in FSTableDescriptors.java: {code} + public static boolean isTableInfoExists(FileSystem fs, Path tabledir) {code} Can the patch be refreshed based on current TRUNK ? MR based copier for copying HFiles (trunk version) -- Key: HBASE-5509 URL: https://issues.apache.org/jira/browse/HBASE-5509 Project: HBase Issue Type: Sub-task Components: documentation, regionserver Reporter: Karthik Ranganathan Assignee: Lars Hofhansl Fix For: 0.94.0, 0.96.0 Attachments: 5509.txt This copier is a modification of the distcp tool in HDFS. It does the following: 1. List out all the regions in the HBase cluster for the required table 2. Write the above out to a file 3. Each mapper 3.1 lists all the HFiles for a given region by querying the regionserver 3.2 copies all the HFiles 3.3 outputs success if the copy succeeded, failure otherwise. Failed regions are retried in another loop 4. Mappers are placed on nodes which have maximum locality for a given region to speed up copying -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221155#comment-13221155 ] Jimmy Xiang commented on HBASE-5443: I updated the review with a new diff that incorporates the feedback from all reviewers. Thanks a lot for the reviews. Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221157#comment-13221157 ] stack commented on HBASE-5451: -- Can we have hbase go all-pb for hbase 0.96.0? Switch RPC call envelope/headers to PBs --- Key: HBASE-5451 URL: https://issues.apache.org/jira/browse/HBASE-5451 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Affects Versions: 0.94.0 Reporter: Todd Lipcon Assignee: Devaraj Das Fix For: 0.96.0 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221158#comment-13221158 ] dhruba borthakur commented on HBASE-5074: - The reason I kept the definition of CRC32C in the ChecksumType is essentially to reserve an ordinal in the enum for this checksum algorithm in the future. We should just wait for Hadoop 2.0 to be released to get this feature (instead of copying it to hbase). means that HFileV3 would start with minor version of 1. I am suggesting that HFileV3 has nothing to do with minorVersions. HFileV3 can decide to support minor version 0 or 1 or both. HFileV3 might not even use the HFileBlock format as we know it, in which case the question is moot. support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. 
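The idea of reserving a slot for CRC32C in the checksum enum can be sketched as follows. This is a minimal illustration, assuming the on-disk format stores a small numeric code per checksum type; the enum ChecksumKind, its codes, and its methods are hypothetical and are not the actual HBase ChecksumType.

```java
import java.util.zip.Checksum;

enum ChecksumKind {
    CRC32((byte) 1),
    // Code reserved now so that files written later can name this algorithm
    // without renumbering; the implementation is deliberately left out.
    CRC32C((byte) 2);

    final byte code;

    ChecksumKind(byte code) {
        this.code = code;
    }

    /** Creates a checksum instance, or fails for kinds that are only reserved. */
    Checksum newChecksum() {
        switch (this) {
            case CRC32:
                return new java.util.zip.CRC32();
            default:
                throw new UnsupportedOperationException(name() + " is reserved but not implemented");
        }
    }

    /** Maps a stored code back to its kind. */
    static ChecksumKind fromCode(byte code) {
        for (ChecksumKind kind : values()) {
            if (kind.code == code) {
                return kind;
            }
        }
        throw new IllegalArgumentException("unknown checksum code " + code);
    }
}
```

Using an explicit code rather than the enum ordinal is a design choice of this sketch: it keeps the stored value stable even if constants are reordered later.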
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221159#comment-13221159 ] stack commented on HBASE-5451: -- That's a dumb question. Let me rephrase. Won't hbase be all pb natively by 0.96.0? Switch RPC call envelope/headers to PBs --- Key: HBASE-5451 URL: https://issues.apache.org/jira/browse/HBASE-5451 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Affects Versions: 0.94.0 Reporter: Todd Lipcon Assignee: Devaraj Das Fix For: 0.96.0 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221163#comment-13221163 ] Devaraj Das commented on HBASE-5451: bq. Won't hbase be all pb natively by 0.96.0? That should be the goal, rather a blocker *smile* Switch RPC call envelope/headers to PBs --- Key: HBASE-5451 URL: https://issues.apache.org/jira/browse/HBASE-5451 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Affects Versions: 0.94.0 Reporter: Todd Lipcon Assignee: Devaraj Das Fix For: 0.96.0 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5444) Add PB-based calls to HMasterRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221165#comment-13221165 ] jirapos...@reviews.apache.org commented on HBASE-5444: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4149/#review5570 --- Please check Benoit's comments on https://reviews.apache.org/r/4054/ about shorten the names. I updated https://reviews.apache.org/r/4054/ with a new diff. It has ServeName and other shared protos. src/main/proto/HMasterRegionProtocol.proto https://reviews.apache.org/r/4149/#comment12061 Please put generated at the end. - Jimmy On 2012-03-02 01:30:36, Gregory Chanan wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4149/ bq. --- bq. bq. (Updated 2012-03-02 01:30:36) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. Protobuf work for HMasterRegionInterface. bq. bq. No need to comment on the pom.xml changes: I just copied those from HBASE-5443 (https://reviews.apache.org/r/4054/). bq. bq. bq. This addresses bug HBASE-5444. bq. https://issues.apache.org/jira/browse/HBASE-5444 bq. bq. bq. Diffs bq. - bq. bq.pom.xml 0f0aa9a bq.src/main/proto/HMasterRegionProtocol.proto PRE-CREATION bq.src/main/proto/hbase.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4149/diff bq. bq. bq. Testing bq. --- bq. bq. mvn -DskipTests package successful and files generated successfully bq. bq. bq. Thanks, bq. bq. Gregory bq. bq. Add PB-based calls to HMasterRegionInterface Key: HBASE-5444 URL: https://issues.apache.org/jira/browse/HBASE-5444 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Gregory Chanan -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221166#comment-13221166 ] Jimmy Xiang commented on HBASE-5451: I hope we can. I know the RPC won't be backward compatible. How about the client code? We definitely won't break any existing client applications, right? Switch RPC call envelope/headers to PBs --- Key: HBASE-5451 URL: https://issues.apache.org/jira/browse/HBASE-5451 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Affects Versions: 0.94.0 Reporter: Todd Lipcon Assignee: Devaraj Das Fix For: 0.96.0 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221170#comment-13221170 ] stack commented on HBASE-5451: -- Client APIs should be the same, but yeah, let's get up on pb before 0.96.0; a blocker as per DD. Switch RPC call envelope/headers to PBs --- Key: HBASE-5451 URL: https://issues.apache.org/jira/browse/HBASE-5451 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Affects Versions: 0.94.0 Reporter: Todd Lipcon Assignee: Devaraj Das Fix For: 0.96.0 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2
[jira] [Commented] (HBASE-5508) Add an option to allow test output to show on the terminal
[ https://issues.apache.org/jira/browse/HBASE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221169#comment-13221169 ] Zhihong Yu commented on HBASE-5508: --- Integrated to 0.94 as well. Add an option to allow test output to show on the terminal -- Key: HBASE-5508 URL: https://issues.apache.org/jira/browse/HBASE-5508 Project: HBase Issue Type: Improvement Components: test Reporter: Scott Chen Assignee: Scott Chen Priority: Minor Fix For: 0.96.0 Attachments: HBASE-5508.D2037.1.patch Sometimes it is useful to directly see the test results on the terminal. We can add a property to achieve that. mvn test -Dtest.output.tofile=false -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5504) Online Merge
[ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221179#comment-13221179 ] stack commented on HBASE-5504: -- @Ted that takes me to a comment I made. I still am without understanding. Online Merge Key: HBASE-5504 URL: https://issues.apache.org/jira/browse/HBASE-5504 Project: HBase Issue Type: Brainstorming Components: client, master, shell, zookeeper Affects Versions: 0.94.0 Reporter: Mubarak Seyed Fix For: 0.96.0 As discussed, please refer the discussion at [HBASE-4991|https://issues.apache.org/jira/browse/HBASE-4991] Design suggestion from Stack: {quote} I suggest a design below. It has some prerequisites, some general function that this feature could use (and others). The prereqs if you think them good, could be done outside of this JIRA. Here's a suggested rough outline of how I think this feature should run. The feature I'm describing below is merge and deleteRegion for I see them as in essence the same thing. (C) Client, (M) Master, RS (Region server), ZK (ZooKeeper) 1. Client calls merge or deleteRegion API. API is a range of rows. (C) 2. Master gets call. (M) 3. Master obtains a write lock on table so it can't be disabled from under us. The write lock will also disable splitting. This is one of the prereqs I think. Its HBASE-5494 (Or we could just do something simpler where we have a flag up in zk that splitRegion checks but thats less useful I think; OR we do the dynamic configs issue and set splits to off via a config. change). There'd be a timer for how long we wait on the table lock. (M - ZK) 4. If we get the lock, write intent to merge a range up into zk. It also hoists into zk if its a pure merge or a merge that drops the region data (a deleteRegion call) (M) 5. Return to the client either our failed attempt at locking the table or an id of some sort used identifying this running operation; can use it querying status. (M - C) 6. Turn off balancer. 
TODO/prereq: Do it in a way that is persisted. Balancer switch currently in memory only so if master crashes, new master will come up in balancing mode # (If we had dynamic config. could hoist up to zk a config. that disables the balancer rather than have a balancer-specific flag/znode OR if a write lock outstanding on a table, then the balancer does not balance regions in the locked table - this latter might be the easiest to do) (M) 7. Write into zk that just turned off the balancer (If it was on) (M - ZK) 8. Get regions that are involved in the span (M) 9. Hoist the list up into zk. (M - ZK) 10. Create region to span the range. (M) 11. Write that we did this up into zk. (M - ZK) 12. Close regions in parallel. Confirm close in parallel. (M - RS) 13. Write up into zk regions closed (This might not be necessary since can ask if region is open). (M - ZK) 14. If a merge and not a delete region, move files under new region. Might multithread this (moves should go pretty fast). If a deleteregion, we skip this step. (M) 15. On completion mark zk (though may not be necessary since its easy to look in fs to see state of move). (M - ZK) 16. Edit .META. (M) 17. Confirm edits went in. (M) 18. Move old regions to hbase trash folder TODO: There is no trash folder under /hbase currently. We should add one. (M) 19. Enable balancer (if it was off) (M) 20. Unlock table (M) {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
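The outline above repeatedly records progress in ZooKeeper so that the operation can be picked up again after a master failure. A rough sketch of that pattern follows, with an in-memory set standing in for the znode. Everything here (MergeJournalSketch, the Step names, the coarse grouping of the twenty steps into eleven) is hypothetical and only meant to show the resume-from-journal idea, not a proposed implementation.

```java
import java.util.EnumSet;
import java.util.Set;

final class MergeJournalSketch {
    /** Coarse phases loosely following the outline; the grouping is an assumption. */
    enum Step {
        LOCK_TABLE, RECORD_INTENT, DISABLE_BALANCER, LIST_REGIONS, CREATE_REGION,
        CLOSE_REGIONS, MOVE_FILES, EDIT_META, TRASH_OLD_REGIONS, ENABLE_BALANCER, UNLOCK_TABLE
    }

    /** In-memory stand-in for the znode that would record completed steps. */
    private final Set<Step> completed = EnumSet.noneOf(Step.class);

    /** Marks a step as already done, as a new master would learn from the journal. */
    void markDone(Step step) {
        completed.add(step);
    }

    /**
     * Runs the remaining steps in order and returns how many were executed.
     * A deleteRegion call skips the file move, as in step 14 of the outline.
     */
    int run(boolean deleteRegion) {
        int executed = 0;
        for (Step step : Step.values()) {
            if (completed.contains(step)) {
                continue; // finished before the restart
            }
            if (deleteRegion && step == Step.MOVE_FILES) {
                completed.add(step); // nothing to move when the data is dropped
                continue;
            }
            // The real work for the step would happen here.
            completed.add(step);
            executed++;
        }
        return executed;
    }
}
```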
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221181#comment-13221181 ] jirapos...@reviews.apache.org commented on HBASE-5443: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4054/#review5573 --- This is a lot better already. One thing this doesn't address that I should've mentioned in my previous review is that the requests and responses still have a lot of duplicate data. For example if I Get a row that contains 3 KeyValue, in the response, on the wire, I'll get 3 times the key and 3 times the family. pom.xml https://reviews.apache.org/r/4054/#comment12066 You didn't take into account my comments on fixing this shell scripting from the previous iteration. src/main/proto/RegionClient.proto https://reviews.apache.org/r/4054/#comment12069 So a Get request can only fetch multiple Get from a single Region? That's not good. We need true multi-get, where you can fetch things from multiple regions on the same RegionServer at once. src/main/proto/RegionClient.proto https://reviews.apache.org/r/4054/#comment12068 trailing whitespaces src/main/proto/RegionClient.proto https://reviews.apache.org/r/4054/#comment12070 I don't know if we should let the client specify the TTL. Right now in HBase the TTL is hardcoded in the Configuration object of the RegionServer. Actually I'm fine with allowing clients specify their own TTL as long as we bound the TTL with the servers' Configuration. src/main/proto/hbase.proto https://reviews.apache.org/r/4054/#comment12071 I still don't understand how these can be optional. - Benoit On 2012-03-02 18:54:29, Jimmy Xiang wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4054/ bq. --- bq. bq. (Updated 2012-03-02 18:54:29) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. 
This is the first draft of the ProtoBuff HRegionProtocol. The corresponding java vs pb method mapping is attached to the jira: https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. Please review. I'd like to move ahead after we get to some agreement. bq. bq. bq. This addresses bug HBASE-5443. bq. https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. bq. Diffs bq. - bq. bq.pom.xml bb518b1 bq.src/main/proto/RegionAdmin.proto PRE-CREATION bq.src/main/proto/RegionClient.proto PRE-CREATION bq.src/main/proto/hbase.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4054/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Jimmy bq. bq. Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
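Benoit's suggestion of letting clients ask for a TTL while bounding it by the server's configuration could look roughly like this. The class and method names are hypothetical, and treating a missing or non-positive request as "use the server value" is an assumption of the sketch.

```java
final class ScannerTtl {
    /**
     * Returns the TTL to apply: the client's request capped by the server's
     * configured maximum, or the server value when the client sent none.
     */
    static long effectiveTtlMillis(Long requestedMillis, long serverMaxMillis) {
        if (requestedMillis == null || requestedMillis <= 0) {
            return serverMaxMillis;
        }
        return Math.min(requestedMillis, serverMaxMillis);
    }
}
```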
[jira] [Created] (HBASE-5512) Add support for INCLUDE_AND_SEEK_USING_HINT
Add support for INCLUDE_AND_SEEK_USING_HINT --- Key: HBASE-5512 URL: https://issues.apache.org/jira/browse/HBASE-5512 Project: HBase Issue Type: Improvement Reporter: Zhihong Yu This came up from HBASE-2038 From Anoop: - What we want from the filter is to include a row and then seek to the next row we are interested in. I can't see such a facility in our Filter right now. Correct me if I am wrong. Suppose we have already seeked to a row that needs to be included in the result; the Filter should then return INCLUDE. Only when the following next() call happens can we return SEEK_USING_HINT, so one extra row read is needed. This might even cause one unwanted HFileBlock fetch (who knows). Can we add reseek() at a higher level? From Lars: Yep, for that we'd need to add INCLUDE_AND_SEEK_USING_HINT (similar to the INCLUDE_AND_SEEK_NEXT_ROW that we already have). Shouldn't be hard to add; I'm happy to do that if that's the route we want to go with this.
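The extra row read Anoop describes can be shown with a toy model. The sketch below counts how many rows a scanner touches when a filter can only return INCLUDE and then seek on the following row, versus a combined include-and-seek code. Rows are plain integers, the hint is simply the next wanted row, and all names are hypothetical; this does not use the HBase Filter API.

```java
import java.util.TreeSet;

final class SeekHintSketch {
    /**
     * Counts rows touched while scanning rows 0..rowCount-1 for the wanted rows.
     * combinedCode=true models INCLUDE_AND_SEEK_USING_HINT; false models
     * INCLUDE followed by SEEK_USING_HINT on the next row.
     */
    static int rowsRead(int rowCount, TreeSet<Integer> wanted, boolean combinedCode) {
        int read = 0;
        int row = wanted.isEmpty() ? rowCount : wanted.first(); // initial seek
        while (row < rowCount) {
            read++;
            Integer nextWanted = wanted.higher(row);
            int seekTarget = (nextWanted == null) ? rowCount : nextWanted;
            if (wanted.contains(row) && !combinedCode) {
                row = row + 1;    // INCLUDE: the scanner just moves to the adjacent row
            } else {
                row = seekTarget; // seek straight to the hinted row
            }
        }
        return read;
    }
}
```

With wanted rows 10, 50 and 90 out of 100, the combined code touches only the three wanted rows, while the two-step variant also touches rows 11, 51 and 91 before it can seek.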
[jira] [Commented] (HBASE-2038) Coprocessors: Region level indexing
[ https://issues.apache.org/jira/browse/HBASE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221185#comment-13221185 ] Zhihong Yu commented on HBASE-2038: --- I logged HBASE-5512 for adding INCLUDE_AND_SEEK_USING_HINT. Coprocessors: Region level indexing --- Key: HBASE-2038 URL: https://issues.apache.org/jira/browse/HBASE-2038 Project: HBase Issue Type: New Feature Components: coprocessors Reporter: Andrew Purtell Priority: Minor HBASE-2037 is a good candidate to be done as a coprocessor. It also serves as a good goalpost for coprocessor environment design -- there should be enough of it so region level indexing can be reimplemented as a coprocessor without any loss of functionality.
[jira] [Updated] (HBASE-5512) Add support for INCLUDE_AND_SEEK_USING_HINT
[ https://issues.apache.org/jira/browse/HBASE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-5512: - Assignee: Lars Hofhansl Add support for INCLUDE_AND_SEEK_USING_HINT --- Key: HBASE-5512 URL: https://issues.apache.org/jira/browse/HBASE-5512 Project: HBase Issue Type: Improvement Reporter: Zhihong Yu Assignee: Lars Hofhansl This came up from HBASE-2038 From Anoop: - What we want from the filter is to include a row and then seek to the next row we are interested in. I can't see such a facility in our Filter right now. Correct me if I am wrong. Suppose we have already seeked to a row that needs to be included in the result; the Filter should then return INCLUDE. Only when the following next() call happens can we return SEEK_USING_HINT, so one extra row read is needed. This might even cause one unwanted HFileBlock fetch (who knows). Can we add reseek() at a higher level? From Lars: Yep, for that we'd need to add INCLUDE_AND_SEEK_USING_HINT (similar to the INCLUDE_AND_SEEK_NEXT_ROW that we already have). Shouldn't be hard to add; I'm happy to do that if that's the route we want to go with this.
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221191#comment-13221191 ] jirapos...@reviews.apache.org commented on HBASE-5443: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4054/#review5575 --- This is another option for scan. This way, we will have only one scan method, no need to open/next/close. Which one do you prefer? In the ScanRequest, either scannerId or scan must be specified, not both.

message Scan {
  required RegionSpecifier region = 1;
  repeated Column column = 2;
  repeated Attribute attribute = 3;
  optional bytes startRow = 4;
  optional bytes stopRow = 5;
  optional string filterName = 6;
  optional TimeRange timeRange = 7;
  optional uint32 maxVersions = 8 [default = 1];
  optional bool cacheBlocks = 9 [default = true];
  optional uint32 rowsToCache = 10;
  optional uint32 batchSize = 11;
}

message ScanRequest {
  optional uint64 scannerId = 1;
  optional Scan scan = 2;
  optional uint32 numberOfRows = 3;
  optional bool closeScanner = 4;
  optional uint32 ttl = 5;
}

message ScanResponse {
  repeated Result result = 1;
  optional uint64 scannerId = 2;
  optional bool moreResults = 3;
  optional uint32 ttl = 4;
}

- Jimmy On 2012-03-02 18:54:29, Jimmy Xiang wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4054/ bq. --- bq. bq. (Updated 2012-03-02 18:54:29) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. This is the first draft of the ProtoBuff HRegionProtocol. The corresponding java vs pb method mapping is attached to the jira: https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. Please review. I'd like to move ahead after we get to some agreement. bq. bq. bq. This addresses bug HBASE-5443. bq. https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. bq. Diffs bq. - bq.
bq.pom.xml bb518b1 bq.src/main/proto/RegionAdmin.proto PRE-CREATION bq.src/main/proto/RegionClient.proto PRE-CREATION bq.src/main/proto/hbase.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4054/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Jimmy bq. bq. Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
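The rule stated in the proposal above, that a ScanRequest carries either a scannerId or a scan but not both, cannot be expressed with two optional fields alone, so a server would presumably have to check it. A minimal sketch of such a check follows; the class ScanRequestCheck, the Kind enum, and the use of a String in place of the Scan message are all hypothetical and do not come from the patch under review.

```java
import java.util.Optional;

final class ScanRequestCheck {
    /** What the request asks for: opening a new scanner or continuing one. */
    enum Kind { OPEN, CONTINUE }

    /** Classifies a request; exactly one of scannerId and scan must be present. */
    static Kind classify(Optional<Long> scannerId, Optional<String> scan) {
        if (scannerId.isPresent() == scan.isPresent()) {
            throw new IllegalArgumentException("exactly one of scannerId or scan must be set");
        }
        return scan.isPresent() ? Kind.OPEN : Kind.CONTINUE;
    }
}
```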
[jira] [Commented] (HBASE-5444) Add PB-based calls to HMasterRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221192#comment-13221192 ] jirapos...@reviews.apache.org commented on HBASE-5444: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4149/#review5574 --- src/main/proto/HMasterRegionProtocol.proto https://reviews.apache.org/r/4149/#comment12072 Can we doc whats in here better? src/main/proto/HMasterRegionProtocol.proto https://reviews.apache.org/r/4149/#comment12073 Is this one k/v only? Isn't it possible to pass more than just the one k/v? An actual Map? src/main/proto/HMasterRegionProtocol.proto https://reviews.apache.org/r/4149/#comment12075 Doesn't exising interface have doc. Could we port that over? - Michael On 2012-03-02 01:30:36, Gregory Chanan wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4149/ bq. --- bq. bq. (Updated 2012-03-02 01:30:36) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. Protobuf work for HMasterRegionInterface. bq. bq. No need to comment on the pom.xml changes: I just copied those from HBASE-5443 (https://reviews.apache.org/r/4054/). bq. bq. bq. This addresses bug HBASE-5444. bq. https://issues.apache.org/jira/browse/HBASE-5444 bq. bq. bq. Diffs bq. - bq. bq.pom.xml 0f0aa9a bq.src/main/proto/HMasterRegionProtocol.proto PRE-CREATION bq.src/main/proto/hbase.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4149/diff bq. bq. bq. Testing bq. --- bq. bq. mvn -DskipTests package successful and files generated successfully bq. bq. bq. Thanks, bq. bq. Gregory bq. bq. Add PB-based calls to HMasterRegionInterface Key: HBASE-5444 URL: https://issues.apache.org/jira/browse/HBASE-5444 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Gregory Chanan -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221198#comment-13221198 ] jirapos...@reviews.apache.org commented on HBASE-5443: -- bq. On 2012-03-02 19:31:46, Jimmy Xiang wrote: bq. This is another option for scan. This way, we will have only one scan method, no need to open/next/close. bq. bq. Which one do you prefer? In the ScanRequest, either scannerId or scan must be specified, not both. bq. bq. message Scan { bq. required RegionSpecifier region = 1; bq. repeated Column column = 2; bq. repeated Attribute attribute = 3; bq. optional bytes startRow = 4; bq. optional bytes stopRow = 5; bq. optional string filterName = 6; bq. optional TimeRange timeRange = 7; bq. optional uint32 maxVersions = 8 [default = 1]; bq. optional bool cacheBlocks = 9 [default = true]; bq. optional uint32 rowsToCache = 10; bq. optional uint32 batchSize = 11; bq. } bq. bq. message ScanRequest { bq. optional uint64 scannerId = 1; bq. optional Scan scan = 2; bq. optional uint32 numberOfRows = 3; bq. optional bool closeScanner = 4; bq. optional uint32 ttl = 5; bq. } bq. bq. message ScanResponse { bq. repeated Result result = 1; bq. optional uint64 scannerId = 2; bq. optional bool moreResults = 3; bq. optional uint32 ttl = 4; bq. } bq. So we would do away with openScanner, next, and close, just do scan? Inside in the ScanRequest, we'd carry over the Scan specification each time? We'd be able to honor the current openScanner, next, close client-facing API but could add a new scan method to the public api that allowed passing the above specifications? Sounds good. - Michael --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4054/#review5575 --- On 2012-03-02 18:54:29, Jimmy Xiang wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4054/ bq. --- bq. bq. (Updated 2012-03-02 18:54:29) bq. bq. bq. 
Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. This is the first draft of the ProtoBuff HRegionProtocol. The corresponding java vs pb method mapping is attached to the jira: https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. Please review. I'd like to move ahead after we get to some agreement. bq. bq. bq. This addresses bug HBASE-5443. bq. https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. bq. Diffs bq. - bq. bq.pom.xml bb518b1 bq.src/main/proto/RegionAdmin.proto PRE-CREATION bq.src/main/proto/RegionClient.proto PRE-CREATION bq.src/main/proto/hbase.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4054/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Jimmy bq. bq. Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5444) Add PB-based calls to HMasterRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221209#comment-13221209 ] jirapos...@reviews.apache.org commented on HBASE-5444: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4149/#review5576 --- src/main/proto/HMasterRegionProtocol.proto https://reviews.apache.org/r/4149/#comment12076 will do. src/main/proto/HMasterRegionProtocol.proto https://reviews.apache.org/r/4149/#comment12077 You can, the name here sucks -- this is really a MapEntry. If you look at RegionServerStartupResponseProto, it contains a list of these entries. src/main/proto/HMasterRegionProtocol.proto https://reviews.apache.org/r/4149/#comment12084 Good idea. Not sure what the best form for comments is -- I don't see any way to have protobufs generate javadoc comments into the generated code. I'll figure something out. - Gregory On 2012-03-02 01:30:36, Gregory Chanan wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4149/ bq. --- bq. bq. (Updated 2012-03-02 01:30:36) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. Protobuf work for HMasterRegionInterface. bq. bq. No need to comment on the pom.xml changes: I just copied those from HBASE-5443 (https://reviews.apache.org/r/4054/). bq. bq. bq. This addresses bug HBASE-5444. bq. https://issues.apache.org/jira/browse/HBASE-5444 bq. bq. bq. Diffs bq. - bq. bq.pom.xml 0f0aa9a bq.src/main/proto/HMasterRegionProtocol.proto PRE-CREATION bq.src/main/proto/hbase.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4149/diff bq. bq. bq. Testing bq. --- bq. bq. mvn -DskipTests package successful and files generated successfully bq. bq. bq. Thanks, bq. bq. Gregory bq. bq. 
Add PB-based calls to HMasterRegionInterface Key: HBASE-5444 URL: https://issues.apache.org/jira/browse/HBASE-5444 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Gregory Chanan -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221218#comment-13221218 ] jirapos...@reviews.apache.org commented on HBASE-5443: --
bq. On 2012-03-02 19:31:46, Jimmy Xiang wrote:
bq. This is another option for scan. This way, we will have only one scan method, no need to open/next/close.
bq.
bq. Which one do you prefer? In the ScanRequest, either scannerId or scan must be specified, not both.
bq.
bq. message Scan {
bq.   required RegionSpecifier region = 1;
bq.   repeated Column column = 2;
bq.   repeated Attribute attribute = 3;
bq.   optional bytes startRow = 4;
bq.   optional bytes stopRow = 5;
bq.   optional string filterName = 6;
bq.   optional TimeRange timeRange = 7;
bq.   optional uint32 maxVersions = 8 [default = 1];
bq.   optional bool cacheBlocks = 9 [default = true];
bq.   optional uint32 rowsToCache = 10;
bq.   optional uint32 batchSize = 11;
bq. }
bq.
bq. message ScanRequest {
bq.   optional uint64 scannerId = 1;
bq.   optional Scan scan = 2;
bq.   optional uint32 numberOfRows = 3;
bq.   optional bool closeScanner = 4;
bq.   optional uint32 ttl = 5;
bq. }
bq.
bq. message ScanResponse {
bq.   repeated Result result = 1;
bq.   optional uint64 scannerId = 2;
bq.   optional bool moreResults = 3;
bq.   optional uint32 ttl = 4;
bq. }
bq.
bq.
bq. Michael Stack wrote:
bq. So we would do away with openScanner, next, and close, and just do scan? Inside the ScanRequest, we'd carry over the Scan specification each time? We'd be able to honor the current openScanner, next, close client-facing API but could add a new scan method to the public API that allowed passing the above specifications? Sounds good.
The only issue is that both are optional. Callers need to know to specify one. From documentation?
- Jimmy
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4054/#review5575 --- On 2012-03-02 18:54:29, Jimmy Xiang wrote: bq. bq. --- bq. This is an automatically generated e-mail. 
To reply, visit: bq. https://reviews.apache.org/r/4054/ bq. --- bq. bq. (Updated 2012-03-02 18:54:29) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. This is the first draft of the ProtoBuff HRegionProtocol. The corresponding java vs pb method mapping is attached to the jira: https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. Please review. I'd like to move ahead after we get to some agreement. bq. bq. bq. This addresses bug HBASE-5443. bq. https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. bq. Diffs bq. - bq. bq.pom.xml bb518b1 bq.src/main/proto/RegionAdmin.proto PRE-CREATION bq.src/main/proto/RegionClient.proto PRE-CREATION bq.src/main/proto/hbase.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4054/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Jimmy bq. bq. Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
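Since scannerId and scan are both declared optional in the proposed ScanRequest, the "exactly one of the two" rule discussed above cannot be expressed in the message definition itself and would have to be checked in code. The following is a minimal sketch of such a check, not the code under review: the ScanRequest class is a plain Python stand-in for the generated message, and classify_scan_request is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScanRequest:
    """Stand-in for the proposed ScanRequest message; None means 'field not set'."""
    scanner_id: Optional[int] = None      # scannerId in the proposal
    scan: Optional[dict] = None           # Scan specification, modelled as a plain dict
    number_of_rows: Optional[int] = None  # numberOfRows in the proposal
    close_scanner: bool = False           # closeScanner in the proposal


def classify_scan_request(req: ScanRequest) -> str:
    """Return 'open' for a new scan or 'continue' for an existing scanner.

    Raises ValueError when neither or both of scannerId and scan are set.
    """
    has_id = req.scanner_id is not None
    has_scan = req.scan is not None
    if has_id == has_scan:
        raise ValueError("exactly one of scannerId or scan must be set")
    return "continue" if has_id else "open"
```

Under this reading, the first call carries the Scan specification and later calls carry only the scannerId returned in the response.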
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221221#comment-13221221 ] jirapos...@reviews.apache.org commented on HBASE-5443: -- bq. On 2012-03-02 19:16:55, Benoit Sigoure wrote: bq. src/main/proto/RegionClient.proto, line 112 bq. https://reviews.apache.org/r/4054/diff/2/?file=87679#file87679line112 bq. bq. trailing whitespaces Will remove it. bq. On 2012-03-02 19:16:55, Benoit Sigoure wrote: bq. pom.xml, line 764 bq. https://reviews.apache.org/r/4054/diff/2/?file=87677#file87677line764 bq. bq. You didn't take into account my comments on fixing this shell scripting from the previous iteration. Sorry. I forgot it. Good idea. I will address it in the next diff. bq. On 2012-03-02 19:16:55, Benoit Sigoure wrote: bq. src/main/proto/hbase.proto, line 64 bq. https://reviews.apache.org/r/4054/diff/2/?file=87680#file87680line64 bq. bq. I still don't understand how these can be optional. Ok, I can make family and qualifier required. bq. On 2012-03-02 19:16:55, Benoit Sigoure wrote: bq. src/main/proto/RegionClient.proto, line 147 bq. https://reviews.apache.org/r/4054/diff/2/?file=87679#file87679line147 bq. bq. I don't know if we should let the client specify the TTL. Right now in HBase the TTL is hardcoded in the Configuration object of the RegionServer. bq. bq. Actually I'm fine with allowing clients specify their own TTL as long as we bound the TTL with the servers' Configuration. I see. I will remove it. We can add it back if server supports it later on. How about lockRow? Does the server support client specified TTL? bq. On 2012-03-02 19:16:55, Benoit Sigoure wrote: bq. src/main/proto/RegionClient.proto, line 63 bq. https://reviews.apache.org/r/4054/diff/2/?file=87679#file87679line63 bq. bq. So a Get request can only fetch multiple Get from a single Region? That's not good. We need true multi-get, where you can fetch things from multiple regions on the same RegionServer at once. 
I can move the region into Get. That means each Get needs to specify a region, so the same region may be repeated across Gets. Another option is to add a message such as GetGroup, which has a region and a set of Gets. A third option is to make region optional in the request and add an optional region to Get; the region in GetRequest is then used when a Get does not specify one. What do you think? - Jimmy --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4054/#review5573 --- On 2012-03-02 18:54:29, Jimmy Xiang wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4054/ bq. --- bq. bq. (Updated 2012-03-02 18:54:29) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. This is the first draft of the ProtoBuff HRegionProtocol. The corresponding java vs pb method mapping is attached to the jira: https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. Please review. I'd like to move ahead after we get to some agreement. bq. bq. bq. This addresses bug HBASE-5443. bq. https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. bq. Diffs bq. - bq. bq.pom.xml bb518b1 bq.src/main/proto/RegionAdmin.proto PRE-CREATION bq.src/main/proto/RegionClient.proto PRE-CREATION bq.src/main/proto/hbase.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4054/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Jimmy bq. bq. Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf -- This message is automatically generated by JIRA. 
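The third multi-get option above (an optional region on the request, overridden by an optional region on each Get) can be sketched as follows. This is only an illustration of the fallback rule, under the assumption that a Get can be modelled as a (row, region) pair; group_gets_by_region is a hypothetical name, not part of the patch.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# A Get is modelled as (row, region); region is None when the Get does not set it.
Get = Tuple[bytes, Optional[str]]


def group_gets_by_region(gets: List[Get],
                         default_region: Optional[str] = None) -> Dict[str, List[bytes]]:
    """Group rows by the region that should serve them.

    A Get's own region wins; otherwise the request-level default is used.
    A Get with no region and no default is rejected.
    """
    groups: Dict[str, List[bytes]] = defaultdict(list)
    for row, region in gets:
        effective = region if region is not None else default_region
        if effective is None:
            raise ValueError(f"no region given for row {row!r}")
        groups[effective].append(row)
    return dict(groups)
```

The grouped result has the same shape as the GetGroup alternative (one region plus a set of Gets), so the two options differ mainly in whether the grouping is done by the client or by the server.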
[jira] [Commented] (HBASE-5512) Add support for INCLUDE_AND_SEEK_USING_HINT
[ https://issues.apache.org/jira/browse/HBASE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221219#comment-13221219 ] Lars Hofhansl commented on HBASE-5512: -- I misspoke slightly. Filters can only return INCLUDE, SKIP, NEXT_COL, NEXT_ROW, and SEEK_NEXT_USING_HINT. ColumnTrackers have the additional options of INCLUDE_AND_XXX. Add support for INCLUDE_AND_SEEK_USING_HINT --- Key: HBASE-5512 URL: https://issues.apache.org/jira/browse/HBASE-5512 Project: HBase Issue Type: Improvement Reporter: Zhihong Yu Assignee: Lars Hofhansl This came up from HBASE-2038 From Anoop: - What we wanted from the filter is to include a row and then seek to the next row which we are interested in. I can't see such a facility with our Filter right now. Correct me if I am wrong. So suppose we have already seeked to one row and it needs to be included in the result; then the Filter should return INCLUDE. Only when the next next() call happens can we return a SEEK_USING_HINT. So one extra row read is needed. This might even cause one unwanted HFileBlock fetch (who knows). Can we add reseek() at a higher level? From Lars: Yep, for that we'd need to add INCLUDE_AND_SEEK_USING_HINT (similar to the INCLUDE_AND_SEEK_NEXT_ROW that we already have). Shouldn't be hard to add, I'm happy to do that, if that's the route we want to go with this.
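The "one extra row read" that Anoop describes can be made concrete with a toy model. The sketch below is not HBase code: rows are plain sorted strings, the filter is reduced to a sorted list of wanted rows, and scan_rows and next_wanted are hypothetical names. With combined=False the scanner behaves like INCLUDE followed by a seek on the next call; with combined=True it behaves like the proposed INCLUDE_AND_SEEK_USING_HINT.

```python
import bisect
from typing import List, Optional, Tuple


def next_wanted(wanted: List[str], row: str) -> Optional[str]:
    """Seek hint: the first wanted row strictly after `row`, or None."""
    i = bisect.bisect_right(wanted, row)
    return wanted[i] if i < len(wanted) else None


def scan_rows(rows: List[str], wanted: List[str], combined: bool) -> Tuple[List[str], int]:
    """Return (included rows, number of rows the scanner had to examine).

    Both lists must be sorted.
    """
    wanted_set = set(wanted)
    included: List[str] = []
    examined = 0
    pos = 0
    while pos < len(rows):
        row = rows[pos]
        examined += 1
        if row in wanted_set:
            included.append(row)
            if not combined:
                pos += 1  # plain INCLUDE: the scanner simply moves to the next row
                continue
        hint = next_wanted(wanted, row)
        if hint is None:
            break  # nothing further is wanted
        pos = bisect.bisect_left(rows, hint)  # seek to the hinted row
    return included, examined
```

For rows a..f with a, d and f wanted, the two-step variant examines b and e only to learn where to seek next, while the combined variant examines just the three wanted rows.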
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221228#comment-13221228 ] jirapos...@reviews.apache.org commented on HBASE-5443: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4054/#review5583 --- src/main/proto/RegionAdmin.proto https://reviews.apache.org/r/4054/#comment12092 If we are getting rid of Proto in the message names, might as well get rid of it here too. src/main/proto/RegionClient.proto https://reviews.apache.org/r/4054/#comment12094 Here too. src/main/proto/hbase.proto https://reviews.apache.org/r/4054/#comment12095 Here too. src/main/proto/hbase.proto https://reviews.apache.org/r/4054/#comment12091 You don't have option optimize_for = SPEED; here. - Gregory On 2012-03-02 18:54:29, Jimmy Xiang wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4054/ bq. --- bq. bq. (Updated 2012-03-02 18:54:29) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. This is the first draft of the ProtoBuff HRegionProtocol. The corresponding java vs pb method mapping is attached to the jira: https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. Please review. I'd like to move ahead after we get to some agreement. bq. bq. bq. This addresses bug HBASE-5443. bq. https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. bq. Diffs bq. - bq. bq.pom.xml bb518b1 bq.src/main/proto/RegionAdmin.proto PRE-CREATION bq.src/main/proto/RegionClient.proto PRE-CREATION bq.src/main/proto/hbase.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4054/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Jimmy bq. bq. 
Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5444) Add PB-based calls to HMasterRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221229#comment-13221229 ] Zhihong Yu commented on HBASE-5444: --- I found this issue w.r.t. javadoc in protobufs generated code: http://code.google.com/p/protobuf/issues/detail?id=148 Add PB-based calls to HMasterRegionInterface Key: HBASE-5444 URL: https://issues.apache.org/jira/browse/HBASE-5444 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Gregory Chanan -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221231#comment-13221231 ] jirapos...@reviews.apache.org commented on HBASE-5443: -- bq. On 2012-03-02 20:07:30, Gregory Chanan wrote: bq. src/main/proto/hbase.proto, line 23 bq. https://reviews.apache.org/r/4054/diff/2/?file=87680#file87680line23 bq. bq. You don't have option optimize_for = SPEED; here. will add. bq. On 2012-03-02 20:07:30, Gregory Chanan wrote: bq. src/main/proto/RegionAdmin.proto, line 22 bq. https://reviews.apache.org/r/4054/diff/2/?file=87678#file87678line22 bq. bq. If we are getting rid of Proto in the message names, might as well get rid of it here too. This one, I am not sure? - Jimmy --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4054/#review5583 --- On 2012-03-02 18:54:29, Jimmy Xiang wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4054/ bq. --- bq. bq. (Updated 2012-03-02 18:54:29) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. This is the first draft of the ProtoBuff HRegionProtocol. The corresponding java vs pb method mapping is attached to the jira: https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. Please review. I'd like to move ahead after we get to some agreement. bq. bq. bq. This addresses bug HBASE-5443. bq. https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. bq. Diffs bq. - bq. bq.pom.xml bb518b1 bq.src/main/proto/RegionAdmin.proto PRE-CREATION bq.src/main/proto/RegionClient.proto PRE-CREATION bq.src/main/proto/hbase.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4054/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Jimmy bq. bq. 
Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221238#comment-13221238 ] jirapos...@reviews.apache.org commented on HBASE-5443: --
bq. On 2012-03-02 10:30:06, Benoit Sigoure wrote:
bq. pom.xml, line 750
bq. https://reviews.apache.org/r/4054/diff/1/?file=86002#file86002line750
bq.
bq. Do this instead:
bq.
bq. if which cygpath >/dev/null 2>/dev/null; then
bq.   # Windows
bq. else
bq.   # Not Windows
bq. fi
Why do we need /dev/null twice?
bq. On 2012-03-02 10:30:06, Benoit Sigoure wrote:
bq. pom.xml, line 761
bq. https://reviews.apache.org/r/4054/diff/1/?file=86002#file86002line761
bq.
bq. Actually you can just remove the whole $IS_WIN business and everything. Simply fix PROTO_DIR and JAVA_DIR when on Windows before calling protoc.
How about this? The reason for I_PROTO_DIR is to keep ls $PROTO_DIR/*.proto working.
if which cygpath 2> /dev/null; then
  I_PROTO_DIR=$PROTO_DIR
else
  I_PROTO_DIR=`cygpath --windows $PROTO_DIR`
  JAVA_DIR=`cygpath --windows $JAVA_DIR`
fi
mkdir -p $JAVA_DIR 2> /dev/null
for PROTO_FILE in `ls $PROTO_DIR/*.proto 2> /dev/null`
do
  protoc -I$I_PROTO_DIR --java_out=$JAVA_DIR $PROTO_FILE
done
- Jimmy
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4054/#review5552 --- On 2012-03-02 18:54:29, Jimmy Xiang wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4054/ bq. --- bq. bq. (Updated 2012-03-02 18:54:29) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. This is the first draft of the ProtoBuff HRegionProtocol. The corresponding java vs pb method mapping is attached to the jira: https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. Please review. I'd like to move ahead after we get to some agreement. bq. bq. bq. This addresses bug HBASE-5443. bq. https://issues.apache.org/jira/browse/HBASE-5443 bq. bq. bq. Diffs bq. - bq. 
bq.pom.xml bb518b1 bq.src/main/proto/RegionAdmin.proto PRE-CREATION bq.src/main/proto/RegionClient.proto PRE-CREATION bq.src/main/proto/hbase.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4054/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Jimmy bq. bq. Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5486) Warn message in HTable: Stringify the byte[]
[ https://issues.apache.org/jira/browse/HBASE-5486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Himanshu Vashishtha updated HBASE-5486: --- Status: Open (was: Patch Available) attaching a new version Warn message in HTable: Stringify the byte[] Key: HBASE-5486 URL: https://issues.apache.org/jira/browse/HBASE-5486 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.92.0 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Priority: Trivial Labels: noob Attachments: 5486.patch The warn message in the method getStartEndKeys() in HTable can be improved by stringifying the byte array for Regions.Qualifier Currently, a sample message is like : 12/01/17 16:36:34 WARN client.HTable: Null [B@552c8fa8 cell in keyvalues={test5,\xC9\xA2\x00\x00\x00\x00\x00\x00/00_0,1326642537734.dbc62b2765529a9ad2ddcf8eb58cb2dc./info:server/1326750341579/Put/vlen=28, test5,\xC9\xA2\x00\x00\x00\x00\x00\x00/00_0,1326642537734.dbc62b2765529a9ad2ddcf8eb58cb2dc./info:serverstartcode/1326750341579/Put/vlen=8} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
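The `[B@552c8fa8` in the sample message is what Java prints by default for a byte array (a type tag followed by a hash code), which is why the qualifier is unreadable in the log. The sketch below shows the kind of stringification the issue asks for, in the `\xNN` style already visible in the keyvalues part of the sample. It is an illustration only: the rule used here (printable ASCII kept, all other bytes escaped) is an assumption, and HBase's own helper may keep a different set of characters.

```python
def to_string_binary(data: bytes) -> str:
    """Render bytes as text: printable ASCII is kept, everything else becomes \\xNN."""
    parts = []
    for b in data:
        if 0x20 <= b <= 0x7E:
            parts.append(chr(b))          # printable ASCII, shown as-is
        else:
            parts.append("\\x%02X" % b)   # anything else as an upper-case hex escape
    return "".join(parts)
```

Applied to a row key such as b"test5,\xc9\xa2\x00", this yields the text test5,\xC9\xA2\x00, which matches the escaping style of the sample log line.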
[jira] [Updated] (HBASE-5486) Warn message in HTable: Stringify the byte[]
[ https://issues.apache.org/jira/browse/HBASE-5486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Himanshu Vashishtha updated HBASE-5486: --- Attachment: 5486-v2.patch Changes incorporated as per Stack's comments Warn message in HTable: Stringify the byte[] Key: HBASE-5486 URL: https://issues.apache.org/jira/browse/HBASE-5486 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.92.0 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Priority: Trivial Labels: noob Attachments: 5486-v2.patch, 5486.patch The warn message in the method getStartEndKeys() in HTable can be improved by stringifying the byte array for Regions.Qualifier Currently, a sample message is like : 12/01/17 16:36:34 WARN client.HTable: Null [B@552c8fa8 cell in keyvalues={test5,\xC9\xA2\x00\x00\x00\x00\x00\x00/00_0,1326642537734.dbc62b2765529a9ad2ddcf8eb58cb2dc./info:server/1326750341579/Put/vlen=28, test5,\xC9\xA2\x00\x00\x00\x00\x00\x00/00_0,1326642537734.dbc62b2765529a9ad2ddcf8eb58cb2dc./info:serverstartcode/1326750341579/Put/vlen=8} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5430) Fix licenses in 0.92.1 -- RAT plugin won't pass
[ https://issues.apache.org/jira/browse/HBASE-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221261#comment-13221261 ] Hudson commented on HBASE-5430: --- Integrated in HBase-0.92 #313 (See [https://builds.apache.org/job/HBase-0.92/313/]) HBASE-5430 Fix licenses in 0.92.1 -- RAT plugin won't pass (Revision 1296356) Result = FAILURE stack : Files : * /hbase/branches/0.92/pom.xml Fix licenses in 0.92.1 -- RAT plugin won't pass --- Key: HBASE-5430 URL: https://issues.apache.org/jira/browse/HBASE-5430 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.92.1 Attachments: 5430.txt Use the -Drelease profile to see that we are missing 30 or so licenses. Fix.
[jira] [Commented] (HBASE-5511) More doc on maven release process
[ https://issues.apache.org/jira/browse/HBASE-5511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221262#comment-13221262 ] Hudson commented on HBASE-5511: --- Integrated in HBase-0.92 #313 (See [https://builds.apache.org/job/HBase-0.92/313/]) HBASE-5511 More doc on maven release process (Revision 1296318) Result = FAILURE stack : Files : * /hbase/branches/0.92/pom.xml More doc on maven release process - Key: HBASE-5511 URL: https://issues.apache.org/jira/browse/HBASE-5511 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.92.1, 0.94.0 Attachments: doc.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-5486) Warn message in HTable: Stringify the byte[]
[ https://issues.apache.org/jira/browse/HBASE-5486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5486. -- Resolution: Fixed Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Committed to trunk. Thanks for the patch Himanshu. Warn message in HTable: Stringify the byte[] Key: HBASE-5486 URL: https://issues.apache.org/jira/browse/HBASE-5486 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.92.0 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Priority: Trivial Labels: noob Fix For: 0.96.0 Attachments: 5486-v2.patch, 5486.patch The warn message in the method getStartEndKeys() in HTable can be improved by stringifying the byte array for Regions.Qualifier Currently, a sample message is like : 12/01/17 16:36:34 WARN client.HTable: Null [B@552c8fa8 cell in keyvalues={test5,\xC9\xA2\x00\x00\x00\x00\x00\x00/00_0,1326642537734.dbc62b2765529a9ad2ddcf8eb58cb2dc./info:server/1326750341579/Put/vlen=28, test5,\xC9\xA2\x00\x00\x00\x00\x00\x00/00_0,1326642537734.dbc62b2765529a9ad2ddcf8eb58cb2dc./info:serverstartcode/1326750341579/Put/vlen=8} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4608: -- Attachment: 4608v16.txt Patch v16 decrements HLogKey.VERSION HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Attachments: 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
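The dictionary idea in the issue description (replace repeated table names, region ids and column family names with short references) can be sketched as below. This is a generic illustration of dictionary encoding, not the attached patch: DictEncoder, DictDecoder and the ("lit", ...) / ("ref", ...) token shapes are hypothetical, and the dictionary here grows without bound, which a real implementation would presumably need to limit.

```python
from typing import Dict, List, Tuple, Union

Token = Union[Tuple[str, bytes], Tuple[str, int]]


class DictEncoder:
    """Emit a literal the first time a value is seen, and a small index afterwards."""

    def __init__(self) -> None:
        self._index: Dict[bytes, int] = {}

    def encode(self, value: bytes) -> Token:
        idx = self._index.get(value)
        if idx is not None:
            return ("ref", idx)
        self._index[value] = len(self._index)
        return ("lit", value)


class DictDecoder:
    """Rebuild the same dictionary from the literals, in the order they arrive."""

    def __init__(self) -> None:
        self._values: List[bytes] = []

    def decode(self, token: Token) -> bytes:
        kind, payload = token
        if kind == "lit":
            self._values.append(payload)
            return payload
        return self._values[payload]
```

Because both sides add entries in the same order, the reader needs no dictionary transmitted separately; it only has to process the log from the start.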
[jira] [Updated] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4608: -- Comment: was deleted (was: @Li: Do you want submit latest patch to Hadoop QA ? Thanks) HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Attachments: 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4608: -- Comment: was deleted (was: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12509633/4608v7.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 7 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -149 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 80 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/677//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/677//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/677//console This message is automatically generated.) HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Attachments: 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-5430) Fix licenses in 0.92.1 -- RAT plugin won't pass
[ https://issues.apache.org/jira/browse/HBASE-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221281#comment-13221281 ] Hudson commented on HBASE-5430: --- Integrated in HBase-0.94 #9 (See [https://builds.apache.org/job/HBase-0.94/9/]) HBASE-5430 Fix licenses in 0.92.1 -- RAT plugin won't pass (Revision 1296357) Result = SUCCESS stack : Files : * /hbase/branches/0.94/pom.xml Fix licenses in 0.92.1 -- RAT plugin won't pass --- Key: HBASE-5430 URL: https://issues.apache.org/jira/browse/HBASE-5430 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.92.1 Attachments: 5430.txt Use the -Drelease profile to see we are missing 30 or so license. Fix. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5511) More doc on maven release process
[ https://issues.apache.org/jira/browse/HBASE-5511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221282#comment-13221282 ] Hudson commented on HBASE-5511: --- Integrated in HBase-0.94 #9 (See [https://builds.apache.org/job/HBase-0.94/9/]) HBASE-5511 More doc on maven release process (Revision 1296317) Result = SUCCESS stack : Files : * /hbase/branches/0.94/pom.xml More doc on maven release process - Key: HBASE-5511 URL: https://issues.apache.org/jira/browse/HBASE-5511 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.92.1, 0.94.0 Attachments: doc.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4109) Hostname returned via reverse dns lookup contains trailing period if configured interface is not default
[ https://issues.apache.org/jira/browse/HBASE-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221283#comment-13221283 ]

Harsh J commented on HBASE-4109:

Hi, This affects multihomed DataNodes as well. I've filed HADOOP-8134 upstream.

Hostname returned via reverse dns lookup contains trailing period if configured interface is not default

Key: HBASE-4109
URL: https://issues.apache.org/jira/browse/HBASE-4109
Project: HBase
Issue Type: Bug
Components: master, regionserver
Affects Versions: 0.90.3
Reporter: Shrijeet Paliwal
Assignee: Shrijeet Paliwal
Fix For: 0.90.4
Attachments: 0001-HBASE-4109-Sanitize-hostname-returned-from-DNS-class.patch

If you use any interface other than 'default' (literally that keyword), getDefaultHost in DNS.java returns a string with a trailing period. The javadoc of reverseDns in DNS.java conflicts with what the function actually does: it returns a PTR record while claiming to return a hostname. The PTR record always ends with a period. RFC: http://irbs.net/bog-4.9.5/bog47.html

We call DNS.getDefaultHost in more than one place and treat the result as the actual hostname. Quoting HRegionServer, for example:
{code}
String machineName = DNS.getDefaultHost(
    conf.get("hbase.regionserver.dns.interface", "default"),
    conf.get("hbase.regionserver.dns.nameserver", "default"));
{code}
This causes inconsistencies. One such inconsistency was observed while debugging the issue "Regions not getting reassigned if RS is brought down". More here: http://search-hadoop.com/m/CANUA1qRCkQ1

We may want to sanitize the string returned from the DNS class. Or, better, we could overhaul the way we do DNS name matching everywhere.
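The sanitizing step proposed in the HBASE-4109 description could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (HostnameSanitizer, stripTrailingDot) are hypothetical and are not taken from the attached patch or from DNS.java.

```java
/** Hypothetical helper, not part of HBase or Hadoop. */
final class HostnameSanitizer {
    private HostnameSanitizer() {}

    /**
     * Removes one trailing period, as found on a fully qualified PTR result.
     * Null and names without a trailing period are returned unchanged.
     */
    static String stripTrailingDot(String name) {
        if (name != null && name.endsWith(".")) {
            return name.substring(0, name.length() - 1);
        }
        return name;
    }

    public static void main(String[] args) {
        // Both calls yield the same hostname once the PTR-style dot is removed.
        System.out.println(stripTrailingDot("rs1.example.com."));
        System.out.println(stripTrailingDot("rs1.example.com"));
    }
}
```

A caller would presumably apply such a helper to the value returned by DNS.getDefaultHost before comparing it with other hostnames, so both sides of the comparison use the same form.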
[jira] [Commented] (HBASE-5444) Add PB-based calls to HMasterRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221304#comment-13221304 ] jirapos...@reviews.apache.org commented on HBASE-5444: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4149/ --- (Updated 2012-03-02 22:18:47.273090) Review request for hbase. Summary --- Protobuf work for HMasterRegionInterface. No need to comment on the pom.xml changes: I just copied those from HBASE-5443 (https://reviews.apache.org/r/4054/). This addresses bug HBASE-5444. https://issues.apache.org/jira/browse/HBASE-5444 Diffs (updated) - pom.xml 0f0aa9a src/main/proto/HMasterRegionProtocol.proto PRE-CREATION src/main/proto/hbase.proto PRE-CREATION Diff: https://reviews.apache.org/r/4149/diff Testing --- mvn -DskipTests package successful and files generated successfully Thanks, Gregory Add PB-based calls to HMasterRegionInterface Key: HBASE-5444 URL: https://issues.apache.org/jira/browse/HBASE-5444 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Gregory Chanan -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221319#comment-13221319 ] Hadoop QA commented on HBASE-4608: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12516888/4608v16.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -127 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 156 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1082//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1082//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1082//console This message is automatically generated. HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Attachments: 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. 
Also, HLog format may be changed in other ways to produce a smaller HLog.
[jira] [Commented] (HBASE-5511) More doc on maven release process
[ https://issues.apache.org/jira/browse/HBASE-5511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221323#comment-13221323 ] Hudson commented on HBASE-5511: --- Integrated in HBase-0.94 #10 (See [https://builds.apache.org/job/HBase-0.94/10/]) HBASE-5511 More doc on maven release process (Revision 1296317) Result = SUCCESS stack : Files : * /hbase/branches/0.94/pom.xml More doc on maven release process - Key: HBASE-5511 URL: https://issues.apache.org/jira/browse/HBASE-5511 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.92.1, 0.94.0 Attachments: doc.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5430) Fix licenses in 0.92.1 -- RAT plugin won't pass
[ https://issues.apache.org/jira/browse/HBASE-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221322#comment-13221322 ] Hudson commented on HBASE-5430: --- Integrated in HBase-0.94 #10 (See [https://builds.apache.org/job/HBase-0.94/10/]) HBASE-5430 Fix licenses in 0.92.1 -- RAT plugin won't pass (Revision 1296357) Result = SUCCESS stack : Files : * /hbase/branches/0.94/pom.xml Fix licenses in 0.92.1 -- RAT plugin won't pass --- Key: HBASE-5430 URL: https://issues.apache.org/jira/browse/HBASE-5430 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.92.1 Attachments: 5430.txt Use the -Drelease profile to see we are missing 30 or so license. Fix. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5508) Add an option to allow test output to show on the terminal
[ https://issues.apache.org/jira/browse/HBASE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221321#comment-13221321 ] Hudson commented on HBASE-5508: --- Integrated in HBase-0.94 #10 (See [https://builds.apache.org/job/HBase-0.94/10/]) HBASE-5508 Add an option to allow test output to show on the terminal (Scott Chen) (Revision 1296390) Result = SUCCESS tedyu : Files : * /hbase/branches/0.94/pom.xml Add an option to allow test output to show on the terminal -- Key: HBASE-5508 URL: https://issues.apache.org/jira/browse/HBASE-5508 Project: HBase Issue Type: Improvement Components: test Reporter: Scott Chen Assignee: Scott Chen Priority: Minor Fix For: 0.96.0 Attachments: HBASE-5508.D2037.1.patch Sometimes it is useful to directly see the test results on the terminal. We can add a property to achieve that. mvn test -Dtest.output.tofile=false -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221337#comment-13221337 ] Zhihong Yu commented on HBASE-5270: --- I put some review on r/4021. Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler - Key: HBASE-5270 URL: https://issues.apache.org/jira/browse/HBASE-5270 Project: HBase Issue Type: Sub-task Components: master Reporter: Zhihong Yu Assignee: chunhui shen Fix For: 0.92.2 Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch, 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch, 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v2.patch, hbase-5270v4.patch, hbase-5270v5.patch, hbase-5270v6.patch, hbase-5270v7.patch, hbase-5270v8.patch, hbase-5270v9.patch, sampletest.txt This JIRA continues the effort from HBASE-5179. Starting with Stack's comments about patches for 0.92 and TRUNK: Reviewing 0.92v17 isDeadServerInProgress is a new public method in ServerManager but it does not seem to be used anywhere. Does isDeadRootServerInProgress need to be public? Ditto for meta version. This method param names are not right 'definitiveRootServer'; what is meant by definitive? Do they need this qualifier? Is there anything in place to stop us expiring a server twice if its carrying root and meta? What is difference between asking assignment manager isCarryingRoot and this variable that is passed in? Should be doc'd at least. Ditto for meta. I think I've asked for this a few times - onlineServers needs to be explained... either in javadoc or in comment. This is the param passed into joinCluster. How does it arise? I think I know but am unsure. God love the poor noob that comes awandering this code trying to make sense of it all. It looks like we get the list by trawling zk for regionserver znodes that have not checked in. Don't we do this operation earlier in master setup? Are we doing it again here? 
Though distributed log splitting is configured, with this patch we will do single-process splitting in the master under some conditions. It's not explained in the code why we would do this. Why do we think master log splitting is 'high priority' when it could very well be slower? Should we only go this route if distributed splitting is not going on? Do we know if concurrent distributed log splitting and master splitting works? Why would we have dead servers in progress here in master startup? Because a ServerShutdownHandler fired? This patch is different from the patch for 0.90. It should go into trunk first with tests, then 0.92. Should it be in this issue? This issue is really hard to follow now. Maybe this issue is for 0.90.x, with a new issue for more work on this trunk patch? This patch needs to have the v18 differences applied.
[jira] [Commented] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221347#comment-13221347 ]

jirapos...@reviews.apache.org commented on HBASE-4608:

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2740/#review5597

It may be better if 4608v16.txt is uploaded here.

src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java
https://reviews.apache.org/r/2740/#comment12170
Can we toggle this config param after in.init() ? This way we only create one Configuration

src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java
https://reviews.apache.org/r/2740/#comment12171
Should read 'uncompressed array'

src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java
https://reviews.apache.org/r/2740/#comment12172
This assignment is not necessary.

src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java
https://reviews.apache.org/r/2740/#comment12173
Should read '... start writing to'

src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java
https://reviews.apache.org/r/2740/#comment12174
Should read 'the length of entry'

src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java
https://reviews.apache.org/r/2740/#comment12175
Should we add a check for other sizeBytes values ?

src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java
https://reviews.apache.org/r/2740/#comment12176
wrap long line, please.

src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java
https://reviews.apache.org/r/2740/#comment12177
Remove extra empty line.

src/main/java/org/apache/hadoop/hbase/regionserver/wal/Dictionary.java
https://reviews.apache.org/r/2740/#comment12178
This sentence is in parentheses. People would think it applies to dictionary indexes. Strictly speaking, -1 is not an index. Better rephrase this sentence.

- Ted

On 2012-03-01 09:58:44, Li Pi wrote:
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2740/
bq.
bq. (Updated 2012-03-01 09:58:44)
bq.
bq. Review request for hbase, Eli Collins and Todd Lipcon.
bq.
bq. Summary
bq.
bq. HLog compression. Has unit tests and a command line tool for compressing/decompressing.
bq.
bq. This addresses bug HBase-4608.
bq. https://issues.apache.org/jira/browse/HBase-4608
bq.
bq. Diffs
bq.
bq. src/main/java/org/apache/hadoop/hbase/HConstants.java 17cb0e3
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/Dictionary.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java c945a99
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java f067221
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/KeyValueCompression.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/LRUDictionary.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java d9cd6de
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java bd31ead
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java e1117ef
bq. src/main/java/org/apache/hadoop/hbase/util/Bytes.java ead9a3b
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLRUDictionary.java PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java a11899c
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplayCompressed.java PRE-CREATION
bq.
bq. Diff: https://reviews.apache.org/r/2740/diff
bq.
bq. Testing
bq.
bq. Thanks,
bq.
bq. Li
HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Attachments: 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name,
[jira] [Updated] (HBASE-5213) hbase master stop does not bring down backup masters
[ https://issues.apache.org/jira/browse/HBASE-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5213: -- Description: Typing hbase master stop produces the following message: stop Start cluster shutdown; Master signals RegionServer shutdown It seems like backup masters should be considered part of the cluster, but they are not brought down by hbase master stop. stop-hbase.sh does correctly bring down the backup masters. The same behavior is observed when a client app makes use of the client API HBaseAdmin.shutdown() http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#shutdown() -- this isn't too surprising since I think hbase master stop just calls this API. It seems like HBASE-1448 address this; perhaps there was a regression? was: Typing hbase master produces the following message: stop Start cluster shutdown; Master signals RegionServer shutdown It seems like backup masters should be considered part of the cluster, but they are not brought down by hbase master stop. stop-hbase.sh does correctly bring down the backup masters. The same behavior is observed when a client app makes use of the client API HBaseAdmin.shutdown() http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#shutdown() -- this isn't too surprising since I think hbase master stop just calls this API. It seems like HBASE-1448 address this; perhaps there was a regression? hbase master stop does not bring down backup masters -- Key: HBASE-5213 URL: https://issues.apache.org/jira/browse/HBASE-5213 Project: HBase Issue Type: Bug Reporter: Gregory Chanan Priority: Minor Typing hbase master stop produces the following message: stop Start cluster shutdown; Master signals RegionServer shutdown It seems like backup masters should be considered part of the cluster, but they are not brought down by hbase master stop. stop-hbase.sh does correctly bring down the backup masters. 
The same behavior is observed when a client app makes use of the client API HBaseAdmin.shutdown() http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#shutdown() -- this isn't too surprising since I think hbase master stop just calls this API. It seems like HBASE-1448 address this; perhaps there was a regression?
[jira] [Commented] (HBASE-5213) hbase master stop does not bring down backup masters
[ https://issues.apache.org/jira/browse/HBASE-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221354#comment-13221354 ]

Zhihong Yu commented on HBASE-5213:

In stop-hbase.sh, I see:
{code}
# TODO: store backup masters in ZooKeeper and have the primary send them a shutdown message
# stop any backup masters
$bin/hbase-daemons.sh --config ${HBASE_CONF_DIR} \
  --hosts ${HBASE_BACKUP_MASTERS} stop master-backup
{code}
We have several options:
1. adding a shutdown node in ZK on which all masters should listen
2. introducing backup masters node in ZK so that we know where shutdown message should be sent

hbase master stop does not bring down backup masters

Key: HBASE-5213
URL: https://issues.apache.org/jira/browse/HBASE-5213
Project: HBase
Issue Type: Bug
Reporter: Gregory Chanan
Priority: Minor

Typing hbase master stop produces the following message:
stop Start cluster shutdown; Master signals RegionServer shutdown

It seems like backup masters should be considered part of the cluster, but they are not brought down by hbase master stop. stop-hbase.sh does correctly bring down the backup masters.

The same behavior is observed when a client app makes use of the client API HBaseAdmin.shutdown() http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#shutdown() -- this isn't too surprising since I think hbase master stop just calls this API.

It seems like HBASE-1448 address this; perhaps there was a regression?
[jira] [Commented] (HBASE-5509) MR based copier for copying HFiles (trunk version)
[ https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221356#comment-13221356 ] Lars Hofhansl commented on HBASE-5509: -- @Ted: This is against trunk. MR based copier for copying HFiles (trunk version) -- Key: HBASE-5509 URL: https://issues.apache.org/jira/browse/HBASE-5509 Project: HBase Issue Type: Sub-task Components: documentation, regionserver Reporter: Karthik Ranganathan Assignee: Lars Hofhansl Fix For: 0.94.0, 0.96.0 Attachments: 5509.txt This copier is a modification of the distcp tool in HDFS. It does the following: 1. List out all the regions in the HBase cluster for the required table 2. Write the above out to a file 3. Each mapper 3.1 lists all the HFiles for a given region by querying the regionserver 3.2 copies all the HFiles 3.3 outputs success if the copy succeeded, failure otherwise. Failed regions are retried in another loop 4. Mappers are placed on nodes which have maximum locality for a given region to speed up copying -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4890) fix possible NPE in HConnectionManager
[ https://issues.apache.org/jira/browse/HBASE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221358#comment-13221358 ]

Jean-Daniel Cryans commented on HBASE-4890:

Sorry, I got distracted (I even forgot about this issue) so nothing new.

fix possible NPE in HConnectionManager

Key: HBASE-4890
URL: https://issues.apache.org/jira/browse/HBASE-4890
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Priority: Blocker
Fix For: 0.92.1

I was running YCSB against a 0.92 branch and encountered this error message:
{code}
11/11/29 08:47:16 WARN client.HConnectionManager$HConnectionImplementation: Failed all from region=usertable,user3917479014967760871,1322555655231.f78d161e5724495a9723bcd972f97f41., hostname=c0316.hal.cloudera.com, port=57020
java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.NullPointerException
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
    at java.util.concurrent.FutureTask.get(FutureTask.java:83)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1501)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1353)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:898)
    at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:775)
    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:750)
    at com.yahoo.ycsb.db.HBaseClient.update(Unknown Source)
    at com.yahoo.ycsb.DBWrapper.update(Unknown Source)
    at com.yahoo.ycsb.workloads.CoreWorkload.doTransactionUpdate(Unknown Source)
    at com.yahoo.ycsb.workloads.CoreWorkload.doTransaction(Unknown Source)
    at com.yahoo.ycsb.ClientThread.run(Unknown Source)
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithoutRetries(HConnectionManager.java:1315)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1327)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1325)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:158)
    at $Proxy4.multi(Unknown Source)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1330)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1328)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithoutRetries(HConnectionManager.java:1309)
    ... 7 more
{code}
It looks like the NPE is caused by server being null in the MultiResponse call() method.
{code}
public MultiResponse call() throws IOException {
  return getRegionServerWithoutRetries(
      new ServerCallable<MultiResponse>(connection, tableName, null) {
        public MultiResponse call() throws IOException {
          return server.multi(multi);
        }
        @Override
        public void connect(boolean reload) throws IOException {
          server = connection.getHRegionConnection(loc.getHostname(), loc.getPort());
        }
      }
  );
}
{code}
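To illustrate the failure mode described in HBASE-4890, here is a minimal sketch of a callable that checks its server reference before using it. It is not the fix that was applied to HConnectionManager, and every name in it (RegionServerStub, GuardedCallable) is a hypothetical stand-in rather than an HBase type.

```java
import java.io.IOException;

/** Hypothetical stand-in for the region server RPC proxy; not an HBase type. */
interface RegionServerStub {
    String multi(String request) throws IOException;
}

/** Hypothetical callable that refuses to run before connect() has set the server. */
class GuardedCallable {
    private RegionServerStub server; // stays null until connect() is called

    void connect(RegionServerStub resolved) {
        this.server = resolved;
    }

    String call(String request) throws IOException {
        if (server == null) {
            // Report the missing connection explicitly instead of letting a
            // NullPointerException surface from inside the RPC layer.
            throw new IOException("call() invoked before connect() resolved a server");
        }
        return server.multi(request);
    }
}
```

With a guard like this, a skipped or failed connect() would show up as a descriptive IOException at the call site, which a retry loop can handle, rather than as a NullPointerException wrapped in a RuntimeException.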
[jira] [Commented] (HBASE-5509) MR based copier for copying HFiles (trunk version)
[ https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221361#comment-13221361 ] Zhihong Yu commented on HBASE-5509: --- Right. The existing isTableInfoExists() has a different signature. Pardon me. MR based copier for copying HFiles (trunk version) -- Key: HBASE-5509 URL: https://issues.apache.org/jira/browse/HBASE-5509 Project: HBase Issue Type: Sub-task Components: documentation, regionserver Reporter: Karthik Ranganathan Assignee: Lars Hofhansl Fix For: 0.94.0, 0.96.0 Attachments: 5509.txt This copier is a modification of the distcp tool in HDFS. It does the following: 1. List out all the regions in the HBase cluster for the required table 2. Write the above out to a file 3. Each mapper 3.1 lists all the HFiles for a given region by querying the regionserver 3.2 copies all the HFiles 3.3 outputs success if the copy succeeded, failure otherwise. Failed regions are retried in another loop 4. Mappers are placed on nodes which have maximum locality for a given region to speed up copying -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5509) MR based copier for copying HFiles (trunk version)
[ https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221371#comment-13221371 ]

Zhihong Yu commented on HBASE-5509:

SnapshotUtilities.java is missing the license and the class javadoc.
{code}
+ public static boolean sameFile(FileSystem srcfs, FileStatus srcstatus,
+     FileSystem dstfs, Path dstpath, boolean skipCRCCheck) throws IOException {
{code}
Is it possible to make src and dst use the same data type? Either FileStatus or Path.

For sameFile(), I think false should be returned for the dest file in the following case:
{code}
+ //return true if checksum is not supported
+ //(i.e. some of the checksums is null)
{code}
{code}
+ public static Path getPathInTrash(Path path, String hbaseUser,
+     FileSystem srcFileSys) throws IOException {
{code}
I think the FileSystem parameter should be the first parameter of the above method.

MR based copier for copying HFiles (trunk version)

Key: HBASE-5509
URL: https://issues.apache.org/jira/browse/HBASE-5509
Project: HBase
Issue Type: Sub-task
Components: documentation, regionserver
Reporter: Karthik Ranganathan
Assignee: Lars Hofhansl
Fix For: 0.94.0, 0.96.0
Attachments: 5509.txt

This copier is a modification of the distcp tool in HDFS. It does the following:
1. List out all the regions in the HBase cluster for the required table
2. Write the above out to a file
3. Each mapper
   3.1 lists all the HFiles for a given region by querying the regionserver
   3.2 copies all the HFiles
   3.3 outputs success if the copy succeeded, failure otherwise. Failed regions are retried in another loop
4. Mappers are placed on nodes which have maximum locality for a given region to speed up copying
[jira] [Issue Comment Edited] (HBASE-5509) MR based copier for copying HFiles (trunk version)
[ https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221371#comment-13221371 ]

Zhihong Yu edited comment on HBASE-5509 at 3/3/12 12:18 AM:

SnapshotUtilities.java is missing the license header and the class javadoc.

{code}
+ public static boolean sameFile(FileSystem srcfs, FileStatus srcstatus,
+ FileSystem dstfs, Path dstpath, boolean skipCRCCheck) throws IOException {
{code}
Is it possible to make src and dst use the same data type, either FileStatus or Path?

For sameFile(), I think false should be returned for the destination file in the following case:
{code}
+ //return true if checksum is not supported
+ //(i.e. some of the checksums is null)
{code}

{code}
+ public static Path getPathInTrash(Path path, String hbaseUser,
+ FileSystem srcFileSys) throws IOException {
{code}
I think the FileSystem parameter should be placed first in the above method.

{code}
+String trashPrefix = "/user/" + hbaseUser + "/.Trash";
{code}
I think the name of the trash folder should be made configurable.

For getStoreFileList():
{code}
+ * @param families
+ * a comma separated list of column families for which we need to
{code}
I think List<String> may be a better data type for the families parameter. This would make the method more general, in that it is not tied to the format of user input.

{code}
+long retryTimeInMins =
+ conf.getInt("hbase.backups.region.retryTimeInMins", 5) * 60 * 1000L;
{code}
Please rename the above variable: its value has been converted to milliseconds.

SnapshotMR.java is missing the license header.

was (Author: zhi...@ebaysf.com):

SnapshotUtilities.java is missing the license header and the class javadoc.

{code}
+ public static boolean sameFile(FileSystem srcfs, FileStatus srcstatus,
+ FileSystem dstfs, Path dstpath, boolean skipCRCCheck) throws IOException {
{code}
Is it possible to make src and dst use the same data type, either FileStatus or Path?

For sameFile(), I think false should be returned for the destination file in the following case:
{code}
+ //return true if checksum is not supported
+ //(i.e. some of the checksums is null)
{code}

{code}
+ public static Path getPathInTrash(Path path, String hbaseUser,
+ FileSystem srcFileSys) throws IOException {
{code}
I think the FileSystem parameter should be placed first in the above method.
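Editorial sketch of two of the review points above: naming the retry interval after the unit it actually holds, and accepting a list of families instead of a comma-separated string. This is plain Java with no Hadoop dependency; `ReviewPointsSketch`, `retryTimeInMillis`, and `parseFamilies` are hypothetical names, not identifiers from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not taken from the HBASE-5509 patch.
final class ReviewPointsSketch {

    // Default of 5 minutes, as in the snippet quoted in the comment.
    static final int DEFAULT_RETRY_TIME_IN_MINS = 5;

    // The configured value is in minutes; the converted value is returned
    // under a name that states it is in milliseconds.
    static long retryTimeInMillis(int retryTimeInMins) {
        return TimeUnit.MINUTES.toMillis(retryTimeInMins);
    }

    // Parses the comma-separated user input once, at the edge, so that the
    // methods behind it can take List<String> families.
    static List<String> parseFamilies(String commaSeparated) {
        List<String> families = new ArrayList<>();
        for (String part : commaSeparated.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                families.add(trimmed);
            }
        }
        return families;
    }

    public static void main(String[] args) {
        System.out.println(retryTimeInMillis(DEFAULT_RETRY_TIME_IN_MINS));
        System.out.println(parseFamilies("cf1, cf2"));
    }
}
```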
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-5399:
---
Status: Open (was: Patch Available)

Cut the link between the client and the zookeeper ensemble
--
Key: HBASE-5399
URL: https://issues.apache.org/jira/browse/HBASE-5399
Project: HBase
Issue Type: Improvement
Components: client
Affects Versions: 0.94.0
Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 5399.v27.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v9.patch

The link is often considered an issue, for various reasons. One of them is that there is a limit on the number of connections that ZK can manage. Stack also suggested removing the link to the master from HConnection.

There are choices to be made considering the existing API (that we don't want to break).

The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken.

ZooKeeper is used for:
- public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection => we have to deprecate this but keep it.
- read the master address to create a master => now done with a temporary zookeeper connection
- read root location => now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked.
- read cluster id => now done once with a temporary zookeeper connection.
- check if base node is available => now done once with a zookeeper connection given as a parameter
- isTableDisabled/isTableAvailable => public functions, now done with a temporary zookeeper connection.
  - Called internally from HBaseAdmin and HTable
- getCurrentNrHRS(): public function to get the number of region servers and create a pool of threads => now done with a temporary zookeeper connection

Master is used for:
- getMaster public getter, as for ZooKeeper => we have to deprecate this but keep it.
- isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
- getHTableDescriptor*: public functions offering access to the master. => we could make them use a temporary master connection as well.

Main points are:
- hbase class for ZooKeeper: ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, a non-connected client will always be much slower, because it's a tcp connection, and establishing a tcp connection is slow.
- having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client.
- if we move the table descriptor part away from the client, we need to find a new place for it.
- we will have the same issue with HBaseAdmin (for both ZK & Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic, however.
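Editorial sketch of the "temporary zookeeper connection" pattern the description refers to: open a connection, perform one read, close it, so the client holds no long-lived link. It deliberately does not use the ZooKeeper or HBase client API; `TempConnection`, `TempConnectionSketch`, `readOnce`, the canned return value, and the path are hypothetical stand-ins.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a short-lived session; this is not the
// ZooKeeper or HBase client API.
interface TempConnection extends AutoCloseable {
    String read(String path);

    @Override
    void close(); // narrowed so callers need not handle a checked exception
}

final class TempConnectionSketch {

    // Number of currently open connections, tracked only so the sketch can
    // show that nothing stays connected after a read.
    static final AtomicInteger OPEN = new AtomicInteger();

    static TempConnection open() {
        OPEN.incrementAndGet();
        return new TempConnection() {
            @Override
            public String read(String path) {
                // Canned answer; a real client would query the ensemble here.
                return "value-of:" + path;
            }

            @Override
            public void close() {
                OPEN.decrementAndGet();
            }
        };
    }

    // Open, read once, close. try-with-resources closes the connection even
    // if the read throws.
    static String readOnce(String path) {
        try (TempConnection conn = open()) {
            return conn.read(path);
        }
    }

    public static void main(String[] args) {
        System.out.println(readOnce("/example/path"));
        System.out.println("open connections: " + OPEN.get());
    }
}
```

The trade-off named in the description applies directly: every `readOnce` call pays the connection setup cost, which is why a non-connected client is slower.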
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5399: --- Attachment: 5399.v27.patch
[jira] [Commented] (HBASE-5509) MR based copier for copying HFiles (trunk version)
[ https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221385#comment-13221385 ] Jesse Yates commented on HBASE-5509: think it might be time to RB this bad boy; I've got a bunch of comments of my own.
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221389#comment-13221389 ] Zhihong Yu commented on HBASE-5399: --- @N: Can you update the patch on review board ? It is 6 rev's behind. Thanks
[jira] [Created] (HBASE-5513) HBase master does not start
HBase master does not start
---
Key: HBASE-5513
URL: https://issues.apache.org/jira/browse/HBASE-5513
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.90.4
Environment: SUSE Linux Enterprise Server 11 64-bit Service Pack 1
Reporter: Quan Liu
Priority: Blocker

Used Cloudera Manager 3.7.3 to install Cloudera CDH3U3 on a cluster of EC2 instances. After installation, HBase master cannot start. Both HDFS NN and DN have started successfully.

The HBase log shows

PM INFO org.apache.hadoop.hbase.metrics MetricsString added: url
Mar 2, 4:41:21 PM INFO org.apache.hadoop.hbase.metrics MetricsString added: version
Mar 2, 4:41:21 PM INFO org.apache.hadoop.hbase.metrics new MBeanInfo
Mar 2, 4:41:21 PM INFO org.apache.hadoop.hbase.metrics new MBeanInfo
Mar 2, 4:41:21 PM INFO org.apache.hadoop.hbase.master.metrics.MasterMetrics Initialized
Mar 2, 4:41:21 PM INFO org.apache.hadoop.hbase.master.ActiveMasterManager Master=epoch-node-101:6
Mar 2, 4:41:22 PM WARN org.apache.hadoop.hdfs.DFSClient DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/hbase.version could only be replicated to 0 nodes, instead of 1
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1520)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:665)
  at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1428)
  at org.apache.hadoop.ipc.Client.call(Client.java:1107)
  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
  at $Proxy6.addBlock(Unknown Source)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
  at $Proxy6.addBlock(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3553)
  at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3421)
  at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2627)
  at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2822)
Mar 2, 4:41:22 PM WARN org.apache.hadoop.hdfs.DFSClient Error Recovery for block null bad datanode[0] nodes == null
Mar 2, 4:41:22 PM WARN org.apache.hadoop.hdfs.DFSClient Could not get block locations. Source file /hbase/hbase.version - Aborting...
Mar 2, 4:41:22 PM WARN org.apache.hadoop.hbase.util.FSUtils Unable to create version file at hdfs://epoch-node-101:8020/hbase, retrying: java.io.IOException: File /hbase/hbase.version could only be replicated to 0 nodes, instead of 1
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1520)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:665)
  at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1428)
[jira] [Commented] (HBASE-5512) Add support for INCLUDE_AND_SEEK_USING_HINT
[ https://issues.apache.org/jira/browse/HBASE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221401#comment-13221401 ] Lars Hofhansl commented on HBASE-5512: -- Working on a test. Add support for INCLUDE_AND_SEEK_USING_HINT --- Key: HBASE-5512 URL: https://issues.apache.org/jira/browse/HBASE-5512 Project: HBase Issue Type: Improvement Reporter: Zhihong Yu Assignee: Lars Hofhansl This came up from HBASE-2038 From Anoop: - What we wanted from the filter is include a row and then seek to the next row which we are interested in. I cant see such a facility with our Filter right now. Correct me if I am wrong. So suppose we already seeked to one row and this need to be included in the result, then the Filter should return INCLUDE. Then when the next next() call happens, then only we can return a SEEK_USING_HINT. So one extra row reading is needed. This might create even one unwanted HFileBlock fetch (who knows). Can we add reseek() at higher level? From Lars: Yep, for that we'd need to add INCLUDE_AND_SEEK_USING_HINT (similar to the INCLUDE_AND_SEEK_NEXT_ROW that we already have). Shouldn't be hard to add, I'm happy to do that, if that's the route we want to go with this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
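Editorial sketch of the extra row read Anoop describes above, using a toy scanner over integer row keys. The enum below is a local illustration and not HBase's Filter.ReturnCode; `SeekHintSketch`, `scan`, `hint`, and the row layout (rows 0..99, every 10th row wanted) are assumptions made only for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model; the enum is local to this sketch and is not the HBase API.
final class SeekHintSketch {

    enum Code { INCLUDE, SEEK_USING_HINT, INCLUDE_AND_SEEK_USING_HINT }

    static final int ROWS = 100;  // rows 0..99
    static final int STRIDE = 10; // rows of interest: 0, 10, 20, ...

    // Number of rows the filter examined during the most recent scan.
    static int rowsRead;

    static boolean wanted(int row) {
        return row % STRIDE == 0;
    }

    // Smallest wanted row strictly greater than the given row.
    static int hint(int row) {
        return (row / STRIDE + 1) * STRIDE;
    }

    // With combined == false the filter can only say INCLUDE on a wanted
    // row and has to wait for the next row before it can ask for a seek.
    static Code filter(int row, boolean combined) {
        if (!wanted(row)) {
            return Code.SEEK_USING_HINT;
        }
        return combined ? Code.INCLUDE_AND_SEEK_USING_HINT : Code.INCLUDE;
    }

    static List<Integer> scan(boolean combined) {
        List<Integer> result = new ArrayList<>();
        rowsRead = 0;
        int row = 0;
        while (row < ROWS) {
            rowsRead++;
            switch (filter(row, combined)) {
                case INCLUDE:
                    result.add(row);
                    row++; // plain next(): lands on an unwanted row
                    break;
                case SEEK_USING_HINT:
                    row = hint(row);
                    break;
                case INCLUDE_AND_SEEK_USING_HINT:
                    result.add(row);
                    row = hint(row); // include and seek in one step
                    break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> twoStep = scan(false);
        int twoStepReads = rowsRead;
        List<Integer> oneStep = scan(true);
        int oneStepReads = rowsRead;
        System.out.println("same result: " + twoStep.equals(oneStep));
        System.out.println("rows read: " + twoStepReads + " vs " + oneStepReads);
    }
}
```

In this toy both variants return the same ten rows, but the INCLUDE-then-seek variant examines 20 rows while the combined code examines 10: one extra read per included row, which is the cost the new return code is meant to remove.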
[jira] [Commented] (HBASE-5509) MR based copier for copying HFiles (trunk version)
[ https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221404#comment-13221404 ]

Lars Hofhansl commented on HBASE-5509:
--

It's not ready for RB. Note that this *is* the Facebook patch ported to trunk with the changes I mentioned. All points Ted mentioned are from the FB patch.

The types of comment I am looking for are:
1. Do we want to go this route at all?
2. General comments on failure scenarios.

Then I can go and clean up the finer points.