[jira] [Commented] (HBASE-7495) parallel seek in StoreScanner
[ https://issues.apache.org/jira/browse/HBASE-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581107#comment-13581107 ] Liang Xie commented on HBASE-7495:

Thanks for the review, [~lhofhansl]. I am OK with trunk only right now. I'll do more work per [~zjusch]'s suggestion; in particular, I may need a benchmark on a high block-cache hit-ratio scenario before considering it for 0.94 :)

parallel seek in StoreScanner

Key: HBASE-7495
URL: https://issues.apache.org/jira/browse/HBASE-7495
Project: HBase
Issue Type: Bug
Components: Scanners
Affects Versions: 0.94.3, 0.96.0
Reporter: Liang Xie
Assignee: Liang Xie
Attachments: 7495-v12.txt, HBASE-7495-0.94.txt, HBASE-7495.txt, HBASE-7495.txt, HBASE-7495.txt, HBASE-7495-v10.txt, HBASE-7495-v11.txt, HBASE-7495-v2.txt, HBASE-7495-v3.txt, HBASE-7495-v4.txt, HBASE-7495-v4.txt, HBASE-7495-v5.txt, HBASE-7495-v6.txt, HBASE-7495-v7.txt, HBASE-7495-v8.txt, HBASE-7495-v9.txt

It seems there's potential room for improvement before doing scanner.next:

{code:title=StoreScanner.java|borderStyle=solid}
if (explicitColumnQuery && lazySeekEnabledGlobally) {
  for (KeyValueScanner scanner : scanners) {
    scanner.requestSeek(matcher.getStartKey(), false, true);
  }
} else {
  for (KeyValueScanner scanner : scanners) {
    scanner.seek(matcher.getStartKey());
  }
}
{code}

We could do scanner.requestSeek or scanner.seek in parallel, instead of the current serial execution, to reduce latency in this special case. Any ideas on it? I'll have a try if the comments/suggestions are positive :)

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
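The proposal above is to hand each seek to a worker and block until all have finished, so callers see the same post-condition as the serial loop. A minimal JDK-only sketch of that shape (the Scanner interface here is a stand-in for illustration, not HBase's KeyValueScanner, and this is not the committed patch, which routes seeks through the region server's executor service):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSeekSketch {
    // Stand-in for HBase's KeyValueScanner (hypothetical, for illustration).
    interface Scanner {
        void seek(String startKey) throws Exception;
    }

    // Submit one seek per scanner, then block until every seek has finished,
    // so the caller observes the same post-condition as the serial loop.
    static void parallelSeek(List<Scanner> scanners, String startKey,
                             ExecutorService pool) throws Exception {
        List<Future<?>> futures = new ArrayList<>();
        for (Scanner s : scanners) {
            futures.add(pool.submit(() -> { s.seek(startKey); return null; }));
        }
        for (Future<?> f : futures) {
            f.get();  // propagates any seek failure to the caller
        }
    }

    public static void main(String[] args) throws Exception {
        List<Scanner> scanners = new ArrayList<>();
        CopyOnWriteArrayList<String> seeked = new CopyOnWriteArrayList<>();
        for (int i = 0; i < 4; i++) {
            final int id = i;
            scanners.add(key -> seeked.add(id + ":" + key));
        }
        ExecutorService pool = Executors.newFixedThreadPool(4);
        parallelSeek(scanners, "startKey", pool);
        pool.shutdown();
        if (seeked.size() != 4) throw new AssertionError("expected 4 seeks, got " + seeked.size());
        System.out.println("all scanners seeked: " + seeked.size());
    }
}
```

The win is bounded by the slowest seek rather than the sum of all seeks, which is why the benefit shows up mainly when seeks miss the block cache and hit disk.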
[jira] [Created] (HBASE-7878) recoverFileLease does not check return value of recoverLease
Eric Newton created HBASE-7878:

Summary: recoverFileLease does not check return value of recoverLease
Key: HBASE-7878
URL: https://issues.apache.org/jira/browse/HBASE-7878
Project: HBase
Issue Type: Bug
Components: util
Reporter: Eric Newton

I think this is a problem, so I'm opening a ticket so an HBase person takes a look. Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease recovery for Accumulo after HBase's lease recovery. During testing, we experienced data loss. I found it is necessary to wait until recoverLease returns true to know that the file has been truly closed. In FSHDFSUtils, the return result of recoverLease is not checked. In the unit tests created to check lease recovery in HBASE-2645, the return result of recoverLease is always checked. I think FSHDFSUtils should be modified to check the return result, and wait until it returns true.
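The fix Eric describes amounts to a bounded polling loop: retry until recoverLease reports success instead of trusting a single fire-and-forget call. A hedged sketch of that pattern (the LeaseRecoverer interface is a stand-in predicate for illustration, not the real DistributedFileSystem API; the retry bound and pause are assumptions):

```java
public class LeaseRecoverySketch {
    // Stand-in for the HDFS recoverLease(path) call: returns true once the
    // lease has actually been recovered and the file is truly closed.
    interface LeaseRecoverer {
        boolean recoverLease() throws Exception;
    }

    // Retry until recoverLease reports success; give up after maxAttempts so
    // a hung recovery cannot block the caller forever.
    static boolean recoverFileLease(LeaseRecoverer fs, int maxAttempts,
                                    long pauseMs) throws Exception {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (fs.recoverLease()) {
                return true;   // file is truly closed; safe to read its tail
            }
            Thread.sleep(pauseMs);
        }
        return false;          // caller must treat the log as unrecovered
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated NameNode: recovery succeeds on the third call.
        LeaseRecoverer fake = () -> ++calls[0] >= 3;
        boolean ok = recoverFileLease(fake, 10, 1L);
        if (!ok || calls[0] != 3) throw new AssertionError("calls=" + calls[0]);
        System.out.println("lease recovered after " + calls[0] + " attempts");
    }
}
```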
[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-7590:

Attachment: 7590.v1.patch

Add a costless notifications mechanism from master to regionservers clients

Key: HBASE-7590
URL: https://issues.apache.org/jira/browse/HBASE-7590
Project: HBase
Issue Type: Bug
Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Attachments: 7590.inprogress.patch, 7590.v1.patch

It would be very useful to add a mechanism to distribute some information to the clients and regionservers. In particular, it would be useful to know globally (regionservers + client apps) that some regionservers are dead. This would allow:
- lowering the load on the system, without clients using stale information and going to dead machines
- making recovery faster from a client's point of view. It's common to use large timeouts on the client side, so the client may need a lot of time before declaring a region server dead and trying another one. If the client receives information about a region server's state separately, it can take the right decision and continue/stop waiting accordingly.

We can also send more information, for example instructions like 'slow down' to instruct the client to increase the retry delays and so on. Technically, the master could send this information. To lower the load on the system, we should:
- use multicast communication (i.e. the master does not have to connect to all servers by TCP), with one packet every 10 seconds or so
- ensure receivers do not depend on this: if the information is available, great; if not, nothing should break
- make it optional.

So in the end we would have a thread in the master sending a protobuf message about the dead servers on a multicast socket. If the socket is not configured, it does not do anything. On the client side, when we receive the information that a node is dead, we refresh the cache about it.
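The key property of the datagram design above is that decoding is stateless, so a receiver that misses a packet simply refreshes its cache a little later. The proposal uses protobuf over multicast; this dependency-free sketch only illustrates the shape of the message with a hypothetical length-prefixed encoding (the server-name format shown is an assumption):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StatusMessageSketch {
    // Encode the list of newly dead servers into one datagram payload.
    // The real proposal would use a protobuf message; a length-prefixed
    // encoding keeps this sketch dependency-free.
    static byte[] encode(List<String> deadServers) {
        ByteBuffer buf = ByteBuffer.allocate(64 * 1024);
        buf.putInt(deadServers.size());
        for (String s : deadServers) {
            byte[] b = s.getBytes(StandardCharsets.UTF_8);
            buf.putInt(b.length);
            buf.put(b);
        }
        byte[] out = new byte[buf.position()];
        buf.flip();
        buf.get(out);
        return out;
    }

    // Receivers must tolerate missing packets: decoding is stateless, and a
    // lost datagram only means the cache is refreshed a bit later.
    static List<String> decode(byte[] payload) {
        ByteBuffer buf = ByteBuffer.wrap(payload);
        int n = buf.getInt();
        List<String> servers = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            byte[] b = new byte[buf.getInt()];
            buf.get(b);
            servers.add(new String(b, StandardCharsets.UTF_8));
        }
        return servers;
    }

    public static void main(String[] args) {
        List<String> dead = Arrays.asList("rs1.example.com,60020,1361200000000",
                                          "rs2.example.com,60020,1361200000001");
        List<String> roundTrip = decode(encode(dead));
        if (!roundTrip.equals(dead)) throw new AssertionError(roundTrip);
        System.out.println("round-tripped " + roundTrip.size() + " dead servers");
    }
}
```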
[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-7590:

Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581314#comment-13581314 ] nkeywal commented on HBASE-7590:

Nearly there. There is one todo left: HConnectionImplementation throws a ZooKeeperConnectionException; I wonder if I should make it throw an IOException instead. So now, if activated:
- the server sends a status message, at most one every 10 seconds. It contains the list of the newly dead servers. When a server dies, it is sent 5 times, in case a client misses a message. If there are more than 10 servers to send, they are sent in multiple messages (one every 10 seconds), the newly dead first.
- the clients listen for the status message. When they receive the notification that a server is dead, they clean their cache and close the connection to this server. When creating a new connection, they check that the server is not dead. For this, they use the server name and the start code instead of the hostname:port only.
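The send-side behaviour described in the comment on HBASE-7590 (each dead server announced in up to 5 consecutive messages, at most 10 servers per message, newly dead first) can be modeled as a small queue. This is one interpretation of that description, not the patch itself:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class DeadServerBatcher {
    // Interpretation of the described policy: each dead server is announced
    // in up to REPEATS consecutive status messages, each message carries at
    // most MAX_PER_MESSAGE servers, and newly dead servers go out first.
    static final int REPEATS = 5;
    static final int MAX_PER_MESSAGE = 10;

    private static final class Entry {
        final String server;
        int remainingSends = REPEATS;
        Entry(String server) { this.server = server; }
    }

    private final Deque<Entry> queue = new ArrayDeque<>();

    void serverDied(String server) {
        queue.addFirst(new Entry(server));   // newly dead first
    }

    // Build the next status message (called e.g. every 10 seconds).
    List<String> nextMessage() {
        List<String> msg = new ArrayList<>();
        for (Entry e : queue) {
            if (msg.size() == MAX_PER_MESSAGE) break;
            msg.add(e.server);
            e.remainingSends--;
        }
        queue.removeIf(e -> e.remainingSends == 0);  // fully announced
        return msg;
    }

    public static void main(String[] args) {
        DeadServerBatcher b = new DeadServerBatcher();
        for (int i = 1; i <= 12; i++) b.serverDied("rs" + i);
        List<String> first = b.nextMessage();
        // 12 dead servers, newest first: rs12..rs3 fill the first message.
        if (first.size() != 10 || !first.get(0).equals("rs12"))
            throw new AssertionError(first);
        int messages = 1;
        while (!b.nextMessage().isEmpty()) messages++;   // drain the queue
        System.out.println("queue drained after " + messages + " messages");
    }
}
```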
[jira] [Commented] (HBASE-7495) parallel seek in StoreScanner
[ https://issues.apache.org/jira/browse/HBASE-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581321#comment-13581321 ] Ted Yu commented on HBASE-7495:

Integrated to trunk. Thanks for the patch, Liang. Thanks for the reviews, Chunhui, Sergey and Lars.
[jira] [Updated] (HBASE-7495) parallel seek in StoreScanner
[ https://issues.apache.org/jira/browse/HBASE-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7495:

Fix Version/s: 0.96.0
Hadoop Flags: Reviewed
[jira] [Created] (HBASE-7879) JUnit dependency in main from htrace
nkeywal created HBASE-7879:

Summary: JUnit dependency in main from htrace
Key: HBASE-7879
URL: https://issues.apache.org/jira/browse/HBASE-7879
Project: HBase
Issue Type: Bug
Components: build
Affects Versions: 0.96.0
Reporter: nkeywal
Priority: Minor

HTrace's main scope depends on JUnit; it should be test-only. I filed an issue on GitHub: https://github.com/cloudera/htrace/issues/1. If it's not fixed, we will be able to drop it in our pom, but let's wait a little before that.
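"Drop it in our pom" would be the standard Maven exclusion on the htrace dependency. A hedged sketch of what that could look like; the htrace coordinates shown are illustrative assumptions (check the actual pom), only the `<exclusions>` mechanism itself is standard Maven:

```xml
<dependency>
  <groupId>org.cloudera.htrace</groupId>  <!-- illustrative coordinates -->
  <artifactId>htrace</artifactId>
  <!-- version assumed to be managed in the parent pom -->
  <exclusions>
    <exclusion>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```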
[jira] [Commented] (HBASE-7879) JUnit dependency in main from htrace
[ https://issues.apache.org/jira/browse/HBASE-7879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581374#comment-13581374 ] nkeywal commented on HBASE-7879:

I've created the pull request as well: https://github.com/cloudera/htrace/pull/2. Local tests OK.
[jira] [Commented] (HBASE-7878) recoverFileLease does not check return value of recoverLease
[ https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581381#comment-13581381 ] Ted Yu commented on HBASE-7878:

Thanks for reporting this, Eric. I think you're right.
[jira] [Created] (HBASE-7880) HFile Recovery/Rewrite Tool
Matteo Bertozzi created HBASE-7880:

Summary: HFile Recovery/Rewrite Tool
Key: HBASE-7880
URL: https://issues.apache.org/jira/browse/HBASE-7880
Project: HBase
Issue Type: New Feature
Components: HFile
Affects Versions: 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor

Sometimes it is useful to have a tool to migrate files from a new version to an old version (e.g. convert a new XYZ encoded/compressed file to an old uncompressed format). It would also be useful to be able to recover an HFile from a corrupted state (e.g. trailer missing/broken, ...). The user can provide the information about the file (compression codec, ...) and try to recover as much as possible from the file by reading data blocks.
[jira] [Updated] (HBASE-7880) HFile Recovery/Rewrite Tool
[ https://issues.apache.org/jira/browse/HBASE-7880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-7880:

Attachment: HBASE-7880-v0.patch

Attached a quick and dirty patch to rewrite and recover an HFile. We may also add MapReduce support, like the compaction tool, to give the ability to specify a source directory and recover all the files in there by distributing the file recovery. HFile Reader/Scanner have a strong dependency on the trailer and the index, which makes it difficult to reuse some code to scan just the blocks; maybe we can refactor the code a bit to isolate some pieces (like reading key/values) that don't really need the trailer dependency.
[jira] [Commented] (HBASE-7860) HBase authorization is reliant on Kerberos
[ https://issues.apache.org/jira/browse/HBASE-7860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581434#comment-13581434 ] Gary Helmling commented on HBASE-7860:

Looks like this configuration was part of the security documentation, but was removed by HBASE-6027, to reflect the combination of SecureRpcEngine and WritableRpcEngine into ProtobufRpcEngine in trunk. I think this is really an issue with having the generated ref guide on hbase.apache.org being built from trunk, when everyone using it is likely to be running 0.94 or earlier. Have we looked into linking out to the documentation for each release separately, like Hadoop and some other projects do? Would that be easier to do now that our site is converted over to the CMS stuff?

HBase authorization is reliant on Kerberos

Key: HBASE-7860
URL: https://issues.apache.org/jira/browse/HBASE-7860
Project: HBase
Issue Type: Bug
Components: security
Affects Versions: 0.94.4
Reporter: Kevin Odell

We are currently unable to use ACLs without having Kerberos set up. That is a pain for testing and for environments that have other authentication methods that are not Kerberos-centric.

Safety valve:

<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>

[root@cdh4-oozie-1 ~]# hbase shell
hbase(main):001:0> create 't1', 'cf1'

ERROR: org.apache.hadoop.hbase.security.AccessDeniedException: org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient permissions for user 'null' (global, action=CREATE)
at org.apache.hadoop.hbase.security.access.AccessController.requirePermission(AccessController.java:402)
at org.apache.hadoop.hbase.security.access.AccessController.preCreateTable(AccessController.java:525)
at org.apache.hadoop.hbase.master.MasterCoprocessorHost.preCreateTable(MasterCoprocessorHost.java:89)
at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1056)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1345)

[root@cdh4-oozie-1 ~]# su hbase
bash-4.1$ hbase shell
hbase(main):001:0> create 't1', 'cf1'

ERROR: org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient permissions for user 'null' (global, action=CREATE) [same stack trace as above]

It looks like we are relying on Kerberos to tell us who the user is, but since we are not using authentication, we are just passing NULL. We should be able to just rely on the local fs account.
[jira] [Updated] (HBASE-7846) Add support for merging implicit regions in Merge tool
[ https://issues.apache.org/jira/browse/HBASE-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Marc Spaggiari updated HBASE-7846:

Status: Open (was: Patch Available)

Cancelling to give Hadoop QA a chance to re-test it.

Add support for merging implicit regions in Merge tool

Key: HBASE-7846
URL: https://issues.apache.org/jira/browse/HBASE-7846
Project: HBase
Issue Type: Improvement
Components: util
Reporter: Kaufman Ng
Assignee: Jean-Marc Spaggiari
Priority: Minor
Attachments: HBASE-7846-v0-trunk.patch, HBASE-7846-v1-trunk.patch

Currently org.apache.hadoop.hbase.util.Merge needs 2 region names to be explicitly specified to perform a merge. This can be cumbersome. One idea for improvement is to have Merge figure out all the adjacent regions and perform the merges. For example:
regions before merge: row-10, row-20, row-30, row-40, row-50
regions after merge: row-10, row-30, row-50
In the above example, the regions starting at row-10 and row-20 are merged to become a new, bigger region starting at row-10.
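The row-10/row-20 example above pairs adjacent regions left to right, so the selection logic is just a walk over the sorted start keys two at a time, keeping the left key of each pair. A minimal sketch of that pairing (an illustration of the idea, not the Merge tool's actual code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AdjacentMergeSketch {
    // Walk the sorted region start keys two at a time: each pair of adjacent
    // regions collapses into one region that keeps the left start key; a
    // trailing unpaired region is left alone.
    static List<String> mergeAdjacent(List<String> startKeys) {
        List<String> merged = new ArrayList<>();
        for (int i = 0; i < startKeys.size(); i += 2) {
            merged.add(startKeys.get(i));
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("row-10", "row-20", "row-30", "row-40", "row-50");
        List<String> after = mergeAdjacent(before);
        if (!after.equals(Arrays.asList("row-10", "row-30", "row-50")))
            throw new AssertionError(after);
        System.out.println("regions after merge: " + after);
    }
}
```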
[jira] [Updated] (HBASE-7846) Add support for merging implicit regions in Merge tool
[ https://issues.apache.org/jira/browse/HBASE-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Marc Spaggiari updated HBASE-7846:

Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-7866) TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZK failed 3 times in a row
[ https://issues.apache.org/jira/browse/HBASE-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581479#comment-13581479 ] ramkrishna.s.vasudevan commented on HBASE-7866:

@Lars If you are OK with the patch, can you commit it? I have some infrastructure issues that prevent me from going ahead with the commit.

TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZK failed 3 times in a row

Key: HBASE-7866
URL: https://issues.apache.org/jira/browse/HBASE-7866
Project: HBase
Issue Type: Bug
Reporter: Lars Hofhansl
Fix For: 0.96.0, 0.94.6
Attachments: HBASE-7866_0.94.patch

Looks like the Jenkins machines are flaky/slow again, causing this test to fail. Same stack trace all three times:
{code}
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:92)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertTrue(Assert.java:54)
at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZK(TestSplitTransactionOnCluster.java:656)
at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZK(TestSplitTransactionOnCluster.java:608)
{code}
[jira] [Commented] (HBASE-7866) TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZK failed 3 times in a row
[ https://issues.apache.org/jira/browse/HBASE-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581489#comment-13581489 ] Lars Hofhansl commented on HBASE-7866:

Yep... Will commit in a bit.
[jira] [Commented] (HBASE-7495) parallel seek in StoreScanner
[ https://issues.apache.org/jira/browse/HBASE-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581490#comment-13581490 ] Hudson commented on HBASE-7495:

Integrated in HBase-TRUNK #3885 (See [https://builds.apache.org/job/HBase-TRUNK/3885/])
HBASE-7495 parallel seek in StoreScanner (Liang Xie) (Revision 1447740)
Result = FAILURE
tedyu :
Files :
* /hbase/trunk/hbase-examples/src/test/java/org/apache/hadoop/hbase/coprocessor/example/TestBulkDeleteProtocol.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/executor/ExecutorService.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/ParallelSeekHandler.java
* /hbase/trunk/hbase-server/src/main/resources/hbase-default.xml
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestCoprocessorScanPolicy.java
[jira] [Commented] (HBASE-7725) Add generic attributes to CP initiated compaction request AND latch on compaction completion
[ https://issues.apache.org/jira/browse/HBASE-7725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581495#comment-13581495 ] Jesse Yates commented on HBASE-7725: Looks like the test failures are unrelated and it passes locally. Add generic attributes to CP initiated compaction request AND latch on compaction completion Key: HBASE-7725 URL: https://issues.apache.org/jira/browse/HBASE-7725 Project: HBase Issue Type: Bug Components: Compaction, Coprocessors, regionserver Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.96.0, 0.94.6 Attachments: example.java, hbase-7725_0.94-v0.patch, hbase-7725-v0.patch, hbase-7725-v1.patch, hbase-7725-v3.patch, hbase-7725-v4.patch, hbase-7725-v5.patch, hbase-7725-v6.patch, hbase-7725_with-attributes-0.94-v0.patch, hbase-7725_with-attributes-0.94-v1.patch You can request that a compaction be started, but you can't be sure when that compaction request completes. This is a simple update to the CompactionRequest interface and the compact-split thread on the RS that doesn't actually impact the RS exposed interface. This is particularly useful for CPs so they can control starting/running a compaction. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7725) Add generic attributes to CP initiated compaction request AND latch on compaction completion
[ https://issues.apache.org/jira/browse/HBASE-7725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581498#comment-13581498 ] Andrew Purtell commented on HBASE-7725: --- The test failures are due to this: {noformat} java.net.BindException: Problem binding to localhost/127.0.0.1:42113 : Address already in use {noformat} +1 for commit, thanks Jesse! Add generic attributes to CP initiated compaction request AND latch on compaction completion Key: HBASE-7725 URL: https://issues.apache.org/jira/browse/HBASE-7725 Project: HBase Issue Type: Bug Components: Compaction, Coprocessors, regionserver Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.96.0, 0.94.6 Attachments: example.java, hbase-7725_0.94-v0.patch, hbase-7725-v0.patch, hbase-7725-v1.patch, hbase-7725-v3.patch, hbase-7725-v4.patch, hbase-7725-v5.patch, hbase-7725-v6.patch, hbase-7725_with-attributes-0.94-v0.patch, hbase-7725_with-attributes-0.94-v1.patch You can request that a compaction be started, but you can't be sure when that compaction request completes. This is a simple update to the CompactionRequest interface and the compact-split thread on the RS that doesn't actually impact the RS exposed interface. This is particularly useful for CPs so they can control starting/running a compaction. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4210) Allow coprocessor to interact with batches per region sent from a client(?)
[ https://issues.apache.org/jira/browse/HBASE-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-4210: -- Attachment: HBASE-4210_94-V2.patch Allow coprocessor to interact with batches per region sent from a client(?) --- Key: HBASE-4210 URL: https://issues.apache.org/jira/browse/HBASE-4210 Project: HBase Issue Type: New Feature Affects Versions: 0.94.0 Reporter: Lars Hofhansl Assignee: Anoop Sam John Priority: Minor Fix For: 0.96.0, 0.94.6 Attachments: HBASE-4210_94.patch, HBASE-4210_94-V2.patch Currently the coprocessor write hooks - {pre|post}{Put|Delete} - are strictly one row|cell operations. It might be a good idea to allow a coprocessor to deal with batches of puts and deletes as they arrive from the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7818) add region level metrics readReqeustCount and writeRequestCount
[ https://issues.apache.org/jira/browse/HBASE-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581509#comment-13581509 ] Tianying Chang commented on HBASE-7818: --- I am not able to find OperationMetrics.java in the current trunk version. I can see it was checked in in Apr 2012 under the old path src/main/java/org/apache/hadoop/hbase/regionserver/metrics/OperationMetrics.java. But since that path no longer exists, and find . -name OperationMetrics.java also did not return this file from any other path, has this file been deleted/refactored? @Elliott, has the OperationMetrics.java you added been refactored and deleted from trunk? If so, do you know which file is serving that purpose now? Thanks Tian-Ying add region level metrics readReqeustCount and writeRequestCount Key: HBASE-7818 URL: https://issues.apache.org/jira/browse/HBASE-7818 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.4 Reporter: Tianying Chang Assignee: Tianying Chang Priority: Minor Fix For: 0.94.6 Attachments: HBASE-7818_1.patch, HBASE-7818.patch Request rate at the region server level can help identify a hot region server. But it would be good if we could further identify the hot regions on that region server. That way, we can easily find unbalanced-region problems. Currently, readRequestCount and writeRequestCount per region are exposed in the web UI. It would be more useful to expose them through the Hadoop metrics framework and/or JMX, so that people can see the history of when the region was hot. I am exposing the existing readRequestCount/writeRequestCount through the dynamic region-level metrics framework. I am not changing/exposing it as a rate because our OpenTSDB takes the raw read/write counts and already applies a rate function to display the rate. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-7881) Add FSUtils method delete(Path f, boolean recursive)
Ted Yu created HBASE-7881: - Summary: Add FSUtils method delete(Path f, boolean recursive) Key: HBASE-7881 URL: https://issues.apache.org/jira/browse/HBASE-7881 Project: HBase Issue Type: Sub-task Reporter: Ted Yu From Matteo (https://reviews.apache.org/r/9416/diff/2/?file=258262#file258262line402): looking at the source, it seems that checking the return value and throwing an exception is a good way to shoot ourselves in the foot. I've added that check in other places and now I'm regretting it... because a return of false doesn't really mean I'm not able to delete the file/dir. Maybe the file was already deleted by someone else, or renamed... You want to throw an exception only if the file/dir is still there... so if we don't trust the API to throw an exception in case of failure, we should do something like {code} if (!fs.delete(workingDir, true)) { // Make sure that the dir is still there if (fs.exists(workingDir)) { throw new IOException("Unable to delete " + workingDir); } } {code} We can add the following method to FSUtils: void delete(Path f, boolean recursive) throws IOException; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
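Following Matteo's snippet, the proposed helper can be sketched in a few lines. To keep the sketch self-contained, `Fs` below is a minimal stand-in for Hadoop's `org.apache.hadoop.fs.FileSystem` (only `delete` and `exists`); the real method would take a `FileSystem` and a `Path`:

```java
import java.io.IOException;

public class DeleteSketch {
  // Minimal stand-in for org.apache.hadoop.fs.FileSystem.
  interface Fs {
    boolean delete(String path, boolean recursive) throws IOException;
    boolean exists(String path) throws IOException;
  }

  // Throw only if the path is still present afterwards: a false return
  // from delete() may simply mean someone else already deleted or
  // renamed the file, which is not a failure for our purposes.
  static void delete(Fs fs, String path, boolean recursive)
      throws IOException {
    if (!fs.delete(path, recursive) && fs.exists(path)) {
      throw new IOException("Unable to delete " + path);
    }
  }

  public static void main(String[] args) throws IOException {
    // Fake filesystem where the path is already gone: no exception.
    Fs gone = new Fs() {
      public boolean delete(String p, boolean r) { return false; }
      public boolean exists(String p) { return false; }
    };
    delete(gone, "/hbase/.tmp/working", true);
    System.out.println("ok");
  }
}
```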
[jira] [Commented] (HBASE-7878) recoverFileLease does not check return value of recoverLease
[ https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581550#comment-13581550 ] Devaraj Das commented on HBASE-7878: The corresponding Accumulo jira - ACCUMULO-1053 recoverFileLease does not check return value of recoverLease Key: HBASE-7878 URL: https://issues.apache.org/jira/browse/HBASE-7878 Project: HBase Issue Type: Bug Components: util Reporter: Eric Newton I think this is a problem, so I'm opening a ticket so an HBase person takes a look. Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease recovery for Accumulo after HBase's lease recovery. During testing, we experienced data loss. I found it is necessary to wait until recoverLease returns true to know that the file has been truly closed. In FSHDFSUtils, the return result of recoverLease is not checked. In the unit tests created to check lease recovery in HBASE-2645, the return result of recoverLease is always checked. I think FSHDFSUtils should be modified to check the return result, and wait until it returns true. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
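The fix Eric is describing is essentially a bounded retry loop: keep invoking lease recovery until it reports the file is truly closed, instead of calling it once and ignoring the boolean. A self-contained sketch under that assumption; `recoverAttempt` stands in for `DistributedFileSystem.recoverLease(path)`, which returns true only once recovery is complete, and the timeout/pause values are illustrative:

```java
import java.util.function.BooleanSupplier;

public class LeaseWaitSketch {
  // Retry recovery until it succeeds or the deadline passes.
  static boolean waitUntilRecovered(BooleanSupplier recoverAttempt,
                                    long timeoutMs, long pauseMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (recoverAttempt.getAsBoolean()) {
        return true;  // lease recovered: the file is truly closed
      }
      Thread.sleep(pauseMs);  // back off before retrying
    }
    return false;  // caller decides whether to fail or proceed
  }
}
```

Returning true only after the attempt succeeds is the property the description asks for: the caller knows the write-ahead log is closed before replaying it, rather than racing the still-open file.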
[jira] [Commented] (HBASE-7858) cleanup before merging snapshots branch to trunk
[ https://issues.apache.org/jira/browse/HBASE-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581554#comment-13581554 ] Matteo Bertozzi commented on HBASE-7858: Looks good to me, just one question. Why getTakeSnapshotHandler() is now synchronized? {code} - TakeSnapshotHandler getTakeSnapshotHandler(SnapshotDescription snapshot) { + private synchronized TakeSnapshotHandler getTakeSnapshotHandler(SnapshotDescription snapshot) { {code} cleanup before merging snapshots branch to trunk Key: HBASE-7858 URL: https://issues.apache.org/jira/browse/HBASE-7858 Project: HBase Issue Type: Sub-task Reporter: Ted Yu Assignee: Ted Yu Attachments: 7858-v1.txt, 7858-v2.txt, 7858-v3.txt, 7858-v4.txt, 7858-v5.txt, 7858-v6.txt, 7858-v7.txt, 7858-v8.txt There have been a lot of review comments from https://reviews.apache.org/r/9416 Since our goal of merging snapshot feature to trunk would preserve revision history, a separate JIRA is needed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7858) cleanup before merging snapshots branch to trunk
[ https://issues.apache.org/jira/browse/HBASE-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581560#comment-13581560 ] Ted Yu commented on HBASE-7858: --- Looking at other methods which access this.handler, such as snapshotEnabledTable(), they're declared synchronized. getTakeSnapshotHandler() is called by isSnapshotDone() which is not synchronized, hence the change. cleanup before merging snapshots branch to trunk Key: HBASE-7858 URL: https://issues.apache.org/jira/browse/HBASE-7858 Project: HBase Issue Type: Sub-task Reporter: Ted Yu Assignee: Ted Yu Attachments: 7858-v1.txt, 7858-v2.txt, 7858-v3.txt, 7858-v4.txt, 7858-v5.txt, 7858-v6.txt, 7858-v7.txt, 7858-v8.txt There have been a lot of review comments from https://reviews.apache.org/r/9416 Since our goal of merging snapshot feature to trunk would preserve revision history, a separate JIRA is needed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7878) recoverFileLease does not check return value of recoverLease
[ https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-7878: - Priority: Critical (was: Major) Fix Version/s: 0.94.6 0.96.0 recoverFileLease does not check return value of recoverLease Key: HBASE-7878 URL: https://issues.apache.org/jira/browse/HBASE-7878 Project: HBase Issue Type: Bug Components: util Reporter: Eric Newton Priority: Critical Fix For: 0.96.0, 0.94.6 I think this is a problem, so I'm opening a ticket so an HBase person takes a look. Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease recovery for Accumulo after HBase's lease recovery. During testing, we experienced data loss. I found it is necessary to wait until recoverLease returns true to know that the file has been truly closed. In FSHDFSUtils, the return result of recoverLease is not checked. In the unit tests created to check lease recovery in HBASE-2645, the return result of recoverLease is always checked. I think FSHDFSUtils should be modified to check the return result, and wait until it returns true. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7881) Add FSUtils method delete(Path f, boolean recursive)
[ https://issues.apache.org/jira/browse/HBASE-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581585#comment-13581585 ] stack commented on HBASE-7881: -- [~ted_yu] Is [~mbertozzi] unable to file his own issues? Add FSUtils method delete(Path f, boolean recursive) Key: HBASE-7881 URL: https://issues.apache.org/jira/browse/HBASE-7881 Project: HBase Issue Type: Sub-task Reporter: Ted Yu From Matteo (https://reviews.apache.org/r/9416/diff/2/?file=258262#file258262line402): looking at the source, it seems that checking the return value and throwing an exception is a good way to shoot ourselves in the foot. I've added that check in other places and now I'm regretting it... because a return of false doesn't really mean I'm not able to delete the file/dir. Maybe the file was already deleted by someone else, or renamed... You want to throw an exception only if the file/dir is still there... so if we don't trust the API to throw an exception in case of failure, we should do something like {code} if (!fs.delete(workingDir, true)) { // Make sure that the dir is still there if (fs.exists(workingDir)) { throw new IOException("Unable to delete " + workingDir); } } {code} We can add the following method to FSUtils: void delete(Path f, boolean recursive) throws IOException; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7818) add region level metrics readReqeustCount and writeRequestCount
[ https://issues.apache.org/jira/browse/HBASE-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581600#comment-13581600 ] Elliott Clark commented on HBASE-7818: -- The metrics code in trunk is significantly different. OperationMetrics.java was refactored into several different pieces. The linked ppt should describe how metrics are structured on trunk now. add region level metrics readReqeustCount and writeRequestCount Key: HBASE-7818 URL: https://issues.apache.org/jira/browse/HBASE-7818 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.4 Reporter: Tianying Chang Assignee: Tianying Chang Priority: Minor Fix For: 0.94.6 Attachments: HBASE-7818_1.patch, HBASE-7818.patch Request rate at the region server level can help identify a hot region server. But it would be good if we could further identify the hot regions on that region server. That way, we can easily find unbalanced-region problems. Currently, readRequestCount and writeRequestCount per region are exposed in the web UI. It would be more useful to expose them through the Hadoop metrics framework and/or JMX, so that people can see the history of when the region was hot. I am exposing the existing readRequestCount/writeRequestCount through the dynamic region-level metrics framework. I am not changing/exposing it as a rate because our OpenTSDB takes the raw read/write counts and already applies a rate function to display the rate. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7881) Add FSUtils method delete(Path f, boolean recursive)
[ https://issues.apache.org/jira/browse/HBASE-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581607#comment-13581607 ] Matteo Bertozzi commented on HBASE-7881: I guess that this should be part of HBASE-7806 Add FSUtils method delete(Path f, boolean recursive) Key: HBASE-7881 URL: https://issues.apache.org/jira/browse/HBASE-7881 Project: HBase Issue Type: Sub-task Reporter: Ted Yu From Matteo (https://reviews.apache.org/r/9416/diff/2/?file=258262#file258262line402): looking at the source, it seems that checking the return value and throwing an exception is a good way to shoot ourselves in the foot. I've added that check in other places and now I'm regretting it... because a return of false doesn't really mean I'm not able to delete the file/dir. Maybe the file was already deleted by someone else, or renamed... You want to throw an exception only if the file/dir is still there... so if we don't trust the API to throw an exception in case of failure, we should do something like {code} if (!fs.delete(workingDir, true)) { // Make sure that the dir is still there if (fs.exists(workingDir)) { throw new IOException("Unable to delete " + workingDir); } } {code} We can add the following method to FSUtils: void delete(Path f, boolean recursive) throws IOException; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-7881) Add FSUtils method delete(Path f, boolean recursive)
[ https://issues.apache.org/jira/browse/HBASE-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-7881. --- Resolution: Duplicate Covered by HBASE-7806 Add FSUtils method delete(Path f, boolean recursive) Key: HBASE-7881 URL: https://issues.apache.org/jira/browse/HBASE-7881 Project: HBase Issue Type: Sub-task Reporter: Ted Yu From Matteo (https://reviews.apache.org/r/9416/diff/2/?file=258262#file258262line402): looking at the source, it seems that checking the return value and throwing an exception is a good way to shoot ourselves in the foot. I've added that check in other places and now I'm regretting it... because a return of false doesn't really mean I'm not able to delete the file/dir. Maybe the file was already deleted by someone else, or renamed... You want to throw an exception only if the file/dir is still there... so if we don't trust the API to throw an exception in case of failure, we should do something like {code} if (!fs.delete(workingDir, true)) { // Make sure that the dir is still there if (fs.exists(workingDir)) { throw new IOException("Unable to delete " + workingDir); } } {code} We can add the following method to FSUtils: void delete(Path f, boolean recursive) throws IOException; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7866) TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZK failed 3 times in a row
[ https://issues.apache.org/jira/browse/HBASE-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581614#comment-13581614 ] Lars Hofhansl commented on HBASE-7866: -- This is different in trunk. Have no time to do that right now. Will do this evening. TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZK failed 3 times in a row - Key: HBASE-7866 URL: https://issues.apache.org/jira/browse/HBASE-7866 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Fix For: 0.96.0, 0.94.6 Attachments: HBASE-7866_0.94.patch Looks like the jenkins machines are flaky/slow again, causing this test to fail. Same stacktrace all three times: {code} java.lang.AssertionError at org.junit.Assert.fail(Assert.java:92) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertTrue(Assert.java:54) at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZK(TestSplitTransactionOnCluster.java:656) at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZK(TestSplitTransactionOnCluster.java:608) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7763) Compactions not sorting based on size anymore.
[ https://issues.apache.org/jira/browse/HBASE-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581622#comment-13581622 ] Elliott Clark commented on HBASE-7763: -- bq. Why is the order of seqid and bulktime sorts reversed between 94 and trunk? No idea. Let me investigate some. Compactions not sorting based on size anymore. -- Key: HBASE-7763 URL: https://issues.apache.org/jira/browse/HBASE-7763 Project: HBase Issue Type: Bug Components: Compaction Affects Versions: 0.96.0, 0.94.4 Reporter: Elliott Clark Assignee: Elliott Clark Priority: Critical Fix For: 0.96.0 Attachments: HBASE-7763-094-0.patch, HBASE-7763-trunk-1.patch, HBASE-7763-trunk-2.patch, HBASE-7763-trunk-3.patch, HBASE-7763-trunk-TESTING.patch, HBASE-7763-trunk-TESTING.patch, HBASE-7763-trunk-TESTING.patch Currently compaction selection is not sorting based on size. This causes selection to choose larger files to re-write than are needed when bulk loads are involved. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7818) add region level metrics readReqeustCount and writeRequestCount
[ https://issues.apache.org/jira/browse/HBASE-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581625#comment-13581625 ] Tianying Chang commented on HBASE-7818: --- @Elliott Thanks, Elliott. Do you have the JIRA that refactored OperationMetrics.java? add region level metrics readReqeustCount and writeRequestCount Key: HBASE-7818 URL: https://issues.apache.org/jira/browse/HBASE-7818 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.4 Reporter: Tianying Chang Assignee: Tianying Chang Priority: Minor Fix For: 0.94.6 Attachments: HBASE-7818_1.patch, HBASE-7818.patch Request rate at the region server level can help identify a hot region server. But it would be good if we could further identify the hot regions on that region server. That way, we can easily find unbalanced-region problems. Currently, readRequestCount and writeRequestCount per region are exposed in the web UI. It would be more useful to expose them through the Hadoop metrics framework and/or JMX, so that people can see the history of when the region was hot. I am exposing the existing readRequestCount/writeRequestCount through the dynamic region-level metrics framework. I am not changing/exposing it as a rate because our OpenTSDB takes the raw read/write counts and already applies a rate function to display the rate. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7748) Add DelimitedKeyPrefixRegionSplitPolicy
[ https://issues.apache.org/jira/browse/HBASE-7748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581627#comment-13581627 ] Enis Soztutar commented on HBASE-7748: -- @Robert, that looks like a corner case, which should be handled by changing the data model rather than the split policy, no? Add DelimitedKeyPrefixRegionSplitPolicy --- Key: HBASE-7748 URL: https://issues.apache.org/jira/browse/HBASE-7748 Project: HBase Issue Type: New Feature Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.96.0, 0.94.5 Attachments: hbase-7748_v1.patch, hbase-7748_v2.patch, hbase-7748_v3-0.94.patch, hbase-7748_v3.patch DelimitedKeyPrefixRegionSplitPolicy is similar to KeyPrefixRegionSplitPolicy, but with a delimiter for the key instead of a fixed prefix. It can be used for META regions, since we are doing table_name,start_key,region_id.encoded_region_name. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7818) add region level metrics readReqeustCount and writeRequestCount
[ https://issues.apache.org/jira/browse/HBASE-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581632#comment-13581632 ] Elliott Clark commented on HBASE-7818: -- HBASE-4050 is the umbrella JIRA where things were moved to metrics2. HBASE-6410 will have a lot of the regionserver metrics movements. add region level metrics readReqeustCount and writeRequestCount Key: HBASE-7818 URL: https://issues.apache.org/jira/browse/HBASE-7818 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.4 Reporter: Tianying Chang Assignee: Tianying Chang Priority: Minor Fix For: 0.94.6 Attachments: HBASE-7818_1.patch, HBASE-7818.patch Request rate at the region server level can help identify a hot region server. But it would be good if we could further identify the hot regions on that region server. That way, we can easily find unbalanced-region problems. Currently, readRequestCount and writeRequestCount per region are exposed in the web UI. It would be more useful to expose them through the Hadoop metrics framework and/or JMX, so that people can see the history of when the region was hot. I am exposing the existing readRequestCount/writeRequestCount through the dynamic region-level metrics framework. I am not changing/exposing it as a rate because our OpenTSDB takes the raw read/write counts and already applies a rate function to display the rate. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7748) Add DelimitedKeyPrefixRegionSplitPolicy
[ https://issues.apache.org/jira/browse/HBASE-7748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581634#comment-13581634 ] Robert Dyer commented on HBASE-7748: @Enis, perhaps a change in the data model would avoid this situation. However, to me, regardless of the data model, this behaviour appears suboptimal. We select a split point (roughly the middle) and then arbitrarily move it in one direction (to find a group boundary). The original split point is the optimal one in terms of splitting. Thus, we should find the nearest usable split point to that row and maintain as optimal a split as possible. Sure, the example I gave is an extreme case, but even ignoring that, you might end up with non-optimal splits. It may be the case that moving down one single row would find a group boundary, yet we move up many rows anyway. Add DelimitedKeyPrefixRegionSplitPolicy --- Key: HBASE-7748 URL: https://issues.apache.org/jira/browse/HBASE-7748 Project: HBase Issue Type: New Feature Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.96.0, 0.94.5 Attachments: hbase-7748_v1.patch, hbase-7748_v2.patch, hbase-7748_v3-0.94.patch, hbase-7748_v3.patch DelimitedKeyPrefixRegionSplitPolicy is similar to KeyPrefixRegionSplitPolicy, but with a delimiter for the key instead of a fixed prefix. It can be used for META regions, since we are doing table_name,start_key,region_id.encoded_region_name. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
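The selection Robert is arguing for can be sketched as an outward search from the midpoint: take the first group boundary found in either direction, so the split stays as close to the size-based optimum as possible. This is only an illustration of the idea (the names `groupOf` and `nearestGroupBoundary` are invented here); the real split policy works on byte[] keys inside a region, not on an in-memory list of row strings:

```java
import java.util.List;

public class NearestBoundarySketch {
  // Everything before the delimiter identifies the group.
  static String groupOf(String row, char delim) {
    int i = row.indexOf(delim);
    return i < 0 ? row : row.substring(0, i);
  }

  // True when row i starts a new group relative to row i-1.
  static boolean isBoundary(List<String> rows, int i, char delim) {
    return i > 0 && i < rows.size()
        && !groupOf(rows.get(i), delim).equals(groupOf(rows.get(i - 1), delim));
  }

  // Expand outward from mid and return the index of the nearest row that
  // begins a new group; -1 if the region holds a single group.
  static int nearestGroupBoundary(List<String> rows, int mid, char delim) {
    for (int d = 0; ; d++) {
      int up = mid - d, down = mid + d;
      if (up <= 0 && down >= rows.size()) {
        return -1;  // searched the whole region: only one group
      }
      if (down < rows.size() && isBoundary(rows, down, delim)) return down;
      if (up > 0 && isBoundary(rows, up, delim)) return up;
    }
  }
}
```

With rows `[a:1, a:2, a:3, b:1]` and midpoint 2, the nearest boundary is index 3 (one row down), even though a walk that only moves upward would have drifted to the region start.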
[jira] [Commented] (HBASE-7748) Add DelimitedKeyPrefixRegionSplitPolicy
[ https://issues.apache.org/jira/browse/HBASE-7748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581635#comment-13581635 ] Robert Dyer commented on HBASE-7748: BTW all, I filed HBASE-7877 to fix this inefficiency. Add DelimitedKeyPrefixRegionSplitPolicy --- Key: HBASE-7748 URL: https://issues.apache.org/jira/browse/HBASE-7748 Project: HBase Issue Type: New Feature Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.96.0, 0.94.5 Attachments: hbase-7748_v1.patch, hbase-7748_v2.patch, hbase-7748_v3-0.94.patch, hbase-7748_v3.patch DelimitedKeyPrefixRegionSplitPolicy similar to KeyPrefixRegionSplitPolicy, but with a delimiter for the key, instead of a fixed prefix. Can be used for META regions, since we are doing table_name,start_key,region_id.encoded_region_name. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7495) parallel seek in StoreScanner
[ https://issues.apache.org/jira/browse/HBASE-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7495: -- Resolution: Fixed Status: Resolved (was: Patch Available) parallel seek in StoreScanner - Key: HBASE-7495 URL: https://issues.apache.org/jira/browse/HBASE-7495 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.94.3, 0.96.0 Reporter: Liang Xie Assignee: Liang Xie Fix For: 0.96.0 Attachments: 7495-v12.txt, HBASE-7495-0.94.txt, HBASE-7495.txt, HBASE-7495.txt, HBASE-7495.txt, HBASE-7495-v10.txt, HBASE-7495-v11.txt, HBASE-7495-v2.txt, HBASE-7495-v3.txt, HBASE-7495-v4.txt, HBASE-7495-v4.txt, HBASE-7495-v5.txt, HBASE-7495-v6.txt, HBASE-7495-v7.txt, HBASE-7495-v8.txt, HBASE-7495-v9.txt it seems there is room for improvement before doing scanner.next: {code:title=StoreScanner.java|borderStyle=solid} if (explicitColumnQuery && lazySeekEnabledGlobally) { for (KeyValueScanner scanner : scanners) { scanner.requestSeek(matcher.getStartKey(), false, true); } } else { for (KeyValueScanner scanner : scanners) { scanner.seek(matcher.getStartKey()); } } {code} we could do scanner.requestSeek or scanner.seek in parallel, instead of the current serial execution, to reduce latency for this special case. Any ideas on it? I'll give it a try if the comments/suggestions are positive :) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4755) HBase based block placement in DFS
[ https://issues.apache.org/jira/browse/HBASE-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581646#comment-13581646 ] Enis Soztutar commented on HBASE-4755: -- bq. 2) The next step is to have the creation of store files honor this region placement. Is this patch useful without giving hints to the DFSClient about block placement? I thought we still don't have the plumbing for that in Hadoop yet. HBase based block placement in DFS -- Key: HBASE-4755 URL: https://issues.apache.org/jira/browse/HBASE-4755 Project: HBase Issue Type: New Feature Affects Versions: 0.94.0 Reporter: Karthik Ranganathan Assignee: Christopher Gist Attachments: 4755-wip-1.patch The feature as-is is only useful for HBase clusters that care about data locality on regionservers, but it can also enable a lot of nice features down the road. The basic idea is as follows: instead of letting HDFS determine where to replicate data (r=3) by placing blocks on various nodes, it is better to let HBase do so by providing hints to HDFS through the DFS client. That way, instead of replicating data at the block level, we can replicate data at a per-region level (each region owned by a primary, a secondary and a tertiary regionserver). 
This is better for two reasons:
- It can make region failover faster on clusters which benefit from data affinity
- On large clusters with a random block placement policy, it helps reduce the probability of data loss
The algorithm is as follows:
- Each region in META will have 3 columns which are the preferred regionservers for that region (primary, secondary and tertiary)
- Preferred assignment can be controlled by a config knob
- Upon cluster start, HMaster will enter a mapping from each region to 3 regionservers (random hash, could use current locality, etc.)
- The load balancer would assign regions preferring the primary over the secondary over the tertiary over any other node
- Periodically (say weekly, configurable) the HMaster would run a locality check and make sure the map it has from regions to regionservers is optimal.
Down the road, this can be enhanced to control region placement in the following cases:
- Mixed hardware SKUs where some regionservers can hold fewer regions
- Load balancing across tables where we don't want multiple regions of a table to get assigned to the same regionservers
- Multi-tenancy, where we can restrict the assignment of the regions of some table to a subset of regionservers, so an abusive app cannot take down the whole HBase cluster.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
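The mapping step above ("random hash, could use current locality, etc") can be sketched deterministically. A toy version follows; the hash choice and the "next two servers" rule are illustrative assumptions, not the actual HBase placement logic, which could also weigh current locality and racks:

```java
import java.util.List;

public class FavoredNodeSketch {
    // Pick (primary, secondary, tertiary) regionservers for a region by
    // hashing its name over the live-server list. Deterministic, so the
    // META entries can be recomputed, and the three choices are distinct
    // whenever there are at least three servers.
    static String[] favoredNodes(String regionName, List<String> servers) {
        int n = servers.size();
        int h = Math.floorMod(regionName.hashCode(), n); // safe for negative hashes
        return new String[] {
            servers.get(h),            // primary
            servers.get((h + 1) % n),  // secondary
            servers.get((h + 2) % n)   // tertiary
        };
    }
}
```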
[jira] [Updated] (HBASE-7290) Online snapshots
[ https://issues.apache.org/jira/browse/HBASE-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-7290: -- Priority: Blocker (was: Major) Online snapshots - Key: HBASE-7290 URL: https://issues.apache.org/jira/browse/HBASE-7290 Project: HBase Issue Type: Bug Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Blocker Attachments: 7290-mega-v2.txt, 7290-mega-v3.txt, hbase-7290.mega.patch HBASE-6055 will be closed when the offline snapshots pieces get merged with trunk. This umbrella issue has all the online snapshot specific patches. This will get merged once one of the implementations makes it into trunk. Other flavors of online snapshots can then be done as normal patches instead of on a development branch. (was: HBASE-6055 will be closed when the online snapshots pieces get merged with trunk. This umbrella issue has all the online snapshot specific patches. This will get merged once one of the implementations makes it into trunk. Other flavors of online snapshots can then be done as normal patches instead of on a development branch.) (not a fan of the quick edit description jira feature) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7863) make compactionpolicy return compactionrequest
[ https://issues.apache.org/jira/browse/HBASE-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-7863: Attachment: HBASE-7863-v1.patch fix test, minor c/p issue make compactionpolicy return compactionrequest -- Key: HBASE-7863 URL: https://issues.apache.org/jira/browse/HBASE-7863 Project: HBase Issue Type: Improvement Components: Compaction Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-7863-v0.patch, HBASE-7863-v1.patch See HBASE-7843, I figured the patch could be split for easier review. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7858) cleanup before merging snapshots branch to trunk
[ https://issues.apache.org/jira/browse/HBASE-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7858: -- Attachment: 7858-v9.txt Patch v9 changes the following condition in verifyRegions():
{code}
if (region.isOffline() && (region.isSplit() || region.isSplitParent())) {
{code}
cleanup before merging snapshots branch to trunk Key: HBASE-7858 URL: https://issues.apache.org/jira/browse/HBASE-7858 Project: HBase Issue Type: Sub-task Reporter: Ted Yu Assignee: Ted Yu Attachments: 7858-v1.txt, 7858-v2.txt, 7858-v3.txt, 7858-v4.txt, 7858-v5.txt, 7858-v6.txt, 7858-v7.txt, 7858-v8.txt, 7858-v9.txt There have been a lot of review comments from https://reviews.apache.org/r/9416 Since our goal is to merge the snapshot feature to trunk while preserving revision history, a separate JIRA is needed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7858) cleanup before merging snapshots branch to trunk
[ https://issues.apache.org/jira/browse/HBASE-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581682#comment-13581682 ] Ted Yu commented on HBASE-7858: --- I am running the test suite based on patch v9. Will report back if there is a repeatable test failure. cleanup before merging snapshots branch to trunk Key: HBASE-7858 URL: https://issues.apache.org/jira/browse/HBASE-7858 Project: HBase Issue Type: Sub-task Reporter: Ted Yu Assignee: Ted Yu Attachments: 7858-v1.txt, 7858-v2.txt, 7858-v3.txt, 7858-v4.txt, 7858-v5.txt, 7858-v6.txt, 7858-v7.txt, 7858-v8.txt, 7858-v9.txt There have been a lot of review comments from https://reviews.apache.org/r/9416 Since our goal is to merge the snapshot feature to trunk while preserving revision history, a separate JIRA is needed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7818) add region level metrics readRequestCount and writeRequestCount
[ https://issues.apache.org/jira/browse/HBASE-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581700#comment-13581700 ] Tianying Chang commented on HBASE-7818: --- Elliott, thanks, I will take a look at HBASE-6410. BTW, can you take a look at my patch for 94? If it makes sense, I will also refactor it against metrics 2. Maybe also link it under the umbrella jira HBASE-4050. add region level metrics readRequestCount and writeRequestCount Key: HBASE-7818 URL: https://issues.apache.org/jira/browse/HBASE-7818 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.4 Reporter: Tianying Chang Assignee: Tianying Chang Priority: Minor Fix For: 0.94.6 Attachments: HBASE-7818_1.patch, HBASE-7818.patch Request rate at the region server level can help identify a hot region server, but it would be good if we could further identify the hot regions on that region server. That way, we can easily find unbalanced-region problems. Currently, readRequestCount and writeRequestCount per region are exposed in the web UI. It would be more useful to expose them through the hadoop metrics framework and/or JMX, so that people can see the history when the region is hot. I am exposing the existing readRequestCount/writeRequestCount through the dynamic region level metrics framework. I am not changing/exposing them as a rate because our openTSDB already takes the raw read/write counts and applies a rate function to display the rate. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-7882) move region level metrics readRequestCount and writeRequestCount to Metric 2
Tianying Chang created HBASE-7882: - Summary: move region level metrics readRequestCount and writeRequestCount to Metric 2 Key: HBASE-7882 URL: https://issues.apache.org/jira/browse/HBASE-7882 Project: HBase Issue Type: Bug Components: metrics Affects Versions: 0.96.0 Reporter: Tianying Chang Assignee: Tianying Chang Priority: Minor Fix For: 0.96.0 HBASE-7818 is for 94. Following the refactor of HBASE-6410, I need to refactor the 94 patch of HBASE-7818 against metrics 2. The patch for 96 will be very different from the 94 one. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7495) parallel seek in StoreScanner
[ https://issues.apache.org/jira/browse/HBASE-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7495: -- Release Note: This JIRA adds the ability to seek in parallel in StoreScanner. It is off by default. The config parameter hbase.storescanner.parallel.seek.enable turns on this feature; hbase.storescanner.parallel.seek.threads controls the number of threads in the thread pool which serves parallel seeking (default: 10 threads). parallel seek in StoreScanner - Key: HBASE-7495 URL: https://issues.apache.org/jira/browse/HBASE-7495 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.94.3, 0.96.0 Reporter: Liang Xie Assignee: Liang Xie Fix For: 0.96.0 Attachments: 7495-v12.txt, HBASE-7495-0.94.txt, HBASE-7495.txt, HBASE-7495.txt, HBASE-7495.txt, HBASE-7495-v10.txt, HBASE-7495-v11.txt, HBASE-7495-v2.txt, HBASE-7495-v3.txt, HBASE-7495-v4.txt, HBASE-7495-v4.txt, HBASE-7495-v5.txt, HBASE-7495-v6.txt, HBASE-7495-v7.txt, HBASE-7495-v8.txt, HBASE-7495-v9.txt There seems to be room for improvement before doing scanner.next:
{code:title=StoreScanner.java|borderStyle=solid}
if (explicitColumnQuery && lazySeekEnabledGlobally) {
  for (KeyValueScanner scanner : scanners) {
    scanner.requestSeek(matcher.getStartKey(), false, true);
  }
} else {
  for (KeyValueScanner scanner : scanners) {
    scanner.seek(matcher.getStartKey());
  }
}
{code}
We can do scanner.requestSeek or scanner.seek in parallel, instead of the current serial execution, to reduce latency for this special case. Any ideas on it? I'll have a try if the comments/suggestions are positive :) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
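Per the release note above, turning the feature on is a matter of two properties in hbase-site.xml (the property names come from the release note; the values below show an explicit enable and the documented default thread count):

```xml
<!-- hbase-site.xml -->
<property>
  <name>hbase.storescanner.parallel.seek.enable</name>
  <value>true</value> <!-- feature is off by default -->
</property>
<property>
  <name>hbase.storescanner.parallel.seek.threads</name>
  <value>10</value> <!-- seek thread pool size; 10 is the default -->
</property>
```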
[jira] [Commented] (HBASE-7863) make compactionpolicy return compactionrequest
[ https://issues.apache.org/jira/browse/HBASE-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581740#comment-13581740 ] Hadoop QA commented on HBASE-7863: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12570007/HBASE-7863-v1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4463//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4463//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4463//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4463//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4463//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4463//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4463//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4463//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4463//console This message is automatically generated. make compactionpolicy return compactionrequest -- Key: HBASE-7863 URL: https://issues.apache.org/jira/browse/HBASE-7863 Project: HBase Issue Type: Improvement Components: Compaction Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-7863-v0.patch, HBASE-7863-v1.patch See HBASE-7843, I figured the patch could be split for easier review. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7858) cleanup before merging snapshots branch to trunk
[ https://issues.apache.org/jira/browse/HBASE-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581750#comment-13581750 ] Ted Yu commented on HBASE-7858: --- There is no repeatable test failure. cleanup before merging snapshots branch to trunk Key: HBASE-7858 URL: https://issues.apache.org/jira/browse/HBASE-7858 Project: HBase Issue Type: Sub-task Reporter: Ted Yu Assignee: Ted Yu Attachments: 7858-v1.txt, 7858-v2.txt, 7858-v3.txt, 7858-v4.txt, 7858-v5.txt, 7858-v6.txt, 7858-v7.txt, 7858-v8.txt, 7858-v9.txt There have been a lot of review comments from https://reviews.apache.org/r/9416 Since our goal is to merge the snapshot feature to trunk while preserving revision history, a separate JIRA is needed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7725) Add generic attributes to CP initiated compaction request AND latch on compaction completion
[ https://issues.apache.org/jira/browse/HBASE-7725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581756#comment-13581756 ] Jesse Yates commented on HBASE-7725: Cool, thanks Andy! I'm planning to commit this evening then, unless there are objections. Add generic attributes to CP initiated compaction request AND latch on compaction completion Key: HBASE-7725 URL: https://issues.apache.org/jira/browse/HBASE-7725 Project: HBase Issue Type: Bug Components: Compaction, Coprocessors, regionserver Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.96.0, 0.94.6 Attachments: example.java, hbase-7725_0.94-v0.patch, hbase-7725-v0.patch, hbase-7725-v1.patch, hbase-7725-v3.patch, hbase-7725-v4.patch, hbase-7725-v5.patch, hbase-7725-v6.patch, hbase-7725_with-attributes-0.94-v0.patch, hbase-7725_with-attributes-0.94-v1.patch You can request that a compaction be started, but you can't be sure when that compaction request completes. This is a simple update to the CompactionRequest interface and the compact-split thread on the RS that doesn't actually impact the RS exposed interface. This is particularly useful for CPs so they can control starting/running a compaction. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7858) cleanup before merging snapshots branch to trunk
[ https://issues.apache.org/jira/browse/HBASE-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581766#comment-13581766 ] Ted Yu commented on HBASE-7858: --- Checked in patch v9 to hbase-7290v2 branch. Thanks for the reviews Jon, Matteo and Jesse. We can use follow-on JIRAs for remaining comments. cleanup before merging snapshots branch to trunk Key: HBASE-7858 URL: https://issues.apache.org/jira/browse/HBASE-7858 Project: HBase Issue Type: Sub-task Reporter: Ted Yu Assignee: Ted Yu Attachments: 7858-v1.txt, 7858-v2.txt, 7858-v3.txt, 7858-v4.txt, 7858-v5.txt, 7858-v6.txt, 7858-v7.txt, 7858-v8.txt, 7858-v9.txt There have been a lot of review comments from https://reviews.apache.org/r/9416 Since our goal is to merge the snapshot feature to trunk while preserving revision history, a separate JIRA is needed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-7878) recoverFileLease does not check return value of recoverLease
[ https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-7878: - Assignee: Ted Yu recoverFileLease does not check return value of recoverLease Key: HBASE-7878 URL: https://issues.apache.org/jira/browse/HBASE-7878 Project: HBase Issue Type: Bug Components: util Reporter: Eric Newton Assignee: Ted Yu Priority: Critical Fix For: 0.96.0, 0.94.6 I think this is a problem, so I'm opening a ticket so an HBase person takes a look. Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease recovery for Accumulo after HBase's lease recovery. During testing, we experienced data loss. I found it is necessary to wait until recoverLease returns true to know that the file has been truly closed. In FSHDFSUtils, the return result of recoverLease is not checked. In the unit tests created to check lease recovery in HBASE-2645, the return result of recoverLease is always checked. I think FSHDFSUtils should be modified to check the return result, and wait until it returns true. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7878) recoverFileLease does not check return value of recoverLease
[ https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7878: -- Attachment: 7878-trunk-v1.txt Patch for trunk. recoverFileLease does not check return value of recoverLease Key: HBASE-7878 URL: https://issues.apache.org/jira/browse/HBASE-7878 Project: HBase Issue Type: Bug Components: util Reporter: Eric Newton Assignee: Ted Yu Priority: Critical Fix For: 0.96.0, 0.94.6 Attachments: 7878-trunk-v1.txt I think this is a problem, so I'm opening a ticket so an HBase person takes a look. Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease recovery for Accumulo after HBase's lease recovery. During testing, we experienced data loss. I found it is necessary to wait until recoverLease returns true to know that the file has been truly closed. In FSHDFSUtils, the return result of recoverLease is not checked. In the unit tests created to check lease recovery in HBASE-2645, the return result of recoverLease is always checked. I think FSHDFSUtils should be modified to check the return result, and wait until it returns true. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7878) recoverFileLease does not check return value of recoverLease
[ https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7878: -- Status: Patch Available (was: Open) recoverFileLease does not check return value of recoverLease Key: HBASE-7878 URL: https://issues.apache.org/jira/browse/HBASE-7878 Project: HBase Issue Type: Bug Components: util Reporter: Eric Newton Assignee: Ted Yu Priority: Critical Fix For: 0.96.0, 0.94.6 Attachments: 7878-trunk-v1.txt I think this is a problem, so I'm opening a ticket so an HBase person takes a look. Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease recovery for Accumulo after HBase's lease recovery. During testing, we experienced data loss. I found it is necessary to wait until recoverLease returns true to know that the file has been truly closed. In FSHDFSUtils, the return result of recoverLease is not checked. In the unit tests created to check lease recovery in HBASE-2645, the return result of recoverLease is always checked. I think FSHDFSUtils should be modified to check the return result, and wait until it returns true. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7878) recoverFileLease does not check return value of recoverLease
[ https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581823#comment-13581823 ] Hadoop QA commented on HBASE-7878: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12570029/7878-trunk-v1.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4464//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4464//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4464//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4464//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4464//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4464//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4464//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4464//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4464//console This message is automatically generated. recoverFileLease does not check return value of recoverLease Key: HBASE-7878 URL: https://issues.apache.org/jira/browse/HBASE-7878 Project: HBase Issue Type: Bug Components: util Reporter: Eric Newton Assignee: Ted Yu Priority: Critical Fix For: 0.96.0, 0.94.6 Attachments: 7878-trunk-v1.txt I think this is a problem, so I'm opening a ticket so an HBase person takes a look. Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease recovery for Accumulo after HBase's lease recovery. During testing, we experienced data loss. I found it is necessary to wait until recoverLease returns true to know that the file has been truly closed. In FSHDFSUtils, the return result of recoverLease is not checked. 
In the unit tests created to check lease recovery in HBASE-2645, the return result of recoverLease is always checked. I think FSHDFSUtils should be modified to check the return result, and wait until it returns true. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
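The fix the ticket asks for is a retry loop around recoverLease rather than a fire-and-forget call. A minimal sketch with the filesystem abstracted away (the real code calls DistributedFileSystem#recoverLease(Path), which returns true once the lease is actually recovered; the timeout and retry interval below are illustrative, not values from the patch):

```java
import java.io.IOException;

public class LeaseRecovery {
    // Stand-in for DistributedFileSystem#recoverLease(Path).
    interface LeaseRecoverer {
        boolean recoverLease(String path) throws IOException;
    }

    // Retry until recoverLease reports success or the deadline passes.
    // Only a true return means the file is truly closed and safe to read.
    static boolean waitForLeaseRecovery(LeaseRecoverer fs, String path,
                                        long timeoutMs, long retryMs)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            if (fs.recoverLease(path)) {
                return true;       // recovery complete
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;      // caller decides how to handle failure
            }
            Thread.sleep(retryMs); // give the NameNode time to finish
        }
    }
}
```

A caller that ignores the boolean (as FSHDFSUtils did) can start reading a write-ahead log whose last block is still unrecovered, which is exactly the data-loss scenario described in the report.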
[jira] [Created] (HBASE-7883) Update memstore size when removing the entries in append operation
Himanshu Vashishtha created HBASE-7883: -- Summary: Update memstore size when removing the entries in append operation Key: HBASE-7883 URL: https://issues.apache.org/jira/browse/HBASE-7883 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.0 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Fix For: 0.96.0 The memstore size is not updated when the previous entries are removed from the memstore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7883) Update memstore size when removing the entries in append operation
[ https://issues.apache.org/jira/browse/HBASE-7883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Himanshu Vashishtha updated HBASE-7883: --- Attachment: HBASE-7883-v1.patch Patch that fixes the issue. Tested in an environment where I reproduced the issue without the patch. Fwiw, TestGlobalMemStoreSize and TestMemStore pass. Update memstore size when removing the entries in append operation -- Key: HBASE-7883 URL: https://issues.apache.org/jira/browse/HBASE-7883 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.0 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Fix For: 0.96.0 Attachments: HBASE-7883-v1.patch The memstore size is not updated when the previous entries are removed from the memstore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7883) Update memstore size when removing the entries in append operation
[ https://issues.apache.org/jira/browse/HBASE-7883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Himanshu Vashishtha updated HBASE-7883: --- Description: In case of Appends/Increments with VERSION of CF set to 1, the memstore size is not updated when the previous entries are removed from the memstore. (was: The memstore size is not updated when the previous entries are removed from the memstore. ) Update memstore size when removing the entries in append operation -- Key: HBASE-7883 URL: https://issues.apache.org/jira/browse/HBASE-7883 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.0 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Fix For: 0.96.0 Attachments: HBASE-7883-v1.patch In case of Appends/Increments with VERSION of CF set to 1, the memstore size is not updated when the previous entries are removed from the memstore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7883) Update memstore size when removing the entries in append operation
[ https://issues.apache.org/jira/browse/HBASE-7883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Himanshu Vashishtha updated HBASE-7883: --- Status: Patch Available (was: Open) Update memstore size when removing the entries in append operation -- Key: HBASE-7883 URL: https://issues.apache.org/jira/browse/HBASE-7883 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.0 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Fix For: 0.96.0 Attachments: HBASE-7883-v1.patch In case of Appends/Increments with VERSION of CF set to 1, the memstore size is not updated when the previous entries are removed from the memstore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
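The accounting bug is easy to state in miniature: with VERSIONS=1, an append/increment upsert adds the new cell's size but never subtracts the evicted old cell's, so the tracked memstore size drifts upward. A toy model of the corrected bookkeeping (the class and method names here are hypothetical, not HBase's MemStore API):

```java
import java.util.HashMap;
import java.util.Map;

public class MemStoreSizeSketch {
    // Tracked heap size and a single-version store: row -> cell size.
    private long size = 0;
    private final Map<String, Long> cells = new HashMap<>();

    // Upsert for VERSIONS=1: the new cell replaces any existing cell for
    // the row. The fix: account for the removed cell's size as well,
    // instead of only adding the new cell's size.
    long upsert(String row, long cellSize) {
        Long old = cells.put(row, cellSize);
        long delta = cellSize - (old == null ? 0 : old);
        size += delta;
        return delta;
    }

    long size() { return size; }
}
```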
[jira] [Assigned] (HBASE-7864) Rename HMaster#listSnapshots as getCompletedSnapshots()
[ https://issues.apache.org/jira/browse/HBASE-7864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-7864: - Assignee: Ted Yu Rename HMaster#listSnapshots as getCompletedSnapshots() --- Key: HBASE-7864 URL: https://issues.apache.org/jira/browse/HBASE-7864 Project: HBase Issue Type: Sub-task Reporter: Ted Yu Assignee: Ted Yu During code review, I proposed renaming HMaster#listSnapshots as getCompletedSnapshots() Jon agreed. This task would perform the renaming across Java and Ruby code -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7864) Rename HMaster#listSnapshots as getCompletedSnapshots()
[ https://issues.apache.org/jira/browse/HBASE-7864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7864: -- Attachment: 7864.txt Rename HMaster#listSnapshots as getCompletedSnapshots() --- Key: HBASE-7864 URL: https://issues.apache.org/jira/browse/HBASE-7864 Project: HBase Issue Type: Sub-task Reporter: Ted Yu Assignee: Ted Yu Attachments: 7864.txt During code review, I proposed renaming HMaster#listSnapshots as getCompletedSnapshots() Jon agreed. This task would perform the renaming across Java and Ruby code -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Work started] (HBASE-7864) Rename HMaster#listSnapshots as getCompletedSnapshots()
[ https://issues.apache.org/jira/browse/HBASE-7864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-7864 started by Ted Yu. Rename HMaster#listSnapshots as getCompletedSnapshots() --- Key: HBASE-7864 URL: https://issues.apache.org/jira/browse/HBASE-7864 Project: HBase Issue Type: Sub-task Reporter: Ted Yu Assignee: Ted Yu Attachments: 7864.txt During code review, I proposed renaming HMaster#listSnapshots as getCompletedSnapshots() Jon agreed. This task would perform the renaming across Java and Ruby code -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-7884) ByteBloomFilter's performance can be optimized by avoiding multiplexing operation when generating hash
clockfly created HBASE-7884: --- Summary: ByteBloomFilter's performance can be optimized by avoiding multiplexing operation when generating hash Key: HBASE-7884 URL: https://issues.apache.org/jira/browse/HBASE-7884 Project: HBase Issue Type: Bug Components: Performance Affects Versions: 0.94.5 Reporter: clockfly Priority: Minor ByteBloomFilter's performance can be optimized by avoiding multiplexing operation when generating hash -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7884) ByteBloomFilter's performance can be optimized by avoiding multiplexing operation when generating hash
[ https://issues.apache.org/jira/browse/HBASE-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] clockfly updated HBASE-7884: Attachment: bloom_performance_tunning.patch ByteBloomFilter's performance can be optimized by avoiding multiplexing operation when generating hash --- Key: HBASE-7884 URL: https://issues.apache.org/jira/browse/HBASE-7884 Project: HBase Issue Type: Bug Components: Performance Affects Versions: 0.94.5 Reporter: clockfly Priority: Minor Attachments: bloom_performance_tunning.patch ByteBloomFilter's performance can be optimized by avoiding multiplexing operation when generating hash -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
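The attached patch is not quoted in this thread, so the following is an illustration only. Assuming "multiplexing operation" refers to the modulo used when mapping a hash to a bloom-filter bucket, a common optimization is to size the bit array as a power of two and use a bit mask instead of `%`. The class and method names below are invented for the sketch:

```java
// Sketch only (not the HBASE-7884 patch): avoiding a per-hash modulo by
// constraining the bloom filter's bit size to a power of two.
public class BloomIndex {
    // Portable modulo-based mapping; works for any bitSize.
    static int bucketByMod(int hash, int bitSize) {
        return Math.abs(hash % bitSize);
    }

    // Mask-based mapping; valid only when bitSize is a power of two.
    static int bucketByMask(int hash, int bitSize) {
        return hash & (bitSize - 1);
    }

    public static void main(String[] args) {
        int bitSize = 1 << 20; // a power of two
        int hash = 0x5f3759df; // any non-negative hash
        // Both mappings agree for non-negative hashes when bitSize is 2^k,
        // but the mask avoids the integer-division cost of the modulo.
        System.out.println(bucketByMod(hash, bitSize) == bucketByMask(hash, bitSize));
    }
}
```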
[jira] [Created] (HBASE-7885) bloom filter compaction is too aggressive for Hfile which only contains small count of records
clockfly created HBASE-7885: --- Summary: bloom filter compaction is too aggressive for Hfile which only contains small count of records Key: HBASE-7885 URL: https://issues.apache.org/jira/browse/HBASE-7885 Project: HBase Issue Type: Bug Components: Performance, Scanners Affects Versions: 0.94.5 Reporter: clockfly Priority: Minor Fix For: 0.94.5 For HFile V2, the bloom filter takes an initial size of 128KB. When not that many records are inserted into the bloom filter, it will start to shrink itself via compaction. For example, 128K will compact to 64K-32K-16K-8K-4K-2K-1K-512-256-128-64-32, as long as it thinks the result can still be bounded by the estimated error rate. If we put only a few records in the HFile, the bloom filter will be compacted to a size that is too small, which breaks the assumption that shrinking is still bounded by the estimated error rate. The false positive rate then becomes unacceptably high. For example, if we set the expected error rate to 0.1, then for 10 records the bloom filter will be compacted down to 64 bytes, and the real effective false positive rate will be 50%. The use case is like this: if we are using HBase to store big records like images and binaries, each record takes megabytes, so a 128M file will only contain dozens of records. The suggested fix is to set a lower limit on the bloom filter compaction process. I suggest 1000 bytes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
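The exact 10-record/64-byte numbers above depend on ByteBloomFilter's folding details, but the trend can be illustrated with the standard false-positive estimate p = (1 - e^(-kn/m))^k for an m-bit filter holding n keys with k hash functions. The k and n values below are chosen arbitrarily for the sketch:

```java
// Sketch: why shrinking a bloom filter too far blows past the target
// error rate. This is the textbook estimate, not ByteBloomFilter's code.
public class BloomFoldDemo {
    // False-positive estimate: p = (1 - e^(-k*n/m))^k
    static double falsePositiveRate(long mBits, long n, int k) {
        return Math.pow(1.0 - Math.exp(-(double) k * n / mBits), k);
    }

    public static void main(String[] args) {
        int k = 3;      // hash count roughly sized for a ~10% target error
        long n = 1000;  // keys actually inserted
        // Halving the bit array repeatedly, as folding does, drives p up fast.
        for (long mBits = 128L * 1024 * 8; mBits >= 256; mBits /= 2) {
            System.out.printf("m=%8d bits  p=%.6f%n", mBits, falsePositiveRate(mBits, n, k));
        }
    }
}
```

A lower bound on the folded size, as the reporter suggests, caps how far this curve can be climbed.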
[jira] [Updated] (HBASE-7885) bloom filter compaction is too aggressive for Hfile which only contains small count of records
[ https://issues.apache.org/jira/browse/HBASE-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] clockfly updated HBASE-7885: Attachment: hbase_bloom_shrink_fix.patch bloom filter compaction is too aggressive for Hfile which only contains small count of records -- Key: HBASE-7885 URL: https://issues.apache.org/jira/browse/HBASE-7885 Project: HBase Issue Type: Bug Components: Performance, Scanners Affects Versions: 0.94.5 Reporter: clockfly Priority: Minor Fix For: 0.94.5 Attachments: hbase_bloom_shrink_fix.patch For HFile V2, the bloom filter takes an initial size of 128KB. When not that many records are inserted into the bloom filter, it will start to shrink itself via compaction. For example, 128K will compact to 64K-32K-16K-8K-4K-2K-1K-512-256-128-64-32, as long as it thinks the result can still be bounded by the estimated error rate. If we put only a few records in the HFile, the bloom filter will be compacted to a size that is too small, which breaks the assumption that shrinking is still bounded by the estimated error rate. The false positive rate then becomes unacceptably high. For example, if we set the expected error rate to 0.1, then for 10 records the bloom filter will be compacted down to 64 bytes, and the real effective false positive rate will be 50%. The use case is like this: if we are using HBase to store big records like images and binaries, each record takes megabytes, so a 128M file will only contain dozens of records. The suggested fix is to set a lower limit on the bloom filter compaction process. I suggest 1000 bytes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7883) Update memstore size when removing the entries in append operation
[ https://issues.apache.org/jira/browse/HBASE-7883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581881#comment-13581881 ] Hadoop QA commented on HBASE-7883: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12570044/HBASE-7883-v1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4465//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4465//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4465//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4465//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4465//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4465//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4465//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4465//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4465//console This message is automatically generated. Update memstore size when removing the entries in append operation -- Key: HBASE-7883 URL: https://issues.apache.org/jira/browse/HBASE-7883 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.0 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Fix For: 0.96.0 Attachments: HBASE-7883-v1.patch In case of Appends/Increments with VERSION of CF set to 1, the memstore size is not updated when the previous entries are removed from the memstore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7462) TestDrainingServer is an integration test. It should be a unit test instead
[ https://issues.apache.org/jira/browse/HBASE-7462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581902#comment-13581902 ] Gustavo Anatoly commented on HBASE-7462: Hi, Nicolas. The task isn't finished yet, because I'm trying to fix an interrupted exception when calling AssignmentManager#assign(Map<HRegionInfo, ServerName>), and the other point is my delay in submitting a patch, caused by the learning curve of understanding the AM and ZK interaction. TestDrainingServer is an integration test. It should be a unit test instead --- Key: HBASE-7462 URL: https://issues.apache.org/jira/browse/HBASE-7462 Project: HBase Issue Type: Wish Components: test Affects Versions: 0.96.0 Reporter: nkeywal Priority: Trivial Labels: noob TestDrainingServer tests the function that lets us say that a regionserver should not get new regions. As written today, it's an integration test: it starts and stops a cluster. The test would be more efficient if it just checked that the AssignmentManager does not use the drained region server, whatever the circumstances (bulk assign or not, for example). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7495) parallel seek in StoreScanner
[ https://issues.apache.org/jira/browse/HBASE-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581913#comment-13581913 ] Hudson commented on HBASE-7495: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #412 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/412/]) HBASE-7495 parallel seek in StoreScanner (Liang Xie) (Revision 1447740) Result = FAILURE tedyu : Files : * /hbase/trunk/hbase-examples/src/test/java/org/apache/hadoop/hbase/coprocessor/example/TestBulkDeleteProtocol.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/executor/ExecutorService.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/ParallelSeekHandler.java * /hbase/trunk/hbase-server/src/main/resources/hbase-default.xml * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestCoprocessorScanPolicy.java parallel seek in StoreScanner - Key: HBASE-7495 URL: https://issues.apache.org/jira/browse/HBASE-7495 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.94.3, 0.96.0 Reporter: Liang Xie Assignee: Liang Xie Fix For: 0.96.0 Attachments: 7495-v12.txt, HBASE-7495-0.94.txt, HBASE-7495.txt, HBASE-7495.txt, HBASE-7495.txt, HBASE-7495-v10.txt, HBASE-7495-v11.txt, 
HBASE-7495-v2.txt, HBASE-7495-v3.txt, HBASE-7495-v4.txt, HBASE-7495-v4.txt, HBASE-7495-v5.txt, HBASE-7495-v6.txt, HBASE-7495-v7.txt, HBASE-7495-v8.txt, HBASE-7495-v9.txt seems there's potential room for improvement before doing scanner.next: {code:title=StoreScanner.java|borderStyle=solid} if (explicitColumnQuery && lazySeekEnabledGlobally) { for (KeyValueScanner scanner : scanners) { scanner.requestSeek(matcher.getStartKey(), false, true); } } else { for (KeyValueScanner scanner : scanners) { scanner.seek(matcher.getStartKey()); } } {code} we can do scanner.requestSeek or scanner.seek in parallel, instead of the current serialization, to reduce latency for special cases. Any ideas on it? I'll have a try if the comments/suggestions are positive :) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
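Per the file list in the Hudson comment above, the committed patch implements this with a ParallelSeekHandler run on the region server's ExecutorService. The idea can be sketched with a plain ExecutorService and a CountDownLatch; the types below are simplified stand-ins, not HBase's real KeyValueScanner:

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;

// Sketch of parallel seeks: submit each scanner's seek to a pool, then
// wait for all of them, instead of seeking one scanner at a time.
public class ParallelSeek {
    interface KeyValueScanner {
        void seek(byte[] key) throws Exception;
    }

    static void seekAll(List<KeyValueScanner> scanners, byte[] startKey,
                        ExecutorService pool) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(scanners.size());
        for (KeyValueScanner scanner : scanners) {
            pool.submit(() -> {
                try {
                    scanner.seek(startKey);   // seeks now run concurrently
                } catch (Exception e) {
                    // the real handler records the failure for the caller
                } finally {
                    done.countDown();
                }
            });
        }
        done.await();  // block until every scanner has finished seeking
    }
}
```

With N store files this turns N sequential disk seeks into at most N concurrent ones, which is where the latency win for the cache-miss case comes from.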
[jira] [Commented] (HBASE-7883) Update memstore size when removing the entries in append operation
[ https://issues.apache.org/jira/browse/HBASE-7883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581937#comment-13581937 ] Ted Yu commented on HBASE-7883: --- +1 on patch. Update memstore size when removing the entries in append operation -- Key: HBASE-7883 URL: https://issues.apache.org/jira/browse/HBASE-7883 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.0 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Fix For: 0.96.0 Attachments: HBASE-7883-v1.patch In case of Appends/Increments with VERSION of CF set to 1, the memstore size is not updated when the previous entries are removed from the memstore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-7886) [replication] hlog zk node will not delete if client roll hlog
terry zhang created HBASE-7886: -- Summary: [replication] hlog zk node will not delete if client roll hlog Key: HBASE-7886 URL: https://issues.apache.org/jira/browse/HBASE-7886 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.4 Reporter: terry zhang Assignee: terry zhang If we use the hbase shell command hlog_roll on a regionserver that has replication configured, the hlog zk node under /hbase/replication/rs/1 cannot be deleted. This issue is caused by HBASE-6758. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7886) [replication] hlog zk node will not delete if client roll hlog
[ https://issues.apache.org/jira/browse/HBASE-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581990#comment-13581990 ] terry zhang commented on HBASE-7886: The hlog zk node is deleted in shipEdits() or in: {code:title=ReplicationSource.java|borderStyle=solid} if (this.isActive() && (gotIOE || currentNbEntries == 0)) { if (this.lastLoggedPosition != this.position) { this.manager.logPositionAndCleanOldLogs(this.currentPath, this.peerClusterZnode, this.position, queueRecovered, currentWALisBeingWrittenTo); this.lastLoggedPosition = this.position; } if (sleepForRetries("Nothing to replicate", sleepMultiplier)) { sleepMultiplier++; } continue; } {code} But after patch HBASE-6758, logPositionAndCleanOldLogs cannot delete the hlog zk node when currentWALisBeingWrittenTo is true. When the log is switched, we can see: // If we didn't get anything and the queue has an object, it means we // hit the end of the file for sure return seenEntries == 0 && processEndOfFile(); // seenEntries is 0 when we run 'hlog_roll' in the shell So ReplicationSource will continue and the hlog zk node cannot be deleted. {code:title=ReplicationSource.java|borderStyle=solid} if (readAllEntriesToReplicateOrNextFile(currentWALisBeingWrittenTo)) { continue; } {code} [replication] hlog zk node will not delete if client roll hlog -- Key: HBASE-7886 URL: https://issues.apache.org/jira/browse/HBASE-7886 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.4 Reporter: terry zhang Assignee: terry zhang If we use the hbase shell command hlog_roll on a regionserver that has replication configured, the hlog zk node under /hbase/replication/rs/1 cannot be deleted. This issue is caused by HBASE-6758. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7886) [replication] hlog zk node will not delete if client roll hlog
[ https://issues.apache.org/jira/browse/HBASE-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581993#comment-13581993 ] terry zhang commented on HBASE-7886: This issue can also be reproduced when no data is written to the cluster, which is the same as running 'hlog_roll' in the shell. [replication] hlog zk node will not delete if client roll hlog -- Key: HBASE-7886 URL: https://issues.apache.org/jira/browse/HBASE-7886 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.4 Reporter: terry zhang Assignee: terry zhang If we use the hbase shell command hlog_roll on a regionserver that has replication configured, the hlog zk node under /hbase/replication/rs/1 cannot be deleted. This issue is caused by HBASE-6758. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7878) recoverFileLease does not check return value of recoverLease
[ https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581996#comment-13581996 ] Devaraj Das commented on HBASE-7878: +1 recoverFileLease does not check return value of recoverLease Key: HBASE-7878 URL: https://issues.apache.org/jira/browse/HBASE-7878 Project: HBase Issue Type: Bug Components: util Reporter: Eric Newton Assignee: Ted Yu Priority: Critical Fix For: 0.96.0, 0.94.6 Attachments: 7878-trunk-v1.txt I think this is a problem, so I'm opening a ticket so an HBase person takes a look. Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease recovery for Accumulo after HBase's lease recovery. During testing, we experienced data loss. I found it is necessary to wait until recoverLease returns true to know that the file has been truly closed. In FSHDFSUtils, the return result of recoverLease is not checked. In the unit tests created to check lease recovery in HBASE-2645, the return result of recoverLease is always checked. I think FSHDFSUtils should be modified to check the return result, and wait until it returns true. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
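The fix being +1'd is to poll recoverLease until it reports the file closed, instead of calling it once and discarding the result. A minimal sketch of that retry loop, with the filesystem stubbed as an interface and the sleep/attempt limits chosen arbitrarily (they are not the values in the attached patch):

```java
import java.io.IOException;

// Sketch: HDFS lease recovery is asynchronous, so recoverLease must be
// retried until it returns true, meaning the file is truly closed.
public class LeaseRecovery {
    interface LeaseRecoverable {
        boolean recoverLease(String path) throws IOException;
    }

    static void recoverFileLease(LeaseRecoverable fs, String path,
                                 long retrySleepMs, int maxAttempts)
            throws IOException, InterruptedException {
        for (int i = 0; i < maxAttempts; i++) {
            // true only once the NameNode has closed the file
            if (fs.recoverLease(path)) {
                return;
            }
            Thread.sleep(retrySleepMs);  // give recovery time to complete
        }
        throw new IOException("lease not recovered for " + path);
    }
}
```

Returning before recovery completes is exactly the data-loss window Eric describes: the log can still be appended to by the old lease holder.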
[jira] [Commented] (HBASE-4755) HBase based block placement in DFS
[ https://issues.apache.org/jira/browse/HBASE-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582000#comment-13582000 ] Devaraj Das commented on HBASE-4755: bq. Is this patch useful without giving hints to DFSClient about block placement. The stuff in Hadoop is tracked in HDFS-2576. I am thinking that we can use reflection to figure out whether the underlying Hadoop supports the block placement API or not. HBase based block placement in DFS -- Key: HBASE-4755 URL: https://issues.apache.org/jira/browse/HBASE-4755 Project: HBase Issue Type: New Feature Affects Versions: 0.94.0 Reporter: Karthik Ranganathan Assignee: Christopher Gist Attachments: 4755-wip-1.patch The feature as is only useful for HBase clusters that care about data locality on regionservers, but it can also enable a lot of nice features down the road. The basic idea is as follows: instead of letting HDFS determine where to replicate data (r=3) by placing blocks on various regions, it is better to let HBase do so by providing hints to HDFS through the DFS client. That way, instead of replicating data at a block level, we can replicate data at a per-region level (each region owned by a primary, a secondary and a tertiary regionserver). 
This is better for 2 things: - Can make region failover faster on clusters which benefit from data affinity - On large clusters with a random block placement policy, this helps reduce the probability of data loss The algo is as follows: - Each region in META will have 3 columns which are the preferred regionservers for that region (primary, secondary and tertiary) - Preferred assignment can be controlled by a config knob - Upon cluster start, HMaster will enter a mapping from each region to 3 regionservers (random hash, could use current locality, etc) - The load balancer would assign out regions preferring region assignments to primary over secondary over tertiary over any other node - Periodically (say weekly, configurable) the HMaster would run a locality check and make sure the map it has from regions to regionservers is optimal. Down the road, this can be enhanced to control region placement in the following cases: - Mixed hardware SKUs where some regionservers can hold fewer regions - Load balancing across tables where we don't want multiple regions of a table to get assigned to the same regionservers - Multi-tenancy, where we can restrict the assignment of the regions of some table to a subset of regionservers, so an abusive app cannot take down the whole HBase cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
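Devaraj's reflection suggestion in the comment above can be sketched generically. The probe below only demonstrates the pattern; the real favored-nodes method name and signature come from HDFS-2576 and are not shown here:

```java
import java.lang.reflect.Method;

// Sketch: probe at runtime whether the deployed Hadoop exposes a given
// API (e.g. a favored-nodes create call) before relying on it.
// Class and method names passed in are the caller's assumption.
public class FavoredNodesProbe {
    static boolean supports(Class<?> fsClass, String methodName, Class<?>... params) {
        try {
            Method m = fsClass.getMethod(methodName, params);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false;  // older Hadoop: fall back to default block placement
        }
    }
}
```

This keeps HBase compiling and running against Hadoop versions that predate the block-placement API, while using the hints when they are available.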
[jira] [Updated] (HBASE-7886) [replication] hlog zk node will not delete if client roll hlog
[ https://issues.apache.org/jira/browse/HBASE-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] terry zhang updated HBASE-7886: --- Status: Patch Available (was: Open) [replication] hlog zk node will not delete if client roll hlog -- Key: HBASE-7886 URL: https://issues.apache.org/jira/browse/HBASE-7886 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.4 Reporter: terry zhang Assignee: terry zhang if we use the hbase shell command hlog_roll on a regionserver which is configured replication. the Hlog zk node under /hbase/replication/rs/1 can not be deleted. this issue is caused by HBASE-6758. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7886) [replication] hlog zk node will not delete if client roll hlog
[ https://issues.apache.org/jira/browse/HBASE-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] terry zhang updated HBASE-7886: --- Status: Open (was: Patch Available) [replication] hlog zk node will not delete if client roll hlog -- Key: HBASE-7886 URL: https://issues.apache.org/jira/browse/HBASE-7886 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.4 Reporter: terry zhang Assignee: terry zhang if we use the hbase shell command hlog_roll on a regionserver which is configured replication. the Hlog zk node under /hbase/replication/rs/1 can not be deleted. this issue is caused by HBASE-6758. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7886) [replication] hlog zk node will not delete if client roll hlog
[ https://issues.apache.org/jira/browse/HBASE-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] terry zhang updated HBASE-7886: --- Attachment: HBASE-7886.patch [replication] hlog zk node will not delete if client roll hlog -- Key: HBASE-7886 URL: https://issues.apache.org/jira/browse/HBASE-7886 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.4 Reporter: terry zhang Assignee: terry zhang Attachments: HBASE-7886.patch if we use the hbase shell command hlog_roll on a regionserver which is configured replication. the Hlog zk node under /hbase/replication/rs/1 can not be deleted. this issue is caused by HBASE-6758. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7886) [replication] hlog zk node will not be deleted if client roll hlog
[ https://issues.apache.org/jira/browse/HBASE-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] terry zhang updated HBASE-7886: --- Summary: [replication] hlog zk node will not be deleted if client roll hlog (was: [replication] hlog zk node will not delete if client roll hlog) [replication] hlog zk node will not be deleted if client roll hlog -- Key: HBASE-7886 URL: https://issues.apache.org/jira/browse/HBASE-7886 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.4 Reporter: terry zhang Assignee: terry zhang Attachments: HBASE-7886.patch if we use the hbase shell command hlog_roll on a regionserver which is configured replication. the Hlog zk node under /hbase/replication/rs/1 can not be deleted. this issue is caused by HBASE-6758. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira