[jira] [Commented] (HBASE-5993) Add a no-read Append
[ https://issues.apache.org/jira/browse/HBASE-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283150#comment-13283150 ]

Jieshan Bean commented on HBASE-5993:
-------------------------------------

So combine the existing value and the append value during reading? It was my understanding :)

Add a no-read Append
--------------------

Key: HBASE-5993
URL: https://issues.apache.org/jira/browse/HBASE-5993
Project: HBase
Issue Type: Improvement
Components: regionserver
Affects Versions: 0.94.0
Reporter: Jacques
Priority: Critical

HBASE-4102 added an atomic append. For high performance situations, it would be helpful to be able to do appends that don't actually require a read of the existing value. This would be useful in building a growing set of values. Our original use case was for implementing a form of search in HBase where a cell would contain a list of document ids associated with a particular keyword for search. However it seems like it would also be useful to provide substantial performance improvements for most Append scenarios.

Within the client API, the simplest way to implement this would be to leverage the existing Append api. If the Append is marked as setReturnResults(false), use this code path. If result return is requested, use the existing Append implementation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
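The "combine at read time" reading of HBASE-5993 can be sketched without any HBase types. The class below is a hypothetical in-memory model, not HBase code and not the proposed patch: the names NoReadAppendCell, append and get are assumptions made for illustration. Each append stores its fragment without reading the current value, and the read path concatenates the fragments in arrival order.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a "no-read" append; not part of the HBase API.
final class NoReadAppendCell {
    // Fragments per cell key, kept in arrival order.
    private final Map<String, List<byte[]>> fragments = new HashMap<>();

    // Write path: record the fragment without reading the existing value.
    void append(String cellKey, byte[] value) {
        fragments.computeIfAbsent(cellKey, k -> new ArrayList<>()).add(value.clone());
    }

    // Read path: combine all fragments into the logical cell value.
    byte[] get(String cellKey) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<byte[]> parts = fragments.getOrDefault(cellKey, Collections.<byte[]>emptyList());
        for (byte[] part : parts) {
            out.write(part, 0, part.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        NoReadAppendCell store = new NoReadAppendCell();
        // A keyword cell that accumulates document ids, as in the issue's use case.
        store.append("keyword:hbase", "doc1,".getBytes(StandardCharsets.UTF_8));
        store.append("keyword:hbase", "doc7,".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(store.get("keyword:hbase"), StandardCharsets.UTF_8));
    }
}
```

The trade-off in this model is that the write path never touches the old value, while every read pays for the merge until something (in HBase terms, presumably a flush or compaction) collapses the fragments.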
[jira] [Created] (HBASE-6093) Flatten timestamps during flush and compaction
Matt Corgan created HBASE-6093:
-------------------------------

Summary: Flatten timestamps during flush and compaction
Key: HBASE-6093
URL: https://issues.apache.org/jira/browse/HBASE-6093
Project: HBase
Issue Type: New Feature
Components: io, performance, regionserver
Reporter: Matt Corgan
Priority: Minor

Many applications run with maxVersions=1 and do not care about timestamps, or they will specify one timestamp per row as a normal KeyValue rather than per-cell.

Then, DataBlockEncoders like those in HBASE-4218 and HBASE-4676 often encode timestamps as diffs from the previous or diffs from the minimum timestamp in the block. If all timestamps in a block are the same, they will all compress to basically <= 8 bytes total per block. This can be 10% to 25% space savings for some schemas, and that savings is realized both on disk and in block cache.

We could add a ColumnFamily setting flattenTimestamps=[true/false]. If true, then all timestamps are modified during a flush/compaction to the currentTimeMillis() at the start of the flush/compaction. If all timestamps are made identical in a file, then the encoder will be able to eliminate them. The simplest use case is probably that where all inserts are type=Put, there are no overwrites, and there are no deletes.

As use cases get more complex, then so does the implementation. For example, what happens when there is a Put and a Delete of the same cell in the same memstore? Maybe for a flush at t=flushStartTime, the Put gets timestamp=t, and the Delete gets timestamp=t+1. Or maybe HBASE-4241 could take care of this problem.
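A minimal sketch of the flattening rule floated in HBASE-6093, assuming the tentative "Put gets t, Delete gets t+1" option from the message. TimestampFlattener and its nested Cell type are hypothetical names for illustration only; nothing here is HBase code or an existing column family setting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the proposed flattenTimestamps behaviour; not HBase code.
final class TimestampFlattener {
    enum Type { PUT, DELETE }

    static final class Cell {
        final String key;
        final Type type;
        final long timestamp;

        Cell(String key, Type type, long timestamp) {
            this.key = key;
            this.type = type;
            this.timestamp = timestamp;
        }
    }

    // Rewrites every timestamp relative to the flush start time:
    // Puts get t, Deletes get t + 1 (the tentative option from the message).
    static List<Cell> flatten(List<Cell> cells, long flushStartTime) {
        List<Cell> out = new ArrayList<>(cells.size());
        for (Cell c : cells) {
            long ts = (c.type == Type.DELETE) ? flushStartTime + 1 : flushStartTime;
            out.add(new Cell(c.key, c.type, ts));
        }
        return out;
    }

    // Number of distinct timestamps an encoder would still have to represent.
    static int distinctTimestamps(List<Cell> cells) {
        Set<Long> seen = new HashSet<>();
        for (Cell c : cells) {
            seen.add(c.timestamp);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<Cell> memstore = Arrays.asList(
            new Cell("row1/cf:a", Type.PUT, 1337000000123L),
            new Cell("row2/cf:a", Type.PUT, 1337000000456L),
            new Cell("row3/cf:a", Type.PUT, 1337000000789L));
        List<Cell> flushed = flatten(memstore, 1337000001000L);
        System.out.println(distinctTimestamps(memstore) + " -> " + distinctTimestamps(flushed));
    }
}
```

In the all-Put case every cell ends up with the same timestamp, which is the situation where a diff-based encoder has nothing left to store per cell. The sketch does not attempt to resolve the Put/Delete ordering question the message raises.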
[jira] [Updated] (HBASE-4336) Convert source tree into maven modules
[ https://issues.apache.org/jira/browse/HBASE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4336:
-------------------------

Attachment: refactor.txt.gz

I cleaned up some leftover files and renamed hbase-assemble to hbase-assembly. It's a diff against trunk. Still to fix are the site build (for now it says 'success' but it doesn't build anything) and assembly... that doesn't work properly yet. I think we can commit after we get assembly working (site we can do later).

We should go back to your git repo or mine, Jesse, because slinging these 33MB patches is definitely going to hose JIRA again.

Convert source tree into maven modules
--------------------------------------

Key: HBASE-4336
URL: https://issues.apache.org/jira/browse/HBASE-4336
Project: HBase
Issue Type: Task
Components: build
Reporter: Gary Helmling
Priority: Critical
Fix For: 0.96.0
Attachments: refactor.txt.gz

When we originally converted the build to maven we had a single core module defined, but later reverted this to a module-less build for the sake of simplicity.

It now looks like it's time to re-address this, as we have an actual need for modules to:
* provide a trimmed down client library that applications can make use of
* more cleanly support building against different versions of Hadoop, in place of some of the reflection machinations currently required
* incorporate the secure RPC engine that depends on some secure Hadoop classes

I propose we start simply by refactoring into two initial modules:
* core - common classes and utilities, and client-side code and interfaces
* server - master and region server implementations and supporting code

This would also lay the groundwork for incorporating the HBase security features that have been developed. Once the module structure is in place, security-related features could then be incorporated into a third module -- security -- after normal review and approval. The security module could then depend on secure Hadoop, without modifying the dependencies of the rest of the HBase code.
[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative
[ https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283221#comment-13283221 ]

Zhihong Yu commented on HBASE-5916:
-----------------------------------

@Chunhui:
Your suggestion is interesting. We should implement that in a separate issue.

RS restart just before master intialization we make the cluster non operative
-----------------------------------------------------------------------------

Key: HBASE-5916
URL: https://issues.apache.org/jira/browse/HBASE-5916
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
Fix For: 0.94.1
Attachments: HBASE-5916_trunk.patch, HBASE-5916_trunk_1.patch, HBASE-5916_trunk_1.patch, HBASE-5916_trunk_2.patch, HBASE-5916_trunk_3.patch, HBASE-5916_trunk_4.patch, HBASE-5916_trunk_v5.patch, HBASE-5916_trunk_v6.patch, HBASE-5916_trunk_v7.patch, HBASE-5916v8.patch

Consider a case where my master is getting restarted. An RS that was alive when the master restart started gets restarted before the master initializes the ServerShutdownHandler.
{code}
serverShutdownHandlerEnabled = true;
{code}
In this case, when the RS tries to register with the master, the master will try to expire the server, but the server cannot be expired because the serverShutdownHandler is not yet enabled.
This can happen when only one RS is restarted or when all the RSs are restarted at the same time (before assignRootandMeta).
{code}
LOG.info(message);
if (existingServer.getStartcode() < serverName.getStartcode()) {
  LOG.info("Triggering server recovery; existingServer " +
      existingServer + " looks stale, new server: " + serverName);
  expireServer(existingServer);
}
{code}
If another RS is brought up, the cluster comes back to normalcy.
This may be a very rare corner case.
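The sequence reported in HBASE-5916 can be modelled in a few lines. This is a hypothetical, heavily simplified model and not the HBase ServerManager: the class MasterModel, its fields, and the choice to reject the new instance when expiry is refused are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the registration path; not the HBase ServerManager.
final class MasterModel {
    // Stays false until master initialization enables the shutdown handler.
    boolean serverShutdownHandlerEnabled = false;

    // host:port -> start code of the instance the master believes is online.
    final Map<String, Long> onlineServers = new HashMap<>();

    // Returns true if the registering instance was accepted.
    boolean register(String hostAndPort, long startcode) {
        Long existing = onlineServers.get(hostAndPort);
        if (existing != null && existing < startcode) {
            // The existing entry looks stale: it must be expired first.
            if (!expireServer(hostAndPort)) {
                return false; // assumption: the new instance is turned away
            }
        }
        onlineServers.put(hostAndPort, startcode);
        return true;
    }

    // Expiry is refused while the shutdown handler is not yet enabled.
    boolean expireServer(String hostAndPort) {
        if (!serverShutdownHandlerEnabled) {
            return false;
        }
        onlineServers.remove(hostAndPort);
        return true;
    }

    public static void main(String[] args) {
        MasterModel master = new MasterModel();
        master.onlineServers.put("rs1:60020", 100L);            // instance known before the restart
        System.out.println(master.register("rs1:60020", 200L)); // restarted RS, handler disabled
        master.serverShutdownHandlerEnabled = true;
        System.out.println(master.register("rs1:60020", 200L)); // same RS after the handler is enabled
    }
}
```

In this model the restarted region server cannot register for as long as the flag is false, which matches the stuck state the report describes when every region server is in that position.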
[jira] [Commented] (HBASE-6077) Document the most common secure RPC troubleshooting resolutions
[ https://issues.apache.org/jira/browse/HBASE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283234#comment-13283234 ]

Hudson commented on HBASE-6077:
-------------------------------

Integrated in HBase-0.94-security #31 (See [https://builds.apache.org/job/HBase-0.94-security/31/])
Amend HBASE-6077. Replace HTML formatting that does not work with Docbook (Revision 1342382)

Result = FAILURE
apurtell :
Files :
* /hbase/branches/0.94/src/docbkx/troubleshooting.xml

Document the most common secure RPC troubleshooting resolutions
---------------------------------------------------------------

Key: HBASE-6077
URL: https://issues.apache.org/jira/browse/HBASE-6077
Project: HBase
Issue Type: Task
Components: documentation, security
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: 6077.patch

See attached manual troubleshooting section update.
[jira] [Commented] (HBASE-6077) Document the most common secure RPC troubleshooting resolutions
[ https://issues.apache.org/jira/browse/HBASE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283239#comment-13283239 ]

Hudson commented on HBASE-6077:
-------------------------------

Integrated in HBase-0.94 #216 (See [https://builds.apache.org/job/HBase-0.94/216/])
Amend HBASE-6077. Replace HTML formatting that does not work with Docbook (Revision 1342382)

Result = FAILURE
apurtell :
Files :
* /hbase/branches/0.94/src/docbkx/troubleshooting.xml

Document the most common secure RPC troubleshooting resolutions
---------------------------------------------------------------

Key: HBASE-6077
URL: https://issues.apache.org/jira/browse/HBASE-6077
Project: HBase
Issue Type: Task
Components: documentation, security
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: 6077.patch

See attached manual troubleshooting section update.
[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative
[ https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283241#comment-13283241 ]

chunhui shen commented on HBASE-5916:
-------------------------------------

bq. We should implement that in a separate issue.
I think the above suggestion would fix this issue.
@ram
Could you comment, and correct me if there is anything I didn't consider?

RS restart just before master intialization we make the cluster non operative
-----------------------------------------------------------------------------

Key: HBASE-5916
URL: https://issues.apache.org/jira/browse/HBASE-5916
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
Fix For: 0.94.1
Attachments: HBASE-5916_trunk.patch, HBASE-5916_trunk_1.patch, HBASE-5916_trunk_1.patch, HBASE-5916_trunk_2.patch, HBASE-5916_trunk_3.patch, HBASE-5916_trunk_4.patch, HBASE-5916_trunk_v5.patch, HBASE-5916_trunk_v6.patch, HBASE-5916_trunk_v7.patch, HBASE-5916v8.patch

Consider a case where my master is getting restarted. An RS that was alive when the master restart started gets restarted before the master initializes the ServerShutdownHandler.
{code}
serverShutdownHandlerEnabled = true;
{code}
In this case, when the RS tries to register with the master, the master will try to expire the server, but the server cannot be expired because the serverShutdownHandler is not yet enabled.
This can happen when only one RS is restarted or when all the RSs are restarted at the same time (before assignRootandMeta).
{code}
LOG.info(message);
if (existingServer.getStartcode() < serverName.getStartcode()) {
  LOG.info("Triggering server recovery; existingServer " +
      existingServer + " looks stale, new server: " + serverName);
  expireServer(existingServer);
}
{code}
If another RS is brought up, the cluster comes back to normalcy.
This may be a very rare corner case.
[jira] [Commented] (HBASE-6033) Adding some fuction to check if a table/region is in compaction
[ https://issues.apache.org/jira/browse/HBASE-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283259#comment-13283259 ]

Hudson commented on HBASE-6033:
-------------------------------

Integrated in HBase-TRUNK #2921 (See [https://builds.apache.org/job/HBase-TRUNK/2921/])
HBASE-6033 Addendum changes TestCompactionState to large test (Revision 1342196)

Result = FAILURE
tedyu :
Files :
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactionState.java

Adding some fuction to check if a table/region is in compaction
---------------------------------------------------------------

Key: HBASE-6033
URL: https://issues.apache.org/jira/browse/HBASE-6033
Project: HBase
Issue Type: New Feature
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Fix For: 0.96.0
Attachments: 6033-v7.txt, hbase-6033_v2.patch, hbase-6033_v3.patch, hbase_6033_v5.patch, hbase_6033_v6.patch, table_ui.png

This feature will be helpful to find out if a major compaction is going on.
We can show if it is in any minor compaction too.
[jira] [Commented] (HBASE-6077) Document the most common secure RPC troubleshooting resolutions
[ https://issues.apache.org/jira/browse/HBASE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283260#comment-13283260 ]

Hudson commented on HBASE-6077:
-------------------------------

Integrated in HBase-TRUNK #2921 (See [https://builds.apache.org/job/HBase-TRUNK/2921/])
Amend HBASE-6077. Replace HTML formatting that does not work with Docbook (Revision 1342381)

Result = FAILURE
apurtell :
Files :
* /hbase/trunk/src/docbkx/troubleshooting.xml

Document the most common secure RPC troubleshooting resolutions
---------------------------------------------------------------

Key: HBASE-6077
URL: https://issues.apache.org/jira/browse/HBASE-6077
Project: HBase
Issue Type: Task
Components: documentation, security
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: 6077.patch

See attached manual troubleshooting section update.
[jira] [Commented] (HBASE-6044) copytable: remove rs.* parameters
[ https://issues.apache.org/jira/browse/HBASE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283266#comment-13283266 ]

Hudson commented on HBASE-6044:
-------------------------------

Integrated in HBase-0.92-security #108 (See [https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-6044 copytable: remove rs.* parameters (Revision 1341202)

Result = FAILURE
jmhsieh :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/docbkx/ops_mgt.xml
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapreduce/CopyTable.java

copytable: remove rs.* parameters
---------------------------------

Key: HBASE-6044
URL: https://issues.apache.org/jira/browse/HBASE-6044
Project: HBase
Issue Type: New Feature
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: hbase-6044-92.patch, hbase-6044-v2.patch, hbase-6044-v3.patch, hbase-6044-v4.patch, hbase-6044.patch

In discussion of HBASE-6013 it was suggested that we remove these arguments from 0.92+ (but keep in 0.90)
[jira] [Commented] (HBASE-5757) TableInputFormat should handle as many errors as possible
[ https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283267#comment-13283267 ]

Hudson commented on HBASE-5757:
-------------------------------

Integrated in HBase-0.92-security #108 (See [https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-5757 TableInputFormat should handle as many errors as possible (Jan Lukavsky) (Revision 1341205)

Result = FAILURE
jmhsieh :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapred/TableRecordReaderImpl.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/mapred/TestTableInputFormat.java

TableInputFormat should handle as many errors as possible
---------------------------------------------------------

Key: HBASE-5757
URL: https://issues.apache.org/jira/browse/HBASE-5757
Project: HBase
Issue Type: Bug
Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
Assignee: Jan Lukavsky
Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
Attachments: 5757-trunk-v2.txt, HBASE-5757-trunk-r1341041.patch, HBASE-5757.patch, HBASE-5757.patch, hbase-5757-92.patch

Prior to HBASE-4196, IOExceptions thrown from the scanner were handled differently in the mapred and mapreduce APIs. The patch for HBASE-4196 unified this handling so that if an exception is caught, a reconnect is attempted (without bothering the mapred client). After that, HBASE-4269 changed this behavior back, but in both the mapred and mapreduce APIs. The question is: is there any reason not to handle all errors that the input format can handle? In other words, why not try to reissue the request after *any* IOException? I see the following disadvantages of the current approach:
* the client may see exceptions like LeaseException and ScannerTimeoutException if it fails to process all fetched data within the timeout
* to avoid ScannerTimeoutException the client must raise hbase.regionserver.lease.period
* the timeout for tasks is already configured in mapred.task.timeout, so this seems a bit redundant to me, because typically one needs to update both of these parameters
* I don't see any possibility of getting rid of LeaseException (this is configured on the server side)

I think all of these issues would be gone if the DoNotRetryIOException were not rethrown.

-On the other hand, handling errors in InputFormat has the disadvantage that it may hide some inefficiency from the user. E.g. if I have a very big scanner.caching, and I manage to process only a few rows within the timeout, I will end up with a single row being fetched many times (and will not be explicitly notified about this). Could we solve this problem by adding some counter to the InputFormat?-
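The "reissue the request after any IOException" idea from HBASE-5757, together with the counter suggested at the end of the description, can be sketched as follows. RestartingReader, RowSource and RowScanner are hypothetical types invented for this illustration; they are not HBase or MapReduce classes, and the retry cap is an added assumption.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a reader that reopens its scanner after any IOException
// and counts the restarts. None of these types are HBase classes.
final class RestartingReader {
    interface RowSource {
        // Opens a scan positioned at the given row index.
        RowScanner open(int startRow) throws IOException;
    }

    interface RowScanner {
        // Returns the next row, or null when the scan is exhausted.
        String next() throws IOException;
    }

    private final RowSource source;
    private final int maxRestarts; // assumption: give up after this many restarts
    private RowScanner scanner;
    private int nextRow = 0;       // first row not yet delivered to the caller
    private int restarts = 0;      // the counter that makes hidden retries visible

    RestartingReader(RowSource source, int maxRestarts) {
        this.source = source;
        this.maxRestarts = maxRestarts;
    }

    String next() throws IOException {
        while (true) {
            try {
                if (scanner == null) {
                    scanner = source.open(nextRow);
                }
                String row = scanner.next();
                if (row != null) {
                    nextRow++;
                }
                return row;
            } catch (IOException e) {
                if (restarts >= maxRestarts) {
                    throw e;
                }
                restarts++;
                scanner = null; // reopen from the first undelivered row
            }
        }
    }

    int restarts() {
        return restarts;
    }

    public static void main(String[] args) throws IOException {
        List<String> rows = Arrays.asList("r1", "r2", "r3");
        boolean[] failedOnce = {false};
        // In-memory source whose scanner fails once, just before delivering the second row.
        RowSource flaky = start -> {
            int[] pos = {start};
            return () -> {
                if (pos[0] == 1 && !failedOnce[0]) {
                    failedOnce[0] = true;
                    throw new IOException("simulated scanner timeout");
                }
                return pos[0] < rows.size() ? rows.get(pos[0]++) : null;
            };
        };
        RestartingReader reader = new RestartingReader(flaky, 3);
        for (String row = reader.next(); row != null; row = reader.next()) {
            System.out.println(row);
        }
        System.out.println("restarts=" + reader.restarts());
    }
}
```

Exposing the restart count is one way to address the struck-through concern in the description: the caller is not interrupted by the exception, but repeated refetching still shows up as a number that can be reported.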
[jira] [Commented] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283268#comment-13283268 ]

Hudson commented on HBASE-6011:
-------------------------------

Integrated in HBase-0.92-security #108 (See [https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-6011. Addendum to support master mocking (Ram) (Revision 1340157)

Result = FAILURE
apurtell :
Files :
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/LocalHBaseCluster.java

Unable to start master in local mode
------------------------------------

Key: HBASE-6011
URL: https://issues.apache.org/jira/browse/HBASE-6011
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch, HBASE-6011_addendum_0.92.patch, HBASE-6011_addendum_0.94.patch, HBASE-6011_addendum_trunk.patch

Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too:
{noformat}
12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master
java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster
	at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142)
	at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
	at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761)
{noformat}
[jira] [Commented] (HBASE-6061) Fix ACL Admin Table inconsistent permission check
[ https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283269#comment-13283269 ]

Hudson commented on HBASE-6061:
-------------------------------

Integrated in HBase-0.92-security #108 (See [https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-6061 Fix ACL Admin Table inconsistent permission check (Matteo Bertozzi) (Revision 1341268)

Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/security/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java

Fix ACL Admin Table inconsistent permission check
-------------------------------------------------

Key: HBASE-6061
URL: https://issues.apache.org/jira/browse/HBASE-6061
Project: HBase
Issue Type: Sub-task
Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Labels: acl, security
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: HBASE-6061-0.92.patch, HBASE-6061-v0.patch, HBASE-6061-v1.patch

The requirePermission() check for admin operations on a table is currently inconsistent.
A table owner with CREATE rights (that is, the owner who created the table) can enable/disable and delete the table but needs ADMIN rights to add/remove/modify a column.
[jira] [Commented] (HBASE-6013) Polish sharp edges from CopyTable
[ https://issues.apache.org/jira/browse/HBASE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283270#comment-13283270 ]

Hudson commented on HBASE-6013:
-------------------------------

Integrated in HBase-0.92-security #108 (See [https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-6013 Polish sharp edges from CopyTable (Revision 1339931)

Result = FAILURE
jmhsieh :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapreduce/CopyTable.java

Polish sharp edges from CopyTable
---------------------------------

Key: HBASE-6013
URL: https://issues.apache.org/jira/browse/HBASE-6013
Project: HBase
Issue Type: Improvement
Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
Attachments: hbase-6013-92.patch, hbase-6013.patch

CopyTable doesn't report errors when invalid arguments are specified. For example, having a typo in --peer.adr (such as --peer.addr or -peer.adr) silently uses the default cluster and does a same-cluster copy.
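The failure mode in HBASE-6013 (a mistyped option being silently ignored) is avoided by rejecting anything the parser does not recognise. The sketch below is a hypothetical illustration of that approach and not the CopyTable source or the committed patch: the class StrictArgs, the "--name=value" convention, and the subset of option names in KNOWN are assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of strict option parsing; this is not the CopyTable source.
final class StrictArgs {
    // Illustrative subset of option names; the real tool's list may differ.
    private static final Set<String> KNOWN = new HashSet<>(
        Arrays.asList("--peer.adr", "--new.name", "--starttime", "--endtime", "--families"));

    // Parses "--name=value" options plus a positional table name,
    // and rejects anything unrecognised instead of ignoring it.
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        for (String arg : args) {
            if (!arg.startsWith("-")) {
                options.put("tablename", arg); // positional table name
                continue;
            }
            int eq = arg.indexOf('=');
            String name = eq < 0 ? arg : arg.substring(0, eq);
            if (eq < 0 || !KNOWN.contains(name)) {
                throw new IllegalArgumentException("Invalid argument '" + arg + "'");
            }
            options.put(name, arg.substring(eq + 1));
        }
        return options;
    }

    public static void main(String[] args) {
        System.out.println(parse(new String[] {"--peer.adr=zk1:2181:/hbase", "TestTable"}));
        try {
            parse(new String[] {"--peer.addr=zk1:2181:/hbase", "TestTable"}); // typo from the report
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a whitelist like this, both typos named in the report (--peer.addr and -peer.adr) fail fast rather than falling through to a same-cluster copy.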
[jira] [Commented] (HBASE-6054) 0.92 failing because of missing commons-io after upgrade to hadoop 1.0.3.
[ https://issues.apache.org/jira/browse/HBASE-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283272#comment-13283272 ]

Hudson commented on HBASE-6054:
-------------------------------

Integrated in HBase-0.92-security #108 (See [https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-6054 0.92 failing because of missing commons-io after upgrade to hadoop 1.0.3. (Revision 1340272)

Result = FAILURE
stack :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/pom.xml

0.92 failing because of missing commons-io after upgrade to hadoop 1.0.3.
-------------------------------------------------------------------------

Key: HBASE-6054
URL: https://issues.apache.org/jira/browse/HBASE-6054
Project: HBase
Issue Type: Bug
Reporter: stack
Assignee: stack
Fix For: 0.92.2
Attachments: commons-io.txt, hbasedefaultcheck.txt

See this note: http://search-hadoop.com/m/0UrOr19BG8v1/test+failure+after+upgrading+to+hadoop+1.0.3+Was%253A+ClassNotFoundException%253A+org.apache.commons.io.FileUtils&subj=test+failure+after+upgrading+to+hadoop+1+0+3+Was+ClassNotFoundException+org+apache+commons+io+FileUtils
[jira] [Commented] (HBASE-6077) Document the most common secure RPC troubleshooting resolutions
[ https://issues.apache.org/jira/browse/HBASE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283271#comment-13283271 ]

Hudson commented on HBASE-6077:
-------------------------------

Integrated in HBase-0.92-security #108 (See [https://builds.apache.org/job/HBase-0.92-security/108/])
Amend HBASE-6077. Replace HTML formatting that does not work with Docbook (Revision 1342383)
HBASE-6077. Document the most common secure RPC troubleshooting resolutions (Revision 1342107)

Result = FAILURE
apurtell :
Files :
* /hbase/branches/0.92/src/docbkx/troubleshooting.xml

apurtell :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/docbkx/troubleshooting.xml

Document the most common secure RPC troubleshooting resolutions
---------------------------------------------------------------

Key: HBASE-6077
URL: https://issues.apache.org/jira/browse/HBASE-6077
Project: HBase
Issue Type: Task
Components: documentation, security
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: 6077.patch

See attached manual troubleshooting section update.
[jira] [Commented] (HBASE-6047) Put.has() can't determine result correctly
[ https://issues.apache.org/jira/browse/HBASE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283273#comment-13283273 ]

Hudson commented on HBASE-6047:
-------------------------------

Integrated in HBase-0.92-security #108 (See [https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-6047 Put.has() can't determine result correctly (Alex Newman) (Revision 1341741)

Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/Put.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/client/TestPutDotHas.java

Put.has() can't determine result correctly
------------------------------------------

Key: HBASE-6047
URL: https://issues.apache.org/jira/browse/HBASE-6047
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.92.1
Reporter: Wang Qiang
Assignee: Alex Newman
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly-v2.patch, 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly.patch, 6047-92.txt, PutTest.java

The public method 'has(byte [] family, byte [] qualifier)' internally invokes the private method 'has(byte [] family, byte [] qualifier, long ts, byte [] value, boolean ignoreTS, boolean ignoreValue)' with 'value=new byte[0], ignoreTS=true, ignoreValue=true', but there is a logic error in the body. It enters this block:
{code}
else if (ignoreValue) {
  for (KeyValue kv: list) {
    if (Arrays.equals(kv.getFamily(), family)
        && Arrays.equals(kv.getQualifier(), qualifier)
        && kv.getTimestamp() == ts) {
      return true;
    }
  }
}
{code}
The expression 'kv.getTimestamp() == ts' in the if condition should only be evaluated when 'ignoreTS=false'. As written, the following code returns false!
{code}
Put put = new Put(Bytes.toBytes("row-01"));
put.add(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01"), 1234567L, Bytes.toBytes("value-01"));
System.out.println(put.has(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01")));
{code}
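The intended behaviour described in HBASE-6047 is that each of the two flags switches off exactly one comparison. The sketch below expresses that as a single predicate over a plain stand-in type. It is a hypothetical illustration, not the HBase Put class and not the committed patch: HasCheck and its nested Cell are invented names, and passing Long.MAX_VALUE as the "no timestamp given" placeholder is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the matching logic in Put.has(); not the HBase source.
final class HasCheck {
    static final class Cell {
        final byte[] family;
        final byte[] qualifier;
        final long ts;
        final byte[] value;

        Cell(byte[] family, byte[] qualifier, long ts, byte[] value) {
            this.family = family;
            this.qualifier = qualifier;
            this.ts = ts;
            this.value = value;
        }
    }

    // Family and qualifier must always match; timestamp and value are compared
    // only when the corresponding ignore flag is false.
    static boolean has(List<Cell> cells, byte[] family, byte[] qualifier,
                       long ts, byte[] value, boolean ignoreTS, boolean ignoreValue) {
        for (Cell c : cells) {
            if (Arrays.equals(c.family, family)
                    && Arrays.equals(c.qualifier, qualifier)
                    && (ignoreTS || c.ts == ts)
                    && (ignoreValue || Arrays.equals(c.value, value))) {
                return true;
            }
        }
        return false;
    }

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell(bytes("family-01"), bytes("qualifier-01"), 1234567L, bytes("value-01")));
        // Mirrors the two-argument has(family, qualifier) call from the report.
        System.out.println(has(cells, bytes("family-01"), bytes("qualifier-01"),
            Long.MAX_VALUE, new byte[0], true, true));
    }
}
```

Folding the flags into the condition also removes the need for a separate branch per flag combination, which is where the reported bug crept in.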
[jira] [Commented] (HBASE-5920) New Compactions Logic can silently prevent user-initiated compactions from occurring
[ https://issues.apache.org/jira/browse/HBASE-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283274#comment-13283274 ] Hudson commented on HBASE-5920: --- Integrated in HBase-0.92-security #108 (See [https://builds.apache.org/job/HBase-0.92-security/108/]) HBASE-5920 New Compactions Logic can silently prevent user-initiated compactions from occurring (Revision 1340285) Result = FAILURE stack : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionRequest.java * /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java New Compactions Logic can silently prevent user-initiated compactions from occurring Key: HBASE-5920 URL: https://issues.apache.org/jira/browse/HBASE-5920 Project: HBase Issue Type: Bug Components: client, regionserver Affects Versions: 0.92.1 Reporter: Derek Wollenstein Assignee: Derek Wollenstein Priority: Minor Labels: compaction Attachments: 5290-094.txt, HBASE-5920-0.92.1-1.patch, HBASE-5920-0.92.1-2.patch, HBASE-5920-0.92.1.patch, HBASE-5920-trunk-1.patch, HBASE-5920-trunk.patch There seem to be some tuning settings in which manually triggered major compactions will do nothing, including loggic From Store.java in the function ListStoreFile compactSelection(ListStoreFile candidates) When a user manually triggers a compaction, this follows the same logic as a normal compaction check. when a user manually triggers a major compaction, something similar happens. Putting this all together: 1. If a user triggers a major compaction, this is checked against a max files threshold (hbase.hstore.compaction.max). 
If the number of storefiles to compact is greater than max files, then we downgrade to a minor compaction. 2. If we are in a minor compaction, we do the following checks: a. If the file is less than a minimum size (hbase.hstore.compaction.min.size), we automatically include it. b. Otherwise, we check how the size compares to the next largest size, based on hbase.hstore.compaction.ratio. c. If the number of files included is less than a minimum count (hbase.hstore.compaction.min), then we don't compact. In many of the exit strategies, we aren't seeing an error message. The net-net of this is that if we have a mix of very large and very small files, we may end up having too many files to do a major compact, but too few files to do a minor compact. I'm trying to go through and see if I'm understanding things correctly, but this seems like the bug. To put it another way: 2012-05-02 20:09:36,389 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Large Compaction requested: regionName=str,44594594594594592,1334939064521.f7aed25b55d4d7988af763bede9ce74e., storeName=c, fileCount=15, fileSize=1.5g (20.2k, 362.5m, 155.3k, 3.0m, 30.7k, 361.2m, 6.9m, 4.7m, 14.7k, 363.4m, 30.9m, 3.2m, 7.3k, 362.9m, 23.5m), priority=-9, time=3175046817624398; Because: Recursive enqueue; compaction_queue=(59:0), split_queue=0 When we had a minimum compaction size of 128M, and default settings for hbase.hstore.compaction.min, hbase.hstore.compaction.max and hbase.hstore.compaction.ratio, we were not getting a compaction to run even if we ran major_compact 'str,44594594594594592,1334939064521.f7aed25b55d4d7988af763bede9ce74e.' from the ruby shell. Note that we had many tiny files (20k, 155k, 3m, 30k, ...) and several large files (362.5m, 361.2m, 363.4m, 362.9m). I think the bimodal nature of the sizes prevented us from doing a compaction.
I'm not 100% sure where this errored out, because when I manually triggered a compaction I did not see ' // if we don't have enough files to compact, just wait if (filesToCompact.size() < this.minFilesToCompact) { if (LOG.isDebugEnabled()) { LOG.debug("Skipped compaction of " + this.storeNameStr + ". Only " + (end - start) + " file(s) of size " + StringUtils.humanReadableInt(totalSize) + " have met compaction criteria."); } ' being printed in the logs (and I know DEBUG logging was enabled because I saw this elsewhere).
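The selection rules listed in the description can be modelled roughly as follows. This is a simplified sketch, not the Store.java code: the class and method names are hypothetical, sizes are plain longs ordered oldest first, and comparing each file against the ratio times the total size of the files after it is an assumption about what rule 2b means.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the selection rules described above. */
final class CompactionSelectionSketch {

    /** Rule 1: a requested major compaction is downgraded when there are too many files. */
    static boolean staysMajor(boolean majorRequested, int fileCount, int maxFiles) {
        return majorRequested && fileCount <= maxFiles;
    }

    /**
     * Rules 2a-2c for a minor compaction (assumed interpretation).
     * Returns the sizes selected for compaction, or an empty list if nothing is compacted.
     */
    static List<Long> selectMinor(long[] sizes, long minSize, double ratio,
                                  int minFiles, int maxFiles) {
        int n = sizes.length;
        // sumFrom[i] = total size of files i..n-1; sumFrom[n] = 0.
        long[] sumFrom = new long[n + 1];
        for (int i = n - 1; i >= 0; i--) {
            sumFrom[i] = sizes[i] + sumFrom[i + 1];
        }

        int start = 0;
        // Skip leading files that are neither small (2a) nor within the ratio
        // of the files that follow them (2b).
        while (start < n && n - start >= minFiles
                && sizes[start] > Math.max(minSize, (long) (sumFrom[start + 1] * ratio))) {
            start++;
        }
        int end = Math.min(n, start + maxFiles);

        List<Long> selected = new ArrayList<>();
        if (end - start < minFiles) {
            return selected; // rule 2c: too few files, silently do nothing
        }
        for (int i = start; i < end; i++) {
            selected.add(sizes[i]);
        }
        return selected;
    }
}
```

In this model, one large file followed by two small ones with a minimum count of 3 selects nothing and reports nothing, which is the "too few files to do a minor compact" outcome the description worries about. It does not claim to reproduce the exact failure in the log above.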
[jira] [Commented] (HBASE-6071) getRegionServerWithRetires, should log unsuccessful attempts and exceptions.
[ https://issues.apache.org/jira/browse/HBASE-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283285#comment-13283285 ] Igal Shilman commented on HBASE-6071: - [~zhi...@ebaysf.com], can you please help clarifying core tests score? I went thru the test results and didn't find any failures. Also, browsing around recently submitted patches I see the same findbugs warning count, should I conclude that this is not necessary related to this patch? Thanks. getRegionServerWithRetires, should log unsuccessful attempts and exceptions. Key: HBASE-6071 URL: https://issues.apache.org/jira/browse/HBASE-6071 Project: HBase Issue Type: Improvement Components: client, ipc Affects Versions: 0.92.0, 0.94.0 Reporter: Igal Shilman Priority: Minor Labels: client, ipc Attachments: HBASE-6071.patch, HConnectionManager_HBASE-6071-0.90.0.patch HConnectionImplementation.getRegionServerWithRetries might terminate w/ an exception different then a DoNotRetryIOException, thus silently drops exceptions from previous attempts. [~ted_yu] suggested ([here|http://mail-archives.apache.org/mod_mbox/hbase-user/201205.mbox/%3CCAFebPXBq9V9BVdzRTNr-MB3a1Lz78SZj6gvP6On0b%2Bajt9StAg%40mail.gmail.com%3E]) adding a log message inside the catch block describing the exception type and details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6071) getRegionServerWithRetires, should log unsuccessful attempts and exceptions.
[ https://issues.apache.org/jira/browse/HBASE-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283301#comment-13283301 ] Zhihong Yu commented on HBASE-6071: --- I used my script to scan https://builds.apache.org/job/PreCommit-HBASE-Build/1991/console and didn't find any hanging test. {code} + String message = String.format("Exception during try #%d (out of %d):",tries+1, numRetries); + LOG.debug(message,t); {code} nit: a space should be inserted between the comma and tries, and between the comma and t. My original intention was to use the above debug log to help find the root cause. It would be nice if the debug log were added to the 0.90 release you're using, so we can see what we get. getRegionServerWithRetires, should log unsuccessful attempts and exceptions. Key: HBASE-6071 URL: https://issues.apache.org/jira/browse/HBASE-6071 Project: HBase Issue Type: Improvement Components: client, ipc Affects Versions: 0.92.0, 0.94.0 Reporter: Igal Shilman Priority: Minor Labels: client, ipc Attachments: HBASE-6071.patch, HConnectionManager_HBASE-6071-0.90.0.patch HConnectionImplementation.getRegionServerWithRetries might terminate with an exception other than a DoNotRetryIOException, thus silently dropping exceptions from previous attempts. [~ted_yu] suggested ([here|http://mail-archives.apache.org/mod_mbox/hbase-user/201205.mbox/%3CCAFebPXBq9V9BVdzRTNr-MB3a1Lz78SZj6gvP6On0b%2Bajt9StAg%40mail.gmail.com%3E]) adding a log message inside the catch block describing the exception type and details.
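As a rough illustration of the idea discussed in this issue, a retry loop can log every failed attempt and keep the earlier exceptions instead of dropping them. The sketch below is not the HConnectionManager code: the class and method names are hypothetical, it uses java.util.logging rather than the project's logger, and it omits the DoNotRetryIOException handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for a retrying caller; not the HConnectionManager code. */
final class RetryLoggingSketch {
    private static final Logger LOG = Logger.getLogger(RetryLoggingSketch.class.getName());

    /** Failures seen so far, kept so the final error can report every attempt. */
    final List<Throwable> failures = new ArrayList<>();

    <T> T callWithRetries(Callable<T> call, int numRetries) throws Exception {
        for (int tries = 0; tries < numRetries; tries++) {
            try {
                return call.call();
            } catch (Exception e) {
                failures.add(e);
                // Log every unsuccessful attempt instead of dropping it silently.
                LOG.log(Level.FINE,
                        String.format("Exception during try #%d (out of %d):", tries + 1, numRetries),
                        e);
            }
        }
        // Attach all earlier failures so none of them is lost to the caller.
        Exception last = new Exception("Failed after " + numRetries + " attempts");
        for (Throwable t : failures) {
            last.addSuppressed(t);
        }
        throw last;
    }
}
```

Attaching the earlier failures as suppressed exceptions is one design choice among several; logging alone, as proposed in the patch comment above, would also make the intermediate errors visible.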
[jira] [Commented] (HBASE-6071) getRegionServerWithRetires, should log unsuccessful attempts and exceptions.
[ https://issues.apache.org/jira/browse/HBASE-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283302#comment-13283302 ] Zhihong Yu commented on HBASE-6071: --- If you look at ScannerCallable.java in trunk, you would see the following: {code} public static final String LOG_SCANNER_ACTIVITY = "hbase.client.log.scanner.activity"; ... } catch (IOException e) { if (logScannerActivity) { LOG.info("Got exception in fetching from scanner=" + scannerId, e); } {code} You can backport the related code and set hbase.client.log.scanner.activity to true. This way you would see the exception in the log. getRegionServerWithRetires, should log unsuccessful attempts and exceptions. Key: HBASE-6071 URL: https://issues.apache.org/jira/browse/HBASE-6071 Project: HBase Issue Type: Improvement Components: client, ipc Affects Versions: 0.92.0, 0.94.0 Reporter: Igal Shilman Priority: Minor Labels: client, ipc Attachments: HBASE-6071.patch, HConnectionManager_HBASE-6071-0.90.0.patch HConnectionImplementation.getRegionServerWithRetries might terminate with an exception other than a DoNotRetryIOException, thus silently dropping exceptions from previous attempts. [~ted_yu] suggested ([here|http://mail-archives.apache.org/mod_mbox/hbase-user/201205.mbox/%3CCAFebPXBq9V9BVdzRTNr-MB3a1Lz78SZj6gvP6On0b%2Bajt9StAg%40mail.gmail.com%3E]) adding a log message inside the catch block describing the exception type and details.
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Status: Open (was: Patch Available) Improve performance of scans with some kind of filters. --- Key: HBASE-5416 URL: https://issues.apache.org/jira/browse/HBASE-5416 Project: HBase Issue Type: Improvement Components: filters, performance, regionserver Affects Versions: 0.90.4 Reporter: Max Lapan Assignee: Max Lapan Attachments: 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch When the scan is performed, whole row is loaded into result list, after that filter (if exists) is applied to detect that row is needed. But when scan is performed on several CFs and filter checks only data from the subset of these CFs, data from CFs, not checked by a filter is not needed on a filter stage. Only when we decided to include current row. And in such case we can significantly reduce amount of IO performed by a scan, by loading only values, actually checked by a filter. For example, we have two CFs: flags and snap. Flags is quite small (bunch of megabytes) and is used to filter large entries from snap. Snap is very large (10s of GB) and it is quite costly to scan it. If we needed only rows with some flag specified, we use SingleColumnValueFilter to limit result to only small subset of region. But current implementation is loading both CFs to perform scan, when only small subset is needed. Attached patch adds one routine to Filter interface to allow filter to specify which CF is needed to it's operation. In HRegion, we separate all scanners into two groups: needed for filter and the rest (joined). When new row is considered, only needed data is loaded, filter applied, and only if filter accepts the row, rest of data is loaded. At our data, this speeds up such kind of scans 30-50 times. 
Also, this gives us a way to better normalize the data into separate columns by optimizing the scans performed.
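The two-phase loading described in this issue can be sketched with a small in-memory model. This is only an illustration under stated assumptions: the class name, the use of plain maps in place of column-family scanners, and the counter are all hypothetical and are not part of the attached patch or of the HBase Filter interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical in-memory model of the two-phase row loading described above. */
final class TwoPhaseScanSketch {
    /** Counts how many times the large family had to be read. */
    int joinedLoads = 0;

    /**
     * essential: row key -> value of the small family checked by the filter ("flags").
     * joined:    row key -> value of the large family ("snap").
     * Returns the "snap" values of the rows whose "flags" value passes the filter.
     */
    List<String> scan(Map<String, String> essential, Map<String, String> joined,
                      Predicate<String> filter) {
        List<String> results = new ArrayList<>();
        for (Map.Entry<String, String> row : essential.entrySet()) {
            // Phase 1: only the family the filter needs is read.
            if (!filter.test(row.getValue())) {
                continue; // rejected rows never touch the large family
            }
            // Phase 2: the row was accepted, so load the remaining data.
            joinedLoads++;
            results.add(joined.get(row.getKey()));
        }
        return results;
    }
}
```

In this model the number of reads against the large family equals the number of accepted rows rather than the number of scanned rows, which is where the reported IO saving would come from.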
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Hadoop Flags: (was: Reviewed) Status: Patch Available (was: Open) Improve performance of scans with some kind of filters. --- Key: HBASE-5416 URL: https://issues.apache.org/jira/browse/HBASE-5416 Project: HBase Issue Type: Improvement Components: filters, performance, regionserver Affects Versions: 0.90.4 Reporter: Max Lapan Assignee: Max Lapan Attachments: 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch When the scan is performed, whole row is loaded into result list, after that filter (if exists) is applied to detect that row is needed. But when scan is performed on several CFs and filter checks only data from the subset of these CFs, data from CFs, not checked by a filter is not needed on a filter stage. Only when we decided to include current row. And in such case we can significantly reduce amount of IO performed by a scan, by loading only values, actually checked by a filter. For example, we have two CFs: flags and snap. Flags is quite small (bunch of megabytes) and is used to filter large entries from snap. Snap is very large (10s of GB) and it is quite costly to scan it. If we needed only rows with some flag specified, we use SingleColumnValueFilter to limit result to only small subset of region. But current implementation is loading both CFs to perform scan, when only small subset is needed. Attached patch adds one routine to Filter interface to allow filter to specify which CF is needed to it's operation. In HRegion, we separate all scanners into two groups: needed for filter and the rest (joined). When new row is considered, only needed data is loaded, filter applied, and only if filter accepts the row, rest of data is loaded. 
On our data, this speeds up such scans by a factor of 30-50. Also, this gives us a way to better normalize the data into separate columns by optimizing the scans performed.
[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Lapan updated HBASE-5416: - Attachment: Filtered_scans_v5.1.patch Fixed issues with incorrect rebase, applied suggested changes from first review. Improve performance of scans with some kind of filters. --- Key: HBASE-5416 URL: https://issues.apache.org/jira/browse/HBASE-5416 Project: HBase Issue Type: Improvement Components: filters, performance, regionserver Affects Versions: 0.90.4 Reporter: Max Lapan Assignee: Max Lapan Attachments: 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch When the scan is performed, whole row is loaded into result list, after that filter (if exists) is applied to detect that row is needed. But when scan is performed on several CFs and filter checks only data from the subset of these CFs, data from CFs, not checked by a filter is not needed on a filter stage. Only when we decided to include current row. And in such case we can significantly reduce amount of IO performed by a scan, by loading only values, actually checked by a filter. For example, we have two CFs: flags and snap. Flags is quite small (bunch of megabytes) and is used to filter large entries from snap. Snap is very large (10s of GB) and it is quite costly to scan it. If we needed only rows with some flag specified, we use SingleColumnValueFilter to limit result to only small subset of region. But current implementation is loading both CFs to perform scan, when only small subset is needed. Attached patch adds one routine to Filter interface to allow filter to specify which CF is needed to it's operation. In HRegion, we separate all scanners into two groups: needed for filter and the rest (joined). When new row is considered, only needed data is loaded, filter applied, and only if filter accepts the row, rest of data is loaded. 
On our data, this speeds up such scans by a factor of 30-50. Also, this gives us a way to better normalize the data into separate columns by optimizing the scans performed.
[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative
[ https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283349#comment-13283349 ] ramkrishna.s.vasudevan commented on HBASE-5916: --- @Chunhui I like your idea too. As I said, we are planning to raise an improvement activity for master restart and SSH, because even with the above approach there is one more problematic scenario. Please note that this scenario can occur even without your suggestion. There are two region servers. Both go down while the flow is in AM.joinCluster(). As no RS is available at that time, we will not make any assignment, and all regions will go into RIT mode waiting for the timeout monitor. SSH is also waiting, as the master initialization is not complete (this step is as per your suggestion). Now suppose there are 100 regions, all waiting to be assigned. If a new RS comes up, this code in TimeoutMonitor {code} if (regionState.getStamp() + timeout <= now) { //decide on action upon timeout actOnTimeOut(regionState); } else if (this.allRegionServersOffline && !allRSsOffline) { // if some RSs just came back online, we can start the // the assignment right away actOnTimeOut(regionState); } {code} will immediately trigger assignment. At the same time, as master initialization has been completed by then, SSH is able to carry on with assignment as well. This will lead to double assignment. Actually, in defect HBASE-5816 Stack was suggesting to have one common queue through which every assignment is done, so that SSH will not interfere with it or vice versa. I suggest we get in the patch that addresses the current JIRA problem and work on a different JIRA that will help me address the master restart and SSH area, which is troublesome.
RS restart just before master intialization we make the cluster non operative - Key: HBASE-5916 URL: https://issues.apache.org/jira/browse/HBASE-5916 Project: HBase Issue Type: Bug Affects Versions: 0.92.1, 0.94.0 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.94.1 Attachments: HBASE-5916_trunk.patch, HBASE-5916_trunk_1.patch, HBASE-5916_trunk_1.patch, HBASE-5916_trunk_2.patch, HBASE-5916_trunk_3.patch, HBASE-5916_trunk_4.patch, HBASE-5916_trunk_v5.patch, HBASE-5916_trunk_v6.patch, HBASE-5916_trunk_v7.patch, HBASE-5916v8.patch Consider a case where my master is getting restarted. An RS that was alive when the master restart started gets restarted before the master initializes the ServerShutdownHandler. {code} serverShutdownHandlerEnabled = true; {code} In this case, when the RS tries to register with the master, the master will try to expire the server, but the server cannot be expired as the serverShutdownHandler is still not enabled. This case may happen when only one RS gets restarted or when all the RSs get restarted at the same time (before assignRootandMeta). {code} LOG.info(message); if (existingServer.getStartcode() < serverName.getStartcode()) { LOG.info("Triggering server recovery; existingServer " + existingServer + " looks stale, new server:" + serverName); expireServer(existingServer); } {code} If another RS is brought up then the cluster comes back to normalcy. May be a very corner case.
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283357#comment-13283357 ] Hadoop QA commented on HBASE-5416: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12529706/Filtered_scans_v5.1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 33 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.TestRegionRebalancing Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1994//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1994//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1994//console This message is automatically generated. Improve performance of scans with some kind of filters. --- Key: HBASE-5416 URL: https://issues.apache.org/jira/browse/HBASE-5416 Project: HBase Issue Type: Improvement Components: filters, performance, regionserver Affects Versions: 0.90.4 Reporter: Max Lapan Assignee: Max Lapan Attachments: 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch When the scan is performed, whole row is loaded into result list, after that filter (if exists) is applied to detect that row is needed. 
But when scan is performed on several CFs and filter checks only data from the subset of these CFs, data from CFs, not checked by a filter is not needed on a filter stage. Only when we decided to include current row. And in such case we can significantly reduce amount of IO performed by a scan, by loading only values, actually checked by a filter. For example, we have two CFs: flags and snap. Flags is quite small (bunch of megabytes) and is used to filter large entries from snap. Snap is very large (10s of GB) and it is quite costly to scan it. If we needed only rows with some flag specified, we use SingleColumnValueFilter to limit result to only small subset of region. But current implementation is loading both CFs to perform scan, when only small subset is needed. Attached patch adds one routine to Filter interface to allow filter to specify which CF is needed to it's operation. In HRegion, we separate all scanners into two groups: needed for filter and the rest (joined). When new row is considered, only needed data is loaded, filter applied, and only if filter accepts the row, rest of data is loaded. At our data, this speeds up such kind of scans 30-50 times. Also, this gives us the way to better normalize the data into separate columns by optimizing the scans performed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.
[ https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh reassigned HBASE-5892: - Assignee: Andrew Wang [hbck] Refactor parallel WorkItem* to Futures. -- Key: HBASE-5892 URL: https://issues.apache.org/jira/browse/HBASE-5892 Project: HBase Issue Type: Improvement Reporter: Jonathan Hsieh Assignee: Andrew Wang Labels: noob Attachments: hbase-5892.patch This would convert WorkItem* logic (with low level notifies, and rough exception handling) into a more canonical Futures pattern. Currently there are two instances of this pattern (for loading hdfs dirs, for contacting regionservers for assignments, and soon -- for loading hdfs .regioninfo files). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
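For readers unfamiliar with the pattern named in this issue, the following minimal sketch shows parallel work items expressed with an ExecutorService and Futures from java.util.concurrent. It is not the attached hbase-5892.patch: the class and method names are hypothetical and the task body is a placeholder for real work such as loading an HDFS dir.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch of parallel work items expressed as Futures. */
final class FuturesSketch {

    /** Runs one task per input in parallel and collects results in input order. */
    static List<String> checkAll(List<String> dirs, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String dir : dirs) {
                // Placeholder for the real per-item work.
                tasks.add(() -> "checked:" + dir);
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task has finished, so no manual
            // wait/notify bookkeeping is needed.
            for (Future<String> f : pool.invokeAll(tasks)) {
                // get() rethrows a task failure wrapped in ExecutionException.
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Compared with hand-written notifies, a failure inside any task surfaces at `get()` as an ExecutionException, so exception handling ends up in one place.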
[jira] [Commented] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.
[ https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283369#comment-13283369 ] Jonathan Hsieh commented on HBASE-5892: --- Andrew, looks good. I'm going to wait for the hadoopqa robot to execute the test suite. Alternately, since this just modifies hbck, can you try this command and share results: 'mvn test -PlocalTests -Dtest=TestHbaseFsck'? I'd like to keep all hbck across versions essentially the same -- would you be willing to port to 0.90/0.92/0.94? I'd bet that this may apply to 0.94 and 0.92, and that 0.90 would require some near trivial tweaks. [hbck] Refactor parallel WorkItem* to Futures. -- Key: HBASE-5892 URL: https://issues.apache.org/jira/browse/HBASE-5892 Project: HBase Issue Type: Improvement Reporter: Jonathan Hsieh Assignee: Andrew Wang Labels: noob Attachments: hbase-5892.patch This would convert WorkItem* logic (with low level notifies, and rough exception handling) into a more canonical Futures pattern. Currently there are two instances of this pattern (for loading hdfs dirs, for contacting regionservers for assignments, and soon -- for loading hdfs .regioninfo files). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
[ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283412#comment-13283412 ] Jonathan Hsieh commented on HBASE-6050: --- Just for clarification: are these edits actually replayed to the daughter regions, and are these recovered.edits files kept around for something (the CJ?) to eventually clean up? HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent. --- Key: HBASE-6050 URL: https://issues.apache.org/jira/browse/HBASE-6050 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Attachments: HBASE-6050.patch The scenario is like this: - A region is being split. - The master has not yet processed the split. - The region server goes down. - Split log manager starts splitting the logs and creates the recovered.edits in the splitlog path. - CJ starts, deletes the entry from META and also completes the deletion of the region dir. - In HLogSplitter, as the final step, we rename the recovered.edits to come under the regiondir. If the regiondir does not exist there, we create it and then add the recovered.edits. Because of this, HBCK considers it an orphan region, since we have the regiondir but no regioninfo. The cluster is actually fine, but the report is misleading. {code} } else { Path dstdir = dst.getParent(); if (!fs.exists(dstdir)) { if (!fs.mkdirs(dstdir)) LOG.warn("mkdir failed on " + dstdir); } } fs.rename(src, dst); LOG.debug(" moved " + src + " => " + dst); } else { LOG.debug("Could not move recovered edits from " + src + " as it doesn't exist"); } } archiveLogs(null, corruptedLogs, processedLogs, oldLogDir, fs, conf); {code}
[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
[ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283452#comment-13283452 ] ramkrishna.s.vasudevan commented on HBASE-6050: --- @Jon In our case the split completed and the RS went down due to a ZK issue, which is why the Master was not able to respond to the split region completion. Because the RS went down, the recovered.edits creation came into play. Ideally CJ just cleans up the entire region directory, because the parent is in split state and offlined. Also, in this case, as the split is completed we are sure that the data is also flushed to store files. Each daughter region will have its own region directory. Did I answer your question? ;) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent. --- Key: HBASE-6050 URL: https://issues.apache.org/jira/browse/HBASE-6050 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Attachments: HBASE-6050.patch The scenario is like this: - A region is being split. - The master has not yet processed the split. - The region server goes down. - Split log manager starts splitting the logs and creates the recovered.edits in the splitlog path. - CJ starts, deletes the entry from META and also completes the deletion of the region dir. - In HLogSplitter, as the final step, we rename the recovered.edits to come under the regiondir. If the regiondir does not exist there, we create it and then add the recovered.edits. Because of this, HBCK considers it an orphan region, since we have the regiondir but no regioninfo. The cluster is actually fine, but the report is misleading.
{code} } else { Path dstdir = dst.getParent(); if (!fs.exists(dstdir)) { if (!fs.mkdirs(dstdir)) LOG.warn("mkdir failed on " + dstdir); } } fs.rename(src, dst); LOG.debug(" moved " + src + " => " + dst); } else { LOG.debug("Could not move recovered edits from " + src + " as it doesn't exist"); } } archiveLogs(null, corruptedLogs, processedLogs, oldLogDir, fs, conf); {code}
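One possible guard against the behaviour described in this issue is to skip the move when the destination's parent directory no longer exists, instead of recreating it. The sketch below only illustrates that idea on a local filesystem with `java.nio.file`; it does not use the Hadoop FileSystem API, it is not the attached HBASE-6050.patch, and the class and method names are hypothetical. A check followed by a move is itself not atomic, so this narrows the window rather than closing it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical guard: never recreate a region dir just to hold recovered edits. */
final class RecoveredEditsMoveSketch {

    /** Returns true if src was moved to dst, false if the move was skipped. */
    static boolean moveIfRegionDirExists(Path src, Path dst) throws IOException {
        if (!Files.exists(src)) {
            return false; // nothing to move
        }
        Path dstDir = dst.getParent();
        if (dstDir == null || !Files.isDirectory(dstDir)) {
            // The region dir is gone (for example, already cleaned up), so do not
            // create it here; creating it is what would leave a dir without a
            // regioninfo file behind.
            return false;
        }
        Files.move(src, dst);
        return true;
    }
}
```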
[jira] [Updated] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node
[ https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rajeshbabu updated HBASE-6088:

Attachment: HBASE-6088_trunk.patch

Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

Key: HBASE-6088
URL: https://issues.apache.org/jira/browse/HBASE-6088
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Fix For: 0.94.1
Attachments: HBASE-6088_trunk.patch

Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node
{noformat}
2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 26668ms for sessionid 0x1377a75f41d0012, closing socket connection and attempting reconnect
2012-05-24 01:45:41,464 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
{noformat}
{noformat}
2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: cleanupCurrentWriter waiting for transactions to get synced total 189377 synced till here 189365
2012-05-24 01:45:48,474 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup of failed split of ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed setting SPLITTING znode on ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
java.io.IOException: Failed setting SPLITTING znode on ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
    at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
    at org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
    at org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
    ... 5 more
2012-05-24 01:45:48,476 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of failed split of ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
{noformat}
{noformat}
2012-05-24 01:47:28,141 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is not a retry
2012-05-24 01:47:28,142 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup of failed split of ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
java.io.IOException: Failed create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:865)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
    at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
{noformat}
Due to the above exceptions, region splitting kept failing continuously for more than 5 hours.
[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node
[ https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283490#comment-13283490 ]

rajeshbabu commented on HBASE-6088:

Attached patch for trunk. Please review and provide suggestions/comments.

Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

Key: HBASE-6088
URL: https://issues.apache.org/jira/browse/HBASE-6088
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Fix For: 0.94.1
Attachments: HBASE-6088_trunk.patch
[jira] [Assigned] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node
[ https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rajeshbabu reassigned HBASE-6088:

Assignee: rajeshbabu

Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

Key: HBASE-6088
URL: https://issues.apache.org/jira/browse/HBASE-6088
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
Fix For: 0.96.0, 0.94.1
Attachments: HBASE-6088_trunk.patch
[jira] [Updated] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node
[ https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rajeshbabu updated HBASE-6088:

Fix Version/s: 0.96.0
Status: Patch Available (was: Open)

Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

Key: HBASE-6088
URL: https://issues.apache.org/jira/browse/HBASE-6088
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Fix For: 0.96.0, 0.94.1
Attachments: HBASE-6088_trunk.patch
[jira] [Comment Edited] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.
[ https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283369#comment-13283369 ]

Jonathan Hsieh edited comment on HBASE-5892 at 5/25/12 3:19 PM:

Andrew, looks good. I'm going to wait for the hadoopqa robot to execute the test suite. Alternately, since this just modifies hbck, can you try this command and share results: 'mvn test -PlocalTests -Dtest=TestHBaseFsck'?

I'd like to keep all hbck across versions essentially the same -- would you be willing to port to 0.90/0.92/0.94? I'd bet that this may apply to 0.94 and 0.92, and that 0.90 would require some near trivial tweaks.

was (Author: jmhsieh):
Andrew, looks good. I'm going to wait for the hadoopqa robot to execute the test suite. Alternately, since this just modifies hbck, can you try this command and share results: 'mvn test -PlocalTests -Dtest=TestHbaseFsck'?

I'd like to keep all hbck across versions essentially the same -- would you be willing to port to 0.90/0.92/0.94? I'd bet that this may apply to 0.94 and 0.92, and that 0.90 would require some near trivial tweaks.

[hbck] Refactor parallel WorkItem* to Futures.

Key: HBASE-5892
URL: https://issues.apache.org/jira/browse/HBASE-5892
Project: HBase
Issue Type: Improvement
Reporter: Jonathan Hsieh
Assignee: Andrew Wang
Labels: noob
Attachments: hbase-5892.patch

This would convert WorkItem* logic (with low level notifies, and rough exception handling) into a more canonical Futures pattern. Currently there are two instances of this pattern (for loading hdfs dirs, for contacting regionservers for assignments, and soon -- for loading hdfs .regioninfo files).
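For readers unfamiliar with the "canonical Futures pattern" mentioned in HBASE-5892, here is a minimal sketch of the general idea. It is not the attached patch and not actual hbck code: `FuturesSketch`, `loadDir` and `runAll` are hypothetical names, and the "work" is a placeholder string. Only `java.util.concurrent` classes from the JDK are used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for hbck work items; not the real HBaseFsck classes.
class FuturesSketch {

    // Each unit of work becomes a Callable that returns its result or throws.
    static Callable<String> loadDir(final String dir) {
        return () -> "loaded:" + dir; // placeholder for the real work
    }

    // Submit every work item, then collect the results in submission order.
    static List<String> runAll(List<String> dirs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<Future<String>>();
            for (String dir : dirs) {
                futures.add(pool.submit(loadDir(dir)));
            }
            List<String> results = new ArrayList<String>();
            for (Future<String> f : futures) {
                try {
                    results.add(f.get()); // blocks until this item is done
                } catch (ExecutionException e) {
                    // A failure inside a worker surfaces here with its cause attached.
                    throw new IllegalStateException("work item failed", e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Compared with hand-rolled wait/notify, `Future.get()` handles both the waiting and the propagation of worker exceptions, which is presumably the appeal of the refactoring.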
[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node
[ https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283544#comment-13283544 ]

Hadoop QA commented on HBASE-6088:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12529722/HBASE-6088_trunk.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 33 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
    org.apache.hadoop.hbase.client.TestFromClientSide
    org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1995//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1995//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1995//console

This message is automatically generated.
Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

Key: HBASE-6088
URL: https://issues.apache.org/jira/browse/HBASE-6088
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
Fix For: 0.96.0, 0.94.1
Attachments: HBASE-6088_trunk.patch
[jira] [Created] (HBASE-6094) [refGuide] Improvements to new contributor docs
Ian Varley created HBASE-6094:

Summary: [refGuide] Improvements to new contributor docs
Key: HBASE-6094
URL: https://issues.apache.org/jira/browse/HBASE-6094
Project: HBase
Issue Type: Improvement
Reporter: Ian Varley
Assignee: Doug Meil
Priority: Minor

book.xml
* Adding a section in the compression appendix about changing compression codecs.
* A frequent question on the dist-list is whether people have to copy the data into a new table, etc. You don't.
[jira] [Updated] (HBASE-6094) [refGuide] Improvements to new contributor docs
[ https://issues.apache.org/jira/browse/HBASE-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Varley updated HBASE-6094:

Description:
developer.xml
* Expanded explanation around git svn, and mentioning the EGit plugin
* Expanded explanation of setting up the eclipse project
* Extra section about basic compilation using maven and eclipse
* Fix to tarball command that makes it maven2 compatible
* Greatly expanded section about contributing docs, and clarification that pushing generated site is only for those with permissions

was:
book.xml
* adding section in compression appendix about changing compression codecs.
* A frequent question on the dist-list is whether people will have to copy the data into a new table, etc., You don't.

[refGuide] Improvements to new contributor docs

Key: HBASE-6094
URL: https://issues.apache.org/jira/browse/HBASE-6094
Project: HBase
Issue Type: Improvement
Reporter: Ian Varley
Assignee: Doug Meil
Priority: Minor
[jira] [Created] (HBASE-6095) ActiveMasterManager NullPointerException
Jimmy Xiang created HBASE-6095:

Summary: ActiveMasterManager NullPointerException
Key: HBASE-6095
URL: https://issues.apache.org/jira/browse/HBASE-6095
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.94.1
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Fix For: 0.94.1

This affects 0.94 and 0.92. Trunk doesn't have the issue.
{code}
byte [] bytes = ZKUtil.getDataAndWatch(watcher, watcher.masterAddressZNode);
// TODO: redo this to make it atomic (only added for tests)
ServerName master = ServerName.parseVersionedServerName(bytes);
{code}
bytes could be null here.
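The kind of guard the HBASE-6095 report calls for can be sketched as follows. This is only an illustration under stated assumptions, not the contents of hbase-6095.patch: `MasterAddressGuard`, `readZnode` and `parseMaster` are hypothetical stand-ins for the ZooKeeper read and the `ServerName` parsing, and the sample address string is made up.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-ins; the real ZKUtil and ServerName classes are not reproduced here.
class MasterAddressGuard {

    // Stand-in for a znode read: returns null when there is no data to return.
    static byte[] readZnode(boolean present) {
        return present
            ? "host1,60000,1337900000000".getBytes(StandardCharsets.UTF_8) // made-up sample value
            : null;
    }

    // Parses the master name, returning null instead of failing on a null payload.
    static String parseMaster(byte[] bytes) {
        if (bytes == null) {
            return null; // nothing to parse; the caller decides how to proceed
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

The point is simply that the null check happens before the parse call, so a missing master address is reported as "no master" rather than as a NullPointerException.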
[jira] [Updated] (HBASE-6095) ActiveMasterManager NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-6095:

Attachment: hbase-6095.patch

ActiveMasterManager NullPointerException

Key: HBASE-6095
URL: https://issues.apache.org/jira/browse/HBASE-6095
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.94.1
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Fix For: 0.94.1
Attachments: hbase-6095.patch
[jira] [Updated] (HBASE-6094) [refGuide] Improvements to new contributor docs
[ https://issues.apache.org/jira/browse/HBASE-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Varley updated HBASE-6094:

Attachment: book_hbase_6094.xml.patch

[refGuide] Improvements to new contributor docs

Key: HBASE-6094
URL: https://issues.apache.org/jira/browse/HBASE-6094
Project: HBase
Issue Type: Improvement
Reporter: Ian Varley
Assignee: Doug Meil
Priority: Minor
Attachments: book_hbase_6094.xml.patch
[jira] [Commented] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT
[ https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283599#comment-13283599 ]

ramkrishna.s.vasudevan commented on HBASE-6070:

Committed to trunk, 0.94 and 0.92. Thanks for the review, Ted.

AM.nodeDeleted and SSH races creating problems for regions under SPLIT

Key: HBASE-6070
URL: https://issues.apache.org/jira/browse/HBASE-6070
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch, HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch, HBASE-6070_trunk_1.patch

As part of HBASE-5806 we tried to address the problems in Master restart and RS restart while a region SPLIT is in progress. While looking further, we found that there is still one race condition.
1. Split has just started and the znode is in RS_SPLIT state.
2. RS goes down.
3. First callback for SSH comes.
4. As part of the fix for HBASE-5806, SSH knows that some region is in RIT.
5. But now the nodeDeleted event comes for the SPLIT node, and there we try to delete the RIT.
6. After this we check in the SSH whether any node is in RIT. As we don't find the region in RIT, the region is never assigned.

When we fixed HBASE-5806, step 6 happened first and then step 5 happened, so we missed it. Now we have found it. Will come up with a patch shortly.
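The ordering problem described in HBASE-6070 can be illustrated with a toy model. This is only a sketch of the reasoning, not HBase code: `SplitRaceSketch` and `regionAssigned` are hypothetical names, a plain set stands in for the regions-in-transition (RIT) bookkeeping, and the two events run sequentially in a chosen order rather than on real threads.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the event ordering in the issue; names are hypothetical, not HBase classes.
class SplitRaceSketch {

    // Returns true if the SSH check finds the region in RIT and can therefore assign it.
    static boolean regionAssigned(boolean nodeDeletedFirst) {
        Set<String> regionsInTransition = new HashSet<String>();
        regionsInTransition.add("region-1"); // the region is known to be in RIT

        if (nodeDeletedFirst) {
            regionsInTransition.remove("region-1");           // nodeDeleted clears the RIT entry
            return regionsInTransition.contains("region-1");  // SSH check runs afterwards, finds nothing
        }
        boolean found = regionsInTransition.contains("region-1"); // SSH check runs first
        regionsInTransition.remove("region-1");                   // nodeDeleted clears the entry later
        return found;
    }
}
```

With the nodeDeleted event first, the check comes back empty and the region is left unassigned; with the check first, the region is found. That matches the report's observation that the earlier fix was only exercised in the second ordering.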
[jira] [Updated] (HBASE-6095) ActiveMasterManager NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-6095:

Status: Patch Available (was: Open)

ActiveMasterManager NullPointerException

Key: HBASE-6095
URL: https://issues.apache.org/jira/browse/HBASE-6095
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.94.1
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Fix For: 0.94.1
Attachments: hbase-6095.patch
[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT
[ https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-6070:

Resolution: Fixed
Status: Resolved (was: Patch Available)

AM.nodeDeleted and SSH races creating problems for regions under SPLIT

Key: HBASE-6070
URL: https://issues.apache.org/jira/browse/HBASE-6070
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch, HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch, HBASE-6070_trunk_1.patch
[jira] [Commented] (HBASE-6095) ActiveMasterManager NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283600#comment-13283600 ] ramkrishna.s.vasudevan commented on HBASE-6095: --- +1 on patch.
[jira] [Commented] (HBASE-6095) ActiveMasterManager NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283601#comment-13283601 ] Hadoop QA commented on HBASE-6095: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12529744/hbase-6095.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1996//console This message is automatically generated.
[jira] [Updated] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs
[ https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-6068: --- Affects Version/s: 0.96.0 0.92.1 Status: Patch Available (was: Open) Secure HBase cluster : Client not able to call some admin APIs -- Key: HBASE-6068 URL: https://issues.apache.org/jira/browse/HBASE-6068 Project: HBase Issue Type: Bug Components: security Affects Versions: 0.94.0, 0.92.1, 0.96.0 Reporter: Anoop Sam John Assignee: Matteo Bertozzi Attachments: HBASE-6068-v0.patch
In a secure cluster, we allow HBase clients to read certain ZooKeeper nodes by granting global read permission on them. These nodes are the master address znode, the root server znode, and the clusterId znode; in ZKUtil.createACL() we can see that these node names are handled specially. But some other client-side admin APIs also read from ZooKeeper directly on the client. One of them is the isTableEnabled() call (there may be others; this is the one I have seen). Here the client directly reads the znode created for the table and matches its data to decide whether the table is enabled. So in a secure cluster, any client can read the ZooKeeper nodes it needs for normal operation, such as the master address and root server address. But what happens if the client calls this API [isTableEnabled()]?
[jira] [Updated] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs
[ https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-6068: --- Attachment: HBASE-6068-v0.patch Since certain znodes are accessed by the client directly they must be marked as readable by everyone. HBaseAdmin.checkHBaseAvailable() - /hbase ZKTable.populateTableStates() - /hbase/table/* znodes
[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative
[ https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283605#comment-13283605 ] chunhui shen commented on HBASE-5916: -
@ram Thanks for writing up the case in such detail. However, I don't think the above case will happen. Correct me if I'm wrong.
bq. At the same time as master initialization has already been done and so we are able to carry on assignment with SSH also. This will lead to double assignment
Why would it lead to double assignment? When we reassign regions while processing SSH, we skip regions as follows:
{code}
if (processDeadRegion(e.getKey(), e.getValue(),
    this.services.getAssignmentManager(),
    this.server.getCatalogTracker())) {
  ServerName addressFromAM = this.services.getAssignmentManager()
      .getRegionServerOfRegion(e.getKey());
  if (rit != null && !rit.isClosing() && !rit.isPendingClose()) {
    // Skip regions that were in transition unless CLOSING or
    // PENDING_CLOSE
    LOG.info("Skip assigning region " + rit.toString());
  } else if (addressFromAM != null
      && !addressFromAM.equals(this.serverName)) {
    LOG.debug("Skip assigning region "
        + e.getKey().getRegionNameAsString()
        + " because it has been opened in "
        + addressFromAM.getServerName());
  } else {
    toAssignRegions.add(e.getKey());
  }
}
{code}
- In RIT (and neither closing nor pendingClose; it won't be in either of these two states in the above case): skip.
- Already online on another server: skip.
Finally, I think HBASE-5916_trunk_v7.patch is fine, and I agree we should check in the patch for the current JIRA. Thanks for helping with my doubt.
RS restart just before master intialization we make the cluster non operative - Key: HBASE-5916 URL: https://issues.apache.org/jira/browse/HBASE-5916 Project: HBase Issue Type: Bug Affects Versions: 0.92.1, 0.94.0 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.94.1 Attachments: HBASE-5916_trunk.patch, HBASE-5916_trunk_1.patch, HBASE-5916_trunk_1.patch, HBASE-5916_trunk_2.patch, HBASE-5916_trunk_3.patch, HBASE-5916_trunk_4.patch, HBASE-5916_trunk_v5.patch, HBASE-5916_trunk_v6.patch, HBASE-5916_trunk_v7.patch, HBASE-5916v8.patch
Consider a case where the master is being restarted. An RS that was alive when the master restart began is itself restarted before the master enables the ServerShutdownHandler:
{code}
serverShutdownHandlerEnabled = true;
{code}
In this case, when the RS tries to register with the master, the master tries to expire the old server instance, but the server cannot be expired because the serverShutdownHandler is not yet enabled. This can happen when only one RS is restarted or when all the RSs are restarted at the same time (before assignRootandMeta).
{code}
LOG.info(message);
if (existingServer.getStartcode() < serverName.getStartcode()) {
  LOG.info("Triggering server recovery; existingServer "
      + existingServer + " looks stale, new server: " + serverName);
  expireServer(existingServer);
}
{code}
If another RS is brought up, the cluster returns to normal. This may be a rare corner case.
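The start-code comparison in the description can be illustrated with a small sketch. It assumes, as the snippet implies, that a server instance is identified by host, port, and start code, and that an existing entry with an older start code on the same host and port is stale. StaleServerSketch, Server, and looksStale are hypothetical names, not HBase classes.

```java
// Hypothetical model of the stale-server check; not the HBase ServerManager.
final class StaleServerSketch {

    // A server instance: the start code distinguishes successive
    // incarnations of a process on the same host and port.
    static final class Server {
        final String host;
        final int port;
        final long startCode;

        Server(String host, int port, long startCode) {
            this.host = host;
            this.port = port;
            this.startCode = startCode;
        }
    }

    // True when 'existing' is an older incarnation of 'incoming'.
    static boolean looksStale(Server existing, Server incoming) {
        return existing.host.equals(incoming.host)
            && existing.port == incoming.port
            && existing.startCode < incoming.startCode;
    }

    public static void main(String[] args) {
        Server before = new Server("rs1.example.org", 60020, 1000L);
        Server after = new Server("rs1.example.org", 60020, 2000L);
        System.out.println(looksStale(before, after));
    }
}
```

In the scenario reported here, detecting staleness is not the problem; the expiry that should follow cannot proceed until the shutdown handler is enabled.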
[jira] [Updated] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs
[ https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-6068: --- Attachment: (was: HBASE-6068-v0.patch)
[jira] [Updated] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs
[ https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-6068: --- Attachment: HBASE-6068-v0.patch
[jira] [Commented] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs
[ https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283609#comment-13283609 ] Matteo Bertozzi commented on HBASE-6068: 
- HBaseAdmin.checkHBaseAvailable() - exists() /hbase
- ZKTable.populateTableStates() - listChildrenNoWatch() /hbase/table/* znodes
- ZKTable.getTableState() - getData() /hbase/table/<table name>
- HConnectionManager.getCurrentNrHRS() - getNumberOfChildren() - /hbase/rs/
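To make the znodes named in this comment concrete, here is a sketch of a path check that decides which znodes a client would need to read. The class and method names are hypothetical, the paths assume the default /hbase parent znode, and the code does not reproduce the attached patch or ZKUtil.createACL().

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical path classifier for the znodes listed in the comment.
final class WorldReadableZNodesSketch {

    // Assumption: the default parent znode is used.
    private static final String BASE = "/hbase";

    // Znodes the client reads directly: /hbase (exists), /hbase/table
    // (list children), /hbase/rs (count children).
    private static final List<String> EXACT =
        Arrays.asList(BASE, BASE + "/table", BASE + "/rs");

    static boolean isClientReadable(String path) {
        if (EXACT.contains(path)) {
            return true;
        }
        // One znode per table directly under /hbase/table (getData).
        String tablePrefix = BASE + "/table/";
        return path.startsWith(tablePrefix)
            && path.length() > tablePrefix.length()
            && path.indexOf('/', tablePrefix.length()) < 0;
    }

    public static void main(String[] args) {
        System.out.println(isClientReadable("/hbase/table/usertable"));
        System.out.println(isClientReadable("/hbase/splitlog"));
    }
}
```

A check like this would sit alongside the existing special cases for the master address, root server, and clusterId znodes mentioned in the issue description.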
[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative
[ https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283612#comment-13283612 ] ramkrishna.s.vasudevan commented on HBASE-5916: ---
@Chunhui
{code}
else if (addressFromAM != null
    && !addressFromAM.equals(this.serverName)) {
  LOG.debug("Skip assigning region "
      + e.getKey().getRegionNameAsString()
      + " because it has been opened in "
      + addressFromAM.getServerName());
}
{code}
Just to clarify: the assignment will not happen if the address mismatches, but what if, for a few regions that are yet to be assigned, the RIT is still not updated? Chunhui, as you said, all the cases discussed here are rare corner cases. I also really appreciate your help on this; it led us to find more cases. Thank you very much. Let me wait for Stack's comments as well.
[jira] [Commented] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs
[ https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283614#comment-13283614 ] ramkrishna.s.vasudevan commented on HBASE-6068: --- @Matteo Thanks for pointing out similar cases that deal with ZK.
[jira] [Created] (HBASE-6096) AccessController v2
Andrew Purtell created HBASE-6096: - Summary: AccessController v2 Key: HBASE-6096 URL: https://issues.apache.org/jira/browse/HBASE-6096 Project: HBase Issue Type: Umbrella Components: security Affects Versions: 0.96.0, 0.94.1 Reporter: Andrew Purtell Umbrella issue for iteration on the initial AccessController drop. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6036) Add Cluster-level PB-based calls to HMasterInterface (minus file-format related calls)
[ https://issues.apache.org/jira/browse/HBASE-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283616#comment-13283616 ] Gregory Chanan commented on HBASE-6036: --- Safe to mark this Resolved? Add Cluster-level PB-based calls to HMasterInterface (minus file-format related calls) -- Key: HBASE-6036 URL: https://issues.apache.org/jira/browse/HBASE-6036 Project: HBase Issue Type: Task Components: ipc, master, migration Reporter: Gregory Chanan Assignee: Gregory Chanan Fix For: 0.96.0 Attachments: HBASE-6036-v2.patch, HBASE-6036.patch This should be a subtask of HBASE-5445, but since that is a subtask, I can't also make this a subtask (apparently). Convert the cluster-level calls that do not touch the file-format related calls (see HBASE-5453). These are: IsMasterRunning Shutdown StopMaster Balance LoadBalancerIs (was synchronousBalanceSwitch/balanceSwitch) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative
[ https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283617#comment-13283617 ] chunhui shen commented on HBASE-5916: -
bq. but what if for few regions which are yet to be assigned the RIT is still not updated
During master startup, before initialization completes, the region will already be in the RIT via AssignmentManager#processRegionsInTransition in the above case.
[jira] [Commented] (HBASE-6047) Put.has() can't determine result correctly
[ https://issues.apache.org/jira/browse/HBASE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283622#comment-13283622 ] Alex Newman commented on HBASE-6047: Could someone translate the Hudson output? Is this patch still working, or do I need to rebase onto some branches? The robot's output confuses this human.
Put.has() can't determine result correctly -- Key: HBASE-6047 URL: https://issues.apache.org/jira/browse/HBASE-6047 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.92.1 Reporter: Wang Qiang Assignee: Alex Newman Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly-v2.patch, 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly.patch, 6047-92.txt, PutTest.java
The public method 'has(byte [] family, byte [] qualifier)' internally invokes the private method 'has(byte [] family, byte [] qualifier, long ts, byte [] value, boolean ignoreTS, boolean ignoreValue)' with 'value=new byte[0], ignoreTS=true, ignoreValue=true', but there is a logic error in the body. It enters the block
{code}
else if (ignoreValue) {
  for (KeyValue kv: list) {
    if (Arrays.equals(kv.getFamily(), family)
        && Arrays.equals(kv.getQualifier(), qualifier)
        && kv.getTimestamp() == ts) {
      return true;
    }
  }
}
{code}
The expression 'kv.getTimestamp() == ts' in the if condition should only be evaluated when 'ignoreTS=false'; otherwise, the following code prints false!
{code}
Put put = new Put(Bytes.toBytes("row-01"));
put.add(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01"), 1234567L, Bytes.toBytes("value-01"));
System.out.println(put.has(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01")));
{code}
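A sketch of the corrected check, assuming the fix is to compare timestamps only when ignoreTS is false. Cell and PutHasSketch are simplified stand-ins for KeyValue and Put rather than the HBase classes, the Long.MAX_VALUE placeholder timestamp is an assumption, and the attached patches may be structured differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for Put; not the HBase class.
final class PutHasSketch {

    // Simplified stand-in for KeyValue.
    static final class Cell {
        final byte[] family;
        final byte[] qualifier;
        final long timestamp;
        final byte[] value;

        Cell(byte[] family, byte[] qualifier, long timestamp, byte[] value) {
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    private final List<Cell> cells = new ArrayList<>();

    void add(byte[] family, byte[] qualifier, long ts, byte[] value) {
        cells.add(new Cell(family, qualifier, ts, value));
    }

    // Public overload: ignores both timestamp and value.
    // Long.MAX_VALUE is a placeholder timestamp (assumption).
    boolean has(byte[] family, byte[] qualifier) {
        return has(family, qualifier, Long.MAX_VALUE, new byte[0], true, true);
    }

    // Each optional criterion is compared only when its ignore flag is false.
    boolean has(byte[] family, byte[] qualifier, long ts, byte[] value,
                boolean ignoreTS, boolean ignoreValue) {
        for (Cell c : cells) {
            if (Arrays.equals(c.family, family)
                    && Arrays.equals(c.qualifier, qualifier)
                    && (ignoreTS || c.timestamp == ts)
                    && (ignoreValue || Arrays.equals(c.value, value))) {
                return true;
            }
        }
        return false;
    }

    static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        PutHasSketch put = new PutHasSketch();
        put.add(b("family-01"), b("qualifier-01"), 1234567L, b("value-01"));
        System.out.println(put.has(b("family-01"), b("qualifier-01"))); // prints true
    }
}
```

Folding the flags into one condition avoids a separate branch per flag combination, which is where the reported bug crept in.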
[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative
[ https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283621#comment-13283621 ] ramkrishna.s.vasudevan commented on HBASE-5916: --- I meant that the RIT still holds the original server name and has not yet been updated with the new RS. :) Good on you, Chunhui.
[jira] [Commented] (HBASE-6047) Put.has() can't determine result correctly
[ https://issues.apache.org/jira/browse/HBASE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283626#comment-13283626 ] ramkrishna.s.vasudevan commented on HBASE-6047: --- @Alex Hudson has picked up the patch, and it is available in every version where the patch has gone in. Some test cases failed in the build, which is why Hudson reports that it failed to create a build. Maybe you can cross-check the build report to see whether your patch caused any of the failures.
[jira] [Created] (HBASE-6097) TestHRegion.testBatchPut is flaky on 0.92
Gregory Chanan created HBASE-6097: - Summary: TestHRegion.testBatchPut is flaky on 0.92 Key: HBASE-6097 URL: https://issues.apache.org/jira/browse/HBASE-6097 Project: HBase Issue Type: Bug Components: test, wal Affects Versions: 0.92.1 Reporter: Gregory Chanan Assignee: Gregory Chanan
If I run this test in a loop, I get failures like the following:
Error Message: expected:<1> but was:<2>
Stack Trace:
junit.framework.AssertionFailedError: expected:<1> but was:<2>
at junit.framework.Assert.fail(Assert.java:50)
at junit.framework.Assert.failNotEquals(Assert.java:287)
at junit.framework.Assert.assertEquals(Assert.java:67)
at junit.framework.Assert.assertEquals(Assert.java:134)
at junit.framework.Assert.assertEquals(Assert.java:140)
at org.apache.hadoop.hbase.regionserver.TestHRegion.testBatchPut(TestHRegion.java:536)
[jira] [Commented] (HBASE-6047) Put.has() can't determine result correctly
[ https://issues.apache.org/jira/browse/HBASE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283632#comment-13283632 ] Alex Newman commented on HBASE-6047: ramkrishna, sorry to be dense, but just so I understand what you want me to do: I looked at the Jenkins builds and it looks like we are having flaky test issues. Sounds like I know what my next JIRA is. Can you confirm that I am correct in my diagnosis? Put.has() can't determine result correctly -- Key: HBASE-6047 URL: https://issues.apache.org/jira/browse/HBASE-6047 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.92.1 Reporter: Wang Qiang Assignee: Alex Newman Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly-v2.patch, 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly.patch, 6047-92.txt, PutTest.java The public method 'has(byte [] family, byte [] qualifier)' internally invokes the private method 'has(byte [] family, byte [] qualifier, long ts, byte [] value, boolean ignoreTS, boolean ignoreValue)' with 'value=new byte[0], ignoreTS=true, ignoreValue=true', but there is a logic error in the body. Execution enters this block:
{code}
else if (ignoreValue) {
  for (KeyValue kv : list) {
    if (Arrays.equals(kv.getFamily(), family)
        && Arrays.equals(kv.getQualifier(), qualifier)
        && kv.getTimestamp() == ts) {
      return true;
    }
  }
}
{code}
The expression 'kv.getTimestamp() == ts' in the if condition should only be present when 'ignoreTS=false'. As written, the following code prints false:
{code}
Put put = new Put(Bytes.toBytes("row-01"));
put.add(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01"), 1234567L, Bytes.toBytes("value-01"));
System.out.println(put.has(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01")));
{code}
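The logic error reported in HBASE-6047 can be illustrated with a small stand-alone model. This is only a sketch: `PutHasSketch`, its `Cell` type, and the `bytes` helper are hypothetical stand-ins rather than HBase classes, and the single combined condition is one possible way to honour both flags, not necessarily what the attached patches do.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified stand-in for Put; not the HBase class. */
final class PutHasSketch {
    /** Minimal cell model: family, qualifier, timestamp, value. */
    private static final class Cell {
        final byte[] family;
        final byte[] qualifier;
        final long ts;
        final byte[] value;

        Cell(byte[] family, byte[] qualifier, long ts, byte[] value) {
            this.family = family;
            this.qualifier = qualifier;
            this.ts = ts;
            this.value = value;
        }
    }

    private final List<Cell> cells = new ArrayList<>();

    /** Illustrative helper playing the role of Bytes.toBytes(String). */
    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    void add(byte[] family, byte[] qualifier, long ts, byte[] value) {
        cells.add(new Cell(family, qualifier, ts, value));
    }

    /** Public overload: ignores both timestamp and value (ts is a placeholder). */
    boolean has(byte[] family, byte[] qualifier) {
        return has(family, qualifier, 0L, new byte[0], true, true);
    }

    /** Each flag switches off exactly one comparison, so ignoreTS is always honoured. */
    boolean has(byte[] family, byte[] qualifier, long ts, byte[] value,
                boolean ignoreTS, boolean ignoreValue) {
        for (Cell c : cells) {
            if (Arrays.equals(c.family, family)
                    && Arrays.equals(c.qualifier, qualifier)
                    && (ignoreTS || c.ts == ts)
                    && (ignoreValue || Arrays.equals(c.value, value))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        PutHasSketch put = new PutHasSketch();
        put.add(bytes("family-01"), bytes("qualifier-01"), 1234567L, bytes("value-01"));
        // Prints true: the timestamp comparison is skipped when ignoreTS is set.
        System.out.println(put.has(bytes("family-01"), bytes("qualifier-01")));
    }
}
```

Folding the flags into one condition avoids a separate branch per flag combination, which is where the reported bug appears to have slipped in.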
[jira] [Commented] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs
[ https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283634#comment-13283634 ] Hadoop QA commented on HBASE-6068: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12529749/HBASE-6068-v0.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 33 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1997//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1997//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1997//console This message is automatically generated. Secure HBase cluster : Client not able to call some admin APIs -- Key: HBASE-6068 URL: https://issues.apache.org/jira/browse/HBASE-6068 Project: HBase Issue Type: Bug Components: security Affects Versions: 0.92.1, 0.94.0, 0.96.0 Reporter: Anoop Sam John Assignee: Matteo Bertozzi Attachments: HBASE-6068-v0.patch In case of secure cluster, we allow the HBase clients to read the zk nodes by providing the global read permissions to all for certain nodes. These nodes are the master address znode, root server znode and the clusterId znode. 
In ZKUtil.createACL(), these node names are handled specially. But some other client-side admin APIs also read from ZooKeeper directly from the client. These include the isTableEnabled() call (there may be others; this is the one I have seen). Here the client directly reads a node in ZooKeeper (the node created for this table) and matches the data to determine whether the table is enabled. In a secure cluster, any client can read the ZooKeeper nodes it needs for normal operation, such as the master address and root server address. But what happens if the client calls this API [isTableEnabled()]?
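The gap described in HBASE-6068 can be modelled without ZooKeeper at all. The sketch below is an assumption-laden illustration: `ZnodeAclSketch`, its `Acl` enum, and the znode paths used in `main` are hypothetical, and it does not reproduce the real ZKUtil.createACL() logic. It only shows why a read of a table znode fails when just a fixed set of znodes is world-readable.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of per-znode ACL selection; not the ZKUtil implementation. */
final class ZnodeAclSketch {
    /** Two illustrative ACL outcomes. */
    enum Acl { CREATOR_ONLY, CREATOR_ALL_WORLD_READ }

    private final Set<String> worldReadable;

    ZnodeAclSketch(String... worldReadableZnodes) {
        this.worldReadable = new HashSet<>(Arrays.asList(worldReadableZnodes));
    }

    /** Only explicitly listed znodes get world-read permission. */
    Acl aclFor(String znode) {
        return worldReadable.contains(znode) ? Acl.CREATOR_ALL_WORLD_READ : Acl.CREATOR_ONLY;
    }

    /** An ordinary (non-creator) client can read only world-readable znodes. */
    boolean clientCanRead(String znode) {
        return aclFor(znode) == Acl.CREATOR_ALL_WORLD_READ;
    }

    public static void main(String[] args) {
        // Paths are assumptions chosen for illustration.
        ZnodeAclSketch acls = new ZnodeAclSketch(
                "/hbase/master", "/hbase/root-region-server", "/hbase/hbaseid");
        System.out.println(acls.clientCanRead("/hbase/master"));
        // The table znode is not in the special-cased set, so a client-side
        // isTableEnabled()-style read would be denied in this model.
        System.out.println(acls.clientCanRead("/hbase/table/t1"));
    }
}
```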
[jira] [Updated] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-5666: --- Resolution: Duplicate Status: Resolved (was: Patch Available) Fixed with HBASE-5849 RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Affects Versions: 0.92.1, 0.94.0, 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-0.92.patch, HBASE-5666-v1.patch, HBASE-5666-v2.patch, HBASE-5666-v3.patch, HBASE-5666-v4.patch, HBASE-5666-v5.patch, HBASE-5666-v6.patch, HBASE-5666-v7.patch, HBASE-5666-v8.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log I have a script that starts HBase and a couple of region servers in distributed mode (hbase.cluster.distributed = true):
{code}
$HBASE_HOME/bin/start-hbase.sh
$HBASE_HOME/bin/local-regionservers.sh start 1 2 3
{code}
but the region servers are not able to start. It seems that during RS startup the znode is still not available, and HRegionServer.initializeZooKeeper() checks only once whether the base node is available.
{code}
2012-03-28 21:54:05,013 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2012-03-28 21:54:08,598 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server localhost,60202,133296824: Initialization of RS failed. Hence aborting RS.
java.io.IOException: Received the shutdown message while waiting.
	at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
	at java.lang.Thread.run(Thread.java:662)
{code}
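The behaviour asked for in HBASE-5666, checking for the base znode more than once before giving up, amounts to a bounded retry loop. The following is a generic sketch under stated assumptions: `BaseNodeWaitSketch`, `waitFor`, and its parameters are hypothetical names, the existence check is abstracted as a `BooleanSupplier`, and this is not the change that went in with HBASE-5849.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical bounded-retry helper; not the HRegionServer code. */
final class BaseNodeWaitSketch {
    /**
     * Polls nodeExists up to maxAttempts times, sleeping sleepMillis between
     * attempts. Returns true as soon as the check succeeds, false if every
     * attempt fails or the thread is interrupted.
     */
    static boolean waitFor(BooleanSupplier nodeExists, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (nodeExists.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt for callers and stop waiting.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        final int[] checks = {0};
        // Simulates a base znode that only becomes visible on the third check.
        boolean available = waitFor(() -> ++checks[0] >= 3, 5, 10L);
        System.out.println(available + " after " + checks[0] + " checks");
    }
}
```

A single check corresponds to `maxAttempts = 1`, which is the failure mode the report describes when the master has not yet created the znode.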
[jira] [Commented] (HBASE-6097) TestHRegion.testBatchPut is flaky on 0.92
[ https://issues.apache.org/jira/browse/HBASE-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283637#comment-13283637 ] Gregory Chanan commented on HBASE-6097: --- The issue is that this test checks that a sync happened like so:
{code}
assertEquals(1, HLog.getSyncOps());
{code}
But in 0.92, the LogSyncer thread will sync, and increment HLog.getSyncOps(), even if there are no updates to the WAL. This has been fixed in 0.94+. So, you can trigger a failure by setting hbase.regionserver.optionallogflushinterval to an extremely low value and throwing random sleeps into the test. Possibilities for fixing: 1) Backport the work from 0.94+ that avoids syncing if there are no updates to the WAL. 2) Only change the test and just check that a sync ran, e.g.
{code}
assert(HLog.getSyncOps() > 0);
{code}
This makes the test a bit too accepting, because then it is possible that the syncer syncs nothing and we'd think a sync actually ran. I'll investigate some more. TestHRegion.testBatchPut is flaky on 0.92 - Key: HBASE-6097 URL: https://issues.apache.org/jira/browse/HBASE-6097 Project: HBase Issue Type: Bug Components: test, wal Affects Versions: 0.92.1 Reporter: Gregory Chanan Assignee: Gregory Chanan If I run this test in a loop, I get failures like the following: Error Message: expected:<1> but was:<2> Stack Trace: junit.framework.AssertionFailedError: expected:<1> but was:<2> at junit.framework.Assert.fail(Assert.java:50) at junit.framework.Assert.failNotEquals(Assert.java:287) at junit.framework.Assert.assertEquals(Assert.java:67) at junit.framework.Assert.assertEquals(Assert.java:134) at junit.framework.Assert.assertEquals(Assert.java:140) at org.apache.hadoop.hbase.regionserver.TestHRegion.testBatchPut(TestHRegion.java:536)
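The difference between the 0.92 and 0.94+ behaviour described in the HBASE-6097 comment can be sketched with a tiny counter model. Everything here is hypothetical (`SyncCounterSketch`, the `skipEmptySyncs` flag, and the method names); it is not HLog code, just an illustration of why a background sync with nothing to flush turns an expected count of 1 into 2.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of a log that counts sync operations; not HLog. */
final class SyncCounterSketch {
    private final AtomicLong appended = new AtomicLong();
    private final AtomicLong synced = new AtomicLong();
    private final AtomicInteger syncOps = new AtomicInteger();
    private final boolean skipEmptySyncs;

    /** skipEmptySyncs=true models the "no sync without updates" behaviour. */
    SyncCounterSketch(boolean skipEmptySyncs) {
        this.skipEmptySyncs = skipEmptySyncs;
    }

    void append() {
        appended.incrementAndGet();
    }

    /** Called both by explicit syncs and by a periodic background syncer. */
    synchronized void sync() {
        long target = appended.get();
        if (skipEmptySyncs && target == synced.get()) {
            return; // nothing new to sync, so do not count this call
        }
        synced.set(target);
        syncOps.incrementAndGet();
    }

    int getSyncOps() {
        return syncOps.get();
    }

    public static void main(String[] args) {
        for (boolean skip : new boolean[] {false, true}) {
            SyncCounterSketch log = new SyncCounterSketch(skip);
            log.append();
            log.sync(); // sync triggered by the write
            log.sync(); // background syncer fires with no new edits
            System.out.println("skipEmptySyncs=" + skip + " syncOps=" + log.getSyncOps());
        }
    }
}
```

With the flag off the count reaches 2 and an `assertEquals(1, ...)` style check fails. With it on the count stays at 1, which corresponds to option 1 in the comment.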
[jira] [Created] (HBASE-6098) ACL design changes
Andrew Purtell created HBASE-6098: - Summary: ACL design changes Key: HBASE-6098 URL: https://issues.apache.org/jira/browse/HBASE-6098 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell
[jira] [Created] (HBASE-6099) Secure ZooKeeper integration changes
Andrew Purtell created HBASE-6099: - Summary: Secure ZooKeeper integration changes Key: HBASE-6099 URL: https://issues.apache.org/jira/browse/HBASE-6099 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell
[jira] [Commented] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs
[ https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283643#comment-13283643 ] Hadoop QA commented on HBASE-6068: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12529753/HBASE-6068-v0.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 33 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1998//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1998//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1998//console This message is automatically generated. Secure HBase cluster : Client not able to call some admin APIs -- Key: HBASE-6068 URL: https://issues.apache.org/jira/browse/HBASE-6068 Project: HBase Issue Type: Bug Components: security Affects Versions: 0.92.1, 0.94.0, 0.96.0 Reporter: Anoop Sam John Assignee: Matteo Bertozzi Attachments: HBASE-6068-v0.patch In case of secure cluster, we allow the HBase clients to read the zk nodes by providing the global read permissions to all for certain nodes. These nodes are the master address znode, root server znode and the clusterId znode. 
In ZKUtil.createACL(), these node names are handled specially. But some other client-side admin APIs also read from ZooKeeper directly from the client. These include the isTableEnabled() call (there may be others; this is the one I have seen). Here the client directly reads a node in ZooKeeper (the node created for this table) and matches the data to determine whether the table is enabled. In a secure cluster, any client can read the ZooKeeper nodes it needs for normal operation, such as the master address and root server address. But what happens if the client calls this API [isTableEnabled()]?
[jira] [Created] (HBASE-6100) Fix the frequent testcase failures in 0.94 from build no #209
ramkrishna.s.vasudevan created HBASE-6100: - Summary: Fix the frequent testcase failures in 0.94 from build no #209 Key: HBASE-6100 URL: https://issues.apache.org/jira/browse/HBASE-6100 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: ramkrishna.s.vasudevan Fix For: 0.94.1 Fix the flaky tests in the 0.94 branch after build #209. Many test cases, such as org.apache.hadoop.hbase.TestLocalHBaseCluster.testLocalHBaseCluster, org.apache.hadoop.hbase.TestZooKeeper.testClientSessionExpired, and org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol.testSingleMethod, are failing frequently.
[jira] [Created] (HBASE-6101) Insure Observers cover all RPC and lifecycle code paths
Andrew Purtell created HBASE-6101: - Summary: Insure Observers cover all RPC and lifecycle code paths Key: HBASE-6101 URL: https://issues.apache.org/jira/browse/HBASE-6101 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell
[jira] [Updated] (HBASE-5986) Clients can see holes in the META table when regions are being split
[ https://issues.apache.org/jira/browse/HBASE-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-5986: - Attachment: HBASE-5986-0.94.patch HBASE-5986-0.92.patch Attaching patches for the 0.92 and 0.94 branches. They are direct ports of the v3 patch, but the 0.92 patch also includes the HRegionServer.getOnlineRegions(byte[] tableName) function, copied directly from 0.94, since we need it. I discovered this issue when testing with 0.92, so I would like the fix to make it into that branch. One minor mishap on my part is that the v3 patch which went into trunk includes an unrelated change in RegionServerDynamicStatistics. The related issue is HBASE-6025. Although the change is trivial (changing RegionServerDynamicStatistics to extend the HBase-specific MetricsMBeanBase rather than the Hadoop-specific MetricsDynamicMBeanBase), we may want to note this, or revert that part. The backport patches do not include this change. Sorry for the trouble, guys. Clients can see holes in the META table when regions are being split Key: HBASE-5986 URL: https://issues.apache.org/jira/browse/HBASE-5986 Project: HBase Issue Type: Bug Affects Versions: 0.92.1, 0.96.0, 0.94.1 Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.96.0 Attachments: 5986-v2.txt, HBASE-5986-0.92.patch, HBASE-5986-0.94.patch, HBASE-5986-test_v1.patch, HBASE-5986_v3.patch We found this issue when running large-scale ingestion tests for HBASE-5754. The problem is that the .META. table updates are not atomic while splitting a region. In SplitTransaction, there is a time gap between marking the parent offline and adding the daughters to the META table. This can result in clients using MetaScanner or HTable.getStartEndKeys (used by the TableInputFormat) missing regions whose parent has just been made offline but whose daughters have not been added yet. This is also related to HBASE-4335.
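A "hole" in the sense of HBASE-5986 is a break in the chain of region key ranges that a client reads from META. The sketch below shows one way to detect such a break. It is an illustration only: `MetaHoleSketch`, `Region`, and `firstHole` are hypothetical names, keys are plain byte arrays with the empty array meaning "open-ended", and none of this is MetaScanner code.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical contiguity check over region key ranges; not HBase code. */
final class MetaHoleSketch {
    /** A region covering [start, end); an empty array means unbounded. */
    static final class Region {
        final byte[] start;
        final byte[] end;

        Region(byte[] start, byte[] end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Illustrative factory taking string keys. */
    static Region region(String start, String end) {
        return new Region(start.getBytes(), end.getBytes());
    }

    /**
     * Expects regions sorted by start key. Returns the index at which the
     * chain breaks (a region whose start key differs from the previous end
     * key, or regions.size() if the last region is not open-ended), or -1
     * if the ranges cover the whole key space.
     */
    static int firstHole(List<Region> regions) {
        if (regions.isEmpty()) {
            return 0;
        }
        byte[] expectedStart = new byte[0];
        for (int i = 0; i < regions.size(); i++) {
            Region r = regions.get(i);
            if (!Arrays.equals(r.start, expectedStart)) {
                return i;
            }
            expectedStart = r.end;
        }
        return expectedStart.length == 0 ? -1 : regions.size();
    }

    public static void main(String[] args) {
        // Complete table: three contiguous regions.
        System.out.println(firstHole(Arrays.asList(
                region("", "b"), region("b", "d"), region("d", ""))));
        // Mid-split view: parent [b, d) is offline, daughters not yet visible.
        System.out.println(firstHole(Arrays.asList(
                region("", "b"), region("d", ""))));
    }
}
```

A client that sees the second listing would silently skip the key range between "b" and "d", which is the symptom reported for TableInputFormat.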
[jira] [Commented] (HBASE-6047) Put.has() can't determine result correctly
[ https://issues.apache.org/jira/browse/HBASE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283650#comment-13283650 ] ramkrishna.s.vasudevan commented on HBASE-6047: --- I have raised HBASE-6100 to address the test case failures in 0.94. In 0.92 I see that TestLocalHBaseCluster is failing due to
{code}
Starting shutdown. org.apache.hadoop.hbase.util.FileSystemVersionException: File system needs to be upgraded. You have version null and I want version 7. Run the '${HBASE_HOME}/bin/hbase migrate' script.
{code}
I feel we need to correct this across all branches. Put.has() can't determine result correctly -- Key: HBASE-6047 URL: https://issues.apache.org/jira/browse/HBASE-6047 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.92.1 Reporter: Wang Qiang Assignee: Alex Newman Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly-v2.patch, 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly.patch, 6047-92.txt, PutTest.java The public method 'has(byte [] family, byte [] qualifier)' internally invokes the private method 'has(byte [] family, byte [] qualifier, long ts, byte [] value, boolean ignoreTS, boolean ignoreValue)' with 'value=new byte[0], ignoreTS=true, ignoreValue=true', but there is a logic error in the body. Execution enters this block:
{code}
else if (ignoreValue) {
  for (KeyValue kv : list) {
    if (Arrays.equals(kv.getFamily(), family)
        && Arrays.equals(kv.getQualifier(), qualifier)
        && kv.getTimestamp() == ts) {
      return true;
    }
  }
}
{code}
The expression 'kv.getTimestamp() == ts' in the if condition should only be present when 'ignoreTS=false'. As written, the following code prints false:
{code}
Put put = new Put(Bytes.toBytes("row-01"));
put.add(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01"), 1234567L, Bytes.toBytes("value-01"));
System.out.println(put.has(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01")));
{code}
[jira] [Commented] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT
[ https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283652#comment-13283652 ] Hudson commented on HBASE-6070: --- Integrated in HBase-0.94 #217 (See [https://builds.apache.org/job/HBase-0.94/217/]) HBASE-6070 AM.nodeDeleted and SSH races creating problems for regions under SPLIT (Ramkrishna) (Revision 1342725) Result = FAILURE ramkrishna : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java AM.nodeDeleted and SSH races creating problems for regions under SPLIT -- Key: HBASE-6070 URL: https://issues.apache.org/jira/browse/HBASE-6070 Project: HBase Issue Type: Bug Affects Versions: 0.92.1, 0.94.0 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch, HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch, HBASE-6070_trunk_1.patch We tried to address the problems with Master restart and RS restart while a region SPLIT is in progress as part of HBASE-5806. While testing further we found that there is still one race condition:
1) The split has just started and the znode is in RS_SPLIT state.
2) The RS goes down.
3) The first callback for SSH comes.
4) As part of the fix for HBASE-5806, SSH knows that some region is in RIT.
5) But now the nodeDeleted event comes for the SPLIT node, and there we try to delete the RIT.
6) After this we check in SSH whether any node is in RIT. As we don't find the region in RIT, the region is never assigned.
When we fixed HBASE-5806, step 6 happened first and then step 5 happened, so we missed this case. Now we have found it. Will come up with a patch shortly.
[jira] [Commented] (HBASE-6077) Document the most common secure RPC troubleshooting resolutions
[ https://issues.apache.org/jira/browse/HBASE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283653#comment-13283653 ] Hudson commented on HBASE-6077: --- Integrated in HBase-0.94 #217 (See [https://builds.apache.org/job/HBase-0.94/217/]) Amend HBASE-6077. Remove stray tag (Revision 1342705) Result = FAILURE apurtell : Files : * /hbase/branches/0.94/src/docbkx/troubleshooting.xml Document the most common secure RPC troubleshooting resolutions --- Key: HBASE-6077 URL: https://issues.apache.org/jira/browse/HBASE-6077 Project: HBase Issue Type: Task Components: documentation, security Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6077.patch See attached manual troubleshooting section update.
[jira] [Created] (HBASE-6102) API and shell usability improvements
Andrew Purtell created HBASE-6102: - Summary: API and shell usability improvements Key: HBASE-6102 URL: https://issues.apache.org/jira/browse/HBASE-6102 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell
[jira] [Commented] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT
[ https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283658#comment-13283658 ] Hudson commented on HBASE-6070: --- Integrated in HBase-TRUNK #2922 (See [https://builds.apache.org/job/HBase-TRUNK/2922/]) HBASE-6070 AM.nodeDeleted and SSH races creating problems for regions under SPLIT (Ramkrishna) (Revision 1342724) Result = FAILURE ramkrishna : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/Mocking.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java AM.nodeDeleted and SSH races creating problems for regions under SPLIT -- Key: HBASE-6070 URL: https://issues.apache.org/jira/browse/HBASE-6070 Project: HBase Issue Type: Bug Affects Versions: 0.92.1, 0.94.0 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch, HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch, HBASE-6070_trunk_1.patch We tried to address the problems with Master restart and RS restart while a region SPLIT is in progress as part of HBASE-5806. While testing further we found that there is still one race condition:
1) The split has just started and the znode is in RS_SPLIT state.
2) The RS goes down.
3) The first callback for SSH comes.
4) As part of the fix for HBASE-5806, SSH knows that some region is in RIT.
5) But now the nodeDeleted event comes for the SPLIT node, and there we try to delete the RIT.
6) After this we check in SSH whether any node is in RIT. As we don't find the region in RIT, the region is never assigned.
When we fixed HBASE-5806, step 6 happened first and then step 5 happened, so we missed this case. Now we have found it. Will come up with a patch shortly.
[jira] [Commented] (HBASE-6077) Document the most common secure RPC troubleshooting resolutions
[ https://issues.apache.org/jira/browse/HBASE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283659#comment-13283659 ] Hudson commented on HBASE-6077: --- Integrated in HBase-TRUNK #2922 (See [https://builds.apache.org/job/HBase-TRUNK/2922/]) Amend HBASE-6077. Remove stray tag (Revision 1342704) Result = FAILURE apurtell : Files : * /hbase/trunk/src/docbkx/troubleshooting.xml Document the most common secure RPC troubleshooting resolutions --- Key: HBASE-6077 URL: https://issues.apache.org/jira/browse/HBASE-6077 Project: HBase Issue Type: Task Components: documentation, security Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6077.patch See attached manual troubleshooting section update.
[jira] [Updated] (HBASE-6002) Possible chance of resource leak in HlogSplitter
[ https://issues.apache.org/jira/browse/HBASE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6002: -- Attachment: HBASE-6002_trunk_1.patch Updated patch. I have introduced a boolean to track whether close has already been attempted. Possible chance of resource leak in HlogSplitter Key: HBASE-6002 URL: https://issues.apache.org/jira/browse/HBASE-6002 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0, 0.96.0 Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HBASE-6002.patch, HBASE-6002_0.94_1.patch, HBASE-6002_trunk.patch, HBASE-6002_trunk_1.patch In HLogSplitter.splitLogFileToTemp, the Reader (in) is not closed, and in the finally block, if any exception occurs in the loop that closes the writers (wap.w), the remaining writers won't be closed.
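The leak pattern described in HBASE-6002, where one failing close() prevents the remaining writers from being closed, is usually avoided by catching per-resource failures inside the loop. The sketch below shows that general pattern only: `CloseAllSketch` and `closeAll` are hypothetical names, it works on any `java.io.Closeable`, and it does not reproduce the attached patches (which, per the comment above, also track whether a close was already attempted).

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that closes every resource even if some fail. */
final class CloseAllSketch {
    /**
     * Attempts to close each resource in order. The first IOException is
     * remembered and rethrown at the end; later ones are attached to it as
     * suppressed exceptions so that no failure is lost.
     */
    static void closeAll(List<? extends Closeable> resources) throws IOException {
        IOException first = null;
        for (Closeable resource : resources) {
            try {
                resource.close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }

    public static void main(String[] args) {
        final int[] closeCalls = {0};
        Closeable good = () -> closeCalls[0]++;
        Closeable bad = () -> {
            closeCalls[0]++;
            throw new IOException("simulated close failure");
        };
        try {
            closeAll(Arrays.asList(bad, good, good));
        } catch (IOException e) {
            System.out.println("close failed: " + e.getMessage());
        }
        // All three resources were attempted despite the first failure.
        System.out.println("close() calls: " + closeCalls[0]);
    }
}
```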
[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
[ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283663#comment-13283663 ] ramkrishna.s.vasudevan commented on HBASE-6050: --- I will commit this tomorrow morning. HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent. --- Key: HBASE-6050 URL: https://issues.apache.org/jira/browse/HBASE-6050 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Attachments: HBASE-6050.patch The scenario is as follows:
- A region is being split.
- The master has not yet processed the split.
- The region server goes down.
- The split log manager starts splitting the logs and creates the recovered.edits in the splitlog path.
- CJ starts, deletes the entry from META, and also completes the deletion of the region dir.
- In HLogSplitter, as the final step, we rename the recovered.edits to place it under the regiondir. If the regiondir does not exist at that point, we create it and then add the recovered.edits.
Because of this, HBCK considers it an orphan region, since the regiondir exists but has no regioninfo. The cluster is actually fine, but the report is misleading.
{code}
      } else {
        Path dstdir = dst.getParent();
        if (!fs.exists(dstdir)) {
          if (!fs.mkdirs(dstdir)) LOG.warn("mkdir failed on " + dstdir);
        }
      }
      fs.rename(src, dst);
      LOG.debug(" moved " + src + " => " + dst);
    } else {
      LOG.debug("Could not move recovered edits from " + src + " as it doesn't exist");
    }
  }
  archiveLogs(null, corruptedLogs, processedLogs, oldLogDir, fs, conf);
{code}
[jira] [Updated] (HBASE-6002) Possible chance of resource leak in HlogSplitter
[ https://issues.apache.org/jira/browse/HBASE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6002: -- Status: Patch Available (was: Open) Possible chance of resource leak in HlogSplitter Key: HBASE-6002 URL: https://issues.apache.org/jira/browse/HBASE-6002 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0, 0.96.0 Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HBASE-6002.patch, HBASE-6002_0.94_1.patch, HBASE-6002_trunk.patch, HBASE-6002_trunk_1.patch In HLogSplitter.splitLogFileToTemp, the Reader (in) is not closed, and in the finally block, if any exception occurs in the loop that closes the writers (wap.w), the remaining writers won't be closed.
[jira] [Updated] (HBASE-6002) Possible chance of resource leak in HlogSplitter
[ https://issues.apache.org/jira/browse/HBASE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6002: -- Status: Open (was: Patch Available) Possible chance of resource leak in HlogSplitter Key: HBASE-6002 URL: https://issues.apache.org/jira/browse/HBASE-6002 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0, 0.96.0 Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HBASE-6002.patch, HBASE-6002_0.94_1.patch, HBASE-6002_trunk.patch, HBASE-6002_trunk_1.patch In HLogSplitter.splitLogFileToTemp, the Reader (in) is not closed, and in the finally block, if any exception occurs in the loop that closes the writers (wap.w), the remaining writers won't be closed.
[jira] [Updated] (HBASE-6098) ACL design changes
[ https://issues.apache.org/jira/browse/HBASE-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6098: -- Component/s: security Affects Version/s: 0.94.1 0.96.0 ACL design changes -- Key: HBASE-6098 URL: https://issues.apache.org/jira/browse/HBASE-6098 Project: HBase Issue Type: Sub-task Components: security Affects Versions: 0.96.0, 0.94.1 Reporter: Andrew Purtell
[jira] [Updated] (HBASE-6101) Insure Observers cover all relevant RPC and lifecycle code paths
[ https://issues.apache.org/jira/browse/HBASE-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6101: -- Component/s: security regionserver master coprocessors Affects Version/s: 0.94.1 0.96.0 Summary: Insure Observers cover all relevant RPC and lifecycle code paths (was: Insure Observers cover all RPC and lifecycle code paths) Insure Observers cover all relevant RPC and lifecycle code paths Key: HBASE-6101 URL: https://issues.apache.org/jira/browse/HBASE-6101 Project: HBase Issue Type: Sub-task Components: coprocessors, master, regionserver, security Affects Versions: 0.96.0, 0.94.1 Reporter: Andrew Purtell
[jira] [Updated] (HBASE-6099) Secure ZooKeeper integration changes
[ https://issues.apache.org/jira/browse/HBASE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6099: -- Component/s: zookeeper shell security client Affects Version/s: 0.94.1 0.96.0 Secure ZooKeeper integration changes Key: HBASE-6099 URL: https://issues.apache.org/jira/browse/HBASE-6099 Project: HBase Issue Type: Sub-task Components: client, security, shell, zookeeper Affects Versions: 0.96.0, 0.94.1 Reporter: Andrew Purtell
[jira] [Updated] (HBASE-6102) API and shell usability improvements
[ https://issues.apache.org/jira/browse/HBASE-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6102: -- Component/s: shell security master client Affects Version/s: 0.94.1 0.96.0 API and shell usability improvements Key: HBASE-6102 URL: https://issues.apache.org/jira/browse/HBASE-6102 Project: HBase Issue Type: Sub-task Components: client, master, security, shell Affects Versions: 0.96.0, 0.94.1 Reporter: Andrew Purtell
[jira] [Commented] (HBASE-6077) Document the most common secure RPC troubleshooting resolutions
[ https://issues.apache.org/jira/browse/HBASE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283670#comment-13283670 ] Hudson commented on HBASE-6077: --- Integrated in HBase-0.92 #420 (See [https://builds.apache.org/job/HBase-0.92/420/]) Amend HBASE-6077. Remove stray tag (Revision 1342706) Amend HBASE-6077. Replace HTML formatting that does not work with Docbook (Revision 1342383) Result = FAILURE apurtell : Files : * /hbase/branches/0.92/src/docbkx/troubleshooting.xml apurtell : Files : * /hbase/branches/0.92/src/docbkx/troubleshooting.xml Document the most common secure RPC troubleshooting resolutions --- Key: HBASE-6077 URL: https://issues.apache.org/jira/browse/HBASE-6077 Project: HBase Issue Type: Task Components: documentation, security Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6077.patch See attached manual troubleshooting section update.
[jira] [Updated] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs
[ https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-6068: --- Attachment: HBASE-6068-v1.patch Missed one in the list: the hbase shell calls ZooKeeper directly for the zk_dump command. zk_dump - listChildrenNoWatch() /hbase/backup-masters/* Secure HBase cluster : Client not able to call some admin APIs -- Key: HBASE-6068 URL: https://issues.apache.org/jira/browse/HBASE-6068 Project: HBase Issue Type: Bug Components: security Affects Versions: 0.92.1, 0.94.0, 0.96.0 Reporter: Anoop Sam John Assignee: Matteo Bertozzi Attachments: HBASE-6068-v0.patch, HBASE-6068-v1.patch In the case of a secure cluster, we allow HBase clients to read certain zk nodes by granting global read permission on them. These nodes are the master address znode, root server znode and the clusterId znode. In ZKUtil.createACL(), we can see these node names are specially handled. But some other client-side admin APIs also make a read call into ZooKeeper from the client. These include the isTableEnabled() call (there may be others; this is the one I have seen). Here the client directly reads a node in ZooKeeper (the node created for this table) and the data is matched to determine whether the table is enabled. Now, in the secure cluster case, any client can read the ZooKeeper nodes it needs for normal operation, such as the master address and root server address. But what if the client calls this API [isTableEnabled()]?
[jira] [Commented] (HBASE-6002) Possible chance of resource leak in HlogSplitter
[ https://issues.apache.org/jira/browse/HBASE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283688#comment-13283688 ] Hadoop QA commented on HBASE-6002: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12529760/HBASE-6002_trunk_1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 33 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher org.apache.hadoop.hbase.master.TestSplitLogManager Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1999//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1999//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1999//console This message is automatically generated. 
Possible chance of resource leak in HlogSplitter Key: HBASE-6002 URL: https://issues.apache.org/jira/browse/HBASE-6002 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0, 0.96.0 Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HBASE-6002.patch, HBASE-6002_0.94_1.patch, HBASE-6002_trunk.patch, HBASE-6002_trunk_1.patch In HLogSplitter.splitLogFileToTemp, the Reader (in) is not closed. Also, in the finally block, the writers (wap.w) are closed in a loop, and if closing one of them throws an exception, the remaining writers are not closed.
[jira] [Created] (HBASE-6103) HBaseServer shall read and deserialize data from each connection in parallel
Liyin Tang created HBASE-6103: - Summary: HBaseServer shall read and deserialize data from each connection in parallel Key: HBASE-6103 URL: https://issues.apache.org/jira/browse/HBASE-6103 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Currently HBaseServer runs with a single listener thread, which is responsible for accepting connections, reading the data from the network channel, deserializing the data into writable objects, and handing them over to the IPC handler threads. So when multiple hbase clients connect to the region server (HBaseServer) and read or write a large set of data, this listener thread becomes a performance bottleneck. Ideally, the listener thread should only accept the connection and hand it over to the IPC threads directly, so that each IPC thread reads the data from the network channel, deserializes the data and executes the Call. In this way, the HBaseServer can read and deserialize data from each connection in parallel.
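The hand-off proposed in HBASE-6103 can be illustrated with a small sketch. This is not HBaseServer code: the class ParallelReadSketch, its methods, and the length-prefixed UTF-8 framing are all assumptions made for the example, and an InputStream stands in for an accepted network channel. The point is only that the listener loop does no reading or decoding itself.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; names and wire format are not from HBase.
final class ParallelReadSketch {
    /** Worker-side job: read one length-prefixed request and decode it. */
    static String readAndDeserialize(InputStream channel) throws IOException {
        DataInputStream in = new DataInputStream(channel);
        int len = in.readInt();
        byte[] buf = new byte[len];
        in.readFully(buf);
        return new String(buf, StandardCharsets.UTF_8);
    }

    /**
     * Listener-side job: hand each accepted connection to the pool and
     * do nothing else, so reads and decoding run in parallel on workers.
     */
    static List<String> serve(List<InputStream> accepted, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (InputStream conn : accepted) {
                pending.add(pool.submit(() -> readAndDeserialize(conn)));
            }
            List<String> calls = new ArrayList<>();
            for (Future<String> f : pending) {
                calls.add(f.get()); // results come back in submission order
            }
            return calls;
        } finally {
            pool.shutdown();
        }
    }
}
```

A real server would keep connections open and multiplex many requests per connection; the sketch reads a single request per connection to keep the division of labour visible.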
[jira] [Commented] (HBASE-6002) Possible chance of resource leak in HlogSplitter
[ https://issues.apache.org/jira/browse/HBASE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283702#comment-13283702 ] Zhihong Yu commented on HBASE-6002: --- Latest patch looks okay. Check out the failed tests. Possible chance of resource leak in HlogSplitter Key: HBASE-6002 URL: https://issues.apache.org/jira/browse/HBASE-6002 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0, 0.96.0 Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HBASE-6002.patch, HBASE-6002_0.94_1.patch, HBASE-6002_trunk.patch, HBASE-6002_trunk_1.patch In HLogSplitter.splitLogFileToTemp, the Reader (in) is not closed. Also, in the finally block, the writers (wap.w) are closed in a loop, and if closing one of them throws an exception, the remaining writers are not closed.
[jira] [Commented] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs
[ https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283704#comment-13283704 ] Hadoop QA commented on HBASE-6068: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12529764/HBASE-6068-v1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 33 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2000//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2000//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2000//console This message is automatically generated. 
Secure HBase cluster : Client not able to call some admin APIs -- Key: HBASE-6068 URL: https://issues.apache.org/jira/browse/HBASE-6068 Project: HBase Issue Type: Bug Components: security Affects Versions: 0.92.1, 0.94.0, 0.96.0 Reporter: Anoop Sam John Assignee: Matteo Bertozzi Attachments: HBASE-6068-v0.patch, HBASE-6068-v1.patch In the case of a secure cluster, we allow HBase clients to read certain zk nodes by granting global read permission on them. These nodes are the master address znode, root server znode and the clusterId znode. In ZKUtil.createACL(), we can see these node names are specially handled. But some other client-side admin APIs also make a read call into ZooKeeper from the client. These include the isTableEnabled() call (there may be others; this is the one I have seen). Here the client directly reads a node in ZooKeeper (the node created for this table) and the data is matched to determine whether the table is enabled. Now, in the secure cluster case, any client can read the ZooKeeper nodes it needs for normal operation, such as the master address and root server address. But what if the client calls this API [isTableEnabled()]?
[jira] [Created] (HBASE-6104) Require EXEC permission to call coprocessor endpoints
Gary Helmling created HBASE-6104: Summary: Require EXEC permission to call coprocessor endpoints Key: HBASE-6104 URL: https://issues.apache.org/jira/browse/HBASE-6104 Project: HBase Issue Type: Sub-task Components: coprocessors, security Reporter: Gary Helmling The EXEC action currently exists as only a placeholder in access control. It should really be used to enforce access to coprocessor endpoint RPC calls, which are currently unrestricted. How the ACLs to support this would be modeled deserves some discussion: * Should access be scoped to a specific table and CoprocessorProtocol extension? * Should it be possible to grant access to a CoprocessorProtocol implementation globally (regardless of table)? * Are per-method restrictions necessary? * Should we expose hooks available to endpoint implementors so that they could additionally apply their own permission checks? Some CP endpoints may want to require READ permissions, others may want to enforce WRITE, or READ + WRITE. To apply these kinds of checks we would also have to extend the RegionObserver interface to provide hooks wrapping HRegion.exec().
[jira] [Commented] (HBASE-5986) Clients can see holes in the META table when regions are being split
[ https://issues.apache.org/jira/browse/HBASE-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283708#comment-13283708 ] Zhihong Yu commented on HBASE-5986: --- @Enis: Did you have a chance to run the backports through the respective test suites? Thanks Clients can see holes in the META table when regions are being split Key: HBASE-5986 URL: https://issues.apache.org/jira/browse/HBASE-5986 Project: HBase Issue Type: Bug Affects Versions: 0.92.1, 0.96.0, 0.94.1 Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.96.0 Attachments: 5986-v2.txt, HBASE-5986-0.92.patch, HBASE-5986-0.94.patch, HBASE-5986-test_v1.patch, HBASE-5986_v3.patch We found this issue when running large scale ingestion tests for HBASE-5754. The problem is that the .META. table updates are not atomic while splitting a region. In SplitTransaction, there is a time gap between marking the parent offline and adding the daughters to the META table. This can result in clients using MetaScanner or HTable.getStartEndKeys (used by the TableInputFormat) missing regions whose parent has just been made offline but whose daughters have not been added yet. This is also related to HBASE-4335.
[jira] [Created] (HBASE-6105) Sweep all INFO level logging and aggressively drop to DEBUG, and from DEBUG to TRACE
Andrew Purtell created HBASE-6105: - Summary: Sweep all INFO level logging and aggressively drop to DEBUG, and from DEBUG to TRACE Key: HBASE-6105 URL: https://issues.apache.org/jira/browse/HBASE-6105 Project: HBase Issue Type: Task Affects Versions: 0.96.0 Reporter: Andrew Purtell Speaking with Arjen from Facebook ops at HBaseCon, I asked what his one single request for improving HBase operability would be. The answer was to be less verbose at INFO log level. For example, with many regions opening, anomalous events can be difficult to pick out among the 5-6 INFO level messages per region deployment. Where multiple INFO level messages are printed in close succession, we should consider coalescing them. For all INFO level messages, we should be aggressive about demoting them to DEBUG level. And, since we are now increasing the verbosity at DEBUG level, the same considerations should be applied there, with coalescing and demotion of really detailed/low level logging to TRACE. Consider making this a blocker.
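The coalescing and demotion proposed in HBASE-6105 might look roughly like the sketch below. It is illustrative only: it uses java.util.logging rather than the logging facade HBase uses, with FINE standing in for DEBUG, and the class RegionOpenLogging, its methods and the message wording are all hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical example; not taken from HBase.
final class RegionOpenLogging {
    /** Builds the single summary line that replaces several INFO messages. */
    static String summary(String region, long openMillis, int storeFiles) {
        return "Opened region " + region + " in " + openMillis
            + " ms (" + storeFiles + " store files)";
    }

    /**
     * Emits one coalesced INFO line per region open. The step-by-step
     * detail is demoted to FINE (standing in for DEBUG) and is only
     * built when that level is enabled.
     */
    static void logRegionOpened(Logger log, String region, long openMillis, int storeFiles) {
        log.info(summary(region, openMillis, storeFiles));
        if (log.isLoggable(Level.FINE)) {
            log.fine("Region " + region + " open detail: storeFiles=" + storeFiles
                + ", elapsedMs=" + openMillis);
        }
    }
}
```

With the logger at INFO, each region open produces one line, which should make anomalous events easier to pick out when many regions open at once.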
[jira] [Updated] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.
[ https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HBASE-5892: --- Attachment: hbase-5892-1.patch [hbck] Refactor parallel WorkItem* to Futures. -- Key: HBASE-5892 URL: https://issues.apache.org/jira/browse/HBASE-5892 Project: HBase Issue Type: Improvement Reporter: Jonathan Hsieh Assignee: Andrew Wang Labels: noob Attachments: hbase-5892-1.patch, hbase-5892.patch This would convert WorkItem* logic (with low level notifies, and rough exception handling) into a more canonical Futures pattern. Currently there are two instances of this pattern (for loading hdfs dirs, for contacting regionservers for assignments, and soon -- for loading hdfs .regioninfo files).
[jira] [Commented] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.
[ https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283714#comment-13283714 ] Andrew Wang commented on HBASE-5892: Ran TestHBaseFsck and had to fix a null pointer, hence the new version of the patch. I'll port it to prior versions too. [hbck] Refactor parallel WorkItem* to Futures. -- Key: HBASE-5892 URL: https://issues.apache.org/jira/browse/HBASE-5892 Project: HBase Issue Type: Improvement Reporter: Jonathan Hsieh Assignee: Andrew Wang Labels: noob Attachments: hbase-5892-1.patch, hbase-5892.patch This would convert WorkItem* logic (with low level notifies, and rough exception handling) into a more canonical Futures pattern. Currently there are two instances of this pattern (for loading hdfs dirs, for contacting regionservers for assignments, and soon -- for loading hdfs .regioninfo files).
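For readers unfamiliar with the "canonical Futures pattern" that HBASE-5892 refers to, a minimal sketch follows. It is not the attached patch and does not use hbck classes; the class FuturesSketch and the method runAll are hypothetical. Each unit of work is a Callable, the executor replaces hand-written wait/notify, and a failure in any task surfaces as an ExecutionException when its Future is read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of the pattern; not taken from hbck.
final class FuturesSketch {
    /**
     * Runs all work items on a fixed pool and collects their results in
     * submission order. invokeAll blocks until every task has finished,
     * so no explicit notify or counter is needed.
     */
    static <T> List<T> runAll(List<Callable<T>> work, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<T> results = new ArrayList<>();
            for (Future<T> f : pool.invokeAll(work)) {
                results.add(f.get()); // rethrows a task failure as ExecutionException
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

In this shape, loading hdfs dirs or contacting regionservers would each become a list of Callables, and the caller decides in one place how to treat a failed task.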
[jira] [Commented] (HBASE-6104) Require EXEC permission to call coprocessor endpoints
[ https://issues.apache.org/jira/browse/HBASE-6104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283720#comment-13283720 ] Andrew Purtell commented on HBASE-6104: --- bq. To apply these kinds of checks we would also have to extend the RegionObserver interface to provide hooks wrapping HRegion.exec(). +1 on this at a minimum. bq. Should access be scoped to a specific table and CoprocessorProtocol extension? bq. Should it be possible to grant access to a CoprocessorProtocol implementation globally (regardless of table)? bq. Are per-method restrictions necessary? For the sake of simplicity, I suggest considering an EXEC permission per CF. So that would allow the user or group specified in the grant to execute any coprocessors installed in the region for the given CF. We can do more, but it would be good to be informed by a specific use case then. Require EXEC permission to call coprocessor endpoints - Key: HBASE-6104 URL: https://issues.apache.org/jira/browse/HBASE-6104 Project: HBase Issue Type: Sub-task Components: coprocessors, security Reporter: Gary Helmling The EXEC action currently exists as only a placeholder in access control. It should really be used to enforce access to coprocessor endpoint RPC calls, which are currently unrestricted. How the ACLs to support this would be modeled deserves some discussion: * Should access be scoped to a specific table and CoprocessorProtocol extension? * Should it be possible to grant access to a CoprocessorProtocol implementation globally (regardless of table)? * Are per-method restrictions necessary? * Should we expose hooks available to endpoint implementors so that they could additionally apply their own permission checks? Some CP endpoints may want to require READ permissions, others may want to enforce WRITE, or READ + WRITE. To apply these kinds of checks we would also have to extend the RegionObserver interface to provide hooks wrapping HRegion.exec(). 
[jira] [Comment Edited] (HBASE-6104) Require EXEC permission to call coprocessor endpoints
[ https://issues.apache.org/jira/browse/HBASE-6104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283720#comment-13283720 ] Andrew Purtell edited comment on HBASE-6104 at 5/25/12 8:02 PM: bq. To apply these kinds of checks we would also have to extend the RegionObserver interface to provide hooks wrapping HRegion.exec(). +1 on this at a minimum. bq. Should access be scoped to a specific table and CoprocessorProtocol extension? bq. Should it be possible to grant access to a CoprocessorProtocol implementation globally (regardless of table)? bq. Are per-method restrictions necessary? For the sake of simplicity, I suggest considering an EXEC permission per CF. So that would allow the user or group specified in the grant to execute any coprocessors installed in the region for the given CF. We can do more, but it would be good to be informed by a specific use case then. Edit: This implies some additional interface that informs the coprocessor what CFs the principal has access rights to. was (Author: apurtell): bq. To apply these kinds of checks we would also have to extend the RegionObserver interface to provide hooks wrapping HRegion.exec(). +1 on this at a minimum. bq. Should access be scoped to a specific table and CoprocessorProtocol extension? bq. Should it be possible to grant access to a CoprocessorProtocol implementation globally (regardless of table)? bq. Are per-method restrictions necessary? For the sake of simplicity, I suggest considering an EXEC permission per CF. So that would allow the user or group specified in the grant to execute any coprocessors installed in the region for the given CF. We can do more, but it would be good to be informed by a specific use case then. 
Require EXEC permission to call coprocessor endpoints - Key: HBASE-6104 URL: https://issues.apache.org/jira/browse/HBASE-6104 Project: HBase Issue Type: Sub-task Components: coprocessors, security Reporter: Gary Helmling The EXEC action currently exists as only a placeholder in access control. It should really be used to enforce access to coprocessor endpoint RPC calls, which are currently unrestricted. How the ACLs to support this would be modeled deserves some discussion: * Should access be scoped to a specific table and CoprocessorProtocol extension? * Should it be possible to grant access to a CoprocessorProtocol implementation globally (regardless of table)? * Are per-method restrictions necessary? * Should we expose hooks available to endpoint implementors so that they could additionally apply their own permission checks? Some CP endpoints may want to require READ permissions, others may want to enforce WRITE, or READ + WRITE. To apply these kinds of checks we would also have to extend the RegionObserver interface to provide hooks wrapping HRegion.exec().
[jira] [Commented] (HBASE-5498) Secure Bulk Load
[ https://issues.apache.org/jira/browse/HBASE-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283724#comment-13283724 ] Andrew Purtell commented on HBASE-5498: --- @Francis, do you have any work, even if in a partially completed state? Secure Bulk Load Key: HBASE-5498 URL: https://issues.apache.org/jira/browse/HBASE-5498 Project: HBase Issue Type: Improvement Reporter: Francis Liu Design doc: https://cwiki.apache.org/confluence/display/HCATALOG/HBase+Secure+Bulk+Load Short summary: Security as it stands does not cover the bulkLoadHFiles() feature. Users calling this method will bypass ACLs. Loading is also made more cumbersome in a secure setting because of hdfs privileges. bulkLoadHFiles() moves the data from the user's directory to the hbase directory, which requires certain write access privileges to be set. Our solution is to create a coprocessor which makes use of AuthManager to verify whether a user has write access to the table. If so, it launches an MR job as the hbase user to do the importing (i.e. rewriting from text to hfiles). One tricky thing this job will have to do is impersonate the calling user when reading the input files. We can do this by expecting the user to pass an hdfs delegation token as part of the secureBulkLoad() coprocessor call and extending an inputformat to make use of that token. The output is written to a temporary directory accessible only by hbase and then bulkLoadHFiles() is called.
[jira] [Updated] (HBASE-6104) Require EXEC permission to call coprocessor endpoints
[ https://issues.apache.org/jira/browse/HBASE-6104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6104: -- Affects Version/s: 0.96.0 Require EXEC permission to call coprocessor endpoints - Key: HBASE-6104 URL: https://issues.apache.org/jira/browse/HBASE-6104 Project: HBase Issue Type: Sub-task Components: coprocessors, security Affects Versions: 0.96.0 Reporter: Gary Helmling The EXEC action currently exists as only a placeholder in access control. It should really be used to enforce access to coprocessor endpoint RPC calls, which are currently unrestricted. How the ACLs to support this would be modeled deserves some discussion: * Should access be scoped to a specific table and CoprocessorProtocol extension? * Should it be possible to grant access to a CoprocessorProtocol implementation globally (regardless of table)? * Are per-method restrictions necessary? * Should we expose hooks available to endpoint implementors so that they could additionally apply their own permission checks? Some CP endpoints may want to require READ permissions, others may want to enforce WRITE, or READ + WRITE. To apply these kinds of checks we would also have to extend the RegionObserver interface to provide hooks wrapping HRegion.exec().