date:20120525


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283221#comment-13283221
 ] 

Zhihong Yu commented on HBASE-5916:
---

@Chunhui:
Your suggestion is interesting. 
We should implement that in a separate issue. 

 RS restart just before master intialization we make the cluster non operative
 -

 Key: HBASE-5916
 URL: https://issues.apache.org/jira/browse/HBASE-5916
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.1

 Attachments: HBASE-5916_trunk.patch, HBASE-5916_trunk_1.patch, 
 HBASE-5916_trunk_1.patch, HBASE-5916_trunk_2.patch, HBASE-5916_trunk_3.patch, 
 HBASE-5916_trunk_4.patch, HBASE-5916_trunk_v5.patch, 
 HBASE-5916_trunk_v6.patch, HBASE-5916_trunk_v7.patch, HBASE-5916v8.patch


 Consider a case where the master is being restarted.  An RS that was alive when 
 the master restart began is itself restarted before the master initializes the 
 ServerShutDownHandler.
 {code}
 serverShutdownHandlerEnabled = true;
 {code}
 In this case, when the RS tries to register with the master, the master will 
 try to expire the old server entry, but it cannot be expired because the 
 serverShutdownHandler is not yet enabled.
 This can happen when the only RS is restarted, or when all the RSs are 
 restarted at the same time (before assignRootandMeta).
 {code}
 LOG.info(message);
 if (existingServer.getStartcode() < serverName.getStartcode()) {
   LOG.info("Triggering server recovery; existingServer " +
     existingServer + " looks stale, new server:" + serverName);
   expireServer(existingServer);
 }
 {code}
 If another RS is brought up then the cluster comes back to normalcy.
 May be a very corner case.
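
 The rejection described above can be modelled in a few lines of Java. This is 
 only a sketch under stated assumptions: StaleServerSketch, register, expire 
 and enableShutdownHandler are hypothetical names, not the ServerManager API, 
 and the deferred-expiry set is one possible mitigation, not necessarily what 
 the attached patches do.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the master's server bookkeeping; every name here is hypothetical. */
final class StaleServerSketch {
    // host:port -> startcode of the instance the master believes is online
    private final Map<String, Long> online = new HashMap<>();
    // expiries requested before the shutdown handler was enabled (assumed mitigation)
    private final Set<String> deferredExpiry = new LinkedHashSet<>();
    private boolean serverShutdownHandlerEnabled = false;

    /** Returns true if the server is registered, false if it has to retry later. */
    boolean register(String hostPort, long startcode) {
        Long existing = online.get(hostPort);
        if (existing == null) {
            online.put(hostPort, startcode);
            return true;
        }
        if (existing == startcode) {
            return true; // same instance reporting in again
        }
        if (existing < startcode) {
            expire(hostPort); // the existing entry looks stale
        }
        return false; // an entry for this host:port is still registered
    }

    private void expire(String hostPort) {
        if (!serverShutdownHandlerEnabled) {
            // Without this set the request would simply be lost and the
            // restarted server would be rejected on every retry.
            deferredExpiry.add(hostPort);
            return;
        }
        online.remove(hostPort);
    }

    /** Enables expiry and processes everything that was deferred. */
    void enableShutdownHandler() {
        serverShutdownHandlerEnabled = true;
        for (String hostPort : deferredExpiry) {
            online.remove(hostPort);
        }
        deferredExpiry.clear();
    }

    public static void main(String[] args) {
        StaleServerSketch master = new StaleServerSketch();
        master.register("rs1:60020", 100L);                     // old instance
        System.out.println(master.register("rs1:60020", 200L)); // restarted instance
        master.enableShutdownHandler();
        System.out.println(master.register("rs1:60020", 200L)); // retry after enabling
    }
}
```

 In this model the restarted server keeps being rejected until the handler is 
 enabled, which matches the symptom in the report; whether the master can reach 
 that point on its own is outside what the sketch covers.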

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-6077) Document the most common secure RPC troubleshooting resolutions


[ 
https://issues.apache.org/jira/browse/HBASE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283234#comment-13283234
 ] 

Hudson commented on HBASE-6077:
---

Integrated in HBase-0.94-security #31 (See 
[https://builds.apache.org/job/HBase-0.94-security/31/])
Amend HBASE-6077. Replace HTML formatting that does not work with Docbook 
(Revision 1342382)

 Result = FAILURE
apurtell : 
Files : 
* /hbase/branches/0.94/src/docbkx/troubleshooting.xml


 Document the most common secure RPC troubleshooting resolutions
 ---

 Key: HBASE-6077
 URL: https://issues.apache.org/jira/browse/HBASE-6077
 Project: HBase
  Issue Type: Task
  Components: documentation, security
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: 6077.patch


 See attached manual troubleshooting section update.


[jira] [Commented] (HBASE-6077) Document the most common secure RPC troubleshooting resolutions


[ 
https://issues.apache.org/jira/browse/HBASE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283239#comment-13283239
 ] 

Hudson commented on HBASE-6077:
---

Integrated in HBase-0.94 #216 (See 
[https://builds.apache.org/job/HBase-0.94/216/])
Amend HBASE-6077. Replace HTML formatting that does not work with Docbook 
(Revision 1342382)

 Result = FAILURE
apurtell : 
Files : 
* /hbase/branches/0.94/src/docbkx/troubleshooting.xml


 Document the most common secure RPC troubleshooting resolutions
 ---

 Key: HBASE-6077
 URL: https://issues.apache.org/jira/browse/HBASE-6077
 Project: HBase
  Issue Type: Task
  Components: documentation, security
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: 6077.patch


 See attached manual troubleshooting section update.


[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative

2012-05-25 Thread chunhui shen (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283241#comment-13283241
 ] 

chunhui shen commented on HBASE-5916:
-

bq.We should implement that in a separate issue. 

I think the above suggestion would fix this issue. 

@ram
Could you give some comments, correct me if anything I didn't consider.

 RS restart just before master intialization we make the cluster non operative
 -

 Key: HBASE-5916
 URL: https://issues.apache.org/jira/browse/HBASE-5916
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.1

 Attachments: HBASE-5916_trunk.patch, HBASE-5916_trunk_1.patch, 
 HBASE-5916_trunk_1.patch, HBASE-5916_trunk_2.patch, HBASE-5916_trunk_3.patch, 
 HBASE-5916_trunk_4.patch, HBASE-5916_trunk_v5.patch, 
 HBASE-5916_trunk_v6.patch, HBASE-5916_trunk_v7.patch, HBASE-5916v8.patch


 Consider a case where the master is being restarted.  An RS that was alive when 
 the master restart began is itself restarted before the master initializes the 
 ServerShutDownHandler.
 {code}
 serverShutdownHandlerEnabled = true;
 {code}
 In this case, when the RS tries to register with the master, the master will 
 try to expire the old server entry, but it cannot be expired because the 
 serverShutdownHandler is not yet enabled.
 This can happen when the only RS is restarted, or when all the RSs are 
 restarted at the same time (before assignRootandMeta).
 {code}
 LOG.info(message);
 if (existingServer.getStartcode() < serverName.getStartcode()) {
   LOG.info("Triggering server recovery; existingServer " +
     existingServer + " looks stale, new server:" + serverName);
   expireServer(existingServer);
 }
 {code}
 If another RS is brought up then the cluster comes back to normalcy.
 May be a very corner case.


[jira] [Commented] (HBASE-6033) Adding some fuction to check if a table/region is in compaction


[ 
https://issues.apache.org/jira/browse/HBASE-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283259#comment-13283259
 ] 

Hudson commented on HBASE-6033:
---

Integrated in HBase-TRUNK #2921 (See 
[https://builds.apache.org/job/HBase-TRUNK/2921/])
HBASE-6033 Addendum changes TestCompactionState to large test (Revision 
1342196)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactionState.java


 Adding some fuction to check if a table/region is in compaction
 ---

 Key: HBASE-6033
 URL: https://issues.apache.org/jira/browse/HBASE-6033
 Project: HBase
  Issue Type: New Feature
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: 6033-v7.txt, hbase-6033_v2.patch, hbase-6033_v3.patch, 
 hbase_6033_v5.patch, hbase_6033_v6.patch, table_ui.png


 This feature will be helpful to find out if a major compaction is going on.
 We can show if it is in any minor compaction too.


[jira] [Commented] (HBASE-6077) Document the most common secure RPC troubleshooting resolutions


[ 
https://issues.apache.org/jira/browse/HBASE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283260#comment-13283260
 ] 

Hudson commented on HBASE-6077:
---

Integrated in HBase-TRUNK #2921 (See 
[https://builds.apache.org/job/HBase-TRUNK/2921/])
Amend HBASE-6077. Replace HTML formatting that does not work with Docbook 
(Revision 1342381)

 Result = FAILURE
apurtell : 
Files : 
* /hbase/trunk/src/docbkx/troubleshooting.xml


 Document the most common secure RPC troubleshooting resolutions
 ---

 Key: HBASE-6077
 URL: https://issues.apache.org/jira/browse/HBASE-6077
 Project: HBase
  Issue Type: Task
  Components: documentation, security
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: 6077.patch


 See attached manual troubleshooting section update.


[jira] [Commented] (HBASE-6044) copytable: remove rs.* parameters


[ 
https://issues.apache.org/jira/browse/HBASE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283266#comment-13283266
 ] 

Hudson commented on HBASE-6044:
---

Integrated in HBase-0.92-security #108 (See 
[https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-6044 copytable: remove rs.* parameters (Revision 1341202)

 Result = FAILURE
jmhsieh : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/docbkx/ops_mgt.xml
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapreduce/CopyTable.java


 copytable: remove rs.* parameters
 -

 Key: HBASE-6044
 URL: https://issues.apache.org/jira/browse/HBASE-6044
 Project: HBase
  Issue Type: New Feature
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: hbase-6044-92.patch, hbase-6044-v2.patch, 
 hbase-6044-v3.patch, hbase-6044-v4.patch, hbase-6044.patch


 In discussion of HBASE-6013 it was suggested that we remove these arguments 
 from 0.92+ (but keep in 0.90)


[jira] [Commented] (HBASE-5757) TableInputFormat should handle as many errors as possible


[ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283267#comment-13283267
 ] 

Hudson commented on HBASE-5757:
---

Integrated in HBase-0.92-security #108 (See 
[https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-5757 TableInputFormat should handle as many errors as possible (Jan 
Lukavsky) (Revision 1341205)

 Result = FAILURE
jmhsieh : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapred/TableRecordReaderImpl.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/mapred/TestTableInputFormat.java


 TableInputFormat should handle as many errors as possible
 -

 Key: HBASE-5757
 URL: https://issues.apache.org/jira/browse/HBASE-5757
 Project: HBase
  Issue Type: Bug
  Components: mapred, mapreduce
Affects Versions: 0.90.6
Reporter: Jan Lukavsky
Assignee: Jan Lukavsky
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: 5757-trunk-v2.txt, HBASE-5757-trunk-r1341041.patch, 
 HBASE-5757.patch, HBASE-5757.patch, hbase-5757-92.patch


 Prior to HBASE-4196 there was different handling of IOExceptions thrown from 
 scanner in mapred and mapreduce API. The patch to HBASE-4196 unified this 
 handling so that if exception is caught a reconnect is attempted (without 
 bothering the mapred client). After that, HBASE-4269 changed this behavior 
 back, but in both mapred and mapreduce APIs. The question is, is there any 
 reason not to handle all errors that the input format can handle? In other 
 words, why not try to reissue the request after *any* IOException? I see the 
 following disadvantages of current approach
  * the client may see exceptions like LeaseException and 
 ScannerTimeoutException if he fails to process all fetched data in timeout
  * to avoid ScannerTimeoutException the client must raise 
 hbase.regionserver.lease.period
  * timeouts for tasks are already configured in mapred.task.timeout, so this 
 seems to me a bit redundant, because typically one needs to update both these 
 parameters
  * I don't see any possibility to get rid of LeaseException (this is 
 configured on server side)
 I think all of these issues would be gone, if the DoNotRetryIOException would 
 not be rethrown. -On the other hand, handling errors in InputFormat has 
 disadvantage, that it may hide from the user some inefficiency. Eg. if I have 
 very big scanner.caching, and I manage to process only a few rows in timeout, 
 I will end up with single row being fetched many times (and will not be 
 explicitly notified about this). Could we solve this problem by adding some 
 counter to the InputFormat?-
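
 The retry-with-counter idea above could look roughly like the Java sketch 
 below. It is a model under stated assumptions, not the TableRecordReaderImpl 
 code: ScannerFactory, readAll and flakySource are hypothetical names, rows are 
 plain strings, and a RuntimeException stands in for the IOException a real 
 scanner would throw.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class RestartingScanSketch {
    /** Hypothetical stand-in for opening a scanner positioned after a given row. */
    interface ScannerFactory {
        Iterator<String> open(String afterRow); // null means "from the first row"
    }

    /** Rows read plus the number of scanner restarts that were needed. */
    static final class Result {
        final List<String> rows;
        final int restarts;
        Result(List<String> rows, int restarts) {
            this.rows = rows;
            this.restarts = restarts;
        }
    }

    /** Reads every row, reopening the scanner after the last good row on failure. */
    static Result readAll(ScannerFactory factory, int maxRestarts) {
        List<String> rows = new ArrayList<>();
        String lastRow = null;
        int restarts = 0;
        Iterator<String> scanner = factory.open(null);
        while (true) {
            try {
                if (!scanner.hasNext()) {
                    return new Result(rows, restarts);
                }
                lastRow = scanner.next();
                rows.add(lastRow);
            } catch (RuntimeException e) {
                if (restarts >= maxRestarts) {
                    throw e; // give up and surface the error to the caller
                }
                restarts++; // the counter makes hidden refetching visible
                scanner = factory.open(lastRow);
            }
        }
    }

    /** Demo source whose first scanner fails once after serving failAfter rows. */
    static ScannerFactory flakySource(final List<String> sortedRows, final int failAfter) {
        final boolean[] failed = {false};
        return afterRow -> {
            List<String> remaining = new ArrayList<>();
            for (String row : sortedRows) {
                if (afterRow == null || row.compareTo(afterRow) > 0) {
                    remaining.add(row);
                }
            }
            final Iterator<String> inner = remaining.iterator();
            final int[] served = {0};
            return new Iterator<String>() {
                @Override public boolean hasNext() { return inner.hasNext(); }
                @Override public String next() {
                    if (!failed[0] && served[0] == failAfter) {
                        failed[0] = true;
                        throw new IllegalStateException("simulated scanner timeout");
                    }
                    served[0]++;
                    return inner.next();
                }
            };
        };
    }
}
```

 Returning the restart count alongside the rows is one way to address the 
 concern that silent retries could hide an inefficient scanner.caching setting.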


[jira] [Commented] (HBASE-6011) Unable to start master in local mode


[ 
https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283268#comment-13283268
 ] 

Hudson commented on HBASE-6011:
---

Integrated in HBase-0.92-security #108 (See 
[https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-6011. Addendum to support master mocking (Ram) (Revision 1340157)

 Result = FAILURE
apurtell : 
Files : 
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/LocalHBaseCluster.java


 Unable to start master in local mode
 

 Key: HBASE-6011
 URL: https://issues.apache.org/jira/browse/HBASE-6011
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch, 
 HBASE-6011_addendum_0.92.patch, HBASE-6011_addendum_0.94.patch, 
 HBASE-6011_addendum_trunk.patch


 Got this trying to launch head of 0.94 branch in local mode from the build 
 tree but it happens with trunk and 0.92 too:
 {noformat}
 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master
 java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot 
 be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
   at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
   at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761)
 {noformat}


[jira] [Commented] (HBASE-6061) Fix ACL Admin Table inconsistent permission check


[ 
https://issues.apache.org/jira/browse/HBASE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283269#comment-13283269
 ] 

Hudson commented on HBASE-6061:
---

Integrated in HBase-0.92-security #108 (See 
[https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-6061 Fix ACL Admin Table inconsistent permission check (Matteo 
Bertozzi) (Revision 1341268)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/security/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java


 Fix ACL Admin Table inconsistent permission check
 ---

 Key: HBASE-6061
 URL: https://issues.apache.org/jira/browse/HBASE-6061
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6061-0.92.patch, HBASE-6061-v0.patch, 
 HBASE-6061-v1.patch


 the requirePermission() check for admin operation on a table is currently 
 inconsistent.
 Table Owner with CREATE rights (that means, the owner has created that table) 
 can enable/disable and delete the table but needs ADMIN rights to 
 add/remove/modify a column.


[jira] [Commented] (HBASE-6013) Polish sharp edges from CopyTable


[ 
https://issues.apache.org/jira/browse/HBASE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283270#comment-13283270
 ] 

Hudson commented on HBASE-6013:
---

Integrated in HBase-0.92-security #108 (See 
[https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-6013 Polish sharp edges from CopyTable (Revision 1339931)

 Result = FAILURE
jmhsieh : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapreduce/CopyTable.java


 Polish sharp edges from CopyTable
 -

 Key: HBASE-6013
 URL: https://issues.apache.org/jira/browse/HBASE-6013
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: hbase-6013-92.patch, hbase-6013.patch


 CopyTable doesn't report errors when invalid arguments are specified.  For 
 example, having a typo in --peer.adr (such as --peer.addr or -peer.adr) 
 silently uses the default cluster and does a same-cluster copy.
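
 A strict parser that rejects misspelled options, rather than ignoring them, 
 might look like the Java sketch below. It is an illustration only: 
 StrictArgsSketch and its parse method are hypothetical and are not the 
 CopyTable code, and the known-option set contains just the one option named 
 in this report.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class StrictArgsSketch {
    // Only the option named in the report; a real tool would list all of its options.
    private static final Set<String> KNOWN = new HashSet<>(Arrays.asList("peer.adr"));

    /** Parses "--key=value" options plus one table name; throws on anything else. */
    static Map<String, String> parse(String[] args) {
        Map<String, String> parsed = new LinkedHashMap<>();
        for (String arg : args) {
            if (arg.startsWith("--")) {
                int eq = arg.indexOf('=');
                if (eq < 0) {
                    throw new IllegalArgumentException("Option needs a value: " + arg);
                }
                String key = arg.substring(2, eq);
                if (!KNOWN.contains(key)) {
                    throw new IllegalArgumentException("Unknown option: " + arg);
                }
                parsed.put(key, arg.substring(eq + 1));
            } else if (arg.startsWith("-")) {
                throw new IllegalArgumentException("Options start with two dashes: " + arg);
            } else if (parsed.containsKey("tablename")) {
                throw new IllegalArgumentException("Unexpected extra argument: " + arg);
            } else {
                parsed.put("tablename", arg);
            }
        }
        if (!parsed.containsKey("tablename")) {
            throw new IllegalArgumentException("Missing table name");
        }
        return parsed;
    }

    public static void main(String[] args) {
        System.out.println(parse(new String[] {"--peer.adr=zk1:2181:/hbase", "t1"}));
        try {
            parse(new String[] {"--peer.addr=zk1:2181:/hbase", "t1"}); // typo in the option
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```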


[jira] [Commented] (HBASE-6054) 0.92 failing because of missing commons-io after upgrade to hadoop 1.0.3.


[ 
https://issues.apache.org/jira/browse/HBASE-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283272#comment-13283272
 ] 

Hudson commented on HBASE-6054:
---

Integrated in HBase-0.92-security #108 (See 
[https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-6054 0.92 failing because of missing commons-io after upgrade to 
hadoop 1.0.3. (Revision 1340272)

 Result = FAILURE
stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/pom.xml


 0.92 failing because of missing commons-io after upgrade to hadoop 1.0.3.
 -

 Key: HBASE-6054
 URL: https://issues.apache.org/jira/browse/HBASE-6054
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: commons-io.txt, hbasedefaultcheck.txt


 See this note: 
 http://search-hadoop.com/m/0UrOr19BG8v1/test+failure+after+upgrading+to+hadoop+1.0.3+Was%253A+ClassNotFoundException%253A+org.apache.commons.io.FileUtilssubj=test+failure+after+upgrading+to+hadoop+1+0+3+Was+ClassNotFoundException+org+apache+commons+io+FileUtils


[jira] [Commented] (HBASE-6077) Document the most common secure RPC troubleshooting resolutions


[ 
https://issues.apache.org/jira/browse/HBASE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283271#comment-13283271
 ] 

Hudson commented on HBASE-6077:
---

Integrated in HBase-0.92-security #108 (See 
[https://builds.apache.org/job/HBase-0.92-security/108/])
Amend HBASE-6077. Replace HTML formatting that does not work with Docbook 
(Revision 1342383)
HBASE-6077. Document the most common secure RPC troubleshooting resolutions 
(Revision 1342107)

 Result = FAILURE
apurtell : 
Files : 
* /hbase/branches/0.92/src/docbkx/troubleshooting.xml

apurtell : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/docbkx/troubleshooting.xml


 Document the most common secure RPC troubleshooting resolutions
 ---

 Key: HBASE-6077
 URL: https://issues.apache.org/jira/browse/HBASE-6077
 Project: HBase
  Issue Type: Task
  Components: documentation, security
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: 6077.patch


 See attached manual troubleshooting section update.


[jira] [Commented] (HBASE-6047) Put.has() can't determine result correctly


[ 
https://issues.apache.org/jira/browse/HBASE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283273#comment-13283273
 ] 

Hudson commented on HBASE-6047:
---

Integrated in HBase-0.92-security #108 (See 
[https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-6047 Put.has() can't determine result correctly (Alex Newman) 
(Revision 1341741)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/Put.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/client/TestPutDotHas.java


 Put.has() can't determine result correctly
 --

 Key: HBASE-6047
 URL: https://issues.apache.org/jira/browse/HBASE-6047
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.92.1
Reporter: Wang Qiang
Assignee: Alex Newman
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: 
 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly-v2.patch, 
 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly.patch, 6047-92.txt, 
 PutTest.java


 the public method 'has(byte [] family, byte [] qualifier)' internally invoked 
 the private method 'has(byte [] family, byte [] qualifier, long ts, byte [] 
 value, boolean ignoreTS, boolean ignoreValue)' with 'value=new byte[0], 
 ignoreTS=true, ignoreValue=true', but there's a logical error in the body, 
 it'll enter the block
 {code}
 else if (ignoreValue) {
   for (KeyValue kv: list) {
     if (Arrays.equals(kv.getFamily(), family) && 
         Arrays.equals(kv.getQualifier(), qualifier)
         && kv.getTimestamp() == ts) {
       return true;
     }
   }
 }
 {code}
 the expression 'kv.getTimestamp() == ts' in the if conditions should only 
 exist when 'ignoreTS=false', otherwise, the following code will return false!
 {code}
 Put put = new Put(Bytes.toBytes("row-01"));
 put.add(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01"),
   1234567L, Bytes.toBytes("value-01"));
 System.out.println(put.has(Bytes.toBytes("family-01"),
   Bytes.toBytes("qualifier-01")));
 {code}
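
 One way to express the intended check is to apply the timestamp and value 
 comparisons only when they are not ignored. The Java sketch below is 
 self-contained and does not use the HBase classes: PutHasSketch and Cell are 
 hypothetical stand-ins for Put and KeyValue, and the committed patch may be 
 structured differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PutHasSketch {
    /** Minimal stand-in for a stored cell; not the HBase KeyValue class. */
    static final class Cell {
        final byte[] family;
        final byte[] qualifier;
        final long ts;
        final byte[] value;
        Cell(byte[] family, byte[] qualifier, long ts, byte[] value) {
            this.family = family;
            this.qualifier = qualifier;
            this.ts = ts;
            this.value = value;
        }
    }

    private final List<Cell> cells = new ArrayList<>();

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    void add(byte[] family, byte[] qualifier, long ts, byte[] value) {
        cells.add(new Cell(family, qualifier, ts, value));
    }

    /** Public entry point: matches on family and qualifier only. */
    boolean has(byte[] family, byte[] qualifier) {
        return has(family, qualifier, 0L, new byte[0], true, true);
    }

    /** Compares the timestamp only if !ignoreTS and the value only if !ignoreValue. */
    boolean has(byte[] family, byte[] qualifier, long ts, byte[] value,
                boolean ignoreTS, boolean ignoreValue) {
        for (Cell c : cells) {
            if (Arrays.equals(c.family, family)
                    && Arrays.equals(c.qualifier, qualifier)
                    && (ignoreTS || c.ts == ts)
                    && (ignoreValue || Arrays.equals(c.value, value))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        PutHasSketch put = new PutHasSketch();
        put.add(bytes("family-01"), bytes("qualifier-01"), 1234567L, bytes("value-01"));
        System.out.println(put.has(bytes("family-01"), bytes("qualifier-01")));
    }
}
```

 Folding both flags into one condition avoids a separate branch per flag 
 combination, which is where the stray timestamp comparison crept in.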


[jira] [Commented] (HBASE-5920) New Compactions Logic can silently prevent user-initiated compactions from occurring


[ 
https://issues.apache.org/jira/browse/HBASE-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283274#comment-13283274
 ] 

Hudson commented on HBASE-5920:
---

Integrated in HBase-0.92-security #108 (See 
[https://builds.apache.org/job/HBase-0.92-security/108/])
HBASE-5920 New Compactions Logic can silently prevent user-initiated 
compactions from occurring (Revision 1340285)

 Result = FAILURE
stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionRequest.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java


 New Compactions Logic can silently prevent user-initiated compactions from 
 occurring
 

 Key: HBASE-5920
 URL: https://issues.apache.org/jira/browse/HBASE-5920
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.92.1
Reporter: Derek Wollenstein
Assignee: Derek Wollenstein
Priority: Minor
  Labels: compaction
 Attachments: 5290-094.txt, HBASE-5920-0.92.1-1.patch, 
 HBASE-5920-0.92.1-2.patch, HBASE-5920-0.92.1.patch, HBASE-5920-trunk-1.patch, 
 HBASE-5920-trunk.patch


 There seem to be some tuning settings in which manually triggered major 
 compactions will do nothing, including logging.
 From Store.java in the function
   List<StoreFile> compactSelection(List<StoreFile> candidates)
 When a user manually triggers a compaction, this follows the same logic as a 
 normal compaction check.  When a user manually triggers a major compaction, 
 something similar happens.  Putting this all together:
 1. If a user triggers a major compaction, this is checked against a max files 
 threshold (hbase.hstore.compaction.max). If the number of storefiles to 
 compact is > max files, then we downgrade to a minor compaction
 2. If we are in a minor compaction, we do the following checks:
    a. If the file is less than a minimum size 
 (hbase.hstore.compaction.min.size) we automatically include it
    b. Otherwise, we check how the size compares to the next largest size, 
 based on hbase.hstore.compaction.ratio.
    c. If the number of files included is less than a minimum count 
 (hbase.hstore.compaction.min) then don't compact.
 In many of the exit strategies, we aren't seeing an error message.
 The net-net of this is that if we have a mix of very large and very small 
 files, we may end up having too many files to do a major compact, but too few 
 files to do a minor compact.
 I'm trying to go through and see if I'm understanding things correctly, but 
 this seems like the bug
 To put it another way
 2012-05-02 20:09:36,389 DEBUG 
 org.apache.hadoop.hbase.regionserver.CompactSplitThread: Large Compaction 
 requested: 
 regionName=str,44594594594594592,1334939064521.f7aed25b55d4d7988af763bede9ce74e., 
 storeName=c, fileCount=15, fileSize=1.5g (20.2k, 362.5m, 155.3k, 3.0m, 30.7k, 
 361.2m, 6.9m, 4.7m, 14.7k, 363.4m, 30.9m, 3.2m, 7.3k, 362.9m, 23.5m), 
 priority=-9, time=3175046817624398; Because: Recursive enqueue; 
 compaction_queue=(59:0), split_queue=0
 When we had a minimum compaction size of 128M, and default settings for 
 hbase.hstore.compaction.min, hbase.hstore.compaction.max, and 
 hbase.hstore.compaction.ratio, we were not getting a compaction to run even 
 if we ran
 major_compact 
 'str,44594594594594592,1334939064521.f7aed25b55d4d7988af763bede9ce74e.' from 
 the ruby shell.  Note that we had many tiny files (20k, 155k, 3m, 30k,..) 
 and several large files (362.5m,361.2m,363.4m,362.9m).  I think the bimodal 
 nature of the sizes prevented us from doing a compaction.
 I'm not 100% sure where this errored out because when I manually triggered a 
 compaction, I did not see
 '  // if we don't have enough files to compact, just wait 
   if (filesToCompact.size() < this.minFilesToCompact) {  
     if (LOG.isDebugEnabled()) {  
       LOG.debug("Skipped compaction of " + this.storeNameStr 
         + ".  Only " + (end - start) + " file(s) of size "
         + StringUtils.humanReadableInt(totalSize)
         + " have met compaction criteria."); 
     }
 ' 
 being printed in the logs (and I know DEBUG logging was enabled because I saw 
 this elsewhere).
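
 The selection steps listed in this report can be sketched in Java as follows. 
 This is a simplified reading of those steps, not the code in Store.java: 
 CompactionSelectionSketch and select are hypothetical names, sizes are plain 
 numbers ordered oldest first, and comparing each file against the ratio times 
 the summed size of all newer files is an assumption about step 2b.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class CompactionSelectionSketch {
    /**
     * Returns the sizes of the files selected for compaction, or an empty
     * list when nothing is compacted. Sizes are ordered oldest first.
     */
    static List<Long> select(List<Long> sizes, boolean majorRequested,
                             int minFiles, int maxFiles, long minSize, double ratio) {
        if (minFiles < 1) {
            throw new IllegalArgumentException("minFiles must be at least 1");
        }
        int n = sizes.size();
        // Step 1: a major request only stays major if it fits under the max.
        if (majorRequested && n <= maxFiles) {
            return new ArrayList<>(sizes);
        }
        // Step 2a/2b: skip old files that are neither small nor within the ratio.
        int start = 0;
        while (n - start >= minFiles) {
            long newer = 0;
            for (int i = start + 1; i < n; i++) {
                newer += sizes.get(i);
            }
            long limit = Math.max(minSize, (long) (newer * ratio));
            if (sizes.get(start) <= limit) {
                break; // first file that qualifies; newer files are kept too
            }
            start++;
        }
        int end = Math.min(n, start + maxFiles);
        // Step 2c: too few files left, so nothing is compacted.
        if (end - start < minFiles) {
            return Collections.emptyList();
        }
        return new ArrayList<>(sizes.subList(start, end));
    }

    public static void main(String[] args) {
        // Made-up sizes: four files with min=3 and max=3, so the major request
        // is downgraded and the minor selection then drops below the minimum.
        List<Long> sizes = Arrays.asList(900L, 300L, 100L, 5L);
        System.out.println(select(sizes, true, 3, 3, 2L, 1.2));
    }
}
```

 With these made-up inputs the request is too large to stay a major compaction 
 and too skewed to pass the minor checks, so the selection comes back empty. 
 The sketch does not claim to reproduce the exact region shown in the log above.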

[jira] [Commented] (HBASE-6071) getRegionServerWithRetires, should log unsuccessful attempts and exceptions.

2012-05-25 Thread Igal Shilman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283285#comment-13283285
 ] 

Igal Shilman commented on HBASE-6071:
-

[~zhi...@ebaysf.com], can you please help clarify the core tests score? I went 
through the test results and didn't find any failures. Also, browsing around 
recently submitted patches I see the same findbugs warning count, should I 
conclude that this is not necessarily related to this patch?
Thanks.

 getRegionServerWithRetires, should log unsuccessful attempts and exceptions.
 

 Key: HBASE-6071
 URL: https://issues.apache.org/jira/browse/HBASE-6071
 Project: HBase
  Issue Type: Improvement
  Components: client, ipc
Affects Versions: 0.92.0, 0.94.0
Reporter: Igal Shilman
Priority: Minor
  Labels: client, ipc
 Attachments: HBASE-6071.patch, 
 HConnectionManager_HBASE-6071-0.90.0.patch


 HConnectionImplementation.getRegionServerWithRetries might terminate w/ an 
 exception different than a DoNotRetryIOException, thus silently dropping 
 exceptions from previous attempts.
 [~ted_yu] suggested 
 ([here|http://mail-archives.apache.org/mod_mbox/hbase-user/201205.mbox/%3CCAFebPXBq9V9BVdzRTNr-MB3a1Lz78SZj6gvP6On0b%2Bajt9StAg%40mail.gmail.com%3E])
  adding a log message inside the catch block describing the exception type 
 and details.
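
 The suggestion amounts to logging each failed attempt inside the retry loop 
 instead of keeping only the last exception. The Java sketch below shows the 
 general shape under stated assumptions: RetryLoggingSketch and callWithRetries 
 are hypothetical names, not the HConnectionImplementation code, 
 java.util.logging stands in for the project's logger, and unchecked exceptions 
 stand in for IOException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

final class RetryLoggingSketch {
    private static final Logger LOG = Logger.getLogger("RetryLoggingSketch");

    /**
     * Calls the supplier up to numRetries times. Every failure is logged and
     * appended to the failures list, so earlier exceptions are not lost.
     */
    static <T> T callWithRetries(Supplier<T> call, int numRetries, List<Throwable> failures) {
        if (numRetries < 1) {
            throw new IllegalArgumentException("numRetries must be at least 1");
        }
        RuntimeException last = null;
        for (int tries = 0; tries < numRetries; tries++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                failures.add(e);
                LOG.log(Level.FINE, String.format(
                    "Exception during try #%d (out of %d):", tries + 1, numRetries), e);
                last = e;
            }
        }
        throw last; // every attempt failed; rethrow the final exception
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        List<Throwable> failures = new ArrayList<>();
        String result = callWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("attempt " + calls[0] + " failed");
            }
            return "ok";
        }, 5, failures);
        System.out.println(result + " after " + failures.size() + " failed attempts");
    }
}
```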


[jira] [Commented] (HBASE-6071) getRegionServerWithRetires, should log unsuccessful attempts and exceptions.


[ 
https://issues.apache.org/jira/browse/HBASE-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283301#comment-13283301
 ] 

Zhihong Yu commented on HBASE-6071:
---

I used my script to scan 
https://builds.apache.org/job/PreCommit-HBASE-Build/1991/console and didn't 
find any hanging test.
{code}
+  String message = String.format("Exception during try #%d (out of %d):",tries+1, numRetries);
+  LOG.debug(message,t);
{code}
nit: space should be inserted between comma and tries, between comma and t.

My original intention was to use the above debug log to help find the root 
cause.
It would be nice if you could add the debug log to the 0.90 release you're 
using and see what we get.
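
For illustration only, here is a minimal standalone sketch of a retry loop that logs every failed attempt instead of dropping it. It is not the code from the attached patch: the class name, the DoNotRetryException stand-in, and the use of java.util.logging (rather than the project's own LOG) are assumptions made so the example runs on its own.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

class RetryLoggingSketch {
  private static final Logger LOG =
      Logger.getLogger(RetryLoggingSketch.class.getName());

  /** Hypothetical stand-in for HBase's DoNotRetryIOException. */
  public static class DoNotRetryException extends IOException {
    public DoNotRetryException(String msg) {
      super(msg);
    }
  }

  /**
   * Runs the callable up to numRetries times. Each failed attempt is logged
   * and remembered, so the final exception carries all earlier failures.
   */
  public static <T> T callWithRetries(Callable<T> callable, int numRetries)
      throws IOException {
    List<Throwable> failures = new ArrayList<>();
    for (int tries = 0; tries < numRetries; tries++) {
      try {
        return callable.call();
      } catch (DoNotRetryException e) {
        // Retrying cannot help here, so give up immediately.
        throw e;
      } catch (Exception t) {
        // Log the attempt number and the exception before trying again.
        String message = String.format(
            "Exception during try #%d (out of %d):", tries + 1, numRetries);
        LOG.log(Level.FINE, message, t);
        failures.add(t);
      }
    }
    IOException exhausted =
        new IOException("Failed after " + numRetries + " attempts");
    for (Throwable f : failures) {
      exhausted.addSuppressed(f);
    }
    throw exhausted;
  }
}
```

Attaching the earlier failures as suppressed exceptions is one way to keep them visible even when debug logging is off; whether that is appropriate for the real client is a separate question.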

[jira] [Commented] (HBASE-6071) getRegionServerWithRetires, should log unsuccessful attempts and exceptions.


[ 
https://issues.apache.org/jira/browse/HBASE-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283302#comment-13283302
 ] 

Zhihong Yu commented on HBASE-6071:
---

If you look at ScannerCallable.java in trunk, you would see the following:
{code}
  public static final String LOG_SCANNER_ACTIVITY = "hbase.client.log.scanner.activity";
...
} catch (IOException e) {
  if (logScannerActivity) {
    LOG.info("Got exception in fetching from scanner="
      + scannerId, e);
  }
{code}
You can backport related code and set hbase.client.log.scanner.activity to true.
This way you would see the exception in log.

[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.

2012-05-25 Thread Max Lapan (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Max Lapan updated HBASE-5416:
-

Status: Open (was: Patch Available)

Improve performance of scans with some kind of filters.
---

Key: HBASE-5416
URL: https://issues.apache.org/jira/browse/HBASE-5416
Project: HBase
Issue Type: Improvement
Components: filters, performance, regionserver
Affects Versions: 0.90.4
Reporter: Max Lapan
Assignee: Max Lapan
Attachments: 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch,
Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch,
Filtered_scans_v5.1.patch, Filtered_scans_v5.patch

When a scan is performed, the whole row is loaded into the result list, and
only then is the filter (if any) applied to decide whether the row is needed.
But when a scan covers several CFs and the filter checks data from only a
subset of them, data from the CFs the filter does not check is not needed at
the filter stage; it is needed only once we have decided to include the
current row. In that case we can significantly reduce the amount of IO
performed by a scan by loading only the values the filter actually checks.
For example, we have two CFs: flags and snap. Flags is quite small (a bunch of
megabytes) and is used to filter large entries from snap. Snap is very large
(tens of GB) and is quite costly to scan. If we need only rows with some flag
set, we use SingleColumnValueFilter to limit the result to a small subset of
the region. But the current implementation loads both CFs to perform the
scan, when only a small subset is needed.
The attached patch adds one routine to the Filter interface that lets a filter
specify which CFs it needs for its operation. In HRegion, we separate all
scanners into two groups: those needed by the filter and the rest (joined).
When a new row is considered, only the needed data is loaded and the filter is
applied; only if the filter accepts the row is the rest of the data loaded. On
our data, this speeds up such scans 30-50 times. It also gives us a way to
better normalize the data into separate columns by optimizing the scans
performed.
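
The two-group idea can be modeled in a few lines. The following is a toy sketch, not the patch itself: the in-memory map standing in for a region, the class and method names, and the read counter are all assumptions introduced for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class TwoPhaseScanSketch {
  /** Reads per column family, to make the IO saving visible. */
  public static final Map<String, Integer> reads = new HashMap<>();

  /** Simulates loading one family of one row; store is row -> family -> value. */
  static String load(Map<String, Map<String, String>> store,
                     String row, String family) {
    reads.merge(family, 1, Integer::sum);
    return store.get(row).get(family);
  }

  /**
   * Loads only the family the filter needs, applies the filter, and loads
   * the remaining (joined) families only for rows the filter accepts.
   */
  public static List<Map<String, String>> scan(
      Map<String, Map<String, String>> store,
      String filterFamily,
      List<String> joinedFamilies,
      Predicate<String> filter) {
    List<Map<String, String>> results = new ArrayList<>();
    for (String row : store.keySet()) {
      String value = load(store, row, filterFamily);
      if (!filter.test(value)) {
        continue; // rejected: the joined families are never touched
      }
      Map<String, String> result = new LinkedHashMap<>();
      result.put(filterFamily, value);
      for (String family : joinedFamilies) {
        result.put(family, load(store, row, family));
      }
      results.add(result);
    }
    return results;
  }
}
```

With three rows of which one passes the filter, the small family is read three times and the large one only once, which is where the saving described above would come from.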

[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.

2012-05-25 Thread Max Lapan (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Max Lapan updated HBASE-5416:
-

Hadoop Flags: (was: Reviewed)
Status: Patch Available (was: Open)

Improve performance of scans with some kind of filters.
---

[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.

2012-05-25 Thread Max Lapan (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Max Lapan updated HBASE-5416:
-

Attachment: Filtered_scans_v5.1.patch

Fixed issues with incorrect rebase, applied suggested changes from first review.

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283349#comment-13283349
 ] 

ramkrishna.s.vasudevan commented on HBASE-5916:
---

@Chunhui
I like your idea too.  As I said, we are planning to raise an improvement 
activity for master restart and SSH.
Even with the above approach, there is one more problematic scenario.  Please 
note that this scenario can occur even without your suggestion.

Two region servers exist.  Both go down while the flow is in 
AM.joinCluster().  As no RS is available at that time, we will not make any 
assignment, and all regions will go into RIT, waiting for the timeout monitor.  
SSH is also waiting, as the master initialization is not complete (this step 
is as per your suggestion).  Now suppose there are 100 regions, all waiting to 
be assigned.
If a new RS now comes up, this code in TimeoutMonitor
{code}
  if (regionState.getStamp() + timeout <= now) {
    // decide on action upon timeout
    actOnTimeOut(regionState);
  } else if (this.allRegionServersOffline && !allRSsOffline) {
    // if some RSs just came back online, we can start
    // the assignment right away
    actOnTimeOut(regionState);
  }
{code}
will immediately trigger assignment.  At the same time, as master 
initialization has completed by then, SSH is able to carry on with assignment 
as well.  This will lead to double assignment.  In HBASE-5816, Stack suggested 
having one common queue through which every assignment is done, so that SSH 
will not interfere with it or vice versa.  
I suggest we get in the patch that addresses the current JIRA's problem and 
work on a different JIRA to address the master restart and SSH area, which is 
troublesome.
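
To make the common-queue idea concrete, here is a toy model, not HBase code. It assumes a single worker thread and a set of regions already requested; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SingleAssignmentQueueSketch {
  // One worker thread: assignments run strictly one after another.
  private final ExecutorService worker = Executors.newSingleThreadExecutor();
  // Regions that already have an assignment queued or completed.
  private final Set<String> requested = ConcurrentHashMap.newKeySet();
  private final List<String> assignments =
      Collections.synchronizedList(new ArrayList<>());

  /**
   * Entry point for every requester (for example the timeout monitor or the
   * shutdown handler). Returns false when the region was already requested.
   */
  public boolean requestAssign(String region, String requester) {
    if (!requested.add(region)) {
      return false; // duplicate request: this is the double assignment we avoid
    }
    worker.execute(() -> assignments.add(region + " via " + requester));
    return true;
  }

  /** Stops accepting work, waits for queued assignments, returns what ran. */
  public List<String> drain() throws InterruptedException {
    worker.shutdown();
    worker.awaitTermination(10, TimeUnit.SECONDS);
    return new ArrayList<>(assignments);
  }
}
```

In this model a second request for the same region is simply rejected; a real implementation would also have to handle failed assignments and regions that need to be reassigned later.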


[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283357#comment-13283357
]

Hadoop QA commented on HBASE-5416:
--

-1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12529706/Filtered_scans_v5.1.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 6 new or modified tests.

+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 33 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.TestRegionRebalancing

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1994//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1994//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1994//console

This message is automatically generated.

Improve performance of scans with some kind of filters.
---

[jira] [Assigned] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.


 [ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh reassigned HBASE-5892:
-

Assignee: Andrew Wang

 [hbck] Refactor parallel WorkItem* to Futures.
 --

 Key: HBASE-5892
 URL: https://issues.apache.org/jira/browse/HBASE-5892
 Project: HBase
  Issue Type: Improvement
Reporter: Jonathan Hsieh
Assignee: Andrew Wang
  Labels: noob
 Attachments: hbase-5892.patch


 This would convert WorkItem* logic (with low level notifies, and rough 
 exception handling)  into a more canonical Futures pattern.
 Currently there are two instances of this pattern (for loading hdfs dirs, for 
 contacting regionservers for assignments, and soon -- for loading hdfs 
 .regioninfo files).
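
As a rough sketch of what the Futures pattern looks like in general (this is not the attached hbase-5892.patch; the class name and the loadDir work unit are hypothetical), the standard ExecutorService API replaces hand-written notifies and gives each task's failure back through its Future:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FuturesSketch {
  /** Hypothetical work unit: pretend to load one directory. */
  static Callable<String> loadDir(String dir) {
    return () -> {
      if (dir.isEmpty()) {
        throw new IllegalArgumentException("empty dir name");
      }
      return "loaded " + dir;
    };
  }

  /** Runs all work units in parallel and collects one outcome per unit. */
  public static List<String> runAll(List<String> dirs, int threads)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Callable<String>> tasks = new ArrayList<>();
      for (String dir : dirs) {
        tasks.add(loadDir(dir));
      }
      List<String> results = new ArrayList<>();
      // invokeAll blocks until every task is done and keeps task order.
      for (Future<String> future : pool.invokeAll(tasks)) {
        try {
          results.add(future.get());
        } catch (ExecutionException e) {
          // A failure in one task surfaces here, per task, not as a lost error.
          results.add("failed: " + e.getCause().getMessage());
        }
      }
      return results;
    } finally {
      pool.shutdown();
    }
  }
}
```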

[jira] [Commented] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

[
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283369#comment-13283369
]

Jonathan Hsieh commented on HBASE-5892:
---

Andrew, looks good. I'm going to wait for the hadoopqa robot to execute the
test suite. Alternately, since this just modifies hbck, can you try this
command and share results: 'mvn test -PlocalTests -Dtest=TestHbaseFsck'?

I'd like to keep all hbck across versions essentially the same -- would you be
willing to port to 0.90/0.92/0.94? I'd bet that this may apply to 0.94 and
0.92, and that 0.90 would require some near trivial tweaks.

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.


[ 
https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283412#comment-13283412
 ] 

Jonathan Hsieh commented on HBASE-6050:
---

Just for clarification: are these edits actually replayed to the daughter 
regions, and are these recovered.edits files kept around for something (the 
CJ?) to eventually clean up?

 HLogSplitter renaming recovered.edits and CJ removing the parent directory 
 races, making the HBCK to think cluster is inconsistent.
 ---

 Key: HBASE-6050
 URL: https://issues.apache.org/jira/browse/HBASE-6050
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
 Attachments: HBASE-6050.patch


 The scenario is like this:
 - A region is being split.
 - The master has not yet processed the split.
 - The region server goes down.
 - The split log manager starts splitting the logs and creates the 
 recovered.edits in the splitlog path.
 - CJ starts, deletes the entry from META, and completes the 
 deletion of the region dir.
 - In HLogSplitter, as the final step, we rename the recovered.edits to come 
 under the regiondir.
 If the regiondir does not exist at that point, we create it and then add the 
 recovered.edits.
 Because of this, HBCK considers it an orphan region, because we have the 
 regiondir but no regioninfo.
 The cluster is actually fine, but the report is misleading.
 {code}
         } else {
           Path dstdir = dst.getParent();
           if (!fs.exists(dstdir)) {
             if (!fs.mkdirs(dstdir)) LOG.warn("mkdir failed on " + dstdir);
           }
         }
         fs.rename(src, dst);
         LOG.debug(" moved " + src + " => " + dst);
       } else {
         LOG.debug("Could not move recovered edits from " + src +
             " as it doesn't exist");
       }
     }
     archiveLogs(null, corruptedLogs, processedLogs,
         oldLogDir, fs, conf);
 {code}
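
One possible direction, shown only as a sketch and not necessarily what HBASE-6050.patch does, is to skip the move when the region directory is already gone instead of recreating it. The example below uses java.nio.file on a local filesystem as a stand-in for the Hadoop FileSystem API; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class RecoveredEditsMoveSketch {
  /**
   * Moves src to dst only when dst's parent (the region dir) still exists.
   * Returns true if the file was moved, false if the move was skipped.
   */
  public static boolean moveIfRegionDirExists(Path src, Path dst)
      throws IOException {
    Path regionDir = dst.getParent();
    if (regionDir == null || !Files.isDirectory(regionDir)) {
      // The region dir was removed (for example by a cleanup job).
      // Do not recreate it, so no directory without region info appears.
      return false;
    }
    Files.move(src, dst);
    return true;
  }
}
```

The check and the move are still two separate steps, so a deletion between them remains possible; what to do with a skipped file (discard or archive) is left open here.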

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.


[ 
https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283452#comment-13283452
 ] 

ramkrishna.s.vasudevan commented on HBASE-6050:
---

@Jon
In our case the split had completed and the RS went down due to a ZK issue, 
which is why the Master was not able to respond to the split region 
completion.  
Because the RS went down, the recovered.edits creation came into play.
Ideally CJ just cleans up the entire region directory, because the parent is 
in split state and offlined.  Also, in this case, as the split has completed, 
we are sure that the data has been flushed to store files.  Daughter regions 
will have their own region directories.
Did I answer your question? ;)

[jira] [Updated] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node


 [ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rajeshbabu updated HBASE-6088:
--

Attachment: HBASE-6088_trunk.patch

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
 Fix For: 0.94.1

 Attachments: HBASE-6088_trunk.patch


 Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 {noformat}
 2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session 
 timed out, have not heard from server in 26668ms for sessionid 
 0x1377a75f41d0012, closing socket connection and attempting reconnect
 2012-05-24 01:45:41,464 WARN 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper exception: 
 org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
 = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 {noformat}
 {noformat}
 2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
 cleanupCurrentWriter  waiting for transactions to get synced  total 189377 
 synced till here 189365
 2012-05-24 01:45:48,474 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 java.io.IOException: Failed setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
   at 
 org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.zookeeper.KeeperException$BadVersionException: 
 KeeperErrorCode = BadVersion for 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
   at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
   ... 5 more
 2012-05-24 01:45:48,476 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of 
 failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 {noformat}
 {noformat}
 2012-05-24 01:47:28,141 ERROR 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is 
 not a retry
 2012-05-24 01:47:28,142 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 java.io.IOException: Failed create of ephemeral 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:865)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
   at 
 org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
 {noformat}
 Due to the above exception, region splitting kept failing continuously for 
 more than 5 hrs.

[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node


[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283490#comment-13283490
 ] 

rajeshbabu commented on HBASE-6088:
---

Attached patch for trunk. Please review and provide suggestions/comments.

[jira] [Assigned] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node


 [ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rajeshbabu reassigned HBASE-6088:
-

Assignee: rajeshbabu

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-6088_trunk.patch


 Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 {noformat}
 2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session 
 timed out, have not heard from server in 26668ms for sessionid 
 0x1377a75f41d0012, closing socket connection and attempting reconnect
 2012-05-24 01:45:41,464 WARN 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper exception: 
 org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
 = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 {noformat}
 {noformat}
 2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
 cleanupCurrentWriter  waiting for transactions to get synced  total 189377 
 synced till here 189365
 2012-05-24 01:45:48,474 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 java.io.IOException: Failed setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
   at 
 org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.zookeeper.KeeperException$BadVersionException: 
 KeeperErrorCode = BadVersion for 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
   at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
   ... 5 more
 2012-05-24 01:45:48,476 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of 
 failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 {noformat}
 {noformat}
 2012-05-24 01:47:28,141 ERROR 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is 
 not a retry
 2012-05-24 01:47:28,142 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 java.io.IOException: Failed create of ephemeral 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:865)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
   at 
 org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
 {noformat}
 Due to the above exception, region splitting kept failing continuously for 
 more than 5 hours.


[jira] [Updated] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node


 [ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rajeshbabu updated HBASE-6088:
--

Fix Version/s: 0.96.0
   Status: Patch Available  (was: Open)

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-6088_trunk.patch


 Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 {noformat}
 2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session 
 timed out, have not heard from server in 26668ms for sessionid 
 0x1377a75f41d0012, closing socket connection and attempting reconnect
 2012-05-24 01:45:41,464 WARN 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper exception: 
 org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
 = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 {noformat}
 {noformat}
 2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
 cleanupCurrentWriter  waiting for transactions to get synced  total 189377 
 synced till here 189365
 2012-05-24 01:45:48,474 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 java.io.IOException: Failed setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
   at 
 org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.zookeeper.KeeperException$BadVersionException: 
 KeeperErrorCode = BadVersion for 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
   at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
   ... 5 more
 2012-05-24 01:45:48,476 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of 
 failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 {noformat}
 {noformat}
 2012-05-24 01:47:28,141 ERROR 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is 
 not a retry
 2012-05-24 01:47:28,142 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 java.io.IOException: Failed create of ephemeral 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:865)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
   at 
 org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
 {noformat}
 Due to the above exception, region splitting kept failing continuously for 
 more than 5 hours.
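
The two failures in the logs above (BadVersion on setData, then "already exists" on create) both come from conditional operations on the same znode. The following is only a minimal sketch of that mechanism using a toy in-memory store. It is not ZooKeeper or HBase code, and the class name, result codes and example path are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory stand-in for a versioned node store. Not ZooKeeper or HBase code. */
final class VersionedStoreSketch {
    /** Hypothetical result codes, named after the errors seen in the logs. */
    enum Result { OK, NODE_EXISTS, NO_NODE, BAD_VERSION }

    // Maps a node path to its current data version.
    private final Map<String, Integer> versions = new HashMap<>();

    /** Creates the node at version 0; fails if the node is already present. */
    Result create(String path) {
        if (versions.containsKey(path)) {
            return Result.NODE_EXISTS;
        }
        versions.put(path, 0);
        return Result.OK;
    }

    /** Conditional write: succeeds only if the caller's expected version is current. */
    Result setData(String path, int expectedVersion) {
        Integer current = versions.get(path);
        if (current == null) {
            return Result.NO_NODE;
        }
        if (current.intValue() != expectedVersion) {
            return Result.BAD_VERSION;
        }
        versions.put(path, current + 1);
        return Result.OK;
    }

    public static void main(String[] args) {
        VersionedStoreSketch store = new VersionedStoreSketch();
        String node = "/hbase/unassigned/example"; // hypothetical path
        System.out.println(store.create(node));     // first create of the node
        System.out.println(store.setData(node, 0)); // expected version is current
        System.out.println(store.setData(node, 0)); // expected version is now stale
        System.out.println(store.create(node));     // node is still present
    }
}
```

In this sketch the third call is rejected as BAD_VERSION because the caller still holds version 0, and the fourth is rejected as NODE_EXISTS because the node was never removed. Whether the real failure followed exactly this sequence is not stated in the report.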


[jira] [Comment Edited] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

[
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283369#comment-13283369
]

Jonathan Hsieh edited comment on HBASE-5892 at 5/25/12 3:19 PM:

was (Author: jmhsieh):
Andrew, looks good. I'm going to wait for the hadoopqa robot to execute
the test suite. Alternately, since this just modifies hbck, can you try this
command and share results: 'mvn test -PlocalTests -Dtest=TestHbaseFsck'?

[hbck] Refactor parallel WorkItem* to Futures.
--

Key: HBASE-5892
URL: https://issues.apache.org/jira/browse/HBASE-5892
Project: HBase
Issue Type: Improvement
Reporter: Jonathan Hsieh
Assignee: Andrew Wang
Labels: noob
Attachments: hbase-5892.patch

[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node


[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283544#comment-13283544
 ] 

Hadoop QA commented on HBASE-6088:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12529722/HBASE-6088_trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 33 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestFromClientSide
  org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1995//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1995//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1995//console

This message is automatically generated.

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-6088_trunk.patch



[jira] [Created] (HBASE-6094) [refGuide] Improvements to new contributor docs

2012-05-25 Thread Ian Varley (JIRA)

Ian Varley created HBASE-6094:
-

 Summary: [refGuide] Improvements to new contributor docs
 Key: HBASE-6094
 URL: https://issues.apache.org/jira/browse/HBASE-6094
 Project: HBase
  Issue Type: Improvement
Reporter: Ian Varley
Assignee: Doug Meil
Priority: Minor


book.xml
* Adding a section in the compression appendix about changing compression codecs. 
* A frequent question on the dist-list is whether people will have to copy the 
data into a new table, etc. You don't.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-6094) [refGuide] Improvements to new contributor docs

2012-05-25 Thread Ian Varley (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Varley updated HBASE-6094:
--

Description: 
developer.xml
* Expanded explanation around git and svn, and a mention of the EGit plugin
* Expanded explanation of setting up the eclipse project
* Extra section about basic compilation using maven and eclipse
* Fix to tarball command that makes it maven2 compatible
* Greatly expanded section about contributing docs, and clarification that 
pushing generated site is only for those with permissions

  was:
book.xml
* adding section in compression appendix about changing compression codecs. 
* A frequent question on the dist-list is whether people will have to copy the 
data into a new table, etc.,  You don't.


 [refGuide] Improvements to new contributor docs
 ---

 Key: HBASE-6094
 URL: https://issues.apache.org/jira/browse/HBASE-6094
 Project: HBase
  Issue Type: Improvement
Reporter: Ian Varley
Assignee: Doug Meil
Priority: Minor

 developer.xml
 * Expanded explanation around git and svn, and a mention of the EGit plugin
 * Expanded explanation of setting up the eclipse project
 * Extra section about basic compilation using maven and eclipse
 * Fix to tarball command that makes it maven2 compatible
 * Greatly expanded section about contributing docs, and clarification that 
 pushing generated site is only for those with permissions


[jira] [Created] (HBASE-6095) ActiveMasterManager NullPointerException

2012-05-25 Thread Jimmy Xiang (JIRA)

Jimmy Xiang created HBASE-6095:
--

 Summary: ActiveMasterManager NullPointerException
 Key: HBASE-6095
 URL: https://issues.apache.org/jira/browse/HBASE-6095
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.1
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.94.1


It is for 0.94 and 0.92.  Trunk doesn't have the issue.

{code}
  byte [] bytes =
ZKUtil.getDataAndWatch(watcher, watcher.masterAddressZNode);
  // TODO: redo this to make it atomic (only added for tests)
  ServerName master = ServerName.parseVersionedServerName(bytes);
{code}

bytes could be null.
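
One way to guard against the null case is sketched below. This is not the attached patch. The class and method names are hypothetical, and the string decoding merely stands in for ServerName.parseVersionedServerName.

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;

/** Illustrative null guard; the names here are hypothetical, not HBase APIs. */
final class MasterAddressSketch {
    /** Returns the decoded payload, or empty when the znode had no data. */
    static Optional<String> parseMasterAddress(byte[] bytes) {
        if (bytes == null) {
            // The znode was missing or empty, so there is nothing to parse.
            return Optional.empty();
        }
        return Optional.of(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(parseMasterAddress(null).isPresent());
        byte[] payload = "host.example.org,60000,1337823505339"
            .getBytes(StandardCharsets.UTF_8); // made-up server name
        System.out.println(parseMasterAddress(payload).orElse("<none>"));
    }
}
```

The caller then has to decide what an absent master address means, instead of hitting a NullPointerException inside the parser.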


[jira] [Updated] (HBASE-6095) ActiveMasterManager NullPointerException

2012-05-25 Thread Jimmy Xiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6095:
---

Attachment: hbase-6095.patch

 ActiveMasterManager NullPointerException
 

 Key: HBASE-6095
 URL: https://issues.apache.org/jira/browse/HBASE-6095
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.1
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.94.1

 Attachments: hbase-6095.patch


 It is for 0.94 and 0.92.  Trunk doesn't have the issue.
 {code}
   byte [] bytes =
 ZKUtil.getDataAndWatch(watcher, watcher.masterAddressZNode);
   // TODO: redo this to make it atomic (only added for tests)
   ServerName master = ServerName.parseVersionedServerName(bytes);
 {code}
 bytes could be null.


[jira] [Updated] (HBASE-6094) [refGuide] Improvements to new contributor docs

2012-05-25 Thread Ian Varley (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Varley updated HBASE-6094:
--

Attachment: book_hbase_6094.xml.patch

 [refGuide] Improvements to new contributor docs
 ---

 Key: HBASE-6094
 URL: https://issues.apache.org/jira/browse/HBASE-6094
 Project: HBase
  Issue Type: Improvement
Reporter: Ian Varley
Assignee: Doug Meil
Priority: Minor
 Attachments: book_hbase_6094.xml.patch


 developer.xml
 * Expanded explanation around git and svn, and a mention of the EGit plugin
 * Expanded explanation of setting up the eclipse project
 * Extra section about basic compilation using maven and eclipse
 * Fix to tarball command that makes it maven2 compatible
 * Greatly expanded section about contributing docs, and clarification that 
 pushing generated site is only for those with permissions


[jira] [Commented] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

[
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283599#comment-13283599
]

ramkrishna.s.vasudevan commented on HBASE-6070:
---

Committed to trunk, 0.94 and 0.92.
Thanks for the review Ted.

AM.nodeDeleted and SSH races creating problems for regions under SPLIT
--

Key: HBASE-6070
URL: https://issues.apache.org/jira/browse/HBASE-6070
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.92.2, 0.96.0, 0.94.1

Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch,
HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch,
HBASE-6070_trunk_1.patch

We tried to address the problems with master restart and RS restart while a
region SPLIT is in progress as part of HBASE-5806.
While looking further we found that one race condition still remains.
1. Split has just started and the znode is in RS_SPLIT state.
2. RS goes down.
3. The first callback for SSH comes.
4. As part of the fix for HBASE-5806, SSH knows that some region is in RIT.
5. But now the nodeDeleted event comes for the SPLIT node, and there we try to
delete the RIT.
6. After this, SSH checks whether any node is in RIT. As it does not find the
region in RIT, the region is never assigned.
When we fixed HBASE-5806, step 6 happened first and then step 5 happened, so
we missed this case. Now we have found it. Will come up with a patch shortly.
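
The ordering problem described above can be sketched as a small single-threaded simulation. This is only an illustration of why the order of the nodeDeleted event and the SSH check matters. The class name, method name and region name are assumptions, and the real AssignmentManager and ServerShutdownHandler code is far more involved.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the nodeDeleted vs. SSH ordering; not HBase code. */
final class SplitRaceSketch {
    /**
     * Returns true if the SSH check still finds the region in RIT and can
     * therefore assign it.
     */
    static boolean regionGetsAssigned(boolean nodeDeletedBeforeSshCheck) {
        Set<String> regionsInTransition = new HashSet<>();
        String region = "region-a"; // hypothetical region name
        regionsInTransition.add(region); // split has started, region is in RIT

        boolean assigned;
        if (nodeDeletedBeforeSshCheck) {
            regionsInTransition.remove(region);               // nodeDeleted clears RIT
            assigned = regionsInTransition.contains(region);  // SSH check finds nothing
        } else {
            assigned = regionsInTransition.contains(region);  // SSH check sees the region
            regionsInTransition.remove(region);               // nodeDeleted arrives later
        }
        return assigned;
    }

    public static void main(String[] args) {
        System.out.println("SSH check first:   " + regionGetsAssigned(false));
        System.out.println("nodeDeleted first: " + regionGetsAssigned(true));
    }
}
```

When the nodeDeleted event runs first, the check comes back empty and the region is left unassigned, which is the case the description says was missed.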

[jira] [Updated] (HBASE-6095) ActiveMasterManager NullPointerException

2012-05-25 Thread Jimmy Xiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6095:
---

Status: Patch Available  (was: Open)

 ActiveMasterManager NullPointerException
 

 Key: HBASE-6095
 URL: https://issues.apache.org/jira/browse/HBASE-6095
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.1
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.94.1

 Attachments: hbase-6095.patch


 It is for 0.94 and 0.92.  Trunk doesn't have the issue.
 {code}
   byte [] bytes =
 ZKUtil.getDataAndWatch(watcher, watcher.masterAddressZNode);
   // TODO: redo this to make it atomic (only added for tests)
   ServerName master = ServerName.parseVersionedServerName(bytes);
 {code}
 bytes could be null.


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

[
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

ramkrishna.s.vasudevan updated HBASE-6070:
--

Resolution: Fixed
Status: Resolved (was: Patch Available)

AM.nodeDeleted and SSH races creating problems for regions under SPLIT
--

Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch,
HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch,
HBASE-6070_trunk_1.patch

[jira] [Commented] (HBASE-6095) ActiveMasterManager NullPointerException


[ 
https://issues.apache.org/jira/browse/HBASE-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283600#comment-13283600
 ] 

ramkrishna.s.vasudevan commented on HBASE-6095:
---

+1 on patch.

 ActiveMasterManager NullPointerException
 

 Key: HBASE-6095
 URL: https://issues.apache.org/jira/browse/HBASE-6095
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.1
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.94.1

 Attachments: hbase-6095.patch


 It is for 0.94 and 0.92.  Trunk doesn't have the issue.
 {code}
   byte [] bytes =
 ZKUtil.getDataAndWatch(watcher, watcher.masterAddressZNode);
   // TODO: redo this to make it atomic (only added for tests)
   ServerName master = ServerName.parseVersionedServerName(bytes);
 {code}
 bytes could be null.


[jira] [Commented] (HBASE-6095) ActiveMasterManager NullPointerException

[
https://issues.apache.org/jira/browse/HBASE-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283601#comment-13283601
]

Hadoop QA commented on HBASE-6095:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12529744/hbase-6095.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified
tests.
Please justify why no new tests are needed for this
patch.
Also please list what manual steps were performed to
verify this patch.

-1 patch. The patch command could not apply the patch.

Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1996//console

This message is automatically generated.

ActiveMasterManager NullPointerException

Key: HBASE-6095
URL: https://issues.apache.org/jira/browse/HBASE-6095
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.94.1
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Fix For: 0.94.1

Attachments: hbase-6095.patch

It is for 0.94 and 0.92. Trunk doesn't have the issue.
{code}
byte [] bytes =
ZKUtil.getDataAndWatch(watcher, watcher.masterAddressZNode);
// TODO: redo this to make it atomic (only added for tests)
ServerName master = ServerName.parseVersionedServerName(bytes);
{code}
bytes could be null.

[jira] [Updated] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs

[
https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Matteo Bertozzi updated HBASE-6068:
---

Affects Version/s: 0.96.0
0.92.1
Status: Patch Available (was: Open)

Secure HBase cluster : Client not able to call some admin APIs
--

Key: HBASE-6068
URL: https://issues.apache.org/jira/browse/HBASE-6068
Project: HBase
Issue Type: Bug
Components: security
Affects Versions: 0.94.0, 0.92.1, 0.96.0
Reporter: Anoop Sam John
Assignee: Matteo Bertozzi
Attachments: HBASE-6068-v0.patch

In a secure cluster, we allow HBase clients to read certain zk nodes by
granting global read permission on them. These nodes are the master address
znode, the root server znode and the clusterId znode. In ZKUtil.createACL(),
we can see that these node names are handled specially.
But some other client-side admin APIs also make a read call into ZooKeeper
from the client. These include the isTableEnabled() call (there may be others;
this is the one I have seen). Here the client directly reads a node in
ZooKeeper (the node created for this table) and the data is matched to
determine whether the table is enabled.
So in a secure cluster any client can read the ZooKeeper nodes that it needs
for its normal operation, such as the master address and root server address.
But what if the client calls this API [isTableEnabled()]?

[jira] [Updated] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs

[
https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Matteo Bertozzi updated HBASE-6068:
---

Attachment: HBASE-6068-v0.patch

Since certain znodes are accessed by the client directly they must be marked as
readable by everyone.

HBaseAdmin.checkHBaseAvailable() - /hbase
ZKTable.populateTableStates() - /hbase/table/* znodes

Secure HBase cluster : Client not able to call some admin APIs
--

Key: HBASE-6068
URL: https://issues.apache.org/jira/browse/HBASE-6068
Project: HBase
Issue Type: Bug
Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Anoop Sam John
Assignee: Matteo Bertozzi
Attachments: HBASE-6068-v0.patch

[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative

2012-05-25 Thread chunhui shen (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283605#comment-13283605
 ] 

chunhui shen commented on HBASE-5916:
-

@ram
Thanks for writing up the case in such detail.
However, I don't think the above case will happen. Correct me if I am wrong.

bq. At the same time as master initialization has already been done and so we 
are able to carry on assignment with SSH also. This will lead to double 
assignment
Why would it lead to double assignment? When we reassign regions while 
processing SSH, we skip regions as follows:
{code}
if (processDeadRegion(e.getKey(), e.getValue(),
    this.services.getAssignmentManager(),
    this.server.getCatalogTracker())) {
  ServerName addressFromAM = this.services.getAssignmentManager()
      .getRegionServerOfRegion(e.getKey());
  if (rit != null && !rit.isClosing() && !rit.isPendingClose()) {
    // Skip regions that were in transition unless CLOSING or
    // PENDING_CLOSE
    LOG.info("Skip assigning region " + rit.toString());
  } else if (addressFromAM != null
      && !addressFromAM.equals(this.serverName)) {
    LOG.debug("Skip assigning region "
        + e.getKey().getRegionNameAsString()
        + " because it has been opened in "
        + addressFromAM.getServerName());
  } else {
    toAssignRegions.add(e.getKey());
  }
}
{code}
In RIT (and not CLOSING or PENDING_CLOSE; it won't be in either of these 
states in the above case)? -> skip
Already online on another server? -> skip
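
The two skip rules can be restated as a standalone predicate. This is a simplified sketch, not the ServerShutdownHandler code: the enum, the method name and the use of plain strings for server names are assumptions made so the example is self-contained.

```java
/** Simplified restatement of the skip rules quoted above; not HBase code. */
final class SkipDecisionSketch {
    /** Hypothetical summary of a region's transition state. */
    enum RitState { NONE, CLOSING, PENDING_CLOSE, OTHER }

    /**
     * Returns true if the region should be added to the list of regions to
     * assign while the dead server is being processed.
     */
    static boolean shouldAssign(RitState rit, String addressFromAM, String deadServer) {
        boolean inTransition = rit != RitState.NONE;
        if (inTransition && rit != RitState.CLOSING && rit != RitState.PENDING_CLOSE) {
            return false; // in transition in some other state: skip
        }
        if (addressFromAM != null && !addressFromAM.equals(deadServer)) {
            return false; // already opened on a different server: skip
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(shouldAssign(RitState.OTHER, null, "rs1"));
        System.out.println(shouldAssign(RitState.NONE, "rs2", "rs1"));
        System.out.println(shouldAssign(RitState.NONE, "rs1", "rs1"));
    }
}
```

Only a region that is neither mid-transition (outside CLOSING and PENDING_CLOSE) nor already online elsewhere reaches the assign branch.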

Finally, I think HBASE-5916_trunk_v7.patch is fine, and I agree that we check 
in the patch for the current JIRA.
Thanks for helping with my doubt.

 RS restart just before master intialization we make the cluster non operative
 -

 Key: HBASE-5916
 URL: https://issues.apache.org/jira/browse/HBASE-5916
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.1

 Attachments: HBASE-5916_trunk.patch, HBASE-5916_trunk_1.patch, 
 HBASE-5916_trunk_1.patch, HBASE-5916_trunk_2.patch, HBASE-5916_trunk_3.patch, 
 HBASE-5916_trunk_4.patch, HBASE-5916_trunk_v5.patch, 
 HBASE-5916_trunk_v6.patch, HBASE-5916_trunk_v7.patch, HBASE-5916v8.patch


 Consider a case where the master is being restarted. An RS that was alive 
 when the master restart began is itself restarted before the master enables 
 the ServerShutdownHandler.
 {code}
 serverShutdownHandlerEnabled = true;
 {code}
 In this case, when the RS tries to register with the master, the master tries 
 to expire the old server instance, but the server cannot be expired because 
 the ServerShutdownHandler is not yet enabled.
 This can happen when only one RS is restarted or when all the RSs are 
 restarted at the same time (before assignRootAndMeta).
 {code}
 LOG.info(message);
 if (existingServer.getStartcode() < serverName.getStartcode()) {
   LOG.info("Triggering server recovery; existingServer " +
     existingServer + " looks stale, new server:" + serverName);
   expireServer(existingServer);
 }
 {code}
 If another RS is brought up, the cluster returns to normal.
 This may be a rare corner case.
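
The interaction between the startcode check and the handler flag can be sketched as follows. This is a deliberately reduced model, assuming only what the description states. The class and method names are hypothetical, and the real ServerManager logic has more branches.

```java
/** Reduced model of the stale-server check described above; not HBase code. */
final class StaleServerSketch {
    /** An existing registration looks stale if the new instance started later. */
    static boolean looksStale(long existingStartcode, long newStartcode) {
        return existingStartcode < newStartcode;
    }

    /**
     * Returns true only if the stale server would actually be expired, which
     * per the description also requires the shutdown handler to be enabled.
     */
    static boolean expiryProcessed(long existingStartcode, long newStartcode,
                                   boolean shutdownHandlerEnabled) {
        return looksStale(existingStartcode, newStartcode) && shutdownHandlerEnabled;
    }

    public static void main(String[] args) {
        long oldStart = 1337823505339L; // made-up startcodes
        long newStart = 1337823605339L;
        System.out.println(expiryProcessed(oldStart, newStart, false));
        System.out.println(expiryProcessed(oldStart, newStart, true));
    }
}
```

With the handler disabled, the stale instance is detected but never expired, which matches the stuck state the report describes.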


[jira] [Updated] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs

[
https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Matteo Bertozzi updated HBASE-6068:
---

Attachment: (was: HBASE-6068-v0.patch)

Secure HBase cluster : Client not able to call some admin APIs
--

[jira] [Updated] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs

[
https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Matteo Bertozzi updated HBASE-6068:
---

Attachment: HBASE-6068-v0.patch

Secure HBase cluster : Client not able to call some admin APIs
--

[jira] [Commented] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs


[ 
https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283609#comment-13283609
 ] 

Matteo Bertozzi commented on HBASE-6068:


HBaseAdmin.checkHBaseAvailable() -> exists() /hbase
ZKTable.populateTableStates() -> listChildrenNoWatch() /hbase/table/* znodes
ZKTable.getTableState() -> getData() /hbase/table/<table name>
HConnectionManager.getCurrentNrHRS() -> getNumberOfChildren() /hbase/rs/
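
The list above amounts to a set of znode paths that clients read directly. As a rough sketch of how such a set could be checked, the predicate below matches a path against those locations. It is not the attached patch or ZKUtil.createACL(). The class and method names are hypothetical, and the paths are copied from the list rather than taken from the HBase configuration.

```java
import java.util.Arrays;
import java.util.List;

/** Path check for znodes that clients read directly; illustrative only. */
final class ClientReadableZnodeSketch {
    // Subtrees named in the list above (assumed default /hbase parent znode).
    private static final List<String> CLIENT_READ_SUBTREES =
        Arrays.asList("/hbase/table", "/hbase/rs");

    /** Returns true if the given non-null path is one that clients read directly. */
    static boolean clientReads(String path) {
        if ("/hbase".equals(path)) {
            return true; // checked with exists() by the availability call
        }
        for (String root : CLIENT_READ_SUBTREES) {
            if (path.equals(root) || path.startsWith(root + "/")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(clientReads("/hbase/table/t1"));
        System.out.println(clientReads("/hbase/unassigned/abc"));
    }
}
```

Matching on `root + "/"` rather than a bare prefix keeps an unrelated sibling such as /hbase/tablets from being treated as part of /hbase/table.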

 Secure HBase cluster : Client not able to call some admin APIs
 --

 Key: HBASE-6068
 URL: https://issues.apache.org/jira/browse/HBASE-6068
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Anoop Sam John
Assignee: Matteo Bertozzi
 Attachments: HBASE-6068-v0.patch


 In a secure cluster, we allow HBase clients to read certain zk nodes by 
 granting global read permission on them. These nodes are the master address 
 znode, the root server znode and the clusterId znode. In ZKUtil.createACL(), 
 we can see that these node names are handled specially.
 But some other client-side admin APIs also make a read call into ZooKeeper 
 from the client. These include the isTableEnabled() call (there may be others; 
 this is the one I have seen). Here the client directly reads a node in 
 ZooKeeper (the node created for this table) and the data is matched to 
 determine whether the table is enabled.
 So in a secure cluster any client can read the ZooKeeper nodes that it needs 
 for its normal operation, such as the master address and root server address. 
 But what if the client calls this API [isTableEnabled()]?


[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283612#comment-13283612
 ] 

ramkrishna.s.vasudevan commented on HBASE-5916:
---

@Chunhui
{code}
else if (addressFromAM != null
    && !addressFromAM.equals(this.serverName)) {
  LOG.debug("Skip assigning region "
    + e.getKey().getRegionNameAsString()
    + " because it has been opened in "
    + addressFromAM.getServerName());
}
{code}
Just to clarify: the assignment will not happen if the address mismatches, 
but what if the RIT is still not updated for a few regions that are yet to be 
assigned?  Chunhui, as you said, all the cases discussed here are rare corner 
cases.  I really appreciate your help on this; it is making us find more 
cases.  Thank you very much.
Let me wait for Stack's comments also.

 RS restart just before master intialization we make the cluster non operative
 -

 Key: HBASE-5916
 URL: https://issues.apache.org/jira/browse/HBASE-5916
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.1

 Attachments: HBASE-5916_trunk.patch, HBASE-5916_trunk_1.patch, 
 HBASE-5916_trunk_1.patch, HBASE-5916_trunk_2.patch, HBASE-5916_trunk_3.patch, 
 HBASE-5916_trunk_4.patch, HBASE-5916_trunk_v5.patch, 
 HBASE-5916_trunk_v6.patch, HBASE-5916_trunk_v7.patch, HBASE-5916v8.patch


 Consider a case where my master is getting restarted.  An RS that was alive 
 when the master restart began gets restarted before the master enables the 
 ServerShutdownHandler.
 {code}
 serverShutdownHandlerEnabled = true;
 {code}
 In this case, when the RS tries to register with the master, the master will 
 try to expire the old server instance, but the server cannot be expired 
 because the serverShutdownHandler is not yet enabled.
 This can happen when only one RS is restarted, or when all the RSs are 
 restarted at the same time (before assignRootAndMeta).
 {code}
 LOG.info(message);
 if (existingServer.getStartcode() < serverName.getStartcode()) {
   LOG.info("Triggering server recovery; existingServer " +
     existingServer + " looks stale, new server:" + serverName);
   expireServer(existingServer);
 }
 {code}
 If another RS is brought up then the cluster comes back to normal.
 This may be a rare corner case.
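
The race in the description can be modelled with a small sketch. This is a simplified, hypothetical model and not the real ServerManager: the class, the map of host:port to startcode, and the register() method are assumptions made for illustration, and expiry is treated as instantaneous once the handler is enabled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the registration race; not HBase code.
class StaleServerSketch {
    boolean serverShutdownHandlerEnabled = false;
    // host:port -> startcode of the instance the master believes is online
    final Map<String, Long> onlineServers = new HashMap<>();

    // Returns true if the (re)started instance ends up registered.
    boolean register(String hostPort, long startcode) {
        Long existing = onlineServers.get(hostPort);
        if (existing != null) {
            if (existing >= startcode) {
                return false; // same or newer instance is already registered
            }
            // The existing entry looks stale and must be expired first.
            if (!serverShutdownHandlerEnabled) {
                return false; // cannot expire yet, so the new RS stays out
            }
            onlineServers.remove(hostPort); // expiry, instantaneous in this model
        }
        onlineServers.put(hostPort, startcode);
        return true;
    }
}
```

In this model a restarted RS with a newer startcode keeps being turned away for as long as serverShutdownHandlerEnabled is false, which matches the stuck state described above.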


[jira] [Commented] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs

[
https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283614#comment-13283614
]

ramkrishna.s.vasudevan commented on HBASE-6068:
---

@Matteo
Thanks for bringing up similar cases that deal with ZK.

Secure HBase cluster : Client not able to call some admin APIs
--

[jira] [Created] (HBASE-6096) AccessController v2

Andrew Purtell created HBASE-6096:
-

 Summary: AccessController v2
 Key: HBASE-6096
 URL: https://issues.apache.org/jira/browse/HBASE-6096
 Project: HBase
  Issue Type: Umbrella
  Components: security
Affects Versions: 0.96.0, 0.94.1
Reporter: Andrew Purtell


Umbrella issue for iteration on the initial AccessController drop.


[jira] [Commented] (HBASE-6036) Add Cluster-level PB-based calls to HMasterInterface (minus file-format related calls)

2012-05-25 Thread Gregory Chanan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283616#comment-13283616
 ] 

Gregory Chanan commented on HBASE-6036:
---

Safe to mark this Resolved?

 Add Cluster-level PB-based calls to HMasterInterface (minus file-format 
 related calls)
 --

 Key: HBASE-6036
 URL: https://issues.apache.org/jira/browse/HBASE-6036
 Project: HBase
  Issue Type: Task
  Components: ipc, master, migration
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Fix For: 0.96.0

 Attachments: HBASE-6036-v2.patch, HBASE-6036.patch


 This should be a subtask of HBASE-5445, but since that is a subtask, I can't 
 also make this a subtask (apparently).
 Convert the cluster-level calls that do not touch the file-format related 
 calls (see HBASE-5453).  These are:
 IsMasterRunning
 Shutdown
 StopMaster
 Balance
 LoadBalancerIs (was synchronousBalanceSwitch/balanceSwitch)


[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative

2012-05-25 Thread chunhui shen (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283617#comment-13283617
 ] 

chunhui shen commented on HBASE-5916:
-

bq.but what if for few regions which are yet to be assigned the RIT is still 
not updated

During master startup, before initialization completes, the region will be put 
in the RIT through AssignmentManager#processRegionsInTransition in the above case.

 RS restart just before master intialization we make the cluster non operative
 -


[jira] [Commented] (HBASE-6047) Put.has() can't determine result correctly

2012-05-25 Thread Alex Newman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283622#comment-13283622
 ] 

Alex Newman commented on HBASE-6047:


Could someone translate the Hudson output? Is this patch still working, or do 
I need to rebase onto some branches? The robot's output confuses this human.

 Put.has() can't determine result correctly
 --

 Key: HBASE-6047
 URL: https://issues.apache.org/jira/browse/HBASE-6047
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.92.1
Reporter: Wang Qiang
Assignee: Alex Newman
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: 
 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly-v2.patch, 
 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly.patch, 6047-92.txt, 
 PutTest.java


 The public method 'has(byte [] family, byte [] qualifier)' internally invokes 
 the private method 'has(byte [] family, byte [] qualifier, long ts, byte [] 
 value, boolean ignoreTS, boolean ignoreValue)' with 'value=new byte[0], 
 ignoreTS=true, ignoreValue=true', but there is a logical error in the body: 
 it enters the block
 {code}
 else if (ignoreValue) {
   for (KeyValue kv: list) {
     if (Arrays.equals(kv.getFamily(), family) &&
         Arrays.equals(kv.getQualifier(), qualifier) &&
         kv.getTimestamp() == ts) {
       return true;
     }
   }
 }
 {code}
 The expression 'kv.getTimestamp() == ts' in the if condition should only 
 be checked when 'ignoreTS=false'; otherwise, the following code prints false!
 {code}
 Put put = new Put(Bytes.toBytes("row-01"));
 put.add(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01"),
   1234567L, Bytes.toBytes("value-01"));
 System.out.println(put.has(Bytes.toBytes("family-01"),
   Bytes.toBytes("qualifier-01")));
 {code}
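
One way the matching logic could honour both flags is sketched below. This is not the attached patch: Cell is a hypothetical stand-in for KeyValue, and the static has() here only mirrors the parameter list quoted in the description.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Put/KeyValue; only the matching logic matters here.
class PutHasSketch {
    static final class Cell {
        final byte[] family;
        final byte[] qualifier;
        final long ts;
        final byte[] value;

        Cell(String family, String qualifier, long ts, String value) {
            this.family = bytes(family);
            this.qualifier = bytes(qualifier);
            this.ts = ts;
            this.value = bytes(value);
        }
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Family and qualifier must always match; the timestamp and the value
    // are compared only when the corresponding ignore flag is false.
    static boolean has(List<Cell> cells, byte[] family, byte[] qualifier,
                       long ts, byte[] value,
                       boolean ignoreTS, boolean ignoreValue) {
        for (Cell c : cells) {
            if (!Arrays.equals(c.family, family)
                || !Arrays.equals(c.qualifier, qualifier)) {
                continue;
            }
            if (!ignoreTS && c.ts != ts) {
                continue;
            }
            if (!ignoreValue && !Arrays.equals(c.value, value)) {
                continue;
            }
            return true;
        }
        return false;
    }
}
```

With a single loop, each flag gates exactly one comparison, so the ignoreTS=true, ignoreValue=true case from the report no longer depends on the timestamp passed in.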


[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283621#comment-13283621
 ] 

ramkrishna.s.vasudevan commented on HBASE-5916:
---

I meant that the RIT still has only the original server name and is not yet 
updated with the new RS. :) Good on you, Chunhui.

 RS restart just before master intialization we make the cluster non operative
 -


[jira] [Commented] (HBASE-6047) Put.has() can't determine result correctly


[ 
https://issues.apache.org/jira/browse/HBASE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283626#comment-13283626
 ] 

ramkrishna.s.vasudevan commented on HBASE-6047:
---

@Alex
Hudson has picked up the patch, and it is available in every version where 
the patch has gone in.
There are some test case failures in the build, which is why Hudson reports 
that it failed to create a build.  Maybe you can cross-check the build report 
to see whether your patch caused any of the failures.

 Put.has() can't determine result correctly
 --


[jira] [Created] (HBASE-6097) TestHRegion.testBatchPut is flaky on 0.92

2012-05-25 Thread Gregory Chanan (JIRA)

Gregory Chanan created HBASE-6097:
-

 Summary: TestHRegion.testBatchPut is flaky on 0.92
 Key: HBASE-6097
 URL: https://issues.apache.org/jira/browse/HBASE-6097
 Project: HBase
  Issue Type: Bug
  Components: test, wal
Affects Versions: 0.92.1
Reporter: Gregory Chanan
Assignee: Gregory Chanan


If I run this test in a loop, I get failures like the following:

Error Message:
expected:<1> but was:<2>

Stack Trace:
junit.framework.AssertionFailedError: expected:<1> but was:<2>
at junit.framework.Assert.fail(Assert.java:50)
at junit.framework.Assert.failNotEquals(Assert.java:287)
at junit.framework.Assert.assertEquals(Assert.java:67)
at junit.framework.Assert.assertEquals(Assert.java:134)
at junit.framework.Assert.assertEquals(Assert.java:140)
at org.apache.hadoop.hbase.regionserver.TestHRegion.testBatchPut(TestHRegion.java:536)



[jira] [Commented] (HBASE-6047) Put.has() can't determine result correctly

2012-05-25 Thread Alex Newman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283632#comment-13283632
 ] 

Alex Newman commented on HBASE-6047:


@ramkrishna, sorry to be dense, but just so I understand what you want me to do: 
I looked at the Jenkins builds and it looks like we are having flaky test 
issues. Sounds like I know what my next JIRA is. Can you confirm that I am 
correct in my diagnosis?

 Put.has() can't determine result correctly
 --


[jira] [Commented] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs

[
https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283634#comment-13283634
]

Hadoop QA commented on HBASE-6068:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12529749/HBASE-6068-v0.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 33 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1997//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1997//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1997//console

This message is automatically generated.

Secure HBase cluster : Client not able to call some admin APIs
--

[jira] [Updated] (HBASE-5666) RegionServer doesn't retry to check if base node is available


 [ 
https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-5666:
---

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Fixed with HBASE-5849

 RegionServer doesn't retry to check if base node is available
 -

 Key: HBASE-5666
 URL: https://issues.apache.org/jira/browse/HBASE-5666
 Project: HBase
  Issue Type: Bug
  Components: regionserver, zookeeper
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Attachments: HBASE-5666-0.92.patch, HBASE-5666-v1.patch, 
 HBASE-5666-v2.patch, HBASE-5666-v3.patch, HBASE-5666-v4.patch, 
 HBASE-5666-v5.patch, HBASE-5666-v6.patch, HBASE-5666-v7.patch, 
 HBASE-5666-v8.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, 
 hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, 
 hbase-zookeeper.log


 I've a script that starts hbase and a couple of region servers in distributed 
 mode (hbase.cluster.distributed = true)
 {code}
 $HBASE_HOME/bin/start-hbase.sh
 $HBASE_HOME/bin/local-regionservers.sh start 1 2 3
 {code}
 but the region servers are not able to start...
 It seems that when the RS starts, the base znode is still not available, and 
 HRegionServer.initializeZooKeeper() checks just once whether the base node is 
 available.
 {code}
 2012-03-28 21:54:05,013 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value 
 configured in 'zookeeper.znode.parent'. There could be a mismatch with the 
 one configured in the master.
 2012-03-28 21:54:08,598 FATAL 
 org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
 localhost,60202,133296824: Initialization of RS failed.  Hence aborting 
 RS.
 java.io.IOException: Received the shutdown message while waiting.
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
   at java.lang.Thread.run(Thread.java:662)
 {code}
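
A bounded retry around the existence check is one way to tolerate the base znode appearing late. The sketch below is not the change made in HBASE-5849; the class name, the BooleanSupplier standing in for the ZooKeeper existence check, and the attempt and sleep parameters are all assumptions made for illustration.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper; the supplier stands in for a ZooKeeper exists() check.
class BaseNodeWaitSketch {
    // Polls up to maxAttempts times, sleeping sleepMs between attempts.
    // Returns true as soon as the check succeeds, false if it never does
    // or if the waiting thread is interrupted.
    static boolean waitForBaseNode(BooleanSupplier baseNodeExists,
                                   int maxAttempts, long sleepMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (baseNodeExists.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return false;
                }
            }
        }
        return false;
    }
}
```

With a loop like this, a region server started a moment before the master has created the base znode would wait for it instead of stopping on the first failed check.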


[jira] [Commented] (HBASE-6097) TestHRegion.testBatchPut is flaky on 0.92

2012-05-25 Thread Gregory Chanan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283637#comment-13283637
 ] 

Gregory Chanan commented on HBASE-6097:
---

The issue is that this test checks that a sync happened like so:
{code}
assertEquals(1, HLog.getSyncOps());
{code}

But in 0.92, the LogSyncer thread will sync, and increment HLog.getSyncOps(), 
even if there are no updates to the wal.  This has been fixed in 0.94+.

So, you can trigger a failure by setting 
hbase.regionserver.optionallogflushinterval to an extremely low value and 
throwing in random sleeps to the test.

Possibilities for fixing:
1) Backport the work from 0.94+ that avoids syncing if there are no updates to 
the wal
2) Only change the test and just check that a sync ran, e.g.
{code}
assert(HLog.getSyncOps() > 0);
{code}
This makes the test a bit too accepting, because then it is possible that the 
syncer can sync nothing and we'd think a sync actually ran.

I'll investigate some more.
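
The effect can be reproduced deterministically with a toy counter. This sketch is not HLog: the class, its fields and the skipEmptySyncs switch are hypothetical, and the background LogSyncer is simulated by a plain second call to sync().

```java
// Hypothetical toy model of a WAL sync counter; not the real HLog.
class SyncCounterSketch {
    private final boolean skipEmptySyncs;
    private int pendingEdits = 0;
    private int syncOps = 0;

    SyncCounterSketch(boolean skipEmptySyncs) {
        this.skipEmptySyncs = skipEmptySyncs;
    }

    void append() {
        pendingEdits++;
    }

    void sync() {
        if (skipEmptySyncs && pendingEdits == 0) {
            return; // nothing to flush, so do not count a sync
        }
        pendingEdits = 0;
        syncOps++;
    }

    int getSyncOps() {
        return syncOps;
    }

    // One batch put, then one periodic sync that finds nothing pending.
    static int runScenario(boolean skipEmptySyncs) {
        SyncCounterSketch log = new SyncCounterSketch(skipEmptySyncs);
        log.append();
        log.sync(); // sync issued by the batch put
        log.sync(); // simulated background syncer firing with no new edits
        return log.getSyncOps();
    }
}
```

With empty syncs counted, the scenario reports 2 sync operations, the same shape as the "expected:<1> but was:<2>" failure; with empty syncs skipped it reports 1.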

 TestHRegion.testBatchPut is flaky on 0.92
 -


[jira] [Created] (HBASE-6098) ACL design changes

Andrew Purtell created HBASE-6098:
-

 Summary: ACL design changes
 Key: HBASE-6098
 URL: https://issues.apache.org/jira/browse/HBASE-6098
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell





[jira] [Created] (HBASE-6099) Secure ZooKeeper integration changes

Andrew Purtell created HBASE-6099:
-

 Summary: Secure ZooKeeper integration changes
 Key: HBASE-6099
 URL: https://issues.apache.org/jira/browse/HBASE-6099
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell





[jira] [Commented] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs

[
https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283643#comment-13283643
]

Hadoop QA commented on HBASE-6068:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12529753/HBASE-6068-v0.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 33 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1998//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1998//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1998//console

This message is automatically generated.

Secure HBase cluster : Client not able to call some admin APIs
--

[jira] [Created] (HBASE-6100) Fix the frequent testcase failures in 0.94 from build no #209

ramkrishna.s.vasudevan created HBASE-6100:
-

 Summary: Fix the frequent testcase failures in 0.94 from build no 
#209
 Key: HBASE-6100
 URL: https://issues.apache.org/jira/browse/HBASE-6100
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.94.1


Fix the flaky tests in the 0.94 branch after build #209.  Many test cases, such as 
org.apache.hadoop.hbase.TestLocalHBaseCluster.testLocalHBaseCluster
org.apache.hadoop.hbase.TestZooKeeper.testClientSessionExpired 
org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol.testSingleMethod

are failing frequently.



[jira] [Created] (HBASE-6101) Insure Observers cover all RPC and lifecycle code paths

Andrew Purtell created HBASE-6101:
-

 Summary: Insure Observers cover all RPC and lifecycle code paths
 Key: HBASE-6101
 URL: https://issues.apache.org/jira/browse/HBASE-6101
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell





[jira] [Updated] (HBASE-5986) Clients can see holes in the META table when regions are being split

2012-05-25 Thread Enis Soztutar (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Enis Soztutar updated HBASE-5986:
-

Attachment: HBASE-5986-0.94.patch
HBASE-5986-0.92.patch

Attaching patches for the 0.92 and 0.94 branches. They are direct ports of the
v3 patch, but the 0.92 patch also includes the
HRegionServer.getOnlineRegions(byte[] tableName) function, copied directly from
0.94, since we need it. I discovered this issue when testing with 0.92, so I
would like the fix to make it into that branch.

One minor mishap on my part is that the v3 patch which went into trunk
includes an unrelated change in RegionServerDynamicStatistics. The related issue
is HBASE-6025. Although the change is trivial (changing
RegionServerDynamicStatistics to extend the hbase-specific MetricsMBeanBase
rather than the hadoop-specific MetricsDynamicMBeanBase), we may want to note
this, or revert that part. The backport patches do not include this change.
Sorry for the trouble guys.

Clients can see holes in the META table when regions are being split

Key: HBASE-5986
URL: https://issues.apache.org/jira/browse/HBASE-5986
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 0.96.0

Attachments: 5986-v2.txt, HBASE-5986-0.92.patch,
HBASE-5986-0.94.patch, HBASE-5986-test_v1.patch, HBASE-5986_v3.patch

We found this issue when running large scale ingestion tests for HBASE-5754.
The problem is that the .META. table updates are not atomic while splitting a
region. In SplitTransaction, there is a time gap between marking the
parent offline and adding the daughters to the META table. This can result in
clients using MetaScanner, or HTable.getStartEndKeys (used by the
TableInputFormat), missing regions whose parent has just been made offline but
whose daughters are not added yet.
This is also related to HBASE-4335.
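
A client-side view of such a hole can be sketched as a continuity check over region boundaries. This is a hypothetical illustration and not MetaScanner code: keys are plain strings, an empty string marks the open start or end of the table, and the input is assumed to be sorted by start key.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical continuity check over {startKey, endKey} pairs; not HBase code.
class MetaHoleSketch {
    // Returns a description of every key range not covered by any region.
    static List<String> findHoles(String[][] regions) {
        List<String> holes = new ArrayList<>();
        String expectedStart = ""; // the table begins at the empty key
        for (String[] region : regions) {
            if (!region[0].equals(expectedStart)) {
                holes.add("[" + expectedStart + ", " + region[0] + ")");
            }
            expectedStart = region[1];
        }
        if (!expectedStart.isEmpty()) {
            holes.add("[" + expectedStart + ", <end>)"); // last region is missing
        }
        return holes;
    }
}
```

For example, if the region covering keys b to d has been marked offline and its daughters are not yet visible, a scan that sees only {"", "b"} and {"d", ""} yields one hole, "[b, d)".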

[jira] [Commented] (HBASE-6047) Put.has() can't determine result correctly


[ 
https://issues.apache.org/jira/browse/HBASE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283650#comment-13283650
 ] 

ramkrishna.s.vasudevan commented on HBASE-6047:
---

I have raised HBASE-6100 to address the test case failures in 0.94.
In 0.92 I see that TestLocalHBaseCluster is failing due to
{code}
Starting shutdown.
org.apache.hadoop.hbase.util.FileSystemVersionException: File system needs to 
be upgraded.  You have version null and I want version 7.  Run the 
'${HBASE_HOME}/bin/hbase migrate' script.
{code}

I feel we need to correct this across all branches.

 Put.has() can't determine result correctly
 --

 Key: HBASE-6047
 URL: https://issues.apache.org/jira/browse/HBASE-6047
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.92.1
Reporter: Wang Qiang
Assignee: Alex Newman
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: 
 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly-v2.patch, 
 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly.patch, 6047-92.txt, 
 PutTest.java


 the public method 'has(byte [] family, byte [] qualifier)' internally invoked 
 the private method 'has(byte [] family, byte [] qualifier, long ts, byte [] 
 value, boolean ignoreTS, boolean ignoreValue)' with 'value=new byte[0], 
 ignoreTS=true, ignoreValue=true', but there's a logical error in the body, 
 it'll enter the block
 {code}
 else if (ignoreValue) {
   for (KeyValue kv: list) {
     if (Arrays.equals(kv.getFamily(), family) &&
         Arrays.equals(kv.getQualifier(), qualifier) &&
         kv.getTimestamp() == ts) {
       return true;
     }
   }
 }
 {code}
 The expression 'kv.getTimestamp() == ts' in the if condition should only 
 be evaluated when 'ignoreTS=false'; otherwise, the following code prints false!
 {code}
 Put put = new Put(Bytes.toBytes("row-01"));
 put.add(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01"),
   1234567L, Bytes.toBytes("value-01"));
 System.out.println(put.has(Bytes.toBytes("family-01"),
   Bytes.toBytes("qualifier-01")));
 {code}
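
 The intended matching rule can be sketched as follows. This is only an
 illustration, not the attached patch: SketchCell and PutHasSketch are
 hypothetical stand-ins for KeyValue and Put, and the single loop is an
 assumption about how the four ignoreTS/ignoreValue cases could be folded
 together. The timestamp and value are compared only when the corresponding
 ignore flag is false.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for KeyValue; not the HBase class. */
final class SketchCell {
    final byte[] family;
    final byte[] qualifier;
    final long timestamp;
    final byte[] value;

    SketchCell(byte[] family, byte[] qualifier, long timestamp, byte[] value) {
        this.family = family;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
        this.value = value;
    }
}

final class PutHasSketch {
    /**
     * Returns true if any cell matches family and qualifier, and also matches
     * ts and value unless the corresponding ignore flag is set.
     */
    static boolean has(List<SketchCell> cells, byte[] family, byte[] qualifier,
                       long ts, byte[] value,
                       boolean ignoreTS, boolean ignoreValue) {
        for (SketchCell c : cells) {
            if (Arrays.equals(c.family, family)
                    && Arrays.equals(c.qualifier, qualifier)
                    && (ignoreTS || c.timestamp == ts)
                    && (ignoreValue || Arrays.equals(c.value, value))) {
                return true;
            }
        }
        return false;
    }
}
```

 With ignoreTS=true and ignoreValue=true, a cell written at timestamp 1234567L
 matches regardless of the ts argument, which is the behaviour the reporter
 expects from has(family, qualifier).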

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

[
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283652#comment-13283652
]

Hudson commented on HBASE-6070:
---

Integrated in HBase-0.94 #217 (See
[https://builds.apache.org/job/HBase-0.94/217/])
HBASE-6070 AM.nodeDeleted and SSH races creating problems for regions under
SPLIT (Ramkrishna) (Revision 1342725)

Result = FAILURE
ramkrishna :
Files :
*
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
*
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
*
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java

AM.nodeDeleted and SSH races creating problems for regions under SPLIT
--

Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch,
HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch,
HBASE-6070_trunk_1.patch

[jira] [Commented] (HBASE-6077) Document the most common secure RPC troubleshooting resolutions


[ 
https://issues.apache.org/jira/browse/HBASE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283653#comment-13283653
 ] 

Hudson commented on HBASE-6077:
---

Integrated in HBase-0.94 #217 (See 
[https://builds.apache.org/job/HBase-0.94/217/])
Amend HBASE-6077. Remove stray tag (Revision 1342705)

 Result = FAILURE
apurtell : 
Files : 
* /hbase/branches/0.94/src/docbkx/troubleshooting.xml


 Document the most common secure RPC troubleshooting resolutions
 ---

 Key: HBASE-6077
 URL: https://issues.apache.org/jira/browse/HBASE-6077
 Project: HBase
  Issue Type: Task
  Components: documentation, security
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: 6077.patch


 See attached manual troubleshooting section update.


[jira] [Created] (HBASE-6102) API and shell usability improvements

Andrew Purtell created HBASE-6102:
-

 Summary: API and shell usability improvements
 Key: HBASE-6102
 URL: https://issues.apache.org/jira/browse/HBASE-6102
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell





[jira] [Commented] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

[
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283658#comment-13283658
]

Hudson commented on HBASE-6070:
---

Integrated in HBase-TRUNK #2922 (See
[https://builds.apache.org/job/HBase-TRUNK/2922/])
HBASE-6070 AM.nodeDeleted and SSH races creating problems for regions under
SPLIT (Ramkrishna) (Revision 1342724)

Result = FAILURE
ramkrishna :
Files :
*
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
*
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/Mocking.java
*
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java

AM.nodeDeleted and SSH races creating problems for regions under SPLIT
--

Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch,
HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch,
HBASE-6070_trunk_1.patch

[jira] [Commented] (HBASE-6077) Document the most common secure RPC troubleshooting resolutions


[ 
https://issues.apache.org/jira/browse/HBASE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283659#comment-13283659
 ] 

Hudson commented on HBASE-6077:
---

Integrated in HBase-TRUNK #2922 (See 
[https://builds.apache.org/job/HBase-TRUNK/2922/])
Amend HBASE-6077. Remove stray tag (Revision 1342704)

 Result = FAILURE
apurtell : 
Files : 
* /hbase/trunk/src/docbkx/troubleshooting.xml


 Document the most common secure RPC troubleshooting resolutions
 ---

 Key: HBASE-6077
 URL: https://issues.apache.org/jira/browse/HBASE-6077
 Project: HBase
  Issue Type: Task
  Components: documentation, security
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: 6077.patch


 See attached manual troubleshooting section update.


[jira] [Updated] (HBASE-6002) Possible chance of resource leak in HlogSplitter


 [ 
https://issues.apache.org/jira/browse/HBASE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6002:
--

Attachment: HBASE-6002_trunk_1.patch

Updated patch.  I have introduced one boolean to know whether close has already 
been attempted or not.

 Possible chance of resource leak in HlogSplitter
 

 Key: HBASE-6002
 URL: https://issues.apache.org/jira/browse/HBASE-6002
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0, 0.96.0
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HBASE-6002.patch, HBASE-6002_0.94_1.patch, 
 HBASE-6002_trunk.patch, HBASE-6002_trunk_1.patch


 In HLogSplitter.splitLogFileToTemp, the Reader (in) is not closed. Also, in the 
 finally block, the writers (wap.w) are closed in a loop, so if closing one of 
 them throws an exception, the remaining writers are not closed.
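
 One common way to avoid this kind of leak is to attempt every close and
 rethrow afterwards. The following is a minimal sketch under that assumption;
 it is not taken from the attached patches, and CloseAllSketch and closeAll
 are hypothetical names.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

final class CloseAllSketch {
    /**
     * Attempts to close every resource. The first IOException is rethrown
     * after all closes were attempted; later ones are attached as suppressed.
     */
    static void closeAll(List<? extends Closeable> resources) throws IOException {
        IOException first = null;
        for (Closeable c : resources) {
            try {
                if (c != null) {
                    c.close();
                }
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```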


[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.


[ 
https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283663#comment-13283663
 ] 

ramkrishna.s.vasudevan commented on HBASE-6050:
---

I will commit this tomorrow morning.

 HLogSplitter renaming recovered.edits and CJ removing the parent directory 
 races, making the HBCK to think cluster is inconsistent.
 ---

 Key: HBASE-6050
 URL: https://issues.apache.org/jira/browse/HBASE-6050
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
 Attachments: HBASE-6050.patch


 The scenario is like this:
 - A region is being split.
 - The master has not yet processed the split.
 - The region server goes down.
 - The split log manager starts splitting the logs and creates the 
 recovered.edits in the splitlog path.
 - CJ starts, deletes the entry from META, and has just completed the 
 deletion of the region dir.
 - In HLogSplitter, as the final step, we rename the recovered.edits to come 
 under the regiondir.
 There, if the regiondir does not exist, we create it and then add the 
 recovered.edits.
 Because of this, HBCK considers it an orphan region: we have the 
 regiondir but no regioninfo.
 The cluster is actually fine, but the HBCK report is misleading.
 {code}
     } else {
       Path dstdir = dst.getParent();
       if (!fs.exists(dstdir)) {
         if (!fs.mkdirs(dstdir)) LOG.warn("mkdir failed on " + dstdir);
       }
     }
     fs.rename(src, dst);
     LOG.debug(" moved " + src + " => " + dst);
   } else {
     LOG.debug("Could not move recovered edits from " + src +
         " as it doesn't exist");
   }
 }
 archiveLogs(null, corruptedLogs, processedLogs,
     oldLogDir, fs, conf);
 {code}
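
 One possible way to avoid leaving such a directory behind is to skip the move
 when the region directory is already gone, instead of recreating it. The
 sketch below illustrates that idea with java.nio.file rather than the Hadoop
 FileSystem API. It is an assumption for illustration only, not necessarily
 what HBASE-6050.patch does, and RecoveredEditsMoveSketch is a hypothetical
 name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class RecoveredEditsMoveSketch {
    /**
     * Moves src to dst only when dst's parent directory already exists.
     * Returns true if the file was moved, false if the move was skipped.
     */
    static boolean moveIfRegionDirExists(Path src, Path dst) throws IOException {
        Path regionDir = dst.getParent();
        if (regionDir == null || !Files.isDirectory(regionDir)) {
            // The region dir is gone (for example, already cleaned up):
            // do not recreate it just to hold the recovered edits.
            return false;
        }
        Files.move(src, dst);
        return true;
    }
}
```

 Note that a check followed by a move is still not atomic, so this narrows the
 window rather than closing it.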


[jira] [Updated] (HBASE-6002) Possible chance of resource leak in HlogSplitter


 [ 
https://issues.apache.org/jira/browse/HBASE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6002:
--

Status: Patch Available  (was: Open)

 Possible chance of resource leak in HlogSplitter
 

 Key: HBASE-6002
 URL: https://issues.apache.org/jira/browse/HBASE-6002
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0, 0.96.0
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HBASE-6002.patch, HBASE-6002_0.94_1.patch, 
 HBASE-6002_trunk.patch, HBASE-6002_trunk_1.patch


 In HLogSplitter.splitLogFileToTemp, the Reader (in) is not closed. Also, in the 
 finally block, the writers (wap.w) are closed in a loop, so if closing one of 
 them throws an exception, the remaining writers are not closed.


[jira] [Updated] (HBASE-6002) Possible chance of resource leak in HlogSplitter


 [ 
https://issues.apache.org/jira/browse/HBASE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6002:
--

Status: Open  (was: Patch Available)

 Possible chance of resource leak in HlogSplitter
 

 Key: HBASE-6002
 URL: https://issues.apache.org/jira/browse/HBASE-6002
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0, 0.96.0
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HBASE-6002.patch, HBASE-6002_0.94_1.patch, 
 HBASE-6002_trunk.patch, HBASE-6002_trunk_1.patch


 In HLogSplitter.splitLogFileToTemp, the Reader (in) is not closed. Also, in the 
 finally block, the writers (wap.w) are closed in a loop, so if closing one of 
 them throws an exception, the remaining writers are not closed.


[jira] [Updated] (HBASE-6098) ACL design changes


 [ 
https://issues.apache.org/jira/browse/HBASE-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-6098:
--

  Component/s: security
Affects Version/s: 0.94.1
   0.96.0

 ACL design changes
 --

 Key: HBASE-6098
 URL: https://issues.apache.org/jira/browse/HBASE-6098
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.96.0, 0.94.1
Reporter: Andrew Purtell




[jira] [Updated] (HBASE-6101) Insure Observers cover all relevant RPC and lifecycle code paths


 [ 
https://issues.apache.org/jira/browse/HBASE-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-6101:
--

  Component/s: security
   regionserver
   master
   coprocessors
Affects Version/s: 0.94.1
   0.96.0
  Summary: Insure Observers cover all relevant RPC and lifecycle 
code paths  (was: Insure Observers cover all RPC and lifecycle code paths)

 Insure Observers cover all relevant RPC and lifecycle code paths
 

 Key: HBASE-6101
 URL: https://issues.apache.org/jira/browse/HBASE-6101
 Project: HBase
  Issue Type: Sub-task
  Components: coprocessors, master, regionserver, security
Affects Versions: 0.96.0, 0.94.1
Reporter: Andrew Purtell




[jira] [Updated] (HBASE-6099) Secure ZooKeeper integration changes


 [ 
https://issues.apache.org/jira/browse/HBASE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-6099:
--

  Component/s: zookeeper
   shell
   security
   client
Affects Version/s: 0.94.1
   0.96.0

 Secure ZooKeeper integration changes
 

 Key: HBASE-6099
 URL: https://issues.apache.org/jira/browse/HBASE-6099
 Project: HBase
  Issue Type: Sub-task
  Components: client, security, shell, zookeeper
Affects Versions: 0.96.0, 0.94.1
Reporter: Andrew Purtell




[jira] [Updated] (HBASE-6102) API and shell usability improvements


 [ 
https://issues.apache.org/jira/browse/HBASE-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-6102:
--

  Component/s: shell
   security
   master
   client
Affects Version/s: 0.94.1
   0.96.0

 API and shell usability improvements
 

 Key: HBASE-6102
 URL: https://issues.apache.org/jira/browse/HBASE-6102
 Project: HBase
  Issue Type: Sub-task
  Components: client, master, security, shell
Affects Versions: 0.96.0, 0.94.1
Reporter: Andrew Purtell




[jira] [Commented] (HBASE-6077) Document the most common secure RPC troubleshooting resolutions


[ 
https://issues.apache.org/jira/browse/HBASE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283670#comment-13283670
 ] 

Hudson commented on HBASE-6077:
---

Integrated in HBase-0.92 #420 (See 
[https://builds.apache.org/job/HBase-0.92/420/])
Amend HBASE-6077. Remove stray tag (Revision 1342706)
Amend HBASE-6077. Replace HTML formatting that does not work with Docbook 
(Revision 1342383)

 Result = FAILURE
apurtell : 
Files : 
* /hbase/branches/0.92/src/docbkx/troubleshooting.xml

apurtell : 
Files : 
* /hbase/branches/0.92/src/docbkx/troubleshooting.xml


 Document the most common secure RPC troubleshooting resolutions
 ---

 Key: HBASE-6077
 URL: https://issues.apache.org/jira/browse/HBASE-6077
 Project: HBase
  Issue Type: Task
  Components: documentation, security
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: 6077.patch


 See attached manual troubleshooting section update.


[jira] [Updated] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs

[
https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Matteo Bertozzi updated HBASE-6068:
---

Attachment: HBASE-6068-v1.patch

Missed one in the list: the hbase shell calls ZooKeeper directly for the zk_dump command.
zk_dump -> listChildrenNoWatch() /hbase/backup-masters/*

Secure HBase cluster : Client not able to call some admin APIs
--

[jira] [Commented] (HBASE-6002) Possible chance of resource leak in HlogSplitter

[
https://issues.apache.org/jira/browse/HBASE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283688#comment-13283688
]

Hadoop QA commented on HBASE-6002:
--

-1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12529760/HBASE-6002_trunk_1.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 33 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:

org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher
org.apache.hadoop.hbase.master.TestSplitLogManager

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1999//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1999//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1999//console

This message is automatically generated.

Possible chance of resource leak in HlogSplitter

Key: HBASE-6002
URL: https://issues.apache.org/jira/browse/HBASE-6002
Project: HBase
Issue Type: Bug
Components: wal
Affects Versions: 0.94.0, 0.96.0
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
Attachments: HBASE-6002.patch, HBASE-6002_0.94_1.patch,
HBASE-6002_trunk.patch, HBASE-6002_trunk_1.patch

In HLogSplitter.splitLogFileToTemp, the Reader (in) is not closed. Also, in the
finally block, the writers (wap.w) are closed in a loop, so if closing one of
them throws an exception, the remaining writers are not closed.

[jira] [Created] (HBASE-6103) HBaseServer shall read and deserialize data from each connection in parallel

2012-05-25 Thread Liyin Tang (JIRA)

Liyin Tang created HBASE-6103:
-

 Summary: HBaseServer shall read and deserialize data from each 
connection in parallel
 Key: HBASE-6103
 URL: https://issues.apache.org/jira/browse/HBASE-6103
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang


Currently HBaseServer runs with a single listener thread, which is 
responsible for accepting connections, reading the data from the network 
channel, deserializing the data into writable objects, and handing them over 
to the IPC handler threads. 

So when multiple HBase clients connect to the region server 
(HBaseServer) and read/write a large set of data, this listener thread 
becomes a performance bottleneck. 

Ideally, the listener thread should only accept the connection and hand the 
connection over to the IPC threads directly, so that each IPC thread reads the 
data from the network channel, deserializes the data, and executes the Call. 

In this way, the HBaseServer can read and deserialize data from each connection 
in parallel.
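
The proposed division of labour can be sketched roughly as below. This is a
simplified illustration, not HBaseServer code: connections are modelled as
byte arrays, "deserialization" is just UTF-8 decoding, and ParallelReadSketch,
serve and readAndDeserialize are hypothetical names. The "listener" loop only
submits work; the pool threads do the reading and deserializing.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelReadSketch {
    /** Stand-in for "read from channel and deserialize"; decodes UTF-8. */
    static String readAndDeserialize(byte[] rawRequest) {
        return new String(rawRequest, StandardCharsets.UTF_8);
    }

    /** The listener only hands each connection to the handler pool. */
    static List<String> serve(List<byte[]> connections, int handlerThreads)
            throws Exception {
        ExecutorService handlers = Executors.newFixedThreadPool(handlerThreads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (byte[] conn : connections) {
                Callable<String> task = () -> readAndDeserialize(conn);
                pending.add(handlers.submit(task));
            }
            // Collect results in submission order.
            List<String> calls = new ArrayList<>();
            for (Future<String> f : pending) {
                calls.add(f.get());
            }
            return calls;
        } finally {
            handlers.shutdown();
        }
    }
}
```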







[jira] [Commented] (HBASE-6002) Possible chance of resource leak in HlogSplitter


[ 
https://issues.apache.org/jira/browse/HBASE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283702#comment-13283702
 ] 

Zhihong Yu commented on HBASE-6002:
---

Latest patch looks Okay.
Check out the failed tests.

 Possible chance of resource leak in HlogSplitter
 

 Key: HBASE-6002
 URL: https://issues.apache.org/jira/browse/HBASE-6002
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0, 0.96.0
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HBASE-6002.patch, HBASE-6002_0.94_1.patch, 
 HBASE-6002_trunk.patch, HBASE-6002_trunk_1.patch


 In HLogSplitter.splitLogFileToTemp, the Reader (in) is not closed. Also, in the 
 finally block, the writers (wap.w) are closed in a loop, so if closing one of 
 them throws an exception, the remaining writers are not closed.


[jira] [Commented] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs

[
https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283704#comment-13283704
]

Hadoop QA commented on HBASE-6068:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12529764/HBASE-6068-v1.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 33 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:

org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/2000//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/2000//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/2000//console

This message is automatically generated.

Secure HBase cluster : Client not able to call some admin APIs
--

[jira] [Created] (HBASE-6104) Require EXEC permission to call coprocessor endpoints

2012-05-25 Thread Gary Helmling (JIRA)

Gary Helmling created HBASE-6104:


 Summary: Require EXEC permission to call coprocessor endpoints
 Key: HBASE-6104
 URL: https://issues.apache.org/jira/browse/HBASE-6104
 Project: HBase
  Issue Type: Sub-task
  Components: coprocessors, security
Reporter: Gary Helmling


The EXEC action currently exists as only a placeholder in access control.  It 
should really be used to enforce access to coprocessor endpoint RPC calls, 
which are currently unrestricted.

How the ACLs to support this would be modeled deserves some discussion:
* Should access be scoped to a specific table and CoprocessorProtocol extension?
* Should it be possible to grant access to a CoprocessorProtocol implementation 
globally (regardless of table)?
* Are per-method restrictions necessary?
* Should we expose hooks available to endpoint implementors so that they could 
additionally apply their own permission checks? Some CP endpoints may want to 
require READ permissions, others may want to enforce WRITE, or READ + WRITE.

To apply these kinds of checks we would also have to extend the RegionObserver 
interface to provide hooks wrapping HRegion.exec().


[jira] [Commented] (HBASE-5986) Clients can see holes in the META table when regions are being split


[ 
https://issues.apache.org/jira/browse/HBASE-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283708#comment-13283708
 ] 

Zhihong Yu commented on HBASE-5986:
---

@Enis:
Did you have a chance to run the backports through their respective test suites?

Thanks

 Clients can see holes in the META table when regions are being split
 

 Key: HBASE-5986
 URL: https://issues.apache.org/jira/browse/HBASE-5986
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: 5986-v2.txt, HBASE-5986-0.92.patch, 
 HBASE-5986-0.94.patch, HBASE-5986-test_v1.patch, HBASE-5986_v3.patch


 We found this issue when running large scale ingestion tests for HBASE-5754. 
 The problem is that the .META. table updates are not atomic while splitting a 
 region. In SplitTransaction, there is a time gap between marking the 
 parent offline and adding the daughters to the META table. This can result in 
 clients using MetaScanner, or HTable.getStartEndKeys (used by the 
 TableInputFormat), missing regions which have just been taken offline while the 
 daughters are not added yet. 
 This is also related to HBASE-4335. 
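
 To make the "hole" concrete: a client that lists regions sorted by start key
 expects each region's end key to equal the next region's start key. The
 sketch below checks that invariant. It is illustrative only; MetaHoleSketch
 and Region are hypothetical types using String keys, not the MetaScanner API.

```java
import java.util.List;

final class MetaHoleSketch {
    /** Hypothetical region descriptor covering [startKey, endKey). */
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Returns true if consecutive regions, sorted by start key, leave a gap. */
    static boolean hasHole(List<Region> sorted) {
        for (int i = 1; i < sorted.size(); i++) {
            String previousEnd = sorted.get(i - 1).endKey;
            String nextStart = sorted.get(i).startKey;
            if (!previousEnd.equals(nextStart)) {
                return true;
            }
        }
        return false;
    }
}
```

 If the parent region covering [b, d) is already offline and its daughters are
 not yet in META, a scan sees [a, b) followed by [d, e), and hasHole reports
 the gap.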


[jira] [Created] (HBASE-6105) Sweep all INFO level logging and aggressively drop to DEBUG, and from DEBUG to TRACE

Andrew Purtell created HBASE-6105:
-

 Summary: Sweep all INFO level logging and aggressively drop to 
DEBUG, and from DEBUG to TRACE
 Key: HBASE-6105
 URL: https://issues.apache.org/jira/browse/HBASE-6105
 Project: HBase
  Issue Type: Task
Affects Versions: 0.96.0
Reporter: Andrew Purtell


Speaking with Arjen from Facebook ops at HBaseCon, I asked what his single 
request for improving HBase operability would be. The answer was to 
be less verbose at INFO log level. For example, with many regions opening, 
anomalous events can be difficult to pick out among the 5-6 INFO level messages 
per region deployment. Where multiple INFO level messages are printed in close 
succession, we should consider coalescing them. For all INFO level messages, we 
should be aggressive about demoting them to DEBUG level. And, since we are now 
increasing the verbosity at DEBUG level, the same considerations should be 
applied there, with coalescing and demotion of really detailed/low level 
logging to TRACE.

Consider making this a blocker.


[jira] [Updated] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-25 Thread Andrew Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-5892:
---

Attachment: hbase-5892-1.patch

 [hbck] Refactor parallel WorkItem* to Futures.
 --

 Key: HBASE-5892
 URL: https://issues.apache.org/jira/browse/HBASE-5892
 Project: HBase
  Issue Type: Improvement
Reporter: Jonathan Hsieh
Assignee: Andrew Wang
  Labels: noob
 Attachments: hbase-5892-1.patch, hbase-5892.patch


 This would convert WorkItem* logic (with low level notifies, and rough 
 exception handling)  into a more canonical Futures pattern.
 Currently there are two instances of this pattern (for loading hdfs dirs, for 
 contacting regionservers for assignments, and soon -- for loading hdfs 
 .regioninfo files).
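
 The Futures pattern referred to here looks roughly like the sketch below. It
 is a generic illustration, not the code in the attached patches: FuturesSketch
 and loadAll are hypothetical names, and the "work" per directory is a
 placeholder that returns the length of the directory name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class FuturesSketch {
    static List<Integer> loadAll(List<String> dirs, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> work = new ArrayList<>();
            for (String dir : dirs) {
                // Placeholder work item: "loading" a dir measures its name.
                work.add(() -> dir.length());
            }
            List<Integer> results = new ArrayList<>();
            // invokeAll blocks until every task has completed, so no manual
            // wait/notify bookkeeping is needed.
            for (Future<Integer> f : pool.invokeAll(work)) {
                // get() rethrows a task failure as ExecutionException.
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

 Results come back in the same order as the submitted tasks, and a failure in
 any task surfaces at the corresponding get() call.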


[jira] [Commented] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-25 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283714#comment-13283714
 ] 

Andrew Wang commented on HBASE-5892:


Ran TestHBaseFsck and had to fix a null pointer, hence the new version of the patch. 
I'll port it to prior versions too.

 [hbck] Refactor parallel WorkItem* to Futures.
 --

 Key: HBASE-5892
 URL: https://issues.apache.org/jira/browse/HBASE-5892
 Project: HBase
  Issue Type: Improvement
Reporter: Jonathan Hsieh
Assignee: Andrew Wang
  Labels: noob
 Attachments: hbase-5892-1.patch, hbase-5892.patch


 This would convert WorkItem* logic (with low level notifies, and rough 
 exception handling)  into a more canonical Futures pattern.
 Currently there are two instances of this pattern (for loading hdfs dirs, for 
 contacting regionservers for assignments, and soon -- for loading hdfs 
 .regioninfo files).


[jira] [Commented] (HBASE-6104) Require EXEC permission to call coprocessor endpoints

[
https://issues.apache.org/jira/browse/HBASE-6104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283720#comment-13283720
]

Andrew Purtell commented on HBASE-6104:
---

bq. To apply these kinds of checks we would also have to extend the
RegionObserver interface to provide hooks wrapping HRegion.exec().

+1 on this at a minimum.

bq. Should access be scoped to a specific table and CoprocessorProtocol
extension?

bq. Should it be possible to grant access to a CoprocessorProtocol
implementation globally (regardless of table)?

bq. Are per-method restrictions necessary?

For the sake of simplicity, I suggest considering an EXEC permission per CF. So
that would allow the user or group specified in the grant to execute any
coprocessors installed in the region for the given CF. We can do more, but it
would be good to be informed by a specific use case then.
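
As a rough illustration of the per-CF idea, the grant table could be as simple
as a map from principal to the column families it may execute coprocessors on.
This is a toy model under that assumption, not the AccessController API;
ExecAclSketch, grantExec and mayExec are hypothetical names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class ExecAclSketch {
    // principal -> column families on which EXEC has been granted
    private final Map<String, Set<String>> execGrants = new HashMap<>();

    void grantExec(String user, String family) {
        execGrants.computeIfAbsent(user, u -> new HashSet<>()).add(family);
    }

    /** True only if EXEC was explicitly granted to user for this family. */
    boolean mayExec(String user, String family) {
        Set<String> families = execGrants.get(user);
        return families != null && families.contains(family);
    }
}
```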

Require EXEC permission to call coprocessor endpoints
-

Key: HBASE-6104
URL: https://issues.apache.org/jira/browse/HBASE-6104
Project: HBase
Issue Type: Sub-task
Components: coprocessors, security
Reporter: Gary Helmling

The EXEC action currently exists as only a placeholder in access control. It
should really be used to enforce access to coprocessor endpoint RPC calls,
which are currently unrestricted.
How the ACLs to support this would be modeled deserves some discussion:
* Should access be scoped to a specific table and CoprocessorProtocol
extension?
* Should it be possible to grant access to a CoprocessorProtocol
implementation globally (regardless of table)?
* Are per-method restrictions necessary?
* Should we expose hooks available to endpoint implementors so that they
could additionally apply their own permission checks? Some CP endpoints may
want to require READ permissions, others may want to enforce WRITE, or READ +
WRITE.
To apply these kinds of checks we would also have to extend the
RegionObserver interface to provide hooks wrapping HRegion.exec().
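
As a rough sketch of what hooks wrapping the exec call could look like, the following models a pre/post pair around an endpoint invocation. ExecObserver, ExecDispatcher, preExec and postExec are hypothetical names chosen for illustration; they are not part of the RegionObserver interface, and the real signatures would need to carry the HBase call context rather than plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical hook interface for illustration; not an HBase interface.
interface ExecObserver {
    // Runs before the endpoint; throw SecurityException to deny the call.
    void preExec(String user, String protocol, String method);

    // Runs after the endpoint has returned normally.
    void postExec(String user, String protocol, String method);
}

// Hypothetical stand-in for the code path that would wrap the endpoint call.
class ExecDispatcher {
    private final List<ExecObserver> observers = new ArrayList<>();

    void register(ExecObserver observer) {
        observers.add(observer);
    }

    <T> T exec(String user, String protocol, String method, Supplier<T> endpoint) {
        // Every pre hook must pass before the endpoint is invoked.
        for (ExecObserver o : observers) {
            o.preExec(user, protocol, method);
        }
        T result = endpoint.get();
        // Post hooks are skipped if a pre hook or the endpoint threw.
        for (ExecObserver o : observers) {
            o.postExec(user, protocol, method);
        }
        return result;
    }
}
```

An access controller registered as an observer could then enforce EXEC in preExec, and endpoint implementors could register their own observer to require READ or WRITE in addition.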

[jira] [Comment Edited] (HBASE-6104) Require EXEC permission to call coprocessor endpoints

[
https://issues.apache.org/jira/browse/HBASE-6104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283720#comment-13283720
]

Andrew Purtell edited comment on HBASE-6104 at 5/25/12 8:02 PM:

bq. To apply these kinds of checks we would also have to extend the
RegionObserver interface to provide hooks wrapping HRegion.exec().

+1 on this at a minimum.

bq. Should access be scoped to a specific table and CoprocessorProtocol
extension?

bq. Should it be possible to grant access to a CoprocessorProtocol
implementation globally (regardless of table)?

bq. Are per-method restrictions necessary?

Edit: This implies some additional interface that informs the coprocessor what
CFs the principal has access rights to.

Require EXEC permission to call coprocessor endpoints
-

Key: HBASE-6104
URL: https://issues.apache.org/jira/browse/HBASE-6104
Project: HBase
Issue Type: Sub-task
Components: coprocessors, security
Reporter: Gary Helmling

[jira] [Commented] (HBASE-5498) Secure Bulk Load

[
https://issues.apache.org/jira/browse/HBASE-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283724#comment-13283724
]

Andrew Purtell commented on HBASE-5498:
---

@Francis, do you have any work, even if in a partially completed state?

Secure Bulk Load

Key: HBASE-5498
URL: https://issues.apache.org/jira/browse/HBASE-5498
Project: HBase
Issue Type: Improvement
Reporter: Francis Liu

Design doc:
https://cwiki.apache.org/confluence/display/HCATALOG/HBase+Secure+Bulk+Load
Short summary:
Security as it stands does not cover the bulkLoadHFiles() feature. Users
calling this method will bypass ACLs. Loading is also more cumbersome in a
secure setting because of hdfs privileges: bulkLoadHFiles() moves the data
from the user's directory to the hbase directory, which requires certain
write access privileges to be set.
Our solution is to create a coprocessor which uses AuthManager to verify
that a user has write access to the table. If so, it launches an MR job as
the hbase user to do the importing (i.e., rewrite from text to hfiles). One
tricky part is that this job will have to impersonate the calling user when
reading the input files. We can do this by expecting the user to pass an hdfs
delegation token as part of the secureBulkLoad() coprocessor call and
extending an inputformat to make use of that token. The output is written to a
temporary directory accessible only by hbase, and then bulkLoadHFiles() is
called.

[jira] [Updated] (HBASE-6104) Require EXEC permission to call coprocessor endpoints