[jira] [Commented] (HBASE-7318) Add verbose logging option to HConnectionManager

2013-01-08 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547740#comment-13547740
 ] 

stack commented on HBASE-7318:
--

I tried some of the failed tests above.  For TestFromClientSide, it complains:

Failed tests:   testPutNoCF(org.apache.hadoop.hbase.client.TestFromClientSide): 
Should throw NoSuchColumnFamilyException

Do you get that Sergey w/ your patch? (I don't see this when I remove your 
patch.)

Let me know so I can commit.

> Add verbose logging option to HConnectionManager
> 
>
> Key: HBASE-7318
> URL: https://issues.apache.org/jira/browse/HBASE-7318
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 7318-v2.patch, HBASE-7318-v0.patch, HBASE-7318-v1.patch
>
>
> In the course of HBASE-7250 I found that client-side errors (as well as 
> server-side errors, but that's another question) are hard to debug.
> I have some local commits with useful, not-that-hacky HConnectionManager 
> logging added.
> Need to "productionize" it to be off by default but easy-to-enable for 
> debugging.
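
As an aside, a minimal sketch of the "off by default but easy to enable" idea: gate the 
extra client-side logging behind a boolean configuration key. The key name and class here 
are hypothetical illustrations, not what the patch adds.

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;

public class VerboseClientLoggingSketch {
  private static final Log LOG = LogFactory.getLog(VerboseClientLoggingSketch.class);
  // hypothetical key, for illustration only
  static final String VERBOSE_KEY = "hbase.client.verbose.logging";

  private final boolean verbose;

  VerboseClientLoggingSketch(Configuration conf) {
    this.verbose = conf.getBoolean(VERBOSE_KEY, false);  // off unless explicitly enabled
  }

  void logRetry(String server, int attempt, Throwable error) {
    if (verbose && LOG.isDebugEnabled()) {               // cheap guard before building the message
      LOG.debug("Retry " + attempt + " against " + server, error);
    }
  }
}
{code}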

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7523) Snapshot attempt with the name of a previously taken snapshots fails sometimes.

2013-01-08 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-7523:
--

Summary: Snapshot attempt with the name of a previously taken snapshots 
fails sometimes.  (was: Snapshot attempt with the name of a previously taken 
fails sometimes.)

> Snapshot attempt with the name of a previously taken snapshots fails 
> sometimes.
> ---
>
> Key: HBASE-7523
> URL: https://issues.apache.org/jira/browse/HBASE-7523
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jonathan Hsieh
>
> In a test rig, we repeatedly snapshot, clone and delete a table using the same 
> set of snapshot names.  Sometimes the snapshot request will be rejected until 
> the HMaster is restarted.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-7523) Snapshot attempt with the name of a previously taken fails sometimes.

2013-01-08 Thread Jonathan Hsieh (JIRA)
Jonathan Hsieh created HBASE-7523:
-

 Summary: Snapshot attempt with the name of a previously taken 
fails sometimes.
 Key: HBASE-7523
 URL: https://issues.apache.org/jira/browse/HBASE-7523
 Project: HBase
  Issue Type: Sub-task
Reporter: Jonathan Hsieh


In a test rig, we repeatedly snapshot, clone and delete a table using the same 
set of snapshot names.  Sometimes the snapshot request will be rejected until 
the HMaster is restarted.



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HBASE-7344) subprocedure initialization fails with invalid znode data.

2013-01-08 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh resolved HBASE-7344.
---

Resolution: Duplicate
  Assignee: Jonathan Hsieh

This was fixed during review of HBASE-7212.  We added and used 
ZKUtil#createWithParents(ZKW, znode, byte[] data), which atomically writes the 
data into the znode during creation (instead of creating an empty znode and 
then adding data).

The older method exposed the possibility of reading an empty 
SnapshotDescription.  This is not possible anymore.
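
For illustration, the difference between the two creation patterns, sketched with the raw 
ZooKeeper client rather than the actual ZKUtil code:

{code}
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;

public class ZnodeCreateSketch {
  // old pattern: the znode is briefly visible with no data, so a watcher can read it empty
  static void createThenSetData(ZooKeeper zk, String znode, byte[] data, List<ACL> acl)
      throws KeeperException, InterruptedException {
    zk.create(znode, new byte[0], acl, CreateMode.PERSISTENT);
    zk.setData(znode, data, -1);
  }

  // new pattern: the znode appears already carrying its payload in a single create
  static void createWithData(ZooKeeper zk, String znode, byte[] data, List<ACL> acl)
      throws KeeperException, InterruptedException {
    zk.create(znode, data, acl, CreateMode.PERSISTENT);
  }
}
{code}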

> subprocedure initialization fails with invalid znode data.
> --
>
> Key: HBASE-7344
> URL: https://issues.apache.org/jira/browse/HBASE-7344
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>
> Sometimes snapshot subprocedures fail to start on the RS because the data read 
> from ZK is bad.
> {code}
> 2012-12-13 07:22:55,238 ERROR 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Illegal argument 
> exception
> java.lang.IllegalArgumentException: Could not read snapshot information from 
> request.
> at 
> org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedureBuilder.buildSubprocedure(RegionServerSnapshotManager.java:284)
> at 
> org.apache.hadoop.hbase.procedure.ProcedureMember.createSubprocedure(ProcedureMember.java:98)
> at 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:199)
> at 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
> at 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
> at 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 2012-12-13 07:22:55,239 ERROR 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Failed due to null 
> subprocedure
> Local ForeignThreadException from null
> at 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:203)
> at 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
> at 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
> at 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> Caused by: java.lang.IllegalArgumentException: Could not read snapshot 
> information from request.
> at 
> org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedureBuilder.buildSubprocedure(RegionServerSnapshotManager.java:284)
> at 
> org.apache.hadoop.hbase.procedure.ProcedureMember.createSubprocedure(ProcedureMember.java:98)
> at 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:199)
> ... 6 more
> 2012-12-13 07:22:55,239 ERROR org.apache.zookeeper.ClientCnxn: Error while 
> calling watcher 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.sendMemberAborted(ZKProcedureMemberRpcs.java:266)
> at 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:203)
> at 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
> at 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
> at 
> org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-6386) Audit log messages do not include column family / qualifier information consistently

2013-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547710#comment-13547710
 ] 

Hudson commented on HBASE-6386:
---

Integrated in HBase-TRUNK #3714 (See 
[https://builds.apache.org/job/HBase-TRUNK/3714/])
HBASE-6386 Audit log messages do not include column family / qualifier 
information consistently (Marcelo Vanzin) (Revision 1430691)

 Result = FAILURE
mbertozzi : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AuthResult.java


> Audit log messages do not include column family / qualifier information 
> consistently
> 
>
> Key: HBASE-6386
> URL: https://issues.apache.org/jira/browse/HBASE-6386
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 0.96.0
>Reporter: Marcelo Vanzin
>Assignee: Matteo Bertozzi
> Attachments: hbase-6386-v1.patch, hbase-6386-v2.patch, 
> HBASE-6386-v3.patch, HBASE-6386-v4.patch
>
>
> The code related to this issue is in 
> AccessController.java:permissionGranted().
> When creating audit logs, that method will do one of the following:
> * grant access, create audit log with table name only
> * deny access because of table permission, create audit log with table name 
> only
> * deny access because of column family / qualifier permission, create audit 
> log with specific family / qualifier
> So, in the case where more than one column family and/or qualifier are in the 
> same request, there will be a loss of information. Even in the case where 
> only one column family and/or qualifier is involved, information may be lost.
> It would be better if this behavior consistently included all the information 
> in the request; regardless of access being granted or denied, and regardless 
> which permission caused the denial, the column family and qualifier info 
> should be part of the audit log message.
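
As a rough illustration of the proposal (not the AccessController code), the audit message 
could render the full requested family/qualifier map regardless of which check granted or 
denied access; the map layout and output format below are assumptions.

{code}
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class AuditFamiliesSketch {
  // render every requested family and its qualifiers; an empty set means "whole family"
  static String describe(Map<String, Set<String>> familyMap) {
    StringBuilder sb = new StringBuilder("families {");
    for (Map.Entry<String, Set<String>> e : familyMap.entrySet()) {
      sb.append(' ').append(e.getKey()).append(':')
        .append(e.getValue().isEmpty() ? "ALL" : e.getValue());
    }
    return sb.append(" }").toString();
  }

  public static void main(String[] args) {
    Map<String, Set<String>> families = new TreeMap<String, Set<String>>();
    families.put("cf1", new TreeSet<String>(Arrays.asList("q1", "q2")));
    families.put("cf2", new TreeSet<String>());
    System.out.println(describe(families)); // families { cf1:[q1, q2] cf2:ALL }
  }
}
{code}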

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7477) Remove Proxy instance from HBase RPC

2013-01-08 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7477:
-

Attachment: 7477experiment.txt

More experimenting with making the engine return a pb Service (follow-on from HBASE-6521 
and from a chat Elliott and I were having earlier today).

The attached patch tries to keep the current engine model, only instead of 
getProxy and stopProxy it has start and stop methods that return a pb Service 
instance instead.

But it won't work as is.  pb Service won't let us go this route.  pb Service 
would have us jettison all of this engine stuff too and just deal in Stub 
creations.

Putting it aside for now.

> Remove Proxy instance from HBase RPC
> 
>
> Key: HBASE-7477
> URL: https://issues.apache.org/jira/browse/HBASE-7477
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Karthik Ranganathan
> Attachments: 7477experiment.txt
>
>
> Currently, we use HBaseRPC.getProxy() to get an Invoker object to serialize 
> the RPC parameters. This is pretty inefficient as it uses reflection to 
> look up the current method name.
> The aim is to break up the proxy into an actual proxy implementation so that:
> 1. we can make it more efficient by eliminating reflection
> 2. we can rewrite some parts of the protocol to make it even better
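
A generic sketch of the reflection-based proxy pattern described above (not the actual 
HBaseRPC code): every call funnels through a single invoke(), which must inspect the Method 
object per call, the overhead the issue wants to eliminate. The protocol interface here is 
a stand-in.

{code}
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class ReflectionRpcSketch {
  // stand-in for a real protocol interface
  interface ClientProtocol {
    String get(String row);
  }

  @SuppressWarnings("unchecked")
  static <T> T getProxy(Class<T> protocol) {
    InvocationHandler invoker = new InvocationHandler() {
      public Object invoke(Object proxy, Method method, Object[] args) {
        // the method name and arguments are discovered reflectively on every call
        System.out.println("rpc call: " + method.getName() + "(" + args[0] + ")");
        return "stubbed-response";
      }
    };
    return (T) Proxy.newProxyInstance(protocol.getClassLoader(),
        new Class<?>[] { protocol }, invoker);
  }

  public static void main(String[] args) {
    ClientProtocol client = getProxy(ClientProtocol.class);
    System.out.println(client.get("row1"));
  }
}
{code}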

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7403) Online Merge

2013-01-08 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-7403:


Attachment: hbase-7403-trunkv8.patch

Improved and added test cases per Ted's suggestion.
TestMergeTransaction#testRedoMergeWhenServerRestart tests several 
server-restart cases while merging regions.

> Online Merge
> 
>
> Key: HBASE-7403
> URL: https://issues.apache.org/jira/browse/HBASE-7403
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7403-trunkv5.patch, 7403-trunkv6.patch, 7403v5.diff, 
> 7403-v5.txt, 7403v5.txt, hbase-7403-94v1.patch, hbase-7403-trunkv1.patch, 
> hbase-7403-trunkv5.patch, hbase-7403-trunkv6.patch, hbase-7403-trunkv7.patch, 
> hbase-7403-trunkv8.patch, merge region.pdf
>
>
> The features of this online merge:
> 1. Online: no need to disable the table
> 2. Little change to current code; it could be applied to trunk, 0.94, 0.92 or 0.90
> 3. Easy to call a merge request: no need to input a long region name, the 
> encoded name is enough
> 4. No restrictions during operation: you don't need to take care of events like 
> server death, balancing, splits, or disabling/enabling a table, nor whether you 
> sent a wrong merge request; that is already handled for you
> 5. Only a little offline time for the two merging regions
> We need merge in the following cases:
> 1. Region hole or region overlap that can't be fixed by hbck
> 2. Regions become empty because of TTL or an unreasonable rowkey design
> 3. Regions are always empty or very small because of presplitting at table 
> creation
> 4. Too many empty or small regions reduce system performance (e.g. MSLAB)
> The current merge tool only supports offline merging and is not able to redo if 
> an exception is thrown in the process of merging, leaving dirty data.
> For an online system, we need an online merge.
> The implementation logic of this patch for online merge is as follows.
> For example, to merge regionA and regionB into regionC:
> 1. Offline the two regions A and B
> 2. Merge the two regions in HDFS (create regionC's directory, move regionA's 
> and regionB's files to regionC's directory, delete regionA's and regionB's 
> directories); a sketch of this step appears after this description
> 3. Add the merged regionC to .META.
> 4. Assign the merged regionC
> By the design of this patch, once we do the merge work in HDFS, we can redo it 
> until it succeeds if it throws an exception, aborts, or the server restarts, 
> but it cannot be rolled back.
> It depends on:
> Using zookeeper to record the transaction journal state, making redo easier
> Using zookeeper to send/receive merge requests
> The merge transaction being executed on the master
> Supporting merge requests through the API or the shell tool
> About the merge process, please see the attachment and patch
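
A minimal sketch of the HDFS part of step 2, using plain Hadoop FileSystem calls and a 
simplified flat directory layout (no column-family subdirectories or .regioninfo handling); 
the real patch additionally journals the transaction in ZooKeeper, updates .META. and 
assigns the merged region.

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MergeRegionDirsSketch {
  static void mergeRegionDirs(FileSystem fs, Path regionA, Path regionB, Path regionC)
      throws IOException {
    fs.mkdirs(regionC);                                       // create regionC's directory
    for (Path src : new Path[] { regionA, regionB }) {
      for (FileStatus f : fs.listStatus(src)) {               // move regionA's and regionB's files
        fs.rename(f.getPath(), new Path(regionC, f.getPath().getName()));
      }
      fs.delete(src, true);                                   // delete the old region directory
    }
  }
}
{code}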

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS

2013-01-08 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547681#comment-13547681
 ] 

rajeshbabu commented on HBASE-7504:
---

Understood, sorry for the misunderstanding. Thanks.

> -ROOT- may be offline forever after FullGC of  RS
> -
>
> Key: HBASE-7504
> URL: https://issues.apache.org/jira/browse/HBASE-7504
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7504-trunk v1.patch, 7504-trunk v2.patch
>
>
> 1. Full GC happens on the ROOT regionserver.
> 2. The ZK session times out; the master expires the regionserver and submits 
> it to ServerShutdownHandler
> 3. The regionserver completes the full GC
> 4. In the process of ServerShutdownHandler, verifyRootRegionLocation returns 
> true
> 5. ServerShutdownHandler skips assigning the ROOT region
> 6. The regionserver aborts itself because it receives a YouAreDeadException 
> after a regionserver report
> 7. ROOT is now offline and won't be assigned any more unless we restart the master
> Master Log:
> {code}
> 2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted 
> shutdown handler to be executed, root=true, meta=false
> 2012-10-31 19:51:39,045 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs 
> for dw88.kgb.sqa.cm4,60020,1351671478752
> 2012-10-31 19:51:50,113 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server 
> dw88.kgb.sqa.cm4,60020,1351671478752 was carrying ROOT. Trying to assign.
> 2012-10-31 19:52:15,939 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Server REPORT rejected; currently processing 
> dw88.kgb.sqa.cm4,60020,1351671478752 as dead server
> 2012-10-31 19:52:15,945 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Skipping log 
> splitting for dw88.kgb.sqa.cm4,60020,1351671478752
> {code}
> No log of assigning ROOT
> Regionserver log:
> {code}
> 2012-10-31 19:52:15,923 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
> 229128ms instead of 10ms, this is likely due to a long garbage collecting 
> pause and it's usually bad, see 
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7479) Remove VersionedProtocol and ProtocolSignature from RPC

2013-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547676#comment-13547676
 ] 

Hudson commented on HBASE-7479:
---

Integrated in HBase-TRUNK #3713 (See 
[https://builds.apache.org/job/HBase-TRUNK/3713/])
HBASE-7479 Remove VersionedProtocol and ProtocolSignature from RPC 
(Revision 1430677)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCProtos.java
* /hbase/trunk/hbase-protocol/src/main/protobuf/RPC.proto
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/IpcProtocol.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/MasterAdminProtocol.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/MasterMonitorProtocol.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/MasterProtocol.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/RegionServerStatusProtocol.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/AdminProtocol.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/ClientProtocol.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClientRPC.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServerRPC.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/ProtobufRpcClientEngine.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/ProtobufRpcServerEngine.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/ProtocolSignature.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RequestContext.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcClientEngine.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServerEngine.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/VersionedProtocol.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/RandomTimeoutRpcEngine.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/TestIPC.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/TestProtoBufRpc.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestHMasterRPCException.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/security/token/TestTokenAuthentication.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java


> Remove VersionedProtocol and ProtocolSignature from RPC
> ---
>
> Key: HBASE-7479
> URL: https://issues.apache.org/jira/browse/HBASE-7479
> Project: HBase
>  Issue Type: Task
>  Components: IPC/RPC
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7479.txt, 7479.txt, 7479v2.txt, 7479v3.txt
>
>
> Replace with an innocuous "Protocol" Interface for now.  Will minimize 
> changes doing a replacement.  Implication is that we are no longer going to 
> do special "handling" based off protocol version.  See "Handling protocol 
> versions" - http://search-hadoop.com/m/6k7GUM028E/v=threaded thread and 
> HBASE-6521 for background.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HBASE-7474) Endpoint Implementation to support Scans with Sorting of Rows based on column values(similar to "order by" clause of RDBMS)

2013-01-08 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-7474:
-

Assignee: Anil Gupta

> Endpoint Implementation to support Scans with Sorting of Rows based on column 
> values(similar to "order by" clause of RDBMS)
> ---
>
> Key: HBASE-7474
> URL: https://issues.apache.org/jira/browse/HBASE-7474
> Project: HBase
>  Issue Type: New Feature
>  Components: Coprocessors, Scanners
>Affects Versions: 0.94.3
>Reporter: Anil Gupta
>Assignee: Anil Gupta
>Priority: Minor
>  Labels: coprocessors, scan, sort
> Fix For: 0.94.5
>
> Attachments: hbase-7474.patch, hbase-7474-v2.patch, 
> SortingEndpoint_high_level_flowchart.pdf
>
>
> Recently, I have developed an Endpoint which can sort the Results (rows) on 
> the basis of column values. This functionality is similar to the "order by" 
> clause of an RDBMS. I will be submitting this patch for HBase 0.94.3.
> I am almost done with the initial development and testing of the feature, but 
> I need to write the JUnits for it. I will also try to make a design doc.
> Thanks,
> Anil Gupta
> Software Engineer II, Intuit, inc
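
Purely for illustration, the comparison such an endpoint performs can be pictured as a 
client-side sort of already-fetched Results by one column's value; this is not the 
endpoint's API, which does the work server side per region.

{code}
import java.util.Comparator;
import java.util.List;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class OrderBySketch {
  // sort rows by the value of family:qualifier; rows missing the column sort first
  static void sortByColumn(List<Result> results, final byte[] family, final byte[] qualifier) {
    results.sort(Comparator.comparing(
        (Result r) -> r.getValue(family, qualifier),
        Comparator.nullsFirst(Bytes.BYTES_COMPARATOR)));
  }
}
{code}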

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6386) Audit log messages do not include column family / qualifier information consistently

2013-01-08 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-6386:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Audit log messages do not include column family / qualifier information 
> consistently
> 
>
> Key: HBASE-6386
> URL: https://issues.apache.org/jira/browse/HBASE-6386
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 0.96.0
>Reporter: Marcelo Vanzin
>Assignee: Matteo Bertozzi
> Attachments: hbase-6386-v1.patch, hbase-6386-v2.patch, 
> HBASE-6386-v3.patch, HBASE-6386-v4.patch
>
>
> The code related to this issue is in 
> AccessController.java:permissionGranted().
> When creating audit logs, that method will do one of the following:
> * grant access, create audit log with table name only
> * deny access because of table permission, create audit log with table name 
> only
> * deny access because of column family / qualifier permission, create audit 
> log with specific family / qualifier
> So, in the case where more than one column family and/or qualifier are in the 
> same request, there will be a loss of information. Even in the case where 
> only one column family and/or qualifier is involved, information may be lost.
> It would be better if this behavior consistently included all the information 
> in the request; regardless of access being granted or denied, and regardless 
> which permission caused the denial, the column family and qualifier info 
> should be part of the audit log message.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6386) Audit log messages do not include column family / qualifier information consistently

2013-01-08 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547670#comment-13547670
 ] 

Matteo Bertozzi commented on HBASE-6386:


Committed to trunk. Thanks, guys, for the review, and Marcelo for the patch.

> Audit log messages do not include column family / qualifier information 
> consistently
> 
>
> Key: HBASE-6386
> URL: https://issues.apache.org/jira/browse/HBASE-6386
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 0.96.0
>Reporter: Marcelo Vanzin
>Assignee: Matteo Bertozzi
> Attachments: hbase-6386-v1.patch, hbase-6386-v2.patch, 
> HBASE-6386-v3.patch, HBASE-6386-v4.patch
>
>
> The code related to this issue is in 
> AccessController.java:permissionGranted().
> When creating audit logs, that method will do one of the following:
> * grant access, create audit log with table name only
> * deny access because of table permission, create audit log with table name 
> only
> * deny access because of column family / qualifier permission, create audit 
> log with specific family / qualifier
> So, in the case where more than one column family and/or qualifier are in the 
> same request, there will be a loss of information. Even in the case where 
> only one column family and/or qualifier is involved, information may be lost.
> It would be better if this behavior consistently included all the information 
> in the request; regardless of access being granted or denied, and regardless 
> which permission caused the denial, the column family and qualifier info 
> should be part of the audit log message.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HBASE-6386) Audit log messages do not include column family / qualifier information consistently

2013-01-08 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi reassigned HBASE-6386:
--

Assignee: Matteo Bertozzi

> Audit log messages do not include column family / qualifier information 
> consistently
> 
>
> Key: HBASE-6386
> URL: https://issues.apache.org/jira/browse/HBASE-6386
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 0.96.0
>Reporter: Marcelo Vanzin
>Assignee: Matteo Bertozzi
> Attachments: hbase-6386-v1.patch, hbase-6386-v2.patch, 
> HBASE-6386-v3.patch, HBASE-6386-v4.patch
>
>
> The code related to this issue is in 
> AccessController.java:permissionGranted().
> When creating audit logs, that method will do one of the following:
> * grant access, create audit log with table name only
> * deny access because of table permission, create audit log with table name 
> only
> * deny access because of column family / qualifier permission, create audit 
> log with specific family / qualifier
> So, in the case where more than one column family and/or qualifier are in the 
> same request, there will be a loss of information. Even in the case where 
> only one column family and/or qualifier is involved, information may be lost.
> It would be better if this behavior consistently included all the information 
> in the request; regardless of access being granted or denied, and regardless 
> which permission caused the denial, the column family and qualifier info 
> should be part of the audit log message.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7411) Use Netflix's Curator zookeeper library

2013-01-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547668#comment-13547668
 ] 

Ted Yu commented on HBASE-7411:
---

Interesting patch.
{code}
+   * to recreate the connection. This class bridges bridges curator and our 
zk-management.
{code}
'bridges' is repeated.
{code}
+WatcherSet doubleWatcher;
{code}
Since WatcherSet always contains two Watchers, should the class be called 
DoubleWatcher ?
{code}
+public synchronized ZooKeeper newZooKeeper(String connectString, int 
sessionTimeout, Watcher curatorWatcher,
{code}
Wrap long line.
{code}
+  if (LOG.isDebugEnabled()) {
+LOG.debug("Sending last zookeeper event to curator: " + lastEvent);
+  }
+  if (lastEvent != null) {
{code}
I think the log should be placed inside the if block (lastEvent != null)
{code}
+  static class ManagedZooKeeperFactory implements ZookeeperFactory {
{code}
The above class can be private, right ?
{code}
+  public void reconnect() throws IOException, InterruptedException {
+zkFactory.close(); //we don't want to close the CuratorClient
+connect();
{code}
reconnect() invokes two methods of ManagedZooKeeperFactory. Should reconnect() 
delegate to a synchronized method, reconnect(), of ManagedZooKeeperFactory ?
{code}
+  public void close() throws InterruptedException {
+if (this.curatorClient != null) {
+  this.curatorClient.close();
+}
{code}
this.curatorClient should be set to null after the close() call.
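
For reference, a sketch applying two of the ordering points above (logging only when there 
is an event to send, and clearing the client reference after close); field and method names 
mirror the quoted snippets, while the enclosing class and the client's type are stand-ins.

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.zookeeper.WatchedEvent;

public class CuratorBridgeSketch {
  private static final Log LOG = LogFactory.getLog(CuratorBridgeSketch.class);

  private AutoCloseable curatorClient;   // stand-in for the actual curator client type
  private WatchedEvent lastEvent;

  void sendLastEventToCurator() {
    if (lastEvent != null) {             // only log when there is actually something to send
      if (LOG.isDebugEnabled()) {
        LOG.debug("Sending last zookeeper event to curator: " + lastEvent);
      }
      // ... hand the event to the curator-side watcher ...
    }
  }

  public void close() throws Exception {
    if (this.curatorClient != null) {
      this.curatorClient.close();
      this.curatorClient = null;         // drop the reference once it has been closed
    }
  }
}
{code}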

> Use Netflix's Curator zookeeper library
> ---
>
> Key: HBASE-7411
> URL: https://issues.apache.org/jira/browse/HBASE-7411
> Project: HBase
>  Issue Type: New Feature
>  Components: Zookeeper
>Affects Versions: 0.96.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: hbase-7411_v0.patch
>
>
> We have mentioned using the Curator library 
> (https://github.com/Netflix/curator) elsewhere, but we can continue the 
> discussion in this issue.  
> The advantage of the Curator lib over ours is the recipes. We have a very 
> similar retrying mechanism, and we don't need much of the nice client-API 
> layer. 
> We also have a similar Listener interface, etc. 
> I think we can decide on one of the following options: 
> 1. Do not depend on curator. We have some of the recipes, and some custom 
> recipes (ZKAssign, Leader election, etc already working, locks in HBASE-5991, 
> etc). We can also copy / fork some code from there.
> 2. Replace all of our zk usage / connection management to curator. We may 
> keep the current set of API's as a thin wrapper. 
> 3. Use our own connection management / retry logic, and build a custom 
> CuratorFramework implementation for the curator recipes. This will keep the 
> current zk logic/code intact, and allow us to use curator-recipes as we see 
> fit. 
> 4. Allow both curator and our zk layer to manage the connection. We will 
> still have 1 connection, but 2 abstraction layers sharing it. This is the 
> easiest to implement, but a freak show? 
> I have a patch for 4, and am now prototyping 2 or 3, whichever will be less 
> painful. 
> Related issues: 
> HBASE-5547
> HBASE-7305
> HBASE-7212

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS

2013-01-08 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547660#comment-13547660
 ] 

chunhui shen commented on HBASE-7504:
-

bq.if root is assigned in other live RS 
It is not a normal case. In the common case we will assign ROOT in 
ServerShutdownHandler#verifyAndAssignRoot, which means we will execute the first 
if block.
Either way, server.getCatalogTracker().getRootLocation() only reads data 
from ZK, so I think it's acceptable.

> -ROOT- may be offline forever after FullGC of  RS
> -
>
> Key: HBASE-7504
> URL: https://issues.apache.org/jira/browse/HBASE-7504
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7504-trunk v1.patch, 7504-trunk v2.patch
>
>
> 1. Full GC happens on the ROOT regionserver.
> 2. The ZK session times out; the master expires the regionserver and submits 
> it to ServerShutdownHandler
> 3. The regionserver completes the full GC
> 4. In the process of ServerShutdownHandler, verifyRootRegionLocation returns 
> true
> 5. ServerShutdownHandler skips assigning the ROOT region
> 6. The regionserver aborts itself because it receives a YouAreDeadException 
> after a regionserver report
> 7. ROOT is now offline and won't be assigned any more unless we restart the master
> Master Log:
> {code}
> 2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted 
> shutdown handler to be executed, root=true, meta=false
> 2012-10-31 19:51:39,045 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs 
> for dw88.kgb.sqa.cm4,60020,1351671478752
> 2012-10-31 19:51:50,113 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server 
> dw88.kgb.sqa.cm4,60020,1351671478752 was carrying ROOT. Trying to assign.
> 2012-10-31 19:52:15,939 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Server REPORT rejected; currently processing 
> dw88.kgb.sqa.cm4,60020,1351671478752 as dead server
> 2012-10-31 19:52:15,945 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Skipping log 
> splitting for dw88.kgb.sqa.cm4,60020,1351671478752
> {code}
> No log of assigning ROOT
> Regionserver log:
> {code}
> 2012-10-31 19:52:15,923 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
> 229128ms instead of 10ms, this is likely due to a long garbage collecting 
> pause and it's usually bad, see 
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7474) Endpoint Implementation to support Scans with Sorting of Rows based on column values(similar to "order by" clause of RDBMS)

2013-01-08 Thread Anil Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547652#comment-13547652
 ] 

Anil Gupta commented on HBASE-7474:
---

How can I assign this issue to myself? I am unable to do so. Please help.

> Endpoint Implementation to support Scans with Sorting of Rows based on column 
> values(similar to "order by" clause of RDBMS)
> ---
>
> Key: HBASE-7474
> URL: https://issues.apache.org/jira/browse/HBASE-7474
> Project: HBase
>  Issue Type: New Feature
>  Components: Coprocessors, Scanners
>Affects Versions: 0.94.3
>Reporter: Anil Gupta
>Priority: Minor
>  Labels: coprocessors, scan, sort
> Fix For: 0.94.5
>
> Attachments: hbase-7474.patch, hbase-7474-v2.patch, 
> SortingEndpoint_high_level_flowchart.pdf
>
>
> Recently, I have developed an Endpoint which can sort the Results (rows) on 
> the basis of column values. This functionality is similar to the "order by" 
> clause of an RDBMS. I will be submitting this patch for HBase 0.94.3.
> I am almost done with the initial development and testing of the feature, but 
> I need to write the JUnits for it. I will also try to make a design doc.
> Thanks,
> Anil Gupta
> Software Engineer II, Intuit, inc

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS

2013-01-08 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547645#comment-13547645
 ] 

rajeshbabu commented on HBASE-7504:
---

If ROOT is assigned on another live RS, then with your patch 
server.getCatalogTracker().getRootLocation() will be called two times (in the 
else-if and in the log)?

> -ROOT- may be offline forever after FullGC of  RS
> -
>
> Key: HBASE-7504
> URL: https://issues.apache.org/jira/browse/HBASE-7504
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7504-trunk v1.patch, 7504-trunk v2.patch
>
>
> 1. Full GC happens on the ROOT regionserver.
> 2. The ZK session times out; the master expires the regionserver and submits 
> it to ServerShutdownHandler
> 3. The regionserver completes the full GC
> 4. In the process of ServerShutdownHandler, verifyRootRegionLocation returns 
> true
> 5. ServerShutdownHandler skips assigning the ROOT region
> 6. The regionserver aborts itself because it receives a YouAreDeadException 
> after a regionserver report
> 7. ROOT is now offline and won't be assigned any more unless we restart the master
> Master Log:
> {code}
> 2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted 
> shutdown handler to be executed, root=true, meta=false
> 2012-10-31 19:51:39,045 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs 
> for dw88.kgb.sqa.cm4,60020,1351671478752
> 2012-10-31 19:51:50,113 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server 
> dw88.kgb.sqa.cm4,60020,1351671478752 was carrying ROOT. Trying to assign.
> 2012-10-31 19:52:15,939 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Server REPORT rejected; currently processing 
> dw88.kgb.sqa.cm4,60020,1351671478752 as dead server
> 2012-10-31 19:52:15,945 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Skipping log 
> splitting for dw88.kgb.sqa.cm4,60020,1351671478752
> {code}
> No log of assigning ROOT
> Regionserver log:
> {code}
> 2012-10-31 19:52:15,923 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
> 229128ms instead of 10ms, this is likely due to a long garbage collecting 
> pause and it's usually bad, see 
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6824) Introduce ${hbase.local.dir} and save coprocessor jars there

2013-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547638#comment-13547638
 ] 

Hadoop QA commented on HBASE-6824:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563873/hbase-6824_v3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestLocalHBaseCluster

 {color:red}-1 core zombie tests{color}.  There are 8 zombie test(s):   
at 
org.apache.hadoop.hbase.master.TestMasterFailover.testMasterFailoverWithMockedRITOnDeadRS(TestMasterFailover.java:833)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3943//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3943//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3943//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3943//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3943//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3943//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3943//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3943//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3943//console

This message is automatically generated.

> Introduce ${hbase.local.dir} and save coprocessor jars there
> 
>
> Key: HBASE-6824
> URL: https://issues.apache.org/jira/browse/HBASE-6824
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3, 0.96.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: hbase-6824_v1-0.94.patch, hbase-6824_v1-trunk.patch, 
> hbase-6824_v2-0.94.patch, hbase-6824_v2-trunk.patch, hbase-6824_v3.patch
>
>
> We need to make the temp directory where coprocessor jars are saved 
> configurable. For this we will add the hbase.local.dir configuration parameter. 
> Windows tests are failing due to path problems for coprocessor jars:
> Two HBase TestClassLoading unit tests failed due to a failure in loading the 
> test file from HDFS:
> {code}
> testClassLoadingFromHDFS(org.apache.hadoop.hbase.coprocessor.TestClassLoading):
>  Class TestCP1 was missing on a region
> testClassLoadingFromLibDirInJar(org.apache.hadoop.hbase.coprocessor.TestClassLoading):
>  Class TestCP1 was missing on a region
> {code}
> The problem is that CoprocessorHost.load() copies the jar file locally and 
> schedules the local file to be deleted on exit by calling 
> FileSystem.deleteOnExit(). However, the filesystem is not the file system of 
> the local file but the distributed file system, so on Windows the Path 
> fails.
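
A small sketch of the fix implied above, not the CoprocessorHost code: the copy lands on 
the local file system, so the delete-on-exit hook must be registered with the local 
FileSystem rather than the distributed one the jar came from (paths are illustrative).

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CoprocessorJarCleanupSketch {
  static Path copyJarLocally(Configuration conf, Path jarOnDfs, Path localDir)
      throws IOException {
    FileSystem srcFs = jarOnDfs.getFileSystem(conf);    // typically the distributed file system
    FileSystem localFs = FileSystem.getLocal(conf);     // where the copy actually lives
    Path localJar = new Path(localDir, jarOnDfs.getName());
    srcFs.copyToLocalFile(jarOnDfs, localJar);          // copy the jar down to the local dir
    localFs.deleteOnExit(localJar);                     // register cleanup on the right FS
    return localJar;
  }
}
{code}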

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HBASE-6670) Untangle mixture of protobuf and Writable reference / usage

2013-01-08 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-6670.
--

Resolution: Not A Problem

Resolving as no longer a problem.  We removed HbaseObjectWritable a while back 
as part of "HBASE-7224 Remove references to Writable in the ipc package", 
making this issue invalid.

> Untangle mixture of protobuf and Writable reference / usage
> ---
>
> Key: HBASE-6670
> URL: https://issues.apache.org/jira/browse/HBASE-6670
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Critical
> Fix For: 0.96.0
>
>
> Currently HbaseObjectWritable uses ProtobufUtil to perform serialization of 
> Scan objects, ProtobufUtil.toParameter() calls 
> HbaseObjectWritable.writeObject().
> We should untangle such mixture and ultimately remove HbaseObjectWritable

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6903) HLog.Entry implements Writable; change to pb

2013-01-08 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6903:
-

Priority: Major  (was: Critical)

Critical issue w/o an assignee.  Knocking down to major from critical.

> HLog.Entry implements Writable; change to pb
> 
>
> Key: HBASE-6903
> URL: https://issues.apache.org/jira/browse/HBASE-6903
> Project: HBase
>  Issue Type: Task
>  Components: wal
>Reporter: stack
> Fix For: 0.96.0
>
>
> Can we do this in a way that makes it so even after 0.96, we can read old 
> WALs?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7007) [MTTR] Study assigns to see if we can make them faster still

2013-01-08 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547632#comment-13547632
 ] 

stack commented on HBASE-7007:
--

[~nkeywal] and/or [~jxiang] You think we can close this?

> [MTTR] Study assigns to see if we can make them faster still
> 
>
> Key: HBASE-7007
> URL: https://issues.apache.org/jira/browse/HBASE-7007
> Project: HBase
>  Issue Type: Improvement
>Reporter: stack
>Priority: Critical
> Fix For: 0.96.0
>
>
> Looking at a cluster start, I saw that it took about 25 minutes to assign and 
> open 17k regions.  8 minutes was bulk assigning via zk.  17 minutes was 
> opening the regions.  "HBASE-6640 [0.89-fb] Allow multiple regions to be 
> opened simultaneously" in trunk would help but maybe we can do less work up 
> in zk for instance; e.g. if 3.4.5, we can do bulk ops in zk (or make a bulk 
> assign znode that has many regions instead of one)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7318) Add verbose logging option to HConnectionManager

2013-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547631#comment-13547631
 ] 

Hadoop QA commented on HBASE-7318:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563874/7318-v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove
  org.apache.hadoop.hbase.client.TestMultiParallel
  org.apache.hadoop.hbase.constraint.TestConstraint
  org.apache.hadoop.hbase.client.TestFromClientSide
  org.apache.hadoop.hbase.security.access.TestAccessController
  org.apache.hadoop.hbase.TestLocalHBaseCluster

 {color:red}-1 core zombie tests{color}.  There are 2 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3942//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3942//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3942//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3942//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3942//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3942//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3942//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3942//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3942//console

This message is automatically generated.

> Add verbose logging option to HConnectionManager
> 
>
> Key: HBASE-7318
> URL: https://issues.apache.org/jira/browse/HBASE-7318
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 7318-v2.patch, HBASE-7318-v0.patch, HBASE-7318-v1.patch
>
>
> In the course of HBASE-7250 I found that client-side errors (as well as 
> server-side errors, but that's another question) are hard to debug.
> I have some local commits with useful, not-that-hacky HConnectionManager 
> logging added.
> Need to "productionize" it to be off by default but easy-to-enable for 
> debugging.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7482) Port HBASE-7442 HBase remote CopyTable not working when security enabled to trunk

2013-01-08 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7482:
-

Priority: Major  (was: Critical)

Knocking down to major.  No assignee.

> Port HBASE-7442 HBase remote CopyTable not working when security enabled to 
> trunk
> -
>
> Key: HBASE-7482
> URL: https://issues.apache.org/jira/browse/HBASE-7482
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
> Fix For: 0.96.0
>
>
> Excerpt about the choice of solution from :
> The first option was actually quite messy to implement. {{clusterId}} and 
> {{conf}} are fixed in *{{HBaseClient}}* when it's created and cached by 
> *{{SecureRpcEngine}}*, so to implement the fix here I would have had to pass 
> the different cluster {{confs}} up through *{{HConnectionManager}}* and 
> *{{HBaseRPC}}* in order to override the clusterId in 
> *{{SecureClient#SecureConnection}}*.
> I've gone with the second option of creating and caching different 
> *{{SecureClients}}* for the local and remote clusters in 
> *{{SecureRpcEngine}}* - keyed off of the {{clusterId}} instead of the default 
> *{{SocketFactory}}*. I think this is a cleaner solution.
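
A generic sketch of the caching scheme the chosen option describes: one client per cluster, 
looked up by clusterId rather than by SocketFactory. The client type and its construction 
are placeholders, not the SecureRpcEngine code.

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PerClusterClientCache<C> {
  public interface Factory<C> {
    C create(String clusterId);
  }

  private final ConcurrentMap<String, C> clients = new ConcurrentHashMap<String, C>();
  private final Factory<C> factory;

  public PerClusterClientCache(Factory<C> factory) {
    this.factory = factory;
  }

  // local and remote clusters have different ids, so CopyTable ends up with two clients
  public synchronized C getClient(String clusterId) {
    C client = clients.get(clusterId);
    if (client == null) {
      client = factory.create(clusterId);
      clients.put(clusterId, client);
    }
    return client;
  }
}
{code}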

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-2231) Compaction events should be written to HLog

2013-01-08 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-2231:
-

Assignee: stack  (was: Todd Lipcon)

> Compaction events should be written to HLog
> ---
>
> Key: HBASE-2231
> URL: https://issues.apache.org/jira/browse/HBASE-2231
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: Todd Lipcon
>Assignee: stack
>Priority: Blocker
>  Labels: moved_from_0_20_5
> Fix For: 0.96.0
>
> Attachments: 2231-testcase-0.94.txt, 2231-testcase_v2.txt, 
> 2231-testcase_v3.txt, 2231v2.txt, 2231v3.txt, 2231v4.txt, 
> hbase-2231-testcase.txt, hbase-2231.txt
>
>
> The sequence for a compaction should look like this:
> # Compact region to "new" files
> # Write a "Compacted Region" entry to the HLog
> # Delete "old" files
> This deals with a case where the RS has paused between step 1 and 2 and the 
> regions have since been reassigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (HBASE-7477) Remove Proxy instance from HBase RPC

2013-01-08 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547491#comment-13547491
 ] 

stack edited comment on HBASE-7477 at 1/9/13 4:26 AM:
--

Note to self (comes of a review of what would be involved pulling the proxy 
stuff out of hbase with Elliott):

+ we'd need a means of hooking up a generic "callMethod" that took a Method and 
params with a protocol Interface -- what proxy does for us now.  The protobuf 
Service can do this for us also but w/o reflection.
+ What we have currently where we have protobuf engine pollution in the 
HBaseClient -- though this latter class is supposed to be engine agnostic -- is 
ugly and hard to follow.

Given the above, protobuf Service starts to look better.  It has kinks but 
would enforce a strong pattern -- and we are most of the way there already with 
our use of the Service#BlockingInterface.
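
For illustration, how a generated protobuf BlockingService lets the server dispatch by 
method descriptor with no java.lang.reflect lookup; how the method name and request bytes 
arrive here is an assumption, not HBase's wire format.

{code}
import com.google.protobuf.BlockingService;
import com.google.protobuf.Descriptors.MethodDescriptor;
import com.google.protobuf.InvalidProtocolBufferException;
import com.google.protobuf.Message;
import com.google.protobuf.ServiceException;

public class PbServiceDispatchSketch {
  static Message dispatch(BlockingService service, String methodName, byte[] requestBytes)
      throws ServiceException, InvalidProtocolBufferException {
    MethodDescriptor md = service.getDescriptorForType().findMethodByName(methodName);
    Message request = service.getRequestPrototype(md)     // prototype knows how to parse the bytes
        .newBuilderForType().mergeFrom(requestBytes).build();
    return service.callBlockingMethod(md, null, request); // dispatch without reflection
  }
}
{code}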

  was (Author: stack):
Note to self (comes of a review of what would be involved pulling the proxy 
stuff out of hbase with Elliott):

+ we'd need a means of hooking up a generic "callMethod" that took a Method and 
params with a protocol Interface -- what proxy does for us now.  The protobuf 
Service does this for us.
+ What we have currently where we have protobuf engine pollution in the 
HBaseClient -- though this latter class is supposed to be engine agnostic -- is 
ugly.

Given this, protobuf Service starts to look good.  Has kinks but would enforce 
a strong pattern -- and we are most of the way there already with our use of 
the Service#BlockingInterface.
  
> Remove Proxy instance from HBase RPC
> 
>
> Key: HBASE-7477
> URL: https://issues.apache.org/jira/browse/HBASE-7477
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Karthik Ranganathan
>
> Currently, we use HBaseRPC.getProxy() to get an Invoker object to serialize 
> the RPC parameters. This is pretty inefficient as it uses reflection to 
> lookup the current method name.
> The aim is to break up the proxy into an actual proxy implementation so that:
> 1. we can make it more efficient by eliminating reflection
> 2. can re-write some parts of the protocol to make it even better

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7479) Remove VersionedProtocol and ProtocolSignature from RPC

2013-01-08 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7479:
-

Attachment: 7479v3.txt

Removes the change to the test proto file since it does not make a difference in 
what is generated (from review by Devaraj).

This is what I committed.

> Remove VersionedProtocol and ProtocolSignature from RPC
> ---
>
> Key: HBASE-7479
> URL: https://issues.apache.org/jira/browse/HBASE-7479
> Project: HBase
>  Issue Type: Task
>  Components: IPC/RPC
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7479.txt, 7479.txt, 7479v2.txt, 7479v3.txt
>
>
> Replace with an innocuous "Protocol" Interface for now.  Will minimize 
> changes doing a replacement.  Implication is that we are no longer going to 
> do special "handling" based off protocol version.  See "Handling protocol 
> versions" - http://search-hadoop.com/m/6k7GUM028E/v=threaded thread and 
> HBASE-6521 for background.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HBASE-7479) Remove VersionedProtocol and ProtocolSignature from RPC

2013-01-08 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-7479.
--

  Resolution: Fixed
Hadoop Flags: Incompatible change,Reviewed  (was: Incompatible change)

> Remove VersionedProtocol and ProtocolSignature from RPC
> ---
>
> Key: HBASE-7479
> URL: https://issues.apache.org/jira/browse/HBASE-7479
> Project: HBase
>  Issue Type: Task
>  Components: IPC/RPC
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7479.txt, 7479.txt, 7479v2.txt, 7479v3.txt
>
>
> Replace with an innocuous "Protocol" Interface for now.  Will minimize 
> changes doing a replacement.  Implication is that we are no longer going to 
> do special "handling" based off protocol version.  See "Handling protocol 
> versions" - http://search-hadoop.com/m/6k7GUM028E/v=threaded thread and 
> HBASE-6521 for background.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7411) Use Netflix's Curator zookeeper library

2013-01-08 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-7411:
-

Attachment: hbase-7411_v0.patch

Here is a candidate patch I've been working on. As per the discussion, the 
patch: 
 - Does not change zk connection management. We still create / close and watch 
the connection events. 
 - Curator now plugs in its watcher when it is started, but it cannot create / 
close the connection; it waits for hbase to create a new connection instead 
(see the sketch after this list). 
 - Curator is only started when it is called.
 - I've tested the implementation with curator-based read/write locks for table 
operations (HBASE-7305). 
 - The patch needs some polishing and maybe some more tests. I also want to run 
it on an actual cluster, but this shows the general approach, if we end up 
adding the Curator dependency after all. 
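
Not the actual patch, just a minimal sketch of the "HBase owns the connection, 
Curator only plugs in a watcher" arrangement described in the list above; class 
and method names are made up:
{code}
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;

// HBase keeps ownership of the ZooKeeper connection; whatever watcher Curator
// registers when it starts simply gets every event forwarded to it.
public class ForwardingZKWatcher implements Watcher {
  private final List<Watcher> delegates = new CopyOnWriteArrayList<Watcher>();

  /** Called by the Curator integration when it is started. */
  public void register(Watcher watcher) {
    delegates.add(watcher);
  }

  @Override
  public void process(WatchedEvent event) {
    // ... the existing HBase listener dispatch would run here ...
    for (Watcher w : delegates) {
      w.process(event);  // hand the same event to Curator's watcher
    }
  }
}
{code}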

> Use Netflix's Curator zookeeper library
> ---
>
> Key: HBASE-7411
> URL: https://issues.apache.org/jira/browse/HBASE-7411
> Project: HBase
>  Issue Type: New Feature
>  Components: Zookeeper
>Affects Versions: 0.96.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: hbase-7411_v0.patch
>
>
> We have mentioned using the Curator library 
> (https://github.com/Netflix/curator) elsewhere but we can continue the 
> discussion in this.  
> The advantages for the curator lib over ours are the recipes. We have very 
> similar retrying mechanism, and we don't need much of the nice client-API 
> layer. 
> We also have similar Listener interface, etc. 
> I think we can decide on one of the following options: 
> 1. Do not depend on curator. We have some of the recipes, and some custom 
> recipes (ZKAssign, Leader election, etc already working, locks in HBASE-5991, 
> etc). We can also copy / fork some code from there.
> 2. Replace all of our zk usage / connection management to curator. We may 
> keep the current set of API's as a thin wrapper. 
> 3. Use our own connection management / retry logic, and build a custom 
> CuratorFramework implementation for the curator recipes. This will keep the 
> current zk logic/code intact, and allow us to use curator-recipes as we see 
> fit. 
> 4. Allow both curator and our zk layer to manage the connection. We will 
> still have 1 connection, but 2 abstraction layers sharing it. This is the 
> easiest to implement, but a freak show? 
> I have a patch for 4, and now prototyping 2 or 3 whichever will be less 
> painful. 
> Related issues: 
> HBASE-5547
> HBASE-7305
> HBASE-7212

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (HBASE-7318) Add verbose logging option to HConnectionManager

2013-01-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547606#comment-13547606
 ] 

Ted Yu edited comment on HBASE-7318 at 1/9/13 3:53 AM:
---

Patch v2 fixes a compilation error:

[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) 
on project hbase-server: Compilation failure
[ERROR] 
/Users/tyu/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java:[2102,41]
 cannot find symbol
[ERROR] symbol  : method getExhaustiveDescription()
[ERROR] location: class 
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException

  was (Author: yuzhih...@gmail.com):
Patch v2 fixes a compilation error.
  
> Add verbose logging option to HConnectionManager
> 
>
> Key: HBASE-7318
> URL: https://issues.apache.org/jira/browse/HBASE-7318
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 7318-v2.patch, HBASE-7318-v0.patch, HBASE-7318-v1.patch
>
>
> In the course of HBASE-7250 I found that client-side errors (as well as 
> server-side errors, but that's another question) are hard to debug.
> I have some local commits with useful, not-that-hacky HConnectionManager 
> logging added.
> Need to "productionize" it to be off by default but easy-to-enable for 
> debugging.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7318) Add verbose logging option to HConnectionManager

2013-01-08 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7318:
--

Attachment: 7318-v2.patch

Patch v2 fixes a compilation error.

> Add verbose logging option to HConnectionManager
> 
>
> Key: HBASE-7318
> URL: https://issues.apache.org/jira/browse/HBASE-7318
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 7318-v2.patch, HBASE-7318-v0.patch, HBASE-7318-v1.patch
>
>
> In the course of HBASE-7250 I found that client-side errors (as well as 
> server-side errors, but that's another question) are hard to debug.
> I have some local commits with useful, not-that-hacky HConnectionManager 
> logging added.
> Need to "productionize" it to be off by default but easy-to-enable for 
> debugging.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6824) Introduce ${hbase.local.dir} and save coprocessor jars there

2013-01-08 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547603#comment-13547603
 ] 

Enis Soztutar commented on HBASE-6824:
--

Note to self: add a release note on commit. 

> Introduce ${hbase.local.dir} and save coprocessor jars there
> 
>
> Key: HBASE-6824
> URL: https://issues.apache.org/jira/browse/HBASE-6824
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3, 0.96.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: hbase-6824_v1-0.94.patch, hbase-6824_v1-trunk.patch, 
> hbase-6824_v2-0.94.patch, hbase-6824_v2-trunk.patch, hbase-6824_v3.patch
>
>
> We need to make the temp directory where coprocessor jars are saved 
> configurable. For this we will add hbase.local.dir configuration parameter. 
> Windows tests are failing due to the pathing problems for coprocessor jars:
> Two HBase TestClassLoading unit tests failed due to a failure in loading the 
> test file from HDFS:
> {code}
> testClassLoadingFromHDFS(org.apache.hadoop.hbase.coprocessor.TestClassLoading):
>  Class TestCP1 was missing on a region
> testClassLoadingFromLibDirInJar(org.apache.hadoop.hbase.coprocessor.TestClassLoading):
>  Class TestCP1 was missing on a region
> {code}
> The problem is that CoprocessorHost.load() copies the jar file locally, and 
> schedules the local file to be deleted on exit by calling 
> FileSystem.deleteOnExit(). However, the filesystem is not the file system of 
> the local file, it is the distributed file system, so on Windows the Path 
> fails.
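
For illustration, reading the proposed knob could look roughly like the snippet 
below; the default value shown is an assumption, not necessarily what the patch 
picks, and the deleteOnExit() half of the fix would similarly need the local 
FileSystem (FileSystem.getLocal(conf)) rather than the DFS instance:
{code}
import org.apache.hadoop.conf.Configuration;

// Hypothetical sketch: resolve the directory for locally copied coprocessor
// jars from ${hbase.local.dir}, falling back to the JVM temp dir.
public final class LocalDirExample {
  public static String localDir(Configuration conf) {
    return conf.get("hbase.local.dir",
        System.getProperty("java.io.tmpdir") + "/hbase-local-dir");
  }
}
{code}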

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6824) Introduce ${hbase.local.dir} and save coprocessor jars there

2013-01-08 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6824:
-

Status: Patch Available  (was: Open)

> Introduce ${hbase.local.dir} and save coprocessor jars there
> 
>
> Key: HBASE-6824
> URL: https://issues.apache.org/jira/browse/HBASE-6824
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3, 0.96.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: hbase-6824_v1-0.94.patch, hbase-6824_v1-trunk.patch, 
> hbase-6824_v2-0.94.patch, hbase-6824_v2-trunk.patch, hbase-6824_v3.patch
>
>
> We need to make the temp directory where coprocessor jars are saved 
> configurable. For this we will add hbase.local.dir configuration parameter. 
> Windows tests are failing due to the pathing problems for coprocessor jars:
> Two HBase TestClassLoading unit tests failed due to a failure in loading the 
> test file from HDFS:
> {code}
> testClassLoadingFromHDFS(org.apache.hadoop.hbase.coprocessor.TestClassLoading):
>  Class TestCP1 was missing on a region
> testClassLoadingFromLibDirInJar(org.apache.hadoop.hbase.coprocessor.TestClassLoading):
>  Class TestCP1 was missing on a region
> {code}
> The problem is that CoprocessorHost.load() copies the jar file locally, and 
> schedules the local file to be deleted on exit by calling 
> FileSystem.deleteOnExit(). However, the filesystem is not the file system of 
> the local file, it is the distributed file system, so on Windows the Path 
> fails.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6824) Introduce ${hbase.local.dir} and save coprocessor jars there

2013-01-08 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6824:
-

Status: Open  (was: Patch Available)

> Introduce ${hbase.local.dir} and save coprocessor jars there
> 
>
> Key: HBASE-6824
> URL: https://issues.apache.org/jira/browse/HBASE-6824
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3, 0.96.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: hbase-6824_v1-0.94.patch, hbase-6824_v1-trunk.patch, 
> hbase-6824_v2-0.94.patch, hbase-6824_v2-trunk.patch, hbase-6824_v3.patch
>
>
> We need to make the temp directory where coprocessor jars are saved 
> configurable. For this we will add hbase.local.dir configuration parameter. 
> Windows tests are failing due to the pathing problems for coprocessor jars:
> Two HBase TestClassLoading unit tests failed due to a failure in loading the 
> test file from HDFS:
> {code}
> testClassLoadingFromHDFS(org.apache.hadoop.hbase.coprocessor.TestClassLoading):
>  Class TestCP1 was missing on a region
> testClassLoadingFromLibDirInJar(org.apache.hadoop.hbase.coprocessor.TestClassLoading):
>  Class TestCP1 was missing on a region
> {code}
> The problem is that CoprocessorHost.load() copies the jar file locally, and 
> schedules the local file to be deleted on exit by calling 
> FileSystem.deleteOnExit(). However, the filesystem is not the file system of 
> the local file, it is the distributed file system, so on Windows the Path 
> fails.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6824) Introduce ${hbase.local.dir} and save coprocessor jars there

2013-01-08 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6824:
-

Attachment: hbase-6824_v3.patch

Thanks for review Andrew. Rebased the patch. Will commit if passes hadoopqa. 

> Introduce ${hbase.local.dir} and save coprocessor jars there
> 
>
> Key: HBASE-6824
> URL: https://issues.apache.org/jira/browse/HBASE-6824
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3, 0.96.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: hbase-6824_v1-0.94.patch, hbase-6824_v1-trunk.patch, 
> hbase-6824_v2-0.94.patch, hbase-6824_v2-trunk.patch, hbase-6824_v3.patch
>
>
> We need to make the temp directory where coprocessor jars are saved 
> configurable. For this we will add hbase.local.dir configuration parameter. 
> Windows tests are failing due to the pathing problems for coprocessor jars:
> Two HBase TestClassLoading unit tests failed due to a failure in loading the 
> test file from HDFS:
> {code}
> testClassLoadingFromHDFS(org.apache.hadoop.hbase.coprocessor.TestClassLoading):
>  Class TestCP1 was missing on a region
> testClassLoadingFromLibDirInJar(org.apache.hadoop.hbase.coprocessor.TestClassLoading):
>  Class TestCP1 was missing on a region
> {code}
> The problem is that CoprocessorHost.load() copies the jar file locally, and 
> schedules the local file to be deleted on exit by calling 
> FileSystem.deleteOnExit(). However, the filesystem is not the file system of 
> the local file, it is the distributed file system, so on Windows the Path 
> fails.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7516) Make compaction policy pluggable

2013-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547599#comment-13547599
 ] 

Hadoop QA commented on HBASE-7516:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563870/HBASE-7516-v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces lines longer than 
100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.TestLocalHBaseCluster

 {color:red}-1 core zombie tests{color}.  There are 8 zombie test(s):   
at 
org.apache.hadoop.hbase.master.TestMasterFailover.testMasterFailoverWithMockedRITOnDeadRS(TestMasterFailover.java:833)
at 
org.apache.hadoop.hbase.catalog.TestCatalogTracker.testServerNotRunningIOException(TestCatalogTracker.java:250)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3941//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3941//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3941//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3941//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3941//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3941//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3941//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3941//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3941//console

This message is automatically generated.

> Make compaction policy pluggable
> 
>
> Key: HBASE-7516
> URL: https://issues.apache.org/jira/browse/HBASE-7516
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7516-v0.patch, HBASE-7516-v1.patch
>
>
> Currently, the compaction selection is pluggable. It will be great to make 
> the compaction algorithm pluggable too so that we can implement and play with 
> other compaction algorithms.
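
For context, the usual Hadoop-style wiring for such a pluggable policy looks 
roughly like the sketch below; the configuration key and interface here are 
placeholders, not the actual HBASE-7516 API:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ReflectionUtils;

// Illustration only: instantiate whichever implementation the configuration
// names, falling back to a default class when the key is unset.
public final class PluggableLoader {
  public static <T> T load(Configuration conf, String key,
      Class<T> iface, Class<? extends T> defaultImpl) {
    Class<? extends T> clazz = conf.getClass(key, defaultImpl, iface);
    return ReflectionUtils.newInstance(clazz, conf);
  }
}
{code}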

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547600#comment-13547600
 ] 

Ted Yu commented on HBASE-7515:
---

[~jdcryans], [~eclark]:
Is there any place that I missed?

Thanks

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7515.txt, 7515-v2.txt, 7515-v3.txt, 7515-v4.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSED_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".
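
A generic sketch of the cleanup this asks for (not the actual Store.loadStoreFiles 
code): if opening the N-th reader throws, close the readers already opened before 
rethrowing, so no file descriptors are left dangling:
{code}
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: open everything, or close what was opened and rethrow.
public final class OpenAllOrClose {
  /** Stand-in for "open a store file reader". */
  public interface Opener<T extends Closeable> {
    T open() throws IOException;
  }

  public static <T extends Closeable> List<T> openAll(List<? extends Opener<T>> openers)
      throws IOException {
    List<T> opened = new ArrayList<T>();
    try {
      for (Opener<T> o : openers) {
        opened.add(o.open());  // may throw partway through the list
      }
      return opened;
    } catch (IOException e) {
      for (T t : opened) {
        try { t.close(); } catch (IOException ignored) { /* best effort */ }
      }
      throw e;
    }
  }
}
{code}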

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7268) correct local region location cache information can be overwritten w/stale information from an old server

2013-01-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547577#comment-13547577
 ] 

Sergey Shelukhin commented on HBASE-7268:
-

bq. Why assigning 0 as seqNum above ?
To have some sort of valid sequence number... I assume this can only happen if 
all operations against the region don't use WAL.

> correct local region location cache information can be overwritten w/stale 
> information from an old server
> -
>
> Key: HBASE-7268
> URL: https://issues.apache.org/jira/browse/HBASE-7268
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7268-v0.patch, HBASE-7268-v0.patch, 
> HBASE-7268-v1.patch, HBASE-7268-v2.patch, HBASE-7268-v2-plus-masterTs.patch, 
> HBASE-7268-v2-plus-masterTs.patch, HBASE-7268-v3.patch
>
>
> Discovered via HBASE-7250; related to HBASE-5877.
> Test is writing from multiple threads.
> Server A has region R; client knows that.
> R gets moved from A to server B.
> B gets killed.
> R gets moved by master to server C.
> ~15 seconds later, client tries to write to it (on A?).
> Multiple client threads report from RegionMoved exception processing logic "R 
> moved from C to B", even though such transition never happened (neither in 
> nor before the sequence described below). Not quite sure how the client 
> learned of the transition to C, I assume it's from meta from some other 
> thread...
> Then, put fails (it may fail due to accumulated errors that are not logged, 
> which I am investigating... but the bogus cache update is there 
> notwithstanding).
> I have a patch but not sure if it works, test still fails locally for yet 
> unknown reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HBASE-6466) Enable multi-thread for memstore flush

2013-01-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HBASE-6466:
---

Assignee: Sergey Shelukhin  (was: chunhui shen)

> Enable multi-thread for memstore flush
> --
>
> Key: HBASE-6466
> URL: https://issues.apache.org/jira/browse/HBASE-6466
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.96.0
>Reporter: chunhui shen
>Assignee: Sergey Shelukhin
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: HBASE-6466.patch, HBASE-6466v2.patch, 
> HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch, 
> HBASE-6466-v4.patch
>
>
> If the KVs are large or the HLog is closed under high-pressure putting, we found 
> the memstore is often above the high water mark and blocks the puts.
> So should we enable multi-threaded Memstore Flush?
> Some performance test data for reference,
> 1.test environment : 
> random writting;upper memstore limit 5.6GB;lower memstore limit 4.8GB;400 
> regions per regionserver;row len=50 bytes, value len=1024 bytes;5 
> regionserver, 300 ipc handler per regionserver;5 client, 50 thread handler 
> per client for writing
> 2.test results:
> one cacheFlush handler, tps: 7.8k/s per regionserver, Flush:10.1MB/s per 
> regionserver, appears many aboveGlobalMemstoreLimit blocking
> two cacheFlush handlers, tps: 10.7k/s per regionserver, Flush:12.46MB/s per 
> regionserver,
> 200 thread handler per client & two cacheFlush handlers, tps:16.1k/s per 
> regionserver, Flush:18.6MB/s per regionserver

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94

2013-01-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547564#comment-13547564
 ] 

Sergey Shelukhin commented on HBASE-7521:
-

I have a patch; let me run it on a cluster and post it if it works...

> fix HBASE-6060 (regions stuck in opening state) in 0.94
> ---
>
> Key: HBASE-7521
> URL: https://issues.apache.org/jira/browse/HBASE-7521
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Discussion in HBASE-6060 implies that the fix there does not work on 0.94. 
> Still, we may want to fix the issue in 0.94 (via some different fix) because 
> regions stuck in opening for ridiculous amounts of time are not a good 
> thing to have.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7520) org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted fails when I cd hbase-it and mvn verify

2013-01-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547562#comment-13547562
 ] 

Sergey Shelukhin commented on HBASE-7520:
-

At least some failures can be mapped to HBASE-7268... on a cluster it passes for 
me with the latest patch, but locally it still fails later due to some other 
issue.
Can HBASE-7318 please be committed so I could stop juggling 3 patches to debug? 
;)

> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted fails 
> when I cd hbase-it and mvn verify
> --
>
> Key: HBASE-7520
> URL: https://issues.apache.org/jira/browse/HBASE-7520
> Project: HBase
>  Issue Type: Bug
>  Components: test
> Environment: macosx trunk
>Reporter: stack
>Assignee: Sergey Shelukhin
>Priority: Critical
>
> Trying to make up something to hand off to the bigtop project, running the 
> hbase-it tests, this one fails.
> {code}
> durruti:failsafe-reports stack$ more 
> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted.txt 
> ---
> Test set: 
> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted
> ---
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 206.538 sec 
> <<< FAILURE!
> testDataIngest(org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted)
>   Time elapsed: 206.395 sec  <<< FAILURE!
> junit.framework.AssertionFailedError: Load failed with error code 1
> at junit.framework.Assert.fail(Assert.java:50)
> at 
> org.apache.hadoop.hbase.IngestIntegrationTestBase.runIngestTest(IngestIntegrationTestBase.java:98)
> at 
> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted.testDataIngest(IntegrationTestRebalanceAndKillServersTargeted.java:121)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> ...
> {code}
> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted-output.txt
>   has nothing in it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7506) Judgment of carrying ROOT/META will become wrong when expiring server

2013-01-08 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547561#comment-13547561
 ] 

chunhui shen commented on HBASE-7506:
-

bq.That means we should not verify any more, right?
We should still verify, to be on the safe side; see HBASE-7504.


I think we could remove the MetaServerShutdownHandler class if we pass 
carryingRoot and carryingMeta to ServerShutdownHandler, like the following:
{code}
ServerShutdownHandler(final Server server, final MasterServices services,
    final DeadServer deadServers, final ServerName serverName,
    final boolean shouldSplitHlog, final boolean carryingRoot,
    final boolean carryingMeta) {
  super(server,
      (carryingRoot || carryingMeta) ? EventType.M_META_SERVER_SHUTDOWN
                                     : EventType.M_SERVER_SHUTDOWN);
  ...
}
{code}


> Judgment of carrying ROOT/META will become wrong when expiring server
> -
>
> Key: HBASE-7506
> URL: https://issues.apache.org/jira/browse/HBASE-7506
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7506-trunk v1.patch
>
>
> We check whether the server is carrying ROOT/META when expiring the server.
> See ServerManager#expireServer.
> If the dead server was carrying META, we assign meta directly in the process of 
> ServerShutdownHandler.
> If the dead server was carrying ROOT, we offline ROOT and then call 
> verifyAndAssignRootWithRetries().
> How does the judgment of carrying ROOT/META become wrong?
> If the region is in RIT, isCarryingRegion() returns true after addressing from 
> zk.
> However, once the RIT times out (could be caused by this.allRegionServersOffline 
> && !noRSAvailable, see AssignmentManager#TimeoutMonitor) and we assign the 
> region elsewhere, this judgment becomes wrong.
> See AssignmentManager#isCarryingRegion for details.
> With the wrong judgment of carrying ROOT/META, we would assign ROOT/META 
> twice.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7519) Support level compaction

2013-01-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547559#comment-13547559
 ] 

Sergey Shelukhin commented on HBASE-7519:
-

[4, 7)

> Support level compaction
> 
>
> Key: HBASE-7519
> URL: https://issues.apache.org/jira/browse/HBASE-7519
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: level-compaction.pdf
>
>
> The level compaction algorithm may help HBase for some use cases, for 
> example, read-heavy loads (especially when just one version is used) and a 
> relatively small key space updated frequently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7519) Support level compaction

2013-01-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547558#comment-13547558
 ] 

Sergey Shelukhin commented on HBASE-7519:
-

From the discussion today: from my understanding, leveldb key ranges in 
different files may overlap between levels; that means a compaction for a 
range must either do something with the leftover bits of the files, or keep 
the old files around for the other ranges.
E.g. if I have two levels somewhere (in no particular order), lN with [1, 5) and 
[6, 10] files, and lM with [1, 4), [4, 7) and [8, 10] files, a compaction for 
[4, 7] must include both of the lN files and produce some parts of them, or keep 
them, for the reads from other ranges. If instead it uses the largest 
overlapping range to avoid using only parts of the files, all ranges would 
eventually merge.
That would mean the Compactor and other things also need to change significantly 
(and become pluggable?) as far as I see.
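
A toy model of the overlap described above, using the same ranges; it only shows 
that a compaction over [4, 7) has to pull in every file whose key range 
intersects it (the Range class and half-open intervals are illustrative only):
{code}
import java.util.Arrays;
import java.util.List;

public class LevelOverlapExample {
  static final class Range {
    final int start, end;  // half-open [start, end)
    Range(int start, int end) { this.start = start; this.end = end; }
    boolean overlaps(int lo, int hi) { return start < hi && lo < end; }
    @Override public String toString() { return "[" + start + ", " + end + ")"; }
  }

  public static void main(String[] args) {
    List<Range> levelN = Arrays.asList(new Range(1, 5), new Range(6, 10));
    List<Range> levelM = Arrays.asList(new Range(1, 4), new Range(4, 7), new Range(8, 10));
    int lo = 4, hi = 7;  // the compaction range from the comment
    for (Range r : levelN) {
      if (r.overlaps(lo, hi)) System.out.println("lN file " + r + " must join the compaction");
    }
    for (Range r : levelM) {
      if (r.overlaps(lo, hi)) System.out.println("lM file " + r + " must join the compaction");
    }
  }
}
{code}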

> Support level compaction
> 
>
> Key: HBASE-7519
> URL: https://issues.apache.org/jira/browse/HBASE-7519
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: level-compaction.pdf
>
>
> The level compaction algorithm may help HBase for some use cases, for 
> example, read-heavy loads (especially when just one version is used) and a 
> relatively small key space updated frequently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7518) Move AuthResult out of AccessController

2013-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547556#comment-13547556
 ] 

Hudson commented on HBASE-7518:
---

Integrated in HBase-TRUNK #3712 (See 
[https://builds.apache.org/job/HBase-TRUNK/3712/])
HBASE-7518 Move AuthResult out of AccessController (Revision 1430631)

 Result = FAILURE
mbertozzi : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AuthResult.java


> Move AuthResult out of AccessController
> ---
>
> Key: HBASE-7518
> URL: https://issues.apache.org/jira/browse/HBASE-7518
> Project: HBase
>  Issue Type: Sub-task
>  Components: security
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7518-v0.patch
>
>
> split HBASE-6393 in two logical pieces.
> This jira is just for moving out the AuthResult from the AccessController
> in this way, we can get in HBASE-6386 without waiting on HBASE-6393

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS

2013-01-08 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547553#comment-13547553
 ] 

chunhui shen commented on HBASE-7504:
-

In the normal case, we will assign ROOT.
So only the first "if" block will be executed:
{code}
if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout)) {
  this.services.getAssignmentManager().assignRoot();
}
{code}

> -ROOT- may be offline forever after FullGC of  RS
> -
>
> Key: HBASE-7504
> URL: https://issues.apache.org/jira/browse/HBASE-7504
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7504-trunk v1.patch, 7504-trunk v2.patch
>
>
> 1. FullGC happens on the ROOT regionserver.
> 2. The ZK session times out; the master expires the regionserver and submits it 
> to ServerShutdownHandler.
> 3. The regionserver completes the FullGC.
> 4. In the process of ServerShutdownHandler, verifyRootRegionLocation returns 
> true.
> 5. ServerShutdownHandler skips assigning the ROOT region.
> 6. The regionserver aborts itself because it receives a YouAreDeadException 
> after a regionserver report.
> 7. ROOT is offline now, and won't be assigned any more unless we restart the 
> master.
> Master Log:
> {code}
> 2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted 
> shutdown handler to be executed, root=true, meta=false
> 2012-10-31 19:51:39,045 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs 
> for dw88.kgb.sqa.cm4,60020,1351671478752
> 2012-10-31 19:51:50,113 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server 
> dw88.kgb.sqa.cm4,60020,1351671478752 was carrying ROOT. Trying to assign.
> 2012-10-31 19:52:15,939 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Server REPORT rejected; currently processing 
> dw88.kgb.sqa.cm4,60020,1351671478752 as dead server
> 2012-10-31 19:52:15,945 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Skipping log 
> splitting for dw88.kgb.sqa.cm4,60020,1351671478752
> {code}
> No log of assigning ROOT
> Regionserver log:
> {code}
> 2012-10-31 19:52:15,923 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
> 229128ms instead of 10ms, this is likely due to a long garbage collecting 
> pause and it's usually bad, see 
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-7522) Tests should not be writing under /tmp/

2013-01-08 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-7522:


 Summary: Tests should not be writing under /tmp/
 Key: HBASE-7522
 URL: https://issues.apache.org/jira/browse/HBASE-7522
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0, 0.94.5
Reporter: Enis Soztutar


As per the discussion 
http://mail-archives.apache.org/mod_mbox/hbase-dev/201301.mbox/%3CCA%2BRK%3D_BmV%3Dvwws4VeDJVPt6hY7NKCDEafex3XTNam630pQRBbA%40mail.gmail.com%3E,
 tests should not be writing under the /tmp/ directory. 

TestStoreFile is one of the offending ones. Some of them will be fixed in 
HBASE-6824. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7516) Make compaction policy pluggable

2013-01-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7516:


Attachment: HBASE-7516-v1.patch

Renames, javadoc... will post to /r/ momentarily.

> Make compaction policy pluggable
> 
>
> Key: HBASE-7516
> URL: https://issues.apache.org/jira/browse/HBASE-7516
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7516-v0.patch, HBASE-7516-v1.patch
>
>
> Currently, the compaction selection is pluggable. It will be great to make 
> the compaction algorithm pluggable too so that we can implement and play with 
> other compaction algorithms.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7516) Make compaction policy pluggable

2013-01-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547550#comment-13547550
 ] 

Sergey Shelukhin commented on HBASE-7516:
-

https://reviews.apache.org/r/8895/

> Make compaction policy pluggable
> 
>
> Key: HBASE-7516
> URL: https://issues.apache.org/jira/browse/HBASE-7516
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7516-v0.patch, HBASE-7516-v1.patch
>
>
> Currently, the compaction selection is pluggable. It will be great to make 
> the compaction algorithm pluggable too so that we can implement and play with 
> other compaction algorithms.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS

2013-01-08 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547547#comment-13547547
 ] 

rajeshbabu commented on HBASE-7504:
---

[~zjushch] 
Patch looks good.
can we avoid calling server.getCatalogTracker().getRootLocation() (reading the 
znode in zookeeper) twice in the normal case?

> -ROOT- may be offline forever after FullGC of  RS
> -
>
> Key: HBASE-7504
> URL: https://issues.apache.org/jira/browse/HBASE-7504
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7504-trunk v1.patch, 7504-trunk v2.patch
>
>
> 1. FullGC happens on the ROOT regionserver.
> 2. The ZK session times out; the master expires the regionserver and submits it 
> to ServerShutdownHandler.
> 3. The regionserver completes the FullGC.
> 4. In the process of ServerShutdownHandler, verifyRootRegionLocation returns 
> true.
> 5. ServerShutdownHandler skips assigning the ROOT region.
> 6. The regionserver aborts itself because it receives a YouAreDeadException 
> after a regionserver report.
> 7. ROOT is offline now, and won't be assigned any more unless we restart the 
> master.
> Master Log:
> {code}
> 2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted 
> shutdown handler to be executed, root=true, meta=false
> 2012-10-31 19:51:39,045 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs 
> for dw88.kgb.sqa.cm4,60020,1351671478752
> 2012-10-31 19:51:50,113 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server 
> dw88.kgb.sqa.cm4,60020,1351671478752 was carrying ROOT. Trying to assign.
> 2012-10-31 19:52:15,939 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Server REPORT rejected; currently processing 
> dw88.kgb.sqa.cm4,60020,1351671478752 as dead server
> 2012-10-31 19:52:15,945 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Skipping log 
> splitting for dw88.kgb.sqa.cm4,60020,1351671478752
> {code}
> No log of assigning ROOT
> Regionserver log:
> {code}
> 2012-10-31 19:52:15,923 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
> 229128ms instead of 10ms, this is likely due to a long garbage collecting 
> pause and it's usually bad, see 
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7519) Support level compaction

2013-01-08 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547539#comment-13547539
 ] 

Enis Soztutar commented on HBASE-7519:
--

bq. It will be even great if we can dynamically tune/choose a proper one.
Sergey is building the basic blocks of managing global/per-table/per-cf 
configuration in HBASE-7236. After that, we can start to think about 
HBASE-5678. However, even without the dynamic config, we should be able to tune 
the parameters by a rolling reopen of the regions. 

> Support level compaction
> 
>
> Key: HBASE-7519
> URL: https://issues.apache.org/jira/browse/HBASE-7519
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: level-compaction.pdf
>
>
> The level compaction algorithm may help HBase for some use cases, for 
> example, read-heavy loads (especially when just one version is used) and a 
> relatively small key space updated frequently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7268) correct local region location cache information can be overwritten w/stale information from an old server

2013-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547524#comment-13547524
 ] 

Hadoop QA commented on HBASE-7268:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563860/HBASE-7268-v3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 18 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestLocalHBaseCluster

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3940//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3940//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3940//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3940//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3940//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3940//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3940//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3940//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3940//console

This message is automatically generated.

> correct local region location cache information can be overwritten w/stale 
> information from an old server
> -
>
> Key: HBASE-7268
> URL: https://issues.apache.org/jira/browse/HBASE-7268
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7268-v0.patch, HBASE-7268-v0.patch, 
> HBASE-7268-v1.patch, HBASE-7268-v2.patch, HBASE-7268-v2-plus-masterTs.patch, 
> HBASE-7268-v2-plus-masterTs.patch, HBASE-7268-v3.patch
>
>
> Discovered via HBASE-7250; related to HBASE-5877.
> Test is writing from multiple threads.
> Server A has region R; client knows that.
> R gets moved from A to server B.
> B gets killed.
> R gets moved by master to server C.
> ~15 seconds later, client tries to write to it (on A?).
> Multiple client threads report from RegionMoved exception processing logic "R 
> moved from C to B", even though such transition never happened (neither in 
> nor before the sequence described below). Not quite sure how the client 
> learned of the transition to C, I assume it's from meta from some other 
> thread...
> Then, put fails (it may fail due to accumulated errors that are not logged, 
> which I am investigating... but the bogus cache update is there 
> notwithstanding).
> I have a patch but not sure if it works, test still fails locally for yet 
> unknown reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94

2013-01-08 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-7521:
---

 Summary: fix HBASE-6060 (regions stuck in opening state) in 0.94
 Key: HBASE-7521
 URL: https://issues.apache.org/jira/browse/HBASE-7521
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


Discussion in HBASE-6060 implies that the fix there does not work on 0.94. 
Still, we may want to fix the issue in 0.94 (via some different fix) because 
regions stuck in opening for ridiculous amounts of time are not a good thing 
to have.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7236) add per-table/per-cf configuration via metadata

2013-01-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547515#comment-13547515
 ] 

Sergey Shelukhin commented on HBASE-7236:
-

Would it help if I split the patch? E.g. first change the protocol, then add 
overrides, then do the conversion of metadata to overrides.
This appears to be stuck; I wonder what it would take to get it unstuck.

> add per-table/per-cf configuration via metadata
> ---
>
> Key: HBASE-7236
> URL: https://issues.apache.org/jira/browse/HBASE-7236
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE.patch, 
> HBASE-7236-PROTOTYPE-v1.patch, HBASE-7236-v0.patch, HBASE-7236-v1.patch, 
> HBASE-7236-v2.patch, HBASE-7236-v3.patch, HBASE-7236-v4.patch, 
> HBASE-7236-v5.patch, HBASE-7236-v6.patch, HBASE-7236-v6.patch
>
>
> Regardless of the compaction policy, it makes sense to have separate 
> configuration for compactions for different tables and column families, as 
> their access patterns and workloads can be different. In particular, for the 
> tiered compactions that are being ported from the 0.89-fb branch, it is 
> necessary in order to use them properly.
> We might want to add support for compaction configuration via metadata on 
> table/cf.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7268) correct local region location cache information can be overwritten w/stale information from an old server

2013-01-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547509#comment-13547509
 ] 

Ted Yu commented on HBASE-7268:
---

{code}
+   * @return SeqNum, or -1 if there's no value for server name.
{code}
Looks like javadoc update was incomplete.
{code}
+if (openSeqNum == HConstants.NO_SEQNUM) {
+  if (!r.getRegionInfo().isRootRegion()) {
+// If we opened non-root region, we should have read some sequence 
number from it.
+LOG.error("No sequence number found when opening " + 
r.getRegionNameAsString());
+  }
+  openSeqNum = 0;
{code}
Why assigning 0 as seqNum above ?
{code}
+  public long getEarliestMemstoreSeqNum(byte[] encodedRegionName) {
+cacheFlushLock.lock();
+try {
+  Long result = lastSeqWritten.get(encodedRegionName);
{code}
lastSeqWritten is a ConcurrentSkipListMap, do we need the lock?

> correct local region location cache information can be overwritten w/stale 
> information from an old server
> -
>
> Key: HBASE-7268
> URL: https://issues.apache.org/jira/browse/HBASE-7268
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7268-v0.patch, HBASE-7268-v0.patch, 
> HBASE-7268-v1.patch, HBASE-7268-v2.patch, HBASE-7268-v2-plus-masterTs.patch, 
> HBASE-7268-v2-plus-masterTs.patch, HBASE-7268-v3.patch
>
>
> Discovered via HBASE-7250; related to HBASE-5877.
> Test is writing from multiple threads.
> Server A has region R; client knows that.
> R gets moved from A to server B.
> B gets killed.
> R gets moved by master to server C.
> ~15 seconds later, client tries to write to it (on A?).
> Multiple client threads report from RegionMoved exception processing logic "R 
> moved from C to B", even though such transition never happened (neither in 
> nor before the sequence described below). Not quite sure how the client 
> learned of the transition to C, I assume it's from meta from some other 
> thread...
> Then, put fails (it may fail due to accumulated errors that are not logged, 
> which I am investigating... but the bogus cache update is there 
> notwithstanding).
> I have a patch but not sure if it works, test still fails locally for yet 
> unknown reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7519) Support level compaction

2013-01-08 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547508#comment-13547508
 ] 

Jimmy Xiang commented on HBASE-7519:


Yes, it relates to HBASE-7055.  We need to support pluggable compaction 
policy/algorithm (HBASE-7516) so that we can play with each one and choose the 
right one for the load.

It would be even better if we could dynamically tune/choose a proper one.

> Support level compaction
> 
>
> Key: HBASE-7519
> URL: https://issues.apache.org/jira/browse/HBASE-7519
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: level-compaction.pdf
>
>
> The level compaction algorithm may help HBase for some use cases, for 
> example, read-heavy loads (especially when just one version is used) and a 
> relatively small key space updated frequently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7520) org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted fails when I cd hbase-it and mvn verify

2013-01-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547507#comment-13547507
 ] 

Sergey Shelukhin commented on HBASE-7520:
-

Hmm... it may just be caused by HBASE-7268.
Let me double-check.

> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted fails 
> when I cd hbase-it and mvn verify
> --
>
> Key: HBASE-7520
> URL: https://issues.apache.org/jira/browse/HBASE-7520
> Project: HBase
>  Issue Type: Bug
>  Components: test
> Environment: macosx trunk
>Reporter: stack
>Assignee: Sergey Shelukhin
>Priority: Critical
>
> Trying to put together something to hand off to the Bigtop project; running the 
> hbase-it tests, this one fails.
> {code}
> durruti:failsafe-reports stack$ more 
> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted.txt 
> ---
> Test set: 
> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted
> ---
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 206.538 sec 
> <<< FAILURE!
> testDataIngest(org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted)
>   Time elapsed: 206.395 sec  <<< FAILURE!
> junit.framework.AssertionFailedError: Load failed with error code 1
> at junit.framework.Assert.fail(Assert.java:50)
> at 
> org.apache.hadoop.hbase.IngestIntegrationTestBase.runIngestTest(IngestIntegrationTestBase.java:98)
> at 
> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted.testDataIngest(IntegrationTestRebalanceAndKillServersTargeted.java:121)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> ...
> {code}
> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted-output.txt
>   has nothing in it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7519) Support level compaction

2013-01-08 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547505#comment-13547505
 ] 

Jimmy Xiang commented on HBASE-7519:


Assigning to myself for now since I am thinking about how to implement it.  Please let 
me know if someone else is interested in doing it too.

> Support level compaction
> 
>
> Key: HBASE-7519
> URL: https://issues.apache.org/jira/browse/HBASE-7519
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: level-compaction.pdf
>
>
> The level compaction algorithm may help HBase for some use cases, for 
> example, read-heavy loads (especially when just one version is used) over a 
> relatively small key space that is updated frequently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HBASE-7519) Support level compaction

2013-01-08 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reassigned HBASE-7519:
--

Assignee: Jimmy Xiang

> Support level compaction
> 
>
> Key: HBASE-7519
> URL: https://issues.apache.org/jira/browse/HBASE-7519
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: level-compaction.pdf
>
>
> The level compaction algorithm may help HBase for some use cases, for 
> example, read-heavy loads (especially when just one version is used) over a 
> relatively small key space that is updated frequently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HBASE-7520) org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted fails when I cd hbase-it and mvn verify

2013-01-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HBASE-7520:
---

Assignee: Sergey Shelukhin

> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted fails 
> when I cd hbase-it and mvn verify
> --
>
> Key: HBASE-7520
> URL: https://issues.apache.org/jira/browse/HBASE-7520
> Project: HBase
>  Issue Type: Bug
>  Components: test
> Environment: macosx trunk
>Reporter: stack
>Assignee: Sergey Shelukhin
>Priority: Critical
>
> Trying to put together something to hand off to the Bigtop project; running the 
> hbase-it tests, this one fails.
> {code}
> durruti:failsafe-reports stack$ more 
> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted.txt 
> ---
> Test set: 
> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted
> ---
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 206.538 sec 
> <<< FAILURE!
> testDataIngest(org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted)
>   Time elapsed: 206.395 sec  <<< FAILURE!
> junit.framework.AssertionFailedError: Load failed with error code 1
> at junit.framework.Assert.fail(Assert.java:50)
> at 
> org.apache.hadoop.hbase.IngestIntegrationTestBase.runIngestTest(IngestIntegrationTestBase.java:98)
> at 
> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted.testDataIngest(IntegrationTestRebalanceAndKillServersTargeted.java:121)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> ...
> {code}
> org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted-output.txt
>   has nothing in it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7477) Remove Proxy instance from HBase RPC

2013-01-08 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547491#comment-13547491
 ] 

stack commented on HBASE-7477:
--

Note to self (comes out of a review with Elliott of what would be involved in 
pulling the proxy stuff out of hbase):

+ we'd need a means of hooking up a generic "callMethod" that takes a Method and 
params along with a protocol Interface -- what the proxy does for us now.  The 
protobuf Service does this for us.
+ What we have currently, where protobuf engine pollution leaks into the 
HBaseClient -- though this latter class is supposed to be engine agnostic -- is 
ugly.

Given this, protobuf Service starts to look good.  Has kinks but would enforce 
a strong pattern -- and we are most of the way there already with our use of 
the Service#BlockingInterface.
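
Roughly, the generic dispatch the pb Service buys us looks like the sketch below (standard protobuf-java API; the wrapper class is a placeholder, simplified from what an RpcEngine would actually do):
{code}
// Sketch: generic method dispatch through a protobuf BlockingService -- the
// "callMethod by name plus params" role that java.lang.reflect.Proxy plays today.
import com.google.protobuf.BlockingService;
import com.google.protobuf.Descriptors.MethodDescriptor;
import com.google.protobuf.Message;
import com.google.protobuf.RpcController;
import com.google.protobuf.ServiceException;

public final class GenericDispatchSketch {
  public static Message call(BlockingService service, String methodName,
      RpcController controller, Message request) throws ServiceException {
    // Dictionary lookup in the generated ServiceDescriptor -- no reflection.
    MethodDescriptor md = service.getDescriptorForType().findMethodByName(methodName);
    return service.callBlockingMethod(md, controller, request);
  }
}
{code}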

> Remove Proxy instance from HBase RPC
> 
>
> Key: HBASE-7477
> URL: https://issues.apache.org/jira/browse/HBASE-7477
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Karthik Ranganathan
>
> Currently, we use HBaseRPC.getProxy() to get an Invoker object to serialize 
> the RPC parameters. This is pretty inefficient as it uses reflection to 
> look up the current method name.
> The aim is to break up the proxy into an actual proxy implementation so that:
> 1. we can make it more efficient by eliminating reflection
> 2. can re-write some parts of the protocol to make it even better

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547492#comment-13547492
 ] 

Ted Yu commented on HBASE-7515:
---

I ran TestLocalHBaseCluster locally and it passed:
{code}
Running org.apache.hadoop.hbase.TestLocalHBaseCluster
2013-01-08 16:56:16.209 java[70980:1203] Unable to load realm info from 
SCDynamicStore
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 27.618 sec
{code}

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7515.txt, 7515-v2.txt, 7515-v3.txt, 7515-v4.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSED_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547489#comment-13547489
 ] 

Hadoop QA commented on HBASE-7515:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563855/7515-v4.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestLocalHBaseCluster

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3938//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3938//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3938//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3938//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3938//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3938//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3938//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3938//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3938//console

This message is automatically generated.

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7515.txt, 7515-v2.txt, 7515-v3.txt, 7515-v4.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSED_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7268) correct local region location cache information can be overwritten w/stale information from an old server

2013-01-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7268:


Attachment: HBASE-7268-v3.patch

Changed the patch to use seqnums... no longer using the column TS, as that would 
make things backward-incompatible - old values have a higher TS and could not be 
overwritten.
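
In sketch form (illustrative only, not the patch itself; it assumes the cached location carries the region's open sequence id via getSeqNum()), the guard is simply "newer seqnum wins":
{code}
// Illustrative sketch: refuse to overwrite a cached region location with
// information carrying an older sequence number.  getSeqNum() is assumed to
// return the open sequence id reported with the location.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import org.apache.hadoop.hbase.HRegionLocation;

class LocationCacheSketch {
  private final ConcurrentMap<String, HRegionLocation> cache =
      new ConcurrentHashMap<String, HRegionLocation>();

  void updateCachedLocation(String regionName, HRegionLocation newLocation) {
    HRegionLocation cached = cache.get(regionName);
    if (cached == null || newLocation.getSeqNum() > cached.getSeqNum()) {
      cache.put(regionName, newLocation);   // newer (or first) information wins
    }
    // otherwise: stale report from an old server -- keep the existing entry
  }
}
{code}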

> correct local region location cache information can be overwritten w/stale 
> information from an old server
> -
>
> Key: HBASE-7268
> URL: https://issues.apache.org/jira/browse/HBASE-7268
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7268-v0.patch, HBASE-7268-v0.patch, 
> HBASE-7268-v1.patch, HBASE-7268-v2.patch, HBASE-7268-v2-plus-masterTs.patch, 
> HBASE-7268-v2-plus-masterTs.patch, HBASE-7268-v3.patch
>
>
> Discovered via HBASE-7250; related to HBASE-5877.
> Test is writing from multiple threads.
> Server A has region R; client knows that.
> R gets moved from A to server B.
> B gets killed.
> R gets moved by master to server C.
> ~15 seconds later, client tries to write to it (on A?).
> Multiple client threads report, from the RegionMoved exception processing logic, "R 
> moved from C to B", even though such a transition never happened (neither in 
> nor before the sequence described above). Not quite sure how the client 
> learned of the transition to C; I assume it's from meta, via some other 
> thread...
> Then, the put fails (it may fail due to accumulated errors that are not logged, 
> which I am investigating... but the bogus cache update is there 
> notwithstanding).
> I have a patch but am not sure if it works; the test still fails locally for a 
> yet-unknown reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7441) Make ClusterManager in IntegrationTestingUtility pluggable

2013-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547477#comment-13547477
 ] 

Hudson commented on HBASE-7441:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #336 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/336/])
HBASE-7441 Make ClusterManager in IntegrationTestingUtility pluggable (Liu 
Shaohui) (Revision 1430433)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestingUtility.java


> Make ClusterManager in IntegrationTestingUtility pluggable
> --
>
> Key: HBASE-7441
> URL: https://issues.apache.org/jira/browse/HBASE-7441
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.3
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
>  Labels: newbie, patch
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7441-0.94-v1.patch, HBASE-7441-trunk-v1.patch, 
> HBASE-7441-trunk-v2.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> After the patch HBASE-7009, we can use ChaosMonkey to test the Hbase cluster.
> The ClusterManager use ssh to stop/start the rs or master without passwd. To 
> support other cluster manager tool, we need to make clusterManager in 
> IntegrationTestingUtility pluggable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7513) HDFSBlocksDistribution shouldn't send NPEs when something goes wrong

2013-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547476#comment-13547476
 ] 

Hudson commented on HBASE-7513:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #336 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/336/])
HBASE-7513 HDFSBlocksDistribution shouldn't send NPEs when something goes 
wrong (Revision 1430560)

 Result = FAILURE
eclark : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/HDFSBlocksDistribution.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/TestHDFSBlocksDistribution.java


> HDFSBlocksDistribution shouldn't send NPEs when something goes wrong
> 
>
> Key: HBASE-7513
> URL: https://issues.apache.org/jira/browse/HBASE-7513
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0, 0.94.4
>Reporter: Jean-Daniel Cryans
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7513-094-1.patch, HBASE-7513-0.patch, 
> HBASE-7513-1.patch
>
>
> I saw a pretty weird failure on a cluster with corrupted files and this 
> particular exception really threw me off:
> {noformat}
> 2013-01-07 09:58:59,054 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=redacted., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.lang.NullPointerException: 
> empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:403)
>   at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:256)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2995)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:523)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:521)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   ... 3 more
> Caused by: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.HDFSBlocksDistribution.addHostsAndBlockWeight(HDFSBlocksDistribution.java:123)
>   at 
> org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:597)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:492)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:521)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:602)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:380)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
>   ... 8 more
> 2013-01-07 09:58:59,059 INFO 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of 
> region "redacted" failed, marking as FAILED_OPEN in ZK
> {noformat}
> This is what the code looks like:
> {code}
> if (hosts == null || hosts.length == 0) {
>  throw new NullPointerException("empty hosts");
> }
> {code}
> So {{hosts}} can exist but we send an NPE anyways? And then this is wrapped 
> in {{Store}} by:
> {code}
> } catch (ExecutionException e) {
>   throw new IOException(e.getCause());
> {code}
> FWIW there's another NPE thrown in 
> {{HDFSBlocksDistribution.addHostAndBlockWeight}} and it looks wrong.
> We should change the code to just skip computing the locality if it's missing 
> and not throw big u

[jira] [Commented] (HBASE-7518) Move AuthResult out of AccessController

2013-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547478#comment-13547478
 ] 

Hudson commented on HBASE-7518:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #336 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/336/])
HBASE-7518 Move AuthResult out of AccessController (Revision 1430631)

 Result = FAILURE
mbertozzi : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AuthResult.java


> Move AuthResult out of AccessController
> ---
>
> Key: HBASE-7518
> URL: https://issues.apache.org/jira/browse/HBASE-7518
> Project: HBase
>  Issue Type: Sub-task
>  Components: security
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7518-v0.patch
>
>
> split HBASE-6393 in two logical pieces.
> This jira is just for moving out the AuthResult from the AccessController
> in this way, we can get in HBASE-6386 without waiting on HBASE-6393

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7474) Endpoint Implementation to support Scans with Sorting of Rows based on column values(similar to "order by" clause of RDBMS)

2013-01-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547471#comment-13547471
 ] 

Ted Yu commented on HBASE-7474:
---

For SortingProtocol:
{code}
+   Result[] sortIncreasing(Scan scan, byte[] columnFamily, byte[] columnQualifier,
{code}
I think sortAscending would be more familiar to people who have worked with an 
RDBMS.
{code}
+   Result[] sortDecreasing(Scan scan, byte[] columnFamily, byte[] columnQualifier,
{code}
sortDescending would be a better method name.
{code}
+   * @param singleRegion does this scan request spans multiple regions?
{code}
spelling: 'spans' -> 'span'

Looking at SortingProtocolImplementation.sortIncreasing(), singleRegion is not 
referenced in the loop - we scan until there are no more rows. Some clarification 
is needed in the javadoc and the variable name.
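
With the renames suggested above, the endpoint interface would read roughly as below (sketch only; the parameter list is abbreviated from the patch):
{code}
// Sketch of the suggested naming; not the actual patch.
import java.io.IOException;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.ipc.CoprocessorProtocol;

public interface SortingProtocol extends CoprocessorProtocol {
  /** Rows sorted by the given column value, ascending ("order by ... asc"). */
  Result[] sortAscending(Scan scan, byte[] columnFamily, byte[] columnQualifier)
      throws IOException;

  /** Rows sorted by the given column value, descending ("order by ... desc"). */
  Result[] sortDescending(Scan scan, byte[] columnFamily, byte[] columnQualifier)
      throws IOException;
}
{code}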

> Endpoint Implementation to support Scans with Sorting of Rows based on column 
> values(similar to "order by" clause of RDBMS)
> ---
>
> Key: HBASE-7474
> URL: https://issues.apache.org/jira/browse/HBASE-7474
> Project: HBase
>  Issue Type: New Feature
>  Components: Coprocessors, Scanners
>Affects Versions: 0.94.3
>Reporter: Anil Gupta
>Priority: Minor
>  Labels: coprocessors, scan, sort
> Fix For: 0.94.5
>
> Attachments: hbase-7474.patch, hbase-7474-v2.patch, 
> SortingEndpoint_high_level_flowchart.pdf
>
>
> Recently, I have developed an Endpoint which can sort the Results (rows) on 
> the basis of column values. This functionality is similar to the "order by" 
> clause of an RDBMS. I will be submitting this patch for HBase 0.94.3.
> I am almost done with the initial development and testing of the feature, but I 
> need to write the JUnit tests for it. I will also try to produce a design doc.
> Thanks,
> Anil Gupta
> Software Engineer II, Intuit, inc

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7477) Remove Proxy instance from HBase RPC

2013-01-08 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547461#comment-13547461
 ] 

stack commented on HBASE-7477:
--

Tell us more about what you are thinking, Karthik.

I'm looking at it now.

In trunk we have the protobuf Service to put in place of the reflection.  This is 
autogen'd code that in essence keeps a dictionary (a ServiceDescriptor) that it 
searches to figure out which particular method invocation to make (the autogenerated 
code makes for nice, fast lookups).  The pb Service 'fit' is not perfect 
though -- it drags along some other stuff we do not want and it is missing a 
means of passing "extra" stuff unless we do some hackery -- so I'm reluctant to 
take it on even though it does away with reflection.

Good on you K.

> Remove Proxy instance from HBase RPC
> 
>
> Key: HBASE-7477
> URL: https://issues.apache.org/jira/browse/HBASE-7477
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Karthik Ranganathan
>
> Currently, we use HBaseRPC.getProxy() to get an Invoker object to serialize 
> the RPC parameters. This is pretty inefficient as it uses reflection to 
> look up the current method name.
> The aim is to break up the proxy into an actual proxy implementation so that:
> 1. we can make it more efficient by eliminating reflection
> 2. can re-write some parts of the protocol to make it even better

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7479) Remove VersionedProtocol and ProtocolSignature from RPC

2013-01-08 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547456#comment-13547456
 ] 

Devaraj Das commented on HBASE-7479:


[~stack], I just had a small nit (put up on RB). +1 from me.

> Remove VersionedProtocol and ProtocolSignature from RPC
> ---
>
> Key: HBASE-7479
> URL: https://issues.apache.org/jira/browse/HBASE-7479
> Project: HBase
>  Issue Type: Task
>  Components: IPC/RPC
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7479.txt, 7479.txt, 7479v2.txt
>
>
> Replace with an innocuous "Protocol" Interface for now.  Will minimize 
> changes doing a replacement.  Implication is that we are no longer going to 
> do special "handling" based off protocol version.  See "Handling protocol 
> versions" - http://search-hadoop.com/m/6k7GUM028E/v=threaded thread and 
> HBASE-6521 for background.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-08 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547453#comment-13547453
 ] 

Lars Hofhansl commented on HBASE-7515:
--

Man this is ugly now. But needs to be done... +1 on v4


> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7515.txt, 7515-v2.txt, 7515-v3.txt, 7515-v4.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSED_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7513) HDFSBlocksDistribution shouldn't send NPEs when something goes wrong

2013-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547448#comment-13547448
 ] 

Hudson commented on HBASE-7513:
---

Integrated in HBase-0.94 #715 (See 
[https://builds.apache.org/job/HBase-0.94/715/])
HBASE-7513 HDFSBlocksDistribution shouldn't send NPEs when something goes 
wrong (Revision 1430580)

 Result = FAILURE
eclark : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HDFSBlocksDistribution.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestHDFSBlocksDistribution.java


> HDFSBlocksDistribution shouldn't send NPEs when something goes wrong
> 
>
> Key: HBASE-7513
> URL: https://issues.apache.org/jira/browse/HBASE-7513
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0, 0.94.4
>Reporter: Jean-Daniel Cryans
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7513-094-1.patch, HBASE-7513-0.patch, 
> HBASE-7513-1.patch
>
>
> I saw a pretty weird failure on a cluster with corrupted files and this 
> particular exception really threw me off:
> {noformat}
> 2013-01-07 09:58:59,054 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=redacted., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.lang.NullPointerException: 
> empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:403)
>   at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:256)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2995)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:523)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:521)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   ... 3 more
> Caused by: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.HDFSBlocksDistribution.addHostsAndBlockWeight(HDFSBlocksDistribution.java:123)
>   at 
> org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:597)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:492)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:521)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:602)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:380)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
>   ... 8 more
> 2013-01-07 09:58:59,059 INFO 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of 
> region "redacted" failed, marking as FAILED_OPEN in ZK
> {noformat}
> This is what the code looks like:
> {code}
> if (hosts == null || hosts.length == 0) {
>  throw new NullPointerException("empty hosts");
> }
> {code}
> So {{hosts}} can exist but we send an NPE anyways? And then this is wrapped 
> in {{Store}} by:
> {code}
> } catch (ExecutionException e) {
>   throw new IOException(e.getCause());
> {code}
> FWIW there's another NPE thrown in 
> {{HDFSBlocksDistribution.addHostAndBlockWeight}} and it looks wrong.
> We should change the code to just skip computing the locality if it's missing 
> and not throw big ugly exceptions. In this case the region woul

[jira] [Updated] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-08 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7515:
--

Attachment: 7515-v4.txt

Patch v4 addresses Elliott's concern.

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7515.txt, 7515-v2.txt, 7515-v3.txt, 7515-v4.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSED_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7479) Remove VersionedProtocol and ProtocolSignature from RPC

2013-01-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547433#comment-13547433
 ] 

Ted Yu commented on HBASE-7479:
---

{code}
  public void authorize(UserGroupInformation user,
      Class<?> protocol,
      Configuration conf,
      InetAddress addr
      ) throws AuthorizationException {
{code}
The above method changes to non-static in hadoop 2.0, hence the compilation 
error.
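
One possible shape of the fix, as a sketch (it assumes HBaseServer can hold its own ServiceAuthorizationManager instance; not taken from the patch):
{code}
// Sketch only: call authorize() on an instance, which is what hadoop 2.0 expects.
import java.net.InetAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authorize.AuthorizationException;
import org.apache.hadoop.security.authorize.ServiceAuthorizationManager;

class AuthorizeSketch {
  // Assumption: a ServiceAuthorizationManager can be constructed and reused here.
  private final ServiceAuthorizationManager authManager = new ServiceAuthorizationManager();

  void authorizeConnection(UserGroupInformation user, Class<?> protocol,
      Configuration conf, InetAddress addr) throws AuthorizationException {
    authManager.authorize(user, protocol, conf, addr);
  }
}
{code}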

> Remove VersionedProtocol and ProtocolSignature from RPC
> ---
>
> Key: HBASE-7479
> URL: https://issues.apache.org/jira/browse/HBASE-7479
> Project: HBase
>  Issue Type: Task
>  Components: IPC/RPC
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7479.txt, 7479.txt, 7479v2.txt
>
>
> Replace with an innocuous "Protocol" Interface for now.  Will minimize 
> changes doing a replacement.  Implication is that we are no longer going to 
> do special "handling" based off protocol version.  See "Handling protocol 
> versions" - http://search-hadoop.com/m/6k7GUM028E/v=threaded thread and 
> HBASE-6521 for background.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7479) Remove VersionedProtocol and ProtocolSignature from RPC

2013-01-08 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7479:
-

Attachment: 7479v2.txt

Fix h2 compile failure.

> Remove VersionedProtocol and ProtocolSignature from RPC
> ---
>
> Key: HBASE-7479
> URL: https://issues.apache.org/jira/browse/HBASE-7479
> Project: HBase
>  Issue Type: Task
>  Components: IPC/RPC
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7479.txt, 7479.txt, 7479v2.txt
>
>
> Replace with an innocuous "Protocol" Interface for now.  Will minimize 
> changes doing a replacement.  Implication is that we are no longer going to 
> do special "handling" based off protocol version.  See "Handling protocol 
> versions" - http://search-hadoop.com/m/6k7GUM028E/v=threaded thread and 
> HBASE-6521 for background.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7513) HDFSBlocksDistribution shouldn't send NPEs when something goes wrong

2013-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547427#comment-13547427
 ] 

Hudson commented on HBASE-7513:
---

Integrated in HBase-TRUNK #3711 (See 
[https://builds.apache.org/job/HBase-TRUNK/3711/])
HBASE-7513 HDFSBlocksDistribution shouldn't send NPEs when something goes 
wrong (Revision 1430560)

 Result = FAILURE
eclark : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/HDFSBlocksDistribution.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/TestHDFSBlocksDistribution.java


> HDFSBlocksDistribution shouldn't send NPEs when something goes wrong
> 
>
> Key: HBASE-7513
> URL: https://issues.apache.org/jira/browse/HBASE-7513
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0, 0.94.4
>Reporter: Jean-Daniel Cryans
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7513-094-1.patch, HBASE-7513-0.patch, 
> HBASE-7513-1.patch
>
>
> I saw a pretty weird failure on a cluster with corrupted files and this 
> particular exception really threw me off:
> {noformat}
> 2013-01-07 09:58:59,054 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=redacted., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.lang.NullPointerException: 
> empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:403)
>   at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:256)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2995)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:523)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:521)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   ... 3 more
> Caused by: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.HDFSBlocksDistribution.addHostsAndBlockWeight(HDFSBlocksDistribution.java:123)
>   at 
> org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:597)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:492)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:521)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:602)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:380)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
>   ... 8 more
> 2013-01-07 09:58:59,059 INFO 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of 
> region "redacted" failed, marking as FAILED_OPEN in ZK
> {noformat}
> This is what the code looks like:
> {code}
> if (hosts == null || hosts.length == 0) {
>  throw new NullPointerException("empty hosts");
> }
> {code}
> So {{hosts}} can exist but we send an NPE anyways? And then this is wrapped 
> in {{Store}} by:
> {code}
> } catch (ExecutionException e) {
>   throw new IOException(e.getCause());
> {code}
> FWIW there's another NPE thrown in 
> {{HDFSBlocksDistribution.addHostAndBlockWeight}} and it looks wrong.
> We should change the code to just skip computing the locality if it's missing 
> and not throw big ugly exceptions. In this case t

[jira] [Commented] (HBASE-3809) .META. may not come back online if > number of executors servers crash and one of those > number of executors was carrying meta

2013-01-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547424#comment-13547424
 ] 

Ted Yu commented on HBASE-3809:
---

@Jimmy:
Thanks for the reminder.
Reading into HMaster.startServiceThreads(), I found the answer.
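
For context, roughly what that answer looks like: separate executor pools per ExecutorType, so meta shutdowns are not starved by ordinary server shutdowns (illustrative fragment, not the exact source; thread counts are made up):
{code}
// Illustrative only: startServiceThreads() sets up distinct executor pools per
// ExecutorType, so MetaServerShutdownHandler events do not queue behind
// ordinary ServerShutdownHandler events.  Thread counts here are invented.
executorService.startExecutorService(ExecutorType.MASTER_META_SERVER_OPERATIONS, 5);
executorService.startExecutorService(ExecutorType.MASTER_SERVER_OPERATIONS, 3);
{code}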

> .META. may not come back online if > number of executors servers crash and 
> one of those > number of executors was carrying meta
> ---
>
> Key: HBASE-3809
> URL: https://issues.apache.org/jira/browse/HBASE-3809
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: chunhui shen
>Priority: Critical
>
> This is a duplicate of another issue but at the moment I cannot find the 
> original.
> If you had a 700 node cluster and then you ran something on the cluster which 
> killed 100 nodes, and .META. had been running on one of those downed nodes, 
> well, you'll have all of your master executors processing ServerShutdowns and 
> more than likely none of the currently processing executors will be servicing 
> the shutdown of the server that was carrying .META.
> Well, for server shutdown to complete at the moment, an online .META. is 
> required.  So, in the above case, we'll be stuck. The current executors will 
> not be able to clear to make space for the processing of the server carrying 
> .META. because they need .META. to complete.
> We can make the master handlers have no bound so it will expand to accommodate 
> all crashed servers -- so it'll have the one .META. in its queue -- or we can 
> change it so shutdown handling doesn't require .META. to be on-line (it's used 
> to figure the regions the server was carrying); we could use the master's 
> in-memory picture of the cluster (But IIRC, there may be holes TBD)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6386) Audit log messages do not include column family / qualifier information consistently

2013-01-08 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547420#comment-13547420
 ] 

Matteo Bertozzi commented on HBASE-6386:


Will commit in a couple of hours unless there are objections.

(this patch improves the AuthResult family/qualifiers log message)

> Audit log messages do not include column family / qualifier information 
> consistently
> 
>
> Key: HBASE-6386
> URL: https://issues.apache.org/jira/browse/HBASE-6386
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 0.96.0
>Reporter: Marcelo Vanzin
> Attachments: hbase-6386-v1.patch, hbase-6386-v2.patch, 
> HBASE-6386-v3.patch, HBASE-6386-v4.patch
>
>
> The code related to this issue is in 
> AccessController.java:permissionGranted().
> When creating audit logs, that method will do one of the following:
> * grant access, create audit log with table name only
> * deny access because of table permission, create audit log with table name 
> only
> * deny access because of column family / qualifier permission, create audit 
> log with specific family / qualifier
> So, in the case where more than one column family and/or qualifier are in the 
> same request, there will be a loss of information. Even in the case where 
> only one column family and/or qualifier is involved, information may be lost.
> It would be better if this behavior consistently included all the information 
> in the request; regardless of access being granted or denied, and regardless 
> which permission caused the denial, the column family and qualifier info 
> should be part of the audit log message.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-08 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547419#comment-13547419
 ] 

Elliott Clark commented on HBASE-7515:
--

How can the callable ever return something when it has thrown an exception?


Try the example below:
{code}
package org.apache.hadoop.hbase.regionserver;

import org.junit.Test;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;


public class TestFailCreate {

  @Test
  public void testFailCallable() throws Exception {
CompletionService<String> completionService =
new ExecutorCompletionService<String>(Executors.newCachedThreadPool());

List<Future<String>> futureList = new ArrayList<Future<String>>();
for (int i = 0; i < 10; i ++ ) {
  Future<String> f = completionService.submit(new Callable<String>() {
@Override
public String call() throws Exception {
  String test = "STOREFILE CLOSED";
  Thread.sleep(100);
  test = "STORE FILE START OPEN";
  if (true == true)  {
test = "STOREFILE HALF OPEN";
// This is simulating opening the store file. If somewhere in opening up the
// store file some exception is thrown, the storefile is left in a half-open
// state and there is no reference to it.
throw new Exception("TEST EXCEPTION");
// test = "STORE FILE FULLY OPEN"; // This line is never reached.
  }
  System.out.println("GOT TO THE RETURN LINE");
  return test;
}
  });
  futureList.add(f);
}


for (Future<String> f : futureList) {
  try {
String s = f.get();
System.out.println("Got " + s);
  } catch (Exception e) {
System.out.println("Caught error");
  }
}



  }
}
{code}

Notice how "GOT TO THE RETURN LINE" is never printed out.

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7515.txt, 7515-v2.txt, 7515-v3.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSED_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7479) Remove VersionedProtocol and ProtocolSignature from RPC

2013-01-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547414#comment-13547414
 ] 

Ted Yu commented on HBASE-7479:
---

Hadoop QA report can be found here: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3928/console

When compiling against hadoop 2.0, I got:
{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) 
on project hbase-server: Compilation failure
[ERROR] 
/Users/tyu/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java:[2105,33]
 non-static method 
authorize(org.apache.hadoop.security.UserGroupInformation,java.lang.Class,org.apache.hadoop.conf.Configuration,java.net.InetAddress)
 cannot be referenced from a static context
[ERROR] -> [Help 1]
{code}

> Remove VersionedProtocol and ProtocolSignature from RPC
> ---
>
> Key: HBASE-7479
> URL: https://issues.apache.org/jira/browse/HBASE-7479
> Project: HBase
>  Issue Type: Task
>  Components: IPC/RPC
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7479.txt, 7479.txt
>
>
> Replace with an innocuous "Protocol" Interface for now.  Will minimize 
> changes doing a replacement.  Implication is that we are no longer going to 
> do special "handling" based off protocol version.  See "Handling protocol 
> versions" - http://search-hadoop.com/m/6k7GUM028E/v=threaded thread and 
> HBASE-6521 for background.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7518) Move AuthResult out of AccessController

2013-01-08 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-7518:
---

   Resolution: Fixed
Fix Version/s: 0.96.0
   Status: Resolved  (was: Patch Available)

merged to trunk, thanks for the review

> Move AuthResult out of AccessController
> ---
>
> Key: HBASE-7518
> URL: https://issues.apache.org/jira/browse/HBASE-7518
> Project: HBase
>  Issue Type: Sub-task
>  Components: security
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7518-v0.patch
>
>
> split HBASE-6393 in two logical pieces.
> This jira is just for moving out the AuthResult from the AccessController
> in this way, we can get in HBASE-6386 without waiting on HBASE-6393

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7518) Move AuthResult out of AccessController

2013-01-08 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-7518:
---

Status: Patch Available  (was: Open)

> Move AuthResult out of AccessController
> ---
>
> Key: HBASE-7518
> URL: https://issues.apache.org/jira/browse/HBASE-7518
> Project: HBase
>  Issue Type: Sub-task
>  Components: security
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>Priority: Minor
> Attachments: HBASE-7518-v0.patch
>
>
> split HBASE-6393 in two logical pieces.
> This jira is just for moving out the AuthResult from the AccessController
> in this way, we can get in HBASE-6386 without waiting on HBASE-6393

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3809) .META. may not come back online if > number of executors servers crash and one of those > number of executors was carrying meta

2013-01-08 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547405#comment-13547405
 ] 

stack commented on HBASE-3809:
--

bq. Point #1 from the comment @ 08/Jan/13 05:46 may not be true.

[~ted_yu] There is no comment at the above noted time.

> .META. may not come back online if > number of executors servers crash and 
> one of those > number of executors was carrying meta
> ---
>
> Key: HBASE-3809
> URL: https://issues.apache.org/jira/browse/HBASE-3809
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: chunhui shen
>Priority: Critical
>
> This is a duplicate of another issue but at the moment I cannot find the 
> original.
> If you had a 700 node cluster and then you ran something on the cluster which 
> killed 100 nodes, and .META. had been running on one of those downed nodes, 
> well, you'll have all of your master executors processing ServerShutdowns and 
> more than likely none of the currently processing executors will be servicing 
> the shutdown of the server that was carrying .META.
> Well, for server shutdown to complete at the moment, an online .META. is 
> required.  So, in the above case, we'll be stuck. The current executors will 
> not be able to clear to make space for the processing of the server carrying 
> .META. because they need .META. to complete.
> We can make the master handlers have no bound so it will expand to accommodate 
> all crashed servers -- so it'll have the one .META. in its queue -- or we can 
> change it so shutdown handling doesn't require .META. to be on-line (it's used 
> to figure the regions the server was carrying); we could use the master's 
> in-memory picture of the cluster (But IIRC, there may be holes TBD)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3809) .META. may not come back online if > number of executors servers crash and one of those > number of executors was carrying meta

2013-01-08 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547404#comment-13547404
 ] 

Jimmy Xiang commented on HBASE-3809:


@Ted, check ExecutorService#getExecutor(final ExecutorType type).  Chunhui is 
right.

> .META. may not come back online if > number of executors servers crash and 
> one of those > number of executors was carrying meta
> ---
>
> Key: HBASE-3809
> URL: https://issues.apache.org/jira/browse/HBASE-3809
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: chunhui shen
>Priority: Critical
>
> This is a duplicate of another issue but at the moment I cannot find the 
> original.
> If you had a 700 node cluster and then you ran something on the cluster which 
> killed 100 nodes, and .META. had been running on one of those downed nodes, 
> well, you'll have all of your master executors processing ServerShutdowns and 
> more than likely none of the currently processing executors will be servicing 
> the shutdown of the server that was carrying .META.
> Well, for server shutdown to complete at the moment, an online .META. is 
> required.  So, in the above case, we'll be stuck. The current executors will 
> not be able to clear to make space for the processing of the server carrying 
> .META. because they need .META. to complete.
> We can make the master handlers have no bound so it will expand to accommodate 
> all crashed servers -- so it'll have the one .META. in its queue -- or we can 
> change it so shutdown handling doesn't require .META. to be on-line (it's used 
> to figure the regions the server was carrying); we could use the master's 
> in-memory picture of the cluster (But IIRC, there may be holes TBD)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3809) .META. may not come back online if > number of executors servers crash and one of those > number of executors was carrying meta

2013-01-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547400#comment-13547400
 ] 

Ted Yu commented on HBASE-3809:
---

If I read the code correctly, there is only one ExecutorService running both 
MetaServerShutdownHandler and ServerShutdownHandler.
Point #1 from the comment @ 08/Jan/13 05:46 may not be true.

> .META. may not come back online if > number of executors servers crash and 
> one of those > number of executors was carrying meta
> ---
>
> Key: HBASE-3809
> URL: https://issues.apache.org/jira/browse/HBASE-3809
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: chunhui shen
>Priority: Critical
>
> This is a duplicate of another issue but at the moment I cannot find the 
> original.
> If you had a 700 node cluster and then you ran something on the cluster which 
> killed 100 nodes, and .META. had been running on one of those downed nodes, 
> well, you'll have all of your master executors processing ServerShutdowns and 
> more than likely none of the currently processing executors will be servicing 
> the shutdown of the server that was carrying .META.
> Well, for server shutdown to complete at the moment, an online .META. is 
> required.  So, in the above case, we'll be stuck. The current executors will 
> not be able to clear to make space for the processing of the server carrying 
> .META. because they need .META. to complete.
> We can make the master handlers have no bound so it will expand to accommodate 
> all crashed servers -- so it'll have the one .META. in its queue -- or we can 
> change it so shutdown handling doesn't require .META. to be on-line (it's used 
> to figure the regions the server was carrying); we could use the master's 
> in-memory picture of the cluster (But IIRC, there may be holes TBD)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7479) Remove VersionedProtocol and ProtocolSignature from RPC

2013-01-08 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547398#comment-13547398
 ] 

stack commented on HBASE-7479:
--

This passed when I ran the tests locally.  Waiting on DD comment before commit:

{code}
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] HBase . SUCCESS [2.100s]
[INFO] HBase - Common  SUCCESS [12.088s]
[INFO] HBase - Protocol .. SUCCESS [2.712s]
[INFO] HBase - Client  SUCCESS [1.048s]
[INFO] HBase - Hadoop Compatibility .. SUCCESS [0.803s]
[INFO] HBase - Hadoop One Compatibility .. SUCCESS [1.443s]
[INFO] HBase - Server  SUCCESS [35:48.784s]
[INFO] HBase - Hadoop Two Compatibility .. SUCCESS [2.949s]
[INFO] HBase - Integration Tests . SUCCESS [1.185s]
[INFO] HBase - Examples .. SUCCESS [29.291s]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 36:42.915s
[INFO] Finished at: Tue Jan 08 14:06:40 PST 2013
[INFO] Final Memory: 46M/287M
[INFO] 
{code}

> Remove VersionedProtocol and ProtocolSignature from RPC
> ---
>
> Key: HBASE-7479
> URL: https://issues.apache.org/jira/browse/HBASE-7479
> Project: HBase
>  Issue Type: Task
>  Components: IPC/RPC
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7479.txt, 7479.txt
>
>
> Replace with an innocuous "Protocol" Interface for now.  Will minimize 
> changes doing a replacement.  Implication is that we are no longer going to 
> do special "handling" based off protocol version.  See "Handling protocol 
> versions" - http://search-hadoop.com/m/6k7GUM028E/v=threaded thread and 
> HBASE-6521 for background.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3809) .META. may not come back online if > number of executors servers crash and one of those > number of executors was carrying meta

2013-01-08 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547394#comment-13547394
 ] 

stack commented on HBASE-3809:
--

and... [~ted_yu] your point is?

> .META. may not come back online if > number of executors servers crash and 
> one of those > number of executors was carrying meta
> ---
>
> Key: HBASE-3809
> URL: https://issues.apache.org/jira/browse/HBASE-3809
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: chunhui shen
>Priority: Critical
>
> This is a duplicate of another issue but at the moment I cannot find the 
> original.
> If you had a 700 node cluster and then you ran something on the cluster which 
> killed 100 nodes, and .META. had been running on one of those downed nodes, 
> well, you'll have all of your master executors processing ServerShutdowns and 
> more than likely none of the currently processing executors will be servicing 
> the shutdown of the server that was carrying .META.
> Well, for server shutdown to complete at the moment, an online .META. is 
> required.  So, in the above case, we'll be stuck. The current executors will 
> not be able to clear to make space for the processing of the server carrying 
> .META. because they need .META. to complete.
> We can make the master handlers have no bound so it will expand to accommodate 
> all crashed servers -- so it'll have the one .META. in its queue -- or we can 
> change it so shutdown handling doesn't require .META. to be on-line (it's used 
> to figure the regions the server was carrying); we could use the master's 
> in-memory picture of the cluster (But IIRC, there may be holes TBD)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547392#comment-13547392
 ] 

Ted Yu commented on HBASE-7515:
---

I looked at the storeFile.createReader call. My understanding is that the exception 
from the Callable would be delivered when get() is called.
Please take a look at the example in:
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/ExecutorCompletionService.html
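As a quick, self-contained illustration of that behavior (names here are mine, not 
from the HBase code), the exception thrown inside the Callable only surfaces, wrapped 
in an ExecutionException, when get() is invoked on the completed task:
{code}
import java.io.IOException;
import java.util.concurrent.*;

public class CompletionServiceExceptionDemo {
  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(1);
    CompletionService<String> cs = new ExecutorCompletionService<String>(pool);
    // Nothing blows up at submit() time, even though the task will fail.
    cs.submit(new Callable<String>() {
      public String call() throws IOException {
        throw new IOException("simulated createReader failure");
      }
    });
    try {
      cs.take().get();   // the failure is delivered here, wrapped in ExecutionException
    } catch (ExecutionException e) {
      System.out.println("cause: " + e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
{code}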

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7515.txt, 7515-v2.txt, 7515-v3.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSE_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3809) .META. may not come back online if > number of executors servers crash and one of those > number of executors was carrying meta

2013-01-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547378#comment-13547378
 ] 

Ted Yu commented on HBASE-3809:
---

I was trying to find out which other ExecutorService is used to execute 
MetaServerShutdownHandler.
In MasterServices, there is only one method returning ExecutorService:
{code}
  public ExecutorService getExecutorService();
{code}
In HMaster, I only found one ExecutorService member variable:
{code}
  // Instance of the hbase executor service.
  ExecutorService executorService;
{code}

> .META. may not come back online if > number of executors servers crash and 
> one of those > number of executors was carrying meta
> ---
>
> Key: HBASE-3809
> URL: https://issues.apache.org/jira/browse/HBASE-3809
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: chunhui shen
>Priority: Critical
>
> This is a duplicate of another issue but at the moment I cannot find the 
> original.
> If you had a 700 node cluster and then you ran something on the cluster which 
> killed 100 nodes, and .META. had been running on one of those downed nodes, 
> well, you'll have all of your master executors processing ServerShutdowns and 
> more than likely non of the currently processing executors will be servicing 
> the shutdown of the server that was carrying .META.
> Well, for server shutdown to complete at the moment, an online .META. is 
> required.  So, in the above case, we'll be stuck. The current executors will 
> not be able to clear to make space for the processing of the server carrying 
> .META. because they need .META. to complete.
> We can make the master handlers have no bound so it will expand to accomodate 
> all crashed servers -- so it'll have the one .META. in its queue -- or we can 
> change it so shutdown handling doesn't require .META. to be on-line (its used 
> to figure the regions the server was carrying); we could use the master's 
> in-memory picture of the cluster (But IIRC, there may be holes TBD)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-08 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547376#comment-13547376
 ] 

Elliott Clark commented on HBASE-7515:
--

The case where storeFile.createReader errors out is not covered by this patch.  
How about putting a try/catch in the Callable (line 421)? On exception, try to 
close the StoreFile, then re-throw the exception.
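A compilable sketch of that idea, with placeholder types standing in for the real 
Store/StoreFile code (so this is the shape of the suggestion, not the actual patch): 
catch inside the Callable, make a best-effort attempt to close what was opened, then 
re-throw so the caller still observes the failure.
{code}
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.Callable;

// Placeholder for the open-the-reader step that can fail (createReader in Store.java).
class OpenReaderCallable implements Callable<Closeable> {
  private final Closeable partiallyOpenedFile;

  OpenReaderCallable(Closeable partiallyOpenedFile) {
    this.partiallyOpenedFile = partiallyOpenedFile;
  }

  public Closeable call() throws IOException {
    try {
      openReader();                       // may throw, leaving a descriptor behind
      return partiallyOpenedFile;
    } catch (IOException ioe) {
      try {
        partiallyOpenedFile.close();      // best-effort cleanup so no fd leaks
      } catch (IOException suppressed) {
        // Log-and-ignore; the original failure is the one worth reporting.
      }
      throw ioe;                          // re-throw so loadStoreFiles still sees it
    }
  }

  private void openReader() throws IOException {
    throw new IOException("simulated createReader failure");
  }
}
{code}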

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7515.txt, 7515-v2.txt, 7515-v3.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSE_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7369) HConnectionManager should remove aborted connections

2013-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547366#comment-13547366
 ] 

Hudson commented on HBASE-7369:
---

Integrated in HBase-0.94 #714 (See 
[https://builds.apache.org/job/HBase-0.94/714/])
HBASE-7369 HConnectionManager should remove aborted connections (Bryan 
Baugher) (Revision 1430533)

 Result = SUCCESS
tedyu : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java


> HConnectionManager should remove aborted connections
> 
>
> Key: HBASE-7369
> URL: https://issues.apache.org/jira/browse/HBASE-7369
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 0.94.3
>Reporter: Bryan Baugher
>Assignee: Bryan Baugher
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 7369-0.94.txt, 7369-v5.txt, 
> HBASE-7369_HCM-remove-aborted-cnxs-2.txt, 
> HBASE-7369_HCM-remove-aborted-cnxs-3.txt, 
> HBASE-7369_HCM-remove-aborted-cnxs-4.txt, 
> HBASE-7369_HCM-remove-aborted-cnxs.txt, patch2.diff, patch3.diff, patch.diff
>
>
> When an HConnection is abort()'ed (i.e. if numerous services are lost) the 
> connection becomes unusable. HConnectionManager's cache of HConnections 
> currently does not have any logic for removing aborted connections 
> automatically. Currently it is up to the consumer to do so using 
> HConnectionManager.deleteStaleConnection(HConnection).
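A hedged sketch of the manual cleanup the description refers to (the getConnection and 
isAborted calls are my assumptions about the 0.94-era client API; only 
deleteStaleConnection is named above):
{code}
// Sketch only: the consumer has to evict an aborted connection from the cache itself.
Configuration conf = HBaseConfiguration.create();
HConnection connection = HConnectionManager.getConnection(conf);
try {
  // ... use the connection (look up tables, issue gets/puts, etc.) ...
} finally {
  if (connection.isAborted()) {
    // Aborted connections stay cached and unusable unless removed explicitly.
    HConnectionManager.deleteStaleConnection(connection);
  }
}
{code}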

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3809) .META. may not come back online if > number of executors servers crash and one of those > number of executors was carrying meta

2013-01-08 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547360#comment-13547360
 ] 

stack commented on HBASE-3809:
--

What is the point that you are trying to make @Ted Yu?

> .META. may not come back online if > number of executors servers crash and 
> one of those > number of executors was carrying meta
> ---
>
> Key: HBASE-3809
> URL: https://issues.apache.org/jira/browse/HBASE-3809
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: chunhui shen
>Priority: Critical
>
> This is a duplicate of another issue but at the moment I cannot find the 
> original.
> If you had a 700 node cluster and then you ran something on the cluster which 
> killed 100 nodes, and .META. had been running on one of those downed nodes, 
> well, you'll have all of your master executors processing ServerShutdowns and 
> more than likely non of the currently processing executors will be servicing 
> the shutdown of the server that was carrying .META.
> Well, for server shutdown to complete at the moment, an online .META. is 
> required.  So, in the above case, we'll be stuck. The current executors will 
> not be able to clear to make space for the processing of the server carrying 
> .META. because they need .META. to complete.
> We can make the master handlers have no bound so it will expand to accomodate 
> all crashed servers -- so it'll have the one .META. in its queue -- or we can 
> change it so shutdown handling doesn't require .META. to be on-line (its used 
> to figure the regions the server was carrying); we could use the master's 
> in-memory picture of the cluster (But IIRC, there may be holes TBD)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7513) HDFSBlocksDistribution shouldn't send NPEs when something goes wrong

2013-01-08 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-7513:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to both trunk and 0.94.

Thanks JD.

> HDFSBlocksDistribution shouldn't send NPEs when something goes wrong
> 
>
> Key: HBASE-7513
> URL: https://issues.apache.org/jira/browse/HBASE-7513
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0, 0.94.4
>Reporter: Jean-Daniel Cryans
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7513-094-1.patch, HBASE-7513-0.patch, 
> HBASE-7513-1.patch
>
>
> I saw a pretty weird failure on a cluster with corrupted files and this 
> particular exception really threw me off:
> {noformat}
> 2013-01-07 09:58:59,054 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=redacted., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.lang.NullPointerException: 
> empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:403)
>   at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:256)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2995)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:523)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:521)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   ... 3 more
> Caused by: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.HDFSBlocksDistribution.addHostsAndBlockWeight(HDFSBlocksDistribution.java:123)
>   at 
> org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:597)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:492)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:521)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:602)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:380)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
>   ... 8 more
> 2013-01-07 09:58:59,059 INFO 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of 
> region "redacted" failed, marking as FAILED_OPEN in ZK
> {noformat}
> This is what the code looks like:
> {code}
> if (hosts == null || hosts.length == 0) {
>  throw new NullPointerException("empty hosts");
> }
> {code}
> So {{hosts}} can exist but we send an NPE anyways? And then this is wrapped 
> in {{Store}} by:
> {code}
> } catch (ExecutionException e) {
>   throw new IOException(e.getCause());
> {code}
> FWIW there's another NPE thrown in 
> {{HDFSBlocksDistribution.addHostAndBlockWeight}} and it looks wrong.
> We should change the code to just skip computing the locality if it's missing 
> and not throw big ugly exceptions. In this case the region would fail opening 
> later anyways but at least the error message will be clearer.
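A minimal sketch of the direction the description suggests (illustrative only, not the 
committed HBASE-7513 change): treat missing locality information as something to skip 
rather than a reason to throw.
{code}
// Sketch: silently skip blocks with no host information instead of throwing an NPE.
public void addHostsAndBlockWeight(String[] hosts, long weight) {
  if (hosts == null || hosts.length == 0) {
    // No locality available for this block (e.g. corrupt file); just don't count it.
    return;
  }
  for (String host : hosts) {
    addHostAndBlockWeight(host, weight);   // per-host accounting, as referenced above
  }
}
{code}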

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HBASE-6521) Address the handling of multiple versions of a protocol

2013-01-08 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-6521.
--

Resolution: Invalid

Resolving as 'Invalid'. We will handle versioning by not doing versioning (See 
HBASE-7479 where we strip what versioning we currently had).

> Address the handling of multiple versions of a protocol
> ---
>
> Key: HBASE-6521
> URL: https://issues.apache.org/jira/browse/HBASE-6521
> Project: HBase
>  Issue Type: Improvement
>  Components: IPC/RPC
>Reporter: Devaraj Das
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: pbservice.txt
>
>
> This jira is to track a solution/patch to the mailing list thread titled 
> "Handling protocol versions" - 
> http://search-hadoop.com/m/6k7GUM028E/v=threaded.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3809) .META. may not come back online if > number of executors servers crash and one of those > number of executors was carrying meta

2013-01-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547348#comment-13547348
 ] 

Ted Yu commented on HBASE-3809:
---

Here is a code snippet from ServerManager.expireServer():
{code}
  this.services.getExecutorService().submit(new MetaServerShutdownHandler(this.master,
      this.services, this.deadservers, serverName, carryingRoot, carryingMeta));
} else {
  this.services.getExecutorService().submit(new ServerShutdownHandler(this.master,
{code}


> .META. may not come back online if > number of executors servers crash and 
> one of those > number of executors was carrying meta
> ---
>
> Key: HBASE-3809
> URL: https://issues.apache.org/jira/browse/HBASE-3809
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: chunhui shen
>Priority: Critical
>
> This is a duplicate of another issue but at the moment I cannot find the 
> original.
> If you had a 700 node cluster and then you ran something on the cluster which 
> killed 100 nodes, and .META. had been running on one of those downed nodes, 
> well, you'll have all of your master executors processing ServerShutdowns and 
> more than likely none of the currently processing executors will be servicing 
> the shutdown of the server that was carrying .META.
> Well, for server shutdown to complete at the moment, an online .META. is 
> required.  So, in the above case, we'll be stuck. The current executors will 
> not be able to clear to make space for the processing of the server carrying 
> .META. because they need .META. to complete.
> We can make the master handlers have no bound so it will expand to accommodate 
> all crashed servers -- so it'll have the one .META. in its queue -- or we can 
> change it so shutdown handling doesn't require .META. to be on-line (it's used 
> to figure the regions the server was carrying); we could use the master's 
> in-memory picture of the cluster (But IIRC, there may be holes TBD)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-08 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7515:
--

Attachment: 7515-v3.txt

Thanks for the review, J-D.

Patch v3 handles a potential exception coming out of HStore.close().

ioe is only created once, for the first exception caught; this way we avoid 
creating unnecessary small garbage.
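The pattern described here, sketched out (illustrative names, not the exact v3 diff): 
close every opened file, keep only the first exception, and rethrow it once cleanup is 
done. The closeReader(boolean) call is an assumption about the StoreFile API.
{code}
// Keep the first failure while still attempting to close every opened store file.
IOException ioe = null;
for (StoreFile sf : openedStoreFiles) {
  try {
    sf.closeReader(true);
  } catch (IOException e) {
    if (ioe == null) {
      ioe = e;                   // created/kept only once, for the first exception
    }
  }
}
if (ioe != null) {
  throw ioe;                     // surface the first failure after cleanup completes
}
{code}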

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7515.txt, 7515-v2.txt, 7515-v3.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSE_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7513) HDFSBlocksDistribution shouldn't send NPEs when something goes wrong

2013-01-08 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-7513:
-

Attachment: HBASE-7513-094-1.patch

0.94 version

> HDFSBlocksDistribution shouldn't send NPEs when something goes wrong
> 
>
> Key: HBASE-7513
> URL: https://issues.apache.org/jira/browse/HBASE-7513
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0, 0.94.4
>Reporter: Jean-Daniel Cryans
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7513-094-1.patch, HBASE-7513-0.patch, 
> HBASE-7513-1.patch
>
>
> I saw a pretty weird failure on a cluster with corrupted files and this 
> particular exception really threw me off:
> {noformat}
> 2013-01-07 09:58:59,054 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=redacted., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.lang.NullPointerException: 
> empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:403)
>   at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:256)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2995)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:523)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:521)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   ... 3 more
> Caused by: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.HDFSBlocksDistribution.addHostsAndBlockWeight(HDFSBlocksDistribution.java:123)
>   at 
> org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:597)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:492)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:521)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:602)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:380)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
>   ... 8 more
> 2013-01-07 09:58:59,059 INFO 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of 
> region "redacted" failed, marking as FAILED_OPEN in ZK
> {noformat}
> This is what the code looks like:
> {code}
> if (hosts == null || hosts.length == 0) {
>  throw new NullPointerException("empty hosts");
> }
> {code}
> So {{hosts}} can exist but we send an NPE anyways? And then this is wrapped 
> in {{Store}} by:
> {code}
> } catch (ExecutionException e) {
>   throw new IOException(e.getCause());
> {code}
> FWIW there's another NPE thrown in 
> {{HDFSBlocksDistribution.addHostAndBlockWeight}} and it looks wrong.
> We should change the code to just skip computing the locality if it's missing 
> and not throw big ugly exceptions. In this case the region would fail opening 
> later anyways but at least the error message will be clearer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7513) HDFSBlocksDistribution shouldn't send NPEs when something goes wrong

2013-01-08 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547330#comment-13547330
 ] 

Elliott Clark commented on HBASE-7513:
--

The trunk version was committed as revision 1430560.  I'll get a 0.94 version up soon.

> HDFSBlocksDistribution shouldn't send NPEs when something goes wrong
> 
>
> Key: HBASE-7513
> URL: https://issues.apache.org/jira/browse/HBASE-7513
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0, 0.94.4
>Reporter: Jean-Daniel Cryans
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7513-0.patch, HBASE-7513-1.patch
>
>
> I saw a pretty weird failure on a cluster with corrupted files and this 
> particular exception really threw me off:
> {noformat}
> 2013-01-07 09:58:59,054 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=redacted., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.lang.NullPointerException: 
> empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:403)
>   at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:256)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2995)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:523)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:521)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   ... 3 more
> Caused by: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.HDFSBlocksDistribution.addHostsAndBlockWeight(HDFSBlocksDistribution.java:123)
>   at 
> org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:597)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:492)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:521)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:602)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:380)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
>   ... 8 more
> 2013-01-07 09:58:59,059 INFO 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of 
> region "redacted" failed, marking as FAILED_OPEN in ZK
> {noformat}
> This is what the code looks like:
> {code}
> if (hosts == null || hosts.length == 0) {
>  throw new NullPointerException("empty hosts");
> }
> {code}
> So {{hosts}} can exist but we send an NPE anyways? And then this is wrapped 
> in {{Store}} by:
> {code}
> } catch (ExecutionException e) {
>   throw new IOException(e.getCause());
> {code}
> FWIW there's another NPE thrown in 
> {{HDFSBlocksDistribution.addHostAndBlockWeight}} and it looks wrong.
> We should change the code to just skip computing the locality if it's missing 
> and not throw big ugly exceptions. In this case the region would fail opening 
> later anyways but at least the error message will be clearer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7479) Remove VersionedProtocol and ProtocolSignature from RPC

2013-01-08 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547321#comment-13547321
 ] 

Devaraj Das commented on HBASE-7479:


sure [~stack]. I will..

> Remove VersionedProtocol and ProtocolSignature from RPC
> ---
>
> Key: HBASE-7479
> URL: https://issues.apache.org/jira/browse/HBASE-7479
> Project: HBase
>  Issue Type: Task
>  Components: IPC/RPC
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7479.txt, 7479.txt
>
>
> Replace with an innocuous "Protocol" Interface for now.  Will minimize 
> changes doing a replacement.  Implication is that we are no longer going to 
> do special "handling" based off protocol version.  See "Handling protocol 
> versions" - http://search-hadoop.com/m/6k7GUM028E/v=threaded thread and 
> HBASE-6521 for background.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (HBASE-7474) Endpoint Implementation to support Scans with Sorting of Rows based on column values(similar to "order by" clause of RDBMS)

2013-01-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547313#comment-13547313
 ] 

Ted Yu edited comment on HBASE-7474 at 1/8/13 9:46 PM:
---

License headers in SortingClient.java and 
BigDecimalSortingColumnInterpreter.java are not properly formatted.
Some log statements, such as the following, can be at debug level.
{code}
+  log.info("Querying only one region for sorting");
{code}

{code}
+if (sortDecreasing) return instance.sortDecreasing(scan, 
columnFamily, columnQualifier,
+  colInterpreter, startIndex, pageSize, true);
+else return instance.sortIncreasing(scan, columnFamily, 
columnQualifier,
{code}
'else' keyword is not needed above.
{code}
+   * This method is used to do the merge sort the rows from multiple regions 
and produce the final output
{code}
Remove 'do the'. Wrap long line.
{code}
+for (Map.Entry regionResultsEntryMap : 
regionResultMap.entrySet()) {
{code}
regionResultsEntryMap -> regionResultsEntry or regionResultsMapEntry
{code}
+if(totalNoOfRows < startIndex)
+{
{code}
Normally the left brace is on the same line as the if statement. Insert a space 
between if and (.

currentMaxorMinValueRegion and maxOrMin are used in the if / else blocks. You 
can move them inside the if / else blocks and give them names that are clearer 
in meaning.
{code}
+for (Result[] regionResult : regionResults) {
+  if ((regionResult.length - 1) < arrayIndex[regionNum]) {
{code}
regionResults and arrayIndex are both arrays. So you can use the same index to 
access them - in my opinion the code is more readable.
{code}
+  finalResult[finalResultCurrentSize++] = 
regionResults[currentMaxorMinValueRegion][arrayIndex[currentMaxorMinValueRegion]];
{code}
Wrap long line above.
{code}
+  if (colInterpreter.compare(tmp, maxOrMin) > 0) {
{code}
If I read the code correctly, the above comparison is the major difference 
between ascending and descending sorting. A little abstraction would allow you 
to unify the two cases.
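One way to express that abstraction (a sketch, not the actual patch; variable names 
other than sortDecreasing, colInterpreter, tmp and regionNum are mine): fold the sort 
direction into a sign factor so a single comparison drives both the ascending and 
descending merge.
{code}
// Illustrative: direction is +1 for ascending, -1 for descending, so one branch
// decides whether tmp replaces the current best candidate from the region heads.
int direction = sortDecreasing ? -1 : 1;
if (direction * colInterpreter.compare(tmp, best) < 0) {
  best = tmp;               // tmp is smaller (ascending) or larger (descending)
  bestRegion = regionNum;
}
{code}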

Looking at SortingColumnInterpreter, this is the only method which is not 
present in ColumnInterpreter:
{code}
+  T getValue(KeyValue kv) throws IOException;
{code}
The following method is already provided by ColumnInterpreter:
{code}
  public abstract T getValue(byte[] colFamily, byte[] colQualifier, KeyValue kv)
  throws IOException;
{code}
Please consider dropping SortingColumnInterpreter

  was (Author: yuzhih...@gmail.com):
License headers in SortingClient.java and 
BigDecimalSortingColumnInterpreter.java are not properly formatted.
Some log statements, such as the following, can be at debug level.
{code}
+  log.info("Querying only one region for sorting");
{code}

{code}
+if (sortDecreasing) return instance.sortDecreasing(scan, 
columnFamily, columnQualifier,
+  colInterpreter, startIndex, pageSize, true);
+else return instance.sortIncreasing(scan, columnFamily, 
columnQualifier,
{code}
'else' keyword is not needed above.
{code}
+   * This method is used to do the merge sort the rows from multiple regions 
and produce the final output
{code}
Remove 'do the'. Wrap long line.
{code}
+for (Map.Entry regionResultsEntryMap : 
regionResultMap.entrySet()) {
{code}
regionResultsEntryMap -> regionResultsEntry or regionResultsMapEntry
{code}
+if(totalNoOfRows < startIndex)
+{
{code}
Normally left brace is on the same line as if statement. Insert a space between 
if and (.

currentMaxorMinValueRegion and maxOrMin are used in the if / else blocks. You 
can move them inside if / else block and give them names that are clearer in 
meaning.
{code}
+for (Result[] regionResult : regionResults) {
+  if ((regionResult.length - 1) < arrayIndex[regionNum]) {
{code}
regionResults and arrayIndex are both arrays. So you can use the same index to 
access them - in my opinion the code is more readable.
{code}
+  finalResult[finalResultCurrentSize++] = 
regionResults[currentMaxorMinValueRegion][arrayIndex[currentMaxorMinValueRegion]];
{code}
Wrap long line above.
{code}
+  if (colInterpreter.compare(tmp, maxOrMin) > 0) {
{code}
If I read the code correctly, the above comparison is the major difference 
between ascending and descending sorting. A little abstraction would allow you 
to unify the two cases.

Looking at SortingColumnInterpreter, this is the only method which is not 
present in ColumnInterpreter:
{code}
+  T getValue(KeyValue kv) throws IOException;
{code}
We're trying to reduce exposure of KeyValue. Meaning the following method is 
favored:
{code}
  public abstract T getValue(byte[] colFamily, byte[] colQualifier, KeyValue kv)
  throws IOException;
{code}
Please consider removing SortingColumnInterpreter
  
> Endpoint Implementation to support Scans with Sorting of Rows based on column 
> values(similar to "order by" clause of RDBMS)

[jira] [Commented] (HBASE-7474) Endpoint Implementation to support Scans with Sorting of Rows based on column values(similar to "order by" clause of RDBMS)

2013-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547318#comment-13547318
 ] 

Hadoop QA commented on HBASE-7474:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563810/hbase-7474-v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces lines longer than 
100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestAtomicOperation
  org.apache.hadoop.hbase.TestLocalHBaseCluster
  org.apache.hadoop.hbase.master.TestMasterFailover

 {color:red}-1 core zombie tests{color}.  There are 7 zombie test(s):   
at 
org.apache.hadoop.hbase.master.TestMasterFailover.testMasterFailoverWithMockedRITOnDeadRS(TestMasterFailover.java:833)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3934//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3934//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3934//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3934//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3934//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3934//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3934//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3934//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3934//console

This message is automatically generated.

> Endpoint Implementation to support Scans with Sorting of Rows based on column 
> values(similar to "order by" clause of RDBMS)
> ---
>
> Key: HBASE-7474
> URL: https://issues.apache.org/jira/browse/HBASE-7474
> Project: HBase
>  Issue Type: New Feature
>  Components: Coprocessors, Scanners
>Affects Versions: 0.94.3
>Reporter: Anil Gupta
>Priority: Minor
>  Labels: coprocessors, scan, sort
> Fix For: 0.94.5
>
> Attachments: hbase-7474.patch, hbase-7474-v2.patch, 
> SortingEndpoint_high_level_flowchart.pdf
>
>
> Recently, I have developed an Endpoint which can sort the Results (rows) on 
> the basis of column values. This functionality is similar to the "order by" 
> clause of an RDBMS. I will be submitting this patch for HBase 0.94.3.
> I am almost done with the initial development and testing of the feature, but I 
> still need to write the JUnits for this. I will also try to write a design doc.
> Thanks,
> Anil Gupta
> Software Engineer II, Intuit, inc

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

