[jira] [Resolved] (HBASE-6614) General cleanup/optimizations of the protobuf RPC engine & associated RPC code

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-6614.
--

Resolution: Invalid

Let me resolve this one DD as being a bit unspecific.   VersionedProtocol, 
etc., was removed already.  Gary is working on RPC engine stuff over in 
HBASE-7460.  Proxies might go in the Karthik issue.  The RPC spec will also 
require work in here.  I think we have this one covered with more specific 
issues.   Reopen if I have it wrong.

> General cleanup/optimizations of the protobuf RPC engine & associated RPC code
> --
>
> Key: HBASE-6614
> URL: https://issues.apache.org/jira/browse/HBASE-6614
> Project: HBase
>  Issue Type: Sub-task
>  Components: IPC/RPC
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.96.0
>
>
> Raising this as a placeholder for doing cleanup/optimizations of the protobuf 
> RPC engine & associated RPC code.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6651) Improve thread safety of HTablePool

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550947#comment-13550947
 ] 

stack commented on HBASE-6651:
--

Can we have the vote on HTablePool?  [~lhofhansl] You want to run it?  We 
should get rid of it in 0.96 if it's going to go.

> Improve thread safety of HTablePool
> ---
>
> Key: HBASE-6651
> URL: https://issues.apache.org/jira/browse/HBASE-6651
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.94.1
>Reporter: Hiroshi Ikeda
>Assignee: Hiroshi Ikeda
> Fix For: 0.96.0
>
> Attachments: HBASE-6651.patch, HBASE-6651-V10.patch, 
> HBASE-6651-V11.patch, HBASE-6651-V12.patch, HBASE-6651-V2.patch, 
> HBASE-6651-V3.patch, HBASE-6651-V4.patch, HBASE-6651-V5.patch, 
> HBASE-6651-V6.patch, HBASE-6651-V7.patch, HBASE-6651-V8.patch, 
> HBASE-6651-V9.patch, sample.zip, sample.zip, sharedmap_for_hbaseclient.zip
>
>
> There are some operations in HTablePool accessing PoolMap in multiple places 
> without any explicit synchronization. 
> For example HTablePool.closeTablePool() calls PoolMap.values(), and calls 
> PoolMap.remove(). If other threads add new instances to the pool in the 
> middle of the calls, the newly added instances might be dropped. 
> (HTablePool.closeTablePool() also has another problem that calling it by 
> multiple threads causes accessing HTable by multiple threads.)
> Moreover, PoolMap is not thread safe for the same reason.
> For example PoolMap.put() calls ConcurrentMap.get() and then calls 
> ConcurrentMap.put(). If other threads add a new instance to the concurrent 
> map in the middle of the calls, the new instance might be dropped.
> The implementations of Pool also have the same problems.
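The check-then-act race described above is easy to reproduce in miniature. The sketch below is illustrative, not HTablePool's actual code: `putRacy` mirrors the unsafe get-then-put pattern, and `putSafe` shows the usual fix with `putIfAbsent`.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of the race: a get-then-put on a ConcurrentMap is
// not atomic, so two threads can each create a queue for the same key,
// and one queue (with any value already added to it) is silently dropped.
public class PoolMapSketch<K, V> {
    private final ConcurrentMap<K, Queue<V>> pools = new ConcurrentHashMap<>();

    // Racy version (mirrors the problem described in PoolMap.put()):
    public void putRacy(K key, V value) {
        Queue<V> q = pools.get(key);           // threads A and B both see null
        if (q == null) {
            q = new ConcurrentLinkedQueue<>();
            pools.put(key, q);                 // B's put overwrites A's queue
        }
        q.add(value);                          // A's value can be lost
    }

    // Safe version: putIfAbsent makes the create-or-reuse step atomic.
    public void putSafe(K key, V value) {
        Queue<V> q = pools.get(key);
        if (q == null) {
            Queue<V> fresh = new ConcurrentLinkedQueue<>();
            Queue<V> prev = pools.putIfAbsent(key, fresh);
            q = (prev != null) ? prev : fresh; // reuse whichever queue won
        }
        q.add(value);
    }

    public int size(K key) {
        Queue<V> q = pools.get(key);
        return q == null ? 0 : q.size();
    }
}
```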



[jira] [Updated] (HBASE-6948) shell create table script cannot handle split key which is expressed in raw bytes

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6948:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

This patch was committed a while back.  Resolving.

> shell create table script cannot handle split key which is expressed in raw 
> bytes
> -
>
> Key: HBASE-6948
> URL: https://issues.apache.org/jira/browse/HBASE-6948
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.2
>Reporter: Ted Yu
>Assignee: Tianying Chang
> Fix For: 0.96.0
>
> Attachments: HBASE-6948.patch, HBASE-6948-trunk.patch
>
>




[jira] [Commented] (HBASE-7293) [replication] Remove dead sinks from ReplicationSource.currentPeers, it's spammy

2013-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550939#comment-13550939
 ] 

Lars Hofhansl commented on HBASE-7293:
--

Should be in 0.94 too, no?
Let's do trunk first, and then see about 0.94.

> [replication] Remove dead sinks from ReplicationSource.currentPeers, it's 
> spammy
> 
>
> Key: HBASE-7293
> URL: https://issues.apache.org/jira/browse/HBASE-7293
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3, 0.96.0
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7293-0.94.txt, 7293-0.94-v2.txt, 7293-0.96.txt
>
>
> I happened to look at a log today where I saw a lot of lines like this:
> {noformat}
> 2012-12-06 23:29:08,318 INFO 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Slave 
> cluster looks down: This server is in the failed servers list: 
> sv4r20s49/10.4.20.49:10304
> 2012-12-06 23:29:15,987 WARN 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Can't 
> replicate because of a local or network error: 
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:519)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:484)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:416)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:462)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1150)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1000)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
>   at $Proxy14.replicateLogEntries(Unknown Source)
>   at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:627)
>   at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:365)
> 2012-12-06 23:29:15,988 INFO 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Slave 
> cluster looks down: Connection refused
> {noformat}
> What struck me as weird is that this had been going on for some days; I would 
> expect the RS to find new servers if it wasn't able to replicate. But the 
> reality is that only a few of the chosen sink RSs were down, so eventually the 
> source hits one that's good and never refreshes its list of servers.
> We should remove the dead servers; it's spammy and probably adds some slave 
> lag.
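A minimal sketch of the proposed behavior, with illustrative names (not the actual ReplicationSource fields): evict a sink from the cached list when an RPC to it fails, and signal a re-selection once the list runs low.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: when shipping edits to a chosen sink fails, drop
// that sink from the cached list instead of retrying it forever, and
// re-pick sinks from the slave cluster once too few remain.
public class SinkListSketch {
    private final List<String> currentPeers = new ArrayList<>();
    private final Random rng = new Random();

    public SinkListSketch(List<String> chosenSinks) {
        currentPeers.addAll(chosenSinks);
    }

    public String pickSink() {
        return currentPeers.get(rng.nextInt(currentPeers.size()));
    }

    // Called when a replicateLogEntries RPC to 'sink' fails.
    public void onSinkFailure(String sink) {
        currentPeers.remove(sink);   // stop hammering a dead server
    }

    public boolean needsRefresh() {
        // Re-select sinks when the cached list has shrunk too far.
        return currentPeers.size() < 2;
    }
}
```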



[jira] [Commented] (HBASE-7108) Don't use legal family name for system folder at region level

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550941#comment-13550941
 ] 

stack commented on HBASE-7108:
--

You up for making the suggested change, [~mbertozzi]? It would be a good one 
to get into 0.96.  Good on you.

> Don't use legal family name for system folder at region level
> -
>
> Key: HBASE-7108
> URL: https://issues.apache.org/jira/browse/HBASE-7108
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.92.2, 0.94.2, 0.96.0
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
> Fix For: 0.96.0
>
> Attachments: HBASE-7108-v0.patch
>
>
> CHANGED, was: Don't allow "recovered.edits" as legal family name
> Region directories can contain folders called "recovered.edits", log 
> splitting related.
> But there's nothing that prevents a user from creating a family with that name...
> HLog.RECOVERED_EDITS_DIR = "recovered.edits";
> HRegion.MERGEDIR = "merges"; // fixed with HBASE-6158
> SplitTransaction.SPLITDIR = "splits"; // fixed with HBASE-6158



[jira] [Commented] (HBASE-7213) Have HLog files for .META. edits only

2013-01-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550942#comment-13550942
 ] 

Hadoop QA commented on HBASE-7213:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12564369/7213-2.11.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces lines longer than 
100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestSplitTransaction

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3979//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3979//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3979//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3979//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3979//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3979//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3979//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3979//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3979//console

This message is automatically generated.

> Have HLog files for .META. edits only
> -
>
> Key: HBASE-7213
> URL: https://issues.apache.org/jira/browse/HBASE-7213
> Project: HBase
>  Issue Type: Improvement
>  Components: master, regionserver
>Reporter: Devaraj Das
>Assignee: Devaraj Das
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 7213-2.10.patch, 7213-2.11.patch, 7213-2.4.patch, 
> 7213-2.6.patch, 7213-2.8.patch, 7213-2.9.patch, 7213-in-progress.2.2.patch, 
> 7213-in-progress.2.patch, 7213-in-progress.patch
>
>
> Over on HBASE-6774, there is a discussion on separating out the edits for 
> .META. regions from the other regions' edits w.r.t where the edits are 
> written. This jira is to track an implementation of that.



[jira] [Commented] (HBASE-7122) Proper warning message when opening a log file with no entries (idle cluster)

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550938#comment-13550938
 ] 

stack commented on HBASE-7122:
--

+1 on patch.  Looking at HBASE-6804, I do not see this patch rolled in there.  
Is that so, Himanshu?  Can you make a trunk patch?  Thanks, boss.

> Proper warning message when opening a log file with no entries (idle cluster)
> -
>
> Key: HBASE-7122
> URL: https://issues.apache.org/jira/browse/HBASE-7122
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Affects Versions: 0.94.2
>Reporter: Himanshu Vashishtha
>Assignee: Himanshu Vashishtha
> Fix For: 0.96.0
>
> Attachments: HBase-7122.patch
>
>
> In case the cluster is idle and the log has rolled (offset at 0), 
> replicationSource tries to open the log and gets an EOF exception. This gets 
> printed every 10 sec until an entry is inserted into the log.
> {code}
> 2012-11-07 15:47:40,924 DEBUG regionserver.ReplicationSource 
> (ReplicationSource.java:openReader(487)) - Opening log for replication 
> c0315.hal.cloudera.com%2C40020%2C1352324202860.1352327804874 at 0
> 2012-11-07 15:47:40,926 WARN  regionserver.ReplicationSource 
> (ReplicationSource.java:openReader(543)) - 1 Got: 
> java.io.EOFException
>   at java.io.DataInputStream.readFully(DataInputStream.java:180)
>   at java.io.DataInputStream.readFully(DataInputStream.java:152)
>   at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)
>   at 
> org.apache.hadoop.io.SequenceFile$Reader.&lt;init&gt;(SequenceFile.java:1486)
>   at 
> org.apache.hadoop.io.SequenceFile$Reader.&lt;init&gt;(SequenceFile.java:1475)
>   at 
> org.apache.hadoop.io.SequenceFile$Reader.&lt;init&gt;(SequenceFile.java:1470)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.&lt;init&gt;(SequenceFileLogReader.java:55)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:175)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:716)
>   at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:491)
>   at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:290)
> 2012-11-07 15:47:40,927 WARN  regionserver.ReplicationSource 
> (ReplicationSource.java:openReader(547)) - Waited too long for this file, 
> considering dumping
> 2012-11-07 15:47:40,927 DEBUG regionserver.ReplicationSource 
> (ReplicationSource.java:sleepForRetries(562)) - Unable to open a reader, 
> sleeping 1000 times 10
> {code}
> We should reduce the log spewing in this case (or print a more informative 
> message based on the offset).
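The proposed logging change can be sketched as follows; the method and messages are hypothetical, but the idea is that an EOF on a zero-length rolled log is the expected idle-cluster case and deserves only a quiet DEBUG line, not a stack-trace WARN every retry.

```java
// Hypothetical sketch: classify an EOF while opening a WAL for
// replication based on the file length. A zero-length log on an idle
// cluster is normal; an EOF partway into a non-empty log is not.
public class EmptyLogCheckSketch {
    public static String classify(long fileLength, boolean gotEof) {
        if (gotEof && fileLength == 0) {
            // Expected on an idle cluster after a log roll: retry quietly.
            return "DEBUG: log is empty (idle cluster), will retry quietly";
        }
        if (gotEof) {
            // A truncated non-empty log is worth a loud warning.
            return "WARN: unexpected EOF in non-empty log";
        }
        return "OK";
    }
}
```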



[jira] [Updated] (HBASE-7188) Move classes into hbase-client

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7188:
-

Priority: Critical  (was: Major)

This is critical to 0.96.  What do we need to do to get this in, Mr. Elliott?

> Move classes into hbase-client
> --
>
> Key: HBASE-7188
> URL: https://issues.apache.org/jira/browse/HBASE-7188
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client, IPC/RPC
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: HBASE-7188-0.patch, HBASE-7188-1.patch, 
> HBASE-7188-2.patch
>
>




[jira] [Updated] (HBASE-7255) KV size metric went missing from StoreScanner.

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7255:
-

Priority: Critical  (was: Major)

Making this critical so we don't forget it.  Seems like a simple fix.

> KV size metric went missing from StoreScanner.
> --
>
> Key: HBASE-7255
> URL: https://issues.apache.org/jira/browse/HBASE-7255
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Elliott Clark
>Priority: Critical
> Fix For: 0.96.0
>
>
> In trunk due to the metric refactor, at least the KV size metric went missing.
> See this code in StoreScanner.java:
> {code}
> } finally {
>   if (cumulativeMetric > 0 && metric != null) {
>   }
> }
> {code}
> Just an empty if statement, where the metric used to be collected.
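A hedged sketch of what the empty block presumably should do: accumulate the per-scan byte count into a named counter when the scanner closes. The `ScannerMetricsSketch` class and its map-backed registry are stand-ins, not HBase's actual metrics API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the missing metric collection: the finally
// block's guard is kept as-is, and the body increments a named counter
// by the bytes accumulated during the scan.
public class ScannerMetricsSketch {
    private final Map<String, Long> counters = new HashMap<>();

    public void incrementCounter(String metric, long delta) {
        counters.merge(metric, delta, Long::sum);
    }

    public long get(String metric) {
        return counters.getOrDefault(metric, 0L);
    }

    // Mirrors the guard shown in the StoreScanner.java snippet above:
    public void flushScanMetric(String metric, long cumulativeMetric) {
        if (cumulativeMetric > 0 && metric != null) {
            incrementCounter(metric, cumulativeMetric); // was an empty block
        }
    }
}
```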



[jira] [Commented] (HBASE-7293) [replication] Remove dead sinks from ReplicationSource.currentPeers, it's spammy

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550932#comment-13550932
 ] 

stack commented on HBASE-7293:
--

[~lhofhansl] +1 for trunk.

> [replication] Remove dead sinks from ReplicationSource.currentPeers, it's 
> spammy
> 
>
> Key: HBASE-7293
> URL: https://issues.apache.org/jira/browse/HBASE-7293
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3, 0.96.0
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7293-0.94.txt, 7293-0.94-v2.txt, 7293-0.96.txt
>
>



[jira] [Commented] (HBASE-7295) Contention in HBaseClient.getConnection

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550929#comment-13550929
 ] 

stack commented on HBASE-7295:
--

[~ghelmling] Work in HBASE-7460 could fix this when it removes the cache of 
clients?

> Contention in HBaseClient.getConnection
> ---
>
> Key: HBASE-7295
> URL: https://issues.apache.org/jira/browse/HBASE-7295
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.94.3
>Reporter: Varun Sharma
>Assignee: Varun Sharma
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7295-0.94.txt, 7295-0.94-v2.txt, 7295-0.94-v3.txt, 
> 7295-0.94-v4.txt, 7295-0.94-v5.txt, 7295-trunk.txt, 7295-trunk.txt, 
> 7295-trunk-v2.txt, 7295-trunk-v3.txt, 7295-trunk-v3.txt
>
>
> HBaseClient.getConnection() synchronizes on the connections object. We found 
> severe contention on a thrift gateway which was fanning out roughly 3000+ 
> calls per second to hbase region servers. The thrift gateway had 2000+ 
> threads for handling incoming connections. Threads were blocked on the 
> synchronized block - we set ipc.pool.size to 200. Since we are using a 
> RoundRobin/ThreadLocal pool only, it's not necessary to synchronize on 
> connections. It might lead to cases where we go slightly over 
> ipc.max.pool.size(), but the additional connections would time out after 
> maxIdleTime - the underlying PoolMap connections object is thread safe.
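The fix the reporter describes, dropping the coarse lock and accepting a briefly oversized pool, can be sketched with a `ConcurrentMap`. Names are illustrative; this is not the actual HBaseClient code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: replace a global synchronized(connections) block
// around get-or-create with a lock-free ConcurrentMap. Only threads
// racing on the *same* remoteId contend; everyone else proceeds freely.
public class ConnectionCacheSketch {
    static class Connection {
        final String remoteId;
        Connection(String remoteId) { this.remoteId = remoteId; }
    }

    private final ConcurrentMap<String, Connection> connections =
        new ConcurrentHashMap<>();

    public Connection getConnection(String remoteId) {
        // Atomic create-or-reuse without a coarse lock.
        return connections.computeIfAbsent(remoteId, Connection::new);
    }
}
```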



[jira] [Updated] (HBASE-7315) Remove support for client-side RowLocks

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7315:
-

Priority: Critical  (was: Major)

Marking critical so we don't forget about it.

> Remove support for client-side RowLocks
> ---
>
> Key: HBASE-7315
> URL: https://issues.apache.org/jira/browse/HBASE-7315
> Project: HBase
>  Issue Type: Sub-task
>  Components: Transactions/MVCC
>Reporter: Gregory Chanan
>Assignee: Gregory Chanan
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: HBASE-7315.patch, HBASE-7315-v2.patch, 
> HBASE-7315-v3.patch, HBASE-7315-v4.patch, HBASE-7315-v5.patch, 
> HBASE-7315-v6.patch
>
>
> See comments in HBASE-7263.



[jira] [Commented] (HBASE-7315) Remove support for client-side RowLocks

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550927#comment-13550927
 ] 

stack commented on HBASE-7315:
--

[~gchanan] You think the above failures are because of your patch?  Would it 
take much to get this in?  Would be sweet to have this in trunk.  Good on you.

> Remove support for client-side RowLocks
> ---
>
> Key: HBASE-7315
> URL: https://issues.apache.org/jira/browse/HBASE-7315
> Project: HBase
>  Issue Type: Sub-task
>  Components: Transactions/MVCC
>Reporter: Gregory Chanan
>Assignee: Gregory Chanan
> Fix For: 0.96.0
>
> Attachments: HBASE-7315.patch, HBASE-7315-v2.patch, 
> HBASE-7315-v3.patch, HBASE-7315-v4.patch, HBASE-7315-v5.patch, 
> HBASE-7315-v6.patch
>
>
> See comments in HBASE-7263.



[jira] [Commented] (HBASE-7365) Safer table creation and deletion using .tmp dir

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550924#comment-13550924
 ] 

stack commented on HBASE-7365:
--

+1 for trunk.  No harm letting hadoopqa have a go at v3 before commit.

> Safer table creation and deletion using .tmp dir
> 
>
> Key: HBASE-7365
> URL: https://issues.apache.org/jira/browse/HBASE-7365
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
> Fix For: 0.96.0
>
> Attachments: HBASE-7365-v0.patch, HBASE-7365-v1.patch, 
> HBASE-7365-v2.patch, HBASE-7365-v3.patch
>
>
> Currently tables are created in the root directory, and the removal works on 
> the root directory.
> Change the code to use a /hbase/.tmp directory to make creation and 
> removal a bit safer.
> Table Creation steps
>  * Create the table descriptor (table folder, in /hbase/.tmp/)
>  * Create the table regions (always in temp)
>  * Move the table from temp to the root folder
>  * Add the regions to meta
>  * Trigger assignment
>  * Set enable flag in ZooKeeper
> Table Deletion steps
>  * Wait for regions in transition
>  * Remove regions from meta (use bulk delete)
>  * Move the table in /hbase/.tmp
>  * Remove the table from the descriptor cache
>  * Remove table from zookeeper
>  * Archive the table
> The main changes in the current code are:
>  * Writing to /hbase/.tmp and then rename
>  * using bulk delete in DeletionTableHandler
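The create-in-tmp-then-rename pattern from the steps above can be sketched with `java.nio` standing in for HBase's FileSystem API; paths and file names here are illustrative. The point is that the final `move` is the single step at which the table becomes visible, so a crash mid-creation leaves debris only under `.tmp`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of safer table creation: build everything under
// a .tmp staging directory, then publish it with one rename.
public class TmpDirCreateSketch {
    public static Path createTable(Path rootDir, String table) throws IOException {
        Path tmp = rootDir.resolve(".tmp").resolve(table);
        Files.createDirectories(tmp);
        // Stage the table descriptor and region directories in .tmp.
        Files.writeString(tmp.resolve(".tableinfo"), table);
        Files.createDirectories(tmp.resolve("region-00000"));
        // Single visible step: move the fully built table into place.
        Path finalDir = rootDir.resolve(table);
        Files.move(tmp, finalDir);
        return finalDir;
    }
}
```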



[jira] [Commented] (HBASE-7468) TestSplitTransactionOnCluster hangs frequently

2013-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550925#comment-13550925
 ] 

Lars Hofhansl commented on HBASE-7468:
--

Are you saying we should put the infinite timeouts back? These hangs did 
happen before I removed them (in an attempt to pinpoint where it actually 
hangs).

Is this a split racing with a disable of the table?
It seems like the main problem is disabling the table in a finally clause, 
which would try to disable it even when some of the assertions fail.
We could just remove the try/finally and clean up the table only after 
successful test execution... At least that would not hide the problem as it 
does now.
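Lars's suggestion, running cleanup only after the assertions pass so a failure surfaces instead of hanging in a finally block that disables the table, looks roughly like this sketch (JUnit machinery omitted, names illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: cleanup is deliberately NOT in a finally block,
// so a failing assertion propagates immediately instead of being masked
// by a hanging disableTable() call during cleanup.
public class CleanupSketch {
    final List<String> log = new ArrayList<>();

    void runTest(boolean assertionsPass) {
        log.add("create");
        if (!assertionsPass) {
            throw new AssertionError("test failed"); // surfaces immediately
        }
        log.add("disable"); // reached only on success
    }
}
```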


> TestSplitTransactionOnCluster hangs frequently
> --
>
> Key: HBASE-7468
> URL: https://issues.apache.org/jira/browse/HBASE-7468
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Attachments: 7468-jstack.txt, 7468-output.zip, 
> TestSplitTransactionOnCluster-jstack.txt
>
>
> This what I saw once in a local build.
> {code}
> java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hbase.client.HBaseAdmin.disableTable(HBaseAdmin.java:831)
> at 
> org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testShouldClearRITWhenNodeFoundInSplittingState(TestSplitTransactionOnCluster.java:650)
> {code}



[jira] [Commented] (HBASE-7213) Have HLog files for .META. edits only

2013-01-10 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550923#comment-13550923
 ] 

Devaraj Das commented on HBASE-7213:


Sorry, forgot to mention that I took into consideration the last 2-3 comments 
from [~stack] in the last patch (and in the process, removed one unnecessary 
splitMetaLog).

> Have HLog files for .META. edits only
> -
>
> Key: HBASE-7213
> URL: https://issues.apache.org/jira/browse/HBASE-7213
> Project: HBase
>  Issue Type: Improvement
>  Components: master, regionserver
>Reporter: Devaraj Das
>Assignee: Devaraj Das
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 7213-2.10.patch, 7213-2.11.patch, 7213-2.4.patch, 
> 7213-2.6.patch, 7213-2.8.patch, 7213-2.9.patch, 7213-in-progress.2.2.patch, 
> 7213-in-progress.2.patch, 7213-in-progress.patch
>
>
> Over on HBASE-6774, there is a discussion on separating out the edits for 
> .META. regions from the other regions' edits w.r.t where the edits are 
> written. This jira is to track an implementation of that.



[jira] [Updated] (HBASE-7213) Have HLog files for .META. edits only

2013-01-10 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-7213:
---

Attachment: 7213-2.11.patch

Hopefully my last rebase :-)

> Have HLog files for .META. edits only
> -
>
> Key: HBASE-7213
> URL: https://issues.apache.org/jira/browse/HBASE-7213
> Project: HBase
>  Issue Type: Improvement
>  Components: master, regionserver
>Reporter: Devaraj Das
>Assignee: Devaraj Das
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 7213-2.10.patch, 7213-2.11.patch, 7213-2.4.patch, 
> 7213-2.6.patch, 7213-2.8.patch, 7213-2.9.patch, 7213-in-progress.2.2.patch, 
> 7213-in-progress.2.patch, 7213-in-progress.patch
>
>
> Over on HBASE-6774, there is a discussion on separating out the edits for 
> .META. regions from the other regions' edits w.r.t where the edits are 
> written. This jira is to track an implementation of that.



[jira] [Resolved] (HBASE-7427) Check line lenghts in the test-patch script

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-7427.
--

Resolution: Fixed

Re-resolving.  That seems to be what you fellas want (Jon and Enis).

> Check line lenghts in the test-patch script
> ---
>
> Key: HBASE-7427
> URL: https://issues.apache.org/jira/browse/HBASE-7427
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.96.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.96.0
>
> Attachments: hbase-7427_v1.patch
>
>
> Checkstyle is disabled in test-patch, and it is not very easy to make it 
> work. We can just add a check for the line lengths in the meantime. 



[jira] [Updated] (HBASE-7450) orphan RPC connection in HBaseClient leaves "null" out member, causing NPE in HCM

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7450:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Resolving.  Was committed to trunk a while back.

> orphan RPC connection in HBaseClient leaves "null" out member, causing NPE in 
> HCM
> -
>
> Key: HBASE-7450
> URL: https://issues.apache.org/jira/browse/HBASE-7450
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 0.92.0
>Reporter: Zavier Gao
>Assignee: Zavier Gao
> Fix For: 0.96.0
>
> Attachments: 7450_v1.txt, 7450_v2.txt, 7450_v3.txt
>
>
> Just like: https://issues.apache.org/jira/browse/HADOOP-7428
> Exceptions other than IOException thrown in setupIOstreams would leave the 
> connection half-setup, but the connection would not close until it timed out. 
> The orphan connection causes an NPE when it is used in HCM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7506) Judgment of carrying ROOT/META will become wrong when expiring server

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7506:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

This was committed to trunk and 0.94.  Resolving.

> Judgment of carrying ROOT/META will become wrong when expiring server
> -
>
> Key: HBASE-7506
> URL: https://issues.apache.org/jira/browse/HBASE-7506
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7506-94.patch, 7506-trunk v1.patch, 7506-trunkv1.patch, 
> 7506-trunkv2.patch
>
>
> We check whether the server is carrying ROOT/META when expiring the server.
> See ServerManager#expireServer.
> If the dead server is carrying META, we assign META directly in the process of 
> ServerShutdownHandler.
> If the dead server is carrying ROOT, we offline ROOT and then call 
> verifyAndAssignRootWithRetries().
> How does the judgment of carrying ROOT/META become wrong?
> If the region is in RIT, isCarryingRegion() returns true after addressing from 
> zk.
> However, once the RIT times out (could be caused by this.allRegionServersOffline && 
> !noRSAvailable, see AssignmentManager#TimeoutMonitor) and we assign it 
> elsewhere, this judgment becomes wrong.
> See AssignmentManager#isCarryingRegion for details.
> With the wrong judgment of carrying ROOT/META, we would assign ROOT/META 
> twice.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94

2013-01-10 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550914#comment-13550914
 ] 

ramkrishna.s.vasudevan commented on HBASE-7521:
---

Thanks Rajesh.
@Sergey,
The above condition of the assign retry and SSH also trying to assign is what I 
was mentioning.  Anyway, Rajesh has explained it in more detail.

> fix HBASE-6060 (regions stuck in opening state) in 0.94
> ---
>
> Key: HBASE-7521
> URL: https://issues.apache.org/jira/browse/HBASE-7521
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7521-v0.patch, HBASE-7521-v1.patch
>
>
> Discussion in HBASE-6060 implies that the fix there does not work on 0.94. 
> Still, we may want to fix the issue in 0.94 (via a different fix) because 
> regions stuck in opening for ridiculous amounts of time are not a good 
> thing to have.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7318) Add verbose logging option to HConnectionManager

2013-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550909#comment-13550909
 ] 

Hudson commented on HBASE-7318:
---

Integrated in HBase-TRUNK #3728 (See 
[https://builds.apache.org/job/HBase-TRUNK/3728/])
HBASE-7318 Add verbose logging option to HConnectionManager (Revision 
1431894)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/RetriesExhaustedWithDetailsException.java


> Add verbose logging option to HConnectionManager
> 
>
> Key: HBASE-7318
> URL: https://issues.apache.org/jira/browse/HBASE-7318
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 7318-v2.patch, HBASE-7318-v0.patch, HBASE-7318-v1.patch, 
> HBASE-7318-v3.patch
>
>
> In the course of HBASE-7250 I found that client-side errors (as well as 
> server-side errors, but that's another question) are hard to debug.
> I have some local commits with useful, not-that-hacky HConnectionManager 
> logging added.
> Need to "productionize" it to be off by default but easy-to-enable for 
> debugging.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7506) Judgment of carrying ROOT/META will become wrong when expiring server

2013-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550910#comment-13550910
 ] 

Hudson commented on HBASE-7506:
---

Integrated in HBase-TRUNK #3728 (See 
[https://builds.apache.org/job/HBase-TRUNK/3728/])
HBASE-7506 Judgment of carrying ROOT/META will become wrong when expiring 
server (Chunhui) (Revision 1431901)

 Result = FAILURE
zjushch : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java


> Judgment of carrying ROOT/META will become wrong when expiring server
> -
>
> Key: HBASE-7506
> URL: https://issues.apache.org/jira/browse/HBASE-7506
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7506-94.patch, 7506-trunk v1.patch, 7506-trunkv1.patch, 
> 7506-trunkv2.patch
>
>
> We check whether the server is carrying ROOT/META when expiring the server.
> See ServerManager#expireServer.
> If the dead server is carrying META, we assign META directly in the process of 
> ServerShutdownHandler.
> If the dead server is carrying ROOT, we offline ROOT and then call 
> verifyAndAssignRootWithRetries().
> How does the judgment of carrying ROOT/META become wrong?
> If the region is in RIT, isCarryingRegion() returns true after addressing from 
> zk.
> However, once the RIT times out (could be caused by this.allRegionServersOffline && 
> !noRSAvailable, see AssignmentManager#TimeoutMonitor) and we assign it 
> elsewhere, this judgment becomes wrong.
> See AssignmentManager#isCarryingRegion for details.
> With the wrong judgment of carrying ROOT/META, we would assign ROOT/META 
> twice.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7365) Safer table creation and deletion using .tmp dir

2013-01-10 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-7365:
---

Attachment: HBASE-7365-v3.patch

v3 fixes the javadoc and line-length issues

> Safer table creation and deletion using .tmp dir
> 
>
> Key: HBASE-7365
> URL: https://issues.apache.org/jira/browse/HBASE-7365
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
> Fix For: 0.96.0
>
> Attachments: HBASE-7365-v0.patch, HBASE-7365-v1.patch, 
> HBASE-7365-v2.patch, HBASE-7365-v3.patch
>
>
> Currently tables are created in the root directory, and the removal works on 
> the root directory.
> Change the code to use a /hbase/.tmp directory to make the creation and 
> removal a bit safer
> Table Creation steps
>  * Create the table descriptor (table folder, in /hbase/.tmp/)
>  * Create the table regions (always in temp)
>  * Move the table from temp to the root folder
>  * Add the regions to meta
>  * Trigger assignment
>  * Set enable flag in ZooKeeper
> Table Deletion steps
>  * Wait for regions in transition
>  * Remove regions from meta (use bulk delete)
>  * Move the table in /hbase/.tmp
>  * Remove the table from the descriptor cache
>  * Remove table from zookeeper
>  * Archive the table
> The main changes in the current code are:
>  * Writing to /hbase/.tmp and then rename
>  * using bulk delete in DeletionTableHandler
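The create-in-`.tmp`-then-rename pattern described above can be sketched with the local filesystem standing in for HDFS (the paths, class name, and `.tableinfo` placeholder below are illustrative assumptions, not the actual master code):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TmpDirCreate {
    // Build the whole table layout under <root>/.tmp/<table>, then move it into
    // place with a single rename, so readers never observe a half-built table.
    static Path createTable(Path root, String table) throws IOException {
        Path tmp = root.resolve(".tmp").resolve(table);
        Files.createDirectories(tmp);                        // everything staged under .tmp
        Files.write(tmp.resolve(".tableinfo"), new byte[0]); // placeholder table descriptor
        Path fin = root.resolve(table);
        Files.move(tmp, fin);   // one rename; atomic on HDFS and POSIX filesystems
        return fin;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("hbase");
        Path t = createTable(root, "t1");
        System.out.println(Files.exists(t.resolve(".tableinfo"))); // prints "true"
    }
}
```

Deletion runs the same trick in reverse: move the live table directory into `.tmp` first, so a crash mid-delete leaves no partially removed table under the root.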

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-4709) Hadoop metrics2 setup in test MiniDFSClusters spewing JMX errors

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4709:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

This was committed to trunk a while back.  Resolving.

> Hadoop metrics2 setup in test MiniDFSClusters spewing JMX errors
> 
>
> Key: HBASE-4709
> URL: https://issues.apache.org/jira/browse/HBASE-4709
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.90.4, 0.92.0, 0.94.0, 0.94.1, 0.96.0
>Reporter: Gary Helmling
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 4709.v2.patch, 4709.v2.patch, 4709_workaround.v1.patch
>
>
> Since switching over HBase to build with Hadoop 0.20.205.0, we've been 
> getting a lot of metrics related errors in the log files for tests:
> {noformat}
> 2011-10-30 22:00:22,858 INFO  [main] log.Slf4jLog(67): jetty-6.1.26
> 2011-10-30 22:00:22,871 INFO  [main] log.Slf4jLog(67): Extract 
> jar:file:/home/jenkins/.m2/repository/org/apache/hadoop/hadoop-core/0.20.205.0/hadoop-core-0.20.205.0.jar!/webapps/datanode
>  to /tmp/Jetty_localhost_55751_datanode.kw16hy/webapp
> 2011-10-30 22:00:23,048 INFO  [main] log.Slf4jLog(67): Started 
> SelectChannelConnector@localhost:55751
> Starting DataNode 1 with dfs.data.dir: 
> /home/jenkins/jenkins-slave/workspace/HBase-TRUNK/trunk/target/test-data/7ba65a16-03ad-4624-b769-57405945ef58/dfscluster_3775fc23-1b51-4966-8133-205564bae762/dfs/data/data3,/home/jenkins/jenkins-slave/workspace/HBase-TRUNK/trunk/target/test-data/7ba65a16-03ad-4624-b769-57405945ef58/dfscluster_3775fc23-1b51-4966-8133-205564bae762/dfs/data/data4
> 2011-10-30 22:00:23,237 WARN  [main] impl.MetricsSystemImpl(137): Metrics 
> system not started: Cannot locate configuration: tried 
> hadoop-metrics2-datanode.properties, hadoop-metrics2.properties
> 2011-10-30 22:00:23,237 WARN  [main] util.MBeans(59): 
> Hadoop:service=DataNode,name=MetricsSystem,sub=Control
> javax.management.InstanceAlreadyExistsException: MXBean already registered 
> with name Hadoop:service=NameNode,name=MetricsSystem,sub=Control
>   at 
> com.sun.jmx.mbeanserver.MXBeanLookup.addReference(MXBeanLookup.java:120)
>   at 
> com.sun.jmx.mbeanserver.MXBeanSupport.register(MXBeanSupport.java:143)
>   at 
> com.sun.jmx.mbeanserver.MBeanSupport.preRegister2(MBeanSupport.java:183)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:941)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:917)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:312)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:482)
>   at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:56)
>   at 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.initSystemMBean(MetricsSystemImpl.java:500)
>   at 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.init(MetricsSystemImpl.java:140)
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.init(DefaultMetricsSystem.java:40)
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.initialize(DefaultMetricsSystem.java:50)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1483)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1459)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:417)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:280)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster(HBaseTestingUtility.java:349)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:518)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:474)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:461)
> {noformat}
> This seems to be due to errors initializing the new hadoop metrics2 code by 
> default, when running in a mini cluster.  The errors themselves seem to be 
> harmless -- they're not breaking any tests -- but we should figure out what 
> configuration we need to eliminate them.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7258) Hbase needs to create baseZNode recursively

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7258:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Resolving.  Was committed to trunk a while back.

> Hbase needs to create baseZNode recursively
> ---
>
> Key: HBASE-7258
> URL: https://issues.apache.org/jira/browse/HBASE-7258
> Project: HBase
>  Issue Type: Bug
>  Components: Zookeeper
>Affects Versions: 0.94.2
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
>  Labels: zookeeper, zookeeper.znode.parent
> Fix For: 0.96.0
>
> Attachments: HBASE-7258-0.94.patch, HBASE-7258.diff, HBASE-7258.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> In a deployment, multiple small HBase clusters may share the same zk cluster. So, 
> for hbase cluster1, its parent znode is /hbase/cluster1. But in HBase version 
> 0.94.1, HBase uses ZKUtil.createAndFailSilent(this, baseZNode) to create the 
> parent path, and it throws a NoNode exception if the znode /hbase does not exist.
> We want to change it to ZKUtil.createWithParents(this, baseZNode) to support 
> creating the baseZNode recursively. 
> The NoNode exception is:
> java.lang.RuntimeException: Failed construction of Master: class 
> org.apache.hadoop.hbase.master.HMaster
> at 
> org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1792)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:146)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:77)
> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1806)
> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
> KeeperErrorCode = NoNode for /hbase/cluster1
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:778)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:420)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:402)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:905)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:166)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:159)
> at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:282)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at org.apache.hadoop.hbase.master.H
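The recursive-create behavior the issue asks for (`ZKUtil.createWithParents`) amounts to creating each ancestor of the base znode in turn, ignoring nodes that already exist. A minimal sketch, with a `Set` standing in for the ZooKeeper namespace (the real code calls `zk.create` and swallows `NodeExistsException`; the class and method names here are illustrative):

```java
import java.util.Set;
import java.util.TreeSet;

public class CreateWithParents {
    // Create every ancestor of 'path' ("/hbase/cluster1" needs "/hbase" first).
    // A no-op for nodes that already exist, mirroring create-and-fail-silent
    // semantics at each level.
    static void createWithParents(Set<String> zk, String path) {
        StringBuilder cur = new StringBuilder();
        for (String part : path.substring(1).split("/")) {
            cur.append('/').append(part);
            zk.add(cur.toString());   // Set.add is a no-op if the node exists
        }
    }

    public static void main(String[] args) {
        Set<String> zk = new TreeSet<>();
        createWithParents(zk, "/hbase/cluster1");
        System.out.println(zk); // prints "[/hbase, /hbase/cluster1]"
    }
}
```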

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7363) Fix javadocs warnings for hbase-server packages from master to end

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7363:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

This has been committed already.  Resolving.

> Fix javadocs warnings for hbase-server packages from master to end
> --
>
> Key: HBASE-7363
> URL: https://issues.apache.org/jira/browse/HBASE-7363
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 7363-part2.v2.patch, 7363.v1.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7387) StoreScanner need to be able to be subclassed

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7387:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Resolved.  Committed to trunk.  Thanks for the patch Raymond.

> StoreScanner need to be able to be subclassed
> -
>
> Key: HBASE-7387
> URL: https://issues.apache.org/jira/browse/HBASE-7387
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.96.0
>Reporter: Raymond Liu
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE_7387_v2.patch, StoreScanner.patch
>
>
> StoreScanner can be replaced via the preStoreScannerOpen hook with a CP. In order to 
> reuse most of the logic in the current StoreScanner, subclassing it might be the 
> best approach. Thus a lot of private members need to be changed from 
> private to protected.
> At present, in order to implement a custom StoreScanner for dot 
> (HBASE-6805), only a few of the private members need to be changed, as in the 
> attached StoreScanner.patch. Alternatively, should we change all the reasonable fields 
> from private to protected, as in HBASE-7387-v?.patch?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94

2013-01-10 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550902#comment-13550902
 ] 

rajeshbabu commented on HBASE-7521:
---

The patch works fine for OPENING (the HBASE-6060 patches also work fine in this case), 
but there are issues with PENDING_OPEN.
{code}
  for (RegionState rit : ritsNotYetOnServer) {
    if (rit.isPendingOpen() || rit.isOpening()) {
      LOG.info("Hijacking and reassigning " + rit.getRegion().getRegionNameAsString() +
          " that was on " + serverName + " in " + rit.getState() + " state.");
      this.services.getAssignmentManager().assign(rit.getRegion(), true, true, true);
    }
  }
{code}
Here, if we see a region in PENDING_OPEN on a dead server, we assign the 
region.
In the case of a single-region assign, if we see the server is dead we retry 
the assign to some other region server.
The main race condition can happen below, as in HBASE-5816:
{code}
if (!hijack && !state.isClosed() && !state.isOffline()) {
  if (!regionAlreadyInTransitionException) {
    String msg = "Unexpected state : " + state + " .. Cannot transit it to OFFLINE.";
    this.master.abort(msg, new IllegalStateException(msg));
    return -1;
  }
  LOG.debug("Unexpected state : " + state
      + " but retrying to assign because RegionAlreadyInTransitionException.");
}
{code}
One more thing: there is a possibility of double assignment as well.

In the case of PENDING_OPEN we are not able to decide whether to retry or skip 
the retry (thinking SSH can handle it), because there are multiple cases.
-> If the RS went down after setting the state to PENDING_OPEN, then SSH can assign the 
region (as per the patch);
at the same time the single assign also retries, because the open RPC will 
fail with a connection-refused exception.
Suppose we skip the assign retry in case of a connection-refused 
exception; then there is one problem with this approach.
The scenario is:
1) got region plan 
2) destination server went down - SSH also processed (here we will see the 
region in the offline state and skip assignment)
3) change state to PENDING_OPEN
4) then the open RPC fails with the connection-refused exception and we will 
skip the assign.
 
-> If we skip the assign in SSH, there is also a problem.
The scenario is:
1) got region plan and region in PENDING_OPEN
2) just spawned OpenRegionHandler but didn't transition to OPENING
3) single assign came out thinking the RS is up and will take care of it
4) RS went down
5) now SSH will also skip the region assignment



> fix HBASE-6060 (regions stuck in opening state) in 0.94
> ---
>
> Key: HBASE-7521
> URL: https://issues.apache.org/jira/browse/HBASE-7521
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7521-v0.patch, HBASE-7521-v1.patch
>
>
> Discussion in HBASE-6060 implies that the fix there does not work on 0.94. 
> Still, we may want to fix the issue in 0.94 (via a different fix) because 
> regions stuck in opening for ridiculous amounts of time are not a good 
> thing to have.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7434) Use generics appropriately in RPCEngine and reduce casts, with fixing a related bug of breaking thread-safety in HConnectionManager

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7434:
-

Resolution: Invalid
Status: Resolved  (was: Patch Available)

This is a good patch.  Thank you for making it, Hiroshi.  It is no longer 
applicable though, now that VersionedProtocol has been removed.  Sorry you 
worked on something that we could not use.  Please keep up your quality 
submissions... let's get the rest of your contribs in.

> Use generics appropriately in RPCEngine and reduce casts, with fixing a 
> related bug of breaking thread-safety in HConnectionManager
> ---
>
> Key: HBASE-7434
> URL: https://issues.apache.org/jira/browse/HBASE-7434
> Project: HBase
>  Issue Type: Improvement
>Reporter: Hiroshi Ikeda
>Assignee: Hiroshi Ikeda
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7434.patch, HBASE-7434-V2.patch
>
>
> In RpcEngine,
> {code}
>   VersionedProtocol getProxy(Class<? extends VersionedProtocol> protocol, ...)
> {code}
> should be
> {code}
>   <T extends VersionedProtocol> T getProxy(Class<T> protocol, ...)
> {code}
> Also, while removing casts I encountered a bug: the method 
> HConnectionManager.HConnectionImplementation.getProtocol() uses broken logic, 
> just like double-checked locking for a HashMap.
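The benefit of the generic signature can be shown in a self-contained sketch. The interfaces and the stub body below are hypothetical (the real engine builds a dynamic proxy for the requested interface); the point is only how the type parameter removes caller-side casts:

```java
public class GenericProxy {
    interface VersionedProtocol {}
    interface AdminProtocol extends VersionedProtocol {
        default String name() { return "admin"; }
    }

    // "Before": non-generic signature, so every caller must down-cast the result.
    static VersionedProtocol getProxyRaw(Class<? extends VersionedProtocol> protocol) {
        return new AdminProtocol() {};   // stub; real code returns a dynamic proxy
    }

    // "After": the type parameter lets the compiler track the concrete protocol
    // type, confining the cast to one checked Class.cast inside the engine.
    static <T extends VersionedProtocol> T getProxy(Class<T> protocol) {
        return protocol.cast(new AdminProtocol() {});   // stub, cast checked once
    }

    public static void main(String[] args) {
        AdminProtocol a = (AdminProtocol) getProxyRaw(AdminProtocol.class); // cast needed
        AdminProtocol b = getProxy(AdminProtocol.class);                    // no cast
        System.out.println(a.name() + " " + b.name()); // prints "admin admin"
    }
}
```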

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways

2013-01-10 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550900#comment-13550900
 ] 

Anoop Sam John commented on HBASE-7034:
---

Did this code come in by mistake?
{code}
RecoverableZooKeeper#setData(String path, byte[] data, int version){

  byte[] revData = zk.getData(path, false, stat);
  int idLength = Bytes.toInt(revData, ID_LENGTH_SIZE);
  int dataLength = revData.length-ID_LENGTH_SIZE-idLength;
  int dataOffset = ID_LENGTH_SIZE+idLength;
  
  if(Bytes.compareTo(revData, ID_LENGTH_SIZE, id.length, 
  revData, dataOffset, dataLength) == 0) {
// the bad version is caused by previous successful setData
return stat;
  }
}
{code}
When we write data to zk, we write an identifier for the process. Here, in 
order to check whether the BADVERSION exception from zookeeper is due to a 
previous setData (from the same process), we need to compare the id read from 
zookeeper with the id for this process (this.id). Or am I missing something? 
The above offset/length calculation and compare look problematic to me.

If so, I guess this is the cause of this bug.

From the log it is clear that there is no problem wrt the node and version at 
first. [As part of the transition of state from OPENING to OPENED, first the 
present data is read, and the check shows the data and its version are 
fine.] Immediately afterwards a connection loss happened. This triggered a retry 
of the setData. Maybe the previous operation changed the data in 
zookeeper and the master got the data-changed event. (?)

I think correcting the above code may solve the problems.
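The id-prefixed znode layout under discussion, and the comparison the comment suggests (stored id versus this process's own id, rather than the stored data against itself), can be sketched in isolation. The class and method names are hypothetical; only the `[4-byte id length][id][payload]` layout is taken from the quoted code:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class ZkDataId {
    // Encode znode data as [4-byte id length][id bytes][payload].
    static byte[] encode(byte[] id, byte[] payload) {
        return ByteBuffer.allocate(4 + id.length + payload.length)
                .putInt(id.length).put(id).put(payload).array();
    }

    // On BADVERSION, decide whether the previous (retried) setData from *this*
    // process already succeeded: compare the id stored in the znode with our
    // own id, not the stored data against itself.
    static boolean writtenByUs(byte[] znodeData, byte[] ourId) {
        ByteBuffer buf = ByteBuffer.wrap(znodeData);
        byte[] storedId = new byte[buf.getInt()];
        buf.get(storedId);
        return Arrays.equals(storedId, ourId);
    }

    public static void main(String[] args) {
        byte[] id = "proc-1".getBytes();
        byte[] data = encode(id, "OPENED".getBytes());
        System.out.println(writtenByUs(data, id));                  // prints "true"
        System.out.println(writtenByUs(data, "proc-2".getBytes())); // prints "false"
    }
}
```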

> Bad version, failed OPENING to OPENED but master thinks it is open anyways
> --
>
> Key: HBASE-7034
> URL: https://issues.apache.org/jira/browse/HBASE-7034
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.94.2
>Reporter: stack
>
> I have this in RS log:
> {code}
> 2012-10-22 02:21:50,698 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed 
> transitioning node 
> b9,\xEE\xAE\x9BiQO\x89]+a\xE0\x7F\xB7'X?,1349052737638.9af7cfc9b15910a0b3d714bf40a3248f.
>  from OPENING to OPENED -- closing region
> org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = 
> BadVersion for /hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f
> {code}
> Master says this (it is bulk assigning):
> {code}
> 
> 2012-10-22 02:21:40,673 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: 
> master:10302-0xb3a862e57a503ba Set watcher on existing znode 
> /hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f
> ...
> then this
> 
> 2012-10-22 02:23:47,089 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: 
> master:10302-0xb3a862e57a503ba Set watcher on existing znode 
> /hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f
> 
> 2012-10-22 02:24:34,176 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: 
> master:10302-0xb3a862e57a503ba Retrieved 112 byte(s) of data from znode 
> /hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f and set watcher; 
> region=b9,\xEE\xAE\x9BiQO\x89]+a\xE0\x7F\xB7'X?,1349052737638.9af7cfc9b15910a0b3d714bf40a3248f.,
>  origin=sv4r17s44,10304,1350872216778, state=RS_ZK_REGION_OPENED
> etc.
> {code}
> Disagreement as to what is going on here.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7441) Make ClusterManager in IntegrationTestingUtility pluggable

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7441:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

This has been committed.  Resolving.

> Make ClusterManager in IntegrationTestingUtility pluggable
> --
>
> Key: HBASE-7441
> URL: https://issues.apache.org/jira/browse/HBASE-7441
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.3
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
>  Labels: newbie, patch
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7441-0.94-v1.patch, HBASE-7441-trunk-v1.patch, 
> HBASE-7441-trunk-v2.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> After the patch for HBASE-7009, we can use ChaosMonkey to test the HBase cluster.
> The ClusterManager uses ssh to stop/start the RS or master without a password. To 
> support other cluster-management tools, we need to make the ClusterManager in 
> IntegrationTestingUtility pluggable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7486) master pid file is not getting removed if we stop hbase from stop-hbase.sh

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7486:
-

   Resolution: Fixed
Fix Version/s: (was: 0.94.5)
   (was: 0.92.3)
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Tried it.  Indeed the pid sticks around.  Committed to trunk.  Thanks for the 
patch Rajeshbabu.  Did not apply to 0.94 since it is not a major issue.

> master pid file is not getting removed if we stop hbase from stop-hbase.sh
> --
>
> Key: HBASE-7486
> URL: https://issues.apache.org/jira/browse/HBASE-7486
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Reporter: rajeshbabu
>Assignee: rajeshbabu
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7486_92.patch, HBASE-7486_94.patch, 
> HBASE-7486_trunk.patch
>
>
> In the stop-hbase.sh script we are not removing the master pid file after 
> master termination.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HBASE-7512) Document the findbugs library annotation

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-7512.
--

  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to trunk.

> Document the findbugs library annotation
> 
>
> Key: HBASE-7512
> URL: https://issues.apache.org/jira/browse/HBASE-7512
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 7512.txt
>
>
> See HBASE-7508



[jira] [Updated] (HBASE-7512) Document the findbugs library annotation

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7512:
-

Attachment: 7512.txt

Here is what I committed.  Also made note that Writables are pre-0.96.

Thanks for the doc. Nkeywal.

> Document the findbugs library annotation
> 
>
> Key: HBASE-7512
> URL: https://issues.apache.org/jira/browse/HBASE-7512
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 7512.txt
>
>
> See HBASE-7508



[jira] [Commented] (HBASE-7533) Write an RPC Specification for 0.96

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550889#comment-13550889
 ] 

stack commented on HBASE-7533:
--

[~eclark] Well, we need to be able to pick through a multi response and 
correlate exception and request that caused it so yeah, an exceptionResponse 
(with an 'exception' addition to the enum...).  Or were you thinking something 
different?

> Write an RPC Specification for 0.96
> ---
>
> Key: HBASE-7533
> URL: https://issues.apache.org/jira/browse/HBASE-7533
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.96.0
>
>
> RPC format is changing for 0.96 to accommodate our protobufing all around.  
> Here is a first cut.  Please shred: 
> https://docs.google.com/document/d/1-1RJMLXzYldmHgKP7M7ynK6euRpucD03fZ603DlZfGI/edit



[jira] [Commented] (HBASE-7403) Online Merge

2013-01-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550885#comment-13550885
 ] 

Ted Yu commented on HBASE-7403:
---

+1 on v10

> Online Merge
> 
>
> Key: HBASE-7403
> URL: https://issues.apache.org/jira/browse/HBASE-7403
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7403-trunkv5.patch, 7403-trunkv6.patch, 7403v5.diff, 
> 7403-v5.txt, 7403v5.txt, hbase-7403-94v1.patch, hbase-7403-trunkv10.patch, 
> hbase-7403-trunkv1.patch, hbase-7403-trunkv5.patch, hbase-7403-trunkv6.patch, 
> hbase-7403-trunkv7.patch, hbase-7403-trunkv8.patch, hbase-7403-trunkv9.patch, 
> merge region.pdf
>
>
> The features of this online merge:
> 1. Online; no need to disable the table
> 2. Few changes to current code; could be applied to trunk, 0.94, 0.92 or 0.90
> 3. Easy to issue a merge request; no need to input a long region name, the 
> encoded name is enough
> 4. No limits on when to operate; you don't need to take care of events like 
> Server Dead, Balance, Split, or Disabling/Enabling a table, nor whether you 
> sent a wrong merge request, as it is all handled for you
> 5. Only a little offline time for the two merging regions
> We need merge in the following cases:
> 1. Region hole or region overlap that can't be fixed by hbck
> 2. Region becomes empty because of TTL and an unreasonable rowkey design
> 3. Region is always empty or very small because of a presplit at table 
> creation
> 4. Too many empty or small regions reduce system performance (e.g. MSLAB)
> Current merge tools only work offline and cannot redo if an exception is 
> thrown in the process of merging, leaving dirty data.
> For an online system, we need an online merge.
> The implementation logic of this patch for Online Merge is:
> For example, merge regionA and regionB into regionC:
> 1. Offline the two regions A and B
> 2. Merge the two regions in HDFS (create regionC's directory, move regionA's 
> and regionB's files to regionC's directory, delete regionA's and regionB's 
> directories)
> 3. Add the merged regionC to .META.
> 4. Assign the merged regionC
> By design, once we do the merge work in HDFS, we can redo it until successful 
> if it throws an exception, aborts, or the server restarts, but it cannot be 
> rolled back.
> It depends on:
> Using zookeeper to record the transaction journal state, making redo easier
> Using zookeeper to send/receive merge requests
> The merge transaction being executed on the master
> Support for calling the merge request through the API or shell tool
> About the merge process, please see the attachment and patch
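
The redo-until-successful behaviour described in the steps above can be sketched as a roll-forward journal: each step is durably recorded (the patch uses zookeeper for this) before it runs, so a restarted master resumes at the next step rather than rolling back. The step and class names below are illustrative, not the patch's actual code:

```java
import java.util.ArrayList;
import java.util.List;

// The four merge steps, in execution order.
enum MergeStep { OFFLINE_REGIONS, MERGE_IN_HDFS, UPDATE_META, ASSIGN_MERGED }

public class MergeRedo {
  // Return the steps still to run, given the last step durably journaled
  // (null means nothing was journaled, so the whole transaction runs).
  static List<MergeStep> remainingSteps(MergeStep lastDurable) {
    List<MergeStep> remaining = new ArrayList<>();
    boolean seen = (lastDurable == null);
    for (MergeStep s : MergeStep.values()) {
      if (seen) remaining.add(s);
      if (s == lastDurable) seen = true;
    }
    return remaining;
  }

  public static void main(String[] args) {
    // A master restarting after the HDFS merge completed resumes at UPDATE_META.
    System.out.println(remainingSteps(MergeStep.MERGE_IN_HDFS)); // prints [UPDATE_META, ASSIGN_MERGED]
  }
}
```

This is why the transaction can be redone but not rolled back: once files have moved in HDFS, rolling forward to the .META. update is the only consistent exit.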



[jira] [Commented] (HBASE-7403) Online Merge

2013-01-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550881#comment-13550881
 ] 

Hadoop QA commented on HBASE-7403:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12564361/hbase-7403-trunkv10.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100 characters.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestLocalHBaseCluster

 {color:red}-1 core zombie tests{color}.  There are 2 zombie test(s):   
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup.testBalancerWithRackLocality(TestBalancerWithNodeGroup.java:220)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3976//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3976//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3976//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3976//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3976//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3976//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3976//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3976//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3976//console

This message is automatically generated.

> Online Merge
> 
>
> Key: HBASE-7403
> URL: https://issues.apache.org/jira/browse/HBASE-7403
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7403-trunkv5.patch, 7403-trunkv6.patch, 7403v5.diff, 
> 7403-v5.txt, 7403v5.txt, hbase-7403-94v1.patch, hbase-7403-trunkv10.patch, 
> hbase-7403-trunkv1.patch, hbase-7403-trunkv5.patch, hbase-7403-trunkv6.patch, 
> hbase-7403-trunkv7.patch, hbase-7403-trunkv8.patch, hbase-7403-trunkv9.patch, 
> merge region.pdf
>
>
> The features of this online merge:
> 1. Online; no need to disable the table
> 2. Few changes to current code; could be applied to trunk, 0.94, 0.92 or 0.90
> 3. Easy to issue a merge request; no need to input a long region name, the 
> encoded name is enough
> 4. No limits on when to operate; you don't need to take care of events like 
> Server Dead, Balance, Split, or Disabling/Enabling a table, nor whether you 
> sent a wrong merge request, as it is all handled for you
> 5. Only a little offline time for the two merging regions
> We need merge in the following cases:
> 1. Region hole or region overlap that can't be fixed by hbck
> 2. Region becomes empty because of TTL and an unreasonable rowkey design
> 3. Region is always empty or very small because of a presplit at table 
> creation
> 4. Too many empty or small regions reduce system performance (e.g. MSLAB)
> Current merge tools only work offline and cannot redo if an exception is 
> thrown in the process of merging, leaving dirty data.
> For an online system, we need an online merge.
> The implementation logic of this patch for Online Merge is:
> For example, merge regionA and regionB into regionC:
> 1. Offline the two regions A and B
> 2. Merge the two regions in HDFS (create regionC's directory, move regionA's 
> and regionB's files to regionC's directory, delete regionA's and regionB's 
> directories)
> 3. Add the merged regionC to .META.

[jira] [Updated] (HBASE-7508) Fix simple findbugs

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7508:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Mighty Nkeywal committed this.  Resolving.

> Fix simple findbugs
> ---
>
> Key: HBASE-7508
> URL: https://issues.apache.org/jira/browse/HBASE-7508
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.96.0
>
> Attachments: 7508.v1.patch, 7508.v2.patch, 7508.v2.patch
>
>




[jira] [Commented] (HBASE-7213) Have HLog files for .META. edits only

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550873#comment-13550873
 ] 

stack commented on HBASE-7213:
--

I was trying this [~devaraj] and it fails to apply to trunk.  I tried to get it 
in but there is a big difference in this file: 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java.rej
  Mind taking a look-see, sir?

Looking at the patch again, you might move the define that is out in HConstants 
local to the wal package:

public static final String META_HLOG_FILE_EXTN = ".meta";  

Put it in HLog.java Interface?

In MasterFileSystem, should a few of the public methods have some javadoc or 
comment?  For instance, I see splitMetaLog twice but one takes a ServerName and 
another takes a list of ServerNames but their bodies do different things.  It 
is a little confusing.

Else looks good.  Let's get it in quick.  Elsewhere fellas are talking about 
removing the MetaServerShutdown handler, which would require yet another patch 
moving stuff around.  Good stuff.




> Have HLog files for .META. edits only
> -
>
> Key: HBASE-7213
> URL: https://issues.apache.org/jira/browse/HBASE-7213
> Project: HBase
>  Issue Type: Improvement
>  Components: master, regionserver
>Reporter: Devaraj Das
>Assignee: Devaraj Das
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 7213-2.10.patch, 7213-2.4.patch, 7213-2.6.patch, 
> 7213-2.8.patch, 7213-2.9.patch, 7213-in-progress.2.2.patch, 
> 7213-in-progress.2.patch, 7213-in-progress.patch
>
>
> Over on HBASE-6774, there is a discussion on separating out the edits for 
> .META. regions from the other regions' edits w.r.t where the edits are 
> written. This jira is to track an implementation of that.



[jira] [Commented] (HBASE-7507) Make memstore flush be able to retry after exception

2013-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550869#comment-13550869
 ] 

Lars Hofhansl commented on HBASE-7507:
--

You're right (looked at the patch again).
+1 then

> Make memstore flush be able to retry after exception
> 
>
> Key: HBASE-7507
> URL: https://issues.apache.org/jira/browse/HBASE-7507
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7507-trunk v1.patch, 7507-trunk v2.patch, 
> 7507-trunkv3.patch
>
>
> We will abort the regionserver if a memstore flush throws an exception.
> I think we could retry to make the regionserver more stable, because the file 
> system may be transiently unavailable, e.g. when switching namenodes in a 
> NameNode HA environment.
> {code}
> HRegion#internalFlushcache(){
> ...
> try {
> ...
> }catch(Throwable t){
> DroppedSnapshotException dse = new DroppedSnapshotException("region: " +
>   Bytes.toStringBinary(getRegionName()));
> dse.initCause(t);
> throw dse;
> }
> ...
> }
> MemStoreFlusher#flushRegion(){
> ...
> try {
> region.flushcache();
> } catch (DroppedSnapshotException ex) {
> server.abort("Replay of HLog required. Forcing server shutdown", ex);
> }
> ...
> }
> {code}



[jira] [Commented] (HBASE-7533) Write an RPC Specification for 0.96

2013-01-10 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550868#comment-13550868
 ] 

Elliott Clark commented on HBASE-7533:
--

My thought was that the union type would have an optional exception type in 
addition to the response types; that would allow us to more directly tie 
exceptions from multis to the action that caused them.  Though that might not 
be great.  What do others think ?

> Write an RPC Specification for 0.96
> ---
>
> Key: HBASE-7533
> URL: https://issues.apache.org/jira/browse/HBASE-7533
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.96.0
>
>
> RPC format is changing for 0.96 to accommodate our protobufing all around.  
> Here is a first cut.  Please shred: 
> https://docs.google.com/document/d/1-1RJMLXzYldmHgKP7M7ynK6euRpucD03fZ603DlZfGI/edit



[jira] [Comment Edited] (HBASE-7533) Write an RPC Specification for 0.96

2013-01-10 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550868#comment-13550868
 ] 

Elliott Clark edited comment on HBASE-7533 at 1/11/13 6:13 AM:
---

My thought was that the union type would have an optional exception type in 
addition to the response types; that would allow us to more directly tie 
exceptions from multis to the action that caused them.  Though that might not 
be great.  What do you think ?

  was (Author: eclark):
My thought was that the union type would have an optional exception type in 
addition to the response types; that would allow us to more directly tie 
exceptions from multis to the action that caused them.  Though that might not 
be great.  What do others think ?
  
> Write an RPC Specification for 0.96
> ---
>
> Key: HBASE-7533
> URL: https://issues.apache.org/jira/browse/HBASE-7533
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.96.0
>
>
> RPC format is changing for 0.96 to accommodate our protobufing all around.  
> Here is a first cut.  Please shred: 
> https://docs.google.com/document/d/1-1RJMLXzYldmHgKP7M7ynK6euRpucD03fZ603DlZfGI/edit



[jira] [Commented] (HBASE-7507) Make memstore flush be able to retry after exception

2013-01-10 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550867#comment-13550867
 ] 

chunhui shen commented on HBASE-7507:
-

[~lhofhansl]
bq.What if I do not want to retry?
Setting HStore.flush_retries_number to 1 means no retry. Maybe the name 
"retries" is not ideal, but we already use 'retries' in 
"hbase.client.retries.number", so I kept the same name for consistency.
Setting 'hbase.client.retries.number' to 1 means no retry; did we change that 
semantic?

bq.setting the retry number to 0 will force the code to reget the parameter on 
each call

{code}
+  if (HStore.flush_retries_number <= 0) {
+throw new IllegalArgumentException(
+"hbase.hstore.flush.retries.number must be > 0, not "
++ HStore.flush_retries_number);
+  }
{code}

The server will go down if the retry number is set to 0.
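
Read together, the two comments pin down the intended semantics: a value of 1 means a single attempt (no retry), and values <= 0 are rejected up front by the check quoted above. A hedged sketch of that loop follows; the helper names are hypothetical, not the committed patch:

```java
public class FlushRetry {
  interface Flush { void run() throws Exception; }

  // Returns the attempt number that succeeded; rethrows the last failure,
  // which the caller would wrap in a DroppedSnapshotException and abort on.
  static int flushWithRetries(Flush flush, int flushRetriesNumber, long pauseMs)
      throws Exception {
    if (flushRetriesNumber <= 0) {
      throw new IllegalArgumentException(
          "hbase.hstore.flush.retries.number must be > 0, not " + flushRetriesNumber);
    }
    Exception last = null;
    for (int attempt = 1; attempt <= flushRetriesNumber; attempt++) {
      try {
        flush.run();
        return attempt;
      } catch (Exception e) {
        last = e;
        if (attempt < flushRetriesNumber) Thread.sleep(pauseMs);
      }
    }
    throw last;
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    // Simulated flush that fails twice (e.g. namenode failover), then succeeds.
    Flush flaky = () -> { if (++calls[0] < 3) throw new Exception("fs not ready"); };
    System.out.println(flushWithRetries(flaky, 5, 0)); // prints 3
  }
}
```

With flushRetriesNumber == 1 the loop body runs once and any failure propagates immediately, which is the "no retry" configuration Lars was asking about.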

> Make memstore flush be able to retry after exception
> 
>
> Key: HBASE-7507
> URL: https://issues.apache.org/jira/browse/HBASE-7507
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7507-trunk v1.patch, 7507-trunk v2.patch, 
> 7507-trunkv3.patch
>
>
> We will abort the regionserver if a memstore flush throws an exception.
> I think we could retry to make the regionserver more stable, because the file 
> system may be transiently unavailable, e.g. when switching namenodes in a 
> NameNode HA environment.
> {code}
> HRegion#internalFlushcache(){
> ...
> try {
> ...
> }catch(Throwable t){
> DroppedSnapshotException dse = new DroppedSnapshotException("region: " +
>   Bytes.toStringBinary(getRegionName()));
> dse.initCause(t);
> throw dse;
> }
> ...
> }
> MemStoreFlusher#flushRegion(){
> ...
> try {
> region.flushcache();
> } catch (DroppedSnapshotException ex) {
> server.abort("Replay of HLog required. Forcing server shutdown", ex);
> }
> ...
> }
> {code}



[jira] [Updated] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7247:
-

Priority: Major  (was: Critical)

Marking major rather than critical.  This would be really sweet to have.  It 
might actually be harder getting it in if it doesn't make it into 0.96.  It is 
an improvement though, so it shouldn't hold up 0.96.  That's why I'm making it 
'major'.

> Assignment performances decreased by 50% because of 
> regionserver.OpenRegionHandler#tickleOpening
> 
>
> Key: HBASE-7247
> URL: https://issues.apache.org/jira/browse/HBASE-7247
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Region Assignment, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.96.0
>
> Attachments: 7247.v1.patch
>
>
> The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
> "Do this so master doesn't timeout this region-in-transition.".
> However, on the usual test, this makes the assignment time of 1500 regions go 
> from 70s to 100s; that is, we're 50% slower because of this.
> More generally, ZooKeeper commits all data updates to disk, and this takes 
> time. Using it to provide a keep-alive seems overkill. At the very least, it 
> could be made asynchronous.
> I'm not sure whether these updates are necessary (I need to dig deeper into 
> the internals; feedback welcome), but it seems very important to optimize 
> this... The trivial fix would be to make it optional.
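
The asynchronous alternative mentioned in the description could look roughly like this: tickles on the hot open path only set a flag, and a background keep-alive thread coalesces them into a single znode write. The classes below are illustrative, not OpenRegionHandler itself:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncTickle {
  private final AtomicBoolean dirty = new AtomicBoolean(false);
  final AtomicInteger znodeWrites = new AtomicInteger(0);

  // Called from the hot region-open path; cheap, no ZooKeeper I/O.
  void tickleOpening() { dirty.set(true); }

  // Called periodically from a background keep-alive thread.
  void flushTickles() {
    if (dirty.compareAndSet(true, false)) {
      znodeWrites.incrementAndGet(); // stands in for the real znode update
    }
  }

  public static void main(String[] args) {
    AsyncTickle t = new AsyncTickle();
    for (int i = 0; i < 1000; i++) t.tickleOpening(); // 1000 tickles...
    t.flushTickles();
    System.out.println(t.znodeWrites.get()); // ...coalesced into 1 disk-committing write
  }
}
```

The trade-off is that the master's view of the region-in-transition can lag by one flush interval, so the flush period must stay well under the master's RIT timeout.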



[jira] [Comment Edited] (HBASE-7507) Make memstore flush be able to retry after exception

2013-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550860#comment-13550860
 ] 

Lars Hofhansl edited comment on HBASE-7507 at 1/11/13 5:55 AM:
---

What if I do not want to retry? With the current patch that is impossible 
(setting the retry number to 0 will force the code to reget the parameter on 
each call).

  was (Author: lhofhansl):
What I do not want to retry? With the current patch that is impossible 
(setting the retry number to 0 will force the code to reget the parameter on 
each call).
  
> Make memstore flush be able to retry after exception
> 
>
> Key: HBASE-7507
> URL: https://issues.apache.org/jira/browse/HBASE-7507
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7507-trunk v1.patch, 7507-trunk v2.patch, 
> 7507-trunkv3.patch
>
>
> We will abort the regionserver if a memstore flush throws an exception.
> I think we could retry to make the regionserver more stable, because the file 
> system may be transiently unavailable, e.g. when switching namenodes in a 
> NameNode HA environment.
> {code}
> HRegion#internalFlushcache(){
> ...
> try {
> ...
> }catch(Throwable t){
> DroppedSnapshotException dse = new DroppedSnapshotException("region: " +
>   Bytes.toStringBinary(getRegionName()));
> dse.initCause(t);
> throw dse;
> }
> ...
> }
> MemStoreFlusher#flushRegion(){
> ...
> try {
> region.flushcache();
> } catch (DroppedSnapshotException ex) {
> server.abort("Replay of HLog required. Forcing server shutdown", ex);
> }
> ...
> }
> {code}



[jira] [Commented] (HBASE-7507) Make memstore flush be able to retry after exception

2013-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550860#comment-13550860
 ] 

Lars Hofhansl commented on HBASE-7507:
--

What I do not want to retry? With the current patch that is impossible (setting 
the retry number to 0 will force the code to reget the parameter on each call).

> Make memstore flush be able to retry after exception
> 
>
> Key: HBASE-7507
> URL: https://issues.apache.org/jira/browse/HBASE-7507
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7507-trunk v1.patch, 7507-trunk v2.patch, 
> 7507-trunkv3.patch
>
>
> We will abort the regionserver if a memstore flush throws an exception.
> I think we could retry to make the regionserver more stable, because the file 
> system may be transiently unavailable, e.g. when switching namenodes in a 
> NameNode HA environment.
> {code}
> HRegion#internalFlushcache(){
> ...
> try {
> ...
> }catch(Throwable t){
> DroppedSnapshotException dse = new DroppedSnapshotException("region: " +
>   Bytes.toStringBinary(getRegionName()));
> dse.initCause(t);
> throw dse;
> }
> ...
> }
> MemStoreFlusher#flushRegion(){
> ...
> try {
> region.flushcache();
> } catch (DroppedSnapshotException ex) {
> server.abort("Replay of HLog required. Forcing server shutdown", ex);
> }
> ...
> }
> {code}



[jira] [Commented] (HBASE-7506) Judgment of carrying ROOT/META will become wrong when expiring server

2013-01-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550857#comment-13550857
 ] 

Hadoop QA commented on HBASE-7506:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12564363/7506-94.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3977//console

This message is automatically generated.

> Judgment of carrying ROOT/META will become wrong when expiring server
> -
>
> Key: HBASE-7506
> URL: https://issues.apache.org/jira/browse/HBASE-7506
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7506-94.patch, 7506-trunk v1.patch, 7506-trunkv1.patch, 
> 7506-trunkv2.patch
>
>
> We will check whether the server is carrying ROOT/META when expiring the 
> server. See ServerManager#expireServer.
> If the dead server was carrying META, we assign meta directly in the process 
> of ServerShutdownHandler.
> If the dead server was carrying ROOT, we will offline ROOT and then 
> verifyAndAssignRootWithRetries()
> How does the judgement of carrying ROOT/META become wrong?
> If the region is in RIT, isCarryingRegion() returns true after addressing 
> from zk.
> However, once the RIT times out (could be caused by 
> this.allRegionServersOffline && !noRSAvailable, see 
> AssignmentManager#TimeoutMonitor) and we assign the region elsewhere, this 
> judgement becomes wrong.
> See AssignmentManager#isCarryingRegion for details
> With the wrong judgement of carrying ROOT/META, we would assign ROOT/META 
> twice.



[jira] [Updated] (HBASE-7506) Judgment of carrying ROOT/META will become wrong when expiring server

2013-01-10 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-7506:


Attachment: 7506-94.patch

> Judgment of carrying ROOT/META will become wrong when expiring server
> -
>
> Key: HBASE-7506
> URL: https://issues.apache.org/jira/browse/HBASE-7506
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7506-94.patch, 7506-trunk v1.patch, 7506-trunkv1.patch, 
> 7506-trunkv2.patch
>
>
> We will check whether the server is carrying ROOT/META when expiring the 
> server. See ServerManager#expireServer.
> If the dead server was carrying META, we assign meta directly in the process 
> of ServerShutdownHandler.
> If the dead server was carrying ROOT, we will offline ROOT and then 
> verifyAndAssignRootWithRetries()
> How does the judgement of carrying ROOT/META become wrong?
> If the region is in RIT, isCarryingRegion() returns true after addressing 
> from zk.
> However, once the RIT times out (could be caused by 
> this.allRegionServersOffline && !noRSAvailable, see 
> AssignmentManager#TimeoutMonitor) and we assign the region elsewhere, this 
> judgement becomes wrong.
> See AssignmentManager#isCarryingRegion for details
> With the wrong judgement of carrying ROOT/META, we would assign ROOT/META 
> twice.



[jira] [Commented] (HBASE-7533) Write an RPC Specification for 0.96

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550851#comment-13550851
 ] 

stack commented on HBASE-7533:
--

Chatting with [~eclark], a "downside" to the above scheme is the need in 
protobuf to list every ipc method in the UnionResponseType enum and also in the 
UnionRequestType.  It seems a bit much given we can derive the param and return 
types from the method name (whether we are doing reflection against 'protocol' 
Interfaces or lookups in a pb Service).  Elliott suggested we could have opaque 
bytes for the request and response Message.  This would mean unmarshaling the 
RpcResponse, then unmarshaling the contained bytes to find the response 
Message.  This would be a bit of a pain.  Where we left it was prototyping out 
both; that would probably be more informative than prognosticating in front of 
a whiteboard.  I'll have a go at it.

Hey [~eclark], is there a response type missing from your enum example list 
above?  The error type?
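
As a rough illustration of the union-with-exception idea being debated, the sketch below is a plain Java stand-in for the protobuf union: each slot in a multi response carries a type tag, EXCEPTION is one of the members, and ordering lets an error line up with the action that caused it. None of these types are the final wire format:

```java
import java.util.List;

public class MultiResponse {
  // Tagged union: one enum member per response kind, plus EXCEPTION.
  enum ResponseType { GET, MUTATE, EXCEPTION }

  // Order matches the request, so index i answers action i.
  record Entry(ResponseType type, String payload) {}

  // Correlate a failure back to the request index that produced it.
  static String firstFailure(List<Entry> entries) {
    for (int i = 0; i < entries.size(); i++) {
      if (entries.get(i).type() == ResponseType.EXCEPTION) {
        return "action " + i + ": " + entries.get(i).payload();
      }
    }
    return "no failures";
  }

  public static void main(String[] args) {
    List<Entry> resp = List.of(
        new Entry(ResponseType.GET, "row1"),
        new Entry(ResponseType.EXCEPTION, "RegionTooBusyException"),
        new Entry(ResponseType.MUTATE, "ok"));
    System.out.println(firstFailure(resp)); // prints "action 1: RegionTooBusyException"
  }
}
```

The opaque-bytes alternative would replace the tag with raw bytes and a second unmarshal step, trading enum maintenance for an extra decode.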

> Write an RPC Specification for 0.96
> ---
>
> Key: HBASE-7533
> URL: https://issues.apache.org/jira/browse/HBASE-7533
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.96.0
>
>
> RPC format is changing for 0.96 to accommodate our protobufing all around.  
> Here is a first cut.  Please shred: 
> https://docs.google.com/document/d/1-1RJMLXzYldmHgKP7M7ynK6euRpucD03fZ603DlZfGI/edit



[jira] [Commented] (HBASE-7501) Introduce MetaEditor method that both adds and deletes rows in .META. table

2013-01-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550850#comment-13550850
 ] 

Ted Yu commented on HBASE-7501:
---

Here is a use case from RestoreSnapshotHandler.java 
(https://reviews.apache.org/r/8674/):
{code}
  if (hrisToRemove != null) {
MetaEditor.deleteRegions(catalogTracker, hrisToRemove);
  }
  MetaEditor.addRegionsToMeta(catalogTracker, hris);
{code}
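The suggested combined method could look roughly like the sketch below. Mutation, Delete, Put, and applyMutations here are toy stand-ins, not the actual HBase classes; the point is only that deletes and puts travel to .META. as one batch instead of two calls.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-ins illustrating a hypothetical MetaEditor method that takes a
// List of Mutation's so deletes and puts go to .META. in one batch.
public class MetaMutateSketch {
  static abstract class Mutation {
    final String row;
    Mutation(String row) { this.row = row; }
  }
  static class Delete extends Mutation { Delete(String row) { super(row); } }
  static class Put extends Mutation { Put(String row) { super(row); } }

  // "Apply" the grouped mutations in one pass, mirroring the idea of a single
  // transaction against the meta table; returns a log of what was applied.
  static List<String> applyMutations(List<Mutation> mutations) {
    List<String> log = new ArrayList<>();
    for (Mutation m : mutations) {
      log.add((m instanceof Delete ? "delete " : "put ") + m.row);
    }
    return log; // one batch, one trip to the meta table
  }

  public static void main(String[] args) {
    List<Mutation> batch = new ArrayList<>();
    batch.add(new Delete("regionA"));
    batch.add(new Put("regionC"));
    System.out.println(applyMutations(batch));
  }
}
```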

> Introduce MetaEditor method that both adds and deletes rows in .META. table
> ---
>
> Key: HBASE-7501
> URL: https://issues.apache.org/jira/browse/HBASE-7501
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Matteo Bertozzi
> Attachments: HBASE-7501-v0.patch, HBASE-7501-v1.patch
>
>
> In review of HBASE-7365, MetaEditor.deleteRegions() and 
> MetaEditor.addRegionsToMeta() are used in 
> RestoreSnapshotHandler.java.handleTableOperation() to apply changes to .META.
> I made following suggestion:
> Can we introduce new method in MetaEditor which takes List of Mutation's ?
> The Delete and Put would be grouped and then written to .META. table in one 
> transaction.
> Jon responded:
> I like that idea -- then the todo/warning or follow on could refer to that 
> method.  When we fix it, it could get used in other multi row meta 
> modifications like splits and table creation/deletion in general.
> See https://reviews.apache.org/r/8674/



[jira] [Commented] (HBASE-7506) Judgment of carrying ROOT/META will become wrong when expiring server

2013-01-10 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550849#comment-13550849
 ] 

chunhui shen commented on HBASE-7506:
-

Patch v1 integrated to trunk and the 0.94 branch

Thanks for the reviews, Ram, Jimmy, Ted

> Judgment of carrying ROOT/META will become wrong when expiring server
> -
>
> Key: HBASE-7506
> URL: https://issues.apache.org/jira/browse/HBASE-7506
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7506-trunk v1.patch, 7506-trunkv1.patch, 
> 7506-trunkv2.patch
>
>
> We will check whether a server is carrying ROOT/META when expiring the server.
> See ServerManager#expireServer.
> If the dead server was carrying META, we assign meta directly in the process of 
> ServerShutdownHandler.
> If the dead server was carrying ROOT, we will offline ROOT and then 
> verifyAndAssignRootWithRetries()
> How does the judgment of carrying ROOT/META become wrong?
> If the region is in RIT, isCarryingRegion() returns true after addressing from 
> zk.
> However, once the RIT times out (could be caused by this.allRegionServersOffline 
> && !noRSAvailable, see AssignmentManager#TimeoutMonitor) and we assign it 
> elsewhere, this judgment becomes wrong.
> See AssignmentManager#isCarryingRegion for details
> With the wrong judgment of carrying ROOT/META, we would assign ROOT/META 
> twice.
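The time-of-check race described above can be sketched in a few lines of Java. StaleCarryingCheckSketch and its names are made up for illustration and are not the real AssignmentManager code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Toy illustration of the stale-judgment race: isCarryingRegion() answers from
// state read at one moment, but a RIT timeout can reassign the region before
// the answer is acted on.
public class StaleCarryingCheckSketch {
  static final ConcurrentMap<String, String> assignment = new ConcurrentHashMap<>();

  static boolean isCarryingRegion(String server, String region) {
    return server.equals(assignment.get(region)); // a snapshot in time
  }

  public static void main(String[] args) {
    assignment.put("ROOT", "rs1");
    boolean carrying = isCarryingRegion("rs1", "ROOT"); // true right now...
    assignment.put("ROOT", "rs2"); // ...but a RIT timeout reassigns ROOT
    // Acting on the stale 'carrying' flag would now assign ROOT twice.
    System.out.println(carrying + " vs " + isCarryingRegion("rs1", "ROOT"));
  }
}
```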



[jira] [Updated] (HBASE-7506) Judgment of carrying ROOT/META will become wrong when expiring server

2013-01-10 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-7506:


Fix Version/s: 0.94.5

> Judgment of carrying ROOT/META will become wrong when expiring server
> -
>
> Key: HBASE-7506
> URL: https://issues.apache.org/jira/browse/HBASE-7506
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7506-trunk v1.patch, 7506-trunkv1.patch, 
> 7506-trunkv2.patch
>
>
> We will check whether a server is carrying ROOT/META when expiring the server.
> See ServerManager#expireServer.
> If the dead server was carrying META, we assign meta directly in the process of 
> ServerShutdownHandler.
> If the dead server was carrying ROOT, we will offline ROOT and then 
> verifyAndAssignRootWithRetries()
> How does the judgment of carrying ROOT/META become wrong?
> If the region is in RIT, isCarryingRegion() returns true after addressing from 
> zk.
> However, once the RIT times out (could be caused by this.allRegionServersOffline 
> && !noRSAvailable, see AssignmentManager#TimeoutMonitor) and we assign it 
> elsewhere, this judgment becomes wrong.
> See AssignmentManager#isCarryingRegion for details
> With the wrong judgment of carrying ROOT/META, we would assign ROOT/META 
> twice.



[jira] [Commented] (HBASE-7360) Snapshot 0.94 Backport

2013-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550848#comment-13550848
 ] 

Lars Hofhansl commented on HBASE-7360:
--

Let's get this rock-solid in trunk (without any known bugs).
Then we can discuss the 0.94 backport. My reasoning for not dismissing this 
outright is the information I got that both Cloudera and Hortonworks plan to 
backport this to their 0.94-based distributions.
If that is indeed the plan, we should also consider doing this in the 
"official" Apache branch.


> Snapshot 0.94 Backport 
> ---
>
> Key: HBASE-7360
> URL: https://issues.apache.org/jira/browse/HBASE-7360
> Project: HBase
>  Issue Type: New Feature
>  Components: snapshots
>Affects Versions: 0.94.3
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>
> Backport snapshot code to 0.94
> The main changes needed to backport the snapshot are related to the protobuf 
> vs writable rpc.
> Offline Snapshot
>  * HBASE-6610 - HFileLink: Hardlink alternative for snapshot restore
>  * HBASE-6765 - Take a Snapshot interface
>  * HBASE-6571 - Generic multi-thread/cross-process error handling framework
>  * HBASE-6353 - Snapshots shell
>  * HBASE-7107 - Snapshot References Utils (FileSystem Visitor)
>  * HBASE-6863 - Offline snapshots
>  * HBASE-6865 - Snapshot File Cleaners
>  * HBASE-6777 - Snapshot Restore Interface
>  * HBASE-6230 - Clone/Restore Snapshots
>  * HBASE-6802 - Export Snapshot



[jira] [Commented] (HBASE-7539) when region is closed, the closing server should null out location in meta in process

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550847#comment-13550847
 ] 

stack commented on HBASE-7539:
--

[~sershe] Yes, if you can distinguish cluster shutdown from an explicit master 
close.  What does the client do if there is no address for a region?  Give up 
or retry?

> when region is closed, the closing server should null out location in meta in 
> process
> -
>
> Key: HBASE-7539
> URL: https://issues.apache.org/jira/browse/HBASE-7539
> Project: HBase
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>
> No point in having stale META record when the server that put it has already 
> closed the region.



[jira] [Commented] (HBASE-7506) Judgment of carrying ROOT/META will become wrong when expiring server

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550845#comment-13550845
 ] 

stack commented on HBASE-7506:
--

Are you going to commit [~zjushch]?  You have two +1s.

> Judgment of carrying ROOT/META will become wrong when expiring server
> -
>
> Key: HBASE-7506
> URL: https://issues.apache.org/jira/browse/HBASE-7506
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7506-trunk v1.patch, 7506-trunkv1.patch, 
> 7506-trunkv2.patch
>
>
> We will check whether a server is carrying ROOT/META when expiring the server.
> See ServerManager#expireServer.
> If the dead server was carrying META, we assign meta directly in the process of 
> ServerShutdownHandler.
> If the dead server was carrying ROOT, we will offline ROOT and then 
> verifyAndAssignRootWithRetries()
> How does the judgment of carrying ROOT/META become wrong?
> If the region is in RIT, isCarryingRegion() returns true after addressing from 
> zk.
> However, once the RIT times out (could be caused by this.allRegionServersOffline 
> && !noRSAvailable, see AssignmentManager#TimeoutMonitor) and we assign it 
> elsewhere, this judgment becomes wrong.
> See AssignmentManager#isCarryingRegion for details
> With the wrong judgment of carrying ROOT/META, we would assign ROOT/META 
> twice.



[jira] [Commented] (HBASE-7479) Remove VersionedProtocol and ProtocolSignature from RPC

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550841#comment-13550841
 ] 

stack commented on HBASE-7479:
--

Thanks for the addendum [~ted_yu]


> Remove VersionedProtocol and ProtocolSignature from RPC
> ---
>
> Key: HBASE-7479
> URL: https://issues.apache.org/jira/browse/HBASE-7479
> Project: HBase
>  Issue Type: Task
>  Components: IPC/RPC
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7479.addendum, 7479.txt, 7479.txt, 7479v2.txt, 7479v3.txt
>
>
> Replace with an innocuous "Protocol" Interface for now.  Will minimize 
> changes doing a replacement.  Implication is that we are no longer going to 
> do special "handling" based off protocol version.  See "Handling protocol 
> versions" - http://search-hadoop.com/m/6k7GUM028E/v=threaded thread and 
> HBASE-6521 for background.



[jira] [Updated] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS

2013-01-10 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-7504:


  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> -ROOT- may be offline forever after FullGC of  RS
> -
>
> Key: HBASE-7504
> URL: https://issues.apache.org/jira/browse/HBASE-7504
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7504-94.patch, 7504-trunk v1.patch, 7504-trunk v2.patch
>
>
> 1.FullGC happens on the ROOT regionserver.
> 2.ZK session times out; master expires the regionserver and submits it to 
> ServerShutdownHandler
> 3.Regionserver completes the FullGC
> 4.In the process of ServerShutdownHandler, verifyRootRegionLocation returns 
> true
> 5.ServerShutdownHandler skips assigning the ROOT region
> 6.Regionserver aborts itself because it receives a YouAreDeadException after a 
> regionserver report
> 7.ROOT is offline now, and won't be assigned any more unless we restart the master
> Master Log:
> {code}
> 2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted 
> shutdown handler to be executed, root=true, meta=false
> 2012-10-31 19:51:39,045 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs 
> for dw88.kgb.sqa.cm4,60020,1351671478752
> 2012-10-31 19:51:50,113 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server 
> dw88.kgb.sqa.cm4,60020,1351671478752 was carrying ROOT. Trying to assign.
> 2012-10-31 19:52:15,939 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Server REPORT rejected; currently processing 
> dw88.kgb.sqa.cm4,60020,1351671478752 as dead server
> 2012-10-31 19:52:15,945 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Skipping log 
> splitting for dw88.kgb.sqa.cm4,60020,1351671478752
> {code}
> No log of assigning ROOT
> Regionserver log:
> {code}
> 2012-10-31 19:52:15,923 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
> 229128ms instead of 10ms, this is likely due to a long garbage collecting 
> pause and it's usually bad, see 
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> {code}



[jira] [Updated] (HBASE-7318) Add verbose logging option to HConnectionManager

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7318:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to trunk after trying the test that failed for me previously and 
watching it pass.  Thank you Mr. Sergey.

> Add verbose logging option to HConnectionManager
> 
>
> Key: HBASE-7318
> URL: https://issues.apache.org/jira/browse/HBASE-7318
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 7318-v2.patch, HBASE-7318-v0.patch, HBASE-7318-v1.patch, 
> HBASE-7318-v3.patch
>
>
> In the course of HBASE-7250 I found that client-side errors (as well as 
> server-side errors, but that's another question) are hard to debug.
> I have some local commits with useful, not-that-hacky HConnectionManager 
> logging added.
> Need to "productionize" it to be off by default but easy-to-enable for 
> debugging.



[jira] [Updated] (HBASE-7529) Wrong ExecutorType for EventType.M_RS_OPEN_ROOT in trunk

2013-01-10 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-7529:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Wrong ExecutorType for EventType.M_RS_OPEN_ROOT in trunk
> 
>
> Key: HBASE-7529
> URL: https://issues.apache.org/jira/browse/HBASE-7529
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.96.0
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 7529-trunk.patch
>
>
> {code}
> M_RS_OPEN_ROOT(21, ExecutorType.RS_OPEN_REGION),  // Master 
> asking RS to open root
> {code}
> It's a mistake only in trunk, causing ROOT to not come online for a very long 
> time:
> 1.ROOT waits for an open-region thread to handle opening it.
> 2.Opening regions wait for ROOT to come online, but occupy the threads...
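A minimal sketch of the shape of the fix: map M_RS_OPEN_ROOT to its own executor type so a ROOT open can never queue behind ordinary region opens in a saturated pool. The code 21 comes from the snippet above; the other codes and the class itself are illustrative stand-ins, not the real EventHandler.

```java
// Simplified stand-in for the event-to-executor mapping at issue.
// Giving ROOT its own pool means opening ROOT cannot be starved by
// ordinary region opens queued in RS_OPEN_REGION.
public class ExecutorMappingSketch {
  enum ExecutorType { RS_OPEN_REGION, RS_OPEN_ROOT, RS_OPEN_META }

  enum EventType {
    M_RS_OPEN_REGION(20, ExecutorType.RS_OPEN_REGION), // assumed code
    M_RS_OPEN_ROOT(21, ExecutorType.RS_OPEN_ROOT),     // dedicated pool for ROOT
    M_RS_OPEN_META(22, ExecutorType.RS_OPEN_META);     // assumed code

    final int code;
    final ExecutorType executor;
    EventType(int code, ExecutorType executor) {
      this.code = code;
      this.executor = executor;
    }
  }

  public static void main(String[] args) {
    System.out.println(EventType.M_RS_OPEN_ROOT.executor);
  }
}
```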



[jira] [Updated] (HBASE-7403) Online Merge

2013-01-10 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-7403:


Attachment: hbase-7403-trunkv10.patch

Improved the test case in patch v10 per Ted's suggestion

> Online Merge
> 
>
> Key: HBASE-7403
> URL: https://issues.apache.org/jira/browse/HBASE-7403
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7403-trunkv5.patch, 7403-trunkv6.patch, 7403v5.diff, 
> 7403-v5.txt, 7403v5.txt, hbase-7403-94v1.patch, hbase-7403-trunkv10.patch, 
> hbase-7403-trunkv1.patch, hbase-7403-trunkv5.patch, hbase-7403-trunkv6.patch, 
> hbase-7403-trunkv7.patch, hbase-7403-trunkv8.patch, hbase-7403-trunkv9.patch, 
> merge region.pdf
>
>
> The features of this online merge:
> 1.Online, no need to disable the table
> 2.Little change to current code; could be applied to trunk, 0.94, 0.92 or 0.90
> 3.Easy to call a merge request; no need to input a long region name, the 
> encoded name is enough
> 4.No limits on when to operate: you don't need to take care of events like 
> Server Dead, Balance, Split, or Disabling/Enabling table, and no need to worry 
> about sending a wrong merge request; it is already handled for you
> 5.Only a little offline time for the two merging regions
> We need merge in the following cases:
> 1.Region hole or region overlap that can't be fixed by hbck
> 2.Region becomes empty because of TTL or unreasonable rowkey design
> 3.Region is always empty or very small because of a presplit at table creation
> 4.Too many empty or small regions reduce system performance (e.g. mslab)
> Current merge tools only support offline merges and are not able to redo if an 
> exception is thrown in the process of merging, leaving dirty data
> For an online system, we need an online merge.
> The implementation logic of this patch for Online Merge is:
> For example, merge regionA and regionB into regionC
> 1.Offline the two regions A and B
> 2.Merge the two regions in HDFS (create regionC’s directory, move 
> regionA’s and regionB’s files to regionC’s directory, delete regionA’s and 
> regionB’s directories)
> 3.Add the merged regionC to .META.
> 4.Assign the merged regionC
> As designed in this patch, once we do the merge work in HDFS, we can redo 
> it until successful if it throws an exception, aborts, or the server restarts, 
> but it cannot be rolled back. 
> It depends on:
> Using zookeeper to record the transaction journal state, making redo easier
> Using zookeeper to send/receive merge requests
> The merge transaction is executed on the master
> Calling a merge request is supported through the API or shell tool
> About the merge process, please see the attachment and patch
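Step 2 above (the merge in HDFS) can be walked through on a local filesystem with java.nio standing in for HDFS. Paths and file names are made up, and this is only a sketch of the directory shuffle, not the patch's code.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Toy walk-through of the filesystem half of the merge: create regionC's
// directory, move regionA's and regionB's files into it, then delete the
// emptied source directories.
public class RegionDirMergeSketch {
  static void mergeDirs(Path regionA, Path regionB, Path regionC) throws IOException {
    Files.createDirectories(regionC);
    for (Path src : new Path[] { regionA, regionB }) {
      try (DirectoryStream<Path> files = Files.newDirectoryStream(src)) {
        for (Path f : files) {
          Files.move(f, regionC.resolve(f.getFileName()));
        }
      }
      Files.delete(src); // source directory is empty now
    }
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("merge-demo");
    Path a = Files.createDirectories(root.resolve("regionA"));
    Path b = Files.createDirectories(root.resolve("regionB"));
    Files.createFile(a.resolve("hfile1"));
    Files.createFile(b.resolve("hfile2"));
    mergeDirs(a, b, root.resolve("regionC"));
    System.out.println(Files.exists(root.resolve("regionC").resolve("hfile1")));
  }
}
```

Because each move is idempotent to retry, the same sequence can be replayed until it succeeds, which is the "redo until successful, but no rollback" property the description claims.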



[jira] [Resolved] (HBASE-4575) Inconsistent naming for ZK config parameters

2013-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-4575.
--

Resolution: Invalid

Issue does not say what is inconsistent.  Resolving as invalid.

> Inconsistent naming for ZK config parameters
> 
>
> Key: HBASE-4575
> URL: https://issues.apache.org/jira/browse/HBASE-4575
> Project: HBase
>  Issue Type: Bug
>  Components: test, Zookeeper
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jonathan Gray
>Assignee: stack
>Priority: Critical
> Fix For: 0.96.0
>
>
> I've found some misnaming of certain ZK config options.  Make them consistent.



[jira] [Commented] (HBASE-7540) Make znode dump to print a dump of replication znodes

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550822#comment-13550822
 ] 

stack commented on HBASE-7540:
--

Mind attaching a picture of what it looks like Mr Himanshu?  Patch looks good.  
Would it be hard to add to trunk?

> Make znode dump to print a dump of replication znodes
> -
>
> Key: HBASE-7540
> URL: https://issues.apache.org/jira/browse/HBASE-7540
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication, UI
>Affects Versions: 0.94.3
>Reporter: Himanshu Vashishtha
>Assignee: Himanshu Vashishtha
> Fix For: 0.96.0, 0.94.4
>
> Attachments: HBASE-7540-v1.patch
>
>
> It would be nice to have a dump of replication-related znodes on the master UI 
> (along with the other znode dump). It helps when using replication.



[jira] [Commented] (HBASE-7529) Wrong ExecutorType for EventType.M_RS_OPEN_ROOT in trunk

2013-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550824#comment-13550824
 ] 

Hudson commented on HBASE-7529:
---

Integrated in HBase-TRUNK #3727 (See 
[https://builds.apache.org/job/HBase-TRUNK/3727/])
HBASE-7529 Wrong ExecutorType for EventType.M_RS_OPEN_ROOT in trunk 
(Revision 1431816)

 Result = FAILURE
zjushch : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java


> Wrong ExecutorType for EventType.M_RS_OPEN_ROOT in trunk
> 
>
> Key: HBASE-7529
> URL: https://issues.apache.org/jira/browse/HBASE-7529
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.96.0
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 7529-trunk.patch
>
>
> {code}
> M_RS_OPEN_ROOT(21, ExecutorType.RS_OPEN_REGION),  // Master 
> asking RS to open root
> {code}
> It's a mistake only in trunk, causing ROOT to not come online for a very long 
> time:
> 1.ROOT waits for an open-region thread to handle opening it.
> 2.Opening regions wait for ROOT to come online, but occupy the threads...



[jira] [Commented] (HBASE-7501) Introduce MetaEditor method that both adds and deletes rows in .META. table

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550821#comment-13550821
 ] 

stack commented on HBASE-7501:
--

+1

Where are you going to use it?  Could we use it when adding split changes?

> Introduce MetaEditor method that both adds and deletes rows in .META. table
> ---
>
> Key: HBASE-7501
> URL: https://issues.apache.org/jira/browse/HBASE-7501
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Matteo Bertozzi
> Attachments: HBASE-7501-v0.patch, HBASE-7501-v1.patch
>
>
> In review of HBASE-7365, MetaEditor.deleteRegions() and 
> MetaEditor.addRegionsToMeta() are used in 
> RestoreSnapshotHandler.java.handleTableOperation() to apply changes to .META.
> I made following suggestion:
> Can we introduce new method in MetaEditor which takes List of Mutation's ?
> The Delete and Put would be grouped and then written to .META. table in one 
> transaction.
> Jon responded:
> I like that idea -- then the todo/warning or follow on could refer to that 
> method.  When we fix it, it could get used in other multi row meta 
> modifications like splits and table creation/deletion in general.
> See https://reviews.apache.org/r/8674/



[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550820#comment-13550820
 ] 

Ted Yu commented on HBASE-5416:
---

The following tests failed locally:
TestMultiSlaveReplication, TestMasterReplication, TestZKLeaderManager

The first two failed without the patch, too.

I think patch v3 should be good to go.

> Improve performance of scans with some kind of filters.
> ---
>
> Key: HBASE-5416
> URL: https://issues.apache.org/jira/browse/HBASE-5416
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters, Performance, regionserver
>Affects Versions: 0.90.4
>Reporter: Max Lapan
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 5416-0.94-v3.txt, 
> 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
> 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
> Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
> Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
> HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
> HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
> HBASE-5416-v9.patch, 
> org.apache.hadoop.hbase.regionserver.TestHRegion-output.txt
>
>
> When a scan is performed, the whole row is loaded into the result list, and 
> after that the filter (if one exists) is applied to decide whether the row is 
> needed.
> But when a scan is performed over several CFs and the filter checks only data 
> from a subset of those CFs, the data from CFs not checked by the filter is not 
> needed at the filter stage -- only once we have decided to include the current 
> row. In such a case we can significantly reduce the amount of IO performed by a 
> scan by loading only the values actually checked by the filter.
> For example, we have two CFs: flags and snap. Flags is quite small (a bunch of 
> megabytes) and is used to filter large entries from snap. Snap is very large 
> (10s of GB) and is quite costly to scan. If we need only rows with 
> some flag specified, we use SingleColumnValueFilter to limit the result to a 
> small subset of the region. But the current implementation loads both CFs to 
> perform the scan, when only the small subset is needed.
> The attached patch adds one routine to the Filter interface that allows a 
> filter to specify which CFs it needs for its operation. In HRegion, we separate 
> all scanners into two groups: those needed by the filter and the rest (joined). 
> When a new row is considered, only the needed data is loaded and the filter 
> applied; only if the filter accepts the row is the rest of the data loaded. On 
> our data, this speeds up such scans 30-50 times. It also gives us a way to 
> better normalize the data into separate columns by optimizing the scans 
> performed.
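The two scanner groups can be sketched as a two-pass loop. Rows are plain maps here and all names are illustrative (the real code works with scanner heaps), but the control flow mirrors the description: filter on the essential CFs first, load the rest only for accepted rows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Sketch of the two-pass idea: evaluate the filter on only the column
// families it needs ("essential"), and "load" the remaining families
// only for rows the filter accepts.
public class LazyCfScanSketch {
  static List<Map<String, String>> scan(
      List<Map<String, String>> rows,
      Set<String> essentialCfs,
      Predicate<Map<String, String>> filter) {
    List<Map<String, String>> results = new ArrayList<>();
    for (Map<String, String> row : rows) {
      // Pass 1: project just the essential CFs and run the filter on them.
      Map<String, String> essential = new HashMap<>();
      for (String cf : essentialCfs) {
        if (row.containsKey(cf)) essential.put(cf, row.get(cf));
      }
      if (filter.test(essential)) {
        // Pass 2: only now touch the full row (the expensive part).
        results.add(row);
      }
    }
    return results;
  }

  public static void main(String[] args) {
    List<Map<String, String>> rows = new ArrayList<>();
    rows.add(new HashMap<>(Map.of("flags", "keep", "snap", "big-blob-1")));
    rows.add(new HashMap<>(Map.of("flags", "skip", "snap", "big-blob-2")));
    List<Map<String, String>> out =
        scan(rows, Set.of("flags"), r -> "keep".equals(r.get("flags")));
    System.out.println(out.size()); // only the "keep" row survives
  }
}
```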



[jira] [Commented] (HBASE-7365) Safer table creation and deletion using .tmp dir

2013-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550818#comment-13550818
 ] 

stack commented on HBASE-7365:
--

bq. ...more like "does it works?"

It was more the above.  Will the archiver be able to deal w/ tables and regions 
over in the tmp dir?  Sounds like it can, so I have an answer to my question.

bq.  What do you think? keep the code as is? rename the checkTempDir() in 
initTempDir() and create the checkTempDir() that only creates the directory?

What you suggest sounds like some nice cleanup.  If it is too much to do, 
don't bother (you have enough on your plate at the moment).


> Safer table creation and deletion using .tmp dir
> 
>
> Key: HBASE-7365
> URL: https://issues.apache.org/jira/browse/HBASE-7365
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
> Fix For: 0.96.0
>
> Attachments: HBASE-7365-v0.patch, HBASE-7365-v1.patch, 
> HBASE-7365-v2.patch
>
>
> Currently tables are created in the root directory, and the removal works on 
> the root directory.
> Change the code to use a /hbase/.tmp directory to make the creation and 
> removal a bit safer
> Table Creation steps
>  * Create the table descriptor (table folder, in /hbase/.tmp/)
>  * Create the table regions (always in temp)
>  * Move the table from temp to the root folder
>  * Add the regions to meta
>  * Trigger assignment
>  * Set enable flag in ZooKeeper
> Table Deletion steps
>  * Wait for regions in transition
>  * Remove regions from meta (use bulk delete)
>  * Move the table in /hbase/.tmp
>  * Remove the table from the descriptor cache
>  * Remove table from zookeeper
>  * Archive the table
> The main changes in the current code are:
>  * Writing to /hbase/.tmp and then rename
>  * using bulk delete in DeletionTableHandler
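The create-in-.tmp-then-rename step at the heart of the change can be sketched with java.nio on a local filesystem standing in for HDFS: build the table layout under .tmp and only then move it to its final path, so a half-created table never appears in the root directory. TmpDirCreateSketch and its layout are illustrative only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of safer table creation: descriptor and regions are written under
// .tmp, then the whole table directory is renamed into place in one step.
public class TmpDirCreateSketch {
  static Path createTable(Path hbaseRoot, String table) throws IOException {
    Path tmp = hbaseRoot.resolve(".tmp").resolve(table);
    Files.createDirectories(tmp.resolve("region1")); // build layout in temp
    Path finalDir = hbaseRoot.resolve(table);
    // One rename publishes the fully built table atomically.
    return Files.move(tmp, finalDir, StandardCopyOption.ATOMIC_MOVE);
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("hbase-root");
    Path t = createTable(root, "mytable");
    System.out.println(Files.exists(t.resolve("region1")));
  }
}
```

Deletion runs the same trick in reverse: move the table into .tmp first, so readers of the root directory never see a partially deleted table.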



[jira] [Updated] (HBASE-7360) Snapshot 0.94 Backport

2013-01-10 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-7360:
--

Description: 
Backport snapshot code to 0.94

The main changes needed to backport the snapshot are related to the protobuf vs 
writable rpc.

Offline Snapshot
 * HBASE-6610 - HFileLink: Hardlink alternative for snapshot restore
 * HBASE-6765 - Take a Snapshot interface
 * HBASE-6571 - Generic multi-thread/cross-process error handling framework
 * HBASE-6353 - Snapshots shell
 * HBASE-7107 - Snapshot References Utils (FileSystem Visitor)
 * HBASE-6863 - Offline snapshots
 * HBASE-6865 - Snapshot File Cleaners
 * HBASE-6777 - Snapshot Restore Interface
 * HBASE-6230 - Clone/Restore Snapshots
 * HBASE-6802 - Export Snapshot


  was:
Backport snapshot code to 0.94

The main changes needed to backport the snapshot are related to the protobuf vs 
writable rpc.

Offline Snapshot
 * HBASE-6610 - HFileLink: Hardlink alternative for snapshot restore
 * HBASE-6765 - Take a Snapshot interface
 * HBASE-6571 - Generic multi-thread/cross-process error handling framework
 * HBASE-6353 - Snapshots shell
 * HBASE-7107 - Snapshot References Utils (FileSystem Visitor)
 * HBASE-6836 - Offline snapshots
 * HBASE-6865 - Snapshot File Cleaners
 * HBASE-6777 - Snapshot Restore Interface
 * HBASE-6230 - Clone/Restore Snapshots
 * HBASE-6802 - Export Snapshot



> Snapshot 0.94 Backport 
> ---
>
> Key: HBASE-7360
> URL: https://issues.apache.org/jira/browse/HBASE-7360
> Project: HBase
>  Issue Type: New Feature
>  Components: snapshots
>Affects Versions: 0.94.3
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>
> Backport snapshot code to 0.94
> The main changes needed to backport the snapshot are related to the protobuf 
> vs writable rpc.
> Offline Snapshot
>  * HBASE-6610 - HFileLink: Hardlink alternative for snapshot restore
>  * HBASE-6765 - Take a Snapshot interface
>  * HBASE-6571 - Generic multi-thread/cross-process error handling framework
>  * HBASE-6353 - Snapshots shell
>  * HBASE-7107 - Snapshot References Utils (FileSystem Visitor)
>  * HBASE-6863 - Offline snapshots
>  * HBASE-6865 - Snapshot File Cleaners
>  * HBASE-6777 - Snapshot Restore Interface
>  * HBASE-6230 - Clone/Restore Snapshots
>  * HBASE-6802 - Export Snapshot

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (HBASE-7360) Snapshot 0.94 Backport

2013-01-10 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh reopened HBASE-7360:
---


The discussion on backporting the snapshots feature (HBASE-6055 and HBASE-7290) 
to the apache 0.94 line has been reopened on the mailing list so I've reopened 
it here for discussion.

> Snapshot 0.94 Backport 
> ---
>
> Key: HBASE-7360
> URL: https://issues.apache.org/jira/browse/HBASE-7360
> Project: HBase
>  Issue Type: New Feature
>  Components: snapshots
>Affects Versions: 0.94.3
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>
> Backport snapshot code to 0.94
> The main changes needed to backport the snapshot are related to the protobuf 
> vs writable rpc.
> Offline Snapshot
>  * HBASE-6610 - HFileLink: Hardlink alternative for snapshot restore
>  * HBASE-6765 - Take a Snapshot interface
>  * HBASE-6571 - Generic multi-thread/cross-process error handling framework
>  * HBASE-6353 - Snapshots shell
>  * HBASE-7107 - Snapshot References Utils (FileSystem Visitor)
>  * HBASE-6836 - Offline snapshots
>  * HBASE-6865 - Snapshot File Cleaners
>  * HBASE-6777 - Snapshot Restore Interface
>  * HBASE-6230 - Clone/Restore Snapshots
>  * HBASE-6802 - Export Snapshot

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7360) Snapshot 0.94 Backport

2013-01-10 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-7360:
--

Issue Type: New Feature  (was: Sub-task)
Parent: (was: HBASE-6055)

> Snapshot 0.94 Backport 
> ---
>
> Key: HBASE-7360
> URL: https://issues.apache.org/jira/browse/HBASE-7360
> Project: HBase
>  Issue Type: New Feature
>  Components: snapshots
>Affects Versions: 0.94.3
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>
> Backport snapshot code to 0.94
> The main changes needed to backport the snapshot are related to the protobuf 
> vs writable rpc.
> Offline Snapshot
>  * HBASE-6610 - HFileLink: Hardlink alternative for snapshot restore
>  * HBASE-6765 - Take a Snapshot interface
>  * HBASE-6571 - Generic multi-thread/cross-process error handling framework
>  * HBASE-6353 - Snapshots shell
>  * HBASE-7107 - Snapshot References Utils (FileSystem Visitor)
>  * HBASE-6836 - Offline snapshots
>  * HBASE-6865 - Snapshot File Cleaners
>  * HBASE-6777 - Snapshot Restore Interface
>  * HBASE-6230 - Clone/Restore Snapshots
>  * HBASE-6802 - Export Snapshot

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7501) Introduce MetaEditor method that both adds and deletes rows in .META. table

2013-01-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550799#comment-13550799
 ] 

Hadoop QA commented on HBASE-7501:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12564329/HBASE-7501-v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.replication.TestReplicationWithCompression
  org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.TestLocalHBaseCluster

 {color:red}-1 core zombie tests{color}.  There are 9 zombie test(s):   
at 
org.apache.hadoop.hbase.master.TestMasterFailover.testMasterFailoverWithMockedRITOnDeadRS(TestMasterFailover.java:833)
at 
org.apache.hadoop.hbase.catalog.TestCatalogTracker.testServerNotRunningIOException(TestCatalogTracker.java:250)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3975//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3975//console

This message is automatically generated.

> Introduce MetaEditor method that both adds and deletes rows in .META. table
> ---
>
> Key: HBASE-7501
> URL: https://issues.apache.org/jira/browse/HBASE-7501
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Matteo Bertozzi
> Attachments: HBASE-7501-v0.patch, HBASE-7501-v1.patch
>
>
> In review of HBASE-7365, MetaEditor.deleteRegions() and 
> MetaEditor.addRegionsToMeta() are used in 
> RestoreSnapshotHandler.java.handleTableOperation() to apply changes to .META.
> I made the following suggestion:
> Can we introduce a new method in MetaEditor which takes a List of Mutations?
> The Deletes and Puts would be grouped and then written to the .META. table in 
> one transaction.
> Jon responded:
> I like that idea -- then the todo/warning or follow on could refer to that 
> method.  When we fix it, it could get used in other multi row meta 
> modifications like splits and table creation/deletion in general.
> See https://reviews.apache.org/r/8674/
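
As an illustration of the suggested grouping, here is a toy, in-memory sketch of an all-or-nothing mutateMetaRows helper. All class and method names here are hypothetical, not the actual MetaEditor API, which would batch Puts and Deletes against the .META. HTable:

```java
import java.util.*;

/**
 * Toy model of the suggestion above: group Deletes and Puts into one list of
 * mutations and apply them to a "meta table" as a single all-or-nothing batch.
 * Names (Mutation, mutateMetaRows, ...) are invented for illustration.
 */
public class MetaMutationDemo {
    public interface Mutation { void applyTo(Map<String, String> table); }

    public static class Put implements Mutation {
        final String row, value;
        public Put(String row, String value) { this.row = row; this.value = value; }
        public void applyTo(Map<String, String> table) { table.put(row, value); }
    }

    public static class Delete implements Mutation {
        final String row;
        public Delete(String row) { this.row = row; }
        public void applyTo(Map<String, String> table) { table.remove(row); }
    }

    /** Apply all mutations to a staging copy, then commit: all-or-nothing. */
    public static void mutateMetaRows(Map<String, String> meta, List<Mutation> muts) {
        Map<String, String> copy = new HashMap<>(meta); // staging copy
        for (Mutation m : muts) {
            m.applyTo(copy); // may throw; the original meta stays untouched
        }
        meta.clear();
        meta.putAll(copy); // commit the whole batch at once
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        meta.put("oldRegion", "serverA");
        // One grouped batch: drop the old region row, add two clone rows.
        mutateMetaRows(meta, Arrays.asList(
            new Delete("oldRegion"),
            new Put("cloneRegion1", "serverA"),
            new Put("cloneRegion2", "serverB")));
        System.out.println(meta.containsKey("oldRegion")); // false
        System.out.println(meta.size()); // 2
    }
}
```

The staging-copy commit is only a stand-in for the single-RPC transaction the reviewers ask for; the real change would hand the grouped mutations to one meta-table write.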

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550796#comment-13550796
 ] 

Lars Hofhansl commented on HBASE-5416:
--

Thanks Ted. Will wrap lines upon commit (unless there's another version needed).

> Improve performance of scans with some kind of filters.
> ---
>
> Key: HBASE-5416
> URL: https://issues.apache.org/jira/browse/HBASE-5416
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters, Performance, regionserver
>Affects Versions: 0.90.4
>Reporter: Max Lapan
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 5416-0.94-v3.txt, 
> 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
> 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
> Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
> Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
> HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
> HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
> HBASE-5416-v9.patch, 
> org.apache.hadoop.hbase.regionserver.TestHRegion-output.txt
>
>
> When a scan is performed, the whole row is loaded into the result list, and 
> only after that is the filter (if any) applied to decide whether the row is 
> needed.
> But when a scan runs over several CFs and the filter checks data from only a 
> subset of them, the data from the unchecked CFs is not needed at the filter 
> stage; it is needed only once we have decided to include the current row. In 
> such cases we can significantly reduce the amount of IO performed by a scan 
> by loading only the values actually checked by the filter.
> For example, we have two CFs: flags and snap. Flags is quite small (a bunch 
> of megabytes) and is used to filter large entries from snap. Snap is very 
> large (10s of GB) and quite costly to scan. If we need only rows with some 
> flag set, we use SingleColumnValueFilter to limit the result to a small 
> subset of the region. But the current implementation loads both CFs to 
> perform the scan, even when only the small subset is needed.
> The attached patch adds one routine to the Filter interface that lets a 
> filter specify which CFs it needs for its operation. In HRegion, we separate 
> all scanners into two groups: those needed by the filter and the rest 
> (joined). When a new row is considered, only the needed data is loaded and 
> the filter is applied; only if the filter accepts the row is the rest of the 
> data loaded. On our data this speeds up such scans 30-50 times. It also 
> gives us a way to better normalize the data into separate columns by 
> optimizing the scans performed.
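
The two-phase scan described in the report can be illustrated with a small standalone sketch. This is not the actual HRegion code; the class and method names are invented for illustration:

```java
import java.util.*;
import java.util.function.Predicate;

/**
 * Sketch of lazy column-family loading: the small "essential" CF (flags) is
 * read and filtered first, and the expensive CF (snap) is loaded only for
 * rows the filter accepts. All names are illustrative, not HBase API.
 */
public class LazyCfScanDemo {
    public static int snapLoads = 0; // counts how often the big CF is touched

    static Map<String, String> loadSnap(String row) {
        snapLoads++; // stands in for the costly IO on the large CF
        return Collections.singletonMap("snap:data", "big-payload-" + row);
    }

    public static List<Map<String, String>> scan(Map<String, String> flagsCf,
                                                 Predicate<String> flagFilter) {
        List<Map<String, String>> results = new ArrayList<>();
        for (Map.Entry<String, String> e : flagsCf.entrySet()) {
            // Phase 1: check only the small CF.
            if (!flagFilter.test(e.getValue())) continue;
            // Phase 2: the row passed, now load the expensive CF too.
            Map<String, String> row = new HashMap<>(loadSnap(e.getKey()));
            row.put("flags:flag", e.getValue());
            results.add(row);
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, String> flags = new TreeMap<>();
        flags.put("row1", "keep");
        flags.put("row2", "skip");
        flags.put("row3", "keep");
        List<Map<String, String>> out = scan(flags, "keep"::equals);
        System.out.println(out.size());  // 2
        System.out.println(snapLoads);   // 2 (snap never loaded for row2)
    }
}
```

The win in the real patch comes from the same shape: the rejected rows never touch the joined scanners at all.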

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550794#comment-13550794
 ] 

Ted Yu commented on HBASE-5416:
---

Thanks, Lars, for the finding. I am running the test suite based on patch v3 
and will report back if there is any abnormality.
I was looking for long lines.
{code}
+  public static final String LOAD_CFS_ON_DEMAND_CONFIG_KEY = 
"hbase.hregion.scan.loadColumnFamiliesOnDemand";
{code}
nit: wrap long line above.
{code}
+ * @param heap KeyValueHeap to fetch data from. It must be positioned on 
correct row before call.
{code}
Long line: it would be 100 characters wide if the trailing period is removed.
{code}
+  stopRow = nextKv == null || isStopRow(nextKv.getBuffer(), 
nextKv.getRowOffset(), nextKv.getRowLength());
{code}

> Improve performance of scans with some kind of filters.
> ---
>
> Key: HBASE-5416
> URL: https://issues.apache.org/jira/browse/HBASE-5416
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters, Performance, regionserver
>Affects Versions: 0.90.4
>Reporter: Max Lapan
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 5416-0.94-v3.txt, 
> 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
> 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
> Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
> Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
> HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
> HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
> HBASE-5416-v9.patch, 
> org.apache.hadoop.hbase.regionserver.TestHRegion-output.txt
>
>
> When a scan is performed, the whole row is loaded into the result list, and 
> only after that is the filter (if any) applied to decide whether the row is 
> needed.
> But when a scan runs over several CFs and the filter checks data from only a 
> subset of them, the data from the unchecked CFs is not needed at the filter 
> stage; it is needed only once we have decided to include the current row. In 
> such cases we can significantly reduce the amount of IO performed by a scan 
> by loading only the values actually checked by the filter.
> For example, we have two CFs: flags and snap. Flags is quite small (a bunch 
> of megabytes) and is used to filter large entries from snap. Snap is very 
> large (10s of GB) and quite costly to scan. If we need only rows with some 
> flag set, we use SingleColumnValueFilter to limit the result to a small 
> subset of the region. But the current implementation loads both CFs to 
> perform the scan, even when only the small subset is needed.
> The attached patch adds one routine to the Filter interface that lets a 
> filter specify which CFs it needs for its operation. In HRegion, we separate 
> all scanners into two groups: those needed by the filter and the rest 
> (joined). When a new row is considered, only the needed data is loaded and 
> the filter is applied; only if the filter accepts the row is the rest of the 
> data loaded. On our data this speeds up such scans 30-50 times. It also 
> gives us a way to better normalize the data into separate columns by 
> optimizing the scans performed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE

2013-01-10 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7404:
--

 Description: 
First, thanks to @neil from Fusion-IO for sharing the source code.

Usage:

1. Use bucket cache as the main memory cache, configured as follows:
–"hbase.bucketcache.ioengine" "heap"
–"hbase.bucketcache.size" 0.4 (the size of the bucket cache; 0.4 is a 
percentage of the max heap size)

2. Use bucket cache as a secondary cache, configured as follows:
–"hbase.bucketcache.ioengine" "file:/disk1/hbase/cache.data" (the file path 
where the block data is stored)
–"hbase.bucketcache.size" 1024 (the size of the bucket cache; the unit is MB, 
so 1024 means 1GB)
–"hbase.bucketcache.combinedcache.enabled" false (default value being true)

See more configurations from org.apache.hadoop.hbase.io.hfile.CacheConfig and 
org.apache.hadoop.hbase.io.hfile.bucket.BucketCache
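
As a sketch, the secondary-cache settings listed in the usage section above would translate into hbase-site.xml roughly as follows. The property names are taken from the description above; the file path and sizes are illustrative, and CacheConfig remains the authoritative reference for the keys:

```xml
<!-- Secondary (file-backed) bucket cache; values are illustrative. -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>file:/disk1/hbase/cache.data</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>1024</value> <!-- unit is MB, so 1024 means 1GB -->
</property>
<property>
  <name>hbase.bucketcache.combinedcache.enabled</name>
  <value>false</value> <!-- default value is true -->
</property>
```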


What's Bucket Cache? 
It can greatly decrease CMS pauses and the heap fragmentation caused by GC.
It supports a large cache space for high read performance by using high-speed 
disks like Fusion-io.

1. An implementation of block cache like LruBlockCache
2. Manages the blocks' storage positions itself through the Bucket Allocator
3. The cached blocks can be stored in memory or in the file system
4. BucketCache can be used as the main block cache (see CombinedBlockCache), 
combined with LruBlockCache, to decrease CMS and fragmentation caused by GC
5. BucketCache can also be used as a secondary cache (e.g. using Fusion-io to 
store blocks) to enlarge the cache space


How about SlabCache?
We studied and tested SlabCache first, but the results were bad, because:
1. SlabCache uses SingleSizeCache; its memory utilization is low because of 
the variety of block sizes, especially when using DataBlockEncoding
2. SlabCache is used in DoubleBlockCache; a block is cached both in SlabCache 
and LruBlockCache, and is put into LruBlockCache again on a SlabCache hit, so 
CMS and heap fragmentation don't get any better
3. Direct (off-heap) memory performance is not as good as heap memory and may 
cause OOM, so we recommend using the "heap" engine

See more in the attachment and in the patch



  was:
First, thanks @neil from Fusion-IO share the source code.

Usage:

1.Use bucket cache as a mainly memory cache, configure as the following:
–"hbase.bucketcache.ioengine" "heap"
–"hbase.bucketcache.size" 0.4 (size for bucket cache, 0.4 is a percentage of 
max heap size)

2.Use bucket cache as a secondary cache, configure as the following:
–"hbase.bucketcache.ioengine" "file:/disk1/hbase/cache.data"(The file path 
where to store the block data)
–"hbase.bucketcache.size" 1024 (size for bucket cache, unit is MB, so 1024 
means 1GB)
–"hbase.bucketcache.combinedcache.enabled" false

See more configurations from org.apache.hadoop.hbase.io.hfile.CacheConfig and 
org.apache.hadoop.hbase.io.hfile.bucket.BucketCache


What's Bucket Cache? 
It could greatly decrease CMS and heap fragment by GC
It support a large cache space for High Read Performance by using high speed 
disk like Fusion-io

1.An implementation of block cache like LruBlockCache
2.Self manage blocks' storage position through Bucket Allocator
3.The cached blocks could be stored in the memory or file system
4.Bucket Cache could be used as a mainly block cache(see CombinedBlockCache), 
combined with LruBlockCache to decrease CMS and fragment by GC.
5.BucketCache also could be used as a secondary cache(e.g. using Fusionio to 
store block) to enlarge cache space


How about SlabCache?
We have studied and test SlabCache first, but the result is bad, because:
1.SlabCache use SingleSizeCache, its use ratio of memory is low because kinds 
of block size, especially using DataBlockEncoding
2.SlabCache is uesd in DoubleBlockCache, block is cached both in SlabCache and 
LruBlockCache, put the block to LruBlockCache again if hit in SlabCache , it 
causes CMS and heap fragment don't get any better
3.Direct heap performance is not good as heap, and maybe cause OOM, so we 
recommend using "heap" engine 

See more in the attachment and in the patch



Release Note: 
BucketCache is another implementation of BlockCache. It supports a big block 
cache for high performance and greatly decreases the CMS pauses and JVM heap 
fragmentation caused by read activity.


Usage:

1. Use bucket cache as the main memory cache, configured as follows:
–"hbase.bucketcache.ioengine" "heap"
–"hbase.bucketcache.size" 0.4 (the size of the bucket cache; 0.4 is a 
percentage of the max heap size)

2. Use bucket cache as a secondary cache, configured as follows:
–"hbase.bucketcache.ioengine" "file:/disk1/hbase/cache.data" (the file path 
where the block data is stored)
–"hbase.bucketcache.size" 1024 (the size of the bucket cache; the unit is MB, 
so 1024 means 1GB)
–"hbase.bucketcache.combinedcache.enabled" false (default value being true)


  was:BucketCache is another implementation of BlockCache, which supports big 
block cache for high performance and could greatly decrease CMS and heap 
fragmentation

[jira] [Commented] (HBASE-7507) Make memstore flush be able to retry after exception

2013-01-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550790#comment-13550790
 ] 

Ted Yu commented on HBASE-7507:
---

+1 on latest patch.
Nice to see a green QA run.

> Make memstore flush be able to retry after exception
> 
>
> Key: HBASE-7507
> URL: https://issues.apache.org/jira/browse/HBASE-7507
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7507-trunk v1.patch, 7507-trunk v2.patch, 
> 7507-trunkv3.patch
>
>
> We will abort the regionserver if a memstore flush throws an exception.
> I think we could retry to make the regionserver more stable, because the 
> file system may be unavailable for a transient period, e.g. when switching 
> the namenode in a NameNode HA environment.
> {code}
> HRegion#internalFlushcache(){
> ...
> try {
> ...
> }catch(Throwable t){
> DroppedSnapshotException dse = new DroppedSnapshotException("region: " +
>   Bytes.toStringBinary(getRegionName()));
> dse.initCause(t);
> throw dse;
> }
> ...
> }
> MemStoreFlusher#flushRegion(){
> ...
> region.flushcache();
> ...
>  try {
> }catch(DroppedSnapshotException ex){
> server.abort("Replay of HLog required. Forcing server shutdown", ex);
> }
> ...
> }
> {code}
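
The retry idea under discussion can be illustrated with a small standalone sketch. This is not the actual MemStoreFlusher patch; the names flushWithRetries and its retry/pause parameters are hypothetical:

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Sketch of the proposed behavior: instead of aborting on the first flush
 * failure, retry a bounded number of times so a transient FS outage
 * (e.g. a NameNode failover) can pass. Names are illustrative.
 */
public class FlushRetryDemo {
    public interface Flushable { void flush() throws IOException; }

    /** Returns true if a flush succeeded within maxRetries attempts. */
    public static boolean flushWithRetries(Flushable region, int maxRetries,
                                           long pauseMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            try {
                region.flush();
                return true; // success, stop retrying
            } catch (IOException e) {
                if (attempt == maxRetries) {
                    return false; // caller would abort the regionserver here
                }
                Thread.sleep(pauseMillis); // give the FS time to recover
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger calls = new AtomicInteger();
        // Fails twice (simulating a transient outage), then succeeds.
        Flushable flaky = () -> {
            if (calls.incrementAndGet() < 3) throw new IOException("FS down");
        };
        System.out.println(flushWithRetries(flaky, 5, 1L)); // true
        System.out.println(calls.get()); // 3
    }
}
```

The bounded attempt count preserves the existing safety property: a persistently failing file system still surfaces the DroppedSnapshotException path and the abort.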

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7536) Add test that confirms that multiple concurrent snapshot requests are rejected.

2013-01-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550788#comment-13550788
 ] 

Ted Yu commented on HBASE-7536:
---

Understood.
It is up to your discretion.

> Add test that confirms that multiple concurrent snapshot requests are 
> rejected.
> ---
>
> Key: HBASE-7536
> URL: https://issues.apache.org/jira/browse/HBASE-7536
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Attachments: hbase-7536.patch, pre-hbase-7536.patch
>
>
> Currently the rule is that we can have only one online snapshot running at a 
> time.  This test tries to prove this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7536) Add test that confirms that multiple concurrent snapshot requests are rejected.

2013-01-10 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550783#comment-13550783
 ] 

Jonathan Hsieh commented on HBASE-7536:
---

When I ran it we ended up with 1, 2 or 3 snapshots -- but the debugging info 
showed that they happened serially (which is valid).  

The problem is that to make this even more robust, we'd need not to add more 
data but to add more regions. And to make it truly robust, we'd need to 
rewrite major portions so we can control everything.  

> Add test that confirms that multiple concurrent snapshot requests are 
> rejected.
> ---
>
> Key: HBASE-7536
> URL: https://issues.apache.org/jira/browse/HBASE-7536
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Attachments: hbase-7536.patch, pre-hbase-7536.patch
>
>
> Currently the rule is that we can have only one online snapshot running at a 
> time.  This test tries to prove this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5416:
-

Attachment: 5416-0.94-v3.txt

Patch that has this fixed.
I think I should do a bit more double-checking before committing.

> Improve performance of scans with some kind of filters.
> ---
>
> Key: HBASE-5416
> URL: https://issues.apache.org/jira/browse/HBASE-5416
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters, Performance, regionserver
>Affects Versions: 0.90.4
>Reporter: Max Lapan
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 5416-0.94-v3.txt, 
> 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
> 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
> Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
> Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
> HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
> HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
> HBASE-5416-v9.patch, 
> org.apache.hadoop.hbase.regionserver.TestHRegion-output.txt
>
>
> When a scan is performed, the whole row is loaded into the result list, and 
> only after that is the filter (if any) applied to decide whether the row is 
> needed.
> But when a scan runs over several CFs and the filter checks data from only a 
> subset of them, the data from the unchecked CFs is not needed at the filter 
> stage; it is needed only once we have decided to include the current row. In 
> such cases we can significantly reduce the amount of IO performed by a scan 
> by loading only the values actually checked by the filter.
> For example, we have two CFs: flags and snap. Flags is quite small (a bunch 
> of megabytes) and is used to filter large entries from snap. Snap is very 
> large (10s of GB) and quite costly to scan. If we need only rows with some 
> flag set, we use SingleColumnValueFilter to limit the result to a small 
> subset of the region. But the current implementation loads both CFs to 
> perform the scan, even when only the small subset is needed.
> The attached patch adds one routine to the Filter interface that lets a 
> filter specify which CFs it needs for its operation. In HRegion, we separate 
> all scanners into two groups: those needed by the filter and the rest 
> (joined). When a new row is considered, only the needed data is loaded and 
> the filter is applied; only if the filter accepts the row is the rest of the 
> data loaded. On our data this speeds up such scans 30-50 times. It also 
> gives us a way to better normalize the data into separate columns by 
> optimizing the scans performed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7507) Make memstore flush be able to retry after exception

2013-01-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550779#comment-13550779
 ] 

Hadoop QA commented on HBASE-7507:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12564325/7507-trunkv3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3974//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3974//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3974//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3974//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3974//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3974//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3974//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3974//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3974//console

This message is automatically generated.

> Make memstore flush be able to retry after exception
> 
>
> Key: HBASE-7507
> URL: https://issues.apache.org/jira/browse/HBASE-7507
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7507-trunk v1.patch, 7507-trunk v2.patch, 
> 7507-trunkv3.patch
>
>
> We will abort the regionserver if a memstore flush throws an exception.
> I think we could retry to make the regionserver more stable, because the 
> file system may be unavailable for a transient period, e.g. when switching 
> the namenode in a NameNode HA environment.
> {code}
> HRegion#internalFlushcache(){
> ...
> try {
> ...
> }catch(Throwable t){
> DroppedSnapshotException dse = new DroppedSnapshotException("region: " +
>   Bytes.toStringBinary(getRegionName()));
> dse.initCause(t);
> throw dse;
> }
> ...
> }
> MemStoreFlusher#flushRegion(){
> ...
> region.flushcache();
> ...
>  try {
> }catch(DroppedSnapshotException ex){
> server.abort("Replay of HLog required. Forcing server shutdown", ex);
> }
> ...
> }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550772#comment-13550772
 ] 

Lars Hofhansl commented on HBASE-5416:
--

Found the problem. For the part of the patch that I had applied manually I 
mistook {{kv != KV_LIMIT}} for {{kv == KV_LIMIT}}.
No idea how on earth the test on my server machine passed.

> Improve performance of scans with some kind of filters.
> ---
>
> Key: HBASE-5416
> URL: https://issues.apache.org/jira/browse/HBASE-5416
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters, Performance, regionserver
>Affects Versions: 0.90.4
>Reporter: Max Lapan
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 
> 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
> 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
> Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
> Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
> HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
> HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
> HBASE-5416-v9.patch, 
> org.apache.hadoop.hbase.regionserver.TestHRegion-output.txt
>
>
> When a scan is performed, the whole row is loaded into the result list, and 
> only then is the filter (if any) applied to decide whether the row is needed.
> But when a scan covers several CFs and the filter checks only a subset of 
> them, the data from the unchecked CFs is not needed at the filter stage; it 
> is needed only once we have decided to include the row. In that case we can 
> significantly reduce the amount of IO a scan performs by loading only the 
> values the filter actually checks.
> For example, we have two CFs: flags and snap. Flags is quite small (a bunch 
> of megabytes) and is used to filter large entries from snap. Snap is very 
> large (10s of GB) and is quite costly to scan. If we need only rows with 
> some flag set, we use SingleColumnValueFilter to limit the result to a small 
> subset of the region. But the current implementation loads both CFs to 
> perform the scan, when only the small subset is needed.
> The attached patch adds one method to the Filter interface that lets a 
> filter specify which CFs it needs for its operation. In HRegion, we separate 
> all scanners into two groups: those needed by the filter and the rest 
> (joined). When a new row is considered, only the needed data is loaded and 
> the filter is applied; only if the filter accepts the row is the rest of the 
> data loaded. On our data, this speeds up such scans 30-50 times. It also 
> gives us a way to better normalize the data into separate columns by 
> optimizing the scans performed.
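The two-phase idea in the description can be sketched without any HBase types. The model below is an assumption-laden simplification (rows as a map of CF name to value, hypothetical names {{demoRows}} and {{scan}}), not the patch itself; it just counts cell loads to show where the IO saving comes from:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.function.Predicate;

public class TwoPhaseScan {

    // Demo data: n rows with a small "flags" CF and a large "snap" CF;
    // every `every`-th row carries the flag the filter is looking for.
    static SortedMap<String, Map<String, String>> demoRows(int n, int every) {
        SortedMap<String, Map<String, String>> rows = new TreeMap<>();
        for (int i = 0; i < n; i++) {
            Map<String, String> row = new HashMap<>();
            row.put("flags", i % every == 0 ? "wanted" : "skip");
            row.put("snap", "huge-blob-" + i);
            rows.put(String.format("row%03d", i), row);
        }
        return rows;
    }

    // Two-phase scan: load the essential CF for every row, but the joined
    // CFs only for rows the filter accepts. Returns {accepted, cellLoads}.
    static int[] scan(SortedMap<String, Map<String, String>> rows,
                      String essentialCf, Predicate<String> filter) {
        int loads = 0, accepted = 0;
        for (Map<String, String> row : rows.values()) {
            // Phase 1: load only the CF the filter inspects.
            String cell = row.get(essentialCf);
            loads++;
            if (!filter.test(cell)) {
                continue; // joined CFs for this row are never touched
            }
            // Phase 2: the filter accepted the row, now load the rest.
            for (String cf : row.keySet()) {
                if (!cf.equals(essentialCf)) {
                    loads++;
                }
            }
            accepted++;
        }
        return new int[] { accepted, loads };
    }

    public static void main(String[] args) {
        int[] r = scan(demoRows(100, 50), "flags", "wanted"::equals);
        // 100 essential loads + 2 joined loads, instead of 200 for a
        // naive scan that always loads both CFs for every row.
        System.out.println(r[0] + " accepted, " + r[1] + " loads");
    }
}
```

With 100 rows and 2 matches, the selective version touches 102 cells against 200 for the naive one; the 30-50x figure in the description follows when the joined CF is far larger than the essential one.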



[jira] [Commented] (HBASE-7530) [replication] Work around HDFS-4380 else we get NPEs

2013-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550759#comment-13550759
 ] 

Hudson commented on HBASE-7530:
---

Integrated in HBase-0.94 #722 (See 
[https://builds.apache.org/job/HBase-0.94/722/])
HBASE-7530  [replication] Work around HDFS-4380 else we get NPEs
HBASE-7531  [replication] NPE in SequenceFileLogReader because
ReplicationSource doesn't nullify the reader
HBASE-7534  [replication] TestReplication.queueFailover can fail
because HBaseTestingUtility.createMultiRegions is dangerous 
(Revision 1431769)

 Result = SUCCESS
jdcryans : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java


> [replication] Work around HDFS-4380 else we get NPEs
> 
>
> Key: HBASE-7530
> URL: https://issues.apache.org/jira/browse/HBASE-7530
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7530.patch
>
>
> I've been spending a lot of time trying to figure out the recent test 
> failures related to replication. One I keep getting is this NPE:
> {noformat}
> 2013-01-09 10:08:56,912 ERROR 
> [RegionServer:1;172.23.7.205,61604,1357754664830-EventThread.replicationSource,2]
>  regionserver.ReplicationSource$1(727): Unexpected exception in 
> ReplicationSource, 
> currentPath=hdfs://localhost:61589/user/jdcryans/hbase/.logs/172.23.7.205,61604,1357754664830/172.23.7.205%2C61604%2C1357754664830.1357754936216
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.updateBlockInfo(DFSClient.java:1885)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1858)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
> at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
> at 
> org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:108)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1495)
> at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.openFile(SequenceFileLogReader.java:62)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1482)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1475)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1470)
> at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:55)
> at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.reset(SequenceFileLogReader.java:308)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.openReader(ReplicationHLogReaderManager.java:69)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:500)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:312)
> {noformat}
> Talking to [~tlipcon], he said it was likely fixed in Hadoop 2.0 via 
> HDFS-3222 but for Hadoop 1.0 he created HDFS-4380. This seems to happen while 
> crossing block boundaries and TestReplication uses a 20KB block size for the 
> HLog. The intent was just to get HLogs to roll more often, and this can also 
> be achieved with *hbase.regionserver.logroll.multiplier* with a value of 
> 0.0003f.
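The alternative the last sentence mentions can be written as an hbase-site.xml fragment. This is a sketch; the property name and value are taken directly from the comment above, and the multiplier only makes sense for tests that want very frequent log rolls:

```xml
<!-- Roll HLogs at a tiny fraction of the block size, so tests exercise
     log rolling without shrinking the HDFS block size itself. -->
<property>
  <name>hbase.regionserver.logroll.multiplier</name>
  <value>0.0003</value>
</property>
```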



[jira] [Commented] (HBASE-7531) [replication] NPE in SequenceFileLogReader because ReplicationSource doesn't nullify the reader

2013-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550757#comment-13550757
 ] 

Hudson commented on HBASE-7531:
---

Integrated in HBase-0.94 #722 (See 
[https://builds.apache.org/job/HBase-0.94/722/])
HBASE-7530  [replication] Work around HDFS-4380 else we get NPEs
HBASE-7531  [replication] NPE in SequenceFileLogReader because
ReplicationSource doesn't nullify the reader
HBASE-7534  [replication] TestReplication.queueFailover can fail
because HBaseTestingUtility.createMultiRegions is dangerous 
(Revision 1431769)

 Result = SUCCESS
jdcryans : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java


> [replication] NPE in SequenceFileLogReader because ReplicationSource doesn't 
> nullify the reader
> ---
>
> Key: HBASE-7531
> URL: https://issues.apache.org/jira/browse/HBASE-7531
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7531.patch
>
>
> Here's a NPE I get half the time I run TestReplication:
> {noformat}
> 2012-12-20 08:59:17,259 ERROR 
> [RegionServer:1;192.168.10.135,49168,1356011734418-EventThread.replicationSource,2]
>  regionserver.ReplicationSource$1(727): Unexpected exception in 
> ReplicationSource, 
> currentPath=hdfs://localhost:65533/user/jdcryans/hbase/.logs/192.168.10.135,49168,1356011734418/192.168.10.135%2C49168%2C1356011734418.1356011956626
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.seek(SequenceFileLogReader.java:261)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.seek(ReplicationHLogReaderManager.java:103)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:414)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:332)
> {noformat}
> The issue happens after an IOE is caught while opening the reader: the 
> reader isn't set to null afterwards, so the rest of the code assumes it is 
> still usable.
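The shape of the fix can be sketched without the ReplicationSource internals. This is not the actual patch; {{ReaderHolder}}, {{ReaderFactory}} and {{canSeek}} are hypothetical stand-ins showing the pattern: null out the reference when opening fails, so callers check for null instead of seeking on a stale reader:

```java
import java.io.IOException;

public class ReaderHolder {
    interface Reader { long seek(long pos) throws IOException; }
    interface ReaderFactory { Reader open() throws IOException; }

    private Reader reader;

    // Open a new reader; on failure, explicitly drop the old reference so
    // later calls don't operate on a reader that was never (re)opened.
    boolean openReader(ReaderFactory factory) {
        try {
            this.reader = factory.open();
            return true;
        } catch (IOException ioe) {
            this.reader = null; // the missing step behind the NPE
            return false;
        }
    }

    // Callers gate seek()/read() on this instead of assuming the last
    // successful open is still valid.
    boolean canSeek() {
        return reader != null;
    }
}
```

Without the {{reader = null}} line, a failed reopen leaves the previous reader in place and the next seek dereferences state that was torn down with it.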



[jira] [Commented] (HBASE-7534) [replication] TestReplication.queueFailover can fail because HBaseTestingUtility.createMultiRegions is dangerous

2013-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550758#comment-13550758
 ] 

Hudson commented on HBASE-7534:
---

Integrated in HBase-0.94 #722 (See 
[https://builds.apache.org/job/HBase-0.94/722/])
HBASE-7530  [replication] Work around HDFS-4380 else we get NPEs
HBASE-7531  [replication] NPE in SequenceFileLogReader because
ReplicationSource doesn't nullify the reader
HBASE-7534  [replication] TestReplication.queueFailover can fail
because HBaseTestingUtility.createMultiRegions is dangerous 
(Revision 1431769)

 Result = SUCCESS
jdcryans : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java


> [replication] TestReplication.queueFailover can fail because 
> HBaseTestingUtility.createMultiRegions is dangerous
> 
>
> Key: HBASE-7534
> URL: https://issues.apache.org/jira/browse/HBASE-7534
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7534.patch
>
>
> {{HBaseTestingUtility.createMultiRegions}} is an abomination: it uses an 
> already existing table and hot-replaces the regions in it. I've seen 
> TestReplication fail a few times because the "old" first region is still 
> assigned and tries to flush but crashes because the region's folder is 
> missing in HDFS: 
> {noformat}
> 2013-01-04 10:04:45,500 DEBUG 
> [RegionServer:1;172.21.3.117,57114,1357322589018.cacheFlusher] 
> regionserver.Store(844): Renaming flushed file at 
> hdfs://localhost:57099/user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/.tmp/b938b33268064312abfc250d2eeca61d
>  to 
> hdfs://localhost:57099/user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/f/b938b33268064312abfc250d2eeca61d
> 2013-01-04 10:04:45,500 WARN  [IPC Server handler 8 on 57099] 
> namenode.FSDirectory(422): DIR* FSDirectory.unprotectedRenameTo: failed to 
> rename 
> /user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/.tmp/b938b33268064312abfc250d2eeca61d
>  to 
> /user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/f/b938b33268064312abfc250d2eeca61d
>  because destination's parent does not exist
> 2013-01-04 10:04:45,503 WARN  
> [RegionServer:1;172.21.3.117,57114,1357322589018.cacheFlusher] 
> regionserver.Store(847): Unable to rename 
> hdfs://localhost:57099/user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/.tmp/b938b33268064312abfc250d2eeca61d
>  to 
> hdfs://localhost:57099/user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/f/b938b33268064312abfc250d2eeca61d
> 2013-01-04 10:04:45,504 WARN  [DataStreamer for file 
> /user/jdcryans/hbase/.logs/172.21.3.117,57113,1357322588994/172.21.3.117%2C57113%2C1357322588994.1357322683769]
>  hdfs.DFSClient$DFSOutputStream$DataStreamer(2873): DataStreamer Exception: 
> org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on 
> /user/jdcryans/hbase/.logs/172.21.3.117,57113,1357322588994/172.21.3.117%2C57113%2C1357322588994.1357322683769
>  File does not exist. [Lease.  Holder: 
> DFSClient_hb_rs_172.21.3.117,57113,1357322588994, pendingcreates: 1]
> {noformat}
> Eventually the test times out because both region servers on the master 
> cluster are dead.
> It can be easily fixed by pre-creating the table with enough regions.
> FWIW a bunch of other tests use this facility; my IDE tells me the 
> 3 methods are called 25 times outside of {{HBaseTestingUtility}}.
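The suggested fix, pre-creating the table with enough regions, boils down to computing split keys up front and creating the table with them instead of hot-swapping regions afterwards. The helper below is hypothetical, not an HBase API; with a real cluster the resulting keys would be passed to {{HBaseAdmin.createTable(desc, splitKeys)}}:

```java
import java.util.ArrayList;
import java.util.List;

public class PreSplit {
    // Evenly spaced single-byte split keys for numRegions regions over the
    // full unsigned-byte keyspace: n regions need n - 1 split points, and
    // region i then holds rows whose first byte falls in its slice.
    static List<byte[]> splitKeys(int numRegions) {
        List<byte[]> keys = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            keys.add(new byte[] { (byte) (i * 256 / numRegions) });
        }
        return keys;
    }

    public static void main(String[] args) {
        // e.g. admin.createTable(desc, splitKeys(16).toArray(new byte[0][]))
        System.out.println(splitKeys(16).size()); // 15 split points
    }
}
```

Because the regions exist before any data is written, there is no window where a stale region is still assigned while its folder has been replaced underneath it.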



[jira] [Commented] (HBASE-7531) [replication] NPE in SequenceFileLogReader because ReplicationSource doesn't nullify the reader

2013-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550748#comment-13550748
 ] 

Hudson commented on HBASE-7531:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #342 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/342/])
HBASE-7530  [replication] Work around HDFS-4380 else we get NPEs
HBASE-7531  [replication] NPE in SequenceFileLogReader because
ReplicationSource doesn't nullify the reader
HBASE-7534  [replication] TestReplication.queueFailover can fail
because HBaseTestingUtility.createMultiRegions is dangerous 
(Revision 1431768)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java


> [replication] NPE in SequenceFileLogReader because ReplicationSource doesn't 
> nullify the reader
> ---
>
> Key: HBASE-7531
> URL: https://issues.apache.org/jira/browse/HBASE-7531
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7531.patch
>
>
> Here's a NPE I get half the time I run TestReplication:
> {noformat}
> 2012-12-20 08:59:17,259 ERROR 
> [RegionServer:1;192.168.10.135,49168,1356011734418-EventThread.replicationSource,2]
>  regionserver.ReplicationSource$1(727): Unexpected exception in 
> ReplicationSource, 
> currentPath=hdfs://localhost:65533/user/jdcryans/hbase/.logs/192.168.10.135,49168,1356011734418/192.168.10.135%2C49168%2C1356011734418.1356011956626
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.seek(SequenceFileLogReader.java:261)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.seek(ReplicationHLogReaderManager.java:103)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:414)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:332)
> {noformat}
> The issue happens after an IOE is caught while opening the reader: the 
> reader isn't set to null afterwards, so the rest of the code assumes it is 
> still usable.



[jira] [Commented] (HBASE-7534) [replication] TestReplication.queueFailover can fail because HBaseTestingUtility.createMultiRegions is dangerous

2013-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550749#comment-13550749
 ] 

Hudson commented on HBASE-7534:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #342 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/342/])
HBASE-7530  [replication] Work around HDFS-4380 else we get NPEs
HBASE-7531  [replication] NPE in SequenceFileLogReader because
ReplicationSource doesn't nullify the reader
HBASE-7534  [replication] TestReplication.queueFailover can fail
because HBaseTestingUtility.createMultiRegions is dangerous 
(Revision 1431768)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java


> [replication] TestReplication.queueFailover can fail because 
> HBaseTestingUtility.createMultiRegions is dangerous
> 
>
> Key: HBASE-7534
> URL: https://issues.apache.org/jira/browse/HBASE-7534
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7534.patch
>
>
> {{HBaseTestingUtility.createMultiRegions}} is an abomination: it uses an 
> already existing table and hot-replaces the regions in it. I've seen 
> TestReplication fail a few times because the "old" first region is still 
> assigned and tries to flush but crashes because the region's folder is 
> missing in HDFS: 
> {noformat}
> 2013-01-04 10:04:45,500 DEBUG 
> [RegionServer:1;172.21.3.117,57114,1357322589018.cacheFlusher] 
> regionserver.Store(844): Renaming flushed file at 
> hdfs://localhost:57099/user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/.tmp/b938b33268064312abfc250d2eeca61d
>  to 
> hdfs://localhost:57099/user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/f/b938b33268064312abfc250d2eeca61d
> 2013-01-04 10:04:45,500 WARN  [IPC Server handler 8 on 57099] 
> namenode.FSDirectory(422): DIR* FSDirectory.unprotectedRenameTo: failed to 
> rename 
> /user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/.tmp/b938b33268064312abfc250d2eeca61d
>  to 
> /user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/f/b938b33268064312abfc250d2eeca61d
>  because destination's parent does not exist
> 2013-01-04 10:04:45,503 WARN  
> [RegionServer:1;172.21.3.117,57114,1357322589018.cacheFlusher] 
> regionserver.Store(847): Unable to rename 
> hdfs://localhost:57099/user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/.tmp/b938b33268064312abfc250d2eeca61d
>  to 
> hdfs://localhost:57099/user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/f/b938b33268064312abfc250d2eeca61d
> 2013-01-04 10:04:45,504 WARN  [DataStreamer for file 
> /user/jdcryans/hbase/.logs/172.21.3.117,57113,1357322588994/172.21.3.117%2C57113%2C1357322588994.1357322683769]
>  hdfs.DFSClient$DFSOutputStream$DataStreamer(2873): DataStreamer Exception: 
> org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on 
> /user/jdcryans/hbase/.logs/172.21.3.117,57113,1357322588994/172.21.3.117%2C57113%2C1357322588994.1357322683769
>  File does not exist. [Lease.  Holder: 
> DFSClient_hb_rs_172.21.3.117,57113,1357322588994, pendingcreates: 1]
> {noformat}
> Eventually the test times out because both region servers on the master 
> cluster are dead.
> It can be easily fixed by pre-creating the table with enough regions.
> FWIW a bunch of other tests use this facility; my IDE tells me the 
> 3 methods are called 25 times outside of {{HBaseTestingUtility}}.



[jira] [Commented] (HBASE-7530) [replication] Work around HDFS-4380 else we get NPEs

2013-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550750#comment-13550750
 ] 

Hudson commented on HBASE-7530:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #342 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/342/])
HBASE-7530  [replication] Work around HDFS-4380 else we get NPEs
HBASE-7531  [replication] NPE in SequenceFileLogReader because
ReplicationSource doesn't nullify the reader
HBASE-7534  [replication] TestReplication.queueFailover can fail
because HBaseTestingUtility.createMultiRegions is dangerous 
(Revision 1431768)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java


> [replication] Work around HDFS-4380 else we get NPEs
> 
>
> Key: HBASE-7530
> URL: https://issues.apache.org/jira/browse/HBASE-7530
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7530.patch
>
>
> I've been spending a lot of time trying to figure out the recent test 
> failures related to replication. One I keep getting is this NPE:
> {noformat}
> 2013-01-09 10:08:56,912 ERROR 
> [RegionServer:1;172.23.7.205,61604,1357754664830-EventThread.replicationSource,2]
>  regionserver.ReplicationSource$1(727): Unexpected exception in 
> ReplicationSource, 
> currentPath=hdfs://localhost:61589/user/jdcryans/hbase/.logs/172.23.7.205,61604,1357754664830/172.23.7.205%2C61604%2C1357754664830.1357754936216
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.updateBlockInfo(DFSClient.java:1885)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1858)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
> at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
> at 
> org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:108)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1495)
> at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.openFile(SequenceFileLogReader.java:62)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1482)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1475)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1470)
> at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:55)
> at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.reset(SequenceFileLogReader.java:308)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.openReader(ReplicationHLogReaderManager.java:69)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:500)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:312)
> {noformat}
> Talking to [~tlipcon], he said it was likely fixed in Hadoop 2.0 via 
> HDFS-3222 but for Hadoop 1.0 he created HDFS-4380. This seems to happen while 
> crossing block boundaries and TestReplication uses a 20KB block size for the 
> HLog. The intent was just to get HLogs to roll more often, and this can also 
> be achieved with *hbase.regionserver.logroll.multiplier* with a value of 
> 0.0003f.



[jira] [Commented] (HBASE-7534) [replication] TestReplication.queueFailover can fail because HBaseTestingUtility.createMultiRegions is dangerous

2013-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550744#comment-13550744
 ] 

Hudson commented on HBASE-7534:
---

Integrated in HBase-TRUNK #3726 (See 
[https://builds.apache.org/job/HBase-TRUNK/3726/])
HBASE-7530  [replication] Work around HDFS-4380 else we get NPEs
HBASE-7531  [replication] NPE in SequenceFileLogReader because
ReplicationSource doesn't nullify the reader
HBASE-7534  [replication] TestReplication.queueFailover can fail
because HBaseTestingUtility.createMultiRegions is dangerous 
(Revision 1431768)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java


> [replication] TestReplication.queueFailover can fail because 
> HBaseTestingUtility.createMultiRegions is dangerous
> 
>
> Key: HBASE-7534
> URL: https://issues.apache.org/jira/browse/HBASE-7534
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7534.patch
>
>
> {{HBaseTestingUtility.createMultiRegions}} is an abomination: it uses an 
> already existing table and hot-replaces the regions in it. I've seen 
> TestReplication fail a few times because the "old" first region is still 
> assigned and tries to flush but crashes because the region's folder is 
> missing in HDFS: 
> {noformat}
> 2013-01-04 10:04:45,500 DEBUG 
> [RegionServer:1;172.21.3.117,57114,1357322589018.cacheFlusher] 
> regionserver.Store(844): Renaming flushed file at 
> hdfs://localhost:57099/user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/.tmp/b938b33268064312abfc250d2eeca61d
>  to 
> hdfs://localhost:57099/user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/f/b938b33268064312abfc250d2eeca61d
> 2013-01-04 10:04:45,500 WARN  [IPC Server handler 8 on 57099] 
> namenode.FSDirectory(422): DIR* FSDirectory.unprotectedRenameTo: failed to 
> rename 
> /user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/.tmp/b938b33268064312abfc250d2eeca61d
>  to 
> /user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/f/b938b33268064312abfc250d2eeca61d
>  because destination's parent does not exist
> 2013-01-04 10:04:45,503 WARN  
> [RegionServer:1;172.21.3.117,57114,1357322589018.cacheFlusher] 
> regionserver.Store(847): Unable to rename 
> hdfs://localhost:57099/user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/.tmp/b938b33268064312abfc250d2eeca61d
>  to 
> hdfs://localhost:57099/user/jdcryans/hbase/test/62c85f8a6e3d0e32b2fb21326537f5a6/f/b938b33268064312abfc250d2eeca61d
> 2013-01-04 10:04:45,504 WARN  [DataStreamer for file 
> /user/jdcryans/hbase/.logs/172.21.3.117,57113,1357322588994/172.21.3.117%2C57113%2C1357322588994.1357322683769]
>  hdfs.DFSClient$DFSOutputStream$DataStreamer(2873): DataStreamer Exception: 
> org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on 
> /user/jdcryans/hbase/.logs/172.21.3.117,57113,1357322588994/172.21.3.117%2C57113%2C1357322588994.1357322683769
>  File does not exist. [Lease.  Holder: 
> DFSClient_hb_rs_172.21.3.117,57113,1357322588994, pendingcreates: 1]
> {noformat}
> Eventually the test times out because both region servers on the master 
> cluster are dead.
> It can be easily fixed by pre-creating the table with enough regions.
> FWIW a bunch of other tests use this facility; my IDE tells me the 
> 3 methods are called 25 times outside of {{HBaseTestingUtility}}.



[jira] [Commented] (HBASE-7531) [replication] NPE in SequenceFileLogReader because ReplicationSource doesn't nullify the reader

2013-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550743#comment-13550743
 ] 

Hudson commented on HBASE-7531:
---

Integrated in HBase-TRUNK #3726 (See 
[https://builds.apache.org/job/HBase-TRUNK/3726/])
HBASE-7530  [replication] Work around HDFS-4380 else we get NPEs
HBASE-7531  [replication] NPE in SequenceFileLogReader because
ReplicationSource doesn't nullify the reader
HBASE-7534  [replication] TestReplication.queueFailover can fail
because HBaseTestingUtility.createMultiRegions is dangerous 
(Revision 1431768)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java


> [replication] NPE in SequenceFileLogReader because ReplicationSource doesn't 
> nullify the reader
> ---
>
> Key: HBASE-7531
> URL: https://issues.apache.org/jira/browse/HBASE-7531
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7531.patch
>
>
> Here's a NPE I get half the time I run TestReplication:
> {noformat}
> 2012-12-20 08:59:17,259 ERROR 
> [RegionServer:1;192.168.10.135,49168,1356011734418-EventThread.replicationSource,2]
>  regionserver.ReplicationSource$1(727): Unexpected exception in 
> ReplicationSource, 
> currentPath=hdfs://localhost:65533/user/jdcryans/hbase/.logs/192.168.10.135,49168,1356011734418/192.168.10.135%2C49168%2C1356011734418.1356011956626
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.seek(SequenceFileLogReader.java:261)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.seek(ReplicationHLogReaderManager.java:103)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:414)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:332)
> {noformat}
> The issue happens after an IOE is caught while opening the reader: the 
> reader isn't set to null afterwards, so the rest of the code assumes it is 
> still usable.



[jira] [Commented] (HBASE-7530) [replication] Work around HDFS-4380 else we get NPEs

2013-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550745#comment-13550745
 ] 

Hudson commented on HBASE-7530:
---

Integrated in HBase-TRUNK #3726 (See 
[https://builds.apache.org/job/HBase-TRUNK/3726/])
HBASE-7530  [replication] Work around HDFS-4380 else we get NPEs
HBASE-7531  [replication] NPE in SequenceFileLogReader because
ReplicationSource doesn't nullify the reader
HBASE-7534  [replication] TestReplication.queueFailover can fail
because HBaseTestingUtility.createMultiRegions is dangerous 
(Revision 1431768)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java


> [replication] Work around HDFS-4380 else we get NPEs
> 
>
> Key: HBASE-7530
> URL: https://issues.apache.org/jira/browse/HBASE-7530
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7530.patch
>
>
> I've been spending a lot of time trying to figure out the recent test 
> failures related to replication. One I seem to be getting constantly is this 
> NPE:
> {noformat}
> 2013-01-09 10:08:56,912 ERROR 
> [RegionServer:1;172.23.7.205,61604,1357754664830-EventThread.replicationSource,2]
>  regionserver.ReplicationSource$1(727): Unexpected exception in 
> ReplicationSource, 
> currentPath=hdfs://localhost:61589/user/jdcryans/hbase/.logs/172.23.7.205,61604,1357754664830/172.23.7.205%2C61604%2C1357754664830.1357754936216
> java.lang.NullPointerException
> at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.updateBlockInfo(DFSClient.java:1885)
> at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1858)
> at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
> at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
> at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
> at org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:108)
> at org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1495)
> at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.openFile(SequenceFileLogReader.java:62)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1482)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1475)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1470)
> at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:55)
> at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.reset(SequenceFileLogReader.java:308)
> at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.openReader(ReplicationHLogReaderManager.java:69)
> at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:500)
> at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:312)
> {noformat}
> Talking to [~tlipcon], he said it was likely fixed in Hadoop 2.0 via 
> HDFS-3222 but for Hadoop 1.0 he created HDFS-4380. This seems to happen while 
> crossing block boundaries and TestReplication uses a 20KB block size for the 
> HLog. The intent was just to get HLogs to roll more often, and this can also 
> be achieved with *hbase.regionserver.logroll.multiplier* with a value of 
> 0.0003f.
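The alternative the comment suggests (rolling HLogs more often without shrinking the HDFS block size) would look roughly like this as an hbase-site.xml fragment; the property name and the 0.0003 value come straight from the text above, and how TestReplication wires it in is an assumption:

```xml
<!-- Sketch: roll the HLog at a tiny fraction of the block size so logs
     roll frequently, instead of forcing a 20KB HDFS block size. -->
<property>
  <name>hbase.regionserver.logroll.multiplier</name>
  <value>0.0003</value>
</property>
```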



[jira] [Commented] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE

2013-01-10 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550740#comment-13550740
 ] 

chunhui shen commented on HBASE-7404:
-

Thanks for the review, Ted, Sergey

Will commit to trunk tomorrow if there are no objections.

> Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
> --
>
> Key: HBASE-7404
> URL: https://issues.apache.org/jira/browse/HBASE-7404
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7404-trunk-v10.patch, 7404-trunk-v11.patch, 
> 7404-trunk-v12.patch, 7404-trunk-v13.patch, 7404-trunk-v13.txt, 
> 7404-trunk-v14.patch, BucketCache.pdf, hbase-7404-94v2.patch, 
> hbase-7404-trunkv2.patch, hbase-7404-trunkv9.patch, Introduction of Bucket 
> Cache.pdf
>
>
> First, thanks to @neil from Fusion-IO for sharing the source code.
> Usage:
> 1. Use bucket cache as the main memory cache, configured as follows:
> –"hbase.bucketcache.ioengine" "heap"
> –"hbase.bucketcache.size" 0.4 (size of the bucket cache; 0.4 is a fraction 
> of the max heap size)
> 2. Use bucket cache as a secondary cache, configured as follows:
> –"hbase.bucketcache.ioengine" "file:/disk1/hbase/cache.data" (the file path 
> where the block data is stored)
> –"hbase.bucketcache.size" 1024 (size of the bucket cache in MB, so 1024 
> means 1GB)
> –"hbase.bucketcache.combinedcache.enabled" false
> See more configurations in org.apache.hadoop.hbase.io.hfile.CacheConfig and 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache
> What is Bucket Cache?
> It can greatly decrease CMS pauses and heap fragmentation caused by GC, and 
> it supports a large cache space for high read performance by using 
> high-speed disks such as Fusion-io.
> 1. A block cache implementation, like LruBlockCache
> 2. It manages blocks' storage positions itself through the Bucket Allocator
> 3. Cached blocks can be stored in memory or in the file system
> 4. Bucket Cache can be used as the main block cache (see 
> CombinedBlockCache), combined with LruBlockCache, to decrease CMS pauses 
> and fragmentation caused by GC
> 5. BucketCache can also be used as a secondary cache (e.g. using Fusion-io 
> to store blocks) to enlarge the cache space
> How about SlabCache?
> We studied and tested SlabCache first, but the results were poor, because:
> 1. SlabCache uses SingleSizeCache, whose memory utilization is low because 
> block sizes vary, especially with DataBlockEncoding
> 2. SlabCache is used in DoubleBlockCache: a block is cached in both 
> SlabCache and LruBlockCache, and on a SlabCache hit the block is put into 
> LruBlockCache again, so CMS and heap fragmentation do not improve
> 3. Direct (off-heap) memory performance is not as good as heap and may 
> cause OOM, so we recommend using the "heap" engine
> See more in the attachment and in the patch
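For reference, the secondary-cache setup above (mode 2) would look like this as an hbase-site.xml fragment; the property names and values are the ones listed in the comment, while the file path is the example path given there:

```xml
<!-- BucketCache as a secondary, file-backed cache (mode 2 above). -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>file:/disk1/hbase/cache.data</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>1024</value> <!-- MB, so 1024 means 1GB -->
</property>
<property>
  <name>hbase.bucketcache.combinedcache.enabled</name>
  <value>false</value>
</property>
```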



[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550737#comment-13550737
 ] 

Lars Hofhansl commented on HBASE-5416:
--

On my machine this test fails in 0.94 too.

> Improve performance of scans with some kind of filters.
> ---
>
> Key: HBASE-5416
> URL: https://issues.apache.org/jira/browse/HBASE-5416
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters, Performance, regionserver
>Affects Versions: 0.90.4
>Reporter: Max Lapan
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 
> 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
> 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
> Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
> Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
> HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
> HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
> HBASE-5416-v9.patch, 
> org.apache.hadoop.hbase.regionserver.TestHRegion-output.txt
>
>
> When a scan is performed, the whole row is loaded into the result list, and 
> only afterwards is the filter (if any) applied to decide whether the row is 
> needed.
> But when a scan covers several CFs and the filter checks data from only a 
> subset of them, the data from the unchecked CFs is not needed at the filter 
> stage; it is needed only once we have decided to include the current row. In 
> that case we can significantly reduce the amount of IO performed by a scan 
> by loading only the values the filter actually checks.
> For example, we have two CFs: flags and snap. Flags is quite small (a bunch 
> of megabytes) and is used to filter large entries from snap. Snap is very 
> large (tens of GB) and quite costly to scan. If we need only rows with some 
> flag set, we use SingleColumnValueFilter to limit the result to a small 
> subset of the region. But the current implementation loads both CFs to 
> perform the scan, even when only the small subset is needed.
> The attached patch adds one routine to the Filter interface that lets a 
> filter specify which CFs it needs for its operation. In HRegion, we 
> separate all scanners into two groups: those needed by the filter and the 
> rest (joined). When a new row is considered, only the needed data is loaded 
> and the filter applied; only if the filter accepts the row is the rest of 
> the data loaded. On our data this speeds up such scans 30-50 times. It also 
> gives us a way to better normalize the data into separate columns by 
> optimizing the scans performed.
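The two-pass scheme described above can be sketched in plain, self-contained Java; the interface and method names are illustrative, not the actual HBase Filter API:

```java
import java.util.*;

// Simplified model of the optimization: a filter declares which column
// families it actually inspects ("essential"); the scanner evaluates the
// filter against only those, and a row's remaining ("joined") families are
// touched only for rows the filter accepts.
class EssentialFamilyScan {
    interface RowFilter {
        boolean familyIsEssential(String family);
        boolean accept(Map<String, String> essentialCells);
    }

    static List<Map<String, String>> scan(
            List<Map<String, String>> rows, RowFilter filter) {
        List<Map<String, String>> results = new ArrayList<>();
        for (Map<String, String> row : rows) {
            // First pass: gather only the families the filter needs.
            Map<String, String> essential = new HashMap<>();
            for (Map.Entry<String, String> e : row.entrySet()) {
                if (filter.familyIsEssential(e.getKey())) {
                    essential.put(e.getKey(), e.getValue());
                }
            }
            // Second pass: keep the full row only on acceptance; in the real
            // patch this is where the large CFs would actually be read.
            if (filter.accept(essential)) {
                results.add(row);
            }
        }
        return results;
    }
}
```

In the flags/snap example, `familyIsEssential` would return true only for "flags", so the large "snap" family is never read for rows the filter rejects.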



[jira] [Updated] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE

2013-01-10 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-7404:


Release Note: BucketCache is another implementation of BlockCache which 
supports a big block cache for high performance and can greatly decrease CMS 
pauses and heap fragmentation caused by reads.

> Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
> --
>
> Key: HBASE-7404
> URL: https://issues.apache.org/jira/browse/HBASE-7404
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7404-trunk-v10.patch, 7404-trunk-v11.patch, 
> 7404-trunk-v12.patch, 7404-trunk-v13.patch, 7404-trunk-v13.txt, 
> 7404-trunk-v14.patch, BucketCache.pdf, hbase-7404-94v2.patch, 
> hbase-7404-trunkv2.patch, hbase-7404-trunkv9.patch, Introduction of Bucket 
> Cache.pdf
>
>
> First, thanks to @neil from Fusion-IO for sharing the source code.
> Usage:
> 1. Use bucket cache as the main memory cache, configured as follows:
> –"hbase.bucketcache.ioengine" "heap"
> –"hbase.bucketcache.size" 0.4 (size of the bucket cache; 0.4 is a fraction 
> of the max heap size)
> 2. Use bucket cache as a secondary cache, configured as follows:
> –"hbase.bucketcache.ioengine" "file:/disk1/hbase/cache.data" (the file path 
> where the block data is stored)
> –"hbase.bucketcache.size" 1024 (size of the bucket cache in MB, so 1024 
> means 1GB)
> –"hbase.bucketcache.combinedcache.enabled" false
> See more configurations in org.apache.hadoop.hbase.io.hfile.CacheConfig and 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache
> What is Bucket Cache?
> It can greatly decrease CMS pauses and heap fragmentation caused by GC, and 
> it supports a large cache space for high read performance by using 
> high-speed disks such as Fusion-io.
> 1. A block cache implementation, like LruBlockCache
> 2. It manages blocks' storage positions itself through the Bucket Allocator
> 3. Cached blocks can be stored in memory or in the file system
> 4. Bucket Cache can be used as the main block cache (see 
> CombinedBlockCache), combined with LruBlockCache, to decrease CMS pauses 
> and fragmentation caused by GC
> 5. BucketCache can also be used as a secondary cache (e.g. using Fusion-io 
> to store blocks) to enlarge the cache space
> How about SlabCache?
> We studied and tested SlabCache first, but the results were poor, because:
> 1. SlabCache uses SingleSizeCache, whose memory utilization is low because 
> block sizes vary, especially with DataBlockEncoding
> 2. SlabCache is used in DoubleBlockCache: a block is cached in both 
> SlabCache and LruBlockCache, and on a SlabCache hit the block is put into 
> LruBlockCache again, so CMS and heap fragmentation do not improve
> 3. Direct (off-heap) memory performance is not as good as heap and may 
> cause OOM, so we recommend using the "heap" engine
> See more in the attachment and in the patch



[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5416:
-

Fix Version/s: 0.94.5

> Improve performance of scans with some kind of filters.
> ---
>
> Key: HBASE-5416
> URL: https://issues.apache.org/jira/browse/HBASE-5416
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters, Performance, regionserver
>Affects Versions: 0.90.4
>Reporter: Max Lapan
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 
> 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
> 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
> Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
> Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
> HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
> HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
> HBASE-5416-v9.patch, 
> org.apache.hadoop.hbase.regionserver.TestHRegion-output.txt
>
>
> When a scan is performed, the whole row is loaded into the result list, and 
> only afterwards is the filter (if any) applied to decide whether the row is 
> needed.
> But when a scan covers several CFs and the filter checks data from only a 
> subset of them, the data from the unchecked CFs is not needed at the filter 
> stage; it is needed only once we have decided to include the current row. In 
> that case we can significantly reduce the amount of IO performed by a scan 
> by loading only the values the filter actually checks.
> For example, we have two CFs: flags and snap. Flags is quite small (a bunch 
> of megabytes) and is used to filter large entries from snap. Snap is very 
> large (tens of GB) and quite costly to scan. If we need only rows with some 
> flag set, we use SingleColumnValueFilter to limit the result to a small 
> subset of the region. But the current implementation loads both CFs to 
> perform the scan, even when only the small subset is needed.
> The attached patch adds one routine to the Filter interface that lets a 
> filter specify which CFs it needs for its operation. In HRegion, we 
> separate all scanners into two groups: those needed by the filter and the 
> rest (joined). When a new row is considered, only the needed data is loaded 
> and the filter applied; only if the filter accepts the row is the rest of 
> the data loaded. On our data this speeds up such scans 30-50 times. It also 
> gives us a way to better normalize the data into separate columns by 
> optimizing the scans performed.



[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550731#comment-13550731
 ] 

Lars Hofhansl commented on HBASE-5416:
--

Does this test fail consistently for you?

> Improve performance of scans with some kind of filters.
> ---
>
> Key: HBASE-5416
> URL: https://issues.apache.org/jira/browse/HBASE-5416
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters, Performance, regionserver
>Affects Versions: 0.90.4
>Reporter: Max Lapan
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 
> 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
> 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
> Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
> Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
> HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
> HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
> HBASE-5416-v9.patch, 
> org.apache.hadoop.hbase.regionserver.TestHRegion-output.txt
>
>
> When a scan is performed, the whole row is loaded into the result list, and 
> only afterwards is the filter (if any) applied to decide whether the row is 
> needed.
> But when a scan covers several CFs and the filter checks data from only a 
> subset of them, the data from the unchecked CFs is not needed at the filter 
> stage; it is needed only once we have decided to include the current row. In 
> that case we can significantly reduce the amount of IO performed by a scan 
> by loading only the values the filter actually checks.
> For example, we have two CFs: flags and snap. Flags is quite small (a bunch 
> of megabytes) and is used to filter large entries from snap. Snap is very 
> large (tens of GB) and quite costly to scan. If we need only rows with some 
> flag set, we use SingleColumnValueFilter to limit the result to a small 
> subset of the region. But the current implementation loads both CFs to 
> perform the scan, even when only the small subset is needed.
> The attached patch adds one routine to the Filter interface that lets a 
> filter specify which CFs it needs for its operation. In HRegion, we 
> separate all scanners into two groups: those needed by the filter and the 
> rest (joined). When a new row is considered, only the needed data is loaded 
> and the filter applied; only if the filter accepts the row is the rest of 
> the data loaded. On our data this speeds up such scans 30-50 times. It also 
> gives us a way to better normalize the data into separate columns by 
> optimizing the scans performed.



[jira] [Commented] (HBASE-7536) Add test that confirms that multiple concurrent snapshot requests are rejected.

2013-01-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550730#comment-13550730
 ] 

Ted Yu commented on HBASE-7536:
---

{code}
+   * Demonstrate that we rejected snapshot requests if there snapshot 
currently running.
{code}
'rejected' -> 'reject'
'if there snapshot' -> 'if there is snapshot'
{code}
+assertTrue("We expect at least 1 request to be rejected because of we 
concurrently" +
{code}
Can we put enough data in the table so that when ssNum is 2 or 3, the above 
assertion still holds?

> Add test that confirms that multiple concurrent snapshot requests are 
> rejected.
> ---
>
> Key: HBASE-7536
> URL: https://issues.apache.org/jira/browse/HBASE-7536
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Attachments: hbase-7536.patch, pre-hbase-7536.patch
>
>
> Currently the rule is that we can only have one online snapshot running at 
> a time. This test tries to verify this.



[jira] [Comment Edited] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550726#comment-13550726
 ] 

Ted Yu edited comment on HBASE-5416 at 1/11/13 2:39 AM:


Test output from 0.94.
Here is my environment:

Darwin TYus-MacBook-Pro.local 12.2.1 Darwin Kernel Version 12.2.1: Thu Oct 18 
12:13:47 PDT 2012; root:xnu-2050.20.9~1/RELEASE_X86_64 x86_64
TYus-MacBook-Pro:hbase-snapshot-0103 tyu$ java -version
java version "1.6.0_37"
Java(TM) SE Runtime Environment (build 1.6.0_37-b06-434-11M3909)
Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01-434, mixed mode)

  was (Author: yuzhih...@gmail.com):
Test output from 0.94.
  
> Improve performance of scans with some kind of filters.
> ---
>
> Key: HBASE-5416
> URL: https://issues.apache.org/jira/browse/HBASE-5416
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters, Performance, regionserver
>Affects Versions: 0.90.4
>Reporter: Max Lapan
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 
> 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
> 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
> Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
> Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
> HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
> HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
> HBASE-5416-v9.patch, 
> org.apache.hadoop.hbase.regionserver.TestHRegion-output.txt
>
>
> When a scan is performed, the whole row is loaded into the result list, and 
> only afterwards is the filter (if any) applied to decide whether the row is 
> needed.
> But when a scan covers several CFs and the filter checks data from only a 
> subset of them, the data from the unchecked CFs is not needed at the filter 
> stage; it is needed only once we have decided to include the current row. In 
> that case we can significantly reduce the amount of IO performed by a scan 
> by loading only the values the filter actually checks.
> For example, we have two CFs: flags and snap. Flags is quite small (a bunch 
> of megabytes) and is used to filter large entries from snap. Snap is very 
> large (tens of GB) and quite costly to scan. If we need only rows with some 
> flag set, we use SingleColumnValueFilter to limit the result to a small 
> subset of the region. But the current implementation loads both CFs to 
> perform the scan, even when only the small subset is needed.
> The attached patch adds one routine to the Filter interface that lets a 
> filter specify which CFs it needs for its operation. In HRegion, we 
> separate all scanners into two groups: those needed by the filter and the 
> rest (joined). When a new row is considered, only the needed data is loaded 
> and the filter applied; only if the filter accepts the row is the rest of 
> the data loaded. On our data this speeds up such scans 30-50 times. It also 
> gives us a way to better normalize the data into separate columns by 
> optimizing the scans performed.



[jira] [Commented] (HBASE-7501) Introduce MetaEditor method that both adds and deletes rows in .META. table

2013-01-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550728#comment-13550728
 ] 

Ted Yu commented on HBASE-7501:
---

+1 on latest patch.

> Introduce MetaEditor method that both adds and deletes rows in .META. table
> ---
>
> Key: HBASE-7501
> URL: https://issues.apache.org/jira/browse/HBASE-7501
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Matteo Bertozzi
> Attachments: HBASE-7501-v0.patch, HBASE-7501-v1.patch
>
>
> In review of HBASE-7365, MetaEditor.deleteRegions() and 
> MetaEditor.addRegionsToMeta() are used in 
> RestoreSnapshotHandler.handleTableOperation() to apply changes to .META.
> I made the following suggestion:
> Can we introduce a new method in MetaEditor which takes a List of Mutations?
> The Deletes and Puts would be grouped and then written to the .META. table 
> in one transaction.
> Jon responded:
> I like that idea -- then the todo/warning or follow-on could refer to that 
> method. When we fix it, it could get used in other multi-row meta 
> modifications like splits and table creation/deletion in general.
> See https://reviews.apache.org/r/8674/
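A toy model of the suggested grouping, in plain Java; the class and method names are illustrative, not actual MetaEditor signatures, and real .META. atomicity would come from the underlying multi-row mutation support, not from this sketch:

```java
import java.util.*;

// Models the proposed API: deletes and puts for .META. are collected into a
// single batch and applied as one unit, instead of as separate calls.
class MetaTableModel {
    private final Map<String, String> meta = new HashMap<>();

    // Each mutation is {"delete", row} or {"put", row, value}.
    void mutateMetaTable(List<String[]> mutations) {
        Map<String, String> staged = new HashMap<>(meta);
        for (String[] m : mutations) {
            if ("delete".equals(m[0])) {
                staged.remove(m[1]);
            } else {
                staged.put(m[1], m[2]); // "put": row -> value
            }
        }
        // Commit only after every mutation has been staged successfully.
        meta.clear();
        meta.putAll(staged);
    }

    String get(String row) {
        return meta.get(row);
    }
}
```

The point of the grouping is that a restore (or split) never leaves .META. with the delete applied but the put missing.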



[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-5416:
--

Attachment: org.apache.hadoop.hbase.regionserver.TestHRegion-output.txt

Test output from 0.94.

> Improve performance of scans with some kind of filters.
> ---
>
> Key: HBASE-5416
> URL: https://issues.apache.org/jira/browse/HBASE-5416
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters, Performance, regionserver
>Affects Versions: 0.90.4
>Reporter: Max Lapan
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 
> 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
> 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
> Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
> Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
> HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
> HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
> HBASE-5416-v9.patch, 
> org.apache.hadoop.hbase.regionserver.TestHRegion-output.txt
>
>
> When a scan is performed, the whole row is loaded into the result list, and 
> only afterwards is the filter (if any) applied to decide whether the row is 
> needed.
> But when a scan covers several CFs and the filter checks data from only a 
> subset of them, the data from the unchecked CFs is not needed at the filter 
> stage; it is needed only once we have decided to include the current row. In 
> that case we can significantly reduce the amount of IO performed by a scan 
> by loading only the values the filter actually checks.
> For example, we have two CFs: flags and snap. Flags is quite small (a bunch 
> of megabytes) and is used to filter large entries from snap. Snap is very 
> large (tens of GB) and quite costly to scan. If we need only rows with some 
> flag set, we use SingleColumnValueFilter to limit the result to a small 
> subset of the region. But the current implementation loads both CFs to 
> perform the scan, even when only the small subset is needed.
> The attached patch adds one routine to the Filter interface that lets a 
> filter specify which CFs it needs for its operation. In HRegion, we 
> separate all scanners into two groups: those needed by the filter and the 
> rest (joined). When a new row is considered, only the needed data is loaded 
> and the filter applied; only if the filter accepts the row is the rest of 
> the data loaded. On our data this speeds up such scans 30-50 times. It also 
> gives us a way to better normalize the data into separate columns by 
> optimizing the scans performed.



[jira] [Updated] (HBASE-7501) Introduce MetaEditor method that both adds and deletes rows in .META. table

2013-01-10 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-7501:
---

Attachment: HBASE-7501-v1.patch

> Introduce MetaEditor method that both adds and deletes rows in .META. table
> ---
>
> Key: HBASE-7501
> URL: https://issues.apache.org/jira/browse/HBASE-7501
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Matteo Bertozzi
> Attachments: HBASE-7501-v0.patch, HBASE-7501-v1.patch
>
>
> In review of HBASE-7365, MetaEditor.deleteRegions() and 
> MetaEditor.addRegionsToMeta() are used in 
> RestoreSnapshotHandler.handleTableOperation() to apply changes to .META.
> I made the following suggestion:
> Can we introduce a new method in MetaEditor which takes a List of Mutations?
> The Deletes and Puts would be grouped and then written to the .META. table 
> in one transaction.
> Jon responded:
> I like that idea -- then the todo/warning or follow-on could refer to that 
> method. When we fix it, it could get used in other multi-row meta 
> modifications like splits and table creation/deletion in general.
> See https://reviews.apache.org/r/8674/



[jira] [Updated] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-5416:
--

Status: Open  (was: Patch Available)

> Improve performance of scans with some kind of filters.
> ---
>
> Key: HBASE-5416
> URL: https://issues.apache.org/jira/browse/HBASE-5416
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters, Performance, regionserver
>Affects Versions: 0.90.4
>Reporter: Max Lapan
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 
> 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
> 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
> Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
> Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
> HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
> HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
> HBASE-5416-v9.patch
>
>
> When a scan is performed, the whole row is loaded into the result list, and 
> only afterwards is the filter (if any) applied to decide whether the row is 
> needed.
> But when a scan covers several CFs and the filter checks data from only a 
> subset of them, the data from the unchecked CFs is not needed at the filter 
> stage; it is needed only once we have decided to include the current row. In 
> that case we can significantly reduce the amount of IO performed by a scan 
> by loading only the values the filter actually checks.
> For example, we have two CFs: flags and snap. Flags is quite small (a bunch 
> of megabytes) and is used to filter large entries from snap. Snap is very 
> large (tens of GB) and quite costly to scan. If we need only rows with some 
> flag set, we use SingleColumnValueFilter to limit the result to a small 
> subset of the region. But the current implementation loads both CFs to 
> perform the scan, even when only the small subset is needed.
> The attached patch adds one routine to the Filter interface that lets a 
> filter specify which CFs it needs for its operation. In HRegion, we 
> separate all scanners into two groups: those needed by the filter and the 
> rest (joined). When a new row is considered, only the needed data is loaded 
> and the filter applied; only if the filter accepts the row is the rest of 
> the data loaded. On our data this speeds up such scans 30-50 times. It also 
> gives us a way to better normalize the data into separate columns by 
> optimizing the scans performed.


