[jira] [Commented] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner

2011-08-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084603#comment-13084603
 ] 

Ted Yu commented on HBASE-4197:
---

Please specify hbase for Groups on review board so that more people can see the 
review request.
Good job Lars.

 RegionServer expects all scanner to be subclasses of HRegion.RegionScanner
 --

 Key: HBASE-4197
 URL: https://issues.apache.org/jira/browse/HBASE-4197
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Lars Hofhansl
 Attachments: 4197-bigger.txt, 4197-v2.txt, 4197.txt, ScannerTest.java


 Returning just an InternalScanner from RegionObserver.{pre|post}OpenScanner 
 leads to the following exception when using the scanner:
 java.io.IOException: InternalScanner implementation is expected to be HRegion.RegionScanner.
   at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2023)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:616)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:314)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1225)
 The problem is in HRegionServer.next(...):
 {code} 
 InternalScanner s = this.scanners.get(scannerName);
 ...
 // Call coprocessor. Get region info from scanner.
 HRegion region = null;
 if (s instanceof HRegion.RegionScanner) {
   HRegion.RegionScanner rs = (HRegion.RegionScanner) s;
   region = getRegion(rs.getRegionName().getRegionName());
 } else {
   throw new IOException("InternalScanner implementation is expected " +
       "to be HRegion.RegionScanner.");
 }
 {code} 
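
One possible way to avoid the instanceof check, sketched below with simplified stand-in types rather than the real HBase classes, is to have the server depend only on a small interface that exposes the region identity. The names RegionAwareScanner, WrappingScanner, ListScanner and regionOf are hypothetical and exist only for this illustration; the actual patch may look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ScannerInterfaceSketch {
  /** Stand-in for InternalScanner: each call adds the next row to results. */
  interface InternalScanner {
    boolean next(List<String> results);
  }

  /** Hypothetical interface: a scanner that can also name its region. */
  interface RegionAwareScanner extends InternalScanner {
    String getRegionName();
  }

  /** Simple scanner over a fixed list of rows (illustration only). */
  static final class ListScanner implements RegionAwareScanner {
    private final String regionName;
    private final List<String> rows;
    private int pos = 0;

    ListScanner(String regionName, List<String> rows) {
      this.regionName = regionName;
      this.rows = rows;
    }

    @Override
    public boolean next(List<String> results) {
      if (pos < rows.size()) {
        results.add(rows.get(pos++));
      }
      return pos < rows.size(); // true while more rows remain
    }

    @Override
    public String getRegionName() {
      return regionName;
    }
  }

  /** Coprocessor-style wrapper that is not a subclass of any concrete scanner. */
  static final class WrappingScanner implements RegionAwareScanner {
    private final RegionAwareScanner delegate;

    WrappingScanner(RegionAwareScanner delegate) {
      this.delegate = delegate;
    }

    @Override
    public boolean next(List<String> results) {
      return delegate.next(results);
    }

    @Override
    public String getRegionName() {
      return delegate.getRegionName();
    }
  }

  /** The server side needs only the interface, so no instanceof or cast is required. */
  static String regionOf(RegionAwareScanner scanner) {
    return scanner.getRegionName();
  }

  public static void main(String[] args) {
    RegionAwareScanner s =
        new WrappingScanner(new ListScanner("region-1", Arrays.asList("a", "b")));
    List<String> out = new ArrayList<>();
    s.next(out);
    System.out.println(regionOf(s) + " " + out);
  }
}
```

With this shape, a wrapped scanner is accepted because it implements the interface, not because of its concrete class.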

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4194) RegionSplitter: Split on under-loaded region servers first

2011-08-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4194:
--

Description: 
When running RegionSplitter, our app devs noticed that they were getting a lot 
of NSREs.  This is caused by 2 factors: 

1. the split itself will cause an NSRE 
2. any load balancing will cause one.  

The former cannot be helped.  We can more tightly control load balancing 
though.  Instead of doing a name-sorted round-robin split across RS in the 
tier, we could sort the RS's by region count.  That way, we only split an RS 
with 10 regions after there are no more RS with 9 regions.  This will prevent 
the load balancing slop from kicking in and will fix the problem where 
restarting RegionSplitter always starts splitting at RS #1.


  was:
When running RegionSplitter, our app devs noticed that they were getting a lot 
of NSREs.  This is caused by 2 factors: 

1. the split itself will cause an NSRE 
2. any load balancing will cause one.  

The former cannot be helped.  We can more tightly control load balancing 
though.  Instead of doing a name-sorted round-robin split across RS in the 
tier, we could sorted the RS by region count.  That way, we only split an RS 
with 10 regions after there are no more RS with 9 regions.  This will prevent 
the load balancing slop from kicking in and will fix the problem where 
restarting RegionSplitter always starts splitting at RS #1.


Summary: RegionSplitter: Split on under-loaded region servers first  
(was: RegionSplitter: Split on low-loaded regions first)

 RegionSplitter: Split on under-loaded region servers first
 --

 Key: HBASE-4194
 URL: https://issues.apache.org/jira/browse/HBASE-4194
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Trivial
 Fix For: 0.92.0

 Attachments: HBASE-4194.patch


 When running RegionSplitter, our app devs noticed that they were getting a 
 lot of NSREs.  This is caused by 2 factors: 
 1. the split itself will cause an NSRE 
 2. any load balancing will cause one.  
 The former cannot be helped.  We can more tightly control load balancing 
 though.  Instead of doing a name-sorted round-robin split across RS in the 
 tier, we could sort the RS's by region count.  That way, we only split an RS 
 with 10 regions after there are no more RS with 9 regions.  This will prevent 
 the load balancing slop from kicking in and will fix the problem where 
 restarting RegionSplitter always starts splitting at RS #1.
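
The ordering described above can be sketched as follows. This is a minimal illustration that assumes a plain map from server name to region count; SplitOrderSketch and splitOrder are hypothetical names and are not taken from RegionSplitter or from the attached patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SplitOrderSketch {
  /** Orders region servers by ascending region count, breaking ties by name. */
  static List<String> splitOrder(Map<String, Integer> regionCounts) {
    List<String> servers = new ArrayList<>(regionCounts.keySet());
    servers.sort((a, b) -> {
      int byCount = Integer.compare(regionCounts.get(a), regionCounts.get(b));
      return byCount != 0 ? byCount : a.compareTo(b);
    });
    return servers;
  }

  public static void main(String[] args) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    counts.put("rs1", 10);
    counts.put("rs2", 9);
    counts.put("rs3", 10);
    counts.put("rs4", 9);
    // Servers holding 9 regions are listed before any server holding 10.
    System.out.println(splitOrder(counts));
  }
}
```

Because the order is derived from the current counts and not from server names, recomputing it after a restart does not automatically begin at the first server by name.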





[jira] [Commented] (HBASE-4194) RegionSplitter: Split on under-loaded region servers first

2011-08-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084607#comment-13084607
 ] 

Ted Yu commented on HBASE-4194:
---

Integrated to TRUNK.

Thanks for the patch, Nicolas.





[jira] [Assigned] (HBASE-4170) createTable java doc needs to be improved

2011-08-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4170:
-

Assignee: Mubarak Seyed

 createTable java doc needs to be improved
 -

 Key: HBASE-4170
 URL: https://issues.apache.org/jira/browse/HBASE-4170
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.90.1, 0.90.2, 0.90.3, 0.90.4
 Environment: HBase-0.90.1
Reporter: Mubarak Seyed
Assignee: Mubarak Seyed
 Fix For: 0.90.5

 Attachments: create_table_javadoc_HBASE_4170.patch


 The HBaseAdmin.createTable() javadoc says:
 public void createTable(HTableDescriptor desc, byte[][] splitKeys)
     throws IOException
 Creates a new table with an initial set of empty regions defined by the 
 specified split keys. The total number of regions created will be the number 
 of split keys plus one (the first region has a null start key and the last 
 region has a null end key). Synchronous operation.
 If we specify null values for the first region's start key and the last 
 region's end key, we get a NullPointerException, because Arrays.sort compares 
 each element.
 I guess the documentation should not talk about null values and should instead 
 explain that splitKeys has length n-1, where n is the number of regions.
 splitKeys would look like:
 splitKeys[0] = key value 1
 ..
 splitKeys[n-2] = key value n-1
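
As a small illustration of the n-1 rule, the sketch below builds split keys for a given number of regions and rejects null entries before they reach a sort. It does not call HBase; SplitKeysSketch, splitKeysFor, regionCount and requireNoNulls are hypothetical helpers, and the "keyNNN" format is an arbitrary assumption.

```java
import java.nio.charset.StandardCharsets;

public class SplitKeysSketch {
  /** Builds n-1 non-null split keys for n regions ("key001", "key002", ...). */
  static byte[][] splitKeysFor(int regions) {
    if (regions < 1) {
      throw new IllegalArgumentException("need at least one region");
    }
    byte[][] keys = new byte[regions - 1][];
    for (int i = 0; i < keys.length; i++) {
      keys[i] = String.format("key%03d", i + 1).getBytes(StandardCharsets.UTF_8);
    }
    return keys;
  }

  /** Number of regions implied by a split-key array: one more than its length. */
  static int regionCount(byte[][] splitKeys) {
    return splitKeys.length + 1;
  }

  /** Fails fast on null entries instead of letting a later sort trip over them. */
  static void requireNoNulls(byte[][] splitKeys) {
    for (int i = 0; i < splitKeys.length; i++) {
      if (splitKeys[i] == null) {
        throw new IllegalArgumentException("splitKeys[" + i + "] is null");
      }
    }
  }

  public static void main(String[] args) {
    byte[][] keys = splitKeysFor(4);
    requireNoNulls(keys);
    System.out.println(keys.length + " split keys give " + regionCount(keys) + " regions");
  }
}
```

The open-ended start of the first region and end of the last region are implied by the array boundaries, so the caller never passes them explicitly.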





[jira] [Assigned] (HBASE-2399) Forced splits only act on the first family in a table

2011-08-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-2399:
-

Assignee: Ming Ma  (was: Jonathan Gray)

 Forced splits only act on the first family in a table
 -

 Key: HBASE-2399
 URL: https://issues.apache.org/jira/browse/HBASE-2399
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.20.3
Reporter: Jonathan Gray
Assignee: Ming Ma
Priority: Critical
  Labels: moved_from_0_20_5
 Fix For: 0.92.0

 Attachments: HBASE-2399-test-v1.patch, HBASE-2399-trunk.patch


 While working on a patch for HBASE-2375, I came across a few bugs in the 
 existing code related to splits.
 If a user triggers a manual split, it flips a forceSplit boolean to true and 
 then triggers a compaction (this is very similar to my current implementation 
 for HBASE-2375).  However, the forceSplit boolean is flipped back to false at 
 the beginning of Store.compact().  So the force split only acts on the first 
 family in the table.  If that Store is not splittable for some reason (it is 
 empty or has only one row), then the entire region will not be split, 
 regardless of what is in other families.
 Even if there is data in the first family, the midKey is determined based 
 solely on that family.  If it has two rows and the next family has 1M rows, 
 we pick the split key based on the two rows.
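
To make the second problem concrete, here is a sketch of one alternative policy: take the split key from the largest splittable store and ignore store order. This is only an assumption about how a fix could behave, not what the attached patches do; Store, splittable, midKey and splitKey are simplified, hypothetical stand-ins for the real classes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class ForcedSplitSketch {
  /** Simplified stand-in for a column-family store: a sorted list of row keys. */
  static final class Store {
    final String family;
    final List<String> rows;

    Store(String family, List<String> rows) {
      this.family = family;
      this.rows = rows;
    }

    boolean splittable() {
      return rows.size() > 1; // empty or single-row stores cannot be split
    }

    String midKey() {
      return rows.get(rows.size() / 2);
    }
  }

  /** Picks the mid key of the largest splittable store, whatever its position. */
  static Optional<String> splitKey(List<Store> stores) {
    Store best = null;
    for (Store s : stores) {
      if (s.splittable() && (best == null || s.rows.size() > best.rows.size())) {
        best = s;
      }
    }
    return best == null ? Optional.empty() : Optional.of(best.midKey());
  }

  public static void main(String[] args) {
    List<Store> stores = Arrays.asList(
        new Store("f1", Collections.<String>emptyList()),
        new Store("f2", Arrays.asList("a", "c", "e", "g", "i")));
    // The empty first family no longer prevents the region from getting a split key.
    System.out.println(splitKey(stores));
  }
}
```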





[jira] [Updated] (HBASE-2399) Forced splits only act on the first family in a table

2011-08-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-2399:
--

Status: Patch Available  (was: Open)





[jira] [Commented] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager

2011-08-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084608#comment-13084608
 ] 

Ted Yu commented on HBASE-4153:
---

Currently regionsInTransitionInRS is a ConcurrentSkipListSet. It doesn't record 
whether the region transition was initiated by openRegion() or closeRegion().
I think we can use ConcurrentSkipListMap for regionsInTransitionInRS:
{code}
  private final Map<byte[], Boolean> regionsInTransitionInRS =
      new ConcurrentSkipListMap<byte[], Boolean>(Bytes.BYTES_COMPARATOR);
{code}

On a side note, I wonder why the check for regionsInTransitionInRS.contains() 
below is followed by LOG.warn() instead of by throwing an exception:
{code}
  protected boolean closeRegion(HRegionInfo region, final boolean abort,
      final boolean zk) {
    if (this.regionsInTransitionInRS.contains(region.getEncodedNameAsBytes())) {
      LOG.warn("Received close for region we are already opening or closing; " +
          region.getEncodedName());
      return false;
    }
{code}
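
A self-contained sketch of the map-based idea follows. It assumes the Boolean value means "true for open, false for close", which is my reading and not something stated above, and it uses Arrays.compareUnsigned (Java 9+) as a stand-in for Bytes.BYTES_COMPARATOR. RegionsInTransitionSketch, begin and finish are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class RegionsInTransitionSketch {
  // Assumed convention: true = transition started by an open, false = by a close.
  private final ConcurrentMap<byte[], Boolean> regionsInTransition =
      new ConcurrentSkipListMap<byte[], Boolean>((a, b) -> Arrays.compareUnsigned(a, b));

  /**
   * Records the start of a transition. Returns null if it was recorded, or the
   * kind of transition already in progress for that region.
   */
  Boolean begin(byte[] encodedName, boolean opening) {
    return regionsInTransition.putIfAbsent(encodedName, opening);
  }

  /** Clears the entry once the open or close has completed. */
  void finish(byte[] encodedName) {
    regionsInTransition.remove(encodedName);
  }

  public static void main(String[] args) {
    RegionsInTransitionSketch rit = new RegionsInTransitionSketch();
    byte[] name = "abc123".getBytes(StandardCharsets.UTF_8);
    rit.begin(name, true);
    Boolean existing = rit.begin(name, false);
    // The caller can now tell that the conflicting transition is an open.
    System.out.println("already in transition, opening=" + existing);
  }
}
```

Because putIfAbsent is atomic, the check and the insert happen in one step, and the returned value tells the caller which kind of transition it collided with.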

 Handle RegionAlreadyInTransitionException in AssignmentManager
 --

 Key: HBASE-4153
 URL: https://issues.apache.org/jira/browse/HBASE-4153
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
 Fix For: 0.92.0


 Comment from Stack over in HBASE-3741:
 {quote}
 Question: Looking at this patch again, if we throw a 
 RegionAlreadyInTransitionException, won't we just assign the region elsewhere 
 though RegionAlreadyInTransitionException in at least one case here is saying 
 that the region is already open on this regionserver?
 {quote}
 Indeed looking at the code it's going to be handled the same way other 
 exceptions are. Need to add special cases for assign and unassign.





[jira] [Commented] (HBASE-1744) Thrift server to match the new java api.

2011-08-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084614#comment-13084614
 ] 

Ted Yu commented on HBASE-1744:
---

I made ThriftServer package private so that the test compiles. But bin/hbase 
uses org.apache.hadoop.hbase.thrift2.ThriftServer, I guess it should be public.
I don't see isMasterRunning() in ThriftHBaseServiceHandler so I commented out 
the call in TestThriftServer

The new test runs much faster than the existing one:
{code}
Running org.apache.hadoop.hbase.thrift2.TestThriftServer
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.213 sec
Running org.apache.hadoop.hbase.thrift.TestThriftServer
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 36.776 sec
{code}
Maybe I am missing something.

 Thrift server to match the new java api.
 

 Key: HBASE-1744
 URL: https://issues.apache.org/jira/browse/HBASE-1744
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Tim Sell
Assignee: Lars Francke
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-1744.2.patch, HBASE-1744.3.patch, 
 HBASE-1744.preview.1.patch, thriftexperiment.patch


 This mutateRows, etc. is a little confusing compared to the new cleaner java 
 client.
 Thinking of ways to make a thrift client that is just as elegant. Something 
 like:
 void put(1:Bytes table, 2:TPut put) throws (1:IOError io)
 with:
 struct TColumn {
   1:Bytes family,
   2:Bytes qualifier,
   3:i64 timestamp
 }
 struct TPut {
   1:Bytes row,
   2:map<TColumn, Bytes> values
 }
 This creates more verbose rpc than if the columns in TPut were just 
 map<Bytes, map<Bytes, Bytes>>, but that is harder to fit timestamps into and 
 still be intuitive from, say, python.
 Presumably the goal of a thrift gateway is to be easy first.





[jira] [Commented] (HBASE-1744) Thrift server to match the new java api.

2011-08-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084616#comment-13084616
 ] 

Ted Yu commented on HBASE-1744:
---

TestThriftServer.testAll() is commented out.
Same with the tests it is supposed to run.
Please finish TestThriftServer.





[jira] [Commented] (HBASE-4192) Optimize HLog for Throughput Using Delayed RPCs

2011-08-13 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084618#comment-13084618
 ] 

jirapos...@reviews.apache.org commented on HBASE-4192:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1463/#review1440
---



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
https://reviews.apache.org/r/1463/#comment3334

Some kind of counter for this case would be useful, I think.


- Ted


On 2011-08-12 23:44:23, Vlad Dogaru wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1463/
bq.  ---
bq.  
bq.  (Updated 2011-08-12 23:44:23)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Changes:
bq.  
bq.  1.  Add hbase.region.wal.batchentries configuration parameter.  If this is
bq.  enabled, batch entries to the HLog in a queue.
bq.  2.  Use delayed RPCs for sync requests when aggressive batching is enabled.
bq.  This frees up RPC handler threads for the duration of the sync.
bq.  3.  Pass the RPC server instance all the way down to HLog.  This is needed
bq.  to find out the current remote call, mark it as delayed, and finally complete
bq.  it when the sync is done.
bq.  4.  Use the region read-write consistency control to avoid exposing to
bq.  RegionScanners edits which have not yet been synced.
bq.  5.  Change a few tests which directly create HRegions or HLogs.  The
bq.  rpcServers passed in are null; HLog falls back to classic RPCs when it has no
bq.  knowledge of the RPC server.
bq.  6.  Add TestBatchEntriesLogRolling, which is identical to TestLogRolling,
bq.  except that it uses aggressive batching.  I'm not sure how to add tests that
bq.  verify the same functionality but don't duplicate code; suggestions are
bq.  welcome.
bq.  
bq.  The new parameter is disabled by default.
bq.  
bq.  
bq.  This addresses bug HBASE-4192.
bq.  https://issues.apache.org/jira/browse/HBASE-4192
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 7117bce 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java a00b93d 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 83ff7b2 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 7a917da 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 8ec53d3 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 887f736 
bq.    src/main/resources/hbase-default.xml 66548ca 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALActionsListener.java dc43eb2 
bq.    src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java 381ac90 
bq.  
bq.  Diff: https://reviews.apache.org/r/1463/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  All unit tests run with aggressive batching turned on and off.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Vlad
bq.  
bq.



 Optimize HLog for Throughput Using Delayed RPCs
 ---

 Key: HBASE-4192
 URL: https://issues.apache.org/jira/browse/HBASE-4192
 Project: HBase
  Issue Type: New Feature
  Components: wal
Affects Versions: 0.92.0
Reporter: Vlad Dogaru
Priority: Minor

 Introduce a new HLog configuration parameter (batchEntries) for more 
 aggressive batching of appends.  If this is enabled, HLog appends are not 
 written to the HLog writer immediately, but batched and written either 
 periodically or when a sync is requested.  Because sync times become larger, 
 they use delayed RPCs to free up RPC handler threads.
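
The batching half of this description can be sketched as below. It is a deliberately reduced illustration: entries queue up on append and are written together on sync. The periodic flush and the delayed-RPC handling are left out, and BatchedLogSketch, append, sync and written are hypothetical names, not HLog's API.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedLogSketch {
  private final List<String> pending = new ArrayList<>();
  // Stands in for the underlying log writer.
  private final List<String> written = new ArrayList<>();

  /** Queues an entry without writing it. */
  synchronized void append(String entry) {
    pending.add(entry);
  }

  /** Writes all queued entries as one batch and returns how many were written. */
  synchronized int sync() {
    int count = pending.size();
    written.addAll(pending);
    pending.clear();
    return count;
  }

  /** Snapshot of what has reached the writer so far. */
  synchronized List<String> written() {
    return new ArrayList<>(written);
  }

  public static void main(String[] args) {
    BatchedLogSketch log = new BatchedLogSketch();
    log.append("edit-1");
    log.append("edit-2");
    System.out.println("written before sync: " + log.written().size());
    System.out.println("flushed by sync: " + log.sync());
  }
}
```

Nothing is visible in the writer until sync runs, which is why a single sync covers more work in this mode and takes longer.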





[jira] [Commented] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner

2011-08-13 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084633#comment-13084633
 ] 

jirapos...@reviews.apache.org commented on HBASE-4197:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1496/
---

(Updated 2011-08-13 15:56:28.746919)


Review request for hbase, Ted Yu and Mingjie Lai.


Summary
---

1. Don't require custom scanners created by coprocessors to be subclasses of 
HRegion.RegionScanner (see HBASE-4197).
2. Simplify the interfaces for Scanners in HRegion, HRegionServer, and 
RegionObserver. This avoids a bunch of instanceof checks and casts to 
HRegion.RegionScanner.

(Sorry HBase-git would not accept my patch)


This addresses bug HBASE-4197.
https://issues.apache.org/jira/browse/HBASE-4197


Diffs
-

  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java
 1157311 

Diff: https://reviews.apache.org/r/1496/diff


Testing
---

Manual test attached to the bug.


Thanks,

Lars







[jira] [Commented] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner

2011-08-13 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084700#comment-13084700
 ] 

jirapos...@reviews.apache.org commented on HBASE-4197:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1496/#review1446
---



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
https://reviews.apache.org/r/1496/#comment3351

Tab should be 2 spaces.


- Ted



[jira] [Commented] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner

2011-08-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084705#comment-13084705
 ] 

Ted Yu commented on HBASE-4197:
---

See http://hbase.apache.org/book.html#eclipse which would tell you where to 
find formatter for Eclipse.





[jira] [Created] (HBASE-4199) blockCache summary - backend

2011-08-13 Thread Doug Meil (JIRA)
blockCache summary - backend


 Key: HBASE-4199
 URL: https://issues.apache.org/jira/browse/HBASE-4199
 Project: HBase
  Issue Type: Sub-task
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor


This is the backend work for the blockCache summary.  Change to BlockCache 
interface, Summarization in LruBlockCache, BlockCacheSummaryEntry, addition to 
HRegionInterface, and HRegionServer.

This will NOT include any of the web UI or anything else like that.  That is 
for another sub-task.





[jira] [Updated] (HBASE-4199) blockCache summary - backend

2011-08-13 Thread Doug Meil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4199:
-

Attachment: java_HBASE_4199.patch





[jira] [Updated] (HBASE-4199) blockCache summary - backend

2011-08-13 Thread Doug Meil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4199:
-

Status: Patch Available  (was: Open)





[jira] [Commented] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner

2011-08-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084724#comment-13084724
 ] 

Lars Hofhansl commented on HBASE-4197:
--

Thanks Ted... Installed the formatter. I will wait for some more feedback and 
then upload a new version.





[jira] [Commented] (HBASE-4150) Potentially too many connections may be opened if ThreadLocalPool or RoundRobinPool is used

2011-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084734#comment-13084734
 ] 

Hudson commented on HBASE-4150:
---

Integrated in HBase-TRUNK #2113 (See 
[https://builds.apache.org/job/HBase-TRUNK/2113/])
HBASE-4150 update to javadoc (Karthick Sankarachary)

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/PoolMap.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java


 Potentially too many connections may be opened if ThreadLocalPool or 
 RoundRobinPool is used
 ---

 Key: HBASE-4150
 URL: https://issues.apache.org/jira/browse/HBASE-4150
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Karthick Sankarachary
 Fix For: 0.92.0

 Attachments: 4150-1.txt, 4150.txt, 5140-2.txt, HBASE-4150-DOC.patch, 
 HBASE-4150_final.patch


 See 'Problem with hbase.client.ipc.pool.type=threadlocal in trunk' discussion 
 started by Lars George.
 From Lars Hofhansl:
 Looking at HBaseClient.getConnection(...) I see this:
 {code}
   synchronized (connections) {
     connection = connections.get(remoteId);
     if (connection == null) {
       connection = new Connection(remoteId);
       connections.put(remoteId, connection);
     }
   }
 {code}
 At the same time PoolMap.ThreadLocalPool.put is defined like this:
 {code}
     public R put(R resource) {
       R previousResource = get();
       if (previousResource == null) {
 ...
         if (poolSize.intValue() >= maxSize) {
           return null;
         }
 ...
     }
 {code}
 So... if the ThreadLocalPool reaches its capacity, put() always returns null, 
 and hence all new threads will create a new connection every time 
 getConnection is called!
 I have also verified this with a test program, which works fine as long as the 
 number of client threads (which includes the threads in HTable's threadpool, 
 of course) is < poolsize. Once that is no longer the case, the number of 
 connections explodes and the program dies with OOMEs (mostly because each 
 Connection is associated with yet another thread).
 It's not clear what should happen, though. Maybe (1) the ThreadLocalPool 
 should not have a limit, or maybe (2) allocations past the pool size should 
 throw an exception (i.e. there's a hard limit), or maybe (3) in that case a 
 single connection is returned for all threads while the pool is over its 
 limit, or (4) we start round robin with the other connections in the other 
 thread locals.
 #1 means that the number of client threads needs to be more carefully 
 managed by the client app. In this case it would also be somewhat pointless 
 for Connections to have their own threads; we would just pass stuff between 
 threads.
 #2 would work, but puts more logic in the client.
 #3 would lead to hard-to-debug performance issues.
 And #4 is messy :)
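
The failure mode described above can be reproduced with a simplified model. This is only a sketch: CappedThreadLocalPool and PoolDemo are hypothetical names, the pool is a stripped-down imitation of the put() logic quoted above rather than the real PoolMap code, and a plain Object stands in for a Connection.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified, hypothetical model of a capped thread-local pool.
final class CappedThreadLocalPool<R> {
    private final ThreadLocal<R> slot = new ThreadLocal<>();
    private final AtomicInteger poolSize = new AtomicInteger(0);
    private final int maxSize;

    CappedThreadLocalPool(int maxSize) {
        this.maxSize = maxSize;
    }

    R get() {
        return slot.get();
    }

    // Once the cap is reached, the resource is silently NOT stored
    // for the calling thread and null is returned.
    R put(R resource) {
        R previous = get();
        if (previous == null) {
            if (poolSize.intValue() >= maxSize) {
                return null;
            }
            poolSize.incrementAndGet();
        }
        slot.set(resource);
        return previous;
    }
}

final class PoolDemo {
    // Counts how many "connections" get created when 'threads' client threads
    // each run the get-or-create logic 'callsPerThread' times.
    static int connectionsCreated(int maxSize, int threads, int callsPerThread) {
        CappedThreadLocalPool<Object> pool = new CappedThreadLocalPool<>(maxSize);
        AtomicInteger created = new AtomicInteger(0);
        Runnable client = () -> {
            for (int i = 0; i < callsPerThread; i++) {
                synchronized (pool) {
                    if (pool.get() == null) {
                        created.incrementAndGet(); // stands in for new Connection(remoteId)
                        pool.put(new Object());
                    }
                }
            }
        };
        for (int t = 0; t < threads; t++) {
            Thread thread = new Thread(client);
            thread.start();
            try {
                thread.join(); // run clients one after another for a deterministic count
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("within cap: " + connectionsCreated(2, 2, 3));
        System.out.println("over cap:   " + connectionsCreated(2, 3, 3));
    }
}
```

In this model, threads that arrive after the cap is reached never find a stored connection, so every call creates a new one.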
 From Ted Yu:
 For HBaseClient, at least the javadoc doesn't match:
 {code}
    * @param config configuration
    * @return either a {@link PoolType#Reusable} or {@link PoolType#ThreadLocal}
    */
   private static PoolType getPoolType(Configuration config) {
     return PoolType.valueOf(config.get(HConstants.HBASE_CLIENT_IPC_POOL_TYPE),
         PoolType.RoundRobin, PoolType.ThreadLocal);
 {code}
 I think for RoundRobinPool, we shouldn't allow maxSize to be 
 Integer#MAX_VALUE. Otherwise the connection explosion described by Lars may occur.





[jira] [Commented] (HBASE-4170) createTable java doc needs to be improved

2011-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084733#comment-13084733
 ] 

Hudson commented on HBASE-4170:
---

Integrated in HBase-TRUNK #2113 (See 
[https://builds.apache.org/job/HBase-TRUNK/2113/])
HBASE-4170 createTable java doc needs to be improved

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
* /hbase/trunk/CHANGES.txt


 createTable java doc needs to be improved
 -

 Key: HBASE-4170
 URL: https://issues.apache.org/jira/browse/HBASE-4170
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.90.1, 0.90.2, 0.90.3, 0.90.4
 Environment: HBase-0.90.1
Reporter: Mubarak Seyed
Assignee: Mubarak Seyed
 Fix For: 0.90.5

 Attachments: create_table_javadoc_HBASE_4170.patch


 HBaseAdmin.createTable() java doc says
 public void createTable(HTableDescriptor desc,
 byte[][] splitKeys)
  throws IOException
 Creates a new table with an initial set of empty regions defined by the 
 specified split keys. The total number of regions created will be the number 
 of split keys plus one (the first region has a null start key and the last 
 region has a null end key). Synchronous operation.
 If we specify null values for the first region's start key and the last 
 region's end key, we get a NullPointerException because Arrays.sort compares 
 each element.
 I guess the documentation should not talk about null values and should explain 
 that the length of splitKeys[][] is n-1, where n is the number of regions.
 splitKeys[][] would look like
 splitKeys[0] = key value 1
 ..
 splitKeys[n-2] = key value n-1
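
A small self-contained sketch of both points: n-1 split keys give n regions, and sorting an array that contains null keys fails. It does not call HBase; SplitKeysSketch is a hypothetical name and the LEXICOGRAPHIC comparator is an assumed stand-in for whatever byte[] comparator createTable uses when it sorts the keys.

```java
import java.util.Arrays;
import java.util.Comparator;

final class SplitKeysSketch {
    // Lexicographic unsigned byte comparison; an assumed stand-in for the
    // comparator used when the split keys are sorted.
    static final Comparator<byte[]> LEXICOGRAPHIC = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    };

    // n - 1 split keys define n regions.
    static int regionCount(byte[][] splitKeys) {
        return splitKeys.length + 1;
    }

    // Returns true if sorting the keys throws a NullPointerException.
    static boolean sortThrowsNpe(byte[][] splitKeys) {
        try {
            Arrays.sort(splitKeys, LEXICOGRAPHIC);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[][] good = { {'g'}, {'m'}, {'t'} };   // 3 keys, so 4 regions
        byte[][] bad = { null, {'m'}, null };      // nulls for the open-ended boundaries
        System.out.println("regions: " + regionCount(good));
        System.out.println("good keys throw NPE: " + sortThrowsNpe(good));
        System.out.println("null keys throw NPE: " + sortThrowsNpe(bad));
    }
}
```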





[jira] [Commented] (HBASE-4194) RegionSplitter: Split on under-loaded region servers first

2011-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084736#comment-13084736
 ] 

Hudson commented on HBASE-4194:
---

Integrated in HBase-TRUNK #2113 (See 
[https://builds.apache.org/job/HBase-TRUNK/2113/])
HBASE-4194  RegionSplitter: Split on under-loaded region servers first

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java
* /hbase/trunk/CHANGES.txt


 RegionSplitter: Split on under-loaded region servers first
 --

 Key: HBASE-4194
 URL: https://issues.apache.org/jira/browse/HBASE-4194
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Trivial
 Fix For: 0.92.0

 Attachments: HBASE-4194.patch


 When running RegionSplitter, our app devs noticed that they were getting a 
 lot of NSREs.  This is caused by 2 factors: 
 1. the split itself will cause an NSRE 
 2. any load balancing will cause one.  
 The former cannot be helped.  We can more tightly control load balancing 
 though.  Instead of doing a name-sorted round-robin split across RS in the 
 tier, we could sort the RS's by region count.  That way, we only split an RS 
 with 10 regions after there are no more RS with 9 regions.  This will prevent 
 the load balancing slop from kicking in and will fix the problem where 
 restarting RegionSplitter always starts splitting at RS #1.
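
The ordering proposed above can be sketched with a priority queue keyed on region count. This is an illustration only: LeastLoadedFirst is a hypothetical name, not the RegionSplitter code, and it assumes each split adds exactly one region to the chosen server.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class LeastLoadedFirst {
    // Returns the servers in the order they would be picked for 'splits' splits,
    // always choosing the server with the fewest regions (ties broken by name).
    static List<String> splitOrder(Map<String, Integer> regionCounts, int splits) {
        Comparator<Map.Entry<String, Integer>> byLoadThenName = (x, y) -> {
            int c = Integer.compare(x.getValue(), y.getValue());
            return c != 0 ? c : x.getKey().compareTo(y.getKey());
        };
        PriorityQueue<Map.Entry<String, Integer>> queue = new PriorityQueue<>(byLoadThenName);
        for (Map.Entry<String, Integer> e : regionCounts.entrySet()) {
            queue.add(new AbstractMap.SimpleEntry<>(e.getKey(), e.getValue()));
        }
        List<String> order = new ArrayList<>();
        for (int i = 0; i < splits && !queue.isEmpty(); i++) {
            Map.Entry<String, Integer> least = queue.poll();
            order.add(least.getKey());
            // Assumption: a split leaves the server with one more region than before.
            queue.add(new AbstractMap.SimpleEntry<>(least.getKey(), least.getValue() + 1));
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("rs1", 10);
        counts.put("rs2", 9);
        counts.put("rs3", 9);
        System.out.println(splitOrder(counts, 3));
    }
}
```

With this ordering, a restart of the splitter resumes at whichever server is currently least loaded instead of always starting at the first server by name.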





[jira] [Commented] (HBASE-4196) TableRecordReader may skip first row of region

2011-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084735#comment-13084735
 ] 

Hudson commented on HBASE-4196:
---

Integrated in HBase-TRUNK #2113 (See 
[https://builds.apache.org/job/HBase-TRUNK/2113/])
HBASE-4196 TableRecordReader may skip first row of region

stack : 
Files : 
* /hbase/trunk/CHANGES.txt
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapred/TableRecordReaderImpl.java


 TableRecordReader may skip first row of region
 --

 Key: HBASE-4196
 URL: https://issues.apache.org/jira/browse/HBASE-4196
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.4
Reporter: Jan Lukavsky
Assignee: Ming Ma
 Fix For: 0.90.5

 Attachments: HBASE-4196-trunk.patch, HBASE-4196-trunk.patch, 
 HBASE-4196-trunk.patch


 After the following scenario, the first record of a region is skipped without 
 being sent to the Mapper:
  - the reader is initialized with TableRecordReader.init()
  - then nextKeyValue is called, causing a call to scanner.next(); here a 
 ScannerTimeoutException occurs
  - the scanner is restarted by a call to restart() and then *two* calls to 
 scanner.next() occur, so the first row is lost
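
The scenario above can be simulated without HBase. The sketch below is hypothetical (RestartSketch is not the TableRecordReaderImpl code): it models a scanner over an in-memory list that times out on its very first next() call, and it assumes the extra next() after restart() is meant to skip a row that was already mapped. The guardSkip flag shows one possible guard, which may differ from the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class RestartSketch {
    // Reads 'rows' through a scanner that times out on the very first next() call.
    // If guardSkip is false, the restart path always skips one row (the behaviour
    // described above); if true, it skips only when a row was already read.
    static List<String> readAll(List<String> rows, boolean guardSkip) {
        List<String> mapped = new ArrayList<>();
        String lastRow = null;          // last row successfully handed to the mapper
        boolean timeoutPending = true;  // simulate exactly one scanner timeout
        Iterator<String> scanner = rows.iterator();
        while (true) {
            if (timeoutPending) {
                timeoutPending = false;
                // restart(): reopen the scanner at lastRow, or at the region start
                int from = (lastRow == null) ? 0 : rows.indexOf(lastRow);
                scanner = rows.subList(from, rows.size()).iterator();
                boolean skip = !guardSkip || lastRow != null;
                if (skip && scanner.hasNext()) {
                    scanner.next();     // skip the row presumed to be already mapped
                }
            }
            if (!scanner.hasNext()) {
                break;
            }
            lastRow = scanner.next();
            mapped.add(lastRow);
        }
        return mapped;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c");
        System.out.println("unguarded: " + readAll(rows, false));
        System.out.println("guarded:   " + readAll(rows, true));
    }
}
```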





[jira] [Commented] (HBASE-4192) Optimize HLog for Throughput Using Delayed RPCs

2011-08-13 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084773#comment-13084773
 ] 

jirapos...@reviews.apache.org commented on HBASE-4192:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1463/#review1447
---



src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java
https://reviews.apache.org/r/1463/#comment3352

The return value doesn't seem to be used.
Can you clarify why?


- Ted


On 2011-08-12 23:44:23, Vlad Dogaru wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1463/
bq.  ---
bq.  
bq.  (Updated 2011-08-12 23:44:23)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Changes:
bq.  
bq.  1.  Add hbase.region.wal.batchentries configuration parameter.  If this is
bq.  enabled, batch entries to the HLog in a queue.
bq.  2.  Use delayed RPCs for sync requests when aggressive batching is enabled.
bq.  This frees up RPC handler threads for the duration of the sync.
bq.  3.  Pass the RPC server instance all the way down to HLog.  This is needed
bq.  to find out the current remote call, mark it as delayed, and finally complete
bq.  it when the sync is done.
bq.  4.  Use the region read-write consistency control to avoid exposing to
bq.  RegionScanners edits which have not yet been synced.
bq.  5.  Change a few tests which directly create HRegions or HLogs.  The
bq.  rpcServers passed in are null; HLog falls back to classic RPCs when it has no
bq.  knowledge of the RPC server.
bq.  6.  Add TestBatchEntriesLogRolling, which is identical to TestLogRolling,
bq.  except that it uses aggressive batching.  I'm not sure how to add tests that
bq.  verify the same functionality but don't duplicate code; suggestions are
bq.  welcome.
bq.  
bq.  The new parameter is disabled by default.
bq.  
bq.  
bq.  This addresses bug HBASE-4192.
bq.  https://issues.apache.org/jira/browse/HBASE-4192
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 7117bce 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java a00b93d 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 83ff7b2 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 7a917da 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 8ec53d3 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 887f736 
bq.    src/main/resources/hbase-default.xml 66548ca 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALActionsListener.java dc43eb2 
bq.    src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java 381ac90 
bq.  
bq.  Diff: https://reviews.apache.org/r/1463/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  All unit tests run with aggressive batching turned on and off.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Vlad
bq.  
bq.



 Optimize HLog for Throughput Using Delayed RPCs
 ---

 Key: HBASE-4192
 URL: https://issues.apache.org/jira/browse/HBASE-4192
 Project: HBase
  Issue Type: New Feature
  Components: wal
Affects Versions: 0.92.0
Reporter: Vlad Dogaru
Priority: Minor

 Introduce a new HLog configuration parameter (batchEntries) for more 
 aggressive batching of appends.  If this is enabled, HLog appends are not 
 written to the HLog writer immediately, but batched and written either 
 periodically or when a sync is requested.  Because sync times become larger, 
 they use delayed RPCs to free up RPC handler threads.
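
The batching half of this proposal can be sketched as follows. This is a toy model, not the HLog implementation: BatchingLog and its fields are hypothetical, entries are plain strings, and the delayed-RPC part is not modelled at all. It only shows how appends reach the writer immediately without batching, but are queued until sync() with it.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchingLog {
    private final List<String> pending = new ArrayList<>(); // queued, not yet written
    private final List<String> written = new ArrayList<>(); // stands in for the log writer
    private final boolean batchEntries;
    private int writerCalls = 0;

    BatchingLog(boolean batchEntries) {
        this.batchEntries = batchEntries;
    }

    // Without batching, each append goes straight to the writer.
    // With batching, the entry is only queued.
    synchronized void append(String entry) {
        if (batchEntries) {
            pending.add(entry);
        } else {
            written.add(entry);
            writerCalls++;
        }
    }

    // Flushes everything queued so far in a single writer call.
    synchronized void sync() {
        if (!pending.isEmpty()) {
            written.addAll(pending);
            pending.clear();
            writerCalls++;
        }
    }

    synchronized int writerCalls() {
        return writerCalls;
    }

    synchronized int writtenCount() {
        return written.size();
    }

    public static void main(String[] args) {
        BatchingLog log = new BatchingLog(true);
        log.append("edit-1");
        log.append("edit-2");
        log.append("edit-3");
        System.out.println("written before sync: " + log.writtenCount());
        log.sync();
        System.out.println("written after sync: " + log.writtenCount()
            + ", writer calls: " + log.writerCalls());
    }
}
```

The trade-off matches the description above: fewer writer calls, at the cost of a sync that has more work to do.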





[jira] [Commented] (HBASE-4071) Data GC: Remove all versions > TTL EXCEPT the last written version

2011-08-13 Thread Ian Varley (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084778#comment-13084778
 ] 

Ian Varley commented on HBASE-4071:
---

I like the idea of making this pluggable (via the coprocessor framework, or 
otherwise). But I also think this is a fundamental enough policy option that 
making it hard-coded might be a good idea. When I was talking to someone the 
other day about the current TTL policy, his reaction was, "WTF, who would want 
that? It eats your data." There's no such thing as a "keep 0 versions" option, 
and thus no way to accidentally lose your most current data using that 
approach. But with the TTL version there is, which is (IMO) counter-intuitive 
for those coming from an RDBMS background.

 Data GC: Remove all versions > TTL EXCEPT the last written version
 --

 Key: HBASE-4071
 URL: https://issues.apache.org/jira/browse/HBASE-4071
 Project: HBase
  Issue Type: New Feature
Reporter: stack

 We were chatting today about our backup cluster.  What we want is to be able 
 to restore the dataset from any point in time but only within a limited 
 timeframe -- say one week.  Thereafter, if the versions are older than one 
 week, rather than as we do with TTL where we let go of all versions older 
 than TTL, instead, let go of all versions EXCEPT the last one written.  So, 
 it's like versions==1 when TTL > one week.  We want to allow that if an error 
 is caught within a week of its happening -- user mistakenly removes a 
 critical table -- then we'll be able to restore up to the moment just before 
 catastrophe hit; otherwise, we keep one version only.





[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

2011-08-13 Thread fulin wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084780#comment-13084780
 ] 

fulin wang commented on HBASE-4124:
---

gaojinchao, please fix the issues. Thanks.

 ZK restarted while assigning a region, new active HM re-assign it but the RS 
 warned 'already online on this server'.
 

 Key: HBASE-4124
 URL: https://issues.apache.org/jira/browse/HBASE-4124
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: fulin wang
 Attachments: log.txt

   Original Estimate: 0.4h
  Remaining Estimate: 0.4h

 ZK restarted while assigning a region, new active HM re-assign it but the RS 
 warned 'already online on this server'.
 Issue:
 The RS fails because of 'already online on this server' and returns; the HM 
 cannot receive the message and reports 'Regions in transition timed out'.





[jira] [Commented] (HBASE-4071) Data GC: Remove all versions > TTL EXCEPT the last written version

2011-08-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084782#comment-13084782
 ] 

Lars Hofhansl commented on HBASE-4071:
--

I also think a simple notion like "keep everything more recent than T, but at 
least N versions" seems to be a frequent enough option to be hard-coded.
Maybe that could be indicated simply by setting both # of versions AND TTL on a 
column family.
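
The "more recent than T, but at least N versions" rule can be written down as a small filter. This is only a sketch of the policy as stated in this comment, not of any HBase implementation; RetentionSketch, retained, and the minVersions parameter are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RetentionSketch {
    // 'timestamps' must be sorted newest first. Keeps every version that is no
    // older than 'ttl' and, regardless of age, the 'minVersions' newest versions.
    static List<Long> retained(List<Long> timestamps, long now, long ttl, int minVersions) {
        List<Long> keep = new ArrayList<>();
        for (int i = 0; i < timestamps.size(); i++) {
            long ts = timestamps.get(i);
            boolean fresh = now - ts <= ttl;
            if (fresh || i < minVersions) {
                keep.add(ts);
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        // Two versions inside the TTL window, two outside it.
        System.out.println(retained(Arrays.asList(100L, 90L, 50L, 10L), 100L, 20L, 1));
        // Every version is older than the TTL: the newest one still survives.
        System.out.println(retained(Arrays.asList(50L, 10L), 100L, 20L, 1));
        // Plain TTL behaviour (minVersions = 0): everything expires.
        System.out.println(retained(Arrays.asList(50L, 10L), 100L, 20L, 0));
    }
}
```

With minVersions set to 1 this matches the request in the issue description: data older than the TTL is dropped except for the last written version.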






[jira] [Commented] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner

2011-08-13 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084787#comment-13084787
 ] 

jirapos...@reviews.apache.org commented on HBASE-4197:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1496/#review1448
---



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
https://reviews.apache.org/r/1496/#comment3353

I need to change the javadoc to say RegionScanner as well.


- Lars


On 2011-08-13 15:56:28, Lars Hofhansl wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1496/
bq.  ---
bq.  
bq.  (Updated 2011-08-13 15:56:28)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu and Mingjie Lai.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  1. Don't require custom scanners created by coprocessors to be subclasses of HRegion.RegionScanner (see HBASE-4197).
bq.  2. Simplify the interfaces for Scanners in HRegion, HRegionServer, and RegionObserver. This avoids a bunch of instanceof checks and casts to HRegion.RegionScanner.
bq.  
bq.  (Sorry HBase-git would not accept my patch)
bq.  
bq.  
bq.  This addresses bug HBASE-4197.
bq.  https://issues.apache.org/jira/browse/HBASE-4197
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 1157311 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java 1157311 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1157311 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1157311 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java 1157311 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java PRE-CREATION 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java 1157311 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java 1157311 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java 1157311 
bq.  
bq.  Diff: https://reviews.apache.org/r/1496/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Manual test attached to the bug.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Lars
bq.  
bq.


