date:20110812

[
https://issues.apache.org/jira/browse/HBASE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083965#comment-13083965
]

stack commented on HBASE-4015:
--

bq. Timeout monitor DOES NOT preempt a znode to OFFLINE if in PENDING_OPEN
state.

Ok.

I think I understand now. The addition of a new state breaks the move to OPENING
because the check for a previous OFFLINE state will fail... so the RS will not
proceed with the open.

But in fig (iii) in your doc you check that the previous state is REALLOCATE?
How is this case different from fig (i), where you check for OFFLINE? Won't
your code have to check for both REALLOCATE and OFFLINE, with the presence of
either meaning it's OK to proceed to OPENING (and then aren't REALLOCATE and
OFFLINE the 'same' state, because the presence of either will mean proceed to
OPENING)?

I suppose the presence of the RS name will help. If it's the 'same' name, then
we can proceed to OPENING, and so what if OFFLINE was hijacked and became a
REALLOCATE. If they are not the same, then we'd abort the open.
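
A minimal sketch of that check, under stated assumptions: `ZnodeState`, `OpeningCheck`, and `canMoveToOpening` are hypothetical names for illustration and are not taken from the patch or from HBase. It only shows the rule being discussed: OFFLINE is accepted as before, and REALLOCATE is accepted when the server name in the znode matches the region server doing the open.

```java
// Hypothetical sketch: names are illustrative, not HBase APIs.
enum ZnodeState { OFFLINE, REALLOCATE, OPENING, OPENED }

final class OpeningCheck {
    /**
     * Returns true if a region server may move the znode to OPENING.
     * OFFLINE is accepted unconditionally; REALLOCATE is accepted only when
     * the server name stored in the znode matches this region server.
     */
    static boolean canMoveToOpening(ZnodeState current, String znodeServer, String thisServer) {
        if (current == ZnodeState.OFFLINE) {
            return true;
        }
        return current == ZnodeState.REALLOCATE
            && znodeServer != null
            && znodeServer.equals(thisServer);
    }
}
```

Any other state, or a REALLOCATE carrying a different server name, makes the region server abort the open.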

So, why not just add machine name to OFFLINE? Then we don't need REALLOCATE
state? (Ideally it would be best if master told the regionserver the version of
the znode to expect when it goes to move the znode to OPENING but that looks
hard to pass from the master over to the RS EventHandlers).
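
To illustrate the version idea, here is a small in-memory sketch. `VersionedNode` is a hypothetical stand-in for a znode, not the ZooKeeper client; the real analogue would be a conditional update such as ZooKeeper's `setData(path, data, version)`, which fails when the version does not match. The assumption is that the master hands the expected version to the RS.

```java
// Hypothetical in-memory stand-in for a versioned znode; not the ZooKeeper API.
final class VersionedNode {
    private String data;
    private int version = 0;

    VersionedNode(String initialData) { this.data = initialData; }

    synchronized int getVersion() { return version; }
    synchronized String getData() { return data; }

    /** Updates the data only if the caller's expected version still matches. */
    synchronized boolean setData(String newData, int expectedVersion) {
        if (expectedVersion != version) {
            return false; // someone else changed the node first
        }
        data = newData;
        version++;
        return true;
    }
}
```

With this shape, an RS holding a stale version loses the race cleanly: its move to OPENING returns false instead of overwriting a node the master has since re-forced to OFFLINE.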

So, figuring out how to deal with the timeout of regions in PENDING_OPEN is one
aspect of this issue, right? The verification of state over in the timeout
monitor before acting is another aspect?

You are working on TRUNK Ram? (I believe it acts a little differently from 0.90
because of recent work done in here).

Good stuff Ram. Thanks for digging into this.

Refactor the TimeoutMonitor to make it less racy

Key: HBASE-4015
URL: https://issues.apache.org/jira/browse/HBASE-4015
Project: HBase
Issue Type: Sub-task
Affects Versions: 0.90.3
Reporter: Jean-Daniel Cryans
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
Fix For: 0.92.0

Attachments: HBASE-4015_1_trunk.patch, Timeoutmonitor with state
diagrams.pdf

The current implementation of the TimeoutMonitor acts like a race condition
generator, mostly making things worse rather than better. It does its own
thing for a while without caring about what's happening in the rest of the
master.
The first thing that needs to happen is that the regions should not be
processed in one big batch, because that can sometimes take minutes
(meanwhile, a region that timed out opening might have opened, and it will
then be reassigned by the TimeoutMonitor, generating the never-ending
PENDING_OPEN situation).
Those operations should also be done more atomically, although I'm not sure
how to do it in a scalable way in this case.
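
One way to picture the "not in one big batch" point is a per-region check that re-reads the current state immediately before acting. The sketch below is only an illustration under assumptions: `TimeoutSketch`, `Transition`, and `shouldReassign` are hypothetical names, and the map stands in for the master's regions-in-transition bookkeeping.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; class and method names are illustrative, not HBase's.
final class TimeoutSketch {
    enum State { PENDING_OPEN, OPENING, OPENED }

    static final class Transition {
        final State state;
        final long stamp;
        Transition(State state, long stamp) { this.state = state; this.stamp = stamp; }
    }

    final Map<String, Transition> regionsInTransition = new ConcurrentHashMap<>();

    /**
     * Handles one region at a time, re-reading its current state just before
     * acting. Returns true only if the region is still timed out in PENDING_OPEN.
     */
    boolean shouldReassign(String region, long now, long timeoutMs) {
        Transition t = regionsInTransition.get(region); // fresh read, not a batch snapshot
        if (t == null) {
            return false; // already out of transition, e.g. it opened
        }
        return t.state == State.PENDING_OPEN && now - t.stamp > timeoutMs;
    }
}
```

This narrows the window in which a region that has just opened gets reassigned, but it does not by itself make the check and the reassignment atomic.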

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-1730) Near-instantaneous online schema and table state updates

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083967#comment-13083967
]

jirapos...@reviews.apache.org commented on HBASE-1730:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1479/
---

Review request for Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray.

Summary
---

When the master receives an alter table call (addColumn, modifyColumn,
deleteColumn, modifyTable), it updates the .tableinfo and then closes all the
regions of that table. The patch includes:

1. Changes to reopen the regions when any of the above operations are
performed.
2. Best effort is made to preserve the locality of regions by assigning it a
region plan before closing it.
3. Throttling logic that ensures that only a configurable number of regions are
closed per region server at a time.
4. alter command in the hbase shell will block until all the regions are
updated, providing a status x/y regions updated every second.
5. alter_async command that works exactly like alter, except that it does not
block for completion or provide the status.
6. alter_status table_name which is a sync call and blocks to provide the
x/y regions updated status per second until all regions are updated.
7. modification in the unit test for enabling alter without disabling the table.
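
The throttling in item 3 could be sketched with one semaphore per region server. This is an assumption-laden illustration, not the patch's BulkReOpen code: `CloseThrottle`, `tryStartClose`, and `finishReopen` are hypothetical names, and the limit would come from configuration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Semaphore;

// Hypothetical sketch; not the patch's BulkReOpen code.
final class CloseThrottle {
    private final int maxPerServer;
    private final ConcurrentMap<String, Semaphore> permits = new ConcurrentHashMap<>();

    CloseThrottle(int maxPerServer) { this.maxPerServer = maxPerServer; }

    private Semaphore forServer(String server) {
        // One semaphore per region server, created on first use.
        return permits.computeIfAbsent(server, k -> new Semaphore(maxPerServer));
    }

    /** Returns true if another region may be closed on this server right now. */
    boolean tryStartClose(String server) { return forServer(server).tryAcquire(); }

    /** Called once the region has been reopened, freeing a slot on that server. */
    void finishReopen(String server) { forServer(server).release(); }
}
```

A caller would skip or retry a region whose server has no free slot, so at most `maxPerServer` regions per server are closed at any moment.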

This addresses bug HBASE-1730.
https://issues.apache.org/jira/browse/HBASE-1730

Diffs
-

src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java f151c77
src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 13c8b8c
src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024
src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c
src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION
src/main/java/org/apache/hadoop/hbase/master/HMaster.java 8beeb68
src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 57c1140
src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837
src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa
src/main/ruby/hbase/admin.rb 4460d6e
src/main/ruby/shell.rb 1ec330f
src/main/ruby/shell/commands/alter.rb 1dd43ad
src/main/ruby/shell/commands/alter_async.rb PRE-CREATION
src/main/ruby/shell/commands/alter_status.rb PRE-CREATION
src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java a32dc99

Diff: https://reviews.apache.org/r/1479/diff

Testing
---

I am putting this up for initial review. I have tested the functionality in a
pseudo distributed mode.
Need to run unit tests.

Thanks,

Nileema

Near-instantaneous online schema and table state updates

Key: HBASE-1730
URL: https://issues.apache.org/jira/browse/HBASE-1730
Project: HBase
Issue Type: Improvement
Reporter: Andrew Purtell
Assignee: stack
Priority: Critical
Fix For: 0.92.0

Attachments: 1730-v2.patch, 1730-v3.patch, 1730.patch,
HBASE-1730.patch

We should not need to take a table offline to update HCD or HTD.
One option for that is putting HTDs and HCDs up into ZK, with mirror on disk
catalog tables to be used only for cold init scenarios, as discussed on IRC.
In this scheme, regionservers hosting regions of a table would watch
permanent nodes in ZK associated with that table for schema updates and take
appropriate actions out of the watcher. In effect, schema updates become
another item in the ToDo list.
{{/hbase/tables/table-name/schema}}
Must be associated with a write locking scheme also handled with ZK
primitives to avoid situations where one concurrent update clobbers another.


[jira] [Commented] (HBASE-4193) Enhance RPC debug logging to provide more details on call contents

2011-08-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083982#comment-13083982
 ] 

Hudson commented on HBASE-4193:
---

Integrated in HBase-TRUNK #2111 (See 
[https://builds.apache.org/job/HBase-TRUNK/2111/])
HBASE-4193  Enhance RPC debug logging with details on call contents

garyh : 
Files : 
* /hbase/trunk/conf/log4j.properties
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/Objects.java


 Enhance RPC debug logging to provide more details on call contents
 --

 Key: HBASE-4193
 URL: https://issues.apache.org/jira/browse/HBASE-4193
 Project: HBase
  Issue Type: Improvement
  Components: ipc
Reporter: Gary Helmling
Assignee: Gary Helmling
Priority: Minor
 Fix For: 0.92.0

 Attachments: HBASE-4193.patch, HBASE-4193_final.patch


 The current HBaseServer debug logging, while verbose, doesn't provide much 
 information on the actual contents of RPC calls being handled.  This makes it 
 difficult to diagnose why some calls may take much longer to process than 
 others.  Having more information on the size of client calls, and the contents 
 of those calls (especially in the case of batch or multi operations), would 
 provide a lot more context for tracking down issues.


[jira] [Resolved] (HBASE-4186) No region is added to regionsInTransitionInRS


 [ 
https://issues.apache.org/jira/browse/HBASE-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-4186.
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]

 No region is added to regionsInTransitionInRS
 -

 Key: HBASE-4186
 URL: https://issues.apache.org/jira/browse/HBASE-4186
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.90.5

 Attachments: 4186.txt


 We have a skip list set called regionsInTransitionInRS (introduced in 
 HBASE-3741) where we try to maintain a list to know the currently processing 
 regions for closing and opening.
 In open region handler we are trying to throw an error if the regions are in 
 transition on that RS when we get an open call for the same region.
 But we are not adding the region into the set anywhere.
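
 A minimal sketch of the missing step, assuming the fix is to register the region in the set when an open starts. `RegionsInTransitionSketch`, `startOpen`, and `finish` are hypothetical names, and plain strings stand in for whatever key the real set uses.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical sketch; the real set and handlers live in the region server.
final class RegionsInTransitionSketch {
    private final Set<String> regionsInTransitionInRS = new ConcurrentSkipListSet<>();

    /** Registers the region; fails if an open or close is already in progress. */
    void startOpen(String encodedRegionName) {
        // add() returns false when the element is already present.
        if (!regionsInTransitionInRS.add(encodedRegionName)) {
            throw new IllegalStateException("Region already in transition: " + encodedRegionName);
        }
    }

    /** Must be called when the open or close finishes, successfully or not. */
    void finish(String encodedRegionName) {
        regionsInTransitionInRS.remove(encodedRegionName);
    }
}
```

 Without the `add` call the duplicate check can never fire, which is the bug described here.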


[jira] [Updated] (HBASE-3741) Make HRegionServer aware of the regions it's opening/closing


 [ 
https://issues.apache.org/jira/browse/HBASE-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3741:
--

Fix Version/s: (was: 0.90.5)
   0.90.3

 Make HRegionServer aware of the regions it's opening/closing
 

 Key: HBASE-3741
 URL: https://issues.apache.org/jira/browse/HBASE-3741
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.1
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.90.3

 Attachments: HBASE-3741-rsfix-v2.patch, HBASE-3741-rsfix-v3.patch, 
 HBASE-3741-rsfix.patch, HBASE-3741-trunk.patch


 This is a serious issue about a race between regions being opened and closed 
 in region servers. We had this situation where the master tried to unassign a 
 region for balancing, failed, force unassigned it, force assigned it 
 somewhere else, failed to open it on another region server (took too long), 
 and then reassigned it back to the original region server. A few seconds 
 later, the region server processed the first close and the region was left 
 unassigned.
 This is from the master log:
 {quote}
 2011-04-05 15:11:17,758 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
 Sent CLOSE to serverName=sv4borg42,60020,1300920459477, load=(requests=187, 
 regions=574, usedHeap=3918, maxHeap=6973) for region 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
 2011-04-05 15:12:10,021 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  state=PENDING_CLOSE, ts=1302041477758
 2011-04-05 15:12:10,021 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
 PENDING_CLOSE for too long, running forced unassign again on 
 region=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
 ...
 2011-04-05 15:14:45,783 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  state=CLOSED, ts=1302041685733
 2011-04-05 15:14:45,783 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x42ec2cece810b68 Creating (or updating) unassigned node for 
 1470298961 with OFFLINE state
 ...
 2011-04-05 15:14:45,885 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
 region 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961;
  
 plan=hri=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961,
  src=sv4borg42,60020,1300920459477, dest=sv4borg40,60020,1302041218196
 2011-04-05 15:14:45,885 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  to sv4borg40,60020,1302041218196
 2011-04-05 15:15:39,410 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  state=PENDING_OPEN, ts=1302041700944
 2011-04-05 15:15:39,410 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
 PENDING_OPEN for too long, reassigning 
 region=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
 2011-04-05 15:15:39,410 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  state=PENDING_OPEN, ts=1302041700944
 ...
 2011-04-05 15:15:39,410 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan 
 was found (or we are ignoring an existing plan) for 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  so generated a random one; 
 hri=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961,
  src=, dest=sv4borg42,60020,1300920459477; 19 (online=19, exclude=null) 
 available servers
 2011-04-05 15:15:39,410 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  to sv4borg42,60020,1300920459477
 2011-04-05 15:15:40,951 DEBUG 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
 master:6-0x42ec2cece810b68 Received ZooKeeper Event, 
 type=NodeDataChanged, state=SyncConnected,

[jira] [Commented] (HBASE-1730) Near-instantaneous online schema and table state updates

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083990#comment-13083990
]

jirapos...@reviews.apache.org commented on HBASE-1730:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1479/#review1416
---

Please test this patch using a real cluster.

src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
https://reviews.apache.org/r/1479/#comment3273

Pair implements Serializable, which is handled specially by
HbaseObjectWritable (see line 350 below). Why is this needed?

- Ted

On 2011-08-12 06:14:21, Nileema Shingte wrote:
bq.
bq. ---
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/1479/
bq. ---
bq.
bq. (Updated 2011-08-12 06:14:21)
bq.
bq.
bq. Review request for Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray.
bq.
bq.
bq. Summary
bq. ---
bq.
bq. When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes:
bq.
bq. 1. Changes to reopen the regions when any of the above operations are performed.
bq. 2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it.
bq. 3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time.
bq. 4. alter command in the hbase shell will block until all the regions are updated, providing a status x/y regions updated every second.
bq. 5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status.
bq. 6. alter_status table_name which is a sync call and blocks to provide the x/y regions updated status per second until all regions are updated.
bq. 7. modification in the unit test for enabling alter without disabling the table.
bq.
bq.
bq. This addresses bug HBASE-1730.
bq. https://issues.apache.org/jira/browse/HBASE-1730
bq.
bq.
bq. Diffs
bq. -
bq.
bq. src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java f151c77
bq. src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 13c8b8c
bq. src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024
bq. src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c
bq. src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/master/HMaster.java 8beeb68
bq. src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 57c1140
bq. src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837
bq. src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa
bq. src/main/ruby/hbase/admin.rb 4460d6e
bq. src/main/ruby/shell.rb 1ec330f
bq. src/main/ruby/shell/commands/alter.rb 1dd43ad
bq. src/main/ruby/shell/commands/alter_async.rb PRE-CREATION
bq. src/main/ruby/shell/commands/alter_status.rb PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java a32dc99
bq.
bq. Diff: https://reviews.apache.org/r/1479/diff
bq.
bq.
bq. Testing
bq. ---
bq.
bq. I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode.
bq. Need to run unit tests.
bq.
bq.
bq. Thanks,
bq.
bq. Nileema
bq.
bq.


[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084003#comment-13084003
 ] 

jirapos...@reviews.apache.org commented on HBASE-4027:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1214/
---

(Updated 2011-08-12 08:41:37.017483)


Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, Jonathan Gray, 
and Li Pi.


Changes
---

Fixed test failure, as Ted Yu reported.


Summary
---

Review request - I apparently can't edit tlipcon's earlier posting of my diff, 
so creating a new one.


This addresses bug HBase-4027.
https://issues.apache.org/jira/browse/HBase-4027


Diffs (updated)
-

  conf/hbase-env.sh 2d55d27
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 2d4002c
  src/main/java/org/apache/hadoop/hbase/io/hfile/CacheStats.java PRE-CREATION
  src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java PRE-CREATION
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 097dc50
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java 1338453
  src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java 886c31d
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java PRE-CREATION
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/Slab.java PRE-CREATION
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java PRE-CREATION
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabItemEvictionWatcher.java PRE-CREATION
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 7a917da
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 7b7bf73
  src/main/java/org/apache/hadoop/hbase/util/DirectMemoryUtils.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/io/hfile/HFileBlockCacheTestUtils.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/io/hfile/SingleSizeCacheTestUtils.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSingleSizeCache.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlab.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlabCache.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 4387170

Diff: https://reviews.apache.org/r/1214/diff


Testing
---

Ran benchmarks against it in HBase standalone mode. Wrote test cases for all 
classes, multithreaded test cases exist for the cache.


Thanks,

Li



 Enable direct byte buffers LruBlockCache
 

 Key: HBASE-4027
 URL: https://issues.apache.org/jira/browse/HBASE-4027
 Project: HBase
  Issue Type: Improvement
Reporter: Jason Rutherglen
Assignee: Li Pi
Priority: Minor
 Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, 
 HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027-v10.5.diff, 
 hbase-4027-v10.diff, hbase-4027v10.6.diff, hbase-4027v6.diff, 
 hbase4027v11.5.diff, hbase4027v11.diff, slabcachepatch.diff, 
 slabcachepatchv2.diff, slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, 
 slabcachepatchv3.diff, slabcachepatchv4.5.diff, slabcachepatchv4.diff


 Java offers the creation of direct byte buffers, which are allocated outside 
 of the heap.
 They need to be manually freed, which can be accomplished using an 
 undocumented {{clean}} method.
 The feature will be optional.  After implementing, we can benchmark for 
 differences in speed and garbage collection behaviour.
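
 For reference, a minimal sketch of allocating a direct buffer with the standard `java.nio.ByteBuffer` API. `DirectBufferSketch` and `copyOffHeap` are hypothetical names; the sketch deliberately leaves out any explicit freeing, since that relies on JDK-internal classes.

```java
import java.nio.ByteBuffer;

// Hypothetical helper; only ByteBuffer.allocateDirect is a standard API here.
final class DirectBufferSketch {
    /** Allocates an off-heap buffer and copies the block bytes into it. */
    static ByteBuffer copyOffHeap(byte[] block) {
        ByteBuffer buf = ByteBuffer.allocateDirect(block.length);
        buf.put(block);
        buf.flip(); // switch from writing to reading
        return buf;
    }
}
```

 The buffer's contents live outside the Java heap, so they do not add to the heap the garbage collector has to scan.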


[jira] [Commented] (HBASE-4015) Refactor the TimeoutMonitor to make it less racy

[
https://issues.apache.org/jira/browse/HBASE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084007#comment-13084007
]

ramkrishna.s.vasudevan commented on HBASE-4015:
---

bq. You are working on TRUNK Ram?
Yes Stack

bq. Won't your code have to check for both REALLOCATE and OFFLINE and the
presence of either mean its ok to procede to OPENING (and then aren't
REALLOCATE and OFFLINE the 'same' state because the presence of either will
mean proceed to OPENING?).

Yes, this is what my patch does. But why do we do the same operation for both
states?
Previously, if the state had changed to anything other than OFFLINE while
moving to OPENING, we aborted. Now there is an additional state that says it
is OK to go to OPENING if you find the znode in RE_ALLOCATE and the server name
in it is the same as your RS address. This avoids a region being hijacked
unnecessarily even though the RS was doing its work correctly.

bq. So, why not just add machine name to OFFLINE? Then we don't need REALLOCATE
state?
As you have already noted, currently no version is passed from the master to
the RS. That's why there is a new state. If that had been possible, OFFLINE
with the version passed by the master would have been sufficient.

bq. So, figuring how to do deal with timeout of regions in PENDING_OPEN is one
aspect of this issue, right? The verification of state over in timeout monitor
before acting is another aspect?
Yes Stack, we have covered both of these aspects and also the points raised by
JD: taking action on timeout immediately, and a mechanism for both master and
RS to know what happened as part of the timeout, so that whoever wins the race
succeeds.

bq. (I believe it acts a little differently from 0.90 because of recent work
done in here).

Regarding the timeout monitor, the one major change is that the CLOSING state
node is now created by the master itself, whereas in 0.90 it was created by the
RS. Apart from this I haven't found any big difference so far. As part of
HBASE-4083 we have introduced return types from OpenRegionHandler, which take
care of scenarios where the master changes the znode to RE_ALLOCATE by the time
the RS has moved to OPENED.


[jira] [Commented] (HBASE-4175) Fix FSUtils.createTableDescriptor()


[ 
https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084062#comment-13084062
 ] 

ramkrishna.s.vasudevan commented on HBASE-4175:
---

@Ted,
Currently, as you mentioned, FSUtils.createTableDescriptor() does not throw IOE.
So I will make it throw IOE.
Regarding the case where the table already exists, there is a check:
{code}
if (fs.exists(tableInfoPath) &&
    fs.getFileStatus(tableInfoPath).getLen() > 0) {
  LOG.info("TableInfo already exists.. Skipping creation");
}
{code}

So I think we need not add any FSTableDescriptor.get(). Please suggest.

Also, what should the behaviour be if the table already exists? As you had
already suggested, do we need to forcefully create it? For that we would need
to introduce a new API for forceful creation.
In my current patch I am planning to return true or false, and if an IOE
happens, to throw it to the caller.



 Fix FSUtils.createTableDescriptor()
 ---

 Key: HBASE-4175
 URL: https://issues.apache.org/jira/browse/HBASE-4175
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: ramkrishna.s.vasudevan

 Currently createTableDescriptor() doesn't return anything.
 The caller wouldn't know whether the descriptor is created or not. See 
 exception handling:
 {code}
} catch(IOException ioe) {
  LOG.info("IOException while trying to create tableInfo in HDFS", ioe);
}
 {code}
 We should return a boolean.
 If the table descriptor exists already, maybe we should deserialize from hdfs 
 and compare with htableDescriptor argument. If they differ, I am not sure 
 what the proper action would be.
 Maybe we can add a boolean argument, force, to createTableDescriptor(). When 
 force is true, existing table descriptor would be overwritten.


[jira] [Commented] (HBASE-4015) Refactor the TimeoutMonitor to make it less racy


[ 
https://issues.apache.org/jira/browse/HBASE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084060#comment-13084060
 ] 

ramkrishna.s.vasudevan commented on HBASE-4015:
---

@Stack,
I was looking at the possibility of using the OFFLINE state. A few thoughts:
- We need to change the behaviour in all the cases in the timeout monitor to 
preempt the node to OFFLINE with the RS name.
- Before changing to OFFLINE, check what the state is in the RS.  If it is 
still OFFLINE/OPENING, change it to OFFLINE plus the server name.
- After changing it to OFFLINE, get the latest version and pass it from the 
master to the RS, which in turn passes it to the OpenRegionHandler.
- This is needed when we transition from OFFLINE to OPENING, to determine 
whether the current transition is for the timeout call or whether the previous 
OFFLINE to OPENING transition did not happen.
- The server name is also necessary to avoid the transition being processed by 
an RS that is no longer the owner of the znode.
- Even in the normal assign flow, we need to add, along with OFFLINE, the 
server name of the RS that will process the unassigned node.

These are the high-level changes that we would need to make in the current 
patch if we want to avoid the new state.



[jira] [Updated] (HBASE-4175) Fix FSUtils.createTableDescriptor()


 [ 
https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4175:
--

Attachment: HBASE-4175.patch


[jira] [Commented] (HBASE-4175) Fix FSUtils.createTableDescriptor()


[ 
https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084124#comment-13084124
 ] 

Ted Yu commented on HBASE-4175:
---

We should add boolean parameter, force, to FSUtils.createTableDescriptor().
If the table already exists and force parameter is false, 
FSUtils.createTableDescriptor() can simply return false.
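
A minimal sketch of the proposed contract, under stated assumptions: it uses `java.nio.file` in place of the Hadoop FileSystem API so that it is self-contained, and `TableDescriptorSketch` is a hypothetical class, not the actual FSUtils code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch using java.nio.file in place of the Hadoop FileSystem API.
final class TableDescriptorSketch {
    /**
     * Writes the descriptor bytes to tableInfoPath.
     * Returns false without writing if a non-empty file exists and force is false;
     * I/O failures propagate to the caller as IOException.
     */
    static boolean createTableDescriptor(Path tableInfoPath, byte[] descriptor, boolean force)
            throws IOException {
        if (!force && Files.exists(tableInfoPath) && Files.size(tableInfoPath) > 0) {
            return false;
        }
        Files.write(tableInfoPath, descriptor); // creates or truncates the file
        return true;
    }
}
```

The boolean tells the caller whether a descriptor was written, and the exception is no longer swallowed by a log statement.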


[jira] [Commented] (HBASE-4195) Possible unconsistency in a memstore read after a reseek, possible performance improvement

2011-08-12 Thread nkeywal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084160#comment-13084160
 ] 

nkeywal commented on HBASE-4195:


The issue with the implementation calling only seek is that we can see writes 
in progress. From my understanding, it should not be the case (and at least, 
if it's allowed, there is an issue in the test case itself).

The error is this assert: Assert.assertEquals("i=" + i, expectedCount, 
result.size()); that's different from the one mentioned in HBASE-3855.


If I change the reseek implementation to something that does not call seek at 
all, like:
{noformat}public boolean reseek(KeyValue key) {
  while (kvsetNextRow != null &&
      comparator.compare(kvsetNextRow, key) < 0) {
    kvsetNextRow = getNext(kvsetIt);
  }

  while (snapshotNextRow != null &&
      comparator.compare(snapshotNextRow, key) < 0) {
    snapshotNextRow = getNext(snapshotIt);
  }

  numIterReseek = 0;
  return (kvsetNextRow != null || snapshotNextRow != null);
}{noformat}


The whole test works fine. So it seems the issue really comes from using seek. 
The current implementation should have the same issue, I think. Maybe we don't 
see it often (or at all) because seek is rarely called, for the reasons given 
in points 2 and 3 of the analysis in the issue description.

Can someone confirm that we should not see partial writes in this case?



 Possible unconsistency in a memstore read after a reseek, possible 
 performance improvement
 --

 Key: HBASE-4195
 URL: https://issues.apache.org/jira/browse/HBASE-4195
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
 Environment: all
Reporter: nkeywal
Priority: Critical

 This follows the discussion around HBASE-3855, and the random errors (20% 
 failure on trunk) on the unit test 
 org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting
 I saw some points related to numIterReseek, used in the 
 MemStoreScanner#getNext (line 690):
 {noformat}679 protected KeyValue getNext(Iterator it) {
 680   KeyValue ret = null;
 681   long readPoint = ReadWriteConsistencyControl.getThreadReadPoint();
 682   //DebugPrint.println(" MS@" + hashCode() + ": threadpoint = " + readPoint);
 683
 684   while (ret == null && it.hasNext()) {
 685     KeyValue v = it.next();
 686     if (v.getMemstoreTS() <= readPoint) {
 687       // keep it.
 688       ret = v;
 689     }
 690     numIterReseek--;
 691     if (numIterReseek == 0) {
 692       break;
 693     }
 694   }
 695   return ret;
 696 }{noformat}
 This function is called by seek, reseek, and next. The numIterReseek is only 
 useful for reseek.
 There are some issues. I am not totally sure they are the root cause of the test 
 case error, but they could partly explain the randomness of the error, and one 
 point is for sure a bug.
 1) In getNext, numIterReseek is decreased, then compared to zero. The seek 
 function sets numIterReseek to zero before calling getNext. It means that the 
 value will actually be negative, hence the test will always fail, and the 
 loop will continue. It is the expected behaviour, but it's quite smart.
 2) In reseek, numIterReseek is not reset between the loops on the two 
 iterators. If numIterReseek equals zero after the loop on the first 
 one, the loop on the second one will never call seek, as numIterReseek will 
 be negative.
 3) Still in reseek, the test to call seek is (kvsetNextRow == null && 
 numIterReseek == 0). In other words, if kvsetNextRow is not null when 
 numIterReseek equals zero, numIterReseek will start to be negative at the 
 next iteration and seek will never be called.
 4) You can have side effects if reseek ends with numIterReseek > 0: the 
 following calls to the next function will decrease numIterReseek to zero, 
 and getNext will break instead of continuing the loop. As a result, later 
 calls to next() may return null or not depending on how the 
 default value for numIterReseek is configured.
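
Points 1 and 4 can be reproduced outside HBase with a small stand-alone sketch. The class ReseekCounterSketch is hypothetical and replaces KeyValue with a bare memstore timestamp (a Long); only the decrement-then-compare logic is taken from getNext above.

```java
import java.util.Iterator;

/** Hypothetical model of the numIterReseek handling in getNext; not HBase code. */
final class ReseekCounterSketch {
  private int numIterReseek;

  ReseekCounterSketch(int initial) {
    this.numIterReseek = initial;
  }

  /**
   * Same shape as getNext: returns the first timestamp visible at readPoint,
   * or null if the iterator is exhausted or the counter reaches zero first.
   */
  Long getNext(Iterator<Long> it, long readPoint) {
    Long ret = null;
    while (ret == null && it.hasNext()) {
      Long v = it.next();
      if (v <= readPoint) {
        ret = v; // keep it
      }
      numIterReseek--;          // decremented first ...
      if (numIterReseek == 0) { // ... then compared, so a start value of 0 never matches
        break;
      }
    }
    return ret;
  }
}
```

For the timestamps 9, 9, 1 and a read point of 5, a counter starting at 0 goes negative and the loop reaches the visible entry (point 1). A leftover counter of 2 hits zero after the two invisible entries and the call returns null although a visible entry follows (point 4).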
 To check if the issue comes from point 4, you can set the numIterReseek to 
 zero before returning in reseek:
 {noformat}  numIterReseek = 0;
   return (kvsetNextRow != null || snapshotNextRow != null);
 }{noformat}
 On my env, on trunk, it seems to work, but as it's random I am not really 
 sure. I also had to modify the test (I added a loop) to make it fail more 
 often; the original test was working quite well here.
 It has to be confirmed that this totally fixes (it could be partial or 
 unrelated) 
 org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting 
 before implementing a

[jira] [Commented] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner


[ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084269#comment-13084269
 ] 

Lars Hofhansl commented on HBASE-4197:
--

I attached a minimal patch that makes it work for me.
I am not happy with the patch, though, for two reasons:
1. isFilterDone() now needs to be public.
2. If the regionserver can only ever deal with RegionScanners, maybe all the 
interfaces in coprocessors should also take RegionScanner instead.

 RegionServer expects all scanner to be subclasses of HRegion.RegionScanner
 --

 Key: HBASE-4197
 URL: https://issues.apache.org/jira/browse/HBASE-4197
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Lars Hofhansl
 Attachments: 4197.txt


 Returning just an InternalScanner from RegionObserver.{pre|post}OpenScanner 
 leads to the following exception when using the scanner.
 java.io.IOException: InternalScanner implementation is expected to be 
 HRegion.RegionScanner.
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2023)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:616)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:314)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1225)
 The problem is in HRegionServer.next(...):
 InternalScanner s = this.scanners.get(scannerName);
 ...
   // Call coprocessor. Get region info from scanner.
   HRegion region = null;
   if (s instanceof HRegion.RegionScanner) {
     HRegion.RegionScanner rs = (HRegion.RegionScanner) s;
     region = getRegion(rs.getRegionName().getRegionName());
   } else {
     throw new IOException("InternalScanner implementation is expected " +
         "to be HRegion.RegionScanner.");
   }
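
The effect of that instanceof check on a wrapping scanner can be shown with a reduced model. Everything below is hypothetical: the two interfaces only mimic the relationship between InternalScanner and HRegion.RegionScanner, and getRegionName returns a String for brevity.

```java
import java.io.IOException;

/** Hypothetical reduction of the scanner type check; not the HBase interfaces. */
final class ScannerTypeSketch {
  interface InternalScanner { boolean next(); }
  interface RegionScanner extends InternalScanner { String getRegionName(); }

  /** Scanner as the region itself would create it. */
  static final class BaseScanner implements RegionScanner {
    public boolean next() { return false; }
    public String getRegionName() { return "region-1"; }
  }

  /** Coprocessor-style wrapper that only implements InternalScanner. */
  static final class PlainWrapper implements InternalScanner {
    private final InternalScanner delegate;
    PlainWrapper(InternalScanner delegate) { this.delegate = delegate; }
    public boolean next() { return delegate.next(); }
  }

  /** Wrapper that also forwards the region name, so it passes the check. */
  static final class RegionAwareWrapper implements RegionScanner {
    private final RegionScanner delegate;
    RegionAwareWrapper(RegionScanner delegate) { this.delegate = delegate; }
    public boolean next() { return delegate.next(); }
    public String getRegionName() { return delegate.getRegionName(); }
  }

  /** Mirrors the check in the snippet above. */
  static String regionOf(InternalScanner s) throws IOException {
    if (s instanceof RegionScanner) {
      return ((RegionScanner) s).getRegionName();
    }
    throw new IOException("InternalScanner implementation is expected "
        + "to be HRegion.RegionScanner.");
  }

  /** Convenience for callers that only need a yes/no answer. */
  static boolean accepted(InternalScanner s) {
    try {
      regionOf(s);
      return true;
    } catch (IOException e) {
      return false;
    }
  }
}
```

A PlainWrapper around a perfectly valid scanner is rejected, while a RegionAwareWrapper is accepted. That is the argument for letting the coprocessor hooks take and return the narrower type.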


[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084311#comment-13084311
 ] 

jirapos...@reviews.apache.org commented on HBASE-4027:
--



bq.  On 2011-08-12 17:26:40, Todd Lipcon wrote:
bq.   
src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java, lines 
101-103
bq.   https://reviews.apache.org/r/1214/diff/8/?file=31771#file31771line101
bq.  
bq.   this can race against getBlock() though:
bq.   
bq.   Thread A: backingMap.get(key) returns object
bq.   Thread B: put() returns same object
bq.   Thread B: free(object)
bq.   Thread A: use object. boom?
bq.   
bq.   putIfAbsent shouldn't be any slower than put, may as well make use 
of it

Ah, gotcha! I see it now. Fixed.
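
For readers without the diff at hand, here is a minimal sketch of the putIfAbsent approach Todd suggests. The class PutIfAbsentSketch and its method names are hypothetical and the real SingleSizeCache code is not reproduced; only ConcurrentMap.putIfAbsent is the actual JDK call.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical cache fragment; illustrates putIfAbsent versus put. */
final class PutIfAbsentSketch {
  private final ConcurrentMap<String, byte[]> backingMap = new ConcurrentHashMap<>();

  /**
   * Inserts the block only if the key is not cached yet.
   * An existing entry is never replaced, so there is no displaced
   * object to free while a concurrent reader may still hold it.
   */
  boolean cacheBlock(String key, byte[] block) {
    return backingMap.putIfAbsent(key, block) == null;
  }

  byte[] getBlock(String key) {
    return backingMap.get(key);
  }
}
```

With put(), the second writer would get the first object back and free it, which is exactly the window in which Thread A above still uses it.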


- Li


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1214/#review1423
---


On 2011-08-12 08:41:37, Li Pi wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1214/
bq.  ---
bq.  
bq.  (Updated 2011-08-12 08:41:37)
bq.  
bq.  
bq.  Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, Jonathan 
Gray, and Li Pi.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Review request - I apparently can't edit tlipcon's earlier posting of my 
diff, so creating a new one.
bq.  
bq.  
bq.  This addresses bug HBase-4027.
bq.  https://issues.apache.org/jira/browse/HBase-4027
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.conf/hbase-env.sh 2d55d27 
bq.src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 2d4002c 
bq.src/main/java/org/apache/hadoop/hbase/io/hfile/CacheStats.java 
PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java 
PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 097dc50 
bq.src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java 
1338453 
bq.src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java 
886c31d 
bq.src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java 
PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/io/hfile/slab/Slab.java 
PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java 
PRE-CREATION 
bq.
src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabItemEvictionWatcher.java
 PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
7a917da 
bq.src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
7b7bf73 
bq.src/main/java/org/apache/hadoop/hbase/util/DirectMemoryUtils.java 
PRE-CREATION 
bq.
src/test/java/org/apache/hadoop/hbase/io/hfile/HFileBlockCacheTestUtils.java 
PRE-CREATION 
bq.
src/test/java/org/apache/hadoop/hbase/io/hfile/SingleSizeCacheTestUtils.java 
PRE-CREATION 
bq.
src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSingleSizeCache.java 
PRE-CREATION 
bq.src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlab.java 
PRE-CREATION 
bq.src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlabCache.java 
PRE-CREATION 
bq.src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 
4387170 
bq.  
bq.  Diff: https://reviews.apache.org/r/1214/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Ran benchmarks against it in HBase standalone mode. Wrote test cases for 
all classes, multithreaded test cases exist for the cache.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Li
bq.  
bq.



 Enable direct byte buffers LruBlockCache
 

 Key: HBASE-4027
 URL: https://issues.apache.org/jira/browse/HBASE-4027
 Project: HBase
  Issue Type: Improvement
Reporter: Jason Rutherglen
Assignee: Li Pi
Priority: Minor
 Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, 
 HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027-v10.5.diff, 
 hbase-4027-v10.diff, hbase-4027v10.6.diff, hbase-4027v6.diff, 
 hbase4027v11.5.diff, hbase4027v11.diff, slabcachepatch.diff, 
 slabcachepatchv2.diff, slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, 
 slabcachepatchv3.diff, slabcachepatchv4.5.diff, slabcachepatchv4.diff


 Java offers the creation of direct byte buffers which are allocated outside 
 of the heap.
 They need to be manually free'd, which can be accomplished using an 
 documented {{clean}} method.
 The feature will be optional.  After implementing, we can benchmark for 
 differences in speed and garbage collection observances.
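
As a reminder of what "outside of the heap" means in code, the following sketch copies a block into a direct buffer using only the public java.nio API. The class DirectBufferSketch is hypothetical and is not part of the patch; explicit freeing is left out because it depends on JDK-internal classes.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper; shows allocation of an off-heap buffer only. */
final class DirectBufferSketch {

  /** Copies a block into freshly allocated direct memory and returns a read-ready buffer. */
  static ByteBuffer copyOffHeap(byte[] block) {
    ByteBuffer buf = ByteBuffer.allocateDirect(block.length);
    buf.put(block);
    buf.flip(); // position back to 0, limit at the end of the written data
    return buf;
  }
}
```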


[jira] [Commented] (HBASE-4015) Refactor the TimeoutMonitor to make it less racy

2011-08-12 Thread Jonathan Gray (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084333#comment-13084333
 ] 

Jonathan Gray commented on HBASE-4015:
--

Sorry I'm a little late to this discussion but I like the idea of not adding a 
new state.  Instead, we can just pass the znode version number in the RPC to 
the regionservers.  Or encode the servername in the znode.
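
The version-number idea can be sketched with an in-memory stand-in for the znode. VersionedZNodeSketch is hypothetical; in ZooKeeper itself the conditional write would be setData(path, data, version), which fails on a version mismatch.

```java
/** Hypothetical in-memory model of a versioned znode; not ZooKeeper code. */
final class VersionedZNodeSketch {
  private String state = "OFFLINE";
  private int version = 0;

  synchronized int getVersion() { return version; }
  synchronized String getState() { return state; }

  /** Unconditional write, e.g. the master forcing the node back to OFFLINE. */
  synchronized int forceState(String newState) {
    state = newState;
    return ++version;
  }

  /**
   * Conditional write: succeeds only if the node still has the version
   * the caller was told to expect (for example, passed along in the open RPC).
   */
  synchronized boolean transition(int expectedVersion, String newState) {
    if (version != expectedVersion) {
      return false; // someone else touched the node; the caller should abort
    }
    state = newState;
    version++;
    return true;
  }
}
```

A regionserver holding a stale version cannot move the node to OPENING after the master has re-offlined it, and no extra REALLOCATE state is needed to detect that.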

 Refactor the TimeoutMonitor to make it less racy
 

 Key: HBASE-4015
 URL: https://issues.apache.org/jira/browse/HBASE-4015
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.90.3
Reporter: Jean-Daniel Cryans
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.92.0

 Attachments: HBASE-4015_1_trunk.patch, Timeoutmonitor with state 
 diagrams.pdf


 The current implementation of the TimeoutMonitor acts like a race condition 
 generator, mostly making things worse rather than better. It does its own 
 thing for a while without caring for what's happening in the rest of the 
 master.
 The first thing that needs to happen is that the regions should not be 
 processed in one big batch, because that sometimes can take minutes to 
 process (meanwhile a region that timed out opening might have opened, then 
 what happens is it will be reassigned by the TimeoutMonitor generating the 
 never ending PENDING_OPEN situation).
 Those operations should also be done more atomically, although I'm not sure 
 how to do it in a scalable way in this case.


[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084358#comment-13084358
 ] 

jirapos...@reviews.apache.org commented on HBASE-4027:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1214/
---

(Updated 2011-08-12 20:26:21.230751)


Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, Jonathan Gray, 
and Li Pi.


Changes
---

Fixed another broken test case (didn't reset buffer position before doing 
compare) and fixed race.
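
The buffer-position pitfall is easy to reproduce: ByteBuffer.equals only compares the remaining bytes, so a buffer that was just written to compares as empty. The helper below is a hypothetical illustration, not the test code from the patch.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper; compares buffers independently of their positions. */
final class BufferCompareSketch {

  /** Compares the contents from index 0 up to each buffer's limit. */
  static boolean sameContents(ByteBuffer a, ByteBuffer b) {
    // duplicate() shares the bytes but has its own position,
    // so the callers' buffers are left untouched.
    ByteBuffer x = a.duplicate();
    x.rewind();
    ByteBuffer y = b.duplicate();
    y.rewind();
    return x.equals(y);
  }
}
```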


Summary
---

Review request - I apparently can't edit tlipcon's earlier posting of my diff, 
so creating a new one.


This addresses bug HBase-4027.
https://issues.apache.org/jira/browse/HBase-4027


Diffs (updated)
-

  conf/hbase-env.sh 2d55d27 
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 2d4002c 
  src/main/java/org/apache/hadoop/hbase/io/hfile/CacheStats.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 097dc50 
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java 1338453 
  src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java 886c31d 
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/Slab.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java 
PRE-CREATION 
  
src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabItemEvictionWatcher.java
 PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 7a917da 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 7b7bf73 
  src/main/java/org/apache/hadoop/hbase/util/DirectMemoryUtils.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/HFileBlockCacheTestUtils.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/SingleSizeCacheTestUtils.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSingleSizeCache.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlab.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlabCache.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 4387170 

Diff: https://reviews.apache.org/r/1214/diff


Testing
---

Ran benchmarks against it in HBase standalone mode. Wrote test cases for all 
classes, multithreaded test cases exist for the cache.


Thanks,

Li



 Enable direct byte buffers LruBlockCache
 

 Key: HBASE-4027
 URL: https://issues.apache.org/jira/browse/HBASE-4027
 Project: HBase
  Issue Type: Improvement
Reporter: Jason Rutherglen
Assignee: Li Pi
Priority: Minor
 Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, 
 HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027-v10.5.diff, 
 hbase-4027-v10.diff, hbase-4027v10.6.diff, hbase-4027v6.diff, 
 hbase4027v11.5.diff, hbase4027v11.diff, slabcachepatch.diff, 
 slabcachepatchv2.diff, slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, 
 slabcachepatchv3.diff, slabcachepatchv4.5.diff, slabcachepatchv4.diff


 Java offers the creation of direct byte buffers which are allocated outside 
 of the heap.
 They need to be manually free'd, which can be accomplished using an 
 documented {{clean}} method.
 The feature will be optional.  After implementing, we can benchmark for 
 differences in speed and garbage collection observances.


[jira] [Updated] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-08-12 Thread Li Pi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Pi updated HBASE-4027:
-

Attachment: hbase4027v11.6.diff

fixed broken test and race condition.

 Enable direct byte buffers LruBlockCache
 

 Key: HBASE-4027
 URL: https://issues.apache.org/jira/browse/HBASE-4027
 Project: HBase
  Issue Type: Improvement
Reporter: Jason Rutherglen
Assignee: Li Pi
Priority: Minor
 Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, 
 HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027-v10.5.diff, 
 hbase-4027-v10.diff, hbase-4027v10.6.diff, hbase-4027v6.diff, 
 hbase4027v11.5.diff, hbase4027v11.6.diff, hbase4027v11.diff, 
 slabcachepatch.diff, slabcachepatchv2.diff, slabcachepatchv3.1.diff, 
 slabcachepatchv3.2.diff, slabcachepatchv3.diff, slabcachepatchv4.5.diff, 
 slabcachepatchv4.diff


 Java offers the creation of direct byte buffers which are allocated outside 
 of the heap.
 They need to be manually free'd, which can be accomplished using an 
 documented {{clean}} method.
 The feature will be optional.  After implementing, we can benchmark for 
 differences in speed and garbage collection observances.


[jira] [Commented] (HBASE-4196) TableRecordReader may skip first row of region


[ 
https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084421#comment-13084421
 ] 

Ted Yu commented on HBASE-4196:
---

Patch looks good.
There're two TableRecordReaderImpl.java files, one under mapred and one under 
mapreduce.
Both of them should be fixed.


 TableRecordReader may skip first row of region
 --

 Key: HBASE-4196
 URL: https://issues.apache.org/jira/browse/HBASE-4196
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.4
Reporter: Jan Lukavsky
Assignee: Ming Ma
 Attachments: HBASE-4196-trunk.patch


 After the following scenario, the first record of region is skipped, without 
 being sent to Mapper:
  - the reader is initialized with TableRecordReader.init()
  - then nextKeyValue is called, causing call to scanner.next() - here 
 ScannerTimeoutException occurs
  - the scanner is restarted by a call to restart() and then *two* calls to 
 scanner.next() occur, so the first row is lost
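
The scenario can be modelled with a list-backed scanner. RestartingReaderSketch is hypothetical, the timeout is simulated with an unchecked exception, and the guard shown (skip only if a row was already delivered) is one possible fix, not necessarily what the attached patch does.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical model of a record reader that restarts its scanner; not HBase code. */
final class RestartingReaderSketch {
  private final List<String> rows;
  private Iterator<String> scanner;
  private String lastRow = null; // last row handed to the mapper, if any
  private boolean failNextCall;

  RestartingReaderSketch(List<String> rows, boolean failFirstCall) {
    this.rows = rows;
    this.scanner = rows.iterator();
    this.failNextCall = failFirstCall;
  }

  private String scannerNext() {
    if (failNextCall) {
      failNextCall = false;
      throw new IllegalStateException("simulated scanner timeout");
    }
    return scanner.hasNext() ? scanner.next() : null;
  }

  /** Reopens the scanner at lastRow, or at the first row if nothing was read yet. */
  private void restart() {
    int from = (lastRow == null) ? 0 : rows.indexOf(lastRow);
    scanner = rows.subList(from, rows.size()).iterator();
  }

  String nextRow() {
    String row;
    try {
      row = scannerNext();
    } catch (IllegalStateException timeout) {
      restart();
      if (lastRow != null) {
        scannerNext(); // skip the row that was already delivered
      }
      row = scannerNext();
    }
    if (row != null) {
      lastRow = row;
    }
    return row;
  }
}
```

Without the lastRow != null guard, a timeout on the very first call would skip row "a" of a region holding a, b, c, which matches the report.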


[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084425#comment-13084425
 ] 

jirapos...@reviews.apache.org commented on HBASE-4027:
--



bq.  On 2011-08-12 21:52:43, Ted Yu wrote:
bq.   
src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java, line 
38
bq.   https://reviews.apache.org/r/1214/diff/11/?file=32400#file32400line38
bq.  
bq.   Still some white spaces to remove.

Got it.


bq.  On 2011-08-12 21:52:43, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/io/hfile/slab/Slab.java, line 37
bq.   https://reviews.apache.org/r/1214/diff/11/?file=32401#file32401line37
bq.  
bq.   Incorrect class name.

Doh. Fixed.


- Li


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1214/#review1430
---





 Enable direct byte buffers LruBlockCache
 

 Key: HBASE-4027
 URL: https://issues.apache.org/jira/browse/HBASE-4027
 Project: HBase
  Issue Type: Improvement
Reporter: Jason Rutherglen
Assignee: Li Pi
Priority: Minor
 Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, 
 HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027-v10.5.diff, 
 hbase-4027-v10.diff, hbase-4027v10.6.diff, hbase-4027v6.diff, 
 hbase4027v11.5.diff, hbase4027v11.6.diff, hbase4027v11.diff, 
 slabcachepatch.diff, slabcachepatchv2.diff, slabcachepatchv3.1.diff, 
 slabcachepatchv3.2.diff, slabcachepatchv3.diff, slabcachepatchv4.5.diff, 
 slabcachepatchv4.diff


 Java offers the creation of direct byte buffers which are allocated outside 
 of the heap.
 They need to be manually free'd, which can be accomplished using an 
 documented {{clean}} method.
 The feature will be optional.  After implementing, we can benchmark for 
 differences in speed and garbage collection observances.


[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084428#comment-13084428
 ] 

jirapos...@reviews.apache.org commented on HBASE-4027:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1214/
---

(Updated 2011-08-12 22:30:10.310736)


Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, Jonathan Gray, 
and Li Pi.


Changes
---

fixed two bugs as per ted yu's reviews.


Summary
---

Review request - I apparently can't edit tlipcon's earlier posting of my diff, 
so creating a new one.


This addresses bug HBase-4027.
https://issues.apache.org/jira/browse/HBase-4027


Diffs (updated)
-

  conf/hbase-env.sh 2d55d27 
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 2d4002c 
  src/main/java/org/apache/hadoop/hbase/io/hfile/CacheStats.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 097dc50 
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java 1338453 
  src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java 886c31d 
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/Slab.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java 
PRE-CREATION 
  
src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabItemEvictionWatcher.java
 PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 7a917da 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 7b7bf73 
  src/main/java/org/apache/hadoop/hbase/util/DirectMemoryUtils.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/HFileBlockCacheTestUtils.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/SingleSizeCacheTestUtils.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSingleSizeCache.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlab.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlabCache.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 4387170 

Diff: https://reviews.apache.org/r/1214/diff


Testing
---

Ran benchmarks against it in HBase standalone mode. Wrote test cases for all 
classes, multithreaded test cases exist for the cache.


Thanks,

Li



 Enable direct byte buffers LruBlockCache
 

 Key: HBASE-4027
 URL: https://issues.apache.org/jira/browse/HBASE-4027
 Project: HBase
  Issue Type: Improvement
Reporter: Jason Rutherglen
Assignee: Li Pi
Priority: Minor
 Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, 
 HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027-v10.5.diff, 
 hbase-4027-v10.diff, hbase-4027v10.6.diff, hbase-4027v6.diff, 
 hbase4027v11.5.diff, hbase4027v11.6.diff, hbase4027v11.7.diff, 
 hbase4027v11.diff, slabcachepatch.diff, slabcachepatchv2.diff, 
 slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, slabcachepatchv3.diff, 
 slabcachepatchv4.5.diff, slabcachepatchv4.diff


 Java offers the creation of direct byte buffers which are allocated outside 
 of the heap.
 They need to be manually free'd, which can be accomplished using an 
 documented {{clean}} method.
 The feature will be optional.  After implementing, we can benchmark for 
 differences in speed and garbage collection observances.


[jira] [Updated] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-08-12 Thread Li Pi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Pi updated HBASE-4027:
-

Attachment: hbase4027v11.7.diff

fixed two typos.

 Enable direct byte buffers LruBlockCache
 

 Key: HBASE-4027
 URL: https://issues.apache.org/jira/browse/HBASE-4027
 Project: HBase
  Issue Type: Improvement
Reporter: Jason Rutherglen
Assignee: Li Pi
Priority: Minor
 Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, 
 HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027-v10.5.diff, 
 hbase-4027-v10.diff, hbase-4027v10.6.diff, hbase-4027v6.diff, 
 hbase4027v11.5.diff, hbase4027v11.6.diff, hbase4027v11.7.diff, 
 hbase4027v11.diff, slabcachepatch.diff, slabcachepatchv2.diff, 
 slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, slabcachepatchv3.diff, 
 slabcachepatchv4.5.diff, slabcachepatchv4.diff


 Java offers the creation of direct byte buffers which are allocated outside 
 of the heap.
 They need to be manually free'd, which can be accomplished using an 
 documented {{clean}} method.
 The feature will be optional.  After implementing, we can benchmark for 
 differences in speed and garbage collection observances.


[jira] [Commented] (HBASE-2399) Forced splits only act on the first family in a table

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084443#comment-13084443
]

jirapos...@reviews.apache.org commented on HBASE-2399:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1484/
---

Review request for hbase.

Summary
---

1. Add tests for forcesplit multi-column-family scenarios.
2. Modify HRegion so that it picks splitpoint based on largest store, instead
of the first splittable store. It applies to both forcesplit and automatic
split.

This addresses bug hbase-2399.
https://issues.apache.org/jira/browse/hbase-2399

Diffs
-

http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
1157283

http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
1157283

http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
1157283

Diff: https://reviews.apache.org/r/1484/diff

Testing
---

Thanks,

Ming

Forced splits only act on the first family in a table
-

Key: HBASE-2399
URL: https://issues.apache.org/jira/browse/HBASE-2399
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.20.3
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
Labels: moved_from_0_20_5
Fix For: 0.92.0

Attachments: HBASE-2399-test-v1.patch

While working on a patch for HBASE-2375, I came across a few bugs in the
existing code related to splits.
If a user triggers a manual split, it flips a forceSplit boolean to true and
then triggers a compaction (this is very similar to my current implementation
for HBASE-2375). However, the forceSplit boolean is flipped back to false at
the beginning of Store.compact(). So the force split only acts on the first
family in the table. If that Store is not splittable for some reason (it is
empty or has only one row), then the entire region will not be split,
regardless of what is in other families.
Even if there is data in the first family, the midKey is determined based
solely on that family. If it has two rows and the next family has 1M rows,
we pick the split key based on the two rows.
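
The "largest store" rule from the review summary can be sketched as follows. SplitPointSketch and its Store class are hypothetical; a null midKey stands for a store that cannot be split (empty or a single row).

```java
/** Hypothetical model of choosing a split point across column families; not HRegion code. */
final class SplitPointSketch {

  static final class Store {
    final long size;
    final String midKey; // null means this store is not splittable

    Store(long size, String midKey) {
      this.size = size;
      this.midKey = midKey;
    }
  }

  /** Returns the mid key of the largest splittable store, or null if none can split. */
  static String pickSplitPoint(Iterable<Store> stores) {
    Store best = null;
    for (Store s : stores) {
      if (s.midKey == null) {
        continue; // empty or single-row store: ignore instead of giving up
      }
      if (best == null || s.size > best.size) {
        best = s;
      }
    }
    return best == null ? null : best.midKey;
  }
}
```

In the example from the description, a two-row family and a 1M-row family, the split key comes from the 1M-row family regardless of which family is listed first.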


[jira] [Commented] (HBASE-4195) Possible unconsistency in a memstore read after a reseek, possible performance improvement

2011-08-12 Thread nkeywal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1308#comment-1308
 ] 

nkeywal commented on HBASE-4195:


With the current implementation, setting the config RESEEKMAX_KEY to -1 (read 
with conf.getInt(RESEEKMAX_KEY, RESEEKMAX_DEFAULT);) will have this effect. 
Disclaimer: I did not test it.

 Possible unconsistency in a memstore read after a reseek, possible 
 performance improvement
 --

 Key: HBASE-4195
 URL: https://issues.apache.org/jira/browse/HBASE-4195
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
 Environment: all
Reporter: nkeywal
Priority: Critical

 This follows the discussion around HBASE-3855, and the random errors (20% 
 failure on trunk) on the unit test 
 org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting
 I saw some points related to numIterReseek, used in the 
 MemStoreScanner#getNext (line 690):
 {noformat}679 protected KeyValue getNext(Iterator<KeyValue> it) {
 680   KeyValue ret = null;
 681   long readPoint = ReadWriteConsistencyControl.getThreadReadPoint();
 682   //DebugPrint.println( " MS@" + hashCode() + ": threadpoint = " + readPoint);
 683
 684   while (ret == null && it.hasNext()) {
 685     KeyValue v = it.next();
 686     if (v.getMemstoreTS() <= readPoint) {
 687       // keep it.
 688       ret = v;
 689     }
 690     numIterReseek--;
 691     if (numIterReseek == 0) {
 692       break;
 693     }
 694   }
 695   return ret;
 696 }{noformat}
 This function is called by seek, reseek, and next. The numIterReseek is only 
 useful for reseek.
 There are some issues. I am not totally sure it's the root cause of the test 
 case error, but it could partly explain the randomness of the error, and one 
 point is for sure a bug.
 1) In getNext, numIterReseek is decreased, then compared to zero. The seek 
 function sets numIterReseek to zero before calling getNext. It means that the 
 value will actually be negative, hence the test will always fail, and the 
 loop will continue. It is the expected behaviour, but it's quite smart.
 2) In reseek, numIterReseek is not set between the loops on the two 
 iterators. If numIterReseek equals zero after the loop on the first 
 one, the loop on the second one will never call seek, as numIterReseek will 
 be negative.
 3) Still in reseek, the test to call seek is (kvsetNextRow == null && 
 numIterReseek == 0). In other words, if kvsetNextRow is not null when 
 numIterReseek equals zero, numIterReseek will become negative at the 
 next iteration and seek will never be called.
 4) You can have side effects if reseek ends with numIterReseek > 0: the 
 following calls to the next function will decrease numIterReseek to zero, 
 and getNext will break instead of continuing the loop. As a result, later 
 calls to next() may return null or not depending on how the default value 
 for numIterReseek is configured.
 To check if the issue comes from point 4, you can set numIterReseek to 
 zero before returning in reseek:
 {noformat}  numIterReseek = 0;
   return (kvsetNextRow != null || snapshotNextRow != null);
 }{noformat}
 On my env, on trunk, it seems to work, but as it's random I am not really 
 sure. I also had to modify the test (I added a loop) to make it fail more 
 often; the original test was working quite well here.
 It has to be confirmed that this totally fixes (it could be partial or 
 unrelated) 
 org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting 
 before implementing a complete solution.
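
 Points 1 and 4 can be reproduced outside HBase with a small model of the loop. This is only a sketch: the class is hypothetical, integers stand in for KeyValues, and a plain threshold stands in for the read point check.

```java
import java.util.Arrays;
import java.util.Iterator;

final class ReseekCounterSketch {
    int numIterReseek; // hypothetical stand-in for the MemStoreScanner field

    // Same loop shape as the quoted getNext; a value is visible when it is <= readPoint.
    Integer getNext(Iterator<Integer> it, int readPoint) {
        Integer ret = null;
        while (ret == null && it.hasNext()) {
            Integer v = it.next();
            if (v <= readPoint) {
                ret = v;
            }
            numIterReseek--;
            if (numIterReseek == 0) {
                break;
            }
        }
        return ret;
    }

    public static void main(String[] args) {
        ReseekCounterSketch s = new ReseekCounterSketch();

        // Point 1: the counter starts at 0, goes negative, and never stops the loop.
        s.numIterReseek = 0;
        System.out.println(s.getNext(Arrays.asList(9, 9, 1).iterator(), 5)); // prints 1

        // Point 4: a leftover positive counter makes the loop break too early.
        s.numIterReseek = 2;
        System.out.println(s.getNext(Arrays.asList(9, 9, 1).iterator(), 5)); // prints null
    }
}
```

 The second call returns null although a visible value exists further along the iterator, which is the kind of input-independent difference that would make the test fail only some of the time.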


[jira] [Commented] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner

2011-08-12 Thread Mingjie Lai (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084451#comment-13084451
 ] 

Mingjie Lai commented on HBASE-4197:


@lars 

Yes, RegionScanner had better be an interface instead of a class, for easier 
extension. Overall the patch looks good to me. 

Can you finish the patch and post it to reviewboard? 

 RegionServer expects all scanner to be subclasses of HRegion.RegionScanner
 --

 Key: HBASE-4197
 URL: https://issues.apache.org/jira/browse/HBASE-4197
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Lars Hofhansl
 Attachments: 4197.txt


 Returning just an InternalScanner from RegionObserver.{pre|post}OpenScanner 
 leads to the following exception when using the scanner.
 java.io.IOException: InternalScanner implementation is expected to be HRegion.RegionScanner.
   at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2023)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:616)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:314)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1225)
 The problem is in HRegionServer.next(...):
 {code}
 InternalScanner s = this.scanners.get(scannerName);
 ...
 // Call coprocessor. Get region info from scanner.
 HRegion region = null;
 if (s instanceof HRegion.RegionScanner) {
   HRegion.RegionScanner rs = (HRegion.RegionScanner) s;
   region = getRegion(rs.getRegionName().getRegionName());
 } else {
   throw new IOException("InternalScanner implementation is expected " +
       "to be HRegion.RegionScanner.");
 }
 {code}
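
 One way to avoid the instanceof check is to make the region-aware scanner contract an interface, so that a scanner returned by a coprocessor can wrap the original and still expose the region. The sketch below uses hypothetical, simplified types throughout; none of these names come from HBase or from the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

// All names here are hypothetical simplifications, not the HBase types.
interface InternalScannerSketch {
    boolean next();
}

// The region-aware contract as an interface, so wrappers can implement it.
interface RegionScannerSketch extends InternalScannerSketch {
    String getRegionName();
}

// A scanner a coprocessor might return: it wraps another scanner and delegates.
final class WrappingScannerSketch implements RegionScannerSketch {
    private final RegionScannerSketch delegate;
    int calls = 0;

    WrappingScannerSketch(RegionScannerSketch delegate) {
        this.delegate = delegate;
    }

    @Override public boolean next() {
        calls++;
        return delegate.next();
    }

    @Override public String getRegionName() {
        return delegate.getRegionName();
    }
}

final class ScannerLookupSketch {
    // No instanceof or cast needed: the map's value type already exposes the region.
    static String regionFor(Map<String, RegionScannerSketch> scanners, String name) {
        return scanners.get(name).getRegionName();
    }

    public static void main(String[] args) {
        RegionScannerSketch base = new RegionScannerSketch() {
            @Override public boolean next() { return false; }
            @Override public String getRegionName() { return "region-1"; }
        };
        Map<String, RegionScannerSketch> scanners = new HashMap<String, RegionScannerSketch>();
        scanners.put("s1", new WrappingScannerSketch(base));
        System.out.println(regionFor(scanners, "s1")); // prints region-1
    }
}
```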


[jira] [Updated] (HBASE-4196) TableRecordReader may skip first row of region

2011-08-12 Thread Ming Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HBASE-4196:
---

Attachment: HBASE-4196-trunk.patch

Thanks. Here is the update. Also, please note that the mapred version used to 
handle only UnknownScannerException. It is fixed to handle IOException.


 TableRecordReader may skip first row of region
 --

 Key: HBASE-4196
 URL: https://issues.apache.org/jira/browse/HBASE-4196
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.4
Reporter: Jan Lukavsky
Assignee: Ming Ma
 Attachments: HBASE-4196-trunk.patch, HBASE-4196-trunk.patch


 After the following scenario, the first record of a region is skipped without 
 being sent to the Mapper:
  - the reader is initialized with TableRecordReader.init()
  - then nextKeyValue is called, causing a call to scanner.next(); here a 
 ScannerTimeoutException occurs
  - the scanner is restarted by a call to restart() and then *two* calls to 
 scanner.next() occur, so the first row is lost
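
 The scenario above can be modelled with a simulated scanner. This is a sketch only: the class, the in-memory row list and the use of IllegalStateException for the timeout are all assumptions, and the guard shown (skip a row after restart only when a row was already returned) is one plausible fix, not necessarily what the attached patch does.

```java
import java.util.Arrays;
import java.util.List;

final class RestartingReaderSketch {
    private final List<String> rows;         // hypothetical region contents, in order
    private int cursor = 0;                  // position of the simulated scanner
    private String lastSuccessfulRow = null; // last row handed to the caller
    private boolean failNextCall;

    RestartingReaderSketch(List<String> rows, boolean failFirstCall) {
        this.rows = rows;
        this.failNextCall = failFirstCall;
    }

    // Simulated scanner.next(): throws once to mimic a scanner timeout.
    private String scannerNext() {
        if (failNextCall) {
            failNextCall = false;
            throw new IllegalStateException("simulated scanner timeout");
        }
        return cursor < rows.size() ? rows.get(cursor++) : null;
    }

    // Reposition the simulated scanner at the given row, or at the start if null.
    private void restart(String fromRow) {
        cursor = (fromRow == null) ? 0 : rows.indexOf(fromRow);
    }

    String nextRow() {
        String row;
        try {
            row = scannerNext();
        } catch (IllegalStateException e) {
            restart(lastSuccessfulRow);
            // Skip the restart row only if it was already handed to the caller.
            if (lastSuccessfulRow != null) {
                scannerNext();
            }
            row = scannerNext();
        }
        if (row != null) {
            lastSuccessfulRow = row;
        }
        return row;
    }

    public static void main(String[] args) {
        RestartingReaderSketch reader =
            new RestartingReaderSketch(Arrays.asList("r1", "r2"), true);
        System.out.println(reader.nextRow()); // prints r1
        System.out.println(reader.nextRow()); // prints r2
    }
}
```

 Without the lastSuccessfulRow check, the unconditional extra scannerNext() after restart would consume r1 and the caller would see r2 first.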


[jira] [Commented] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner


[ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084478#comment-13084478
 ] 

Lars Hofhansl commented on HBASE-4197:
--

Hey Mingjie,

how do I do that? Is there some documentation where I can read about the 
process?

Thanks.

-- Lars





 RegionServer expects all scanner to be subclasses of HRegion.RegionScanner
 --

 Key: HBASE-4197
 URL: https://issues.apache.org/jira/browse/HBASE-4197
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Lars Hofhansl
 Attachments: 4197.txt



[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084479#comment-13084479
 ] 

jirapos...@reviews.apache.org commented on HBASE-4014:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/#review1433
---


Nice work, Eugene.  I think we're getting close.  Just two suggested 
improvements below.

The main question still open to debate, I think, is whether or not aborting the 
server on unhandled exceptions is appropriate.

On the one hand, aborting takes the fail-fast approach and makes buggy 
coprocessors much more visible.  It's a lot more likely that a bug will be 
noticed and fixed if it brings down a region server!

On the other hand, I think coprocessors already pose enough of a stability risk 
to a cluster.  I think we should be working to minimize that by containing the 
impact that a buggy coprocessor can have.  If the coprocessor really wants or 
needs to trigger an abort, it can already do so, since 
(Master|RegionServer)Services extend Server, which extends Abortable.

I think I'd be more in favor of removing the coprocessor from the active set 
(we should make this as visible as possible so it's clear the coprocessor is no 
longer active), or at least wrapping the exception in a DoNotRetryIOException 
and communicating it back to the client?  Maybe both?

I guess I'd be okay with a configuration option to abort on error (I think a 
single config option is sufficient), as long as it's disabled by default.  But 
that would still imply we need some other handling when the option is disabled.
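
To make the alternative concrete, here is a sketch of the policy described above: IOExceptions pass through to the client, and anything else either aborts (only when a config flag is set) or removes the coprocessor from the active set and reports the failure. Everything in it is hypothetical: the class, the flag, and the use of plain IOException where DoNotRetryIOException would be used in HBase.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class CoprocessorErrorPolicySketch {
    private final boolean abortOnError; // hypothetical config flag, off by default
    private final Set<String> active =
        Collections.synchronizedSet(new HashSet<String>());
    boolean aborted = false;            // stand-in for aborting the server

    CoprocessorErrorPolicySketch(boolean abortOnError) {
        this.abortOnError = abortOnError;
    }

    void load(String name) {
        active.add(name);
    }

    boolean isActive(String name) {
        return active.contains(name);
    }

    // Intended to be called from a catch (Throwable) around each coprocessor hook.
    void handleThrowable(String coprocessor, Throwable t) throws IOException {
        if (t instanceof IOException) {
            throw (IOException) t;      // pass IOExceptions through to the client
        }
        if (abortOnError) {
            aborted = true;
            return;
        }
        active.remove(coprocessor);     // contain the damage: stop calling it
        throw new IOException("Coprocessor " + coprocessor + " removed after: " + t, t);
    }

    public static void main(String[] args) {
        CoprocessorErrorPolicySketch policy = new CoprocessorErrorPolicySketch(false);
        policy.load("BuggyObserver");
        try {
            policy.handleThrowable("BuggyObserver", new NullPointerException("bug"));
        } catch (IOException e) {
            System.out.println("client sees: " + e.getMessage());
        }
        System.out.println("still active: " + policy.isActive("BuggyObserver"));
    }
}
```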


src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
https://reviews.apache.org/r/969/#comment3299

I would just synchronize the set here:

Set<String> coprocessorNames = Collections.synchronizedSet(new 
HashSet<String>());
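
One caveat with the synchronized wrapper, sketched below with a hypothetical holder class: single calls such as add() are thread-safe, but iteration still has to hold the set's own lock, as the java.util.Collections documentation requires. That matters if the names are walked when building an abort message.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical holder class; only the Collections calls are real API.
final class CoprocessorNamesSketch {
    private final Set<String> coprocessorNames =
        Collections.synchronizedSet(new HashSet<String>());

    void add(String name) {
        coprocessorNames.add(name); // individual calls are synchronized by the wrapper
    }

    // Iteration over a synchronized wrapper must hold the wrapper's lock manually.
    List<String> snapshot() {
        List<String> copy = new ArrayList<String>();
        synchronized (coprocessorNames) {
            for (String name : coprocessorNames) {
                copy.add(name);
            }
        }
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        CoprocessorNamesSketch names = new CoprocessorNamesSketch();
        names.add("b.Observer");
        names.add("a.Endpoint");
        System.out.println(names.snapshot()); // prints [a.Endpoint, b.Observer]
    }
}
```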



src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
https://reviews.apache.org/r/969/#comment3300

If you move this into loadInstance() then you don't have to duplicate it 
elsewhere, since all the other load methods wind up calling that.


- Gary


On 2011-08-10 22:48:08, Eugene Koontz wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/969/
bq.  ---
bq.  
bq.  (Updated 2011-08-10 22:48:08)
bq.  
bq.  
bq.  Review request for hbase, Gary Helmling and Mingjie Lai.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the presence of coprocessors in logged exceptions
bq.  
bq.  The general gist here is to wrap each of {Master,RegionServer}CoprocessorHost's coprocessor call inside a 
bq.  
bq.  try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }
bq.  
bq.  block. 
bq.  
bq.  handleCoprocessorThrowable() is responsible for either passing 'e' along to the client (if 'e' is an IOException) or, otherwise, aborting the service (Regionserver or Master).
bq.  
bq.  The abort message contains a list of the loaded coprocessors for crash analysis.
bq.  
bq.  
bq.  This addresses bug HBASE-4014.
bq.  https://issues.apache.org/jira/browse/HBASE-4014
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 18ba6e7 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 8beeb68 
bq.    src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java aa930f5 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 23225d7 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java c44da73 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/969/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  patch includes two tests:
bq.  
bq.  TestMasterCoprocessorException.java
bq.  TestRegionServerCoprocessorException.java
bq.  
bq.  both tests pass in my build environment.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Eugene
bq.  
bq.



 Coprocessors: Flag the presence of coprocessors in logged exceptions
 

 Key: HBASE-4014
 URL: https://issues.apache.org/jira/browse/HBASE-4014
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Reporter: Andrew Purtell
Assignee: Eugene Koontz
 Fix

[jira] [Updated] (HBASE-4196) TableRecordReader may skip first row of region

2011-08-12 Thread Ming Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HBASE-4196:
---

Attachment: HBASE-4196-trunk.patch

That is due to the svn flag -w used. I have fixed it.

 TableRecordReader may skip first row of region
 --

 Key: HBASE-4196
 URL: https://issues.apache.org/jira/browse/HBASE-4196
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.4
Reporter: Jan Lukavsky
Assignee: Ming Ma
 Attachments: HBASE-4196-trunk.patch, HBASE-4196-trunk.patch, 
 HBASE-4196-trunk.patch



[jira] [Commented] (HBASE-4190) Coprocessors: pull up some cp constants from cp package to o.a.h.h.HConstants

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084505#comment-13084505
 ] 

jirapos...@reviews.apache.org commented on HBASE-4190:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1461/
---

(Updated 2011-08-13 01:08:04.852897)


Review request for hbase.


Changes
---

Based on stack's comments and an offline discussion with Gary and Andy: 
- pulled the Coprocessor and CoprocessorEnvironment classes up to root level, 
from o.a.h.h.coprocessor to o.a.h.h. 
- kept the cp priority constants in the Coprocessor class.
- moved the htd pattern related constants to HConstants.

What do you think?


Summary
---

Coprocessors: pull up some cp constants from cp package to o.a.h.h.HConstants


This addresses bug HBASE-4190.
https://issues.apache.org/jira/browse/HBASE-4190


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/Coprocessor.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/CoprocessorEnvironment.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/HConstants.java dda254d 
  src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java d835582 
  src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java 2fc8f39 
  src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java 506051d 
  src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java ec88a01 
  src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java 0290bf2 
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java 54ccd6f 
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 18ba6e7 
  src/main/java/org/apache/hadoop/hbase/coprocessor/MasterCoprocessorEnvironment.java 5d8cf4c 
  src/main/java/org/apache/hadoop/hbase/coprocessor/ObserverContext.java 9349d5b 
  src/main/java/org/apache/hadoop/hbase/coprocessor/RegionCoprocessorEnvironment.java da8076c 
  src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java cfbb29d 
  src/main/java/org/apache/hadoop/hbase/coprocessor/WALCoprocessorEnvironment.java 6580c2c 
  src/main/java/org/apache/hadoop/hbase/coprocessor/WALObserver.java b086747 
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java c44da73 
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCoprocessorHost.java 03df574 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java a81ff84 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java 36816e8 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java c85146a 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java 0ab1339 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java 6d31d70 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestWALObserver.java d9f6e5f 
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java b4c407b 

Diff: https://reviews.apache.org/r/1461/diff


Testing
---

TestClassLoading passed locally.


Thanks,

Mingjie



 Coprocessors: pull up some cp constants from cp package to o.a.h.h.HConstants
 -

 Key: HBASE-4190
 URL: https://issues.apache.org/jira/browse/HBASE-4190
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Affects Versions: 0.90.4
Reporter: Mingjie Lai
Assignee: Mingjie Lai
Priority: Minor
 Fix For: 0.90.5


 At HBase-3810, stack gave a comment after patch committed:
  This is a bit odd where a class in the parent package has references to a 
  sub package.
  Should these classes or at least their constants be pulled up to be at same 
  level as HTableD?
 Create a new jira where the constants will be pulled from 
 o.a.h.h.regionserver.RegionCoprocessorHost to o.a.h.h.HConstants. 


[jira] [Updated] (HBASE-2399) Forced splits only act on the first family in a table

2011-08-12 Thread Ming Ma (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Ming Ma updated HBASE-2399:
---

Attachment: HBASE-2399-trunk.patch

Fix the issues raised by Ted.

Forced splits only act on the first family in a table
-

Attachments: HBASE-2399-test-v1.patch, HBASE-2399-trunk.patch


[jira] [Commented] (HBASE-2399) Forced splits only act on the first family in a table

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084509#comment-13084509
 ] 

jirapos...@reviews.apache.org commented on HBASE-2399:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1484/#review1438
---



http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
https://reviews.apache.org/r/1484/#comment3332

whitespace



http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
https://reviews.apache.org/r/1484/#comment

whitespace here and below


- Jonathan


On 2011-08-12 22:58:55, Ming Ma wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1484/
bq.  ---
bq.  
bq.  (Updated 2011-08-12 22:58:55)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  1. Add tests for forcesplit multi-column-family scenarios.
bq.  2. Modify HRegion so that it picks splitpoint based on largest store, instead of the first splittable store. It applies to both forcesplit and automatic split.
bq.  
bq.  
bq.  This addresses bug hbase-2399.
bq.  https://issues.apache.org/jira/browse/hbase-2399
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1157283 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 1157283 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 1157283 
bq.  
bq.  Diff: https://reviews.apache.org/r/1484/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Ming
bq.  
bq.



 Forced splits only act on the first family in a table
 -

 Key: HBASE-2399
 URL: https://issues.apache.org/jira/browse/HBASE-2399
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.20.3
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
  Labels: moved_from_0_20_5
 Fix For: 0.92.0

 Attachments: HBASE-2399-test-v1.patch, HBASE-2399-trunk.patch



[jira] [Updated] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner


 [ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4197:
-

Attachment: 4197-bigger.txt

Slightly larger patch that does away with all casting and instanceof nonsense 
for scanners.

Please let me know if you generally agree with the approach, if so I'll get the 
review started.

 RegionServer expects all scanner to be subclasses of HRegion.RegionScanner
 --

 Key: HBASE-4197
 URL: https://issues.apache.org/jira/browse/HBASE-4197
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Lars Hofhansl
 Attachments: 4197-bigger.txt, 4197.txt



[jira] [Commented] (HBASE-4150) Potentially too many connections may be opened if ThreadLocalPool or RoundRobinPool is used


[ 
https://issues.apache.org/jira/browse/HBASE-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084514#comment-13084514
 ] 

Ted Yu commented on HBASE-4150:
---

Integrated to TRUNK.

Thanks for the continued effort, Karthick.

 Potentially too many connections may be opened if ThreadLocalPool or 
 RoundRobinPool is used
 ---

 Key: HBASE-4150
 URL: https://issues.apache.org/jira/browse/HBASE-4150
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Karthick Sankarachary
 Fix For: 0.92.0

 Attachments: 4150-1.txt, 4150.txt, 5140-2.txt, HBASE-4150-DOC.patch, 
 HBASE-4150_final.patch


 See 'Problem with hbase.client.ipc.pool.type=threadlocal in trunk' discussion 
 started by Lars George.
 From Lars Hofhansl:
 Looking at HBaseClient.getConnection(...) I see this:
 {code}
 synchronized (connections) {
   connection = connections.get(remoteId);
   if (connection == null) {
     connection = new Connection(remoteId);
     connections.put(remoteId, connection);
   }
 }
 {code}
 At the same time PoolMap.ThreadLocalPool.put is defined like this:
 {code}
 public R put(R resource) {
   R previousResource = get();
   if (previousResource == null) {
     ...
     if (poolSize.intValue() >= maxSize) {
       return null;
     }
     ...
   }
 {code}
 So... If the ThreadLocalPool reaches its capacity it always returns null and 
 hence all new threads will create a
 new connection every time getConnection is called!
 I have also verified with a test program that works fine as long as the 
 number of client threads (which include
 the threads in HTable's threadpool of course) is < poolsize. Once that is no 
 longer the case the number of
 connections explodes and the program dies with OOMEs (mostly because each 
 Connection is associated with
 yet another thread).
 It's not clear what should happen, though. Maybe (1) the ThreadLocalPool 
 should not have a limit, or maybe
 (2) allocations past the pool size should throw an exception (i.e. there's a 
 hard limit), or maybe (3) in that case
 a single connection is returned for all threads while the pool is over its 
 limit or (4) we start round robin with the other
 connections in the other thread locals.
 #1 means that the number of client threads needs to be more carefully 
 managed by the client app.
 In this case it would also be somewhat pointless that Connections have their 
 own threads, we just pass stuff
 between threads.
 #2 would work, but puts more logic in the client.
 #3 would lead to hard to debug performance issues.
 And #4 is messy :)
 From Ted Yu:
 For HBaseClient, at least the javadoc doesn't match:
 {code}
    * @param config configuration
    * @return either a {@link PoolType#Reusable} or {@link PoolType#ThreadLocal}
    */
   private static PoolType getPoolType(Configuration config) {
     return PoolType.valueOf(config.get(HConstants.HBASE_CLIENT_IPC_POOL_TYPE),
         PoolType.RoundRobin, PoolType.ThreadLocal);
 {code}
 I think for RoundRobinPool, we shouldn't allow maxSize to be 
 Integer#MAX_VALUE. Otherwise the connection explosion described by Lars may occur.
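
 The interaction Lars describes between getConnection and a full ThreadLocalPool can be reproduced with a small model. This is a sketch under assumptions: the class is a simplified, hypothetical stand-in for PoolMap.ThreadLocalPool and HBaseClient, plain Objects stand in for connections, and threads are run one after another so the count is deterministic.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ThreadLocalPoolSketch {
    private final ThreadLocal<Object> local = new ThreadLocal<Object>();
    private final AtomicInteger poolSize = new AtomicInteger(0);
    private final int maxSize;
    final AtomicInteger connectionsCreated = new AtomicInteger(0);

    ThreadLocalPoolSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    Object get() {
        return local.get();
    }

    // Mirrors the quoted put(): once the pool is full, the resource is silently dropped.
    Object put(Object resource) {
        Object previous = get();
        if (previous == null) {
            if (poolSize.intValue() >= maxSize) {
                return null;
            }
            poolSize.incrementAndGet();
        }
        local.set(resource);
        return previous;
    }

    // Mirrors the quoted getConnection(): create and pool a connection when none is found.
    Object getConnection() {
        Object c = get();
        if (c == null) {
            c = new Object();
            connectionsCreated.incrementAndGet();
            put(c);
        }
        return c;
    }

    // Runs the given number of threads sequentially and returns how many connections were created.
    static int run(int maxSize, int threads, final int callsPerThread) {
        final ThreadLocalPoolSketch pool = new ThreadLocalPoolSketch(maxSize);
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(new Runnable() {
                @Override public void run() {
                    for (int j = 0; j < callsPerThread; j++) {
                        pool.getConnection();
                    }
                }
            });
            t.start();
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return pool.connectionsCreated.get();
    }

    public static void main(String[] args) {
        System.out.println(run(2, 2, 3)); // prints 2: one connection per thread
        System.out.println(run(1, 2, 3)); // prints 4: the second thread creates one per call
    }
}
```

 Once the number of threads exceeds maxSize, every call from the extra threads creates a connection that is never stored, which matches the growth described above.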


[jira] [Commented] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner


[ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084516#comment-13084516
 ] 

Ted Yu commented on HBASE-4197:
---

I like the cleaner code after the change.
I know the following existed prior to your patch:
{code}
+public HRegionInfo getRegionName();
{code}
Can we rename the method to getRegionInfo ?
This would make the following code a little easier to understand:
{code}
region = getRegion(rs.getRegionName().getRegionName());
{code}

Thanks for your effort, Lars.

 RegionServer expects all scanner to be subclasses of HRegion.RegionScanner
 --

 Key: HBASE-4197
 URL: https://issues.apache.org/jira/browse/HBASE-4197
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Lars Hofhansl
 Attachments: 4197-bigger.txt, 4197.txt



[jira] [Commented] (HBASE-4150) Potentially too many connections may be opened if ThreadLocalPool or RoundRobinPool is used


[ 
https://issues.apache.org/jira/browse/HBASE-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084518#comment-13084518
 ] 

Lars Hofhansl commented on HBASE-4150:
--

Thanks for the doc patch Karthick, it explains trade-offs very nicely.

 Potentially too many connections may be opened if ThreadLocalPool or 
 RoundRobinPool is used
 ---

 Key: HBASE-4150
 URL: https://issues.apache.org/jira/browse/HBASE-4150
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Karthick Sankarachary
 Fix For: 0.92.0

 Attachments: 4150-1.txt, 4150.txt, 5140-2.txt, HBASE-4150-DOC.patch, 
 HBASE-4150_final.patch


 See 'Problem with hbase.client.ipc.pool.type=threadlocal in trunk' discussion 
 started by Lars George.
 From Lars Hofhansl:
 Looking at HBaseClient.getConnection(...) I see this:
 {code}
 synchronized (connections) {
   connection = connections.get(remoteId);
   if (connection == null) {
     connection = new Connection(remoteId);
     connections.put(remoteId, connection);
   }
 }
 {code}
 At the same time PoolMap.ThreadLocalPool.put is defined like this:
 {code}
 public R put(R resource) {
   R previousResource = get();
   if (previousResource == null) {
     ...
     if (poolSize.intValue() >= maxSize) {
       return null;
     }
     ...
 }
 {code}
 So... If the ThreadLocalPool reaches its capacity it always returns null and 
 hence all new threads will create a
 new connection every time getConnection is called!
 I have also verified this with a test program that works fine as long as the 
 number of client threads (which include the threads in HTable's threadpool, 
 of course) is within the pool size. Once that is no longer the case, the 
 number of connections explodes and the program dies with OOMEs (mostly 
 because each Connection is associated with yet another thread).
 It's not clear what should happen, though. Maybe (1) the ThreadLocalPool 
 should not have a limit, or maybe (2) allocations past the pool size should 
 throw an exception (i.e. there's a hard limit), or maybe (3) in that case a 
 single connection is returned for all threads while the pool is over its 
 limit, or (4) we start round robin with the other connections in the other 
 thread locals.
 #1 means that the number of client threads needs to be more carefully 
 managed by the client app. In this case it would also be somewhat pointless 
 for Connections to have their own threads, since we just pass stuff between 
 threads.
 #2 would work, but puts more logic in the client.
 #3 would lead to hard to debug performance issues.
 And #4 is messy :)
 From Ted Yu:
 For HBaseClient, at least the javadoc doesn't match:
 {code}
  * @param config configuration
  * @return either a {@link PoolType#Reusable} or {@link PoolType#ThreadLocal}
  */
 private static PoolType getPoolType(Configuration config) {
   return PoolType.valueOf(config.get(HConstants.HBASE_CLIENT_IPC_POOL_TYPE),
       PoolType.RoundRobin, PoolType.ThreadLocal);
 {code}
 I think for RoundRobinPool, we shouldn't allow maxSize to be 
 Integer#MAX_VALUE. Otherwise the connection explosion described by Lars may 
 occur.
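
The interaction between the get-or-create pattern and a bounded thread-local pool can be reproduced in isolation. The following is only a minimal sketch under stated assumptions: BoundedThreadLocalPool, getConnection and simulate are hypothetical stand-ins written for this illustration, not the HBase classes, and the worker threads are run one after another so that the count is deterministic.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalPoolSketch {
    /** Simplified, hypothetical stand-in for a size-limited thread-local pool. */
    static final class BoundedThreadLocalPool<R> {
        private final ThreadLocal<R> slot = new ThreadLocal<R>();
        private final AtomicInteger poolSize = new AtomicInteger();
        private final int maxSize;

        BoundedThreadLocalPool(int maxSize) { this.maxSize = maxSize; }

        R get() { return slot.get(); }

        /** Stores the resource for the calling thread unless the pool is full. */
        void put(R resource) {
            if (slot.get() == null) {
                if (poolSize.intValue() >= maxSize) {
                    return; // at capacity: the resource is silently not pooled
                }
                poolSize.incrementAndGet();
            }
            slot.set(resource);
        }
    }

    /** Counts how many "connections" were created in the current simulation. */
    static final AtomicInteger created = new AtomicInteger();

    /** Mirrors the get-or-create pattern: create and pool when nothing is found. */
    static Object getConnection(BoundedThreadLocalPool<Object> pool) {
        synchronized (pool) {
            Object connection = pool.get();
            if (connection == null) {
                connection = new Object();
                created.incrementAndGet();
                pool.put(connection);
            }
            return connection;
        }
    }

    /** Runs the given number of threads sequentially; each asks `calls` times. */
    static int simulate(int maxSize, int threads, int calls) {
        created.set(0);
        BoundedThreadLocalPool<Object> pool = new BoundedThreadLocalPool<Object>(maxSize);
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < calls; i++) {
                    getConnection(pool);
                }
            });
            worker.start();
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println(simulate(2, 2, 3)); // threads within the limit
        System.out.println(simulate(2, 4, 3)); // threads over the limit
    }
}
```

In this sketch, threads that fit in the pool create one connection each and then reuse it, while every thread beyond the limit creates a new connection on each call because its put is dropped.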


[jira] [Commented] (HBASE-4195) Possible unconsistency in a memstore read after a reseek, possible performance improvement


[ 
https://issues.apache.org/jira/browse/HBASE-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084521#comment-13084521
 ] 

Ted Yu commented on HBASE-4195:
---

I think that will do the trick.
I propose setting RESEEKMAX_DEFAULT to -1.

 Possible unconsistency in a memstore read after a reseek, possible 
 performance improvement
 --

 Key: HBASE-4195
 URL: https://issues.apache.org/jira/browse/HBASE-4195
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
 Environment: all
Reporter: nkeywal
Priority: Critical

 This follows the discussion around HBASE-3855, and the random errors (20% 
 failure on trunk) on the unit test 
 org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting
 I saw some points related to numIterReseek, used in the 
 MemStoreScanner#getNext (line 690):
 {noformat}679 protected KeyValue getNext(Iterator<KeyValue> it) {
 680   KeyValue ret = null;
 681   long readPoint = ReadWriteConsistencyControl.getThreadReadPoint();
 682   //DebugPrint.println(" MS@" + hashCode() + ": threadpoint = " + readPoint);
 683
 684   while (ret == null && it.hasNext()) {
 685     KeyValue v = it.next();
 686     if (v.getMemstoreTS() <= readPoint) {
 687       // keep it.
 688       ret = v;
 689     }
 690     numIterReseek--;
 691     if (numIterReseek == 0) {
 692       break;
 693     }
 694   }
 695   return ret;
 696 }{noformat}
 This function is called by seek, reseek, and next. The numIterReseek is only 
 useful for reseek.
 There are some issues. I am not totally sure they are the root cause of the 
 test case error, but they could partly explain the randomness of the error, 
 and one point is certainly a bug.
 1) In getNext, numIterReseek is decreased, then compared to zero. The seek 
 function sets numIterReseek to zero before calling getNext. It means that the 
 value will actually be negative, hence the test will always fail, and the 
 loop will continue. This is the expected behaviour, but it is rather subtle.
 2) In reseek, numIterReseek is not reset between the loops on the two 
 iterators. If numIterReseek equals zero after the loop on the first one, the 
 loop on the second one will never call seek, as numIterReseek will be 
 negative.
 3) Still in reseek, the test to call seek is (kvsetNextRow == null && 
 numIterReseek == 0). In other words, if kvsetNextRow is not null when 
 numIterReseek equals zero, numIterReseek will become negative at the 
 next iteration and seek will never be called.
 4) You can have side effects if reseek ends with numIterReseek > 0: the 
 following calls to the next function will decrease numIterReseek to zero, 
 and getNext will break instead of continuing the loop. As a result, later 
 calls to next() may or may not return null, depending on how the default 
 value for numIterReseek is configured.
 To check if the issue comes from point 4, you can set the numIterReseek to 
 zero before returning in reseek:
 {noformat}  numIterReseek = 0;
   return (kvsetNextRow != null || snapshotNextRow != null);
 }{noformat}
 On my env, on trunk, it seems to work, but as it's random I am not really 
 sure. I also had to modify the test (I added a loop) to make it fail more 
 often; the original test was working quite well here.
 It has to be confirmed that this totally fixes (it could be partial or 
 unrelated) 
 org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting 
 before implementing a complete solution.
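
Points 1 and 4 can be illustrated with a stripped-down version of the countdown loop. This is only a sketch under assumptions: plain integers stand in for KeyValue timestamps, and the class, the static counter and the firstVisible helper are hypothetical names introduced for this illustration rather than code from MemStoreScanner.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReseekCounterSketch {
    /** Hypothetical stand-in for the scanner's shared countdown field. */
    static int numIterReseek;

    /** Returns the first timestamp <= readPoint, using the same countdown logic. */
    static Integer getNext(Iterator<Integer> it, int readPoint) {
        Integer ret = null;
        while (ret == null && it.hasNext()) {
            Integer v = it.next();
            if (v <= readPoint) {
                ret = v; // visible to this reader, keep it
            }
            numIterReseek--;
            if (numIterReseek == 0) {
                break; // only reachable when the counter started above zero
            }
        }
        return ret;
    }

    /** Runs getNext over fixed data with the counter preset to the given value. */
    static Integer firstVisible(int initialCounter) {
        List<Integer> timestamps = Arrays.asList(9, 8, 7, 1);
        numIterReseek = initialCounter;
        return getNext(timestamps.iterator(), 5);
    }

    public static void main(String[] args) {
        // Counter starts at 0: it goes negative, never equals 0, loop runs to the end.
        System.out.println(firstVisible(0)); // prints 1
        // Counter left at 2 by an earlier call: the loop breaks after two entries.
        System.out.println(firstVisible(2)); // prints null
    }
}
```

The same data yields a different result depending only on the leftover counter value, which is the kind of side effect described in point 4.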


[jira] [Updated] (HBASE-4195) Possible inconsistency in a memstore read after a reseek, possible performance improvement


 [ 
https://issues.apache.org/jira/browse/HBASE-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4195:
--

Summary: Possible inconsistency in a memstore read after a reseek, possible 
performance improvement  (was: Possible unconsistency in a memstore read after 
a reseek, possible performance improvement)

 Possible inconsistency in a memstore read after a reseek, possible 
 performance improvement
 --

 Key: HBASE-4195
 URL: https://issues.apache.org/jira/browse/HBASE-4195
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
 Environment: all
Reporter: nkeywal
Priority: Critical


[jira] [Commented] (HBASE-2399) Forced splits only act on the first family in a table

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084522#comment-13084522
 ] 

jirapos...@reviews.apache.org commented on HBASE-2399:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1484/#review1441
---

Ship it!


+1 after fixing the white space (can you make a new patch Ming)  Good stuff.


http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
https://reviews.apache.org/r/1484/#comment3335

Nice javadoc



http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
https://reviews.apache.org/r/1484/#comment3336

Yeah, there is more in here... you can see it up here in review board ming.



http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
https://reviews.apache.org/r/1484/#comment3337

Nice test.


- Michael


On 2011-08-12 22:58:55, Ming Ma wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1484/
bq.  ---
bq.  
bq.  (Updated 2011-08-12 22:58:55)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  1. Add tests for forcesplit multi-column-family scenarios.
bq.  2. Modify HRegion so that it picks splitpoint based on largest store, 
instead of the first splittable store. It applies to both forcesplit and 
automatic split.
bq.  
bq.  
bq.  This addresses bug hbase-2399.
bq.  https://issues.apache.org/jira/browse/hbase-2399
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
 1157283 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
 1157283 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
 1157283 
bq.  
bq.  Diff: https://reviews.apache.org/r/1484/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Ming
bq.  
bq.



 Forced splits only act on the first family in a table
 -

 Key: HBASE-2399
 URL: https://issues.apache.org/jira/browse/HBASE-2399
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.20.3
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
  Labels: moved_from_0_20_5
 Fix For: 0.92.0

 Attachments: HBASE-2399-test-v1.patch, HBASE-2399-trunk.patch


 While working on a patch for HBASE-2375, I came across a few bugs in the 
 existing code related to splits.
 If a user triggers a manual split, it flips a forceSplit boolean to true and 
 then triggers a compaction (this is very similar to my current implementation 
 for HBASE-2375).  However, the forceSplit boolean is flipped back to false at 
 the beginning of Store.compact().  So the force split only acts on the first 
 family in the table.  If that Store is not splittable for some reason (it is 
 empty or has only one row), then the entire region will not be split, 
 regardless of what is in other families.
 Even if there is data in the first family, the midKey is determined based 
 solely on that family.  If it has two rows and the next family has 1M rows, 
 we pick the split key based on the two rows.
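
The change summarized in the review request (pick the split point from the largest store rather than the first splittable one) could be sketched as follows. This is a hedged illustration only: StoreInfo and pickSplitPoint are hypothetical names, sizes and keys are made up, and skipping stores that report no mid key is an assumption of this sketch, not something stated in the patch summary.

```java
import java.util.Arrays;
import java.util.List;

public class SplitPointSketch {
    /** Hypothetical, simplified view of one column-family store. */
    static final class StoreInfo {
        final String family;
        final long sizeBytes;
        final String midKey; // null when the store cannot be split

        StoreInfo(String family, long sizeBytes, String midKey) {
            this.family = family;
            this.sizeBytes = sizeBytes;
            this.midKey = midKey;
        }
    }

    /** Returns the mid key of the largest splittable store, or null if none. */
    static String pickSplitPoint(List<StoreInfo> stores) {
        StoreInfo largest = null;
        for (StoreInfo store : stores) {
            if (store.midKey == null) {
                continue; // assumption: empty or single-row stores are skipped
            }
            if (largest == null || store.sizeBytes > largest.sizeBytes) {
                largest = store;
            }
        }
        return largest == null ? null : largest.midKey;
    }

    public static void main(String[] args) {
        List<StoreInfo> stores = Arrays.asList(
            new StoreInfo("f1", 0L, null),                  // empty first family
            new StoreInfo("f2", 200L, "row-1"),             // two rows
            new StoreInfo("f3", 1000000L, "row-500000"));   // about 1M rows
        System.out.println(pickSplitPoint(stores)); // prints row-500000
    }
}
```

With this selection rule an empty first family no longer blocks the split, and a tiny family no longer dictates the split key for a much larger one.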


[jira] [Commented] (HBASE-4190) Coprocessors: pull up some cp constants from cp package to o.a.h.h.HConstants

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084523#comment-13084523
 ] 

jirapos...@reviews.apache.org commented on HBASE-4190:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1461/#review1442
---

Ship it!


LGTM


src/main/java/org/apache/hadoop/hbase/Coprocessor.java
https://reviews.apache.org/r/1461/#comment3338

Interfaces up here in the base package is good I think.



src/main/java/org/apache/hadoop/hbase/HConstants.java
https://reviews.apache.org/r/1461/#comment3339

Do these constants belong here now that you've pulled up the Interfaces?  
If so, that's fine... just asking.



src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
https://reviews.apache.org/r/1461/#comment3340

This is good.



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java
https://reviews.apache.org/r/1461/#comment3341

This is fine too I think.


- Michael


On 2011-08-13 01:08:04, Mingjie Lai wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1461/
bq.  ---
bq.  
bq.  (Updated 2011-08-13 01:08:04)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Coprocessors: pull up some cp constants from cp package to 
o.a.h.h.HConstants
bq.  
bq.  
bq.  This addresses bug HBASE-4190.
bq.  https://issues.apache.org/jira/browse/HBASE-4190
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/Coprocessor.java PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/CoprocessorEnvironment.java 
PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/HConstants.java dda254d 
bq.src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java d835582 
bq.
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java 
2fc8f39 
bq.
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java 
506051d 
bq.
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 
ec88a01 
bq.src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java 
0290bf2 
bq.
src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java 
54ccd6f 
bq.src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 
18ba6e7 
bq.
src/main/java/org/apache/hadoop/hbase/coprocessor/MasterCoprocessorEnvironment.java
 5d8cf4c 
bq.src/main/java/org/apache/hadoop/hbase/coprocessor/ObserverContext.java 
9349d5b 
bq.
src/main/java/org/apache/hadoop/hbase/coprocessor/RegionCoprocessorEnvironment.java
 da8076c 
bq.src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java 
cfbb29d 
bq.
src/main/java/org/apache/hadoop/hbase/coprocessor/WALCoprocessorEnvironment.java
 6580c2c 
bq.src/main/java/org/apache/hadoop/hbase/coprocessor/WALObserver.java 
b086747 
bq.
src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java 
c44da73 
bq.
src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCoprocessorHost.java 
03df574 
bq.src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java 
a81ff84 
bq.
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java 
36816e8 
bq.
src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java 
c85146a 
bq.
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java
 0ab1339 
bq.
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java
 6d31d70 
bq.src/test/java/org/apache/hadoop/hbase/coprocessor/TestWALObserver.java 
d9f6e5f 
bq.src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java 
b4c407b 
bq.  
bq.  Diff: https://reviews.apache.org/r/1461/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  TestClassLoading passed locally.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Mingjie
bq.  
bq.



 Coprocessors: pull up some cp constants from cp package to o.a.h.h.HConstants
 -

 Key: HBASE-4190
 URL: https://issues.apache.org/jira/browse/HBASE-4190
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Affects Versions: 0.90.4
Reporter: Mingjie Lai
Assignee: Mingjie Lai
Priority: Minor
 Fix For: 0.90.5


 At HBase-3810, stack gave a comment after patch committed:
  This is a bit odd where a class in the parent package has references to a 
  sub package.
  Should

[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084524#comment-13084524
 ] 

jirapos...@reviews.apache.org commented on HBASE-4014:
--



bq.  On 2011-08-12 23:46:30, Gary Helmling wrote:
bq.   Nice work, Eugene.  I think we're getting close.  Just two suggested 
improvements below.
bq.   
bq.   The main question still open to debate, I think, is whether or not 
aborting the server on unhandled exceptions is appropriate.
bq.   
bq.   On the one hand, aborting takes the fail-fast approach and makes buggy 
coprocessors much more visible.  It's a lot more likely that a bug will be 
noticed and fixed if it brings down a region server!
bq.   
bq.   On the other hand, I think coprocessors already pose enough of a 
stability risk to a cluster.  I think we should be working to minimize that by 
containing the impact that a buggy coprocessor can have.  If the coprocessor 
really wants or needs to trigger an abort, it can already do so, since 
(Master|RegionServer)Services extend Server, which extends Abortable.
bq.   
bq.   I think I'd be more in favor of removing the coprocessor from the active 
set (we should make this as visible as possible so it's clear the coprocessor 
is no longer active), or at least wrapping the exception in a 
DoNotRetryIOException and communicating it back to the client?  Maybe both?
bq.   
bq.   I guess I'd be okay with a configuration option to abort on error (I 
think a single config option is sufficient), as long as it's disabled by 
default.  But that would still imply we need some other handling when the 
option is disabled.

I like Gary's reasoning here.


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/#review1433
---


On 2011-08-10 22:48:08, Eugene Koontz wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/969/
bq.  ---
bq.  
bq.  (Updated 2011-08-10 22:48:08)
bq.  
bq.  
bq.  Review request for hbase, Gary Helmling and Mingjie Lai.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the 
presence of coprocessors in logged exceptions
bq.  
bq.  The general gist here is to wrap each of 
{Master,RegionServer}CoprocessorHost's coprocessor call inside a 
bq.  
bq.  try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }
bq.  
bq.  block. 
bq.  
bq.  handleCoprocessorThrowable() is responsible for either passing 'e' along 
to the client (if 'e' is an IOException) or, otherwise, aborting the service 
(Regionserver or Master).
bq.  
bq.  The abort message contains a list of the loaded coprocessors for crash 
analysis.
bq.  
bq.  
bq.  This addresses bug HBASE-4014.
bq.  https://issues.apache.org/jira/browse/HBASE-4014
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 
18ba6e7 
bq.src/main/java/org/apache/hadoop/hbase/master/HMaster.java 8beeb68 
bq.src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java 
aa930f5 
bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
23225d7 
bq.
src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java 
c44da73 
bq.
src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java
 PRE-CREATION 
bq.
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java
 PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/969/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  patch includes two tests:
bq.  
bq.  TestMasterCoprocessorException.java
bq.  TestRegionServerCoprocessorException.java
bq.  
bq.  both tests pass in my build environment.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Eugene
bq.  
bq.
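
The wrapping described in the quoted review summary (IOExceptions go back to the client, anything else aborts the service and reports the loaded coprocessors) might be sketched roughly like this. The Abortable and CoprocessorCall interfaces below are local, hypothetical stand-ins defined for this example, and the method bodies are an illustration under those assumptions, not the code from the patch.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class CoprocessorGuardSketch {
    /** Local stand-in for a service that can be aborted. */
    interface Abortable {
        void abort(String why, Throwable cause);
    }

    /** Hypothetical wrapper type for a single coprocessor hook invocation. */
    interface CoprocessorCall {
        void run() throws Exception;
    }

    /** IOExceptions are rethrown to the caller; anything else aborts the server. */
    static void handleCoprocessorThrowable(Throwable e, Abortable server,
                                           List<String> loaded) throws IOException {
        if (e instanceof IOException) {
            throw (IOException) e;
        }
        server.abort("Unhandled coprocessor exception; loaded coprocessors: " + loaded, e);
    }

    /** Wraps one coprocessor call in the try/catch block from the summary. */
    static void invoke(CoprocessorCall call, Abortable server,
                       List<String> loaded) throws IOException {
        try {
            call.run();
        } catch (Throwable e) {
            handleCoprocessorThrowable(e, server, loaded);
        }
    }

    /** Exercises both paths and returns a short trace of what happened. */
    static String demo() {
        StringBuilder log = new StringBuilder();
        Abortable server = (why, cause) ->
            log.append("abort:").append(cause.getClass().getSimpleName());
        List<String> loaded = Arrays.asList("ExampleObserver");
        try {
            invoke(() -> { throw new IOException("bad request"); }, server, loaded);
        } catch (IOException e) {
            log.append("client:").append(e.getMessage()).append(';');
        }
        try {
            invoke(() -> { throw new NullPointerException(); }, server, loaded);
        } catch (IOException e) {
            log.append("unexpected");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints client:bad request;abort:NullPointerException
    }
}
```

A configuration switch for the abort path, as suggested in the review, would slot into handleCoprocessorThrowable; it is left out here because the thread had not settled on the alternative handling.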



 Coprocessors: Flag the presence of coprocessors in logged exceptions
 

 Key: HBASE-4014
 URL: https://issues.apache.org/jira/browse/HBASE-4014
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Reporter: Andrew Purtell
Assignee: Eugene Koontz
 Fix For: 0.92.0

 Attachments: HBASE-4014.patch, HBASE-4014.patch, HBASE-4014.patch, 
 HBASE-4014.patch, HBASE-4014.patch


 For some initial triage of bug reports for core versus for deployments with 
 loaded coprocessors, we need something like the Linux kernel's taint flag, 
 and list of linked in modules that show up in the output of every OOPS, to 
 appear

[jira] [Updated] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner


 [ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4197:
-

Attachment: 4197-v2.txt

 RegionServer expects all scanner to be subclasses of HRegion.RegionScanner
 --

 Key: HBASE-4197
 URL: https://issues.apache.org/jira/browse/HBASE-4197
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Lars Hofhansl
 Attachments: 4197-bigger.txt, 4197-v2.txt, 4197.txt


 Returning just an InternalScanner from RegionObserver.{pre|post}OpenScanner 
 leads to the following exception when using the scanner.
 java.io.IOException: InternalScanner implementation is expected to be 
 HRegion.RegionScanner.
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2023)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:616)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:314)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1225)
 The problem is in HRegionServer.next(...):
 {code} 
 InternalScanner s = this.scanners.get(scannerName);
 ...
 // Call coprocessor. Get region info from scanner.
 HRegion region = null;
 if (s instanceof HRegion.RegionScanner) {
   HRegion.RegionScanner rs = (HRegion.RegionScanner) s;
   region = getRegion(rs.getRegionName().getRegionName());
 } else {
   throw new IOException("InternalScanner implementation is expected " +
       "to be HRegion.RegionScanner.");
 }
 {code} 


[jira] [Commented] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner


[ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084527#comment-13084527
 ] 

Lars Hofhansl commented on HBASE-4197:
--

Renamed getRegionName() to getRegionInfo().
Also cleaned up some more comments, and removed all references to
InternalScanner from HRegionServer and HRegion (there were only 3 or 4 left 
anyway).
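
A minimal sketch of the direction described in this comment, assuming simplified stand-in types: the nested InternalScanner, RegionInfo, RegionScanner and DelegatingScanner below are hypothetical and written only for this illustration, not the HBase classes. The idea is that when the server keeps its scanners typed on an interface exposing getRegionInfo(), a coprocessor-supplied wrapper works without any instanceof check or cast.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RegionScannerSketch {
    /** Hypothetical stand-in for the low-level scanner contract. */
    interface InternalScanner {
        boolean next(List<String> results);
    }

    /** Hypothetical, simplified region descriptor. */
    static final class RegionInfo {
        final String regionName;
        RegionInfo(String regionName) { this.regionName = regionName; }
    }

    /** A scanner that can also report which region it reads from. */
    interface RegionScanner extends InternalScanner {
        RegionInfo getRegionInfo();
    }

    /** Coprocessor-style wrapper: it delegates instead of subclassing. */
    static final class DelegatingScanner implements RegionScanner {
        private final RegionScanner delegate;
        DelegatingScanner(RegionScanner delegate) { this.delegate = delegate; }
        public boolean next(List<String> results) { return delegate.next(results); }
        public RegionInfo getRegionInfo() { return delegate.getRegionInfo(); }
    }

    /** Registry typed on the interface, so lookups need no instanceof or cast. */
    static final Map<String, RegionScanner> scanners = new HashMap<String, RegionScanner>();

    static String regionNameFor(String scannerName) {
        RegionScanner s = scanners.get(scannerName);
        if (s == null) {
            throw new IllegalArgumentException("Unknown scanner: " + scannerName);
        }
        return s.getRegionInfo().regionName;
    }

    /** Registers a wrapped scanner and resolves its region through the interface. */
    static String demo() {
        RegionScanner base = new RegionScanner() {
            public boolean next(List<String> results) { return false; } // no rows
            public RegionInfo getRegionInfo() { return new RegionInfo("t1,,1"); }
        };
        scanners.put("scanner-1", new DelegatingScanner(base));
        return regionNameFor("scanner-1");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints t1,,1
    }
}
```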


 RegionServer expects all scanner to be subclasses of HRegion.RegionScanner
 --

 Key: HBASE-4197
 URL: https://issues.apache.org/jira/browse/HBASE-4197
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Lars Hofhansl
 Attachments: 4197-bigger.txt, 4197-v2.txt, 4197.txt



[jira] [Commented] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner


[ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084528#comment-13084528
 ] 

Ted Yu commented on HBASE-4197:
---

+1 on patch version 2.
Please use review board to get more feedback.

 RegionServer expects all scanner to be subclasses of HRegion.RegionScanner
 --

 Key: HBASE-4197
 URL: https://issues.apache.org/jira/browse/HBASE-4197
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Lars Hofhansl
 Attachments: 4197-bigger.txt, 4197-v2.txt, 4197.txt



[jira] [Commented] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084531#comment-13084531
 ] 

jirapos...@reviews.apache.org commented on HBASE-4197:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1496/
---

Review request for Ted Yu and Mingjie Lai.


Summary
---

1. Don't require custom scanners created by coprocessors to be subclasses of 
HRegion.RegionScanner (see HBASE-4197).
2. Simplify the interfaces for Scanners in HRegion, HRegionServer, and 
RegionObserver. This avoids a bunch of instanceof checks and casts to 
HRegion.RegionScanner.

(Sorry, HBase-git would not accept my patch)


This addresses bug HBASE-4197.
https://issues.apache.org/jira/browse/HBASE-4197


Diffs
-

  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java
 1157311 

Diff: https://reviews.apache.org/r/1496/diff


Testing
---

Manual test attached to the bug.


Thanks,

Lars



 RegionServer expects all scanner to be subclasses of HRegion.RegionScanner
 --

 Key: HBASE-4197
 URL: https://issues.apache.org/jira/browse/HBASE-4197
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Lars Hofhansl
 Attachments: 4197-bigger.txt, 4197-v2.txt, 4197.txt, ScannerTest.java


 Returning just an InternalScanner from RegionObsever.{pre|post}OpenScanner 
 leads to the following exception when using the scanner.
 java.io.IOException: InternalScanner implementation is expected to be 
 HRegion.RegionScanner.
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2023)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:616)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:314)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1225)
 The problem is in HRegionServer.next(...):
 {code} 
 InternalScanner s = this.scanners.get(scannerName);
 ...
 // Call coprocessor. Get region info from scanner.
 HRegion region = null;
 if (s instanceof HRegion.RegionScanner) {
   HRegion.RegionScanner rs = (HRegion.RegionScanner) s;
   region = getRegion(rs.getRegionName().getRegionName());
 } else {
   throw new IOException("InternalScanner implementation is expected " +
       "to be HRegion.RegionScanner.");
 }
 {code} 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner

2011-08-12 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084532#comment-13084532
 ] 

jirapos...@reviews.apache.org commented on HBASE-4197:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1496/
---

(Updated 2011-08-13 04:38:38.030763)


Review request for Ted Yu and Mingjie Lai.


Summary (updated)
---

1. Don't require custom scanners created by coprocessors to be subclasses of 
HRegion.RegionScanner (see HBASE-4197).
2. Simplify the interfaces for Scanners in HRegion, HRegionServer, and 
RegionObserver. This avoids a bunch of instanceof checks and casts to 
HRegion.RegionScanner.
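
The second point can be sketched roughly as follows. This is a minimal illustration and not the code in the attached patch: the two interfaces are simplified, hypothetical stand-ins (the real InternalScanner does not return strings), and DelegatingRegionScanner, regionFor, and demo are names invented for the example. The idea is that the server wraps whatever scanner a coprocessor returns in a region-aware interface once, so later calls need no instanceof check or cast.

```java
import java.util.ArrayList;
import java.util.List;

public class ScannerInterfaceSketch {
  /** Hypothetical, simplified stand-in for HBase's InternalScanner. */
  interface InternalScanner {
    boolean next(List<String> results);
  }

  /** Hypothetical region-aware scanner interface (assumption, not the patch's API). */
  interface RegionScanner extends InternalScanner {
    String getRegionName();
  }

  /** Wraps any InternalScanner and remembers which region it belongs to. */
  static final class DelegatingRegionScanner implements RegionScanner {
    private final InternalScanner delegate;
    private final String regionName;

    DelegatingRegionScanner(InternalScanner delegate, String regionName) {
      this.delegate = delegate;
      this.regionName = regionName;
    }

    @Override
    public boolean next(List<String> results) {
      return delegate.next(results);
    }

    @Override
    public String getRegionName() {
      return regionName;
    }
  }

  /** Server-side lookup: no instanceof check or cast is needed. */
  static String regionFor(RegionScanner scanner) {
    return scanner.getRegionName();
  }

  /** Wraps a custom scanner, reads one row, and reports region and row. */
  static String demo() {
    // A custom scanner, e.g. one returned by a coprocessor hook.
    InternalScanner custom = results -> {
      results.add("row-1");
      return false; // no more rows
    };
    RegionScanner wrapped = new DelegatingRegionScanner(custom, "region-a");
    List<String> rows = new ArrayList<String>();
    wrapped.next(rows);
    return regionFor(wrapped) + ":" + rows.get(0);
  }
}
```

Wrapping at registration time keeps the custom scanner free of any HRegion dependency, at the cost of one extra delegation per call.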

(Sorry HBase-git would not accept my patch)


This addresses bug HBASE-4197.
https://issues.apache.org/jira/browse/HBASE-4197


Diffs
-

  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
 1157311 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java
 1157311 

Diff: https://reviews.apache.org/r/1496/diff


Testing
---

Manual test attached to the bug.


Thanks,

Lars




[jira] [Commented] (HBASE-4150) Potentially too many connections may be opened if ThreadLocalPool or RoundRobinPool is used


[ 
https://issues.apache.org/jira/browse/HBASE-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084540#comment-13084540
 ] 

stack commented on HBASE-4150:
--

+1 on doc patch.

 Potentially too many connections may be opened if ThreadLocalPool or 
 RoundRobinPool is used
 ---

 Key: HBASE-4150
 URL: https://issues.apache.org/jira/browse/HBASE-4150
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Karthick Sankarachary
 Fix For: 0.92.0

 Attachments: 4150-1.txt, 4150.txt, 5140-2.txt, HBASE-4150-DOC.patch, 
 HBASE-4150_final.patch


 See 'Problem with hbase.client.ipc.pool.type=threadlocal in trunk' discussion 
 started by Lars George.
 From Lars Hofhansl:
 Looking at HBaseClient.getConnection(...) I see this:
 {code}
   synchronized (connections) {
     connection = connections.get(remoteId);
     if (connection == null) {
       connection = new Connection(remoteId);
       connections.put(remoteId, connection);
     }
   }
 {code}
 At the same time PoolMap.ThreadLocalPool.put is defined like this:
 {code}
   public R put(R resource) {
     R previousResource = get();
     if (previousResource == null) {
       ...
       if (poolSize.intValue() >= maxSize) {
         return null;
       }
       ...
     }
 {code}
 So... if the ThreadLocalPool reaches its capacity, put always returns null, and 
 hence every new thread will create a
 new connection every time getConnection is called!
 I have also verified this with a test program that works fine as long as the 
 number of client threads (which includes the threads in HTable's threadpool, 
 of course) is < poolsize. Once that is no longer the case, the number of 
 connections explodes and the program dies with OOMEs (mostly because each 
 Connection is associated with yet another thread).
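
 The effect described above can be reproduced with a small stand-alone sketch. It assumes a simplified, hypothetical BoundedThreadLocalPool that only mirrors the capacity check quoted from PoolMap.ThreadLocalPool.put; plain Objects stand in for connections, and connectionsCreated is a helper invented for the example. The threads run one after another so the count is deterministic.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedThreadLocalPoolSketch {
  /** Hypothetical, simplified stand-in for PoolMap.ThreadLocalPool. */
  static final class BoundedThreadLocalPool<R> {
    private final ThreadLocal<R> slot = new ThreadLocal<R>();
    private final AtomicInteger poolSize = new AtomicInteger(0);
    private final int maxSize;

    BoundedThreadLocalPool(int maxSize) {
      this.maxSize = maxSize;
    }

    R get() {
      return slot.get();
    }

    R put(R resource) {
      R previous = slot.get();
      if (previous == null) {
        if (poolSize.intValue() >= maxSize) {
          return null; // at capacity: the resource is silently not stored
        }
        poolSize.incrementAndGet();
      }
      slot.set(resource);
      return previous;
    }
  }

  /** Counts how many "connections" get created by the get-or-create pattern. */
  static int connectionsCreated(int maxSize, int threads, final int callsPerThread) {
    final BoundedThreadLocalPool<Object> pool = new BoundedThreadLocalPool<Object>(maxSize);
    final AtomicInteger created = new AtomicInteger(0);
    for (int t = 0; t < threads; t++) {
      Thread worker = new Thread(() -> {
        for (int i = 0; i < callsPerThread; i++) {
          Object connection = pool.get();
          if (connection == null) {
            connection = new Object(); // stands in for new Connection(remoteId)
            created.incrementAndGet();
            pool.put(connection);
          }
        }
      });
      worker.start();
      try {
        worker.join(); // run threads sequentially to keep the count deterministic
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return created.get();
  }
}
```

 With a pool size of 2, two threads making five calls each create two connections in total, while a third thread adds one new connection per call because its put is always rejected.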
 It's not clear what should happen, though. Maybe (1) the ThreadLocalPool 
 should not have a limit, or maybe (2) allocations past the pool size should 
 throw an exception (i.e. there's a hard limit), or maybe (3) in that case a 
 single connection is returned for all threads while the pool is over its 
 limit, or (4) we start round robin with the other connections in the other 
 thread locals.
 #1 means that the number of client threads needs to be more carefully 
 managed by the client app. In this case it would also be somewhat pointless 
 for Connections to have their own threads; we would just pass stuff between 
 threads.
 #2 would work, but puts more logic in the client.
 #3 would lead to hard-to-debug performance issues.
 And #4 is messy :)
 From Ted Yu:
 For HBaseClient, at least the javadoc doesn't match:
 {code}
    * @param config configuration
    * @return either a {@link PoolType#Reusable} or {@link PoolType#ThreadLocal}
    */
   private static PoolType getPoolType(Configuration config) {
     return PoolType.valueOf(config.get(HConstants.HBASE_CLIENT_IPC_POOL_TYPE),
         PoolType.RoundRobin, PoolType.ThreadLocal);
 {code}
 I think for RoundRobinPool, we shouldn't allow maxSize to be 
 Integer#MAX_VALUE. Otherwise the connection explosion described by Lars may occur.


[jira] [Resolved] (HBASE-4196) TableRecordReader may skip first row of region


 [ 
https://issues.apache.org/jira/browse/HBASE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-4196.
--

   Resolution: Fixed
Fix Version/s: 0.90.5
 Hadoop Flags: [Reviewed]

Committed branch and trunk.  Thanks for the patch Ming (And review Ted)

 TableRecordReader may skip first row of region
 --

 Key: HBASE-4196
 URL: https://issues.apache.org/jira/browse/HBASE-4196
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.4
Reporter: Jan Lukavsky
Assignee: Ming Ma
 Fix For: 0.90.5

 Attachments: HBASE-4196-trunk.patch, HBASE-4196-trunk.patch, 
 HBASE-4196-trunk.patch


 After the following scenario, the first record of a region is skipped without 
 being sent to the Mapper:
  - the reader is initialized with TableRecordReader.init()
  - then nextKeyValue is called, which calls scanner.next(); here a 
 ScannerTimeoutException occurs
  - the scanner is restarted by a call to restart(), and then *two* calls to 
 scanner.next() occur, so the first row is lost
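
 The scenario can be modelled without HBase. The sketch below is an assumption-laden illustration, not the committed patch: a list iterator stands in for the scanner, the restarted scanner is assumed to begin at the last successfully read row (or at the region start if nothing was read yet), and firstRowAfterRestart and its alwaysSkip flag are invented for the example. Skipping one row after a restart is only safe when a row was actually read before the timeout.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class RestartSkipSketch {
  /**
   * Returns the first row handed to the mapper after a scanner restart.
   *
   * @param rows all rows of the region, in order
   * @param lastSuccessfulRow last row read before the timeout, or null if none
   * @param alwaysSkip true models the buggy behavior (always skip one row)
   */
  static String firstRowAfterRestart(List<String> rows, String lastSuccessfulRow,
      boolean alwaysSkip) {
    // Assumption: the restarted scan is inclusive of the last successful row.
    int start = lastSuccessfulRow == null ? 0 : rows.indexOf(lastSuccessfulRow);
    Iterator<String> scanner = rows.subList(start, rows.size()).iterator();
    boolean skip = alwaysSkip || lastSuccessfulRow != null;
    if (skip && scanner.hasNext()) {
      scanner.next(); // skip the row presumed to be already mapped
    }
    return scanner.hasNext() ? scanner.next() : null;
  }

  /** Convenience fixture: a region with three rows. */
  static List<String> sampleRows() {
    return Arrays.asList("r1", "r2", "r3");
  }
}
```

 When the timeout hits on the very first next() call, the always-skip variant returns r2 and r1 never reaches the mapper; guarding the skip on a non-null last row returns r1.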


[jira] [Updated] (HBASE-4170) createTable java doc needs to be improved


 [ 
https://issues.apache.org/jira/browse/HBASE-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4170:
-

   Resolution: Fixed
Fix Version/s: (was: 0.90.1)
   0.90.5
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed to branch and trunk.  Thanks for the patch Mubarak.

 createTable java doc needs to be improved
 -

 Key: HBASE-4170
 URL: https://issues.apache.org/jira/browse/HBASE-4170
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.90.1, 0.90.2, 0.90.3, 0.90.4
 Environment: HBase-0.90.1
Reporter: Mubarak Seyed
 Fix For: 0.90.5

 Attachments: create_table_javadoc_HBASE_4170.patch


 HBaseAdmin.createTable() java doc says
 public void createTable(HTableDescriptor desc,
 byte[][] splitKeys)
  throws IOException
 Creates a new table with an initial set of empty regions defined by the 
 specified split keys. The total number of regions created will be the number 
 of split keys plus one (the first region has a null start key and the last 
 region has a null end key). Synchronous operation.
 If we specify null values for the first region's start key and the last 
 region's end key, we get a NullPointerException, because Arrays.sort compares 
 each element.
 I guess the documentation should not talk about null values and should 
 instead explain that splitKeys[][] has length n-1, where n is the number of 
 regions.
 splitKeys[][] would look like
 splitKeys[0] = key value 1
 ..
 splitKeys[n-2] = key value n-1
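
 A stand-alone sketch of building such an array is shown below. It does not call HBase; the key format "row-NNN", the helper splitKeysFor, and the comparator are assumptions made for illustration. It produces n-1 non-null keys for n regions and sorts them in unsigned lexicographic byte order.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Locale;

public class SplitKeysSketch {
  /** Unsigned lexicographic comparison of byte arrays (nulls are not allowed). */
  static final Comparator<byte[]> UNSIGNED_LEX = (a, b) -> {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) {
        return d;
      }
    }
    return a.length - b.length;
  };

  /** Builds regions-1 split keys; the boundary regions need no explicit null keys. */
  static byte[][] splitKeysFor(int regions) {
    if (regions < 2) {
      return new byte[0][]; // a single region needs no split keys
    }
    byte[][] keys = new byte[regions - 1][];
    for (int i = 0; i < keys.length; i++) {
      // Hypothetical key format, chosen only for the example.
      keys[i] = String.format(Locale.ROOT, "row-%03d", i + 1)
          .getBytes(StandardCharsets.UTF_8);
    }
    Arrays.sort(keys, UNSIGNED_LEX);
    return keys;
  }
}
```

 For four regions this yields three keys, splitKeys[0] through splitKeys[2], which is the array shape that would be passed as the splitKeys argument of createTable(HTableDescriptor desc, byte[][] splitKeys).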


[jira] [Updated] (HBASE-4197) RegionServer expects all scanner to be subclasses of HRegion.RegionScanner