[jira] [Commented] (HBASE-4064) Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long...

2011-07-25 Thread gaojinchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070331#comment-13070331
 ] 

gaojinchao commented on HBASE-4064:
---

The master may crash because the pool shutdown is asynchronous. 

The master log shows:
2011-07-22 13:33:27,806 INFO 
org.apache.hadoop.hbase.master.handler.EnableTableHandler: Table has 2156 
regions of which 2156 are online.

2011-07-22 13:34:28,646 INFO 
org.apache.hadoop.hbase.master.handler.EnableTableHandler: Table has 2156 
regions of which 982 are online.
2011-07-22 13:34:31,079 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
gjc:xxx ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229.
2011-07-22 13:34:31,080 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:6-0x31502ef4f0 Creating (or updating) unassigned node for 
c9b1c97ac6c00033ceb1890e45e66229 with OFFLINE state
2011-07-22 13:34:31,104 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Forcing OFFLINE; 
was=ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229. 
state=OFFLINE, ts=1311312871080
2011-07-22 13:34:31,121 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
No previous transition plan was found (or we are ignoring an existing plan) for 
ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229. so generated a 
random one; 
hri=ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229., src=, 
dest=C4C2.site,60020,1311310281335; 3 (online=3, exclude=null) available servers
2011-07-22 13:34:31,121 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Assigning region 
ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229. to 
C4C2.site,60020,1311310281335
2011-07-22 13:34:31,122 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
gjc:xxx ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229.
2011-07-22 13:34:31,123 FATAL org.apache.hadoop.hbase.master.HMaster: 
Unexpected state trying to OFFLINE; 
ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229. 
state=PENDING_OPEN, ts=1311312871121
java.lang.IllegalStateException
at 
org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1081)
at 
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1036)
at 
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:864)
at 
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:844)
at java.lang.Thread.run(Thread.java:662)
2011-07-22 13:34:31,125 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
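
For illustration, a much-simplified sketch of the race the log above shows; the class and methods below are stand-ins rather than the actual AssignmentManager code, but the guard mirrors the FATAL "Unexpected state trying to OFFLINE" abort:

{code}
// Much-simplified illustration of the race above (not the actual AssignmentManager code).
// Thread A is a worker from the old, asynchronously shutting down handler pool that is
// still assigning the region; thread B is the new assign forcing the region OFFLINE.
enum State { OFFLINE, CLOSED, PENDING_OPEN, OPEN }

class RegionStateSketch {
  volatile State state = State.OFFLINE;

  // Thread A: first assign moves the region to PENDING_OPEN.
  void firstAssign() {
    state = State.PENDING_OPEN;
  }

  // Thread B: second assign tries to force OFFLINE. If thread A got there first,
  // the guard fires, mirroring the FATAL "Unexpected state trying to OFFLINE" abort.
  void forceOffline() {
    if (state != State.OFFLINE && state != State.CLOSED) {
      throw new IllegalStateException("Unexpected state trying to OFFLINE; state=" + state);
    }
    state = State.OFFLINE;
  }
}
{code}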


 Two concurrent unassigning of the same region caused the endless loop of 
 Region has been PENDING_CLOSE for too long...
 

 Key: HBASE-4064
 URL: https://issues.apache.org/jira/browse/HBASE-4064
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.3
Reporter: Jieshan Bean
 Fix For: 0.90.5

 Attachments: HBASE-4064-v1.patch, HBASE-4064_branch90V2.patch, 
 HBASE-4064_branch90V2.patch, disableflow.png


 1. If there is a rubbish RegionState object with PENDING_CLOSE in 
 regionsInTransition (the RegionState was left behind by some exception and 
 should have been removed; that's why I call it a rubbish object), but the 
 region is not currently assigned anywhere, TimeoutMonitor will fall into an 
 endless loop:
 2011-06-27 10:32:21,326 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
 state=PENDING_CLOSE, ts=1309141555301
 2011-06-27 10:32:21,326 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
 PENDING_CLOSE for too long, running forced unassign again on 
 region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
 2011-06-27 10:32:21,438 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
 (offlining)
 2011-06-27 10:32:21,441 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign 
 region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is 
 not currently assigned anywhere
 2011-06-27 10:32:31,207 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
 state=PENDING_CLOSE, ts=1309141555301
 2011-06-27 10:32:31,207 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
 PENDING_CLOSE for too long, running forced unassign again on 
 region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
 2011-06-27 10:32:31,215 DEBUG 
 

[jira] [Updated] (HBASE-4120) isolation and allocation

2011-07-25 Thread Liu Jia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Jia updated HBASE-4120:
---

Attachment: System Structure.jpg

the relationship between groups and table priority

 isolation and allocation
 

 Key: HBASE-4120
 URL: https://issues.apache.org/jira/browse/HBASE-4120
 Project: HBase
  Issue Type: New Feature
  Components: master, regionserver
Affects Versions: 0.90.2
Reporter: Liu Jia
 Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, 
 HBase_isolation_and_allocation_user_guide.pdf, 
 Performance_of_Table_priority.pdf, System Structure.jpg


 The HBase isolation and allocation tool is designed to help users manage 
 cluster resources among different applications and tables.
 When we have a large HBase cluster with many applications running on it, 
 there are lots of problems. In Taobao there is a cluster used by many 
 departments to test the performance of their applications, which are based 
 on HBase. On one cluster of 12 servers, only one application can run 
 exclusively at a time, and all other applications must wait until the 
 previous test has finished.
 After we added the allocation management function to the cluster, 
 applications can share the cluster and run concurrently. Also, if a test 
 engineer wants to make sure there is no interference, he/she can move other 
 tables out of the group.
 Within a group we use table priority to allocate resources; when the system 
 is busy, we can make sure high-priority tables are not affected by 
 lower-priority tables.
 Different groups can have different region server configurations: groups 
 optimized for reading can have a large block cache, and others optimized 
 for writing can have a large memstore. 
 Tables and region servers can be moved easily between groups; after changing 
 the configuration, a group can be restarted alone instead of restarting the 
 whole cluster.
 git entry : https://github.com/ICT-Ope/HBase_allocation .
 We hope our work is helpful.
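
As an illustration of the per-group tuning mentioned above, a minimal sketch using two standard HBase settings; the values and the two "profiles" are examples only, and the group mechanism itself (provided by the attached tool) is not shown:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Example only: two standard HBase settings that a read-heavy and a write-heavy
// group could tune differently. The concrete values are illustrative, and group
// membership itself is managed by the attached tool, not shown here.
public class GroupTuningExample {
  static Configuration readOptimized() {
    Configuration conf = HBaseConfiguration.create();
    conf.setFloat("hfile.block.cache.size", 0.4f);                         // more heap for the block cache
    conf.setFloat("hbase.regionserver.global.memstore.upperLimit", 0.25f); // less for memstores
    return conf;
  }

  static Configuration writeOptimized() {
    Configuration conf = HBaseConfiguration.create();
    conf.setFloat("hfile.block.cache.size", 0.15f);
    conf.setFloat("hbase.regionserver.global.memstore.upperLimit", 0.45f); // more heap for memstores
    return conf;
  }
}
{code}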

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

2011-07-25 Thread fulin wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070342#comment-13070342
 ] 

fulin wang commented on HBASE-4124:
---

I can't find where getRegionsInTransitionInRS().add() is called, so I do not 
understand why this function was added.
About the 'already online on this server' error, I think the region should 
be closed or reassigned. I am trying to make a patch.

 ZK restarted while assigning a region, new active HM re-assign it but the RS 
 warned 'already online on this server'.
 

 Key: HBASE-4124
 URL: https://issues.apache.org/jira/browse/HBASE-4124
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: fulin wang
 Attachments: log.txt

   Original Estimate: 0.4h
  Remaining Estimate: 0.4h

 ZK restarted while assigning a region, new active HM re-assign it but the RS 
 warned 'already online on this server'.
 Issue:
 The RS failed because of 'already online on this server' and returned; the 
 HM cannot receive the message and reports 'Regions in transition timed out'.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival

2011-07-25 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070349#comment-13070349
 ] 

dhruba borthakur commented on HBASE-4132:
-

stack: the LogCleaner API allows archived logs to be deleted according to a 
configurable policy. One can set hbase.master.logcleaner.plugins to set up his 
own policy. In that sense, it is already pluggable. Moreover, this is done by 
the master, whereas the WALObserver interface is in the RegionServer. Given the 
above, do you think that this patch needs to touch LogCleaner at all? If so, 
what is your proposal?

Andrew: The WALObserver API additions should follow the same practice of 
providing before (pre) and after (post) hooks as everywhere else. In that 
sense, it already has logRollRequested and logRolled. Similarly, I added 
logArchiveStart and logArchiveCompleted. The remaining one is 
visitLogEntryBeforeWrite. Are you suggesting that we add a 
visitLogEntryAfterWrite as well? 
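
For reference, a rough sketch of the shape the extended interface could take; logRollRequested, logRolled and visitLogEntryBeforeWrite mirror the existing WALObserver methods, while the parameters of the archival hooks are assumptions for illustration, not the signatures from walArchive.txt:

{code}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.regionserver.wal.HLogKey;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;

// Sketch only; the archival hook parameters are assumed for illustration.
public interface WALObserverSketch {
  // existing hooks around log rolling
  void logRollRequested();
  void logRolled(Path newFile);

  // existing hook on the write path
  void visitLogEntryBeforeWrite(HRegionInfo info, HLogKey logKey, WALEdit logEdit);

  // proposed archival hooks (parameters assumed, not taken from walArchive.txt)
  void logArchiveStart(Path oldPath, Path newPath);
  void logArchiveCompleted(Path oldPath, Path newPath);
}
{code}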

 Extend the WALObserver API to accommodate log archival
 -

 Key: HBASE-4132
 URL: https://issues.apache.org/jira/browse/HBASE-4132
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.92.0

 Attachments: walArchive.txt


 The WALObserver interface exposes the log roll events. It would be nice to 
 extend it to accommodate log archival events as well.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4134) the total number of regions was more than the actual region count while balancing after the hbck fix.

2011-07-25 Thread feng xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

feng xu updated HBASE-4134:
---

Description: 
1. I found the problem (some regions were multiply assigned) while running hbck 
to check the cluster's health. Here's the result:
{noformat}
ERROR: Region test1,230778,1311216270050.fff783529fcd983043610eaa1cc5c2fe. is 
listed in META on region server 158-1-91-101:20020 but is multiply assigned to 
region servers 158-1-91-101:20020, 158-1-91-105:20020 
ERROR: Region test1,252103,1311216293671.fff9ed2cb69bdce535451a07686c0db5. is 
listed in META on region server 158-1-91-101:20020 but is multiply assigned to 
region servers 158-1-91-101:20020, 158-1-91-105:20020 
ERROR: Region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. is 
listed in META on region server 158-1-91-103:20020 but is multiply assigned to 
region servers 158-1-91-103:20020, 158-1-91-105:20020 
Summary: 
  -ROOT- is okay. 
Number of regions: 1 
Deployed on: 158-1-91-105:20020 
  .META. is okay. 
Number of regions: 1 
Deployed on: 158-1-91-103:20020 
  test1 is okay. 
Number of regions: 25297 
Deployed on: 158-1-91-101:20020 158-1-91-103:20020 158-1-91-105:20020 
14829 inconsistencies detected. 
Status: INCONSISTENT 
{noformat}

2. Then I tried to use hbck -fix to fix the problems, and everything seemed OK. 
But I found that the total number of regions (35029) reported by the load 
balancer was more than the actual region count (25299) while balancing after the fixing.
Here's the related logs snippet:
{noformat}
2011-07-22 02:19:02,866 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
Skipping load balancing.  servers=3 regions=25299 average=8433.0 
mostloaded=8433 
2011-07-22 03:06:11,832 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
Skipping load balancing.  servers=3 regions=35029 average=11676.333 
mostloaded=11677 leastloaded=11676
{noformat}

3. I tracked one region's behavior during the time, taking the region 
test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. as an example:
(1) It was assigned to 158-1-91-101 at first. 
(2) HBCK sent a closing request to the RegionServer, and the RegionServer closed 
it silently without notifying the HMaster.
(3) The region was still carried by RS 158-1-91-103 which was known to 
HMaster.
(4) HBCK will trigger a new assignment.

The fact is, the region was assigned again, but the old assignment information 
still remained in the sets AM#regions and AM#servers.

That's why the region count was larger than the actual number.
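
To make the double counting concrete, a small self-contained sketch; the two maps are simplified stand-ins for AM#regions and AM#servers, not the actual AssignmentManager fields:

{code}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustration only: simplified stand-ins for the AssignmentManager's
// region -> server and server -> regions bookkeeping.
public class DoubleCountSketch {
  static final Map<String, String> regions = new HashMap<String, String>();
  static final Map<String, Set<String>> servers = new HashMap<String, Set<String>>();

  static void assign(String region, String server) {
    regions.put(region, server);
    Set<String> onServer = servers.get(server);
    if (onServer == null) {
      onServer = new HashSet<String>();
      servers.put(server, onServer);
    }
    onServer.add(region);
  }

  public static void main(String[] args) {
    assign("52782c0241a598b3e37ca8729da0", "158-1-91-103");
    // HBCK triggers a new assignment, but the old server's set is never cleaned up:
    assign("52782c0241a598b3e37ca8729da0", "158-1-91-101");

    int perServerTotal = 0;
    for (Set<String> s : servers.values()) {
      perServerTotal += s.size(); // counted once per server: 2 instead of 1
    }
    System.out.println("regions=" + regions.size() + " per-server total=" + perServerTotal);
  }
}
{code}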

{noformat}
Line 178967: 2011-07-22 02:47:51,247 DEBUG 
org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: 
/hbase/unassigned/52782c0241a598b3e37ca8729da0 
(region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., 
server=HBCKServerName, state=M_ZK_REGION_OFFLINE)
Line 178968: 2011-07-22 02:47:51,247 INFO 
org.apache.hadoop.hbase.master.AssignmentManager: Handling HBCK triggered 
transition=M_ZK_REGION_OFFLINE, server=HBCKServerName, 
region=52782c0241a598b3e37ca8729da0
Line 178969: 2011-07-22 02:47:51,248 INFO 
org.apache.hadoop.hbase.master.AssignmentManager: HBCK repair is triggering 
assignment of 
region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0.
Line 178970: 2011-07-22 02:47:51,248 DEBUG 
org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan 
was found (or we are ignoring an existing plan) for 
test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. so generated a 
random one; hri=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., 
src=, dest=158-1-91-101,20020,1311231878544; 3 (online=3, exclude=null) 
available servers
Line 178971: 2011-07-22 02:47:51,248 DEBUG 
org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. to 
158-1-91-101,20020,1311231878544
Line 178983: 2011-07-22 02:47:51,285 DEBUG 
org.apache.hadoop.hbase.master.AssignmentManager: Handling 
transition=RS_ZK_REGION_OPENING, server=158-1-91-101,20020,1311231878544, 
region=52782c0241a598b3e37ca8729da0
Line 179001: 2011-07-22 02:47:51,318 DEBUG 
org.apache.hadoop.hbase.master.AssignmentManager: Handling 
transition=RS_ZK_REGION_OPENED, server=158-1-91-101,20020,1311231878544, 
region=52782c0241a598b3e37ca8729da0
Line 179002: 2011-07-22 02:47:51,319 DEBUG 
org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
event for 52782c0241a598b3e37ca8729da0; deleting unassigned node
Line 179003: 2011-07-22 02:47:51,319 DEBUG 
org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:2-0x1314ac5addb0042-0x1314ac5addb0042 Deleting existing unassigned 
node for 52782c0241a598b3e37ca8729da0 that is in expected state 
RS_ZK_REGION_OPENED
Line 179007: 2011-07-22 02:47:51,326 DEBUG 
org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:2-0x1314ac5addb0042-0x1314ac5addb0042 Successfully 

[jira] [Updated] (HBASE-4137) Implementation of Distributed Hash Table(DHT) for lookup service

2011-07-25 Thread vamshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vamshi updated HBASE-4137:
--

Description: To implement  a scalable data structure i.e Distributed hash 
table in the HBase. In the Hbase to perform fast lookup  we can take the help 
of Distributed Hash Table (DHT), a scalable data structure.  (was: To implement 
 a scalable data structure i.e Distributed hash table in the HBase.   )
Summary: Implementation of Distributed Hash Table(DHT)  for lookup 
service  (was: Implementation of Distributed Hash Table(DHT) )

 Implementation of Distributed Hash Table(DHT)  for lookup service
 -

 Key: HBASE-4137
 URL: https://issues.apache.org/jira/browse/HBASE-4137
 Project: HBase
  Issue Type: Improvement
  Components: performance
Affects Versions: 0.90.3
Reporter: vamshi

 To implement a scalable data structure, i.e. a Distributed Hash Table, in 
 HBase. To perform fast lookups in HBase, we can take the help of a 
 Distributed Hash Table (DHT), a scalable data structure.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster

2011-07-25 Thread vamshi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070371#comment-13070371
 ] 

vamshi commented on HBASE-1938:
---

Hi stack, how can we perform lookup/scanning in HBase? Can we use Distributed 
Hashing (DHT) for that? I want to implement a scalable data structure, i.e. a 
DHT, in HBase; how can I proceed? Please help me. Thank you.

 Make in-memory table scanning faster
 

 Key: HBASE-1938
 URL: https://issues.apache.org/jira/browse/HBASE-1938
 Project: HBase
  Issue Type: Improvement
  Components: performance
Reporter: stack
Assignee: stack
Priority: Blocker
 Attachments: MemStoreScanPerformance.java, 
 MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch


 This issue is about profiling hbase to see if I can make hbase scans run 
 faster when all is up in memory.  Talking to some users, they are seeing 
 about 1/4 million rows a second.  It should be able to go faster than this 
 (Scanning an array of objects, they can do about 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-25 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070374#comment-13070374
 ] 

ramkrishna.s.vasudevan commented on HBASE-3845:
---

Ted,
I tried using 'this.cacheFlushLock.isHeldByCurrentThread()'.
The problem here is that HLog.append() may be called by another thread, whereas 
HRegion.internalFlushCache() is called by the memstore flusher thread.
So if we check this.cacheFlushLock.isHeldByCurrentThread(), it returns false.

So, as per your suggestion, I have inlined the isFlushInProgress check into 
wal.startCacheFlush() and wal.abortCacheFlush(), and am still going with the 
AtomicBoolean.
Is that fine, Ted? I am planning to upload the patch with these changes.
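
Roughly, the approach described above could look like the following sketch; the field names follow the patch discussion, but the class is simplified (String keys, no locking) and is not the actual HLog code:

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified sketch, not the actual HLog code: field names follow the discussion above.
class FlushTrackingSketch {
  private final AtomicBoolean isFlushInProgress = new AtomicBoolean(false);
  private final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<String, Long>();
  private final ConcurrentMap<String, Long> seqWrittenWhileFlush = new ConcurrentHashMap<String, Long>();

  // called from the append path: while a flush is running, record the earliest
  // sequence id of edits arriving after the snapshot in a side map
  void recordAppend(String encodedRegionName, long seqNum) {
    if (isFlushInProgress.get()) {
      seqWrittenWhileFlush.putIfAbsent(encodedRegionName, seqNum);
    } else {
      lastSeqWritten.putIfAbsent(encodedRegionName, seqNum);
    }
  }

  void startCacheFlush() {
    isFlushInProgress.set(true);
  }

  void completeCacheFlush(String encodedRegionName) {
    lastSeqWritten.remove(encodedRegionName);
    Long seqWhileFlush = seqWrittenWhileFlush.remove(encodedRegionName);
    if (seqWhileFlush != null) {
      // edits added during the flush keep their earliest seq id visible
      lastSeqWritten.putIfAbsent(encodedRegionName, seqWhileFlush);
    }
    isFlushInProgress.set(false);
  }

  void abortCacheFlush() {
    isFlushInProgress.set(false);
  }
}
{code}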

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
 HBASE-3845_4.patch, HBASE-3845__trunk.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
 that we only keep track  of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-4138) If zookeeper.znode.parent is not specified explicitly in Client code then HTable object loops continuously waiting for the root region by using /hbase as the base node.

2011-07-25 Thread ramkrishna.s.vasudevan (JIRA)
If zookeeper.znode.parent is not specified explicitly in Client code then HTable 
object loops continuously waiting for the root region by using /hbase as the 
base node.
---

 Key: HBASE-4138
 URL: https://issues.apache.org/jira/browse/HBASE-4138
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.3
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.4


Change the zookeeper.znode.parent property (default is /hbase).
Now do not specify this change in the client code.

Use the HTable Object.
The HTable is not able to find the root region and keeps continuously looping.

Find the stack trace:

Object.wait(long) line: not available [native method]
RootRegionTracker(ZooKeeperNodeTracker).blockUntilAvailable(long) line: 122

RootRegionTracker.waitRootRegionLocation(long) line: 73  
HConnectionManager$HConnectionImplementation.locateRegion(byte[],
byte[], boolean) line: 578
HConnectionManager$HConnectionImplementation.locateRegion(byte[],
byte[]) line: 558
HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[],
byte[], byte[], boolean, Object) line: 687
HConnectionManager$HConnectionImplementation.locateRegion(byte[],
byte[], boolean) line: 589
HConnectionManager$HConnectionImplementation.locateRegion(byte[],
byte[]) line: 558
HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[],
byte[], byte[], boolean, Object) line: 687
HConnectionManager$HConnectionImplementation.locateRegion(byte[],
byte[], boolean) line: 593
HConnectionManager$HConnectionImplementation.locateRegion(byte[],
byte[]) line: 558
HTable.init(Configuration, byte[]) line: 171   
HTable.init(Configuration, String) line: 145   
HBaseTest.test() line: 45
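
For context, the loop only happens because the client falls back to the default base node; when the client configuration carries the cluster's zookeeper.znode.parent, HTable locates the root region normally. A minimal sketch (the '/hbase-1' value and table name are examples only):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class ZnodeParentClientExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Must match the value configured on the cluster; "/hbase-1" is only an example.
    // Without it, the client watches the default "/hbase" parent and blocks in
    // blockUntilAvailable() waiting for a root region that is never published there.
    conf.set("zookeeper.znode.parent", "/hbase-1");
    HTable table = new HTable(conf, "testTable");
    // ... use the table ...
    table.close();
  }
}
{code}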

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-2827) HBase Client doesn't handle master failover

2011-07-25 Thread vamshi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070383#comment-13070383
 ] 

vamshi commented on HBASE-2827:
---

Hi Jonathan, maybe this question is irrelevant in this place, but please let me 
know whether we can implement distributed hashing in HBase for fast lookup/ 
scanning purposes. I want to implement a scalable data structure, i.e. a DHT, in 
HBase; how can I proceed? Thank you.

 HBase Client doesn't handle master failover
 ---

 Key: HBASE-2827
 URL: https://issues.apache.org/jira/browse/HBASE-2827
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.0
Reporter: Nicolas Spiegelberg
Assignee: Jonathan Gray

 A client on our beta tier was stuck in this exception loop when we issued a 
 new HMaster after the old one died:
 Exception while trying to connect hBase
 java.lang.reflect.UndeclaredThrowableException
 at $Proxy1.getClusterStatus(Unknown Source)
 at 
 org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(HBaseAdmin.java:912)
 at org.apache.hadoop.hbase.client.HTable.getCurrentNrHRS(HTable.java:170)
 at org.apache.hadoop.hbase.client.HTable.init(HTable.java:143)
 ...
 at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
 at java.lang.Thread.run(Thread.java:619)
 Caused by: java.net.SocketTimeoutException: 2 millis timeout while 
 waiting for channel to be ready for connect. ch : 
 java.nio.channels.SocketChannel[connection-pending remote=/10.18.34.212:6]
 at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:213)
 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:406)
 at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:309)
 at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:856)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:724)
 at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:252)
 ... 20 more
 12:52:55,863 [pool-4-thread-5182] INFO PersistentUtil:153 - Retry after 1 
 second...
 Looking at the client code, the HConnectionManager does not watch ZK for 
 NodeDeleted or NodeCreated events on /hbase/master
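
A bare-bones sketch of the kind of watch the description says is missing, using the plain ZooKeeper client API; this is only an illustration of watching /hbase/master for NodeCreated/NodeDeleted, not the actual HConnectionManager change:

{code}
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Illustration only (plain ZooKeeper API, not the HConnectionManager change):
// react when /hbase/master is deleted or recreated so the cached master stub
// can be dropped and re-resolved.
public class MasterNodeWatchSketch implements Watcher {
  private final ZooKeeper zk;

  public MasterNodeWatchSketch(String quorum) throws Exception {
    this.zk = new ZooKeeper(quorum, 30000, this);
    zk.exists("/hbase/master", this); // sets a watch whether or not the node exists yet
  }

  @Override
  public void process(WatchedEvent event) {
    if (!"/hbase/master".equals(event.getPath())) {
      return;
    }
    if (event.getType() == Event.EventType.NodeDeleted) {
      // old master gone: invalidate the cached master address
    } else if (event.getType() == Event.EventType.NodeCreated) {
      // new master published its address: re-read the znode and reconnect
    }
    try {
      zk.exists("/hbase/master", this); // watches are one-shot, so re-register
    } catch (Exception ignored) {
      // sketch: real code would handle session loss here
    }
  }
}
{code}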

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-2645) HLog writer can do 1-2 sync operations after lease has been recovered for split process.

2011-07-25 Thread vamshi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070384#comment-13070384
 ] 

vamshi commented on HBASE-2645:
---

Hi Todd, maybe this question is irrelevant in this place, but please let me 
know whether we can implement distributed hashing in HBase for fast lookup/ 
scanning purposes. I want to implement a scalable data structure, i.e. a DHT, in 
HBase; how can I proceed? Thank you.

 HLog writer can do 1-2 sync operations after lease has been recovered for 
 split process.
 

 Key: HBASE-2645
 URL: https://issues.apache.org/jira/browse/HBASE-2645
 Project: HBase
  Issue Type: Bug
  Components: filters
Affects Versions: 0.90.4
Reporter: Cosmin Lehene
Assignee: Todd Lipcon
Priority: Blocker
 Fix For: 0.94.0


 TestHLogSplit.testLogCannotBeWrittenOnceParsed is failing. 
 This test starts a thread that writes one edit to the log, syncs and counts. 
 During this, a HLog.splitLog operation is started. splitLog recovers the log 
 lease before reading the log, so that the original regionserver could not 
 wake up and write after the split process started.  
 The test compares the number of edits reported by the split process and by 
 the writer thread. The writer thread (called zombie in the test) should 
 report a count <= the one reported by splitLog (sync() might raise after the 
 last edit gets written and the edit won't get counted by the zombie thread). 
 However it appears that the zombie counts 1-2 more edits. So it looks like it 
 can sync without a lease.
 This might be a hdfs-0.20 related issue. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster

2011-07-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070458#comment-13070458
 ] 

nkeywal commented on HBASE-1938:


Hello Stack,

accesses and perhaps to make it go faster.
I will have a look at it, I see as well in this test and in the global
profiling that a lot of time is spent on it.


scanner.next();

There are two iterators in the class (kvsetIt and snapshotIt), and getLowest
compares the two to return the lowest. However, in this test, one of the lists
is empty, so its value is null, and hence the real comparison on byte[] is
not executed.

On this subject, there is a possible optimisation of the function peek,
which repeats the comparison: if peek is called multiple times, or if we
often have peek() then next(), we can save the redundant comparisons. To me,
it makes sense to precalculate the value returned by peek and reuse it in
next(), as in the sketch below.
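
A simplified sketch of that idea; getLowest() stands in for the merge of the two iterators, and the class is not the actual MemStoreScanner:

{code}
import org.apache.hadoop.hbase.KeyValue;

// Simplified sketch of caching the value peek() computes so next() does not
// repeat the comparison; getLowest() stands in for the merge of the two iterators.
abstract class PeekCachingSketch {
  private KeyValue theNext; // precalculated result shared by peek() and next()

  protected abstract KeyValue getLowest(); // fetches and compares the two iterator heads

  public synchronized KeyValue peek() {
    if (theNext == null) {
      theNext = getLowest();
    }
    return theNext;
  }

  public synchronized KeyValue next() {
    KeyValue ret = (theNext != null) ? theNext : getLowest();
    theNext = null; // force recomputation on the following peek()/next()
    return ret;
  }
}
{code}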



The profiling (method: sampling, java inlining deactivated) says something
interesting:

Name; total time spent
org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.next() 100%
org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.getNext(Iterator) 88%
org.apache.hadoop.hbase.regionserver.KeyValueSkipListSet$MapEntryIterator.next() 44%
java.util.concurrent.ConcurrentSkipListMap$SubMap$SubMapEntryIterator.next() 36%
org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.getThreadReadPoint() 26%
java.lang.ThreadLocal.get() 21%
org.apache.hadoop.hbase.regionserver.KeyValueSkipListSet$MapEntryIterator.hasNext() 8%
org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.getLowest() 7%
java.util.concurrent.ConcurrentSkipListMap$SubMap$SubMapIter.hasNext() 3%
org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.getLower(KeyValue, KeyValue) 3%
java.lang.Long.longValue() 2%



So we're spending 26% of the time on this:
org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.getThreadReadPoint() 26%

And in this getThreadReadPoint(), the actual time is spent in:
java.lang.ThreadLocal.get() 21%

It's a TLS, so we can expect a system call to get the thread id. It would be
great to save this call in next().

There is at least an improvement for the case when one of the lists is done:
don't call getThreadReadPoint() for it. That would not change the behaviour
at all, but would already be interesting (maybe 10% in this test).
Another option is to share the getThreadReadPoint() value between the two
iterators, i.e. read the value in the next() function and pass it as a
parameter to getNext(), as in the sketch below. In fact, as this value seems
to be a TLS, I don't see how it could change during the execution of next().
What do you think?
Last question on this: what is the use case in which getThreadReadPoint()
will change during the whole scan (i.e. between next() calls)?
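
A sketch of that option, reading the read point once per next() call and passing it down; the method shapes are simplified stand-ins, not the actual MemStore code:

{code}
import java.util.Iterator;
import org.apache.hadoop.hbase.KeyValue;

// Sketch only: read the read point once per next() call and pass it to both
// getNext() calls instead of hitting the ThreadLocal in each of them.
abstract class SharedReadPointSketch {
  protected Iterator<KeyValue> kvsetIt;
  protected Iterator<KeyValue> snapshotIt;

  protected abstract long getThreadReadPoint();                   // the ThreadLocal lookup
  protected abstract KeyValue getNext(Iterator<KeyValue> it, long readPoint);
  protected abstract KeyValue getLowest(KeyValue a, KeyValue b);

  public KeyValue next() {
    long readPoint = getThreadReadPoint(); // single ThreadLocal access per call
    return getLowest(getNext(kvsetIt, readPoint), getNext(snapshotIt, readPoint));
  }
}
{code}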


Most of the public methods (except reseek) are synchronized; does this imply
that the scanner can be shared between threads?


In the end, it seems that there are 3 possible things to do:
1) Replacement of KeyValue lowest = getLowest();
2) theNext precalculation for peek() and next()
3) Depending on your feedback, one of the options above on
getThreadReadPoint().

This should give a 5 to 15% increase in performance; not a problem-solved
kind of change, but it could justify a first patch. I can do it (with the
hbase indenting :-)



On Sun, Jul 24, 2011 at 12:23 AM, stack (JIRA) j...@apache.org wrote:



 Make in-memory table scanning faster
 

 Key: HBASE-1938
 URL: https://issues.apache.org/jira/browse/HBASE-1938
 Project: HBase
  Issue Type: Improvement
  Components: performance
Reporter: stack
Assignee: stack
Priority: Blocker
 Attachments: MemStoreScanPerformance.java, 
 MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch


 This issue is about profiling hbase to see if I can make hbase scans run 
 faster when all is up in memory.  Talking to some users, they are seeing 
 about 1/4 million rows a second.  It should be able to go faster than this 
 (Scanning an array of objects, they can do about 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-25 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070470#comment-13070470
 ] 

Ted Yu commented on HBASE-3845:
---

That is fine. 

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
 HBASE-3845_4.patch, HBASE-3845__trunk.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
 that we only keep track  of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-25 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-3845:
--

Status: Open  (was: Patch Available)

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
 HBASE-3845_4.patch, HBASE-3845__trunk.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
 that we only keep track  of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-25 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-3845:
--

Status: Patch Available  (was: Open)

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
 HBASE-3845_4.patch, HBASE-3845_5.patch, HBASE-3845__trunk.patch, 
 HBASE-3845_trunk_2.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
 that we only keep track  of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-25 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-3845:
--

Attachment: HBASE-3845_5.patch

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
 HBASE-3845_4.patch, HBASE-3845_5.patch, HBASE-3845__trunk.patch, 
 HBASE-3845_trunk_2.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
 that we only keep track  of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-25 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-3845:
--

Attachment: HBASE-3845_trunk_2.patch

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
 HBASE-3845_4.patch, HBASE-3845_5.patch, HBASE-3845__trunk.patch, 
 HBASE-3845_trunk_2.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
 that we only keep track  of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4134) The total number of regions was more than the actual region count after the hbck fix

2011-07-25 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070498#comment-13070498
 ] 

Ted Yu commented on HBASE-4134:
---

https://issues.apache.org/jira/browse/HBASE-4053 is in 0.90.4 RC1
Do you want to try out RC1 to see if the situation of double counting has 
improved ?

 The total number of regions was more than the actual region count after the 
 hbck fix
 

 Key: HBASE-4134
 URL: https://issues.apache.org/jira/browse/HBASE-4134
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: feng xu
 Fix For: 0.90.4


 1. I found the problem(some regions were multiply assigned) while running 
 hbck to check the cluster's health. Here's the result:
 {noformat}
 ERROR: Region test1,230778,1311216270050.fff783529fcd983043610eaa1cc5c2fe. is 
 listed in META on region server 158-1-91-101:20020 but is multiply assigned 
 to region servers 158-1-91-101:20020, 158-1-91-105:20020 
 ERROR: Region test1,252103,1311216293671.fff9ed2cb69bdce535451a07686c0db5. is 
 listed in META on region server 158-1-91-101:20020 but is multiply assigned 
 to region servers 158-1-91-101:20020, 158-1-91-105:20020 
 ERROR: Region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. is 
 listed in META on region server 158-1-91-103:20020 but is multiply assigned 
 to region servers 158-1-91-103:20020, 158-1-91-105:20020 
 Summary: 
   -ROOT- is okay. 
 Number of regions: 1 
 Deployed on: 158-1-91-105:20020 
   .META. is okay. 
 Number of regions: 1 
 Deployed on: 158-1-91-103:20020 
   test1 is okay. 
 Number of regions: 25297 
 Deployed on: 158-1-91-101:20020 158-1-91-103:20020 158-1-91-105:20020 
 14829 inconsistencies detected. 
 Status: INCONSISTENT 
 {noformat}
 2. Then I tried to use hbck -fix to fix the problem. Everything seemed ok. 
 But I found that the total number of regions reported by load balancer 
 (35029) was more than the actual region count(25299) after the fixing.
 Here's the related logs snippet:
 {noformat}
 2011-07-22 02:19:02,866 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
 Skipping load balancing.  servers=3 regions=25299 average=8433.0 
 mostloaded=8433 
 2011-07-22 03:06:11,832 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
 Skipping load balancing.  servers=3 regions=35029 average=11676.333 
 mostloaded=11677 leastloaded=11676
 {noformat}
 3. I tracked one region's behavior during the time. Taking the region of 
 test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. as example:
 (1) It was assigned to 158-1-91-101 at first. 
 (2) HBCK sent a closing request to the RegionServer, and the RegionServer 
 closed it silently without notifying the HMaster.
 (3) The region was still carried by RS 158-1-91-103 which was known to 
 HMaster.
 (4) HBCK will trigger a new assignment.
 The fact is, the region was assigned again, but the old assignment 
 information still remained in AM#regions and AM#servers.
 That's why the region count was larger than the actual number.
 {noformat}
 Line 178967: 2011-07-22 02:47:51,247 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase/unassigned/52782c0241a598b3e37ca8729da0 
 (region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., 
 server=HBCKServerName, state=M_ZK_REGION_OFFLINE)
 Line 178968: 2011-07-22 02:47:51,247 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling HBCK triggered 
 transition=M_ZK_REGION_OFFLINE, server=HBCKServerName, 
 region=52782c0241a598b3e37ca8729da0
 Line 178969: 2011-07-22 02:47:51,248 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: HBCK repair is triggering 
 assignment of 
 region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0.
 Line 178970: 2011-07-22 02:47:51,248 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan 
 was found (or we are ignoring an existing plan) for 
 test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. so generated a 
 random one; hri=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., 
 src=, dest=158-1-91-101,20020,1311231878544; 3 (online=3, exclude=null) 
 available servers
 Line 178971: 2011-07-22 02:47:51,248 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. to 
 158-1-91-101,20020,1311231878544
 Line 178983: 2011-07-22 02:47:51,285 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=158-1-91-101,20020,1311231878544, 
 region=52782c0241a598b3e37ca8729da0
 Line 179001: 2011-07-22 02:47:51,318 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 

[jira] [Commented] (HBASE-4138) If zookeeper.znode.parent is not specified explicitly in Client code then HTable object loops continuously waiting for the root region by using /hbase as the base node.

2011-07-25 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070520#comment-13070520
 ] 

ramkrishna.s.vasudevan commented on HBASE-4138:
---

1. I tried to identify the problem in HBASE-4138 and ended up with the 
following analysis.

The HMaster creates the base node, along with the unassigned node, RS node and 
table node, based on the zookeeper.znode.parent property.

Currently, when we use HTable() as part of getConnection(), if this value is 
not configured we tend to create a new connection.

Two points to note here:
1) The HTable documentation clearly tells us to use the same configuration 
object.

But what if that is not done, particularly if someone forgets to set this base 
node property? It may even be the case that the property is configured in my 
RS instance but not in the master instance.

2) The reuse of the getConnection() logic across all levels: was it intended?

The major problem lies in HConnectionManager.setupZookeeperTrackers(), which 
tries to create the base nodes again.

What I feel here is that this should not be done; only the master should have 
the right to create them, otherwise there is a high possibility that multiple 
base nodes can be created.

Currently, as the client creates the node once again with the default value 
'/hbase', the client keeps waiting indefinitely to learn the root location.

What happens in the Admin case:
The same thing happens in the admin case, but in HBaseAdmin() we call the 
connection.getMaster API, which throws an exception:
'ZooKeeper available but no active master location found'

So we should prevent the Admin or HTable (in general any client, even an RS) 
from creating the base nodes, and whatever is created by the master should be 
used by the clients.
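
One way to express that last point in code: client-side connection setup could verify that the configured base node already exists (created by the master) rather than creating it. This is only a sketch of the idea, not the eventual fix:

{code}
import org.apache.hadoop.hbase.zookeeper.ZKUtil;
import org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher;

// Sketch of the idea only, not the eventual fix: fail fast if the configured
// parent znode was never created by the master (for example because
// zookeeper.znode.parent differs between the client and the cluster).
public class BaseNodeCheckSketch {
  static void verifyBaseNode(ZooKeeperWatcher zkw, String baseZNode) throws Exception {
    if (ZKUtil.checkExists(zkw, baseZNode) == -1) {
      throw new IllegalStateException("Base znode " + baseZNode
          + " does not exist; check zookeeper.znode.parent in the client configuration");
    }
  }
}
{code}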

 If zookeeper.znode.parent is not specified explicitly in Client code then 
 HTable object loops continuously waiting for the root region by using /hbase 
 as the base node.
 ---

 Key: HBASE-4138
 URL: https://issues.apache.org/jira/browse/HBASE-4138
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.3
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.4


 Change the zookeeper.znode.parent property (default is /hbase).
 Now do not specify this change in the client code.
 Use the HTable Object.
 The HTable is not able to find the root region and keeps continuously looping.
 Find the stack trace:
 
 Object.wait(long) line: not available [native method]  
 RootRegionTracker(ZooKeeperNodeTracker).blockUntilAvailable(long) line: 122
 RootRegionTracker.waitRootRegionLocation(long) line: 73
 HConnectionManager$HConnectionImplementation.locateRegion(byte[],
 byte[], boolean) line: 578
 HConnectionManager$HConnectionImplementation.locateRegion(byte[],
 byte[]) line: 558
 HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[],
 byte[], byte[], boolean, Object) line: 687
 HConnectionManager$HConnectionImplementation.locateRegion(byte[],
 byte[], boolean) line: 589
 HConnectionManager$HConnectionImplementation.locateRegion(byte[],
 byte[]) line: 558
 HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[],
 byte[], byte[], boolean, Object) line: 687
 HConnectionManager$HConnectionImplementation.locateRegion(byte[],
 byte[], boolean) line: 593
 HConnectionManager$HConnectionImplementation.locateRegion(byte[],
 byte[]) line: 558
 HTable.init(Configuration, byte[]) line: 171 
 HTable.init(Configuration, String) line: 145 
 HBaseTest.test() line: 45

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-25 Thread Prakash Khemani (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070542#comment-13070542
 ] 

Prakash Khemani commented on HBASE-3845:


In the patch that is deployed internally we have implemented a different 
approach. We remove the region's entry in startCacheFlush() and save it (as 
opposed to the current behavior of removing the entry in completeCacheFlush()). 
If the flush aborts then we restore the saved entry.
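
Roughly, the internally deployed approach described above amounts to something like this sketch (simplified names and String keys; not the actual patch):

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Simplified sketch of the approach described above (not the actual patch):
// remove and remember the region's earliest seq id in startCacheFlush(), and
// restore it only if the flush aborts.
class StartFlushSketch {
  private final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<String, Long>();
  private final ConcurrentMap<String, Long> savedForFlush = new ConcurrentHashMap<String, Long>();

  void startCacheFlush(String encodedRegionName) {
    Long seq = lastSeqWritten.remove(encodedRegionName);
    if (seq != null) {
      savedForFlush.put(encodedRegionName, seq); // remembered in case the flush aborts
    }
  }

  void completeCacheFlush(String encodedRegionName) {
    savedForFlush.remove(encodedRegionName); // flush succeeded, the saved entry is obsolete
  }

  void abortCacheFlush(String encodedRegionName) {
    Long saved = savedForFlush.remove(encodedRegionName);
    if (saved != null) {
      lastSeqWritten.putIfAbsent(encodedRegionName, saved); // restore the pre-flush entry
    }
  }
}
{code}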

The approach taken in the latest patch in this jira might also be OK. I have a 
few comments

{noformat}
   this.lastSeqWritten.remove(encodedRegionName);
+  Long seqWhileFlush = this.seqWrittenWhileFlush.get(encodedRegionName);
+  if (null != seqWhileFlush) {
+    this.lastSeqWritten.putIfAbsent(encodedRegionName, seqWhileFlush);
+    this.seqWrittenWhileFlush.remove(encodedRegionName);
+   
{noformat}

seqWrittenWhileFlush.get() and the subsequent .remove() can be replaced by a 
single .remove():
{code}
Long seqWhileFlush = this.seqWrittenWhileFlush.remove(encodedRegionName);
if (null != seqWhileFlush) {
  lSW.put(encodedRegionName, seqWhileFlush);
} else {
  lSW.remove(encodedRegionName);
}
{code}

==
The bigger problem here is that completeCacheFlush() is not called with 
updatedLock acquired. Therefore there might still be correctness issues with 
the latest patch.

==

{noformat}
   public void abortCacheFlush() {
+    this.isFlushInProgress.set(false);
     this.cacheFlushLock.unlock();
   }
{noformat}
Shouldn't seqWrittenWhileFlush be cleaned up in abortCacheFlush()?


 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
 HBASE-3845_4.patch, HBASE-3845_5.patch, HBASE-3845__trunk.patch, 
 HBASE-3845_trunk_2.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
 that we only keep track  of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-25 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070545#comment-13070545
 ] 

Ted Yu commented on HBASE-3845:
---

@Prakash:
Would you be able to share your patch ?

 The bigger problem here is that completeCacheFlush() is not called with 
 updatedLock acquired.
See line 1154 in HLog:
{code}
  synchronized (updateLock) {
{code}

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
 HBASE-3845_4.patch, HBASE-3845_5.patch, HBASE-3845__trunk.patch, 
 HBASE-3845_trunk_2.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
 that we only keep track  of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accomodate log archival

2011-07-25 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070588#comment-13070588
 ] 

Andrew Purtell commented on HBASE-4132:
---

@dhruba Thanks. For example, the hooks should be preArchiveStart and 
postArchiveStart, preArchiveCompleted and postArchiveCompleted. In part it is 
a naming convention, in part it is a contract: pre hooks allow introduction of 
preprocessing and, importantly, override of default behavior, with the 
associated short-circuiting of base processing and of any additional 
coprocessors. Post hooks allow introduction of postprocessing and modification 
of return values.
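For concreteness, a rough sketch of what such archival hooks could look like on 
the coprocessor-side observer; the hook names and the wildcarded environment 
type are illustrative assumptions, not the actual interface in the patch:
{code}
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;

// Illustrative sketch only -- hook names and signatures are assumptions.
public interface WALArchivalHooks {
  /** Runs before archival of oldPath begins; a pre hook may preprocess and,
      via the context, short-circuit default behavior and later coprocessors. */
  void preWALArchive(ObserverContext<?> ctx, Path oldPath, Path newPath)
      throws IOException;

  /** Runs after archival completes; a post hook may postprocess results. */
  void postWALArchive(ObserverContext<?> ctx, Path oldPath, Path newPath)
      throws IOException;
}
{code}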

 Extend the WALObserver API to accomodate log archival
 -

 Key: HBASE-4132
 URL: https://issues.apache.org/jira/browse/HBASE-4132
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.92.0

 Attachments: walArchive.txt


 The WALObserver interface exposes the log roll events. It would be nice to 
 extend it to accomodate log archival events as well.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-25 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-3845:
--

Attachment: HBASE-3845_6.patch

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
 HBASE-3845_4.patch, HBASE-3845_5.patch, HBASE-3845_6.patch, 
 HBASE-3845__trunk.patch, HBASE-3845_trunk_2.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
 that we only keep track  of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-25 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-3845:
--

Status: Open  (was: Patch Available)

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
 HBASE-3845_4.patch, HBASE-3845_5.patch, HBASE-3845_6.patch, 
 HBASE-3845__trunk.patch, HBASE-3845_trunk_2.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
 that we only keep track  of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-25 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-3845:
--

Attachment: HBASE-3845_trunk_3.patch

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
 HBASE-3845_4.patch, HBASE-3845_5.patch, HBASE-3845_6.patch, 
 HBASE-3845__trunk.patch, HBASE-3845_trunk_2.patch, HBASE-3845_trunk_3.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
 that we only keep track  of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-25 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-3845:
--

Status: Patch Available  (was: Open)

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
 HBASE-3845_4.patch, HBASE-3845_5.patch, HBASE-3845_6.patch, 
 HBASE-3845__trunk.patch, HBASE-3845_trunk_2.patch, HBASE-3845_trunk_3.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
 that we only keep track  of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-4139) [stargate] Update ScannerModel with support for filter package additions

2011-07-25 Thread Andrew Purtell (JIRA)
[stargate] Update ScannerModel with support for filter package additions


 Key: HBASE-4139
 URL: https://issues.apache.org/jira/browse/HBASE-4139
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.92.0


Filters have been added to the o.a.h.h.filters package without updating 
o.a.h.h.rest.model.ScannerModel. Bring ScannerModel up to date.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-25 Thread Prakash Khemani (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070614#comment-13070614
 ] 

Prakash Khemani commented on HBASE-3845:


In the method internalFlushcache() I don't see updatesLock.writeLock() being 
held around the following piece of code.

{code}
if (wal != null) {
  wal.completeCacheFlush(this.regionInfo.getEncodedNameAsBytes(),
regionInfo.getTableDesc().getName(), completeSequenceId,
this.getRegionInfo().isMetaRegion());
}
{code}
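For illustration only, holding the lock across that call would look roughly 
like the following; a sketch of the concern, not the internal patch mentioned 
below:
{code}
// Sketch only: take HRegion.updatesLock around the existing call so that no
// append runs concurrently with the lastSeqWritten bookkeeping done inside
// completeCacheFlush().
this.updatesLock.writeLock().lock();
try {
  if (wal != null) {
    wal.completeCacheFlush(this.regionInfo.getEncodedNameAsBytes(),
      regionInfo.getTableDesc().getName(), completeSequenceId,
      this.getRegionInfo().isMetaRegion());
  }
} finally {
  this.updatesLock.writeLock().unlock();
}
{code}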

==

I will upload the internal patch for reference ...





 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
 HBASE-3845_4.patch, HBASE-3845_5.patch, HBASE-3845_6.patch, 
 HBASE-3845__trunk.patch, HBASE-3845_trunk_2.patch, HBASE-3845_trunk_3.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
 that we only keep track  of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3899) enhance HBase RPC to support free-ing up server handler threads even if response is not ready

2011-07-25 Thread Vlad Dogaru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vlad Dogaru updated HBASE-3899:
---

Attachment: HBASE-3899.patch

@stack, followup from review board: HBaseServer.Call uses warnResponseSize from 
parent class.

Also, similar code is in production on Facebook clusters. This patch only adds 
and tests new behavior, but it is not actually used yet.

 enhance HBase RPC to support free-ing up server handler threads even if 
 response is not ready
 -

 Key: HBASE-3899
 URL: https://issues.apache.org/jira/browse/HBASE-3899
 Project: HBase
  Issue Type: Improvement
  Components: ipc
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: HBASE-3899.patch, asyncRpc.txt, asyncRpc.txt


 In the current implementation, the server handler thread picks up an item 
 from the incoming callqueue, processes it and then wraps the response as a 
 Writable and sends it back to the IPC server module. This wastes 
 thread-resources when the thread is blocked for disk IO (transaction logging, 
 read into block cache, etc).
 It would be nice if we can make the RPC Server Handler threads pick up a call 
 from the IPC queue, hand it over to the application (e.g. HRegion), the 
 application can queue it to be processed asynchronously and send a response 
 back to the IPC server module saying that the response is not ready. The RPC 
 Server Handler thread is now ready to pick up another request from the 
 incoming callqueue. When the queued call is processed by the application, it 
 indicates to the IPC module that the response is now ready to be sent back to 
 the client.
 The RPC client continues to experience the same behaviour as before. A RPC 
 client is synchronous and blocks till the response arrives.
 This RPC enhancement allows us to do very powerful things with the 
 RegionServer. In future, we can enhance the RegionServer's threading 
 model to a message-passing model for better performance. We will not be 
 limited by the number of threads in the RegionServer.
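 As a rough illustration of that flow, a hypothetical minimal interface (names 
 and signatures here are illustrative, not necessarily those of the Delayable 
 class in the attached patch):
{code}
import java.io.IOException;

// Hypothetical sketch of the delayed-response contract described above.
public interface DelayedCall {
  /** Called on the handler thread: do not send a response yet; the handler is
      freed to pick up the next call from the incoming callqueue. */
  void startDelay();

  /** Called later, from the application's own thread, once the result is
      ready; the IPC layer then writes the response to the still-blocked
      client. */
  void endDelay(Object result) throws IOException;
}
{code}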

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3899) enhance HBase RPC to support free-ing up server handler threads even if response is not ready

2011-07-25 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070727#comment-13070727
 ] 

jirapos...@reviews.apache.org commented on HBASE-3899:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1174/#review1179
---


I think we need some additional metrics for number of outstanding (delayed) 
calls... how do we debug cases where calls are getting orphaned?


src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java
https://reviews.apache.org/r/1174/#comment2475

RPC calls can return Writables or any java primitive supported by 
ObjectWritable.

So, this should probably be Object result.



src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
https://reviews.apache.org/r/1174/#comment2476

this isn't your code... but this expression is always true!



src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
https://reviews.apache.org/r/1174/#comment2477

this is a no-op. need proper error handling



src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
https://reviews.apache.org/r/1174/#comment2478

assert this.delayResponse



src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
https://reviews.apache.org/r/1174/#comment2479

assert !delayResponse



src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
https://reviews.apache.org/r/1174/#comment2480

if !delayResponse, would we ever have response == null?



src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
https://reviews.apache.org/r/1174/#comment2481

shouldn't this just be a call to enqueueInSelector now?


- Todd


On 2011-07-22 00:17:13, Vlad Dogaru wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1174/
bq.  ---
bq.  
bq.  (Updated 2011-07-22 00:17:13)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Free up RPC server Handler thread if the called routine specifies the call 
should be delayed. The RPC client sees no difference, changes are server-side 
only. This is based on the previous submitted patch from Dhruba.
bq.  
bq.  
bq.  This addresses bug HBASE-3899.
bq.  https://issues.apache.org/jira/browse/HBASE-3899
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java 
PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java 0da7f9e 
bq.src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 61d3915 
bq.src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/1174/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Unit tests run. Also, the patch includes a new unit test.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Vlad
bq.  
bq.



 enhance HBase RPC to support free-ing up server handler threads even if 
 response is not ready
 -

 Key: HBASE-3899
 URL: https://issues.apache.org/jira/browse/HBASE-3899
 Project: HBase
  Issue Type: Improvement
  Components: ipc
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: HBASE-3899.patch, asyncRpc.txt, asyncRpc.txt


 In the current implementation, the server handler thread picks up an item 
 from the incoming callqueue, processes it and then wraps the response as a 
 Writable and sends it back to the IPC server module. This wastes 
 thread-resources when the thread is blocked for disk IO (transaction logging, 
 read into block cache, etc).
 It would be nice if we can make the RPC Server Handler threads pick up a call 
 from the IPC queue, hand it over to the application (e.g. HRegion), the 
 application can queue it to be processed asynchronously and send a response 
 back to the IPC server module saying that the response is not ready. The RPC 
 Server Handler thread is now ready to pick up another request from the 
 incoming callqueue. When the queued call is processed by the application, it 
 indicates to the IPC module that the response is now ready to be sent back to 
 the client.
 The RPC client continues to experience the same behaviour as before. A RPC 
 client is synchronous and blocks till the response arrives.
 This RPC enhancement allows us to do very powerful things with the 
 RegionServer. In future, we can enhance the RegionServer's threading 
 model to a message-passing model for better performance. We will not be 
 limited by the number of threads in the RegionServer.

[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accomodate log archival

2011-07-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070752#comment-13070752
 ] 

stack commented on HBASE-4132:
--

@Dhruba

bq. Given the above, do you think that this patch needs to touch LogCleaner at 
all? If so, what is ur proposal?

No proposal.  Just wanted to point you at some utility we have already that you 
might not have known about and that might have helped you composing your 
addition.



 Extend the WALObserver API to accomodate log archival
 -

 Key: HBASE-4132
 URL: https://issues.apache.org/jira/browse/HBASE-4132
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.92.0

 Attachments: walArchive.txt


 The WALObserver interface exposes the log roll events. It would be nice to 
 extend it to accomodate log archival events as well.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4138) If zookeeper.znode.parent is not specifed explicitly in Client code then HTable object loops continuously waiting for the root region by using /hbase as the base node.

2011-07-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070755#comment-13070755
 ] 

stack commented on HBASE-4138:
--

@Ram Your reasoning sounds right to me.  I agree: we should prevent the 
Admin or HTable (in general any client, even a RS) from creating the base 
nodes, and whatever is created by the master should be used by the clients.

Thanks for digging in on this.

 If zookeeper.znode.parent is not specifed explicitly in Client code then 
 HTable object loops continuously waiting for the root region by using /hbase 
 as the base node.
 ---

 Key: HBASE-4138
 URL: https://issues.apache.org/jira/browse/HBASE-4138
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.3
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.4


 Change the zookeeper.znode.parent property (default is /hbase).
 Now do not specify this change in the client code.
 Use the HTable Object.
 The HTable is not able to find the root region and keeps continuously looping.
 Find the stack trace:
 
 Object.wait(long) line: not available [native method]  
 RootRegionTracker(ZooKeeperNodeTracker).blockUntilAvailable(long) line: 122
 RootRegionTracker.waitRootRegionLocation(long) line: 73
 HConnectionManager$HConnectionImplementation.locateRegion(byte[],
 byte[], boolean) line: 578
 HConnectionManager$HConnectionImplementation.locateRegion(byte[],
 byte[]) line: 558
 HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[],
 byte[], byte[], boolean, Object) line: 687
 HConnectionManager$HConnectionImplementation.locateRegion(byte[],
 byte[], boolean) line: 589
 HConnectionManager$HConnectionImplementation.locateRegion(byte[],
 byte[]) line: 558
 HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[],
 byte[], byte[], boolean, Object) line: 687
 HConnectionManager$HConnectionImplementation.locateRegion(byte[],
 byte[], boolean) line: 593
 HConnectionManager$HConnectionImplementation.locateRegion(byte[],
 byte[]) line: 558
 HTable.init(Configuration, byte[]) line: 171 
 HTable.init(Configuration, String) line: 145 
 HBaseTest.test() line: 45

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4137) Implementation of Distributed Hash Table(DHT) for lookup service

2011-07-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070804#comment-13070804
 ] 

stack commented on HBASE-4137:
--

Can you describe what you are trying to do?  What do you mean by 'lookup 
service'?  What are you looking up?  And why would we put in place a DHT for 
lookups when there is already a means of locating data?  Thanks.

 Implementation of Distributed Hash Table(DHT)  for lookup service
 -

 Key: HBASE-4137
 URL: https://issues.apache.org/jira/browse/HBASE-4137
 Project: HBase
  Issue Type: Improvement
  Components: performance
Affects Versions: 0.90.3
Reporter: vamshi

 To implement a scalable data structure, i.e. a Distributed Hash Table, in 
 HBase. To perform fast lookups in HBase, we can take the help of a 
 Distributed Hash Table (DHT), a scalable data structure.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster

2011-07-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070819#comment-13070819
 ] 

stack commented on HBASE-1938:
--

bq. To me, it makes sense to precalculate the value returned by peek, and 
reuse it in next().

If there is no chance of the value changing between the peek and next, it 
sounds good (I've not looked at this code in a while).

bq.  It would be great to save this system call in a next().

Yes (I like how you figure there's a system call doing thread local get).

bq. In fact, as this value seems to be a TLS, I don't see how it could change 
during the execution of next(). What do you think?

(I'm being lazy.  I've not looked at the code).  The updates to RWCC happen at 
well-defined points so should be easy enough to elicit if there is a problem w/ 
your presumption above.

bq. Last question on this: what is the use case when the getThreadReadPoint() 
will change during the whole scan (i.e.: between next)?

IIRC, we want to let the scan see the most up-to-date view on a row though our 
guarantees are less than this (See http://hbase.apache.org/acid-semantics.html).

bq. Most of the public methods (except reseek) are synchronized, it implies 
that the scanner can be shared between threads?

That seems like a valid deduction to make.

bq. 1) Replacement of KeyValue lowest = getLowest();

You mean in MemStore#reseek?  What would you put in its place (Sorry if I'm not 
following the bouncing ball properly).

bq. ...don't get the data getThreadReadPoint()

So, we'd just hold to the current read point for how long?  The full scan?  
That might be possible given our lax guarantees above though it would be nice 
to not have to give up on up to the millisecond views on rows.

bq. Another option is to share getThreadReadPoint() value for the two 
iterators, i.e. read the value in the next() function, and give it as a 
parameter to getNext()

What are the 'two iterators' here?

Sorry N, I don't have my head as deep in this stuff as you do currently so my 
questions and answers above may be off.  Please compensate appropriately.

 Make in-memory table scanning faster
 

 Key: HBASE-1938
 URL: https://issues.apache.org/jira/browse/HBASE-1938
 Project: HBase
  Issue Type: Improvement
  Components: performance
Reporter: stack
Assignee: stack
Priority: Blocker
 Attachments: MemStoreScanPerformance.java, 
 MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch


 This issue is about profiling hbase to see if I can make hbase scans run 
 faster when all is up in memory.  Talking to some users, they are seeing 
 about 1/4 million rows a second.  It should be able to go faster than this 
 (Scanning an array of objects, they can do about 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HBASE-4137) Implementation of Distributed Hash Table(DHT) for lookup service

2011-07-25 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-4137.
--

Resolution: Incomplete


The description does not seem to apply to hbase, and it has been 
sprayed across a few random issues, which leads me to believe the author is not 
clear themselves on what is wanted.

Resolving as incomplete.

 Implementation of Distributed Hash Table(DHT)  for lookup service
 -

 Key: HBASE-4137
 URL: https://issues.apache.org/jira/browse/HBASE-4137
 Project: HBase
  Issue Type: Improvement
  Components: performance
Affects Versions: 0.90.3
Reporter: vamshi

 To implement a scalable data structure, i.e. a Distributed Hash Table, in 
 HBase. To perform fast lookups in HBase, we can take the help of a 
 Distributed Hash Table (DHT), a scalable data structure.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accomodate log archival

2011-07-25 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070825#comment-13070825
 ] 

Gary Helmling commented on HBASE-4132:
--

Hmm, this seems to be a confusion of 
{{org.apache.hadoop.hbase.regionserver.wal.WALObserver}} and 
{{org.apache.hadoop.hbase.coprocessor.WALObserver}}.  Not surprising since both 
classes have the same name.  I think the former is the WAL listener used in 
replication and the latter is the coprocessor interface for WALs.  I know the 
former has been around longer, but maybe we should consider renaming it to 
WALListener.  Or maybe we should bite the bullet and combine these two 
interfaces to one.  (I say that knowing very little about replication and 
whether it would make sense/be feasible to convert it to a coprocessor 
implementation).

Anyway, I see no problem adding {{pre/postArchiveStart}} and 
{{pre/postArchiveCompleted}} to 
{{org.apache.hadoop.hbase.coprocessor.WALObserver}}, as Andy mentions.  Would 
that be sufficient, or should we look at adding the logRoll and logClose events 
from {{o.a.h.h.regionserver.wal.WALObserver}} as well?

 Extend the WALObserver API to accomodate log archival
 -

 Key: HBASE-4132
 URL: https://issues.apache.org/jira/browse/HBASE-4132
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.92.0

 Attachments: walArchive.txt


 The WALObserver interface exposes the log roll events. It would be nice to 
 extend it to accomodate log archival events as well.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster

2011-07-25 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070829#comment-13070829
 ] 

Ted Yu commented on HBASE-1938:
---

bq. 1) Replacement of KeyValue lowest = getLowest();
It is in the seek function

bq. Another option is to share getThreadReadPoint() value for the two iterators
N was talking about the following code in MemStore.next():
{code}
  if (theNext == kvsetNextRow) {
kvsetNextRow = getNext(kvsetIt);
  } else {
snapshotNextRow = getNext(snapshotIt);
  }
{code}
The initiative was to save the system call.
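A sketch of that option for reference; the two-argument getNext() is an assumed 
change, not the current code:
{code}
// Illustrative sketch only: read the thread-local read point once per next()
// and pass it to getNext(), instead of re-reading it on every iterator advance.
public synchronized KeyValue next() {
  if (theNext == null) {
    return null;
  }
  final KeyValue ret = theNext;
  final long readPoint = ReadWriteConsistencyControl.getThreadReadPoint();
  if (theNext == kvsetNextRow) {
    kvsetNextRow = getNext(kvsetIt, readPoint);      // assumed two-arg variant
  } else {
    snapshotNextRow = getNext(snapshotIt, readPoint);
  }
  theNext = getLowest();  // recompute the lower of the two candidate rows
  return ret;
}
{code}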

 Make in-memory table scanning faster
 

 Key: HBASE-1938
 URL: https://issues.apache.org/jira/browse/HBASE-1938
 Project: HBase
  Issue Type: Improvement
  Components: performance
Reporter: stack
Assignee: stack
Priority: Blocker
 Attachments: MemStoreScanPerformance.java, 
 MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch


 This issue is about profiling hbase to see if I can make hbase scans run 
 faster when all is up in memory.  Talking to some users, they are seeing 
 about 1/4 million rows a second.  It should be able to go faster than this 
 (Scanning an array of objects, they can do about 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accomodate log archival

2011-07-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070832#comment-13070832
 ] 

stack commented on HBASE-4132:
--

+1 on one interface only.  J-D!

 Extend the WALObserver API to accomodate log archival
 -

 Key: HBASE-4132
 URL: https://issues.apache.org/jira/browse/HBASE-4132
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.92.0

 Attachments: walArchive.txt


 The WALObserver interface exposes the log roll events. It would be nice to 
 extend it to accomodate log archival events as well.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-07-25 Thread Li Pi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Pi updated HBASE-4027:
-

Attachment: slabcachepatchv4.diff

Added tests for eviction, now logs finely grained stats to file.

Added a bunch of documentation. A bunch - this should take care of most of the 
documentation concerns.

 Enable direct byte buffers LruBlockCache
 

 Key: HBASE-4027
 URL: https://issues.apache.org/jira/browse/HBASE-4027
 Project: HBase
  Issue Type: Improvement
Reporter: Jason Rutherglen
Assignee: Li Pi
Priority: Minor
 Attachments: slabcachepatch.diff, slabcachepatchv2.diff, 
 slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, slabcachepatchv3.diff, 
 slabcachepatchv4.diff


 Java offers the creation of direct byte buffers which are allocated outside 
 of the heap.
 They need to be manually free'd, which can be accomplished using an 
 undocumented {{clean}} method.
 The feature will be optional.  After implementing, we can benchmark for 
 differences in speed and garbage collection observances.
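 For reference, a small sketch of the allocate-and-free pattern being 
 described; the cleaner route goes through JVM-internal classes (not public 
 API), so treat this as illustrative:
{code}
import java.nio.ByteBuffer;

// Illustrative sketch: allocate a direct (off-heap) buffer, then release it
// explicitly via the JVM-internal cleaner instead of waiting for GC.
public class DirectBufferSketch {
  public static void main(String[] args) {
    ByteBuffer slab = ByteBuffer.allocateDirect(2 * 1024 * 1024); // 2 MB off-heap
    // ... use the buffer as cache backing storage ...
    ((sun.nio.ch.DirectBuffer) slab).cleaner().clean(); // manual free
  }
}
{code}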

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4134) The total number of regions was more than the actual region count after the hbck fix

2011-07-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070834#comment-13070834
 ] 

stack commented on HBASE-4134:
--

@feng nice debugging

 The total number of regions was more than the actual region count after the 
 hbck fix
 

 Key: HBASE-4134
 URL: https://issues.apache.org/jira/browse/HBASE-4134
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: feng xu
 Fix For: 0.90.4


 1. I found the problem (some regions were multiply assigned) while running 
 hbck to check the cluster's health. Here's the result:
 {noformat}
 ERROR: Region test1,230778,1311216270050.fff783529fcd983043610eaa1cc5c2fe. is 
 listed in META on region server 158-1-91-101:20020 but is multiply assigned 
 to region servers 158-1-91-101:20020, 158-1-91-105:20020 
 ERROR: Region test1,252103,1311216293671.fff9ed2cb69bdce535451a07686c0db5. is 
 listed in META on region server 158-1-91-101:20020 but is multiply assigned 
 to region servers 158-1-91-101:20020, 158-1-91-105:20020 
 ERROR: Region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. is 
 listed in META on region server 158-1-91-103:20020 but is multiply assigned 
 to region servers 158-1-91-103:20020, 158-1-91-105:20020 
 Summary: 
   -ROOT- is okay. 
 Number of regions: 1 
 Deployed on: 158-1-91-105:20020 
   .META. is okay. 
 Number of regions: 1 
 Deployed on: 158-1-91-103:20020 
   test1 is okay. 
 Number of regions: 25297 
 Deployed on: 158-1-91-101:20020 158-1-91-103:20020 158-1-91-105:20020 
 14829 inconsistencies detected. 
 Status: INCONSISTENT 
 {noformat}
 2. Then I tried to use hbck -fix to fix the problem. Everything seemed ok. 
 But I found that the total number of regions reported by load balancer 
 (35029) was more than the actual region count(25299) after the fixing.
 Here's the related logs snippet:
 {noformat}
 2011-07-22 02:19:02,866 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
 Skipping load balancing.  servers=3 regions=25299 average=8433.0 
 mostloaded=8433 
 2011-07-22 03:06:11,832 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
 Skipping load balancing.  servers=3 regions=35029 average=11676.333 
 mostloaded=11677 leastloaded=11676
 {noformat}
 3. I tracked one region's behavior during the time. Taking the region of 
 test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. as example:
 (1) It was assigned to 158-1-91-101 at first. 
 (2) HBCK sent closing request to RegionServer. And RegionServer closed it 
 silently without notice to HMaster.
 (3) The region was still carried by RS 158-1-91-103 which was known to 
 HMaster.
 (4) HBCK will trigger a new assignment.
 The fact is, the region was assigned again, but the old assignment 
 information still remained in AM#regions and AM#servers.
 That's why the problem of the region count being larger than the actual 
 number occurred.
 {noformat}
 Line 178967: 2011-07-22 02:47:51,247 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase/unassigned/52782c0241a598b3e37ca8729da0 
 (region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., 
 server=HBCKServerName, state=M_ZK_REGION_OFFLINE)
 Line 178968: 2011-07-22 02:47:51,247 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling HBCK triggered 
 transition=M_ZK_REGION_OFFLINE, server=HBCKServerName, 
 region=52782c0241a598b3e37ca8729da0
 Line 178969: 2011-07-22 02:47:51,248 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: HBCK repair is triggering 
 assignment of 
 region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0.
 Line 178970: 2011-07-22 02:47:51,248 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan 
 was found (or we are ignoring an existing plan) for 
 test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. so generated a 
 random one; hri=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., 
 src=, dest=158-1-91-101,20020,1311231878544; 3 (online=3, exclude=null) 
 available servers
 Line 178971: 2011-07-22 02:47:51,248 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. to 
 158-1-91-101,20020,1311231878544
 Line 178983: 2011-07-22 02:47:51,285 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=158-1-91-101,20020,1311231878544, 
 region=52782c0241a598b3e37ca8729da0
 Line 179001: 2011-07-22 02:47:51,318 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENED, server=158-1-91-101,20020,1311231878544, 
 region=52782c0241a598b3e37ca8729da0
 Line 179002: 2011-07-22 02:47:51,319 DEBUG 
 

[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accomodate log archival

2011-07-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070835#comment-13070835
 ] 

stack commented on HBASE-4132:
--

Or, one should inherit from the other rather than repeat.

 Extend the WALObserver API to accomodate log archival
 -

 Key: HBASE-4132
 URL: https://issues.apache.org/jira/browse/HBASE-4132
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.92.0

 Attachments: walArchive.txt


 The WALObserver interface exposes the log roll events. It would be nice to 
 extend it to accomodate log archival events as well.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3909) Add dynamic config

2011-07-25 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070836#comment-13070836
 ] 

Gary Helmling commented on HBASE-3909:
--

It would be really nice to have this capability but seems way out there for 
0.92.  We can't depend on Hadoop trunk/0.23 classes for 0.92.  We could fork 
the HADOOP-7001 patch or come up with our own approach, but either one is going 
to be a lot of work.  And the server-related changes to support this seem 
fairly tricky for anything beyond trivial configuration options -- i.e., how to 
support reconfiguring the number of RPC handler threads, say.

All this adds up to: I'd suggest we punt from 0.92.

 Add dynamic config
 --

 Key: HBASE-3909
 URL: https://issues.apache.org/jira/browse/HBASE-3909
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Fix For: 0.92.0


 I'm sure this issue exists already, at least as part of the discussion around 
 making online schema edits possible, but no harm in this having its own issue.  
 Ted started a conversation on this topic up on dev and Todd suggested we 
 look at how Hadoop did it over in HADOOP-7001.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3909) Add dynamic config

2011-07-25 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070843#comment-13070843
 ] 

Ted Yu commented on HBASE-3909:
---

+1 on moving out of 0.92

 Add dynamic config
 --

 Key: HBASE-3909
 URL: https://issues.apache.org/jira/browse/HBASE-3909
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Fix For: 0.92.0


 I'm sure this issue exists already, at least as part of the discussion around 
 making online schema edits possible, but no harm in this having its own issue.  
 Ted started a conversation on this topic up on dev and Todd suggested we 
 look at how Hadoop did it over in HADOOP-7001.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-07-25 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070845#comment-13070845
 ] 

jirapos...@reviews.apache.org commented on HBASE-4027:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1191/
---

Review request for hbase and Todd Lipcon.


Summary
---

Uploading slabcachepatchv4 to review for Li Pi.


This addresses bug HBASE-4027.
https://issues.apache.org/jira/browse/HBASE-4027


Diffs
-

  conf/hbase-env.sh 2d55d27 
  pom.xml 729dc37 
  src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 5963552 
  src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/SingleSizeCache.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/SlabCache.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java b600020 
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestSingleSlabCache.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestSlabCache.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/1191/diff


Testing
---


Thanks,

Todd



 Enable direct byte buffers LruBlockCache
 

 Key: HBASE-4027
 URL: https://issues.apache.org/jira/browse/HBASE-4027
 Project: HBase
  Issue Type: Improvement
Reporter: Jason Rutherglen
Assignee: Li Pi
Priority: Minor
 Attachments: slabcachepatch.diff, slabcachepatchv2.diff, 
 slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, slabcachepatchv3.diff, 
 slabcachepatchv4.diff


 Java offers the creation of direct byte buffers which are allocated outside 
 of the heap.
 They need to be manually free'd, which can be accomplished using an 
 undocumented {{clean}} method.
 The feature will be optional.  After implementing, we can benchmark for 
 differences in speed and garbage collection observances.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-07-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070859#comment-13070859
 ] 

stack commented on HBASE-4027:
--

Doc is great.

These could be final:

+  private LruBlockCache onHeapCache;
+  private SlabCache offHeapCache;

Says 'Metrics are the combined size and hits and misses of both caches' but 
down in getStats we seem to be getting onheap stats only.  Intentional?  Same 
for heapSize.

Do you want to leave this line in hfile? +  LOG.debug("decompressedSize = " 
+ decompressedSize);

Whats it mean when you say 'An exception will be thrown if the cached data is 
larger than the size of the allocated block'?

More notes later.

 Enable direct byte buffers LruBlockCache
 

 Key: HBASE-4027
 URL: https://issues.apache.org/jira/browse/HBASE-4027
 Project: HBase
  Issue Type: Improvement
Reporter: Jason Rutherglen
Assignee: Li Pi
Priority: Minor
 Attachments: slabcachepatch.diff, slabcachepatchv2.diff, 
 slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, slabcachepatchv3.diff, 
 slabcachepatchv4.diff


 Java offers the creation of direct byte buffers which are allocated outside 
 of the heap.
 They need to be manually free'd, which can be accomplished using an 
 undocumented {{clean}} method.
 The feature will be optional.  After implementing, we can benchmark for 
 differences in speed and garbage collection observances.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-25 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070860#comment-13070860
 ] 

Ted Yu commented on HBASE-3845:
---

+1 on HBASE-3845_trunk_3.patch

Ran unit tests and they passed.

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
 HBASE-3845_4.patch, HBASE-3845_5.patch, HBASE-3845_6.patch, 
 HBASE-3845__trunk.patch, HBASE-3845_trunk_2.patch, HBASE-3845_trunk_3.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
 that we only keep track  of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3909) Add dynamic config

2011-07-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070861#comment-13070861
 ] 

stack commented on HBASE-3909:
--

My thought on moving issues in and out of releases is to just do it, with 
justification, rather than make the justification and then not move it while 
waiting on others to agree.  For example, you make a good case for moving the 
issue out, Gary, so go for it.

If someone objects, let them counter argue and move it back.  If a dispute, we 
can move it to the dev list to duke it out.

Good stuff.

 Add dynamic config
 --

 Key: HBASE-3909
 URL: https://issues.apache.org/jira/browse/HBASE-3909
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Fix For: 0.92.0


 I'm sure this issue exists already, at least as part of the discussion around 
 making online schema edits possible, but no harm in this having its own issue.  
 Ted started a conversation on this topic up on dev and Todd suggested we 
 look at how Hadoop did it over in HADOOP-7001.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-07-25 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070862#comment-13070862
 ] 

jirapos...@reviews.apache.org commented on HBASE-4027:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1191/#review1182
---


could do with some tests for MetaSlab. also some multi-threaded tests - see 
MultithreadedTestUtil, example usage in TestMemStoreLAB


pom.xml
https://reviews.apache.org/r/1191/#comment2484

did you determine that this ConcurrentLinkedHashMap was different than the 
one in Guava? I thought it got incorporated into Guava, which we already depend 
on.



src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java
https://reviews.apache.org/r/1191/#comment2485

punctuation wise, I think it would be easier to read if you hyphenated 
on-heap and off-heap. This applies to log messages below as well.



src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java
https://reviews.apache.org/r/1191/#comment2486

No need to line-break here



src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java
https://reviews.apache.org/r/1191/#comment2487

consider using StringUtils.humanReadableInt for these sizes.



src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java
https://reviews.apache.org/r/1191/#comment2488

@Override



src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java
https://reviews.apache.org/r/1191/#comment2489

when you're just overriding something from the superclass, no need for 
javadoc unless it says something new and exciting. If you feel like you want to 
put something there, you can use /** {@inheritDoc} */ to be explicit that 
you're inheriting from the superclass.



src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java
https://reviews.apache.org/r/1191/#comment2490

I think you should only put-back into the on-heap cache in the case that 
the 'caching' parameter is true.



src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java
https://reviews.apache.org/r/1191/#comment2491

hrm, the class javadoc says that the statistics should be cumulative, but 
this seems to just forward



src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java
https://reviews.apache.org/r/1191/#comment2492

TODOs



src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java
https://reviews.apache.org/r/1191/#comment2493

is this code used? seems like dead copy-paste code to me.



src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
https://reviews.apache.org/r/1191/#comment2497

extraneous debugging left in



src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java
https://reviews.apache.org/r/1191/#comment2498

I think this is usually called a slab class - I think that name would be 
less confusing, since Meta is already used in HBase to refer to .META.



src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java
https://reviews.apache.org/r/1191/#comment2499

unclear what the difference is between the two.

Is "slabs" the list of 2GB buffers, and "buffers" the list of actual 
items that will be allocated? I think the traditional names here are "slabs" 
and "items", where each slab holds some number of allocatable items.

Also, rather than // comments, use /** javadoc comments */ before the vars



src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java
https://reviews.apache.org/r/1191/#comment2500

These vars would probably be better called maxBlocksPerSlab and maxSlabSize, 
since they're upper bounds.



src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java
https://reviews.apache.org/r/1191/#comment2501

I think this code would be a little easier to understand if you split it 
into one loop for the full slabs, and an if statement for the partially full 
one. Something like:

int numFullSlabs = numBlocks / maxBlocksPerSlab;
boolean hasPartialSlab = (numBlocks % maxBlocksPerSlab) > 0;

for (int i = 0; i < numFullSlabs; i++) {
  alloc one of maxSlabSize;
  addBuffersForSlab(slab);
}

if (hasPartialSlab) {
  alloc the partial one
  addBuffersForSlab(slab);
}




src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java
https://reviews.apache.org/r/1191/#comment2502

should be a LOG.warn



src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java
https://reviews.apache.org/r/1191/#comment2503

shouldn't this class have an alloc() and free() method?



src/main/java/org/apache/hadoop/hbase/io/hfile/SingleSizeCache.java
https://reviews.apache.org/r/1191/#comment2511

shouldn't this implement BlockCache?




[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accomodate log archival

2011-07-25 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070863#comment-13070863
 ] 

Jean-Daniel Cryans commented on HBASE-4132:
---

It could be weird; if we just merge them, then the Replication class (and others 
implementing wal.WALObserver in the future) would have imports for CP classes, 
since ObserverContext is passed in the cp.WALObserver methods. I'd prefer a 
rename of either or both. 

 Extend the WALObserver API to accomodate log archival
 -

 Key: HBASE-4132
 URL: https://issues.apache.org/jira/browse/HBASE-4132
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.92.0

 Attachments: walArchive.txt


 The WALObserver interface exposes the log roll events. It would be nice to 
 extend it to accommodate log archival events as well.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4064) Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long...

2011-07-25 Thread gaojinchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojinchao updated HBASE-4064:
--

Attachment: (was: HBASE-4064_branch90V2.patch)

 Two concurrent unassigning of the same region caused the endless loop of 
 Region has been PENDING_CLOSE for too long...
 

 Key: HBASE-4064
 URL: https://issues.apache.org/jira/browse/HBASE-4064
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.3
Reporter: Jieshan Bean
 Fix For: 0.90.5

 Attachments: HBASE-4064-v1.patch, HBASE-4064_branch90V2.patch, 
 disableflow.png


 1. If there is a rubbish RegionState object with PENDING_CLOSE in 
 regionsInTransition(The RegionState was remained by some exception which 
 should be removed, that's why I called it as rubbish object), but the 
 region is not currently assigned anywhere, TimeoutMonitor will fall into an 
 endless loop:
 2011-06-27 10:32:21,326 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
 state=PENDING_CLOSE, ts=1309141555301
 2011-06-27 10:32:21,326 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
 PENDING_CLOSE for too long, running forced unassign again on 
 region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
 2011-06-27 10:32:21,438 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
 (offlining)
 2011-06-27 10:32:21,441 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign 
 region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is 
 not currently assigned anywhere
 2011-06-27 10:32:31,207 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
 state=PENDING_CLOSE, ts=1309141555301
 2011-06-27 10:32:31,207 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
 PENDING_CLOSE for too long, running forced unassign again on 
 region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
 2011-06-27 10:32:31,215 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
 (offlining)
 2011-06-27 10:32:31,215 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign 
 region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is 
 not currently assigned anywhere
 2011-06-27 10:32:41,164 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
 state=PENDING_CLOSE, ts=1309141555301
 2011-06-27 10:32:41,164 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
 PENDING_CLOSE for too long, running forced unassign again on 
 region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
 2011-06-27 10:32:41,172 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
 (offlining)
 2011-06-27 10:32:41,172 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign 
 region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is 
 not currently assigned anywhere
 .
 2. In the following scenario, two concurrent unassign calls on the same 
 region may lead to the above problem:
 The first unassign call sends its RPC successfully; the master watches the 
 RS_ZK_REGION_CLOSED event and, while processing it, creates a 
 ClosedRegionHandler to remove the region's state in the master, e.g.
 While the ClosedRegionHandler is running in an 
 hbase.master.executor.closeregion.threads thread (A), another unassign call 
 for the same region runs in another thread (B).
 While thread B evaluates if (!regions.containsKey(region)), this.regions 
 still contains the region info; the CPU then switches to thread A.
 Thread A removes the region from both this.regions and regionsInTransition, 
 then control switches back to thread B. Thread B continues and throws an 
 exception with the message "Server null returned 
 java.lang.NullPointerException: Passed server is null for 
 9a6e26d40293663a79523c58315b930f", but without removing the newly added 
 RegionState from regionsInTransition, so it can never be removed.
  public void unassign(HRegionInfo region, boolean force) {
    LOG.debug("Starting unassignment of region " +
      region.getRegionNameAsString() + " (offlining)");
    synchronized (this.regions) {
      // 

[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accomodate log archival

2011-07-25 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070877#comment-13070877
 ] 

Andrew Purtell commented on HBASE-4132:
---

bq. Hmm, this seems to be a confusion of 
org.apache.hadoop.hbase.regionserver.wal.WALObserver and 
org.apache.hadoop.hbase.coprocessor.WALObserver. 

Aha.

I agree with J-D; we should do a rename.

 Extend the WALObserver API to accomodate log archival
 -

 Key: HBASE-4132
 URL: https://issues.apache.org/jira/browse/HBASE-4132
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.92.0

 Attachments: walArchive.txt


 The WALObserver interface exposes the log roll events. It would be nice to 
 extend it to accommodate log archival events as well.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accomodate log archival

2011-07-25 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070887#comment-13070887
 ] 

Ted Yu commented on HBASE-4132:
---

org.apache.hadoop.hbase.regionserver.wal.WALObserver is mostly internal.
It is used for LogRoller and replication.
Shall we rename it to 
org.apache.hadoop.hbase.regionserver.wal.WALActionsListener ?
See HRegionServer.getWALActionListeners():
{code}
  // Replication handler is an implementation of WALActionsListener.
  listeners.add(this.replicationHandler);
{code}
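
To make the rename and the archival extension concrete, here is a rough sketch 
of what a listener interface with an archival hook could look like; the method 
names are illustrative assumptions, not the committed interface:

{code}
import org.apache.hadoop.fs.Path;

public interface WALActionsListenerSketch {
  /** Called after the WAL has been rolled to a new file. */
  void logRolled(Path newFile);

  /** Called after an old WAL file has been moved to the archive directory. */
  void logArchived(Path oldPath, Path archivedPath);
}
{code}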

 Extend the WALObserver API to accomodate log archival
 -

 Key: HBASE-4132
 URL: https://issues.apache.org/jira/browse/HBASE-4132
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.92.0

 Attachments: walArchive.txt


 The WALObserver interface exposes the log roll events. It would be nice to 
 extend it to accommodate log archival events as well.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3899) enhance HBase RPC to support free-ing up server handler threads even if response is not ready

2011-07-25 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070891#comment-13070891
 ] 

jirapos...@reviews.apache.org commented on HBASE-3899:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1174/
---

(Updated 2011-07-26 01:19:52.655737)


Review request for hbase.


Changes
---

* Add a check for the number of calls currently delayed. A warning message is 
issued if too many calls are delayed.
* Unit test to check that the above warning works.
* endDelay() now takes an Object as a parameter, not a Writable. Initially, I 
thought the method that ended the delay should pack the response (i.e. 
endDelay(new HbaseObjectWritable(retval))), but it makes more sense to pack it 
in setResponse.
* Address other feedback from Todd Lipcon. Thanks!


Summary
---

Free up RPC server Handler thread if the called routine specifies the call 
should be delayed. The RPC client sees no difference, changes are server-side 
only. This is based on the previous submitted patch from Dhruba.
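
As a rough illustration of the mechanism (the names follow the discussion 
above, but the interface and signatures here are assumptions for the sketch, 
not the actual patch):

{code}
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

interface DelayableSketch {
  /** Tell the RPC layer not to send a response when the handler returns. */
  void startDelay();

  /** Supply the response later; the RPC layer then sends it to the client. */
  void endDelay(Object result);
}

class DelayedCallExample {
  private final Queue<Runnable> workQueue = new ConcurrentLinkedQueue<Runnable>();

  // Handler thread: queue the work and return immediately, freeing the handler.
  void handle(final DelayableSketch call) {
    call.startDelay();
    workQueue.add(new Runnable() {
      public void run() {
        Object result = doExpensiveWork();
        call.endDelay(result); // the response goes back to the client now
      }
    });
  }

  Object doExpensiveWork() { return "done"; }
}
{code}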


This addresses bug HBASE-3899.
https://issues.apache.org/jira/browse/HBASE-3899


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 61d3915 
  src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java 0da7f9e 
  src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java PRE-CREATION 

Diff: https://reviews.apache.org/r/1174/diff


Testing
---

Unit tests run. Also, the patch includes a new unit test.


Thanks,

Vlad



 enhance HBase RPC to support free-ing up server handler threads even if 
 response is not ready
 -

 Key: HBASE-3899
 URL: https://issues.apache.org/jira/browse/HBASE-3899
 Project: HBase
  Issue Type: Improvement
  Components: ipc
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: HBASE-3899.patch, asyncRpc.txt, asyncRpc.txt


 In the current implementation, the server handler thread picks up an item 
 from the incoming callqueue, processes it and then wraps the response as a 
 Writable and sends it back to the IPC server module. This wastes 
 thread-resources when the thread is blocked for disk IO (transaction logging, 
 read into block cache, etc).
 It would be nice if we can make the RPC Server Handler threads pick up a call 
 from the IPC queue, hand it over to the application (e.g. HRegion), the 
 application can queue it to be processed asynchronously and send a response 
 back to the IPC server module saying that the response is not ready. The RPC 
 Server Handler thread is now ready to pick up another request from the 
 incoming callqueue. When the queued call is processed by the application, it 
 indicates to the IPC module that the response is now ready to be sent back to 
 the client.
 The RPC client continues to experience the same behaviour as before. An RPC 
 client is synchronous and blocks until the response arrives.
 This RPC enhancement allows us to do very powerful things with the 
 RegionServer. In the future, we can enhance the RegionServer's threading 
 model to a message-passing model for better performance. We will not be 
 limited by the number of threads in the RegionServer.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3899) enhance HBase RPC to support free-ing up server handler threads even if response is not ready

2011-07-25 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070896#comment-13070896
 ] 

jirapos...@reviews.apache.org commented on HBASE-3899:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1174/#review1185
---

Ship it!


Looks good. Have you run the full test suite with the current iteration of the 
patch?

- Todd


On 2011-07-26 01:19:52, Vlad Dogaru wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1174/
bq.  ---
bq.  
bq.  (Updated 2011-07-26 01:19:52)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Free up RPC server Handler thread if the called routine specifies the call 
should be delayed. The RPC client sees no difference, changes are server-side 
only. This is based on the previous submitted patch from Dhruba.
bq.  
bq.  
bq.  This addresses bug HBASE-3899.
bq.  https://issues.apache.org/jira/browse/HBASE-3899
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 61d3915 
bq.src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java 0da7f9e 
bq.src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java 
PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/1174/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Unit tests run. Also, the patch includes a new unit test.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Vlad
bq.  
bq.



 enhance HBase RPC to support free-ing up server handler threads even if 
 response is not ready
 -

 Key: HBASE-3899
 URL: https://issues.apache.org/jira/browse/HBASE-3899
 Project: HBase
  Issue Type: Improvement
  Components: ipc
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: HBASE-3899.patch, asyncRpc.txt, asyncRpc.txt


 In the current implementation, the server handler thread picks up an item 
 from the incoming callqueue, processes it and then wraps the response as a 
 Writable and sends it back to the IPC server module. This wastes 
 thread-resources when the thread is blocked for disk IO (transaction logging, 
 read into block cache, etc).
 It would be nice if we can make the RPC Server Handler threads pick up a call 
 from the IPC queue, hand it over to the application (e.g. HRegion), the 
 application can queue it to be processed asynchronously and send a response 
 back to the IPC server module saying that the response is not ready. The RPC 
 Server Handler thread is now ready to pick up another request from the 
 incoming callqueue. When the queued call is processed by the application, it 
 indicates to the IPC module that the response is now ready to be sent back to 
 the client.
 The RPC client continues to experience the same behaviour as before. An RPC 
 client is synchronous and blocks until the response arrives.
 This RPC enhancement allows us to do very powerful things with the 
 RegionServer. In the future, we can enhance the RegionServer's threading 
 model to a message-passing model for better performance. We will not be 
 limited by the number of threads in the RegionServer.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4139) [stargate] Update ScannerModel with support for filter package additions

2011-07-25 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-4139:
--

Attachment: HBASE-4139.patch

 [stargate] Update ScannerModel with support for filter package additions
 

 Key: HBASE-4139
 URL: https://issues.apache.org/jira/browse/HBASE-4139
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.90.4, 0.92.0

 Attachments: HBASE-4139.patch


 Filters have been added to the o.a.h.h.filters package without updating 
 o.a.h.h.rest.model.ScannerModel. Bring ScannerModel up to date.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4139) [stargate] Update ScannerModel with support for filter package additions

2011-07-25 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-4139:
--

Fix Version/s: 0.90.4
   Status: Patch Available  (was: Open)

 [stargate] Update ScannerModel with support for filter package additions
 

 Key: HBASE-4139
 URL: https://issues.apache.org/jira/browse/HBASE-4139
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.90.4, 0.92.0

 Attachments: HBASE-4139.patch


 Filters have been added to the o.a.h.h.filters package without updating 
 o.a.h.h.rest.model.ScannerModel. Bring ScannerModel up to date.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3937) Region PENDING-OPEN timeout with un-expected ZK node state leads to an endless loop

2011-07-25 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3937:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

 Region PENDING-OPEN timeout with un-expected ZK node state leads to an 
 endless loop
 ---

 Key: HBASE-3937
 URL: https://issues.apache.org/jira/browse/HBASE-3937
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.3
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Fix For: 0.94.0


 Here is the scenario in which this problem happened:
 1. HMaster assigned region A to RS1, so the RegionState was set to 
 PENDING_OPEN.
 2. Because there were too many opening requests, the open process on RS1 was 
 blocked.
 3. Some time later, TimeoutMonitor found that the assignment of A had timed 
 out. Since the RegionState was PENDING_OPEN, it went into the following 
 handler process (which just puts the region into a waiting-assigning set):
case PENDING_OPEN:
   LOG.info("Region has been PENDING_OPEN for too " +
   "long, reassigning region=" +
   regionInfo.getRegionNameAsString());
   assigns.put(regionState.getRegion(), Boolean.TRUE);
   break;
 So we can see that, in this case, we assume the ZK node state is OFFLINE. 
 Indeed, in a normal disposal that is OK.
 4. But before the real assignment, the pending requests on RS1 were 
 processed, which interfered with the new assignment because they updated the 
 ZK node state from OFFLINE to OPENING. 
 5. The new assignment started and sent the region to open on RS2. While 
 opening, RS2 had to update the ZK node state from OFFLINE to OPENING, but 
 since the current state was already OPENING, this operation failed.
 So this region could never be opened successfully anymore.
 So I think, to avoid this problem, in the PENDING_OPEN case of 
 TimeoutMonitor, we should transition the ZK node state to OFFLINE first.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HBASE-4121) improve hbck tool to fix .META. hole issue.

2011-07-25 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-4121.
---

Resolution: Fixed

Duplicate of HBASE-4122

 improve hbck tool to fix .META. hole issue.
 ---

 Key: HBASE-4121
 URL: https://issues.apache.org/jira/browse/HBASE-4121
 Project: HBase
  Issue Type: Improvement
Reporter: feng xu
 Fix For: 0.92.0


 The hbase hbck tool can detect a .META. hole, but it cannot fix this problem 
 with --fix.
 I plan to improve the tool.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4114) Metrics for HFile HDFS block locality

2011-07-25 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070922#comment-13070922
 ] 

Ming Ma commented on HBASE-4114:


Thanks, Stack, Ted.

1. In the experiment table above, the total number of HDFS blocks that can be 
retrieved locally by the region server, as well as the total number of HDFS 
blocks for all HFiles, are defined at the whole-cluster level. The external 
program also calculates locality information per hfile, per region, and per 
region server. It queries the HDFS namenode, and the calculation is 
independent of any map reduce jobs.


2. In terms of how we can calculate this metric inside hbase, we can do it in 
two steps: the first is to calculate the metric independently of map reduce 
jobs; the second is to calculate it at the per map reduce job level.


3. Calculating the locality index, independent of map reduce jobs.

a. It will first be calculated at the hfile level { total # of HDFS blocks, 
total # of local HDFS blocks }; then the data gets aggregated at the region 
level, and finally at the region server level.

b. Impact on namenode. There are 2 RPC calls to NN to get block info for each 
hfile. If we assume 100 regions per RS, 10 hfiles per region, 500 RSs, we will 
have 1M RPC hits to NN. Most of the time, that won't be an issue if we only 
calculate hfile locality index when hfile is created or region is loaded by the 
RS the first time. Because HDFS can still move HDFS blocks around without hbase 
knowing it, we still need to refresh the value periodically. 

c. The computation can be done in the RS or the HMaster. The RS seems better 
in terms of design (only the store knows the HDFS path of the hfile location; 
the HMaster doesn't) and extensibility (to calculate the locality index per 
map reduce job). The locality index can be part of HServerLoad and RegionLoad 
for the load balancer to use. The RS will rotate through all regions 
periodically in its main thread. The calculation interval defined by 
hbase.regionserver.msginterval might be too short for this scenario to keep 
the load on the NN low for a large cluster ( 20 NN RPC per RS per 3 sec ).

d. The locality index can be a new RS metric. We can also put it on table.jsp 
for each region.

e. HRegionInfo is kind of static: it doesn't change over time. However, the 
locality index does change over time for a given region. Maybe 
ClusterStatus/HServerInfo/HServerLoad/RegionLoad are better?


4. Locality index calculation for scan / map reduce job.

a. The original scenario is for a full table scan only. If we want to provide 
an accurate locality index for any scan / map reduce job, this could be tricky 
given that i) a map reduce job can have start/end keys and filters such as a 
time range; ii) the block cache can be used, and thus an hfile shouldn't be 
counted if there is a cache hit; iii) the data volume read from an HDFS block 
is also a factor: reading a smaller buffer is different from reading a bigger 
buffer.

b. One useful scenario is when we want to find out why map jobs sometimes run 
slower. So it is useful if the metric is simply there as part of the map 
reduce job status page. We can estimate it by using the ganglia page to get 
the locality index values for the RSs at the time the map reduce job runs.

c. To provide more accurate data, we can modify TableInputFormat to a) call 
HBaseAdmin.getClusterStatus to get the locality index info for each region, 
and b) calculate the intersection between the scan specification and the 
ClusterStatus based on key range as well as column family. It isn't 100% 
accurate, but it might be good enough.

d. To be really accurate, the region server needs to report the locality index 
for each scan operation back to the client.
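
For illustration, here is a minimal sketch of the per-hfile calculation 
described in 3.a-b, using only the standard HDFS FileSystem API 
(getFileStatus and getFileBlockLocations, i.e. the two NN RPCs per hfile 
mentioned above). The class and method names are invented for this example 
and are not part of any attached patch:

{code}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LocalityIndexSketch {
  /**
   * Fraction of the HDFS blocks of the given HFiles that have a replica on
   * localHost, or 0 when no blocks were found. Hypothetical helper, not the
   * actual HBase code.
   */
  public static float localityIndex(Configuration conf, List<Path> hfiles,
      String localHost) throws IOException {
    FileSystem fs = FileSystem.get(conf);
    long totalBlocks = 0;
    long localBlocks = 0;
    for (Path hfile : hfiles) {
      FileStatus status = fs.getFileStatus(hfile);               // NN RPC #1
      BlockLocation[] blocks =
          fs.getFileBlockLocations(status, 0, status.getLen());  // NN RPC #2
      for (BlockLocation block : blocks) {
        totalBlocks++;
        for (String host : block.getHosts()) {
          if (host.equals(localHost)) {
            localBlocks++;
            break;
          }
        }
      }
    }
    return totalBlocks == 0 ? 0.0f : (float) localBlocks / totalBlocks;
  }
}
{code}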

 Metrics for HFile HDFS block locality
 -

 Key: HBASE-4114
 URL: https://issues.apache.org/jira/browse/HBASE-4114
 Project: HBase
  Issue Type: Improvement
  Components: metrics, regionserver
Reporter: Ming Ma
Assignee: Ming Ma

 Normally, when we put hbase and HDFS in the same cluster ( e.g., region 
 server runs on the datenode ), we have a reasonably good data locality, as 
 explained by Lars. Also Work has been done by Jonathan to address the startup 
 situation.
 There are scenarios where regions can be on a different machine from the 
 machines that hold the underlying HFile blocks, at least for some period of 
 time. This will have performance impact on whole table scan operation and map 
 reduce job during that time.
 1.After load balancer moves the region and before compaction (thus 
 generate HFile on the new region server ) on that region, HDFS block can be 
 remote.
 2.When a new machine is added, or removed, Hbase's region assignment 
 policy is different from HDFS's block reassignment policy.
 3.Even if there is no much hbase activity, HDFS can load balance HFile 
 blocks as other non-hbase applications push other 

[jira] [Updated] (HBASE-4134) The total number of regions was more than the actual region count after the hbck fix

2011-07-25 Thread feng xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

feng xu updated HBASE-4134:
---

Fix Version/s: (was: 0.90.4)
   0.94.0

 The total number of regions was more than the actual region count after the 
 hbck fix
 

 Key: HBASE-4134
 URL: https://issues.apache.org/jira/browse/HBASE-4134
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: feng xu
 Fix For: 0.94.0


 1. I found the problem(some regions were multiply assigned) while running 
 hbck to check the cluster's health. Here's the result:
 {noformat}
 ERROR: Region test1,230778,1311216270050.fff783529fcd983043610eaa1cc5c2fe. is 
 listed in META on region server 158-1-91-101:20020 but is multiply assigned 
 to region servers 158-1-91-101:20020, 158-1-91-105:20020 
 ERROR: Region test1,252103,1311216293671.fff9ed2cb69bdce535451a07686c0db5. is 
 listed in META on region server 158-1-91-101:20020 but is multiply assigned 
 to region servers 158-1-91-101:20020, 158-1-91-105:20020 
 ERROR: Region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. is 
 listed in META on region server 158-1-91-103:20020 but is multiply assigned 
 to region servers 158-1-91-103:20020, 158-1-91-105:20020 
 Summary: 
   -ROOT- is okay. 
 Number of regions: 1 
 Deployed on: 158-1-91-105:20020 
   .META. is okay. 
 Number of regions: 1 
 Deployed on: 158-1-91-103:20020 
   test1 is okay. 
 Number of regions: 25297 
 Deployed on: 158-1-91-101:20020 158-1-91-103:20020 158-1-91-105:20020 
 14829 inconsistencies detected. 
 Status: INCONSISTENT 
 {noformat}
 2. Then I tried to use hbck -fix to fix the problem. Everything seemed ok. 
 But I found that the total number of regions reported by load balancer 
 (35029) was more than the actual region count(25299) after the fixing.
 Here's the related logs snippet:
 {noformat}
 2011-07-22 02:19:02,866 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
 Skipping load balancing.  servers=3 regions=25299 average=8433.0 
 mostloaded=8433 
 2011-07-22 03:06:11,832 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
 Skipping load balancing.  servers=3 regions=35029 average=11676.333 
 mostloaded=11677 leastloaded=11676
 {noformat}
 3. I tracked one region's behavior during that time, taking the region 
 test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. as an example:
 (1) It was assigned to 158-1-91-101 at first. 
 (2) HBCK sent a closing request to the RegionServer, and the RegionServer 
 closed it silently without notifying the HMaster.
 (3) The region was still carried by RS 158-1-91-103, which was known to the 
 HMaster.
 (4) HBCK then triggered a new assignment.
 In fact, the region was assigned again, but the old assignment information 
 still remained in AM#regions and AM#servers.
 That's why the reported region count was larger than the actual number.
 {noformat}
 Line 178967: 2011-07-22 02:47:51,247 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase/unassigned/52782c0241a598b3e37ca8729da0 
 (region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., 
 server=HBCKServerName, state=M_ZK_REGION_OFFLINE)
 Line 178968: 2011-07-22 02:47:51,247 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling HBCK triggered 
 transition=M_ZK_REGION_OFFLINE, server=HBCKServerName, 
 region=52782c0241a598b3e37ca8729da0
 Line 178969: 2011-07-22 02:47:51,248 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: HBCK repair is triggering 
 assignment of 
 region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0.
 Line 178970: 2011-07-22 02:47:51,248 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan 
 was found (or we are ignoring an existing plan) for 
 test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. so generated a 
 random one; hri=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., 
 src=, dest=158-1-91-101,20020,1311231878544; 3 (online=3, exclude=null) 
 available servers
 Line 178971: 2011-07-22 02:47:51,248 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. to 
 158-1-91-101,20020,1311231878544
 Line 178983: 2011-07-22 02:47:51,285 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=158-1-91-101,20020,1311231878544, 
 region=52782c0241a598b3e37ca8729da0
 Line 179001: 2011-07-22 02:47:51,318 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENED, server=158-1-91-101,20020,1311231878544, 
 region=52782c0241a598b3e37ca8729da0
 Line 179002: 2011-07-22 02:47:51,319 DEBUG 
 

[jira] [Created] (HBASE-4140) book: Update our hadoop vendor section

2011-07-25 Thread stack (JIRA)
book: Update our hadoop vendor section
--

 Key: HBASE-4140
 URL: https://issues.apache.org/jira/browse/HBASE-4140
 Project: HBase
  Issue Type: Improvement
Reporter: stack




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4140) book: Update our hadoop vendor section

2011-07-25 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4140:
-

Attachment: hadoop.txt

Updated Cloudera mention to recommend released CDH and to note point update.  
Add reference to MapR distribution.

 book: Update our hadoop vendor section
 --

 Key: HBASE-4140
 URL: https://issues.apache.org/jira/browse/HBASE-4140
 Project: HBase
  Issue Type: Improvement
Reporter: stack
 Attachments: hadoop.txt




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accomodate log archival

2011-07-25 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070946#comment-13070946
 ] 

Jean-Daniel Cryans commented on HBASE-4132:
---

+1

 Extend the WALObserver API to accomodate log archival
 -

 Key: HBASE-4132
 URL: https://issues.apache.org/jira/browse/HBASE-4132
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.92.0

 Attachments: walArchive.txt


 The WALObserver interface exposes the log roll events. It would be nice to 
 extend it to accommodate log archival events as well.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4134) The total number of regions was more than the actual region count after the hbck fix

2011-07-25 Thread feng xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070954#comment-13070954
 ] 

feng xu commented on HBASE-4134:


To Ted Yu:
The HBASE-4053 patch had been integrated before this issue occurred in my test 
cluster.
I think this issue has no relationship with HBASE-4053.
The HBASE-4053 patch ensures that a region is not double-counted within one 
regionserver, but in this issue the region was carried by two (maybe more) 
regionservers.

 The total number of regions was more than the actual region count after the 
 hbck fix
 

 Key: HBASE-4134
 URL: https://issues.apache.org/jira/browse/HBASE-4134
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: feng xu
 Fix For: 0.94.0


 1. I found the problem(some regions were multiply assigned) while running 
 hbck to check the cluster's health. Here's the result:
 {noformat}
 ERROR: Region test1,230778,1311216270050.fff783529fcd983043610eaa1cc5c2fe. is 
 listed in META on region server 158-1-91-101:20020 but is multiply assigned 
 to region servers 158-1-91-101:20020, 158-1-91-105:20020 
 ERROR: Region test1,252103,1311216293671.fff9ed2cb69bdce535451a07686c0db5. is 
 listed in META on region server 158-1-91-101:20020 but is multiply assigned 
 to region servers 158-1-91-101:20020, 158-1-91-105:20020 
 ERROR: Region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. is 
 listed in META on region server 158-1-91-103:20020 but is multiply assigned 
 to region servers 158-1-91-103:20020, 158-1-91-105:20020 
 Summary: 
   -ROOT- is okay. 
 Number of regions: 1 
 Deployed on: 158-1-91-105:20020 
   .META. is okay. 
 Number of regions: 1 
 Deployed on: 158-1-91-103:20020 
   test1 is okay. 
 Number of regions: 25297 
 Deployed on: 158-1-91-101:20020 158-1-91-103:20020 158-1-91-105:20020 
 14829 inconsistencies detected. 
 Status: INCONSISTENT 
 {noformat}
 2. Then I tried to use hbck -fix to fix the problem. Everything seemed ok. 
 But I found that the total number of regions reported by load balancer 
 (35029) was more than the actual region count(25299) after the fixing.
 Here's the related logs snippet:
 {noformat}
 2011-07-22 02:19:02,866 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
 Skipping load balancing.  servers=3 regions=25299 average=8433.0 
 mostloaded=8433 
 2011-07-22 03:06:11,832 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
 Skipping load balancing.  servers=3 regions=35029 average=11676.333 
 mostloaded=11677 leastloaded=11676
 {noformat}
 3. I tracked one region's behavior during that time, taking the region 
 test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. as an example:
 (1) It was assigned to 158-1-91-101 at first. 
 (2) HBCK sent a closing request to the RegionServer, and the RegionServer 
 closed it silently without notifying the HMaster.
 (3) The region was still carried by RS 158-1-91-103, which was known to the 
 HMaster.
 (4) HBCK then triggered a new assignment.
 In fact, the region was assigned again, but the old assignment information 
 still remained in AM#regions and AM#servers.
 That's why the reported region count was larger than the actual number.
 {noformat}
 Line 178967: 2011-07-22 02:47:51,247 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase/unassigned/52782c0241a598b3e37ca8729da0 
 (region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., 
 server=HBCKServerName, state=M_ZK_REGION_OFFLINE)
 Line 178968: 2011-07-22 02:47:51,247 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling HBCK triggered 
 transition=M_ZK_REGION_OFFLINE, server=HBCKServerName, 
 region=52782c0241a598b3e37ca8729da0
 Line 178969: 2011-07-22 02:47:51,248 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: HBCK repair is triggering 
 assignment of 
 region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0.
 Line 178970: 2011-07-22 02:47:51,248 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan 
 was found (or we are ignoring an existing plan) for 
 test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. so generated a 
 random one; hri=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., 
 src=, dest=158-1-91-101,20020,1311231878544; 3 (online=3, exclude=null) 
 available servers
 Line 178971: 2011-07-22 02:47:51,248 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. to 
 158-1-91-101,20020,1311231878544
 Line 178983: 2011-07-22 02:47:51,285 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=158-1-91-101,20020,1311231878544,