[ 
https://issues.apache.org/jira/browse/HBASE-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759720#action_12759720
 ] 

stack commented on HBASE-1868:
------------------------------

So, just before the first instance of the above IOE, I see an instance of the 
NPE from HBASE-1809.  After the NPE is thrown (because no read-lock was held 
while doing gets), I see a few java.io.IOException: Stream closed.  I wonder 
what file these were going against?  The old file that was throwing the NPE, or 
the newly opened compaction file, 1652111193973935565 (the file made by the 
compaction around the time of the NPE)?  These two strikes may have been made 
against the 1652111193973935565 file.  Without HDFS-127 in place, they could 
have hosed the dfsclient for this file, which may be why this file went bad.  
Looking...

So, reads from meta were failing while this issue was in place.

> Spew about rebalancing but none done....
> ----------------------------------------
>
>                 Key: HBASE-1868
>                 URL: https://issues.apache.org/jira/browse/HBASE-1868
>             Project: Hadoop HBase
>          Issue Type: Bug
>         Environment: 0.20.0 RC2
>            Reporter: stack
>
> I'm seeing loads of this in logs:
> {code}
> 2009-09-24 21:27:22,130 DEBUG org.apache.hadoop.hbase.master.RegionManager: Server XX.XX.XX.100,20020,1253219583523 will be unloaded for balance. Server load: 5 avg: 3.78, regions can be moved: 4
> {code}
> It's like the balancer is coming up w/ the wrong answer to the question... I 
> don't see subsequent stuff going on... It does it over and over for hours.
> Then a split comes in and it seems to shake things up.  I see it do a bunch 
> of assigning.
> {code}
> 2009-09-24 21:41:02,784 INFO org.apache.hadoop.hbase.master.ServerManager: Processing MSG_REPORT_SPLIT: locations,,1253657949707: Daughters; locations,,1253828460677, locations,http:\x2F\x2Fen.wikipedia.org\x2Fwiki\x2FLarry_Lucchino,1253828460677 from aa0-009-2.u.powerset.com,20020,1253219584971; 1 of 3
> 2009-09-24 21:41:02,784 DEBUG org.apache.hadoop.hbase.master.RegionManager: Assigning for address: XX.XX.XX.6:20020, startcode: 1253219584971, load: (requests=5213, regions=3, usedHeap=114, maxHeap=2031): total nregions to assign=2, nregions to reach balance=4, isMetaAssign=false
> 2009-09-24 21:41:02,820 DEBUG org.apache.hadoop.hbase.master.RegionManager: Assigning for address: XX.XX.XX.96:20020, startcode: 1253219584175, load: (requests=12, regions=4, usedHeap=404, maxHeap=2031): total nregions to assign=2, nregions to reach balance=4, isMetaAssign=false
> ...
> {code}
> Then back to the 'will be unloaded'... message.
> A new split comes in and then the assigning gets triggered again... a few 
> regions are opened but not enough.
> Eventually it goes back to 'normal' (average load went to 3.85 from 3.8?)

