[ 
https://issues.apache.org/jira/browse/HBASE-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025594#comment-13025594
 ] 

ramkrishna.s.vasudevan commented on HBASE-3821:
-----------------------------------------------

I would like to add to zhoushuaifeng's analysis.
Suppose the API cleanUpSplitDir() throws an IOException due to some DFS error.
We try to roll back.
If the rollback also hits a DFS error, that exception is not 
handled.
We handle only RuntimeException there, so the exception propagates 
up to the run method of CompactSplitThread.
There again the exception is not handled.
I think it is better to handle it so that at least the user comes to know about 
the exception.
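
To illustrate the flow described above, here is a minimal, self-contained sketch. It is not the actual SplitTransaction or CompactSplitThread code: apart from cleanUpSplitDir(), every class and method name is hypothetical, and the DFS errors are simulated. It only shows why a checked IOException thrown during rollback slips past a catch block written for RuntimeException, and one possible way to surface it.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class SplitRollbackSketch {
    // Hypothetical stand-in for the split step that hits a DFS error.
    static void cleanUpSplitDir() throws IOException {
        throw new IOException("simulated DFS error in cleanUpSplitDir");
    }

    // Hypothetical stand-in for a rollback that hits a DFS error as well.
    static void rollback() throws IOException {
        throw new IOException("simulated DFS error during rollback");
    }

    // The flow as described: only RuntimeException is caught around the
    // rollback, so the IOException from rollback() escapes to the caller.
    static void splitAsDescribed() throws IOException {
        try {
            cleanUpSplitDir();
        } catch (IOException splitFailure) {
            try {
                rollback();
            } catch (RuntimeException e) {
                // Never reached here: IOException is not a RuntimeException.
            }
        }
    }

    // One possible handling (an assumption, not the committed fix):
    // catch the rollback failure too and record it so the user sees it.
    static List<String> splitWithHandling() {
        List<String> log = new ArrayList<>();
        try {
            cleanUpSplitDir();
        } catch (IOException splitFailure) {
            log.add("split failed: " + splitFailure.getMessage());
            try {
                rollback();
            } catch (IOException | RuntimeException rollbackFailure) {
                log.add("rollback failed: " + rollbackFailure.getMessage());
            }
        }
        return log;
    }

    public static void main(String[] args) {
        try {
            splitAsDescribed();
            System.out.println("no exception reached the caller");
        } catch (IOException e) {
            System.out.println("escaped to caller: " + e.getMessage());
        }
        splitWithHandling().forEach(System.out::println);
    }
}
```

In the second variant the rollback failure is at least recorded; what the region server should do afterwards (abort, retry, or something else) is a separate decision that this sketch does not address.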


>  "NOT flushing memstore for region" keep on printing for half an hour
> ---------------------------------------------------------------------
>
>                 Key: HBASE-3821
>                 URL: https://issues.apache.org/jira/browse/HBASE-3821
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.90.1
>            Reporter: zhoushuaifeng
>             Fix For: 0.90.3
>
>         Attachments: HBase-3821 v1.txt
>
>
> "NOT flushing memstore for region" kept printing for half an hour in the 
> regionserver log. Then I restarted HBase. I think there may be a deadlock or a loop.
> I know that when splitting a region, it calls doClose on the region, sets 
> writestate.writesEnabled = false, and may run a close preflush. This makes 
> flushes fail and print "NOT flushing memstore for region". But it should 
> finish after a while.
> logs:
> 2011-04-18 16:28:27,960 DEBUG 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested 
> for ufdr,,1303124043153.031f37c9c23fcab17797b06b90205610. because 
> regionserver60020.cacheFlusher; priority=-1, compaction queue size=1
> 2011-04-18 16:28:30,171 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Flush requested on ufdr,,1303124043153.031f37c9c23fcab17797b06b90205610.
> 2011-04-18 16:28:30,171 WARN 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region 
> ufdr,,1303124043153.031f37c9c23fcab17797b06b90205610. has too many store 
> files; delaying flush up to 90000ms
> 2011-04-18 16:28:32,119 INFO 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs 
> -- HDFS-200
> 2011-04-18 16:28:32,285 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: 
> Roll 
> /hbase/.logs/linux253,60020,1303123943360/linux253%3A60020.1303124206693, 
> entries=5226, filesize=255913736. New hlog 
> /hbase/.logs/linux253,60020,1303123943360/linux253%3A60020.1303124311822
> 2011-04-18 16:28:32,287 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
> Found 1 hlogs to remove out of total 2; oldest outstanding sequenceid is 
> 11037 from region 031f37c9c23fcab17797b06b90205610
> 2011-04-18 16:28:32,288 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: 
> moving old hlog file 
> /hbase/.logs/linux253,60020,1303123943360/linux253%3A60020.1303123945481 
> whose highest sequenceid is 6052 to 
> /hbase/.oldlogs/linux253%3A60020.1303123945481
> 2011-04-18 16:28:42,701 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Completed major compaction of 4 file(s), new 
> file=hdfs://10.18.52.108:9000/hbase/ufdr/031f37c9c23fcab17797b06b90205610/value/4398465741579485290,
>  size=281.4m; total size for store is 468.8m
> 2011-04-18 16:28:42,712 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> completed compaction on region 
> ufdr,,1303124043153.031f37c9c23fcab17797b06b90205610. after 1mins, 40sec
> 2011-04-18 16:28:42,741 INFO 
> org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of 
> region ufdr,,1303124043153.031f37c9c23fcab17797b06b90205610.
> 2011-04-18 16:28:42,770 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Closing ufdr,,1303124043153.031f37c9c23fcab17797b06b90205610.: disabling 
> compactions & flushes
> 2011-04-18 16:28:42,770 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Running close preflush of 
> ufdr,,1303124043153.031f37c9c23fcab17797b06b90205610.
> 2011-04-18 16:28:42,771 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Started memstore flush for 
> ufdr,,1303124043153.031f37c9c23fcab17797b06b90205610., current region 
> memstore size 105.6m
> 2011-04-18 16:28:42,818 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Finished snapshotting, commencing flushing stores
> 2011-04-18 16:28:42,846 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> NOT flushing memstore for region 
> ufdr,,1303124043153.031f37c9c23fcab17797b06b90205610., flushing=false, 
> writesEnabled=false
> 2011-04-18 16:28:42,849 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Flush requested on ufdr,,1303124043153.031f37c9c23fcab17797b06b90205610.
> 2011-04-18 16:28:42,849 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> NOT flushing memstore for region 
> ufdr,,1303124043153.031f37c9c23fcab17797b06b90205610., flushing=false, 
> writesEnabled=false ......
> 2011-04-18 17:04:08,803 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> NOT flushing memstore for region 
> ufdr,,1303124043153.031f37c9c23fcab17797b06b90205610., flushing=false, 
> writesEnabled=false
> 2011-04-18 17:04:08,803 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Flush requested on ufdr,,1303124043153.031f37c9c23fcab17797b06b90205610.
> Mon Apr 18 17:04:24 IST 2011 Starting regionserver on linux253 ulimit -n 1024

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
