[ 
https://issues.apache.org/jira/browse/HBASE-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780128#action_12780128
 ] 

Lars George commented on HBASE-1994:
------------------------------------

From IRC

{code}
<larsgeorge> the write path is in the thread
<clehene> it's wap
<larsgeorge> yeah
<larsgeorge> urgh
<larsgeorge> it also only catches IOException
<larsgeorge> I only know from experience that uncaught exceptions in threads rarely get logged
<larsgeorge> and if they are, not properly, as the stack is different
<clehene> uhm... however an exception in that big try could leave it empty
<larsgeorge> yes
<larsgeorge> fragile
<clehene> and next time it would split 
<clehene> it would just throw away all edits because it fails with EOF
<larsgeorge> could be
<larsgeorge> the read deletes the input
<St^Ack> anything in the .out files?
<larsgeorge> so when the write fails
<larsgeorge> tough luck?
<larsgeorge> the read part has the delete in the finally
<larsgeorge> so the input log is deleted for sure
<St^Ack> cosmin you think the master went down while it was splitting a log?
<larsgeorge> shouldn't that be done at the very end? Or in some sort of atomic commit
<larsgeorge> as in have a big try/catch/finally and either rollback the split or apply it
<St^Ack> I think that general split state needs to be hoisted up into zk
<St^Ack> master takes out a 'lock'
<St^Ack> one that will evaporate if it dies mid-split
<larsgeorge> hmm
<larsgeorge> for a master crash?
<larsgeorge> as this is all done in master start anyways
<St^Ack> yeah
<St^Ack> an empty oldlogfile.log -- i can't stand typing the name even -- would seem to indicate an exit w/o a call to close
{code}
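
The "delete the input only at the very end" idea from the chat could look roughly like the sketch below. It is an illustration under stated assumptions, not the HLog code: it uses java.nio.file on a local filesystem as a stand-in for HDFS, and the class SplitCommitSketch, the method splitAndCommit, and the ".tmp" suffix are all hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SplitCommitSketch {

    /**
     * Stages each region's edits in a temporary file, renames the staged
     * files into place only after every write succeeded, and deletes the
     * input log as the very last step. If any write fails, the staged files
     * are removed and the input log is left untouched.
     */
    static void splitAndCommit(Path inputLog, Map<Path, List<String>> editsByTarget)
            throws IOException {
        List<Path[]> staged = new ArrayList<>(); // pairs of {temporary, final}
        try {
            for (Map.Entry<Path, List<String>> entry : editsByTarget.entrySet()) {
                Path target = entry.getKey();
                Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
                staged.add(new Path[] {tmp, target}); // register first so cleanup sees it
                Files.write(tmp, entry.getValue());
            }
        } catch (IOException | RuntimeException ex) {
            // Roll back: drop whatever was staged, keep the input log.
            for (Path[] pair : staged) {
                Files.deleteIfExists(pair[0]);
            }
            throw ex;
        }
        // Apply: all writes succeeded, so publish the staged files.
        for (Path[] pair : staged) {
            Files.move(pair[0], pair[1], StandardCopyOption.REPLACE_EXISTING);
        }
        // Only now is it safe to remove the input.
        Files.delete(inputLog);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("split-sketch");
        Path input = dir.resolve("hlog.dat");
        Files.write(input, Arrays.asList("edit-1", "edit-2"));

        Map<Path, List<String>> edits = new LinkedHashMap<>();
        edits.put(dir.resolve("region-a.log"), Arrays.asList("edit-1", "edit-2"));
        splitAndCommit(input, edits);
        System.out.println("input still present: " + Files.exists(input));
    }
}
```

The rename phase is still not atomic across several files, so a crash between two renames would need some external marker to recover from, which is where the zk 'lock' suggested above would come in.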

> Master will lose hlog entries while splitting if region has empty 
> oldlogfile.log
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-1994
>                 URL: https://issues.apache.org/jira/browse/HBASE-1994
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.21.0
>            Reporter: Cosmin Lehene
>            Priority: Blocker
>             Fix For: 0.21.0
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> I don't know yet how an empty oldlogfile.log can come to exist, but it 
> happened.
> The master fails to write the splits to a region's oldlogfile.log if an 
> empty oldlogfile.log already exists there.
> This is the master log after I artificially reproduced the problem by placing 
> an empty oldlogfile.log in /hbase/.META./1028785192/oldlogfile.log and then 
> killing the regionserver that was holding the .META. table:
> 2009-11-19 09:08:36,012 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting 1 hlog(s) in hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773
> 2009-11-19 09:08:36,012 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting hlog 1 of 1: hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773/hlog.dat.1258637493128, length=0
> 2009-11-19 09:08:36,019 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Adding queue for .META.,,1
> 2009-11-19 09:08:36,037 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Pushed=795 entries from hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773/hlog.dat.1258637493128
> 2009-11-19 09:08:36,038 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Thread got 795 to process
> 2009-11-19 09:08:36,043 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: Old hlog file hdfs://b0:9000/hbase/.META./1028785192/oldlogfile.log already exists. Copying existing file to new file
> 2009-11-19 09:08:36,079 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: Got while writing region .META.,,1 log java.io.EOFException
> 2009-11-19 09:08:36,081 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: hlog file splitting completed in 70 millis for hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773
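
The EOFException in the log above is what one would expect when a reader tries to parse a header from a zero-length file. The sketch below shows one possible guard: treat an empty or missing existing file as "no old edits" before merging in the new ones. Everything in it is an assumption made for illustration: the tiny file format (a MAGIC int, a record count, then UTF strings), the class EmptyOldLogSketch, and its methods are hypothetical and are not HBase's SequenceFile-based format or API. Whether silently skipping an empty file is the right policy for HBase is also an open question.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class EmptyOldLogSketch {

    static final int MAGIC = 0x484C4F47; // hypothetical header value

    /** Reads all edits; an absent or zero-length file yields an empty list. */
    static List<String> readEdits(Path log) throws IOException {
        List<String> edits = new ArrayList<>();
        // Guard: without this check, readInt() below throws EOFException
        // on a zero-length file.
        if (!Files.exists(log) || Files.size(log) == 0) {
            return edits;
        }
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(log)))) {
            if (in.readInt() != MAGIC) {
                throw new IOException("bad header in " + log);
            }
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                edits.add(in.readUTF());
            }
        }
        return edits;
    }

    /** Writes the header followed by all edits, replacing any existing file. */
    static void writeEdits(Path log, List<String> edits) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(log)))) {
            out.writeInt(MAGIC);
            out.writeInt(edits.size());
            for (String edit : edits) {
                out.writeUTF(edit);
            }
        }
    }

    /** Merges the existing edits (if any) with the new split edits. */
    static void appendSplitEdits(Path oldLog, List<String> newEdits) throws IOException {
        List<String> merged = readEdits(oldLog);
        merged.addAll(newEdits);
        writeEdits(oldLog, merged);
    }

    public static void main(String[] args) throws IOException {
        Path oldLog = Files.createTempFile("oldlogfile", ".log"); // zero-length file
        appendSplitEdits(oldLog, Arrays.asList("edit-1", "edit-2"));
        System.out.println("edits after merge: " + readEdits(oldLog).size());
    }
}
```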

