[ 
https://issues.apache.org/jira/browse/ACCUMULO-1759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790845#comment-13790845
 ] 

Michael Allen commented on ACCUMULO-1759:
-----------------------------------------

bq. I'd believe that more of the RFile stuff and the WALog stuff.

I meant to say, I would believe that more of the RFile stuff *than* the WALog 
stuff.

> Empty walogs block recovery after power outage.
> -----------------------------------------------
>
>                 Key: ACCUMULO-1759
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1759
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.5.0
>         Environment: * HDP 1.3
> ** {{dfs.durable.sync=true}}
> ** {{dfs.datanode.synconclose=true}}
> * encryption patch from ACCUMULO-998
>            Reporter: Luke Brassard
>
> Power was abruptly cut to the cluster. Upon restart of HDFS, there was a
> single RFile that was missing a block.
> Here are some details about the cluster:
> * HDP 1.3
> ** {{dfs.durable.sync=true}}
> ** {{dfs.datanode.synconclose=true}}
> * encryption patch from ACCUMULO-998
> After restarting Accumulo, the Master was complaining with a series of these:
> {code}
> 2013-10-09 18:15:16,649 [recovery.HadoopLogCloser] INFO : Waiting for file to 
> be closed /accumulo/wal/10.10.0.1+9997/d52ab315-5ac1-4a5c-9085-67ae29b98b88
> 2013-10-09 18:15:16,663 [recovery.HadoopLogCloser] INFO : Waiting for file to 
> be closed /accumulo/wal/10.10.0.2+9997/d0192739-74e2-43a0-985f-3ed668259995
> 2013-10-09 18:15:16,742 [recovery.HadoopLogCloser] INFO : Waiting for file to 
> be closed /accumulo/wal/10.10.0.3+9997/de54e6dc-964a-4b33-b4fb-052e81749913
> 2013-10-09 18:15:16,833 [recovery.HadoopLogCloser] INFO : Waiting for file to 
> be closed /accumulo/wal/10.10.0.4+9997/cda5daec-25f3-443b-818a-990d3eddd56f
> {code}
> Inspection of the files above showed that they were all empty, but referenced 
> in the {{!METADATA}} table. The solution was to move or remove the files from 
> HDFS and delete the references from the metadata. The instance was then able 
> to stabilize and assign the rest of the tablets.
> It is unclear why these empty walogs existed in the first place. Is it 
> possible that there should have been data in these walogs? Or should the 
> files have been disregarded since they were empty?
> Regarding the RFile that was missing the block, it was not being referenced
> by the {{!METADATA}} table, so removing it had no negative effect on the 
> instance. Should there have been some reference to this file?
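
For anyone hitting the same symptom, here is a minimal sketch (not part of Accumulo) that pulls the walog paths out of Master log text, so the files the Master is waiting on can be inspected one by one. The function name {{stuck_walogs}} and the regular expression are hypothetical; the pattern is inferred only from the four log lines quoted above and assumes each message sits on a single line in the real log.

```python
import re

# Pattern assumed from the HadoopLogCloser lines quoted in this issue;
# other Accumulo versions may word the message differently.
WAITING = re.compile(
    r"\[recovery\.HadoopLogCloser\]\s+INFO\s*:\s+"
    r"Waiting for file to\s+be closed\s+(\S+)"
)


def stuck_walogs(log_text):
    """Return distinct walog paths the Master reports waiting on, in first-seen order."""
    seen = []
    for path in WAITING.findall(log_text):
        if path not in seen:
            seen.append(path)
    return seen


# Sample input built from two of the log lines in the report; the first
# message is repeated to show that duplicates collapse to one entry.
sample = (
    "2013-10-09 18:15:16,649 [recovery.HadoopLogCloser] INFO : Waiting for file to be closed "
    "/accumulo/wal/10.10.0.1+9997/d52ab315-5ac1-4a5c-9085-67ae29b98b88\n"
    "2013-10-09 18:15:16,663 [recovery.HadoopLogCloser] INFO : Waiting for file to be closed "
    "/accumulo/wal/10.10.0.2+9997/d0192739-74e2-43a0-985f-3ed668259995\n"
    "2013-10-09 18:15:21,649 [recovery.HadoopLogCloser] INFO : Waiting for file to be closed "
    "/accumulo/wal/10.10.0.1+9997/d52ab315-5ac1-4a5c-9085-67ae29b98b88\n"
)

for walog in stuck_walogs(sample):
    print(walog)
```

The sketch only lists candidates. Confirming that each file is actually empty, and then moving it aside and deleting its references from {{!METADATA}}, remains the manual step described in the report.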



--
This message was sent by Atlassian JIRA
(v6.1#6144)
