[ https://issues.apache.org/jira/browse/ACCUMULO-1759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790804#comment-13790804 ]
John Vines commented on ACCUMULO-1759:
--------------------------------------

We catted several of the files and they were empty. Unfortunately, the circumstances around this are hard to reproduce, so we haven't been able to recreate it sans patch. [~supermallen] believes that his patch shouldn't interfere with when/how the walogs are created, though.

> Empty walogs block recovery after power outage.
> -----------------------------------------------
>
>                 Key: ACCUMULO-1759
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1759
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.5.0
>         Environment: * HDP 1.3
> ** {{dfs.durable.sync=true}}
> ** {{dfs.datanode.synconclose=true}}
> * encryption patch from ACCUMULO-998
>            Reporter: Luke Brassard
>
> Power was abruptly cut to the cluster. Upon restart of HDFS, there was a
> single Rfile that was missing a block.
> Here are some details about the cluster:
> * HDP 1.3
> ** {{dfs.durable.sync=true}}
> ** {{dfs.datanode.synconclose=true}}
> * encryption patch from ACCUMULO-998
> After restarting Accumulo, the Master was complaining with a series of these:
> {code}
> 2013-10-09 18:15:16,649 [recovery.HadoopLogCloser] INFO : Waiting for file to be closed /accumulo/wal/10.10.0.1+9997/d52ab315-5ac1-4a5c-9085-67ae29b98b88
> 2013-10-09 18:15:16,663 [recovery.HadoopLogCloser] INFO : Waiting for file to be closed /accumulo/wal/10.10.0.2+9997/d0192739-74e2-43a0-985f-3ed668259995
> 2013-10-09 18:15:16,742 [recovery.HadoopLogCloser] INFO : Waiting for file to be closed /accumulo/wal/10.10.0.3+9997/de54e6dc-964a-4b33-b4fb-052e81749913
> 2013-10-09 18:15:16,833 [recovery.HadoopLogCloser] INFO : Waiting for file to be closed /accumulo/wal/10.10.0.4+9997/cda5daec-25f3-443b-818a-990d3eddd56f
> {code}
> Inspection of the files above showed that they were all empty, but referenced
> in the {{!METADATA}} table. The solution was to move or remove the files from
> HDFS and delete the references from the metadata. The instance was then able
> to stabilize and assign the rest of the tablets.
> It is unclear why these empty walogs existed in the first place. Is it
> possible that there should have been data in these walogs? Or should the
> files have been disregarded since they were empty?
> Regarding the Rfile that was missing the block, it was not being referenced
> by the {{!METADATA}} table, so removing it had no negative effect on the
> instance. Should there have been some reference to this file?

--
This message was sent by Atlassian JIRA
(v6.1#6144)
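
For triage, the walog paths the Master is waiting on can be collected from its log before each one is checked in HDFS. What follows is a minimal Python sketch, assuming each log entry sits on a single unwrapped line as the Master writes it (the copies quoted in the report are wrapped by email). The `stuck_walogs` name and the regular expression are illustrative assumptions, not part of Accumulo.

```python
import re

# Assumed pattern, modelled on the HadoopLogCloser lines quoted in the report.
# The captured group is the walog path that follows the message text.
WAITING = re.compile(
    r"\[recovery\.HadoopLogCloser\]\s+INFO\s*:\s*"
    r"Waiting for file to be closed\s+(\S+)"
)


def stuck_walogs(log_lines):
    """Return the distinct walog paths the Master is waiting on, in first-seen order."""
    seen = []
    for line in log_lines:
        match = WAITING.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Two entries copied from the report, one repeated, plus a placeholder line
# that stands in for any unrelated log output.
sample = [
    "2013-10-09 18:15:16,649 [recovery.HadoopLogCloser] INFO : Waiting for file to be closed "
    "/accumulo/wal/10.10.0.1+9997/d52ab315-5ac1-4a5c-9085-67ae29b98b88",
    "2013-10-09 18:15:16,663 [recovery.HadoopLogCloser] INFO : Waiting for file to be closed "
    "/accumulo/wal/10.10.0.2+9997/d0192739-74e2-43a0-985f-3ed668259995",
    "placeholder for an unrelated log line",
    "2013-10-09 18:15:16,649 [recovery.HadoopLogCloser] INFO : Waiting for file to be closed "
    "/accumulo/wal/10.10.0.1+9997/d52ab315-5ac1-4a5c-9085-67ae29b98b88",
]

for path in stuck_walogs(sample):
    print(path)
```

The sketch only lists candidates. Whether a listed file is actually empty, and whether it is still referenced in {{!METADATA}}, has to be confirmed separately before anything is moved or deleted.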