[ https://issues.apache.org/jira/browse/HADOOP-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656994#action_12656994 ]
Brian Bockelman commented on HADOOP-4866:
-----------------------------------------
Hey all,
Below is the exception which is happening on the datanode side (not for the
exact same block, but I believe it's the same problem).
We have been growing our cluster constantly, meaning that almost every day we
take out nodes (to install new drives and re-kickstart them) or put in new
nodes. This has caused a lot of churn through the decommissioning process and
the balancer. We also run a continuous external transfer load test, which
deletes files within seconds of a successful transfer. I'd believe you if you
claimed we were pushing the boundaries :)
I have a few other patches to apply to the namenode today; I'll try getting to
the one Nicholas posted and see if that solves it.
Brian
2008-12-16 08:31:46,084 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 50020, call recoverBlock(blk_5492981093339503298_94099, false, [Lorg.apache.hadoop.hdfs.protocol.DatanodeInfo;@6325950d) from 129.93.239.144:39774: error: org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block (=blk_5492981093339503298_94099) not found
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:1898)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.commitBlockSynchronization(NameNode.java:410)
    at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Block (=blk_5492981093339503298_94099) not found
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:1898)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.commitBlockSynchronization(NameNode.java:410)
    at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)
    at org.apache.hadoop.ipc.Client.call(Client.java:696)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at $Proxy4.commitBlockSynchronization(Unknown Source)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:1461)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1442)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1508)
    at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)
> NameNode error in commitBlockSynchronization
> --------------------------------------------
>
> Key: HADOOP-4866
> URL: https://issues.apache.org/jira/browse/HADOOP-4866
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.19.0
> Reporter: Brian Bockelman
> Attachments: 4866_20081215.patch
>
>
> The NameNode continuously hits an error in commitBlockSynchronization.
> This happens for ~5 blocks at a rate of 5-10 Hz. I have no idea when this
> started, because it has been going on for days, well past the start of our
> current logs.
> This appears to be a new symptom in 0.19.0, but I have no idea what could be
> causing it.