[
https://issues.apache.org/jira/browse/HADOOP-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
dhruba borthakur updated HADOOP-1093:
-------------------------------------
Attachment: nyr2.patch
This patch does the following:
1. The client no longer sends block confirmations to the namenode. Instead,
each datanode sends block confirmations to the namenode.
2. The namenode no longer verifies that all previous blocks have reached the
minimum replication factor before allocating the next block for the file. The
check is done when the file is closed: a successful close guarantees that
every block of the file has reached the minimum replication factor (see the
close-check sketch below).
3. The dfsclient uses an exponential backoff scheme when it has to retry RPCs
to the namenode, which keeps the namenode from being overwhelmed (see the
backoff sketch below).
4. The datanode forwards the data to the next datanode in the pipeline before
it writes the data to its local block file (see the pipeline sketch below).
5. The default number of namenode server threads has been increased from 10
to 30, which helps when there are 2000 datanodes. The call queue length per
thread is 100, so with this setting the overall call queue length is 3000.
There is also a configuration setting to change the number of handler threads
(see the configuration sketch below).
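Close-check sketch: the close-time verification in item 2 amounts to a scan
like the one below. This is a minimal illustration, not the code in
nyr2.patch; the method name, the replicaCounts map, and the use of Hadoop's
Block class as a key are my assumptions.

    // Sketch: a file close succeeds only after every block of the file has
    // reached the configured minimum replication. All names here are
    // illustrative, not taken from nyr2.patch.
    boolean checkFileComplete(Block[] blocks,
                              Map<Block, Integer> replicaCounts,
                              int minReplication) {
      for (Block b : blocks) {
        Integer count = replicaCounts.get(b);
        if (count == null || count.intValue() < minReplication) {
          return false;   // caller retries the close later
        }
      }
      return true;        // safe to mark the file complete
    }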
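Backoff sketch: the exponential backoff in item 3 behaves roughly like the
loop below. The retry bound, the initial 400 ms wait, and the doubling
factor are assumed values for illustration; the actual constants in
nyr2.patch may differ.

    // Sketch: retry addBlock() with exponentially growing waits so a busy
    // namenode is not hammered with immediate retries. Constants and names
    // are assumptions, not the values used in nyr2.patch.
    private static final int MAX_RETRIES = 5;

    LocatedBlock addBlockWithBackoff(ClientProtocol namenode, String src,
                                     String clientName) throws IOException {
      long sleepMillis = 400;                 // assumed initial wait
      for (int retry = 0; ; retry++) {
        try {
          return namenode.addBlock(src, clientName);
        } catch (NotReplicatedYetException e) {
          if (retry >= MAX_RETRIES) {
            throw e;                          // give up after bounded attempts
          }
          try {
            Thread.sleep(sleepMillis);
          } catch (InterruptedException ie) {
            throw new IOException("interrupted during backoff");
          }
          sleepMillis *= 2;                   // double the wait each retry
        }
      }
    }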
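Pipeline sketch: item 4 is essentially a reordering of two writes on the
datanode's receive path, roughly as below. mirrorOut and blockOut are
illustrative names for the stream to the downstream datanode and the local
block file; they are not identifiers from the patch.

    // Sketch: forward the packet downstream before the local disk write, so
    // the last datanode in the pipeline starts receiving data without
    // waiting on upstream disk I/O. The pipeline latency becomes the sum of
    // the network hops plus one disk write, rather than one disk write per
    // hop.
    void receivePacket(byte[] buf, int off, int len,
                       OutputStream mirrorOut,   // next datanode, may be null
                       OutputStream blockOut)    // local block file
        throws IOException {
      if (mirrorOut != null) {
        mirrorOut.write(buf, off, len);   // 1. forward down the pipeline first
        mirrorOut.flush();
      }
      blockOut.write(buf, off, len);      // 2. then write the local copy
    }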
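Configuration sketch: the handler-thread setting in item 5 would be picked up
from the site configuration along these lines. The key name
dfs.namenode.handler.count is my guess from the description above, not
something I have confirmed against the patch.

    // Sketch: read the handler-thread count from configuration, defaulting
    // to the new value of 30. The key name is an assumption. With 100 queued
    // calls per handler, 30 handlers give an overall call queue of
    // 30 * 100 = 3000.
    int handlerCount = conf.getInt("dfs.namenode.handler.count", 30);
    Server server = RPC.getServer(namenode, bindAddress, port,
                                  handlerCount, false, conf);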
> NNBench generates millions of NotReplicatedYetException in Namenode log
> -----------------------------------------------------------------------
>
> Key: HADOOP-1093
> URL: https://issues.apache.org/jira/browse/HADOOP-1093
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.12.0
> Reporter: Nigel Daley
> Assigned To: dhruba borthakur
> Fix For: 0.13.0
>
> Attachments: nyr2.patch
>
>
> Running NNBench on the latest trunk (0.12.1 candidate) on a few hundred nodes
> yielded 2.3 million of these exceptions in the NN log:
> 2007-03-08 09:23:03,053 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8020 call error:
> org.apache.hadoop.dfs.NotReplicatedYetException: Not replicated yet
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:803)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:309)
>         at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:336)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:559)
> I run NNBench to create files with block size set to 1 and replication set to
> 1. NNBench then writes 1 byte to the file. Minimum replication for the
> cluster is the default, i.e. 1. If it encounters an exception while trying to
> do either the create or the write operation, it loops and tries again.
> Multiply this by 1000 files per node and a few hundred nodes.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.