[
https://issues.apache.org/jira/browse/HADOOP-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
dhruba borthakur updated HADOOP-1269:
-------------------------------------
Attachment: chooseTargetLock2.patch
Incorporated Konstantin's review comments.
1. NetworkTopology.isOnSameRack looks at node.getParent(). These accesses are
protected by the clusterMap lock, so I kept it as it was and did not make any change.
2. NetworkTopology.getDistance(): removed the redundant declaration of i.
3. Host2NodesMap.add locking issue. This was a good catch; I made this change
(a sketch of the pattern follows this list). Also fixed the indentation.
4. Moved the LOG statement in getAdditionalBlock as suggested.
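For item 3, the point is simply to hold the write lock across the whole
check-then-insert sequence. The sketch below is only illustrative, with assumed
field names and a simplified value type; it is not the actual patch contents.

  import java.util.HashMap;
  import java.util.concurrent.locks.ReentrantReadWriteLock;

  // Illustrative only: a host-to-nodes map whose add() takes the write lock
  // for the whole check-then-insert sequence, so two concurrent adds cannot
  // both see "absent" and install conflicting entries for the same host.
  class Host2NodesMapSketch {
    private final HashMap<String, String[]> map = new HashMap<String, String[]>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    boolean add(String host, String node) {
      lock.writeLock().lock();
      try {
        String[] nodes = map.get(host);           // check...
        int len = (nodes == null) ? 0 : nodes.length;
        String[] newNodes = new String[len + 1];  // ...then act, still under the lock
        if (nodes != null) {
          System.arraycopy(nodes, 0, newNodes, 0, len);
        }
        newNodes[len] = node;
        map.put(host, newNodes);
        return true;
      } finally {
        lock.writeLock().unlock();
      }
    }
  }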
I also ran randomWriter on a 10-node cluster. The test ran to completion. The
total elapsed time of the test was 2 hours 40 minutes without this patch and
2 hours 31 minutes with it. Not a single task error was encountered.
> DFS Scalability: namenode throughput impacted because of global FSNamesystem
> lock
> ---------------------------------------------------------------------------------
>
> Key: HADOOP-1269
> URL: https://issues.apache.org/jira/browse/HADOOP-1269
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: chooseTargetLock.patch, chooseTargetLock2.patch,
> serverThreads1.html, serverThreads40.html
>
>
> I have been running a 2000-node cluster and measuring namenode performance.
> There are quite a few "Calls dropped" messages in the namenode log. The
> namenode machine has 4 CPUs and each CPU is about 30% busy. Profiling the
> namenode shows that the methods that consume the most CPU are addStoredBlock()
> and getAdditionalBlock(). The first method is invoked when a datanode
> confirms the presence of a newly created block. The second method is invoked
> when a DFSClient requests a new block for a file.
> I am attaching two files that were generated by the profiler.
> serverThreads40.html captures the scenario when the namenode had 40 server
> handler threads. serverThreads1.html is with 1 server handler thread (with a
> max_queue_size of 4000).
> In the case when there are 40 handler threads, the total elapsed time taken
> by FSNamesystem.getAdditionalBlock() is 1957 seconds, whereas the method
> that it invokes (chooseTarget) takes only about 97 seconds.
> FSNamesystem.getAdditionalBlock is blocked on the global FSNamesystem lock
> for the remaining 1860 seconds.
> My proposal is to implement a finer-grained locking model in the namenode.
> The FSNamesystem has a few important data structures, e.g. blocksMap,
> datanodeMap, leases, neededReplication, pendingCreates, heartbeats, etc. Many
> of these data structures already have their own lock; my proposal is to have
> a lock for each one of them. Each individual lock protects the integrity of
> the contents of its own data structure, while the global FSNamesystem lock
> is still needed to maintain consistency across different data structures.
> If we implement the above proposal, neither addStoredBlock() nor
> getAdditionalBlock() needs to hold the global FSNamesystem lock.
> startFile() and closeFile() still need to acquire the global FSNamesystem
> lock because they have to ensure consistency across multiple data structures.
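> A minimal sketch of this split, assuming illustrative names and simplified
> method signatures (the real FSNamesystem fields and methods are more involved):
>
>   import java.util.HashMap;
>
>   // Sketch with assumed names: one lock per structure for single-structure
>   // operations, the global lock only for operations that must keep several
>   // structures mutually consistent.
>   class FinerGrainedLockingSketch {
>     private final Object globalLock = new Object();     // cross-structure consistency
>     private final Object blocksMapLock = new Object();  // guards blocksMap only
>     private final HashMap<String, String> blocksMap = new HashMap<String, String>();
>     private final HashMap<String, String> pendingCreates = new HashMap<String, String>();
>     private final HashMap<String, String> leases = new HashMap<String, String>();
>
>     // Touches only blocksMap, so its per-structure lock is enough.
>     void addStoredBlock(String blockId, String datanode) {
>       synchronized (blocksMapLock) {
>         blocksMap.put(blockId, datanode);
>       }
>     }
>
>     // Touches pendingCreates and leases together, so it still takes the
>     // global lock to keep them consistent with each other.
>     void startFile(String path, String clientName) {
>       synchronized (globalLock) {
>         pendingCreates.put(path, clientName);
>         leases.put(clientName, path);
>       }
>     }
>   }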
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.