[
https://issues.apache.org/jira/browse/HADOOP-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620335#action_12620335
]
dhruba borthakur commented on HADOOP-3631:
------------------------------------------
The problem arises because the secondary namenode reads from and writes to the
same disk device on which the primary namenode is writing transactions. One
approach would be to allow the namenode to use three separate physical disk
devices: D1, D2, and D3.
Initial:
D1: fsimage
D2: edits
D3: none
RollEditLog does the following:
D1: fsimage
D2: edits
D3: edits.new
Now, the Secondary fetches fsimage and edits, merges them, and uploads the
merged image back as follows:
D1: fsimage, fsimage.ckpt
D2: edits
D3: edits.new
This download and upload of files by the secondary does not affect namenode
performance, because namenode performance depends on the performance of the
transaction log, which is now being written on disk device D3. The Secondary
namenode does not touch D3.
Then the primary namenode atomically switches fsimage/edits (moves fsimage.ckpt
to fsimage, removes edits, renames edits.new to edits), and we are left with the
following:
D1: fsimage
D2: none
D3: edits
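
To make the sequence above easier to check, here is a rough Python sketch of one
checkpoint cycle. It is only an illustration, not namenode code: the three
"devices" are temporary directories, the merge is plain string concatenation,
and the function names (roll_edit_log, upload_checkpoint, finalize_checkpoint,
run_cycle) are made up for this example.

```python
import os
import tempfile
from pathlib import Path


def listing(disks):
    """Return {device name: sorted file names} for the stand-in devices."""
    return {name: sorted(p.name for p in path.iterdir())
            for name, path in disks.items()}


def roll_edit_log(disks):
    # New transactions go to edits.new on D3; edits on D2 is no longer written.
    (disks["D3"] / "edits.new").write_text("")


def upload_checkpoint(disks):
    # Stand-in for the secondary: read fsimage (D1) and edits (D2), "merge"
    # them by concatenation, and write the result to D1 as fsimage.ckpt.
    image = (disks["D1"] / "fsimage").read_text()
    edits = (disks["D2"] / "edits").read_text()
    (disks["D1"] / "fsimage.ckpt").write_text(image + edits)


def finalize_checkpoint(disks):
    # Each rename stays on a single device. The three steps are NOT made
    # atomic as a group here; that part is left out of the sketch.
    os.replace(disks["D1"] / "fsimage.ckpt", disks["D1"] / "fsimage")
    (disks["D2"] / "edits").unlink()
    os.replace(disks["D3"] / "edits.new", disks["D3"] / "edits")


def run_cycle():
    """Run one cycle and return (layout after each step, final image, final edits)."""
    with tempfile.TemporaryDirectory() as root:
        disks = {name: Path(root) / name for name in ("D1", "D2", "D3")}
        for path in disks.values():
            path.mkdir()
        (disks["D1"] / "fsimage").write_text("image;")
        (disks["D2"] / "edits").write_text("txn1;txn2;")
        states = [listing(disks)]

        roll_edit_log(disks)
        # A transaction that arrives while the checkpoint is in progress.
        with (disks["D3"] / "edits.new").open("a") as log:
            log.write("txn3;")
        states.append(listing(disks))

        upload_checkpoint(disks)
        states.append(listing(disks))

        finalize_checkpoint(disks)
        states.append(listing(disks))

        final_image = (disks["D1"] / "fsimage").read_text()
        final_edits = (disks["D3"] / "edits").read_text()
    return states, final_image, final_edits
```

Calling run_cycle() returns the four layouts in the same order as the listings
above. Note that both renames stay within one device (fsimage.ckpt to fsimage on
D1, edits.new to edits on D3), so no file data has to be copied between devices
during the switch.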
> Transfer of image from secondary name node should not interrupt service
> -----------------------------------------------------------------------
>
> Key: HADOOP-3631
> URL: https://issues.apache.org/jira/browse/HADOOP-3631
> Project: Hadoop Core
> Issue Type: Improvement
> Components: dfs
> Affects Versions: 0.17.0
> Reporter: Robert Chansler
> Priority: Critical
> Fix For: 0.19.0
>
>
> The transfer of the new image prepared by the secondary name node can
> interfere with client services. Clients observe delays in completing RPCs. In
> general, administrative activities should not be observed by the clients. For
> large clusters, administrators are reluctant to run the secondary name node
> leading to excessive edit logs. (Excessive in the sense that if the cluster
> must be restarted, a long time is required to process the log.)
> Maybe the new image does not have to be transferred; it could be fetched when
> needed.
> Maybe the priority of the transfer task can be reduced so that the transfer
> is not observed.
> Maybe a different transfer protocol is more appropriate.