On Thu, Sep 23, 2010 at 6:20 PM, Konstantin Shvachko <s...@yahoo-inc.com> wrote:

> Hi Shen,
>
> Why do we need CheckpointNode?
> 1. First of all it is a compatible replacement of SecondaryNameNode.
> 2. Checkpointing is also needed for periodically compacting edits.
> You can do it with CN or BN, but CN is more lightweight.
> I assume there could be cases when streaming edits to BN over network
> can be slower than writing them to disk, so you might want to turn BN
> off for performance reasons.
>

Also, if the BN hangs, it will hang edits on the primary node as well,
since synchronous RPCs are used to push edits, right Konst? Would be
worth testing a "kill -STOP" on the BN while performing operations on
the primary.

> 3. Also in current implementation NN allows only one BN, but multiple CNs.
> So if the single BN dies the checkpointing will stall.
> You can prevent it by starting two CNs instead, or one BN and one CN.
> But I agree with you CN is just a subset of BN by its functionality.
>
> Thanks,
> Konstantin
>
>
> On 9/22/2010 5:50 PM, ChingShen wrote:
>
>> Thanks Konstantin,
>>
>> But, my main question is that because the CN can only provide an old
>> state of the namespace, so why do we need it? I think the BN is best
>> solution.
>>
>> Shen
>>
>> On Thu, Sep 23, 2010 at 5:20 AM, Konstantin Shvachko <s...@yahoo-inc.com> wrote:
>>
>>> The CheckpointNode creates checkpoints of the namespace, but does not
>>> keep an up-to-date state of the namespace in memory.
>>> If primary NN fails CheckpointNode can only provide an old state of the
>>> namespace created during latest checkpoint.
>>> Also CheckpointNode is a replacement for SecondaryNameNode in earlier
>>> releases.
>>>
>>> BackupNode does checkpoints too, but in addition keeps an up-to-date
>>> state of the namespace in its memory.
>>> When the primary NN dies you can ask BackupNode to save namespace,
>>> which will create the up-to-date image, and then start NN instead of
>>> BN on the node BN was running using that saved image directly or
>>> start NN on a different node using importCheckpoint from the saved
>>> image directory.
>>>
>>> See the guide here.
>>>
>>> http://hadoop.apache.org/hdfs/docs/r0.21.0/hdfs_user_guide.html#Checkpoint+Node
>>>
>>> Thanks,
>>> --Konstantin
>>>
>>>
>>> On 9/8/2010 11:36 PM, ChingShen wrote:
>>>
>>>> Hi all,
>>>>
>>>> I got the Backup node(BN) that includes all the checkpoint
>>>> responsibilities, and it maintains an up-to-date namespace state,
>>>> which is always in sync with the active NN.
>>>>
>>>> Q1. In which situation do we need a CN?
>>>>
>>>> Q2. If the NameNode machine fails, which different manual
>>>> intervention between BN and CN?
>>>>
>>>> Thanks.
>>>>
>>>> Shen

--
Todd Lipcon
Software Engineer, Cloudera