Stefan Groschupf wrote:
> Hi hadoop developers,
Hi,
My comments below. Generally speaking, I think you are right - datanodes
should be initialized with a UUID, created once and persisted across IP
and hostname changes, and this UUID should be used to identify
datanodes/namenodes. I think the "format" command should also be
implemented for datanodes, to create their UUID when starting for the
first time - later on this UUID would be retrieved from a local
file somewhere in the data dir.
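Something along these lines, perhaps - just a rough sketch, where the
DataStorage class, the getOrCreateStorageID method and the "storageID"
file name are all made up for illustration, not existing Hadoop code:

  import java.io.*;
  import java.util.UUID;

  public class DataStorage {
    private static final String STORAGE_ID_FILE = "storageID";

    /** Return the datanode's persistent id, creating it on first start. */
    public static String getOrCreateStorageID(File dataDir) throws IOException {
      File idFile = new File(dataDir, STORAGE_ID_FILE);
      if (idFile.exists()) {
        BufferedReader in = new BufferedReader(new FileReader(idFile));
        try {
          return in.readLine().trim();   // survives IP/hostname changes
        } finally {
          in.close();
        }
      }
      // first start - the "format" step: create the id and persist it
      String id = UUID.randomUUID().toString();
      FileWriter out = new FileWriter(idFile);
      try {
        out.write(id);
      } finally {
        out.close();
      }
      return id;
    }
  }

The namenode would then key datanodeMap by this id instead of by the
machineName + ":" + port string.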
> The local name of the data node is machineName + ":" + tmpPort, so
> it can change if the port is blocked or the machine name changes.
> Maybe we should create this name only once and write it to the data
> folder, so that it can be read back later on(?)
> This local name is used to send block reports to the name node.
> FSNamesystem#processReport(Block newReport[], UTF8 dataNodeLocalName)
> processes this report.
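Right - and to spell out what that key amounts to (paraphrased, not the
verbatim source):

  // the datanode's identity, as the namenode currently sees it
  String localName = machineName + ":" + tmpPort; // e.g. "node1.example.com:50010"
  // datanodeMap is keyed by this string, so a changed hostname or port
  // makes the same disks look like a brand-new datanode

which is exactly why a persistent UUID would be safer.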
> In the first line of this method the DatanodeInfo is looked up by the
> datanode's localName. The datanode is already in this map, since a
> heartbeat is sent before a block report. So:
>
>   DatanodeInfo node = (DatanodeInfo) datanodeMap.get(name);
>   // no problem, but just an 'empty' container
>   ...
>   Block oldReport[] = node.getBlocks();
>   // will return null, since no blocks are yet associated with this node
>
> Since oldReport is null, all code is skipped until line 901. But this
> only adds the blocks to the node container.
Umm... I don't follow. Lines 901-905 will add these blocks from the
newReport, because newPos == 0.
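My reading of that section, paraphrased from memory - addStoredBlock
and removeStoredBlock are stand-in names here, not the real code:

  void processReportSketch(Block newReport[], DatanodeInfo node) {
    Block oldReport[] = node.getBlocks();
    int oldPos = 0, newPos = 0;
    if (oldReport != null) {
      // walk both sorted arrays, reconciling the differences
      while (oldPos < oldReport.length && newPos < newReport.length) {
        int c = oldReport[oldPos].compareTo(newReport[newPos]);
        if (c == 0) { oldPos++; newPos++; }                     // unchanged
        else if (c < 0) removeStoredBlock(oldReport[oldPos++], node); // vanished
        else addStoredBlock(newReport[newPos++], node);         // appeared
      }
      while (oldPos < oldReport.length)
        removeStoredBlock(oldReport[oldPos++], node);           // vanished
    }
    // with oldReport == null we arrive here with newPos still at 0,
    // so every block in the fresh report is added to the node container
    while (newPos < newReport.length)
      addStoredBlock(newReport[newPos++], node);
  }

So the container does get populated from the very first report; the
question is what happens to those blocks afterwards.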
> A section of code that collects all obsolete blocks begins at line
> 924. First of all, I am wondering why we iterate through all blocks
> here; this could be expensive, and it would be enough to iterate over
> only the blocks reported by this datanode, wouldn't it?
> Whether a block is still valid is tested by FSDirectory#isValidBlock,
> which checks that the block is in activeBlocks.
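If so, checking only the reported blocks would look roughly like this -
a hand-written sketch, where only FSDirectory#isValidBlock is from the
actual code:

  Vector obsolete = new Vector();
  for (int i = 0; i < newReport.length; i++) {
    if (!dir.isValidBlock(newReport[i])) {  // i.e. no longer in activeBlocks
      obsolete.add(newReport[i]);           // to be deleted on this datanode
    }
  }

instead of a scan over everything the namenode knows about.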
> The problem I see now is that the only method that adds Blocks to
> activeBlocks is unprotectedAddFile(UTF8 name, Block blocks[]). But
> here, too, the node's local name, which may have changed, is involved.
> This method is also used to load the state of a stopped or crashed
> name node.
> So in case you stop the dfs and change host names, a set of blocks
> will be marked as obsolete and deleted.
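For reference, my mental model of that path - only the method signature
is from the actual source, and addToNamespace is a stand-in for the real
tree insert:

  boolean unprotectedAddFile(UTF8 name, Block blocks[]) {
    if (!addToNamespace(name, blocks))  // record the file under 'name'
      return false;
    for (int i = 0; i < blocks.length; i++) {
      activeBlocks.add(blocks[i]);      // the only writer of activeBlocks
    }
    return true;
  }

If that's right, the deletion decision ends up depending on names that
are not stable across restarts.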
I'm not 100% sure if this part is correct, but it makes me nervous, too,
to involve such ephemeral things as IP/hostname in handling data that
persists across IP/hostname changes...
--
Best regards,
Andrzej Bialecki <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com