[ http://issues.apache.org/jira/browse/HADOOP-146?page=all ]
Doug Cutting resolved HADOOP-146:
---------------------------------
Resolution: Fixed
I just committed this. Thanks, Konstantin!
> potential conflict in block id's, leading to data corruption
> ------------------------------------------------------------
>
> Key: HADOOP-146
> URL: http://issues.apache.org/jira/browse/HADOOP-146
> Project: Hadoop
> Type: Bug
> Components: dfs
> Versions: 0.1.0, 0.1.1
> Reporter: Yoram Arnon
> Assignee: Konstantin Shvachko
> Fix For: 0.3
> Attachments: hadoop-146-random.patch
>
> Currently, block ids are generated randomly and are not tested for
> collisions with existing ids.
> While ids are 64 bits, given enough time and a large enough FS, collisions
> are expected.
> When a collision occurs, a random subset of the blocks with that id will be
> removed as extra replicas, and the contents of that portion of the containing
> file will be one random version of the block.
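
To put a rough number on the collision claim above, here is a small sketch
using the standard birthday approximation p ~ 1 - exp(-n(n-1)/2N) with
N = 2^64. It assumes ids are uniformly distributed over all 64 bits; the
class and method names are made up for illustration and are not part of
Hadoop.

```java
class BlockIdCollisionEstimate {

    /**
     * Approximate probability that n uniformly random 64-bit ids
     * contain at least one duplicate (birthday approximation).
     */
    static double collisionProbability(double n) {
        double idSpace = Math.pow(2.0, 64);
        // -expm1(-x) evaluates 1 - e^(-x) without losing precision for tiny x.
        return -Math.expm1(-n * (n - 1.0) / (2.0 * idSpace));
    }

    public static void main(String[] args) {
        // Block counts chosen only as examples.
        for (double n : new double[] {1e6, 1e8, 1e9, 1e10}) {
            System.out.printf("%.0e ids -> p ~ %.3g%n", n, collisionProbability(n));
        }
    }
}
```

Under these assumptions, about 10^9 ids ever allocated gives a collision
probability of roughly 2.7%, and 10^10 ids puts it above 90%. The real risk
is likely higher, since java.util.Random is documented as unable to return
every possible long value (its seed has only 48 bits).
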
> To solve this, one could check for id collisions when creating a new block,
> getting a new id in case of conflict. This approach requires the name node to
> keep track of all existing block ids (rather than just the ones that have
> been reported in), and to identify old versions of a block id as invalid (in
> case a data node dies, a file is deleted, and then the block id is reused for
> a new file).
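
The collision check described above could look roughly like the following.
This is a minimal sketch, not the committed patch: CheckedBlockIdAllocator,
register() and allocate() are hypothetical names, and the sketch assumes the
name node can hold every known id in memory. It deliberately never frees an
id, so it does not address the stale-replica case (an old block with a reused
id reappearing on a data node).

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

class CheckedBlockIdAllocator {
    // Every id ever handed out or loaded from the existing file system.
    private final Set<Long> knownIds = new HashSet<>();
    private final Random random;

    CheckedBlockIdAllocator(Random random) {
        this.random = random;
    }

    /** Records an id that already exists, e.g. while loading the namespace. */
    synchronized void register(long id) {
        knownIds.add(id);
    }

    /** Draws random ids until one is found that is not already known. */
    synchronized long allocate() {
        long id;
        do {
            id = random.nextLong();
        } while (!knownIds.add(id)); // add() returns false on a collision
        return id;
    }
}
```

The retry loop costs almost nothing in practice, because collisions are rare
per allocation; the real cost is the memory needed to track all ids.
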
> Alternatively, one could simply use sequential block ids. Here the downsides
> are:
> 1. Migration from an existing file system is hard, requiring compaction of
> the entire FS.
> 2. Once you cycle through 64 bits of ids (quite a few years at full blast),
> you're in trouble again (or must run occasional/background compaction).
> 3. You must never lose the high watermark block id.
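
For comparison, the sequential alternative listed above might be sketched as
follows. All names here are hypothetical, and the persistWatermark hook
stands in for whatever durable storage the name node would use. The sketch
reserves ids in batches and records the new high watermark before any id
from the batch is used, so a restart from the recorded value can skip ids
but never reissue one.

```java
import java.util.function.LongConsumer;

class SequentialBlockIdAllocator {
    private final int batchSize;
    private final LongConsumer persistWatermark; // hypothetical durability hook
    private long nextId;
    private long reservedUpTo; // exclusive upper bound of the reserved batch

    SequentialBlockIdAllocator(long recoveredWatermark, int batchSize,
                               LongConsumer persistWatermark) {
        this.nextId = recoveredWatermark;
        this.reservedUpTo = recoveredWatermark;
        this.batchSize = batchSize;
        this.persistWatermark = persistWatermark;
    }

    synchronized long allocate() {
        if (nextId == reservedUpTo) {
            // addExact throws ArithmeticException instead of wrapping around
            // silently once the 64-bit space is exhausted (downside 2).
            long newLimit = Math.addExact(reservedUpTo, batchSize);
            // Must be durable before any id below newLimit is used (downside 3).
            persistWatermark.accept(newLimit);
            reservedUpTo = newLimit;
        }
        return nextId++;
    }
}
```

As a rough check on "quite a few years": at an assumed one million
allocations per second, exhausting 2^64 ids would take on the order of
585,000 years.
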
> synchronized Block allocateBlock(UTF8 src) {
>     Block b = new Block();
>     FileUnderConstruction v =
>         (FileUnderConstruction) pendingCreates.get(src);
>     v.add(b);
>     pendingCreateBlocks.add(b);
>     return b;
> }
>
> static Random r = new Random();
>
> /**
>  */
> public Block() {
>     this.blkid = r.nextLong();
>     this.len = 0;
> }