[ https://issues.apache.org/jira/browse/HADOOP-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Raghu Angadi updated HADOOP-1471:
---------------------------------
Component/s: dfs
Fix Version/s: 0.14.0
Description:
The patch submitted to HADOOP-893 (by me :( ) seems to have a bug in how it
deals with the set {{deadNodes}}. After the patch, {{seekToNewSource()}} looks
like this:
{code}
  public synchronized boolean seekToNewSource(long targetPos) throws IOException {
    boolean markedDead = deadNodes.contains(currentNode);
    deadNodes.add(currentNode);
    DatanodeInfo oldNode = currentNode;
    DatanodeInfo newNode = blockSeekTo(targetPos);
    if (!markedDead) {
      /* remove it from deadNodes. blockSeekTo could have cleared
       * deadNodes and added currentNode again. That's ok. */
      deadNodes.remove(oldNode);
    }
    // ...
{code}
I guess the expectation was that the caller of this function decides, before
the call, whether to put the node in {{deadNodes}} or not. I am not sure
whether this was a bug then, but it certainly seems to be a bug now: when
there is a checksum error with replica1, we try replica2, and if there is a
checksum error again, we try replica1 again!
Note that ChecksumFileSystem.java was created after HADOOP-893 was resolved.
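To make the failure sequence concrete, here is a minimal standalone sketch.
The names are hypothetical stand-ins, not the real DFSClient code: plain
strings replace {{DatanodeInfo}}, and {{pickNode()}} replaces
{{blockSeekTo()}}, which here simply picks any replica not in {{deadNodes}}.
{code}
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the DFSClient logic quoted above.
public class DeadNodesSketch {
  static Set<String> deadNodes = new HashSet<String>();
  static String currentNode = "replica1";

  // Same control flow as the patched seekToNewSource() above.
  static void seekToNewSource() {
    boolean markedDead = deadNodes.contains(currentNode);
    deadNodes.add(currentNode);
    String oldNode = currentNode;
    currentNode = pickNode();  // stands in for blockSeekTo(targetPos)
    if (!markedDead) {
      // The caller never pre-added the failed node, so it is removed
      // here and becomes eligible for selection again.
      deadNodes.remove(oldNode);
    }
  }

  // Stands in for blockSeekTo(): choose a replica not in deadNodes.
  static String pickNode() {
    return deadNodes.contains("replica1") ? "replica2" : "replica1";
  }

  public static void main(String[] args) {
    seekToNewSource();                // checksum error on replica1
    System.out.println(currentNode);  // replica2
    seekToNewSource();                // checksum error on replica2
    System.out.println(currentNode);  // replica1 again
  }
}
{code}
Running it prints replica2 and then replica1: each call removes the node that
just failed from {{deadNodes}}, so the read keeps alternating between the two
bad replicas instead of ever running out of sources.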
Affects Version/s: 0.12.3
> seekToNewSource() might not work correctly with Checksum failures.
> ------------------------------------------------------------------
>
> Key: HADOOP-1471
> URL: https://issues.apache.org/jira/browse/HADOOP-1471
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.12.3
> Reporter: Raghu Angadi
> Fix For: 0.14.0